Liquid AI introduced LFM2.5, a new generation of small foundation models built on the LFM2 architecture and focused on on-device and edge deployments. The family includes LFM2.5-1.2B-Base and LFM2.5-1.2B-Instruct and extends to Japanese-language, vision-language, and audio-language variants. The models are released as open weights on Hugging Face and distributed through the LEAP platform.
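Because the weights are public, the instruct checkpoint can be exercised with standard Hugging Face tooling. Below is a minimal inference sketch; the repository id LiquidAI/LFM2.5-1.2B-Instruct is an assumption based on Liquid AI's naming for earlier LFM2 releases, so check the model card before use.

```python
# Minimal chat inference sketch with Hugging Face transformers.
# NOTE: the repo id below is an assumption based on Liquid AI's LFM2 naming.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain what an NPU is in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```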
Architecture and training recipe
LFM2.5 retains the hybrid LFM2 architecture, which was designed for fast, memory-efficient inference on CPUs and NPUs, and pairs it with improved data and post-training pipelines. Pre-training of the 1.2 billion parameter backbone is extended from 10T to 28T tokens. The instruct variant then receives supervised fine-tuning, preference alignment, and large-scale, multi-stage reinforcement learning focused on instruction following, tool use, mathematics, and knowledge reasoning.
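Liquid AI has described the LFM2 design as interleaving gated short-range convolution blocks with grouped-query attention layers. As a conceptual illustration only, not Liquid AI's implementation, a gated short-convolution block of this general shape can be written as:

```python
# Illustrative gated short-convolution block in the spirit of LFM2-style
# hybrids; dimensions, gating, and kernel size are assumptions, not the
# actual LFM2.5 implementation.
import torch
import torch.nn as nn

class GatedShortConvBlock(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)  # value stream and gate stream
        self.conv = nn.Conv1d(
            dim, dim, kernel_size, padding=kernel_size - 1, groups=dim
        )  # depthwise, left-padded so the token mixing stays causal
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        v, g = self.in_proj(x).chunk(2, dim=-1)
        v = self.conv(v.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return self.out_proj(v * torch.sigmoid(g))  # gate the convolved stream
```

Because the convolution kernel is short and depthwise, a block like this carries only a tiny fixed-size state during decoding, which is what makes the hybrid attractive for CPU and NPU inference.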
Billion-scale text model performance
LFM2.5-1.2B-Instruct is the main general-purpose text model. The Liquid AI team reports benchmark results on GPQA, MMLU Pro, IFEval, IFBench, and several function-calling and coding suites. The model reaches 38.89 on GPQA and 44.35 on MMLU Pro. Competing 1B-class open models, such as Llama 3.2 1B Instruct and Gemma 3 1B IT, score much lower on these metrics.

On IFEval and IFBench, which target multi-step instruction following and function-call quality, LFM2.5-1.2B-Instruct scores 86.23 and 47.33, respectively. Both values are ahead of the other 1B-class baselines in Liquid AI's reported comparison.
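The article does not show Liquid AI's tool-call format. As a hedged sketch, recent Hugging Face chat templates accept a tools list directly, so exercising function calling could look like the following, assuming the LFM2.5 template supports the tools argument; the repo id and the get_weather helper are illustrative:

```python
# Sketch of advertising a tool to the model through the chat template.
# Assumes the LFM2.5 chat template supports `tools`; repo id is assumed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    ...

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is the weather in Tokyo right now?"}],
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # the model is expected to emit a structured tool call to parse
```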
Japanese-optimized variant
LFM2.5-1.2B-JP is a Japanese-optimized text model derived from the same backbone. It targets tasks such as JMMLU, Japanese M-IFEval, and Japanese GSM8K. This checkpoint improves on the general instruct model for Japanese tasks and matches or outperforms other small multilingual models such as Qwen3-1.7B, Llama 3.2 1B Instruct, and Gemma 3 1B IT on these benchmarks.
A vision-language model for multimodal edge workloads
LFM2.5-VL-1.6B is the updated vision-language model in the series. It uses LFM2.5-1.2B-Base as the language backbone and adds a vision tower for image understanding. The model is evaluated on a range of visual reasoning and OCR benchmarks, including MMStar, MM-IFEval, BLINK, InfoVQA, OCRBench v2, RealWorldQA, MMMU, and multilingual MMBench. LFM2.5-VL-1.6B improves over the previous LFM2-VL-1.6B on most metrics and is intended for real-world tasks such as document understanding, user-interface reading, and multi-image reasoning under edge constraints.
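A plausible way to run the vision-language checkpoint is through the generic image-text-to-text interface in transformers. The repo id and the invoice example below are assumptions, so defer to the official model card:

```python
# Multimodal inference sketch; repo id "LiquidAI/LFM2.5-VL-1.6B" is assumed
# from the naming in the announcement, and the image file is illustrative.
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "LiquidAI/LFM2.5-VL-1.6B"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

conversation = [{
    "role": "user",
    "content": [
        {"type": "image", "image": Image.open("invoice.png")},
        {"type": "text", "text": "What is the total amount on this invoice?"},
    ],
}]
inputs = processor.apply_chat_template(
    conversation, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```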
Audio language model with native speech generation
LFM2.5-Audio-1.5B is a native audio language model that supports both text and audio input and output. It is offered as an end-to-end speech-to-speech model and uses an audio detokenizer described as eight times faster than the previous Mimi-based decoder at comparable quality on edge devices.
The model supports two main generation modes. Interleaved generation of text and audio targets real-time speech conversation agents where latency dominates. Sequential generation is aimed at tasks such as automatic speech recognition and text to speech and allows the output modality to be swapped without re-initializing the model. The audio stack is trained with low-precision quantization-aware training, keeping metrics such as STOI and UTMOS near the full-precision baseline while enabling deployment on compute-constrained devices.
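Liquid AI does not publish the quantization-aware training recipe here. For orientation only, the standard fake-quantization trick with a straight-through estimator, which lets training see low-precision weights while gradients flow as if in full precision, looks like this (a textbook sketch, not Liquid AI's audio QAT recipe):

```python
# Generic quantization-aware training helper (straight-through estimator).
# This is a textbook sketch, not Liquid AI's actual audio QAT recipe.
import torch

def fake_quantize(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Round weights to a low-bit grid in the forward pass while keeping
    the backward pass an identity, so gradients are unaffected."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (w_q - w).detach()  # forward: quantized w_q, backward: identity
```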


Key takeaways
- LFM2.5 is a 1.2B-scale hybrid model family built on an improved LFM2 architecture, with Base, Instruct, Japanese, vision-language, and audio-language variants, all released as open weights on Hugging Face and LEAP.
- LFM2.5 pre-training extends from 10T to 28T tokens, and the Instruct model adds supervised fine-tuning, preference alignment, and large-scale multi-stage reinforcement learning, pushing instruction-following and tool-use quality beyond other 1B-class baselines.
- LFM2.5-1.2B-Instruct delivers strong performance on text benchmarks at the 1B scale, reaching 38.89 on GPQA and 44.35 on MMLU Pro, and leading peer models such as Llama 3.2 1B Instruct, Gemma 3 1B IT, and Granite 4.0 1B on IFEval and IFBench.
- The suite includes specialized regional and multimodal variants, with LFM2.5-1.2B-JP achieving state-of-the-art results on Japanese benchmarks in its size class, and LFM2.5-VL-1.6B and LFM2.5-Audio-1.5B covering native vision-language and audio-language workloads for edge agents.