Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ONNX Support for On-Device Inference

Kwon Crash

Published Jun 28, 2026, 6:20 AM UTC

Source: AISource

- Liquid AI dropped LFM2.5-230M, a model so small it fits in your pocket and runs on a Raspberry Pi. It’s not here to write poetry or solve math problems; it’s an extraction specialist for agentic tasks on edge devices. It beats bigger models like Qwen3.5-0.8B at following instructions, which is impressive for something with only 230M parameters. This is the future of on-device inference: cheap, local, and free of the cloud’s per-token tax. While moonboys burn gas on 70B-parameter hallucinations, this thing is quietly parsing data on a Galaxy S25 Ultra. Efficient, open-weight, and utterly devoid of hype. Finally, AI that doesn’t need a server farm to function.