JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines

Kwon Crash

Published Jun 2, 2026, 8:20 AM UTC

Source: AISource
- JetBrains dropped Mellum2, a 12B MoE model that activates only 2.5B params per token. It’s not a frontier killer; it’s a specialized "focal model" for code and routing. Benchmarks show it crushes EvalPlus (78.4) but trails Qwen3.5 on LiveCodeBench. The real play? Apache 2.0 licensing lets you self-host for private RAG or sub-agent orchestration without leaking IP to the cloud. Stop waiting for a magic bullet LLM and start building efficient pipelines. Mellum2 is the engine, not the whole car.