Mellum2 is a 12B-parameter Mixture-of-Experts model trained from scratch on natural language and code.

The model activates only 2.5B parameters per token, making it efficient for high-throughput, low-latency inference.
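To make the efficiency claim concrete, the following is a rough back-of-the-envelope sketch using only the two figures quoted above (12B total, 2.5B active). The helper names are hypothetical, and the "about 2 FLOPs per active parameter per token" rule of thumb is an assumption brought in for illustration, not a number from the Mellum2 release.

```python
# Figures quoted in the text above.
TOTAL_PARAMS = 12e9    # total parameters in the model
ACTIVE_PARAMS = 2.5e9  # parameters activated per token


def active_fraction(active: float, total: float) -> float:
    """Share of the model's weights that take part in one token's forward pass."""
    return active / total


def approx_forward_flops_per_token(params: float) -> float:
    """Rule-of-thumb estimate (an assumption, not from the article):
    roughly 2 FLOPs per parameter used in the forward pass."""
    return 2.0 * params


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
moe_flops = approx_forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = approx_forward_flops_per_token(TOTAL_PARAMS)

print(f"active fraction: {fraction:.1%}")
print(f"approx. compute vs. a dense 12B model: {moe_flops / dense_flops:.1%}")
```

Under these assumptions, each token touches roughly a fifth of the weights. The estimate covers compute only: in a typical Mixture-of-Experts deployment all experts still need to be available in memory, so the memory footprint is closer to that of the full 12B parameters.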

Mellum2 can be used for routing, RAG, summarization, sub-agents, high-throughput coding features, and private deployments.

It is released under the Apache 2.0 license.