
Chinese AI startup MiniMax has released M2, which now ranks as the #1 open-source model on the Artificial Analysis index, even outperforming Google's Gemini 2.5 Pro. The hyper-efficient MoE model excels at code and agent tasks.
Chinese AI startup MiniMax released its M2 language model on Monday, and it has already become a formidable competitor to proprietary systems from OpenAI and Anthropic. The new model captured the highest score for any open-source model on the Artificial Analysis “Intelligence Index.”
With a comprehensive benchmark score of 61, M2 now ranks fifth globally, behind only proprietary frontier models such as GPT-5, Grok 4, and Claude Sonnet 4.5. Significantly, MiniMax M2 surpassed Google DeepMind’s Gemini 2.5 Pro (which scored 60), marking a major achievement for China’s open-source AI ecosystem.

MiniMax M2 uses a Mixture of Experts (MoE) architecture with 230 billion total parameters, but it achieves remarkable efficiency by activating only 10 billion of them during inference.
“Using only a small fraction of its parameters allows the model to operate efficiently at scale,” according to Artificial Analysis. By comparison, rivals like DeepSeek’s V3.2 use 37 billion active parameters, and Moonshot AI’s Kimi K2 uses 32 billion.
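The sparse-activation idea behind these numbers can be sketched in a few lines: a gating network scores all experts for each token, but only the top-k actually run. This is a minimal illustrative sketch, not MiniMax's implementation; the expert count and k value below are hypothetical.

```python
import random

NUM_EXPERTS = 16   # hypothetical expert count, not M2's actual configuration
TOP_K = 2          # experts activated per token; the rest stay idle

def route_token(gate_scores, top_k=TOP_K):
    """Pick the top-k experts for one token and normalize their weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    total = sum(gate_scores[i] for i in chosen)
    # Normalized weights let the chosen experts' outputs be averaged.
    return {i: gate_scores[i] / total for i in chosen}

scores = [random.random() for _ in range(NUM_EXPERTS)]
weights = route_token(scores)
print(len(weights))  # 2: only TOP_K of the 16 experts do any work
```

Because compute scales with the experts that actually fire, a model can hold hundreds of billions of parameters while paying inference cost for only a small active slice.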
This sparse design allows M2 to be deployed with FP8 precision on just four NVIDIA H100 GPUs, making it accessible to medium-sized organizations. Thanks to its compact active-parameter footprint, M2 generates roughly 100 tokens per second, about twice as fast as competing models such as Claude Sonnet 4.5.
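A back-of-envelope check shows why four H100s suffice. FP8 stores one byte per parameter, and the 80 GB HBM figure below assumes the SXM variant of the H100; the article does not specify the exact deployment details.

```python
# Rough memory arithmetic for serving 230B FP8 weights (assumptions noted).
TOTAL_PARAMS = 230e9        # total parameter count, from the article
BYTES_PER_PARAM_FP8 = 1     # FP8 = 8 bits = 1 byte per parameter
H100_MEMORY_GB = 80         # assumed per-GPU HBM (80 GB H100 variant)
NUM_GPUS = 4

weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM_FP8 / 1e9  # 230.0 GB of weights
cluster_gb = H100_MEMORY_GB * NUM_GPUS                 # 320 GB across 4 GPUs
print(weights_gb, cluster_gb)
```

The weights fit with roughly 90 GB to spare, which real deployments would need anyway for the KV cache and activations.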
MiniMax M2 shows particular strength in agentic workflows and programming applications, areas of increasing importance for enterprise use, and has posted impressive scores on several specialized coding and agent benchmarks.
“The model’s strengths include its tool use and instruction-following capabilities,” Artificial Analysis noted, highlighting M2’s focus on practical applications over general-purpose tasks. Independent tests run by developers showed M2 achieving ~95% accuracy on mixed tasks, compared to 90% for GPT-4o and 88-89% for Claude 3.5.
“I’m deeply impressed by their progress,” commented Florian Brand, a PhD student at Trier University, Germany, and an open-source model expert, emphasizing the significant improvement from MiniMax’s previous M1 model.
MiniMax is offering the model at $0.30 per million input tokens and $1.20 per million output tokens—just 8% of the cost of Claude Sonnet 4.5 while maintaining competitive performance. The model is available on Hugging Face and GitHub under an MIT license, with API access currently free for a limited time.
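The pricing gap is easy to make concrete. This sketch uses the article's quoted M2 rates; the Claude Sonnet 4.5 rates are assumed list prices ($3 input / $15 output per million tokens) and are not stated in the article.

```python
# Cost comparison for a hypothetical workload at per-million-token rates.
M2_IN, M2_OUT = 0.30, 1.20           # $/M tokens, from the article
CLAUDE_IN, CLAUDE_OUT = 3.00, 15.00  # assumed Claude Sonnet 4.5 list prices

def request_cost(in_tokens, out_tokens, rate_in, rate_out):
    """Dollar cost of one request given token counts and per-million rates."""
    return (in_tokens * rate_in + out_tokens * rate_out) / 1e6

# Hypothetical agentic workload: 100k input tokens, 20k output tokens.
m2 = request_cost(100_000, 20_000, M2_IN, M2_OUT)
claude = request_cost(100_000, 20_000, CLAUDE_IN, CLAUDE_OUT)
print(f"M2: ${m2:.3f}  Claude: ${claude:.3f}  ratio: {m2/claude:.0%}")
```

At these assumed rates the ratio lands near the article's "8% of the cost" figure, with the exact percentage depending on the input/output mix of the workload.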