Overview
The Llama 4 family, released on April 5, 2025, represents the final chapter in Meta's foundational open-weights era. Comprising the Scout and Maverick models, Llama 4 was the first Llama generation built on a Mixture-of-Experts (MoE) architecture, bringing sparse models with hundreds of billions of total parameters to open-weights users.
As of April 2026, Meta has transitioned its frontier AI efforts to a new, proprietary model series called Muse Spark. Developed by the newly formed Meta Superintelligence Labs, Muse Spark represents a strategic shift toward closed-source, state-of-the-art models designed to compete directly with GPT-5.4 and Claude 4.7 Opus. While the Llama 4 models remain a widely used option for open-weights deployment, Meta's frontier models are now hosted within the Muse ecosystem.
Llama 4 Scout (The Efficient Long-Context)
Scout is the high-efficiency multimodal model of the 2025 release. With 17B active parameters (109B total), it pioneered the 10M token context window for open models.
- Context Window: 10 million tokens.
- Architecture: MoE with 16 experts; each token is routed to one expert plus a shared expert.
- Best for: Extreme RAG systems and long-video analysis.
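As a rough illustration of what a 10 million token window means for retrieval-style workloads, the sketch below checks whether a set of documents would plausibly fit in a single prompt. The 4-characters-per-token ratio and the output reserve are assumptions chosen for illustration, not properties of the Llama 4 tokenizer; a real deployment should count tokens with the model's own tokenizer.

```python
# Rough context-budget check. CHARS_PER_TOKEN is an assumed heuristic,
# not a measured property of the Llama 4 tokenizer.
SCOUT_CONTEXT_TOKENS = 10_000_000  # context window stated above
CHARS_PER_TOKEN = 4.0              # assumption: ~4 characters per token


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Approximate the token count of `text` from its character length."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(documents: list[str],
                    context_tokens: int = SCOUT_CONTEXT_TOKENS,
                    reserve_for_output: int = 8_192) -> bool:
    """Return True if all documents plus an output reserve fit in the window."""
    needed = sum(estimate_tokens(d) for d in documents) + reserve_for_output
    return needed <= context_tokens


# 200 synthetic documents of 12,000 characters each.
corpus = ["lorem ipsum " * 1000] * 200
print(fits_in_context(corpus))
```

Fitting inside the window only means the prompt is accepted; it says nothing about latency, memory use, or answer quality at that length.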
Llama 4 Maverick (The Workhorse)
Maverick is the primary general-purpose model, with 17B active parameters out of 400B total, spread across 128 experts. It was designed to rival the reasoning capabilities of the previous generation's proprietary flagships.
- Context Window: 1 million tokens.
- Best for: Coding, logic-heavy tasks, and professional-grade multimodal understanding.
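The gap between active and total parameters is the core MoE trade-off: per-token compute scales with the active count, but every expert still has to be held in memory. The sketch below works through that arithmetic using the parameter counts quoted in this overview. The storage precisions are assumptions for illustration, and the estimate covers weights only, not the KV cache or activations.

```python
# Parameter counts (in billions) are the figures quoted in this overview.
MODELS = {
    "Scout": {"active_b": 17, "total_b": 109},
    "Maverick": {"active_b": 17, "total_b": 400},
}

# Assumed storage precisions (bytes per parameter), for illustration only.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's weights used to process a single token."""
    return active_b / total_b


def weight_memory_gb(total_b: float, precision: str) -> float:
    """Weights-only memory in GB (1 GB = 1e9 bytes) at the given precision."""
    return total_b * BYTES_PER_PARAM[precision]


for name, p in MODELS.items():
    frac = active_fraction(p["active_b"], p["total_b"])
    mem = weight_memory_gb(p["total_b"], "bf16")
    print(f"{name}: {frac:.1%} of weights active per token, ~{mem:.0f} GB in bf16")
```

Under these figures Scout uses roughly 16% of its weights per token and Maverick about 4%, which is why both have similar per-token compute despite very different memory footprints.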
Muse Spark (The Proprietary Frontier)
Released in early April 2026, Muse Spark is Meta's newest intelligence engine. Unlike Llama 4, Muse Spark is closed-weights and available only via Meta's secure API. It features native agentic orchestration and is optimized for the "Superintelligence" benchmarks of 2026.
Capabilities
- State-of-the-Art Reasoning: Maverick, in particular, is designed to be a leader in complex logical and mathematical reasoning.
- Advanced Multimodality: Scout and Maverick both natively accept text and images as input.
- Massive Context Windows: Scout's 10-million-token context window makes it possible to analyze very long documents, codebases, or videos in a single prompt.
- Open Weights: Scout and Maverick are free to download for both research and commercial use under the Llama 4 Community License, which attaches some conditions (for example, for services with very large user bases).
Use Cases
- Complex Coding & Development: Maverick serves as a powerful pair programmer for developers.
- Large-Scale Data Analysis: Scout can analyze entire codebases or vast quantities of documents in a single prompt.
- Creative Multimodal Content: All models can be used to understand and interact with visual information.
- Academic & Commercial Research: The open nature of the models makes them a go-to choice for researchers pushing the boundaries of AI.
Pricing & Access
- Llama 4 (Scout/Maverick): Free to download and self-host under the Llama 4 Community License. Available from the official Llama website and on Hugging Face.
- Muse Spark: Available via Meta's API for enterprise customers. Pricing is reported to be competitive with GPT-5.4 at approximately $2.50 per 1M input tokens.
- Meta AI: Integrated into Instagram, WhatsApp, and Facebook, now powered by the Muse Spark engine.
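To put the reported Muse Spark input price in concrete terms, the following sketch estimates input-token cost for a given volume. It assumes the approximate $2.50 per 1M input tokens figure quoted above; output-token pricing is not stated in this overview, so it is deliberately left out. The function name and defaults are hypothetical.

```python
from decimal import Decimal

# Assumption: the reported, approximate input price quoted above.
INPUT_PRICE_PER_MILLION_USD = Decimal("2.50")


def input_cost_usd(input_tokens: int,
                   price_per_million: Decimal = INPUT_PRICE_PER_MILLION_USD) -> Decimal:
    """Estimated cost of input tokens only; output pricing is not covered."""
    return (Decimal(input_tokens) / Decimal(1_000_000)) * price_per_million


print(input_cost_usd(1_000_000))  # one million input tokens
print(input_cost_usd(40_000))     # a single 40k-token prompt
```

Decimal is used instead of float so that small per-request amounts do not pick up rounding noise when summed over many calls.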
Ecosystem & Tools
- Official Llama Website: The primary source for downloading the models and reading the documentation.
- Hugging Face: The main community hub for accessing and fine-tuning Llama models.
- API Access: Available through major cloud providers and dedicated AI model hosting platforms.