Qwen 3

Alibaba Cloud's flagship open-source AI, featuring a hybrid reasoning system and a Mixture-of-Experts (MoE) architecture for state-of-the-art performance.

Tags: Qwen, Alibaba, Open Source, Language Model, Large Language Model, AI Assistant, Multilingual, MoE
Developer: Alibaba Cloud
Type: Hybrid Language Model
License: Open Source (Apache 2.0)

Overview

Qwen 3, released by Alibaba Cloud on April 29, 2025, is a powerful open-source AI model family. It introduces a sophisticated hybrid reasoning system and utilizes a Mixture-of-Experts (MoE) architecture, allowing it to compete directly with leading proprietary and open-source models. Trained on a massive dataset of nearly 36 trillion tokens across 119 languages, Qwen 3 is designed for high performance, efficiency, and broad accessibility.

Capabilities

Qwen 3's advanced architecture provides a wide range of powerful capabilities:

  • Hybrid Reasoning: The model can dynamically switch between two modes:
    • Thinking Mode: Engages deep reasoning pathways for complex, multi-step tasks in areas like programming, mathematics, and logic.
    • Non-Thinking Mode: Provides fast and efficient responses for general conversation and less complex queries.
  • State-of-the-Art Performance: Achieves top-tier results on various benchmarks, demonstrating its strength in both general knowledge and specialized domains.
  • Mixture-of-Experts (MoE) Architecture: The flagship model, Qwen3-235B-A22B, contains 235 billion parameters but only activates a fraction (22 billion) for any given token, leading to significantly lower inference costs without sacrificing performance.
  • Extensive Multilingual Support: With training data covering 119 languages, the model offers robust multilingual understanding and generation.
  • Fully Open Source: Available under the permissive Apache 2.0 license, encouraging widespread adoption and innovation.
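A practical note on the hybrid reasoning system: when thinking mode is enabled, Qwen 3 emits its intermediate reasoning inside `<think>...</think>` tags before the final answer, so client code usually separates the two. The helper below is an illustrative sketch under that tag convention, not part of any official Qwen SDK:

```python
import re

def split_thinking(raw: str) -> tuple[str, str]:
    """Split a Qwen 3 response into (reasoning, answer).

    In thinking mode the model wraps its reasoning in <think>...</think>
    before the final answer; in non-thinking mode no such block appears.
    """
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if not match:
        return "", raw.strip()  # non-thinking mode: the whole output is the answer
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_thinking(
    "<think>2 + 2 is 4, so double it.</think>The result is 8."
)
```

In an application this lets you log or display the reasoning trace separately while showing users only the final answer.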

Technical Specifications

  • Model sizes: Available in multiple sizes, from small dense models up to MoE models, with the flagship being the 235B-parameter Qwen3-235B-A22B.
  • Architecture: Transformer-based, with the flagship models using a Mixture-of-Experts (MoE) design.
  • Training data: Pre-trained on a dataset of nearly 36 trillion tokens.
  • Multilingualism: Natively supports 119 languages.
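To make the efficiency argument concrete: the figures above imply that only a small fraction of the network runs for each token. A quick back-of-the-envelope check, using only the numbers quoted in this section:

```python
# Back-of-the-envelope MoE cost check using the figures quoted above.
total_params = 235e9    # total parameters in the flagship ("235B")
active_params = 22e9    # parameters activated per token ("A22B")

active_fraction = active_params / total_params
print(f"Active per token: {active_fraction:.1%}")  # ≈ 9.4% of the network
```

Since per-token inference compute in a Transformer scales roughly with the activated parameter count, the MoE design delivers inference costs closer to a 22B dense model while the full 235B parameter pool supplies capacity.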

Use Cases

Qwen 3 is a versatile model suitable for a vast range of applications:

  • Complex Problem Solving: Building sophisticated AI agents that can tackle multi-step reasoning and coding challenges.
  • Efficient Conversational AI: Deploying fast and cost-effective chatbots and virtual assistants for enterprise use.
  • Global Applications: Creating content and serving users in a wide variety of languages.
  • Research and Development: Providing a powerful, open, and cost-effective foundation for cutting-edge AI research.

Limitations

  • Latency in "Thinking" Mode: While powerful, the deep reasoning mode can have higher latency compared to the standard mode.
  • Potential for Hallucinations: Like all LLMs, it can occasionally generate plausible but incorrect information, especially in highly specialized domains.

Pricing & Access

  • Open Source: The models are free to download and use from platforms like Hugging Face, GitHub, and ModelScope.
  • Cloud API: Available through Alibaba Cloud's "Bailian" platform for developers seeking a managed, pay-as-you-go API.
