Alibaba launches more efficient Qwen3-Next artificial intelligence model

2025-09-11 22:30:11
Alibaba's Tongyi Qianwen (Qwen) has released Qwen3-Next, its next-generation foundation model architecture, and open-sourced the Qwen3-Next-80B-A3B series of models built on it. Compared with the MoE (mixture-of-experts) structure of Qwen3, the new architecture makes four core improvements: a hybrid attention mechanism, a highly sparse MoE structure, a series of optimizations that make training more stable, and a multi-token prediction mechanism that improves inference efficiency. Using this architecture, Alibaba trained the Qwen3-Next-80B-A3B-Base model, which has 80 billion parameters in total but activates only 3 billion of them. The Base model performs on par with, or slightly better than, the dense Qwen3-32B model, while its training cost (in GPU hours) is less than one-tenth that of Qwen3-32B and its inference throughput at context lengths of 32k and above is more than ten times that of Qwen3-32B, making it highly cost-effective for both training and inference.
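To put the sparsity figure in perspective, the share of parameters the model uses at a time can be worked out from the two numbers quoted above. The snippet below is only an illustrative sketch: the constant and function names are hypothetical, and it treats the reported 80 billion and 3 billion as exact values.

```python
# Figures quoted in the article, in billions of parameters (treated as exact here).
TOTAL_PARAMS_B = 80.0
ACTIVE_PARAMS_B = 3.0


def active_fraction(active: float, total: float) -> float:
    """Return the share of a sparse model's parameters that are activated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Activated share of parameters: {fraction:.2%}")  # prints 3.75%
```

In other words, under these figures fewer than 4 in every 100 parameters are active at once, which is consistent with the reported savings in training cost and gains in inference throughput.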
Disclaimer:
1. The information provided does not constitute investment advice. Investors should make independent decisions and bear all risks themselves.
2. The copyright of this content belongs to the original author. The views expressed herein are solely those of the author and do not represent the stance or position of this website.