
Key Highlights
- Alibaba unveils Qwen3-Max with over 1 trillion parameters and open-source Qwen3-Omni multimodal model
- Plans new data centers in Brazil, France, the Netherlands, and five other countries, expanding from 91 current locations
- Qwen3-Omni was released under the Apache 2.0 license, directly challenging proprietary models from OpenAI and Google
- $53.4 billion AI infrastructure spending over three years, with additional increases announced
Chinese tech giant Alibaba launched a comprehensive challenge to U.S. AI dominance on Wednesday, unveiling its most powerful artificial intelligence models yet while announcing aggressive global infrastructure expansion. These moves come as Chinese companies increasingly present themselves as viable contenders against the American AI kings OpenAI, Google, and Microsoft.
Making Strides with Their Qwen3 Models
The centerpiece of Alibaba’s announcement was Qwen3-Max, a large language model with over 1 trillion parameters, touted for its code generation and autonomous agent capabilities. According to Alibaba Cloud CTO Zhou Jingren, the model outperformed rivals, including Anthropic’s Claude and DeepSeek-V3.1, on third-party benchmarks such as Tau2-Bench.
🚀 Qwen3-Max is here—no preview, just power!
— Qwen (@Alibaba_Qwen) September 23, 2025
Qwen Chat:https://t.co/FBpr7zfQY6
Blog: https://t.co/jJJcfi5FJJ
API: https://t.co/olURJV1Enl
We’ve supercharged coding & agentic skills—now Qwen3-Max-Instruct without thinking rivaling top models on SWE-Bench, Tau2-Bench,… pic.twitter.com/ZIL08Akm24
Alongside it, the company unveiled Qwen3-Omni, a new open-source multimodal model that accepts text, image, audio, and video inputs and responds with text and audio simultaneously. The model is described as “natively end-to-end omni-modal,” meaning the architecture integrates all modalities from the start, unlike earlier systems that grafted extra modalities onto a text-first model.
Technical Superiority Claims
Qwen3-Omni challenges established rivals on technical specifications. The model’s theoretical end-to-end first-packet latency is 234 milliseconds for audio and 547 milliseconds for video, more than twice the audio figure but still low enough for real-time interaction. It supports 119 languages for text, 19 for speech input, and 10 for speech output.
Across 36 benchmarks, Qwen3-Omni claims state-of-the-art performance on 22 and says it outperforms other open-source models on 32. For example, it scores 65.0 on AIME25 against GPT-4o’s 26.7, and 76.0 on ZebraLogic against Gemini 2.5 Flash’s 57.9. These scores are self-reported, but they suggest that Western models may be facing strong competition.
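For readers who want to sanity-check the figures quoted above, here is a minimal Python sketch that works out the benchmark shares, the video-to-audio latency ratio, and the score gaps. The numbers are the self-reported values cited in this article, not independent measurements, and the function and variable names are illustrative assumptions rather than anything published by Alibaba.

```python
# Figures as quoted in this article (self-reported, not independently verified).
AUDIO_LATENCY_MS = 234
VIDEO_LATENCY_MS = 547
TOTAL_BENCHMARKS = 36
SOTA_WINS = 22          # benchmarks with claimed state-of-the-art results
OPEN_SOURCE_WINS = 32   # benchmarks where it leads open-source models

# (benchmark, Qwen3-Omni score, rival model, rival score)
HEAD_TO_HEAD = [
    ("AIME25", 65.0, "GPT-4o", 26.7),
    ("ZebraLogic", 76.0, "Gemini 2.5 Flash", 57.9),
]


def share(wins: int, total: int) -> float:
    """Return wins as a percentage of total, rounded to one decimal place."""
    return round(100 * wins / total, 1)


def latency_ratio(video_ms: float, audio_ms: float) -> float:
    """Return how many times larger the video latency is than the audio latency."""
    return round(video_ms / audio_ms, 2)


def score_gaps(rows) -> dict:
    """Map each benchmark name to the Qwen3-Omni lead in points."""
    return {name: round(qwen - rival_score, 1) for name, qwen, _rival, rival_score in rows}


if __name__ == "__main__":
    print(f"State-of-the-art share: {share(SOTA_WINS, TOTAL_BENCHMARKS)}%")
    print(f"Open-source lead share: {share(OPEN_SOURCE_WINS, TOTAL_BENCHMARKS)}%")
    print(f"Video/audio latency ratio: {latency_ratio(VIDEO_LATENCY_MS, AUDIO_LATENCY_MS)}x")
    for name, gap in score_gaps(HEAD_TO_HEAD).items():
        print(f"{name}: Qwen3-Omni leads by {gap} points")
```

On these figures, the state-of-the-art claim covers about 61 percent of the benchmarks, the open-source lead about 89 percent, and the video latency is roughly 2.3 times the audio latency.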
Strategic Open Source Advantage
Releasing Qwen3-Omni under a permissive license (Apache 2.0) sets it apart strategically from Western competitors. Unlike GPT-4o and Gemini 2.5 Pro, which require paid access, Qwen3-Omni can be downloaded, modified, and deployed commercially without licensing fees. This approach directly challenges the proprietary-model strategy of U.S. companies and could accelerate the adoption of affordable AI by developers and enterprises.
Global Expansion of Infrastructure
Beyond model development, Alibaba also announced plans to set up data centers in Brazil, France, the Netherlands, Mexico, Japan, South Korea, Malaysia, and Dubai in the coming year. Expanding from its current 91 operational areas across 29 regions signals a serious intent to compete with Amazon Web Services, Microsoft Azure, and Google Cloud on a global scale.
CEO Eddie Wu emphasized that AI development speed “has far exceeded our expectations,” prompting the company to raise infrastructure spending beyond the already enormous $53.4 billion three-year commitment it had previously announced.
Disruption Possibilities for the Market
Alibaba’s announcements coincide with NVIDIA’s $100 billion investment commitment to OpenAI, underscoring the immense capital competition driving AI development. Western companies tend to focus on proprietary models and high prices, whereas Chinese companies are pursuing open-source routes that may accelerate adoption.
Qwen3-Omni’s multimodal capabilities support a range of applications, including real-time tech support via video analysis, multilingual transcription and translation, and interactive audiovisual systems. The model family includes specialized variants for full-scale interaction, complex reasoning, and audio description.
Alibaba’s open-source approach, combined with competitive performance metrics, gives it genuine potential to disrupt established AI providers. Enterprise clients facing high licensing costs for proprietary models may well find Qwen3-Omni’s free availability attractive, given that its self-reported benchmark performance is comparable or even better.