Wu Dao 2.0: China announces a 1.75T-parameter model

In one sentence: BAAI (Beijing Academy of Artificial Intelligence) introduces Wu Dao 2.0, a 1.75 trillion-parameter multimodal Mixture of Experts model, China's response to GPT-3 and Switch Transformer.



In Beijing, BAAI (Beijing Academy of Artificial Intelligence) announces Wu Dao 2.0, a model with 1.75 trillion parameters, at the time the largest model ever publicly announced.

It's multimodal (Chinese text, English text, images) and uses the Mixture of Experts (MoE) approach, in which each input is routed to only a small subset of the model's parameters, like Google's Switch Transformer.

In many ways it's a political statement: China declares that it can compete with the US, and with Google in particular, on the largest-scale models. No rigorous papers or independent benchmarks accompany the announcement, but it marks the start of the US-China AI race on foundation models.