AImpact
Medium Foundation Models · 1 min read

mT5: a multilingual T5 over 101 languages

In one sentence: Google Research publishes mT5, a T5 variant pre-trained on mC4 (a multilingual Common Crawl corpus) covering 101 languages, which becomes a standard baseline for many cross-lingual NLP tasks.

Verified Official source

T5 is a Google model known for turning any NLP task into a "text in → text out" problem. But its released checkpoints were pre-trained almost entirely on English text, so for anyone working in another language, Italian included, it was of little practical use.
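To make the "text in → text out" idea concrete, here is a minimal Python sketch of how different tasks can be written as plain string pairs. The "translate ... to ...:" and "summarize:" prefixes follow the style used in the T5 paper; the `TextToTextExample` class, the helper functions, and the "classify sentiment:" prefix are hypothetical names chosen for illustration and are not part of any library. No model is loaded here; the sketch only shows the data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextToTextExample:
    """One training pair: the model reads `source` and must generate `target`."""
    source: str
    target: str


def translation_example(text: str, translation: str,
                        src_lang: str, tgt_lang: str) -> TextToTextExample:
    # The task is named in the input string itself, as a plain-text prefix.
    return TextToTextExample(f"translate {src_lang} to {tgt_lang}: {text}", translation)


def summarization_example(document: str, summary: str) -> TextToTextExample:
    return TextToTextExample(f"summarize: {document}", summary)


def classification_example(sentence: str, label: str) -> TextToTextExample:
    # Even a class label is produced as text, not as an index into a label set.
    # The prefix below is an illustrative assumption.
    return TextToTextExample(f"classify sentiment: {sentence}", label)


examples = [
    translation_example("Good morning", "Buongiorno", "English", "Italian"),
    summarization_example("mT5 is a multilingual variant of T5.", "mT5 is multilingual T5."),
    classification_example("Questo film è bellissimo", "positive"),
]

for ex in examples:
    print(f"{ex.source!r} -> {ex.target!r}")
```

Because every task shares this one format, the same model, loss, and decoding procedure can serve all of them. With mT5 the same framing applies, though the prefix wording is something you choose at fine-tuning time, since the released checkpoints were pre-trained on unlabeled text only.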

mT5 is the multilingual version: the same architecture and text-to-text recipe, but pre-trained on text in 101 languages, from Arabic to Vietnamese. Once fine-tuned, it can translate, summarize, and answer questions in any of them, and a single model can be adapted to a new task without starting from scratch for each language.

For anyone building applications that need to work in multiple countries without retraining a model for each one, it's a practical turning point.

Companies

Google

Tools

mT5

Tags

Google, T5, mT5, Multilingual, Text-to-Text
