AImpact
Inference | Intermediate | Also known as: Massive Multitask Language Understanding

MMLU

/em-em-el-you/

A benchmark of about 16,000 four-option multiple-choice questions across 57 subjects, from math and law to medicine, used to measure an LLM's general knowledge and problem-solving ability.


In practice

For years it was the headline benchmark cited in new model announcements. Today it is saturated: frontier models score above 85%, and the field is moving to harder benchmarks like MMLU-Pro and GPQA.
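A reported score is plain accuracy: the share of questions where the model picked the correct letter. With four options per question, random guessing lands near 25%. The sketch below shows one way such a score might be computed, both pooled over all questions and averaged per subject. The function name `mmlu_style_accuracy`, the record layout, and the toy data are assumptions for illustration, not part of any official evaluation harness.

```python
from collections import defaultdict


def mmlu_style_accuracy(records):
    """Score multiple-choice answers.

    records: iterable of (subject, predicted_letter, correct_letter) tuples.
    This layout is a hypothetical one chosen for the example.
    Returns (per-subject accuracy, pooled accuracy, subject-averaged accuracy).
    """
    counts = defaultdict(lambda: [0, 0])  # subject -> [correct, total]
    for subject, predicted, correct in records:
        counts[subject][0] += int(predicted == correct)
        counts[subject][1] += 1

    if not counts:
        raise ValueError("no records to score")

    subject_acc = {s: c / n for s, (c, n) in counts.items()}

    # Pooled ("micro") accuracy: every question carries equal weight.
    total_correct = sum(c for c, _ in counts.values())
    total = sum(n for _, n in counts.values())
    micro = total_correct / total

    # Subject-averaged ("macro") accuracy: every subject carries equal weight.
    macro = sum(subject_acc.values()) / len(subject_acc)
    return subject_acc, micro, macro


# Toy answers, invented for the example (not real benchmark data).
records = [
    ("law", "A", "A"),
    ("law", "B", "C"),
    ("medicine", "D", "D"),
    ("medicine", "B", "B"),
    ("medicine", "C", "C"),
    ("math", "A", "B"),
]
subject_acc, micro, macro = mmlu_style_accuracy(records)
print(subject_acc, round(micro, 3), round(macro, 3))
```

The two averages can differ when subjects have unequal question counts, so it is worth checking which one a given announcement reports before comparing numbers.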
