Inference | Intermediate | Also known as: Graduate-Level Google-Proof Q&A

GPQA

/jee-pee-kew-ay/

A benchmark of 448 multiple-choice questions written by domain experts (people who hold or are pursuing a PhD) in biology, physics, and chemistry, designed to be hard for non-experts even with unrestricted Google access.


In practice

It is increasingly used in place of MMLU as the gauge of deep scientific knowledge. Domain-expert humans score around 65%, against a 25% random-guess baseline on its four-option format; frontier models in 2025 exceed 70%. It remains one of the benchmarks that is not yet saturated.
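To make those percentages concrete, here is a minimal sketch of how accuracy on a four-option multiple-choice benchmark of this kind might be computed. It is not the official GPQA evaluation harness: the `Item` class, the `accuracy` function, and the `answer_fn` callback are hypothetical names introduced for illustration, and the per-question option shuffling is an assumption about how position bias might be avoided.

```python
import random
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical container for one multiple-choice question."""
    question: str
    correct: str
    distractors: list[str]


def accuracy(items, answer_fn, seed=0):
    """Return the fraction of items answered correctly.

    answer_fn(question, options) is an assumed callback that returns the
    index of the chosen option. Options are shuffled per item (with a
    fixed seed) so the correct answer has no fixed position.
    """
    if not items:
        raise ValueError("items must not be empty")
    rng = random.Random(seed)
    hits = 0
    for item in items:
        options = [item.correct, *item.distractors]
        rng.shuffle(options)
        choice = answer_fn(item.question, options)
        if options[choice] == item.correct:
            hits += 1
    return hits / len(items)
```

A model wrapper would be passed as `answer_fn`; a function that picks an index uniformly at random should converge to roughly 0.25 on a large set, which is the baseline the human and model scores are read against.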
