AImpact
Inference Intermediate Also known as: LLM giudice · Model-graded eval

LLM-as-judge

/el-el-em as judge/

A technique that uses an LLM (usually a strong one) to score answers, whether produced by another model or by the judge itself, against criteria written in natural language.
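A minimal sketch of the idea in Python, assuming the judge is reachable through some text-in, text-out function. The names `call_model`, `stub_model`, `JUDGE_TEMPLATE`, and the 1 to 5 scale are illustrative assumptions, not part of any specific library; the stub stands in for a real model call.

```python
import re
from typing import Callable, Optional

# Hypothetical prompt template: the criteria are plain natural language.
JUDGE_TEMPLATE = """You are grading an answer.
Criteria: {criteria}
Question: {question}
Answer: {answer}
Reply with a line of the form "Score: N", where N is an integer from 1 to 5."""


def build_judge_prompt(question: str, answer: str, criteria: str) -> str:
    """Fill the template with the item to be graded."""
    return JUDGE_TEMPLATE.format(
        criteria=criteria, question=question, answer=answer
    )


def parse_score(reply: str, lo: int = 1, hi: int = 5) -> Optional[int]:
    """Extract the integer score; return None if missing or out of range."""
    match = re.search(r"Score:\s*(\d+)", reply)
    if match is None:
        return None
    score = int(match.group(1))
    return score if lo <= score <= hi else None


def judge(
    question: str,
    answer: str,
    criteria: str,
    call_model: Callable[[str], str],
) -> Optional[int]:
    """Ask the judge model for a score and parse its reply."""
    return parse_score(call_model(build_judge_prompt(question, answer, criteria)))


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: replace with your client)."""
    return "The answer is correct and concise.\nScore: 4"


if __name__ == "__main__":
    print(judge("What is 2 + 2?", "4", "Be accurate and brief.", stub_model))
```

Returning `None` for unparseable replies, rather than a default score, keeps malformed judge output from silently skewing the results.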


In practice

It speeds up evaluation dramatically compared to human judges, but it suffers from biases: it tends to prefer longer answers and answers written in its own style. It must be calibrated against a subset of human judgments that serves as an anchor.
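One way to run that calibration, sketched in Python with the standard library only. It assumes you already have judge and human scores for the same anchor subset; the function names and the choice of Cohen's kappa plus a median-split length check are illustrative assumptions, not a prescribed method.

```python
from collections import Counter
from statistics import mean, median
from typing import Optional, Sequence


def agreement(judge: Sequence[int], human: Sequence[int]) -> float:
    """Fraction of anchor items where judge and human give the same score."""
    return sum(j == h for j, h in zip(judge, human)) / len(human)


def cohen_kappa(judge: Sequence[int], human: Sequence[int]) -> float:
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = agreement(judge, human)
    judge_freq, human_freq = Counter(judge), Counter(human)
    expected = sum(
        (judge_freq[label] / n) * (human_freq[label] / n)
        for label in set(judge) | set(human)
    )
    if expected == 1.0:
        # Convention used here: both raters gave one identical constant label.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def length_bias(
    lengths: Sequence[int], judge: Sequence[int], human: Sequence[int]
) -> Optional[float]:
    """Mean (judge - human) gap on long answers minus the gap on short ones.

    A positive value suggests the judge over-rewards longer answers
    relative to the human anchor. Returns None if either group is empty.
    """
    cutoff = median(lengths)
    rows = list(zip(lengths, judge, human))
    long_gaps = [j - h for n, j, h in rows if n > cutoff]
    short_gaps = [j - h for n, j, h in rows if n <= cutoff]
    if not long_gaps or not short_gaps:
        return None
    return mean(long_gaps) - mean(short_gaps)


if __name__ == "__main__":
    # Made-up anchor data, for illustration only.
    human_scores = [3, 3, 4, 4]
    judge_scores = [3, 3, 5, 5]
    answer_lengths = [10, 20, 300, 400]
    print(agreement(judge_scores, human_scores))
    print(cohen_kappa(judge_scores, human_scores))
    print(length_bias(answer_lengths, judge_scores, human_scores))
```

If agreement on the anchor subset is low, or the length gap is clearly positive, the judge prompt or scoring scale likely needs revision before its scores are trusted on the rest of the data.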
