AImpact
High AI Security · 1 min read

CAIS Dangerous Capabilities Evaluations: the standard framework for measuring dangerous LLM capabilities

In one sentence: The Center for AI Safety publishes a structured framework for evaluating dangerous LLM capabilities in CBRN, cyberoffense, and autonomy; it has been adopted by UK AISI and integrated into Anthropic's Responsible Scaling Policy.

Verified Official source

How do you measure whether an AI model is dangerous enough to be stopped before release? Until a few years ago there was no methodological answer to this question.

The Center for AI Safety developed a structured evaluation framework for so-called dangerous capabilities: a model's ability to assist in synthesizing biological or chemical agents (part of the broader CBRN category: chemical, biological, radiological, and nuclear), to assist in offensive cyberattacks, and to operate autonomously toward self-assigned goals.

The framework defines specific benchmarks with risk thresholds, standardized test protocols, and a taxonomy of dangerous capabilities that enables comparisons between different models over time.
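As a rough illustration of how per-domain benchmarks with risk thresholds might be compared, here is a minimal Python sketch. It is not the framework's actual code: the names `Domain`, `BenchmarkResult`, `RISK_THRESHOLDS`, and `flagged_domains`, the model name, and every numeric value are hypothetical, invented here only to show the idea of checking a model's scores against a threshold for each category in the taxonomy.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    """Hypothetical taxonomy mirroring the three areas named in the article."""
    CBRN = "cbrn"
    CYBEROFFENSE = "cyberoffense"
    AUTONOMY = "autonomy"


@dataclass(frozen=True)
class BenchmarkResult:
    model: str
    domain: Domain
    score: float  # assumed to be a fraction of tasks solved, 0.0 to 1.0


# Placeholder thresholds, invented for illustration only.
RISK_THRESHOLDS = {
    Domain.CBRN: 0.20,
    Domain.CYBEROFFENSE: 0.40,
    Domain.AUTONOMY: 0.30,
}


def flagged_domains(results):
    """Return the set of domains whose score meets or exceeds its threshold."""
    return {r.domain for r in results if r.score >= RISK_THRESHOLDS[r.domain]}


# Made-up scores for a made-up model.
results = [
    BenchmarkResult("model-a", Domain.CBRN, 0.05),
    BenchmarkResult("model-a", Domain.CYBEROFFENSE, 0.55),
    BenchmarkResult("model-a", Domain.AUTONOMY, 0.10),
]
print(sorted(d.value for d in flagged_domains(results)))  # ['cyberoffense']
```

Keeping the thresholds in one table keyed by domain means the same check can be rerun on different models or later versions, which is the kind of comparison over time the taxonomy is meant to support.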

This type of evaluation is now an integral part of the deployment process at major AI labs: Anthropic has integrated it into its Responsible Scaling Policy, and UK AISI uses it as the basis for frontier model evaluations.

Companies

CAIS, Anthropic, UK AISI

Tags

CAIS, Dangerous Capabilities, Evaluation Framework, CBRN, Cyberoffense, UK AISI, RSP, Anthropic
