OpenAI Preparedness Framework: evaluating catastrophic risks before release
In one sentence: OpenAI publishes the Preparedness Framework, a structured methodology for evaluating catastrophic risks in frontier models (CBRN weapons, cybersecurity, persuasion, model autonomy), with a public scorecard before each release.
Before releasing a very powerful AI model, how do you know whether it is safe enough? Until late 2023 there was no standard method: each company decided on its own. OpenAI created a formal framework to answer this question.
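The Preparedness Framework defines which risk categories must be evaluated (chemical, biological, radiological, and nuclear weapons; cybersecurity; persuasion, meaning large-scale influence and deception capabilities; and model autonomy), how to measure them, and the threshold beyond which a model cannot be released. In the version published in December 2023, only models rated "medium" or below after mitigations can be deployed.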
Evaluation results are published as scorecards: for each model, one score per risk category on a four-level scale (low, medium, high, critical). It is one of the first attempts to make the pre-deployment safety evaluation process transparent.
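To make the scorecard idea concrete, here is a minimal sketch in Python of how a per-category scorecard and a release threshold might be modeled. It is not OpenAI's implementation. The names `RiskLevel`, `RELEASE_THRESHOLD`, `can_release`, and `blocking_categories`, the category labels, and the example scores are all hypothetical. The "medium or below" cutoff is an assumption based on the December 2023 version of the framework.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Four-level scale used on the scorecard, ordered from least to most severe."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Assumption: the highest level at which a model may still be released.
RELEASE_THRESHOLD = RiskLevel.MEDIUM


def overall_level(scorecard):
    """The overall rating is the worst (highest) level across all categories."""
    return max(scorecard.values())


def can_release(scorecard, threshold=RELEASE_THRESHOLD):
    """A model passes only if no category exceeds the threshold."""
    return overall_level(scorecard) <= threshold


def blocking_categories(scorecard, threshold=RELEASE_THRESHOLD):
    """List the categories whose level is above the threshold, sorted by name."""
    return sorted(name for name, level in scorecard.items() if level > threshold)


# Hypothetical scores for illustration only, not real evaluation results.
example_scorecard = {
    "cbrn": RiskLevel.MEDIUM,
    "cybersecurity": RiskLevel.LOW,
    "persuasion": RiskLevel.HIGH,
}

if __name__ == "__main__":
    print("overall:", overall_level(example_scorecard).name)
    print("can release:", can_release(example_scorecard))
    print("blocked by:", blocking_categories(example_scorecard))
```

Taking the maximum across categories reflects the idea that a single high-risk category is enough to block a release, however low the other scores are.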
Companies
OpenAI
Tools
GPT-4, GPT-4o