In practice
AI labs do it both in-house and with external experts before shipping a model. If you run AI in production, do the same for your product: ask colleagues to try to break it before customers do. Even one rough hour of this beats finding your first bug in public.
Seen in the wild
4 entries mentioning it:
- [Medium] Promptfoo Red Teaming: open source automated red-teaming with CI integration and comparative benchmark
- [Medium] Garak: the open source vulnerability scanner for LLMs
- [High] MITRE ATLAS v2: the AI attack taxonomy updated with real case studies
- [High] Red Teaming LLMs with LLMs: the DeepMind paper that changed safety testing