Safety Intermediate Also known as: Avvelenamento dei dati

Data poisoning

An attack where an adversary inserts malicious examples into the training dataset to alter the behavior of the final model.

ShareLinkedIn X

In practice

Even a handful of corrupted documents in a web crawl can create persistent backdoors or biases. Particularly risky for models that continuously train on public content or are fine-tuned on unvetted third-party datasets.

Related terms

Backdoor attack Fine-tuning Red teaming

Seen in the wild

2 entries mentioning it

August 6, 2024

NIST AI 600-1: risk profile for generative AI systems

Medium
February 6, 2024

Indirect Prompt Injection: the attack vector in RAG systems and AI agents

High

← All terms