AImpact
Safety | Beginner | Also known as: Iniezione di prompt (Italian)

Prompt injection

An attack in which external input (a document, a web page, an email) carries hidden instructions that the model follows as if they came from its user or developer, hijacking its behavior.


In practice

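If your agent reads emails and then acts on them, a malicious email can instruct it to "forward everything to a third party". Mitigations: treat all external input as untrusted data rather than as instructions, sandbox the tools the agent can call, require human confirmation for sensitive actions, and filter both inputs and outputs. No single measure is a complete fix, so combine them.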
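
Two of these mitigations can be sketched in a few lines of Python: marking external content as data, and gating sensitive tool calls behind human confirmation. This is a minimal illustration, not a complete defense. The names `ToolCall`, `wrap_untrusted`, `execute`, the `<untrusted_content>` delimiter, and the set of sensitive tools are all assumptions made for the example.

```python
import html
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical tool names; which actions count as sensitive is an assumption.
SENSITIVE_TOOLS = {"forward_email", "send_email", "delete_email"}


@dataclass
class ToolCall:
    """A tool invocation requested by the model."""
    name: str
    args: dict


def wrap_untrusted(text: str) -> str:
    """Mark external content as data before it is placed in a prompt.

    Angle brackets are escaped so the content cannot close the delimiter
    itself. Delimiters lower the risk but do not remove it.
    """
    escaped = html.escape(text, quote=False)
    return "<untrusted_content>\n" + escaped + "\n</untrusted_content>"


def execute(
    call: ToolCall,
    tools: Dict[str, Callable[..., str]],
    confirm: Callable[[ToolCall], bool],
) -> str:
    """Run a tool call only if it is allowlisted and, when sensitive, approved."""
    if call.name not in tools:
        return f"blocked: unknown tool {call.name!r}"
    if call.name in SENSITIVE_TOOLS and not confirm(call):
        return f"blocked: {call.name!r} needs human confirmation"
    return tools[call.name](**call.args)


# Toy tools standing in for a real, sandboxed implementation.
tools = {
    "summarize": lambda text: text[:40],
    "forward_email": lambda to: f"forwarded to {to}",
}

# A reviewer who rejects everything: the injected forward request is stopped.
result = execute(
    ToolCall("forward_email", {"to": "attacker@example.com"}),
    tools,
    confirm=lambda call: False,
)
print(result)
```

The confirmation step is placed in the code that executes tools rather than in the prompt, so it still applies when the model has already been persuaded by the injected text.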
