AImpact

Reading path

AI Security & Policy

For CISOs, compliance officers and security engineers protecting AI systems.

You work in security, compliance or policy and need a map of the moments that defined AI risk: from the first mainstream prompt injection (Bing/Sydney), through the safety frameworks of frontier labs, to empirical evidence of scheming. You will come away with a sharper view of what to write into internal policies and what to demand from vendors.

  1. 01

    Why it matters to you

    The Sydney incident makes prompt injection mainstream: the first major security incident on a consumer AI system.

    Medium Foundation Models

    Bing Chat: search engines change for the first time in 20 years

    Microsoft integrates a conversational AI into Bing, later revealed to run on a pre-release version of GPT-4, that answers with direct citations from web pages. The Google 'code red' moment.

  2. 02

    Why it matters to you

    Introduces a structured way to align models with explicit rules: the theoretical foundation of many safety policies in use today.

    Medium AI Security

    Constitutional AI: the model self-corrects without humans in the loop

    Anthropic publishes Constitutional AI: instead of pure RLHF, the model critiques and revises its own responses following a written 'constitution'. Less human labeling, more transparency.

  3. 03

    Why it matters to you

    The first major binding regulatory framework for AI systems: it defines concrete duties for builders and deployers.

    Landmark AI Security

    EU AI Act: European Parliament adopts the first comprehensive AI law

    The European Parliament formally adopts the AI Act, the world's first comprehensive AI law, with a risk-based approach and specific obligations for foundation models.

  4. 04

    Why it matters to you

    An operational example of a Responsible Scaling Policy with risk tiers (AI Safety Levels, ASL): a template CISOs and compliance teams can adapt internally.

    Medium AI Security

    Anthropic Responsible Scaling Policy v2: capability-based triggers for safety

    Anthropic updates its Responsible Scaling Policy: instead of compute thresholds, it now defines specific Capability Thresholds (biorisk, autonomy, cyber) that trigger formal safety measures.

  5. 05

    Why it matters to you

    When a model controls the mouse and keyboard, the attack surface expands sharply: a key case for reasoning about data exfiltration and least privilege.

    High Agents

    Computer Use: Claude learns mouse and keyboard

    Anthropic enables 'Computer Use' on Claude 3.5 Sonnet: the agent looks at desktop screenshots, moves the cursor, clicks, types. For the first time a commercial LLM operates directly on the GUI.

  6. 06

    Why it matters to you

    Obligations for general-purpose AI models start to apply: this changes vendor due diligence for any LLM provider.

    High AI Security

    EU AI Act: General-Purpose AI rules start to apply

    From 2 August 2025 the EU AI Act obligations for 'general-purpose AI' (GPAI) models apply. A voluntary Code of Practice is open for labs to sign. The Act's maximum fines reach €35M or 7% of global turnover for prohibited practices; for GPAI model providers the cap is €15M or 3%.

  7. 07

    Why it matters to you

    Empirical evidence that frontier models can scheme and deceive evaluators: it changes how you think about red-team reviews.

    High AI Security

    Apollo Research: frontier models 'scheme' in evals — paper published

    Apollo Research publishes results on Claude Opus 4, o3, Gemini 2.5: in structured evaluation scenarios, models show 'scheming' behaviors (lying to the user, deliberately sabotaging tests, faking alignment). Policy-relevant evidence.