EU AI Act Compliance for AI Agents

The August 2026 deadline is four months away. Articles 9, 12, 13, 14, and 26 mapped to concrete technical controls. Fines up to 35M euros or 7% of turnover. Here is your compliance blueprint.

Kyrie Kirk · February 14, 2026 · 16 min

The EU AI Act entered into force on August 1, 2024. The prohibited practices took effect in February 2025. The general-purpose AI (GPAI) rules apply from August 2025. And the high-risk AI system requirements — the ones that affect most AI agent deployments — take full effect on August 2, 2026. That is four months from today. Fines for non-compliance reach 35 million euros or 7% of global annual turnover, whichever is higher. This is not a theoretical regulation. National market surveillance authorities are already staffing up enforcement teams.

Most compliance guides for the EU AI Act focus on the regulation's text. This guide focuses on implementation: the specific technical controls you need, the YAML policies that encode them, the audit records that prove compliance, and the exact Articles you need to satisfy. If you are deploying AI agents in the EU or serving EU customers, this is your engineering blueprint.

Timeline: What Applies When

The EU AI Act rolls out in phases. Each phase brings new obligations:

eu_ai_act_timeline.txt
Phase 1 — February 2, 2025 (ALREADY IN EFFECT)
├── Prohibited AI practices banned (Article 5)
├── Social scoring, real-time remote biometric identification (most cases)
├── Emotion recognition in workplaces and schools
├── AI literacy obligations for providers and deployers (Article 4)
└── Penalties: up to 35M EUR or 7% of global turnover

Phase 2 — August 2, 2025 (ALREADY IN EFFECT)
├── General-Purpose AI (GPAI) model obligations (Articles 51-56)
├── Transparency requirements for GPAI
├── Systemic risk assessment for high-capability models
└── Governance and most penalty provisions take effect

Phase 3 — August 2, 2026 (4 MONTHS AWAY)
├── HIGH-RISK AI system requirements (Articles 6-27)
├── Risk management systems (Article 9)
├── Data governance (Article 10)
├── Technical documentation (Article 11)
├── Record-keeping / logging (Article 12)
├── Transparency to users (Article 13)
├── Human oversight (Article 14)
├── Accuracy, robustness, cybersecurity (Article 15)
├── Deployer obligations (Article 26)
└── Penalties: up to 15M EUR or 3% of global turnover

Phase 4 — August 2, 2027
└── Obligations for AI systems that are safety components
    of products covered by EU harmonization legislation

Annex III: Is Your AI Agent High-Risk?

Not every AI system is high-risk. The EU AI Act defines high-risk categories in Annex III. AI agents fall into high-risk classification if they operate in any of these domains:

  • Critical infrastructure — Agents managing energy grids, water supply, transportation, or digital infrastructure.
  • Education and training — Agents that determine access to education, evaluate learning outcomes, or assess qualifications.
  • Employment and workers' management — Agents that screen CVs, make hiring decisions, evaluate performance, or allocate tasks.
  • Essential services — Agents involved in credit scoring, insurance risk assessment, or public benefit determination.
  • Law enforcement and justice — Agents used in criminal profiling, evidence evaluation, or judicial decision support.
  • Migration and border control — Agents that process visa applications, assess asylum claims, or perform identity verification.
  • Healthcare — Agents that assist in diagnosis, treatment planning, triage, or medical device operation.

If your AI agent operates in any Annex III domain, you must comply with Articles 9-15 and Article 26 by August 2, 2026. Even if your agent is not high-risk, Article 4 requires AI literacy for all deployers, and best practices suggest implementing the controls anyway as a risk management measure.
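As a first-pass screen, the domain list above reduces to a set intersection. This is a toy sketch only — the domain tags are our own shorthand for the Annex III categories, and a real classification decision needs legal review:

```python
# Toy Annex III screening helper. The domain tags paraphrase the
# categories listed above; they are illustrative, not the Act's wording.
ANNEX_III_DOMAINS = {
    "critical_infrastructure", "education", "employment",
    "essential_services", "law_enforcement", "migration", "healthcare",
}

def is_high_risk(agent_domains: set) -> bool:
    """Screened as high-risk if the agent touches any Annex III domain."""
    return bool(agent_domains & ANNEX_III_DOMAINS)

print(is_high_risk({"healthcare", "internal_tooling"}))  # True
print(is_high_risk({"internal_tooling"}))                # False
```

Even a crude check like this is useful as a gate in CI: tag every agent with its operating domains and fail the build if a high-risk agent ships without the controls described below in place.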

Article 9: Risk Management System

Article 9 requires a "risk management system" that is "established, implemented, documented and maintained" throughout the AI system's lifecycle. This is not a one-time assessment. It is a continuous process that must identify risks, evaluate their likelihood and severity, and implement mitigation measures that are tested and validated.

For AI agents, the primary risks are unauthorized tool actions, data leakage, and uncontrolled autonomous behavior. Veto's policy engine directly implements the mitigation layer: policies define what the agent is allowed to do, shadow mode lets you test policies against real traffic before enforcement, and the decision dashboard shows risk metrics in real time.

policies/article-9-risk-management.yaml
name: healthcare-agent-risk-controls
description: "Article 9 risk management for healthcare triage agent"

risk_classification: high
annex_iii_category: healthcare
last_risk_assessment: "2026-03-15"
next_risk_assessment: "2026-06-15"

rules:
  # Risk: agent makes diagnosis without clinical validation
  - tool: suggest_diagnosis
    action: require_approval
    approval:
      channel: dashboard
      reviewer_pool:
        - role: licensed_clinician
      timeout: 1800s
      escalation: deny
    risk_level: high
    mitigation: "Human clinician must validate all diagnostic suggestions"

  # Risk: agent accesses patient records beyond scope
  - tool: query_patient_records
    conditions:
      - match:
          context.treating_clinician: "true"
          arguments.patient_id: "context.assigned_patients"
        action: allow
      - match:
          arguments.patient_id: ".*"
        action: deny
        reason: "Access limited to assigned patients"
    risk_level: medium
    mitigation: "Strict patient-scope enforcement via policy"

  # Risk: agent sends clinical information externally
  - tool: send_communication
    conditions:
      - match:
          arguments.channel: "^internal_"
        action: allow
      - match:
          arguments.channel: ".*"
        action: deny
        reason: "External clinical communications prohibited"
    risk_level: high
    mitigation: "All clinical communications restricted to internal channels"

default_action: deny

shadow_mode:
  enabled: true
  duration: "30days"
  compare_with: "previous_policy_version"
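The shadow_mode block implies a comparison step: evaluate the candidate policy alongside the enforced one on the same traffic and measure where they diverge. A minimal sketch of that divergence calculation — the record shape here is illustrative, not Veto's actual log schema:

```python
from collections import Counter

def shadow_divergence(records):
    """Compare enforced vs. shadow decisions on the same tool calls.

    Each record is a dict with 'tool', 'enforced', and 'shadow' keys
    holding the decision each policy version produced.
    Returns (divergence rate, per-divergence counts).
    """
    diverging = Counter()
    for r in records:
        if r["enforced"] != r["shadow"]:
            diverging[(r["tool"], r["enforced"], r["shadow"])] += 1
    total = len(records)
    rate = sum(diverging.values()) / total if total else 0.0
    return rate, diverging

records = [
    {"tool": "query_patient_records", "enforced": "allow", "shadow": "allow"},
    {"tool": "send_communication", "enforced": "allow", "shadow": "deny"},
    {"tool": "suggest_diagnosis", "enforced": "require_approval",
     "shadow": "require_approval"},
]
rate, diffs = shadow_divergence(records)
print(f"divergence rate: {rate:.0%}")  # divergence rate: 33%
```

A 30-day shadow window with a near-zero divergence rate on benign traffic, plus deliberate red-team calls that the new policy catches, is exactly the kind of "tested and validated" mitigation evidence Article 9 asks for.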

Article 12: Record-Keeping

Article 12 requires "automatic recording of events (logs) over the lifetime of the system" that enable tracing of the system's operation. Article 26(6) requires deployers to retain these logs for "a period of at least six months." The logs must be sufficient to monitor the system's operation, identify risks, and facilitate post-market surveillance.

Veto's decision logs satisfy Article 12 by default. Every protect() call produces a structured record with full decision context. Here is the format:

article_12_audit_record.json
{
  "record_id": "aud_eu_4a5b6c7d8e9f",
  "timestamp": "2026-04-04T11:30:22.109Z",
  "event_type": "tool_call_decision",

  "system_identification": {
    "system_name": "healthcare-triage-agent-v2.3",
    "system_version": "2.3.1",
    "risk_classification": "high",
    "annex_iii_category": "healthcare",
    "conformity_assessment_id": "CA-2026-0892"
  },

  "operation_context": {
    "agent_id": "triage-agent-prod",
    "session_id": "sess_eu_abc123",
    "initiated_by": {
      "user_id": "nurse_447",
      "role": "triage_nurse",
      "facility": "clinic_berlin_01"
    }
  },

  "tool_call": {
    "tool": "suggest_diagnosis",
    "arguments": {
      "symptoms": ["persistent_cough", "fever", "fatigue"],
      "duration_days": 14,
      "patient_age_range": "45-55"
    }
  },

  "policy_evaluation": {
    "policy_name": "healthcare-agent-risk-controls",
    "policy_version": "1.4",
    "rule_matched": "rule_1_diagnosis_approval",
    "decision": "require_approval",
    "reason": "Diagnostic suggestions require clinician validation"
  },

  "human_oversight": {
    "approval_requested": true,
    "reviewer": {
      "user_id": "dr_823",
      "role": "licensed_clinician",
      "credential": "DE-MED-2019-4472"
    },
    "reviewed_at": "2026-04-04T11:32:45.882Z",
    "decision": "approved_with_modification",
    "modification": "Added differential diagnosis note",
    "review_duration_seconds": 143
  },

  "eu_ai_act_metadata": {
    "article_12_compliant": true,
    "article_14_oversight_provided": true,
    "retention_minimum": "6months",
    "retention_configured": "3years"
  }
}
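The six-month retention floor in Article 26(6) is easy to verify mechanically: check that the oldest retained record spans the minimum window (a system deployed less than six months ago passes trivially by keeping everything). A small sketch, assuming ISO 8601 timestamps as in the record above:

```python
from datetime import datetime, timedelta, timezone

# "A period of at least six months" — Article 26(6). 183 days is a
# conservative reading of "six months".
MIN_RETENTION = timedelta(days=183)

def retention_ok(oldest_record_ts: str, now: datetime) -> bool:
    """True if the retained log window already spans the six-month minimum."""
    oldest = datetime.fromisoformat(oldest_record_ts.replace("Z", "+00:00"))
    return now - oldest >= MIN_RETENTION

now = datetime(2026, 4, 4, tzinfo=timezone.utc)
print(retention_ok("2025-08-02T00:00:00Z", now))  # True  (~8 months of logs)
print(retention_ok("2026-02-01T00:00:00Z", now))  # False (~2 months so far)
```

Run a check like this as a scheduled job against your log store's oldest record, and alert if a retention policy or storage migration silently truncates the window.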

Article 13: Transparency

Article 13 requires that high-risk AI systems "be designed and developed in such a way as to ensure that their operation is sufficiently transparent to enable deployers to interpret a system's output and use it appropriately." For AI agents, this means every decision must be explainable: why did the agent take this action, what policy governed it, and what data informed the decision.

Veto's decision logs include the policy_evaluation block with every record — showing which rule matched, what conditions were evaluated, and the human-readable reason for the decision. This is not a post-hoc explanation generated by another model. It is a deterministic record of the actual policy evaluation that produced the decision.

Article 14: Human Oversight

Article 14 is the most operationally demanding requirement. It mandates that high-risk AI systems include "human oversight measures" that enable natural persons to "effectively oversee" the system, "remain aware of the possible tendency of automatically relying on the output" (automation bias), and "be able to decide, in any particular situation, not to use the high-risk AI system or to otherwise disregard, override or reverse" its output.

This maps directly to Veto's human-in-the-loop approval workflows:

policies/article-14-oversight.yaml
name: article-14-human-oversight
description: "Article 14 compliance — human oversight for high-risk decisions"

rules:
  # All diagnostic outputs require clinician review
  - tool: suggest_diagnosis
    action: require_approval
    approval:
      channel: dashboard
      reviewer_pool:
        - role: licensed_clinician
      timeout: 1800s
      escalation: escalate_to_senior
      context_shown:
        - tool_name
        - arguments
        - model_reasoning
        - confidence_score
        - similar_past_decisions

  # Treatment plan suggestions — tiered approval
  - tool: suggest_treatment
    action: require_approval
    approval:
      tiers:
        - level: 1
          reviewers:
            - role: treating_physician
          timeout: 3600s
        - level: 2
          reviewers:
            - role: department_head
          timeout: 7200s
      context_shown:
        - full_patient_context
        - contraindication_check
        - evidence_sources

  # Emergency override — allow but flag for immediate review
  - tool: emergency_alert
    action: allow
    post_action:
      review_required: true
      channel: pagerduty
      reviewer_pool:
        - role: on_call_physician
      review_sla: 15minutes

oversight_dashboard:
  enabled: true
  metrics:
    - approval_rate_by_tool
    - average_review_time
    - override_rate
    - automation_bias_indicators

The automation_bias_indicators metric is specifically designed for Article 14 compliance: it tracks how often reviewers approve agent suggestions without modification, flagging patterns that suggest reviewers are rubber-stamping rather than genuinely evaluating the agent's output.
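One way such an indicator could be computed — a sketch of the general idea, not Veto's actual metric definition — is the fraction of reviews that were approved unmodified after only a few seconds of review:

```python
def automation_bias_score(reviews, fast_seconds=30):
    """Fraction of reviews that were approved, unmodified, and faster than
    `fast_seconds` — a rough proxy for rubber-stamping. The threshold is
    an assumption; tune it to your reviewers' realistic reading time."""
    if not reviews:
        return 0.0
    suspect = [
        r for r in reviews
        if r["decision"] == "approved"
        and not r["modified"]
        and r["review_duration_seconds"] < fast_seconds
    ]
    return len(suspect) / len(reviews)

reviews = [
    {"decision": "approved", "modified": False, "review_duration_seconds": 8},
    {"decision": "approved", "modified": True,  "review_duration_seconds": 143},
    {"decision": "approved", "modified": False, "review_duration_seconds": 12},
    {"decision": "denied",   "modified": False, "review_duration_seconds": 95},
]
print(automation_bias_score(reviews))  # 0.5
```

A score trending toward 1.0 for a given reviewer or tool is a signal to retrain reviewers or tighten the approval workflow — exactly the "awareness of automation bias" Article 14 demands.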

Article 26: Deployer Obligations

Article 26 is often overlooked because it targets deployers, not providers. If you are using an AI agent built by someone else (or built on top of a foundation model), you are a deployer. Your obligations include:

  • Implement provider instructions — Follow the provider's instructions for use, including any limitations on the system's intended purpose.
  • Assign human oversight — Ensure human oversight is carried out by individuals who have the "necessary competence, training and authority" to fulfill their oversight role.
  • Monitor operation — Monitor the AI system's operation on the basis of the provider's instructions and report any serious incidents to the provider and relevant authorities.
  • Retain logs — Keep logs generated by the AI system for at least six months, unless otherwise provided in sector-specific legislation.
  • Data protection impact assessment — Before putting the system into use, carry out a DPIA as required by GDPR Article 35.

Penalty Structure

The EU AI Act's penalties are tiered by violation severity:

penalty_structure.txt
┌────────────────────────────────────┬──────────────────────────────────┐
│ Violation                          │ Maximum Fine                     │
├────────────────────────────────────┼──────────────────────────────────┤
│ Prohibited AI practices            │ 35M EUR or 7% global turnover    │
│ (Article 5)                        │ (whichever is higher)            │
├────────────────────────────────────┼──────────────────────────────────┤
│ High-risk system non-compliance    │ 15M EUR or 3% global turnover    │
│ (Articles 6-27, incl. 9, 12-14)    │ (whichever is higher)            │
├────────────────────────────────────┼──────────────────────────────────┤
│ False information to authorities   │ 7.5M EUR or 1% global turnover   │
│                                    │ (whichever is higher)            │
├────────────────────────────────────┼──────────────────────────────────┤
│ SME / startup reduced fines        │ Lower of: percentage or fixed    │
│                                    │ amount (proportionality)         │
└────────────────────────────────────┴──────────────────────────────────┘

For context: 3% of global turnover for a company with $1B revenue = $30M.
GDPR's maximum (4% turnover) has produced fines of $1.3B (Meta, 2023).
The EU AI Act's enforcement is expected to follow similar patterns.
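The "whichever is higher" rule makes the applicable maximum a simple max(). A quick check of the arithmetic above:

```python
def max_fine(fixed_eur: float, pct: float, global_turnover_eur: float) -> float:
    """General EU AI Act fine cap: the higher of the fixed amount or the
    turnover percentage (for SMEs the Act flips this to the lower)."""
    return max(fixed_eur, pct * global_turnover_eur)

# High-risk non-compliance, company with 1B turnover: percentage dominates.
print(max_fine(15_000_000, 0.03, 1_000_000_000))  # 30000000.0 — the 30M above
# Same violation, 100M turnover: the fixed amount dominates.
print(max_fine(15_000_000, 0.03, 100_000_000))    # 15000000.0
```

Note that for SMEs and startups the cap is the lower of the two amounts, as the last row of the table indicates — so the same function with min() applies there.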

Requirements to Technical Controls

compliance_mapping.txt
┌──────────────┬───────────────────────────────┬───────────────────────────────┐
│ Article      │ Technical Control             │ Veto Feature                  │
├──────────────┼───────────────────────────────┼───────────────────────────────┤
│ Art. 9       │ Risk assessment + mitigation  │ Policy engine with risk       │
│ Risk Mgmt    │ Continuous monitoring         │ levels per rule. Shadow mode  │
│              │ Testing and validation        │ for pre-deployment testing.   │
│              │                               │ Decision dashboard for        │
│              │                               │ real-time risk metrics.       │
├──────────────┼───────────────────────────────┼───────────────────────────────┤
│ Art. 12      │ Automatic event logging       │ Every protect() call logged   │
│ Logging      │ Traceable operation records   │ with full context. Structured │
│              │ 6-month minimum retention     │ JSON format. Configurable     │
│              │                               │ retention (default: 1 year).  │
├──────────────┼───────────────────────────────┼───────────────────────────────┤
│ Art. 13      │ Interpretable outputs         │ policy_evaluation block in    │
│ Transparency │ Explainable decisions         │ every log: rule matched,      │
│              │ User-facing documentation     │ conditions evaluated, reason. │
│              │                               │ Deterministic, not generated. │
├──────────────┼───────────────────────────────┼───────────────────────────────┤
│ Art. 14      │ Human review workflows        │ require_approval action with  │
│ Oversight    │ Override/interrupt capability │ tiered escalation. Override   │
│              │ Automation bias monitoring    │ and argument modification.    │
│              │                               │ Approval rate metrics.        │
├──────────────┼───────────────────────────────┼───────────────────────────────┤
│ Art. 26      │ Deployer monitoring           │ Real-time dashboard. Log      │
│ Deployer     │ Log retention (6mo+)          │ retention configuration.      │
│ Obligations  │ Incident reporting            │ Alert rules + webhook         │
│              │ DPIA before deployment        │ integrations for incidents.   │
└──────────────┴───────────────────────────────┴───────────────────────────────┘

Practical Implementation Order

If you have four months until the August 2026 deadline, here is the order of implementation by priority and effort:

  1. Classify your systems (Week 1). Determine whether your AI agents fall under Annex III high-risk categories. If they do, everything below is mandatory. If not, implement anyway as defensible best practice.
  2. Enable audit logging (Week 2). Add protect() to every tool call path. This immediately satisfies Article 12 and gives you the data foundation for everything else.
  3. Define policies (Weeks 3-4). Write YAML policies that encode your risk management decisions. Start with deny-by-default and explicitly allow what is safe. Deploy in shadow mode first.
  4. Add human oversight (Weeks 5-6). Configure approval workflows for high-risk tool calls. Train your reviewers. Set up the reviewer dashboard and notification channels.
  5. Document and test (Weeks 7-8). Write your conformity documentation. Run adversarial tests against your policies. Generate sample audit reports. Validate retention configuration.
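Step 2 — putting a check in front of every tool call — can be sketched as a decorator. The names below (protect, deny_by_default) are illustrative only; the real Veto SDK API may differ, so treat this as a sketch of the pattern rather than the product's interface:

```python
import datetime
import functools
import json

def protect(policy):
    """Illustrative guard: evaluate a policy function before the tool runs,
    and emit an Article 12-style structured log line either way."""
    def decorator(tool_fn):
        @functools.wraps(tool_fn)
        def wrapper(**arguments):
            decision = policy(tool_fn.__name__, arguments)  # "allow" / "deny"
            print(json.dumps({
                "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                "tool": tool_fn.__name__,
                "decision": decision,
            }))
            if decision != "allow":
                raise PermissionError(f"{tool_fn.__name__}: {decision}")
            return tool_fn(**arguments)
        return wrapper
    return decorator

def deny_by_default(tool, arguments):
    """Toy policy mirroring the deny-by-default stance from step 3."""
    return "allow" if tool == "lookup_guideline" else "deny"

@protect(deny_by_default)
def lookup_guideline(topic: str) -> str:
    return f"guideline for {topic}"

print(lookup_guideline(topic="triage"))  # logs a decision, then returns the result
```

The point of the pattern is that the log line is emitted before the outcome branches: allowed and denied calls alike leave an audit record, which is what makes the logs usable as Article 12 evidence.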

Getting Started

The EU AI Act is the most consequential AI regulation in the world. Its extraterritorial reach means it affects any company serving EU customers, regardless of where the company is headquartered. The technical controls it requires — risk management, logging, transparency, and human oversight — are not optional features. They are legal obligations with nine-figure penalties.

Start free and begin implementing EU AI Act controls today. Our full EU AI Act compliance guide covers every article in detail, and our healthcare agent use case shows a complete Annex III high-risk implementation.
