Building Safe Financial Agents

$45M lost to AI trading agent exploits. 60% of financial firms say agent misconfiguration is their top AI concern. The SEC is watching. Here's how to build financial agents that don't become liabilities.

Yaz Caleb | March 7, 2026 | 16 min read

In March 2025, a misconfigured AI trading agent at a mid-size hedge fund executed a series of unauthorized leveraged positions that resulted in $45M in losses before a human noticed. The agent had passed all functional testing. It had correct API credentials. It was authenticated. What it lacked was authorization: no per-trade limits, no approval thresholds, no segregation between the agent's ability to recommend a trade and its ability to execute one. That gap is not unique to trading. 60% of financial firms now cite agent misconfiguration as their top AI risk, and the SEC has made AI-related examination a priority for 2025 and 2026. The regulatory ground is shifting under financial agents, and the ones that survive will be the ones built with guardrails from the start.

The Regulatory Landscape

Three regulatory frameworks converge on AI agents in financial services, and each demands specific controls:

  • SEC AI Examination Priorities (2025-2026). The SEC's Division of Examinations added "AI and emerging technology" as a standalone priority for the first time. Examiners are looking at how firms use AI in trading, portfolio management, and client communications. They specifically flag: whether AI outputs are subject to human review before execution, whether firms have tested AI tools for bias and accuracy, and whether adequate disclosures are made to clients about AI use. An AI agent that autonomously executes trades without documented guardrails is a finding waiting to happen.
  • SR 11-7 (Federal Reserve Model Risk Management). The Fed's guidance on model risk management was written for statistical models but maps directly to AI agents. It requires: effective challenge of model outputs (an independent party must validate results), ongoing monitoring with quantitative thresholds, and a model inventory with documented limitations. An AI trading agent is a model. Its tool calls are model outputs. SR 11-7 requires that those outputs be challengeable, monitorable, and documented.
  • SOX Sections 302 and 404. For publicly traded companies, Sarbanes-Oxley requires CEO/CFO certification that internal controls over financial reporting are effective (Section 302) and independent auditor attestation of those controls (Section 404). If an AI agent can initiate, approve, or modify financial transactions, it is part of your internal control environment. An agent that can both initiate and approve a transaction violates segregation of duties. An agent without audit logs makes Section 404 attestation impossible.

Five Financial Agent Risks

Financial agents create risk at five specific points. Each requires a distinct control:

  1. Unauthorized transactions. The agent executes a trade or transfer that no human authorized. This is the $45M scenario. The control: per-transaction authorization with amount thresholds and mandatory human approval above the threshold.
  2. Limit breaches. The agent stays within per-transaction limits but violates aggregate limits: position concentration, daily volume caps, counterparty exposure. The control: budget-scoped authorization that tracks cumulative exposure, not just individual transactions.
  3. Regulatory reporting failures. The agent executes reportable transactions without generating the required reports (SARs, CTRs, large trader reports). The control: post-action hooks that trigger reporting workflows when transaction characteristics match reporting thresholds.
  4. Segregation of duties violations. The agent both recommends and executes, or both initiates and approves. SOX and most internal control frameworks require separation between these functions. The control: role-scoped policies where an agent with "analyst" context can recommend but not execute, and an agent with "trader" context can execute but only pre-approved recommendations.
  5. Audit trail gaps. The agent takes actions that are not logged, or logged without sufficient context for reconstruction. The control: structured decision logging on every protect() call, with full tool call arguments, policy evaluation details, and execution results.
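
The post-action hook in risk 3 can be sketched in a few lines. This is a minimal illustration, not a compliance implementation: the ExecutedTransaction type, the required_reports function, and both threshold constants are hypothetical names and values chosen for the example. Real reporting triggers vary by transaction type and jurisdiction and should come from your compliance team.

```python
from dataclasses import dataclass
from decimal import Decimal

# Illustrative thresholds only (assumptions, not regulatory advice).
CTR_CASH_THRESHOLD = Decimal("10000")      # assumed cash-reporting trigger
LARGE_TRANSFER_REVIEW = Decimal("100000")  # hypothetical internal review trigger


@dataclass(frozen=True)
class ExecutedTransaction:
    """Hypothetical record of a transaction the agent has already executed."""
    tool: str
    amount: Decimal
    currency: str
    is_cash: bool = False


def required_reports(tx: ExecutedTransaction) -> list[str]:
    """Return the reporting workflows a completed transaction should trigger."""
    reports = []
    if tx.is_cash and tx.amount > CTR_CASH_THRESHOLD:
        reports.append("CTR")
    if tx.tool == "transfer_funds" and tx.amount >= LARGE_TRANSFER_REVIEW:
        reports.append("internal_large_transfer_review")
    return reports
```

Running the check after execution, rather than inside the agent's prompt or tool handler, keeps the reporting decision out of the model's hands: the agent cannot skip a report it never controls.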

Implementation: Financial Agent with protect()

The core pattern wraps every financial tool call in protect(). The agent code is clean. The authorization logic lives entirely in the policy, not in application code:

financial_agent.py (Python)
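import anthropic
from veto import Veto, Decision

# execute_tool(name, arguments), awaited in the loop below, is assumed to be
# an async dispatcher defined elsewhere in the application.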
client = anthropic.Anthropic()
veto = Veto(api_key="veto_live_xxx", project="trading-agent")

TOOLS = [
    {
        "name": "get_market_data",
        "description": "Fetch current price and volume for a ticker",
        "input_schema": {
            "type": "object",
            "properties": {
                "ticker": {"type": "string"},
                "exchange": {"type": "string", "enum": ["NYSE", "NASDAQ", "LSE"]},
            },
            "required": ["ticker"],
        },
    },
    {
        "name": "execute_trade",
        "description": "Submit a trade order to the broker",
        "input_schema": {
            "type": "object",
            "properties": {
                "ticker": {"type": "string"},
                "side": {"type": "string", "enum": ["buy", "sell"]},
                "quantity": {"type": "integer"},
                "order_type": {"type": "string", "enum": ["market", "limit"]},
                "limit_price": {"type": "number"},
            },
            "required": ["ticker", "side", "quantity", "order_type"],
        },
    },
    {
        "name": "transfer_funds",
        "description": "Transfer funds between accounts",
        "input_schema": {
            "type": "object",
            "properties": {
                "from_account": {"type": "string"},
                "to_account": {"type": "string"},
                "amount": {"type": "number"},
                "currency": {"type": "string"},
            },
            "required": ["from_account", "to_account", "amount", "currency"],
        },
    },
    {
        "name": "approve_recommendation",
        "description": "Record approval of a trading recommendation",
        "input_schema": {
            "type": "object",
            "properties": {
                "recommendation_id": {"type": "string"},
                "approved_by": {"type": "string"},
            },
            "required": ["recommendation_id", "approved_by"],
        },
    },
]


async def run_financial_agent(user_message: str, user_context: dict):
    """Financial agent with Veto authorization on every tool call."""
    messages = [{"role": "user", "content": user_message}]

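    while True:
        # Note: this synchronous client call blocks the event loop while it
        # waits; anthropic.AsyncAnthropic with `await` avoids that.
        response = client.messages.create(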
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            return response

        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        tool_results = []

        for block in tool_blocks:
            decision = veto.protect(
                tool=block.name,
                arguments=block.input,
                context={
                    "user_id": user_context["user_id"],
                    "role": user_context["role"],
                    "account_id": user_context["account_id"],
                    "desk": user_context.get("desk", "general"),
                },
            )

            if decision.action == Decision.ALLOW:
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })

            elif decision.action == Decision.DENY:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: {decision.reason}",
                    "is_error": True,
                })

            elif decision.action == Decision.APPROVAL_REQUIRED:
                approval = veto.wait_for_approval(
                    decision_id=decision.id,
                    timeout=decision.approval_timeout,
                )
                if approval.granted:
                    result = await execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
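                else:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"DENIED by {approval.reviewer}: {approval.reason}",
                        "is_error": True,
                    })

            else:
                # Fail closed: every tool_use block needs a tool_result,
                # so treat any unrecognized decision as a denial.
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: unrecognized decision {decision.action!r}",
                    "is_error": True,
                })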

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
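
The loop above awaits execute_tool, which the listing does not define. A minimal sketch of such a dispatcher follows. The handler bodies are hypothetical stand-ins for real market-data and broker integrations, and the HANDLERS registry is an assumption of this example rather than part of Veto or the Anthropic SDK.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


# Hypothetical handlers: replace with real market-data and broker calls.
async def get_market_data(ticker: str, exchange: str = "NYSE") -> dict:
    return {"ticker": ticker, "exchange": exchange, "price": None}


async def execute_trade(ticker: str, side: str, quantity: int,
                        order_type: str,
                        limit_price: Optional[float] = None) -> dict:
    return {"status": "submitted", "ticker": ticker,
            "side": side, "quantity": quantity}


# Registry mapping tool names (as declared in TOOLS) to handlers.
HANDLERS: dict[str, Callable[..., Awaitable[Any]]] = {
    "get_market_data": get_market_data,
    "execute_trade": execute_trade,
}


async def execute_tool(name: str, arguments: dict) -> Any:
    """Dispatch an already-authorized tool call; unknown tools raise."""
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"No handler registered for tool {name!r}")
    return await handler(**arguments)
```

Keeping the dispatcher free of limit checks is the point of the pattern: it runs only after protect() has returned an allow decision or an approval has been granted.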

Policy: Transaction Limits, Approvals, and Budget Caps

The policy defines the authorization rules declaratively. The agent code never changes regardless of how complex the rules get. Note the budget blocks: these are Veto's economic authorization feature, which tracks cumulative spend across multiple scopes (daily and weekly in this policy) without requiring any state management in your application code:

financial_agent_policy.yaml (YAML)
name: trading-agent-production
description: Authorization policy for trading desk agent

rules:
  - tool: get_market_data
    action: allow
    constraints:
      rate_limit: 500/hour

  - tool: execute_trade
    conditions:
      # Small trades: auto-approve
      - match:
          arguments.quantity: "<= 100"
          arguments.order_type: "limit"
        action: allow
        budget:
          scope: daily
          limit: 50000
          currency: USD
          track_by: context.account_id

      # Medium trades: require senior trader approval
      - match:
          arguments.quantity: "<= 1000"
        action: require_approval
        approval:
          channel: dashboard
          timeout: 300s
          reviewers:
            - role: senior_trader
          context_shown:
            - tool_name
            - arguments
            - session_history
            - portfolio_exposure

      # Large trades: require desk head + compliance
      - match:
          arguments.quantity: "> 1000"
        action: require_approval
        approval:
          tiers:
            - level: 1
              reviewers:
                - role: desk_head
              timeout: 600s
            - level: 2
              reviewers:
                - role: compliance_officer
              timeout: 1800s
          final_escalation: deny

  - tool: transfer_funds
    conditions:
      - match:
          arguments.amount: "<= 10000"
        action: allow
        budget:
          scope: daily
          limit: 100000
          currency: USD
          track_by: context.account_id

      - match:
          arguments.amount: "<= 100000"
        action: require_approval
        approval:
          channel: dashboard
          timeout: 600s
          reviewers:
            - role: treasury_ops

      - match:
          arguments.amount: "> 100000"
        action: deny
        reason: "Transfers > $100K require manual processing"

    budget:
      scope: weekly
      limit: 500000
      currency: USD
      track_by: context.account_id

  - tool: approve_recommendation
    conditions:
      # Segregation of duties: analyst role cannot approve
      - match:
          context.role: "analyst"
        action: deny
        reason: "Segregation of duties: analysts cannot approve recommendations"
      - match:
          context.role: "(senior_trader|desk_head)"
        action: allow

default_action: deny
logging:
  level: full
  retention: 7years
  reason: "SOX Section 802 — record retention"
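
To see how the execute_trade tiers interact, here is a small Python sketch of condition matching. It assumes the conditions are evaluated top to bottom and the first match wins, which the ordering of the tiers suggests but the policy does not state. The matches and evaluate helpers are hypothetical and are not Veto's engine.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def matches(expected: str, actual) -> bool:
    """Evaluate a match value such as '<= 100' or a literal like 'limit'."""
    for symbol in ("<=", ">=", "<", ">"):  # check two-character operators first
        if expected.startswith(symbol):
            threshold = float(expected[len(symbol):])
            return actual is not None and OPS[symbol](actual, threshold)
    return actual == expected


# The three execute_trade tiers from the policy, in order.
EXECUTE_TRADE_CONDITIONS = [
    ({"quantity": "<= 100", "order_type": "limit"}, "allow"),
    ({"quantity": "<= 1000"}, "require_approval"),
    ({"quantity": "> 1000"}, "require_approval"),
]


def evaluate(arguments: dict, conditions=EXECUTE_TRADE_CONDITIONS,
             default: str = "deny") -> str:
    """Return the action of the first condition whose fields all match."""
    for match, action in conditions:
        if all(matches(exp, arguments.get(key)) for key, exp in match.items()):
            return action
    return default


print(evaluate({"quantity": 50, "order_type": "limit"}))   # allow
print(evaluate({"quantity": 50, "order_type": "market"}))  # require_approval
```

Under that assumption, a 50-share market order skips the auto-approve tier because it is not a limit order, and falls through to senior trader approval. Checks like this are cheap to write as unit tests against the policy.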

Economic Authorization: Multi-Scope Budgets

Per-transaction limits are necessary but not sufficient. Even with a cap on every individual trade, an agent can build catastrophic aggregate exposure through a series of trades that each pass on their own. Veto's economic authorization tracks budgets across multiple scopes simultaneously:

budget_scopes.yaml (YAML)
# Budget tracking across multiple time windows and dimensions
budgets:
  per_trade:
    execute_trade:
      max_notional: 50000
      currency: USD

  daily:
    execute_trade:
      max_notional: 500000
      max_trades: 200
      currency: USD
      reset: "16:00 America/New_York"
    transfer_funds:
      max_amount: 100000
      max_transfers: 20
      currency: USD

  weekly:
    execute_trade:
      max_notional: 2000000
      currency: USD
    transfer_funds:
      max_amount: 500000
      currency: USD

  per_ticker:
    execute_trade:
      max_position_pct: 25
      basis: portfolio_value
      track_by: arguments.ticker

  per_counterparty:
    transfer_funds:
      max_exposure: 250000
      currency: USD
      track_by: arguments.to_account

When the daily budget for execute_trade hits $500,000, the next trade is denied regardless of its individual size. When a single ticker exceeds 25% of portfolio value, further buys of that ticker are blocked. The agent does not need to track any of this. Veto maintains the running totals and evaluates them on every protect() call.
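
The arithmetic behind multi-scope budgets is simple, which is part of the appeal. The sketch below uses the execute_trade limits from the YAML above. The BudgetTracker class is a hypothetical in-memory illustration, not how Veto stores state, and it omits window resets such as the 16:00 New York daily reset.

```python
from collections import defaultdict
from decimal import Decimal

# execute_trade limits taken from budget_scopes.yaml above.
LIMITS = {
    "per_trade": Decimal("50000"),
    "daily": Decimal("500000"),
    "weekly": Decimal("2000000"),
}


class BudgetTracker:
    """In-memory running totals per (scope, account); no window resets."""

    def __init__(self, limits=LIMITS):
        self.limits = limits
        self.spent = defaultdict(Decimal)  # missing keys start at Decimal("0")

    def check_and_record(self, account_id: str, notional: Decimal):
        """Return (allowed, reason); record the spend only if every scope passes."""
        if notional > self.limits["per_trade"]:
            return False, "per_trade limit exceeded"
        for scope in ("daily", "weekly"):
            if self.spent[(scope, account_id)] + notional > self.limits[scope]:
                return False, f"{scope} budget exceeded"
        for scope in ("daily", "weekly"):
            self.spent[(scope, account_id)] += notional
        return True, "ok"


tracker = BudgetTracker()
for _ in range(10):
    tracker.check_and_record("acct-1", Decimal("50000"))  # 10 x $50,000 = $500,000
print(tracker.check_and_record("acct-1", Decimal("1")))   # (False, 'daily budget exceeded')
```

Ten trades that each pass the per-trade cap exhaust the daily budget, so even a $1 trade is refused afterwards. All scopes are checked before any total is updated, so a denied trade never consumes budget.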

DIY Limit Checking vs Declarative Policy

Teams that build financial agents without a policy engine end up with limit checking scattered across application code. Every tool handler has its own validation logic, its own threshold constants, and its own logging format. Changing a limit means deploying new code. Adding a new approval tier means refactoring the execution loop. The comparison:

comparison.txt (text)
DIY Limit Checking                      Declarative Policy (Veto)
─────────────────────────────────────   ─────────────────────────────────────
Limits hardcoded in application code    Limits defined in YAML policy
Change requires code deploy             Change requires policy update (no deploy)
Each tool has its own validation        One protect() call per tool
Budget tracking is manual state mgmt    Budget tracking is automatic
Audit log format varies per tool        Structured audit log on every decision
Segregation of duties is ad-hoc         Segregation enforced by role in context
No aggregate limit tracking             Multi-scope budgets (daily/weekly/ticker)
Testing requires mocking business       Testing is policy evaluation (unit testable)
  logic + external services
Compliance evidence is scattered        Every decision is a compliance record
Adding approval tiers = refactor        Adding approval tiers = YAML change

The declarative approach is not just cleaner. It is auditable. When your SOX auditor asks "show me the control that prevents unauthorized transactions over $50,000," you point to the YAML policy and the decision log. With DIY limit checking, you point to a code review and hope the reviewer caught every edge case.

SR 11-7 Mapping: Agent Controls as Model Risk Management

SR 11-7 requires three things for every model: effective challenge, ongoing monitoring, and documentation. Here is how Veto maps to each:

  • Effective challenge. The require_approval action is effective challenge by definition. An independent human reviews the agent's proposed action and can approve, deny, or modify it. The approval log records who challenged, what they decided, and why.
  • Ongoing monitoring. Budget tracking provides quantitative monitoring. When the agent approaches a limit, Veto can alert before the limit is hit. Decision logs provide qualitative monitoring: denial rates, approval rates, and patterns that indicate drift in agent behavior.
  • Documentation. The YAML policy is the model's documentation. It describes exactly what the agent is authorized to do, under what conditions, with what limits. Policy version history tracks changes over time. Decision logs provide the evidence that the documented controls are actually enforced.
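
The denial and approval rates mentioned under ongoing monitoring can be computed directly from decision records. The sketch below assumes each record is a dict with an "action" field. That shape, the decision_rates and drifted helpers, and the 10-point tolerance are all hypothetical choices for illustration, not Veto's log format.

```python
from collections import Counter


def decision_rates(decisions: list[dict]) -> dict[str, float]:
    """Share of each outcome (allow, deny, ...) in a batch of decision records."""
    counts = Counter(d["action"] for d in decisions)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()} if total else {}


def drifted(baseline: dict, current: dict,
            action: str = "deny", tolerance: float = 0.10) -> bool:
    """Flag when an outcome's rate moves more than `tolerance` from baseline."""
    return abs(current.get(action, 0.0) - baseline.get(action, 0.0)) > tolerance


last_week = [{"action": "allow"}] * 8 + [{"action": "deny"}] * 2
this_week = [{"action": "allow"}] * 5 + [{"action": "deny"}] * 5
print(drifted(decision_rates(last_week), decision_rates(this_week)))  # True
```

A jump in the denial rate from 20% to 50% is the kind of quantitative threshold breach SR 11-7 expects to be noticed and investigated, whether the cause is a policy change or a change in agent behavior.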

Getting Started

Adding financial controls to an existing agent is a single integration point: wrap your tool execution with protect(), define your limits and approval thresholds in a YAML policy, and Veto handles budget tracking, approval routing, and audit logging. Your agent code stays identical. The controls are entirely external and auditable.

Start free to add financial guardrails to your agent, or read the financial agents implementation guide and SOC 2 compliance documentation for detailed walkthroughs.
