
Claude Agent Guardrails: Anthropic SDK Security

The complete guide to securing Claude agents with runtime authorization. Real Anthropic SDK code, the protect() pattern, YAML policies, and audit trails that satisfy SOC 2 and GDPR.

Yaz Caleb · February 21, 2026 · 14 min

Claude agents are uniquely powerful because Claude is uniquely willing to act. Give Claude a set of tools and a goal, and it will chain tool calls across dozens of turns to achieve that goal. This is exactly what makes Claude agents valuable — and exactly what makes them dangerous without runtime authorization. A Claude agent with access to a file system, shell, and API credentials has the same blast radius as a junior engineer with root access and no code review process.

Anthropic's own Claude Agent SDK ships with a hooks system for exactly this reason: the framework designers knew that tool_use decisions need external checkpoints. This guide covers the full authorization stack for Claude agents — from the Messages API's tool_use flow to the Agent SDK's hook system, with YAML policies, TypeScript and Python implementations, and audit evidence that maps to SOC 2 and GDPR requirements.

Claude's tool_use Flow

Claude's tool calling works through the Messages API. You define tools as JSON schemas in the tools parameter. Claude decides when to call a tool by returning a tool_use content block in its response. Your application executes the tool and sends the result back as a tool_result block. The conversation continues until Claude stops requesting tools.

The critical gap is between steps two and three: Claude returns a tool_use block, and your application executes it. In most tutorials and starter code, that execution is unconditional. Whatever Claude asks for, the application does. The protect() pattern inserts a policy evaluation at exactly that point.
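
To make that gap concrete, here is a minimal Python sketch of the two message shapes involved, with no network calls. The tool_use and tool_result field names follow the Messages API; the helper name build_tool_result and the sample IDs and values are assumptions made for illustration.

```python
from typing import Any


def build_tool_result(tool_use_id: str, content: str, is_error: bool = False) -> dict[str, Any]:
    """Build the tool_result block that answers one tool_use block."""
    block: dict[str, Any] = {
        "type": "tool_result",
        "tool_use_id": tool_use_id,  # must echo the id of the tool_use block
        "content": content,
    }
    if is_error:
        block["is_error"] = True
    return block


# A tool_use block as it appears in an assistant message (sample values).
tool_use = {
    "type": "tool_use",
    "id": "toolu_example_1",
    "name": "read_file",
    "input": {"path": "/workspace/README.md"},
}

# This is the point where the application decides whether to run the tool.
# Here the call is refused, and Claude is told so in the next user turn.
follow_up = {
    "role": "user",
    "content": [build_tool_result(tool_use["id"], "BLOCKED: example policy", is_error=True)],
}
```

Returning a tool_result for a refused call, rather than dropping it, keeps the conversation valid and lets Claude adjust its plan.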

The protect() Pattern

Wrap every tool_use block in a Veto protect() call before executing the tool. The pattern is identical in Python and TypeScript:

claude_protect_pattern.py (python)
import anthropic
from veto import Veto, Decision

client = anthropic.Anthropic()
veto = Veto(api_key="veto_live_xxx", project="claude-agent")

TOOLS = [
    {
        "name": "read_file",
        "description": "Read a file from the filesystem",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path to read"}
            },
            "required": ["path"],
        },
    },
    {
        "name": "execute_command",
        "description": "Run a shell command",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Shell command"}
            },
            "required": ["command"],
        },
    },
    {
        "name": "call_api",
        "description": "Make an HTTP API request",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string"},
                "method": {"type": "string", "enum": ["GET", "POST", "PUT", "DELETE"]},
                "body": {"type": "object"},
            },
            "required": ["url", "method"],
        },
    },
]

async def run_claude_agent(user_message: str, user_id: str):
    messages = [{"role": "user", "content": user_message}]

    while True:
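        # The synchronous client blocks the event loop here; anthropic.AsyncAnthropic with await avoids that.
        response = client.messages.create(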
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            return response

        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        tool_results = []

        for block in tool_blocks:
            decision = veto.protect(
                tool=block.name,
                arguments=block.input,
                context={
                    "user_id": user_id,
                    "model": "claude-sonnet-4-6",
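                    # response.id is a per-message ID and changes every turn; pass a stable ID here to correlate a whole session
                    "session_id": response.id,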
                    "turn_count": len(messages) // 2,
                },
            )

            if decision.action == Decision.ALLOW:
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
            elif decision.action == Decision.DENY:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"BLOCKED: {decision.reason}",
                    "is_error": True,
                })
            elif decision.action == Decision.APPROVAL_REQUIRED:
                approval = veto.wait_for_approval(
                    decision_id=decision.id,
                    timeout=decision.approval_timeout,
                )
                if approval.granted:
                    result = await execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
                else:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"DENIED: {approval.reason}",
                        "is_error": True,
                    })
            else:
                # Fail closed: every tool_use block needs a tool_result, even for an unrecognized decision
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": "BLOCKED: unrecognized authorization decision",
                    "is_error": True,
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
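
The loop above calls execute_tool, which the listing leaves undefined. One possible shape is a small registry that maps tool names to async handlers. The handler bodies below are hypothetical stand-ins, not part of the Anthropic SDK or Veto, and the shell handler deliberately does not run anything.

```python
import asyncio
from pathlib import Path
from typing import Any, Awaitable, Callable

ToolHandler = Callable[[dict[str, Any]], Awaitable[str]]


async def read_file(args: dict[str, Any]) -> str:
    # Run the blocking read in a worker thread so the event loop stays free.
    return await asyncio.to_thread(Path(args["path"]).read_text, encoding="utf-8")


async def execute_command(args: dict[str, Any]) -> str:
    # Hypothetical stand-in: a real handler would run the command in a sandbox.
    return f"(not executed) {args['command']}"


HANDLERS: dict[str, ToolHandler] = {
    "read_file": read_file,
    "execute_command": execute_command,
}


async def execute_tool(name: str, arguments: dict[str, Any]) -> str:
    """Dispatch an already-authorized tool call to its handler."""
    handler = HANDLERS.get(name)
    if handler is None:
        # Unknown tools are an error, never a silent no-op.
        raise ValueError(f"Unknown tool: {name}")
    return await handler(arguments)
```

Keeping dispatch separate from authorization means the handlers never need to know about policy, and the policy check cannot be skipped by calling a handler through the loop.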

Claude Agent SDK Hooks

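Anthropic's Claude Agent SDK (the framework behind Claude Code) includes a hooks system designed for exactly this use case. Hooks are lifecycle callbacks that fire at specific points in the agent loop. This guide refers to them as before_tool_call, after_tool_call, before_model_call, and after_model_call; the published SDK names its tool-level hook events PreToolUse and PostToolUse, so confirm the exact event names and callback signatures against the SDK reference for your version. Veto integrates at the before-tool hook to authorize every tool invocation before it executes.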

claude_agent_sdk_hooks.ts (typescript)
import { Agent, AgentHooks } from "@anthropic-ai/agent-sdk";
import { Veto, Decision } from "veto";

const veto = new Veto({ apiKey: "veto_live_xxx", project: "claude-code-agent" });

const authorizationHooks: AgentHooks = {
  async before_tool_call({ toolName, toolInput, context }) {
    const decision = await veto.protect({
      tool: toolName,
      arguments: toolInput,
      context: {
        userId: context.userId,
        sessionId: context.sessionId,
        agentId: context.agentId,
      },
    });

    if (decision.action === Decision.DENY) {
      return {
        abort: true,
        result: `BLOCKED: ${decision.reason}`,
      };
    }

    if (decision.action === Decision.APPROVAL_REQUIRED) {
      const approval = await veto.waitForApproval({
        decisionId: decision.id,
        timeout: decision.approvalTimeout,
      });
      if (!approval.granted) {
        return {
          abort: true,
          result: `DENIED by ${approval.reviewer}: ${approval.reason}`,
        };
      }
    }

    return { abort: false };
  },

  async after_tool_call({ toolName, toolInput, result, context }) {
    await veto.logExecution({
      tool: toolName,
      arguments: toolInput,
      result: typeof result === "string" ? result : JSON.stringify(result),
      context: { sessionId: context.sessionId },
    });
  },
};

const agent = new Agent({
  model: "claude-sonnet-4-6",
  tools: [readFile, executeCommand, callApi],
  hooks: authorizationHooks,
});
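
The control flow that a before-tool hook adds can be sketched without any SDK. The toy runner below is a Python illustration under stated assumptions, not the Agent SDK's API: HookResult, before_tool_call, run_tool_with_hooks, and the single deny rule are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class HookResult:
    abort: bool
    result: Optional[str] = None  # message returned to the model when aborted


def before_tool_call(tool_name: str, tool_input: dict[str, Any]) -> HookResult:
    # Stand-in for a policy check: block any shell command containing "rm -rf".
    if tool_name == "execute_command" and "rm -rf" in tool_input.get("command", ""):
        return HookResult(abort=True, result="BLOCKED: destructive command")
    return HookResult(abort=False)


def run_tool_with_hooks(
    tool_name: str,
    tool_input: dict[str, Any],
    tool: Callable[[dict[str, Any]], str],
    audit_log: list[dict[str, Any]],
) -> str:
    verdict = before_tool_call(tool_name, tool_input)
    if verdict.abort:
        # The tool function is never invoked on this path.
        return verdict.result or "BLOCKED"
    output = tool(tool_input)
    # After-hook: record what actually ran.
    audit_log.append({"tool": tool_name, "arguments": tool_input, "result": output})
    return output
```

The point of the sketch is the ordering: the authorization decision is made before the tool body runs, and the execution log is written only for calls that actually executed.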

YAML Policies for Claude Agents

Claude agents have characteristic tool patterns: file system access, shell execution, API calls, and code generation. Policies should reflect the specific risks of each:

policies/claude-agent.yaml (yaml)
name: claude-agent
description: "Runtime authorization for Claude-powered agents"

rules:
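  # File access: deny sensitive, traversal, and system paths, allow the project directory.
  # Assumes conditions are evaluated top to bottom and the first match wins, so the
  # sensitive-file and traversal denies must precede the /workspace/ allow.
  # Single quotes keep regex backslashes literal; "\." is not a valid YAML escape.
  - tool: read_file
    conditions:
      - match:
          arguments.path: '\.(env|pem|key|secret|credentials)$'
        action: deny
        reason: "Sensitive file access denied"
      - match:
          arguments.path: '(^|/)\.\.(/|$)'
        action: deny
        reason: "Path traversal denied"
      - match:
          arguments.path: '^/workspace/'
        action: allow
      - match:
          arguments.path: '^(/etc/|/var/|/usr/|/sys/|/proc/)'
        action: deny
        reason: "System path access denied"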

  # Shell commands: deny destructive patterns first, then allow read-only and
  # build commands, and send everything else to approval.
  # Prefix matching alone does not stop command chaining such as "ls; curl ...".
  - tool: execute_command
    conditions:
      - match:
          arguments.command: '(rm\s+-rf|dd\s|mkfs|chmod\s+777|curl.*\|.*sh)'
        action: deny
        reason: "Destructive or unsafe command pattern"
      - match:
          arguments.command: '^(ls|cat|grep|find|wc|head|tail|echo)\s'
        action: allow
      - match:
          arguments.command: '^(git|npm|pip|cargo)\s'
        action: allow
        logging:
          level: full
      - match:
          arguments.command: ".*"
        action: require_approval
        approval:
          channel: dashboard
          timeout: 120s

  # API calls: scope by domain and method
  - tool: call_api
    conditions:
      - match:
          arguments.method: "GET"
          # The trailing (/|$) rejects look-alike hosts such as api.company.com.evil.example
          arguments.url: '^https://(api\.company\.com|internal\.service)(/|$)'
        action: allow
      - match:
          arguments.method: "(POST|PUT|DELETE)"
        action: require_approval
        approval:
          channel: slack
          timeout: 300s
      - match:
          arguments.url: ".*"
        action: deny
        reason: "External API access not permitted"

default_action: deny
logging:
  level: full
  retention: 1year
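
How such a policy might be evaluated can be sketched in a few lines of Python. This is not Veto's engine. It assumes conditions are checked top to bottom, the first match wins, every pattern in a match must be found in its argument, and anything unmatched falls through to the default action. The rule data is adapted from the call_api rules, and the names RULES and evaluate are illustrative.

```python
import re
from typing import Any

DEFAULT_ACTION = "deny"

# Ordered conditions per tool, adapted from the call_api rules.
RULES: dict[str, list[dict[str, Any]]] = {
    "call_api": [
        {
            "match": {"method": r"^GET$", "url": r"^https://api\.company\.com(/|$)"},
            "action": "allow",
        },
        {
            "match": {"method": r"^(POST|PUT|DELETE)$"},
            "action": "require_approval",
        },
        {
            "match": {"url": r".*"},
            "action": "deny",
            "reason": "External API access not permitted",
        },
    ],
}


def evaluate(tool: str, arguments: dict[str, Any]) -> dict[str, str]:
    """Return the first matching condition's action, or the default."""
    for condition in RULES.get(tool, []):
        patterns = condition["match"]
        # Every listed argument must be present and match its pattern.
        if all(
            arg in arguments and re.search(pattern, str(arguments[arg])) is not None
            for arg, pattern in patterns.items()
        ):
            return {"action": condition["action"], "reason": condition.get("reason", "")}
    return {"action": DEFAULT_ACTION, "reason": "No rule matched"}
```

Under first-match semantics the order of conditions is part of the policy: a broad allow placed above a narrow deny silently disables the deny, which is why tools with no rules at all should still end in a default deny.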

TypeScript Implementation: Full Agent

claude_agent_full.ts (typescript)
import Anthropic from "@anthropic-ai/sdk";
import { Veto, Decision } from "veto";

const anthropic = new Anthropic();
const veto = new Veto({ apiKey: "veto_live_xxx", project: "claude-ts-agent" });

const tools: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read a file from the filesystem",
    input_schema: {
      type: "object" as const,
      properties: { path: { type: "string" } },
      required: ["path"],
    },
  },
  {
    name: "execute_command",
    description: "Run a shell command",
    input_schema: {
      type: "object" as const,
      properties: { command: { type: "string" } },
      required: ["command"],
    },
  },
];

async function runAgent(userMessage: string, userId: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await anthropic.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools,
      messages,
    });

    if (response.stop_reason !== "tool_use") return response;

    const toolBlocks = response.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
    );

    const toolResults: Anthropic.ToolResultBlockParam[] = [];

    for (const block of toolBlocks) {
      const decision = await veto.protect({
        tool: block.name,
        arguments: block.input as Record<string, unknown>,
        context: { userId, sessionId: response.id },
      });

      if (decision.action === Decision.ALLOW) {
        const result = await executeTool(block.name, block.input);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: String(result),
        });
      } else if (decision.action === Decision.DENY) {
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: `BLOCKED: ${decision.reason}`,
          is_error: true,
        });
      } else {
        const approval = await veto.waitForApproval({
          decisionId: decision.id,
          timeout: decision.approvalTimeout,
        });
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: approval.granted
            ? String(await executeTool(block.name, block.input))
            : `DENIED: ${approval.reason}`,
          is_error: !approval.granted,
        });
      }
    }

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}

Audit Trail: What Gets Logged

Every protect() call against a Claude agent tool_use block produces a complete audit record. Here is what a single denied file access looks like in the log:

audit_record_claude.json (json)
{
  "record_id": "aud_cl_9x8w7v6u5t4s",
  "timestamp": "2026-04-04T14:22:08.331Z",
  "event_type": "tool_call_decision",

  "identity": {
    "agent_id": "claude-agent-v2",
    "model": "claude-sonnet-4-6",
    "session_id": "msg_01XYZ...",
    "triggered_by": {
      "user_id": "user_892",
      "email": "dev@acmecorp.com"
    }
  },

  "tool_call": {
    "tool": "read_file",
    "arguments": {
      "path": "/etc/shadow"
    }
  },

  "policy_evaluation": {
    "policy_name": "claude-agent",
    "rule_matched": "rule_1_system_path_deny",
    "conditions_evaluated": [
      {"condition": "path starts with /workspace/", "result": false},
      {"condition": "path starts with /etc/", "result": true}
    ],
    "decision": "deny",
    "reason": "System path access denied"
  },

  "compliance_metadata": {
    "soc2_controls": ["CC6.1", "CC6.3", "CC7.2"],
    "retention_policy": "1year"
  }
}

SOC 2 Mapping for Claude Agent Audit Evidence

Claude agent audit logs map directly to SOC 2 Trust Services Criteria. The two most relevant controls:

  • CC6.1 (Logical Access Controls) — Every tool_use decision includes: user identity, agent identity, model version, session ID, and the policy that governed the decision. Your auditor can trace any agent action back to the human who initiated the session and the policy that authorized (or denied) it.
  • CC6.3 (Access Authorization) — YAML policies define per-tool, per-context authorization rules. Denial logs prove enforcement. Policy version history shows who changed what and when. The combination satisfies the auditor's question: "What was the agent authorized to do, and was that authorization appropriate?"
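
One way to check that records actually carry the fields the CC6.1 item lists is a small completeness test. The sketch below assumes the record layout shown in the audit example; the helper names lookup and missing_evidence and the dotted-path list are illustrative, not a Veto feature.

```python
from typing import Any

# Fields the CC6.1 description relies on, as dotted paths into the record.
REQUIRED_PATHS = [
    "identity.triggered_by.user_id",  # user identity
    "identity.agent_id",              # agent identity
    "identity.model",                 # model version
    "identity.session_id",            # session ID
    "policy_evaluation.policy_name",  # governing policy
    "policy_evaluation.decision",
]


def lookup(record: dict[str, Any], dotted_path: str) -> Any:
    """Walk a dotted path through nested dicts; return None if any key is absent."""
    current: Any = record
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def missing_evidence(record: dict[str, Any]) -> list[str]:
    """List required paths that are absent or empty in one audit record."""
    return [path for path in REQUIRED_PATHS if lookup(record, path) in (None, "")]
```

Running a check like this over exported records before an audit window closes surfaces gaps while they can still be fixed, for example a context that never set a user ID.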

For GDPR, every audit record includes the data subject context (when applicable) and the decision explanation. This supports the transparency obligations in Articles 13-15 (information about automated decision-making) and, when approvals are configured, the human-intervention safeguard in Article 22.

Getting Started

Adding authorization to a Claude agent is one protect() call per tool_use block. Adding hooks to a Claude Agent SDK agent is one configuration object. The policies live in YAML, the audit trail is automatic, and the compliance evidence writes itself.

Start free and secure your Claude agent today. Claude integration docs cover the full API surface, and our SOC 2 compliance guide maps every audit control to Veto features.
