Overview

klaw provides several mechanisms to control costs, enforce safety boundaries, and maintain reliability. This guide covers setting up cost budgets, tool approval gates, provider resilience, monitoring, and per-agent restrictions.

Cost Budgets

Setting a Session Budget

Add max_session_cost to your defaults configuration to cap spending per session:
[defaults]
model = "claude-sonnet-4-20250514"
max_session_cost = 5.00  # USD per session
When the accumulated cost reaches this limit, the agent stops with a budget_exceeded error. A warning is logged when cost reaches 80% of the budget.
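The budget behavior described above can be sketched roughly as follows. This is an illustration only, not klaw's actual implementation: the `SessionBudget` class, the `BudgetExceeded` exception, and the choice to log the warning only once are all assumptions made for the example.

```python
import logging


class BudgetExceeded(Exception):
    """Hypothetical stand-in for klaw's budget_exceeded error."""


class SessionBudget:
    """Illustrative tracker: warns at 80% of the cap and stops at 100%."""

    WARN_FRACTION = 0.80  # warning threshold stated in the guide

    def __init__(self, max_session_cost: float) -> None:
        self.max_session_cost = max_session_cost
        self.spent = 0.0
        self.warned = False

    def add(self, cost: float) -> None:
        """Record the cost of one LLM call and enforce the cap."""
        self.spent += cost
        if self.spent >= self.max_session_cost:
            raise BudgetExceeded(
                f"budget_exceeded: spent ${self.spent:.2f} "
                f"of ${self.max_session_cost:.2f}"
            )
        # Assumption: the warning fires once, the first time 80% is reached.
        if not self.warned and self.spent >= self.WARN_FRACTION * self.max_session_cost:
            self.warned = True
            logging.warning(
                "session cost at %.0f%% of budget",
                100 * self.spent / self.max_session_cost,
            )
```

With `max_session_cost = 5.00`, a session in this sketch warns once spending passes $4.00 and stops when it reaches $5.00.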

How Cost is Calculated

klaw tracks input and output tokens for every LLM call and multiplies by per-model pricing:
cost = (input_tokens / 1,000,000 * input_price) + (output_tokens / 1,000,000 * output_price)
Built-in pricing for common models:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| claude-sonnet-4-20250514 | $3.00 | $15.00 |
| claude-opus-4-20250514 | $15.00 | $75.00 |
| claude-3-5-haiku-20241022 | $0.80 | $4.00 |
| openai/gpt-4o | $2.50 | $10.00 |
| deepseek/deepseek-chat | $0.14 | $0.28 |
Unknown models default to $3.00 input / $15.00 output per million tokens.
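As a worked example of the formula, the sketch below reproduces the calculation in Python. The prices come from the built-in pricing listed above. The `PRICING` dictionary and the `estimate_cost` function are hypothetical names used for illustration, not part of klaw.

```python
# USD per 1M tokens as (input, output), copied from the built-in pricing list.
PRICING = {
    "claude-sonnet-4-20250514": (3.00, 15.00),
    "claude-opus-4-20250514": (15.00, 75.00),
    "claude-3-5-haiku-20241022": (0.80, 4.00),
    "openai/gpt-4o": (2.50, 10.00),
    "deepseek/deepseek-chat": (0.14, 0.28),
}

# Fallback used for models that are not in the list.
DEFAULT_PRICING = (3.00, 15.00)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single LLM call."""
    input_price, output_price = PRICING.get(model, DEFAULT_PRICING)
    return (
        input_tokens / 1_000_000 * input_price
        + output_tokens / 1_000_000 * output_price
    )
```

For instance, a call to claude-sonnet-4-20250514 with 10,000 input tokens and 2,000 output tokens costs about $0.03 + $0.03 = $0.06.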

Tips for Cost Control

Begin with max_session_cost = 2.00 for interactive sessions and increase as needed. For automated pipelines, set higher limits but always have a cap.
Combine cost budgets with max_iterations to prevent runaway loops:
[agent.default]
max_iterations = 30
Use Haiku ($0.80/1M input) for simple tasks like routing, Sonnet ($3/1M) for most work, and Opus ($15/1M) only for complex reasoning.

Tool Approval

Configuring Approval Gates

Require user confirmation before specific tools execute:
[agent.default]
require_approval = ["bash", "write"]
When the agent tries to use a tool that requires approval, the user sees:
⚠ Tool 'bash' requires approval. Execute? [y/N]:
  • Type y or yes to approve
  • Any other response (or Enter) denies execution
  • Denied tools return “Denied by user” to the LLM, which can adjust its approach
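The approval rules above amount to a small gate in front of tool execution. The following is a minimal sketch of that gate. The `is_approved` and `run_tool` functions are hypothetical, and treating the answer as case-insensitive with surrounding whitespace ignored is an assumption the guide does not confirm.

```python
from typing import Callable, Iterable


def is_approved(response: str) -> bool:
    """Only 'y' or 'yes' approve; anything else (including Enter) denies.

    Assumption: case and surrounding whitespace are ignored.
    """
    return response.strip().lower() in {"y", "yes"}


def run_tool(
    name: str,
    require_approval: Iterable[str],
    execute: Callable[[], str],
    ask: Callable[[str], str] = input,
) -> str:
    """Run a tool, prompting first if it is listed in require_approval."""
    if name in require_approval:
        answer = ask(f"⚠ Tool '{name}' requires approval. Execute? [y/N]: ")
        if not is_approved(answer):
            # This string goes back to the LLM so it can adjust its approach.
            return "Denied by user"
    return execute()
```

Passing `ask` as a parameter keeps the prompt replaceable, which makes the gate easy to exercise without a terminal.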

Which Tools to Gate

| Risk Level | Tools | Recommendation |
|---|---|---|
| High | bash | Always require approval for untrusted tasks |
| Medium | write, edit | Require approval when agents modify important files |
| Low | read, glob, grep | Usually safe to allow freely |
| Low | web_fetch, web_search | Safe for research agents |

Provider Resilience

Retry Configuration

Each provider retries failed requests with exponential backoff:
[provider.anthropic]
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4-20250514"
max_retries = 3
Retry behavior:
  • Backoff: Starts at 1s, doubles each attempt, caps at 30s
  • Jitter: Random 0-25% added to prevent thundering herd
  • Retryable errors: HTTP 429 (rate limit), 500, 502, 503, timeouts, connection resets
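To make the backoff schedule concrete, here is a rough sketch of how such a delay could be computed. The `backoff_delay` function is hypothetical. Applying the jitter after the 30s cap, and counting attempts from zero, are assumptions rather than documented klaw behavior.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    max_jitter: float = 0.25,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based).

    The delay starts at `base`, doubles each attempt, and is capped at `cap`.
    A random 0-25% is then added on top (assumption: jitter is applied
    after the cap).
    """
    delay = min(base * 2 ** attempt, cap)
    return delay * (1 + max_jitter * rng())
```

Without jitter, attempts 0 through 5 wait 1s, 2s, 4s, 8s, 16s, and 30s. With jitter, each wait is stretched by up to a quarter, so clients that failed at the same moment do not all retry at the same moment.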

Fallback Providers

Chain providers so failures on one automatically route to another:
[provider.anthropic]
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4-20250514"
max_retries = 3
fallback = "openrouter"

[provider.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4"
max_retries = 2
The flow is: try primary with retries → exhaust retries → try fallback with its own retries.
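The flow above can be sketched as two nested loops: an inner retry loop per provider and an outer loop over the fallback chain. Everything named here (`RetryableError`, `call_with_retries`, `call_with_fallback`) is hypothetical. The sketch also assumes that `max_retries = 3` means one initial attempt plus up to three retries, and it omits the backoff sleeps for brevity.

```python
from typing import Callable, List, Optional, Tuple


class RetryableError(Exception):
    """Hypothetical marker for retryable failures (429, 5xx, timeouts)."""


def call_with_retries(send: Callable[[], str], max_retries: int) -> str:
    """Try once, then retry up to max_retries more times."""
    last_error: Optional[RetryableError] = None
    for _ in range(max_retries + 1):
        try:
            return send()
        except RetryableError as err:
            last_error = err  # a real client would sleep with backoff here
    assert last_error is not None
    raise last_error


def call_with_fallback(chain: List[Tuple[Callable[[], str], int]]) -> str:
    """Walk the chain of (send, max_retries) pairs, primary first."""
    if not chain:
        raise ValueError("provider chain is empty")
    last_error: Optional[RetryableError] = None
    for send, max_retries in chain:
        try:
            return call_with_retries(send, max_retries)
        except RetryableError as err:
            last_error = err  # retries exhausted: move to the next provider
    assert last_error is not None
    raise last_error
```

Under these assumptions, the configuration above would make up to four attempts against Anthropic before OpenRouter receives its first request.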

Monitoring

Structured Logging

Enable JSON-formatted logs for production monitoring:
[logging]
level = "info"
file = "~/.klaw/logs/klaw.log"
format = "json"
Log entries include structured fields:
{
  "level": "DEBUG",
  "msg": "provider response",
  "model": "claude-sonnet-4-20250514",
  "input_tokens": 1523,
  "output_tokens": 847,
  "cost": "$0.014"
}
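Structured fields make the log easy to post-process. The sketch below totals token usage across "provider response" entries. It assumes the log file holds one JSON object per line (the entry above is shown pretty-printed for readability). The `summarize_log` function is a hypothetical helper, not part of klaw.

```python
import json
from typing import Dict, Iterable


def summarize_log(lines: Iterable[str]) -> Dict[str, int]:
    """Total requests and tokens across 'provider response' log entries."""
    totals = {"requests": 0, "input_tokens": 0, "output_tokens": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict) or entry.get("msg") != "provider response":
            continue
        totals["requests"] += 1
        totals["input_tokens"] += entry.get("input_tokens", 0)
        totals["output_tokens"] += entry.get("output_tokens", 0)
    return totals
```

To run it against the configured log file, open the path from the `[logging]` section (expanding `~` with `os.path.expanduser`) and pass the file object to `summarize_log`.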

Metrics

klaw tracks metrics globally and per-session:
| Metric | Scope | Description |
|---|---|---|
| TotalInputTokens | Global | Cumulative input tokens across all sessions |
| TotalOutputTokens | Global | Cumulative output tokens |
| TotalRequests | Global | Total LLM API requests |
| TotalErrors | Global | Total error count |
| TotalToolCalls | Global | Total tool invocations |
| ToolCalls | Per-session | Per-tool call counts within a session |

Per-Agent Restrictions

Tool Filtering

Restrict which tools an agent can access:
# Research agent: read-only tools
[agent.researcher]
tools = ["read", "glob", "grep", "web_fetch", "web_search"]

# Coder agent: file modification tools
[agent.coder]
tools = ["bash", "read", "write", "edit", "glob", "grep"]
When tools is omitted, the agent has access to all registered tools. When specified, only listed tools are available.
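The omitted-versus-specified rule can be expressed in a few lines. This is a sketch of the semantics only. The `available_tools` function is hypothetical, and treating an empty list as "no tools" is an assumption that follows from the wording above.

```python
from typing import List, Optional, Sequence


def available_tools(
    registered: Sequence[str],
    allowed: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the tools an agent may use.

    allowed=None models an omitted `tools` key: every registered tool is
    available. Otherwise only registered tools that are listed survive.
    """
    if allowed is None:
        return list(registered)
    return [tool for tool in registered if tool in allowed]
```

Applied to the researcher agent above, `bash`, `write`, and `edit` are filtered out while the five read-only tools remain.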

Combining Safety Features

For maximum control, combine multiple safety mechanisms:
[defaults]
max_session_cost = 10.00

[agent.default]
tools = ["bash", "read", "write", "edit", "glob", "grep", "web_fetch", "web_search"]
max_iterations = 50
require_approval = ["bash"]

[agent.researcher]
tools = ["read", "glob", "grep", "web_fetch", "web_search"]
max_iterations = 30

[provider.anthropic]
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4-20250514"
max_retries = 3
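fallback = "openrouter"

# Fallback target referenced above (same settings as in Fallback Providers)
[provider.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4"
max_retries = 2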
This configuration:
  • Caps session cost at $10
  • Limits the default agent to 50 iterations and requires approval before it runs bash
  • Restricts the researcher agent to read-only tools and 30 iterations
  • Retries Anthropic API calls up to 3 times, then falls back to OpenRouter

Next Steps

  • Configuration: Full configuration reference
  • Resilience Architecture: Deep dive into retry, fallback, and error handling