Overview
klaw provides several mechanisms to control costs, enforce safety boundaries, and maintain reliability. This guide covers setting up cost budgets, tool approval gates, provider resilience, monitoring, and per-agent restrictions.
Cost Budgets
Setting a Session Budget
Add max_session_cost to your defaults configuration to cap spending per session. When a session exceeds the budget, it fails with a budget_exceeded error. A warning is logged when cost reaches 80% of the budget.
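The budget behaviour described above can be illustrated with a small tracker. This is a minimal sketch, not klaw's implementation: the SessionBudget class and BudgetExceeded exception are hypothetical names, and the choice to raise only when spending goes strictly above the cap is an assumption.

```python
import logging

logger = logging.getLogger("budget_sketch")


class BudgetExceeded(Exception):
    """Hypothetical stand-in for klaw's budget_exceeded error."""


class SessionBudget:
    """Tracks cumulative session cost against a cap (illustrative only)."""

    WARN_FRACTION = 0.80  # warn at 80% of the budget, as stated in the text

    def __init__(self, max_session_cost: float) -> None:
        self.max_session_cost = max_session_cost
        self.spent = 0.0
        self.warned = False

    def record(self, cost: float) -> None:
        """Add the cost of one LLM call, then warn or raise as needed."""
        self.spent += cost
        # Assumption: the error fires once spending is strictly above the cap.
        if self.spent > self.max_session_cost:
            raise BudgetExceeded(
                f"budget_exceeded: spent ${self.spent:.2f} "
                f"of ${self.max_session_cost:.2f}"
            )
        threshold = self.WARN_FRACTION * self.max_session_cost
        if not self.warned and self.spent >= threshold:
            self.warned = True  # log the warning only once per session
            logger.warning(
                "session cost at %.0f%% of budget",
                100 * self.spent / self.max_session_cost,
            )
```

With a 2.00 budget, the warning would be logged once cumulative cost reaches 1.60, and the error raised once it passes 2.00.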
How Cost is Calculated
klaw tracks input and output tokens for every LLM call and multiplies by per-model pricing:

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| claude-sonnet-4-20250514 | $3.00 | $15.00 |
| claude-opus-4-20250514 | $15.00 | $75.00 |
| claude-3-5-haiku-20241022 | $0.80 | $4.00 |
| openai/gpt-4o | $2.50 | $10.00 |
| deepseek/deepseek-chat | $0.14 | $0.28 |
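The calculation amounts to tokens times price per million, summed over input and output. The sketch below uses the prices from the table; the PRICING dictionary and call_cost function are hypothetical names chosen for illustration, not part of klaw.

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICING = {
    "claude-sonnet-4-20250514": (3.00, 15.00),
    "claude-opus-4-20250514": (15.00, 75.00),
    "claude-3-5-haiku-20241022": (0.80, 4.00),
    "openai/gpt-4o": (2.50, 10.00),
    "deepseek/deepseek-chat": (0.14, 0.28),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single LLM call for the given token counts."""
    input_price, output_price = PRICING[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000
```

For example, a Sonnet call with 10,000 input tokens and 2,000 output tokens costs $0.03 for input plus $0.03 for output, $0.06 in total.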
Tips for Cost Control
Start with conservative budgets
Begin with max_session_cost = 2.00 for interactive sessions and increase as needed. For automated pipelines, set higher limits but always have a cap.
Use iteration limits
Combine cost budgets with max_iterations to prevent runaway loops.
Choose the right model
Use Haiku ($0.80/1M input tokens) for simple tasks, Sonnet ($3/1M) for most work, and Opus ($15/1M) only for complex reasoning.
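To show why the iteration-limit tip pairs well with a cost budget, here is a rough sketch of an agent loop that stops at whichever limit is hit first. The run_agent_loop function and its step callable are hypothetical; klaw's actual loop is not shown in this guide.

```python
from typing import Callable


def run_agent_loop(
    step: Callable[[], tuple[float, bool]],
    max_iterations: int,
    max_session_cost: float,
) -> str:
    """Run `step` until it finishes or a limit is hit.

    `step` is a hypothetical callable returning (cost_of_step, done).
    Returns the reason the loop stopped.
    """
    spent = 0.0
    for _ in range(max_iterations):
        cost, done = step()
        spent += cost
        if spent > max_session_cost:
            return "budget_exceeded"  # the cost cap catches expensive steps
        if done:
            return "done"
    return "max_iterations"  # the iteration cap catches cheap but endless loops
```

The two limits cover different failure modes: a loop of cheap calls may never reach the cost cap, while a handful of expensive calls may exhaust the budget long before the iteration cap.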
Tool Approval
Configuring Approval Gates
Require user confirmation before specific tools execute. When prompted:
- Type y or yes to approve
- Any other response (or Enter) denies execution
- Denied tools return “Denied by user” to the LLM, which can adjust its approach
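The approval rules above can be sketched as a small gate around tool execution. The run_tool and is_approved functions, the prompt wording, and the case-insensitive matching are assumptions made for illustration; only the y/yes rule and the "Denied by user" message come from this guide.

```python
from typing import Callable

DENIED_MESSAGE = "Denied by user"


def is_approved(response: str) -> bool:
    """Only y or yes approves; anything else, including Enter, denies."""
    # Assumption: matching ignores case and surrounding whitespace.
    return response.strip().lower() in {"y", "yes"}


def run_tool(
    name: str,
    execute: Callable[[], str],
    gated_tools: set[str],
    ask: Callable[[str], str] = input,
) -> str:
    """Run a tool, asking for confirmation first if the tool is gated."""
    if name in gated_tools and not is_approved(ask(f"Allow {name}? [y/N] ")):
        # The denial goes back to the LLM as the tool result.
        return DENIED_MESSAGE
    return execute()
```

Returning the denial as an ordinary tool result, rather than raising, is what lets the LLM see the refusal and adjust its approach.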
Which Tools to Gate
| Risk Level | Tools | Recommendation |
|---|---|---|
| High | bash | Always require approval for untrusted tasks |
| Medium | write, edit | Require approval when agents modify important files |
| Low | read, glob, grep | Usually safe to allow freely |
| Low | web_fetch, web_search | Safe for research agents |
Provider Resilience
Retry Configuration
Each provider retries failed requests with exponential backoff:
- Backoff: Starts at 1s, doubles each attempt, caps at 30s
- Jitter: Random 0-25% added to prevent thundering herd
- Retryable errors: HTTP 429 (rate limit), 500, 502, 503, timeouts, connection resets
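The backoff schedule above can be written out as a short function. This is a sketch under stated assumptions: backoff_delay and is_retryable are hypothetical names, attempts are counted from zero, and the jitter is added after the 30s cap is applied, so a delay can slightly exceed 30s. klaw may order these steps differently.

```python
import random
from typing import Callable

RETRYABLE_STATUS = {429, 500, 502, 503}  # from the list above
BASE_DELAY = 1.0   # seconds before the first retry
MAX_DELAY = 30.0   # cap on the un-jittered delay


def is_retryable(status: int) -> bool:
    """True for the HTTP status codes listed as retryable."""
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, rand: Callable[[], float] = random.random) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    base = min(BASE_DELAY * (2 ** attempt), MAX_DELAY)
    # rand() is in [0, 1), so the jitter adds between 0% and 25% of the base.
    jitter = base * 0.25 * rand()
    return base + jitter
```

Without jitter the schedule is 1s, 2s, 4s, 8s, 16s, then 30s for every later attempt. Timeouts and connection resets are not HTTP statuses, so a real client would classify those from the exception type instead.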
Fallback Providers
Chain providers so failures on one automatically route to another.
Monitoring
Structured Logging
Enable JSON-formatted logs for production monitoring.
Metrics
klaw tracks metrics globally and per-session:

| Metric | Scope | Description |
|---|---|---|
| TotalInputTokens | Global | Cumulative input tokens across all sessions |
| TotalOutputTokens | Global | Cumulative output tokens |
| TotalRequests | Global | Total LLM API requests |
| TotalErrors | Global | Total error count |
| TotalToolCalls | Global | Total tool invocations |
| ToolCalls | Per-session | Per-tool call counts within a session |
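One way to picture the split between global and per-session metrics is a pair of small data classes. This is only an illustrative model: the class and method names are hypothetical, the fields use Python naming with the table's metric name in a comment, and how klaw actually stores or exposes these counters is not covered here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SessionMetrics:
    """Per-session counters."""
    tool_calls: Counter = field(default_factory=Counter)  # ToolCalls


@dataclass
class GlobalMetrics:
    """Counters accumulated across all sessions."""
    total_input_tokens: int = 0   # TotalInputTokens
    total_output_tokens: int = 0  # TotalOutputTokens
    total_requests: int = 0       # TotalRequests
    total_errors: int = 0         # TotalErrors
    total_tool_calls: int = 0     # TotalToolCalls

    def record_request(self, input_tokens: int, output_tokens: int) -> None:
        """Count one LLM API request and its token usage."""
        self.total_requests += 1
        self.total_input_tokens += input_tokens
        self.total_output_tokens += output_tokens

    def record_error(self) -> None:
        """Count one error."""
        self.total_errors += 1

    def record_tool_call(self, session: SessionMetrics, tool: str) -> None:
        """Count a tool call globally and under the tool's name in the session."""
        self.total_tool_calls += 1
        session.tool_calls[tool] += 1
```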
Per-Agent Restrictions
Tool Filtering
Restrict which tools an agent can access. When tools is omitted, the agent has access to all registered tools. When specified, only listed tools are available.
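The filtering rule can be expressed in a few lines. In this sketch, available_tools is a hypothetical helper, None stands in for an omitted tools setting, and the registered names are the tools mentioned in the approval table earlier in this guide. Treating an empty list as "no tools" is an assumption.

```python
from typing import Optional

# Tool names taken from the "Which Tools to Gate" table.
REGISTERED_TOOLS = [
    "bash", "write", "edit", "read", "glob", "grep", "web_fetch", "web_search",
]


def available_tools(
    registered: list[str], tools: Optional[list[str]] = None
) -> list[str]:
    """Return the tools an agent may use.

    `tools=None` models the setting being omitted: every registered tool is
    available. Otherwise only listed tools that are registered are returned.
    """
    if tools is None:
        return list(registered)
    allowed = set(tools)
    return [name for name in registered if name in allowed]
```

A read-only research agent, for instance, could be limited with available_tools(REGISTERED_TOOLS, ["read", "glob", "grep"]).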
Combining Safety Features
For maximum control, combine multiple safety mechanisms, for example in a configuration that:
- Caps session cost at $10
- Limits the default agent to 50 iterations with bash approval
- Gives the researcher agent only read-only tools with 30 iterations
- Retries Anthropic API calls 3 times then falls back to OpenRouter
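The last item, retrying one provider and then falling back to another, can be sketched as a nested loop. Everything named here is hypothetical (ProviderError, call_with_fallback, the provider callables), backoff delays are left out for brevity, and reading "3 times" as one initial call plus three retries is an assumption.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error for a failed, retryable provider call."""


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    max_retries: int = 3,
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, reply).

    Each provider gets one initial attempt plus `max_retries` retries
    before the next provider in the chain is tried.
    """
    last_error: Exception | None = None
    for name, call in providers:
        for _ in range(1 + max_retries):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                last_error = exc  # remember the failure and try again
    raise ProviderError(f"all providers failed: {last_error}")
```

Called with an Anthropic callable first and an OpenRouter callable second, this would only reach OpenRouter after every Anthropic attempt has failed.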
Next Steps
- Configuration: Full configuration reference
- Resilience Architecture: Deep dive into retry, fallback, and error handling

