Session & Team Budgets
AgentWatch provides two levels of budget enforcement: session budgets for individual agent runs and team budgets for organization-wide spend caps.
Session Budgets
Session budgets protect individual agent runs by setting a per-session USD ceiling.
How It Works
- The developer sets agentwatch_session_budget_usd when initializing the client
- Before each API call, the SDK checks cumulative token spend against the limit
- If exceeded, the request is blocked and AgentBudgetExceeded is raised
- The upstream LLM provider is never billed for blocked requests
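The check-then-call flow in the list above can be sketched in a few lines. This is an illustration only, not the SDK's implementation: `SessionBudget`, `BudgetExceeded`, and `guarded_call` are hypothetical names, and treating "spend has reached the limit" as exceeded is an assumption.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical stand-in for the SDK's AgentBudgetExceeded."""

    def __init__(self, session_id: str, spent: float, limit: float):
        super().__init__(f"session {session_id}: ${spent:.4f} spent, limit ${limit:.4f}")
        self.session_id = session_id
        self.spent = spent
        self.limit = limit


@dataclass
class SessionBudget:
    session_id: str
    limit_usd: float
    spent_usd: float = 0.0

    def check(self) -> None:
        """Pre-flight check: raise before any upstream request is made."""
        # Assumption: reaching the limit counts as exceeding it.
        if self.spent_usd >= self.limit_usd:
            raise BudgetExceeded(self.session_id, self.spent_usd, self.limit_usd)

    def record(self, cost_usd: float) -> None:
        """Add the cost of a completed call to the running total."""
        self.spent_usd += cost_usd


def guarded_call(budget: SessionBudget, send, cost_usd: float):
    """Run send() only if the budget check passes, then record its cost."""
    budget.check()  # a blocked request never reaches the provider
    result = send()
    budget.record(cost_usd)
    return result
```

Because the check runs before `send()`, a blocked request costs nothing upstream, which is the property the last bullet describes.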
Configuration
python
from agentwatch import WatchedOpenAI, AgentBudgetExceeded

client = WatchedOpenAI(
    api_key="your-openai-key",
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="ci-run-123",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Refactor this module..."}],
    )
except AgentBudgetExceeded as e:
    # Raised before the request is sent, so nothing is billed upstream
    print(f"Blocked session {e.session_id}")
    print(f"Spent: ${e.spent:.4f} | Limit: ${e.limit:.4f}")

Token Cost Estimation
Budget checks use a conservative blended rate of $3 per million tokens for cost estimation:
| Model | Actual Rate | Estimated Rate | Accuracy |
|---|---|---|---|
| gpt-4o-mini | $0.15/MTok | $3/MTok | Overestimated ~20x |
| gpt-4o | $2.50/MTok | $3/MTok | Within 1.2x |
| claude-3-5-sonnet | $3.00/MTok | $3/MTok | Accurate |
For models cheaper than the blended rate, enforcement triggers before actual spend reaches the limit. This is by design: it is safer to over-enforce than to under-enforce.
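To make the effect of the blended rate concrete, here is a small worked calculation. The helper names are hypothetical and the arithmetic assumes a flat per-token rate, as in the table above; the SDK's real accounting may differ.

```python
BLENDED_RATE_USD_PER_MTOK = 3.00  # blended rate stated in this section


def estimated_cost_usd(tokens: int, rate_per_mtok: float = BLENDED_RATE_USD_PER_MTOK) -> float:
    """Cost of `tokens` at a flat rate expressed in USD per million tokens."""
    return tokens * rate_per_mtok / 1_000_000


def tokens_until_block(budget_usd: float, rate_per_mtok: float = BLENDED_RATE_USD_PER_MTOK) -> int:
    """Whole tokens that fit in the budget at the given rate."""
    return int(budget_usd / rate_per_mtok * 1_000_000)


# A $2.00 session budget is used up after about 666,666 estimated tokens.
limit_tokens = tokens_until_block(2.00)

# At gpt-4o-mini's listed $0.15/MTok, those tokens cost roughly $0.10.
actual_mini_spend = estimated_cost_usd(limit_tokens, rate_per_mtok=0.15)
```

Under these assumptions, a $2.00 budget on gpt-4o-mini blocks at about $0.10 of actual spend, which matches the roughly 20x overestimate in the table.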
Session State Persistence
Session state is stored in Cloudflare KV with a 24-hour TTL. This means:
- Session budgets survive process restarts
- Multiple SDK instances with the same session_id share state
- Sessions expire after 24 hours of inactivity
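The behavior in the list above can be modeled with a small in-memory stand-in for a shared key-value store. This is a sketch only: it does not use the Cloudflare KV API, `InMemorySessionStore` is a hypothetical name, and refreshing the TTL on every write is an assumption based on "24 hours of inactivity".

```python
import time
from typing import Callable, Dict, Tuple

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL from this section


class InMemorySessionStore:
    """Hypothetical stand-in for a shared key-value store with a TTL."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        # session_id -> (spent_usd, expires_at)
        self._data: Dict[str, Tuple[float, float]] = {}

    def get_spend(self, session_id: str) -> float:
        """Return cumulative spend, or 0.0 if the session is unknown or expired."""
        entry = self._data.get(session_id)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(session_id, None)
            return 0.0
        return entry[0]

    def add_spend(self, session_id: str, cost_usd: float) -> float:
        """Add to the session total and refresh its expiry (assumed sliding TTL)."""
        total = self.get_spend(session_id) + cost_usd
        self._data[session_id] = (total, self._clock() + SESSION_TTL_SECONDS)
        return total
```

Two clients that write under the same session ID see one running total, and a session that receives no writes for 24 hours starts again from zero.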
Team Budgets
Team budgets allow administrators to cap total spend across all models and environments for an entire team.
Setting the Budget
Use the Admin API to assign monthly budgets:
bash
curl -X POST https://agent-watch.dev/v1/teams/budgets \
  -H "Authorization: Bearer aw_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "team": "backend-engineers",
    "monthly_budget_usd": 500.00,
    "alert_threshold_pct": 80,
    "hard_stop": true
  }'

Enforcing Team Budgets
Pass the team name in your SDK configuration:
python
from agentwatch import WatchedOpenAI

client = WatchedOpenAI(
    agentwatch_api_key="aw_live_xxx",
    agentwatch_team="backend-engineers",
    # ...
)

Hard Stop vs Alert-Only
| Mode | Behavior |
|---|---|
| hard_stop: true | Requests blocked when team budget exceeded (403) |
| hard_stop: false | Alert logged but requests proceed |
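The two modes in the table reduce to a small decision function. The sketch below is illustrative only: `team_budget_action` and its string return values are hypothetical, and treating spend equal to the budget as exceeded is an assumption.

```python
def team_budget_action(current_spend: float, monthly_budget_usd: float, hard_stop: bool) -> str:
    """Return "allow", "block", or "alert" for a request against a team budget."""
    # Assumption: the budget counts as exceeded once spend reaches it.
    if current_spend < monthly_budget_usd:
        return "allow"
    # Over budget: hard stop blocks (the 403 case), alert-only lets it through.
    return "block" if hard_stop else "alert"
```

For example, a team at $342.50 of a $500.00 budget is allowed in either mode, while a team at $510.00 is blocked with hard_stop: true and only triggers an alert with hard_stop: false.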
Checking Team Spend
bash
curl -H "Authorization: Bearer aw_live_xxx" \
  "https://agent-watch.dev/v1/teams/budget-check?team=backend-engineers"

Response:
json
{
  "team": "backend-engineers",
  "monthly_budget_usd": 500.00,
  "current_spend": 342.50,
  "pct_used": 68.5,
  "hard_stop": true,
  "status": "ok"
}

Budget Hierarchy
When both session and team budgets are configured:
- Session budget is checked first (pre-flight)
- If session budget passes, team budget is checked
- The more restrictive limit applies
This means a session budget of $2.00 will block at $2.00 even if the team has $500 remaining.
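The ordering above can be expressed as a single function that reports which budget, if any, blocks a request. This is a sketch under assumptions: `first_blocking_budget` is a hypothetical helper, not an SDK function, and an alert-only team budget is assumed never to block.

```python
from typing import Optional


def first_blocking_budget(
    session_spent: float,
    session_limit: float,
    team_spent: float,
    team_limit: float,
    team_hard_stop: bool = True,
) -> Optional[str]:
    """Return "session", "team", or None, checking the session budget first."""
    # Step 1: the session budget is the pre-flight check.
    if session_spent >= session_limit:
        return "session"
    # Step 2: the team budget is consulted only if the session check passes.
    if team_hard_stop and team_spent >= team_limit:
        return "team"
    return None
```

With a session at $2.00 of a $2.00 limit and a team at $342.50 of $500.00, the function returns "session", matching the example in the sentence above.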