Session & Team Budgets

AgentWatch provides two levels of budget enforcement: session budgets for individual agent runs and team budgets for organization-wide spend caps.

Session Budgets

Session budgets protect individual agent runs by setting a per-session USD ceiling.

How It Works

  1. Developer sets agentwatch_session_budget_usd when initializing the client
  2. Before each API call, the SDK checks cumulative token spend against the limit
  3. If exceeded, the request is blocked and AgentBudgetExceeded is raised
  4. The upstream LLM provider is never billed for blocked requests
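
The four steps above can be sketched as a small local simulation. This is only an illustration: `BudgetExceeded`, `SessionBudgetGuard`, and `guarded_call` are hypothetical names, not part of the AgentWatch SDK, and treating "spend has reached the limit" as exceeded is an assumption.

```python
class BudgetExceeded(Exception):
    """Hypothetical stand-in for the SDK's AgentBudgetExceeded."""

    def __init__(self, session_id, spent, limit):
        super().__init__(f"session {session_id}: spent ${spent:.4f} of ${limit:.4f}")
        self.session_id = session_id
        self.spent = spent
        self.limit = limit


class SessionBudgetGuard:
    """Hypothetical pre-flight guard tracking cumulative spend for one session."""

    def __init__(self, session_id, budget_usd):
        self.session_id = session_id
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def check(self):
        # Step 2: compare cumulative spend against the limit before the call.
        # Assumption: reaching the limit counts as exceeded.
        if self.spent_usd >= self.budget_usd:
            # Step 3: block by raising, so the upstream call never happens (step 4).
            raise BudgetExceeded(self.session_id, self.spent_usd, self.budget_usd)

    def record(self, cost_usd):
        # Add the estimated cost of a completed call to the running total.
        self.spent_usd += cost_usd


def guarded_call(guard, call, cost_usd):
    """Run `call` only if the session budget check passes, then record its cost."""
    guard.check()
    result = call()
    guard.record(cost_usd)
    return result
```

Because the check runs before `call()`, a blocked request never reaches the provider, which is why no upstream charge is incurred.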

Configuration

python
from agentwatch import WatchedOpenAI, AgentBudgetExceeded

client = WatchedOpenAI(
    api_key="your-openai-key",
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="ci-run-123",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Refactor this module..."}]
    )
except AgentBudgetExceeded as e:
    print(f"Blocked session {e.session_id}")
    print(f"Spent: ${e.spent:.4f} | Limit: ${e.limit:.4f}")

Token Cost Estimation

Budget checks use a conservative blended rate of $3 per million tokens for cost estimation:

| Model | Actual Rate | Estimated Rate | Accuracy |
|---|---|---|---|
| gpt-4o-mini | $0.15/MTok | $3/MTok | Overestimated ~20x |
| gpt-4o | $2.50/MTok | $3/MTok | Within 1.2x |
| claude-3-5-sonnet | $3.00/MTok | $3/MTok | Accurate |

For models cheaper than the blended rate, enforcement triggers before actual spend reaches the limit. This is by design: over-enforcing is safer than under-enforcing.
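
To make the effect of the blended rate concrete, the arithmetic can be worked through as follows. The rates are the ones listed above; the function names are hypothetical and chosen only for this illustration.

```python
BLENDED_RATE_USD_PER_MTOK = 3.00  # blended estimation rate stated above

# Actual rates from the table above, in USD per million tokens.
ACTUAL_RATE_USD_PER_MTOK = {
    "gpt-4o-mini": 0.15,
    "gpt-4o": 2.50,
    "claude-3-5-sonnet": 3.00,
}


def estimated_cost_usd(tokens):
    """Cost the budget check would assume for this many tokens."""
    return tokens / 1_000_000 * BLENDED_RATE_USD_PER_MTOK


def overestimate_factor(model):
    """How many times larger the estimate is than the listed actual rate."""
    return BLENDED_RATE_USD_PER_MTOK / ACTUAL_RATE_USD_PER_MTOK[model]


def tokens_until_blocked(budget_usd):
    """Token count at which the estimated spend reaches the budget."""
    return int(budget_usd / BLENDED_RATE_USD_PER_MTOK * 1_000_000)
```

Under these numbers, a $2.00 session budget is reached after roughly 667,000 tokens regardless of model. On gpt-4o-mini, those tokens would cost about $0.10 at the listed rate.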

Session State Persistence

Session state is stored in Cloudflare KV with a 24-hour TTL. This means:

  • Session budgets survive process restarts
  • Multiple SDK instances with the same session_id share state
  • Sessions expire after 24 hours of inactivity
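
The behaviour in the list above can be modelled locally with an in-memory store that stands in for the key-value backend. This is a sketch only: `InMemorySessionStore` is a hypothetical class, and refreshing the expiry on every write is an assumption based on the "24 hours of inactivity" wording.

```python
import time

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL stated above


class InMemorySessionStore:
    """Hypothetical stand-in for the shared key-value store."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # session_id -> (spent_usd, expires_at)

    def get_spent(self, session_id):
        entry = self._entries.get(session_id)
        if entry is None:
            return 0.0
        spent, expires_at = entry
        if self._clock() >= expires_at:
            # Expired sessions start again from zero.
            del self._entries[session_id]
            return 0.0
        return spent

    def add_spend(self, session_id, cost_usd):
        # Assumption: each write refreshes the TTL, so only idle sessions expire.
        total = self.get_spent(session_id) + cost_usd
        self._entries[session_id] = (total, self._clock() + SESSION_TTL_SECONDS)
        return total
```

Two clients that use the same `session_id` against one store see the same running total, which mirrors how multiple SDK instances share a session budget.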

Team Budgets

Team budgets allow administrators to cap total spend across all models and environments for an entire team.

Setting the Budget

Use the Admin API to assign monthly budgets:

bash
curl -X POST https://agent-watch.dev/v1/teams/budgets \
  -H "Authorization: Bearer aw_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "team": "backend-engineers",
    "monthly_budget_usd": 500.00,
    "alert_threshold_pct": 80,
    "hard_stop": true
  }'
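
For scripts that already run in Python, the same request can be built with the standard library. This sketch copies the URL, headers, and payload fields from the curl example above; `build_team_budget_request` is a hypothetical helper, and the response format is not documented here, so it is left unparsed.

```python
import json
import urllib.request

# URL and payload fields are copied from the curl example above.
TEAM_BUDGETS_URL = "https://agent-watch.dev/v1/teams/budgets"


def build_team_budget_request(api_key, team, monthly_budget_usd,
                              alert_threshold_pct=80, hard_stop=True):
    """Build (but do not send) the POST request that sets a team budget."""
    payload = {
        "team": team,
        "monthly_budget_usd": monthly_budget_usd,
        "alert_threshold_pct": alert_threshold_pct,
        "hard_stop": hard_stop,
    }
    return urllib.request.Request(
        TEAM_BUDGETS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_team_budget_request("aw_live_xxx", "backend-engineers", 500.00)
# To send it: urllib.request.urlopen(request)
```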

Enforcing Team Budgets

Pass the team name in your SDK configuration:

python
client = WatchedOpenAI(
    agentwatch_api_key="aw_live_xxx",
    agentwatch_team="backend-engineers",
    ...
)

Hard Stop vs Alert-Only

| Mode | Behavior |
|---|---|
| hard_stop: true | Requests blocked when team budget exceeded (403) |
| hard_stop: false | Alert logged but requests proceed |
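
The difference between the two modes can be expressed as a single decision function. This is an illustrative sketch, not the service's implementation: `team_budget_allows` is a hypothetical name, and treating spend equal to the budget as exceeded is an assumption.

```python
import logging

logger = logging.getLogger("team_budget_sketch")


def team_budget_allows(current_spend_usd, monthly_budget_usd, hard_stop):
    """Return True if a request may proceed under the team budget."""
    if current_spend_usd < monthly_budget_usd:
        return True
    if hard_stop:
        # hard_stop: true -> the request is blocked (the 403 case above).
        return False
    # hard_stop: false -> log an alert and let the request proceed.
    logger.warning(
        "Team budget exceeded: $%.2f of $%.2f",
        current_spend_usd,
        monthly_budget_usd,
    )
    return True
```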

Checking Team Spend

bash
curl -H "Authorization: Bearer aw_live_xxx" \
  "https://agent-watch.dev/v1/teams/budget-check?team=backend-engineers"

Response:

json
{
  "team": "backend-engineers",
  "monthly_budget_usd": 500.00,
  "current_spend": 342.50,
  "pct_used": 68.5,
  "hard_stop": true,
  "status": "ok"
}
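
A caller might use this response to compute remaining headroom and decide whether to warn. The sketch below assumes the field names shown in the sample response and the `alert_threshold_pct` of 80 set earlier; `summarize_budget_check` is a hypothetical helper.

```python
import json

ALERT_THRESHOLD_PCT = 80  # the alert_threshold_pct configured earlier


def summarize_budget_check(body):
    """Derive remaining budget and alert status from a budget-check response."""
    data = json.loads(body)
    budget = data["monthly_budget_usd"]
    spend = data["current_spend"]
    if budget <= 0:
        raise ValueError("monthly_budget_usd must be positive")
    pct_used = round(spend / budget * 100, 1)
    return {
        "remaining_usd": round(budget - spend, 2),
        "pct_used": pct_used,
        "alert": pct_used >= ALERT_THRESHOLD_PCT,
    }


sample = """{
  "team": "backend-engineers",
  "monthly_budget_usd": 500.00,
  "current_spend": 342.50,
  "pct_used": 68.5,
  "hard_stop": true,
  "status": "ok"
}"""
summary = summarize_budget_check(sample)
```

For the sample response, $342.50 of $500.00 is 68.5% used, leaving $157.50 and staying below the 80% alert threshold.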

Budget Hierarchy

When both session and team budgets are configured:

  1. Session budget is checked first (pre-flight)
  2. If session budget passes, team budget is checked
  3. The more restrictive limit applies

This means a session budget of $2.00 will block at $2.00 even if the team has $500 remaining.
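
The ordering above can be sketched as two checks run in sequence. This is an illustration under stated assumptions: the function names are hypothetical, the team budget is assumed to use `hard_stop: true`, and spend equal to a limit is treated as exceeded.

```python
def preflight(session_spent, session_budget, team_spent, team_budget):
    """Return (allowed, blocking_budget), checking session first, then team."""
    # 1. Session budget is checked first.
    if session_spent >= session_budget:
        return False, "session"
    # 2. Only if the session check passes is the team budget checked.
    if team_spent >= team_budget:
        return False, "team"
    return True, None


def effective_headroom_usd(session_spent, session_budget, team_spent, team_budget):
    """3. The more restrictive limit applies: headroom is the smaller remainder."""
    session_left = session_budget - session_spent
    team_left = team_budget - team_spent
    return max(0.0, min(session_left, team_left))
```

With a $2.00 session budget fully spent, `preflight` reports the session as the blocking budget even when the team has hundreds of dollars left.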