Session & Team Budgets

AgentWatch provides two levels of budget enforcement: session budgets for individual agent runs and team budgets for organization-wide spend caps.

Session Budgets

Session budgets protect individual agent runs by setting a per-session USD ceiling.

How It Works

Developer sets agentwatch_session_budget_usd when initializing the client
Before each API call, the SDK checks cumulative token spend against the limit
If exceeded, the request is blocked and AgentBudgetExceeded is raised
The upstream LLM provider is never billed for blocked requests
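The steps above can be sketched as a small pre-flight guard. This is an illustration only, not the SDK's implementation: `SessionBudget`, `BudgetExceeded`, and `guarded_call` are hypothetical names, and blocking once spend reaches the limit (rather than strictly exceeding it) is an assumption.

```python
class BudgetExceeded(Exception):
    """Hypothetical stand-in for the SDK's AgentBudgetExceeded."""


class SessionBudget:
    """Tracks cumulative spend for one session (illustrative only)."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def preflight(self):
        # Runs before the provider is called, so a blocked request costs nothing.
        if self.spent_usd >= self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.limit_usd:.4f}"
            )

    def record(self, cost_usd):
        # Accumulate the cost of a completed call for the next check.
        self.spent_usd += cost_usd


def guarded_call(budget, call, cost_usd):
    budget.preflight()       # check cumulative spend against the limit
    result = call()          # only reached when under the limit
    budget.record(cost_usd)  # update spend after the call completes
    return result
```

Because the check runs before `call()`, the wrapped function is never invoked once the budget is exhausted.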

Configuration

python

from agentwatch import WatchedOpenAI, AgentBudgetExceeded

client = WatchedOpenAI(
    api_key="your-openai-key",
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="ci-run-123",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Refactor this module..."}]
    )
except AgentBudgetExceeded as e:
    print(f"Blocked session {e.session_id}")
    print(f"Spent: ${e.spent:.4f} | Limit: ${e.limit:.4f}")

Token Cost Estimation

Budget checks use a conservative blended rate of $3 per million tokens for cost estimation:

Model	Actual Rate	Estimated Rate	Accuracy
gpt-4o-mini	$0.15/MTok	$3/MTok	Overestimated ~20x
gpt-4o	$2.50/MTok	$3/MTok	Within 1.2x
claude-3-5-sonnet	$3.00/MTok	$3/MTok	Accurate

For models cheaper than the blended rate, enforcement triggers before actual spend reaches the limit. This is by design: over-enforcing is safer than under-enforcing.
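To make the effect of the blended rate concrete, here is a rough arithmetic sketch. The function names are hypothetical, and it assumes cost is simply total tokens multiplied by the per-million rate from the table.

```python
BLENDED_RATE_USD_PER_MTOK = 3.00  # blended rate quoted in the text


def estimated_cost_usd(tokens):
    # Cost as the budget check would estimate it (assumed formula).
    return tokens * BLENDED_RATE_USD_PER_MTOK / 1_000_000


def actual_cost_usd(tokens, rate_usd_per_mtok):
    # Cost at a model's own listed rate.
    return tokens * rate_usd_per_mtok / 1_000_000


def tokens_until_block(budget_usd):
    # Number of tokens a budget covers at the blended rate.
    return int(budget_usd / BLENDED_RATE_USD_PER_MTOK * 1_000_000)


tokens = tokens_until_block(2.00)
print(tokens)
print(round(actual_cost_usd(tokens, 0.15), 2))
```

Under these assumptions, a $2.00 session budget covers about 666,666 tokens. On gpt-4o-mini that corresponds to roughly $0.10 of actual spend, which is the 20x overestimate shown in the table.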

Session State Persistence

Session state is stored in Cloudflare KV with a 24-hour TTL. This means:

Session budgets survive process restarts
Multiple SDK instances with the same session_id share state
Sessions expire after 24 hours of inactivity
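The persistence behavior can be approximated with an in-memory stand-in for a key-value store with per-key expiry. This sketch does not use Cloudflare KV. `SessionStore` is a hypothetical name, and refreshing the expiry on every write is an assumption about what "inactivity" means here.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL from the text above


class SessionStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # session_id -> (spent_usd, expires_at)

    def get_spend(self, session_id):
        entry = self._data.get(session_id)
        if entry is None or entry[1] <= self._clock():
            return 0.0  # unknown or expired sessions start from zero
        return entry[0]

    def add_spend(self, session_id, cost_usd):
        total = self.get_spend(session_id) + cost_usd
        # Each write refreshes the expiry (assumed inactivity semantics).
        self._data[session_id] = (total, self._clock() + TTL_SECONDS)
        return total
```

Two callers that pass the same `session_id` to one store see the same running total, which mirrors how multiple SDK instances would share state.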

Team Budgets

Team budgets allow administrators to cap total spend across all models and environments for an entire team.

Setting the Budget

Use the Admin API to assign monthly budgets:

bash

curl -X POST https://agent-watch.dev/v1/teams/budgets \
  -H "Authorization: Bearer aw_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "team": "backend-engineers",
    "monthly_budget_usd": 500.00,
    "alert_threshold_pct": 80,
    "hard_stop": true
  }'
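For scripts that prefer Python over curl, the same request can be built with the standard library. This is a sketch that mirrors the curl call above. `build_budget_request` is a hypothetical helper, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_budget_request(api_key, team, monthly_budget_usd,
                         alert_threshold_pct=80, hard_stop=True):
    # Same JSON body as the curl example.
    payload = {
        "team": team,
        "monthly_budget_usd": monthly_budget_usd,
        "alert_threshold_pct": alert_threshold_pct,
        "hard_stop": hard_stop,
    }
    return urllib.request.Request(
        "https://agent-watch.dev/v1/teams/budgets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_budget_request("aw_live_xxx", "backend-engineers", 500.00)
# To send it: urllib.request.urlopen(req)  (not executed here)
```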

Enforcing Team Budgets

Pass the team name in your SDK configuration:

python

from agentwatch import WatchedOpenAI

client = WatchedOpenAI(
    agentwatch_api_key="aw_live_xxx",
    agentwatch_team="backend-engineers",
    # ... (other client options)
)

Hard Stop vs Alert-Only

Mode	Behavior
`hard_stop: true`	Requests are blocked (403) once the team budget is exceeded
`hard_stop: false`	An alert is logged, but requests proceed

Checking Team Spend

bash

curl -H "Authorization: Bearer aw_live_xxx" \
  "https://agent-watch.dev/v1/teams/budget-check?team=backend-engineers"

Response:

json

{
  "team": "backend-engineers",
  "monthly_budget_usd": 500.00,
  "current_spend": 342.50,
  "pct_used": 68.5,
  "hard_stop": true,
  "status": "ok"
}
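A client might turn this response into a remaining-budget figure along these lines. The sketch works only from the sample fields shown above. `summarize` and its output keys are hypothetical names.

```python
import json

# Sample response body copied from the example above.
response_body = """
{
  "team": "backend-engineers",
  "monthly_budget_usd": 500.00,
  "current_spend": 342.50,
  "pct_used": 68.5,
  "hard_stop": true,
  "status": "ok"
}
"""


def summarize(body):
    data = json.loads(body)
    budget = data["monthly_budget_usd"]
    spent = data["current_spend"]
    return {
        "remaining_usd": budget - spent,
        # Recomputed locally; matches the pct_used field in the sample.
        "pct_used": round(spent / budget * 100, 1),
        "blocks_when_exhausted": data["hard_stop"],
    }


summary = summarize(response_body)
```

For the sample payload, this leaves $157.50 of the $500.00 monthly budget, with 68.5% used.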

Budget Hierarchy

When both session and team budgets are configured:

Session budget is checked first (pre-flight)
If session budget passes, team budget is checked
The more restrictive limit applies

This means a session budget of $2.00 will block at $2.00 even if the team has $500 remaining.
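The check order can be sketched as one function. `check_budgets` is a hypothetical name. Treating "spend has reached the limit" as exceeded, and letting requests through when `hard_stop` is false, are assumptions drawn from the descriptions above.

```python
def check_budgets(session_spent, session_limit, team_spent, team_limit,
                  team_hard_stop=True):
    """Return (allowed, reason) for the next request."""
    # 1. The session budget is checked first.
    if session_spent >= session_limit:
        return False, "session budget exceeded"
    # 2. The team budget is checked only if the session check passes.
    if team_spent >= team_limit and team_hard_stop:
        return False, "team budget exceeded"
    return True, "ok"


# A $2.00 session limit blocks even with team budget remaining.
print(check_budgets(2.00, 2.00, 342.50, 500.00))
```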
