Python SDK Reference

The AgentWatch Python SDK provides a drop-in replacement for the OpenAI client with built-in budget enforcement, telemetry, and anomaly detection.

Installation

bash

pip install aw-sdk

Quick Start

python

from agentwatch import WatchedOpenAI, AgentBudgetExceeded

client = WatchedOpenAI(
    api_key="your-openai-key",
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="my-session",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}]
    )
except AgentBudgetExceeded as e:
    print(f"Blocked: ${e.spent:.4f} / ${e.limit:.4f}")

API Reference

`WatchedOpenAI`

Drop-in replacement for `openai.OpenAI` with built-in telemetry.

Constructor Parameters:

Parameter	Type	Default	Description
`agentwatch_api_key`	`str`	required	Your AgentWatch API key
`agentwatch_project`	`str`	`None`	Project name for attribution
`agentwatch_team`	`str`	`None`	Team name for team budget enforcement
`agentwatch_session_id`	`str`	random UUID	Session identifier
`agentwatch_session_budget_usd`	`float`	`None`	Per-session budget ceiling
`agentwatch_monthly_budget_usd`	`float`	`None`	Monthly budget (reserved)
`agentwatch_enforcement_mode`	`bool`	`False`	Enable pre-call budget checks
`agentwatch_enforcement_fail_open`	`bool`	`True`	Fail open on AgentWatch outage
`ingest_url`	`str`	default	AgentWatch ingest endpoint
`timeout_seconds`	`float`	`2.0`	HTTP timeout for AgentWatch calls

All other parameters are passed through unchanged to `openai.OpenAI`.

Methods:

Method	Description
`chat.completions.create(...)`	Drop-in replacement with budget enforcement

`wrap()`

Wrap an existing OpenAI client with AgentWatch telemetry via composition.

python

from openai import OpenAI
from agentwatch import wrap

client = OpenAI(api_key="your-key")
watched = wrap(
    client,
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="my-session",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

Parameters: Same as the `WatchedOpenAI` constructor.
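
Composition means the wrapper holds a reference to the original client and intercepts calls, rather than subclassing it. The sketch below illustrates that general pattern with a stand-in client. `FakeClient`, `CompositionWrapper`, and the `calls` counter are hypothetical names chosen for illustration; nothing here describes how `wrap()` is actually implemented.

```python
class FakeClient:
    """Stand-in for an API client (hypothetical, not openai.OpenAI)."""

    def __init__(self):
        self.api_key = "your-key"

    def create(self, prompt):
        return f"echo: {prompt}"


class CompositionWrapper:
    """Holds the wrapped client and intercepts `create` calls."""

    def __init__(self, client):
        self._client = client
        self.calls = 0  # toy telemetry: number of intercepted calls

    def create(self, prompt):
        self.calls += 1  # record the call, then delegate
        return self._client.create(prompt)

    def __getattr__(self, name):
        # Only invoked for attributes missing on the wrapper itself,
        # so everything else falls through to the wrapped client.
        return getattr(self._client, name)


wrapped = CompositionWrapper(FakeClient())
result = wrapped.create("Hello")
```

Because unknown attributes fall through to the wrapped object, code that already uses the original client can usually keep working against the wrapper.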

Exceptions

`AgentBudgetExceeded`

Raised when a session exceeds its configured budget limit.

python

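# `client` is the WatchedOpenAI instance from the Quick Start
try:
    client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello"}])
except AgentBudgetExceeded as e:
    print(f"Session: {e.session_id}")
    print(f"Spent: ${e.spent:.4f}")
    print(f"Limit: ${e.limit:.4f}")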

Attribute	Type	Description
`session_id`	`str`	The session that exceeded the budget
`spent`	`float`	Amount spent in USD
`limit`	`float`	Budget limit in USD

`AgentBudgetCheckUnavailable`

Raised when `agentwatch_enforcement_fail_open=False` and the budget check endpoint is unreachable.

python

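# `client` as in the Quick Start, created with agentwatch_enforcement_fail_open=False
try:
    client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello"}])
except AgentBudgetCheckUnavailable as e:
    print(f"Session: {e.session_id}")
    print(f"Reason: {e.reason}")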
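
The difference between failing open and failing closed is easiest to see in code. The following is a minimal sketch of a pre-call check, not the SDK's implementation: `pre_call_check`, `fetch_spend`, and the two stand-in exception classes are hypothetical, and treating `spent >= limit` as "exceeded" is an assumption.

```python
class BudgetExceeded(Exception):
    """Stand-in for AgentBudgetExceeded (illustrative only)."""


class BudgetCheckUnavailable(Exception):
    """Stand-in for AgentBudgetCheckUnavailable (illustrative only)."""


def pre_call_check(fetch_spend, limit_usd, fail_open=True):
    """Return True if the call may proceed.

    `fetch_spend` is a hypothetical callable that returns the session's
    spend in USD, or raises ConnectionError when the service is down.
    """
    try:
        spent = fetch_spend()
    except ConnectionError as exc:
        if fail_open:
            return True  # outage: let the call through
        raise BudgetCheckUnavailable(str(exc)) from exc
    if spent >= limit_usd:  # assumed threshold semantics
        raise BudgetExceeded(f"spent {spent:.2f} of {limit_usd:.2f}")
    return True


def service_down():
    raise ConnectionError("budget endpoint unreachable")


# With fail_open=True an outage does not block the call.
allowed_during_outage = pre_call_check(service_down, 2.00, fail_open=True)
```

Failing open favors availability (calls continue during an outage, possibly over budget), while failing closed favors strict cost control.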

`analyze_text()`

Scan text for PII and secret risks. Returns a list of risk tag strings.

python

from agentwatch import analyze_text

risks = analyze_text("Contact user@example.com")
# Returns: ["PII_EMAIL"]

Risk Tags:

Tag	Description
`PII_EMAIL`	Email address detected
`PII_SSN`	Social Security Number detected
`FINANCIAL_CREDIT_CARD`	Credit card number detected (Luhn-validated)
`SECRET_AWS_ACCESS_KEY`	AWS access key detected
`SECRET_STRIPE`	Stripe secret key detected
`SECRET_GITHUB`	GitHub token detected
`SECRET_JWT`	JWT token detected
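
For readers unfamiliar with the Luhn validation mentioned for `FINANCIAL_CREDIT_CARD`, here is a small sketch of the standard Luhn checksum. The function name `luhn_valid` is hypothetical, and this is not necessarily how `analyze_text()` performs the check internally.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits
        total += digit
    return total % 10 == 0


is_card = luhn_valid("4111 1111 1111 1111")      # well-known test number
is_not_card = luhn_valid("4111 1111 1111 1112")  # last digit altered
```

A checksum step like this reduces false positives, since most random 16-digit strings fail it.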

Streaming Support

Budget enforcement works with streaming responses. The SDK:

1. Performs a pre-flight budget check before the stream starts.
2. Wraps the stream to monitor token usage.
3. Terminates the stream if the budget is exceeded mid-response.

python

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

WARNING

Budget enforcement for streaming calls uses estimated token counts until the stream completes. Actual enforcement may be slightly delayed compared to non-streaming calls.
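
To illustrate how a stream can be monitored and cut off mid-response, the sketch below wraps an iterator in a generator that keeps a running token estimate. Everything in it is an assumption made for illustration: `monitored_stream` and `StreamBudgetExceeded` are hypothetical names, the word-count estimate stands in for a real tokenizer, and whether the SDK raises or simply stops iterating when the budget is hit is not specified here.

```python
class StreamBudgetExceeded(Exception):
    """Stand-in for a budget error raised mid-stream (illustrative)."""


def monitored_stream(chunks, budget_tokens):
    """Yield chunks until the estimated token count exceeds the budget.

    Tokens are estimated as whitespace-separated words, a crude
    placeholder for a real tokenizer.
    """
    used = 0
    for chunk in chunks:
        used += len(chunk.split())
        if used > budget_tokens:
            raise StreamBudgetExceeded(
                f"estimated {used} tokens > {budget_tokens}"
            )
        yield chunk


received = []
try:
    for piece in monitored_stream(["Once upon", "a time", "there was"], 4):
        received.append(piece)
except StreamBudgetExceeded:
    pass  # the third chunk pushed the estimate past the budget
```

Because the check runs per chunk on an estimate, some output is always delivered before the cutoff, which matches the delayed enforcement described in the warning above.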

Thread Safety

The SDK is thread-safe. Multiple threads can share a single `WatchedOpenAI` instance. The internal iteration counter is guarded by a `threading.Lock` to prevent race conditions.
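
As a sketch of the locking pattern described above, a counter protected by `threading.Lock` might look like the following. The class name `IterationCounter` is hypothetical and is not taken from the SDK's source.

```python
import threading


class IterationCounter:
    """Counter whose read-modify-write is guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:  # only one thread updates at a time
            self._value += 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


counter = IterationCounter()


def worker():
    for _ in range(1000):
        counter.increment()


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = counter.value  # 8 threads x 1000 increments each
```

Without the lock, concurrent `+= 1` operations can interleave and lose updates.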

Telemetry

Telemetry is sent asynchronously from a background daemon thread, so the main thread is never blocked. If the telemetry endpoint is unreachable, the failure is logged at debug level and otherwise ignored.
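
One common way to get this behavior is a queue drained by a daemon thread. The sketch below shows that pattern under stated assumptions: `BackgroundSender`, `fake_send`, and the logger name are hypothetical, and the SDK's actual transport, batching, and retry behavior are not described in this document.

```python
import logging
import queue
import threading

logger = logging.getLogger("telemetry_sketch")


class BackgroundSender:
    """Queue events and deliver them from a daemon thread."""

    def __init__(self, send):
        self._send = send  # callable that delivers one event
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, event):
        self._queue.put(event)  # unbounded queue, returns immediately

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception as exc:
                # Swallow delivery errors; record them at debug level only.
                logger.debug("telemetry send failed: %s", exc)
            finally:
                self._queue.task_done()

    def flush(self):
        self._queue.join()  # block until every queued event is handled


delivered = []


def fake_send(event):
    if event.get("fail"):
        raise ConnectionError("endpoint unreachable")
    delivered.append(event)


sender = BackgroundSender(fake_send)
sender.log({"tokens": 12})
sender.log({"fail": True})  # simulated outage, dropped quietly
sender.log({"tokens": 7})
sender.flush()
```

A daemon thread does not keep the interpreter alive at exit, so events still in the queue at shutdown may be lost unless something like `flush()` is called first.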
