Python SDK Reference
The AgentWatch Python SDK provides a drop-in replacement for the OpenAI client with built-in budget enforcement, telemetry, and anomaly detection.
Installation
```bash
pip install aw-sdk
```

Quick Start
```python
from agentwatch import WatchedOpenAI, AgentBudgetExceeded

client = WatchedOpenAI(
    api_key="your-openai-key",
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="my-session",
    agentwatch_session_budget_usd=2.00,  # per-session ceiling in USD
    agentwatch_enforcement_mode=True,    # check the budget before each call
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except AgentBudgetExceeded as e:
    print(f"Blocked: ${e.spent:.4f} / ${e.limit:.4f}")
```

API Reference
WatchedOpenAI
Drop-in replacement for openai.OpenAI with built-in telemetry.
Constructor Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `agentwatch_api_key` | str | required | Your AgentWatch API key |
| `agentwatch_project` | str | None | Project name for attribution |
| `agentwatch_team` | str | None | Team name for team budget enforcement |
| `agentwatch_session_id` | str | random UUID | Session identifier |
| `agentwatch_session_budget_usd` | float | None | Per-session budget ceiling |
| `agentwatch_monthly_budget_usd` | float | None | Monthly budget (reserved) |
| `agentwatch_enforcement_mode` | bool | False | Enable pre-call budget checks |
| `agentwatch_enforcement_fail_open` | bool | True | Fail open on AgentWatch outage |
| `ingest_url` | str | default | AgentWatch ingest endpoint |
| `timeout_seconds` | float | 2.0 | HTTP timeout for AgentWatch calls |
All other parameters are passed to openai.OpenAI.
Methods:
| Method | Description |
|---|---|
| `chat.completions.create(...)` | Drop-in replacement with budget enforcement |
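The interaction between `agentwatch_enforcement_mode` and `agentwatch_enforcement_fail_open` can be illustrated with a minimal sketch. It is not the SDK's implementation: `pre_call_check`, `fetch_spent`, and the two stand-in exception classes are hypothetical names, and blocking when spend reaches the limit (rather than only when it exceeds it) is an assumption.

```python
from typing import Callable


class BudgetExceeded(Exception):
    """Stand-in for AgentBudgetExceeded (hypothetical, not the SDK class)."""

    def __init__(self, session_id: str, spent: float, limit: float):
        super().__init__(f"{session_id}: ${spent:.4f} / ${limit:.4f}")
        self.session_id = session_id
        self.spent = spent
        self.limit = limit


class BudgetCheckUnavailable(Exception):
    """Stand-in for AgentBudgetCheckUnavailable (hypothetical)."""


def pre_call_check(
    session_id: str,
    limit_usd: float,
    fetch_spent: Callable[[str], float],  # hypothetical lookup of spend so far
    fail_open: bool = True,
) -> bool:
    """Return True if the call may proceed; raise if it must be blocked."""
    try:
        spent = fetch_spent(session_id)
    except OSError as exc:  # covers ConnectionError and TimeoutError
        if fail_open:
            return True  # budget service is down: let the call through
        raise BudgetCheckUnavailable(str(exc)) from exc
    if spent >= limit_usd:  # assumption: reaching the limit blocks the call
        raise BudgetExceeded(session_id, spent, limit_usd)
    return True
```

Failing open favors availability (an AgentWatch outage never blocks model calls), while failing closed favors strict cost control.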
wrap()
Wrap an existing OpenAI client with AgentWatch telemetry via composition.
```python
from openai import OpenAI
from agentwatch import wrap

client = OpenAI(api_key="your-key")

# wrap() returns a watched client; the original client is left as-is.
watched = wrap(
    client,
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="my-session",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)
```

Parameters: Same as WatchedOpenAI constructor.
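To make "via composition" concrete, here is a minimal sketch of the general pattern: the wrapper holds a reference to the client instead of subclassing it, intercepts one method, and forwards everything else. `WatchedClient`, `FakeClient`, and the `on_call` hook are hypothetical names for illustration and say nothing about how `wrap()` is actually implemented.

```python
class FakeClient:
    """Stand-in for a real API client (hypothetical)."""

    api_key = "k"

    def create(self, **kwargs):
        return {"echo": kwargs}


class WatchedClient:
    """Hypothetical composition wrapper; not the SDK's actual class."""

    def __init__(self, inner, on_call):
        self._inner = inner      # the wrapped client is held, not subclassed
        self._on_call = on_call  # hypothetical telemetry hook

    def create(self, **kwargs):
        # Intercept the call, record it, then delegate to the wrapped client.
        self._on_call(kwargs)
        return self._inner.create(**kwargs)

    def __getattr__(self, name):
        # Invoked only when normal lookup fails: forward all other attributes.
        return getattr(self._inner, name)


events = []
watched = WatchedClient(FakeClient(), on_call=events.append)
result = watched.create(model="gpt-4o")
```

Because the original client object is untouched, code that still holds a reference to it keeps working without telemetry.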
Exceptions
AgentBudgetExceeded
Raised when a session exceeds its configured budget limit.
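```python
# client is a WatchedOpenAI instance, as in Quick Start.
try:
    client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello"}])
except AgentBudgetExceeded as e:
    print(f"Session: {e.session_id}")
    print(f"Spent: ${e.spent:.4f}")
    print(f"Limit: ${e.limit:.4f}")
```

| Attribute | Type | Description |
|---|---|---|
| `session_id` | str | The session that exceeded the budget |
| `spent` | float | Amount spent in USD |
| `limit` | float | Budget limit in USD |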
AgentBudgetCheckUnavailable
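Raised when `agentwatch_enforcement_fail_open=False` and the budget check endpoint is unreachable.

```python
# client is a WatchedOpenAI instance created with agentwatch_enforcement_fail_open=False.
try:
    client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello"}])
except AgentBudgetCheckUnavailable as e:
    print(f"Session: {e.session_id}")
    print(f"Reason: {e.reason}")
```

analyze_text()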
Scan text for PII and secret risks. Returns a list of risk tag strings.
```python
from agentwatch import analyze_text

risks = analyze_text("Contact user@example.com")
# Returns: ["PII_EMAIL"]
```

Risk Tags:
| Tag | Description |
|---|---|
| `PII_EMAIL` | Email address detected |
| `PII_SSN` | Social Security Number detected |
| `FINANCIAL_CREDIT_CARD` | Credit card number detected (Luhn-validated) |
| `SECRET_AWS_ACCESS_KEY` | AWS access key detected |
| `SECRET_STRIPE` | Stripe secret key detected |
| `SECRET_GITHUB` | GitHub token detected |
| `SECRET_JWT` | JWT token detected |
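"Luhn-validated" means a candidate digit string is reported only if it passes the Luhn checksum, which filters out most random digit runs. The following is a sketch of the standard Luhn algorithm; `luhn_valid` is a hypothetical helper, and the SDK's own matching rules (patterns, length limits) are not shown here.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False  # empty input or non-digit characters
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(cleaned)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # widely used test card number
print(luhn_valid("4111 1111 1111 1112"))  # last digit changed
```

The checksum only catches accidental digit errors; passing it does not mean the number belongs to a real card.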
Streaming Support
Budget enforcement works with streaming responses. The SDK:
- Performs a pre-flight budget check before the stream starts
- Wraps the stream to monitor token usage
- Terminates the stream if the budget is exceeded mid-response
```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True,
)

# Print each content delta as it arrives.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```

WARNING
Budget enforcement for streaming calls uses estimated token counts until the stream completes. Actual enforcement may be slightly delayed compared to non-streaming calls.
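The mid-stream behavior can be sketched as a generator that wraps the chunk iterator, keeps a running cost estimate, and stops once the remaining budget is used up. Everything here is an assumption for illustration: `guard_stream`, `StreamBudgetExceeded`, the flat `usd_per_token` price, and the rough four-characters-per-token estimate are hypothetical, not the SDK's logic.

```python
from typing import Iterable, Iterator


class StreamBudgetExceeded(Exception):
    """Hypothetical error raised when the estimate passes the budget."""


def guard_stream(
    chunks: Iterable[str],
    remaining_usd: float,
    usd_per_token: float,
    chars_per_token: float = 4.0,  # crude estimate, an assumption
) -> Iterator[str]:
    """Yield chunks until the estimated cost exceeds the remaining budget."""
    spent = 0.0
    for text in chunks:
        # Exact token counts are unknown mid-stream, so estimate from length.
        spent += (len(text) / chars_per_token) * usd_per_token
        if spent > remaining_usd:
            raise StreamBudgetExceeded(
                f"estimated ${spent:.4f} exceeds ${remaining_usd:.4f}"
            )
        yield text


received = []
try:
    for piece in guard_stream(["abcd"] * 5, remaining_usd=3.0, usd_per_token=1.0):
        received.append(piece)
except StreamBudgetExceeded:
    pass  # the stream was cut off partway through
```

Because the check runs on estimates, a stream can overshoot or undershoot the true limit by a small amount, which is the delay the warning describes.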
Thread Safety
The SDK is thread-safe. Multiple threads can share a single WatchedOpenAI instance. The iteration counter uses a threading.Lock to prevent race conditions.
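As an illustration of the locking pattern described above, the sketch below guards a shared counter with `threading.Lock` so that concurrent increments are never lost. `IterationCounter` and `hammer` are hypothetical names; the SDK's internal counter is not exposed in this reference.

```python
import threading


class IterationCounter:
    """Hypothetical lock-protected counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def increment(self) -> int:
        # The read-modify-write happens entirely under the lock.
        with self._lock:
            self._value += 1
            return self._value

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(counter: IterationCounter, n_threads: int = 8, per_thread: int = 1000) -> None:
    """Increment the counter from several threads at once."""

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


counter = IterationCounter()
hammer(counter)
```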
Telemetry
Telemetry is sent asynchronously by a background daemon thread, so the main thread is never blocked. If the telemetry endpoint is unreachable, the failure is logged at debug level and otherwise ignored.
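A common way to build this kind of non-blocking telemetry is a queue drained by a daemon thread. The sketch below shows that pattern under stated assumptions: `TelemetryWorker`, its `send` callable, and `flush()` are hypothetical and do not describe the SDK's internals.

```python
import logging
import queue
import threading

logger = logging.getLogger("telemetry_sketch")


class TelemetryWorker:
    """Hypothetical background sender: enqueue now, deliver later."""

    def __init__(self, send) -> None:
        self._send = send  # hypothetical callable that delivers one event
        self._queue: queue.Queue = queue.Queue()
        # daemon=True: the thread never keeps the process alive on exit.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, event: dict) -> None:
        self._queue.put(event)  # unbounded queue, so this does not block

    def _run(self) -> None:
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception as exc:
                # Delivery failures are noted at debug level and dropped.
                logger.debug("telemetry send failed: %s", exc)
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        self._queue.join()  # wait until every queued event was processed


sent = []


def send(event: dict) -> None:
    if event.get("fail"):
        raise ConnectionError("endpoint unreachable")
    sent.append(event)


worker = TelemetryWorker(send)
worker.log({"n": 1})
worker.log({"fail": True})
worker.log({"n": 2})
worker.flush()
```

One consequence of a daemon thread is that events still queued at interpreter exit may be lost unless something like `flush()` is called first.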