Python SDK Reference

The AgentWatch Python SDK provides a drop-in replacement for the OpenAI client with built-in budget enforcement, telemetry, and anomaly detection.

Installation

bash
pip install aw-sdk

Quick Start

python
from agentwatch import WatchedOpenAI, AgentBudgetExceeded

client = WatchedOpenAI(
    api_key="your-openai-key",
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="my-session",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}]
    )
except AgentBudgetExceeded as e:
    print(f"Blocked: ${e.spent:.4f} / ${e.limit:.4f}")

API Reference

WatchedOpenAI

Drop-in replacement for openai.OpenAI with built-in telemetry.

Constructor Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| agentwatch_api_key | str | required | Your AgentWatch API key |
| agentwatch_project | str | None | Project name for attribution |
| agentwatch_team | str | None | Team name for team budget enforcement |
| agentwatch_session_id | str | random UUID | Session identifier |
| agentwatch_session_budget_usd | float | None | Per-session budget ceiling |
| agentwatch_monthly_budget_usd | float | None | Monthly budget (reserved) |
| agentwatch_enforcement_mode | bool | False | Enable pre-call budget checks |
| agentwatch_enforcement_fail_open | bool | True | Fail open on AgentWatch outage |
| ingest_url | str | default | AgentWatch ingest endpoint |
| timeout_seconds | float | 2.0 | HTTP timeout for AgentWatch calls |

All other parameters are passed to openai.OpenAI.
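The split between SDK options and pass-through options can be pictured with the sketch below. It is not the SDK's actual code: `split_kwargs` and `SDK_ONLY` are hypothetical names, and the rule that every `agentwatch_`-prefixed key (plus `ingest_url` and `timeout_seconds`) belongs to the wrapper is an assumption read off the parameter table.

```python
# Hypothetical sketch: NOT the SDK's real code. It only illustrates how a
# wrapper could separate its own options from those meant for openai.OpenAI.
SDK_ONLY = {"ingest_url", "timeout_seconds"}  # assumed non-prefixed SDK options


def split_kwargs(kwargs: dict) -> tuple[dict, dict]:
    """Return (sdk_options, passthrough_options)."""
    sdk, passthrough = {}, {}
    for name, value in kwargs.items():
        if name.startswith("agentwatch_") or name in SDK_ONLY:
            sdk[name] = value
        else:
            passthrough[name] = value
    return sdk, passthrough


sdk_opts, openai_opts = split_kwargs({
    "api_key": "your-openai-key",
    "agentwatch_api_key": "aw_live_xxx",
    "agentwatch_session_budget_usd": 2.00,
    "timeout_seconds": 2.0,
    "max_retries": 3,
})
print(sorted(sdk_opts))     # options consumed by the wrapper
print(sorted(openai_opts))  # options forwarded to openai.OpenAI
```

Under this assumption, anything openai.OpenAI already accepts (such as `api_key` or `max_retries`) reaches it untouched.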

Methods:

| Method | Description |
|---|---|
| chat.completions.create(...) | Drop-in replacement with budget enforcement |

wrap()

Wrap an existing OpenAI client with AgentWatch telemetry via composition.

python
from openai import OpenAI
from agentwatch import wrap

client = OpenAI(api_key="your-key")
watched = wrap(
    client,
    agentwatch_api_key="aw_live_xxx",
    agentwatch_session_id="my-session",
    agentwatch_session_budget_usd=2.00,
    agentwatch_enforcement_mode=True,
)

Parameters: Same as WatchedOpenAI constructor.
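For readers unfamiliar with wrapping by composition, here is a minimal sketch of the general pattern. It does not reproduce how wrap() is implemented: `FakeClient` and `Watched` are hypothetical stand-ins so the example runs without the openai package or network access.

```python
# Hypothetical sketch of wrapping by composition; the real wrap() may differ.
class FakeClient:
    """Stand-in for an OpenAI client so the example runs offline."""

    def create(self, prompt: str) -> str:
        return f"echo: {prompt}"

    def ping(self) -> str:
        return "pong"


class Watched:
    """Holds the wrapped client and intercepts only the calls it cares about."""

    def __init__(self, inner, session_id: str):
        self._inner = inner
        self.session_id = session_id
        self.calls = []  # stand-in for telemetry records

    def create(self, prompt: str) -> str:
        # Intercepted call: record it, then delegate to the wrapped client.
        self.calls.append((self.session_id, prompt))
        return self._inner.create(prompt)

    def __getattr__(self, name):
        # Invoked only for attributes not found on Watched: delegate unchanged.
        return getattr(self._inner, name)


watched = Watched(FakeClient(), session_id="my-session")
print(watched.create("Hello"))  # intercepted and recorded
print(watched.ping())           # passed straight through
```

The appeal of composition over subclassing is that the original client object is left unmodified and can still be used directly.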

Exceptions

AgentBudgetExceeded

Raised when a session exceeds its configured budget limit.

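python
try:
    # client: the WatchedOpenAI instance from Quick Start
    client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello"}])
except AgentBudgetExceeded as e:
    print(f"Session: {e.session_id}")
    print(f"Spent: ${e.spent:.4f}")
    print(f"Limit: ${e.limit:.4f}")

| Attribute | Type | Description |
|---|---|---|
| session_id | str | The session that exceeded the budget |
| spent | float | Amount spent in USD |
| limit | float | Budget limit in USD |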

AgentBudgetCheckUnavailable

Raised when agentwatch_enforcement_fail_open=False and the budget check endpoint is unreachable.

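python
try:
    client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello"}])
except AgentBudgetCheckUnavailable as e:
    print(f"Session: {e.session_id}")
    print(f"Reason: {e.reason}")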
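The difference between failing open and failing closed can be sketched as follows. This is an illustration under stated assumptions, not the SDK's internals: the two exception classes are local stand-ins that mirror the documented attributes, `pre_call_check` and `fetch_spent` are hypothetical names, and treating `spent >= limit` as "exceeded" is a guess at the boundary rule.

```python
# Hypothetical sketch of a pre-call budget check; the SDK's internals may differ.
# Both exception classes are local stand-ins mirroring the documented attributes.
class AgentBudgetExceeded(Exception):
    def __init__(self, session_id, spent, limit):
        super().__init__(f"{session_id}: ${spent:.4f} / ${limit:.4f}")
        self.session_id, self.spent, self.limit = session_id, spent, limit


class AgentBudgetCheckUnavailable(Exception):
    def __init__(self, session_id, reason):
        super().__init__(f"{session_id}: {reason}")
        self.session_id, self.reason = session_id, reason


def pre_call_check(session_id, limit, fetch_spent, fail_open=True):
    """Raise if the call must be blocked; return None if it may proceed.

    fetch_spent is a hypothetical callable that asks the budget service for
    the session's spend so far and raises OSError when it is unreachable.
    """
    try:
        spent = fetch_spent(session_id)
    except OSError as exc:
        if fail_open:
            return  # outage: let the LLM call through unchecked
        raise AgentBudgetCheckUnavailable(session_id, str(exc)) from exc
    if spent >= limit:  # assumed boundary rule
        raise AgentBudgetExceeded(session_id, spent, limit)


def unreachable(_session_id):
    raise ConnectionError("budget service timed out")


pre_call_check("my-session", 2.00, unreachable, fail_open=True)  # proceeds
try:
    pre_call_check("my-session", 2.00, unreachable, fail_open=False)
except AgentBudgetCheckUnavailable as e:
    print(f"Blocked: {e.reason}")
```

Failing open favors availability (an AgentWatch outage never blocks LLM calls), while failing closed favors strict cost control.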

analyze_text()

Scan text for PII and secret risks. Returns a list of risk tag strings.

python
from agentwatch import analyze_text

risks = analyze_text("Contact user@example.com")
# Returns: ["PII_EMAIL"]

Risk Tags:

| Tag | Description |
|---|---|
| PII_EMAIL | Email address detected |
| PII_SSN | Social Security Number detected |
| FINANCIAL_CREDIT_CARD | Credit card number detected (Luhn-validated) |
| SECRET_AWS_ACCESS_KEY | AWS access key detected |
| SECRET_STRIPE | Stripe secret key detected |
| SECRET_GITHUB | GitHub token detected |
| SECRET_JWT | JWT token detected |
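The FINANCIAL_CREDIT_CARD tag is described as Luhn-validated. The Luhn checksum itself is a standard algorithm, shown below for reference. How analyze_text applies it (candidate extraction, accepted lengths) is not documented here, so `luhn_valid` is a hypothetical helper rather than part of the SDK.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the decimal digits in `number` pass the Luhn checksum."""
    # Ignore separators such as spaces and hyphens.
    digits = [int(ch) for ch in number if ch.isdecimal()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True: a widely used test number
print(luhn_valid("4111 1111 1111 1112"))  # False: last digit altered
```

Validating the checksum filters out most random 16-digit strings, which keeps false positives down compared with a bare digit-pattern match.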

Streaming Support

Budget enforcement works with streaming responses. The SDK:

  1. Performs a pre-flight budget check before the stream starts
  2. Wraps the stream to monitor token usage
  3. Terminates the stream if budget is exceeded mid-response

python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True,
)
for chunk in stream:
    # Some chunks (for example a final usage-only chunk) can have no choices.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")

WARNING

Budget enforcement for streaming calls uses estimated token counts until the stream completes. Actual enforcement may be slightly delayed compared to non-streaming calls.
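The wrap-and-terminate behavior described above can be approximated with a generator. This is a sketch only: `guarded_stream`, `rough_token_estimate`, and the local `BudgetExceeded` class are hypothetical, the one-token-per-four-characters heuristic is a crude assumption, and the SDK's real estimator and termination mechanism are not documented here.

```python
# Hypothetical sketch: the SDK's stream wrapper and token estimator may differ.
class BudgetExceeded(Exception):
    """Local stand-in for the SDK's budget exception."""


def rough_token_estimate(text: str) -> int:
    # Crude assumption: roughly one token per four characters, at least one.
    return max(1, len(text) // 4)


def guarded_stream(chunks, budget_usd, usd_per_token, estimate=rough_token_estimate):
    """Yield chunks until the estimated running cost passes the budget."""
    spent = 0.0
    for chunk in chunks:
        spent += estimate(chunk) * usd_per_token
        if spent > budget_usd:
            # Stop before handing the over-budget chunk to the caller.
            raise BudgetExceeded(f"estimated ${spent:.4f} > ${budget_usd:.4f}")
        yield chunk


received = []
try:
    for text in guarded_stream(["Once upon", " a time", " there was"],
                               budget_usd=0.04, usd_per_token=0.01):
        received.append(text)
except BudgetExceeded as e:
    print(f"Stream terminated: {e}")
print(received)  # the first two chunks arrive before the cutoff
```

Because the cost is estimated chunk by chunk, the cutoff can land slightly before or after the true budget boundary, which matches the delay noted in the warning.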

Thread Safety

The SDK is thread-safe. Multiple threads can share a single WatchedOpenAI instance. The iteration counter uses a threading.Lock to prevent race conditions.
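As an illustration of the locking pattern mentioned here, the sketch below guards a counter with a threading.Lock. `IterationCounter` is a hypothetical class written for this example, not the SDK's own counter.

```python
import threading


class IterationCounter:
    """Counter whose increments are serialized by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self) -> int:
        with self._lock:  # the read-modify-write happens as one unit
            self._value += 1
            return self._value

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


counter = IterationCounter()


def worker():
    for _ in range(1000):
        counter.increment()


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 8000: no increments are lost
```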

Telemetry

Telemetry is sent asynchronously by a background daemon thread, so the calling thread is never blocked. If the telemetry endpoint is unreachable, the failure is logged at debug level and no exception is raised.
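A common way to get this behavior is a queue drained by a daemon thread, sketched below. This shows the general pattern and makes no claim about the SDK's implementation: `TelemetryWorker`, `flaky_send`, and the `flush` helper are hypothetical names introduced for the example.

```python
import logging
import queue
import threading

log = logging.getLogger("telemetry_sketch")


class TelemetryWorker:
    def __init__(self, send):
        self._send = send  # hypothetical callable that posts one event
        self._queue = queue.Queue()
        # daemon=True: the thread never keeps the interpreter alive at exit.
        threading.Thread(target=self._run, daemon=True).start()

    def log_event(self, event: dict) -> None:
        """Enqueue and return immediately; the caller never waits on I/O."""
        self._queue.put(event)

    def flush(self) -> None:
        """Block until every queued event has been processed."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception as exc:
                # Failures are recorded quietly and never reach the caller.
                log.debug("telemetry send failed: %s", exc)
            finally:
                self._queue.task_done()


delivered = []


def flaky_send(event):
    if event.get("fail"):
        raise ConnectionError("endpoint unreachable")
    delivered.append(event)


worker = TelemetryWorker(flaky_send)
worker.log_event({"tokens": 12})
worker.log_event({"fail": True})  # logged at debug level, never raised
worker.log_event({"tokens": 30})
worker.flush()
print(delivered)
```

One trade-off of a daemon thread is that events still queued when the process exits can be dropped, so a flush step before shutdown is worth considering in any design of this kind.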