BLACKBOX
flight recorder for AI agents

SDK Documentation

Python · TypeScript · REST · OpenTelemetry. Record any agent in under 10 lines. Zero-dependency.

Python SDK

Python quickstart

Record your first run in under a minute.

python
import blackbox_sdk as bb

bb.configure(api_url="https://blackbox-gold.vercel.app")

with bb.run("my_first_run", model="claude-sonnet-4-6", system_prompt="You are a helpful assistant.") as r:
    r.reasoning("Analysing the user request...")
    r.tool_call("web_search", result={"query": "...", "hits": []})
    r.output("Here is the answer.", cost_usd=0.0031, latency_ms=820)
    r.seal()

# Run is now sealed, chain is valid, replay at:
# https://blackbox-gold.vercel.app/v1/runs/my_first_run

Python installation

The SDK has no external dependencies — stdlib only.

bash
pip install blackbox-sdk

Or from source:

bash
git clone https://github.com/Leclezio69/blackbox.git
cd blackbox/sdk && pip install -e .

Requires Python 3.11+.

Event types

| Method | Event type | When to use |
| --- | --- | --- |
| r.genesis(…) | genesis | First event: system prompt, tools, model snapshot, sampling params |
| r.reasoning(title, …) | reasoning | Model thinking/planning step |
| r.tool_call(tool, result, …) | tool_call | Any external action: search, API call, code execution |
| r.retrieval(query, results, …) | retrieval | Fetching documents, DB rows, or context for grounding |
| r.decision(action, …) | decision | Policy or routing decision |
| r.output(text, …) | output | The released output, verified clean or fault-free |
| r.fault(title, …) | fault | Detected problem: hallucination, policy violation, error |
| r.seal() | seal | Close the run. Chain is anchored. |

Context manager

The context manager seals the run on exit and records any unhandled exception as a fault event.

python
with bb.run("run_id", model="gpt-4o", system_prompt=SYSTEM) as r:
    for step in agent_loop():
        if step.type == "reasoning":
            r.reasoning(step.title, payload=step.data)
        elif step.type == "tool":
            # timed_tool_call(tool, inputs, fn, *args) executes fn and records latency
            result = r.timed_tool_call(step.tool_name, step.args, execute_tool, step.tool_name, step.args)
        elif step.type == "output":
            r.output(step.text, cost_usd=step.cost, latency_ms=step.ms)
    r.seal()
# If an exception escapes, it's recorded as a fault event automatically.
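
To make the seal-on-exit and exception-to-fault behaviour concrete, here is a minimal stdlib-only sketch of how a context manager of this kind could work. It is not the SDK's implementation: `ToyRun` and `toy_run` are hypothetical names, events are kept in memory, and re-raising the exception after recording the fault is an assumption.

```python
from contextlib import contextmanager


class ToyRun:
    """Hypothetical in-memory stand-in for a recorded run (not the SDK class)."""

    def __init__(self, run_id):
        self.run_id = run_id
        self.events = []      # (event_type, title) tuples in arrival order
        self.sealed = False

    def record(self, event_type, title):
        if self.sealed:
            raise RuntimeError("run is already sealed")
        self.events.append((event_type, title))

    def seal(self):
        # Idempotent: sealing twice appends only one seal event.
        if not self.sealed:
            self.events.append(("seal", "Run sealed"))
            self.sealed = True


@contextmanager
def toy_run(run_id):
    run = ToyRun(run_id)
    try:
        yield run
    except Exception as exc:
        # An escaping exception becomes a fault event, then propagates (assumption).
        if not run.sealed:
            run.record("fault", f"{type(exc).__name__}: {exc}")
        raise
    finally:
        run.seal()


captured = None
try:
    with toy_run("demo") as run:
        captured = run
        run.record("reasoning", "Planning")
        raise ValueError("tool crashed")
except ValueError:
    pass

print(captured.events)
```

Because `seal()` is idempotent in this sketch, an explicit `seal()` call inside the `with` block followed by the automatic one on exit would still leave exactly one seal event.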

Async agents

Use bb.async_run() with async with. All methods are coroutines.

python
import asyncio

import blackbox_sdk as bb

async def main():
    # "async with" is only valid inside a coroutine
    async with bb.async_run("my_async_run", model="claude-sonnet-4-6", system_prompt=SYSTEM) as r:
        await r.reasoning("Planning the response...")
        result = await my_async_tool()
        await r.tool_call("my_tool", result=result)
        await r.output("Done.", tokens_in=200, tokens_out=40, cost_usd=0.0018)
        await r.seal()

asyncio.run(main())

Auto-instrumentation

Patch openai and/or anthropic globally — every LLM call is recorded without modifying agent code.

python
import blackbox_sdk as bb

# Call once at startup — before any LLM calls
bb.configure(api_url="https://blackbox-gold.vercel.app")
bb.instrument()  # patches openai + anthropic clients

# Every client.messages.create() / client.chat.completions.create()
# is now automatically recorded as a BLACKBOX run.
# Pass record_openai=False or record_anthropic=False to selectively disable.
bb.instrument(record_openai=False)  # Anthropic only

Each auto-instrumented call creates a run named auto_<uuid> and records a genesis + reasoning event pair with model, tokens, and latency.
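
Global patching of this kind is usually done by wrapping a client method in place. The sketch below illustrates the general technique with a fake client and is not the SDK's actual code: `FakeCompletions`, `instrument_method`, and `RECORDED` are hypothetical, and the `auto_` run ID built from `uuid4().hex` is only one plausible reading of `auto_<uuid>`.

```python
import functools
import time
import uuid


class FakeCompletions:
    """Hypothetical stand-in for an LLM client resource; not a real SDK class."""

    def create(self, model, prompt):
        return {"model": model, "text": prompt.upper()}


RECORDED = []  # each entry mimics one auto-recorded run


def instrument_method(cls, method_name):
    """Replace cls.method_name with a wrapper that records every call."""
    original = getattr(cls, method_name)
    if getattr(original, "_toy_instrumented", False):
        return  # already patched; avoid double recording

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        started = time.perf_counter()
        response = original(self, *args, **kwargs)
        latency_ms = (time.perf_counter() - started) * 1000
        RECORDED.append({
            "run_id": f"auto_{uuid.uuid4().hex}",
            "model": kwargs.get("model"),
            "latency_ms": latency_ms,
        })
        return response

    wrapper._toy_instrumented = True
    setattr(cls, method_name, wrapper)


instrument_method(FakeCompletions, "create")
instrument_method(FakeCompletions, "create")  # second call is a no-op
reply = FakeCompletions().create(model="demo-model", prompt="hi")
print(len(RECORDED), RECORDED[0]["run_id"][:5])
```

The caller's code is unchanged: it still calls `create()` and receives the original response, while the wrapper records the call as a side effect.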

Chain verification

Verify a recorded run's hash chain from Python — no browser required. The chain is recomputed from scratch; any tampered event is detected.

python
import httpx

run = httpx.get("https://blackbox-gold.vercel.app/v1/runs/my_run_001").json()
print(run["chain_valid"])       # True
print(run["event_count"])       # e.g. 4
print(run["first_broken_seq"])  # None if valid

# Public audit certificate — no API key needed, share with any auditor:
cert = httpx.get("https://blackbox-gold.vercel.app/v1/audit/my_run_001").json()
print(cert["chain_valid"])      # True
print(cert["chain_head_hash"])  # sha256 of final event

TypeScript SDK

TypeScript installation

bash
npm install @blackbox-ai/sdk
# or
pnpm add @blackbox-ai/sdk

ESM + CJS dual build. Works in Node 18+, Deno, and the browser. The one exception is streamEvents, which relies on EventSource and therefore only works where EventSource is available.

TypeScript quickstart

typescript
import { BlackboxClient } from "@blackbox-ai/sdk";

const bb = new BlackboxClient({
  apiUrl: process.env.BLACKBOX_API_URL,
  apiKey: process.env.BLACKBOX_AGENT_KEY,
});

const run = bb.run("invoice-review-4821");

await run.genesis({
  system_prompt: "You are a financial analyst.",
  model_snapshot: "gpt-4o",
  tools: ["ocr_extract", "calculator"],
});

await run.reasoning("Analysing the invoice...");
await run.toolCall("ocr_extract", { path: "/tmp/invoice.pdf" });
await run.reasoning("No anomalies detected.");
await run.seal();

// Verify the chain:
const cert = await bb.getAuditCertificate("invoice-review-4821");
console.log(cert.chain_valid); // true

TypeScript auto-instrumentation

Patch openai and/or @anthropic-ai/sdk globally so that every LLM call is recorded automatically.

typescript
import { instrument } from "@blackbox-ai/sdk";

// Call once at startup — before any LLM calls
instrument({ apiUrl: process.env.BLACKBOX_API_URL });

// Every OpenAI/Anthropic call is now auto-recorded.
// Pass recordOpenAI: false or recordAnthropic: false to be selective:
instrument({ apiUrl: "...", recordOpenAI: false }); // Anthropic only

TypeScript API reference

| Method | Returns | Description |
| --- | --- | --- |
| bb.run(runId) | BlackboxRun | Create a fluent run helper |
| bb.record(event) | Promise<EventOut> | Append a raw event |
| bb.getRun(runId) | Promise<RunOut> | Replay run + chain_valid |
| bb.listRuns() | Promise<RunSummary[]> | All run summaries |
| bb.verify(runId, input) | Promise<VerifyOut> | Citation verify + record |
| bb.getAuditCertificate(id) | Promise<AuditCertificate> | Public cert, no auth |
| bb.streamEvents({ onEvent }) | EventSource | SSE live event stream |
| run.genesis(ctx) | Promise<EventOut> | Record system context |
| run.reasoning(title, payload?, meta?) | Promise<EventOut> | Reasoning step |
| run.toolCall(tool, payload?, meta?) | Promise<EventOut> | Tool invocation |
| run.retrieval(title, payload?) | Promise<EventOut> | RAG retrieval step |
| run.fault(title, payload?) | Promise<EventOut> | Detected fault |
| run.verify(input) | Promise<VerifyOut> | Verify output |
| run.seal() | Promise<EventOut> | Seal the run |

REST API

Manual ingest (REST)

Any language, any stack — just HTTP POST.

bash
# Append an event
curl -X POST https://blackbox-gold.vercel.app/v1/events \
  -H "Content-Type: application/json" \
  -d '{
    "run_id": "my_run_001",
    "type": "genesis",
    "title": "Run started",
    "genesis": {
      "system_prompt": "You are a helpful assistant.",
      "tools": ["web_search"],
      "model_snapshot": "claude-sonnet-4-6",
      "sampling": {"temperature": 0.3}
    }
  }'

# Replay the run
curl https://blackbox-gold.vercel.app/v1/runs/my_run_001
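
The same ingest request can be built from Python with only the standard library. This sketch mirrors the curl call above; `build_event_request` and `BASE_URL` are hypothetical names, and the request is constructed but not sent so the example has no network side effects.

```python
import json
import urllib.request

BASE_URL = "https://blackbox-gold.vercel.app"


def build_event_request(event, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/events request for one event."""
    return urllib.request.Request(
        f"{base_url}/v1/events",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


genesis_event = {
    "run_id": "my_run_001",
    "type": "genesis",
    "title": "Run started",
    "genesis": {
        "system_prompt": "You are a helpful assistant.",
        "tools": ["web_search"],
        "model_snapshot": "claude-sonnet-4-6",
        "sampling": {"temperature": 0.3},
    },
}

request = build_event_request(genesis_event)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=10)
```

Keeping construction separate from sending makes the payload easy to inspect or unit-test before any event reaches the server.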

OpenTelemetry

Point your OTel exporter at BLACKBOX — zero code changes. Works with opentelemetry-instrumentation-anthropic, opentelemetry-instrumentation-openai, LangChain, and more.

bash
# Zero-config: set env vars, done
export OTEL_EXPORTER_OTLP_ENDPOINT="https://blackbox-gold.vercel.app"
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://blackbox-gold.vercel.app/v1/otel/traces"
export OTEL_SERVICE_NAME=my-agent

python
# Python: BLACKBOX accepts OTLP/HTTP JSON. This snippet defines a minimal BlackboxExporter inline.
# pip install opentelemetry-instrumentation-anthropic opentelemetry-sdk anthropic
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, SpanExporter, SpanExportResult
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor
import json, urllib.request

class BlackboxExporter(SpanExporter):
    def export(self, spans):
        payload = {"resourceSpans": [{"resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-agent"}}]},
            "scopeSpans": [{"spans": [{"traceId": format(s.context.trace_id,"032x"),
                "spanId": format(s.context.span_id,"016x"), "name": s.name,
                "startTimeUnixNano": str(s.start_time), "endTimeUnixNano": str(s.end_time),
                "attributes": [{"key":k,"value":{"stringValue":str(v)}}
                               for k,v in (s.attributes or {}).items()],
            } for s in spans]}]}]}
        req = urllib.request.Request("https://blackbox-gold.vercel.app/v1/otel/traces",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"}, method="POST")
        urllib.request.urlopen(req, timeout=10)
        return SpanExportResult.SUCCESS
    def shutdown(self): pass

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(BlackboxExporter()))
trace.set_tracer_provider(provider)
AnthropicInstrumentor().instrument()
# Every Anthropic call is now sealed to BLACKBOX

Each OTel trace → one BLACKBOX run. Re-exports are idempotent.

API reference

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /v1/events | Append an event to a run |
| GET | /v1/runs/{id} | Replay a run + chain verification |
| GET | /v1/runs | List all runs (filterable) |
| POST | /v1/runs/{id}/verify | Verify output + record result |
| GET | /v1/audit/{id} | Public chain certificate (no auth) |
| GET | /v1/runs/{id}/proof | Merkle inclusion proof |
| GET | /v1/report | Compliance report (date-filtered) |
| GET | /v1/stats | Aggregate metrics |
| GET | /v1/leaderboard | Model quality rankings |
| GET | /v1/search | Full-text search across runs |
| GET | /v1/anchor/{date} | Daily Merkle root |
| GET | /v1/integrity | Chain integrity panorama |
| POST | /v1/otel/traces | Ingest OTLP traces |
| POST | /v1/demo/seed | Load demo data (idempotent) |

Full interactive spec: https://blackbox-gold.vercel.app/docs

Public audit certificates

Share a run with any auditor — no account required. The certificate endpoint verifies the hash chain independently and returns structured JSON.

bash
# Get a public certificate — no API key needed
curl https://blackbox-gold.vercel.app/v1/audit/{run_id}

The certificate includes: chain validity, chain head hash, fault count, model snapshot, Merkle anchor, and the verification algorithm spec. Any party can independently verify by recomputing sha256(prev_hash + canonical_json(event)) for each event.
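
The recomputation described above can be sketched in a few lines of Python. Several details here are assumptions, not the published algorithm spec: the canonical JSON form (sorted keys, compact separators), the empty string as the first `prev_hash`, hex-encoded hashes, and zero-based sequence numbers. The event fields are also illustrative.

```python
import hashlib
import json


def canonical_json(event):
    # Assumption: canonical form = sorted keys, compact separators, UTF-8.
    return json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def link_hash(prev_hash, event):
    # sha256(prev_hash + canonical_json(event)), returned as a hex string.
    return hashlib.sha256((prev_hash + canonical_json(event)).encode("utf-8")).hexdigest()


def first_broken_seq(events, hashes, genesis_prev=""):
    """Return the index of the first event whose recomputed hash differs, else None."""
    prev = genesis_prev
    for seq, (event, recorded) in enumerate(zip(events, hashes)):
        if link_hash(prev, event) != recorded:
            return seq
        prev = recorded
    return None


events = [
    {"type": "genesis", "title": "Run started"},
    {"type": "reasoning", "title": "Planning"},
    {"type": "seal", "title": "Run sealed"},
]

# Build the reference chain the way a recorder would.
hashes = []
prev = ""
for event in events:
    prev = link_hash(prev, event)
    hashes.append(prev)

print(first_broken_seq(events, hashes))    # None

# Editing any event after the fact breaks the chain at that position.
tampered = [dict(e) for e in events]
tampered[1]["title"] = "Edited after the fact"
print(first_broken_seq(tampered, hashes))  # 1
```

Because each hash folds in the previous one, changing an earlier event invalidates its own link; an auditor only needs the events and the recorded hashes to detect this.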
