BLACKBOX SDK

SDK Documentation

Python · TypeScript · REST · OpenTelemetry. Record any agent in under 10 lines. Zero-dependency.

Python SDK

Python quickstart

Record your first run in under a minute.

python

import blackbox_sdk as bb

bb.configure(api_url="https://blackbox-gold.vercel.app")

with bb.run("my_first_run", model="claude-sonnet-4-6", system_prompt="You are a helpful assistant.") as r:
    r.reasoning("Analysing the user request...")
    r.tool_call("web_search", result={"query": "...", "hits": []})
    r.output("Here is the answer.", cost_usd=0.0031, latency_ms=820)
    r.seal()

# Run is now sealed, chain is valid, replay at:
# https://blackbox-gold.vercel.app/v1/runs/my_first_run

Python installation

The SDK has no external dependencies — stdlib only.

bash

pip install blackbox-sdk

Or from source:

bash

git clone https://github.com/Leclezio69/blackbox.git
cd blackbox/sdk && pip install -e .

Requires Python 3.11+.

Event types

Method	Event type	When to use
`r.genesis(…)`	genesis	First event — system prompt, tools, model snapshot, sampling params
`r.reasoning(title, …)`	reasoning	Model thinking/planning step
`r.tool_call(tool, result, …)`	tool_call	Any external action: search, API call, code execution
`r.retrieval(query, results, …)`	retrieval	Fetching documents, DB rows, or context for grounding
`r.decision(action, …)`	decision	Policy or routing decision
`r.output(text, …)`	output	The released output — verified clean or fault-free
`r.fault(title, …)`	fault	Detected problem — hallucination, policy violation, error
`r.seal()`	seal	Close the run. Chain is anchored.

Context manager

The context manager auto-seals and catches unhandled exceptions as fault events.

python

with bb.run("run_id", model="gpt-4o", system_prompt=SYSTEM) as r:
    for step in agent_loop():
        if step.type == "reasoning":
            r.reasoning(step.title, payload=step.data)
        elif step.type == "tool":
            # timed_tool_call(tool, inputs, fn, *args) executes fn and records latency
            result = r.timed_tool_call(step.tool_name, step.args, execute_tool, step.tool_name, step.args)
        elif step.type == "output":
            r.output(step.text, cost_usd=step.cost, latency_ms=step.ms)
    r.seal()
# If an exception escapes, it's recorded as a fault event automatically.
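
To illustrate the auto-seal and fault-capture behaviour described above, here is a minimal stdlib sketch of a context manager with the same shape. `SketchRun` and `sketch_run` are hypothetical names used only for illustration; this is not the SDK's implementation, and the real event fields may differ.

```python
from contextlib import contextmanager


class SketchRun:
    """Hypothetical in-memory recorder; not the SDK's real run class."""

    def __init__(self, run_id):
        self.run_id = run_id
        self.events = []
        self.sealed = False

    def record(self, event_type, title):
        if self.sealed:
            raise RuntimeError("run is already sealed")
        self.events.append({"type": event_type, "title": title})

    def seal(self):
        # Sealing twice is a no-op, so an explicit seal() inside the
        # block and the automatic one on exit can coexist.
        if not self.sealed:
            self.events.append({"type": "seal", "title": "sealed"})
            self.sealed = True


@contextmanager
def sketch_run(run_id):
    run = SketchRun(run_id)
    try:
        yield run
    except Exception as exc:
        # An escaping exception becomes a fault event, then propagates.
        if not run.sealed:
            run.record("fault", f"{type(exc).__name__}: {exc}")
        raise
    finally:
        run.seal()
```

In this sketch the exception is re-raised after being recorded, so the caller still sees the failure while the run keeps a trace of it.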

Async agents

Use bb.async_run() with async with. All methods are coroutines.

python

import asyncio
import blackbox_sdk as bb

SYSTEM = "You are a helpful assistant."

async def main():
    async with bb.async_run("my_async_run", model="claude-sonnet-4-6", system_prompt=SYSTEM) as r:
        await r.reasoning("Planning the response...")
        result = await my_async_tool()  # your own coroutine
        await r.tool_call("my_tool", result=result)
        await r.output("Done.", tokens_in=200, tokens_out=40, cost_usd=0.0018)
        await r.seal()

asyncio.run(main())

Auto-instrumentation

Patch openai and/or anthropic globally — every LLM call is recorded without modifying agent code.

python

import blackbox_sdk as bb

# Call once at startup — before any LLM calls
bb.configure(api_url="https://blackbox-gold.vercel.app")
bb.instrument()  # patches openai + anthropic clients

# Every client.messages.create() / client.chat.completions.create()
# is now automatically recorded as a BLACKBOX run.
# Pass record_openai=False or record_anthropic=False to selectively disable.
bb.instrument(record_openai=False)  # Anthropic only

Each auto-instrumented call creates a run named auto_<uuid> and records a genesis + reasoning event pair with model, tokens, and latency.
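
Global patching of this kind is usually done by wrapping a client method in place. The following is a rough sketch of that general technique, not the SDK's actual code: `FakeCompletions`, `instrument_method`, and the `recorded` list are hypothetical stand-ins, and the recorded fields are a guess at the minimum (run name, model, latency).

```python
import functools
import time
import uuid

recorded = []  # stand-in for events that would be sent to BLACKBOX


class FakeCompletions:
    """Hypothetical stand-in for an LLM client's completions resource."""

    def create(self, model, prompt):
        return {"model": model, "text": prompt.upper()}


def instrument_method(cls, name):
    """Replace cls.<name> with a wrapper that records each call."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        latency_ms = (time.perf_counter() - start) * 1000
        recorded.append({
            "run_id": f"auto_{uuid.uuid4().hex}",
            "model": kwargs.get("model"),
            "latency_ms": latency_ms,
        })
        return result

    setattr(cls, name, wrapper)


instrument_method(FakeCompletions, "create")
reply = FakeCompletions().create(model="demo-model", prompt="hi")
```

Because the wrapper is installed on the class, every existing and future instance is covered, which is why patching needs to happen once at startup.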

Chain verification

Verify a recorded run's hash chain from Python — no browser required. The chain is recomputed from scratch; any tampered event is detected.

python

import httpx

run = httpx.get("https://blackbox-gold.vercel.app/v1/runs/my_run_001").json()
print(run["chain_valid"])       # True
print(run["event_count"])       # e.g. 4
print(run["first_broken_seq"])  # None if valid

# Public audit certificate — no API key needed, share with any auditor:
cert = httpx.get("https://blackbox-gold.vercel.app/v1/audit/my_run_001").json()
print(cert["chain_valid"])      # True
print(cert["chain_head_hash"])  # sha256 of final event
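
In a CI job or a test suite it can be handy to fail hard when a chain is broken rather than print the flag. A small helper along these lines might do; `check_chain` is a hypothetical name, and it relies only on the three response fields shown in the example above (`chain_valid`, `event_count`, `first_broken_seq`).

```python
def check_chain(run):
    """Return the event count, or raise ValueError if the chain is broken."""
    if run.get("chain_valid"):
        return run["event_count"]
    broken = run.get("first_broken_seq")
    raise ValueError(f"hash chain broken at seq {broken}")


# Example with a payload shaped like the response above:
count = check_chain({"chain_valid": True, "event_count": 4, "first_broken_seq": None})
```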

TypeScript SDK

TypeScript installation

bash

npm install @blackbox-ai/sdk
# or
pnpm add @blackbox-ai/sdk

ESM + CJS dual build. Works in Node 18+, Deno, and the browser (except streamEvents which uses EventSource).

TypeScript quickstart

typescript

import { BlackboxClient } from "@blackbox-ai/sdk";

const bb = new BlackboxClient({
  apiUrl: process.env.BLACKBOX_API_URL,
  apiKey: process.env.BLACKBOX_AGENT_KEY,
});

const run = bb.run("invoice-review-4821");

await run.genesis({
  system_prompt: "You are a financial analyst.",
  model_snapshot: "gpt-4o",
  tools: ["ocr_extract", "calculator"],
});

await run.reasoning("Analysing the invoice...");
await run.toolCall("ocr_extract", { path: "/tmp/invoice.pdf" });
await run.reasoning("No anomalies detected.");
await run.seal();

// Verify the chain:
const cert = await bb.getAuditCertificate("invoice-review-4821");
console.log(cert.chain_valid); // true

TypeScript auto-instrumentation

Patch openai and/or @anthropic-ai/sdk globally, so that every LLM call is recorded automatically.

typescript

import { instrument } from "@blackbox-ai/sdk";

// Call once at startup — before any LLM calls
instrument({ apiUrl: process.env.BLACKBOX_API_URL });

// Every OpenAI/Anthropic call is now auto-recorded.
// Pass recordOpenAI: false or recordAnthropic: false to be selective:
instrument({ apiUrl: "...", recordOpenAI: false }); // Anthropic only

TypeScript API reference

Method	Returns	Description
`bb.run(runId)`	`BlackboxRun`	Create a fluent run helper
`bb.record(event)`	`Promise<EventOut>`	Append a raw event
`bb.getRun(runId)`	`Promise<RunOut>`	Replay run + chain_valid
`bb.listRuns()`	`Promise<RunSummary[]>`	All run summaries
`bb.verify(runId, input)`	`Promise<VerifyOut>`	Citation verify + record
`bb.getAuditCertificate(id)`	`Promise<AuditCertificate>`	Public cert, no auth
`bb.streamEvents({ onEvent })`	`EventSource`	SSE live event stream
`run.genesis(ctx)`	`Promise<EventOut>`	Record system context
`run.reasoning(title, payload?, meta?)`	`Promise<EventOut>`	Reasoning step
`run.toolCall(tool, payload?, meta?)`	`Promise<EventOut>`	Tool invocation
`run.retrieval(title, payload?)`	`Promise<EventOut>`	RAG retrieval step
`run.fault(title, payload?)`	`Promise<EventOut>`	Detected fault
`run.verify(input)`	`Promise<VerifyOut>`	Verify output
`run.seal()`	`Promise<EventOut>`	Seal the run

REST API

Manual ingest (REST)

Any language, any stack — just HTTP POST.

bash

# Append an event
curl -X POST https://blackbox-gold.vercel.app/v1/events \
  -H "Content-Type: application/json" \
  -d '{
    "run_id": "my_run_001",
    "type": "genesis",
    "title": "Run started",
    "genesis": {
      "system_prompt": "You are a helpful assistant.",
      "tools": ["web_search"],
      "model_snapshot": "claude-sonnet-4-6",
      "sampling": {"temperature": 0.3}
    }
  }'

# Replay the run
curl https://blackbox-gold.vercel.app/v1/runs/my_run_001
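
The same POST can be built from Python with only the standard library. This sketch reuses the endpoint and genesis payload from the curl example; `build_event_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://blackbox-gold.vercel.app"


def build_event_request(event, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/events request."""
    return urllib.request.Request(
        f"{base_url}/v1/events",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


event = {
    "run_id": "my_run_001",
    "type": "genesis",
    "title": "Run started",
    "genesis": {
        "system_prompt": "You are a helpful assistant.",
        "tools": ["web_search"],
        "model_snapshot": "claude-sonnet-4-6",
        "sampling": {"temperature": 0.3},
    },
}
request = build_event_request(event)
# To send it: urllib.request.urlopen(request, timeout=10)
```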

OpenTelemetry

Point your OTel exporter at BLACKBOX — zero code changes. Works with opentelemetry-instrumentation-anthropic, opentelemetry-instrumentation-openai, LangChain, and more.

bash

# Zero-config: set env vars, done
export OTEL_EXPORTER_OTLP_ENDPOINT="https://blackbox-gold.vercel.app"
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://blackbox-gold.vercel.app/v1/otel/traces"
export OTEL_SERVICE_NAME=my-agent

python

# Python: BLACKBOX accepts OTLP/HTTP JSON, so a small custom exporter (BlackboxExporter, defined below) is enough
# pip install opentelemetry-instrumentation-anthropic opentelemetry-sdk anthropic
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, SpanExporter, SpanExportResult
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor
import json, urllib.request

class BlackboxExporter(SpanExporter):
    def export(self, spans):
        payload = {"resourceSpans": [{"resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-agent"}}]},
            "scopeSpans": [{"spans": [{"traceId": format(s.context.trace_id,"032x"),
                "spanId": format(s.context.span_id,"016x"), "name": s.name,
                "startTimeUnixNano": str(s.start_time), "endTimeUnixNano": str(s.end_time),
                "attributes": [{"key":k,"value":{"stringValue":str(v)}}
                               for k,v in (s.attributes or {}).items()],
            } for s in spans]}]}]}
        req = urllib.request.Request("https://blackbox-gold.vercel.app/v1/otel/traces",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"}, method="POST")
        urllib.request.urlopen(req, timeout=10)
        return SpanExportResult.SUCCESS
    def shutdown(self): pass

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(BlackboxExporter()))
trace.set_tracer_provider(provider)
AnthropicInstrumentor().instrument()
# Every Anthropic call is now sealed to BLACKBOX

Each OTel trace maps to one BLACKBOX run. Re-exports are idempotent.
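
How BLACKBOX achieves this server-side is not documented here. As a purely illustrative sketch of the idea, one way to get one run per trace and idempotent re-exports is to key storage on `traceId` and skip any `spanId` already seen. `ingest_spans` and the in-memory `store` are hypothetical.

```python
def ingest_spans(store, spans):
    """Group spans into runs keyed by traceId; skip spans already seen.

    Returns the number of spans that were newly stored.
    """
    added = 0
    for span in spans:
        run = store.setdefault(span["traceId"], {})
        if span["spanId"] not in run:
            run[span["spanId"]] = span
            added += 1
    return added


store = {}
batch = [
    {"traceId": "t1", "spanId": "a", "name": "llm.call"},
    {"traceId": "t1", "spanId": "b", "name": "tool.call"},
]
first = ingest_spans(store, batch)   # stores both spans under one run
second = ingest_spans(store, batch)  # re-export changes nothing
```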

API reference

Method	Endpoint	Description
POST	`/v1/events`	Append an event to a run
GET	`/v1/runs/{id}`	Replay a run + chain verification
GET	`/v1/runs`	List all runs (filterable)
POST	`/v1/runs/{id}/verify`	Verify output + record result
GET	`/v1/audit/{id}`	Public chain certificate (no auth)
GET	`/v1/runs/{id}/proof`	Merkle inclusion proof
GET	`/v1/report`	Compliance report (date-filtered)
GET	`/v1/stats`	Aggregate metrics
GET	`/v1/leaderboard`	Model quality rankings
GET	`/v1/search`	Full-text search across runs
GET	`/v1/anchor/{date}`	Daily Merkle root
GET	`/v1/integrity`	Chain integrity panorama
POST	`/v1/otel/traces`	Ingest OTLP traces
POST	`/v1/demo/seed`	Load demo data (idempotent)

Full interactive spec: https://blackbox-gold.vercel.app/docs

Public audit certificates

Share a run with any auditor — no account required. The certificate endpoint verifies the hash chain independently and returns structured JSON.

bash

# Get a public certificate — no API key needed
curl https://blackbox-gold.vercel.app/v1/audit/{run_id}

The certificate includes: chain validity, chain head hash, fault count, model snapshot, Merkle anchor, and the verification algorithm spec. Any party can independently verify by recomputing sha256(prev_hash + canonical_json(event)) for each event.
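
The recomputation described above can be sketched in a few lines of Python. Several details are assumptions, since the authoritative rules live in the verification algorithm spec returned with the certificate: `canonical_json` is taken to mean sorted keys with compact separators, hashes are hex strings concatenated as text, the first event uses an empty `prev_hash`, and each item is shaped as `{"event": ..., "hash": ...}`.

```python
import hashlib
import json


def canonical_json(event):
    # Assumption: sorted keys, compact separators, UTF-8 output.
    return json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def link_hash(prev_hash, event):
    """sha256(prev_hash + canonical_json(event)) as a hex digest."""
    data = (prev_hash + canonical_json(event)).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def first_broken_seq(events, genesis_prev=""):
    """Return the index of the first event whose hash fails, else None."""
    prev = genesis_prev
    for seq, item in enumerate(events):
        if link_hash(prev, item["event"]) != item["hash"]:
            return seq
        prev = item["hash"]
    return None


# Build a small valid chain, then tamper with the middle event.
chain = []
prev = ""
for body in ({"type": "genesis"}, {"type": "reasoning"}, {"type": "seal"}):
    digest = link_hash(prev, body)
    chain.append({"event": body, "hash": digest})
    prev = digest

valid_result = first_broken_seq(chain)
chain[1]["event"]["type"] = "output"
tampered_result = first_broken_seq(chain)
```

Because each hash covers the previous one, changing any event invalidates its own stored hash, and the check reports the earliest sequence number at which the chain stops matching.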
