From zero to a fully hash-chained, auditable AI agent run — with chain verification, Merkle anchoring, and a compliance report. Every step produces real output you can inspect.
The BLACKBOX SDK has zero external dependencies: it uses only urllib from the standard library. It works in any Python 3.11+ environment, including Lambda, Cloud Run, and Docker.
pip install blackbox-sdk
python -c "import blackbox_sdk as bb; print(bb.__version__)"
# → 0.3.1
Using Node.js? npm install @blackbox-ai/sdk gives you the same API and the same chain guarantees.
Point the SDK at your BLACKBOX instance. In production, use an environment variable.
import blackbox_sdk as bb
# Point to your BLACKBOX instance
bb.configure(api_url="https://blackbox-gold.vercel.app")
# Verify connectivity
import urllib.request, json
resp = urllib.request.urlopen("https://blackbox-gold.vercel.app/healthz")
print(json.loads(resp.read()))  # → {"status": "ok"}
Set BLACKBOX_API_URL as an env var. The SDK reads it automatically, so no URLs need to be hardcoded.
A run is one complete agent session, from genesis (the system prompt and config) to seal (the final tamper-evident hash). Use the context manager: it seals automatically on exit and records a fault event if an unhandled exception occurs.
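To make the seal-on-exit behaviour concrete, here is a toy context manager written from scratch. It is a sketch of the pattern only, not the SDK's implementation: `ToyRun`, its `events` list, and the event shapes are hypothetical names chosen for illustration, and whether the real SDK re-raises the exception is an assumption.

```python
class ToyRun:
    """Hypothetical stand-in for a run; NOT the blackbox_sdk implementation."""

    def __init__(self, run_id):
        self.run_id = run_id
        self.events = [{"type": "genesis", "run_id": run_id}]
        self.sealed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Record a fault if the body raised, then seal either way.
        if exc_type is not None:
            self.events.append({"type": "fault", "error": repr(exc)})
        self.events.append({"type": "seal"})
        self.sealed = True
        return False  # assumption: the exception still propagates to the caller


def demo():
    try:
        with ToyRun("toy_run_001") as run:
            raise ValueError("model timeout")
    except ValueError:
        pass
    return run
```

Calling `demo()` returns a run whose events are genesis, fault, seal, in that order: the fault is captured before the seal, so the failure itself becomes part of the sealed record.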
import blackbox_sdk as bb
bb.configure(api_url="https://blackbox-gold.vercel.app")
SYSTEM_PROMPT = """You are a credit risk assessment agent. Evaluate loan applications."""
with bb.run(
    "credit_risk_001",
    model="claude-sonnet-4-6",
    system_prompt=SYSTEM_PROMPT,
    tools=["data_lookup", "risk_calculator"],
    sampling={"temperature": 0.1, "max_tokens": 2048},
) as run:
    # Your agent logic goes here
    pass
# ↑ Sealed automatically. Chain hash computed and stored.
print("Run sealed: credit_risk_001")
# How the genesis hash is derived (pseudocode; remaining fields elided):
hash_0 = sha256("GENESIS" + canonical_json({
    "run_id": "credit_risk_001",
    "type": "genesis",
    "system_prompt": "You are a credit risk assessment agent. Evaluate loan applications.",
    ...
}))
Real agent runs have multiple steps. Each one becomes a chained event: every hash includes the previous hash, so the entire sequence is tamper-evident.
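The chaining rule can be reproduced with the standard library alone. The sketch below assumes the rule quoted in this guide, sha256(prev_hash + canonical_json(event)), with the string "GENESIS" as the initial previous hash and sorted-key compact JSON as the canonical form. The event fields are made up, and the server's exact canonicalisation may differ.

```python
import hashlib
import json


def canonical(obj):
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def chain(events):
    """Return copies of the events, each with a 'hash' linking it to the previous one."""
    prev = "GENESIS"
    out = []
    for event in events:
        digest = hashlib.sha256((prev + canonical(event)).encode()).hexdigest()
        out.append({**event, "hash": digest})
        prev = digest
    return out


# Illustrative events only; real events carry more fields.
events = [
    {"sequence": 0, "type": "genesis", "run_id": "demo"},
    {"sequence": 1, "type": "reasoning", "text": "checking inputs"},
    {"sequence": 2, "type": "seal"},
]
chained = chain(events)
```

Because each digest feeds into the next, changing the text of event 1 and re-running `chain` leaves the genesis hash untouched but changes the hashes of events 1 and 2.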
import blackbox_sdk as bb
bb.configure(api_url="https://blackbox-gold.vercel.app")
with bb.run(
    "credit_risk_001_full",
    model="claude-sonnet-4-6",
    system_prompt="You are a credit risk assessment agent. Evaluate loan applications.",
) as run:
    # Step 1: reasoning (record what the agent is thinking)
    run.reasoning(
        "Analysing input data to identify risk factors",
        latency_ms=340,
    )
    # Step 2: tool call (record what tools were used and the results)
    run.tool_call(
        "data_lookup",
        inputs={"entity_id": "LOAN-9182", "fields": ["credit_score", "income"]},
        result={"credit_score": 710, "income": 85000, "debt_ratio": 0.32},
        latency_ms=820,
    )
    # Step 3: another reasoning step
    run.reasoning(
        "Credit score 710 with 32% debt ratio: borderline. Requesting additional verification.",
        latency_ms=290,
    )
    # Step 4: output (the agent's final response)
    run.output(
        "Application LOAN-9182: CONDITIONAL APPROVAL. "
        "Credit score meets minimum threshold. Debt ratio requires verification.",
        tokens_in=1240,
        tokens_out=420,
        cost_usd=0.0089,
        latency_ms=2100,
    )
# Each event's hash = sha256(prev_hash + canonical_json(event))
# Alter any event → chain breaks from that point forward
If you're using OpenAI or Anthropic directly, bb.instrument() patches the client at startup. Every client.messages.create() call is automatically recorded as a BLACKBOX run. No context managers, no changes to existing agent code.
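Conceptually, this kind of instrumentation wraps a method so that each call is recorded as a side effect. The sketch below patches a dummy client class rather than a real SDK. `FakeMessages`, `instrument`, and `record` are hypothetical names, and recorder errors are swallowed so that a logging failure never breaks the wrapped call.

```python
import functools


class FakeMessages:
    """Dummy class standing in for an LLM SDK resource class."""

    def create(self, **kwargs):
        return {"text": "ok", "model": kwargs.get("model")}


def instrument(cls, method_name, record):
    """Replace cls.method_name with a wrapper that reports each call to record()."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        try:
            record({"inputs": kwargs, "output": result})
        except Exception:
            pass  # fail-open: recording problems must not affect the caller
        return result

    setattr(cls, method_name, wrapper)


recorded = []
instrument(FakeMessages, "create", recorded.append)
reply = FakeMessages().create(model="demo-model")
```

After the patch, existing call sites are unchanged: `FakeMessages().create(...)` returns the same value as before, and `recorded` gains one entry per call.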
import anthropic
import blackbox_sdk as bb
bb.configure(api_url="https://blackbox-gold.vercel.app")
bb.instrument() # ← one line at startup. That's it.
# Now every Anthropic or OpenAI call is auto-recorded:
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a credit risk assessment agent. Evaluate loan applications.",
    messages=[{"role": "user", "content": "Evaluate LOAN-9182"}],
)
# ↑ Automatically sealed in BLACKBOX. View it at /dashboard.
bb.instrument() monkey-patches anthropic.resources.messages.Messages.create and openai.resources.chat.completions.Completions.create. Each call gets its own run ID, genesis event, and seal. It is fail-open: if BLACKBOX is unreachable, your agent continues uninterrupted.
Chain verification re-derives every hash from scratch and compares it to the stored value. One altered byte, anywhere in the chain, produces a mismatch. No stored booleans to falsify.
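The effect of a single altered byte can be shown on a small local chain. This is a sketch under the same assumptions as the hashing rule in this guide (previous hash prepended to sorted-key compact JSON, starting from "GENESIS"). `link` and `first_break` are hypothetical helpers, not SDK functions.

```python
import hashlib
import json


def canonical(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def link(prev, body):
    return hashlib.sha256((prev + canonical(body)).encode()).hexdigest()


def first_break(events):
    """Return the index of the first event whose stored hash fails, or None."""
    prev = "GENESIS"
    for i, event in enumerate(events):
        body = {k: v for k, v in event.items() if k != "hash"}
        if link(prev, body) != event["hash"]:
            return i
        prev = event["hash"]
    return None


# Build a valid three-event chain.
events, prev = [], "GENESIS"
for seq, kind in enumerate(["genesis", "output", "seal"]):
    body = {"sequence": seq, "type": kind}
    prev = link(prev, body)
    events.append({**body, "hash": prev})

intact = first_break(events)  # None: every hash re-derives correctly

# Change one character in the middle event but leave its stored hash alone.
tampered = [dict(e) for e in events]
tampered[1]["type"] = "Output"
broken_at = first_break(tampered)  # 1: the altered event no longer matches
```

If an attacker also recomputed the hash of event 1, the mismatch would simply move to event 2, and so on up to the seal. That is why the final hash is the value worth anchoring externally.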
import urllib.request, json
run_id = "credit_risk_001_full"
url = f"https://blackbox-gold.vercel.app/v1/audit/{run_id}"
cert = json.loads(urllib.request.urlopen(url).read())
print(cert["chain_valid"]) # True — every hash checks out
print(cert["event_count"]) # 6 events
print(cert["root_hash"]) # sha256 of the final seal
print(cert["merkle_proof"]) # sibling-path inclusion proof
# Share this URL with any auditor — no auth required:
print(url)
# Independent check: re-derive the chain locally from the raw events
import hashlib, json, urllib.request
def canonical(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))
def verify_chain(events):
    prev = "GENESIS"
    for e in events:
        # Hash everything except the stored hash itself, without mutating the input
        body = {k: v for k, v in e.items() if k != "hash"}
        computed = hashlib.sha256((prev + canonical(body)).encode()).hexdigest()
        assert computed == e["hash"], f"CHAIN BREAK at seq {e['sequence']}"
        prev = e["hash"]
    return True
# Fetch events
events = json.loads(urllib.request.urlopen(
    "https://blackbox-gold.vercel.app/v1/runs/credit_risk_001_full"
).read())["events"]
print(verify_chain(events))  # True, or raises AssertionError with the break location
Every recorded run appears in the dashboard immediately. The run detail page shows the full event timeline, chain visualization, cost breakdown, and links to the public audit certificate.
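The audit certificate's merkle_proof field is described in this guide as a sibling-path inclusion proof. The exact proof format is not documented here, so the following is only a sketch of the general technique. It assumes each proof step is a (sibling hash, side) pair and that a parent is the sha256 of its two children's hex digests concatenated. Both conventions, and every name in the block, are assumptions.

```python
import hashlib


def h(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()


def verify_inclusion(leaf_hash, proof, root):
    """Fold the sibling path upward and compare the result with the root."""
    current = leaf_hash
    for sibling, side in proof:
        # 'side' says where the sibling sits relative to the current node.
        current = h(sibling + current) if side == "left" else h(current + sibling)
    return current == root


# Four-leaf example tree built by hand.
leaves = [h(f"run-{i}") for i in range(4)]
n01 = h(leaves[0] + leaves[1])
n23 = h(leaves[2] + leaves[3])
root = h(n01 + n23)

# Proof for leaves[2]: sibling leaves[3] on the right, then n01 on the left.
proof = [(leaves[3], "right"), (n01, "left")]
ok = verify_inclusion(leaves[2], proof, root)
```

The verifier needs only the leaf, the proof, and a trusted root, so an auditor can confirm that one run is covered by an anchor without downloading every other run.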
BLACKBOX detects hallucinations, policy violations, and prompt injection attempts in real time. Configure webhooks to alert your team instantly via Slack, PagerDuty, or any HTTP endpoint.
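For teams that prefer to stay in Python, an alert rule could also be registered with urllib. The sketch below only builds the request. The endpoint and payload fields are taken from the curl example in this section, `build_alert_rule_request` is a hypothetical helper, and sending is left commented out because it needs a reachable instance.

```python
import json
import urllib.request


def build_alert_rule_request(base_url, rule):
    """Build (but do not send) a POST request for /v1/alerts/rules."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/alerts/rules",
        data=json.dumps(rule).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


rule = {
    "name": "High-severity fault → Slack",
    "condition": "fault_rate > 0.1 OR severity = critical",
    "channel": "slack",
    "webhook_url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK",
    "enabled": True,
}
req = build_alert_rule_request("https://blackbox-gold.vercel.app", rule)

# To actually send it:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```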
curl -X POST https://blackbox-gold.vercel.app/v1/alerts/rules \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High-severity fault → Slack",
    "condition": "fault_rate > 0.1 OR severity = critical",
    "channel": "slack",
    "webhook_url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK",
    "enabled": true
  }'
Compliance reports are date-filtered PDFs suitable for risk committees, regulators, and auditors. They include fault rates, model breakdown, chain integrity status, cost attribution, and Merkle anchor hashes.
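A report can also be turned into an automated gate, for example in a release pipeline. The sketch below builds the date-filtered URL and checks two of the key fields listed in this section. `report_url`, `report_is_clean`, the 5% threshold, and the sample values are all hypothetical; the real response may contain more fields or different types.

```python
from urllib.parse import urlencode


def report_url(base_url, from_date, to_date):
    """Build the date-filtered report URL (dates as YYYY-MM-DD strings)."""
    query = urlencode({"from_date": from_date, "to_date": to_date})
    return f"{base_url.rstrip('/')}/v1/report?{query}"


def report_is_clean(report, max_fault_rate=0.05):
    """Hypothetical gate: chain fully valid and fault rate under a threshold."""
    return (
        report.get("chain_integrity") == "all_valid"
        and report.get("fault_rate", 1.0) <= max_fault_rate
    )


url = report_url("https://blackbox-gold.vercel.app", "2026-01-01", "2026-12-31")

# Made-up sample shaped like the key fields described in this section.
sample = {
    "total_runs": 120,
    "total_faults": 3,
    "fault_rate": 0.025,
    "chain_integrity": "all_valid",
}
clean = report_is_clean(sample)
```

A missing fault_rate defaults to 1.0 here, so an incomplete response fails the gate rather than passing silently.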
curl "https://blackbox-gold.vercel.app/v1/report?from_date=2026-01-01&to_date=2026-12-31" \
| python3 -m json.tool
# Key fields in the response:
# total_runs, total_events, total_faults, fault_rate
# chain_integrity: "all_valid" or violation count
# by_model: per-model fault rates, quality scores, costs
# fault_patterns: most common fault classes
# merkle_anchors: daily root hashes for external verification
Before going live, run through this checklist: