Technical reference · May 2026
ZizkaDB
The operational database for AI agents. This document describes architecture, APIs, data semantics, and how ZizkaDB fits alongside vector databases, application databases, and tracing tools.
Overview
ZizkaDB stores agent events — decisions, tool calls, messages, and outcomes — with causal links, semantic search over history, reconstruction of logged state at a timestamp, and statistical behavioral baselines per agent.
| Layer | Role | Examples |
|---|---|---|
| Vector retrieval | Document / knowledge search by embedding | Pinecone, Qdrant, Weaviate |
| Application state | Transactional app data | Postgres, Redis |
| Agent operations | Decision history, lineage, drift | ZizkaDB |
| Traces & evals | Spans, experiments, framework hooks | LangSmith, OpenTelemetry |
Agent runtime
├── Vector DB → RAG / knowledge retrieval
├── Postgres / Redis → app state, caches, users
├── ZizkaDB → events, why(), at(), baseline, forget()
└── OTel / LangSmith → distributed traces (optional)
Problem domain
Production agent systems typically need:
- Causal debugging: walk from a bad output to the event that triggered it
- State at time T: what was logged for an agent at or before a given timestamp
- Cross-session memory: under your API keys, retention policy, and erasure controls
- Behavior change detection: compare recent sessions to that agent's historical pattern
ZizkaDB targets these operational concerns. It does not replace embedding indexes for document RAG or distributed tracing for infrastructure spans.
Architecture
Managed (db.zizka.ai)
| Component | Technology | Notes |
|---|---|---|
| Dashboard | Next.js 14 | PM2 :3001, nginx TLS |
| API | FastAPI (Python) | Docker :8000 |
| Primary store | Postgres + pgvector | Events, tenants, metadata |
| Vector search | Qdrant | Semantic search index |
| Cache | Redis | Sessions / cache |
| Edge | nginx | / → dashboard, /v1/ → API, /swagger → OpenAPI |
Self-hosted
Docker Compose stack: Postgres (pgvector), Qdrant, Redis, API. Dashboard is optional (run from dashboard/). Requires OPENAI_API_KEY for embedding-backed features; logging and causal APIs work without it.
git clone https://github.com/Zizka-ai/ZizkaDB
cd ZizkaDB
cp .env.example infra/.env
docker compose -f infra/docker-compose.yml up -d
Data model
Every record is an event:
| Field | Type | Description |
|---|---|---|
| agent | string | Agent identifier (fleet key) |
| event | string | Event type label (e.g. tool_call, user_message) |
| data | JSON object | Arbitrary payload you control |
| parent_id | UUID (optional) | Causal parent event |
| session_id | string (optional) | Groups a run / conversation |
| metadata | JSON (optional) | Tenant tags (user_id, env, etc.) |
| timestamp | ISO 8601 | Server-assigned or client-provided |
| checksum | SHA-256 | Hash over canonical payload bytes |
Multi-tenancy: API keys (zizkadb_live_*) scope all reads and writes to a tenant. Dashboard auth uses email OTP → JWT.
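As a rough illustration of the data model, the sketch below assembles two event records linked through parent_id. The field names come from the table above; the `make_event` helper, the locally generated `event_id`, and the exact JSON shape are assumptions made for this example, not documented SDK behavior.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Optional


def make_event(
    agent: str,
    event: str,
    data: dict[str, Any],
    parent_id: Optional[str] = None,
    session_id: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build an event dict using the field names from the data model table.

    Hypothetical helper: the id is generated locally only so that the two
    example events can reference each other.
    """
    record: dict[str, Any] = {
        "event_id": str(uuid.uuid4()),
        "agent": agent,
        "event": event,
        "data": data,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Optional fields are left out instead of being sent as null.
    if parent_id is not None:
        record["parent_id"] = parent_id
    if session_id is not None:
        record["session_id"] = session_id
    if metadata is not None:
        record["metadata"] = metadata
    return record


msg = make_event(
    "bot", "user_message", {"text": "refund order 1234"},
    session_id="s-1", metadata={"env": "dev"},
)
tool = make_event(
    "bot", "tool_call", {"name": "lookup_order"},
    parent_id=msg["event_id"], session_id="s-1",
)
print(json.dumps(tool, indent=2))
```

The tool_call event carries the user_message id as its parent_id, which is the link a causal lookup would later follow.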
Capabilities
| Endpoint | SDK call | Description |
|---|---|---|
| POST /v1/events | db.log(...) | Append events. Use parent_id to build causal trees. |
| GET /v1/events/{id}/why | db.why(event_id) | Returns the ancestor chain from the event to the root. |
| POST /v1/search | db.search(query, agent=..., limit=...) | Semantic search over event history; embeddings are generated with OpenAI text-embedding-3-small on ingest. |
| GET /v1/events/at | db.at(agent, timestamp) | Aggregates logged events ≤ timestamp into a state snapshot. |
| POST /v1/memory/context | db.context_for(agent, task, max_tokens=...) | Recent and semantically relevant events, formatted for system prompts. |
| GET /v1/agents/{id}/baseline | db.baseline(agent, recent_window=...) | Drift score vs. the historical event distribution; needs session volume. |
| GET /v1/memory/diff/{session_id} | db.memory_diff(session_id) | Summary of event counts, errors, and changes within a session. |
| DELETE /v1/memory/forget | db.forget(filter_key, filter_value) | Deletes events (and vectors) matching a metadata filter. |
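The why() lookup amounts to following parent_id links until an event has no parent. The sketch below shows that walk over a toy in-memory log; the `EVENTS` dict and this `why` function are illustrative stand-ins, not the server's implementation.

```python
from typing import Optional

# Toy in-memory event log: event_id -> (event type, parent_id).
EVENTS: dict[str, tuple[str, Optional[str]]] = {
    "e1": ("user_message", None),
    "e2": ("decision", "e1"),
    "e3": ("tool_call", "e2"),
    "e4": ("outcome", "e3"),
}


def why(event_id: str, events: dict[str, tuple[str, Optional[str]]]) -> list[str]:
    """Return the ancestor chain from event_id up to the root (inclusive)."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = event_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        if current not in events:
            # A missing parent means that event was never logged; stop here.
            break
        seen.add(current)
        chain.append(current)
        current = events[current][1]
    return chain


print(why("e4", EVENTS))  # ['e4', 'e3', 'e2', 'e1']
```

The chain is only as complete as the parent_id values the integration supplied: an event logged without a parent ends the walk early.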
Time travel semantics
at(agent, timestamp) reconstructs logged state by aggregating all events for that agent with timestamp ≤ T. The returned structure reflects what was recorded in ZizkaDB, not a re-execution of the LLM or tools.
| Property | Behavior |
|---|---|
| Determinism | Same event log → same reconstructed snapshot |
| LLM outputs | Not re-generated; only stored payloads are returned |
| Checksums | Per-event SHA-256 enables integrity verification of stored payloads |
| Completeness | Depends on what your integration logged (gaps = missing parent events) |
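To make the determinism property concrete, here is a minimal sketch of reconstructing state by folding logged events with timestamp ≤ T. The snapshot shape (per-type counts plus the latest payload per event type) is an assumption chosen for illustration; the structure ZizkaDB actually returns may differ.

```python
from datetime import datetime
from typing import Any

LOG = [
    {"agent": "bot", "event": "user_message",
     "timestamp": "2026-05-01T10:00:00+00:00", "data": {"text": "hi"}},
    {"agent": "bot", "event": "tool_call",
     "timestamp": "2026-05-01T10:00:05+00:00", "data": {"name": "search"}},
    {"agent": "bot", "event": "tool_call",
     "timestamp": "2026-05-01T10:00:09+00:00", "data": {"name": "fetch"}},
    {"agent": "other", "event": "user_message",
     "timestamp": "2026-05-01T10:00:01+00:00", "data": {"text": "yo"}},
]


def at(agent: str, timestamp: str, log: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold every event for `agent` with timestamp <= T into a snapshot."""
    cutoff = datetime.fromisoformat(timestamp)
    selected = [
        e for e in log
        if e["agent"] == agent and datetime.fromisoformat(e["timestamp"]) <= cutoff
    ]
    # Sort by time so the result does not depend on input order (ties aside).
    selected.sort(key=lambda e: datetime.fromisoformat(e["timestamp"]))
    snapshot: dict[str, Any] = {
        "agent": agent, "as_of": timestamp, "counts": {}, "latest": {},
    }
    for e in selected:
        snapshot["counts"][e["event"]] = snapshot["counts"].get(e["event"], 0) + 1
        snapshot["latest"][e["event"]] = e["data"]
    return snapshot


state = at("bot", "2026-05-01T10:00:06+00:00", LOG)
print(state["counts"])  # {'user_message': 1, 'tool_call': 1}
```

Nothing is re-executed here: the snapshot is a pure function of the stored events, which is why the same log yields the same result.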
Integrity & retention
- Events are stored in Postgres with an append-by-default write pattern; each payload includes a SHA-256 checksum for integrity verification.
- forget() removes events matching metadata filters (GDPR right to erasure); storage is not WORM/immutable.
- Managed plans enforce retention windows (90 days Pro, 1 year Team); self-hosted retention is operator-defined.
- Bulk signed audit export is on the product roadmap; today, export via API/query and verify checksums per event.
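Per-event verification after an export could look roughly like the sketch below. The canonical form used here (sorted keys, compact separators, UTF-8) and the choice to hash only the data field are assumptions; the document does not specify how ZizkaDB canonicalizes payload bytes, so check that before comparing against real checksums.

```python
import hashlib
import json
from typing import Any


def canonical_bytes(payload: dict[str, Any]) -> bytes:
    """Assumed canonical form: sorted keys, compact separators, UTF-8."""
    text = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def checksum(payload: dict[str, Any]) -> str:
    """SHA-256 hex digest over the canonical payload bytes."""
    return hashlib.sha256(canonical_bytes(payload)).hexdigest()


def verify(events: list[dict[str, Any]]) -> list[str]:
    """Return ids of events whose stored checksum no longer matches."""
    return [e["event_id"] for e in events if checksum(e["data"]) != e["checksum"]]


exported = [
    {"event_id": "e1", "data": {"text": "hello"},
     "checksum": checksum({"text": "hello"})},
    {"event_id": "e2", "data": {"name": "search"},
     "checksum": checksum({"name": "search"})},
]
exported[1]["data"]["name"] = "tampered"  # simulate a modified payload
print(verify(exported))  # ['e2']
```

A checksum of this kind detects changes to a stored payload but, unlike a signed export, does not prove who wrote it.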
Performance expectations
ZizkaDB is built for normal agent loops — tool calls, messages, and decisions — not for blockchain-scale write throughput. Stack: Postgres (events), Qdrant (vectors), Redis (cache).
| Mode | Write path | Typical use |
|---|---|---|
| Logging only | Postgres INSERT + SHA-256 checksum | Fast enough for typical agent step loops |
| With semantic search | Above + OpenAI embedding + Qdrant upsert (per log) | Fine for normal volume; embedding adds network latency |
| why() / query | Postgres reads on explicit event_id or filters | No hidden session state on the server |
High-frequency fleets (thousands of parallel writes per second) are not a v1 target. Async embedding queues and published benchmarks are on the roadmap as usage grows.
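Until server-side async embedding exists, one way to keep logging latency out of an agent's step loop is a small client-side buffer. The sketch below is a generic asyncio pattern, not part of the ZizkaDB SDK: `BackgroundLogger` is a hypothetical name, and `fake_send` stands in for whatever coroutine actually ships the event (for example a wrapper around db.log).

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

Sender = Callable[[dict[str, Any]], Awaitable[None]]


class BackgroundLogger:
    """Buffers events in a bounded queue and sends them from one worker task."""

    def __init__(self, send: Sender, max_pending: int = 1000) -> None:
        self._send = send
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=max_pending)
        self._worker: Optional[asyncio.Task] = None

    def start(self) -> None:
        self._worker = asyncio.create_task(self._drain())

    def log(self, event: dict[str, Any]) -> bool:
        """Enqueue without awaiting; returns False if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except asyncio.QueueFull:
            return False

    async def _drain(self) -> None:
        while True:
            event = await self._queue.get()
            try:
                await self._send(event)
            except Exception:
                pass  # a real integration would record or retry the failure
            finally:
                self._queue.task_done()

    async def close(self) -> None:
        """Wait for pending events to be sent, then stop the worker."""
        await self._queue.join()
        if self._worker is not None:
            self._worker.cancel()


async def demo() -> list[dict[str, Any]]:
    sent: list[dict[str, Any]] = []

    async def fake_send(event: dict[str, Any]) -> None:
        await asyncio.sleep(0.01)  # stands in for network latency
        sent.append(event)

    logger = BackgroundLogger(fake_send)
    logger.start()
    for step in range(3):
        logger.log({"agent": "bot", "event": "tool_call", "data": {"step": step}})
    await logger.close()
    return sent


result = asyncio.run(demo())
print(len(result))  # 3
```

The trade-off is durability: events still in the buffer are lost if the process exits before close() completes, and a full buffer drops new events instead of blocking the agent.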
Security (early stage)
ZizkaDB v1 is aimed at developers and small teams and does not yet hold enterprise compliance certifications. Here is what we do today:
- In transit: TLS on managed cloud (db.zizka.ai via nginx).
- Tenant isolation: every API call is scoped by API key or JWT to a tenant_id; no cross-tenant reads.
- At rest: standard Postgres / disk encryption on the operator's infrastructure (AWS EBS on managed; your disk when self-hosted).
- BYOK embedding keys: encrypted in Postgres (Fernet) when you bring your own OpenAI key.
- GDPR erasure: forget() deletes matching events and vectors.
- Telemetry: one anonymous SDK/MCP startup ping; opt out with ZIZKADB_TELEMETRY=false.
Not claimed today: SOC 2, HIPAA, or formal DPAs. For enterprise security review, contact founder@zizka.ai.
Integration
Concurrency: Python and TypeScript SDKs are stateless HTTP clients — no thread-local storage or contextvars. Pass agent, session_id, parent_id, and event_id explicitly on each call. Safe in FastAPI, Celery, and parallel async workers.
| Surface | Install | Use case |
|---|---|---|
| Python SDK | pip install zizkadb-sdk | FastAPI, notebooks, batch |
| TypeScript SDK | npm install zizkadb-sdk | Node, Bun, Deno, edge |
| MCP server | uvx zizkadb-mcp | Claude Desktop, Cursor — no app refactor |
| REST | curl / any HTTP | Go, Rust, Java, mobile |
Cloud default host: https://db.zizka.ai. Self-host: pass host= to the SDK or set ZIZKADB_HOST for MCP.
from zizkadb import ZizkaDB

db = ZizkaDB("zizkadb_live_xxxx")  # managed
# db = ZizkaDB(host="http://localhost:8000")  # self-host

# The calls below are coroutines: run them inside an async function.
msg = await db.log(agent="bot", event="user_message", data={"text": "..."})
tool = await db.log(agent="bot", event="tool_call", data={...}, parent_id=msg.event_id)
chain = await db.why(tool.event_id)  # ancestor chain from tool back to msg

Cursor MCP (30 seconds)
Add to ~/.cursor/mcp.json or project .cursor/mcp.json, then reload MCP:
{
  "mcpServers": {
    "zizkadb": {
      "command": "uvx",
      "args": ["zizkadb-mcp"],
      "env": { "ZIZKADB_API_KEY": "zizkadb_live_xxxx" }
    }
  }
}
Self-host: set ZIZKADB_HOST to http://localhost:8000 instead (dev key auto-injected).
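For teams that generate editor configuration from a script, the managed and self-hosted variants can be produced by one small helper. This is a convenience sketch only: `mcp_config` is a hypothetical function, and it uses just the two environment variables named in this document.

```python
import json
from typing import Optional


def mcp_config(api_key: Optional[str] = None, host: Optional[str] = None) -> dict:
    """Build the mcpServers entry using the env names from this document."""
    env: dict[str, str] = {}
    if api_key:
        env["ZIZKADB_API_KEY"] = api_key  # managed cloud
    if host:
        env["ZIZKADB_HOST"] = host  # self-hosted API
    return {
        "mcpServers": {
            "zizkadb": {"command": "uvx", "args": ["zizkadb-mcp"], "env": env}
        }
    }


# Self-hosted variant: host only, no API key in the env block.
print(json.dumps(mcp_config(host="http://localhost:8000"), indent=2))
```

Write the printed JSON to ~/.cursor/mcp.json or the project's .cursor/mcp.json, then reload MCP as described above.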
MCP tools
log_event, search_memory, get_context, why, query_events, time_travel, memory_diff, forget — see setup guide for config.
Telemetry: one anonymous SDK startup ping (name, version, OS, cloud vs self-host). Opt out: ZIZKADB_TELEMETRY=false.
REST API
Base URL https://db.zizka.ai. Auth: Authorization: Bearer <token>. Interactive schema: /swagger.
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/events | Log event |
| GET | /v1/events | Query events |
| GET | /v1/events/{id}/why | Causal chain |
| GET | /v1/events/at | State at timestamp |
| POST | /v1/search | Semantic search |
| POST | /v1/memory/context | Prompt context block |
| GET | /v1/memory/diff/{session_id} | Session summary |
| DELETE | /v1/memory/forget | Metadata erasure |
| GET | /v1/agents | List agents |
| GET | /v1/agents/{id}/baseline | Drift / baseline |
| GET | /health | Health check |
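For callers without an SDK, a request to the log endpoint might be prepared as in the sketch below, using only the Python standard library. The base URL, path, and Bearer header come from this section; the JSON body shape and the Content-Type header are assumptions, so confirm them against the interactive schema at /swagger.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "https://db.zizka.ai"


def build_log_request(token: str, body: dict[str, Any]) -> urllib.request.Request:
    """Prepare POST /v1/events with the Bearer header described above."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/events",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed, see /swagger
        },
    )


req = build_log_request(
    "zizkadb_live_xxxx",
    {"agent": "bot", "event": "user_message", "data": {"text": "hello"}},
)
print(req.get_method(), req.full_url)  # POST https://db.zizka.ai/v1/events
# To send it: urllib.request.urlopen(req, timeout=10)
```

The request is built but not sent, so the example runs without network access or a valid key.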
Deployment
| Mode | Cost | Data residency |
|---|---|---|
| Self-hosted | Free (AGPL core) | Your VPC / machine |
| Managed Pro | €39/mo | Zizka cloud |
| Managed Team | €99/mo | Zizka cloud + higher limits |
Onboarding: signup → email OTP → Settings → API key → SDK or MCP env.
Comparison
Capability matrix (May 2026). Check each vendor's current documentation before relying on this comparison; ~ = partial support.
| Capability | LangSmith | Mem0 | Pinecone | ZizkaDB |
|---|---|---|---|---|
| Agent event logging | ✓ | ✗ | ✗ | ✓ |
| Causal lineage | ~ | ✗ | ✗ | ✓ |
| Time travel (logged state at T) | ✗ | ✗ | ✗ | ✓ |
| Semantic search on agent history | ✗ | ✓ | ✓ | ✓ |
| Any framework / model | ~ | ✓ | ✓ | ✓ |
| Behavioral baseline / drift | ✗ | ✗ | ✗ | ✓ |
| Cross-agent fleet queries | ✗ | ✗ | ✗ | ✓ |
| Per-event checksum | ✗ | ✗ | ✗ | ✓ |
| Self-host (free tier) | ✓ | ✓ | ✗ | ✓ |
Licensing
| Component | License |
|---|---|
| Core API + self-host stack | AGPL-3.0 |
| Python SDK | AGPL-3.0 |
| TypeScript SDK | AGPL-3.0 |
| MCP server | MIT |
AGPL obligations apply when you distribute a modified server or make a modified version available to users over a network. MCP is MIT for IDE integration. Commercial embedding requires legal review.
Plan limits
| Plan | Events / month | Retention |
|---|---|---|
| Self-hosted | Unlimited (your hardware) | Operator-defined |
| Pro (€39/mo) | 100M | 90 days |
| Team (€99/mo) | Up to 1B (plan cap) | 1 year |
Baseline/drift requires sufficient session_id coverage; early agents report warming_up or insufficient_data.
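The scoring method behind the baseline is not described here, so the sketch below uses a stand-in: total variation distance between the historical and recent mix of event types, with a minimum-session guard that mirrors the insufficient_data status. `MIN_SESSIONS` and both function names are hypothetical.

```python
from collections import Counter
from typing import Union

MIN_SESSIONS = 5  # hypothetical threshold, not a documented value


def distribution(events: list[str]) -> dict[str, float]:
    """Relative frequency of each event type label."""
    counts = Counter(events)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def drift_score(
    history: dict[str, list[str]],
    recent: dict[str, list[str]],
) -> Union[float, str]:
    """Total variation distance between historical and recent event mixes.

    Both arguments map session_id -> list of event type labels.
    Returns a value in [0, 1], or a status string when data is too thin.
    """
    if len(history) < MIN_SESSIONS or not recent:
        return "insufficient_data"
    past = distribution([e for session in history.values() for e in session])
    now = distribution([e for session in recent.values() for e in session])
    labels = set(past) | set(now)
    return 0.5 * sum(abs(past.get(k, 0.0) - now.get(k, 0.0)) for k in labels)


history = {f"s{i}": ["user_message", "tool_call"] for i in range(6)}
recent = {"r1": ["user_message", "error", "error", "error"]}
print(drift_score(history, recent))  # 0.75
```

A score of 0 means the recent sessions have the same event-type proportions as the history; values near 1 mean the two mixes barely overlap.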
FAQ
LangSmith focuses on LangChain-centric tracing and evals. ZizkaDB is a standalone operational store with causal trees, time travel over logged state, fleet baselines, and framework-agnostic ingestion.
Mem0 optimizes long-term memory retrieval for prompts. ZizkaDB adds causal lineage, session replay, drift baselines, and checksum-backed event storage.
ZizkaDB does not replace Pinecone. Pinecone indexes documents for RAG; ZizkaDB indexes agent decision history. Most teams use both.
Adopting ZizkaDB does not require refactoring your agent. You call log() at existing decision points, and an optional parent_id links causality.
Logging is async HTTP. Embedding runs on ingest; hot path is typically fire-and-forget await db.log(...).
You control data and metadata. forget() deletes by metadata filter. Self-host keeps data in your infrastructure.
Install zizkadb-sdk (not the unrelated agentdb package). Import: from zizkadb import ZizkaDB.
Contact
| Channel | Link |
|---|---|
| Product | https://db.zizka.ai |
| Docs | https://db.zizka.ai/docs |
| API | https://db.zizka.ai/swagger |
| GitHub | https://github.com/Zizka-ai/ZizkaDB |
| Email | founder@zizka.ai |