A full source-code review of zap by Claude Opus 4.8 — not a marketing claim. Every score is backed by code that was read and a build + test suite that was run.
Reviewer: Claude Opus 4.8 · Method: read of src/** (~29k LOC) + live cargo build --release and cargo test · June 2026
zap is a genuinely well-engineered coding agent — clean architecture, the right abstractions, a green build, and real test discipline. It is now in the same conversation as the major agents on harness quality, and it ships three capabilities the majors do not.
These are the things the review found in zap's source that Claude Code, Gemini CLI, and OpenCode do not provide. This is where zap is ahead — not on a feature checklist, but on architecture.
A tree-sitter + SQLite index of the whole repo — symbols, function call-sites, imports, and a type hierarchy, with PageRank-ranked files. The model answers "who calls X" or "what breaks if I rename Y" in one query, not ten greps.
No other major agent maintains a queryable code graph. Claude Code and Gemini CLI re-grep every time.
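To make the "one query, not ten greps" idea concrete, here is a minimal sketch of a call-site index held in memory. It is not zap's implementation: the review describes a tree-sitter + SQLite index, while this example assumes a plain `HashMap`, and the names `CallGraph`, `add_call`, `callers`, and `impacted_by` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// In-memory stand-in for a call-site index.
/// Hypothetical type: the reviewed index is described as tree-sitter + SQLite.
#[derive(Default)]
pub struct CallGraph {
    /// Maps a callee to the set of symbols that call it.
    callers_of: HashMap<String, BTreeSet<String>>,
}

impl CallGraph {
    /// Record that `caller` contains a call-site of `callee`.
    pub fn add_call(&mut self, caller: &str, callee: &str) {
        self.callers_of
            .entry(callee.to_string())
            .or_default()
            .insert(caller.to_string());
    }

    /// "Who calls X": direct callers, in sorted order.
    pub fn callers(&self, symbol: &str) -> Vec<String> {
        self.callers_of
            .get(symbol)
            .map(|set| set.iter().cloned().collect::<Vec<String>>())
            .unwrap_or_default()
    }

    /// A rough "what is affected if I change Y": every symbol that reaches
    /// `symbol` through one or more calls, in sorted order.
    pub fn impacted_by(&self, symbol: &str) -> Vec<String> {
        let mut seen: BTreeSet<String> = BTreeSet::new();
        let mut stack = vec![symbol.to_string()];
        while let Some(current) = stack.pop() {
            if let Some(callers) = self.callers_of.get(&current) {
                for caller in callers {
                    // Only visit each caller once, so cycles terminate.
                    if seen.insert(caller.clone()) {
                        stack.push(caller.clone());
                    }
                }
            }
        }
        seen.into_iter().collect()
    }
}
```

Because the reverse edges are stored up front, both questions are a lookup or a short traversal rather than a repeated text search over the repository.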
Per-turn, zap injects only the skills your message triggers, ranked and capped to a token budget. A Rust task gets Rust conventions; a greeting costs ~31 tokens instead of a 2,000-token wall sent on every turn.
Others send the same static system prompt regardless of task. This is zap's signature design.
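The selection step described above can be sketched as a rank-then-fill loop. This is an illustration under stated assumptions, not zap's code: the `Skill` struct, the substring-based trigger matching, and the greedy fill are all choices made for this example.

```rust
/// A skill snippet that could be injected into the prompt.
/// All fields are illustrative; the review does not show zap's skill format.
pub struct Skill {
    pub name: &'static str,
    /// Lowercase trigger words. Matching is a naive substring test,
    /// so "rust" would also match "trust".
    pub triggers: &'static [&'static str],
    pub token_cost: usize,
}

/// Keep the skills whose triggers appear in `message`, rank them by number
/// of matching triggers (cheaper skill first on ties), then add them in
/// rank order while they still fit within `budget` tokens.
pub fn select_skills<'a>(skills: &'a [Skill], message: &str, budget: usize) -> Vec<&'a Skill> {
    let lower = message.to_lowercase();
    let mut scored: Vec<(usize, &Skill)> = skills
        .iter()
        .map(|skill| {
            let hits = skill.triggers.iter().filter(|t| lower.contains(**t)).count();
            (hits, skill)
        })
        .filter(|(hits, _)| *hits > 0)
        .collect();
    // Highest score first; lower token cost wins ties.
    scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.token_cost.cmp(&b.1.token_cost)));

    let mut used = 0;
    let mut chosen = Vec::new();
    for (_, skill) in scored {
        if used + skill.token_cost <= budget {
            used += skill.token_cost;
            chosen.push(skill);
        }
    }
    chosen
}
```

A message that triggers nothing, such as a greeting, yields an empty selection, which is the property that keeps small turns cheap.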
MCP servers stay pending at startup. Instead of dumping every server's tool schemas into every request, zap injects one lightweight connector and loads a server's tools only when the model decides it needs them.
A genuinely different, more economical MCP integration than the eager-load model others use.
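One way to picture the pending-then-load behaviour is a small registry that advertises a single connector tool until a server is actually requested. The sketch below is an assumption-laden illustration: `McpRegistry`, `ServerState`, the connector name `mcp_connect`, and the `fetch` callback (standing in for a real tool-listing round trip) are all hypothetical.

```rust
use std::collections::HashMap;

/// Load state of one MCP server. Names are illustrative, not zap's types.
pub enum ServerState {
    /// Registered but not yet contacted; contributes no tool schemas.
    Pending,
    /// Tool names fetched after the model asked for this server.
    Loaded(Vec<String>),
}

pub struct McpRegistry {
    servers: HashMap<String, ServerState>,
}

impl McpRegistry {
    /// Register every configured server as pending; nothing is loaded yet.
    pub fn new(names: &[&str]) -> Self {
        let servers: HashMap<String, ServerState> = names
            .iter()
            .map(|name| (name.to_string(), ServerState::Pending))
            .collect();
        McpRegistry { servers }
    }

    /// Tools to advertise in the next request: the connector first, then
    /// the (sorted) tools of servers that have already been loaded.
    pub fn advertised_tools(&self) -> Vec<String> {
        let mut loaded: Vec<String> = Vec::new();
        for state in self.servers.values() {
            if let ServerState::Loaded(server_tools) = state {
                loaded.extend(server_tools.iter().cloned());
            }
        }
        loaded.sort();
        let mut tools = vec!["mcp_connect".to_string()];
        tools.extend(loaded);
        tools
    }

    /// Called when the model invokes the connector for `server`.
    /// Returns false for an unknown server; loads at most once.
    pub fn connect<F>(&mut self, server: &str, fetch: F) -> bool
    where
        F: FnOnce(&str) -> Vec<String>,
    {
        match self.servers.get_mut(server) {
            None => false,
            Some(state) => {
                if matches!(*state, ServerState::Pending) {
                    *state = ServerState::Loaded(fetch(server));
                }
                true
            }
        }
    }
}
```

Until `connect` is called, every request carries one tool name regardless of how many servers are configured.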
Content is scanned for ~25 credential patterns before it ever leaves for a cloud LLM, and shell execution has real isolation modes (off / workdir / container) with a documented threat model.
An enterprise-grade data-egress + execution posture built into the agent, not bolted on.
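A pre-flight scan of this kind can be approximated with a few prefix and marker checks. The sketch below is not zap's scanner: the review mentions roughly 25 patterns, whereas this example picks three well-known credential shapes (an AWS access key ID, a GitHub personal access token prefix, and a PEM private-key header) and uses only the standard library. `SecretKind` and `scan_for_secrets` are hypothetical names.

```rust
/// Credential kinds recognised by this sketch (chosen for illustration).
#[derive(Debug, PartialEq)]
pub enum SecretKind {
    AwsAccessKeyId,
    GithubToken,
    PrivateKeyBlock,
}

/// True if `prefix` occurs in `text` followed by at least `body_len`
/// consecutive characters accepted by `allowed`.
fn has_prefixed_token(text: &str, prefix: &str, body_len: usize, allowed: fn(char) -> bool) -> bool {
    text.match_indices(prefix).any(|(start, _)| {
        let body = &text[start + prefix.len()..];
        body.chars().take_while(|c| allowed(*c)).count() >= body_len
    })
}

/// Return every kind of secret found in `text`. A caller would refuse to
/// send (or would redact) the content when the result is non-empty.
pub fn scan_for_secrets(text: &str) -> Vec<SecretKind> {
    let mut found = Vec::new();
    // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    if has_prefixed_token(text, "AKIA", 16, |c| c.is_ascii_uppercase() || c.is_ascii_digit()) {
        found.push(SecretKind::AwsAccessKeyId);
    }
    // GitHub personal access tokens: "ghp_" followed by 36 alphanumerics.
    if has_prefixed_token(text, "ghp_", 36, |c| c.is_ascii_alphanumeric()) {
        found.push(SecretKind::GithubToken);
    }
    // PEM private keys, e.g. "-----BEGIN RSA PRIVATE KEY-----".
    if text.contains("-----BEGIN") && text.contains("PRIVATE KEY-----") {
        found.push(SecretKind::PrivateKeyBlock);
    }
    found
}
```

The checks are deliberately conservative: a longer run after the prefix still counts as a match, since a false positive here only blocks a send, while a false negative leaks a credential.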
Anthropic, any OpenAI-compatible endpoint, Gemini, DeepSeek, local models (LM Studio / Ollama), corporate gateways, and gcloud ADC — switchable mid-session. Runs fully offline against a local model.
Unlike Claude Code (Anthropic-centric) or Gemini CLI (Gemini-only), zap fits a regulated, gateway-bound enterprise.
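Mid-session switching usually comes down to keeping the conversation state separate from the backend. The following sketch shows that separation with a trait object; it is an assumption for illustration only. `Provider`, `Echo`, and `Session` are hypothetical names, and `Echo` returns a stub reply instead of calling any real endpoint.

```rust
/// Minimal provider abstraction (hypothetical; not zap's actual trait).
pub trait Provider {
    fn name(&self) -> &str;
    /// A real provider would call a cloud API or a local model here.
    fn complete(&self, prompt: &str) -> String;
}

/// Stub backend that echoes the prompt, tagged with its label.
pub struct Echo {
    pub label: String,
}

impl Provider for Echo {
    fn name(&self) -> &str {
        &self.label
    }

    fn complete(&self, prompt: &str) -> String {
        format!("[{}] {}", self.label, prompt)
    }
}

/// The session owns the history, so swapping the backend keeps context.
pub struct Session {
    provider: Box<dyn Provider>,
    pub history: Vec<String>,
}

impl Session {
    pub fn new(provider: Box<dyn Provider>) -> Self {
        Session { provider, history: Vec::new() }
    }

    /// Replace the backend without touching the accumulated history.
    pub fn switch(&mut self, provider: Box<dyn Provider>) {
        self.provider = provider;
    }

    /// Send a prompt to the current backend and record both sides.
    pub fn send(&mut self, prompt: &str) -> String {
        let reply = self.provider.complete(prompt);
        self.history.push(prompt.to_string());
        self.history.push(reply.clone());
        reply
    }
}
```

After a `switch`, the next `send` goes to the new backend while `history` still holds the earlier exchange.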
One 26 MB static binary. No Node.js, no Python venv, no Docker to install or audit. Cold-starts in milliseconds and dramatically shrinks the supply-chain surface compared to npm-based agents.
Smaller attack surface and zero runtime dependencies — a real security and IT-approval advantage.
Harness capabilities side by side. zap's edge is concentrated in code understanding, context efficiency, and enterprise/security fit.
| Capability | Claude Code | Gemini CLI | OpenCode | zap |
|---|---|---|---|---|
| Persistent queryable code graph | ❌ | ❌ | ❌ | ✅ tree-sitter + SQLite |
| Per-task skill injection | ❌ static prompt | ❌ static prompt | ❌ static prompt | ✅ token-budgeted |
| Token-smart / lazy MCP loading | eager | eager | eager | ✅ on-demand |
| Any provider (incl. local / air-gapped) | Anthropic-centric | Gemini only | ✅ | ✅ |
| Secret pre-flight scan before cloud send | ❌ | ❌ | ❌ | ✅ |
| Shell sandbox modes | permission-based | partial | partial | ✅ off/workdir/container |
| Single binary, no runtime | needs Node | needs Node | ✅ (Go) | ✅ (Rust) |
| Full prompt caching | ✅ | N/A | provider-dependent | ✅ |
Comparison reflects the review's findings as of June 2026. Claude Code remains the most battle-tested harness overall; zap's advantage is architectural — capabilities the others do not implement.
How the review rated each dimension, and how each score changed after the gap-closing work was completed and re-verified.
| Dimension | First pass | Current |
|---|---|---|
| Architecture & abstractions | 9 | 9 |
| Code understanding (graph) | 9 | 9 |
| Context efficiency / caching | 6 | 9 |
| Testing & verifiability | 4 | 8 |
| Reliability (crash-safety) | 5 | 7 |
| Safety / sandboxing | 5 | 8 |
| Correctness (edit tools) | 7 | 8 |
| Overall | 7.0 | 8.5 |
cargo build --release completed cleanly with zero warnings, and cargo test reported 168 passed, 0 failed, including new deterministic tests on the core agent loop that did not exist in the first pass.
Beyond the engineering review, zap is undergoing an independent security assessment.
A dedicated security assessment of zap's data-egress controls, shell sandboxing, supply-chain surface, and credential handling is being conducted by Mythos. The report will be linked here when complete.
Read zap's security model & threat model →
Everything above is drawn from these source-verified documents.
Open source, MIT licensed. Zero telemetry. Single binary.