System Prompt Efficiency

Most agents send the same monolithic system prompt on every turn regardless of what you typed. Zap sends a minimal base and injects domain content only when a relevant keyword triggers a skill.

Source: Zap source (src/context_manager.rs, src/skill_manager.rs). Claude Code: Piebald-AI/claude-code-system-prompts v2.1.162 (June 2026). Gemini CLI: open-source gemini-cli repo. OpenCode: bgauryy/open-docs. Cline: dontriskit/awesome-ai-system-prompts.

Token Counts per Turn

Tokens spent on the system prompt are paid on every turn. They define how the model behaves but carry no information about your code, so a leaner prompt leaves more of the budget for real context.

| Agent | Regular turn | Casual turn ("ok", "thanks") | How it varies |
|---|---|---|---|
| Zap | ~1,750–8,000 | ~15 | Skills injected only when keyword fires; casual turns skip everything |
| Claude Code | ~8,000–20,000+ | ~8,000+ | Full prompt every turn; git instructions always present even on non-git work |
| Gemini CLI | ~3,000–6,000 | ~3,000 | Always-on: date, git context, compression instructions, edit fixer |
| OpenCode | ~2,000–5,000 | ~2,000 | Provider-specific static file + env block always present |
| Cline | ~8,000–15,000 | ~8,000 | Complete tool descriptions inline; no conditional injection |

The casual-turn gap is the most telling. When you type "ok" or "looks good", zap spends about 15 tokens while Claude Code spends 8,000 or more. That is over 500× as many (8,000 / 15 ≈ 533) on a turn where no code is being written and the instructions go unused.
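To make the arithmetic concrete, here is a small sketch that uses the lower-bound figures from the table. The session mix (30 regular turns, 10 casual ones) is an illustrative assumption, not a measurement.

```python
# Per-turn system-prompt cost, using the lower-bound figures from the table above.
ZAP_CASUAL = 15
ZAP_REGULAR = 1_750       # base only; assumes no skill fires
CLAUDE_CODE_ANY = 8_000   # same prompt on casual and regular turns


def session_prompt_tokens(regular_turns: int, casual_turns: int,
                          regular_cost: int, casual_cost: int) -> int:
    """Total system-prompt tokens paid over one session."""
    return regular_turns * regular_cost + casual_turns * casual_cost


# Ratio on a single casual turn.
casual_ratio = CLAUDE_CODE_ANY / ZAP_CASUAL

# Hypothetical session: 30 regular turns and 10 casual ones.
zap_total = session_prompt_tokens(30, 10, ZAP_REGULAR, ZAP_CASUAL)
cc_total = session_prompt_tokens(30, 10, CLAUDE_CODE_ANY, CLAUDE_CODE_ANY)

print(f"casual-turn ratio: {casual_ratio:.0f}x")
print(f"zap: {zap_total:,} tokens, Claude Code: {cc_total:,} tokens")
```

Because the zap figure here is the bare base, a session in which skills fire on most turns would land higher, up to the ~8,000 upper bound per regular turn.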

How Skill Injection Works

Instead of a 10,000-token monolith, zap assembles the system prompt dynamically. The base is ~1,750 tokens. Skills add context only when the message triggers them.

Always-on base (~1,750 tokens)

Identity + Environment

Role, platform, shell, CWD. ~50 tokens.

Code Navigation Strategy

Tool order: code_map → find_definition → search_code → read_file. ~300 tokens.

Security Rules

6 non-negotiables, for example: no force-push to main, no --no-verify, no secret files. ~100 tokens.

Trigger-matched skills (injected only when relevant)

| Skill | Fires when message contains | Tokens added |
|---|---|---|
| git | commit, branch, merge, rebase, push, PR, conflict | ~400 |
| code-review | review, PR, pull request, diff, feedback | ~300 |
| debugging | bug, error, crash, exception, trace, debug, fail | ~350 |
| deploy | deploy, CI, CD, pipeline, release, docker | ~300 |
| security | vulnerability, injection, XSS, OWASP, CVE, auth | ~400 |
| Language skills | Detected from file extensions in CWD (Rust, TS, Go, Python…) | ~200–600 |
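The trigger table above can be read as a small lookup. The sketch below estimates the prompt size for a message under a few assumptions: the token costs and keywords are copied from the table, but the matching rule (case-insensitive whole words), the set of casual messages, and all function names are hypothetical. Zap's actual logic lives in its Rust source and may differ. Language skills are left out because they depend on the working directory, not the message.

```python
import re

# Token costs and triggers copied from the table above; everything else is illustrative.
BASE_TOKENS = 1_750
CASUAL_TOKENS = 15

SKILLS = {
    "git": ({"commit", "branch", "merge", "rebase", "push", "pr", "conflict"}, 400),
    "code-review": ({"review", "pr", "pull request", "diff", "feedback"}, 300),
    "debugging": ({"bug", "error", "crash", "exception", "trace", "debug", "fail"}, 350),
    "deploy": ({"deploy", "ci", "cd", "pipeline", "release", "docker"}, 300),
    "security": ({"vulnerability", "injection", "xss", "owasp", "cve", "auth"}, 400),
}

# Assumed definition of a casual turn; the real classifier is not documented here.
CASUAL_MESSAGES = {"ok", "yes", "thanks", "looks good"}


def matched_skills(message: str) -> list[str]:
    """Return the names of skills whose triggers appear in the message."""
    text = message.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    hits = []
    for name, (triggers, _cost) in SKILLS.items():
        # Single words match whole tokens; multi-word triggers match as phrases.
        if any((t in text) if " " in t else (t in words) for t in triggers):
            hits.append(name)
    return hits


def estimate_prompt_tokens(message: str) -> int:
    """Estimate system-prompt tokens for one turn."""
    if message.strip().lower() in CASUAL_MESSAGES:
        return CASUAL_TOKENS
    return BASE_TOKENS + sum(SKILLS[name][1] for name in matched_skills(message))


for msg in ("ok", "explain this CSS rule", "open a PR for this bug"):
    print(f"{msg!r}: ~{estimate_prompt_tokens(msg)} tokens")
```

Whole-word matching keeps short triggers such as "CI" from firing inside unrelated words, at the cost of missing inflected forms like "failing". A substring matcher would make the opposite trade.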

Compare this to Claude Code, which always sends ~800 tokens of git workflow instructions — even when you're asking it to explain a CSS rule. The instructions are permanently embedded in the Bash tool description, not gated by keywords.

Skill files are plain markdown and fully customizable. Drop a .md file in ~/.zap/skills/ with a trigger: array and it fires when those keywords appear. See Skill-First Context ↗
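As an illustration of what such a file might contain, the sketch below parses a hypothetical skill file. The text above only says that a skill is a .md file with a trigger: array, so the front-matter layout, the Terraform example, and the parse_skill helper are all assumptions made for this example.

```python
import re

# Hypothetical skill file; the '---' front-matter layout is an assumption.
EXAMPLE_SKILL = """\
---
trigger: [terraform, tfstate, hcl]
---
# Terraform conventions
Run `terraform plan` before every `terraform apply`.
"""


def parse_skill(text: str) -> tuple[list[str], str]:
    """Split a skill file into (triggers, body).

    Assumes a front-matter block delimited by '---' lines that contains
    an inline list such as `trigger: [a, b]`. Files without that block
    are returned with no triggers and the full text as the body.
    """
    match = re.match(r"---\n(.*?)\n---\n(.*)", text, re.DOTALL)
    if not match:
        return [], text
    header, body = match.groups()
    found = re.search(r"^trigger:\s*\[(.*?)\]\s*$", header, re.MULTILINE)
    if not found:
        return [], body
    triggers = [t.strip().lower() for t in found.group(1).split(",") if t.strip()]
    return triggers, body


triggers, body = parse_skill(EXAMPLE_SKILL)
print(triggers)
print(body.splitlines()[0])
```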

Memory & Session Persistence

How an agent remembers things across sessions defines how useful it is on long-running projects. Here's how zap and Claude Code store, inject, and expire memory.

Zap's storage systems

| Store | Format | Scope | Injected when |
|---|---|---|---|
| ~/.zap/agent.db (memory table) | SQLite key-value | Global (all projects) | Every non-casual turn |
| .zap/context.md | Markdown | Project-local | Session start (TUI always; CLI asks) |
| .zap/understanding.md | Markdown (LLM-generated by /init) | Project-local | Every session start, capped at 4,000 chars |
| .zap/session_log.md | Markdown, append-only | Project-local | Lazy hint only; LLM reads on demand |
| ~/.zap/agent.db (sessions table) | SQLite | Global | On demand via /sessions picker |

Accessing memory without zap

    # Read all memory entries
    sqlite3 ~/.zap/agent.db "SELECT key, value, updated_at FROM memory ORDER BY key;"

    # See recent sessions
    sqlite3 ~/.zap/agent.db "SELECT id, goal, model, created_at FROM sessions ORDER BY id DESC LIMIT 20;"
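The same two queries can be run from a script. The following is a minimal sketch that assumes only the table and column names shown in the commands above. The helper names are made up for this example, and the database is opened read-only so the script cannot alter zap's state.

```python
import sqlite3
from pathlib import Path

DB_PATH = Path.home() / ".zap" / "agent.db"


def read_memory(conn: sqlite3.Connection) -> dict[str, str]:
    """Return every memory entry as {key: value}, ordered by key."""
    rows = conn.execute("SELECT key, value FROM memory ORDER BY key")
    return dict(rows)


def recent_sessions(conn: sqlite3.Connection, limit: int = 20) -> list[tuple]:
    """Return the newest sessions first, as (id, goal, model, created_at)."""
    query = ("SELECT id, goal, model, created_at "
             "FROM sessions ORDER BY id DESC LIMIT ?")
    return conn.execute(query, (limit,)).fetchall()


if __name__ == "__main__" and DB_PATH.exists():
    # mode=ro opens the file read-only via SQLite's URI syntax.
    conn = sqlite3.connect(DB_PATH.as_uri() + "?mode=ro", uri=True)
    for key, value in read_memory(conn).items():
        print(f"{key} = {value}")
    for session in recent_sessions(conn, limit=5):
        print(session)
    conn.close()
```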

Memory Feature Table

Zap vs Claude Code, feature by feature.

| Feature | Zap | Claude Code |
|---|---|---|
| Session persistence | SQLite sessions + session_messages | .jsonl file per conversation |
| Session resume | /sessions interactive picker | Conversation history in UI |
| Conversation branching | ✅ Named forks via branches table | ❌ Not available |
| Memory storage | SQLite key-value (memory table) | Typed .md files per project |
| Memory auto-save by LLM | memory_set tool (proactive saving) | ✅ LLM saves proactively |
| Memory in system prompt | ✅ All entries injected every non-casual turn | ✅ MEMORY.md index always in context |
| Memory types | Flat key-value | ✅ user / feedback / project / reference |
| New memory visible this session | ✅ Patched into system prompt after tool round | ✅ LLM knows what it wrote immediately |
| Session handoff file | .zap/context.md (goal + files + next steps) | ❌ None |
| Session log with file tracking | .zap/session_log.md | ❌ None |
| LLM-generated project knowledge | .zap/understanding.md (from /init) | ❌ None |
| Code symbol index | .zap/code.db (SQLite, instant lookup) | ❌ Greps every query |
| File undo | ✅ In-memory snapshot stack | ❌ None |
| Sliding window summarization | ✅ LLM summarizes dropped turns automatically | Manual /compact only |
| Context viewer | ✅ TUI overlay with token usage per turn | ❌ None |
| Audit log | ~/.zap/audit.jsonl (every tool call) | ❌ No local audit |
| Casual turn optimization | ✅ ~15 tokens for "ok"/"yes"/"thanks" | ❌ Full prompt every turn |

Where Zap Leads

These aren't gaps to close — they're features Claude Code doesn't have at all.

Conversation branching

Try an alternative approach without losing current state. Named forks stored in agent.db.

Code symbol index

find_definition returns in milliseconds. Other agents grep every time. See docs →

understanding.md

LLM-generated architectural reference, always in context from /init. No re-exploration every session.

Session log

Which files were touched in which session. .zap/session_log.md — human-readable, append-only.

Context viewer

See every message in context with its token count. Drop stale turns without restarting. See docs →

Audit log

Every tool call recorded to ~/.zap/audit.jsonl. Useful for debugging unexpected changes and cost tracking.

Live Case Study: Code Review

A requirements document proposing multi-phase credential auto-detection was reviewed by two agents in the same session — Claude Code (no index, file access via tools) and Zap (DeepSeek V4 Pro, code index active). Same document, same task, different results.

4 issues Claude Code caught · 4 issues zap caught · 1 architectural miss (Claude Code only)

Setup

| | Claude Code | Zap (DeepSeek V4 Pro) |
|---|---|---|
| Code index | ❌ None; relies on grep/Read | .zap/code.db built with /init |
| File access during review | Yes (Read, Bash, grep) | Yes + SQL queries against index |
| Did it query source files? | No; reviewed doc in isolation | Yes; cross-referenced proposals against actual code |

Side-by-Side Outputs

What each agent caught

| Finding | Claude Code | Zap | How |
|---|---|---|---|
| Provider list duplication (same 13-item list in two files) | ❌ Missed | ✅ Exact file + line numbers | SQL query against code.db found both copies |
| Ollama probe URL wrong (localhost:11434 → should be /api/tags) | Partially (flagged timeout, not URL) | ✅ Specific fix | Model knowledge about Ollama API |
| AuthSource enum design (concrete EnvVar / ShellCommand / HttpProbe shape) | Vague: "needs richer return type" | ✅ Actionable enum design | Model capability, more specific reasoning |
| Gemini auth wrong (gcloud ≠ AI Studio API key) | ✅ Caught | ❌ Missed | Model knowledge about Google's two API surfaces |
| Copilot needs extra HTTP header (editor-version) | ✅ Caught | ❌ Missed | Model knowledge about Copilot API |
| Key persistence unspecified (where does wizard save the key?) | ✅ Caught | ❌ Missed | Document analysis |

Zap's key finding (verbatim)

    -- Blocking Issue #1: Provider list duplication is not addressed

    The same 13-provider list exists in two places:
      src/session/commands/provider.rs:22-36 (CLI)
      src/tui/turn_handler.rs:76-91 (TUI)

    Phase 1 should start by extracting a single PROVIDERS: &[ProviderDef] constant. Every phase modifies "both files" — that means duplicating every change (Copilot addition, Claude Code activation, env_key additions). This is the single biggest maintainability risk in the current code and the proposal adds more fields without fixing the root cause.

    → blocking: Add a Phase 0: extract provider definitions to a shared constant.

When Claude Code was asked what it missed

    # Claude Code's own assessment (verbatim)
    "Yes — three things:

    1. Provider list duplication (Phase 0) — this was my cleanest miss. I never flagged that the same 13-provider list lives in both provider.rs and turn_handler.rs. Zap caught that every proposed change would need to be made twice. I didn't mention this at all.

    2. Ollama probe URL — I flagged 'timeout is unspecified' but didn't catch that localhost:11434 is the wrong target. Zap specifically identified /api/tags as the correct lightweight endpoint. I was half-right.

    3. AuthSource enum specificity — I said Option<String> was wrong but was vague about the shape. Zap proposed the concrete enum and drew the right conclusion that this should move to Phase 2 before Copilot is added.

    The duplication one is the most meaningful miss. It's an architectural issue that affects the whole implementation plan, not just a detail."

[Screenshot: asking Claude Code if it missed anything. Asked "zap picked because of good system prompt or code indexing approach or llm model capability?", Claude Code explains that the provider list duplication was a code indexing catch, while the Ollama URL and AuthSource enum were model capability, and notes the system prompt only set the framing.]

Claude Code's own analysis: the duplication catch came from code indexing (zap cited exact line numbers from .zap/code.db), not model capability or system prompt. Claude Code had file access during the review but read the doc in isolation and never consulted the codebase.
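The two structural proposals in zap's review, a single shared provider list and a concrete AuthSource shape, can be sketched as follows. The project itself is written in Rust, so this Python version only illustrates the shape. The field names, the two sample providers, the EXAMPLE_API_KEY variable, and the cli_menu, tui_items, and describe helpers are all hypothetical. Only the three variant names and the Ollama /api/tags URL come from the review.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EnvVar:
    name: str


@dataclass(frozen=True)
class ShellCommand:
    argv: tuple[str, ...]


@dataclass(frozen=True)
class HttpProbe:
    url: str
    timeout_s: float = 1.0


# The three variants named in the review; a tagged union in Rust terms.
AuthSource = Union[EnvVar, ShellCommand, HttpProbe]


@dataclass(frozen=True)
class ProviderDef:
    id: str
    label: str
    auth: AuthSource


# Single source of truth: both front ends read this, so each change is made once.
PROVIDERS: tuple[ProviderDef, ...] = (
    ProviderDef("ollama", "Ollama (local)", HttpProbe("http://localhost:11434/api/tags")),
    ProviderDef("example", "Example provider", EnvVar("EXAMPLE_API_KEY")),
)


def cli_menu() -> list[str]:
    """Numbered lines for a CLI picker."""
    return [f"{i}. {p.label}" for i, p in enumerate(PROVIDERS, start=1)]


def tui_items() -> list[tuple[str, str]]:
    """(id, label) pairs for a TUI list widget."""
    return [(p.id, p.label) for p in PROVIDERS]


def describe(auth: AuthSource) -> str:
    """Say how a credential source would be checked."""
    if isinstance(auth, EnvVar):
        return f"read ${auth.name}"
    if isinstance(auth, ShellCommand):
        return "run " + " ".join(auth.argv)
    return f"GET {auth.url} (timeout {auth.timeout_s}s)"


for provider in PROVIDERS:
    print(provider.label, "->", describe(provider.auth))
```

With this layout, adding a provider or a new field touches one definition, which is the point of the proposed Phase 0.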

What Explains the Difference

| Finding | Who caught it | Why |
|---|---|---|
| Provider list duplication | Zap | Queried code index and found exact file+line for both copies. Claude Code had file access but no pre-built index and didn't grep. |
| Ollama probe URL | Zap | Factual model knowledge about the Ollama API. Claude Code flagged timeout but not the wrong URL. |
| AuthSource enum (concrete shape) | Zap | More specific reasoning on the same concern Claude Code raised vaguely. |
| Gemini auth wrong | Claude Code | Factual model knowledge about Google's two API surfaces. Zap didn't catch this. |
| Copilot extra header | Claude Code | Factual model knowledge about the Copilot API. |
| Key persistence gap | Claude Code | Document analysis; both could have caught it, but zap didn't. |

The provider list duplication, the most architecturally significant finding, required knowing the actual state of the code. Claude Code had file access but no pre-built index, and it didn't look. Zap had the index, was instructed to use it, and cited exact line numbers. That's what running /init buys.

Note on model vs tooling. Zap in this test used DeepSeek V4 Pro, not Claude. The duplication catch came from the code index, which is a tool rather than a model capability: the model just had to read two SQL rows to see both file paths. The Gemini auth and Copilot header catches came from model knowledge; Claude Code's model (Claude Sonnet) happened to know those API details better.
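To show why two SQL rows are enough, here is a sketch of a duplicate lookup against a symbol index. The symbols table, its columns, the provider_list and run_turn names, and the assumption that both copies are indexed under the same name are all hypothetical. The real .zap/code.db schema is not described here. The two file paths and line ranges are the ones cited in zap's finding.

```python
import sqlite3


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory stand-in for a code index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE symbols (name TEXT, file TEXT, start_line INTEGER, end_line INTEGER)"
    )
    conn.executemany(
        "INSERT INTO symbols VALUES (?, ?, ?, ?)",
        [
            # Locations taken from the finding; the symbol name is made up.
            ("provider_list", "src/session/commands/provider.rs", 22, 36),
            ("provider_list", "src/tui/turn_handler.rs", 76, 91),
            # Unrelated, invented row to show that unique symbols are ignored.
            ("run_turn", "src/tui/turn_handler.rs", 120, 180),
        ],
    )
    return conn


def duplicated_symbols(conn: sqlite3.Connection) -> list[tuple]:
    """Return every location of any symbol name that is defined more than once."""
    query = """
        SELECT name, file, start_line, end_line
        FROM symbols
        WHERE name IN (SELECT name FROM symbols GROUP BY name HAVING COUNT(*) > 1)
        ORDER BY name, file
    """
    return conn.execute(query).fetchall()


for name, file, start, end in duplicated_symbols(build_demo_index()):
    print(f"{name}: {file}:{start}-{end}")
```

A grep-based agent can reach the same answer, but only if it decides to search. With an index, the duplicate falls out of one query.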

Date: 2026-05-31. Full raw outputs available in docs/live-zap-comparison/ ↗