Zap — Independent Security Review (Mythos)

The verdict

zap is built on a sound foundation — a single memory-safe Rust binary, a real permission model that gates write and execute tools, API keys that never reach the logs, and a refreshingly honest threat model. The first review found the unlocked doors all at the edges of the trust model — where input arrives from somewhere other than your own terminal. Those doors are now shut.

Three passes: the first review scored 6.5 and named six findings; the fix pass in v0.15.11 closed all six (→ 8.0); the hardening pass in v0.15.12 then went further. Egress is now scanned at every entry point (not just tool results), /remote is back behind a per-session token, file writes are jailed to the workspace, and a cargo-audit CI gate watches the dependency tree. The four security-critical axes (filesystem, data egress, remote/network, supply chain) are all now at 9/10. Re-verified against a clean build and 182 passing tests. Posture: 9.0/10.

Fit for corporate use

zap is a per-developer, local tool: one self-contained binary, run by each engineer on their own machine against code they already have. Judged against that deployment model — the way a company would actually adopt it — the enterprise-relevant controls are the strong ones.

🛰️

Data residency

Tool output is scanned for credentials before any cloud send, and zap runs fully against local / air-gapped models with zero telemetry. Corporate proxy, custom CA, and offline operation are first-class — code and prompts never have to leave the network.
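The findings table notes that the scanner now includes an entropy check. The review does not describe how zap implements it, so the following is only a minimal sketch of the general technique: compute the Shannon entropy of each whitespace-separated token and flag long tokens that look random. The function names, the 20-character minimum, and the 4.0-bit threshold are all illustrative assumptions, not zap's actual values.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s`, in bits per character.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical heuristic: long, high-entropy tokens may be secrets.
/// Both thresholds are assumptions chosen for illustration.
fn looks_like_secret(token: &str) -> bool {
    token.len() >= 20 && shannon_entropy(token) > 4.0
}

/// Returns the whitespace-separated tokens in `text` that trip the heuristic.
fn suspicious_tokens(text: &str) -> Vec<&str> {
    text.split_whitespace()
        .filter(|t| looks_like_secret(t))
        .collect()
}
```

An entropy check of this kind complements pattern matching on known key formats: it can catch credentials with no recognizable prefix, at the cost of occasional false positives on hashes and encoded data.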

📦

Minimal supply chain

One memory-safe Rust binary — no Node or Python runtime to audit. Project-local hooks and MCP servers now require explicit directory trust, so a cloned repo can't run code the moment you open it.

🔐

Credential discipline

API keys are never logged; ~/.agent.toml and the session database are locked to 0600; proxy credentials are redacted in all output.
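For readers who want to see what "locked to 0600" means in practice, here is a minimal Unix-only sketch using the Rust standard library. The helper names `write_private` and `is_private` are hypothetical and are not taken from zap's source.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Writes `contents` to `path` with mode 0600 (owner read/write only).
/// Unix-only sketch; the function name is an assumption.
fn write_private(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600) // only takes effect when the file is newly created
        .open(path)?;
    // Tighten a pre-existing file as well, before any secret is written to it.
    file.set_permissions(fs::Permissions::from_mode(0o600))?;
    file.write_all(contents)
}

/// Returns true when neither group nor others have any permission bits set.
fn is_private(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    Ok((mode & 0o077) == 0)
}
```

Setting the mode at creation time, rather than calling `chmod` afterwards, avoids a short window in which a new file could be readable by other users.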

🧱

Real execution control

Every write/exec tool is permission-gated, destructive commands force confirmation even in Auto mode, and sandbox = "container" gives true filesystem + network isolation for the shell.

Where the remaining point goes — and why it doesn't block adoption. At 9.0 the residual is not a live vulnerability. It's the local permission model (reads don't prompt; Auto is all-or-nothing) — a deliberate single-user trust design, not a gap. The reach to 9.5 is optional hardening: OS-keychain key storage, full SHA-pinning of CI actions, and an argument-level shell allowlist. Egress is now scanned at every entry point, file writes are jailed to the workspace, and remote access is token-authenticated.

Recommended posture for a security team: roll out as a per-developer tool with permission_mode = "ask" and, for the strictest stance, sandbox = "container". On that basis a reviewer can approve it for internal use today. Treat multi-tenant / untrusted-input use as out of scope until the denylists become allowlists.

What's already solid

The review verified these by reading the code — not by trusting the README.

🦀

Memory-safe by construction

Safe Rust with no unsafe on the tool or execution paths. The entire class of buffer overflows, use-after-free, and dangling pointers is off the table.

🔑

Credentials handled with care

API keys are never logged — they're only used to build the Authorization header. ~/.agent.toml is written 0600, and proxy credentials are redacted before any display.

🚪

A real permission model

Write and execute tools are gated behind explicit approval. Even in Auto mode, rm -rf, git push --force, and DROP TABLE still force a confirmation.
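The rule that destructive commands prompt even in Auto mode can be sketched as follows. This is an illustration only: the `PermissionMode` enum, the function name, and the substring matching are assumptions, and the pattern list contains just the three examples named above. As the threat model itself says, a check like this catches footguns and is not a security boundary.

```rust
/// Patterns taken from the three examples in the text; a real list would be longer.
const DESTRUCTIVE_PATTERNS: [&str; 3] = ["rm -rf", "git push --force", "drop table"];

/// Hypothetical stand-in for the tool's permission setting.
#[derive(Clone, Copy, PartialEq, Debug)]
enum PermissionMode {
    Ask,
    Auto,
}

/// Returns true when the user must confirm `command` before it runs.
fn needs_confirmation(mode: PermissionMode, command: &str) -> bool {
    // Collapse runs of whitespace and lowercase, so "rm   -RF" still matches.
    let normalized = command
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_lowercase();
    let destructive = DESTRUCTIVE_PATTERNS
        .iter()
        .any(|p| normalized.contains(*p));
    // Ask mode always prompts; Auto mode prompts only for destructive commands.
    mode == PermissionMode::Ask || destructive
}
```

Substring matching is easy to evade (for example `rm -r -f`), which is why the review treats it as a convenience check and recommends `sandbox = "container"` for real isolation.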

📜

An honest threat model

The shell guardrail is documented as a footgun-catcher, not a security boundary — exactly right. docs/SECURITY.md states plainly what is and isn't protected.

Findings — and how they were fixed

Six issues, each tied to a specific location in the source. Severity reflects realistic exploitability. All six were fixed in v0.15.11; the v0.15.12 hardening pass then went further than the original fixes (status shown reflects v0.15.12).

#	Severity	Finding	Status (v0.15.12)
1	HIGH	`/remote` tunneled the session to a public URL with no authentication — read the stream, inject prompts	✅ Token-authed + Auto-refused
2	MEDIUM	`read_file` guarded by a thin 15-entry denylist	✅ Denylist hardened + symlink-safe
3	MEDIUM	Path guard bypassable with a symlink (no `canonicalize`)	✅ Fixed — symlinks resolved
4	MEDIUM	Secret pre-flight scanner had coverage gaps	✅ Scanned at every source + entropy
5	MEDIUM	Project hooks & MCP servers ran code from a cloned repo with no consent	✅ Fixed — project-trust gate
6	LOW	Session history persisted to SQLite world-readable	✅ Fixed — `0600`

Full exploit scenarios, fixes, and remaining reach-goals are in the v3 report. As of v0.15.12 all six are fully closed — /remote is rebuilt with per-session token auth (not just disabled), file writes are jailed to the workspace, and egress is scanned at every entry point.
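Findings 2 and 3 and the workspace jail all rest on the same idea: resolve a path to its real location before checking it. The sketch below shows that technique with `std::fs::canonicalize`. It is a generic illustration, not zap's code; the function name and error messages are assumptions.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

fn denied(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::PermissionDenied, msg)
}

/// Resolves `candidate` (following symlinks and `..`) and returns the real path
/// only if it lies inside the real workspace directory.
fn resolve_in_workspace(workspace: &Path, candidate: &Path) -> io::Result<PathBuf> {
    let root = fs::canonicalize(workspace)?;
    let joined = if candidate.is_absolute() {
        candidate.to_path_buf()
    } else {
        root.join(candidate)
    };
    let real = match fs::canonicalize(&joined) {
        Ok(p) => p,
        Err(_) => {
            // The path exists but cannot be resolved (e.g. a dangling symlink): refuse.
            if fs::symlink_metadata(&joined).is_ok() {
                return Err(denied("unresolvable path"));
            }
            // A file that does not exist yet: resolve its parent directory instead.
            let parent = joined.parent().ok_or_else(|| denied("no parent directory"))?;
            let name = joined.file_name().ok_or_else(|| denied("no file name"))?;
            fs::canonicalize(parent)?.join(name)
        }
    };
    if real.starts_with(&root) {
        Ok(real)
    } else {
        Err(denied("path escapes workspace"))
    }
}
```

A check of this shape still leaves a time-of-check/time-of-use gap if the filesystem changes between resolution and the write, so it works best alongside the container sandbox rather than in place of it.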

The one that mattered most

Finding 1, in plain terms — and what was done about it.

What it was: /remote let you drive a zap session from your phone by tunneling a local server to a public HTTPS URL via ngrok or localhost.run. The catch: the WebSocket had no authentication — no token, no password, no origin check. Anyone who saw that URL could watch everything the model output and inject their own prompts. In Auto mode that reached the shell tool.

First it was disabled (v0.15.11), closing the ingress path immediately. Then it came back hardened (v0.15.12): the server mints a per-session token from the OS CSPRNG, appends it to the printed URL, and rejects both the page and the /ws upgrade without it — so a leaked URL minus the token is inert. It also refuses to start in Auto permission mode, where a leaked URL could otherwise drive the shell unattended.

The result: you can drive zap from your phone again, but only with the secret token in the link. Authenticated-by-default, not unauthenticated-by-default.
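The token flow described above (mint from the OS CSPRNG, require it on every request, refuse to start in Auto mode) can be sketched with the standard library alone. This is an illustration under stated assumptions: it reads `/dev/urandom` and is therefore Unix-only, the query parameter name `token` is a guess at the URL format, and none of the function names come from zap's source.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Mints a 32-byte token from the OS CSPRNG, hex-encoded to 64 characters.
/// Unix-only: reads /dev/urandom directly to avoid external crates.
fn mint_token() -> io::Result<String> {
    let mut bytes = [0u8; 32];
    File::open("/dev/urandom")?.read_exact(&mut bytes)?;
    Ok(bytes.iter().map(|b| format!("{:02x}", b)).collect())
}

/// Best-effort constant-time comparison: always scans the full length, so
/// timing does not reveal how many leading bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // the token length is fixed and not secret
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Checks a query string such as "token=abc&x=1" against the expected token.
/// The parameter name `token` is an assumption.
fn is_authorized(query: &str, expected: &str) -> bool {
    query
        .split('&')
        .filter_map(|pair| pair.strip_prefix("token="))
        .any(|got| constant_time_eq(got.as_bytes(), expected.as_bytes()))
}

/// Refuses to start in Auto mode; otherwise returns a fresh session token.
fn start_remote(auto_mode: bool) -> io::Result<String> {
    if auto_mode {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "remote access is refused in Auto mode",
        ));
    }
    mint_token()
}
```

The same `is_authorized` check would apply to both the page request and the `/ws` upgrade, so that neither path can be reached with the bare URL.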

Scorecard

Each dimension across the three passes: first review (6.5), the fix pass (8.0), and the hardening pass (9.0).

Dimension	v1	v2	v3
Memory & language safety	9	9	9
Credential handling (at rest / in logs)	8	9	9
Local execution model (permissions)	7	7	8
Filesystem boundary	5	7	9
Data egress controls	6	7	9
Remote / network surface	4	8	9
Supply chain (deps + project trust)	6	8	9
Overall	6.5	8.0	9.0

On the 9.0. The four security-critical axes — filesystem, data egress, remote/network, supply chain — are all at 9. The one dimension still at 8, the local permission model, reflects a deliberate single-user trust design (reads don't prompt; Auto is all-or-nothing by choice), not an unfixed weakness. The reach to 9.5 is documented and optional: OS-keychain key storage, full SHA-pinning of CI actions, and an argument-level shell allowlist.
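The report does not state how the overall figure is derived. One reading that is consistent with all three columns of the scorecard is the unweighted mean of the seven dimensions, rounded to the nearest half point. The sketch below checks that reading; the formula is an inference from the table, not a documented method.

```rust
/// Unweighted mean rounded to the nearest 0.5.
/// This rounding rule is an assumption inferred from the scorecard.
fn overall(scores: &[f64]) -> f64 {
    let mean = scores.iter().sum::<f64>() / scores.len() as f64;
    (mean * 2.0).round() / 2.0
}

/// Scorecard columns, in the table's row order.
const V1: [f64; 7] = [9.0, 8.0, 7.0, 5.0, 6.0, 4.0, 6.0]; // mean 45/7 ≈ 6.43
const V2: [f64; 7] = [9.0, 9.0, 7.0, 7.0, 7.0, 8.0, 8.0]; // mean 55/7 ≈ 7.86
const V3: [f64; 7] = [9.0, 9.0, 8.0, 9.0, 9.0, 9.0, 9.0]; // mean 62/7 ≈ 8.86
```

Under this reading, `overall(&V1)`, `overall(&V2)`, and `overall(&V3)` give 6.5, 8.0, and 9.0, matching the Overall row.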

The full documents

Everything above is drawn from these source-verified reports.

Security Review v3, hardened (Mythos)
Security Review v2
Security Review v1
Security Model
Engineering Review (Opus 4.8)

Reviewed, fixed, then hardened.
9.0 / 10.
