Replace bash shim with Rust binary

Bash shim had shell quoting risks and depended on socat/nc. Elixir escript would pay ~300ms BEAM boot per invocation. A second Burrito binary would unpack on every cold call. Rust gives <1ms startup, proper timeout handling, and is already in the toolchain for the tree-sitter NIF. - Add shim/ Rust crate to directory structure - Document Rust shim rationale (vs bash, escript, Burrito) - Update Dependencies with shim crate deps (serde_json, stdlib Unix socket) - Update install script, README, architecture diagram Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 12:28:18 +02:00
parent a133271a6c
commit 7645402347
2 changed files with 126 additions and 9 deletions
@@ -0,0 +1,107 @@
 # security-hooks
 Defense-in-depth security hooks for AI coding agents. Works with **Claude Code**, **Gemini CLI**, and **Codex**.
 A single daemon evaluates every tool call against configurable rules — blocking destructive commands, catching data exfiltration, and flagging suspicious behavior before it reaches your system.
 ## How it works
 ```
 AI Agent ──► Rust Shim ──► Unix Socket ──► Elixir Daemon ──► Verdict
              (<1ms)         (<1ms)        (rule engine)     allow/deny/ask
 ```
 1. The AI tool calls a hook before each tool use (bash command, file edit, MCP call)
 2. A tiny Rust binary forwards the payload to a long-running Elixir daemon
 3. The daemon evaluates rules using **two layers**:
   - **Regex** — fast pattern matching for obvious threats (fork bombs, miners)
   - **AST** — structural analysis via tree-sitter-bash that catches evasion (`rm -rf $(echo /)`, piped exfiltration, obfuscated commands)
 4. Returns `allow`, `deny` (with a nudge message), or `ask` (falls through to human approval)
 The system is **fail-closed** — if the daemon is unreachable, the tool call is blocked.
 ## Threat coverage
 - **Prompt injection** — malicious content in READMEs, web pages, or MCP responses can't trick the agent into running blocked commands
 - **Destructive operations** — `rm -rf`, force push, `sudo`, cloud resource deletion, and more
 - **Data exfiltration** — detects secrets being piped to `curl`/`nc`, reads of `~/.ssh` or `~/.aws`, env var leaks
 - **Supply chain attacks** — flags dependency mutations, unknown executables, lockfile edits
 - **MCP injection** — validates MCP server identity, scans parameters for shell injection via AST
 ## Rule DSL
 Rules live in `.rules` files with a custom syntax designed for regex without escaping pain:
 ```
 # Regex — pattern is literal to end of line, no quoting needed
 block "fork-bomb"
  match :\(\)\s*\{.*\|.*&\s*\}\s*;
  nudge "Fork bomb detected"
 # AST — structural matching that catches evasion
 block "destructive-rm"
  match command("rm") with_flags("-r", "-rf", "-fr")
  nudge "Use trash-cli or move to a temp directory"
 block "pipe-to-exfil"
  match pipeline_to("curl", "wget", "nc")
  nudge "Don't pipe output to network commands"
 # Config-referenced allowlist
 suspicious "unknown-executable"
  match_base_command_not_in allowed_executables
  nudge "Unknown command '{base_command}'. Add it to config.toml"
 ```
 The rule loader auto-detects regex vs AST based on whether the match starts with a function like `command(`.
 ## Two tiers
 - **block** — hard deny. The agent sees the nudge and self-corrects.
 - **suspicious** — falls through to the human permission prompt with context.
 ## Configuration
 ```toml
 # config.toml — defaults ship with the project
 [executables]
 allowed = ["git", "mix", "cargo", "go", "node", "npm", "python", "rg", "fd", ...]
 [secrets]
 env_vars = ["AWS_SECRET_ACCESS_KEY", "GITHUB_TOKEN", "DATABASE_URL", ...]
 [paths]
 sensitive = ["~/.ssh", "~/.aws/credentials", "~/.config/gcloud", ...]
 ```
 Override with `config.local.toml` (gitignored):
 ```toml
 [executables]
 append = ["my-custom-tool", "deno"]
 exclude = ["curl"]
 [rules]
 disabled = ["force-push"]
 ```
 ## Architecture
 - **Rust shim** — ~1MB static binary, <1ms startup, forwards payloads via Unix socket
 - **Elixir daemon** — distributed as a [Burrito](https://github.com/burrito-elixir/burrito) binary (no Erlang/Elixir install needed)
 - **Adapter layer** — normalizes payloads across Claude Code, Gemini CLI, and Codex
 - **tree-sitter-bash** — Rust NIF for robust AST parsing of shell commands
 - **Hot-reload** — edit rules or config, changes apply on the next tool call
 - **systemd/launchd** — socket activation for zero cold-start latency, automatic crash recovery
 ## Platforms
 macOS (aarch64, x86_64) &middot; Linux (x86_64, aarch64) &middot; WSL
 ## Status
 Design phase. See [`docs/superpowers/specs/2026-03-26-security-hooks-design.md`](docs/superpowers/specs/2026-03-26-security-hooks-design.md) for the full spec.
 ## License
 TBD
@@ -13,16 +13,19 @@ A general-purpose, distributable set of security hooks for AI coding agents (Cla
 Four components:
-### 1. Shell shim (`security-hook`)
+### 1. Shim (`security-hook`)
-A short bash script that AI coding tools invoke as a hook command. It:
+A small Rust binary (~1MB static, <1ms startup) that AI coding tools invoke as a hook command. Rust is chosen over bash to avoid shell quoting bugs, over Elixir escript to avoid ~300ms BEAM boot, and over a second Burrito binary to avoid unpack-on-every-invocation overhead. Since tree-sitter-bash already requires Rust in the build toolchain, this adds no new dependencies.
 The shim:
 - Accepts an `--adapter` flag to specify the calling tool (`claude`, `gemini`, `codex`)
 - Reads the JSON hook payload from stdin
- Passes the payload and adapter name to the daemon over a Unix socket
+- Connects to the daemon's Unix socket and sends the payload with the adapter name
- Prints the daemon's response to stdout (formatted for the calling tool)
+- Reads the daemon's response and prints it to stdout
 - Handles timeouts and fail-closed behavior natively (no shell `timeout` command)
-The shim is deliberately simple — it does not manage daemon lifecycle. That responsibility belongs to the platform service manager (see Daemon Lifecycle below).
+It does not manage daemon lifecycle. That responsibility belongs to the platform service manager (see Daemon Lifecycle below).
 Usage:
 ```
@@ -147,8 +150,10 @@ When installed, the entire tree below is copied to `$SECURITY_HOOKS_HOME` (defau
 ```
 security-hooks/
-├── bin/
+├── shim/                                # Rust shim binary
-│   └── security-hook                    # shell shim
+│   ├── Cargo.toml
 │   └── src/
 │       └── main.rs
 ├── service/
 │   ├── security-hookd.service           # systemd user service unit
 │   ├── security-hookd.socket            # systemd socket activation unit
@@ -838,7 +843,7 @@ Future: streaming connectors for centralized logging (stdout, webhook, syslog).
 The install script:
 1. Downloads the Burrito binary for the current platform (macOS aarch64/x86_64, Linux x86_64/aarch64) or builds from source if Elixir is available
-2. Installs the binary and shell shim to `~/.local/bin/` (or user-specified location)
+2. Installs both binaries (`security-hookd` daemon + `security-hook` shim) to `~/.local/bin/` (or user-specified location)
 3. Copies default rules and config to `~/.config/security-hooks/`
 4. Creates `config.local.toml` from a template if it does not exist
 5. Auto-detects installed MCP servers from Claude Code/Gemini CLI config and pre-populates the MCP allowlist in `config.local.toml`
@@ -981,5 +986,10 @@ Elixir/Hex packages required by the daemon:
 - `file_system` — cross-platform file watcher for hot-reload
 - `burrito` — compile to single-binary for distribution
-Rust NIF (compiled into the Burrito binary):
+Rust (compiled into the Burrito binary as a NIF, and used standalone for the shim):
 - `tree-sitter` + `tree-sitter-bash` — bash AST parser for structural command analysis (primary parser, see Bash Parser Strategy)
 Shim binary (`security-hook`, Rust):
 - `std::os::unix::net::UnixStream` — Unix socket client (stdlib, no external deps)
 - `serde_json` — JSON parsing
 - Cross-compiled for the same platform targets as the daemon