docs(readme): correct firewall scope; track egress controls in TODO

The 'What makes gnoma different' bullet and Security section both implied a network-egress firewall. Today the Firewall only enforces a content boundary (secret scan, Unicode sanitize, redact/block). Reword both spots and add a Scope note. Surface the gap as a top-of-TODO entry covering per-session audit log and per-host egress allowlist, with the open design question (host-level vs per-tool) called out. Raised via r/SideProject v0.3.0 launch thread.
2026-05-24 15:50:35 +02:00
parent 1828151162
commit 7040041f13
2 changed files with 40 additions and 6 deletions
@@ -25,9 +25,13 @@ skills, MCP servers, and plugins.
  reach for Claude when the local model would obviously flail.
 - **Tier-0 SLM routing.** A tiny local model classifies each prompt and
  handles trivial tasks itself, keeping the heavy provider for real work.
- **Built-in network firewall + secret scanner.** Paths are canonicalised
-  (TOCTOU-safe), network egress is gated, tool output is scanned for
-  secrets before reaching the model.
+- **Content boundary + secret scanner.** Every outgoing LLM message
+  and incoming tool result is scanned for secrets (regex + Shannon
+  entropy on long tokens), redacted or blocked at the content level.
+  Paths are canonicalised (TOCTOU-safe), Unicode is sanitized
+  (homoglyphs, BiDi tricks), and a `SafeProvider` boundary keeps
+  incognito-mode data out of long-lived stores. *(Per-host network
+  egress allowlist is on the roadmap, not in place today.)*
 - **Provider-agnostic from day one.** Anthropic, OpenAI, Google, Mistral,
  Ollama, llama.cpp, plus subprocess CLIs (`claude`, `codex`, `agy`,
  `vibe`). Mix cloud and local in the same session.
@@ -462,9 +466,18 @@ built-in batching skill.

 gnoma runs tools and shell commands on your behalf. The
 [`internal/security`](internal/security) package canonicalises every path
-(TOCTOU-safe), gates network access through a configurable firewall, and
-scans tool output for secrets before it ever reaches the model. The
-`SafeProvider` boundary keeps incognito-mode data out of long-lived stores.
+(TOCTOU-safe), scans every outgoing LLM message and incoming tool result
+for secrets (regex + Shannon entropy) before it reaches the model, and
+sanitizes Unicode (homoglyphs, BiDi tricks). The `SafeProvider` boundary
+keeps incognito-mode data out of long-lived stores.
+
+> **Scope note.** The current "firewall" is a content boundary — it
+> redacts/blocks secrets in inputs and outputs. It is **not** a
+> network-egress firewall: outgoing HTTP from tools and providers goes
+> through stock `http.Client`, with no per-host allowlist or
+> dial-layer enforcement. Per-host egress rules and a per-session
+> audit log of blocked/redacted events are tracked in
+> [TODO.md](TODO.md).

 ### Entropy false-positive reduction

@@ -4,6 +4,27 @@ Active work, newest first.

 ## In flight

+- **Security boundary — egress controls + session audit log.** The
+  current `Firewall` is a content boundary only (scans messages and
+  tool results for secrets via regex + Shannon entropy, redacts or
+  blocks, logs via `log/slog`). It does not enforce network egress —
+  outgoing HTTP from tools and providers uses stock `http.Client`
+  with no per-host allowlist or dial-layer interception. Two follow-
+  ups surfaced from the r/SideProject v0.3.0 launch thread
+  (2026-05-24, `u/Secret_Theme3192`):
+  1. **Per-session audit log of blocked/redacted events** —
+     grep-able file at `.gnoma/sessions/<id>/audit.jsonl` so the
+     user can answer "what did the firewall do this session?" in
+     one command. Today the `slog` output goes to whatever sink is
+     configured, with no per-session grouping.
+  2. **Per-host egress allowlist (HTTP transport layer)** — open
+     design question: host-level (`allow api.openai.com, deny *`)
+     vs per-tool (`bash can only hit these hosts`). Reply asked
+     the commenter for their mental model; revisit when feedback
+     lands. The README and v0.3.0 Reddit post phrasing oversold
+     "network egress gated"; corrected in the same commit as this
+     TODO entry.
+
 - **Tool-router specialization (functiongemma)** — gated on telemetry,
  not committed. Phase A.2 adds did-switch-rate measurement to the
  two-stage `select_category` path; Phase A.3 (LoRA fine-tune of