docs(plans): config-migration and sensitive-content-policy

Promotes two TODO entries into phased plan docs and links them from the TODO bullets.

The config-migration plan covers the silent layered-config corruption chain (encoder zero-spam -> reader overwrite -> wrong effective values) and its remediation across five phases: encoder fix (omitempty + pointer-numeric hybrid), project registry, gnoma doctor, gnoma upgrade-config, and auto-migration on startup with banner notice.

The sensitive-content-policy plan unifies three input paths (pasted text, pasted images, tool-read files) behind one decision API with consistent UI surface and audit-log integration. Phases A-E sequence the work from highest-leverage (text paste) to most complex (image OCR with local vision arm).

Neither plan starts implementation in this commit. They exist to make the design decisions explicit so the eventual code can be reviewed against a written intent rather than a TODO bullet.
feat(security): per-session firewall audit log
2026-05-24 22:51:33 +02:00 · 2026-05-24 22:47:28 +02:00 · 2026-05-24 22:42:34 +02:00 · 2026-05-24 22:29:56 +02:00 · 2026-05-24 22:24:59 +02:00 · 2026-05-24 22:13:27 +02:00
36 changed files with 1689 additions and 164 deletions
@@ -1,4 +1,15 @@
-MISTRAL_API_KEY="asd**"
-ANTHROPICS_API_KEY="sk-ant-**"
+# --- LLM provider keys (set at least one) ---
+ANTHROPIC_API_KEY="sk-ant-**"
 OPENAI_API_KEY="sk-proj-**"
 GEMINI_API_KEY="AIza**"
+# Alternative to GEMINI_API_KEY (either is accepted)
+# GOOGLE_API_KEY="AIza**"
+MISTRAL_API_KEY="**"
+
+# --- Optional overrides (config can also set these) ---
+# GNOMA_PROVIDER="anthropic"
+# GNOMA_MODEL="claude-sonnet-4-6"
+
+# --- Subprocess sandbox bypass (footguns — set deliberately) ---
+# GNOMA_AGY_BYPASS_PERMISSIONS=1
+# GNOMA_CODEX_BYPASS_SANDBOX=1
@@ -0,0 +1,68 @@
+# Release workflow — runs when a vX.Y.Z tag is pushed (including mirror
+# pushes from somegit.dev). Drives GoReleaser to publish:
+#   - static binaries (linux/darwin/windows × amd64/arm64) + checksums
+#     + autogenerated changelog to the GitHub releases page
+#   - multi-arch container images to ghcr.io/vikingowl91/gnoma
+#
+# GITHUB_TOKEN is provided automatically by GitHub Actions and already
+# carries packages:write thanks to the permissions block, so no PAT is
+# needed for either the release upload or the ghcr.io push.
+#
+# Security note: this workflow does not interpolate any untrusted
+# context (commit messages, PR titles, issue bodies) into shell commands.
+# All ${{ ... }} references live in with: / env: blocks, which are
+# safely passed as strings rather than evaluated as shell.
+
+name: Release
+
+on:
+  push:
+    tags:
+      - "v*"
+
+permissions:
+  contents: write
+  packages: write
+
+jobs:
+  release:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Setup Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: "1.26"
+
+      - name: Setup QEMU
+        uses: docker/setup-qemu-action@v3
+
+      - name: Setup Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to GHCR
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Test
+        run: go test ./...
+
+      - name: GoReleaser
+        uses: goreleaser/goreleaser-action@v6
+        with:
+          version: latest
+          args: release --clean
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          # Force GoReleaser to use the triggering tag rather than fall
+          # back to `git describe` — which can resolve to an older tag
+          # (e.g., a vX.Y.Z-rc tag) when multiple tags point at the same
+          # commit. Surfaced as the v0.3.1 release failure on 2026-05-24.
+          GORELEASER_CURRENT_TAG: ${{ github.ref_name }}
@@ -37,9 +37,12 @@ changelog:
  sort: asc
  filters:
    exclude:
-      - "^docs:"
-      - "^test:"
-      - "^chore:"
+      # Match both bare and scoped conventional commits, e.g. both
+      # "docs:" and "docs(readme):" should be excluded.
+      - "^docs[:(]"
+      - "^test[:(]"
+      - "^chore[:(]"
+      - "^style[:(]"

 # Multi-arch Docker images published to GitHub Container Registry.
 # Build host needs Docker buildx and a `docker login ghcr.io` for the
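Review note on the changelog filters in the hunk above: the character class `[:(]` is what lets one pattern cover both `docs:` and `docs(readme):`. A minimal sketch of the matching behaviour, assuming GoReleaser evaluates these filters with Go's `regexp` package; `excludePatterns` and `excluded` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// excludePatterns mirrors the four changelog filters from the hunk above.
var excludePatterns = []*regexp.Regexp{
	regexp.MustCompile(`^docs[:(]`),
	regexp.MustCompile(`^test[:(]`),
	regexp.MustCompile(`^chore[:(]`),
	regexp.MustCompile(`^style[:(]`),
}

// excluded reports whether a commit subject matches any exclude pattern.
func excluded(subject string) bool {
	for _, re := range excludePatterns {
		if re.MatchString(subject) {
			return true
		}
	}
	return false
}

func main() {
	subjects := []string{
		"docs: fix typo",
		"docs(plans): config-migration and sensitive-content-policy",
		"feat(security): per-session firewall audit log",
		"testing: not a conventional commit type",
	}
	for _, s := range subjects {
		fmt.Printf("%q excluded=%v\n", s, excluded(s))
	}
}
```

One consequence of this form: a breaking-change subject such as `docs!: ...` has `!` directly after the type, so it would not match and would stay in the changelog.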
@@ -98,3 +101,6 @@ release:
  github:
    owner: VikingOwl91
    name: gnoma
+  # Auto-detect prereleases from semver: tags with -rc, -beta, -alpha,
+  # -pre, etc. suffix get marked as prerelease on GitHub.
+  prerelease: auto
@@ -5,20 +5,60 @@ Provider-agnostic agentic coding assistant in Go 1.26.
 Named after the northern pygmy-owl (Glaucidium gnoma).
 Agents are called "elfs" (elf owl).

-## Module
-`somegit.dev/Owlibou/gnoma`
+## Module & repo layout
+- Module: `somegit.dev/Owlibou/gnoma`
+- Upstream (primary, accepts PRs): <https://somegit.dev/Owlibou/gnoma>
+- GitHub mirror (read-only): <https://github.com/VikingOwl91/gnoma>
+
+PRs go to the upstream Gitea instance, not GitHub. The GitHub side is a
+push mirror — direct pushes to `main`/`dev` there will be rejected by the
+ruleset.
+
+## Big picture (read this before diving in)
+
+Single static Go binary. Request flow:
+
+1. `cmd/gnoma` parses flags, picks TUI vs pipe mode, builds the session.
+2. `internal/session` owns one chat lifecycle; `internal/engine` runs the
+   agentic loop (stream → tool calls → re-query → until done).
+3. `internal/router` picks the arm per prompt: multi-armed bandit over
+   provider adapters in `internal/provider/{anthropic,openai,google,mistral,openaicompat}`,
+   tiered SLM (`internal/slm`) → CLI-agent subprocess → local → cloud,
+   with `Strengths` + `MaxComplexity` + `CostWeight` shaping selection.
+4. `internal/security` is the safety boundary: SafeProvider wrapping,
+   firewall (content boundary; no network-egress enforcement yet),
+   secret scanner, redaction, incognito mode.
+   `internal/safety` is separate — it's the pre-launch CWD classifier.
+5. `internal/tool` is the local-action boundary; `internal/permission`
+   gates every tool call.
+6. Extensibility surfaces: `internal/hook`, `internal/skill`,
+   `internal/mcp` (JSON-RPC over stdio), `internal/plugin` (TOFU-pinned).
+
+Discriminated unions (struct + type discriminant) are the project's
+chosen way to model variants — see `internal/message` and
+`internal/stream`. Don't reach for interfaces when a discriminant fits.
+
+Full essentials (vision, domain model, ADRs, process flows):
+`docs/essentials/INDEX.md`. **Read INDEX.md before changing
+architectural boundaries or adding new packages.** Note: INDEX
+predates `internal/safety` and `internal/slm` — cross-check the actual
+tree.

 ## Build & Test
 ```sh
-make build     # build binary to ./bin/gnoma
-make test      # run all tests
-make lint      # run golangci-lint
-make cover     # test with coverage report
-```
+make build              # ./bin/gnoma
+make test               # unit tests
+make test-integration   # //go:build integration — needs real API keys
+make lint               # golangci-lint run ./...
+make check              # fmt + vet + lint + test + vuln + sec; canonical pre-commit gate
+make cover              # coverage.html

-## Project Essentials
-Project architecture, domain model, and design decisions: `docs/essentials/INDEX.md`
-Read INDEX.md before making architectural changes or adding new system boundaries.
+# Run a single test / package
+go test -run TestRouterSelect ./internal/router/
+go test -v ./internal/router/
+
+# Benchmarks
+go test -bench=. ./internal/router/
+```

 ## Conventions

@@ -1,4 +1,4 @@
-.PHONY: build run check install test lint cover clean fmt vet
+.PHONY: build run check install test lint cover clean fmt vet vuln sec

 BINARY := gnoma
 BINDIR := ./bin
@@ -10,7 +10,7 @@ build:
 run: build
 	$(BINDIR)/$(BINARY)

-check: fmt vet lint test
+check: fmt vet lint test vuln sec
 	@echo "All checks passed!"

 install:
@@ -43,3 +43,13 @@ clean:

 tidy:
 	go mod tidy
+
+# Reachability-checked dependency vuln scan against the Go vuln DB.
+# Install: go install golang.org/x/vuln/cmd/govulncheck@latest
+vuln:
+	govulncheck ./...
+
+# Static security analysis via Semgrep (Go ruleset + security-audit).
+# Install: pip install semgrep  (or: brew install semgrep)
+sec:
+	semgrep --config=p/golang --config=p/security-audit --metrics=off --error .
@@ -10,11 +10,65 @@ to the best available model — cloud or local — through a multi-armed bandit
 router, executes tools on your behalf, and stays extensible through hooks,
 skills, MCP servers, and plugins.

-Named after the northern pygmy-owl (*Glaucidium gnoma*); agents are called
-**elfs** (elf owl).
+![gnoma TUI showing a routed turn](docs/img/gnoma-tui.png)

- **Upstream:** <https://somegit.dev/Owlibou/gnoma>
- **GitHub mirror:** <https://github.com/VikingOwl91/gnoma>
+*Every turn shows which arm the router picked and why — here a local
+`qwen3:14b` was selected for a `generation` task.*
+
+## What makes gnoma different
+
+- **Multi-armed bandit router.** Per-prompt arm selection based on
+  capability gates, declared `Strengths`, latency, and cost. Visible in
+  the TUI on every turn — no black box.
+- **`[router].prefer = local | cloud | auto`.** Pin routing toward local
+  models, cloud, or let the bandit decide. Offline-first workflows still
+  reach for Claude when the local model would obviously flail.
+- **Tier-0 SLM routing.** A tiny local model classifies each prompt and
+  handles trivial tasks itself, keeping the heavy provider for real work.
+- **Content boundary + secret scanner.** Every outgoing LLM message
+  and incoming tool result is scanned for secrets (regex + Shannon
+  entropy on long tokens), redacted or blocked at the content level.
+  Paths are canonicalised (TOCTOU-safe), Unicode is sanitized
+  (homoglyphs, BiDi tricks), and a `SafeProvider` boundary keeps
+  incognito-mode data out of long-lived stores. *(Per-host network
+  egress allowlist is on the roadmap, not in place today.)*
+- **No phone-home.** gnoma itself sends nothing off-machine — zero
+  analytics endpoint, zero metrics service, no remote logging.
+  Prompts of course go to whatever provider you route them to:
+  cloud arms ship data to that provider by design; pair
+  Ollama/llama.cpp with `--incognito` if you want everything
+  on-device.
+- **Provider-agnostic from day one.** Anthropic, OpenAI, Google, Mistral,
+  Ollama, llama.cpp, plus subprocess CLIs (`claude`, `codex`, `agy`,
+  `vibe`). Mix cloud and local in the same session.
+- **Vision end-to-end.** `[Image: /path]` markers in prompts, `Ctrl+V`
+  paste in the TUI, capability-gated per arm.
+- **Single static binary.** `CGO_ENABLED=0`, multi-arch container on
+  ghcr.io. No daemon, no runtime deps.
+
+## Status
+
+Pre-1.0 (current: **v0.3.0**). Single maintainer, breaking changes
+possible. The provider, router, and engine surfaces are settling;
+config schema and TUI bindings may still shift between minor versions.
+Apache 2.0.
+
+## Table of contents
+
+- [Install](#install)
+- [Quickstart](#quickstart)
+- [Vision / image input](#vision--image-input)
+- [Providers](#providers)
+- [Config](#config)
+- [Routing defaults](#routing-defaults)
+- [SLM routing](#slm-small-language-model-routing)
+- [Session persistence](#session-persistence)
+- [Extensibility](#extensibility)
+- [Subcommands](#subcommands)
+- [Security](#security)
+- [Development](#development)
+- [About](#about)
+- [License](#license)

 ---

@@ -418,9 +472,25 @@ built-in batching skill.

 gnoma runs tools and shell commands on your behalf. The
 [`internal/security`](internal/security) package canonicalises every path
-(TOCTOU-safe), gates network access through a configurable firewall, and
-scans tool output for secrets before it ever reaches the model. The
-`SafeProvider` boundary keeps incognito-mode data out of long-lived stores.
+(TOCTOU-safe), scans every outgoing LLM message and incoming tool result
+for secrets (regex + Shannon entropy) before it reaches the model, and
+sanitizes Unicode (homoglyphs, BiDi tricks). The `SafeProvider` boundary
+keeps incognito-mode data out of long-lived stores.
+
+> **Scope note.** The current "firewall" is a content boundary — it
+> redacts/blocks secrets in inputs and outputs. It is **not** a
+> network-egress firewall: outgoing HTTP from tools and providers goes
+> through stock `http.Client`, with no per-host allowlist or
+> dial-layer enforcement. Per-host egress rules and a per-session
+> audit log of blocked/redacted events are tracked in
+> [TODO.md](TODO.md).
+>
+> **Data flow.** gnoma itself emits no telemetry to external services
+> — no analytics, no metrics endpoint, no remote logging. When you
+> route to a cloud provider (Anthropic, OpenAI, Google, Mistral),
+> prompts and tool data are sent to that provider as required to
+> fulfill the request — by design. For fully on-device operation,
+> use Ollama or llama.cpp and `--incognito`.

 ### Entropy false-positive reduction

@@ -498,6 +568,15 @@ Architecture, conventions, and TDD workflow: [CONTRIBUTING.md](CONTRIBUTING.md).

 ---

+## About
+
+Named after the northern pygmy-owl (*Glaucidium gnoma*); agents are called
+**elfs** (elf owl).
+
+- **Upstream:** <https://somegit.dev/Owlibou/gnoma>
+- **GitHub mirror:** <https://github.com/VikingOwl91/gnoma> (read-only;
+  PRs go to upstream Gitea)
+
 ## License

 Apache License 2.0. See [LICENSE](LICENSE) and [NOTICE](NOTICE).
@@ -4,35 +4,171 @@ Active work, newest first.

 ## In flight

- **Startup safety + context banner** — refuse / warn / OK tier check
-  on the cwd at launch (refuse in `/etc`, `/sys`, system roots; warn
-  with keypress in `$HOME`, `/tmp`, common dumping grounds; OK in
-  anything inside a git repo or with a project marker). Context
-  banner always shown with cwd, git state, model, modes, and a
-  top-level sensitive-file inventory. Bypass via
-  `--dangerously-allow-anywhere`. Complements the in-flight
-  sensitive-content unified-policy work (this is the pre-flight
-  layer; that is the runtime layer). See
-  [`docs/superpowers/plans/2026-05-23-startup-safety-banner.md`](docs/superpowers/plans/2026-05-23-startup-safety-banner.md).
- **Routing-preference policy** — `[router].prefer = "local" | "cloud" | "auto"`
-  config knob biasing selection via a soft score multiplier
-  (0.3 / 0.5 / 1.0). Preserves Strengths cross-tier promotion and
-  the bandit's learning; complements rather than replaces incognito.
-  Forced arms (`--provider X`) and incognito still take priority.
-  Closes the original 2026-05-23 session item B (deferred when the
-  defaults-refresh work landed first). See
-  [`docs/superpowers/plans/2026-05-23-prefer-routing-policy.md`](docs/superpowers/plans/2026-05-23-prefer-routing-policy.md).
- **Routing defaults refresh** — bake family-keyed `Strengths` +
-  `MaxComplexity` into discovery so a freshly-pulled local fleet
-  routes sensibly without any TOML config. Adds a non-chat exclude
-  list (filters `embeddinggemma`, `kokoros`, `whisper-base`,
-  `vibevoice`, `*-asr/-tts/-audio/-reranker`), extends
-  `knownVisionModelPrefixes` (gemma4, glm-ocr), and refreshes the
-  cloud-side registry (Gemini 3.x, `gpt-5.3-codex`). Closed-model
-  `Strengths` + `CostWeight` defaults land in the provider modules.
-  Driven by benchmark snapshot 2026-05-23
-  (artificialanalysis.ai v4.0, llm-stats.com). See
-  [`docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md`](docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md).
+- **Config write/merge — silent corruption of layered configs.**
+  `internal/config/write.go:setConfig` reads the existing TOML into a
+  zero-valued `Config` struct, sets one field, and writes the entire
+  struct back out — so every untouched field gets serialized at its
+  Go zero value (empty strings, zero ints, `false` bools). On the
+  next load, those explicit zeros overwrite higher-priority layers
+  via `toml.Decode`'s "present field beats absent field" semantics.
+
+  Concrete symptom (2026-05-24): user's `~/.config/gnoma/config.toml`
+  had `[router].prefer = "cloud"` but the project-level
+  `.gnoma/config.toml` had `prefer = ""` (generated by an earlier
+  `gnoma config set ...` call), which silently downgraded the
+  effective policy to `auto` — visible only via the new `/router`
+  TUI command, with no warning.
+
+  Same root cause is responsible for the zero-spammed global config
+  the same user has (`max_tokens = 0`, `permission.mode = ""`,
+  `bash_timeout = 0`, etc.) — all overwriting sensible defaults.
+
+  **Fix surface (multi-part, plan-worthy):**
+
+  1. **Stop generating zero-spam.** Three options:
+     - Tag struct fields with `,omitempty` so the BurntSushi encoder
+       skips zero values. Caveat: conflates "unset" with "explicitly
+       zero" for primitive types (a user who wants `max_keep = 0`
+       loses it). Safe for strings/maps/slices where empty is never
+       user-intent; lossy for numeric fields.
+     - Switch to `pelletier/go-toml/v2` and use its document model
+       to edit only the targeted key, preserving everything else
+       byte-for-byte. Cleaner semantics, bigger refactor.
+     - Hybrid: omitempty on string/map/slice fields, document-level
+       edit for numerics. Fastest path that doesn't lose intent.
+
+  2. **`gnoma doctor` — read-only diagnostic.** Scans both global
+     and project configs and reports:
+     - Zero-spam fields that would silently shadow defaults or
+       upstream layers.
+     - Invalid enum values (e.g. `permission.mode = ""`).
+     - Unknown / removed keys from older schema versions.
+     - Effective-merged values (so the user sees what gnoma will
+       actually use after layering). No writes. Exits non-zero on
+       findings so it's CI-friendly.
+
+  3. **`gnoma upgrade-config` — active migration.** For each config
+     file (global, profiles, project):
+     - Compute the cleaned form (only fields the user actually set,
+       dropping zeros that match defaults).
+     - Write the original to `<path>.bak` with timestamp suffix.
+     - Write the cleaned form to the original path.
+     - Print a diff of what changed so the user can verify.
+
+  4. **Project-level auto-migration on startup.** If gnoma detects
+     a zero-spammed project `.gnoma/config.toml` at launch:
+     - Auto-run the upgrade (project-only, never auto-touch the
+       global config).
+     - Write `.gnoma/config.toml.bak-YYYY-MM-DD-HHMMSS`.
+     - Surface a one-line notice in the startup safety banner:
+       `config: migrated .gnoma/config.toml (see .bak)`.
+     - The auto-migration is non-destructive (`.bak` preserves
+       original) but still gated behind a `[config].auto_migrate`
+       toggle, defaulting to `true`. Global configs require
+       explicit `gnoma upgrade-config`.
+
+  5. **Project registry** (`~/.config/gnoma/projects.json`). Today
+     there is no record of which directories gnoma has been launched
+     in — items #2 and #3 can work with a filesystem scan
+     (`find ~ -type d -name .gnoma`), but a registry makes them
+     significantly faster and unlocks cross-project features.
+     Sketch:
+
+     ```json
+     {
+       "projects": [
+         {
+           "path": "/home/.../my-repo",
+           "first_seen": "2026-04-15T10:30:00Z",
+           "last_seen":  "2026-05-24T19:23:00Z",
+           "session_count": 47
+         }
+       ]
+     }
+     ```
+
+     Update on every successful startup (record project root,
+     bump `last_seen` + increment `session_count`). Enables:
+     - Fast `gnoma doctor --all-projects` without a filesystem walk.
+     - Cross-project session listing (`gnoma sessions --all`
+       picker; surface most-recent sessions across the registry).
+     - `gnoma upgrade-config` that can migrate every known project
+       in one invocation.
+     - Future local-only aggregate stats (`gnoma stats`) — still
+       no-phone-home, just a sum across the registry.
+
+     **Caveats and design constraints:**
+     - The registry file becomes another silent-corruption surface
+       — must use the same `omitempty` / atomic-write discipline
+       as the encoder fix in #1, or it'll exhibit the same class
+       of bug.
+     - Stale entries (deleted projects). `gnoma doctor` should
+       detect and offer to prune; do not auto-delete.
+     - Privacy: this is literally a log of directories the user
+       has worked in. Local-only, never sent off-machine (per the
+       no-phone-home positioning), but worth a one-line note in
+       the Security section of the README so users know it exists.
+     - Opt-out: `[config].project_registry = false` for users who
+       don't want this tracked. Default `true`.
+     - Atomic writes (temp file + rename) so a crash mid-write
+       doesn't corrupt the file.
+
+  Surfaced from the v0.3.1 launch wave (2026-05-24).
+  Plan:
+  [`docs/superpowers/plans/2026-05-24-config-migration.md`](docs/superpowers/plans/2026-05-24-config-migration.md).
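Review note on item 5 above: the registry update ("bump `last_seen`, increment `session_count`") and the atomic-write constraint (temp file + rename) can be sketched together. This is a minimal illustration under stated assumptions, not the planned implementation: `Project`, `Registry`, `Touch`, `writeAtomic`, and `roundTrip` are hypothetical names, timestamps are kept as RFC 3339 strings for brevity, and the demo writes to a throwaway temp directory rather than `~/.config/gnoma/projects.json`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// Project and Registry are hypothetical types mirroring the JSON sketch.
type Project struct {
	Path         string `json:"path"`
	FirstSeen    string `json:"first_seen"`
	LastSeen     string `json:"last_seen"`
	SessionCount int    `json:"session_count"`
}

type Registry struct {
	Projects []Project `json:"projects"`
}

// Touch records one successful startup in root.
func (r *Registry) Touch(root, now string) {
	for i := range r.Projects {
		if r.Projects[i].Path == root {
			r.Projects[i].LastSeen = now
			r.Projects[i].SessionCount++
			return
		}
	}
	r.Projects = append(r.Projects, Project{
		Path: root, FirstSeen: now, LastSeen: now, SessionCount: 1,
	})
}

// writeAtomic writes to a temp file in the target directory, then renames
// it over the target, so readers never observe a half-written file.
func writeAtomic(path string, r *Registry) error {
	data, err := json.MarshalIndent(r, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".projects-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip records two startups, writes the registry, and reads it back.
func roundTrip() (Registry, error) {
	var back Registry
	dir, err := os.MkdirTemp("", "registry-demo")
	if err != nil {
		return back, err
	}
	defer os.RemoveAll(dir)

	var reg Registry
	now := time.Now().UTC().Format(time.RFC3339)
	reg.Touch("/tmp/example-repo", now)
	reg.Touch("/tmp/example-repo", now)
	path := filepath.Join(dir, "projects.json")
	if err := writeAtomic(path, &reg); err != nil {
		return back, err
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return back, err
	}
	err = json.Unmarshal(data, &back)
	return back, err
}

func main() {
	reg, err := roundTrip()
	if err != nil {
		fmt.Fprintln(os.Stderr, "registry demo failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", reg)
}
```

The temp file is created in the same directory as the target so the rename stays on one filesystem. The sketch does not fsync the parent directory and does not handle two gnoma processes updating the registry concurrently; both would need a decision in the real design.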
+
+- **Bandit selector — design decisions deferred.** The current
+  selector (`internal/router/selector.go:scoreArm`) is greedy
+  quality-weighted: per-(arm × task-type) EMA scores blended 70/30
+  with heuristic defaults, divided by CostWeight-adjusted cost. It
+  is **not** a true multi-armed bandit — no UCB-style exploration
+  bonus, no Thompson sampling. Tracked as a design question rather
+  than a must-implement item because of two open dependencies:
+
+  1. **Whether to keep numeric EMA at all.** The 2026-05-07 roadmap
+     (Phase 4) puts re-evaluating bandit learning on hold until the
+     SLM-driven dispatcher is in production. Three options on the
+     table: keep bandit as feedback for the SLM, retire EMA in
+     favour of qualitative outcome summaries fed to the SLM, or
+     split responsibilities (SLM = intent routing, bandit =
+     cost/quality within a tier). See
+     [`docs/superpowers/plans/2026-05-07-gnoma-roadmap.md`](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md)
+     §Phase 4.
+
+  2. **User-tunable selector knobs.** Several constants are
+     hardcoded today: `qualityAlpha` (EMA smoothing, ~3-sample
+     memory), the 70/30 observed/heuristic blend,
+     `strengthScoreBonus` for tagged task types, and the
+     `DefaultThresholds.Minimum` quality floor. Surfacing these as
+     `[router.bandit]` config keys would let users tune for their
+     workloads (faster alpha for shifting model performance, longer
+     memory for stable fleets) without waiting for the strategic
+     decision in #1.
+
+  Surfaced from the r/coolgithubprojects v0.3.1 launch thread
+  (2026-05-24, `u/Ha_Deal_5079`).
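Review note on the selector description above: the scoring it describes (EMA of observed quality, 70/30 blend with a heuristic default, division by cost, greedy pick) can be sketched as follows. All names here (`arm`, `updateEMA`, `score`, `pick`) are hypothetical, and `alpha = 0.3` is an assumed value chosen only for illustration; the real constants live in `internal/router` and are not shown in this diff.

```go
package main

import "fmt"

const (
	alpha          = 0.3 // assumed EMA smoothing factor, not the project's value
	observedWeight = 0.7 // the 70/30 observed/heuristic blend from the text
)

// arm holds the per-(arm x task-type) inputs to the score.
type arm struct {
	name      string
	observed  float64 // EMA of observed quality
	heuristic float64 // static default quality
	cost      float64 // CostWeight-adjusted cost, must be > 0
}

// updateEMA folds one new quality sample into the running average.
func updateEMA(prev, sample float64) float64 {
	return alpha*sample + (1-alpha)*prev
}

// score blends observed and heuristic quality, then divides by cost.
func score(a arm) float64 {
	blended := observedWeight*a.observed + (1-observedWeight)*a.heuristic
	return blended / a.cost
}

// pick is purely greedy: it returns the index of the highest-scoring arm,
// or -1 for an empty slice. There is no exploration term.
func pick(arms []arm) int {
	best := -1
	for i := range arms {
		if best == -1 || score(arms[i]) > score(arms[best]) {
			best = i
		}
	}
	return best
}

func main() {
	arms := []arm{
		{name: "cloud", observed: 0.9, heuristic: 0.9, cost: 3},
		{name: "local", observed: 0.6, heuristic: 0.7, cost: 1},
	}
	for _, a := range arms {
		fmt.Printf("%s score=%.3f\n", a.name, score(a))
	}
	fmt.Println("picked:", arms[pick(arms)].name)
	fmt.Printf("EMA after one perfect sample: %.3f\n", updateEMA(0.6, 1.0))
}
```

Because `pick` always takes the current maximum, an arm with a low initial estimate may never be sampled again. That is the gap the TODO entry points at when it says there is no UCB-style bonus or Thompson sampling.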
+
+- **Security boundary — egress controls + session audit log.** The
+  current `Firewall` is a content boundary only (scans messages and
+  tool results for secrets via regex + Shannon entropy, redacts or
+  blocks, logs via `log/slog`). It does not enforce network egress —
+  outgoing HTTP from tools and providers uses stock `http.Client`
+  with no per-host allowlist or dial-layer interception. Two follow-
+  ups surfaced from the r/SideProject v0.3.0 launch thread
+  (2026-05-24, `u/Secret_Theme3192`):
+  1. **Per-session audit log of blocked/redacted events** —
+     grep-able file at `.gnoma/sessions/<id>/audit.jsonl` so the
+     user can answer "what did the firewall do this session?" in
+     one command. Today the `slog` output goes to whatever sink is
+     configured, with no per-session grouping.
+  2. **Per-host egress allowlist (HTTP transport layer)** — open
+     design question: host-level (`allow api.openai.com, deny *`)
+     vs per-tool (`bash can only hit these hosts`). Reply asked
+     the commenter for their mental model; revisit when feedback
+     lands. The README and v0.3.0 Reddit post phrasing oversold
+     "network egress gated"; corrected in the same commit as this
+     TODO entry.
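Review note on follow-up 2 above: the host-level variant (`allow api.openai.com, deny *`) could be prototyped as an `http.RoundTripper` wrapper. This is a sketch of one option in an open design question, not a proposal for the final shape; `allowlistTransport` and `permits` are hypothetical names. It only covers HTTP clients that are built with this transport, so it is not dial-layer enforcement and does nothing for subprocesses such as a `bash` tool invoking `curl`.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// allowlistTransport is a hypothetical host-level egress allowlist.
type allowlistTransport struct {
	allowed map[string]bool   // lower-case host names
	next    http.RoundTripper // transport used for permitted requests
}

// permits reports whether host is on the allowlist (case-insensitive).
func (t *allowlistTransport) permits(host string) bool {
	return t.allowed[strings.ToLower(host)]
}

// RoundTrip rejects requests to hosts that are not on the allowlist
// before any connection is attempted.
func (t *allowlistTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if host := req.URL.Hostname(); !t.permits(host) {
		if req.Body != nil {
			req.Body.Close() // RoundTripper contract: close the body on error too
		}
		return nil, fmt.Errorf("egress blocked: host %q is not on the allowlist", host)
	}
	return t.next.RoundTrip(req)
}

func main() {
	client := &http.Client{Transport: &allowlistTransport{
		allowed: map[string]bool{"api.openai.com": true},
		next:    http.DefaultTransport,
	}}
	// Rejected inside RoundTrip, so no network traffic is generated.
	_, err := client.Get("https://example.com/")
	fmt.Println(err)
}
```

A per-tool policy would need the caller's identity threaded through to the transport (for example via the request context), which this sketch deliberately leaves out.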
+
 - **Tool-router specialization (functiongemma)** — gated on telemetry,
  not committed. Phase A.2 adds did-switch-rate measurement to the
  two-stage `select_category` path; Phase A.3 (LoRA fine-tune of
@@ -65,7 +201,8 @@ Active work, newest first.
  warning when the content matches sensitive heuristics, a
  consent-gated review step, and consistent treatment across the
  three paths. Cross-cuts with Phase F entropy work and the
-  outgoing-scan firewall.
+  outgoing-scan firewall. Plan:
+  [`docs/superpowers/plans/2026-05-24-sensitive-content-policy.md`](docs/superpowers/plans/2026-05-24-sensitive-content-policy.md).
 - **Distribution — follow-ups.** v0.1.0 shipped (archives on
  github.com/VikingOwl91/gnoma/releases, multi-arch images on
  ghcr.io/vikingowl91/gnoma). Still optional: Homebrew tap,
@@ -84,7 +221,13 @@ Active work, newest first.
 - **Structured output** with JSON schema validation — M12.
 - **Native agy JSON output** — switch the subprocess provider to
  `--output-format stream-json` once the agy CLI supports it,
-  replacing the current prompt-augmentation fallback.
+  replacing the current prompt-augmentation fallback. Until then,
+  agy's `ToolUse` capability is set to `false` (see
+  `internal/provider/subprocess/agent.go` agy entry) — without
+  structured tool-call output, the router would otherwise dispatch
+  tool-needing tasks to agy and the turn would hang on prose
+  hallucinations of tool calls. Flip the capability back to `true`
+  in the same change that lands stream-json parsing.
 - **SQLite session persistence** + serve mode — M10.
 - **Task learning** (pattern recognition, persistent tasks) — M11.
 - **Web UI** (`gnoma web`) — M15.
@@ -2,13 +2,14 @@ package main

 import (
 	"context"
+	"crypto/rand"
+	"encoding/binary"
 	"encoding/json"
 	"errors"
 	"flag"
 	"fmt"
 	"io"
 	"log/slog"
-	mrand "math/rand"
 	"os"
 	"os/signal"
 	"path/filepath"
@@ -61,17 +62,17 @@ var (
 func main() {
 	var resumeFlag string
 	var (
-		providerName = flag.String("provider", "", "LLM provider (mistral, anthropic, openai, google, ollama, llamacpp)")
-		model        = flag.String("model", "", "model name (empty = provider default)")
-		system       = flag.String("system", "", "system prompt override (empty = built-in default)")
-		apiKey       = flag.String("api-key", "", "API key (or set MISTRAL_API_KEY env)")
-		maxTurns     = flag.Int("max-turns", 50, "max tool-calling rounds per turn")
-		permMode     = flag.String("permission", "auto", "permission mode (default, accept_edits, bypass, deny, plan, auto)")
-		incognito    = flag.Bool("incognito", false, "incognito mode — no persistence, no learning")
-		profileFlag  = flag.String("profile", "", "config profile to load (empty = default_profile from base config)")
+		providerName  = flag.String("provider", "", "LLM provider (mistral, anthropic, openai, google, ollama, llamacpp)")
+		model         = flag.String("model", "", "model name (empty = provider default)")
+		system        = flag.String("system", "", "system prompt override (empty = built-in default)")
+		apiKey        = flag.String("api-key", "", "API key (or set MISTRAL_API_KEY env)")
+		maxTurns      = flag.Int("max-turns", 50, "max tool-calling rounds per turn")
+		permMode      = flag.String("permission", "auto", "permission mode (default, accept_edits, bypass, deny, plan, auto)")
+		incognito     = flag.Bool("incognito", false, "incognito mode — no persistence, no learning")
+		profileFlag   = flag.String("profile", "", "config profile to load (empty = default_profile from base config)")
 		allowAnywhere = flag.Bool("dangerously-allow-anywhere", false, "bypass the cwd safety classifier — only use if you know what you're doing")
-		verbose      = flag.Bool("verbose", false, "enable debug logging")
-		version      = flag.Bool("version", false, "print version and exit")
+		verbose       = flag.Bool("verbose", false, "enable debug logging")
+		version       = flag.Bool("version", false, "print version and exit")
 	)
 	flag.StringVar(&resumeFlag, "resume", "", "resume session by ID (omit ID to list sessions)")
 	flag.StringVar(&resumeFlag, "r", "", "resume session (shorthand)")
@@ -396,7 +397,17 @@ func main() {

 	// Create router and register the provider as a single arm
 	// (M4 foundation: one provider from CLI. Multi-provider routing comes with config.)
-	rtr := router.New(router.Config{Logger: logger})
+	// BanditParams come from [router.bandit] config keys; zero values
+	// resolve to built-in defaults inside the router package.
+	rtr := router.New(router.Config{
+		Logger: logger,
+		Bandit: router.BanditParams{
+			QualityAlpha:    cfg.Router.Bandit.QualityAlpha,
+			MinObservations: cfg.Router.Bandit.MinObservations,
+			ObservedWeight:  cfg.Router.Bandit.ObservedWeight,
+			StrengthBonus:   cfg.Router.Bandit.StrengthBonus,
+		},
+	})

 	// Apply the prefer-routing-policy from config (default: auto).
 	// Invalid values are rejected here with an actionable error rather
@@ -656,10 +667,14 @@ func main() {
 	}
 	permChecker := permission.NewChecker(permission.Mode(*permMode), permRules, pipePromptFn)

-	// Generate session-scoped ID for /tmp artifact directory
+	// Generate session-scoped ID for /tmp artifact directory.
+	// Use crypto/rand so the suffix isn't predictable even if a future
+	// caller seeds math/rand deterministically (e.g., in tests).
+	var randBuf [8]byte
+	_, _ = rand.Read(randBuf[:])
 	sessionID := fmt.Sprintf("%s-%06x",
 		time.Now().Format("20060102-150405"),
-		mrand.Int63()&0xffffff,
+		binary.BigEndian.Uint64(randBuf[:])&0xffffff,
 	)
 	// Pass the firewall's incognito mode so Save no-ops while incognito
 	// is active. Mode is consulted on every Save (dynamic), so TUI
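Review note on the session-ID change in the hunk above: the same construction as a standalone function, for reference. `newSessionID` and `demoID` are hypothetical names; the format string and the `0xffffff` mask are taken from the diff. Unlike the diff, this sketch propagates the error from `rand.Read`; on Go 1.24 and later that function is documented never to return an error, so ignoring it as the diff does is reasonable for a Go 1.26 project.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"time"
)

// newSessionID returns "<YYYYMMDD-HHMMSS>-<6 hex digits>", with the suffix
// drawn from crypto/rand and masked to 24 bits.
func newSessionID(now time.Time) (string, error) {
	var buf [8]byte
	if _, err := rand.Read(buf[:]); err != nil {
		return "", err
	}
	suffix := binary.BigEndian.Uint64(buf[:]) & 0xffffff
	return fmt.Sprintf("%s-%06x", now.Format("20060102-150405"), suffix), nil
}

// demoID builds an ID for a fixed timestamp so the prefix is predictable.
func demoID() (string, error) {
	return newSessionID(time.Date(2026, 5, 24, 22, 13, 27, 0, time.UTC))
}

func main() {
	id, err := demoID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

The suffix carries 24 bits of randomness, which is enough to separate sessions started within the same second but should not be treated as a secret.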
@@ -667,6 +682,17 @@ func main() {
 	store := persist.New(sessionID, fw.Incognito())
 	logger.Debug("session store initialized", "dir", store.Dir())

+	// Per-session firewall audit log: append-only JSONL at
+	// <projectRoot>/.gnoma/sessions/<sessionID>/audit.jsonl. Honours
+	// incognito (writes skipped when active) and tolerates fs errors —
+	// scan pipeline never depends on the audit succeeding.
+	auditPath := filepath.Join(gnomacfg.ProjectRoot(), ".gnoma", "sessions", sessionID, "audit.jsonl")
+	fw.SetAudit(security.NewAuditLogger(security.AuditLoggerConfig{
+		Path:      auditPath,
+		Incognito: fw.Incognito(),
+		Logger:    logger,
+	}))
+
 	// Create elf manager and register agent tools.
 	// Must be created after fw and permChecker so elfs inherit security layers.
 	elfMgr := elf.NewManager(elf.ManagerConfig{
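Review note on the audit-log wiring in the hunk above: `security.NewAuditLogger` itself is not part of this excerpt, so the following is only a sketch of the behaviour the comment describes (append-only JSONL, writes skipped while incognito). `auditEvent`, `auditLog`, `Append`, and `demoAudit` are hypothetical, the event fields are assumptions, and incognito is modelled as a `func() bool` that is consulted on every write.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"time"
)

// auditEvent is a hypothetical record shape; field names are assumptions.
type auditEvent struct {
	Time   string `json:"time"`
	Action string `json:"action"` // for example "redact" or "block"
	Rule   string `json:"rule"`
}

// auditLog appends one JSON object per line to path.
type auditLog struct {
	mu        sync.Mutex
	path      string
	incognito func() bool // consulted on every write
}

// Append writes one event, or does nothing while incognito is active.
func (a *auditLog) Append(ev auditEvent) error {
	if a.incognito != nil && a.incognito() {
		return nil
	}
	line, err := json.Marshal(ev)
	if err != nil {
		return err
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	if err := os.MkdirAll(filepath.Dir(a.path), 0o700); err != nil {
		return err
	}
	f, err := os.OpenFile(a.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(append(line, '\n'))
	return err
}

// demoAudit appends two events, enables incognito, appends a third,
// and returns the number of lines actually written.
func demoAudit() (int, error) {
	dir, err := os.MkdirTemp("", "audit-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	incognito := false
	al := &auditLog{
		path:      filepath.Join(dir, ".gnoma", "sessions", "demo", "audit.jsonl"),
		incognito: func() bool { return incognito },
	}
	now := time.Now().UTC().Format(time.RFC3339)
	for _, action := range []string{"redact", "block"} {
		ev := auditEvent{Time: now, Action: action, Rule: "example-rule"}
		if err := al.Append(ev); err != nil {
			return 0, err
		}
	}
	incognito = true
	if err := al.Append(auditEvent{Time: now, Action: "block", Rule: "example-rule"}); err != nil {
		return 0, err
	}
	data, err := os.ReadFile(al.path)
	if err != nil {
		return 0, err
	}
	lines := 0
	for _, b := range data {
		if b == '\n' {
			lines++
		}
	}
	return lines, nil
}

func main() {
	n, err := demoAudit()
	fmt.Println("lines written:", n, "err:", err)
}
```

`Append` returns filesystem errors to the caller; to match the stated intent that the scan pipeline never depends on the audit succeeding, a caller would log and discard that error rather than abort the scan.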
@@ -1,5 +1,10 @@
 # Routing-Preference Policy — 2026-05-23

+> **Status: shipped in v0.3.0.** Commit `f9094f6`. Implementation
+> diverged from the original plan (tier-shift instead of pure score
+> multiplier) — see "Implementation note" in the Approach section.
+> All P-1 through P-7 tasks complete.
+
 Adds a config knob that biases routing toward local arms, toward
 cloud arms, or leaves the current tier+score behavior unchanged.
 Originally surfaced as item B in the 2026-05-23 routing redesign
@@ -1,5 +1,10 @@
 # Routing Defaults Refresh — 2026-05-23

+> **Status: shipped in v0.3.0.** Commits `a79e991` (scaffold) →
+> `9bb775a` (full local family table) → `2f8d4c4` (cloud defaults
+> + gpt-5.3-codex) → `c99b2c6` (README). All R-1 through R-8
+> tasks complete.
+
 Refreshes gnoma's per-arm routing defaults so that out-of-the-box
 selection produces sensible choices without requiring users to write
 a `[[arms]]` block in TOML. Surfaced during the 2026-05-23 session
@@ -1,5 +1,11 @@
 # Startup Safety + Context Banner — 2026-05-23

+> **Status: shipped in v0.3.0.** Commits `3eeb5b4` (classifier +
+> banner + main.go wiring) → `8ba77c1` (env-template precision
+> fix, label alignment, banner-under-bypass). All S-1 through
+> S-7 tasks complete; S-8 docs done in `d206b3c`. Windows path
+> handling still deferred per plan.
+
 Adds a pre-launch safety check that warns or refuses when gnoma is
 started in a directory where it could do real damage (`$HOME`,
 `/`, `/etc`, etc.), plus a context banner shown on every launch
@@ -0,0 +1,356 @@
+# Config Migration — 2026-05-24
+
+Fixes the silent-corruption pattern in `internal/config/write.go`
+that produces zero-spammed config files, adds reader-side telemetry
+to surface the resulting layering bugs (`gnoma doctor`), ships an
+active migration command (`gnoma upgrade-config`), wires automatic
+project-level migration on startup, and introduces a per-user
+project registry so all of the above can operate cross-project.
+
+Tracked in TODO.md as "Config write/merge — silent corruption of
+layered configs" with five sub-items; this plan promotes that entry
+from bullet form into a phased design.
+
+---
+
+## Problem
+
+`setConfig()` in `internal/config/write.go` reads the existing TOML
+into a zero-valued `Config` struct, mutates one field, and writes
+the entire struct back out. The encoder doesn't skip zero values,
+so every untouched field gets serialized at its Go default — empty
+strings, zero ints, `false` bools, empty maps.
+
+The next layered load (`Load()` → `toml.Decode` over multiple
+files) does **not** treat those present-but-zero fields as
+"unset": a key that is present in a later layer always wins, so
+the zeros overwrite real values set in earlier layers. Concrete
+failure observed 2026-05-24:
+
+- User's global `~/.config/gnoma/config.toml` has
+  `[router].prefer = "cloud"`.
+- An earlier `gnoma config set ...` call generated a project-level
+  `.gnoma/config.toml` containing `[router].prefer = ""`.
+- The merge collapses to `Prefer = ""`, which
+  `ParsePreferPolicy("")` maps to `PreferAuto`.
+- The TUI's `/router` command reads `auto` despite the global
+  config saying `cloud`. No warning, no error — purely silent.
+
+Same root cause produces zero-spammed global configs
+(`max_tokens = 0`, `permission.mode = ""`, etc.) that silently
+override sensible defaults in `internal/config/defaults.go`.
+
+This affects every layered field — provider, permission, tools,
+session, router, security, slm. Cannot be patched per-field;
+needs a structural fix.
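
The overwrite chain described above can be reproduced in a few lines. This is a minimal sketch, not gnoma's loader: `encoding/json` stands in for the TOML decoder (an assumption made so the example needs only the standard library), and `Router` and `mergeLayers` are hypothetical names. The point it illustrates is the same: decoding a later layer over the same struct replaces any key that is present, even when its value is the zero value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Router carries only the one field from the failure described above.
type Router struct {
	Prefer string `json:"prefer"`
}

// mergeLayers decodes each layer over the same struct, in order, the way a
// layered load would. JSON is a stand-in for TOML here (assumption).
func mergeLayers(layers ...string) (Router, error) {
	var r Router
	for _, layer := range layers {
		if err := json.Unmarshal([]byte(layer), &r); err != nil {
			return Router{}, err
		}
	}
	return r, nil
}

func main() {
	global := `{"prefer": "cloud"}`
	spammed := `{"prefer": ""}` // key present, value is the zero string
	clean := `{}`               // key absent

	bad, _ := mergeLayers(global, spammed)
	good, _ := mergeLayers(global, clean)
	fmt.Printf("zero-spammed project layer: %q\n", bad.Prefer)
	fmt.Printf("clean project layer:        %q\n", good.Prefer)
}
```

With the zero-spammed layer the merged value is empty; with the clean layer the global `cloud` survives.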
+
+---
+
+## Non-goals
+
+- **Schema redesign.** The current `Config` struct stays as-is.
+  This plan addresses how it's written and read, not what fields
+  exist.
+- **Validation.** Future work; `gnoma doctor` will flag obviously
+  invalid values (empty enum strings, etc.) but a full validation
+  pass against the schema is out of scope here.
+- **Migration of the bandit-router quality JSON.** Unrelated file,
+  unrelated format, separate concerns.
+
+---
+
+## Approach overview
+
+Five phases, in dependency order:
+
+1. **Encoder fix** — stop generating zero-spam in the first place.
+2. **Project registry** — `~/.config/gnoma/projects.json` so later
+   phases can operate cross-project without filesystem walks.
+3. **`gnoma doctor`** — read-only diagnostic, scans global +
+   project configs (via registry), reports zero-spam, invalid
+   enums, removed keys, and the effective-merged view.
+4. **`gnoma upgrade-config`** — active migration with `.bak`
+   backup + diff output; targets one file or all known projects.
+5. **Auto-migration on startup** — when launch detects a
+   zero-spammed project config, run upgrade-config silently with
+   a banner-line notice.
+
+Phases 1 + 2 land first. 3 builds on 1 + 2. 4 builds on 3. 5
+builds on 4.
+
+---
+
+## Phase 1 — Encoder fix
+
+`setConfig()` is the bug generator. The TOML library
+(`BurntSushi/toml`) supports `omitempty` on struct tags but the
+project's `Config` struct doesn't use it. Three options:
+
+### Option A — `omitempty` on all fields
+
+Tag every field with `,omitempty`. The encoder skips fields at
+their Go zero value. **Caveat:** conflates "unset" with
+"explicitly zero" for primitive types — a user who actually
+wants `max_keep = 0` (no session retention) loses that setting on
+the next write.
+
+### Option B — `pelletier/go-toml/v2` document model
+
+Switch encoder to a TOML library that exposes a document AST.
+Edit only the targeted key, preserve everything else byte-for-byte.
+Cleaner semantics, bigger refactor — also affects the decoder side.
+
+### Option C (chosen) — hybrid
+
+Use `omitempty` for fields where the Go zero value is never
+user-intent (strings, maps, slices). For numeric fields where 0
+is a legitimate user choice, switch the field to a pointer
+(`*int`, `*float64`) so `nil` means "unset" and a non-nil pointer
+to `0` means "explicitly zero". Bool fields that need the same
+zero-vs-unset distinction get the same treatment (see P1-1). On
+decode, fall back to defaults for nil pointers in the resolution
+layer.
+
+This keeps the existing BurntSushi library, preserves user intent
+across the full type space, and limits churn to the fields where
+the zero/unset ambiguity actually matters.
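
The encoder side of Option C might look like the sketch below. It uses `encoding/json` as a stand-in encoder so it runs with the standard library alone; how `BurntSushi/toml` treats `omitempty` on pointer fields should be confirmed against that library's documentation before relying on it. `Session`, `intPtr`, and `encode` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Session is a hypothetical config section. The string uses omitempty because
// "" is never user intent; the number is a pointer so nil can mean "unset".
type Session struct {
	Name    string `json:"name,omitempty"`
	MaxKeep *int   `json:"max_keep,omitempty"`
}

// intPtr returns a pointer to v, for building explicit values.
func intPtr(v int) *int { return &v }

// encode serializes s; JSON stands in for TOML (assumption).
func encode(s Session) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(Session{}))                                  // nothing set
	fmt.Println(encode(Session{MaxKeep: intPtr(0)}))                // explicit zero
	fmt.Println(encode(Session{Name: "work", MaxKeep: intPtr(20)})) // both set
}
```

An unset section encodes to an empty object, while an explicit `max_keep = 0` is still written out, which is the distinction Option A loses.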
+
+### Phase 1 task list
+
+- **P1-1:** Audit every `Config`-tree field. Tag string/map/slice
+  fields with `,omitempty`. List numeric/bool fields that need
+  pointer conversion.
+- **P1-2:** Convert numeric/bool fields requiring zero-vs-unset
+  distinction to pointers. Update construction sites and getters.
+- **P1-3:** Add a `Resolve()` method on `Config` that walks the
+  struct and substitutes default values for nil pointers, called
+  exactly once at the end of `Load()`. All consumer code reads
+  resolved values; raw layered structs are internal.
+- **P1-4:** Tests covering: (a) write-then-read roundtrip
+  preserves only user-set fields, (b) explicit zero (e.g.
+  `max_keep = 0`) survives the roundtrip, (c) field absent from
+  TOML resolves to default.
+- **P1-5:** Backwards-compat: when reading an existing zero-spammed
+  file, the resolver must treat all-zeros-in-a-section as the
+  default — see Phase 5 for the heuristic.
+
+---
+
+## Phase 2 — Project registry
+
+New file at `~/.config/gnoma/projects.json`:
+
+```json
+{
+  "projects": [
+    {
+      "path": "/home/user/git/foo",
+      "first_seen": "2026-04-15T10:30:00Z",
+      "last_seen":  "2026-05-24T19:23:00Z",
+      "session_count": 47
+    }
+  ]
+}
+```
+
+### Phase 2 task list
+
+- **P2-1:** Add `internal/config/registry.go` with `Registry`,
+  `Load`, `Save`, `Record(projectRoot)`, `Prune(staleAfter time.Duration)`.
+- **P2-2:** Save uses atomic-write (temp file + `os.Rename`) so a
+  crash mid-write doesn't corrupt the file.
+- **P2-3:** Call `Registry.Record(projectRoot)` from
+  `cmd/gnoma/main.go` right after the startup-safety banner
+  decides to proceed. Failure is logged at Warn level but never
+  blocks startup.
+- **P2-4:** Add `[config].project_registry` toggle in defaults.go
+  (bool, default `true`). When `false`, Record is a no-op.
+- **P2-5:** Document the file in README §Security as part of the
+  no-phone-home scope note: this is purely local, never sent.
+- **P2-6:** Tests: round-trip, atomic-write under fault injection,
+  toggle off path.
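
A minimal sketch of the registry described in P2-1 and P2-2, assuming the JSON shape shown above. The method names follow the task list, but the signatures (for example passing `now` into `Record`) are illustrative choices, and `roundTrip` is a hypothetical helper that exists only to exercise the code. A production version would probably also fsync the temp file before renaming.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

type Project struct {
	Path         string    `json:"path"`
	FirstSeen    time.Time `json:"first_seen"`
	LastSeen     time.Time `json:"last_seen"`
	SessionCount int       `json:"session_count"`
}

type Registry struct {
	Projects []Project `json:"projects"`
}

// Record bumps the entry for root, or appends a new one on first sight.
func (r *Registry) Record(root string, now time.Time) {
	for i := range r.Projects {
		if r.Projects[i].Path == root {
			r.Projects[i].LastSeen = now
			r.Projects[i].SessionCount++
			return
		}
	}
	r.Projects = append(r.Projects, Project{Path: root, FirstSeen: now, LastSeen: now, SessionCount: 1})
}

// Save writes to a temp file in the target directory and renames it into
// place, so readers see either the old file or the new one, never a partial.
func (r *Registry) Save(path string) error {
	data, err := json.MarshalIndent(r, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), "projects-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// Load reads the registry; a missing file yields an empty registry.
func Load(path string) (*Registry, error) {
	reg := &Registry{}
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return reg, nil
	}
	if err != nil {
		return nil, err
	}
	return reg, json.Unmarshal(data, reg)
}

// roundTrip records the same project twice across save/load cycles in a
// temp directory and returns the resulting session count.
func roundTrip() (int, error) {
	dir, err := os.MkdirTemp("", "registry-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "projects.json")
	for i := 0; i < 2; i++ {
		reg, err := Load(path)
		if err != nil {
			return 0, err
		}
		reg.Record("/home/user/git/foo", time.Now().UTC())
		if err := reg.Save(path); err != nil {
			return 0, err
		}
	}
	reg, err := Load(path)
	if err != nil {
		return 0, err
	}
	return reg.Projects[0].SessionCount, nil
}

func main() {
	n, err := roundTrip()
	fmt.Println("sessions:", n, "err:", err)
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes it atomic on POSIX systems.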
+
+---
+
+## Phase 3 — `gnoma doctor`
+
+New subcommand. Read-only. Scans:
+
+- Global config at `GlobalConfigPath()`.
+- Every project in the registry (or filesystem-scan fallback when
+  the registry is disabled or empty).
+- Active profile (when profile mode is on).
+
+Reports per-file:
+
+- **Zero-spam fields** — keys present with a zero value where an
+  earlier layer or the default holds a non-zero value. The very
+  thing this plan exists to fix.
+- **Invalid enum values** — `permission.mode = ""`,
+  `router.prefer = "yes"`, etc. Use existing parsers to detect.
+- **Unknown keys** — fields in the TOML that don't map to any
+  `Config` struct field. Decoder ignores these silently today;
+  doctor surfaces them.
+- **Removed keys** — known-historical fields from older schema
+  versions; suggest removal.
+
+Reports per-stack:
+
+- **Effective-merged values** — what gnoma will actually use after
+  layering. Helps the user see whether a project file is masking
+  a global setting.
+
+### Phase 3 task list
+
+- **P3-1:** Add `cmd/gnoma/doctor_cmd.go` with the subcommand
+  scaffold.
+- **P3-2:** `internal/config/doctor.go` with the scan logic;
+  exported `Diagnose(paths []string) []Finding`.
+- **P3-3:** Output: human format by default, `--json` for
+  CI/script consumption.
+- **P3-4:** Exit non-zero when findings have severity ≥ Warn so
+  doctor is CI-friendly.
+- **P3-5:** `--all-projects` flag (default off; uses registry).
+- **P3-6:** Tests covering each finding type.
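
One way the zero-spam check and the CI exit code from P3-2 and P3-4 could fit together is sketched below. It works on already-decoded, flattened key/value maps, which is an assumption: the real `Diagnose(paths []string) []Finding` would parse files first. The `Finding` fields, the `Severity` levels, and the dotted key names are all hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

type Severity int

const (
	Info Severity = iota
	Warn
	Error
)

// Finding is a hypothetical shape for one doctor result.
type Finding struct {
	File     string
	Key      string
	Severity Severity
	Message  string
}

// isZero reports whether v is nil or the zero value of its type.
func isZero(v any) bool {
	return v == nil || reflect.ValueOf(v).IsZero()
}

// zeroSpamFindings flags keys that layer sets to a zero value while the layer
// beneath it (or the defaults) holds a non-zero value for the same key.
func zeroSpamFindings(file string, layer, below map[string]any) []Finding {
	keys := make([]string, 0, len(layer))
	for k := range layer {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output regardless of map iteration order

	var out []Finding
	for _, k := range keys {
		base, ok := below[k]
		if ok && isZero(layer[k]) && !isZero(base) {
			out = append(out, Finding{
				File: file, Key: k, Severity: Warn,
				Message: fmt.Sprintf("present with zero value, masks %v", base),
			})
		}
	}
	return out
}

// exitCode is non-zero when any finding is at Warn severity or above.
func exitCode(findings []Finding) int {
	for _, f := range findings {
		if f.Severity >= Warn {
			return 1
		}
	}
	return 0
}

func main() {
	project := map[string]any{"router.prefer": "", "session.max_keep": int64(20)}
	global := map[string]any{"router.prefer": "cloud", "session.max_keep": int64(0)}

	findings := zeroSpamFindings(".gnoma/config.toml", project, global)
	for _, f := range findings {
		fmt.Printf("%s: %s: %s\n", f.File, f.Key, f.Message)
	}
	fmt.Println("exit code:", exitCode(findings))
}
```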
+
+---
+
+## Phase 4 — `gnoma upgrade-config`
+
+Active migration. Writes:
+
+- Original file → `<path>.bak-YYYYMMDD-HHMMSS` (deterministic
+  timestamp suffix).
+- Cleaned content → original path.
+- Stdout: unified diff of what changed.
+
+### Phase 4 task list
+
+- **P4-1:** Add `cmd/gnoma/upgrade_config_cmd.go`.
+- **P4-2:** `internal/config/upgrade.go` with `Upgrade(path string)`
+  → reads file, applies the Phase 1 cleaning (drop fields equal to
+  their resolved default, keep explicit zeros that diverge from the
+  default via the pointer semantics).
+- **P4-3:** Two-step write: rename original to `.bak-...`,
+  then atomic-write new content to original path. A crash between
+  the two steps leaves the `.bak` file intact (the original path
+  may be missing until the next run), never a half-written config.
+- **P4-4:** `--all-projects` flag using the registry.
+- **P4-5:** `--dry-run` prints diffs without writing.
+- **P4-6:** Tests: round-trip of zero-spammed input → cleaned
+  output → identical re-read; idempotency (running twice yields
+  no second `.bak`).
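
The backup naming and the cleaning rule from P4-2 can be sketched as two small pure functions. This is illustrative only: it works on a flattened key/value map rather than a parsed TOML document, the default values are invented for the example, and `backupPath`, `clean`, and `exampleTime` are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// exampleTime is a fixed instant so the backup name below is reproducible.
var exampleTime = time.Date(2026, 5, 24, 22, 51, 33, 0, time.UTC)

// backupPath builds <path>.bak-YYYYMMDD-HHMMSS from the given time.
func backupPath(path string, now time.Time) string {
	return path + ".bak-" + now.Format("20060102-150405")
}

// clean returns a copy of layer without the keys whose value equals the
// resolved default. Values that differ from the default, including explicit
// zeros, are kept.
func clean(layer, defaults map[string]any) map[string]any {
	out := make(map[string]any, len(layer))
	for k, v := range layer {
		if d, ok := defaults[k]; ok && reflect.DeepEqual(v, d) {
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical defaults, chosen only for the example.
	defaults := map[string]any{"router.prefer": "", "permission.mode": "default", "session.max_keep": 20}
	layer := map[string]any{"router.prefer": "", "permission.mode": "default", "session.max_keep": 0}

	fmt.Println(clean(layer, defaults))
	fmt.Println(backupPath(".gnoma/config.toml", exampleTime))
}
```

Because `clean` only removes keys, running it on its own output changes nothing, which is the idempotency property P4-6 asks the tests to pin down.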
+
+---
+
+## Phase 5 — Auto-migration on startup
+
+When `Load()` parses a project `.gnoma/config.toml` and the
+heuristic flags it as zero-spammed (every field at the Go zero
+value, no user content), gnoma:
+
+- Runs the Phase 4 upgrade in-process.
+- Writes `.gnoma/config.toml.bak-...`.
+- Emits a single line to the startup safety banner:
+  `config: migrated .gnoma/config.toml (see .bak)`.
+- Continues startup with the cleaned config.
+
+### Heuristic for "zero-spam"
+
+A project config file is treated as zero-spam if **all** of these hold:
+
+- Every primitive field present in the file is at its Go zero
+  value.
+- No `[[arms]]`, `[[mcp_servers]]`, or `[[hooks]]` blocks (those
+  are always user content).
+- File was last modified at least 24h ago (so we don't migrate a
+  config the user is actively editing).
+
+If only some fields are zero and some are user-set, we don't touch
+it — the user's mix of explicit zeros and meaningful values takes
+precedence.
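
The three conditions above translate fairly directly into a predicate. The sketch below takes a small facts struct instead of `*Config`, since the modification time and the presence of array tables are properties of the file rather than of the decoded struct; that input shape, the `fileFacts` name, and the choice to treat a file with no primitive keys as "nothing to migrate" are all assumptions.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// minAge is the 24h threshold from the heuristic.
const minAge = 24 * time.Hour

// fileFacts is a hypothetical summary of one project config file.
type fileFacts struct {
	Primitives     map[string]any // primitive keys present in the file
	HasArrayTables bool           // any [[arms]], [[mcp_servers]] or [[hooks]] block
	Age            time.Duration  // time since last modification
}

// isZeroSpam reports whether every present primitive is at its zero value,
// no array tables exist, and the file is at least minAge old.
func isZeroSpam(f fileFacts) bool {
	if f.HasArrayTables || f.Age < minAge || len(f.Primitives) == 0 {
		return false
	}
	for _, v := range f.Primitives {
		if v != nil && !reflect.ValueOf(v).IsZero() {
			return false // at least one user-set value: leave the file alone
		}
	}
	return true
}

func main() {
	spam := fileFacts{Primitives: map[string]any{"router.prefer": "", "max_tokens": 0}, Age: 2 * minAge}
	mixed := fileFacts{Primitives: map[string]any{"router.prefer": "cloud", "max_tokens": 0}, Age: 2 * minAge}
	fmt.Println(isZeroSpam(spam), isZeroSpam(mixed))
}
```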
+
+### Phase 5 task list
+
+- **P5-1:** Add `isZeroSpam(*Config) bool` heuristic in
+  `internal/config/upgrade.go`.
+- **P5-2:** Wire from `Load()` post-merge: if `isZeroSpam` reports
+  true for the project layer, call `Upgrade` on the project file
+  and log via the banner.
+- **P5-3:** Add `[config].auto_migrate` toggle, default `true`.
+  Global configs are never auto-migrated; only project-level.
+- **P5-4:** Banner integration: the existing safety banner gets
+  a new optional line for "config notices" right under the
+  cwd/sensitivity summary.
+- **P5-5:** Tests: zero-spam project file gets migrated; mixed
+  project file is left alone; recently-modified file is left
+  alone; auto_migrate=false disables.
+
+---
+
+## Cross-cutting: schemas and resolution
+
+The pointer-field design (Phase 1) needs a clear resolution layer.
+Proposal: every Config section gets a `Resolved...Section` mirror
+that has plain (non-pointer) types. After Load, the resolver
+populates one from the other, substituting defaults for nils.
+
+Examples already exist in the codebase: `ResolvedSafetySection`
+mirrors `SafetySection`. The pattern is established; we just need
+to extend it.
+
+Consumer-side: code reads from `cfg.Resolved.X` not `cfg.X`.
+Loud renaming will catch any reader still using the raw layered
+struct.
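
A sketch of the raw/resolved split for one section follows. `ExampleSection`, its fields, and the default values are hypothetical; only the pattern (pointer fields in the layered struct, plain fields in the `Resolved...` mirror, defaults substituted for nil) comes from the text above.

```go
package main

import "fmt"

// ExampleSection is the raw, layered form: nil means "unset".
type ExampleSection struct {
	MaxKeep   *int `toml:"max_keep,omitempty"`
	MaxTokens *int `toml:"max_tokens,omitempty"`
}

// ResolvedExampleSection is the plain mirror that consumers read.
type ResolvedExampleSection struct {
	MaxKeep   int
	MaxTokens int
}

// exampleDefaults holds invented defaults for the sketch.
var exampleDefaults = ResolvedExampleSection{MaxKeep: 20, MaxTokens: 8192}

// orDefault dereferences p, or returns def when p is nil.
func orDefault[T any](p *T, def T) T {
	if p == nil {
		return def
	}
	return *p
}

// Resolve substitutes defaults for unset fields and keeps explicit values,
// including explicit zeros.
func (s ExampleSection) Resolve() ResolvedExampleSection {
	return ResolvedExampleSection{
		MaxKeep:   orDefault(s.MaxKeep, exampleDefaults.MaxKeep),
		MaxTokens: orDefault(s.MaxTokens, exampleDefaults.MaxTokens),
	}
}

func main() {
	zero := 0
	fmt.Printf("%+v\n", ExampleSection{}.Resolve())
	fmt.Printf("%+v\n", ExampleSection{MaxKeep: &zero}.Resolve())
}
```

Because the resolved struct has no pointers, consumer code reads plain values and never needs a nil check.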
+
+---
+
+## Risks
+
+- **Pointer-field migration is wide-scope.** Every reader of the
+  affected fields needs to change. Mitigated by the
+  resolver-mirror pattern (`ResolvedXSection`) — readers move from
+  one struct to another, but the call sites don't change shape.
+- **Auto-migration writes silently.** Users might be surprised
+  even with the banner notice. Mitigated by `.bak` preservation
+  and the heuristic only firing on files that are obviously
+  zero-spam.
+- **Registry becomes the same class of bug.** Documented in the
+  TODO entry already; Phase 2 explicitly requires atomic-write
+  and `omitempty` discipline. If we get this wrong the fix is the
+  same shape as Phase 1.
+- **Privacy.** The registry is a list of directories the user has
+  worked in. Local-only, opt-out toggle, README note required.
+- **Backwards compatibility for tests.** Tests that construct
+  `Config` by hand with explicit zeros may need updating.
+  Approach: add a `MustResolve` helper for test construction so
+  tests don't need to know about the pointer/resolver split.
+
+---
+
+## Rollout
+
+Phases 1 + 2 ship together as a single release (encoder fix
+needs the resolver, registry is independent but small). Tag as
+`v0.4.0` — schema-touching changes warrant a minor bump per
+the project's pre-1.0 semver discipline.
+
+Phase 3 (`gnoma doctor`) can ship in a `v0.4.x` patch — it's
+read-only and adds no surface compatibility risk.
+
+Phase 4 (`gnoma upgrade-config`) ships in a follow-up `v0.4.x`.
+
+Phase 5 (auto-migration) ships once Phase 4 has been in the wild
+for at least one release cycle, so users have a way to opt in /
+inspect before it becomes implicit.
+
+---
+
+## Open questions
+
+- Should `gnoma doctor` also check that the `quality.json` file
+  is well-formed? Same dir, different concern — probably belongs
+  in doctor's scope as the umbrella "diagnose my gnoma install"
+  command.
+- Registry size cap? After a year of usage on a busy machine
+  the file could grow to a few thousand entries. Reasonable; no
+  cap planned, but `Prune(staleAfter)` exposed for users who
+  want manual cleanup.
+- Profiles: how do profile configs interact with the doctor /
+  upgrade flow? Default: treat each profile file as its own
+  upgradeable unit. Doctor lists findings per-profile.
@@ -0,0 +1,278 @@
+# Sensitive Content — Unified Policy — 2026-05-24
+
+Promotes the "sensitive-content handling — unified policy" TODO
+entry into a phased design. Three input paths can introduce
+sensitive content into the conversation context — pasted images,
+pasted text, and tool-read files. Today each path has different
+defences; this plan unifies them behind a single policy with a
+single consent UI.
+
+Sibling concerns:
+[`2026-05-19-post-slm-unlock.md`](2026-05-19-post-slm-unlock.md)
+Phase F (entropy detection) and the outgoing-scan firewall
+already cover detection in some places; this plan unifies the
+*decision* layer that sits in front of them.
+
+---
+
+## Problem
+
+Three input paths to the engine carry distinct sensitivity
+risks; each is handled differently today.
+
+### Path 1 — Pasted images (Ctrl+V in the TUI)
+
+Screenshot might contain API keys, terminal output with creds,
+private repo contents, family photos, etc. Today:
+
+- Image bytes land in the user cache dir.
+- The router only sends to vision-capable arms.
+- Local arms are fine; cloud arms send full image content to
+  the provider.
+- Incognito skips paste entirely (per the no-persistence
+  contract).
+
+What's missing: at-paste preview / warning. The user often does
+not realise what the screenshot contained until after it's been
+sent.
+
+### Path 2 — Pasted text
+
+User pastes a chunk into the input composer. Could be a log
+snippet with credentials, an `.env` file content, an SSH key,
+or just text. Today:
+
+- Goes straight into the input buffer with no scanning.
+- Outgoing firewall scans the final composed message before
+  send — *after* the user has already pressed Enter, often
+  redacting silently in the background.
+- The user sees `[REDACTED]` in their own message after the
+  fact, no consent step.
+
+What's missing: at-paste detection so the user sees the warning
+*before* committing to send.
+
+### Path 3 — Tool-read files
+
+`fs_read`, `bash`, etc. surface file contents to the model. Today:
+
+- Outgoing firewall scans tool *results* before they reach the
+  next provider turn (`ScanToolResult`).
+- Format-aware entropy detection (Phase F-1) reduces false
+  positives on UUIDs / SHA / ISO timestamps.
+- The audit log (just shipped) records what got blocked /
+  redacted per session.
+
+What's missing: nothing structurally on this path; it's the
+most-mature of the three. Listed here only for completeness so
+the unified policy can be honest about asymmetric coverage.
+
+### The unification question
+
+These three paths converge into "content that joins the context
+window." A consistent policy needs to answer, for each path:
+
+1. **When** does detection run? (at paste / at send / at receive)
+2. **What** does the user see? (warning / preview / redacted
+   placeholder / silent)
+3. **What** is their consent gate? (approve / deny / approve-with-
+   redaction / skip)
+4. **Where** is the action recorded? (audit log, banner, slog)
+
+Today the answers vary per path. This plan picks one set of
+answers and applies them everywhere.
+
+---
+
+## Non-goals
+
+- **New detectors.** This plan reuses the existing scanner
+  (regex + entropy + unicode-sanitize). Phase F-2's SLM-assisted
+  detector lands separately when telemetry warrants.
+- **Egress allowlist.** Tracked in the security-boundary TODO
+  entry, separate plan.
+- **Provider-side redaction.** That's the provider's problem.
+  This plan is about what leaves gnoma's process.
+
+---
+
+## Approach
+
+Single policy module: `internal/security/sensitive_policy.go`.
+Exposes one decision function:
+
+```go
+type Decision int
+const (
+    DecisionAllow Decision = iota
+    DecisionWarn          // show warning, allow on confirm
+    DecisionRedactAndAllow
+    DecisionBlock
+)
+
+type Inspection struct {
+    Path       string          // "paste_text", "paste_image", "tool_result"
+    Content    string          // for text paths
+    ImageBytes []byte          // for image paths; nil otherwise
+    Matches    []scanner.Match // pre-scanned hits
+}
+
+func Decide(insp Inspection, mode IncognitoMode, prefs Preferences) Decision
+```
+
+All three paths route through `Decide` with their own
+`Inspection`. UI surface — the at-paste prompt, the at-send
+warning, the redacted-placeholder view — sits in the TUI and is
+driven by the Decision value.
+
+### Path-specific wiring
+
+| Path | When | UI | Default Decision rules |
+|---|---|---|---|
+| paste_text | Ctrl+V into composer | Inline warning under input box, with `Tab` to expand match details | Match in scanner → `Warn` (text stays, user dismisses); explicit block-tier match → `Block` (paste dropped) |
+| paste_image | Ctrl+V image | Pre-paste OCR scan (small local model) + warning before insertion | OCR finds secret pattern → `Warn`; user can choose `Redact` (image kept, warning attached) or `Cancel`. Incognito → `Block` (already the behaviour today). |
+| tool_result | After tool runs | Banner: `firewall: redacted N items in this tool result` | Existing behaviour. `Decide` invoked just to keep the API surface consistent; matches go to audit log. |
+
+### Preferences
+
+New `[security.sensitive]` config section:
+
+```toml
+[security.sensitive]
+warn_on_paste_text  = true   # default true
+warn_on_paste_image = true   # default true
+ocr_image_paste     = false  # opt-in: requires local vision arm
+auto_redact         = false  # default false: ask first, redact second
+silent_tool_results = false  # default false: show banner when redactions happen
+```
+
+### Incognito interaction
+
+When incognito is active, **every** Decision is treated as either
+`Block` or `RedactAndAllow` — never `Warn`-then-`Allow`. Incognito
+implies "I don't trust this conversation to persist"; the
+sensible default is to be strict about what flows in.
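
To make the decision matrix concrete, here is one possible reading of the rules above as code. It is a sketch under several assumptions: `Match` stands in for `scanner.Match` with a hypothetical `BlockTier` flag, incognito is a plain bool rather than `IncognitoMode`, block-tier matches block on every path, and content with no matches stays `Allow` even in incognito. None of these choices is settled by the plan.

```go
package main

import "fmt"

type Decision int

const (
	DecisionAllow Decision = iota
	DecisionWarn
	DecisionRedactAndAllow
	DecisionBlock
)

// Match is a stand-in for scanner.Match; BlockTier is a hypothetical flag.
type Match struct {
	Pattern   string
	BlockTier bool
}

type Inspection struct {
	Path    string // "paste_text", "paste_image", "tool_result"
	Matches []Match
}

type Preferences struct {
	WarnOnPasteText  bool
	WarnOnPasteImage bool
	AutoRedact       bool
}

// Decide applies the per-path defaults, then tightens the result in incognito.
func Decide(insp Inspection, incognito bool, prefs Preferences) Decision {
	if insp.Path == "paste_image" && incognito {
		return DecisionBlock // image paste is already refused in incognito
	}
	if len(insp.Matches) == 0 {
		return DecisionAllow
	}
	for _, m := range insp.Matches {
		if m.BlockTier {
			return DecisionBlock
		}
	}
	var d Decision
	switch {
	case insp.Path == "tool_result", prefs.AutoRedact:
		d = DecisionRedactAndAllow
	case insp.Path == "paste_text" && !prefs.WarnOnPasteText,
		insp.Path == "paste_image" && !prefs.WarnOnPasteImage:
		d = DecisionAllow
	default:
		d = DecisionWarn
	}
	if incognito {
		d = DecisionRedactAndAllow // never Warn-then-Allow in incognito
	}
	return d
}

func main() {
	prefs := Preferences{WarnOnPasteText: true, WarnOnPasteImage: true}
	secret := []Match{{Pattern: "example-secret"}}
	fmt.Println(Decide(Inspection{Path: "paste_text", Matches: secret}, false, prefs) == DecisionWarn)
	fmt.Println(Decide(Inspection{Path: "paste_text", Matches: secret}, true, prefs) == DecisionRedactAndAllow)
}
```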
+
+---
+
+## Phases
+
+### Phase A — Policy module + config
+
+- **A-1:** Add `[security.sensitive]` section to config.go with
+  the five flags above.
+- **A-2:** Add `internal/security/sensitive_policy.go` with
+  `Inspection`, `Decision`, `Decide`.
+- **A-3:** Unit tests for the decision matrix.
+
+### Phase B — Path 2 (pasted text)
+
+Highest user-visible payoff for the smallest surface.
+
+- **B-1:** TUI input composer intercepts paste, runs
+  `Decide(paste_text, ...)` before the bytes enter the buffer.
+- **B-2:** Decision = Warn → status-line warning, paste still
+  goes in. `Tab` expands details.
+- **B-3:** Decision = Block → paste discarded, status line
+  explains why; user can override with `Ctrl+Shift+V`
+  (force-paste) which bypasses but writes to audit log.
+- **B-4:** Tests: paste-of-known-secret triggers warning;
+  redacted variant shows what would have been sent.
+
+### Phase C — Path 3 (tool-results) banner
+
+- **C-1:** When `ScanToolResult` redacts ≥1 item, the engine
+  emits a system message: `firewall: redacted 2 items in
+  read-file output (see audit log)`.
+- **C-2:** Gated behind `silent_tool_results` (default `false`,
+  banner shown). Users who already trust the firewall can set it
+  to `true` to silence the banner.
+- **C-3:** Tests: integration test asserting the system
+  message appears.
+
+### Phase D — Path 1 (pasted images)
+
+Most complex. Image OCR requires a local vision model; without
+one the paste falls back to today's behaviour.
+
+- **D-1:** Add OCR hook: when `ocr_image_paste = true` and a
+  vision-capable local arm is available, run a small OCR pass
+  over the image before insertion.
+- **D-2:** Feed OCR output through the regex/entropy scanner.
+  Matches → `Decide(paste_image, ...)` with the original image
+  attached.
+- **D-3:** TUI shows a preview thumbnail + warning before
+  insertion confirmation.
+- **D-4:** Without a vision arm: feature degrades gracefully
+  (no OCR, paste proceeds as today, banner notes "image paste
+  scan unavailable — no local vision arm").
+
+### Phase E — Audit log integration
+
+All four Decision outcomes get an audit entry. The audit log
+already has the file format from the security-boundary work;
+just need to define new Action values:
+
+- `paste_warn`, `paste_block`, `paste_force_override`
+- `image_paste_warn`, `image_paste_block`, `image_paste_ocr_skip`
+- `tool_result_banner` (when redactions surfaced to user)
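
The action names above could be derived from the input path and the `Decision` with a small lookup, sketched here. The mapping is an assumption; in particular the list defines no action for `Allow`, or for `RedactAndAllow` on the paste paths, so the sketch reports those as unmapped rather than inventing names. `paste_force_override` and `image_paste_ocr_skip` are triggered by events rather than decisions and are left out.

```go
package main

import "fmt"

type Decision int

const (
	DecisionAllow Decision = iota
	DecisionWarn
	DecisionRedactAndAllow
	DecisionBlock
)

// actionFor returns the audit Action value for a path/decision pair.
// ok is false when the list above defines no name for the combination.
func actionFor(path string, d Decision) (action string, ok bool) {
	switch {
	case path == "paste_text" && d == DecisionWarn:
		return "paste_warn", true
	case path == "paste_text" && d == DecisionBlock:
		return "paste_block", true
	case path == "paste_image" && d == DecisionWarn:
		return "image_paste_warn", true
	case path == "paste_image" && d == DecisionBlock:
		return "image_paste_block", true
	case path == "tool_result" && d == DecisionRedactAndAllow:
		return "tool_result_banner", true
	}
	return "", false
}

func main() {
	for _, d := range []Decision{DecisionAllow, DecisionWarn, DecisionRedactAndAllow, DecisionBlock} {
		action, ok := actionFor("paste_text", d)
		fmt.Printf("decision %d -> %q (mapped: %v)\n", d, action, ok)
	}
}
```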
+
+---
+
+## Risks
+
+- **OCR adds latency to paste.** Bad UX if image OCR takes >300ms.
+  Mitigation: hard-cap OCR time at 500ms, skip if exceeded, fall
+  back to no-scan path with banner notice. Local vision models on
+  consumer hardware should comfortably make this budget.
+- **False positives on text paste become annoying.** If
+  `warn_on_paste_text = true` fires on every code snippet, users
+  turn it off and the protection is gone. Mitigation: reuse the
+  entropy_safelist that Phase F-1 ships (uuid/sha/iso8601/url);
+  those are the high-FP categories.
+- **OCR introduces a new attack surface.** A malicious image could
+  exploit the OCR model. Mitigation: only local-arm OCR (the
+  attacker's input never leaves the machine); never call cloud
+  vision models for OCR (would defeat the privacy purpose).
+- **Phase D depends on having a local vision model.** Users without
+  one get degraded UX. Document this clearly; consider whether to
+  ship a small bundled OCR-tuned model (probably no — adds 100MB+
+  to install).
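
The OCR time cap from the first risk could be enforced with a context deadline, as in the sketch below. `ocrFunc` is a hypothetical hook for whatever the local vision arm exposes, and `fastOCR` and `stuckOCR` are fakes used only to exercise both outcomes; the 500ms budget is the figure from the mitigation above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ocrBudget is the hard cap named in the mitigation.
const ocrBudget = 500 * time.Millisecond

// ocrFunc is a hypothetical hook into the local vision arm.
type ocrFunc func(ctx context.Context, image []byte) (string, error)

// scanPastedImage runs OCR under the budget. It returns the text and true
// when OCR finished in time, or "" and false so the caller can fall back to
// the no-scan path with a banner notice.
func scanPastedImage(image []byte, ocr ocrFunc) (string, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), ocrBudget)
	defer cancel()

	type result struct {
		text string
		err  error
	}
	done := make(chan result, 1) // buffered so a late OCR goroutine never blocks
	go func() {
		text, err := ocr(ctx, image)
		done <- result{text, err}
	}()

	select {
	case r := <-done:
		if r.err != nil {
			return "", false
		}
		return r.text, true
	case <-ctx.Done():
		return "", false
	}
}

// fastOCR is a fake that answers immediately.
func fastOCR(ctx context.Context, image []byte) (string, error) {
	return "api_key=example", nil
}

// stuckOCR is a fake that never finishes until the deadline cancels it.
func stuckOCR(ctx context.Context, image []byte) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

func main() {
	text, ok := scanPastedImage(nil, fastOCR)
	fmt.Printf("fast:  %q scanned=%v\n", text, ok)
	text, ok = scanPastedImage(nil, stuckOCR)
	fmt.Printf("stuck: %q scanned=%v\n", text, ok)
}
```

The OCR text returned on the fast path would then go through the regex/entropy scanner as described in D-2.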
+
+---
+
+## Open questions
+
+- Should there be a "trusted projects" list where the warnings
+  are suppressed? Could live in the project registry (sibling
+  plan). Useful for monorepos where the user explicitly trusts
+  the local code.
+- The `Ctrl+Shift+V` force-paste override is a footgun. Do we
+  want a confirm-second-time dialog, or just the keybind?
+- Should clipboard contents be cleared from the host clipboard
+  after a sensitive paste? Cross-platform-tricky; defer.
+- Sensitive-pattern feedback loop: when a user dismisses a warning
+  as "this isn't a secret", do we learn from that? Privacy concern
+  — would need an explicit opt-in.
+
+---
+
+## Rollout
+
+Phases A + B + C land together as one feature release. Phase D
+(image OCR) is opt-in (`ocr_image_paste = true`) and can land in
+a follow-up patch — its surface is large and benefits from real-
+world UX feedback. Phase E threads through all four; it lands
+incrementally per phase, not as a single batch.
+
+Realistic target: Phase A/B/C in v0.5.0; Phase D in v0.5.x. All
+behaviour is gated behind the five config flags so existing users
+who don't opt in see no behavioural change.
+
+---
+
+## Cross-references
+
+- TODO.md entry "Sensitive-content handling — unified policy"
+- [`2026-05-19-post-slm-unlock.md`](2026-05-19-post-slm-unlock.md) — Phase F entropy detection
+- [`2026-05-19-security-wave2-incognito.md`](2026-05-19-security-wave2-incognito.md) — incognito-mode contract
+- TODO.md entry "Security boundary — egress controls + session audit log" — the audit log this plan piggybacks on
@@ -15,7 +15,7 @@ require (
 	github.com/charmbracelet/x/ansi v0.11.6
 	github.com/openai/openai-go v1.12.0
 	github.com/pkoukk/tiktoken-go v0.1.8
-	golang.org/x/text v0.35.0
+	golang.org/x/text v0.37.0
 	google.golang.org/genai v1.52.1
 	gopkg.in/yaml.v3 v3.0.1
 	mvdan.cc/sh/v3 v3.13.0
@@ -63,10 +63,10 @@ require (
 	go.opentelemetry.io/otel v1.42.0 // indirect
 	go.opentelemetry.io/otel/metric v1.42.0 // indirect
 	go.opentelemetry.io/otel/trace v1.42.0 // indirect
-	golang.org/x/crypto v0.49.0 // indirect
-	golang.org/x/net v0.52.0 // indirect
+	golang.org/x/crypto v0.51.0 // indirect
+	golang.org/x/net v0.55.0 // indirect
 	golang.org/x/sync v0.20.0 // indirect
-	golang.org/x/sys v0.42.0 // indirect
+	golang.org/x/sys v0.45.0 // indirect
 	google.golang.org/api v0.267.0 // indirect
 	google.golang.org/genproto/googleapis/rpc v0.0.0-20260217215200-42d3e9bedb6d // indirect
 	google.golang.org/grpc v1.79.3 // indirect
@@ -142,18 +142,18 @@ go.opentelemetry.io/otel/sdk/metric v1.39.0 h1:cXMVVFVgsIf2YL6QkRF4Urbr/aMInf+2W
 go.opentelemetry.io/otel/sdk/metric v1.39.0/go.mod h1:xq9HEVH7qeX69/JnwEfp6fVq5wosJsY1mt4lLfYdVew=
 go.opentelemetry.io/otel/trace v1.42.0 h1:OUCgIPt+mzOnaUTpOQcBiM/PLQ/Op7oq6g4LenLmOYY=
 go.opentelemetry.io/otel/trace v1.42.0/go.mod h1:f3K9S+IFqnumBkKhRJMeaZeNk9epyhnCmQh/EysQCdc=
-golang.org/x/crypto v0.49.0 h1:+Ng2ULVvLHnJ/ZFEq4KdcDd/cfjrrjjNSXNzxg0Y4U4=
-golang.org/x/crypto v0.49.0/go.mod h1:ErX4dUh2UM+CFYiXZRTcMpEcN8b/1gxEuv3nODoYtCA=
+golang.org/x/crypto v0.51.0 h1:IBPXwPfKxY7cWQZ38ZCIRPI50YLeevDLlLnyC5wRGTI=
+golang.org/x/crypto v0.51.0/go.mod h1:8AdwkbraGNABw2kOX6YFPs3WM22XqI4EXEd8g+x7Oc8=
 golang.org/x/exp v0.0.0-20231006140011-7918f672742d h1:jtJma62tbqLibJ5sFQz8bKtEM8rJBtfilJ2qTU199MI=
 golang.org/x/exp v0.0.0-20231006140011-7918f672742d/go.mod h1:ldy0pHrwJyGW56pPQzzkH36rKxoZW1tw7ZJpeKx+hdo=
-golang.org/x/net v0.52.0 h1:He/TN1l0e4mmR3QqHMT2Xab3Aj3L9qjbhRm78/6jrW0=
-golang.org/x/net v0.52.0/go.mod h1:R1MAz7uMZxVMualyPXb+VaqGSa3LIaUqk0eEt3w36Sw=
+golang.org/x/net v0.55.0 h1:bcvxaJn3e1U6InsFWt1JUq1aSjnRxLzT2rtD2KfkDF8=
+golang.org/x/net v0.55.0/go.mod h1:L5U2KuzuOe1lY7Z+aWVIKK6qEeJXnXV9yzGA+WCHJww=
 golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4=
 golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0=
-golang.org/x/sys v0.42.0 h1:omrd2nAlyT5ESRdCLYdm3+fMfNFE/+Rf4bDIQImRJeo=
-golang.org/x/sys v0.42.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
-golang.org/x/text v0.35.0 h1:JOVx6vVDFokkpaq1AEptVzLTpDe9KGpj5tR4/X+ybL8=
-golang.org/x/text v0.35.0/go.mod h1:khi/HExzZJ2pGnjenulevKNX1W67CUy0AsXcNubPGCA=
+golang.org/x/sys v0.45.0 h1:dO4czNzziLiiXplLQgBCEpCvXQ3dnkn0SdaZSYdQ+FY=
+golang.org/x/sys v0.45.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
+golang.org/x/text v0.37.0 h1:Cqjiwd9eSg8e0QAkyCaQTNHFIIzWtidPahFWR83rTrc=
+golang.org/x/text v0.37.0/go.mod h1:a5sjxXGs9hsn/AJVwuElvCAo9v8QYLzvavO5z2PiM38=
 gonum.org/v1/gonum v0.16.0 h1:5+ul4Swaf3ESvrOnidPp4GZbzf0mxVQpDCYUQE7OJfk=
 gonum.org/v1/gonum v0.16.0/go.mod h1:fef3am4MQ93R2HHpKnLk4/Tbh/s0+wqD5nfa6Pnwy4E=
 google.golang.org/api v0.267.0 h1:w+vfWPMPYeRs8qH1aYYsFX68jMls5acWl/jocfLomwE=
@@ -157,6 +157,40 @@ type RouterSection struct {
 	// and incognito take priority over this knob. See
 	// docs/superpowers/plans/2026-05-23-prefer-routing-policy.md.
 	Prefer string `toml:"prefer"`
+
+	// Bandit exposes the selector's tuning knobs. Defaults preserve
+	// previous hard-coded behaviour exactly; only set these when you
+	// need to tune the EMA quality tracker for an unusual workload.
+	Bandit BanditSection `toml:"bandit"`
+}
+
+// BanditSection holds the scoring knobs for the EMA quality tracker
+// and the score blend used by the selector. Each field has a sentinel
+// zero value that means "use the built-in default" so an empty TOML
+// block is byte-identical to pre-config behaviour. See
+// internal/router/feedback.go and internal/router/selector.go for the
+// formulas these knobs feed into.
+type BanditSection struct {
+	// QualityAlpha is the EMA smoothing factor for arm-quality
+	// observations. Larger values weight recent observations more.
+	// Default: 0.3 (~3-sample memory). 0.0 here means "use default".
+	QualityAlpha float64 `toml:"quality_alpha"`
+
+	// MinObservations is the minimum number of samples required
+	// before observed EMA overrides the heuristic fallback. Default:
+	// 3. 0 here means "use default".
+	MinObservations int `toml:"min_observations"`
+
+	// ObservedWeight is the weight of the observed EMA in the
+	// observed/heuristic blend inside scoreArm: the final quality is
+	// `observed*W + heuristic*(1-W)`. Default: 0.7. 0.0 here means
+	// "use default".
+	ObservedWeight float64 `toml:"observed_weight"`
+
+	// StrengthBonus is the quality bonus added when an arm declares
+	// the current task type in its Strengths list. Default: 0.15.
+	// 0.0 here means "use default".
+	StrengthBonus float64 `toml:"strength_bonus"`
 }

 // MCPServerConfig defines an MCP server to start and connect to.
@@ -38,7 +38,7 @@ func TestTryLoadOAuthCredentials_Formats(t *testing.T) {
 			name: "camelCase and milliseconds expiry",
 			data: oauthCreds{
 				AccessToken2: "token-camel",
-				ExpiresAt:    time.Now().Add(1 * time.Hour).UnixNano() / 1e6,
+				ExpiresAt:    time.Now().Add(1*time.Hour).UnixNano() / 1e6,
 				TokenType2:   "Bearer",
 			},
 			expectError: false,
@@ -109,8 +109,19 @@ var knownAgents = []CLIAgent{
 		// structured-output flag and no image-input mechanism. JSON support
 		// is faked via PromptResponseFormat (best-effort, model-dependent);
 		// see TODO.md for tracking native stream-json support.
+		//
+		// ToolUse is false on purpose. agy streams plain text and the
+		// agyParser turns every line into an EventTextDelta — there is
+		// no path for a structured ToolCall event to come back. With
+		// ToolUse=true the router would dispatch tool-needing tasks
+		// (security_review, spawn_elfs, file edit) to agy; the
+		// underlying Gemini model would describe calling the tool in
+		// prose (invented UUIDs and "I will pause now"-style stubs),
+		// the engine would receive only text, and the turn would hang
+		// waiting for a tool call that never arrives. Flip back to
+		// true when native stream-json lands.
 		Capabilities: provider.Capabilities{
-			ToolUse:       true,
+			ToolUse:       false,
 			ContextWindow: 200000,
 		},
 		PromptResponseFormat: true,
@@ -57,12 +57,12 @@ func benchTasks() []Task {
 func BenchmarkSelectBest(b *testing.B) {
 	arms := benchArms()
 	tasks := benchTasks()
-	qt := NewQualityTracker()
+	qt := NewQualityTracker(0, 0)

 	b.ResetTimer()
 	for b.Loop() {
 		for _, task := range tasks {
-			selectBest(qt, arms, task, PreferAuto)
+			selectBest(qt, BanditParams{}, arms, task, PreferAuto)
 		}
 	}
 }
@@ -99,13 +99,13 @@ func BenchmarkRouterSelect(b *testing.B) {

 func BenchmarkScoreArm(b *testing.B) {
 	arms := benchArms()
-	qt := NewQualityTracker()
+	qt := NewQualityTracker(0, 0)
 	task := Task{Type: TaskGeneration, Priority: PriorityNormal, EstimatedTokens: 2000, RequiresTools: true, ComplexityScore: 0.5}

 	b.ResetTimer()
 	for b.Loop() {
 		for _, arm := range arms {
-			scoreArm(qt, arm, task)
+			scoreArm(qt, BanditParams{}, arm, task)
 		}
 	}
 }
@@ -338,10 +338,10 @@ func TestRoutingDefaults_PayoffScenario(t *testing.T) {
 	}

 	cases := []struct {
-		name       string
-		task       Task
-		wantArmID  ArmID
-		reason     string
+		name      string
+		task      Task
+		wantArmID ArmID
+		reason    string
 	}{
 		{
 			name:      "Generation picks qwen3-coder",
@@ -472,4 +472,3 @@ func TestRoutingDefaults_LocalFleetVisibility(t *testing.T) {
 		}
 	}
 }
-
@@ -2,9 +2,15 @@ package router

 import "sync"

+// Built-in defaults for the bandit knobs. Surfaced via
+// [router.bandit] config keys; see BanditParams in router.go. Kept
+// here so the QualityTracker has a sensible fallback when constructed
+// without explicit parameters (tests, ad-hoc callers).
 const (
-	qualityAlpha    = 0.3 // EMA smoothing factor (~3-sample memory)
-	minObservations = 3   // min samples before observed score overrides heuristic
+	defaultQualityAlpha    = 0.3 // EMA smoothing factor (~3-sample memory)
+	defaultMinObservations = 3   // min samples before observed score overrides heuristic
+	defaultObservedWeight  = 0.7 // weight of observed score in observed/heuristic blend
+	defaultStrengthBonus   = 0.15 // added to quality when an arm's Strengths match the task type
 )

 // EMAScore tracks an exponential moving average quality score.
@@ -19,13 +25,27 @@ type QualityTracker struct {
 	mu              sync.RWMutex
 	scores          map[ArmID]map[TaskType]*EMAScore
 	classifierCount map[ClassifierSource]int
+
+	// Configurable knobs — set via NewQualityTracker. Pass 0 for any
+	// argument to keep the built-in default.
+	alpha           float64
+	minObservations int
 }

-// NewQualityTracker returns an empty QualityTracker.
-func NewQualityTracker() *QualityTracker {
+// NewQualityTracker returns an empty QualityTracker. Pass 0 for any
+// argument to keep the built-in default (alpha=0.3, minObs=3).
+func NewQualityTracker(alpha float64, minObs int) *QualityTracker {
+	if alpha == 0 {
+		alpha = defaultQualityAlpha
+	}
+	if minObs == 0 {
+		minObs = defaultMinObservations
+	}
 	return &QualityTracker{
 		scores:          make(map[ArmID]map[TaskType]*EMAScore),
 		classifierCount: make(map[ClassifierSource]int),
+		alpha:           alpha,
+		minObservations: minObs,
 	}
 }

@@ -71,7 +91,7 @@ func (qt *QualityTracker) Record(armID ArmID, taskType TaskType, success bool) {
 	if s.Count == 0 {
 		s.Value = observation
 	} else {
-		s.Value = qualityAlpha*observation + (1-qualityAlpha)*s.Value
+		s.Value = qt.alpha*observation + (1-qt.alpha)*s.Value
 	}
 	s.Count++
 }
@@ -86,7 +106,7 @@ func (qt *QualityTracker) Quality(armID ArmID, taskType TaskType) (score float64
 		return 0, false
 	}
 	s, ok := m[taskType]
-	if !ok || s.Count < minObservations {
+	if !ok || s.Count < qt.minObservations {
 		return 0, false
 	}
 	return s.Value, true
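To make the effect of the now-configurable `alpha` concrete, here is a minimal standalone sketch of the EMA update from `Record`. It assumes a success is recorded as 1.0 and a failure as 0.0 (the observation mapping is not shown in this diff), and `emaUpdate` / `scoreAfter` are hypothetical helper names, not part of the package.

```go
package main

import "fmt"

// emaUpdate mirrors the update rule in Record: the first observation seeds
// the value, later observations are blended in with smoothing factor alpha.
func emaUpdate(value float64, count int, alpha, observation float64) float64 {
	if count == 0 {
		return observation
	}
	return alpha*observation + (1-alpha)*value
}

// scoreAfter replays a sequence of outcomes and returns the final EMA.
// Assumption: success maps to 1.0 and failure to 0.0.
func scoreAfter(alpha float64, outcomes []bool) float64 {
	value := 0.0
	for i, ok := range outcomes {
		obs := 0.0
		if ok {
			obs = 1.0
		}
		value = emaUpdate(value, i, alpha, obs)
	}
	return value
}

func main() {
	// Five successes followed by one failure.
	history := []bool{true, true, true, true, true, false}
	fmt.Printf("alpha=0.9: %.2f\n", scoreAfter(0.9, history))
	fmt.Printf("alpha=0.3: %.2f\n", scoreAfter(0.3, history))
}
```

Under that assumption a single failure after five successes pulls the score down to roughly 0.10 with alpha=0.9 but only to roughly 0.70 with the default 0.3, which is the ordering a higher alpha is expected to produce.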
@@ -8,7 +8,7 @@ import (
 )

 func TestQualityTracker_NoDataReturnsHeuristic(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	_, hasData := qt.Quality("arm:model", router.TaskGeneration)
 	if hasData {
 		t.Error("expected no data for unobserved arm")
@@ -16,7 +16,7 @@ func TestQualityTracker_NoDataReturnsHeuristic(t *testing.T) {
 }

 func TestQualityTracker_RecordUpdatesEMA(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	for i := 0; i < 3; i++ {
 		qt.Record("arm:model", router.TaskGeneration, true)
 	}
@@ -30,7 +30,7 @@ func TestQualityTracker_RecordUpdatesEMA(t *testing.T) {
 }

 func TestQualityTracker_AllFailuresLowScore(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	for i := 0; i < 5; i++ {
 		qt.Record("arm:model", router.TaskDebug, false)
 	}
@@ -41,7 +41,7 @@ func TestQualityTracker_AllFailuresLowScore(t *testing.T) {
 }

 func TestQualityTracker_ConcurrentSafe(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	done := make(chan struct{})
 	for i := 0; i < 10; i++ {
 		go func(success bool) {
@@ -113,3 +113,45 @@ func TestQualityTracker_InsufficientDataFallsBackToHeuristic(t *testing.T) {
 	}
 	decision.Rollback()
 }
+
+func TestQualityTracker_CustomAlphaShortensMemory(t *testing.T) {
+	// alpha=0.9 weights the latest sample heavily; after a single
+	// failure the score should drop further than with the default 0.3.
+	fast := router.NewQualityTracker(0.9, 0)
+	slow := router.NewQualityTracker(0.0, 0) // 0 → default 0.3
+
+	for _, qt := range []*router.QualityTracker{fast, slow} {
+		// Build up history at the high end with 5 successes.
+		for i := 0; i < 5; i++ {
+			qt.Record("arm:m", router.TaskGeneration, true)
+		}
+		// One failure.
+		qt.Record("arm:m", router.TaskGeneration, false)
+	}
+
+	fastScore, _ := fast.Quality("arm:m", router.TaskGeneration)
+	slowScore, _ := slow.Quality("arm:m", router.TaskGeneration)
+
+	if !(fastScore < slowScore) {
+		t.Errorf("expected fast alpha (0.9) to drop quality faster than default (0.3): fast=%f slow=%f", fastScore, slowScore)
+	}
+}
+
+func TestQualityTracker_CustomMinObservationsGatesScore(t *testing.T) {
+	// minObs=10 means Quality should return hasData=false until 10
+	// observations are recorded, even though the default would say
+	// "yes" after 3.
+	qt := router.NewQualityTracker(0, 10)
+	for i := 0; i < 5; i++ {
+		qt.Record("arm:m", router.TaskGeneration, true)
+	}
+	if _, hasData := qt.Quality("arm:m", router.TaskGeneration); hasData {
+		t.Error("expected hasData=false at 5 observations with minObs=10")
+	}
+	for i := 0; i < 5; i++ {
+		qt.Record("arm:m", router.TaskGeneration, true)
+	}
+	if _, hasData := qt.Quality("arm:m", router.TaskGeneration); !hasData {
+		t.Error("expected hasData=true after 10 observations with minObs=10")
+	}
+}
@@ -54,10 +54,10 @@ func TestPolicyMultiplier(t *testing.T) {
 	cloudArm := &Arm{IsLocal: false}

 	cases := []struct {
-		name    string
-		arm     *Arm
-		policy  PreferPolicy
-		want    float64
+		name   string
+		arm    *Arm
+		policy PreferPolicy
+		want   float64
 	}{
 		{"auto/local", localArm, PreferAuto, 1.0},
 		{"auto/cloud", cloudArm, PreferAuto, 1.0},
@@ -8,7 +8,7 @@ import (
 )

 func TestQualityTracker_SnapshotRestore_RoundTrip(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	// Record some outcomes
 	qt.Record("anthropic/claude-3-5-sonnet", router.TaskGeneration, true)
 	qt.Record("anthropic/claude-3-5-sonnet", router.TaskGeneration, true)
@@ -33,7 +33,7 @@ func TestQualityTracker_SnapshotRestore_RoundTrip(t *testing.T) {
 	}

 	// Restore into a fresh tracker
-	qt2 := router.NewQualityTracker()
+	qt2 := router.NewQualityTracker(0, 0)
 	qt2.Restore(restored)

 	// After restore, Quality() should return data (Count >= minObservations=3)
@@ -47,7 +47,7 @@ func TestQualityTracker_SnapshotRestore_RoundTrip(t *testing.T) {
 }

 func TestQualityTracker_Snapshot_Empty(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	snap := qt.Snapshot()
 	if snap.Scores == nil {
 		t.Error("scores map should be initialized (not nil)")
@@ -58,7 +58,7 @@ func TestQualityTracker_Snapshot_Empty(t *testing.T) {
 }

 func TestQualityTracker_ClassifierCounts_RecordAndSnapshot(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	qt.RecordClassifier(router.ClassifierHeuristic)
 	qt.RecordClassifier(router.ClassifierSLM)
 	qt.RecordClassifier(router.ClassifierSLM)
@@ -92,7 +92,7 @@ func TestQualityTracker_ClassifierCounts_RecordAndSnapshot(t *testing.T) {
 	if err := json.Unmarshal(data, &restored); err != nil {
 		t.Fatal(err)
 	}
-	qt2 := router.NewQualityTracker()
+	qt2 := router.NewQualityTracker(0, 0)
 	qt2.Restore(restored)
 	if qt2.ClassifierCounts()[router.ClassifierSLM] != 2 {
 		t.Errorf("restored slm count = %d, want 2", qt2.ClassifierCounts()[router.ClassifierSLM])
@@ -107,7 +107,7 @@ func TestQualityTracker_Restore_BackCompat_NoClassifierCounts(t *testing.T) {
 	if err := json.Unmarshal(legacy, &snap); err != nil {
 		t.Fatal(err)
 	}
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	qt.Restore(snap)
 	if qt.ClassifierCounts() == nil {
 		t.Error("ClassifierCounts() must return a non-nil map after restoring old snapshot")
@@ -122,7 +122,7 @@ func TestQualityTracker_Restore_BackCompat_NoClassifierCounts(t *testing.T) {
 }

 func TestQualityTracker_Restore_Replaces(t *testing.T) {
-	qt := router.NewQualityTracker()
+	qt := router.NewQualityTracker(0, 0)
 	qt.Record("arm-a", router.TaskDebug, true)
 	qt.Record("arm-a", router.TaskDebug, true)
 	qt.Record("arm-a", router.TaskDebug, true)
@@ -27,6 +27,7 @@ type Router struct {
 	preferPolicy PreferPolicy

 	quality *QualityTracker
+	bandit  BanditParams
 }

 // PreferPolicy biases the scoring step toward local or cloud arms.
@@ -77,6 +78,41 @@ func (p PreferPolicy) String() string {

 type Config struct {
 	Logger *slog.Logger
+	// Bandit tunes the selector's scoring knobs. Pass a zero value to
+	// keep all pre-config behaviour byte-identical; set individual
+	// fields to override the corresponding default.
+	Bandit BanditParams
+}
+
+// BanditParams controls the EMA quality tracker and score blend used
+// by the selector. A zero in any field means "use the default", so a
+// zero-valued BanditParams reproduces the pre-config hardcoded
+// constants (and an explicit 0 cannot be configured). Defaults are
+// applied by resolveBanditParams below.
+type BanditParams struct {
+	QualityAlpha    float64
+	MinObservations int
+	ObservedWeight  float64
+	StrengthBonus   float64
+}
+
+// resolveBanditParams fills in the built-in defaults for any field
+// left at its zero value. Centralised so the same defaults apply
+// across NewQualityTracker, scoreArm, and any future caller.
+func resolveBanditParams(p BanditParams) BanditParams {
+	if p.QualityAlpha == 0 {
+		p.QualityAlpha = defaultQualityAlpha
+	}
+	if p.MinObservations == 0 {
+		p.MinObservations = defaultMinObservations
+	}
+	if p.ObservedWeight == 0 {
+		p.ObservedWeight = defaultObservedWeight
+	}
+	if p.StrengthBonus == 0 {
+		p.StrengthBonus = defaultStrengthBonus
+	}
+	return p
 }

 func New(cfg Config) *Router {
@@ -84,10 +120,12 @@ func New(cfg Config) *Router {
 	if logger == nil {
 		logger = slog.Default()
 	}
+	params := resolveBanditParams(cfg.Bandit)
 	return &Router{
 		arms:    make(map[ArmID]*Arm),
 		logger:  logger,
-		quality: NewQualityTracker(),
+		quality: NewQualityTracker(params.QualityAlpha, params.MinObservations),
+		bandit:  params,
 	}
 }
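The zero-as-sentinel convention is easiest to see in isolation. The sketch below copies `BanditParams`, the default constants, and `resolveBanditParams` from this diff into a standalone program; only `main` is new, and it is illustrative rather than part of the change.

```go
package main

import "fmt"

// Defaults copied from the diff.
const (
	defaultQualityAlpha    = 0.3
	defaultMinObservations = 3
	defaultObservedWeight  = 0.7
	defaultStrengthBonus   = 0.15
)

// BanditParams mirrors the struct added in this commit.
type BanditParams struct {
	QualityAlpha    float64
	MinObservations int
	ObservedWeight  float64
	StrengthBonus   float64
}

// resolveBanditParams fills in the default for every field left at zero.
func resolveBanditParams(p BanditParams) BanditParams {
	if p.QualityAlpha == 0 {
		p.QualityAlpha = defaultQualityAlpha
	}
	if p.MinObservations == 0 {
		p.MinObservations = defaultMinObservations
	}
	if p.ObservedWeight == 0 {
		p.ObservedWeight = defaultObservedWeight
	}
	if p.StrengthBonus == 0 {
		p.StrengthBonus = defaultStrengthBonus
	}
	return p
}

func main() {
	// Override one knob; the other three fall back to defaults.
	partial := resolveBanditParams(BanditParams{ObservedWeight: 0.5})
	fmt.Printf("partial override: %+v\n", partial)

	// An explicit zero is indistinguishable from "unset", so the
	// strength bonus cannot be switched off this way.
	zeroBonus := resolveBanditParams(BanditParams{StrengthBonus: 0})
	fmt.Printf("explicit zero:    %+v\n", zeroBonus)
}
```

One consequence worth keeping in mind during review: with this design there is no way to configure a knob to exactly 0. If that ever becomes a requirement, the fields would need a different "unset" representation.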

@@ -172,7 +210,7 @@ func (r *Router) Select(task Task) RoutingDecision {
 	}

 	// Select best
-	best := selectBest(r.quality, feasible, task, r.preferPolicy)
+	best := selectBest(r.quality, r.bandit, feasible, task, r.preferPolicy)
 	if best == nil {
 		return RoutingDecision{Error: fmt.Errorf("selection failed")}
 	}
@@ -262,7 +262,7 @@ func TestSelectBest_PrefersToolSupport(t *testing.T) {
 	}

 	task := Task{Type: TaskGeneration, RequiresTools: true, Priority: PriorityNormal}
-	best := selectBest(nil, []*Arm{withoutTools, withTools}, task, PreferAuto)
+	best := selectBest(nil, BanditParams{}, []*Arm{withoutTools, withTools}, task, PreferAuto)

 	if best.ID != "a/with-tools" {
 		t.Errorf("should prefer arm with tool support, got %s", best.ID)
@@ -282,7 +282,7 @@ func TestSelectBest_PrefersThinkingForPlanning(t *testing.T) {
 	}

 	task := Task{Type: TaskPlanning, RequiresTools: true, Priority: PriorityNormal, EstimatedTokens: 5000}
-	best := selectBest(nil, []*Arm{noThinking, thinking}, task, PreferAuto)
+	best := selectBest(nil, BanditParams{}, []*Arm{noThinking, thinking}, task, PreferAuto)

 	if best.ID != "a/thinking" {
 		t.Errorf("should prefer thinking model for planning, got %s", best.ID)
@@ -625,7 +625,7 @@ func TestSelectBest_SmallArmWinsTrivialTask(t *testing.T) {
 		Capabilities:  provider.Capabilities{ToolUse: false},
 	}
 	task := Task{Type: TaskExplain, ComplexityScore: 0.05, RequiresTools: false}
-	got := selectBest(nil, []*Arm{cliArm, smallArm}, task, PreferAuto)
+	got := selectBest(nil, BanditParams{}, []*Arm{cliArm, smallArm}, task, PreferAuto)
 	if got != smallArm {
 		t.Errorf("selectBest = %v, want smallArm", got)
 	}
@@ -647,7 +647,7 @@ func TestSelectBest_CLIAgentWinsComplexTask(t *testing.T) {
 		Capabilities:  provider.Capabilities{ToolUse: false},
 	}
 	task := Task{Type: TaskRefactor, ComplexityScore: 0.7, RequiresTools: true}
-	got := selectBest(nil, []*Arm{cliArm, smallArm}, task, PreferAuto)
+	got := selectBest(nil, BanditParams{}, []*Arm{cliArm, smallArm}, task, PreferAuto)
 	if got != cliArm {
 		t.Errorf("selectBest = %v, want cliArm", got)
 	}
@@ -672,21 +672,21 @@ func TestSelectBest_TierPreference(t *testing.T) {
 	task := Task{Type: TaskGeneration, Priority: PriorityNormal, EstimatedTokens: 1000}

 	t.Run("CLI beats local and API", func(t *testing.T) {
-		best := selectBest(nil, []*Arm{apiArm, localArm, cliArm}, task, PreferAuto)
+		best := selectBest(nil, BanditParams{}, []*Arm{apiArm, localArm, cliArm}, task, PreferAuto)
 		if best.ID != "subprocess/claude" {
 			t.Errorf("want subprocess/claude (tier 0), got %s", best.ID)
 		}
 	})

 	t.Run("local beats API when no CLI", func(t *testing.T) {
-		best := selectBest(nil, []*Arm{apiArm, localArm}, task, PreferAuto)
+		best := selectBest(nil, BanditParams{}, []*Arm{apiArm, localArm}, task, PreferAuto)
 		if best.ID != "ollama/llama3" {
 			t.Errorf("want ollama/llama3 (tier 1), got %s", best.ID)
 		}
 	})

 	t.Run("API selected when only option", func(t *testing.T) {
-		best := selectBest(nil, []*Arm{apiArm}, task, PreferAuto)
+		best := selectBest(nil, BanditParams{}, []*Arm{apiArm}, task, PreferAuto)
 		if best == nil || best.ID != "mistral/mistral-large" {
 			t.Errorf("want mistral/mistral-large (tier 2), got %v", best)
 		}
@@ -98,7 +98,7 @@ func armBaseTier(arm *Arm, task Task) int {
 //
 // Step 2 (fallback): walk tiers low→high. Within a tier, highest-scoring
 // arm wins.
-func selectBest(qt *QualityTracker, arms []*Arm, task Task, prefer PreferPolicy) *Arm {
+func selectBest(qt *QualityTracker, params BanditParams, arms []*Arm, task Task, prefer PreferPolicy) *Arm {
 	if len(arms) == 0 {
 		return nil
 	}
@@ -110,7 +110,7 @@ func selectBest(qt *QualityTracker, arms []*Arm, task Task, prefer PreferPolicy)
 		}
 	}
 	if len(promoted) > 0 {
-		return bestScored(qt, promoted, task, prefer)
+		return bestScored(qt, params, promoted, task, prefer)
 	}

 	// Walk tiers low→high. armTier returns up to 5 when prefer is set
@@ -124,18 +124,18 @@ func selectBest(qt *QualityTracker, arms []*Arm, task Task, prefer PreferPolicy)
 			}
 		}
 		if len(inTier) > 0 {
-			return bestScored(qt, inTier, task, prefer)
+			return bestScored(qt, params, inTier, task, prefer)
 		}
 	}
 	return nil
 }

 // bestScored returns the highest-scoring arm within a set.
-func bestScored(qt *QualityTracker, arms []*Arm, task Task, prefer PreferPolicy) *Arm {
+func bestScored(qt *QualityTracker, params BanditParams, arms []*Arm, task Task, prefer PreferPolicy) *Arm {
 	var best *Arm
 	bestScore := math.Inf(-1)
 	for _, arm := range arms {
-		score := scoreArm(qt, arm, task) * policyMultiplier(arm, prefer)
+		score := scoreArm(qt, params, arm, task) * policyMultiplier(arm, prefer)
 		if score > bestScore {
 			bestScore = score
 			best = arm
@@ -172,13 +172,12 @@ func policyMultiplier(arm *Arm, p PreferPolicy) float64 {
 	}
 }

-// strengthScoreBonus is added to quality when an arm's Strengths list
-// matches the incoming task type. Tunable in one place.
-const strengthScoreBonus = 0.15
-
 // scoreArm computes a quality/cost score for an arm.
 // When the quality tracker has sufficient observations, blends observed EMA
-// (70%) with heuristic (30%). Falls back to pure heuristic otherwise.
+// (default 70%) with heuristic (default 30%). Falls back to pure heuristic
+// otherwise. The blend ratio and strength bonus are tunable via
+// BanditParams (config: [router.bandit]); a zero-valued params falls back
+// to the built-in defaults.
 //
 // Strengths add a fixed bonus to quality when matching task.Type. CostWeight
 // dampens the cost penalty linearly:
@@ -189,16 +188,17 @@ const strengthScoreBonus = 0.15
 // the original effectiveCost == cost. With CostWeight=0 cost is fully
 // ignored (effectiveCost = 1.0). Local arms with sub-1 raw costs are not
 // amplified by fractional weights (the linear formula stays monotone).
-func scoreArm(qt *QualityTracker, arm *Arm, task Task) float64 {
+func scoreArm(qt *QualityTracker, params BanditParams, arm *Arm, task Task) float64 {
+	params = resolveBanditParams(params)
 	hq := heuristicQuality(arm, task)
 	quality := hq
 	if qt != nil {
 		if observed, hasData := qt.Quality(arm.ID, task.Type); hasData {
-			quality = 0.7*observed + 0.3*hq
+			quality = params.ObservedWeight*observed + (1-params.ObservedWeight)*hq
 		}
 	}
 	if arm.HasStrength(task.Type) {
-		quality += strengthScoreBonus
+		quality += params.StrengthBonus
 	}
 	value := task.ValueScore()
 	rawCost := effectiveCost(arm, task)
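As a worked example of the blend, the sketch below reproduces only the quality portion of `scoreArm` with plain numbers. `blendQuality` is a hypothetical helper written for illustration; the real function goes on to apply value and cost terms that are not modelled here.

```go
package main

import "fmt"

// blendQuality follows the quality computation in scoreArm: start from the
// heuristic, blend in the observed EMA when the tracker has enough data,
// then add the strength bonus when the arm is tagged for the task type.
func blendQuality(observed, heuristic, observedWeight, strengthBonus float64, hasData, hasStrength bool) float64 {
	quality := heuristic
	if hasData {
		quality = observedWeight*observed + (1-observedWeight)*heuristic
	}
	if hasStrength {
		quality += strengthBonus
	}
	return quality
}

func main() {
	// Illustrative inputs: observed EMA 0.9, heuristic 0.5,
	// default weight 0.7 and default bonus 0.15.
	fmt.Printf("no data:         %.2f\n", blendQuality(0.9, 0.5, 0.7, 0.15, false, false))
	fmt.Printf("blended:         %.2f\n", blendQuality(0.9, 0.5, 0.7, 0.15, true, false))
	fmt.Printf("blended + bonus: %.2f\n", blendQuality(0.9, 0.5, 0.7, 0.15, true, true))
}
```

With the defaults, the blended quality is 0.7*0.9 + 0.3*0.5 = 0.78, and a matching strength raises it to 0.93. Lowering `ObservedWeight` moves the result toward the heuristic value of 0.5.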
@@ -65,17 +65,17 @@ func TestScoreArm_CostWeightAffectsArmComparison(t *testing.T) {

 	// CostWeight=1.0: cost dominates, cheap arm wins.
 	cheap.CostWeight, expensive.CostWeight = 1.0, 1.0
-	if scoreArm(nil, cheap, task) <= scoreArm(nil, expensive, task) {
+	if scoreArm(nil, BanditParams{}, cheap, task) <= scoreArm(nil, BanditParams{}, expensive, task) {
 		t.Errorf("CostWeight=1.0: cheap arm should beat expensive arm; cheap=%v expensive=%v",
-			scoreArm(nil, cheap, task), scoreArm(nil, expensive, task))
+			scoreArm(nil, BanditParams{}, cheap, task), scoreArm(nil, BanditParams{}, expensive, task))
 	}

 	// CostWeight=0.0: cost ignored, quality alone decides → expensive (better
 	// context window) wins.
 	cheap.CostWeight, expensive.CostWeight = 0.001, 0.001
-	if scoreArm(nil, expensive, task) <= scoreArm(nil, cheap, task) {
+	if scoreArm(nil, BanditParams{}, expensive, task) <= scoreArm(nil, BanditParams{}, cheap, task) {
 		t.Errorf("CostWeight~0: higher-quality expensive arm should beat cheap arm; expensive=%v cheap=%v",
-			scoreArm(nil, expensive, task), scoreArm(nil, cheap, task))
+			scoreArm(nil, BanditParams{}, expensive, task), scoreArm(nil, BanditParams{}, cheap, task))
 	}
 }

@@ -140,8 +140,8 @@ func TestScoreArm_StrengthBonus(t *testing.T) {
 	}
 	task := Task{Type: TaskSecurityReview, EstimatedTokens: 5000, RequiresTools: true, Priority: PriorityNormal}

-	a := scoreArm(nil, withoutStrength, task)
-	b := scoreArm(nil, withStrength, task)
+	a := scoreArm(nil, BanditParams{}, withoutStrength, task)
+	b := scoreArm(nil, BanditParams{}, withStrength, task)
 	if !(b > a) {
 		t.Errorf("strength-tagged arm score (%v) should exceed plain arm score (%v)", b, a)
 	}
@@ -160,8 +160,8 @@ func TestScoreArm_StrengthBonusDoesNotApplyToOtherTasks(t *testing.T) {
 	}
 	task := Task{Type: TaskDebug, EstimatedTokens: 5000, RequiresTools: true, Priority: PriorityNormal}

-	a := scoreArm(nil, plain, task)
-	b := scoreArm(nil, tagged, task)
+	a := scoreArm(nil, BanditParams{}, plain, task)
+	b := scoreArm(nil, BanditParams{}, tagged, task)
 	if math.Abs(a-b) > 1e-9 {
 		t.Errorf("non-matching task should ignore Strengths: plain=%v tagged=%v", a, b)
 	}
@@ -184,7 +184,7 @@ func TestSelectBest_StrengthPromotedArmBeatsCLIAgent(t *testing.T) {
 	}

 	task := Task{Type: TaskSecurityReview, EstimatedTokens: 5000, RequiresTools: true, Priority: PriorityNormal}
-	got := selectBest(nil, []*Arm{cliAgent, opus}, task, PreferAuto)
+	got := selectBest(nil, BanditParams{}, []*Arm{cliAgent, opus}, task, PreferAuto)
 	if got == nil {
 		t.Fatal("selectBest returned nil")
 	}
@@ -208,7 +208,7 @@ func TestSelectBest_EmptyStrengthsPreservesTierOrder(t *testing.T) {
 	}

 	task := Task{Type: TaskSecurityReview, EstimatedTokens: 5000, RequiresTools: true, Priority: PriorityNormal}
-	got := selectBest(nil, []*Arm{cliAgent, opus}, task, PreferAuto)
+	got := selectBest(nil, BanditParams{}, []*Arm{cliAgent, opus}, task, PreferAuto)
 	if got.ID != cliAgent.ID {
 		t.Errorf("without Strengths, CLI-agent tier-1 should win; got %s", got.ID)
 	}
@@ -327,7 +327,7 @@ func TestSelectBest_MultiplePromotedArmsBestQualityWins(t *testing.T) {
 		Strengths:    []TaskType{TaskSecurityReview},
 	}

-	qt := NewQualityTracker()
+	qt := NewQualityTracker(0, 0)
 	// armB has consistently succeeded — minObservations=3 is enough to flip
 	// the score blend.
 	for i := 0; i < 5; i++ {
@@ -339,7 +339,7 @@ func TestSelectBest_MultiplePromotedArmsBestQualityWins(t *testing.T) {
 	}

 	task := Task{Type: TaskSecurityReview, EstimatedTokens: 5000, RequiresTools: true, Priority: PriorityNormal}
-	got := selectBest(qt, []*Arm{armA, armB}, task, PreferAuto)
+	got := selectBest(qt, BanditParams{}, []*Arm{armA, armB}, task, PreferAuto)
 	if got == nil {
 		t.Fatal("selectBest returned nil")
 	}
@@ -10,16 +10,16 @@ import (
 // Caller passes whatever is known at launch time; empty fields are
 // omitted from the rendered banner.
 type SessionInfo struct {
-	Version       string // e.g. "0.2.1"
-	GitBranch     string // empty if not in a git repo
-	GitDirty      bool   // true if working tree has uncommitted changes
-	ProjectType   string // free-form, e.g. "Go module (somegit.dev/...)"
-	Provider      string // e.g. "ollama"
-	Model         string // e.g. "qwen3-coder:30b"
-	Permission    string // e.g. "auto", "accept_edits"
-	Incognito     bool
-	Prefer        string // "auto" / "local" / "cloud"
-	Tenant        string // optional, e.g. Kubernetes context name
+	Version     string // e.g. "0.2.1"
+	GitBranch   string // empty if not in a git repo
+	GitDirty    bool   // true if working tree has uncommitted changes
+	ProjectType string // free-form, e.g. "Go module (somegit.dev/...)"
+	Provider    string // e.g. "ollama"
+	Model       string // e.g. "qwen3-coder:30b"
+	Permission  string // e.g. "auto", "accept_edits"
+	Incognito   bool
+	Prefer      string // "auto" / "local" / "cloud"
+	Tenant      string // optional, e.g. Kubernetes context name
 }

 // RenderContextBanner returns the always-shown banner with cwd, git,
@@ -21,7 +21,7 @@ func TestScanCWDForSensitive_Matches(t *testing.T) {
 	}
 	// Non-sensitive control files.
 	control := []string{
-		".envrc",       // direnv config, not a credential
+		".envrc", // direnv config, not a credential
 		"main.go",
 		"README.md",
 		"secret_handler.go", // source code, not data
@@ -0,0 +1,121 @@
+package security
+
+import (
+	"encoding/json"
+	"log/slog"
+	"os"
+	"path/filepath"
+	"sync"
+	"time"
+)
+
+// AuditEvent records a single firewall action (block / redact / sanitize)
+// in a structured form intended for per-session post-mortem grepping.
+//
+// Discipline: this struct must never carry the raw bytes of any matched
+// secret. The Pattern field names the matcher (e.g. "anthropic_api_key",
+// "high_entropy"); TokenLen carries the length of the offending token so
+// the user can recognise it in a transcript without re-leaking it.
+type AuditEvent struct {
+	// Timestamp is the wall-clock time of the event in UTC.
+	Timestamp time.Time `json:"ts"`
+	// Action is one of: "block", "redact", "warn", "unicode_sanitize".
+	Action string `json:"action"`
+	// Pattern is the human-readable matcher name (regex tag or
+	// "high_entropy" / "unicode"). Never the matched bytes themselves.
+	Pattern string `json:"pattern,omitempty"`
+	// Source describes where in the data flow the event fired —
+	// "message_text", "tool_result", "tool_call_args",
+	// "system_prompt", etc.
+	Source string `json:"source,omitempty"`
+	// TokenLen is the length of the offending token (for unicode_sanitize,
+	// the byte-length change from sanitizing). Length only, never the bytes.
+	TokenLen int `json:"token_len,omitempty"`
+}
+
+// AuditLogger appends AuditEvent records to a per-session JSON Lines
+// file. Safe for concurrent use. Writes are skipped while incognito
+// mode is active so the no-persistence contract is honoured.
+//
+// A nil *AuditLogger is a valid no-op — callers can use the same
+// `audit.Record(...)` shape whether or not auditing is configured.
+type AuditLogger struct {
+	path      string
+	incognito *IncognitoMode
+	logger    *slog.Logger
+	mu        sync.Mutex
+}
+
+// AuditLoggerConfig controls how AuditLogger is constructed.
+type AuditLoggerConfig struct {
+	// Path is the full filesystem path to write JSONL events to.
+	// Parent directories are created lazily by the first Record that is not skipped.
+	Path string
+	// Incognito gates writes; when active, Record is a no-op.
+	// Optional — pass nil to always persist.
+	Incognito *IncognitoMode
+	// Logger receives one Warn per write failure so the user sees
+	// disk-full / permission errors instead of silently losing
+	// audit records. Defaults to slog.Default() when nil.
+	Logger *slog.Logger
+}
+
+// NewAuditLogger builds an AuditLogger. Pass an empty Path to disable
+// auditing (returns nil).
+func NewAuditLogger(cfg AuditLoggerConfig) *AuditLogger {
+	if cfg.Path == "" {
+		return nil
+	}
+	logger := cfg.Logger
+	if logger == nil {
+		logger = slog.Default()
+	}
+	return &AuditLogger{
+		path:      cfg.Path,
+		incognito: cfg.Incognito,
+		logger:    logger,
+	}
+}
+
+// Record appends an event to the audit log. Safe to call on a nil
+// receiver (no-op). Skipped silently when incognito is active.
+// Write failures are logged at Warn level but do not propagate to
+// the caller — auditing is best-effort and must not crash the
+// scanner pipeline.
+func (a *AuditLogger) Record(ev AuditEvent) {
+	if a == nil {
+		return
+	}
+	if a.incognito != nil && a.incognito.Active() {
+		return
+	}
+	if ev.Timestamp.IsZero() {
+		ev.Timestamp = time.Now().UTC()
+	}
+
+	a.mu.Lock()
+	defer a.mu.Unlock()
+
+	if err := os.MkdirAll(filepath.Dir(a.path), 0o700); err != nil {
+		a.logger.Warn("audit: mkdir failed", "path", a.path, "err", err)
+		return
+	}
+	f, err := os.OpenFile(a.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
+	if err != nil {
+		a.logger.Warn("audit: open failed", "path", a.path, "err", err)
+		return
+	}
+	defer f.Close()
+	if err := json.NewEncoder(f).Encode(ev); err != nil {
+		a.logger.Warn("audit: encode failed", "path", a.path, "err", err)
+	}
+}
+
+// Path returns the file path the logger writes to. Empty when the
+// logger is disabled (nil receiver returns "").
+func (a *AuditLogger) Path() string {
+	if a == nil {
+		return ""
+	}
+	return a.path
+}
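For reviewers who want to see what one audit record looks like on disk, here is a small standalone sketch. `auditEvent` is a local copy of the `AuditEvent` field layout and JSON tags from this file; `sampleEvent` and `encodeEvent` are hypothetical helpers, and the timestamp and values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// auditEvent copies the field layout and JSON tags of AuditEvent.
type auditEvent struct {
	Timestamp time.Time `json:"ts"`
	Action    string    `json:"action"`
	Pattern   string    `json:"pattern,omitempty"`
	Source    string    `json:"source,omitempty"`
	TokenLen  int       `json:"token_len,omitempty"`
}

// sampleEvent returns an illustrative redact event with a fixed timestamp.
func sampleEvent() auditEvent {
	return auditEvent{
		Timestamp: time.Date(2026, 5, 24, 20, 0, 0, 0, time.UTC),
		Action:    "redact",
		Pattern:   "anthropic_api_key",
		Source:    "tool_result",
		TokenLen:  51,
	}
}

// encodeEvent renders one event as a single JSON object. The real logger
// uses json.Encoder.Encode, which also appends a trailing newline.
func encodeEvent(ev auditEvent) (string, error) {
	b, err := json.Marshal(ev)
	return string(b), err
}

func main() {
	line, err := encodeEvent(sampleEvent())
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

Each record carries the matcher name and a length, never the matched bytes, so the file can be searched with line-oriented tools without re-exposing the secret.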
@@ -0,0 +1,139 @@
+package security
+
+import (
+	"bufio"
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+)
+
+func readAuditLines(t *testing.T, path string) []AuditEvent {
+	t.Helper()
+	f, err := os.Open(path)
+	if err != nil {
+		t.Fatalf("open audit log: %v", err)
+	}
+	defer f.Close()
+	var events []AuditEvent
+	sc := bufio.NewScanner(f)
+	for sc.Scan() {
+		var ev AuditEvent
+		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil {
+			t.Fatalf("decode line %q: %v", sc.Text(), err)
+		}
+		events = append(events, ev)
+	}
+	if err := sc.Err(); err != nil {
+		t.Fatalf("scan audit log: %v", err)
+	}
+	return events
+}
+
+func TestAuditLogger_NilReceiverIsNoop(t *testing.T) {
+	var a *AuditLogger
+	// Must not panic.
+	a.Record(AuditEvent{Action: "block"})
+}
+
+func TestAuditLogger_DisabledWhenPathEmpty(t *testing.T) {
+	a := NewAuditLogger(AuditLoggerConfig{})
+	if a != nil {
+		t.Errorf("expected nil logger for empty path, got %v", a)
+	}
+}
+
+func TestAuditLogger_AppendsJSONLines(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "audit.jsonl")
+	a := NewAuditLogger(AuditLoggerConfig{Path: path})
+	if a == nil {
+		t.Fatal("expected non-nil logger")
+	}
+
+	a.Record(AuditEvent{Action: "block", Pattern: "anthropic_api_key", Source: "tool_result", TokenLen: 51})
+	a.Record(AuditEvent{Action: "redact", Pattern: "high_entropy", Source: "message_text", TokenLen: 42})
+
+	events := readAuditLines(t, path)
+	if len(events) != 2 {
+		t.Fatalf("expected 2 events, got %d", len(events))
+	}
+	if events[0].Action != "block" || events[0].Pattern != "anthropic_api_key" {
+		t.Errorf("event 0 = %+v", events[0])
+	}
+	if events[0].Timestamp.IsZero() {
+		t.Error("event 0 missing timestamp")
+	}
+	if events[1].Action != "redact" || events[1].TokenLen != 42 {
+		t.Errorf("event 1 = %+v", events[1])
+	}
+}
+
+func TestAuditLogger_SkipsUnderIncognito(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "audit.jsonl")
+	incog := NewIncognitoMode()
+	a := NewAuditLogger(AuditLoggerConfig{Path: path, Incognito: incog})
+
+	incog.Activate()
+	a.Record(AuditEvent{Action: "block", Pattern: "x"})
+
+	if _, err := os.Stat(path); !os.IsNotExist(err) {
+		t.Errorf("expected audit file to not exist under incognito, got err=%v", err)
+	}
+
+	incog.Deactivate()
+	a.Record(AuditEvent{Action: "block", Pattern: "y"})
+
+	events := readAuditLines(t, path)
+	if len(events) != 1 {
+		t.Fatalf("expected 1 event after deactivate, got %d", len(events))
+	}
+	if events[0].Pattern != "y" {
+		t.Errorf("expected pattern=y (incognito event dropped), got %q", events[0].Pattern)
+	}
+}
+
+func TestAuditLogger_CreatesParentDir(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "deeply", "nested", "audit.jsonl")
+	a := NewAuditLogger(AuditLoggerConfig{Path: path})
+	a.Record(AuditEvent{Action: "block"})
+	if _, err := os.Stat(path); err != nil {
+		t.Errorf("expected audit file at %s, got err=%v", path, err)
+	}
+}
+
+func TestFirewall_RecordsRedactionToAudit(t *testing.T) {
+	dir := t.TempDir()
+	auditPath := filepath.Join(dir, "audit.jsonl")
+	audit := NewAuditLogger(AuditLoggerConfig{Path: auditPath})
+
+	fw := NewFirewall(FirewallConfig{
+		ScanOutgoing:    true,
+		ScanToolResults: true,
+		Audit:           audit,
+	})
+
+	// Anthropic key prefix is a built-in redact pattern; emit it
+	// through the tool-result scanning path.
+	cleaned := fw.ScanToolResult("here is the key sk-ant-abcdef1234567890abcdef1234567890abcdef")
+	if !strings.Contains(cleaned, "[REDACTED]") {
+		t.Errorf("expected [REDACTED] in cleaned content, got %q", cleaned)
+	}
+
+	events := readAuditLines(t, auditPath)
+	var sawAnthropicRedact bool
+	for _, ev := range events {
+		if ev.Action == "redact" && ev.Pattern == "anthropic_api_key" && ev.Source == "tool_result" {
+			sawAnthropicRedact = true
+			if ev.TokenLen == 0 {
+				t.Errorf("expected non-zero TokenLen on redact event, got %+v", ev)
+			}
+		}
+	}
+	if !sawAnthropicRedact {
+		t.Errorf("expected an anthropic_api_key redact event in audit log, got %+v", events)
+	}
+}
@@ -14,6 +14,7 @@ type Firewall struct {
 	scanner   *Scanner
 	incognito *IncognitoMode
 	logger    *slog.Logger
+	audit     *AuditLogger // optional; nil = no per-session audit log

 	// Config
 	scanOutgoing    bool
@@ -27,6 +28,11 @@ type FirewallConfig struct {
 	EntropyThreshold  float64
 	EntropySafelist   []string
 	Logger            *slog.Logger
+	// Audit is the optional per-session audit logger. It may be left
+	// nil here and attached later via SetAudit, since the firewall is
+	// typically constructed before the session ID is generated.
+	// nil is safe; auditing simply turns into a no-op.
+	Audit *AuditLogger
 }

 func NewFirewall(cfg FirewallConfig) *Firewall {
@@ -50,11 +56,20 @@ func NewFirewall(cfg FirewallConfig) *Firewall {
 		scanner:         scanner,
 		incognito:       NewIncognitoMode(),
 		logger:          logger,
+		audit:           cfg.Audit,
 		scanOutgoing:    cfg.ScanOutgoing,
 		scanToolResults: cfg.ScanToolResults,
 	}
 }

+// SetAudit attaches an AuditLogger after construction. The firewall
+// is typically built before the session ID exists, so callers usually
+// construct the AuditLogger later and inject it via this setter.
+// Pass nil to disable auditing.
+func (f *Firewall) SetAudit(a *AuditLogger) {
+	f.audit = a
+}
+
 // Incognito returns the incognito mode controller.
 func (f *Firewall) Incognito() *IncognitoMode {
 	return f.incognito
@@ -131,7 +146,16 @@ func (f *Firewall) scanMessage(m message.Message) message.Message {

 func (f *Firewall) scanAndRedact(content, source string) string {
 	// Unicode sanitization first
+	originalLen := len(content)
 	content = SanitizeUnicode(content)
+	if delta := originalLen - len(content); delta != 0 {
+		f.audit.Record(AuditEvent{
+			Action:   "unicode_sanitize",
+			Pattern:  "unicode",
+			Source:   source,
+			TokenLen: delta,
+		})
+	}

 	// Secret scanning
 	matches := f.scanner.Scan(content)
@@ -146,6 +170,12 @@ func (f *Firewall) scanAndRedact(content, source string) string {
 				"pattern", m.Pattern,
 				"source", source,
 			)
+			f.audit.Record(AuditEvent{
+				Action:   "block",
+				Pattern:  m.Pattern,
+				Source:   source,
+				TokenLen: m.End - m.Start,
+			})
 			return "[BLOCKED: content contained a secret]"
 		default:
 			f.logger.Debug("secret redacted",
@@ -153,6 +183,12 @@ func (f *Firewall) scanAndRedact(content, source string) string {
 				"action", m.Action,
 				"source", source,
 			)
+			f.audit.Record(AuditEvent{
+				Action:   string(m.Action),
+				Pattern:  m.Pattern,
+				Source:   source,
+				TokenLen: m.End - m.Start,
+			})
 		}
 	}

@@ -1403,6 +1403,28 @@ func (m Model) handleCommand(cmd string) (tea.Model, tea.Cmd) {
 		m.injectSystemContext(msg)
 		return m, nil

+	case "/router":
+		if m.config.Router == nil {
+			m.messages = append(m.messages, chatMessage{role: "error", content: "router not configured"})
+			return m, nil
+		}
+		if args == "" || args == "help" {
+			current := m.config.Router.PreferPolicy().String()
+			m.messages = append(m.messages, chatMessage{role: "system",
+				content: fmt.Sprintf("router.prefer = %s\nUsage: /router <auto|local|cloud>\n  auto  — no bias; tier order + Strengths decide\n  local — cloud arms demoted; locals win when feasible\n  cloud — local arms demoted; cloud arms win (except tier-0 SLM)", current)})
+			return m, nil
+		}
+		policy, err := router.ParsePreferPolicy(args)
+		if err != nil {
+			m.messages = append(m.messages, chatMessage{role: "error", content: err.Error()})
+			return m, nil
+		}
+		m.config.Router.SetPreferPolicy(policy)
+		msg := fmt.Sprintf("router.prefer = %s (runtime override; not written to config)", policy.String())
+		m.messages = append(m.messages, chatMessage{role: "system", content: msg})
+		m.injectSystemContext(msg)
+		return m, nil
+
 	case "/profile":
 		if args == "" {
 			m = m.closeAllPickers()
@@ -1532,7 +1554,7 @@ func (m Model) handleCommand(cmd string) (tea.Model, tea.Cmd) {
 			return m, nil
 		}
 		m.messages = append(m.messages, chatMessage{role: "system",
-			content: "Commands:\n  /init               generate or update AGENTS.md project docs\n  /clear, /new        clear chat and start new conversation\n  /config             show current config\n  /incognito          toggle incognito (Ctrl+X)\n  /keys               show keyboard shortcuts\n  /model [name]       list/switch models\n  /permission [mode]  set permission mode (Shift+Tab to cycle)\n  /plugins            list installed plugins\n  /profile [name]     list profiles / switch (re-execs gnoma)\n  /provider           show current provider\n  /replay             scroll to top to re-read conversation\n  /resume [id]        list or restore saved sessions\n  /shell [cmd]        open interactive shell (or run cmd in shell)\n  /skills             list loaded skills\n  /usage              show token usage and cost\n  /help               show this help\n  /quit               exit gnoma\n\nSkills (use /<name> [args] to invoke):\n  Add .md files with YAML front matter to .gnoma/skills/ or ~/.config/gnoma/skills/"})
+			content: "Commands:\n  /init               generate or update AGENTS.md project docs\n  /clear, /new        clear chat and start new conversation\n  /config             show current config\n  /incognito          toggle incognito (Ctrl+X)\n  /keys               show keyboard shortcuts\n  /model [name]       list/switch models\n  /permission [mode]  set permission mode (Shift+Tab to cycle)\n  /plugins            list installed plugins\n  /profile [name]     list profiles / switch (re-execs gnoma)\n  /provider           show current provider\n  /replay             scroll to top to re-read conversation\n  /resume [id]        list or restore saved sessions\n  /router [mode]      show or set routing preference (auto/local/cloud)\n  /shell [cmd]        open interactive shell (or run cmd in shell)\n  /skills             list loaded skills\n  /usage              show token usage and cost\n  /help               show this help\n  /quit               exit gnoma\n\nSkills (use /<name> [args] to invoke):\n  Add .md files with YAML front matter to .gnoma/skills/ or ~/.config/gnoma/skills/"})
 		return m, nil

 	case "/keys":
@@ -22,7 +22,10 @@ var builtinCommands = []cmdEntry{
 	{"/exit", "exit gnoma"},
 	{"/help", "show available commands and shortcuts"},
 	{"/incognito", "toggle incognito mode (no persistence, local-only routing)"},
-	{"/init", "initialize project — create AGENTS.md"},
+	// /init is provided by the bundled skill at
+	// internal/skill/skills/init.md; do not duplicate it here. The dedup
+	// in completionSource() would skip a duplicate entry anyway, but
+	// omitting it keeps the source-of-truth single.
 	{"/keys", "show keyboard shortcuts"},
 	{"/model", "list or switch active model"},
 	{"/new", "start a new conversation"},
@@ -34,6 +37,7 @@ var builtinCommands = []cmdEntry{
 	{"/quit", "quit gnoma"},
 	{"/replay", "replay last assistant response"},
 	{"/resume", "browse and resume a saved session"},
+	{"/router", "show or set routing preference (auto/local/cloud)"},
 	{"/shell", "open interactive shell"},
 	{"/theme", "list themes or set active theme"},
 	{"/skills", "list available skills"},
@@ -46,11 +50,27 @@ var permissionModes = []string{
 	"auto", "default", "accept_edits", "bypass", "deny", "plan",
 }

-// completionSource builds a sorted command list from builtins + skills.
-func completionSource(skills *skill.Registry) []cmdEntry {
-	entries := make([]cmdEntry, len(builtinCommands))
-	copy(entries, builtinCommands)
+// routerPreferModes lists valid values for /router completion.
+var routerPreferModes = []string{"auto", "local", "cloud"}

+// completionSource builds a sorted command list from builtins + skills.
+// Skill names shadow builtin names so a skill (bundled or user-defined)
+// can replace a static entry without producing a duplicate in the picker.
+func completionSource(skills *skill.Registry) []cmdEntry {
+	skillNames := make(map[string]struct{})
+	if skills != nil {
+		for _, s := range skills.All() {
+			skillNames["/"+s.Frontmatter.Name] = struct{}{}
+		}
+	}
+
+	entries := make([]cmdEntry, 0, len(builtinCommands)+len(skillNames))
+	for _, c := range builtinCommands {
+		if _, shadowed := skillNames[c.name]; shadowed {
+			continue
+		}
+		entries = append(entries, c)
+	}
 	if skills != nil {
 		for _, s := range skills.All() {
 			desc := s.Frontmatter.Description
@@ -150,6 +170,16 @@ func matchArgCompletion(input string, profileNames []string, providerNames []str
 				return cmd + " " + mode
 			}
 		}
+	case "/router":
+		if arg == "" {
+			return ""
+		}
+		lower := strings.ToLower(arg)
+		for _, mode := range routerPreferModes {
+			if strings.HasPrefix(mode, lower) && mode != arg {
+				return cmd + " " + mode
+			}
+		}
 	case "/profile":
 		if arg == "" || len(profileNames) == 0 {
 			return ""
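The shadowing rule that completionSource() applies in the diff above (a skill with the same name replaces the builtin entry, so the picker never shows a duplicate) can be sketched in isolation. This is a minimal sketch, not the repository's code: `entry` and `mergeShadowed` are hypothetical names, and plain slices stand in for `skill.Registry`.

```go
package main

import "sort"

// entry is a hypothetical stand-in for cmdEntry (name + description).
type entry struct {
	name string
	desc string
}

// mergeShadowed returns builtins plus skills, sorted by name. A builtin whose
// name matches a skill is dropped, so the skill's entry is the only one kept.
func mergeShadowed(builtins, skills []entry) []entry {
	// Collect skill names first so the builtin pass can test membership.
	shadow := make(map[string]struct{}, len(skills))
	for _, s := range skills {
		shadow[s.name] = struct{}{}
	}

	out := make([]entry, 0, len(builtins)+len(skills))
	for _, b := range builtins {
		if _, shadowed := shadow[b.name]; shadowed {
			continue // the skill provides the canonical entry
		}
		out = append(out, b)
	}
	out = append(out, skills...)

	sort.Slice(out, func(i, j int) bool { return out[i].name < out[j].name })
	return out
}
```

Building the name set before walking the builtins keeps the merge linear in the number of entries, and it means a user-defined skill can override a builtin's description without any special casing.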
Author	SHA1	Message	Date
vikingowl	fa65a68728	docs(plans): config-migration and sensitive-content-policy Promotes two TODO entries into phased plan docs and links them from the TODO bullets. config-migration plan covers the silent layered-config corruption chain (encoder zero-spam -> reader overwrite -> wrong effective values) and its remediation across five phases: encoder fix (omitempty + pointer-numeric hybrid), project registry, gnoma doctor, gnoma upgrade-config, and auto-migration on startup with banner notice. sensitive-content-policy plan unifies three input paths (pasted text, pasted images, tool-read files) behind one decision API with consistent UI surface and audit-log integration. Phases A-E sequence the work from highest-leverage (text paste) to most complex (image OCR with local vision arm). Neither plan starts implementation in this commit — they exist to make the design decisions explicit so the eventual code can be reviewed against a written intent rather than a TODO bullet.	2026-05-24 22:51:33 +02:00
vikingowl	8b9bdc2978	feat(security): per-session firewall audit log New AuditLogger writes one JSON line per firewall action to <projectRoot>/.gnoma/sessions/<sessionID>/audit.jsonl so a user can grep 'what did the firewall do this session?' after the fact. Records 'block', 'redact', 'warn', and 'unicode_sanitize' events with the matcher name, source (tool_result / message_text / etc.), and token length. Discipline: never the bytes themselves — only the matcher name and the length, matching the README's scope-note promise about audit data. Plumbing: - Firewall gains an audit *AuditLogger field plus SetAudit setter. The firewall is constructed before the session ID exists, so the audit logger is wired post-hoc once main.go has the sessionID. - Honours incognito: Record is a silent no-op when the firewall's IncognitoMode is active, preserving the no-persistence contract. - Tolerant of fs errors: mkdir / open / encode failures log a Warn but never propagate; the scan pipeline must not depend on audit succeeding. - Nil receiver is a valid no-op so callers don't need nil-guards around every Record. Tracks 'Security boundary — per-session audit log' from the v0.3.0 r/SideProject launch thread (u/Secret_Theme3192, 2026-05-24). Per-host egress allowlist remains separately tracked pending the commenter's reply on host-level vs per-tool semantics.	2026-05-24 22:47:28 +02:00
vikingowl	eea26a262e	feat(router): surface bandit knobs as [router.bandit] config Four hardcoded constants in the selector and feedback tracker are now user-tunable via [router.bandit]: - quality_alpha (EMA smoothing, default 0.3) - min_observations (samples before observed overrides heuristic, default 3) - observed_weight (observed/heuristic blend ratio, default 0.7) - strength_bonus (quality bonus for Strengths-tagged arms, default 0.15) Each field treats 0 as 'use default', so an empty TOML block is byte-identical to pre-config behaviour. BanditParams is plumbed via router.Config{Bandit: ...} and resolveBanditParams() centralises the fallback so every call site shares the same defaults. QualityTracker, scoreArm, bestScored, and selectBest signatures now take the configured values directly rather than reaching for package- level constants. Tests updated to pass BanditParams{} (defaults) or explicit overrides where they validate the new tuning paths. Tracks item #3 from the 'Bandit selector — design decisions deferred' TODO entry — ships independently of the EMA vs SLM strategic decision.	2026-05-24 22:42:34 +02:00
vikingowl	352cab4a94	docs(todo): extend config-migration plan with project registry Adds item #5 to the config write/merge corruption entry: ~/.config/gnoma/projects.json tracking which directories gnoma has been launched in. Enables doctor --all-projects, cross-project session listing, and one-shot upgrade-config across all known projects. Documents the design constraints: must use the same omitempty / atomic-write discipline as the encoder fix to avoid recreating the class of bug it exists to help solve. Privacy footprint flagged (local-only directory log; opt-out toggle). Stale-entry handling gated through doctor, not auto-prune.	2026-05-24 22:29:56 +02:00
vikingowl	58f4001917	docs(todo): track config write/merge corruption + doctor/upgrade design setConfig() serializes the entire Config struct on every key change, which writes zero-valued fields into the file. On the next load those explicit zeros override higher-priority layers via toml.Decode's present-beats-absent semantics. Concrete symptom today: a global prefer = 'cloud' was silently shadowed by a project prefer = ''. Captures the multi-part fix surface so it doesn't get half-done: - Stop generating zero-spam (omitempty hybrid or pelletier swap). - gnoma doctor: read-only diagnostic (zero-spam, invalid enums, removed keys, effective-merged values). - gnoma upgrade-config: active migration with .bak backup + diff. - Auto-migrate project-level on startup with TUI banner notice; global stays explicit.	2026-05-24 22:24:59 +02:00
vikingowl	6c5e969217	feat(tui): add /router command for runtime routing-preference switch Mirrors the pattern of /permission: bare command shows the current value plus a help line; with an argument (auto/local/cloud) it calls Router.SetPreferPolicy and emits a system message. Session-only — does not write back to config.toml, matching /permission and Ctrl+X incognito-toggle conventions. Tab completion on the value via routerPreferModes alongside the existing permissionModes pattern. Help text updated. Status-bar indicator deferred (separate concern if it turns out to be wanted).	2026-05-24 22:13:27 +02:00
vikingowl	74bd570438	fix(tui): de-dupe /init in command picker; skill names shadow builtins /init appeared twice in the completion picker — once from the static builtinCommands list and once from the bundled init skill at internal/skill/skills/init.md (registered via skills.All()). Two changes: - Remove /init from builtinCommands. The skill provides the canonical entry, and its description ('Generate or update AGENTS.md project documentation') is more accurate than the static one ('initialize project — create AGENTS.md') because the skill handles both create and update. - Refactor completionSource() so a skill name silently shadows any builtin with the same name. Prevents this from recurring if a future builtin migrates to a skill, and lets users override a builtin's description by dropping a skill of the same name into .gnoma/skills/.	2026-05-24 22:08:46 +02:00
vikingowl	d38d7daf25	fix(subprocess/agy): disable ToolUse until stream-json lands agy is registered with FormatAgyText and the agyParser emits every stdout line as a plain EventTextDelta. There is no path for a structured ToolCall event to come back. With ToolUse=true the router would dispatch tool-needing tasks (security_review, spawn_elfs, file edit) to agy; the underlying Gemini model would describe calling the tool in prose — invented UUIDs and 'I will pause now'-style stubs — the engine would receive only text, and the turn would hang waiting for a tool call that never arrives. Surfaced when /init routed to agy for a security_review task and elf spawning visibly hallucinated in the TUI. Capability flag flipped to false; agy stays usable for tool-free prompts (explain, summarize, simple chat). TODO entry for native stream-json updated to flag that the capability flip is part of that same change.	2026-05-24 21:58:22 +02:00
vikingowl	06d4069076	ci: pin GoReleaser to the triggering tag, fix tag-collision regression When v0.3.1 was tagged on the same commit as v0.3.1-rc2, the release workflow built and tried to publish rc2 artifacts instead of v0.3.1, failing with 'already_exists' on every asset upload. Root cause: goreleaser-action@v6 + 'version: latest' (locked to v2.x) falls back to 'git describe --tags' for the current tag, which picked v0.3.1-rc2 over v0.3.1 when both refs pointed at HEAD. Explicitly setting GORELEASER_CURRENT_TAG = github.ref_name forces the workflow to use the tag that triggered it, regardless of other refs at the same commit.	2026-05-24 17:36:01 +02:00
vikingowl	f641bd4971	docs(todo): track bandit selector design questions Two related items surfaced from the r/coolgithubprojects v0.3.1 launch thread. Bundled because they share the selector code: 1. Whether to keep numeric EMA at all post-SLM dispatcher (open strategic question from the 2026-05-07 roadmap — not a must-implement). 2. Surfacing hardcoded selector knobs (qualityAlpha, blend ratio, strength bonus, quality floor) as [router.bandit] config keys — ships independently of #1.	2026-05-24 17:34:13 +02:00
vikingowl	798f2ab3c3	fix(release): prerelease auto-detect; changelog excludes scoped conventional commits Two polish issues surfaced by the v0.3.1-rc1 pipeline test: - The release was tagged v0.3.1-rc1 but published without the prerelease flag, so it appeared alongside stable releases. Add 'prerelease: auto' to release.github so GoReleaser marks any tag with a semver prerelease suffix (-rc, -beta, -alpha, -pre) appropriately. - The changelog filters used '^docs:' patterns that only match bare conventional commits. Scoped variants like 'docs(readme):' and 'chore(make):' slipped through into the published changelog. Switch to '^docs[:(]' style patterns to match both forms, and add '^style[:(]' so gofmt-drift commits are excluded too.	2026-05-24 17:05:49 +02:00
vikingowl	9814795b3c	ci: migrate release pipeline from Woodpecker to GitHub Actions Drop the broken .woodpecker/release.yml (top-level when: triggered an 'error' status on every dev push instead of skipping non-tag events) and replace with .github/workflows/release.yml driving the same GoReleaser flow. Rationale: - Release artifacts already land on GitHub (releases + ghcr.io), so running the pipeline on GitHub eliminates a build hop. - GH Actions auto-provides GITHUB_TOKEN with packages:write via the workflow permissions block — no PAT plumbing or login secrets. - docker/setup-qemu-action and docker/setup-buildx-action handle the multi-arch cross-build setup that Woodpecker would require manual host configuration for. Trigger: any tag matching refs/tags/v*. Mirror sync from somegit.dev propagates tags to GitHub, so 'git push origin v0.3.1' on the canonical remote still drives the GitHub-side release.	2026-05-24 16:45:17 +02:00
vikingowl	047924da2b	ci(woodpecker): release pipeline on vX.Y.Z tag Runs 'go test ./...' then 'goreleaser release --clean' inside the official goreleaser image when a tag matching refs/tags/v* is pushed. GITHUB_TOKEN comes from the 'github_token' repo secret (needs repo + write:packages scopes) and is reused for ghcr.io docker login so the multi-arch image build can push. Runner requirements documented inline: docker socket access plus QEMU registered on the host (tonistiigi/binfmt --install all) for arm64 cross-builds. Directory form chosen so a non-release CI pipeline can land later under .woodpecker/ci.yml without restructuring.	2026-05-24 16:38:24 +02:00
vikingowl	a23eb6b92c	style: gofmt drift from prior commits Pure whitespace cleanup surfaced when 'make check' ran gofmt over the tree. Mostly struct-field column alignment in internal/safety/banner.go (SessionInfo) and the var(...) flag block in cmd/gnoma/main.go after --dangerously-allow-anywhere was added without realignment. Verified zero substantive changes via 'git diff --ignore-all-space --ignore-blank-lines'.	2026-05-24 16:33:17 +02:00
vikingowl	0981fb82d6	chore(make): add govulncheck and semgrep to 'make check' Both checks already passed locally on the current dev tip; wiring them into the canonical pre-commit gate so security regressions fail fast instead of leaking into a release. - 'make vuln' runs govulncheck with reachability analysis against the Go vuln DB. - 'make sec' runs semgrep with p/golang + p/security-audit, metrics off, --error so findings exit non-zero. Tools must be installed locally (commands in Makefile comments). If upstream Woodpecker CI runs 'make check', it will need both binaries on the runner image.	2026-05-24 16:30:54 +02:00
vikingowl	3888966e68	fix(deps): bump golang.org/x/net to v0.55.0 to clear reachable CVEs govulncheck flagged two reachable vulnerabilities in golang.org/x/net@v0.52.0: - GO-2026-5026 (idna fails to reject ASCII-only Punycode labels), reached via router.DiscoverOllama -> http.Client.Do -> idna.ToASCII. - GO-2026-4918 (HTTP/2 transport infinite loop on bad SETTINGS_MAX_FRAME_SIZE), same call path -> http2.Transport.*. Bumping to v0.55.0 covers both. Transitive bumps to x/crypto v0.51.0, x/sys v0.45.0, x/text v0.37.0. Post-bump govulncheck reports 0 reachable vulnerabilities and 0 in directly imported packages.	2026-05-24 16:27:28 +02:00
vikingowl	847cd5fe0c	fix(security): use crypto/rand for session-ID suffix Semgrep flagged math/rand for the /tmp artifact-directory session-ID generation. Modern Go (1.20+) auto-seeds the global math/rand source so this wasn't exploitable in practice, but crypto/rand is the idiomatic choice for any security-adjacent identifier and removes the finding from future security audits. Drops the mrand alias entirely; reads 8 random bytes once and masks to 24 bits to preserve the existing %06x suffix format.	2026-05-24 16:22:50 +02:00
vikingowl	001865f069	fix(env): correct ANTHROPIC_API_KEY typo, add missing vars The placeholder ANTHROPICS_API_KEY (with trailing S) silently failed: the auth layer reads ANTHROPIC_API_KEY, so anyone copying .env.example to .env and pasting their key would see gnoma never pick it up, with no clear error. Also surfaces vars that already work but weren't templated: GOOGLE_API_KEY (alternative to GEMINI_API_KEY), GNOMA_PROVIDER and GNOMA_MODEL (config overrides), and the two subprocess sandbox bypass footguns (GNOMA_AGY_BYPASS_PERMISSIONS, GNOMA_CODEX_BYPASS_SANDBOX), left commented out so they don't accidentally turn on.	2026-05-24 16:16:39 +02:00
vikingowl	c1c52f139d	docs(readme): add 'no phone-home' bullet and data-flow scope note Clarify that gnoma itself emits no telemetry to external services while being explicit that cloud-provider arms send data to those providers by design. Adds: - 'No phone-home' bullet to the differentiator list, naming the on-device path (Ollama/llama.cpp + --incognito). - 'Data flow' paragraph to the Security scope-note blockquote so the framing is consistent between the hero bullets and the Security section.	2026-05-24 16:00:40 +02:00
vikingowl	7040041f13	docs(readme): correct firewall scope; track egress controls in TODO The 'What makes gnoma different' bullet and Security section both implied a network-egress firewall. Today the Firewall only enforces a content boundary (secret scan, Unicode sanitize, redact/block). Reword both spots and add a Scope note. Surface the gap as a top-of-TODO entry covering per-session audit log and per-host egress allowlist, with the open design question (host-level vs per-tool) called out. Raised via r/SideProject v0.3.0 launch thread.	2026-05-24 15:50:35 +02:00
vikingowl	1828151162	docs(claude): big-picture architecture and expanded test commands Add a 'Big picture' section summarising the request flow (cmd → session → engine → router → security/permission → extensibility) so future Claude Code instances can orient without reading INDEX.md plus five package directories first. Note that internal/safety and internal/slm aren't in INDEX.md yet. Document the somegit.dev / GitHub mirror split and the ruleset that blocks force-push and deletion on main/dev. Expand build/test section with make check, make test-integration, single-test, and benchmark commands.	2026-05-24 15:39:23 +02:00
vikingowl	b5062d59e9	docs(readme): hero screenshot, differentiators, status, TOC Add docs/img/gnoma-tui.png as a hero image so visitors see the TUI above the fold instead of a wall of text. Pull the bandit router, prefer-policy, SLM, and built-in firewall out of buried sections into a 'What makes gnoma different' bullet list. Add a Status block flagging pre-1.0 and a table of contents. Move the pygmy-owl naming note and upstream/mirror URLs into a footer About section.	2026-05-24 15:39:14 +02:00
vikingowl	b13a6a2801	docs(plans): mark v0.3.0 plans shipped Three plans shipped end-to-end in v0.3.0; removing them from TODO.md In-flight and adding a Status: shipped header to each plan doc with the commit references. Shipped: - 2026-05-23-routing-defaults-refresh.md - 2026-05-23-prefer-routing-policy.md - 2026-05-23-startup-safety-banner.md Still in flight (telemetry-gated, fires only if measurements support it): - 2026-05-23-tool-router-specialization.md	2026-05-23 22:45:05 +02:00
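
The audit-log commit (8b9bdc2978) describes three properties of Record: a nil receiver is a valid no-op, incognito suppresses writes, and only the matcher name and token length are stored, never the matched bytes. The AuditLogger implementation itself is not part of this excerpt, so the following is a minimal sketch under stated assumptions. `auditSketch`, `auditEvent`, the `incognito` hook, the JSON key names, and `demoAudit` are all hypothetical; the real logger writes to audit.jsonl under the session directory and logs a Warn on filesystem errors, where this sketch writes to an `io.Writer` and drops errors silently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"sync"
)

// auditEvent mirrors the fields used at the call sites in the diff
// (Action, Pattern, Source, TokenLen). The JSON key names are assumed.
type auditEvent struct {
	Action   string `json:"action"`
	Pattern  string `json:"pattern"`
	Source   string `json:"source"`
	TokenLen int    `json:"token_len"`
}

// auditSketch is a hypothetical stand-in for AuditLogger.
type auditSketch struct {
	mu        sync.Mutex
	w         io.Writer
	incognito func() bool // hypothetical hook; nil means "never incognito"
}

// Record appends one JSON line per event. It is safe on a nil receiver,
// so call sites need no nil-guard before f.audit.Record(...).
func (a *auditSketch) Record(ev auditEvent) {
	if a == nil {
		return // auditing disabled
	}
	if a.incognito != nil && a.incognito() {
		return // preserve the no-persistence contract
	}
	line, err := json.Marshal(ev)
	if err != nil {
		return // never propagate audit failures into the scan pipeline
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	_, _ = a.w.Write(append(line, '\n'))
}

// demoAudit exercises the three paths and returns the JSONL that was written.
func demoAudit() string {
	var buf bytes.Buffer

	var none *auditSketch
	none.Record(auditEvent{Action: "block"}) // nil receiver: no panic, no output

	quiet := &auditSketch{w: &buf, incognito: func() bool { return true }}
	quiet.Record(auditEvent{Action: "redact"}) // incognito: dropped

	live := &auditSketch{w: &buf}
	live.Record(auditEvent{
		Action:   "redact",
		Pattern:  "anthropic_api_key",
		Source:   "tool_result",
		TokenLen: 108, // length only; the secret itself is never recorded
	})
	return buf.String()
}
```

Putting the nil check inside the method rather than at each call site is what lets scanAndRedact call `f.audit.Record` unconditionally, as the diff does, even before SetAudit has been called.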