docs(todo,plans): specs for open features + MiniMax & ACP

Add implementation-ready plans for the in-flight features that lacked one, and two new provider/protocol items:

- MiniMax provider (cloud arm + Token Plan billing decision)
- Agent Client Protocol (ACP), dual role: gnoma as ACP agent and as ACP client driving external agents as router arms
- Network egress allowlist (Learn/Review/Enforce); note the per-session audit log is already implemented, remaining gap is a viewer command
- Cross-platform (Windows/macOS) code touch-points + build-tag pattern
- Distribution follow-ups (cosign, brew tap, installer, dockers_v2)

Link each plan from its TODO.md entry; mark audit-log item done.
2026-06-04 11:59:16 +02:00
parent 98daebd359
commit f8ab522bef
6 changed files with 1297 additions and 6 deletions
@@ -4,6 +4,86 @@ Active work, newest first.

 ## In flight

+- **MiniMax provider — cloud arm + subscription token plan.** Add
+  MiniMax (api.minimax.io / api.minimaxi.com) as a first-class cloud
+  provider so it can register as a router arm alongside
+  anthropic/openai/google/mistral.
+
+  **API surface.** MiniMax ships *two* compatible HTTP surfaces, one
+  OpenAI-style and one Anthropic-style, so this is a base-URL + auth
+  wiring task, not a new translation layer:
+  - **OpenAI-compatible** chat-completions at `…/v1` — reusable via
+    `internal/provider/openaicompat`. Cleanest first cut: add a
+    `NewMiniMax(cfg)` constructor mirroring `NewOllama` /
+    `NewLlamaCpp` (`openaicompat/provider.go`) with the MiniMax base
+    URL baked in, then a `case "minimax"` in
+    `createProvider` (`cmd/gnoma/main.go:1265`) and the available-
+    providers usage string (`:1279`).
+  - **Anthropic-compatible** endpoint (`…/anthropic`) — alternative
+    backing via the existing `anthropic` provider with a `BaseURL`
+    override. Decide one canonical path; OpenAI-compat is the lower-
+    risk default since `openaicompat` is already exercised by the
+    local backends.
+  - **Auth.** Bearer API key. `envKeyFor`'s default branch
+    (`main.go:1199`) already resolves `MINIMAX_API_KEY` with no code
+    change; add an explicit `case "minimax"` only if we want a
+    friendlier name or alternates list.
+  - **Models.** `MiniMax-M2` (agentic/coding, the one to default to),
+    `MiniMax-M1`, abab6.5 series. Set `Strengths` + `MaxComplexity`
+    + `CostWeight` on the arm so the selector treats it as a cheap
+    high-capability cloud tier.
+
+  **Token plan (open question — affects auth + billing UX).** MiniMax
+  offers a flat-rate **Coding Plan** subscription (token-quota based,
+  Claude-Max-style) *in addition to* metered pay-as-you-go API
+  credits. Both authenticate with the same Bearer key, so no adapter
+  difference — but the router's `CostWeight` math assumes metered
+  per-token pricing. Under a subscription the marginal cost is ~0
+  until the quota is hit, then hard-stops. Decisions to make:
+  - How to model "subscription" cost in the selector — e.g. a
+    `[provider.minimax].billing = "subscription" | "metered"` knob
+    that zeroes `CostWeight` while quota remains, vs. real per-token
+    cost when metered.
+  - Quota exhaustion handling — surface the 429/quota error cleanly
+    and let the bandit fail over to the next arm (ties into the
+    session error-recovery work in `0d3d190`).
+  - Document both plans + the region split (`api.minimax.io`
+    international vs `api.minimaxi.com`) in `docs/slm-backends.md` /
+    provider docs.
+
+  Smallest shippable slice: OpenAI-compat `NewMiniMax` + metered
+  pricing, registered as a cloud arm. Subscription/quota modelling is
+  the follow-up once the billing knob lands. Plan:
+  [`docs/superpowers/plans/2026-06-04-minimax-provider.md`](docs/superpowers/plans/2026-06-04-minimax-provider.md).
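
The subscription-versus-metered decision above can be sketched as a small pure function. This is only an illustration under assumptions: the `Billing` type, the `armCost` name, and the idea of returning an "available" flag are hypothetical and not taken from the gnoma codebase; the real selector may represent quota and cost differently.

```go
package main

import "fmt"

// Billing is a hypothetical value for the proposed
// [provider.minimax].billing knob.
type Billing string

const (
	Metered      Billing = "metered"
	Subscription Billing = "subscription"
)

// armCost returns the cost weight the selector would see and whether the
// arm should be considered at all. Under a subscription the marginal cost
// is treated as zero while quota remains; once the quota is exhausted the
// arm is reported unavailable so the router can fail over.
func armCost(b Billing, meteredWeight float64, quotaLeft int64) (weight float64, available bool) {
	if b == Subscription {
		if quotaLeft > 0 {
			return 0, true
		}
		return 0, false // hard stop: let the next arm take the request
	}
	return meteredWeight, true
}

func main() {
	for _, c := range []struct {
		b     Billing
		quota int64
	}{{Subscription, 1000}, {Subscription, 0}, {Metered, 0}} {
		w, ok := armCost(c.b, 0.4, c.quota)
		fmt.Printf("billing=%s quota=%d weight=%.1f available=%v\n", c.b, c.quota, w, ok)
	}
}
```

Keeping the decision in one function would let the metered-only first slice ship with `Metered` hardcoded, with the subscription branch enabled once the billing knob lands.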
+
+- **Agent Client Protocol (ACP) support.** Run gnoma as an *ACP agent*
+  (`gnoma acp`) so any ACP-capable editor (Zed, Kiro, OpenCode, …) can
+  drive it as an external coding agent. ACP is "the LSP for AI coding
+  agents": JSON-RPC 2.0 over stdio, editor (client) spawns agent
+  (subprocess). gnoma already owns the hard parts — agentic engine,
+  tools, permissions, and JSON-RPC-over-stdio (from its MCP-client
+  side, `internal/mcp/jsonrpc.go`). The fit is symmetric: gnoma is the
+  JSON-RPC *server* here. No Go SDK exists (official SDKs are
+  TS/Python/Rust/Kotlin), so gnoma implements the wire protocol
+  natively against the schema. `session/new` can declare `mcpServers`,
+  so ACP and gnoma's existing MCP manager wire up in one handshake.
+
+  **Dual role — both directions:**
+  1. **gnoma as ACP agent (server)** — `gnoma acp` over stdio so
+     editors drive gnoma.
+  2. **gnoma as ACP client** — gnoma spawns *external* ACP agents
+     (Claude, Gemini CLI, Codex, …) and uses them as router-arm
+     provider backends. This is the same shape as the existing
+     `internal/provider/subprocess` CLI-agent arms
+     (`cmd/gnoma/main.go:521-531`, `IsCLIAgent: true`) but over
+     standardized ACP JSON-RPC — gaining structured tool-call
+     surfacing, real turn/permission semantics, and cancellation
+     that the current one-shot stream-json subprocess provider
+     lacks (it sets `ToolUse:false` for agents without stream-json).
+
+  Upstream: <https://github.com/agentclientprotocol>. Plan:
+  [`docs/superpowers/plans/2026-06-04-agent-client-protocol.md`](docs/superpowers/plans/2026-06-04-agent-client-protocol.md).
+
 - **Config write/merge — silent corruption of layered configs.**
  `internal/config/write.go:setConfig` reads the existing TOML into a
  zero-valued `Config` struct, sets one field, and writes the entire
@@ -159,11 +239,13 @@ Active work, newest first.
  with no per-host allowlist or dial-layer interception. Two follow-
  ups surfaced from the r/SideProject v0.3.0 launch thread
  (2026-05-24, `u/Secret_Theme3192`):
-  1. **Per-session audit log of blocked/redacted events** —
-     grep-able file at `.gnoma/sessions/<id>/audit.jsonl` so the
-     user can answer "what did the firewall do this session?" in
-     one command. Today the `slog` output goes to whatever sink is
-     configured, with no per-session grouping.
+  1. **Per-session audit log of blocked/redacted events** — ✅ JSONL
+     writing **implemented**: `internal/security/audit.go` +
+     wiring at `cmd/gnoma/main.go:685-691`
+     (`.gnoma/sessions/<id>/audit.jsonl`), recorded from
+     `firewall.go:152/173/186`. **Remaining gap:** no CLI to *read*
+     it — a `gnoma firewall audit` viewer is folded into the egress
+     plan (shares the `gnoma firewall` command surface).
  2. **Per-host egress allowlist (HTTP transport layer)** — design
     refined by `u/HarjjotSinghh` on the r/SideProject thread
     (2026-05-28). Three-stage rollout, not a single-shot
@@ -195,6 +277,9 @@ Active work, newest first.
     "network egress gated"; corrected in the README scope note
     and the audit-log commit.

+  Egress plan (incl. the `gnoma firewall audit` viewer for item #1):
+  [`docs/superpowers/plans/2026-06-04-egress-allowlist.md`](docs/superpowers/plans/2026-06-04-egress-allowlist.md).
+
 - **Cross-platform support — Windows + macOS.** GoReleaser builds
  static binaries for `linux/darwin/windows × amd64/arm64` every
  release but only Linux is exercised at all today. Windows and
@@ -244,6 +329,9 @@ Active work, newest first.
  least a TODO-linked acknowledgement in the post body so the
  thread sees gnoma takes the gaps seriously.

+  Plan (build-tag scaffolding + concrete code touch-points):
+  [`docs/superpowers/plans/2026-06-04-cross-platform.md`](docs/superpowers/plans/2026-06-04-cross-platform.md).
+
 - **Tool-router specialization (functiongemma)** — gated on telemetry,
  not committed. Phase A.2 adds did-switch-rate measurement to the
  two-stage `select_category` path; Phase A.3 (LoRA fine-tune of
@@ -288,7 +376,8 @@ Active work, newest first.
  from `dockers` + `docker_manifests` to `dockers_v2` in
  `.goreleaser.yml` (collapses ~45 lines into one block but
  requires Dockerfile changes for the per-platform binary layout
-  — deferred to its own commit before v0.3.0).
+  — deferred to its own commit before v0.3.0). Plan:
+  [`docs/superpowers/plans/2026-06-04-distribution-followups.md`](docs/superpowers/plans/2026-06-04-distribution-followups.md).

 ## Stable backlog (not in active phases)

@@ -0,0 +1,375 @@
+# Agent Client Protocol (ACP) — 2026-06-04
+
+Adds **both directions** of ACP to gnoma:
+
+1. **gnoma as ACP agent (server)** — `gnoma acp` over stdio so any
+   ACP-capable editor (Zed, Kiro, OpenCode, …) can drive gnoma as an
+   external coding agent.
+2. **gnoma as ACP client** — gnoma spawns *external* ACP agents
+   (Claude, Gemini CLI, Codex, …) and exposes them as router-arm
+   provider backends, the standardized successor to the current
+   `internal/provider/subprocess` CLI-agent arms.
+
+Adds the TODO.md entry "Agent Client Protocol (ACP) support".
+
+Upstream: <https://github.com/agentclientprotocol> ·
+spec <https://agentclientprotocol.com>
+
+---
+
+## Problem
+
+ACP is "the LSP for AI coding agents": a JSON-RPC 2.0 protocol, spoken
+over stdio, that lets editors (clients) spawn agents (subprocesses) and
+talk to them in a standard way — eliminating point-to-point editor↔agent
+integrations. Zed, Kiro, OpenCode and others are clients; Claude, Gemini
+CLI, Codex ship as ACP agents.
+
+Today gnoma is reachable only via its own TUI and pipe mode. It cannot
+plug into an editor's agent panel. Supporting ACP makes gnoma a drop-in
+agent inside any ACP client, which is a large distribution surface for
+near-zero ongoing cost — the protocol is stable and gnoma already owns
+all the hard parts (an agentic engine, tools, permissions, MCP).
+
+### Why this is a natural fit
+
+- gnoma already speaks **JSON-RPC over stdio** for MCP
+  (`internal/mcp/jsonrpc.go` `Request`/`Notification`,
+  `internal/mcp/transport*.go`) — that machinery is reusable for the
+  ACP server side (gnoma is the *server* of the JSON-RPC channel here,
+  the mirror of its MCP-client role).
+- The agentic loop is already factored behind
+  `session.Session` (`internal/session/session.go:54`,
+  `Local.Send`/`SendWithOptions` at `local.go:80-85`) driving
+  `engine.Engine` (`internal/engine/engine.go`). ACP `session/prompt`
+  maps onto one `Send`.
+- Permissions already route through a pluggable prompt function
+  (`permission.NewChecker(mode, rules, promptFn)`,
+  `cmd/gnoma/main.go:668`). ACP's `session/request_permission` callback
+  is just another `promptFn` implementation.
+- ACP `session/new` can declare the `mcpServers` the agent should
+  connect to — gnoma already has an MCP manager
+  (`internal/mcp/manager.go`) to honour that in the same handshake.
+
+### Role decision — both, server first
+
+Both roles ship under this plan. Sequence them: **agent (server)
+first** — it's the larger distribution win and exercises the wire
+protocol end-to-end — then **client**, which reuses the same
+`internal/acp` protocol/types from the other side. They share the
+JSON-RPC framing, content-block translation, and capability structs;
+only the dispatch direction differs.
+
+The client role is the standardized successor to
+`internal/provider/subprocess`: that package shells out to CLI agents
+with one-shot `--output-format stream-json` (or prompt-augmentation
+fallback), runs the agent's *own* loop with `--yolo`/`--trust`, and
+cannot surface structured tool calls (it sets `ToolUse:false` for
+agents lacking stream-json — see TODO "Native agy JSON output"). ACP
+fixes all of that: a persistent JSON-RPC session, structured
+`session/update` tool-call events, real permission round-trips, and
+cancellation.
+
+### No Go SDK exists
+
+Official SDKs are TypeScript, Python, Rust, Kotlin — **no Go**. gnoma
+implements the wire protocol natively against the published JSON
+schema. Pin the supported `protocolVersion` and the exact method set
+against the spec at implementation time (the protocol is young and
+still moving).
+
+---
+
+## Non-goals
+
+- **A full editor UI.** In agent mode gnoma renders nothing; the client
+  owns the UI. gnoma emits `session/update` notifications and the client
+  displays them.
+- **Replacing the TUI / pipe modes.** ACP agent mode is a third entry
+  mode alongside them, not a replacement.
+- **Replacing `internal/provider/subprocess` outright.** The ACP-client
+  provider is added alongside it; the stream-json subprocess path stays
+  for agents that don't (yet) speak ACP. Deprecation is a later call.
+- **Custom transports.** stdio only (the ACP norm: local agent as a
+  subprocess). No socket/HTTP transport.
+- **gnoma-drives-gnoma over ACP as the default.** gnoma's native
+  providers/router remain the primary path; ACP-client arms are an
+  additional backend source.
+
+---
+
+## Design
+
+The two roles share one package (`internal/acp`): JSON-RPC framing,
+content-block translation, and the capability/handshake types are
+direction-agnostic. **Part A** is the agent (server) side; **Part B**
+is the client side. Build Part A first.
+
+## Part A — gnoma as ACP agent (server)
+
+### New entry mode: `gnoma acp`
+
+Add a third mode beside TUI and pipe (mode is chosen near
+`cmd/gnoma/main.go:106-114`). Selected by an explicit `acp` subcommand
+(stdio is shared with the JSON-RPC channel, so it can't be
+TTY-autodetected the way TUI is). In ACP mode:
+
+- **No banner, no TUI, no stdout chatter.** stdout/stdin are the
+  JSON-RPC pipe; all human/diagnostic logging goes to **stderr** only
+  (the firewall/audit slog sink must not write to stdout). Audit this
+  carefully — any stray stdout write corrupts the protocol stream.
+- Reuse the existing session/engine/router/security construction; only
+  the front-end loop differs.
+
+### Package layout
+
+```
+internal/acp/
+  protocol.go   // ACP types: handshake, capabilities, content blocks (shared)
+  jsonrpc.go    // framing reused/forked from internal/mcp/jsonrpc.go (shared)
+  content.go    // ContentBlock <-> message.Message translation (shared)
+  server.go     // Part A: stdio JSON-RPC read loop; method dispatch
+  session.go    // Part A: ACP session <-> gnoma session.Session bridge
+  permission.go // Part A: session/request_permission promptFn
+  update.go     // Part A: gnoma stream events -> session/update
+  client.go     // Part B: spawn external agent, drive the handshake/prompt
+```
+
+A separate `internal/provider/acp/` holds the **Part B provider**
+adapter (mirrors `internal/provider/subprocess/`), depending on
+`internal/acp/client.go`.
+
+Reuse `internal/mcp/jsonrpc.go` framing if it generalises; otherwise
+fork the minimal envelope (it's tiny). Keep ACP types separate from MCP
+types — they are different protocols that happen to share JSON-RPC.
+
+### Method handlers (agent side)
+
+Map each ACP method to existing gnoma machinery. Pin exact shapes to the
+spec; the mapping is the contract:
+
+| ACP method (client→agent) | gnoma handling |
+|---|---|
+| `initialize` | Reply with `agentCapabilities` (tools, MCP support, prompt streaming, permission modes), `agentInfo` (name "gnoma", `buildVersion`). Negotiate `protocolVersion`. |
+| `session/new` | Build a `session.Local` (router, security, tools wired as in main). Honour `cwd` (run it through `safety.ClassifyCWD`), and connect any `mcpServers` the client declares via `internal/mcp/manager.go`. Return a `sessionId`. |
+| `session/load` (if advertised) | Rehydrate from `internal/session` store (`SessionStore.Load`). Optional — only if we advertise the capability. |
+| `session/prompt` | Translate ACP `ContentBlock`s → `message.Message`, call `Send`/`SendWithOptions`, stream results back as `session/update`, return the stop reason. |
+| `session/cancel` (notification) | Cancel the in-flight turn's context. |
+
+Agent→client calls gnoma must make:
+
+| ACP call (agent→client) | Trigger |
+|---|---|
+| `session/update` (notification) | Per engine stream event: assistant text deltas, tool-call start/args/result, plan/thoughts, token usage. Map gnoma's stream iterator (`Next/Current`) to update variants. |
+| `session/request_permission` | gnoma's `permission.Checker` promptFn — instead of console `Scanln`, send this and await the client's allow/deny (with the ACP "allow once / always" options mapped to gnoma permission modes). |
+| `fs/read_text_file`, `fs/write_text_file` | **If** we advertise client-side fs and the client supports it, route the `fs` tools through the client so edits show in the editor's buffers. Otherwise gnoma's own `internal/tool/fs` operates on disk directly. Decide per capability negotiation. |
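
The client→agent table maps naturally onto a method-name dispatch table. The following is a minimal sketch, assuming hypothetical `server` and `handler` types; only the method names come from the table, and the `initialize` result is a placeholder rather than the spec's shape. The `-32601` code is the standard JSON-RPC "method not found" error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// handler processes raw params and returns a result to be marshalled.
type handler func(params json.RawMessage) (any, error)

type rpcError struct {
	Code    int
	Message string
}

func (e *rpcError) Error() string { return fmt.Sprintf("rpc error %d: %s", e.Code, e.Message) }

const codeMethodNotFound = -32601 // JSON-RPC 2.0 "Method not found"

type server struct{ handlers map[string]handler }

func notImplemented(json.RawMessage) (any, error) {
	return nil, fmt.Errorf("not implemented in this sketch")
}

func newServer() *server {
	return &server{handlers: map[string]handler{
		// Placeholder result; the real capability structs come from the schema.
		"initialize": func(json.RawMessage) (any, error) {
			return map[string]any{"agentInfo": map[string]any{"name": "gnoma"}}, nil
		},
		"session/new":    notImplemented,
		"session/load":   notImplemented,
		"session/prompt": notImplemented,
		"session/cancel": notImplemented,
	}}
}

// dispatch routes one incoming call to its handler.
func (s *server) dispatch(method string, params json.RawMessage) (any, error) {
	h, ok := s.handlers[method]
	if !ok {
		return nil, &rpcError{Code: codeMethodNotFound, Message: "method not found: " + method}
	}
	return h(params)
}

func main() {
	s := newServer()
	res, err := s.dispatch("initialize", nil)
	fmt.Println(res, err)
	_, err = s.dispatch("no/such", nil)
	fmt.Println(err)
}
```

Optional methods such as `session/load` would only be registered when the matching capability is advertised, so an unadvertised call falls through to the not-found path.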
+
+### Streaming bridge
+
+The engine produces a pull-based stream (`Next() / Current() / Err() /
+Close()`). The ACP bridge consumes it and emits a `session/update` per
+event. Backpressure: ACP is fire-and-forget notifications, so no
+blocking — but coalesce text deltas if the client is slow (config knob,
+default flush per token).
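
A rough sketch of the bridge loop with optional coalescing follows. The `Event` and `Stream` shapes are assumptions modelled on the `Next/Current/Err/Close` description above, not gnoma's actual types, and the byte threshold stands in for whatever config knob is chosen.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a simplified stand-in for an engine stream event.
type Event struct {
	Kind string // "text" for assistant text deltas, anything else passes through
	Text string
}

// Stream mirrors the pull-based iterator described in the plan.
type Stream interface {
	Next() bool
	Current() Event
	Err() error
	Close() error
}

// bridge drains s and calls emit once per outgoing update. Text deltas are
// buffered until at least minBytes have accumulated; minBytes <= 0 flushes
// every delta. Non-text events flush pending text first to keep ordering.
func bridge(s Stream, minBytes int, emit func(Event)) error {
	defer s.Close()
	var buf strings.Builder
	flush := func() {
		if buf.Len() > 0 {
			emit(Event{Kind: "text", Text: buf.String()})
			buf.Reset()
		}
	}
	for s.Next() {
		ev := s.Current()
		if ev.Kind == "text" {
			buf.WriteString(ev.Text)
			if buf.Len() >= minBytes {
				flush()
			}
			continue
		}
		flush()
		emit(ev)
	}
	flush()
	return s.Err()
}

// sliceStream is an in-memory Stream used for the demo.
type sliceStream struct {
	events []Event
	idx    int
}

func (s *sliceStream) Next() bool     { s.idx++; return s.idx <= len(s.events) }
func (s *sliceStream) Current() Event { return s.events[s.idx-1] }
func (s *sliceStream) Err() error     { return nil }
func (s *sliceStream) Close() error   { return nil }

func main() {
	s := &sliceStream{events: []Event{{"text", "a"}, {"text", "b"}, {"tool", "read"}, {"text", "c"}}}
	_ = bridge(s, 10, func(e Event) { fmt.Printf("%s %q\n", e.Kind, e.Text) })
}
```

Flushing before every non-text event keeps tool-call updates from overtaking the text that preceded them, which matters because notifications carry no acknowledgement.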
+
+### Security & safety interplay
+
+- The `SafeProvider` firewall boundary and the per-session audit log
+  apply unchanged — ACP is a front-end, providers/tools sit behind the
+  same security layer.
+- `safety.ClassifyCWD` runs on the `session/new` `cwd`; a `refuse`
+  classification returns an ACP error rather than starting the session.
+- Egress allowlist (`2026-06-04-egress-allowlist.md`) applies as usual.
+- Incognito: expose a way to start an ACP session incognito (capability
+  flag or `session/new` param) so editor-driven sessions can be
+  non-persistent.
+
+### MCP-in-ACP
+
+When `session/new` lists `mcpServers`, spin them up through the existing
+manager so the editor's MCP config and gnoma's converge in one
+handshake (this is the headline ACP×MCP integration). gnoma's own
+config-level MCP servers still load too; merge, don't replace.
+
+---
+
+## Part B — gnoma as ACP client (external agents as router arms)
+
+gnoma connects to external ACP agents and exposes each as a router-arm
+backend, the standardized successor to `internal/provider/subprocess`.
+gnoma plays the *client* (editor) side of the JSON-RPC channel.
+
+### Provider adapter
+
+Add `internal/provider/acp/` implementing the `provider.Provider`
+contract (`Stream`, `Name`, `Models`, `DefaultModel`) — the same surface
+the subprocess provider satisfies
+(`internal/provider/subprocess/provider.go:28-62`):
+
+- **Spawn + handshake.** On first use (or at discovery), spawn the agent
+  subprocess (`exec.CommandContext`, with the Windows/Unix process-group
+  handling from `2026-06-04-cross-platform.md`), send `initialize` as the
+  client, then `session/new` with gnoma's `cwd` and — crucially —
+  gnoma's *own* MCP servers passed through as the `mcpServers` list so
+  the external agent shares gnoma's tool surface.
+- **`Stream` → `session/prompt`.** Translate the gnoma `Request`
+  messages into ACP `ContentBlock`s, send `session/prompt`, and turn the
+  incoming `session/update` notifications back into gnoma's pull-based
+  stream events (`EventTextDelta`, structured tool-call events, usage).
+  This is the win over the subprocess provider: tool calls arrive
+  **structured**, not as opaque `EventTextDelta` text.
+- **Permission callbacks.** The external agent sends
+  `session/request_permission` to gnoma (now the client). Route these
+  through gnoma's existing `permission.Checker` so the *user's* gnoma
+  permission policy governs the sub-agent — a strict improvement over
+  today's `--yolo`/`--trust` subprocess invocations that bypass gnoma's
+  gate entirely.
+- **`fs/*` callbacks.** Route the agent's file reads/writes through
+  gnoma's `internal/tool/fs` guard so the path-safety boundary still
+  applies.
+- **Cancellation.** gnoma's turn-cancel sends ACP `session/cancel`.
+
+### Discovery & registration
+
+Mirror the subprocess flow (`cmd/gnoma/main.go:521-531`):
+
+- Discover ACP agents from config (`[acp.agents]` — command + args +
+  optional capability hints) and/or a known-agents table analogous to
+  `subprocess/agent.go:60` (`knownAgents`).
+- Register each as a `router.Arm` (a new `IsACPAgent` flag, or reuse
+  `IsCLIAgent` with a transport discriminant). Set `Capabilities` from
+  the ACP `initialize` response — notably `ToolUse:true`, which the
+  subprocess provider often can't claim.
+- Wrap in `security.WrapProvider(..., fwRef)` exactly like every other
+  arm so the firewall + audit + egress boundaries hold.
+
+### Relationship to the subprocess provider
+
+Additive. Agents that speak ACP (Claude, Gemini CLI, Codex increasingly
+do) get the ACP arm; agents that only do one-shot stream-json keep the
+subprocess arm. Where both exist for one binary, prefer ACP. This also
+unblocks the "Native agy JSON output" backlog item for any agent that
+exposes ACP instead of `--output-format stream-json`.
+
+---
+
+## Touch-points (file:line)
+
+**Part A — agent (server):**
+
+| Change | Location |
+|---|---|
+| New ACP package | `internal/acp/` |
+| Entry mode dispatch | `cmd/gnoma/main.go` (mode select ~`:106`, subcommand dispatch ~`:178`) |
+| stdout→stderr log discipline | logger setup (`main.go:100-114`) |
+| Session bridge | `internal/session` (`Session`/`Local`) |
+| Permission callback | `internal/permission` checker promptFn (`main.go:645-668`) |
+| Stream→update | engine stream iterator (`internal/engine`, `internal/stream`) |
+| MCP per-session | `internal/mcp/manager.go` |
+| JSON-RPC framing reuse | `internal/mcp/jsonrpc.go` |
+
+**Part B — client (external agents as arms):**
+
+| Change | Location |
+|---|---|
+| ACP-client provider | new `internal/provider/acp/` (mirrors `internal/provider/subprocess/`) |
+| Client handshake/driver | `internal/acp/client.go` |
+| Arm discovery + registration | `cmd/gnoma/main.go:521-531` (subprocess pattern), `[acp.agents]` config |
+| Known-agents table | analogous to `internal/provider/subprocess/agent.go:60` |
+| Arm flag | `router.Arm` (`IsACPAgent`, or `IsCLIAgent` + transport) |
+| Security wrap | `security.WrapProvider(..., fwRef)` |
+
+---
+
+## Testing (TDD — write first)
+
+- **Protocol unit tests (no real provider):**
+  - `initialize` handshake: version negotiation, advertised
+    capabilities are stable and accurate.
+  - `session/new` → returns a sessionId; honours `cwd`; rejects a
+    `refuse`-classified cwd with an ACP error.
+  - `session/prompt` with a stubProvider: ContentBlocks translate in,
+    `session/update`s stream out in order, correct stop reason.
+  - `session/cancel` aborts the in-flight turn (context cancellation
+    observed).
+  - Permission: a tool call triggers `session/request_permission`; a
+    "deny" response blocks the tool; "allow always" updates the mode.
+  - **stdout purity test:** drive a full prompt and assert stdout
+    contains *only* valid JSON-RPC frames (no banner/log leakage) — this
+    is the most common ACP-agent bug.
+- **Conformance:** run gnoma against the upstream ACP test client /
+  example client (Rust/TS) in a `//go:build integration` test if one is
+  available; otherwise a recorded-transcript fixture.
+- **MCP-in-ACP:** `session/new` with an `mcpServers` entry spins the
+  server up and its tools become callable in that session.
+- **Part B (client) unit tests** — drive a *fake ACP agent* (a small
+  in-process JSON-RPC responder, the mirror of the agent-side tests):
+  - Provider `Stream` performs `initialize`+`session/new`+`session/prompt`
+    and yields gnoma stream events in order, with **structured** tool-call
+    events (not opaque text).
+  - An inbound `session/request_permission` is routed through
+    `permission.Checker` and a deny blocks the call.
+  - An inbound `fs/write_text_file` is mediated by the `internal/tool/fs`
+    guard (a guarded path is refused).
+  - Turn cancel emits `session/cancel`; the subprocess is reaped (tie to
+    cross-platform process-group handling).
+  - Discovery registers a fake ACP agent as an arm with `ToolUse:true`.
+- **Round-trip (loopback):** point gnoma's ACP-*client* at a `gnoma acp`
+  *server* subprocess and run a prompt end-to-end — exercises both parts
+  over a real stdio pipe.
+
+### Acceptance criteria
+
+**Part A (agent/server):**
+
+1. `gnoma acp` speaks the handshake and a full prompt turn over stdio.
+2. gnoma appears and works as an external agent in Zed (manual: add
+   gnoma to Zed's external-agents config, run a prompt, approve a tool).
+3. Tool permission prompts surface in the client and gate execution.
+4. stdout carries only JSON-RPC; all logs go to stderr.
+5. Cancelling from the editor stops the turn.
+6. MCP servers declared by the client in `session/new` are available in
+   that session.
+
+**Part B (client):**
+
+7. An external ACP agent configured under `[acp.agents]` appears as a
+   router arm (`gnoma providers` lists it) with `ToolUse:true`.
+8. Routing a task to that arm runs a full turn via ACP, surfacing the
+   sub-agent's tool calls **structured** in gnoma's stream.
+9. The sub-agent's permission requests are gated by the user's gnoma
+   permission policy (not auto-approved).
+10. The sub-agent's file writes pass through gnoma's fs guard.
+11. Loopback: `gnoma acp` driven by gnoma's own ACP-client completes a
+    prompt end-to-end.
+
+---
+
+## Open questions (resolve against the live spec at implementation)
+
+- Exact `protocolVersion` to target and the precise capability struct
+  shapes (the schema is the source of truth; pin a version).
+- Whether to advertise client-side `fs/*` (edits flow through the
+  editor's buffers) vs. direct-disk fs tools — depends on parity and on
+  how gnoma's `internal/tool/fs` guard composes with editor-mediated
+  writes.
+- `session/load` support (needs our session store to round-trip the
+  ACP transcript shape).
+- **(Part B)** How a sub-agent's own model/cost is represented in the
+  router — an ACP arm's tokens are billed by *that* agent, so
+  `CostWeight`/`CostPer1k*` are opaque. Likely model it like the
+  subprocess arms (no metered cost; selection driven by `Strengths`).
+- **(Part B)** Lifecycle: spawn-per-session vs. a pooled long-lived
+  agent process reused across turns; how cancellation and crashes are
+  recovered (ties to session error-recovery, `0d3d190`).
+
+---
+
+## TODO linkage
+
+New "Agent Client Protocol (ACP) support" entry in `TODO.md` (In
+flight) links here. Covers **both** roles: gnoma as ACP agent (Part A)
+and gnoma as ACP client driving external agents as router arms
+(Part B). Part B is the standardized successor to
+`internal/provider/subprocess` and overlaps the "Native agy JSON
+output" backlog item.
@@ -0,0 +1,198 @@
+# Cross-Platform Support (Windows + macOS) — 2026-06-04
+
+Makes the Windows and macOS binaries actually work and stay working.
+GoReleaser already builds for `linux/darwin/windows × amd64/arm64`,
+but only the Linux builds are exercised. Promotes the TODO.md entry
+"Cross-platform support — Windows + macOS" into a phased design with
+concrete code touch-points.
+
+This plan does not restate the TODO's r/devops question map (Phase 2
+table there stands). Its value-add is the **specific code locations**
+that need OS-conditional handling and the build-tag pattern to use.
+
+---
+
+## Problem
+
+Only Linux is tested. The binaries ship for Windows/macOS untested, and
+the codebase has several hard Unix assumptions that will fail or
+silently misbehave off-Linux. The pattern to follow already exists:
+`internal/mcp/transport_{unix,windows}.go` split via build tags.
+
+---
+
+## Non-goals
+
+- **MSI installer, Authenticode/Gatekeeper signing.** Covered by
+  `2026-06-04-distribution-followups.md` — those are packaging, not
+  runtime correctness.
+- **Group Policy / Event Viewer integration.** Out of scope per the
+  TODO; documentation-only.
+- **WSL-specific tuning.** WSL is Linux; it works today.
+
+---
+
+## Confirmed Unix-assumption defects (file:line)
+
+### Critical — break core functionality on Windows
+
+1. **Bash tool hardcodes `bash -c`.**
+   `internal/tool/bash/bash.go:117` →
+   `exec.CommandContext(ctx, "bash", "-c", command)`. No Windows shell.
+   Alias harvesting (`internal/tool/bash/aliases.go:115,148`) hardcodes
+   `/bin/bash` and splits the shell path on `/`.
+2. **Llamafile SLM startup hardcodes `sh`.**
+   `internal/slm/manager.go:172` invokes `sh <llamafile>` (a Wine
+   binfmt workaround). `sh` is absent on native Windows → `gnoma slm
+   status/setup` fails outright.
+3. **MCP process-tree kill is a Windows stub.**
+   `internal/mcp/transport_windows.go:10-18` — `setProcessGroup` is a
+   no-op and `killProcessTree` calls `p.Kill()`, leaking any child
+   processes an MCP server spawns. Unix version uses process groups
+   (`transport_unix.go:11-18`).
+
+### High — config/auth land in the wrong place off-Linux
+
+4. **Config/data dirs assume XDG.**
+   `internal/config/load.go:52-59` falls back to `~/.config`;
+   `internal/slm/manager.go:25-35` falls back to `~/.local/share`. On
+   Windows these should be `os.UserConfigDir()` (`%AppData%`) /
+   `os.UserCacheDir()`. On macOS, native tools use
+   `~/Library/Application Support`, though `~/.config` is tolerable;
+   decide and document.
+5. **OAuth credential discovery is Unix-pathed.**
+   `internal/provider/google/provider.go:188-204` hardcodes
+   `~/.config/...` and `~/.gemini/...`. `expandHome` (`:114-129`)
+   already handles `\`, but the path *set* is Unix-centric — Gemini/
+   Antigravity creds on macOS/Windows won't be found.
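+6. **No system-proxy support.** No `http.ProxyFromEnvironment` wiring
+   found, so any client built on a custom `http.Transport` ignores
+   `HTTP(S)_PROXY` (only `http.DefaultTransport` reads those env vars
+   by default). Even with that wiring, the Go stdlib does **not** read
+   the Windows system proxy / PAC, which corporate Windows networks
+   rely on.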
+
+### Medium — usability / safety classifier gaps
+
+7. **Safety classifier misses non-Linux paths.**
+   `internal/safety/cwd.go`: macOS system roots (`:185-210`) miss
+   `/opt`, `/usr/local`; personal-dir detection (`:221-252`) misses
+   Windows `%TEMP%`/`%APPDATA%` and macOS `~/Library/...`.
+8. **Terminal/ANSI.** TUI uses lipgloss/termenv (auto-detects), so
+   modern Windows Terminal/PowerShell 7 are fine; legacy `conhost.exe`
+   may mangle. Verify, don't assume.
+
+---
+
+## Design
+
+### Phase 0 — build-tag scaffolding
+
+Adopt the existing `_unix.go` / `_windows.go` split (as in
+`internal/mcp`) for each defect that needs divergent behaviour. Prefer
+`runtime.GOOS` only for small inline branches (as
+`internal/safety/cwd.go:201` already does); use build tags when the
+implementation genuinely differs (shell selection, process kill).
+
+### Phase 1 — smoke tests (unblocks the honest "did you test it?" answer)
+
+Non-blocking GitHub Actions matrix (`windows-latest`, `macos-latest`,
+`ubuntu-latest`):
+
+- `go build ./...` and `go test ./...` per OS (today the release
+  workflow tests Linux only — `.github/workflows/release.yml`).
+- Post-release: download each archive, run `gnoma --version` and a
+  stubbed `echo hi | gnoma --provider ollama` against a fake endpoint.
+  Confirms the binary launches and the TUI doesn't crash.
+
+This is the precondition the TODO names for posting to r/devops.
+
+### Phase 2 — shell abstraction (defects #1, #2)
+
+1. Introduce `internal/tool/bash/shell_unix.go` /
+   `shell_windows.go` exposing `defaultShell() (name string, args
+   []string)` and a `quoteArg(string) string`:
+   - Unix: `bash`/`$SHELL`, `-c`, POSIX quoting.
+   - Windows: prefer `pwsh`/`powershell` with the appropriate
+     `-Command` invocation and PowerShell quoting rules; fall back to
+     `cmd /c`. Document the choice.
+2. Fix `aliases.go` to use `filepath.Base` instead of splitting on `/`,
+   and skip alias harvesting on Windows shells that have no equivalent.
+3. Llamafile: on Windows, invoke the `.llamafile` (which is a valid
+   Windows PE as well as a shell script) directly rather than via `sh`;
+   guard with a build tag. Verify whether Windows needs the file to
+   carry an `.exe` suffix before it will execute it.
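
The shell selection in step 1 and the `filepath.Base` fix in step 2 might look roughly like this. To stay runnable as one file, the sketch branches on `runtime.GOOS` instead of the `shell_unix.go` / `shell_windows.go` split the plan calls for; the lookup order and the `-NoProfile` flag are assumptions, and `shellName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"runtime"
)

// defaultShell picks an interpreter and the arguments that precede the
// command string. The preference order here is an assumption.
func defaultShell() (name string, args []string) {
	if runtime.GOOS == "windows" {
		for _, candidate := range []string{"pwsh", "powershell"} {
			if _, err := exec.LookPath(candidate); err == nil {
				return candidate, []string{"-NoProfile", "-Command"}
			}
		}
		return "cmd", []string{"/c"}
	}
	if _, err := exec.LookPath("bash"); err == nil {
		return "bash", []string{"-c"}
	}
	if sh := os.Getenv("SHELL"); sh != "" {
		return sh, []string{"-c"}
	}
	return "sh", []string{"-c"}
}

// shellName extracts the shell's base name without splitting on "/" by hand.
func shellName(path string) string { return filepath.Base(path) }

func main() {
	name, args := defaultShell()
	fmt.Println(name, args)
	fmt.Println(shellName("/usr/bin/zsh"))
}
```

Quoting is deliberately left out: POSIX and PowerShell quoting rules differ enough that `quoteArg` deserves its own per-OS implementation and tests.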
+
+### Phase 3 — process management (defect #3)
+
+Implement Windows job objects via `golang.org/x/sys/windows` in
+`transport_windows.go` (and any other subprocess owner —
+`internal/provider/subprocess`, `internal/tool/bash`): create a job,
+assign the child, `TerminateJobObject` on close to reap the whole tree.
+Shared helper so MCP and bash tool both get tree-kill. (This is the
+same item the distribution TODO references.)
+
+### Phase 4 — paths + proxy (defects #4, #5, #6)
+
+1. Replace XDG fallbacks with `os.UserConfigDir()` / `os.UserCacheDir()`
+   on Windows (keep XDG honoring on Unix). Centralise in one
+   `configDir()` / `dataDir()` helper so it's not re-derived.
+2. Extend the OAuth credential path sets with OS-appropriate locations
+   (macOS `~/Library/Application Support/...`, Windows `%AppData%\...`).
+3. Ensure every `http.Client` uses a transport with
+   `Proxy: http.ProxyFromEnvironment`. For Windows system-proxy/PAC,
+   document the env-var workaround now; optionally vendor a PAC-aware
+   transport (e.g. `github.com/rapid7/go-get-proxied`) later. This
+   overlaps the shared-client work in
+   `2026-06-04-egress-allowlist.md` — do the proxy transport once, in
+   the shared client.
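+
+A hedged sketch of the two Phase 4 helpers, using only the standard
+library. `configDir` and `newProxyAwareClient` are illustrative names.
+Note one assumption to check: `os.UserConfigDir` returns
+`~/Library/Application Support` on macOS, so keeping XDG behaviour
+there would need an explicit branch that this sketch omits.
+
+```go
+package main
+
+import (
+	"fmt"
+	"net/http"
+	"os"
+	"path/filepath"
+)
+
+// configDir returns the per-user config directory for app.
+// os.UserConfigDir honours XDG_CONFIG_HOME on Unix (falling back to
+// $HOME/.config) and resolves to %AppData% on Windows.
+func configDir(app string) (string, error) {
+	base, err := os.UserConfigDir()
+	if err != nil {
+		return "", err
+	}
+	return filepath.Join(base, app), nil
+}
+
+// newProxyAwareClient builds a client whose transport reads
+// HTTP_PROXY / HTTPS_PROXY / NO_PROXY from the environment.
+func newProxyAwareClient() *http.Client {
+	tr := &http.Transport{Proxy: http.ProxyFromEnvironment}
+	return &http.Client{Transport: tr}
+}
+
+func main() {
+	dir, err := configDir("gnoma")
+	fmt.Println(dir, err)
+	fmt.Println(newProxyAwareClient().Transport != nil)
+}
+```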
+
+### Phase 5 — safety classifier + terminal (defects #7, #8)
+
+Extend `internal/safety/cwd.go` system-root and personal-dir sets per
+OS; add a manual verification note for legacy Windows terminals.
+
+---
+
+## Touch-points (file:line)
+
+| Defect | Location |
+|---|---|
+| Bash shell | `internal/tool/bash/bash.go:117`, `aliases.go:115,148` |
+| Llamafile `sh` | `internal/slm/manager.go:172` |
+| MCP kill stub | `internal/mcp/transport_windows.go:10-18` |
+| Config/data dirs | `internal/config/load.go:52-59`, `internal/slm/manager.go:25-35` |
+| OAuth paths | `internal/provider/google/provider.go:188-204` |
+| Proxy | shared `http.Client` (see egress plan) |
+| Safety classifier | `internal/safety/cwd.go:185-252` |
+| CI matrix | `.github/workflows/` (new test job), `release.yml` |
+
+---
+
+## Testing (TDD — write first)
+
+- **OS-gated unit tests** (run on each matrix OS):
+  - `defaultShell()` returns a runnable shell per OS; `quoteArg`
+    round-trips a value containing spaces/quotes through the real shell.
+  - `configDir()`/`dataDir()` return the OS-correct base.
+  - Job-object kill: spawn a child that spawns a grandchild; assert
+    both are gone after `killProcessTree` (Windows).
+  - `safety.ClassifyCWD` flags OS-appropriate system/personal dirs.
+- **Existing tests** that `t.Skip` on Windows
+  (`internal/tool/fs/guard_test.go`,
+  `internal/provider/subprocess/stream_test.go`) — audit whether the
+  skip hides a real gap now that Windows is a target.
+
+### Acceptance criteria
+
+1. CI smoke matrix is green on `windows-latest` + `macos-latest`.
+2. `gnoma --version` and a stubbed pipe run succeed on a Windows runner.
+3. A bash-tool command with quoted args runs on Windows (PowerShell).
+4. An MCP server that spawns a child leaves no orphan after shutdown on
+   Windows.
+5. Config lands in `%AppData%\gnoma` on Windows, `~/.config/gnoma` on
+   Linux.
+
+---
+
+## TODO linkage
+
+Promotes the "Cross-platform support — Windows + macOS" entry in
+`TODO.md`. The Phase-2 r/devops question table stays in the TODO as the
+public-facing answer map; link this plan for the implementation detail.
@@ -0,0 +1,169 @@
+# Distribution Follow-ups — 2026-06-04
+
+Hardens and broadens the release pipeline. v0.1.0+ already ships static
+archives (GitHub mirror releases) and multi-arch Docker images (GHCR)
+via GoReleaser. This plan covers the optional follow-ups listed under
+"Distribution — follow-ups" in TODO.md: signed checksums, Homebrew tap,
+`curl | sh` installer, release-note automation, and the
+`dockers`→`dockers_v2` migration.
+
+---
+
+## Current state (confirmed)
+
+- **`.goreleaser.yml`:** 6-target build matrix (linux/darwin/windows ×
+  amd64/arm64), CGO disabled, version injected via ldflags
+  (`-X main.buildVersion/buildCommit/buildDate`; read at
+  `cmd/gnoma/main.go:55-60`, printed at `:95-98`). Archives: tar.gz
+  (zip on Windows). Checksums: plain SHA256 `checksums.txt`,
+  **unsigned**. Docker: separate per-arch `dockers` blocks +
+  `docker_manifests` for the multi-arch manifest. Release published to
+  GitHub mirror (`release.github` owner `VikingOwl91`).
+- **`.github/workflows/release.yml`:** triggers on `v*` tags, sets up
+  QEMU + Buildx, logs into GHCR with the built-in `GITHUB_TOKEN`, runs
+  `go test ./...` (Linux only), then `goreleaser release --clean` with
+  `GORELEASER_CURRENT_TAG` set. **No signing step.**
+- **`Dockerfile`:** distroless `static:nonroot`, copies the
+  GoReleaser-built binary in. Architecture-agnostic (binary built
+  before `COPY`).
+- **No** Homebrew tap, install script, or Makefile release target.
+
+---
+
+## Non-goals
+
+- **Authenticode (Windows) / Gatekeeper notarization (macOS) code
+  signing.** These need a paid EV cert / Apple Developer account —
+  tracked separately (the cross-platform TODO documents the
+  "right-click → Unblock" workaround). Sigstore/cosign here is for
+  *checksum* signing, which needs no paid cert.
+- **MSI installer.** Lives in the cross-platform plan, gated on demand.
+- **Changing the canonical repo flow.** PRs still go to the Gitea
+  upstream; the GitHub mirror remains the release/CI surface.
+
+---
+
+## Design (independent work items — ship in any order)
+
+### 1. Signed checksums (cosign / sigstore keyless)
+
+Add a GoReleaser `signs` block that signs `checksums.txt` with cosign
+in **keyless** mode (OIDC via the GitHub Actions token — no stored
+private key, no cert cost):
+
+- Add `cosign` install + `id-token: write` permission to
+  `release.yml`.
+- GoReleaser `signs:` → `cmd: cosign`, `args: sign-blob` producing
+  `checksums.txt.sig` + `.pem` (cert bundle) as release artifacts.
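+- Document verification (keyless verification in cosign 2.x also needs
+  the signer identity and OIDC issuer):
+  `cosign verify-blob --certificate ... --signature ... --certificate-identity ... --certificate-oidc-issuer ... checksums.txt`.
+
+Acceptance: a downloaded release verifies against the published
+signature and certificate, including the Rekor transparency-log check.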
+
+### 2. Homebrew tap
+
+Create a tap repo (`VikingOwl91/homebrew-tap`) and add GoReleaser's
+`brews:` block targeting it. Needs a PAT with `contents:write` on the
+tap repo (the default `GITHUB_TOKEN` can't push to a *second* repo) —
+store as `HOMEBREW_TAP_TOKEN` secret. Formula installs the darwin/linux
+archives.
+
+Acceptance: `brew install vikingowl91/tap/gnoma` installs a working
+binary on macOS + Linuxbrew; `gnoma --version` matches the tag.
+
+### 3. `curl | sh` installer
+
+Add `install.sh` (committed at repo root, served via the raw GitHub
+mirror) that:
+
+- Detects OS/arch, maps to the GoReleaser archive name template
+  (`gnoma_<ver>_<os>_<arch>.<ext>`).
+- Resolves the latest release via the GitHub API (or honours a pinned
+  `GNOMA_VERSION`).
+- Downloads the archive **and** `checksums.txt`, verifies the SHA256
+  before extracting (and the cosign signature if cosign is present).
+- Installs to `~/.local/bin` (or `$GNOMA_INSTALL_DIR`), prints a PATH
+  hint.
+
+Keep it POSIX-sh, no bashisms. Acceptance:
+`curl -fsSL <raw>/install.sh | sh` yields a runnable `gnoma` on a clean
+Linux + macOS box; checksum mismatch aborts.
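+
+The installer itself stays POSIX sh. Purely to pin down the logic it
+has to implement (and that the CI job could assert), here is a sketch
+in Go of the archive-name mapping and the checksum check. It assumes
+`checksums.txt` uses the `sha256sum` layout (`<hex digest>  <file>`
+per line); `archiveName` and `verifyChecksum` are hypothetical names.
+
+```go
+package main
+
+import (
+	"bufio"
+	"crypto/sha256"
+	"encoding/hex"
+	"errors"
+	"fmt"
+	"strings"
+)
+
+// archiveName follows the gnoma_<ver>_<os>_<arch>.<ext> template:
+// zip on Windows, tar.gz everywhere else.
+func archiveName(ver, goos, goarch string) string {
+	ext := "tar.gz"
+	if goos == "windows" {
+		ext = "zip"
+	}
+	return fmt.Sprintf("gnoma_%s_%s_%s.%s", ver, goos, goarch, ext)
+}
+
+// verifyChecksum looks up name in the listing and compares the
+// recorded digest with the SHA256 of data. A missing entry is an
+// error, so an incomplete checksums file cannot pass silently.
+func verifyChecksum(checksums, name string, data []byte) error {
+	sum := sha256.Sum256(data)
+	got := hex.EncodeToString(sum[:])
+	sc := bufio.NewScanner(strings.NewReader(checksums))
+	for sc.Scan() {
+		fields := strings.Fields(sc.Text())
+		if len(fields) != 2 || fields[1] != name {
+			continue
+		}
+		if !strings.EqualFold(fields[0], got) {
+			return fmt.Errorf("checksum mismatch for %s", name)
+		}
+		return nil
+	}
+	return errors.New("no checksum entry for " + name)
+}
+
+func main() {
+	data := []byte("archive bytes")
+	sum := sha256.Sum256(data)
+	name := archiveName("0.1.0", "linux", "amd64")
+	listing := hex.EncodeToString(sum[:]) + "  " + name + "\n"
+	fmt.Println(name, verifyChecksum(listing, name, data))
+}
+```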
+
+### 4. Release-note automation
+
+GoReleaser already generates a filtered changelog (excludes
+docs/test/chore/style). Enrich it:
+
+- Group commits by Conventional-Commit type (`changelog.groups`, each
+  entry with a `title` and a `regexp` matching feat/fix/perf/refactor).
+- Add a release header template pointing to the upstream Gitea repo and
+  the install methods (brew / curl | sh / docker).
+
+Acceptance: a tagged release's GitHub notes show grouped sections + an
+install snippet, with no docs/chore noise.
+
+### 5. `dockers` → `dockers_v2` migration
+
+Collapse the two per-arch `dockers` blocks + `docker_manifests` into a
+single `dockers_v2` block (GoReleaser's newer multi-platform builder).
+The current `Dockerfile` is architecture-agnostic (binary copied
+post-build), so verify whether `dockers_v2`'s expected per-platform
+binary layout needs a `Dockerfile` change or a `templates`/`extra_files`
+tweak — the TODO flags this as the reason it was deferred. Do it in its
+own commit; diff the resulting GHCR manifest against the current one to
+prove parity (same tags: `<ver>-amd64`, `<ver>-arm64`, `<ver>`,
+`latest`).
+
+Acceptance: GHCR still publishes a multi-arch manifest with identical
+tags + labels; `docker pull --platform linux/arm64` works.
+
+### 6. (Carry-over) Windows process-tree kill
+
+Listed under the same TODO bullet, but it is a *runtime* concern. It is
+implemented in `2026-06-04-cross-platform.md` Phase 3 (job objects) and
+cross-linked here only so the TODO bullet's reference resolves.
+
+---
+
+## Touch-points (file:line)
+
+| Item | Location |
+|---|---|
+| Signing, brews, changelog groups, dockers_v2 | `.goreleaser.yml` |
+| cosign install, `id-token` perm, tap token | `.github/workflows/release.yml` |
+| Installer | new `install.sh` (repo root) |
+| Dockerfile (if dockers_v2 needs it) | `Dockerfile` |
+| Tap repo | new `VikingOwl91/homebrew-tap` |
+
+---
+
+## Testing
+
+Distribution is config + scripts, so testing is mostly pipeline-level:
+
+- **Dry run:** `goreleaser release --snapshot --clean` locally must
+  produce signed checksums, brew formula, and the dockers_v2 manifest
+  without publishing.
+- **install.sh:** a `shellcheck` gate + a CI job that runs it against
+  the latest release on linux + macos runners and asserts
+  `gnoma --version`.
+- **Checksum/signature negative test:** corrupt the archive → installer
+  aborts; tampered checksums → cosign verify fails.
+
+### Acceptance criteria
+
+1. A tagged release publishes `checksums.txt` + `.sig` + `.pem`,
+   verifiable with cosign keyless.
+2. `brew install vikingowl91/tap/gnoma` works on macOS.
+3. `curl -fsSL <raw>/install.sh | sh` works on clean Linux + macOS,
+   with checksum verification.
+4. Release notes are grouped and carry install instructions.
+5. GHCR multi-arch manifest is unchanged after the dockers_v2 swap.
+
+---
+
+## TODO linkage
+
+Promotes the "Distribution — follow-ups" entry in `TODO.md`. Link this
+file; the Windows job-object sub-item points at the cross-platform plan.
@@ -0,0 +1,236 @@
+# Network Egress Allowlist — 2026-06-04
+
+Adds a per-host network egress boundary to the security layer via a
+Learn → Review → Enforce rollout. Promotes the second half of the
+TODO.md entry "Security boundary — egress controls + session audit log"
+into a phased design.
+
+---
+
+## Status of the sibling item: per-session audit log — DONE
+
+The first half of the TODO entry (per-session audit log of
+blocked/redacted events) is **already implemented**:
+
+- `internal/security/audit.go` defines `AuditLogger` / `AuditEvent`,
+  writing append-only JSONL at mode `0o600`, incognito-gated,
+  best-effort (write failures never break the scan pipeline).
+- `cmd/gnoma/main.go:685-691` wires it to
+  `<projectRoot>/.gnoma/sessions/<sessionID>/audit.jsonl`.
+- `internal/security/firewall.go` records events at `:152` (unicode
+  sanitize), `:173` (block), `:186` (redact).
+
+**Remaining audit-log gap:** there is no CLI surface to *read* it. The
+TODO's promise — answer "what did the firewall do this session?" in one
+command — needs a `gnoma firewall audit` subcommand (no `firewall`
+subcommand exists today; top-level commands are `providers`, `slm`,
+`router`, `profile`). That viewer is folded into Phase 3 below since it
+shares the `gnoma firewall` command surface with `firewall review`.
+
+The rest of this plan is the genuinely-unbuilt egress allowlist.
+
+---
+
+## Problem
+
+The current `Firewall` is a **content** boundary only: it scans
+messages and tool results for secrets (regex + Shannon entropy) and
+redacts/blocks/warns. It does **not** enforce network egress. Outgoing
+HTTP uses stock clients with no per-host allowlist and no dial-layer
+interception, so a compromised tool, MCP server, or prompt-injected
+provider call can reach any host.
+
+The README and v0.3.0 launch post oversold "network egress gated";
+this plan makes that claim true.
+
+### Why this is hard: no egress chokepoint today
+
+Outgoing HTTP is constructed in many places, none sharing a client:
+
+- **Provider SDKs** each build their own `http.Client` internally:
+  - anthropic (`internal/provider/anthropic/provider.go:36`,
+    `anthropic.NewClient`)
+  - openai (`internal/provider/openai/provider.go:46`, `oai.NewClient`)
+  - mistral (`internal/provider/mistral/provider.go:33`,
+    `mistralgo.NewClient`)
+  - google genai (`internal/provider/google/provider.go:239,306`)
+- **Non-SDK direct calls** using `http.DefaultClient` or ad-hoc
+  `&http.Client{}`:
+  - `internal/router/discovery.go` (`:65,141,325,365`)
+  - `internal/router/probe.go` (`:24,72`)
+  - `internal/slm/backend.go` (`:266,294,316,343`)
+  - `internal/slm/download.go` (`:22`)
+  - `internal/slm/manager.go` (`:273`)
+
+No custom `http.Client` is injected anywhere today. **But** every SDK
+supports injecting one, which is the enabler for a single chokepoint.
+
+---
+
+## Non-goals
+
+- **TLS interception / MITM.** We allowlist by destination host, not by
+  inspecting decrypted payloads. Content inspection stays the
+  firewall's job.
+- **Blocking the provider SDKs' own retry/telemetry hosts by default.**
+  Model-provider hosts are baseline-allowed (see below).
+- **Replacing the OS/network firewall.** This is an in-process
+  application-level guard, defense-in-depth, not a substitute for real
+  network controls. Document this honestly (the README over-claim is
+  the cautionary tale).
+
+---
+
+## Design
+
+### The chokepoint: one shared `http.Client` with a guarded dialer
+
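+Build a single `*http.Client` whose `Transport.DialContext` validates
+the destination against the allowlist **before** the connection is
+made. For direct connections `DialContext` receives the unresolved
+`host:port`, so host-based matching works without DNS races. Caveat:
+when a proxy is in effect (e.g. `Proxy: http.ProxyFromEnvironment`),
+the dialer sees the proxy's address rather than the destination, so
+proxied requests also need a host check at the `RoundTripper` level.
+Thread this client everywhere.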
+
+```
+internal/security/egress/
+  guard.go      // EgressGuard: mode + allowlist + Decide(host) ResultEnum
+  dialer.go     // GuardedDialer wrapping net.Dialer.DialContext
+  client.go     // HTTPClient(guard) *http.Client
+  store.go      // learned-destinations persistence (per project)
+  baseline.go   // curated ship-in-binary allowlist
+```
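+
+The dial-layer check can be sketched as follows. This is an
+illustration under assumptions, not the planned `dialer.go`: the names
+`dialFunc`, `guardedDial`, and `newGuardedClient` are hypothetical, and
+the allowlist is reduced to a plain predicate. It covers direct
+connections only.
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+	"net"
+	"net/http"
+)
+
+// dialFunc matches the signature of http.Transport.DialContext.
+type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)
+
+// guardedDial wraps inner and refuses any host the allow predicate
+// rejects. The check runs before inner is called, so a blocked host
+// never sees a connection attempt.
+func guardedDial(allow func(host string) bool, inner dialFunc) dialFunc {
+	return func(ctx context.Context, network, addr string) (net.Conn, error) {
+		host, _, err := net.SplitHostPort(addr)
+		if err != nil {
+			return nil, fmt.Errorf("egress: bad address %q: %w", addr, err)
+		}
+		if !allow(host) {
+			return nil, fmt.Errorf("egress: host %q is not allowlisted", host)
+		}
+		return inner(ctx, network, addr)
+	}
+}
+
+// newGuardedClient returns a client whose dials all go through the guard.
+func newGuardedClient(allow func(host string) bool) *http.Client {
+	d := &net.Dialer{}
+	return &http.Client{Transport: &http.Transport{
+		DialContext: guardedDial(allow, d.DialContext),
+	}}
+}
+
+func main() {
+	allow := func(host string) bool { return host == "api.minimax.io" }
+	client := newGuardedClient(allow)
+	resp, err := client.Get("http://blocked.example/")
+	if err == nil {
+		resp.Body.Close()
+	}
+	fmt.Println(err) // the guard's error, wrapped by net/http
+}
+```
+
+Taking the inner dialer as a parameter is what lets a unit test swap in
+a fake that records calls and assert that none happened.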
+
+**Injection mechanism per SDK** (each differs — enumerate, don't assume):
+
+| Client | Mechanism |
+|---|---|
+| anthropic | `option.WithHTTPClient(c)` appended in `anthropic/provider.go` |
+| openai | `option.WithHTTPClient(c)` appended in `openai/provider.go` |
+| google genai | `genai.ClientConfig{HTTPClient: c}` in `google/provider.go` |
+| mistral | **maintainer's own SDK** (`github.com/VikingOwl91/mistral-go-sdk`): add a `WithHTTPClient` option if absent, then use it |
+| non-SDK paths | replace `http.DefaultClient` and ad-hoc `&http.Client{}` with the shared client in `router/discovery.go`, `router/probe.go`, `slm/backend.go`, `slm/download.go`, `slm/manager.go` |
+
+Plumb the shared client into providers by adding
+`HTTPClient *http.Client` to `provider.ProviderConfig`
+(`internal/provider/registry.go:8-16`) and setting it in
+`createProvider`. The non-SDK paths take the client via their existing
+constructors / a package-level setter.
+
+> The non-SDK paths are the trap: if any is missed it punches a hole in
+> the allowlist. Treat the list above as a checklist; add a grep test
+> (see Testing) that fails if `http.DefaultClient` or an ad-hoc
+> `&http.Client{}` reappears.
+
+### Three-stage rollout (not a single "block everything" default)
+
+**Learn.** First runs log every egress destination per `(project,
+agent, tool)` tuple to the per-project store **without blocking**.
+Reuse the audit JSONL discipline (append-only, best-effort, incognito-gated).
+
+**Review.** `gnoma firewall review` surfaces the captured set; the user
+marks each destination `allow | deny | scoped` (scoped = only reachable
+by named tool/agent). Persist to `.gnoma/firewall/allowlist.toml`
+(project) — subject to the same `omitempty`/atomic-write discipline as
+the config-migration plan (`2026-05-24-config-migration.md`) to avoid
+the zero-spam corruption class.
+
+**Enforce.** When mode is `enforce`, unrecognised destinations are
+blocked with a clear violation logged to the **same per-session
+`audit.jsonl`** (new `AuditEvent.Action = "egress_block"`). Mode is
+`[security.egress].mode = "off" | "learn" | "enforce"`, default `off`
+(opt-in; shipping `enforce` on by default would break first-run UX).
+
+### Baseline allowlist (curated, ship-in-binary)
+
+`baseline.go` seeds the allowlist so Enforce mode is usable immediately:
+
+- **Package ecosystems:** github.com, registry.npmjs.org, pypi.org,
+  files.pythonhosted.org, crates.io, static.crates.io,
+  registry-1.docker.io, proxy.golang.org, sum.golang.org.
+- **Model providers:** anthropic, openai, google, mistral, **minimax**
+  (per `2026-06-04-minimax-provider.md`) — host set derived from the
+  effective `[provider.endpoints]` map so user-configured local
+  ollama/llamacpp endpoints are auto-allowed.
+
+The painful middle ground is SDK egress (sentry, stripe, supabase,
+datadog…). These break a naive "block unknown" default, which is
+exactly why Learn → Review → Enforce is the only flow that scales.
+
+### Per-tool scoping
+
+`scoped` destinations carry an allowed-tool/agent set. Enforcement
+checks the calling context — the engine already knows which tool is
+running (it threads per-tool context for redaction logging today). Pass
+the tool/agent identity into `EgressGuard.Decide(host, callerCtx)`.
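+
+A rough sketch of how the three modes and the `scoped` decision could
+combine in `Decide`. The `Rule` type, the `Learned` map, and the
+string-typed tool identity are assumptions made for the example; the
+real `callerCtx` and persistence are not modelled.
+
+```go
+package main
+
+import "fmt"
+
+type Mode string
+
+const (
+	ModeOff     Mode = "off"
+	ModeLearn   Mode = "learn"
+	ModeEnforce Mode = "enforce"
+)
+
+// Rule is one reviewed destination. A nil Tools set means "allowed for
+// every caller"; a non-nil set is the scoped case.
+type Rule struct {
+	Deny  bool
+	Tools map[string]bool
+}
+
+type EgressGuard struct {
+	Mode    Mode
+	Rules   map[string]Rule // keyed by host
+	Learned map[string]bool // "host|tool" pairs seen in learn mode
+}
+
+// Decide reports whether tool may reach host under the current mode.
+func (g *EgressGuard) Decide(host, tool string) bool {
+	switch g.Mode {
+	case ModeLearn:
+		if g.Learned == nil {
+			g.Learned = make(map[string]bool)
+		}
+		g.Learned[host+"|"+tool] = true // record, never block
+		return true
+	case ModeEnforce:
+		r, ok := g.Rules[host]
+		if !ok || r.Deny {
+			return false // unknown or explicitly denied
+		}
+		return r.Tools == nil || r.Tools[tool]
+	default:
+		return true // "off": nothing is checked
+	}
+}
+
+func main() {
+	g := &EgressGuard{Mode: ModeEnforce, Rules: map[string]Rule{
+		"github.com":     {},
+		"api.stripe.com": {Tools: map[string]bool{"billing": true}},
+	}}
+	fmt.Println(g.Decide("github.com", "bash"))
+	fmt.Println(g.Decide("api.stripe.com", "bash"))
+	fmt.Println(g.Decide("api.stripe.com", "billing"))
+}
+```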
+
+---
+
+## Interactions
+
+- **Incognito:** Learn-mode writes are gated by incognito exactly like
+  the audit log (`IncognitoMode.ShouldLogContent`). Enforcement still
+  applies in incognito (security is not relaxed); only the *learning*
+  persistence is suppressed.
+- **Config layering:** the allowlist file is a new corruption surface —
+  follow `2026-05-24-config-migration.md` #1 discipline.
+- **SafeProvider:** egress is orthogonal to the content `SafeProvider`
+  wrap; it lives one layer down at the transport. Both must hold.
+
+---
+
+## Touch-points (file:line)
+
+| Change | Location |
+|---|---|
+| New egress package | `internal/security/egress/` |
+| `HTTPClient` field | `internal/provider/registry.go:8-16` |
+| Provider client injection | `anthropic/provider.go`, `openai/provider.go`, `google/provider.go`, `mistral/provider.go` |
+| mistral SDK `WithHTTPClient` | `github.com/VikingOwl91/mistral-go-sdk` (if absent) |
+| Non-SDK client swap | `router/discovery.go`, `router/probe.go`, `slm/backend.go`, `slm/download.go`, `slm/manager.go` |
+| `audit.go` egress action | `internal/security/audit.go` (`AuditEvent`) |
+| Config `[security.egress]` | `internal/config/config.go` (SecuritySection ~`:280-306`) |
+| `gnoma firewall` command | `cmd/gnoma/main.go` subcommand dispatch (~`:178`) |
+| Allowlist store | `.gnoma/firewall/allowlist.toml` |
+
+---
+
+## Testing (TDD — write first)
+
+- **Unit:**
+  - `EgressGuard.Decide`: off → always allow; learn → allow + record;
+    enforce → allow baseline/allowlisted, block unknown, scoped host
+    allowed only for the named tool.
+  - `GuardedDialer` blocks a non-allowlisted `host:port` before dial
+    (use a guard with a closed allowlist; assert no connection
+    attempt — inject a fake inner dialer that records calls).
+  - Baseline expansion: `[provider.endpoints]` hosts are auto-allowed;
+    a local ollama URL becomes an allowlist entry.
+  - Allowlist store round-trips without zero-spam corruption.
+  - `audit.jsonl` gains an `egress_block` record on a blocked dial.
+- **Grep/guard test:** fails if `http.DefaultClient` or an ad-hoc
+  `&http.Client{}` is used in provider/router/slm packages (prevents
+  regressions reopening the hole).
+- **Integration (`//go:build integration`):** with mode=enforce and a
+  minimal allowlist, a provider call to an allowed host succeeds and a
+  tool fetch to a blocked host fails with a logged violation.
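+
+One way the grep/guard test could work, sketched as a plain function
+so it can run outside `go test`. The pattern list and the name
+`findBypasses` are assumptions. A substring scan will not catch
+implicit uses such as `http.Get`, and the walk root would have to
+exclude the egress package itself, which legitimately builds the
+shared client.
+
+```go
+package main
+
+import (
+	"fmt"
+	"io/fs"
+	"os"
+	"path/filepath"
+	"strings"
+)
+
+// forbidden lists source patterns that bypass the shared client.
+var forbidden = []string{"http.DefaultClient", "&http.Client{"}
+
+// findBypasses walks root and returns "path: pattern" for every
+// non-test .go file that contains a forbidden pattern.
+func findBypasses(root string) ([]string, error) {
+	var hits []string
+	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
+		if err != nil {
+			return err
+		}
+		if d.IsDir() || !strings.HasSuffix(path, ".go") ||
+			strings.HasSuffix(path, "_test.go") {
+			return nil
+		}
+		src, err := os.ReadFile(path)
+		if err != nil {
+			return err
+		}
+		for _, pat := range forbidden {
+			if strings.Contains(string(src), pat) {
+				hits = append(hits, path+": "+pat)
+			}
+		}
+		return nil
+	})
+	return hits, err
+}
+
+func main() {
+	hits, err := findBypasses(".")
+	fmt.Println(hits, err)
+}
+```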
+
+### Acceptance criteria
+
+1. `mode="off"` (default) → behaviour identical to today.
+2. `mode="learn"` → every outbound host appears in the store; nothing
+   is blocked.
+3. `gnoma firewall review` lists learned hosts and persists
+   allow/deny/scoped decisions.
+4. `mode="enforce"` → baseline + allowlisted hosts reachable; an
+   un-allowlisted host is blocked with an `egress_block` line in
+   `.gnoma/sessions/<id>/audit.jsonl`.
+5. `gnoma firewall audit` prints this session's firewall events
+   (block/redact/egress) in a grep-friendly form. (Closes the
+   remaining audit-log gap.)
+6. Scoped destination reachable by its named tool only.
+
+---
+
+## TODO linkage
+
+Replaces the egress half of the "Security boundary — egress controls +
+session audit log" entry in `TODO.md`. Update that entry to mark the
+audit log implemented and link this file for the egress work.
@@ -0,0 +1,224 @@
+# MiniMax Provider — 2026-06-04
+
+Adds MiniMax (<https://platform.minimax.io>) as a first-class cloud
+provider so it can register as a router arm alongside
+anthropic/openai/google/mistral. Promotes the TODO.md entry
+"MiniMax provider — cloud arm + subscription token plan" out of
+bullet form into a phased design.
+
+---
+
+## Problem
+
+Gnoma has no MiniMax adapter. MiniMax ships strong, very cheap coding
+models (M2 family) that are a natural fit for the cheap-high-capability
+cloud tier the router already reasons about via `CostWeight`. Two facts
+make the integration cheap:
+
+1. MiniMax exposes **both** an OpenAI-compatible and an
+   Anthropic-compatible HTTP surface, so no new translation layer is
+   needed — gnoma already has both `internal/provider/openaicompat`
+   (built on the OpenAI SDK) and `internal/provider/anthropic` with a
+   working `BaseURL` override.
+2. `envKeyFor`'s default branch (`cmd/gnoma/main.go:1199-1200`) already
+   handles any unknown provider, so `minimax` resolves to
+   `MINIMAX_API_KEY` with no code change.
+
+The remaining work is wiring (a constructor + switch cases +
+enumerations), routing metadata (family defaults, rate limits), and a
+**design decision around the subscription billing model** that the
+router's metered-cost assumption does not currently handle.
+
+### External facts (VERIFY at implementation — MiniMax docs move fast)
+
+These were confirmed 2026-06-04 but the model lineup and pricing are
+revised frequently (a pricing overhaul landed 2026-06-02). Re-verify
+against the live docs before hardcoding anything:
+
+- **OpenAI-compatible base URL:** `https://api.minimax.io/v1`
+  (international). A separate region endpoint exists
+  (`api.minimaxi.com`); confirm the exact host + whether gnoma should
+  expose a region toggle. Docs:
+  <https://platform.minimax.io/docs/api-reference/text-openai-api>
+- **Anthropic-compatible endpoint:** exists ("two equivalent
+  endpoints, one mimics OpenAI, one mimics Anthropic"). Confirm the
+  exact path/host before choosing it over OpenAI-compat.
+- **Models (do NOT hardcode a single ID):** MiniMax-M2, M2.1, M2.5,
+  M2.7 (+ `-highspeed` variants), M3. Coding-relevant default is the
+  current M2-coding model — at time of writing M2.5 for PAYG, M2.1 for
+  the subscription plan. **Treat the default as config, not a
+  constant**, and call `Models(ctx)` to enumerate live.
+- **Pricing (PAYG, for `CostPer1k*` metadata):** M2.7 ≈ $0.30 / MTok
+  input, $1.20 / MTok output; highspeed ≈ 2×. Convert to the EUR
+  per-1k convention used by the Arm struct. Docs:
+  <https://platform.minimax.io/docs/guides/pricing-token-plan>
+- **Subscription:** "Token Plan" (current; supersedes the former
+  "Coding Plan"). Flat-rate prompt quota over a rolling window
+  (published M2.7 limits 1,500–30,000 requests / 5h across tiers).
+  Same Bearer key as PAYG.
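+
+The unit conversion mentioned under Pricing is simple but easy to get
+wrong by a factor of 1000, so here is a worked sketch. The exchange
+rate is a placeholder, not a real quote, and `perMTokToPer1k` is a
+hypothetical helper. A price per million tokens divided by 1000 gives
+the price per thousand tokens: $0.30 / MTok is $0.0003 per 1k.
+
+```go
+package main
+
+import "fmt"
+
+// perMTokToPer1k converts a price quoted per million tokens into a
+// price per thousand tokens. rate is the value of one unit of the
+// source currency in the target currency (1 for no conversion).
+func perMTokToPer1k(pricePerMTok, rate float64) float64 {
+	return pricePerMTok / 1000 * rate
+}
+
+func main() {
+	const usdToEUR = 0.90 // placeholder rate, NOT a real quote
+	fmt.Printf("input:  %.6f EUR per 1k\n", perMTokToPer1k(0.30, usdToEUR))
+	fmt.Printf("output: %.6f EUR per 1k\n", perMTokToPer1k(1.20, usdToEUR))
+}
+```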
+
+---
+
+## Non-goals
+
+- **A bespoke MiniMax SDK / translation layer.** We reuse the existing
+  OpenAI-compat (default) or Anthropic provider via `BaseURL`. If
+  MiniMax adds non-standard body fields, use the existing
+  `openai.NewWithStreamOptions` escape hatch (the same one Ollama uses).
+- **Region auto-detection.** Ship the international endpoint as the
+  default; the user can override via `[provider.endpoints]`. A region
+  toggle is a follow-up if anyone asks.
+- **Full subscription-quota accounting.** Phase 2 models subscription
+  cost as a coarse `CostWeight` zero-out, not a live quota meter.
+
+---
+
+## Decision: OpenAI-compat vs Anthropic-compat backing
+
+**Default to OpenAI-compat** (`internal/provider/openaicompat`). It is
+already exercised by the local backends (ollama/llamacpp), so the
+streaming, tool-call, and error paths are battle-tested in this repo.
+The Anthropic-compat endpoint is a fallback only if a MiniMax feature
+(e.g. extended thinking) is exposed solely through it. Keep the option
+open by making the backing selectable via config
+(`[provider.minimax].api = "openai" | "anthropic"`), defaulting to
+`openai`.
+
+---
+
+## Design
+
+### Phase 1 — provider wiring (smallest shippable slice)
+
+Goal: `gnoma --provider minimax` works against PAYG with metered
+pricing, registered as a cloud arm.
+
+1. **Constructor.** Add `NewMiniMax(cfg provider.ProviderConfig)
+   (provider.Provider, error)` to
+   `internal/provider/openaicompat/provider.go`, mirroring `NewOllama`
+   / `NewLlamaCpp` (`openaicompat/provider.go:18-49`):
+   - Default `BaseURL` to `https://api.minimax.io/v1` when unset (but
+     let `[provider.endpoints].minimax` override).
+   - Require a real API key (unlike Ollama's dummy key) — return an
+     error if `cfg.APIKey == ""`.
+   - Leave `MaxRetries` at the SDK default (cloud failures *are*
+     transient, unlike the local backends which force `0`).
+   - Default `cfg.Model` to the current coding model **read from
+     config**, not a baked constant.
+
+2. **Construction switch.** Add `case "minimax": return
+   openaicompat.NewMiniMax(cfg)` to `createProvider`
+   (`cmd/gnoma/main.go:1265-1280`). If `[provider.minimax].api =
+   "anthropic"`, route to `anthropicprov.New(cfg)` with `cfg.BaseURL`
+   set to the anthropic-compat host instead.
+
+3. **Provider enumerations.** Add `"minimax"` to:
+   - the known-providers set (`main.go:233-236`),
+   - the available-providers usage string (`main.go:1279`),
+   - NOT the local-providers set (it is a cloud arm).
+
+4. **API key (optional friendliness).** `envKeyFor`'s default already
+   yields `MINIMAX_API_KEY`. Add an explicit `case "minimax"` in
+   `envKeyFor` (`main.go:1189-1201`) only if we want alternates (e.g.
+   `MINIMAX_GROUP_ID` if the account requires a group id header —
+   VERIFY whether MiniMax needs a group id alongside the key; if so,
+   thread it through `ProviderConfig.Options`).
+
+5. **Family defaults.** Add MiniMax model families to
+   `knownFamilyDefaults` in `internal/router/defaults.go` (pattern at
+   `defaults.go:212-239`). Cloud arm → no `MaxComplexity` ceiling. Set
+   `Strengths` (`TaskGeneration`, `TaskRefactor`, `TaskDebug` are the
+   coding sweet spot) and a low `CostWeight` (~0.8–1.0 — cheap arm, so
+   the cost penalty is small) plus `CostPer1kInput/Output` from the
+   verified PAYG pricing.
+
+6. **Rate limits.** Add a `minimaxDefaults()` entry in
+   `internal/provider/ratelimits.go` (pattern at the anthropic block
+   ~`ratelimits.go:109-130`) and wire it into the `DefaultRateLimits`
+   switch. Use the published PAYG RPM/TPM; allow `[rate_limits.minimax]`
+   config overrides (the existing override path in `resolveRateLimitPools`).
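+
+The constructor rules in step 1 could look roughly like this. The
+`ProviderConfig` struct below is a stand-in with only the fields the
+plan text mentions (`APIKey`, `BaseURL`, `Model`); the real
+`provider.ProviderConfig` and the delegation to the OpenAI-compat base
+provider are not shown, and `miniMaxDefaults` is a hypothetical name.
+
+```go
+package main
+
+import (
+	"errors"
+	"fmt"
+)
+
+// ProviderConfig is a stand-in for provider.ProviderConfig.
+type ProviderConfig struct {
+	APIKey  string
+	BaseURL string
+	Model   string // left as supplied: the default comes from config
+}
+
+const defaultMiniMaxBaseURL = "https://api.minimax.io/v1"
+
+// miniMaxDefaults applies the Phase 1 rules: reject an empty key, keep
+// an explicit BaseURL, otherwise fall back to the international host.
+func miniMaxDefaults(cfg ProviderConfig) (ProviderConfig, error) {
+	if cfg.APIKey == "" {
+		return cfg, errors.New("minimax: API key required (set MINIMAX_API_KEY)")
+	}
+	if cfg.BaseURL == "" {
+		cfg.BaseURL = defaultMiniMaxBaseURL
+	}
+	return cfg, nil
+}
+
+func main() {
+	cfg, err := miniMaxDefaults(ProviderConfig{APIKey: "example-key"})
+	fmt.Println(cfg.BaseURL, err)
+	_, err = miniMaxDefaults(ProviderConfig{})
+	fmt.Println(err)
+}
+```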
+
+### Phase 2 — subscription (Token Plan) billing model
+
+The router's `CostWeight` math assumes metered per-token pricing. Under
+a Token Plan subscription, marginal cost is ≈0 until the quota is hit,
+then requests hard-fail. Design:
+
+1. **Billing knob.** `[provider.minimax].billing = "metered" |
+   "subscription"` (default `"metered"`). In `subscription` mode, set
+   the arm's `CostWeight` to 0 (or `CostPer1k*` to 0) so the selector
+   treats MiniMax as free while quota remains.
+
+2. **Quota-exhaustion failover.** MiniMax returns a quota/429 error
+   when the plan is exhausted. Map it to the existing rate-limit
+   backoff path (`Arm.BackoffUntil`, the 429 handling that already
+   disables an arm temporarily) so the bandit fails over to the next
+   arm cleanly. This ties into the session error-recovery work landed
+   in `0d3d190`. Confirm the exact error shape MiniMax returns and add
+   a classifier in `internal/provider/errors.go`.
+
+3. **Docs.** Document both plans + the region split in
+   `docs/slm-backends.md` (or a new provider doc) and the README
+   provider list.
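+
+A small sketch of the Phase 2 behaviour, assuming quota exhaustion
+surfaces as HTTP 429 (the plan says the exact error shape still has to
+be confirmed). The `Arm` struct is a stand-in that borrows the field
+names `CostWeight` and `BackoffUntil` from the text; `applyBilling`,
+`onStatus`, and `available` are hypothetical.
+
+```go
+package main
+
+import (
+	"fmt"
+	"net/http"
+	"time"
+)
+
+// Arm is a stand-in for the router's arm type.
+type Arm struct {
+	Name         string
+	CostWeight   float64
+	BackoffUntil time.Time
+}
+
+// applyBilling zeroes the cost weight for subscription billing so the
+// selector treats the arm as free while quota remains.
+func applyBilling(a *Arm, billing string) {
+	if billing == "subscription" {
+		a.CostWeight = 0
+	}
+}
+
+// onStatus parks the arm after a 429 so the router can fail over.
+func onStatus(a *Arm, status int, now time.Time, cooldown time.Duration) {
+	if status == http.StatusTooManyRequests {
+		a.BackoffUntil = now.Add(cooldown)
+	}
+}
+
+// available reports whether the arm may be selected at time now.
+func (a *Arm) available(now time.Time) bool {
+	return !now.Before(a.BackoffUntil)
+}
+
+func main() {
+	arm := &Arm{Name: "minimax", CostWeight: 0.9}
+	applyBilling(arm, "subscription")
+	now := time.Now()
+	onStatus(arm, http.StatusTooManyRequests, now, 5*time.Minute)
+	fmt.Println(arm.CostWeight, arm.available(now))
+}
+```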
+
+---
+
+## Touch-points (file:line)
+
+| Change | Location |
+|---|---|
+| `NewMiniMax` constructor | `internal/provider/openaicompat/provider.go` (after `:49`) |
+| Construction switch case | `cmd/gnoma/main.go:1265-1280` |
+| Known-providers set | `cmd/gnoma/main.go:233-236` |
+| Usage string | `cmd/gnoma/main.go:1279` |
+| `envKeyFor` (optional) | `cmd/gnoma/main.go:1189-1201` |
+| Family defaults | `internal/router/defaults.go:212-239` |
+| Rate-limit defaults | `internal/provider/ratelimits.go` (+ `DefaultRateLimits` switch) |
+| Error classifier (Phase 2) | `internal/provider/errors.go` |
+| Config: `[provider.minimax]` | `internal/config/config.go` (provider section) |
+
+The `Provider` interface contract to satisfy
+(`internal/provider/provider.go:136-148`): `Stream`, `Name`, `Models`,
+`DefaultModel`. All four come free by delegating to the OpenAI-compat
+base provider.
+
+---
+
+## Testing (TDD — write first)
+
+Per CLAUDE.md: table-driven, `//go:build integration` for anything
+hitting the live API.
+
+- **Unit (no network):**
+  - `NewMiniMax` defaults: empty `BaseURL` → `https://api.minimax.io/v1`;
+    empty key → error; `[provider.endpoints].minimax` override wins.
+  - `createProvider("minimax", …)` returns a non-nil provider; unknown
+    still errors.
+  - `envKeyFor("minimax") == "MINIMAX_API_KEY"`.
+  - `defaults.go`: a MiniMax model family resolves to the expected
+    `Strengths`/`CostWeight`; `MaxComplexity == 0`.
+  - `ratelimits.go`: `DefaultRateLimits("minimax").LookupModel(...)`
+    returns the configured limits; `"*"` fallback works.
+  - Phase 2: billing=`subscription` → arm `CostWeight == 0`; the
+    quota/429 error maps to a retryable/backoff classification.
+- **Integration (`//go:build integration`, real `MINIMAX_API_KEY`):**
+  a one-shot `Stream` against the cheapest model returns tokens;
+  `Models(ctx)` enumerates a non-empty list.
+
+### Acceptance criteria
+
+1. `MINIMAX_API_KEY=… gnoma --provider minimax -p "hello"` streams a
+   response in pipe mode.
+2. With no `--provider`, MiniMax appears as a selectable router arm and
+   is chosen for a cheap generation task when `prefer` allows cloud.
+3. `gnoma providers` lists `minimax`.
+4. Phase 2: with `billing="subscription"`, the selector prefers MiniMax
+   for eligible tasks; on simulated quota-exhaustion the router fails
+   over without surfacing an error to the user.
+
+---
+
+## TODO linkage
+
+Replaces the inline "MiniMax provider" bullet in `TODO.md` (In flight).
+Link this file from that entry.