docs(todo,plans): specs for open features + MiniMax & ACP
Add implementation-ready plans for the in-flight features that lacked one, and two new provider/protocol items:

- MiniMax provider (cloud arm + Token Plan billing decision)
- Agent Client Protocol (ACP), dual role: gnoma as ACP agent and as ACP client driving external agents as router arms
- Network egress allowlist (Learn/Review/Enforce); note the per-session audit log is already implemented, remaining gap is a viewer command
- Cross-platform (Windows/macOS) code touch-points + build-tag pattern
- Distribution follow-ups (cosign, brew tap, installer, dockers_v2)

Link each plan from its TODO.md entry; mark audit-log item done.
@@ -4,6 +4,86 @@ Active work, newest first.
|
||||
|
||||
## In flight
|
||||
|
||||
- **MiniMax provider — cloud arm + subscription token plan.** Add
|
||||
MiniMax (api.minimax.io / api.minimaxi.com) as a first-class cloud
|
||||
provider so it can register as a router arm alongside
|
||||
anthropic/openai/google/mistral.
|
||||
|
||||
**API surface.** MiniMax ships *two* HTTP surfaces, one
OpenAI-compatible and one Anthropic-compatible, so this is a
base-URL + auth wiring task, not a new translation layer:
|
||||
- **OpenAI-compatible** chat-completions at `…/v1` — reusable via
|
||||
`internal/provider/openaicompat`. Cleanest first cut: add a
|
||||
`NewMiniMax(cfg)` constructor mirroring `NewOllama` /
|
||||
`NewLlamaCpp` (`openaicompat/provider.go`) with the MiniMax base
|
||||
URL baked in, then a `case "minimax"` in
|
||||
`createProvider` (`cmd/gnoma/main.go:1265`) and the available-
|
||||
providers usage string (`:1279`).
|
||||
- **Anthropic-compatible** endpoint (`…/anthropic`) — alternative
|
||||
backing via the existing `anthropic` provider with a `BaseURL`
|
||||
override. Decide one canonical path; OpenAI-compat is the lower-
|
||||
risk default since `openaicompat` is already exercised by the
|
||||
local backends.
|
||||
- **Auth.** Bearer API key. `envKeyFor`'s default branch
|
||||
(`main.go:1199`) already resolves `MINIMAX_API_KEY` with no code
|
||||
change; add an explicit `case "minimax"` only if we want a
|
||||
friendlier name or alternates list.
|
||||
- **Models.** `MiniMax-M2` (agentic/coding, the one to default to),
|
||||
`MiniMax-M1`, abab6.5 series. Set `Strengths` + `MaxComplexity`
|
||||
+ `CostWeight` on the arm so the selector treats it as a cheap
|
||||
high-capability cloud tier.
|
||||
|
||||
**Token plan (open question — affects auth + billing UX).** MiniMax
|
||||
offers a flat-rate **Coding Plan** subscription (token-quota based,
|
||||
Claude-Max-style) *in addition to* metered pay-as-you-go API
|
||||
credits. Both authenticate with the same Bearer key, so no adapter
|
||||
difference — but the router's `CostWeight` math assumes metered
|
||||
per-token pricing. Under a subscription the marginal cost is ~0
|
||||
until the quota is hit, then hard-stops. Decisions to make:
|
||||
- How to model "subscription" cost in the selector — e.g. a
|
||||
`[provider.minimax].billing = "subscription" | "metered"` knob
|
||||
that zeroes `CostWeight` while quota remains, vs. real per-token
|
||||
cost when metered.
|
||||
- Quota exhaustion handling — surface the 429/quota error cleanly
|
||||
and let the bandit fail over to the next arm (ties into the
|
||||
session error-recovery work in `0d3d190`).
|
||||
- Document both plans + the region split (`api.minimax.io`
|
||||
international vs `api.minimaxi.com`) in `docs/slm-backends.md` /
|
||||
provider docs.
|
||||
|
||||
Smallest shippable slice: OpenAI-compat `NewMiniMax` + metered
|
||||
pricing, registered as a cloud arm. Subscription/quota modelling is
|
||||
the follow-up once the billing knob lands. Plan:
|
||||
[`docs/superpowers/plans/2026-06-04-minimax-provider.md`](docs/superpowers/plans/2026-06-04-minimax-provider.md).
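
A minimal sketch of the billing knob discussed above. The `Billing` type and the `effectiveCostWeight` helper are hypothetical names for illustration, not existing gnoma code; the sketch only shows one way a subscription could zero the cost weight while quota remains and report the arm as unavailable once the quota is exhausted, so the router can fail over:

```go
package main

import "fmt"

// Billing is a hypothetical enum for the proposed
// [provider.minimax].billing knob.
type Billing string

const (
	BillingMetered      Billing = "metered"
	BillingSubscription Billing = "subscription"
)

// effectiveCostWeight returns the cost weight the selector would see and
// whether the arm should be considered at all. meteredWeight stands in for
// the ordinary per-token-derived weight (assumed, not gnoma's real field).
func effectiveCostWeight(b Billing, meteredWeight float64, quotaRemaining bool) (weight float64, available bool) {
	if b != BillingSubscription {
		return meteredWeight, true
	}
	if quotaRemaining {
		return 0, true // marginal cost is ~0 while quota remains
	}
	return 0, false // quota hit: hard stop, let the next arm take over
}

func main() {
	for _, quota := range []bool{true, false} {
		w, ok := effectiveCostWeight(BillingSubscription, 0.4, quota)
		fmt.Printf("subscription quota=%v -> weight=%v available=%v\n", quota, w, ok)
	}
	w, ok := effectiveCostWeight(BillingMetered, 0.4, true)
	fmt.Printf("metered -> weight=%v available=%v\n", w, ok)
}
```

Returning availability separately from the weight keeps "free" and "exhausted" distinct; how quota state is actually detected (for example from a 429 response) is left open here, as in the plan.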
|
||||
|
||||
- **Agent Client Protocol (ACP) support.** Run gnoma as an *ACP agent*
|
||||
(`gnoma acp`) so any ACP-capable editor (Zed, Kiro, OpenCode, …) can
|
||||
drive it as an external coding agent. ACP is "the LSP for AI coding
|
||||
agents": JSON-RPC 2.0 over stdio, editor (client) spawns agent
|
||||
(subprocess). gnoma already owns the hard parts — agentic engine,
|
||||
tools, permissions, and JSON-RPC-over-stdio (from its MCP-client
|
||||
side, `internal/mcp/jsonrpc.go`). The fit is symmetric: gnoma is the
|
||||
JSON-RPC *server* here. No Go SDK exists (official SDKs are
|
||||
TS/Python/Rust/Kotlin), so gnoma implements the wire protocol
|
||||
natively against the schema. `session/new` can declare `mcpServers`,
|
||||
so ACP and gnoma's existing MCP manager wire up in one handshake.
|
||||
|
||||
**Dual role — both directions:**
|
||||
1. **gnoma as ACP agent (server)** — `gnoma acp` over stdio so
|
||||
editors drive gnoma.
|
||||
2. **gnoma as ACP client** — gnoma spawns *external* ACP agents
|
||||
(Claude, Gemini CLI, Codex, …) and uses them as router-arm
|
||||
provider backends. This is the same shape as the existing
|
||||
`internal/provider/subprocess` CLI-agent arms
|
||||
(`cmd/gnoma/main.go:521-531`, `IsCLIAgent: true`) but over
|
||||
standardized ACP JSON-RPC — gaining structured tool-call
|
||||
surfacing, real turn/permission semantics, and cancellation
|
||||
that the current one-shot stream-json subprocess provider
|
||||
lacks (it sets `ToolUse:false` for agents without stream-json).
|
||||
|
||||
Upstream: <https://github.com/agentclientprotocol>. Plan:
|
||||
[`docs/superpowers/plans/2026-06-04-agent-client-protocol.md`](docs/superpowers/plans/2026-06-04-agent-client-protocol.md).
|
||||
|
||||
- **Config write/merge — silent corruption of layered configs.**
|
||||
`internal/config/write.go:setConfig` reads the existing TOML into a
|
||||
zero-valued `Config` struct, sets one field, and writes the entire
|
||||
@@ -159,11 +239,13 @@ Active work, newest first.
|
||||
with no per-host allowlist or dial-layer interception. Two follow-
|
||||
ups surfaced from the r/SideProject v0.3.0 launch thread
|
||||
(2026-05-24, `u/Secret_Theme3192`):
|
||||
1. **Per-session audit log of blocked/redacted events** —
|
||||
grep-able file at `.gnoma/sessions/<id>/audit.jsonl` so the
|
||||
user can answer "what did the firewall do this session?" in
|
||||
one command. Today the `slog` output goes to whatever sink is
|
||||
configured, with no per-session grouping.
|
||||
1. **Per-session audit log of blocked/redacted events** — ✅ JSONL
|
||||
writing **implemented**: `internal/security/audit.go` +
|
||||
wiring at `cmd/gnoma/main.go:685-691`
|
||||
(`.gnoma/sessions/<id>/audit.jsonl`), recorded from
|
||||
`firewall.go:152/173/186`. **Remaining gap:** no CLI to *read*
|
||||
it — a `gnoma firewall audit` viewer is folded into the egress
|
||||
plan (shares the `gnoma firewall` command surface).
|
||||
2. **Per-host egress allowlist (HTTP transport layer)** — design
|
||||
refined by `u/HarjjotSinghh` on the r/SideProject thread
|
||||
(2026-05-28). Three-stage rollout, not a single-shot
|
||||
@@ -195,6 +277,9 @@ Active work, newest first.
|
||||
"network egress gated"; corrected in the README scope note
|
||||
and the audit-log commit.
|
||||
|
||||
Egress plan (incl. the `gnoma firewall audit` viewer for item #1):
|
||||
[`docs/superpowers/plans/2026-06-04-egress-allowlist.md`](docs/superpowers/plans/2026-06-04-egress-allowlist.md).
|
||||
|
||||
- **Cross-platform support — Windows + macOS.** GoReleaser builds
|
||||
static binaries for `linux/darwin/windows × amd64/arm64` every
|
||||
release but only Linux is exercised at all today. Windows and
|
||||
@@ -244,6 +329,9 @@ Active work, newest first.
|
||||
least a TODO-linked acknowledgement in the post body so the
|
||||
thread sees gnoma takes the gaps seriously.
|
||||
|
||||
Plan (build-tag scaffolding + concrete code touch-points):
|
||||
[`docs/superpowers/plans/2026-06-04-cross-platform.md`](docs/superpowers/plans/2026-06-04-cross-platform.md).
|
||||
|
||||
- **Tool-router specialization (functiongemma)** — gated on telemetry,
|
||||
not committed. Phase A.2 adds did-switch-rate measurement to the
|
||||
two-stage `select_category` path; Phase A.3 (LoRA fine-tune of
|
||||
@@ -288,7 +376,8 @@ Active work, newest first.
|
||||
from `dockers` + `docker_manifests` to `dockers_v2` in
|
||||
`.goreleaser.yml` (collapses ~45 lines into one block but
|
||||
requires Dockerfile changes for the per-platform binary layout
|
||||
— deferred to its own commit before v0.3.0).
|
||||
— deferred to its own commit before v0.3.0). Plan:
|
||||
[`docs/superpowers/plans/2026-06-04-distribution-followups.md`](docs/superpowers/plans/2026-06-04-distribution-followups.md).
|
||||
|
||||
## Stable backlog (not in active phases)
|
||||
|
||||
|
||||
@@ -0,0 +1,375 @@
|
||||
# Agent Client Protocol (ACP) — 2026-06-04
|
||||
|
||||
Adds **both directions** of ACP to gnoma:
|
||||
|
||||
1. **gnoma as ACP agent (server)** — `gnoma acp` over stdio so any
|
||||
ACP-capable editor (Zed, Kiro, OpenCode, …) can drive gnoma as an
|
||||
external coding agent.
|
||||
2. **gnoma as ACP client** — gnoma spawns *external* ACP agents
|
||||
(Claude, Gemini CLI, Codex, …) and exposes them as router-arm
|
||||
provider backends, the standardized successor to the current
|
||||
`internal/provider/subprocess` CLI-agent arms.
|
||||
|
||||
Adds the TODO.md entry "Agent Client Protocol (ACP) support".
|
||||
|
||||
Upstream: <https://github.com/agentclientprotocol> ·
|
||||
spec <https://agentclientprotocol.com>
|
||||
|
||||
---
|
||||
|
||||
## Problem
|
||||
|
||||
ACP is "the LSP for AI coding agents": a JSON-RPC 2.0 protocol, spoken
|
||||
over stdio, that lets editors (clients) spawn agents (subprocesses) and
|
||||
talk to them in a standard way — eliminating point-to-point editor↔agent
|
||||
integrations. Zed, Kiro, OpenCode and others are clients; Claude, Gemini
|
||||
CLI, Codex ship as ACP agents.
|
||||
|
||||
Today gnoma is reachable only via its own TUI and pipe mode. It cannot
|
||||
plug into an editor's agent panel. Supporting ACP makes gnoma a drop-in
|
||||
agent inside any ACP client, which is a large distribution surface for
|
||||
near-zero ongoing cost — the protocol is stable and gnoma already owns
|
||||
all the hard parts (an agentic engine, tools, permissions, MCP).
|
||||
|
||||
### Why this is a natural fit
|
||||
|
||||
- gnoma already speaks **JSON-RPC over stdio** for MCP
|
||||
(`internal/mcp/jsonrpc.go` `Request`/`Notification`,
|
||||
`internal/mcp/transport*.go`) — that machinery is reusable for the
|
||||
ACP server side (gnoma is the *server* of the JSON-RPC channel here,
|
||||
the mirror of its MCP-client role).
|
||||
- The agentic loop is already factored behind
|
||||
`session.Session` (`internal/session/session.go:54`,
|
||||
`Local.Send`/`SendWithOptions` at `local.go:80-85`) driving
|
||||
`engine.Engine` (`internal/engine/engine.go`). ACP `session/prompt`
|
||||
maps onto one `Send`.
|
||||
- Permissions already route through a pluggable prompt function
|
||||
(`permission.NewChecker(mode, rules, promptFn)`,
|
||||
`cmd/gnoma/main.go:668`). ACP's `session/request_permission` callback
|
||||
is just another `promptFn` implementation.
|
||||
- ACP `session/new` can declare the `mcpServers` the agent should
|
||||
connect to — gnoma already has an MCP manager
|
||||
(`internal/mcp/manager.go`) to honour that in the same handshake.
|
||||
|
||||
### Role decision — both, server first
|
||||
|
||||
Both roles ship under this plan. Sequence them: **agent (server)
|
||||
first** — it's the larger distribution win and exercises the wire
|
||||
protocol end-to-end — then **client**, which reuses the same
|
||||
`internal/acp` protocol/types from the other side. They share the
|
||||
JSON-RPC framing, content-block translation, and capability structs;
|
||||
only the dispatch direction differs.
|
||||
|
||||
The client role is the standardized successor to
|
||||
`internal/provider/subprocess`: that package shells out to CLI agents
|
||||
with one-shot `--output-format stream-json` (or prompt-augmentation
|
||||
fallback), runs the agent's *own* loop with `--yolo`/`--trust`, and
|
||||
cannot surface structured tool calls (it sets `ToolUse:false` for
|
||||
agents lacking stream-json — see TODO "Native agy JSON output"). ACP
|
||||
fixes all of that: a persistent JSON-RPC session, structured
|
||||
`session/update` tool-call events, real permission round-trips, and
|
||||
cancellation.
|
||||
|
||||
### No Go SDK exists
|
||||
|
||||
Official SDKs are TypeScript, Python, Rust, Kotlin — **no Go**. gnoma
|
||||
implements the wire protocol natively against the published JSON
|
||||
schema. Pin the supported `protocolVersion` and the exact method set
|
||||
against the spec at implementation time (the protocol is young and
|
||||
still moving).
|
||||
|
||||
---
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **A full editor UI.** In agent mode gnoma renders nothing; the client
|
||||
owns the UI. gnoma emits `session/update` notifications and the client
|
||||
displays them.
|
||||
- **Replacing the TUI / pipe modes.** ACP agent mode is a third entry
|
||||
mode alongside them, not a replacement.
|
||||
- **Replacing `internal/provider/subprocess` outright.** The ACP-client
|
||||
provider is added alongside it; the stream-json subprocess path stays
|
||||
for agents that don't (yet) speak ACP. Deprecation is a later call.
|
||||
- **Custom transports.** stdio only (the ACP norm: local agent as a
|
||||
subprocess). No socket/HTTP transport.
|
||||
- **gnoma-drives-gnoma over ACP as the default.** gnoma's native
|
||||
providers/router remain the primary path; ACP-client arms are an
|
||||
additional backend source.
|
||||
|
||||
---
|
||||
|
||||
## Design
|
||||
|
||||
The two roles share one package (`internal/acp`): JSON-RPC framing,
|
||||
content-block translation, and the capability/handshake types are
|
||||
direction-agnostic. **Part A** is the agent (server) side; **Part B**
|
||||
is the client side. Build Part A first.
|
||||
|
||||
## Part A — gnoma as ACP agent (server)
|
||||
|
||||
### New entry mode: `gnoma acp`
|
||||
|
||||
Add a third mode beside TUI and pipe (mode is chosen near
|
||||
`cmd/gnoma/main.go:106-114`). Selected by an explicit `acp` subcommand
|
||||
(stdio is shared with the JSON-RPC channel, so it can't be
|
||||
TTY-autodetected the way TUI is). In ACP mode:
|
||||
|
||||
- **No banner, no TUI, no stdout chatter.** stdout/stdin are the
|
||||
JSON-RPC pipe; all human/diagnostic logging goes to **stderr** only
|
||||
(the firewall/audit slog sink must not write to stdout). Audit this
|
||||
carefully — any stray stdout write corrupts the protocol stream.
|
||||
- Reuse the existing session/engine/router/security construction; only
|
||||
the front-end loop differs.
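
The stdout/stderr split above can be sketched as follows. This is an illustration under assumptions, not gnoma's logger setup: `newACPLogger` and `writeFrame` are hypothetical helpers, and the one-JSON-message-per-line framing should be confirmed against the ACP spec. In real use the diagnostic sink would be `os.Stderr` and the protocol sink `os.Stdout`; buffers are used here so the separation is easy to inspect:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log/slog"
)

// newACPLogger builds a logger bound to the diagnostic sink only
// (os.Stderr in real use), so log output can never reach the protocol stream.
func newACPLogger(diag io.Writer) *slog.Logger {
	return slog.New(slog.NewTextHandler(diag, nil))
}

// writeFrame emits one JSON message on the protocol stream.
// json.Encoder.Encode appends a trailing newline after each value.
func writeFrame(proto io.Writer, msg any) error {
	return json.NewEncoder(proto).Encode(msg)
}

// demo sends one log line and one frame to separate sinks and returns both.
func demo() (proto, diag string) {
	var p, d bytes.Buffer
	logger := newACPLogger(&d)
	logger.Info("session started", "mode", "acp")
	_ = writeFrame(&p, map[string]any{
		"jsonrpc": "2.0",
		"id":      1,
		"result":  map[string]any{},
	})
	return p.String(), d.String()
}

func main() {
	proto, diag := demo()
	fmt.Print("protocol stream: ", proto)
	fmt.Print("diagnostic stream: ", diag)
}
```

Passing the sinks in explicitly, rather than reaching for the global streams inside helpers, is what makes a stdout purity test straightforward to write.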
|
||||
|
||||
### Package layout
|
||||
|
||||
```
internal/acp/
  protocol.go // ACP types: handshake, capabilities, content blocks (shared)
  jsonrpc.go // framing reused/forked from internal/mcp/jsonrpc.go (shared)
  content.go // ContentBlock <-> message.Message translation (shared)
  server.go // Part A: stdio JSON-RPC read loop; method dispatch
  session.go // Part A: ACP session <-> gnoma session.Session bridge
  permission.go // Part A: session/request_permission promptFn
  update.go // Part A: gnoma stream events -> session/update
  client.go // Part B: spawn external agent, drive the handshake/prompt
```
|
||||
|
||||
A separate `internal/provider/acp/` holds the **Part B provider**
|
||||
adapter (mirrors `internal/provider/subprocess/`), depending on
|
||||
`internal/acp/client.go`.
|
||||
|
||||
Reuse `internal/mcp/jsonrpc.go` framing if it generalises; otherwise
|
||||
fork the minimal envelope (it's tiny). Keep ACP types separate from MCP
|
||||
types — they are different protocols that happen to share JSON-RPC.
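
If the envelope is forked, it could look roughly like this. The `Envelope` and `RPCError` types and the `kind` helper are illustrative names, not the contents of `internal/mcp/jsonrpc.go`; the request/notification/response distinction follows JSON-RPC 2.0 (a request has both `method` and `id`, a notification has `method` only, a response has `id` only):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RPCError is the JSON-RPC 2.0 error object.
type RPCError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// Envelope is a minimal JSON-RPC 2.0 message that can hold any of the
// three message kinds; params and result stay raw for later decoding.
type Envelope struct {
	JSONRPC string           `json:"jsonrpc"`
	ID      *json.RawMessage `json:"id,omitempty"`
	Method  string           `json:"method,omitempty"`
	Params  json.RawMessage  `json:"params,omitempty"`
	Result  json.RawMessage  `json:"result,omitempty"`
	Error   *RPCError        `json:"error,omitempty"`
}

// kind classifies a raw message as "request", "notification" or "response".
func kind(raw []byte) (string, error) {
	var e Envelope
	if err := json.Unmarshal(raw, &e); err != nil {
		return "", err
	}
	if e.JSONRPC != "2.0" {
		return "", fmt.Errorf("unsupported jsonrpc version %q", e.JSONRPC)
	}
	switch {
	case e.Method != "" && e.ID != nil:
		return "request", nil
	case e.Method != "":
		return "notification", nil
	case e.ID != nil:
		return "response", nil
	}
	return "", fmt.Errorf("message has neither method nor id")
}

func main() {
	for _, raw := range []string{
		`{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}`,
		`{"jsonrpc":"2.0","method":"session/cancel","params":{}}`,
		`{"jsonrpc":"2.0","id":1,"result":{}}`,
	} {
		k, err := kind([]byte(raw))
		fmt.Println(k, err)
	}
}
```

Because the same envelope serves both directions, it fits the "shared" files in the layout above; only the dispatch on top of it differs between Part A and Part B.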
|
||||
|
||||
### Method handlers (agent side)
|
||||
|
||||
Map each ACP method to existing gnoma machinery. Pin exact shapes to the
|
||||
spec; the mapping is the contract:
|
||||
|
||||
| ACP method (client→agent) | gnoma handling |
|---|---|
| `initialize` | Reply with `agentCapabilities` (tools, MCP support, prompt streaming, permission modes), `agentInfo` (name "gnoma", `buildVersion`). Negotiate `protocolVersion`. |
| `session/new` | Build a `session.Local` (router, security, tools wired as in main). Honour `cwd` (run it through `safety.ClassifyCWD`), and connect any `mcpServers` the client declares via `internal/mcp/manager.go`. Return a `sessionId`. |
| `session/load` (if advertised) | Rehydrate from `internal/session` store (`SessionStore.Load`). Optional: only if we advertise the capability. |
| `session/prompt` | Translate ACP `ContentBlock`s → `message.Message`, call `Send`/`SendWithOptions`, stream results back as `session/update`, return the stop reason. |
| `session/cancel` (notification) | Cancel the in-flight turn's context. |
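
The mapping above could be wired as a method table. The sketch below is hypothetical: `handler`, `newDispatcher` and `dispatch` are illustrative names, and each handler is a stub that only names the gnoma step from the table rather than calling real gnoma code. The error codes `-32601` (method not found) and `-32603` (internal error) are the standard JSON-RPC 2.0 values:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcError mirrors the JSON-RPC 2.0 error object.
type rpcError struct {
	Code    int
	Message string
}

// handler processes the raw params of one ACP method.
type handler func(params json.RawMessage) (any, error)

// newDispatcher returns the client→agent method table. The stubs stand in
// for the real session/engine calls described in the table above.
func newDispatcher() map[string]handler {
	stub := func(step string) handler {
		return func(json.RawMessage) (any, error) { return step, nil }
	}
	return map[string]handler{
		"initialize":     stub("negotiate protocolVersion, advertise capabilities"),
		"session/new":    stub("build session.Local, classify cwd, connect mcpServers"),
		"session/load":   stub("rehydrate from the session store (only if advertised)"),
		"session/prompt": stub("translate ContentBlocks, call Send, stream session/update"),
		"session/cancel": stub("cancel the in-flight turn's context"),
	}
}

// dispatch looks up the handler and converts failures to JSON-RPC errors.
func dispatch(h map[string]handler, method string, params json.RawMessage) (any, *rpcError) {
	fn, ok := h[method]
	if !ok {
		return nil, &rpcError{Code: -32601, Message: "method not found: " + method}
	}
	res, err := fn(params)
	if err != nil {
		return nil, &rpcError{Code: -32603, Message: err.Error()}
	}
	return res, nil
}

func main() {
	h := newDispatcher()
	for _, m := range []string{"initialize", "session/prompt", "session/unknown"} {
		res, rerr := dispatch(h, m, nil)
		if rerr != nil {
			fmt.Printf("%s -> error %d: %s\n", m, rerr.Code, rerr.Message)
			continue
		}
		fmt.Printf("%s -> %v\n", m, res)
	}
}
```

A table keeps "advertised only if supported" simple: an unadvertised method such as `session/load` can just be left out of the map.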
|
||||
|
||||
Agent→client calls gnoma must make:
|
||||
|
||||
| ACP call (agent→client) | Trigger |
|---|---|
| `session/update` (notification) | Per engine stream event: assistant text deltas, tool-call start/args/result, plan/thoughts, token usage. Map gnoma's stream iterator (`Next/Current`) to update variants. |
| `session/request_permission` | gnoma's `permission.Checker` promptFn: instead of console `Scanln`, send this and await the client's allow/deny (with the ACP "allow once / always" options mapped to gnoma permission modes). |
| `fs/read_text_file`, `fs/write_text_file` | **If** we advertise client-side fs and the client supports it, route the `fs` tools through the client so edits show in the editor's buffers. Otherwise gnoma's own `internal/tool/fs` operates on disk directly. Decide per capability negotiation. |
|
||||
|
||||
### Streaming bridge
|
||||
|
||||
The engine produces a pull-based stream (`Next() / Current() / Err() /
Close()`). The ACP bridge consumes it and emits one `session/update` per
event. Backpressure: `session/update` is a fire-and-forget notification,
so the bridge never waits for a reply from the client; if the client
reads slowly, coalesce text deltas instead (config knob, default flush
per token).
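
A sketch of that bridge, under assumptions: the `Event` type, its `Kind` values, and the `flushEvery` parameter are invented for illustration (gnoma's real stream event types will differ), and `notify` stands in for sending a `session/update`. Only the `Next/Current/Err/Close` iterator shape comes from the text above:

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical stand-in for an engine stream event.
type Event struct {
	Kind string // "text" or anything else, e.g. "tool_call"
	Text string
}

// Stream is the pull-based iterator shape described above.
type Stream interface {
	Next() bool
	Current() Event
	Err() error
	Close() error
}

// bridge drains s and calls notify once per outgoing update. Consecutive
// text deltas are merged until flushEvery of them have accumulated;
// flushEvery <= 1 flushes per token. Non-text events flush pending text first.
func bridge(s Stream, flushEvery int, notify func(Event)) error {
	defer s.Close()
	var buf strings.Builder
	pending := 0
	flush := func() {
		if pending == 0 {
			return
		}
		notify(Event{Kind: "text", Text: buf.String()})
		buf.Reset()
		pending = 0
	}
	for s.Next() {
		ev := s.Current()
		if ev.Kind != "text" {
			flush() // keep ordering: text before the event that follows it
			notify(ev)
			continue
		}
		buf.WriteString(ev.Text)
		pending++
		if pending >= flushEvery {
			flush()
		}
	}
	flush()
	return s.Err()
}

// sliceStream is a test double that replays a fixed list of events.
type sliceStream struct {
	evs []Event
	i   int
}

func (s *sliceStream) Next() bool     { s.i++; return s.i <= len(s.evs) }
func (s *sliceStream) Current() Event { return s.evs[s.i-1] }
func (s *sliceStream) Err() error     { return nil }
func (s *sliceStream) Close() error   { return nil }

// collect runs the bridge over evs and returns the emitted updates.
func collect(evs []Event, flushEvery int) []Event {
	var out []Event
	_ = bridge(&sliceStream{evs: evs}, flushEvery, func(e Event) { out = append(out, e) })
	return out
}

func main() {
	evs := []Event{{"text", "a"}, {"text", "b"}, {"text", "c"}, {"tool_call", ""}}
	for _, e := range collect(evs, 2) {
		fmt.Printf("%s %q\n", e.Kind, e.Text)
	}
}
```

Flushing before every non-text event preserves ordering, which matters because the plan's tests assert that updates stream out in order.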
|
||||
|
||||
### Security & safety interplay
|
||||
|
||||
- The `SafeProvider` firewall boundary and the per-session audit log
|
||||
apply unchanged — ACP is a front-end, providers/tools sit behind the
|
||||
same security layer.
|
||||
- `safety.ClassifyCWD` runs on the `session/new` `cwd`; a `refuse`
|
||||
classification returns an ACP error rather than starting the session.
|
||||
- Egress allowlist (`2026-06-04-egress-allowlist.md`) applies as usual.
|
||||
- Incognito: expose a way to start an ACP session incognito (capability
|
||||
flag or `session/new` param) so editor-driven sessions can be
|
||||
non-persistent.
|
||||
|
||||
### MCP-in-ACP
|
||||
|
||||
When `session/new` lists `mcpServers`, spin them up through the existing
|
||||
manager so the editor's MCP config and gnoma's converge in one
|
||||
handshake (this is the headline ACP×MCP integration). gnoma's own
|
||||
config-level MCP servers still load too; merge, don't replace.
|
||||
|
||||
---
|
||||
|
||||
## Part B — gnoma as ACP client (external agents as router arms)
|
||||
|
||||
gnoma connects to external ACP agents and exposes each as a router-arm
|
||||
backend, the standardized successor to `internal/provider/subprocess`.
|
||||
gnoma plays the *client* (editor) side of the JSON-RPC channel.
|
||||
|
||||
### Provider adapter
|
||||
|
||||
Add `internal/provider/acp/` implementing the `provider.Provider`
|
||||
contract (`Stream`, `Name`, `Models`, `DefaultModel`) — the same surface
|
||||
the subprocess provider satisfies
|
||||
(`internal/provider/subprocess/provider.go:28-62`):
|
||||
|
||||
- **Spawn + handshake.** On first use (or at discovery), spawn the agent
|
||||
subprocess (`exec.CommandContext`, with the Windows/Unix process-group
|
||||
handling from `2026-06-04-cross-platform.md`), send `initialize` as the
|
||||
client, then `session/new` with gnoma's `cwd` and — crucially —
|
||||
gnoma's *own* MCP servers passed through as the `mcpServers` list so
|
||||
the external agent shares gnoma's tool surface.
|
||||
- **`Stream` → `session/prompt`.** Translate the gnoma `Request`
|
||||
messages into ACP `ContentBlock`s, send `session/prompt`, and turn the
|
||||
incoming `session/update` notifications back into gnoma's pull-based
|
||||
stream events (`EventTextDelta`, structured tool-call events, usage).
|
||||
This is the win over the subprocess provider: tool calls arrive
|
||||
**structured**, not as opaque `EventTextDelta` text.
|
||||
- **Permission callbacks.** The external agent sends
|
||||
`session/request_permission` to gnoma (now the client). Route these
|
||||
through gnoma's existing `permission.Checker` so the *user's* gnoma
|
||||
permission policy governs the sub-agent — a strict improvement over
|
||||
today's `--yolo`/`--trust` subprocess invocations that bypass gnoma's
|
||||
gate entirely.
|
||||
- **`fs/*` callbacks.** Route the agent's file reads/writes through
|
||||
gnoma's `internal/tool/fs` guard so the path-safety boundary still
|
||||
applies.
|
||||
- **Cancellation.** gnoma's turn-cancel sends ACP `session/cancel`.
|
||||
|
||||
### Discovery & registration
|
||||
|
||||
Mirror the subprocess flow (`cmd/gnoma/main.go:521-531`):
|
||||
|
||||
- Discover ACP agents from config (`[acp.agents]` — command + args +
|
||||
optional capability hints) and/or a known-agents table analogous to
|
||||
`subprocess/agent.go:60` (`knownAgents`).
|
||||
- Register each as a `router.Arm` (a new `IsACPAgent` flag, or reuse
|
||||
`IsCLIAgent` with a transport discriminant). Set `Capabilities` from
|
||||
the ACP `initialize` response — notably `ToolUse:true`, which the
|
||||
subprocess provider often can't claim.
|
||||
- Wrap in `security.WrapProvider(..., fwRef)` exactly like every other
|
||||
arm so the firewall + audit + egress boundaries hold.
|
||||
|
||||
### Relationship to the subprocess provider
|
||||
|
||||
Additive. Agents that speak ACP (Claude, Gemini CLI, Codex increasingly
|
||||
do) get the ACP arm; agents that only do one-shot stream-json keep the
|
||||
subprocess arm. Where both exist for one binary, prefer ACP. This also
|
||||
unblocks the "Native agy JSON output" backlog item for any agent that
|
||||
exposes ACP instead of `--output-format stream-json`.
|
||||
|
||||
---
|
||||
|
||||
## Touch-points (file:line)
|
||||
|
||||
**Part A — agent (server):**
|
||||
|
||||
| Change | Location |
|---|---|
| New ACP package | `internal/acp/` |
| Entry mode dispatch | `cmd/gnoma/main.go` (mode select ~`:106`, subcommand dispatch ~`:178`) |
| stdout→stderr log discipline | logger setup (`main.go:100-114`) |
| Session bridge | `internal/session` (`Session`/`Local`) |
| Permission callback | `internal/permission` checker promptFn (`main.go:645-668`) |
| Stream→update | engine stream iterator (`internal/engine`, `internal/stream`) |
| MCP per-session | `internal/mcp/manager.go` |
| JSON-RPC framing reuse | `internal/mcp/jsonrpc.go` |
|
||||
|
||||
**Part B — client (external agents as arms):**
|
||||
|
||||
| Change | Location |
|---|---|
| ACP-client provider | new `internal/provider/acp/` (mirrors `internal/provider/subprocess/`) |
| Client handshake/driver | `internal/acp/client.go` |
| Arm discovery + registration | `cmd/gnoma/main.go:521-531` (subprocess pattern), `[acp.agents]` config |
| Known-agents table | analogous to `internal/provider/subprocess/agent.go:60` |
| Arm flag | `router.Arm` (`IsACPAgent`, or `IsCLIAgent` + transport) |
| Security wrap | `security.WrapProvider(..., fwRef)` |
|
||||
|
||||
---
|
||||
|
||||
## Testing (TDD — write first)
|
||||
|
||||
- **Protocol unit tests (no real provider):**
|
||||
- `initialize` handshake: version negotiation, advertised
|
||||
capabilities are stable and accurate.
|
||||
- `session/new` → returns a sessionId; honours `cwd`; rejects a
|
||||
`refuse`-classified cwd with an ACP error.
|
||||
- `session/prompt` with a stubProvider: ContentBlocks translate in,
|
||||
`session/update`s stream out in order, correct stop reason.
|
||||
- `session/cancel` aborts the in-flight turn (context cancellation
|
||||
observed).
|
||||
- Permission: a tool call triggers `session/request_permission`; a
|
||||
"deny" response blocks the tool; "allow always" updates the mode.
|
||||
- **stdout purity test:** drive a full prompt and assert stdout
|
||||
contains *only* valid JSON-RPC frames (no banner/log leakage) — this
|
||||
is the most common ACP-agent bug.
|
||||
- **Conformance:** run gnoma against the upstream ACP test client /
|
||||
example client (Rust/TS) in a `//go:build integration` test if one is
|
||||
available; otherwise a recorded-transcript fixture.
|
||||
- **MCP-in-ACP:** `session/new` with an `mcpServers` entry spins the
|
||||
server up and its tools become callable in that session.
|
||||
- **Part B (client) unit tests** — drive a *fake ACP agent* (a small
|
||||
in-process JSON-RPC responder, the mirror of the agent-side tests):
|
||||
- Provider `Stream` performs `initialize`+`session/new`+`session/prompt`
|
||||
and yields gnoma stream events in order, with **structured** tool-call
|
||||
events (not opaque text).
|
||||
- An inbound `session/request_permission` is routed through
|
||||
`permission.Checker` and a deny blocks the call.
|
||||
- An inbound `fs/write_text_file` is mediated by the `internal/tool/fs`
|
||||
guard (a guarded path is refused).
|
||||
- Turn cancel emits `session/cancel`; the subprocess is reaped (tie to
|
||||
cross-platform process-group handling).
|
||||
- Discovery registers a fake ACP agent as an arm with `ToolUse:true`.
|
||||
- **Round-trip (loopback):** point gnoma's ACP-*client* at a `gnoma acp`
|
||||
*server* subprocess and run a prompt end-to-end — exercises both parts
|
||||
over a real stdio pipe.
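
The stdout purity check in the list above could be built on a helper like this one. It is a sketch, not gnoma test code: `checkFrames` is a hypothetical name, and it assumes one JSON message per line on stdout, which should be confirmed against the ACP spec before relying on it:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// checkFrames returns an error if any line of the captured stdout is not a
// JSON object carrying "jsonrpc":"2.0". Banners, log lines and blank lines
// all fail the check.
func checkFrames(stdout string) error {
	sc := bufio.NewScanner(strings.NewReader(stdout))
	// Raise the per-line limit; frames can exceed the 64 KiB default.
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20)
	line := 0
	for sc.Scan() {
		line++
		var msg struct {
			JSONRPC string `json:"jsonrpc"`
		}
		if err := json.Unmarshal(sc.Bytes(), &msg); err != nil {
			return fmt.Errorf("line %d is not a JSON object: %w", line, err)
		}
		if msg.JSONRPC != "2.0" {
			return fmt.Errorf("line %d is JSON but not a JSON-RPC 2.0 frame", line)
		}
	}
	return sc.Err()
}

func main() {
	clean := `{"jsonrpc":"2.0","id":1,"result":{}}` + "\n"
	leaky := "gnoma starting up\n" + clean
	fmt.Println("clean:", checkFrames(clean))
	fmt.Println("leaky:", checkFrames(leaky))
}
```

In the real test the input would be the captured stdout of a `gnoma acp` run driven through a full prompt turn.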
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
**Part A (agent/server):**
|
||||
|
||||
1. `gnoma acp` speaks the handshake and a full prompt turn over stdio.
|
||||
2. gnoma appears and works as an external agent in Zed (manual: add
|
||||
gnoma to Zed's external-agents config, run a prompt, approve a tool).
|
||||
3. Tool permission prompts surface in the client and gate execution.
|
||||
4. stdout carries only JSON-RPC; all logs go to stderr.
|
||||
5. Cancelling from the editor stops the turn.
|
||||
6. MCP servers declared by the client in `session/new` are available in
|
||||
that session.
|
||||
|
||||
**Part B (client):**
|
||||
|
||||
7. An external ACP agent configured under `[acp.agents]` appears as a
|
||||
router arm (`gnoma providers` lists it) with `ToolUse:true`.
|
||||
8. Routing a task to that arm runs a full turn via ACP, surfacing the
|
||||
sub-agent's tool calls **structured** in gnoma's stream.
|
||||
9. The sub-agent's permission requests are gated by the user's gnoma
|
||||
permission policy (not auto-approved).
|
||||
10. The sub-agent's file writes pass through gnoma's fs guard.
|
||||
11. Loopback: `gnoma acp` driven by gnoma's own ACP-client completes a
|
||||
prompt end-to-end.
|
||||
|
||||
---
|
||||
|
||||
## Open questions (resolve against the live spec at implementation)
|
||||
|
||||
- Exact `protocolVersion` to target and the precise capability struct
|
||||
shapes (the schema is the source of truth; pin a version).
|
||||
- Whether to advertise client-side `fs/*` (edits flow through the
|
||||
editor's buffers) vs. direct-disk fs tools — depends on parity and on
|
||||
how gnoma's `internal/tool/fs` guard composes with editor-mediated
|
||||
writes.
|
||||
- `session/load` support (needs our session store to round-trip the
|
||||
ACP transcript shape).
|
||||
- **(Part B)** How a sub-agent's own model/cost is represented in the
|
||||
router — an ACP arm's tokens are billed by *that* agent, so
|
||||
`CostWeight`/`CostPer1k*` are opaque. Likely model it like the
|
||||
subprocess arms (no metered cost; selection driven by `Strengths`).
|
||||
- **(Part B)** Lifecycle: spawn-per-session vs. a pooled long-lived
|
||||
agent process reused across turns; how cancellation and crashes are
|
||||
recovered (ties to session error-recovery, `0d3d190`).
|
||||
|
||||
---
|
||||
|
||||
## TODO linkage
|
||||
|
||||
New "Agent Client Protocol (ACP) support" entry in `TODO.md` (In
|
||||
flight) links here. Covers **both** roles: gnoma as ACP agent (Part A)
|
||||
and gnoma as ACP client driving external agents as router arms
|
||||
(Part B). Part B is the standardized successor to
|
||||
`internal/provider/subprocess` and overlaps the "Native agy JSON
|
||||
output" backlog item.
|
||||
@@ -0,0 +1,198 @@
|
||||
# Cross-Platform Support (Windows + macOS) — 2026-06-04
|
||||
|
||||
Makes the Windows and macOS binaries — which GoReleaser already builds
|
||||
for `linux/darwin/windows × amd64/arm64` but only Linux exercises —
|
||||
actually work and stay working. Promotes the TODO.md entry
|
||||
"Cross-platform support — Windows + macOS" into a phased design with
|
||||
concrete code touch-points.
|
||||
|
||||
This plan does not restate the TODO's r/devops question map (Phase 2
|
||||
table there stands). Its value-add is the **specific code locations**
|
||||
that need OS-conditional handling and the build-tag pattern to use.
|
||||
|
||||
---
|
||||
|
||||
## Problem
|
||||
|
||||
Only Linux is tested. The binaries ship for Windows/macOS untested, and
|
||||
the codebase has several hard Unix assumptions that will fail or
|
||||
silently misbehave off-Linux. The pattern to follow already exists:
|
||||
`internal/mcp/transport_{unix,windows}.go` split via build tags.
|
||||
|
||||
---
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **MSI installer, Authenticode/Gatekeeper signing.** Covered by
|
||||
`2026-06-04-distribution-followups.md` — those are packaging, not
|
||||
runtime correctness.
|
||||
- **Group Policy / Event Viewer integration.** Out of scope per the
|
||||
TODO; documentation-only.
|
||||
- **WSL-specific tuning.** WSL is Linux; it works today.
|
||||
|
||||
---
|
||||
|
||||
## Confirmed Unix-assumption defects (file:line)
|
||||
|
||||
### Critical — break core functionality on Windows
|
||||
|
||||
1. **Bash tool hardcodes `bash -c`.**
|
||||
`internal/tool/bash/bash.go:117` →
|
||||
`exec.CommandContext(ctx, "bash", "-c", command)`. No Windows shell.
|
||||
Alias harvesting (`internal/tool/bash/aliases.go:115,148`) hardcodes
|
||||
`/bin/bash` and splits the shell path on `/`.
|
||||
2. **Llamafile SLM startup hardcodes `sh`.**
|
||||
`internal/slm/manager.go:172` invokes `sh <llamafile>` (a Wine
|
||||
binfmt workaround). `sh` is absent on native Windows → `gnoma slm
|
||||
status/setup` fails outright.
|
||||
3. **MCP process-tree kill is a Windows stub.**
|
||||
`internal/mcp/transport_windows.go:10-18` — `setProcessGroup` is a
|
||||
no-op and `killProcessTree` calls `p.Kill()`, leaking any child
|
||||
processes an MCP server spawns. Unix version uses process groups
|
||||
(`transport_unix.go:11-18`).
|
||||
|
||||
### High — config/auth land in the wrong place off-Linux
|
||||
|
||||
4. **Config/data dirs assume XDG.**
|
||||
`internal/config/load.go:52-59` falls back to `~/.config`;
|
||||
`internal/slm/manager.go:25-35` falls back to `~/.local/share`. On
|
||||
Windows these should be `os.UserConfigDir()` (`%AppData%`) /
|
||||
`os.UserCacheDir()`. On macOS, native tools use
|
||||
`~/Library/Application Support`, though `~/.config` is tolerable;
|
||||
decide and document.
|
||||
5. **OAuth credential discovery is Unix-pathed.**
|
||||
`internal/provider/google/provider.go:188-204` hardcodes
|
||||
`~/.config/...` and `~/.gemini/...`. `expandHome` (`:114-129`)
|
||||
already handles `\`, but the path *set* is Unix-centric — Gemini/
|
||||
Antigravity creds on macOS/Windows won't be found.
|
||||
6. **No system-proxy support.** No `http.ProxyFromEnvironment` wiring
|
||||
found. Go's `http.DefaultTransport` reads `HTTP(S)_PROXY` env vars (a
custom `http.Transport` must set `Proxy` itself) but **not** the
|
||||
Windows system proxy / PAC. Corporate Windows networks rely on these.
|
||||
|
||||
### Medium — usability / safety classifier gaps
|
||||
|
||||
7. **Safety classifier path sets are incomplete.** `internal/safety/cwd.go` macOS system roots
|
||||
(`:185-210`) miss `/opt`, `/usr/local`; personal-dir detection
|
||||
(`:221-252`) misses Windows `%TEMP%`/`%APPDATA%` and macOS
|
||||
`~/Library/...`.
|
||||
8. **Terminal/ANSI.** TUI uses lipgloss/termenv (auto-detects), so
|
||||
modern Windows Terminal/PowerShell 7 are fine; legacy `conhost.exe`
|
||||
may mangle ANSI escape sequences. Verify, don't assume.
|
||||
|
||||
---
|
||||
|
||||
## Design
|
||||
|
||||
### Phase 0 — build-tag scaffolding
|
||||
|
||||
Adopt the existing `_unix.go` / `_windows.go` split (as in
|
||||
`internal/mcp`) for each defect that needs divergent behaviour. Prefer
|
||||
`runtime.GOOS` only for small inline branches (as
|
||||
`internal/safety/cwd.go:201` already does); use build tags when the
|
||||
implementation genuinely differs (shell selection, process kill).
|
||||
|
||||
### Phase 1 — smoke tests (unblocks the honest "did you test it?" answer)
|
||||
|
||||
Non-blocking GitHub Actions matrix (`windows-latest`, `macos-latest`,
|
||||
`ubuntu-latest`):
|
||||
|
||||
- `go build ./...` and `go test ./...` per OS (today the release
|
||||
workflow tests Linux only — `.github/workflows/release.yml`).
|
||||
- Post-release: download each archive, run `gnoma --version` and a
|
||||
stubbed `echo hi | gnoma --provider ollama` against a fake endpoint.
|
||||
Confirms the binary launches and the TUI doesn't crash.
|
||||
|
||||
This is the precondition the TODO names for posting to r/devops.
|
||||
|
||||
### Phase 2 — shell abstraction (defects #1, #2)
|
||||
|
||||
1. Introduce `internal/tool/bash/shell_unix.go` /
|
||||
`shell_windows.go` exposing `defaultShell() (name string, args
|
||||
[]string)` and a `quoteArg(string) string`:
|
||||
- Unix: `bash`/`$SHELL`, `-c`, POSIX quoting.
|
||||
- Windows: prefer `pwsh`/`powershell` with the appropriate
|
||||
`-Command` invocation and PowerShell quoting rules; fall back to
|
||||
`cmd /c`. Document the choice.
|
||||
2. Fix `aliases.go` to use `filepath.Base` instead of splitting on `/`,
|
||||
and skip alias harvesting on Windows shells that have no equivalent.
|
||||
3. Llamafile: on Windows, invoke the `.llamafile` (which is a valid
|
||||
Windows PE as well as a shell script) directly rather than via `sh`;
|
||||
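guard with a build tag. Windows may refuse to execute the file unless
it carries an `.exe` suffix; verify before relying on direct invocation.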
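
The shell-selection rule in step 1 and the `filepath.Base` fix in step 2 can be sketched in one file as below. This is a hedged illustration, not the planned implementation: it branches on a `goos` argument instead of build tags so it compiles everywhere, it simplifies the Unix side to plain `bash` (no `$SHELL` lookup), and the names `shellFor`, `shellName`, and the injected `lookPath` callback are hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
	"runtime"
	"strings"
)

// shellFor picks the shell binary and the arguments that precede the command
// string for the given GOOS. lookPath reports whether a binary is on PATH; it
// is injected so the selection can be tested without touching the host.
func shellFor(goos string, lookPath func(string) bool) (name string, args []string) {
	if goos != "windows" {
		return "bash", []string{"-c"}
	}
	// Prefer PowerShell 7 (pwsh), then Windows PowerShell, then cmd.
	for _, candidate := range []string{"pwsh", "powershell"} {
		if lookPath(candidate) {
			return candidate, []string{"-NoProfile", "-Command"}
		}
	}
	return "cmd", []string{"/c"}
}

// shellName returns the shell's base name without a trailing ".exe".
// filepath.Base splits on the host OS separator, which is the point of
// replacing a hand-rolled split on "/".
func shellName(path string) string {
	return strings.TrimSuffix(filepath.Base(path), ".exe")
}

func main() {
	onPath := func(bin string) bool {
		_, err := exec.LookPath(bin)
		return err == nil
	}
	name, args := shellFor(runtime.GOOS, onPath)
	fmt.Println(name, args, shellName("/usr/bin/zsh"))
}
```

In the real split, each branch of `shellFor` would live in its own `shell_unix.go` / `shell_windows.go` file; quoting rules are left out here because they differ per shell and need their own tests.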
|
||||
|
||||
### Phase 3 — process management (defect #3)
|
||||
|
||||
Implement Windows job objects via `golang.org/x/sys/windows` in
|
||||
`transport_windows.go` (and any other subprocess owner —
|
||||
`internal/provider/subprocess`, `internal/tool/bash`): create a job,
|
||||
assign the child, `TerminateJobObject` on close to reap the whole tree.
|
||||
Shared helper so MCP and bash tool both get tree-kill. (This is the
|
||||
same item the distribution TODO references.)
|
||||
|
||||
### Phase 4 — paths + proxy (defects #4, #5, #6)
|
||||
|
||||
1. Replace XDG fallbacks with `os.UserConfigDir()` / `os.UserCacheDir()`
|
||||
on Windows (keep XDG honoring on Unix). Centralise in one
|
||||
`configDir()` / `dataDir()` helper so it's not re-derived.
|
||||
2. Extend the OAuth credential path sets with OS-appropriate locations
|
||||
(macOS `~/Library/Application Support/...`, Windows `%AppData%/...`).
|
||||
3. Ensure every `http.Client` uses a transport with
|
||||
`Proxy: http.ProxyFromEnvironment`. For Windows system-proxy/PAC,
|
||||
document the env-var workaround now; optionally vendor a PAC-aware
|
||||
transport (e.g. `github.com/rapid7/go-get-proxied`) later. This
|
||||
overlaps the shared-client work in
|
||||
`2026-06-04-egress-allowlist.md` — do the proxy transport once, in
|
||||
the shared client.
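
A minimal sketch of the centralised `configDir()` helper from step 1, under stated assumptions: the `gnoma` subdirectory name comes from the acceptance criteria, macOS is treated like other Unix systems (the plan leaves that decision open), and the pure function `configDirFor` is a hypothetical name introduced so the rule can be tested without changing the environment.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// configDirFor applies the resolution rule: on Windows use the OS-native
// base (os.UserConfigDir, i.e. %AppData%); elsewhere honour XDG_CONFIG_HOME
// and fall back to ~/.config.
func configDirFor(goos, xdgConfigHome, home, nativeBase string) string {
	switch {
	case goos == "windows":
		return filepath.Join(nativeBase, "gnoma")
	case xdgConfigHome != "":
		return filepath.Join(xdgConfigHome, "gnoma")
	default:
		return filepath.Join(home, ".config", "gnoma")
	}
}

// configDir gathers the real inputs once so callers never re-derive them.
func configDir() (string, error) {
	native, err := os.UserConfigDir()
	if err != nil {
		return "", err
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return configDirFor(runtime.GOOS, os.Getenv("XDG_CONFIG_HOME"), home, native), nil
}

func main() {
	dir, err := configDir()
	if err != nil {
		fmt.Println("cannot resolve config dir:", err)
		return
	}
	fmt.Println(dir)
}
```

A `dataDir()` helper would follow the same shape with `XDG_DATA_HOME` and the data base the plan settles on.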
|
||||
|
||||
### Phase 5 — safety classifier + terminal (defects #7, #8)
|
||||
|
||||
Extend `internal/safety/cwd.go` system-root and personal-dir sets per
|
||||
OS; add a manual verification note for legacy Windows terminals.
|
||||
|
||||
---
|
||||
|
||||
## Touch-points (file:line)
|
||||
|
||||
| Defect | Location |
|---|---|
| Bash shell | `internal/tool/bash/bash.go:117`, `aliases.go:115,148` |
| Llamafile `sh` | `internal/slm/manager.go:172` |
| MCP kill stub | `internal/mcp/transport_windows.go:10-18` |
| Config/data dirs | `internal/config/load.go:52-59`, `internal/slm/manager.go:25-35` |
| OAuth paths | `internal/provider/google/provider.go:188-204` |
| Proxy | shared `http.Client` (see egress plan) |
| Safety classifier | `internal/safety/cwd.go:185-252` |
| CI matrix | `.github/workflows/` (new test job), `release.yml` |
|
||||
|
||||
---
|
||||
|
||||
## Testing (TDD — write first)
|
||||
|
||||
- **OS-gated unit tests** (run on each matrix OS):
|
||||
- `defaultShell()` returns a runnable shell per OS; `quoteArg`
|
||||
round-trips a value containing spaces/quotes through the real shell.
|
||||
- `configDir()`/`dataDir()` return the OS-correct base.
|
||||
- Job-object kill: spawn a child that spawns a grandchild; assert
|
||||
both are gone after `killProcessTree` (Windows).
|
||||
- `safety.ClassifyCWD` flags OS-appropriate system/personal dirs.
|
||||
- **Existing tests** that `t.Skip` on Windows
|
||||
(`internal/tool/fs/guard_test.go`,
|
||||
`internal/provider/subprocess/stream_test.go`) — audit whether the
|
||||
skip hides a real gap now that Windows is a target.
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
1. CI smoke matrix is green on `windows-latest` + `macos-latest`.
|
||||
2. `gnoma --version` and a stubbed pipe run succeed on a Windows runner.
|
||||
3. A bash-tool command with quoted args runs on Windows (PowerShell).
|
||||
4. An MCP server that spawns a child leaves no orphan after shutdown on
|
||||
Windows.
|
||||
5. Config lands in `%AppData%\gnoma` on Windows, `~/.config/gnoma` on
|
||||
Linux.
|
||||
|
||||
---
|
||||
|
||||
## TODO linkage
|
||||
|
||||
Promotes the "Cross-platform support — Windows + macOS" entry in
|
||||
`TODO.md`. The Phase-2 r/devops question table stays in the TODO as the
|
||||
public-facing answer map; link this plan for the implementation detail.
|
||||
@@ -0,0 +1,169 @@
|
||||
# Distribution Follow-ups — 2026-06-04
|
||||
|
||||
Hardens and broadens the release pipeline. v0.1.0+ already ships static
|
||||
archives (GitHub mirror releases) and multi-arch Docker images (GHCR)
|
||||
via GoReleaser. This plan covers the optional follow-ups listed under
|
||||
"Distribution — follow-ups" in TODO.md: signed checksums, Homebrew tap,
|
||||
`curl | sh` installer, release-note automation, and the
|
||||
`dockers`→`dockers_v2` migration.
|
||||
|
||||
---
|
||||
|
||||
## Current state (confirmed)
|
||||
|
||||
- **`.goreleaser.yml`:** 6-target build matrix (linux/darwin/windows ×
|
||||
amd64/arm64), CGO disabled, version injected via ldflags
|
||||
(`-X main.buildVersion/buildCommit/buildDate`; read at
|
||||
`cmd/gnoma/main.go:55-60`, printed at `:95-98`). Archives: tar.gz
|
||||
(zip on Windows). Checksums: plain SHA256 `checksums.txt`,
|
||||
**unsigned**. Docker: separate per-arch `dockers` blocks +
|
||||
`docker_manifests` for the multi-arch manifest. Release published to
|
||||
GitHub mirror (`release.github` owner `VikingOwl91`).
|
||||
- **`.github/workflows/release.yml`:** triggers on `v*` tags, sets up
|
||||
QEMU + Buildx, logs into GHCR with the built-in `GITHUB_TOKEN`, runs
|
||||
`go test ./...` (Linux only), then `goreleaser release --clean` with
|
||||
`GORELEASER_CURRENT_TAG` set. **No signing step.**
|
||||
- **`Dockerfile`:** distroless `static:nonroot`, copies the
|
||||
GoReleaser-built binary in. Architecture-agnostic (binary built
|
||||
before `COPY`).
|
||||
- **No** Homebrew tap, install script, or Makefile release target.
|
||||
|
||||
---
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **Authenticode (Windows) / Gatekeeper notarization (macOS) code
|
||||
signing.** These need a paid EV cert / Apple Developer account —
|
||||
tracked separately (the cross-platform TODO documents the
|
||||
"right-click → Unblock" workaround). Sigstore/cosign here is for
|
||||
*checksum* signing, which needs no paid cert.
|
||||
- **MSI installer.** Lives in the cross-platform plan, gated on demand.
|
||||
- **Changing the canonical repo flow.** PRs still go to the Gitea
|
||||
upstream; the GitHub mirror remains the release/CI surface.
|
||||
|
||||
---
|
||||
|
||||
## Design (independent work items — ship in any order)
|
||||
|
||||
### 1. Signed checksums (cosign / sigstore keyless)
|
||||
|
||||
Add a GoReleaser `signs` block that signs `checksums.txt` with cosign
|
||||
in **keyless** mode (OIDC via the GitHub Actions token — no stored
|
||||
private key, no cert cost):
|
||||
|
||||
- Add `cosign` install + `id-token: write` permission to
|
||||
`release.yml`.
|
||||
- GoReleaser `signs:` → `cmd: cosign`, `args: sign-blob` producing
|
||||
`checksums.txt.sig` + `.pem` (cert bundle) as release artifacts.
|
||||
- Document verification:
|
||||
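`cosign verify-blob --certificate ... --signature ... --certificate-identity ... --certificate-oidc-issuer ... checksums.txt`
(cosign 2.x requires the identity and issuer flags for keyless verification).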
|
||||
|
||||
Acceptance: a downloaded release verifies against the published
|
||||
signature + Rekor transparency log.
|
||||
|
||||
### 2. Homebrew tap
|
||||
|
||||
Create a tap repo (`VikingOwl91/homebrew-tap`) and add GoReleaser's
|
||||
`brews:` block targeting it. Needs a PAT with `contents:write` on the
|
||||
tap repo (the default `GITHUB_TOKEN` can't push to a *second* repo) —
|
||||
store as `HOMEBREW_TAP_TOKEN` secret. Formula installs the darwin/linux
|
||||
archives.
|
||||
|
||||
Acceptance: `brew install vikingowl91/tap/gnoma` installs a working
|
||||
binary on macOS + Linuxbrew; `gnoma --version` matches the tag.
|
||||
|
||||
### 3. `curl | sh` installer
|
||||
|
||||
Add `install.sh` (committed at repo root, served via the raw GitHub
|
||||
mirror) that:
|
||||
|
||||
- Detects OS/arch, maps to the GoReleaser archive name template
|
||||
(`gnoma_<ver>_<os>_<arch>.<ext>`).
|
||||
- Resolves the latest release via the GitHub API (or honours a pinned
|
||||
`GNOMA_VERSION`).
|
||||
- Downloads the archive **and** `checksums.txt`, verifies the SHA256
|
||||
before extracting (and the cosign signature if cosign is present).
|
||||
- Installs to `~/.local/bin` (or `$GNOMA_INSTALL_DIR`), prints a PATH
|
||||
hint.
|
||||
|
||||
Keep it POSIX-sh, no bashisms. Acceptance:
|
||||
`curl -fsSL <raw>/install.sh | sh` yields a runnable `gnoma` on a clean
|
||||
Linux + macOS box; checksum mismatch aborts.
|
||||
|
||||
### 4. Release-note automation
|
||||
|
||||
GoReleaser already generates a filtered changelog (excludes
|
||||
docs/test/chore/style). Enrich it:
|
||||
|
||||
- Group commits by Conventional-Commit type
|
||||
(`changelog.groups` with title regexes for feat/fix/perf/refactor).
|
||||
- Add a release header template pointing to the upstream Gitea repo and
|
||||
the install methods (brew / curl | sh / docker).
|
||||
|
||||
Acceptance: a tagged release's GitHub notes show grouped sections + an
|
||||
install snippet, with no docs/chore noise.
|
||||
|
||||
### 5. `dockers` → `dockers_v2` migration
|
||||
|
||||
Collapse the two per-arch `dockers` blocks + `docker_manifests` into a
|
||||
single `dockers_v2` block (GoReleaser's newer multi-platform builder).
|
||||
The current `Dockerfile` is architecture-agnostic (binary copied
|
||||
post-build), so verify whether `dockers_v2`'s expected per-platform
|
||||
binary layout needs a `Dockerfile` change or a `templates`/`extra_files`
|
||||
tweak — the TODO flags this as the reason it was deferred. Do it in its
|
||||
own commit; diff the resulting GHCR manifest against the current one to
|
||||
prove parity (same tags: `<ver>-amd64`, `<ver>-arm64`, `<ver>`,
|
||||
`latest`).
|
||||
|
||||
Acceptance: GHCR still publishes a multi-arch manifest with identical
|
||||
tags + labels; `docker pull --platform linux/arm64` works.
|
||||
|
||||
### 6. (Carry-over) Windows process-tree kill
|
||||
|
||||
Listed in this TODO bullet but it's a *runtime* concern — implemented in
|
||||
`2026-06-04-cross-platform.md` Phase 3 (job objects). Cross-linked here
|
||||
only so the TODO bullet's reference resolves.
|
||||
|
||||
---
|
||||
|
||||
## Touch-points (file:line)
|
||||
|
||||
| Item | Location |
|---|---|
| Signing, brews, changelog groups, dockers_v2 | `.goreleaser.yml` |
| cosign install, `id-token` perm, tap token | `.github/workflows/release.yml` |
| Installer | new `install.sh` (repo root) |
| Dockerfile (if dockers_v2 needs it) | `Dockerfile` |
| Tap repo | new `VikingOwl91/homebrew-tap` |
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
Distribution is config + scripts, so testing is mostly pipeline-level:
|
||||
|
||||
- **Dry run:** `goreleaser release --snapshot --clean` locally must
|
||||
produce signed checksums, brew formula, and the dockers_v2 manifest
|
||||
without publishing.
|
||||
- **install.sh:** a `shellcheck` gate + a CI job that runs it against
|
||||
the latest release on linux + macos runners and asserts
|
||||
`gnoma --version`.
|
||||
- **Checksum/signature negative test:** corrupt the archive → installer
|
||||
aborts; tampered checksums → cosign verify fails.
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
1. A tagged release publishes `checksums.txt` + `.sig` + `.pem`,
|
||||
verifiable with cosign keyless.
|
||||
2. `brew install vikingowl91/tap/gnoma` works on macOS.
|
||||
3. `curl -fsSL <raw>/install.sh | sh` works on clean Linux + macOS,
|
||||
with checksum verification.
|
||||
4. Release notes are grouped and carry install instructions.
|
||||
5. GHCR multi-arch manifest is unchanged after the dockers_v2 swap.
|
||||
|
||||
---
|
||||
|
||||
## TODO linkage
|
||||
|
||||
Promotes the "Distribution — follow-ups" entry in `TODO.md`. Link this
|
||||
file; the Windows job-object sub-item points at the cross-platform plan.
|
||||
@@ -0,0 +1,236 @@
|
||||
# Network Egress Allowlist — 2026-06-04
|
||||
|
||||
Adds a per-host network egress boundary to the security layer via a
|
||||
Learn → Review → Enforce rollout. Promotes the second half of the
|
||||
TODO.md entry "Security boundary — egress controls + session audit log"
|
||||
into a phased design.
|
||||
|
||||
---
|
||||
|
||||
## Status of the sibling item: per-session audit log — DONE
|
||||
|
||||
The first half of the TODO entry (per-session audit log of
|
||||
blocked/redacted events) is **already implemented**:
|
||||
|
||||
- `internal/security/audit.go` defines `AuditLogger` / `AuditEvent`,
|
||||
writing append-only JSONL at mode `0o600`, incognito-gated,
|
||||
best-effort (write failures never break the scan pipeline).
|
||||
- `cmd/gnoma/main.go:685-691` wires it to
|
||||
`<projectRoot>/.gnoma/sessions/<sessionID>/audit.jsonl`.
|
||||
- `internal/security/firewall.go` records events at `:152` (unicode
|
||||
sanitize), `:173` (block), `:186` (redact).
|
||||
|
||||
**Remaining audit-log gap:** there is no CLI surface to *read* it. The
|
||||
TODO's promise — answer "what did the firewall do this session?" in one
|
||||
command — needs a `gnoma firewall audit` subcommand (no `firewall`
|
||||
subcommand exists today; top-level commands are `providers`, `slm`,
|
||||
`router`, `profile`). That viewer is folded into the Review stage below since it
|
||||
shares the `gnoma firewall` command surface with `firewall review`.
|
||||
|
||||
The rest of this plan is the genuinely-unbuilt egress allowlist.
|
||||
|
||||
---
|
||||
|
||||
## Problem
|
||||
|
||||
The current `Firewall` is a **content** boundary only: it scans
|
||||
messages and tool results for secrets (regex + Shannon entropy) and
|
||||
redacts/blocks/warns. It does **not** enforce network egress. Outgoing
|
||||
HTTP uses stock clients with no per-host allowlist and no dial-layer
|
||||
interception, so a compromised tool, MCP server, or prompt-injected
|
||||
provider call can reach any host.
|
||||
|
||||
The README and v0.3.0 launch post oversold "network egress gated";
|
||||
this plan makes that claim true.
|
||||
|
||||
### Why this is hard: no egress chokepoint today
|
||||
|
||||
Outgoing HTTP is constructed in many places, none sharing a client:
|
||||
|
||||
- **Provider SDKs** each build their own `http.Client` internally:
|
||||
- anthropic (`internal/provider/anthropic/provider.go:36`,
|
||||
`anthropic.NewClient`)
|
||||
- openai (`internal/provider/openai/provider.go:46`, `oai.NewClient`)
|
||||
- mistral (`internal/provider/mistral/provider.go:33`,
|
||||
`mistralgo.NewClient`)
|
||||
- google genai (`internal/provider/google/provider.go:239,306`)
|
||||
- **Non-SDK direct calls** using `http.DefaultClient` or ad-hoc
|
||||
`&http.Client{}`:
|
||||
- `internal/router/discovery.go` (`:65,141,325,365`)
|
||||
- `internal/router/probe.go` (`:24,72`)
|
||||
- `internal/slm/backend.go` (`:266,294,316,343`)
|
||||
- `internal/slm/download.go` (`:22`)
|
||||
- `internal/slm/manager.go` (`:273`)
|
||||
|
||||
No custom `http.Client` is injected anywhere today. **But** every SDK
|
||||
supports injecting one (the in-house mistral SDK may first need a
`WithHTTPClient` option), which enables a single chokepoint.
|
||||
|
||||
---
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **TLS interception / MITM.** We allowlist by destination host, not by
|
||||
inspecting decrypted payloads. Content inspection stays the
|
||||
firewall's job.
|
||||
- **Blocking the provider SDKs' own retry/telemetry hosts by default.**
|
||||
Model-provider hosts are baseline-allowed (see below).
|
||||
- **Replacing the OS/network firewall.** This is an in-process
|
||||
application-level guard, defense-in-depth, not a substitute for real
|
||||
network controls. Document this honestly (the README over-claim is
|
||||
the cautionary tale).
|
||||
|
||||
---
|
||||
|
||||
## Design
|
||||
|
||||
### The chokepoint: one shared `http.Client` with a guarded dialer
|
||||
|
||||
Build a single `*http.Client` whose `Transport.DialContext` validates
|
||||
the destination against the allowlist **before** the connection is
|
||||
made. `DialContext` receives `host:port` pre-resolution, so host-based
|
||||
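matching works without DNS races. (When the transport routes through a
proxy, `DialContext` sees the proxy's address instead, so proxied
traffic needs the check at the `RoundTripper` layer.) Thread this
client everywhere.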
|
||||
|
||||
```
internal/security/egress/
  guard.go     // EgressGuard: mode + allowlist + Decide(host) ResultEnum
  dialer.go    // GuardedDialer wrapping net.Dialer.DialContext
  client.go    // HTTPClient(guard) *http.Client
  store.go     // learned-destinations persistence (per project)
  baseline.go  // curated ship-in-binary allowlist
```
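
The guarded dialer could look roughly like the sketch below. It assumes direct (non-proxied) connections, reduces the guard to an `allow(host)` callback, and uses hypothetical names throughout (`guardedDial`, `newGuardedClient`, `errEgressBlocked`, `isBlocked`); only the `net` and `net/http` calls are real standard-library APIs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// dialFunc matches the signature of http.Transport.DialContext.
type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

// errEgressBlocked is returned instead of dialing a disallowed host.
var errEgressBlocked = errors.New("egress: destination not allowlisted")

// guardedDial wraps inner so the allow check runs before any connection
// attempt. addr arrives as "host:port", still unresolved.
func guardedDial(allow func(host string) bool, inner dialFunc) dialFunc {
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		host, _, err := net.SplitHostPort(addr)
		if err != nil {
			return nil, err
		}
		if !allow(host) {
			return nil, fmt.Errorf("%w: %s", errEgressBlocked, host)
		}
		return inner(ctx, network, addr)
	}
}

// newGuardedClient builds the shared client around a real net.Dialer.
// Proxy is deliberately left unset in this sketch.
func newGuardedClient(allow func(host string) bool) *http.Client {
	d := &net.Dialer{Timeout: 10 * time.Second}
	return &http.Client{
		Transport: &http.Transport{DialContext: guardedDial(allow, d.DialContext)},
	}
}

// isBlocked reports whether err stems from the egress guard.
func isBlocked(err error) bool {
	return errors.Is(err, errEgressBlocked)
}

func main() {
	allow := func(host string) bool { return host == "api.anthropic.com" }
	client := newGuardedClient(allow)
	// The guard rejects this host before any socket is opened.
	_, err := client.Get("http://blocked.example/")
	fmt.Println("blocked:", isBlocked(err))
}
```

Because the check sits in `DialContext`, it also covers redirects: each new host triggers a fresh dial and therefore a fresh decision.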
|
||||
|
||||
**Injection mechanism per SDK** (each differs — enumerate, don't assume):
|
||||
|
||||
| Client | Mechanism |
|---|---|
| anthropic | `option.WithHTTPClient(c)` appended in `anthropic/provider.go` |
| openai | `option.WithHTTPClient(c)` appended in `openai/provider.go` |
| google genai | `genai.ClientConfig{HTTPClient: c}` in `google/provider.go` |
| mistral | **user's own SDK**: add `WithHTTPClient` option if absent (`github.com/VikingOwl91/mistral-go-sdk`), then use it |
| non-SDK paths | replace `http.DefaultClient` with the shared client in `router/discovery.go`, `router/probe.go`, `slm/backend.go`, `slm/download.go`, `slm/manager.go` |
|
||||
|
||||
Plumb the shared client into providers by adding
|
||||
`HTTPClient *http.Client` to `provider.ProviderConfig`
|
||||
(`internal/provider/registry.go:8-16`) and setting it in
|
||||
`createProvider`. The non-SDK paths take the client via their existing
|
||||
constructors / a package-level setter.
|
||||
|
||||
> The non-SDK paths are the trap: if any is missed it punches a hole in
|
||||
> the allowlist. Treat the list above as a checklist; add a grep test
|
||||
> (see Testing) that fails if `http.DefaultClient` reappears.
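
One way to build that guard test is to parse the source rather than grep it, so comments and string literals cannot trigger false positives. The sketch below is an assumption about how such a check might be written: `defaultClientUses` is a hypothetical helper, and it only recognises the package when it is imported under the name `http`.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample stands in for a source file that still uses the default client.
const sample = `package x

import "net/http"

func fetch(u string) (*http.Response, error) {
	return http.DefaultClient.Get(u)
}
`

// defaultClientUses returns the line numbers where src references
// http.DefaultClient as a selector expression.
func defaultClientUses(filename, src string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var lines []int
	ast.Inspect(file, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		pkg, isIdent := sel.X.(*ast.Ident)
		if isIdent && pkg.Name == "http" && sel.Sel.Name == "DefaultClient" {
			lines = append(lines, fset.Position(sel.Pos()).Line)
		}
		return true
	})
	return lines, nil
}

func main() {
	lines, err := defaultClientUses("sample.go", sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("http.DefaultClient referenced on lines:", lines)
}
```

A real test would walk the provider, router, and slm packages and fail on any hit. Package-level helpers such as `http.Get` also go through the default client, so the check would need to cover those calls too.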
|
||||
|
||||
### Three-stage rollout (not a single "block everything" default)
|
||||
|
||||
**Learn.** First runs log every egress destination per `(project,
|
||||
agent, tool)` tuple to the per-project store **without blocking**.
|
||||
Reuse the audit JSONL discipline (atomic, incognito-gated).
|
||||
|
||||
**Review.** `gnoma firewall review` surfaces the captured set; the user
|
||||
marks each destination `allow | deny | scoped` (scoped = only reachable
|
||||
by named tool/agent). Persist to `.gnoma/firewall/allowlist.toml`
|
||||
(project) — subject to the same `omitempty`/atomic-write discipline as
|
||||
the config-migration plan (`2026-05-24-config-migration.md`) to avoid
|
||||
the zero-spam corruption class.
|
||||
|
||||
**Enforce.** When mode is `enforce`, unrecognised destinations are
|
||||
blocked with a clear violation logged to the **same per-session
|
||||
`audit.jsonl`** (new `AuditEvent.Action = "egress_block"`). Mode is
|
||||
`[security.egress].mode = "off" | "learn" | "enforce"`, default `off`
|
||||
(opt-in; shipping `enforce` on by default would break first-run UX).
|
||||
|
||||
### Baseline allowlist (curated, ship-in-binary)
|
||||
|
||||
`baseline.go` seeds the allowlist so Enforce mode is usable immediately:
|
||||
|
||||
- **Package ecosystems:** github.com, registry.npmjs.org, pypi.org,
|
||||
files.pythonhosted.org, crates.io, static.crates.io,
|
||||
registry-1.docker.io, proxy.golang.org, sum.golang.org.
|
||||
- **Model providers:** anthropic, openai, google, mistral, **minimax**
|
||||
(per `2026-06-04-minimax-provider.md`) — host set derived from the
|
||||
effective `[provider.endpoints]` map so user-configured local
|
||||
ollama/llamacpp endpoints are auto-allowed.
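
Deriving the provider host set from the endpoints map might look like the sketch below. The map shape (provider name to base URL) and the function name `hostsFromEndpoints` are assumptions made for illustration; the plan does not specify how `[provider.endpoints]` is represented in memory.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
)

// hostsFromEndpoints extracts the hostnames from a provider-to-base-URL map
// so they can be merged into the baseline allowlist. Entries that fail to
// parse or carry no host are skipped rather than allowed.
func hostsFromEndpoints(endpoints map[string]string) []string {
	seen := map[string]bool{}
	for _, raw := range endpoints {
		u, err := url.Parse(raw)
		if err != nil || u.Hostname() == "" {
			continue
		}
		seen[u.Hostname()] = true
	}
	hosts := make([]string, 0, len(seen))
	for h := range seen {
		hosts = append(hosts, h)
	}
	sort.Strings(hosts) // stable order for logs and tests
	return hosts
}

func main() {
	endpoints := map[string]string{
		"ollama":  "http://localhost:11434/v1",
		"minimax": "https://api.minimax.io/v1",
	}
	fmt.Println(hostsFromEndpoints(endpoints))
}
```

Matching on hostname alone ignores the port; if the allowlist should pin ports as well, `u.Host` would be the value to keep.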
|
||||
|
||||
The painful middle ground is SDK egress (sentry, stripe, supabase,
|
||||
datadog…). These break a naive "block unknown" default, which is
|
||||
exactly why Learn → Review → Enforce is the only flow that scales.
|
||||
|
||||
### Per-tool scoping
|
||||
|
||||
`scoped` destinations carry an allowed-tool/agent set. Enforcement
|
||||
checks the calling context — the engine already knows which tool is
|
||||
running (it threads per-tool context for redaction logging today). Pass
|
||||
the tool/agent identity into `EgressGuard.Decide(host, callerCtx)`.
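
A sketch of how the three modes and scoped rules could combine in one decision function. All type and field names here (`Guard`, `Rule`, `Mode`, `Decision`) are hypothetical, the caller context is reduced to a tool name, and unknown modes are treated as `enforce` so a typo in config fails closed; the plan does not prescribe any of these details.

```go
package main

import "fmt"

// Mode mirrors the [security.egress].mode values from the plan.
type Mode string

const (
	ModeOff     Mode = "off"
	ModeLearn   Mode = "learn"
	ModeEnforce Mode = "enforce"
)

// Decision is the outcome of a single egress check.
type Decision int

const (
	Allow Decision = iota
	Block
)

// Rule is one reviewed destination. An empty Tools list means any caller
// may reach the host; a non-empty list scopes it to the named tools.
type Rule struct {
	Deny  bool
	Tools []string
}

// Guard holds the mode, the reviewed rules, and the hosts seen in learn mode.
type Guard struct {
	Mode    Mode
	Rules   map[string]Rule
	Learned map[string]bool
}

// Decide applies the mode first, then the per-host rule, then the scope.
func (g *Guard) Decide(host, tool string) Decision {
	switch g.Mode {
	case ModeOff:
		return Allow
	case ModeLearn:
		if g.Learned == nil {
			g.Learned = map[string]bool{}
		}
		g.Learned[host] = true // record, never block
		return Allow
	}
	// Enforce (and any unrecognised mode): unknown or denied hosts are blocked.
	rule, known := g.Rules[host]
	if !known || rule.Deny {
		return Block
	}
	if len(rule.Tools) == 0 {
		return Allow
	}
	for _, allowed := range rule.Tools {
		if allowed == tool {
			return Allow
		}
	}
	return Block
}

func main() {
	g := &Guard{Mode: ModeEnforce, Rules: map[string]Rule{
		"api.github.com": {Tools: []string{"web_fetch"}},
	}}
	fmt.Println(g.Decide("api.github.com", "web_fetch") == Allow)
	fmt.Println(g.Decide("api.github.com", "bash") == Allow)
}
```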
|
||||
|
||||
---
|
||||
|
||||
## Interactions
|
||||
|
||||
- **Incognito:** Learn-mode writes are gated by incognito exactly like
|
||||
the audit log (`IncognitoMode.ShouldLogContent`). Enforcement still
|
||||
applies in incognito (security is not relaxed); only the *learning*
|
||||
persistence is suppressed.
|
||||
- **Config layering:** the allowlist file is a new corruption surface —
|
||||
follow `2026-05-24-config-migration.md` #1 discipline.
|
||||
- **SafeProvider:** egress is orthogonal to the content `SafeProvider`
|
||||
wrap; it lives one layer down at the transport. Both must hold.
|
||||
|
||||
---
|
||||
|
||||
## Touch-points (file:line)
|
||||
|
||||
| Change | Location |
|---|---|
| New egress package | `internal/security/egress/` |
| `HTTPClient` field | `internal/provider/registry.go:8-16` |
| Provider client injection | `anthropic/provider.go`, `openai/provider.go`, `google/provider.go`, `mistral/provider.go` |
| mistral SDK `WithHTTPClient` | `github.com/VikingOwl91/mistral-go-sdk` (if absent) |
| Non-SDK client swap | `router/discovery.go`, `router/probe.go`, `slm/backend.go`, `slm/download.go`, `slm/manager.go` |
| `audit.go` egress action | `internal/security/audit.go` (`AuditEvent`) |
| Config `[security.egress]` | `internal/config/config.go` (SecuritySection ~`:280-306`) |
| `gnoma firewall` command | `cmd/gnoma/main.go` subcommand dispatch (~`:178`) |
| Allowlist store | `.gnoma/firewall/allowlist.toml` |
|
||||
|
||||
---
|
||||
|
||||
## Testing (TDD — write first)
|
||||
|
||||
- **Unit:**
|
||||
- `EgressGuard.Decide`: off → always allow; learn → allow + record;
|
||||
enforce → allow baseline/allowlisted, block unknown, scoped host
|
||||
allowed only for the named tool.
|
||||
- `GuardedDialer` blocks a non-allowlisted `host:port` before dial
|
||||
(use a guard with a closed allowlist; assert no connection
|
||||
attempt — inject a fake inner dialer that records calls).
|
||||
- Baseline expansion: `[provider.endpoints]` hosts are auto-allowed;
|
||||
a local ollama URL becomes an allowlist entry.
|
||||
- Allowlist store round-trips without zero-spam corruption.
|
||||
- `audit.jsonl` gains an `egress_block` record on a blocked dial.
|
||||
- **Grep/guard test:** fails if `http.DefaultClient` is used in
|
||||
provider/router/slm packages (prevents regressions reopening the
|
||||
hole).
|
||||
- **Integration (`//go:build integration`):** with mode=enforce and a
|
||||
minimal allowlist, a provider call to an allowed host succeeds and a
|
||||
tool fetch to a blocked host fails with a logged violation.
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
1. `mode="off"` (default) → behaviour identical to today.
|
||||
2. `mode="learn"` → every outbound host appears in the store; nothing
|
||||
is blocked.
|
||||
3. `gnoma firewall review` lists learned hosts and persists
|
||||
allow/deny/scoped decisions.
|
||||
4. `mode="enforce"` → baseline + allowlisted hosts reachable; an
|
||||
un-allowlisted host is blocked with an `egress_block` line in
|
||||
`.gnoma/sessions/<id>/audit.jsonl`.
|
||||
5. `gnoma firewall audit` prints this session's firewall events
|
||||
(block/redact/egress) in a grep-friendly form. (Closes the
|
||||
remaining audit-log gap.)
|
||||
6. Scoped destination reachable by its named tool only.
|
||||
|
||||
---
|
||||
|
||||
## TODO linkage
|
||||
|
||||
Replaces the egress half of the "Security boundary — egress controls +
|
||||
session audit log" entry in `TODO.md`. Update that entry to mark the
|
||||
audit log implemented and link this file for the egress work.
|
||||
@@ -0,0 +1,224 @@
|
||||
# MiniMax Provider — 2026-06-04
|
||||
|
||||
Adds MiniMax (<https://platform.minimax.io>) as a first-class cloud
|
||||
provider so it can register as a router arm alongside
|
||||
anthropic/openai/google/mistral. Promotes the TODO.md entry
|
||||
"MiniMax provider — cloud arm + subscription token plan" out of
|
||||
bullet form into a phased design.
|
||||
|
||||
---
|
||||
|
||||
## Problem
|
||||
|
||||
Gnoma has no MiniMax adapter. MiniMax ships strong, very cheap coding
|
||||
models (M2 family) that are a natural fit for the cheap-high-capability
|
||||
cloud tier the router already reasons about via `CostWeight`. Two facts
|
||||
make the integration cheap:
|
||||
|
||||
1. MiniMax exposes **both** an OpenAI-compatible and an
|
||||
Anthropic-compatible HTTP surface, so no new translation layer is
|
||||
needed — gnoma already has both `internal/provider/openaicompat`
|
||||
(built on the OpenAI SDK) and `internal/provider/anthropic` with a
|
||||
working `BaseURL` override.
|
||||
2. `envKeyFor`'s default branch (`cmd/gnoma/main.go:1199-1200`) already
|
||||
resolves `MINIMAX_API_KEY` for any unknown provider with no code
|
||||
change.
|
||||
|
||||
The remaining work is wiring (a constructor + switch cases +
|
||||
enumerations), routing metadata (family defaults, rate limits), and a
|
||||
**design decision around the subscription billing model** that the
|
||||
router's metered-cost assumption does not currently handle.
|
||||
|
||||
### External facts (VERIFY at implementation — MiniMax docs move fast)
|
||||
|
||||
These were confirmed 2026-06-04 but the model lineup and pricing are
|
||||
revised frequently (a pricing overhaul landed 2026-06-02). Re-verify
|
||||
against the live docs before hardcoding anything:
|
||||
|
||||
- **OpenAI-compatible base URL:** `https://api.minimax.io/v1`
|
||||
(international). A separate region endpoint exists
|
||||
(`api.minimaxi.com`); confirm the exact host + whether gnoma should
|
||||
expose a region toggle. Docs:
|
||||
<https://platform.minimax.io/docs/api-reference/text-openai-api>
|
||||
- **Anthropic-compatible endpoint:** exists ("two equivalent
|
||||
endpoints, one mimics OpenAI, one mimics Anthropic"). Confirm the
|
||||
exact path/host before choosing it over OpenAI-compat.
|
||||
- **Models (do NOT hardcode a single ID):** MiniMax-M2, M2.1, M2.5,
|
||||
M2.7 (+ `-highspeed` variants), M3. Coding-relevant default is the
|
||||
current M2-coding model — at time of writing M2.5 for PAYG, M2.1 for
|
||||
the subscription plan. **Treat the default as config, not a
|
||||
constant**, and call `Models(ctx)` to enumerate live.
|
||||
- **Pricing (PAYG, for `CostPer1k*` metadata):** M2.7 ≈ $0.30 / MTok
|
||||
input, $1.20 / MTok output; highspeed ≈ 2×. Convert to the EUR
|
||||
per-1k convention used by the Arm struct. Docs:
|
||||
<https://platform.minimax.io/docs/guides/pricing-token-plan>
|
||||
- **Subscription:** "Token Plan" (current; supersedes the former
|
||||
"Coding Plan"). Flat-rate prompt quota over a rolling window
|
||||
(published M2.7 limits 1,500–30,000 requests / 5h across tiers).
|
||||
Same Bearer key as PAYG.
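
The unit conversion for the pricing bullet is simple arithmetic: a price per million tokens divided by 1,000 gives the price per 1k tokens, which is then multiplied by a USD-to-EUR rate. The sketch below assumes a hypothetical helper name and uses a placeholder exchange rate that is not a quoted figure; the real rate and prices must come from verified sources.

```go
package main

import "fmt"

// per1kFromPerMTok converts a USD price per million tokens into a EUR price
// per thousand tokens. usdToEUR is supplied by the caller.
func per1kFromPerMTok(usdPerMTok, usdToEUR float64) float64 {
	return usdPerMTok / 1000 * usdToEUR
}

func main() {
	const placeholderRate = 0.90 // illustrative only, NOT a real quote
	fmt.Printf("input:  %.6f EUR per 1k tokens\n", per1kFromPerMTok(0.30, placeholderRate))
	fmt.Printf("output: %.6f EUR per 1k tokens\n", per1kFromPerMTok(1.20, placeholderRate))
}
```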
|
||||
|
||||
---
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **A bespoke MiniMax SDK / translation layer.** We reuse the existing
|
||||
OpenAI-compat (default) or Anthropic provider via `BaseURL`. If
|
||||
MiniMax adds non-standard body fields, use the existing
|
||||
`openai.NewWithStreamOptions` escape hatch (the same one Ollama uses).
|
||||
- **Region auto-detection.** Ship the international endpoint as the
|
||||
default; the user can override via `[provider.endpoints]`. A region
|
||||
toggle is a follow-up if anyone asks.
|
||||
- **Full subscription-quota accounting.** Phase 2 models subscription
|
||||
cost as a coarse `CostWeight` zero-out, not a live quota meter.
|
||||
|
||||
---
|
||||
|
||||
## Decision: OpenAI-compat vs Anthropic-compat backing
|
||||
|
||||
**Default to OpenAI-compat** (`internal/provider/openaicompat`). It is
|
||||
already exercised by the local backends (ollama/llamacpp), so the
|
||||
streaming, tool-call, and error paths are battle-tested in this repo.
|
||||
The Anthropic-compat endpoint is a fallback only if a MiniMax feature
|
||||
(e.g. extended thinking) is exposed solely through it. Keep the option
|
||||
open by making the backing selectable via config
|
||||
(`[provider.minimax].api = "openai" | "anthropic"`), defaulting to
|
||||
`openai`.
|
||||
|
||||
---
|
||||
|
||||
## Design
|
||||
|
||||
### Phase 1 — provider wiring (smallest shippable slice)
|
||||
|
||||
Goal: `gnoma --provider minimax` works against PAYG with metered
|
||||
pricing, registered as a cloud arm.
|
||||
|
||||
1. **Constructor.** Add `NewMiniMax(cfg provider.ProviderConfig)
|
||||
(provider.Provider, error)` to
|
||||
`internal/provider/openaicompat/provider.go`, mirroring `NewOllama`
|
||||
/ `NewLlamaCpp` (`openaicompat/provider.go:18-49`):
|
||||
- Default `BaseURL` to `https://api.minimax.io/v1` when unset (but
|
||||
let `[provider.endpoints].minimax` override).
|
||||
- Require a real API key (unlike Ollama's dummy key) — return an
|
||||
error if `cfg.APIKey == ""`.
|
||||
- Leave `MaxRetries` at the SDK default (cloud failures *are*
|
||||
transient, unlike the local backends which force `0`).
|
||||
- Default `cfg.Model` to the current coding model **read from
|
||||
config**, not a baked constant.
|
||||
|
||||
2. **Construction switch.** Add `case "minimax": return
|
||||
openaicompat.NewMiniMax(cfg)` to `createProvider`
|
||||
(`cmd/gnoma/main.go:1265-1280`). If `[provider.minimax].api =
|
||||
"anthropic"`, route to `anthropicprov.New(cfg)` with `cfg.BaseURL`
|
||||
set to the anthropic-compat host instead.
|
||||
|
3. **Provider enumerations.** Add `"minimax"` to:
   - the known-providers set (`main.go:233-236`),
   - the available-providers usage string (`main.go:1279`).

   Do NOT add it to the local-providers set (it is a cloud arm).

4. **API key (optional friendliness).** `envKeyFor`'s default already
   yields `MINIMAX_API_KEY`. Add an explicit `case "minimax"` in
   `envKeyFor` (`main.go:1189-1201`) only if we want alternates, e.g.
   `MINIMAX_GROUP_ID` if the account requires a group id header.
   VERIFY whether MiniMax needs a group id alongside the key; if so,
   thread it through `ProviderConfig.Options`.

5. **Family defaults.** Add MiniMax model families to
   `knownFamilyDefaults` in `internal/router/defaults.go` (pattern at
   `defaults.go:212-239`). Cloud arm → no `MaxComplexity` ceiling. Set
   `Strengths` (`TaskGeneration`, `TaskRefactor`, `TaskDebug` are the
   coding sweet spot) and a low `CostWeight` (~0.8–1.0: cheap arm, so
   the cost penalty is small) plus `CostPer1kInput/Output` from the
   verified PAYG pricing.

6. **Rate limits.** Add a `minimaxDefaults()` entry in
   `internal/provider/ratelimits.go` (pattern at the anthropic block
   ~`ratelimits.go:109-130`) and wire it into the `DefaultRateLimits`
   switch. Use the published PAYG RPM/TPM; allow `[rate_limits.minimax]`
   config overrides (the existing override path in `resolveRateLimitPools`).
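
The constructor rules in step 1 can be sketched as a pure defaulting function. This is an illustration under stated assumptions, not the repo's code: `ProviderConfig` here is a trimmed stand-in for `provider.ProviderConfig`, and `withMiniMaxDefaults`, `errMissingAPIKey`, and the placeholder model name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultMiniMaxBaseURL is the OpenAI-compat base URL named in the plan.
const defaultMiniMaxBaseURL = "https://api.minimax.io/v1"

// ProviderConfig is a hypothetical stand-in for provider.ProviderConfig;
// only the fields the plan mentions are modeled.
type ProviderConfig struct {
	BaseURL string // assumed to already hold [provider.endpoints].minimax, if set
	APIKey  string
	Model   string
}

// errMissingAPIKey is an assumed error value, not one taken from the repo.
var errMissingAPIKey = errors.New("minimax: API key is required")

// withMiniMaxDefaults fills unset fields. configModel stands for the coding
// model read from config, so no model name is baked in here.
func withMiniMaxDefaults(cfg ProviderConfig, configModel string) (ProviderConfig, error) {
	if cfg.APIKey == "" {
		return ProviderConfig{}, errMissingAPIKey
	}
	if cfg.BaseURL == "" {
		cfg.BaseURL = defaultMiniMaxBaseURL
	}
	if cfg.Model == "" {
		cfg.Model = configModel
	}
	return cfg, nil
}

func main() {
	cfg, err := withMiniMaxDefaults(ProviderConfig{APIKey: "example-key"}, "model-from-config")
	fmt.Println(cfg.BaseURL, cfg.Model, err)

	_, err = withMiniMaxDefaults(ProviderConfig{}, "model-from-config")
	fmt.Println(err)
}
```

Keeping the defaulting separate from client construction makes the unit tests listed under Testing possible without any network access.
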

### Phase 2: subscription (Token Plan) billing model

The router's `CostWeight` math assumes metered per-token pricing. Under
a Token Plan subscription, marginal cost is ≈0 until the quota is hit,
then requests hard-fail. Design:

1. **Billing knob.** `[provider.minimax].billing = "metered" |
   "subscription"` (default `"metered"`). In `subscription` mode, set
   the arm's `CostWeight` to 0 (or `CostPer1k*` to 0) so the selector
   treats MiniMax as free while quota remains.

2. **Quota-exhaustion failover.** MiniMax returns a quota/429 error
   when the plan is exhausted. Map it to the existing rate-limit
   backoff path (`Arm.BackoffUntil`, the 429 handling that already
   disables an arm temporarily) so the bandit fails over to the next
   arm cleanly. This ties into the session error-recovery work landed
   in `0d3d190`. Confirm the exact error shape MiniMax returns and add
   a classifier in `internal/provider/errors.go`.

3. **Docs.** Document both plans (PAYG and Token Plan) and the region
   split (`api.minimax.io` vs `api.minimaxi.com`) in
   `docs/slm-backends.md` (or a new provider doc) and the README
   provider list.
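
A rough sketch of how steps 1 and 2 might fit together. The `Arm` struct is a trimmed, hypothetical stand-in that keeps only the `CostWeight` and `BackoffUntil` names used in the plan; `applyBilling`, `isQuotaExhausted`, and `recordFailure` are illustrative names. The classifier assumes exhaustion surfaces as HTTP 429, which is exactly the point the plan says still needs confirming.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Arm is a hypothetical, trimmed-down router arm.
type Arm struct {
	Name         string
	CostWeight   float64
	BackoffUntil time.Time
}

// applyBilling zeroes the cost weight in subscription mode. Any other value,
// including the empty string, is treated as the "metered" default.
func applyBilling(arm *Arm, billing string) {
	if billing == "subscription" {
		arm.CostWeight = 0
	}
}

// isQuotaExhausted assumes quota exhaustion is signaled by HTTP 429.
// The real MiniMax error shape is unconfirmed and may need body inspection.
func isQuotaExhausted(statusCode int) bool {
	return statusCode == http.StatusTooManyRequests
}

// recordFailure puts the arm into backoff when the failure classifies as
// quota exhaustion, and reports whether it did so.
func recordFailure(arm *Arm, statusCode int, now time.Time, cooldown time.Duration) bool {
	if !isQuotaExhausted(statusCode) {
		return false
	}
	arm.BackoffUntil = now.Add(cooldown)
	return true
}

func main() {
	arm := &Arm{Name: "minimax", CostWeight: 0.9}
	applyBilling(arm, "subscription")

	now := time.Now()
	backedOff := recordFailure(arm, http.StatusTooManyRequests, now, 15*time.Minute)
	fmt.Println(arm.CostWeight, backedOff, arm.BackoffUntil.After(now))
}
```

The 15-minute cooldown is an arbitrary example value; a subscription quota may reset on a much longer window, so the real cooldown should follow whatever reset information the error carries.
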

---

## Touch-points (file:line)

| Change | Location |
|---|---|
| `NewMiniMax` constructor | `internal/provider/openaicompat/provider.go` (after `:49`) |
| Construction switch case | `cmd/gnoma/main.go:1265-1280` |
| Known-providers set | `cmd/gnoma/main.go:233-236` |
| Usage string | `cmd/gnoma/main.go:1279` |
| `envKeyFor` (optional) | `cmd/gnoma/main.go:1189-1201` |
| Family defaults | `internal/router/defaults.go:212-239` |
| Rate-limit defaults | `internal/provider/ratelimits.go` (+ `DefaultRateLimits` switch) |
| Error classifier (Phase 2) | `internal/provider/errors.go` |
| Config: `[provider.minimax]` | `internal/config/config.go` (provider section) |

The `Provider` interface contract to satisfy
(`internal/provider/provider.go:136-148`): `Stream`, `Name`, `Models`,
`DefaultModel`. All four come free by delegating to the OpenAI-compat
base provider.
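
One way to read "come free by delegating" is Go struct embedding, sketched below. The four method names are from the plan, but every signature here is an assumption made for illustration (the real interface lives at `internal/provider/provider.go:136-148`), and `baseProvider`, `miniMax`, and `newMiniMax` are hypothetical names.

```go
package main

import "fmt"

// Provider is a simplified, hypothetical version of the interface.
// Method names come from the plan; the signatures are assumptions.
type Provider interface {
	Name() string
	DefaultModel() string
	Models() []string
	Stream(prompt string) <-chan string
}

// baseProvider stands in for the OpenAI-compat base provider.
type baseProvider struct {
	name, model string
}

func (b *baseProvider) Name() string         { return b.name }
func (b *baseProvider) DefaultModel() string { return b.model }
func (b *baseProvider) Models() []string     { return []string{b.model} }

// Stream returns a single canned chunk so the sketch runs offline.
func (b *baseProvider) Stream(prompt string) <-chan string {
	ch := make(chan string, 1)
	ch <- "echo: " + prompt
	close(ch)
	return ch
}

// miniMax embeds the base provider, so all four methods are promoted
// and no MiniMax-specific method bodies are needed.
type miniMax struct {
	*baseProvider
}

// Compile-time check that the wrapper satisfies the interface.
var _ Provider = miniMax{}

func newMiniMax(model string) Provider {
	return miniMax{&baseProvider{name: "minimax", model: model}}
}

func main() {
	p := newMiniMax("model-from-config")
	fmt.Println(p.Name(), p.DefaultModel(), p.Models())
	for tok := range p.Stream("hello") {
		fmt.Println(tok)
	}
}
```

If `NewOllama` and `NewLlamaCpp` simply return the base provider with a different config, `NewMiniMax` can do the same and skip the wrapper type entirely.
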

---

## Testing (TDD: write first)

Per CLAUDE.md: table-driven, `//go:build integration` for anything
hitting the live API.

- **Unit (no network):**
  - `NewMiniMax` defaults: empty `BaseURL` → `https://api.minimax.io/v1`;
    empty key → error; `[provider.endpoints].minimax` override wins.
  - `createProvider("minimax", …)` returns a non-nil provider; an
    unknown provider name still errors.
  - `envKeyFor("minimax") == "MINIMAX_API_KEY"`.
  - `defaults.go`: a MiniMax model family resolves to the expected
    `Strengths`/`CostWeight`; `MaxComplexity == 0`.
  - `ratelimits.go`: `DefaultRateLimits("minimax").LookupModel(...)`
    returns the configured limits; `"*"` fallback works.
  - Phase 2: billing=`subscription` → arm `CostWeight == 0`; the
    quota/429 error maps to a retryable/backoff classification.
- **Integration (`//go:build integration`, real `MINIMAX_API_KEY`):**
  a one-shot `Stream` against the cheapest model returns tokens;
  `Models(ctx)` enumerates a non-empty list.
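
The table-driven shape for the `envKeyFor` case could look roughly like this. The `envKeyFor` below is a hypothetical reimplementation of the default branch only, assuming it upper-cases the provider name and appends `_API_KEY`; the real function at `main.go:1189-1201` may differ. The sketch runs the table from `main` so it is self-contained; in the repo it would live in a `_test.go` file and use `t.Run` per case.

```go
package main

import (
	"fmt"
	"strings"
)

// envKeyFor mimics the assumed default branch: PROVIDER + "_API_KEY".
// Hypothetical stand-in, not the repo's implementation.
func envKeyFor(provider string) string {
	return strings.ToUpper(provider) + "_API_KEY"
}

func main() {
	cases := []struct {
		name     string
		provider string
		want     string
	}{
		{"minimax uses the default branch", "minimax", "MINIMAX_API_KEY"},
		{"upper-case input is stable", "MINIMAX", "MINIMAX_API_KEY"},
	}

	failed := 0
	for _, tc := range cases {
		if got := envKeyFor(tc.provider); got != tc.want {
			failed++
			fmt.Printf("FAIL %s: got %q, want %q\n", tc.name, got, tc.want)
		}
	}
	fmt.Printf("%d/%d cases passed\n", len(cases)-failed, len(cases))
}
```
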

### Acceptance criteria

1. `MINIMAX_API_KEY=… gnoma --provider minimax -p "hello"` streams a
   response in pipe mode.
2. With no `--provider`, MiniMax appears as a selectable router arm and
   is chosen for a cheap generation task when `prefer` allows cloud.
3. `gnoma providers` lists `minimax`.
4. Phase 2: with `billing="subscription"`, the selector prefers MiniMax
   for eligible tasks; on simulated quota-exhaustion the router fails
   over without surfacing an error to the user.
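
Criterion 4 can be simulated with a toy selector. This is a deliberately simplified stand-in for the bandit: it picks the cheapest arm that is not in backoff, which is enough to show the expected failover. `Arm` is a hypothetical trimmed struct, and `pickArm` and the second arm's name are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// Arm is a hypothetical, trimmed-down router arm.
type Arm struct {
	Name         string
	CostWeight   float64
	BackoffUntil time.Time
}

// pickArm returns the cheapest arm that is not in backoff at `now`.
// The boolean is false when every arm is cooling down.
func pickArm(arms []Arm, now time.Time) (Arm, bool) {
	var best Arm
	found := false
	for _, a := range arms {
		if now.Before(a.BackoffUntil) {
			continue // still cooling down after a quota/429 failure
		}
		if !found || a.CostWeight < best.CostWeight {
			best, found = a, true
		}
	}
	return best, found
}

func main() {
	now := time.Now()
	arms := []Arm{
		{Name: "minimax", CostWeight: 0}, // subscription mode: treated as free
		{Name: "other-cloud-arm", CostWeight: 1.5},
	}

	first, _ := pickArm(arms, now)

	// Simulated quota exhaustion: put the MiniMax arm into backoff.
	arms[0].BackoffUntil = now.Add(10 * time.Minute)
	second, _ := pickArm(arms, now)

	fmt.Println(first.Name, "->", second.Name) // minimax -> other-cloud-arm
}
```
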

---

## TODO linkage

Replaces the inline "MiniMax provider" bullet in `TODO.md` (In flight).
Link this file from that entry.