# Gnoma — TODO

Active work, newest first.

## In flight

- **Bandit selector — design decisions deferred.** The current selector (`internal/router/selector.go:scoreArm`) is greedy quality-weighted: per-(arm × task-type) EMA scores blended 70/30 with heuristic defaults, divided by CostWeight-adjusted cost. It is **not** a true multi-armed bandit — no UCB-style exploration bonus, no Thompson sampling. Tracked as a design question rather than a must-implement item because of two open dependencies:
  1. **Whether to keep numeric EMA at all.** The 2026-05-07 roadmap (Phase 4) puts re-evaluating bandit learning on hold until the SLM-driven dispatcher is in production. Three options on the table: keep bandit as feedback for the SLM, retire EMA in favour of qualitative outcome summaries fed to the SLM, or split responsibilities (SLM = intent routing, bandit = cost/quality within a tier). See [`docs/superpowers/plans/2026-05-07-gnoma-roadmap.md`](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md) §Phase 4.
  2. **User-tunable selector knobs.** Several constants are hardcoded today: `qualityAlpha` (EMA smoothing, ~3-sample memory), the 70/30 observed/heuristic blend, `strengthScoreBonus` for tagged task types, and the `DefaultThresholds.Minimum` quality floor. Surfacing these as `[router.bandit]` config keys would let users tune for their workloads (faster alpha for shifting model performance, longer memory for stable fleets) without waiting for the strategic decision in #1. Surfaced from the r/coolgithubprojects v0.3.1 launch thread (2026-05-24, `u/Ha_Deal_5079`).
- **Security boundary — egress controls + session audit log.** The current `Firewall` is a content boundary only (scans messages and tool results for secrets via regex + Shannon entropy, redacts or blocks, logs via `log/slog`). It does not enforce network egress — outgoing HTTP from tools and providers uses stock `http.Client` with no per-host allowlist or dial-layer interception.
  Two follow-ups surfaced from the r/SideProject v0.3.0 launch thread (2026-05-24, `u/Secret_Theme3192`):
  1. **Per-session audit log of blocked/redacted events** — grep-able file at `.gnoma/sessions//audit.jsonl` so the user can answer "what did the firewall do this session?" in one command. Today the `slog` output goes to whatever sink is configured, with no per-session grouping.
  2. **Per-host egress allowlist (HTTP transport layer)** — open design question: host-level (`allow api.openai.com, deny *`) vs per-tool (`bash can only hit these hosts`). Reply asked the commenter for their mental model; revisit when feedback lands. The README and v0.3.0 Reddit post phrasing oversold "network egress gated"; corrected in the same commit as this TODO entry.
- **Tool-router specialization (functiongemma)** — gated on telemetry, not committed. Phase A.2 adds did-switch-rate measurement to the two-stage `select_category` path; Phase A.3 (LoRA fine-tune of `functiongemma-270m-it` as a dedicated `ArmRoleToolRouter`) only fires if did-switch rate exceeds 20%. Three independent external reviews consulted 2026-05-23; consensus is "fits as tool-call router, not chat; fine-tuning mandatory; prove the need first." See [`docs/superpowers/plans/2026-05-23-tool-router-specialization.md`](docs/superpowers/plans/2026-05-23-tool-router-specialization.md).
- **Entropy FP reduction (post-SLM Phase F)** — F-1 (format-aware pre-extractor) shipped 2026-05-22: `[security].entropy_safelist` with `uuid`, `sha_hex`, `iso8601`, `url`; default empty so pre-F-1 behaviour is unchanged. F-2 (SLM-assisted classifier for ambiguous entropy hits) remains gated on F-1 FP-rate telemetry from real workloads plus ≥50 SLM observations. Surfaced from the r/ollama launch thread (2026-05-20); external validation from alterlab.io on the same tiered approach. See [`docs/superpowers/plans/2026-05-19-post-slm-unlock.md`](docs/superpowers/plans/2026-05-19-post-slm-unlock.md).
- **Compound tools (post-SLM Phase E)** — held until ≥50 SLM observations inform which primitives are worth adding. See [`docs/superpowers/plans/2026-05-19-post-slm-unlock.md`](docs/superpowers/plans/2026-05-19-post-slm-unlock.md).
- **Sensitive-content handling — unified policy.** Three input paths can introduce sensitive content into the context: pasted images (screenshots may contain secrets, API keys, PII), pasted text (often copied straight from a terminal with credentials), and tool-read files (`.env`, key files, etc.). Today these are handled inconsistently: incognito gates persistence but content still flows to providers; outgoing-scan firewall covers some patterns but is format-aware only for text. Need a single policy/UI: at-paste warning when the content matches sensitive heuristics, a consent-gated review step, and consistent treatment across the three paths. Cross-cuts with Phase F entropy work and the outgoing-scan firewall.
- **Distribution — follow-ups.** v0.1.0 shipped (archives on github.com/VikingOwl91/gnoma/releases, multi-arch images on ghcr.io/vikingowl91/gnoma). Still optional: Homebrew tap, `curl | sh` installer script, signed checksums (cosign/sigstore), release note automation, Windows process-tree kill via golang.org/x/sys/windows job objects (currently `os.Process.Kill` only — see `internal/mcp/transport_windows.go`), and migration from `dockers` + `docker_manifests` to `dockers_v2` in `.goreleaser.yml` (collapses ~45 lines into one block but requires Dockerfile changes for the per-platform binary layout — deferred to its own commit before v0.3.0).

## Stable backlog (not in active phases)

- **Thinking mode** (disabled / budget / adaptive) — M12.
- **Structured output** with JSON schema validation — M12.
- **Native agy JSON output** — switch the subprocess provider to `--output-format stream-json` once the agy CLI supports it, replacing the current prompt-augmentation fallback.
- **SQLite session persistence** + serve mode — M10.
- **Task learning** (pattern recognition, persistent tasks) — M11.
- **Web UI** (`gnoma web`) — M15.
- **OAuth / keyring** — M13.
- **Observability** (feature flags, cost dashboards) — M14.
- **PE / Mach-O ELF support** — future, after ELF Phase 6.

## History

Completed initiatives, kept here as pointers to their plan files:

- **v0.1.0 release** — 2026-05-20. First tagged release. GoReleaser pipeline produces six static archives (linux/darwin/windows × amd64/arm64) on the GitHub mirror plus multi-arch Docker images on GHCR. History was rewritten on the same day to migrate authorship to a noreply identity and strip co-author attribution.
- **Post-audit security hardening** — complete 2026-05-19. Three waves + one ADR closed all 14 findings from the external review:
  - [Wave 1 — SafeProvider boundary](docs/superpowers/plans/2026-05-19-security-wave1-safeprovider.md)
  - [Wave 2 — Incognito coherence](docs/superpowers/plans/2026-05-19-security-wave2-incognito.md)
  - Wave 3 — scanner + path hygiene (rolled out directly without a plan file; see commits leading up to 2026-05-19 on `internal/security`)
  - [ADR-004 — PostToolUse hook ordering](docs/essentials/decisions/004-posttooluse-hook-ordering.md)
- **Post-SLM unlock** — [plan](docs/superpowers/plans/2026-05-19-post-slm-unlock.md). Phases A–D complete (two-stage tool routing, CLI agent binary override, user profiles, per-arm capability tags).
- **2026-05-07 roadmap** — [plan](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md). M1–M8 done; SLM classifier (Phase 3) complete; Phase 4 superseded by the post-SLM plan.

## Reference

- Milestones: `docs/essentials/milestones.md`
- Decisions: `docs/essentials/decisions/`
- ADR-002 (SLM routing, supersedes earlier ADR-009): `docs/essentials/decisions/002-slm-routing.md`
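
For the bandit selector item under "In flight", the greedy scoring it describes can be sketched roughly as below. This is a hypothetical illustration, not the code in `internal/router/selector.go`: the `arm` type, `updateEMA`, `pickGreedy`, the `qualityAlpha` value of 0.5, and the `1 + costWeight*cost` cost adjustment are all assumptions made for the example. Only the EMA, the 70/30 blend, and the division by cost come from the description above.

```go
package main

import "fmt"

// Hypothetical constants: the real values in selector.go are not given here.
const (
	qualityAlpha  = 0.5 // assumed EMA smoothing factor
	observedShare = 0.7 // observed side of the 70/30 observed/heuristic blend
)

// arm is an illustrative stand-in for one arm's state for a single task type.
type arm struct {
	name      string
	ema       float64 // observed quality EMA
	heuristic float64 // heuristic default quality
	cost      float64 // relative cost, assumed >= 0
}

// updateEMA folds one new quality observation into the running average.
func updateEMA(prev, observed float64) float64 {
	return qualityAlpha*observed + (1-qualityAlpha)*prev
}

// scoreArm blends observed and heuristic quality 70/30, then divides by a
// cost term. The form 1 + costWeight*cost is an assumption for this sketch.
func scoreArm(a arm, costWeight float64) float64 {
	quality := observedShare*a.ema + (1-observedShare)*a.heuristic
	return quality / (1 + costWeight*a.cost)
}

// pickGreedy returns the highest-scoring arm. There is no exploration bonus
// and no sampling, which is why this is not a true multi-armed bandit.
func pickGreedy(arms []arm, costWeight float64) (arm, bool) {
	if len(arms) == 0 {
		return arm{}, false
	}
	best := arms[0]
	for _, a := range arms[1:] {
		if scoreArm(a, costWeight) > scoreArm(best, costWeight) {
			best = a
		}
	}
	return best, true
}

func main() {
	arms := []arm{
		{name: "local-small", ema: 0.6, heuristic: 0.5, cost: 0},
		{name: "cloud-large", ema: 0.9, heuristic: 0.8, cost: 2},
	}
	for _, w := range []float64{0, 0.5} {
		if best, ok := pickGreedy(arms, w); ok {
			fmt.Printf("costWeight=%.1f -> %s (score %.3f)\n", w, best.name, scoreArm(best, w))
		}
	}
}
```

With these made-up numbers, a cost weight of 0 favours the higher-quality arm, while a cost weight of 0.5 tips the choice to the cheaper one. An arm that scores poorly early is never retried by this rule, which is the gap a UCB bonus or Thompson sampling would address.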