Gnoma — TODO
Active work, newest first.
In flight
- Bandit selector — design decisions deferred. The current selector (`internal/router/selector.go:scoreArm`) is greedy and quality-weighted: per-(arm × task-type) EMA scores are blended 70/30 with heuristic defaults, then divided by the CostWeight-adjusted cost. It is not a true multi-armed bandit: it has no UCB-style exploration bonus and no Thompson sampling. Tracked as a design question rather than a must-implement item because of two open dependencies:
  1. Whether to keep numeric EMA at all. The 2026-05-07 roadmap (Phase 4) puts re-evaluating bandit learning on hold until the SLM-driven dispatcher is in production. Three options are on the table: keep the bandit as feedback for the SLM, retire EMA in favour of qualitative outcome summaries fed to the SLM, or split responsibilities (SLM = intent routing, bandit = cost/quality within a tier). See `docs/superpowers/plans/2026-05-07-gnoma-roadmap.md` §Phase 4.
  2. User-tunable selector knobs. Several constants are hardcoded today: `qualityAlpha` (EMA smoothing, ~3-sample memory), the 70/30 observed/heuristic blend, `strengthScoreBonus` for tagged task types, and the `DefaultThresholds.Minimum` quality floor. Surfacing these as `[router.bandit]` config keys would let users tune for their workloads (a faster alpha for shifting model performance, longer memory for stable fleets) without waiting for the strategic decision in #1.

  Surfaced from the r/coolgithubprojects v0.3.1 launch thread (2026-05-24, u/Ha_Deal_5079).
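To make the bandit selector entry concrete, here is a minimal Go sketch of the scoring it describes, with the four knobs pulled into a config struct. Everything in it is an assumption for illustration: the `BanditConfig` type and its field names, the default values, the `1 + CostWeight*cost` form of the cost adjustment, and the choice to apply the quality floor after the strength bonus. It is not the code in `internal/router/selector.go`.

```go
package main

import "fmt"

// BanditConfig is a hypothetical shape for the proposed [router.bandit] keys.
// Field names and default values are illustrative assumptions.
type BanditConfig struct {
	QualityAlpha   float64 // EMA smoothing factor; 0.5 gives roughly 3-sample memory
	ObservedWeight float64 // share of the observed EMA in the blend (0.7 means 70/30)
	StrengthBonus  float64 // added when the arm is tagged as strong for the task type
	QualityFloor   float64 // arms whose blended quality falls below this are skipped
	CostWeight     float64 // 0 ignores cost; larger values penalise expensive arms
}

// DefaultBanditConfig returns placeholder defaults (assumed, not Gnoma's).
func DefaultBanditConfig() BanditConfig {
	return BanditConfig{QualityAlpha: 0.5, ObservedWeight: 0.7, StrengthBonus: 0.1, QualityFloor: 0.3, CostWeight: 1.0}
}

// updateEMA folds one observed quality sample into the running average.
func updateEMA(prev, sample, alpha float64) float64 {
	return alpha*sample + (1-alpha)*prev
}

// scoreArm blends observed and heuristic quality, applies the strength bonus,
// and divides by an assumed cost term. ok is false when the blended quality
// is below the floor.
func scoreArm(cfg BanditConfig, observedEMA, heuristic, cost float64, tagged bool) (score float64, ok bool) {
	quality := cfg.ObservedWeight*observedEMA + (1-cfg.ObservedWeight)*heuristic
	if tagged {
		quality += cfg.StrengthBonus
	}
	if quality < cfg.QualityFloor {
		return 0, false
	}
	return quality / (1 + cfg.CostWeight*cost), true
}

func main() {
	cfg := DefaultBanditConfig()
	ema := 0.5 // assumed starting value before any observations
	for _, sample := range []float64{1.0, 1.0, 0.0} {
		ema = updateEMA(ema, sample, cfg.QualityAlpha)
	}
	score, ok := scoreArm(cfg, ema, 0.6, 0.5, true)
	fmt.Printf("ema=%.3f score=%.3f ok=%v\n", ema, score, ok)
}
```

The selection stays greedy in this sketch: nothing in `scoreArm` rewards trying an under-sampled arm, which is the gap the entry points out when it says the selector is not a true bandit.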
- Security boundary — egress controls + session audit log. The current `Firewall` is a content boundary only: it scans messages and tool results for secrets via regex + Shannon entropy, redacts or blocks, and logs via `log/slog`. It does not enforce network egress: outgoing HTTP from tools and providers uses a stock `http.Client` with no per-host allowlist or dial-layer interception. Two follow-ups surfaced from the r/SideProject v0.3.0 launch thread (2026-05-24, u/Secret_Theme3192):
  - Per-session audit log of blocked/redacted events — grep-able file at `.gnoma/sessions/<id>/audit.jsonl` so the user can answer "what did the firewall do this session?" in one command. Today the `slog` output goes to whatever sink is configured, with no per-session grouping.
  - Per-host egress allowlist (HTTP transport layer) — open design question: host-level (`allow api.openai.com, deny *`) vs per-tool (bash can only hit these hosts). The reply asked the commenter for their mental model; revisit when feedback lands. The README and v0.3.0 Reddit post phrasing oversold "network egress gated"; corrected in the same commit as this TODO entry.
- Tool-router specialization (functiongemma) — gated on telemetry, not committed. Phase A.2 adds did-switch-rate measurement to the two-stage `select_category` path; Phase A.3 (LoRA fine-tune of `functiongemma-270m-it` as a dedicated `ArmRoleToolRouter`) only fires if the did-switch rate exceeds 20%. Three independent external reviews were consulted on 2026-05-23; the consensus is "fits as tool-call router, not chat; fine-tuning mandatory; prove the need first." See `docs/superpowers/plans/2026-05-23-tool-router-specialization.md`.
- Entropy FP reduction (post-SLM Phase F) — F-1 (format-aware pre-extractor) shipped 2026-05-22: `[security].entropy_safelist` with `uuid`, `sha_hex`, `iso8601`, `url`; default empty so pre-F-1 behaviour is unchanged. F-2 (SLM-assisted classifier for ambiguous entropy hits) remains gated on F-1 FP-rate telemetry from real workloads plus ≥50 SLM observations. Surfaced from the r/ollama launch thread (2026-05-20); external validation from alterlab.io on the same tiered approach. See `docs/superpowers/plans/2026-05-19-post-slm-unlock.md`.
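The idea behind a format-aware pre-extractor can be sketched in Go as follows: a token that matches an enabled safelist format is cleared before the entropy check runs, so well-known high-entropy shapes stop producing false positives. The regexes, the `isSuspicious` helper, and the 3.5 bits-per-byte threshold are assumptions for illustration; only the four format names come from the entry above.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
)

// safelistPatterns maps the format names from [security].entropy_safelist to
// illustrative regexes. The patterns are assumptions, not Gnoma's own.
var safelistPatterns = map[string]*regexp.Regexp{
	"uuid":    regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`),
	"sha_hex": regexp.MustCompile(`^(?:[0-9a-f]{40}|[0-9a-f]{64})$`),
	"iso8601": regexp.MustCompile(`^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})$`),
	"url":     regexp.MustCompile(`^https?://\S+$`),
}

// shannonEntropy returns the Shannon entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	var h float64
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// isSuspicious runs the format-aware pre-check first, then the entropy test.
// An empty safelist leaves the plain entropy behaviour unchanged.
func isSuspicious(token string, safelist []string, threshold float64) bool {
	for _, name := range safelist {
		if re, ok := safelistPatterns[name]; ok && re.MatchString(token) {
			return false
		}
	}
	return shannonEntropy(token) >= threshold
}

func main() {
	id := "123e4567-e89b-12d3-a456-426614174000"
	fmt.Printf("entropy=%.2f\n", shannonEntropy(id))
	fmt.Println("empty safelist:", isSuspicious(id, nil, 3.5))
	fmt.Println("uuid safelisted:", isSuspicious(id, []string{"uuid"}, 3.5))
}
```

Tokens that match no enabled format still go through the entropy threshold, and those ambiguous hits are the cases the F-2 classifier would be asked to resolve.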
- Compound tools (post-SLM Phase E) — held until ≥50 SLM observations inform which primitives are worth adding. See `docs/superpowers/plans/2026-05-19-post-slm-unlock.md`.
- Sensitive-content handling — unified policy. Three input paths can introduce sensitive content into the context: pasted images (screenshots may contain secrets, API keys, PII), pasted text (often copied straight from a terminal with credentials), and tool-read files (`.env`, key files, etc.). Today these are handled inconsistently: incognito gates persistence but content still flows to providers; the outgoing-scan firewall covers some patterns but is format-aware only for text. Need a single policy/UI: an at-paste warning when the content matches sensitive heuristics, a consent-gated review step, and consistent treatment across the three paths. Cross-cuts with the Phase F entropy work and the outgoing-scan firewall.
- Distribution — follow-ups. v0.1.0 shipped (archives on github.com/VikingOwl91/gnoma/releases, multi-arch images on ghcr.io/vikingowl91/gnoma). Still optional: Homebrew tap, `curl | sh` installer script, signed checksums (cosign/sigstore), release note automation, Windows process-tree kill via golang.org/x/sys/windows job objects (currently `os.Process.Kill` only; see `internal/mcp/transport_windows.go`), and migration from `dockers` + `docker_manifests` to `dockers_v2` in `.goreleaser.yml` (collapses ~45 lines into one block but requires Dockerfile changes for the per-platform binary layout; deferred to its own commit before v0.3.0).
Stable backlog (not in active phases)
- Thinking mode (disabled / budget / adaptive) — M12.
- Structured output with JSON schema validation — M12.
- Native agy JSON output — switch the subprocess provider to `--output-format stream-json` once the agy CLI supports it, replacing the current prompt-augmentation fallback.
- SQLite session persistence + serve mode — M10.
- Task learning (pattern recognition, persistent tasks) — M11.
- Web UI (`gnoma web`) — M15.
- OAuth / keyring — M13.
- Observability (feature flags, cost dashboards) — M14.
- PE / Mach-O ELF support — future, after ELF Phase 6.
History
Completed initiatives, kept here as pointers to their plan files:
- v0.1.0 release — 2026-05-20. First tagged release. GoReleaser pipeline produces six static archives (linux/darwin/windows × amd64/arm64) on the GitHub mirror plus multi-arch Docker images on GHCR. History was rewritten on the same day to migrate authorship to a noreply identity and strip co-author attribution.
- Post-audit security hardening — complete 2026-05-19. Three waves + one ADR closed all 14 findings from the external review:
  - Wave 1 — SafeProvider boundary
  - Wave 2 — Incognito coherence
  - Wave 3 — scanner + path hygiene (rolled out directly without a plan file; see commits leading up to 2026-05-19 on `internal/security`)
  - ADR-004 — PostToolUse hook ordering
- Post-SLM unlock — plan. Phases A–D complete (two-stage tool routing, CLI agent binary override, user profiles, per-arm capability tags).
- 2026-05-07 roadmap — plan. M1–M8 done; SLM classifier (Phase 3) complete; Phase 4 superseded by the post-SLM plan.
Reference
- Milestones: `docs/essentials/milestones.md`
- Decisions: `docs/essentials/decisions/`
- ADR-002 (SLM routing, supersedes earlier ADR-009): `docs/essentials/decisions/002-slm-routing.md`