gnoma/docs/essentials/risks.md
vikingowl d3990214a5 docs: update essentials for router, security, task learning
Restructure milestones from M1-M11 to M1-M15:
- M3: Security Firewall (secret scanner, incognito mode)
- M4: Router Foundation (arm registry, pools, task classifier)
- M5: TUI with full 6 permission modes
- M6: Full compaction (truncate + LLM summarization)
- M9: Router Advanced (bandit learning, ensemble strategies)
- M11: Task Learning (pattern detection, persistent tasks)

Add ADR-007 through ADR-012 for security-as-core, router split,
Thompson Sampling, MCP replaceability, task learning, incognito.

Add risks R-010 through R-015 for router, security, feedback,
task learning, ensemble quality, shell parser.

Update architecture dependency graph with security, router,
elf, hook, skill, mcp, plugin, tasklearn packages.

Update domain model with Router, Arm, LimitPool, Firewall entities.
2026-04-03 10:47:11 +02:00

| essential | status | last_updated | project | depends_on |
|---|---|---|---|---|
| risks | complete | 2026-04-02 | gnoma | |

Risk / Unknowns

| ID | Risk | Severity | Mitigation | Status |
|---|---|---|---|---|
| R-001 | SDK breaking changes: provider SDKs are pre-1.0 and may change APIs | Medium | Pin versions, integration tests per provider, adapter layer absorbs changes | Open |
| R-002 | Google range-to-pull bridge goroutine leak: context cancellation edge cases | Medium | Thorough testing with testing/synctest, always select on ctx.Done() | Open |
| R-003 | Thinking block round-trip fidelity: Anthropic signatures must survive serialization | Medium | Unit tests with real signature values, golden file tests | Open |
| R-004 | Tool call ID generation inconsistency: Google/Ollama may return empty IDs | Low | Generate UUID if provider returns empty, documented in provider adapter | Open |
| R-005 | Mistral SDK 2.2.0 stability: user-maintained SDK, recently updated | Low | User maintains it, can fix bugs directly. Integration tests catch regressions. | Accepted |
| R-006 | Bubble Tea v2 maturity: v2 is relatively new | Low | Pin version, fallback to v1 if blockers. TUI is last milestone item. | Open |
| R-007 | Multi-provider routing complexity: coordinating elfs on different providers with different capabilities | High | Design routing interface early (M4), start simple (manual provider assignment), add rules incrementally | Open |
| R-008 | Context compaction coherence: summarization may lose critical details | Medium | Truncation as safe default, summarization opt-in, compact boundaries for recovery | Open |
| R-009 | Permission prompt UX in pipe mode: no TUI for interactive prompts | Low | Default to allow or deny in pipe mode, require explicit flag | Open |
| R-010 | Router complexity: bandit tuning, cold start problem | High | Ship default.state with embedded priors, heuristic fallback for <5 observations | Open |
| R-011 | Security false positives: blocking legitimate content | Medium | Warn-first mode, user override per-pattern, configurable sensitivity | Open |
| R-012 | Feedback attribution: delayed/noisy signals for orchestration tasks | Medium | Neutral default for missing signals, ensemble contribution rank as strong signal | Open |
| R-013 | Task learning privacy: pattern data persistence | Low | Patterns stored locally only, cleared in incognito mode | Open |
| R-014 | Ensemble synthesis quality: depends heavily on synthesis prompt | Medium | Invest in prompt engineering, A/B test with polisher arm | Open |
| R-015 | Shell parser dependency: mvdan.cc/sh for compound command decomposition | Low | Well-maintained Go package, fallback to regex-based decomposition if needed | Open |

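For R-004, the empty-ID fallback could look roughly like the sketch below. The `ToolCall` type and the `ensureID` and `newUUIDv4` helpers are assumptions made for illustration; the real adapter types may differ. Provider-supplied IDs are kept untouched, and a random version 4 UUID is generated only when the provider returns an empty ID.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// ToolCall is a hypothetical, simplified tool-call record.
type ToolCall struct {
	ID   string
	Name string
}

// newUUIDv4 returns a random RFC 4122 version 4 UUID in canonical form.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// ensureID keeps a provider-supplied ID and fills in a UUID only when the
// provider returned an empty one.
func ensureID(tc ToolCall) (ToolCall, error) {
	if tc.ID != "" {
		return tc, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return tc, err
	}
	tc.ID = id
	return tc, nil
}

func main() {
	kept, _ := ensureID(ToolCall{ID: "call_123", Name: "read_file"})
	filled, _ := ensureID(ToolCall{Name: "read_file"})
	fmt.Println("kept:", kept.ID)
	fmt.Println("generated:", filled.ID)
}
```

Doing this once at the adapter boundary means the rest of the engine can rely on every tool call having a non-empty ID when matching results back to calls.
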
Open Questions

  • How should routing rules be expressed in config? Per-task rules, model capability tags, cost-based? — needs research before M5
  • Which local tokenizer library to use? (tiktoken port, sentencepiece, or provider-specific)
  • Serve mode protocol — choose what fits best when implementing M10
  • What automated quality evaluation to use for router feedback? (compile check, linter, self-consistency, small local judge model)
  • Should gnoma embed a tokenizer? → Yes, include local tokenizer (M6)
  • Session persistence format? → SQLite (M10)
  • Mistral SDK as long-term reference? → Yes for now, revisit after M2

Changelog

  • 2026-04-02: Initial version
  • 2026-04-03: Added R-010 through R-015 for router, security, feedback, task learning, ensemble quality, shell parser