Empirical comparison on 2026-05-25 across five candidate SLMs on
identical prompts (two prompts: a trivial 'what is 2+2' and a knowledge
'explain a multi-armed bandit'):
  qwen3:0.6b              consistent across both prompts
  functiongemma:270m      works on the trivial prompt, derails on knowledge prompts
  gemma3:1b               unusable (emits just '{' or invented keys)
  reecdev/tiny3.5:1.5b    unusable (ignores /no_think, leaks <Thought Process> blocks)
  qwen2.5-coder:1.5b      unusable (ignores classifier prompt, answers in prose)
qwen3:0.6b honours Qwen3's native /no_think flag (the distillation in
the old default did not), is smaller than the previous recommendation
(520 MB vs 1 GB), and was the only candidate to classify both test
prompts successfully without falling back to the heuristic.
The README quickstart block, the slm-backends.md presets, and the status
output sample are all switched. Also documents register_as_arm (default
true, set false for task-specialised models like FunctionGemma) and
classify_timeout (default 15s) in the example configs, since both
landed in v0.3.3+.
Code defaults for the tiny3.5 family in internal/router/defaults.go
are unchanged — that table still applies when users have tiny3.5
registered as a routing arm independent of the SLM role.
Clarify that gnoma itself emits no telemetry to external services
while being explicit that cloud-provider arms send data to those
providers by design. Adds:
- 'No phone-home' bullet to the differentiator list, naming the
on-device path (Ollama/llama.cpp + --incognito).
- 'Data flow' paragraph to the Security scope-note blockquote so
the framing is consistent between the hero bullets and the
Security section.
The 'What makes gnoma different' bullet and Security section both
implied a network-egress firewall. Today the Firewall only enforces a
content boundary (secret scan, Unicode sanitize, redact/block). Reword
both spots and add a Scope note. Surface the gap as a top-of-TODO
entry covering per-session audit log and per-host egress allowlist,
with the open design question (host-level vs per-tool) called out.
Raised via r/SideProject v0.3.0 launch thread.
Add docs/img/gnoma-tui.png as a hero image so visitors see the TUI
above the fold instead of a wall of text. Pull the bandit router,
prefer-policy, SLM, and built-in firewall out of buried sections into
a 'What makes gnoma different' bullet list. Add a Status block flagging
pre-1.0 and a table of contents. Move the pygmy-owl naming note and
upstream/mirror URLs into a footer About section.
README:
- New "Preferring local vs cloud" subsection under "Routing
defaults" — table of the three [router].prefer values, priority
order against forced arm / incognito / Strengths, and the
CLI-agent-counts-as-local clarification.
- New "Startup safety check" subsection under "Security" — tier
table, [safety] config block, --dangerously-allow-anywhere flag,
container detection note, link to the plan doc.
Plan doc (prefer-routing-policy):
- Approach section updated to describe the tier-shift mechanism
that actually shipped, with a clear "Implementation note"
explaining why the original score-multiplier approach was
abandoned (cost-floor math gives local arms a ~280x raw-score
advantage that no reasonable multiplier can overcome).
- CLI-agent placement flipped from "non-local" to "local" with
rationale: the implementation chose the user-facing behavior axis
over the privacy axis the original draft used.
- Tier-shift rationale table replacing the multiplier rationale.
- P-3 task rewritten to reflect the actual implementation (checked
off and pointing at the right code), with the policyMultiplier
helper noted as a within-tier nudge of limited present effect.
The implementation-vs-plan deviation is now documented in both the
plan doc and the original feature commit message (f9094f6). Future
readers reach the same understanding via either path.
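
To make the multiplier-vs-tier argument concrete, here is a small Go sketch
under stated assumptions. The arm type, the two pick functions, and the
scores are hypothetical; only the ~280x ratio comes from the note above.
It is not the code in internal/router.

```go
package main

import (
	"fmt"
	"sort"
)

// arm is a hypothetical, simplified routing arm.
type arm struct {
	name  string
	local bool
	score float64 // raw router score; higher is better
}

// pickByMultiplier models the abandoned approach: boost non-local scores by
// mult and take the maximum.
func pickByMultiplier(arms []arm, mult float64) string {
	best, bestScore := "", -1.0
	for _, a := range arms {
		s := a.score
		if !a.local {
			s *= mult
		}
		if s > bestScore {
			best, bestScore = a.name, s
		}
	}
	return best
}

// pickByTier models a tier shift: arms matching the preference sort into
// tier 0, everything else into tier 1, and score only breaks ties within a tier.
func pickByTier(arms []arm, preferLocal bool) string {
	if len(arms) == 0 {
		return ""
	}
	tier := func(a arm) int {
		if a.local == preferLocal {
			return 0
		}
		return 1
	}
	sorted := append([]arm(nil), arms...) // copy so the caller's slice is untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		ti, tj := tier(sorted[i]), tier(sorted[j])
		if ti != tj {
			return ti < tj
		}
		return sorted[i].score > sorted[j].score
	})
	return sorted[0].name
}

func main() {
	arms := []arm{
		{name: "local-small", local: true, score: 280}, // ~280x raw-score advantage
		{name: "cloud-large", local: false, score: 1},
	}
	fmt.Println(pickByMultiplier(arms, 10)) // a 10x cloud boost still loses to 280
	fmt.Println(pickByTier(arms, false))    // preferring cloud via tiers wins regardless of score
}
```

Because the tier is compared before the score, the preference holds no
matter how large the raw-score gap is, while a multiplier only works if it
exceeds that gap.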
Closes R-8 of the routing-defaults plan. Adds a new "Routing
defaults" section between Config and SLM that documents the defaults
arms ship with out of the box: the family-keyed Strengths /
MaxComplexity / CostWeight matrix plus the non-chat exclude list.
Also introduces the [[arms]] override block in the README for the
first time (previously undocumented), showing how users keep
priority over the defaults.
Links back to the plan doc for the benchmark sources and per-entry
rationale.
- Add for-the-badge style shields (release, license, Go 1.26+, GHCR)
- Drop the "until the first tag is cut" line that's been stale since
v0.1.0 shipped on 2026-05-20
- Add a Vision / image input section covering Ctrl+V paste, literal
[Image: /path] markers, the 10 MiB cap, the incognito carve-out,
and the router's Vision capability gating
- Add a Subprocess sandbox bypass subsection under Providers
documenting GNOMA_AGY_BYPASS_PERMISSIONS and
GNOMA_CODEX_BYPASS_SANDBOX as deliberate footguns
- Add an Entropy false-positive reduction subsection under Security
showing the [security].entropy_safelist opt-in (Phase F-1) and
noting the per-pattern Debug telemetry that feeds F-2 gating
The subprocess CLI table mentioned only three agents; the full set
is now claude, gemini, agy, codex, and vibe (Mistral). Bring the
documentation in line with knownAgents.
- Switch GoReleaser archive release target from Gitea to the GitHub
mirror (VikingOwl91/gnoma). Pre-built archives now publish to
github.com/VikingOwl91/gnoma/releases on each tag.
- README: drop ANTHROPIC_API_KEY-as-example from Quickstart and the
Docker run example; both now reference the Providers table for the
full env-var list. Pre-built binary section points solely at the
GitHub releases page.
- NOTICE: correct copyright holder to vikingowl.
Top-level docs were stale and the .gitea/ issue templates referenced a
workflow that is no longer in use.
- README: rewrite around the current feature set (SLM routing, profiles,
plugin TOFU, SafeProvider boundary, current model defaults). Add a
pre-built-binary install section plus Docker (ghcr.io) install path
for users without a Go toolchain. Document the GitHub mirror.
- CONTRIBUTING: drop the dead issue-template reference, note Gitea
upstream + GitHub mirror split, expand the package map and test-target
table.
- AGENTS: rebuild as a domain glossary (Elf / Arm / Turn / SafeProvider /
Incognito / Profile) plus non-obvious conventions an outside agent
needs and would not infer from the code.
- TODO: trim completed waves into a History section, fix a broken
link to the never-written Wave 3 plan file, surface active backlog.
- docs/essentials/INDEX: add ADR-004 (PostToolUse hook ordering) to the
ADR list.
- LICENSE + NOTICE: adopt Apache License 2.0. Patent grant matters
because gnoma bundles SDKs from Anthropic / OpenAI / Google / Mistral
and ships derivative tooling that runs untrusted MCP servers.
- Delete .gitea/issue_template/ and gemma-integration-analysis.md
(latter is obsolete per its own preamble — Node.js-specific notes
that don't apply to the Go implementation).
Plugins are now verified against ~/.config/gnoma/plugins.pins.toml at
load time. Each plugin's plugin.json bytes are hashed (SHA-256) and:
- recorded automatically on first load (TOFU) with a prominent warning
- compared on subsequent loads
- refused with a clear error if the hash drifted, without overwriting
the pin so the user can review and re-enrol deliberately
Pin-store I/O failures degrade to load-without-pinning rather than
locking the user out of previously-trusted plugins.
Closes audit finding C2. See ADR-003 for the decision rationale and
docs/plugins-trust.md for the end-user trust model.
- Fix append footgun: allHooks/allMCPServers allocated fresh to avoid
mutating cfg's backing array (lines 391/413 in main.go)
- Fix pipe-mode permission prompt: detect no-TTY stdin and auto-deny
instead of blocking forever on fmt.Scanln EOF
- Tighten Mistral API key regex from bare [a-zA-Z0-9]{32} (matched
commit hashes, UUIDs) to context-gated pattern requiring "mistral"
keyword nearby. Added scanner test for positives and negatives.
- Remove README demo GIF TODO placeholder
- Unify version string: pass buildVersion from ldflags into tui.Config
instead of hardcoding "v0.1.0-dev"
- Populate benchmarks doc with actual Go benchmark results
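
For readers unfamiliar with the append footgun fixed above, a standalone Go
sketch of the failure mode and the fresh-allocation fix follows. The
function and variable names (mergeAliased, mergeFresh, cfgHooks) are
hypothetical and do not mirror main.go.

```go
package main

import "fmt"

// mergeAliased appends straight onto base. If base has spare capacity, the
// new elements are written into base's backing array, which other slices share.
func mergeAliased(base, extra []string) []string {
	return append(base, extra...)
}

// mergeFresh allocates a new slice first, so base's backing array is never written.
func mergeFresh(base, extra []string) []string {
	out := make([]string, 0, len(base)+len(extra))
	out = append(out, base...)
	return append(out, extra...)
}

func main() {
	// A config slice with length 2 but capacity 4, as can happen after earlier appends.
	cfgHooks := make([]string, 2, 4)
	cfgHooks[0], cfgHooks[1] = "a", "b"

	x := mergeAliased(cfgHooks, []string{"x"})
	y := mergeAliased(cfgHooks, []string{"y"})
	fmt.Println(x[2], y[2]) // both share one backing array, so x[2] was overwritten

	x = mergeFresh(cfgHooks, []string{"x"})
	y = mergeFresh(cfgHooks, []string{"y"})
	fmt.Println(x[2], y[2]) // independent slices
}
```

The first Println shows the second append clobbering the first result; the
second shows the two results staying distinct once each gets its own
allocation.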