Top-level docs were stale and the .gitea/ issue templates referenced a
workflow that is no longer in use.
- README: rewrite around the current feature set (SLM routing, profiles,
plugin TOFU, SafeProvider boundary, current model defaults). Add a
pre-built-binary install section plus Docker (ghcr.io) install path
for users without a Go toolchain. Document the GitHub mirror.
- CONTRIBUTING: drop the dead issue-template reference, note Gitea
upstream + GitHub mirror split, expand the package map and test-target
table.
- AGENTS: rebuild as a domain glossary (Elf / Arm / Turn / SafeProvider /
Incognito / Profile) plus non-obvious conventions an outside agent
needs and would not infer from the code.
- TODO: trim completed waves into a History section, fix a broken
link to the never-written Wave 3 plan file, surface active backlog.
- docs/essentials/INDEX: add ADR-004 (PostToolUse hook ordering) to the
ADR list.
- LICENSE + NOTICE: adopt Apache License 2.0. Patent grant matters
because gnoma bundles SDKs from Anthropic / OpenAI / Google / Mistral
and ships derivative tooling that runs untrusted MCP servers.
- Delete .gitea/issue_template/ and gemma-integration-analysis.md
(latter is obsolete per its own preamble — Node.js-specific notes
that don't apply to the Go implementation).
Three compounding bugs prevented tool calling with llama.cpp:
- Stream parser set argsComplete on partial JSON (e.g. "{"), dropping
subsequent argument deltas — fix: use json.Valid to detect completeness
- Missing tool_choice default — llama.cpp needs explicit "auto" to
activate its GBNF grammar constraint; now set when tools are present
- Tool names in history used internal format (fs.ls) while definitions
used API format (fs_ls) — now re-sanitized in translateMessage
Additional changes:
- Disable SDK retries for local providers (500s are deterministic)
- Dynamic capability probing via /props (llama.cpp) and /api/show
(Ollama), replacing hardcoded model prefix list
- Engine respects forced arm ToolUse capability when router is active
- Bundled /init skill with Go template blocks, context-aware for local
vs cloud models, deduplication rules against CLAUDE.md
- Tool result compaction for local models — previous round results
replaced with size markers to stay within small context windows
- Text-only fallback when tool-parse errors occur on local models
- "text-only" TUI indicator when model lacks tool support
- Session ResetError for retry after stream failures
- AllowedTools per-turn filtering in engine buildRequest
provider/openai:
- Fix doubled tool call args (argsComplete flag): Ollama sends complete
args in the first streaming chunk then repeats them as delta, causing
doubled JSON and 400 errors in elfs
- Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep)
- Add Reasoning field support for Ollama thinking output
cmd/gnoma:
- Early TTY detection so logger is created with correct destination
before any component gets a reference to it (fixes slog WARN bleed
into TUI textarea)
permission:
- Exempt spawn_elfs and agent tools from safety scanner: elf prompt
text may legitimately mention .env/.ssh/credentials patterns and
should not be blocked
tui/app:
- /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge
(ask for plain text output) → TUI fallback write from streamBuf
- looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback
content before writing (reject refusals, strip narrative preambles)
- Collapse thinking output to 3 lines; ctrl+o to expand (live stream
and committed messages)
- Stream-level filter for model pseudo-tool-call blocks: suppresses
<<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|>
from entering streamBuf across chunk boundaries
- sanitizeAssistantText regex covers both block formats
- Reset streamFilterClose at every turn start