vikingowl 3873f90f83 feat: local model reliability — SDK retries, capability probing, init skill, context compaction
Three compounding bugs prevented tool calling with llama.cpp:
- Stream parser set argsComplete on partial JSON (e.g. "{"), dropping
  subsequent argument deltas — fix: use json.Valid to detect completeness
- Missing tool_choice default — llama.cpp needs explicit "auto" to
  activate its GBNF grammar constraint; now set when tools are present
- Tool names in history used internal format (fs.ls) while definitions
  used API format (fs_ls) — now re-sanitized in translateMessage

Additional changes:
- Disable SDK retries for local providers (500s are deterministic)
- Dynamic capability probing via /props (llama.cpp) and /api/show
  (Ollama), replacing hardcoded model prefix list
- Engine respects forced arm ToolUse capability when router is active
- Bundled /init skill with Go template blocks, context-aware for local
  vs cloud models, deduplication rules against CLAUDE.md
- Tool result compaction for local models — previous round results
  replaced with size markers to stay within small context windows
- Text-only fallback when tool-parse errors occur on local models
- "text-only" TUI indicator when model lacks tool support
- Session ResetError for retry after stream failures
- AllowedTools per-turn filtering in engine buildRequest
2026-04-13 02:01:01 +02:00

gnoma

A provider-agnostic agentic coding assistant built in Go. gnoma routes tasks to the best available LLM — cloud or local — through a multi-armed bandit router, while tools, hooks, skills, MCP servers, and plugins keep it extensible. Named after the northern pygmy-owl (Glaucidium gnoma); agents are called elfs (elf owl).

Quickstart

# Install
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest

# Or build from source
git clone https://somegit.dev/Owlibou/gnoma && cd gnoma
make build    # binary at ./bin/gnoma

# Set at least one provider key
export ANTHROPIC_API_KEY=sk-ant-...   # or OPENAI_API_KEY, MISTRAL_API_KEY, GEMINI_API_KEY

# Run
gnoma                                 # interactive TUI
echo "list files" | gnoma             # pipe mode
gnoma --provider ollama               # use a local model

Build

make build          # ./bin/gnoma
make install        # $GOPATH/bin/gnoma

Providers

Anthropic

export ANTHROPIC_API_KEY=sk-ant-...
./bin/gnoma --provider anthropic
./bin/gnoma --provider anthropic --model claude-opus-4-5-20251001

Integration tests hit the real API — keep a key in env:

go test -tags integration ./internal/provider/...

OpenAI

export OPENAI_API_KEY=sk-proj-...
./bin/gnoma --provider openai
./bin/gnoma --provider openai --model gpt-4o

Mistral

export MISTRAL_API_KEY=...
./bin/gnoma --provider mistral

Google (Gemini)

export GEMINI_API_KEY=AIza...
./bin/gnoma --provider google
./bin/gnoma --provider google --model gemini-2.0-flash

Ollama (local)

Start Ollama and pull a model, then:

./bin/gnoma --provider ollama --model gemma4:latest
./bin/gnoma --provider ollama --model qwen3:8b     # default if --model omitted

Default endpoint: http://localhost:11434/v1. Override via config or env:

# .gnoma/config.toml
[provider]
default = "ollama"
model   = "gemma4:latest"

[provider.endpoints]
ollama = "http://myhost:11434/v1"

llama.cpp (local)

Start the llama.cpp server:

llama-server --model /path/to/model.gguf --port 8080 --ctx-size 8192

Then:

./bin/gnoma --provider llamacpp
# model name is taken from the server's /v1/models response

Default endpoint: http://localhost:8080/v1. Override:

[provider.endpoints]
llamacpp = "http://localhost:9090/v1"

Extensibility (M8)

gnoma supports hooks, skills, MCP servers, and plugins.

MCP Servers

Connect any MCP-compatible tool server:

[[mcp_servers]]
name    = "git"
command = "mcp-server-git"
args    = ["--repo", "."]
timeout = "30s"

# Replace a built-in tool with an MCP tool
[mcp_servers.replace_default]
exec = "bash"   # MCP tool "exec" replaces gnoma's built-in "bash"

MCP tools appear as mcp__{server}__{tool} (e.g., mcp__git__status), or under the built-in name when using replace_default.

Skills

Drop markdown files into .gnoma/skills/ or ~/.config/gnoma/skills/:

/skillname          # invoke a skill
/skills             # list available skills

Hooks

Run shell commands on tool events:

[[hooks]]
name         = "block-rm-rf"
event        = "pre_tool_use"
type         = "command"
exec         = "bash-safety-check.sh"
tool_pattern = "bash*"

Plugins

Bundle skills, hooks, and MCP configs into installable plugins:

gnoma plugin install ./my-plugin    # install from directory
gnoma plugin list                   # list installed plugins

Session Persistence

Conversations are auto-saved to .gnoma/sessions/ after each completed turn. On a crash you lose at most the current in-flight turn; all previously completed turns are safe.

Resume a session

gnoma --resume              # interactive session picker (↑↓ navigate, Enter load, Esc cancel)
gnoma --resume <id>         # restore directly by ID
gnoma -r                    # shorthand

Inside the TUI:

/resume                     # open picker
/resume <id>                # restore by ID

Incognito mode

gnoma --incognito           # no session saved, no quality scores updated

Toggle at runtime with Ctrl+X.

Config

[session]
max_keep = 20   # how many sessions to retain per project (default: 20)

Sessions are stored per-project under .gnoma/sessions/<id>/. Quality scores (EMA routing data) are stored globally at ~/.config/gnoma/quality.json.


Config

Config is read in priority order:

  1. ~/.config/gnoma/config.toml — global
  2. .gnoma/config.toml — project-local (next to go.mod / .git)
  3. Environment variables

Example .gnoma/config.toml:

[provider]
default = "anthropic"
model   = "claude-sonnet-4-6"

[provider.api_keys]
anthropic = "${ANTHROPIC_API_KEY}"

[provider.endpoints]
ollama   = "http://localhost:11434/v1"
llamacpp = "http://localhost:8080/v1"

[permission]
mode = "auto"   # auto | accept_edits | bypass | deny | plan

Environment variable overrides: GNOMA_PROVIDER, GNOMA_MODEL.


Testing

make test               # unit tests
make test-integration   # integration tests (require real API keys)
make cover              # coverage report → coverage.html
make lint               # golangci-lint
make check              # fmt + vet + lint + test

Integration tests are gated behind //go:build integration and skipped by default.

Description
No description provided
Readme 635 KiB
Languages
Go 99.9%