gnoma
A provider-agnostic agentic coding assistant in Go. gnoma routes each prompt to the best available model — cloud or local — through a multi-armed bandit router, executes tools on your behalf, and stays extensible through hooks, skills, MCP servers, and plugins.
Named after the northern pygmy-owl (Glaucidium gnoma); agents are called elfs (elf owl).
- Upstream: https://somegit.dev/Owlibou/gnoma
- GitHub mirror: https://github.com/VikingOwl91/gnoma
Install
Pre-built binary (no Go toolchain required)
Releases are built by GoReleaser for
linux, darwin, and windows × amd64/arm64 as static (CGO_ENABLED=0)
archives. Grab the one matching your OS/arch from
https://github.com/VikingOwl91/gnoma/releases:
# Linux/macOS one-liner (substitute the asset URL):
curl -fsSL <ARCHIVE_URL> | tar -xz -C /tmp
sudo mv /tmp/gnoma /usr/local/bin/
gnoma --version
Windows: download the _windows_*.zip, extract gnoma.exe, and put it on
%PATH%.
Docker
Multi-arch images (linux/amd64, linux/arm64) are published to GitHub
Container Registry on each tagged release:
docker pull ghcr.io/vikingowl91/gnoma:latest
docker run --rm -it -v "$PWD:/workspace" ghcr.io/vikingowl91/gnoma:latest --version
Mount your project as /workspace (the image's working directory) and pass
any provider keys via -e VAR_NAME — see the Providers table
for env-var names.
Go users
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest # latest tagged
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@main # bleeding edge
Build from source
git clone https://somegit.dev/Owlibou/gnoma && cd gnoma
make build # → ./bin/gnoma
make install # → $GOPATH/bin/gnoma
Requires Go 1.26+.
Quickstart
Set at least one provider key (env var names are listed in the Providers table below) — or run a local model and skip the keys entirely.
gnoma # interactive TUI
echo "list files" | gnoma # pipe / one-shot mode
gnoma --provider ollama # use a local model (no API key needed)
gnoma --version
Inside the TUI, Ctrl+X toggles incognito (no session saved, no router
learning); /help lists slash commands; Esc cancels an in-flight turn.
Vision / image input
Ctrl+V in the TUI pastes a screenshot from the system clipboard:
gnoma writes the bytes to your user cache and inserts a
[Pasted image #imgN] placeholder, which expands to [Image: /path]
when the turn is sent. You can also type a literal [Image: /path]
marker anywhere in a prompt to reference an existing file:
explain this error [Image: /tmp/screen.png] — what's the root cause?
Image markers are parsed by the engine, files larger than 10 MiB are
skipped (the marker stays as plain text), and the router only routes
vision-tagged turns to arms that declare the Vision capability
(Anthropic, OpenAI, Google, and Ollama models that advertise
multimodal support). Image paste is disabled under --incognito to
honour the no-persistence contract.
Providers
| Provider | Env var | Default model | Also available |
|---|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-6 | claude-opus-4-7, claude-haiku-4-5-20251001 |
| OpenAI | OPENAI_API_KEY | gpt-5.5 | gpt-5.5-pro, gpt-5.2, gpt-5.2-chat-latest |
| Google (Gemini) | GEMINI_API_KEY (alt: GOOGLE_API_KEY) | gemini-3.5-flash | gemini-3.1-pro-preview, gemini-3.1-flash-lite |
| Mistral | MISTRAL_API_KEY | mistral-large-latest (Mistral Large 3) | mistral-medium-3.5, magistral-medium-2509 |
| Ollama (local) | (none) | qwen3:8b (override with --model) | any model on your Ollama instance |
| llama.cpp (local) | (none) | reported by /v1/models | n/a |
| Subprocess (claude, gemini, agy, codex, vibe CLIs) | provider-specific | binary name | configurable via [cli_agents] |
Override per-invocation:
gnoma --provider anthropic --model claude-opus-4-7
gnoma --provider openai --model gpt-5.5-pro # GPT-5.5 is the default; pro is the higher-accuracy tier
gnoma --provider google --model gemini-3.1-pro-preview
gnoma --provider ollama --model qwen2.5-coder:3b
gnoma --provider llamacpp # model picked from server
gnoma providers prints every discovered provider, model, and CLI agent.
Subprocess sandbox bypass. The agy and codex CLIs each run with
their respective sandboxes enabled by default. Two env vars exist for the
rare case where a sandbox blocks legitimate work (e.g., reading files
outside the project root):
| Env var | Effect |
|---|---|
| GNOMA_AGY_BYPASS_PERMISSIONS=1 | Skip agy's permission prompts |
| GNOMA_CODEX_BYPASS_SANDBOX=1 | Disable codex's filesystem sandbox |
These are footguns — set them deliberately, per-invocation. They do not disable gnoma's own permission system, hooks, or firewall.
Local models
Start your local server, then point gnoma at it:
# Ollama (default http://localhost:11434/v1)
ollama pull qwen2.5-coder:3b
gnoma --provider ollama --model qwen2.5-coder:3b
# llama.cpp (default http://localhost:8080/v1)
llama-server --model /path/to/model.gguf --port 8080 --ctx-size 8192
gnoma --provider llamacpp
Override the endpoint in .gnoma/config.toml:
[provider.endpoints]
ollama = "http://myhost:11434/v1"
llamacpp = "http://localhost:9090/v1"
Config
Configuration merges (lowest → highest priority):
- Built-in defaults
- ~/.config/gnoma/config.toml: global base
- ~/.config/gnoma/profiles/<name>.toml: active profile (when profile mode is enabled)
- <projectRoot>/.gnoma/config.toml: project override
- Environment variables (GNOMA_PROVIDER, GNOMA_MODEL, *_API_KEY)
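The layering above amounts to "later layers override earlier ones". A minimal sketch of that idea, assuming a hypothetical two-field `Settings` struct and a field-by-field override rule (gnoma's real config types and merge code may differ):

```go
package main

import "fmt"

// Settings is a hypothetical, trimmed-down view of the merged configuration.
type Settings struct {
	Provider string
	Model    string
}

// merge applies layers from lowest to highest priority. A non-empty field
// in a later layer overrides the value carried over from earlier layers.
func merge(layers ...Settings) Settings {
	var out Settings
	for _, l := range layers {
		if l.Provider != "" {
			out.Provider = l.Provider
		}
		if l.Model != "" {
			out.Model = l.Model
		}
	}
	return out
}

func main() {
	global := Settings{Provider: "anthropic", Model: "claude-sonnet-4-6"}
	project := Settings{Provider: "ollama", Model: "qwen2.5-coder:3b"}
	env := Settings{Model: "qwen3:8b"} // e.g. GNOMA_MODEL=qwen3:8b

	// The environment layer wins for Model; the project layer wins for Provider.
	fmt.Printf("%+v\n", merge(global, project, env))
}
```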
Example global config:
[provider]
default = "anthropic"
model = "claude-sonnet-4-6"
[provider.api_keys]
anthropic = "${ANTHROPIC_API_KEY}"
[provider.endpoints]
ollama = "http://localhost:11434/v1"
llamacpp = "http://localhost:8080/v1"
[permission]
mode = "auto" # default | accept_edits | bypass | deny | plan | auto
[session]
max_keep = 20 # sessions retained per project
Profiles
Drop multiple configs under ~/.config/gnoma/profiles/ and switch with
--profile <name> or /profile <name>. Each profile keeps its own router
quality data and session history. Full details: docs/profiles.md.
Routing defaults
Discovered arms ship with opinionated defaults — Strengths (per-task
preference) and MaxComplexity (ceiling above which the arm won't be
picked) — so a freshly-pulled fleet routes sensibly without any
[[arms]] config. Defaults match against the model ID with
longest-prefix-wins; size-keyed families (Qwen 3, Ministral 3, tiny3.5,
etc.) scale MaxComplexity down for smaller variants automatically.
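Longest-prefix-wins matching can be sketched as below. The `defaults` map and `lookup` function are hypothetical names for this illustration, and the ceilings are illustrative values rather than gnoma's actual lookup table.

```go
package main

import (
	"fmt"
	"strings"
)

// defaults maps a model-ID prefix to a MaxComplexity ceiling.
// The map and its values are illustrative assumptions.
var defaults = map[string]float64{
	"qwen3":         0.75,
	"qwen3-coder":   0.85,
	"qwen2.5-coder": 0.70,
}

// lookup returns the entry whose key is the longest prefix of modelID.
// "qwen3-coder:30b" matches both "qwen3" and "qwen3-coder"; the longer
// prefix wins.
func lookup(modelID string) (prefix string, ceiling float64, ok bool) {
	for p, c := range defaults {
		if strings.HasPrefix(modelID, p) && len(p) > len(prefix) {
			prefix, ceiling, ok = p, c, true
		}
	}
	return prefix, ceiling, ok
}

func main() {
	p, c, ok := lookup("qwen3-coder:30b")
	fmt.Println(p, c, ok)
}
```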
Non-chat models (embeddinggemma, whisper-base, kokoros,
vibevoice, *-asr, *-tts, *-audio, *-reranker,
*-embedding) are skipped during discovery so they never register
as broken chat arms.
| Local family | Strengths | MaxComplexity |
|---|---|---|
| qwen3-coder / devstral | Generation, Refactor, Debug | 0.85 |
| qwen2.5-coder | Generation, Refactor, UnitTest | 0.70 |
| phi-4 | Planning, Debug, Review | 0.65 |
| gemma4 (base ~9B) | Explain, Review, Generation | 0.70 |
| gemma4-e / gemma-4-e (edge 2B–4B) | Explain, Boilerplate | 0.45 |
| mistral-small-3 | Orchestration, Review | 0.65 |
| qwen3 | Generation, Refactor, Debug | 0.50–0.75 (size-keyed) |
| qwen3.5 | Boilerplate, Explain, Orchestration | 0.40–0.65 |
| ministral-3 | Orchestration, Planning | 0.35–0.70 |
| tiny3.5 | Boilerplate, Explain | 0.20–0.30 |
| phi-4-mini / llama3.2 / granite | Boilerplate, Explain | 0.30–0.35 |
| functiongemma | (Disabled; reserved for tool-router role) | 0.40 |
| Cloud model | Strengths | CostWeight |
|---|---|---|
| claude-opus-4-7 | Planning, SecurityReview, Debug, Refactor | 0.3 |
| claude-sonnet-4-6 | Generation, Refactor, Review | 0.7 |
| gpt-5.5 | Planning, SecurityReview, Generation | 0.3 |
| gpt-5.3-codex | Generation, Refactor, Debug, UnitTest | 0.6 |
| gpt-5.2 | Orchestration, Review | 0.8 |
| gemini-3.1-pro | Planning, Review, Orchestration | 0.5 |
| gemini-3.5-flash | Boilerplate, Explain, Orchestration | 1.2 |
CostWeight scales how much $/Mtok matters in scoring: values below
1.0 keep expensive frontier arms competitive on high-stakes tasks
(Planning, SecurityReview); values above 1.0 penalize cost more so
cheap fast arms only win when cost is genuinely decisive.
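To make the effect of CostWeight concrete, here is a toy calculation. gnoma's actual scoring formula is not given in this README, so the linear penalty below, the `score` function, and every number in `main` are assumptions chosen only to show the direction of the effect.

```go
package main

import "fmt"

// score is a hypothetical linear scoring rule: quality minus a cost
// penalty scaled by costWeight. normCost is the arm's price normalised
// to the range [0, 1]. This is NOT gnoma's real formula.
func score(quality, normCost, costWeight float64) float64 {
	return quality - costWeight*normCost
}

func main() {
	// An expensive frontier arm with a low CostWeight versus a cheap arm
	// with a high CostWeight. All inputs are made up for illustration.
	frontier := score(0.90, 0.80, 0.3)
	cheap := score(0.60, 0.05, 1.2)
	fmt.Printf("frontier=%.2f cheap=%.2f\n", frontier, cheap)
}
```

Under this assumed rule, lowering an arm's CostWeight shrinks its cost penalty, which is how an expensive arm can stay competitive on tasks where quality matters most.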
Overriding the defaults
Drop an [[arms]] block in config.toml to override per-arm
Strengths or CostWeight. User values win — defaults only fill
zero fields:
[[arms]]
id = "anthropic/claude-opus-4-7"
strengths = ["security_review", "planning", "debug"]
cost_weight = 0.2 # weight cost even less than the default 0.3
[[arms]]
id = "ollama/qwen3-coder:30b"
strengths = ["generation", "refactor"]
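The "defaults only fill zero fields" rule can be sketched like this. The `Arm` struct and `applyDefaults` function are hypothetical stand-ins for whatever gnoma uses internally.

```go
package main

import "fmt"

// Arm is a hypothetical, simplified arm configuration.
type Arm struct {
	ID         string
	Strengths  []string
	CostWeight float64
}

// applyDefaults copies a default into user only where the user left the
// field at its zero value, so explicit user values always win.
func applyDefaults(user, def Arm) Arm {
	if len(user.Strengths) == 0 {
		user.Strengths = def.Strengths
	}
	if user.CostWeight == 0 {
		user.CostWeight = def.CostWeight
	}
	return user
}

func main() {
	def := Arm{Strengths: []string{"planning", "debug"}, CostWeight: 0.3}
	user := Arm{ID: "anthropic/claude-opus-4-7", CostWeight: 0.2}

	// Strengths come from the defaults; CostWeight keeps the user's 0.2.
	fmt.Printf("%+v\n", applyDefaults(user, def))
}
```

One consequence of a zero-means-unset rule like this is that a user cannot explicitly set a field to its zero value; whether gnoma has that limitation is not stated here.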
Full rationale and benchmark sources behind these defaults:
docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md.
Preferring local vs cloud
[router].prefer biases routing toward one camp without hard-filtering
the other:
[router]
prefer = "auto" # auto (default) | local | cloud
| Value | Effect |
|---|---|
| "auto" | No bias. Tier order (SLM → CLI-agent → local → cloud) decides, with Strengths and quality scores breaking ties. Default. |
| "local" | Cloud arms are demoted by 2 tiers. Local + CLI-agent arms always win unless no local option is feasible. |
| "cloud" | Local arms are demoted by 2 tiers. Cloud arms win, except for tier-0 SLMs: a small specialist arm whose MaxComplexity ceiling fits the task still wins, by design (the SLM is for small stuff). |
Three things still take priority over prefer:
- --provider X pins the forced arm.
- Incognito (Ctrl+X or --incognito) hard-filters cloud arms: prefer = "cloud" under incognito still picks a local arm.
- A Strengths-tagged arm always wins its tagged task type, regardless of prefer. Tag Opus with [security_review] under prefer = "local" and Opus still wins SecurityReview tasks.
CLI-agent subprocess arms (claude, gemini, vibe) count as local for this knob — they proxy to cloud but run as local processes. Use --provider <name> if you need to pin a specific subprocess.
SLM (small-language-model) routing
gnoma can run a tiny local model alongside the main provider to:
- Classify each prompt (task type + complexity + tool requirement) so the router picks the right arm.
- Execute trivial tasks itself (knowledge questions, single file reads, anything with complexity ≤ 0.3), keeping the heavy provider for real work.
[slm]
enabled = true
backend = "auto" # ollama | llamacpp | llamafile | openaicompat | auto | disabled
model = "reecdev/tiny3.5:500m"
Setup, presets, and verification: docs/slm-backends.md.
The auto backend probes Ollama → llama.cpp → llamafile on startup and picks
the first reachable option. Inspect with gnoma slm status and
gnoma router stats.
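The "first reachable backend wins" probing can be sketched as follows. The `backend` type, the `firstReachable` and `reachable` helpers, and the 500 ms timeout are assumptions of this sketch; the two URLs reuse the default endpoints listed under Local models, and llamafile is left out because its endpoint is not stated in this README.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// backend is a hypothetical name/URL pair for a candidate SLM backend.
type backend struct {
	Name string
	URL  string
}

// reachable reports whether a GET to url yields any HTTP response
// within a short timeout (the timeout value is an assumption).
func reachable(url string) bool {
	client := &http.Client{Timeout: 500 * time.Millisecond}
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return true
}

// firstReachable walks candidates in order and returns the first one
// for which probe succeeds.
func firstReachable(candidates []backend, probe func(string) bool) (backend, bool) {
	for _, b := range candidates {
		if probe(b.URL) {
			return b, true
		}
	}
	return backend{}, false
}

func main() {
	candidates := []backend{
		{Name: "ollama", URL: "http://localhost:11434/v1/models"},
		{Name: "llamacpp", URL: "http://localhost:8080/v1/models"},
	}
	if b, ok := firstReachable(candidates, reachable); ok {
		fmt.Println("selected backend:", b.Name)
	} else {
		fmt.Println("no backend reachable")
	}
}
```

Passing the probe as a function keeps the ordering logic testable without a running server.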
Session persistence
Sessions are auto-saved per project under .gnoma/sessions/<id>/ after each
completed turn. On a crash you lose at most the current in-flight turn.
gnoma --resume # interactive picker
gnoma --resume <id> # restore by ID
gnoma -r # shorthand
gnoma --incognito # no save, no router learning
Inside the TUI: /resume, /resume <id>, Ctrl+X (incognito toggle).
Router-quality data (EMA scores) is stored at
~/.config/gnoma/quality.json (or quality-<profile>.json in profile mode).
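For readers unfamiliar with EMA scores, an exponential moving average update looks like the sketch below. The smoothing factor and the `updateEMA` name are assumptions; this README does not state the value gnoma uses.

```go
package main

import "fmt"

// updateEMA blends a new sample into the previous average. alpha in (0, 1]
// controls how quickly old observations fade; the value used by gnoma is
// not documented here, so 0.2 below is purely illustrative.
func updateEMA(prev, sample, alpha float64) float64 {
	return alpha*sample + (1-alpha)*prev
}

func main() {
	score := 0.5
	for _, outcome := range []float64{1, 1, 0, 1} {
		score = updateEMA(score, outcome, 0.2)
	}
	fmt.Printf("%.3f\n", score)
}
```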
Extensibility
MCP servers
Connect any MCP-compatible server:
[[mcp_servers]]
name = "git"
command = "mcp-server-git"
args = ["--repo", "."]
timeout = "30s"
# Optionally replace a built-in tool with an MCP one
[mcp_servers.replace_default]
exec = "bash"
MCP tools appear as mcp__{server}__{tool} unless mapped via replace_default.
Skills
Drop markdown files into .gnoma/skills/ or ~/.config/gnoma/skills/. Invoke
with /<skill-name>. List with /skills.
Hooks
Shell commands run on tool events (pre_tool_use, post_tool_use, etc.):
[[hooks]]
name = "block-rm-rf"
event = "pre_tool_use"
type = "command"
exec = "bash-safety-check.sh"
tool_pattern = "bash*"
Ordering rules: ADR-004.
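A tool_pattern such as "bash*" reads like a shell-style glob. Assuming that is the intended semantics (the README does not say which matcher gnoma uses), matching a tool name against it could look like this, with `matchesTool` as a hypothetical helper:

```go
package main

import (
	"fmt"
	"path"
)

// matchesTool reports whether toolName matches a hook's tool_pattern,
// assuming shell-style glob semantics as implemented by path.Match.
// A malformed pattern is treated as "no match".
func matchesTool(pattern, toolName string) bool {
	ok, err := path.Match(pattern, toolName)
	return err == nil && ok
}

func main() {
	for _, tool := range []string{"bash", "bash_background", "read_file"} {
		fmt.Println(tool, matchesTool("bash*", tool))
	}
}
```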
Plugins
Plugins bundle skills, hooks, and MCP server configs. Drop a plugin directory
into ~/.config/gnoma/plugins/ (global) or <project>/.gnoma/plugins/
(project-local); gnoma auto-discovers them on startup.
Each plugin's plugin.json is pinned by SHA-256 on first load
(Trust-On-First-Use). A manifest that changes between runs is refused with a
clear error and a re-enrolment hint. Full model:
docs/plugins-trust.md and
ADR-003.
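The Trust-On-First-Use idea can be sketched in a few lines. The in-memory `pins` map and the `verify` method are hypothetical; how gnoma actually stores pins and handles re-enrolment is covered in docs/plugins-trust.md, not here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

var errManifestChanged = errors.New("plugin manifest changed since first load")

// pins maps a plugin name to the hex SHA-256 digest recorded on first load.
// A real implementation would persist this; the map is for illustration.
type pins map[string]string

// verify pins the manifest digest on first use and rejects any later
// manifest whose digest differs from the pinned one.
func (p pins) verify(name string, manifest []byte) error {
	sum := sha256.Sum256(manifest)
	digest := hex.EncodeToString(sum[:])

	pinned, seen := p[name]
	if !seen {
		p[name] = digest // first use: trust and pin
		return nil
	}
	if pinned != digest {
		return errManifestChanged
	}
	return nil
}

func main() {
	store := pins{}
	fmt.Println(store.verify("demo", []byte(`{"name":"demo","version":"1"}`)))
	fmt.Println(store.verify("demo", []byte(`{"name":"demo","version":"2"}`)))
}
```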
Elfs (sub-agents)
The spawn_elfs tool decomposes work into parallel sub-tasks. See
internal/skill/skills/batch.md for the
built-in batching skill.
Subcommands
| Command | What it does |
|---|---|
| gnoma providers | List every discovered provider, model, and CLI agent |
| gnoma profile list / show <name> | Profile diagnostics |
| gnoma router stats | Quality EMA + classifier source breakdown |
| gnoma slm setup / slm status | Manage the llamafile-backed SLM |
gnoma --help for the full flag set.
Security
gnoma runs tools and shell commands on your behalf. The
internal/security package canonicalises every path
(TOCTOU-safe), gates network access through a configurable firewall, and
scans tool output for secrets before it ever reaches the model. The
SafeProvider boundary keeps incognito-mode data out of long-lived stores.
Entropy false-positive reduction
The secret scanner also computes Shannon entropy on long unstructured
tokens to catch unknown-format secrets. Under a lowered threshold or
redact_high_entropy = true, this can fire on shapes that are never
secrets (UUIDs, SHA digests, ISO-8601 timestamps, URLs). Opt into the
format-aware safelist to skip them:
[security]
entropy_threshold = 3.5
redact_high_entropy = true
entropy_safelist = ["uuid", "sha_hex", "iso8601", "url"]
The default is an empty list, which preserves the pre-safelist behaviour.
Skips are logged at Debug level, per pattern, with the token length only
(never the bytes), so the real false-positive rate is measurable on real
workloads.
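The interaction between the entropy check and a format-aware safelist can be sketched as below. The Shannon entropy formula is standard; the UUID regular expression, the `shouldRedact` helper, and its boolean safelist flag are assumptions standing in for gnoma's scanner.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
)

// shannonEntropy returns the Shannon entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	var h float64
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// uuidRe recognises the canonical 8-4-4-4-12 hex UUID shape
// (a hypothetical stand-in for the "uuid" safelist entry).
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// shouldRedact flags a token whose entropy exceeds threshold, unless the
// UUID safelist is enabled and the token has a UUID shape.
func shouldRedact(token string, threshold float64, safelistUUID bool) bool {
	if safelistUUID && uuidRe.MatchString(token) {
		return false
	}
	return shannonEntropy(token) > threshold
}

func main() {
	id := "123e4567-e89b-12d3-a456-426614174000"
	fmt.Printf("entropy=%.2f\n", shannonEntropy(id))
	fmt.Println("without safelist:", shouldRedact(id, 3.5, false))
	fmt.Println("with safelist:   ", shouldRedact(id, 3.5, true))
}
```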
Startup safety check
gnoma classifies the current working directory before launch and refuses, warns, or allows based on tier:
| Tier | What | Behavior |
|---|---|---|
| Refuse | /, /etc, /sys, /proc, /usr, /var, /bin, /sbin, /boot, /root, /dev (and macOS equivalents /System, /Library, /private, /Applications) | Refuses to start. Exit code 2. |
| Warn | $HOME, ~/Desktop, ~/Downloads, ~/Documents, ~/.config, ~/.local, ~/.cache, /tmp | Prints a warning banner and waits for y keypress to continue. Anything else (including piped EOF) aborts with exit 1. |
| OK | Anywhere with a project marker (.gnoma/, go.mod, package.json, pyproject.toml, Cargo.toml, Makefile, Dockerfile, build.gradle, pom.xml) or inside a git repo | No prompt. |
A project marker anywhere — including inside $HOME — promotes the
directory to OK. The banner is shown for every tier and summarizes
cwd, git branch, project type, provider, model, modes, and a
top-level sensitive-file inventory (.env, SSH keys, *.pem,
.ssh/, .aws/, etc.).
[safety]
refuse_in_system_dirs = true # default
warn_in_home = true # default
require_project_marker = false # default — being inside a git repo is enough
Bypass all safety checks with --dangerously-allow-anywhere. The flag is
required for non-interactive invocations (piped stdin, CI) in warn-tier
dirs, since there's no human present to consent.
Containers (/.dockerenv or /run/.containerenv present) automatically
downgrade refuse-tier paths to warn-tier — devcontainers commonly run
from / or /workspace.
Full design:
docs/superpowers/plans/2026-05-23-startup-safety-banner.md.
Architecture references:
- docs/essentials/INDEX.md — full architecture map
- docs/essentials/decisions/ — ADRs 001–004
Development
make build # ./bin/gnoma
make test # unit tests
make test-integration # //go:build integration — requires real API keys
make cover # coverage.html
make lint # golangci-lint
make check # fmt + vet + lint + test
Architecture, conventions, and TDD workflow: CONTRIBUTING.md.