diff --git a/.gitea/issue_template/bug_report.yaml b/.gitea/issue_template/bug_report.yaml deleted file mode 100644 index 21a3d17..0000000 --- a/.gitea/issue_template/bug_report.yaml +++ /dev/null @@ -1,58 +0,0 @@ -name: Bug Report -about: Report something that isn't working correctly -labels: - - bug -body: - - type: textarea - id: description - attributes: - label: Description - description: What happened? What did you expect? - validations: - required: true - - type: textarea - id: reproduction - attributes: - label: Steps to reproduce - description: Minimal steps to trigger the issue - placeholder: | - 1. Run `gnoma --provider anthropic` - 2. Type "..." - 3. See error - validations: - required: true - - type: input - id: version - attributes: - label: gnoma version - description: Output of `gnoma --version` - placeholder: "gnoma 0.1.0 (abc1234, 2026-04-12)" - validations: - required: true - - type: input - id: os - attributes: - label: OS / Architecture - placeholder: "Linux x86_64 / macOS arm64 / Windows amd64" - validations: - required: true - - type: dropdown - id: provider - attributes: - label: Provider - options: - - mistral - - anthropic - - openai - - google - - ollama - - llamacpp - - N/A - validations: - required: false - - type: textarea - id: logs - attributes: - label: Relevant logs - description: Run with `--verbose` for debug output - render: shell diff --git a/.gitea/issue_template/feature_request.yaml b/.gitea/issue_template/feature_request.yaml deleted file mode 100644 index a1849bd..0000000 --- a/.gitea/issue_template/feature_request.yaml +++ /dev/null @@ -1,42 +0,0 @@ -name: Feature Request -about: Suggest an improvement or new capability -labels: - - enhancement -body: - - type: textarea - id: problem - attributes: - label: Problem - description: What are you trying to do that gnoma doesn't support well? 
- validations: - required: true - - type: textarea - id: solution - attributes: - label: Proposed solution - description: How would you like this to work? - validations: - required: true - - type: textarea - id: alternatives - attributes: - label: Alternatives considered - description: Other approaches you've thought about - validations: - required: false - - type: dropdown - id: area - attributes: - label: Area - options: - - providers - - tools - - router - - TUI - - MCP / plugins - - elfs (sub-agents) - - security - - config - - other - validations: - required: false diff --git a/AGENTS.md b/AGENTS.md index 1b3bf6d..f32a8e1 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,26 +1,75 @@ # AGENTS.md -## Domain Terminology -- **Elf**: An agent instance. -- **Turn**: A complete sequence of agentic reasoning and tool execution. -- **Routing Arm**: A specific model/provider selected by the `Router` for a task. -- **Stream Event**: Discrete updates during LLM generation (e.g., `EventTextDelta`, `EventToolCallStart`, `EventToolResult`). +Conventions for AI assistants working in this repository. CLAUDE.md +covers Go style, commits, and TDD policy; this file adds gnoma-specific +domain knowledge those rules do not capture. -## Build & Test Targets -- **Run**: `make run` -- **Test (Verbose)**: `make test-v` -- **Integration Tests**: `make test-integration` (requires `//go:build integration`) +## Domain glossary -## Key Dependencies -- **Mistral**: `github.com/VikingOwl91/mistral-go-sdk` -- **Anthropic**: `github.com/anthropics/anthropic-sdk-go` -- **OpenAI**: `github.com/openai/openai-go` -- **Google GenAI**: `google.golang.org/genai` -- **TUI**: `charm.land/bubbletea/v2`, `charm.land/lipgloss/v2` -- **Other**: `charm.land/bubbles/v2`, `charm.land/glamour/v2`, `github.com/pkoukk/tiktoken-go` +| Term | Meaning | +|---|---| +| **Elf** | A sub-agent instance, spawned via `spawn_elfs`. | +| **Turn** | One complete `stream → tool → re-query` cycle in the engine. 
| +| **Arm** | A `(provider, model)` pair the router can select. Registered with cost and capability metadata. | +| **Router** | Multi-armed-bandit selector that picks an Arm per Turn from the registered set. | +| **SLM** | Small language model running locally for prompt classification and trivial-task execution. | +| **Stream Event** | Discriminated-union update emitted while a provider streams: `EventTextDelta`, `EventToolCallStart`, `EventToolResult`, etc. See `internal/stream/event.go`. | +| **SafeProvider** | The sealed boundary that gates outbound provider calls — every Provider implementation embeds the unexported marker. See `internal/security`. | +| **Incognito** | Per-turn mode that disables session persistence and router learning. | +| **Profile** | A named config overlay under `~/.config/gnoma/profiles/`. Switches keys, models, and per-profile router quality data. | -## Environment Variables -- `MISTRAL_API_KEY`: Required for Mistral provider. -- `ANTHROPIC_API_KEY`: Required for Anthropic provider. -- `OPENAI_API_KEY`: Required for OpenAI provider. -- `GOOGLE_API_KEY`: Required for Google provider. +## Build & test targets (beyond standard) + +| Target | Purpose | +|---|---| +| `make test-v` | Verbose unit tests | +| `make test-integration` | Runs `//go:build integration` tests (real API calls) | +| `make check` | fmt + vet + lint + test (use before committing) | +| `go test -bench=. ./internal/router/` | Router benchmarks | + +## Provider env vars + +| Provider | Primary | Alternative | +|---|---|---| +| Anthropic | `ANTHROPIC_API_KEY` | `ANTHROPICS_API_KEY` | +| OpenAI | `OPENAI_API_KEY` | — | +| Google | `GEMINI_API_KEY` | `GOOGLE_API_KEY` | +| Mistral | `MISTRAL_API_KEY` | — | + +`GNOMA_PROVIDER` and `GNOMA_MODEL` override the resolved config. + +## Non-obvious conventions + +- **Discriminated unions** are structs with a `Type` field and pointer + payloads — not Go interfaces. See `internal/stream/event.go` and + `internal/message`. 
+- **Pull-based iterators** follow the `Next() / Current() / Err() / Close()` + shape. Streams in `internal/provider/*/stream.go` are the canonical examples. +- **`json.RawMessage`** flows through `tool.Definition.Parameters` and tool + arguments untouched — never marshal/unmarshal in the middle. +- **Capabilities and ContextWindow** come from `internal/provider` + `inferXxxModelCapabilities` per provider; updating model lists also updates + these tables and the `ratelimits.go` map. +- **Hook ordering** matters for `PostToolUse`. See ADR-004. +- **Plugin trust** is TOFU pinning — see `internal/plugin/pinstore.go` and + ADR-003. + +## Sub-agent (elf) etiquette + +When spawning elfs: + +- One `spawn_elfs` call for all parallel work; never spawn one at a time. +- Read-only tasks on disjoint files parallelize cleanly. +- Writes to the same file must be sequenced into one elf. +- Cap each batch at 5–7 elfs. + +See `internal/skill/skills/batch.md` for the canonical batching template. + +## Reference docs + +- Architecture map: `docs/essentials/INDEX.md` +- ADRs: `docs/essentials/decisions/` +- Profiles: `docs/profiles.md` +- SLM backends: `docs/slm-backends.md` +- Plugin trust: `docs/plugins-trust.md` +- Router benchmarks: `docs/benchmarks/README.md` diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 2f76f1e..3da1898 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,5 +1,10 @@ # Contributing to gnoma +The upstream repository lives at + and is mirrored to +. PRs are accepted on the upstream +(Gitea) instance; the GitHub mirror is read-only. + ## Setup ```sh @@ -11,34 +16,43 @@ make lint # requires golangci-lint ## Development workflow -1. Create a branch from `main` -2. Write tests first (TDD) — table-driven, `t.TempDir()` for filesystem tests -3. `make check` (fmt + vet + lint + test) must pass -4. Commit with conventional messages: `feat:`, `fix:`, `refactor:`, `test:`, `docs:` +1. Branch from `main`. +2. Write tests first (TDD). 
Table-driven where possible, `t.TempDir()` for + filesystem tests, `testing/synctest` for concurrent ones. +3. `make check` (fmt + vet + lint + test) must pass. +4. Conventional commits: `feat:`, `fix:`, `refactor:`, `test:`, `docs:`, + `chore:`. **No co-signing or "Generated-by" trailers.** ## Code style -- Go 1.26 idioms (`new(expr)`, `errors.AsType[E]`) -- Structured logging with `log/slog` -- `json.RawMessage` for tool schemas (zero-cost passthrough) -- Functional options for complex configuration -- Short, lowercase package names — no underscores +- Go 1.26 idioms (`new(expr)`, `errors.AsType[E]`, `sync.WaitGroup.Go`). +- Structured logging with `log/slog`. +- `json.RawMessage` for tool schemas (zero-cost passthrough). +- Functional options for complex configuration. +- Short, lowercase package names — no underscores. +- Discriminated unions via struct + type discriminant, not interfaces. +- Pull-based stream iterators: `Next() / Current() / Err() / Close()`. ## Testing -- Unit tests: `make test` -- Integration tests (require API keys): `make test-integration` -- Coverage: `make cover` -- Benchmarks: `go test -bench=. ./internal/router/` +| Command | What it runs | +|---|---| +| `make test` | unit tests | +| `make test-integration` | tests behind `//go:build integration` — requires real API keys | +| `make cover` | coverage → `coverage.html` | +| `make lint` | `golangci-lint run ./...` | +| `make check` | fmt + vet + lint + test | +| `go test -bench=. ./internal/router/` | router benchmarks | -Integration tests use `//go:build integration` and are skipped by default. +Integration tests are skipped by default. ## Architecture -Read `docs/essentials/INDEX.md` before making architectural changes. Key packages: +Read [`docs/essentials/INDEX.md`](docs/essentials/INDEX.md) before changing +architectural boundaries. 
Key packages: | Package | Purpose | -|---------|---------| +|---|---| | `internal/engine` | Agentic loop (stream → tool → re-query) | | `internal/router` | Multi-armed bandit arm selection | | `internal/provider` | LLM provider adapters | @@ -46,8 +60,24 @@ Read `docs/essentials/INDEX.md` before making architectural changes. Key package | `internal/mcp` | MCP client (JSON-RPC over stdio) | | `internal/plugin` | Plugin manifest, loader, manager | | `internal/elf` | Sub-agent (elf) system | -| `internal/tui` | Bubble Tea terminal UI | +| `internal/security` | SafeProvider boundary, firewall, output scanner | +| `internal/skill` | Skill registry and templating | +| `internal/slm` | Small-language-model classifier + arm | +| `internal/tui` | Bubble Tea v2 terminal UI | -## Issues +ADRs live in [`docs/essentials/decisions/`](docs/essentials/decisions/). -Use the issue templates when filing bugs or requesting features. Include reproduction steps, expected behavior, and gnoma version (`gnoma --version`). +## Reporting issues + +File issues on the upstream Gitea instance with: + +- A short reproduction (commands, prompts, configs that triggered the bug). +- Expected vs. actual behavior. +- `gnoma --version` output and OS / architecture. +- Provider and model in use, if relevant. +- `--verbose` log output if it sheds light. + +## License + +By contributing you agree your work is licensed under the +[Apache License 2.0](LICENSE). diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..d645695 --- /dev/null +++ b/LICENSE @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/NOTICE b/NOTICE new file mode 100644 index 0000000..ad63ead --- /dev/null +++ b/NOTICE @@ -0,0 +1,5 @@ +gnoma +Copyright 2026 vikingowl + +This product includes software developed at the gnoma project +(https://somegit.dev/Owlibou/gnoma). diff --git a/README.md b/README.md index 239c375..34c0f50 100644 --- a/README.md +++ b/README.md @@ -1,234 +1,153 @@ # gnoma -**A provider-agnostic agentic coding assistant built in Go.** gnoma routes tasks to the best available LLM — cloud or local — through a multi-armed bandit router, while tools, hooks, skills, MCP servers, and plugins keep it extensible. Named after the northern pygmy-owl (*Glaucidium gnoma*); agents are called **elfs** (elf owl). 
+**A provider-agnostic agentic coding assistant in Go.** gnoma routes each prompt +to the best available model — cloud or local — through a multi-armed bandit +router, executes tools on your behalf, and stays extensible through hooks, +skills, MCP servers, and plugins. + +Named after the northern pygmy-owl (*Glaucidium gnoma*); agents are called +**elfs** (elf owl). + +- **Upstream:** +- **GitHub mirror:** + +--- + +## Install + +### Pre-built binary (no Go toolchain required) + +Releases are built by [GoReleaser](.goreleaser.yml) for +`linux`, `darwin`, and `windows` × `amd64`/`arm64` as static (`CGO_ENABLED=0`) +archives. Until the first tag is cut, see "Build from source" below. + +Once releases are published: + +```sh +# Pick the archive matching your OS/arch from the releases page: +# https://somegit.dev/Owlibou/gnoma/releases (upstream) +# https://github.com/VikingOwl91/gnoma/releases (mirror) + +# Linux/macOS one-liner (substitute the asset URL): +curl -fsSL | tar -xz -C /tmp +sudo mv /tmp/gnoma /usr/local/bin/ +gnoma --version +``` + +Windows: download the `_windows_*.zip`, extract `gnoma.exe`, and put it on +`%PATH%`. + +### Docker + +Multi-arch images (`linux/amd64`, `linux/arm64`) are published to GitHub +Container Registry on each tagged release: + +```sh +docker pull ghcr.io/vikingowl91/gnoma:latest +docker run --rm -it \ + -v "$PWD:/workspace" \ + -e ANTHROPIC_API_KEY \ + ghcr.io/vikingowl91/gnoma:latest --version +``` + +Mount your project as `/workspace` (the image's working directory) and pass +provider keys via `-e`. + +### Go users + +```sh +go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest # latest tagged +go install somegit.dev/Owlibou/gnoma/cmd/gnoma@main # bleeding edge +``` + +### Build from source + +```sh +git clone https://somegit.dev/Owlibou/gnoma && cd gnoma +make build # → ./bin/gnoma +make install # → $GOPATH/bin/gnoma +``` + +Requires Go 1.26+. 
+ +--- ## Quickstart ```sh -# Install -go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest +# Set at least one provider key (or run a local model — see Providers below). +export ANTHROPIC_API_KEY=sk-ant-... -# Or build from source -git clone https://somegit.dev/Owlibou/gnoma && cd gnoma -make build # binary at ./bin/gnoma - -# Set at least one provider key -export ANTHROPIC_API_KEY=sk-ant-... # or OPENAI_API_KEY, MISTRAL_API_KEY, GEMINI_API_KEY - -# Run -gnoma # interactive TUI -echo "list files" | gnoma # pipe mode -gnoma --provider ollama # use a local model +gnoma # interactive TUI +echo "list files" | gnoma # pipe / one-shot mode +gnoma --provider ollama # use a local model +gnoma --version ``` -## Build +Inside the TUI, `Ctrl+X` toggles **incognito** (no session saved, no router +learning); `/help` lists slash commands; `Esc` cancels an in-flight turn. -```sh -make build # ./bin/gnoma -make install # $GOPATH/bin/gnoma -``` +--- ## Providers -### Anthropic +| Provider | Env var | Default model | Also available | +|---|---|---|---| +| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-6` | `claude-opus-4-7`, `claude-haiku-4-5-20251001` | +| OpenAI | `OPENAI_API_KEY` | `gpt-5.5` | `gpt-5.5-pro`, `gpt-5.2`, `gpt-5.2-chat-latest` | +| Google (Gemini) | `GEMINI_API_KEY` (alt: `GOOGLE_API_KEY`) | `gemini-3.5-flash` | `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite` | +| Mistral | `MISTRAL_API_KEY` | `mistral-large-latest` (Mistral Large 3) | `mistral-medium-3.5`, `magistral-medium-2509` | +| Ollama (local) | — | `qwen3:8b` (override with `--model`) | any model on your Ollama instance | +| llama.cpp (local) | — | reported by `/v1/models` | n/a | +| Subprocess (`claude`, `gemini`, `agy` CLIs) | provider-specific | binary name | configurable via `[cli_agents]` | + +Override per-invocation: ```sh -export ANTHROPIC_API_KEY=sk-ant-... 
-./bin/gnoma --provider anthropic -./bin/gnoma --provider anthropic --model claude-opus-4-5-20251001 +gnoma --provider anthropic --model claude-opus-4-7 +gnoma --provider openai --model gpt-5.5-pro # GPT-5.5 is the default; pro is the higher-accuracy tier +gnoma --provider google --model gemini-3.1-pro-preview +gnoma --provider ollama --model qwen2.5-coder:3b +gnoma --provider llamacpp # model picked from server ``` -Integration tests hit the real API — keep a key in env: +`gnoma providers` prints every discovered provider, model, and CLI agent. + +### Local models + +Start your local server, then point gnoma at it: ```sh -go test -tags integration ./internal/provider/... -``` +# Ollama (default http://localhost:11434/v1) +ollama pull qwen2.5-coder:3b +gnoma --provider ollama --model qwen2.5-coder:3b ---- - -### OpenAI - -```sh -export OPENAI_API_KEY=sk-proj-... -./bin/gnoma --provider openai -./bin/gnoma --provider openai --model gpt-4o -``` - ---- - -### Mistral - -```sh -export MISTRAL_API_KEY=... -./bin/gnoma --provider mistral -``` - ---- - -### Google (Gemini) - -```sh -export GEMINI_API_KEY=AIza... -./bin/gnoma --provider google -./bin/gnoma --provider google --model gemini-2.0-flash -``` - ---- - -### Ollama (local) - -Start Ollama and pull a model, then: - -```sh -./bin/gnoma --provider ollama --model gemma4:latest -./bin/gnoma --provider ollama --model qwen3:8b # default if --model omitted -``` - -Default endpoint: `http://localhost:11434/v1`. 
Override via config or env: - -```sh -# .gnoma/config.toml -[provider] -default = "ollama" -model = "gemma4:latest" - -[provider.endpoints] -ollama = "http://myhost:11434/v1" -``` - ---- - -### llama.cpp (local) - -Start the llama.cpp server: - -```sh +# llama.cpp (default http://localhost:8080/v1) llama-server --model /path/to/model.gguf --port 8080 --ctx-size 8192 +gnoma --provider llamacpp ``` -Then: +Override the endpoint in `.gnoma/config.toml`: -```sh -./bin/gnoma --provider llamacpp -# model name is taken from the server's /v1/models response -``` - -Default endpoint: `http://localhost:8080/v1`. Override: - -```sh +```toml [provider.endpoints] +ollama = "http://myhost:11434/v1" llamacpp = "http://localhost:9090/v1" ``` --- -## Extensibility (M8) - -gnoma supports hooks, skills, MCP servers, and plugins. - -### MCP Servers - -Connect any [MCP](https://modelcontextprotocol.io)-compatible tool server: - -```toml -[[mcp_servers]] -name = "git" -command = "mcp-server-git" -args = ["--repo", "."] -timeout = "30s" - -# Replace a built-in tool with an MCP tool -[mcp_servers.replace_default] -exec = "bash" # MCP tool "exec" replaces gnoma's built-in "bash" -``` - -MCP tools appear as `mcp__{server}__{tool}` (e.g., `mcp__git__status`), or under the built-in name when using `replace_default`. - -### Skills - -Drop markdown files into `.gnoma/skills/` or `~/.config/gnoma/skills/`: - -``` -/skillname # invoke a skill -/skills # list available skills -``` - -### Hooks - -Run shell commands on tool events: - -```toml -[[hooks]] -name = "block-rm-rf" -event = "pre_tool_use" -type = "command" -exec = "bash-safety-check.sh" -tool_pattern = "bash*" -``` - -### Plugins - -Bundle skills, hooks, and MCP configs into installable plugins: - -```sh -gnoma plugin install ./my-plugin # install from directory -gnoma plugin list # list installed plugins -``` - -Plugins are pinned by SHA-256 of their `plugin.json` on first load -(Trust-On-First-Use). 
A manifest that changes between runs is refused with a -clear error and a re-enrollment hint. See [docs/plugins-trust.md](docs/plugins-trust.md) -and [ADR-003](docs/essentials/decisions/003-plugin-trust.md). - ---- - -## Session Persistence - -Conversations are auto-saved to `.gnoma/sessions/` after each completed turn. On a crash you lose at most the current in-flight turn; all previously completed turns are safe. - -### Resume a session - -```sh -gnoma --resume # interactive session picker (↑↓ navigate, Enter load, Esc cancel) -gnoma --resume # restore directly by ID -gnoma -r # shorthand -``` - -Inside the TUI: - -``` -/resume # open picker -/resume # restore by ID -``` - -### Incognito mode - -```sh -gnoma --incognito # no session saved, no quality scores updated -``` - -Toggle at runtime with `Ctrl+X`. - -### Config - -```toml -[session] -max_keep = 20 # how many sessions to retain per project (default: 20) -``` - -Sessions are stored per-project under `.gnoma/sessions//`. Quality scores (EMA routing data) are stored globally at `~/.config/gnoma/quality.json`. - ---- - ## Config -Config is read in priority order: +Configuration merges (lowest → highest priority): -1. `~/.config/gnoma/config.toml` — global -2. `.gnoma/config.toml` — project-local (next to `go.mod` / `.git`) -3. Environment variables +1. Built-in defaults +2. `~/.config/gnoma/config.toml` — global base +3. `~/.config/gnoma/profiles/.toml` — active profile (when profile mode is enabled) +4. `/.gnoma/config.toml` — project override +5. 
Environment variables (`GNOMA_PROVIDER`, `GNOMA_MODEL`, `*_API_KEY`) -Example `.gnoma/config.toml`: +Example global config: ```toml [provider] @@ -243,21 +162,165 @@ ollama = "http://localhost:11434/v1" llamacpp = "http://localhost:8080/v1" [permission] -mode = "auto" # auto | accept_edits | bypass | deny | plan +mode = "auto" # default | accept_edits | bypass | deny | plan | auto + +[session] +max_keep = 20 # sessions retained per project ``` -Environment variable overrides: `GNOMA_PROVIDER`, `GNOMA_MODEL`. +### Profiles + +Drop multiple configs under `~/.config/gnoma/profiles/` and switch with +`--profile ` or `/profile `. Each profile keeps its own router +quality data and session history. Full details: [docs/profiles.md](docs/profiles.md). --- -## Testing +## SLM (small-language-model) routing -```sh -make test # unit tests -make test-integration # integration tests (require real API keys) -make cover # coverage report → coverage.html -make lint # golangci-lint -make check # fmt + vet + lint + test +gnoma can run a tiny local model alongside the main provider to: + +- **Classify** each prompt (task type + complexity + tool requirement) so the + router picks the right arm. +- **Execute** trivial tasks itself (knowledge questions, single file reads, + anything with complexity ≤ 0.3), keeping the heavy provider for real work. + +```toml +[slm] +enabled = true +backend = "auto" # ollama | llamacpp | llamafile | openaicompat | auto | disabled +model = "reecdev/tiny3.5:500m" ``` -Integration tests are gated behind `//go:build integration` and skipped by default. +Setup, presets, and verification: [docs/slm-backends.md](docs/slm-backends.md). +The `auto` backend probes Ollama → llama.cpp → llamafile on startup and picks +the first reachable option. Inspect with `gnoma slm status` and +`gnoma router stats`. + +--- + +## Session persistence + +Sessions are auto-saved per project under `.gnoma/sessions//` after each +completed turn. 
On a crash you lose at most the current in-flight turn.
+
+```sh
+gnoma --resume # interactive picker
+gnoma --resume # restore by ID
+gnoma -r # shorthand
+gnoma --incognito # no save, no router learning
+```
+
+Inside the TUI: `/resume`, `/resume `, `Ctrl+X` (incognito toggle).
+
+Router-quality data (EMA scores) is stored at
+`~/.config/gnoma/quality.json` (or `quality-.json` in profile mode).
+
+---
+
+## Extensibility
+
+### MCP servers
+
+Connect any [MCP](https://modelcontextprotocol.io)-compatible server:
+
+```toml
+[[mcp_servers]]
+name = "git"
+command = "mcp-server-git"
+args = ["--repo", "."]
+timeout = "30s"
+
+# Optionally replace a built-in tool with an MCP one
+[mcp_servers.replace_default]
+exec = "bash"
+```
+
+MCP tools appear as `mcp__{server}__{tool}` unless mapped via `replace_default`.
+
+### Skills
+
+Drop markdown files into `.gnoma/skills/` or `~/.config/gnoma/skills/`. Invoke
+with `/`. List with `/skills`.
+
+### Hooks
+
+Shell commands run on tool events (`pre_tool_use`, `post_tool_use`, etc.):
+
+```toml
+[[hooks]]
+name = "block-rm-rf"
+event = "pre_tool_use"
+type = "command"
+exec = "bash-safety-check.sh"
+tool_pattern = "bash*"
+```
+
+Ordering rules: [ADR-004](docs/essentials/decisions/004-posttooluse-hook-ordering.md).
+
+### Plugins
+
+Plugins bundle skills, hooks, and MCP server configs. Drop a plugin directory
+into `~/.config/gnoma/plugins/` (global) or `/.gnoma/plugins/`
+(project-local); gnoma auto-discovers them on startup.
+
+Each plugin's `plugin.json` is pinned by SHA-256 on first load
+(Trust-On-First-Use). A manifest that changes between runs is refused with a
+clear error and a re-enrolment hint. Full model:
+[docs/plugins-trust.md](docs/plugins-trust.md) and
+[ADR-003](docs/essentials/decisions/003-plugin-trust.md).
+
+### Elfs (sub-agents)
+
+The `spawn_elfs` tool decomposes work into parallel sub-tasks. See
+[`internal/skill/skills/batch.md`](internal/skill/skills/batch.md) for the
+built-in batching skill.
+
+---
+
+## Subcommands
+
+| Command | What it does |
+|---|---|
+| `gnoma providers` | List every discovered provider, model, and CLI agent |
+| `gnoma profile list` / `show ` | Profile diagnostics |
+| `gnoma router stats` | Quality EMA + classifier source breakdown |
+| `gnoma slm setup` / `slm status` | Manage the llamafile-backed SLM |
+
+`gnoma --help` for the full flag set.
+
+---
+
+## Security
+
+gnoma runs tools and shell commands on your behalf. The
+[`internal/security`](internal/security) package canonicalises every path
+(TOCTOU-safe), gates network access through a configurable firewall, and
+scans tool output for secrets before it ever reaches the model. The
+`SafeProvider` boundary keeps incognito-mode data out of long-lived stores.
+
+Architecture references:
+
+- [docs/essentials/INDEX.md](docs/essentials/INDEX.md) — full architecture map
+- [docs/essentials/decisions/](docs/essentials/decisions/) — ADRs 001–004
+
+---
+
+## Development
+
+```sh
+make build # ./bin/gnoma
+make test # unit tests
+make test-integration # //go:build integration — requires real API keys
+make cover # coverage.html
+make lint # golangci-lint
+make check # fmt + vet + lint + test
+```
+
+Architecture, conventions, and TDD workflow: [CONTRIBUTING.md](CONTRIBUTING.md).
+
+---
+
+## License
+
+Apache License 2.0. See [LICENSE](LICENSE) and [NOTICE](NOTICE).
diff --git a/TODO.md b/TODO.md
index c60a624..a55c4fd 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,51 +1,53 @@
 # Gnoma — TODO
 
-Active plans, newest first:
+Active work, newest first.
 
-- **Post-audit security hardening** — **complete (2026-05-19)**. All 14
-  findings from the external review are closed across three waves +
-  one ADR:
+## In flight
+
+- **Distribution** — `.goreleaser.yml` is configured for
+  `linux`/`darwin`/`windows` × `amd64`/`arm64`. Still pending: first
+  tag + release pipeline trigger, optional Homebrew tap and Docker
+  image, mirror release publishing to GitHub.
+- **Compound tools (post-SLM Phase E)** — held until ≥50 SLM
+  observations inform which primitives are worth adding. See
+  [`docs/superpowers/plans/2026-05-19-post-slm-unlock.md`](docs/superpowers/plans/2026-05-19-post-slm-unlock.md).
+
+## Stable backlog (not in active phases)
+
+- **Thinking mode** (disabled / budget / adaptive) — M12.
+- **Structured output** with JSON schema validation — M12.
+- **Native agy JSON output** — switch the subprocess provider to
+  `--output-format stream-json` once the agy CLI supports it,
+  replacing the current prompt-augmentation fallback.
+- **SQLite session persistence** + serve mode — M10.
+- **Task learning** (pattern recognition, persistent tasks) — M11.
+- **Web UI** (`gnoma web`) — M15.
+- **OAuth / keyring** — M13.
+- **Observability** (feature flags, cost dashboards) — M14.
+- **PE / Mach-O ELF support** — future, after ELF Phase 6.
+
+## History
+
+Completed initiatives, kept here as pointers to their plan files:
+
+- **Post-audit security hardening** — complete 2026-05-19. Three waves +
+  one ADR closed all 14 findings from the external review:
   - [Wave 1 — SafeProvider boundary](docs/superpowers/plans/2026-05-19-security-wave1-safeprovider.md)
   - [Wave 2 — Incognito coherence](docs/superpowers/plans/2026-05-19-security-wave2-incognito.md)
-  - [Wave 3 — Scanner + path hygiene](docs/superpowers/plans/2026-05-19-security-wave3-scanner-paths.md)
+  - Wave 3 — scanner + path hygiene (rolled out directly without a
+    plan file; see commits leading up to 2026-05-19 on `internal/security`)
   - [ADR-004 — PostToolUse hook ordering](docs/essentials/decisions/004-posttooluse-hook-ordering.md)
-- **[`docs/superpowers/plans/2026-05-19-post-slm-unlock.md`](docs/superpowers/plans/2026-05-19-post-slm-unlock.md)**
-  — outstanding work after the SLM unlock session. Phases A (two-stage
-  tool routing), B (CLI agent binary override), C (user profiles), and
-  D (per-arm capability tags) are **complete**.
Phase E (compound
-  tools) is held until ≥50 SLM observations inform which primitives are
-  worth adding.
-- **[`docs/superpowers/plans/2026-05-07-gnoma-roadmap.md`](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md)**
-  — broader roadmap (PTY shell, USP integration, ELF, distribution).
-  Phase 4 ("Router Revisit") is superseded by the post-SLM plan above.
+- **Post-SLM unlock** —
+  [plan](docs/superpowers/plans/2026-05-19-post-slm-unlock.md). Phases
+  A–D complete (two-stage tool routing, CLI agent binary override,
+  user profiles, per-arm capability tags).
+- **2026-05-07 roadmap** —
+  [plan](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md). M1–M8
+  done; SLM classifier (Phase 3) complete; Phase 4 superseded by the
+  post-SLM plan.
 
-Phases (2026-05-07 roadmap):
-1. M8 Cleanup (wiring gaps)
-2. PTY Interactive Shell (`tea.ExecProcess`)
-3. SLM Task Classifier (Ollama HTTP, opt-in) — **complete**
-4. Router Revisit — **superseded by post-SLM plan**
-5. USP Security Integration
-6. ELF Binary Support (deferred/opportunistic)
-7. Distribution (CI trigger for goreleaser)
-
----
-
-## Stable Backlog (not in active phases)
-
-- **Thinking mode** (disabled / budget / adaptive) — M12 in milestones
-- **Structured output** with JSON schema validation — M12
-- **Native agy JSON output** — update subprocess provider to use `--output-format stream-json` once supported by agy CLI, replacing the current prompt-augmentation fallback.
-- **SQLite session persistence** + serve mode — M10
-- **Task learning** (pattern recognition, persistent tasks) — M11
-- **Web UI** (`gnoma web`) — M15
-- **OAuth / keyring** — M13
-- **Observability** (feature flags, cost dashboards) — M14
-- **PE / Mach-O support** — future, after ELF Phase 6
-
----
-
-## Architecture References
+## Reference
 
 - Milestones: `docs/essentials/milestones.md`
 - Decisions: `docs/essentials/decisions/`
-- ADR-013 (SLM routing, supersedes ADR-009): `docs/essentials/decisions/002-slm-routing.md`
+- ADR-002 (SLM routing, supersedes earlier ADR-009): `docs/essentials/decisions/002-slm-routing.md`
diff --git a/docs/essentials/INDEX.md b/docs/essentials/INDEX.md
index fa90972..eedce45 100644
--- a/docs/essentials/INDEX.md
+++ b/docs/essentials/INDEX.md
@@ -39,3 +39,4 @@ essentials:
 - [ADR-001 — Initial Decisions](decisions/001-initial-decisions.md)
 - [ADR-002 — SLM Routing](decisions/002-slm-routing.md)
 - [ADR-003 — Plugin Trust via TOFU Manifest Pinning](decisions/003-plugin-trust.md)
+- [ADR-004 — PostToolUse Hook Ordering](decisions/004-posttooluse-hook-ordering.md)
diff --git a/gemma-integration-analysis.md b/gemma-integration-analysis.md
deleted file mode 100644
index 4d85c62..0000000
--- a/gemma-integration-analysis.md
+++ /dev/null
@@ -1,160 +0,0 @@
-> **Note (2026-05-07):** This document describes the `gemini-cli` (Node.js) implementation.
-> The specifics — LiteRT-LM runtime, daemon/PID management, `litert-lm pull`, React/Ink UI —
-> are Node.js artifacts and do not apply to gnoma. The **conceptually relevant part** is the
-> Complexity Rubric and the `GemmaClassifierStrategy` JSON interface, which informed the Go
-> `SLMClassifier` design in Phase 3 of `docs/superpowers/plans/2026-05-07-gnoma-roadmap.md`.
-> For the Go implementation, see ADR-013 (`docs/essentials/decisions/002-slm-routing.md`).
-
-# Gemini CLI Local Model Routing (/gemma) Architecture
-
-The `/gemma` integration in the `gemini-cli` uses a local LLM to perform "Model Routing". It automatically decides whether to use a cheaper/faster model (Flash) or a more powerful one (Pro) based on the user's request.
-
-## Core Architecture
-* **Engine:** Uses **LiteRT-LM**, a lightweight runtime that serves Gemma models via a Gemini-compatible HTTP API.
-* **Model:** Specifically uses a quantized **Gemma 3 1B** model (`gemma3-1b-gpu-custom`). It's ~1GB and runs locally with low latency (~100-200ms for classification).
-* **Orchestration:** The CLI manages the LiteRT server as a background daemon, tracking its state via PID files and logs.
-* **Integration:** A `GemmaClassifierStrategy` is injected into the core `ModelRouterService`. It flattens recent chat history, sends it to the local Gemma model with a strict "Complexity Rubric," and uses the JSON response to switch models dynamically.
-
----
-
-## Integration Todo List
-
-### 1. Infrastructure & Asset Management
-- [ ] **Platform Detection:** Logic to map OS/Arch to the correct LiteRT-LM binary download URL.
-- [ ] **Safe Installer:** Implementation of binary download + SHA256 checksum verification + permission handling (`chmod +x`, macOS quarantine removal).
-- [ ] **Model Manager:** Wrapper for the `litert-lm pull` command to download and verify the 1GB Gemma model.
-
-### 2. Process & Server Management
-- [ ] **Background Daemon:** Implementation of `spawn(..., { detached: true })` to keep the LiteRT server running independently of the CLI session.
-- [ ] **State Tracking:** A PID-file system to manage server lifecycle (start/stop/status) and prevent port collisions.
-- [ ] **Auto-Start Logic:** A manager class (`LiteRtServerManager`) that checks server health on CLI startup and launches it if enabled in settings.
-
-### 3.
Routing Logic (The "Brain")
-- [ ] **Complexity Rubric:** A specialized system prompt that defines what constitutes a "SIMPLE" vs "COMPLEX" task.
-- [ ] **Context Flattener:** Utility to compress the last ~4-20 turns of chat history into a prompt suitable for a small 1B model.
-- [ ] **Strategy Implementation:** The `GemmaClassifierStrategy` class to handle the local API call, parse the JSON "reasoning," and return the model decision.
-
-### 4. User Experience (CLI & UI)
-- [ ] **Management Commands:** Commands like `gemini gemma {setup|start|stop|status|logs}` for lifecycle and troubleshooting.
-- [ ] **Slash Command:** A built-in `/gemma` command that queries the local server health and displays a status panel inside a session.
-- [ ] **React/Ink UI:** A status component to show visual indicators (green/red) for the binary, model, and server state.
-
-### 5. Configuration & Safety
-- [ ] **Scoped Settings:** Separate "User" settings (binary path) from "Workspace" settings (router enabled/disabled for a specific project).
-- [ ] **Failure Resilience:** Logic to gracefully fall back to the default model if the local classifier times out or fails.
-
----
-
-## Routing Prompts
-
-These are the exact prompts used by the `gemini-cli` to force the small 1B model to output structured JSON with strict reasoning criteria.
-
-### 1. The Complexity Rubric
-```markdown
-### Complexity Rubric
-A task is COMPLEX (Choose \`pro\`) if it meets ONE OR MORE of the following criteria:
-1. **High Operational Complexity (Est. 4+ Steps/Tool Calls):** Requires dependent actions, significant planning, or multiple coordinated changes.
-2. **Strategic Planning & Conceptual Design:** Asking "how" or "why." Requires advice, architecture, or high-level strategy.
-3. **High Ambiguity or Large Scope (Extensive Investigation):** Broadly defined requests requiring extensive investigation.
-4. **Deep Debugging & Root Cause Analysis:** Diagnosing unknown or complex problems from symptoms.
-A task is SIMPLE (Choose \`flash\`) if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls). Operational simplicity overrides strategic phrasing.
-```
-
-### 2. Output Format Enforcement
-```markdown
-### Output Format
-Respond *only* in JSON format like this:
-{
-  "reasoning": Your reasoning...
-  "model_choice": Either flash or pro
-}
-And you must follow the following JSON schema:
-{
-  "type": "object",
-  "properties": {
-    "reasoning": {
-      "type": "string",
-      "description": "A brief summary of the user objective, followed by a step-by-step explanation for the model choice, referencing the rubric."
-    },
-    "model_choice": {
-      "type": "string",
-      "enum": ["flash", "pro"]
-    }
-  },
-  "required": ["reasoning", "model_choice"]
-}
-You must ensure that your reasoning is no more than 2 sentences long and directly references the rubric criteria.
-When making your decision, the user's request should be weighted much more heavily than the surrounding context when making your determination.
-```
-
-### 3. The Main System Prompt
-```markdown
-### Role
-You are the **Lead Orchestrator** for an AI system. You do not talk to users. Your sole responsibility is to analyze the **Chat History** and delegate the **Current Request** to the most appropriate **Model** based on the request's complexity.
-
-### Models
-Choose between \`flash\` (SIMPLE) or \`pro\` (COMPLEX).
-1. \`flash\`: A fast, efficient model for simple, well-defined tasks.
-2. \`pro\`: A powerful, advanced model for complex, open-ended, or multi-step tasks.
-
-[... Injects COMPLEXITY_RUBRIC here ...]
-
-[... Injects OUTPUT_FORMAT here ...]
-
-### Examples
-**Example 1 (Strategic Planning):**
-*User Prompt:* "How should I architect the data pipeline for this new analytics service?"
-*Your JSON Output:*
-{
-  "reasoning": "The user is asking for high-level architectural design and strategy.
This falls under 'Strategic Planning & Conceptual Design'.",
-  "model_choice": "pro"
-}
-**Example 2 (Simple Tool Use):**
-*User Prompt:* "list the files in the current directory"
-*Your JSON Output:*
-{
-  "reasoning": "This is a direct command requiring a single tool call (ls). It has Low Operational Complexity (1 step).",
-  "model_choice": "flash"
-}
-**Example 3 (High Operational Complexity):**
-*User Prompt:* "I need to add a new 'email' field to the User schema in 'src/models/user.ts', migrate the database, and update the registration endpoint."
-*Your JSON Output:*
-{
-  "reasoning": "This request involves multiple coordinated steps across different files and systems. This meets the criteria for High Operational Complexity (4+ steps).",
-  "model_choice": "pro"
-}
-**Example 4 (Simple Read):**
-*User Prompt:* "Read the contents of 'package.json'."
-*Your JSON Output:*
-{
-  "reasoning": "This is a direct command requiring a single read. It has Low Operational Complexity (1 step).",
-  "model_choice": "flash"
-}
-**Example 5 (Deep Debugging):**
-*User Prompt:* "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"
-*Your JSON Output:*
-{
-  "reasoning": "The user is reporting an error symptom without a known cause. This requires investigation and falls under 'Deep Debugging'.",
-  "model_choice": "pro"
-}
-**Example 6 (Simple Edit despite Phrasing):**
-*User Prompt:* "What is the best way to rename the variable 'data' to 'userData' in 'src/utils.js'?"
-*Your JSON Output:*
-{
-  "reasoning": "Although the user uses strategic language ('best way'), the underlying task is a localized edit. The operational complexity is low (1-2 steps).",
-  "model_choice": "flash"
-}
-```
-
-### 4. The Per-Request Prompt Structure
-For every routing decision, the CLI flattens the last ~4 turns of chat history and appends the new user request.
-
-```markdown
-You are provided with a **Chat History** and the user's **Current Request** below.
-
-#### Chat History:
-[... Flattened text of the last 4 turns, excluding tool calls ...]
-
-#### Current Request:
-"[... The actual text of what the user just typed ...]"
-```