docs: add project essentials (12/12 complete)

Vision, domain model, architecture, patterns, process flows,
UML diagrams, API contracts, tech stack, constraints, milestones
(M1-M11), decision log (6 ADRs), and risk register.

Key decisions: single binary, pull-based streaming, Mistral as M1
reference provider, discriminated unions, multi-provider collaboration
as core identity.
2026-04-02 18:09:07 +02:00
parent f909733bff
commit efcb5a2901
14 changed files with 1638 additions and 0 deletions

docs/essentials/INDEX.md

@@ -0,0 +1,35 @@
---
project: gnoma
layout: directory
path: docs/essentials/
essentials:
vision: complete
domain-model: complete
architecture: complete
patterns: complete
process-flows: complete
uml-diagrams: complete
api-contracts: complete
tech-stack: complete
constraints: complete
milestones: complete
decision-log: complete
risks: complete
---
# Project Essentials — gnoma
| # | Essential | Status | Link | Last Updated |
|---|-----------|--------|------|-------------|
| 1 | Vision | complete | [vision.md](vision.md) | 2026-04-02 |
| 2 | Domain Model | complete | [domain-model.md](domain-model.md) | 2026-04-02 |
| 3 | Architecture | complete | [architecture.md](architecture.md) | 2026-04-02 |
| 4 | Patterns | complete | [patterns.md](patterns.md) | 2026-04-02 |
| 5 | Process Flows | complete | [process-flows.md](process-flows.md) | 2026-04-02 |
| 6 | UML Diagrams | complete | [uml-diagrams.md](uml-diagrams.md) | 2026-04-02 |
| 7 | API Contracts | complete | [api-contracts.md](api-contracts.md) | 2026-04-02 |
| 8 | Tech Stack & Conventions | complete | [tech-stack.md](tech-stack.md) | 2026-04-02 |
| 9 | Constraints & Trade-offs | complete | [constraints.md](constraints.md) | 2026-04-02 |
| 10 | Milestones | complete | [milestones.md](milestones.md) | 2026-04-02 |
| 11 | Decision Log | complete | [decisions/001-initial-decisions.md](decisions/001-initial-decisions.md) | 2026-04-02 |
| 12 | Risk / Unknowns | complete | [risks.md](risks.md) | 2026-04-02 |

docs/essentials/api-contracts.md

@@ -0,0 +1,107 @@
---
essential: api-contracts
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture]
---
# API Contracts
gnoma has no external HTTP API. All interfaces are internal Go APIs between packages. The stability guarantees below define how these internal boundaries evolve.
## Core Interfaces
| Interface | Package | Stability | Description |
|-----------|---------|-----------|-------------|
| `Provider` | `provider` | stable | LLM backend adapter contract |
| `Stream` | `stream` | stable | Unified streaming event iterator |
| `Tool` | `tool` | stable | Tool execution contract |
| `Session` | `session` | stable | UI ↔ engine decoupling boundary |
| `Strategy` | `context` | experimental | Compaction strategy contract |
| `Elf` | TBD | experimental | Sub-agent contract (future) |
## Provider Interface
```go
type Provider interface {
Stream(ctx context.Context, req Request) (stream.Stream, error)
Name() string
}
```
**Stability:** Stable. Adding methods requires a new interface (e.g., `ProviderV2`) or optional interface assertion pattern.
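The optional-interface assertion mentioned above can be sketched as follows. This is illustrative only: `ModelLister`, `modelsFor`, and `mistralProvider` are hypothetical names invented for the sketch, not part of the contract.

```go
package main

import "fmt"

// Provider is a minimal form of the stable contract above.
type Provider interface {
	Name() string
}

// ModelLister is a hypothetical optional capability. Providers may or may
// not implement it, and the core Provider interface never has to grow.
type ModelLister interface {
	ListModels() []string
}

type mistralProvider struct{}

func (mistralProvider) Name() string          { return "mistral" }
func (mistralProvider) ListModels() []string  { return []string{"mistral-large"} }

// modelsFor asserts the optional interface and degrades gracefully when
// the provider lacks the capability.
func modelsFor(p Provider) []string {
	if ml, ok := p.(ModelLister); ok {
		return ml.ListModels()
	}
	return nil // capability not supported
}

func main() {
	fmt.Println(modelsFor(mistralProvider{})) // [mistral-large]
}
```

Callers that find the capability missing simply take the fallback path, so existing providers keep compiling unchanged.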
## Stream Interface
```go
type Stream interface {
Next() bool
Current() Event
Err() error
Close() error
}
```
**Stability:** Stable. The pull-based iterator contract is locked.
## Tool Interface
```go
type Tool interface {
Name() string
Description() string
Parameters() json.RawMessage
Execute(ctx context.Context, args json.RawMessage) (Result, error)
IsReadOnly() bool
}
```
**Stability:** Stable. New capabilities added via optional interfaces:
```go
// Future: tools that support streaming output
type StreamingTool interface {
Tool
ExecuteStream(ctx context.Context, args json.RawMessage) (stream.Stream, error)
}
```
## Session Interface
```go
type Session interface {
Send(ctx context.Context, input string) error
Events() <-chan stream.Event
TurnResult() (*engine.Turn, error)
Cancel()
Close() error
Status() SessionStatus
}
```
**Stability:** Stable. This is the boundary that enables future transport implementations (Unix socket, WebSocket) without changing the engine or UI.
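A UI-side consumer of this boundary might look like the following sketch. The `Event` struct here is a minimal stand-in for `stream.Event`, with illustrative field names:

```go
package main

import "fmt"

// Event is a stand-in for stream.Event so the sketch is self-contained;
// the real type uses a typed discriminant, not a string.
type Event struct {
	Type string
	Text string
}

// drainEvents shows the intended consumption pattern: range over the
// session's event channel until the engine closes it at end of turn.
func drainEvents(events <-chan Event) string {
	var out string
	for ev := range events {
		if ev.Type == "text_delta" {
			out += ev.Text
		}
	}
	return out
}

func main() {
	ch := make(chan Event, 2)
	ch <- Event{Type: "text_delta", Text: "hel"}
	ch <- Event{Type: "text_delta", Text: "lo"}
	close(ch) // a real session closes Events() when the turn completes
	fmt.Println(drainEvents(ch)) // hello
}
```

Because the UI only sees a receive-only channel, a future transport (Unix socket, WebSocket) can feed the same loop without any UI changes.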
## Event Schema
Events flow from provider → engine → session → UI. The `stream.Event` struct is the wire format:
| Event Type | Fields Set | Direction |
|-----------|-----------|-----------|
| `EventTextDelta` | `Text` | Provider → UI |
| `EventThinkingDelta` | `Text` | Provider → UI |
| `EventToolCallStart` | `ToolCallID`, `ToolCallName` | Provider → UI |
| `EventToolCallDelta` | `ToolCallID`, `ArgDelta` | Provider → UI |
| `EventToolCallDone` | `ToolCallID`, `Args` | Provider → UI |
| `EventUsage` | `Usage` | Provider → Engine |
| `EventError` | `Err` | Any → Consumer |
## Versioning Strategy
Internal packages under `internal/` have no versioning — they change freely. The `Provider`, `Stream`, `Tool`, and `Session` interfaces are considered public contracts even though they're internal. Breaking changes to these require migration notes in the changelog.
Future public API (if gnoma becomes embeddable as a library) would live under a `pkg/` directory with semantic versioning.
## Changelog
- 2026-04-02: Initial version

docs/essentials/architecture.md

@@ -0,0 +1,158 @@
---
essential: architecture
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [domain-model]
---
# Architecture
## System Context
```mermaid
graph TB
User([Developer]) -->|TUI / CLI pipe| gnoma[gnoma binary]
gnoma -->|HTTPS| Anthropic[Anthropic API]
gnoma -->|HTTPS| OpenAI[OpenAI API]
gnoma -->|HTTPS| Google[Google GenAI API]
gnoma -->|HTTPS| Mistral[Mistral API]
gnoma -->|HTTP| Local[Ollama / llama.cpp]
gnoma -->|stdio JSON-RPC| MCP[MCP Servers]
gnoma -->|exec| Tools[Local Tools<br/>bash, file ops]
```
## Container View
```mermaid
graph TB
subgraph "gnoma (single binary, single process)"
CLI[CLI Parser] --> Router{Mode?}
Router -->|TTY| TUI[TUI — Bubble Tea]
Router -->|Pipe| Pipe[CLI Pipe Mode]
TUI --> SM[Session Manager]
Pipe --> SM
SM --> S1[Session goroutine]
SM --> SN[Session N goroutine]
S1 --> E1[Engine]
SN --> EN[Engine N]
E1 --> PR[Provider Registry]
EN --> PR
PR --> Anthropic[Anthropic adapter]
PR --> OpenAI[OpenAI adapter]
PR --> Google[Google adapter]
PR --> Mistral[Mistral adapter]
PR --> OAICompat[OpenAI-compat adapter]
E1 --> TR[Tool Registry]
EN --> TR
TR --> Bash[bash]
TR --> FS[fs.read / write / edit / glob / grep]
E1 --> PM[Permission Checker]
EN --> PM
E1 --> CTX[Context Window]
EN --> CTX
end
subgraph "Config Stack"
Defaults --> Global["~/.config/gnoma/config.toml"]
Global --> Project[".gnoma/config.toml"]
Project --> Env[Environment Variables]
Env --> Flags[CLI Flags]
end
```
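The config stack's precedence (defaults → global → project → env → flags) amounts to a last-writer-wins merge. A sketch, with a single-field `Config` standing in for the real TOML-backed schema:

```go
package main

import "fmt"

// Config holds one illustrative setting; the real config is TOML-backed
// and has many fields.
type Config struct {
	Model string
}

// merge applies layers in precedence order: a later layer overrides an
// earlier one only when it actually sets a value.
func merge(layers ...Config) Config {
	var out Config
	for _, l := range layers {
		if l.Model != "" {
			out.Model = l.Model
		}
	}
	return out
}

func main() {
	defaults := Config{Model: "mistral-large"}
	project := Config{Model: "claude-sonnet"} // .gnoma/config.toml
	flags := Config{}                         // no --model flag passed
	fmt.Println(merge(defaults, project, flags).Model) // claude-sonnet
}
```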
## Component Overview
| Component | Responsibility | Technology | Boundary |
|-----------|---------------|------------|----------|
| `cmd/gnoma` | Binary entrypoint, flag parsing, mode routing | Go stdlib | Internal |
| `internal/message` | Foundation types: Message, Content, Usage, Response | Pure Go, zero deps | Internal |
| `internal/stream` | Streaming interface, Event types, Accumulator | Depends on message | Internal |
| `internal/provider` | Provider interface, Registry, error taxonomy | Depends on message, stream | Internal |
| `internal/provider/{anthropic,openai,google,mistral}` | SDK adapters: translate + stream | SDK dependencies | Network boundary |
| `internal/provider/openaicompat` | Thin wrapper for Ollama/llama.cpp | Reuses openai adapter | Network boundary |
| `internal/tool` | Tool interface, Registry, bash, file ops | Go stdlib, doublestar | Local system boundary |
| `internal/permission` | Permission modes, rule matching, user prompts | Pure Go | Internal |
| `internal/context` | Token tracking, compaction strategies, sliding window | Depends on message, provider | Internal |
| `internal/config` | TOML layered config loading | BurntSushi/toml | Internal |
| `internal/auth` | API key resolution from env/config | Pure Go | Internal |
| `internal/engine` | Agentic query loop, tool execution orchestration | Depends on all above | Internal |
| `internal/session` | Session lifecycle, channel-based UI decoupling | Depends on engine, stream | Internal |
| `internal/tui` | Terminal UI: chat, input, status, permission dialogs | Bubble Tea, lipgloss | Internal |
## Package Dependency Graph
```mermaid
graph BT
message["message"]
stream["stream"]
provider["provider"]
tool["tool"]
permission["permission"]
context_mgr["context"]
config["config"]
auth["auth"]
engine["engine"]
session["session"]
tui["tui"]
cmd["cmd/gnoma"]
stream --> message
provider --> message
provider --> stream
tool --> message
permission --> message
context_mgr --> message
context_mgr --> provider
config --> permission
engine --> provider
engine --> tool
engine --> permission
engine --> stream
engine --> context_mgr
session --> engine
session --> stream
tui --> session
tui --> stream
cmd --> tui
cmd --> config
cmd --> auth
cmd --> session
cmd --> provider
cmd --> tool
```
## Scope
**In scope:**
- Streaming chat with tool execution across 5+ LLM providers
- Agentic loop (stream → tool calls → re-query → until done)
- Permission system for tool execution
- TUI and CLI pipe modes
- TOML configuration with layering
- Context management and compaction
- Multi-agent (elfs) with per-elf provider routing
- Hook, skill, and MCP extensibility
**Out of scope:**
- Web UI (future, via serve mode)
- Cloud hosting / SaaS deployment
- Training or fine-tuning models
- IDE extension authoring (gnoma provides the backend, not the extension itself)
## Deployment
Single statically-linked Go binary. No runtime dependencies. Runs on Linux, macOS, Windows — anywhere Go compiles. Distributed via `go install`, release binaries, or package managers.
## Changelog
- 2026-04-02: Initial version

docs/essentials/constraints.md

@@ -0,0 +1,68 @@
---
essential: constraints
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [domain-model]
---
# Constraints & Trade-offs
## Non-Functional Requirements
| Constraint | Target | Measurement |
|-----------|--------|-------------|
| First token latency | Dominated by provider, not gnoma overhead | Time from Submit() to first EventTextDelta |
| Binary size | < 20 MB (static, no CGO) | `ls -lh bin/gnoma` |
| Memory per session | < 50 MB baseline (excluding context window) | `runtime.MemStats` |
| Startup time | < 200ms to TUI ready | Wall clock from exec to first render |
| Provider support | 5+ providers from M2 | Count of passing provider integration tests |
| Context window | Up to 200k tokens managed | Token tracker reports |
## Trade-offs
### Single binary over daemon architecture
- **Chose:** Single Go binary, goroutines + channels for all communication
- **Over:** Client-server split with gRPC IPC (gnoma + gnomad)
- **Because:** Simpler deployment, no daemon lifecycle, no protobuf codegen. Go's goroutine model provides sufficient isolation.
- **Consequence:** True process isolation for tool sandboxing requires future work. Multi-client scenarios (IDE + TUI) need serve mode added later.
### Pull-based stream over channels or iter.Seq
- **Chose:** `Next() / Current() / Err() / Close()` interface
- **Over:** Channel-based streaming or Go 1.23+ `iter.Seq` range functions
- **Because:** Matches 3 of 4 SDKs natively (zero-overhead adapter). Supports explicit resource cleanup via `Close()`. Consumer controls backpressure.
- **Consequence:** Google's range-based SDK needs a goroutine bridge. Slightly more verbose than range-based iteration.
### json.RawMessage passthrough over typed schemas
- **Chose:** Tool parameters and arguments as `json.RawMessage`
- **Over:** Typed JSON Schema library or code-generated types
- **Because:** Zero-cost passthrough — no serialize/deserialize between provider and tool. No JSON Schema library as a core dependency.
- **Consequence:** Schema validation happens at tool boundaries, not centrally. Type safety relies on tool implementations parsing their own args.
### Sequential tool execution (MVP) over parallel
- **Chose:** Execute tools one at a time in the agentic loop
- **Over:** Parallel execution via errgroup with read/write partitioning
- **Because:** Simpler to test, debug, and implement permission prompts. Parallel execution adds complexity around error collection and ordering.
- **Consequence:** Multiple tool calls in a single turn execute sequentially. Performance impact is minimal for most workloads. Parallel execution planned for post-MVP.
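A minimal sketch of the sequential MVP loop, assuming hypothetical `toolCall`/`toolResult` shapes:

```go
package main

import "fmt"

type toolCall struct{ ID, Name string }
type toolResult struct{ ToolCallID, Content string }

// runSequential executes tool calls one at a time, preserving order and
// stopping at the first error — the MVP behavior described above. No
// errgroup, no partitioning, no result reordering.
func runSequential(calls []toolCall, exec func(toolCall) (string, error)) ([]toolResult, error) {
	results := make([]toolResult, 0, len(calls))
	for _, c := range calls {
		out, err := exec(c)
		if err != nil {
			return results, fmt.Errorf("%s (%s): %w", c.Name, c.ID, err)
		}
		results = append(results, toolResult{ToolCallID: c.ID, Content: out})
	}
	return results, nil
}

func main() {
	calls := []toolCall{{ID: "tc_1", Name: "bash"}, {ID: "tc_2", Name: "fs.read"}}
	res, _ := runSequential(calls, func(c toolCall) (string, error) {
		return "ok:" + c.Name, nil
	})
	fmt.Println(len(res), res[1].Content) // 2 ok:fs.read
}
```

The planned parallel version slots in behind the same signature, which is part of why the sequential start is cheap.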
### Discriminated union structs over interface hierarchies
- **Chose:** Struct with Type discriminant field for Content and Event types
- **Over:** Interface-based variant types (e.g., `TextContent`, `ToolCallContent` implementing `Content`)
- **Because:** Zero allocation, cache-friendly, works with switch exhaustiveness. Go interfaces for data variants incur boxing overhead.
- **Consequence:** Adding a new content type requires updating switch statements. Acceptable for a small, stable set of variants.
### Mistral as M1 reference provider over Anthropic
- **Chose:** Implement Mistral adapter first as the reference
- **Over:** Starting with Anthropic (richest content model)
- **Because:** User maintains the Mistral Go SDK, knows its internals. Good baseline — similar to OpenAI's API shape. Anthropic's unique features (thinking blocks, cache tokens) are better added as an M2 extension.
- **Consequence:** Thinking block support tested later. Cache token tracking added with Anthropic provider.
## Changelog
- 2026-04-02: Initial version


@@ -0,0 +1,35 @@
# ADR-NNN: [Title]
**Status:** Proposed | Accepted | Deprecated | Superseded by ADR-NNN
**Date:** YYYY-MM-DD
## Context
[Describe the situation that requires a decision. What forces are at play? What constraints exist? What problem are you trying to solve?]
## Decision
[State the decision clearly and concisely. Use active voice: "We will..." or "The system will..."]
## Alternatives Considered
### Alternative A: [Name]
- **Pros:** [advantages]
- **Cons:** [disadvantages]
### Alternative B: [Name]
- **Pros:** [advantages]
- **Cons:** [disadvantages]
## Consequences
**Positive:**
- [Expected benefit]
**Negative:**
- [Expected cost or trade-off]
**Neutral:**
- [Side effects that are neither clearly positive nor negative]

docs/essentials/decisions/001-initial-decisions.md

@@ -0,0 +1,188 @@
# ADR-001: Single Binary with Goroutines
**Status:** Accepted
**Date:** 2026-04-02
## Context
gnoma needs to decouple the UI from the engine to support multiple frontends (TUI, CLI, future IDE extensions). Options were: (a) single binary with goroutines + channels, (b) client-server with gRPC IPC, (c) embedded library.
## Decision
Single Go binary. Engine runs as goroutines within the same process. UI communicates with engine via the `Session` interface over channels. Future serve mode adds a Unix socket listener for external clients — still the same process.
## Alternatives Considered
### Alternative A: gRPC IPC (gnoma + gnomad)
- **Pros:** Process isolation, true sandboxing, multiple clients to one daemon
- **Cons:** Protobuf codegen dependency, daemon lifecycle management, two binaries to distribute
### Alternative B: Embedded library
- **Pros:** Maximum flexibility for embedders
- **Cons:** No standalone binary, API stability burden, harder to ship
## Consequences
**Positive:** Simple deployment, no daemon, no codegen, Go's goroutine model provides sufficient isolation.
**Negative:** No process-level sandboxing for tools. Multi-client scenarios require serve mode (future).
---
# ADR-002: Pull-Based Stream Interface
**Status:** Accepted
**Date:** 2026-04-02
## Context
Need a unified streaming abstraction across 4 SDKs with different patterns: Anthropic, OpenAI, and Mistral use pull-based `Next()/Current()`, while Google uses range-based `for chunk, err := range iter`.
## Decision
Pull-based `Stream` interface: `Next() bool`, `Current() Event`, `Err() error`, `Close() error`. Google adapter bridges via goroutine + buffered channel.
## Alternatives Considered
### Alternative A: Channel-based
- **Pros:** Go-idiomatic, works with `select`
- **Cons:** Requires goroutine per stream, less control over backpressure, no `Close()` for cleanup
### Alternative B: iter.Seq (range-over-func)
- **Pros:** Modern Go pattern, clean `for event := range stream`
- **Cons:** No `Close()` for resource cleanup, no separate error retrieval, doesn't match SDK patterns
## Consequences
**Positive:** Zero-overhead adapter for 3 of 4 SDKs. Explicit resource cleanup. Consumer controls pace.
**Negative:** Google needs a goroutine bridge. Slightly more verbose than range-based.
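The goroutine bridge might be sketched like this, assuming a hypothetical `bridge` helper that adapts a push-style producer to the pull contract:

```go
package main

import "fmt"

type Event struct{ Text string }

// pullStream adapts a push/range-based source to the pull-based contract
// by draining it in a goroutine and buffering into a channel.
type pullStream struct {
	ch   chan Event
	cur  Event
	err  error
	done chan struct{}
}

func bridge(produce func(yield func(Event) bool)) *pullStream {
	s := &pullStream{ch: make(chan Event, 16), done: make(chan struct{})}
	go func() {
		defer close(s.ch)
		produce(func(ev Event) bool {
			select {
			case s.ch <- ev:
				return true
			case <-s.done:
				return false // consumer closed early; stop producing
			}
		})
	}()
	return s
}

func (s *pullStream) Next() bool {
	ev, ok := <-s.ch
	if !ok {
		return false
	}
	s.cur = ev
	return true
}
func (s *pullStream) Current() Event { return s.cur }
func (s *pullStream) Err() error     { return s.err }
func (s *pullStream) Close() error   { close(s.done); return nil }

func main() {
	s := bridge(func(yield func(Event) bool) {
		for _, t := range []string{"a", "b"} {
			if !yield(Event{Text: t}) {
				return
			}
		}
	})
	var got string
	for s.Next() {
		got += s.Current().Text
	}
	s.Close()
	fmt.Println(got) // ab
}
```

The `done` channel is what lets `Close()` stop the producer early — exactly the cleanup hook that range-based iteration lacks.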
---
# ADR-003: Mistral as M1 Reference Provider
**Status:** Accepted
**Date:** 2026-04-02
## Context
Need to pick one provider to implement first as the reference adapter. Candidates: Anthropic (richest model), OpenAI (most popular), Mistral (user maintains SDK).
## Decision
Mistral first. The user maintains `somegit.dev/vikingowl/mistral-go-sdk` and knows its internals. The API shape is similar to OpenAI, making it a good baseline. Anthropic's unique features (thinking blocks, cache tokens) are better tested as M2 extensions.
## Alternatives Considered
### Alternative A: Anthropic first
- **Pros:** Richest content model, most features to test
- **Cons:** Anthropic-specific features (thinking, caching) could bias the abstraction
### Alternative B: OpenAI first
- **Pros:** Most widely used, well-documented
- **Cons:** No special insight into SDK internals
## Consequences
**Positive:** Fast iteration on reference adapter. SDK bugs fixed directly.
**Negative:** Thinking block support tested later (M2).
---
# ADR-004: Discriminated Union Structs
**Status:** Accepted
**Date:** 2026-04-02
## Context
Go lacks sum types. Need to represent Content variants (text, tool call, tool result, thinking) and Event variants.
## Decision
Struct with `Type` discriminant field. Exactly one payload field is set per type value. Consumer switches on `Type`.
## Alternatives Considered
### Alternative A: Interface hierarchy
- **Pros:** Extensible, familiar OOP pattern
- **Cons:** Heap allocation per variant, type assertion overhead, no exhaustive switch checking
### Alternative B: Generics-based enum
- **Pros:** Type-safe, compile-time checked
- **Cons:** Complex, unfamiliar, Go's generics don't support sum types well
## Consequences
**Positive:** Zero allocation, cache-friendly, fast switch dispatch, simple.
**Negative:** New variants require updating all switch statements. Acceptable for small, stable sets.
---
# ADR-005: json.RawMessage for Tool Schemas
**Status:** Accepted
**Date:** 2026-04-02
## Context
Tool parameters (JSON Schema) and tool call arguments need to flow between providers and tools. Options: typed schema library, code generation, or raw JSON passthrough.
## Decision
`json.RawMessage` for both tool parameter schemas and tool call arguments. Zero-cost passthrough between provider and tool. Tools parse their own arguments.
## Alternatives Considered
### Alternative A: JSON Schema library
- **Pros:** Centralized validation, type-safe schema construction
- **Cons:** Core dependency, serialization overhead, schema library selection lock-in
### Alternative B: Code generation from schemas
- **Pros:** Full type safety, compile-time checks
- **Cons:** Build complexity, generated code maintenance, rigid
## Consequences
**Positive:** No JSON Schema dependency. Providers and tools speak JSON natively. Minimal overhead.
**Negative:** Validation at tool boundary only, not centralized.
---
# ADR-006: Multi-Provider Collaboration as Core Identity
**Status:** Accepted
**Date:** 2026-04-02
## Context
Most AI coding assistants are single-provider. gnoma already supports multiple providers, but the question is whether multi-provider collaboration (elfs on different providers working together) is a nice-to-have or a core architectural feature.
## Decision
Multi-provider collaboration is a core feature and part of gnoma's identity. The architecture must support elfs running on different providers simultaneously, with routing rules directing tasks by capability, cost, or latency. This is not an afterthought — it shapes how we design the elf system, provider registry, and session management.
## Alternatives Considered
### Alternative A: Multi-provider as optional extension
- **Pros:** Simpler MVP, routing added later
- **Cons:** Architectural decisions made without routing in mind may need rework
## Consequences
**Positive:** Clear differentiator from all existing tools. Shapes architecture from day one.
**Negative:** Elf system design must account for per-elf provider config from the start.
## Changelog
- 2026-04-02: Initial decisions from architecture planning session

docs/essentials/domain-model.md

@@ -0,0 +1,132 @@
---
essential: domain-model
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [vision]
---
# Domain Model
## Entity Relationships
```mermaid
classDiagram
class Session {
+id: string
+state: SessionState
+Send(input) error
+Events() chan Event
+Cancel()
}
class Engine {
+history: []Message
+usage: Usage
+Submit(input, callback) Turn
+SetProvider(provider)
+SetModel(model)
}
class Message {
+Role: Role
+Content: []Content
+HasToolCalls() bool
+ToolCalls() []ToolCall
+TextContent() string
}
class Content {
+Type: ContentType
+Text: string
+ToolCall: ToolCall
+ToolResult: ToolResult
+Thinking: Thinking
}
class Provider {
<<interface>>
+Stream(req) Stream
+Name() string
}
class Stream {
<<interface>>
+Next() bool
+Current() Event
+Err() error
+Close() error
}
class Tool {
<<interface>>
+Name() string
+Execute(args) Result
+IsReadOnly() bool
}
class Turn {
+Messages: []Message
+Usage: Usage
+Rounds: int
}
class Elf {
<<interface>>
+ID() string
+Send(msg) error
+Events() chan Event
+Wait() ElfResult
}
Session "1" --> "1" Engine : owns
Engine "1" --> "1" Provider : uses
Engine "1" --> "*" Tool : executes
Engine "1" --> "*" Message : history
Engine "1" --> "*" Turn : produces
Message "1" --> "*" Content : contains
Provider "1" --> "*" Stream : creates
Stream "1" --> "*" Event : yields
Session "1" --> "*" Elf : spawns (future)
Elf "1" --> "1" Engine : owns
```
## Glossary
| Term | Definition | Example |
|------|-----------|---------|
| gnoma | The host application — single binary, agentic coding assistant | `gnoma "list files"` |
| Elf | A sub-agent (goroutine) with its own engine, history, and provider. Named after the elf owl. | Background elf exploring `auth/` on Ollama |
| Session | A conversation boundary between UI and engine. Owns one engine, communicates via channels. | TUI session, CLI pipe session |
| Engine | The agentic loop orchestrator. Manages history, streams from provider, executes tools, loops until done. | Engine running on Mistral with 5 tools |
| Provider | An LLM backend adapter. Translates gnoma types to/from SDK-specific types. | Anthropic provider, OpenAI-compat provider |
| Stream | Pull-based iterator over streaming events from a provider. Unified interface across all SDKs. | `for s.Next() { e := s.Current() }` |
| Event | A single streaming delta — text chunk, tool call fragment, thinking trace, or usage update. | `EventTextDelta{Text: "hello"}` |
| Message | A single turn in conversation history. Contains one or more Content blocks. | User text message, assistant message with tool calls |
| Content | A discriminated union within a Message — text, tool call, tool result, or thinking block. | `Content{Type: ContentToolCall, ToolCall: &ToolCall{...}}` |
| ToolCall | The model's request to invoke a tool, with ID, name, and JSON arguments. | `{ID: "tc_1", Name: "bash", Args: {"command": "ls"}}` |
| ToolResult | The output of executing a tool, correlated to a ToolCall by ID. | `{ToolCallID: "tc_1", Content: "file1.go\nfile2.go"}` |
| Turn | The result of a complete agentic loop — may span multiple API calls and tool executions. | Turn with 3 rounds: stream → tool → stream → tool → stream → done |
| Accumulator | Assembles a complete Response from a sequence of streaming Events. Shared across all providers. | Text fragments → complete assistant message |
| Callback | Function the engine calls for each streaming event, enabling real-time UI updates. | `func(evt stream.Event) { ch <- evt }` |
| Round | A single API call within a Turn. A turn with 2 tool-use loops has 3 rounds. | Round 1: initial query. Round 2: after tool results. |
| Routing | Directing tasks to different providers based on capability, cost, or latency rules. | Complex reasoning → Claude, quick lookups → local Qwen |
## Invariants
Rules that must always hold true in the domain:
- A Message always has at least one Content block
- A ToolResult always references a ToolCall.ID from the preceding assistant message
- A Session owns exactly one Engine; an Engine is owned by exactly one Session
- An Elf owns its own Engine — no shared mutable state between elfs
- The Accumulator produces exactly one Response per stream consumption
- Content.Type determines which payload field is set — exactly one is non-nil
- Thinking.Signature must round-trip unchanged through message history (Anthropic requirement)
- Tool execution only happens when StopReason == ToolUse
- Stream.Close() must be called after consumption, regardless of error state
- Provider.Stream() is the only network boundary — all tool execution is local
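The "exactly one payload field is set" invariant can be checked mechanically. A sketch that models only two of the four variants, with a hypothetical `validate` helper:

```go
package main

import "fmt"

type ContentType int

const (
	ContentText ContentType = iota
	ContentToolCall
)

type ToolCall struct{ ID string }

// Content mirrors the discriminated-union shape: Type selects which
// payload field must be set.
type Content struct {
	Type     ContentType
	Text     string
	ToolCall *ToolCall
}

// validate enforces the exactly-one-payload invariant per variant; the
// real type has more cases (tool result, thinking).
func validate(c Content) error {
	switch c.Type {
	case ContentText:
		if c.Text == "" || c.ToolCall != nil {
			return fmt.Errorf("text content must set Text and nothing else")
		}
	case ContentToolCall:
		if c.ToolCall == nil || c.Text != "" {
			return fmt.Errorf("tool call content must set ToolCall and nothing else")
		}
	}
	return nil
}

func main() {
	ok := Content{Type: ContentText, Text: "hi"}
	bad := Content{Type: ContentToolCall} // discriminant says tool call, payload missing
	fmt.Println(validate(ok), validate(bad) != nil) // <nil> true
}
```

A check like this could run in debug builds or tests to catch constructors that set the discriminant and payload inconsistently.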
## Changelog
- 2026-04-02: Initial version

docs/essentials/milestones.md

@@ -0,0 +1,177 @@
---
essential: milestones
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [vision]
---
# Milestones
## M1: Core Engine (MVP)
**Scope:** First working assistant. CLI pipe mode. Mistral as reference provider. Bash + file tools. No TUI, no permissions, no config file.
**Deliverables:**
- [ ] Architecture docs in `docs/essentials/`
- [ ] Foundation types (`internal/message/`)
- [ ] Streaming abstraction (`internal/stream/`)
- [ ] Provider interface + Mistral adapter
- [ ] Tool system: bash, fs.read, fs.write, fs.edit, fs.glob, fs.grep
- [ ] Engine agentic loop (stream → tool → re-query → done)
- [ ] CLI pipe mode (`echo "list files" | gnoma`)
**Exit criteria:** Pipe a coding question in, get a response that uses tools, answer on stdout.
## M2: Multi-Provider
**Scope:** All remaining providers. Config file. Dynamic provider switching.
**Deliverables:**
- [ ] Anthropic provider (streaming + tool use + thinking blocks)
- [ ] OpenAI provider (streaming + tool use)
- [ ] Google provider (streaming + function calling)
- [ ] OpenAI-compat for Ollama and llama.cpp
- [ ] TOML config (global + project + env + flags)
- [ ] `/model provider/model` switching mid-session
**Exit criteria:** Chat with any configured provider via CLI pipe. Switch providers mid-session.
## M3: TUI
**Scope:** Interactive terminal UI. Permission system.
**Deliverables:**
- [ ] Bubble Tea TUI: chat panel, input box, streaming output
- [ ] Status bar (provider, model, token usage)
- [ ] Permission system (allow / deny / prompt modes)
- [ ] Permission dialog overlay
- [ ] Model picker overlay
- [ ] Input history (up/down)
**Exit criteria:** Launch TUI, chat interactively, tools execute with permission prompts.
## M4: Context Intelligence
**Scope:** Long sessions. Token tracking. Compaction. Local tokenizer.
**Deliverables:**
- [ ] Local tokenizer for accurate token counting without provider round-trips
- [ ] Token tracker (cumulative usage, OK/warning/critical states)
- [ ] Truncate compaction (drop old messages, keep system + recent)
- [ ] Summarize compaction (LLM summarizes dropped messages)
- [ ] Compact boundaries (transaction markers for crash recovery)
- [ ] Deferred tool loading (non-essential tools loaded on demand)
- [ ] Result persistence (large tool outputs written to disk)
**Exit criteria:** 100+ turn conversation stays coherent within token budget. Local token counting matches provider reports within 5%.
## M5: Elfs (Multi-Agent + Multi-Provider Routing)
**Scope:** Sub-agents on different providers. Parallel work. Provider routing.
**Deliverables:**
- [ ] Elf spawning (`Engine.SpawnElf` with per-elf provider config)
- [ ] Background elfs (independent goroutine + engine)
- [ ] Parent ↔ elf communication via typed channels
- [ ] Concurrent tool execution (read-only parallel, writes sequential)
- [ ] Provider routing rules (route by capability, cost, latency) — research needed
- [ ] Coordinator dispatches tasks to elfs on different providers
**Exit criteria:** Coordinator on Claude spawns research elf on local Qwen + review elf on OpenAI, collects and synthesizes results.
## M6: Extensibility
**Scope:** Hooks, skills, MCP, plugin foundation.
**Deliverables:**
- [ ] Hook system (PreToolUse / PostToolUse, stdin/stdout protocol)
- [ ] Skill loading (`.gnoma/skills/*.md` with frontmatter)
- [ ] MCP client (JSON-RPC over stdio, tool discovery)
- [ ] Plugin foundation (manifest, install, lifecycle)
**Exit criteria:** MCP server tools appear in gnoma. Skills invocable by model. Hook logs all bash commands.
## M7: Persistence & Serve
**Scope:** Session persistence via SQLite. Serve mode for external clients. Coordinator mode.
**Deliverables:**
- [ ] Session persistence with SQLite (save/restore conversations across restarts)
- [ ] Serve mode (Unix socket listener, external UI clients)
- [ ] Coordinator mode (orchestrator dispatches to worker elfs)
**Exit criteria:** Resume yesterday's conversation. VS Code extension connects via serve mode. Coordinator parallelizes subtasks.
## M8: Thinking & Structured Output
**Scope:** Extended thinking support across providers. Schema-validated structured output.
**Deliverables:**
- [ ] Thinking mode (disabled / enabled with budget / adaptive)
- [ ] Thinking block streaming and display in TUI
- [ ] Structured output with JSON schema validation
- [ ] Retry logic for schema validation failures
**Exit criteria:** Extended thinking with budget works on Anthropic. Structured output validates against schema on all providers that support it.
## M9: Auth
**Scope:** OAuth 2.0 + PKCE for cloud providers. Credential management.
**Deliverables:**
- [ ] OAuth 2.0 + PKCE flow (browser redirect → callback → token exchange)
- [ ] Token refresh (proactive, before expiry)
- [ ] OS keyring integration for secure credential storage
- [ ] Multi-account support per provider
**Exit criteria:** `gnoma login anthropic` opens browser, completes OAuth flow, stores token in keyring. Automatic refresh works.
## M10: Observability
**Scope:** Feature flags. Opt-in telemetry and analytics.
**Deliverables:**
- [ ] Feature flag system (local config + optional remote evaluation)
- [ ] Opt-in analytics (event queue, local-only by default)
- [ ] Usage dashboards (token spend, provider usage, tool frequency)
- [ ] Cost tracking per provider/model
**Exit criteria:** Feature flags gate experimental features. User can view their token spend breakdown. Analytics disabled by default.
## M11: Web UI
**Scope:** Browser-based UI as alternative to TUI. Requires serve mode (M7).
**Deliverables:**
- [ ] `gnoma web` CLI subcommand (or `gnoma --web`) starts local web server
- [ ] Web UI connects to serve mode backend
- [ ] Chat interface with streaming, tool output, permission prompts
- [ ] Responsive design for desktop browsers
**Exit criteria:** `gnoma web` opens browser, full chat with streaming and tool execution. Serve mode required as prerequisite.
## Future
Ideas not yet committed:
- Voice input/output via provider audio APIs
- Collaborative sessions (multiple humans + elfs)
- Plugin marketplace
- Remote agent execution
## Changelog
- 2026-04-02: Initial version — M1-M6
- 2026-04-02: Split M2 into providers (M2) and TUI (M3). Added M8-M11 for thinking, auth, observability, web UI. Local tokenizer in M4. SQLite for session persistence in M7.

docs/essentials/patterns.md

@@ -0,0 +1,135 @@
---
essential: patterns
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture]
---
# Patterns
## Discriminated Unions
- **What:** Struct with a `Type` field discriminant; exactly one payload field is set per type value. Used instead of Go interfaces for data variants.
- **Where:** `message.Content`, `stream.Event`
- **Why:** Zero allocation (no interface boxing), cache-friendly, works with `switch` statements. Go lacks sum types — this is the pragmatic equivalent.
- **Example:**
```go
type Content struct {
Type ContentType
Text string // set when Type == ContentText
ToolCall *ToolCall // set when Type == ContentToolCall
ToolResult *ToolResult // set when Type == ContentToolResult
Thinking *Thinking // set when Type == ContentThinking
}
switch c.Type {
case ContentText:
fmt.Print(c.Text)
case ContentToolCall:
execute(c.ToolCall)
}
```
## Pull-Based Stream Iterator
- **What:** `Next() / Current() / Err() / Close()` interface for consuming streaming data.
- **Where:** `stream.Stream` interface, all provider adapters
- **Why:** Matches 3 of 4 SDKs (Anthropic, OpenAI, Mistral) natively. Gives the consumer explicit backpressure control. Supports `Close()` for resource cleanup, unlike `iter.Seq`. Only Google needs a goroutine bridge.
- **Example:**
```go
defer s.Close()
for s.Next() {
    process(s.Current())
}
if err := s.Err(); err != nil {
    handle(err)
}
```
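The goroutine bridge mentioned above (for Google's range-based SDK) can be sketched as follows — `pullStream` and `newPullStream` are hypothetical names, assuming a push-style producer that accepts a `yield` callback:

```go
package main

import "fmt"

// pullStream adapts a push-style event source to the Next/Current/Err/Close
// pull interface. A bridging goroutine runs the producer and forwards each
// event over a channel; Close stops the producer early via the done channel.
type pullStream[E any] struct {
	ch      chan E
	done    chan struct{}
	current E
	err     error
}

func newPullStream[E any](produce func(yield func(E) bool)) *pullStream[E] {
	s := &pullStream[E]{ch: make(chan E), done: make(chan struct{})}
	go func() {
		defer close(s.ch)
		produce(func(e E) bool {
			select {
			case s.ch <- e:
				return true
			case <-s.done:
				return false // consumer closed early: stop producing
			}
		})
	}()
	return s
}

func (s *pullStream[E]) Next() bool {
	e, ok := <-s.ch
	if !ok {
		return false
	}
	s.current = e
	return true
}

func (s *pullStream[E]) Current() E   { return s.current }
func (s *pullStream[E]) Err() error   { return s.err }
func (s *pullStream[E]) Close() error { close(s.done); return nil }

func main() {
	s := newPullStream(func(yield func(int) bool) {
		for i := 1; i <= 3; i++ {
			if !yield(i) {
				return
			}
		}
	})
	defer s.Close()
	for s.Next() {
		fmt.Println(s.Current())
	}
}
```

The `select` on `done` inside the yield callback is what closes R-002's leak window: a producer blocked on send unblocks as soon as the consumer calls `Close()`. The real adapter would also plumb the SDK's terminal error into `err`.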
## Accumulator
- **What:** Shared component that assembles a `message.Response` from a sequence of `stream.Event` values. Separated from provider-specific translation.
- **Where:** `stream.Accumulator`, used by every provider adapter
- **Why:** Provider adapters become thin translation layers. Accumulation logic (text building, tool call JSON fragment assembly, thinking blocks) is tested once, not per-provider.
- **Example:**
```go
acc := stream.NewAccumulator()
for s.Next() {
acc.Apply(s.Current())
}
response := acc.Response()
```
## Factory Registry
- **What:** Map of names to factory functions. Creates instances on demand with config.
- **Where:** `provider.Registry`, `tool.Registry`
- **Why:** Decouples creation from usage. Makes testing easy — register mock factories. Enables dynamic provider switching.
- **Example:**
```go
registry.Register("mistral", mistral.NewProvider)
provider, err := registry.Create("mistral", cfg)
```
## Functional Options
- **What:** Variadic option functions for configuring complex objects.
- **Where:** Session creation, provider construction
- **Why:** Clean API for objects with many optional parameters. Self-documenting, extensible without breaking changes.
- **Example:**
```go
session, err := manager.NewSession(
WithProvider(mistral),
WithModel("mistral-large-latest"),
WithMaxTurns(20),
)
```
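The option functions themselves are plain closures over a config struct. A sketch under illustrative names (`sessionConfig` and the defaults are not the real API):

```go
package main

import "fmt"

// sessionConfig holds the tunable parameters; options mutate it.
type sessionConfig struct {
	model    string
	maxTurns int
}

// SessionOption is the functional-option type.
type SessionOption func(*sessionConfig)

func WithModel(m string) SessionOption {
	return func(c *sessionConfig) { c.model = m }
}

func WithMaxTurns(n int) SessionOption {
	return func(c *sessionConfig) { c.maxTurns = n }
}

func newSession(opts ...SessionOption) sessionConfig {
	cfg := sessionConfig{model: "default", maxTurns: 10} // defaults first
	for _, opt := range opts { // then each option overrides
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newSession(WithModel("mistral-large-latest"), WithMaxTurns(20))
	fmt.Println(cfg.model, cfg.maxTurns)
}
```

Because defaults are applied before options run, a new parameter can be added with a safe default and a new `With*` function without breaking any caller.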
## Callback Event Propagation
- **What:** The engine accepts a `Callback func(stream.Event)` and calls it for each event. The session wraps this to push events into a channel.
- **Where:** `engine.Submit()` → `session/local.go`
- **Why:** Keeps the engine testable without concurrency. The engine knows nothing about channels, TUI, or goroutines. The session implementation decides how to propagate events.
- **Example:**
```go
// In session/local.go:
cb := func(evt stream.Event) {
select {
case s.events <- evt:
case <-ctx.Done():
}
}
turn, err := s.engine.Submit(ctx, input, cb)
```
## Error Wrapping with errors.AsType
- **What:** Provider adapters wrap SDK errors into typed `ProviderError` with classification. Consumers extract using Go 1.26's `errors.AsType[E]`.
- **Where:** All provider adapters, retry logic, engine error handling
- **Why:** Enables error classification (transient vs auth vs bad request) for retry decisions. Type-safe extraction without pointer indirection.
- **Example:**
```go
if pErr, ok := errors.AsType[*ProviderError](err); ok {
if pErr.Retryable {
// exponential backoff
}
}
```
## Anti-Patterns
Patterns explicitly avoided in this project:
| Anti-Pattern | Why we avoid it | What to do instead |
|---|---|---|
| Interface-based unions | Heap allocation, type assertion overhead, no exhaustive matching | Discriminated union structs with Type field |
| Channel-based streams | Requires goroutine management, harder to control backpressure | Pull-based iterator interface |
| Global state | Untestable, race-prone, hidden dependencies | Dependency injection via config structs |
| Shared mutable state between elfs | Race conditions, complex synchronization | Each elf owns its own engine; communicate via channels |
| Over-abstraction | Premature generalization obscures intent | Three similar lines > one premature abstraction |
## Changelog
- 2026-04-02: Initial version

docs/essentials/process-flows.md Normal file
---
essential: process-flows
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture]
---
# Process Flows
## Bootstrap / Initialization
```mermaid
sequenceDiagram
participant User
participant Main as cmd/gnoma
participant Cfg as config.Load()
participant Auth as auth.KeySource
participant PR as ProviderRegistry
participant TR as ToolRegistry
participant PM as PermissionChecker
participant SM as SessionManager
participant UI as TUI / CLI
User->>Main: gnoma [flags]
Main->>Cfg: Load()
Note over Cfg: defaults → ~/.config/gnoma/config.toml<br/>→ .gnoma/config.toml → env → flags
Cfg-->>Main: Config
Main->>Auth: NewKeySource(config.APIKeys)
Main->>PR: NewRegistry()
loop each provider
Main->>Auth: Resolve(providerName)
Auth-->>Main: apiKey
Main->>PR: Register(name, factory)
end
Main->>TR: NewRegistry()
Main->>TR: Register(bash, fs.read, fs.write, ...)
Main->>PM: NewChecker(mode, rules, promptFn)
Main->>SM: NewManager(config)
alt stdin is TTY
Main->>UI: LaunchTUI(sessionManager)
else stdin is pipe
Main->>UI: RunCLI(sessionManager)
end
```
## User Message → Response (Full Agentic Turn)
**Happy path:**
```mermaid
sequenceDiagram
participant UI as TUI
participant Sess as Session<br/>(goroutine)
participant Eng as Engine
participant Prov as Provider
participant Acc as Accumulator
participant TR as ToolRegistry
participant PM as Permissions
UI->>Sess: Send(ctx, "user input")
Sess->>Eng: Submit(ctx, input, callback)
Note over Eng: Append user message to history
loop Agentic Loop (until EndTurn or MaxTurns)
Eng->>Prov: Stream(ctx, Request)
Prov-->>Eng: stream.Stream
loop Stream consumption
Eng->>Eng: stream.Next()
Eng->>Acc: Apply(event)
Eng->>Sess: callback(event)
Sess-->>UI: event via channel
UI->>UI: render delta
end
Eng->>Acc: Response()
Note over Eng: Append assistant message to history
alt StopReason == EndTurn
Note over Eng: Done — return Turn
else StopReason == ToolUse
loop each ToolCall
Eng->>PM: Check(toolName, args)
alt Denied
Note over Eng: Add error ToolResult
else Prompt needed
Eng->>Sess: callback(PermissionEvent)
Sess-->>UI: show permission dialog
UI-->>Sess: user decision
Sess-->>Eng: approved/denied
end
Eng->>TR: Get(toolName)
Eng->>TR: tool.Execute(ctx, args)
TR-->>Eng: Result
end
Note over Eng: Append ToolResults, continue loop
else StopReason == MaxTokens
Note over Eng: Return Turn with truncation warning
end
end
Eng-->>Sess: Turn
Sess-->>UI: TurnResult()
```
**Key decision points:**
- StopReason determines whether the loop continues (ToolUse), ends (EndTurn), or warns (MaxTokens)
- Permission check can block tool execution — denied tools get error results sent back to the model
- MaxTurns is a safety limit to prevent runaway loops
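The decision points above can be condensed into a control-flow sketch. All types here are stand-ins for the real engine, provider, and tool interfaces:

```go
package main

import "fmt"

// StopReason drives the loop, mirroring the diagram's alt branches.
type StopReason int

const (
	EndTurn StopReason = iota
	ToolUse
	MaxTokens
)

type response struct {
	stop  StopReason
	calls []string // tool names, stand-in for []ToolCall
}

// runTurn loops until EndTurn, MaxTokens, or the MaxTurns safety limit.
// stepFn stands in for one stream-and-accumulate round with the provider.
func runTurn(stepFn func(turn int) response, execTool func(name string) string, maxTurns int) error {
	for turn := 0; turn < maxTurns; turn++ {
		resp := stepFn(turn)
		switch resp.stop {
		case EndTurn:
			return nil // done
		case MaxTokens:
			return fmt.Errorf("turn truncated: max tokens")
		case ToolUse:
			for _, name := range resp.calls {
				_ = execTool(name) // results appended to history; loop continues
			}
		}
	}
	return fmt.Errorf("aborted after %d turns", maxTurns)
}

func main() {
	err := runTurn(func(turn int) response {
		if turn == 0 {
			return response{stop: ToolUse, calls: []string{"fs.read"}}
		}
		return response{stop: EndTurn}
	}, func(name string) string { return "ok" }, 20)
	fmt.Println(err) // <nil>
}
```

Note the loop never consults tool output to decide whether to continue; only the next response's `StopReason` does, which keeps the control flow provider-agnostic.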
## Streaming Pipeline
```mermaid
graph LR
subgraph "Provider SDK"
SDK[SDK Stream]
end
subgraph "Provider Adapter"
Adapt[translate SDK event<br/>→ stream.Event]
end
subgraph "Engine"
CB[Callback func]
ACC[Accumulator]
end
subgraph "Session"
CH[Event Channel<br/>buffered 64]
end
subgraph "TUI"
Render[Render delta<br/>to terminal]
end
SDK -->|SDK-specific type| Adapt
Adapt -->|stream.Event| CB
CB --> ACC
CB --> CH
CH --> Render
```
## Tool Execution Flow
```mermaid
sequenceDiagram
participant Eng as Engine
participant PM as Permissions
participant UI as UI (via callback)
participant Reg as ToolRegistry
participant Tool as Tool impl
Note over Eng: Extract []ToolCall from response
loop each ToolCall
Eng->>PM: Check(ctx, toolName, args)
alt mode == Allow
PM-->>Eng: nil (allowed)
else mode == Prompt
PM->>UI: PromptFunc(toolName, args)
UI->>UI: Show permission dialog
UI-->>PM: true/false
alt denied
PM-->>Eng: ErrDenied
Note over Eng: ToolResult{IsError: true}
end
else mode == Deny + no allow rule
PM-->>Eng: ErrDenied
end
Eng->>Reg: Get(toolName)
alt tool not found
Note over Eng: ToolResult{IsError: true, "unknown tool"}
else found
Eng->>Tool: Execute(ctx, args)
Tool-->>Eng: Result
end
end
Note over Eng: Append ToolResults to history
```
## Context Compaction Flow
```mermaid
sequenceDiagram
participant Eng as Engine
participant Win as Context Window
participant Trk as Token Tracker
participant Strat as Compaction Strategy
Eng->>Win: Append(message)
Win->>Trk: Add(usage)
Trk-->>Win: State()
alt TokensOK
Note over Win: No action
else TokensWarning
Note over Win: Log warning, continue
else TokensCritical
Win->>Strat: Compact(messages, budget)
alt TruncateStrategy
Note over Strat: Keep system prompt + last N messages
else SummarizeStrategy
Note over Strat: LLM summarizes old messages
end
Strat-->>Win: compacted messages
Win->>Trk: Reset + recount
end
```
## Cancellation Propagation
```mermaid
sequenceDiagram
participant UI
participant Sess as Session goroutine
participant Eng as Engine
participant Prov as Provider.Stream
participant SDK as SDK HTTP
UI->>Sess: Cancel()
Note over Sess: cancel context
Sess->>Eng: ctx.Done() propagates
Eng->>Prov: ctx.Done() propagates
Prov->>SDK: HTTP request cancelled
SDK-->>Prov: context.Canceled
Prov-->>Eng: stream.Err() = context.Canceled
Eng-->>Sess: Turn with error
Sess->>Sess: close events channel
Sess-->>UI: TurnResult() returns error
```
## Changelog
- 2026-04-02: Initial version

docs/essentials/risks.md Normal file
---
essential: risks
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: []
---
# Risk / Unknowns
| ID | Risk | Severity | Mitigation | Status |
|----|------|----------|-----------|--------|
| R-001 | SDK breaking changes — provider SDKs are pre-1.0 and may change APIs | Medium | Pin versions, integration tests per provider, adapter layer absorbs changes | Open |
| R-002 | Google range-to-pull bridge goroutine leak — context cancellation edge cases | Medium | Thorough testing with `testing/synctest`, always select on `ctx.Done()` | Open |
| R-003 | Thinking block round-trip fidelity — Anthropic signatures must survive serialization | Medium | Unit tests with real signature values, golden file tests | Open |
| R-004 | Tool call ID generation inconsistency — Google/Ollama may return empty IDs | Low | Generate UUID if provider returns empty, documented in provider adapter | Open |
| R-005 | Mistral SDK 2.2.0 stability — user-maintained SDK, recently updated | Low | User maintains it, can fix bugs directly. Integration tests catch regressions. | Accepted |
| R-006 | Bubble Tea v2 maturity — v2 is relatively new | Low | Pin version, fallback to v1 if blockers. TUI is last milestone item. | Open |
| R-007 | Multi-provider routing complexity — coordinating elfs on different providers with different capabilities | High | Design routing interface early (M4), start simple (manual provider assignment), add rules incrementally | Open |
| R-008 | Context compaction coherence — summarization may lose critical details | Medium | Truncation as safe default, summarization opt-in, compact boundaries for recovery | Open |
| R-009 | Permission prompt UX in pipe mode — no TUI for interactive prompts | Low | Default to `allow` or `deny` in pipe mode, require explicit flag | Open |
## Open Questions
- [ ] How should routing rules be expressed in config? Per-task rules, model capability tags, cost-based? — needs research before M5
- [ ] Which local tokenizer library to use? (tiktoken port, sentencepiece, or provider-specific)
- [ ] Serve mode protocol — choose what fits best when implementing M7
- [x] ~~Should gnoma embed a tokenizer?~~ → Yes, include local tokenizer (M4)
- [x] ~~Session persistence format?~~ → SQLite (M7)
- [x] ~~Mistral SDK as long-term reference?~~ → Yes for now, revisit after M2
## Changelog
- 2026-04-02: Initial version

docs/essentials/tech-stack.md Normal file
---
essential: tech-stack
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: []
---
# Tech Stack & Conventions
## Languages
| Language | Version | Role |
|----------|---------|------|
| Go | 1.26 | Primary — all application code |
## Frameworks & Libraries
| Library | Module | Purpose |
|---------|--------|---------|
| Mistral SDK | `somegit.dev/vikingowl/mistral-go-sdk` | Mistral API client (user-maintained) |
| Anthropic SDK | `github.com/anthropics/anthropic-sdk-go` | Anthropic API client |
| OpenAI SDK | `github.com/openai/openai-go` | OpenAI API client (+ compat endpoints) |
| Google GenAI | `google.golang.org/genai` | Google Gemini API client |
| TOML | `github.com/BurntSushi/toml` | Configuration file parsing |
| Bubble Tea | `github.com/charmbracelet/bubbletea/v2` | Terminal UI framework |
| Lip Gloss | `github.com/charmbracelet/lipgloss` | Terminal styling |
| Bubbles | `github.com/charmbracelet/bubbles` | TUI components (input, viewport) |
| Doublestar | `github.com/bmatcuk/doublestar/v4` | Glob with `**` support |
## Go 1.26 Features Used
| Feature | Where |
|---------|-------|
| `new(expr)` | Optional pointer fields in config/params |
| `errors.AsType[E](err)` | Provider error handling |
| `sync.WaitGroup.Go(f)` | Goroutine management |
| `slog.NewMultiHandler()` | Fan-out logging |
| `testing/synctest` | Concurrent test support |
| Green Tea GC (default) | No action needed — 10-40% less GC overhead |
| `io.ReadAll` 2x faster | File tool reads |
## Tooling
- **Build:** `go build` via Makefile
- **CI/CD:** none yet (planned)
- **Linting:** `golangci-lint`
- **Testing:** stdlib `testing`, `testing/synctest`
- **Package management:** Go modules
## Conventions
### Naming
- Files: lowercase, underscores for multi-word (`tool_result.go`)
- Packages: short, lowercase, no underscores (`provider`, `stream`)
- Functions/methods: MixedCaps — PascalCase when exported (`NewUserText`, `HasToolCalls`), camelCase when unexported
- Types/structs: PascalCase (`ToolCall`, `ProviderError`)
- Constants: PascalCase for exported (`StopEndTurn`), camelCase for unexported
- Interfaces: describe behavior (`Provider`, `Stream`, `Tool`), not implementation
### Error Handling
- Explicit error types with `%w` wrapping
- `errors.AsType[E]` for type-safe extraction (Go 1.26)
- `Err` prefix for sentinel errors (`ErrDenied`)
- `*Error` suffix for error types (`ProviderError`)
- Fail fast — never swallow errors
- Include context in error messages
### File Organization
- By layer within `internal/`: `message/`, `stream/`, `provider/`, `tool/`, `engine/`, `session/`
- Provider adapters: one directory per provider under `internal/provider/`
- Tool implementations: one directory per tool type under `internal/tool/`
- Three files per provider adapter: `provider.go`, `translate.go`, `stream.go`
### Commit Style
- Conventional commits: `feat:`, `fix:`, `refactor:`, `test:`, `docs:`, `chore:`
- No co-signing
## Changelog
- 2026-04-02: Initial version

docs/essentials/uml-diagrams.md Normal file
---
essential: uml-diagrams
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture, process-flows]
---
# UML Diagrams
Go doesn't have class hierarchies. These diagrams show struct relationships, interface implementations, and state machines.
## State Diagram: Engine Turn
```mermaid
stateDiagram-v2
[*] --> Idle
Idle --> Streaming: Submit(input)
Streaming --> Accumulating: stream exhausted
Accumulating --> ToolExec: StopReason == ToolUse
Accumulating --> Complete: StopReason == EndTurn
Accumulating --> Complete: StopReason == MaxTokens
ToolExec --> Streaming: tools executed, continue loop
ToolExec --> PermissionWait: tool needs approval
PermissionWait --> ToolExec: user approves
PermissionWait --> ToolExec: user denies (error result)
Streaming --> Cancelled: ctx.Done()
ToolExec --> Cancelled: ctx.Done()
PermissionWait --> Cancelled: ctx.Done()
Complete --> Idle: turn returned
Cancelled --> Idle: turn returned with error
Streaming --> Error: provider error
ToolExec --> Error: fatal tool error
Error --> Idle: error returned
```
## State Diagram: Session Lifecycle
```mermaid
stateDiagram-v2
[*] --> Idle
Idle --> Active: Send(input)
Active --> Idle: Turn complete (events channel closed)
Active --> Cancelling: Cancel()
Cancelling --> Idle: cancellation propagated
Idle --> Closed: Close()
Active --> Closed: Close() (cancels first)
Closed --> [*]
```
## State Diagram: Stream
```mermaid
stateDiagram-v2
[*] --> Open
Open --> HasEvent: Next() returns true
HasEvent --> Open: caller reads Current()
Open --> Exhausted: Next() returns false, Err()==nil
Open --> Errored: Next() returns false, Err()!=nil
Exhausted --> [*]: Close()
Errored --> [*]: Close()
```
## Component Diagram: Provider Adapter Stack
```mermaid
graph TD
subgraph "Provider Interface"
PI[provider.Provider]
PS[stream.Stream]
end
subgraph "Adapters"
MA[mistral adapter]
AA[anthropic adapter]
OA[openai adapter]
GA[google adapter]
OC[openaicompat adapter]
end
subgraph "SDKs"
MS[mistral-go-sdk]
AS[anthropic-sdk-go]
OS[openai-go]
GS[genai-go]
end
PI --> MA
PI --> AA
PI --> OA
PI --> GA
PI --> OC
MA --> MS
AA --> AS
OA --> OS
GA --> GS
OC --> OS
MA --> PS
AA --> PS
OA --> PS
GA --> PS
OC --> PS
```
## Component Diagram: Streaming Event Translation
```mermaid
graph TB
subgraph "Anthropic SDK"
A1[ContentBlockDeltaEvent TextDelta] -->|→| E2[EventTextDelta]
A2[ContentBlockDeltaEvent InputJSONDelta] -->|→| E3[EventToolCallDelta]
A3[ContentBlockDeltaEvent ThinkingDelta] -->|→| E4[EventThinkingDelta]
A4[ContentBlockStartEvent tool_use] -->|→| E1[EventToolCallStart]
A5[ContentBlockStopEvent] -->|→| E5[EventToolCallDone]
end
subgraph "OpenAI SDK"
O1[Chunk.Delta.Content] -->|→| E2
O2[Chunk.Delta.ToolCalls start] -->|→| E1
O3[Chunk.Delta.ToolCalls delta] -->|→| E3
O4[Chunk.FinishReason=tool_calls] -->|→| E5
end
subgraph "Google SDK"
G1[Response.Part.Text] -->|→| E2
G2[Response.Part.FunctionCall] -->|→| E1
G3[same FunctionCall] -->|→| E5
end
subgraph "Mistral SDK"
M1[Chunk.Delta.Content] -->|→| E2
M2[Chunk ToolCalls start] -->|→| E1
M3[Chunk ToolCalls delta] -->|→| E3
M4[Chunk.FinishReason=tool_calls] -->|→| E5
end
```
## Struct Relationships: Elf System (Future)
```mermaid
classDiagram
class Elf {
<<interface>>
+ID() string
+Status() ElfStatus
+Send(msg) error
+Events() chan Event
+Cancel()
+Wait() ElfResult
}
class SyncElf {
-engine Engine
-parentCtx context.Context
}
class BackgroundElf {
-engine Engine
-events chan Event
}
note for SyncElf "runs on the parent goroutine"
note for BackgroundElf "runs on its own goroutine"
class ElfManager {
-elfs map~string, Elf~
+Spawn(config) Elf
+Get(id) Elf
+List() []Elf
+CancelAll()
}
Elf <|.. SyncElf
Elf <|.. BackgroundElf
ElfManager --> Elf
```
## Changelog
- 2026-04-02: Initial version

docs/essentials/vision.md Normal file
---
essential: vision
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: []
---
# Vision
## What
A provider-agnostic agentic coding assistant — a single Go binary that streams, calls tools, and manages conversations across any LLM provider without privileging any one of them. Providers don't just coexist — they collaborate. Elfs (sub-agents) running on different providers work together within a single session, routed by capability, cost, or latency.
Named after the northern pygmy-owl (*Glaucidium gnoma*). Sub-agents are called *elfs* (elf owl, *Micrathene whitneyi*).
## Who
Any developer who wants an AI coding assistant they actually control — from hobbyists running local models on their own hardware, to professionals choosing between cloud providers, to teams where each member prefers a different LLM.
## Problem
Current agentic coding assistants (Claude Code, Cursor, Windsurf, Copilot) lock users to a single provider. Switching costs are high. Behavior is opaque — hidden tool execution, unclear token spend, no way to customize permissions or inject hooks. Local model support is an afterthought.
Worse, these assistants are single-provider silos. You can't have one model coordinate with another, route tasks to the best-fit provider, or mix a cloud model's reasoning with a local model's speed. Every request goes to the same provider regardless of complexity, cost, or capability.
There is no open, extensible assistant that treats all providers as collaborators, gives full visibility into every action, and works just as well with a local Ollama instance as with a cloud API.
## Core Principles
- **Provider freedom** — switch between Anthropic, OpenAI, Google, Mistral, or local models with one config change. No privileged provider.
- **Multi-provider collaboration** — elfs on different providers work together. A coordinator on Claude dispatches research to a local Qwen elf and code review to an OpenAI elf. Routing rules direct tasks by capability, cost, or latency.
- **Transparency** — every tool call, permission check, and token spend is visible. No hidden behavior.
- **Extensibility** — hooks, skills, and MCP let users shape the assistant without forking.
- **Simplicity** — single binary, zero infrastructure, runs anywhere Go compiles.
## Success Criteria
- [ ] gnoma replaces a vendor-locked assistant as the user's daily driver
- [ ] A user can switch providers mid-session with zero friction
- [ ] Elfs run on different providers simultaneously — a coordinator on one provider dispatches work to elfs on other providers
- [ ] Routing rules direct tasks to providers by capability, cost, or latency
- [ ] Local models (Ollama, llama.cpp) work with full tool-use support
- [ ] Every tool call, permission check, and token spend is visible to the user
- [ ] Users extend gnoma via hooks, skills, and MCP without forking
## Changelog
- 2026-04-02: Initial version