docs: add project essentials (12/12 complete)

Vision, domain model, architecture, patterns, process flows,
UML diagrams, API contracts, tech stack, constraints, milestones
(M1-M11), decision log (6 ADRs), and risk register.

Key decisions: single binary, pull-based streaming, Mistral as M1
reference provider, discriminated unions, multi-provider collaboration
as core identity.
2026-04-02 18:09:07 +02:00
parent f909733bff
commit efcb5a2901
14 changed files with 1638 additions and 0 deletions

docs/essentials/INDEX.md

@@ -0,0 +1,35 @@
---
project: gnoma
layout: directory
path: docs/essentials/
essentials:
vision: complete
domain-model: complete
architecture: complete
patterns: complete
process-flows: complete
uml-diagrams: complete
api-contracts: complete
tech-stack: complete
constraints: complete
milestones: complete
decision-log: complete
risks: complete
---
# Project Essentials — gnoma
| # | Essential | Status | Link | Last Updated |
|---|-----------|--------|------|-------------|
| 1 | Vision | complete | [vision.md](vision.md) | 2026-04-02 |
| 2 | Domain Model | complete | [domain-model.md](domain-model.md) | 2026-04-02 |
| 3 | Architecture | complete | [architecture.md](architecture.md) | 2026-04-02 |
| 4 | Patterns | complete | [patterns.md](patterns.md) | 2026-04-02 |
| 5 | Process Flows | complete | [process-flows.md](process-flows.md) | 2026-04-02 |
| 6 | UML Diagrams | complete | [uml-diagrams.md](uml-diagrams.md) | 2026-04-02 |
| 7 | API Contracts | complete | [api-contracts.md](api-contracts.md) | 2026-04-02 |
| 8 | Tech Stack & Conventions | complete | [tech-stack.md](tech-stack.md) | 2026-04-02 |
| 9 | Constraints & Trade-offs | complete | [constraints.md](constraints.md) | 2026-04-02 |
| 10 | Milestones | complete | [milestones.md](milestones.md) | 2026-04-02 |
| 11 | Decision Log | complete | [decisions/001-initial-decisions.md](decisions/001-initial-decisions.md) | 2026-04-02 |
| 12 | Risk / Unknowns | complete | [risks.md](risks.md) | 2026-04-02 |

docs/essentials/api-contracts.md

@@ -0,0 +1,107 @@
---
essential: api-contracts
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture]
---
# API Contracts
gnoma has no external HTTP API. All interfaces are internal Go APIs between packages. The stability guarantees below define how these internal boundaries evolve.
## Core Interfaces
| Interface | Package | Stability | Description |
|-----------|---------|-----------|-------------|
| `Provider` | `provider` | stable | LLM backend adapter contract |
| `Stream` | `stream` | stable | Unified streaming event iterator |
| `Tool` | `tool` | stable | Tool execution contract |
| `Session` | `session` | stable | UI ↔ engine decoupling boundary |
| `Strategy` | `context` | experimental | Compaction strategy contract |
| `Elf` | TBD | experimental | Sub-agent contract (future) |
## Provider Interface
```go
type Provider interface {
Stream(ctx context.Context, req Request) (stream.Stream, error)
Name() string
}
```
**Stability:** Stable. Adding methods requires a new interface (e.g., `ProviderV2`) or optional interface assertion pattern.
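The optional-interface assertion mentioned above can be sketched as follows. This is illustrative only: `ModelLister`, `modelsFor`, and `mistralProvider` are hypothetical names invented for the sketch, not part of the contract.

```go
package main

import "fmt"

// Provider is a minimal form of the stable contract above.
type Provider interface {
	Name() string
}

// ModelLister is a hypothetical optional capability. Providers may or may
// not implement it, and the core Provider interface never has to grow.
type ModelLister interface {
	ListModels() []string
}

type mistralProvider struct{}

func (mistralProvider) Name() string          { return "mistral" }
func (mistralProvider) ListModels() []string  { return []string{"mistral-large"} }

// modelsFor asserts the optional interface and degrades gracefully when
// the provider lacks the capability.
func modelsFor(p Provider) []string {
	if ml, ok := p.(ModelLister); ok {
		return ml.ListModels()
	}
	return nil // capability not supported
}

func main() {
	fmt.Println(modelsFor(mistralProvider{})) // [mistral-large]
}
```

Callers that find the capability missing simply take the fallback path, so existing providers keep compiling unchanged.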
## Stream Interface
```go
type Stream interface {
Next() bool
Current() Event
Err() error
Close() error
}
```
**Stability:** Stable. The pull-based iterator contract is locked.
## Tool Interface
```go
type Tool interface {
Name() string
Description() string
Parameters() json.RawMessage
Execute(ctx context.Context, args json.RawMessage) (Result, error)
IsReadOnly() bool
}
```
**Stability:** Stable. New capabilities added via optional interfaces:
```go
// Future: tools that support streaming output
type StreamingTool interface {
Tool
ExecuteStream(ctx context.Context, args json.RawMessage) (stream.Stream, error)
}
```
## Session Interface
```go
type Session interface {
Send(ctx context.Context, input string) error
Events() <-chan stream.Event
TurnResult() (*engine.Turn, error)
Cancel()
Close() error
Status() SessionStatus
}
```
**Stability:** Stable. This is the boundary that enables future transport implementations (Unix socket, WebSocket) without changing the engine or UI.
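A UI-side consumer of this boundary might look like the following sketch. The `Event` struct here is a minimal stand-in for `stream.Event`, with illustrative field names:

```go
package main

import "fmt"

// Event is a stand-in for stream.Event so the sketch is self-contained;
// the real type uses a typed discriminant, not a string.
type Event struct {
	Type string
	Text string
}

// drainEvents shows the intended consumption pattern: range over the
// session's event channel until the engine closes it at end of turn.
func drainEvents(events <-chan Event) string {
	var out string
	for ev := range events {
		if ev.Type == "text_delta" {
			out += ev.Text
		}
	}
	return out
}

func main() {
	ch := make(chan Event, 2)
	ch <- Event{Type: "text_delta", Text: "hel"}
	ch <- Event{Type: "text_delta", Text: "lo"}
	close(ch) // a real session closes Events() when the turn completes
	fmt.Println(drainEvents(ch)) // hello
}
```

Because the UI only sees a receive-only channel, a future transport (Unix socket, WebSocket) can feed the same loop without any UI changes.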
## Event Schema
Events flow from provider → engine → session → UI. The `stream.Event` struct is the wire format:
| Event Type | Fields Set | Direction |
|-----------|-----------|-----------|
| `EventTextDelta` | `Text` | Provider → UI |
| `EventThinkingDelta` | `Text` | Provider → UI |
| `EventToolCallStart` | `ToolCallID`, `ToolCallName` | Provider → UI |
| `EventToolCallDelta` | `ToolCallID`, `ArgDelta` | Provider → UI |
| `EventToolCallDone` | `ToolCallID`, `Args` | Provider → UI |
| `EventUsage` | `Usage` | Provider → Engine |
| `EventError` | `Err` | Any → Consumer |
## Versioning Strategy
Internal packages under `internal/` have no versioning — they change freely. The `Provider`, `Stream`, `Tool`, and `Session` interfaces are considered public contracts even though they're internal. Breaking changes to these require migration notes in the changelog.
Future public API (if gnoma becomes embeddable as a library) would live under a `pkg/` directory with semantic versioning.
## Changelog
- 2026-04-02: Initial version

docs/essentials/architecture.md

@@ -0,0 +1,158 @@
---
essential: architecture
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [domain-model]
---
# Architecture
## System Context
```mermaid
graph TB
User([Developer]) -->|TUI / CLI pipe| gnoma[gnoma binary]
gnoma -->|HTTPS| Anthropic[Anthropic API]
gnoma -->|HTTPS| OpenAI[OpenAI API]
gnoma -->|HTTPS| Google[Google GenAI API]
gnoma -->|HTTPS| Mistral[Mistral API]
gnoma -->|HTTP| Local[Ollama / llama.cpp]
gnoma -->|stdio JSON-RPC| MCP[MCP Servers]
gnoma -->|exec| Tools[Local Tools<br/>bash, file ops]
```
## Container View
```mermaid
graph TB
subgraph "gnoma (single binary, single process)"
CLI[CLI Parser] --> Router{Mode?}
Router -->|TTY| TUI[TUI — Bubble Tea]
Router -->|Pipe| Pipe[CLI Pipe Mode]
TUI --> SM[Session Manager]
Pipe --> SM
SM --> S1[Session goroutine]
SM --> SN[Session N goroutine]
S1 --> E1[Engine]
SN --> EN[Engine N]
E1 --> PR[Provider Registry]
EN --> PR
PR --> Anthropic[Anthropic adapter]
PR --> OpenAI[OpenAI adapter]
PR --> Google[Google adapter]
PR --> Mistral[Mistral adapter]
PR --> OAICompat[OpenAI-compat adapter]
E1 --> TR[Tool Registry]
EN --> TR
TR --> Bash[bash]
TR --> FS[fs.read / write / edit / glob / grep]
E1 --> PM[Permission Checker]
EN --> PM
E1 --> CTX[Context Window]
EN --> CTX
end
subgraph "Config Stack"
Defaults --> Global["~/.config/gnoma/config.toml"]
Global --> Project[".gnoma/config.toml"]
Project --> Env[Environment Variables]
Env --> Flags[CLI Flags]
end
```
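The config stack's precedence (defaults → global → project → env → flags) amounts to a last-writer-wins merge. A sketch, with a single-field `Config` standing in for the real TOML-backed schema:

```go
package main

import "fmt"

// Config holds one illustrative setting; the real config is TOML-backed
// and has many fields.
type Config struct {
	Model string
}

// merge applies layers in precedence order: a later layer overrides an
// earlier one only when it actually sets a value.
func merge(layers ...Config) Config {
	var out Config
	for _, l := range layers {
		if l.Model != "" {
			out.Model = l.Model
		}
	}
	return out
}

func main() {
	defaults := Config{Model: "mistral-large"}
	project := Config{Model: "claude-sonnet"} // .gnoma/config.toml
	flags := Config{}                         // no --model flag passed
	fmt.Println(merge(defaults, project, flags).Model) // claude-sonnet
}
```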
## Component Overview
| Component | Responsibility | Technology | Boundary |
|-----------|---------------|------------|----------|
| `cmd/gnoma` | Binary entrypoint, flag parsing, mode routing | Go stdlib | Internal |
| `internal/message` | Foundation types: Message, Content, Usage, Response | Pure Go, zero deps | Internal |
| `internal/stream` | Streaming interface, Event types, Accumulator | Depends on message | Internal |
| `internal/provider` | Provider interface, Registry, error taxonomy | Depends on message, stream | Internal |
| `internal/provider/{anthropic,openai,google,mistral}` | SDK adapters: translate + stream | SDK dependencies | Network boundary |
| `internal/provider/openaicompat` | Thin wrapper for Ollama/llama.cpp | Reuses openai adapter | Network boundary |
| `internal/tool` | Tool interface, Registry, bash, file ops | Go stdlib, doublestar | Local system boundary |
| `internal/permission` | Permission modes, rule matching, user prompts | Pure Go | Internal |
| `internal/context` | Token tracking, compaction strategies, sliding window | Depends on message, provider | Internal |
| `internal/config` | TOML layered config loading | BurntSushi/toml | Internal |
| `internal/auth` | API key resolution from env/config | Pure Go | Internal |
| `internal/engine` | Agentic query loop, tool execution orchestration | Depends on all above | Internal |
| `internal/session` | Session lifecycle, channel-based UI decoupling | Depends on engine, stream | Internal |
| `internal/tui` | Terminal UI: chat, input, status, permission dialogs | Bubble Tea, lipgloss | Internal |
## Package Dependency Graph
```mermaid
graph BT
message["message"]
stream["stream"]
provider["provider"]
tool["tool"]
permission["permission"]
context_mgr["context"]
config["config"]
auth["auth"]
engine["engine"]
session["session"]
tui["tui"]
cmd["cmd/gnoma"]
stream --> message
provider --> message
provider --> stream
tool --> message
permission --> message
context_mgr --> message
context_mgr --> provider
config --> permission
engine --> provider
engine --> tool
engine --> permission
engine --> stream
engine --> context_mgr
session --> engine
session --> stream
tui --> session
tui --> stream
cmd --> tui
cmd --> config
cmd --> auth
cmd --> session
cmd --> provider
cmd --> tool
```
## Scope
**In scope:**
- Streaming chat with tool execution across 5+ LLM providers
- Agentic loop (stream → tool calls → re-query → until done)
- Permission system for tool execution
- TUI and CLI pipe modes
- TOML configuration with layering
- Context management and compaction
- Multi-agent (elfs) with per-elf provider routing
- Hook, skill, and MCP extensibility
**Out of scope:**
- Web UI (future, via serve mode)
- Cloud hosting / SaaS deployment
- Training or fine-tuning models
- IDE extension authoring (gnoma provides the backend, not the extension itself)
## Deployment
Single statically-linked Go binary. No runtime dependencies. Runs on Linux, macOS, Windows — anywhere Go compiles. Distributed via `go install`, release binaries, or package managers.
## Changelog
- 2026-04-02: Initial version

docs/essentials/constraints.md

@@ -0,0 +1,68 @@
---
essential: constraints
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [domain-model]
---
# Constraints & Trade-offs
## Non-Functional Requirements
| Constraint | Target | Measurement |
|-----------|--------|-------------|
| First token latency | Dominated by provider, not gnoma overhead | Time from Submit() to first EventTextDelta |
| Binary size | < 20 MB (static, no CGO) | `ls -lh bin/gnoma` |
| Memory per session | < 50 MB baseline (excluding context window) | `runtime.MemStats` |
| Startup time | < 200ms to TUI ready | Wall clock from exec to first render |
| Provider support | 5+ providers from M2 | Count of passing provider integration tests |
| Context window | Up to 200k tokens managed | Token tracker reports |
## Trade-offs
### Single binary over daemon architecture
- **Chose:** Single Go binary, goroutines + channels for all communication
- **Over:** Client-server split with gRPC IPC (gnoma + gnomad)
- **Because:** Simpler deployment, no daemon lifecycle, no protobuf codegen. Go's goroutine model provides sufficient isolation.
- **Consequence:** True process isolation for tool sandboxing requires future work. Multi-client scenarios (IDE + TUI) need serve mode added later.
### Pull-based stream over channels or iter.Seq
- **Chose:** `Next() / Current() / Err() / Close()` interface
- **Over:** Channel-based streaming or Go 1.23+ `iter.Seq` range functions
- **Because:** Matches 3 of 4 SDKs natively (zero-overhead adapter). Supports explicit resource cleanup via `Close()`. Consumer controls backpressure.
- **Consequence:** Google's range-based SDK needs a goroutine bridge. Slightly more verbose than range-based iteration.
### json.RawMessage passthrough over typed schemas
- **Chose:** Tool parameters and arguments as `json.RawMessage`
- **Over:** Typed JSON Schema library or code-generated types
- **Because:** Zero-cost passthrough — no serialize/deserialize between provider and tool. No JSON Schema library as a core dependency.
- **Consequence:** Schema validation happens at tool boundaries, not centrally. Type safety relies on tool implementations parsing their own args.
### Sequential tool execution (MVP) over parallel
- **Chose:** Execute tools one at a time in the agentic loop
- **Over:** Parallel execution via errgroup with read/write partitioning
- **Because:** Simpler to test, debug, and implement permission prompts. Parallel execution adds complexity around error collection and ordering.
- **Consequence:** Multiple tool calls in a single turn execute sequentially. Performance impact is minimal for most workloads. Parallel execution planned for post-MVP.
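A minimal sketch of the sequential MVP loop, assuming hypothetical `toolCall`/`toolResult` shapes:

```go
package main

import "fmt"

type toolCall struct{ ID, Name string }
type toolResult struct{ ToolCallID, Content string }

// runSequential executes tool calls one at a time, preserving order and
// stopping at the first error — the MVP behavior described above. No
// errgroup, no partitioning, no result reordering.
func runSequential(calls []toolCall, exec func(toolCall) (string, error)) ([]toolResult, error) {
	results := make([]toolResult, 0, len(calls))
	for _, c := range calls {
		out, err := exec(c)
		if err != nil {
			return results, fmt.Errorf("%s (%s): %w", c.Name, c.ID, err)
		}
		results = append(results, toolResult{ToolCallID: c.ID, Content: out})
	}
	return results, nil
}

func main() {
	calls := []toolCall{{ID: "tc_1", Name: "bash"}, {ID: "tc_2", Name: "fs.read"}}
	res, _ := runSequential(calls, func(c toolCall) (string, error) {
		return "ok:" + c.Name, nil
	})
	fmt.Println(len(res), res[1].Content) // 2 ok:fs.read
}
```

The planned parallel version slots in behind the same signature, which is part of why the sequential start is cheap.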
### Discriminated union structs over interface hierarchies
- **Chose:** Struct with Type discriminant field for Content and Event types
- **Over:** Interface-based variant types (e.g., `TextContent`, `ToolCallContent` implementing `Content`)
- **Because:** Zero allocation, cache-friendly, works with switch exhaustiveness. Go interfaces for data variants incur boxing overhead.
- **Consequence:** Adding a new content type requires updating switch statements. Acceptable for a small, stable set of variants.
### Mistral as M1 reference provider over Anthropic
- **Chose:** Implement Mistral adapter first as the reference
- **Over:** Starting with Anthropic (richest content model)
- **Because:** User maintains the Mistral Go SDK, knows its internals. Good baseline — similar to OpenAI's API shape. Anthropic's unique features (thinking blocks, cache tokens) are better added as an M2 extension.
- **Consequence:** Thinking block support tested later. Cache token tracking added with Anthropic provider.
## Changelog
- 2026-04-02: Initial version


@@ -0,0 +1,35 @@
# ADR-NNN: [Title]
**Status:** Proposed | Accepted | Deprecated | Superseded by ADR-NNN
**Date:** YYYY-MM-DD
## Context
[Describe the situation that requires a decision. What forces are at play? What constraints exist? What problem are you trying to solve?]
## Decision
[State the decision clearly and concisely. Use active voice: "We will..." or "The system will..."]
## Alternatives Considered
### Alternative A: [Name]
- **Pros:** [advantages]
- **Cons:** [disadvantages]
### Alternative B: [Name]
- **Pros:** [advantages]
- **Cons:** [disadvantages]
## Consequences
**Positive:**
- [Expected benefit]
**Negative:**
- [Expected cost or trade-off]
**Neutral:**
- [Side effects that are neither clearly positive nor negative]

docs/essentials/decisions/001-initial-decisions.md

@@ -0,0 +1,188 @@
# ADR-001: Single Binary with Goroutines
**Status:** Accepted
**Date:** 2026-04-02
## Context
gnoma needs to decouple the UI from the engine to support multiple frontends (TUI, CLI, future IDE extensions). Options were: (a) single binary with goroutines + channels, (b) client-server with gRPC IPC, (c) embedded library.
## Decision
Single Go binary. Engine runs as goroutines within the same process. UI communicates with engine via the `Session` interface over channels. Future serve mode adds a Unix socket listener for external clients — still the same process.
## Alternatives Considered
### Alternative A: gRPC IPC (gnoma + gnomad)
- **Pros:** Process isolation, true sandboxing, multiple clients to one daemon
- **Cons:** Protobuf codegen dependency, daemon lifecycle management, two binaries to distribute
### Alternative B: Embedded library
- **Pros:** Maximum flexibility for embedders
- **Cons:** No standalone binary, API stability burden, harder to ship
## Consequences
**Positive:** Simple deployment, no daemon, no codegen, Go's goroutine model provides sufficient isolation.
**Negative:** No process-level sandboxing for tools. Multi-client scenarios require serve mode (future).
---
# ADR-002: Pull-Based Stream Interface
**Status:** Accepted
**Date:** 2026-04-02
## Context
Need a unified streaming abstraction across 4 SDKs with different patterns: Anthropic, OpenAI, and Mistral use pull-based `Next()/Current()`, while Google uses range-based `for chunk, err := range iter`.
## Decision
Pull-based `Stream` interface: `Next() bool`, `Current() Event`, `Err() error`, `Close() error`. Google adapter bridges via goroutine + buffered channel.
## Alternatives Considered
### Alternative A: Channel-based
- **Pros:** Go-idiomatic, works with `select`
- **Cons:** Requires goroutine per stream, less control over backpressure, no `Close()` for cleanup
### Alternative B: iter.Seq (range-over-func)
- **Pros:** Modern Go pattern, clean `for event := range stream`
- **Cons:** No `Close()` for resource cleanup, no separate error retrieval, doesn't match SDK patterns
## Consequences
**Positive:** Zero-overhead adapter for 3 of 4 SDKs. Explicit resource cleanup. Consumer controls pace.
**Negative:** Google needs a goroutine bridge. Slightly more verbose than range-based.
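The goroutine bridge might be sketched like this, assuming a hypothetical `bridge` helper that adapts a push-style producer to the pull contract:

```go
package main

import "fmt"

type Event struct{ Text string }

// pullStream adapts a push/range-based source to the pull-based contract
// by draining it in a goroutine and buffering into a channel.
type pullStream struct {
	ch   chan Event
	cur  Event
	err  error
	done chan struct{}
}

func bridge(produce func(yield func(Event) bool)) *pullStream {
	s := &pullStream{ch: make(chan Event, 16), done: make(chan struct{})}
	go func() {
		defer close(s.ch)
		produce(func(ev Event) bool {
			select {
			case s.ch <- ev:
				return true
			case <-s.done:
				return false // consumer closed early; stop producing
			}
		})
	}()
	return s
}

func (s *pullStream) Next() bool {
	ev, ok := <-s.ch
	if !ok {
		return false
	}
	s.cur = ev
	return true
}
func (s *pullStream) Current() Event { return s.cur }
func (s *pullStream) Err() error     { return s.err }
func (s *pullStream) Close() error   { close(s.done); return nil }

func main() {
	s := bridge(func(yield func(Event) bool) {
		for _, t := range []string{"a", "b"} {
			if !yield(Event{Text: t}) {
				return
			}
		}
	})
	var got string
	for s.Next() {
		got += s.Current().Text
	}
	s.Close()
	fmt.Println(got) // ab
}
```

The `done` channel is what lets `Close()` stop the producer early — exactly the cleanup hook that range-based iteration lacks.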
---
# ADR-003: Mistral as M1 Reference Provider
**Status:** Accepted
**Date:** 2026-04-02
## Context
Need to pick one provider to implement first as the reference adapter. Candidates: Anthropic (richest model), OpenAI (most popular), Mistral (user maintains SDK).
## Decision
Mistral first. The user maintains `somegit.dev/vikingowl/mistral-go-sdk` and knows its internals. The API shape is similar to OpenAI, making it a good baseline. Anthropic's unique features (thinking blocks, cache tokens) are better tested as M2 extensions.
## Alternatives Considered
### Alternative A: Anthropic first
- **Pros:** Richest content model, most features to test
- **Cons:** Anthropic-specific features (thinking, caching) could bias the abstraction
### Alternative B: OpenAI first
- **Pros:** Most widely used, well-documented
- **Cons:** No special insight into SDK internals
## Consequences
**Positive:** Fast iteration on reference adapter. SDK bugs fixed directly.
**Negative:** Thinking block support tested later (M2).
---
# ADR-004: Discriminated Union Structs
**Status:** Accepted
**Date:** 2026-04-02
## Context
Go lacks sum types. Need to represent Content variants (text, tool call, tool result, thinking) and Event variants.
## Decision
Struct with `Type` discriminant field. Exactly one payload field is set per type value. Consumer switches on `Type`.
## Alternatives Considered
### Alternative A: Interface hierarchy
- **Pros:** Extensible, familiar OOP pattern
- **Cons:** Heap allocation per variant, type assertion overhead, no exhaustive switch checking
### Alternative B: Generics-based enum
- **Pros:** Type-safe, compile-time checked
- **Cons:** Complex, unfamiliar, Go's generics don't support sum types well
## Consequences
**Positive:** Zero allocation, cache-friendly, fast switch dispatch, simple.
**Negative:** New variants require updating all switch statements. Acceptable for small, stable sets.
---
# ADR-005: json.RawMessage for Tool Schemas
**Status:** Accepted
**Date:** 2026-04-02
## Context
Tool parameters (JSON Schema) and tool call arguments need to flow between providers and tools. Options: typed schema library, code generation, or raw JSON passthrough.
## Decision
`json.RawMessage` for both tool parameter schemas and tool call arguments. Zero-cost passthrough between provider and tool. Tools parse their own arguments.
## Alternatives Considered
### Alternative A: JSON Schema library
- **Pros:** Centralized validation, type-safe schema construction
- **Cons:** Core dependency, serialization overhead, schema library selection lock-in
### Alternative B: Code generation from schemas
- **Pros:** Full type safety, compile-time checks
- **Cons:** Build complexity, generated code maintenance, rigid
## Consequences
**Positive:** No JSON Schema dependency. Providers and tools speak JSON natively. Minimal overhead.
**Negative:** Validation at tool boundary only, not centralized.
---
# ADR-006: Multi-Provider Collaboration as Core Identity
**Status:** Accepted
**Date:** 2026-04-02
## Context
Most AI coding assistants are single-provider. gnoma already supports multiple providers, but the question is whether multi-provider collaboration (elfs on different providers working together) is a nice-to-have or a core architectural feature.
## Decision
Multi-provider collaboration is a core feature and part of gnoma's identity. The architecture must support elfs running on different providers simultaneously, with routing rules directing tasks by capability, cost, or latency. This is not an afterthought — it shapes how we design the elf system, provider registry, and session management.
## Alternatives Considered
### Alternative A: Multi-provider as optional extension
- **Pros:** Simpler MVP, routing added later
- **Cons:** Architectural decisions made without routing in mind may need rework
## Consequences
**Positive:** Clear differentiator from all existing tools. Shapes architecture from day one.
**Negative:** Elf system design must account for per-elf provider config from the start.
## Changelog
- 2026-04-02: Initial decisions from architecture planning session

docs/essentials/domain-model.md

@@ -0,0 +1,132 @@
---
essential: domain-model
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [vision]
---
# Domain Model
## Entity Relationships
```mermaid
classDiagram
class Session {
+id: string
+state: SessionState
+Send(input) error
+Events() chan Event
+Cancel()
}
class Engine {
+history: []Message
+usage: Usage
+Submit(input, callback) Turn
+SetProvider(provider)
+SetModel(model)
}
class Message {
+Role: Role
+Content: []Content
+HasToolCalls() bool
+ToolCalls() []ToolCall
+TextContent() string
}
class Content {
+Type: ContentType
+Text: string
+ToolCall: ToolCall
+ToolResult: ToolResult
+Thinking: Thinking
}
class Provider {
<<interface>>
+Stream(req) Stream
+Name() string
}
class Stream {
<<interface>>
+Next() bool
+Current() Event
+Err() error
+Close() error
}
class Tool {
<<interface>>
+Name() string
+Execute(args) Result
+IsReadOnly() bool
}
class Turn {
+Messages: []Message
+Usage: Usage
+Rounds: int
}
class Elf {
<<interface>>
+ID() string
+Send(msg) error
+Events() chan Event
+Wait() ElfResult
}
Session "1" --> "1" Engine : owns
Engine "1" --> "1" Provider : uses
Engine "1" --> "*" Tool : executes
Engine "1" --> "*" Message : history
Engine "1" --> "*" Turn : produces
Message "1" --> "*" Content : contains
Provider "1" --> "*" Stream : creates
Stream "1" --> "*" Event : yields
Session "1" --> "*" Elf : spawns (future)
Elf "1" --> "1" Engine : owns
```
## Glossary
| Term | Definition | Example |
|------|-----------|---------|
| gnoma | The host application — single binary, agentic coding assistant | `gnoma "list files"` |
| Elf | A sub-agent (goroutine) with its own engine, history, and provider. Named after the elf owl. | Background elf exploring `auth/` on Ollama |
| Session | A conversation boundary between UI and engine. Owns one engine, communicates via channels. | TUI session, CLI pipe session |
| Engine | The agentic loop orchestrator. Manages history, streams from provider, executes tools, loops until done. | Engine running on Mistral with 5 tools |
| Provider | An LLM backend adapter. Translates gnoma types to/from SDK-specific types. | Anthropic provider, OpenAI-compat provider |
| Stream | Pull-based iterator over streaming events from a provider. Unified interface across all SDKs. | `for s.Next() { e := s.Current() }` |
| Event | A single streaming delta — text chunk, tool call fragment, thinking trace, or usage update. | `EventTextDelta{Text: "hello"}` |
| Message | A single turn in conversation history. Contains one or more Content blocks. | User text message, assistant message with tool calls |
| Content | A discriminated union within a Message — text, tool call, tool result, or thinking block. | `Content{Type: ContentToolCall, ToolCall: &ToolCall{...}}` |
| ToolCall | The model's request to invoke a tool, with ID, name, and JSON arguments. | `{ID: "tc_1", Name: "bash", Args: {"command": "ls"}}` |
| ToolResult | The output of executing a tool, correlated to a ToolCall by ID. | `{ToolCallID: "tc_1", Content: "file1.go\nfile2.go"}` |
| Turn | The result of a complete agentic loop — may span multiple API calls and tool executions. | Turn with 3 rounds: stream → tool → stream → tool → stream → done |
| Accumulator | Assembles a complete Response from a sequence of streaming Events. Shared across all providers. | Text fragments → complete assistant message |
| Callback | Function the engine calls for each streaming event, enabling real-time UI updates. | `func(evt stream.Event) { ch <- evt }` |
| Round | A single API call within a Turn. A turn with 2 tool-use loops has 3 rounds. | Round 1: initial query. Round 2: after tool results. |
| Routing | Directing tasks to different providers based on capability, cost, or latency rules. | Complex reasoning → Claude, quick lookups → local Qwen |
## Invariants
Rules that must always hold true in the domain:
- A Message always has at least one Content block
- A ToolResult always references a ToolCall.ID from the preceding assistant message
- A Session owns exactly one Engine; an Engine is owned by exactly one Session
- An Elf owns its own Engine — no shared mutable state between elfs
- The Accumulator produces exactly one Response per stream consumption
- Content.Type determines which payload field is set — exactly one is non-nil
- Thinking.Signature must round-trip unchanged through message history (Anthropic requirement)
- Tool execution only happens when StopReason == ToolUse
- Stream.Close() must be called after consumption, regardless of error state
- Provider.Stream() is the only network boundary — all tool execution is local
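The "exactly one payload field is set" invariant can be checked mechanically. A sketch that models only two of the four variants, with a hypothetical `validate` helper:

```go
package main

import "fmt"

type ContentType int

const (
	ContentText ContentType = iota
	ContentToolCall
)

type ToolCall struct{ ID string }

// Content mirrors the discriminated-union shape: Type selects which
// payload field must be set.
type Content struct {
	Type     ContentType
	Text     string
	ToolCall *ToolCall
}

// validate enforces the exactly-one-payload invariant per variant; the
// real type has more cases (tool result, thinking).
func validate(c Content) error {
	switch c.Type {
	case ContentText:
		if c.Text == "" || c.ToolCall != nil {
			return fmt.Errorf("text content must set Text and nothing else")
		}
	case ContentToolCall:
		if c.ToolCall == nil || c.Text != "" {
			return fmt.Errorf("tool call content must set ToolCall and nothing else")
		}
	}
	return nil
}

func main() {
	ok := Content{Type: ContentText, Text: "hi"}
	bad := Content{Type: ContentToolCall} // discriminant says tool call, payload missing
	fmt.Println(validate(ok), validate(bad) != nil) // <nil> true
}
```

A check like this could run in debug builds or tests to catch constructors that set the discriminant and payload inconsistently.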
## Changelog
- 2026-04-02: Initial version

docs/essentials/milestones.md

@@ -0,0 +1,177 @@
---
essential: milestones
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [vision]
---
# Milestones
## M1: Core Engine (MVP)
**Scope:** First working assistant. CLI pipe mode. Mistral as reference provider. Bash + file tools. No TUI, no permissions, no config file.
**Deliverables:**
- [ ] Architecture docs in `docs/essentials/`
- [ ] Foundation types (`internal/message/`)
- [ ] Streaming abstraction (`internal/stream/`)
- [ ] Provider interface + Mistral adapter
- [ ] Tool system: bash, fs.read, fs.write, fs.edit, fs.glob, fs.grep
- [ ] Engine agentic loop (stream → tool → re-query → done)
- [ ] CLI pipe mode (`echo "list files" | gnoma`)
**Exit criteria:** Pipe a coding question in, get a response that uses tools, answer on stdout.
## M2: Multi-Provider
**Scope:** All remaining providers. Config file. Dynamic provider switching.
**Deliverables:**
- [ ] Anthropic provider (streaming + tool use + thinking blocks)
- [ ] OpenAI provider (streaming + tool use)
- [ ] Google provider (streaming + function calling)
- [ ] OpenAI-compat for Ollama and llama.cpp
- [ ] TOML config (global + project + env + flags)
- [ ] `/model provider/model` switching mid-session
**Exit criteria:** Chat with any configured provider via CLI pipe. Switch providers mid-session.
## M3: TUI
**Scope:** Interactive terminal UI. Permission system.
**Deliverables:**
- [ ] Bubble Tea TUI: chat panel, input box, streaming output
- [ ] Status bar (provider, model, token usage)
- [ ] Permission system (allow / deny / prompt modes)
- [ ] Permission dialog overlay
- [ ] Model picker overlay
- [ ] Input history (up/down)
**Exit criteria:** Launch TUI, chat interactively, tools execute with permission prompts.
## M4: Context Intelligence
**Scope:** Long sessions. Token tracking. Compaction. Local tokenizer.
**Deliverables:**
- [ ] Local tokenizer for accurate token counting without provider round-trips
- [ ] Token tracker (cumulative usage, OK/warning/critical states)
- [ ] Truncate compaction (drop old messages, keep system + recent)
- [ ] Summarize compaction (LLM summarizes dropped messages)
- [ ] Compact boundaries (transaction markers for crash recovery)
- [ ] Deferred tool loading (non-essential tools loaded on demand)
- [ ] Result persistence (large tool outputs written to disk)
**Exit criteria:** 100+ turn conversation stays coherent within token budget. Local token counting matches provider reports within 5%.
## M5: Elfs (Multi-Agent + Multi-Provider Routing)
**Scope:** Sub-agents on different providers. Parallel work. Provider routing.
**Deliverables:**
- [ ] Elf spawning (`Engine.SpawnElf` with per-elf provider config)
- [ ] Background elfs (independent goroutine + engine)
- [ ] Parent ↔ elf communication via typed channels
- [ ] Concurrent tool execution (read-only parallel, writes sequential)
- [ ] Provider routing rules (route by capability, cost, latency) — research needed
- [ ] Coordinator dispatches tasks to elfs on different providers
**Exit criteria:** Coordinator on Claude spawns research elf on local Qwen + review elf on OpenAI, collects and synthesizes results.
## M6: Extensibility
**Scope:** Hooks, skills, MCP, plugin foundation.
**Deliverables:**
- [ ] Hook system (PreToolUse / PostToolUse, stdin/stdout protocol)
- [ ] Skill loading (`.gnoma/skills/*.md` with frontmatter)
- [ ] MCP client (JSON-RPC over stdio, tool discovery)
- [ ] Plugin foundation (manifest, install, lifecycle)
**Exit criteria:** MCP server tools appear in gnoma. Skills invocable by model. Hook logs all bash commands.
## M7: Persistence & Serve
**Scope:** Session persistence via SQLite. Serve mode for external clients. Coordinator mode.
**Deliverables:**
- [ ] Session persistence with SQLite (save/restore conversations across restarts)
- [ ] Serve mode (Unix socket listener, external UI clients)
- [ ] Coordinator mode (orchestrator dispatches to worker elfs)
**Exit criteria:** Resume yesterday's conversation. VS Code extension connects via serve mode. Coordinator parallelizes subtasks.
## M8: Thinking & Structured Output
**Scope:** Extended thinking support across providers. Schema-validated structured output.
**Deliverables:**
- [ ] Thinking mode (disabled / enabled with budget / adaptive)
- [ ] Thinking block streaming and display in TUI
- [ ] Structured output with JSON schema validation
- [ ] Retry logic for schema validation failures
**Exit criteria:** Extended thinking with budget works on Anthropic. Structured output validates against schema on all providers that support it.
## M9: Auth
**Scope:** OAuth 2.0 + PKCE for cloud providers. Credential management.
**Deliverables:**
- [ ] OAuth 2.0 + PKCE flow (browser redirect → callback → token exchange)
- [ ] Token refresh (proactive, before expiry)
- [ ] OS keyring integration for secure credential storage
- [ ] Multi-account support per provider
**Exit criteria:** `gnoma login anthropic` opens browser, completes OAuth flow, stores token in keyring. Automatic refresh works.
## M10: Observability
**Scope:** Feature flags. Opt-in telemetry and analytics.
**Deliverables:**
- [ ] Feature flag system (local config + optional remote evaluation)
- [ ] Opt-in analytics (event queue, local-only by default)
- [ ] Usage dashboards (token spend, provider usage, tool frequency)
- [ ] Cost tracking per provider/model
**Exit criteria:** Feature flags gate experimental features. User can view their token spend breakdown. Analytics disabled by default.
## M11: Web UI
**Scope:** Browser-based UI as alternative to TUI. Requires serve mode (M7).
**Deliverables:**
- [ ] `gnoma web` CLI subcommand (or `gnoma --web`) starts local web server
- [ ] Web UI connects to serve mode backend
- [ ] Chat interface with streaming, tool output, permission prompts
- [ ] Responsive design for desktop browsers
**Exit criteria:** `gnoma web` opens browser, full chat with streaming and tool execution. Serve mode required as prerequisite.
## Future
Ideas not yet committed:
- Voice input/output via provider audio APIs
- Collaborative sessions (multiple humans + elfs)
- Plugin marketplace
- Remote agent execution
## Changelog
- 2026-04-02: Initial version — M1-M6
- 2026-04-02: Split M2 into providers (M2) and TUI (M3). Added M8-M11 for thinking, auth, observability, web UI. Local tokenizer in M4. SQLite for session persistence in M7.

docs/essentials/patterns.md

@@ -0,0 +1,135 @@
---
essential: patterns
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture]
---
# Patterns
## Discriminated Unions
- **What:** Struct with a `Type` field discriminant; exactly one payload field is set per type value. Used instead of Go interfaces for data variants.
- **Where:** `message.Content`, `stream.Event`
- **Why:** Zero allocation (no interface boxing), cache-friendly, works with `switch` statements. Go lacks sum types — this is the pragmatic equivalent.
- **Example:**
```go
type Content struct {
Type ContentType
Text string // set when Type == ContentText
ToolCall *ToolCall // set when Type == ContentToolCall
ToolResult *ToolResult // set when Type == ContentToolResult
Thinking *Thinking // set when Type == ContentThinking
}
switch c.Type {
case ContentText:
fmt.Print(c.Text)
case ContentToolCall:
execute(c.ToolCall)
}
```
## Pull-Based Stream Iterator
- **What:** `Next() / Current() / Err() / Close()` interface for consuming streaming data.
- **Where:** `stream.Stream` interface, all provider adapters
- **Why:** Matches 3 of 4 SDKs (Anthropic, OpenAI, Mistral) natively. Gives the consumer explicit backpressure control. Supports `Close()` for resource cleanup, unlike `iter.Seq`. Only Google needs a goroutine bridge.
- **Example:**
```go
defer s.Close()
for s.Next() {
    process(s.Current())
}
if err := s.Err(); err != nil {
    handle(err)
}
```
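The goroutine bridge mentioned above (for Google's range-based SDK) can be sketched as follows — `pullStream` and `newPullStream` are hypothetical names, assuming a push-style producer that accepts a `yield` callback:

```go
package main

import "fmt"

// pullStream adapts a push-style event source to the Next/Current/Err/Close
// pull interface. A bridging goroutine runs the producer and forwards each
// event over a channel; Close stops the producer early via the done channel.
type pullStream[E any] struct {
	ch      chan E
	done    chan struct{}
	current E
	err     error
}

func newPullStream[E any](produce func(yield func(E) bool)) *pullStream[E] {
	s := &pullStream[E]{ch: make(chan E), done: make(chan struct{})}
	go func() {
		defer close(s.ch)
		produce(func(e E) bool {
			select {
			case s.ch <- e:
				return true
			case <-s.done:
				return false // consumer closed early: stop producing
			}
		})
	}()
	return s
}

func (s *pullStream[E]) Next() bool {
	e, ok := <-s.ch
	if !ok {
		return false
	}
	s.current = e
	return true
}

func (s *pullStream[E]) Current() E   { return s.current }
func (s *pullStream[E]) Err() error   { return s.err }
func (s *pullStream[E]) Close() error { close(s.done); return nil }

func main() {
	s := newPullStream(func(yield func(int) bool) {
		for i := 1; i <= 3; i++ {
			if !yield(i) {
				return
			}
		}
	})
	defer s.Close()
	for s.Next() {
		fmt.Println(s.Current())
	}
}
```

The `select` on `done` inside the yield callback is what closes R-002's leak window: a producer blocked on send unblocks as soon as the consumer calls `Close()`. The real adapter would also plumb the SDK's terminal error into `err`.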
## Accumulator
- **What:** Shared component that assembles a `message.Response` from a sequence of `stream.Event` values. Separated from provider-specific translation.
- **Where:** `stream.Accumulator`, used by every provider adapter
- **Why:** Provider adapters become thin translation layers. Accumulation logic (text building, tool call JSON fragment assembly, thinking blocks) is tested once, not per-provider.
- **Example:**
```go
acc := stream.NewAccumulator()
for s.Next() {
acc.Apply(s.Current())
}
response := acc.Response()
```
## Factory Registry
- **What:** Map of names to factory functions. Creates instances on demand with config.
- **Where:** `provider.Registry`, `tool.Registry`
- **Why:** Decouples creation from usage. Makes testing easy — register mock factories. Enables dynamic provider switching.
- **Example:**
```go
registry.Register("mistral", mistral.NewProvider)
provider, err := registry.Create("mistral", cfg)
```
## Functional Options
- **What:** Variadic option functions for configuring complex objects.
- **Where:** Session creation, provider construction
- **Why:** Clean API for objects with many optional parameters. Self-documenting, extensible without breaking changes.
- **Example:**
```go
session, err := manager.NewSession(
WithProvider(mistral),
WithModel("mistral-large-latest"),
WithMaxTurns(20),
)
```
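The option functions themselves are plain closures over a config struct. A sketch under illustrative names (`sessionConfig` and the defaults are not the real API):

```go
package main

import "fmt"

// sessionConfig holds the tunable parameters; options mutate it.
type sessionConfig struct {
	model    string
	maxTurns int
}

// SessionOption is the functional-option type.
type SessionOption func(*sessionConfig)

func WithModel(m string) SessionOption {
	return func(c *sessionConfig) { c.model = m }
}

func WithMaxTurns(n int) SessionOption {
	return func(c *sessionConfig) { c.maxTurns = n }
}

func newSession(opts ...SessionOption) sessionConfig {
	cfg := sessionConfig{model: "default", maxTurns: 10} // defaults first
	for _, opt := range opts { // then each option overrides
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newSession(WithModel("mistral-large-latest"), WithMaxTurns(20))
	fmt.Println(cfg.model, cfg.maxTurns)
}
```

Because defaults are applied before options run, a new parameter can be added with a safe default and a new `With*` function without breaking any caller.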
## Callback Event Propagation
- **What:** The engine accepts a `Callback func(stream.Event)` and calls it for each event. The session wraps this to push events into a channel.
- **Where:** `engine.Submit()` → `session/local.go`
- **Why:** Keeps the engine testable without concurrency. The engine knows nothing about channels, TUI, or goroutines. The session implementation decides how to propagate events.
- **Example:**
```go
// In session/local.go:
cb := func(evt stream.Event) {
select {
case s.events <- evt:
case <-ctx.Done():
}
}
turn, err := s.engine.Submit(ctx, input, cb)
```
## Error Wrapping with errors.AsType
- **What:** Provider adapters wrap SDK errors into typed `ProviderError` with classification. Consumers extract using Go 1.26's `errors.AsType[E]`.
- **Where:** All provider adapters, retry logic, engine error handling
- **Why:** Enables error classification (transient vs auth vs bad request) for retry decisions. Type-safe extraction without pointer indirection.
- **Example:**
```go
if pErr, ok := errors.AsType[*ProviderError](err); ok {
if pErr.Retryable {
// exponential backoff
}
}
```
## Anti-Patterns
Patterns explicitly avoided in this project:
| Anti-Pattern | Why we avoid it | What to do instead |
|---|---|---|
| Interface-based unions | Heap allocation, type assertion overhead, no exhaustive matching | Discriminated union structs with Type field |
| Channel-based streams | Requires goroutine management, harder to control backpressure | Pull-based iterator interface |
| Global state | Untestable, race-prone, hidden dependencies | Dependency injection via config structs |
| Shared mutable state between elfs | Race conditions, complex synchronization | Each elf owns its own engine; communicate via channels |
| Over-abstraction | Premature generalization obscures intent | Three similar lines > one premature abstraction |
## Changelog
- 2026-04-02: Initial version

docs/essentials/process-flows.md Normal file
---
essential: process-flows
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture]
---
# Process Flows
## Bootstrap / Initialization
```mermaid
sequenceDiagram
participant User
participant Main as cmd/gnoma
participant Cfg as config.Load()
participant Auth as auth.KeySource
participant PR as ProviderRegistry
participant TR as ToolRegistry
participant PM as PermissionChecker
participant SM as SessionManager
participant UI as TUI / CLI
User->>Main: gnoma [flags]
Main->>Cfg: Load()
Note over Cfg: defaults → ~/.config/gnoma/config.toml<br/>→ .gnoma/config.toml → env → flags
Cfg-->>Main: Config
Main->>Auth: NewKeySource(config.APIKeys)
Main->>PR: NewRegistry()
loop each provider
Main->>Auth: Resolve(providerName)
Auth-->>Main: apiKey
Main->>PR: Register(name, factory)
end
Main->>TR: NewRegistry()
Main->>TR: Register(bash, fs.read, fs.write, ...)
Main->>PM: NewChecker(mode, rules, promptFn)
Main->>SM: NewManager(config)
alt stdin is TTY
Main->>UI: LaunchTUI(sessionManager)
else stdin is pipe
Main->>UI: RunCLI(sessionManager)
end
```
## User Message → Response (Full Agentic Turn)
**Happy path:**
```mermaid
sequenceDiagram
participant UI as TUI
participant Sess as Session<br/>(goroutine)
participant Eng as Engine
participant Prov as Provider
participant Acc as Accumulator
participant TR as ToolRegistry
participant PM as Permissions
UI->>Sess: Send(ctx, "user input")
Sess->>Eng: Submit(ctx, input, callback)
Note over Eng: Append user message to history
loop Agentic Loop (until EndTurn or MaxTurns)
Eng->>Prov: Stream(ctx, Request)
Prov-->>Eng: stream.Stream
loop Stream consumption
Eng->>Eng: stream.Next()
Eng->>Acc: Apply(event)
Eng->>Sess: callback(event)
Sess-->>UI: event via channel
UI->>UI: render delta
end
Eng->>Acc: Response()
Note over Eng: Append assistant message to history
alt StopReason == EndTurn
Note over Eng: Done — return Turn
else StopReason == ToolUse
loop each ToolCall
Eng->>PM: Check(toolName, args)
alt Denied
Note over Eng: Add error ToolResult
else Prompt needed
Eng->>Sess: callback(PermissionEvent)
Sess-->>UI: show permission dialog
UI-->>Sess: user decision
Sess-->>Eng: approved/denied
end
Eng->>TR: Get(toolName)
Eng->>TR: tool.Execute(ctx, args)
TR-->>Eng: Result
end
Note over Eng: Append ToolResults, continue loop
else StopReason == MaxTokens
Note over Eng: Return Turn with truncation warning
end
end
Eng-->>Sess: Turn
Sess-->>UI: TurnResult()
```
**Key decision points:**
- StopReason determines whether the loop continues (ToolUse), ends (EndTurn), or warns (MaxTokens)
- Permission check can block tool execution — denied tools get error results sent back to the model
- MaxTurns is a safety limit to prevent runaway loops
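The decision points above can be condensed into a control-flow sketch. All types here are stand-ins for the real engine, provider, and tool interfaces:

```go
package main

import "fmt"

// StopReason drives the loop, mirroring the diagram's alt branches.
type StopReason int

const (
	EndTurn StopReason = iota
	ToolUse
	MaxTokens
)

type response struct {
	stop  StopReason
	calls []string // tool names, stand-in for []ToolCall
}

// runTurn loops until EndTurn, MaxTokens, or the MaxTurns safety limit.
// stepFn stands in for one stream-and-accumulate round with the provider.
func runTurn(stepFn func(turn int) response, execTool func(name string) string, maxTurns int) error {
	for turn := 0; turn < maxTurns; turn++ {
		resp := stepFn(turn)
		switch resp.stop {
		case EndTurn:
			return nil // done
		case MaxTokens:
			return fmt.Errorf("turn truncated: max tokens")
		case ToolUse:
			for _, name := range resp.calls {
				_ = execTool(name) // results appended to history; loop continues
			}
		}
	}
	return fmt.Errorf("aborted after %d turns", maxTurns)
}

func main() {
	err := runTurn(func(turn int) response {
		if turn == 0 {
			return response{stop: ToolUse, calls: []string{"fs.read"}}
		}
		return response{stop: EndTurn}
	}, func(name string) string { return "ok" }, 20)
	fmt.Println(err) // <nil>
}
```

Note the loop never consults tool output to decide whether to continue; only the next response's `StopReason` does, which keeps the control flow provider-agnostic.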
## Streaming Pipeline
```mermaid
graph LR
subgraph "Provider SDK"
SDK[SDK Stream]
end
subgraph "Provider Adapter"
Adapt[translate SDK event<br/>→ stream.Event]
end
subgraph "Engine"
CB[Callback func]
ACC[Accumulator]
end
subgraph "Session"
CH[Event Channel<br/>buffered 64]
end
subgraph "TUI"
Render[Render delta<br/>to terminal]
end
SDK -->|SDK-specific type| Adapt
Adapt -->|stream.Event| CB
CB --> ACC
CB --> CH
CH --> Render
```
## Tool Execution Flow
```mermaid
sequenceDiagram
participant Eng as Engine
participant PM as Permissions
participant UI as UI (via callback)
participant Reg as ToolRegistry
participant Tool as Tool impl
Note over Eng: Extract []ToolCall from response
loop each ToolCall
Eng->>PM: Check(ctx, toolName, args)
alt mode == Allow
PM-->>Eng: nil (allowed)
else mode == Prompt
PM->>UI: PromptFunc(toolName, args)
UI->>UI: Show permission dialog
UI-->>PM: true/false
alt denied
PM-->>Eng: ErrDenied
Note over Eng: ToolResult{IsError: true}
end
else mode == Deny + no allow rule
PM-->>Eng: ErrDenied
end
Eng->>Reg: Get(toolName)
alt tool not found
Note over Eng: ToolResult{IsError: true, "unknown tool"}
else found
Eng->>Tool: Execute(ctx, args)
Tool-->>Eng: Result
end
end
Note over Eng: Append ToolResults to history
```
## Context Compaction Flow
```mermaid
sequenceDiagram
participant Eng as Engine
participant Win as Context Window
participant Trk as Token Tracker
participant Strat as Compaction Strategy
Eng->>Win: Append(message)
Win->>Trk: Add(usage)
Trk-->>Win: State()
alt TokensOK
Note over Win: No action
else TokensWarning
Note over Win: Log warning, continue
else TokensCritical
Win->>Strat: Compact(messages, budget)
alt TruncateStrategy
Note over Strat: Keep system prompt + last N messages
else SummarizeStrategy
Note over Strat: LLM summarizes old messages
end
Strat-->>Win: compacted messages
Win->>Trk: Reset + recount
end
```
## Cancellation Propagation
```mermaid
sequenceDiagram
participant UI
participant Sess as Session goroutine
participant Eng as Engine
participant Prov as Provider.Stream
participant SDK as SDK HTTP
UI->>Sess: Cancel()
Note over Sess: cancel context
Sess->>Eng: ctx.Done() propagates
Eng->>Prov: ctx.Done() propagates
Prov->>SDK: HTTP request cancelled
SDK-->>Prov: context.Canceled
Prov-->>Eng: stream.Err() = context.Canceled
Eng-->>Sess: Turn with error
Sess->>Sess: close events channel
Sess-->>UI: TurnResult() returns error
```
## Changelog
- 2026-04-02: Initial version

docs/essentials/risks.md Normal file
---
essential: risks
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: []
---
# Risk / Unknowns
| ID | Risk | Severity | Mitigation | Status |
|----|------|----------|-----------|--------|
| R-001 | SDK breaking changes — provider SDKs are pre-1.0 and may change APIs | Medium | Pin versions, integration tests per provider, adapter layer absorbs changes | Open |
| R-002 | Google range-to-pull bridge goroutine leak — context cancellation edge cases | Medium | Thorough testing with `testing/synctest`, always select on `ctx.Done()` | Open |
| R-003 | Thinking block round-trip fidelity — Anthropic signatures must survive serialization | Medium | Unit tests with real signature values, golden file tests | Open |
| R-004 | Tool call ID generation inconsistency — Google/Ollama may return empty IDs | Low | Generate UUID if provider returns empty, documented in provider adapter | Open |
| R-005 | Mistral SDK 2.2.0 stability — user-maintained SDK, recently updated | Low | User maintains it, can fix bugs directly. Integration tests catch regressions. | Accepted |
| R-006 | Bubble Tea v2 maturity — v2 is relatively new | Low | Pin version, fallback to v1 if blockers. TUI is last milestone item. | Open |
| R-007 | Multi-provider routing complexity — coordinating elfs on different providers with different capabilities | High | Design routing interface early (M4), start simple (manual provider assignment), add rules incrementally | Open |
| R-008 | Context compaction coherence — summarization may lose critical details | Medium | Truncation as safe default, summarization opt-in, compact boundaries for recovery | Open |
| R-009 | Permission prompt UX in pipe mode — no TUI for interactive prompts | Low | Default to `allow` or `deny` in pipe mode, require explicit flag | Open |
## Open Questions
- [ ] How should routing rules be expressed in config? Per-task rules, model capability tags, cost-based? — needs research before M5
- [ ] Which local tokenizer library to use? (tiktoken port, sentencepiece, or provider-specific)
- [ ] Serve mode protocol — choose what fits best when implementing M7
- [x] ~~Should gnoma embed a tokenizer?~~ → Yes, include local tokenizer (M4)
- [x] ~~Session persistence format?~~ → SQLite (M7)
- [x] ~~Mistral SDK as long-term reference?~~ → Yes for now, revisit after M2
## Changelog
- 2026-04-02: Initial version

docs/essentials/tech-stack.md Normal file
---
essential: tech-stack
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: []
---
# Tech Stack & Conventions
## Languages
| Language | Version | Role |
|----------|---------|------|
| Go | 1.26 | Primary — all application code |
## Frameworks & Libraries
| Library | Module | Purpose |
|---------|--------|---------|
| Mistral SDK | `somegit.dev/vikingowl/mistral-go-sdk` | Mistral API client (user-maintained) |
| Anthropic SDK | `github.com/anthropics/anthropic-sdk-go` | Anthropic API client |
| OpenAI SDK | `github.com/openai/openai-go` | OpenAI API client (+ compat endpoints) |
| Google GenAI | `google.golang.org/genai` | Google Gemini API client |
| TOML | `github.com/BurntSushi/toml` | Configuration file parsing |
| Bubble Tea | `github.com/charmbracelet/bubbletea/v2` | Terminal UI framework |
| Lip Gloss | `github.com/charmbracelet/lipgloss` | Terminal styling |
| Bubbles | `github.com/charmbracelet/bubbles` | TUI components (input, viewport) |
| Doublestar | `github.com/bmatcuk/doublestar/v4` | Glob with `**` support |
## Go 1.26 Features Used
| Feature | Where |
|---------|-------|
| `new(expr)` | Optional pointer fields in config/params |
| `errors.AsType[E](err)` | Provider error handling |
| `sync.WaitGroup.Go(f)` | Goroutine management |
| `slog.NewMultiHandler()` | Fan-out logging |
| `testing/synctest` | Concurrent test support |
| Green Tea GC (default) | No action needed — 10-40% less GC overhead |
| `io.ReadAll` 2x faster | File tool reads |
## Tooling
- **Build:** `go build` via Makefile
- **CI/CD:** none yet (planned)
- **Linting:** `golangci-lint`
- **Testing:** stdlib `testing`, `testing/synctest`
- **Package management:** Go modules
## Conventions
### Naming
- Files: lowercase, underscores for multi-word (`tool_result.go`)
- Packages: short, lowercase, no underscores (`provider`, `stream`)
- Functions/methods: MixedCaps — PascalCase when exported (`NewUserText`, `HasToolCalls`), camelCase when unexported
- Types/structs: PascalCase (`ToolCall`, `ProviderError`)
- Constants: PascalCase for exported (`StopEndTurn`), camelCase for unexported
- Interfaces: describe behavior (`Provider`, `Stream`, `Tool`), not implementation
### Error Handling
- Explicit error types with `%w` wrapping
- `errors.AsType[E]` for type-safe extraction (Go 1.26)
- `Err` prefix for sentinel errors (`ErrDenied`)
- `*Error` suffix for error types (`ProviderError`)
- Fail fast — never swallow errors
- Include context in error messages
### File Organization
- By layer within `internal/`: `message/`, `stream/`, `provider/`, `tool/`, `engine/`, `session/`
- Provider adapters: one directory per provider under `internal/provider/`
- Tool implementations: one directory per tool type under `internal/tool/`
- Three files per provider adapter: `provider.go`, `translate.go`, `stream.go`
### Commit Style
- Conventional commits: `feat:`, `fix:`, `refactor:`, `test:`, `docs:`, `chore:`
- No co-signing
## Changelog
- 2026-04-02: Initial version

docs/essentials/uml-diagrams.md Normal file
---
essential: uml-diagrams
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: [architecture, process-flows]
---
# UML Diagrams
Go doesn't have class hierarchies. These diagrams show struct relationships, interface implementations, and state machines.
## State Diagram: Engine Turn
```mermaid
stateDiagram-v2
[*] --> Idle
Idle --> Streaming: Submit(input)
Streaming --> Accumulating: stream exhausted
Accumulating --> ToolExec: StopReason == ToolUse
Accumulating --> Complete: StopReason == EndTurn
Accumulating --> Complete: StopReason == MaxTokens
ToolExec --> Streaming: tools executed, continue loop
ToolExec --> PermissionWait: tool needs approval
PermissionWait --> ToolExec: user approves
PermissionWait --> ToolExec: user denies (error result)
Streaming --> Cancelled: ctx.Done()
ToolExec --> Cancelled: ctx.Done()
PermissionWait --> Cancelled: ctx.Done()
Complete --> Idle: turn returned
Cancelled --> Idle: turn returned with error
Streaming --> Error: provider error
ToolExec --> Error: fatal tool error
Error --> Idle: error returned
```
## State Diagram: Session Lifecycle
```mermaid
stateDiagram-v2
[*] --> Idle
Idle --> Active: Send(input)
Active --> Idle: Turn complete (events channel closed)
Active --> Cancelling: Cancel()
Cancelling --> Idle: cancellation propagated
Idle --> Closed: Close()
Active --> Closed: Close() (cancels first)
Closed --> [*]
```
## State Diagram: Stream
```mermaid
stateDiagram-v2
[*] --> Open
Open --> HasEvent: Next() returns true
HasEvent --> Open: caller reads Current()
Open --> Exhausted: Next() returns false, Err()==nil
Open --> Errored: Next() returns false, Err()!=nil
Exhausted --> [*]: Close()
Errored --> [*]: Close()
```
## Component Diagram: Provider Adapter Stack
```mermaid
graph TD
subgraph "Provider Interface"
PI[provider.Provider]
PS[stream.Stream]
end
subgraph "Adapters"
MA[mistral adapter]
AA[anthropic adapter]
OA[openai adapter]
GA[google adapter]
OC[openaicompat adapter]
end
subgraph "SDKs"
MS[mistral-go-sdk]
AS[anthropic-sdk-go]
OS[openai-go]
GS[genai-go]
end
PI --> MA
PI --> AA
PI --> OA
PI --> GA
PI --> OC
MA --> MS
AA --> AS
OA --> OS
GA --> GS
OC --> OS
MA --> PS
AA --> PS
OA --> PS
GA --> PS
OC --> PS
```
## Component Diagram: Streaming Event Translation
```mermaid
graph TB
subgraph "Anthropic SDK"
A1[ContentBlockDeltaEvent TextDelta] -->|→| E2[EventTextDelta]
A2[ContentBlockDeltaEvent InputJSONDelta] -->|→| E3[EventToolCallDelta]
A3[ContentBlockDeltaEvent ThinkingDelta] -->|→| E4[EventThinkingDelta]
A4[ContentBlockStartEvent tool_use] -->|→| E1[EventToolCallStart]
A5[ContentBlockStopEvent] -->|→| E5[EventToolCallDone]
end
subgraph "OpenAI SDK"
O1[Chunk.Delta.Content] -->|→| E2
O2[Chunk.Delta.ToolCalls start] -->|→| E1
O3[Chunk.Delta.ToolCalls delta] -->|→| E3
O4[Chunk.FinishReason=tool_calls] -->|→| E5
end
subgraph "Google SDK"
G1[Response.Part.Text] -->|→| E2
G2[Response.Part.FunctionCall] -->|→| E1
G3[same FunctionCall] -->|→| E5
end
subgraph "Mistral SDK"
M1[Chunk.Delta.Content] -->|→| E2
M2[Chunk ToolCalls start] -->|→| E1
M3[Chunk ToolCalls delta] -->|→| E3
M4[Chunk.FinishReason=tool_calls] -->|→| E5
end
```
## Struct Relationships: Elf System (Future)
```mermaid
classDiagram
class Elf {
<<interface>>
+ID() string
+Status() ElfStatus
+Send(msg) error
+Events() chan Event
+Cancel()
+Wait() ElfResult
}
class SyncElf {
-engine Engine
-parentCtx context.Context
}
class BackgroundElf {
-engine Engine
-events chan Event
}
note for SyncElf "runs on the parent goroutine"
note for BackgroundElf "runs on its own goroutine"
class ElfManager {
-elfs map~string, Elf~
+Spawn(config) Elf
+Get(id) Elf
+List() []Elf
+CancelAll()
}
Elf <|.. SyncElf
Elf <|.. BackgroundElf
ElfManager --> Elf
```
## Changelog
- 2026-04-02: Initial version

docs/essentials/vision.md Normal file
---
essential: vision
status: complete
last_updated: 2026-04-02
project: gnoma
depends_on: []
---
# Vision
## What
A provider-agnostic agentic coding assistant — a single Go binary that streams, calls tools, and manages conversations across any LLM provider without privileging any one of them. Providers don't just coexist — they collaborate. Elfs (sub-agents) running on different providers work together within a single session, routed by capability, cost, or latency.
Named after the northern pygmy-owl (*Glaucidium gnoma*). Sub-agents are called *elfs* (elf owl, *Micrathene whitneyi*).
## Who
Any developer who wants an AI coding assistant they actually control — from hobbyists running local models on their own hardware, to professionals choosing between cloud providers, to teams where each member prefers a different LLM.
## Problem
Current agentic coding assistants (Claude Code, Cursor, Windsurf, Copilot) lock users to a single provider. Switching costs are high. Behavior is opaque — hidden tool execution, unclear token spend, no way to customize permissions or inject hooks. Local model support is an afterthought.
Worse, these assistants are single-provider silos. You can't have one model coordinate with another, route tasks to the best-fit provider, or mix a cloud model's reasoning with a local model's speed. Every request goes to the same provider regardless of complexity, cost, or capability.
There is no open, extensible assistant that treats all providers as collaborators, gives full visibility into every action, and works just as well with a local Ollama instance as with a cloud API.
## Core Principles
- **Provider freedom** — switch between Anthropic, OpenAI, Google, Mistral, or local models with one config change. No privileged provider.
- **Multi-provider collaboration** — elfs on different providers work together. A coordinator on Claude dispatches research to a local Qwen elf and code review to an OpenAI elf. Routing rules direct tasks by capability, cost, or latency.
- **Transparency** — every tool call, permission check, and token spend is visible. No hidden behavior.
- **Extensibility** — hooks, skills, and MCP let users shape the assistant without forking.
- **Simplicity** — single binary, zero infrastructure, runs anywhere Go compiles.
## Success Criteria
- [ ] gnoma replaces a vendor-locked assistant as the user's daily driver
- [ ] A user can switch providers mid-session with zero friction
- [ ] Elfs run on different providers simultaneously — a coordinator on one provider dispatches work to elfs on other providers
- [ ] Routing rules direct tasks to providers by capability, cost, or latency
- [ ] Local models (Ollama, llama.cpp) work with full tool-use support
- [ ] Every tool call, permission check, and token spend is visible to the user
- [ ] Users extend gnoma via hooks, skills, and MCP without forking
## Changelog
- 2026-04-02: Initial version