Commit Graph

78 Commits

Author SHA1 Message Date
87f8d2697f feat: wire --resume/-r CLI flags, SessionStore, quality persistence
- Add --resume/-r flags; empty = list sessions, ID = restore specific session
- Create SessionStore from config.ProjectRoot() and cfg.Session.MaxKeep
- Wire SessionID and Store into session.NewLocal
- Restore QualityTracker EMA data from ~/.config/gnoma/quality.json at startup
- Persist QualityTracker data to quality.json via defer on process exit
2026-04-05 23:52:05 +02:00
4597d4cb08 feat: /resume TUI command + SessionStore in tui.Config
- Add SessionStore field to tui.Config
- Add /resume slash command: lists sessions or restores by ID
- Pass SessionStore to tui.New in main.go
- Update /help text to include /resume
- Add .gnoma/sessions/ to .gitignore
2026-04-05 23:51:48 +02:00
16193de4fc feat: LocalConfig + auto-save hook in session.Local
Refactor NewLocal to accept LocalConfig (matching engine/router patterns),
add persistence fields (SessionID, Store, Incognito, Logger), capture
finalState before releasing the lock to avoid data races, and auto-save
a Snapshot after each successful turn when a store is configured.
Add SessionID() to the Session interface and three new tests covering
auto-save, no-store no-panic, and SessionID accessors.
2026-04-05 23:46:48 +02:00
ca163dc8d4 feat: QualityTracker.Snapshot/Restore + Router.QualityTracker() for cross-session persistence 2026-04-05 23:40:19 +02:00
20fb045cba feat: Engine.SetHistory/SetUsage/SetActivatedTools for session restore 2026-04-05 23:39:38 +02:00
bbd7791428 feat: add Session config section (max_keep for session retention) 2026-04-05 23:37:10 +02:00
8b489d429a test: snapshot JSON round-trip with multi-turn conversation 2026-04-05 23:35:25 +02:00
3cf5cdeeb6 feat: SessionStore — save/load/list/prune session snapshots to .gnoma/sessions/ 2026-04-05 23:34:29 +02:00
ef2d058cc0 feat: JSON serialization for Message and Content (session persistence blocker)
Add custom MarshalJSON/UnmarshalJSON on Content using string type discriminant
("text", "tool_call", "tool_result", "thinking"). Add json tags to Message.
2026-04-05 23:31:25 +02:00
3f258a1bc7 feat: coordinator mode — system prompt injection for orchestration tasks 2026-04-05 23:07:56 +02:00
8d3d498245 feat: coordinator mode — system prompt injection for orchestration tasks 2026-04-05 23:06:23 +02:00
dd9f4e390a feat: accurate context window sizing from arm capabilities + prefix token baseline + tokenizer wiring 2026-04-05 22:26:31 +02:00
27ca12f863 feat: tokenizer-aware Tracker.CountTokens/CountMessages replaces EstimateMessages in compaction 2026-04-05 22:21:12 +02:00
62112cff55 feat: list_results + read_result tools for coordinator artifact discovery 2026-04-05 22:19:05 +02:00
7c991a9c68 feat: list_results + read_result tools for coordinator artifact discovery 2026-04-05 22:15:04 +02:00
6cf5e92957 feat: QualityTracker — EMA router feedback from elf outcomes, ResultFilePaths tracking 2026-04-05 22:08:08 +02:00
8a846bd024 fix: restore import order in main.go 2026-04-05 22:02:23 +02:00
d251dd7507 feat: wire persist.Store into engine, elf manager, and agent tools 2026-04-05 21:59:55 +02:00
eab5f26407 test: tokenizer cached encoding path coverage 2026-04-05 21:53:05 +02:00
f7782215dc feat: tiktoken tokenizer — accurate BPE token counting with provider-aware encoding 2026-04-05 21:46:43 +02:00
fbb28de0b8 fix: persist.Store — sanitize callID, log save errors, document List filter semantics 2026-04-05 21:44:03 +02:00
ace3716204 feat: persist.Store — session-scoped /tmp tool result persistence 2026-04-05 21:38:45 +02:00
01a05fba4e docs: M6/M7 close-out implementation plan — 8 tasks, TDD, full file map 2026-04-05 21:33:42 +02:00
c556d3172f docs: M6/M7 close-out design spec — tool persistence, tokenizer, router feedback, coordinator 2026-04-05 21:22:26 +02:00
c2502a2b39 docs: add README with provider setup and dev/test instructions 2026-04-05 19:29:10 +02:00
cb2d63d06f feat: Ollama/gemma4 compat — /init flow, stream filter, safety fixes
provider/openai:
- Fix doubled tool call args (argsComplete flag): Ollama sends complete
  args in the first streaming chunk then repeats them as delta, causing
  doubled JSON and 400 errors in elfs
- Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep)
- Add Reasoning field support for Ollama thinking output

cmd/gnoma:
- Early TTY detection so logger is created with correct destination
  before any component gets a reference to it (fixes slog WARN bleed
  into TUI textarea)

permission:
- Exempt spawn_elfs and agent tools from safety scanner: elf prompt
  text may legitimately mention .env/.ssh/credentials patterns and
  should not be blocked

tui/app:
- /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge
  (ask for plain text output) → TUI fallback write from streamBuf
- looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback
  content before writing (reject refusals, strip narrative preambles)
- Collapse thinking output to 3 lines; ctrl+o to expand (live stream
  and committed messages)
- Stream-level filter for model pseudo-tool-call blocks: suppresses
  <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|>
  from entering streamBuf across chunk boundaries
- sanitizeAssistantText regex covers both block formats
- Reset streamFilterClose at every turn start
2026-04-05 19:24:51 +02:00
14b88cadcc feat: M1-M7 gap audit phase 3 — context prefix, deferred tools, compact hooks
Gap 11 (M6): Fixed context prefix
- Window.PrefixMessages stores immutable docs (CLAUDE.md, .gnoma/GNOMA.md)
- Prefix stripped before compaction, prepended after — survives all compaction
- AllMessages() returns prefix + history for provider requests
- main.go loads CLAUDE.md and .gnoma/GNOMA.md at startup as prefix

Gap 12 (M6): Deferred tool loading
- DeferrableTool optional interface: ShouldDefer() bool
- buildRequest() skips deferred tools until activated
- Tools auto-activate on first model request (activatedTools map)
- agent + spawn_elfs marked as deferrable (large schemas, rarely needed early)
- Saves ~800 tokens per deferred tool per request

Gap 13 (M6): Pre/post compact hooks
- OnPreCompact/OnPostCompact callbacks in WindowConfig
- Called in doCompact() (shared by CompactIfNeeded + ForceCompact)
- M8 hooks system will extend these to full protocol
2026-04-04 20:46:50 +02:00
509c897847 feat: M1-M7 gap audit phase 2 — security, TUI, context, router feedback
Gap 6 (M3): 7 new bash security checks (8-14)
- JQ injection, obfuscated flags (Unicode lookalike hyphens),
  /proc/environ access, brace expansion, Unicode whitespace,
  zsh dangerous constructs, comment-quote desync
- Total: 14 checks (was 7)

Gap 7 (M5): Model picker numbered selection
- /model shows numbered sorted list, /model 3 picks by number

Gap 8 (M5): /config set command
- /config set provider.default mistral writes to .gnoma/config.toml
- Whitelisted keys: provider.default, provider.model, permission.mode
- New config/write.go with TOML round-trip via BurntSushi/toml

Gap 9 (M6): Simple token estimator
- EstimateTokens (len/4 heuristic), EstimateMessages (content + overhead)
- PreEstimate on Tracker for proactive compaction triggering

Gap 10 (M7): Router quality feedback from elfs
- Router.Outcome + ReportOutcome (logs for now, M9 bandit uses later)
- Manager tracks armID/taskType per elf via elfMeta map
- Manager.ReportResult called after elf completion in both agent + batch tools
2026-04-04 11:07:08 +02:00
31cba286ac fix: M1-M7 gap audit phase 1 — bug fix + 5 quick wins
Bug fix:
- window.go: token ratio after compaction used len(w.messages) after
  reassignment, always producing ratio ~1.0. Fixed by saving original
  length before assignment.

Gap 1 (M3): Scanner patterns 13 → 47
- Added 34 new patterns: Azure, DigitalOcean, HuggingFace, Grafana,
  GitHub extended (app/oauth/refresh), Shopify, Twilio, SendGrid,
  NPM, PyPI, Databricks, Pulumi, Postman, Sentry, Anthropic admin,
  OpenAI extended, Vault, Supabase, Telegram, Discord, JWT, Heroku,
  Mailgun, Figma

Gap 2 (M3): Config security section
- SecuritySection with EntropyThreshold + custom PatternConfig
- Wire custom patterns from TOML into scanner at startup

Gap 3 (M4): Polling discovery loop
- StartDiscoveryLoop with 30s ticker, reconciles arms vs discovered
- Router.RemoveArm for disappeared local models

Gap 4 (M5): Incognito LocalOnly enforcement
- Router.SetLocalOnly filters non-local arms in Select()
- TUI incognito toggle (Ctrl+X, /incognito) sets local-only routing

Gap 5 (M6): Reactive 413 compaction
- Window.ForceCompact() bypasses ShouldCompact threshold
- Engine handles 413 with emergency compact + retry
2026-04-03 23:11:08 +02:00
38fc49a6c4 fix: retry with exponential backoff on 429, stagger elf spawns
Engine retries transient errors (429, 5xx) up to 4 times with
1s/2s/4s/8s backoff. Respects Retry-After header from provider.

Batch tool staggers elf spawns by 300ms to avoid rate limit bursts
when all elfs hit the API simultaneously (Mistral's 1 req/s limit).
2026-04-03 21:08:20 +02:00
ace9b5f273 feat: spawn_elfs batch tool for guaranteed parallel elf execution
New spawn_elfs tool takes array of tasks, spawns all elfs simultaneously.
Solves the problem of models (Mistral Small, Devstral) that serialize
tool calls instead of batching them.

Schema: {"tasks": [{"prompt": "...", "task_type": "..."}], "max_turns": 30}

Also:
- Suppress spawn_elfs tool output from chat (tree handles display)
- Update M7 milestones to reflect completed deliverables
- Add CC-inspired features to M8/M10: task notification system,
  task framework, /batch skill, coordinator mode, StreamingToolExecutor,
  git worktree isolation
2026-04-03 21:03:51 +02:00
706363f94b feat: rate limit pools, elf tree view, permission prompts, dep updates
Rate limits:
- Add PoolRPS/PoolTPM/PoolTokensMonth/PoolCostMonth pool kinds
- Provider defaults for Mistral/Anthropic/OpenAI/Google (tier-aware)
- Config override via [rate_limits.<provider>] TOML section
- Pools auto-attached to arms on registration

Elf tree view (CC-style):
- Structured elf.Progress type replaces flat string channel
- Tree with ├─/└─ branches, per-elf stats (tool uses, tokens)
- Live activity updates: tool calls, "generating… (N chars)"
- Completed elfs stay in tree with "Done (duration)" until turn ends
- Suppress raw elf output from chat (tree + LLM summary instead)
- Remove background elf mode (wait: false) — always wait
- Truncate elf results to 2000 chars for parent context
- Parallel hint in system prompt and tool description

Permission prompts:
- Show actual command in prompt: "bash wants to execute: find . -name '*.go'"
- Compact hint in separator bar: "⚠ bash: find . | wc -l [y/n]"
- PermReqMsg carries tool name + args

Other:
- Fix /model not updating status bar (session.Local.SetModel)
- Add make targets: run, check, install
- Update deps: BurntSushi/toml v1.6.0, chroma v2.23.1, x/text v0.35.0, cloud.google.com/go v0.123.0
2026-04-03 20:54:48 +02:00
1f416bac8f fix: live elf progress shows tool calls + results, not just text 2026-04-03 19:42:48 +02:00
97d5093526 feat: configurable max_turns for elfs — LLM sets via agent tool param 2026-04-03 19:37:17 +02:00
2ccc261c39 fix: elf progress — proper last-2-lines tracking, 70 char truncation 2026-04-03 19:30:18 +02:00
e0cdc891f1 feat: live elf progress in TUI
- Elf tool calls show as 🦉 [elf] <prompt> (not ⚙ [agent])
- Live 2-line progress beneath the elf label showing what the
  elf is currently outputting (grey, auto-updated)
- Agent tool forwards elf streaming events via progress channel
- Progress cleared on turn completion
- elfProgressCh wired from agent tool → TUI
2026-04-03 19:25:43 +02:00
07c739795c feat: M7 Elfs — sub-agents with router-integrated spawning
internal/elf/:
- BackgroundElf: runs on own goroutine with independent engine,
  history, and provider. No shared mutable state.
- Manager: spawns elfs via router.Select() (picks best arm per
  task type), tracks lifecycle, WaitAll(), CancelAll(), Cleanup().

internal/tool/agent/:
- Agent tool: LLM can call 'agent' to spawn sub-agents.
  Supports task_type hint for routing, wait/background mode.
  5-minute timeout, context cancellation propagated.

Concurrent tool execution:
- Read-only tools (fs.read, fs.grep, fs.glob, etc.) execute in
  parallel via goroutines.
- Write tools (bash, fs.write, fs.edit) execute sequentially.
- Partition by tool.IsReadOnly().

TUI: /elf command explains how to use sub-agents.
5 elf tests. Exit criteria: parent spawns 3 background elfs on
different providers, collects and synthesizes results.
2026-04-03 19:16:46 +02:00
ec9a918da9 feat: ctrl+o toggles tool output expand, fix auto default
- ctrl+o toggles between 10-line truncated and full tool output
- Label shows "ctrl+o to expand" (lowercase)
- Fixed: auto permission mode now sticks — config default was
  overriding flag default ("default" → "auto" in config defaults)
2026-04-03 19:00:59 +02:00
4847421b17 feat: auto permission mode, edit diffs, truncated tool output
- Default permission mode changed to 'auto' (read-only auto-allows,
  writes prompt)
- fs.edit now shows diff-style output: line numbers, context ±3 lines,
  + for added (green), - for removed (red)
- Tool output truncated to 10 lines in TUI with "+N lines (Ctrl+O
  to expand)" indicator
- Mistral SDK bumped to v1.3.0
2026-04-03 18:57:13 +02:00
f3d3390b86 feat: M6 complete — summarize strategy + tool result persistence
SummarizeStrategy: calls LLM to condense older messages into a
summary, preserving key decisions, file changes, tool outputs.
Falls back to truncation on failure. Keeps 6 recent messages.

Tool result persistence: outputs >50K chars saved to disk at
.gnoma/sessions/tool-results/{id}.txt with 2K preview inline.

TUI: /compact command for manual compaction, /clear now resets
engine history. Summarize strategy used by default (with
truncation fallback).
2026-04-03 18:51:28 +02:00
b0b393517e feat: M6 context intelligence — token tracker + truncation compaction
internal/context/:
- Tracker: monitors token usage with OK/Warning/Critical states
  (thresholds from CC: 20K warning buffer, 13K autocompact buffer)
- TruncateStrategy: drops oldest messages, preserves system prompt +
  recent N turns, adds compaction boundary marker
- Window: manages message history with auto-compaction trigger,
  circuit breaker after 3 consecutive failures

Engine integration:
- Context window tracks usage per turn
- Auto-compacts when critical threshold reached
- History syncs with context window after compaction

TUI status bar:
- Token count with percentage (tokens: 1234 (5%))
- Color-coded: green=ok, yellow=warning, red=critical

Session Status extended: TokensMax, TokenPercent, TokenState.
7 context tests.
2026-04-03 18:46:03 +02:00
8ba95091fb fix: raw text during streaming, glamour only on completed messages 2026-04-03 18:36:27 +02:00
d5ee8fbc9d feat: markdown rendering in chat via glamour
Assistant responses now rendered with full markdown support:
- Headers, bold, italic, strikethrough
- Code blocks with syntax highlighting
- Lists (ordered + unordered)
- Links, blockquotes
- Tables

Uses charm.land/glamour/v2 with dark theme. Renderer width
updates on terminal resize. Live markdown rendering during
streaming. Consistent ◆ prefix + indent for multi-line output.
2026-04-03 18:32:56 +02:00
3640f03efc fix: consistent indentation and AI icon in chat
- ❯ flush left for user input, continuation lines indented 2 spaces
- ◆ purple icon for AI responses, continuation indented
- User multiline messages: ❯ first line, indented rest
- Tool output: indented under parent
- System messages: • prefix with multiline indent
- Input area: no extra padding, ❯ at column 0
2026-04-03 18:25:37 +02:00
0498e2ff0b fix: strings.Builder panic — use pointer to avoid copy-by-value 2026-04-03 18:14:01 +02:00
fdf23ed4b7 feat: multiline input with auto-expanding textarea
Switch from textinput to textarea bubble:
- Enter submits message
- Shift+Enter / Ctrl+J inserts newline
- Input area auto-expands from 1 to 10 lines based on content
- Line numbers hidden, prompt preserved
2026-04-03 18:11:34 +02:00
f6f69cf8e5 feat: /config, /model listing, /shell stub, /help update
/model — lists all available arms from router (local + API),
  shows capabilities (tools, thinking, vision), marks current.
/config — shows resolved config (provider, model, permission,
  incognito, cwd, git branch, config file paths).
/shell — stub explaining feature is coming.
/help — updated with all commands + keyboard shortcuts.

Router passed to TUI config for arm listing.
2026-04-03 18:01:20 +02:00
279a8d43bd feat: complete 7/7 bash security checks
Added:
- Standalone semicolon check: blocks ; outside quotes (use && instead)
- Sensitive redirection check: blocks > to /etc/passwd, .bashrc,
  .ssh/authorized_keys, .env, etc.

Now all 7 security checks are active:
1. Incomplete commands, 2. Control characters, 3. Newline injection,
4. Command substitution, 5. Dangerous variables, 6. Semicolons,
7. Sensitive redirections
2026-04-03 17:56:01 +02:00
34108075d2 feat: auto-discover local models from ollama + llama.cpp
At startup, polls ollama (/api/tags) and llama.cpp (/v1/models) for
available models. Registers each as an arm in the router alongside
the CLI-specified provider.

Discovered: 7 ollama models + 1 llama.cpp model = 9 total arms.
Router can now select from multiple local models based on task type.
Discovery is non-blocking — failures logged and skipped.
2026-04-03 17:53:11 +02:00
cfa87f1c1b feat: wire config system into main — TOML config now active
config.Load() called at startup. Layered: defaults → global
(~/.config/gnoma/config.toml) → project (.gnoma/config.toml)
→ env vars. CLI flags override config values.

Config drives:
- provider.default + provider.model as defaults
- provider.api_keys for key resolution
- provider.endpoints for custom base URLs
- permission.mode + permission.rules loaded into checker
- tools.bash_timeout passed to bash tool

Example .gnoma/config.toml:
  [provider]
  default = "ollama"
  model = "qwen3:14b"
  [permission]
  mode = "bypass"
  [[permission.rules]]
  tool = "bash"
  pattern = "rm -rf"
  action = "deny"
2026-04-03 17:38:58 +02:00