122 Commits

Author SHA1 Message Date
vikingowl 995b08dc0f feat(engine): M8 cleanup — Wave B skill enforcement
- Add tool.PathSensitiveTool interface (ExtractPaths); implement on all 6 fs tools
- Add engine.TurnOptions.AllowedPaths: restricts tool filesystem access per skill invocation
- Bash is denied outright when AllowedPaths is active (unparseable command args)
- fs tools with empty path (cwd default) resolved via os.Getwd() and validated
- Add engine.TurnOptions.AllowedTools + AllowedPaths wiring in pipe mode (main.go) and TUI skill dispatch (tui/app.go)
- Remove TODO(M8.3) from skill.Frontmatter — enforcement is now complete
2026-05-07 15:29:33 +02:00
vikingowl fc465e5f29 feat(engine): M8 cleanup — Wave A wiring gaps
- Remove stale TODO(P0c) comment from main.go (resolved by P0c tier routing)
- Wire config.Provider.Temperature → engine.Config.Temperature → provider.Request
- Add WithMaxFileSize option to fs.write; wire cfg.Tools.MaxFileSize in main.go
- Wire router.ReportOutcome after each runLoop return (success = err == nil)
- Fix nil-callback guard on EventRouting dispatch (pre-existing bug exposed by new test)
2026-05-07 15:22:22 +02:00
vikingowl 0c65eeda08 docs: consolidated roadmap, ADR-013, drop stale plans
- New 7-phase roadmap (2026-05-07-gnoma-roadmap.md) covering M8 cleanup,
  PTY interactive shell, SLM classifier, router revisit, USP security,
  ELF support, and distribution
- ADR-013 (002-slm-routing.md): SLM-first routing supersedes ADR-009;
  Thompson Sampling deferred pending SLM production data
- ADR-009 status updated to "Superseded by ADR-013"
- gemma-integration-analysis.md: header note that Node.js specifics
  (LiteRT-LM, daemon, PID) don't apply to gnoma's Go implementation
- TODO.md replaced with thin pointer to roadmap + stable backlog
- Deleted stale plan/spec files: m6-m7-closeout, m8-hooks-design
2026-05-07 15:06:54 +02:00
vikingowl 76988453ab docs: note routing revisit after SLM integration 2026-05-07 14:41:37 +02:00
vikingowl 640860404a feat(router): tier-based routing — CLI > local > API, disabled arms
Adds explicit tier preference to arm selection so the router
deterministically prefers lower-cost arms before falling back:

  tier 0: CLI agents (IsCLIAgent=true, subprocess/claude|gemini|vibe)
  tier 1: local models (IsLocal=true, ollama/llamacpp)
  tier 2: API providers (everything else)

Within a tier, quality/cost scoring still applies. filterFeasible still
gates on quality thresholds, so a low-quality local arm won't beat a
high-quality API arm when the task's minimum threshold rules it out.

Also adds Arm.Disabled: arms with Disabled=true are excluded from
auto-routing but remain selectable via ForceArm.

Implementation: armTier helper + selectBest refactored to try tiers in
order, bestScored picks within a tier. router.Select skips disabled arms
in allArms collection (forced arm bypasses disable check).
2026-05-07 14:36:36 +02:00
vikingowl f213d8f9ce feat(provider): subprocess CLI provider for claude, gemini, vibe
Adds internal/provider/subprocess — a provider.Provider that spawns CLI
agents (claude, gemini, vibe) as subprocesses and streams their output.

- FormatParser interface + three parsers for claude-stream-json,
  gemini-stream-json, and vibe-streaming formats; fixtures captured from
  real binaries
- subprocessStream: pull-based stream.Stream over subprocess stdout with
  bounded stderr capture (8KB) and guarded reap() to prevent double-Wait
- DiscoverCLIAgents: parallel PATH scan with 10s timeout, stable ordering
- Provider: only the last user message is passed as --prompt; all other
  request fields (history, tools, system prompt) are intentionally ignored
  (see package doc)
- main.go: discover and register CLI arms at startup; TODO(P0c) for
  tier-based routing to enforce preference order explicitly
2026-05-07 14:29:34 +02:00
vikingowl f9b8c1886b feat(router): normalize effort/thinking abstraction across providers
Add EffortLevel (auto/low/medium/high) as a provider-agnostic reasoning
control, replacing the Capabilities.Thinking bool. Each provider maps
the level to its native parameter: Anthropic budget tokens (1K/8K/16K),
OpenAI reasoning_effort (low/medium/high), Google thinking budget
(1K/8K/16K). Task classification auto-infers effort from TaskType and
complexity; filterFeasible excludes arms that lack the required level.
2026-05-07 14:08:50 +02:00
vikingowl 86b5169bff docs: update TODO with Native SLM Runtime integration
- Replace Gemma Integration with expanded SLM Preflight Engine section
- Add Deep Intent Routing (Skill Decomposer, Context Flattener, HITL toggle)
- Add Security & Iron Law Integration (USP Pre-Audit, Hallucination Gate)
- Include Recommended Tiny Stack table (Gemma 3 270M, ollama/llm, Q4_K_M GGUF)
- Document the Integrated Flow for local vs frontier routing

Generated by Mistral Vibe.
Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
2026-05-07 11:36:00 +02:00
vikingowl e6489b31a9 docs: add TODO roadmap for gemma routing, USP integration, local tmp, and ELF support 2026-05-07 00:21:52 +02:00
vikingowl 3873f90f83 feat: local model reliability — SDK retries, capability probing, init skill, context compaction
Three compounding bugs prevented tool calling with llama.cpp:
- Stream parser set argsComplete on partial JSON (e.g. "{"), dropping
  subsequent argument deltas — fix: use json.Valid to detect completeness
- Missing tool_choice default — llama.cpp needs explicit "auto" to
  activate its GBNF grammar constraint; now set when tools are present
- Tool names in history used internal format (fs.ls) while definitions
  used API format (fs_ls) — now re-sanitized in translateMessage

Additional changes:
- Disable SDK retries for local providers (500s are deterministic)
- Dynamic capability probing via /props (llama.cpp) and /api/show
  (Ollama), replacing hardcoded model prefix list
- Engine respects forced arm ToolUse capability when router is active
- Bundled /init skill with Go template blocks, context-aware for local
  vs cloud models, deduplication rules against CLAUDE.md
- Tool result compaction for local models — previous round results
  replaced with size markers to stay within small context windows
- Text-only fallback when tool-parse errors occur on local models
- "text-only" TUI indicator when model lacks tool support
- Session ResetError for retry after stream failures
- AllowedTools per-turn filtering in engine buildRequest
2026-04-13 02:01:01 +02:00
vikingowl 99529e6156 fix: deterministic 500 retry, OpenAI error wrapping, local /init prompt
Stop retrying llama.cpp 500s that are deterministic tool-parse failures
by inspecting the error message body (ClassifyHTTPError). Wrap OpenAI SDK
errors as ProviderError so the engine's retry logic classifies them. Add
localInitPrompt for local models that uses sequential fs_* calls instead
of spawn_elfs (which local models can't produce reliably).
2026-04-12 18:35:18 +02:00
vikingowl 0adf118675 fix(router): discovery loop removes forced arm, breaking routing
The discovery loop's reconcileArms removed the CLI-forced arm
(llamacpp/default) because the llama.cpp server reports the real model
name (e.g. gemma-26b), creating a mismatch. After 30s the forced arm
disappeared and all subsequent requests failed.

Three-layer fix:
- Eager: query the specific provider at startup to resolve the real
  model name before registering the forced arm
- Lazy: reconcileArms detects placeholder "default" arm names and
  atomically renames them when discovery reveals the real identity,
  with an onReconcile callback to update the session and TUI
- Guard: the forced arm is never garbage-collected by the removal loop

Also fixes misleading /init error messaging — failed inits now show
"loaded from disk (init failed)" instead of "AGENTS.md written to".
2026-04-12 17:51:30 +02:00
vikingowl 88e6bdb2a4 feat(tui): Tier 3-4 UX improvements — split, routing, session naming, context bar
- Split app.go (2091→1378 lines) into rendering.go, events.go, init.go
- Add EventRouting stream event for router arm transparency
- Add session auto-naming from first user message
- Add context window progress bar in status bar
- Add /keys cheatsheet, /replay for resumed sessions
- Add inline cost-per-turn after assistant responses
- Add diff previews in fs.write/fs.edit permission prompts
- Collapse tool output to 3 lines by default (ctrl+o expands)
- Use AddPrefix for system context instead of InjectMessage
- Handle ContentThinking and ContentToolResult in session resume
- Show session title in resume picker
- Add /model numeric selection snapshot safety
2026-04-12 05:13:16 +02:00
vikingowl 0b1f8cb5ec feat(tui): Tier 1-2 UX improvements — completions, usage, provider status
Tier 1 (launch blockers):
- Remove /shell from /help (advertised but unimplemented)
- Kill dead _ = closeLen assignment
- Cache glamour renderer by width — no longer recreated on every
  WindowSizeMsg when width hasn't changed

Tier 2 (ship-quality UX):
- Slash command ghost-text completion with Tab accept. Sources: static
  command list + dynamic skill names. /permission gets arg completion
  for the 6 modes.
- /compact reports before/after token counts (e.g. "32k → 18k tokens")
- /provider shows all registered arms grouped by provider, not just
  "restart required"
- /usage command: input/output/total tokens, context %, provider, turns
- Widen Ctrl+C quit window from 1s to 2s
- "new content below" indicator when scrolled up during streaming
- Permission prompt: inline chat notification when approval needed,
  so the user notices even if focused on input
2026-04-12 04:19:55 +02:00
vikingowl e5a1d21f53 fix: append mutation, pipe-mode hang, Mistral regex false positives
- Fix append footgun: allHooks/allMCPServers allocated fresh to avoid
  mutating cfg's backing array (lines 391/413 in main.go)
- Fix pipe-mode permission prompt: detect no-TTY stdin and auto-deny
  instead of blocking forever on fmt.Scanln EOF
- Tighten Mistral API key regex from bare [a-zA-Z0-9]{32} (matched
  commit hashes, UUIDs) to context-gated pattern requiring "mistral"
  keyword nearby. Added scanner test for positives and negatives.
- Remove README demo GIF TODO placeholder
- Unify version string: pass buildVersion from ldflags into tui.Config
  instead of hardcoding "v0.1.0-dev"
- Populate benchmarks doc with actual Go benchmark results
2026-04-12 03:49:47 +02:00
vikingowl d7b524664d fix(m8): replace_default map, error UX, benchmarks, and launch prep
- Fix replace_default positional bug: []string → map[string]string for
  explicit MCP tool → built-in name mapping
- Improve error messages for missing API keys (3 actionable options) and
  unknown providers (early validation with available list)
- Remove python3 dependency from MCP tests (pure bash grep/sed parsing)
- Add router benchmark scaffold (6 benchmarks in bench_test.go + docs)
- Add .goreleaser.yml for cross-platform binary releases with ldflags
- Add launch-ready README with quickstart, extensibility docs, GIF placeholder
- Add CONTRIBUTING.md and Gitea issue templates (bug report, feature request)
2026-04-12 03:34:58 +02:00
vikingowl d2d79d65da feat(m8): MCP client, tool replaceability, and plugin system
Complete the remaining M8 extensibility deliverables:

- MCP client with JSON-RPC 2.0 over stdio transport, protocol
  lifecycle (initialize/tools-list/tools-call), and process group
  management for clean shutdown
- MCP tool adapter implementing tool.Tool with mcp__{server}__{tool}
  naming convention and replace_default for swapping built-in tools
- MCP manager for multi-server orchestration with parallel startup,
  tool discovery, and registry integration
- Plugin system with plugin.json manifest (name/version/capabilities),
  directory-based discovery (global + project scopes with precedence),
  loader that merges skills/hooks/MCP configs into existing registries,
  and install/uninstall/list lifecycle manager
- Config additions: MCPServerConfig, PluginsSection with opt-in/opt-out
  enabled/disabled resolution
- TUI /plugins command for listing installed plugins
- 54 tests across internal/mcp and internal/plugin packages
2026-04-12 03:09:05 +02:00
vikingowl d51f76aee0 docs: mark M8.2 skill system deliverables complete in milestones.md 2026-04-07 02:25:29 +02:00
vikingowl 4a37e84114 feat(skill): enhanced coordinator prompt with fan-out and concurrency guidance 2026-04-07 02:24:49 +02:00
vikingowl 338d4a12b8 feat(skill): pipe mode support and main.go wiring 2026-04-07 02:19:42 +02:00
vikingowl 71b0cf9490 feat(skill): TUI integration — /skillname invokes skills, /skills lists them 2026-04-07 02:18:12 +02:00
vikingowl 995b26ffe7 feat(skill): registry with multi-directory loading and precedence 2026-04-07 02:17:17 +02:00
vikingowl 42fc2adcd8 feat(skill): bundled /batch skill with go:embed 2026-04-07 02:16:35 +02:00
vikingowl 327e4d74c0 feat(skill): template rendering with Go text/template 2026-04-07 02:15:51 +02:00
vikingowl 106694e36a feat(skill): core Skill type and YAML frontmatter parser 2026-04-07 02:05:49 +02:00
vikingowl 62e1e4f11d chore: add gopkg.in/yaml.v3 for skill frontmatter parsing 2026-04-07 02:04:36 +02:00
vikingowl 8fdad3dd31 docs: mark M8.1 hook system deliverables complete in milestones.md 2026-04-07 01:09:07 +02:00
vikingowl a9df023732 feat: wire hook dispatcher in main.go — SessionStart, SessionEnd, PreCompact 2026-04-07 01:08:40 +02:00
vikingowl 03d3a5d016 feat: engine hook integration — PreToolUse, PostToolUse, Stop 2026-04-07 01:02:55 +02:00
vikingowl cf792c7059 feat: AgentExecutor — elf-based hook evaluation via elf.Manager 2026-04-07 00:55:19 +02:00
vikingowl 1aa1d83e9e feat: PromptExecutor — LLM-based hook evaluation via router 2026-04-07 00:53:53 +02:00
vikingowl 7d0b9c222f feat: ParseHookDefs — config to HookDef conversion with validation 2026-04-07 00:52:00 +02:00
vikingowl d27dea5ee0 feat: hook config schema with user+project merge ordering 2026-04-07 00:50:53 +02:00
vikingowl 5701bc2033 feat: Dispatcher — handler chain dispatch, filtering, transform chaining 2026-04-07 00:48:08 +02:00
vikingowl bf50aa234f feat: CommandExecutor — shell hook execution with stdin/stdout protocol 2026-04-07 00:38:48 +02:00
vikingowl 345fa5bbd4 feat: hook payload marshal/unmarshal helpers 2026-04-07 00:37:40 +02:00
vikingowl ff9fd8c294 feat: hook core types — EventType, Action, CommandType, HookDef, Executor 2026-04-07 00:36:18 +02:00
vikingowl 451c79aaf6 docs: M8.1 hook system design spec 2026-04-06 02:42:34 +02:00
vikingowl ada9c5c442 docs: mark M7 deliverables complete in milestones.md 2026-04-06 00:59:16 +02:00
vikingowl 9b1d6ca100 test: M7 audit — quality feedback, coordinator, agent tool coverage
Quality feedback integration: TestQualityTracker_InfluencesArmSelection
verifies that 5 successes vs 5 failures tips Router.Select() to the
high-quality arm once EMA has enough observations. Companion test
confirms heuristic fallback below minObservations.

Coordinator tests expanded from 2 → 5: added guidance content check
(parallel/serial/synthesize present), false-positive table extended with
7 cases including the reordered keywords from the previous fix.

Agent tool suite: tool interface contracts for all four tools (Name,
Description, Parameters validity, IsReadOnly). Extracted duplicated
2000-char truncation into truncateOutput() helper (format.go), removing
the inline copies in agent.go and batch.go. Four boundary tests cover
empty, short, exact-max, and over-max cases.
2026-04-06 00:59:12 +02:00
vikingowl 15345540f2 fix: ClassifyTask priority ordering — orchestration below operational types
Operational task types (debug, review, refactor, test, explain) now gate
before orchestration in the keyword cascade. Previously, prompts like
"review the orchestration layer" or "refactor the pipeline dispatch"
matched "orchestrat"/"dispatch" and misclassified as TaskOrchestration.
Planning is also moved below the operational types.

Expanded orchestration keywords to cover common intent that the original
four keywords missed: "fan out", "subtask", "delegate to", "spawn elf".

Adds regression tests for false-positive cases and positive tests for new
keywords.
2026-04-06 00:58:54 +02:00
vikingowl 32c8691a43 docs: document session persistence, resume flags, and incognito mode 2026-04-06 00:30:37 +02:00
vikingowl 8f10b92ae1 feat: interactive session picker for /resume and --resume 2026-04-06 00:22:52 +02:00
vikingowl 2d41c2d46c fix: session security and correctness — path traversal, turn count restore, incognito quality leak
- store: validate session ID against store root to block path traversal in Load/Save
- local: seed turnCount from LocalConfig.TurnCount so resumed sessions keep correct turn count
- main: pass TurnCount from snapshot to LocalConfig on resume
- main: suppress quality.json save when --incognito is active
- main: handle UserConfigDir error in quality save defer instead of silently using wrong path
- test: add TestSessionStore_Load/Save_RejectsPathTraversal
2026-04-06 00:04:09 +02:00
vikingowl 87f8d2697f feat: wire --resume/-r CLI flags, SessionStore, quality persistence
- Add --resume/-r flags; empty = list sessions, ID = restore specific session
- Create SessionStore from config.ProjectRoot() and cfg.Session.MaxKeep
- Wire SessionID and Store into session.NewLocal
- Restore QualityTracker EMA data from ~/.config/gnoma/quality.json at startup
- Persist QualityTracker data to quality.json via defer on process exit
2026-04-05 23:52:05 +02:00
vikingowl 4597d4cb08 feat: /resume TUI command + SessionStore in tui.Config
- Add SessionStore field to tui.Config
- Add /resume slash command: lists sessions or restores by ID
- Pass SessionStore to tui.New in main.go
- Update /help text to include /resume
- Add .gnoma/sessions/ to .gitignore
2026-04-05 23:51:48 +02:00
vikingowl 16193de4fc feat: LocalConfig + auto-save hook in session.Local
Refactor NewLocal to accept LocalConfig (matching engine/router patterns),
add persistence fields (SessionID, Store, Incognito, Logger), capture
finalState before releasing the lock to avoid data races, and auto-save
a Snapshot after each successful turn when a store is configured.
Add SessionID() to the Session interface and three new tests covering
auto-save, no-store no-panic, and SessionID accessors.
2026-04-05 23:46:48 +02:00
vikingowl ca163dc8d4 feat: QualityTracker.Snapshot/Restore + Router.QualityTracker() for cross-session persistence 2026-04-05 23:40:19 +02:00
vikingowl 20fb045cba feat: Engine.SetHistory/SetUsage/SetActivatedTools for session restore 2026-04-05 23:39:38 +02:00
vikingowl bbd7791428 feat: add Session config section (max_keep for session retention) 2026-04-05 23:37:10 +02:00