gnoma

Author	SHA1	Message	Date
vikingowl	d71bd942c4	feat: local model reliability — SDK retries, capability probing, init skill, context compaction Three compounding bugs prevented tool calling with llama.cpp: - Stream parser set argsComplete on partial JSON (e.g. "{"), dropping subsequent argument deltas — fix: use json.Valid to detect completeness - Missing tool_choice default — llama.cpp needs explicit "auto" to activate its GBNF grammar constraint; now set when tools are present - Tool names in history used internal format (fs.ls) while definitions used API format (fs_ls) — now re-sanitized in translateMessage Additional changes: - Disable SDK retries for local providers (500s are deterministic) - Dynamic capability probing via /props (llama.cpp) and /api/show (Ollama), replacing hardcoded model prefix list - Engine respects forced arm ToolUse capability when router is active - Bundled /init skill with Go template blocks, context-aware for local vs cloud models, deduplication rules against CLAUDE.md - Tool result compaction for local models — previous round results replaced with size markers to stay within small context windows - Text-only fallback when tool-parse errors occur on local models - "text-only" TUI indicator when model lacks tool support - Session ResetError for retry after stream failures - AllowedTools per-turn filtering in engine buildRequest	2026-04-13 02:01:01 +02:00
vikingowl	2093beea58	fix: deterministic 500 retry, OpenAI error wrapping, local /init prompt Stop retrying llama.cpp 500s that are deterministic tool-parse failures by inspecting the error message body (ClassifyHTTPError). Wrap OpenAI SDK errors as ProviderError so the engine's retry logic classifies them. Add localInitPrompt for local models that uses sequential fs_* calls instead of spawn_elfs (which local models can't produce reliably).	2026-04-12 18:35:18 +02:00
vikingowl	0caab0fed1	fix(router): discovery loop removes forced arm, breaking routing The discovery loop's reconcileArms removed the CLI-forced arm (llamacpp/default) because the llama.cpp server reports the real model name (e.g. gemma-26b), creating a mismatch. After 30s the forced arm disappeared and all subsequent requests failed. Three-layer fix: - Eager: query the specific provider at startup to resolve the real model name before registering the forced arm - Lazy: reconcileArms detects placeholder "default" arm names and atomically renames them when discovery reveals the real identity, with an onReconcile callback to update the session and TUI - Guard: the forced arm is never garbage-collected by the removal loop Also fixes misleading /init error messaging — failed inits now show "loaded from disk (init failed)" instead of "AGENTS.md written to".	2026-04-12 17:51:30 +02:00
vikingowl	ce5f9d3dc9	feat(tui): Tier 3-4 UX improvements — split, routing, session naming, context bar - Split app.go (2091→1378 lines) into rendering.go, events.go, init.go - Add EventRouting stream event for router arm transparency - Add session auto-naming from first user message - Add context window progress bar in status bar - Add /keys cheatsheet, /replay for resumed sessions - Add inline cost-per-turn after assistant responses - Add diff previews in fs.write/fs.edit permission prompts - Collapse tool output to 3 lines by default (ctrl+o expands) - Use AddPrefix for system context instead of InjectMessage - Handle ContentThinking and ContentToolResult in session resume - Show session title in resume picker - Add /model numeric selection snapshot safety	2026-04-12 05:13:16 +02:00

4 Commits