7 Commits

Author SHA1 Message Date
vikingowl c4fde583f5 chore(lint): gofmt sweep + errcheck cleanups in router discovery
Apply gofmt -w across the codebase (struct field comment realignment
only — no semantic changes) and silence two errcheck warnings on
fmt.Sscanf / fmt.Fprintf return values in internal/router/discovery
with explicit `_, _ =` discards. Required so `make check` is green
before tagging v0.1.0.
2026-05-20 03:13:05 +02:00
vikingowl 9388479b03 feat(openai): lexical repair for malformed tool-call arguments
Local-model servers (Ollama, llama.cpp, llamafile) routed through the
OpenAI-compatible path frequently emit tool-call arguments that are
*almost* valid JSON — wrapped in markdown fences, padded with prose, or
trailing a stray comma. Strict parsing fails, the engine receives empty
args, and the agent loop has to retry or escalate.

Adds repairArgs(raw) at the EventToolCallDone boundary: strict-parse
first, then apply cheap lexical fixes (strip ```json fences, drop
trailing commas before }/], extract the first balanced {...} block with
proper string/escape awareness). On success, the repaired bytes flow
through unchanged; on failure, the original is returned and downstream
parsing surfaces the error as before.

Frontier providers (OpenAI proper, Anthropic, Mistral, Google) are
unaffected — their SDKs return structured args that pass strict parse.
The repair only does work when the upstream output is malformed.

11 unit tests cover: valid passthrough, empty, trailing commas,
single/double-line fences, prose-wrapped, braces-inside-strings,
multiple top-level objects (takes the first), and unrepairable input.
A stream-level test verifies the wiring through flushNextToolCall.
2026-05-19 17:59:05 +02:00
vikingowl d71bd942c4 feat: local model reliability — SDK retries, capability probing, init skill, context compaction
Three compounding bugs prevented tool calling with llama.cpp:
- Stream parser set argsComplete on partial JSON (e.g. "{"), dropping
  subsequent argument deltas — fix: use json.Valid to detect completeness
- Missing tool_choice default — llama.cpp needs explicit "auto" to
  activate its GBNF grammar constraint; now set when tools are present
- Tool names in history used internal format (fs.ls) while definitions
  used API format (fs_ls) — now re-sanitized in translateMessage

Additional changes:
- Disable SDK retries for local providers (500s are deterministic)
- Dynamic capability probing via /props (llama.cpp) and /api/show
  (Ollama), replacing hardcoded model prefix list
- Engine respects forced arm ToolUse capability when router is active
- Bundled /init skill with Go template blocks, context-aware for local
  vs cloud models, deduplication rules against CLAUDE.md
- Tool result compaction for local models — previous round results
  replaced with size markers to stay within small context windows
- Text-only fallback when tool-parse errors occur on local models
- "text-only" TUI indicator when model lacks tool support
- Session ResetError for retry after stream failures
- AllowedTools per-turn filtering in engine buildRequest
2026-04-13 02:01:01 +02:00
vikingowl 2093beea58 fix: deterministic 500 retry, OpenAI error wrapping, local /init prompt
Stop retrying llama.cpp 500s that are deterministic tool-parse failures
by inspecting the error message body (ClassifyHTTPError). Wrap OpenAI SDK
errors as ProviderError so the engine's retry logic classifies them. Add
localInitPrompt for local models that uses sequential fs_* calls instead
of spawn_elfs (which local models can't produce reliably).
2026-04-12 18:35:18 +02:00
vikingowl 4f1e0cf567 feat: Ollama/gemma4 compat — /init flow, stream filter, safety fixes
provider/openai:
- Fix doubled tool call args (argsComplete flag): Ollama sends complete
  args in the first streaming chunk then repeats them as delta, causing
  doubled JSON and 400 errors in elfs
- Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep)
- Add Reasoning field support for Ollama thinking output

cmd/gnoma:
- Early TTY detection so logger is created with correct destination
  before any component gets a reference to it (fixes slog WARN bleed
  into TUI textarea)

permission:
- Exempt spawn_elfs and agent tools from safety scanner: elf prompt
  text may legitimately mention .env/.ssh/credentials patterns and
  should not be blocked

tui/app:
- /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge
  (ask for plain text output) → TUI fallback write from streamBuf
- looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback
  content before writing (reject refusals, strip narrative preambles)
- Collapse thinking output to 3 lines; ctrl+o to expand (live stream
  and committed messages)
- Stream-level filter for model pseudo-tool-call blocks: suppresses
  <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|>
  from entering streamBuf across chunk boundaries
- sanitizeAssistantText regex covers both block formats
- Reset streamFilterClose at every turn start
2026-04-05 19:24:51 +02:00
vikingowl 9608436b52 feat: add OpenAI-compat adapter for Ollama and llama.cpp
Thin wrapper over OpenAI adapter with custom base URLs.
Ollama: localhost:11434/v1, llama.cpp: localhost:8080/v1.
No API key required for local providers.

Fixed: initial tool call args captured on first chunk
(Ollama sends complete args in one chunk, not as deltas).

Live verified: text + tool calling with qwen3:14b on Ollama.
Five providers now live: Mistral, Anthropic, OpenAI, Google, Ollama.
2026-04-03 13:47:30 +02:00
vikingowl 261c19f90f feat: add OpenAI provider adapter
Streaming, tool use (index-based delta accumulation), tool name
sanitization (fs.read → fs_read), StreamOptions.IncludeUsage for
token tracking. Hardcoded model list (gpt-4o, gpt-4o-mini, o3, o3-mini).

Wired into CLI with OPENAI_API_KEY env support.
Live verified: text streaming + tool calling with gpt-4o.
2026-04-03 13:33:55 +02:00