gnoma

Author	SHA1	Message	Date
vikingowl	c065a2dea7	fix(provider/openai): wire ResponseFormat into OpenAI request params provider.Request.ResponseFormat was being silently dropped by the openai translation layer (translate.go:translateRequest). The upstream provider type and the openai-go SDK both supported it; the adapter just never propagated it. This is why Move 1 (set ResponseFormat=ResponseJSON in the SLM classifier) produced zero observable change: the field made it from the classifier into provider.Request but stopped at the OpenAI translation step. The ollama backend (used via the OpenAI-compatible endpoint) therefore never received format=json_object and kept emitting free-form prose, which the classifier's downstream JSON parser duly rejected — 50 fallbacks in a row across two model swaps. Translate provider.ResponseJSON to oai.ResponseFormatJSONObjectParam and provider.ResponseText to oai.ResponseFormatTextParam; leave the union zero-valued when the caller didn't set ResponseFormat so the SDK omits the field per its omitzero tag. Three table cases cover the json / text / unset paths. Affects ollama, llama.cpp, llamafile, and any other backend reached via openaicompat — all run through openai.translateRequest.	2026-05-25 01:26:38 +02:00
vikingowl	d38d7daf25	fix(subprocess/agy): disable ToolUse until stream-json lands agy is registered with FormatAgyText and the agyParser emits every stdout line as a plain EventTextDelta. There is no path for a structured ToolCall event to come back. With ToolUse=true the router would dispatch tool-needing tasks (security_review, spawn_elfs, file edit) to agy; the underlying Gemini model would describe calling the tool in prose — invented UUIDs and 'I will pause now'-style stubs — the engine would receive only text, and the turn would hang waiting for a tool call that never arrives. Surfaced when /init routed to agy for a security_review task and elf spawning visibly hallucinated in the TUI. Capability flag flipped to false; agy stays usable for tool-free prompts (explain, summarize, simple chat). TODO entry for native stream-json updated to flag that the capability flip is part of that same change.	2026-05-24 21:58:22 +02:00
vikingowl	a23eb6b92c	style: gofmt drift from prior commits Pure whitespace cleanup surfaced when 'make check' ran gofmt over the tree. Mostly struct-field column alignment in internal/safety/banner.go (SessionInfo) and the var(...) flag block in cmd/gnoma/main.go after --dangerously-allow-anywhere was added without realignment. Verified zero substantive changes via 'git diff --ignore-all-space --ignore-blank-lines'.	2026-05-24 16:33:17 +02:00
vikingowl	2f8d4c412f	feat(router): cloud-arm defaults, gpt-5.3-codex registration Closes R-4 and R-5 of the routing-defaults plan. R-4: Strengths + CostWeight defaults for closed frontier models. Cloud entries land in the same knownFamilyDefaults table as local ones, with MaxComplexity intentionally left zero (cloud arms get no complexity ceiling). CostWeight tuned per the plan's rationale: claude-opus-4-7 → Planning/SecurityReview/Debug/Refactor, 0.3 claude-sonnet-4-6 → Generation/Refactor/Review, 0.7 gpt-5.5 → Planning/SecurityReview/Generation, 0.3 gpt-5.3-codex → Generation/Refactor/Debug/UnitTest, 0.6 gpt-5.2 → Orchestration/Review, 0.8 gemini-3.1-pro → Planning/Review/Orchestration, 0.5 gemini-3.5-flash → Boilerplate/Explain/Orchestration, 1.2 The 0.3 weight on frontier arms keeps them competitive on SecurityReview / Planning despite $4+/Mtok; 1.2 on Gemini Flash penalizes cost more so it only wins when cost is genuinely decisive (boilerplate, explain). Mechanism: extracted applyFamilyDefaults into defaults.go and call it from Router.RegisterArm. Single source of truth: both local discovery and the primary-provider path in cmd/gnoma/main.go now flow through the same defaults application. Removed the duplicate apply block from RegisterDiscoveredModels. Legacy model IDs (claude-opus-4-20250514, gpt-4o, o3, gemini-2.5-pro, etc.) intentionally do not match any table entry, which keeps users on pinned older models safe from imposed 2026 Strengths. R-5: gpt-5.3-codex registration. - internal/provider/openai/provider.go: added to fallbackModels and inferOpenAIModelCapabilities (400K context, 32K output). - internal/provider/ratelimits.go: gpt-5.3-codex and its dated alias gpt-5.3-codex-2026-02-15 added with the same Tier 1 quotas as gpt-5.2. Gemini 3.x (3.1-pro-preview, 3.5-flash, 3.1-flash-lite) was already registered in both google/provider.go and ratelimits.go, so no change needed for that part of R-5. Test coverage: - ResolveFamilyDefaults table-driven across all 7 cloud entries including prefix-sharing (gpt-5.5-pro → gpt-5.5 defaults, gemini-3.1-pro-preview → gemini-3.1-pro defaults). - Legacy IDs return !ok. - RegisterArm applies cloud defaults end-to-end. - User-supplied Strengths and CostWeight are not overridden. - ID.Model() fallback works when ModelName is empty (test code often constructs arms this way). Refs: docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md	2026-05-23 21:39:48 +02:00
vikingowl	1606d19366	feat(subprocess/codex): account for cached and reasoning tokens codex 0.133.0 emits two token-accounting fields at top level that we previously dropped: cached_input_tokens — subset of input_tokens that hit the prompt cache (cheaper, but still counted in input_tokens per OpenAI Responses API semantics) reasoning_output_tokens — separately reported billable thinking tokens on reasoning-capable models Map cached_input_tokens to message.Usage.CacheReadTokens and subtract it from InputTokens. message.Usage.Add() sums InputTokens and CacheReadTokens as peers, so the uncached residual goes in InputTokens — matches the anthropic provider's convention and keeps cumulative usage tracking arithmetically correct. Fold reasoning_output_tokens into OutputTokens for accurate cost tracking. The top-level peer positioning (vs nested in output_tokens_details) implies a separately counted billable quantity, not a subset of output_tokens. Defensive clamp at zero in case a future codex build reports cached > input due to schema drift. Includes a verbatim regression guard against the live 2026-05-22 codex 0.133.0 output to catch schema changes early.	2026-05-22 13:35:57 +02:00
vikingowl	ea1a5361e2	chore: restore agy JSON-output TODO; idiomatic t.TempDir() in google test The worktree commit `12a6b83` dropped the "Native agy JSON output" backlog item alongside removing the agy agent. Since we restored agy in this branch, the TODO is relevant again — agy v1.0.0 still emits plain text and the prompt-augmentation fallback should be replaced by --output-format stream-json once the CLI supports it. Switch TestTryLoadOAuthCredentials_Formats to t.TempDir() to drop the unchecked os.RemoveAll defer that golangci-lint's errcheck caught after the merge.	2026-05-22 12:17:10 +02:00
vikingowl	246997c4be	Merge branch 'feat/agy-sdk-integration' into dev Brings in the Google auth precedence work (agy > gemini > ADC credential walk, fileTokenProvider expiry handling, slog-backed error reporting), the Codex CLI integration as a new subprocess agent, and the restoration of the agy subprocess agent that was accidentally removed by the initial codex commit. Sandbox-bypass flags on both agy and codex are now opt-out via env vars (GNOMA_AGY_BYPASS_PERMISSIONS, GNOMA_CODEX_BYPASS_SANDBOX). Includes review-driven fixes: - ADC fallback now uses real DetectOptions (cloud-platform scope) - fileTokenProvider returns an error on expired tokens instead of shipping a known-dead bearer - TestNew_Precedence asserts which credential was actually picked - codex parser tolerates non-JSON banner / debug lines on stdout - codex usage takes max(input_tokens, prompt_tokens) so accounting can't silently undercount No conflicts expected with the dev image-content feature: the worktree branch only touches the google and subprocess provider families.	2026-05-22 12:15:32 +02:00
vikingowl	afc31b0af4	fix(subprocess): restore agy alongside codex; env-gate sandbox bypass The original commit on this branch replaced the agy subprocess agent with codex (overwriting the slot in knownAgents, deleting agy_test.go and the agyParser). That was unintentional — agy (antigravity) is a distinct CLI from codex (OpenAI's). Antigravity will replace gemini when gemini retires on 2026-06-16, so it needs to keep its own slot. Restored: FormatAgyText constant, agyParser with newAgyParser and the line-delimited text parser, the agy CLIAgent entry in knownAgents with PromptResponseFormat:true, agy_test.go, and the agy case in newParser. Sourced from the parent commit so behavior matches what shipped before the codex change. Sandbox bypass: both agy (--dangerously-skip-permissions) and codex (--dangerously-bypass-approvals-and-sandbox) need a flag to run non-interactively (their stdin is closed; without it they block on approval prompts nobody can answer). Both default to ON for out-of-box behavior; operators with pre-approved trust config can opt out via GNOMA_AGY_BYPASS_PERMISSIONS=0 or GNOMA_CODEX_BYPASS_SANDBOX=0. Tests cover the on / opt-out / unknown value branches. TestKnownAgents_ValidFormats updated to accept the restored FormatAgyText.	2026-05-22 12:14:54 +02:00
vikingowl	1717f9f567	fix(subprocess/codex): tolerate non-JSON stdout, max-of-token-paths Codex emits banner / debug / "starting turn" lines to stdout interleaved with the JSON event stream. The parser previously returned an error on any line that wasn't a JSON object, which subprocessStream.Next treats as terminal — one stray banner aborted the whole turn. Skip lines that don't start with `{` after whitespace trim, and downgrade unparseable JSON-looking lines to a slog.Debug so they don't kill the stream either. Token accounting: usage payloads from newer codex builds occasionally carry both input_tokens and prompt_tokens (and likewise output / completion) with slightly different values. Always use the larger of the two so we can't silently undercount. Tests cover non-JSON banner skipping, malformed-JSON non-fatal-skip, and the max() behavior with both token fields populated.	2026-05-22 12:08:32 +02:00
vikingowl	f83ace7ad6	fix(google): real ADC scopes, expired-token rejection, error reporting credentials.DetectDefault(nil) always returns "options must be provided", which made the ADC branch unreachable. Pass an explicit DetectOptions with the cloud-platform scope so users with GOOGLE_APPLICATION_CREDENTIALS or `gcloud auth application-default login` actually flow through ADC instead of falling out as "no credentials found". fileTokenProvider.Token used to return expired tokens unchanged. We don't perform an OAuth refresh exchange (the upstream CLI does that out-of-band into the file we read), so when the file isn't fresh the only safe move is to fail loudly with an actionable message rather than ship a known-dead bearer that genai forwards to Vertex AI and gets back a confusing 401. tryLoadOAuthCredentials previously swallowed all errors equally, so the precedence walker silently skipped past misconfigured files (chmod 0600 on the wrong user, half-written JSON, etc.). Now os.IsNotExist is silent (normal walking), everything else gets a slog.Warn with the path so an unreadable file is visible. selectOAuthCredentials extracts the precedence chain into a testable helper that also returns a CredentialSource tag identifying which path was chosen. The previous precedence test only asserted err == nil; the new test verifies that the agy file wins when both are present and that the fallback to gemini actually loads the gemini token.	2026-05-22 12:08:22 +02:00
vikingowl	c5cc98ed8a	feat(provider/openai): translate user image content to image_url parts When the user message has at least one ImageContent block, build a ChatCompletionContentPartUnionParam array with text + image_url parts instead of the string content path. Image bytes are inlined as a base64 data URL (data:<media-type>;base64,...). Adjacent text blocks are merged into a single TextContentPart. Pure-text user messages stay on the existing string fast path. This covers OpenAI direct + every openaicompat backend (Ollama, llama.cpp, llamafile) since they all share the same provider. Tests: pure text uses OfString; image present emits 2 content parts (text + image_url with the expected base64 payload); nil-Image blocks are dropped and adjacent text merges correctly.	2026-05-22 11:50:55 +02:00
vikingowl	12a6b83cc9	feat: implement Google auth precedence and Codex integration	2026-05-22 00:21:32 +02:00
vikingowl	99fa0ff08e	refactor(providers): refresh defaults to current 2026 model lineup Bump hard-coded provider defaults to the May 2026 lineup: - Anthropic: claude-sonnet-4-6 (default); Opus 4.7 and Haiku 4.5 in the fallback list. 4.6/4.7 generation has 1M context standard. - OpenAI: gpt-5.5 (default); 5.5-pro / 5.2 / 5.2-chat-latest in fallback. ThinkingModes now baseline on GPT-5.x. - Google: gemini-3.5-flash (default); 3.1 Pro / Flash Lite in fallback. - Mistral: mistral-large-latest unchanged (Mistral Large 3); add mistral-medium-3.5, mistral-medium-2511, mistral-large-2512 to the rate-limit map. Legacy dated IDs retained in fallback lists and ratelimits maps so configs pinned to claude-sonnet-4-20250514 / gpt-4o / gemini-2.5-flash keep resolving. Capability tables (ContextWindow, MaxOutput, ThinkingModes) updated to match each generation. CLI help text in cmd/gnoma/main.go also updated.	2026-05-20 03:13:21 +02:00
vikingowl	c4fde583f5	chore(lint): gofmt sweep + errcheck cleanups in router discovery Apply gofmt -w across the codebase (struct field comment realignment only — no semantic changes) and silence two errcheck warnings on fmt.Sscanf / fmt.Fprintf return values in internal/router/discovery with explicit `_, _ =` discards. Required so `make check` is green before tagging v0.1.0.	2026-05-20 03:13:05 +02:00
vikingowl	c8813768d5	fix(subprocess): harden agy CLI integration - Drop unverified JSONOutput/Vision capability claims on agy (no native stream-json, no image-input path on v1.0.0). - Replace agent.Name == "agy" check with PromptResponseFormat flag on CLIAgent so the prompt-augmented JSON fallback scales to future agents. - Pass --dangerously-skip-permissions in agy PromptArgs to parallel gemini --yolo / vibe --trust; required for non-interactive runs. - Nil-guard JSONSchema and Schema bytes in buildPrompt (previously panicked when ResponseJSON was requested without a schema). - Rename misleading TestAgyProvider_StreamAugmentation to TestAgyParser_EmitsLineDeltas; add coverage for nil-schema path and non-augmenting agents.	2026-05-20 01:29:05 +02:00
vikingowl	3c875276c9	feat(security): implement multi-wave audit remediation and agy provider support Implemented full security remediation following Universal Security Pilot protocol: - W1: Enforced SecureProvider at router and engine boundaries to prevent bypasses. - W1: Implemented path-sensitive policy for MCP tools. - W2: Added SHA256 hash verification for SLM downloads (llamafile). - W3: Enhanced secret redaction for private keys (full body) and high-entropy strings. - W4: Fixed symlink-based filesystem sandbox escapes in paths and grep. - W4: Documented CLI agent trust boundaries. Also added 'agy' (Antigravity) as a subprocess CLI provider with plain-text JSON schema support.	2026-05-20 01:13:13 +02:00
vikingowl	17d83f2e2a	feat: add agy CLI provider and support structured output via prompt augmentation	2026-05-20 00:21:03 +02:00
vikingowl	b331dcd61a	feat(subprocess): per-agent binary override via [cli_agents] config Plan B from docs/superpowers/plans/2026-05-19-post-slm-unlock.md. Users with aliased CLI binaries (claude-priv, claude-work, gemini-personal) can now point gnoma's auto-discovery at them without renaming. The override flows through to the actual subprocess spawn at internal/provider/subprocess/provider.go:56, so routing through the alias is functional, not cosmetic. Config: [cli_agents] claude = "claude-priv" # discovery uses claude-priv instead of claude gemini = "" # empty value = no override (fall back to canonical) # vibe is absent = canonical name used - internal/config/config.go: CLIAgentsSection map[string]string; TOML [cli_agents] key. - internal/provider/subprocess/agent.go: - Package-level lookPath = exec.LookPath for test injection. - resolveAgentBinary(canonical, override) → (path, binName, err). Override='' falls back to canonical. Override set but missing from PATH returns an error (no silent fallback — masks user typos). - DiscoveredAgent.OverrideBinary records the override binary name when one was used; empty otherwise. - DiscoverCLIAgents(ctx, overrides) signature; warning logged when an override is configured but the binary isn't on PATH. - cmd/gnoma/main.go: both call sites pass cfg.CLIAgents. The `gnoma providers` listing renders `claude-priv (via [cli_agents].claude)` when an override is in effect. Tests cover: 5 resolver cases (no override, override set, empty override falls back, override missing, canonical missing); 4 discovery cases (no overrides, override resolves alias, empty value falls back, override missing skips agent); 2 config round-trip cases.	2026-05-19 21:02:16 +02:00
vikingowl	9388479b03	feat(openai): lexical repair for malformed tool-call arguments Local-model servers (Ollama, llama.cpp, llamafile) routed through the OpenAI-compatible path frequently emit tool-call arguments that are almost valid JSON — wrapped in markdown fences, padded with prose, or trailing a stray comma. Strict parsing fails, the engine receives empty args, and the agent loop has to retry or escalate. Adds repairArgs(raw) at the EventToolCallDone boundary: strict-parse first, then apply cheap lexical fixes (strip ```json fences, drop trailing commas before }/], extract the first balanced {...} block with proper string/escape awareness). On success, the repaired bytes flow through unchanged; on failure, the original is returned and downstream parsing surfaces the error as before. Frontier providers (OpenAI proper, Anthropic, Mistral, Google) are unaffected — their SDKs return structured args that pass strict parse. The repair only does work when the upstream output is malformed. 11 unit tests cover: valid passthrough, empty, trailing commas, single/double-line fences, prose-wrapped, braces-inside-strings, multiple top-level objects (takes the first), and unrepairable input. A stream-level test verifies the wiring through flushNextToolCall.	2026-05-19 17:59:05 +02:00
vikingowl	ec9433d783	chore(lint): clear remaining errcheck and staticcheck findings Brings the project to a clean `make lint` baseline (0 issues). Mechanical: - Wrap deferred resp.Body.Close() in closures (router/discovery.go, router/probe.go) so the unchecked return surfaces as `_ = ...`. - Apply `_ = ...` (single or multi-return blank) to test-file calls that intentionally ignore errors: os.MkdirAll / os.WriteFile / os.Chdir in setup paths, Close / Shutdown in teardown, Submit / Spawn / Send / LoadDir in tests that assert on side effects. Structural: - engine.handleRequestTooLarge drops the unused req parameter and rebuilds the request from compacted history (SA4009 — argument was overwritten before first use). - provider.ClassifyHTTPStatus and google.applyCapabilityOverrides switch to tagged switches over the discriminator (QF1002). - tui.app.go MouseWheel + inputMode and cmd/gnoma main slm-status use tagged switches in place of equality chains (QF1003). - cmd/gnoma main.go merges a var decl with its immediate assignment (S1021). - Three empty-branch sites (dispatcher_test, loader_test, coordinator_test) become real assertions or get the dead `if` removed (SA9003).	2026-05-19 17:53:42 +02:00
vikingowl	13b2f5e14d	chore(lint): clear dead code and tighten lifecycle errcheck Removes five unused funcs/vars/fields that golangci-lint had been flagging (anthropic.toolCallDoneEvent, mistral.translateMessages, hook.newError, subprocess.vibeParser.lastAssistantMsgID, tui.cBase), two ineffectual assignments (tui/rendering.go visible-window loop, subprocess stream_test setup), and a stale if/HasPrefix that's now a strings.TrimPrefix. Wires errcheck onto every subprocess / stream lifecycle path so a failed close or shutdown is at least logged rather than silently dropped: - engine/loop.go: stream.Close on both the error and success paths - mcp/manager.go: Shutdown when StartAll partial-fails; Transport close after Initialize failure - mcp/transport.go: stdin.Close + syscall.Kill on graceful-timeout fallback - slm/download.go: Close propagated as a named-return error on the success path; explicitly discarded on the rollback path - slm/classifier.go, slm/manager.go, hook/prompt.go, context/summarize.go, config/write.go, cmd/gnoma/main.go, tool/fs/grep.go: explicit ignores or error logging on Close / Shutdown / WalkDir / Scanln Production-code errcheck and ineffassign are now zero. Remaining golangci-lint output is test-only Close-in-defer noise plus stylistic staticcheck QF suggestions, left alone.	2026-05-19 17:05:54 +02:00
vikingowl	0d2d825e52	feat: add dynamic model discovery within providers - OpenAI provider: use Models.ListAutoPaging() to discover available models - Anthropic provider: use Models.ListAutoPaging() to discover available models - Google provider: use Models.All() iterator to discover available models - All providers fall back to hardcoded lists if API calls fail - Add capability inference functions for each provider based on model ID - Add tests for model discovery fallback behavior This enables gnoma to dynamically discover new models as they become available from cloud providers, while maintaining backward compatibility with fallback lists for offline use or API failures.	2026-05-07 22:27:24 +02:00
vikingowl	a9213ec382	feat(slm): Wave C — SLM classifier, MaxComplexity routing, CLI subcommands, TUI status - slm.Classifier: openaicompat → llamafile, 2s timeout + heuristic fallback, heuristic baseline blended so Priority/RequiredEffort are never zeroed, extractJSON strips markdown fences from small-model responses - router.ParseTaskType: case-insensitive string → TaskType, unknown → TaskGeneration - router.Arm.MaxComplexity: zero = no ceiling (preserves existing arm behavior); filterFeasible excludes arms when task.ComplexityScore > MaxComplexity - config.SLMSection: [slm] enabled / model_url / data_dir - openaicompat.NewLlamafile: no API key, model = "default", no retries - slm.Manager: DefaultDataDir() (XDG), Manifest() accessor - cmd/gnoma: `gnoma slm setup` / `gnoma slm status` subcommands; SLM arm registered with MaxComplexity=0.3 when enabled + set up - tui: /config shows slm status (ready/missing/not set up + base URL if running) - docs: roadmap updated to reflect llamafile pivot from Ollama	2026-05-07 16:44:32 +02:00
vikingowl	44d0bdc032	feat(provider): subprocess CLI provider for claude, gemini, vibe Adds internal/provider/subprocess — a provider.Provider that spawns CLI agents (claude, gemini, vibe) as subprocesses and streams their output. - FormatParser interface + three parsers for claude-stream-json, gemini-stream-json, and vibe-streaming formats; fixtures captured from real binaries - subprocessStream: pull-based stream.Stream over subprocess stdout with bounded stderr capture (8KB) and guarded reap() to prevent double-Wait - DiscoverCLIAgents: parallel PATH scan with 10s timeout, stable ordering - Provider: only the last user message is passed as --prompt; all other request fields (history, tools, system prompt) are intentionally ignored (see package doc) - main.go: discover and register CLI arms at startup; TODO(P0c) for tier-based routing to enforce preference order explicitly	2026-05-07 14:29:34 +02:00
vikingowl	7fbb5454ee	feat(router): normalize effort/thinking abstraction across providers Add EffortLevel (auto/low/medium/high) as a provider-agnostic reasoning control, replacing the Capabilities.Thinking bool. Each provider maps the level to its native parameter: Anthropic budget tokens (1K/8K/16K), OpenAI reasoning_effort (low/medium/high), Google thinking budget (1K/8K/16K). Task classification auto-infers effort from TaskType and complexity; filterFeasible excludes arms that lack the required level.	2026-05-07 14:08:50 +02:00
vikingowl	d71bd942c4	feat: local model reliability — SDK retries, capability probing, init skill, context compaction Three compounding bugs prevented tool calling with llama.cpp: - Stream parser set argsComplete on partial JSON (e.g. "{"), dropping subsequent argument deltas — fix: use json.Valid to detect completeness - Missing tool_choice default — llama.cpp needs explicit "auto" to activate its GBNF grammar constraint; now set when tools are present - Tool names in history used internal format (fs.ls) while definitions used API format (fs_ls) — now re-sanitized in translateMessage Additional changes: - Disable SDK retries for local providers (500s are deterministic) - Dynamic capability probing via /props (llama.cpp) and /api/show (Ollama), replacing hardcoded model prefix list - Engine respects forced arm ToolUse capability when router is active - Bundled /init skill with Go template blocks, context-aware for local vs cloud models, deduplication rules against CLAUDE.md - Tool result compaction for local models — previous round results replaced with size markers to stay within small context windows - Text-only fallback when tool-parse errors occur on local models - "text-only" TUI indicator when model lacks tool support - Session ResetError for retry after stream failures - AllowedTools per-turn filtering in engine buildRequest	2026-04-13 02:01:01 +02:00
vikingowl	2093beea58	fix: deterministic 500 retry, OpenAI error wrapping, local /init prompt Stop retrying llama.cpp 500s that are deterministic tool-parse failures by inspecting the error message body (ClassifyHTTPError). Wrap OpenAI SDK errors as ProviderError so the engine's retry logic classifies them. Add localInitPrompt for local models that uses sequential fs_* calls instead of spawn_elfs (which local models can't produce reliably).	2026-04-12 18:35:18 +02:00
vikingowl	4f1e0cf567	feat: Ollama/gemma4 compat (/init flow, stream filter, safety fixes) provider/openai: - Fix doubled tool call args (argsComplete flag): Ollama sends complete args in the first streaming chunk then repeats them as delta, causing doubled JSON and 400 errors in elfs - Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep) - Add Reasoning field support for Ollama thinking output cmd/gnoma: - Early TTY detection so logger is created with correct destination before any component gets a reference to it (fixes slog WARN bleed into TUI textarea) permission: - Exempt spawn_elfs and agent tools from safety scanner: elf prompt text may legitimately mention .env/.ssh/credentials patterns and should not be blocked tui/app: - /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge (ask for plain text output) → TUI fallback write from streamBuf - looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback content before writing (reject refusals, strip narrative preambles) - Collapse thinking output to 3 lines; ctrl+o to expand (live stream and committed messages) - Stream-level filter for model pseudo-tool-call blocks: suppresses <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|> from entering streamBuf across chunk boundaries - sanitizeAssistantText regex covers both block formats - Reset streamFilterClose at every turn start	2026-04-05 19:24:51 +02:00
vikingowl	e1a47a7620	feat: rate limit pools, elf tree view, permission prompts, dep updates Rate limits: - Add PoolRPS/PoolTPM/PoolTokensMonth/PoolCostMonth pool kinds - Provider defaults for Mistral/Anthropic/OpenAI/Google (tier-aware) - Config override via [rate_limits.<provider>] TOML section - Pools auto-attached to arms on registration Elf tree view (CC-style): - Structured elf.Progress type replaces flat string channel - Tree with ├─/└─ branches, per-elf stats (tool uses, tokens) - Live activity updates: tool calls, "generating… (N chars)" - Completed elfs stay in tree with "Done (duration)" until turn ends - Suppress raw elf output from chat (tree + LLM summary instead) - Remove background elf mode (wait: false); always wait - Truncate elf results to 2000 chars for parent context - Parallel hint in system prompt and tool description Permission prompts: - Show actual command in prompt: "bash wants to execute: find . -name '*.go'" - Compact hint in separator bar: "⚠ bash: find . | wc -l [y/n]" - PermReqMsg carries tool name + args Other: - Fix /model not updating status bar (session.Local.SetModel) - Add make targets: run, check, install - Update deps: BurntSushi/toml v1.6.0, chroma v2.23.1, x/text v0.35.0, cloud.google.com/go v0.123.0	2026-04-03 20:54:48 +02:00
vikingowl	9608436b52	feat: add OpenAI-compat adapter for Ollama and llama.cpp Thin wrapper over OpenAI adapter with custom base URLs. Ollama: localhost:11434/v1, llama.cpp: localhost:8080/v1. No API key required for local providers. Fixed: initial tool call args captured on first chunk (Ollama sends complete args in one chunk, not as deltas). Live verified: text + tool calling with qwen3:14b on Ollama. Five providers now live: Mistral, Anthropic, OpenAI, Google, Ollama.	2026-04-03 13:47:30 +02:00
vikingowl	dccb5fe65a	feat: add Google GenAI provider adapter Streaming via goroutine+channel bridge (range-based iter.Seq2 → pull iterator). Tool use with FunctionCall/FunctionResponse, tool name sanitization, tool name map for FunctionResponse correlation. Stop reason override (Google uses STOP for function calls). Hardcoded model list (gemini-2.5-pro/flash, gemini-2.0-flash). Wired into CLI with GOOGLE_API_KEY + GEMINI_API_KEY env support. Live verified: text streaming + tool calling with gemini-2.5-flash. Four providers now live: Mistral, Anthropic, OpenAI, Google.	2026-04-03 13:42:29 +02:00
vikingowl	261c19f90f	feat: add OpenAI provider adapter Streaming, tool use (index-based delta accumulation), tool name sanitization (fs.read → fs_read), StreamOptions.IncludeUsage for token tracking. Hardcoded model list (gpt-4o, gpt-4o-mini, o3, o3-mini). Wired into CLI with OPENAI_API_KEY env support. Live verified: text streaming + tool calling with gpt-4o.	2026-04-03 13:33:55 +02:00
vikingowl	9e7caf2467	feat: add Anthropic provider adapter Streaming, tool use (with InputJSONDelta assembly), thinking blocks, cache token tracking, system prompt separation. Tool name sanitization (fs.read → fs_read) for Anthropic's naming constraints with reverse translation on tool call responses. Hardcoded model list with capabilities (Opus 4, Sonnet 4, Haiku 4.5). Wired into CLI with ANTHROPIC_API_KEY + ANTHROPICS_API_KEY env support. Also: migrated Mistral SDK to github.com/VikingOwl91/mistral-go-sdk. Live verified: text streaming + tool calling with claude-sonnet-4. 126 tests across 9 packages.	2026-04-03 13:11:00 +02:00
vikingowl	c54471a37b	refactor: migrate mistral sdk to github.com/VikingOwl91/mistral-go-sdk Same package, new GitHub deployment with fixed tests. somegit.dev/vikingowl → github.com/VikingOwl91, v1.2.0 → v1.2.1	2026-04-03 12:06:59 +02:00
vikingowl	69f5dba091	feat: complete M1 (core engine with Mistral provider) Mistral provider adapter with streaming, tool calls (single-chunk pattern), stop reason inference, model listing, capabilities, and JSON output support. Tool system: bash (7 security checks, shell alias harvesting for bash/zsh/fish), file ops (read, write, edit, glob, grep, ls). Alias harvesting collects 300+ aliases from user's shell config. Engine agentic loop: stream → tool execution → re-query → until done. Tool gating on model capabilities. Max turns safety limit. CLI pipe mode: echo "prompt" | gnoma streams response to stdout. Flags: --provider, --model, --system, --api-key, --max-turns, --verbose, --version. Provider interface expanded: Models(), DefaultModel(), Capabilities (ToolUse, JSONOutput, Vision, Thinking, ContextWindow, MaxOutput), ResponseFormat with JSON schema support. Live verified: text streaming + tool calling with devstral-small. 117 tests across 8 packages, 10MB binary.	2026-04-03 12:01:55 +02:00
vikingowl	788bd8ec24	feat: add foundation types, streaming, and provider interface internal/message/ — Content discriminated union, Message, Usage, StopReason, Response. 22 tests. internal/stream/ — Stream pull-based iterator interface, Event types, Accumulator (assembles Response from events). 8 tests. internal/provider/ — Provider interface, Request, ToolDefinition, Registry with factory pattern, ProviderError with HTTP status classification. errors.AsType[E] for Go 1.26. 13 tests. 43 tests total, all passing.	2026-04-03 10:57:54 +02:00
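
Commit 1606d19366 describes the codex token accounting in prose: cached_input_tokens is moved into CacheReadTokens and subtracted from InputTokens (clamped at zero), and reasoning_output_tokens is folded into OutputTokens. The arithmetic can be sketched roughly as below. The codexUsage and usage structs and the normalize function are hypothetical stand-ins written for illustration; only the field names come from the commit message, and the real message.Usage type in the repository may look different.

```go
package main

import "fmt"

// codexUsage is a hypothetical mirror of the top-level token fields that the
// commit message says codex 0.133.0 emits.
type codexUsage struct {
	InputTokens           int64
	CachedInputTokens     int64
	OutputTokens          int64
	ReasoningOutputTokens int64
}

// usage is a hypothetical stand-in for message.Usage, reduced to the three
// fields the commit message mentions.
type usage struct {
	InputTokens     int64
	CacheReadTokens int64
	OutputTokens    int64
}

// normalize applies the mapping described in the commit message.
func normalize(u codexUsage) usage {
	// cached_input_tokens is a subset of input_tokens, so only the uncached
	// residual is kept in InputTokens.
	uncached := u.InputTokens - u.CachedInputTokens
	if uncached < 0 {
		// Defensive clamp in case cached > input is ever reported.
		uncached = 0
	}
	return usage{
		InputTokens:     uncached,
		CacheReadTokens: u.CachedInputTokens,
		// Reasoning tokens are treated as a separate billable quantity and
		// added on top of output_tokens.
		OutputTokens: u.OutputTokens + u.ReasoningOutputTokens,
	}
}

func main() {
	got := normalize(codexUsage{
		InputTokens:           1000,
		CachedInputTokens:     400,
		OutputTokens:          50,
		ReasoningOutputTokens: 30,
	})
	fmt.Printf("%+v\n", got) // {InputTokens:600 CacheReadTokens:400 OutputTokens:80}
}
```

Keeping the uncached residual in InputTokens means that summing InputTokens and CacheReadTokens as peers, as the commit says message.Usage.Add() does, gives back the original input_tokens total rather than double-counting the cached part.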

36 Commits