gnoma

Author	SHA1	Message	Date
vikingowl	244ecd97e5	fix: security hardening (bash redirection, unicode sanitization, edit tool resolver)	2026-05-21 23:29:48 +02:00
vikingowl	c4fde583f5	chore(lint): gofmt sweep + errcheck cleanups in router discovery Apply gofmt -w across the codebase (struct field comment realignment only — no semantic changes) and silence two errcheck warnings on fmt.Sscanf / fmt.Fprintf return values in internal/router/discovery with explicit `_, _ =` discards. Required so `make check` is green before tagging v0.1.0.	2026-05-20 03:13:05 +02:00
vikingowl	9853a522e6	refactor(security): consolidate TOCTOU-safe path canonicalization `3c87527` added engine/paths.go:resolveCanonical, duplicating the ancestor-walk + EvalSymlinks algorithm that already lived in fs/guard.go:ResolveWrite. Two implementations of the same TOCTOU defense is exactly the wrong shape for security code — a bug fix in one would silently miss the other. Extracts the shared algorithm to security.CanonicalizePath. Both call sites become thin wrappers that pre-anchor relative paths against the appropriate root (cwd for engine, workspace root for guard). The "hit-root" defensive branch in engine's version (commented "highly unlikely") is tightened to match guard's error behavior. Adds focused unit tests for the helper covering existing path, non-existent leaf, non-existent mid-component, symlinked ancestor, and relative-path rejection.	2026-05-20 01:50:38 +02:00
vikingowl	3c875276c9	feat(security): implement multi-wave audit remediation and agy provider support Implemented full security remediation following Universal Security Pilot protocol: - W1: Enforced SecureProvider at router and engine boundaries to prevent bypasses. - W1: Implemented path-sensitive policy for MCP tools. - W2: Added SHA256 hash verification for SLM downloads (llamafile). - W3: Enhanced secret redaction for private keys (full body) and high-entropy strings. - W4: Fixed symlink-based filesystem sandbox escapes in paths and grep. - W4: Documented CLI agent trust boundaries. Also added 'agy' (Antigravity) as a subprocess CLI provider with plain-text JSON schema support.	2026-05-20 01:13:13 +02:00
vikingowl	34f6f1c786	feat(security): incognito coherence across firewall/router/persist (Wave 2) Closes the cluster of audit findings where gnoma's incognito promise ('no persistence, no learning, local-only routing') silently broke because state was duplicated across the CLI flag, the firewall's IncognitoMode, the router's localOnly flag, and the TUI's local m.incognito field. Wave 2 makes security.IncognitoMode the canonical source of truth. W2-1 Router.Select rejects forced non-local arms when localOnly is on rather than short-circuiting and silently routing to cloud. Main fails fast when --incognito + --provider <cloud> are combined; the TUI toggle (Ctrl+X, /incognito, config panel) refuses with an actionable message when a non-local arm is pinned. Factored the three duplicated toggle sites into Model.attemptIncognitoToggle. W2-2 persist.Store.Save consults an IncognitoGate (local interface, security.IncognitoMode satisfies it). nil gate = always persist (legacy behaviour for tests); non-nil gate is consulted on every Save so TUI runtime toggles take effect without reconstructing the store. File mode 0o600, dir mode 0o700. W2-3 tui.New seeds m.incognito from cfg.Firewall.Incognito().Active(). Fixes the Ctrl+X-on-launch-with-incognito case where the first toggle silently turned the firewall OFF because the local flag started false out of sync with the firewall. W2-4 saveQuality gates on both incognito (defensive, covers the window before fwRef.Set fires) and fw.Incognito().ShouldLearn() (so TUI Ctrl+X suppresses the snapshot on exit). Quality restore skipped under --incognito. Quality file written 0o600 in dir 0o700. engine.reportOutcome and elf.Manager.ReportResult both gate on fw.Incognito().ShouldLearn() — bandit signal no longer leaks out of incognito sessions. W2-5 session files written 0o600 in dirs 0o700 (was 0o644 / 0o755). W2-6 IncognitoMode.LocalOnly dropped — dead field with no readers; routing local-only state lives on the router, not the firewall. Also wires rtr.SetLocalOnly(true) when --incognito at launch — main previously activated the firewall's flag but never told the router to filter, so even without the forced-arm bug, launching with --incognito alone gave you 'incognito badge but full arm pool'.	2026-05-19 22:57:36 +02:00
vikingowl	43ea2e562d	feat(engine): two-stage tool routing for small local arms Plan A from docs/superpowers/plans/2026-05-19-post-slm-unlock.md. Small local SLMs (<=16k context) waste ~1500 tokens per turn on the full tool catalogue. Two-stage routing replaces round-1 tools with a single synthetic select_category schema; round-2+ sends only the selected category's real tool schemas plus select_category for re-selection. - internal/tool/category.go: Category type, optional Categorized interface, CategoryOf() with meta fallback. fs.read/fs.ls -> read, fs.write/fs.edit -> write, fs.glob/fs.grep -> search, bash -> exec. - internal/engine/twostage.go: synthetic select_category tool, intercept helper, per-turn selectedCategory state under e.mu. - Engine round 1 forces ToolChoiceRequired so SLMs don't fall back to prose. State resets at the top and end of every runLoop. - Activates automatically on a forced local arm with ContextWindow <=16384, or via [router].force_two_stage TOML key. - Integration test drives a 3-round trip and asserts: round 1 emits exactly one schema (synthetic) with ToolChoiceRequired, round 2 contains only write-category schemas + select_category, real fs.write executes. Invalid-category fallback round-trips back to round-1 mode.	2026-05-19 20:53:21 +02:00
vikingowl	ec9433d783	chore(lint): clear remaining errcheck and staticcheck findings Brings the project to a clean `make lint` baseline (0 issues). Mechanical: - Wrap deferred resp.Body.Close() in closures (router/discovery.go, router/probe.go) so the unchecked return surfaces as `_ = ...`. - Apply `_ = ...` (single or multi-return blank) to test-file calls that intentionally ignore errors: os.MkdirAll / os.WriteFile / os.Chdir in setup paths, Close / Shutdown in teardown, Submit / Spawn / Send / LoadDir in tests that assert on side effects. Structural: - engine.handleRequestTooLarge drops the unused req parameter and rebuilds the request from compacted history (SA4009 — argument was overwritten before first use). - provider.ClassifyHTTPStatus and google.applyCapabilityOverrides switch to tagged switches over the discriminator (QF1002). - tui.app.go MouseWheel + inputMode and cmd/gnoma main slm-status use tagged switches in place of equality chains (QF1003). - cmd/gnoma main.go merges a var decl with its immediate assignment (S1021). - Three empty-branch sites (dispatcher_test, loader_test, coordinator_test) become real assertions or get the dead `if` removed (SA9003).	2026-05-19 17:53:42 +02:00
vikingowl	13b2f5e14d	chore(lint): clear dead code and tighten lifecycle errcheck Removes five unused funcs/vars/fields that golangci-lint had been flagging (anthropic.toolCallDoneEvent, mistral.translateMessages, hook.newError, subprocess.vibeParser.lastAssistantMsgID, tui.cBase), two ineffectual assignments (tui/rendering.go visible-window loop, subprocess stream_test setup), and a stale if/HasPrefix that's now a strings.TrimPrefix. Wires errcheck onto every subprocess / stream lifecycle path so a failed close or shutdown is at least logged rather than silently dropped: - engine/loop.go: stream.Close on both the error and success paths - mcp/manager.go: Shutdown when StartAll partial-fails; Transport close after Initialize failure - mcp/transport.go: stdin.Close + syscall.Kill on graceful-timeout fallback - slm/download.go: Close propagated as a named-return error on the success path; explicitly discarded on the rollback path - slm/classifier.go, slm/manager.go, hook/prompt.go, context/summarize.go, config/write.go, cmd/gnoma/main.go, tool/fs/grep.go: explicit ignores or error logging on Close / Shutdown / WalkDir / Scanln Production-code errcheck and ineffassign are now zero. Remaining golangci-lint output is test-only Close-in-defer noise plus stylistic staticcheck QF suggestions, left alone.	2026-05-19 17:05:54 +02:00
vikingowl	b60aa02bfd	feat(fs): enforce workspace boundary on fs tools Adds a Guard that resolves every path against an allowlist of absolute roots (default: cwd) and rejects anything escaping via relative segments, absolute paths outside the root, or symlinks (including symlinked parents on writes). Closes audit finding C1: fs.read/fs.write/fs.edit/fs.glob/fs.grep/fs.ls previously accepted any absolute path; the only protection was a substring denylist (.env, .ssh/, ...) which missed /etc/shadow, kube configs, IDE secrets, and anything reachable via symlink.	2026-05-19 16:07:29 +02:00
vikingowl	0b1392cf6b	feat(pty): Phase 2 — interactive shell and bash interactive detection - /shell [cmd]: launch user's $SHELL via tea.ExecProcess (PTY handoff) hands terminal to the shell and restores TUI on exit. /shell <cmd> runs that command in the shell directly. Detects $SHELL > $COMSPEC > /bin/sh|powershell.exe in order. - bash tool: detect interactive commands before execution Prefix-interactive: sudo, ssh, passwd, vim/vi/nano, less/more, htop/top, mysql/psql, ftp/sftp, git push. Exact-interactive (REPL): python3/python/node/irb/iex/ghci/julia. Returns a tool result with interactive=true metadata and a hint to use /shell instead of hanging or erroring. - completions: add /shell to builtin command list - help: document /shell [cmd]	2026-05-07 15:52:56 +02:00
vikingowl	176926924c	feat(engine): M8 cleanup — Wave B skill enforcement - Add tool.PathSensitiveTool interface (ExtractPaths); implement on all 6 fs tools - Add engine.TurnOptions.AllowedPaths: restricts tool filesystem access per skill invocation - Bash is denied outright when AllowedPaths is active (unparseable command args) - fs tools with empty path (cwd default) resolved via os.Getwd() and validated - Add engine.TurnOptions.AllowedTools + AllowedPaths wiring in pipe mode (main.go) and TUI skill dispatch (tui/app.go) - Remove TODO(M8.3) from skill.Frontmatter — enforcement is now complete	2026-05-07 15:29:33 +02:00
vikingowl	9fb520fba6	feat(engine): M8 cleanup — Wave A wiring gaps - Remove stale TODO(P0c) comment from main.go (resolved by P0c tier routing) - Wire config.Provider.Temperature → engine.Config.Temperature → provider.Request - Add WithMaxFileSize option to fs.write; wire cfg.Tools.MaxFileSize in main.go - Wire router.ReportOutcome after each runLoop return (success = err == nil) - Fix nil-callback guard on EventRouting dispatch (pre-existing bug exposed by new test)	2026-05-07 15:22:22 +02:00
vikingowl	8d86bc75fd	test: M7 audit — quality feedback, coordinator, agent tool coverage Quality feedback integration: TestQualityTracker_InfluencesArmSelection verifies that 5 successes vs 5 failures tips Router.Select() to the high-quality arm once EMA has enough observations. Companion test confirms heuristic fallback below minObservations. Coordinator tests expanded from 2 → 5: added guidance content check (parallel/serial/synthesize present), false-positive table extended with 7 cases including the reordered keywords from the previous fix. Agent tool suite: tool interface contracts for all four tools (Name, Description, Parameters validity, IsReadOnly). Extracted duplicated 2000-char truncation into truncateOutput() helper (format.go), removing the inline copies in agent.go and batch.go. Four boundary tests cover empty, short, exact-max, and over-max cases.	2026-04-06 00:59:12 +02:00
vikingowl	745b27e5db	feat: list_results + read_result tools for coordinator artifact discovery	2026-04-05 22:19:05 +02:00
vikingowl	f4fda8346b	feat: list_results + read_result tools for coordinator artifact discovery	2026-04-05 22:15:04 +02:00
vikingowl	64ee385039	feat: QualityTracker — EMA router feedback from elf outcomes, ResultFilePaths tracking	2026-04-05 22:08:08 +02:00
vikingowl	dae2c488e5	feat: wire persist.Store into engine, elf manager, and agent tools	2026-04-05 21:59:55 +02:00
vikingowl	88e76cddb0	fix: persist.Store — sanitize callID, log save errors, document List filter semantics	2026-04-05 21:44:03 +02:00
vikingowl	6fa9df5613	feat: persist.Store — session-scoped /tmp tool result persistence	2026-04-05 21:38:45 +02:00
vikingowl	4f1e0cf567	feat: Ollama/gemma4 compat — /init flow, stream filter, safety fixes provider/openai: - Fix doubled tool call args (argsComplete flag): Ollama sends complete args in the first streaming chunk then repeats them as delta, causing doubled JSON and 400 errors in elfs - Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep) - Add Reasoning field support for Ollama thinking output cmd/gnoma: - Early TTY detection so logger is created with correct destination before any component gets a reference to it (fixes slog WARN bleed into TUI textarea) permission: - Exempt spawn_elfs and agent tools from safety scanner: elf prompt text may legitimately mention .env/.ssh/credentials patterns and should not be blocked tui/app: - /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge (ask for plain text output) → TUI fallback write from streamBuf - looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback content before writing (reject refusals, strip narrative preambles) - Collapse thinking output to 3 lines; ctrl+o to expand (live stream and committed messages) - Stream-level filter for model pseudo-tool-call blocks: suppresses <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|> from entering streamBuf across chunk boundaries - sanitizeAssistantText regex covers both block formats - Reset streamFilterClose at every turn start	2026-04-05 19:24:51 +02:00
vikingowl	95dfd0cf0c	feat: M1-M7 gap audit phase 3 — context prefix, deferred tools, compact hooks Gap 11 (M6): Fixed context prefix - Window.PrefixMessages stores immutable docs (CLAUDE.md, .gnoma/GNOMA.md) - Prefix stripped before compaction, prepended after — survives all compaction - AllMessages() returns prefix + history for provider requests - main.go loads CLAUDE.md and .gnoma/GNOMA.md at startup as prefix Gap 12 (M6): Deferred tool loading - DeferrableTool optional interface: ShouldDefer() bool - buildRequest() skips deferred tools until activated - Tools auto-activate on first model request (activatedTools map) - agent + spawn_elfs marked as deferrable (large schemas, rarely needed early) - Saves ~800 tokens per deferred tool per request Gap 13 (M6): Pre/post compact hooks - OnPreCompact/OnPostCompact callbacks in WindowConfig - Called in doCompact() (shared by CompactIfNeeded + ForceCompact) - M8 hooks system will extend these to full protocol	2026-04-04 20:46:50 +02:00
vikingowl	11363f3b97	feat: M1-M7 gap audit phase 2 — security, TUI, context, router feedback Gap 6 (M3): 7 new bash security checks (8-14) - JQ injection, obfuscated flags (Unicode lookalike hyphens), /proc/environ access, brace expansion, Unicode whitespace, zsh dangerous constructs, comment-quote desync - Total: 14 checks (was 7) Gap 7 (M5): Model picker numbered selection - /model shows numbered sorted list, /model 3 picks by number Gap 8 (M5): /config set command - /config set provider.default mistral writes to .gnoma/config.toml - Whitelisted keys: provider.default, provider.model, permission.mode - New config/write.go with TOML round-trip via BurntSushi/toml Gap 9 (M6): Simple token estimator - EstimateTokens (len/4 heuristic), EstimateMessages (content + overhead) - PreEstimate on Tracker for proactive compaction triggering Gap 10 (M7): Router quality feedback from elfs - Router.Outcome + ReportOutcome (logs for now, M9 bandit uses later) - Manager tracks armID/taskType per elf via elfMeta map - Manager.ReportResult called after elf completion in both agent + batch tools	2026-04-04 11:07:08 +02:00
vikingowl	6aea2a9e3a	fix: retry with exponential backoff on 429, stagger elf spawns Engine retries transient errors (429, 5xx) up to 4 times with 1s/2s/4s/8s backoff. Respects Retry-After header from provider. Batch tool staggers elf spawns by 300ms to avoid rate limit bursts when all elfs hit the API simultaneously (Mistral's 1 req/s limit).	2026-04-03 21:08:20 +02:00
vikingowl	abb3e3ca90	feat: spawn_elfs batch tool for guaranteed parallel elf execution New spawn_elfs tool takes array of tasks, spawns all elfs simultaneously. Solves the problem of models (Mistral Small, Devstral) that serialize tool calls instead of batching them. Schema: {"tasks": [{"prompt": "...", "task_type": "..."}], "max_turns": 30} Also: - Suppress spawn_elfs tool output from chat (tree handles display) - Update M7 milestones to reflect completed deliverables - Add CC-inspired features to M8/M10: task notification system, task framework, /batch skill, coordinator mode, StreamingToolExecutor, git worktree isolation	2026-04-03 21:03:51 +02:00
vikingowl	e1a47a7620	feat: rate limit pools, elf tree view, permission prompts, dep updates Rate limits: - Add PoolRPS/PoolTPM/PoolTokensMonth/PoolCostMonth pool kinds - Provider defaults for Mistral/Anthropic/OpenAI/Google (tier-aware) - Config override via [rate_limits.<provider>] TOML section - Pools auto-attached to arms on registration Elf tree view (CC-style): - Structured elf.Progress type replaces flat string channel - Tree with ├─/└─ branches, per-elf stats (tool uses, tokens) - Live activity updates: tool calls, "generating… (N chars)" - Completed elfs stay in tree with "Done (duration)" until turn ends - Suppress raw elf output from chat (tree + LLM summary instead) - Remove background elf mode (wait: false) — always wait - Truncate elf results to 2000 chars for parent context - Parallel hint in system prompt and tool description Permission prompts: - Show actual command in prompt: "bash wants to execute: find . -name '*.go'" - Compact hint in separator bar: "⚠ bash: find . | wc -l [y/n]" - PermReqMsg carries tool name + args Other: - Fix /model not updating status bar (session.Local.SetModel) - Add make targets: run, check, install - Update deps: BurntSushi/toml v1.6.0, chroma v2.23.1, x/text v0.35.0, cloud.google.com/go v0.123.0	2026-04-03 20:54:48 +02:00
vikingowl	8138bd69f8	fix: live elf progress shows tool calls + results, not just text	2026-04-03 19:42:48 +02:00
vikingowl	ebfbefc73d	feat: configurable max_turns for elfs — LLM sets via agent tool param	2026-04-03 19:37:17 +02:00
vikingowl	086a4622fd	fix: elf progress — proper last-2-lines tracking, 70 char truncation	2026-04-03 19:30:18 +02:00
vikingowl	c01069164e	feat: live elf progress in TUI - Elf tool calls show as 🦉 [elf] <prompt> (not ⚙ [agent]) - Live 2-line progress beneath the elf label showing what the elf is currently outputting (grey, auto-updated) - Agent tool forwards elf streaming events via progress channel - Progress cleared on turn completion - elfProgressCh wired from agent tool → TUI	2026-04-03 19:25:43 +02:00
vikingowl	13db7521b1	feat: M7 Elfs — sub-agents with router-integrated spawning internal/elf/: - BackgroundElf: runs on own goroutine with independent engine, history, and provider. No shared mutable state. - Manager: spawns elfs via router.Select() (picks best arm per task type), tracks lifecycle, WaitAll(), CancelAll(), Cleanup(). internal/tool/agent/: - Agent tool: LLM can call 'agent' to spawn sub-agents. Supports task_type hint for routing, wait/background mode. 5-minute timeout, context cancellation propagated. Concurrent tool execution: - Read-only tools (fs.read, fs.grep, fs.glob, etc.) execute in parallel via goroutines. - Write tools (bash, fs.write, fs.edit) execute sequentially. - Partition by tool.IsReadOnly(). TUI: /elf command explains how to use sub-agents. 5 elf tests. Exit criteria: parent spawns 3 background elfs on different providers, collects and synthesizes results.	2026-04-03 19:16:46 +02:00
vikingowl	60883521c7	feat: auto permission mode, edit diffs, truncated tool output - Default permission mode changed to 'auto' (read-only auto-allows, writes prompt) - fs.edit now shows diff-style output: line numbers, context ±3 lines, + for added (green), - for removed (red) - Tool output truncated to 10 lines in TUI with "+N lines (Ctrl+O to expand)" indicator - Mistral SDK bumped to v1.3.0	2026-04-03 18:57:13 +02:00
vikingowl	46505a1f71	feat: complete 7/7 bash security checks Added: - Standalone semicolon check: blocks ; outside quotes (use && instead) - Sensitive redirection check: blocks > to /etc/passwd, .bashrc, .ssh/authorized_keys, .env, etc. Now all 7 security checks are active: 1. Incomplete commands, 2. Control characters, 3. Newline injection, 4. Command substitution, 5. Dangerous variables, 6. Semicolons, 7. Sensitive redirections	2026-04-03 17:56:01 +02:00
vikingowl	6cfe35620d	feat: compact system inventory with queryable system_info tool System prompt gets a one-line summary (~200 chars): OS, CPU, RAM, GPU, top runtimes, package count, PATH command count. Full details available on demand via system_info tool with sections: runtimes, packages, tools, hardware, all. LLM calls the tool when it needs specifics — saves thousands of tokens per request. Hardware detection: CPU model, core count, total RAM, GPU via lspci. Package manager: pacman/apt/dnf/brew with dev package filtering. PATH scan: 5541 executables. Runtime probing: 22 detected.	2026-04-03 14:50:33 +02:00
vikingowl	8e5ddb20cb	feat: hybrid system inventory — dynamic PATH scan + runtime probing No hardcoded tool lists. Scans all $PATH directories for executables (5541 on this system), then probes known runtime patterns for version info (23 detected: Go, Python, Node, Rust, Ruby, Perl, Java, Dart, Deno, Bun, Lua, LuaJIT, Guile, GCC, Clang, NASM + package managers). System prompt includes: OS, shell, runtime versions, and notable tools (git, docker, kubectl, fzf, rg, etc.) from the full PATH scan. Total executable count reported so the LLM knows the full scope. Milestones updated: M6 fixed context prefix, M12 multimodality.	2026-04-03 14:36:22 +02:00
vikingowl	69f5dba091	feat: complete M1 — core engine with Mistral provider Mistral provider adapter with streaming, tool calls (single-chunk pattern), stop reason inference, model listing, capabilities, and JSON output support. Tool system: bash (7 security checks, shell alias harvesting for bash/zsh/fish), file ops (read, write, edit, glob, grep, ls). Alias harvesting collects 300+ aliases from user's shell config. Engine agentic loop: stream → tool execution → re-query → until done. Tool gating on model capabilities. Max turns safety limit. CLI pipe mode: echo "prompt" | gnoma streams response to stdout. Flags: --provider, --model, --system, --api-key, --max-turns, --verbose, --version. Provider interface expanded: Models(), DefaultModel(), Capabilities (ToolUse, JSONOutput, Vision, Thinking, ContextWindow, MaxOutput), ResponseFormat with JSON schema support. Live verified: text streaming + tool calling with devstral-small. 117 tests across 8 packages, 10MB binary.	2026-04-03 12:01:55 +02:00

35 Commits
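
Commit 43ea2e562d lists the tool-to-category mapping used by two-stage routing (fs.read/fs.ls -> read, fs.write/fs.edit -> write, fs.glob/fs.grep -> search, bash -> exec) and mentions a "meta fallback". A rough sketch of such a lookup follows. The type and constant names are illustrative only, and treating "meta" as the category returned for unlisted tools is an assumption about what the fallback means, not something the log confirms.

```go
package main

import "fmt"

// Category is a hypothetical stand-in for the type in internal/tool/category.go.
type Category string

const (
	CategoryRead   Category = "read"
	CategoryWrite  Category = "write"
	CategorySearch Category = "search"
	CategoryExec   Category = "exec"
	CategoryMeta   Category = "meta" // assumed fallback for unlisted tools
)

// toolCategories encodes the mapping quoted in the commit message.
var toolCategories = map[string]Category{
	"fs.read":  CategoryRead,
	"fs.ls":    CategoryRead,
	"fs.write": CategoryWrite,
	"fs.edit":  CategoryWrite,
	"fs.glob":  CategorySearch,
	"fs.grep":  CategorySearch,
	"bash":     CategoryExec,
}

// CategoryOf returns the category for a tool name, falling back to
// CategoryMeta when the tool is not in the table.
func CategoryOf(toolName string) Category {
	if c, ok := toolCategories[toolName]; ok {
		return c
	}
	return CategoryMeta
}

func main() {
	for _, name := range []string{"fs.read", "fs.grep", "bash", "system_info"} {
		fmt.Printf("%s -> %s\n", name, CategoryOf(name))
	}
}
```

Keeping the mapping in one table means a round-2 request can filter the tool catalogue by comparing `CategoryOf(name)` against the category the model selected.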