gnoma

Author	SHA1	Message	Date
vikingowl	3c875276c9	feat(security): implement multi-wave audit remediation and agy provider support Implemented full security remediation following Universal Security Pilot protocol: - W1: Enforced SecureProvider at router and engine boundaries to prevent bypasses. - W1: Implemented path-sensitive policy for MCP tools. - W2: Added SHA256 hash verification for SLM downloads (llamafile). - W3: Enhanced secret redaction for private keys (full body) and high-entropy strings. - W4: Fixed symlink-based filesystem sandbox escapes in paths and grep. - W4: Documented CLI agent trust boundaries. Also added 'agy' (Antigravity) as a subprocess CLI provider with plain-text JSON schema support.	2026-05-20 01:13:13 +02:00
vikingowl	129d4f1ea6	chore: remove TinyLlama and set tiny3.5 (Qwen2.5 0.5B) as default SLM	2026-05-20 00:26:58 +02:00
vikingowl	a14fe8b504	feat(slm): pluggable backends + trivial-prompt routing The SLM had two intended jobs — classify every prompt and execute the small ones itself — but in practice three independent gates kept it out of nearly all real work: 1. llamafile cold-start blocked pipe-mode runs (always faster than the 15 s health check) 2. ClassifyTask defaulted RequiresTools=true, excluding the SLM arm (ToolUse=false) from 9/10 task types 3. armTier hard-coded CLI agents > local > API, so even when the SLM arm was feasible a CLI agent won Each gate is addressed below. The result is an SLM that actually does its job — small stuff stays local, complex stuff routes up — gated by arm capability rather than by accidents of the boot order. Backend layer (the bigger change) The original implementation hard-coded llamafile. That's fine if you have nothing else, but most users with a local model setup already run Ollama or llama.cpp. The new factory at internal/slm/backend.go picks between: - ollama (any local Ollama daemon) - llamacpp (any llama.cpp server) - llamafile (gnoma-managed, current behaviour) - openaicompat (LM Studio, vLLM, remote API) - auto (probes in order, picks first reachable) - disabled [slm].backend in config.toml selects which. Documented in docs/slm-backends.md with copy-paste presets for each. The factory probes the underlying model's actual capabilities (Ollama /api/show, llama.cpp /props) and sets the SLM arm's ToolUse accordingly — so the arm picks up simple file-read style tasks on tool-capable models and stays knowledge-only on completion-only models. Trivial-prompt heuristic (Gate 2) ClassifyTask now flips RequiresTools=false for short, low-complexity prompts whose task type doesn't imply existing code (Explain, Generation, Boilerplate). Tool-needing tokens (read, write, run, test, file, …) keep RequiresTools=true even when the prompt is brief. Complexity-aware tier ordering (Gate 3) armTier takes a Task and returns tier 0 for arms whose MaxComplexity ceiling fits the task. CLI agents drop to tier 1, local to 2, API to 3. For trivial tasks the SLM arm wins; for complex tasks the SLM falls out of the feasible set (MaxComplexity exclusion) and the original ordering reasserts. Eager boot with user-facing wait (Gate 1) Removed the original goroutine-only path. SLM startup now blocks synchronously inside the factory; for llamafile that means up to [slm].startup_timeout (default 5 s) of waiting on the first invocation, with "Starting SLM…" → "SLM ready (backend, model, tools, boot=N)" / "SLM unavailable: …" messages on stderr. Ollama / llamacpp backends boot instantly because the daemon is already running. waitHealthy() now respects the caller's context deadline instead of its old hardcoded 15 s ceiling. Classifier reliability Classifier timeout bumped 2 s → 5 s for thinking-mode models like Qwen3-distilled Tiny3.5. System prompt includes /no_think directive for the same family. These help but don't eliminate small-model JSON-contract failures — see the docs section on picking a model. Probe + telemetry surfaces gnoma slm status now prints the configured backend + model + a live probe result (✓/✗) instead of just the llamafile manifest state. `gnoma router stats` already (from the previous commit) shows the classifier-source mix; with this change you can finally see slm / slm_fallback / heuristic share rise from "always heuristic" to something reflecting real SLM activity. Tests - 9 new backend-factory tests (httptest-backed Ollama probe, error paths, auto-detection, capability flags) - Tier-ordering tests cover the new "specialised small arm wins trivial task" path - Trivial-prompt heuristic tested for both halves (knowledge-only flips RequiresTools=false; debug/file/run keeps it true) Deletes the dead SLMManager field from the TUI Config — it was declared but never read.	2026-05-19 18:53:32 +02:00
vikingowl	58beb7ce3c	feat(router): classifier-source telemetry + router stats command Phase 4 routing decisions depend on knowing whether the SLM classifier is actually firing or whether the heuristic is silently doing all the work. Adds the instrumentation to make that observable. router.ClassifierSource enum (heuristic / slm / slm_fallback) is set on Task by every classifier: - HeuristicClassifier → ClassifierHeuristic - slm.Classifier → ClassifierSLM on success, ClassifierSLMFallback when the SLM call fails or returns unparseable output The source is plumbed through router.Outcome to QualityTracker, which now maintains per-source counters alongside the existing per-arm × task EMA scores. QualitySnapshot serializes both (classifier_counts is omitempty for back-compat with pre-feature quality.json files). lazyClassifier logs at INFO the first time it falls back to heuristic because the SLM hasn't booted yet — distinguishes operational fallback from an unconfigured-SLM run. slm.Manager.Start() now records elapsed-to-healthy and the main.go goroutine logs it as part of the "SLM ready" event. Confirms whether short-lived runs are racing the boot cycle. New `gnoma router stats` subcommand prints both tables (arm × task quality, classifier source breakdown) from quality.json with a Phase 4 trust hint when the data is too sparse or the SLM share is low. 6 new tests cover ClassifierSource string/enum, heuristic + SLM source propagation, QualityTracker counter round-trip, and back-compat restore from a legacy quality.json without classifier_counts.	2026-05-19 18:18:22 +02:00
vikingowl	13b2f5e14d	chore(lint): clear dead code and tighten lifecycle errcheck Removes five unused funcs/vars/fields that golangci-lint had been flagging (anthropic.toolCallDoneEvent, mistral.translateMessages, hook.newError, subprocess.vibeParser.lastAssistantMsgID, tui.cBase), two ineffectual assignments (tui/rendering.go visible-window loop, subprocess stream_test setup), and a stale if/HasPrefix that's now a strings.TrimPrefix. Wires errcheck onto every subprocess / stream lifecycle path so a failed close or shutdown is at least logged rather than silently dropped: - engine/loop.go: stream.Close on both the error and success paths - mcp/manager.go: Shutdown when StartAll partial-fails; Transport close after Initialize failure - mcp/transport.go: stdin.Close + syscall.Kill on graceful-timeout fallback - slm/download.go: Close propagated as a named-return error on the success path; explicitly discarded on the rollback path - slm/classifier.go, slm/manager.go, hook/prompt.go, context/summarize.go, config/write.go, cmd/gnoma/main.go, tool/fs/grep.go: explicit ignores or error logging on Close / Shutdown / WalkDir / Scanln Production-code errcheck and ineffassign are now zero. Remaining golangci-lint output is test-only Close-in-defer noise plus stylistic staticcheck QF suggestions, left alone.	2026-05-19 17:05:54 +02:00
vikingowl	9037a0d195	fix(slm): skip re-download when already set up Setup() now returns early if Status() == StatusReady. CLI also prints the existing path/size instead of starting a download.	2026-05-07 17:10:16 +02:00
vikingowl	329610209a	fix(slm): invoke llamafile via sh to bypass Wine binfmt_misc APE polyglot binaries start with MZ magic bytes which Wine's binfmt_misc rule intercepts on Linux. llamafile is also a valid POSIX shell script; running it via 'sh' bypasses the kernel's binfmt_misc lookup entirely.	2026-05-07 17:08:52 +02:00
vikingowl	0a1730943f	fix: provider-agnostic startup + slm setup auto-config Remove the hardcoded mistral default so gnoma starts without any provider configured. TUI mode uses a stubProvider that lets CLI agent arms (claude, gemini, etc.) handle routing; pipe mode prints a clear setup message. Also: gnoma slm setup now auto-writes the default model_url to the global config when none is set, instead of erroring.	2026-05-07 17:05:06 +02:00
vikingowl	a9213ec382	feat(slm): Wave C — SLM classifier, MaxComplexity routing, CLI subcommands, TUI status - slm.Classifier: openaicompat → llamafile, 2s timeout + heuristic fallback, heuristic baseline blended so Priority/RequiredEffort are never zeroed, extractJSON strips markdown fences from small-model responses - router.ParseTaskType: case-insensitive string → TaskType, unknown → TaskGeneration - router.Arm.MaxComplexity: zero = no ceiling (preserves existing arm behavior); filterFeasible excludes arms when task.ComplexityScore > MaxComplexity - config.SLMSection: [slm] enabled / model_url / data_dir - openaicompat.NewLlamafile: no API key, model = "default", no retries - slm.Manager: DefaultDataDir() (XDG), Manifest() accessor - cmd/gnoma: `gnoma slm setup` / `gnoma slm status` subcommands; SLM arm registered with MaxComplexity=0.3 when enabled + set up - tui: /config shows slm status (ready/missing/not set up + base URL if running) - docs: roadmap updated to reflect llamafile pivot from Ollama	2026-05-07 16:44:32 +02:00
vikingowl	d1a5c79fa4	feat(slm): Wave B — Manager, Manifest, download, subprocess lifecycle - Manifest: JSON read/write with atomic rename; presence = ready invariant - download: HTTP fetch with SHA256 computation, progress callback, cleanup on failure - Manager: Status (NotSetUp/Ready/Missing), Setup (download + manifest write), Start (freePort, exec, PID file, health check), Stop, BaseURL - waitHealthy: polls /health with 15s ceiling and context cancellation - reapStalePID: kills stale process from previous run on next Start - 28 tests; all pass	2026-05-07 16:23:46 +02:00

10 Commits