gnoma

Author	SHA1	Message	Date
vikingowl	49d80cf847	feat(security): format-aware entropy safelist (Phase F-1) Add a deterministic pre-extractor that skips known-safe token shapes before they reach the entropy scorer. Targets the false-positive regime that bites under lowered entropy_threshold or redact_high_entropy = true — UUIDs (~3.4 bits), SHA hex digests (~3.9 bits), ISO-8601 timestamps, and HTTP(S) URLs. Config knob lives under the existing security section to match entropy_threshold / redact_high_entropy convention: [security] entropy_safelist = ["uuid", "sha_hex", "iso8601", "url"] Empty / unset preserves pre-F-1 behaviour exactly — users opt in. Per-pattern Debug telemetry fires on every skip (pattern name + token length, never the token bytes). This is the data F-2's go/no-go gate depends on; the plan literally specifies it. NewFirewall validates names at the config boundary and emits a Warn for unknown entries so a typo like "uid" instead of "uuid" surfaces loudly instead of silently disabling FP reduction. Tests cover: UUID/SHA-1/SHA-256 skipped at lowered threshold, mixed payload (safe shape + real secret) preserves the secret, secret-adjacent-to-UUID regression guard, empty safelist preserves pre-F-1 behaviour, unknown name silently dropped at scanner level but warned at firewall level, end-to-end FirewallConfig wiring, and the skip-telemetry log line. F-2 remains gated on real-workload FP-rate observations.	2026-05-22 12:39:10 +02:00
vikingowl	7d0e35b0f4	docs: record Phase F external validation, surface in active TODOs	2026-05-20 19:15:49 +02:00
vikingowl	8d6e66533b	docs(plans): add Phase F entropy FP reduction to post-SLM plan	2026-05-20 10:06:43 +02:00
vikingowl	3ae40083f1	docs(security): Wave 2 plan — incognito coherence Plan for the second hardening wave. Six findings closed in one PR: W2-1 router rejects forced non-local under local-only; W2-2 persist store consults IncognitoMode + 0o600/0o700 perms; W2-3 TUI seeds incognito from firewall; W2-4 quality/outcome gates read firewall instead of CLI flag; W2-5 session perms 0o600; W2-6 remove dead IncognitoMode.LocalOnly field.	2026-05-19 22:44:20 +02:00
vikingowl	8dcca64e41	feat(security): add SafeProvider boundary wrapper (W1-1) Introduces internal/security/SafeProvider — a provider.Provider decorator that scans outgoing messages and the system prompt through the firewall before delegating to the inner provider. Tool-result redaction stays in the engine because it needs per-tool context the boundary lacks. FirewallRef provides a late-binding atomic.Pointer[Firewall] so the wrapper can be installed before NewFirewall runs in main. A nil or unset ref makes SafeProvider a pass-through — preserves the current init order without lock contention or panics. Wave 1 of the post-audit hardening plan (docs/superpowers/plans/2026-05-19-security-wave1-safeprovider.md). Closes the architectural critique that secret scanning only ran inside engine.buildRequest(), leaving SLM/summarizer/hook/routerStreamer paths to send raw payloads. This commit only ships the wrapper; W1-2 and W1-3 will wire it through main and the four bypass sites.	2026-05-19 22:28:46 +02:00
vikingowl	d84b295da2	feat(tui): /profile slash command + status-bar profile badge (Phase C-3) Adds the in-TUI surface for the profile system: - Status bar carries " · profile: <name>" next to the SLM badge when profile mode is engaged (renders nothing in legacy single-config installations). - /profile (no args) shows the active profile and lists available ones. - /profile <name> switches by re-executing gnoma via syscall.Exec under --profile <name>. Critical cleanups (quality.json snapshot, SLM backend Close, session.Close) fire explicitly before exec since defers don't run after exec replaces the process image. Using syscall.Exec rather than a child process avoids stacking a process level on every switch and propagates the new gnoma's exit code directly to the shell. - Autocomplete after "/profile " offers configured profile names; the completion source is threaded from main.go via tui.Config. Conversation history is not preserved across a switch — profile change implies different context, different keys, different permission mode, so a clean reset is the correct semantic.	2026-05-19 21:59:11 +02:00
vikingowl	8450005b31	feat(cli): gnoma profile list/show subcommands (Phase C-2) `profile list` enumerates configured profiles and marks default + active. `profile show <name>` prints the merged effective config the profile would produce — sections, configured key names (values never), CLI agent overrides, arms, hooks, MCP servers, per-profile quality and session paths. Both commands work as a recovery affordance when profile resolution is broken: list flags a missing-default explicitly with "<name> (default, missing)", and the dispatcher falls back to a base-only load (new gnomacfg.LoadBase) so the diagnostics still run. API key values are filtered out of `profile show` — the output is safe to paste in a help channel or attach to a bug report.	2026-05-19 21:44:50 +02:00
vikingowl	635dad660c	feat(config): per-profile config layering with --profile flag (Phase C-1) Adds opt-in user profiles for swapping API keys, CLI binaries, and permission modes between contexts (work/private/experiment/...). Profile mode engages only when ~/.config/gnoma/profiles/ exists, so existing single-config installations are untouched. Selection order: --profile flag → default_profile in base config → fatal error. Layering: defaults → ~/.config/gnoma/config.toml → profiles/<name>.toml → <projectRoot>/.gnoma/config.toml → env. Map sections merge per-key; [[arms]] and [[mcp_servers]] merge by id/name; [[hooks]] appends. Per-profile data: quality-<name>.json and sessions/<name>/ keep the bandit and session list from cross-contaminating between profiles. Profile names restricted to [A-Za-z0-9_-] to block --profile=../foo path traversal into derived paths.	2026-05-19 21:35:33 +02:00
vikingowl	0aabd19906	feat(router): per-arm strengths + cost weight (Phase D) Plan D from docs/superpowers/plans/2026-05-19-post-slm-unlock.md (static portion; dynamic bandit-driven promotion deferred to D-2). Routing previously let tier ordering (CLI > local > API) dominate selection — Opus, in tier 3, would lose to a tier-1 CLI agent for SecurityReview even though Opus is empirically stronger at that task. This change introduces explicit per-arm overrides: [[arms]] id = "anthropic/claude-opus-4-7" strengths = ["security_review", "planning"] cost_weight = 0.3 Strengths gate cross-tier promotion: arms matching task.Type bypass the tier loop and compete with each other directly. Promotion is a preference, not a pin — if no strength-tagged arm is feasible (backoff, pool capacity, tool support), selection falls through to the default tier order. CostWeight linearly dampens the cost penalty in scoreArm via effectiveCost = 1 + CostWeight * (cost - 1) CostWeight=1.0 (or unset) preserves current behavior; lower values trade cheapness for quality. The earlier draft used cost^CostWeight which inverts direction for sub-1 local-arm costs (raising a fraction <1 to a fractional power makes it bigger, not smaller); a monotonicity regression test prevents that drift. - internal/router/arm.go: Strengths []TaskType, CostWeight float64, HasStrength(), ResolvedCostWeight() (zero → 1.0). - internal/router/selector.go: scoreArm strength bonus const (strengthScoreBonus = 0.15) + linear cost dampening; selectBest cross-tier promotion before tier loop. - internal/router/router.go: ArmOverride type + ApplyArmOverrides() returns unknown IDs; unknown strength names skipped with per-name warning via slog. - internal/router/task.go: ParseTaskTypeStrict() returns ok bool; ParseTaskType now delegates so the two switches stay in sync. - internal/config/config.go: ArmConfig + [[arms]] TOML wiring. - cmd/gnoma/main.go: applies overrides after all initial arms register; logs a warning when an [[arms]] id has no matching registered arm. Tests cover: predicate helpers, scoring direction across two arms, linear-formula monotonicity on both sides of cost=1, cross-tier promotion, empty-Strengths preserves tier order, promoted arm in backoff falls through via full Router.Select path, observed-quality tiebreak between two strength-tagged arms, ApplyArmOverrides happy path + unknown-ID reporting + unknown-strength skipping.	2026-05-19 21:14:45 +02:00
vikingowl	b331dcd61a	feat(subprocess): per-agent binary override via [cli_agents] config Plan B from docs/superpowers/plans/2026-05-19-post-slm-unlock.md. Users with aliased CLI binaries (claude-priv, claude-work, gemini-personal) can now point gnoma's auto-discovery at them without renaming. The override flows through to the actual subprocess spawn at internal/provider/subprocess/provider.go:56, so routing through the alias is functional, not cosmetic. Config: [cli_agents] claude = "claude-priv" # discovery uses claude-priv instead of claude gemini = "" # empty value = no override (fall back to canonical) # vibe is absent = canonical name used - internal/config/config.go: CLIAgentsSection map[string]string; TOML [cli_agents] key. - internal/provider/subprocess/agent.go: - Package-level lookPath = exec.LookPath for test injection. - resolveAgentBinary(canonical, override) → (path, binName, err). Override='' falls back to canonical. Override set but missing from PATH returns an error (no silent fallback — masks user typos). - DiscoveredAgent.OverrideBinary records the override binary name when one was used; empty otherwise. - DiscoverCLIAgents(ctx, overrides) signature; warning logged when an override is configured but the binary isn't on PATH. - cmd/gnoma/main.go: both call sites pass cfg.CLIAgents. The `gnoma providers` listing renders `claude-priv (via [cli_agents].claude)` when an override is in effect. Tests cover: 5 resolver cases (no override, override set, empty override falls back, override missing, canonical missing); 4 discovery cases (no overrides, override resolves alias, empty value falls back, override missing skips agent); 2 config round-trip cases.	2026-05-19 21:02:16 +02:00
vikingowl	43ea2e562d	feat(engine): two-stage tool routing for small local arms Plan A from docs/superpowers/plans/2026-05-19-post-slm-unlock.md. Small local SLMs (<=16k context) waste ~1500 tokens per turn on the full tool catalogue. Two-stage routing replaces round-1 tools with a single synthetic select_category schema; round-2+ sends only the selected category's real tool schemas plus select_category for re-selection. - internal/tool/category.go: Category type, optional Categorized interface, CategoryOf() with meta fallback. fs.read/fs.ls -> read, fs.write/fs.edit -> write, fs.glob/fs.grep -> search, bash -> exec. - internal/engine/twostage.go: synthetic select_category tool, intercept helper, per-turn selectedCategory state under e.mu. - Engine round 1 forces ToolChoiceRequired so SLMs don't fall back to prose. State resets at the top and end of every runLoop. - Activates automatically on a forced local arm with ContextWindow <=16384, or via [router].force_two_stage TOML key. - Integration test drives a 3-round trip and asserts: round 1 emits exactly one schema (synthetic) with ToolChoiceRequired, round 2 contains only write-category schemas + select_category, real fs.write executes. Invalid-category fallback round-trips back to round-1 mode.	2026-05-19 20:53:21 +02:00
vikingowl	21da29e73e	docs(plan): capture post-SLM-unlock outstanding work New dated plan at docs/superpowers/plans/2026-05-19-post-slm-unlock.md covers the work surfaced during this session that hasn't shipped yet: Phase A — two-stage tool routing (last item from the original smallcode audit; gates on local + small-context arms; saves ~70% of schema tokens per request). Phase B — CLI agent binary override. [cli_agents] config section lets users map canonical agent names (claude / gemini / vibe) onto local aliases (claude-priv, gemini-work, etc.). Phase C — user profiles. Multiple named configs (work / private / experiment) layered over a base config.toml, switchable via --profile flag, [config].default_profile, and a /profile TUI command. Phase D — per-arm capability tags (Phase-4 prep). Per-arm Strengths []TaskType and CostWeight to make the router actually pick Opus over Gemini for Planning/SecurityReview etc., not just for cost reasons. Phase E — compound tools (deferred until SLM-arm telemetry shows which chain patterns fail). Plus an explicit drop list of things we considered and won't ship. TODO.md updated to point at the new plan and note that the original roadmap's Phase 4 is now superseded.	2026-05-19 19:31:40 +02:00
vikingowl	a9213ec382	feat(slm): Wave C — SLM classifier, MaxComplexity routing, CLI subcommands, TUI status - slm.Classifier: openaicompat → llamafile, 2s timeout + heuristic fallback, heuristic baseline blended so Priority/RequiredEffort are never zeroed, extractJSON strips markdown fences from small-model responses - router.ParseTaskType: case-insensitive string → TaskType, unknown → TaskGeneration - router.Arm.MaxComplexity: zero = no ceiling (preserves existing arm behavior); filterFeasible excludes arms when task.ComplexityScore > MaxComplexity - config.SLMSection: [slm] enabled / model_url / data_dir - openaicompat.NewLlamafile: no API key, model = "default", no retries - slm.Manager: DefaultDataDir() (XDG), Manifest() accessor - cmd/gnoma: `gnoma slm setup` / `gnoma slm status` subcommands; SLM arm registered with MaxComplexity=0.3 when enabled + set up - tui: /config shows slm status (ready/missing/not set up + base URL if running) - docs: roadmap updated to reflect llamafile pivot from Ollama	2026-05-07 16:44:32 +02:00
vikingowl	5569d4fb86	docs: consolidated roadmap, ADR-013, drop stale plans - New 7-phase roadmap (2026-05-07-gnoma-roadmap.md) covering M8 cleanup, PTY interactive shell, SLM classifier, router revisit, USP security, ELF support, and distribution - ADR-013 (002-slm-routing.md): SLM-first routing supersedes ADR-009; Thompson Sampling deferred pending SLM production data - ADR-009 status updated to "Superseded by ADR-013" - gemma-integration-analysis.md: header note that Node.js specifics (LiteRT-LM, daemon, PID) don't apply to gnoma's Go implementation - TODO.md replaced with thin pointer to roadmap + stable backlog - Deleted stale plan/spec files: m6-m7-closeout, m8-hooks-design	2026-05-07 15:06:54 +02:00
vikingowl	fef38b3502	docs: M8.1 hook system design spec	2026-04-06 02:42:34 +02:00
vikingowl	43dcc7e9de	docs: M6/M7 close-out implementation plan — 8 tasks, TDD, full file map	2026-04-05 21:33:42 +02:00
vikingowl	252ffde732	docs: M6/M7 close-out design spec — tool persistence, tokenizer, router feedback, coordinator	2026-04-05 21:22:26 +02:00
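
Note on commit 0aabd19906: the message quotes the dampening formula effectiveCost = 1 + CostWeight * (cost - 1) and mentions a monotonicity test on both sides of cost=1. The Go sketch below only illustrates that formula; it is not the code in internal/router/selector.go. The names effectiveCost and monotonic are hypothetical, and treating a zero weight as 1.0 is an assumption taken from the "zero → 1.0" note on ResolvedCostWeight().

```go
package main

import "fmt"

// effectiveCost applies the linear dampening quoted in the commit message:
//   effectiveCost = 1 + CostWeight * (cost - 1)
// Assumption: a zero weight means "unset" and resolves to 1.0, which
// leaves the raw cost unchanged.
func effectiveCost(cost, costWeight float64) float64 {
	if costWeight == 0 {
		costWeight = 1.0
	}
	return 1 + costWeight*(cost-1)
}

// monotonic reports whether effectiveCost never decreases as cost rises
// over an ascending sample of costs, for a fixed weight.
func monotonic(costs []float64, costWeight float64) bool {
	for i := 1; i < len(costs); i++ {
		if effectiveCost(costs[i], costWeight) < effectiveCost(costs[i-1], costWeight) {
			return false
		}
	}
	return true
}

func main() {
	// Sample costs on both sides of cost = 1 (illustrative values only).
	costs := []float64{0.25, 0.5, 1, 2, 4}
	for _, w := range []float64{0.3, 1.0} {
		fmt.Printf("weight=%.1f monotonic=%v\n", w, monotonic(costs, w))
		for _, c := range costs {
			fmt.Printf("  cost=%.2f effective=%.3f\n", c, effectiveCost(c, w))
		}
	}
}
```

With a weight below 1.0 every effective cost moves toward 1, so the gap between a cheap arm and an expensive arm shrinks while their ordering is kept.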

17 Commits
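
Note on commit 49d80cf847: the message describes a pre-extractor that skips known-safe token shapes before entropy scoring, enabled per name through [security] entropy_safelist. The Go sketch below shows one way such a check could look. It is not the project's implementation: shannonBits, safeShapes, and matchSafelist are hypothetical names, the regular expressions are assumptions, per-byte Shannon entropy is assumed as the scoring measure, and only the "uuid" and "sha_hex" shapes are covered ("iso8601" and "url" are omitted).

```go
package main

import (
	"fmt"
	"math"
	"regexp"
)

// shannonBits returns the Shannon entropy of s in bits per byte.
func shannonBits(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	bits := 0.0
	for _, c := range counts {
		if c > 0 {
			p := float64(c) / n
			bits -= p * math.Log2(p)
		}
	}
	return bits
}

// safeShapes maps safelist names to assumed token shapes.
// sha_hex here accepts SHA-1 (40 hex chars) and SHA-256 (64 hex chars).
var safeShapes = map[string]*regexp.Regexp{
	"uuid":    regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`),
	"sha_hex": regexp.MustCompile(`^(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$`),
}

// matchSafelist returns the first enabled pattern name that matches token.
// Unknown names are ignored here; an empty list skips nothing.
func matchSafelist(token string, enabled []string) (string, bool) {
	for _, name := range enabled {
		if re, ok := safeShapes[name]; ok && re.MatchString(token) {
			return name, true
		}
	}
	return "", false
}

func main() {
	enabled := []string{"uuid", "sha_hex"}
	tokens := []string{
		"123e4567-e89b-12d3-a456-426614174000",     // UUID-shaped
		"da39a3ee5e6b4b0d3255bfef95601890afd80709", // SHA-1 of the empty string
		"hunter2-Not_A_Real_Key",                   // made-up token, matches no shape
	}
	for _, t := range tokens {
		if name, ok := matchSafelist(t, enabled); ok {
			// Report pattern name and length only, never the token bytes.
			fmt.Printf("skip pattern=%s len=%d\n", name, len(t))
			continue
		}
		fmt.Printf("score len=%d bits=%.2f\n", len(t), shannonBits(t))
	}
}
```

Anchoring each pattern to the whole token keeps a secret that merely sits next to a UUID from being skipped along with it, which is the case the commit's regression guard names.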