8 Commits

Author SHA1 Message Date
vikingowl a79e99199d feat(router): non-chat exclude, vision prefixes, family-defaults scaffold
Discovery previously registered every model returned by Ollama as a
chat arm, including embeddings, ASR, TTS, audio realtime, and
rerankers — which then failed at inference time when the router
selected them. Local arms also shipped with all-zero defaults, so
selection between e.g. tiny3.5:1.5b, phi-4:14b, and qwen3-coder:30b
was effectively random.

This change covers tasks R-1, R-2, R-6 from the routing-defaults plan.

- nonChatModelPatterns + isNonChatModel substring matcher; matched
  IDs are skipped during RegisterDiscoveredModels. Covers whisper,
  moonshine, kokoros, vibevoice, -asr, -tts, -audio, -embedding,
  embeddinggemma, -reranker, lfm2.
- knownVisionModelPrefixes gains gemma4, gemma-4, glm-ocr. gemma3
  and minicpm-v entries stay for regression coverage.
- New internal/router/defaults.go with FamilyDefaults struct,
  knownFamilyDefaults map, and ResolveFamilyDefaults longest-prefix
  lookup (with org/-namespace stripping so reecdev/tiny3.5:1.5b
  resolves to "tiny3.5"). Single entry for now: functiongemma is
  registered with Disabled=true and MaxComplexity=0.40, reserved for
  the future ArmRoleToolRouter path. Table will grow in R-3.
- RegisterDiscoveredModels consults ResolveFamilyDefaults and only
  populates fields that are still zero on the arm, so user [[arms]]
  overrides keep priority.
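
The matcher and the longest-prefix lookup described above could look
roughly like the sketch below. The pattern list and the functiongemma
entry come from this message; the Arm type, the field set of
FamilyDefaults, and applyFamilyDefaults are hypothetical stand-ins, not
the code in internal/router.

```go
package main

import "strings"

// nonChatModelPatterns mirrors the substrings listed in the commit message.
var nonChatModelPatterns = []string{
	"whisper", "moonshine", "kokoros", "vibevoice", "-asr", "-tts",
	"-audio", "-embedding", "embeddinggemma", "-reranker", "lfm2",
}

// isNonChatModel reports whether the model ID contains any non-chat pattern.
func isNonChatModel(id string) bool {
	lower := strings.ToLower(id)
	for _, p := range nonChatModelPatterns {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

// FamilyDefaults holds per-family routing defaults (field set assumed).
type FamilyDefaults struct {
	Disabled      bool
	MaxComplexity float64
}

var knownFamilyDefaults = map[string]FamilyDefaults{
	"functiongemma": {Disabled: true, MaxComplexity: 0.40},
}

// ResolveFamilyDefaults strips any org/ namespace, then returns the entry
// whose key is the longest prefix of the remaining model name.
func ResolveFamilyDefaults(id string) (FamilyDefaults, bool) {
	name := strings.ToLower(id)
	if i := strings.LastIndex(name, "/"); i >= 0 {
		name = name[i+1:]
	}
	best := ""
	for prefix := range knownFamilyDefaults {
		if strings.HasPrefix(name, prefix) && len(prefix) > len(best) {
			best = prefix
		}
	}
	if best == "" {
		return FamilyDefaults{}, false
	}
	return knownFamilyDefaults[best], true
}

// Arm is a hypothetical, trimmed-down arm record.
type Arm struct {
	ID            string
	MaxComplexity float64
}

// applyFamilyDefaults fills only fields that are still zero, so values the
// user set explicitly are left alone.
func applyFamilyDefaults(a *Arm, d FamilyDefaults) {
	if a.MaxComplexity == 0 {
		a.MaxComplexity = d.MaxComplexity
	}
}
```

With this shape, an ID such as acme/functiongemma:270m (a made-up
namespace) resolves to the functiongemma entry, while an arm that
already carries a non-zero MaxComplexity keeps it.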

Plans:
- docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md
- docs/superpowers/plans/2026-05-23-tool-router-specialization.md

TODO.md surfaces both as in-flight items.
2026-05-23 21:24:59 +02:00
vikingowl a2b7f8eb3f feat(router): vision capability gating and Ollama vision detection
Task gains a RequiresVision bool; filterFeasible enforces it on
both the primary feasibility pass and the last-resort fallback
(no degradation to a non-vision arm — the model literally cannot
consume image bytes).
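
A minimal sketch of the gate, assuming simplified Task and Arm types
(the real filterFeasible checks more than vision; filterVision here is
a hypothetical reduction to the one rule described above):

```go
package main

// Capabilities and Arm are trimmed-down stand-ins for the router types.
type Capabilities struct{ Vision bool }

type Arm struct {
	ID   string
	Caps Capabilities
}

// Task carries the RequiresVision flag introduced by this commit.
type Task struct{ RequiresVision bool }

// filterVision drops every non-vision arm when the task requires vision.
// An empty result means no arm can serve the task; callers must not fall
// back to a non-vision arm.
func filterVision(task Task, arms []Arm) []Arm {
	out := make([]Arm, 0, len(arms))
	for _, a := range arms {
		if task.RequiresVision && !a.Caps.Vision {
			continue
		}
		out = append(out, a)
	}
	return out
}
```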

Ollama discovery now probes /api/show for vision capability:
- details.families containing "clip" / "mllama" / "*vl"
- capabilities array containing "vision" (newer Ollama)
- name-prefix fallback for releases that predate either
  (llava, qwen2.5-vl, llama3.2-vision, moondream, pixtral, etc.)
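
The three checks can be sketched as one function over the /api/show
response body. The struct below models only the two fields named above;
detectVision, the prefix list, and the reading of "*vl" as "family name
ending in vl" are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"strings"
)

// showResponse models only the /api/show fields this check needs.
type showResponse struct {
	Details struct {
		Families []string `json:"families"`
	} `json:"details"`
	Capabilities []string `json:"capabilities"`
}

// visionNamePrefixes is the name-based fallback (subset from the message).
var visionNamePrefixes = []string{
	"llava", "qwen2.5-vl", "llama3.2-vision", "moondream", "pixtral",
}

// detectVision applies the checks in order: capabilities array, then
// details.families, then the model-name prefix fallback.
func detectVision(name string, body []byte) (bool, error) {
	var r showResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return false, err
	}
	for _, c := range r.Capabilities {
		if c == "vision" {
			return true, nil
		}
	}
	for _, f := range r.Details.Families {
		f = strings.ToLower(f)
		if f == "clip" || f == "mllama" || strings.HasSuffix(f, "vl") {
			return true, nil
		}
	}
	lower := strings.ToLower(name)
	for _, p := range visionNamePrefixes {
		if strings.HasPrefix(lower, p) {
			return true, nil
		}
	}
	return false, nil
}
```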

OllamaProbeResult replaces the map[string]bool tool cache so the
single /api/show call can populate tools + vision + ctx-size in
one probe. DiscoverOllama / DiscoverLocalModels signatures updated;
nil-cache callers in cmd/gnoma keep working unchanged.
RegisterDiscoveredModels propagates SupportsVision into the arm's
Capabilities.Vision.

Tests cover RequiresVision filtering in both the happy path
(vision-only arm chosen when image present) and the fallback path
(non-vision arm rejected even as last resort).
2026-05-22 11:50:33 +02:00
vikingowl c4fde583f5 chore(lint): gofmt sweep + errcheck cleanups in router discovery
Apply gofmt -w across the codebase (struct field comment realignment
only — no semantic changes) and silence two errcheck warnings on
fmt.Sscanf / fmt.Fprintf return values in internal/router/discovery
with explicit `_, _ =` discards. Required so `make check` is green
before tagging v0.1.0.
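
For reference, the discard pattern looks like this. The function names
are hypothetical; only the `_, _ =` form is what the commit describes.

```go
package main

import (
	"fmt"
	"io"
)

// parseCount returns 0 when s does not start with an integer. The parse
// error is discarded on purpose, and the explicit blank assignment tells
// errcheck so.
func parseCount(s string) int {
	var n int
	_, _ = fmt.Sscanf(s, "%d", &n)
	return n
}

// writeStatus is best-effort output; a failed write is deliberately ignored.
func writeStatus(w io.Writer, n int) {
	_, _ = fmt.Fprintf(w, "models: %d\n", n)
}
```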
2026-05-20 03:13:05 +02:00
vikingowl fb42202834 refactor(security): seal SecureProvider via unexported marker method
The router.SecureProvider interface previously required a public
IsSecure() bool method. Any test mock — or future production type —
could satisfy it by returning true, defeating the W1 "only wrapped
providers may flow past the boundary" contract through convention
rather than at the type level.

Replaces IsSecure() bool with a security.Marker interface whose single
method, secured(), is unexported. Go identifies unexported methods by
their defining package, so only types declared in internal/security can
satisfy Marker. *SafeProvider gets the lone secured() implementation;
router.SecureProvider embeds Marker.

The seal forces every test mock that previously implemented IsSecure()
to either (a) be wrapped with security.WrapProvider(mp, nil) at the use
site, or (b) drop the method entirely if the mock never flows through
SecureProvider. 93 use sites across 11 test files were updated via a
per-package secureMock helper. WrapProvider with a nil firewall ref is
a no-op pass-through, so test behavior is unchanged.

Empirically: a type from outside internal/security can declare
`secured()` but the compiler will reject assigning it to
router.SecureProvider because the unexported method belongs to the
other package's namespace. Convention → compile-time guarantee.
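
The pattern, collapsed into one package for illustration. In the real
layout Marker, SafeProvider, and WrapProvider live in internal/security
and SecureProvider in the router package, which is what makes the seal
effective; within a single package any type could add secured(). The
one-argument WrapProvider and the Provider interface here are
simplifications, not the project's signatures.

```go
package main

// Marker is the seal: its only method is unexported.
type Marker interface{ secured() }

// Provider is a stand-in for the real provider interface.
type Provider interface{ Name() string }

// SafeProvider is the only type that implements secured().
type SafeProvider struct{ inner Provider }

func (*SafeProvider) secured() {}

func (s *SafeProvider) Name() string { return s.inner.Name() }

// WrapProvider is a simplified wrapper (the real one also takes a firewall).
func WrapProvider(p Provider) *SafeProvider { return &SafeProvider{inner: p} }

// SecureProvider embeds Marker, so only wrapped providers satisfy it.
type SecureProvider interface {
	Provider
	Marker
}

// mockProvider has no secured() method and cannot pass the boundary
// unless it is wrapped.
type mockProvider struct{}

func (mockProvider) Name() string { return "mock" }

// route accepts only sealed providers.
func route(p SecureProvider) string { return p.Name() }
```

Here route(WrapProvider(mockProvider{})) compiles, while
route(mockProvider{}) is rejected at compile time because mockProvider
lacks secured().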
2026-05-20 02:04:07 +02:00
vikingowl f6f8801040 fix(router): restore llama.cpp model enumeration; keep /props for n_ctx
3c87527 rewrote DiscoverLlamaCPP to hit /props and emit a single hardcoded
"default" entry. That breaks two cases:

  1. Multi-model llama.cpp deployments (llama-swap, model-routing proxies)
     are collapsed to a single arm with a placeholder ID.
  2. Single-model deployments lose the real model name — arms are
     registered as llamacpp/default instead of llamacpp/<actual-id>.

Restores enumeration via /v1/models (the OpenAI-compatible endpoint
llama-server exposes) while keeping the concrete n_ctx read from /props.
/props is now best-effort: failure or missing n_ctx falls back to the
documented default rather than aborting discovery.
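
A sketch of the combined logic, operating on already-fetched response
bodies so it stays independent of the HTTP client. The type and
function names are hypothetical; the 8192 fallback is the llama.cpp
default introduced in 8539426a46, and the n_ctx location follows the
default_generation_settings object that commit mentions.

```go
package main

import (
	"encoding/json"
	"errors"
)

// fallbackContext is used when /props is unreachable or lacks n_ctx.
const fallbackContext = 8192

// modelsResponse models the OpenAI-compatible /v1/models listing.
type modelsResponse struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// propsResponse models only the n_ctx field of /props.
type propsResponse struct {
	DefaultGenerationSettings struct {
		NCtx int `json:"n_ctx"`
	} `json:"default_generation_settings"`
}

// DiscoveredModel is a trimmed-down discovery result.
type DiscoveredModel struct {
	ID          string
	ContextSize int
}

// parseLlamaCPP enumerates models from the /v1/models body. The /props
// body is best-effort: nil, invalid, or missing n_ctx yields the fallback.
func parseLlamaCPP(modelsBody, propsBody []byte) ([]DiscoveredModel, error) {
	var m modelsResponse
	if err := json.Unmarshal(modelsBody, &m); err != nil {
		return nil, err
	}
	if len(m.Data) == 0 {
		return nil, errors.New("llama.cpp: /v1/models returned no models")
	}
	ctx := fallbackContext
	var p propsResponse
	if len(propsBody) > 0 && json.Unmarshal(propsBody, &p) == nil &&
		p.DefaultGenerationSettings.NCtx > 0 {
		ctx = p.DefaultGenerationSettings.NCtx
	}
	out := make([]DiscoveredModel, 0, len(m.Data))
	for _, d := range m.Data {
		out = append(out, DiscoveredModel{ID: d.ID, ContextSize: ctx})
	}
	return out, nil
}
```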

Adds three tests: multi-model enumeration with shared context, /props
unreachable, and the empty-/v1/models error path.
2026-05-20 01:45:54 +02:00
vikingowl 8539426a46 fix(router): restore Ollama cache prune + provider-specific context defaults
3c87527 refactored DiscoverOllama and DiscoverLlamaCPP and dropped two
behaviors:

  1. The Ollama toolCache prune loop. Without it, the cache grows
     unbounded across reconcile cycles and stale entries linger; a
     model that disappears and reappears replays an out-of-date
     tool-support verdict because the cache hit skips re-probing.

  2. Sensible context-size defaults. Both probes can yield
     ContextSize=0 (Ollama: no num_ctx in /api/show parameters;
     llama.cpp: /props default_generation_settings without n_ctx).
     Registering an arm with ContextWindow=0 misroutes — the post-SLM
     two-stage path treats it as a tiny model.

Restores the prune loop, applies 32768 (ollama) / 8192 (llama.cpp) as
fallbacks at discovery time, and adds three tests covering each path.
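
Both restored behaviors are small. A sketch under assumed names
(pruneToolCache and contextOrDefault are hypothetical; the map shape and
the two fallback values are taken from this message):

```go
package main

// pruneToolCache deletes cache entries for models that are no longer
// reported by the provider, so a model that reappears is probed again.
func pruneToolCache(cache map[string]bool, live []string) {
	keep := make(map[string]struct{}, len(live))
	for _, id := range live {
		keep[id] = struct{}{}
	}
	for id := range cache {
		if _, ok := keep[id]; !ok {
			delete(cache, id) // deleting during range is safe in Go
		}
	}
}

// contextOrDefault returns the probed context size, or the provider
// fallback when the probe yielded nothing usable.
func contextOrDefault(provider string, probed int) int {
	if probed > 0 {
		return probed
	}
	switch provider {
	case "ollama":
		return 32768
	case "llamacpp":
		return 8192
	}
	return 0
}
```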
2026-05-20 01:42:14 +02:00
vikingowl 3c875276c9 feat(security): implement multi-wave audit remediation and agy provider support
Implemented full security remediation following Universal Security Pilot protocol:
- W1: Enforced SecureProvider at router and engine boundaries to prevent bypasses.
- W1: Implemented path-sensitive policy for MCP tools.
- W2: Added SHA256 hash verification for SLM downloads (llamafile).
- W3: Enhanced secret redaction for private keys (full body) and high-entropy strings.
- W4: Fixed symlink-based filesystem sandbox escapes in paths and grep.
- W4: Documented CLI agent trust boundaries.

Also added 'agy' (Antigravity) as a subprocess CLI provider with plain-text JSON schema support.
2026-05-20 01:13:13 +02:00
vikingowl 0caab0fed1 fix(router): discovery loop removes forced arm, breaking routing
The discovery loop's reconcileArms removed the CLI-forced arm
(llamacpp/default) because the llama.cpp server reports the real model
name (e.g. gemma-26b), creating a mismatch. After 30s the forced arm
disappeared and all subsequent requests failed.

Three-layer fix:
- Eager: query the specific provider at startup to resolve the real
  model name before registering the forced arm
- Lazy: reconcileArms detects placeholder "default" arm names and
  atomically renames them when discovery reveals the real identity,
  with an onReconcile callback to update the session and TUI
- Guard: the forced arm is never garbage-collected by the removal loop
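
The lazy and guard layers can be sketched as follows. This is an
illustration only: the arm set is reduced to a map of IDs, the
placeholder test is "ID ends in /default", and renaming to the first
discovered ID of the same provider is an assumption about how the real
identity is chosen.

```go
package main

import "strings"

// reconcileArms syncs arms with the discovered IDs and returns the
// (possibly renamed) forced arm ID. onRename stands in for the
// onReconcile callback and may be nil.
func reconcileArms(arms map[string]struct{}, forced string,
	discovered []string, onRename func(oldID, newID string)) string {

	live := make(map[string]struct{}, len(discovered))
	for _, id := range discovered {
		live[id] = struct{}{}
	}

	// Lazy: replace a placeholder "<provider>/default" arm with the real
	// model ID once discovery reports one for the same provider.
	if strings.HasSuffix(forced, "/default") {
		prefix := strings.TrimSuffix(forced, "default") // e.g. "llamacpp/"
		for _, id := range discovered {
			if id != forced && strings.HasPrefix(id, prefix) {
				delete(arms, forced)
				arms[id] = struct{}{}
				if onRename != nil {
					onRename(forced, id)
				}
				forced = id
				break
			}
		}
	}

	// Register everything discovery reported.
	for id := range live {
		arms[id] = struct{}{}
	}

	// Guard: the removal loop never drops the forced arm.
	for id := range arms {
		if id == forced {
			continue
		}
		if _, ok := live[id]; !ok {
			delete(arms, id)
		}
	}
	return forced
}
```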

Also fixes misleading /init error messaging — failed inits now show
"loaded from disk (init failed)" instead of "AGENTS.md written to".
2026-04-12 17:51:30 +02:00