gnoma

Author	SHA1	Message	Date
vikingowl	a23eb6b92c	style: gofmt drift from prior commits Pure whitespace cleanup surfaced when 'make check' ran gofmt over the tree. Mostly struct-field column alignment in internal/safety/banner.go (SessionInfo) and the var(...) flag block in cmd/gnoma/main.go after --dangerously-allow-anywhere was added without realignment. Verified zero substantive changes via 'git diff --ignore-all-space --ignore-blank-lines'.	2026-05-24 16:33:17 +02:00
vikingowl	2f8d4c412f	feat(router): cloud-arm defaults, gpt-5.3-codex registration Closes R-4 and R-5 of the routing-defaults plan. R-4: Strengths + CostWeight defaults for closed frontier models. Cloud entries land in the same knownFamilyDefaults table as local ones, with MaxComplexity intentionally left zero (cloud arms get no complexity ceiling). CostWeight tuned per the plan's rationale: claude-opus-4-7 → Planning/SecurityReview/Debug/Refactor, 0.3; claude-sonnet-4-6 → Generation/Refactor/Review, 0.7; gpt-5.5 → Planning/SecurityReview/Generation, 0.3; gpt-5.3-codex → Generation/Refactor/Debug/UnitTest, 0.6; gpt-5.2 → Orchestration/Review, 0.8; gemini-3.1-pro → Planning/Review/Orchestration, 0.5; gemini-3.5-flash → Boilerplate/Explain/Orchestration, 1.2. The 0.3 weight on frontier arms keeps them competitive on SecurityReview / Planning despite $4+/Mtok; 1.2 on Gemini Flash penalizes cost more so it only wins when cost is genuinely decisive (boilerplate, explain). Mechanism: extracted applyFamilyDefaults into defaults.go and call it from Router.RegisterArm. Single source of truth: both local discovery and the primary-provider path in cmd/gnoma/main.go now flow through the same defaults application. Removed the duplicate apply block from RegisterDiscoveredModels. Legacy model IDs (claude-opus-4-20250514, gpt-4o, o3, gemini-2.5-pro, etc.) intentionally do not match any table entry, which keeps users on pinned older models safe from imposed 2026 Strengths. R-5: gpt-5.3-codex registration. - internal/provider/openai/provider.go: added to fallbackModels and inferOpenAIModelCapabilities (400K context, 32K output). - internal/provider/ratelimits.go: gpt-5.3-codex and its dated alias gpt-5.3-codex-2026-02-15 added with the same Tier 1 quotas as gpt-5.2. Gemini 3.x (3.1-pro-preview, 3.5-flash, 3.1-flash-lite) was already registered in both google/provider.go and ratelimits.go, so no change needed for that part of R-5. Test coverage: - ResolveFamilyDefaults table-driven across all 7 cloud entries including prefix-sharing (gpt-5.5-pro → gpt-5.5 defaults, gemini-3.1-pro-preview → gemini-3.1-pro defaults). - Legacy IDs return !ok. - RegisterArm applies cloud defaults end-to-end. - User-supplied Strengths and CostWeight are not overridden. - ID.Model() fallback works when ModelName is empty (test code often constructs arms this way). Refs: docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md	2026-05-23 21:39:48 +02:00
vikingowl	9bb775a4aa	feat(router): full local family defaults table with size-keyed ceilings Expands the family-defaults scaffold to 23 entries covering the local models that currently appear in real Ollama fleets: coder specialists (qwen3-coder, devstral, qwen2.5-coder, yi-coder, deepseek-coder, starcoder), reasoners (phi-4, phi-4-mini), Gemma 2/3/4 (including the "edge" e2b/e4b variants under both Ollama and GGUF naming), Qwen 2.5/3/3.5 with a catch-all qwen entry, Mistral/Ministral (incl. the 24B mistral-small-3), Llama 3.2/4, tiny3.5 (reec's distill family), Granite, GLM (incl. glm-ocr specialist), and MiniCPM-V. Five families that span wide parameter ranges (qwen3.5, qwen3, qwen2.5, ministral-3, tiny3.5) now use SizeCap ladders instead of a flat MaxComplexity. A new parseSizeFromModelID helper splits the model ID on :/-_/ and matches pure <N>b/<N>m tokens, correctly ignoring qwen3.5 version strings, e2b edge tags, a3b MoE active params, and v0.3 version suffixes. ResolveMaxComplexity wraps ResolveFamilyDefaults plus the SizeCap traversal, falling back to the smallest cap when size parsing fails (conservative). Discovery's apply path now goes through it so SizeCap entries actually take effect. Test coverage: - parseSizeFromModelID (11 cases) - ResolveFamilyDefaults longest-prefix discipline (19 cases) - Unknown-family fallback returns !ok - ResolveMaxComplexity size-keyed ladder (13 cases) - Size-parse-failure fallback - knownFamilyDefaults invariants: SizeCaps ordered largest-first, SizeCaps and MaxComplexity mutually exclusive per entry - Routing-payoff integration: 3 arms (tiny3.5:1.5b, phi-4:14b, qwen3-coder:30b) get picked for TaskGeneration / TaskPlanning / TaskBoilerplate respectively, without any [[arms]] config - Local fleet visibility: the maintainer's actual `ollama ls` inventory registers correctly with expected MaxComplexity and Strengths; embeddinggemma stays filtered out The Planning sub-case surfaced a separate issue worth flagging: heuristicQuality floors out at 0.55 for a generic 14B local model without ThinkingModes, below TaskPlanning's 0.60 threshold. The test mutates phi-4's capabilities post-registration to reflect reality (phi-4 is reasoning-tuned). A discovery-side thinking-capability detection is out of scope for this plan but flagged in the test comment for follow-up. Refs: docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md	2026-05-23 21:34:09 +02:00
vikingowl	a79e99199d	feat(router): non-chat exclude, vision prefixes, family-defaults scaffold Discovery previously registered every model returned by Ollama as a chat arm, including embeddings, ASR, TTS, audio realtime, and rerankers — which then failed at inference time when the router selected them. Local arms also shipped with all-zero defaults, so selection between e.g. tiny3.5:1.5b, phi-4:14b, and qwen3-coder:30b was effectively random. This change covers tasks R-1, R-2, R-6 from the routing-defaults plan. - nonChatModelPatterns + isNonChatModel substring matcher; matched IDs are skipped during RegisterDiscoveredModels. Covers whisper, moonshine, kokoros, vibevoice, -asr, -tts, -audio, -embedding, embeddinggemma, -reranker, lfm2. - knownVisionModelPrefixes gains gemma4, gemma-4, glm-ocr. gemma3 and minicpm-v entries stay for regression coverage. - New internal/router/defaults.go with FamilyDefaults struct, knownFamilyDefaults map, and ResolveFamilyDefaults longest-prefix lookup (with org/-namespace stripping so reecdev/tiny3.5:1.5b resolves to "tiny3.5"). Single entry for now: functiongemma is registered with Disabled=true and MaxComplexity=0.40, reserved for the future ArmRoleToolRouter path. Table will grow in R-3. - RegisterDiscoveredModels consults ResolveFamilyDefaults and only populates fields that are still zero on the arm, so user [[arms]] overrides keep priority. Plans: - docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md - docs/superpowers/plans/2026-05-23-tool-router-specialization.md TODO.md surfaces both as in-flight items.	2026-05-23 21:24:59 +02:00
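
The oldest commit above describes three mechanisms: a substring filter for non-chat models, a longest-prefix family lookup with org-namespace stripping, and a rule that defaults only fill fields still at zero. The Go sketch below illustrates how those could fit together. It is not the code from internal/router: the names looksNonChat, resolveFamily, fillZero, the struct layouts, the lowercasing step, and every table value except functiongemma's 0.40 are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Substrings taken from the commit message; matching is plain "contains".
var nonChatPatterns = []string{
	"whisper", "moonshine", "kokoros", "vibevoice", "-asr", "-tts",
	"-audio", "-embedding", "embeddinggemma", "-reranker", "lfm2",
}

// looksNonChat (hypothetical name) reports whether a model ID matches any
// non-chat pattern. Lowercasing the ID first is an assumption.
func looksNonChat(id string) bool {
	id = strings.ToLower(id)
	for _, p := range nonChatPatterns {
		if strings.Contains(id, p) {
			return true
		}
	}
	return false
}

// familyDefaults is a hypothetical, reduced stand-in for FamilyDefaults.
type familyDefaults struct {
	MaxComplexity float64
	CostWeight    float64
}

// Illustrative table; only the functiongemma ceiling comes from the log.
var table = map[string]familyDefaults{
	"qwen":          {MaxComplexity: 0.5},
	"qwen3.5":       {MaxComplexity: 0.7},
	"tiny3.5":       {MaxComplexity: 0.3},
	"functiongemma": {MaxComplexity: 0.40},
}

// resolveFamily strips an org/ namespace, then returns the longest table
// key that is a prefix of the remaining model ID.
func resolveFamily(id string) (string, bool) {
	id = strings.ToLower(id)
	if i := strings.LastIndex(id, "/"); i >= 0 {
		id = id[i+1:]
	}
	best := ""
	for prefix := range table {
		if strings.HasPrefix(id, prefix) && len(prefix) > len(best) {
			best = prefix
		}
	}
	return best, best != ""
}

// fillZero copies a default only into fields that are still zero, so values
// supplied by the user keep priority.
func fillZero(arm *familyDefaults, d familyDefaults) {
	if arm.MaxComplexity == 0 {
		arm.MaxComplexity = d.MaxComplexity
	}
	if arm.CostWeight == 0 {
		arm.CostWeight = d.CostWeight
	}
}

func main() {
	for _, id := range []string{"embeddinggemma:300m", "reecdev/tiny3.5:1.5b", "gpt-4o"} {
		family, ok := resolveFamily(id)
		fmt.Printf("%s nonChat=%v family=%q ok=%v\n", id, looksNonChat(id), family, ok)
	}
}
```

Picking the longest matching prefix is what lets a catch-all key such as qwen coexist with a more specific qwen3.5 key without depending on map iteration order.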

4 Commits
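
The size-keyed ceilings in commit 9bb775a4aa can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's parseSizeFromModelID or ResolveMaxComplexity: the names parseSizeB, sizeCap, and capFor, the field names MinB and Cap, the choice to report sizes in billions, and all ladder values are hypothetical. Only the behaviour is taken from the log: split on separators, accept pure <N>b/<N>m tokens, walk a largest-first ladder, and fall back to the smallest cap when no size parses.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// A "pure" size token is digits (optionally with a decimal part) followed by
// b or m. Tokens such as qwen3.5, e2b, a3b and v0.3 do not match because
// they begin with a letter.
var sizeToken = regexp.MustCompile(`^(\d+(?:\.\d+)?)([bm])$`)

// parseSizeB (hypothetical name) returns the parameter count in billions.
func parseSizeB(id string) (float64, bool) {
	tokens := strings.FieldsFunc(strings.ToLower(id), func(r rune) bool {
		return r == ':' || r == '/' || r == '-' || r == '_'
	})
	for _, tok := range tokens {
		m := sizeToken.FindStringSubmatch(tok)
		if m == nil {
			continue
		}
		n, err := strconv.ParseFloat(m[1], 64)
		if err != nil {
			continue
		}
		if m[2] == "m" {
			n /= 1000 // millions to billions
		}
		return n, true
	}
	return 0, false
}

// sizeCap is a hypothetical ladder rung: models with at least MinB billion
// parameters get complexity ceiling Cap.
type sizeCap struct {
	MinB float64
	Cap  float64
}

// capFor walks a ladder ordered largest-first and returns the first rung the
// model reaches. If the size cannot be parsed, or is below every rung, it
// returns the smallest cap, which is the conservative choice.
func capFor(id string, ladder []sizeCap) float64 {
	smallest := ladder[len(ladder)-1].Cap
	size, ok := parseSizeB(id)
	if !ok {
		return smallest
	}
	for _, rung := range ladder {
		if size >= rung.MinB {
			return rung.Cap
		}
	}
	return smallest
}

func main() {
	// Illustrative values only.
	ladder := []sizeCap{{MinB: 30, Cap: 0.8}, {MinB: 7, Cap: 0.6}, {MinB: 0, Cap: 0.3}}
	for _, id := range []string{"qwen3-30b-a3b", "qwen3:8b", "qwen3:latest"} {
		fmt.Println(id, capFor(id, ladder))
	}
}
```

Keeping the ladder ordered largest-first means the first rung a model reaches is the right one, which is presumably why the log lists that ordering among the invariants checked for knownFamilyDefaults.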