Pure whitespace cleanup surfaced when 'make check' ran gofmt over the
tree. Mostly struct-field column alignment in internal/safety/banner.go
(SessionInfo) and the var(...) flag block in cmd/gnoma/main.go after
--dangerously-allow-anywhere was added without realignment. Verified
zero substantive changes via 'git diff --ignore-all-space
--ignore-blank-lines'.
Closes R-4 and R-5 of the routing-defaults plan.
R-4: Strengths + CostWeight defaults for closed frontier models.
Cloud entries land in the same knownFamilyDefaults table as local
ones, with MaxComplexity intentionally left zero (cloud arms get
no complexity ceiling). CostWeight tuned per the plan's rationale:
claude-opus-4-7 → Planning/SecurityReview/Debug/Refactor, 0.3
claude-sonnet-4-6 → Generation/Refactor/Review, 0.7
gpt-5.5 → Planning/SecurityReview/Generation, 0.3
gpt-5.3-codex → Generation/Refactor/Debug/UnitTest, 0.6
gpt-5.2 → Orchestration/Review, 0.8
gemini-3.1-pro → Planning/Review/Orchestration, 0.5
gemini-3.5-flash → Boilerplate/Explain/Orchestration, 1.2
The 0.3 weight on frontier arms keeps them competitive on
SecurityReview / Planning despite $4+/Mtok; 1.2 on Gemini Flash
penalizes cost more so it only wins when cost is genuinely
decisive (boilerplate, explain).
Mechanism: extracted applyFamilyDefaults into defaults.go and call
it from Router.RegisterArm. Single source of truth — both local
discovery and the primary-provider path in cmd/gnoma/main.go now
flow through the same defaults application. Removed the duplicate
apply block from RegisterDiscoveredModels.
Legacy model IDs (claude-opus-4-20250514, gpt-4o, o3, gemini-2.5-pro,
etc.) intentionally do not match any table entry — keeps users on
pinned older models safe from imposed 2026 Strengths.
R-5: gpt-5.3-codex registration.
- internal/provider/openai/provider.go: added to fallbackModels
and inferOpenAIModelCapabilities (400K context, 32K output).
- internal/provider/ratelimits.go: gpt-5.3-codex and its dated
alias gpt-5.3-codex-2026-02-15 added with the same Tier 1
quotas as gpt-5.2.
Gemini 3.x (3.1-pro-preview, 3.5-flash, 3.1-flash-lite) was already
registered in both google/provider.go and ratelimits.go — no change
needed for that part of R-5.
Test coverage:
- ResolveFamilyDefaults table-driven across all 7 cloud entries
including prefix-sharing (gpt-5.5-pro → gpt-5.5 defaults,
gemini-3.1-pro-preview → gemini-3.1-pro defaults).
- Legacy IDs return !ok.
- RegisterArm applies cloud defaults end-to-end.
- User-supplied Strengths and CostWeight are not overridden.
- ID.Model() fallback works when ModelName is empty (test code
often constructs arms this way).
Refs: docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md
Expands the family-defaults scaffold to 23 entries covering the local
models that currently appear in real Ollama fleets: coder specialists
(qwen3-coder, devstral, qwen2.5-coder, yi-coder, deepseek-coder,
starcoder), reasoners (phi-4, phi-4-mini), Gemma 2/3/4 (including the
"edge" e2b/e4b variants under both Ollama and GGUF naming), Qwen
2.5/3/3.5 with a catch-all qwen entry, Mistral/Ministral (incl. the
24B mistral-small-3), Llama 3.2/4, tiny3.5 (reec's distill family),
Granite, GLM (incl. glm-ocr specialist), and MiniCPM-V.
Five families that span wide parameter ranges (qwen3.5, qwen3,
qwen2.5, ministral-3, tiny3.5) now use SizeCap ladders instead of a
flat MaxComplexity. A new parseSizeFromModelID helper splits the
model ID on :/-_/ and matches pure <N>b/<N>m tokens, correctly
ignoring qwen3.5 version strings, e2b edge tags, a3b MoE active
params, and v0.3 version suffixes.
ResolveMaxComplexity wraps ResolveFamilyDefaults plus the SizeCap
traversal, falling back to the smallest cap when size parsing fails
(conservative). Discovery's apply path now goes through it so
SizeCap entries actually take effect.
Test coverage:
- parseSizeFromModelID (11 cases)
- ResolveFamilyDefaults longest-prefix discipline (19 cases)
- Unknown-family fallback returns !ok
- ResolveMaxComplexity size-keyed ladder (13 cases)
- Size-parse-failure fallback
- knownFamilyDefaults invariants: SizeCaps ordered largest-first,
SizeCaps and MaxComplexity mutually exclusive per entry
- Routing-payoff integration: 3 arms (tiny3.5:1.5b, phi-4:14b,
qwen3-coder:30b) get picked for TaskGeneration / TaskPlanning /
TaskBoilerplate respectively, without any [[arms]] config
- Local fleet visibility: the maintainer's actual `ollama ls`
inventory registers correctly with expected MaxComplexity and
Strengths; embeddinggemma stays filtered out
The Planning sub-case surfaced a separate issue worth flagging:
heuristicQuality floors out at 0.55 for a generic 14B local model
without ThinkingModes, below TaskPlanning's 0.60 threshold. The test
mutates phi-4's capabilities post-registration to reflect reality
(phi-4 is reasoning-tuned). A discovery-side thinking-capability
detection is out of scope for this plan but flagged in the test
comment for follow-up.
Refs: docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md
Discovery previously registered every model returned by Ollama as a
chat arm, including embeddings, ASR, TTS, audio realtime, and
rerankers — which then failed at inference time when the router
selected them. Local arms also shipped with all-zero defaults, so
selection between e.g. tiny3.5:1.5b, phi-4:14b, and qwen3-coder:30b
was effectively random.
This change covers tasks R-1, R-2, R-6 from the routing-defaults plan.
- nonChatModelPatterns + isNonChatModel substring matcher; matched
IDs are skipped during RegisterDiscoveredModels. Covers whisper,
moonshine, kokoros, vibevoice, -asr, -tts, -audio, -embedding,
embeddinggemma, -reranker, lfm2.
- knownVisionModelPrefixes gains gemma4, gemma-4, glm-ocr. gemma3
and minicpm-v entries stay for regression coverage.
- New internal/router/defaults.go with FamilyDefaults struct,
knownFamilyDefaults map, and ResolveFamilyDefaults longest-prefix
lookup (with org/-namespace stripping so reecdev/tiny3.5:1.5b
resolves to "tiny3.5"). Single entry for now: functiongemma is
registered with Disabled=true and MaxComplexity=0.40, reserved for
the future ArmRoleToolRouter path. Table will grow in R-3.
- RegisterDiscoveredModels consults ResolveFamilyDefaults and only
populates fields that are still zero on the arm, so user [[arms]]
overrides keep priority.
Plans:
- docs/superpowers/plans/2026-05-23-routing-defaults-refresh.md
- docs/superpowers/plans/2026-05-23-tool-router-specialization.md
TODO.md surfaces both as in-flight items.