3 Commits

Author SHA1 Message Date
vikingowl eea26a262e feat(router): surface bandit knobs as [router.bandit] config
Four hardcoded constants in the selector and feedback tracker are now
user-tunable via [router.bandit]:

- quality_alpha    (EMA smoothing, default 0.3)
- min_observations (samples before observed overrides heuristic, default 3)
- observed_weight  (observed/heuristic blend ratio, default 0.7)
- strength_bonus   (quality bonus for Strengths-tagged arms, default 0.15)

Each field treats 0 as 'use default', so an empty TOML block is
byte-identical to pre-config behaviour. BanditParams is plumbed via
router.Config{Bandit: ...} and resolveBanditParams() centralises the
fallback so every call site shares the same defaults.

QualityTracker, scoreArm, bestScored, and selectBest signatures now
take the configured values directly rather than reaching for package-
level constants. Tests updated to pass BanditParams{} (defaults) or
explicit overrides where they validate the new tuning paths.

Tracks item #3 from the 'Bandit selector — design decisions deferred'
TODO entry — ships independently of the EMA vs SLM strategic decision.
2026-05-24 22:42:34 +02:00
vikingowl 58beb7ce3c feat(router): classifier-source telemetry + router stats command
Phase 4 routing decisions depend on knowing whether the SLM classifier
is actually firing or whether the heuristic is silently doing all the
work. Adds the instrumentation to make that observable.

router.ClassifierSource enum (heuristic / slm / slm_fallback) is set
on Task by every classifier:
- HeuristicClassifier → ClassifierHeuristic
- slm.Classifier → ClassifierSLM on success, ClassifierSLMFallback when
  the SLM call fails or returns unparseable output

The source is plumbed through router.Outcome to QualityTracker, which
now maintains per-source counters alongside the existing per-arm × task
EMA scores. QualitySnapshot serializes both (classifier_counts is
omitempty for back-compat with pre-feature quality.json files).

lazyClassifier logs at INFO the first time it falls back to heuristic
because the SLM hasn't booted yet — distinguishes operational fallback
from an unconfigured-SLM run.

slm.Manager.Start() now records elapsed-to-healthy and the main.go
goroutine logs it as part of the "SLM ready" event. Confirms whether
short-lived runs are racing the boot cycle.

New `gnoma router stats` subcommand prints both tables (arm × task
quality, classifier source breakdown) from quality.json with a Phase 4
trust hint when the data is too sparse or the SLM share is low.

6 new tests cover ClassifierSource string/enum, heuristic + SLM source
propagation, QualityTracker counter round-trip, and back-compat
restore from a legacy quality.json without classifier_counts.
2026-05-19 18:18:22 +02:00
vikingowl 64ee385039 feat: QualityTracker — EMA router feedback from elf outcomes, ResultFilePaths tracking 2026-04-05 22:08:08 +02:00