gnoma

Owlibou/gnoma

Commit Graph

Author	SHA1	Message	Date

vikingowl

aca830e7db

feat(engine): consumption-time stream-error failover

When a stream errors out before producing any user-visible content
(text, thinking, or tool calls), the engine now transparently retries
on the next-best arm instead of bubbling the error to the TUI. Covers
the case from the post-SLM screenshot: subprocess CLI agents that
exit non-zero on auth/config failures, network drops mid-stream,
rate-limited arms whose error surfaces after Stream() already returned.

Mechanism: the stream-create + consume blocks are wrapped in a labeled
streamLoop. On s.Err() != nil with empty accumulator, the engine emits
a new EventFailover ("↻ <failed_arm> failed (<reason>) — retrying on
another arm"), excludes the failed arm via task.ExcludedArms, and
re-enters the loop. Cap of 4 failovers per round.
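
A minimal sketch of the labeled retry loop described above, under stated assumptions: the `arm`, `failoverEvent`, `failing`, `succeeding`, and `runRound` names are hypothetical stand-ins, not the engine's actual types, and arm selection is reduced to "first non-excluded arm in order". Only the label name, the exclusion step, and the cap of 4 come from the message.

```go
package main

import (
	"errors"
	"fmt"
)

// maxFailovers mirrors the cap of 4 failovers per round.
const maxFailovers = 4

// arm is a hypothetical stand-in for a routed provider. run simulates creating
// and consuming a stream; it returns any streamed text plus the stream error.
type arm struct {
	name string
	run  func() (string, error)
}

// failoverEvent is a hypothetical record of one failover notice.
type failoverEvent struct {
	Arm    string
	Reason string
}

func failing(msg string) func() (string, error) {
	return func() (string, error) { return "", errors.New(msg) }
}

func succeeding(text string) func() (string, error) {
	return func() (string, error) { return text, nil }
}

// runRound retries on the next non-excluded arm while nothing user-visible
// has been produced and the failover budget is not exhausted.
func runRound(arms []arm) (string, []failoverEvent, error) {
	excluded := map[string]bool{}
	var events []failoverEvent
	failoverAttempt := 0

streamLoop:
	for {
		for _, a := range arms {
			if excluded[a.name] {
				continue
			}
			text, err := a.run()
			if err == nil {
				return text, events, nil
			}
			// Fail loud if content already streamed or the budget is spent.
			if text != "" || failoverAttempt >= maxFailovers {
				return text, events, err
			}
			failoverAttempt++
			excluded[a.name] = true
			events = append(events, failoverEvent{Arm: a.name, Reason: err.Error()})
			continue streamLoop // re-enter with the failed arm excluded
		}
		return "", events, errors.New("no arms left")
	}
}

func main() {
	text, events, err := runRound([]arm{
		{name: "cli-agent", run: failing("Invalid API key. Try again.")},
		{name: "api-backup", run: succeeding("hello")},
	})
	fmt.Println(text, len(events), err)
}
```

In this sketch the first arm fails with no content, one failover event is recorded, and the second arm's output is returned.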

Guards:
- !acc.HasContent() — if text/tool calls already streamed, fail loud
  rather than duplicate visible output on retry.
- isFailoverable(err) — deny-list approach: context.Canceled,
  context.DeadlineExceeded, and HTTP 400/413 are fatal; everything else
  (auth, rate limit, 5xx, subprocess exit, network) is failoverable.
- Router.ForcedArm() == "" — when the user pinned an arm via --provider,
  failover is disabled by design.
- failoverAttempt < maxFailovers — bounded retry budget.
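
The four guards can be sketched as a single predicate. This is an illustration, not the repository's code: `httpError` is a hypothetical error type carrying a status code, and `shouldFailover` with its parameters is an assumed way of combining the checks. Only the deny-list contents and the guard conditions are taken from the list above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// httpError is a hypothetical error type that carries an HTTP status code.
type httpError struct{ Status int }

func (e *httpError) Error() string { return fmt.Sprintf("http status %d", e.Status) }

// isFailoverable uses a deny-list: only errors known to be fatal return false.
func isFailoverable(err error) bool {
	if err == nil {
		return false
	}
	if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
		return false
	}
	var he *httpError
	if errors.As(err, &he) && (he.Status == 400 || he.Status == 413) {
		return false
	}
	return true // auth, rate limit, 5xx, subprocess exit, network, ...
}

// shouldFailover combines the four guards: no content streamed yet, a
// failoverable error, no user-pinned arm, and remaining retry budget.
func shouldFailover(err error, hasContent bool, forcedArm string, attempt, limit int) bool {
	return !hasContent && isFailoverable(err) && forcedArm == "" && attempt < limit
}

func main() {
	fmt.Println(isFailoverable(&httpError{Status: 429}))                     // true
	fmt.Println(isFailoverable(&httpError{Status: 413}))                     // false
	fmt.Println(isFailoverable(fmt.Errorf("wrapped: %w", context.Canceled))) // false
	fmt.Println(shouldFailover(&httpError{Status: 500}, false, "pinned", 0, 4)) // false
}
```

A deny-list means unknown error kinds default to a retry, which matches the message's intent of covering subprocess exits and network drops without enumerating them.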

TUI renders EventFailover under the existing "cost" role styling.
shortFailReason strips the subprocess wrapper envelope so the user sees
"Invalid API key. Try again." instead of
"subprocess: exit status 1: Error: Invalid API key. Try again.".

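One way the envelope stripping could work, as a sketch only: the regular expression and the string-based signature below are assumptions, since the message gives just the before and after strings.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wrapperPrefix matches the assumed subprocess envelope: an exit-status
// prefix, optionally followed by a leading "Error: " from the CLI itself.
var wrapperPrefix = regexp.MustCompile(`^subprocess: exit status \d+: (?:Error: )?`)

// shortFailReason removes the envelope so only the underlying reason remains.
// Messages without the envelope pass through unchanged (apart from trimming).
func shortFailReason(msg string) string {
	return strings.TrimSpace(wrapperPrefix.ReplaceAllString(msg, ""))
}

func main() {
	fmt.Println(shortFailReason("subprocess: exit status 1: Error: Invalid API key. Try again."))
	// prints "Invalid API key. Try again."
}
```
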
Tests cover the classifier (isFailoverable, shortFailReason), end-to-end
auth-error failover, content-already-streamed guard, and context-cancel
guard. Deterministic across 10x -race runs by giving the failing arm
IsCLIAgent=true to anchor it in tier 0 ahead of the API-tier backup.

2026-05-20 02:20:00 +02:00

vikingowl

0b4de6054d

feat(tui): surface SLM backend + per-turn classifier in status bar

The TUI gave no indication that an SLM was configured or active.
You'd see the primary provider on the status line and nothing else,
even with [slm].enabled=true and a successfully booted backend.

Two surfaces added:

1. Status-bar SLM badge. The left side of the status line gains a
   dim " · slm: <model> ⚙" suffix when the backend booted, " · slm: ✗"
   when it failed, and nothing when SLM is disabled. The ⚙ marker
   indicates the model advertises tool support.

2. Per-turn classifier visibility. The existing routing event already
   produced "routed → <arm> (task: <type>)" lines in the chat history;
   it now also reports which classifier made the decision, e.g.
   "routed → ollama/ministral-3:3b (task: explain, by: slm_fallback)".
   Lets you tell in real time whether the SLM is actually classifying
   or falling back to the keyword heuristic.
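
The two surfaces amount to small string-formatting helpers. The sketch below assumes a hypothetical `slmInfo` struct with illustrative fields (the real `tui.SLMInfo` layout is not shown in the message) and hypothetical `slmBadge` and `routingLine` functions. The output strings follow the examples quoted above.

```go
package main

import "fmt"

// slmInfo is a hypothetical shape for the SLM status; field names are assumed.
type slmInfo struct {
	Enabled bool   // [slm].enabled in the config
	Booted  bool   // backend started successfully
	Model   string // model name shown in the badge
	Tools   bool   // model advertises tool support
}

// slmBadge returns the status-bar suffix: empty when SLM is disabled, a
// failure mark when the backend did not boot, else the model name.
func slmBadge(info slmInfo) string {
	switch {
	case !info.Enabled:
		return ""
	case !info.Booted:
		return " · slm: ✗"
	case info.Tools:
		return " · slm: " + info.Model + " ⚙"
	default:
		return " · slm: " + info.Model
	}
}

// routingLine formats the chat-history routing notice, appending the
// classifier source when one is known.
func routingLine(arm, taskType, classifier string) string {
	if classifier == "" {
		return fmt.Sprintf("routed → %s (task: %s)", arm, taskType)
	}
	return fmt.Sprintf("routed → %s (task: %s, by: %s)", arm, taskType, classifier)
}

func main() {
	fmt.Printf("%q\n", slmBadge(slmInfo{Enabled: true, Booted: true, Model: "ministral-3:3b", Tools: true}))
	fmt.Println(routingLine("ollama/ministral-3:3b", "explain", "slm_fallback"))
}
```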

Plumbing:
  - new tui.SLMInfo struct on tui.Config
  - main.go populates it after StartBackend returns
  - stream.Event gains RoutingClassifier; engine.runLoop fills it from
    task.ClassifierSource on the first round

2026-05-19 19:06:26 +02:00

vikingowl

ce5f9d3dc9

feat(tui): Tier 3-4 UX improvements — split, routing, session naming, context bar

- Split app.go (2091→1378 lines) into rendering.go, events.go, init.go
- Add EventRouting stream event for router arm transparency
- Add session auto-naming from first user message
- Add context window progress bar in status bar
- Add /keys cheatsheet, /replay for resumed sessions
- Add inline cost-per-turn after assistant responses
- Add diff previews in fs.write/fs.edit permission prompts
- Collapse tool output to 3 lines by default (ctrl+o expands)
- Use AddPrefix for system context instead of InjectMessage
- Handle ContentThinking and ContentToolResult in session resume
- Show session title in resume picker
- Add /model numeric selection snapshot safety

2026-04-12 05:13:16 +02:00

3 Commits