90 Commits

Author SHA1 Message Date
4a37e84114 feat(skill): enhanced coordinator prompt with fan-out and concurrency guidance 2026-04-07 02:24:49 +02:00
338d4a12b8 feat(skill): pipe mode support and main.go wiring 2026-04-07 02:19:42 +02:00
71b0cf9490 feat(skill): TUI integration — /skillname invokes skills, /skills lists them 2026-04-07 02:18:12 +02:00
995b26ffe7 feat(skill): registry with multi-directory loading and precedence 2026-04-07 02:17:17 +02:00
42fc2adcd8 feat(skill): bundled /batch skill with go:embed 2026-04-07 02:16:35 +02:00
327e4d74c0 feat(skill): template rendering with Go text/template 2026-04-07 02:15:51 +02:00
106694e36a feat(skill): core Skill type and YAML frontmatter parser 2026-04-07 02:05:49 +02:00
03d3a5d016 feat: engine hook integration — PreToolUse, PostToolUse, Stop 2026-04-07 01:02:55 +02:00
cf792c7059 feat: AgentExecutor — elf-based hook evaluation via elf.Manager 2026-04-07 00:55:19 +02:00
1aa1d83e9e feat: PromptExecutor — LLM-based hook evaluation via router 2026-04-07 00:53:53 +02:00
7d0b9c222f feat: ParseHookDefs — config to HookDef conversion with validation 2026-04-07 00:52:00 +02:00
d27dea5ee0 feat: hook config schema with user+project merge ordering 2026-04-07 00:50:53 +02:00
5701bc2033 feat: Dispatcher — handler chain dispatch, filtering, transform chaining 2026-04-07 00:48:08 +02:00
bf50aa234f feat: CommandExecutor — shell hook execution with stdin/stdout protocol 2026-04-07 00:38:48 +02:00
345fa5bbd4 feat: hook payload marshal/unmarshal helpers 2026-04-07 00:37:40 +02:00
ff9fd8c294 feat: hook core types — EventType, Action, CommandType, HookDef, Executor 2026-04-07 00:36:18 +02:00
9b1d6ca100 test: M7 audit — quality feedback, coordinator, agent tool coverage
Quality feedback integration: TestQualityTracker_InfluencesArmSelection
verifies that 5 successes vs 5 failures tips Router.Select() to the
high-quality arm once EMA has enough observations. Companion test
confirms heuristic fallback below minObservations.

Coordinator tests expanded from 2 → 5: added guidance content check
(parallel/serial/synthesize present), false-positive table extended with
7 cases including the reordered keywords from the previous fix.

Agent tool suite: tool interface contracts for all four tools (Name,
Description, Parameters validity, IsReadOnly). Extracted duplicated
2000-char truncation into truncateOutput() helper (format.go), removing
the inline copies in agent.go and batch.go. Four boundary tests cover
empty, short, exact-max, and over-max cases.
2026-04-06 00:59:12 +02:00
15345540f2 fix: ClassifyTask priority ordering — orchestration below operational types
Operational task types (debug, review, refactor, test, explain) now gate
before orchestration in the keyword cascade. Previously, prompts like
"review the orchestration layer" or "refactor the pipeline dispatch"
matched "orchestrat"/"dispatch" and misclassified as TaskOrchestration.
Planning is also moved below the operational types.

Expanded orchestration keywords to cover common intent that the original
four keywords missed: "fan out", "subtask", "delegate to", "spawn elf".

Adds regression tests for false-positive cases and positive tests for new
keywords.
2026-04-06 00:58:54 +02:00
8f10b92ae1 feat: interactive session picker for /resume and --resume 2026-04-06 00:22:52 +02:00
2d41c2d46c fix: session security and correctness — path traversal, turn count restore, incognito quality leak
- store: validate session ID against store root to block path traversal in Load/Save
- local: seed turnCount from LocalConfig.TurnCount so resumed sessions keep correct turn count
- main: pass TurnCount from snapshot to LocalConfig on resume
- main: suppress quality.json save when --incognito is active
- main: handle UserConfigDir error in quality save defer instead of silently using wrong path
- test: add TestSessionStore_Load/Save_RejectsPathTraversal
2026-04-06 00:04:09 +02:00
4597d4cb08 feat: /resume TUI command + SessionStore in tui.Config
- Add SessionStore field to tui.Config
- Add /resume slash command: lists sessions or restores by ID
- Pass SessionStore to tui.New in main.go
- Update /help text to include /resume
- Add .gnoma/sessions/ to .gitignore
2026-04-05 23:51:48 +02:00
16193de4fc feat: LocalConfig + auto-save hook in session.Local
Refactor NewLocal to accept LocalConfig (matching engine/router patterns),
add persistence fields (SessionID, Store, Incognito, Logger), capture
finalState before releasing the lock to avoid data races, and auto-save
a Snapshot after each successful turn when a store is configured.
Add SessionID() to the Session interface and three new tests covering
auto-save, no-store no-panic, and SessionID accessors.
2026-04-05 23:46:48 +02:00
ca163dc8d4 feat: QualityTracker.Snapshot/Restore + Router.QualityTracker() for cross-session persistence 2026-04-05 23:40:19 +02:00
20fb045cba feat: Engine.SetHistory/SetUsage/SetActivatedTools for session restore 2026-04-05 23:39:38 +02:00
bbd7791428 feat: add Session config section (max_keep for session retention) 2026-04-05 23:37:10 +02:00
8b489d429a test: snapshot JSON round-trip with multi-turn conversation 2026-04-05 23:35:25 +02:00
3cf5cdeeb6 feat: SessionStore — save/load/list/prune session snapshots to .gnoma/sessions/ 2026-04-05 23:34:29 +02:00
ef2d058cc0 feat: JSON serialization for Message and Content (session persistence blocker)
Add custom MarshalJSON/UnmarshalJSON on Content using string type discriminant
("text", "tool_call", "tool_result", "thinking"). Add json tags to Message.
2026-04-05 23:31:25 +02:00
3f258a1bc7 feat: coordinator mode — system prompt injection for orchestration tasks 2026-04-05 23:07:56 +02:00
8d3d498245 feat: coordinator mode — system prompt injection for orchestration tasks 2026-04-05 23:06:23 +02:00
dd9f4e390a feat: accurate context window sizing from arm capabilities + prefix token baseline + tokenizer wiring 2026-04-05 22:26:31 +02:00
27ca12f863 feat: tokenizer-aware Tracker.CountTokens/CountMessages replaces EstimateMessages in compaction 2026-04-05 22:21:12 +02:00
62112cff55 feat: list_results + read_result tools for coordinator artifact discovery 2026-04-05 22:19:05 +02:00
7c991a9c68 feat: list_results + read_result tools for coordinator artifact discovery 2026-04-05 22:15:04 +02:00
6cf5e92957 feat: QualityTracker — EMA router feedback from elf outcomes, ResultFilePaths tracking 2026-04-05 22:08:08 +02:00
d251dd7507 feat: wire persist.Store into engine, elf manager, and agent tools 2026-04-05 21:59:55 +02:00
eab5f26407 test: tokenizer cached encoding path coverage 2026-04-05 21:53:05 +02:00
f7782215dc feat: tiktoken tokenizer — accurate BPE token counting with provider-aware encoding 2026-04-05 21:46:43 +02:00
fbb28de0b8 fix: persist.Store — sanitize callID, log save errors, document List filter semantics 2026-04-05 21:44:03 +02:00
ace3716204 feat: persist.Store — session-scoped /tmp tool result persistence 2026-04-05 21:38:45 +02:00
cb2d63d06f feat: Ollama/gemma4 compat — /init flow, stream filter, safety fixes
provider/openai:
- Fix doubled tool call args (argsComplete flag): Ollama sends complete
  args in the first streaming chunk then repeats them as delta, causing
  doubled JSON and 400 errors in elfs
- Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep)
- Add Reasoning field support for Ollama thinking output

cmd/gnoma:
- Early TTY detection so logger is created with correct destination
  before any component gets a reference to it (fixes slog WARN bleed
  into TUI textarea)

permission:
- Exempt spawn_elfs and agent tools from safety scanner: elf prompt
  text may legitimately mention .env/.ssh/credentials patterns and
  should not be blocked

tui/app:
- /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge
  (ask for plain text output) → TUI fallback write from streamBuf
- looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback
  content before writing (reject refusals, strip narrative preambles)
- Collapse thinking output to 3 lines; ctrl+o to expand (live stream
  and committed messages)
- Stream-level filter for model pseudo-tool-call blocks: suppresses
  <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|>
  from entering streamBuf across chunk boundaries
- sanitizeAssistantText regex covers both block formats
- Reset streamFilterClose at every turn start
2026-04-05 19:24:51 +02:00
14b88cadcc feat: M1-M7 gap audit phase 3 — context prefix, deferred tools, compact hooks
Gap 11 (M6): Fixed context prefix
- Window.PrefixMessages stores immutable docs (CLAUDE.md, .gnoma/GNOMA.md)
- Prefix stripped before compaction, prepended after — survives all compaction
- AllMessages() returns prefix + history for provider requests
- main.go loads CLAUDE.md and .gnoma/GNOMA.md at startup as prefix

Gap 12 (M6): Deferred tool loading
- DeferrableTool optional interface: ShouldDefer() bool
- buildRequest() skips deferred tools until activated
- Tools auto-activate on first model request (activatedTools map)
- agent + spawn_elfs marked as deferrable (large schemas, rarely needed early)
- Saves ~800 tokens per deferred tool per request

Gap 13 (M6): Pre/post compact hooks
- OnPreCompact/OnPostCompact callbacks in WindowConfig
- Called in doCompact() (shared by CompactIfNeeded + ForceCompact)
- M8 hooks system will extend these to full protocol
2026-04-04 20:46:50 +02:00
509c897847 feat: M1-M7 gap audit phase 2 — security, TUI, context, router feedback
Gap 6 (M3): 7 new bash security checks (8-14)
- JQ injection, obfuscated flags (Unicode lookalike hyphens),
  /proc/environ access, brace expansion, Unicode whitespace,
  zsh dangerous constructs, comment-quote desync
- Total: 14 checks (was 7)

Gap 7 (M5): Model picker numbered selection
- /model shows numbered sorted list, /model 3 picks by number

Gap 8 (M5): /config set command
- /config set provider.default mistral writes to .gnoma/config.toml
- Whitelisted keys: provider.default, provider.model, permission.mode
- New config/write.go with TOML round-trip via BurntSushi/toml

Gap 9 (M6): Simple token estimator
- EstimateTokens (len/4 heuristic), EstimateMessages (content + overhead)
- PreEstimate on Tracker for proactive compaction triggering

Gap 10 (M7): Router quality feedback from elfs
- Router.Outcome + ReportOutcome (logs for now, M9 bandit uses later)
- Manager tracks armID/taskType per elf via elfMeta map
- Manager.ReportResult called after elf completion in both agent + batch tools
2026-04-04 11:07:08 +02:00
31cba286ac fix: M1-M7 gap audit phase 1 — bug fix + 5 quick wins
Bug fix:
- window.go: token ratio after compaction used len(w.messages) after
  reassignment, always producing ratio ~1.0. Fixed by saving original
  length before assignment.

Gap 1 (M3): Scanner patterns 13 → 47
- Added 34 new patterns: Azure, DigitalOcean, HuggingFace, Grafana,
  GitHub extended (app/oauth/refresh), Shopify, Twilio, SendGrid,
  NPM, PyPI, Databricks, Pulumi, Postman, Sentry, Anthropic admin,
  OpenAI extended, Vault, Supabase, Telegram, Discord, JWT, Heroku,
  Mailgun, Figma

Gap 2 (M3): Config security section
- SecuritySection with EntropyThreshold + custom PatternConfig
- Wire custom patterns from TOML into scanner at startup

Gap 3 (M4): Polling discovery loop
- StartDiscoveryLoop with 30s ticker, reconciles arms vs discovered
- Router.RemoveArm for disappeared local models

Gap 4 (M5): Incognito LocalOnly enforcement
- Router.SetLocalOnly filters non-local arms in Select()
- TUI incognito toggle (Ctrl+X, /incognito) sets local-only routing

Gap 5 (M6): Reactive 413 compaction
- Window.ForceCompact() bypasses ShouldCompact threshold
- Engine handles 413 with emergency compact + retry
2026-04-03 23:11:08 +02:00
38fc49a6c4 fix: retry with exponential backoff on 429, stagger elf spawns
Engine retries transient errors (429, 5xx) up to 4 times with
1s/2s/4s/8s backoff. Respects Retry-After header from provider.

Batch tool staggers elf spawns by 300ms to avoid rate limit bursts
when all elfs hit the API simultaneously (Mistral's 1 req/s limit).
2026-04-03 21:08:20 +02:00
ace9b5f273 feat: spawn_elfs batch tool for guaranteed parallel elf execution
New spawn_elfs tool takes array of tasks, spawns all elfs simultaneously.
Solves the problem of models (Mistral Small, Devstral) that serialize
tool calls instead of batching them.

Schema: {"tasks": [{"prompt": "...", "task_type": "..."}], "max_turns": 30}

Also:
- Suppress spawn_elfs tool output from chat (tree handles display)
- Update M7 milestones to reflect completed deliverables
- Add CC-inspired features to M8/M10: task notification system,
  task framework, /batch skill, coordinator mode, StreamingToolExecutor,
  git worktree isolation
2026-04-03 21:03:51 +02:00
706363f94b feat: rate limit pools, elf tree view, permission prompts, dep updates
Rate limits:
- Add PoolRPS/PoolTPM/PoolTokensMonth/PoolCostMonth pool kinds
- Provider defaults for Mistral/Anthropic/OpenAI/Google (tier-aware)
- Config override via [rate_limits.<provider>] TOML section
- Pools auto-attached to arms on registration

Elf tree view (CC-style):
- Structured elf.Progress type replaces flat string channel
- Tree with ├─/└─ branches, per-elf stats (tool uses, tokens)
- Live activity updates: tool calls, "generating… (N chars)"
- Completed elfs stay in tree with "Done (duration)" until turn ends
- Suppress raw elf output from chat (tree + LLM summary instead)
- Remove background elf mode (wait: false) — always wait
- Truncate elf results to 2000 chars for parent context
- Parallel hint in system prompt and tool description

Permission prompts:
- Show actual command in prompt: "bash wants to execute: find . -name '*.go'"
- Compact hint in separator bar: "⚠ bash: find . | wc -l [y/n]"
- PermReqMsg carries tool name + args

Other:
- Fix /model not updating status bar (session.Local.SetModel)
- Add make targets: run, check, install
- Update deps: BurntSushi/toml v1.6.0, chroma v2.23.1, x/text v0.35.0, cloud.google.com/go v0.123.0
2026-04-03 20:54:48 +02:00
1f416bac8f fix: live elf progress shows tool calls + results, not just text 2026-04-03 19:42:48 +02:00
97d5093526 feat: configurable max_turns for elfs — LLM sets via agent tool param 2026-04-03 19:37:17 +02:00
2ccc261c39 fix: elf progress — proper last-2-lines tracking, 70 char truncation 2026-04-03 19:30:18 +02:00