Commit Graph

17 Commits

Author SHA1 Message Date
4a37e84114 feat(skill): enhanced coordinator prompt with fan-out and concurrency guidance 2026-04-07 02:24:49 +02:00
03d3a5d016 feat: engine hook integration — PreToolUse, PostToolUse, Stop 2026-04-07 01:02:55 +02:00
8d3d498245 feat: coordinator mode — system prompt injection for orchestration tasks 2026-04-05 23:06:23 +02:00
dd9f4e390a feat: accurate context window sizing from arm capabilities + prefix token baseline + tokenizer wiring 2026-04-05 22:26:31 +02:00
d251dd7507 feat: wire persist.Store into engine, elf manager, and agent tools 2026-04-05 21:59:55 +02:00
cb2d63d06f feat: Ollama/gemma4 compat — /init flow, stream filter, safety fixes
provider/openai:
- Fix doubled tool call args (argsComplete flag): Ollama sends complete
  args in the first streaming chunk then repeats them as delta, causing
  doubled JSON and 400 errors in elfs
- Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep)
- Add Reasoning field support for Ollama thinking output

cmd/gnoma:
- Early TTY detection so logger is created with correct destination
  before any component gets a reference to it (fixes slog WARN bleed
  into TUI textarea)

permission:
- Exempt spawn_elfs and agent tools from safety scanner: elf prompt
  text may legitimately mention .env/.ssh/credentials patterns and
  should not be blocked

tui/app:
- /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge
  (ask for plain text output) → TUI fallback write from streamBuf
- looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback
  content before writing (reject refusals, strip narrative preambles)
- Collapse thinking output to 3 lines; ctrl+o to expand (live stream
  and committed messages)
- Stream-level filter for model pseudo-tool-call blocks: suppresses
  <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|>
  from entering streamBuf across chunk boundaries
- sanitizeAssistantText regex covers both block formats
- Reset streamFilterClose at every turn start
2026-04-05 19:24:51 +02:00
14b88cadcc feat: M1-M7 gap audit phase 3 — context prefix, deferred tools, compact hooks
Gap 11 (M6): Fixed context prefix
- Window.PrefixMessages stores immutable docs (CLAUDE.md, .gnoma/GNOMA.md)
- Prefix stripped before compaction, prepended after — survives all compaction
- AllMessages() returns prefix + history for provider requests
- main.go loads CLAUDE.md and .gnoma/GNOMA.md at startup as prefix

Gap 12 (M6): Deferred tool loading
- DeferrableTool optional interface: ShouldDefer() bool
- buildRequest() skips deferred tools until activated
- Tools auto-activate on first model request (activatedTools map)
- agent + spawn_elfs marked as deferrable (large schemas, rarely needed early)
- Saves ~800 tokens per deferred tool per request

Gap 13 (M6): Pre/post compact hooks
- OnPreCompact/OnPostCompact callbacks in WindowConfig
- Called in doCompact() (shared by CompactIfNeeded + ForceCompact)
- M8 hooks system will extend these to full protocol
2026-04-04 20:46:50 +02:00
31cba286ac fix: M1-M7 gap audit phase 1 — bug fix + 5 quick wins
Bug fix:
- window.go: token ratio after compaction used len(w.messages) after
  reassignment, always producing ratio ~1.0. Fixed by saving original
  length before assignment.

Gap 1 (M3): Scanner patterns 13 → 47
- Added 34 new patterns: Azure, DigitalOcean, HuggingFace, Grafana,
  GitHub extended (app/oauth/refresh), Shopify, Twilio, SendGrid,
  NPM, PyPI, Databricks, Pulumi, Postman, Sentry, Anthropic admin,
  OpenAI extended, Vault, Supabase, Telegram, Discord, JWT, Heroku,
  Mailgun, Figma

Gap 2 (M3): Config security section
- SecuritySection with EntropyThreshold + custom PatternConfig
- Wire custom patterns from TOML into scanner at startup

Gap 3 (M4): Polling discovery loop
- StartDiscoveryLoop with 30s ticker, reconciles arms vs discovered
- Router.RemoveArm for disappeared local models

Gap 4 (M5): Incognito LocalOnly enforcement
- Router.SetLocalOnly filters non-local arms in Select()
- TUI incognito toggle (Ctrl+X, /incognito) sets local-only routing

Gap 5 (M6): Reactive 413 compaction
- Window.ForceCompact() bypasses ShouldCompact threshold
- Engine handles 413 with emergency compact + retry
2026-04-03 23:11:08 +02:00
38fc49a6c4 fix: retry with exponential backoff on 429, stagger elf spawns
Engine retries transient errors (429, 5xx) up to 4 times with
1s/2s/4s/8s backoff. Respects Retry-After header from provider.

Batch tool staggers elf spawns by 300ms to avoid rate limit bursts
when all elfs hit the API simultaneously (Mistral's 1 req/s limit).
2026-04-03 21:08:20 +02:00
07c739795c feat: M7 Elfs — sub-agents with router-integrated spawning
internal/elf/:
- BackgroundElf: runs on own goroutine with independent engine,
  history, and provider. No shared mutable state.
- Manager: spawns elfs via router.Select() (picks best arm per
  task type), tracks lifecycle, WaitAll(), CancelAll(), Cleanup().

internal/tool/agent/:
- Agent tool: LLM can call 'agent' to spawn sub-agents.
  Supports task_type hint for routing, wait/background mode.
  5-minute timeout, context cancellation propagated.

Concurrent tool execution:
- Read-only tools (fs.read, fs.grep, fs.glob, etc.) execute in
  parallel via goroutines.
- Write tools (bash, fs.write, fs.edit) execute sequentially.
- Partition by tool.IsReadOnly().

TUI: /elf command explains how to use sub-agents.
5 elf tests. Exit criteria: parent spawns 3 background elfs on
different providers, collects and synthesizes results.
2026-04-03 19:16:46 +02:00
f3d3390b86 feat: M6 complete — summarize strategy + tool result persistence
SummarizeStrategy: calls LLM to condense older messages into a
summary, preserving key decisions, file changes, tool outputs.
Falls back to truncation on failure. Keeps 6 recent messages.

Tool result persistence: outputs >50K chars saved to disk at
.gnoma/sessions/tool-results/{id}.txt with 2K preview inline.

TUI: /compact command for manual compaction, /clear now resets
engine history. Summarize strategy used by default (with
truncation fallback).
2026-04-03 18:51:28 +02:00
b0b393517e feat: M6 context intelligence — token tracker + truncation compaction
internal/context/:
- Tracker: monitors token usage with OK/Warning/Critical states
  (thresholds from CC: 20K warning buffer, 13K autocompact buffer)
- TruncateStrategy: drops oldest messages, preserves system prompt +
  recent N turns, adds compaction boundary marker
- Window: manages message history with auto-compaction trigger,
  circuit breaker after 3 consecutive failures

Engine integration:
- Context window tracks usage per turn
- Auto-compacts when critical threshold reached
- History syncs with context window after compaction

TUI status bar:
- Token count with percentage (tokens: 1234 (5%))
- Color-coded: green=ok, yellow=warning, red=critical

Session Status extended: TokensMax, TokenPercent, TokenState.
7 context tests.
2026-04-03 18:46:03 +02:00
97b065596d feat: wire permission checker into engine tool execution
Tools now go through permission.Checker before executing:
- plan mode: denies all writes (fs.write, bash), allows reads
- bypass mode: allows all (deny rules still enforced)
- default mode: prompts user (pipe: stdin prompt, TUI: auto-approve for now)
- accept_edits: auto-allows file ops, prompts for bash
- deny mode: denies all without allow rules

CLI flags: --permission <mode>, --incognito
Pipe mode: console Y/N prompt on stderr
TUI mode: auto-approve (proper overlay TODO)

Verified: plan mode correctly blocks fs.write, model sees error.
2026-04-03 16:15:41 +02:00
6c70a2ceaf fix: TUI overflow, scrollable header, tool output, git branch
- Fixed: chat content no longer overflows past allocated height.
  Lines are measured for physical width and hard-truncated to
  exactly the chat area height. Input + status bar always visible.
- Header scrolls with chat (not pinned), only input/status fixed
- Git branch in status bar (green, via git rev-parse)
- Alt screen mode — terminal scrollback disabled
- Mouse wheel + PgUp/PgDown scroll within TUI
- New EventToolResult: tool output as dimmed indented block
- Separator lines above/below input, no status bar backgrounds
2026-04-03 15:53:42 +02:00
b9faa30ea8 feat: add router foundation with task classification and arm selection
internal/router/ — core routing layer:
- Task classification: 10 types (boilerplate, generation, refactor,
  review, unit_test, planning, orchestration, security_review, debug,
  explain) with keyword heuristics and complexity scoring
- Arm registry: provider+model pairs with capabilities and cost
- Limit pools: shared resource budgets with scarcity multipliers,
  optimistic reservation, use-it-or-lose-it discounting
- Heuristic selector: score = (quality × value) / effective_cost
  Prefers tools, thinking for planning, penalizes small models on
  complex tasks
- Router: Select() picks best feasible arm, ForceArm() for CLI override

Engine now routes through router.Select() when configured.
Wired into CLI — arm registered per --provider/--model flags.

20 router tests. 173 tests total across 13 packages.
2026-04-03 14:23:15 +02:00
33dec722b8 feat: add security firewall with secret scanning and incognito mode
internal/security/ — core security layer baked into gnoma:
- Secret scanner: gitleaks-derived regex patterns (Anthropic, OpenAI,
  AWS, GitHub, GitLab, Slack, Stripe, private keys, DB URLs, generic
  secrets) + Shannon entropy detection for unknown formats
- Redactor: replaces matched secrets with [REDACTED], merges
  overlapping ranges, preserves surrounding context
- Unicode sanitizer: NFKC normalization, strips Cf/Co categories,
  tag characters (ASCII smuggling), zero-width chars, RTL overrides
- Incognito mode: suppresses persistence, learning, content logging
- Firewall: wraps engine, scans outgoing messages + system prompt +
  tool results before they reach the provider

Wired into engine and CLI. 21 security tests.
2026-04-03 14:07:50 +02:00
f0633d8ac6 feat: complete M1 — core engine with Mistral provider
Mistral provider adapter with streaming, tool calls (single-chunk
pattern), stop reason inference, model listing, capabilities, and
JSON output support.

Tool system: bash (7 security checks, shell alias harvesting for
bash/zsh/fish), file ops (read, write, edit, glob, grep, ls).
Alias harvesting collects 300+ aliases from user's shell config.

Engine agentic loop: stream → tool execution → re-query → until
done. Tool gating on model capabilities. Max turns safety limit.

CLI pipe mode: echo "prompt" | gnoma streams response to stdout.
Flags: --provider, --model, --system, --api-key, --max-turns,
--verbose, --version.

Provider interface expanded: Models(), DefaultModel(), Capabilities
(ToolUse, JSONOutput, Vision, Thinking, ContextWindow, MaxOutput),
ResponseFormat with JSON schema support.

Live verified: text streaming + tool calling with devstral-small.
117 tests across 8 packages, 10MB binary.
2026-04-03 12:01:55 +02:00