vikingowl 49d80cf847 feat(security): format-aware entropy safelist (Phase F-1)

Add a deterministic pre-extractor that skips known-safe token shapes
before they reach the entropy scorer. Targets the false-positive
regime that bites under lowered entropy_threshold or
redact_high_entropy = true: UUIDs (~3.4 bits/char), SHA hex digests
(~3.9 bits/char), ISO-8601 timestamps, and HTTP(S) URLs.
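
A minimal sketch of the idea, not the code from this commit: tokens that fully match an enabled safe shape are skipped before a per-character Shannon entropy check. The regexes and the names safeShapes, matchSafeShape, shannonEntropy, and shouldFlag are hypothetical; only the four safelist names come from the config example.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
)

// safeShapes maps safelist names to anchored patterns. The names mirror the
// config values; the regexes themselves are illustrative assumptions.
var safeShapes = map[string]*regexp.Regexp{
	"uuid":    regexp.MustCompile(`^[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$`),
	"sha_hex": regexp.MustCompile(`^(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$`),
	"iso8601": regexp.MustCompile(`^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})$`),
	"url":     regexp.MustCompile(`^https?://\S+$`),
}

// matchSafeShape returns the first enabled pattern that matches the whole
// token. Names missing from safeShapes are ignored at this level.
func matchSafeShape(token string, safelist []string) (string, bool) {
	for _, name := range safelist {
		if re, ok := safeShapes[name]; ok && re.MatchString(token) {
			return name, true
		}
	}
	return "", false
}

// shannonEntropy returns the entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// shouldFlag skips safe shapes first, then applies the entropy threshold.
// An empty safelist leaves the plain threshold check unchanged.
func shouldFlag(token string, threshold float64, safelist []string) bool {
	if _, safe := matchSafeShape(token, safelist); safe {
		return false
	}
	return shannonEntropy(token) >= threshold
}

func main() {
	safelist := []string{"uuid", "sha_hex", "iso8601", "url"}
	tokens := []string{
		"123e4567-e89b-12d3-a456-426614174000",
		"sk_live_9fQ2xL7pVz3TgB8wYc1R", // made-up secret-like token
	}
	for _, tok := range tokens {
		fmt.Printf("len=%d entropy=%.2f flagged=%v\n",
			len(tok), shannonEntropy(tok), shouldFlag(tok, 3.0, safelist))
	}
}
```

Anchoring each pattern to the whole token means a secret sitting next to a UUID is still scored on its own, provided the tokenizer separates the two.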

Config knob lives under the existing security section to match
entropy_threshold / redact_high_entropy convention:

  [security]
  entropy_safelist = ["uuid", "sha_hex", "iso8601", "url"]

Empty / unset preserves pre-F-1 behaviour exactly — users opt in.

Per-pattern Debug telemetry fires on every skip (pattern name +
token length, never the token bytes). This is the data F-2's
go/no-go gate depends on; the plan literally specifies it.
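
One way the skip telemetry could be shaped, as a sketch assuming Go 1.21+ and the standard log/slog package. The helper skipAttrs, the message text, and the attribute keys are hypothetical; the point is that only the pattern name and the token length are logged.

```go
package main

import (
	"log/slog"
	"os"
)

// skipAttrs builds the attributes for one skip event. Only the pattern name
// and the token length are recorded; the token bytes never reach the log.
func skipAttrs(pattern, token string) []any {
	return []any{"pattern", pattern, "token_len", len(token)}
}

func main() {
	// Debug records are dropped by default, so the level is lowered here.
	logger := slog.New(slog.NewTextHandler(os.Stderr,
		&slog.HandlerOptions{Level: slog.LevelDebug}))

	token := "123e4567-e89b-12d3-a456-426614174000"
	logger.Debug("entropy safelist skip", skipAttrs("uuid", token)...)
}
```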

NewFirewall validates names at the config boundary and emits a
Warn for unknown entries so a typo like "uid" instead of "uuid"
surfaces loudly instead of silently disabling FP reduction.
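
The boundary check described above might look roughly like this. It is a sketch, not the actual NewFirewall code: knownShapes and validateSafelist are hypothetical names, and the warning is passed in as a callback so the caller decides how to log it (log/slog assumed, Go 1.21+).

```go
package main

import (
	"fmt"
	"log/slog"
)

// knownShapes lists the safelist names taken from the config example.
var knownShapes = map[string]struct{}{
	"uuid": {}, "sha_hex": {}, "iso8601": {}, "url": {},
}

// validateSafelist keeps known names in their original order and reports
// every unknown name through warn instead of dropping it silently.
func validateSafelist(names []string, warn func(name string)) []string {
	valid := make([]string, 0, len(names))
	for _, n := range names {
		if _, ok := knownShapes[n]; !ok {
			warn(n)
			continue
		}
		valid = append(valid, n)
	}
	return valid
}

func main() {
	// "uid" is the typo from the commit message; it should trigger a warning.
	valid := validateSafelist([]string{"uid", "sha_hex"}, func(name string) {
		slog.Warn("unknown entropy_safelist entry ignored", "name", name)
	})
	fmt.Println(valid)
}
```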

Tests cover: UUID/SHA-1/SHA-256 skipped at lowered threshold,
mixed payload (safe shape + real secret) preserves the secret,
secret-adjacent-to-UUID regression guard, empty safelist preserves
pre-F-1 behaviour, unknown name silently dropped at scanner level
but warned at firewall level, end-to-end FirewallConfig wiring,
and the skip-telemetry log line.

F-2 remains gated on real-workload FP-rate observations.

2026-05-22 12:39:10 +02:00


Gnoma — TODO

Active work, newest first.

In flight

Entropy FP reduction (post-SLM Phase F) — F-1 (format-aware pre-extractor) shipped 2026-05-22: [security].entropy_safelist with uuid, sha_hex, iso8601, url; default empty so pre-F-1 behaviour is unchanged. F-2 (SLM-assisted classifier for ambiguous entropy hits) remains gated on F-1 FP-rate telemetry from real workloads plus ≥50 SLM observations. Surfaced from the r/ollama launch thread (2026-05-20); external validation from alterlab.io on the same tiered approach. See docs/superpowers/plans/2026-05-19-post-slm-unlock.md.
Compound tools (post-SLM Phase E) — held until ≥50 SLM observations inform which primitives are worth adding. See docs/superpowers/plans/2026-05-19-post-slm-unlock.md.
Sensitive-content handling — unified policy. Three input paths can introduce sensitive content into the context: pasted images (screenshots may contain secrets, API keys, PII), pasted text (often copied straight from a terminal with credentials), and tool-read files (.env, key files, etc.). Today these are handled inconsistently: incognito gates persistence but content still flows to providers; outgoing-scan firewall covers some patterns but is format-aware only for text. Need a single policy/UI: at-paste warning when the content matches sensitive heuristics, a consent-gated review step, and consistent treatment across the three paths. Cross-cuts with Phase F entropy work and the outgoing-scan firewall.
Distribution — follow-ups. v0.1.0 shipped (archives on github.com/VikingOwl91/gnoma/releases, multi-arch images on ghcr.io/vikingowl91/gnoma). Still optional: Homebrew tap, curl | sh installer script, signed checksums (cosign/sigstore), release note automation, Windows process-tree kill via golang.org/x/sys/windows job objects (currently os.Process.Kill only — see internal/mcp/transport_windows.go).

Stable backlog (not in active phases)

Thinking mode (disabled / budget / adaptive) — M12.
Structured output with JSON schema validation — M12.
Native agy JSON output — switch the subprocess provider to --output-format stream-json once the agy CLI supports it, replacing the current prompt-augmentation fallback.
SQLite session persistence + serve mode — M10.
Task learning (pattern recognition, persistent tasks) — M11.
Web UI (gnoma web) — M15.
OAuth / keyring — M13.
Observability (feature flags, cost dashboards) — M14.
PE / Mach-O ELF support — future, after ELF Phase 6.

History

Completed initiatives, kept here as pointers to their plan files:

v0.1.0 release — 2026-05-20. First tagged release. GoReleaser pipeline produces six static archives (linux/darwin/windows × amd64/arm64) on the GitHub mirror plus multi-arch Docker images on GHCR. History was rewritten on the same day to migrate authorship to a noreply identity and strip co-author attribution.
Post-audit security hardening — complete 2026-05-19. Three waves + one ADR closed all 14 findings from the external review:
- Wave 1 — SafeProvider boundary
- Wave 2 — Incognito coherence
- Wave 3 — scanner + path hygiene (rolled out directly without a plan file; see commits leading up to 2026-05-19 on internal/security)
- ADR-004 — PostToolUse hook ordering
Post-SLM unlock — plan. Phases A–D complete (two-stage tool routing, CLI agent binary override, user profiles, per-arm capability tags).
2026-05-07 roadmap — plan. M1–M8 done; SLM classifier (Phase 3) complete; Phase 4 superseded by the post-SLM plan.

Reference

Milestones: docs/essentials/milestones.md
Decisions: docs/essentials/decisions/
ADR-002 (SLM routing, supersedes earlier ADR-009): docs/essentials/decisions/002-slm-routing.md
