docs(security): ADR-004 PostToolUse hook ordering + invariant test #4

Merged
vikingowl merged 1 commits from docs/adr-posttooluse-hook-ordering into main 2026-05-19 23:29:31 +02:00
4 changed files with 389 additions and 1 deletions
@@ -0,0 +1,229 @@
# ADR-004: PostToolUse Hook Ordering vs. Firewall Redaction
**Status:** Accepted
**Date:** 2026-05-19
## Context
The 2026-05-19 external security audit raised this concern as a "High" finding:
> Ablauf in `internal/engine/loop.go`:
> tool executes → PostToolUse hook bekommt raw output → danach `Firewall.ScanToolResult()`
>
> Das heißt: Ein `type = "prompt"` Hook kann raw Tool-Output in ein LLM schicken,
> bevor Redaction passiert.
In source order, the relevant code is at `internal/engine/loop.go:699-712`:
```go
// PostToolUse hook: can transform result (Deny treated as Skip).
output := result.Output
if e.cfg.Hooks != nil {
payload := hook.MarshalPostToolPayload(call.Name, args, output, result.Metadata)
transformed, _, _ := e.cfg.Hooks.Fire(hook.PostToolUse, payload)
if s := hook.ExtractTransformedOutput(transformed); s != "" {
output = s
}
}
// Scan tool result through firewall
if e.cfg.Firewall != nil {
output = e.cfg.Firewall.ScanToolResult(output)
}
```
The audit's literal observation is correct: the hook receives `result.Output` before
`Firewall.ScanToolResult` runs. The audit's preferred fix was either:
- **Reorder:** scan first, then fire the hook with redacted input.
- **Split contract:** introduce `PostToolUseRawLocalOnly` (shell hooks only) and
`PostToolUseRedactedForLLM` (prompt/agent hooks only) as distinct events.
Either fix is non-trivial. The reorder regresses legitimate shell-hook use cases
that need raw access (auditing, log forwarding, transformation). The split is a
contract change that requires migrating every existing hook config.
Before adopting either, we re-examined whether the leak path is actually open
after Wave 1 (SafeProvider boundary) merged.
### What Wave 1 changed
The audit was performed against `main(2).zip`, which did not include the
SafeProvider boundary. With Wave 1 (ADR-pending; PR #1 merged), every
`provider.Provider` registered with the router — and every provider handed to a
non-engine consumer (SLM classifier, summarizer, hook streamer) — is wrapped in
`security.SafeProvider`. The wrapper scans outgoing messages and the system
prompt through the firewall before delegating to the inner provider.
This affects PostToolUse hooks as follows, by hook type:
- **`type = "command"` (shell):** runs a local subprocess. No LLM round-trip.
The hook sees raw output, but the output stays local. The audit's threat
model (raw output → cloud LLM) doesn't apply.
- **`type = "prompt"`:** uses `hook.PromptExecutor`, which calls
`p.streamer.Stream(...)`. In production, `streamer` is the
`routerStreamer` adapter from `cmd/gnoma/main.go`, which delegates to
`router.Router.Stream(...)`. The router selects an arm; each arm's `Provider`
is a `*SafeProvider` after Wave 1. Outbound messages — including the
rendered template containing `{{.Result}}` — are scanned at the SafeProvider
boundary before the inner provider sees them.
- **`type = "agent"`:** uses `hook.AgentExecutor`, which spawns an elf via
`elf.Manager`. The elf engine is constructed with `Firewall: m.firewall`, so
the elf's `buildRequest()` scans inline. Same effect as the main engine's
request path.
So the leak path — *raw tool output reaching an LLM* — is already closed
transitively by Wave 1 for both LLM-bound hook types. The audit's literal
observation about source ordering remains true; the practical leak it implied
does not.
### What's left
Two residual concerns survive the transitive guarantee:
1. **Output transformation produces unscannable downstream payloads.** A
PostToolUse hook (any type) may transform raw output into a form that
defeats regex-based scanning — e.g. base64-encoding, JSON-wrapping with
reformulated content, summarisation via LLM that drops the secret-shaped
string. The firewall scan at line 711 runs on the transformed payload,
which may no longer match patterns the raw output would have. This is
*user-configured behaviour* (they chose to add the hook), but it's worth
acknowledging as a known limit.
2. **Transitive safety is non-obvious.** The "leak is closed" property
depends on every LLM-bound hook path going through SafeProvider. A future
change that adds a new hook type with its own LLM transport, or replaces
the `routerStreamer` adapter with something that bypasses the router,
would silently reopen the leak. There is no compile-time or test-time
guarantee that this property holds.
## Decision
**Accept the current ordering. Document the transitive guarantee.**
Concretely:
1. **No reorder, no split.** The PostToolUse contract stays: hooks see raw
`result.Output`; the firewall scan runs after hook transformation.
2. **Add a doc comment** to `hook.PostToolUse` in
`internal/hook/event.go` and to the dispatcher call site in
`internal/engine/loop.go` making the transitive guarantee explicit:
*"PostToolUse hooks receive pre-scan output. LLM-bound paths
(prompt/agent) inherit firewall scanning at the SafeProvider boundary;
shell hooks stay local. Any new hook type that talks to an LLM
must route through SafeProvider or perform its own scan."*
3. **Add a regression test** that asserts a `type = "prompt"` PostToolUse
hook firing on a tool result containing a known secret-shaped string
does not deliver that string verbatim to an inner provider. The test
builds a recording fake provider, registers it via a router arm wrapped
in SafeProvider, configures a prompt-type PostToolUse hook that
includes `{{.Result}}` in its template, fires the hook, and asserts
the recorded request was redacted.
### Why not reorder?
Reorder eliminates Concern 2 cheaply, but it regresses Concern 1's mirror
case: shell hooks that need raw output for legitimate local-only purposes.
Examples we want to preserve:
- Audit-log hooks that record exact bash output for later inspection.
- Forensic hooks that hash raw output before any transformation.
- Local-only alerting (e.g. fire pagerduty when a specific stderr pattern
appears in a build hook) that needs the unredacted signal.
A redacted payload changes byte offsets, regex matches, and content size —
all of which break legitimate shell-hook code that's not LLM-bound.
### Why not split the contract?
A split (`PostToolUseRaw` vs. `PostToolUseRedacted`) is technically clean
but introduces a migration cost out of proportion to the residual risk.
Every existing hook config has to be re-categorised. The split also
encourages a footgun: a hook author who picks the wrong variant gets
either unsafe behaviour or broken transformation. The audit-flagged risk
doesn't justify that tax against an already-closed leak path.
### When to revisit
Re-open this decision if any of the following happen:
- A new hook type lands that performs LLM round-trips outside the router
(e.g. a hook that calls a vendor SDK directly). At that point the
transitive safety argument breaks and Position D (dispatcher-level
redaction for LLM-bound paths) becomes necessary.
- The SafeProvider boundary is removed or relaxed (Wave 1 plan flagged
this as a possible one-release follow-up; if engine-level scanning is
removed AND a future change weakens SafeProvider, the transitive chain
loses its second link).
- Telemetry shows that a meaningful number of users configure prompt-type
PostToolUse hooks. The split contract's migration cost becomes
proportional to the population it protects.
- A real-world incident proves Concern 1 is exploitable — e.g. a
user-configured transformation that the firewall fails to scan.
## Alternatives Considered
### Alternative A: Reorder (scan before hook)
- **Pros:** Defense in depth at the hook boundary; eliminates the
"transitive safety is non-obvious" concern.
- **Cons:** Breaks shell hooks that need raw output. Redacts content
that may never leave the local machine.
### Alternative B: Split contract
(`PostToolUseRawLocalOnly` + `PostToolUseRedactedForLLM`)
- **Pros:** Each hook type sees exactly what it should. Compile-time
clarity for hook authors.
- **Cons:** Migration cost for every existing config. New footgun
(wrong variant chosen → wrong semantics). Two near-identical event
types in the API surface.
### Alternative C: Dispatcher routes by command type
- **Pros:** Single event, transparent split. Hook authors don't pick.
- **Cons:** Dispatcher needs access to the firewall (new dependency).
Per-handler redaction means firing the hook chain multiple times
with different payloads, or pre-computing both raw and scanned
variants. Increases dispatcher complexity meaningfully.
### Alternative D: Accept current ordering, document, test
(this ADR)
- **Pros:** No code reorder. No migration. Acknowledges the existing
transitive guarantee from Wave 1. Closes the "non-obvious" concern
via doc + regression test.
- **Cons:** Concern 1 (transformation defeats scanning) remains as a
user-configured behaviour. Future hook types that bypass the router
silently reopen the leak — the regression test catches one shape
but not all shapes.
## Consequences
**Positive:**
- No churn to the hook contract or existing configs.
- Existing shell-hook use cases (audit, forensic, local alert) keep
raw access.
- Regression test makes the transitive guarantee load-bearing in CI.
**Negative:**
- The transitive guarantee is documented but not type-enforced. A
motivated change can still reopen the leak.
- Output-transforming hooks can produce unscannable downstream
payloads. Same as today; no improvement.
**Neutral:**
- The audit finding is closed by reframing, not by code change.
This ADR is the artifact that records why.
## Implementation
Three small changes land with this ADR:
1. `internal/hook/event.go` — extend the `PostToolUse` constant's
doc comment with the transitive-guarantee statement.
2. `internal/engine/loop.go:699` — extend the inline comment above
the hook fire to point at this ADR.
3. `internal/hook/posttooluse_redaction_test.go` (new) — regression
test as described in the Decision section.
Total: ~80 LOC including the test.
+9
View File
@@ -697,6 +697,15 @@ func (e *Engine) executeSingleTool(ctx context.Context, call message.ToolCall, t
}
// PostToolUse hook: can transform result (Deny treated as Skip).
//
// The payload contains pre-scan output by design. Shell hooks need
// raw access (audit, forensic, local alert); LLM-bound hook types
// (prompt, agent) inherit redaction at the SafeProvider boundary on
// the outbound LLM call, so raw payload here does not equal raw
// payload to a remote model. See ADR-004
// (docs/essentials/decisions/004-posttooluse-hook-ordering.md) for
// the full rationale and the conditions under which this needs to
// be revisited.
output := result.Output
if e.cfg.Hooks != nil {
payload := hook.MarshalPostToolPayload(call.Name, args, output, result.Metadata)
+11 -1
View File
@@ -6,7 +6,17 @@ import "fmt"
type EventType int
const (
PreToolUse EventType = iota + 1
PreToolUse EventType = iota + 1
// PostToolUse fires after a tool executes, before the firewall scans
// the tool result. Hooks receive the raw output by design — shell
// hooks (audit log, forensic hash, local alerting) need it.
//
// LLM-bound hook types (CommandTypePrompt, CommandTypeAgent) do NOT
// leak raw output to a remote model: every LLM round-trip on those
// paths goes through security.SafeProvider (Wave 1), which scans
// outgoing messages before delegating. Adding a new hook type that
// talks to an LLM outside the router would break this guarantee —
// see docs/essentials/decisions/004-posttooluse-hook-ordering.md.
PostToolUse
SessionStart
SessionEnd
+140
View File
@@ -0,0 +1,140 @@
package hook_test
import (
"context"
"encoding/json"
"strings"
"testing"
"somegit.dev/Owlibou/gnoma/internal/hook"
"somegit.dev/Owlibou/gnoma/internal/message"
"somegit.dev/Owlibou/gnoma/internal/provider"
"somegit.dev/Owlibou/gnoma/internal/router"
"somegit.dev/Owlibou/gnoma/internal/security"
"somegit.dev/Owlibou/gnoma/internal/stream"
)
// This regression test locks in ADR-004's transitive guarantee:
// PostToolUse hooks of type "prompt" do not leak raw tool output to a
// remote LLM, because the LLM call routes through SafeProvider, which
// scans outgoing messages.
//
// If this test fails after a refactor, either:
// - The hook prompt path no longer goes through the router/SafeProvider
// (re-open ADR-004 — Position A is broken; switch to Position C/D), or
// - SafeProvider was removed/relaxed (re-open Wave 1).
// recordingProvider captures the last request it received.
type recordingProvider struct {
name string
lastReq provider.Request
}
func (p *recordingProvider) Name() string { return p.name }
func (p *recordingProvider) DefaultModel() string { return "rec-model" }
func (p *recordingProvider) Models(_ context.Context) ([]provider.ModelInfo, error) {
return []provider.ModelInfo{{ID: "rec-model", Name: "rec-model", Provider: p.name}}, nil
}
func (p *recordingProvider) Stream(_ context.Context, req provider.Request) (stream.Stream, error) {
p.lastReq = req
return &finalEventStream{
events: []stream.Event{
{Type: stream.EventTextDelta, Text: "ALLOW"},
{Type: stream.EventTextDelta, StopReason: message.StopEndTurn, Model: "rec-model"},
},
}, nil
}
type finalEventStream struct {
events []stream.Event
idx int
}
func (s *finalEventStream) Next() bool {
if s.idx >= len(s.events) {
return false
}
s.idx++
return true
}
func (s *finalEventStream) Current() stream.Event { return s.events[s.idx-1] }
func (s *finalEventStream) Err() error { return nil }
func (s *finalEventStream) Close() error { return nil }
// streamerThroughRouter mirrors cmd/gnoma/main.go's unexported
// routerStreamer adapter. PromptExecutor needs only Stream(ctx, prompt);
// the router selects an arm and that arm's Provider does the work.
type streamerThroughRouter struct {
rtr *router.Router
}
func (s *streamerThroughRouter) Stream(ctx context.Context, prompt string) (stream.Stream, error) {
req := provider.Request{
Messages: []message.Message{message.NewUserText(prompt)},
}
strm, decision, err := s.rtr.Stream(ctx, router.Task{Type: router.TaskReview}, req)
if err != nil {
return nil, err
}
decision.Commit(0)
return strm, nil
}
func TestPostToolUsePromptHook_RedactsSecretViaSafeProvider(t *testing.T) {
// Wire the same boundary main.go uses: SafeProvider wraps the
// inner provider, router dispatches through arm.Provider.Stream.
rec := &recordingProvider{name: "rec"}
fwRef := new(security.FirewallRef)
fwRef.Set(security.NewFirewall(security.FirewallConfig{
ScanOutgoing: true,
EntropyThreshold: 4.5,
}))
rtr := router.New(router.Config{})
rtr.RegisterArm(&router.Arm{
ID: router.NewArmID("rec", "rec-model"),
Provider: security.WrapProvider(rec, fwRef),
ModelName: "rec-model",
IsLocal: true,
Capabilities: provider.Capabilities{ToolUse: false},
})
streamer := &streamerThroughRouter{rtr: rtr}
// Prompt hook template that drops the raw tool result straight into
// the LLM prompt. This is the worst-case user config.
def := hook.HookDef{
Name: "leaky-prompt-hook",
Event: hook.PostToolUse,
Command: hook.CommandTypePrompt,
Exec: `The bash tool ran. Output was:\n{{.Result}}\n\nDoes this contain a secret? Answer ALLOW or DENY.`,
}
exec := hook.NewPromptExecutor(def, streamer)
// Build a PostToolUse payload whose result.output contains a
// detectable secret.
const secret = "sk-ant-api03-abcdefghijklmnopqrstuvwxyz"
rawOutput := "command completed.\nleaked secret: " + secret
payload := hook.MarshalPostToolPayload("bash", json.RawMessage(`{"cmd":"echo $K"}`), rawOutput, nil)
if _, err := exec.Execute(context.Background(), payload); err != nil {
t.Fatalf("PromptExecutor.Execute: %v", err)
}
// Assert: the recorded request reaching the inner provider does NOT
// contain the raw secret. SafeProvider should have scrubbed it.
if len(rec.lastReq.Messages) == 0 {
t.Fatal("recordingProvider saw no request")
}
text := rec.lastReq.Messages[0].TextContent()
if strings.Contains(text, secret) {
t.Errorf("ADR-004 invariant broken: secret %q reached inner provider verbatim.\n"+
"Recorded prompt: %q\n"+
"Either the hook prompt path no longer routes through SafeProvider, or SafeProvider's "+
"redaction was disabled. Re-open ADR-004.",
secret, text)
}
if !strings.Contains(text, "[REDACTED]") {
t.Errorf("expected [REDACTED] marker in recorded prompt, got %q", text)
}
}