docs: add reddit-reader design spec
Architecture, data flow, schema, gRPC API, LLM abstraction, TUI layout, config/setup, error handling, and testing strategy.
docs/superpowers/specs/2026-04-03-reddit-reader-design.md

# Reddit Reader — Design Spec

## Overview

A Go TUI application that monitors subreddits for interesting posts, adds them to a reading list, and generates 5-bullet summaries using a local LLM (Ollama/llama.cpp) or Mistral Small 4 as a fallback. Runs as a systemd user service for continuous monitoring; the TUI connects on launch.

## Architecture

Single Go binary with three subcommands:

- `reddit-reader serve` — monitor daemon + gRPC server
- `reddit-reader tui` — Bubble Tea client, connects via gRPC
- `reddit-reader setup` — interactive first-run wizard

### Package Layout

```
cmd/
  serve.go   — cobra subcommand: starts monitor + gRPC server
  tui.go     — cobra subcommand: launches TUI client
  setup.go   — cobra subcommand: first-run wizard
  root.go    — cobra root command
internal/
  monitor/   — Reddit polling loop, orchestrates filter pipeline
  filter/    — keyword/regex pre-filter + LLM relevance scoring
  llm/       — Summarizer interface, Ollama/llama.cpp + Mistral backends
  store/     — SQLite operations (modernc.org/sqlite, pure Go)
  grpc/
    server/  — gRPC service implementation
    client/  — gRPC client used by TUI
  tui/       — Bubble Tea views and models
  config/    — TOML config parsing, env var overlay, first-run setup
proto/
  redditreader.proto — protobuf service definition
```

## Data Flow

### Monitor Loop (runs in `serve`)

```
every 2min, for each subreddit:
  1. go-reddit fetches /new or /hot listings
  2. Dedup: skip posts already in SQLite (keyed by reddit fullname t3_xxxxx)
  3. Keyword/regex pre-filter: match title/flair against configured patterns (cheap, no API calls)
  4. LLM relevance scoring: "rate 0.0-1.0 how relevant to [interests]" — includes recent feedback as few-shot context
  5. Posts above the relevance threshold get a 5-bullet summary from the LLM
  6. Insert post + summary + score into SQLite
  7. Push to connected TUI clients via gRPC streaming
```

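The cheap early stages of the loop (steps 2-3) can be sketched as pure functions. This is an illustrative sketch, not the shipped `internal/filter` code; `post`, `pattern`, `dedup`, and `passesPreFilter` are hypothetical names standing in for the real types:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// post is a minimal stand-in for the fields the pre-filter needs.
type post struct {
	Fullname string // reddit fullname, e.g. "t3_abc123"
	Title    string
	Flair    string
}

// pattern mirrors a row in the filters table: plain keyword or regex.
type pattern struct {
	Expr    string
	IsRegex bool
}

// passesPreFilter reports whether any configured pattern matches the
// post's title or flair. Plain keywords match case-insensitively;
// regex patterns are applied as written.
func passesPreFilter(p post, patterns []pattern) bool {
	haystack := p.Title + " " + p.Flair
	for _, pat := range patterns {
		if pat.IsRegex {
			if re, err := regexp.Compile(pat.Expr); err == nil && re.MatchString(haystack) {
				return true
			}
		} else if strings.Contains(strings.ToLower(haystack), strings.ToLower(pat.Expr)) {
			return true
		}
	}
	return false
}

// dedup drops posts whose fullname is already stored (step 2); in the
// daemon the seen-set would be a SQLite lookup rather than a map.
func dedup(posts []post, seen map[string]bool) []post {
	var fresh []post
	for _, p := range posts {
		if !seen[p.Fullname] {
			fresh = append(fresh, p)
		}
	}
	return fresh
}

func main() {
	seen := map[string]bool{"t3_old1": true}
	posts := []post{
		{Fullname: "t3_old1", Title: "already stored"},
		{Fullname: "t3_new1", Title: "Go iterators deep dive", Flair: "news"},
		{Fullname: "t3_new2", Title: "cat pictures"},
	}
	patterns := []pattern{{Expr: "go", IsRegex: false}}
	for _, p := range dedup(posts, seen) {
		fmt.Printf("%s pass=%v\n", p.Fullname, passesPreFilter(p, patterns))
	}
}
```

Only posts that survive both stages reach the LLM, which is what keeps the call budget below manageable.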
### LLM Call Budget

With 10-25 subreddits polled every 2 minutes, only posts passing the keyword pre-filter reach the LLM. Expected: 5-15 LLM calls per cycle, well within local model throughput and Mistral free-tier limits.

### Feedback Loop

User thumbs-up/down votes in the TUI are stored in SQLite. Recent feedback examples become few-shot context in the relevance scoring prompt ("posts like X were marked interesting, posts like Y were not"). No fine-tuning — prompt engineering with history.

## SQLite Schema

```sql
CREATE TABLE subreddits (
  name TEXT PRIMARY KEY,
  enabled INTEGER DEFAULT 1,
  poll_sort TEXT DEFAULT 'new',
  added_at TEXT DEFAULT (datetime('now'))
);

CREATE TABLE filters (
  id INTEGER PRIMARY KEY,
  subreddit TEXT REFERENCES subreddits(name),
  pattern TEXT NOT NULL,
  is_regex INTEGER DEFAULT 0
);

CREATE TABLE posts (
  id TEXT PRIMARY KEY,            -- reddit fullname t3_xxxxx
  subreddit TEXT NOT NULL,
  title TEXT NOT NULL,
  author TEXT,
  url TEXT,
  selftext TEXT,
  score INTEGER,
  created_utc TEXT,
  fetched_at TEXT DEFAULT (datetime('now')),
  relevance REAL,
  summary TEXT,
  read INTEGER DEFAULT 0,
  starred INTEGER DEFAULT 0,
  dismissed INTEGER DEFAULT 0
);

CREATE TABLE feedback (
  id INTEGER PRIMARY KEY,
  post_id TEXT REFERENCES posts(id),
  vote INTEGER NOT NULL,          -- +1 interesting, -1 not
  created_at TEXT DEFAULT (datetime('now'))
);
```

## gRPC Service

```protobuf
service RedditReader {
  rpc StreamPosts(StreamRequest) returns (stream Post);
  rpc ListPosts(ListRequest) returns (ListResponse);
  rpc UpdatePost(UpdateRequest) returns (Post);
  rpc SubmitFeedback(FeedbackRequest) returns (FeedbackResponse);
  rpc ListSubreddits(Empty) returns (SubredditList);
  rpc AddSubreddit(AddSubredditRequest) returns (Subreddit);
  rpc RemoveSubreddit(RemoveRequest) returns (Empty);
  rpc UpdateFilters(FilterRequest) returns (FilterResponse);
  rpc Status(Empty) returns (StatusResponse);
}
```

- `StreamPosts`: server-side stream; the TUI subscribes on launch for real-time pushes
- `ListPosts`: supports filtering by subreddit, read/unread, starred, date range
- All mutations go through gRPC — single writer to SQLite, no lock contention
- Socket path: `$XDG_RUNTIME_DIR/reddit-reader.sock` (fallback `/tmp/reddit-reader.sock`)

## LLM Abstraction

```go
type Summarizer interface {
	Score(ctx context.Context, post Post, interests Interests) (float64, error)
	Summarize(ctx context.Context, post Post) (string, error)
}
```

### Backends

| Backend | Connection | When Used |
|---------|-----------|-----------|
| Ollama | OpenAI-compatible HTTP at `localhost:11434` | Default — setup probes for it |
| llama.cpp server | OpenAI-compatible HTTP at configurable port | Alternative local |
| Mistral API | `somegit.dev/vikingowl/mistral-go-sdk` | Fallback when no local model available |

Ollama and llama.cpp share one implementation (same OpenAI-compatible API, different base URLs). Mistral uses the dedicated SDK.

### Backend Selection (in `setup`)

1. Probe `localhost:11434` — if Ollama responds, use it and ask which model (default `mistral-small`)
2. Probe the configured llama.cpp endpoint, if set
3. Fall back to the Mistral API — prompt for an API key
4. Store the choice in config, overridable via env vars

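The probe in steps 1-2 only needs to distinguish "something answers HTTP here" from "nothing is listening". A sketch (`probeLocalLLM` is a hypothetical name; hitting `/v1/models` is an assumption about the servers' OpenAI-compatible surface):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// probeLocalLLM reports whether an HTTP server answers at baseURL
// (e.g. http://localhost:11434). Any response counts as alive; only a
// connection error means the backend is absent. The short timeout keeps
// the setup wizard snappy when nothing is listening.
func probeLocalLLM(baseURL string) bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(baseURL + "/v1/models")
	if err != nil {
		return false
	}
	resp.Body.Close()
	return true
}

func main() {
	if probeLocalLLM("http://localhost:11434") {
		fmt.Println("local backend detected")
	} else {
		fmt.Println("no local backend; fall back to Mistral API")
	}
}
```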
### Relevance Prompt Includes

- User's declared interests (from config)
- Last N feedback examples as few-shot context (from SQLite)
- Post title + first ~500 chars of selftext

## TUI

Built with Bubble Tea + Lip Gloss.

### Views

- **Reading List** — default view, scrollable post list sorted by relevance, unread first
- **Starred** — favorited posts
- **Archive** — dismissed and read posts
- **Settings** — manage subreddits, keywords, LLM backend, relevance threshold (via gRPC)

### Post List

- `*` unread / `o` read indicators
- Shows subreddit, relevance score, relative time
- Enter expands to show 5-bullet summary in detail pane

### Keybindings

- `j/k` navigate, `g/G` top/bottom
- `enter` expand/collapse summary
- `s` star, `d` dismiss
- `o` open in browser
- `+/-` vote on relevance
- `/` filter, `?` help
- `tab` switch views

### On Launch

1. Connect to gRPC Unix socket
2. If connection fails and socket activation is configured, systemd starts daemon
3. `ListPosts` populates initial view
4. Subscribe to `StreamPosts` for live updates

## Configuration

### Config File

`~/.config/reddit-reader/config.toml`

```toml
[reddit]
client_id = ""
client_secret = ""
username = ""
password = ""

[llm]
backend = "ollama"
endpoint = "localhost:11434"
model = "mistral-small"
api_key = ""
relevance_threshold = 0.6

[interests]
description = "" # free-text, e.g. "Go programming, NixOS, systems programming, Linux kernel"

[monitor]
poll_interval = "2m"
max_posts_per_poll = 25

[grpc]
socket = "$XDG_RUNTIME_DIR/reddit-reader.sock"
```

Env var overrides: `REDDIT_READER_REDDIT_CLIENT_ID`, `REDDIT_READER_LLM_API_KEY`, etc.

### First-Run Setup (`reddit-reader setup`)

Interactive terminal wizard:

1. Reddit OAuth — walk through creating a script app, prompt for credentials
2. LLM backend — probe local, let user pick or enter Mistral key
3. Subreddits — add initial subreddits with keyword filters
4. Interests — free-text description for relevance prompts
5. Validate — test Reddit auth, test LLM responds, create SQLite DB
6. Systemd — optionally write and enable service + socket units

## Systemd Units

### `reddit-reader.service`

```ini
[Unit]
Description=Reddit Reader Monitor
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=%h/.local/bin/reddit-reader serve
Restart=on-failure

[Install]
WantedBy=default.target
```

### `reddit-reader.socket`

```ini
[Unit]
Description=Reddit Reader Socket

[Socket]
ListenStream=%t/reddit-reader.sock

[Install]
WantedBy=sockets.target
```

The daemon can be started manually (`systemctl --user start reddit-reader.service`); the socket unit is always enabled, so systemd starts the daemon on the first TUI connection if it isn't already running.

## Error Handling

- **Reddit API failures**: exponential backoff per subreddit, log warnings. After 5 consecutive failures, disable subreddit and notify TUI via gRPC stream.
- **LLM unavailable**: store posts with `relevance = NULL`, `summary = NULL`. Retry on next cycle. TUI shows "pending summary" state.
- **SQLite write errors**: fatal for daemon. Fail fast, let systemd restart.
- **gRPC connection lost**: TUI shows disconnected state, retries with backoff, resyncs via `ListPosts` on reconnect.
- **Config missing/invalid**: `serve` and `tui` check on startup, point to `reddit-reader setup`.

## Testing Strategy

- **Unit tests**: filter pipeline (keyword, regex), config parsing, LLM prompt construction, SQLite store operations (in-memory SQLite)
- **Integration tests**: gRPC server/client round-trips with real SQLite, monitor loop with mocked Reddit API responses
- **No mocking of SQLite** — use real in-memory databases
- **TDD**: tests first for store operations, filter logic, gRPC service methods
- **Interfaces for boundaries**: `Summarizer`, Reddit client, store — mock only at system boundaries

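The filter unit tests fit Go's table-driven idiom. A sketch of the shape (here as a runnable `main` with a stand-in `matchKeyword`; in the real `filter` package the loop body would call `t.Run` and `t.Errorf` inside a `_test.go` file):

```go
package main

import (
	"fmt"
	"strings"
)

// matchKeyword stands in for the filter package's keyword matcher:
// case-insensitive substring match against a post title.
func matchKeyword(title, keyword string) bool {
	return strings.Contains(strings.ToLower(title), strings.ToLower(keyword))
}

func main() {
	cases := []struct {
		name    string
		title   string
		keyword string
		want    bool
	}{
		{"exact", "NixOS flakes guide", "nixos", true},
		{"case-insensitive", "GO GENERICS EXPLAINED", "generics", true},
		{"no match", "cat pictures", "nixos", false},
	}
	for _, tc := range cases {
		got := matchKeyword(tc.title, tc.keyword)
		fmt.Printf("%s: got=%v want=%v ok=%v\n", tc.name, got, tc.want, got == tc.want)
	}
}
```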
## Dependencies

| Package | Purpose |
|---------|---------|
| `github.com/vartanbeno/go-reddit/v2` | Reddit API client |
| `somegit.dev/vikingowl/mistral-go-sdk` | Mistral API backend |
| `modernc.org/sqlite` | Pure-Go SQLite |
| `github.com/charmbracelet/bubbletea` | TUI framework |
| `github.com/charmbracelet/lipgloss` | TUI styling |
| `github.com/spf13/cobra` | CLI subcommands |
| `github.com/pelletier/go-toml/v2` | Config parsing |
| `google.golang.org/grpc` | gRPC |
| `google.golang.org/protobuf` | Protobuf codegen |

## Go Version

Go 1.26.1 — use range-over-func iterators and other modern language features where appropriate.