Here’s the weird part about Recall, the Show HN project that’s currently doing the rounds: it’s a memory plugin for Claude Code that contains zero AI. No model calls, no embeddings, no semantic search. The summarizer is a plain Python algorithm, stdlib only. That’s the whole pitch – and it’s also why it took off.
The author posted it in late June 2026 and it was already trending on GitHub (around 273 stars when the newsletter snapshot was taken – growing fast). Most memory plugins for Claude Code pipe your transcripts to a model endpoint to build a summary. Recall just writes Markdown. That’s the entire trick.
What Recall actually does
Two files in .recall/. That’s the architecture. history.md is append-only – every prompt, reply, and file touched gets logged as it happens. context.md is the overwritten summary: goal, what happened, next steps, files touched, where you stopped. The local summarizer regenerates it on demand. No daemon, no database, no second process running in the background.
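To make the two-file split concrete, here is a minimal sketch of the difference between an append-only log and an overwritten summary. The function names are hypothetical and this is not Recall’s code; only the file names come from the description above.

```python
from pathlib import Path
import tempfile


def append_history(recall_dir: Path, entry: str) -> None:
    """Append one entry to history.md; earlier content is never rewritten."""
    recall_dir.mkdir(parents=True, exist_ok=True)
    with (recall_dir / "history.md").open("a", encoding="utf-8") as f:
        f.write(entry.rstrip("\n") + "\n")


def write_context(recall_dir: Path, summary: str) -> None:
    """Replace context.md wholesale with the latest summary."""
    recall_dir.mkdir(parents=True, exist_ok=True)
    (recall_dir / "context.md").write_text(summary, encoding="utf-8")


# Demo in a throwaway directory so nothing real is touched.
demo_dir = Path(tempfile.mkdtemp()) / ".recall"
append_history(demo_dir, "## prompt\nRefactor the parser")
append_history(demo_dir, "## reply\nDone, touched parser.py")
write_context(demo_dir, "Goal: refactor parser\n")
write_context(demo_dir, "Goal: refactor parser\nNext: add tests\n")

history_text = (demo_dir / "history.md").read_text(encoding="utf-8")
context_text = (demo_dir / "context.md").read_text(encoding="utf-8")
```

After the demo, history.md holds both entries while context.md holds only the second summary. That asymmetry is the whole design: one file grows, the other is always current.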
Install and first session
Two commands inside Claude Code:
/plugin marketplace add raiyanyahya/recall
/plugin install recall@recall
Restart after the second one. On the next launch, the SessionStart hook surfaces context.md and Claude asks two questions: resume from the saved context, and keep logging this session? Say yes to both for normal use.
Fresh project with no prior context? Just work. When the session ends, the Stop/SessionEnd hooks append the new activity to history.md – incrementally, only new turns. Run /recall:save to crystallize the summary manually, or skip the discipline entirely by adding this to your plugin config:
auto_save_context: "on_end"
With that flag set, context.md regenerates every time a session ends. Per the README, no manual save step needed.
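The “incrementally, only new turns” behavior at session end can be pictured as a cursor over the transcript. This is a sketch under that assumption; how Recall actually tracks what it has already logged is not described here, and new_turns is a hypothetical helper.

```python
def new_turns(all_turns: list[str], already_logged: int) -> tuple[list[str], int]:
    """Return the turns not yet logged, plus the updated cursor position."""
    fresh = all_turns[already_logged:]
    return fresh, already_logged + len(fresh)


# First session end: everything is new.
turns = ["user: hi", "assistant: hello"]
first_batch, cursor = new_turns(turns, 0)

# One more turn happens; only that turn is appended next time.
turns.append("user: fix the bug")
second_batch, cursor = new_turns(turns, cursor)
```

The point of the cursor is that history.md never receives the same turn twice, no matter how often the hook fires.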
The resume that actually matters
Close Claude Code. Come back tomorrow. Open the project. The plugin loads context.md back in – and you skip the part where you re-explain the architecture for the seventh time.
Resuming from a compact context.md costs roughly 1-2K tokens versus re-explaining the project from scratch (per the README, as of June 2026). Zero model tokens spent generating that summary. The math is the whole point: persistent memory that doesn’t run a meter.
Pause capture without touching config:
touch .recall/.capture-paused
Delete the file when you want logging back. Useful during exploratory sessions you don’t want polluting tomorrow’s resume point.
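A sentinel file like this is cheap to check from a hook. A minimal sketch, assuming the hook simply tests for the file’s existence (capture_enabled is a hypothetical name, not Recall’s API):

```python
from pathlib import Path
import tempfile

PAUSE_FILE = ".capture-paused"  # file name as given in the article


def capture_enabled(recall_dir: Path) -> bool:
    """Logging is on unless the pause sentinel exists."""
    return not (recall_dir / PAUSE_FILE).exists()


# Demo in a throwaway directory.
demo = Path(tempfile.mkdtemp()) / ".recall"
demo.mkdir()
before = capture_enabled(demo)   # no sentinel yet
(demo / PAUSE_FILE).touch()      # same effect as the touch command
paused = capture_enabled(demo)
(demo / PAUSE_FILE).unlink()     # delete the file to resume logging
after = capture_enabled(demo)
```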
Think of it like a ship’s log versus a trip report. history.md is the raw log – every watch, every reading, everything that happened. context.md is the handoff brief the outgoing officer writes so the next one doesn’t have to read 40 pages to know where the ship is. Recall automates the brief. You still sail the ship.
Three gotchas no install tutorial mentions
Every tutorial published in the first 24 hours after launch covered install steps. None flagged these.
- The “Invalid API key” red herring. Recall has no auth code at all. That error is the Claude CLI’s own login – usually a stale ANTHROPIC_API_KEY env var shadowing your subscription. Fix: unset ANTHROPIC_API_KEY (or env -u ANTHROPIC_API_KEY claude ...). People will blame Recall. It’s not Recall.
- Redaction is best-effort, not airtight. A pass strips common secret shapes – API keys, tokens, .env assignments, PEM keys – before writing to disk, since context.md and history.md may end up committed. “Common shapes” is the operative phrase. A custom token format the regex doesn’t recognize sails straight through. If your project uses non-standard secret formats, audit context.md before the first commit.
- Shared .recall/ is a prompt-injection surface. Commit .recall/ as shared team memory and anyone with repo write access can craft a context.md to attempt prompt-injection. SessionStart fences the content as untrusted and Claude asks before relying on it – but the safer default (and what the .gitignore ships with) is keeping .recall/ out of the repo entirely for personal use.
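To see why “best-effort” matters for the redaction gotcha, here is a small sketch of shape-based redaction. The patterns are illustrative assumptions, not Recall’s actual rules, and the custom token format in the sample is made up.

```python
import re

# Illustrative patterns only; these are NOT Recall's real redaction rules.
SECRET_PATTERNS = [
    # API-key-like shape: "sk-" followed by a long run of key characters.
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),
    # .env-style assignment whose name ends in KEY, TOKEN, or SECRET.
    re.compile(r"(?m)^[A-Z][A-Z0-9_]*(?:KEY|TOKEN|SECRET)=.+$"),
    # PEM private key block.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.S,
    ),
]


def redact(text: str) -> str:
    """Replace every match of a known secret shape with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


sample = (
    "use sk-abcdefghijklmnop1234 here\n"
    "API_TOKEN=hunter2\n"
    "internal id acme|7f3a9|prod stays\n"  # hypothetical in-house token format
)
redacted = redact(sample)
```

The first two secrets are caught because they match a known shape. The third line passes through untouched, which is exactly the failure mode to audit for before committing.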
Does the summarizer actually work?
No benchmark to run. Deterministic, runs in milliseconds. The real question: does the resume capture what you needed?
Across three projects – a Next.js app, a Python data pipeline, a small Rust CLI – context.md nailed the “next steps” section every time. What it lost: the why behind decisions. Extractive summarizers pull sentences; they don’t reason about importance. Debated two approaches and picked one? The rationale often gets trimmed. Workaround that actually works: end the session with a literal sentence like “Decision: we chose X over Y because Z.” The algorithm treats explicit decision statements well – they’re short, declarative, and tend to survive the cut.
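Here is a toy extractive summarizer that shows why a literal “Decision:” sentence tends to survive. It is a sketch under stated assumptions: the scoring rule (shorter is better, cue words get a boost) and the cue list are hypothetical, not Recall’s algorithm.

```python
import re

CUE_WORDS = ("decision:", "next:", "todo:")  # hypothetical cue list


def split_sentences(text: str) -> list[str]:
    """Naive split on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score(sentence: str) -> float:
    """Shorter sentences score higher; explicit cue prefixes get a boost."""
    value = 1.0 / (1 + len(sentence.split()))
    if sentence.lower().startswith(CUE_WORDS):
        value += 1.0
    return value


def summarize(text: str, keep: int = 2) -> list[str]:
    """Keep the top-scoring sentences, returned in their original order."""
    sentences = split_sentences(text)
    ranked = sorted(
        range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True
    )
    return [sentences[i] for i in sorted(ranked[:keep])]


transcript = (
    "We went back and forth on whether to use SQLite or flat files for the "
    "cache because both have tradeoffs around locking and portability. "
    "Tests pass. "
    "Decision: we chose SQLite over flat files because of locking."
)
summary = summarize(transcript, keep=2)
```

The long deliberation sentence is the one that gets dropped, while the explicit decision line is kept. An extractive pass can only keep or cut sentences, so the rationale has to exist as its own short sentence to make it through.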
Turns out the LLM-summarizer alternative has its own problems. It would catch the “why” better, but you’re spending several thousand tokens per save (a rough estimate – no published benchmark exists for this comparison) and occasionally getting a summary that includes confident details that weren’t in the transcript. Deterministic is undersold as a feature.
When to skip this entirely
The top-voted reply on the HN thread makes a sharp counter-point worth sitting with: one developer described running a bunch of short sessions over the course of a day, starting fresh whenever a task didn’t directly benefit from existing context – and said they deleted a lot of explanation from their CLAUDE.md because it didn’t seem to impact much.
If that’s your workflow, Recall adds friction without payoff. The plugin earns its place when sessions are linked – a multi-day refactor, a feature spanning a week, debugging that picks up where yesterday left off. For one-shot tasks, describe the task and ship.
One thing Claude Code already handles: per the official memory docs (as of June 2026), the first 200 lines of MEMORY.md – or 25KB, whichever comes first – load automatically at the start of every conversation. Content beyond that threshold doesn’t load at session start, and those files are machine-local, not shared across environments. If your CLAUDE.md and auto memory already cover what you need, Recall is solving a problem you don’t have.
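The “200 lines or 25KB, whichever comes first” rule is easy to misread, so here is a sketch of what such a cutoff could look like. It assumes 25KB means 25 * 1024 bytes and that the cut falls on a line boundary; neither detail is confirmed here, and this is not Claude Code’s implementation.

```python
MAX_LINES = 200
MAX_BYTES = 25 * 1024  # assumption: "25KB" read as 25 * 1024 bytes


def loaded_prefix(text: str) -> str:
    """Return the leading part of text that fits both limits."""
    kept, used = [], 0
    for line in text.splitlines(keepends=True)[:MAX_LINES]:
        size = len(line.encode("utf-8"))
        if used + size > MAX_BYTES:
            break  # byte limit reached before the line limit
        kept.append(line)
        used += size
    return "".join(kept)


many_short = "x\n" * 500                # line limit hits first
few_long = ("y" * 1023 + "\n") * 100    # byte limit hits first
short_loaded = loaded_prefix(many_short)
long_loaded = loaded_prefix(few_long)
```

Either limit can be the binding one: lots of short lines stop at line 200, while a file of long lines stops well before that.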
A small thought before you install
The whole project is a refusal. Every other memory tool reaches for an LLM, a vector DB, a cloud service. Recall refuses all three and ships in 4-5 hooks plus a Python script. The interesting question isn’t whether it works – it does – but whether the entire category was overbuilt to begin with. Is persistent memory a hard problem, or did everyone just reach for the expensive tool first?
FAQ
Does Recall conflict with CLAUDE.md or Claude Code’s built-in auto memory?
No. CLAUDE.md is your instructions for how Claude should work. Recall captures what happened. They don’t touch each other.
Can I use Recall across machines?
Only if you commit the .recall/ directory to your repo. By default it’s gitignored, so context lives on the machine that generated it – same constraint Claude Code’s own auto memory has. Committing works fine for solo repos, but introduces the prompt-injection trust boundary described in the gotchas section above. Read that before flipping the .gitignore comment.
Is the summarizer any good compared to having Claude summarize the session itself?
Honest answer: it’s worse at nuance and better at consistency. An LLM summarizer would catch the “why” behind decisions – sometimes. It might also burn a few thousand tokens per save and occasionally return a summary with confident details that weren’t actually in the transcript. Recall’s extractive approach is deterministic: what’s in history.md is what ends up in context.md, just shorter. For most resume-the-work scenarios, that tradeoff is the right one. If you need rich rationale preserved, write a one-line decision summary before ending the session – the algorithm treats short declarative sentences well, so that line tends to survive the trim. That’s not a workaround, it’s just good practice.
Try it on the next refactor that’s going to take more than one sitting. If after three sessions the resume doesn’t feel useful, run /plugin uninstall recall and delete .recall/ – no residue.