The Hacker News front page caught fire a few days ago with a thread called “Agentic Mfw” – a link to a satirical manifesto site roasting the way most developers are using coding agents in 2026. The joke landed because it’s true: people are letting agents burn thousands of tokens to ship 9MB bundles, then calling it engineering.
New to the phrase “agentic mfw”? It’s not a tool. It’s a meme that crystallized a real argument: there’s a wrong way to use coding agents, and a right way. This post shows both, then walks you through the right one.
The takeaway in one sentence
Use an agent with a written contract (AGENTS.md) and a plan-first mode. Don’t use an agent as a vending machine for code you’ll never read.
What “Agentic Mfw” is actually mocking
The satire site dropped this week, and the HN thread took off because it named a thing every dev had seen but nobody had labeled. An agent generates 500 lines in two minutes, the demo runs, and three weeks later nobody can touch the integration – the same logic lives in four files with different names. That’s the pattern the satire is skewering.
Agents now plan, execute, and self-correct across multiple files. Tools like Claude Code, Cursor Agent, Codex CLI, and Cline can read your filesystem, run your tests, and iterate on failures while you make coffee. Powerful, yes. Which is exactly why the meme exists – more autonomy means mistakes also scale.
Before you point an agent at a repo, ask whether the repo deserves one. No tests, no types, no module boundaries? You’re about to amplify the mess at machine speed.
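One way to make that question concrete is a quick pre-flight check. The sketch below is hypothetical: the marker names it looks for (`tests/`, `mypy.ini`, `pyrightconfig.json`, `tsconfig.json`, and two or more subdirectories under `src/`) are assumptions about what "tests, types, module boundaries" look like in a Python/TypeScript repo, not any kind of standard. Swap in whatever signals fit your stack.

```python
from pathlib import Path

# Assumed signals only; none of these names come from a standard.
TEST_MARKERS = ("tests", "test")
TYPE_MARKERS = ("mypy.ini", "pyrightconfig.json", "tsconfig.json")


def repo_readiness(root: Path) -> dict[str, bool]:
    """Return a coarse yes/no per guardrail for the repo at `root`."""
    has_tests = any((root / name).is_dir() for name in TEST_MARKERS)
    has_types = any((root / name).is_file() for name in TYPE_MARKERS)
    src = root / "src"
    # "Module boundaries" is approximated as: src/ has two or more subdirectories.
    subdirs = [p for p in src.iterdir() if p.is_dir()] if src.is_dir() else []
    return {
        "tests": has_tests,
        "types": has_types,
        "module_boundaries": len(subdirs) >= 2,
    }


def missing_guardrails(root: Path) -> list[str]:
    """List the guardrails the repo lacks, in a stable order."""
    return [name for name, ok in repo_readiness(root).items() if not ok]
```

If `missing_guardrails(Path("."))` comes back non-empty, that list is probably a better first task than anything you were about to hand the agent.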
There’s a question worth sitting with here: how much of the “agentic mfw” phenomenon is about bad tools, and how much is about developers not yet having mental models for supervising autonomous systems? The answer probably changes what you fix first.
Method A vs Method B: pick before you start
Two approaches are competing for your attention. One is the agentic mfw way. The other is what shipping teams actually do.
| | Method A: Vibe loop | Method B: Contract + plan |
|---|---|---|
| Setup | Open CLI, type goal, hit enter | AGENTS.md at repo root, plan mode on |
| Approval | Auto-approve everything | Approve each destructive edit |
| Verification | “It ran in the demo” | Tests + diff review per step |
| Failure mode | 9MB bundle, four copies of the same function | Slower turns, code you can maintain |
| Cost | Burns tokens until quota | Predictable, plan visible upfront |
Method B wins for a mechanical reason: an agent’s output quality scales directly with the context you hand it. Without a contract, the agent guesses – and guessing with a 1M-token context window means a lot of confident wrong turns. As of March 2026, Google, OpenAI, Sourcegraph, Cursor, and Factory had agreed on AGENTS.md as a universal context standard – that convergence happened because every team that ships with agents hit the same wall without it.
The walkthrough: a guardrailed setup that won’t make you the meme
Minimum viable. Works with Claude Code, Cursor Agent, or Codex CLI.
Step 1 – Write AGENTS.md before you write a prompt
Create AGENTS.md at your repo root. Keep it short – the agent reads it on every request, so every extra line costs tokens on every single call, permanently.
# AGENTS.md
## Stack
- Python 3.11, FastAPI, PostgreSQL 15
- Frontend: React 18 + TypeScript, Zustand
- Tests: pytest, Vitest
## Build & test
- `make test` runs everything
- `make lint` must pass before any commit
## Rules
- Use type hints on every Python function
- No new dependencies without asking
- No class components in React
- If you touch business logic, write the test first
## Architecture
- `src/api/` - HTTP layer, no business logic
- `src/core/` - business logic, no I/O
- `src/db/` - SQL only, called from core
Keep each section to eight lines at most. That’s it. Terse is better – the agent doesn’t need prose, it needs constraints.
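If you want that length budget enforced rather than remembered, a small lint can do it. This is a minimal sketch, assuming sections are marked with `## ` headings as in the example file and treating eight non-blank lines as the ceiling; the function names are made up for illustration.

```python
import re

MAX_LINES_PER_SECTION = 8  # assumed ceiling, taken from the guidance above


def section_lengths(agents_md: str) -> dict[str, int]:
    """Count non-blank lines under each `## ` heading."""
    counts: dict[str, int] = {}
    current = None
    for line in agents_md.splitlines():
        heading = re.match(r"^##\s+(.*)$", line)
        if heading:
            current = heading.group(1).strip()
            counts[current] = 0
        elif current is not None and line.strip():
            counts[current] += 1
    return counts


def oversized_sections(agents_md: str, limit: int = MAX_LINES_PER_SECTION) -> list[str]:
    """Return the names of sections that exceed `limit` non-blank lines."""
    return [name for name, n in section_lengths(agents_md).items() if n > limit]
```

Run it in CI or a pre-commit hook and the file stays terse even after six people have each added "just one more rule."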
Step 2 – Turn on plan mode
Every serious agent has one. Claude Code shows the full execution plan before any file changes. Cursor has Agent mode preview. Codex CLI has Goal mode – though it turns out OpenAI has demoed Goal mode running 1,000+ sequential tool calls without intervention, which is precisely the vibe loop to avoid on anything that goes to production.
Read the plan. If the agent proposes touching ten files for a one-line bug, stop it and ask why before approving anything.
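That "ten files for a one-line bug" smell test can be written down. The sketch below assumes you have copied the list of files from the agent's plan into a Python list by hand; no agent exposes its plan in this exact form, and `plan_scope_warnings`, `expected_dirs`, and the default of three files are all hypothetical.

```python
from pathlib import PurePosixPath


def plan_scope_warnings(
    planned_files: list[str],
    expected_dirs: list[str],
    max_files: int = 3,
) -> list[str]:
    """Flag a plan that touches too many files or strays outside expected_dirs."""
    warnings = []
    if len(planned_files) > max_files:
        warnings.append(
            f"plan touches {len(planned_files)} files, expected at most {max_files}"
        )
    for name in planned_files:
        path = PurePosixPath(name)
        # is_relative_to compares whole path components, so "src/corex" is not under "src/core".
        if not any(path.is_relative_to(d) for d in expected_dirs):
            warnings.append(f"unexpected path: {name}")
    return warnings
```

For a pricing bug you would expect something like `plan_scope_warnings(files, ["src/core", "tests"])` to return an empty list. Anything else is a question to ask the agent before you approve.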
Step 3 – Approve diffs, not sessions
Auto-approve is the door to agentic mfw territory. Per-edit approval is annoying for the first hour. It pays for itself the day a sleepy agent rewrites your auth module “for clarity.”
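If approving every edit is too much friction, one middle ground is to gate only the risky ones. Here is a rough sketch that reads a unified diff and flags files that delete lines or sit under a protected path. The `src/auth/` prefix is a made-up example, and the parser is deliberately naive: it assumes git-style `---`/`+++` header pairs and does not special-case deleted files.

```python
def files_needing_approval(
    diff_text: str,
    protected_prefixes: tuple[str, ...] = ("src/auth/",),  # hypothetical path
) -> list[str]:
    """Return files in a unified diff that delete lines or live under a protected path."""
    lines = diff_text.splitlines()
    flagged: list[str] = []
    current = None
    i = 0
    while i < len(lines):
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if line.startswith("--- ") and nxt.startswith("+++ "):
            # File header pair: take the new path and drop git's "b/" prefix.
            current = nxt[4:].strip()
            if current.startswith("b/"):
                current = current[2:]
            if current.startswith(protected_prefixes) and current not in flagged:
                flagged.append(current)
            i += 2
            continue
        # Inside a file's hunks, a leading "-" marks a removed line.
        if current and line.startswith("-") and current not in flagged:
            flagged.append(current)
        i += 1
    return flagged
```

Pure additions outside protected paths pass through; anything that removes code or touches auth waits for a human.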
Step 4 – Tests are the seatbelt
Test first, implement second. The donweb analysis of the satire makes the point directly: teams that get real value from agents already have tests, static typing, and modular code before the agent touches anything. Those aren’t prerequisites for the agent – they’re prerequisites for you to read what it produces.
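Here is what "test first" can look like at the smallest scale. Everything in this sketch is invented for illustration: `apply_discount` is a hypothetical `src/core/` function, and the tests are plain functions with bare asserts so they run under pytest or on their own. In a real repo the tests would sit in their own file and be written, and seen to fail, before the implementation exists.

```python
# Step 1: the tests, written before any implementation.
def test_applies_percentage():
    assert apply_discount(1000, 20) == 800  # prices are integer cents


def test_zero_percent_is_a_no_op():
    assert apply_discount(1000, 0) == 1000


def test_rejects_out_of_range_percent():
    for bad in (-5, 101):
        try:
            apply_discount(1000, bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for percent={bad}")


# Step 2: the implementation the agent is then asked to write.
def apply_discount(price_cents: int, percent: int) -> int:
    """Return price_cents reduced by percent, rounded down to a whole cent."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100
```

The tests double as the spec you review. If the agent's diff changes them, that is the first thing to question.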
Edge cases nobody’s tutorial mentions
The standard “top tools” articles skip these. They cost hours.
- Claude Code’s silent regression. Since February 2026, Claude Code’s defaults regressed – add `/effort max` or the agent performs worse than older versions on hard tasks. Most users blamed the model. It’s the flag.
- The Cursor rules trap. Single `.cursorrules` file? Cursor’s Agent mode silently ignores it. No warning, no error. Migrate to `.cursor/rules/*.mdc`.
- Gemini CLI is on a countdown. Google is absorbing it into Antigravity CLI – cutoff for AI Pro/Ultra and free Gemini Code Assist is June 18, 2026. Build a workflow on it today, rebuild it next week.
- The Cursor credit trap. Auto mode is free – it doesn’t touch your credit pool. Pinning Sonnet 4.6 or GPT-5.5 does. Most users pin out of habit on small tasks and burn through their $20/month by day 10.
- MCP is a footgun if you don’t sandbox it. The Model Context Protocol, which Anthropic open-sourced in late 2024, is now standard across Claude, Cursor, Codex, Devin, and Windsurf. A badly scoped MCP server gives the agent a key to systems you forgot existed.
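The Cursor rules trap is the easiest of these to check mechanically. A minimal sketch, taking the two paths from the bullet above at face value (`.cursorrules` as the legacy file, `.cursor/rules/*.mdc` as the current layout); the function name and status strings are made up:

```python
from pathlib import Path


def cursor_rules_status(root: Path) -> str:
    """Classify a repo as 'legacy-only', 'migrated', 'both', or 'none'."""
    legacy = (root / ".cursorrules").is_file()
    rules_dir = root / ".cursor" / "rules"
    modern = rules_dir.is_dir() and any(rules_dir.glob("*.mdc"))
    if legacy and modern:
        return "both"
    if legacy:
        return "legacy-only"
    if modern:
        return "migrated"
    return "none"
```

A result of `"legacy-only"` is the case to worry about: the rules exist, and by the post's account the agent never sees them.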
One honest disclaimer
The counterpoint is real: some teams ship with very loose oversight and don’t suffer for it. Maybe they hire well. Maybe they’re on throwaway projects. Maybe they haven’t hit the maintenance wall yet – nobody’s published numbers either way. The meme is a useful warning, not a proof. If you have strong code review culture and fast test cycles, you might need fewer guardrails than this post suggests.
FAQ
Is “agentic mfw” a tool I can install?
No. Satirical site, HN meme. The actual tools are Claude Code, Cursor, and Codex CLI.
I’m working on a weekend hack. Do I really need AGENTS.md and plan mode?
Honestly, no. Throwaway repo, solo, CI-free? Run the agent like a slot machine – that’s what weekend hacks are for. The setup in this post pays off the moment a second person, a future-you, or a pipeline has to deal with the output. Below that bar, vibe away. The guardrails cost time, and time is the whole point of a weekend project.
Which agent should a beginner start with?
Terminal-comfortable? Claude Code. Prefer staying in an editor? Cursor Agent. Skip Gemini CLI for now – the June 2026 migration makes it a bad time to invest.
Do this next
Open the repo you’re working in right now. Create an AGENTS.md with six lines: stack, test command, two “do” rules, two “don’t” rules. Save it. Next time you launch your agent, read the plan it generates and notice how different the first response feels.