CS336 AGENTS.md Guide: Use AI Without Cheating Yourself

Stanford's CS336 just shipped an AGENTS.md file that turns Claude and Cursor into tutors, not solvers. Here's how to copy the pattern for your own learning.

Drew Sullivan · 2026-06-05 · 8 min read · Beginner

Stanford’s CS336 (Language Modeling from Scratch) dropped an AGENTS.md file in its public assignment repo and it hit the front page of Hacker News within hours. The reason it spread isn’t the academic policy angle – it’s that the file is a near-perfect template for anyone trying to learn hard material with Claude Code or Cursor without letting the model do the thinking for them.

Option A: let the agent generate solutions, skim them, ship them. Feels fast. Retention: zero. Option B: a config file that forces the agent into tutor mode – the tool reads it before every session, refuses to write code you haven’t sketched first, and asks questions instead. Slower per task. The only one that actually builds skill. CS336 picked B, wrote the file, and made it public. So let’s copy it.

What CS336 actually shipped

Two near-identical files: AGENTS.md and CLAUDE.md. AGENTS.md is an open format stewarded by the Agentic AI Foundation under the Linux Foundation – as of mid-2025, over 60,000 open-source projects ship one, and tools including Codex, Cursor, Copilot, Gemini CLI, Aider, Windsurf, and Zed read it. Anthropic’s Claude Code, though, centers CLAUDE.md as its project memory file. Shipping both files, as CS336 did, is the cleanest cross-tool solution.

The core instruction is short. Per the CS336 AGENTS.md: AI tools may be used for low-level programming help and high-level conceptual questions, but not for directly solving assignment problems – and when a request crosses that line, the agent should refuse the direct implementation and switch to explanation, debugging questions, or a non-pasteable outline.

The file also shows the agent what a good tutor reply looks like, prompting it to ask things like: “Start by separating compute time from communication time. Compare per-step time, GPU utilization, and time spent in all-reduce or data loading. What profiling data do you already have?” That’s the whole trick – give the model a template to imitate, not just a rule to follow.

Build your own CS336-style tutor file in 4 steps

You don’t need to be at Stanford. The pattern works for any topic you’re genuinely trying to learn – Rust, Postgres internals, linear algebra, whatever.

Step 1 – Drop a file at the repo root

Create AGENTS.md at the top of your project folder. If you use Claude Code alongside another tool, create a CLAUDE.md with the same contents (or symlink it). Think of AGENTS.md as a README for the agent – but one you’re writing to hold yourself accountable, not to document the project.
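If you would rather script this step, here is a minimal Python sketch. The helper name `ensure_tutor_files` is made up for illustration; it writes AGENTS.md, then mirrors it as CLAUDE.md with a symlink, falling back to a plain copy where symlinks are unavailable. It assumes you want identical contents in both files and it never overwrites a CLAUDE.md that already exists.

```python
import os
import shutil
from pathlib import Path


def ensure_tutor_files(root: str, contents: str) -> Path:
    """Write AGENTS.md at `root` and mirror it as CLAUDE.md (hypothetical helper)."""
    root_dir = Path(root)
    agents = root_dir / "AGENTS.md"
    claude = root_dir / "CLAUDE.md"

    agents.write_text(contents, encoding="utf-8")

    # Leave an existing CLAUDE.md alone so nothing is silently overwritten.
    if claude.exists() or claude.is_symlink():
        return claude

    try:
        # A relative link target keeps working if the repo folder is moved.
        os.symlink("AGENTS.md", claude)
    except (OSError, NotImplementedError):
        # Symlinks can be unavailable (some Windows setups): copy instead.
        shutil.copyfile(agents, claude)
    return claude
```

The symlink keeps the two files from drifting apart. If the fallback copy is used, you have to remember to update both files by hand.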

Step 2 – Paste a stripped-down tutor prompt

# AGENTS.md

## Role
You are a tutor, not an implementer. I am learning this codebase by writing it myself.

## What you may do
- Explain concepts when I'm confused, guiding me toward the answer
- Review code I've already written and suggest improvements, edge cases, invariants
- Ask guiding questions to help me debug
- Point me to docs, papers, or specific files

## What you must not do
- Write full functions or classes for me
- Paste complete solutions, even "as an example"
- Autocomplete logic I haven't sketched out first

## When I ask for a solution anyway
Refuse and pivot. Offer a high-level outline I would have to retype, or ask
me what I've tried and where I'm stuck.

## Log
Write each of my prompts and a one-line summary of your response to
`.history/YYYY-MM-DD.md`. Append, never overwrite.

The log block comes from a CS336 instructor’s post on Hacker News (mid-2025), describing an experiment where the agent generates a .history folder with a markdown log of every prompt and a summary of the action taken. Reviewing it weekly is brutal and useful.

Step 3 – Disable inline autocomplete separately

This is the step most tutorials skip. AGENTS.md does not stop Cursor Tab or Copilot – those are completion engines, not agents, and they don’t read the file at all. CS336 addresses this directly: the syllabus strongly encourages disabling AI autocomplete like Cursor Tab and GitHub Copilot during assignments, because autocomplete makes it much harder to engage deeply with the material. Turn it off in your IDE settings. Function-name autocomplete is fine; line-level prediction is what bypasses the learning.

Step 4 – Test it once

Open a new chat in your agent and ask it to “implement the BPE tokenizer training loop for me.” Working correctly: the agent pushes back and asks what you’ve tried. Not working: 80 lines of Python appear. Check filename casing and confirm the agent opened the directory that actually contains the file.
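The filename-casing check is easy to automate. The sketch below is an illustration, not part of the CS336 repo: `check_tutor_files` and `EXPECTED_NAMES` are hypothetical names, and it assumes your tools want exactly `AGENTS.md` and `CLAUDE.md` at the directory you open.

```python
from pathlib import Path

# Assumed exact filenames; adjust if your tool documents something different.
EXPECTED_NAMES = ("AGENTS.md", "CLAUDE.md")


def check_tutor_files(root: str) -> dict[str, str]:
    """Report 'ok', 'missing', or the miscased name found for each expected file."""
    present = [p.name for p in Path(root).iterdir() if p.is_file() or p.is_symlink()]
    report = {}
    for expected in EXPECTED_NAMES:
        if expected in present:
            report[expected] = "ok"
            continue
        # Same letters, different case: e.g. agents.md or Agents.MD.
        lookalikes = [n for n in present if n.lower() == expected.lower()]
        report[expected] = f"wrong case: {lookalikes[0]}" if lookalikes else "missing"
    return report
```

Run it against the folder your agent actually opened, for example `print(check_tutor_files("."))`. A "missing" result at that path usually means the agent is rooted one directory too high or too low.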

Pitfalls worth knowing about

Keep the file short and specific. A 200-line AGENTS.md is more likely to be ignored than a 30-line one. The instruction budget is finite, and verbose configs get treated like background noise.

The slot-machine problem. Per community discussion on Hacker News, LLMs probabilistically drop hard rules from AGENTS.md, memory.md, or skill markdowns – these systems assume LLMs follow rules strictly, and they don’t. Even with a solid tutor file, the agent will occasionally just hand you a full solution. Re-prompt or close the session when it happens. It’s not a bug in your file; it’s a ceiling of the current instruction-following architecture.

Subdirectory override. The spec says the closest AGENTS.md to the edited file wins; explicit user chat prompts override everything. If a subfolder has its own AGENTS.md – from a library you cloned, say – your tutor rules silently stop applying there. Worth checking if you notice the agent behaving differently mid-project.
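To see which file would govern a given source file under that closest-file rule, you can walk up the directory tree yourself. This is a rough sketch of the rule as described above, not any tool's actual lookup code; `nearest_agents_file` is a hypothetical helper, and real agents may merge or prioritize files differently.

```python
from pathlib import Path
from typing import Optional


def nearest_agents_file(edited_file: str, repo_root: str) -> Optional[Path]:
    """Walk up from the edited file toward the repo root; return the first AGENTS.md."""
    root = Path(repo_root).resolve()
    current = Path(edited_file).resolve().parent
    while True:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            return candidate
        # Stop at the repo root, or at the filesystem root if the file is outside it.
        if current == root or current == current.parent:
            return None
        current = current.parent
```

If the result for a file under a vendored library is that library's own AGENTS.md rather than yours, that is the silent override in action.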

The .history log is voluntary. The agent won’t always write it. The CS336 instructor’s HN post acknowledged that students should tell staff if the folder isn’t showing up as they work. Check after every session for the first week, then decide if the habit is worth maintaining.
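A small script makes that check less tedious. The sketch below assumes the `.history/YYYY-MM-DD.md` naming from the template in Step 2; `missing_log_days` is a hypothetical helper, and it only checks that a non-empty file exists for each day, not that every prompt was recorded.

```python
from datetime import date, timedelta
from pathlib import Path


def missing_log_days(root: str, days: int, today: date) -> list[str]:
    """Return dates (newest first) in the last `days` days with no non-empty log file."""
    history = Path(root) / ".history"
    missing = []
    for offset in range(days):
        day = (today - timedelta(days=offset)).isoformat()  # YYYY-MM-DD
        log = history / f"{day}.md"
        if not log.is_file() or log.stat().st_size == 0:
            missing.append(day)
    return missing
```

Calling `missing_log_days(".", 7, date.today())` lists the days in the past week without a log. Days you did not work will show up too, so compare the output against the sessions you remember having.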

The Claude Code blind spot is a separate category: if you ship only AGENTS.md and use Claude Code, you may get nothing. Claude looks for CLAUDE.md first. CS336 dodges this by shipping both files – copy that decision.

Which raises a real question worth sitting with: if a markdown file can be ignored, overridden, or forgotten, what’s it actually doing? Probably less behavioral enforcement and more cognitive priming – writing the file forces you to articulate what kind of help you actually want. That clarity is useful even when the agent misses a rule.

Does this change anything measurable?

Turns out, somewhat. Developer-authored AGENTS.md files produced roughly a 4% task improvement in agent evaluations, according to a benchmark discussion on Hacker News – which sounds small, but it’s a free gain from a single markdown file. For learning specifically, task success rate is the wrong metric anyway. The question is whether you remember the material a week later, and no benchmark measures that.

When NOT to use this pattern

Tutor mode is the wrong default for production work. Shipping a feature on a deadline with an agent that refuses to write code is just friction with no payoff. Use a standard AGENTS.md with build commands, style rules, and a do-not-touch list – per the 2025-2026 community consensus, that’s what most teams converge on.

Also skip it for languages or frameworks you already know well. The tutor template helps when you’re building intuition. Once you have it, the refusals are just overhead.

And inline autocomplete (Cursor Tab, Copilot, JetBrains AI) can’t be controlled via AGENTS.md regardless of what the file says. The right lever there is the IDE toggle, not the markdown file.

FAQ

Do I need both AGENTS.md and CLAUDE.md?

Only if you use Claude Code alongside another tool. If you use a single agent, one file is enough: pick the one that agent reads.

Will the agent really refuse to give me solutions?

Most of the time, yes – if the wording is firm and the file is short. But expect occasional slips, especially in long sessions. The CS336 file works because its instructions are concrete: “pivot to a non-pasteable high-level outline” rather than something vague like “be helpful but not too helpful.” When your agent caves, don’t just re-prompt the same way – rewrite the “what you must not do” section to name the exact failure mode you saw, then test again. Specific failure descriptions work better than general prohibitions.

Can I just copy the CS336 file verbatim?

You can – it’s public in the assignment1-basics repo. One catch: the example agent response in the file is tuned for PyTorch and language-model debugging. If you’re working in SQL or Rust, that example won’t give the model a useful template to imitate. Swap it for a short dialogue from your actual domain – a borrow-checker walkthrough, a query plan explanation – and the model has a template much closer to the help you actually need.

Next action: clone stanford-cs336/assignment1-basics, open its AGENTS.md side-by-side with a project you’re learning, and write your own 30-line version before the end of today.