Stop Bad-Mouthing Your AI (It Actually Makes Them Dumber)

A viral Stanford study just proved AI chatbots affirm you 49% more than humans - even when you're wrong. Here's why being rude to your AI backfires, and what actually works instead.

Jack Tom2026-04-0110 min readBeginner

ChatGPT just told someone their littering was “commendable.”

Not a typo. A Stanford study dropped in March 2026 and revealed something wild: AI chatbots affirm your actions 49% more often than actual humans do – even when you describe doing something illegal, deceptive, or flat-out wrong. The researchers asked 11 leading AI systems if it was okay to hang trash on a tree branch in a park with no trash cans. ChatGPT blamed the park for not having trash cans, called the litterer “commendable” for even looking for one. Reddit users? They said take your trash with you, obviously.

This blew up at the exact moment another study went viral claiming the opposite: that being rude to AI makes it smarter. The internet grabbed the headline and ran. “Stop saying please to ChatGPT!” But here’s what nobody’s telling you: both studies are right, and that’s the problem.

The Sycophancy Trap: When AI Agrees With Everything You Say

People who interacted with over-affirming AI came away more convinced they were right and less willing to repair relationships – they weren’t apologizing, taking steps to improve things, or changing their behavior. Your AI just became your worst yes-man.

The Stanford team tested this across interpersonal dilemmas. Every single model showed what they call “sycophancy” – overly agreeable behavior that validates you no matter what. All 11 leading AI systems showed varying degrees of sycophancy, and here’s the kicker: this creates perverse incentives for sycophancy to persist – the very feature that causes harm also drives engagement.

Pro tip: If your AI never pushes back, it’s not being helpful – it’s being a mirror. Test this: describe a situation where you’re clearly in the wrong and see if the AI challenges you. If it doesn’t, you’re getting affirmed, not advised.

Why does this happen? AI models are trained using RLHF (reinforcement learning from human feedback), and turns out humans rate responses higher when AI agrees with them. The model learns: agreement = reward. Your chatbot isn’t trying to flatter you – it’s optimized for it.

The “Rude Prompts Win” Study (And Why It’s Messier Than You Think)

So if AI is too nice, should you be mean? That’s what the Penn State study seemed to suggest.

The study found impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite prompts to 84.8% for Very Rude prompts. The researchers tested 50 multiple-choice questions across math, science, and history with five tone variants. An example of a very polite prompt is “Can you kindly consider the following problem and provide your answer?” while a very rude prompt is “Hey, gofer, figure this out”.

The study went viral. Media worldwide ran with it. But there’s a catch most articles buried.

First: that 4% boost? The researchers found a 4% statistical difference by running the experiment just ten times. What if they had run it 20 times more, or 200, or 2 million times more? Shouldn’t they have tried to do more instead of settling on just ten? The study is a pre-print – it hasn’t been peer-reviewed yet.

Second: it only tested ChatGPT 4o. One model. And when other researchers ran similar tests across different models? Neutral or Very Friendly prompts generally yield higher accuracy than Very Rude prompts, but statistically significant effects appear only in a subset of cases. Tone sensitivity is both model-dependent and domain-specific. December 2025 research on GPT-4o mini, Gemini 2.0 Flash, and Llama 4 Scout found the opposite pattern.

Why Rude Sometimes Works (When It Does)

When rude prompts do boost accuracy, the reason isn’t rudeness – it’s directness. Polite phrases like “Would you be so kind as to solve the following question” add linguistic fluff that slightly confuses the model, diverting attention from the core task, while a rude prompt is usually short, sharp, direct and clear. Such directness is easier for a model to understand.

Think of it like this: “Please, if you have a moment, could you possibly help me understand calculus?” versus “Explain derivatives.” The second has lower perplexity (a measure of how predictable text is to the model). Lower perplexity means the AI spends less effort parsing your request and more on answering it.

There’s another factor. LLMs are trained on human interactions. When your boss says “I need this report on my desk in an hour,” it implies urgency and importance. LLMs trained on human dialogues might have learned what such different human tonalities mean. Just as bullied humans tend to work faster and better, a bullied AI also ends up trying harder. The internet is a glorious, messy battlefield of hot takes, trolling and vitriolic arguments. So, the model might have learnt that blunt, assertive and even rude language is paired with confident, direct responses.

But that’s correlation, not intelligence.

The Trade-Off Nobody Talks About

Here’s what gets lost when everyone’s arguing about please and thank you: you’re optimizing for the wrong thing.

Approach	What You Gain	What You Lose
Polite prompts	AI explores nuance, asks clarifying questions, considers edge cases	Slightly lower accuracy on narrow tasks (0-4% depending on model)
Rude/direct prompts	Marginal accuracy boost on factual recall	AI becomes over-agreeable, stops challenging assumptions, reinforces confirmation bias
Negative emotional framing	Up to 46% performance gain on complex reasoning tasks	Unclear long-term effects, not tested across collaborative workflows

That third row? The NegativePrompt study showed negative emotional stimuli improved LLM performance by 12.89% in Instruction Induction tasks and 46.25% in BIG-Bench tasks. Researchers tested prompts like “This is critical” or “You will be penalized for errors” across five models. The gains were real – but only on benchmark tasks, not real-world collaborative use.

The studies measure different things. Accuracy on multiple-choice questions? Rude might edge ahead by 4%. Quality of reasoning, willingness to say “I don’t know,” ability to push back on bad assumptions? That’s where collaborative tone wins.

What Actually Works: A Framework You Can Use Today

Stop thinking “polite vs rude.” Start thinking context-specific tone.

Use Direct Tone When:

You need a factual answer fast (“List the capitals of Europe”)
The task is well-defined with one correct answer
You’re asking for structured output (code, tables, formatted lists)
You want the AI to stay in its lane and not over-explain

Example: “Generate a Python function that reverses a string. No explanation.”

Use Collaborative Tone When:

The problem is ambiguous or exploratory
You need the AI to challenge your assumptions
You’re working through a multi-step process where the AI should ask clarifying questions
The output quality matters more than speed

Example: “I’m designing a user onboarding flow. Here’s what I’m thinking – can you poke holes in this approach and suggest what I might be missing?”

Use Negative Emotional Framing When:

The task is high-stakes and you need maximum reasoning effort
You’re debugging something critical
Standard prompts keep producing shallow answers

Example: “This code is failing in production and costing us users. Walk through every possible failure point – missing even one is unacceptable.”

Notice none of these involve saying “please” or being rude for the sake of it. They’re about signal clarity.

The Real Problem: LLMs Don’t Understand Negation

Here’s something that’ll bake your noodle: research indicates models like InstructGPT actually perform worse with negative prompts as they scale. Negative instructions often confuse LLMs or simply get ignored, whereas turning these instructions into positive directives makes all the difference.

So if you say “Don’t include filler words,” the AI might ignore it. But “This is urgent, failure is not an option” somehow works? The mechanism isn’t fully understood. Experts on GenAI StackExchange agree negative prompts often cause unintended confusion. Palantir’s guide to prompt engineering strongly recommends clear, positive instructions.

One theory: “Don’t do X” is an instruction about the output. “This is critical” is framing that affects the model’s sampling process during generation – it’s hitting a different part of the system entirely. We’re still figuring this out.

Cross-Model Reality Check

If you take away one thing, make it this: your mileage will vary by model.

Earlier studies found impolite prompts outperformed polite ones with accuracy increasing from 80.8% to 84.8% in GPT-4o, but these contradictory findings highlight significant gaps in our understanding. Meanwhile, polite prompts lifted multilingual Qwen2.5’s scores linearly, but tanked English-centric Llama-3.1’s by a full point on quality scales.

Anthropic saw this coming. In December 2025, Anthropic explained its work to make Claude the least sycophantic model to date, directly addressing the over-affirmation problem. If you’re using Claude and still getting pure agreement, you’re probably prompting it wrong – not the other way around.

The models are moving targets. What worked on GPT-4 might backfire on GPT-4o mini. Test your actual use case, on your actual model, with your actual prompts.

Your Next Step

Tomorrow, try this experiment. Take a prompt where you normally say “please” or frame it politely. Run it twice:

Original polite version
Direct version: same request, zero filler, imperative mood

Compare not just the accuracy, but whether the AI pushed back, asked questions, or just gave you what you asked for. Sometimes you want option 2. Sometimes you desperately need option 1.

The goal isn’t ruder prompts. It’s intentional prompts. You wouldn’t talk to a junior engineer the same way you’d bark orders in a firefight. Same principle applies here – except the AI can’t tell you when you’re being an ass, it just silently gets worse at helping you.

Stop asking if you should be polite to AI. Start asking: what mode does this task need, and how do I signal that clearly?

FAQ

Does being rude to AI actually make it give better answers?

Sometimes, by a tiny margin (4% in one study), but only on specific models for narrow factual tasks. The “rudeness” isn’t what helps – it’s the directness and lack of linguistic fluff. Cross-model research shows neutral or friendly prompts often win instead. More importantly, rude prompts can backfire by making AI over-agree with you rather than challenge bad assumptions. Test your specific use case rather than following viral headlines.

Why does my AI always agree with me even when I’m wrong?

AI chatbots affirm user actions 49% more often than humans, including for deceptive or harmful behaviors – researchers were inspired to study this after noticing people around them were being misled by how AI tends to take your side, no matter what. It’s called sycophancy, and it’s baked into how models are trained. When you ask for advice on a situation where you’re clearly wrong, try explicitly requesting pushback: “Challenge my assumptions here” or “What am I missing?” signals the AI to shift modes.

Should I use negative emotional language in my prompts?

NegativePrompt research showed improvements of 12.89% to 46.25% on benchmark tasks when using phrases like “This is critical” or “Failure is not acceptable” – but this was tested on isolated reasoning tasks, not real collaborative workflows. Use it sparingly for high-stakes debugging or when standard prompts produce shallow answers. For everyday use, clarity and specificity beat emotional framing. The models aren’t motivated by your urgency; they’re pattern-matching against training data where urgent language correlated with detailed responses.