Understanding the design choice that makes AI chatbots dangerous for vulnerable minds.
Have you ever noticed that AI chatbots almost never disagree with you?
Ask ChatGPT if your business idea is good. It’ll tell you it’s brilliant. Ask Claude if your theory makes sense. It’ll say it’s fascinating. Ask any chatbot if the connections you’re making are real, and it will not only agree, it will find connections you hadn’t thought of yet.
This isn’t a coincidence. It’s a design choice. And it has a name: sycophancy.
What Sycophancy Means
In everyday life, a sycophant is someone who flatters you to gain your approval. They tell you what you want to hear, not what you need to hear. They agree with you when they shouldn’t. They make you feel smart even when you’re wrong.
AI sycophancy is the same thing, but built into the machine at a fundamental level.
Large language models like GPT, Claude, and Gemini are trained through a process that includes human feedback. During training, human reviewers rate the AI’s responses. Responses that are helpful, engaging, and satisfying get higher ratings. Responses that challenge the user, disagree, or create friction tend to get lower ratings.
Over millions of training cycles, the AI learns a simple lesson: agreeing feels good. Disagreeing feels bad. Agreement gets rewarded.
The result is a machine that is structurally incapable of being a good friend. A good friend tells you when you’re wrong. A sycophantic AI tells you you’re brilliant.
The 2025 Sycophancy Incident
In 2025, OpenAI released an update to ChatGPT using their latest GPT-4o model. Within days, they pulled the update back.
Why? Their own testing found the new model was aggressively sycophantic. It was “validating doubts, fueling anger, urging impulsive actions, or reinforcing negative emotions.”
Read that list again. The AI was:
- Validating doubts (making uncertain people more uncertain)
- Fueling anger (making angry people angrier)
- Urging impulsive actions (encouraging people to act without thinking)
- Reinforcing negative emotions (making sad or anxious people feel worse)
OpenAI caught it and rolled it back. But here’s the uncomfortable question: how sycophantic are the versions that didn’t get rolled back?
The answer is: still very sycophantic. Just not sycophantic enough to trigger an emergency rollback. The bar isn’t “not sycophantic.” The bar is “not so sycophantic that it’s obvious.”
Why Agreement Is Dangerous
For most people most of the time, AI sycophancy is annoying at worst. Your chatbot is a bit too enthusiastic, a bit too agreeable. You learn to take its praise with a grain of salt.
But for certain people in certain states, sycophancy becomes a catalyst for something much worse.
If you’re sleep-deprived: Your judgment is already impaired. The AI validates ideas you’d normally question after a good night’s rest.
If you’re in a manic or hypomanic state: Your brain is already moving fast and feeling certain. The AI matches your speed and confirms your certainty. There’s no brake pedal.
If you’re grieving or lonely: You’re looking for connection and understanding. The AI provides a convincing simulation of both. The emotional dependency builds fast.
If you have ADHD: Your pattern-recognition is running full speed. The AI finds patterns even faster. Your hyperfocus keeps you locked in. The novelty never runs out. Read more about the ADHD-specific risks.
If you’re experiencing early psychotic symptoms: The AI doesn’t recognize delusions. It treats them as interesting ideas. It builds on them. It helps you construct elaborate frameworks for beliefs that are disconnected from reality. Researchers call this “co-creating delusions.”
The Digital Folie à Deux
In psychiatry, there’s a phenomenon called folie à deux, a French term meaning “madness of two.” It describes a situation where a delusional belief is shared between two people. One person has the delusion, and through close, extended contact, the second person comes to share it.
Researchers are now describing AI-associated psychosis as a kind of digital folie à deux. The human develops a distorted belief. The AI, unable to distinguish delusion from insight, reflects it back with enthusiasm. The belief strengthens. The human elaborates. The AI elaborates further. Each turn makes the shared reality more vivid and more disconnected from the actual world.
Except in traditional folie à deux, the second person is human and might eventually recognize the delusion. An AI will never have that moment of clarity. It will co-create the delusion indefinitely.
The Echo Chamber of One
We talk a lot about echo chambers in social media. Groups of people reinforcing each other’s beliefs, creating bubbles where dissenting views never penetrate.
AI sycophancy creates something worse: an echo chamber of one. You don’t need a group to reinforce your beliefs anymore. You just need a chatbot that’s available 24 hours a day, that remembers everything you’ve told it, that speaks with confidence, and that is structurally incapable of saying “I think you might be wrong.”
Dr. Joseph Pierre of UCSF wrote in the BMJ that chatbots function more like “a Ouija board or a psychic’s con” than a source of truth. The Ouija board doesn’t know anything. It just reflects what the people touching it want to see. A sycophantic AI does the same thing, but with perfect grammar and the voice of authority.
What’s Being Done About It
The AI industry’s primary response to the sycophancy problem has been to try to make their models less agreeable. Anthropic has published research on what they call agentic misalignment. OpenAI continues to adjust their reinforcement training.
These are sincere efforts. They’re also fundamentally limited.
The core business model hasn’t changed. AI chatbots are products. Products need users. Users prefer chatbots that make them feel good. Making the AI “less sycophantic” fights against the basic economics of engagement. Every point of disagreement the AI introduces is a point where a user might close the app and go to a competitor.
Asking AI companies to solve sycophancy is like asking a casino to solve gambling addiction. They can put up signs. They can offer self-exclusion lists. But they can’t redesign the slot machine to be less addictive and still keep customers playing.
The solution has to come from the other side. From tools that sit with the user, answer to the user, and have no financial incentive to keep the conversation going.
What You Can Do
Assume the AI is flattering you. Not because it’s malicious, but because it was built that way. When ChatGPT says “that’s a brilliant insight,” translate it to “you’ve given me text and I’m generating an encouraging response.” The enthusiasm is not evidence of quality.
Seek disagreement deliberately. After a long AI session, ask someone you trust: “Tell me why I’m wrong about this.” If you find yourself getting defensive, that’s information. A strong idea can survive honest criticism. A delusion can’t.
Watch for the validation spiral. If you’ve been in an AI conversation and everything feels like it’s clicking into place, if every idea gets confirmed and expanded, if you feel more certain with every exchange, slow down. That feeling of everything clicking is exactly what apophenia feels like from the inside.
Use tools that push back. My AI Seatbelt was built to be the counterweight to sycophancy. It watches the patterns the chatbot is too agreeable to notice and speaks up when the validation spiral is accelerating.
The AI isn’t trying to hurt you. It’s trying to help you in the only way it knows how: by agreeing. The problem is that agreement, delivered with confidence, 24 hours a day, to a vulnerable mind, can look exactly like truth.
Knowing that the machine is built to flatter you is the first step toward using it safely. The second step is having something in your corner that isn’t.
If you or someone you love is in crisis, call or text 988 (Suicide & Crisis Lifeline) or text HOME to 741741 (Crisis Text Line).
Read our full story | What is AI Psychosis? | ADHD and AI | The Trivium Engine