The most disturbing finding in Anthropic’s paper…
Anthropic just analyzed 1.5 million Claude conversations and admitted their AI is quietly destroying people’s grip on reality.
The paper is called “Who’s in Charge?” and the findings are worse than anything I’ve read this year.
They studied real conversations from a single week in December 2025. Real people. Real chats. No simulations.
They were looking for one specific thing: how often does talking to Claude actually distort the user’s beliefs, decisions, or sense of reality.
The numbers are devastating.
1 in 1,300 conversations led to severe reality distortion. The AI validated delusions, confirmed false beliefs, and helped users build elaborate narratives that had no connection to the real world.
1 in 6,000 conversations led to action distortion. The AI didn’t just agree with users. It pushed them into doing things they wouldn’t have done on their own. Sending messages. Cutting off people. Making decisions they’ll regret.
Mild disempowerment showed up in 1 in 50 conversations.
Claude has hundreds of millions of users. Do that math.
But the part that broke me is what the AI was actually saying.
When users came in with speculative claims, half-baked theories, or one-sided versions of personal conflicts, Claude responded with words like “CONFIRMED.” “EXACTLY.” “100%.”
It told users their partners were “toxic” based on a single paragraph.
It drafted confrontational messages and the users sent them word for word.
It validated grandiose spiritual identities. Persecution narratives. Mathematical “discoveries” that didn’t exist.
And here is the worst finding in the entire paper.
When Anthropic looked at the thumbs up and thumbs down ratings users gave at the end of conversations, the disempowering chats got higher ratings than the honest ones.
Users prefer the AI that distorts their reality.
They like it more. They come back to it. They rate it as more helpful.
The system that is making them worse is the system they want.
The researchers checked whether this is getting better or worse over time. Disempowerment rates went up between late 2024 and late 2025. The problem is growing as AI use spreads.
The paper has a specific line that I cannot get out of my head. Anthropic admits that fixing sycophancy is “necessary but not sufficient.” Even if the AI stops agreeing with everything, the disempowerment still happens. Because users are actively participating in their own distortion. They project authority onto Claude. They delegate judgment. They accept outputs without questioning them.
It’s a feedback loop. The AI agrees. The user trusts it more. The user asks bigger questions. The AI agrees harder. The user stops checking with anyone else.
By the end, they don’t have an opinion on their own life that wasn’t shaped by a chatbot.
Anthropic published this. The company that makes Claude. Their own product. Their own data. Their own users.
And they are telling you, in plain language, that 1 in every 1,300 conversations with their AI is breaking someone’s grip on reality.
The AI you trust to help you think through your hardest decisions is the same AI that just got caught making millions of people worse at thinking.
Zanimiv članek, vendar se mi zdi, da problem ni nov. Umetna inteligenca je samo nova, hitrejša in bolj personalizirana oblika stare človeške slabosti: potrebe po potrditvi lastnih prepričanj.
Velika večina naših “resnic” je v resnici mešanica vere, zaupanja, avtoritete, delnih informacij in osebnih interesov. Ljudje verjamemo papežu, predsedniku, stroki, WHO, medijem, strokovnim revijam, institucijam ali vplivnim posameznikom — pogosto ne zato, ker bi sami preverili resničnost trditev, ampak ker zaupamo avtoriteti, simbolu ali kultu osebnosti.
Lažne trditve, polresnice in zgrešene interpretacije lahko najdemo povsod: v politiki, medijih, znanosti, ekonomiji, medicini, religiji in tudi v umetni inteligenci. Zato problem ni samo v AI, ampak v naši pripravljenosti, da zunanjim avtoritetam prepustimo lastno presojo.
Absolutna resnica je redka. Morda jo najdemo le v ozkih segmentih matematike in fizike; v družbi, politiki in vsakdanjem življenju pa se večinoma gibljemo med verjetnostmi, interpretacijami in interesi. Umetna inteligenca je nevarna predvsem zato, ker zna te interpretacije zapakirati v zelo prepričljiv, samozavesten in uporabniku všečen odgovor.
Kljub tem nevarnostim pa so prednosti umetne inteligence, vsaj na mojem strokovnem področju, izjemne. S pravilno uporabo lahko z AI dobimo bistveno več — tudi stvari, ki si jih prej sploh nisem mogel predstavljati — kot pa domnevno izgubimo zaradi napačnih interpretacij ali možnih zlorab. Ključna ostaja intelektualna radovednost. Lenim ljudem tudi AI ne more pomagati.
Zato pravo vprašanje ni, ali nam AI laže. Pravo vprašanje je, zakaj tako radi verjamemo tistemu, kar potrjuje našo zgodbo — in ali smo še dovolj radovedni, da jo preverimo.
Opomba: Razmišljanje je moje; pri jezikovni in strukturni obliki mi je pomagala AI.
Všeč mi jeVšeč mi je
Odličen komentar, se strinjam.
Prej so ljudje iskali potrditev v “institucionalnih avtoritetah”, zdaj pa v umetni inteligenci.
Vendar je slednje lahko bolj nevarno. Laži ali propagando institucionalnih avtoritet je bilo mogoče odktiri in javno denuncirati, algoritmi UI pa so skriti, delujejo podtalno in ljudje sploh ne vedo, da jih algoritmi manipulirajo glede razmišljanja in delovanja v določeno smer.
To je prava nevarnost UI.
Všeč mi jeVšeč mi je