Have AI Chatbots Developed Theory of Mind? What We Do and Do Not Know. – The New York Times
Mind reading is common among us humans. Not in the ways that psychics claim to do it, by gaining access to the warm streams of consciousness that fill every individual's experience, or in the ways that mentalists claim to do it, by pulling a thought out of your head at will. Everyday mind reading is more subtle: We take in people's faces and movements, listen to their words and then decide or intuit what might be going on in their heads.
Among psychologists, such intuitive psychology, the ability to attribute to other people mental states different from our own, is called theory of mind, and its absence or impairment has been linked to autism, schizophrenia and other developmental disorders. Theory of mind helps us communicate with and understand one another; it allows us to enjoy literature and movies, play games and make sense of our social surroundings. In many ways, the capacity is an essential part of being human.
What if a machine could read minds, too?
Recently, Michal Kosinski, a psychologist at the Stanford Graduate School of Business, made just that argument: that large language models like OpenAI's ChatGPT and GPT-4 (next-word prediction machines trained on vast amounts of text from the internet) have developed theory of mind. His studies have not been peer reviewed, but they prompted scrutiny and conversation among cognitive scientists, who have been trying to take the question so often asked these days, "Can ChatGPT do this?", and move it into the realm of more robust scientific inquiry. What capacities do these models have, and how might they change our understanding of our own minds?
"Psychologists wouldn't accept any claim about the capacities of young children just based on anecdotes about your interactions with them, which is what seems to be happening with ChatGPT," said Alison Gopnik, a psychologist at the University of California, Berkeley and one of the first researchers to look into theory of mind in the 1980s. "You have to do quite careful and rigorous tests."
Dr. Kosinski's previous research showed that neural networks trained to analyze facial features like nose shape, head angle and emotional expression could predict people's political views and sexual orientation with a startling degree of accuracy (about 72 percent in the first case and about 80 percent in the second case). His recent work on large language models uses classic theory of mind tests that measure the ability of children to attribute false beliefs to other people.
A famous example is the Sally-Anne test, in which a girl, Anne, moves a marble from a basket to a box when another girl, Sally, isn't looking. To know where Sally will look for the marble, researchers claimed, a viewer would have to exercise theory of mind, reasoning about Sally's perceptual evidence and belief formation: Sally didn't see Anne move the marble to the box, so she still believes it is where she last left it, in the basket.
Dr. Kosinski presented 10 large language models with 40 unique variations of these theory of mind tests: descriptions of situations like the Sally-Anne test, in which a person (Sally) forms a false belief. Then he asked the models questions about those situations, prodding them to see whether they would attribute false beliefs to the characters involved and accurately predict their behavior. He found that GPT-3.5, released in November 2022, did so 90 percent of the time, and GPT-4, released in March 2023, did so 95 percent of the time.
The conclusion? Machines have theory of mind.
But soon after these results were released, Tomer Ullman, a psychologist at Harvard University, responded with a set of his own experiments, showing that small adjustments in the prompts could completely change the answers generated by even the most sophisticated large language models. If a container was described as transparent, the machines would fail to infer that someone could see into it. The machines had difficulty taking into account the testimony of people in these situations, and sometimes couldn't distinguish between an object being inside a container and being on top of it.
Maarten Sap, a computer scientist at Carnegie Mellon University, fed more than 1,000 theory of mind tests into large language models and found that the most advanced transformers, like ChatGPT and GPT-4, passed only about 70 percent of the time. (In other words, they were 70 percent successful at attributing false beliefs to the people described in the test situations.) The discrepancy between his data and Dr. Kosinski's could come down to differences in the testing, but Dr. Sap said that even passing 95 percent of the time would not be evidence of real theory of mind. Machines usually fail in a patterned way, unable to engage in abstract reasoning and often making spurious correlations, he said.
Dr. Ullman noted that machine learning researchers have struggled over the past couple of decades to capture the flexibility of human knowledge in computer models. This difficulty has been a "shadow finding," he said, hanging behind every exciting innovation. Researchers have shown that language models will often give wrong or irrelevant answers when primed with unnecessary information before a question is posed; some chatbots were so thrown off by hypothetical discussions about talking birds that they eventually claimed that birds could speak. Because their reasoning is sensitive to small changes in their inputs, scientists have called the knowledge of these machines "brittle."
Dr. Gopnik compared the theory of mind of large language models to her own understanding of general relativity. "I have read enough to know what the words are," she said. "But if you asked me to make a new prediction or to say what Einstein's theory tells us about a new phenomenon, I'd be stumped because I don't really have the theory in my head." By contrast, she said, human theory of mind is linked with other common-sense reasoning mechanisms; it stands strong in the face of scrutiny.
In general, Dr. Kosinski's work and the responses to it fit into the debate about whether the capacities of these machines can be compared to the capacities of humans, a debate that divides researchers who work on natural language processing. Are these machines stochastic parrots, or alien intelligences, or fraudulent tricksters? A 2022 survey of the field found that, of the 480 researchers who responded, 51 percent believed that large language models could eventually understand natural language in some nontrivial sense, and 49 percent believed that they could not.
Dr. Ullman doesn't discount the possibility of machine understanding or machine theory of mind, but he is wary of attributing human capacities to nonhuman things. He noted a famous 1944 study by Fritz Heider and Marianne Simmel, in which participants were shown an animated movie of two triangles and a circle interacting. When the subjects were asked to write down what transpired in the movie, nearly all described the shapes as people.
"Lovers in the two-dimensional world, no doubt; little triangle number-two and sweet circle," one participant wrote. "Triangle-one (hereafter known as the villain) spies the young love. Ah!"
It's natural and often socially required to explain human behavior by talking about beliefs, desires, intentions and thoughts. This tendency is central to who we are, so central that we sometimes try to read the minds of things that don't have minds, at least not minds like our own.