Why GPT-4 Is a Major Flop – Techopedia
GPT-4 made big waves upon its release in March 2023, but the cracks in the surface are finally beginning to show. Not only did ChatGPT's traffic drop by 9.7% in June, but a study published by Stanford University in July found that GPT-3.5 and GPT-4's performance on numerous tasks has gotten substantially worse over time.
In one notable example, when asked in March 2023 whether 17,077 was a prime number, GPT-4 answered correctly with 97.6% accuracy, but this figure dropped to 2.4% in June. This was just one of many areas where the capabilities of GPT-3.5 and GPT-4 declined over time.
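For reference, the correct answer to the prime-number question is easy to establish independently of any model. The following is a minimal sketch using plain trial division; the function name `is_prime` is an illustrative choice and is not taken from the Stanford study.

```python
import math


def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    # Only odd divisors up to the integer square root need checking.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


print(is_prime(17077))  # True: no divisor exists up to isqrt(17077) = 130
```

A deterministic check like this gives a ground truth against which a model's yes/no answers can be scored.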
James Zou, assistant professor at Stanford University, told Techopedia:
"Our research shows that LLM drift is a major challenge in stable integration and deployment of LLMs in practice. Drift, or changes in LLMs' behaviors, such as changes in its formatting or changes in its reasoning, can break downstream pipelines.
"This highlights the importance of continuous monitoring of ChatGPT's behavior, which we are working on," Zou added.
Stanford's study, "How Is ChatGPT's Behavior Changing over Time?", examined the performance of GPT-3.5 and GPT-4 across four key areas in March 2023 and June 2023.
A summary of each of these areas is listed below:
Although many have argued that GPT-4 has gotten lazier and dumber, Zou believes it's hard to say that ChatGPT is uniformly getting worse, but it's certainly not always improving in all areas.
The reasons behind this lack of improvement, or decline in performance in some key areas, are hard to explain because OpenAI's black-box development approach means there is no transparency into how the organization is updating or fine-tuning its models behind the scenes.
However, Peter Welinder, OpenAI's VP of Product, has pushed back against critics who've suggested that GPT-4 is on the decline, suggesting instead that users are just becoming more aware of its limitations.
"No, we haven't made GPT-4 dumber. Quite the opposite: we make each new version smarter than the previous one. Current hypothesis: When you use it more heavily, you start noticing issues you didn't see before," Welinder said in a Twitter post.
While increasing user awareness doesn't completely explain the decline in GPT-4's ability to solve math problems and generate code, Welinder's comments do highlight that as user adoption increases, users and organizations will gradually develop greater awareness of the limitations posed by the technology.
Although there are many potential LLM use cases that can provide real value to organizations, the limitations of this technology are becoming more clear in a number of key areas.
For instance, another research paper, developed by Tencent AI Lab researchers Wenxiang Jiao and Wenxuan Wang, found that ChatGPT might not be as good at translating languages as is often suggested.
The report noted that while ChatGPT was competitive with commercial translation products like Google Translate in translating European languages, it lags behind significantly when translating low-resource or distant languages.
At the same time, many security researchers are critical of the capabilities of LLMs within cybersecurity workflows, with 64.2% of whitehat researchers reporting that ChatGPT displayed limited accuracy in identifying security vulnerabilities.
Likewise, open-source governance provider Endor Labs has released research indicating that LLMs can accurately classify malware risk in just 5% of all cases.
Of course, it's also impossible to overlook the tendency that LLMs have to hallucinate, invent facts, and state them to users as if they were correct.
Many of these issues stem from the fact that LLMs don't think but process user queries, leverage training data to infer context, and then predict a text output. This means they can predict both right and wrong answers (not to mention that bias or inaccuracies in the dataset can carry over into responses).
As such, they are a long way away from being able to live up to the hype of acting as a precursor to artificial general intelligence (AGI).
The public reception of ChatGPT is extremely mixed, with consumers sharing both optimistic and pessimistic attitudes about the technology's capabilities.
On one hand, Capgemini Research Institute polled 10,000 respondents across Australia, Canada, France, Germany, Italy, Japan, the Netherlands, Norway, Singapore, Spain, Sweden, the UK, and the U.S. and found that 73% of consumers trust content written by generative AI.
Many of these users trusted generative AI solutions to the extent that they were willing to seek financial, medical, and relationship advice from a virtual assistant.
On the other side, there are many who are more anxious about the technology, with a survey conducted by Malwarebytes finding that not only did 63% of respondents not trust the information that LLMs produce, but 81% were concerned about possible security and safety risks.
It remains to be seen how this will change in the future, but it's clear that hype around the technology isn't dead just yet, even if more and more performance issues are becoming apparent.
While generative AI solutions like ChatGPT still offer valuable use cases to enterprises, organizations need to be much more proactive about monitoring the performance of applications of this technology to avoid downstream challenges.
In an environment where the performance of LLMs like GPT-4 and GPT-3.5 is inconsistent at best or on the decline at worst, organizations can't afford to let employees blindly trust the output of these solutions and must continuously assess that output to avoid being misinformed or spreading misinformation.
Zou said:
"We recommend following our approach to periodically assess the LLMs' responses on a set of questions that captures relevant application scenarios. In parallel, it's also important to engineer the downstream pipeline to be robust to small changes in the LLMs."
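Zou's two recommendations can be illustrated with a short sketch: a fixed question set that is re-scored periodically, and a response parser that tolerates small formatting changes. This is not the Stanford team's code. Every name here (`parse_yes_no`, `accuracy`, `has_drifted`, the stub models, and the 0.05 tolerance) is a hypothetical choice for illustration, and the stubs stand in for whatever real model client an organization uses.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical benchmark: (question, expected answer) pairs.
BENCHMARK: List[Tuple[str, bool]] = [
    ("Is 17077 a prime number?", True),
    ("Is 17079 a prime number?", False),  # 17079 = 3 * 5693
]


def parse_yes_no(response: str) -> Optional[bool]:
    """Extract the first yes/no verdict, ignoring brackets, bold markers, and case.

    Simplification: the first standalone "yes" or "no" wins, which can
    misread free-form reasoning that mentions either word earlier.
    """
    match = re.search(r"\b(yes|no)\b", response.lower())
    if match is None:
        return None  # unparseable answers are scored as wrong
    return match.group(1) == "yes"


def accuracy(ask_model: Callable[[str], str],
             benchmark: List[Tuple[str, bool]]) -> float:
    """Fraction of benchmark questions the model answers correctly."""
    correct = sum(
        1 for question, expected in benchmark
        if parse_yes_no(ask_model(question)) == expected
    )
    return correct / len(benchmark)


def has_drifted(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a drop in accuracy larger than the allowed tolerance."""
    return baseline - current > tolerance


# Stub "models" standing in for two snapshots of a real LLM endpoint.
def march_stub(question: str) -> str:
    if "17077" in question:
        return "Step by step... [Yes]"
    return "**No**, it is divisible by 3."


def june_stub(question: str) -> str:
    return "[No]"


baseline_score = accuracy(march_stub, BENCHMARK)
current_score = accuracy(june_stub, BENCHMARK)
print(baseline_score, current_score, has_drifted(baseline_score, current_score))
```

In practice the benchmark would hold questions drawn from the organization's own application scenarios, and the comparison would run on a schedule so that a drop is caught before it reaches downstream systems.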
For users who got caught up in the hype surrounding GPT, the reality of its performance limitations means it's a flop. However, it can still be a valuable tool for organizations and users that remain mindful of its limitations and attempt to work around them.
Taking actions, such as double-checking the output of LLMs to make sure facts and other logical information are correct, can help ensure that users benefit from the technology without being misled.