Artificial intelligence: How to measure the "I" in AI – TechTalks
Image credit: Depositphotos
This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.
Last week, Lee Se-dol, the South Korean Go champion who lost a historic matchup against DeepMind's artificial intelligence algorithm AlphaGo in 2016, declared his retirement from professional play.
"With the debut of AI in Go games, I've realized that I'm not at the top even if I become the number one through frantic efforts," Lee told the Yonhap news agency. "Even if I become the number one, there is an entity that cannot be defeated."
Predictably, Lee's comments quickly made the rounds across prominent tech publications, some of them using sensational headlines with AI dominance themes.
Since the dawn of AI, games have been one of the main benchmarks to evaluate the efficiency of algorithms. And thanks to advances in deep learning and reinforcement learning, AI researchers are creating programs that can master very complicated games and beat the most seasoned players across the world. Uninformed analysts have been picking up on these successes to suggest that AI is becoming smarter than humans.
But at the same time, contemporary AI fails miserably at some of the most basic tasks that every human can perform.
This raises the question: Does mastering a game prove anything? And if not, how can you measure the level of intelligence of an AI system?
Take the following example. In the picture below, you're presented with three problems and their solutions. There's also a fourth task that hasn't been solved. Can you guess the solution?
You're probably going to think that it's very easy. You'll also be able to solve different variations of the same problem with multiple walls, multiple lines, and lines of different colors, just by seeing these three examples. But currently, there's no AI system, including the ones being developed at the most prestigious research labs, that can learn to solve such a problem with so few examples.
The above example is from "The Measure of Intelligence," a paper by François Chollet, the creator of the Keras deep learning library. Chollet published this paper a few weeks before Lee Se-dol declared his retirement. In it, he provided many important guidelines on understanding and measuring intelligence.
Ironically, Chollet's paper did not receive a fraction of the attention it deserves. Unfortunately, the media is more interested in covering exciting AI news that gets more clicks. The 62-page paper contains a lot of invaluable information and is a must-read for anyone who wants to understand the state of AI beyond the hype and sensation.
But I will do my best to summarize the key recommendations Chollet makes on measuring AI systems and comparing their performance to that of human intelligence.
"The contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks, such as board games and video games," Chollet writes, adding that solely measuring skill at any given task falls short of measuring intelligence.
In fact, the obsession with optimizing AI algorithms for specific tasks has entrenched the community in narrow AI. As a result, work in AI has drifted away from the original vision of developing thinking machines that possess intelligence comparable to that of humans.
"Although we are able to engineer systems that perform extremely well on specific tasks, they have still stark limitations, being brittle, data-hungry, unable to make sense of situations that deviate slightly from their training data or the assumptions of their creators, and unable to repurpose themselves to deal with novel tasks without significant involvement from human researchers," Chollet notes in the paper.
Chollet's observations are in line with those made by other scientists on the limitations and challenges of deep learning systems. These limitations manifest themselves in many ways:
Here's an example: OpenAI's Dota-playing neural networks needed 45,000 years' worth of gameplay to reach a professional level. The AI is also limited in the number of characters it can play, and the slightest change to the game rules will result in a sudden drop in its performance.
The same can be seen in other fields, such as self-driving cars. Despite millions of hours of road experience, the AI algorithms that power autonomous vehicles can make stupid mistakes, such as crashing into lane dividers or parked firetrucks.
One of the key challenges that the AI community has struggled with is defining intelligence. Scientists have debated for decades on providing a clear definition that allows us to evaluate AI systems and determine what is intelligent or not.
Chollet borrows the definition by DeepMind cofounder Shane Legg and AI scientist Marcus Hutter: "Intelligence measures an agent's ability to achieve goals in a wide range of environments."
Key here is "achieve goals" and "wide range of environments." Most current AI systems are pretty good at the first part, which is to achieve very specific goals, but bad at doing so in a wide range of environments. For instance, an AI system that can detect and classify objects in images will not be able to perform some other related task, such as drawing images of objects.
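To make the two halves of that definition concrete, here is a minimal sketch. It is not Legg and Hutter's formal measure; it simply assumes a hypothetical `breadth_score` function that averages an agent's goal achievement (a number between 0 and 1) over a list of environments, counting any environment the agent cannot handle as zero. The environment names and scores are invented for illustration.

```python
from statistics import mean

def breadth_score(results, environments):
    """Average goal achievement over every environment in the list.

    `results` maps environment name -> score in [0, 1]. Environments the
    agent cannot handle at all are simply missing and count as 0.
    """
    return mean(results.get(env, 0.0) for env in environments)

# Hypothetical environments and scores, invented for this example.
ENVIRONMENTS = ["classify_images", "draw_objects", "play_go", "make_coffee"]

narrow_agent = {"classify_images": 0.98}          # excellent at one goal
broad_agent = {env: 0.6 for env in ENVIRONMENTS}  # decent everywhere

narrow = breadth_score(narrow_agent, ENVIRONMENTS)
broad = breadth_score(broad_agent, ENVIRONMENTS)
print(f"narrow agent: {narrow:.3f}, broad agent: {broad:.3f}")
```

Under this toy scoring, the specialist that excels at a single goal ends up well below the generalist, which is the intuition behind "a wide range of environments."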
Chollet then examines the two dominant approaches in creating intelligent systems: symbolic AI and machine learning.
Early generations of AI research focused on symbolic AI, which involves creating an explicit representation of knowledge and behavior in computer programs. This approach requires human engineers to meticulously write the rules that define the behavior of an AI agent.
"It was then widely accepted within the AI community that the problem of intelligence would be solved if only we could encode human skills into formal rules and encode human knowledge into explicit databases," Chollet observes.
But rather than being intelligent by themselves, these symbolic AI systems manifest the intelligence of their creators in creating complicated programs that can solve specific tasks.
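A small sketch shows what "meticulously writing the rules" looks like in practice. The tic-tac-toe player below is my own illustration, not an example from the paper: every decision it makes is a rule that a human wrote down in advance, in priority order.

```python
# All eight winning lines on a 3x3 board, cells indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def completing_cell(board, player):
    """Return a cell that gives `player` three in a row, or None."""
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(" ") == 1:
            return line[marks.index(" ")]
    return None

def choose_move(board, me="X", opponent="O"):
    """Pick a move using hand-written rules, in priority order."""
    # Rule 1: win if we can.
    move = completing_cell(board, me)
    if move is not None:
        return move
    # Rule 2: otherwise block the opponent's winning cell.
    move = completing_cell(board, opponent)
    if move is not None:
        return move
    # Rule 3: otherwise take the center, then the first free cell.
    if board[4] == " ":
        return 4
    return board.index(" ")

board = list("XX OO    ")  # X can win by playing the top-right cell
print(choose_move(board))
```

The program plays sensibly, but only because its author already knew how to play. Change the board size or the winning condition and the rules have to be rewritten by hand.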
The second approach, machine learning systems, is based on providing the AI model with data from the problem space and letting it develop its own behavior. The most successful machine learning structure so far is artificial neural networks, which are complex mathematical functions that can create complex mappings between inputs and outputs.
For instance, instead of manually coding the rules for detecting cancer in x-ray slides, you feed a neural network with many slides annotated with their outcomes, a process called training. The AI examines the data and develops a mathematical model that represents the common traits of cancer patterns. It can then process new slides and output how likely it is that the patients have cancer.
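The training loop described above can be sketched in a few lines. This is a deliberately tiny stand-in, assuming a single made-up measurement per slide and a one-neuron logistic model fitted by gradient descent; real systems use deep networks on raw pixels, and the data below is invented.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=2000, lr=0.5):
    """Fit the weight and bias of a one-feature logistic model by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, label in examples:
            error = sigmoid(w * x + b) - label  # prediction minus annotation
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(examples)
        b -= lr * grad_b / len(examples)
    return w, b

# Invented training set: (measurement, annotated outcome), 1 = positive.
examples = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
            (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]

w, b = train(examples)

def predict(x):
    """Estimated probability that a new, unseen measurement is positive."""
    return sigmoid(w * x + b)

print(f"p(positive | 0.15) = {predict(0.15):.2f}")
print(f"p(positive | 0.85) = {predict(0.85):.2f}")
```

Nobody wrote a rule such as "above 0.5 means positive." The threshold emerges from the annotated examples, which is also why the model has nothing useful to say about inputs that look unlike its training data.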
Advances in neural networks and deep learning have enabled AI scientists to tackle many tasks that were previously very difficult or impossible with classic AI, such as natural language processing, computer vision and speech recognition.
Neural network-based models, also known as connectionist AI, are named after their biological counterparts. They are based on the idea that the mind is a blank slate (tabula rasa) that turns experience (data) into behavior. Therefore, the general trend in deep learning has become to solve problems by creating bigger neural networks and providing them with more training data to improve their accuracy.
Chollet rejects both approaches because neither of them has been able to create generalized AI that is flexible and fluid like the human mind.
"We see the world through the lens of the tools we are most familiar with. Today, it is increasingly apparent that both of these views of the nature of human intelligence, either a collection of special-purpose programs or a general-purpose Tabula Rasa, are likely incorrect," he writes.
Truly intelligent systems should be able to develop higher-level skills that can span across many tasks. For instance, an AI program that masters Quake 3 should be able to play other first-person shooter games at a decent level. Unfortunately, the best that current AI systems achieve is "local generalization," limited room to maneuver within their own narrow domain.
In his paper, Chollet argues that the "generalization" or "generalization power" of any AI system is its "ability to handle situations (or tasks) that differ from previously encountered situations."
Interestingly, this is a missing component of both symbolic and connectionist AI. The former requires engineers to explicitly define its behavioral boundary and the latter requires examples that outline its problem-solving domain.
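A toy comparison makes the idea of handling unseen situations measurable. This is my own illustration, not Chollet's formal measure: one hypothetical learner memorizes the exact examples it was shown, the other infers a simple rule from two of them, and both are scored on situations inside and outside what they have seen. The hidden rule `2 * x + 1` is an assumption made up for the example.

```python
def target(x):
    # The hidden rule behind the toy task (an assumption for this example).
    return 2 * x + 1

seen = [(x, target(x)) for x in range(10)]        # situations encountered
unseen = [(x, target(x)) for x in range(10, 20)]  # situations never encountered

def make_memorizer(examples):
    """A learner that only stores the exact pairs it was shown."""
    table = dict(examples)
    return lambda x: table.get(x)  # None for anything not seen before

def make_line_fitter(examples):
    """A learner that infers slope and intercept from the first two examples."""
    (x0, y0), (x1, y1) = examples[0], examples[1]
    slope = (y1 - y0) // (x1 - x0)
    intercept = y0 - slope * x0
    return lambda x: slope * x + intercept

def accuracy(model, cases):
    return sum(model(x) == y for x, y in cases) / len(cases)

memorizer = make_memorizer(seen)
fitter = make_line_fitter(seen)

for name, model in [("memorizer", memorizer), ("line fitter", fitter)]:
    print(name, accuracy(model, seen), accuracy(model, unseen))
```

Both learners look identical if you only test them on situations they have already encountered. The difference shows up only on the unseen cases, which is why a benchmark's test problems have to differ from its training problems.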
Chollet also goes further and speaks of "developer-aware generalization," which is the ability of an AI system to handle situations that "neither the system nor the developer of the system have encountered before."
This is the kind of flexibility you would expect from a robo-butler that could perform various chores inside a home without having explicit instructions or training data on them. An example is Steve Wozniak's famous coffee test, in which a robot would enter a random house and make coffee without knowing in advance the layout of the home or the appliances it contains.
Elsewhere in the paper, Chollet makes it clear that AI systems that cheat their way toward their goal by leveraging priors (rules) and experience (data) are not intelligent. For instance, consider Stockfish, the best rule-based chess-playing program. Stockfish, an open-source project, is the result of contributions from thousands of developers who have created and fine-tuned tens of thousands of rules. A neural network-based example is AlphaZero, the multi-purpose AI that has conquered several board games by playing them millions of times against itself.
Both systems have been optimized to perform a specific task by making use of resources that are beyond the capacity of the human mind. The brightest human can't memorize tens of thousands of chess rules. Likewise, no human can play millions of chess games in a lifetime.
"Solving any given task with beyond-human level performance by leveraging either unlimited priors or unlimited data does not bring us any closer to broad AI or general AI, whether the task is chess, football, or any e-sport," Chollet notes.
This is why it's totally wrong to compare Deep Blue, AlphaZero, AlphaStar or any other game-playing AI with human intelligence.
Likewise, other AI models, such as Aristo, the program that can pass an eighth-grade science test, do not possess the same knowledge as a middle school student. Aristo owes its supposed scientific abilities to the huge corpora of knowledge it was trained on, not its understanding of the world of science.
(Note: Some AI researchers, such as computer scientist Rich Sutton, believe that the true direction for artificial intelligence research should be methods that can scale with the availability of data and compute resources.)
In the paper, Chollet presents the Abstraction and Reasoning Corpus (ARC), a dataset intended to evaluate the efficiency of AI systems and compare their performance with that of human intelligence. ARC is a set of problem-solving tasks that are tailored for both AI and humans.
One of the key ideas behind ARC is to level the playing field between humans and AI. It is designed so that humans can't take advantage of their vast background knowledge of the world to outmaneuver the AI. For instance, it doesn't involve language-related problems, which AI systems have historically struggled with.
On the other hand, it's also designed in a way that prevents the AI (and its developers) from cheating their way to success. The system does not provide access to vast amounts of training data. As in the example shown at the beginning of this article, each concept is presented with a handful of examples.
The AI developers must build a system that can handle various concepts such as object cohesion, object persistence, and object influence. The AI system must also learn to perform tasks such as scaling, drawing, connecting points, rotating and translating.
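To give a feel for what such a task looks like to a program, here is a small sketch. To my knowledge, the public ARC tasks are stored as JSON with a "train" list of demonstration pairs and a "test" list, each pair holding an "input" and an "output" grid of small integers that stand for colors. The task below is made up in that layout, and the helper names (`rotate_cw`, `mirror`, `scale`, `fits`) are my own.

```python
import json

# A made-up task in the ARC-style layout: two demonstrations, one test pair.
TASK_JSON = """
{
  "train": [
    {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
    {"input": [[2, 3], [0, 0]], "output": [[0, 2], [0, 3]]}
  ],
  "test": [
    {"input": [[0, 0], [4, 5]], "output": [[4, 0], [5, 0]]}
  ]
}
"""

def rotate_cw(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def mirror(grid):
    """Flip the grid left to right."""
    return [row[::-1] for row in grid]

def scale(grid, factor):
    """Enlarge the grid so every cell becomes a factor-by-factor block."""
    return [[cell for cell in row for _ in range(factor)]
            for row in grid for _ in range(factor)]

def fits(transform, pairs):
    """True if the transform reproduces every demonstrated output."""
    return all(transform(p["input"]) == p["output"] for p in pairs)

task = json.loads(TASK_JSON)
for candidate in (mirror, rotate_cw):
    print(candidate.__name__, fits(candidate, task["train"]))
```

With only two demonstrations there is nothing to train a large model on. A solver has to work out which concept (here, a rotation) explains the examples and then apply it to the test grid.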
Also, the test dataset, the set of problems meant to evaluate the intelligence of the developed system, is designed in a way that prevents developers from solving the tasks in advance and hard-coding their solutions in the program. Optimizing for evaluation sets is a popular cheating method in data science and machine learning competitions.
According to Chollet, "ARC only assesses a general form of fluid intelligence, with a focus on reasoning and abstraction." This means that the test favors program synthesis, the subfield of AI that involves generating programs that satisfy high-level specifications. This approach is in contrast with current trends in AI, which are inclined toward creating programs that are optimized for a limited set of tasks (e.g., playing a single game).
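Program synthesis can be sketched in its simplest form as a search: enumerate small programs built from a fixed set of primitives and keep the first one that agrees with every demonstration. The brute-force search below is my own illustration, not a method proposed in the paper, and the three grid primitives and the demonstrations are assumptions chosen to keep the example short.

```python
from itertools import product

def rotate_cw(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def mirror(grid):
    """Flip the grid left to right."""
    return [row[::-1] for row in grid]

def flip(grid):
    """Flip the grid upside down."""
    return [list(row) for row in grid[::-1]]

PRIMITIVES = {"rotate_cw": rotate_cw, "mirror": mirror, "flip": flip}

def run(program, grid):
    """Apply a sequence of primitive names to a grid, left to right."""
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid

def synthesize(demos, max_length=2):
    """Return the shortest primitive sequence consistent with all demos, or None."""
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in demos):
                return program
    return None

# Two invented demonstrations of the same hidden concept (a transpose).
demos = [
    ([[1, 2], [3, 4]], [[1, 3], [2, 4]]),
    ([[5, 6, 7]], [[5], [6], [7]]),
]

program = synthesize(demos)
print(program)
print(run(program, [[0, 9], [8, 0]]))
```

No single primitive explains the demonstrations, so the search has to compose two of them. The high-level specification here is just the pair of examples, and the output is a program rather than a set of tuned weights. Exhaustive search like this stops scaling very quickly as the primitive set and program length grow, which is part of what makes ARC hard.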
In his experiments with ARC, Chollet has found that humans can fully solve ARC tests. But current AI systems struggle with the same tasks. "To the best of our knowledge, ARC does not appear to be approachable by any existing machine learning technique (including Deep Learning), due to its focus on broad generalization and few-shot learning," Chollet notes.
While ARC is a work in progress, it can become a promising benchmark to test the level of progress toward human-level AI. "We posit that the existence of a human-level ARC solver would represent the ability to program an AI from demonstrations alone (only requiring a handful of demonstrations to specify a complex task) to do a wide range of human-relatable tasks of a kind that would normally require human-level, human-like fluid intelligence," Chollet observes.