This could lead to the next big breakthrough in common sense AI – MIT Technology Review
AI models that can parse both language and visual input also have very practical uses. If we want to build robotic assistants, for example, they need computer vision to navigate the world and language to communicate about it to humans.
But combining both types of AI is easier said than done. It isn't as simple as stapling together an existing language model with an existing object recognition system. It requires training a new model from scratch with a data set that includes text and images, otherwise known as a visual-language data set.
The most common approach for curating such a data set is to compile a collection of images with descriptive captions. A picture like the one below, for example, would be captioned "An orange cat sits in the suitcase ready to be packed." This differs from typical image data sets, which would label the same picture with only one noun, like "cat." A visual-language data set can therefore teach an AI model not just how to recognize objects but how they relate to and act on one another, using verbs and prepositions.
But you can see why this data curation process would take forever. This is why the visual-language data sets that exist are so puny. A popular text-only data set like English Wikipedia (which indeed includes nearly all the English-language Wikipedia entries) might contain nearly 3 billion words. A visual-language data set like Microsoft Common Objects in Context, or MS COCO, contains only 7 million. It's simply not enough data to train an AI model for anything useful.
Vokenization gets around this problem, using unsupervised learning methods to scale the tiny amount of data in MS COCO to the size of English Wikipedia. The resultant visual-language model outperforms state-of-the-art models in some of the hardest tests used to evaluate AI language comprehension today.
"You don't beat state of the art on these tests by just trying a little bit," says Thomas Wolf, the cofounder and chief science officer of the natural-language processing startup Hugging Face, who was not part of the research. "This is not a toy test. This is why this is super exciting."
Let's first sort out some terminology. What on earth is a "voken"?
In AI speak, the words that are used to train language models are known as tokens. So the UNC researchers decided to call the image associated with each token in their visual-language model a "voken." "Vokenizer" is what they call the algorithm that finds vokens for each token, and "vokenization" is what they call the whole process.
The point of this isn't just to show how much AI researchers love making up words. (They really do.) It also helps break down the basic idea behind vokenization. Instead of starting with an image data set and manually writing sentences to serve as captions (a very slow process), the UNC researchers started with a language data set and used unsupervised learning to match each word with a relevant image (more on this later). This is a highly scalable process.
The unsupervised learning technique, here, is ultimately the contribution of the paper. How do you actually find a relevant image for each word?
Let's go back for a moment to GPT-3. GPT-3 is part of a family of language models known as transformers, which represented a major breakthrough in applying unsupervised learning to natural-language processing when the first one was introduced in 2017. Transformers learn the patterns of human language by observing how words are used in context and then creating a mathematical representation of each word, known as a "word embedding," based on that context. The embedding for the word "cat" might show, for example, that it is frequently used around the words "meow" and "orange" but less often around the words "bark" or "blue."
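To make the idea of "context" concrete, here is a minimal sketch that simply counts which words appear near a target word. This is a deliberately simplified stand-in, not how transformers actually compute embeddings (they learn dense vectors during training). The function name `cooccurrence_embedding`, the window size, and the three-sentence corpus are all assumptions made up for illustration.

```python
from collections import Counter


def cooccurrence_embedding(sentences, target, window=2):
    """Count the words that appear within `window` positions of `target`.

    A toy illustration only: real transformer embeddings are learned
    dense vectors, not raw neighbor counts.
    """
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            if word != target:
                continue
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[words[j]] += 1
    return counts


# Hypothetical mini-corpus, invented for this example.
corpus = [
    "the orange cat said meow",
    "an orange cat sits in the suitcase",
    "the dog began to bark",
]

cat = cooccurrence_embedding(corpus, "cat")
print(cat["orange"], cat["meow"], cat["bark"])
```

In this toy corpus, "cat" ends up with higher counts for "orange" and "meow" than for "bark," which is the kind of usage pattern the article describes.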
This is how transformers approximate the meanings of words, and how GPT-3 can write such human-like sentences. It relies in part on these embeddings to tell it how to assemble words into sentences, and sentences into paragraphs.
There's a parallel technique that can also be used for images. Instead of scanning text for word usage patterns, it scans images for visual patterns. It tabulates how often a cat, say, appears on a bed versus on a tree, and creates a "cat" embedding with this contextual information.
The insight of the UNC researchers was that they should use both embedding techniques on MS COCO. They converted the images into visual embeddings and the captions into word embeddings. What's really neat about these embeddings is that they can then be graphed in a three-dimensional space, and you can literally see how they are related to one another. Visual embeddings that are closely related to word embeddings will appear closer in the graph. In other words, the visual cat embedding should (in theory) overlap with the text-based cat embedding. Pretty cool.
You can see where this is going. Once the embeddings are all graphed and compared and related to one another, it's easy to start matching images (vokens) with words (tokens). And remember, because the images and words are matched based on their embeddings, they're also matched based on context. This is useful when one word can have totally different meanings. The technique successfully handles that by finding different vokens for each instance of the word.
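The matching step can be sketched as a nearest-neighbor lookup: for each token embedding, pick the image whose embedding is most similar. The sketch below uses cosine similarity, which is one common way to measure closeness and is an assumption here, not necessarily the researchers' method. The image file names, the three-dimensional vectors, and the word "bank" are all hypothetical values invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_voken(token_vec, image_vecs):
    """Return the name of the image whose embedding is closest to the token."""
    return max(image_vecs, key=lambda name: cosine(token_vec, image_vecs[name]))


# Hypothetical image embeddings (made-up numbers).
image_vecs = {
    "cat_photo.jpg": [0.9, 0.1, 0.0],
    "river_bank.jpg": [0.0, 0.9, 0.2],
    "bank_building.jpg": [0.1, 0.1, 0.9],
}

# Hypothetical contextual embeddings for the same word, "bank",
# as it might appear in two different sentences.
bank_in_river_sentence = [0.1, 0.8, 0.3]
bank_in_money_sentence = [0.0, 0.2, 0.9]

print(nearest_voken(bank_in_river_sentence, image_vecs))
print(nearest_voken(bank_in_money_sentence, image_vecs))
```

Because each occurrence of the word gets its own context-dependent vector, the two occurrences land nearest to different images, which is the behavior the paragraph above describes.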
For example: