Pocket-size AI models could unlock a new era of computing

By Will Knight | 05.23.24

Hello and welcome to another week when the tech industry appears to have its finger on the fast-forward button for artificial intelligence.

In this edition of the newsletter, instead of talking about the latest giant AI model, I’d like to turn your attention to what’s happening at the other end of the spectrum—where tiny AI programs are proving increasingly capable.

Shrinking AI Programs Can Make Them More Powerful 🧠📲🤏

When ChatGPT was released in November 2022, it could only be accessed through the cloud because the model behind it was downright enormous.

Today I am running a similarly capable AI program on a MacBook Air, and it isn’t even warm. The shrinkage shows how rapidly researchers are refining AI models to make them leaner and more efficient. It also shows that scaling up to ever larger sizes isn’t the only way to make machines significantly smarter.

The model now infusing my laptop with ChatGPT-like wit and wisdom is called Phi-3-mini. It’s part of a family of smaller AI models recently released by researchers at Microsoft. Although it’s compact enough to run on a smartphone, I tested it by running it on a laptop and accessing it from an iPhone through an app called Enchanted that provides a chat interface similar to the official ChatGPT app.

In a paper describing the Phi-3 family of models, Microsoft’s researchers say the model I used measures up favorably to GPT-3.5, the OpenAI model behind the first release of ChatGPT. That claim is based on its performance on several standard AI benchmarks designed to measure common sense and reasoning. In my own testing, it certainly seems just as capable.

Testing Phi-3 on my phone

Microsoft announced a new “multimodal” Phi-3 model capable of handling audio, video, and text at its annual developer conference, Build, this week. That came just days after OpenAI and Google both touted radical new AI assistants built on top of multimodal models accessed via the cloud.

Microsoft’s Lilliputian family of AI models suggests it’s becoming possible to build all kinds of handy AI apps that don’t depend on the cloud. That could open up new use cases by allowing apps to be more responsive or private. (Offline algorithms are a key piece of the Recall feature Microsoft announced that uses AI to make everything you ever did on your PC searchable.)

But the Phi family also reveals something about the nature of modern AI. Sébastien Bubeck, a researcher at Microsoft involved with the project, tells me the models were built to test whether being more selective about what an AI system is trained on could provide a way to fine-tune its abilities.

The large language models like OpenAI’s GPT-4 or Google’s Gemini that power chatbots and other services are typically spoon-fed huge gobs of text siphoned from books, websites, and just about any other accessible source. Although the practice has raised legal questions, OpenAI and others have found that increasing the amount of text fed to these models, and the amount of computer power used to train them, can unlock new capabilities.

Bubeck, who is interested in the nature of the “intelligence” exhibited by language models, decided to see if carefully curating the data fed to a model could improve its abilities without having to balloon its training data.

Last September, his team took a model roughly one-17th the size of OpenAI’s GPT-3.5 and trained it on “textbook quality” synthetic data generated by a larger AI model, including factoids from specific domains such as programming. The resulting model displayed surprising abilities for its size. “Lo and behold, what we observed is that we were able to beat GPT-3.5 at coding using this technique,” he says. “That was really surprising to us.”

Bubeck’s group at Microsoft has made other discoveries using this approach. One experiment showed that feeding an extra-tiny model children’s stories allowed it to produce consistently coherent output, even though AI programs of this size typically produce gibberish when trained the conventional way. Once again, the result suggests you can make seemingly underpowered AI software useful if you educate it with the right material.

Bubeck says these results seem to indicate that making future AI systems smarter will require more than just scaling them up to still greater sizes. And it also seems likely that scaled-down models like Phi-3 will be an important feature of the future of computing. Running AI models “locally” on a smartphone, laptop, or PC reduces the latency or outages that can occur when queries have to be fed into the cloud. It guarantees that your data stays on your device and could unlock entirely new use cases for AI not possible under the cloud-centric model, such as AI apps deeply integrated into a device’s operating system.

Apple is widely expected to unveil its long-awaited AI strategy at its WWDC conference next month, and it has previously boasted that its custom hardware and software allow machine learning to happen locally on its devices. Rather than go toe-to-toe with OpenAI and Google in building ever more enormous cloud AI models, it might think different by focusing on shrinking AI down to fit into its customers’ pockets.

Will Knight, Senior Writer

Need to Know

Teslas Can Still Be Stolen With a Cheap Radio Hack—Despite New Keyless Tech

EXCLUSIVE: Ultra-wideband radio has been heralded as the solution for “relay attacks” that are used to steal cars in seconds. But researchers found Teslas equipped with it are as vulnerable as ever.

The Low-Paid Humans Behind AI’s Smarts Ask Biden to Free Them From ‘Modern Day Slavery’

African workers who label AI data or screen social posts for US tech giants are calling on President Biden to raise their plight with Kenya’s president, William Ruto, who visits the US this week.

Google Search’s New AI Overviews Will Soon Have Ads

Google is set to start mixing ads into its new AI-generated search answers. It’s a test of how the company’s biggest revenue stream can adapt to the age of generative AI.

AI Is a Black Box. Anthropic Figured Out a Way to Look Inside

What goes on in artificial neural networks is largely a mystery, even to their creators. But researchers from Anthropic have caught a glimpse.

For all our future-gazing tech coverage, visit WIRED Business.

So, This Happened

Scarlett Johansson’s complaint that OpenAI ripped off her voice is prompting legal action from other actors against Big Tech. (The Hollywood Reporter)

Amazon is preparing a generative AI overhaul for Alexa—as well as a new subscription model for the service. (CNBC)

OpenAI researchers working on the long-term risks of AI were promised 20 percent of the company’s compute resources. Sources say they were often denied it. (Fortune)

Nvidia’s revenue and profits rocket in its latest earnings report as the AI boom built on its hardware continues. (The New York Times)