Apple Advances On-Device AI, Shifting Focus from Cloud Dependence

Deep News 05-28 22:23

At next month's annual Apple developer conference, a series of long-awaited AI feature upgrades for the iPhone is set to take center stage. At the same time, the company is expected to highlight a key potential advantage in its AI strategy: leveraging its vast global fleet of devices to run AI models directly on the hardware itself, bypassing the cloud.

According to sources familiar with the plans for Apple's Worldwide Developers Conference, the company is poised to showcase over fifteen years of technical expertise in developing custom chips for the iPhone, Apple Watch, and Mac computers. This accumulated knowledge is considered a core advantage for running AI models locally on devices. The current industry standard, in contrast, involves running these models in large data centers equipped with high-performance AI chips, an approach that entails significant construction and operational costs.

Due to computational complexity and the need to access vast amounts of online information, many AI commands from Apple devices still require processing in the cloud. For instance, under a cooperation agreement with Google, some new Siri commands will run on Google's cloud platform using a licensed version of the Gemini large language model. Additional sources indicate Apple has recently approved the use of Nvidia's privacy-preserving technology in this context, meaning part of the computing demand within Google Cloud will be handled by Nvidia AI chips.

However, running AI models locally on devices can reduce the risk of user data breaches and prevent advertising firms from profiting from personal information. For enterprise clients, on-device processing can also decrease token usage, thereby lowering costs (tokens are the units of text by which cloud AI service providers bill usage). For Apple itself, offloading more AI computational tasks to end-user devices allows it to avoid the massive capital expenditures on data centers that other tech giants are making.

Sources state that, as part of their collaboration, Apple is applying a technique called model distillation, using the full version of Google's Gemini model to train lightweight models capable of running locally on Apple devices. Furthermore, Apple is seeking out smaller companies that can assist in adapting and optimizing AI models for on-device operation. A source with knowledge of this strategy revealed that Liquid AI, a Cambridge, Massachusetts-based startup focused on edge AI technology, is on Apple's list of potential acquisition targets.
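As a rough illustration of what model distillation means in general (this is a textbook-style sketch, not Apple's or Google's actual pipeline, which the sources did not describe), a small "student" model is trained to match the softened output probabilities of a large "teacher" model. The function names, the temperature value, and the example scores below are all hypothetical, chosen only to show the idea:

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature**2 factor is the scaling commonly used in distillation;
    the default temperature here is an illustrative assumption.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical scores over three candidate outputs.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # prefers a different output

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Training the student means adjusting its parameters to push this loss toward zero, so that the smaller model reproduces the larger model's behavior closely enough to run on a phone or laptop.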

Apple initially highlighted the privacy benefits of on-device AI when it launched its Apple Intelligence suite of features in 2024. However, progress in this area subsequently stalled: the new AI features received a lukewarm market response, and the updated Siri faced further delays, creating an awkward situation for the company.

Meanwhile, as major tech giants invested heavily in building cloud-based AI computing infrastructure, Apple largely remained on the sidelines. Last year, Meta's capital expenditures reached $72 billion, with the vast majority directed toward data center construction; Microsoft's capital expenditures were as high as $88 billion. In the same period, Apple's capital expenditures were only $12.72 billion.

Apple's conservative approach to AI investment has drawn criticism from investors and industry commentators, who argued the company risked falling behind in an era in which AI is a core capability for smart devices. But with AI investment across the tech industry at an unprecedented scale (Microsoft alone is forecast to have capital expenditures of $190 billion this year), some technologists have begun to worry about a blind rush to build cloud computing capacity. This has led to a reassessment of Apple's more measured strategy.

David Stott, CEO of Austin-based AI startup webAI, commented, "I believe there's a misstep in the current data center investment frenzy. AI technology is moving towards lighter-weight models. Data centers won't disappear entirely, but the vast majority of computational tasks will eventually shift to the edge. Apple is betting on the right direction here."

Stott is among a growing number of AI developers building businesses on Apple hardware. webAI develops custom on-device AI applications for enterprises, such as maintenance tools for the aviation industry, for which it trains AI models on the complete repair manuals for Boeing Dreamliner engines to assist technicians.

These models can run offline on an iPad or Mac. Apple devices are also favored by tech enthusiasts for running open-source tools like OpenClaw, which can create AI agents capable of autonomously operating a computer.

In a recent research note to investors, Arete Research tech analyst Richard Kramer estimated that the aggregate computing power of the chips in Apple's global installed base of devices is equivalent to a $50 billion computing resource, the cost of which is borne entirely by users worldwide.

Mark Suman, a former Apple senior engineering program manager who led internal AI system development before departing in 2024, stated that the collective power of billions of Apple devices constitutes a formidable AI computing force in itself.

Suman, now co-founder of startup Maple, which provides services for encrypted access to cloud AI models, said, "Apple has the capability to build the world's largest edge computing AI system. It's only a matter of time before they unleash that potential."

Naturally, Apple's AI strategy cannot rely solely on on-device models. The full version of Google's Gemini model contains trillions of parameters—a key measure of an AI model's complexity—and demands immense computing power. Sources indicate that even Apple's own Private Cloud Compute server architecture, which uses the same custom chips as the Mac, would struggle to run the full Gemini model.

Several former Apple engineers believe the company will still need to rely on Google's cloud infrastructure for some new Siri features. Despite this, Apple is exploring solutions that balance cloud-based AI services with high-level privacy protection. According to sources familiar with the collaboration, Apple's recent approval to use Nvidia's confidential computing system within Google Cloud for some complex Gemini-based computations is one such attempt.

Confidential computing is a security technology in Nvidia GPUs that keeps data encrypted throughout processing by the AI model. Enabling this feature slightly reduces the speed of cloud-based AI command processing but helps Apple uphold its commitment to user privacy.

When Apple first introduced Apple Intelligence, it stated that any AI commands not processed locally on the device would be handled exclusively by its Private Cloud Compute system running on Apple's own chips. While this arrangement has now been adjusted, sources suggest Apple is likely to retain the "Private Cloud Compute" branding.

