Apple Researchers Show How The Company Plans on Running AI Models On-Device


Apple will enter the artificial intelligence arena later than some of its rivals when it unveils its next-generation operating systems for iPhones, iPads, and Macs at its Worldwide Developers Conference (WWDC) on June 10. According to reports from Bloomberg, Apple is developing its own large language model (LLM) to power on-device generative AI features. Can an entire AI model run without any cloud-based processing? Apple’s researchers believe it’s possible.

On-Device AI:

In a paper titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory,” Apple researchers outline the company’s strategy for running hefty AI models on iPhones, iPads, MacBooks, and more. Instead of holding all of the model’s parameters in RAM, as is customary, the approach keeps them in the device’s flash storage and loads only the parts needed into DRAM on demand. The Verge reports that this makes AI models run faster and more efficiently; the paper itself reports inference speedups of 4-5x on CPU and 20-25x on GPU compared with naive loading. It also makes it possible to run LLMs up to twice the size of the available DRAM.
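The general idea of leaving parameters on storage and reading only what is needed can be illustrated with a small sketch. This is not Apple's method or code: the file layout, the toy matrix size, and the names `write_weights`, `load_rows`, and `demo` are all assumptions made for illustration, and a temporary file stands in for flash storage.

```python
import mmap
import os
import struct
import tempfile

ROW_WIDTH = 4   # floats per row (hypothetical toy size)
FLOAT_SIZE = 4  # bytes in one float32


def write_weights(path, rows):
    """Store a toy weight matrix on disk as packed little-endian float32 rows."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{ROW_WIDTH}f", *row))


def load_rows(path, row_ids):
    """Read only the requested rows; the rest of the file is never unpacked."""
    row_bytes = ROW_WIDTH * FLOAT_SIZE
    loaded = {}
    with open(path, "rb") as f:
        # Memory-map the file so the OS pages in only the regions we touch.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for i in row_ids:
                loaded[i] = struct.unpack_from(f"<{ROW_WIDTH}f", mm, i * row_bytes)
    return loaded


def demo():
    """Write an 8-row matrix to a temp file, then fetch just rows 1 and 6."""
    rows = [[float(r * 10 + c) for c in range(ROW_WIDTH)] for r in range(8)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_weights(path, rows)
        return load_rows(path, [1, 6])
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

In this toy version the caller decides which rows to fetch. Choosing which parameters are actually needed for a given input is the hard part that the paper addresses, and the sketch does not attempt it.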

Apple also recently published a family of models called OpenELM (Open-source Efficient Language Models) on the Hugging Face model library. The family comprises four compact language models small enough to run on devices like phones and PCs. They are aimed at text-related tasks such as text generation, summarization, and email composition.

Enhancing Siri with AI:

Apple intends to enhance its virtual assistant Siri with AI capabilities. According to The Verge, Apple researchers are exploring ways for Siri to operate without requiring a wake word. Instead of responding only to voice commands that start with “Hey Siri” or “Siri,” the virtual assistant could work out for itself whether the user is addressing it.

In their paper titled “MULTICHANNEL VOICE TRIGGER DETECTION BASED ON TRANSFORM-AVERAGE-CONCATENATE,” Apple researchers propose feeding seemingly unnecessary sounds to the AI model for processing, rather than discarding them outright. The idea is that the model itself learns to differentiate between relevant and irrelevant audio cues.

Once Siri is activated, Apple also wants the conversation itself to go more smoothly. Through a model named STEER (Semantic Turn Extension-Expansion Recognition), which uses an LLM, Apple seeks to decipher ambiguous queries. The system uses AI to pose clarifying questions to users, thereby gaining a better understanding of their needs. It is also designed to distinguish follow-up questions from entirely new prompts.