What's so amazing about the new AI, 'Apple Foundation Models'? It's a groundbreaking system that runs a 20 billion parameter multimodal model, 'AFM 3 Core Advanced,' on an iPhone.



Apple collaborated with Google to develop 'Apple Foundation Models,' the foundational models for next-generation Apple Intelligence. Apple has provided a detailed explanation of how these models are designed.

Introducing the Third Generation of Apple's Foundation Models - Apple Machine Learning Research
https://machinelearning.apple.com/research/introducing-third-generation-of-apple-foundation-models

Recap the WWDC26 Platforms State of the Union - YouTube


At the core of the next-generation Apple Intelligence, which Apple announced at its annual developer conference 'WWDC26' in June 2026, are the Apple Foundation Models, which consist of five foundational models developed in collaboration with Google. The Apple Foundation Models introduced this time are the 'third generation.'



Apple states that 'Apple Foundation Models are designed to deliver a wide range of useful experiences to users, including an all-new Siri and intelligent tools that make everyday apps smarter and more convenient.'

Two of the five Apple Foundation Models are on-device models.
• AFM 3 Core: A dense model with 3 billion parameters.
• AFM 3 Core Advanced: Apple's most powerful on-device model, natively multimodal, enabling convenient features such as expressive voice and high-precision voice input.

The remaining three are server-based models.
• AFM 3 Cloud: The flagship server-side model, optimized for speed, efficiency, and performance.
• ADM 3 Cloud (Image): A model designed for image generation and editing, enabling advanced photo editing tools, a completely new Image Playground, and more.
• AFM 3 Cloud Pro: Apple's most powerful server-based model, supporting the most demanding applications such as agent-based tools and complex inference.

AFM 3 Core, AFM 3 Core Advanced, AFM 3 Cloud, and ADM 3 Cloud are all specifically designed for Apple Silicon.

Regarding AFM 3 Cloud Pro, Google and NVIDIA have collaborated to ensure high performance while protecting user privacy.

Apple's privacy-focused AI servers run on NVIDIA GPUs - GIGAZINE



Let's take a closer look at each of them.

AFM 3 Core Advanced, the highest-performing of the on-device models, uses a 'sparse active architecture' built on a technique called Instruction-Following Pruning (IFP). Neural network architectures include 'dense models,' which use all of the model's parameters for every input, and 'sparse models,' which use only a subset. In either case, all weights must be held in working memory (DRAM), and this large memory footprint has limited how far models can scale on consumer hardware.
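To make the DRAM constraint concrete, here is a rough back-of-the-envelope calculation. It uses the 20 billion and 3 billion parameter counts from the article. The bit widths are assumptions chosen for illustration, since the article does not say what precision Apple stores weights in, and the helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Memory needed to hold every weight at once, in GiB (weights only)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Bit widths below are illustrative assumptions, not Apple's figures.
    for label, params in (("20B model", 20e9), ("3B model", 3e9)):
        for bits in (16, 8, 4):
            gib = weight_memory_gib(params, bits)
            print(f"{label} at {bits}-bit weights: {gib:.1f} GiB")
```

Even under aggressive quantization, holding all 20 billion weights in memory takes several times the space of a 3-billion-parameter model, which is the gap the architecture described next is meant to close.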

The 'sparse active architecture' addresses this problem: instead of holding the entire model in DRAM, it keeps the complete model in flash storage (NAND) and loads only the parts that are needed. Because NAND-to-DRAM bandwidth is too low for the per-token weight swapping that standard Mixture of Experts (MoE) models require, AFM 3 Core Advanced decides which parameters to activate based on the prompt, fixes that selection, and loads those parameters into DRAM. In effect, the 20-billion-parameter model runs as a dense model with 1 to 4 billion parameters. Apple stated that this 'minimizes latency while enabling model scales far exceeding traditional DRAM limitations.'
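The difference between per-token and per-prompt selection can be illustrated with a toy simulation. This is only a sketch of the idea as described above, not Apple's IFP implementation: the block counts, the random scoring function standing in for a learned predictor, and all names (`select_blocks`, `count_flash_loads`) are hypothetical.

```python
import random

NUM_BLOCKS = 20     # hypothetical: weight blocks kept in flash (NAND)
BLOCKS_IN_DRAM = 4  # hypothetical: blocks that fit in the DRAM budget


def select_blocks(key: str, k: int = BLOCKS_IN_DRAM) -> set[int]:
    """Pick the k highest-scoring blocks for a routing key.

    A seeded random score stands in for a learned relevance predictor.
    """
    rng = random.Random(key)  # deterministic for a given key
    scores = [rng.random() for _ in range(NUM_BLOCKS)]
    ranked = sorted(range(NUM_BLOCKS), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def count_flash_loads(prompt: str, num_tokens: int, per_token: bool) -> int:
    """Count blocks copied from flash to DRAM while generating tokens."""
    resident: set[int] = set()
    loads = 0
    for step in range(num_tokens):
        # Per-token routing re-decides every step; per-prompt routing
        # makes one decision from the prompt and keeps it fixed.
        key = f"{prompt}#{step}" if per_token else prompt
        needed = select_blocks(key)
        loads += len(needed - resident)  # only missing blocks hit flash
        resident = needed
    return loads


if __name__ == "__main__":
    prompt = "Summarize my unread mail"
    print("per-token routing loads:", count_flash_loads(prompt, 50, True))
    print("per-prompt routing loads:", count_flash_loads(prompt, 50, False))
```

With the selection fixed per prompt, flash is read once up front and every later token runs entirely out of DRAM, which is why the result behaves like a small dense model.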

The server-side model, AFM 3 Cloud, has achieved significant advancements in multimodal inference capabilities, supported by Apple's privacy-protecting cloud system, Private Cloud Compute. This includes improved training stability and enhanced ability to infer and accurately recall information within the context window for complex server-side queries.

The ADM 3 Cloud image model was developed to provide high control and parameter efficiency for high-quality image generation, editing, and Genmoji. This model supports different aspect ratios and resolutions and leverages the entire Apple Foundation Model family for both generation and editing.

The following is an example of native image generation using ADM 3 Cloud.



While the specific relationship between these models and Gemini remains unclear, technology media outlet MacStories suggests that 'reading between the lines, we speculate that AFM Cloud is based on Gemini 3.1 Flash-Lite, AFM Cloud Pro on Gemini 3.5 Flash, and ADM Cloud on Nano Banana Pro (Gemini 3 Pro Image),' and is calling for further information.

With the Apple Foundation Models framework now publicly available, developers can create apps using a variety of features. Google has integrated Gemini models into the Apple Foundation Models framework, allowing Apple models on devices and cloud-hosted Gemini models to work through a common API interface. This makes it easy to switch between local and cloud inference depending on the use case.
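The idea of a common interface over local and cloud models can be sketched as follows. This does not use the real Foundation Models framework or any Gemini API; every class and function name here (`TextModel`, `OnDeviceModel`, `CloudModel`, `pick_model`, `summarize`) is a hypothetical stand-in, and the backends are stubs that only echo the prompt.

```python
from dataclasses import dataclass
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical shared interface: any backend that can answer a prompt."""

    def respond(self, prompt: str) -> str: ...


@dataclass
class OnDeviceModel:
    """Stub standing in for a model that runs locally."""

    name: str = "on-device stub"

    def respond(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class CloudModel:
    """Stub standing in for a cloud-hosted model."""

    name: str = "cloud stub"

    def respond(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


def pick_model(needs_heavy_reasoning: bool) -> TextModel:
    """Choose a backend per use case; the criterion here is illustrative."""
    return CloudModel() if needs_heavy_reasoning else OnDeviceModel()


def summarize(model: TextModel, text: str) -> str:
    """Application code depends only on the shared interface."""
    return model.respond(f"Summarize: {text}")


if __name__ == "__main__":
    print(summarize(pick_model(False), "meeting notes"))
    print(summarize(pick_model(True), "a 200-page contract"))
```

Because `summarize` only sees the shared interface, switching between local and cloud inference is a one-line change at the call site rather than a rewrite of the feature.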

Additionally, you can call Gemini from Apple's development environment, Xcode, to receive development support.

Gemini models for Apple developers
https://blog.google/innovation-and-ai/technology/developers-tools/bringing-gemini-models-to-apple-developers/

On macOS, you can use the Foundation Models SDK for Python to integrate with commonly used tools and evaluation packages in the Python ecosystem.

Build AI-powered scripts with the fm CLI and Python SDK - WWDC26 - Videos - Apple Developer
https://developer.apple.com/videos/play/wwdc2026/334/

In addition, Apple has released 'Core AI,' a framework designed as the optimal method for running on-device models within apps. Core AI is built into the OS and is said to maximize the performance of Apple Silicon.

Core AI | Apple Developer Documentation
https://developer.apple.com/documentation/coreai/

Marco Abis, who is developing 'Ziraph,' a local AI profiler for Apple Silicon, pointed out that 'Core AI's profiling tools report timing information, but they do not report energy, memory bandwidth, or thermal data, which are the key indicators of whether a product can actually ship. This is a major omission, given how strongly these metrics affect a device's performance.'




Apple states that these models are trained using a combination of publicly available information, data licensed or purchased from third parties, open-source data, data obtained through dedicated research, and synthetic data. Apple emphasizes its commitment to privacy, stating, 'We do not use our users' private personal data or user interactions to train our base models. We also respect web publishers' right to opt out of having their content used to train our base models.'

in AI, Video, Posted by log1p_kr