JUST ENDED

Foundation Models Q&A

Connect with Apple engineers in the Foundation Models Q&A on the Apple Developer Forums.

Post

Replies

Boosts

Views

Activity

Siri without opening the app
Can App Intents perform authenticated backend calls (Bearer token in Keychain / App Group) and return structured results to Siri, or must execution always launch the host app first?
Replies
1
Boosts
0
Views
29
Activity
1d
On Protocol Extensibility & Multi-Modal Data
The Foundation Models framework is adding built-in OCR and barcode reader tools this year. If we implement a custom backend using the Language Model Protocol, can we return complex multi-modal objects (like bounding boxes or segmentation masks) back to the agentic flow, or is the protocol currently limited to text-based responses? For the 'Phone a Friend' pattern, is there a standard way to pass 'privacy-preserving embeddings' instead of raw text when calling a third-party model to maintain a higher level of user data protection?
Replies
1
Boosts
0
Views
15
Activity
1d
Privacy, personalization, and App Store expectations
We offer cloud-based AI (subscription) and are exploring on-device Apple Intelligence features. What user profile data is appropriate to inject into on-device model sessions under Apple’s privacy guidelines, and how should apps disclose hybrid cloud + on-device AI in privacy nutrition labels and review?
Replies
1
Boosts
0
Views
19
Activity
1d
Summarization that must not hallucinate numbers
What’s Apple’s guidance for using on-device models to turn structured JSON (time series, metrics, units) into a one-line natural-language summary without inventing values?
Replies
1
Boosts
0
Views
18
Activity
1d
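One developer-side guard for the question above is to check a generated summary against the source data before showing it. The following is a minimal sketch in plain Swift, not Apple guidance: `numericTokens(in:)` and `summaryUsesOnlySourceNumbers(_:sourceValues:)` are hypothetical helper names, and the sketch assumes the app has already decoded the JSON into an array of `Double` values. It does not call the Foundation Models framework at all.

```swift
import Foundation

/// Returns every numeric token found in `text`, e.g. "72", "95.5", "-4".
/// Note: a hyphen directly before a digit is read as a minus sign.
func numericTokens(in text: String) -> [String] {
    let pattern = #"-?\d+(?:\.\d+)?"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let range = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: range).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

/// True when every number in `summary` matches some value in `sourceValues`.
/// (Hypothetical helper; the tolerance is an illustrative choice.)
func summaryUsesOnlySourceNumbers(_ summary: String, sourceValues: [Double]) -> Bool {
    numericTokens(in: summary).allSatisfy { token in
        guard let value = Double(token) else { return false }
        return sourceValues.contains { abs($0 - value) < 1e-9 }
    }
}
```

A summary that fails the check could be regenerated or replaced with a templated sentence. The check only catches numbers that are absent from the source; it does not verify that a correct number is attached to the right metric or unit.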
Using FoundationModels framework in Extensions
LLMs are known for heavy RAM use. Does this mean we effectively can't use FoundationModels in extensions such as MessageFilterExtension? I assume the system would kill the extension before we even get a response.
Replies
2
Boosts
0
Views
70
Activity
1d
On Agentic Testing & Accessibility
Since agents in Xcode 27 can now interact with the accessibility tree and screenshots, can we provide 'developer hints' in our code to help the agent distinguish between decorative UI and critical interactive elements during automated testing? Can the Evaluations framework be used to 'score' the efficiency of an agent’s navigation path through the app, helping us identify where our App Intents might be creating confusing or redundant loops for Apple Intelligence?
Replies
0
Boosts
0
Views
29
Activity
1d
React Native + native AI bridge
What’s the supported integration path for Foundation Models and Apple Intelligence from a React Native app — thin Swift native module, App Intents only, or are these features effectively Swift-first?
Replies
2
Boosts
0
Views
22
Activity
1d
In-app text input vs system speech paths
If users dictate into a standard TextField via the keyboard mic instead of a dedicated in-app record button, does that text still benefit from App Intents entity resolution and indexed entities — or is keyboard dictation a separate pipeline where we lose domain vocabulary unless the user invokes Siri directly?
Replies
0
Boosts
1
Views
9
Activity
1d
On Performance & Backgrounding
While we now know about the continued-processing.gpu entitlement for background tasks, is there a similar NPU-specific entitlement or priority flag to ensure that an on-device foundation model isn't preempted by system-level Apple Intelligence features while the app is in the background?
Replies
1
Boosts
0
Views
18
Activity
1d
On-device model capabilities, limits, and versioning
What is the context window of the on-device model (AFM 3 Core Advanced and the 3B Core), and how should developers handle prompts that exceed it — automatic truncation, error, or developer-managed chunking? For guided/structured generation into typed Swift values, what are the limits on schema complexity (nesting depth, enums, arrays, optionals), and what is the failure mode when the model cannot satisfy the schema? How deterministic and reliable is on-device tool calling under the Tool protocol — are there guarantees on argument validity, and a recommended pattern for validating/repairing tool arguments before execution? For the new image input: what are the constraints on resolution, image count per prompt, and formats, and does passing images change which device tiers or which model (on-device vs PCC) services the request? Since the on-device model ships and updates with the OS, how should developers detect the active model version at runtime and guard against behavioral drift between OS releases? Is there a pinning or capability-query API? What are the realistic latency and concurrency expectations on supported hardware, and is there a supported way to run multiple sessions or background inference without thermal/throttling penalties?
Replies
2
Boosts
0
Views
36
Activity
1d
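Of the three overflow strategies named in the question above, developer-managed chunking is the one an app can implement regardless of how the framework behaves. The following is a minimal sketch in plain Swift under stated assumptions: `chunk(_:maxCharacters:)` is a hypothetical helper, and character count is used as a crude stand-in for tokens because the question leaves the actual window size and tokenization open.

```swift
/// Splits `text` into whitespace-delimited chunks of at most `maxCharacters`.
/// A single word longer than the budget is kept whole in its own chunk,
/// so that chunk may exceed the budget.
func chunk(_ text: String, maxCharacters: Int) -> [String] {
    precondition(maxCharacters > 0, "budget must be positive")
    var chunks: [String] = []
    var current = ""
    for word in text.split(whereSeparator: \.isWhitespace) {
        // Length of `current` if this word were appended with a separating space.
        let candidateCount = current.isEmpty ? word.count : current.count + 1 + word.count
        if candidateCount > maxCharacters, !current.isEmpty {
            chunks.append(current)      // close the full chunk
            current = String(word)      // start the next one with this word
        } else {
            current += current.isEmpty ? String(word) : " " + String(word)
        }
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```

Each chunk could then be summarized in its own session and the partial summaries combined in a final pass. The right budget depends on the real context window, which this sketch does not know.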
Mixed languages and foreign proper nouns
If the user’s device language is French but they speak English, or they use one language for the sentence and another for proper nouns, how does Siri handle transcription and entity resolution? Do we need per-locale entity indexing, aliases, or can semantic indexing work across languages?
Replies
0
Boosts
0
Views
16
Activity
1d
Speech recognition with large, dynamic vocabularies
Our users speak proper nouns and domain terms (place names, product jargon) that change frequently. What’s the best practice for improving recognition accuracy: dynamic contextual strings, on-device custom language resources, periodic vocabulary sync, or something else in the current Speech APIs?
Replies
1
Boosts
0
Views
18
Activity
1d
Creating an in-universe AI computer in my app
Last year, after Apple's Foundation Models framework was introduced, I began working on a separate test Playground project to see how to use the framework to create an AI computer in my app that only has knowledge of in-universe content from within my app. Now that the OS 27 updates are released, I’m going back to work on that. I believe I can use the on-device system foundation model comfortably because I don’t think there’s a lot of content in my app that the AI has to know about. Do you have any advice for using instructions to keep the model within the knowledge boundaries of my app universe, or are there new tools in this year's Foundation Models framework that might help me achieve the limited knowledge scope I want the AI to recognize and respond to for my app users?
Replies
1
Boosts
0
Views
32
Activity
1d
Guidance Around PCC
If a developer is eligible for Private Cloud Compute and then crosses the threshold, what happens to PCC calls? Is there a paid PCC program to fall back on, or does a developer need to have already built another model into their app, waiting in the wings to take over once that threshold is reached?
Replies
4
Boosts
1
Views
68
Activity
1d
Dynamic profile switching
When using Dynamic Profiles to switch between the on-device model and Private Cloud Compute mid-session, how is the context window reconciled — if I build up context on PCC (larger window) and then route a turn back to the on-device model, what happens to the entries that exceed the on-device window? — Divya Ravi, Senior iOS Engineer
Replies
1
Boosts
0
Views
54
Activity
1d
The standalone Siri app and cross-surface continuity
The new standalone Siri app keeps conversation history synced via iCloud across iPhone, iPad, and Mac. Can third-party content, results, or an app's agent surface appear inside the Siri app (e.g., as referenced sources or follow-up actions), and can the user deep-link from a Siri-app result back into the originating app with state intact? Is any conversation context from the Siri app exposed to a developer's intent when an action is invoked, so the app can act with the relevant context, and what are the privacy boundaries on that? When the same action is invoked from different surfaces (in-app, system Siri, the Siri app) and across synced devices, how should developers reason about execution location and idempotency to avoid duplicate side effects?
Replies
0
Boosts
0
Views
5
Activity
1d
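On the duplicate side effects raised in the question above, a common general technique is to attach an idempotency key to each logical request and skip work already done for that key. The following is a minimal in-process sketch in plain Swift: `IdempotencyLedger` and `perform(requestID:action:)` are hypothetical names, not part of App Intents, and the sketch assumes the caller can supply a stable `UUID` per user request.

```swift
import Foundation

/// Remembers which request IDs already ran so a repeated invocation is a no-op.
/// In-memory only: this does not persist across launches or sync across devices.
final class IdempotencyLedger {
    private var completed: [UUID: String] = [:]
    private let lock = NSLock()

    /// Runs `action` once per `requestID`; repeated calls return the first result.
    func perform(requestID: UUID, action: () -> String) -> String {
        lock.lock()
        defer { lock.unlock() }
        if let previous = completed[requestID] { return previous }
        let result = action()
        completed[requestID] = result
        return result
    }
}
```

For actions that can arrive from several synced devices, the same key would have to be checked where the side effect actually happens, typically on the server, since a local ledger cannot see invocations made elsewhere.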
Foundation Models framework — the unified API for third-party cloud providers
The 2026 framework lets apps call cloud models like Claude and Gemini (or "any provider that conforms to Apple's Language Model protocol") through the same Swift API as the on-device model. What exactly must a provider implement to conform to the Language Model protocol, and can developers register a custom/self-hosted endpoint and their own API keys, or is routing limited to an Apple-curated provider list? Does the unified API normalize provider-specific capabilities — tool/function calling formats, system-prompt handling, streaming tokens, JSON/structured output, multi-turn state — or do these degrade to a lowest common denominator across providers? When a request is routed to a third-party cloud model, what is the data path and privacy boundary? Does it transit Private Cloud Compute, or go direct to the provider, and what is disclosed to the user about where their prompt is processed? If an app supplies a conforming provider, does that provider become selectable by Siri AI for system actions, or is custom-provider routing confined to in-app LanguageModelSession use only? With the framework slated to open-source this summer, will the provider/protocol surface be stable enough to build against now, or should developers expect breaking changes between the beta and the open-source release?
Replies
1
Boosts
0
Views
37
Activity
1d
Hybrid assistant architecture (on-device model + server tools)
We run a conversational assistant where answers depend on live API data, not just static knowledge. What is Apple’s recommended split between on-device Foundation Models (intent, routing, summarization, privacy-sensitive context) and server-side tool execution? Is there an official pattern for a local planner with a remote executor?
Replies
0
Boosts
0
Views
8
Activity
1d
Private Cloud Compute trust model across multiple cloud vendors
Reports indicate PCC now extends to NVIDIA hardware in Google Cloud datacenters, and the flagship cloud model is refined using Gemini outputs. Now that PCC spans infrastructure outside Apple's own datacenters, what attestation or verifiable transparency is available to developers and users about where a given request was processed, and do the original "data unreachable even by Apple" guarantees hold unchanged across all hardware vendors? For apps with enterprise or regulated users, is there documented data residency behavior for PCC and for third-party model routing, and any contractual/compliance posture (e.g., regional pinning) developers can rely on? Given the EU and China availability gaps at launch, what is the recommended graceful-degradation path for apps that must function in those regions — fall back to on-device only, to a developer-supplied provider, or disable AI features? Does routing to a third-party cloud provider through the framework carry the same PCC privacy guarantees, or are those guarantees specific to Apple's own cloud models?
Replies
0
Boosts
0
Views
6
Activity
1d
Spotlight semantic index & entity schemas — privacy and dynamic/remote content
Entity schemas add app content to the Spotlight semantic index so Siri can find information inside apps. Is the semantic index built and stored entirely on-device, or is any indexed entity content transmitted to Apple or to Private Cloud Compute for embedding/retrieval? How should developers index content that does not live on the device — data that resides on a remote server or is fetched on demand? Is there a provider/just-in-time pattern, or must entities be materialized locally first? What is the freshness/update latency of the index when entities change frequently, and what are the practical limits on entity count and update rate before indexing is throttled? What controls exist to exclude sensitive entities from the semantic index or from Siri's personal-context reach, on a per-entity or per-field basis? How is indexed app content scoped per user/account on shared or multi-account devices, and is it cleared on sign-out?
Replies
0
Boosts
0
Views
27
Activity
1d