gromgrom’s Profile | Apple Developer Forums

gromgrom

Sandboxed network permissions on macOS

Are there specific Entitlements (com.apple.security.temporary-exception.files.absolute-path.read-write or network exceptions) required to allow App Intents to talk to local UNIX sockets or loopback interfaces (127.0.0.1) without triggering sandbox violations?

Machine Learning & AI App Intents Siri and Voice Apple Intelligence App Sandbox

Replies: 1
Boosts: 0
Views: 137
Activity: 1w

Rich SwiftUI Rendering in the Siri Overlay

For intents that return complex markdown, code snippets, or structured layouts, what are the rendering capabilities and resource constraints of custom SwiftUI views returned via Snippet inside the Siri voice overlay? Are interactive controls (like buttons, copy-to-clipboard, or scroll views) supported inside the Siri-overlay SwiftUI container?

Machine Learning & AI App Intents Siri and Voice SiriKit App Intents Apple Intelligence

Replies: 1
Boosts: 0
Views: 234
Activity: 2w

Confirmation, permissions, and reversibility for agentic actions

Apple demonstrated agentic behavior (e.g., the Passwords app changing credentials on the user's behalf), and Siri AI can now take systemwide actions in apps. Is there a first-class confirmation API for App Intents — a way to mark an action as requiring explicit user approval before execution, with a standard confirmation surface — or must developers build their own confirmation UI inside the intent? For irreversible or high-impact actions, what is Apple's recommended pattern to prevent the model from executing them autonomously, and can an intent declare a risk/sensitivity level the system respects? When Siri AI invokes an action, what authentication/authorization context is available to the intent (biometric gate, user-presence assertion), and how should an app require step-up auth for sensitive operations? Is there a supported audit trail for actions taken via Siri AI on the user's behalf, so an app can show the user what was done and when? How does the system handle an action that fails or partially completes during an agentic, multi-step flow?

Machine Learning & AI Foundation Models

Replies: 1
Boosts: 1
Views: 149
Activity: 2w

Visual Intelligence and screen/camera understanding for third-party apps

Visual Intelligence lets users ask Siri about what the camera or screen shows, and the screenshot tool can extract structured data into system apps. Can a third-party app contribute results or actions when the user invokes Visual Intelligence over the app's own content or a screenshot of it (analogous to how a schedule becomes calendar events), and what API surfaces that? For the Image Playground API, what are the content, rate, and style constraints, and can generated assets be used in commercial app contexts? Is there a supported way for an app to provide its own visual understanding to the system rather than relying solely on Apple's model — for domain-specific imagery the on-device model may not recognize?

Machine Learning & AI Foundation Models

Replies: 1
Boosts: 0
Views: 101
Activity: 2w

On-device model capabilities, limits, and versioning

What is the context window of the on-device model (AFM 3 Core Advanced and the 3B Core), and how should developers handle prompts that exceed it — automatic truncation, error, or developer-managed chunking? For guided/structured generation into typed Swift values, what are the limits on schema complexity (nesting depth, enums, arrays, optionals), and what is the failure mode when the model cannot satisfy the schema? How deterministic and reliable is on-device tool calling under the Tool protocol — are there guarantees on argument validity, and a recommended pattern for validating/repairing tool arguments before execution? For the new image input: what are the constraints on resolution, image count per prompt, and formats, and does passing images change which device tiers or which model (on-device vs PCC) services the request? Since the on-device model ships and updates with the OS, how should developers detect the active model version at runtime and guard against behavioral drift between OS releases? Is there a pinning or capability-query API? What are the realistic latency and concurrency expectations on supported hardware, and is there a supported way to run multiple sessions or background inference without thermal/throttling penalties?

Machine Learning & AI Foundation Models

Replies: 2
Boosts: 0
Views: 72
Activity: 2w
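
For the prompt-overflow case raised above, developer-managed chunking might look like the following minimal sketch. It assumes a character budget as a rough stand-in for the model's token limit (the real limit and tokenizer are not stated here), and `chunk(_:maxCharacters:)` is a hypothetical helper, not part of any Apple framework.

```swift
import Foundation

/// Hypothetical helper: splits text into chunks of at most `maxCharacters`,
/// breaking on whitespace so words stay whole where possible.
/// The character budget is an assumed proxy for a token limit.
func chunk(_ text: String, maxCharacters: Int) -> [String] {
    precondition(maxCharacters > 0, "budget must be positive")
    var chunks: [String] = []
    var current = ""
    for word in text.split(whereSeparator: { $0.isWhitespace }) {
        var piece = String(word)
        // A single word longer than the budget is hard-split.
        while piece.count > maxCharacters {
            if !current.isEmpty {
                chunks.append(current)
                current = ""
            }
            chunks.append(String(piece.prefix(maxCharacters)))
            piece = String(piece.dropFirst(maxCharacters))
        }
        // Length of `current` if `piece` were appended with a separating space.
        let needed = current.isEmpty ? piece.count : current.count + 1 + piece.count
        if needed > maxCharacters {
            chunks.append(current)
            current = piece
        } else {
            current = current.isEmpty ? piece : current + " " + piece
        }
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```

Each chunk could then be sent in its own request, with the partial results combined by the app. Whether the framework truncates or throws on overflow is still the open question here.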

The standalone Siri app and cross-surface continuity

The new standalone Siri app keeps conversation history synced via iCloud across iPhone, iPad, and Mac. Can third-party content, results, or an app's agent surface appear inside the Siri app (e.g., as referenced sources or follow-up actions), and can the user deep-link from a Siri-app result back into the originating app with state intact? Is any conversation context from the Siri app exposed to a developer's intent when an action is invoked, so the app can act with the relevant context, and what are the privacy boundaries on that? When the same action is invoked from different surfaces (in-app, system Siri, the Siri app) and across synced devices, how should developers reason about execution location and idempotency to avoid duplicate side effects?

Machine Learning & AI Foundation Models

Replies: 0
Boosts: 0
Views: 17
Activity: 2w
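
On the idempotency question above, one app-side guard is to key each side effect with a caller-supplied identifier and run it at most once. This is a minimal sketch under stated assumptions: `IdempotencyLedger` and the key format are hypothetical, the ledger is in-memory only, and nothing here is an App Intents API. A real app would need to persist and sync the keys for this to hold across devices.

```swift
import Foundation

/// Hypothetical in-memory ledger: runs a side effect at most once per key.
/// Keys are not persisted or synced in this sketch.
final class IdempotencyLedger {
    private let lock = NSLock()
    private var results: [String: String] = [:]

    /// Runs `action` only if `key` is new; otherwise returns the stored result.
    /// The lock is held while `action` runs, so `action` must not call back
    /// into the same ledger.
    func perform(key: String, action: () throws -> String) rethrows -> String {
        lock.lock()
        defer { lock.unlock() }
        if let previous = results[key] { return previous }
        let result = try action()
        results[key] = result
        return result
    }
}

// Usage: the second invocation with the same key does not repeat the side effect.
let ledger = IdempotencyLedger()
var sends = 0
let first = ledger.perform(key: "send-invoice-42") { sends += 1; return "sent" }
let second = ledger.perform(key: "send-invoice-42") { sends += 1; return "sent again" }
```

Because a throwing action stores nothing, a failed attempt can be retried with the same key.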

Foundation Models framework — the unified API for third-party cloud providers

The 2026 framework lets apps call cloud models like Claude and Gemini (or "any provider that conforms to Apple's Language Model protocol") through the same Swift API as the on-device model. What exactly must a provider implement to conform to the Language Model protocol, and can developers register a custom/self-hosted endpoint and their own API keys, or is routing limited to an Apple-curated provider list? Does the unified API normalize provider-specific capabilities — tool/function calling formats, system-prompt handling, streaming tokens, JSON/structured output, multi-turn state — or do these degrade to a lowest common denominator across providers? When a request is routed to a third-party cloud model, what is the data path and privacy boundary? Does it transit Private Cloud Compute, or go direct to the provider, and what is disclosed to the user about where their prompt is processed? If an app supplies a conforming provider, does that provider become selectable by Siri AI for system actions, or is custom-provider routing confined to in-app LanguageModelSession use only? With the framework slated to open-source this summer, will the provider/protocol surface be stable enough to build against now, or should developers expect breaking changes between the beta and the open-source release?

Machine Learning & AI Foundation Models

Replies: 1
Boosts: 0
Views: 115
Activity: 2w

Private Cloud Compute trust model across multiple cloud vendors

Reports indicate PCC now extends to NVIDIA hardware in Google Cloud datacenters, and the flagship cloud model is refined using Gemini outputs. Now that PCC spans infrastructure outside Apple's own datacenters, what attestation or verifiable transparency is available to developers and users about where a given request was processed, and do the original "data unreachable even by Apple" guarantees hold unchanged across all hardware vendors? For apps with enterprise or regulated users, is there documented data residency behavior for PCC and for third-party model routing, and any contractual/compliance posture (e.g., regional pinning) developers can rely on? Given the EU and China availability gaps at launch, what is the recommended graceful-degradation path for apps that must function in those regions — fall back to on-device only, to a developer-supplied provider, or disable AI features? Does routing to a third-party cloud provider through the framework carry the same PCC privacy guarantees, or are those guarantees specific to Apple's own cloud models?

Machine Learning & AI Foundation Models

Replies: 0
Boosts: 0
Views: 14
Activity: 2w

Spotlight semantic index & entity schemas — privacy and dynamic/remote content

Entity schemas add app content to the Spotlight semantic index so Siri can find information inside apps. Is the semantic index built and stored entirely on-device, or is any indexed entity content transmitted to Apple or to Private Cloud Compute for embedding/retrieval? How should developers index content that does not live on the device — data that resides on a remote server or is fetched on demand? Is there a provider/just-in-time pattern, or must entities be materialized locally first? What is the freshness/update latency of the index when entities change frequently, and what are the practical limits on entity count and update rate before indexing is throttled? What controls exist to exclude sensitive entities from the semantic index or from Siri's personal-context reach, on a per-entity or per-field basis? How is indexed app content scoped per user/account on shared or multi-account devices, and is it cleared on sign-out?

Machine Learning & AI Foundation Models

Replies: 0
Boosts: 0
Views: 66
Activity: 2w

App Intents — exposing conversational and agentic actions to Siri AI

App Intents now connect app content and actions to Apple Intelligence, and Siri AI can take action directly inside third-party apps without fixed trigger phrases. Can an app expose a single conversational/agent-style entry point to Siri AI, or must all capabilities be modeled as discrete intents? If discrete, how does Siri AI chain multiple intents to fulfill a compound natural-language request? What is the supported pattern for long-running or asynchronous intents — actions that acknowledge immediately but complete and return a result seconds or minutes later? Is there a progress/continuation/callback model? How are an intent's results rendered — inline in the Siri app, via a snippet/App Intent UI, or by deep-linking into the app? What control do developers have over that presentation? For intents whose parameters are ambiguous, what disambiguation and follow-up affordances does Siri AI provide, and can developers supply candidate resolutions dynamically at runtime? Is there an eligibility or review process for apps to participate in systemwide Siri AI actions, beyond simply adopting App Intents?

Machine Learning & AI Foundation Models

Replies: 0
Boosts: 0
Views: 36
Activity: 2w

Spoken Locale Exposure (Dynamic Language Routing)

Does the App Intents framework expose the user's active spoken Siri locale (e.g., ja-JP, fr-FR) directly within the perform() context, or must the extension rely on the system's global locale setting? If a user switches Siri's language dynamically, how is that locale string propagated to the intent execution block?

Machine Learning & AI App Intents Siri and Voice App Intents

Replies: 1
Boosts: 0
Views: 98
Activity: 2w

Dynamic Runtime App Shortcuts

Can App Shortcuts (and their trigger phrases) be generated or translated dynamically at runtime based on user-defined configurations (e.g. download-on-demand language models or custom voice aliases), or must all voice shortcuts and translations be statically declared in the application's compiled String Catalogs (.xcstrings)?

Machine Learning & AI App Intents Siri and Voice SiriKit App Intents Apple Intelligence

Replies: 0
Boosts: 2
Views: 156
Activity: 2w

Voice-Streaming & Text-to-Speech (TTS) Latency

How does the App Intents framework support streaming spoken voice (TTS) output for long-form text responses? Is there an API (such as an asynchronous sequence or buffer stream) that allows Siri to begin speaking a response while the underlying AI engine is still generating the remainder of the text?

Machine Learning & AI App Intents Siri and Voice App Intents

Replies: 0
Boosts: 0
Views: 79
Activity: 2w

Sandbox-Bypassing IPC between App Intents and Launchd Daemons on macOS

We are designing a macOS utility that runs a local background agent via launchd (managing a local SQLite database and Unix socket). We want to expose controls (start/stop, status checks, CLI command invocation) to Siri via the App Intents framework. Since App Intents typically execute within a sandboxed App Extension or a sandboxed App wrapper container:
1. What is the recommended IPC mechanism (e.g., Unix domain sockets, local HTTP/TCP ports, XPC) to securely communicate between a sandboxed App Intent extension and a non-sandboxed launchd helper daemon on macOS?
2. Are there specific Entitlements (com.apple.security.temporary-exception.files.absolute-path.read-write or network exceptions) required to allow App Intents to talk to local UNIX sockets or loopback interfaces (127.0.0.1) without triggering sandbox violations?
3. Can an App Intent directly invoke a helper command-line tool or launch a plist-configured service without bringing up the main application UI?

Machine Learning & AI App Intents

Replies: 2
Boosts: 0
Views: 124
Activity: 2w

Semantic Voice Search and Natural Language Parameter Resolution

How does next-gen Siri handle semantic or semantic-adjacent parameter resolution when using EntityQuery? For example, if a user says 'Siri, show memory tags related to auth errors' and the exact tag is auth-handler-failure:
1. Does the App Intents framework support passing the raw query string (from natural language) directly to our search/recall logic within the intent, or does it try to map it strictly to pre-registered AppEntity lists?
2. How do we configure an AppEntity to support synonyms or regex-like string match resolution directly within Siri's offline/online parsing?

Machine Learning & AI App Intents

Replies: 1
Boosts: 0
Views: 85
Activity: 2w
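
If the raw query string does reach the app (the open question above), the app-side matching for the "auth errors" example could be sketched as below. The `aliases` table and `matchTags(query:tags:)` are hypothetical and plain Swift. They are not App Intents APIs and say nothing about how Siri itself resolves parameters.

```swift
import Foundation

/// Hypothetical alias table: maps spoken words to tokens used in tag names.
let aliases: [String: Set<String>] = [
    "errors": ["failure", "error"],
    "error": ["failure", "error"],
    "auth": ["auth", "authentication"],
]

/// Ranks tags by how many query words (or their aliases) appear among the
/// hyphen-separated tokens of each tag. Tags with no match are dropped.
func matchTags(query: String, tags: [String]) -> [String] {
    let words = query.lowercased()
        .split(whereSeparator: { !$0.isLetter && !$0.isNumber })
        .map(String.init)
    let scored = tags.map { tag -> (tag: String, score: Int) in
        let tokens = Set(tag.lowercased().split(separator: "-").map(String.init))
        let score = words.filter { word in
            !tokens.isDisjoint(with: aliases[word] ?? [word])
        }.count
        return (tag, score)
    }
    return scored.filter { $0.score > 0 }
        .sorted { $0.score > $1.score }
        .map { $0.tag }
}

// Usage with the example from the question.
let matches = matchTags(
    query: "auth errors",
    tags: ["auth-handler-failure", "ui-layout", "auth-token-refresh"]
)
```

Here `auth-handler-failure` ranks first because both query words match it through the alias table, while `auth-token-refresh` matches only on "auth".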