Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

macOS 26 Beta 2 - Foundation Models - Symbol not found
It seems like there was an undocumented change that made Transcript.init(entries: [Transcript.Entry] initializer private, which broke my application, which relies on (manual) reconstruction of Transcript entries. Worked fine on beta 1, on beta 2 there's this error dyld[72381]: Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC Referenced from: <44342398-591C-3850-9889-87C9458E1440> /Users/mika/experiments/apple-on-device-ai/fm Expected in: <66A793F6-CB22-3D1D-A560-D1BD5B109B0D> /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels Is this a part of an API transition, if so - Apple, please update your documentation
3
0
400
Jun ’25
Foundation Models performance reality check - anyone else finding it slow?
Testing Foundation Models framework with a health-focused recipe generation app. The on-device approach is appealing but performance is rough. Taking 20+ seconds just to get recipe name and description. Same content from Claude API: 4 seconds. I know it's beta and on-device has different tradeoffs, but this is approaching unusable territory for real-time user experience. The streaming helps psychologically but doesn't mask the underlying latency.The privacy/cost benefits are compelling but not if users abandon the feature before it completes. Anyone else seeing similar performance? Is this expected for beta, or are there optimization techniques I'm missing?
5
0
308
Jul ’25
Using Past Versions of Foundation Models As They Progress
Has Apple made any commitment to versioning the Foundation Models on device? What if you build a feature that works great on 26.0 but they change the model or guardrails in 26.1 and it breaks your feature, is your only recourse filing Feedback or pulling the feature from the app? Will there be a way to specify a model version like in all of the server based LLM provider APIs? If not, sounds risky to build on.
7
1
459
Jul ’25
A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
1
0
1.6k
Jun ’25
Real Time Text detection using iOS18 RecognizeTextRequest from video buffer returns gibberish
Hey Devs, I'm trying to create my own Real Time Text detection like this Apple project. https://developer.apple.com/documentation/vision/extracting-phone-numbers-from-text-in-images I want to use the new iOS18 RecognizeTextRequest instead of the old VNRecognizeTextRequest in my SwiftUI project. This is my delegate code with the camera setup. I removed region of interest for debugging but I'm trying to scan English words in books. The idea is to get one word in the ROI in the future. But I can't even get proper words so testing without ROI incase my math is wrong. @Observable class CameraManager: NSObject, AVCapturePhotoCaptureDelegate ... override init() { super.init() setUpVisionRequest() } private func setUpVisionRequest() { textRequest = RecognizeTextRequest(.revision3) } ... func setup() -> Bool { captureSession.beginConfiguration() guard let captureDevice = AVCaptureDevice.default( .builtInWideAngleCamera, for: .video, position: .back) else { return false } self.captureDevice = captureDevice guard let deviceInput = try? AVCaptureDeviceInput(device: captureDevice) else { return false } /// Check whether the session can add input. guard captureSession.canAddInput(deviceInput) else { print("Unable to add device input to the capture session.") return false } /// Add the input and output to session captureSession.addInput(deviceInput) /// Configure the video data output videoDataOutput.setSampleBufferDelegate( self, queue: videoDataOutputQueue) if captureSession.canAddOutput(videoDataOutput) { captureSession.addOutput(videoDataOutput) videoDataOutput.connection(with: .video)? .preferredVideoStabilizationMode = .off } else { return false } // Set zoom and autofocus to help focus on very small text do { try captureDevice.lockForConfiguration() captureDevice.videoZoomFactor = 2 captureDevice.autoFocusRangeRestriction = .near captureDevice.unlockForConfiguration() } catch { print("Could not set zoom level due to error: \(error)") return false } captureSession.commitConfiguration() // potential issue with background vs dispatchqueue ?? Task(priority: .background) { captureSession.startRunning() } return true } } // Issue here ??? extension CameraManager: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput( _ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection ) { guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } Task { textRequest.recognitionLevel = .fast textRequest.recognitionLanguages = [Locale.Language(identifier: "en-US")] do { let observations = try await textRequest.perform(on: pixelBuffer) for observation in observations { let recognizedText = observation.topCandidates(1).first print("recognized text \(recognizedText)") } } catch { print("Recognition error: \(error.localizedDescription)") } } } } The results I get look like this ( full page of English from a any book) recognized text Optional(RecognizedText(string: e bnUI W4, confidence: 0.5)) recognized text Optional(RecognizedText(string: ?'U, confidence: 0.3)) recognized text Optional(RecognizedText(string: traQt4, confidence: 0.3)) recognized text Optional(RecognizedText(string: li, confidence: 0.3)) recognized text Optional(RecognizedText(string: 15,1,#, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllÈ, confidence: 0.3)) recognized text Optional(RecognizedText(string: vtrll, confidence: 0.3)) recognized text Optional(RecognizedText(string: 5,1,: 11, confidence: 0.5)) recognized text Optional(RecognizedText(string: 1141, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllll ljiiilij41, confidence: 0.3)) recognized text Optional(RecognizedText(string: 2f4, confidence: 0.3)) recognized text Optional(RecognizedText(string: ktril, confidence: 0.3)) recognized text Optional(RecognizedText(string: ¥LLI, confidence: 0.3)) recognized text Optional(RecognizedText(string: 11[Itl,, confidence: 0.3)) recognized text Optional(RecognizedText(string: 'rtlÈ131, confidence: 0.3)) Even with ROI set to a specific rectangle Normalized to Vision, I get the same results with single characters returning gibberish. Any help would be amazing thank you. Am I using the buffer right ? Am I using the new perform(on: CVPixelBuffer) right ? Maybe I didn't set up my camera properly? I can provide code
1
0
370
Jul ’25
Automated Testing and Performance Validation for Foundation Models Framework
I've been successfully integrating the Foundation Models framework into my healthcare app using structured generation with @Generable schemas. While my initial testing (20-30 iterations) shows promising results, I need to validate consistency and reliability at scale before production deployment. Question Is there a recommended approach for automated, large-scale testing of Foundation Models responses? Specifically, I'm looking to: Automate 1000+ test iterations with consistent prompts and structured schemas Measure response consistency across identical inputs Validate structured output reliability (proper schema adherence, no generation failures) Collect performance metrics (TTFT, TPS) for optimization Specific Questions Framework Limitations: Are there any undocumented rate limits or thermal throttling considerations for rapid session creation/destruction? Performance Tools: Can Xcode's Foundation Models Instrument be used programmatically, or only through Instruments UI? Automation Integration: Any recommendations for integrating with testing frameworks? Session Reuse: Is it better to reuse a single LanguageModelSession or create fresh sessions for each test iteration? Use Case Context My wellness app provides medically safe activity recommendations based on user health profiles. The Foundation Models framework processes health context and generates structured recommendations for exercises, nutrition, and lifestyle activities. Given the safety implications of providing health-related guidance, I need rigorous validation to ensure the model consistently produces appropriate, well-formed recommendations across diverse user scenarios and health conditions. Has anyone in the community built similar large-scale testing infrastructure for Foundation Models? Any insights on best practices or potential pitfalls would be greatly appreciated.
1
0
220
Jul ’25
Initializing session with transcript ignores tools
When I initialize a session with an existing transcript using this initializer: public convenience init(model: SystemLanguageModel = .default, guardrails: LanguageModelSession.Guardrails = .default, tools: [any Tool] = [], transcript: Transcript) The tools get ignored. I noticed that when doing that, the model never use the tools. When inspecting the transcript, I can see that the instruction entry does not have any tools available to it. I tried this for both transcripts that already include an instruction entry and ones that don't - both yielding the same result.. Is this the intended behavior / am I missing something here?
1
0
230
Jul ’25
Cannot find type ToolOutput in scope
My sample app has been working with the following code: func call(arguments: Arguments) async throws -&gt; ToolOutput { var temp:Int switch arguments.city { case .singapore: temp = Int.random(in: 30..&lt;40) case .china: temp = Int.random(in: 10..&lt;30) } let content = GeneratedContent(temp) let output = ToolOutput(content) return output } However in 26 beta 5, ToolOutput no longer available, please advice what has changed.
3
0
259
Aug ’25
Various On-Device Frameworks API & ChatGPT
Posting a follow up question after the WWDC 2025 Machine Learning AI & Frameworks Group Lab on June 12. In regards to the on-device API of any of the AI frameworks (foundation model, vision framework, ect.), is there a response condition or path where the API outsources it's input to ChatGPT if the user has allowed this like Siri does? Ignore this if it's a no: is this handled behind the scenes or by the developer?
0
0
333
Jun ’25
Initializing LanguageModelSession crashes app on macOS
Whenever I try to initialize a LanguageModelSession (let session = LanguageModelSession()), my app crashes with EXC_BAD_ACCESS. SystemLanguageModel.default.availability returns available. I tried running the two sample projects I found that use Foundation Models, FoundationModelsTripPlanner and SwiftTranscriptionSampleApp, and they both also crash—immediately on launch. I commented out the Foundation Models logic from the SwiftTranscriptionSampleApp and ran it again, and it no longer crashed. I'm on macOS 26 Beta 4 on an M1 Pro device. I'm based in Austria (EU), if that matters.
7
4
380
Aug ’25
Proposal: Develop a Token Estimation Tool for Foundation Models
Dear Apple Foundation Models Development Team, I am a developer integrating Apple Foundation Models (AFM) into my app and encountered the exceededContextWindowSize error when exceeding the 4096-token limit. Proposal: I suggest Apple develop a tool to estimate the token count of a prompt before sending it to the model. This tool could be integrated into FoundationModels Framework for ease of use. Benefits: A token estimation tool would help developers manage the context window limit and optimize performance. I hope Apple considers this proposal soon. Thank you!
6
0
388
Aug ’25
Vision Framework - Testing RecognizeDocumentsRequest
How do I test the new RecognizeDocumentRequest API. Reference: https://www.youtube.com/watch?v=H-GCNsXdKzM I am running Xcode Beta, however I only have one primary device that I cannot install beta software on. Please provide a strategy for testing. Will simulator work? The new capability is critical to my application, just what I need for structuring document scans and extraction. Thank you.
1
0
363
Jun ’25
When applied to a nested struct, @Generable macro results in infinite nested response from Foundation Model
When the @Generable is applied toward a Swift struct declared within another struct, and when said nested struct is defined as the type of one of the properties of another @Generable type, which is in turn defined as the output format of Foundation Model session, Foundation Model can stuck in a loop trying to create a infinitely nested response, until the context window limit exceeded error is triggered. I have filed feedback FB19987191 with a demo project. Is this expected behavior?
1
0
584
Sep ’25
ML models failed to decrypt and load
We have suddenly encountered a serious issue: our local ML models are no longer being decrypted. Everything was set up according to the guide at https://developer.apple.com/documentation/coreml/generating-a-model-encryption-key and had been working in production, but yesterday we started receiving the following error: Error Domain=com.apple.CoreML Code=8 "Fetching decryption key from server failed: noEntryFound("No records found"). Make sure the encryption key was generated with correct team ID." UserInfo={NSLocalizedDescription=Fetching decryption key from server failed: noEntryFound("No records found"). Make sure the encryption key was generated with correct team ID.} We haven’t changed anything in our code. This started spontaneously affecting users of the release version as of yesterday. It also no longer works locally — we receive the same error at the moment the autogenerated function is called: class func load(configuration: MLModelConfiguration = MLModelConfiguration(), completionHandler handler: @escaping (Swift.Result<ZingPDModel, Error>) -> Void) I assume that I can generate a new key through Xcode, integrate it in place of the old one, and it might start working again. However, this won’t affect existing users until they update the app. Could the issue be on Apple’s infrastructure side?
1
0
397
Jul ’25
Is there anywhere to get precompiled WhisperKit models for Swift?
If try to dynamically load WhipserKit's models, as in below, the download never occurs. No error or anything. And at the same time I can still get to the huggingface.co hosting site without any headaches, so it's not a blocking issue. let config = WhisperKitConfig( model: "openai_whisper-large-v3", modelRepo: "argmaxinc/whisperkit-coreml" ) So I have to default to the tiny model as seen below. I have tried so many ways, using ChatGPT and others, to build the models on my Mac, but too many failures, because I have never dealt with builds like that before. Are there any hosting sites that have the models (small, medium, large) already built where I can download them and just bundle them into my project? Wasted quite a large amount of time trying to get this done. import Foundation import WhisperKit @MainActor class WhisperLoader: ObservableObject { var pipe: WhisperKit? init() { Task { await self.initializeWhisper() } } private func initializeWhisper() async { do { Logging.shared.logLevel = .debug Logging.shared.loggingCallback = { message in print("[WhisperKit] \(message)") } let pipe = try await WhisperKit() // defaults to "tiny" self.pipe = pipe print("initialized. Model state: \(pipe.modelState)") guard let audioURL = Bundle.main.url(forResource: "44pf", withExtension: "wav") else { fatalError("not in bundle") } let result = try await pipe.transcribe(audioPath: audioURL.path) print("result: \(result)") } catch { print("Error: \(error)") } } }
0
0
129
Jun ’25
CoreML Inference Acceleration
Hello everyone, I have a visual convolutional model and a video that has been decoded into many frames. When I perform inference on each frame in a loop, the speed is a bit slow. So, I started 4 threads, each running inference simultaneously, but I found that the speed is the same as serial inference, every single forward inference is slower. I used the mactop tool to check the GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
2
0
742
Sep ’25
Foundation Model - Change LLM
Almost everywhere else you see Apple Intelligence, you get to select whether it's on device, private cloud compute, or ChatGPT. Is there a way to do that via code in the Foundation Model? I searched through the docs and couldn't find anything, but maybe I missed it.
2
1
200
Jul ’25
macOS 26 Beta 2 - Foundation Models - Symbol not found
It seems like there was an undocumented change that made Transcript.init(entries: [Transcript.Entry] initializer private, which broke my application, which relies on (manual) reconstruction of Transcript entries. Worked fine on beta 1, on beta 2 there's this error dyld[72381]: Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC Referenced from: <44342398-591C-3850-9889-87C9458E1440> /Users/mika/experiments/apple-on-device-ai/fm Expected in: <66A793F6-CB22-3D1D-A560-D1BD5B109B0D> /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels Is this a part of an API transition, if so - Apple, please update your documentation
Replies
3
Boosts
0
Views
400
Activity
Jun ’25
Foundation Models performance reality check - anyone else finding it slow?
Testing Foundation Models framework with a health-focused recipe generation app. The on-device approach is appealing but performance is rough. Taking 20+ seconds just to get recipe name and description. Same content from Claude API: 4 seconds. I know it's beta and on-device has different tradeoffs, but this is approaching unusable territory for real-time user experience. The streaming helps psychologically but doesn't mask the underlying latency.The privacy/cost benefits are compelling but not if users abandon the feature before it completes. Anyone else seeing similar performance? Is this expected for beta, or are there optimization techniques I'm missing?
Replies
5
Boosts
0
Views
308
Activity
Jul ’25
Artificial Intelligence Bug in Xcode 16.4
I downloaded the new developer beta and then installed xcode. I did the downloads but I couldn't download the Predictive Code Completion Model. When I try to download it I get the error "The operation couldn’t be completed. (ModelCatalog.CatalogErrors.AssetErrors error 1.)". I am using the M3 Pro model.
Replies
2
Boosts
0
Views
180
Activity
Jun ’25
Using Past Versions of Foundation Models As They Progress
Has Apple made any commitment to versioning the Foundation Models on device? What if you build a feature that works great on 26.0 but they change the model or guardrails in 26.1 and it breaks your feature, is your only recourse filing Feedback or pulling the feature from the app? Will there be a way to specify a model version like in all of the server based LLM provider APIs? If not, sounds risky to build on.
Replies
7
Boosts
1
Views
459
Activity
Jul ’25
A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
Replies
1
Boosts
0
Views
1.6k
Activity
Jun ’25
Real Time Text detection using iOS18 RecognizeTextRequest from video buffer returns gibberish
Hey Devs, I'm trying to create my own Real Time Text detection like this Apple project. https://developer.apple.com/documentation/vision/extracting-phone-numbers-from-text-in-images I want to use the new iOS18 RecognizeTextRequest instead of the old VNRecognizeTextRequest in my SwiftUI project. This is my delegate code with the camera setup. I removed region of interest for debugging but I'm trying to scan English words in books. The idea is to get one word in the ROI in the future. But I can't even get proper words so testing without ROI incase my math is wrong. @Observable class CameraManager: NSObject, AVCapturePhotoCaptureDelegate ... override init() { super.init() setUpVisionRequest() } private func setUpVisionRequest() { textRequest = RecognizeTextRequest(.revision3) } ... func setup() -> Bool { captureSession.beginConfiguration() guard let captureDevice = AVCaptureDevice.default( .builtInWideAngleCamera, for: .video, position: .back) else { return false } self.captureDevice = captureDevice guard let deviceInput = try? AVCaptureDeviceInput(device: captureDevice) else { return false } /// Check whether the session can add input. guard captureSession.canAddInput(deviceInput) else { print("Unable to add device input to the capture session.") return false } /// Add the input and output to session captureSession.addInput(deviceInput) /// Configure the video data output videoDataOutput.setSampleBufferDelegate( self, queue: videoDataOutputQueue) if captureSession.canAddOutput(videoDataOutput) { captureSession.addOutput(videoDataOutput) videoDataOutput.connection(with: .video)? .preferredVideoStabilizationMode = .off } else { return false } // Set zoom and autofocus to help focus on very small text do { try captureDevice.lockForConfiguration() captureDevice.videoZoomFactor = 2 captureDevice.autoFocusRangeRestriction = .near captureDevice.unlockForConfiguration() } catch { print("Could not set zoom level due to error: \(error)") return false } captureSession.commitConfiguration() // potential issue with background vs dispatchqueue ?? Task(priority: .background) { captureSession.startRunning() } return true } } // Issue here ??? extension CameraManager: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput( _ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection ) { guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } Task { textRequest.recognitionLevel = .fast textRequest.recognitionLanguages = [Locale.Language(identifier: "en-US")] do { let observations = try await textRequest.perform(on: pixelBuffer) for observation in observations { let recognizedText = observation.topCandidates(1).first print("recognized text \(recognizedText)") } } catch { print("Recognition error: \(error.localizedDescription)") } } } } The results I get look like this ( full page of English from a any book) recognized text Optional(RecognizedText(string: e bnUI W4, confidence: 0.5)) recognized text Optional(RecognizedText(string: ?'U, confidence: 0.3)) recognized text Optional(RecognizedText(string: traQt4, confidence: 0.3)) recognized text Optional(RecognizedText(string: li, confidence: 0.3)) recognized text Optional(RecognizedText(string: 15,1,#, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllÈ, confidence: 0.3)) recognized text Optional(RecognizedText(string: vtrll, confidence: 0.3)) recognized text Optional(RecognizedText(string: 5,1,: 11, confidence: 0.5)) recognized text Optional(RecognizedText(string: 1141, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllll ljiiilij41, confidence: 0.3)) recognized text Optional(RecognizedText(string: 2f4, confidence: 0.3)) recognized text Optional(RecognizedText(string: ktril, confidence: 0.3)) recognized text Optional(RecognizedText(string: ¥LLI, confidence: 0.3)) recognized text Optional(RecognizedText(string: 11[Itl,, confidence: 0.3)) recognized text Optional(RecognizedText(string: 'rtlÈ131, confidence: 0.3)) Even with ROI set to a specific rectangle Normalized to Vision, I get the same results with single characters returning gibberish. Any help would be amazing thank you. Am I using the buffer right ? Am I using the new perform(on: CVPixelBuffer) right ? Maybe I didn't set up my camera properly? I can provide code
Replies
1
Boosts
0
Views
370
Activity
Jul ’25
Automated Testing and Performance Validation for Foundation Models Framework
I've been successfully integrating the Foundation Models framework into my healthcare app using structured generation with @Generable schemas. While my initial testing (20-30 iterations) shows promising results, I need to validate consistency and reliability at scale before production deployment. Question Is there a recommended approach for automated, large-scale testing of Foundation Models responses? Specifically, I'm looking to: Automate 1000+ test iterations with consistent prompts and structured schemas Measure response consistency across identical inputs Validate structured output reliability (proper schema adherence, no generation failures) Collect performance metrics (TTFT, TPS) for optimization Specific Questions Framework Limitations: Are there any undocumented rate limits or thermal throttling considerations for rapid session creation/destruction? Performance Tools: Can Xcode's Foundation Models Instrument be used programmatically, or only through Instruments UI? Automation Integration: Any recommendations for integrating with testing frameworks? Session Reuse: Is it better to reuse a single LanguageModelSession or create fresh sessions for each test iteration? Use Case Context My wellness app provides medically safe activity recommendations based on user health profiles. The Foundation Models framework processes health context and generates structured recommendations for exercises, nutrition, and lifestyle activities. Given the safety implications of providing health-related guidance, I need rigorous validation to ensure the model consistently produces appropriate, well-formed recommendations across diverse user scenarios and health conditions. Has anyone in the community built similar large-scale testing infrastructure for Foundation Models? Any insights on best practices or potential pitfalls would be greatly appreciated.
Replies
1
Boosts
0
Views
220
Activity
Jul ’25
Initializing session with transcript ignores tools
When I initialize a session with an existing transcript using this initializer: public convenience init(model: SystemLanguageModel = .default, guardrails: LanguageModelSession.Guardrails = .default, tools: [any Tool] = [], transcript: Transcript) The tools get ignored. I noticed that when doing that, the model never use the tools. When inspecting the transcript, I can see that the instruction entry does not have any tools available to it. I tried this for both transcripts that already include an instruction entry and ones that don't - both yielding the same result.. Is this the intended behavior / am I missing something here?
Replies
1
Boosts
0
Views
230
Activity
Jul ’25
Cannot find type ToolOutput in scope
My sample app has been working with the following code: func call(arguments: Arguments) async throws -&gt; ToolOutput { var temp:Int switch arguments.city { case .singapore: temp = Int.random(in: 30..&lt;40) case .china: temp = Int.random(in: 10..&lt;30) } let content = GeneratedContent(temp) let output = ToolOutput(content) return output } However in 26 beta 5, ToolOutput no longer available, please advice what has changed.
Replies
3
Boosts
0
Views
259
Activity
Aug ’25
Various On-Device Frameworks API & ChatGPT
Posting a follow up question after the WWDC 2025 Machine Learning AI & Frameworks Group Lab on June 12. In regards to the on-device API of any of the AI frameworks (foundation model, vision framework, ect.), is there a response condition or path where the API outsources it's input to ChatGPT if the user has allowed this like Siri does? Ignore this if it's a no: is this handled behind the scenes or by the developer?
Replies
0
Boosts
0
Views
333
Activity
Jun ’25
Initializing LanguageModelSession crashes app on macOS
Whenever I try to initialize a LanguageModelSession (let session = LanguageModelSession()), my app crashes with EXC_BAD_ACCESS. SystemLanguageModel.default.availability returns available. I tried running the two sample projects I found that use Foundation Models, FoundationModelsTripPlanner and SwiftTranscriptionSampleApp, and they both also crash—immediately on launch. I commented out the Foundation Models logic from the SwiftTranscriptionSampleApp and ran it again, and it no longer crashed. I'm on macOS 26 Beta 4 on an M1 Pro device. I'm based in Austria (EU), if that matters.
Replies
7
Boosts
4
Views
380
Activity
Aug ’25
LanguageModelSession.GenerationError.exceededContextWindowSize not called
When context window size exceeded, this error is not called (instead another error has shown up) to handle new session. LanguageModelSession.GenerationError.exceededContextWindowSize Or am I doing things wrong?
Replies
2
Boosts
0
Views
365
Activity
Jul ’25
Proposal: Develop a Token Estimation Tool for Foundation Models
Dear Apple Foundation Models Development Team, I am a developer integrating Apple Foundation Models (AFM) into my app and encountered the exceededContextWindowSize error when exceeding the 4096-token limit. Proposal: I suggest Apple develop a tool to estimate the token count of a prompt before sending it to the model. This tool could be integrated into FoundationModels Framework for ease of use. Benefits: A token estimation tool would help developers manage the context window limit and optimize performance. I hope Apple considers this proposal soon. Thank you!
Replies
6
Boosts
0
Views
388
Activity
Aug ’25
Vision Framework - Testing RecognizeDocumentsRequest
How do I test the new RecognizeDocumentRequest API. Reference: https://www.youtube.com/watch?v=H-GCNsXdKzM I am running Xcode Beta, however I only have one primary device that I cannot install beta software on. Please provide a strategy for testing. Will simulator work? The new capability is critical to my application, just what I need for structuring document scans and extraction. Thank you.
Replies
1
Boosts
0
Views
363
Activity
Jun ’25
When applied to a nested struct, @Generable macro results in infinite nested response from Foundation Model
When the @Generable is applied toward a Swift struct declared within another struct, and when said nested struct is defined as the type of one of the properties of another @Generable type, which is in turn defined as the output format of Foundation Model session, Foundation Model can stuck in a loop trying to create a infinitely nested response, until the context window limit exceeded error is triggered. I have filed feedback FB19987191 with a demo project. Is this expected behavior?
Replies
1
Boosts
0
Views
584
Activity
Sep ’25
ML models failed to decrypt and load
We have suddenly encountered a serious issue: our local ML models are no longer being decrypted. Everything was set up according to the guide at https://developer.apple.com/documentation/coreml/generating-a-model-encryption-key and had been working in production, but yesterday we started receiving the following error: Error Domain=com.apple.CoreML Code=8 "Fetching decryption key from server failed: noEntryFound("No records found"). Make sure the encryption key was generated with correct team ID." UserInfo={NSLocalizedDescription=Fetching decryption key from server failed: noEntryFound("No records found"). Make sure the encryption key was generated with correct team ID.} We haven’t changed anything in our code. This started spontaneously affecting users of the release version as of yesterday. It also no longer works locally — we receive the same error at the moment the autogenerated function is called: class func load(configuration: MLModelConfiguration = MLModelConfiguration(), completionHandler handler: @escaping (Swift.Result<ZingPDModel, Error>) -> Void) I assume that I can generate a new key through Xcode, integrate it in place of the old one, and it might start working again. However, this won’t affect existing users until they update the app. Could the issue be on Apple’s infrastructure side?
Replies
1
Boosts
0
Views
397
Activity
Jul ’25
Foundation Model Framework
Hey everyone, Is it possible to generate XML using the “Generable” macro of the Foundation Model Framework?
Replies
2
Boosts
0
Views
895
Activity
Sep ’25
Is there anywhere to get precompiled WhisperKit models for Swift?
If try to dynamically load WhipserKit's models, as in below, the download never occurs. No error or anything. And at the same time I can still get to the huggingface.co hosting site without any headaches, so it's not a blocking issue. let config = WhisperKitConfig( model: "openai_whisper-large-v3", modelRepo: "argmaxinc/whisperkit-coreml" ) So I have to default to the tiny model as seen below. I have tried so many ways, using ChatGPT and others, to build the models on my Mac, but too many failures, because I have never dealt with builds like that before. Are there any hosting sites that have the models (small, medium, large) already built where I can download them and just bundle them into my project? Wasted quite a large amount of time trying to get this done. import Foundation import WhisperKit @MainActor class WhisperLoader: ObservableObject { var pipe: WhisperKit? init() { Task { await self.initializeWhisper() } } private func initializeWhisper() async { do { Logging.shared.logLevel = .debug Logging.shared.loggingCallback = { message in print("[WhisperKit] \(message)") } let pipe = try await WhisperKit() // defaults to "tiny" self.pipe = pipe print("initialized. Model state: \(pipe.modelState)") guard let audioURL = Bundle.main.url(forResource: "44pf", withExtension: "wav") else { fatalError("not in bundle") } let result = try await pipe.transcribe(audioPath: audioURL.path) print("result: \(result)") } catch { print("Error: \(error)") } } }
Replies
0
Boosts
0
Views
129
Activity
Jun ’25
CoreML Inference Acceleration
Hello everyone, I have a visual convolutional model and a video that has been decoded into many frames. When I perform inference on each frame in a loop, the speed is a bit slow. So, I started 4 threads, each running inference simultaneously, but I found that the speed is the same as serial inference, every single forward inference is slower. I used the mactop tool to check the GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
Replies
2
Boosts
0
Views
742
Activity
Sep ’25
Foundation Model - Change LLM
Almost everywhere else you see Apple Intelligence, you get to select whether it's on device, private cloud compute, or ChatGPT. Is there a way to do that via code in the Foundation Model? I searched through the docs and couldn't find anything, but maybe I missed it.
Replies
2
Boosts
1
Views
200
Activity
Jul ’25