Machine Learning & AI

Easy way to implement facial recognition

Hi everyone😊, I want to implement facial recognition into my app. I was planning to use createML's image classification, but there seams to be a lot of hassle to implement (the JSON file etc.). Are there some other easy to implement options that don't involve advanced coding. Thanks, Oliver

Machine Learning & AI General

2

0

612

Feb ’25

A Summary of the WWDC25 Group Lab - Apple Intelligence

At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Apple Intelligence. Can I integrate writing tools in my own text editor? UITextView, NSTextView, and SwiftUI TextEditor automatically get Writing Tools on devices that support Apple Intelligence. For custom text editors, check out Enhancing your custom text engine with Writing Tools. Given that Foundation Models are on-device, how will Apple update the models over time? And how should we test our app against the model updates? Model updates are in sync with OS updates. As for testing with updated models, watch our WWDC session about prompt engineering and safety, and read the Human Interface Guidelines to understand best practices in prompting the on-device model. What is the context size of a session in Foundation Models Framework? How to handle the error if a session runs out of the context size? Currently the context size is about 4,000 tokens. If it’s exceeded, developers can catch the .exceededContextWindowSize error at runtime. As discussed in one of our WWDC25 sessions, when the context window is exceeded, one approach is to trim and summarize a transcript, and then start a new session. Can I do image generation using the Foundation Models Framework or is only text generation supported? Foundation Models do not generate images, but you can use the Foundation Models framework to generate prompts for ImageCreator in the Image Playground framework. Developers can also take advantage of Tools in Foundation Models framework, if appropriate for their app. My app currently uses a third party server-based LLM. Can I use the Foundation Models Framework as well in the same app? Any guidance here? The Foundation Models framework is optimized for a subset of tasks like summarization, extraction, classification, and tagging. It’s also on-device, private, and free. But at 3 billion parameters it isn’t designed for advanced reasoning or world knowledge, so for some tasks you may still want to use a larger server-based model. Should I use the AFM for my language translation features given it does text translation, or is the Translation API still the preferred approach? The Translation API is still preferred. Foundation Models is great for tasks like text summarization and data generation. It’s not recommended for general world knowledge or translation tasks. Will the TranslationSession class introduced in ios18 get any new improvments in performance or reliability with the new live translation abilities in ios/macos/ipados 26? Essentially, will we get access to live translation in a similar way and if so, how? There's new API in LiveCommunicationKit to take advantage of live translation in your communication apps. The Translate framework is using the same models as used by Live Communication and can be combined with the new SpeechAnalyzer API to translate your own audio. How do I set a default value for an App Intent parameter that is otherwise required? You can implement a default value as part of your parameter declaration by using the @Parameter(defaultValue:) form of the property wrapper. How long can an App Intent run? On macOS there is no limit to how long app intents can run. On iOS, there is a limit of 30 seconds. This time limit is paused when waiting for user interaction. How do I vary the options for a specific parameter of an App Intent, not just based on the type? Implement a DynamicOptionsProvider on that parameter. You can add suggestedEntities() to suggest options. What if there is not a schema available for what my app is doing? If an app intent schema matching your app’s functionality isn’t available, take a look to see if there’s a SiriKit domain that meets your needs, such as for media playback and messaging apps. If your app’s functionality doesn’t match any of the available schemas, you can define a custom app intent, and integrate it with Siri by making it an App Shortcut. Please file enhancement requests via Feedback Assistant for new App intent schemas that would benefit your app. Are you adding any new app intent domains this year? In addition to all the app intent domains we announced last year, this year at WWDC25 we announced that Visual Intelligence will be added to iOS 26 and macOS Tahoe. When my App Intent doesn't show up as an action in Shortcuts, where do I start in figuring out what went wrong? App Intents are statically extracted. You can check the ExtractMetadata info in Xcode's build log. What do I need to do to make sure my App Intents work well with Spotlight+? Check out our WWDC25 sessions on App Intents, including Explore new advances in App Intents and Develop for Shortcuts and Spotlight with App Intents. Mostly, make sure that your intent can run from the parameter summary alone, no required parameters without default values that are not already in the parameter summary. Does Spotlight+ on macOS support App Shortcuts? Not directly, but it shows the App Intents your App Shortcuts are sitting on top of. I’m wondering if the on-device Foundation Models framework API can be integrated into an app to act strictly as an app in-universe AI assistant, responding only within the boundaries of the app’s fictional context. Is such controlled, context-limited interaction supported? FM API runs inside the process of your app only and does not automatically integrate with any remaining part of the system (unless you choose to implement your own tool and utilize tool calling). You can provide any instructions and prompts you want to the model. If a country does not support Apple Intelligence yet, can the Foundation Models framework work? FM API works on Apple Intelligence-enabled devices in supported regions and won’t work in regions where Apple Intelligence is not yet supported

Machine Learning & AI Apple Intelligence

2

0

285

Jul ’25

Foundation Model - Change LLM

Almost everywhere else you see Apple Intelligence, you get to select whether it's on device, private cloud compute, or ChatGPT. Is there a way to do that via code in the Foundation Model? I searched through the docs and couldn't find anything, but maybe I missed it.

Machine Learning & AI Foundation Models

2

1

154

Jul ’25

How to encode Tool.Output (aka PromptRepresentable)?

Hey, I've been trying to write an AI agent for OpenAI's GPT-5, but using the @Generable Tool types from the FoundationModels framework, which is super awesome btw! I'm having trouble implementing the tool calling, though. When I receive a tool call from the OpenAI api, I do the following: Find the tool in my [any Tool] array via the tool name I get from the model if let tool = tools.first(where: { $0.name == functionCall.name }) { // ... } Parse the arguments of the tool call via GeneratedContent(json:) let generatedContent = try GeneratedContent(json: functionCall.arguments) Pass the tool and arguments to a function that calls tool.call(arguments: arguments) and returns the tool's output type private func execute<T: Tool>(_ tool: T, with generatedContent: GeneratedContent) async throws -> T.Output { let arguments = try T.Arguments.init(generatedContent) return try await tool.call(arguments: arguments) } Up to this point, everything is working as expected. However, the tool's output type is any PromptRepresentable and I have no idea how to turn that into something that I can encode and send back to the model. I assumed there might be a way to turn it into a GeneratedContent but there is no fitting initializer. Am I missing something or is this not supported? Without a way to return the output to an external provider, it wouldn't really be possible to use FoundationModels Tool type I think. That would be unfortunate because it's implemented so elegantly. Thanks!

Machine Learning & AI Foundation Models

2

0

234

Aug ’25

Is there an API to check if a Core ML compiled model is already cached?

Hello Apple Developer Community, I'm investigating Core ML model loading behavior and noticed that even when the compiled model path remains unchanged after an APP update, the first run still triggers an "uncached load" process. This seems to impact user experience with unnecessary delays. Question: Does Core ML provide any public API to check whether a compiled model (from a specific .mlmodelc path) is already cached in the system? If such API exists, we'd like to use it for pre-loading decision logic - only perform background pre-load when the model isn't cached. Has anyone encountered similar scenarios or found official solutions? Any insights would be greatly appreciated!

Machine Learning & AI Core ML iOS Core ML

2

0

240

May ’25

WWDC25 combining metal and ML

WWDC25: Combine Metal 4 machine learning and graphics Demonstrated a way to combine neural network in the graphics pipeline directly through the shaders, using an example of Texture Compression. However there is no mention of using which ML technique texture is compressed. Can anyone point me to some well known model/s for this particular use case shown in WWDC25.

Machine Learning & AI General ML Compute Metal Core Graphics

2

0

445

Jul ’25

Swift playgrounds (.swiftpm) and CoreML

Hey guys, I've been having difficulties transferring my Xcode project to a Swift playground (.swiftpm) for the Swift Student Challenge. I keep getting these errors as well as none of the views being able to find the model in scope: "TrashDetector 1.mlmodel: No predominant language detected. Set COREML_CODEGEN_LANGUAGE to preferred language." Unexpected duplicate tasks: Target 'TrashQuest' (project 'TrashQuest') has write command with output /Users/kmcph3/Library/Developer/Xcode/DerivedData/TrashQuest-glvzskunedgtakfrdmsxdoplondj/Build/Intermediates.noindex/TrashQuest.build/Debug-iphonesimulator/TrashQuest.build/0a4ef2429d66360920ddb4f16e65e233.sb I've gone through multiple post with these exact problems, but they all seem to be talking about ".playground" files due to the "Resources" folder (mind you I did try exactly what they said). Is there anyone that can help??? (Quick side note, why does it need to be a swiftpm file for the SSC??? Like why can't we just send the zip of our Xcode project??)

Machine Learning & AI Core ML

2

0

852

Feb ’25

FoundationModels guardrailViolation on Beta 3

Hello everybody! I’m encountering an unexpected guardrailViolation error when using Foundation Models on macOS Beta 3 (Tahoe) with an Apple M2 Pro chip. This issue didn’t occur on Beta 1 or Beta 2 using the same codebase. Reproduction Context I’m developing an app that leverages Foundation Models for structured generation, paired with a local database tool. After upgrading to macOS Beta 3, I started receiving this error consistently, despite no changes in the generation logic. To isolate the issue, I opened the official WWDC sample project from the Adding intelligent app features with generative models and the same guardrailViolation error appeared without any modifications. Simplified Working Example I attempted to narrow down the issue by starting with a minimal prompt structure. This basic case works fine: import Foundation import Playgrounds import FoundationModels @Generable struct GeneableLandmark { @Guide(description: "Name of the landmark to visit") var name: String } final class LandmarkSuggestionGenerator { var landmarkSuggestion: GeneableLandmark.PartiallyGenerated? private var session: LanguageModelSession init(){ self.session = LanguageModelSession( instructions: Instructions { """ generate a list of landmarks to visit """ } ) } func createLandmarkSuggestion(location: String) async throws { let stream = session.streamResponse( generating: GeneableLandmark.self, options: GenerationOptions(sampling: .greedy), includeSchemaInPrompt: false ) { """ Generate a list of landmarks to viist in \(location) """ } for try await partialResponse in stream { landmarkSuggestion = partialResponse } } } #Playground { let generator = LandmarkSuggestionGenerator() Task { do { try await generator.createLandmarkSuggestion(location: "New york") if let suggestion = generator.landmarkSuggestion { print("Suggested landmark: \(suggestion)") } else { print("No suggestion generated.") } } catch { print("Error generating landmark suggestion: \(error)") } } } But as soon as I use the Sample ItineraryPlanner: #Playground { // Example landmark for demonstration let exampleLandmark = Landmark( id: 1, name: "San Francisco", continent: "North America", description: "A vibrant city by the bay known for the Golden Gate Bridge.", shortDescription: "Iconic Californian city.", latitude: 37.7749, longitude: -122.4194, span: 0.2, placeID: nil ) let planner = ItineraryPlanner(landmark: exampleLandmark) Task { do { try await planner.suggestItinerary(dayCount: 3) if let itinerary = planner.itinerary { print("Suggested itinerary: \(itinerary)") } else { print("No itinerary generated.") } } catch { print("Error generating itinerary: \(error)") } } } The error pops up: Multiline Error generating itinerary: guardrailViolation(FoundationModels.LanguageModelSession. >GenerationError.Context(debug Description: "May contain sensitive or unsafe content", >underlyingErrors: [FoundationModels. LanguageModelSession. Gene >rationError.guardrailViolation(FoundationMo dels. >LanguageModelSession.GenerationError.C ontext (debugDescription: >"May contain unsafe content", underlyingErrors: []))])) Based on my tests: The error may not be tied to structure complexity (since more nested structures work) The issue may stem from the tools or prompt content used inside the ItineraryPlanner The guardrail sensitivity may have increased or changed in Beta 3, affecting models that worked in earlier betas Thank you in advance for your help. Let me know if more details or reproducible code samples are needed - I’m happy to provide them. Best, Sasha Morozov

Machine Learning & AI Foundation Models Beta

2

1

410

Jul ’25

ModelManager received unentitled request. Expected entitlement com.apple.modelmanager.inference

Just tried to write a very simple test of using foundation models, but it gave me the error like this "ModelManager received unentitled request. Expected entitlement com.apple.modelmanager.inference establishment of session failed with Missing entitlement: com.apple.modelmanager.inference" The simple code is listed below: let session: LanguageModelSession = LanguageModelSession() let response = try? await session.respond(to: "What is the capital of France?") print("Response: (response)") So what's the problem of this one?

Machine Learning & AI Foundation Models

2

0

256

Jul ’25

Khmer Script Misidentified as Thai in Vision Framework

It is vital for Apple to refine its OCR models to correctly distinguish between Khmer and Thai scripts. Incorrectly labeling Khmer text as Thai is more than a technical bug; it is a culturally insensitive error that impacts national identity, especially given the current geopolitical climate between Cambodia and Thailand. Implementing a more robust language-detection threshold would prevent these harmful misidentifications. There is a significant logic flaw in the VNRecognizeTextRequest language detection when processing Khmer script. When the property automaticallyDetectsLanguage is set to true, the Vision framework frequently misidentifies Khmer characters as Thai. While both scripts share historical roots, they are distinct languages with different alphabets. Currently, the model’s confidence threshold for distinguishing between these two scripts is too low, leading to incorrect OCR output in both developer-facing APIs and Apple’s native ecosystem (Preview, Live Text, and Photos). import SwiftUI import Vision class TextExtractor { func extractText(from data: Data, completion: @escaping (String) -> Void) { let request = VNRecognizeTextRequest { (request, error) in guard let observations = request.results as? [VNRecognizedTextObservation] else { completion("No text found.") return } let recognizedStrings = observations.compactMap { observation in let str = observation.topCandidates(1).first?.string return "{text: \(str!), confidence: \(observation.confidence)}" } completion(recognizedStrings.joined(separator: "\n")) } request.automaticallyDetectsLanguage = true // <-- This is the issue. request.recognitionLevel = .accurate let handler = VNImageRequestHandler(data: data, options: [:]) DispatchQueue.global(qos: .background).async { do { try handler.perform([request]) } catch { completion("Failed to perform OCR: \(error.localizedDescription)") } } } } Recognizing Khmer Confidence Score is low for Khmer text. (The output is in Thai language with low confidence score) Recognizing English Confidence Score is high expected. Recognizing Thai Confidence Score is high as expected Issues on Preview, Photos Khmer text Copied text Kouk Pring Chroum Temple [19121 รอาสายสุกตีนานยารรีสใหิสรราภูชิตีนนสุฐตีย์ [รุก เผือชิษาธอยกัตธ์ตายตราพาษชาณา ถวเชยาใบสราเบรถทีมูสินตราพาษชาณา ทีมูโษา เช็ก อาษเชิษฐอารายสุกบดตพรธุรฯ ตากร"สุก"ผาตากรธกรธุกเยากสเผาพศฐตาสาย รัอรณาษ"ตีพย" สเผาพกรกฐาภูชิสาเครๆผู:สุกรตีพาสเผาพสรอสายใผิตรรารตีพสๆ เดียอลายสุกตีน ธาราชรติ ธิพรหณาะพูชุบละเาหLunet De Lajonquiere ผารูกรสาราพารผรผาสิตภพ ตารสิทูก ธิพิ คุณที่นสายเระพบพเคเผาหนารเกะทรนภาษเราภุพเสารเราษทีเลิกสญาเราหรุฬารชสเกาก เรากุม สงสอบานตรเราะากกต่ายภากายระตารุกเตียน Recommended Solutions 1. Set a Threshold Filter out the detected result where the threshold is less than or equal to 0.5, so that it would not output low quality text which can lead to the issue. For example, let recognizedStrings = observations.compactMap { observation in if observation.confidence <= 0.5 { return nil } let str = observation.topCandidates(1).first?.string return "{text: \(str!), confidence: \(observation.confidence)}" } 2. Add Khmer Language Support This issue would never happen if the model has the capability to detect and recognize image with Khmer language. Doc2Text GitHub: https://github.com/seanghay/Doc2Text-Swift

Machine Learning & AI General Vision VisionKit Localization

2

0

863

6d

Ways I can leverage AI when the user asks Siri, "What does this word mean"

I'm the creator of an app that helps users learn Arabic. Inside of the app users can save words, engage in lessons specific to certain grammar concepts etc. I'm looking for a way for Siri to 'suggest' my app when the user asks to define any Arabic words. There are other questions that I would like for Siri to suggest my app for, but I figure that's a good start. What framework am I looking for here? I think AppItents? I remember I played with it for a bit last year but didn't get far. Any suggestions would be great. Would the new Foundations model be any help here?

Machine Learning & AI Apple Intelligence

2

0

135

Jun ’25

`LanguageModelSession.respond()` never resolves in Beta 5

Hi all, I noticed on Friday that on the new Beta 5 using FoundationModels on a simulator LanguageModelSession.respond() neither resolves nor throws most of the time. The SwiftUI test app below was working perfectly in Xcode 16 Beta 4 and iOS 26 Beta 4 (simulator). import SwiftUI import FoundationModels struct ContentView: View { var body: some View { VStack { Image(systemName: "globe") .imageScale(.large) .foregroundStyle(.tint) Text("Hello, world!") } .padding() .onAppear { Task { do { let session = LanguageModelSession() let response = try await session.respond(to: "are cats better than dogs ???") print(response.content) } catch { print("error") } } } } } After updating to Xcode 16 Beta 5 and iOS 26 Beta 5 (simulator), the code now often hangs. Occasionally it will work if I toggle Apple Intelligence on and off in Settings, but it’s unreliable.

Machine Learning & AI Foundation Models

2

0

358

Aug ’25

Feature Request – Support for GS1 DataBar Stacked in Vision Framework

Dear Apple Developer Team, I am writing to request the addition of GS1 DataBar Stacked (both regular and expanded variants) to the barcode symbologies supported by the Vision framework (VNBarcodeSymbology) and VisionKit's DataScannerViewController. Currently, Vision supports several GS1 DataBar formats, such as: VNBarcodeSymbology.gs1DataBar VNBarcodeSymbology.gs1DataBarExpanded VNBarcodeSymbology.gs1DataBarLimited However, GS1 DataBar Stacked is widely used in industries such as retail, pharmaceuticals, and logistics, where space constraints prevent the use of the standard GS1 DataBar format. Many businesses rely on this symbology to encode GTINs and other product data, but Apple's barcode scanning API does not explicitly support it. Why This Feature Matters: Essential for Small Packaging: GS1 DataBar Stacked is commonly used on small product labels where a standard linear barcode does not fit. Widespread Industry Adoption: Many point-of-sale (POS) systems and inventory management tools require this symbology. Improves iOS Adoption for Enterprise Use: Adding support would make Apple’s Vision framework a more viable solution for businesses that currently rely on third-party barcode scanning SDKs. Feature Request: Please add GS1 DataBar Stacked and GS1 DataBar Expanded Stacked to the recognized symbologies in: VNBarcodeSymbology (for Vision framework) DataScannerViewController (for VisionKit) This addition would enhance the versatility of Apple’s barcode scanning tools and reduce the need for third-party libraries. I appreciate your consideration of this request and would be happy to provide more details or test implementations if needed. Thank you for your time and support! Best regards

Machine Learning & AI Core ML Vision VisionKit

2

5

693

Feb ’25

Bundle Identifier for Apple Intellligence

May i know the bundle identifier for apple intelligence?

Machine Learning & AI Apple Intelligence

2

0

817

Feb ’25

Does Apple's new foundation models include a Vision API for accessing on-device LLM capabilities?

I couldn't find information about this in the documentation. Could someone clarify if this API is available and how to access it?

Machine Learning & AI Apple Intelligence Machine Learning

2

0

209

Jun ’25

Image Playground files suddenly not available

My app lets you create images with Image Playground. When the user approves an image I move it to the documents dir from the temp storage. With over a year of usage I’ve created a lot of images over time. Out of nowhere the app stopped loading my custom creations from Image Playground saying it couldn’t find the files. It still had my VoiceOver strings I had added for each image and still had the custom categories I assigned them. Debug code to look in the docs dir doesn’t find them. I downloaded the app’s container and only see the images I created as a test after the problem started. But my ~70MB app is still taking up 300MB on my iPhone so it feels like they’re there but not accessible. Is there anything else I can try?

Machine Learning & AI Apple Intelligence Apple Intelligence

2

0

772

3w

Apple's AI development language is not compatible

We are developing Apple AI for overseas markets and adapting it for iPhone 17 and later models. When the system language and Siri language do not match—such as the system being in English while Siri is in Chinese—it may result in Apple AI being unusable. So, I would like to ask, how can this issue be resolved, and are there other reasons that might cause it to be unusable within the app?

Machine Learning & AI General Machine Learning Core ML

2

0

992

3w

Rate limit exceeded when using Foundation Model framework

When I use the FoundationModel framework to generate long text, it will always hit an error. "Passing along Client rate limit exceeded, try again later in response to ExecuteRequest" And stop generating. eg. for the prompt "Write a long story", it will almost certainly hit that error after 17 seconds of generation. do{ let session = LanguageModelSession() let prompt: String = "Write a long story" let response = try await session.respond(to: prompt) }catch{} If possible, I want to know how to prevent that error or at least how to handle it.

Machine Learning & AI Foundation Models Apple Intelligence

2

1

728

Jul ’25

Defining a Foundation Models Tool with arguments determined at runtime

I'm experimenting with Foundation Models and I'm trying to understand how to define a Tool whose input argument is defined at runtime. Specifically, I want a Tool that takes a single String parameter that can only take certain values defined at runtime. I think my question is basically the same as this one: https://developer.apple.com/forums/thread/793471 However, the answer provided by the engineer doesn't actually demonstrate how to create the GenerationSchema. Trying to piece things together from the documentation that the engineer linked to, I came up with this: let citiesDefinedAtRuntime = ["London", "New York", "Paris"] let citySchema = DynamicGenerationSchema( name: "CityList", properties: [ DynamicGenerationSchema.Property( name: "city", schema: DynamicGenerationSchema( name: "city", anyOf: citiesDefinedAtRuntime ) ) ] ) let generationSchema = try GenerationSchema(root: citySchema, dependencies: []) let tools = [CityInfo(parameters: generationSchema)] let session = LanguageModelSession(tools: tools, instructions: "...") With the CityInfo Tool defined like this: struct CityInfo: Tool { let name: String = "getCityInfo" let description: String = "Get information about a city." let parameters: GenerationSchema func call(arguments: GeneratedContent) throws -> String { let cityName = try arguments.value(String.self, forProperty: "city") print("Requested info about \(cityName)") let cityInfo = getCityInfo(for: cityName) return cityInfo } func getCityInfo(for city: String) -> String { // some backend that provides the info } } This compiles and usually seems to work. However, sometimes the model will try to request info about a city that is not in citiesDefinedAtRuntime. For example, if I prompt the model with "I want to travel to Tokyo in Japan, can you tell me about this city?", the model will try to request info about Tokyo, even though this is not in the citiesDefinedAtRuntime array. My understanding is that this should not be possible – constrained generation should only allow the LLM to generate an input argument from the list of cities defined in the schema. Am I missing something here or overcomplicating things? What's the correct way to make sure the LLM can only call a Tool with an input parameter from a set of possible values defined at runtime? Many thanks!

Machine Learning & AI Foundation Models

2

0

1.1k

2w

Metal GPU Work Won't Stop

Is there any way to stop GPU work running that is scheduled using metal? Long shader calculations don't stop when application is stopped in Xcode and continue to take up GPU time and affect the display. Why is this functionality not available when Swift Tasks are able to be canceled?

Machine Learning & AI General

2

0

783

Feb ’25

Post

Replies

Boosts

Views

Activity