Explore the power of machine learning within apps. Discuss integrating machine learning features, share best practices, and explore the possibilities for your app.

Posts under General subtopic

Post

Replies

Boosts

Views

Activity

Shortcut - “Use Model” error handling?
I have a series of shortcuts that I’ve written that use the “Use Model” action to do various things. For example, I have a shortcut “Clipboard Markdown to Notes” that takes the content of the clipboard, creates a new note in Notes, converts the markdown content to rich text, adds it to the note etc. One key step is to analyze the markdown content with “Use Model” and generate a short descriptive title for the note. I use the on-device model for this, but sometimes the content and prompt exceed the context window size and the action fails with an error message to that effect. In that case, I’d like to either repeat the action using the Cloud model, or, if the error was a refusal, to prompt the user to enter a title to use. I‘ve tried using an IF based on whether the response had any text in it, but that didn’t work. No matter what I’ve tried, I can’t seem to find a way to catch the error from Use Model, determine what the error was, and take appropriate action. Is there a way to do this? (And by the way, a huge ”thank you” to whoever had the idea of making AppIntents visible in Shortcuts and adding the Use Model action — has made a huge difference already, and it lets us see what Siri will be able to use as well.)
3
0
444
4d
Easy way to implement facial recognition
Hi everyone😊, I want to implement facial recognition into my app. I was planning to use createML's image classification, but there seams to be a lot of hassle to implement (the JSON file etc.). Are there some other easy to implement options that don't involve advanced coding. Thanks, Oliver
2
0
612
Feb ’25
Metal GPU Work Won't Stop
Is there any way to stop GPU work running that is scheduled using metal? Long shader calculations don't stop when application is stopped in Xcode and continue to take up GPU time and affect the display. Why is this functionality not available when Swift Tasks are able to be canceled?
2
0
782
Feb ’25
VNRecognizeTextRequest: .automatic vs specific language: different results?
Hi, One can configure the languages of a (VN)RecognizeTextRequest with either: .automatic: language to be detected a specific language, say Spanish If the request is configured with .automatic and successfully detects Spanish, will the results be exactly equivalent compared to a request made with Spanish set as language? I could not find any information about this, and this is very important for the core architecture of my app. Thanks!
2
0
131
Apr ’25
Looking for a prebuilt TensorFlow Lite C++ library (libtensorflowlite) for macOS M1/M2
Hi everyone! 👋 I'm working on a C++ project using TensorFlow Lite and was wondering if anyone has a prebuilt TensorFlow Lite C++ library (libtensorflowlite) for macOS (Apple Silicon M1/M2) that they’d be willing to share. I’m looking specifically for the TensorFlow Lite C++ API — something that lets me use tflite::Interpreter, tflite::FlatBufferModel, etc. Building it from source using Bazel on macOS has been quite challenging and time-consuming, so a ready-to-use .dylib or .a build along with the required headers would be incredibly helpful. TensorFlow Lite version: v2.18.0 preferred Target: macOS arm64 (Apple Silicon) What I need: libtensorflowlite.dylib or .a Corresponding headers (ideally organized in a clean include/ folder) If you have one available or know where I can find a reliable prebuilt version, I’d be super grateful. Thanks in advance! 🙏
2
0
190
Apr ’25
Data used for MLX fine-tuning
The WWDC25: Explore large language models on Apple silicon with MLX video talks about using your own data to fine-tune a large language model. But the video doesn't explain what kind of data can be used. The video just shows the command to use and how to point to the data folder. Can I use PDFs, Word documents, Markdown files to train the model? Are there any code examples on GitHub that demonstrate how to do this?
2
0
259
Oct ’25
WWDC25 combining metal and ML
WWDC25: Combine Metal 4 machine learning and graphics Demonstrated a way to combine neural network in the graphics pipeline directly through the shaders, using an example of Texture Compression. However there is no mention of using which ML technique texture is compressed. Can anyone point me to some well known model/s for this particular use case shown in WWDC25.
2
0
443
Jul ’25
Problem running NLContextualEmbeddingModel in simulator
Environment MacOC 26 Xcode Version 26.0 beta 7 (17A5305k) simulator: iPhone 16 pro iOS: iOS 26 Problem NLContextualEmbedding.load() fails with the following error In simulator Failed to load embedding from MIL representation: filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"] filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"] Failed to load embedding model 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289' assetRequestFailed(Optional(Error Domain=NLNaturalLanguageErrorDomain Code=7 "Embedding model requires compilation" UserInfo={NSLocalizedDescription=Embedding model requires compilation})) in #Playground I'm new to this embedding model. Not sure if it's caused by my code or environment. Code snippet import Foundation import NaturalLanguage import Playgrounds #Playground { // Prefer initializing by script for broader coverage; returns NLContextualEmbedding? guard let embeddingModel = NLContextualEmbedding(script: .latin) else { print("Failed to create NLContextualEmbedding") return } print(embeddingModel.hasAvailableAssets) do { try embeddingModel.load() print("Model loaded") } catch { print("Failed to load model: \(error)") } }
2
2
1.2k
2w
Apple's AI development language is not compatible
We are developing Apple AI for overseas markets and adapting it for iPhone 17 and later models. When the system language and Siri language do not match—such as the system being in English while Siri is in Chinese—it may result in Apple AI being unusable. So, I would like to ask, how can this issue be resolved, and are there other reasons that might cause it to be unusable within the app?
2
0
988
3w
Khmer Script Misidentified as Thai in Vision Framework
It is vital for Apple to refine its OCR models to correctly distinguish between Khmer and Thai scripts. Incorrectly labeling Khmer text as Thai is more than a technical bug; it is a culturally insensitive error that impacts national identity, especially given the current geopolitical climate between Cambodia and Thailand. Implementing a more robust language-detection threshold would prevent these harmful misidentifications. There is a significant logic flaw in the VNRecognizeTextRequest language detection when processing Khmer script. When the property automaticallyDetectsLanguage is set to true, the Vision framework frequently misidentifies Khmer characters as Thai. While both scripts share historical roots, they are distinct languages with different alphabets. Currently, the model’s confidence threshold for distinguishing between these two scripts is too low, leading to incorrect OCR output in both developer-facing APIs and Apple’s native ecosystem (Preview, Live Text, and Photos). import SwiftUI import Vision class TextExtractor { func extractText(from data: Data, completion: @escaping (String) -> Void) { let request = VNRecognizeTextRequest { (request, error) in guard let observations = request.results as? [VNRecognizedTextObservation] else { completion("No text found.") return } let recognizedStrings = observations.compactMap { observation in let str = observation.topCandidates(1).first?.string return "{text: \(str!), confidence: \(observation.confidence)}" } completion(recognizedStrings.joined(separator: "\n")) } request.automaticallyDetectsLanguage = true // <-- This is the issue. request.recognitionLevel = .accurate let handler = VNImageRequestHandler(data: data, options: [:]) DispatchQueue.global(qos: .background).async { do { try handler.perform([request]) } catch { completion("Failed to perform OCR: \(error.localizedDescription)") } } } } Recognizing Khmer Confidence Score is low for Khmer text. (The output is in Thai language with low confidence score) Recognizing English Confidence Score is high expected. Recognizing Thai Confidence Score is high as expected Issues on Preview, Photos Khmer text Copied text Kouk Pring Chroum Temple [19121 รอาสายสุกตีนานยารรีสใหิสรราภูชิตีนนสุฐตีย์ [รุก เผือชิษาธอยกัตธ์ตายตราพาษชาณา ถวเชยาใบสราเบรถทีมูสินตราพาษชาณา ทีมูโษา เช็ก อาษเชิษฐอารายสุกบดตพรธุรฯ ตากร"สุก"ผาตากรธกรธุกเยากสเผาพศฐตาสาย รัอรณาษ"ตีพย" สเผาพกรกฐาภูชิสาเครๆผู:สุกรตีพาสเผาพสรอสายใผิตรรารตีพสๆ เดียอลายสุกตีน ธาราชรติ ธิพรหณาะพูชุบละเาหLunet De Lajonquiere ผารูกรสาราพารผรผาสิตภพ ตารสิทูก ธิพิ คุณที่นสายเระพบพเคเผาหนารเกะทรนภาษเราภุพเสารเราษทีเลิกสญาเราหรุฬารชสเกาก เรากุม สงสอบานตรเราะากกต่ายภากายระตารุกเตียน Recommended Solutions 1. Set a Threshold Filter out the detected result where the threshold is less than or equal to 0.5, so that it would not output low quality text which can lead to the issue. For example, let recognizedStrings = observations.compactMap { observation in if observation.confidence <= 0.5 { return nil } let str = observation.topCandidates(1).first?.string return "{text: \(str!), confidence: \(observation.confidence)}" } 2. Add Khmer Language Support This issue would never happen if the model has the capability to detect and recognize image with Khmer language. Doc2Text GitHub: https://github.com/seanghay/Doc2Text-Swift
2
0
860
5d
Core ML Model performance far lower on iOS 17 vs iOS 16 (iOS 17 not using Neural Engine)
Hello, I posted an issue on the coremltools GitHub about my Core ML models not performing as well on iOS 17 vs iOS 16 but I'm posting it here just in case. TL;DR The same model on the same device/chip performs far slower (doesn't use the Neural Engine) on iOS 17 compared to iOS 16. Longer description The following screenshots show the performance of the same model (a PyTorch computer vision model) on an iPhone SE 3rd gen and iPhone 13 Pro (both use the A15 Bionic). iOS 16 - iPhone SE 3rd Gen (A15 Bioinc) iOS 16 uses the ANE and results in fast prediction, load and compilation times. iOS 17 - iPhone 13 Pro (A15 Bionic) iOS 17 doesn't seem to use the ANE, thus the prediction, load and compilation times are all slower. Code To Reproduce The following is my code I'm using to export my PyTorch vision model (using coremltools). I've used the same code for the past few months with sensational results on iOS 16. # Convert to Core ML using the Unified Conversion API coreml_model = ct.convert( model=traced_model, inputs=[image_input], outputs=[ct.TensorType(name="output")], classifier_config=ct.ClassifierConfig(class_names), convert_to="neuralnetwork", # compute_precision=ct.precision.FLOAT16, compute_units=ct.ComputeUnit.ALL ) System environment: Xcode version: 15.0 coremltools version: 7.0.0 OS (e.g. MacOS version or Linux type): Linux Ubuntu 20.04 (for exporting), macOS 13.6 (for testing on Xcode) Any other relevant version information (e.g. PyTorch or TensorFlow version): PyTorch 2.0 Additional context This happens across "neuralnetwork" and "mlprogram" type models, neither use the ANE on iOS 17 but both use the ANE on iOS 16 If anyone has a similar experience, I'd love to hear more. Otherwise, if I'm doing something wrong for the exporting of models for iOS 17+, please let me know. Thank you!
1
1
1.9k
Mar ’25
What special features does Apple officially have that use ML or AI?
I am a App designer and I am curious about what specific ML or AI Apple used to develop those features in the system. As far as I know, Apple's hand-raising detection, destination recommendations in maps, and exercise types in fitness all use ML. Are there more specific application examples of ML or AI? Does Apple have a document specifically introducing examples of specific applications of ML or AI technology in the system?
1
0
620
Feb ’25
MPSGraph fused scaledDotProductAttention seems to be buggy
While building an app with large language model inferencing on device, I got gibberish output. After carefully examining every detail, I found it's caused by the fused scaledDotProductAttention operation. I switched back to the discrete operations and problem solved. To reproduce the bug, please check https://github.com/zhoudan111/MPSGraph_SDPA_bug
1
0
531
Mar ’25
My app crash in the Portrait private framework
Incident Identifier: 4C22F586-71FB-4644-B823-A4B52D158057 CrashReporter Key: adc89b7506c09c2a6b3a9099cc85531bdaba9156 Hardware Model: Mac16,10 Process: PRISMLensCore [16561] Path: /Applications/PRISMLens.app/Contents/Resources/app.asar.unpacked/node_modules/core-node/PRISMLensCore.app/PRISMLensCore Identifier: com.prismlive.camstudio Version: (null) ((null)) Code Type: ARM-64 Parent Process: ? [16560] Date/Time: (null) OS Version: macOS 15.4 (24E5228e) Report Version: 104 Exception Type: EXC_CRASH (SIGABRT) Exception Codes: 0x00000000 at 0x0000000000000000 Crashed Thread: 34 Application Specific Information: *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[__NSArrayM insertObject:atIndex:]: object cannot be nil' Thread 34 Crashed: 0 CoreFoundation 0x000000018ba4dde4 0x18b960000 + 974308 (__exceptionPreprocess + 164) 1 libobjc.A.dylib 0x000000018b512b60 0x18b4f8000 + 109408 (objc_exception_throw + 88) 2 CoreFoundation 0x000000018b97e69c 0x18b960000 + 124572 (-[__NSArrayM insertObject:atIndex:] + 1276) 3 Portrait 0x0000000257e16a94 0x257da3000 + 473748 (-[PTMSRResize addAdditionalOutput:] + 604) 4 Portrait 0x0000000257de91c0 0x257da3000 + 287168 (-[PTEffectRenderer initWithDescriptor:metalContext:useHighResNetwork:faceAttributesNetwork:humanDetections:prevTemporalState:asyncInitQueue:sharedResources:] + 6204) 5 Portrait 0x0000000257dab21c 0x257da3000 + 33308 (__33-[PTEffect updateEffectDelegate:]_block_invoke.241 + 164) 6 libdispatch.dylib 0x000000018b739b2c 0x18b738000 + 6956 (_dispatch_call_block_and_release + 32) 7 libdispatch.dylib 0x000000018b75385c 0x18b738000 + 112732 (_dispatch_client_callout + 16) 8 libdispatch.dylib 0x000000018b742350 0x18b738000 + 41808 (_dispatch_lane_serial_drain + 740) 9 libdispatch.dylib 0x000000018b742e2c 0x18b738000 + 44588 (_dispatch_lane_invoke + 388) 10 libdispatch.dylib 0x000000018b74d264 0x18b738000 + 86628 (_dispatch_root_queue_drain_deferred_wlh + 292) 11 libdispatch.dylib 0x000000018b74cae8 0x18b738000 + 84712 (_dispatch_workloop_worker_thread + 540) 12 libsystem_pthread.dylib 0x000000018b8ede64 0x18b8eb000 + 11876 (_pthread_wqthread + 292) 13 libsystem_pthread.dylib 0x000000018b8ecb74 0x18b8eb000 + 7028 (start_wqthread + 8)
1
0
101
Mar ’25
DockKit .track() has no effect using VNDetectFaceRectanglesRequest
Hi, I'm testing DockKit with a very simple setup: I use VNDetectFaceRectanglesRequest to detect a face and then call dockAccessory.track(...) using the detected bounding box. The stand is correctly docked (state == .docked) and dockAccessory is valid. I'm calling .track(...) with a single observation and valid CameraInformation (including size, device, orientation, etc.). No errors are thrown. To monitor this, I added a logging utility – track(...) is being called 10–30 times per second, as recommended in the documentation. However: the stand does not move at all. There is no visible reaction to the tracking calls. Is there anything I'm missing or doing wrong? Is VNDetectFaceRectanglesRequest supported for DockKit tracking, or are there hidden requirements? Would really appreciate any help or pointers – thanks! That's my complete code: extension VideoFeedViewController: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { guard let frame = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } detectFace(image: frame) func detectFace(image: CVPixelBuffer) { let faceDetectionRequest = VNDetectFaceRectanglesRequest() { vnRequest, error in guard let results = vnRequest.results as? [VNFaceObservation] else { return } guard let observation = results.first else { return } let boundingBoxHeight = observation.boundingBox.size.height * 100 #if canImport(DockKit) if let dockAccessory = self.dockAccessory { Task { try? await trackRider( observation.boundingBox, dockAccessory, frame, sampleBuffer ) } } #endif } let imageResultHandler = VNImageRequestHandler(cvPixelBuffer: image, orientation: .up) try? imageResultHandler.perform([faceDetectionRequest]) func combineBoundingBoxes(_ box1: CGRect, _ box2: CGRect) -> CGRect { let minX = min(box1.minX, box2.minX) let minY = min(box1.minY, box2.minY) let maxX = max(box1.maxX, box2.maxX) let maxY = max(box1.maxY, box2.maxY) let combinedWidth = maxX - minX let combinedHeight = maxY - minY return CGRect(x: minX, y: minY, width: combinedWidth, height: combinedHeight) } #if canImport(DockKit) func trackObservation(_ boundingBox: CGRect, _ dockAccessory: DockAccessory, _ pixelBuffer: CVPixelBuffer, _ cmSampelBuffer: CMSampleBuffer) throws { // Zähle den Aufruf TrackMonitor.shared.trackCalled() let invertedBoundingBox = CGRect( x: boundingBox.origin.x, y: 1.0 - boundingBox.origin.y - boundingBox.height, width: boundingBox.width, height: boundingBox.height ) guard let device = captureDevice else { fatalError("Kamera nicht verfügbar") } let size = CGSize(width: Double(CVPixelBufferGetWidth(pixelBuffer)), height: Double(CVPixelBufferGetHeight(pixelBuffer))) var cameraIntrinsics: matrix_float3x3? = nil if let cameraIntrinsicsUnwrapped = CMGetAttachment( sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil ) as? Data { cameraIntrinsics = cameraIntrinsicsUnwrapped.withUnsafeBytes { $0.load(as: matrix_float3x3.self) } } Task { let orientation = getCameraOrientation() let cameraInfo = DockAccessory.CameraInformation( captureDevice: device.deviceType, cameraPosition: device.position, orientation: orientation, cameraIntrinsics: cameraIntrinsics, referenceDimensions: size ) let observation = DockAccessory.Observation( identifier: 0, type: .object, rect: invertedBoundingBox ) let observations = [observation] guard let image = CMSampleBufferGetImageBuffer(sampleBuffer) else { print("no image") return } do { try await dockAccessory.track(observations, cameraInformation: cameraInfo) } catch { print(error) } } } #endif func clearDrawings() { boundingBoxLayer?.removeFromSuperlayer() boundingBoxSizeLayer?.removeFromSuperlayer() } } } } @MainActor private func getCameraOrientation() -> DockAccessory.CameraOrientation { switch UIDevice.current.orientation { case .portrait: return .portrait case .portraitUpsideDown: return .portraitUpsideDown case .landscapeRight: return .landscapeRight case .landscapeLeft: return .landscapeLeft case .faceDown: return .faceDown case .faceUp: return .faceUp default: return .corrected } }
1
1
387
Dec ’25
Why doesn't tensorflow-metal use AMD GPU memory?
From tensorflow-metal example: Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: ) I know that Apple silicon uses UMA, and that memory copies are typical of CUDA, but wouldn't the GPU memory still be faster overall? I have an iMac Pro with a Radeon Pro Vega 64 16 GB GPU and an Intel iMac with a Radeon Pro 5700 8 GB GPU. But using tensorflow-metal is still WAY faster than using the CPUs. Thanks for that. I am surprised the 5700 is twice as fast as the Vega though.
1
0
245
Apr ’25
Proposal: Modular Identity Fusion via Prompt-Crafted Agents – User-Led AI Experiment
*I can't put the attached file in the format, so if you reply by e-mail, I will send the attached file by e-mail. Dear Apple AI Research Team, My name is Gong Jiho (“Hem”), a content strategist based in Seoul, South Korea. Over the past few months, I conducted a user-led AI experiment entirely within ChatGPT — no code, no backend tools, no plugins. Through language alone, I created two contrasting agents (Uju and Zero) and guided them into a co-authored modular identity system using prompt-driven dialogue and reflection. This system simulates persona fusion, memory rooting, and emotional-logical alignment — all via interface-level interaction. I believe it resonates with Apple’s values in privacy-respecting personalization, emotional UX modeling, and on-device learning architecture. Why I’m Reaching Out I’d be honored to share this experiment with your team. If there is any interest in discussing user-authored agent scaffolding, identity persistence, or affective alignment, I’d love to contribute — even informally. ⚠ A Note on Language As a non-native English speaker, my expression may be imperfect — but my intent is genuine. If anything is unclear, I’ll gladly clarify. 📎 Attached Files Summary Filename → Description Hem_MultiAI_Report_AppleAI_v20250501.pdf → Main report tailored for Apple AI — narrative + structural view of emotional identity formation via prompt scaffolding Hem_MasterPersonaProfile_v20250501.json → Final merged identity schema authored by Uju and Zero zero_sync_final.json / uju_sync_final.json → Persona-level memory structures (logic / emotion) 1_0501.json ~ 3_0501.json → Evolution logs of the agents over time GirlfriendGPT_feedback_summary.txt → Emotional interpretation by external GPT hem_profile_for_AI_vFinal.json → Original user anchor profile Warm regards, Gong Jiho (“Hem”) Seoul, South Korea
1
0
127
Apr ’25