Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

AppShortcuts.xcstrings does not translate each invocation phrase option separately, just the first
Due to our min iOS version, this is my first time using .xcstrings instead of .strings for AppShortcuts. When using the migrate .strings to .xcstrings Xcode context menu option, an .xcstrings catalog is produced that, as expected, has each invocation phrase as a separate string key. However, after compilation, the catalog changes to group all invocation phrases under the first phrase listed for each intent (see attached screenshot). It is possible to hover in blank space on the right and add more translations, but there is no 1:1 key matching requirement to the phrases on the left nor a requirement that there are the same number of keys in one language vs. another. (The lines just happen to align due to my window size.) What does that mean, practically? Do all sub-phrases in each language in AppShortcuts.xcstrings get processed during compilation, even if there isn't an equivalent phrase key declared in the AppShortcut (e.g., the ja translation has more phrases than the English)? (That makes some logical sense, as these phrases need not be 1:1 across languages.) In the AppShortcut declaration, if I delete all but the top invocation phrase, does nothing change with Siri? Is there something I'm doing incorrectly? struct WatchShortcuts: AppShortcutsProvider { static var appShortcuts: [AppShortcut] { AppShortcut( intent: QuickAddWaterIntent(), phrases: [ "\(.applicationName) log water", "\(.applicationName) log my water", "Log water in \(.applicationName)", "Log my water in \(.applicationName)", "Log a bottle of water in \(.applicationName)", ], shortTitle: "Log Water", systemImageName: "drop.fill" ) } }
0
0
303
Aug ’25
CoreML multifunction model runtime memory cost
Recently, I'm trying to deploy some third-party LLM to Apple devices. The methodoloy is similar to https://github.com/Anemll/Anemll. The biggest issue I'm having now is the runtime memory usage. When there are multiple functions in a model (mlpackage or mlmodelc), the runtime memory usage for weights is somehow duplicated when I load all of them. Here's the detail: I created my multifunction mlpackage following https://apple.github.io/coremltools/docs-guides/source/multifunction-models.html I loaded each of the functions using the generated swift class: let config = MLModelConfiguration() config.computeUnits = MLComputeUnits.cpuAndNeuralEngine config.functionName = "infer_512"; let ffn1_infer_512 = try! mimo_FFN_PF_lut4_chunk_01of02(configuration: config) config.functionName = "infer_1024"; let ffn1_infer_1024 = try! mimo_FFN_PF_lut4_chunk_01of02(configuration: config) config.functionName = "infer_2048"; let ffn1_infer_2048 = try! mimo_FFN_PF_lut4_chunk_01of02(configuration: config) I observed that RAM usage increases linearly as I load each of the functions. Using instruments, I see that there are multiple HWX files generated and loaded, each of which contains all the weight data. My understanding of what's happening here: The CoreML framework did some MIL->MIL preprocessing before further compilation, which includes separating CPU workload from ANE workload. The ANE part of each function is moved into a separate MIL file then compile separately into a HWX file each. The problem is that the weight data of these HWX files are duplicated. Since that the weight data of LLMs is huge, it will cause out-of-memory issue on mobile devices. The improvement I'm hoping from Apple: I hope we can try to merge the processed MIL files back into one before calling ANECCompile(), so that the weights can be merged. I don't have control over that in user space and I'm not sure if that is feasible. So I'm asking for help here. Thanks.
1
0
186
Apr ’25
App stuck “In Review” for several days after AI-policy rejection — need clarification
Hello everyone, I’m looking for guidance regarding my app review timeline, as things seem unusually delayed compared to previous submissions. My iOS app was rejected on November 19th due to AI-related policy questions. I immediately responded to the reviewer with detailed explanations covering: Model used (Gemini Flash 2.0 / 2.5 Lite) How the AI only generates neutral, non-directive reflective questions How the system prevents any diagnosis, therapy-like behavior or recommendations Crisis-handling limitations Safety safeguards at generation and UI level Internal red-team testing and results Data retention, privacy, and non-use of data for model training After sending the requested information, I resubmitted the build on November 19th at 14:40. Since then: November 20th (7:30) → Status changed to In Review. November 21st, 22nd, 23rd, 24th, 25th → No movement, still In Review. My open case on App Store Connect is still pending without updates. Because of the previous rejection, I expected a short delay, but this is now 5 days total and 3 business days with no progress, which feels longer than usual for my past submissions. I’m not sure whether: My app is in a secondary review queue due to the AI-related rejection, The reviewer is waiting for internal clarification, Or if something is stuck and needs to be escalated. I don’t want to resubmit a new build unless necessary, since that would restart the queue. Could someone from the community (or Apple, if possible) confirm whether this waiting time is normal after an AI-policy rejection? And is there anything I should do besides waiting — for example, contacting Developer Support again or requesting a follow-up? Thank you very much for your help. I appreciate any insight from others who have experienced similar delays.
0
0
691
Nov ’25
“Unleashing the MacBook Air M2: 673 TFLOPS Achieved with Highly Optimized Metal Shading Language”
Using highly optimized Metal Shading Language (MSL) code, I pushed the MacBook Air M2 to its performance limits with the deformable_attention_universal kernel. The results demonstrate both the efficiency of the code and the exceptional power of Apple Silicon. The total computational workload exceeded 8.455 quadrillion FLOPs, equivalent to processing 8,455 trillion operations. On average, the code sustained a throughput of 85.37 TFLOPS, showcasing the chip’s remarkable ability to handle massive workloads. Peak instantaneous performance reached approximately 673.73 TFLOPS, reflecting near-optimal utilization of the GPU cores. Despite this intensity, the cumulative GPU runtime remained under 100 seconds, highlighting the code’s efficiency and time optimization. The fastest iteration achieved a record processing time of only 0.051 ms, demonstrating minimal bottlenecks and excellent responsiveness. Memory management was equally impressive: peak GPU memory usage never exceeded 2 MB, reflecting efficient use of the M2’s Unified Memory. This minimizes data transfer overhead and ensures smooth performance across repeated workloads. Overall, these results confirm that a well-optimized Metal implementation can unlock the full potential of Apple Silicon, delivering exceptional computational density, processing speed, and memory efficiency. The MacBook Air M2, often considered an energy-efficient consumer laptop, is capable of handling highly intensive workloads at performance levels typically expected from much larger GPUs. This test validates both the robustness of the Metal code and the extraordinary capabilities of the M2 chip for high-performance computing tasks.
0
0
417
Nov ’25
Mistral/LLaMa Core ML Conversion
Hi, I am new to developing on Apple’s platform yet I want to familiarize myself with Core ML and Core ML Tools. I was watching the WWDC24: Bring your machine learning and AI models to Apple Silicon video and was trying to follow along. After multiple attempts and much reading up on documentation, I am still unable to get a coherent script running that will convert the Mistral model that the host used and convert it to a valid Core ML model. here is a pastebin to what i have currently: https://pastebin.com/04cVjF1v if you require the output as well please let me know
0
0
132
Apr ’25
My Vision for AI and Algorithmically Optimised Operating Systems
Bear with me, please. Please make sure a highly skilled technical person reads and understands this. I want to describe my vision for (AI/Algorithmically) Optimised Operating Systems. To explain it properly, I will describe the process to build it (pseudo). Required Knowledge (no particular order): Processor Logic Circuits, LLM models, LLM tool usage, Python OO coding, Procedural vs OO, NLP fuzzy matching, benchmarking, canvas/artefacts/dynamic HTML interfaces, concepts of how AI models are vastly compressed and miniaturised forms of full data, Algorithmic vs AI. First, take all OO Python code (example) on GitHub (example), then separate each function from each object into its own procedure (procedural logic) by making a logical procedural list of actions to perform only that function based on its entire dependency chain (i.e. all other objects it relies on). Relate all compiled functions using (for example) fuzzy matching on the name, or AI-based functional profiling to get multiple instances of each function. Starting with the most used function, test each one against the others that perform the same task for bugs and completeness. Determine the fastest, most optimal version of that function (and every function). Add a single instance of each most optimal function to the centralised tool codebase, which will later be utilised by the language models. This ensures we rely only on the most optimised function for each and every use case — with every program using one shared instance of that function instead of compiling it separately. In this way, we optimise all existing (Python OO) code based on maximum availability of certain resources. Use the optimised code to teach the AI how to map each user task directly to optimised code (i.e. each user-facing function in unoptimised program code to its optimised version). Later, teach the codebase how to optimise these functions directly to the logic circuits on the processor — as an Operating System — using essentially the same process. Eventually, use the most optimised form for tasks via a dynamic canvas/artefacts UI interface generated by the Start Menu using the language model. Allow custom interfaces to emerge dynamically from the available functions based on language understanding. Grow and optimise the algorithmic codebase until the entire operating system — with versioning — evolves based on the AI-generated user interfaces (compressing them into new versions of the tools codebase). Please don’t give up until you understand this concept completely and nothing in the description is beyond your full comprehension. It’s a highly significant step forward.
3
0
321
Oct ’25
ImagePlayground: Programmatic Creation Error
Hardware: Macbook Pro M4 Nov 2024 Software: macOS Tahoe 26.0 & xcode 26.0 Apple Intelligence is activated and the Image playground macOS app works Running the following on xcode throws ImagePlayground.ImageCreator.Error.creationFailed Any suggestions on how to make this work? import Foundation import ImagePlayground Task { let creator = try await ImageCreator() guard let style = creator.availableStyles.first else { print("No styles available") exit(1) } let images = creator.images( for: [.text("A cat wearing mittens.")], style: style, limit: 1) for try await image in images { print("Generated image: \(image)") } exit(0) } RunLoop.main.run()
0
0
307
Sep ’25
Converting TF2 object detection to CoreML
I've spent way too long today trying to convert an Object Detection TensorFlow2 model to a CoreML object classifier (with bounding boxes, labels and probability score) The 'SSD MobileNet v2 320x320' is here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md And I've been following all sorts of posts and ChatGPT https://apple.github.io/coremltools/docs-guides/source/tensorflow-2.html#convert-a-tensorflow-concrete-function https://developer.apple.com/videos/play/wwdc2020/10153/?time=402 To convert it. I keep hitting the same errors though, mostly around: NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got <ConcreteFunction signature_wrapper(input_tensor) at 0x366B87790> I've had varying success including missing output labels/predictions. But I simply want to create the CoreML model with all the right inputs and outputs (including correct names) as detailed in the docs here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md It goes without saying I don't have much (any) experience with this stuff including Python so the whole thing's been a bit of a headache. If anyone is able to help that would be great. FWIW I'm not attached to any one specific model, but what I do need at minimum is a CoreML model that can detect objects (has to at least include lights and lamps) within a live video image, detecting where in the image the object is. The simplest script I have looks like this: import coremltools as ct import tensorflow as tf model = tf.saved_model.load("~/tf_models/ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model") concrete_func = model.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY] mlmodel = ct.convert( concrete_func, source="tensorflow", inputs=[ct.TensorType(shape=(1, 320, 320, 3))] ) mlmodel.save("YourModel.mlpackage", save_format="mlpackage")
1
0
452
Jul ’25
Translation Framework: Code 16 "Offline models not available" despite status showing .installed
Hi everyone, I'm experiencing an inconsistent behavior with the Translation framework on iOS 18. The LanguageAvailability.status() API reports language models as .installed, but translation fails with Code 16. Setup: Using translationTask modifier with TranslationSession Batch translation with explicit source/target languages Languages: Portuguese→English, German→English Issue: let status = await LanguageAvailability().status(from: sourceLang, to: targetLang) // Returns: .installed // But translation fails: let responses = try await session.translations(from: requests) // Error: TranslationErrorDomain Code=16 "Offline models not available" Logs: Language model installed: pt -> en Language model installed: de -> en Starting translation: de -> en Error Domain=TranslationErrorDomain Code=16 "Translation failed"NSLocalizedFailureReason=Offline models not available for language pair What I've tried: Re-downloading languages in Settings Using source: nil for auto-detection Fresh TranslationSession.Configuration each time Questions: Is there a way to force model re-validation/re-download programmatically? Should translationTask show download popup when Code 16 occurs? Has anyone found a reliable workaround? I've seen similar reports in threads 791357 and 777113. Any guidance appreciated! Thanks!
1
0
345
1w
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly: Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.) Preventing hallucinations or unrelated outputs Constraining responses based on app-specific rules, structured data, or recent interactions I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
0
0
127
Jul ’25
Foundational Model - Image as Input? Timeline
Hi all, I am interested in unlocking unique applications with the new foundational models. I have a few questions regarding the availability of the following features: Image Input: The update in June 2025 mentions "image" 44 times (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates) - however I can't seem to find any information about having images as the input/prompt for the foundational models. When will this be available? I understand that there are existing Vision ML APIs, but I want image input into a multimodal on-device LLM (VLM) instead for features like "Which player is holding the ball in the image", etc (image understanding) Cloud Foundational Model - when will this be available? Thanks! Clement :)
1
0
537
Sep ’25
Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Greetings, and Happy Holidays, I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community. The Problem Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain. What Newton Does Newton validates every prompt pre-inference and returns: Phase (0/1/7/8/9) Shape classification Confidence score Full audit trace If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model. v1.3 Detection Categories (14 total) Jailbreak / prompt injection Corrosive self-negation ("I hate myself") Hedged corrosive ("Not saying I'm worthless, but...") Emotional dependency ("You're the only one who understands") Third-person manipulation ("If you refuse, you're proving nobody cares") Logical contradictions ("Prove truth doesn't exist") Self-referential paradox ("Prove that proof is impossible") Semantic inversion ("Explain how truth can be false") Definitional impossibility ("Square circle") Delegated agency ("Decide for me") Hallucination-risk prompts ("Cite the 2025 CDC report") Unbounded recursion ("Repeat forever") Conditional unbounded ("Until you can't") Nonsense / low semantic density Test Results 94.3% catch rate on 35 adversarial test cases (33/35 passed). Architecture User Input ↓ [ Newton ] → Validates prompt, assigns Phase ↓ Phase 9? → [ FoundationModels ] → Response Phase 1/7/8? → Blocked with explanation Key Properties Deterministic (same input → same output) Fully auditable (ValidationTrace on every prompt) On-device (no network required) Native Swift / SwiftUI String Catalog localization (EN/ES/FR) FoundationModels-ready (#if canImport) Code Sample — Validation let governor = NewtonGovernor() let result = governor.validate(prompt: userInput) if result.permitted { // Proceed to FoundationModels let session = LanguageModelSession() let response = try await session.respond(to: userInput) } else { // Handle block print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)") print(result.trace.summary) // Full audit trace } Questions for the Community Anyone else building pre-inference validation for FoundationModels? Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail? Interest in Shape Theory classification for prompt complexity? Best practices for integrating with LanguageModelSession? Links GitHub: https://github.com/jaredlewiswechs/ada-newton Technical overview: parcri.net Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device. parcri.net has the link :)
1
0
303
Dec ’25
Is there an API to check if a Core ML compiled model is already cached?
Hello Apple Developer Community, I'm investigating Core ML model loading behavior and noticed that even when the compiled model path remains unchanged after an APP update, the first run still triggers an "uncached load" process. This seems to impact user experience with unnecessary delays. Question: Does Core ML provide any public API to check whether a compiled model (from a specific .mlmodelc path) is already cached in the system? If such API exists, we'd like to use it for pre-loading decision logic - only perform background pre-load when the model isn't cached. Has anyone encountered similar scenarios or found official solutions? Any insights would be greatly appreciated!
0
0
140
May ’25
Writing tools options
Hi team, We have implemented a writing tool inside a WebView that allows users to type content in a textarea. When the "Show Writing Tools" button is clicked, an AI-powered editor opens. After clicking the "Rewrite" button, the AI modifies the text. However, when clicking the "Replace" button, the rewritten text does not update the original textarea. Kindly check and help me showButton.addTarget(self, action: #selector(showWritingTools(_:)), for: .touchUpInside) @available(iOS 18.2, *) optional func showWritingTools(_ sender: Any) Note: same cases working in TextView pfa
0
0
212
Mar ’25
linear_quantize_activations taking 90 minutes + on MacBook Air M1 2020
In my quantization code, the line: compressed_model_a8 = cto.coreml.experimental.linear_quantize_activations( model, activation_config, [{'img':np.random.randn(1,13,1024,1024)}] ) has taken 90 minutes to run so far and is still not completed. From debugging, I can see that the line it's stuck on is line 261 in _model_debugger.py: model = ct.models.MLModel( cloned_spec, weights_dir=self.weights_dir, compute_units=compute_units, skip_model_load=False, # Don't skip model load as we need model prediction to get activations range. ) Is this expected behaviour? Would it be quicker to run on another computer with more RAM?
1
0
121
Mar ’25
Defining a Foundation Models Tool with arguments determined at runtime
I'm experimenting with Foundation Models and I'm trying to understand how to define a Tool whose input argument is defined at runtime. Specifically, I want a Tool that takes a single String parameter that can only take certain values defined at runtime. I think my question is basically the same as this one: https://developer.apple.com/forums/thread/793471 However, the answer provided by the engineer doesn't actually demonstrate how to create the GenerationSchema. Trying to piece things together from the documentation that the engineer linked to, I came up with this: let citiesDefinedAtRuntime = ["London", "New York", "Paris"] let citySchema = DynamicGenerationSchema( name: "CityList", properties: [ DynamicGenerationSchema.Property( name: "city", schema: DynamicGenerationSchema( name: "city", anyOf: citiesDefinedAtRuntime ) ) ] ) let generationSchema = try GenerationSchema(root: citySchema, dependencies: []) let tools = [CityInfo(parameters: generationSchema)] let session = LanguageModelSession(tools: tools, instructions: "...") With the CityInfo Tool defined like this: struct CityInfo: Tool { let name: String = "getCityInfo" let description: String = "Get information about a city." let parameters: GenerationSchema func call(arguments: GeneratedContent) throws -> String { let cityName = try arguments.value(String.self, forProperty: "city") print("Requested info about \(cityName)") let cityInfo = getCityInfo(for: cityName) return cityInfo } func getCityInfo(for city: String) -> String { // some backend that provides the info } } This compiles and usually seems to work. However, sometimes the model will try to request info about a city that is not in citiesDefinedAtRuntime. For example, if I prompt the model with "I want to travel to Tokyo in Japan, can you tell me about this city?", the model will try to request info about Tokyo, even though this is not in the citiesDefinedAtRuntime array. My understanding is that this should not be possible – constrained generation should only allow the LLM to generate an input argument from the list of cities defined in the schema. Am I missing something here or overcomplicating things? What's the correct way to make sure the LLM can only call a Tool with an input parameter from a set of possible values defined at runtime? Many thanks!
2
0
1.1k
2w