Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

Safety Guardrail errors for tiny prompt (dropped into large app)
I was able to open a new project and play around with the Foundation Model, but when I dropped this class into a production app (with a lot of files), I started running into Safety Guardrail errors for this very small prompt. Specifically, the error is "Safety guardrail was triggered after consecutive failures during streaming." Does it have something to do with the size of the app? I don't know what else to try to get it to work.
    import FoundationModels
    import Playgrounds
    @available(iOS 26.0, *)
    #Playground {
        Task {
            do {
                let session = LanguageModelSession()
                let prompt = "Write a short story about a talking cat."
                let response = try await session.respond(to: prompt)
                print(response)
            } catch {
                print("Error: \(error)")
            }
        }
    }
3
2
260
Jun ’25
Overly strict foundation model rate limit when used in app extension
I am calling into an app extension from a Safari Web Extension (sendNativeMessage, which in turn results in a call to NSExtensionRequestHandling’s beginRequest). My Safari extension aims to make use of the new foundation models for some of the features it provides. In my testing, I hit the rate limit by sending 4 requests, waiting 30 seconds between each. This makes the FoundationModels framework (which would otherwise serve my use case perfectly well) unusable in this context, because the model is called in response to user input, and this rate of user input is perfectly plausible in a real-world scenario. The error thrown as a result of the rate limit is "Safety guardrail was triggered after consecutive failures during streaming.", but the system logs in Console.app show the rate limit as the real culprit. My suggestions:
- Please introduce sensible rate limits for app extensions, through an entitlement if need be. A limit of 1 request every couple of seconds would already fix the issue for me.
- Please document the rate limit.
- Please make the thrown error reflect that it is the result of a rate limit and not a generic guardrail violation.
- IMPORTANT: please indicate in the thrown error when it is safe to try again.
Filed a feedback here: FB18332004
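Until the framework reports when it is safe to try again, one client-side mitigation might be to retry with exponential backoff. The following is a minimal sketch, not framework API: the helper name `retrying`, its parameters, and the delay values are all made up for illustration. Because the post reports that the rate limit surfaces as a guardrail message, the `isRetryable` predicate may not be able to tell the two apart in practice.

```swift
import Foundation

/// Retries `operation` with exponential backoff.
/// `maxAttempts`, `initialDelay`, and `isRetryable` are illustrative names, not framework API.
func retrying<T>(
    maxAttempts: Int = 4,
    initialDelay: TimeInterval = 2,
    isRetryable: (Error) -> Bool = { _ in true },
    operation: () async throws -> T
) async throws -> T {
    precondition(maxAttempts >= 1, "need at least one attempt")
    var delay = initialDelay
    var attempt = 1
    while true {
        do {
            return try await operation()
        } catch {
            // Give up on the last attempt, or on errors the caller marks as permanent.
            if attempt >= maxAttempts || !isRetryable(error) { throw error }
            // Wait, then double the delay for the next round.
            try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))
            delay *= 2
            attempt += 1
        }
    }
}

// Hypothetical usage with a FoundationModels session:
// let reply = try await retrying { try await session.respond(to: prompt) }
```

Backoff only spaces requests out; it does not raise the limit, so a user-facing "try again shortly" state is probably still needed.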
3
1
212
Jun ’25
Shortcut - “Use Model” error handling?
I have a series of shortcuts that I’ve written that use the “Use Model” action to do various things. For example, I have a shortcut “Clipboard Markdown to Notes” that takes the content of the clipboard, creates a new note in Notes, converts the markdown content to rich text, adds it to the note, etc. One key step is to analyze the markdown content with “Use Model” and generate a short descriptive title for the note. I use the on-device model for this, but sometimes the content and prompt exceed the context window size and the action fails with an error message to that effect. In that case, I’d like to either repeat the action using the Cloud model, or, if the error was a refusal, to prompt the user to enter a title to use. I’ve tried using an IF based on whether the response had any text in it, but that didn’t work. No matter what I’ve tried, I can’t seem to find a way to catch the error from Use Model, determine what the error was, and take appropriate action. Is there a way to do this? (And by the way, a huge “thank you” to whoever had the idea of making AppIntents visible in Shortcuts and adding the Use Model action. It has made a huge difference already, and it lets us see what Siri will be able to use as well.)
3
0
445
5d
My Vision for AI and Algorithmically Optimised Operating Systems
Bear with me, please. Please make sure a highly skilled technical person reads and understands this. I want to describe my vision for (AI/Algorithmically) Optimised Operating Systems. To explain it properly, I will describe the process to build it (pseudo). Required Knowledge (no particular order): Processor Logic Circuits, LLM models, LLM tool usage, Python OO coding, Procedural vs OO, NLP fuzzy matching, benchmarking, canvas/artefacts/dynamic HTML interfaces, concepts of how AI models are vastly compressed and miniaturised forms of full data, Algorithmic vs AI. First, take all OO Python code (example) on GitHub (example), then separate each function from each object into its own procedure (procedural logic) by making a logical procedural list of actions to perform only that function based on its entire dependency chain (i.e. all other objects it relies on). Relate all compiled functions using (for example) fuzzy matching on the name, or AI-based functional profiling to get multiple instances of each function. Starting with the most used function, test each one against the others that perform the same task for bugs and completeness. Determine the fastest, most optimal version of that function (and every function). Add a single instance of each most optimal function to the centralised tool codebase, which will later be utilised by the language models. This ensures we rely only on the most optimised function for each and every use case — with every program using one shared instance of that function instead of compiling it separately. In this way, we optimise all existing (Python OO) code based on maximum availability of certain resources. Use the optimised code to teach the AI how to map each user task directly to optimised code (i.e. each user-facing function in unoptimised program code to its optimised version). 
Later, teach the codebase how to optimise these functions directly to the logic circuits on the processor — as an Operating System — using essentially the same process. Eventually, use the most optimised form for tasks via a dynamic canvas/artefacts UI interface generated by the Start Menu using the language model. Allow custom interfaces to emerge dynamically from the available functions based on language understanding. Grow and optimise the algorithmic codebase until the entire operating system — with versioning — evolves based on the AI-generated user interfaces (compressing them into new versions of the tools codebase). Please don’t give up until you understand this concept completely and nothing in the description is beyond your full comprehension. It’s a highly significant step forward.
3
0
321
Oct ’25
Failing to run SystemLanguageModel inference with custom adapter
Hi, I have trained a basic adapter using the adapter training toolkit. I am trying a very basic example of loading it and running inference with it, but am getting the following error:
    Passing along InferenceError::inferenceFailed::loadFailed::Error Domain=com.apple.TokenGenerationInference.E5Runner Code=0 "Failed to load model: ANE adapted model load failure: createProgramInstanceWithWeights:modelToken:qos:baseModelIdentifier:owningPid:numWeightFiles:error:: Program load new instance failure (0x170006)." UserInfo={NSLocalizedDescription=Failed to load model: ANE adapted model load failure: createProgramInstanceWithWeights:modelToken:qos:baseModelIdentifier:owningPid:numWeightFiles:error:: Program load new instance failure (0x170006).} in response to ExecuteRequest
Any ideas / direction? For testing I am including the .fmadapter file inside the app bundle. This is where I load it:
    @State private var session: LanguageModelSession? // = LanguageModelSession()
    func loadAdapter() async throws {
        if let assetURL = Bundle.main.url(forResource: "qasc---afm---4-epochs-adapter", withExtension: "fmadapter") {
            print("Asset URL: \(assetURL)")
            let adapter = try SystemLanguageModel.Adapter(fileURL: assetURL)
            let adaptedModel = SystemLanguageModel(adapter: adapter)
            session = LanguageModelSession(model: adaptedModel)
            print("Loaded adapter and updated session")
        } else {
            print("Asset not found in the main bundle.")
        }
    }
This seems to work fine, as I get to the log "Loaded adapter and updated session". However, when the inference code below runs, I get the aforementioned error:
    func sendMessage(_ msg: String) {
        self.loading = true
        if let session = session {
            Task {
                do {
                    let modelResponse = try await session.respond(to: msg)
                    DispatchQueue.main.async {
                        self.response = modelResponse.content
                        self.loading = false
                    }
                } catch {
                    print("Error: \(error)")
                    DispatchQueue.main.async {
                        self.loading = false
                    }
                }
            }
        }
    }
3
0
221
Jun ’25
Stream response
With respond() methods, the foundation model works well enough. With streamResponse() methods, the responses are very repetitive, verbose, and messy. My app with foundation model uses more than 500 MB memory on an iPad Pro when running from Xcode. Devices supporting Apple Intelligence have at least 8GB memory. Should Apple use a bigger model (using 3 ~ 4 GB memory) for better stream responses?
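One possible cause of repetitive streamed output, assuming each stream element is a cumulative snapshot of the response so far rather than a delta: appending every element to the displayed text repeats the earlier text each time. The sketch below only simulates that behavior with an `AsyncStream`; `simulatedSnapshots` and `render` are made-up names, and no FoundationModels call is involved.

```swift
// Simulated stand-in for a response stream that yields cumulative snapshots.
// This is NOT the FoundationModels API; it only illustrates the accumulation pitfall.
func simulatedSnapshots() -> AsyncStream<String> {
    AsyncStream { continuation in
        for snapshot in ["Once", "Once upon", "Once upon a time"] {
            continuation.yield(snapshot)
        }
        continuation.finish()
    }
}

/// Consumes the stream two ways and returns both results for comparison.
func render() async -> (appended: String, latest: String) {
    var appended = ""   // wrong for snapshots: every element repeats the earlier text
    var latest = ""     // right for snapshots: each element replaces the previous one
    for await snapshot in simulatedSnapshots() {
        appended += snapshot
        latest = snapshot
    }
    return (appended, latest)
}
```

With the three simulated snapshots, `appended` ends up as "OnceOnce uponOnce upon a time" while `latest` is "Once upon a time". If the app already replaces the text on each element, this is not the cause.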
2
0
268
Jul ’25
Problem running NLContextualEmbeddingModel in simulator
Environment
macOS 26
Xcode Version 26.0 beta 7 (17A5305k)
Simulator: iPhone 16 Pro
iOS: iOS 26
Problem
NLContextualEmbedding.load() fails with the following error
In simulator
    Failed to load embedding from MIL representation: filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"]
    filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"]
    Failed to load embedding model 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289'
    assetRequestFailed(Optional(Error Domain=NLNaturalLanguageErrorDomain Code=7 "Embedding model requires compilation" UserInfo={NSLocalizedDescription=Embedding model requires compilation}))
in #Playground
I'm new to this embedding model. Not sure if it's caused by my code or environment.
Code snippet
    import Foundation
    import NaturalLanguage
    import Playgrounds
    #Playground {
        // Prefer initializing by script for broader coverage; returns NLContextualEmbedding?
        guard let embeddingModel = NLContextualEmbedding(script: .latin) else {
            print("Failed to create NLContextualEmbedding")
            return
        }
        print(embeddingModel.hasAvailableAssets)
        do {
            try embeddingModel.load()
            print("Model loaded")
        } catch {
            print("Failed to load model: \(error)")
        }
    }
2
2
1.2k
2w
Foundation model adapter assets are invalid
I've tried creating a LoRA adapter using the example dataset and scripts that are part of adapter_training_toolkit_v26_0_0 (the latest available) on macOS 26 Beta 6.
    import SwiftUI
    import FoundationModels
    import Playgrounds
    #Playground {
        // The absolute path to your adapter.
        let localURL = URL(filePath: "/Users/syl/Downloads/adapter_training_toolkit_v26_0_0/train/test-lora.fmadapter")
        // Initialize the adapter by using the local URL.
        let adapter = try SystemLanguageModel.Adapter(fileURL: localURL)
        // An instance of the system language model using your adapter.
        let customAdapterModel = SystemLanguageModel(adapter: adapter)
        // Create a session and prompt the model.
        let session = LanguageModelSession(model: customAdapterModel)
        let response = try await session.respond(to: "hello")
    }
I get an "Adapter assets are invalid" error. I've added the entitlements. Is adapter_training_toolkit_v26_0_0 up to date?
2
0
240
Aug ’25
Apple's Illusion of Thinking paper and Path to Real AI Reasoning
Hey everyone, I'm Manish Mehta, field CTO at Centific. I recently read Apple's white paper, The Illusion of Thinking, and it got me thinking about the current state of AI reasoning. Who here has read it? The paper highlights how LLMs often rely on pattern recognition rather than genuine understanding. When faced with complex tasks, their performance can degrade significantly. I was thinking that to move beyond this problem, we need to explore approaches that combine Deeper Reasoning Architectures for true cognitive capability with Deep Human Partnership to guide AI toward better judgment and understanding. The first part means fundamentally rewiring AI to reason. This involves advancing deeper architectures like World Models, which can build internal simulations to understand real-world scenarios, and Neurosymbolic systems, which combine neural networks with symbolic reasoning for deeper self-verification. Additionally, we need to look at deep human partnership and scalable oversight. An AI cannot learn certain things from data alone; it lacks real-world judgment. Among other things, deep domain expert human partners are needed to instill this wisdom, validate the AI's entire reasoning process, build its ethical guardrails, and act as skilled adversaries to find hidden flaws before they can cause harm. What do you all think? Is this focus on a deeper partnership between advanced AI reasoning and deep human judgment the right path forward? Agree? Disagree? Thanks
2
0
291
Jul ’25
Compatibility issue of TensorFlow-metal with PyArrow
Overview
I'm experiencing a critical issue where TensorFlow-metal and PyArrow seem to be incompatible when installed together in the same environment. Whenever both packages are present, TensorFlow crashes and the kernel dies during execution.
Environment Details
macOS Version: 15.3.2
Mac Model: MacBook Pro (M3 Max)
Python Version: 3.11
TensorFlow Version: 2.19
PyArrow Version: 19.0.0
Issue Description
When both TensorFlow-metal and PyArrow are installed in the same Python environment, any attempt to use TensorFlow results in immediate kernel crashes. The issue appears to be a compatibility problem between these two packages rather than a problem with either package individually.
Steps to Reproduce
1. Create a new Python environment: conda create -n tf-metal python=3.11
2. Install TensorFlow-metal: pip install tensorflow tensorflow-metal
3. Install PyArrow: pip install pyarrow
4. Run the following minimal example:
    import numpy as np
    import tensorflow as tf
    # Create a simple model
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(2,)),
        tf.keras.layers.Dense(1)
    ])
    model.compile(optimizer='adam', loss='mse')
    model.summary()  # This works fine
    # Generate some dummy data
    X = np.random.random((100, 2))
    y = np.random.random((100, 1))
    # The crash happens exactly at this line
    model.fit(X, y, epochs=5, batch_size=32)  # CRASH: Kernel dies here
Result: Kernel crashes with no error message
What I've Tried
- Reinstalling both packages in different orders
- Using different versions of both packages
- Creating isolated environments
- Checking system logs for additional error information
The only workaround I've found is to use separate environments for each package, which isn't practical for my workflow, as I need both libraries for my data processing and machine learning pipeline.
Questions
- Has anyone else encountered this specific compatibility issue?
- Are there known workarounds that allow both packages to coexist?
- Is this a known issue that's being addressed in upcoming releases?
Any insights, suggestions, or assistance would be greatly appreciated. I'm happy to provide any additional information that might help diagnose this problem. Thank you in advance for your help!
2
0
127
May ’25
A specific mlmodelc model runs on iPhone 15, but not on iPhone 16
As described in the title, the model I built works fully on iPhone 15 / A16 Bionic, but it does not run on iPhone 16 / A18 and fails with the following error message:
    E5RT encountered an STL exception. msg = MILCompilerForANE error: failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED.
    E5RT: MILCompilerForANE error: failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED (11)
While loading the model, the app consumes 1.5 to 1.6 GB of RAM; consumption then drops to less than 100 MB on both iPhone 15 and iPhone 16. After that, on iPhone 16 only, the above error appears in the Xcode log, memory consumption surges to 5 to 6 GB, and the system kills the app. It works well only on iPhone 15. The model was built with Core ML Tools. So far I have tried deployment targets from iOS 16 to iOS 18 and the compute units CPU_AND_NE and ALL, but none of these has solved the issue. What kind of fix should I try?
    minimum_deployment_target = ct.target.iOS18
    compute_units = ct.ComputeUnit.ALL
    compute_precision = ct.precision.FLOAT16
2
0
199
May ’25
Guardrail configuration options?
Is anything configurable for LanguageModelSession.Guardrails besides the default? I'm prototyping a camping app, and it's constantly slamming into guardrail errors when I use the new foundation model interface. Any subjects relating to fishing, survival, etc. won't generate. For example the prompt "How can I kill deer ticks using a clothing treatment?" returns a generation error. The results that I get are great when it works, but so far the local model sessions are extremely unreliable.
2
2
240
Jul ’25
Restricting App Installation to Devices Supporting Apple Intelligence Without Triggering Game Mode
Hello, My app fully relies on the new Foundation Models. Since Foundation Models require Apple Intelligence, I want to ensure that only devices capable of running Apple Intelligence can install my app. When checking the UIRequiredDeviceCapabilities property for a suitable value, I found that iphone-performance-gaming-tier seems the closest match. Based on my research: On iPhone, this effectively limits installation to iPhone 15 Pro or later. On iPad, it ensures M1 or newer devices. This exactly matches the hardware requirements for Apple Intelligence. However, after setting iphone-performance-gaming-tier, I noticed that on iPad, Game Mode (Game Overlay) is automatically activated, and my app is treated as a game. My questions are: Is there a more appropriate UIRequiredDeviceCapabilities value that would enforce the same Apple Intelligence hardware requirements without triggering Game Mode? If not, is there another way to restrict installation to devices meeting Apple Intelligence requirements? Is there a way to prevent Game Mode from appearing for my app while still using this capability restriction? Thanks in advance for your help.
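This does not answer the install-time question, but a runtime check is one way to handle devices that get past whatever install restriction is chosen. The sketch below switches over `SystemLanguageModel.default.availability`; the function name and the message strings are placeholders, and the availability annotation is an assumption about the deployment targets.

```swift
#if canImport(FoundationModels)
import FoundationModels

/// Returns a user-facing explanation when the on-device model cannot be used,
/// or nil when it is available. The message strings are placeholders.
@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
func modelUnavailableMessage() -> String? {
    switch SystemLanguageModel.default.availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "This device does not support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use this feature."
    case .unavailable(.modelNotReady):
        return "The model is not ready yet. Try again later."
    case .unavailable:
        // Any other reason the framework reports.
        return "The model is currently unavailable."
    @unknown default:
        return "The model is currently unavailable."
    }
}
#endif
```

A runtime check also covers eligible hardware where Apple Intelligence is switched off, which no device capability key can express.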
2
0
446
Aug ’25
Max tokens for Foundation Models
Do we know what a safe max token limit is? After some iterating, I have come to believe 4096 might be the limit on device. Could you help me out by answering any of these questions: Is 4096 the correct limit? Do all devices have the same limit? Will the limit change over time or by device? The errors I get when going over the limit do not seem to say, hey you are over, so it's just by trial and error that I figure these issues out. Thanks for the fun new toys. Regards, Rob
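On detecting the overflow itself: `LanguageModelSession.GenerationError` has an `exceededContextWindowSize` case (the one named in another post title in this listing), so the condition can be caught separately from other failures. A minimal sketch, assuming that dropping the earlier conversation and retrying once in a fresh session is acceptable for the app; the function name is made up.

```swift
#if canImport(FoundationModels)
import FoundationModels

/// Sends a prompt and, if the session's context window overflows, retries once
/// in a fresh session. Starting over discards the earlier conversation; that
/// trade-off is an assumption of this sketch.
@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
func respond(to prompt: String, using session: LanguageModelSession) async throws -> String {
    do {
        return try await session.respond(to: prompt).content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // The transcript plus this prompt no longer fits; retry without history.
        let fresh = LanguageModelSession()
        return try await fresh.respond(to: prompt).content
    }
}
#endif
```

If the prompt alone is too large, the retry fails with the same error, so long inputs still need to be shortened or split first.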
2
0
261
Jul ’25
Correct JSON format for CoreMotion data for ActivityClassification purposes
I’m developing an activity classifier, and I’d like to supply its input as CoreMotion data in JSON format. I am getting the error:
    Unable to parse /Users/DewG/Downloads/Testing/Step1/Testing.json. It does not appear to be in JSON record format. A SequenceType of dictionaries is expected
I've verified that the format I am using is valid JSON via various JSON validators, so I expect I'm just holding it wrong. Is there an example of a JSON file with CoreMotion data that I can model mine after?
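Going only by the wording of the error ("A SequenceType of dictionaries is expected"), the parser seems to want a top-level JSON array in which every element is an object with the same keys. The sketch below produces that shape with `JSONEncoder`. The `MotionRecord` type and its field names are hypothetical, not a documented schema; the real keys would be whatever feature and label columns the classifier is configured with.

```swift
import Foundation

/// One row of motion data. The field names are hypothetical; use the feature
/// and label column names selected when configuring the classifier.
struct MotionRecord: Codable, Equatable {
    let timestamp: Double
    let accelerationX: Double
    let accelerationY: Double
    let accelerationZ: Double
    let activity: String
}

/// Encodes rows as a top-level JSON array of objects, i.e. a sequence of dictionaries.
func encodeRecords(_ records: [MotionRecord]) throws -> Data {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    return try encoder.encode(records)
}
```

A file that is valid JSON but has a single object at the top level (for example, one dictionary wrapping the samples) would pass a validator and still not match this shape.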
2
0
169
Jul ’25
VNDetectTextRectanglesRequest not detecting text rectangles (includes image)
Hi everyone, I'm trying to use VNDetectTextRectanglesRequest to detect text rectangles in an image. Here's my current code:
    guard let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) else { return }
    let textDetectionRequest = VNDetectTextRectanglesRequest { request, error in
        if let error = error {
            print("Text detection error: \(error)")
            return
        }
        guard let observations = request.results as? [VNTextObservation] else {
            print("No text rectangles detected.")
            return
        }
        print("Detected \(observations.count) text rectangles.")
        for observation in observations {
            print(observation.boundingBox)
        }
    }
    textDetectionRequest.revision = VNDetectTextRectanglesRequestRevision1
    textDetectionRequest.reportCharacterBoxes = true
    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    do {
        try handler.perform([textDetectionRequest])
    } catch {
        print("Vision request error: \(error)")
    }
The request completes without error, but no text rectangles are detected; the observations array is empty (count = 0). Here's a sample image I'm testing with: I expected VNTextObservation results, but I'm not getting any. Is there something I'm missing in how this API works? Or could it be a limitation of this request or revision? Thanks for any help!
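As a cross-check, it may help to run a different Vision request, `VNRecognizeTextRequest`, on the same image to see whether Vision finds any text in it at all. This is a swapped-in request, not a fix for `VNDetectTextRectanglesRequest`, and the function name below is made up. It takes the same `CGImage` the post already creates.

```swift
#if canImport(Vision)
import Vision
import CoreGraphics

/// Cross-check: runs VNRecognizeTextRequest (a different request than the one
/// in the post) and prints each recognized string with its normalized bounding box.
func printRecognizedText(in cgImage: CGImage) {
    let request = VNRecognizeTextRequest { request, error in
        if let error = error {
            print("Text recognition error: \(error)")
            return
        }
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        print("Recognized \(observations.count) text regions.")
        for observation in observations {
            // Take the single most likely transcription for each region.
            let text = observation.topCandidates(1).first?.string ?? "<no candidate>"
            print(text, observation.boundingBox)
        }
    }
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print("Vision request error: \(error)")
    }
}
#endif
```

If this request finds text and the rectangle request does not, the image and handler setup are probably fine and the difference lies in the request or its revision.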
2
0
150
May ’25
Keep getting exceededContextWindowSize with Foundation Models
I'm a bit new to the LLM stuff and with Foundation Models. My understanding is that there is a token limit of around 4K. I want to process the contents of files which may be quite large. I first tried going the Tool route but that didn't work out so I then tried manually chunking the text to keep things under the limit. It mostly works except that every now and then it'll exceed the limit. This happens even when the chunks are less than 100 characters. Instructions themselves are about 500 characters but still overall, well below 1000 characters per prompt, all told, which, in my limited understanding, should not result in 4K tokens being parsed. Any ideas on what is going on here?
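One thing worth checking, assuming the same `LanguageModelSession` is reused for every chunk: a session keeps its earlier prompts and responses in its transcript, so the context can fill up even when each individual prompt is tiny. The sketch below is a plain character-budget splitter; `chunk` and the budget value are illustrative, and character count is only a rough proxy for tokens. The commented usage creates a new session per chunk so earlier chunks do not accumulate.

```swift
/// Splits text into chunks of at most `maxCharacters`, breaking at whitespace
/// where possible. Character count is only a rough proxy for tokens; the
/// budget is an illustrative assumption.
func chunk(_ text: String, maxCharacters: Int) -> [String] {
    precondition(maxCharacters > 0, "maxCharacters must be positive")
    var chunks: [String] = []
    var current = ""
    for word in text.split(whereSeparator: { $0.isWhitespace }) {
        var remaining = word
        // Hard-split any single word that is longer than the whole budget.
        while remaining.count > maxCharacters {
            if !current.isEmpty {
                chunks.append(current)
                current = ""
            }
            chunks.append(String(remaining.prefix(maxCharacters)))
            remaining = remaining.dropFirst(maxCharacters)
        }
        if remaining.isEmpty { continue }
        // Start a new chunk if adding a space plus this word would exceed the budget.
        if !current.isEmpty && current.count + 1 + remaining.count > maxCharacters {
            chunks.append(current)
            current = ""
        }
        if !current.isEmpty { current += " " }
        current += String(remaining)
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}

// Hypothetical usage: a fresh session per chunk keeps the transcript from growing.
// for piece in chunk(fileText, maxCharacters: 2_000) {
//     let session = LanguageModelSession(instructions: instructions)
//     let reply = try await session.respond(to: piece)
// }
```

For example, `chunk("aa bb cc dd", maxCharacters: 5)` returns `["aa bb", "cc dd"]`. A fresh session loses any cross-chunk memory, so results that depend on earlier chunks would need to be carried forward explicitly.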
2
0
312
Aug ’25