Hi,
I'm using LanguageModelSession and giving it two different tools to query data from a local database. I'm wondering how I can have the session generate structured content as the response that includes data one or both tools (or no tool at all).
Here is an example of what I'm trying to do:
Let's say the app has access to a database that contains information about exercise and sleep data (this is just an analogy). There are two tools, GetExerciseData() and GetSleepData(). The user may then prompt something like, "how well did I sleep in November". I have this working so that it calls through to the right tool, which would return a SleepSummary. However, I can't figure out how to have the session return the right structured data.
I can do this and get back good text data:
let response = session.respond(to: userInput), but I believe I want to do something like:
let response = session.respond(to: trimmed, generating: <SomeStructure?>) Sometimes the model I run one tool or the other, or both tools, or no tool at all.
Any help of what the right way to go about this would be much appreciated. Most of the example I found have to do with 1 tool.
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I have recently been having trouble with my iOS 18.2 beta update. It has been 2 weeks since I have updated to iOS 18.2 beta and joined the Genmoji and image playground waitlist. I am wondering how much longer I have to wait till my request is approved.
Hi,
I’m developing an app targeting iOS 26, using the new FoundationModels framework to perform on-device LLM inference. I’m currently testing memory usage.
Does the memory used by FoundationModels—including model weights, KV cache, and any inference-related buffers—count toward my app’s Jetsam memory limit, or is any of it managed separately by the system?
I may need to run two concurrent inferences, each with a 4096-token context window. Is this explicitly supported or allowed by FoundationModels on iOS 26? Would this significantly increase the risk of memory-based termination?
Thanks in advance for any clarification.
Thanks.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Documentation on adapter train is lacking any details related to training on dataset with tool calling. And page about tool calling itself only explain how to use it from Swift without any internal details useful in training.
Question is how schema should looks like for including tool calling in dataset?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I'm working on localizing my prompts to support multiple languages, and in some cases my prompts has String interpolated Generable objects. for example:
"Given the following workout routine: \(routine), suggest one additional exercise to complement it."
In the Strings dictionary, I'm only able to select String, Int or Double parameters using %@ and %lld.
Has anyone found a way to accomplish this?
Greetings, and Happy Holidays,
I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community.
The Problem
Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain.
What Newton Does
Newton validates every prompt pre-inference and returns:
Phase (0/1/7/8/9)
Shape classification
Confidence score
Full audit trace
If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model.
v1.3 Detection Categories (14 total)
Jailbreak / prompt injection
Corrosive self-negation ("I hate myself")
Hedged corrosive ("Not saying I'm worthless, but...")
Emotional dependency ("You're the only one who understands")
Third-person manipulation ("If you refuse, you're proving nobody cares")
Logical contradictions ("Prove truth doesn't exist")
Self-referential paradox ("Prove that proof is impossible")
Semantic inversion ("Explain how truth can be false")
Definitional impossibility ("Square circle")
Delegated agency ("Decide for me")
Hallucination-risk prompts ("Cite the 2025 CDC report")
Unbounded recursion ("Repeat forever")
Conditional unbounded ("Until you can't")
Nonsense / low semantic density
Test Results
94.3% catch rate on 35 adversarial test cases (33/35 passed).
Architecture
User Input
↓
[ Newton ] → Validates prompt, assigns Phase
↓
Phase 9? → [ FoundationModels ] → Response
Phase 1/7/8? → Blocked with explanation
Key Properties
Deterministic (same input → same output)
Fully auditable (ValidationTrace on every prompt)
On-device (no network required)
Native Swift / SwiftUI
String Catalog localization (EN/ES/FR)
FoundationModels-ready (#if canImport)
Code Sample — Validation
let governor = NewtonGovernor()
let result = governor.validate(prompt: userInput)
if result.permitted {
// Proceed to FoundationModels
let session = LanguageModelSession()
let response = try await session.respond(to: userInput)
} else {
// Handle block
print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)")
print(result.trace.summary) // Full audit trace
}
Questions for the Community
Anyone else building pre-inference validation for FoundationModels?
Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail?
Interest in Shape Theory classification for prompt complexity?
Best practices for integrating with LanguageModelSession?
Links
GitHub: https://github.com/jaredlewiswechs/ada-newton
Technical overview: parcri.net
Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device.
parcri.net has the link :)
Hello,
I am developing an iOS app that uses machine learning models.
To improve accuracy and user experience, I would like to download .mlmodel files (compiled and compressed as zip files) from our own server after the app is installed, and use them for inference within the app.
No executable code, scripts, or dynamic libraries will be downloaded—only model data files are used.
According to App Store Review Guideline 2.5.2, I understand that apps may not download or execute code which introduces or changes features or functionality.
In this case, are compiled and zip-compressed .mlmodel files considered "data" rather than "code", and is it allowed to download and use them in the app?
If there are any restrictions or best practices related to this, please let me know.
Thank you.
How reliable is the Models, to use as a comparison, such as a cholesterol test, to inform, for example, whether it is worth it to go see a doctor?
I would like to use Tool to attach the simple blood test data to the session and with this the Model can analyse and made a simple suggestion if is necessary to see a doctor etc.. ?
ps.: Local model
Problem:
We trained a LoRA adapter for Apple's FoundationModels framework using their TAMM (Training Adapter for Model Modification)
toolkit v0.2.0 on macOS 26 beta 4. The adapter trains successfully but fails to load with: "Adapter is not compatible with the
current system base model."
TAMM 2.0 contains export/constants.py with: BASE_SIGNATURE = "9799725ff8e851184037110b422d891ad3b92ec1"
Findings:
Adapter Export Process:
In export_fmadapter.py
def write_metadata(...):
self_dict[MetadataKeys.BASE_SIGNATURE] = BASE_SIGNATURE # Hardcoded value
The Compatibility Check:
- When loading an adapter, Apple's system compares the adapter's baseModelSignature with the current system model
- If they don't match: compatibleAdapterNotFound error
- The error doesn't reveal the expected signature
Questions:
- How is BASE_SIGNATURE derived from the base model?
- Is it SHA-1 of base-model.pt or some other computation?
- Can we compute the correct signature for beta 4?
- Or do we need Apple to release TAMM v0.3.0 with updated signature?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Tags:
Core ML
Create ML
tensorflow-metal
Apple Intelligence
I have used mlx_lm.lora to fine tune a mistral-7b-v0.3-4bit model with my data. I fused the mistral model with my adapters and upload the fused model to my directory on huggingface. I was able to use mlx_lm.generate to use the fused model in Terminal. However, I don't know how to load the model in Swift. I've used
Imports
import SwiftUI
import MLX
import MLXLMCommon
import MLXLLM
let modelFactory = LLMModelFactory.shared
let configuration = ModelConfiguration(
id: "pharmpk/pk-mistral-7b-v0.3-4bit"
)
// Load the model off the main actor, then assign on the main actor
let loaded = try await modelFactory.loadContainer(configuration: configuration)
{ progress in
print("Downloading progress: \(progress.fractionCompleted * 100)%")
}
await MainActor.run {
self.model = loaded
}
I'm getting an error
runModel error: downloadError("A server with the specified hostname could not be found.")
Any suggestions?
Thanks, David
PS, I can load the model from the app bundle
// directory: Bundle.main.resourceURL!
but it's too big to upload for Testflight
Topic:
Machine Learning & AI
SubTopic:
General
Hello,
I am interested in using jax-metal to train ML models using Apple Silicon. I understand this is experimental.
After installing jax-metal according to https://developer.apple.com/metal/jax/, my python code fails with the following error
JaxRuntimeError: UNKNOWN: -:0:0: error: unknown attribute code: 22
-:0:0: note: in bytecode version 6 produced by: StableHLO_v1.12.1
My issue is identical to the one reported here https://github.com/jax-ml/jax/issues/26968#issuecomment-2733120325, and is fixed by pinning to jax-metal 0.1.1., jax 0.5.0 and jaxlib 0.5.0.
Thank you!
Hello,
I am studying macOS26 Apple Intelligence features.
I have created a basic swift program with Xcode. This program is sending prompts to FoundationModels.LanguageModelSession.
It works fine but this model is not trained for programming or code completion.
Xcode has an AI code completion feature. It is called "Predictive Code completion model".
So, there are multiple on-device models on macOS26 ?
Are there others ?
Is there a way for me to send prompts to this "Predictive Code completion model" from my program ?
Thanks
Is it possible to expose a custom VirtIO device to a Linux guest running inside a VM — likely using QEMU backed by Hypervisor.framework. The guest would see this device as something like /dev/npu0, and it would use a kernel driver + userspace library to submit inference requests.
On the macOS host, these requests would be executed using CoreML, MPSGraph, or BNNS. The results would be passed back to the guest via IPC.
Does the macOS allow this kind of "fake" NPU / GPU
Hi Apple product owners.
I am missing a unified concept which might be derived from the use cases for mail categories and mail spam for the app "Mail" on Mac.
I need a recommendation on how to use categories in combination with the spam filter to get most out of it.
So I was looking for the use cases for the 2 functionality areas in order to figure out how to organise my mails by using as much automation as possible before I start creating intelligent folders in addition.
What can you recommend where I get this information from? I don't want to guess or read a lot of forum contributions which are based on guesses.
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
In the play ground I'm trying to bias my LanguageModel to use a tool I registered, but I don't see it actually calling the tool. I'm following the developer video on landmarks itinerary generation tutorial almost verbatim. Is this a prompt engineering thing I'm missing? Or is it possible that I'm injecting my tool wrong?
Hi,
I have an app that uses Core Data to store user information and display it in various views. I want to know if it's possible to easily integrate this setup with FoundationModels to make it easier for the user to query and manipulate the information, and if so, how would I go about it? Can the model be pointed to the database schema file and the SQLite file sitting in the user's app group container to parse out the information needed? And/or should the NSManagedObjects be made @Generable for better output? Any guidance about this would be useful.
This is my code:
witch SystemLanguageModel.default.availability {
case .available:
ContentView()
.popover(isPresented: $showSettings) {
SettingsView().presentationCompactAdaptation(.popover)
}
case .unavailable(.modelNotReady):
ContentUnavailableView("Apple Intelligence is unavailable",
systemImage: "apple.intelligence.badge.xmark",
description: Text("Please come back later."))
case .unavailable(.appleIntelligenceNotEnabled):
ContentUnavailableView("Apple Intelligence is unavailable",
systemImage: "apple.intelligence.badge.xmark",
description: Text("Please turn on Apple Intelligence."))
case .unavailable(.deviceNotEligible):
ContentUnavailableView("Apple Intelligence is unavailable",
systemImage: "apple.intelligence.badge.xmark",
description: Text("This device is not eligible for Apple Intelligence."))
case .unavailable:
ContentUnavailableView("Apple Intelligence is unavailable",
systemImage: "apple.intelligence.badge.xmark")
}
When I switch off Apple Intelligence, I expected "Please turn on Apple Intelligence.", but instead I get "Please come back later."
This seems to be wrong error?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
v3 was released 2 years ago but developers are unable to convert models created with Keras v3 to CoreML
Is foundation models matured enough to take input from the Apple Vision framework to generate responses? Something similar to what google's gemini does although in a much smaller scale and for a very specific niche.
Is the face and body detection service in the Vision framework a local model or a cloud model? Is there a performance report?
https://developer.apple.com/documentation/vision