I'm trying the new RecognizeDocumentsRequest, which is supposed to detect paragraphs (among other things) in a document.
I tried many source images, and I don't see the slightest difference compared to the old API, (VN)RecognizeTextRequest.
Is it not expected to work yet, or is it still in beta?
I get the following error when running this command in a Jupyter notebook:
v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))
v[0, 0].assign(3.)
Environment:
python == 3.11.14
tensorflow==2.19.1
tensorflow-metal==1.2.0
{
"name": "InvalidArgumentError",
"message": "Cannot assign a device for operation ResourceStridedSliceAssign: Could not satisfy explicit device specification '/job:localhost/replica:0/task:0/device:GPU:0' because no supported kernel for GPU devices is available.\nColocation Debug Info:\nColocation group had the following types and supported devices: \nRoot Member(assigned_device_name_index_=1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]\nResourceStridedSliceAssign: CPU \n_Arg: GPU CPU \n\nColocation members, user-requested devices, and framework assigned devices, if any:\n ref (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0\n ResourceStridedSliceAssign (ResourceStridedSliceAssign) /job:localhost/replica:0/task:0/device:GPU:0\n\nOp: ResourceStridedSliceAssign\n
[...]
[[{{node ResourceStridedSliceAssign}}]] [Op:ResourceStridedSliceAssign] name: strided_slice/_assign"
}
It seems that the ResourceStridedSliceAssign operation has no GPU kernel in this setup: the error lists CPU as the only supported device type.
Environment:
macOS 26.2 (Tahoe)
Xcode 16.3
Apple Silicon (M4)
Sandboxed Mac App Store app
Description:
Repeated use of VNRecognizeTextRequest causes permanent memory growth in the host process. The physical footprint increases by approximately 3-15 MB per OCR call and never returns to baseline, even after all references to the request, handler, observations, and image are released.
```swift
private func selectAndProcessImage() {
    let panel = NSOpenPanel()
    panel.allowedContentTypes = [.image]
    panel.allowsMultipleSelection = false
    panel.canChooseDirectories = false
    panel.message = "Select an image for OCR processing"
    guard panel.runModal() == .OK, let url = panel.url else { return }
    selectedImageURL = url
    isProcessing = true
    recognizedText = "Processing..."
    // Run OCR on a background thread to keep UI responsive
    let workItem = DispatchWorkItem {
        let result = performOCR(on: url)
        DispatchQueue.main.async {
            recognizedText = result
            isProcessing = false
        }
    }
    DispatchQueue.global(qos: .userInitiated).async(execute: workItem)
}

private func performOCR(on url: URL) -> String {
    // Wrap EVERYTHING in autoreleasepool so all ObjC objects are drained immediately
    let resultText: String = autoreleasepool {
        // Load image and convert to CVPixelBuffer for explicit memory control
        guard let imageData = try? Data(contentsOf: url) else {
            return "Error: Could not read image file."
        }
        guard let nsImage = NSImage(data: imageData) else {
            return "Error: Could not create image from file data."
        }
        guard let cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
            return "Error: Could not create CGImage."
        }
        let width = cgImage.width
        let height = cgImage.height
        // Create a CVPixelBuffer from the CGImage
        var pixelBuffer: CVPixelBuffer?
        let attrs: [String: Any] = [
            kCVPixelBufferCGImageCompatibilityKey as String: true,
            kCVPixelBufferCGBitmapContextCompatibilityKey as String: true
        ]
        let status = CVPixelBufferCreate(
            kCFAllocatorDefault,
            width,
            height,
            kCVPixelFormatType_32ARGB,
            attrs as CFDictionary,
            &pixelBuffer
        )
        guard status == kCVReturnSuccess, let buffer = pixelBuffer else {
            return "Error: Could not create CVPixelBuffer (status: \(status))."
        }
        // Draw the CGImage into the pixel buffer
        CVPixelBufferLockBaseAddress(buffer, [])
        guard let context = CGContext(
            data: CVPixelBufferGetBaseAddress(buffer),
            width: width,
            height: height,
            bitsPerComponent: 8,
            bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
            space: CGColorSpaceCreateDeviceRGB(),
            bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
        ) else {
            CVPixelBufferUnlockBaseAddress(buffer, [])
            return "Error: Could not create CGContext for pixel buffer."
        }
        context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
        CVPixelBufferUnlockBaseAddress(buffer, [])
        // Run OCR
        let requestHandler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:])
        let request = VNRecognizeTextRequest()
        request.recognitionLevel = .accurate
        request.usesLanguageCorrection = true
        do {
            try requestHandler.perform([request])
        } catch {
            return "Error during OCR: \(error.localizedDescription)"
        }
        guard let observations = request.results, !observations.isEmpty else {
            return "No text found in image."
        }
        let lines = observations.compactMap { observation in
            observation.topCandidates(1).first?.string
        }
        // Explicitly nil out the pixel buffer before the pool drains
        pixelBuffer = nil
        return lines.joined(separator: "\n")
    }
    // Everything (Data, NSImage, CGImage, CVPixelBuffer, VN objects) released here
    return resultText
}
```
Subject: Technical Report: Float32 Precision Ceiling & Memory Fragmentation in JAX/Metal Workloads on M3
To: Metal Developer Relations
Hello,
I am reporting a repeatable numerical saturation point encountered during sustained recursive high-order differential workloads on the Apple M3 (16 GB unified memory) using the JAX Metal backend.
Workload Characteristics:
Large-scale vector projections across multi-dimensional industrial datasets
Repeated high-order finite-difference calculations
Heavy use of jax.grad and lax.cond inside long-running loops
Observation:
Under these conditions, the Metal/MPS backend consistently enters a terminal quantization lock where outputs saturate at a fixed scalar value (2.0000), followed by system-wide NaN propagation. This appears to be a precision-limited boundary in the JAX-Metal bridge when handling high-order operations with cubic time-scale denominators.
I have identified the specific threshold where recursive high-order tensor derivatives exceed the numerical resolution of 32-bit consumer architectures, necessitating a migration to a dedicated 64-bit industrial stack.
I have prepared a minimal synthetic test script (randomized vectors only, no proprietary logic) that reliably reproduces the allocator fragmentation and saturation behavior. Let me know if your team would like the telemetry for XLA/MPS optimization purposes.
Best regards,
Alex Severson
Architect, QuantumPulse AI
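One way to sanity-check the precision claim in the report above without a GPU is sketched below. This is not the reporter's test script. It assumes the "cubic denominators" refer to something like a third-order central difference with a small step, emulates float32 rounding with Python's `struct` module, and uses `math.sin` as a stand-in function; `f32` and `third_derivative` are hypothetical helper names. Under those assumptions the float32 result is dominated by cancellation error while the float64 result is not, which may suggest ordinary rounding rather than anything specific to the Metal backend.

```python
import math
import struct


def f32(value: float) -> float:
    """Round a Python float (IEEE double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def third_derivative(f, x, h, rnd=lambda v: v):
    """Central difference for f'''(x):
    (f(x+2h) - 2 f(x+h) + 2 f(x-h) - f(x-2h)) / (2 h^3).

    `rnd` is applied after every arithmetic step to emulate a narrower type.
    """
    x, h = rnd(x), rnd(h)
    a = rnd(f(rnd(x + rnd(2 * h))))
    b = rnd(f(rnd(x + h)))
    c = rnd(f(rnd(x - h)))
    d = rnd(f(rnd(x - rnd(2 * h))))
    numerator = rnd(rnd(a - d) - rnd(2 * rnd(b - c)))
    denominator = rnd(2 * rnd(rnd(h * h) * h))  # the cubic denominator
    return rnd(numerator / denominator)


x, h = 1.0, 1e-3
exact = -math.cos(x)  # the third derivative of sin is -cos

err64 = abs(third_derivative(math.sin, x, h) - exact)
err32 = abs(third_derivative(math.sin, x, h, rnd=f32) - exact)

print(f"float64 error: {err64:.3e}")
print(f"float32 error: {err32:.3e}")
```

If the float32 error stays large across a range of step sizes in a reproduction like this, rescaling the step or accumulating in higher precision may be worth trying before attributing the behavior to the backend.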
I don't know if these forums are any good for rumors or plans, but does anybody know whether Apple plans to release a library for reinforcement learning training? It would be handy, when implementing games in Swift for example, to be able to train the computer players on the same code.
In macOS Tahoe 26.2 an RDMA capability was added for Thunderbolt-5 interfaces. This has been demonstrated to significantly decrease the latency and maintain bandwidth for "clustered" Apple Silicon devices with TB5. What is the ideal and the maximum RDMA burst width for transfers over RDMA-enabled Thunderbolt-5 interfaces?
I have a very bad crash in my app when I use AVSpeechSynthesizer, and I can't reproduce it. Here is my code; it's a singleton:
- (void)stopSpeech {
    if ([self.synthesizer isPaused]) {
        return;
    }
    if ([self.synthesizer isSpeaking]) {
        BOOL isSpeech = [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
        if (!isSpeech) {
            [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord];
        }
    }
    self.stopBlock ? self.stopBlock() : nil;
}
- (AVSpeechSynthesizer *)synthesizer {
    if (!_synthesizer) {
        _synthesizer = [[AVSpeechSynthesizer alloc] init];
        _synthesizer.delegate = self;
    }
    return _synthesizer;
}
When the user leaves the page, I call the stopSpeech method. Then I get a lot of crash reports. Here is a crash log:
# Crashlytics - plaintext stacktrace downloaded by liweican at Mon, 13 May 2019 03:03:24 GMT
# URL: https://fabric.io/youdao-dict/ios/apps/com.youdao.udictionary/issues/5a904ed88cb3c2fa63ad7ed3?time=last-thirty-days/sessions/b1747d91bafc4680ab0ca8e3a702c52c_DNE_0_v2
# Organization: zzz
# Platform: ios
# Application: U-Dictionary
# Version: 3.0.5.4
# Bundle Identifier: com.youdao.UDictionary
# Issue ID: 5a904ed88cb3c2fa63ad7ed3
# Session ID: b1747d91bafc4680ab0ca8e3a702c52c_DNE_0_v2
# Date: 2019-05-13T02:27:00Z
# OS Version: 12.2.0 (16E227)
# Device: iPhone 8 Plus
# RAM Free: 17%
# Disk Free: 64.6%
#19. Crashed: AXSpeech
0 libsystem_pthread.dylib 0x19c15e5b8 pthread_mutex_lock$VARIANT$armv81 + 102
1 CoreFoundation 0x19c4cf84c CFRunLoopSourceSignal + 68
2 Foundation 0x19cfc7280 performQueueDequeue + 464
3 Foundation 0x19cfc680c __NSThreadPerformPerform + 136
4 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
5 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88
6 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256
7 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004
8 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
9 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
10 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264
11 Foundation 0x19cfc66e4 __NSThread__start__ + 984
12 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
13 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
14 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
--
#0. com.apple.main-thread
0 libsystem_malloc.dylib 0x19c11ce24 small_free_list_remove_ptr_no_clear + 768
1 libsystem_malloc.dylib 0x19c11f094 small_malloc_from_free_list + 296
2 libsystem_malloc.dylib 0x19c11f094 small_malloc_from_free_list + 296
3 libsystem_malloc.dylib 0x19c11d63c small_malloc_should_clear + 224
4 libsystem_malloc.dylib 0x19c11adcc szone_malloc_should_clear + 132
5 libsystem_malloc.dylib 0x19c123c18 malloc_zone_malloc + 156
6 CoreFoundation 0x19c569ab4 __CFBasicHashRehash + 300
7 CoreFoundation 0x19c56b430 __CFBasicHashAddValue + 96
8 CoreFoundation 0x19c56ab9c CFBasicHashAddValue + 2160
9 CoreFoundation 0x19c49f3bc CFDictionaryAddValue + 260
10 CoreFoundation 0x19c572ee8 __54-[CFPrefsSource mergeIntoDictionary:sourceDictionary:]_block_invoke + 28
11 CoreFoundation 0x19c49f0b4 __CFDictionaryApplyFunction_block_invoke + 24
12 CoreFoundation 0x19c568b7c CFBasicHashApply + 116
13 CoreFoundation 0x19c49f090 CFDictionaryApplyFunction + 168
14 CoreFoundation 0x19c42f504 -[CFPrefsSource mergeIntoDictionary:sourceDictionary:] + 136
15 CoreFoundation 0x19c4bcd38 -[CFPrefsSearchListSource alreadylocked_getDictionary:] + 644
16 CoreFoundation 0x19c42e71c -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 152
17 CoreFoundation 0x19c42e660 -[CFPrefsSource copyValueForKey:] + 60
18 CoreFoundation 0x19c579e88 __76-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:]_block_invoke + 40
19 CoreFoundation 0x19c4bdff4 __108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 272
20 CoreFoundation 0x19c4bda38 normalizeQuintuplet + 340
21 CoreFoundation 0x19c42c634 -[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 108
22 CoreFoundation 0x19c42cec0 -[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 148
23 CoreFoundation 0x19c57c2d0 _CFPreferencesCopyAppValueWithContainerAndConfiguration + 124
24 TextInput 0x1a450e550 -[TIPreferencesController valueForPreferenceKey:] + 460
25 UIKitCore 0x1c87c71f8 -[UIKeyboardPreferencesController handBias] + 36
26 UIKitCore 0x1c887275c -[UIKeyboardLayoutStar showKeyboardWithInputTraits:screenTraits:splitTraits:] + 320
27 UIKitCore 0x1c88f4240 -[UIKeyboardImpl finishLayoutChangeWithArguments:] + 492
28 UIKitCore 0x1c88f47c8 -[UIKeyboardImpl updateLayout] + 1208
29 UIKitCore 0x1c88eaad0 -[UIKeyboardImpl updateLayoutIfNecessary] + 448
30 UIKitCore 0x1c88eab9c -[UIKeyboardImpl setFrame:] + 140
31 UIKitCore 0x1c88d5d60 -[UIKeyboard activate] + 652
32 UIKitCore 0x1c894c90c -[UIKeyboardAutomatic activate] + 128
33 UIKitCore 0x1c88d5158 -[UIKeyboard setFrame:] + 296
34 UIKitCore 0x1c88d81b0 -[UIKeyboard _didChangeKeyplaneWithContext:] + 228
35 UIKitCore 0x1c88f4aa0 -[UIKeyboardImpl didMoveToSuperview] + 136
36 UIKitCore 0x1c8f2ad84 __45-[UIView(Hierarchy) _postMovedFromSuperview:]_block_invoke + 888
37 UIKitCore 0x1c8f2a970 -[UIView(Hierarchy) _postMovedFromSuperview:] + 760
38 UIKitCore 0x1c8f39ddc -[UIView(Internal) _addSubview:positioned:relativeTo:] + 1740
39 UIKitCore 0x1c88d5d84 -[UIKeyboard activate] + 688
40 UIKitCore 0x1c894c90c -[UIKeyboardAutomatic activate] + 128
41 UIKitCore 0x1c893b3a4 -[UIPeripheralHost(UIKitInternal) _reloadInputViewsForResponder:] + 1332
42 UIKitCore 0x1c8ae66d8 -[UIResponder(UIResponderInputViewAdditions) reloadInputViews] + 80
43 UIKitCore 0x1c8ae23bc -[UIResponder becomeFirstResponder] + 804
44 UIKitCore 0x1c8f2a560 -[UIView(Hierarchy) becomeFirstResponder] + 156
45 UIKitCore 0x1c8d93e84 -[UITextField becomeFirstResponder] + 244
46 UIKitCore 0x1c8d578dc -[UITextInteractionAssistant(UITextInteractionAssistant_Internal) setFirstResponderIfNecessary] + 192
47 UIKitCore 0x1c8d45d8c -[UITextSelectionInteraction oneFingerTap:] + 3136
48 UIKitCore 0x1c86e0bcc -[UIGestureRecognizerTarget _sendActionWithGestureRecognizer:] + 64
49 UIKitCore 0x1c86e8dd4 _UIGestureRecognizerSendTargetActions + 124
50 UIKitCore 0x1c86e6778 _UIGestureRecognizerSendActions + 316
51 UIKitCore 0x1c86e5ca4 -[UIGestureRecognizer _updateGestureWithEvent:buttonEvent:] + 760
52 UIKitCore 0x1c86d9d80 _UIGestureEnvironmentUpdate + 2180
53 UIKitCore 0x1c86d94b0 -[UIGestureEnvironment _deliverEvent:toGestureRecognizers:usingBlock:] + 384
54 UIKitCore 0x1c86d9290 -[UIGestureEnvironment _updateForEvent:window:] + 204
55 UIKitCore 0x1c8af14a8 -[UIWindow sendEvent:] + 3112
56 UIKitCore 0x1c8ad1534 -[UIApplication sendEvent:] + 340
57 UIKitCore 0x1c8b977c0 __dispatchPreprocessedEventFromEventQueue + 1768
58 UIKitCore 0x1c8b99eec __handleEventQueueInternal + 4828
59 UIKitCore 0x1c8b9311c __handleHIDEventFetcherDrain + 152
60 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
61 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88
62 CoreFoundation 0x19c4d1b24 __CFRunLoopDoSources0 + 176
63 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004
64 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
65 GraphicsServices 0x19e6cc79c GSEventRunModal + 104
66 UIKitCore 0x1c8ab7b68 UIApplicationMain + 212
67 UDictionary 0x10517e138 main (main.m:17)
68 libdyld.dylib 0x19bf928e0 start + 4
#1. Thread
0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8
1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340
2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4
#2. com.apple.uikit.eventfetch-thread
0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8
1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72
2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236
3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360
4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
5 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
6 Foundation 0x19ce99e5c -[NSRunLoop(NSRunLoop) runUntilDate:] + 96
7 UIKitCore 0x1c8b9d540 -[UIEventFetcher threadMain] + 136
8 Foundation 0x19cfc66e4 __NSThread__start__ + 984
9 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
10 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
11 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#3. JavaScriptCore bmalloc scavenger
0 libsystem_kernel.dylib 0x19c0ddee4 __psynch_cvwait + 8
1 libsystem_pthread.dylib 0x19c15d4a4 _pthread_cond_wait$VARIANT$armv81 + 628
2 libc++.1.dylib 0x19b6b5090 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 24
3 JavaScriptCore 0x1a36a2238 void std::__1::condition_variable_any::wait<std::__1::unique_lock<bmalloc::Mutex> >(std::__1::unique_lock<bmalloc::Mutex>&) + 108
4 JavaScriptCore 0x1a36a622c bmalloc::Scavenger::threadRunLoop() + 176
5 JavaScriptCore 0x1a36a59a4 bmalloc::Scavenger::Scavenger(std::__1::lock_guard<bmalloc::Mutex>&) + 10
6 JavaScriptCore 0x1a36a73e4 std::__1::__thread_specific_ptr<std::__1::__thread_struct>::set_pointer(std::__1::__thread_struct*) + 38
7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#4. WebThread
0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8
1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72
2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236
3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360
4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
5 WebCore 0x1a5126480 RunWebThread(void*) + 600
6 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
7 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
8 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#5. com.twitter.crashlytics.ios.MachExceptionServer
0 UDictionary 0x1058a5564 CLSProcessRecordAllThreads (CLSProcess.c:376)
1 UDictionary 0x1058a594c CLSProcessRecordAllThreads (CLSProcess.c:407)
2 UDictionary 0x1058952dc CLSHandler (CLSHandler.m:26)
3 UDictionary 0x1058906cc CLSMachExceptionServer (CLSMachException.c:446)
4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#6. com.apple.NSURLConnectionLoader
0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8
1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72
2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236
3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360
4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
5 CFNetwork 0x19cae574c -[__CoreSchedulingSetRunnable runForever] + 216
6 Foundation 0x19cfc66e4 __NSThread__start__ + 984
7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#7. AVAudioSession Notify Thread
0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8
1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72
2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236
3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360
4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
5 AVFAudio 0x1a238a378 GenericRunLoopThread::Entry(void*) + 156
6 AVFAudio 0x1a23b4c60 CAPThread::Entry(CAPThread*) + 88
7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#8. WebCore: LocalStorage
0 libsystem_kernel.dylib 0x19c0ddee4 __psynch_cvwait + 8
1 libsystem_pthread.dylib 0x19c15d4a4 _pthread_cond_wait$VARIANT$armv81 + 628
2 JavaScriptCore 0x1a3668ce4 WTF::ThreadCondition::timedWait(WTF::Mutex&, WTF::WallTime) + 80
3 JavaScriptCore 0x1a364f96c WTF::ParkingLot::parkConditionallyImpl(void const*, WTF::ScopedLambda<bool ()> const&, WTF::ScopedLambda<void ()> const&, WTF::TimeWithDynamicClockType const&) + 2004
4 WebKitLegacy 0x1a67b6ea8 bool WTF::Condition::waitUntil<WTF::Lock>(WTF::Lock&, WTF::TimeWithDynamicClockType const&) + 184
5 WebKitLegacy 0x1a67b9ba4 std::__1::unique_ptr<WTF::Function<void ()>, std::__1::default_delete<WTF::Function<void ()> > > WTF::MessageQueue<WTF::Function<void ()> >::waitForMessageFilteredWithTimeout<WTF::MessageQueue<WTF::Function<void ()> >::waitForMessage()::'lambda'(WTF::Function<void ()> const&)>(WTF::MessageQueueWaitResult&, WTF::MessageQueue<WTF::Function<void ()> >::waitForMessage()::'lambda'(WTF::Function<void ()> const&)&&, WTF::WallTime) + 156
6 WebKitLegacy 0x1a67b91c0 WebCore::StorageThread::threadEntryPoint() + 68
7 JavaScriptCore 0x1a3666f88 WTF::Thread::entryPoint(WTF::Thread::NewThreadContext*) + 260
8 JavaScriptCore 0x1a3668494 WTF::wtfThreadEntryPoint(void*) + 12
9 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
10 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
11 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#9. com.apple.CoreMotion.MotionThread
0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8
1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72
2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236
3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360
4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
5 CoreFoundation 0x19c4cd0b0 CFRunLoopRun + 80
6 CoreMotion 0x1a1df0240 (Missing)
7 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
8 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
9 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#10. Thread
0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8
1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340
2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4
#11. Thread
0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8
1 libsystem_pthread.dylib 0x19c1611f8 _pthread_wqthread + 532
2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4
#12. com.apple.CFStream.LegacyThread
0 libsystem_kernel.dylib 0x19c0d30f4 mach_msg_trap + 8
1 libsystem_kernel.dylib 0x19c0d25a0 mach_msg + 72
2 CoreFoundation 0x19c4d1cb4 __CFRunLoopServiceMachPort + 236
3 CoreFoundation 0x19c4ccbc4 __CFRunLoopRun + 1360
4 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
5 CoreFoundation 0x19c4e5094 _legacyStreamRunLoop_workThread + 260
6 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
7 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
8 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#13. Thread
0 libsystem_pthread.dylib 0x19c163cd0 start_wqthread + 190
#14. Thread
0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8
1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340
2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4
#15. Thread
0 libsystem_kernel.dylib 0x19c0deb74 __workq_kernreturn + 8
1 libsystem_pthread.dylib 0x19c161138 _pthread_wqthread + 340
2 libsystem_pthread.dylib 0x19c163cd4 start_wqthread + 4
#16. Thread
0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8
1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64
2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72
3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344
4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#17. Thread
0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8
1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64
2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72
3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344
4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#18. Thread
0 libsystem_kernel.dylib 0x19c0d3148 semaphore_timedwait_trap + 8
1 libdispatch.dylib 0x19bf50a4c _dispatch_sema4_timedwait$VARIANT$armv81 + 64
2 libdispatch.dylib 0x19bf513a8 _dispatch_semaphore_wait_slow + 72
3 libdispatch.dylib 0x19bf647c8 _dispatch_worker_thread + 344
4 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
5 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
6 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#19. Crashed: AXSpeech
0 libsystem_pthread.dylib 0x19c15e5b8 pthread_mutex_lock$VARIANT$armv81 + 102
1 CoreFoundation 0x19c4cf84c CFRunLoopSourceSignal + 68
2 Foundation 0x19cfc7280 performQueueDequeue + 464
3 Foundation 0x19cfc680c __NSThreadPerformPerform + 136
4 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
5 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88
6 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256
7 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004
8 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
9 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
10 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264
11 Foundation 0x19cfc66e4 __NSThread__start__ + 984
12 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
13 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
14 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
#20. AXSpeech
0 (Missing) 0x1071ba524 (Missing)
1 (Missing) 0x1071b3e7c (Missing)
2 (Missing) 0x10718fba4 (Missing)
3 (Missing) 0x107184bc8 (Missing)
4 libdyld.dylib 0x19bf95908 dlopen + 176
5 CoreFoundation 0x19c5483e8 _CFBundleDlfcnLoadBundle + 140
6 CoreFoundation 0x19c486918 _CFBundleLoadExecutableAndReturnError + 352
7 Foundation 0x19ced5734 -[NSBundle loadAndReturnError:] + 428
8 TextToSpeech 0x1abfff800 TTSSpeechUnitTestingMode + 1020
9 libdispatch.dylib 0x19bf817d4 _dispatch_client_callout + 16
10 libdispatch.dylib 0x19bf52040 _dispatch_once_callout + 28
11 TextToSpeech 0x1abfff478 TTSSpeechUnitTestingMode + 116
12 libobjc.A.dylib 0x19b7173cc CALLING_SOME_+initialize_METHOD + 24
13 libobjc.A.dylib 0x19b71cee0 initializeNonMetaClass + 296
14 libobjc.A.dylib 0x19b71e640 initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 260
15 libobjc.A.dylib 0x19b7265a4 lookUpImpOrForward + 244
16 libobjc.A.dylib 0x19b733858 _objc_msgSend_uncached + 56
17 libAXSpeechManager.dylib 0x1ac167324 -[AXSpeechManager _initialize] + 68
18 Foundation 0x19cfc68d4 __NSThreadPerformPerform + 336
19 CoreFoundation 0x19c4d22bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
20 CoreFoundation 0x19c4d223c __CFRunLoopDoSource0 + 88
21 CoreFoundation 0x19c4d1b74 __CFRunLoopDoSources0 + 256
22 CoreFoundation 0x19c4cca60 __CFRunLoopRun + 1004
23 CoreFoundation 0x19c4cc354 CFRunLoopRunSpecific + 436
24 Foundation 0x19ce99fcc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 300
25 libAXSpeechManager.dylib 0x1ac16c94c -[AXSpeechThread main] + 264
26 Foundation 0x19cfc66e4 __NSThread__start__ + 984
27 libsystem_pthread.dylib 0x19c1602c0 _pthread_body + 128
28 libsystem_pthread.dylib 0x19c160220 _pthread_start + 44
29 libsystem_pthread.dylib 0x19c163cdc thread_start + 4
I changed my code like this, but it still has the same problem:
- (void)stopSpeech {
    if (self.synthesizer != nil && [self.synthesizer isPaused]) {
        return;
    }
    // if ([self.synthesizer isSpeaking]) {
    //     BOOL isSpeech = [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
    //     if (!isSpeech) {
    //         [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord];
    //     }
    // }
    if (self.synthesizer != nil) {
        [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
        // if (!isSpeech) {
        //     [self.synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryWord];
        // }
        self.stopBlock ? self.stopBlock() : nil;
    }
}
Incident Identifier: 4C22F586-71FB-4644-B823-A4B52D158057
CrashReporter Key: adc89b7506c09c2a6b3a9099cc85531bdaba9156
Hardware Model: Mac16,10
Process: PRISMLensCore [16561]
Path: /Applications/PRISMLens.app/Contents/Resources/app.asar.unpacked/node_modules/core-node/PRISMLensCore.app/PRISMLensCore
Identifier: com.prismlive.camstudio
Version: (null) ((null))
Code Type: ARM-64
Parent Process: ? [16560]
Date/Time: (null)
OS Version: macOS 15.4 (24E5228e)
Report Version: 104
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x00000000 at 0x0000000000000000
Crashed Thread: 34
Application Specific Information:
*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[__NSArrayM insertObject:atIndex:]: object cannot be nil'
Thread 34 Crashed:
0 CoreFoundation 0x000000018ba4dde4 0x18b960000 + 974308 (__exceptionPreprocess + 164)
1 libobjc.A.dylib 0x000000018b512b60 0x18b4f8000 + 109408 (objc_exception_throw + 88)
2 CoreFoundation 0x000000018b97e69c 0x18b960000 + 124572 (-[__NSArrayM insertObject:atIndex:] + 1276)
3 Portrait 0x0000000257e16a94 0x257da3000 + 473748 (-[PTMSRResize addAdditionalOutput:] + 604)
4 Portrait 0x0000000257de91c0 0x257da3000 + 287168 (-[PTEffectRenderer initWithDescriptor:metalContext:useHighResNetwork:faceAttributesNetwork:humanDetections:prevTemporalState:asyncInitQueue:sharedResources:] + 6204)
5 Portrait 0x0000000257dab21c 0x257da3000 + 33308 (__33-[PTEffect updateEffectDelegate:]_block_invoke.241 + 164)
6 libdispatch.dylib 0x000000018b739b2c 0x18b738000 + 6956 (_dispatch_call_block_and_release + 32)
7 libdispatch.dylib 0x000000018b75385c 0x18b738000 + 112732 (_dispatch_client_callout + 16)
8 libdispatch.dylib 0x000000018b742350 0x18b738000 + 41808 (_dispatch_lane_serial_drain + 740)
9 libdispatch.dylib 0x000000018b742e2c 0x18b738000 + 44588 (_dispatch_lane_invoke + 388)
10 libdispatch.dylib 0x000000018b74d264 0x18b738000 + 86628 (_dispatch_root_queue_drain_deferred_wlh + 292)
11 libdispatch.dylib 0x000000018b74cae8 0x18b738000 + 84712 (_dispatch_workloop_worker_thread + 540)
12 libsystem_pthread.dylib 0x000000018b8ede64 0x18b8eb000 + 11876 (_pthread_wqthread + 292)
13 libsystem_pthread.dylib 0x000000018b8ecb74 0x18b8eb000 + 7028 (start_wqthread + 8)
Topic:
Machine Learning & AI
SubTopic:
General
Hi,
I'm testing DockKit with a very simple setup:
I use VNDetectFaceRectanglesRequest to detect a face and then call dockAccessory.track(...) using the detected bounding box.
The stand is correctly docked (state == .docked) and dockAccessory is valid.
I'm calling .track(...) with a single observation and valid CameraInformation (including size, device, orientation, etc.). No errors are thrown.
To monitor this, I added a logging utility – track(...) is being called 10–30 times per second, as recommended in the documentation.
However: the stand does not move at all.
There is no visible reaction to the tracking calls.
Is there anything I'm missing or doing wrong?
Is VNDetectFaceRectanglesRequest supported for DockKit tracking, or are there hidden requirements?
Would really appreciate any help or pointers – thanks!
That's my complete code:
extension VideoFeedViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
guard let frame = CMSampleBufferGetImageBuffer(sampleBuffer) else {
return
}
detectFace(image: frame)
func detectFace(image: CVPixelBuffer) {
let faceDetectionRequest = VNDetectFaceRectanglesRequest() { vnRequest, error in
guard let results = vnRequest.results as? [VNFaceObservation] else {
return
}
guard let observation = results.first else {
return
}
let boundingBoxHeight = observation.boundingBox.size.height * 100
#if canImport(DockKit)
if let dockAccessory = self.dockAccessory {
Task {
try? await trackRider(
observation.boundingBox,
dockAccessory,
frame,
sampleBuffer
)
}
}
#endif
}
let imageResultHandler = VNImageRequestHandler(cvPixelBuffer: image, orientation: .up)
try? imageResultHandler.perform([faceDetectionRequest])
func combineBoundingBoxes(_ box1: CGRect, _ box2: CGRect) -> CGRect {
let minX = min(box1.minX, box2.minX)
let minY = min(box1.minY, box2.minY)
let maxX = max(box1.maxX, box2.maxX)
let maxY = max(box1.maxY, box2.maxY)
let combinedWidth = maxX - minX
let combinedHeight = maxY - minY
return CGRect(x: minX, y: minY, width: combinedWidth, height: combinedHeight)
}
#if canImport(DockKit)
func trackObservation(_ boundingBox: CGRect, _ dockAccessory: DockAccessory, _ pixelBuffer: CVPixelBuffer, _ cmSampelBuffer: CMSampleBuffer) throws {
// Zähle den Aufruf
TrackMonitor.shared.trackCalled()
let invertedBoundingBox = CGRect(
x: boundingBox.origin.x,
y: 1.0 - boundingBox.origin.y - boundingBox.height,
width: boundingBox.width,
height: boundingBox.height
)
guard let device = captureDevice else {
fatalError("Camera not available")
}
let size = CGSize(width: Double(CVPixelBufferGetWidth(pixelBuffer)),
height: Double(CVPixelBufferGetHeight(pixelBuffer)))
var cameraIntrinsics: matrix_float3x3? = nil
if let cameraIntrinsicsUnwrapped = CMGetAttachment(
sampleBuffer,
key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
attachmentModeOut: nil
) as? Data {
cameraIntrinsics = cameraIntrinsicsUnwrapped.withUnsafeBytes { $0.load(as: matrix_float3x3.self) }
}
Task {
let orientation = getCameraOrientation()
let cameraInfo = DockAccessory.CameraInformation(
captureDevice: device.deviceType,
cameraPosition: device.position,
orientation: orientation,
cameraIntrinsics: cameraIntrinsics,
referenceDimensions: size
)
let observation = DockAccessory.Observation(
identifier: 0,
type: .object,
rect: invertedBoundingBox
)
let observations = [observation]
guard let image = CMSampleBufferGetImageBuffer(sampleBuffer) else {
print("no image")
return
}
do {
try await dockAccessory.track(observations, cameraInformation: cameraInfo)
} catch {
print(error)
}
}
}
#endif
func clearDrawings() {
boundingBoxLayer?.removeFromSuperlayer()
boundingBoxSizeLayer?.removeFromSuperlayer()
}
}
}
}
@MainActor
private func getCameraOrientation() -> DockAccessory.CameraOrientation {
switch UIDevice.current.orientation {
case .portrait:
return .portrait
case .portraitUpsideDown:
return .portraitUpsideDown
case .landscapeRight:
return .landscapeRight
case .landscapeLeft:
return .landscapeLeft
case .faceDown:
return .faceDown
case .faceUp:
return .faceUp
default:
return .corrected
}
}
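A note on the coordinate handling in the code above: Vision reports bounding boxes in normalized coordinates with the origin at the bottom-left, and the posted trackObservation flips the y-axis before building the observation. The following is a minimal standalone sketch of that flip, assuming the receiving API expects a top-left origin. The function name flippedToTopLeft is made up for illustration and is not part of Vision or DockKit.

```swift
import Foundation

// Converts a normalized rect with a bottom-left origin (as Vision reports it)
// into the same rect expressed with a top-left origin.
// Assumption: all values are normalized to the 0...1 range.
func flippedToTopLeft(_ rect: CGRect) -> CGRect {
    CGRect(
        x: rect.origin.x,
        y: 1.0 - rect.origin.y - rect.height,
        width: rect.width,
        height: rect.height
    )
}

// A box near the bottom of the frame ends up near the bottom in
// top-left coordinates too, which means a larger y value.
let visionBox = CGRect(x: 0.25, y: 0.125, width: 0.5, height: 0.25)
let flipped = flippedToTopLeft(visionBox)
print(flipped.origin.y)
```

Applying the flip twice returns the original rect, which is a quick sanity check when debugging which coordinate space a value is in.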
From tensorflow-metal example:
Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
I know that Apple silicon uses UMA, and that memory copies are typical of CUDA, but wouldn't the GPU memory still be faster overall?
I have an iMac Pro with a Radeon Pro Vega 64 16 GB GPU and an Intel iMac with a Radeon Pro 5700 8 GB GPU.
But using tensorflow-metal is still WAY faster than using the CPUs. Thanks for that. I am surprised the 5700 is twice as fast as the Vega though.
*I couldn't attach the files to this post, so if you reply by email, I will send them to you by email.
Dear Apple AI Research Team,
My name is Gong Jiho (“Hem”), a content strategist based in Seoul, South Korea.
Over the past few months, I conducted a user-led AI experiment entirely within ChatGPT — no code, no backend tools, no plugins.
Through language alone, I created two contrasting agents (Uju and Zero) and guided them into a co-authored modular identity system using prompt-driven dialogue and reflection.
This system simulates persona fusion, memory rooting, and emotional-logical alignment — all via interface-level interaction.
I believe it resonates with Apple’s values in privacy-respecting personalization, emotional UX modeling, and on-device learning architecture.
Why I’m Reaching Out
I’d be honored to share this experiment with your team.
If there is any interest in discussing user-authored agent scaffolding, identity persistence, or affective alignment, I’d love to contribute — even informally.
⚠ A Note on Language
As a non-native English speaker, my expression may be imperfect — but my intent is genuine.
If anything is unclear, I’ll gladly clarify.
📎 Attached Files Summary
Filename → Description
Hem_MultiAI_Report_AppleAI_v20250501.pdf → Main report tailored for Apple AI: narrative + structural view of emotional identity formation via prompt scaffolding
Hem_MasterPersonaProfile_v20250501.json → Final merged identity schema authored by Uju and Zero
zero_sync_final.json / uju_sync_final.json → Persona-level memory structures (logic / emotion)
1_0501.json ~ 3_0501.json → Evolution logs of the agents over time
GirlfriendGPT_feedback_summary.txt → Emotional interpretation by external GPT
hem_profile_for_AI_vFinal.json → Original user anchor profile
Warm regards,
Gong Jiho (“Hem”)
Seoul, South Korea
When calling NLTagger.requestAssets with some languages, it hangs indefinitely, both in the simulator and on a device. This happens consistently for some languages, such as Greek. An example call is NLTagger.requestAssets(for: .greek, tagScheme: .lemma). Other languages, such as French, return immediately. I captured some logs from Console and found what look like repeated attempts to download the asset. I would expect the call to terminate eventually, either by loading the asset or by failing with an error.
Introduced in the Keynote was the 3D Lock Screen images with the kangaroo:
https://9to5mac.com/wp-content/uploads/sites/6/2025/06/3d-lock-screen-2.gif
I can't find any mention of whether this effect is available to developers through an API that converts flat 2D photos into images with the same 3D feel.
Does anyone know if there is an API?
Topic:
Machine Learning & AI
SubTopic:
General
How do I test the new RecognizeDocumentsRequest API? Reference: https://www.youtube.com/watch?v=H-GCNsXdKzM
I am running the Xcode beta, but I only have one primary device, and I cannot install beta software on it.
Please suggest a strategy for testing. Will the simulator work?
The new capability is critical to my application, just what I need for structuring document scans and extraction.
Thank you.
Hey guys 👋
I’ve been thinking about a feature idea for iOS that could totally change the way we interact with apps like Twitter/X.
Imagine if we could define our own recommendation algorithm, and have an AI on the iPhone that replaces the suggested tweets in the feed with ones that match our personal interests — based on public tweets, and without hacking anything.
Kinda like a personalized "AI skin" over the app that curates content you actually care about. Feels like this would make content way more relevant and less algorithmically manipulative.
Would love to know what you all think — and if Apple could pull this off 🔥
Hi, I'm looking for the best way to use MLX models, particularly those I've fine-tuned, within a React Native application on iOS devices. Is there a recommended integration path or specific API for bridging MLX's capabilities to React Native for deployment on iPhones and iPads?
While testing the “Bringing advanced speech-to-text capabilities to your app” sample app, which demonstrates the use of the iOS 26 SpeechAnalyzer, I noticed that the language model for the English locale appeared to be already downloaded. The AssetInventory documentation confirms that the language model can be preinstalled on the system.
Can someone from the dev team share more info about what assets are preinstalled by the system? For example, can we safely assume that the English language model will almost certainly be already preinstalled by the OS if the phone has the English locale?
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks.
What are you most excited about in the Foundation Models framework?
The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage.
When should I still bring my own LLM via CoreML?
It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates.
Should I migrate PyTorch code to MLX?
MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models.
Can I test Foundation Models in Xcode simulator or device?
Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 26 by connecting it to your Mac and running Playgrounds or live previews directly on the device.
Which on-device models will be supported? Any open-source models?
The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models.
How often will the Foundational Model be updated? How do we test for stability when the model is updated?
The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant.
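The "eval set" idea above can be sketched in a few lines. This is an illustrative harness only, not an Apple API: GoldenCase, failingPrompts, and the stub responses are made-up names and data, and the generate closure stands in for whatever actually produces the model response (a real call would be asynchronous).

```swift
import Foundation

// A golden case: a prompt plus substrings the response is expected to contain.
struct GoldenCase {
    let prompt: String
    let mustContain: [String]
}

// Runs every case through `generate` and returns the prompts whose response
// is missing at least one expected substring (case-insensitive).
func failingPrompts(
    in cases: [GoldenCase],
    generate: (String) -> String
) -> [String] {
    cases.filter { goldenCase in
        let response = generate(goldenCase.prompt).lowercased()
        return !goldenCase.mustContain.allSatisfy { response.contains($0.lowercased()) }
    }.map(\.prompt)
}

// Stub generator so the sketch runs without a model.
let stubResponses = [
    "Suggest a search term for hiking": "Try searching for mountain trails",
    "Name a fruit": "A banana is a good choice",
]
let goldenSet = [
    GoldenCase(prompt: "Suggest a search term for hiking", mustContain: ["trail"]),
    GoldenCase(prompt: "Name a fruit", mustContain: ["apple"]),
]

let failures = failingPrompts(in: goldenSet) { stubResponses[$0] ?? "" }
print("\(goldenSet.count - failures.count) of \(goldenSet.count) passed; failing: \(failures)")
```

Rerunning the same set after each OS beta or prompt tweak gives a simple regression signal. Substring checks are a deliberately crude scoring choice here; a real eval would likely need something more tolerant of rephrasing.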
Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you.
The Vision framework offers RecognizeDocumentsRequest, which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices.
What is the context window for the model? What are max tokens in and max tokens out?
The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded.
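The arithmetic in the answer above can be sketched as a rough budget check. This is only an estimate built from the quoted rule of thumb, not a real tokenizer; the function names are made up for illustration.

```swift
// Context window size quoted in the answer above.
let contextWindow = 4_096

// Rough token estimate from character count. `charactersPerToken` is the
// rule-of-thumb ratio: about 3 to 4 for English, about 1 for Japanese or Chinese.
func estimatedTokens(for text: String, charactersPerToken: Double) -> Int {
    Int((Double(text.count) / charactersPerToken).rounded(.up))
}

// Tokens left for the response once the input is counted, never below zero.
func remainingOutputTokens(afterInputTokens input: Int) -> Int {
    max(0, contextWindow - input)
}

// 14,000 characters at 3.5 characters per token is an estimated 4,000 tokens,
// which leaves 96 tokens for output, matching the example in the answer.
let prompt = String(repeating: "a", count: 14_000)
let inputEstimate = estimatedTokens(for: prompt, charactersPerToken: 3.5)
let outputBudget = remainingOutputTokens(afterInputTokens: inputEstimate)
print(inputEstimate, outputBudget)
```

A check like this can decide up front whether to shorten the prompt or start a new session, rather than waiting for the token-limit error.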
Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone?
Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively.
Do the foundation models support languages other than English?
Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice.
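As a small illustration of the instruction string mentioned above, the sketch below builds it from a locale identifier. It only produces the text; how that text is passed to the model is not shown. The helper name is hypothetical, and the underscore-to-hyphen replacement is a simplification that assumes plain identifiers such as en_US.

```swift
import Foundation

// Builds the instruction sentence quoted in the answer from a locale identifier.
// Foundation identifiers often use underscores ("en_US"); this swaps them for
// hyphens so the result reads "en-US".
func languageInstruction(forLocaleIdentifier identifier: String) -> String {
    let tag = identifier.replacingOccurrences(of: "_", with: "-")
    return "The user's preferred language is \(tag)"
}

// Uses whatever locale the current environment reports.
print(languageInstruction(forLocaleIdentifier: Locale.current.identifier))
```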
Are larger server-based models available through Foundation Models?
No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons.
Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework?
Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
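The nearest-neighbor step mentioned above can be sketched as a brute-force cosine similarity search. This is a minimal illustration with made-up vectors and identifiers; producing the embeddings (for example with the Natural Language framework or a Core ML model) and storing them are out of scope here.

```swift
// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have the same dimension")
    let dot = zip(a, b).reduce(0.0) { sum, pair in sum + pair.0 * pair.1 }
    let normA = a.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    let normB = b.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

// Brute-force search: score every stored vector and keep the k best ids.
func nearest(
    to query: [Double],
    in store: [(id: String, vector: [Double])],
    k: Int
) -> [String] {
    store
        .map { (id: $0.id, score: cosineSimilarity(query, $0.vector)) }
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map { $0.id }
}

// Made-up three-dimensional "embeddings" for illustration.
let store: [(id: String, vector: [Double])] = [
    (id: "receipt", vector: [1, 0, 0]),
    (id: "label", vector: [0, 1, 0]),
    (id: "invoice", vector: [0.9, 0.1, 0]),
]
let hits = nearest(to: [1, 0, 0], in: store, k: 2)
print(hits)
```

A linear scan like this is usually fine for a few thousand vectors; larger collections would call for an index, which is a separate design decision.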
Does Core ML object detection only support AABB (Axis-Aligned Bounding Boxes), or does it also support OBB (Oriented Bounding Boxes)? If not, is there any way to do it using Apple frameworks?
At WWDC25, Metal 4 introduced some exciting new features for machine learning optimization. As we all know, PyTorch's Metal Performance Shaders (MPS) backend is one of the most important tools for machine learning on the Mac, but the MPS introduction page shows no information about Metal 4 support.