Instruments is your friend. Check this WWDC video: https://developer.apple.com/videos/play/wwdc2023/10049.
Core ML used to serialize predictions per MLModel instance. In recent years this per-instance lock has been relaxed, but the optimization is often available only with the newer model type (ML Program) and the newer API usage (async predictions).
Using Instruments, we can see which activities are serialized and then make an informed decision about how to use the available compute resources.
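For illustration, here is a minimal sketch of what "async predictions" on a single MLModel instance might look like. It assumes the model is an ML Program and a deployment target of iOS 17 / macOS 14 or later, where the async `prediction(from:)` method is available. The function name `predictConcurrently` and its parameters are hypothetical, not part of Core ML. It is written for Swift 5 language mode; strict concurrency checking may complain about passing non-Sendable values into the task group. Whether the predictions actually overlap depends on the model and the compute unit, which is exactly what Instruments can confirm.

```swift
import CoreML

/// Hypothetical helper: issues one async prediction per input on a shared
/// MLModel instance and returns the outputs in the original input order.
@available(macOS 14.0, iOS 17.0, tvOS 17.0, watchOS 10.0, *)
func predictConcurrently(
    model: MLModel,
    inputs: [MLFeatureProvider]
) async throws -> [MLFeatureProvider] {
    try await withThrowingTaskGroup(of: (Int, MLFeatureProvider).self) { group in
        for (index, input) in inputs.enumerated() {
            group.addTask {
                // Async prediction API; Core ML decides how much of the
                // work can really run in parallel.
                let output = try await model.prediction(from: input)
                return (index, output)
            }
        }

        // Child tasks finish in arbitrary order, so place each result
        // back at the index of the input that produced it.
        var results = [MLFeatureProvider?](repeating: nil, count: inputs.count)
        for try await (index, output) in group {
            results[index] = output
        }
        return results.compactMap { $0 }
    }
}
```

Profiling a call like this with the Core ML instrument should show whether the individual predictions overlap or are still being serialized.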