Reply to Core ML Model Performance report shows prediction speed much faster than actual app runs
Hello, I'm having a similar issue with my model and app. The performance report shows a median execution time of ~1.5 ms, but the average time reported while running in my app is ~3.8 ms. My typical execution spacing is one inference every ~25 ms, yet within the app I can replicate the 1.5 ms execution time by manually running inference 20 times back to back.

[Screenshot: Instruments trace after running inference repeatedly — the inference time drops gradually from the slow time (3.8 ms) to the fast time (1.5 ms).]

[Screenshot: Instruments trace with a more typical usage profile — the inference time stays at 3.8 ms.]

It seems some sort of caching occurs that gets cleared if you wait a few milliseconds before running another inference. Is there a way to maintain the fast inference time even with some regular spacing between inferences?
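For reference, this is roughly how I time the back-to-back runs — a minimal sketch, assuming you already have a loaded `MLModel` and a matching `MLFeatureProvider` (the `model` and `input` parameters here are placeholders for your own model and feature provider):

```swift
import CoreML
import Foundation

/// Runs inference `runs` times back to back and prints per-run latency.
/// With no delay between calls, later runs settle at the "fast" time;
/// inserting a sleep between calls reproduces the "slow" time instead.
func measureInference(model: MLModel,
                      input: MLFeatureProvider,
                      runs: Int = 20,
                      gap: TimeInterval = 0) throws {
    for i in 0..<runs {
        let start = CFAbsoluteTimeGetCurrent()
        _ = try model.prediction(from: input)
        let ms = (CFAbsoluteTimeGetCurrent() - start) * 1000
        print("run \(i): \(String(format: "%.2f", ms)) ms")
        if gap > 0 {
            Thread.sleep(forTimeInterval: gap) // e.g. 0.025 to mimic ~25 ms spacing
        }
    }
}
```

Calling this with `gap: 0` shows the warm ~1.5 ms timings, while `gap: 0.025` (matching my real usage pattern) stays at ~3.8 ms.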
Topic: Machine Learning & AI SubTopic: Core ML Tags:
Oct ’25