Is face and body detection in the Vision framework a local model or a cloud model?

Is the face and body detection service in the Vision framework a local model or a cloud model? Is there a performance report? https://developer.apple.com/documentation/vision

All Vision framework requests run on device. They are heavily performance-tuned for on-device inference while still maintaining high quality and accuracy.
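For illustration, here is a minimal sketch of what running face and body detection locally might look like. The helper name `detectFacesAndBodies` and its return shape are made up for this example; `VNDetectFaceRectanglesRequest`, `VNDetectHumanRectanglesRequest`, and `VNImageRequestHandler` are existing Vision APIs. It assumes a recent SDK (iOS 15 / macOS 12 or later) where request results are typed arrays.

```swift
import Vision
import CoreGraphics

// Hypothetical helper (not part of Vision): counts the faces and human
// bodies that Vision finds in a CGImage.
func detectFacesAndBodies(in image: CGImage) throws -> (faces: Int, bodies: Int) {
    // Face bounding boxes.
    let faceRequest = VNDetectFaceRectanglesRequest()
    // Human body bounding boxes.
    let bodyRequest = VNDetectHumanRectanglesRequest()

    // The handler runs both requests against the same image.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([faceRequest, bodyRequest])

    // `results` is nil if a request produced nothing.
    let faces = faceRequest.results ?? []
    let bodies = bodyRequest.results ?? []
    return (faces.count, bodies.count)
}
```

Nothing in this sketch configures a network endpoint or API key, which is consistent with the requests being processed on device. Each observation also carries a normalized `boundingBox` if you need positions rather than counts.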
