Can iOS capture video at 4032×3024 while running a Vision/ML model?

I am new to Swift and iOS development, and I have a question about video capture performance.

Is it possible to capture video at a resolution of 4032×3024 while simultaneously running a vision/ML model on the video stream (e.g., using Vision or CoreML)?

I want to know:

whether iOS devices support capturing video at that resolution,

whether the frame rate drops significantly at that scale,

and whether it is practical to run a Vision/ML model in real time while recording at such a high resolution.

If anyone has experience with high-resolution AVCaptureSession setups or combining them with real-time ML processing, I would really appreciate guidance or sample code.

Hi mujahirabbasi, to answer the first question: yes, you can stream video buffers through AVCaptureVideoDataOutput at 4032×3024 on most cameras. Whether Vision can keep up with 30 fps at 12 MP is a different story; that depends on the model and its complexity.
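A minimal sketch of the capture side, assuming the back wide-angle camera exposes a 4032×3024 format (it falls back to the default format otherwise). The session preset, format search, and pixel format choice here are illustrative, not the only valid configuration:

```swift
import AVFoundation

// Stream full-resolution frames from the back camera through AVCaptureVideoDataOutput.
final class HighResCapture: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let videoOutput = AVCaptureVideoDataOutput()
    private let captureQueue = DispatchQueue(label: "video.capture.queue")

    func configure() throws {
        guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video, position: .back) else { return }
        let input = try AVCaptureDeviceInput(device: camera)

        session.beginConfiguration()
        session.sessionPreset = .inputPriority   // let activeFormat drive the resolution
        if session.canAddInput(input) { session.addInput(input) }

        // Pick a device format with 4032×3024 video dimensions, if the camera offers one.
        if let format = camera.formats.first(where: {
            let dims = CMVideoFormatDescriptionGetDimensions($0.formatDescription)
            return dims.width == 4032 && dims.height == 3024
        }) {
            try camera.lockForConfiguration()
            camera.activeFormat = format
            camera.unlockForConfiguration()
        }

        videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String:
                                        kCVPixelFormatType_420YpCbCr8BiPlanarFullRange]
        videoOutput.alwaysDiscardsLateVideoFrames = true
        videoOutput.setSampleBufferDelegate(self, queue: captureQueue)
        if session.canAddOutput(videoOutput) { session.addOutput(videoOutput) }

        session.commitConfiguration()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Full 12 MP buffers arrive here; hand them to your writer/encoder or to Vision.
    }
}
```

Whether the sustained frame rate holds at that resolution is device- and format-dependent, so check the chosen format's videoSupportedFrameRateRanges before committing to 30 fps.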

If you need the 12 MP buffers for saving to storage but the Vision processing takes too long, you could always add a second AVCaptureVideoDataOutput sourced from the same camera, request a lower resolution for it, and run that stream through Vision to get the inferences you need. Most models run internally at a lower resolution anyway.
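For the inference side, a sketch of running a Core ML model on incoming buffers without blocking recording. "MyDetector" is a hypothetical Xcode-generated model class, and the frame-dropping flag is a simplification; note that Vision scales each buffer to the model's input size internally, so inference cost does not grow with capture resolution:

```swift
import AVFoundation
import Vision
import CoreML

// Run a Core ML classifier on capture frames, dropping frames while a request is in flight.
final class FrameClassifier {
    private let request: VNCoreMLRequest
    private var isBusy = false
    private let visionQueue = DispatchQueue(label: "vision.inference.queue")

    init() throws {
        // "MyDetector" is a placeholder for your own .mlmodel's generated class.
        let model = try VNCoreMLModel(for: MyDetector(configuration: MLModelConfiguration()).model)
        request = VNCoreMLRequest(model: model) { request, _ in
            guard let results = request.results as? [VNClassificationObservation] else { return }
            // Handle observations (e.g., dispatch to the main queue to update UI).
            print(results.first?.identifier ?? "none")
        }
        request.imageCropAndScaleOption = .scaleFill
    }

    // Call this from captureOutput(_:didOutput:from:). Recording of the full-resolution
    // stream is never blocked, because slow inferences simply cause frames to be skipped.
    func process(_ sampleBuffer: CMSampleBuffer) {
        guard !isBusy,
              let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        isBusy = true
        let request = self.request
        visionQueue.async { [weak self] in
            defer { self?.isBusy = false }
            let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
            try? handler.perform([request])
        }
    }
}
```

One caveat on the two-output approach: attaching more than one AVCaptureVideoDataOutput to a single camera in the same session is only supported on recent iOS releases, so check canAddOutput before relying on it; otherwise, feeding Vision a throttled subset of the full-resolution frames, as above, is a reasonable fallback.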
