Using Metal, IOSurface and Virtualization.framework

Question

Created Jun ’26

This post is from the WWDC26 Virtualization Q&A.

For a 3D workload going through Virtualization.framework's graphics device, what's the supported/expected path for the guest to drive host-side Metal rendering, and what are the known performance sharp edges?
Is zero-copy IOSurface sharing between host and guest a sanctioned pattern? What are the best practices and limits for high-throughput texture handoff across the VM boundary?
We translate a foreign 3D command stream into Metal on the host. Any guidance on command-buffer batching, synchronization, or present/vsync alignment to keep latency low?

Answer 1

Graphics and Games Engineer OP

Apple

Jun ’26

Accepted Answer

macOS guests include a driver that forwards Metal rendering through the host virtualization stack. The device should appear in the list of MTLDevices returned from MTLCopyAllDevices(), or as the result of MTLCreateSystemDefaultDevice(). As for sharp edges: creating and deleting resources is more expensive than on a physical machine, and command buffer completion is likely to have more latency than on a physical machine. MTLEvents are resolved in the guest, so they will have additional latency. Native Metal hazard tracking is done on the host and will be faster.
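To make the device lookup and the resource-cost point concrete, here is a minimal sketch rather than a sanctioned implementation. It lists the devices a guest process can see and reuses buffers instead of creating and destroying them every frame. `visibleDeviceNames` and `BufferPool` are hypothetical names introduced for illustration, and bucketing by exact length is an assumption, not something the answer prescribes.

```swift
import Metal

// Names of every Metal device this process can see (macOS-only API).
// In a macOS guest, the forwarded GPU is expected to appear in this list.
func visibleDeviceNames() -> [String] {
    return MTLCopyAllDevices().map { $0.name }
}

// Hypothetical helper: hands out MTLBuffers and takes them back for reuse,
// so steady-state frames avoid the costlier create/delete path in a guest.
// Not thread-safe; call it from a single thread or add your own locking.
final class BufferPool {
    private let device: MTLDevice
    private var free: [Int: [MTLBuffer]] = [:]   // keyed by buffer length

    init(device: MTLDevice) {
        self.device = device
    }

    // Return a recycled buffer of exactly `length` bytes, or create a new one.
    func checkout(length: Int) -> MTLBuffer? {
        if let buffer = free[length]?.popLast() {
            return buffer
        }
        return device.makeBuffer(length: length, options: .storageModeShared)
    }

    // Give a buffer back once the GPU has finished using it.
    func recycle(_ buffer: MTLBuffer) {
        free[buffer.length, default: []].append(buffer)
    }
}
```

A buffer should only be recycled after the command buffer that uses it has completed, for example from a completion handler.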
IOSurfaces are indeed zero-copy between host and guest, and so are MTLBuffers. Wrapping an IOSurface with an MTLTexture gives you a zero-copy texture.
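As a rough illustration of the wrapping step, the sketch below creates an IOSurface and backs an MTLTexture with it inside one process. It does not show how a surface is handed across the VM boundary, which the answer does not describe. `makeSurfaceBackedTexture` is a hypothetical helper, and the BGRA format, usage flags, and shared storage mode (assuming Apple silicon) are assumptions chosen for the example.

```swift
import Metal
import IOSurface
import CoreVideo

// Hypothetical helper: allocate a BGRA IOSurface and wrap it in an MTLTexture.
// The texture aliases the surface's memory, so no pixel copy is involved.
func makeSurfaceBackedTexture(device: MTLDevice,
                              width: Int,
                              height: Int) -> (IOSurfaceRef, MTLTexture)? {
    let properties: [String: Any] = [
        kIOSurfaceWidth as String: width,
        kIOSurfaceHeight as String: height,
        kIOSurfaceBytesPerElement as String: 4,
        kIOSurfacePixelFormat as String: kCVPixelFormatType_32BGRA
    ]
    guard let surface = IOSurfaceCreate(properties as CFDictionary) else {
        return nil
    }

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .bgra8Unorm,
        width: width,
        height: height,
        mipmapped: false)
    descriptor.usage = [.shaderRead, .renderTarget]
    // Assumption: Apple silicon, where shared storage is valid for textures.
    descriptor.storageMode = .shared

    guard let texture = device.makeTexture(descriptor: descriptor,
                                           iosurface: surface,
                                           plane: 0) else {
        return nil
    }
    return (surface, texture)
}
```

Given the resource-creation cost mentioned above, it is probably worth creating a small set of such textures up front and cycling through them instead of allocating one per frame.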
Getting a tight vsync bound isn't possible: there's no mechanism for synchronizing the guest display with the host compositor, but frames are presented as eagerly as possible. The standard rules of Metal batching apply: avoid waitUntilCompleted, and use render passes properly.
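One common way to avoid waitUntilCompleted is to bound the number of frames in flight with a semaphore and release a slot from a completion handler. The sketch below shows that pattern under stated assumptions: `FrameSubmitter` is a hypothetical name, and the limit of three frames is a conventional default rather than a value taken from the answer.

```swift
import Metal
import Dispatch

// Hypothetical helper: submits command buffers without blocking on each one.
// The CPU only waits when `maxFramesInFlight` buffers are still pending.
final class FrameSubmitter {
    private let queue: MTLCommandQueue
    private let inFlight: DispatchSemaphore

    init?(device: MTLDevice, maxFramesInFlight: Int = 3) {
        guard let queue = device.makeCommandQueue() else { return nil }
        self.queue = queue
        self.inFlight = DispatchSemaphore(value: maxFramesInFlight)
    }

    // `encode` should record the frame's work into the command buffer,
    // ideally grouping draws into as few render passes as the frame allows.
    func submit(encode: (MTLCommandBuffer) -> Void) {
        inFlight.wait()   // blocks only when the in-flight limit is reached
        guard let commandBuffer = queue.makeCommandBuffer() else {
            inFlight.signal()
            return
        }
        encode(commandBuffer)

        // Free the slot asynchronously instead of calling waitUntilCompleted.
        let semaphore = inFlight
        commandBuffer.addCompletedHandler { _ in
            semaphore.signal()
        }
        commandBuffer.commit()
    }
}
```

Since completion latency is likely higher in a guest, the in-flight limit may need tuning; a larger value hides more latency at the cost of more queued frames.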