Render advanced 3D graphics and perform data-parallel computations on graphics processors using Metal.

Metal Documentation

Posts under Metal subtopic

Xcode Metal Capture crash when using MTLSamplerState
The sample code just draws a triangle and samples a texture. Both samples draw the triangle and sample the texture correctly, and there are no error messages in the terminal. The sample that uses a constexpr sampler captures and replays fine. The sample that uses an argument table to bind an MTLSamplerState crashes when capturing and replaying in Xcode. Here is the sample code. Test environment: M1 Pro, macOS 26.3 (25D125), Xcode Version 26.2 (17C52). Feedback ID: FB22031701
Replies: 2 · Boosts: 0 · Views: 329 · Activity: 2w
Question on setVertexBytes
I think it's recommended to use setVertexBytes if your buffer is less than 4 KB. The question I have is: can I keep hammering on setVertexBytes as the primary method to issue multiple draw calls within a render buffer, and rely on Metal to figure out how to orphan and replace the target buffer? A lot of the primitives I am drawing are less than 4 KB, and the process of wiring down larger segments of memory for individual buffers for each draw call seems like a negative. It's also just simpler to copy, submit, and forget about buffer synchronization.
Replies: 2 · Boosts: 0 · Views: 555 · Activity: 2w
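For questions like the one above, here is a minimal Swift sketch of the pattern being described, assuming a classic MTLRenderCommandEncoder and a hypothetical Uniforms struct: setVertexBytes(_:length:index:) copies the bytes into the command stream at encode time, so the caller never manages or synchronizes an MTLBuffer for payloads under the documented 4 KB ceiling.

import Metal
import simd

// Hypothetical per-draw constants; well under 4 KB.
struct Uniforms {
    var transform: simd_float4x4
    var color: simd_float4
}

func encode(draws: [Uniforms], encoder: MTLRenderCommandEncoder) {
    for var uniforms in draws {
        // Metal copies these bytes into the command buffer here; no buffer orphaning,
        // no completion-handler bookkeeping, and `uniforms` can go away immediately after.
        encoder.setVertexBytes(&uniforms,
                               length: MemoryLayout<Uniforms>.stride,
                               index: 1)
        encoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
    }
}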
Can a compute pipeline be as efficient as a render pipeline for rasterization?
I'm new to graphics and game design, and I wanted to know whether a compute pipeline can be as efficient as a render pipeline for rasterization, with an explanation of how and why. Also, is it possible to perform rasterization manually, as in manipulating individual pixel data in a Metal texture yourself, but do it with a render pipeline?
Replies: 1 · Boosts: 0 · Views: 331 · Activity: 2w
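As a rough illustration of the "write pixels yourself" half of the question above, here is a hedged Swift-side sketch of dispatching one compute thread per pixel of a texture. The kernel name fillPixels is hypothetical, and the actual per-pixel work (a texture.write call in the shader) is not shown; this only demonstrates how such a pass is set up from the CPU.

import Metal

// Hedged sketch: cover a texture with a compute grid so a kernel can write each pixel directly.
// `fillPixels` is a hypothetical kernel; `target` needs MTLTextureUsage.shaderWrite.
func dispatchManualRaster(device: MTLDevice,
                          queue: MTLCommandQueue,
                          library: MTLLibrary,
                          target: MTLTexture) throws {
    let function = library.makeFunction(name: "fillPixels")!
    let pipeline = try device.makeComputePipelineState(function: function)

    let commandBuffer = queue.makeCommandBuffer()!
    let encoder = commandBuffer.makeComputeCommandEncoder()!
    encoder.setComputePipelineState(pipeline)
    encoder.setTexture(target, index: 0)

    // One thread per pixel; threadgroup shape derived from the pipeline's execution width.
    let w = pipeline.threadExecutionWidth
    let h = pipeline.maxTotalThreadsPerThreadgroup / w
    encoder.dispatchThreads(MTLSize(width: target.width, height: target.height, depth: 1),
                            threadsPerThreadgroup: MTLSize(width: w, height: h, depth: 1))
    encoder.endEncoding()
    commandBuffer.commit()
}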
Xcode26 Replay frame broken
Got a broken frame when using Xcode to capture a frame from a Unity game and replay it. It seems like the vertex buffer is broken; I see a bunch of "nan"s in the vertex buffer. However, the game displays correctly when running, and it only happened after I upgraded my Xcode and iPhone to Xcode 26 and iOS 26.
Replies: 1 · Boosts: 0 · Views: 281 · Activity: 2w
Float64 (Double Precision) Support on MPS with PyTorch on Apple Silicon?
Hi everyone, this project uses PyTorch on an Apple Silicon Mac (M1/M2/etc.), and the goal is to use the MPS backend for GPU acceleration. However, the workflow depends on Float64 (double-precision) floating-point numbers for certain computations. The error "Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead" has been encountered. It seems that the MPS backend doesn't currently support Float64 for direct GPU computation. Questions for the community: Are there any known workarounds or best practices for handling Float64-dependent operations when using the MPS backend with PyTorch? For those working with high-precision tasks on Apple Silicon, what strategies are being used to balance performance with the need for Float64? Offloading to the CPU is an option, and it would be good to know whether there are specific techniques or libraries within the Apple ecosystem that could streamline this process while aiming for optimal performance. Any insights, tips, or experiences would be appreciated. Thanks in advance, Jonaid. MacBook Pro M3 Max
Replies: 2 · Boosts: 1 · Views: 501 · Activity: Oct ’25
iPhone limited to 60hz frame rate
Just wondering if anyone knows what it will take to hit greater than 60 Hz when targeting iPhone. If I set the preferredFramesPerSecond of an MTKView to 120, it works on the iPad, but on iPhone it never goes over 60 Hz, even with a simple hello-triangle sample app... is this a limitation of targeting iPhone?
Replies: 2 · Boosts: 0 · Views: 235 · Activity: Sep ’25
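One detail relevant to the post above: on iPhone, requesting more than 60 Hz from CADisplayLink-driven views also requires the CADisableMinimumFrameDurationOnPhone Info.plist key, and only ProMotion (120 Hz) displays can honor the request. A minimal view-side sketch, assuming that plist key is already set to YES:

import MetalKit
import UIKit

// Hedged sketch: ask the MTKView for the display's maximum rate. On iPhone this only
// takes effect on ProMotion hardware, and only when the app's Info.plist contains
// CADisableMinimumFrameDurationOnPhone = YES; otherwise the view stays capped at 60 Hz.
func configure(view: MTKView) {
    let maxRate = UIScreen.main.maximumFramesPerSecond   // 120 on ProMotion devices, 60 otherwise
    view.preferredFramesPerSecond = maxRate
}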
How to use MetalPerformancePrimitives
I am trying to learn the new Metal Performance Primitives APIs. I have added the MetalPerformancePrimitives framework and included the header in my shader code as per the documentation: #include <MetalPerformancePrimitives/MetalPerformancePrimitives.h> Unfortunately, Xcode complains that the header cannot be found. How do I include it properly? I am using Xcode 26 on Tahoe. The MetalPerformancePrimitives framework is present on my machine and I can inspect the headers in the filesystem.
Replies: 3 · Boosts: 1 · Views: 800 · Activity: Oct ’25
Help Request! How to Render Models with SubMeshes Using Metal 4?
Hi, I'm a beginner with Metal 4 and Model I/O 🥺. I can render simple models with just one mesh, but when I try to render models with submeshes, nothing shows up on screen. Can anyone help me figure out how to properly render models with multiple submeshes? I think I'm not iterating through them correctly, or maybe I'm missing some buffer setup. Here's what I have so far: https://www.icloud.com.cn/iclouddrive/0a6x_NLwlWy-herPocExZ8g3Q#LoadModel
Replies: 1 · Boosts: 0 · Views: 318 · Activity: Nov ’25
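A hedged sketch of the usual Model I/O / MetalKit iteration the post above is after, written against the classic MTLRenderCommandEncoder API (the Metal 4 encoder types differ, but the per-submesh indexed draw is the same idea). Mesh loading and pipeline setup are assumed to exist already.

import MetalKit

// Hedged sketch: one indexed draw call per submesh of every MTKMesh.
func draw(meshes: [MTKMesh], encoder: MTLRenderCommandEncoder) {
    for mesh in meshes {
        // Bind each of the mesh's vertex buffers at its layout index.
        for (index, vertexBuffer) in mesh.vertexBuffers.enumerated() {
            encoder.setVertexBuffer(vertexBuffer.buffer,
                                    offset: vertexBuffer.offset,
                                    index: index)
        }
        // A model with several materials has several submeshes; each needs its own draw.
        for submesh in mesh.submeshes {
            encoder.drawIndexedPrimitives(type: submesh.primitiveType,
                                          indexCount: submesh.indexCount,
                                          indexType: submesh.indexType,
                                          indexBuffer: submesh.indexBuffer.buffer,
                                          indexBufferOffset: submesh.indexBuffer.offset)
        }
    }
}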
Metal: Intersection results unstable when reusing Instance Acceleration Structures
Hi all, I'm encountering an issue with Metal raytracing on my M5 MacBook Pro regarding Instance Acceleration Structure (IAS). Intersection tests suddenly stop working after a certain point in the sampling loop. Situation I implemented an offline GPU path tracer that runs the same kernel multiple times per pixel (sampleCount) using metal::raytracing. Intersection tests are performed using an IAS. Since this is an offline path tracer, geometries inside the IAS never changes across samples (no transforms or updates). As sampleCount increases, there comes a point where the number of intersections drops to zero, and remains zero for all subsequent samples. Here's a code sketch: let sampleCount: UInt16 = 1024 for sampleIndex: UInt16 in 0..<sampleCount { // ... do { let commandBuffer = commandQueue.makeCommandBuffer() // Dispatch the intersection kernel. await commandBuffer.completed() } do { let commandBuffer = commandQueue.makeCommandBuffer() // Use the intersection test results from the previous command buffer. await commandBuffer.completed() } // ... } kernel void intersectAlongRay( const metal::uint32_t threadIndex [[thread_position_in_grid]], // ... const metal::raytracing::instance_acceleration_structure accelerationStructure [[buffer(2)]], // ... ) { // ... const auto result = intersector.intersect(ray, accelerationStructure); switch (result.type) { case metal::raytracing::intersection_type::triangle: { // Write intersection result to device buffers. break; } default: break; } Observations Encoding both the intersection kernel and the subsequent result usage in the same command buffer does not resolve the problem. Switching from IAS to Primitive Acceleration Structure (PAS) fixes the problem. Rebuilding the IAS for each sample also resolves the issue. Intersections produce inconsistent results even though the IAS and rays are identical — Image 1 shows a hit, while Image 2 shows a miss. Questions Am I misusing IAS in some way ? Could this be a Metal bug ? Any guidance or confirmation would be greatly appreciated.
Replies: 1 · Boosts: 1 · Views: 359 · Activity: Dec ’25
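One common cause of symptoms like those in the post above is that the primitive acceleration structures referenced by an instance acceleration structure are never marked resident for the pass, so the GPU can legally lose track of them between dispatches. Whether that is the actual cause here is only a guess; a hedged sketch using the classic compute encoder API (Metal 4 residency sets work differently):

import Metal

// Hedged sketch: make both the instance AS and every primitive AS it references
// explicitly resident for the compute pass that intersects against it.
func encodeIntersection(encoder: MTLComputeCommandEncoder,
                        pipeline: MTLComputePipelineState,
                        instanceAS: MTLAccelerationStructure,
                        primitiveASes: [MTLAccelerationStructure]) {
    encoder.setComputePipelineState(pipeline)
    encoder.setAccelerationStructure(instanceAS, bufferIndex: 2)

    // The IAS only holds references to the primitive structures; calling useResource
    // (or useHeap, if they live in a heap) is what keeps them valid for this pass.
    for primitive in primitiveASes {
        encoder.useResource(primitive, usage: .read)
    }
    // ... dispatchThreads for the intersection kernel here ...
}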
App Freezes on iPadOS 26.x - GPU Metal Errors
I work on a Qt/QML app that uses the Esri Maps SDK for Qt and that is deployed to both Windows and iPads. With a recent iPadOS upgrade to 26.1, many iPad users are reporting the application freezing after panning and/or identifying features in the map. It runs fine for our Windows users. I was able to reproduce this and grabbed the following error messages when the freeze happens: IOGPUMetalError: Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault) IOGPUMetalError: Invalid Resource (00000009:kIOGPUCommandBufferCallbackErrorInvalidResource) Environment: Qt 6.5.4 (Qt for iOS), Esri Maps SDK for Qt 200.3, iPadOS 26.1. Because it appears to be a Metal error, I tried using OpenGL (Qt offers a way to easily set the target graphics API): QQuickWindow::setGraphicsApi(QSGRendererInterface::GraphicsApi::OpenGL) Which worked! No more freezing. But I'm seeing many posts that OpenGL has been deprecated by Apple. I've seen posts that Apple deprecated OpenGL ES. But it seems to still be available with iPadOS 26.1. If so, will this fix (above) just cause problems with a future iPadOS update? Any other suggestions to address this issue? Upgrading our version of Qt + Esri SDK to the latest version is not an option for us. We are in the process of upgrading the full application, but it is a year or two out. So we just need a fix to buy us some time for now. Appreciate any thoughts/insights.
Replies: 3 · Boosts: 0 · Views: 581 · Activity: Dec ’25
Metal 4: When is it ok to dealloc a MTLBuffer's memory
I have something like this drawing in an MTKView (see at bottom). I am finding it difficult to figure out when the Swift-land resources used in making the MTLBuffer(s) can be released. Below, for example, is it OK if args goes out of scope (or is otherwise deallocated) at point 1, 2, or 3? Or perhaps even earlier, as soon as argsBuffer has been created? I have been reading through various articles such as Setting resource storage modes, Choosing a resource storage mode for Apple GPUs, and Copying data to a private resource, but it's a lot to absorb and I haven't really been able to find an authoritative description of the required lifetime of the resources in CPU land. I should mention that this is Metal 4 code. In previous versions of Metal, the MTLCommandBuffer had the ability to add a completion handler to be called by the GPU after it has finished running the commands in the buffer, but in Metal 4 there is no such thing (if it were even needed for the purpose I am interested in). Any advice and/or pointers to the definitive literature will be appreciated. guard let argsBuffer = device.makeBuffer(bytes: &args,... argumentTable.setAddress(argsBuffer.gpuAddress, ... encoder.setArgumentTable(argumentTable, stages: .vertex) // encode drawing renderEncoder.draw... ... encoder.endEncoding() // 1 commandBuffer.endCommandBuffer() // 2 commandQueue.waitForDrawable(drawable) commandQueue.commit([commandBuffer]) // 3 commandQueue.signalDrawable(drawable) drawable.present()
Replies: 2 · Boosts: 0 · Views: 252 · Activity: Jan ’26
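On the narrow point of the CPU-side source data in the post above: makeBuffer(bytes:length:options:) copies the bytes into the new MTLBuffer before it returns, so the Swift value can go out of scope as soon as the buffer exists; it is the MTLBuffer itself that has to stay alive until the GPU finishes with it. A small sketch of that distinction (the Args struct is hypothetical):

import Metal
import simd

struct Args {                      // hypothetical argument struct
    var transform: simd_float4x4
}

func makeArgsBuffer(device: MTLDevice) -> MTLBuffer? {
    var args = Args(transform: matrix_identity_float4x4)
    // makeBuffer(bytes:) reads `args` synchronously, right here; after the call returns,
    // the GPU only ever sees the copy inside the MTLBuffer, never the Swift variable.
    return device.makeBuffer(bytes: &args,
                             length: MemoryLayout<Args>.stride,
                             options: .storageModeShared)
    // `args` is free to die at the end of this scope; the returned buffer must be kept
    // (for example, retained by the caller) until the command buffer that uses it completes.
}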
Race conditions when changing CAMetalLayer.drawableSize?
Is the pseudocode below thread-safe? Imagine that the Main thread sets the CAMetalLayer's drawableSize to a new size meanwhile the rendering thread is in the middle of rendering into an existing MTLDrawable which does still have the old size. Is the change of metalLayer.drawableSize thread-safe in the sense that I can present an old MTLDrawable which has a different resolution than the current value of metalLayer.drawableSize? I assume that setting the drawableSize property informs Metal that the next MTLDrawable offered by the CAMetalLayer should have the new size, right? Is it valid to assume that "metalLayer.drawableSize = newSize" and "metalLayer.nextDrawable()" are internally synchronized, so it cannot happen that metalLayer.nextDrawable() would produce e.g. a MTLDrawable with the old width but with the new height (or a completely invalid resolution due to potential race conditions)? func onWindowResized(newSize: CGSize) { // Called on the Main thread metalLayer.drawableSize = newSize } func onVsync(drawable: MTLDrawable) { // Called on a background rendering thread renderer.renderInto(drawable: drawable) }
Replies: 1 · Boosts: 1 · Views: 575 · Activity: Dec ’25
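Not an authoritative answer to the post above, but a common defensive pattern is to funnel drawableSize changes onto the same queue that drives rendering, so the question of CAMetalLayer's internal synchronization never has to be answered. A sketch assuming a dedicated serial render queue:

import QuartzCore
import Metal

// Hedged sketch: serialize resizes with rendering instead of relying on the layer's
// internal locking. `renderQueue` is the same serial queue the vsync callback runs on.
final class LayerRenderer {
    let metalLayer: CAMetalLayer
    let renderQueue = DispatchQueue(label: "render")

    init(layer: CAMetalLayer) { self.metalLayer = layer }

    func onWindowResized(newSize: CGSize) {           // called on the main thread
        renderQueue.async {
            self.metalLayer.drawableSize = newSize    // applied between frames, never mid-render
        }
    }

    func onVsync() {                                  // called on renderQueue
        guard let drawable = metalLayer.nextDrawable() else { return }
        // render into drawable.texture here, then commit and present
        _ = drawable
    }
}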
Xcode Metal Trace
The code is downloaded from the official Apple Metal 4 sample [https://developer.apple.com/documentation/metal/drawing-a-triangle-with-metal-4?language=objc]. Enable Metal GPU trace in the macOS scheme and trace a frame in Xcode. Xcode may show a segmentation fault in the app from some 'GTTrace' function when clicking the trace button. When replaying a .gputrace file, Xcode may crash, throw an internal error, or report an XPC error. The example code using the old metal-renderer can trace without any problem and everything works fine. Test environment: Xcode Version 26.2 (17C52), macOS 26.2 (25C56), M1 Pro 16GB A2442
Replies: 2 · Boosts: 0 · Views: 529 · Activity: Jan ’26
BGContinuedProcessingTask GPU access — no iPhone support?
We are developing a video processing app that applies CIFilter chains to video frames. To not force the user to keep the app foregrounded, we were happy to see the introduction of BGContinuedProcessingTask to continue processing when backgrounded. With iOS 26, I was excited to see the com.apple.developer.background-tasks.continued-processing.gpu entitlement, which should allow GPU access in the background. Even the article in the documentation provides "exporting video in a film-editing app" or "applying visual filters (HDR, etc) or compressing images for social media posts" as use cases. However, when I check BGTaskScheduler.shared.supportedResources.contains(.gpu) at runtime, it returns false on every iPhone I've tested (including iPhone 15 Pro and iPhone 16 Pro). From forum responses I've seen, it sounds like background GPU access is currently limited to iPad only. If that's the case, I have a few questions: Is this an intentional, permanent limitation — or is iPhone support planned for a future iOS release? What is the recommended approach for GPU-dependent background work on iPhone? My custom CIKernels are written in Metal (as Apple recommends since CIKL is deprecated), but Metal CIKernels cannot fall back to CPU rendering. This creates a situation where Apple's own deprecation guidance (migrate to Metal) conflicts with background processing realities (no GPU on iPhone). Should developers maintain deprecated CIKL kernel versions alongside Metal kernels purely as a CPU fallback for background execution? That feels like it defeats the purpose of the migration. It seems like a gap in the platform: the API exists, the entitlement exists, but the hardware support isn't there for the most common device category. Any clarity on Apple's direction here would be very helpful.
Replies: 2 · Boosts: 0 · Views: 314 · Activity: Feb ’26
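For anyone hitting the same check as the post above, the runtime probe and CPU fallback it describes look roughly like this; the supportedResources API is taken from the post itself, and renderOnGPU/renderOnCPU are hypothetical stand-ins for whatever rendering paths an app keeps around.

import BackgroundTasks

// Hedged sketch: branch on background-GPU availability before scheduling GPU work
// in a continued-processing task.
func scheduleExport() {
    if BGTaskScheduler.shared.supportedResources.contains(.gpu) {
        // Devices that allow background GPU use: Metal / Core Image rendering may continue.
        renderOnGPU()
    } else {
        // Devices without background GPU access: keep the work CPU-bound,
        // or finish it while the app is still foregrounded.
        renderOnCPU()
    }
}

func renderOnGPU() { /* hypothetical GPU path */ }
func renderOnCPU() { /* hypothetical CPU path */ }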
How to load and draw texture with opacity in Metal
The background I'm finally working to convert my very old Mac kaleidoscope application, ScopeWorks, which was written in OpenGL and Objective-C, to a Multiplatform app in SwiftUI and Metal. I'm using the MetalKit MTKView class, wrapped for SwiftUI as an NSViewRepresentable or UIViewRepresentable. I then provide an MTKViewDelegate that provides a draw method. The draw method fetches the current render pass descriptor, creates a command buffer, sets up a render pipeline, and does its drawing. My renderer's makePipeline method looks like this: func makePipeline() { let library = device.makeDefaultLibrary() let pipelineDesc = MTLRenderPipelineDescriptor() pipelineDesc.vertexFunction = library?.makeFunction(name: "vertex_main") pipelineDesc.fragmentFunction = library?.makeFunction(name: "fragment_main") pipelineDesc.colorAttachments[0].pixelFormat = .bgra8Unorm pipeline = try! device.makeRenderPipelineState(descriptor: pipelineDesc) } And my shaders look like this: struct VertexOut { float4 position [[position]]; float2 texCoord; }; vertex VertexOut vertex_main(const device float2* position [[buffer(0)]], uint vid [[vertex_id]]) { VertexOut out; float2 pos = position[vid]; out.position = float4(pos, 0, 1); out.texCoord = pos * 0.5 + 0.5; // basic mapping return out; } fragment float4 fragment_main(VertexOut in [[stage_in]], texture2d<float> tex [[texture(0)]], constant float4& color [[buffer(1)]]) { constexpr sampler s(address::repeat, filter::linear); // float4 texColor = tex.sample(s, in.texCoord); // return texColor * color; float4 textureColor = {1, 2, 3, 4}; if (all(color == textureColor)) { return tex.sample(s, in.texCoord); } else { return color; } // Sample the texture directly — no color tint applied return tex.sample(s, in.texCoord); } The first part of my MTKViewDelegate's draw method looks like this: func draw(in view: MTKView) { guard let drawable = view.currentDrawable, let descriptor = view.currentRenderPassDescriptor, let pipeline = pipeline, let texture = texture else { return } let commandBuffer = commandQueue.makeCommandBuffer()! let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: descriptor)! encoder.setRenderPipelineState(pipeline) encoder.setFragmentTexture(texture, index: 0) descriptor.colorAttachments[0].clearColor = MTLClearColor(red: 0.0, green: 0, blue: 0, alpha: 1.0) // Draw six equilateral triangles forming the hexagon let radius: Float = 0.6 for i in 0..<6 { let angle = Float(i) * (.pi / 3) let cosA = cos(angle) let sinA = sin(angle) let nextA = Float(i+1) * (.pi / 3) let cosB = cos(nextA) let sinB = sin(nextA) let verts: [simd_float2] = [ simd_float2(0, 0), simd_float2(radius * cosA, radius * sinA), simd_float2(radius * cosB, radius * sinB) ] encoder.setVertexBytes(verts, length: MemoryLayout<simd_float2>.stride * 3, index: 0) // Tell the fragment shader to use the texture color. var textureColor: simd_float4 = simd_float4(1, 2, 3, 4) encoder.setFragmentBytes(&textureColor, length: MemoryLayout<SIMD4<Float>>.stride, index: 1) encoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3) One of the things the existing app does is load PNG or TIFF images with an alpha channel, and then overlay parts of the image on top of themselves flipped, so you get interesting Moiré patterns in the lines in the resulting kaleidoscope. For now I'm working on a single sample image, loading it into a texture in Metal, and just rendering it as a hexagon and drawing lines for the triangles that make up the hexagon. 
(For now I'm using the vertex coordinates as the texture coordinates, so I get a hexagonal part of my texture rather than a single triangular part tessellated into a hexagon. I'll fix that later.) In both iOS and macOS I set the clear color to black at the beginning of the draw function. The issue: The source image is mostly transparent, but with a lot of partly transparent pixels. Here's what it looks like in Photoshop, where you can see the transparent parts as a checkerboard pattern: (I tried to crop the original image to show the approximate part that I'm rendering in a hexagon, but it's not exact. Look for the same shapes in the different images to compare them.) When I render my hexagon in the Metal view in the iOS version of the app, it looks like it's forcing each pixel to fully opaque or fully transparent: And in the macOS version of the app, it seems to force ALL the pixels to opaque: I haven't shown all the setup code, because it's a lot. Is there some rendering mode setup I'm missing in order to get it to draw the pixels into the output based on their opacity, including partial opacity?
Replies: 2 · Boosts: 0 · Views: 894 · Activity: Mar ’26
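The symptom in the post above (partial alpha collapsing to fully opaque or fully transparent) is usually just blending being left off on the pipeline's color attachment. A hedged sketch of the standard source-over setup; it reuses the vertex_main/fragment_main names from the post and assumes straight (non-premultiplied) alpha in the texture.

import Metal

// Hedged sketch: enable alpha blending on the color attachment so partially
// transparent texels composite over the clear color instead of overwriting it.
func makeBlendingPipeline(device: MTLDevice, library: MTLLibrary) throws -> MTLRenderPipelineState {
    let desc = MTLRenderPipelineDescriptor()
    desc.vertexFunction = library.makeFunction(name: "vertex_main")
    desc.fragmentFunction = library.makeFunction(name: "fragment_main")

    desc.colorAttachments[0].pixelFormat = .bgra8Unorm
    desc.colorAttachments[0].isBlendingEnabled = true
    desc.colorAttachments[0].rgbBlendOperation = .add
    desc.colorAttachments[0].alphaBlendOperation = .add
    // Straight-alpha source-over compositing.
    desc.colorAttachments[0].sourceRGBBlendFactor = .sourceAlpha
    desc.colorAttachments[0].sourceAlphaBlendFactor = .sourceAlpha
    desc.colorAttachments[0].destinationRGBBlendFactor = .oneMinusSourceAlpha
    desc.colorAttachments[0].destinationAlphaBlendFactor = .oneMinusSourceAlpha
    // If the texture data is premultiplied (typical for CGImage-backed loads),
    // use .one for the two source factors instead.

    return try device.makeRenderPipelineState(descriptor: desc)
}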
Best Way to Use MetalFX in Unreal Engine 5.7 for macOS Port?
Hi everyone, We’re currently porting a high-fidelity AA+ PC title built on Unreal Engine 5.7 to macOS (Apple Silicon), and we’re looking for guidance from anyone with experience in this area. At the moment, the game is already runnable on Mac, but not yet at a playable level — we’re seeing performance around 10–15 FPS on an M4 device. We’re actively analyzing and defining the work needed to reach production-quality performance on macOS. One of the key areas we’re exploring is leveraging MetalFX to improve frame rate. However, it seems there’s no official MetalFX plugin or direct integration available for Unreal Engine. Has anyone here successfully integrated MetalFX into a UE5 rendering pipeline, or found a recommended approach to do so? Any insights on best practices, workflows, or references (docs, samples, etc.) would be greatly appreciated. Thanks in advance!
Replies: 3 · Boosts: 0 · Views: 717 · Activity: 2w
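For context on the post above, any engine-side integration would ultimately wrap the native MetalFX upscalers. A hedged sketch of the raw spatial-scaler API (texture names are hypothetical, and the input/output textures have usage requirements covered in the MetalFX documentation); this is only an illustration of the framework, not Unreal Engine integration guidance.

import Metal
import MetalFX

// Hedged sketch: build a MetalFX spatial scaler and encode one upscale pass.
func upscale(device: MTLDevice,
             commandBuffer: MTLCommandBuffer,
             lowResColor: MTLTexture,       // hypothetical 1280x720 render target
             highResOutput: MTLTexture) {   // hypothetical 2560x1440 output texture
    let desc = MTLFXSpatialScalerDescriptor()
    desc.inputWidth = lowResColor.width
    desc.inputHeight = lowResColor.height
    desc.outputWidth = highResOutput.width
    desc.outputHeight = highResOutput.height
    desc.colorTextureFormat = lowResColor.pixelFormat
    desc.outputTextureFormat = highResOutput.pixelFormat
    desc.colorProcessingMode = .perceptual

    guard let scaler = desc.makeSpatialScaler(device: device) else { return }
    scaler.colorTexture = lowResColor
    scaler.outputTexture = highResOutput
    // Encode the upscale into the frame's command buffer after the main render pass.
    scaler.encode(commandBuffer: commandBuffer)
}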
D3DMetal Extreme Over Synchronization Issues
Explanation Currently, D3DMetal’s GPU synchronization approach introduces significant compute overhead on the CPU. This specifically affects D3D12 games that use modern rendering pipelines on Apple Silicon. Specifically, I’ve tested Death Stranding 2 On the Beach for how it handles its rendering. And the results are extreme: frame times are suffering from a 42% decrease from synchronization. Although there are obviously other effects at play, such as the overhead introduced by Rosetta and Wine, both of them don’t introduce as much overhead as D3DMetal. This issue isn’t just specific to Death Stranding 2 On the Beach; most games running through D3DMetal suffer from this. Most games still seem to force synchronization to ~30 ms to reach the 30 fps amount. But it could be better with better synchronization, such as how DXMT handles it. Instead of doubling the work, it allows Metal to single-handedly track resource dependencies internally. This is in part due to the unfortunate bad mapping of D3D12 calls onto shared logic between D3D11 and D3D12. System M2 Max Mac Studio — 32 GBs — 30-core GPU macOS 26.4 Tahoe CrossOver 26.1 RC Death Stranding 2 On the Beach — Steam Assassin’s Creed Valhalla — Steam & Ubisoft Connect Thank you for your commitment. Another game that I recommend testing to really see this swell is Assassin’s Creed Valhalla. Feedback FB22426600 - D3DMetal Extereme Over Syncranization Issues
Replies: 1 · Boosts: 1 · Views: 338 · Activity: 2w
Metal 4 support in iOS simulator
I'm updating our app to support Metal 4, but the Metal 4 types don't seem to get recognized when targeting the simulator. Is it known whether Metal 4 will be supported in the near future, or am I setting up the app wrong?
Replies: 5 · Boosts: 0 · Views: 746 · Activity: Oct ’25
Why is there no Metal on Apple Watch?
See subject. And how, in that case, are the beautiful system dials with smoke effects and other particles made?
Replies: 1 · Boosts: 0 · Views: 331 · Activity: Oct ’25
Why there's no rgb32Float in Metal?
I noticed that MTLPixelFormat has these cases: case r32Float = 55, case rg32Float = 105, case rgba32Float = 125. But there is no case rgb32Float. What's the reason for such discrimination?
Replies: 1 · Boosts: 0 · Views: 287 · Activity: Jan ’26