I am puzzled by the setAddress(_:attributeStride:index:) of MTL4ArgumentTable. Can anyone please explain what the attributeStride parameter is for? The doc says that it is "The stride between attributes in the buffer." but why?
Who uses this for what? On the C++ side in the shaders the stride is determined by the C++ type, as far as I know. What am I missing here?
Thanks!
Metal
RSS for tagRender advanced 3D graphics and perform data-parallel computations using graphics processors using Metal.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hey I'm using the CIDepthBlurEffect Core Image Filter in my app. It seems to work ok but I get these errors in the console when calling the class.
CoreImage Metal library does not contain function for name: sparserendering_xhlrb_scan
CoreImage Metal library does not contain function for name: sparserendering_xhlrb_diffuse
CoreImage Metal library does not contain function for name: sparserendering_xhlrb_copy_back
CoreImage Metal library does not contain function for name: plain_or_sRGB_copy
Am I missing some sort of import to gain these Metal functions? I am using my own custom shaders but I assume you'd be able to use them along side the built in ones.
Hi,
I would like clarification on whether the new hover effects feature introduced in vision os 26 supported pinch gestures through the psvr 2 controllers.
In your sample application, I was not able to confirm that this was working. Only pinch clicking with my hands worked. Pulling the trigger on the controller whilst looking at a 3d object did not activate the hover effect spatial event in the sample application. (The object is showing the highlight though)
This is inconsistent with hover effect behavior with psvr2 controllers on swift ui views, where the trigger press does count as a button click.
The sample I used was this one:
https://developer.apple.com/documentation/compositorservices/rendering_hover_effects_in_metal_immersive_apps
On MacBook Pro M3 14" I can profile the Metal App performance by running it, then clicking on the M icon and choosing profile after replay.
On Mac Studio M2 Ultra I cannot: the profiler starts and crashes. I have tried everything including reinstalling the OS, Xcode, the Metal SDK, you name it.
The app uses the Metal 4 API. The content of the replayer errorinfo report is shown at the end.
Any ideas what is going on here and/or what else I can do do root cause this and fix it?
FWIW, it was worse on 26.1 (Xcode just reported Metal 4 profiling not available). In 26.2 Xcode attempts to profile and invariably crashes.
=== Error summary: ===
1x DYErrorDomain (512) - guest app crashed (512)
1x com.apple.gputools.MTLReplayer (100) - Abort trap: 6
=== First Error ===
Domain: DYErrorDomain
Error code: 512
Description: guest app crashed (512)
GTErrorKeyPID: 26913
GTErrorKeyProcessName: GPUToolsReplayService
GTErrorKeyCrashDate: 2026-01-09 19:22:52 +0000
=== Underlying Error #1 ===
Domain: com.apple.gputools.MTLReplayer
Error code: 100
Description: Abort trap: 6
Call stack:
0 GPUToolsReplay 0x0000000249c25850 MakeNSError + 284
1 GPUToolsReplay 0x0000000249c26428 HandleCrashSignal + 252
2 libsystem_platform.dylib 0x00000001856c7744 _sigtramp + 56
3 libsystem_pthread.dylib 0x00000001856bd888 pthread_kill + 296
4 libsystem_c.dylib 0x00000001855c2850 abort + 124
5 libsystem_c.dylib 0x00000001855c1a84 err + 0
6 IOGPU 0x00000001a9ea60a8 -[IOGPUMetal4CommandQueue _commit:count:commitFeedback:].cold.1 + 0
7 IOGPU 0x00000001a9ea0df8 __77-[IOGPUMetal4CommandQueue commitFillArgs:count:args:argsSize:commitFeedback:]_block_invoke + 0
8 IOGPU 0x00000001a9ea1004 -[IOGPUMetal4CommandQueue _commit:count:commitFeedback:] + 148
9 AGXMetalG14X 0x00000001158a2c98 -[AGXG14XFamilyCommandQueue_mtlnext noMergeCommit:count:options:commitFeedback:error:] + 116
10 AGXMetalG14X 0x0000000115a45c14 +[AGXG14XFamilyRenderContext_mtlnext mergeRenderEncoders:count:options:commitFeedback:queue:error:] + 4740
11 AGXMetalG14X 0x00000001158a2b34 -[AGXG14XFamilyCommandQueue_mtlnext commit:count:options:] + 96
12 GPUToolsReplay 0x0000000249bf0644 GTMTLReplayController_defaultDispatchFunction_noPinning + 2744
13 GPUToolsReplay 0x0000000249befb10 GTMTLReplayController_defaultDispatchFunction + 1368
14 GPUToolsReplay 0x0000000249b7a61c _ZL16DispatchFunctionP21GTMTLReplayControllerPK11GTTraceFuncRb + 476
15 GPUToolsReplay 0x0000000249b8603c ___ZN35GTUSCSamplingStreamingManagerHelper19StreamFrameTimeDataEv_block_invoke + 456
16 Foundation 0x0000000186f6c878 __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 24
17 Foundation 0x0000000186f6c740 -[NSBlockOperation main] + 96
18 Foundation 0x0000000186f6c6d8 __NSOPERATION_IS_INVOKING_MAIN__ + 16
19 Foundation 0x0000000186f6c308 -[NSOperation start] + 640
20 Foundation 0x0000000186f6c080 __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 16
21 Foundation 0x0000000186f6bf70 __NSOQSchedule_f + 164
22 libdispatch.dylib 0x00000001855104d0 _dispatch_block_async_invoke2 + 148
23 libdispatch.dylib 0x000000018551aad4 _dispatch_client_callout + 16
24 libdispatch.dylib 0x00000001855056e4 _dispatch_continuation_pop + 596
25 libdispatch.dylib 0x0000000185504d58 _dispatch_async_redirect_invoke + 580
26 libdispatch.dylib 0x0000000185512fc8 _dispatch_root_queue_drain + 364
27 libdispatch.dylib 0x0000000185513784 _dispatch_worker_thread2 + 180
28 libsystem_pthread.dylib 0x00000001856b9e10 _pthread_wqthread + 232
29 libsystem_pthread.dylib 0x00000001856b8b9c start_wqthread + 8
Replayer breadcrumbs:
[
]
GTErrorKeyProcessSignal: SIGABRT
=== Setup ===
Capture device: star.localdomain (Mac14,14) - macOS 26.2 (25C56) - 0BA10D1D-D340-5F2E-934B-536675AF9BA1
Metal version: 370.64.2
Supported graphics APIs:
Metal device: Apple M2 Ultra
Supported GPU families: Apple1 Apple2 Apple3 Apple4 Apple5 Apple6 Apple7 Apple8 Mac1 Mac2 Common1 Common2 Common3 Metal3 Metal4
Replay device: star (Mac14,14) - macOS 26.2 (25C56) - 0BA10D1D-D340-5F2E-934B-536675AF9BA1
Metal version: 370.64.2
Supported graphics APIs:
Metal device: Apple M2 Ultra
Supported GPU families: Apple1 Apple2 Apple3 Apple4 Apple5 Apple6 Apple7 Apple8 Mac1 Mac2 Common1 Common2 Common3 Metal3 Metal4
Host: Mac14,14 - macOS 26.2 (25C56)
Tool: Xcode (17C52)
Known SDKs:
Topic:
Graphics & Games
SubTopic:
Metal
hello apple through this message i want to draw you attention to some problems with gptk and rosetta some games like marvel spiderman 2 have broken animations and t pose issues and other like uncharted and the last of us have severe memory leak issues so its my request please fix it asap
I am integrating MetalFX FrameInterpolator into a custom Unity RenderGraph–based render pipeline (C++ native plugin + C# render passes), and I am hitting the following assertion at runtime:
/MetalFXDebugError.h:29: failed assertion `Color texture width mismatch from descriptor'
What makes this confusing is that all input/output textures have the correct width and height, and they exactly match the values specified in the MTLFXFrameInterpolatorDescriptor.
Setup
Input resolution: 1024 x 512
Output resolution: 2048 x 1024
MTLFXTemporalScaler is created first and then passed into MTLFXFrameInterpolator
The TemporalScaler and FrameInterpolator descriptors use the same input/output sizes and formats
All Metal textures:
Have no parentTexture
Are 2D textures
Match the descriptor sizes exactly (verified via logging)
Texture bindings at encode time
frameInterpolator.colorTexture = mtlTexColor; // 1024 x 512
frameInterpolator.prevColorTexture = mtlTexPrevColor; // 1024 x 512
frameInterpolator.motionTexture = mtlTexMotion; // 1024 x 512
frameInterpolator.depthTexture = mtlTexDepth; // 1024 x 512
frameInterpolator.uiTexture = mtlTexUI; // 2048 x 1024
frameInterpolator.outputTexture = mtlTexOutput; // 2048 x 1024
All widths/heights are logged and match:
Color : 1024 x 512 (input)
PrevColor : 1024 x 512 (input)
Motion : 1024 x 512 (input)
Depth : 1024 x 512 (input)
UI : 2048 x 1024 (output)
Output : 2048 x 1024 (output)
The TemporalScaler works correctly on its own.
The assertion only occurs when using FrameInterpolator.
Important detail about colorTexture
Originally, colorTexture was copied from BuiltinRenderTextureType.CurrentActive.
After reading that this might violate MetalFX semantics, I changed the pipeline so that:
colorTexture now comes from a dedicated private RenderGraph texture
It is not the backbuffer
It is not a drawable
It is not used as a final output
It is created before UI rendering
Despite this, the assertion still occurs.
Question
Can uiTexture for MTLFXFrameInterpolator legally come from a texture copied from BuiltinRenderTextureType.CurrentActive?
More generally:
Are there additional hidden constraints on colorTexture / prevColorTexture (such as Metal usage, storageMode, aliasing, or hazard tracking) that could cause this assertion, even when sizes match?
Does FrameInterpolator require colorTexture and prevColorTexture to be created in a very specific way (e.g. non-aliased, ShaderRead usage, identical Metal resource properties)?
Any clarification on the exact semantic requirements for colorTexture, prevColorTexture, or uiTexture in MetalFX FrameInterpolator would be greatly appreciated.
I am trying to create a simple portal like that in RealityKit, but using metal instead of RealityKit. Has anyone been able to create a window or portal like thing to show a skybox outside in mixed Reality?
Topic:
Graphics & Games
SubTopic:
Metal
Code is download from apple official metal4 sample
[https://developer.apple.com/documentation/metal/drawing-a-triangle-with-metal-4?language=objc]
enable metal gpu trace in macOS schema and trace a frame in Xcode.
Xcode may show segment fault on App from some 'GTTrace' function when click trace button.
When replay a .gputrace file, Xcode may crash , throw an internal error or a XPC error.
The example code using old metal-renderer can trace without any problem and everything works fine.
Test Environment:
Xcode Version 26.2 (17C52)
macOS 26.2 (25C56)
M1 Pro 16GB A2442
Hi everyone,
I’ve been developing a custom, end-to-end 3D rendering engine called Crescent from scratch using C++20 and Metal-cpp (targeting macOS and visionOS). My primary goal is to build a zero-bottleneck, GPU-driven pipeline that maximizes the potential of Apple Silicon’s Unified Memory and TBDR architecture.
While the fundamental systems are stable, I am looking for architectural feedback from Metal framework engineers regarding specific synchronization and latency challenges.
Current Core Implementations:
GPU-Driven Instance Culling: High-performance occlusion culling using a Hierarchical Z-Buffer (HZB) approach via Compute Shaders.
Clustered Forward Shading: Support for high-count dynamic lights through view-space clustering.
Temporal Stability: Custom TAA with history rejection and Motion Blur resolve.
Asset Infrastructure: Robust GUID-based scene serialization and a JSON-driven ECS hierarchy.
The Architectural Challenge:
I am currently seeing slight synchronization overhead when generating the HZB mip-chain. On Apple Silicon, I am evaluating the cost of encoder transitions versus cache-friendly barriers.
&& m_hzbInitPipeline && m_hzbDownsamplePipeline && !m_hzbMipViews.empty();
if (canBuildHzb) {
MTL::ComputeCommandEncoder* hzbInit = commandBuffer->computeCommandEncoder();
hzbInit->setComputePipelineState(m_hzbInitPipeline);
hzbInit->setTexture(m_depthTexture, 0);
hzbInit->setTexture(m_hzbMipViews[0], 1);
if (m_pointClampSampler) {
hzbInit->setSamplerState(m_pointClampSampler, 0);
} else if (m_linearClampSampler) {
hzbInit->setSamplerState(m_linearClampSampler, 0);
}
const uint32_t hzbWidth = m_hzbMipViews[0]->width();
const uint32_t hzbHeight = m_hzbMipViews[0]->height();
const uint32_t threads = 8;
MTL::Size tgSize = MTL::Size(threads, threads, 1);
MTL::Size gridSize = MTL::Size((hzbWidth + threads - 1) / threads * threads,
(hzbHeight + threads - 1) / threads * threads,
1);
hzbInit->dispatchThreads(gridSize, tgSize);
hzbInit->endEncoding();
for (size_t mip = 1; mip < m_hzbMipViews.size(); ++mip) {
MTL::Texture* src = m_hzbMipViews[mip - 1];
MTL::Texture* dst = m_hzbMipViews[mip];
if (!src || !dst) {
continue;
}
MTL::ComputeCommandEncoder* downEncoder = commandBuffer->computeCommandEncoder();
downEncoder->setComputePipelineState(m_hzbDownsamplePipeline);
downEncoder->setTexture(src, 0);
downEncoder->setTexture(dst, 1);
const uint32_t mipWidth = dst->width();
const uint32_t mipHeight = dst->height();
MTL::Size downGrid = MTL::Size((mipWidth + threads - 1) / threads * threads,
(mipHeight + threads - 1) / threads * threads,
1);
downEncoder->dispatchThreads(downGrid, tgSize);
downEncoder->endEncoding();
}
if (m_instanceCullHzbPipeline) {
dispatchInstanceCulling(m_instanceCullHzbPipeline, true);
}
}
My Questions:
Encoder Synchronization: Would you recommend moving this loop into a single ComputeCommandEncoder using MTLBarrier between dispatches to maintain L2 cache residency, or is the overhead of separate encoders negligible for depth-downsampling on TBDR?
visionOS Bindless Latency: For stereo rendering on visionOS, what are the best practices for managing MTL4ArgumentTable updates at 90Hz+? I want to ensure that updating bindless resources for each eye doesn't introduce unnecessary CPU-to-GPU latency.
Memory Management: Are there specific hints for Memoryless textures that could be applied to intermediate HZB levels to save bandwidth during this process?
I’ve attached a screenshot of a scene rendered with the engine (PBR, SSR, and IBL).