Reply to Generating vertex data in compute shader
I would still consider mesh shaders, even if regenerating the geometry every frame seems wasteful. You will likely end up with considerably simpler code, and unless your terrain generation is extremely complicated, I doubt you will see any performance degradation. Don't forget that the mesh shader also performs the function of the vertex shader, with the added benefit of access to neighbouring vertex data "for free". Tessellation, for example, also regenerates tons of geometry every frame, and yet it is a popular technique for improving performance, because storing and moving all that geometry in memory ends up being more expensive than regenerating it on the fly.
Topic: Graphics & Games SubTopic: General Tags:
Jul ’23
Reply to Pointers in MSL
I second the above. You are writing to a null pointer, which is undefined behavior. There is no standard way to allocate device memory from within a Metal shader. Use local storage (a value instead of a pointer) if you only need the context for the duration of the shader invocation, or implement your own bump allocator on top of a pre-allocated data buffer if the context needs to outlive the shader invocation.
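To make the bump allocator idea concrete, here is a minimal CPU-side sketch in C++, not actual shader code. The names `Context` and `BumpPool` and the field layout are made up for illustration. In MSL the pool would be a buffer you allocate on the host and bind to the kernel, and the cursor would be a `device atomic_uint` advanced with `atomic_fetch_add_explicit`; the logic is otherwise the same.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-invocation context; the fields are placeholders.
struct Context {
    float value;
    std::uint32_t owner;
};

// Bump allocator over a fixed pool that is allocated once, up front.
// This models the GPU case: the pool stands in for a pre-allocated
// device buffer and the cursor for an atomic counter in device memory.
class BumpPool {
public:
    explicit BumpPool(std::size_t capacity) : slots_(capacity), next_(0) {
        assert(capacity > 0);
    }

    // Claims the next free slot, or returns nullptr once the pool is
    // exhausted. The atomic fetch_add keeps concurrent claims distinct.
    Context* allocate() {
        std::uint32_t slot = next_.fetch_add(1, std::memory_order_relaxed);
        if (slot >= slots_.size()) {
            return nullptr;  // out of space: the caller must handle this
        }
        return &slots_[slot];
    }

    // Number of slots handed out so far, clamped to the capacity.
    std::size_t used() const {
        std::size_t claimed = next_.load(std::memory_order_relaxed);
        return std::min(claimed, slots_.size());
    }

    // Releases everything at once, e.g. between frames or dispatches.
    void reset() { next_.store(0, std::memory_order_relaxed); }

private:
    std::vector<Context> slots_;
    std::atomic<std::uint32_t> next_;
};
```

The design trade-off is the usual one for bump allocation: individual slots cannot be freed, so you size the buffer for the worst case and reset the counter from the host between dispatches.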
Topic: Graphics & Games SubTopic: General Tags:
Jul ’23
Reply to Tile shading pipeline without fragment shader?
I was not able to find how to do what I wanted exactly (building a per-tile list of primitives), but a reasonable alternative is to build a per-fragment list of primitives using raster order groups. Works very well for my purpose. The approach, in case anyone is interested, is described in detail here: https://developer.apple.com/documentation/metal/metal_sample_code_library/implementing_order-independent_transparency_with_image_blocks
Topic: Graphics & Games SubTopic: General Tags:
Oct ’23
Reply to How to use MetalPeformancePrimitives
Thanks for getting back to me! It appears that upgrading to Xcode 26.1 has fixed the issue, and the headers are now detected correctly. By the way, I noticed that there are many discrepancies between the documentation and the shipped APIs. I suppose you are aware of this and working on a fix? We also really need a Performance Primitives tuning guide. The API is very flexible, and finding settings that actually perform well can be challenging. For example, I have yet to find a tile size for which using a multi-simdgroup execution scope does not result in a performance regression. Also, what about bfloat?
Oct ’25