Delve into the world of graphics and game development. Discuss creating stunning visuals and optimizing game mechanics, and share resources for game developers.

Posts under the Graphics & Games topic


How to get GKMatch instance after accepting GKInvite?
In my SceneKit game I'm able to connect two players with GKMatchmakerViewController. Now I want to support the scenario where one of them disconnects and wants to reconnect. I tried to do this with this code:

    nonisolated public func match(_ match: GKMatch, player: GKPlayer, didChange state: GKPlayerConnectionState) {
        Task { @MainActor in
            switch state {
            case .connected:
                break
            case .disconnected, .unknown:
                let matchRequest = GKMatchRequest()
                matchRequest.recipients = [player]
                do {
                    try await GKMatchmaker.shared().addPlayers(to: match, matchRequest: matchRequest)
                } catch {
                }
            @unknown default:
                break
            }
        }
    }

    nonisolated public func player(_ player: GKPlayer, didAccept invite: GKInvite) {
        guard let viewController = GKMatchmakerViewController(invite: invite) else {
            return
        }
        viewController.matchmakerDelegate = self
        present(viewController)
    }

But after presenting the view controller with GKMatchmakerViewController(invite:), nothing else happens. I would expect matchmakerViewController(_:didFind:) to be called; otherwise, how would I get an instance of GKMatch? Here is the code I use to reproduce the issue, followed by the reproduction steps:

1. Run the attached project on an iPad and a Mac simultaneously.
2. On both devices, tap the ship to connect to Game Center.
3. Create an automatched match by tapping the rightmost icon on both devices.
4. When the two devices are matched, on the iPad close the dialog and tap the ship to disconnect from Game Center.
5. Wait some time until the Mac detects the disconnect and automatically sends an invitation to join again.
6. When the notification arrives on the iPad, tap it, then tap the ship to connect to Game Center again.

The iPad receives the player(_:didAccept:) call, but nothing else, so there's no way to get a GKMatch instance again.
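One approach worth trying, sketched below under the assumption that the delegate type from the post also conforms to GKMatchDelegate: skip the view controller entirely and ask GKMatchmaker to turn the invite into a match directly with match(for:).

    nonisolated public func player(_ player: GKPlayer, didAccept invite: GKInvite) {
        Task { @MainActor in
            do {
                // Convert the accepted invite straight into a GKMatch,
                // with no matchmaker UI involved.
                let match = try await GKMatchmaker.shared().match(for: invite)
                match.delegate = self   // assumes self conforms to GKMatchDelegate
                // Resume the game with the reconnected match here.
            } catch {
                // The invite may have expired or matchmaking failed.
            }
        }
    }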
Replies: 6 · Boosts: 0 · Views: 752 · Created: Nov ’24
NSScreen frame location with multiple monitors
I have a Mac Studio 2023 M2 Max running Sonoma 14.6.1, developing in Xcode 16.1. It seems that the NSScreen frame settings may be incorrect: the frames returned by NSScreen.screens don't match the desktop arrangement in System Settings. Apologies in advance for this long post!

    for screen in NSScreen.screens {
        let name = screen.localizedName
        Globals.logger.debug("Globals initializeScreens - screen \(i) '\(name, privacy: .public)'")
        Globals.logger.debug("Globals initializeScreens - '\(screen.debugDescription, privacy: .public)'")
    }

This is what I receive in the log:

    Globals initializeScreens - '<NSScreen: 0x600000ef4240; name="PHL 346E2C"; backingScaleFactor=1.000000; frame={{0, 0}, {3440, 1440}}; visibleFrame={{0, 0}, {3440, 1415}}>'
    Globals initializeScreens - screen 2 'Blackmagic (1)'
    Globals initializeScreens - '<NSScreen: 0x600000ef42a0; name="Blackmagic (1)"; backingScaleFactor=1.000000; frame={{-3840, 0}, {1920, 1080}}; visibleFrame={{-3840, 0}, {1920, 1055}}>'
    Globals initializeScreens - screen 3 'Blackmagic (4)'
    Globals initializeScreens - '<NSScreen: 0x600000ef4360; name="Blackmagic (4)"; backingScaleFactor=1.000000; frame={{-1920, 0}, {1920, 1080}}; visibleFrame={{-1920, 0}, {1920, 1055}}>'
    Globals initializeScreens - screen 4 'Blackmagic (2)'
    Globals initializeScreens - '<NSScreen: 0x600000ef43c0; name="Blackmagic (2)"; backingScaleFactor=1.000000; frame={{5360, 0}, {1920, 1080}}; visibleFrame={{5360, 0}, {1920, 1055}}>'
    Globals initializeScreens - screen 5 'Blackmagic (3)'
    Globals initializeScreens - '<NSScreen: 0x600000ef4420; name="Blackmagic (3)"; backingScaleFactor=1.000000; frame={{3440, 0}, {1920, 1080}}; visibleFrame={{3440, 0}, {1920, 1055}}>'

It looks like the frame settings for Blackmagic (2) and Blackmagic (4) are switched.

The setup has five monitors. Four use USB-C Digital AV Multiport Adapters; their output is streamed into a rack of A/V equipment using Blackmagic Design mini converters and monitors. My Swift application allows users to open four movies, one for each of the AV adapters. The movies can then be played back in sync for later processing by the A/V equipment.

Here are some screen captures that show my display settings. Blackmagic (1) and Blackmagic (2) are to the left of the main screen; Blackmagic (3) and Blackmagic (4) are to the right of the main screen. The desktop is hard to see but is correct. The wallpaper settings are all correct, and the wallpaper is correctly ordered when displayed on the monitors. After opening the movies and using the NSScreen frame settings, the displays are incorrectly ordered: Test B and Test D are switched, which is what I would expect given the NSScreen frame values.

Any ideas? I've tried re-arranging the desktops, rebooting, etc., but no luck. The code that changes the screen location is similar to this post on Stack Overflow:

    public func setDisplay(screen: NSScreen) {
        Globals.logger.log("MovieWindowController - setDisplay = \(screen.localizedName, privacy: .public)")
        Globals.logger.debug("MovieWindowController - setDisplay - '\(screen.debugDescription, privacy: .public)'")
        let dx = CGFloat(Constants.midX)
        let dy = CGFloat(Constants.midY)
        var pos = NSPoint()
        pos.x = screen.visibleFrame.midX - dx
        pos.y = screen.visibleFrame.midY - dy
        Globals.logger.debug("MovieWindowController - setDisplay - x = '\(pos.x, privacy: .public)', y = '\(pos.y, privacy: .public)'")
        window?.setFrameOrigin(pos)
    }

The log shows just what I would expect given the incorrect frame values:

    MovieWindowController - setDisplay = Blackmagic (1)
    MovieWindowController - setDisplay - '<NSScreen: 0x6000018e8420; name="Blackmagic (1)"; backingScaleFactor=1.000000; frame={{-3840, 0}, {1920, 1080}}; visibleFrame={{-3840, 0}, {1920, 1055}}>'
    MovieWindowController - setDisplay - x = '-3840.000000', y = '-12.500000'
    MovieWindowController - setDisplay = Blackmagic (2)
    MovieWindowController - setDisplay - '<NSScreen: 0x6000018a10e0; name="Blackmagic (2)"; backingScaleFactor=1.000000; frame={{5360, 0}, {1920, 1080}}; visibleFrame={{5360, 0}, {1920, 1055}}>'
    MovieWindowController - setDisplay - x = '5360.000000', y = '-12.500000'
    MovieWindowController - setDisplay = Blackmagic (3)
    MovieWindowController - setDisplay - '<NSScreen: 0x6000018cc8a0; name="Blackmagic (3)"; backingScaleFactor=1.000000; frame={{3440, 0}, {1920, 1080}}; visibleFrame={{3440, 0}, {1920, 1055}}>'
    MovieWindowController - setDisplay - x = '3440.000000', y = '-12.500000'
    MovieWindowController - setDisplay = Blackmagic (4)
    MovieWindowController - setDisplay - '<NSScreen: 0x6000018c9ce0; name="Blackmagic (4)"; backingScaleFactor=1.000000; frame={{-1920, 0}, {1920, 1080}}; visibleFrame={{-1920, 0}, {1920, 1055}}>'
    MovieWindowController - setDisplay - x = '-1920.000000', y = '-12.500000'

Am I correct? This is driving me crazy! Thanks in advance!

Edit: The mouse behavior is correct in moving across the displays!
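A diagnostic worth running, sketched below (not from the original post): correlate each NSScreen with its CGDirectDisplayID and compare against CoreGraphics' view of the arrangement. If the two disagree, NSScreen may be returning stale data. Note CGDisplayBounds uses a top-left origin while NSScreen uses bottom-left, so only the x ordering is directly comparable.

    import AppKit

    func dumpScreenIDs() {
        for screen in NSScreen.screens {
            // deviceDescription carries the CGDirectDisplayID under "NSScreenNumber".
            let key = NSDeviceDescriptionKey("NSScreenNumber")
            guard let id = screen.deviceDescription[key] as? CGDirectDisplayID else { continue }
            // CGDisplayBounds reads the window server's arrangement directly.
            let cgBounds = CGDisplayBounds(id)
            print("\(screen.localizedName) id=\(id) NSScreen=\(screen.frame) CG=\(cgBounds)")
        }
    }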
Replies: 4 · Boosts: 0 · Views: 646 · Created: Nov ’24
D3DMetal unsupported CreateCommandQueue1 API while running simple wglgears using Mesa 24.3 GLon12 driver
Hi, I wanted to test whether it's possible to use Mesa3D OGLon12 + D3DMetal 2b3 to get GL > 4.1 support in Windows apps via D3D12Metal, using the simple wglgears.c app (similar to glxgears) and running:

    GALLIUM_DRIVER=d3d12 wine64 wglgears64 -info

with opengl32.dll overridden using the contents of https://github.com/pal1000/mesa-dist-win/releases/download/24.3.0-rc1/mesa3d-24.3.0-rc1-release-msvc.7z

I get:

    [D3DMetal:LOG:5E53] Unsupported API: CreateCommandQueue1

caused by: https://gitlab.freedesktop.org/mesa/mesa/-/commit/c022c9603d500b59ff5e6f93c8a214d1785ab20a
API: https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12device9-createcommandqueue1

Note the setup is otherwise correct, as using:

    GALLIUM_DRIVER=llvmpipe wine64 wglgears64 -info

I get:

    GL_RENDERER = llvmpipe (LLVM 19.1.3, 128 bits)
    GL_VERSION = 4.5 (Compatibility Profile) Mesa 24.3.0-rc1 (git-85ba713d76)
    GL_VENDOR = Mesa
    GL_EXTENSIONS = GL_ARB_multisample GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color GL_EXT_blend_minmax GL_EXT_blend_subtract GL_EXT_texture.. etc.
Replies: 1 · Boosts: 0 · Views: 575 · Created: Nov ’24
Updated Object Capture -- needs LiDAR?
I have two apps released -- ReefScan and ReefBuild -- that are based on the WWDC21 sample photogrammetry apps for iOS and macOS. Those run fine without LiDAR and are used mostly for underwater models, where LiDAR does not work at all. It now appears that the updated photogrammetry session requires LiDAR data, and building my app with the current Xcode results in a non-working app. Has the "old" version of the photogrammetry session been broken by this update? It worked very well previously, so I would hate to see this regression to needing LiDAR. Most of my users do not have it.
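For anyone hitting the same wall, a minimal sketch of the runtime capability check, assuming PhotogrammetrySession.isSupported is available on the deployment target; whether it now implies LiDAR on iOS is exactly the open question here:

    import RealityKit

    // Check support before constructing a session; on devices where the
    // updated photogrammetry path requires LiDAR, the expectation is that
    // this returns false rather than yielding a broken session.
    if PhotogrammetrySession.isSupported {
        let inputFolder = URL(fileURLWithPath: "/path/to/images", isDirectory: true) // placeholder path
        if let session = try? PhotogrammetrySession(input: inputFolder) {
            // Process requests with the session as in the WWDC21 sample.
            _ = session
        }
    } else {
        print("PhotogrammetrySession is not supported on this device")
    }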
Replies: 3 · Boosts: 0 · Views: 551 · Created: Nov ’24
MDLAsset loads texture in usdz file with wrong colorspace
I have a very basic usdz file from this repo. I call loadTextures() after loading the usdz via MDLAsset. Inspecting the MDLTexture object, I can tell it is assigned a colorspace of linear RGB instead of sRGB, although the image file in the usdz is sRGB. This causes the textures to ultimately render as oversaturated. In the code I later convert the MDLTexture to MTLTexture via MTKTextureLoader, but if I set the sRGB option it seems to be ignored. This significantly impacts the usefulness of Model I/O if it can't load a simple usdz texture correctly. Am I missing something? Thanks!
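A possible workaround sketch (not from the post, and assuming the pixel data is fine and only the format metadata is wrong): if MTKTextureLoader ignores the .SRGB option, reinterpret the loaded texture through an sRGB texture view, which changes how samples are decoded without copying data.

    import MetalKit
    import ModelIO

    func loadAsSRGB(_ mdlTexture: MDLTexture, device: MTLDevice) throws -> MTLTexture {
        let loader = MTKTextureLoader(device: device)
        let texture = try loader.newTexture(texture: mdlTexture,
                                            options: [.SRGB: true])
        // If the loader still produced a linear format, view it as sRGB.
        // (Use .bgra8Unorm_srgb instead if the source is bgra8Unorm.)
        if texture.pixelFormat == .rgba8Unorm,
           let view = texture.makeTextureView(pixelFormat: .rgba8Unorm_srgb) {
            return view
        }
        return texture
    }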
Replies: 3 · Boosts: 3 · Views: 850 · Created: Nov ’24
Editor OK, but batch builds get: Cannot locate a Release or Debug Apple.GameKit native library for macOS
After running build.py -p Core GameKit and adding the tarballs to the Unity project in Assets/ExternalPackages, no packages seem to be found when running the build using our continuous integration system. This was not the case when the project was opened in the Editor. It looks like in Apple.Core, the ApplePlugInEnvironment hasn't run the OnEditorUpdate function, so the _appleUnityPackages dictionary is empty. A change to ApplePlugInEnvironment.cs seemed to fix the issue:

    public static AppleNativeLibrary GetLibrary(string packageDisplayName, string appleBuildConfig, string applePlatform)
    {
        // ?FIX?: If we're not in the editor, we might not have updated the package list.
        if (_appleUnityPackages.Count == 0 && _updateState == UpdateState.Initializing)
        {
            OnEditorUpdate(); // UpdateState.Initializing
            OnEditorUpdate(); // UpdateState.Updating
        }

I'm not sure if this is something we're doing incorrectly; the documentation for the plug-in mostly covers building the package.
Replies: 2 · Boosts: 0 · Views: 702 · Created: Dec ’24
EXC_BAD_ACCESS when removing IKComponent from Entity
I'm trying to position an Entity with inverse kinematics while dragging the handle only, but use forward kinematics (posing jointTransforms directly) otherwise. The System, Components, Gestures and Rig all seem to work individually. My approach is to add the IKComponent when dragging starts on the handle and remove the IKComponent when it is released. The switch into IK works, but when removing the IKComponent the app crashes:

    * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    * frame #0: 0x00000001aa5bb188 CoreRE`(anonymous namespace)::IKComponentSolverWrapper::getSolver() + 60
      frame #1: 0x00000001aa5bafb0 CoreRE`re::internal::ikParametersNodeCallback(re::Slice<re::StringID>, re::Slice<re::RigDataValue>, re::Slice<re::StringID>, re::MutableSlice<re::RigDataValue>, void*) + 48
      frame #2: 0x00000001aa52d090 CoreRE`re::(anonymous namespace)::resolveEvaluationContextCallback(re::EvaluationContext&, void*) + 152
      frame #3: 0x00000001aa68c024 CoreRE`re::(anonymous namespace)::$_76::__invoke(re::Slice<unsigned long>, re::(anonymous namespace)::RegisterTable&) + 1080
      frame #4: 0x00000001aa678c94 CoreRE`re::EvaluationModelSingleThread::evaluate(re::EvaluationContextSlices&) + 1188
      frame #5: 0x00000001aa866984 CoreRE`re::SkeletalPoseRuntimeData::executeEvaluationTree() + 136
      frame #6: 0x00000001aadf37ec CoreRE`re::ecs2::SkeletalPoseComponent::calculateSkeletalPoseBufferWithRig(re::ecs2::MeshComponent*, re::ecs2::RigComponent*, re::ecs2::SkeletalPoseBufferComponent*) + 492
      frame #7: 0x00000001aadf4a84 CoreRE`re::ecs2::SkeletalPoseComponentStateImpl::processPreparingComponents(re::ecs2::System::UpdateContext const&, re::ecs2::BasicComponentStateSceneData<re::ecs2::SkeletalPoseComponent>*, re::ecs2::ComponentBuckets<re::ecs2::SkeletalPoseComponent>::BucketIteration, void*) + 268
      frame #8: 0x00000001aadf54b0 CoreRE`re::ecs2::SkeletalPoseSystem::update(re::ecs2::System::UpdateContext) const + 732
      frame #9: 0x00000001aaed3e54 CoreRE`re::internal::Callable<re::ecs2::ECSManager::configurePhaseECSSystems(re::Scheduler::ScheduleDescriptor&, re::ecs2::ECSSystemGroup, unsigned long)::$_1, void (float)>::operator()(float&&) const + 168
      frame #10: 0x00000001ab40eda4 CoreRE`re::Scheduler::executePhase(unsigned long) + 440
      frame #11: 0x00000001aa6a3b74 CoreRE`re::Engine::executePhase(re::FramePhase) + 144
      frame #12: 0x000000023173de9c RealitySystemSupport`RCPSharedSimulationExecuteUpdate + 64
      frame #13: 0x00000002276c9820 MRUIKit`__65-[MRUISharedSimulation _doJoinWithConnectionConfiguration:error:]_block_invoke.35 + 168
      frame #14: 0x00000002276c8530 MRUIKit`__addCAPreFenceHandler_block_invoke + 32
      frame #15: 0x000000018af22058 QuartzCore`CA::Transaction::run_commit_handlers(CATransactionPhase) + 112
      frame #16: 0x000000018aef2ad4 QuartzCore`CA::Context::commit_transaction(CA::Transaction*, double, double*) + 592
      frame #17: 0x000000018af21898 QuartzCore`CA::Transaction::commit() + 652
      frame #18: 0x000000018af22dac QuartzCore`CA::Transaction::flush_as_runloop_observer(bool) + 68
      frame #19: 0x0000000185a26820 UIKitCore`_UIApplicationFlushCATransaction + 48
      frame #20: 0x0000000184f97af0 UIKitCore`_UIUpdateSequenceRun + 76
      frame #21: 0x0000000185954290 UIKitCore`schedulerStepScheduledMainSection + 168
      frame #22: 0x00000001859536d8 UIKitCore`runloopSourceCallback + 80
      frame #23: 0x00000001804157fc CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
      frame #24: 0x0000000180415744 CoreFoundation`__CFRunLoopDoSource0 + 172
      frame #25: 0x0000000180414eb0 CoreFoundation`__CFRunLoopDoSources0 + 232
      frame #26: 0x000000018040f454 CoreFoundation`__CFRunLoopRun + 788
      frame #27: 0x000000018040ecd4 CoreFoundation`CFRunLoopRunSpecific + 552
      frame #28: 0x0000000190104b70 GraphicsServices`GSEventRunModal + 160
      frame #29: 0x0000000185a27e30 UIKitCore`-[UIApplication _run] + 796
      frame #30: 0x0000000185a2c058 UIKitCore`UIApplicationMain + 124
      frame #31: 0x00000001d29558b4 SwiftUI`closure #1 (Swift.UnsafeMutablePointer<Swift.Optional<Swift.UnsafeMutablePointer<Swift.Int8>>>) -> Swift.Never in SwiftUI.KitRendererCommon(Swift.AnyObject.Type) -> Swift.Never + 164
      frame #32: 0x00000001d29555dc SwiftUI`SwiftUI.runApp<τ_0_0 where τ_0_0: SwiftUI.App>(τ_0_0) -> Swift.Never + 84
      frame #33: 0x00000001d265ecdc SwiftUI`static SwiftUI.App.main() -> () + 164
      frame #34: 0x000000010303f1c4 Playground.debug.dylib`static PlaygroundApp.$main() at <compiler-generated>:0
      frame #35: 0x000000010303f290 Playground.debug.dylib`main at PlaygroundApp.swift:7:8
      frame #36: 0x0000000102f6d410 dyld_sim`start_sim + 20
      frame #37: 0x000000010312e274 dyld`start + 2840

Is there a workaround or another way to switch between IK and FK?
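One hedged workaround sketch, based on the assumption (suggested by the SkeletalPoseSystem frames above) that the rig is torn down while RealityKit is still evaluating it: defer the removal by one main-queue turn instead of removing the component inside the gesture callback.

    import RealityKit
    import Dispatch

    func endHandleDrag(on entity: Entity) {
        // Removing IKComponent synchronously inside the gesture handler
        // appears to race the skeletal-pose update; deferring by one
        // runloop turn gives the current evaluation a chance to finish.
        DispatchQueue.main.async {
            entity.components.remove(IKComponent.self)
        }
    }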
Replies: 1 · Boosts: 0 · Views: 452 · Created: Dec ’24
Collision Shapes not working in Reality Composer
I'm having issues getting Collision Shapes working in Reality Composer on iPadOS, and with Reality Composer Pro via Xcode on macOS. I've posted a video recorded through my Vision Pro showing the issue. The project I'm working on is a dice-rolling application. The dice don't appear to work when set to Collision Shape = Automatic, which I assume takes into account the actual silhouette of the shape. https://youtu.be/upPtQY4QOAk?si=yyx6rbSSmVkLxBLg They also don't rest on their faces when they land. Has anyone experienced this type of behavior and found a solution? I'm currently doing this with Reality Composer, but will most likely also want to get it working properly in Reality Composer Pro. Thx!
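As a code-side fallback while the Reality Composer setting isn't cooperating, a sketch assuming a die entity with a ModelComponent (the mass is a placeholder value): generate a convex collision shape from the actual mesh so the die can come to rest on a face.

    import RealityKit

    func addDiceCollision(to die: Entity) async throws {
        guard let model = die.components[ModelComponent.self] else { return }
        // Convex hull of the real die mesh, rather than a bounding box.
        let shape = try await ShapeResource.generateConvex(from: model.mesh)
        die.components.set(CollisionComponent(shapes: [shape]))
        die.components.set(PhysicsBodyComponent(shapes: [shape],
                                                mass: 0.05, // placeholder mass in kg
                                                mode: .dynamic))
    }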
Replies: 6 · Boosts: 1 · Views: 771 · Created: Dec ’24
Recursively searching the realityKitContentBundle
Hi, hopefully someone can share some ideas on how to accomplish this. I know we can load models from realityKitContentBundle like:

    let model = try? await Entity(named: "testModel", in: realityKitContentBundle)

But this looks in the root of RealityKitContent.rkassets; if I have the models in some subfolder, then I have to give the complete path, like:

    let model = try? await Entity(named: "/superModels/testModel", in: realityKitContentBundle)

What I want is to be able to search recursively in all folders for that file, as I have several subfolders with different models. Any suggestions? Thanks in advance. Guillermo
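As far as I know the compiled .rkassets archive can't be enumerated at runtime, so one workaround sketch is to supply the candidate subfolder names yourself and try them in order (the folder names below are placeholders):

    import RealityKit
    import RealityKitContent

    func loadEntity(named name: String,
                    inFolders folders: [String] = ["", "superModels", "otherModels"]) async -> Entity? {
        for folder in folders {
            let path = folder.isEmpty ? name : "\(folder)/\(name)"
            // Entity(named:in:) throws for missing paths; try? turns that
            // into a "keep looking" signal.
            if let entity = try? await Entity(named: path, in: realityKitContentBundle) {
                return entity
            }
        }
        return nil
    }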
Replies: 3 · Boosts: 0 · Views: 698 · Created: Dec ’24
SCNGeometry and .copy()
Up to now I have created multiple new SCNNodes using an instance of SCNGeometry, and it was OK that they all had the same appearance. Now I want variety, so when I make a copy of that instance using:

    let newGeo = myGeoInstance.copy() as! SCNGeometry

(must be force-cast because copy() returns Any) all elements are verified present. :-) Likewise:

    node.geometry?.replaceMaterial(at: index, with: myNewMaterial)

is verified to correctly change the material(s) at the correct index(es). The only problem is the modified "teapot" is not visible, and yes, I have set node.isHidden = false. Has anyone experienced this? In the old days reversing the verts was a solution. In desperation I tried that. |-(
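One thing worth checking, sketched below: copy() on an SCNGeometry is a shallow copy, and as far as I know the materials array is shared by reference with the original, so give the copy its own materials before editing and reassign the node's geometry so SceneKit rebuilds its render state.

    import SceneKit

    let newGeo = myGeoInstance.copy() as! SCNGeometry
    // Deep-copy the materials so edits don't alias the original geometry.
    newGeo.materials = myGeoInstance.materials.map { $0.copy() as! SCNMaterial }
    newGeo.replaceMaterial(at: index, with: myNewMaterial)
    node.geometry = newGeo   // reassign so the change is picked up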
Replies: 6 · Boosts: 0 · Views: 826 · Created: Dec ’24
USDZ files with cameras can't be opened correctly on iOS 18.2/iPadOS 18.2
Hi experts, when I open a USDZ file which contains perspective cameras with the "Files" app in iOS 18.2/iPadOS 18.2, I can't see anything. When I open the same USDZ file in iOS 18.1/iPadOS 18.1, it works well. On the other hand, when I open a USDZ file which contains orthographic cameras in iOS 18.1 or iOS 18.2, the scene is stuck. Could you help solve these issues please? Thanks.
Replies: 4 · Boosts: 2 · Views: 654 · Created: Dec ’24
Disable IBL in iOS Quick Look
https://developer.apple.com/documentation/arkit/arkit_in_ios/specifying_a_lighting_environment_in_ar_quick_look How can I disable it, or at least use a custom texture that's just black? I don't see the purpose of having a real-time environment probe that captures IBL when it always adds this fake studio IBL that you can't remove...
Replies: 0 · Boosts: 0 · Views: 352 · Created: Dec ’24
Jurassic World Evolution 2 Likely Fails Due to Missing Tiled Resources Support
I’ve been trying to run Jurassic World Evolution 2 using the Game Porting Toolkit on macOS, but the game doesn’t launch and crashes immediately. Based on the error and research, it seems the issue is related to missing support for D3D12_TILED_RESOURCES_TIER_2 in the Metal API. If this is the case, does anyone know if support for tiled resources is planned for future updates of the toolkit? Or are there any potential workarounds for bypassing this limitation?
Replies: 1 · Boosts: 0 · Views: 730 · Created: Dec ’24
Snap to Item with AssistiveTouch does not work in a game built from Unity
Hi all, I have been trying to get AssistiveTouch's Snap to Item feature to work for a Unity game built using Apple's Core & Accessibility API. Switch Control recognises these buttons; however, eye tracking will not snap to them. The case in which it needs to snap is when an external eye-tracking device is connected and utilises AssistiveTouch and its Snap to Item setting. All buttons in the game have an AccessibilityNode with the trait 'Button' and an appropriate label, which, following the documentation and comments on the developer forum, should allow them to be recognised by Snap to Item. This is not the case: devices (iPads and iPhones) do not recognise the buttons as snap targets. Does anyone know why this is, and whether this is a bug?
Replies: 0 · Boosts: 0 · Views: 585 · Created: Dec ’24
Metal-cpp-extensions isn't working inside frameworks
I am making a framework in C++ using metal-cpp, basically a small game engine. I am consequently also using the metal-cpp-extensions provided in LearnMetalCPP to make applications work. For one of my classes, I needed to add AppKit.hpp inside a public header file, so I moved it and its associated headers (NSApplication.hpp, NSMenu.hpp, etc.) from Project to Public in the Build Phases' Headers section. However, it started giving me the error "cast of C pointer type 'void *' to Objective-C pointer type 'Class' requires a bridged cast" at several points in the AppKit headers. The errors don't appear when AppKit and its associates are in the Project headers, or when they are in the Private headers and no headers import them. I imagined that disabling Objective-C ARC and enabling "Using __bridge casts outside of ARC" in Build Settings would solve it, but it didn't budge. I hoped the answer wouldn't involve actively changing the headers, but even when I try to put __bridge before the problematic casts, it doesn't recognize __bridge. How do I solve this? And why does it only happen with Public and not Project headers?
Replies: 2 · Boosts: 0 · Views: 805 · Created: Dec ’24
How to use MTKTextureLoader to load PNG data
I am trying to load some PNG data with MTKTextureLoader's newTextureWithData:, but the result renders wrong in the alpha areas. Here is the code. I have an image URL; after it downloads successfully, I try to use the data, or UIImagePNGRepresentation(image), and they all render wrong:

    UIImage *tempImg = [UIImage imageWithData:data];
    CGImageRef cgRef = tempImg.CGImage;
    MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:device];
    id<MTLTexture> temp1 = [loader newTextureWithData:data
                                              options:@{MTKTextureLoaderOptionSRGB: @(NO),
                                                        MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead),
                                                        MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)}
                                                error:nil];
    NSData *tempData = UIImagePNGRepresentation(tempImg);
    id<MTLTexture> temp2 = [loader newTextureWithData:tempData
                                              options:@{MTKTextureLoaderOptionSRGB: @(NO),
                                                        MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead),
                                                        MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)}
                                                error:nil];
    id<MTLTexture> temp3 = [loader newTextureWithCGImage:cgRef
                                                 options:@{MTKTextureLoaderOptionSRGB: @(NO),
                                                           MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead),
                                                           MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)}
                                                   error:nil];
    }] resume];
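A common cause of wrong alpha edges with MTKTextureLoader is premultiplication: PNG stores straight alpha, while typical Metal blending expects premultiplied color. Whether that is the issue here is an assumption, but a sketch of the usual fix (in Swift for brevity) is to redraw the image through a premultiplied CGContext before creating the texture:

    import MetalKit
    import UIKit

    func premultipliedTexture(from data: Data, device: MTLDevice) -> MTLTexture? {
        guard let src = UIImage(data: data)?.cgImage else { return nil }
        let (w, h) = (src.width, src.height)
        guard let ctx = CGContext(data: nil, width: w, height: h,
                                  bitsPerComponent: 8, bytesPerRow: w * 4,
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
        else { return nil }
        // Drawing premultiplies the straight-alpha PNG pixels.
        ctx.draw(src, in: CGRect(x: 0, y: 0, width: w, height: h))
        guard let premultiplied = ctx.makeImage() else { return nil }
        let loader = MTKTextureLoader(device: device)
        return try? loader.newTexture(cgImage: premultiplied, options: [.SRGB: false])
    }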
Replies: 5 · Boosts: 0 · Views: 624 · Created: Dec ’24
M1 GPU violates atomic_thread_fence across threadgroups
I have an M1 Pro with a 16-core GPU. When I run a shader with 8193 threads, atomic_thread_fence is violated across the boundary between thread 8191 (the last thread in the 7th threadgroup) and 8192 (the first thread in the 9th threadgroup). I've attached the Metal and Swift files, but I'll repost the relevant kernel here. It's a function that launches N threads to iterate through a binary tree from the leaves, where the first thread to reach a parent terminates and the second one populates it with the sum of the node's two children.

    // clang-format off
    void sum(device const int& size,
             device const int* __restrict__ in,
             device int* __restrict__ out,
             device atomic_int* visited,
             uint i [[thread_position_in_grid]]) {
        // clang-format on
        int val = in[i];
        uint cur = (size + i - 1);
        out[cur] = val;
        atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst);
        cur = (cur - 1) / 2;
        int proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed);
        while (proceed == 1) {
            uint left = 2 * cur + 1;
            uint right = 2 * cur + 2;
            uint val_left = out[left];
            uint val_right = out[right];
            uint val_cur = val_left + val_right;
            out[cur] = val_cur;
            if (cur == 0) {
                break;
            }
            cur = (cur - 1) / 2;
            atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst);
            proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed);
        }
    }

What I'm observing is that thread 8192 hits the atomic_fetch_add first and terminates, while thread 8191 hits it second (observes that thread 8192 had incremented it by 1) and proceeds into the loop. Thread 8191 reads out[16383] (which it populated with 8191) and out[16384] (which thread 8192 populated with 8192 prior to the atomic_thread_fence). Instead of reading 8192 from out[16384], though, it reads 0. Maybe I'm missing something, but this seems like a pretty clear violation of the atomic_thread_fence, which (I thought) was supposed to guarantee that the write from thread 8192 to out[16384] would be visible to any thread observing the effects of the following atomic_fetch_add. Is atomic_fetch_add not a store operation? Modifying it to an atomic_store or atomic_exchange still results in the bug. Adding another atomic_thread_fence between the atomic_fetch_add and the reading of out also doesn't change anything. I only begin to observe this at grid sizes of 8193 and upwards; that's 9 threadgroups per grid, which I assume could be related to my M1 Pro GPU having 16 cores. Running the same example on an A17 Pro GPU doesn't show any of this behavior up through a tested grid size of 4194303 (2^22-1), at which point testing larger grid sizes starts to run into other issues, so I can't test anything larger. Removing the atomic_thread_fences on both the M1 and A17 causes the test to fail at much smaller grid sizes, as expected. sum.metal main.swift
Replies: 2 · Boosts: 0 · Views: 531 · Created: Dec ’24
Tile Shaders performance when writing to tile texture vs. resolve texture
I am working on a custom resolve tile shader for a client. I see a big difference in performance depending on where we write to:

1. the resolve texture of the color attachment
2. a read-write tile shader texture set via [renderEncoder setTileTexture: myResolvedTexture]

Option 2 is more than twice as slow as option 1. Our compute shader writes to 4 UAVs, so just using the resolve-texture entry is not possible. Why such a difference when there is no more data being written? Can option 2 be as fast as option 1? I can demonstrate the issue in a modified version of the Multisample code sample.
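For readers following along, a sketch of the two encoding paths being compared; the texture names are placeholders, the tile pipeline state and draw calls are assumed elsewhere, and the real setup lives in the modified Multisample sample:

    import Metal

    func encodeResolvePass(commandBuffer: MTLCommandBuffer,
                           msaaTexture: MTLTexture,
                           resolvedTexture: MTLTexture) {
        // Option 1: let the pass resolve into the attachment's resolveTexture.
        let pass = MTLRenderPassDescriptor()
        pass.colorAttachments[0].texture = msaaTexture
        pass.colorAttachments[0].resolveTexture = resolvedTexture
        pass.colorAttachments[0].loadAction = .clear
        pass.colorAttachments[0].storeAction = .multisampleResolve

        guard let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: pass) else { return }
        // Option 2: bind the destination as a tile-stage texture and write
        // to it from the custom resolve tile shader (the slower path
        // reported in the post).
        encoder.setTileTexture(resolvedTexture, index: 0)
        // ... tile dispatch / draw calls elided ...
        encoder.endEncoding()
    }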
Replies: 5 · Boosts: 0 · Views: 568 · Created: Dec ’24