In my SceneKit game I'm able to connect two players with GKMatchmakerViewController. Now I want to support the scenario where one of them disconnects and wants to reconnect. I tried to do this with this code:
nonisolated public func match(_ match: GKMatch, player: GKPlayer, didChange state: GKPlayerConnectionState) {
    Task { @MainActor in
        switch state {
        case .connected:
            break
        case .disconnected, .unknown:
            // Re-invite the player who dropped out of the existing match.
            let matchRequest = GKMatchRequest()
            matchRequest.recipients = [player]
            do {
                try await GKMatchmaker.shared().addPlayers(to: match, matchRequest: matchRequest)
            } catch {
                // Log the failure instead of silently swallowing it.
                print("addPlayers failed: \(error)")
            }
        @unknown default:
            break
        }
    }
}

nonisolated public func player(_ player: GKPlayer, didAccept invite: GKInvite) {
    guard let viewController = GKMatchmakerViewController(invite: invite) else {
        return
    }
    viewController.matchmakerDelegate = self
    present(viewController)
}
But after presenting the view controller with GKMatchmakerViewController(invite:), nothing else happens. I would expect matchmakerViewController(_:didFind:) to be called, or how would I get an instance of GKMatch?
Here is the code I use to reproduce the issue, and below the reproduction steps.
Code
1. Run the attached project on an iPad and a Mac simultaneously.
2. On both devices, tap the ship to connect to Game Center.
3. Create an automatched match by tapping the rightmost icon on both devices.
4. When the two devices are matched, close the dialog on the iPad and tap the ship to disconnect from Game Center.
5. Wait until the Mac detects the disconnect and automatically sends an invitation to join again.
6. When the notification arrives on the iPad, tap it, then tap the ship to connect to Game Center again. The iPad receives the player(_:didAccept:) call, but nothing else, so there’s no way to get a GKMatch instance again.
Delve into the world of graphics and game development. Discuss creating stunning visuals, optimizing game mechanics, and share resources for game developers.
I have a Mac Studio 2023 M2 Max
Running Sonoma 14.6.1
Developing in Xcode 16.1
It seems that the NSScreen frame settings may be incorrect. The frame settings received from NSScreen.screens don't seem to match up with the Desktop arrangement settings in the Settings.
Apologies in advance for this long post!
for (index, screen) in NSScreen.screens.enumerated() {
    let i = index + 1  // 1-based counter, matching the "screen N" entries in the log
    let name = screen.localizedName
    Globals.logger.debug("Globals initializeScreens - screen \(i) '\(name, privacy: .public)'")
    Globals.logger.debug("Globals initializeScreens - '\(screen.debugDescription, privacy: .public)'")
}
This is what I receive in the log:
Globals initializeScreens - '<NSScreen: 0x600000ef4240;
name="PHL 346E2C";
backingScaleFactor=1.000000;
frame={{0, 0}, {3440, 1440}};
visibleFrame={{0, 0}, {3440, 1415}}>'
Globals initializeScreens - screen 2 'Blackmagic (1)'
Globals initializeScreens - '<NSScreen: 0x600000ef42a0;
name="Blackmagic (1)";
backingScaleFactor=1.000000;
frame={{-3840, 0}, {1920, 1080}};
visibleFrame={{-3840, 0}, {1920, 1055}}>'
Globals initializeScreens - screen 3 'Blackmagic (4)'
Globals initializeScreens - '<NSScreen: 0x600000ef4360;
name="Blackmagic (4)";
backingScaleFactor=1.000000;
frame={{-1920, 0}, {1920, 1080}};
visibleFrame={{-1920, 0}, {1920, 1055}}>'
Globals initializeScreens - screen 4 'Blackmagic (2)'
Globals initializeScreens - '<NSScreen: 0x600000ef43c0;
name="Blackmagic (2)";
backingScaleFactor=1.000000;
frame={{5360, 0}, {1920, 1080}};
visibleFrame={{5360, 0}, {1920, 1055}}>'
Globals initializeScreens - screen 5 'Blackmagic (3)'
Globals initializeScreens - '<NSScreen: 0x600000ef4420;
name="Blackmagic (3)";
backingScaleFactor=1.000000;
frame={{3440, 0}, {1920, 1080}};
visibleFrame={{3440, 0}, {1920, 1055}}>'
It looks like the frame settings for Blackmagic (2) and Blackmagic (4) are switched.
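One way to cross-check this, sketched under the assumption that the frames logged above are exactly what the app sees, is to sort the screens by frame.minX and compare the resulting left-to-right order with the arrangement in Settings. ScreenInfo and leftToRight are hypothetical names used only for this illustration; in the app the same sort could be applied to NSScreen.screens directly.

```swift
import Foundation

// Hypothetical stand-in for NSScreen so the sketch runs without AppKit.
struct ScreenInfo {
    let name: String
    let frame: CGRect
}

// Frames copied from the log above.
let loggedScreens = [
    ScreenInfo(name: "PHL 346E2C", frame: CGRect(x: 0, y: 0, width: 3440, height: 1440)),
    ScreenInfo(name: "Blackmagic (1)", frame: CGRect(x: -3840, y: 0, width: 1920, height: 1080)),
    ScreenInfo(name: "Blackmagic (4)", frame: CGRect(x: -1920, y: 0, width: 1920, height: 1080)),
    ScreenInfo(name: "Blackmagic (2)", frame: CGRect(x: 5360, y: 0, width: 1920, height: 1080)),
    ScreenInfo(name: "Blackmagic (3)", frame: CGRect(x: 3440, y: 0, width: 1920, height: 1080)),
]

// Returns the screen names ordered by the left edge of their frames.
func leftToRight(_ screens: [ScreenInfo]) -> [String] {
    screens.sorted { $0.frame.minX < $1.frame.minX }.map { $0.name }
}

let order = leftToRight(loggedScreens)
print(order)
```

With the logged frames, Blackmagic (4) sorts to the left of the main screen and Blackmagic (2) to the far right, which is consistent with the two appearing switched.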
The setup has five monitors. Four are connected through USB-C Digital AV Multiport Adapters. The outputs of these adapters are streamed into a rack of A/V equipment using Blackmagic Design mini converters and monitors.
My Swift application allows users to open four movies, one for each of the AV Adapters. The movies can then be played back in sync for later processing by the A/V equipment.
Here are some screen captures that show my display settings.
Blackmagic (1) and Blackmagic (2) are to the left of the main screen.
Blackmagic (3) and Blackmagic (4) are to the right of the main screen.
The desktop is hard to see but is correct.
The wallpaper settings are all correct.
The wallpaper is correctly ordered when displayed on the monitors.
After opening the movies and using the NSScreen frame settings, the displays are incorrectly ordered. Test B and Test D are switched, which is what I would expect given the NSScreen frame values.
Any ideas? I've tried re-arranging the desktops, rebooting, etc. but no luck.
The code that changes the screen location is similar to this post on Stack Overflow
public func setDisplay(screen: NSScreen) {
    Globals.logger.log("MovieWindowController - setDisplay = \(screen.localizedName, privacy: .public)")
    Globals.logger.debug("MovieWindowController - setDisplay - '\(screen.debugDescription, privacy: .public)'")
    let dx = CGFloat(Constants.midX)
    let dy = CGFloat(Constants.midY)
    // Center the window on the screen's visible frame.
    var pos = NSPoint()
    pos.x = screen.visibleFrame.midX - dx
    pos.y = screen.visibleFrame.midY - dy
    Globals.logger.debug("MovieWindowController - setDisplay - x = '\(pos.x, privacy: .public)', y = '\(pos.y, privacy: .public)'")
    window?.setFrameOrigin(pos)
}
The log shows just what I would expect given the incorrect frame values.
MovieWindowController - setDisplay = Blackmagic (1)
MovieWindowController - setDisplay - '<NSScreen: 0x6000018e8420; name="Blackmagic (1)"; backingScaleFactor=1.000000; frame={{-3840, 0}, {1920, 1080}}; visibleFrame={{-3840, 0}, {1920, 1055}}>'
MovieWindowController - setDisplay - x = '-3840.000000', y = '-12.500000'
MovieWindowController - setDisplay = Blackmagic (2)
MovieWindowController - setDisplay - '<NSScreen: 0x6000018a10e0; name="Blackmagic (2)"; backingScaleFactor=1.000000; frame={{5360, 0}, {1920, 1080}}; visibleFrame={{5360, 0}, {1920, 1055}}>'
MovieWindowController - setDisplay - x = '5360.000000', y = '-12.500000'
MovieWindowController - setDisplay = Blackmagic (3)
MovieWindowController - setDisplay - '<NSScreen: 0x6000018cc8a0; name="Blackmagic (3)"; backingScaleFactor=1.000000; frame={{3440, 0}, {1920, 1080}}; visibleFrame={{3440, 0}, {1920, 1055}}>'
MovieWindowController - setDisplay - x = '3440.000000', y = '-12.500000'
MovieWindowController - setDisplay = Blackmagic (4)
MovieWindowController - setDisplay - '<NSScreen: 0x6000018c9ce0; name="Blackmagic (4)"; backingScaleFactor=1.000000; frame={{-1920, 0}, {1920, 1080}}; visibleFrame={{-1920, 0}, {1920, 1055}}>'
MovieWindowController - setDisplay - x = '-1920.000000', y = '-12.500000'
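The logged origins can be reproduced from the centering arithmetic in setDisplay. The sketch below assumes Constants.midX = 960 and Constants.midY = 540; those values are not shown in the post and are inferred from the log (half of 1920 x 1080). windowOrigin is a hypothetical helper name.

```swift
import Foundation

// Assumed values: inferred from the logged origins, not taken from the post's code.
let midX: CGFloat = 960
let midY: CGFloat = 540

// Same arithmetic as setDisplay: center of the visible frame minus half the window size.
func windowOrigin(centeredIn visibleFrame: CGRect) -> CGPoint {
    CGPoint(x: visibleFrame.midX - midX, y: visibleFrame.midY - midY)
}

// visibleFrame of Blackmagic (1) as logged above.
let origin = windowOrigin(centeredIn: CGRect(x: -3840, y: 0, width: 1920, height: 1055))
print(origin.x, origin.y)
```

Under these assumptions the result matches the logged x = -3840 and y = -12.5, so the window placement follows directly from whatever frame NSScreen reports.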
Am I correct? I think this is driving me crazy!
Thanks in advance!
Edit: The mouse behavior is correct in moving across the displays!
Topic:
Graphics & Games
SubTopic:
General
Hi,
I wanted to test whether it is possible to use Mesa3D's OGLon12 with D3DMetal 2b3 to get GL > 4.1 support in Windows apps via D3D12 on D3DMetal.
I used the simple wglgears.c app (similar to glxgears) and ran it like this:
GALLIUM_DRIVER=d3d12 wine64 wglgears64 -info
with overridden opengl32.dll using contents from:
https://github.com/pal1000/mesa-dist-win/releases/download/24.3.0-rc1/mesa3d-24.3.0-rc1-release-msvc.7z
I get:
[D3DMetal:LOG:5E53] Unsupported API: CreateCommandQueue1
caused by:
https://gitlab.freedesktop.org/mesa/mesa/-/commit/c022c9603d500b59ff5e6f93c8a214d1785ab20a
API:
https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12device9-createcommandqueue1
Note that the setup itself is correct, because when using:
GALLIUM_DRIVER=llvmpipe wine64 wglgears64 -info
I get:
GL_RENDERER = llvmpipe (LLVM 19.1.3, 128 bits)
GL_VERSION = 4.5 (Compatibility Profile) Mesa 24.3.0-rc1 (git-85ba713d76)
GL_VENDOR = Mesa
GL_EXTENSIONS = GL_ARB_multisample GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color GL_EXT_blend_minmax GL_EXT_blend_subtract
r GL_EXT_texture.. etc..
I have two released apps -- ReefScan and ReefBuild -- that are based on the WWDC21 sample photogrammetry apps for iOS and macOS. They run fine without LiDAR and are used mostly for underwater models, where LiDAR does not work at all. It now appears that the updated photogrammetry session requires LiDAR data, and building my app with the current Xcode results in a non-working app. Has the "old" version of the photogrammetry session been broken by this update? It worked very well previously, so I would hate to see this regression to needing LiDAR. Most of my users do not have it.
Topic:
Graphics & Games
SubTopic:
RealityKit
I have a very basic usdz file from this repo
I call loadTextures() after loading the usdz via MDLAsset. Inspecting the MDLTexture object, I can tell it is assigned a linear RGB color space instead of sRGB, although the image file in the usdz is sRGB.
This causes the textures to ultimately render as oversaturated.
Later in the code I convert the MDLTexture to an MTLTexture via MTKTextureLoader, but if I set the SRGB option it seems to be ignored.
This significantly impacts the usefulness of Model I/O if it can't load a simple usdz texture correctly. Am I missing something?
Thanks!
After running build.py -p Core GameKit and adding the tarballs to the Unity project in Assets/ExternalPackages, no packages seem to be found when running the build on our continuous integration system.
This was not the case when the project was opened in the Editor.
It looks like in Apple.Core, ApplePlugInEnvironment hasn't run the OnEditorUpdate function, so the _appleUnityPackages Dictionary is empty.
A change to ApplePlugInEnvironment.cs seemed to fix the issue:
public static AppleNativeLibrary GetLibrary(string packageDisplayName, string appleBuildConfig, string applePlatform)
{
    // ?FIX?: If we're not in the editor, we might not have updated the package list.
    if (_appleUnityPackages.Count == 0 && _updateState == UpdateState.Initializing)
    {
        OnEditorUpdate(); // UpdateState.Initializing
        OnEditorUpdate(); // UpdateState.Updating
    }
I'm not sure whether this is something we're doing incorrectly; the documentation for the plug-in mostly covers building the package.
I'm trying to position an Entity with inverse kinematics while dragging the handle only, but use forward kinematics (pose jointTransforms) otherwise.
The System, Components, Gestures and Rig all seem to work individually.
My approach is to add the IKComponent when dragging starts on the handle and to remove the IKComponent when the handle is released.
The switch into IK works, but when the IKComponent is removed, the app crashes:
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
* frame #0: 0x00000001aa5bb188 CoreRE`(anonymous namespace)::IKComponentSolverWrapper::getSolver() + 60
frame #1: 0x00000001aa5bafb0 CoreRE`re::internal::ikParametersNodeCallback(re::Slice<re::StringID>, re::Slice<re::RigDataValue>, re::Slice<re::StringID>, re::MutableSlice<re::RigDataValue>, void*) + 48
frame #2: 0x00000001aa52d090 CoreRE`re::(anonymous namespace)::resolveEvaluationContextCallback(re::EvaluationContext&, void*) + 152
frame #3: 0x00000001aa68c024 CoreRE`re::(anonymous namespace)::$_76::__invoke(re::Slice<unsigned long>, re::(anonymous namespace)::RegisterTable&) + 1080
frame #4: 0x00000001aa678c94 CoreRE`re::EvaluationModelSingleThread::evaluate(re::EvaluationContextSlices&) + 1188
frame #5: 0x00000001aa866984 CoreRE`re::SkeletalPoseRuntimeData::executeEvaluationTree() + 136
frame #6: 0x00000001aadf37ec CoreRE`re::ecs2::SkeletalPoseComponent::calculateSkeletalPoseBufferWithRig(re::ecs2::MeshComponent*, re::ecs2::RigComponent*, re::ecs2::SkeletalPoseBufferComponent*) + 492
frame #7: 0x00000001aadf4a84 CoreRE`re::ecs2::SkeletalPoseComponentStateImpl::processPreparingComponents(re::ecs2::System::UpdateContext const&, re::ecs2::BasicComponentStateSceneData<re::ecs2::SkeletalPoseComponent>*, re::ecs2::ComponentBuckets<re::ecs2::SkeletalPoseComponent>::BucketIteration, void*) + 268
frame #8: 0x00000001aadf54b0 CoreRE`re::ecs2::SkeletalPoseSystem::update(re::ecs2::System::UpdateContext) const + 732
frame #9: 0x00000001aaed3e54 CoreRE`re::internal::Callable<re::ecs2::ECSManager::configurePhaseECSSystems(re::Scheduler::ScheduleDescriptor&, re::ecs2::ECSSystemGroup, unsigned long)::$_1, void (float)>::operator()(float&&) const + 168
frame #10: 0x00000001ab40eda4 CoreRE`re::Scheduler::executePhase(unsigned long) + 440
frame #11: 0x00000001aa6a3b74 CoreRE`re::Engine::executePhase(re::FramePhase) + 144
frame #12: 0x000000023173de9c RealitySystemSupport`RCPSharedSimulationExecuteUpdate + 64
frame #13: 0x00000002276c9820 MRUIKit`__65-[MRUISharedSimulation _doJoinWithConnectionConfiguration:error:]_block_invoke.35 + 168
frame #14: 0x00000002276c8530 MRUIKit`__addCAPreFenceHandler_block_invoke + 32
frame #15: 0x000000018af22058 QuartzCore`CA::Transaction::run_commit_handlers(CATransactionPhase) + 112
frame #16: 0x000000018aef2ad4 QuartzCore`CA::Context::commit_transaction(CA::Transaction*, double, double*) + 592
frame #17: 0x000000018af21898 QuartzCore`CA::Transaction::commit() + 652
frame #18: 0x000000018af22dac QuartzCore`CA::Transaction::flush_as_runloop_observer(bool) + 68
frame #19: 0x0000000185a26820 UIKitCore`_UIApplicationFlushCATransaction + 48
frame #20: 0x0000000184f97af0 UIKitCore`_UIUpdateSequenceRun + 76
frame #21: 0x0000000185954290 UIKitCore`schedulerStepScheduledMainSection + 168
frame #22: 0x00000001859536d8 UIKitCore`runloopSourceCallback + 80
frame #23: 0x00000001804157fc CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
frame #24: 0x0000000180415744 CoreFoundation`__CFRunLoopDoSource0 + 172
frame #25: 0x0000000180414eb0 CoreFoundation`__CFRunLoopDoSources0 + 232
frame #26: 0x000000018040f454 CoreFoundation`__CFRunLoopRun + 788
frame #27: 0x000000018040ecd4 CoreFoundation`CFRunLoopRunSpecific + 552
frame #28: 0x0000000190104b70 GraphicsServices`GSEventRunModal + 160
frame #29: 0x0000000185a27e30 UIKitCore`-[UIApplication _run] + 796
frame #30: 0x0000000185a2c058 UIKitCore`UIApplicationMain + 124
frame #31: 0x00000001d29558b4 SwiftUI`closure #1 (Swift.UnsafeMutablePointer<Swift.Optional<Swift.UnsafeMutablePointer<Swift.Int8>>>) -> Swift.Never in SwiftUI.KitRendererCommon(Swift.AnyObject.Type) -> Swift.Never + 164
frame #32: 0x00000001d29555dc SwiftUI`SwiftUI.runApp<τ_0_0 where τ_0_0: SwiftUI.App>(τ_0_0) -> Swift.Never + 84
frame #33: 0x00000001d265ecdc SwiftUI`static SwiftUI.App.main() -> () + 164
frame #34: 0x000000010303f1c4 Playground.debug.dylib`static PlaygroundApp.$main() at <compiler-generated>:0
frame #35: 0x000000010303f290 Playground.debug.dylib`main at PlaygroundApp.swift:7:8
frame #36: 0x0000000102f6d410 dyld_sim`start_sim + 20
frame #37: 0x000000010312e274 dyld`start + 2840
Is there a workaround or another way to switch between IK and FK?
Topic:
Graphics & Games
SubTopic:
RealityKit
I’m having issues getting Collision Shapes working in Reality Composer on iPadOS, or with Reality Composer Pro via Xcode on macOS.
I’ve posted a video recorded through my Vision Pro showing the issue.
The project I’m working on is a dice-rolling application. The dice don’t appear to work when set to Collision Shape = Automatic, which I assume takes into account the actual silhouette of the shape.
https://youtu.be/upPtQY4QOAk?si=yyx6rbSSmVkLxBLg
They also don’t rest on their face when they land.
Has anyone experienced this type of behavior and found a solution? I’m currently doing this with Reality Composer, but will most likely also want to get it working properly in Reality Composer Pro.
Thx!
Hi
Hopefully someone can share some ideas on how to accomplish this.
I know we can load models from realityKitContentBundle like this:
let model = try? await Entity(named: "testModel", in: realityKitContentBundle)
But this only works for models in the root of RealityKitContent.rkassets. If I have the models in some subfolder, then I have to add the complete path, like this:
let model = try? await Entity(named: "/superModels/testModel", in: realityKitContentBundle)
What I want is to be able to search recursively in all folders for that file, as I have several subfolders with different models.
Any suggestions?
Thanks in advance.
Guillermo
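A possible workaround, sketched under the assumption that the subfolder names are known in advance: build the list of candidate names (root first, then each subfolder) and try them one by one. candidateNames is a hypothetical helper and "otherModels" is a made-up folder name; only "testModel" and "/superModels/testModel" come from the post.

```swift
// Hypothetical helper: returns the names to try, root first, then each known subfolder.
func candidateNames(for model: String, inFolders folders: [String]) -> [String] {
    [model] + folders.map { "/\($0)/\(model)" }
}

let names = candidateNames(for: "testModel", inFolders: ["superModels", "otherModels"])
print(names)

// In the app, each name could be tried in turn until one loads, for example:
// for name in names {
//     if let entity = try? await Entity(named: name, in: realityKitContentBundle) {
//         return entity
//     }
// }
```

This is not a true recursive search; it only avoids repeating the full path at every call site.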
Up to now I have created multiple new SCNNodes using an instance of SCNGeometry and it was OK that they all had the same appearance. Now I want variety and when I make a copy of that instance using:
let newGeo = myGeoInstance.copy() as! SCNGeometry
(the force cast is needed because copy() returns Any)
all elements are verified present. :-)
Likewise:
node.geometry?.replaceMaterial(at: index, with: myNewMaterial)
is verified to correctly change the material(s) at the correct index(s). The only problem is the modified "teapot" is not visible, and yes I have set node.isHidden = false.
Has anyone experienced this?
In the old days reversing the verts was a solution. In desperation I tried that. |-(
Currently looking for Metal developers to port Quake 2 RTX to Metal RT in order to give Apple Silicon Macs an amazing path tracing demo. This project falls under NightSightProductions, who are also working on a Portal 2 with RTX remaster. If you are interested and want to help further Mac gaming, message me here or on Discord at king_vulpes.
Hi experts,
When I open a USDZ file that contains perspective cameras with the Files app on iOS 18.2/iPadOS 18.2, I can't see anything. When I open the same USDZ file on iOS 18.1/iPadOS 18.1, it works well.
On the other hand, when I open a USDZ file that contains orthographic cameras on iOS 18.1 or iOS 18.2, the scene is stuck.
Could you help to solve these issues please?
Thanks.
Hi! I just installed GPTK2 on my new Mac, but the Terminal gave “Error:OpenSSL1.1 has been disabled.”
How should I fix it? Or should I wait for GPTK2 beta 4?
Thanks.
https://developer.apple.com/documentation/arkit/arkit_in_ios/specifying_a_lighting_environment_in_ar_quick_look
How can I disable it? Or at least use a custom texture that's just black?
I don't see the purpose of having the real-time environment probe that captures IBL, but always add this fake studio IBL that you can't remove...
Topic:
Graphics & Games
SubTopic:
RealityKit
I’ve been trying to run Jurassic World Evolution 2 using the Game Porting Toolkit on macOS, but the game doesn’t launch and crashes immediately. Based on the error and research, it seems the issue is related to missing support for D3D12_TILED_RESOURCES_TIER_2 in the Metal API.
If this is the case, does anyone know if support for tiled resources is planned for future updates of the toolkit? Or are there any potential workarounds for bypassing this limitation?
Hi all,
I have been trying to get AssistiveTouch's Snap to Item to work for a Unity game built using Apple's Core & Accessibility API. Switch Control recognises these buttons; however, eye tracking will not snap to them. Snapping is needed when an external eye tracking device is connected and uses AssistiveTouch with Snap to Item.
All buttons in the game have an AccessibilityNode with the trait 'Button' and an appropriate label, which, following the documentation and comments on the developer forum, should allow them to be recognised by Snap to Item.
This is not the case: devices (iPads and iPhones) do not recognise the buttons as snap targets.
Does anyone know why this is the case, and if this is a bug?
I am making a framework in C++ using metal-cpp, basically a small game engine. I am also consequently using metal-cpp-extensions provided in LearnMetalCPP to make applications work.
For one of my classes, I needed to include AppKit.hpp in a public header file, so I moved it and its associated headers (NSApplication.hpp, NSMenu.hpp, etc.) from Project to Public in the Headers build phase. However, it then started giving me the error "cast of C pointer type 'void *' to Objective-C pointer type 'Class' requires a bridged cast" at several points in the AppKit headers. The errors don't appear when AppKit and its associated headers are in the Project headers, or when they are in the Private headers and no headers import them.
I imagined that disabling Objective-C ARC and the "Using __bridge casts outside of ARC" setting in Build Settings would solve it, but the errors didn't budge.
I assumed that actively changing the headers would not be the answer, but even when I tried putting __bridge before the problematic casts, __bridge was not recognized.
How do I solve this? And why is it only happening in Public and not Project headers?
I am trying to load some PNG data with MTKTextureLoader newTextureWithData, but the result looks wrong in the alpha area.
Here is the code. I have an image URL; after it downloads successfully, I try to use either the data or UIImagePNGRepresentation(image), and both look wrong.
UIImage *tempImg = [UIImage imageWithData:data];
CGImageRef cgRef = tempImg.CGImage;
MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:device];
id<MTLTexture> temp1 = [loader newTextureWithData:data options:@{MTKTextureLoaderOptionSRGB: @(NO), MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead), MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)} error:nil];
NSData *tempData = UIImagePNGRepresentation(tempImg);
id<MTLTexture> temp2 = [loader newTextureWithData:tempData options:@{MTKTextureLoaderOptionSRGB: @(NO), MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead), MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)} error:nil];
id<MTLTexture> temp3 = [loader newTextureWithCGImage:cgRef options:@{MTKTextureLoaderOptionSRGB: @(NO), MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead), MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)} error:nil];
}] resume];
I have an M1 Pro with a 16-core GPU. When I run a shader with 8193 threads, atomic_thread_fence is violated across the boundary between thread 8191 (the last thread in the 8th threadgroup) and thread 8192 (the first thread in the 9th threadgroup).
I've attached the Metal and Swift files, but I'll repost the relevant kernel here. It's a function that launches N threads to iterate through a binary tree from the leaves, where the first thread to reach a parent terminates and the second one populates it with the sum of the node's two children.
// clang-format off
void sum(device const int& size,
         device const int* __restrict__ in,
         device int* __restrict__ out,
         device atomic_int* visited,
         uint i [[thread_position_in_grid]]) {
  // clang-format on
  int val = in[i];
  uint cur = (size + i - 1);
  out[cur] = val;
  atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst);
  cur = (cur - 1) / 2;
  int proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed);
  while (proceed == 1) {
    uint left = 2 * cur + 1;
    uint right = 2 * cur + 2;
    uint val_left = out[left];
    uint val_right = out[right];
    uint val_cur = val_left + val_right;
    out[cur] = val_cur;
    if (cur == 0) {
      break;
    }
    cur = (cur - 1) / 2;
    atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst);
    proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed);
  }
}
What I'm observing is that thread 8192 hits the atomic_fetch_add first and terminates, while thread 8191 hits it second (observes that thread 8192 had incremented it by 1) and proceeds into the loop. Thread 8191 reads out[16383] (which it populated with 8191) and out[16384] (which thread 8192 populated with 8192 prior to the atomic_thread_fence). Instead of reading 8192 from out[16384] though, it reads 0.
Maybe I'm missing something but this seems like a pretty clear violation of the atomic_thread_fence which (I thought) was supposed to guarantee that the write from thread 8192 to out[16384] would be visible to any thread observing the effects of the following atomic_fetch_add. Is atomic_fetch_add not a store operation? Modifying it to an atomic_store or atomic_exchange still results in the bug. Adding another atomic_thread_fence between the atomic_fetch_add and reading of out also doesn't change anything.
I only begin to observe this on grid sizes of 8193 and upwards. That's 9 threadgroups per grid, which I assume could be related to my M1 Pro GPU having 16 cores.
Running the same example on an A17 Pro GPU doesn't show any of this behavior up through a tested grid size of 4194303 (2^22-1), at which point testing larger grid sizes starts to run into other issues so I can't test anything larger.
Removing the atomic_thread_fences on both the M1 and the A17 causes the test to fail at much smaller grid sizes, as expected.
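For comparison, the expected contents of out can be computed on the CPU. This is a minimal reference sketch, not the attached main.swift; it assumes in[i] = i (inferred from the values 8191 and 8192 mentioned above) and uses the same layout as the kernel, with leaves at out[size - 1 + i] and the children of node k at 2k + 1 and 2k + 2. referenceTreeSum is a hypothetical name.

```swift
// CPU reference for the tree sum, using the same array layout as the kernel.
func referenceTreeSum(_ input: [Int32]) -> [Int32] {
    let size = input.count
    precondition(size > 0, "need at least one leaf")
    var out = [Int32](repeating: 0, count: 2 * size - 1)
    // Leaves occupy the last `size` slots.
    for i in 0..<size {
        out[size - 1 + i] = input[i]
    }
    // Fill internal nodes from the last one up to the root.
    for k in stride(from: size - 2, through: 0, by: -1) {
        out[k] = out[2 * k + 1] + out[2 * k + 2]
    }
    return out
}

let size = 8193
let input = (0..<size).map { Int32($0) }  // assumption: in[i] = i
let expected = referenceTreeSum(input)
print(expected[0], expected[8191])
```

Under these assumptions, node 8191 (the parent of out[16383] and out[16384]) should hold 8191 + 8192 = 16383, so a GPU result where that node holds 8191 corresponds to the stale read of 0 described above.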
sum.metal
main.swift
I am working on a custom resolve tile shader for a client. I see a big difference in performance depending on where we write to:
1- the resolve texture of the color attachment
2- a rw tile shader texture set via [renderEncoder setTileTexture: myResolvedTexture]
Option 2 is more than twice as slow as option 1.
Our compute shader writes to 4 UAVs so just using the resolve texture entry is not possible.
Why such a difference as there is no more data being written? Can option 2 be as fast as option 1?
I can demonstrate the issue in a modified version of the Multisample code sample.