Delve into the world of graphics and game development. Discuss creating stunning visuals, optimizing game mechanics, and share resources for game developers.

All subtopics
Posts under Graphics & Games topic

Post

Replies

Boosts

Views

Activity

Query GPU metrics
Hello! I'm a developer working on a plugin for the Elgato Stream Deck, called GPU Metrics. The plugin currently only works on Windows but I'd like to bring it to macOS. However, based on forum posts I've read (and StackOverflow) there isn't a very clear path to query GPU metrics like usage, temperature, used GPU memory, and power consumption. There are some tools out there that do similar things, but I wanted to see what would be the recommendation from Apple's engineering team to get this data via a public API. Requirements: Access GPU utilization, temperature, memory usage, power usage C/C++ based API for querying the metrics so I can expose the data to JavaScript via Node Addon No need to compatibile with Intel-based Macs, as Apple silicon will be fine for now Plugin GitHub Thank you! Noah
0
0
135
May ’25
Game Center leaderboards not posting scores
My app is live but the leaderboards still aren’t updating. App was built with unreal engine 5 with blueprints. I have the leaderboard stat info entered into the node for write integer to leaderboard and a node for show platform specific leaderboard. The leaderboards are shown as live on app connect. When I run the app, the Game Center login functions and the leaderboard interface launches as expected but it just lists a group of friends to invite. There are no scores listed and it says number of players 0 even though I have scored on two different devices and accounts. I have the Game Center entitlement added in Xcode. Not sure where else to look.
0
0
687
Nov ’25
Core Image recipe for QR code icon image
Create the QRCode CIFilter<CIBlendWithMask> *f = CIFilter.QRCodeGenerator; f.message = [@"Message" dataUsingEncoding:NSASCIIStringEncoding]; f.correctionLevel = @"Q"; // increase level CIImage *qrcode = f.outputImage; Overlay the icon CIImage *icon = [CIImage imageWithURL:url]; CGAffineTransform *t = CGAffineTransformMakeTranslation( (qrcode.extent.width-icon.extent.width)/2.0, (qrcode.extent.height-icon.extent.height)/2.0); icon = [icon imageByApplyingTransform:t]; qrcode = [icon imageByCompositingOver:qrcode]; Round off the corners static dispatch_once_t onceToken; static CIWarpKernel *k; dispatch_once(&onceToken, ^ { k = [CIWarpKernel kernelWithFunctionName:name fromMetalLibraryData:metalLibData() error:nil]; }); CGRect iExtent = image.extent; qrcode = [k applyWithExtent:qrcode.extent roiCallback:^CGRect(int i, CGRect r) { return CGRectInset(r, -radius, -radius); } inputImage:qrcode arguments:@[[CIVector vectorWithCGRect:qrcode.extent], @(radius)]]; …and this code for the kernel should go in a separate .ci.metal source file: float2 bend_corners (float4 extent, float s, destination dest) { float2 p, dc = dest.coord(); float ratio = 1.0; // Round lower left corner p = float2(extent.x+s,extent.y+s); if (dc.x < p.x && dc.y < p.y) { float2 d = abs(dc - p); ratio = min(d.x,d.y)/max(d.x,d.y); ratio = sqrt(1.0 + ratio*ratio); return (dc - p)*ratio + p; } // Round lower right corner p = float2(extent.x+extent.z-s, extent.y+s); if (dc.x > p.x && dc.y < p.y) { float2 d = abs(dc - p); ratio = min(d.x,d.y)/max(d.x,d.y); ratio = sqrt(1.0 + ratio*ratio); return (dc - p)*ratio + p; } // Round upper left corner p = float2(extent.x+s,extent.y+extent.w-s); if (dc.x < p.x && dc.y > p.y) { float2 d = abs(dc - p); ratio = min(d.x,d.y)/max(d.x,d.y); ratio = sqrt(1.0 + ratio*ratio); return (dc - p)*ratio + p; } // Round upper right corner p = float2(extent.x+extent.z-s, extent.y+extent.w-s); if (dc.x > p.x && dc.y > p.y) { float2 d = abs(dc - p); ratio = min(d.x,d.y)/max(d.x,d.y); ratio = sqrt(1.0 + ratio*ratio); return (dc - p)*ratio + p; } return dc; }
0
0
107
Mar ’25
Animations for streaming
We have a macOS app (not yet released, but in use by ourselves), that provides scoreboards for streaming sport events. Today it is expected, that there are nice animations for goals, etc. We are streaming using NDI, which requires a CVPixelBuffer for each frame. We currently create these animations using CABasicAnimation, CAAnimation and CAKeyframeAnimation. In addition we use ScreenCaptureKit to generate the frames. This works fine with 25/30 fps, as long as the window where our animations are performed in is visible. But this is not what it should be. We have a smaller window as main app window and control display performing the animations in reduced size, while the streaming animations need to be in HD format and later maybe in 4K. When using an offscreen window, the animations are not calculated. We get 1 frame per second or so. So we actually have to connect an external display to the MacBook and open the large windows there. Ugly solution. Do we use a completely wrong approach? Or is there a way to tell the macOS to perform the animations although it is an offscreen window? If it cannot work that way, what is an alternative?
0
0
141
May ’25
Game Center Challenges and Activities are not appearing
Hi, I'm trying to add game center challenges and activities to an already live game, but they are not appearing in game for testing, GameCenter, or the Games app. I know the game is setup with GameKit entitlements since this is a live game and it has working leaderboards and achievements. I've updated to Tahoe beta 8, added a challenge and activity on app store connect, added that to a new distribution and added that distribution to 'Add for Review' I'm using Unity and the Apple Unity plugin Not sure what other steps I'm missing Thanks
0
0
1.1k
Sep ’25
Optimizing HZB Mip-Chain Generation and Bindless Argument Tables in a Custom Metal Engine
Hi everyone, I’ve been developing a custom, end-to-end 3D rendering engine called Crescent from scratch using C++20 and Metal-cpp (targeting macOS and visionOS). My primary goal is to build a zero-bottleneck, GPU-driven pipeline that maximizes the potential of Apple Silicon’s Unified Memory and TBDR architecture. While the fundamental systems are stable, I am looking for architectural feedback from Metal framework engineers regarding specific synchronization and latency challenges. Current Core Implementations: GPU-Driven Instance Culling: High-performance occlusion culling using a Hierarchical Z-Buffer (HZB) approach via Compute Shaders. Clustered Forward Shading: Support for high-count dynamic lights through view-space clustering. Temporal Stability: Custom TAA with history rejection and Motion Blur resolve. Asset Infrastructure: Robust GUID-based scene serialization and a JSON-driven ECS hierarchy. The Architectural Challenge: I am currently seeing slight synchronization overhead when generating the HZB mip-chain. On Apple Silicon, I am evaluating the cost of encoder transitions versus cache-friendly barriers. && m_hzbInitPipeline && m_hzbDownsamplePipeline && !m_hzbMipViews.empty(); if (canBuildHzb) { MTL::ComputeCommandEncoder* hzbInit = commandBuffer->computeCommandEncoder(); hzbInit->setComputePipelineState(m_hzbInitPipeline); hzbInit->setTexture(m_depthTexture, 0); hzbInit->setTexture(m_hzbMipViews[0], 1); if (m_pointClampSampler) { hzbInit->setSamplerState(m_pointClampSampler, 0); } else if (m_linearClampSampler) { hzbInit->setSamplerState(m_linearClampSampler, 0); } const uint32_t hzbWidth = m_hzbMipViews[0]->width(); const uint32_t hzbHeight = m_hzbMipViews[0]->height(); const uint32_t threads = 8; MTL::Size tgSize = MTL::Size(threads, threads, 1); MTL::Size gridSize = MTL::Size((hzbWidth + threads - 1) / threads * threads, (hzbHeight + threads - 1) / threads * threads, 1); hzbInit->dispatchThreads(gridSize, tgSize); hzbInit->endEncoding(); for (size_t mip = 1; mip < m_hzbMipViews.size(); ++mip) { MTL::Texture* src = m_hzbMipViews[mip - 1]; MTL::Texture* dst = m_hzbMipViews[mip]; if (!src || !dst) { continue; } MTL::ComputeCommandEncoder* downEncoder = commandBuffer->computeCommandEncoder(); downEncoder->setComputePipelineState(m_hzbDownsamplePipeline); downEncoder->setTexture(src, 0); downEncoder->setTexture(dst, 1); const uint32_t mipWidth = dst->width(); const uint32_t mipHeight = dst->height(); MTL::Size downGrid = MTL::Size((mipWidth + threads - 1) / threads * threads, (mipHeight + threads - 1) / threads * threads, 1); downEncoder->dispatchThreads(downGrid, tgSize); downEncoder->endEncoding(); } if (m_instanceCullHzbPipeline) { dispatchInstanceCulling(m_instanceCullHzbPipeline, true); } } My Questions: Encoder Synchronization: Would you recommend moving this loop into a single ComputeCommandEncoder using MTLBarrier between dispatches to maintain L2 cache residency, or is the overhead of separate encoders negligible for depth-downsampling on TBDR? visionOS Bindless Latency: For stereo rendering on visionOS, what are the best practices for managing MTL4ArgumentTable updates at 90Hz+? I want to ensure that updating bindless resources for each eye doesn't introduce unnecessary CPU-to-GPU latency. Memory Management: Are there specific hints for Memoryless textures that could be applied to intermediate HZB levels to save bandwidth during this process? I’ve attached a screenshot of a scene rendered with the engine (PBR, SSR, and IBL).
0
0
203
1d
GKObstacleGraph<GKGraphNode2D> copy() not work (Bad ACCESS) (SpriteKit)
Given a graph with added obstacles I want to make a copy of it. When I make the copy: currentGrath added 20 obstacles. var newGrapth = currentGrath.copy() as? GKObstacleGraph newGrapth2.removeObstacles([newGrapth!.obstacles.first!]) This returns a BAD ACCESS. I don't understand what's going on or what the problem is. If I do this same thing with the main network there is no problem: currentGrath.removeObstacles([currentGrath!.obstacles.first!]) Thanks for the help
0
0
463
Feb ’25
screencapturekit
I have code that captures a window and displays a cropped image. The problem is 2 fold. Kit doesn't seem to allow to modify stop and recapture image in window mode to capture a portion of the screen. So this makes me having to crop and display the cropped image via a published variable. This all works find. But seems to stop after some time. Using an M1 16gig ram. program is taking less than 100meg of mem with 40-70%cpu as the crow flies. printing captured success in debug mode and sometimes frame isn't valid so guarding against it. any ideas on how to improve my strategy?
0
0
135
Jun ’25
MacOS Catalina 10.15.7 CoreGraphic.framework not find symbol
I recently needed to develop an application to obtain the window list, which requires Screen Recording permissions. Apple's official documentation mentions using the two functions CGPreflightScreenCaptureAccess and CGRequestScreenCaptureAccess to request permissions. These functions are stated to be available since version 10.15. However, when I used these two functions on a device running macOS 10.15.7, I encountered the errors shown in the attached screenshot. I used the nm tool to inspect the symbols in the CoreGraphics.framework and found that these two functions were not present. Could you help me understand why this is happening?
0
0
93
May ’25
Bone deformation
I have been tasked with creating content for the Apple Vision Pro. Just the 3D content and animation, not the programming end of things. I can't seem to get any kind of mesh deformation animation to import into Reality Composer Pro. By that I mean bones/skin, or even point cache. I work on PC, and my main software is 3DS Max, but I'm borrowing an iMac for this job, and was instructed to use RCP on it for testing before handing things off to the programmer. My files open and play fine in other USD programs, like Omniverse, or USD View, just not Reality Composer Pro. I've seen the dinosaur demo in AVP, so I know mesh deformation is possible. If there are other essential tools that might make this possible, I have not been made aware of them. I am experimenting with bouncing things off of Blender, in case that exports better, but not really having luck there either -though my results are different. Thanks.
0
0
120
Sep ’25
iPad - Can I prevent Multitasking on my app?
I have a game built in Unreal Engine 5.6 which uses tilt motion controls to rotate an object. I've restricted the app to only run in portrait for iPhone, and everything works fine, however for iPad I've had a few issues relating to multitasking and I can't seem to solve it. Forcing the app to portrait only still allows the app to run in landscape mode, but shows black bars either side of the game, and the axes for the motion controls are incorrect. X becomes Y and Y becomes X, and there's no way for my app to know which orientation it is because the container is still technically portrait. Allowing my game to run in all orientations makes the whole app more presentable, it doesn't add black bars and the game is still functional and I'm able to map the controls correctly because the game knows it's landscape rather than portrait. The problem with allowing my app to run in landscape mode is if multitasking is enabled on the ipad, you can resize the app to be portrait, and then I run into the same problem again where the game thinks it's portrait mode and all of the axes are wrong again. I tried getting the true orientation of the device rather than the scene, but the game is intended to be played flat so instead of returning the orientation of the OS the orientation is FaceUp, which doesn't help. I need to either disable multitasking or find a way of getting the orientation of the OS (not the scene or the device). I haven't found how to get the OS orientation so I've been trying to disable multitasking. I've got Requires Fullscreen true and UIApplicationSupportsMultipleScreens false in my info.plist but my iPad still seems to allow the window to be resized in landscape view. Opening the IOS workspace of my project Requires Fullscreen is ticked but under that it says "Supports Multiple Windows" and the arrow button next to it takes my to my info.plist values, but no indication of how I can change it. I'm using Unreal Engine 5.6 and Xcode 16.0. Xcode is old I know, but this version of unreal engine doesn't seem to support any newer.
0
0
295
Nov ’25
Diagnose data access latency
The code is pretty simple kernel void naive( constant RunParams *param [[ buffer(0) ]], const device float *A [[ buffer(1) ]], // [N, K] device float *output [[ buffer(2) ]], uint2 gid [[ thread_position_in_grid ]]) { uint a_ptr = gid.x * param->K; for (uint i = 0; i < param->K; i++, a_ptr++) { val += A[b_ptr]; } output[ptr] = val; } when uint a_ptr = gid.x * param->K, the code got 150 GFLops when uint a_ptr = gid.y * param->K, the code got 860 GFLops param->K = 256; thread per group: [16, 16] I'd like to understand why the performance is so different, and how can I profile/diagnose this to help with further optimization.
0
0
91
Apr ’25
How to properly pass a Metal layer from SwiftUI MTKView to C++ for use with metal-cpp?
Hello! I'm currently porting a videogame console emulator to iOS and I'm trying to make the renderer (tested on MacOS) work on iOS as well. The emulator core is written in C++ and uses metal-cpp for rendering, whereas the iOS frontend is written in Swift with SwiftUI. I have an Objective-C++ bridging header for bridging the Swift and C++ sides. On the Swift side, I create an MTKView. Inside the MTKView delegate, I run the emulator for 1 video frame and pass it the view's backing layer for it to render the final output image with. The emulator runs and returns, but when it returns I get a crash in Swift land (callstack attached below), inside objc_release, which indicates I'm doing something wrong with memory management. My bridging interface (ios_driver.h): #pragma once #include <Foundation/Foundation.h> #include <QuartzCore/QuartzCore.h> void iosCreateEmulator(); void iosRunFrame(CAMetalLayer* layer); Bridge implementation (ios_driver.mm): #import <Foundation/Foundation.h> extern "C" { #include "ios_driver.h" } <...> #define IOS_EXPORT extern "C" __attribute__((visibility("default"))) std::unique_ptr<Emulator> emulator = nullptr; IOS_EXPORT void iosCreateEmulator() { ... } // Runs 1 video frame of the emulator and IOS_EXPORT void iosRunFrame(CAMetalLayer* layer) { void* layerBridged = (__bridge void*)layer; // Pass the CAMetalLayer to the emulator emulator->getRenderer()->setMTKLayer(layerBridged); // Runs the emulator for 1 frame and renders the output image using our layer emulator->runFrame(); } My MTKView delegate: class Renderer: NSObject, MTKViewDelegate { var parent: ContentView var device: MTLDevice! init(_ parent: ContentView) { self.parent = parent if let device = MTLCreateSystemDefaultDevice() { self.device = device } super.init() } func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {} func draw(in view: MTKView) { var metalLayer = view.layer as! CAMetalLayer // Run the emulator for 1 frame & display the output image iosRunFrame(metalLayer) } } Finally, the emulator's render function that interacts with the layer: void RendererMTL::setMTKLayer(void* layer) { metalLayer = (CA::MetalLayer*)layer; } void RendererMTL::display() { CA::MetalDrawable* drawable = metalLayer->nextDrawable(); if (!drawable) { return; } MTL::Texture* texture = drawable->texture(); <rest of rendering follows here using the drawable & its texture> } This is the Swift callstack at the time of the crash: To my understanding, I shouldn't be violating ARC rules as my bridging header uses CAMetalLayer* instead of void* and Swift will automatically account for ARC when passing CoreFoundation objects to Objective-C. However I don't have any other idea as to what might be causing this. I've been trying to debug this code for a couple of days without much success. If you need more info, the emulator code is also on Github Metal renderer: https://github.com/wheremyfoodat/Panda3DS/blob/ios/src/core/renderer_mtl/renderer_mtl.cpp#L58-L68 Bridge implementation: https://github.com/wheremyfoodat/Panda3DS/blob/ios/src/ios_driver.mm Bridging header: https://github.com/wheremyfoodat/Panda3DS/blob/ios/include/ios_driver.h Any help is more than appreciated. Thank you for your time in advance.
0
0
549
Mar ’25
Metal recommendedMaxWorkingSetSize vs actual RAM on iPhone (LLM load fails)
Context I’m deploying large language models on iPhone using llama.cpp. A new iPhone Air (12 GB RAM) reports a Metal MTLDevice.recommendedMaxWorkingSetSize of 8,192 MB, and my attempt to load Llama-2-13B Q4_K (~7.32 GB weights) fails during model initialization. Environment Device: iPhone Air (12 GB RAM) iOS: 26 Xcode: 26.0.1 Build: Metal backend enabled llama.cpp App runs on device (not Simulator) What I’m seeing MTLCreateSystemDefaultDevice().recommendedMaxWorkingSetSize == 8192 MiB Loading Llama-2-13B Q4_K (7.32 GB) fails to complete. Logs indicate memory pressure / allocation issues consistent with the 8 GB working-set guidance. Smaller models (e.g., 7B/8B with similar quantization) load and run (8B Q4_K provide around 9 tokens/second decoding speed). Questions Is 8,192 MB an expected recommendedMaxWorkingSetSize on a 12 GB iPhone? What values should I expect on other 2025 devices including iPhone 17 (8 GB RAM) and iPhone 17 Pro (12 GB RAM) Is it strictly enforced by Metal allocations (heaps/buffers), or advisory for best performance/eviction behavior? Can a process practically exceed this for long-lived buffers without immediate Jetsam risk? Any guidance for LLM scenarios near the limit?
0
0
505
Oct ’25
We are getting a blank image after capturing and compressing the picture.
We used below method to resize image while compress the image, Below method is correct or need to do the correction in method or "CGBitmapContextCreate" -(UIImage *)resizeImage:(UIImage *)anImage width:(int)width height:(int)height { CGImageRef imageRef = [anImage CGImage]; CGImageAlphaInfo alphaInfo = CGImageGetAlphaInfo(imageRef); if (alphaInfo == kCGImageAlphaNone) alphaInfo = kCGImageAlphaNoneSkipLast; CGContextRef bitmap = CGBitmapContextCreate(NULL, width, height, CGImageGetBitsPerComponent(imageRef), 4 * width, CGImageGetColorSpace(imageRef), alphaInfo); CGContextDrawImage(bitmap, CGRectMake(0, 0, width, height), imageRef); CGImageRef ref = CGBitmapContextCreateImage(bitmap); UIImage *result = [UIImage imageWithCGImage:ref]; CGContextRelease(bitmap); CGImageRelease(ref); return result; }
0
0
355
Dec ’25
Deterministic RNG behaviour across Mac M1 CPU and Metal GPU – BigCrush pass & structural diagnostics
Hello, I am currently working on a research project under ENINCA Consulting, focused on advanced diagnostic tools for pseudorandom number generators (structural metrics, multi-seed stability, cross-architecture reproducibility, and complementary indicators to TestU01). To validate this diagnostic framework, I prototyped a small non-linear 64-bit PRNG (not as a goal in itself, but simply as a vehicle to test the methodology). During these evaluations, I observed something interesting on Apple Silicon (Mac M1): • bit-exact reproducibility between M1 ARM CPU and M1 Metal GPU, • full BigCrush pass on both CPU and Metal backends, • excellent p-values, • stable behaviour across multiple seeds and runs. This was not the intended objective, the goal was mainly to validate the diagnostic concepts, but these results raised some questions about deterministic compute behaviour in Metal. My question: Is there any official guidance on achieving (or expecting) deterministic RNG or compute behaviour across CPU ↔ Metal GPU on Apple Silicon? More specifically: • Are deterministic compute kernels expected or guaranteed on Metal for scientific workloads? • Are there recommended patterns or best practices to ensure reproducibility across GPU generations (M1 → M2 → M3 → M4)? • Are there known Metal features that can introduce non-determinism? I am not sharing the internal recurrence (this work is proprietary), but I can discuss the high-level diagnostic observations if helpful. Thank you for any insight, very interested in how the Metal engineering team views deterministic compute patterns on Apple Silicon. Pascal ENINCA Consulting
0
0
208
Nov ’25
Place a box on a wall or on the floor
Hi, I wanted to do something quite simple: Put a box on a wall or on the floor. My box: let myBox = ModelEntity( mesh: .generateBox(size: SIMD3<Float>(0.1, 0.1, 0.01)), materials: [SimpleMaterial(color: .systemRed, isMetallic: false)], collisionShape: .generateBox(size: SIMD3<Float>(0.1, 0.1, 0.01)), mass: 0.0) For that I used Plane Detection to identify the walls and floor in the room. Then with SpatialTapGesture I was able to retrieve the position where the user is looking and tap. let position = value.convert(value.location3D, from: .local, to: .scene) And then positioned my box myBox.setPosition(position, relativeTo: nil) When I then tested it I realized that the box was not parallel to the wall but had a slightly inclined angle. I also realized if I tried to put my box on the wall to my left the box was placed perpendicular to this wall and not placed on it. After various searches and several attempts I ended up playing with transform.matrix to identify if the plane is wall or a floor, if it was in front of me or on the side and set up a rotation on the box to "place" it on the wall or a floor. let surfaceTransform = surface.transform.matrix let surfaceNormal = normalize(surfaceTransform.columns.2.xyz) let baseRotation = simd_quatf(angle: .pi, axis: SIMD3<Float>(0, 1, 0)) var finalRotation: simd_quatf if acos(abs(dot(surfaceNormal, SIMD3<Float>(0, 1, 0)))) < 0.3 { logger.info("Surface: ceiling/floor") finalRotation = simd_quatf(angle: surfaceNormal.y > 0 ? 0 : .pi, axis: SIMD3<Float>(1, 0, 0)) } else if abs(surfaceNormal.x) > abs(surfaceNormal.z) { logger.info("Surface: left/right") finalRotation = simd_quatf(angle: surfaceNormal.x > 0 ? .pi/2 : -.pi/2, axis: SIMD3<Float>(0, 1, 0)) } else { logger.info("Surface: front/back") finalRotation = baseRotation } Playing with matrices is not really my thing so I don't know if I'm doing it right. Could you tell me if my tests for the orientation of the walls are correct? During my tests I don't always correctly identify whether the wall is in front or on the side. Is this generally the right way to do it? Is there an easier way to do this? Regards Tof
0
0
482
Feb ’25
EXC_BREAKPOINT, QuartzCore , Crash CA::Render::Image::new_image
We are seeing crashes in Xcode organizer. So far we are not able to reproduce them locally. They affect multiple app releases (some older, built with Xcode 15.x and newer built with Xcode 16.0). They only affect iOS 18.5. Is there anything that changed in latest iOS? It's hard to tell what exactly is causing this crash because setting symbolic breakpoint on CA::Render::Image::new_image(unsigned int, unsigned int, unsigned int, unsigned int, CGColorSpace*, void const*, unsigned long const*, void (*)(void const*, void*), void*) triggers this breakpoint all the time, but not necessarily with exactly the previous stack frames matching the crash report. Is it a known issue? crash.crash Thank you.
0
5
393
Jul ’25