Create the QRCode
CIFilter<CIQRCodeGenerator> *f = CIFilter.QRCodeGeneratorFilter;
f.message = [@"Message" dataUsingEncoding:NSASCIIStringEncoding];
f.correctionLevel = @"Q"; // raise the error-correction level so the overlaid icon doesn't break decoding
CIImage *qrcode = f.outputImage;
Overlay the icon
CIImage *icon = [CIImage imageWithContentsOfURL:url];
CGAffineTransform t = CGAffineTransformMakeTranslation(
    (qrcode.extent.size.width - icon.extent.size.width) / 2.0,
    (qrcode.extent.size.height - icon.extent.size.height) / 2.0);
icon = [icon imageByApplyingTransform:t];
qrcode = [icon imageByCompositingOverImage:qrcode];
Round off the corners
static dispatch_once_t onceToken;
static CIWarpKernel *k;
dispatch_once(&onceToken, ^ {
k = [CIWarpKernel kernelWithFunctionName:@"bend_corners"
                    fromMetalLibraryData:metalLibData()
                                   error:nil];
});
qrcode = [k applyWithExtent:qrcode.extent
                roiCallback:^CGRect(int i, CGRect r) {
                    return CGRectInset(r, -radius, -radius); }
                 inputImage:qrcode
                  arguments:@[[CIVector vectorWithCGRect:qrcode.extent], @(radius)]];
…and this code for the kernel should go in a separate .ci.metal source file:
#include <CoreImage/CoreImage.h>
using namespace metal;

extern "C" float2 bend_corners(float4 extent, float s, coreimage::destination dest)
{
    float2 p, dc = dest.coord();
    float ratio = 1.0;
    // Round lower left corner
    p = float2(extent.x + s, extent.y + s);
    if (dc.x < p.x && dc.y < p.y) {
        float2 d = abs(dc - p);
        ratio = min(d.x, d.y) / max(d.x, d.y);
        ratio = sqrt(1.0 + ratio*ratio);
        return (dc - p)*ratio + p;
    }
    // Round lower right corner
    p = float2(extent.x + extent.z - s, extent.y + s);
    if (dc.x > p.x && dc.y < p.y) {
        float2 d = abs(dc - p);
        ratio = min(d.x, d.y) / max(d.x, d.y);
        ratio = sqrt(1.0 + ratio*ratio);
        return (dc - p)*ratio + p;
    }
    // Round upper left corner
    p = float2(extent.x + s, extent.y + extent.w - s);
    if (dc.x < p.x && dc.y > p.y) {
        float2 d = abs(dc - p);
        ratio = min(d.x, d.y) / max(d.x, d.y);
        ratio = sqrt(1.0 + ratio*ratio);
        return (dc - p)*ratio + p;
    }
    // Round upper right corner
    p = float2(extent.x + extent.z - s, extent.y + extent.w - s);
    if (dc.x > p.x && dc.y > p.y) {
        float2 d = abs(dc - p);
        ratio = min(d.x, d.y) / max(d.x, d.y);
        ratio = sqrt(1.0 + ratio*ratio);
        return (dc - p)*ratio + p;
    }
    return dc;
}
Hi Apple,
In VisionOS, for real-time streaming of large 3D scenes, I plan to create Metal buffers and textures in multiple threads and then use a compute shader on the main thread to copy the Metal resources into RealityKit, minimizing main thread usage. Given that most of RealityKit's default APIs require execution on the main actor (main thread), it is not ideal for streaming data. Is this approach the best way to handle streaming data and real-time rendering?
Thank you very much.
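Not an official recommendation, just a hedged sketch of one way to structure that copy, assuming RealityKit's TextureResource.DrawableQueue (device, blitQueue, sourceTexture, and textureResource are placeholder names):

import Metal
import RealityKit

// Sketch: stream Metal-produced pixels into a RealityKit texture without
// blocking the main actor on every frame.
func makeStreamingQueue() throws -> TextureResource.DrawableQueue {
    let descriptor = TextureResource.DrawableQueue.Descriptor(
        pixelFormat: .bgra8Unorm,
        width: 1024, height: 1024,
        usage: [.shaderRead],
        mipmapsMode: .none)
    return try TextureResource.DrawableQueue(descriptor)
}

// One-time setup, on the main actor:
//   textureResource.replace(withDrawables: drawableQueue)

// Per frame, on a background queue:
func pushFrame(into drawableQueue: TextureResource.DrawableQueue,
               from sourceTexture: MTLTexture,
               using blitQueue: MTLCommandQueue) {
    guard let drawable = try? drawableQueue.tryNextDrawable(),
          let commandBuffer = blitQueue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return }
    blit.copy(from: sourceTexture, to: drawable.texture) // sizes/formats must match
    blit.endEncoding()
    commandBuffer.commit()
    drawable.present()
}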
I've been working on a Swift game which is not yet launched or available for preview.
The game is designed so that the CPU is idle while the user is thinking, then sustains max CPU and GPU on as many cores as possible when they make a move.
Rarely, due to OS activity or something else outside my control (for example, pulling down the OS curtain briefly and then removing it), the game or some of its threads get moved to efficiency cores, which causes major stuttering. The stuttering persists precisely until the game goes idle again, at which point it is moved back onto performance cores; but if the player keeps making moves, the stuttering simply won't go away, so I assume the computation is locked onto the efficiency cores.
The issue does not reproduce on Mac Catalyst on Intel.
How do I tell Swift to avoid efficiency cores?
BTW Swift and SceneKit have AMAZING performance, especially when compared to others.
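For what it's worth, there is no public API to pin work to performance cores; the usual lever is quality of service, which biases the scheduler toward P-cores. A minimal sketch:

import Foundation

// Sketch: give the move computation the highest scheduling priority.
let worker = Thread {
    // heavy move computation here
}
worker.qualityOfService = .userInteractive
worker.start()

// The equivalent hint for GCD-based work:
DispatchQueue.global(qos: .userInteractive).async {
    // heavy move computation here
}

Lower-QoS work (e.g. .utility or .background) is what the scheduler most readily parks on efficiency cores, so auditing the QoS of every queue and thread involved is a reasonable first step.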
Description:
I'm developing an AR effect using SceneKit and applying a transparent material to a face mesh. However, I'm facing an issue where the front faces of the mesh overlap each other, causing incorrect rendering.
Problem:
The front faces of the mesh overlap with each other when transparency is applied.
This causes areas like the cheeks to be visible through the nose, even though they should be occluded.
Expected Behavior: The material should behave as if it were opaque to itself—that is, overlapping front faces should be occluded properly, while still allowing transparency for background elements.
Actual Behavior: The mesh renders its own front faces incorrectly, making parts of the face visible through others when they should be blocked.
What I Have Tried:
testMaterial.writesToDepthBuffer = true
testMaterial.readsFromDepthBuffer = true
Question:
👉 How can I prevent SceneKit's transparent material from rendering overlapping front faces?
👉 Is there a way to force SceneKit to treat its own mesh as opaque for itself while still being transparent to the background?
👉 Does SceneKit support a proper depth pre-pass or an equivalent to Unity’s ZWrite shaders to solve this issue?
Attached screenshots demonstrate the problem visually. Any help would be greatly appreciated! 🚀
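One hedged idea, sketched under assumptions (faceNode and visibleMaterial are placeholders for the poster's node and material): render a depth-only duplicate of the mesh first, so the transparent pass depth-tests against its own front faces, similar to a depth pre-pass.

import SceneKit

// Sketch: a manual depth pre-pass for self-occluding transparency.
func addDepthPrepass(for faceNode: SCNNode, visibleMaterial: SCNMaterial) {
    let depthMaterial = SCNMaterial()
    depthMaterial.colorBufferWriteMask = []        // write depth only, no color
    depthMaterial.writesToDepthBuffer = true

    let prepass = faceNode.clone()
    prepass.geometry = faceNode.geometry?.copy() as? SCNGeometry
    prepass.geometry?.materials = [depthMaterial]
    prepass.renderingOrder = faceNode.renderingOrder - 1  // draw before the visible mesh

    faceNode.parent?.addChildNode(prepass)

    // The visible pass tests against the pre-pass depth instead of writing its own.
    visibleMaterial.writesToDepthBuffer = false
    visibleMaterial.readsFromDepthBuffer = true
}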
We want to integrate Game Center into a game app. Users can create multiple in-game characters. Suppose a user creates two characters, Character 1 and Character 2:
After the user binds Character 1 to Game Center, its data is reported to Game Center. If the player then unbinds Character 1 from Game Center and binds Character 2 instead, will Character 1's data remain in Game Center, or will it be removed?
We are seeing crashes in the Xcode Organizer. So far we have not been able to reproduce them locally. They affect multiple app releases (some older, built with Xcode 15.x, and newer ones built with Xcode 16.0). They only affect iOS 18.5.
Did anything change in the latest iOS? It's hard to tell exactly what is causing this crash, because a symbolic breakpoint on CA::Render::Image::new_image(unsigned int, unsigned int, unsigned int, unsigned int, CGColorSpace*, void const*, unsigned long const*, void (*)(void const*, void*), void*) triggers all the time, but not necessarily with stack frames matching the crash report.
Is it a known issue?
crash.crash
Thank you.
My iOS app generates PDF files.
Every time my users open a generated PDF, the AutoFill popup appears, but my PDF is not meant to be interactive.
Is there a way to mark my PDF files as "not a form", in the metadata or anywhere else?
Hi all,
I've developed some code that enables an arcball camera interaction with my scene. I've done this using components and systems. The implementation feels a bit messy: I've got gesture code on my RealityView, and then a bunch of other code that consumes those gesture inputs in my component and system.
Is there a demo app, or some example code, that shows a nice way to encapsulate these things into one item for custom cameras, something like Apple's .realityViewCameraControls(.orbit)?
If not, can anyone recommend an approach to take?
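One possible shape for the encapsulation, sketched under assumptions (ArcballState stands in for the poster's existing component data): wrap the gesture plumbing in a ViewModifier so the RealityView body stays clean, mirroring the .realityViewCameraControls(.orbit) call style.

import SwiftUI

// Hypothetical sketch: package the gesture handling behind one modifier.
struct ArcballState {
    var yaw: Float = 0
    var pitch: Float = 0
}

struct ArcballControls: ViewModifier {
    @Binding var state: ArcballState
    func body(content: Content) -> some View {
        content.gesture(
            DragGesture().onChanged { value in
                state.yaw   += Float(value.translation.width)  * 0.01
                state.pitch += Float(value.translation.height) * 0.01
            }
        )
    }
}

extension View {
    // Usage: RealityView { ... }.arcballCameraControls($cameraState)
    func arcballCameraControls(_ state: Binding<ArcballState>) -> some View {
        modifier(ArcballControls(state: state))
    }
}

The system then only reads ArcballState to position the camera, keeping all input handling in one place.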
Hi,
I'm trying to add Game Center challenges and activities to an already-live game, but they are not appearing in-game for testing, in Game Center, or in the Games app.
I know the game is set up with GameKit entitlements, since this is a live game with working leaderboards and achievements.
I've updated to Tahoe beta 8, added a challenge and an activity in App Store Connect, added them to a new distribution, and added that distribution to 'Add for Review'.
I'm using Unity and the Apple Unity plugin
Not sure what other steps I'm missing
Thanks
In a Unity game, calling the assetBundle.Unload interface very frequently during normal gameplay can cause the application's screen to freeze, although the background music continues to play normally. This issue only occurs on iPhone 16 and iPhone 17 models, with no problems on earlier devices. How can this problem be resolved?
I have run into an issue trying to use atomic_float in a Swift package: I cannot get things to compile because it appears that the Swift Package Manager doesn't support Metal 3 (atomic_float is Metal 3 functionality). Is there any way around this? I am using
// swift-tools-version: 6.1
and my Metal code includes:
#include <metal_stdlib>
#include <metal_geometric>
#include <metal_math>
#include <metal_atomic>
using namespace metal;
kernel void test(device atomic_float* imageBuffer [[ buffer(1) ]],
                 uint id [[ thread_position_in_grid ]]) {
}
But I get an error on the definition of atomic_float.
Any help, and more importantly a pointer to where I could have found information about this limitation, would be appreciated.
-RadBobby
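One hedged workaround, assuming the shader can be shipped as a plain text resource instead of a compiled .metal target: compile it at runtime with an explicit language version, which sidesteps SwiftPM's Metal build step entirely.

import Metal

// Workaround sketch: runtime compilation with the Metal 3 language version.
// `shaderSource` is assumed to hold the shader text loaded from the bundle.
func makeAtomicLibrary(shaderSource: String) throws -> MTLLibrary {
    guard let device = MTLCreateSystemDefaultDevice() else {
        fatalError("No Metal device available")
    }
    let options = MTLCompileOptions()
    options.languageVersion = .version3_0   // Metal 3: atomic_float et al.
    return try device.makeLibrary(source: shaderSource, options: options)
}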
Hi everyone,
I’ve been developing a custom, end-to-end 3D rendering engine called Crescent from scratch using C++20 and Metal-cpp (targeting macOS and visionOS). My primary goal is to build a zero-bottleneck, GPU-driven pipeline that maximizes the potential of Apple Silicon’s Unified Memory and TBDR architecture.
While the fundamental systems are stable, I am looking for architectural feedback from Metal framework engineers regarding specific synchronization and latency challenges.
Current Core Implementations:
GPU-Driven Instance Culling: High-performance occlusion culling using a Hierarchical Z-Buffer (HZB) approach via Compute Shaders.
Clustered Forward Shading: Support for high-count dynamic lights through view-space clustering.
Temporal Stability: Custom TAA with history rejection and Motion Blur resolve.
Asset Infrastructure: Robust GUID-based scene serialization and a JSON-driven ECS hierarchy.
The Architectural Challenge:
I am currently seeing slight synchronization overhead when generating the HZB mip-chain. On Apple Silicon, I am evaluating the cost of encoder transitions versus cache-friendly barriers.
const bool canBuildHzb = m_depthTexture && m_hzbInitPipeline
    && m_hzbDownsamplePipeline && !m_hzbMipViews.empty();
if (canBuildHzb) {
    // Pass 1: seed mip 0 of the HZB from the depth buffer.
    MTL::ComputeCommandEncoder* hzbInit = commandBuffer->computeCommandEncoder();
    hzbInit->setComputePipelineState(m_hzbInitPipeline);
    hzbInit->setTexture(m_depthTexture, 0);
    hzbInit->setTexture(m_hzbMipViews[0], 1);
    if (m_pointClampSampler) {
        hzbInit->setSamplerState(m_pointClampSampler, 0);
    } else if (m_linearClampSampler) {
        hzbInit->setSamplerState(m_linearClampSampler, 0);
    }
    const uint32_t hzbWidth = m_hzbMipViews[0]->width();
    const uint32_t hzbHeight = m_hzbMipViews[0]->height();
    const uint32_t threads = 8;
    MTL::Size tgSize = MTL::Size(threads, threads, 1);
    MTL::Size gridSize = MTL::Size((hzbWidth + threads - 1) / threads * threads,
                                   (hzbHeight + threads - 1) / threads * threads,
                                   1);
    hzbInit->dispatchThreads(gridSize, tgSize);
    hzbInit->endEncoding();

    // Passes 2..N: downsample the chain, currently one encoder per mip.
    for (size_t mip = 1; mip < m_hzbMipViews.size(); ++mip) {
        MTL::Texture* src = m_hzbMipViews[mip - 1];
        MTL::Texture* dst = m_hzbMipViews[mip];
        if (!src || !dst) {
            continue;
        }
        MTL::ComputeCommandEncoder* downEncoder = commandBuffer->computeCommandEncoder();
        downEncoder->setComputePipelineState(m_hzbDownsamplePipeline);
        downEncoder->setTexture(src, 0);
        downEncoder->setTexture(dst, 1);
        const uint32_t mipWidth = dst->width();
        const uint32_t mipHeight = dst->height();
        MTL::Size downGrid = MTL::Size((mipWidth + threads - 1) / threads * threads,
                                       (mipHeight + threads - 1) / threads * threads,
                                       1);
        downEncoder->dispatchThreads(downGrid, tgSize);
        downEncoder->endEncoding();
    }

    // Finally, run HZB-based instance culling against the fresh chain.
    if (m_instanceCullHzbPipeline) {
        dispatchInstanceCulling(m_instanceCullHzbPipeline, true);
    }
}
My Questions:
Encoder Synchronization: Would you recommend moving this loop into a single ComputeCommandEncoder using MTLBarrier between dispatches to maintain L2 cache residency, or is the overhead of separate encoders negligible for depth-downsampling on TBDR?
visionOS Bindless Latency: For stereo rendering on visionOS, what are the best practices for managing MTL4ArgumentTable updates at 90Hz+? I want to ensure that updating bindless resources for each eye doesn't introduce unnecessary CPU-to-GPU latency.
Memory Management: Are there specific hints for Memoryless textures that could be applied to intermediate HZB levels to save bandwidth during this process?
I’ve attached a screenshot of a scene rendered with the engine (PBR, SSR, and IBL).
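On question 1, a hedged sketch (Swift for brevity) of the single-encoder variant being weighed: one compute encoder for the whole chain, with a texture-scope memory barrier between dispatches in place of encoder splits. Whether it actually beats separate encoders on TBDR is exactly what would need profiling; `pipeline` and `mips` are placeholders for the downsample PSO and the per-mip texture views.

import Metal

// Sketch: HZB downsample chain in a single compute encoder.
func buildHZBChain(commandBuffer: MTLCommandBuffer,
                   pipeline: MTLComputePipelineState,
                   mips: [MTLTexture]) {
    guard let encoder = commandBuffer.makeComputeCommandEncoder() else { return }
    encoder.setComputePipelineState(pipeline)
    for mip in 1..<mips.count {
        encoder.setTexture(mips[mip - 1], index: 0)
        encoder.setTexture(mips[mip], index: 1)
        encoder.dispatchThreads(
            MTLSize(width: mips[mip].width, height: mips[mip].height, depth: 1),
            threadsPerThreadgroup: MTLSize(width: 8, height: 8, depth: 1))
        // Make this mip's writes visible to the next dispatch's reads.
        encoder.memoryBarrier(scope: .textures)
    }
    encoder.endEncoding()
}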
Hello!
I'm a developer working on a plugin for the Elgato Stream Deck, called GPU Metrics. The plugin currently only works on Windows but I'd like to bring it to macOS. However, based on forum posts I've read (and StackOverflow) there isn't a very clear path to query GPU metrics like usage, temperature, used GPU memory, and power consumption. There are some tools out there that do similar things, but I wanted to see what would be the recommendation from Apple's engineering team to get this data via a public API.
Requirements:
Access GPU utilization, temperature, memory usage, power usage
C/C++ based API for querying the metrics so I can expose the data to JavaScript via Node Addon
No need for compatibility with Intel-based Macs; Apple silicon is fine for now
Plugin GitHub
Thank you!
Noah
I have an odd bug. If I use initWithFrame as the init routine for an NSView subclass that uses layers, I don't see the bug.
But if I embed this view in a storyboard with a .nib file and it is initialized via initWithCoder, I need to return YES from
- (BOOL)contentsAreFlipped
in the NSView subclass.
If I don't, the CALayer renders from (0,0) of the view upwards and off the window.
The frame sizes for the NSView and the CALayer are correct when I inspect them in updateLayer.
Obviously I have a fix, but I would like to understand why.
When running on my iPhone SE 3 under iOS 18.4.1, achievement banners show as expected. With the same code running on my iPad Air 2 under iOS 15.8.4, achievement banners do not show, but the achievements are accepted (as shown in the GameCenterViewController). The banners also don't show in the simulator on an iPhone 16 Pro Max under iOS 18.2 or an iPhone SE 3 under iOS 18.3. I haven't tried others. [Note that I clear the achievements each test run so that I can duplicate this.]
I'm updating an existing distributed game to add turn-based matches. When the info button next to a game is pressed in the Matchmaker view controller, the results vary:
iOS 15.x - The button under the avatar says "Accept Invite" or "View Game" (depending on whether the invite has already been accepted).
iOS 18.x - The button always says "App Store" - I assume that means it would lead one to the App Store to install the game.
Both devices (the iPad on 15.x and the iPhone on 18.x) have the same version of the game installed. The results are the same when running in the simulator.
When the game is released, I assume this button will work properly, no?
The code is pretty simple:
kernel void naive(
    constant RunParams *param [[ buffer(0) ]],
    const device float *A [[ buffer(1) ]], // [N, K]
    device float *output [[ buffer(2) ]],
    uint2 gid [[ thread_position_in_grid ]]) {
    float val = 0.0f;                  // accumulator (declaration was missing in the post)
    uint a_ptr = gid.x * param->K;
    for (uint i = 0; i < param->K; i++, a_ptr++) {
        val += A[a_ptr];
    }
    output[gid.x] = val;
}
when uint a_ptr = gid.x * param->K, the code gets 150 GFLOPS
when uint a_ptr = gid.y * param->K, the code gets 860 GFLOPS
param->K = 256;
threads per group: [16, 16]
I'd like to understand why the performance is so different, and how can I profile/diagnose this to help with further optimization.
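On the profiling half of the question, one hedged sketch: a programmatic GPU trace capture around the dispatch lets Xcode's shader profiler show whether the slow variant is memory-bandwidth limited (capture must be enabled for the process, e.g. via the MTL_CAPTURE_ENABLED=1 environment variable).

import Foundation
import Metal

// Sketch: capture a .gputrace around the kernel for inspection in Xcode.
func withGPUCapture(device: MTLDevice, _ body: () -> Void) throws {
    let manager = MTLCaptureManager.shared()
    let descriptor = MTLCaptureDescriptor()
    descriptor.captureObject = device
    descriptor.destination = .gpuTraceDocument
    descriptor.outputURL = URL(fileURLWithPath: "/tmp/naive.gputrace")
    try manager.startCapture(with: descriptor)
    body()   // encode, commit, and wait on the command buffer here
    manager.stopCapture()
}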
Hi,
I have a Unity game. I need multiple app icons so that the game can be recognized in different countries.
In other words, is it possible to have an iOS app in which the App Icon changes based on device locale/language?
On Android this is possible using Unity Localization package "com.unity.localization"
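For the iOS side, a hedged sketch (icon names are placeholders, and each alternate icon must be declared under CFBundleAlternateIcons in the Info.plist): there is no automatic per-locale icon, but the app can switch icons at runtime based on the device language. From Unity this would need to be called through a native plugin.

import UIKit

// Sketch: pick an alternate icon from the preferred language at launch.
func applyLocalizedIcon() {
    guard UIApplication.shared.supportsAlternateIcons else { return }
    let language = Locale.preferredLanguages.first ?? "en"
    let iconName: String? = language.hasPrefix("ja") ? "AppIcon-JA" : nil // nil = primary icon
    if UIApplication.shared.alternateIconName != iconName {
        UIApplication.shared.setAlternateIconName(iconName) { error in
            if let error { print("Icon change failed: \(error)") }
        }
    }
}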
Hello!
I have a question about how thread groups work with tile shading. When running "traditional" compute, I get to choose both the thread group size and the grid size. However, when using a tile shading kernel I only have the dispatchThreadsPerTile method - this controls how many threads will run in each tile. So far so good, but what about thread groups?
The examples in video "Tile Shading on A11" seem to suggest that there will be only one thread group per tile. In the video, [[thread_index_in_threadgroup]] is called "local_id" and it is used to access the image block.
I assume this is the default configuration. So when one does the following:
Creates MTLRenderPassDescriptor with tileWidth set to W and tileHeight set to H
Fires up the tile shading kernel using dispatchThreadsPerTile with MTLSize size = { W, H, 1 }
I understand that the result is 1-to-1 mapping between the tile "pixels" and kernel threads. Now, what I would like to do is to have more than one thread group there. I want this for performance reasons: I have a certain compute kernel which I know executes very well with small thread group size. In fact, { 32, 1, 1 } seems to be the fastest. My understanding is that even if I set tile size to 16x16, and so I am executing 256 threads there, there will only be one SIMD group active in a thread group. Meaning that this SIMD group has to execute 8 times over the tile.
Is it possible somehow? Or perhaps the limitations of the API are pointing at the limitations of hardware itself, and if I want to execute with SIMD group sized thread groups I have to use "traditional" compute encoder?
Will be grateful for help.
Michał
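For reference, a hedged sketch of the "traditional" fallback mentioned at the end, where the threadgroup size is fully explicit (`pipeline` and `texture` are placeholders); the trade-off is losing access to the tile's imageblock memory:

import Metal

// Sketch: plain compute dispatch with an explicit {32,1,1} threadgroup.
func dispatchOutsideTilePipeline(commandBuffer: MTLCommandBuffer,
                                 pipeline: MTLComputePipelineState,
                                 texture: MTLTexture) {
    guard let encoder = commandBuffer.makeComputeCommandEncoder() else { return }
    encoder.setComputePipelineState(pipeline)
    encoder.setTexture(texture, index: 0)
    encoder.dispatchThreads(
        MTLSize(width: texture.width, height: texture.height, depth: 1),
        threadsPerThreadgroup: MTLSize(width: 32, height: 1, depth: 1))
    encoder.endEncoding()
}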
Since macOS 15.3.2, we have observed that when another window is moved near the App Store's install button, the button disappears.
We have attached a related video in the Feedback submission here https://feedbackassistant.apple.com/feedback/20444423
Our application overlays a transparent watermark window on top of the system window, which causes the install button in the App Store to be hidden when a user attempts to install an application. Could you advise on how to avoid this issue?