In my project I need to do the following:
In runtime create metal Dynamic library from source.
In runtime create metal Executable library from source and Link it with my previous created Dynamic library.
Create compute pipeline using those two libraries created above.
But I get the following error at the third step:
Error Domain=AGXMetalG15X_M1 Code=2 "Undefined symbols:
_Z5noisev, referenced from: OnTheFlyKernel
" UserInfo={NSLocalizedDescription=Undefined symbols:
_Z5noisev, referenced from: OnTheFlyKernel
}
import Foundation
import Metal
class MetalShaderCompiler {
let device = MTLCreateSystemDefaultDevice()!
var pipeline: MTLComputePipelineState!
func compileDylib() -> MTLDynamicLibrary {
let source = """
#include <metal_stdlib>
using namespace metal;
half3 noise() {
return half3(1, 0, 1);
}
"""
let option = MTLCompileOptions()
option.libraryType = .dynamic
option.installName = "@executable_path/libFoundation.metallib"
let library = try! device.makeLibrary(source: source, options: option)
let dylib = try! device.makeDynamicLibrary(library: library)
return dylib
}
func compileExlib(dylib: MTLDynamicLibrary) -> MTLLibrary {
let source = """
#include <metal_stdlib>
using namespace metal;
extern half3 noise();
kernel void OnTheFlyKernel(texture2d<half, access::read> src [[texture(0)]],
texture2d<half, access::write> dst [[texture(1)]],
ushort2 gid [[thread_position_in_grid]]) {
half4 rgba = src.read(gid);
rgba.rgb += noise();
dst.write(rgba, gid);
}
"""
let option = MTLCompileOptions()
option.libraryType = .executable
option.libraries = [dylib]
let library = try! self.device.makeLibrary(source: source, options: option)
return library
}
func runtime() {
let dylib = self.compileDylib()
let exlib = self.compileExlib(dylib: dylib)
let pipelineDescriptor = MTLComputePipelineDescriptor()
pipelineDescriptor.computeFunction = exlib.makeFunction(name: "OnTheFlyKernel")
pipelineDescriptor.preloadedLibraries = [dylib]
pipeline = try! device.makeComputePipelineState(descriptor: pipelineDescriptor, options: .bindingInfo, reflection: nil)
}
}
Delve into the world of graphics and game development. Discuss creating stunning visuals, optimizing game mechanics, and share resources for game developers.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Created
I am building a MacOS desktop app (https://anukari.com) that is using Metal compute to do real-time audio/DSP processing, as I have a problem that is highly parallelizable and too computationally expensive for the CPU.
However it seems that the way in which I am using the GPU, even when my app is fully compute-limited, the OS never increases the power/performance state. Because this is a real-time audio synthesis application, it's a huge problem to not be able to take advantage of the full clock speeds that the GPU is capable of, because the app can't keep up with real-time.
I discovered this issue while profiling the app using Instrument's Metal tracing (and Game tracing) modes. In the profiling configuration under "Metal Application" there is a drop-down to select the "Performance State." If I run the application under Instruments with Performance State set to Maximum, it runs amazingly well, and all my problems go away.
For comparison, when I run the app on its own, outside of Instruments, the expensive GPU computation it's doing takes around 2x as long to complete, meaning that the app performs half as well.
I've done a ton of work to micro-optimize my Metal compute code, based on every scrap of information from the WWDC videos, etc. A problem I'm running into is that I think that the more efficient I make my code, the less it signals to the OS that I want high GPU clock speeds!
I think part of why the OS is confused is that in most use cases, my computation can be done using only a small number of Metal threadgroups. I'm guessing that the OS heuristics see that only a small fraction of the GPU is saturated and fail to scale up the power/clock state.
I'm not sure what to do here; I'm in a bit of a bind. One possibility is that I intentionally schedule busy work -- spin threadgroups just to waste energy and signal to the OS that I need higher clock speeds. This is obviously a really bad idea, but it might work.
Is there any other (better) way for my app to signal to the OS that it is doing real-time latency-sensitive computation on the GPU and needs the clock speeds to be scaled up?
Note that game mode is not really an option, as my app also runs as an AU plugin inside hosts like Garageband, so it can't be made fullscreen, etc.
In my SceneKit game I'm able to connect two players with GKMatchmakerViewController. Now I want to support the scenario where one of them disconnects and wants to reconnect. I tried to do this with this code:
nonisolated public func match(_ match: GKMatch, player: GKPlayer, didChange state: GKPlayerConnectionState) {
Task { @MainActor in
switch state {
case .connected:
break
case .disconnected, .unknown:
let matchRequest = GKMatchRequest()
matchRequest.recipients = [player]
do {
try await GKMatchmaker.shared().addPlayers(to: match, matchRequest: matchRequest)
} catch {
}
@unknown default:
break
}
}
}
nonisolated public func player(_ player: GKPlayer, didAccept invite: GKInvite) {
guard let viewController = GKMatchmakerViewController(invite: invite) else {
return
}
viewController.matchmakerDelegate = self
present(viewController)
}
But after presenting the view controller with GKMatchmakerViewController(invite:), nothing else happens. I would expect matchmakerViewController(_:didFind:) to be called, or how would I get an instance of GKMatch?
Here is the code I use to reproduce the issue, and below the reproduction steps.
Code
Run the attached project on an iPad and a Mac simultaneously.
On both devices, tap the ship to connect to GameCenter.
Create an automatched match by tapping the rightmost icon on both devices.
When the two devices are matched, on iPad close the dialog and tap on the ship to disconnect from GameCenter.
Wait some time until the Mac detects the disconnect and automatically sends an invitation to join again.
When the notification arrives on the iPad, tap it, then tap the ship to connect to GameCenter again. The iPad receives the call player(_:didAccept:), but nothing else, so there’s no way to get a GKMatch instance again.
Hi,
wanted to test if possible to use Mesa3D OGLon12+D3DMetal 2b3 to get GL>4.1 support on windows apps via D3D12Metal..
using simple wglgears.c app (similar glxgears) and running like:
GALLIUM_DRIVER=d3d12 wine64 wglgears64 -info
with overridden opengl32.dll using contents from:
https://github.com/pal1000/mesa-dist-win/releases/download/24.3.0-rc1/mesa3d-24.3.0-rc1-release-msvc.7z
I get:
[D3DMetal:LOG:5E53] Unsupported API: CreateCommandQueue1
caused by:
https://gitlab.freedesktop.org/mesa/mesa/-/commit/c022c9603d500b59ff5e6f93c8a214d1785ab20a
API:
https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12device9-createcommandqueue1
note setup is correct as using:
GALLIUM_DRIVER=llvmpipe wine64 wglgears64 -info
I get:
GL_RENDERER = llvmpipe (LLVM 19.1.3, 128 bits)
GL_VERSION = 4.5 (Compatibility Profile) Mesa 24.3.0-rc1 (git-85ba713d76)
GL_VENDOR = Mesa
GL_EXTENSIONS = GL_ARB_multisample GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color GL_EXT_blend_minmax GL_EXT_blend_subtract
r GL_EXT_texture.. etc..
I am trying to load some PNG data with MTKTextureLoader newTextureWithData,but the result shows wrong at the alpha area.
Here is the code. I have an image URL, after it downloads successfully, I try to use the data or UIImagePNGRepresentation (image), they all show wrong.
UIImage *tempImg = [UIImage imageWithData:data];
CGImageRef cgRef = tempImg.CGImage;
MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:device];
id<MTLTexture> temp1 = [loader newTextureWithData:data options:@{MTKTextureLoaderOptionSRGB: @(NO), MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead), MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)} error:nil];
NSData *tempData = UIImagePNGRepresentation(tempImg);
id<MTLTexture> temp2 = [loader newTextureWithData:tempData options:@{MTKTextureLoaderOptionSRGB: @(NO), MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead), MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)} error:nil];
id<MTLTexture> temp3 = [loader newTextureWithCGImage:cgRef options:@{MTKTextureLoaderOptionSRGB: @(NO), MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead), MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)} error:nil];
}] resume];
Starting with iOS 18.0 beta 1, I've noticed that RealityKit frequently crashes in the simulator when an app launches and presents an ARView.
I was able to create a small sample app with repro steps that demonstrates the issue, and I've submitted feedback: FB16144085
I've included a crash log with the feedback.
If possible, I'd appreciate it if an Apple engineer could investigate and suggest a workaround. It's awkward to be restricted to the iOS 17 simulator, which does not exhibit this behavior.
Please let me know if there's anything I can do to help.
Thank you.
We are trying to implement saving and fetching data to and from iCloud, but it have some problems.
MacOS: 15.3
Here is what I do:
Enable Game Center and iCloud capbility in Signing & Capabilities, pick iCloud Documents, create and select a Container.
Sample code:
void SaveDataToCloud( const void* buffer, unsigned int datasize, const char* name )
{
if(!GKLocalPlayer.localPlayer.authenticated) return;
NSData* data = [ NSData dataWithBytes:databuffer length:datasize];
NSString* filename = [ NSString stringWithUTF8String:name ];
[[GKLocalPlayer localPlayer] saveGameData:data withName:filename completionHandler:^(GKSavedGame * _Nullable savedGame, NSError * _Nullable){
if (error != nil)
{
NSLog( @"SaveDataToCloud error:%@", [ error localizedDescription ] );
}
}];
}
void FetchCloudSavedGameData()
{
if ( !GKLocalPlayer.localPlayer.authenticated ) return;
[ [ GKLocalPlayer localPlayer ] fetchSavedGamesWithCompletionHandler:^(NSArray<GKSavedGame *> * _Nullable savedGames, NSError * _Nullable error) {
if ( error == nil )
{
for ( GKSavedGame *item in savedGames )
{
[ item loadDataWithCompletionHandler:^(NSData * _Nullable data, NSError * _Nullable error) {
if ( error == nil )
{
//handle data
}
else
{
NSLog( @"FetchCloudSavedGameData failed to load iCloud file: %@, error:%@", item.name, [ error localizedDescription ] );
}
} ];
}
}
else
{
NSLog( @"FetchCloudSavedGameData error:%@", [ error localizedDescription ] );
}
} ];
}
Both saveGameData and fetchSavedGamesWithCompletionHandler are not reporting any error, when debugging, saveGameData completionHandler got a nil error, and can get a valid "savedGame", but when try to rebot the game and use "fetchSavedGamesWithCompletionHandler" to fetch data, we got nothing, no error reported, and the savedGames got a 0 length.
From this page https://developer.apple.com/forums/thread/718541?answerId=825596022#825596022
we try to wait 30sec after authenticated , then try fetchSavedGamesWithCompletionHandler, still got the same error.
Checked:
Game Center and iCloud are enabled and login with the same account.
iCloud have enough space to save.
So what's wrong with it.
I'm building a game with a client-server architecture. Using GKMatch.chooseBestHostingPlayer(_:) rarely works. When I started testing it today, it worked once at the very beginning, and since then it always succeeds on one client and returns nil on the other client. I'm testing with a Mac and an iPhone. Sometimes it fails on the Mac, sometimes on the iPhone. On the device that it succeeds on, the provided host can be the device itself or the other one.
I created FB9583628 in August 2021, but after the Feedback Assistant team replied that they are not able to reproduce it, the feedback never went forward.
import SceneKit
import GameKit
#if os(macOS)
typealias ViewController = NSViewController
#else
typealias ViewController = UIViewController
#endif
class GameViewController: ViewController, GKMatchmakerViewControllerDelegate, GKMatchDelegate {
var match: GKMatch?
var matchStarted = false
override func viewDidLoad() {
super.viewDidLoad()
GKLocalPlayer.local.authenticateHandler = authenticate
}
private func authenticate(_ viewController: ViewController?, _ error: Error?) {
#if os(macOS)
if let viewController = viewController {
presentAsSheet(viewController)
} else if let error = error {
print(error)
} else {
print("authenticated as \(GKLocalPlayer.local.gamePlayerID)")
let viewController = GKMatchmakerViewController(matchRequest: defaultMatchRequest())!
viewController.matchmakerDelegate = self
GKDialogController.shared().present(viewController)
}
#else
if let viewController = viewController {
present(viewController, animated: true)
} else if let error = error {
print(error)
} else {
print("authenticated as \(GKLocalPlayer.local.gamePlayerID)")
let viewController = GKMatchmakerViewController(matchRequest: defaultMatchRequest())!
viewController.matchmakerDelegate = self
present(viewController, animated: true)
}
#endif
}
private func defaultMatchRequest() -> GKMatchRequest {
let request = GKMatchRequest()
request.minPlayers = 2
request.maxPlayers = 2
request.defaultNumberOfPlayers = 2
request.inviteMessage = "Ciao!"
return request
}
func matchmakerViewControllerWasCancelled(_ viewController: GKMatchmakerViewController) {
print("cancelled")
}
func matchmakerViewController(_ viewController: GKMatchmakerViewController, didFailWithError error: Error) {
print(error)
}
func matchmakerViewController(_ viewController: GKMatchmakerViewController, didFind match: GKMatch) {
self.match = match
match.delegate = self
startMatch()
}
func match(_ match: GKMatch, player: GKPlayer, didChange state: GKPlayerConnectionState) {
print("\(player.gamePlayerID) changed state to \(String(describing: state))")
startMatch()
}
func startMatch() {
let match = match!
if matchStarted || match.expectedPlayerCount > 0 {
return
}
print("starting match with local player \(GKLocalPlayer.local.gamePlayerID) and remote players \(match.players.map({ $0.gamePlayerID }))")
match.chooseBestHostingPlayer { host in
print("host is \(String(describing: host?.gamePlayerID))")
}
}
}
So I've been trying out GPTK with Elite Dangerous Horizons game and it looks like from what I can tell. The VRAM keeps going up until it goes over the limit where it drops the FPS to 1-3 FPS and then crashes the game. From the Performance HUD I can see that it looks like when using GPTK, the VRAM usage just keeps climbing and I never saw it drop down at all. I did some limited testing, and from that I think I can conclude that it is probably not a VRAM leak, but it might be caching it. The reason for this is because I noticed that if I went back to the area that I've been before. It won't increase the VRAM usage.
So either there is something wrong with the freeing VRAM memory part, or it could be that GPTK might not be reporting the right amount of VRAM available to use? So maybe that's why it keeps allocating VRAM until it went out of memory and crashed the game.
Just to test, I did try running the game with DXVK+MoltenVK combo, and I can see that it works just fine. VRAM is being freed up when it's no longer used.
Is this a known issue in some games?
Question:
I'm encountering an issue with in-app purchases (IAP) in Unity, specifically for a non-consumable product in the iOS sandbox environment. Below are the details:
Environment:
Unity Version: 2022.3.55f1 Unity In-App Purchasing
Version: v4.12.2
Device: iPhone (15, iOS 18.1.1)
Connection: Wi-Fi iOS
Settings: In-App Purchases set to “Allowed” initially Problem Behavior:
I attempted to purchase a non-consumable item for the first time. The payment is successfully completed by entering the password.
I then background the game app and navigate to the iOS Settings to set In-App Purchases to "Don't Allow."
After returning to the game and either closing or killing the app, I try to purchase the same non-consumable item again.
I checked canMakePayments() through the Apple configuration, and the app correctly detected that I could not make purchases due to the restriction.
I then navigate back to Settings and set In-App Purchases to "Allow."
Upon returning to the game, I try purchasing the non-consumable item again. A pop-up appears, saying, "You’ve already purchased this. Would you like to get it again for free?"
The issue is: Will it deduct money for the second time, and why is the system allowing the user to purchase the same non-consumable item multiple times after purchasing it once?
Is this the expected behavior for Unity In-App Purchasing, or is there something I might be missing in handling non-consumable purchases in this scenario?
Additional Information:
I’ve confirmed that the "In-App Purchases" are set to “Allowed” before attempting the purchase again.
I understand that non-consumable products should not be purchased more than once, so I’m unsure why the system is offering to let the user purchase it again.
I appreciate any insights into whether this is expected behavior or if I need to adjust how I handle the purchase flow.
In my project, I have several nodes (SCNNode) with some levels of detail (SCNLevelOfDetail) and everything works correctly, but when I add animation using morphing (SCNMorpher), the animation works correctly but without the levels of detail. Note: the entire scene is created in Autodesk 3D Studio Max and then exported in (.ASE) format.
The goal is to make animations using morphing that have some levels of detail.
Does anyone know if SCNMorpher supports geometry with some levels of detail?
I appreciate any information about this case.
Thanks everyone!!!
Part of the code I use to load geometries (SCNGeometry) with some levels of detail (SCNLevelOfDetail) using morphing (SCNMorpher).
node.morpher = [SCNMorpher new];
SCNGeometry *geometry = [self geometryWithMesh:mesh];
NSMutableArray <SCNLevelOfDetail*> *mutLevelOfDetail = [NSMutableArray arrayWithCapacity:self.mutLevelsOfDetail.count];
for (int i = 0; i < self.mutLevelsOfDetail.count; i++) {
ASCGeomObject *geomObject = self.mutLevelsOfDetail[i];
SCNGeometry *geometry = [self geometryWithMesh:geomObject.mesh.mutMeshAnimation[i]];
[mutLevelOfDetail addObject:[SCNLevelOfDetail levelOfDetailWithGeometry:geometry worldSpaceDistance:geomObject.worldSpaceDistance]];
}
geometry.levelsOfDetail = mutLevelOfDetail;
node.morpher.targets = [node.morpher.targets arrayByAddingObject:geometry];
The flushContextInternal function in glr_sync.mm:262 called abort internally. What caused this? Was it due to high device temperature or some other reason?
Date/Time: 2024-08-29 09:20:09.3102 +0800
Launch Time: 2024-08-29 08:53:11.3878 +0800
OS Version: iPhone OS 16.7.10 (20H350)
Release Type: User
Baseband Version: 8.50.04
Report Version: 104
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Triggered by Thread: 0
Thread 0 name:
Thread 0 Crashed:
0 libsystem_kernel.dylib 0x00000001ed053198 __pthread_kill + 8 (:-1)
1 libsystem_pthread.dylib 0x00000001fc5e25f8 pthread_kill + 208 (pthread.c:1670)
2 libsystem_c.dylib 0x00000001b869c4b8 abort + 124 (abort.c:118)
3 AppleMetalGLRenderer 0x00000002349f574c GLDContextRec::flushContextInternal() + 700 (glr_sync.mm:262)
4 DiSpecialDriver 0x000000010824b07c Di::RHI::onRenderFrameEnd() + 184 (RHIDevice.cpp:118)
5 DiSpecialDriver 0x00000001081b85f8 Di::Client::drawFrame() + 120 (Client.cpp:155)
2024-08-27_14-44-10.8104_+0800-07d9de9207ce4c73289507e608e5de4320d02ccf.crash
Topic:
Graphics & Games
SubTopic:
Metal
as in the environments we have real tiem reflections of movies on a screen or reflections of the surrounding hood in the background...
could i get a metallic surface getting accurate reflections of a box on top ?
i don't mean getting a rpobe or hdr cubemap, i mean the same accurate reflections as the water of the mt hood with movie i'm wacthing in other app
In this video, tile fragment shading is recommended for image processing. In this example, the unpack function takes two arguments, one of which is RasterizerData. As I understand it, this is the data passed to us from the previous stage (Vertex) of the graphics pipeline.
However, the properties of MTLTileRenderPipelineDescriptor do not include an option for specifying a Vertex function. Therefore, in this render pass, a mix of commands is used: first, a draw command is executed to obtain UV coordinates, and then threads are dispatched.
My question is: without using a draw command, only dispatch, how can I get pixel coordinates in the fragment tile function? For the kernel tile function, everything is clear.
typedef struct
{
float4 OPTexture [[ color(0) ]];
float4 IntermediateTex [[ color(1) ]];
} FragmentIO;
fragment FragmentIO Unpack(RasterizerData in [[ stage_in ]],
texture2d<float, access::sample> srcImageTexture [[texture(0)]])
{
FragmentIO out;
//...
// Run necessary per-pixel operations
out.OPTexture = // assign computed value;
out.IntermediateTex = // assign computed value;
return out;
}
I'm trying to use MTLBinaryArchive. I collected a BinaryArchive from one device and used metal-tt to translate it for all supported iPhone devices, ranging from iPhone 7 Plus to iPhone 16.
However, this BinaryArchive is quite large, around 1.5GB uncompressed, and about 500MB compressed in the IPA. I'm wondering how to address the size issue.
I watched the WWDC 2022 video, which mentioned that the operating system or app installation process would handle compatibility. Does this compatibility support different GPU chips? I tried installing an IPA with a BinaryArchive collected only from an iPhone 12 on an iPhone 13, but the BinaryArchive didn't take effect.
I also saw that Apple supports App Thinning. However, it seems that resources in the Asset Catalog cannot be accessed via URL, and creating an MTLBinaryArchive requires a URL. Is it possible for MTLBinaryArchive to be distributed through App Thinning?
The WWDC 2022 video also mentioned using the -Os optimization flag to reduce size. Can this give an estimate of how much compression it would achieve? Are there any methods to solve the BinaryArchive size issue without impacting performance?
Topic:
Graphics & Games
SubTopic:
Metal
I have this drawing app that I have been working on for the past few years when I have free time. I recently rebuilt the app in Metal to build out other brushes and improve performance, need to render 10000s of lines in realtime.
I’m running into this issue trying to create a uniform opacity per path. I have a solution but do not love it - as this is a realtime app and the solution could have some bottlenecks. If I just generate a triangle strip from touch points and do my best to smooth, resample, and handle miters I will always get some overlaps. See:
To create a uniform opacity I render to an offscreen texture with blending disabled. I then pre-multiply the color and draw that texture to a composite texture with blending on (I do this per path). This works but gets tricky when you introduce a textured brush, the edges of the texture in the frag shader cut out the line.
Pasted Graphic 1.png
Solution: I discard below a threshold
fragment float4 fragment_line(VertexOut in [[stage_in]],
texture2d<float> texture [[ texture(0) ]]) {
constexpr sampler s(coord::normalized, address::mirrored_repeat, filter::linear);
float2 texCoord = in.texCoord;
float4 texColor = texture.sample(s, texCoord);
if (texColor.a < 0.01) discard_fragment(); // may be slow (from what I read)
return in.color * texColor;
}
Better but still not perfect.
Question: I'm looking for better ways to create a uniform opacity per path. I tried .max blending but that will cause no blending of other paths. Any tips, ideas, much appreciated. If this is too detailed of a question just achieve.
Create the QRCode
CIFilter<CIBlendWithMask> *f = CIFilter.QRCodeGenerator;
f.message = [@"Message" dataUsingEncoding:NSASCIIStringEncoding];
f.correctionLevel = @"Q"; // increase level
CIImage *qrcode = f.outputImage;
Overlay the icon
CIImage *icon = [CIImage imageWithURL:url];
CGAffineTransform *t = CGAffineTransformMakeTranslation(
(qrcode.extent.width-icon.extent.width)/2.0,
(qrcode.extent.height-icon.extent.height)/2.0);
icon = [icon imageByApplyingTransform:t];
qrcode = [icon imageByCompositingOver:qrcode];
Round off the corners
static dispatch_once_t onceToken;
static CIWarpKernel *k;
dispatch_once(&onceToken, ^ {
k = [CIWarpKernel kernelWithFunctionName:name
fromMetalLibraryData:metalLibData()
error:nil];
});
CGRect iExtent = image.extent;
qrcode = [k applyWithExtent:qrcode.extent
roiCallback:^CGRect(int i, CGRect r) {
return CGRectInset(r, -radius, -radius); }
inputImage:qrcode
arguments:@[[CIVector vectorWithCGRect:qrcode.extent], @(radius)]];
…and this code for the kernel should go in a separate .ci.metal source file:
float2 bend_corners (float4 extent, float s, destination dest)
{
float2 p, dc = dest.coord();
float ratio = 1.0;
// Round lower left corner
p = float2(extent.x+s,extent.y+s);
if (dc.x < p.x && dc.y < p.y) {
float2 d = abs(dc - p);
ratio = min(d.x,d.y)/max(d.x,d.y);
ratio = sqrt(1.0 + ratio*ratio);
return (dc - p)*ratio + p;
}
// Round lower right corner
p = float2(extent.x+extent.z-s, extent.y+s);
if (dc.x > p.x && dc.y < p.y) {
float2 d = abs(dc - p);
ratio = min(d.x,d.y)/max(d.x,d.y);
ratio = sqrt(1.0 + ratio*ratio);
return (dc - p)*ratio + p;
}
// Round upper left corner
p = float2(extent.x+s,extent.y+extent.w-s);
if (dc.x < p.x && dc.y > p.y) {
float2 d = abs(dc - p);
ratio = min(d.x,d.y)/max(d.x,d.y);
ratio = sqrt(1.0 + ratio*ratio);
return (dc - p)*ratio + p;
}
// Round upper right corner
p = float2(extent.x+extent.z-s, extent.y+extent.w-s);
if (dc.x > p.x && dc.y > p.y) {
float2 d = abs(dc - p);
ratio = min(d.x,d.y)/max(d.x,d.y);
ratio = sqrt(1.0 + ratio*ratio);
return (dc - p)*ratio + p;
}
return dc;
}
Hello!
I have a question about how thread groups work with tile shading. When running "traditional" compute, I get to choose both thread group size and the grid size. However, when using tile shading kernel I only have dispatchThreadsPerTile method - this controls how many threads will be ran in each tile. So far so good, but what about thread groups?
The examples in video "Tile Shading on A11" seem to suggest that there will be only one thread group per tile. In the video, [[thread_index_in_threadgroup]] is called "local_id" and it is used to access the image block.
I assume this is the default configuration. So when one does the following:
Creates MTLRenderPassDescriptor with tileWidth set to W and tileHeight set to H
Fires up the tile shading kernel using dispatchThreadsPerTile with MTLSize size = { W, H, 1 }
I understand that the result is 1-to-1 mapping between the tile "pixels" and kernel threads. Now, what I would like to do is to have more than one thread group there. I want this for performance reasons: I have a certain compute kernel which I know executes very well with small thread group size. In fact, { 32, 1, 1 } seems to be the fastest. My understanding is that even if I set tile size to 16x16, and so I am executing 256 threads there, there will only be one SIMD group active in a thread group. Meaning that this SIMD group has to execute 8 times over the tile.
Is it possible somehow? Or perhaps the limitations of the API are pointing at the limitations of hardware itself, and if I want to execute with SIMD group sized thread groups I have to use "traditional" compute encoder?
Will be grateful for help.
Michał
Anyone else unable to download the "Rendering a Scene with Deferred Lighting in C++" (https://developer.apple.com/documentation/metal/rendering-a-scene-with-deferred-lighting-in-c++?language=objc)?
I just an error page:
Is there another place to download this sample?
Topic:
Graphics & Games
SubTopic:
Metal
I have a UIView that displays lines, and I zoom in (scale by 2 on the scroll view zoomScale variable containing the UIView).
As I zoom in, on the Mac version (Designed for IPad) I loose the graphic after a certain number of zooms (the scrollView maximumZoomScale is set at 10).
To ensure that lines are correctly represented, I modify the contentScaleFactor variable on the UIView; otherwise, the line's display is pixelated.
On the IPad (simulator and real) I do not loose the graphic when zooming.
So the Mac port of the UIView drawing is not working as the IPad version.
Everything else of the application works fine except this important details.
I already submitted a feedback request (#FB16829106) with the images showing the problem. I need a solution to this problem.
Thanks.