Post · Replies · Boosts · Views · Activity

insertDebugSignpost doesn't appear in Metal GPU Capture
I need to be able to tag each draw call with a quick string that details the shader name, draw counts, etc. In Vulkan, we have pVkCmdInsertDebugUtilsLabelEXT (and begin/end event). In DX, there's PIX setMarker (in addition to begin/endEvent). The Metal equivalent would seem to be insertDebugSignpost, but these signposts don't appear in the Metal GPU capture at all. I also tried a quick pushDebugGroup/popDebugGroup pair, but since it doesn't surround any commands, it appears to get stripped. A "marker" is needed for two reasons: to quickly tag points in code, and to replace and flatten the folder hierarchy that debug "groups" create when we want to do that. Why doesn't this Metal equivalent appear?
Replies: 0 · Boosts: 0 · Views: 534 · Activity: Mar ’21
GPU capture only reports Counters on iOS/macOS when reopening capture
Make sure GPU capture is set to "Automatically Enabled" and "Profile GPU Trace after Capture" in Xcode 12.2 and 12.4, then:

1. Run an iOS app.
2. Do a GPU capture.
3. Try to look at Counters; they're not there.
4. Save the capture out via "Export".
5. Reopen the capture, and now Counters are there.

I see the "Counters" pane show a spinner for a short time after step 2, but the Counters are never filled out. I don't want to have to exit my app to look at captures, since I need to look at multiple captures over the course of a session.
Replies: 0 · Boosts: 0 · Views: 558 · Activity: Apr ’21
Rosetta2 missing AVX and f16c ops
We can drop our compiles from AVX to SSE4.2, but we also use F16C ops to handle fp16 <-> fp32 conversions. NEON already has routines similar to F16C, so why are these missing from Rosetta2? Until we can generate universal apps, we need to fall back to running our tools under Rosetta2. It also looks like popcount is missing. These limits should be posted in Apple's Rosetta2 documentation.

Here's my MBP 16" Intel:

    sysctl -a | grep machdep.cpu.features
    machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C

And an M1 comparison:

    sysctl -a | grep machdep.cpu.features
    machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTSE64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 AES SEGLIM64
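For anyone stuck in the same spot, here is a minimal sketch of a scalar fp16 <-> fp32 fallback that could stand in for the F16C conversions when that feature bit is absent. The function names halfToFloat and floatToHalf are hypothetical, the narrowing uses round-to-nearest-even, and NaN payloads are collapsed to a single quiet NaN. This is an assumption about a workable workaround, not a description of what Rosetta2 does internally.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: expand an IEEE 754 binary16 bit pattern to a float.
// The conversion is exact for every finite half value.
float halfToFloat(uint16_t h) {
    const uint32_t sign = uint32_t(h & 0x8000u) << 16;
    const uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        if (mant == 0) {
            bits = sign;  // signed zero
        } else {
            // Subnormal half: shift until the implicit bit appears.
            uint32_t shift = 0;
            while ((mant & 0x400u) == 0) { mant <<= 1; ++shift; }
            mant &= 0x3FFu;
            bits = sign | ((113u - shift) << 23) | (mant << 13);
        }
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (mant << 13);  // infinity or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13);  // rebias 15 -> 127
    }
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Hypothetical helper: narrow a float to binary16, round-to-nearest-even.
uint16_t floatToHalf(float f) {
    uint32_t x;
    std::memcpy(&x, &f, sizeof(x));
    const uint32_t sign = (x >> 16) & 0x8000u;
    const uint32_t absx = x & 0x7FFFFFFFu;

    if (absx > 0x7F800000u) return uint16_t(sign | 0x7E00u);   // NaN -> quiet NaN
    if (absx >= 0x47800000u) return uint16_t(sign | 0x7C00u);  // >= 65536 or inf

    uint32_t h, rem, halfway;
    if (absx >= 0x38800000u) {
        // Result is a normal half: rebias the exponent, drop 13 mantissa bits.
        const uint32_t rebased = absx - 0x38000000u;
        h = rebased >> 13;
        rem = rebased & 0x1FFFu;
        halfway = 0x1000u;
    } else {
        // Result is a subnormal half or zero.
        const uint32_t exp = absx >> 23;
        if (exp < 102u) return uint16_t(sign);  // below 2^-25, rounds to zero
        const uint32_t mant = (absx & 0x007FFFFFu) | 0x00800000u;
        const uint32_t shift = 126u - exp;      // 14..24
        h = mant >> shift;
        rem = mant & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    }
    // Ties go to the even result; a carry out of the mantissa bumps the exponent.
    if (rem > halfway || (rem == halfway && (h & 1u))) ++h;
    return uint16_t(sign | h);
}
```

A bulk converter would loop these over a buffer; whether that is fast enough compared with the NEON path in a native arm64 build is something to measure rather than assume.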
Replies: 0 · Boosts: 1 · Views: 1.3k · Activity: Nov ’21
Xcode "IntelliSense" picks up on only 90% of our C++ types/calls
When we build our C++ code in Visual Studio, IntelliSense finds all of the types and functions. When we build in Xcode, it finds about 90%. There seems to be no consistent pattern to why Xcode skips some things, and a miss then daisy-chains into the next header that includes the prior one. We have a class where the If/Else function calls are picked up, but the Add calls are skipped. In one header, a struct defined in that same header isn't even highlighted as a type within it. Sources are built with GNU makefiles, but ultimately the .o and .d files are all compiled and linked together by clang using Xcode 13.3, and we use the new build system. What could we be doing wrong here? This isn't a recent problem; it has happened with all prior Xcode builds.
Replies: 0 · Boosts: 0 · Views: 860 · Activity: Mar ’22
How to determine count of big vs. little cores?
I know how to do this with macOS 12/iOS 15, but how do we determine the split on earlier OS versions? I know most phones are 2/4, but A10 is 2/2 exclusive. The new way is below, but what is the old way? Especially with Alder Lake chips using 8HT/8 configs with 24 threads, this info is important to identify.

    sysctlbyname( "hw.nperflevels", &perfLevelCount, &countSize, NULL, 0 );
    sysctlbyname( "hw.perflevel0.physicalcpu", &info.bigCores, &countSize, NULL, 0 );
    sysctlbyname( "hw.perflevel1.physicalcpu", &info.littleCores, &countSize, NULL, 0 );
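To make the calls above concrete, here is a minimal sketch that wraps them with error checking. The CoreCounts struct and the readSysctlInt/queryCoreCounts names are hypothetical. When hw.nperflevels is unavailable, the sketch only falls back to hw.physicalcpu and reports every core as "big"; that fallback is an assumption for illustration and does not recover the big/little split on older OS versions, which is the open question here.

```cpp
#include <cstddef>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical result type for this sketch.
struct CoreCounts {
    int bigCores = 0;
    int littleCores = 0;
    bool hasPerfLevels = false;  // true when hw.nperflevels was readable
};

// Reads one integer sysctl by name; returns false if it is missing.
static bool readSysctlInt(const char* name, int& value) {
#ifdef __APPLE__
    int result = 0;
    size_t size = sizeof(result);
    if (sysctlbyname(name, &result, &size, nullptr, 0) != 0) return false;
    value = result;
    return true;
#else
    (void)name;
    (void)value;
    return false;  // sysctlbyname is not available on this platform
#endif
}

bool queryCoreCounts(CoreCounts& out) {
    int levels = 0;
    if (readSysctlInt("hw.nperflevels", levels) && levels > 0) {
        out.hasPerfLevels = true;
        if (!readSysctlInt("hw.perflevel0.physicalcpu", out.bigCores)) return false;
        out.littleCores = 0;
        if (levels > 1) {
            // A single perf level (for example an Intel Mac) leaves this at 0.
            readSysctlInt("hw.perflevel1.physicalcpu", out.littleCores);
        }
        return true;
    }
    // Assumed fallback: total physical cores, with no big/little split.
    out.hasPerfLevels = false;
    out.littleCores = 0;
    return readSysctlInt("hw.physicalcpu", out.bigCores);
}
```

Callers can check hasPerfLevels to tell a real split apart from the fallback count.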
Replies: 0 · Boosts: 0 · Views: 705 · Activity: May ’22
Bootcamp not mapping keys properly.
The Windows keyboard layout tool shows the ~ key as VK_OEM_3, but when it reaches our app, we get VK_OEM_5. The same happens with the '" and | keys. These all arrive incorrectly, as if something in software is remapping them. This is on an Intel MBP 16" 2019 with the latest Windows 10 Pro. These forums don't even have a Bootcamp channel.

    // hack for bootcamp
    if ( button == VK_OEM_3 ) button = VK_OEM_7;
    else if ( button == VK_OEM_5 ) button = VK_OEM_3;
    else if ( button == VK_OEM_7 ) button = VK_OEM_5;
Replies: 0 · Boosts: 0 · Views: 1k · Activity: Apr ’23
Bootcamp Win10 problems with HDR, keys, and trackpad
I'm on a MBP 16" 2019 with HDR 10.

- latest Apple/AMD drivers don't enable HDR mode under Windows 10. Is there a monitor profile that Apple can provide/install?
- trackpad stutters in M2 chip during high cpu usage
- keyboard keys for VK_OEM3/5/7 are incorrect, mapping to one another

    static bool useBootcampHack = true;
    if ( useBootcampHack )
    {
        if ( button == VK_OEM_3 ) button = VK_OEM_7; // '" for US
        else if ( button == VK_OEM_5 ) button = VK_OEM_3; // `~ for US
        else if ( button == VK_OEM_7 ) button = VK_OEM_5; // \| for US
    }
Replies: 0 · Boosts: 0 · Views: 598 · Activity: May ’23
Clickthrough to headers broken using new build system.
When we have warnings/errors in our make-based builds, the new build system reports them with relative instead of absolute paths. When I then click to follow a warning/error to the code where it occurs, I get the "bonk" noise and Xcode doesn't take me there. My understanding is that the old build system resolved these to full paths, so they would jump to the line in the code, but the new build system just leaves them as relative paths. This mostly defeats the use of an IDE if we can't quickly review and fix issues like this. Any suggestions for fixing this? Neither the warning/error summary nor the report navigator build panel takes me to the line in FooClump.h. This is the line taken from the report navigator build pane:

    In file included from /Users/Me/MyAppFolder/FooClump.cpp:4:
    FooClump.h:30:15: warning: 'postConstructor' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
        virtual void postConstructor();
Replies: 1 · Boosts: 0 · Views: 698 · Activity: Apr ’21
UIKeyboardHIDUsageKeyboard missing left/right command keys
For keyboard handling on iOS (and iOS on macOS M1), the iOS 13.4 keyboard constants are missing the command keys. We need to be able to detect key up/down on all the modifiers. I realize there's a modifiers field on UIKey, but this seems inconsistent.

    case UIKeyboardHIDUsageKeyboardLeftShift: b = kButton_Shift; break;
    case UIKeyboardHIDUsageKeyboardRightShift: b = kButton_Shift; break;
    case UIKeyboardHIDUsageKeyboardLeftAlt: b = kButton_Alt; break;
    case UIKeyboardHIDUsageKeyboardRightAlt: b = kButton_Alt; break;
    // ? case kVK_Command: b = kButton_Command; break;
    // ? case kVK_RightCommand: b = kButton_Command; break;
    case UIKeyboardHIDUsageKeyboardLeftControl: b = kButton_Ctrl; break;
    case UIKeyboardHIDUsageKeyboardRightControl: b = kButton_Ctrl; break;
Topic: UI Frameworks · SubTopic: UIKit
Replies: 1 · Boosts: 0 · Views: 716 · Activity: Dec ’21
Any possibility of Metal supporting min/max filter reduction?
We have this on many of our platforms, but Apple doesn't appear to expose it in Metal. Nvidia/AMD have had this for a long time. We can work around it for now with a gather followed by a component min/max on a single channel. For large-scale multi-channel downsampling, having access to the sampler setting would be better. This would even work with 3D volumes, etc. The Vulkan equivalent is VK_EXT_sampler_filter_minmax. These are the three modes:

- WeightedAverage - basic nearest/bilinear/trilinear
- Min
- Max
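As a CPU-side reference for what the Min and Max modes compute, here is a minimal sketch of a 2x2 reduction over a single-channel image, which mirrors the gather-plus-component-min/max workaround described above. The Reduction enum and downsample2x2 function are hypothetical names, and the sketch assumes a row-major float image with even width and height.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical mode selector for this sketch (WeightedAverage is omitted).
enum class Reduction { Min, Max };

// Reduces each 2x2 block of a row-major, single-channel image to one texel.
// Assumes width and height are even and src holds width * height values.
std::vector<float> downsample2x2(const std::vector<float>& src,
                                 int width, int height, Reduction mode) {
    const int outWidth = width / 2;
    const int outHeight = height / 2;
    std::vector<float> dst(static_cast<std::size_t>(outWidth) * outHeight);

    for (int y = 0; y < outHeight; ++y) {
        for (int x = 0; x < outWidth; ++x) {
            // The four texels a gather would return for this footprint.
            const float a = src[(2 * y) * width + (2 * x)];
            const float b = src[(2 * y) * width + (2 * x + 1)];
            const float c = src[(2 * y + 1) * width + (2 * x)];
            const float d = src[(2 * y + 1) * width + (2 * x + 1)];

            dst[y * outWidth + x] = (mode == Reduction::Min)
                ? std::min({a, b, c, d})
                : std::max({a, b, c, d});
        }
    }
    return dst;
}
```

For example, the 4x2 image {1, 2, 3, 4, 5, 6, 7, 8} reduces to {1, 3} with Min and {6, 8} with Max.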
Replies: 1 · Boosts: 1 · Views: 838 · Activity: May ’22
Need faster Xcode Metal capture button
By the time I background the app, hit the capture button, wait for the UI popup to appear, and then hit the "capture" button in the popup, the event that I was trying to capture has already passed. Can we get a button, or a double-click on the slanted M icon, that just does the capture instead of asking me to confirm it? All told, it takes about 5s to get a capture to execute, and that is too long when running at 60 or 120Hz. I know there's programmatic capture too, but we don't have that hooked up yet.
Replies: 1 · Boosts: 0 · Views: 774 · Activity: May ’22
"Network unreachable" error trying to connect socket from iPadOS 16 back to macOS 13
This seems to have broken as of my update from iPadOS 15 to 16. Now connect() returns "Network unreachable", and the select() call times out after 10 or 30s. I have "Developer Mode" enabled on the device. We are just trying to connect the iPad back to the dev-host Mac that is running the application. When the Mac does the same connect to the same IP, connect() returns "Connect in progress", and the select() succeeds in setting up the socket. I'm using macOS 14.1 on an Intel Mac, and Xcode 14.1.
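For reference, the connect()/select() pattern described above can be sketched as follows, using plain POSIX sockets. The connectWithTimeout name is hypothetical and this is not our actual code. It shows the two places an error can surface: immediately from connect() (where ENETUNREACH, "Network unreachable", would be reported) or later via SO_ERROR once select() marks the socket writable.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns a connected socket fd, or -1 with errno set.
int connectWithTimeout(const char* ipv4, uint16_t port, int timeoutSeconds) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        errno = EINVAL;  // not a valid dotted-quad address
        return -1;
    }

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    // Non-blocking so connect() returns EINPROGRESS instead of stalling.
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    if (rc != 0 && errno != EINPROGRESS) {
        int saved = errno;  // e.g. ENETUNREACH fails right here
        close(fd);
        errno = saved;
        return -1;
    }

    if (rc != 0) {
        fd_set writeSet;
        FD_ZERO(&writeSet);
        FD_SET(fd, &writeSet);
        timeval tv{};
        tv.tv_sec = timeoutSeconds;
        tv.tv_usec = 0;

        int ready = select(fd + 1, nullptr, &writeSet, nullptr, &tv);
        if (ready <= 0) {
            int saved = (ready == 0) ? ETIMEDOUT : errno;
            close(fd);
            errno = saved;
            return -1;
        }

        // Writable does not mean connected; read the deferred result.
        int err = 0;
        socklen_t len = sizeof(err);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0 || err != 0) {
            int saved = (err != 0) ? err : errno;
            close(fd);
            errno = saved;
            return -1;
        }
    }
    return fd;
}
```

Logging errno at each of the failure points makes it clearer whether the failure is immediate or deferred to the select() stage.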
Replies: 1 · Boosts: 0 · Views: 1.4k · Activity: Dec ’22
macOS 10.15 deployment breaks monkeypatching C++ vtable
How do we get this code to not crash? This was working up until we bumped our macOS deployment to 10.15. When deployment is set to macOS 10.14, the code works fine. Data is nearly identical in the debugger, although the vtable is at a slightly lower address in 10.15. Have the C++ vtables been put in read-only marked pages, and if so, how do we prevent that? Hardened runtime is not enabled, and I don't recall any mention from Apple about this change.

    #import <Foundation/Foundation.h>

    class Base
    {
    public:
        virtual ~Base() {};
        virtual const char* Print() { return "Base"; }
    };

    class Derived : public Base
    {
    public:
        virtual const char* Print() override { return "Derived"; }
    };

    const char* PrintPatch( Base* localThis ) { return "Patch"; }

    template <class T1, class T2>
    void* PatchVtablePtr( void** vtable, T1 memberFunction, T2 newFunction )
    {
        // Replace the instance of memberFunction in the vtable with newFunction
        void* offset = *(void**)&memberFunction;
        auto vtableIndex = (uintptr_t)offset / sizeof(void*);
        vtable[vtableIndex] = (void*)newFunction; // <- Thread 1: EXC_BAD_ACCESS (code=2, address=0x100004038)

        // return the original vtable address
        return offset;
    }

    int main(int argc, const char * argv[])
    {
        Derived* derived = new Derived();
        printf("%s\n", derived->Print());

        // monkeypatch the vtable
        //Base* base = derived;
        //void** vtable = *(void***)base;
        void** vtable = *(void***)derived;
        PatchVtablePtr( vtable, &Derived::Print, &PrintPatch );

        printf("%s\n", derived->Print());
        return 0;
    }
Replies: 2 · Boosts: 0 · Views: 1k · Activity: Apr ’21
GPU capture should display draw call after pushDebugGroup/commands
The push/popDebugGroup calls are captured by GPU capture and displayed as a folder around a series of draw calls. But when you select the folder, the results and attachments of the previous draw call are displayed. This makes walking through a deep hierarchy of draw calls confusing, especially for people new to GPU capture. It should be a simple change: selecting a folder like this, or any command after a draw, should really display the results from the next draw call instead of the previous one.
Replies: 2 · Boosts: 0 · Views: 756 · Activity: May ’21