Post

Replies

Boosts

Views

Activity

Any recent changes to dlopen() implementation?
In some recent releases of macos (14.x and 15.x), we have noticed what seems to be a slower dlopen() implementation. I don't have any numbers to support this theory. I happened to notice this "slowness" when investigating something unrelated. In one part of the code we have a call of the form: const char * fooBarLib = ....; dlopen(fooBarLib, RTLD_NOW + RTLD_GLOBAL); It so happened that due to some timing related issues, the process was crashing. A slow execution of code in this part of the code would trigger an issue in some other part of the code that would then lead to a process crash. The crash itself isn't a concern, because it's an internal issue that will addressed in the application code. What was interesting is that the slowness appears to be contributed by the call to dlopen(). Specifically, whenever a slowness was observed, the crash reports showed stack frames of the form: Thread 1: 0 dyld 0x18f08b5b4 _kernelrpc_mach_vm_protect_trap + 8 1 dyld 0x18f08f540 vm_protect + 52 2 dyld 0x18f0b87e0 lsl::MemoryManager::writeProtect(bool) + 204 3 dyld 0x18f0a7fe4 invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 932 4 dyld 0x18f0e629c invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 172 5 dyld 0x18f0d9c38 invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 496 6 dyld 0x18f08c2dc dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 300 7 dyld 0x18f0d8bcc dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192 8 dyld 0x18f0db5a0 dyld3::MachOFile::forEachInitializerPointerSection(Diagnostics&, void (unsigned int, unsigned int, bool&) block_pointer) const + 160 9 dyld 0x18f0e5f90 dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 432 10 dyld 0x18f0a7bb4 dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 176 11 dyld 0x18f0af190 dyld4::JustInTimeLoader::runInitializers(dyld4::RuntimeState&) const + 36 12 dyld 0x18f0a8270 dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 312 13 dyld 0x18f0ac560 dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_0::operator()() const + 180 14 dyld 0x18f0a8460 dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 412 15 dyld 0x18f0c089c dyld4::APIs::dlopen_from(char const*, int, void*) + 2432 16 libjli.dylib 0x1025515b4 DoFooBar + 56 17 libjli.dylib 0x10254d2c0 Hello_World_Launch + 1160 18 helloworld 0x10250bbb4 main + 404 19 libjli.dylib 0x102552148 apple_main + 88 20 libsystem_pthread.dylib 0x18f4132e4 _pthread_start + 136 21 libsystem_pthread.dylib 0x18f40e0fc thread_start + 8 So, out of curiosity, have there been any known changes in the implementation of dlopen() which might explain the slowness? Like I noted, I don't have concrete numbers, but to quantify the slowness I don't think it's slower by a noticeable amount - maybe a few milli seconds. I guess what I am trying to understand is, whether there's anything that needs attention here.
3
0
457
Mar ’25
Process with equal instances but unequal identities
I am looking at some logs that I collected through sysdiagnose and I notice several messages of the form: ... fault 2025-03-05 01:12:04.034832 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=86764 AUID=502> and <anon<java>(502)(0) pid=86764> fault 2025-03-05 01:15:05.829696 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88001 AUID=502> and <anon<java>(502)(0) pid=88001> fault 2025-03-05 01:15:06.047003 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88010 AUID=502> and <anon<java>(502)(0) pid=88010> fault 2025-03-05 01:15:06.385648 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88012 AUID=502> and <anon<java>(502)(0) pid=88012> fault 2025-03-05 01:15:07.135896 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88019 AUID=502> and <anon<java>(502)(0) pid=88019> fault 2025-03-05 01:15:07.491316 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88021 AUID=502> and <anon<java>(502)(0) pid=88021> fault 2025-03-05 01:15:07.542102 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88022 AUID=502> and <anon<java>(502)(0) pid=88022> fault 2025-03-05 01:15:07.803126 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88025 AUID=502> and <anon<java>(502)(0) pid=88025> fault 2025-03-05 01:15:59.774214 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88568 AUID=502> and <anon<java>(502)(0) pid=88568> fault 2025-03-05 01:16:00.142288 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88572 AUID=502> and <anon<java>(502)(0) pid=88572> fault 2025-03-05 01:16:00.224019 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88573 AUID=502> and <anon<java>(502)(0) pid=88573> fault 2025-03-05 01:16:01.180670 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88580 AUID=502> and <anon<java>(502)(0) pid=88580> fault 2025-03-05 01:16:01.879884 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88588 AUID=502> and <anon<java>(502)(0) pid=88588> fault 2025-03-05 01:16:02.233165 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88589 AUID=502> and <anon<java>(502)(0) pid=88589> ... What's strange is that each of the message seems to say that it has identified two instances with unequal identities and yet it prints the same process for each such message. Notice: fault 2025-03-05 01:16:02.233165 +0000 runningboardd Two equal instances have unequal identities. <anon<java>(502) pid=88589 AUID=502> and <anon<java>(502)(0) pid=88589> I suspect the identity it is talking about is the one explained as designated requirement here https://developer.apple.com/documentation/Technotes/tn3127-inside-code-signing-requirements#Designated-requirement. Yet the message isn't clear why the same process would have two different identities. The type of this message is "fault", so I'm guessing that this message is pointing to some genuine issue with the executable of the process. Is that right? Any inputs on what could be wrong here? This is from a 15.3.1 macosx aarch64 system. On that note, is runningboardd the process which is responsible for these identity checks?
6
0
414
Mar ’25
macos entitlements - com.apple.security.cs.allow-unsigned-executable-memory vs com.apple.security.cs.allow-jit
In context of entitlements that are applicable on macos platform, I was discussing in another thread about the com.apple.security.cs.allow-unsigned-executable-memory and the com.apple.security.cs.allow-jit entitlements in a hardened runtime https://developer.apple.com/forums/thread/775520?answerId=827440022#827440022 In that thread it was noted that: The hardened runtime enables a bunch of additional security checks. None of them are related to networking. Some of them are very important to a Java VM author, most notably the com.apple.security.cs.allow-jit -> com.apple.security.cs.allow-unsigned-executable-memory -> com.apple.security.cs.disable-executable-page-protection cascade. My advice on that front: This sequence is a trade off between increasing programmer convenience and decreasing security. com.apple.security.cs.allow-jit is the most secure, but requires extra work in your code. Only set one of these entitlements, because each is a superset of its predecessor. com.apple.security.cs.disable-executable-page-protection is rarely useful. Indeed, on Apple silicon [1] it’s the same as com.apple.security.cs.allow-unsigned-executable-memory. If you want to investigate moving from com.apple.security.cs.allow-unsigned-executable-memory to com.apple.security.cs.allow-jit, lemme know because there are a bunch of additional resources on that topic. What that tells me is that com.apple.security.cs.allow-jit is the recommended entitlement that retains enough security and yet provides the necessary programmer convenience for applications. In the OpenJDK project we use both com.apple.security.cs.allow-unsigned-executable-memory and com.apple.security.cs.allow-jit entitlements for the executables shipped in the JDK (for example java). I was told in that other thread that it might be possible to just use the com.apple.security.cs.allow-unsigned-executable-memory, but there are some additional details to consider. I'm starting this thread to understand what those details are.
3
0
399
Mar ’25
macOS 26 Beta - man page of sw_vers is not accurate
A few minutes back I filed a feedback assistant issue for this (FB18173706), but I am not sure I filed it in the correct category and I can't find a way to edit it either. So posting this message here just to have to assigned in the right category if appropriate. The issue is as follows. On macOS 26 Tahoe Beta, "man sw_vers" has this among other details: Previous versions of sw_vers respected the SYSTEM_VERSION_COMPAT environment variable to provide compatibility fallback versions for scripts which did not support the macOS 11.0+ version transition. This is no longer supported, versions returned by sw_vers will always reflect the real system version. It says that SYSTEM_VERSION_COMPAT is no longer supported. That doesn't look right, because running sw_vers as follows on macOS 26 Beta results in: SYSTEM_VERSION_COMPAT=1 sw_vers ProductName: macOS ProductVersion: 16.0 BuildVersion: 25A5279m i.e. setting the environment variable SYSTEM_VERSION_COMPAT=1 results in sw_vers reporting the version as 16.0. Now try with SYSTEM_VERSION_COMPAT=0, and the result is: SYSTEM_VERSION_COMPAT=0 sw_vers ProductName: macOS ProductVersion: 26.0 BuildVersion: 25A5279m notice the output says 26.0. So it appears that SYSTEM_VERSION_COMPAT is supported even on macOS 26. I think the man page requires an update to match this behaviour.
5
0
191
Aug ’25
Where is the instruments command line tool?
I was reading through this documentation about instruments command line tool https://help.apple.com/instruments/mac/current/#/devb14ffaa5 and how it can be launched from the command line. However, unlike what the documentation states, there's no such instruments command anywhere on my macos M1 (OS version 15.6). That command gives: $> instruments zsh: command not found: instruments I do have XCode installed which has the Instruments.App (GUI app) but not the command line utility: $> ls Xcode.app/Contents/Applications/ ... Instruments.app Is that linked documentation up-to-date (it does say "latest" in the URL)? Is there some other way to install this command line utility?
3
0
231
Aug ’25
macos 15.6.1 - BSD sendto() fails for IPv4-mapped IPv6 addresses
There appears to be some unexplained change in behaviour in the recent version of macos 15.6.1 which is causing the BSD socket sendto() syscall to no longer send the data when the source socket is bound to a IPv4-mapped IPv6 address. I have attached a trivial native code which reproduces the issue. What this reproducer does is explained as a comment on that code's main() function: // Creates a AF_INET6 datagram socket, marks it as dual socket (i.e. IPV6_V6ONLY = 0), // then binds the socket to a IPv4-mapped IPv6 address (chosen on the host where this test runs). // // The test then uses sendto() to send some bytes. For the sake of this test, it uses the same IPv4-mapped // IPv6 address as the destination address to sendto(). The test then waits for (a maximum of) 15 seconds to // receive that sent message by calling recvfrom(). // // The test passes on macos (x64 and aarch64) hosts of versions 12.x, 13.x, 14.x and 15.x upto 15.5. // Only on macos 15.6.1 and the recent macos 26, the test fails. Specifically, the first message that is // sent using sendto() is never sent (and thus the recvfrom()) times out. sendto() however returns 0, // incorrectly indicating a successful send. Interesting, if you repeat sendto() a second message from the // same bound socket to the exact same destination address, the send message is indeed correctly sent and // received immediately by the recvfrom(). It's only the first message which goes missing (the test uses // unique content in each message to be sure which exact message was received and it has been observed that // only the second message is received and the first one lost). // // Logs collected using "sudo log collect --last 2m" (after the test program returns) shows the following log // message, which seem relevant: // ... // default kernel cfil_hash_entry_log:6088 <CFIL: Error: sosend_reinject() failed>: // [86868 a.out] <UDP(17) out so 59faaa5dbbcef55d 127846646561221313 127846646561221313 age 0> // lport 65051 fport 65051 laddr 192.168.1.2 faddr 192.168.1.2 hash 201AAC1 // default kernel cfil_service_inject_queue:4472 CFIL: sosend() failed 22 // ... // As noted, this test passes without issues on various macosx version (12 through 15.5), both x64 and aarch64 but always fails against 15.6.1. I have been told that it also fails on the recently released macos 26 but I don't have access to such host to verify it myself. The release notes don't usually contain this level of detail, so it's hard to tell if something changed intentionally or if this is a bug. Should I report this through the feedback assistant? Attached is the source of the reproducer, run it as: clang dgramsend.c ./a.out On macos 15.6.1, you will see that it will fail to send (and thus receive) the message on first attempt but the second one passes: ... created and bound a datagram dual socket to ::ffff:192.168.1.2:65055 ::ffff:192.168.1.2:65055 sendto() ::ffff:192.168.1.2:65055 ---- Attempt 1 ---- sending greeting "hello 1" sendto() succeeded, sent 8 bytes calling recvfrom() receive timed out --------------------- ---- Attempt 2 ---- sending greeting "hello 2" sendto() succeeded, sent 8 bytes calling recvfrom() received 8 bytes: "hello 2" --------------------- TEST FAILED ... The output "log collect --last 2m" contains a related error (and this log message consistently shows up every time you run that reproducer): ... default kernel cfil_hash_entry_log:6088 <CFIL: Error: sosend_reinject() failed>: [86248 a.out] <UDP(17) out so 59faaa5dbbcef55d 127846646561221313 127846646561221313 age 0> lport 65055 fport 65055 laddr 192.168.1.2 faddr 192.168.1.2 hash 201AAC1 default kernel cfil_service_inject_queue:4472 CFIL: sosend() failed 22 ... I don't know what it means though. dgramsend.c
2
0
227
Sep ’25
Debugging a crashing ksh?
I have a setup where /bin/ksh constantly crashes. The generated crash logs available at ~/Library/Logs/DiagnosticReports aren't too helpful because the content there contains only hex references to the ksh stack backtrace. I then ran "dtruss" against the crashing /bin/ksh program and even that isn't helping narrow it down because that too uses hex digits in the stack backtrace. What would be the right way to debug this crashing ksh program? Are there debug symbols available for the /bin/ksh binary shipped in macos M1 (13.0.1 version)? I found some documentation about using XCode to "symbolicate" the crash logs, but following those instructions too wasn't helpful because even after loading the crash report in the XCode "Device Logs" window, it still only showed hex references.
1
0
650
Feb ’23
UDP socket bind with ephemeral port on macos results in OS allocating a already bound/in-use port
We have been observing an issue where when binding a UDP socket to an ephemeral port (i.e. port 0), the OS ends up allocating a port which is already bound and in-use. We have been seeing this issue across all macos versions we have access to (10.x through recent released 13.x). Specifically, we (or some other process) create a udp4 socket bound to wildcard and ephemeral port. Then our program attempts a bind on a udp46 socket with ephemeral port. The OS binds this socket to an already in use port, for example you can see this netstat output when that happens: netstat -anv -p udp | grep 51630 udp46 0 0 *.51630 *.* 786896 9216 89318 0 00000 00000000 00000000001546eb 00000000 00000800 1 0 000001 udp4 0 0 *.51630 *.* 786896 9216 89318 0 00000 00000000 0000000000153d9d 00000000 00000800 1 0 000001 51630 is the (OS allocated) port here, which as you can see has been allocated to 2 sockets. The process id in this case is the same (because we ran an explicit reproducer to reproduce this), but it isn't always the case. We have a reproducer which consistenly shows this behaviour. Before filing a feedback assistant issue, I wanted to check if this indeed appears to be an issue or if we are missing something here, since this appears to be a very basic thing.
6
1
1.6k
Jul ’24
XCode 16 clang++ compiler generates unexpected results for conditional checks at -O2 and -O3 optimization levels
Around a month back, developers of the OpenJDK project, when using XCode 16 to build the JDK started noticing odd failures when executing code which was compiled using the clang++ compiler shipped in that XCode 16 release (details in https://bugs.openjdk.org/browse/JDK-8340341). Specifically, a trivial for loop in a c++ code of the form: int limit = ... // method local variable for (i=0; i<limit; i++) { ... } ends up iterating more times than the specified limit. The "i<limit" returns true even when it should have returned false. In fact, debug log messages within the for loop of the form: fprintf(stderr, "parsing %d of %d, %d < % d == %s", i, limit, i, limit, (i<limit) ? "true" : "false"); would show output of the form: parsing 0 of 2, 0 < 2 == true parsing 1 of 2, 1 < 2 == true parsing 2 of 2, 2 < 2 == true Notice, how it entered the for loop even when 2 < 2 should have prevented it from entering it. Furthermore, notice the message says 2 < 2 == true (which clearly isn't right). This happens when that code is compiled with optimization level -O2 or -O3. The issue doesn't happen with -O1. I had reported this as an issue to Apple through feedback assistance, more than a month back. The feedback id is FB15162411. There hasn't been any response to it nor any indication that the issue has been noticed and can be reproduced (the steps to reproduce have been provided in that issue). In the meantime, more and more users are now running into this failure in JDK when using XCode 16. We haven't put any workaround in place (the only workaround we know of is using -O1 for the compilation of this file) because it isn't clear what exactly is causing this issue (other than the fact that it shows up with specific optimization levels). It's also unknown if this bug has wider impact. Would it be possible to check if FB15162411 is being looked into and any technical details on what's causing this? That would help us decide if it's OK to put in place a temporary workaround in the OpenJDK build and how long to maintain that workaround. For reference, this was reproduced on: clang++ --version Apple clang version 16.0.0 (clang-1600.0.26.3) Target: arm64-apple-darwin23.6.0 Thread model: posix InstalledDir: /xcode-16/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
13
1
2.1k
Dec ’24
Cannot access profile page of individual DTS engineers
Until recently, it was possible to view the profiles of every user who posted in the forums. The profile page would then have links to posts/replies and other pages so that one could "follow" the recent comments by those users. However, it appears that it no longer is possible to view a profile of individual "DTS Engineer". What I mean is I can no longer view their profile page and as a result the recent posts/replies that a specific "DTS Engineer" makes. It's especially a loss because posts made by such engineers are very helpful and valuable and not being able to easily follow such posts on a single page makes the forum software less useful. Is there a way the previous feature can be brought back?
0
1
323
Feb ’25
Instructions for debugging recent macos kernel versions?
Is there any recent and a bit authoritative documentation which explains how to debug recent versions of macos kernel? I have found some blog posts from other users but those are either outdated or don't work for some other reason. I am guessing kernel debugging is pretty common for developers working on macos itself, so I'm hoping someone in this forum would have some working instructions for that.
9
1
315
Oct ’25