Develop kernel-resident device drivers and kernel extensions using Kernel.

Posts under Kernel tag

48 Posts
Sort by:

Post

Replies

Boosts

Views

Activity

KEXT Code Signing Problems
On modern systems all KEXTs must be code signed with a Developer ID. Additionally, the Developer ID must be specifically enabled for KEXT development. You can learn more about that process on the Developer ID page. If your KEXT is having code signing problems, check that it’s signed with a KEXT-enabled Developer ID. Do this by looking at the certificate used to sign the KEXT. First, extract the certificates from the signed KEXT: % codesign -d --extract-certificates MyKEXT.kext Executable=/Users/quinn/Desktop/MyKEXT/build/Debug/MyKEXT.kext/Contents/MacOS/MyKEXT This creates a bunch of certificates of the form codesignNNN, where NNN is a number in the range from 0 (the leaf) to N (the root). For example: % ls -lh codesign* -rw-r--r--+ 1 quinn staff 1.4K 20 Jul 10:23 codesign0 -rw-r--r--+ 1 quinn staff 1.0K 20 Jul 10:23 codesign1 -rw-r--r--+ 1 quinn staff 1.2K 20 Jul 10:23 codesign2 Next, rename each of those certificates to include the .cer extension: % for i in codesign*; do mv $i $i.cer; done Finally, look at the leaf certificate (codesign0.cer) to see if it has an extension with the OID 1.2.840.113635.100.6.1.18. The easiest way to view the certificate is to use Quick Look in Finder. Note If you’re curious where these Apple-specific OIDs comes from, check out the documents on the Apple PKI page. In this specific case, look at section 4.11.3 Application and Kernel Extension Code Signing Certificates of the Developer ID CPS. If the certificate does have this extension, there’s some other problems with your KEXT’s code signing. In that case, feel free to create a new thread here on DevForums with your details. If the certificate does not have this extension, there are two possible causes: Xcode might be using an out-of-date signing certificate. Re-create your Developer ID signing certificate using the developer site and see if the extension shows up there. If so, you’ll have to investigate why Xcode is not using the most up-to-date signing certificate. If a freshly-created Developer ID signing certificate does not have this extension, you need to apply to get your Developer ID enabled for KEXT development per the instructions on the Developer ID page. Share and Enjoy — Quinn “The Eskimo!” @ Developer Technical Support @ Apple let myEmail = "eskimo" + "1" + "@" + "apple.com" Change history: 20 Jul 2016 — First published. 28 Mar 2019 — Added a link to the Apple PKI site. Other, minor changes. 15 Mar 2022 — Fixed the formatting. Updated the section number in the Developer ID CPS. Made other minor editorial changes.
0
0
6.6k
Mar ’22
VNOP_MONITOR+vnode_notify() operation details
After perusing the sources of Apple's SMB and NFS clients' implementation of VNOP_MONITOR, my understanding of how VNOP_MONITOR+vnode_notify() operate is as follows: A user-space process advertises an interest in monitoring a file or directory via kqueue(2)/kevent(2). VFS calls the filesystem's implementation of VNOP_MONITOR. VNOP_MONITOR forwards the commencing or terminating of monitoring events request to the filesystem server. Network filesystem client nodes call vnode_notify() to notify the underlying VFS of a filesystem event, e.g. file/directory creation/removal, etc. What I'm still vague about is how does the server communicate back to client nodes that an event of interest has occurred? I'd appreciate being enlightened on the operation of `VNOP_MONITOR+vnode_notify()' in a network filesystem setting.
1
0
68
6d
Support for Multi-Homed IPv6 Networks, esp. RFC 8028
Hi everyone, I’m running a dual-homed IPv6-mostly LAN where two on-link routers advertise distinct global Provider-Assigned prefixes (one per ISP). On Linux, the host stack appears to follow RFC 8028. It keeps one default route per prefix, and packets appear to leave through a router that recognises their source address and pass ISP BCP 38 (https://datatracker.ietf.org/doc/bcp38/) checks. On macOS Sequoia, I'm only seeing a single un-scoped default route. As a result, traffic sourced from prefix B often exits via router A and is dropped upstream. Questions: Is the single-default-per-interface model in macOS an intentional design choice or simply legacy behaviour that has not yet been updated to RFC 8028? Does the kernel perform any hidden next-hop selection that isn’t reflected in netstat -rn output? Are there any road-map items for fully adopting RFC 8028 in macOS? As a bonus, I'd be very interested in any info you might be able to provide on the status of implementation/support for https://datatracker.ietf.org/doc/html/rfc8978 (Reaction of IPv6 Stateless Address Autoconfiguration (SLAAC) to Flash-Renumbering Events).
2
0
54
3w
No KDKs available for macOS 26.0 Developer Beta 2 and later
As of now, there is no Kernel Debug Kit (KDK) available for macOS 26.0 Developer Betas after the first build. Kernel Debug Kits are crucial for understanding panics and other bugs within custom Kernel Extensions. Without the KDK for the corresponding macOS version, tools like kmutil fail to recognize a KDK and certain functions are disabled. Additionally, as far as I am aware, a KDK for one build of macOS isn't able to be used on a differing build. Especially since this is a developer beta, where developers are updating their software to function with the latest versions of macOS, I'd expect a KDK to be available for more than one build.
2
0
222
4w
Kext loads well after launchd and early os_log entries rarely appear in unified log
Is there a way to ensure a kernel extension in the Auxiliary Kernel Collection loads (and runs its start routines) before launchd? I'm emitting logs via os_log_t created with an os_log_create (custom subsystem/category) in both my KMOD's start function and the IOService::start() function. Those messages-- which both say "I've been run"-- inconsistently show up in log show --predicate 'subsystem == "com.bluefalconhd.pandora"' --last boot, which makes me think they are running very early. However, I also record timestamps (using mach_absolute_time, etc.) and expose them to user space through an IOExternalMethod. The results (for the most recent boot): hayes@fortis Pandora/tests main % build/pdtest Pandora Metadata: kmod_start_time: Time: 2025-07-22 14:11:32.233 Mach time: 245612546 Nanos since boot: 10233856083 (10.23 seconds) io_service_start_time: Time: 2025-07-22 14:11:32.233 Mach time: 245613641 Nanos since boot: 10233901708 (10.23 seconds) user_client_init_time: Time: 2025-07-22 14:21:42.561 Mach time: 14893478355 Nanos since boot: 620561598125 (620.56 seconds) hayes@fortis Pandora/tests main % ps -p 1 -o lstart= Tue Jul 22 14:11:27 2025 Everything in the kernel extension appears to be loading after launchd (PID 1) starts. Also, the kext isn't doing anything crazy which could cause that kind of delay. For reference, here's the Info.plist: <?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>CFBundleExecutable</key> <string>Pandora</string> <key>CFBundleIdentifier</key> <string>com.bluefalconhd.Pandora</string> <key>CFBundleName</key> <string>Pandora</string> <key>CFBundlePackageType</key> <string>KEXT</string> <key>CFBundleVersion</key> <string>1.0.7</string> <key>IOKitPersonalities</key> <dict> <key>Pandora</key> <dict> <key>CFBundleIdentifier</key> <string>com.bluefalconhd.Pandora</string> <key>IOClass</key> <string>Pandora</string> <key>IOMatchCategory</key> <string>Pandora</string> <key>IOProviderClass</key> <string>IOResources</string> <key>IOResourceMatch</key> <string>IOKit</string> <key>IOUserClientClass</key> <string>PandoraUserClient</string> </dict> </dict> <key>OSBundleLibraries</key> <dict> <key>com.apple.kpi.dsep</key> <string>24.2.0</string> <key>com.apple.kpi.iokit</key> <string>24.2.0</string> <key>com.apple.kpi.libkern</key> <string>24.2.0</string> <key>com.apple.kpi.mach</key> <string>24.2.0</string> </dict> </dict> </plist> My questions are: A. Why don't the early logs (from KMOD's start function and IOService::start) consistently appear in the unified log, while logs later in IOExternalMethods do? B. How can I force this kext to load earlier-- ideally before launchd? Thanks in advance for any guidance!
0
0
56
Jul ’25
Hardlinks reported as non-existing on macOS Sequoia for 3rd party FS
After creating a hardlink on a distributed filesystem of my own via: % ln f.txt hlf.txt Neither the original file, f.txt, nor the hardlink, hlf.txt, are immediately accessible, e.g. via cat(1) with ENOENT returned. A short time later though, both the original file and the hardlink are accessible. Both files can be stat(1)ed though, which confirms that vnop_getattr returns success for both files. Dtruss(1) indicates it's the open(2) syscall that fails: % sudo dtruss -f cat hlf.txt 2038/0x4f68: open("hlf.txt\0", 0x0, 0x0) = -1 Err#2 ;ENOENT 2038/0x4f68: write_nocancel(0x2, "cat: \0", 0x5) = 5 0 2038/0x4f68: write_nocancel(0x2, "hlf.txt\0", 0x7) = 7 0 2038/0x4f68: write_nocancel(0x2, ": \0", 0x2) = 2 0 2038/0x4f68: write_nocancel(0x2, "No such file or directory\n\0", 0x1A) = 26 0 Dtrace(1)ing my KEXT no longer works on macOS Sequoia, so based on the diagnostics print statements I inserted into my KEXT, the following sequence of calls is observed: vnop_lookup(hlf.txt) -&gt; EJUSTRETURN ;ln(1) vnop_link(hlf.txt) -&gt; KERN_SUCCESS ;ln(1) vnop_lookup(hlf.txt) -&gt; KERN_SUCCESS ;cat(1) vnop_open(/) ; I expected to see vnop_open(hlf.txt) here instead of the parent directory. Internally, hardlinks are created in vnop_link via a call to vnode_setmultipath with cache_purge_negatives called on the destination directory. On macOS Monterey for example, where the same code does result in hardlinks being accessible, the following calls are made: vnop_lookup(hlf.txt) -&gt; EJUSTRETURN ;ln(1) vnop_link(hlf.txt) -&gt; KERN_SUCCESS ;ln(1) vnop_lookup(hlf.txt) -&gt; KERN_SUCCESS ;cat(1) vnop_open(hlf.txt) -&gt; KERN_SUCCESS ;cat(1) Not sure how else to debug this. Perusing the kernel sources for uses of VISHARDLINK, VNOP_LINK and vnode_setmultipath call sites did not clear things up for me. Any pointers would be greatly appreciated.
3
0
210
Jul ’25
FSKit caching by kernel and performance
I've faced with some performance issues developing my readonly filesystem using fskit. For below screenshot: enumerateDirectory returns two hardcoded items, compiled with release config 3000 readdirsync are done from nodejs. macos 15.5 (24F74) I see that getdirentries syscall takes avg 121us. Because all other variables are minimised, it seems like it's fskit<->kernel overhead. This itself seems like a big number. I need to compare it with fuse though to be sure. But what fuse has and fskit seams don't (I checked every page in fskit docs) is kernel caching. Fuse supports: caching lookups (entry_timeout) negative lookups (entry_timeout) attributes (attr_timeout) readdir (via opendir cache_readdir and keep_cache) read and write ops but thats another topic. And afaik it works for both readonly and read-write file systems, because kernel can assume (if client is providing this) that cache is valid until kernel do write operations on corresponding inodes (create, setattr, write, etc). Questions are: is 100+us reasonable overhead for fskit? is there any way to do caching by kernel. If not currently, any plans to implement? Also, additional performance optimisation could be done by providing lower level api when we can operate with raw inodes (Uint64), this will eliminate overhead from storing, removing and retrieving FSItems in hashmap.
2
1
145
Jul ’25
OpenDirectory module causes bootloop (kernel panic) on restart
With macOS 15, and DSPlugin support removal we searched for an alternative method to be able to inject users/groups into the system dynamically. We tried to write an OpenDirectory XPC based module based on the documentation and XCode template which can be found here: https://developer.apple.com/library/archive/releasenotes/NetworkingInternetWeb/RN_OpenDirectory/chapters/chapter-1.xhtml.html It is more or less working, until I restart the computer: then macOS kernel panics 90% of the time. When the panic occurs, our code does not seem to get run at all, I only see my logs in the beginning of main() when the machine successfully starts. I have verified this also by logging to file. Also tried replacing the binary with eg a shell script, or a "return 0" empty main function, that also triggers the panic. But, if I remove my executable (from /Library/OpenDirectory/Modules/com.quest.vas.xpc/Contents/MacOS/com.quest.vas), that saves the day always, macOS boots just fine. Do you have an idea what can cause this behavior? I can share the boot logs for the boot loops and/or panic file. Do you have any other way (other than OpenDirectory module) to inject users/groups into the system dynamically nowadays? (MDM does not seem a viable option for us)
3
0
134
Jul ’25
When implementing a custom Mach exception handler, all recovery operations for SIGBUS/SIGSEGV except the first attempt will fail.
Recovery operations for signals SIGBUS/SIGSEGV fail when the process intercepts Mach exceptions. Only the first recovery attempt succeeds, and subsequent Signal notifications are no longer received within the process. I think this is a bug in XNU. The test code main.c is: If we comment out AddMachExceptionServer;, everything will return to normal. #include &lt;fcntl.h&gt; #include &lt;mach/arm/kern_return.h&gt; #include &lt;mach/kern_return.h&gt; #include &lt;mach/mach.h&gt; #include &lt;mach/message.h&gt; #include &lt;mach/port.h&gt; #include &lt;pthread.h&gt; #include &lt;setjmp.h&gt; #include &lt;signal.h&gt; #include &lt;stdbool.h&gt; #include &lt;stdio.h&gt; #include &lt;stdlib.h&gt; #include &lt;string.h&gt; #include &lt;sys/_types/_mach_port_t.h&gt; #include &lt;sys/mman.h&gt; #include &lt;sys/types.h&gt; #include &lt;unistd.h&gt; #pragma pack(4) typedef struct { mach_msg_header_t header; mach_msg_body_t body; mach_msg_port_descriptor_t thread; mach_msg_port_descriptor_t task; NDR_record_t NDR; exception_type_t exception; mach_msg_type_number_t codeCount; integer_t code[2]; /** Padding to avoid RCV_TOO_LARGE. */ char padding[512]; } MachExceptionMessage; typedef struct { mach_msg_header_t header; NDR_record_t NDR; kern_return_t returnCode; } MachReplyMessage; #pragma pack() static jmp_buf jump_buffer; static void sigbus_handler(int signo, siginfo_t *info, void *context) { printf("Caught SIGBUS at address: %p\n", info-&gt;si_addr); longjmp(jump_buffer, 1); } static void *RunExcServer(void *userdata) { kern_return_t kr = KERN_FAILURE; mach_port_t exception_port = MACH_PORT_NULL; kr = mach_port_allocate(mach_task_self_, MACH_PORT_RIGHT_RECEIVE, &amp;exception_port); if (kr != KERN_SUCCESS) { printf("mach_port_allocate: %s", mach_error_string(kr)); return NULL; } kr = mach_port_insert_right(mach_task_self_, exception_port, exception_port, MACH_MSG_TYPE_MAKE_SEND); if (kr != KERN_SUCCESS) { printf("mach_port_insert_right: %s", mach_error_string(kr)); return NULL; } kr = task_set_exception_ports( mach_task_self_, EXC_MASK_ALL &amp; ~(EXC_MASK_RPC_ALERT | EXC_MASK_GUARD), exception_port, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES,THREAD_STATE_NONE); if (kr != KERN_SUCCESS) { printf("task_set_exception_ports: %s", mach_error_string(kr)); return NULL; } MachExceptionMessage exceptionMessage = {{0}}; MachReplyMessage replyMessage = {{0}}; for (;;) { printf("Wating for message\n"); // Wait for a message. kern_return_t kr = mach_msg(&amp;exceptionMessage.header, MACH_RCV_MSG, 0, sizeof(exceptionMessage), exception_port, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL); if (kr == KERN_SUCCESS) { // Send a reply saying "I didn't handle this exception". replyMessage.header = exceptionMessage.header; replyMessage.NDR = exceptionMessage.NDR; replyMessage.returnCode = KERN_FAILURE; printf("Catch exception: %d codecnt:%d code[0]: %d, code[1]: %d\n", exceptionMessage.exception, exceptionMessage.codeCount, exceptionMessage.code[0], exceptionMessage.code[1]); mach_msg(&amp;replyMessage.header, MACH_SEND_MSG, sizeof(replyMessage), 0, MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL); } else { printf("Mach error: %s\n", mach_error_string(kr)); } } return NULL; } static bool AddMachExceptionServer(void) { int error; pthread_attr_t attr; pthread_attr_init(&amp;attr); pthread_attr_setdetachstate(&amp;attr, PTHREAD_CREATE_DETACHED); pthread_t ptid = NULL; error = pthread_create(&amp;ptid, &amp;attr, &amp;RunExcServer, NULL); if (error != 0) { pthread_attr_destroy(&amp;attr); return false; } pthread_attr_destroy(&amp;attr); return true; } int main(int argc, char *argv[]) { AddMachExceptionServer(); struct sigaction sa; memset(&amp;sa, 0, sizeof(sa)); sa.sa_sigaction = sigbus_handler; sa.sa_flags = SA_SIGINFO; // #if TARGET_OS_IPHONE // sigaction(SIGSEGV, &amp;sa, NULL); // #else sigaction(SIGBUS, &amp;sa, NULL); // #endif int i = 0; while (i++ &lt; 3) { printf("\nProgram start %d\n", i); bzero(&amp;jump_buffer, sizeof(jump_buffer)); if (setjmp(jump_buffer) == 0) { int fd = open("tempfile", O_RDWR | O_CREAT | O_TRUNC, 0666); ftruncate(fd, 0); char *map = (char *)mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); close(fd); unlink("tempfile"); printf("About to write to mmap of size 0 — should trigger SIGBUS...\n"); map[0] = 'X'; // ❌ triger a SIGBUS munmap(map, 4096); } else { printf("Recovered from SIGBUS via longjmp!\n"); } } printf("_exit(0)\n"); _exit(0); return 0; }
2
0
75
Apr ’25
How to prevent holes from being created by cluster_write() in files
A filesystem of my own making exibits the following undesirable behaviour. ClientA % echo line1 >>echo.txt % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n 6c 69 6e 65 31 0a 0000006 ClientB % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n 6c 69 6e 65 31 0a 0000006 % echo line2 >>echo.txt % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n l i n e 2 \n 6c 69 6e 65 31 0a 6c 69 6e 65 32 0a 000000c ClientA % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n l i n e 2 \n 6c 69 6e 65 31 0a 6c 69 6e 65 32 0a 000000c % echo line3 >>echo.txt ClientB % echo line4 >>echo.txt ClientA % echo line5 >>echo.txt ClientB % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n l i n e 2 \n l i n e 6c 69 6e 65 31 0a 6c 69 6e 65 32 0a 6c 69 6e 65 0000010 3 \n l i n e 4 \n \0 \0 \0 \0 \0 \0 33 0a 6c 69 6e 65 34 0a 00 00 00 00 00 00 000001e ClientA % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n l i n e 2 \n l i n e 6c 69 6e 65 31 0a 6c 69 6e 65 32 0a 6c 69 6e 65 0000010 3 \n \0 \0 \0 \0 \0 \0 l i n e 5 \n 33 0a 00 00 00 00 00 00 6c 69 6e 65 35 0a 000001e ClientB % od -Ax -ctx1 echo.txt 0000000 l i n e 1 \n l i n e 2 \n l i n e 6c 69 6e 65 31 0a 6c 69 6e 65 32 0a 6c 69 6e 65 0000010 3 \n \0 \0 \0 \0 \0 \0 l i n e 5 \n 33 0a 00 00 00 00 00 00 6c 69 6e 65 35 0a 000001e The first write on clientA is done via the following call chain: vnop_write()->vnop_close()->cluster_push_err()->vnop_blockmap()->vnop_strategy() The first write on clientB first does a read, which is expected: vnop_write()->cluster_write()->vnop_blockmap()->vnop_strategy()->myfs_read() Followed by a write: vnop_write()->vnop_close()->cluster_push_err()->vnop_blockmap()->vnop_strategy() The final write on clientA calls cluster_write(), which doesn't do that initial read before doing a write. I believe it is this write that introduces the hole. What I don't understand is why this happens and how this may be prevented. Any pointers on how to combat this would be much appreciated.
2
0
121
Apr ’25
Is that possible to allocate virtual memory space between 0~4GB?
I tried to use the following code to get a virtual address within 4GB memory space int size = 4 * 1024; int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT; void* addr = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0); I also tried MAP_FIXED and pass an address for the first argument of mmap. However neither of them can get what I want. Is there a way to get a virtual memory address within 4GB on arm64 on MacOS?
1
0
54
Apr ’25
How to disable the built-in speakers and microphone on a Mac
I need to implement a solution through an API or custom driver to completely block out the built-in speakers and microphone of Mac, because I need other apps to use specified external devices as audio input and output. Is there a way to achieve this requirement? What I mean is that even in system preferences, it should not be possible to choose the built-in microphone and speakers; only my external device can be used.
0
0
43
Apr ’25
Compatibility between macOS VFS ACLs and Linux VFS ACLs
Implementing ACL support in a distributed filesystem, with macOS and Linux clients talking to a remote file server, requires compatibility between the ACL models supported in Darwin-XNU and Linux kernels to be taken into consideration. My filesystem does support EAs to facilitate ACL storage and retrieval. So setting ACLs via chmod(1) and retrieving them via ls(1) does work. However, the macOS and Linux ACL models are incompatible and would require some sort of conversion between them. chmod(1) uses acl(3) to create ACL entries. While acl(3) claims to implement POSIX.1e ACL security API, which, to the best of my knowledge, Linux VFS implements as well, their respective implementations of the standard obviously do differ. Which is also stated in acl(3): This implementation of the POSIX.1e library differs from the standard in a number of non-portable ways in order to support the MacOS/Darwin ACL semantic. Then there's this NFSv4 to POSIX ACL mapping draft that describes the conversion algorithm. What's the recommended way to bridge the compatibility gap there, so that macOS ACL rules are honoured in Linux and vice versa? Thanks.
2
0
142
Apr ’25
Inconsistent KEXT Status Between System Information and kextstat
Hello Everyone, I have noticed an inconsistency in the KEXT status between the System Information Extensions section and the output of the kextstat command. In System Information, the extension appears as loaded: ACS6x: Version: 3.8.3 Last Modified: 2025/3/10, 8:03 PM Bundle ID: com.Accusys.driver.Acxxx Loaded: Yes Get Info String: ACS6x 3.8.4 Copyright (c) 2004-2020 Accusys, Ltd. Architectures: arm64e 64-Bit (Intel): No Location: /Library/Extensions/ACS6x.kext/ Kext Version: 3.8.3 Load Address: 0 Loadable: Yes Dependencies: Satisfied Signed by: Developer ID Application: Accusys, Inc (K3TDMD9Y6B) Issuer: Developer ID Certification Authority Signing time: 2025-03-10 12:03:20 +0000 Identifier: com.Accusys.driver.Acxxx TeamID: K3TDMD9Y6B However, when I check using kextstat, it does not appear as loaded: $ kextstat | grep ACS6x Executing: /usr/bin/kmutil showloaded No variant specified, falling back to release I use a script to do these jobs echo " Change to build/Release" echo " CodeSign ACS6x.kext" echo " Compress to zip file" echo " Notary & Staple" echo " Unload the old Acxxx Driver" echo " Copy ACS6x.kext driver to /Library/Extensions/" echo " Change ACS6x.kext driver owner" echo " Loaded ACS6x.kext driver" sudo kextload ACS6x.kext echo " Rebiuld system cache" sudo kextcache -system-prelinked-kernel sudo kextcache -system-caches sudo kextcache -i / echo " Reboot" sudo reboot But it seems that the KEXT is not always loaded successfully. What did I forget to do? Any help would be greatly appreciated. Best regards, Charles
2
0
236
Mar ’25
Any recent changes to dlopen() implementation?
In some recent releases of macos (14.x and 15.x), we have noticed what seems to be a slower dlopen() implementation. I don't have any numbers to support this theory. I happened to notice this "slowness" when investigating something unrelated. In one part of the code we have a call of the form: const char * fooBarLib = ....; dlopen(fooBarLib, RTLD_NOW + RTLD_GLOBAL); It so happened that due to some timing related issues, the process was crashing. A slow execution of code in this part of the code would trigger an issue in some other part of the code that would then lead to a process crash. The crash itself isn't a concern, because it's an internal issue that will addressed in the application code. What was interesting is that the slowness appears to be contributed by the call to dlopen(). Specifically, whenever a slowness was observed, the crash reports showed stack frames of the form: Thread 1: 0 dyld 0x18f08b5b4 _kernelrpc_mach_vm_protect_trap + 8 1 dyld 0x18f08f540 vm_protect + 52 2 dyld 0x18f0b87e0 lsl::MemoryManager::writeProtect(bool) + 204 3 dyld 0x18f0a7fe4 invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 932 4 dyld 0x18f0e629c invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 172 5 dyld 0x18f0d9c38 invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 496 6 dyld 0x18f08c2dc dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 300 7 dyld 0x18f0d8bcc dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192 8 dyld 0x18f0db5a0 dyld3::MachOFile::forEachInitializerPointerSection(Diagnostics&, void (unsigned int, unsigned int, bool&) block_pointer) const + 160 9 dyld 0x18f0e5f90 dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 432 10 dyld 0x18f0a7bb4 dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 176 11 dyld 0x18f0af190 dyld4::JustInTimeLoader::runInitializers(dyld4::RuntimeState&) const + 36 12 dyld 0x18f0a8270 dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 312 13 dyld 0x18f0ac560 dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_0::operator()() const + 180 14 dyld 0x18f0a8460 dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 412 15 dyld 0x18f0c089c dyld4::APIs::dlopen_from(char const*, int, void*) + 2432 16 libjli.dylib 0x1025515b4 DoFooBar + 56 17 libjli.dylib 0x10254d2c0 Hello_World_Launch + 1160 18 helloworld 0x10250bbb4 main + 404 19 libjli.dylib 0x102552148 apple_main + 88 20 libsystem_pthread.dylib 0x18f4132e4 _pthread_start + 136 21 libsystem_pthread.dylib 0x18f40e0fc thread_start + 8 So, out of curiosity, have there been any known changes in the implementation of dlopen() which might explain the slowness? Like I noted, I don't have concrete numbers, but to quantify the slowness I don't think it's slower by a noticeable amount - maybe a few milli seconds. I guess what I am trying to understand is, whether there's anything that needs attention here.
3
0
357
Mar ’25