In some recent releases of macos (14.x and 15.x), we have noticed what seems to be a slower dlopen() implementation. I don't have any numbers to support this theory. I happened to notice this "slowness" when investigating something unrelated. In one part of the code we have a call of the form:
const char * fooBarLib = ....;
dlopen(fooBarLib, RTLD_NOW + RTLD_GLOBAL);
It so happened that due to some timing related issues, the process was crashing. A slow execution of code in this part of the code would trigger an issue in some other part of the code that would then lead to a process crash. The crash itself isn't a concern, because it's an internal issue that will addressed in the application code. What was interesting is that the slowness appears to be contributed by the call to dlopen(). Specifically, whenever a slowness was observed, the crash reports showed stack frames of the form:
Thread 1:
0 dyld 0x18f08b5b4 _kernelrpc_mach_vm_protect_trap + 8
1 dyld 0x18f08f540 vm_protect + 52
2 dyld 0x18f0b87e0 lsl::MemoryManager::writeProtect(bool) + 204
3 dyld 0x18f0a7fe4 invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 932
4 dyld 0x18f0e629c invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 172
5 dyld 0x18f0d9c38 invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 496
6 dyld 0x18f08c2dc dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 300
7 dyld 0x18f0d8bcc dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192
8 dyld 0x18f0db5a0 dyld3::MachOFile::forEachInitializerPointerSection(Diagnostics&, void (unsigned int, unsigned int, bool&) block_pointer) const + 160
9 dyld 0x18f0e5f90 dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 432
10 dyld 0x18f0a7bb4 dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 176
11 dyld 0x18f0af190 dyld4::JustInTimeLoader::runInitializers(dyld4::RuntimeState&) const + 36
12 dyld 0x18f0a8270 dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 312
13 dyld 0x18f0ac560 dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_0::operator()() const + 180
14 dyld 0x18f0a8460 dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 412
15 dyld 0x18f0c089c dyld4::APIs::dlopen_from(char const*, int, void*) + 2432
16 libjli.dylib 0x1025515b4 DoFooBar + 56
17 libjli.dylib 0x10254d2c0 Hello_World_Launch + 1160
18 helloworld 0x10250bbb4 main + 404
19 libjli.dylib 0x102552148 apple_main + 88
20 libsystem_pthread.dylib 0x18f4132e4 _pthread_start + 136
21 libsystem_pthread.dylib 0x18f40e0fc thread_start + 8
So, out of curiosity, have there been any known changes in the implementation of dlopen() which might explain the slowness?
Like I noted, I don't have concrete numbers, but to quantify the slowness I don't think it's slower by a noticeable amount - maybe a few milli seconds. I guess what I am trying to understand is, whether there's anything that needs attention here.
3
0
554