Post

Replies

Boosts

Views

Created

Getting Valgrind to run on macOS 10.15 Catalina
I'm looking at getting Valgrind to run on macOS 10.15 Catalina.So far I have the build working OK (based on a fork for 10.14 plus a few tweaks specific to 10.15).However when I run Valgrind [and I'm running the minimal --tool=none with an app that is just "int main(void) {}"] then I'm getting an error related to pthread_init. From what I see from the executed machine code, there is a test for _os_xbs_chrooted (a global variable in the kernel by the looks of it) which then leads to a call to __pthread_init.cold.2. This function contains a ud2 opcode which triggers a SIGILL in the Valgrind VM.Dearching google for _os_xbs_chrooted doesn't come up with anything much. There's this https://github.com/apple/darwin-libpthread/blob/master/src/pthread.c for the pthread check, and one other reference for the initialization.I realize this looks like it could be security related and information is not made public.Any suggestions as to how I can proceed? I have little experience in kernel programming.
11
0
6.9k
Jan ’20
memchr variant
I'm trying to find which macOS version added _platform_memchr$VARIANT$Base (see https://bugs.kde.org/show_bug.cgi?id=43779 for details, a request to get this handled properly by Valgrind).
1
0
601
Nov ’21
lldb handling of dyld shared cache
[I've already asked this on llvm discourse, no answer yet] Can someone give me a brief intro or point me to documentation that describes how lldb handles the dyld shared cache on macOS? I’m trying to evaluate how to implement the same functionality in Valgrind. Prior to macOS 11 Big Sur Valgrind used DYLD_SHARED_REGION=avoid to force loading the libraries and then to bypass the cache and so to trigger reading the mach-o info to be able to redirect malloc/pthread functions. Without these redirs not much is working correctly.
0
0
940
Jan ’23
Getting Valgrind to run on macOS 10.15 Catalina, reboot
I posted about this about 5 years ago, and now at last it's close to being finished. The main problem that I now have is related to matching up DWARF debuginfo and global variables. This works fairly well on macOS 10.14. On 10.15 much less so, and I think that the reason is how the macho data is mmap'd. When Valgrind runs it does the job of the OS and loads the guest exe into memory. It'll then load and run dyld in Valgrind. I can get memory map debug traces. In one test with a problem I see 4 load segments. __DATA_CONST and __DATA both have prot 3 (RW) so we load them RW and not R then RW. Then I think that dyld munmaps and re mmaps the _DATA_CONST segment as RO. Valgrind works based on mmaps triggering reading debuginfo. I don't think that it handles munmap correctly. I need to debug that a lot more - I can see the changed mappings in the debug output but I don't see exactly what is happening with munmap and mmap (unless dyld is doing that on a section by basis). Does what I'm saying about the mappings make any sense?
3
0
68
5d
pthread_cond_timedwait problem
I'm trying to debug an issue with the Valgrind tool Helgrind. This is on masOS 11 (I've also seen it on 12, probably the same for other macOS versions). Here is the testcase. #include <pthread.h> #include <string.h> #include <assert.h> #include <errno.h> int main(void) { pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; pthread_cond_t cond = PTHREAD_COND_INITIALIZER; int res; // This time has most definitely passed already. (Epoch) struct timespec now; memset(&now, 0, sizeof(now)); res = pthread_mutex_lock(&mutex); assert(res == 0); res = pthread_cond_timedwait(&cond, &mutex, &now); assert(res == ETIMEDOUT); res = pthread_mutex_unlock(&mutex); assert(res == 0); res = pthread_mutex_destroy(&mutex); assert(res == 0); res = pthread_cond_destroy(&cond); assert(res == 0); } The error that I'm getting from Helgrind is ==56754== Thread #1 unlocked a not-locked lock at 0x1048C7A08 ==56754== at 0x10020E2F9: mutex_unlock_WRK (hg_intercepts.c:1255) ==56754== by 0x10020E278: pthread_mutex_unlock (hg_intercepts.c:1278) ==56754== by 0x7FF80F526813: _pthread_cond_wait (in /usr/lib/system/libsystem_pthread.dylib) ==56754== by 0x10020E812: pthread_cond_timedwait_WRK (hg_intercepts.c:1465) ==56754== by 0x10020E6A8: pthread_cond_timedwait (hg_intercepts.c:1512) ==56754== by 0x100003DD2: main (cond_timedwait_test.c:18) ==56754== Lock at 0x1048C7A08 was first observed ==56754== at 0x10020DE91: mutex_lock_WRK (hg_intercepts.c:1009) ==56754== by 0x10020DD68: pthread_mutex_lock (hg_intercepts.c:1031) ==56754== by 0x100003D7D: main (cond_timedwait_test.c:16) ==56754== Address 0x1048c7a08 is on thread #1's stack ==56754== in frame #5, created by main (cond_timedwait_test.c:7) If I turn on extra tracing then on FreeBSD the Helgrind pthread traces correspond to the C source. On macOS I see an extra mutex. << pthread_mxlock 0x7ff850c41 :: mxlock -> 0 >> << pthread_mxunlk 0x7ff850c41 :: mxunlk -> 0 >> ^^ I don't know what this mutex is << pthread_mxlock 0x1048c7a08 :: mxlock -> 0 >> ^^ this is the user mutex << pthread_mxlock 0x7ff850c41 :: mxlock -> 0 >> << pthread_cond_timedwait 0x1048c79d8 0x1048c7a08 0x1048c79c0<< pthread_mxunlk 0x7ff850c41 :: mxunlk -> 0 >> ^^ pthread_cond_timedwait unlocking the non-user mutex << pthread_mxlock 0x7ff850c41 :: mxlock -> 0 >> << pthread_mxunlk 0x7ff850c41 :: mxunlk -> 0 >> << pthread_mxunlk 0x1048c7a08 [error here] :: mxunlk -> 0 >> << pthread_mxlock 0x1048c7a08 :: mxlock -> 0 >> << pthread_mxlock 0x7ff850c41 :: mxlock -> 0 >> cotimedwait -> 60 >> In these traces the "-> 0" is the return code, showing that all of the calls succeeded. I need to do more debugging inside Helgrind. In the traces above I only see the user mutex being locked and then unlocked. Can anyone explain why I'm seeing an extra mutex in there? I'll have a poke around the XNU source
0
0
24
10h
Initial stack construction
I'm having problems constructing the initial stack for the guest executable for Valgrind on macOS 12 Intel. This seemed to work OK for macOS 11 but I'm getting a bad 'apple' pointer on macOS 12. The stack (constructed by Valgrind) looks like this higher address +-----------------+ <- clstack_end | | : string table : | | +-----------------+ | NULL | +-----------------+ | executable_path | (first arg to execve()) +-----------------+ | NULL | - - | envp | +-----------------+ | NULL | - - | argv | +-----------------+ | argc | +-----------------+ | mach_header * | (dynamic only) lower address +-----------------+ <- sp | undefined | : : The problem that I'm having is with the executable path (or the apple pointer). This points to NULL. The actual pointer to the "executable=xxx" string is 16 bytes lower in memory. The code for main starts with Dump of assembler code for function main: 0x0000000100003a90 <+0>: push %rbp 0x0000000100003a91 <+1>: mov %rsp,%rbp 0x0000000100003a94 <+4>: sub $0x60,%rsp 0x0000000100003a98 <+8>: movl $0x0,-0x4(%rbp) 0x0000000100003a9f <+15>: mov %edi,-0x8(%rbp) 0x0000000100003aa2 <+18>: mov %rsi,-0x10(%rbp) 0x0000000100003aa6 <+22>: mov %rdx,-0x18(%rbp) 0x0000000100003aaa <+26>: mov %rcx,-0x20(%rbp) That's the prefix, making space for locals, setting a local variable to 0 then getting the 4 arguments from main in edi, rsi, rdx and rcx as per the SYSV amd64 ABI. I think that it is dyld that puts the apple pointer into rcx. Can anyone tall me how dyld works out the address of the apple pointer?
1
0
10
3h