Post

Replies

Boosts

Views

Activity

Pinpointing dangling pointers in 3rd party KEXTs
I'm debugging the following kernel panic to do with my custom filesystem KEXT:

    panic(cpu 0 caller 0xfffffe004cae3e24): [kalloc.type.var4.128]: element modified after free (off:96, val:0x00000000ffffffff, sz:128, ptr:0xfffffe2e7c639600)

My reading of this is that somewhere in my KEXT I'm holding a stale reference, 0xfffffe2e7c639600, to a 128-byte zone element, and that 0x00000000ffffffff was written through it at offset 96 after that particular chunk of memory had been released and zeroed out by the kernel. The panic itself is emitted when my KEXT requests the memory chunk that's been tampered with, via the following set of calls.

zalloc_uaf_panic()

    __abortlike
    static void
    zalloc_uaf_panic(zone_t z, uintptr_t elem, size_t size)
    {
        ...
        (panic)("[%s%s]: element modified after free "
            "(off:%d, val:0x%016lx, sz:%d, ptr:%p)%s",
            zone_heap_name(z), zone_name(z),
            first_offs, first_bits, esize, (void *)elem, buf);
        ...
    }

zalloc_validate_element()

    static void
    zalloc_validate_element(
        zone_t zone,
        vm_offset_t elem,
        vm_size_t size,
        zalloc_flags_t flags)
    {
        ...
        if (memcmp_zero_ptr_aligned((void *)elem, size)) {
            zalloc_uaf_panic(zone, elem, size);
        }
        ...
    }

The panic is triggered if memcmp_zero_ptr_aligned(), which is implemented in assembly, detects that an n-sized chunk of memory has been written to after being freed.

    /* memcmp_zero_ptr_aligned() checks string s of n bytes contains all zeros.
     * Address and size of the string s must be pointer-aligned.
     * Return 0 if true, 1 otherwise. Also return 0 if n is 0.
     */
    extern int memcmp_zero_ptr_aligned(const void *s, size_t n);

Normally, one would resort to KASAN to track this down. The KDK README states that KASAN kernels won't load on Apple Silicon. Attempting to follow the instructions given in the README for Intel-based machines does result in a failure for me on Apple Silicon. I stumbled on the Pishi project.
But the custom boot kernel collection that gets created doesn't have any of the KEXTs that were specified to kmutil(8) via the --explicit-only flag, so it can't be instrumented in Ghidra. This is also confirmed by running:

    % kmutil inspect -B boot.kc.kasan
    boot kernel collection at /Users/user/boot.kc.kasan (AEB8F757-E770-8195-458D-B87CADCAB062):
    Extension Information:

I'd appreciate any pointers on how to tackle UAFs in kernel space.
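To make the off/val fields of the panic string easier to reason about, here is a minimal user-space sketch of the kind of check described above. It is not the XNU routine: first_nonzero_word() and simulate_uaf() are hypothetical names made up for illustration, and the result assumes a little-endian machine. It shows that a 32-bit store of 0xffffffff into a zeroed 128-byte element at offset 96 would be reported as off:96, val:0x00000000ffffffff, which may help narrow the search to a 4-byte field at that offset in a 128-byte allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space helper, NOT the kernel's memcmp_zero_ptr_aligned():
 * scans a pointer-aligned buffer one 64-bit word at a time and reports the
 * offset and value of the first non-zero word. Returns 1 if one was found,
 * 0 if the buffer is all zeros. */
static int
first_nonzero_word(const void *buf, size_t size, size_t *off, uint64_t *val)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i + sizeof(uint64_t) <= size; i += sizeof(uint64_t)) {
        uint64_t w;

        memcpy(&w, p + i, sizeof(w));
        if (w != 0) {
            *off = i;
            *val = w;
            return 1;
        }
    }
    return 0;
}

/* Rebuilds the situation from the panic under an ASSUMPTION: the stale
 * write was a 32-bit store of 0xffffffff at offset 96 of a zeroed
 * 128-byte element. */
static int
simulate_uaf(size_t *off, uint64_t *val)
{
    _Alignas(8) unsigned char elem[128];
    uint32_t stale = 0xffffffffu;

    memset(elem, 0, sizeof(elem));              /* element zeroed on free */
    memcpy(elem + 96, &stale, sizeof(stale));   /* write through dangling pointer */
    return first_nonzero_word(elem, sizeof(elem), off, val);
}
```

Calling simulate_uaf(&off, &val) fills in the same offset and value as the panic on a little-endian machine. A 64-bit store of 0xffffffff would produce the same report, so the width of the offending field remains a guess.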
2
0
112
6d
VNOP_MONITOR+vnode_notify() operation details
After perusing the sources of Apple's SMB and NFS clients' implementations of VNOP_MONITOR, my understanding of how VNOP_MONITOR+vnode_notify() operate is as follows:

1. A user-space process advertises an interest in monitoring a file or directory via kqueue(2)/kevent(2).
2. VFS calls the filesystem's implementation of VNOP_MONITOR.
3. VNOP_MONITOR forwards the request to commence or terminate event monitoring to the filesystem server.
4. Network filesystem client nodes call vnode_notify() to notify the underlying VFS of a filesystem event, e.g. file/directory creation/removal, etc.

What I'm still vague about is how the server communicates back to client nodes that an event of interest has occurred. I'd appreciate being enlightened on the operation of VNOP_MONITOR+vnode_notify() in a network filesystem setting.
1
0
78
2w
Hardlinks reported as non-existing on macOS Sequoia for 3rd party FS
After creating a hardlink on a distributed filesystem of my own via:

    % ln f.txt hlf.txt

neither the original file, f.txt, nor the hardlink, hlf.txt, is immediately accessible, e.g. via cat(1), which fails with ENOENT. A short time later, though, both the original file and the hardlink are accessible. Both files can be stat(1)ed, which confirms that vnop_getattr returns success for both files. dtruss(1) indicates it's the open(2) syscall that fails:

    % sudo dtruss -f cat hlf.txt
    2038/0x4f68: open("hlf.txt\0", 0x0, 0x0) = -1 Err#2 ;ENOENT
    2038/0x4f68: write_nocancel(0x2, "cat: \0", 0x5) = 5 0
    2038/0x4f68: write_nocancel(0x2, "hlf.txt\0", 0x7) = 7 0
    2038/0x4f68: write_nocancel(0x2, ": \0", 0x2) = 2 0
    2038/0x4f68: write_nocancel(0x2, "No such file or directory\n\0", 0x1A) = 26 0

dtrace(1)ing my KEXT no longer works on macOS Sequoia, so based on the diagnostic print statements I inserted into my KEXT, the following sequence of calls is observed:

    vnop_lookup(hlf.txt) -> EJUSTRETURN ;ln(1)
    vnop_link(hlf.txt) -> KERN_SUCCESS ;ln(1)
    vnop_lookup(hlf.txt) -> KERN_SUCCESS ;cat(1)
    vnop_open(/) ;I expected to see vnop_open(hlf.txt) here instead of the parent directory.

Internally, hardlinks are created in vnop_link via a call to vnode_setmultipath, with cache_purge_negatives called on the destination directory. On macOS Monterey, for example, where the same code does result in hardlinks being accessible, the following calls are made:

    vnop_lookup(hlf.txt) -> EJUSTRETURN ;ln(1)
    vnop_link(hlf.txt) -> KERN_SUCCESS ;ln(1)
    vnop_lookup(hlf.txt) -> KERN_SUCCESS ;cat(1)
    vnop_open(hlf.txt) -> KERN_SUCCESS ;cat(1)

I'm not sure how else to debug this. Perusing the kernel sources for uses of VISHARDLINK, VNOP_LINK and vnode_setmultipath call sites did not clear things up for me. Any pointers would be greatly appreciated.
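The ln/stat/cat steps above can be collapsed into one small user-space reproducer, which removes the delay between the shell commands and returns the errno of the first failing call. This is only a sketch: link_then_open() is a hypothetical helper name, and the paths passed to it are placeholders to be pointed at the filesystem under test.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical reproducer: creates `src`, hardlinks it to `dst`, then
 * immediately stats and opens the new name. Returns 0 if every step
 * succeeds, otherwise the errno of the first failing call. Both names
 * are removed again before returning. */
static int
link_then_open(const char *src, const char *dst)
{
    struct stat st;
    int rc = 0;
    int fd = open(src, O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (fd < 0) {
        return errno;
    }
    close(fd);
    (void)unlink(dst);      /* a leftover link would make link(2) fail with EEXIST */

    if (link(src, dst) != 0) {
        rc = errno;         /* corresponds to ln(1) failing */
    } else if (stat(dst, &st) != 0) {
        rc = errno;         /* corresponds to stat(1) failing */
    } else if ((fd = open(dst, O_RDONLY)) < 0) {
        rc = errno;         /* corresponds to cat(1) failing */
    } else {
        close(fd);
    }

    (void)unlink(dst);
    (void)unlink(src);
    return rc;
}
```

On a filesystem showing the behaviour described, link_then_open() would be expected to return ENOENT from the open(2) step while link(2) and stat(2) succeed. Running it in a loop with a growing delay before the open may show how long the window lasts.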
3
0
215
Jul ’25
How to prevent holes from being created by cluster_write() in files
A filesystem of my own making exhibits the following undesirable behaviour.

ClientA

    % echo line1 >>echo.txt
    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n
              6c  69  6e  65  31  0a
    0000006

ClientB

    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n
              6c  69  6e  65  31  0a
    0000006
    % echo line2 >>echo.txt
    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n   l   i   n   e   2  \n
              6c  69  6e  65  31  0a  6c  69  6e  65  32  0a
    000000c

ClientA

    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n   l   i   n   e   2  \n
              6c  69  6e  65  31  0a  6c  69  6e  65  32  0a
    000000c
    % echo line3 >>echo.txt

ClientB

    % echo line4 >>echo.txt

ClientA

    % echo line5 >>echo.txt

ClientB

    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n   l   i   n   e   2  \n   l   i   n   e
              6c  69  6e  65  31  0a  6c  69  6e  65  32  0a  6c  69  6e  65
    0000010    3  \n   l   i   n   e   4  \n  \0  \0  \0  \0  \0  \0
              33  0a  6c  69  6e  65  34  0a  00  00  00  00  00  00
    000001e

ClientA

    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n   l   i   n   e   2  \n   l   i   n   e
              6c  69  6e  65  31  0a  6c  69  6e  65  32  0a  6c  69  6e  65
    0000010    3  \n  \0  \0  \0  \0  \0  \0   l   i   n   e   5  \n
              33  0a  00  00  00  00  00  00  6c  69  6e  65  35  0a
    000001e

ClientB

    % od -Ax -ctx1 echo.txt
    0000000    l   i   n   e   1  \n   l   i   n   e   2  \n   l   i   n   e
              6c  69  6e  65  31  0a  6c  69  6e  65  32  0a  6c  69  6e  65
    0000010    3  \n  \0  \0  \0  \0  \0  \0   l   i   n   e   5  \n
              33  0a  00  00  00  00  00  00  6c  69  6e  65  35  0a
    000001e

The first write on clientA is done via the following call chain:

    vnop_write()->vnop_close()->cluster_push_err()->vnop_blockmap()->vnop_strategy()

The first write on clientB first does a read, which is expected:

    vnop_write()->cluster_write()->vnop_blockmap()->vnop_strategy()->myfs_read()

Followed by a write:

    vnop_write()->vnop_close()->cluster_push_err()->vnop_blockmap()->vnop_strategy()

The final write on clientA calls cluster_write(), which doesn't do that initial read before doing a write. I believe it is this write that introduces the hole. What I don't understand is why this happens and how it may be prevented. Any pointers on how to combat this would be much appreciated.
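The hypothesis above, that the hole comes from a write which is not preceded by a read, can be illustrated with a toy user-space model. Everything in it is an assumption made for illustration: struct server, struct client, client_append() and run_scenario() are invented names, and the model says nothing about what cluster_write() does internally. It only shows that pushing a cached range whose middle was never fetched reproduces the byte pattern in the final od(1) output.

```c
#include <stddef.h>
#include <string.h>

enum { FILE_MAX = 64 };

/* Toy model: the server holds the file, each client holds one cached
 * "page" plus the number of leading bytes it has actually fetched. */
struct server { unsigned char data[FILE_MAX]; size_t size; };
struct client { unsigned char page[FILE_MAX]; size_t valid; };

/* Appends `text` at the server's current EOF on behalf of client `c`.
 * With read_first != 0 the client fetches the bytes it has not seen yet;
 * otherwise those bytes stay zero in its page and are pushed as zeros. */
static void
client_append(struct client *c, struct server *s, const char *text, int read_first)
{
    size_t len = strlen(text);
    size_t eof = s->size;       /* current EOF, e.g. learnt via getattr */

    if (read_first && c->valid < eof) {
        memcpy(c->page + c->valid, s->data + c->valid, eof - c->valid);
        c->valid = eof;
    }
    memcpy(c->page + eof, text, len);

    /* Push everything from the first unfetched byte through the new EOF. */
    memcpy(s->data + c->valid, c->page + c->valid, eof + len - c->valid);
    s->size = eof + len;
    c->valid = s->size;
}

/* Replays the five appends from the post; only the last one may skip
 * the read. Copies the resulting file into `out` and returns its size. */
static size_t
run_scenario(int last_write_reads_first, unsigned char *out)
{
    struct server s = {0};
    struct client a = {0}, b = {0};

    client_append(&a, &s, "line1\n", 1);
    client_append(&b, &s, "line2\n", 1);
    client_append(&a, &s, "line3\n", 1);
    client_append(&b, &s, "line4\n", 1);
    client_append(&a, &s, "line5\n", last_write_reads_first);

    memcpy(out, s.data, s.size);
    return s.size;
}
```

With run_scenario(0, out) the six bytes at offset 0x12 come back as zeros and line4 is lost, matching the last od(1) dumps; with run_scenario(1, out) line4 survives. If the model is anywhere near the real cause, the direction to look in would be making the client's cached data (and not just its idea of the file size) reflect remote changes before the final write is pushed.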
2
0
121
Apr ’25
Compatibility between macOS VFS ACLs and Linux VFS ACLs
Implementing ACL support in a distributed filesystem, with macOS and Linux clients talking to a remote file server, requires compatibility between the ACL models supported in Darwin-XNU and Linux kernels to be taken into consideration. My filesystem does support EAs to facilitate ACL storage and retrieval. So setting ACLs via chmod(1) and retrieving them via ls(1) does work. However, the macOS and Linux ACL models are incompatible and would require some sort of conversion between them. chmod(1) uses acl(3) to create ACL entries. While acl(3) claims to implement POSIX.1e ACL security API, which, to the best of my knowledge, Linux VFS implements as well, their respective implementations of the standard obviously do differ. Which is also stated in acl(3): This implementation of the POSIX.1e library differs from the standard in a number of non-portable ways in order to support the MacOS/Darwin ACL semantic. Then there's this NFSv4 to POSIX ACL mapping draft that describes the conversion algorithm. What's the recommended way to bridge the compatibility gap there, so that macOS ACL rules are honoured in Linux and vice versa? Thanks.
2
0
142
Mar ’25
AppleDouble, aka dotbar, files not removed via Finder
I'm having trouble deleting AppleDouble files residing on my custom filesystem through Finder. This also affects files that use the AppleDouble naming convention, i.e. their names start with '._', but aren't AppleDoubles themselves.

dtrace output

In vnop_readdir, 'struct dent/dentry' is set up for dotbar files and written to the uio_t buffer. It's just that my vnop_remove is never called for dotbar files, and I don't understand why not. Dotbar files are removed successfully when deleted through the command line. For SMBClient, vnop_readdir is followed by vnop_access, followed by vnop_lookup, followed by vnop_remove of dotbar files.

SMBClient rm dotbar files dtrace output

Implementing vnop_access for my filesystem did not result in the combination of vnop_lookup and vnop_remove being called for dotbar files. Perusing the kernel sources, I observed the following functions that might be involved, but I have no way of verifying this, as none of the functions of interest are dtrace(1)-able, rmdir_remove_orphaned_appleDouble() in particular.

    rmdir_remove_orphaned_appleDouble() -> VNOP_READDIR()
    rmdirat_internal() -> rmdir_remove_orphaned_appleDouble()
    unlinkat() -> rmdirat_internal()
    rmdir() -> rmdirat_internal()

Any pointers on how dotbar files may be removed through Finder would be greatly appreciated.
1
0
541
Dec ’24
kernel_sysctlbyname("kern.hostname") returns EPERM
Attempting to acquire the value of the 'kern.hostname' ctl from a kext by calling sysctlbyname() returns EPERM with no hostname returned. sysctlbyname() is aliased to kernel_sysctlbyname(): config/Libkern.exports:839:_sysctlbyname:_kernel_sysctlbyname Looking at the implementation of kernel_sysctlbyname(), EPERM is returned by sysctl_root(). Not sure how to correctly identify the point of failure. Alternately, calling sysctlbyname("hw.ncpu") does return the value set for the ctl. The kext was compiled with SYSCTL_DEF_ENABLED defined to have the relevant section of sys/sysctl.h enabled. bsd_hostname() is a private symbol which is inaccessible to my kext. % sysctl -n kern.hostname does return the host name, so the ctl must be set. Is it possible to get the name of a host from the context of my kext? Thanks.
1
0
572
Dec ’24
Kernel panic in mac_label_verify()
Accessing a directory on my custom distributed filesystem results in a kernel panic. According to the backtrace, the last function called before the panic is triggered is mac_label_verify(). See the backtrace file attached. mac_label_verify-panic.txt The panic manifests itself given the following conditions: Machine-a: make a directory in Finder. Machine-b: remove the directory created on machine-a in Finder. Machine-a: access the directory removed on machine-b in Finder. Kernel panic ensues. The panic is reproducible on both Apple Silicon and x86-64. The backtrace is for x86-64 as I wasn't able to symbolicate it on Apple Silicon. Not sure how to tackle this one. Any pointers would be much appreciated.
15
0
1.2k
Nov ’24
Subsequent expansion of same archive fails due to name collision
Extracting an archive into the same directory on my custom filesystem more than once fails with the following message:

    Unable to finish expanding 'misc.tar.xz' into 'extractme'. Could not move 'misc' into destination directory.

I.e. initial extraction succeeds with archive contents extracted into extractme/misc. Subsequent extraction fails to rename extractme.sb-db71cd27-lFjN1f/misc to 'extractme/misc 2'. This behaviour is observed on macOS Monterey and Ventura. It does work as expected on macOS Sonoma though. Dtrace(1)-ing the archive being extracted over smbfs results in the following sequence of calls being made:

    2 -> smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 nameiop:0
    2 <- smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 -> 2 ;ENOENT
    2 -> smbfs_vnop_lookup AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc nameiop:0x2 ;DELETE
    2 <- smbfs_vnop_lookup AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc -> 0
    2 -> smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 nameiop:0x3 ;RENAME
    2 <- smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 -> EJUSTRETURN
    1 -> smbfs_vnop_rename AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc -> extractme/nil
    2 <- smbfs_vnop_rename AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc -> extractme/nil -> 0
    2 -> smbfs_vnop_lookup AUHelperService-2163 -> TheRooT/extractme/misc 2 nameiop:0
    3 <- smbfs_vnop_lookup AUHelperService-2163 -> TheRooT/extractme/misc 2 -> 0 ;Successful lookup

What I don't understand is what causes vnop_lookup to be called for misc, so that it is removed from the temporary directory, renamed to 'misc 2' and placed in the destination directory, 'extractme', via vnop_rename. I had a look at smbfs_vnop_lookup and smbfs_vnop_rename and didn't see anything that would cause 'misc 2' to come into being.
Based on the output of the dtrace(1) script running on my custom filesystem, there are no vnop_lookup and vnop_rename calls being made to remove the 'misc' directory from the temporary directory and to rename it to 'misc 2' and place it in the destination directory at extractme. Archive extraction proceeds no further after extracting the archive contents into the temporary directory. What am I missing?
1
0
610
Oct ’24
What are the locking rules for VFS and VNOP operations?
I'm observing all sorts of race conditions occurring in various VNOPs my custom filesystem implements. I'm inclined to attribute this to my implementation not following the locking rules expected by the system of a 3rd party filesystem as well as it should. I've looked at how locking is done in Apple's own implementation of Samba and NFS clients. The Samba client uses read/write locks to protect its node from data races. While the NFS client uses mutex locks for the same purpose. I realised that I don't have a clear model in my head of how locking should be done properly. Thus my question, what are the locking rules for VFS and VNOP operations? Thanks.
4
0
737
Sep ’24
vnop_lookup returning ENOENT aborts rm(1)
When recursively removing a directory with a large number of entries that resides on my custom filesystem, rm(1) aborts with ENOENT.

    % rm -Rv /Volumes/myfs/linux-kernel/linux-6.10.6
    [...]
    /Volumes/myfs/linux-kernel/linux-6.10.6/include/drm/bridge/aux-bridge.h
    /Volumes/myfs/linux-kernel/linux-6.10.6/include/drm/bridge/dw_hdmi.h
    rm: fts_read: No such file or directory

I'm observing the following sequence of calls being made.

    2024-09-17 17:58:25 vnops.c:281 myfs_vnop_lookup: rm-936 -> dw_hdmi.h ;initial lookup call
    2024-09-17 17:58:25 vnops.c:315 myfs_vnop_lookup: -> cache_lookup(dw_hdmi.h)
    2024-09-17 17:58:25 vnops.c:317 myfs_vnop_lookup: <- cache_lookup(dw_hdmi.h) -> 0 ;cache miss
    2024-09-17 17:58:25 rpc.c:431 myfsLookup: rm-936 -> dw_hdmi.h ;do remote lookup
    2024-09-17 17:58:25 rpc.c:500 myfsLookup: -> myfs_lookup_rpc(dw_hdmi.h)
    2024-09-17 17:58:25 rpc.c:502 myfsLookup: <- myfs_lookup_rpc(dw_hdmi.h) -> 0 ;file found and added to vfs cache
    2024-09-17 17:58:25 vnops.c:281 myfs_vnop_lookup: rm-936 -> dw_hdmi.h ;subsequent lookup call
    2024-09-17 17:58:25 vnops.c:315 myfs_vnop_lookup: -> cache_lookup(dw_hdmi.h)
    2024-09-17 17:58:25 vnops.c:317 myfs_vnop_lookup: <- cache_lookup(dw_hdmi.h) -> -1 ;cache hit
    2024-09-17 17:58:25 vnops.c:1478 myfs_vnop_remove: -> myfs_unlink(dw_hdmi.h) ;unlink sequence
    2024-09-17 17:58:25 rpc.c:1992 myfs_unlink: -> myfs_unlink_rpc(dw_hdmi.h)
    2024-09-17 17:58:25 rpc.c:1994 myfs_unlink: <- myfs_unlink_rpc(dw_hdmi.h) -> 0 ;remote unlink succeeded
    2024-09-17 17:58:25 vnops.c:1480 myfs_vnop_remove: <- myfs_unlink(dw_hdmi.h) -> 0
    2024-09-17 17:58:25 vnops.c:1487 myfs_vnop_remove: -> cache_purge(dw_hdmi.h)
    2024-09-17 17:58:25 vnops.c:1489 myfs_vnop_remove: <- cache_purge(dw_hdmi.h)
    2024-09-17 17:58:25 vnops.c:1499 myfs_vnop_remove: -> vnode_recycle(dw_hdmi.h)
    2024-09-17 17:58:25 vnops.c:1501 myfs_vnop_remove: <- vnode_recycle(dw_hdmi.h)
    2024-09-17 17:58:27 vnops.c:281 myfs_vnop_lookup: fseventsd-101 -> dw_hdmi.h ;another lookup; why?
    2024-09-17 17:58:27 vnops.c:315 myfs_vnop_lookup: -> cache_lookup(dw_hdmi.h)
    2024-09-17 17:58:27 vnops.c:317 myfs_vnop_lookup: <- cache_lookup(dw_hdmi.h) -> 0
    2024-09-17 17:58:27 rpc.c:431 myfsLookup: fseventsd-101 -> dw_hdmi.h
    2024-09-17 17:58:27 rpc.c:500 myfsLookup: -> myfs_lookup_rpc(dw_hdmi.h)
    2024-09-17 17:58:27 rpc.c:502 myfsLookup: <- myfs_lookup_rpc(dw_hdmi.h) -> ENOENT(2)
    2024-09-17 17:58:27 vnops.c:371 myfs_vnop_lookup: SET(NNEGNCENTRIES): dw_hdmi.h
    2024-09-17 17:58:27 vnops.c:373 myfs_vnop_lookup: ENOENT(2) <- shouldAddToNegativeNameCache(dw_hdmi.h)

I checked the value of the vnode's v_iocount when vnop_remove and vnop_reclaim are being called. Each vnop_remove is followed by vnop_reclaim, with v_iocount set to 1 in both calls, as expected. What I don't understand is why, after the file has been removed, another lookup call is made, which returns ENOENT to rm(1) and causes it to abort. Any pointers on what could be amiss there would be much appreciated.
3
0
662
Sep ’24
vnop_advlock not being called for my filesystem
When running AJA System Test for my custom filesystem, the write and read tests get stuck intermittently. I didn't observe any error codes being returned by my vnop_read/write or sock_receive/send functions. Dtrace(1)'ing the vnops being called by AJA System Test for smbfs revealed that, amongst other things, vnop_advlock is being called:

    0 -> smbfs_vnop_advlock ajasystemtest -> smbfs_vnop_advlock(ajatest.dat, op: 0x2, fl->l_start: 0, fl->l_len: 0, fl->l_pid: 0, fl->l_type: 2, fl->l_whence: 0, flags: 0x40, timeout: 0)
    0 <- smbfs_vnop_advlock ajasystemtest -> smbfs_vnop_advlock(ajatest.dat) -> -1934627947504

    op: 0x2          ;same value as F_SETFD, but the op argument of vnop_advlock takes F_GETLK, F_SETLK or F_UNLCK, so this is presumably F_UNLCK
    fl->l_len: 0     ;len = 0 means until end of file
    fl->l_type: 2    ;#define F_UNLCK 2 /* unlock */
    fl->l_whence: 0  ;#define SEEK_SET 0 /* set file offset to offset */
    flags: 0x40      ;#define F_POSIX 0x040 /* Use POSIX semantics for lock */

As my filesystem didn't implement vnop_advlock, I thought I'd explore that avenue. My vnop_advlock simply returns KERN_SUCCESS. Both f_capabilities.valid and f_capabilities.capabilities of struct vfs_attr have VOL_CAP_INT_ADVLOCK and VOL_CAP_INT_FLOCK set. Yet, vnop_advlock doesn't get called for my filesystem when running AJA System Test. Any tips on what could be amiss there would be much appreciated.
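One way to check whether vnop_advlock is reachable at all on the filesystem, independently of AJA System Test, is to issue a POSIX byte-range lock from user space with the same struct flock values as in the trace above. The sketch below uses only standard fcntl(2) calls; lock_then_unlock() is a hypothetical helper name and the path passed to it is a placeholder. Whether AJA System Test takes its locks this way, or the unlock in the trace originates elsewhere, is an assumption that would need verifying.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: takes a whole-file POSIX write lock on `path`
 * and releases it again, using l_whence = SEEK_SET, l_start = 0 and
 * l_len = 0 (until end of file) as seen in the trace.
 * Returns 0 on success, -1 if any step fails. */
static int
lock_then_unlock(const char *path)
{
    struct flock fl = {0};
    int rc = -1;
    int fd = open(path, O_CREAT | O_RDWR, 0644);

    if (fd < 0) {
        return -1;
    }

    fl.l_type = F_WRLCK;        /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */

    if (fcntl(fd, F_SETLK, &fl) == 0) {
        fl.l_type = F_UNLCK;    /* release the same range */
        rc = fcntl(fd, F_SETLK, &fl);
    }

    close(fd);
    return rc;
}
```

If calling lock_then_unlock() on a file residing on the custom filesystem does not reach the KEXT's vnop_advlock either, the problem would seem to lie in how the operation is registered or advertised, not in anything specific to AJA System Test.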
4
0
777
Aug ’24
vnop_strategy unexpectedly zero-extends files
On implementing vnop_mmap, vnop_strategy and other related VNOPs as suggested in https://developer.apple.com/forums/thread/756358 my vnop_strategy routine ends up zero-extending files. I don't understand why my filesystem behaves as described above. Perusing the source code of both the relevant parts of Darwin/XNU and SMBClient did not clarify things for me. A nudge in the right direction would be greatly appreciated. The technical details of the issue are given in the plain text file attached, as some text was found to be sensitive. Unsure what exactly it was. apple-dts-issue-desc.txt
3
0
794
Jul ’24