kithrup’s Profile | Apple Developer Forums

Swift, XPC, and... segmentation faults?

I thought Swift wasn't supposed to get them, which is part of the reason why I chose to use it for my network extension. But we're getting crashes occasionally that look like:

    Thread 4 Crashed:: Dispatch queue: com.apple.NSXPCConnection.user.endpoint
    0   com.kithrup.MyApp.NExt    0x102c4ffe2    MyExt.sendData(_:data:completion:) + 610
    1   com.kithrup.MyApp.NExt    0x102c5091f    @objc MyExt.sendData(_:data:completion:) + 255
    2   Foundation                0x7ff81ef97490 __NSXPCCONNECTION_IS_CALLING_OUT_TO_EXPORTED_OBJECT_S3__ + 10
    3   Foundation                0x7ff81ef3fa1f -[NSXPCConnection _decodeAndInvokeMessageWithEvent:flags:] + 2322
    4   Foundation                0x7ff81eef641e message_handler + 206
    5   libxpc.dylib              0x7ff81de24b6c _xpc_connection_call_event_handler + 56
    6   libxpc.dylib              0x7ff81de23947 _xpc_connection_mach_event + 1382
    7   libdispatch.dylib         0x7ff81df2e3b1 _dispatch_client_callout4 + 9
    8   libdispatch.dylib         0x7ff81df47041 _dispatch_mach_msg_invoke + 445
    9   libdispatch.dylib         0x7ff81df341cd _dispatch_lane_serial_drain + 342
    10  libdispatch.dylib         0x7ff81df47b77 _dispatch_mach_invoke + 484
    11  libdispatch.dylib         0x7ff81df341cd _dispatch_lane_serial_drain + 342
    12  libdispatch.dylib         0x7ff81df34e30 _dispatch_lane_invoke + 417
    13  libdispatch.dylib         0x7ff81df3eeee _dispatch_workloop_worker_thread + 753
    14  libsystem_pthread.dylib   0x7ff81e0e1fd0 _pthread_wqthread + 326

The XPC method is

    func sendData(_: UUID, data: Data?, completion: @escaping (_: Error?) -> Void)

It's crashing on address 0x10, so pretty clearly a NULL dereference. Since this is happening in my extension, it's in Swift (as I said above), so I have no idea what could be NULL without the compiler yelling at me first.

Programming Languages Swift Swift XPC

11

0

2.4k

Jun ’22

Why does spotlight hate me?

This query should find everything with a display name of "Safari." That should include, for example, /Applications/Safari.app.

    [bigbook:/tmp] sef% mdfind 'kMDItemDisplayName == "Safari"c'
    /Library/Application Support/Apple/Safari
    /Library/Apple/System/Library/Assistant/Plugins/Safari.assistantBundle/Contents/MacOS/Safari
    /Users/Shared/Previously Relocated Items 1/Security/System/Library/AssetsV2/com_apple_MobileAsset_MacSoftwareUpdate/f7b05c91052116c046919f72de2c03a86cabcf3e.asset/AssetData/payloadv2/ecc_data/System/Library/Templates/Data/Applications/Safari.app
    /Users/Shared/Previously Relocated Items/Security/Developer/SDKs/MacOSX10.6.sdk/System/Library/PrivateFrameworks/Safari.framework/Versions/A/Safari
    /Users/Shared/Previously Relocated Items/Security/Developer/SDKs/MacOSX10.7.sdk/System/Library/PrivateFrameworks/Safari.framework/Versions/A/Safari
    /Users/sef/Applications/Microsoft Office 2004/Office/Themes/safari
    /Users/sef/Library/Application Support/SyncService/LastSync Data/Safari

And yet, /Applications/Safari.app is in fact missing from there. Why? (This used to work. But then mds was broken on my machine, so I bit the bullet and upgraded to Monterey. Multiple Monterey systems are showing this weird behaviour.)

App & System Services General Spotlight macOS

1

0

808

Jun ’22

Can I tell if I'm in a captive portal?

That's pretty much the question: we've got a tunnel provider, and I think the OS' ability to handle a captive portal situation is better than I could do, so is there a way to find out if we are in one, and if so wait for it to be handled by the user before we start doing things?

App & System Services Core OS System Configuration Network

5

0

1.3k

Jun ’22

Is there a way to measure lock contention?

I was surprised I could not find such a template in Instruments / xctrace; maybe it's in something else and I couldn't find it? (I am trying to figure out why my throughput got slow. Is it because a mutex is too heavy? Or is there a lot of contention over the lock? How long do the locks tend to be held? Etc.)
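If no ready-made template turns up, one hand-rolled way to answer the three questions above (how often the lock is contended, how long callers wait, how long the lock is held) is to wrap the mutex and count. The following is only a sketch under assumptions: `timed_lock` and its functions are hypothetical names invented for this example, not part of Instruments, xctrace, or any Apple API, and the extra clock reads add overhead of their own.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime under strict C modes */
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical wrapper: a mutex plus counters describing how it is used. */
typedef struct {
    pthread_mutex_t mutex;
    uint64_t acquisitions;  /* total successful acquisitions */
    uint64_t contended;     /* acquisitions that found the mutex already held */
    uint64_t wait_ns;       /* total time contended callers spent waiting */
    uint64_t hold_ns;       /* total time the mutex was held */
    uint64_t acquired_at;   /* timestamp of the current acquisition */
} timed_lock;

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void timed_lock_init(timed_lock *l) {
    int rc = pthread_mutex_init(&l->mutex, NULL);
    assert(rc == 0);
    (void)rc;
    l->acquisitions = l->contended = 0;
    l->wait_ns = l->hold_ns = l->acquired_at = 0;
}

static void timed_lock_acquire(timed_lock *l) {
    uint64_t start = now_ns();
    int was_contended = 0;
    /* trylock succeeds immediately when nobody holds the mutex. */
    if (pthread_mutex_trylock(&l->mutex) != 0) {
        was_contended = 1;
        pthread_mutex_lock(&l->mutex);
    }
    uint64_t got = now_ns();
    /* The counters are only modified while the mutex is held. */
    l->acquisitions++;
    if (was_contended) {
        l->contended++;
        l->wait_ns += got - start;
    }
    l->acquired_at = got;
}

static void timed_lock_release(timed_lock *l) {
    l->hold_ns += now_ns() - l->acquired_at;
    pthread_mutex_unlock(&l->mutex);
}

/* Call when no other thread is using the lock, e.g. at shutdown. */
static void timed_lock_report(const timed_lock *l, const char *name) {
    printf("%s: %" PRIu64 " acquisitions, %" PRIu64 " contended, "
           "%" PRIu64 " ns waiting, %" PRIu64 " ns held\n",
           name, l->acquisitions, l->contended, l->wait_ns, l->hold_ns);
}
```

Replacing the lock/unlock calls around the suspect critical section with `timed_lock_acquire` / `timed_lock_release` and printing the report after a run gives a rough picture: a high contended count with long waits points at contention, while few contended acquisitions with a large `hold_ns` suggests the work done under the lock is the cost.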

Developer Tools & Services Instruments Instruments Developer Tools

1

0

1.8k

Jul ’22

Crashing with hardened runtime, not without?

On Apple Silicon only. It's a bad dereference, address 0xbeadddaf65d0 which looks fake. What does hardening do differently that might cause that, any ideas?

Developer Tools & Services Xcode Developer Tools

0

536

Jul ’22

Very basic question: diagnosing DNS issues

Our transparent proxy provider sends flows to a daemon which analyzes and then does proxying. Works fine. Except that sometimes it stops working. As far as I can tell, it's due to DNS not working. Queries hang -- we've got some internal ones we log, that have timed out after 20 or 30 seconds. Now, clearly, we're doing something bad (because if we kill the daemon and it restarts, everything goes back to working). Unfortunately, I have forgotten so much I can't figure out how to see where it's broken! Things like dig @8.8.8.8 com. any fail -- I am presuming because it's trying to do a lookup of "8.8.8.8" and that fails, but I could be wrong. Admittedly, that one doesn't time out, it simply says no servers could be reached. Meanwhile, pinging that address works. (And, also, the local DNS host -- the one provided via DHCP and listed in /etc/resolv.conf and ipconfig getstatus -- behaves the same way.) I haven't been able to reproduce this myself, unfortunately. Although I have, somewhat interestingly, had a similar issue, which was clearly due to a Google Home WiFi access point (as resetting it fixed the problem, as does moving to another area of the house such that a different AP in the mesh takes over). On my FreeBSD systems, I'd run tcpdump and truss/ktrace on named, but as I said, I've forgotten so much about how macOS does DNS I'm flailing. Help?
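On the presumption that a lookup of "8.8.8.8" itself is what fails: a numeric address can normally be parsed locally, with no resolver involved. The sketch below shows this with the standard `getaddrinfo` call and the `AI_NUMERICHOST` flag, which forbids name resolution. `is_numeric_host` is a hypothetical helper written for this example, and whether dig handles its `@server` argument the same way is an assumption.

```c
#define _POSIX_C_SOURCE 200809L  /* for getaddrinfo under strict C modes */
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 if `host` is an IP literal that getaddrinfo can parse without
   consulting any resolver, 0 otherwise. */
static int is_numeric_host(const char *host) {
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_flags = AI_NUMERICHOST;  /* fail instead of doing a DNS query */

    if (getaddrinfo(host, NULL, &hints, &res) != 0) {
        return 0;  /* not a literal; resolving it would need DNS */
    }
    freeaddrinfo(res);
    return 1;
}
```

If `is_numeric_host("8.8.8.8")` returns 1 on the affected machine while queries sent to that server still fail, that would suggest the problem lies in reaching the server on the DNS port rather than in resolving the server's address.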

App & System Services Networking Network Network Extension

5

0

509

Jul ’22

malloc_history never works for me: unable to read input graph: The data couldn’t be read because it isn’t in the correct format

    root# malloc_history /tmp/stack-logs.60147.10f5f7000.agent-tests.0EDkOu.index -callTree
    malloc_history[60193]: [fatal] unable to read input graph: The data couldn’t be read because it isn’t in the correct format.

I ran my program as

    root# env MallocDebugReport=stderr MallocGuardEdges=1 MallocStackLogging=1 MallocStackLoggingNoCompact=1 MallocScribble=1 MallocErrorAbort=1 DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib ./test/agent-test

(The program then segfaults, which looks to be due to a memory stomper.)

Developer Tools & Services Xcode Developer Tools

1

0

894

Sep ’23

On reboot, two instances of faceless app

We have a containing app for our network extension; it's set up as a faceless app and run as a LaunchAgent. It works rather well, we're happy with it. Except sometimes, possibly only on M1's, on reboot, it'll show up twice. Our name in the plist is com.kithrup.appName -- simple enough. On reboot, launchctl list shows two com.kithrup jobs -- and the extra one is application.com.kithrup.appName.3238445.3238450. Anyone have any idea about this?

App & System Services Core OS macOS Service Management

8

0

982

Sep ’22

Pointer Authentication and dispatch_queue_t

We got a crash in some code; I had managed to miss this topic entirely somehow. This says:

    Pointer authentication can also expose latent bugs in existing code. In C++, it’s incorrect to call a virtual method using a declaration that differs from its definition. In practice, such calls typically succeed in arm64, but trigger a pointer authentication failure in arm64e. You might encounter this bug when using OS_OBJECT types like dispatch_queue_t and xpc_connection_t. You can’t pass instances of these types from C++ code to an Objective-C++ function (or vice versa) because they’re defined differently in Objective-C++ to support automatic reference counting (ARC).

And, yes, we have both C++ and ObjC++ code, and a class does have a dispatch_queue_t member, and it does get passed around (although I don't think anything other than ObjC++ code touches the member), but... the documentation there says "you can't do this" but has absolutely no information on what you are supposed to do instead. Again, I've managed to miss this completely, and my network searching ability is pretty awful, so I assume I simply couldn't find documentation on it? (And I can't stream video very well where I am right now.)

Developer Tools & Services Xcode Debugging Apple Silicon

6

0

1.4k

Dec ’22

Transparent proxy provider and multiple users

This is somewhat related to my question at "On reboot, two instances of faceless app", but with a slightly different focus. This is my understanding of how the system works, and please correct me if I'm wrong:

- A network extension can only be loaded by an application.
- That application must contain the extension (in Contents/Library/SystemExtensions).
- Only the application instance that loads an extension can get VPN notifications (e.g., NEVPNStatusDidChangeNotification).
- There does not appear to be a way to get the version of installed network extensions programmatically?
- When a second user logs in, runs the containing app, and requests loading the extension, it does the normal replacement request.

Given that... how is it supposed to handle multiple users (via Fast User Switching)?

App & System Services Networking Network Extension

3

0

719

Sep ’22

SCDynamicStoreCopyConsoleUser returns an empty string

consoleUser = SCDynamicStoreCopyConsoleUser(NULL, &uid, &gid); the string is empty, but not NULL. uid and gid are set properly. Any idea why this would happen? NB: it only happens from a LaunchAgent, for some reason; if I isolate the code in question and run it via the CLI, it works exactly as expected. And it only seems to happen for one person -- but for him, it happens on both Intel and Apple Silicon.

App & System Services Core OS System Configuration

5

0

1.2k

Sep ’22

Multiple instances of TransparentProxyProvider

I had this happen a long time ago, and I suspect that was due to the object not being released because of its own retained objects. But now it's happening again. I know this is happening because I logged the address of the object, and there are different values alternating in the log. So my questions really are: How can I prevent this? How can I detect this?

App & System Services Networking Network Extension

0

511

Sep ’22

Network Extension installation and multiple users

We have a network extension. It is bundled in an app, that is launched as a launch agent for each user. When doing the install, the installer bootstraps the agent for each currently-logged-in console user. When the agent runs, it checks to see if it is the current active console user, and if so, goes through the process of activating the extension. This part works fine. But... if the installation is done while two users [haven't tried more than 2, sorry] are simultaneously logged in, SysPrefs gets launched for both users. Is this expected behaviour?

App & System Services Networking Network Extension

4

0

792

Oct ’22

Getting the pid of a network extension

Yes, actual process ID: on upgrades, our network extension sometimes decides to become completely incommunicado as far as XPC is concerned -- any attempt to send an XPC message to it results in "couldn't communicate with a helper application" or similar. The only workaround I've been able to come up with is unloading and reloading the extension. It was suggested that I try killing it. Which, great, but... how would I get its pid? I do not at all feel comfortable launching pkill; I could get all the processes on the system and look for the name. But is there a way for the wrapping process to get the pid?

App & System Services Networking Network Extension

4

0

709

Oct ’22

Transparent network proxy ... stops?

I don't know how to go forward on this one: we have a test engineer who can, reliably, cause networking to simply stop working. Our app has 3 major components -- a proxy daemon, a containing UI app, and a network extension. Because I am lousy at using debuggers, the extension logs every single new flow it gets (to .debug), as well as a bunch more. When our engineer gets this problem, the proxy may crash a couple of times, but is still running; the extension is also still running, but no longer gets new flows. Networking outside the machine no longer works. But doing echo foo | nc 127.0.0.1 88 succeeds (or, at least, doesn't print any error -- and also doesn't get any log messages from the extension). I've got a sysdiagnose from it, as well as a bunch of logs, and all I can really see is that the proxy app restarted, and when it came back, it said there was no networking available. And that the extension stopped logging new flows at about the same time. I have not been able to reproduce this -- even though our engineer is using the same script I wrote to try to reproduce it, and he can, within an hour. (As opposed to my systems, which have been running for almost a day on both an M1 and Intel system.) Any ideas of things I should try looking for in the sysdiagnose?

App & System Services Networking Network Extension

2

0

1.1k

Nov ’22

kithrup

Post

Replies

Boosts

Views

Activity