Reply to When updating a VPN app with `includeAllNetworks`, the newer instance of the packet tunnel is not started via on-demand rules
Thank you for the timely response! I added an extra log call early in the constructor of our packet tunnel provider class, and saw no logs. We also tried a dummy implementation of a packet tunnel and saw the same symptoms. We suspected that it could be due to on-demand rules, but if we disable those and try to update the app whilst our tunnel is connected, we get the same results. Not that we'd be happy to use a workaround that relies on not using on-demand rules anyway. The issue number is FB16482585, and I've now updated it with a sysdiagnose that was collected with the appropriate VPN logging profile installed. Not that we could do anything about the system not finding any route to the internet with includeAllNetworks set and the packet tunnel being shut down, but should a packet tunnel provider ever expect stopTunnel to be called with the appUpdate stop reason? Whilst on the topic of calls we've never seen in the wild, are there any circumstances in which we should be seeing a call to sleep? If the latter deserves a separate thread, I'll make one :)
Feb ’25
Reply to When updating a VPN app with `includeAllNetworks`, the newer instance of the packet tunnel is not started via on-demand rules
It seems that now, on Xcode 26/iOS 26, we can at least update the app via Xcode. Of course, updates via TestFlight or the App Store are still not working. Since updating via Xcode is fixed, I can no longer trivially produce a minimal reproducible Xcode project for this bug. Thus I have a question: should I attempt to do so anyway (implementing a minimum viable PacketTunnel), or is there no point? Would embedding WireGuard constitute a minimum viable project for reproducing the bug? Whilst Claude et al are cool and all, I really don't feel like implementing another virtual networking stack or a VPN implementation entirely in Swift to be able to get technical support on this. Then again, maybe technical support won't be of much help either? Happy new year and best regards, Emīls
Jan ’26
Reply to VPN profile corruption
[quote='871173022, DTS Engineer, /thread/811445?answerId=871173022#871173022'] Is this macOS or iOS? [/quote] Sorry for not specifying earlier, it's iOS. It has been an issue since at least iOS 16. [quote='871173022, DTS Engineer, /thread/811445?answerId=871173022#871173022'] What about restarting the device? [/quote] Restarting the device sometimes helps, according to emails from our users. [quote='871173022, DTS Engineer, /thread/811445?answerId=871173022#871173022'] Can you share those bug numbers? [/quote] The bug number is FB17057908.
3w
Reply to VPN profile corruption
That is a sysdiagnose from almost 2 years ago, where we caught the bug on a colleague's device. It is the real deal. As far as signal handlers go, we are doing our best to avoid them, but it might be the case that the signal handlers are being executed on threads that are attempting to run Go code. That is a problem because the Go runtime implements cooperative thread scheduling where each thread gets a tiny stack. Such a thread would most definitely crash if a signal handler was invoked on top of it, as some goroutine stacks are bound to be smashed. This is, however, a detour, I believe. I do not believe that the packet tunnel profile corruption (?) is an issue that is specific to Go or to our implementation of the tunnel. Further, it is my understanding that if our tunnel process is receiving signals, it is probably already too late. Why Go? Because the first WireGuard implementation that wasn't a Linux kernel module was implemented in Go. Just for the record, we are in the process of moving away from a WireGuard implementation in Go to one written in Rust.
2w
Reply to VPN profile corruption
I have to say, I and everyone else on my team are very happy to see you analyse the dump of logs there. We had tried to read them many times, but we did not manage to extract the insight you have. Thank you so much for the help. We will get rid of Go and its weird stacks, but as per the docs (https://pkg.go.dev/os/signal#hdr-Non_Go_programs_that_call_Go_code) we might be installing some signal handlers inadvertently. However, the original signal handlers should still be invoked, as per the docs. For reference, we build the Go library with buildmode c-archive. Even so, as per the docs, any time a signal is delivered, it should be handled by the original signal handler, unless it was invoked on a goroutine. I would not be surprised if using Go in an iOS app was unsafe at any speed. [quote] So, something in your process is doing a synchronous blocking read on a pipe and that’s blocked indefinitely. And we can’t figure out what it is because the spindump isn’t showing the user-space component of the backtrace. [/quote] Given that it is the SIGPIPE handler that is giving us issues, this might well be the root cause. Difficult to verify, of course. It wouldn’t surprise me if this were a signal handler because we had similar issues on Android. And we're not the only ones using wireguard-go, and we're not the only ones experiencing these issues on iOS either. If it is a wonky interaction between the signal handlers, how come this state reliably reproduces once the bug is hit? I would expect the wonkiness to be a problem at all times. Seemingly, this is almost always fixed by logging out of our app and logging back in, with the significant operation there being that logging out removes the VPN profile. Is it the update process that is desperately trying to send a specific signal to the pre-update VPN process that is blocking the new one from starting? Further, your analysis does shed light on some crash reports we've seen that originate in PluginKit. We were very confused since we did not explicitly use PluginKit. Sorry, I think my questions are beginning to become too academic. I think our best bet now is to migrate away from the Go library and use something that will not override signal handlers willy-nilly. Again, thank you for spending your time on this; it was immensely helpful, even if the bugs we're seeing still need to be resolved.
2w
Reply to VPN profile corruption
I've since gone and uploaded yet another sysdiagnose where we seemingly see the same thing happen, this time on every reinstall from Xcode. In this case, once the newly installed packet tunnel starts, all networking on the device is broken, and this seemingly reproduces reliably. However, it doesn't happen by default; it usually starts happening again after about a day of development without restarting the device. I've barely had time to look at the sysdiagnose, but from a cursory look there are plenty of No route to host errors, which would explain why the packet tunnel is not able to connect. Why is there no route to host? Who knows. At the time of installation, the VPN profile of the app I'm developing is set to be used on-demand, and it is set to include all routes (0.0.0.0/0 and ::/0). We are not using enforceRoutes or includeAllNetworks due to bugs. It seems that there is a discrepancy between the two packet tunnel instances (the old one and the new one), and one of them is desperately trying to work whilst the other one is the one that is allowed to send traffic, i.e. the routes are set up to route traffic into one instance but the system is routing traffic into the other. The phone had another VPN profile installed, but the bug reproduces even when that profile is not there. I am not posting this with the expectation that you will do another deep dive, as much as we'd appreciate it. I'm just posting here so that maybe someone else who is encountering similar issues ends up seeing something documented.
1w