I have to say, everyone on my team and I are very happy to see you analyse the dump of logs there. We had tried to read them many times, but we did not manage to extract the insight you have. Thank you so much for the help.
We will get rid of Go and its weird stacks, but as per the docs (https://pkg.go.dev/os/signal#hdr-Non_Go_programs_that_call_Go_code) we might be installing some signal handlers inadvertently. However, the original signal handlers should still be invoked, as per the docs. For reference, we build the Go library with -buildmode=c-archive. Even so, as per the docs, any time a signal is delivered, it should be handled by the original signal handler, unless it was delivered on a goroutine. I would not be surprised if using Go in an iOS app was unsafe at any speed.
So, something in your process is doing a synchronous blocking read on a pipe and that’s blocked indefinitely. And we can’t figure out what it is because the spindump isn’t showing the user-space component of the backtrace.
Given that it is the SIGPIPE handler that is giving us issues, this might well be the root cause. Difficult to verify, of course. It wouldn’t surprise me if this were a signal handler problem, because we had similar issues on Android. And we're not the only ones using wireguard-go, and we're not the only ones experiencing these issues on iOS either.
If it is a wonky interaction between the signal handlers, how come this state reliably reproduces once the bug is hit? I would expect the wonkiness to be a problem at all times. Seemingly, this is almost always fixed by logging out of our app and logging back in, with the significant operation there being that logging out removes the VPN profile. Is it the update process that is desperately trying to send a specific signal to the pre-update VPN process that is blocking the new one from starting?
Further, your analysis does shed light on some crash reports we've seen that originate in PluginKit. We were very confused since we did not explicitly use PluginKit.
Sorry, I think my questions are beginning to become too academic. I think our best bet now is to migrate away from the Go library and use something that will not override signal handlers willy-nilly.
Again, thank you for spending your time on this. It was immensely helpful, even if the bugs we're seeing still need to be resolved.