Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
Do you think it would be possible to reconsider how these BSD socket APIs report these local network privacy restrictions with a more appropriate errno? Instead of this and other related BSD socket APIs returning errnos like EHOSTUNREACH, which have a pre-established meaning, would it be possible to instead return EPERM from these APIs?

I am considering filing this, and the other issue about the user switch causing the launchd daemon (running as root) to lose its local network permission, as two separate issues through Feedback Assistant later today. Would that be OK?
Mar ’25
Reply to sendto() system call - Nondeterministic "No route to host" due to local network restrictions
Hello Hoffman,

"While this seemingly is a bug somewhere in the macOS network stack, UDP apps absolutely need to be coded to expect packets to vanish with no errors reported to the sending app."

Right, that part about UDP is understood. The reason I opened this thread is to have this behaviour analyzed, to be certain that it isn't due to a bug (or bugs) in the local network restriction implementation, which is new in macOS 15.x. If it is indeed a bug, it would be good to have it addressed to prevent hard-to-debug issues like the one we are currently having.
Mar ’25
Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
Hello Quinn,

"Earlier you posted an excerpt from your launchd property list file, but not the whole thing. I’m specifically curious if: ... If this script, or anything in the chain between it and the Java code that’s calling sendto, is changing the user ID. For example, by calling setuid or setgid or using tools like su or sudo."

I had a look at the process launching code involved in this entire hierarchy. The flow is as follows. There's a launchd daemon process (launched through a plist file in /Library/LaunchDaemons) which acts as a task management "server". For additional context, it's a third-party framework implemented in C++ (so it's not a Java program itself). That launchd daemon receives regular requests to launch "tasks". You can think of a task as a process that this server launches on the same macOS host. An incoming task request can specify that the task needs to run as a specific user. The launchd daemon then creates a new process. If a user is specified, the newly launched process first identifies the "uid", "gid" and "grouplist" of the chosen user through the system calls getuid(), getgid() and getgroups() respectively. It then calls the setuid(), setgid() and setgroups() system calls to change its own user ID.

In our setup where we are noticing this issue, the launchd daemon (running as root) launches processes which then have their user changed to a different user. These processes, running as a different user, ultimately end up launching the java executable which, as part of the application code, ends up calling the sendto() system call.

"In general, a launchd daemon shouldn’t be troubled by local network privacy. That includes the daemon itself and any processes that it spawns. I’ve seen cases where that’s not the case but, at least so far, those are associated with daemon that change their user ID, either via the UserName property or explicitly. That can cause problems on macOS because macOS maintains a bunch of execution context beyond that standard BSD user and group IDs [1]."

So yes, it appears that switching the user is playing a role here. Having said that, is this a bug in macOS? The implementation of local network restrictions appears to have identified the launchd daemon, running as root, as the top level application against which (as per the local network documentation) it evaluates the permissions. Yet it appears to be tripped up by the user ID of the leaf (and intermediate?) processes when making this decision. Would you have any suggestion on how to get past this?
Mar ’25
Reply to sendto() system call - Nondeterministic "No route to host" due to local network restrictions
The C program provided as a reproducer in this thread was hand crafted to demonstrate the issue more easily. The real world code which reproduces this in a Java program is as trivial as the following:

    import java.net.*;

    public class LocalNetworkTest {
        public static void main(final String[] args) throws Exception {
            final byte[] hello = "hello".getBytes();
            try (final DatagramSocket ds = new DatagramSocket()) {
                final int arbitraryDestPort = 12345;
                final InetAddress destAddr = InetAddress.getByName("ff01::1");
                final DatagramPacket packet = new DatagramPacket(hello, hello.length, destAddr, arbitraryDestPort);
                System.out.println("attempting to send a packet to " + destAddr);
                ds.send(packet);
                System.out.println("successfully sent packet to " + destAddr);
                System.out.println("application will now do some unrelated work for a second");
                doSomeOtherWork();
                // send again
                System.out.println("application will now again attempt to send a packet to " + destAddr);
                ds.send(packet);
                System.out.println("(again) successfully sent packet to " + destAddr);
            }
        }

        private static void doSomeOtherWork() throws Exception {
            Thread.sleep(1000);
        }
    }

Save this code to a LocalNetworkTest.java file, then go to the Terminal (i.e. the macOS Terminal app) and run:

    java LocalNetworkTest.java

You should see output like the following:

    attempting to send a packet to /ff01:0:0:0:0:0:0:1
    successfully sent packet to /ff01:0:0:0:0:0:0:1
    application will now do some unrelated work for a second
    application will now again attempt to send a packet to /ff01:0:0:0:0:0:0:1
    Exception in thread "main" java.net.NoRouteToHostException: No route to host
        at java.base/sun.nio.ch.DatagramChannelImpl.send0(Native Method)
        at java.base/sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(DatagramChannelImpl.java:914)
        at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:871)
        at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:798)
        at java.base/sun.nio.ch.DatagramChannelImpl.blockingSend(DatagramChannelImpl.java:857)
        at java.base/sun.nio.ch.DatagramSocketAdaptor.send(DatagramSocketAdaptor.java:178)
        at java.base/java.net.DatagramSocket.send(DatagramSocket.java:593)
        at LocalNetworkTest.main(LocalNetworkTest.java:17)

Notice how the first attempt gives the impression that the send has succeeded, and then the second attempt (after a delay of 1 second within that program) fails with that "no route to host" exception. As I noted earlier, there are no pop-ups or notifications asking for permission to allow local network access. Plus, this was launched from the Terminal app, yet the local network restrictions seem to be applied, which goes against what the documentation at https://developer.apple.com/documentation/technotes/tn3179-understanding-local-network-privacy states:

    macOS automatically allows local network access by:
    ...
    Command-line tools run from Terminal
    ...
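For callers that would rather handle this failure than let it escape from main, here is a minimal sketch. It assumes the JDK keeps surfacing EHOSTUNREACH as java.net.NoRouteToHostException, as the stack trace above shows. The class SendOutcomeSketch, the Outcome enum and the methods classify and trySend are hypothetical names made up for this illustration, not part of the JDK or of our application.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.NoRouteToHostException;

// Hypothetical helper: all names here are illustrative only.
final class SendOutcomeSketch {

    enum Outcome { SENT, NO_ROUTE, OTHER_IO_ERROR }

    // Maps the failure (or null for "no failure") of a send attempt to an outcome.
    static Outcome classify(final IOException failure) {
        if (failure == null) {
            return Outcome.SENT;
        }
        if (failure instanceof NoRouteToHostException) {
            // On macOS 15.x this may be a local network restriction, not a routing problem.
            return Outcome.NO_ROUTE;
        }
        return Outcome.OTHER_IO_ERROR;
    }

    // Attempts one send and reports the outcome instead of throwing.
    static Outcome trySend(final DatagramSocket socket, final DatagramPacket packet) {
        try {
            socket.send(packet);
            return classify(null);
        } catch (IOException e) {
            return classify(e);
        }
    }

    public static void main(final String[] args) throws Exception {
        final byte[] hello = "hello".getBytes();
        try (DatagramSocket ds = new DatagramSocket()) {
            // Loopback is used here only so the sketch can run anywhere.
            final DatagramPacket packet = new DatagramPacket(
                    hello, hello.length, InetAddress.getLoopbackAddress(), 12345);
            System.out.println("first send: " + trySend(ds, packet));
            Thread.sleep(1000);
            System.out.println("second send: " + trySend(ds, packet));
        }
    }
}
```

Since the first send in the reproducer reports success and only the later one fails, a SENT outcome from a single call says nothing about whether the restriction applies; the classification is only meaningful for the attempts that do fail.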
Mar ’25
Reply to sendto() system call doesn't return an error even when there is one
"Whether that’s a bug or not depends on your context. And speaking of context, what’s the context here? I suspect that this is coming out of a test suite. Is that right? Or do you have real world code that’s running into problems because of this behaviour?"

The above native C code was hand crafted from a Java program which runs as part of our large test suite. Although that program is part of a test suite, the actual program is very trivial and matches what one would expect of a normal real world Java application.

This thread uses ENETDOWN as an example to demonstrate an issue where sendto() doesn't report errors promptly. After I opened this thread, I noticed a variant of this issue where the error starts getting reported on subsequent calls to sendto() if the program adds a small delay between the 2 calls. I explain that in a separate thread here: https://developer.apple.com/forums/thread/776630. I will add a Java program to that thread which demonstrates a real world usage where the current behaviour is problematic.

I'm sorry about these 2 duplicate threads, but I only realized later that the 2 issues are just variations of each other. I think it would be easier to continue this discussion in that other thread.
Mar ’25
Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
While at it, coming to this point:

"In this case, the sendto() is returning an EHOSTUNREACH error which is what is then propagated to the application. I would like to note that this (and one other error message that I'm investigating), I feel, are a bit misleading."

Do you think it would be possible to reconsider how these BSD socket APIs report these local network privacy restrictions with a more appropriate errno? Instead of this and other related BSD socket APIs returning errnos like EHOSTUNREACH, which have a pre-established meaning, would it be possible to instead return EPERM from these APIs? man errno states:

    1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.

This feels much closer to what these local network restrictions are about. Of course, returning EPERM from these APIs would also mean that the man pages of the relevant BSD functions, like connect() and sendto(), would have to be updated to state that this is now one of the errnos they can return. If that can be done, I think it would reduce the confusion and at the same time more accurately represent the nature of the failure.
Mar ’25
Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
Hello Quinn,

"Earlier you posted an excerpt from your launchd property list file, but not the whole thing. I’m specifically curious if: The launchd property list has a UserName property to run the job as a user other than root."

No, the plist file doesn't have a UserName. For reference, here's the complete plist file:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>Label</key>
        <string>polo-foobar</string>
        <key>ProgramArguments</key>
        <array>
            <string>/opt/marco-foobar/start-marco-foobar.bash</string>
        </array>
        <key>WorkingDirectory</key>
        <string>/cores</string>
        <key>RunAtLoad</key>
        <true/>
        <key>OnDemand</key>
        <false/>
        <key>EnvironmentVariables</key>
        <dict>
        </dict>
        <key>StandardErrorPath</key>
        <string>/var/log/foo-err.log</string>
        <key>StandardOutPath</key>
        <string>/var/log/foo-out.log</string>
        <key>KeepAlive</key>
        <true/>
        <key>HardResourceLimits</key>
        <dict>
            <key>Core</key>
            <integer>9223372036854775807</integer> <!-- RLIM_INFINITY -->
        </dict>
        <key>SoftResourceLimits</key>
        <dict>
            <key>Core</key>
            <integer>9223372036854775807</integer> <!-- RLIM_INFINITY -->
        </dict>
        <key>ProcessType</key>
        <string>Interactive</string>
    </dict>
    </plist>

"If this script, or anything in the chain between it and the Java code that’s calling sendto, is changing the user ID. For example, by calling setuid or setgid or using tools like su or sudo."

I will look into that part and get back with the details.

"[1] The very old, but still surprisingly relevant, TN2083 Daemons and Agents covers this in much more detail."

Yes, that's very good documentation. I've been reading through it the past few days.
Mar ’25
Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
"7384 is a process launched through launchd with the /opt/marco-foobar/start-marco-foobar.bash bash script file as the executable. When I say launchd, I may not be using the right term here. I think the /opt/marco-foobar/start-marco-foobar.bash bash script gets launched whenever that macos system starts (I will confirm that today)."

I've confirmed that this process is indeed launched by placing the plist file in the /Library/LaunchDaemons directory. That plist file, as noted in my previous post, contains:

    <key>ProgramArguments</key>
    <array>
        <string>/opt/marco-foobar/start-marco-foobar.bash</string>
    </array>

So the 7384 process that has been identified as the root top level app/process is a LaunchDaemon.
Mar ’25
Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
That "send0" is implemented by the JDK by invoking the sendto() system call. In this case, the sendto() is returning an EHOSTUNREACH error which is what is then propagated to the application. I would like to note that this (and one other error message that I'm investigating), I feel, are a bit misleading. When these issues showed up several weeks back, I and others started looking into them. At first we spent several days trying to understand whether this had something to do with the networking configuration on the host or on other devices. We had to involve some network admins from our lab to try to debug this. After several days of investigation, we came to realize that the "Local Networking" restrictions put in place in macOS 15.x are at play.
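To shorten that kind of investigation, an application could attach a hint to the error it logs. The sketch below is only a heuristic under stated assumptions: it treats multicast, link-local and site-local destinations as "probably local network", which is my approximation and not the definition macOS uses, and it relies on the os.name system property containing "mac" on macOS. The class LocalDestinationHint and its methods are hypothetical names for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical diagnostic helper: names and heuristic are assumptions.
final class LocalDestinationHint {

    // Heuristic only: multicast, link-local and site-local addresses are
    // treated as local network destinations.
    static boolean looksLocal(final InetAddress address) {
        return address.isMulticastAddress()
                || address.isLinkLocalAddress()
                || address.isSiteLocalAddress();
    }

    // Convenience overload for IP literals such as "ff01::1".
    // Pass literals only; a host name would trigger a DNS lookup.
    static boolean looksLocal(final String ipLiteral) {
        try {
            return looksLocal(InetAddress.getByName(ipLiteral));
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a usable IP literal: " + ipLiteral, e);
        }
    }

    // Builds an extra log line for a "No route to host" failure.
    static String hint(final String osName, final InetAddress destination) {
        final boolean mac = osName != null
                && osName.toLowerCase(Locale.ROOT).contains("mac");
        if (mac && looksLocal(destination)) {
            return "No route to host for " + destination
                    + ": on macOS 15.x this may be a local network privacy"
                    + " restriction rather than a routing problem";
        }
        return "No route to host for " + destination;
    }

    public static void main(final String[] args) throws Exception {
        final InetAddress dest = InetAddress.getByName("ff01::1");
        System.out.println(hint(System.getProperty("os.name"), dest));
    }
}
```

A hint like this does not change what sendto() returns; it only points whoever reads the log at the local network restrictions before they start checking routing tables.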
Mar ’25
Reply to Process with equal instances but unequal identities
"I asked about that internally and we suspect that it’s caused by one of these internal identities being cloned with an override of the audit user ID being set to 0. However, I’m not able to take this further at this time. The runningboardd implementation is complex and to learn more I’d need to consult with the folks who maintain that, and I don’t want to do that until I know more about your side of this equation."

Thank you Quinn for looking deeper into this. I have been trying to reproduce this locally but so far I haven't been able to. This is still on my TODO list to investigate and I hope to get back shortly with additional details when I have them. Just to be clear, the presence of this log message does indicate that there is some genuine issue that is worth investigating, right?
Mar ’25
Reply to macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"
Here's what I understand of the log messages that I pasted in my previous reply. Remember that the process ID of interest is 58700. In the log above, you will notice that the only place where process ID 58700 appears is the line:

    cfprefsd [0x74c933e80] activating connection: mach=false listener=false peer=true name=com.apple.cfprefsd.daemon.peer[58700].0x74c933e80

That line isn't too interesting and I don't know if it's relevant to this discussion, so I'll skip it. The good thing, however, is that these logs appear to have captured enough detail about the implementation of the "Local Network" restriction. Several of those log messages have something interesting, and I think they can be summarized by these few:

    UserEventAgent Got local network blocked notification: pid: 7384, uuid: 4E7709E7-AD5C-38B8-9ED0-0354767877BD, bundle_id: (null)
    UserEventAgent LocalNetwork: found bundle id marco-foobar by PID
    UserEventAgent LocalNetwork: did not find bundle ID for UUID 4E7709E7-AD5C-38B8-9ED0-0354767877BD
    UserEventAgent Found bundle ID: marco-foobar
    nehelper application record search init. Node: (null) bundleID: <private> itemID: 0
    ...
    708 info 2025-03-13 09:53:42.446497 +0000 runningboardd _executablePath = /opt/marco-foobar/start-marco-foobar.bash
    708 info 2025-03-13 09:53:42.446500 +0000 runningboardd no additional launch properties found for <private>
    708 default 2025-03-13 09:53:42.446543 +0000 runningboardd _resolveProcessWithIdentifier pid 7384 euid 0 auid 0
    708 default 2025-03-13 09:53:42.446584 +0000 runningboardd Resolved pid 7384 to [osservice<polo-foobar>:7384]
    ...
    runningboardd [osservice<polo-foobar>:7384] is not RunningBoard jetsam managed.
    runningboardd [osservice<polo-foobar>:7384] This process will not be managed.
    runningboardd PERF: Received request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null)): lookupProcessName:error:
    nehelper No team ID found for (bundleID: marco-foobar, name: marco-foobar)

So when the java program (in process 58700) initiated the sendto() local network operation, it appears to have triggered a "local network blocked notification". I'm not sure if that log line from UserEventAgent means that the notification pop-up (asking the user to allow/disallow the operation) was generated, or whether it is determining if the pop-up needs to be generated. In any case, that log message indicates that the pid is 7384. I then looked at the additional data I collected from that system and at the documentation of "Local Networking", which states:

"When a process performs a local network operation, macOS tries to track down the responsible code. For example, if your app spawns a helper tool and the helper tool performs a local network operation, macOS considers the app to be the responsible code."

So what that log tells me is that the local network restriction has identified the 7384 process as the top level root application for the 58700 java process which is doing the networking operation. So I went back and looked into the process and launch hierarchy on this setup.

7384 is a process launched through launchd with the /opt/marco-foobar/start-marco-foobar.bash bash script file as the executable. When I say launchd, I may not be using the right term here. I think the /opt/marco-foobar/start-marco-foobar.bash bash script gets launched whenever that macOS system starts (I will confirm that today). The plist file which configures this bash script as the entry point looks something like:

    <plist version="1.0">
    <dict>
        <key>Label</key>
        <string>polo-foobar</string>
        <key>ProgramArguments</key>
        <array>
            <string>/opt/marco-foobar/start-marco-foobar.bash</string>
    ...

and launchctl list on that system shows:

    PID Status Label
    ...
    7384 0 polo-foobar
    ...

The /opt/marco-foobar/start-marco-foobar.bash script ultimately ends up doing a:

    ...
    exec /opt/polo/bin/polo-foobar

polo-foobar is a 3rd party binary and doesn't have any plist file of its own:

    launchctl plist /opt/polo/bin/polo-foobar
    64-bit Mach-O does not have a __TEXT,__info_plist or is invalid.

polo-foobar is a generic (3rd party) framework which allows application specific code to be executed. In this case, it ends up launching a java process which then further launches (through Java's ProcessBuilder API https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/ProcessBuilder.html#start()) the ultimate 58700 java process.

So given all this, it seems to me that the entry point process /opt/marco-foobar/start-marco-foobar.bash (which "exec"ed the polo-foobar binary) is being denied access to the network operation. Did I understand it right? If so, how do I go about addressing this issue? The local networking document states:

"If you ship a launchd agent that’s not installed using SMAppService, make macOS aware of the responsible code by setting the AssociatedBundleIdentifiers property in your launchd property list."

Does that mean I need to add the AssociatedBundleIdentifiers property to the plist snippet that I shared previously? What value (values?) would I add to it? Furthermore, these processes are running on a system which acts like a "server" and there's no user interaction involved. So what are the options for making this "allow this process (sequence) access to networking operations" non-interactive and configurable/automatable?

While at it, I would like to note that the output of ps -Meo pid,pcpu,cputime,start,pmem,vsz,rss,state,wchan,user,args for this top level 7384 process, which seems to have been denied network access, shows that it is running as root:

    USER PID TT %CPU STAT PRI STIME UTIME COMMAND PID %CPU TIME STARTED %MEM VSZ RSS STAT WCHAN USER ARGS
    root 7384 ?? 0.0 S 20T 0:00.03 0:00.55 /opt/ma 7384 0.0 73:40.19 1Mar25 0.4 412145952 58976 S - root /opt/polo/bin/polo-foobar

So this local network access denial for this process seems to go against what the local networking documentation states:

    macOS considerations
    macOS maintains separate local network privacy state for each user account. macOS automatically allows local network access by:
        Any daemon started by launchd
        Any program running as root
        ...

Overall this feels far too complicated to manage and, if I understand it correctly, none of these issues has to do with java itself. I can imagine the exact same launch sequence causing a go (or even python) application, which uses that language's standard networking APIs, to run into this same thing. Have I misunderstood this?

While at it, I would also like to understand if those log messages show any other issues that might need to be addressed.
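When collecting this kind of launch hierarchy from inside the JVM instead of through ps, the standard ProcessHandle API (Java 9 and later) can walk the parent chain. The sketch below is only a diagnostic aid: the parent chain is not necessarily what macOS uses to determine the "responsible code", and the class AncestrySketch, its methods and the depth limit of 64 are hypothetical choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical diagnostic helper: prints the parent chain of the current JVM.
final class AncestrySketch {

    // Arbitrary safety limit so a surprising process table cannot loop forever.
    private static final int MAX_DEPTH = 64;

    // Returns the given process followed by its parent, grandparent and so on,
    // as far as the OS lets this process see them.
    static List<ProcessHandle> ancestry(final ProcessHandle start) {
        final List<ProcessHandle> chain = new ArrayList<>();
        Optional<ProcessHandle> current = Optional.of(start);
        while (current.isPresent() && chain.size() < MAX_DEPTH) {
            chain.add(current.get());
            current = current.get().parent();
        }
        return chain;
    }

    // One line per process: pid, owning user and executable, where available.
    static String describe(final ProcessHandle handle) {
        final ProcessHandle.Info info = handle.info();
        return "pid=" + handle.pid()
                + " user=" + info.user().orElse("?")
                + " command=" + info.command().orElse("?");
    }

    public static void main(final String[] args) {
        for (ProcessHandle handle : ancestry(ProcessHandle.current())) {
            System.out.println(describe(handle));
        }
    }
}
```

Run from the leaf java process, this would list the chain up to the launchd job (7384 in the logs above) along with the user each process runs as, which is the information that turned out to matter in this investigation.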
Mar ’25