Reply to Debugging a crashing ksh?
Just for the record, I wasn't able to get the debug symbols for the ksh shipped in macOS. Fortunately, the issue was also reproducible on upstream ksh, whose source is available. It was then reported and fixed in https://github.com/ksh93/ksh/issues/591. A feedback issue has also been filed with Apple so that this fix gets pulled into Apple's version of ksh. The feedback ID is FB11941810.
Topic: App & System Services SubTopic: Core OS Tags:
Feb ’23
Reply to syslogd - out-of-box bsd_out module sends UDP packets to non-existent destination socket?
Yeah, I just tried that myself and I’m having the same problem [1]. The good news is that things get rendered correctly on our side, both in Radar and our internal view of Feedback Assistant. Still, this is annoying and I encourage you to file a bug. Make sure to include screen shots of what the Safari window looked like just before you submitted the bug and how the bug gets rendered in the end.

Hello Quinn, I've now submitted FB12022556 to track this rendering issue. I've attached relevant screenshots to that issue.
Topic: App & System Services SubTopic: Core OS Tags:
Mar ’23
Reply to UDP socket bind with ephemeral port on macos results in OS allocating a already bound/in-use port
Hello enodev,

How do you create/bind the socket?

We use low-level system calls. The udp4 socket creation/bind looks something like:

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        printf("Failed to open socket: %d\n", errno);
        return -1;
    }
    SOCKETADDRESS sa;
    memset((char *)&sa, 0, sizeof(SOCKETADDRESS));
    sa.sa4.sin_family = AF_INET;
    sa.sa4.sin_port = 0;
    sa.sa4.sin_addr.s_addr = htonl(0x0); // bind to wildcard
    socklen_t len = sizeof(sa.sa4);
    int res;
    res = bind(fd, &sa.sa, len);
    if (res < 0) {
        printf("Failed to bind: %d\n", errno);
        return -2;
    }

The udp46 socket creation/bind looks similar, except for the additional setsockopt call that disables IPV6_V6ONLY on that socket:

    int fd;
    fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd < 0) {
        printf("Failed to open socket: %d\n", errno);
        return -1;
    }
    // mark it as dual socket
    int ipv6_only = 0; // dual socket
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &ipv6_only, sizeof(int)) < 0) {
        return -2;
    }
    SOCKETADDRESS sa;
    memset((char *)&sa, 0, sizeof(SOCKETADDRESS));
    sa.sa6.sin6_family = AF_INET6;
    sa.sa6.sin6_port = 0;
    char caddr[16];
    memset((char *)caddr, 0, 16); // wildcard
    memcpy((void *)&sa.sa6.sin6_addr, caddr, sizeof(struct in6_addr));
    socklen_t len = sizeof(sa.sa6);
    int res;
    res = bind(fd, &sa.sa, len);
    if (res < 0) {
        printf("Failed to bind: %d\n", errno);
        return -3;
    }

Both sockets above are owned by PID 89318, what PID is that?

Please ignore that process id. As I noted in the original description, this process id belongs to the reproducer code that we ran to reproduce this issue. In this case, the reproducer first creates a udp4 socket and binds it to an ephemeral port (and lets it stay bound till the program exits), and then creates a udp46 socket and binds it to an ephemeral port. Occasionally (but consistently), this program ends up with the already in-use port being assigned by the udp46 socket bind call.

So in this case the process ids are the same, but in reality (and in production) the processes are different and unrelated. For example, we have noticed that when one of our programs binds a udp46 socket to an ephemeral port, it gets assigned a port that is already in use by the system's syslogd process, which has it bound as udp4.
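To make the collision check concrete, here is a minimal sketch in C of how the port assigned by each bind call could be read back with getsockname and compared. The helper name `bind_ephemeral_udp` is hypothetical and not part of the original reproducer, which is not shown in full here; the sketch uses `struct sockaddr_storage` in place of the `SOCKETADDRESS` union from the post.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): bind a UDP socket of the
 * given family (AF_INET or AF_INET6) to the wildcard address and an ephemeral
 * port. AF_INET6 sockets are made dual-stack by clearing IPV6_V6ONLY.
 * On success returns the fd and stores the kernel-assigned port (host byte
 * order) in *port_out; returns -1 on any failure. */
int bind_ephemeral_udp(int family, uint16_t *port_out) {
    assert(port_out != NULL);
    if (family != AF_INET && family != AF_INET6) return -1;

    int fd = socket(family, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_storage ss;
    memset(&ss, 0, sizeof(ss)); /* zeroed address = wildcard, port 0 = ephemeral */
    socklen_t len;
    if (family == AF_INET6) {
        int v6only = 0; /* dual-stack: accept IPv4 traffic too */
        if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &v6only, sizeof(v6only)) < 0) {
            close(fd);
            return -1;
        }
        ((struct sockaddr_in6 *)&ss)->sin6_family = AF_INET6;
        len = sizeof(struct sockaddr_in6);
    } else {
        ((struct sockaddr_in *)&ss)->sin_family = AF_INET;
        len = sizeof(struct sockaddr_in);
    }
    if (bind(fd, (struct sockaddr *)&ss, len) < 0) {
        close(fd);
        return -1;
    }

    /* Ask the kernel which port it picked. */
    len = sizeof(ss);
    if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = (family == AF_INET6)
        ? ntohs(((struct sockaddr_in6 *)&ss)->sin6_port)
        : ntohs(((struct sockaddr_in *)&ss)->sin_port);
    return fd;
}
```

Calling this once with AF_INET, keeping that socket open, and then calling it with AF_INET6 in a loop would show the symptom described above whenever both calls report the same port.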
Apr ’23
Reply to JPackage signing leaves app unusable since updating to Ventura
@javadev12345, since you note that this happens with jpackage and is reproducible even with the recently released Java 20, I would recommend that you open an issue at https://bugreport.java.com/bugreport/start_form with all relevant details, including the commands that you use and whether this is macOS on x64 or M1, so that someone from the jpackage team can take a look. I remember that in the past there was at least one similar issue, which I think was addressed in https://bugs.openjdk.org/browse/JDK-8276150 and https://bugs.openjdk.org/browse/JDK-8277493. This could be a different variant of the issue though.
Topic: Code Signing SubTopic: General Tags:
Apr ’23
Reply to Out-of-band data returned by recv() and read() on socket bound to non-loopback address even when SO_OOBINLINE is disabled
Hello Quinn, I don't have data on whether applications still rely on out-of-band data or whether applications still send it. However, given that this is an option at the socket level, the JDK, as part of the Java specification, exposes an API on its java.net.Socket class to allow applications to enable or disable this option: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/net/Socket.html#setOOBInline(boolean). As part of verifying that the Java API works as expected (on all operating systems), we run tests for that API. In fact, that's what prompted me to look into this issue when we noticed the test fail (only) on macOS. For context, here's the JDK issue where we have been tracking this: https://bugs.openjdk.org/browse/JDK-8279920. Specifically, if some peer ends up sending out-of-band data (for whatever reason), then even if the application has decided it isn't interested in out-of-band data (which is the default), the application can still end up receiving this out-of-band data unexpectedly.
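For readers unfamiliar with the option, here is a minimal C sketch of the behaviour being tested. It is not the reproducer attached to the feedback issue; the helper name `recv_after_oob` is hypothetical, and it uses a loopback connection purely for simplicity, whereas the thread title says the problem shows up on sockets bound to a non-loopback address. The sender writes one urgent byte followed by "hello", and the receiver does a plain recv().

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: over a loopback TCP connection, send one urgent
 * (out-of-band) byte '!' followed by the normal data "hello", then do a
 * plain recv() on the accepting side with SO_OOBINLINE set to oob_inline.
 * Returns the number of bytes recv() delivered into buf, or -1 on error. */
ssize_t recv_after_oob(int oob_inline, char *buf, size_t buflen) {
    assert(buf != NULL);
    ssize_t n = -1;
    int lfd = -1, cfd = -1, afd = -1;
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);

    /* Listener on 127.0.0.1 with a kernel-chosen port. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) goto done;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(lfd, (struct sockaddr *)&sa, sizeof(sa)) < 0) goto done;
    if (listen(lfd, 1) < 0) goto done;
    if (getsockname(lfd, (struct sockaddr *)&sa, &len) < 0) goto done;

    /* Connect a client and accept it on the server side. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto done;
    if (connect(cfd, (struct sockaddr *)&sa, sizeof(sa)) < 0) goto done;
    afd = accept(lfd, NULL, NULL);
    if (afd < 0) goto done;

    /* Configure the receiver before any data is sent. */
    if (setsockopt(afd, SOL_SOCKET, SO_OOBINLINE, &oob_inline, sizeof(oob_inline)) < 0) goto done;

    if (send(cfd, "!", 1, MSG_OOB) != 1) goto done; /* urgent byte */
    if (send(cfd, "hello", 5, 0) != 5) goto done;   /* normal data */

    n = recv(afd, buf, buflen, 0); /* no MSG_OOB flag: normal data only */
done:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return n;
}
```

With `oob_inline` set to 0, the receiver should see only "hello". A '!' showing up in `buf` would correspond to the unexpected delivery described above.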
May ’24
Reply to Out-of-band data returned by recv() and read() on socket bound to non-loopback address even when SO_OOBINLINE is disabled
Thank you Quinn for your help so far. I have now filed FB13799990 to track this issue and attached the same reproducer to it.

On a different note, during the past couple of years I have filed a few issues through Feedback Assistant, two of which belong to the networking area. They are still open and haven't received any acknowledgement or response. In some other channels, I have been told (by people who aren't from Apple) that when issues in Feedback Assistant don't see any response, it most likely means that they haven't been triaged, and the suggestion is to refile them afresh. I don't know if I should be doing that. The feedback IDs for those other issues are FB12128351, FB12016446 and FB9997771. Would you have any input on whether I should refile these other issues, or just leave them alone and hope someone responds to them?
May ’24
Reply to Out-of-band data returned by recv() and read() on socket bound to non-loopback address even when SO_OOBINLINE is disabled
So, that last one FB9997771 was reported as fixed in the 2022 OS rereleases (so, macOS 13 and friends).

Thank you for that detail, Quinn. I am very happy to hear that that one is officially fixed. It had caused really odd intermittent failures in unexpected areas within the JDK and had taken us a long time to narrow down. I will run our tests against macOS 13 and higher to verify the fix.

That should’ve been communicated to you but wasn’t. I’m not sure why. I’ll follow-up on that internally, just for my own understanding, but this is all about Apple internal processes so I probably won’t post any more details here.

I understand.

The other two are still under investigation; I don’t have any further info to share.

Thank you for that update.

Bug Reporting: How and Why? has a bunch of hints and tips on this front, but probably the best important is this one: If you’re filing a bug against an API, choose Developer Technologies & SDKs at the top level.

This is useful. So far I've been filing under the "Something else not in this list" category for several of the bugs that I've opened relating to either network APIs or kernel APIs, including the one that I opened in this discussion. Henceforth, I'll keep that category in mind.

Overall, thank you again for all the help and responses you have been providing, not just in this thread but in previous discussions too. Not receiving any updates or responses on Feedback Assistant issues is demotivating, but seeing the responses in the developer forums here, and being assured that the Feedback Assistant issues have been noticed and are being investigated, does encourage filing new ones.
May ’24
Reply to Out-of-band data returned by recv() and read() on socket bound to non-loopback address even when SO_OOBINLINE is disabled
Hello Quinn,

So, that last one FB9997771 was reported as fixed in the 2022 OS rereleases (so, macOS 13 and friends). Thank you for that detail, Quinn. I am very happy to hear that that one is officially fixed. That had caused really odd intermittent failures in unexpected areas within the JDK and had taken us a long time to narrow it down. I will run our tests against macOS 13 and higher to verify the fix.

I ran our reproducer against macOS 12.x, 13.x and 14.x on aarch64. I can confirm that the issue is fixed and no longer reproducible on 13.x and 14.x. The issue continues to reproduce on 12.x.
May ’24
Reply to UDP socket bind with ephemeral port on macos results in OS allocating a already bound/in-use port
Hello Marten,

Is the issue public? I'm getting a "Feedback Not Found" under this link.

The issue isn't public. None of the issues filed with Feedback Assistant are public. It's still an open issue and we very regularly run into this. I have been told in a different discussion that the issue is being investigated by Apple. There's no fix for it right now.

This bug has cost me a few days of debugging work to track down flaky test failures in quic-go. It also seems to be the root cause behind https://github.com/golang/go/issues/67226.

I am not from Apple, but my recommendation would be to file a Feedback Assistant issue of your own with these details (and any other details) so that this gets additional attention. While filing that issue, I would recommend following Quinn's suggestions here: https://forums.developer.apple.com/forums/thread/751587?answerId=787971022#787971022 (specifically, choose Developer Technologies & SDKs at the top level when filing the issue).

P.S.: I didn't receive any notification from this thread when you posted your message. I only happened to view this thread today by accident and noticed your post.
Jul ’24
Reply to What is the command to list all socket filters/extensions in use?
Hello Quinn,

I am in the middle of investigating an issue arising in the call to setsockopt syscall where it returns an undocumented and unexpected errno.

What’s that value?

On a bound IPv4 SOCK_DGRAM socket, a subsequent setsockopt call for the IP_ADD_MEMBERSHIP option returns -1 with errno set to 8, which gets reported as "Exec format error". The setsockopt reproducer is trivial:

    #include <netinet/in.h>
    #include <stdio.h>
    #include <errno.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>

    int main(int argc, char *argv[]) {
        if (argc != 3) {
            fprintf(stderr, "Error, expected usage: <program> <multicast-ip-address> <network-interface-ip-address>\n");
            fprintf(stderr, "example usage: ./a.out 225.4.5.6 192.168.1.2\n");
            return -1;
        }
        char *mcast_join_group_addr = argv[1];
        char *network_intf_addr = argv[2];
        fprintf(stderr, "test will join multicast group address = %s of network interface address = %s\n", mcast_join_group_addr, network_intf_addr);

        // create a datagram IPv4 socket
        int type = SOCK_DGRAM;
        int domain = AF_INET;
        int fd = socket(domain, type, 0);
        if (fd < 0) {
            fprintf(stderr, "FAILED to create socket, errno %d - %s\n", errno, strerror(errno));
            return -1;
        }
        fprintf(stderr, "SOCK_DGRAM socket created, fd=%d\n", fd);

        // bind the socket to a wildcard address and ephemeral port
        struct sockaddr_in sa;
        memset((char *) &sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = 0;
        // bind to wildcard
        inet_pton(AF_INET, "0.0.0.0", &(sa.sin_addr.s_addr));
        socklen_t len = sizeof(sa);
        int b = bind(fd, (struct sockaddr *) &sa, len);
        if (b < 0) {
            fprintf(stderr, "failed to bind: errno=%d - %s\n", errno, strerror(errno));
            return -1;
        }
        fprintf(stderr, "socket successfully bound\n");

        // set IP_ADD_MEMBERSHIP socket option on the socket
        struct ip_mreq mreq;
        // multicast group address
        inet_pton(AF_INET, mcast_join_group_addr, &(mreq.imr_multiaddr.s_addr));
        // interface IP address
        inet_pton(AF_INET, network_intf_addr, &(mreq.imr_interface.s_addr));
        int opt = IP_ADD_MEMBERSHIP;
        void *optval = (void *) &mreq;
        int optlen = sizeof(mreq);
        fprintf(stderr, "setting IP_ADD_MEMBERSHIP on socket\n");
        int n = setsockopt(fd, IPPROTO_IP, opt, optval, optlen);
        if (n < 0) {
            fprintf(stderr, "FAILED - setsockopt(IP_ADD_MEMBERSHIP) returned %d with errno %d - %s\n", n, errno, strerror(errno));
            close(fd);
            return -1;
        }
        close(fd);
        fprintf(stderr, "SUCCESSFUL completion of the test\n");
    }

The fact that the errno is set to (or at least interpreted as) ENOEXEC is surprising, since man setsockopt makes no mention of that error for this call.

My guess is that some specific filter/extension code gets run through the setsockopt syscall. Reading through https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/NKEConceptual/socket_nke/socket_nke.html I suspected it could be some socket filter.

This issue was first reported to the JDK team around a decade ago (https://bugs.openjdk.org/browse/JDK-8144003), but it's only recently that we have started noticing it more frequently in our setups. It could be something to do with our macOS hosts, but at this point I don't know which tools, commands or options I should be using to understand what code within setsockopt is interfering here. Would you happen to know of any tracing (ktrace?) that might help narrow this down further? The system logs (viewed through the Console app) haven't shown anything specific.

Having said that, the netstat output you posted makes it clear that all the filters currently attached were attached by the OS. You are not dealing with third-party code here.

That's good to know. In the context of multicasting (or more specifically the setsockopt IP_ADD_MEMBERSHIP option), do these OS-attached filters play any role or apply any specific rules that I should be aware of?
Oct ’24
Reply to What is the command to list all socket filters/extensions in use?
Hello Quinn,

To be clear, that’s most definitely a bug. I encourage you to file a bug report about it, even if you can’t reproduce it yourself. Ideally that bug report would include a sysdiagnose log taken by one of your users just after that reproduce it. Please post your bug number, just for the record.

I have now created a bug through Feedback Assistant. The ID is FB15368430. I've attached the trivial .c code, which reproduces this on several of the hosts that I have run it against. For now, I don't have access to sysdiagnose output. I will check whether that can be shared from one of the hosts on which we reproduce this issue, and I will upload it to the feedback issue once I get access to those logs.

Now, I can’t guarantee that the solution to that bug might be that we add ENOEXEC to the man page, but someone from the networking team needs to make that call.

Understood. In fact, more than the man page update, once the relevant team finds the root cause of this issue, what would be useful is details of what exactly causes this error, advice to application developers (those who call setsockopt) on what is expected of them to address it, or advice to network administrators on how to fix or update their configurations to prevent it. For the record, during investigation of this issue, I've experimented with retrying the setsockopt call when it fails with this errno, just to see if there is some kind of race. But that hasn't helped: the subsequent call returns the same error.

Is the uptick you mentioned correlated with macOS 15’s release? The firewall got a major rework in that release.

I can say for certain that this issue isn't related to the macOS 15 release. None of the reports (including those for hosts that run in our internal setups) have been against this version. In fact, we haven't yet started using this version in our setup.

Have you looked to see if reports of the problem are always from those folks using the firewall?

With help from admins who have access to some of these hosts, I know that some of the hosts on which this issue reproduces have the firewall disabled. Specifically, on those hosts, "Settings" -> "Network" -> "Firewall" says "This computer's firewall is currently turned off. ...."

If there's anything else that you or others would like to know to narrow this down, please do let me know, either here or in the feedback issue FB15368430.
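The retry experiment mentioned above could look roughly like the following sketch. The wrapper name `setsockopt_retry`, the 10 ms pause and the attempt counter are all assumptions made for illustration; the post does not say how the actual experiment was written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical wrapper: call setsockopt up to max_attempts times, retrying
 * only while it fails with errno == retry_errno (for example ENOEXEC).
 * Stores the number of attempts made in *attempts_out and returns the
 * result of the last setsockopt call, with errno left as that call set it. */
int setsockopt_retry(int fd, int level, int opt, const void *val, socklen_t len,
                     int retry_errno, int max_attempts, int *attempts_out) {
    assert(attempts_out != NULL);
    *attempts_out = 0;
    if (max_attempts <= 0) {
        errno = EINVAL;
        return -1;
    }

    int rc = -1;
    for (int i = 0; i < max_attempts; i++) {
        if (i > 0) {
            /* Assumed 10 ms pause between attempts, in case of a race. */
            struct timespec pause = { 0, 10 * 1000 * 1000 };
            nanosleep(&pause, NULL);
        }
        (*attempts_out)++;
        rc = setsockopt(fd, level, opt, val, len);
        if (rc == 0 || errno != retry_errno) {
            break; /* success, or a failure we were not asked to retry */
        }
    }
    return rc;
}
```

In the scenario from the post, this would be called with IPPROTO_IP, IP_ADD_MEMBERSHIP and ENOEXEC as the errno to retry on. According to the post, every retry came back with the same error, so a wrapper like this is a diagnostic aid rather than a workaround.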
Oct ’24
Reply to NSProcessInfo operatingSystemVersion generates warning CFPropertyListCreateFromXMLData(): Old-style plist parser: missing semicolon in dictionary
Thank you Quinn for that explanation and the example. That helps.

Foundation was created as part of Apple’s (well, NeXT’s) app development story. For this reason it contains a component, UserDefaults, with some helpful, but non-obvious, behaviour: You can override specific user defaults by passing them on the command line.

More out of curiosity: for a property like NSProcessInfo's operatingSystemVersion, which I guess will always be fixed on a given host, does it still internally need or use any user-overridable values that require parsing the command line of a process?
Topic: App & System Services SubTopic: Core OS Tags:
Oct ’24
Reply to NSProcessInfo operatingSystemVersion generates warning CFPropertyListCreateFromXMLData(): Old-style plist parser: missing semicolon in dictionary
Hello Quinn,

Apple’s frameworks tend to use lazy initialisation, ...

Thank you for that detail. That (and the rest of what you note in your answer) addresses my curiosity.

Taking a big step back, the standard way to get the macOS version from the command line

Noted. However, in the context where this issue shows up, the command-line way of determining the OS version isn't applicable for us. For context, this unexpected log message got reported as a bug against the JDK: https://bugs.openjdk.org/browse/JDK-8340727. The JDK internally uses NSProcessInfo's operatingSystemVersion to determine the OS version, which then triggers the log.

Well, you can report it as log noise if you like, but it’s definitely not a sign of an actual problem.

Later today I will go ahead and report this, through Feedback Assistant, as log noise. Thank you for your answers and the detailed technical explanations.
Topic: App & System Services SubTopic: Core OS Tags:
Oct ’24