F_FULLFSYNC is basically "always" going to be relatively slow. The API operates on a file handle, but flushing the volume to a coherent state is a broader operation, and the final I/O command is a device command. All of that makes it a fairly "heavy" operation.
Gotcha, thanks for the background!
On that note, I did my research—to our mutual peril—and found a comment (https://news.ycombinator.com/item?id=25204202) that cited sqlite's codebase, noting that "fdatasync() on HFS+ doesn't yet flush the file size if it changed correctly". Some folks on the postgres mailing list also appear to be a bit uncertain about the current situation: https://www.postgresql.org/message-id/flat/CA%2BhUKGLv-kvrtA5QEL%3Dp%3DdYK0p9gsMXJaVhUFu%2BA-KyFrFi%3D2g%40mail.gmail.com#fe6d1c5665a381687842758fd5b245d4.
I'm a newbie when it comes to understanding C code, but looking over kern_aio.c, I see two seemingly contradictory comments.
On lines 658–659:
* NOTE - we do not support op O_DSYNC at this point since we do not support the
* fdatasync() call.
Later, on lines 721–727, above aio_return:
/*
* aio_return - return the return status associated with the async IO
* request referred to by uap->aiocbp. The return status is the value
* that would be returned by corresponding IO request (read, write,
* fdatasync, or sync). This is where we release kernel resources
* held for async IO call associated with the given aiocb pointer.
*/
So I guess I'm wondering: is it... fine to now use fdatasync on APFS? Because if it is now fine (as per sqlite's understanding via the Hacker News comment...), then I think there's a bunch of software that might be relying on outdated documentation/advice, since:
man fsync on macOS 26.1 refers to a drive's "platters". To the best of my knowledge, my MacBook Pro does not have any platters!
As of 2022, it appears that Apple's patched version of SQLite uses F_BARRIERFSYNC. The wording of the documentation, at least for iOS, suggests that file data size would be synced to disk.
Foundation's FileHandle (which is, I think, equivalent to Rust's std::fs::File?) uses a plain fsync(_fd), not an F_FULLFSYNC like Rust's (and Go's, for that matter!) standard libraries do.
On the APFS side, the specific concern I have here is about the performance dip at high core count. That's partly because of the immediate issue and mostly because our core count has been increasing, and we need to be watching for these concurrency bottlenecks.
Understood. My friend André—who produced the graph on his M3 Ultra Mac Studio—theorized this past weekend that the observed performance degradation was partly due to the interconnects between the M3 Maxes, but this was idle speculation over brunch. He, to your point, also noted that the base Apple Silicon chips went from 8 cores to 12 cores in the span of 5 years, with roughly 20% year-over-year performance improvements! That'll certainly stress design assumptions!
Expanding on that last point, there's a real danger in these comparisons that comes from assuming that both implementations are directly "equivalent", so that any performance divergence comes from correctable issues in the other platform. That's certainly true some of the time, but it's definitely NOT true here.
Understood! If the answer from your end is "you're hitting a pathological edge case on APFS doing something it wasn't really designed for", then that's fine! It'd be nice if I can have my cake and eat it too à la ext4, but as you mentioned previously, there's a latency/throughput tradeoff here, and APFS is firmly on the side of "latency".
(And just to wave my credentials/sympathy for APFS' position: I spent four years of my life working on a latency-sensitive reimplementation of the Rust compiler, so I get how fundamental these design tradeoffs are!)
Asking a question that probably should have been asked earlier... why? Why are you doing this at all? Unless you're applying some external force/factor (basically, cutting power to the drive), I think all of these sync calls are really just slowing you down. That's true of ALL platforms, not just macOS. If your code is doing it as part of its own internal logic then that's fine, but if this is actually part of your testing infrastructure then perhaps you should just "stop".
No, that's a great point to clarify! The fsyncs are not part of the tests—at least, not directly—but rather part of the core logic of jj itself. The change was introduced in this commit and has reduced the frequency of people reporting data corruption issues.
(The "op_store" in the linked commit can be thought of as a "write ahead log, but for the files you're keeping under version control". We could probably restore from the op log in the event of data corruption, now that I think of it...)