Interesting, gotcha! Here's a related question: since rust-analyzer is only interested in file change events while the IDE is running (we take deliberate steps to avoid any external, serializable state for a bunch of reasons that I'll elide for now, but can get into later!), does it still make sense to do the layering you describe, or can we reasonably rely on the real-time approaches?
Simplifying greatly, the different APIs are effectively layered on top of each other. As a side note, the FSEvents Programming Guide is still worth a read, and its last section is basically "Should you use FSEvents or kqueue?".
does it still make sense to do the layering you describe, or can we reasonably rely on the real-time approaches?
So, conceptually at least, you can basically think of FSEvents as working by (there's a sketch after this list showing where each piece surfaces in the API):

a) At the API level, changing the monitoring target so that the client doesn't have to open everything it wants to monitor.

b) Getting events from kqueue but delaying delivery so that the client isn't bothered by noise they don't really care about.

c) Layering a "tracking" system on top of that so that the client can save some time "catching up" on things when they next run.
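To make that concrete, here's a minimal sketch (my illustration, not production code; the watched path is a placeholder) of creating an FSEvents stream in C. Each of the three layers above shows up as a parameter: paths rather than open file descriptors for (a), the latency interval for (b), and the sinceWhen event ID for (c).

```c
#include <CoreServices/CoreServices.h>
#include <dispatch/dispatch.h>
#include <stdio.h>

static void callback(ConstFSEventStreamRef stream, void *info,
                     size_t numEvents, void *eventPaths,
                     const FSEventStreamEventFlags flags[],
                     const FSEventStreamEventId ids[]) {
    // With no create flags set, eventPaths is a C array of UTF-8 paths.
    char **paths = eventPaths;
    for (size_t i = 0; i < numEvents; i++) {
        printf("change under %s (flags 0x%x, id %llu)\n",
               paths[i], flags[i], ids[i]);
    }
}

int main(void) {
    // (a) Targets are plain paths; nothing has to be open()ed.
    CFStringRef path = CFSTR("/tmp/watched");   // placeholder directory
    CFArrayRef paths = CFArrayCreate(NULL, (const void **)&path, 1,
                                     &kCFTypeArrayCallBacks);

    FSEventStreamRef stream = FSEventStreamCreate(
        NULL, callback, NULL, paths,
        kFSEventStreamEventIdSinceNow,  // (c) pass a saved event ID to "catch up"
        1.0,                            // (b) latency: coalesce events for 1 second
        kFSEventStreamCreateFlagNone);

    FSEventStreamSetDispatchQueue(stream, dispatch_get_main_queue());
    FSEventStreamStart(stream);
    dispatch_main();                    // deliver events until killed
}
```

If you persisted the last event ID you'd processed and passed it as sinceWhen on the next launch, the system would replay what you missed (within the limits of the events it still has), which is the "catching up" behavior in (c).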
The issue with monitoring targets ("a") is the biggest difference between FSEvents and kqueue. Unlike FSEvents, kqueue can actually monitor for individual file changes as well as directory changes, but doing so requires opening every file you want to monitor. That obviously becomes pretty cumbersome when dealing with large file counts. FYI, a few years ago I wrote the "DirectoryWatcher" class that was included in the "DocInteraction" sample. The sample is an iOS sample (that's why it used kqueue), but the class should work perfectly fine on macOS.
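Since the open-every-file point is the crux, here's a stripped-down kqueue sketch (mine, not the DirectoryWatcher code; the path is a placeholder) watching a single file. Multiply the open()/EV_SET pair by every file in a large source tree and the cost becomes obvious.

```c
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>

int main(void) {
    int kq = kqueue();

    // Each watched file needs its own descriptor. O_EVTONLY opens it
    // for monitoring only, without blocking volume unmounts.
    int fd = open("/tmp/watched/file.txt", O_EVTONLY);  // placeholder path

    struct kevent change;
    EV_SET(&change, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
           NOTE_WRITE | NOTE_EXTEND | NOTE_DELETE | NOTE_RENAME, 0, NULL);
    kevent(kq, &change, 1, NULL, 0, NULL);      // register the watch

    for (;;) {
        struct kevent event;
        if (kevent(kq, NULL, 0, &event, 1, NULL) > 0) {   // blocks
            printf("vnode event on fd %d (fflags 0x%x)\n",
                   (int)event.ident, event.fflags);
        }
    }
}
```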
In terms of which is better for "you", my guess is that it's probably FSEvents, simply because of the overall file count, but there may be cases/arguments for kqueue.
I did have one other thought I wanted to mention here:
I also realized that I didn't clarify the current state particularly well: by switching to using FSEvents via the Notify Rust library, rust-analyzer's reliability during rebases went from "guaranteed to be broken" to "basically works every time".
How closely did you look at what EXACTLY was going wrong here and why? I don't really know anything about the mechanism you were using earlier ("VS Code"?), but I have a suspicion that the issue here might actually have been that you were getting notified too "early", not too "late". Particularly if you're dealing with a larger set of file systems, which may have a "wider" set of behaviors, you may be ending up in situations where you try to retrieve data before it was actually fully committed.
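If that is what's happening, the usual mitigation (a sketch of my own, not something the Notify library provides for you) is to treat the event as advisory and wait for the file to stop changing before you read it:

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns true once two consecutive stat() samples match, i.e. the
// writer appears to have finished committing its changes. The 50 ms
// sampling interval is an arbitrary choice.
bool wait_until_settled(const char *path, int max_tries) {
    struct stat prev;
    if (stat(path, &prev) != 0) return false;

    for (int i = 0; i < max_tries; i++) {
        usleep(50 * 1000);
        struct stat cur;
        if (stat(path, &cur) != 0) return false;
        if (cur.st_size == prev.st_size &&
            cur.st_mtimespec.tv_sec == prev.st_mtimespec.tv_sec &&
            cur.st_mtimespec.tv_nsec == prev.st_mtimespec.tv_nsec) {
            return true;                // stable across two samples
        }
        prev = cur;
    }
    return false;                       // still changing; try again later
}
```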
Following up on my earlier tease here:
Before I consider changing the default file watching behavior for our (many!) users, I wanted to check: is it possible to combine "walk & watch" into a single, atomic operation?
That is a great question that I'm not (quite) ready to answer yet, but I wanted to reply with what I already had. I'll have more to say about this in the next day or two.
If you're running on APFS, file cloning can be useful for this sort of thing. The idea here is that instead of scanning the "live" hierarchy (which can change while you're scanning), you capture the hierarchy at a fixed state, scan THAT state, then delete it once you're done.
Now, clonefile does come with a significant warning. From the beginning of its manpage:
LIMITATIONS
Cloning directories with these functions is strongly discouraged. Use copyfile(3) to clone directories instead.
The background here is that because the clone operation is atomic, cloning a directory can basically "pause" ALL other file system activity while the clone is created. This doesn't really matter if the total file count is small, but as the file count grows it can become EXTREMELY disruptive. Basically, don't clone directory hierarchies unless you already "know" the count to be small.
The other alternative here is to simply clone the hierarchy yourself by cloning every file individually. That doesn't give you a truly atomic duplicate, but file cloning is EXTREMELY fast, even with high file counts. If your scanning process is time-consuming*, then scanning a "semi-atomic" copy is still better than scanning the live data.
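To sketch what that per-file approach looks like (my illustration; the 0755 mode is a placeholder and error handling is minimal), you can drive clonefile(2) from an fts(3) walk:

```c
#include <sys/types.h>
#include <sys/clonefile.h>
#include <sys/stat.h>
#include <fts.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

// Mirror src_root into dst_root: directories are recreated with mkdir,
// regular files are cloned (copy-on-write, so it's fast on APFS).
int clone_tree(const char *src_root, const char *dst_root) {
    char *const roots[] = { (char *)src_root, NULL };
    FTS *fts = fts_open(roots, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
    if (fts == NULL) return -1;

    size_t prefix = strlen(src_root);
    FTSENT *ent;
    while ((ent = fts_read(fts)) != NULL) {
        char dst[PATH_MAX];
        snprintf(dst, sizeof(dst), "%s%s", dst_root, ent->fts_path + prefix);

        if (ent->fts_info == FTS_D) {
            mkdir(dst, 0755);           // pre-order visit: create dirs first
        } else if (ent->fts_info == FTS_F) {
            if (clonefile(ent->fts_path, dst, CLONE_NOFOLLOW) != 0) {
                perror(dst);            // e.g. the volume isn't APFS
            }
        }
    }
    fts_close(fts);
    return 0;
}
```

This is also essentially what the manpage's "use copyfile(3) to clone directories" suggestion amounts to: a per-file clone driven by a recursive walk, rather than one atomic directory clone.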
*Architecturally, this is exactly why/how backup utilities use snapshots: snapshot the whole volume and then you can scan the snapshot "at your leisure" without worrying about how ongoing changes muddle up your backup state.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware