So this method falsely returns NSURLRelationshipSame for different directories. One is empty, one is not. Really weird behavior.
Do you know where/what the directories "were"? The problem here is that there's a pretty wide variation between the "basic" case of "a bunch of files and directories sitting on a standard volume" and "the range of ALL possible edge cases".
Two file path URLs pointing to two different file paths have the same NSURLFileResourceIdentifierKey?
Yes, this is possible. As one example, the data volume basically ends up in the hierarchy "twice", meaning that the paths "/System/Volumes/Data/Users/" and "/Users/" are in fact the same directory. And, yes, getRelationship returns NSURLRelationshipSame for those directories.
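As a minimal sketch of how you might check this yourself, the snippet below compares the NSURLFileResourceIdentifierKey values of two URLs. The helper name `sameFileResource` is made up for illustration, and it assumes an Apple platform where the identifier is available for both URLs.

```swift
import Foundation

// Hypothetical helper: returns true when both URLs report an equal
// file resource identifier, false when either identifier is missing.
func sameFileResource(_ a: URL, _ b: URL) throws -> Bool {
    let idA = try a.resourceValues(forKeys: [.fileResourceIdentifierKey]).fileResourceIdentifier
    let idB = try b.resourceValues(forKeys: [.fileResourceIdentifierKey]).fileResourceIdentifier
    guard let lhs = idA, let rhs = idB else { return false }
    // The identifier is an opaque object, so compare with isEqual rather than ===.
    return lhs.isEqual(rhs)
}
```

Calling it with `URL(fileURLWithPath: "/Users")` and `URL(fileURLWithPath: "/System/Volumes/Data/Users")` is one way to see the two paths resolve to the same object on a typical macOS install.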
Now, this:
One is empty, one is not.
...is definitely "weirder". Ignoring the cache issue below, I don't think you could do it within a standard volume, but you might be able to do it using multiple volumes, particularly duplicated disk images and/or network file systems.
However, in this case:
Could it be related to https://developer.apple.com/forums/thread/813641?
One URL in the check lived at the same file path as the other URL at one time (but no longer does). No symlinks or anything going on. Just plain directory URLs.
...yes, it's a/the cache. The proof of that is this:
And YES calling -removeCachedResourceValueForKey: with NSURLFileResourceIdentifierKey causes the proper result of NSURLRelationshipOther to be returned. And I'm doing the check on a background queue.
...since any issue that is fixed by clearing the cache is, by definition, "caused" by the cache. That's a good excuse to revisit this thread here, which I'm afraid I missed:
Could it be related to https://developer.apple.com/forums/thread/813641 ?
The core of the issue here is the inherent tension between a few facts:
- The entire file system is essentially a lock-free database being simultaneously modified by an unconstrained number of processes/threads.
- Your ability to monitor file system state is relatively limited. Basically, you can either ask for the current state and receive an answer with unknown latency or ask the system to update you as things change, at which point you'll receive a stream of events... with unknown latency.
- Accessing the file system is sufficiently slow that it's worth avoiding/minimizing that access.
Jumping back to here, there's actually a VERY straightforward way to do this:
Two file path URLs pointing to two different file paths have the same NSURLFileResourceIdentifierKey?
That is, have two processes where:
Process 1 calls "getRelationship".
Process 2 manipulates the file system such that the following sequence occurs:
1. Process 1 retrieves the metadata of the source object.
2. Process 2 deletes the existing directory at the target location.
3. Process 2 moves the source object to the target location.
4. Process 2 deletes the contents of the target object.
5. Process 1 retrieves the metadata of the target object.
...and process 1 now compares #1 and #5, returning NSURLRelationshipSame because they are in fact the same. Now, you might say this seems far-fetched/impossible to time; however, I never said process 2 was running on the same system. With SMB over a slow connection, I suspect you could replicate the scenario above pretty easily.
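The sequence above can be approximated in a single process, which at least shows why the two reads end up matching. This is only a sketch: the names `identifier(of:)` and `simulateRace` are hypothetical, both roles are played by one thread in a temporary directory, and it assumes a local volume where a move within the same volume keeps the object's identity.

```swift
import Foundation

// Hypothetical helper: reads the identifier through a freshly built URL
// so that no previously cached resource values are reused.
func identifier(of url: URL) throws -> NSObjectProtocol? {
    let fresh = URL(fileURLWithPath: url.path)
    return try fresh.resourceValues(forKeys: [.fileResourceIdentifierKey]).fileResourceIdentifier
}

// Plays both "process 1" and "process 2" in order, in a scratch directory.
func simulateRace() throws -> Bool {
    let fm = FileManager.default
    let root = fm.temporaryDirectory.appendingPathComponent(UUID().uuidString)
    let source = root.appendingPathComponent("source")
    let target = root.appendingPathComponent("target")
    try fm.createDirectory(at: source, withIntermediateDirectories: true)
    try fm.createDirectory(at: target, withIntermediateDirectories: true)
    try Data().write(to: source.appendingPathComponent("file.txt"))
    defer { try? fm.removeItem(at: root) }

    // Step 1: "process 1" reads the source's metadata.
    let sourceID = try identifier(of: source)
    // Step 2: "process 2" deletes the existing directory at the target.
    try fm.removeItem(at: target)
    // Step 3: "process 2" moves the source to the target location.
    try fm.moveItem(at: source, to: target)
    // Step 4: "process 2" deletes the contents of the (moved) target.
    for item in try fm.contentsOfDirectory(at: target, includingPropertiesForKeys: nil) {
        try fm.removeItem(at: item)
    }
    // Step 5: "process 1" reads the target's metadata.
    let targetID = try identifier(of: target)

    guard let lhs = sourceID, let rhs = targetID else { return false }
    return lhs.isEqual(rhs)
}
```

On a local APFS volume I would expect `simulateRace()` to report that the two identifiers match, since the directory read in step 5 is the same object read in step 1, just at a different path and now empty.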
The point here is that the system’s caching behavior is simply one dynamic among many. That is, caching increases the probability of strange behavior (like the one above) because it increases the time gap between #1 and #5, and the wider the gap between actions, the more likely it is that "something" has changed. However, you can't actually shrink the gap to the point where it goes away.
One solution to these issues is for the interested processes to communicate with each other to coordinate their actions (for example, by using "File Coordination"). However, that requires all of the processes involved to participate in that mechanism, which they definitely don't today.
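For reference, a coordinated read with NSFileCoordinator might look roughly like the sketch below. The function name `coordinatedContents` is made up for illustration, and the coordination only helps against other processes that also use file coordination.

```swift
import Foundation

// Hypothetical helper: lists a directory inside a coordinated read.
// Only cooperating (coordinating) writers are held off while the block runs.
func coordinatedContents(of directory: URL) throws -> [URL] {
    let coordinator = NSFileCoordinator(filePresenter: nil)
    var coordinationError: NSError?
    var result: Result<[URL], Error> = .success([])

    coordinator.coordinate(readingItemAt: directory, options: [], error: &coordinationError) { actualURL in
        // Use the URL handed to the accessor; it may differ from the one requested.
        result = Result {
            try FileManager.default.contentsOfDirectory(at: actualURL, includingPropertiesForKeys: nil)
        }
    }

    // If coordination itself failed, the accessor block never ran.
    if let error = coordinationError { throw error }
    return try result.get()
}
```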
Realistically, the reason this all isn't a total disaster is that most of the activity here is either:
- Directly controlled/managed by the user, who is both being careful about what they do and moving "slow" enough that collisions don't happen.
OR
- Happening in "private" parts of the file system where only one "entity" is manipulating the data (for example, an app’s data container).
All of which leads to the big question... what are you actually trying to do?
If this is a one-off event that you're concerned/confused about, then the answer is basically, yes, the file system can be way weirder than it looks, and sometimes that means calling removeCachedResourceValueForKey "just in case".
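A "just in case" version of the check could look something like this sketch. The wrapper name `freshRelationship` is hypothetical, and it assumes that clearing the cached identifier on local copies of the URLs is enough to force a fresh lookup; depending on how your URLs are created and passed around (NSURL vs. Swift URL), you may need to clear the cache on the original objects instead.

```swift
import Foundation

// Hypothetical wrapper: drops any cached file resource identifier on both
// URLs, then asks FileManager for the relationship between them.
func freshRelationship(ofDirectoryAt directory: URL,
                       toItemAt item: URL) throws -> FileManager.URLRelationship {
    var dir = directory
    var other = item
    // Assumption: stale identifiers are what produced the bogus "same" result.
    dir.removeCachedResourceValue(forKey: .fileResourceIdentifierKey)
    other.removeCachedResourceValue(forKey: .fileResourceIdentifierKey)

    var relationship = FileManager.URLRelationship.other
    try FileManager.default.getRelationship(&relationship, ofDirectoryAt: dir, toItemAt: other)
    return relationship
}
```

Note that this only narrows the window between reading the two identifiers; it does not close it.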
However, if this is something that is a recurring problem for your app, then it might be worth stepping back and rethinking your approach to minimize the possibility and consequences of these kinds of "oddities".
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware