NSFileManager getRelationship:ofDirectoryAtURL:toItemAtURL:error: returning NSURLRelationshipSame for Different Directories

I'll try to ask a question that makes sense this time :) . I'm using the following method on NSFileManager:

  • (BOOL) getRelationship:(NSURLRelationship *) outRelationship ofDirectoryAtURL:(NSURL *) directoryURL toItemAtURL:(NSURL *) otherURL error:(NSError * *) error;
  • Sets 'outRelationship' to NSURLRelationshipContains if the directory at 'directoryURL' directly or indirectly contains the item at 'otherURL', meaning 'directoryURL' is found while enumerating parent URLs starting from 'otherURL'. Sets 'outRelationship' to NSURLRelationshipSame if 'directoryURL' and 'otherURL' locate the same item, meaning they have the same NSURLFileResourceIdentifierKey value. If 'directoryURL' is not a directory, or does not contain 'otherURL' and they do not locate the same file, then sets 'outRelationship' to NSURLRelationshipOther. If an error occurs, returns NO and sets 'error'.

So this method falsely returns NSURLRelationshipSame for different directories. One is empty, one is not. Really weird behavior. Two file path urls pointing to two different file paths have the same NSURLFileResourceIdentifierKey? Could it be related to https://developer.apple.com/forums/thread/813641 ?

One url in the check lived at the same file path as the other url at one time (but no longer does). No symlinks or anything going on. Just plain directory urls.

And YES, calling -removeCachedResourceValueForKey: with NSURLFileResourceIdentifierKey causes the proper result of NSURLRelationshipOther to be returned. And I'm doing the check on a background queue.

Answered by DTS Engineer in 878053022

Doesn't appear to be what's going on in this case. I made this dumb little test which can easily reproduce the issue (sorry, can't get code to format well on these forums).

Interesting. So, I can actually explain what's going on, and it's actually not the cache.

So, architecturally, NSURL has two different mechanisms for tracking file location: "path" and "file reference". Path works exactly the way you'd expect (it's a string-based path to a fixed location), while file reference relies on low-level file system metadata to track files. Critically, this means that the file reference will track the object as it's moved/modified within a volume.

Secondly, keep in mind NSURLs are generally "data" objects, meaning they don't "proactively" update their content.

So, the actual issue starts here:

if (![fm trashItemAtURL:untitledFour resultingItemURL:&resultingURL error:nil])

At the point that method returns, "untitledFour" is no longer entirely coherent, as its path points to the original location, but its reference points to the file in the trash. You can see this for yourself by running this at the top of compareBothURLS:

NSURL* pathURL = untitledFour.filePathURL;
NSURL* refURL = untitledFour.fileReferenceURL;

NSLog(@"1 %@", untitledFour.path);
NSLog(@"2 %@", pathURL.path);
NSLog(@"3 %@", refURL.path);
	
NSLog(@"A %@", untitledFour.fileReferenceURL.description);
NSLog(@"B %@", pathURL.fileReferenceURL.description);
NSLog(@"C %@", refURL.fileReferenceURL.description);

What you'll find is that:

  • In the first log set, "1" & "2" will match, both pointing to the original file location. "3" will not, pointing to the trash instead.

  • In the second log set, "A" & "C" will match, while "B" will not.

More specifically, the strings returned in the second log set will have this format:

file:///.file/id=<number>.<number>/

...and the second number will be different for "B".

With all that context:

(1) The reason getRelationship is returning "same" is that it primarily relies on file reference data, and the reference data points to the file in the trash. There's an argument that it shouldn't do this; however, in its defense, using the reference data makes it much easier to sort out issues like hard-linked files and/or symbolic links allowing multiple references to the same file.

(2) The reason "removeCachedResourceValueForKey" changed the behavior is that it deleted the file reference data, forcing NSURL to resolve the data again. You'll actually get exactly the same effect if you test with "untitledFour.filePathURL".

What I'd highlight here is that the "right" behavior here isn't entirely clear. That is, is the problem that "getRelationship" is claiming that two different paths are "the same file"? Or is the problem that NSURL is returning the wrong path value for a specific file?

That question doesn't have a direct answer because the system doesn't really "know" what you actually want: are you trying to track a particular "object" (fileReferenceURL) or are you trying to reference a particular "path" (filePathURL)? It doesn't "know", so it's ended up with a slightly different object that's tracking both...

...but you can tell it what you want, at which point the API will now do exactly what you'd expect. More specifically, you can change the behavior by forcing the URL type you want immediately after you create the directory:

    if (![fm createDirectoryAtURL:untitledFour withIntermediateDirectories:YES attributes:nil error:nil])
    {
        NSLog(@"Test failed");
        return;
    }
    
#if 1
    untitledFour = untitledFour.fileReferenceURL;
#else
    untitledFour = untitledFour.filePathURL;
#endif

Strictly speaking, you could set "filePathURL" anywhere you want, but you can't create a fileReferenceURL to a non-existent object, so it needs to be after the create. In any case, either of those two configurations works the way you'd expect.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thanks for all the information!

The other problem here is that you're assuming the APIs involved here operate on paths... when they don't.

Perhaps I didn't explain myself well. I wasn't making assumptions about how the APIs operate - the only thing I was claiming (well, trying to) was that paths provide the concept of file location and in certain contexts that concept matters to GUI apps (with the NSBrowser example being an obvious one).

The current APIs for file reference URLs that I'm aware of don't try to solve that problem in any way.

Note that this INCLUDES FSEvents. [...] If the directory you're monitoring moves... then FSEventStreamCreate will happily continue sending you events about that directory, NOT the original path location. If you want to know when the directory you're monitoring moves, then that's what "kFSEventStreamCreateFlagWatchRoot"

Very interesting. Yeah, I use that (watch root). With the watch root flag set, it does seem to just track whatever is at the path.

To test this, I just put 'TestFolder' on my Desktop and started an FSEventStream on it. When I rename 'TestFolder' to 'TestFolder2', I pick up the root change. If I then start making changes in TestFolder2, I don't pick up any of them. With the stream still open, I create a new folder 'TestFolder', and the open stream starts picking up changes in the new TestFolder (not TestFolder2) along the original path. So it does 'stick' to the path and doesn't follow the reference (I've never tried not using kFSEventStreamCreateFlagWatchRoot since I need it).

then it might be worth starting a new thread that's focused on those issues.

I've not had problems with FSEvents; I just brought it up because if there was some pathless way to achieve what an FSEventStream provides, I'd be open to exploring it. I'll be sure to open a new thread if I run into anything.

Perhaps I didn't explain myself well. I wasn't making assumptions about how the APIs operate - the only thing I was claiming (well, trying to) was that paths provide the concept of file location and in certain contexts that concept matters to GUI apps (with the NSBrowser example being an obvious one).

Sure. And, to be clear, I don't mean that paths don't EVER have a place and/or aren't a good solution. However, I do think those use cases are more limited than they seem and that most interactions with files are better handled through object references (which are tied to a specific file system object), NOT paths (which describe a location in the hierarchy).

Our APIs don't do a very good job of expressing that idea, but APIs like reference URLs and bookmarks are actually "object reference" APIs, not just weird ways of storing paths.

Very interesting. Yeah, I use that (watch root). With the watch root flag set, it does seem to just track whatever is at the path.

Ahh... Yes, you're right. It's been a while since I looked at this and my memory let me down. So, to clarify what's going on here, how FSEvents actually works is that it's hooked into the vfs system here and then exports that data to user space through /dev/fsevents. As part of that process, it converts the vnode reference to a path, as that's the simplest way to collect and log "all" the data it's collecting, particularly since it's going to be logging the data out for longer-term storage.

I've not had problems with FSEvents; I just brought it up because if there was some pathless way to achieve what an FSEventStream provides, I'd be open to exploring it.

Sure, that's what kqueue does. If you look at the code above, the function call "lock_vnode_and_post" just before "create_fsevent_from_kevent" is what actually posts to the kqueue (if any happen to be monitoring). If you're curious about it, (years ago) I wrote the "DirectoryWatcher" class inside this sample, which uses kqueue (it's an iOS sample, so FSEvents isn't available).

Disclaimer: The code uses NSString paths instead of NSURL. It's fairly old and I didn't know better. Setting aside all other issues, issues like Unicode variation between file systems mean that NSString paths should be avoided whenever possible.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
