If I'm in an enumerateDirectory call, I can very quickly fill in the fileID, parentID, and (maybe) the type attributes based on the directory entry I have loaded. That is, I can quickly fill in anything that is contained in the dirent structure in dirent.h, plus the parentID.
However, if any other attributes are requested (say, flags), or if the file system doesn't store the filetype in the directory entry, then I need to do additional I/O and load an inode. If I have to load an inode, I might keep a reference to it and assume that I can clean it up later whenever there is a matching call to reclaimItem. But in the enumerateDirectory call, I never provide an FSItem to the system!
By observation, a call to enumerateDirectory of this nature is normally followed by a lookupItem call for every fetched item, and presumably the system can later reclaim each one if need be. At least, every way I tried of listing directories showed this behavior. If that's the case, then I can rely on a later reclaimItem call telling me when to clean up this cached data from memory.
Is this guaranteed, however? I don't see a mention of this in the documentation, so I'm not sure if I can rely on this.
Or, do I need to handle a case where, if I do additional I/O after enumerateDirectory, I might need to figure out when cached data should be cleaned up to avoid a "leak?" (Using the term "leak" loosely here, since in theory looking up the file later would make it reclaimable, but perhaps that might not happen.)
Part 2...
My question is more so: if I'm creating a new FSItem in enumerateDirectory like that (for the purpose of getting those "non-minimal" attributes), can I assume a lookupItem call will follow, giving the system a chance to later reclaim that FSItem?
I don't think that's a safe assumption and, in practice, I think you're very likely to see lots of cases where a lookup ISN'T generated. I don't think a basic "ls" will generate a lookup call and I'd expect/hope the Finder would avoid it at least some of the time.
Or should I be throwing away my reference to the FSItem if I created it inside enumerateDirectory? It seems like it would be a bit wasteful if I threw away an FSItem made in the enumerateDirectory call if I know there's going to be a lookupItem call shortly after asking for it, although I don't see a guarantee that that would be the case.
I think this all depends on how you want to manage this process. Most file system implementations route all object requests through some kind of "bottleneck" (like "lookup") to help ensure that they provide a consistent "view" of every object across multiple clients. That bottleneck either provides the cached object it already has or creates a new object (adding it to the cache) if it doesn't have one, ensuring that there is never more than one object for any given file system object.
When that object goes away is entirely up to you, but it's very typical for the caching system to intentionally hold on to otherwise unused/unneeded references just in case they're needed again "shortly".
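To make the "bottleneck" pattern concrete, here's a minimal sketch in C. Everything here is illustrative, not FSKit API: `struct item` stands in for whatever backs your FSItem subclass, and the fixed-size table stands in for a real hash table with collision handling.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch of a "lookup bottleneck": every request for a file
 * system object funnels through one cache, so there is never more than one
 * in-memory object per on-disk object. Names are illustrative, not FSKit. */

struct item {
    uint64_t file_id;
    int      refcount;
    /* ...cached inode data would live here... */
};

#define CACHE_SLOTS 64

static struct item *cache[CACHE_SLOTS];

/* The bottleneck: return the cached object if one exists, otherwise
 * create it (e.g. by loading the inode) and add it to the cache. */
struct item *item_for(uint64_t file_id)
{
    size_t slot = file_id % CACHE_SLOTS;
    if (cache[slot] && cache[slot]->file_id == file_id) {
        cache[slot]->refcount++;
        return cache[slot];
    }
    struct item *it = calloc(1, sizeof *it);
    it->file_id = file_id;
    it->refcount = 1;
    cache[slot] = it;          /* real code would handle collisions */
    return it;
}

/* Called from your reclaim path once the system is done with the item. */
void item_reclaim(struct item *it)
{
    if (--it->refcount == 0) {
        cache[it->file_id % CACHE_SLOTS] = NULL;
        free(it);
    }
}
```

The point is that both enumerateDirectory and lookupItem would route through `item_for`, so an item created during enumeration is the same object any later lookup returns, and reclaim can be handled in one place.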
Reorganizing things a bit for clarity:
...what its docs are referring to when they say "This method doesn’t support partial reading of metadata?"
This is a case of making something sound a lot more complicated than it actually is. Let me start with the "readInto" documentation:
(1)
length
A maximum number of bytes to read. The completion handler receives a parameter with the actual number of bytes read.
...
(2)
For the read to succeed, requests must conform to any transfer requirements of the underlying resource. Disk drives typically require sector (physicalBlockSize) addressed operations of one or more sector-aligned offsets.
Both of those requirements are straightforward side effects of readInto/writeInto's underlying implementation. Starting with #2, the I/O actually goes to the raw dev node ("/dev/rdiskXXs"), which will fail any I/O that doesn't match the hardware requirements the device presents in IOKit. For #1, the actual APIs being called are pread/pwrite, whose man pages say:
Upon successful completion, read(), readv(), and pread() return the number of bytes actually read and placed in the buffer. The system guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file, but in no other case.
... pointer associated with fildes, see lseek(2). Upon return from write(), the pointer is incremented by the number of bytes which were written.
Note that the behavior above isn't necessarily "useful" or even really "normal". I think the main way you'll get a "short" read is if your read request extends past the end of the disk, but I'm not sure why that would ever be helpful/useful. pread/pwrite work this way because of other I/O contexts (like socket I/O), and FSKit simply exposes this behavior; it's easier than creating a new read/write implementation.
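The short-read behavior quoted above is easy to see with plain pread() against an ordinary file. This small self-contained demo (no FSKit involved) writes a 10-byte temp file and asks for 64 bytes:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Demonstrates the pread() behavior quoted above: a read that extends past
 * end-of-file returns a short count rather than failing. */
ssize_t short_read_demo(void)
{
    char path[] = "/tmp/pread-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;

    (void)write(fd, "0123456789", 10);         /* 10-byte file */

    char buf[64];
    ssize_t n = pread(fd, buf, sizeof buf, 0); /* ask for 64 bytes */

    close(fd);
    unlink(path);
    return n;  /* 10: pread stopped at end-of-file */
}
```

Against a raw device node the same kind of request would instead have to obey the sector-alignment rules described in #2.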
With all that context, all this means:
"This method doesn’t support partial reading of metadata"
...is that the metadata I/O methods WON'T do #1. The UBC (Universal Buffer Cache) expects I/O requests to be "sensible", so it fails the I/O requests that would cause pread/pwrite to do partial I/O. This is arguably the more reasonable behavior, but the documentation should probably have said this in a much more straightforward way (r.175533440).
Oh? Does this mean it's not good to pass lengths smaller than 16KB to e.g. metadataRead?
No, or at least not exactly. First off, as background context, I have a post on this here which is worth reading.
As I explained there, metadataRead is the direct equivalent of buf_meta_bread. As such, it should (generally) not be used for multipage I/O. However, the nature of the UBC also means that there isn't a big difference between a 4KB read and a 16KB read.
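If you do find yourself with a larger metadata range, one approach is to issue it as a series of individual block-sized requests rather than one multipage read. A sketch of the splitting logic, with illustrative names (not FSKit API):

```c
#include <stdint.h>
#include <stddef.h>

/* Splits an aligned metadata range into per-block (offset, length) pairs,
 * each suitable for a single block-sized read. Illustrative only. */
struct block_range { int64_t offset; int64_t length; };

/* Writes up to `max` ranges into `out`; returns the number written. */
size_t split_metadata_range(int64_t offset, int64_t length,
                            int64_t block_size,
                            struct block_range *out, size_t max)
{
    size_t n = 0;
    for (int64_t cur = offset; cur < offset + length && n < max;
         cur += block_size) {
        out[n].offset = cur;
        out[n].length = block_size;
        n++;
    }
    return n;
}
```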
Finally, one follow-up point here:
although right now I only have reading implemented, so I don't know if that causes a problem later if I implement write.
As a block storage file system, one thing you should be looking at/planning for is transitioning to FSVolumeKernelOffloadedIOOperations and away from FSVolumeReadWriteOperations. Routing bulk I/O through your extension puts a hard ceiling on your overall performance, so I wouldn't waste too much time trying to optimize the direct I/O path.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware