We're trying to implement a data backup/restore feature in our business productivity iPad app using UIDocumentPickerViewController and AppleArchive, but we've discovered odd behavior of -[UIDocumentPickerViewController initForOpeningContentTypes:asCopy:] (with asCopy:YES) when reading large archive files from a USB drive.
We've duplicated this behavior with iPadOS 16.6.1 and 17.7 when building our app with Xcode 15.4 targeting minimum deployment of iPadOS 16. We haven't tested this with bleeding edge iPadOS 18.
Here's our Objective-C code which presents the picker:
#import <UniformTypeIdentifiers/UniformTypeIdentifiers.h> // for UTTypeAppleArchive

// Present a picker that copies the chosen archive into our sandbox.
NSArray<UTType *> *contentTypeArray = @[UTTypeAppleArchive];
UIDocumentPickerViewController *docPickerVC = [[UIDocumentPickerViewController alloc] initForOpeningContentTypes:contentTypeArray asCopy:YES];
docPickerVC.delegate = self;
docPickerVC.allowsMultipleSelection = NO;
docPickerVC.shouldShowFileExtensions = YES;
docPickerVC.modalPresentationStyle = UIModalPresentationPopover;
docPickerVC.popoverPresentationController.sourceView = self.view;
[self presentViewController:docPickerVC animated:YES completion:nil];
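For reference, the delegate side is essentially the stock UIDocumentPickerDelegate callbacks. A minimal sketch follows; the restore method name is a placeholder for our actual handler:

- (void)documentPicker:(UIDocumentPickerViewController *)controller didPickDocumentsAtURLs:(NSArray<NSURL *> *)urls {
    // With asCopy:YES the URL points at a copy inside our tmp directory.
    NSURL *archiveURL = urls.firstObject;
    if (archiveURL == nil) {
        return;
    }
    [self restoreFromArchiveAtURL:archiveURL]; // placeholder for our restore code
}

- (void)documentPickerWasCancelled:(UIDocumentPickerViewController *)controller {
    // User backed out; nothing to clean up.
}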
The UIDocumentPickerViewController remains visible until the selected archive file has been copied from the USB drive to the app's local tmp sandbox, which may take several seconds because of the drive's slow access speed. During this time the UIDocumentPickerViewController does NOT disable the table view rows displaying files found on the USB drive, and it gives no visual feedback that it's busy copying the selected file. Since the initial tap appears to have been ignored, even the most patient user will tap the desired filename a second (or third, or fourth) time.
When the user taps the file a second time, UIDocumentPickerViewController apparently begins copying the archive file again. The end result is a truncated copy of the selected file, with a size that depends on the time between taps. For instance, a 788 MB source archive may be copied as a 56 MB file, and the UIDocumentPickerDelegate then receives 56 MB instead of the original 788 MB of data.
Not surprisingly, AppleArchive fails to decrypt the local copy of the archive because it's missing data. Instead of failing gracefully, AppleArchive crashes in AAArchiveStreamClose() (see forums post 765102 for details).
Does anyone know if there's a workaround for this strange behavior of UIDocumentPickerViewController?
Should we file a separate bug report, or extend FB16131472 with iPadOS 18.6 info?
Sorry, I missed that in the previous posts. Please update the existing bug.
Our app utilizes AppleArchive to back up and restore data via UIDocumentPickerViewController. We implemented the variant of UIDocumentPickerViewController that copies the archive file because we had the impression Apple recommended (or required) this variant when reading archives from an external drive.
To be honest, I can't think of any reason it would really be THAT different. Practically speaking, the ability to read a file is the same as being able to copy it (since a copy is just reading from a source and writing to a destination). Similarly, while it's possible for two different engines to produce different results*, we've been consolidating our copy engines such that using "asCopy" SHOULD give you the same result as copyItems. Finally, the kinds of edge cases that make "copying" so messy are exactly why formats like AppleArchive or zip exist. That is, the point of an archive format is to move all meaningful metadata into the file's contents, which means these are the files that are least likely to be disrupted by the nuances of a copy engine.
*There's a whole separate rant about how "file copying" is a poorly defined mirage we all pretend is meaningful
Using "asCopy:YES" saves the work of dealing with things file coordination and security-scoped access, but those aren't a huge deal.
We requested Developer Technical Support in February 2025. At that time, we were told there was no workaround. In March 2025, we received a follow-up response from DTS, but it didn't point us in a worthwhile direction.
I pulled up your TSI, and I think this was a case where we got a bit too focused on your current solution instead of on getting something to work "today". Pulling a few details from that email exchange to clarify and comment on:
Basically, UIDocumentPickerViewController copies from the external drive to the app's sandbox for improved security and better runtime performance.
No, not really. Copying is fundamentally I/O constrained, and it's very hard to make it a LOT faster without massively complicating the implementation details and risks. More to the point, having done all that work... it often won't make any difference.
Case in point, the biggest single performance gain is to move reading and writing to separate threads so you can push I/O to both devices at the same time. Except, in a case like this, something like a spinning USB drive is SO much slower than the destination SSD that you don't actually gain all that much. My guess is that using copyItem to do the copy yourself will basically have identical performance.
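The copy itself really is just one call. A minimal sketch, where "pickedURL" stands for a source URL you already have read access to (security scope and coordination come up further down):

// Sketch: copy the picked file into our own tmp directory ourselves.
NSURL *tmpDir = [NSURL fileURLWithPath:NSTemporaryDirectory() isDirectory:YES];
NSURL *destURL = [tmpDir URLByAppendingPathComponent:pickedURL.lastPathComponent];
NSError *error = nil;
if (![[NSFileManager defaultManager] copyItemAtURL:pickedURL toURL:destURL error:&error]) {
    NSLog(@"Copy failed: %@", error);
}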
We suspect the decryption and processing of archive data would be significantly slower if handled directly on the external drive.
Actually, no, that's not true. Right now, you're doing:
Copy data:
- Read data from source
- Write to destination

Unarchive data:
- Read data from destination
- Process data
- Write data to final destination
That could be done as:
- Read data from source
- Process data as it's read
- Write data to final destination
...which cuts an entire I/O cycle out of the process. Now, there are cases where that COULD be slower, primarily when processing the data requires multiple I/O passes and the performance gap between the source and destination is VERY large. However:
- Archive formats are generally designed to avoid exactly that I/O pattern.
- If this really is an issue, it's an easy one to avoid, as the writer can actually write the data twice, saving the data it directly read at the same time it also processes the data, then using the "extra" file it's writing out as the data source if/when the processing engine needs to backtrack.
Note that the architecture here basically ends up being a variant of an optimized copy engine— one thread is reading data while the other is writing; it just happens to be the case that what's being written isn't identical to what was originally read.
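To make that concrete, here's a rough sketch of the "process as you read" flow using the AppleArchive C API, assuming an unencrypted, compressed .aar for brevity (an encrypted .aea adds a decryption stream in front of the decoder) and trimming the error handling a real implementation needs:

#include <AppleArchive/AppleArchive.h>
#include <fcntl.h>

// Sketch: extract straight from the (slow) source into the final destination,
// skipping the intermediate "copy to tmp" pass entirely.
static BOOL ExtractArchive(const char *srcPath, const char *destDirPath) {
    BOOL success = NO;
    AAByteStream fileStream = AAFileStreamOpenWithPath(srcPath, O_RDONLY, 0);
    AAByteStream decompressStream = fileStream ? AADecompressionInputStreamOpen(fileStream, 0, 0) : NULL;
    AAArchiveStream decodeStream = decompressStream ? AADecodeArchiveInputStreamOpen(decompressStream, NULL, NULL, 0, 0) : NULL;
    AAArchiveStream extractStream = AAExtractArchiveOutputStreamOpen(destDirPath, NULL, NULL, 0, 0);

    if (decodeStream && extractStream) {
        // Pulls entries from the decoder and writes them out as they arrive.
        success = (AAArchiveStreamProcess(decodeStream, extractStream) >= 0);
    }

    if (extractStream) { AAArchiveStreamClose(extractStream); }
    if (decodeStream) { AAArchiveStreamClose(decodeStream); }
    if (decompressStream) { AAByteStreamClose(decompressStream); }
    if (fileStream) { AAByteStreamClose(fileStream); }
    return success;
}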
The critical issue here is this:
The data in our candidate archive file is three orders of magnitude more complex than the sample archive.
There are a TON of different "flavors" a task like this can have. For example, if processing the data itself is VERY time-consuming, you can end up in a situation where the writer thread is holding back the reader. However, every edge case also has its own solution. For example, the writer thread can directly write the data it receives to disk while a 3rd "processing thread" starts working on the data that's been written to disk.
The big thing to understand here is that none of our copy APIs are designed to be "as fast as possible", as doing so is basically impossible. For example, the ideal copy engine for "1 big file" is basically the exact OPPOSITE of one designed for "lots of small files". Our copy APIs try to be "as fast as possible" while only using one thread and without using very much memory. In most cases, that's a totally reasonable trade-off set; however, if you’re working with large data and/or have special requirements, it's also not difficult to be much faster than them.
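For example, here is the bare-bones shape of a two-queue copy (reads on the calling thread, writes on a serial queue) using NSFileHandle and GCD. A sketch only: fixed chunk size, no error handling, purely illustrative:

// Sketch: producer/consumer copy with reads and writes on separate threads.
- (void)threadedCopyFromURL:(NSURL *)srcURL toURL:(NSURL *)dstURL {
    NSFileHandle *reader = [NSFileHandle fileHandleForReadingFromURL:srcURL error:NULL];
    [[NSFileManager defaultManager] createFileAtPath:dstURL.path contents:nil attributes:nil];
    NSFileHandle *writer = [NSFileHandle fileHandleForWritingToURL:dstURL error:NULL];

    dispatch_queue_t writeQueue = dispatch_queue_create("copy.write", DISPATCH_QUEUE_SERIAL);
    dispatch_semaphore_t inFlight = dispatch_semaphore_create(4); // bound the buffered chunks
    dispatch_group_t group = dispatch_group_create();

    while (YES) {
        NSData *chunk = [reader readDataOfLength:4 * 1024 * 1024]; // read on this thread
        if (chunk.length == 0) { break; }
        dispatch_semaphore_wait(inFlight, DISPATCH_TIME_FOREVER);
        dispatch_group_async(group, writeQueue, ^{
            [writer writeData:chunk];                              // write on another thread
            dispatch_semaphore_signal(inFlight);
        });
    }
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    [writer closeFile];
    [reader closeFile];
}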
Additionally, our app utilizes NSFileManager for file I/O in lieu of NSFileCoordinator. The latter is required when directly accessing files on an external drive.
A few points here:
- Yes, you would need to use file coordination. Note that the big issue here isn't actually USB drives; it's things like cloud storage, where accessing the file is what triggers the download.
- Assuming the destination is "private" (meaning, inside your app container and not accessible by any other process), you can just do a coordinated read of the source. File coordination isn't necessary when you "know" a location will ONLY be accessible by you.
- Assuming you were copying the data, you'd just use NSFileManager's copyItemAtURL:toURL:error: (or whatever copy API you wanted) to copy the file, as sketched below. The file coordination APIs are about managing how files are accessed; they don't actually do any I/O.
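Putting those pieces together, a minimal sketch of a coordinated read of a security-scoped source wrapped around a plain copy (error handling trimmed):

// Sketch: coordinated read of the source; the actual I/O is still just NSFileManager.
- (void)coordinatedCopyFromURL:(NSURL *)srcURL toURL:(NSURL *)destURL {
    BOOL didStartAccess = [srcURL startAccessingSecurityScopedResource];
    NSFileCoordinator *coordinator = [[NSFileCoordinator alloc] initWithFilePresenter:nil];
    NSError *coordinationError = nil;

    [coordinator coordinateReadingItemAtURL:srcURL
                                    options:NSFileCoordinatorReadingWithoutChanges
                                      error:&coordinationError
                                 byAccessor:^(NSURL *readURL) {
        // By the time the accessor runs, the coordinator has made the file
        // available (for example, by triggering a cloud download).
        NSError *copyError = nil;
        [[NSFileManager defaultManager] copyItemAtURL:readURL toURL:destURL error:&copyError];
    }];

    if (didStartAccess) {
        [srcURL stopAccessingSecurityScopedResource];
    }
}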
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware