We are developing a DriverKit driver on Apple M1. We use the following code to prepare a DMA buffer:
// Describe the DMA constraints of the device.
IODMACommandSpecification dmaSpecification;
bzero(&dmaSpecification, sizeof(dmaSpecification));
dmaSpecification.options = kIODMACommandSpecificationNoOptions;
dmaSpecification.maxAddressBits = p_dma_mgr->maxAddressBits;
kret = IODMACommand::Create(p_dma_mgr->device,
                            kIODMACommandCreateNoOptions,
                            &dmaSpecification,
                            &impl->dma_cmd
                            );
if (kret != kIOReturnSuccess) {
    os_log(OS_LOG_DEFAULT, "Error: IODMACommand::Create failed! ret=0x%x\n", kret);
    impl->user_mem.reset();
    IOFree(impl, sizeof(*impl));
    return kret;
}
uint64_t flags = 0;
uint32_t segmentsCount = 32;
IOAddressSegment segments[32];
kret = impl->dma_cmd->PrepareForDMA(kIODMACommandPrepareForDMANoOptions,
                                    impl->user_mem.get(),
                                    0,
                                    0, // 0 for entire memory
                                    &flags,
                                    &segmentsCount,
                                    segments
                                    );
if (kret != kIOReturnSuccess) {
    OSSafeReleaseNULL(impl->dma_cmd);
    impl->user_mem.reset();
    IOFree(impl, sizeof(*impl));
    os_log(OS_LOG_DEFAULT, "Error: PrepareForDMA failed! ret=0x%x\n", kret);
    return kret;
}
I allocated several 8K BGRA video frames, each 141557760 bytes in size, and prepared them for DMA using the code above. This succeeded when the number of frames was 15 or fewer. However, with 16 frames the call failed:
Error: PrepareForDMA failed! ret=0xe00002bd
By calculating, I found that the total size of 16 video frames exceeds 2GB. Is there such a limitation in DriverKit that the total DMA size cannot exceed 2GB? Are there any methods that would allow me to bypass this restriction so I can use more video frame buffers?
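The size calculation can be checked with a short sketch. It uses only the frame size quoted above and assumes "2GB" means 2 GiB (2^31 bytes); the names kFrameBytes, kTwoGiB, totalBytes and maxFramesBelow are made up for illustration and are not part of DriverKit.

```cpp
#include <cassert>
#include <cstdint>

// Frame size quoted in the post, in bytes.
constexpr uint64_t kFrameBytes = 141557760ULL;
// Assumption: "2GB" is taken to mean 2 GiB = 2^31 bytes.
constexpr uint64_t kTwoGiB = 1ULL << 31;

// Total bytes occupied by the given number of frames.
constexpr uint64_t totalBytes(uint64_t frames) {
    return frames * kFrameBytes;
}

// Largest number of whole frames whose total size does not exceed limitBytes.
inline uint64_t maxFramesBelow(uint64_t limitBytes) {
    assert(limitBytes > 0);
    return limitBytes / kFrameBytes;
}

// 15 frames stay under 2 GiB, 16 frames do not.
static_assert(totalBytes(15) == 2123366400ULL, "15 frames");
static_assert(totalBytes(15) < kTwoGiB, "15 frames fit below 2 GiB");
static_assert(totalBytes(16) == 2264924160ULL, "16 frames");
static_assert(totalBytes(16) > kTwoGiB, "16 frames exceed 2 GiB");
```

So 15 frames total 2123366400 bytes and 16 frames total 2264924160 bytes, which puts the failure right at the 2 GiB (2147483648 byte) boundary.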
By calculating, I found that the total size of 16 video frames exceeds 2GB. Is there such a limitation in DriverKit that the total DMA size cannot exceed 2GB?
Yes, but it's actually a kernel limitation, not a DriverKit one. I mentioned it in this thread: part of the IOMMU/DART implementation intentionally limits the maximum size of any single allocation. Note that this also means that you're only ever going to get one segment out of PrepareForDMA.
That leads to here:
Are there any methods that would allow me to bypass this restriction so I can use more video frame buffers?
What are you actually trying to do? Strictly speaking, the only reason you'd specifically need a single IODMACommand that large is that you wanted a single, (apparently) physically contiguous allocation that was 2GB+... and I strongly suspect you don't actually need that. If you're actually just going to chop the DMA buffer into small pieces to feed to your PCI card, then I believe you can just use more IODMACommands to get what you want.
*To be clear, this is specifically about what's visible on the PCI bus. If you also want to interact with this memory as a logically contiguous buffer, then that's a separate and different issue.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
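
The suggestion above to use more IODMACommands could be planned along the following lines. This is only a sketch of the bookkeeping, not DriverKit code: DmaChunk and planChunks are hypothetical names, and the per-command size cap is an assumption that would have to be chosen to stay under whatever limit the kernel enforces. Whether this avoids the failure in the original post has not been verified here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One planned mapping: a byte range of the source memory that would be
// prepared through its own DMA command.
struct DmaChunk {
    uint64_t offset;  // start of the range within the memory descriptor
    uint64_t length;  // size of the range in bytes
};

// Hypothetical helper (not a DriverKit API): split [0, totalBytes) into
// consecutive ranges, each no larger than maxChunkBytes.
std::vector<DmaChunk> planChunks(uint64_t totalBytes, uint64_t maxChunkBytes) {
    assert(maxChunkBytes > 0);
    std::vector<DmaChunk> chunks;
    for (uint64_t offset = 0; offset < totalBytes; offset += maxChunkBytes) {
        const uint64_t remaining = totalBytes - offset;
        const uint64_t length = remaining < maxChunkBytes ? remaining : maxChunkBytes;
        chunks.push_back({offset, length});
    }
    return chunks;
}
```

Each planned chunk would then get its own IODMACommand, with the chunk's offset and length passed in place of the `0, 0` arguments in the PrepareForDMA call shown in the question. For example, planning 16 frames of 141557760 bytes with a one-frame cap yields 16 chunks, one per frame.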