I just ran into the same thing.
My guess is that this is a bug in the Core ML compiler or the on-device scheduler: it tries to place part of the network on the Neural Engine, even though that part contains convolutions that need more memory than the Neural Engine can handle.
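One way to check that guess is to load the model with the Neural Engine excluded and see whether the problem goes away. Here is a minimal sketch, assuming you load a compiled model yourself; the helper name `loadModelAvoidingNeuralEngine` and the `compiledModelURL` parameter are made up for illustration, while `MLModelConfiguration.computeUnits` and `.cpuAndGPU` are the standard Core ML settings.

```swift
import CoreML

// Hypothetical helper (not from the original post): loads a compiled
// Core ML model while restricting it to the CPU and GPU, so no layer
// can be scheduled onto the Neural Engine.
func loadModelAvoidingNeuralEngine(at compiledModelURL: URL) throws -> MLModel {
    let configuration = MLModelConfiguration()
    // .cpuAndGPU excludes the Neural Engine; the default is .all.
    configuration.computeUnits = .cpuAndGPU
    return try MLModel(contentsOf: compiledModelURL, configuration: configuration)
}
```

If the failure disappears with `.cpuAndGPU` and comes back with `.all`, that would at least be consistent with the Neural Engine placement being the cause, though it doesn't prove it.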