Inexplicable Metal crash ever since iOS 26.5 beta 4

Question

Created May ’26

Hi all,

I'm working on updating my audio visualizer app. I'm adding new visualizers based on Metal 4 compute shaders. They worked in iOS 26.4 and iOS 26.5 up until beta 3. However, after that, the visualizers started crashing the phone and forcing a restart. On the latest version of iOS 26.5, the crash is still there. I submitted feedback, but haven't heard anything back just yet. I was wondering if others have faced this same issue, and if there are any workarounds.

Here is my repo if you want to look at the code (forgive me if it's sloppy, I'm quite new to graphics programming and Metal): https://github.com/aabagdi/VisualMan/tree/main

Thank you!

Answered by DTS Engineer in 888299022

Answer 1

DTS Engineer OP

Apple

May ’26

Recommended

Thanks for posting and for sharing the repo — being able to look at the actual code makes a real difference here.

The phone-restart class of crash matters. App-level Metal exceptions normally crash the app, not the device. A full phone reboot from a compute shader means whatever's failing crossed out of user-space into the GPU driver or kernel. That can happen two ways:

A driver-side bug provoked by some specific user-space pattern, which would be a Metal/driver issue Apple needs to address.
Memory corruption in your app that leaves the GPU reading garbage data until the driver panics. Examples include overwriting an MTLBuffer past its allocated size, using a stale gpuResourceID for a resource that has been freed, a race between threads populating argument-table entries, or wrong format/stride assumptions. Allocator layouts and hardening change between iOS versions, so a memory issue that lay dormant under beta 3 can become a reliable kernel panic under beta 4. The "worked, then stopped working" timing fits this case too.
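
As an illustration of the first app-side example (writing past the end of an MTLBuffer), here is a minimal sketch of a bounds-checked copy. The names `writeFits` and `checkedWrite` are hypothetical and do not come from the repo. The sketch assumes a CPU-visible (shared storage) buffer and element types that are plain values such as Float or SIMD vectors.

```swift
import Foundation
#if canImport(Metal)
import Metal
#endif

/// Returns true when `count` elements of `stride` bytes, starting at
/// `byteOffset`, fit entirely inside a buffer of `capacity` bytes.
func writeFits(count: Int, stride: Int, byteOffset: Int, capacity: Int) -> Bool {
    guard count >= 0, stride > 0, byteOffset >= 0 else { return false }
    // Guard against integer overflow in count * stride.
    let (byteCount, overflowed) = count.multipliedReportingOverflow(by: stride)
    guard !overflowed else { return false }
    return byteOffset <= capacity && byteCount <= capacity - byteOffset
}

#if canImport(Metal)
/// Hypothetical helper: copies `values` into `buffer`, refusing the write
/// instead of running past `buffer.length`.
/// Assumes `buffer` uses shared storage so `contents()` is valid.
@discardableResult
func checkedWrite<T>(_ values: [T], to buffer: MTLBuffer, byteOffset: Int = 0) -> Bool {
    guard writeFits(count: values.count,
                    stride: MemoryLayout<T>.stride,
                    byteOffset: byteOffset,
                    capacity: buffer.length) else {
        assertionFailure("Write of \(values.count) elements would overrun the buffer")
        return false
    }
    values.withUnsafeBytes { source in
        guard let base = source.baseAddress else { return } // empty array, nothing to copy
        buffer.contents()
            .advanced(by: byteOffset)
            .copyMemory(from: base, byteCount: source.count)
    }
    return true
}
#endif
```

Routing every host-side write through a check like this turns a silent overrun into a debug-build assertion at the call site, which is far easier to diagnose than a device reboot.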

I can't tell from a quick look at your shader code alone which of those it is — both fit the symptom.

You're using Metal 4 specifically. Your renderers use MTL4ComputeCommandEncoder, argumentTable.setTexture(.gpuResourceID, ...), and argumentTable.setAddress(...) — that's the Metal 4 argument-table API rather than the classic MTLComputeCommandEncoder / setTexture:atIndex: style. Metal 4 arrived in iOS 26.0 and is still relatively new.

Your shader source itself looks reasonable: no infinite loops, no atomics, and no obvious correctness issues. Threadgroup sizes are within iOS limits. The largest kernels (the AbstractExpressionism ones) use 32×16 = 512 threads per group with up to nine texture bindings per dispatch, which is on the heavier side but legal. Nothing in the shader text alone would explain a kernel-level crash on a properly functioning driver. The Swift host code that sets up arguments, manages texture/buffer lifecycles, and populates the argument table is where most of the surface area for the memory-corruption case lies, and that is harder to evaluate by reading alone.
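
The threadgroup arithmetic above can be checked on the host before each dispatch. The sketch below uses hypothetical names (`ThreadgroupSize`, `fits`, `threadgroupsPerGrid`). It assumes the per-pipeline limit is read at runtime from the pipeline state's maxTotalThreadsPerThreadgroup property, and the 1920×1080 texture size is only an illustrative value.

```swift
/// Hypothetical description of a 2D threadgroup.
struct ThreadgroupSize {
    let width: Int
    let height: Int
    var threadCount: Int { width * height }
}

/// True when the group does not exceed the limit reported for the pipeline
/// (assumed to come from `maxTotalThreadsPerThreadgroup` at runtime).
func fits(_ size: ThreadgroupSize, maxTotalThreads: Int) -> Bool {
    size.width > 0 && size.height > 0 && size.threadCount <= maxTotalThreads
}

/// Number of threadgroups needed to cover a width × height grid,
/// rounding up so that partial groups at the edges are included.
func threadgroupsPerGrid(width: Int, height: Int,
                         size: ThreadgroupSize) -> (x: Int, y: Int) {
    precondition(size.width > 0 && size.height > 0, "threadgroup must be non-empty")
    return ((width + size.width - 1) / size.width,
            (height + size.height - 1) / size.height)
}

let largest = ThreadgroupSize(width: 32, height: 16)      // 512 threads
let groups = threadgroupsPerGrid(width: 1920, height: 1080, size: largest)
print(largest.threadCount, groups.x, groups.y)            // 512 60 68
```

Because the group count is rounded up, the last row of groups in this example extends past the 1080-pixel edge. Any kernel dispatched this way needs its own bounds check on the thread position before reading or writing, which is worth confirming for each kernel in the repo.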

The fastest way to tell these two classes of issue apart is to reproduce the crash in a small, focused project that uses only the Metal 4 APIs you depend on. Strip everything that isn't directly involved in the failing dispatch (the audio engine, the SwiftUI shell, the other visualizers) and see if the minimal app still kernel-panics on iOS 26.5 beta 4. If yes, the bug is in the API surface you're exercising, and the small project itself becomes an excellent Feedback Report attachment for the GPU team. If no, the bug is somewhere in the code paths you removed, which is much narrower territory to search.

A few things to enable while you investigate either way:

Metal API Validation (Edit Scheme → Run → Diagnostics → "Metal API Validation: Enabled"). This will flag many binding errors, lifetime mistakes, and threadgroup-size mismatches at the API layer before they reach the GPU.
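Address Sanitizer and Thread Sanitizer in the same Diagnostics panel, enabled one at a time because Xcode does not allow both in the same run. ASan would catch buffer overflows and use-after-free of host-side allocations; TSan would catch threading races, though note that TSan runs only in the simulator, not on a device. They run slower, but they often surface latent bugs that release builds tolerate silently.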
Metal Frame Capture during a run that reaches the crash — the captured frame shows exactly which resources are bound to which slots at the moment of the dispatch that fails, which is invaluable for spotting a wrong binding or a wrong-format texture.
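
If triggering a capture from Xcode at the right moment is awkward, a capture can also be started from code with MTLCaptureManager. The following is a sketch only: `traceURL`, `captureGPUWork`, and `CaptureUnavailable` are hypothetical names. It assumes the app's Info.plist sets MetalCaptureEnabled to YES, that the output file does not already exist, and that capturing the whole device also records the Metal 4 queue. Since the crash reboots the phone, it may only be practical to capture the frames leading up to the failing dispatch.

```swift
import Foundation
#if canImport(Metal)
import Metal
#endif

/// Builds the output location for a trace, e.g. <directory>/<label>.gputrace.
func traceURL(in directory: URL, label: String) -> URL {
    directory.appendingPathComponent(label).appendingPathExtension("gputrace")
}

#if canImport(Metal)
/// Hypothetical error for devices or configurations without file capture.
struct CaptureUnavailable: Error {}

/// Hypothetical helper: records the GPU work issued inside `work`
/// into a .gputrace document at `url`.
func captureGPUWork(on device: MTLDevice, to url: URL, _ work: () -> Void) throws {
    let manager = MTLCaptureManager.shared()
    guard manager.supportsDestination(.gpuTraceDocument) else {
        throw CaptureUnavailable()
    }
    let descriptor = MTLCaptureDescriptor()
    descriptor.captureObject = device            // capture everything on this device
    descriptor.destination = .gpuTraceDocument   // write to a file, not to Xcode
    descriptor.outputURL = url                   // must not already exist
    try manager.startCapture(with: descriptor)
    defer { manager.stopCapture() }
    work()                                       // encode and commit the frame here
}
#endif
```

A call site might pass the app's Documents directory and the visualizer name to `traceURL`, then wrap a single frame's encoding in `captureGPUWork` so the resulting trace stays small enough to attach to a feedback report.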

On the original question: I can't say from the shader code alone whether this is a Metal 4 driver-level issue or something app-side surfacing under the new beta's allocator. Could you share the FB number? With that I can look at the report, the attached diagnostic data, and the panic log to narrow further. A few additional pieces of information that would help:

Which specific visualizer crashes? All of the Metal 4 ones, or only certain ones (Abstract Expressionism, Navier-Stokes, Liquid Light, or Game of Life)? If only one or two, that narrows the trigger considerably.
A sysdiagnose captured right after one of the crashes, attached to the FB. The panic logs themselves are listed under Settings → Privacy & Security → Analytics & Improvements → Analytics Data (look for the recently generated entries), and they show exactly which kernel and driver path was active at the time of the panic.
Results from a Metal API Validation / ASan / TSan run of one of the failing visualizers — any warnings or sanitizer hits there are the strongest possible signal toward an app-side cause.
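
Because the device reboots, it can be hard to tell afterwards which visualizer was active. One possible approach is to write a small breadcrumb file before starting each visualizer and read it back on the next launch. This is a sketch under assumptions: `Breadcrumb` is a hypothetical type, and a flushed write is not guaranteed to survive every kernel panic.

```swift
import Foundation

/// Hypothetical breadcrumb: remembers the name of the visualizer that was
/// running so it can be read back after an unexpected reboot.
struct Breadcrumb {
    let url: URL

    /// Overwrites the file with `name` and asks the OS to flush it to storage.
    func record(_ name: String) throws {
        if !FileManager.default.fileExists(atPath: url.path) {
            _ = FileManager.default.createFile(atPath: url.path, contents: nil)
        }
        let handle = try FileHandle(forWritingTo: url)
        defer { try? handle.close() }
        try handle.truncate(atOffset: 0)               // drop any previous name
        try handle.write(contentsOf: Data(name.utf8))
        try handle.synchronize()                       // flush before the dispatch runs
    }

    /// The name left behind by the previous run, or nil after a clean exit.
    func lastRecorded() -> String? {
        guard let data = try? Data(contentsOf: url), !data.isEmpty else { return nil }
        return String(decoding: data, as: UTF8.self)
    }

    /// Call when a visualizer stops normally.
    func clear() {
        try? FileManager.default.removeItem(at: url)
    }
}
```

On launch, a non-nil `lastRecorded()` means the previous session ended without calling `clear()`, so the recorded name points at the visualizer that was running when the device went down.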

Answer 2

Vendetagainst OP

May ’26

I got a GPU trace. When I run it in Xcode, the replayer crashes with a segfault. Let me know if you want a copy of the trace; it's rather large, so I can't attach it here.

Answer 3

OP

Apple

May ’26

A GPU trace would definitely be helpful, though it’s possible the replayer crash is a separate issue from the on-device crash. Would you mind filing feedback with the project link and GPU trace attached, and sharing the feedback ID here?

Answer 4

Vendetagainst OP

May ’26

Sure, will do! Xcode actually already compiled and sent a feedback report. Here is the ID: FB22816142.