We have a long-form audio recording app built on AVAudioEngine. We install a tap on inputNode, accumulate the PCM buffers, and encode them to AAC in ~60-second chunks. Setup is essentially:
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default,
                            options: [.defaultToSpeaker, .allowBluetooth, .allowBluetoothA2DP])
    try session.setActive(true)

    let engine = AVAudioEngine()
    let input = engine.inputNode
    let format = input.inputFormat(forBus: 0) // valid, e.g. 48 kHz, 1 ch

    input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        // In the failure case, buffer.floatChannelData is entirely 0.0
        // (accumulate + encode to AAC)
    }

    engine.prepare()
    try engine.start()
Intermittently — and so far only reported from the field, never reproduced in normal testing — a recording comes out completely silent. When we decode the resulting AAC and inspect the raw PCM, every sample is exactly 0.0. The signature is very specific:
- The engine is running and the tap keeps firing for the full duration (normal number of buffers / full-length chunks).
- inputFormat is valid (sampleRate ≠ 0, e.g. 48 kHz).
- No error is thrown anywhere — setCategory, setActive, start(), and the tap callback all succeed.
- The PCM is literally all zeros (not low-level noise / room tone — exact 0.0).
- Two separate silent recordings produce byte-identical AAC, which points to pure digital silence rather than corruption.
So as far as our error handling, format checks, and tap-liveness are concerned, everything looks healthy — yet the microphone is delivering pure silence.
One way we can reproduce it is to record while the iPhone is being driven via macOS iPhone Mirroring. The iPhone stays locked and the mic is effectively unavailable on the device, yet our session still activates with a valid format and the tap delivers zero-filled buffers for the whole recording, with no error at any point.
What we've ruled out:
- Microphone permission: it is granted.
- Truncation or short capture: the recordings are full-length with the full frame count.
- Our encoding step: the input buffers themselves are zero.
- A quiet or obstructed mic: that would produce low-level noise, not exact 0.0.
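For reference, this is roughly how we tell "exact 0.0" apart from a quiet or obstructed mic when inspecting buffers. It is a minimal sketch, not a recommended implementation: `BufferLevel`, `classify`, and `channelZeroLevel` are our own hypothetical names, and the only AVFoundation members used are `floatChannelData` and `frameLength` on `AVAudioPCMBuffer`.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Hypothetical classification, for illustration only.
enum BufferLevel: Equatable {
    case digitalSilence          // every sample is exactly 0.0
    case signal(rmsDBFS: Float)  // at least one non-zero sample
}

/// Classifies one channel of float PCM samples.
/// An empty buffer is reported as digital silence.
func classify(_ samples: UnsafeBufferPointer<Float>) -> BufferLevel {
    var sawNonZero = false
    var sumOfSquares: Float = 0
    for sample in samples {
        if sample != 0 { sawNonZero = true }
        sumOfSquares += sample * sample
    }
    guard sawNonZero else { return .digitalSilence }
    let rms = (sumOfSquares / Float(samples.count)).squareRoot()
    // Clamp so that samples small enough to underflow do not yield -infinity.
    return .signal(rmsDBFS: 20 * log10(max(rms, Float.leastNormalMagnitude)))
}

#if canImport(AVFoundation)
extension AVAudioPCMBuffer {
    /// Level of channel 0, or nil if the buffer does not hold float PCM.
    var channelZeroLevel: BufferLevel? {
        guard let channels = floatChannelData else { return nil }
        let pointer = UnsafePointer(channels[0])
        return classify(UnsafeBufferPointer(start: pointer, count: Int(frameLength)))
    }
}
#endif
```

In our experience a covered or very quiet mic still reports `.signal` with a very low dBFS value, whereas the failing recordings report `.digitalSilence` for every buffer.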
Questions:
- What other device states or scenarios can cause a running AVAudioEngine input tap to deliver all-zero buffers with a valid format and no error? (e.g. another process/system feature holding the mic, Continuity Camera/Mic, CallKit/PushToTalk session ownership, etc.)
- Since this surfaces with no error and a valid format, what is the recommended way to detect it at runtime? Is monitoring the input level / PCM energy the only signal, or is there a supported API to know the input isn't actually live?
- What's the recommended recovery once detected — is a full session deactivate/reactivate re-handshake sufficient, or is recreating the engine required?
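On the detection question, the only approach we can see today is an energy check along the following lines, which is why we are asking whether a supported signal exists. This is a sketch under our own assumptions: `ZeroInputWatchdog` is a hypothetical helper, and the 2-second threshold is an arbitrary value we picked, not a documented figure.

```swift
/// Hypothetical helper: tracks how long the input has been exactly zero.
struct ZeroInputWatchdog {
    let sampleRate: Double
    let threshold: Double            // seconds of unbroken zeros before flagging (assumed value)
    private(set) var zeroFrames = 0  // consecutive all-zero frames seen so far

    init(sampleRate: Double, threshold: Double = 2.0) {
        self.sampleRate = sampleRate
        self.threshold = threshold
    }

    /// Feed one buffer's worth of samples from a single channel.
    /// Returns true once the unbroken run of zeros reaches the threshold.
    mutating func ingest(_ samples: UnsafeBufferPointer<Float>) -> Bool {
        if samples.allSatisfy({ $0 == 0 }) {
            zeroFrames += samples.count
        } else {
            zeroFrames = 0  // any non-zero sample means the input is live
        }
        return Double(zeroFrames) / sampleRate >= threshold
    }
}
```

We would call `ingest` from the tap callback with channel 0 of each buffer (via `floatChannelData` and `frameLength`) and trigger whatever recovery is recommended when it returns true. Since the tap does not run on the main thread, the watchdog state would need to stay confined to the tap or be otherwise synchronized.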