Voice Processing

Why does enabling voice processing on AVAudioInputNode make output audio noticeably quieter than without it, and how can I overcome this while keeping voice processing enabled?


Voice processing applies a distinct DSP pipeline from the one used for media playback, and it is intentionally band-limited to optimize speech intelligibility. As a result, output levels in voice processing mode are not expected to match those in non-voice processing mode — this is by design.

That said, if you are observing a significant level discrepancy that seems outside the norm, please do file a bug report with repro steps and any relevant measurements. We'd be happy to take a closer look.

I also noticed that audio played in the playAndRecord category with voice processing enabled is controlled by a different volume bar than anything played directly through AVPlayer.

For example, anything played through AVAudioEngine with voice processing enabled uses a completely separate volume bar, which can be adjusted independently of regular playback triggered alongside AVAudioEngine within the same app session.

Is this expected behaviour?

Accepted Answer

One additional behavior to be aware of: when voice processing is enabled, it will duck other audio that is not part of the voice processing output. For example, if your app has a secondary audio output stream alongside the voice processing stream, or if another app is playing audio concurrently, its volume will be significantly reduced. This is intentional behavior designed to improve the intelligibility of the voice processing output.

There is API available to control the amount of ducking applied. https://developer.apple.com/documentation/avfaudio/avaudioinputnode/voiceprocessingotheraudioduckingconfiguration
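
As a rough illustration of the ducking API linked above, here is a minimal sketch that enables voice processing on an engine's input node and requests the least amount of ducking. The function name is hypothetical, and the sketch assumes iOS 17 / macOS 14 or later, an engine that has not been started yet, and an audio session that is already configured by the app. It only affects how much other audio is ducked; it is not expected to change the level of the voice processing output itself.

```swift
import AVFAudio

/// Hypothetical helper (not from the thread): enables voice processing on the
/// engine's input node and asks the system to duck other audio minimally.
@available(iOS 17.0, macOS 14.0, *)
func enableVoiceProcessingWithMinimalDucking(on engine: AVAudioEngine) throws {
    let input = engine.inputNode

    // Voice processing can only be switched on or off while the engine is stopped.
    try input.setVoiceProcessingEnabled(true)

    // Use a fixed ducking level (advanced, voice-activity-based ducking off)
    // and pick the smallest level the API offers.
    input.voiceProcessingOtherAudioDuckingConfiguration =
        AVAudioVoiceProcessingOtherAudioDuckingConfiguration(
            enableAdvancedDucking: false,
            duckingLevel: .min
        )
}
```

Calling this before `engine.start()` should keep a secondary output stream, or another app's audio, closer to its normal level while voice processing is active. Whether `.min` is enough for a given app is something to verify by listening on a device.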

Thanks a lot!
