I found that the aggregated device correctly obtains input channels in the standard microphone mode. However, in voice isolation mode, it only retrieves channels from the first sub-device in the aggregated device's list. If I want to properly obtain channel information in voice isolation mode, how should I do it?
Audio
RSS for tagDive into the technical aspects of audio on your device, including codecs, format support, and customization options.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hi, In my project I am using AVFoundation for recording the audio. We are using AVAudioMixerNode class below method to record the audio packet.
**func installTap(
onBus bus: AVAudioNodeBus,
bufferSize: AVAudioFrameCount,
format: AVAudioFormat?,
block tapBlock: @escaping AVAudioNodeTapBlock
)
**
It works perfectly fine.
But in production env some small percentage of the user we are facing issue like after recording few packets it stops automatically without stopping the audio engine. Can anyone help here that why this happens? I have also observed for mediaServicesWereResetNotification and added log on receiving this notification but when this issue happens I don't see any occurence of this log. Also is there any callback when the engine stops?
Hi there!
We have a suite of AudioUnit v2 plugins that have been shipped for some time as aufx plugins, and we are looking into MIDI-related platform upgrades, so we need a way to update these plugins to request MIDI from Logic (and other AU hosts) but avoid changing our AU type and subtype so we don't break existing sessions. Any ideas on how we can do this?
Does anyone know how to pronounce the sound of a specific instrument when you tap a button on the screen on your iPhone or iPad? Now, in the middle of creating a music learning app, I'm thinking of assigning monotones or chords to the button-like frames on the keyboard and fingerboard on the screen. Can it be achieved with SwiftUI chords alone? Once upon a time, MIDI level 1 I remember that there was a pronunciation function of the instrument, but I don't think about implementing the same function in the current OS. Please lend me your wisdom.
Topic:
Media Technologies
SubTopic:
Audio
I’m implementing a custom playback pipeline using AVSampleBufferAudioRenderer together with AVSampleBufferRenderSynchronizer.
hasSufficientMediaDataForReliablePlaybackStart appears to be the intended signal for determining when enough media has been queued to start playback.
For local playback, this works well in practice — the property becomes true after a reasonable amount of media is enqueued.
However, when the output route is AirPlay, using this property becomes difficult:
AirPlay requires significantly more buffered media before the renderer reports sufficient data.
The required preroll amount is much larger than for local playback.
For short assets, it is possible to enqueue the entire audio track and still never observe hasSufficientMediaDataForReliablePlaybackStart == true.
In that situation there is no more media data to enqueue, but the renderer still reports that playback is not ready.
Given this behavior, what is the recommended way to determine playback readiness when using AVSampleBufferAudioRenderer with AirPlay?
Since iOS 18, the system setting “Allow Audio Playback” (enabled by default) allows third-party app audio to continue playing while the user is recording video with the Camera app. This has created a problem for the app I’m developing.
➡️ The problem:
My app plays continuous audio in both foreground and background states. If the user starts recording video using the iOS Camera app, the app’s audio — still playing in the background — gets captured in the video — obviously an unintended behavior.
Yes, the user could stop the app manually before starting the video recording, but that can’t be guaranteed. As a developer, I need a way to stop the app’s audio before the video recording begins.
So far, I haven’t found a reliable way to detect when video recording starts if ‘Allow Audio Playback’ is ON.
➡️ What I’ve tried:
— AVAudioSession.interruptionNotification → doesn’t fire
— devicesChangedEventStream → not triggered
I don’t want to request mic permission (app doesn’t use mic). also, disabling the app from playing audio in the background isn’t an option as it is a crucial part of the user experience
➡️ What I need:
A reliable, supported way to detect when the Camera app begins video recording, without requiring mic access — so I can stop audio and avoid unintentional overlap with the user’s recordings.
Any official guidance, workarounds, or AVFoundation techniques would be greatly appreciated.
Thanks.
Environment
Device: iPhone 16e
iOS Version: 18.4.1 - 18.7.1
Framework: AVFoundation (AVAudioEngine)
Problem Summary
On iPhone 16e (iOS 18.4.1-18.7.1), the installTap callback stops being invoked after resuming from a phone call interruption. This issue is specific to phone call interruptions and does not occur on iPhone 14, iPhone SE 3, or earlier devices.
Expected Behavior
After a phone call interruption ends and audioEngine.start() is called, the previously installed tap should continue receiving audio buffers.
Actual Behavior
After resuming from phone call interruption:
Tap callback is no longer invoked
No audio data is captured
No errors are thrown
Engine appears to be running normally
Note: Normal pause/resume (without phone call interruption) works correctly.
Steps to Reproduce
Start audio recording on iPhone 16e
Receive or make a phone call (triggers AVAudioSession interruption)
End the phone call
Resume recording with audioEngine.start()
Result: Tap callback is not invoked
Tested devices:
iPhone 16e (iOS 18.4.1-18.7.1): Issue reproduces ✗
iPhone 14 (iOS 18.x): Works correctly ✓
iPhone SE 3 (iOS 18.x): Works correctly ✓
Code
Initial Setup (Works)
let inputNode = audioEngine.inputNode
inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in
self.processAudioBuffer(buffer, at: time)
}
audioEngine.prepare()
try audioEngine.start()
Interruption Handling
NotificationCenter.default.addObserver(
forName: AVAudioSession.interruptionNotification,
object: AVAudioSession.sharedInstance(),
queue: nil
) { notification in
guard let userInfo = notification.userInfo,
let typeValue = userInfo[AVAudioSessionInterruptionTypeKey] as? UInt,
let type = AVAudioSession.InterruptionType(rawValue: typeValue) else {
return
}
if type == .began {
self.audioEngine.pause()
} else if type == .ended {
try? self.audioSession.setActive(true)
try? self.audioEngine.start()
// Tap callback doesn't work after this on iPhone 16e
}
}
Workaround
Full engine restart is required on iPhone 16e:
func resumeAfterInterruption() {
audioEngine.stop()
inputNode.removeTap(onBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in
self.processAudioBuffer(buffer, at: time)
}
audioEngine.prepare()
try audioSession.setActive(true)
try audioEngine.start()
}
This works but adds latency and complexity compared to simple resume.
Questions
Is this expected behavior on iPhone 16e?
What is the recommended way to handle phone call interruptions?
Why does this only affect iPhone 16e and not iPhone 14 or SE 3?
Any guidance would be appreciated!
Two issues:
No matter what I set in
try audioSession.setPreferredSampleRate(x)
the sample rate on both iOS and macOS is always 48000 when the output goes through the speaker, and 24000 when my Airpods connect to an iPhone/iPad.
Now, I'm checking the current output loudness to animate a 3D character, using
mixerNode.installTap(onBus: 0, bufferSize: y, format: nil) { [weak self] buffer, time in
Task { @MainActor in
// calculate rms and animate character accordingly
but any buffer size under 4800 is just ignored and the buffers I get are 4800 sized.
This is ok, when the sampleRate is currently 48000, as 10 samples per second lead to decent visual results.
But when AirPods connect, the samplerate is 24000, which means only 5 samples per second, so the character animation looks lame.
My AVAudioEngine setup is the following:
audioEngine.connect(playerNode, to: pitchShiftEffect, format: format)
audioEngine.connect(pitchShiftEffect, to: mixerNode, format: format)
audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: nil)
Now, I'd be fine if the outputNode runs at whatever if it needs, as long as my tap would get at least 10 samples per second.
PS: Specifying my favorite format in the
let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)!
mixerNode.installTap(onBus: 0, bufferSize: y, format: format)
doesn't change anything either
I neet to take pcm data from aac data, but this api has fossy me deeply.
We are currently working on a CarPlay navigation app and so far everything is working well except for speaking turn notifications.
Our TTS implementation works fine on the phone and works fine on CarPlay if the voice is spoken over the speaker in the car. If users connect a BT headset to the car and listen through that headset, then the voice commands are chopped up / stutter.
Why would users use BT headset? Well, we are working on a motorcycle app, and there are no speakers usually on a motorcycle.
It sounds like the BT channel is opened and closed repeatedly for every character / word spoken. This happens on different CarPlay devices and different Bluetooth headsets, we have reports from multiple users that they find this behavior annoying and that other apps work fine.
Is this a known issue? Are there possible workaround?
I have a new 2725QC (Dell) Monitor that uses USB-C connection to connect with the iMac (2019, 27 inch) through the back port but the problem is that the volume control can currently only be done from the hardware, not the software control using the Apple keyboard. What should I do in terms of writing code to do this (Swift or Obj-C)? Is there a third-party solution for Intel iMac and ARM Mac?
Hello,
Has anyone else experienced variations in the accuracy of the playbackTime value? After a few seconds of playback, the reported time adjusts by a fraction of a second, making it difficult to calculate the actual playbackTime of the audio.
This can be recreated by playing a song in MusicKit, recording the start time of the audio, playing for at least 10-20 seconds, and then comparing the playbackTime value to one calculated using the start time of the audio. In my experience this jump occurs after about 10 seconds of playback.
Any help would be appreciated.
Thanks!
Hi,
I'm still stuck getting a basic record-with-playthrouh pipeline to work.
Has anyone a sample of setting up a AVAudioEngine pipeline for recording with playthrough?
Plkaythrough works with AVPlayerNode as input but not with any microphone input. The docs mention the "enabled state" of the outputNode of the engine without explaining the concept, i.e. how to enable an output.
When the engine renders to and from an audio device, the AVAudioSession category and the availability of hardware determines whether an app performs output. Check the output node’s output format (specifically, the hardware format) for a nonzero sample rate and channel count to see if output is in an enabled state.
Well, in my setup the output is NOT enabled, and any attempt to switch (e.g. audioEngine.outputNode.auAudioUnit.setDeviceID(deviceID) )/ attach a dedicated device / ... results in exceptions / errors
Session player regions populate blank, with no sound media when tracks or regions are created.
We are developing an apple music app on phone, the developed web works fine on chrome, but when i load it on webivew on my phone, i can't play the first song,
We doubt that the drm init, key exchange, session creation was on the music.play() function, while we trigger the play, the drm or session was not ok for play a real song, so it got an error
so we may wanna know:
what about the realative process of drm, key, session, etc in the play() function?
are there some state detect function to show weather the drm is ok?
Topic:
Media Technologies
SubTopic:
Audio
Tags:
Apple Music API
MusicKit
MusicKit JS
Apple Music Feed
Hi everyone,
I’m testing audio recording on an iPhone 15 Plus using AVFoundation.
Here’s a simplified version of my setup:
let settings: [String: Any] = [
AVFormatIDKey: Int(kAudioFormatLinearPCM),
AVSampleRateKey: 8000,
AVNumberOfChannelsKey: 1,
AVLinearPCMBitDepthKey: 16,
AVLinearPCMIsFloatKey: false
]
audioRecorder = try AVAudioRecorder(url: fileURL, settings: settings)
audioRecorder?.record()
When I check the recorded file’s sample rate, it logs:
Actual sample rate: 8000.0
However, when I inspect the hardware sample rate:
try session.setCategory(.playAndRecord, mode: .default)
try session.setActive(true)
print("Hardware sample rate:", session.sampleRate)
I consistently get:
`Hardware sample rate: 48000.0
My questions are:
Is the iPhone mic actually capturing at 8 kHz, or is it recording at 48 kHz and then downsampling to 8 kHz internally?
Is there any way to force the hardware to record natively at 8 kHz?
If not, what’s the recommended approach for telephony-quality audio (true 8 kHz) on iOS devices?
Thanks in advance for your guidance!
I'm getting Crashlytics crashes from some my users, deep in the Apple code:
Crashed: AXSpeech
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x00000007ec54b360
0 libobjc.A.dylib 0x3c9c objc_retain_x8 + 16
1 AudioToolboxCore 0x99580 auoop::RenderPipeUser::~RenderPipeUser() + 112
2 AudioToolboxCore 0xe6090 -[AUAudioUnit_XPC internalDeallocateRenderResources] + 92
3 AVFAudio 0x90a0 AUInterfaceBaseV3::Uninitialize() + 60
4 AVFAudio 0x4cbe0 AVAudioEngineGraph::PerformCommand(AUGraphNodeBaseV3&, AVAudioEngineGraph::ENodeCommand, void*, unsigned int) const + 768
5 AVFAudio 0x56b0c AVAudioEngineGraph::_Uninitialize(NSError**) + 132
6 AVFAudio 0x7834 AVAudioEngineImpl::Stop(NSError**) + 388
7 AVFAudio 0x636c -[AVAudioEngine dealloc] + 52
8 TextToSpeech 0x30674 _TTSNameForVoiceInformation + 20864
9 libobjc.A.dylib 0x20a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
10 libobjc.A.dylib 0x6e00 objc_destructInstance + 80
11 libobjc.A.dylib 0x104fc _objc_rootDealloc + 80
12 TextToSpeech 0x2d2f4 _TTSNameForVoiceInformation + 7680
13 TextToSpeech 0x496c TTSVocalizerCopyURLForFallbackResource + 8540
14 TextToSpeech 0x26094 TTSSpeechUnitTestingMode + 5548
15 libAXSpeechManager.dylib 0x108b0 -[AXSpeechManager .cxx_destruct] + 192
16 libobjc.A.dylib 0x20a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
17 libobjc.A.dylib 0x6e00 objc_destructInstance + 80
18 libobjc.A.dylib 0x104fc _objc_rootDealloc + 80
19 libAXSpeechManager.dylib 0x5298 -[AXSpeechManager dealloc] + 268
20 Foundation 0x3b8a4 __NSThreadPerformPerform + 272
21 CoreFoundation 0xd3208 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
22 CoreFoundation 0xdf864 __CFRunLoopDoSource0 + 176
23 CoreFoundation 0x646c8 __CFRunLoopDoSources0 + 244
24 CoreFoundation 0x7a1c4 __CFRunLoopRun + 828
25 CoreFoundation 0x7f4dc CFRunLoopRunSpecific + 612
26 Foundation 0x420c4 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212
27 libAXSpeechManager.dylib 0x13390 -[AXSpeechThread main] + 552
28 Foundation 0x5b634 __NSThread__start__ + 716
29 libsystem_pthread.dylib 0x16b8 _pthread_start + 148
30 libsystem_pthread.dylib 0xb88 thread_start + 8
It's most likely related to my use of AVSpeechSynthesizer. I do change some of the utterance fields, including the voice that's being used (which is set to a value from speechVoices()).
UtilAudioIos_tts = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance
utterance.voice = AVSpeechSynthesisVoice(identifier: voice.voiceCode)
utterance.volume = volume
utterance.pitchMultiplier = pitch
utterance.rate = rate
UtilAudioIos_tts!.speak(utterance)
By coincidence or not, the following sometimes appears in the device log:
2023-05-30 20:35:29.948078+0100 <appname>[466:12882] [catalog] Unable to list voice folder
and also, sometimes:
2023-05-30 20:37:35.345933+0100 <appname>[466:13298] [catalog] Query for com.apple.MobileAsset.VoiceServices.VoiceResources failed: 2
2023-05-30 20:37:35.360854+0100 rehearserfree[466:13433] [AXTTSCommon] MauiVocalizer: 11006 (Can't compile rule): regularExpression=\Oviedo(?=, (\x1b\\pause=\d+\\)?Florida)\b, message=unrecognized character follows \, characterPosition=1
2023-05-30 20:37:35.363163+0100 <appname>[466:13433] [AXTTSCommon] MauiVocalizer: 16038 (Resource load failed): component=ttt/re, uri=, contentType=application/x-vocalizer-rettt+text, lhError=88602000
2023-05-30 20:37:35.363182+0100 <appname>[466:13433] [AXTTSCommon] Error loading rules: 2147483648
All of these crashes have been on the various versions of iOS 16.
Edit: I can't reproduce the crash myself - it's just some (not all) app users. The log entries above appear locally on my device (with no crash) but I can't see the logs of the users who have the crashes.
Any idea what this might be caused by, or how to go about tracking the problem down?
According to the documentation (https://developer.apple.com/documentation/avfoundation/avplayeritem/externalmetadata), AVPlayerItem should have an externalMetadata property. However it does not appear to be visible to my app. When I try, I get:
Value of type 'AVPlayerItem' has no member 'externalMetadata'
Documentation states iOS 12.2+; I am building with a minimum deployment target of iOS 18.
Code snippet:
import Foundation
import AVFoundation
/// ... in function ...
// create metadata as described in https://developer.apple.com/videos/play/wwdc2022/110338
var title = AVMutableMetadataItem()
title.identifier = .commonIdentifierAlbumName
title.value = "My Title" as NSString?
title.extendedLanguageTag = "und"
var playerItem = await AVPlayerItem(asset: composition)
playerItem.externalMetadata = [ title ]
I am work an app development on an app which request an audio function in background as an alert sound.
during debug testing , the function work fine,
but once I testing standalone without debugging , The function not work , it will play out the sound when I back to app.
does any way to trace the issues ?
I have integrated the ShazamKit SDK into my iOS app and would like to implement the same functionality in my Android app.
My question is: Can I use the Android version of the ShazamKit SDK for commercial purposes?
After extensive research, I could not find any official information regarding the license of the Android version of the ShazamKit SDK.
Could you please provide a formal license statement?