Context:
I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup.
I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate.
I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking.
Issue
When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block.
Details:
The audio session is already active and configured with .playAndRecord.
The input tap is already installed when the engine is started.
When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio.
Assumptions / Current Setup
Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio.
However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking.
Questions
Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine?
Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio?
Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer?
Would using separate engines for input and output improve responsiveness?
I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation.
Relevant Code Snippets
Engine Setup
func setup() {
    let input = audioEngine.inputNode
    do {
        try input.setVoiceProcessingEnabled(true)
    } catch {
        print("Could not enable voice processing \(error)")
        return
    }
    input.isVoiceProcessingAGCEnabled = false
    let output = audioEngine.outputNode
    let mainMixer = audioEngine.mainMixerNode
    audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat)
    audioEngine.connect(beepNode, to: mainMixer, format: outputFormat)
    audioEngine.connect(mainMixer, to: output, format: outputFormat)
    // Initialize converters
    converter = AVAudioConverter(from: inputFormat, to: outputFormat)!
    f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)!
    audioEngine.prepare()
}
Input Tap Installation
func installTap() {
    guard AudioHandler.shared.checkMicrophonePermission() else {
        print("Microphone not granted for recording")
        return
    }
    guard !isInputTapped else {
        print("[AudioEngine] Input is already tapped!")
        return
    }
    let input = audioEngine.inputNode
    let microphoneFormat = input.inputFormat(forBus: 0)
    let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)!
    let desiredFormat = outputFormat
    let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate)
    input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in
        guard let self = self else { return }
        // Output buffer: 1920 frames at 16kHz
        guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return }
        outputBuffer.frameLength = outputBuffer.frameCapacity
        let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
            outStatus.pointee = .haveData
            return buffer
        }
        var error: NSError?
        let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock)
        if converterResult != .haveData {
            DebugLogger.shared.print("Downsample error \(converterResult)")
        } else {
            self.handleDownsampledBuffer(outputBuffer)
        }
    }
    isInputTapped = true
}
Audio
Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.
I’m developing a macOS audio monitoring app using AVAudioEngine, and I’ve run into a critical issue on macOS 26 beta where AVFoundation fails to detect any input devices, and AVAudioEngine.start() throws the familiar error -10877.
FB#: FB19024508
Strange Behavior:
AVAudioEngine.inputNode shows no channels or input format on bus 0.
AVAudioEngine.start() fails with -10877 (AudioUnit connection error).
AVCaptureDevice.DiscoverySession returns zero audio devices.
Microphone permission is granted (authorized), and the app is properly signed and sandboxed with com.apple.security.device.audio-input.
However, CoreAudio HAL does detect all input/output devices:
Using AudioObjectGetPropertyDataSize and AudioObjectGetPropertyData with kAudioHardwarePropertyDevices, I can enumerate 14+ devices, including AirPods, USB DACs, and BlackHole.
This suggests the lower-level audio stack is functional.
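For anyone who wants to reproduce the HAL side of this comparison, the enumeration described above can be sketched roughly as follows. This is a minimal sketch, not the code from my test case: the function name halDeviceIDs is made up for illustration, and kAudioObjectPropertyElementMain assumes macOS 12 or later.

```swift
import CoreAudio

/// Returns the AudioDeviceIDs the HAL currently reports (hypothetical helper name).
func halDeviceIDs() -> [AudioDeviceID] {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDevices,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain)
    let systemObject = AudioObjectID(kAudioObjectSystemObject)

    // First ask how many bytes the device list occupies.
    var dataSize: UInt32 = 0
    guard AudioObjectGetPropertyDataSize(systemObject, &address, 0, nil, &dataSize) == noErr else {
        return []
    }

    // Then fetch the list itself into a correctly sized array.
    let count = Int(dataSize) / MemoryLayout<AudioDeviceID>.size
    var deviceIDs = [AudioDeviceID](repeating: 0, count: count)
    guard AudioObjectGetPropertyData(systemObject, &address, 0, nil, &dataSize, &deviceIDs) == noErr else {
        return []
    }
    return deviceIDs
}

print("HAL reports \(halDeviceIDs().count) devices")
```

Comparing this count with what AVCaptureDevice.DiscoverySession returns in the same process is what shows the mismatch.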
I have tried:
Resetting CoreAudio with sudo killall coreaudiod
Rebuilding and re-signing the app
Clearing TCC with tccutil reset Microphone
Running on Apple Silicon and testing Rosetta/native detection via sysctl.proc_translated
Using a fallback mechanism that logs device info from HAL and rotates logs for submission via Feedback Assistant
I have submitted logs and a reproducible test case via Feedback Assistant (FB19024508).
So,
I've been wondering how fast an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the text-to-speech starting to play (using AVSpeechUtterance), the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
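To make the breakdown easier to check, here is the arithmetic behind those figures as a small sketch. The four timestamps are copied from the InteractionStatistics log; the variable names are mine, and I'm assuming the response clock starts at timeTillLastNoiseAboveFloor.

```swift
import Foundation

// Timestamps copied from the log above, in seconds since listening started.
let lastNoiseAboveFloor = 2.383
let transcriptFinalized = 4.510
let firstModelResponse = 4.938
let ttsStarted = 4.962

// Time each stage adds after the speaker stops talking.
let sttLatency = transcriptFinalized - lastNoiseAboveFloor   // transcript finalization
let modelLatency = firstModelResponse - transcriptFinalized  // first FoundationModel output
let ttsLatency = ttsStarted - firstModelResponse             // until playback begins
let totalLatency = ttsStarted - lastNoiseAboveFloor

print(String(format: "STT %.3f s, model %.3f s, TTS %.3f s, total %.3f s",
             sttLatency, modelLatency, ttsLatency, totalLatency))
```

The STT stage accounts for roughly 2.1 of the roughly 2.6 seconds, which is the imbalance I'm asking about.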
I developed an educational app that implements audio-video communication through RTC, while using WebView to display course materials during classes. However, some users are experiencing an issue where the audio playback from WebView is very quiet. I've checked that the AVAudioSessionCategory is set by RTC to AVAudioSessionCategoryPlayAndRecord, and the AVAudioSessionCategoryOption also includes AVAudioSessionCategoryOptionMixWithOthers. What could be causing the WebView audio to be suppressed, and how can this be resolved?
Hello Apple Developer Community,
I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag.
In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so:
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="en",NAME="AUDIO-2",URI="audio_en.m3u8"
Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English").
We would like to understand the official or intended behavior regarding this.
Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label?
If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI?
Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute?
Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated.
Thank you for your time and assistance.
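For reference, this is roughly how the audio options can be inspected from the app side. It is only a sketch for comparing what AVFoundation reports against the manifest, not a statement about how the system UI chooses its label; the function name listAudioOptions is hypothetical and the async loading call assumes iOS 15 or later.

```swift
import AVFoundation

/// Prints the display name and language tag AVFoundation reports for each audio option.
/// (Hypothetical helper; pass the URL of the multivariant playlist.)
func listAudioOptions(for url: URL) async throws {
    let asset = AVURLAsset(url: url)
    // The audible group holds the alternative audio renditions, if any.
    guard let group = try await asset.loadMediaSelectionGroup(for: .audible) else {
        print("No audible media selection group")
        return
    }
    for option in group.options {
        let language = option.extendedLanguageTag ?? "unknown"
        print("displayName: \(option.displayName), language: \(language)")
    }
}
```

An app that draws its own track picker could use these values, or its own labels, and then call select(_:in:) on the AVPlayerItem; whether the system controls can be made to do the same is the open question here.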
So experimenting with the new SpeechTranscriber, if I do:
let transcriber = SpeechTranscriber(
    locale: locale,
    transcriptionOptions: [],
    reportingOptions: [.volatileResults],
    attributeOptions: [.audioTimeRange]
)
only the final result has audio time ranges, not the volatile results.
Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.
I'm not presenting the volatile text in the UI at all. I was just trying to keep statistics about the non-speech and speech noise levels, so that I can determine when the noise level has stayed below the noise floor for a while.
The goal here was to finalize the recording automatically when the noise level indicates that the user has finished speaking.
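The end-of-speech logic I have in mind looks roughly like the sketch below. It is independent of SpeechTranscriber: it works on plain sample blocks, and the type name SilenceDetector, the RMS measure, and the threshold values are all assumptions made for illustration.

```swift
import Foundation

/// Tracks how long the input level has stayed below a noise floor.
struct SilenceDetector {
    let noiseFloor: Float        // RMS threshold in linear amplitude (assumed 0...1)
    let requiredSilence: Double  // seconds below the floor before finalizing
    private(set) var silentDuration: Double = 0

    init(noiseFloor: Float, requiredSilence: Double) {
        self.noiseFloor = noiseFloor
        self.requiredSilence = requiredSilence
    }

    /// Root-mean-square level of one block of samples.
    static func rms(_ samples: [Float]) -> Float {
        guard !samples.isEmpty else { return 0 }
        let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumOfSquares / Float(samples.count)).squareRoot()
    }

    /// Feeds one block; returns true once the level has stayed below the floor long enough.
    mutating func process(_ samples: [Float], sampleRate: Double) -> Bool {
        let blockDuration = Double(samples.count) / sampleRate
        if Self.rms(samples) < noiseFloor {
            silentDuration += blockDuration
        } else {
            silentDuration = 0  // any loud block restarts the silence timer
        }
        return silentDuration >= requiredSilence
    }
}

// Example: 0.1 s blocks at 16 kHz, finalize after about half a second of quiet.
var detector = SilenceDetector(noiseFloor: 0.01, requiredSilence: 0.45)
let quietBlock = [Float](repeating: 0.001, count: 1600)
let shouldFinalize = detector.process(quietBlock, sampleRate: 16_000)
print("Finalize after first quiet block: \(shouldFinalize)")
```

Having audio time ranges on volatile results would let me align these level measurements with the words recognized so far instead of guessing.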
I tried to combine a speech detector and a speech transcriber in one analyzer,
but the speech detector is not a speech module. Please help me.
I've got a setup using AVAudioEngine with several tone generator nodes, each with a chain of processing nodes, the chains then mixed into the main output.
Generator ➡️ Effect ➡️... ➡️ .mainMixerNode ➡️ .outputNode
Generator ➡️ Effect ➡️... ⤴️
...
Generator ➡️ Effect ➡️... ⤴️
The user should be able to mute any chain individually. I've found several potential approaches to muting, but not terribly happy with any of them.
1. Adjust the amplitudes directly in my tone generators. Issue: Consumes CPU even when completely muted. 4 generators add ~15% CPU, even when all chains are muted.
2. Detach/attach chains that are muted/unmuted. Issue: Causes loud clicking/popping sounds whenever muted/unmuted.
3. Fade mixer output volume while detaching/attaching a chain (just cutting the volume immediately to 0 doesn't get rid of the clicking/popping). Issue: Causes all channels to fade during the transition, so not ideal.
The rest of these ideas are variations on making volume control + detach/attach work for individual chains, since approach #3 worked well.
4. Add an AVAudioMixerNode to the end of each chain (just for volume control). Issue: Only the mixer on the final chain functions; the others block all output. Not sure what's going on there.
5. Use a matrix mixer (for multi-input volume control), plus detach/attach to reduce CPU if necessary. Not yet attempted, due to perceived complexity and reports of fragility in the order of wiring it in. That's a lot of effort before I even know if it's going to work.
6. Develop my own fader node to put on the end of each channel. Unlike the tone generator (a simple AVAudioSourceNode), developing an effect node seems complex and time-consuming. It might not even fix the CPU use.
I'm not completely averse to the learning curve of either 5 or 6, but would rather get some guidance on best approach before diving in. They both seem likely to take more effort than I'd like for the simple behavior I'm trying to achieve.
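One variation on approach 1 that I'm considering is a short per-sample gain ramp inside each generator, since the clicks presumably come from the amplitude jumping in a single sample. Below is a minimal sketch of that idea in plain Swift; the GainRamp type, its parameters, and the 10 ms default are assumptions for illustration, not something I have verified inside an AVAudioSourceNode render block.

```swift
import Foundation

/// Per-chain gain that moves toward its target a little on every sample,
/// so muting fades out over a few milliseconds instead of jumping to zero.
struct GainRamp {
    private(set) var current: Float
    private var target: Float
    private let step: Float  // gain change per sample

    init(initialGain: Float = 1, rampSeconds: Double = 0.01, sampleRate: Double = 44_100) {
        current = initialGain
        target = initialGain
        step = Float(1.0 / max(1.0, rampSeconds * sampleRate))
    }

    mutating func setMuted(_ muted: Bool) {
        target = muted ? 0 : 1
    }

    /// True once the fade-out has finished; a generator could skip synthesis at this point.
    var isSilent: Bool { current == 0 && target == 0 }

    /// Advances the ramp by one sample and returns the gain to apply.
    mutating func next() -> Float {
        if current < target {
            current = min(target, current + step)
        } else if current > target {
            current = max(target, current - step)
        }
        return current
    }

    /// Applies the ramp in place to one block of samples.
    mutating func apply(to samples: inout [Float]) {
        for index in samples.indices {
            samples[index] *= next()
        }
    }
}

// Example: fade a constant signal out over 10 samples (10 ms at 1 kHz).
var ramp = GainRamp(rampSeconds: 0.01, sampleRate: 1_000)
var block = [Float](repeating: 1, count: 20)
ramp.setMuted(true)
ramp.apply(to: &block)
print("Fully faded out: \(ramp.isSilent)")
```

If this works, the render block could check isSilent, set the isSilence flag that AVAudioSourceNode passes in, and return early, which might address the CPU cost of muted chains without any detach/attach.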
I bought two "Apple USB-C to Headphone Jack Adapters". Upon closer inspection, they seem to be of different generations:
The one with product ID 0x110a on top is working fine. The one with product ID 0x110b has two issues:
There is a short but loud click noise on the headphone when I connect it to the iPad.
When I play audio using AVAudioPlayer the first half of a second or so is cut off.
Here's how I'm playing the audio:
audioPlayer = try AVAudioPlayer(contentsOf: url)
audioPlayer?.delegate = self
audioPlayer?.prepareToPlay()
audioPlayer?.play()
Is this a known issue? Am I doing something wrong?
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns.
Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally.
var bufferedData = Data()
func parseBytes(data: Data) {
    bufferedData.append(data)
    // XXX: this buffering reduces glitching
    // to rather infrequent. But why?
    if bufferedData.count > 32768 {
        bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in
            guard let baseAddress = bytes.baseAddress else { return }
            let result = AudioFileStreamParseBytes(audioStream!,
                                                   UInt32(bufferedData.count),
                                                   baseAddress,
                                                   [])
            if result != noErr {
                print("❌ error parsing stream: \(result)")
            }
        }
        bufferedData = Data()
    }
}
No errors are returned by AudioFileStream or AVAudioConverter.
func handlePackets(data: Data,
                   packetDescriptions: [AudioStreamPacketDescription]) {
    guard let audioConverter else {
        return
    }
    var maxPacketSize: UInt32 = 0
    for packetDescription in packetDescriptions {
        maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize)
        if packetDescription.mDataByteSize == 0 {
            print("EMPTY PACKET")
        }
        if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count {
            print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)")
        }
    }
    let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize))
    bufferIn.byteLength = UInt32(data.count)
    for i in 0 ..< Int(packetDescriptions.count) {
        bufferIn.packetDescriptions![i] = packetDescriptions[i]
    }
    bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count)
    _ = data.withUnsafeBytes { ptr in
        memcpy(bufferIn.data, ptr.baseAddress, data.count)
    }
    if verbose {
        print("handlePackets: \(data.count) bytes")
    }
    // Setup input provider closure
    var inputProvided = false
    let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in
        if !inputProvided {
            inputProvided = true
            statusPtr.pointee = .haveData
            return bufferIn
        } else {
            statusPtr.pointee = .noDataNow
            return nil
        }
    }
    // Loop until converter runs dry or is done
    while true {
        let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)!
        bufferOut.frameLength = 0
        var error: NSError?
        let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock)
        switch status {
        case .haveData:
            if verbose {
                print("✅ convert returned haveData: \(bufferOut.frameLength) frames")
            }
            if bufferOut.frameLength > 0 {
                if bufferOut.isSilent {
                    print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)")
                }
                outBuffers.append(bufferOut)
                totalFrames += Int(bufferOut.frameLength)
            }
        case .inputRanDry:
            if verbose {
                print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames")
            }
            if bufferOut.frameLength > 0 {
                if bufferOut.isSilent {
                    print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)")
                }
                outBuffers.append(bufferOut)
                totalFrames += Int(bufferOut.frameLength)
            }
            return // wait for next handlePackets
        case .endOfStream:
            if verbose {
                print("✅ convert returned endOfStream")
            }
            return
        case .error:
            if verbose {
                print("❌ convert returned error")
            }
            if let error = error {
                print("error converting: \(error.localizedDescription)")
            }
            return
        @unknown default:
            fatalError()
        }
    }
}
Hello,
I have an existing AUv3 instrument plugin. In the plugin, users can access files (audio files, song projects) via a UIDocumentPickerViewController.
In Logic Pro (and some other hosts, but not all), the document picker is unable to receive touches while a keyboard case is attached to the iPad.
Removing the case (this is an Apple brand iPad case) allows the interactions to resume and allows me to pick files in the usual way.
One of my users reports this non-responsive behavior occurs even after disconnecting their keyboard.
I have fiddled with entitlements all day, and have determined that is not the issue, since the keyboard disconnection appears to fix it every time for me.
Here is my very boilerplate presentation code:
guard let type = UTType("com.my.type") else {
    return
}
let fileBrowser = UIDocumentPickerViewController(forOpeningContentTypes: [type])
fileBrowser.overrideUserInterfaceStyle = .dark
fileBrowser.delegate = self
fileBrowser.directoryURL = myFileFolderURL()
self.present(fileBrowser, animated: true) {
Hi all,
I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect.
After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above.
After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure.
Here is the setupAudioEngine function:
private func setupAudioEngine() {
    do {
        try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default)
        try AVAudioSession.sharedInstance().setActive(true)
    } catch {
        print("Audio session error: \(error)")
    }
    engine.attach(playerNode)
    engine.attach(analyzer)
    engine.connect(playerNode, to: analyzer, format: nil)
    engine.connect(analyzer, to: engine.mainMixerNode, format: nil)
    analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in
        self?.processAudioBuffer(buffer)
    }
}
Here is the play function:
func play(_ mediaItem: MPMediaItem) {
    guard let assetURL = mediaItem.assetURL else {
        print("No asset URL for media item")
        return
    }
    stop()
    do {
        audioFile = try AVAudioFile(forReading: assetURL)
        guard let audioFile else {
            print("Failed to create audio file")
            return
        }
        duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate
        if !engine.isRunning {
            try engine.start()
        }
        playerNode.scheduleFile(audioFile, at: nil)
        playerNode.play()
        DispatchQueue.main.async { [weak self] in
            self?.isPlaying = true
            self?.startDisplayLink()
        }
    } catch {
        print("Error playing audio: \(error)")
        DispatchQueue.main.async { [weak self] in
            self?.isPlaying = false
            self?.stopDisplayLink()
        }
    }
}
Here is a link to my test project if you want to try it out for yourself:
https://github.com/aabagdi/VisualMan-example
Thanks!
When multiple identical songs are added to a playlist, Playlist.Entry.id uses a suffix-based identifier (e.g. songID_0, songID_1, etc.). Removing one entry causes others to shift, changing their .id values. This leads to diffing errors and collection view crashes in SwiftUI or UIKit when entries are updated.
Steps to Reproduce:
Add the same song to a playlist multiple times.
Observe .id.rawValue of entries (e.g. i.SONGID_0, i.SONGID_1).
Remove one entry.
Fetch playlist again — note the other IDs have shifted.
FB18879062
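Until this is addressed, the workaround I'm experimenting with is to stop using Playlist.Entry.id for diffing and keep a locally generated identifier per row instead. The sketch below shows the idea in plain Swift; PlaylistRow, reconcile, and the use of a bare song ID string are hypothetical stand-ins, not MusicKit API.

```swift
import Foundation

/// Hypothetical row model: pairs a song identifier with a locally generated, stable ID.
struct PlaylistRow: Identifiable, Equatable {
    let id: UUID        // stable for the lifetime of the row on screen
    let songID: String  // song identifier without the positional suffix
}

/// Maps freshly fetched song IDs (in playlist order) onto the rows already on screen,
/// reusing local UUIDs so that removing one duplicate does not renumber the others.
func reconcile(old: [PlaylistRow], fetchedSongIDs: [String]) -> [PlaylistRow] {
    // Group existing rows by song, keeping their original order within each group.
    var reusable = Dictionary(grouping: old, by: \.songID)
    return fetchedSongIDs.map { songID in
        if var candidates = reusable[songID], !candidates.isEmpty {
            let row = candidates.removeFirst()
            reusable[songID] = candidates
            return row
        }
        // A song that was not on screen before gets a fresh local identifier.
        return PlaylistRow(id: UUID(), songID: songID)
    }
}

// Example: the same song twice plus another song, then one duplicate is removed.
let before = [PlaylistRow(id: UUID(), songID: "A"),
              PlaylistRow(id: UUID(), songID: "A"),
              PlaylistRow(id: UUID(), songID: "B")]
let after = reconcile(old: before, fetchedSongIDs: ["A", "B"])
print("Rows after removal: \(after.count)")
```

This cannot tell which of two identical duplicates was removed, but because the rows are otherwise identical, the diff stays consistent and the identifiers stay unique.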
It sounds simple but searching for the name "Favorite Songs" is a non-starter because it's called different names in different countries, even if I specify "&l=en_us" on the query.
So is there another property, relationship or combination thereof which I can use to tell me when I've found the right playlist?
Properties I've looked at so far:
canEdit: will always be false so narrows things down a little
inFavorites: not helpful, as it depends on whether the user has favourited the favourites playlist, so not relevant
hasCatalog: seems always true so again may narrow things down a bit
isPublic: doesn't help
Adding the catalog relationship doesn't seem to show anything immediately useful either.
Can anyone help?
Ideally I'd like to see this as a "kind" or "type" as it has different properties to other playlists, but frankly I'll take anything at this point.
Sequoia 15.4.1 (24E263)
Xcode: 16.3 (16E140)
Logic Pro: 11.2.1
I’ve been developing a complex audio unit for macOS that works perfectly well in its own bespoke host app and is now well into its beta testing stage.
It did take some effort to get it to work well in Logic Pro however and all was fine and working well until:
The AU part is an empty app extension with a framework containing its code.
The framework contains Swift code for the UI and C code for the DSP parts.
When the framework is compiled using the Swift 5 compiler the AU will run in Logic with no problems.
(I should also mention that AU passes the most strict auval tests).
But… when the framework is compiled with Swift 6 Logic Pro cannot load it.
Logic displays a message saying the audio unit could not be loaded and to contact the developer.
My own host app loads the AU perfectly well with the Swift 6 version, so I know there’s nothing wrong with the audio unit.
I cannot find any differences in any of the built output files except, of course, the actual binary code in the framework.
I’ve worked for hours on this and cannot find a solution other than to build the framework in Swift 5.
(I worked hard to get all the async code updated and working with Swift 6! so I feel a little cheated!)
What is happening?
Is this a bug in Logic?
Is this a bug in Swift 6 compiler/linker?
I’m at the Duh! hands in the air, tearing out hair stage! ( once again!)
I have sent in a feedback report (FB18222398) but I have no idea if anyone has looked at it. I know from past experiences that Apple devs do look at these forums.
This applies to each of the betas: 1, 2, and 3. I have created a new Personal Voice with each beta. I create a personal voice in English. When it's done processing, I tap Preview and it says in English what is expected. But after some time, an hour or a day, the language of the voice file changes and it no longer works properly. If I press Preview, it is no longer intelligible. I have a text-to-speech app, and initially the created voice works, but once the language of the file changes, it no longer works. I have run an app on my iPhone through Xcode that prints the voices installed on the device, with their languages, to the console. Currently this is the voice file:
Voice Identifier: com.apple.speech.personalvoice.AAA9C6F2-9125-475F-BA2F-22C63274991D
Language: es-MX
and on a second device the same personal voice is in a different language:
Voice Identifier: com.apple.speech.personalvoice.AAA9C6F2-9125-475F-BA2F-22C63274991D
Language: zh-CN
Although, a previous personal voice file that listed as Spanish-Mexican played in English with a Spanish accent or when playing Spanish text, it sounded almost perfect. This current personal voice doesn't do that, and is unintelligible. Previous attempts have converted to Chinese.
I hope someone can look into this.
Since many users like me use Apple Music on Android, the app is almost as feature-rich as iOS. It would be fantastic if the developers could add the new iOS 26 features to the Android app, along with a minor UI change. I know it’s challenging to implement liquid glass on Android hardware or design, but features like auto-mix, pronunciation, and translation could be added.
Kindly consider this request!
Since macOS 26, Apple Music has had inconsistent drops in the quality of some tracks, seemingly at random. I don't know if others have experienced it. It doesn't happen on the speakers or when connected via Bluetooth, but it happens quite often on the AUX I/O. It is more noticeable on headphones with 48 kHz or higher frequency bandwidth.
The feedback ID is FB18062589.
I'm using AVFoundation to build a multi-track editor app that can insert multiple tracks and clips, including scaling a clip to change its speed (I'm also not sure whether AVFoundation is the best choice for me). After scaling with the scaleTimeRange API, there are short noise sounds during playback. Also, sometimes playback is fine when playing the AVMutableComposition using AVPlayer with an AVPlayerItem, but after exporting with AVAssetReader, short noise sounds appear in the resulting file. Not sure why.
Here is the example project, which can build and run directly. https://github.com/luckysmg/daily_images/raw/refs/heads/main/TestDemo.zip
Hello everyone,
I’m new to Swift development and have been working on an audio module that plays a specific sound at regular intervals - similar to a workout timer that signals switching exercises every few minutes.
Following AVFoundation documentation, I’m configuring my audio session like this:
let session = AVAudioSession.sharedInstance()
try session.setCategory(
    .playback,
    mode: .default,
    options: [.interruptSpokenAudioAndMixWithOthers, .duckOthers]
)
self.engine.attach(self.player)
self.engine.connect(self.player, to: self.engine.outputNode, format: self.audioFormat)
try? session.setActive(true)
When it’s time to play cues, I schedule playback on a DispatchQueue:
// scheduleAudio uses DispatchQueue
self.scheduleAudio(at: interval.start) {
    do {
        try audio.engine.start()
        audio.node.play()
        for sample in interval.samples {
            audio.node.scheduleBuffer(sample.buffer, at: AVAudioTime(hostTime: sample.hostTime))
        }
    } catch {
        print("Audio activation failed: \(error)")
    }
}
This works perfectly in the foreground. But once the app goes into the background, the scheduled callback runs, yet the audio engine fails to start, resulting in an error with code 561015905.
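In case it helps with diagnosis, the numeric code can be read as a four-character code. The helper below is a small sketch in plain Swift (the name fourCharCode is my own); it unpacks the four bytes of the status value into ASCII.

```swift
import Foundation

/// Interprets a status value as a four-character code, if all four bytes are printable ASCII.
func fourCharCode(_ status: Int32) -> String? {
    let value = UInt32(bitPattern: status)
    let shifts: [UInt32] = [24, 16, 8, 0]
    // Most significant byte first, matching how these codes are written in headers.
    let bytes = shifts.map { UInt8((value >> $0) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else { return nil }
    return String(decoding: bytes, as: UTF8.self)
}

print(fourCharCode(561015905) ?? "not a four-character code")  // prints "!pla"
```

As far as I can tell, '!pla' corresponds to AVAudioSession.ErrorCode.cannotStartPlaying, which would suggest the session is not allowed to begin playback from the background at that moment rather than a problem with the engine graph.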
Interestingly, if the app is already playing audio before going to the background, the scheduled sounds continue to play as expected.
I have added the required background audio mode to my Info.plist file by including the key UIBackgroundModes with the value audio.
Is there anything else I should configure? What is the best practice to play periodic audio when the app runs in the background? How do apps like turn-by-turn navigation handle continuous audio playback in the background?
Any advice or pointers would be greatly appreciated!