Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post

Replies

Boosts

Views

Activity

Unexpected AVAudioSession behavior after iOS 18.5 causing audio loss in VoIP calls
After updating to iOS 18.5, we’ve observed that outgoing audio from our app intermittently stops being transmitted during VoIP calls using AVAudioSession configured with .playAndRecord and .voiceChat. The session is set active without errors, and interruptions are handled correctly, yet audio capture suddenly ceases mid-call. This was not observed in earlier iOS versions (≤ 18.4). We’d like to confirm if there have been any recent changes in AVAudioSession, CallKit, or related media handling that could affect audio input behavior during long-running calls. func configureForVoIPCall() throws { try setCategory( .playAndRecord, mode: .voiceChat, options: [.allowBluetooth, .allowBluetoothA2DP, .defaultToSpeaker]) try setActive(true) }
1
0
323
Aug ’25
Why Does WebView Audio Get Quiet During RTC Calls? (AVAudioSession Analysis)
I developed an educational app that implements audio-video communication through RTC, while using WebView to display course materials during classes. However, some users are experiencing an issue where the audio playback from WebView is very quiet. I've checked that the AVAudioSessionCategory is set by RTC to AVAudioSessionCategoryPlayAndRecord, and the AVAudioSessionCategoryOption also includes AVAudioSessionCategoryOptionMixWithOthers. What could be causing the WebView audio to be suppressed, and how can this be resolved?
0
0
564
Jul ’25
USB microphone input : Mac "Designed for iPad"
My app - natively iOS but built with the "Designed for iPad" option to run on Mac - does not recognise an attached USB microphone when running on a Mac. This line int32_t items = (int32_t) [[[AVAudioSession sharedInstance] availableInputs] count ]; returns 1, which is the Mac internal mic. On iPad and iPhone it sees both the internal mic and the USB mic. Is this an inherent "Designed for iPad" restriction, and is there some trick I can pull to get the USB microphone to be recognised by the system?
1
0
275
Jan ’26
Play Audio and Recognize Speech in Car
Hello, I'm trying to determine the best/recommended AVAudioSession configuration (i.e category, mode, and options) for the following use-case. Essentially, I'd like to switch between periods of playing an audio file and then recognizing speech. The audio file is typically speech and I don't intend for playback and speech recognition to occur simultaneously. I'd like for the user to sill be able to interact with Siri and I'd like for it to work with CarPlay where navigation prompts can occur. I would assume the category to use is 'playAndRecord', but I'm not sure if it's better to just set that once for the entire lifecycle, or set to 'playback' for audio file playback and then switch to 'playAndRecord' for speech recognition . I'm also not sure on the best 'mode' and 'options' to set. Any suggestions would be appreciated. Thanks.
0
0
654
Sep ’25
Number of songs in the Apple Music Feed
Hello, I'm evaluating the Apple Music Feed dataset and I noticed that the total number of songs available in the feed is too small. As of today, the number of objects returned in each feed is: 51,198,712 albums 23,093,698 artists 173,235,315 songs This gives an average of 3.38 songs per album which is quite low. Also, iterating on the data I see that there are albums referencing songs that don't exist in the songs feed. I would like to know: Is the feed data incomplete? If so, in what situations an object may be missing from the feed? Thank you in advance!
0
0
337
Aug ’25
Bug: Channels erroneously populated when sending audio from an iPhone to a linux gadget audio device.
I have a device which is using linux gadget audio to receive audio input via USB, exposing 24 capture channels. This device works well with Mac, Windows, and Android phones. However, when sending audio from an iPhone (both USB-C iPhones and lightning iPhones using an official Apple lightning -> usb adaptor) I am seeing strange behaviour. Audio which is sent from the iPhone to any one of inputs 12, 19, 20, 21, or 22 appears in all of those channels, rather than only the channel to which audio is routed. I have confirmed on my linux device that these channels are not being erroneously populated by the software running on that device; the issue is visible in audio recorded directly from the gadget using arecord, meaning it is present in the audio being sent from the iPhone. I have confirmed that the gadget channel mask is correct for 24 channel audio (0xFFFFFF). As said above, audio routed to this device from any non-iPhone device (Mac, Windows, Android) works fine. The only sensible conclusion seems to be that the iPhone is populating the additional channels erroneously due to some bug in CoreAudio's handling of gadget audio devices. I would appreciate any insight on this from Apple developers, or from anyone else who has come across this issue and found a workaround.
0
0
253
2w
Unable to match music with shazamkit for Android
Hello, i can successfully match music using shazamkit on Apple using SwiftUI, a simple app that let user to load an audio file and exctracts the relative match, while i am unable to match music using shamzamkit on Android. I am trying to make the same simple app but i cannot match music as i get MATCH_ATTEMPT_FAILED every time i try to. I don't know what i am doing wrong but the shazam part in the kotlin Android code is in this method : suspend fun processAudioFileInBackground( filePath: String, developerTokenProvider: DeveloperTokenProvider ) = withContext(Dispatchers.IO) { val bufferSize = 1024 * 1024 val audioFile = FileInputStream(filePath) val byteBuffer = ByteBuffer.allocate(bufferSize) byteBuffer.order(ByteOrder.LITTLE_ENDIAN) var bytesRead: Int while (audioFile.read(byteBuffer.array()).also { bytesRead = it } != -1) { val signatureGenerator = (ShazamKit.createSignatureGenerator(AudioSampleRateInHz.SAMPLE_RATE_44100) as ShazamKitResult.Success).data signatureGenerator.append(byteBuffer.array(), bytesRead, System.currentTimeMillis()) val signature = signatureGenerator.generateSignature() println("Signature: ${signature.durationInMs}") val catalog = ShazamKit.createShazamCatalog(developerTokenProvider, Locale.ENGLISH) val session = (ShazamKit.createSession(catalog) as ShazamKitResult.Success).data val matchResult = session.match(signature) println("MatchResult : $matchResult") setMatchResult(matchResult) byteBuffer.clear() } audioFile.close() } I noticed that changing Locale in catalog creation results in different result as i get NoMatch without exception. Can you please help me with this? Do i need to create a custom catalog?
0
0
158
May ’25
failed to set category, reason: 未能完成操作。(OSStatus错误4097。)
When using the [AVAudioSession setCategory:withOptions:error:] API, the call hangs for a long time and eventually returns an error.This issue occurs on iOS 16, and did not appear in earlier versions. Thread 135: 0 libsystem_kernel.dylib 0x00000002478e3cd4 _mach_msg2_trap :8 (in libsystem_kernel.dylib) 1 libsystem_kernel.dylib 0x00000002478e7214 _mach_msg_overwrite :428 (in libsystem_kernel.dylib) 2 libsystem_kernel.dylib 0x00000002478e705c _mach_msg :24 (in libsystem_kernel.dylib) 3 libdispatch.dylib 0x00000001d63ffe84 __dispatch_mach_send_and_wait_for_reply :548 (in libdispatch.dylib) 4 libdispatch.dylib 0x00000001d6400224 _dispatch_mach_send_with_result_and_wait_for_reply :60 (in libdispatch.dylib) 5 libxpc.dylib 0x00000001b2114e04 _xpc_connection_send_message_with_reply_sync :256 (in libxpc.dylib) 6 Foundation 0x000000019b6249f0 ___NSXPCCONNECTION_IS_WAITING_FOR_A_SYNCHRONOUS_REPLY__ :16 (in Foundation) 7 Foundation 0x000000019c06d1b4 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] :2100 (in Foundation) 8 CoreFoundation 0x000000019dfcb1cc ____forwarding___ :1072 (in CoreFoundation) 9 CoreFoundation 0x000000019dfd3200 ___forwarding_prep_0___ :96 (in CoreFoundation) 10 AudioSession 0x00000001c77498b0 __ZN4avas6client11SessionCore10HandlePingEv :192 (in AudioSession) 11 AudioSession 0x00000001c77497b0 ____ZN4avas6client11SessionCore12DispatchPingEv_block_invoke :52 (in AudioSession) 12 libdispatch.dylib 0x00000001d63e4adc __dispatch_call_block_and_release :32 (in libdispatch.dylib) 13 libdispatch.dylib 0x00000001d63fe7ec __dispatch_client_callout :16 (in libdispatch.dylib) 14 libdispatch.dylib 0x00000001d63ed468 __dispatch_lane_serial_drain :740 (in libdispatch.dylib) 15 libdispatch.dylib 0x00000001d63edf78 __dispatch_lane_invoke :440 (in libdispatch.dylib) 16 libdispatch.dylib 0x00000001d63f6f48 __dispatch_root_queue_drain :364 (in libdispatch.dylib) 17 libdispatch.dylib 0x00000001d63f6d08 __dispatch_worker_thread :268 (in libdispatch.dylib) 18 libsystem_pthread.dylib 0x00000001f9ff144c __pthread_start :136 (in libsystem_pthread.dylib) 19 libsystem_pthread.dylib 0x00000001f9fed8cc _thread_start :8 (in libsystem_pthread.dylib) Thread 132: 0 libsystem_kernel.dylib 0x00000002478e3cd4 _mach_msg2_trap :8 (in libsystem_kernel.dylib) 1 libsystem_kernel.dylib 0x00000002478e7214 _mach_msg_overwrite :428 (in libsystem_kernel.dylib) 2 libsystem_kernel.dylib 0x00000002478e705c _mach_msg :24 (in libsystem_kernel.dylib) 3 libdispatch.dylib 0x00000001d63ffe84 __dispatch_mach_send_and_wait_for_reply :548 (in libdispatch.dylib) 4 libdispatch.dylib 0x00000001d6400224 _dispatch_mach_send_with_result_and_wait_for_reply :60 (in libdispatch.dylib) 5 libxpc.dylib 0x00000001b2114e04 _xpc_connection_send_message_with_reply_sync :256 (in libxpc.dylib) 6 Foundation 0x000000019b6249f0 ___NSXPCCONNECTION_IS_WAITING_FOR_A_SYNCHRONOUS_REPLY__ :16 (in Foundation) 7 Foundation 0x000000019c06d1b4 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] :2100 (in Foundation) 8 CoreFoundation 0x000000019dfcb1cc ____forwarding___ :1072 (in CoreFoundation) 9 CoreFoundation 0x000000019dfd3200 ___forwarding_prep_0___ :96 (in CoreFoundation) 10 AudioSession 0x00000001c7754198 __ZNK4avas6client11SessionCore18SetBatchPropertiesEP12NSDictionaryIP8NSStringPU25objcproto14NSSecureCoding11objc_objectEPU15__autoreleasingP7NSArrayIPS2_IS4_P8NSNumberEENS_30AVAudioSessionBatchSetStrategyEbb :548 (in AudioSession) 11 AudioSession 0x00000001c7753e58 __ZNK4avas6client11SessionCore20SetBatchPropertiesMXEP12NSDictionaryIP8NSStringPU25objcproto14NSSecureCoding11objc_objectE :92 (in AudioSession) 12 AudioSession 0x00000001c775179c __ZN4avas6client11SessionCore11setCategoryEP8NSStringS3_32AVAudioSessionRouteSharingPolicym :472 (in AudioSession) 13 AudioSession 0x00000001c7768f88 -[AVAudioSession setCategory:withOptions:error:] :68 (in AudioSession) 14 AlipayWallet 0x000000010140580c -[AVAudioSession(APMHook) apmhook_setCategory:withOptions:error:] APMHookAudioSession.m:35 (in AlipayWallet) 15 AlipayWallet 0x00000001014001a4 -[APMAudioSessionManager resume] APMAudioSessionManager.m:718 (in AlipayWallet) 16 libdispatch.dylib 0x00000001d63e4adc __dispatch_call_block_and_release :32 (in libdispatch.dylib) 17 libdispatch.dylib 0x00000001d63fe7ec __dispatch_client_callout :16 (in libdispatch.dylib) 18 libdispatch.dylib 0x00000001d63ed468 __dispatch_lane_serial_drain :740 (in libdispatch.dylib) 19 libdispatch.dylib 0x00000001d63edf44 __dispatch_lane_invoke :388 (in libdispatch.dylib) 20 libdispatch.dylib 0x00000001d63f83ec __dispatch_root_queue_drain_deferred_wlh :292 (in libdispatch.dylib) 21 libdispatch.dylib 0x00000001d63f7ce4 __dispatch_workloop_worker_thread :692 (in libdispatch.dylib) 22 libsystem_pthread.dylib 0x00000001f9fee3b8 __pthread_wqthread :292 (in libsystem_pthread.dylib) 23 libsystem_pthread.dylib 0x00000001f9fed8c0 _start_wqthread :8 (in libsystem_pthread.dylib)
2
0
844
Feb ’26
How to record voice, auto-transcribe, translate (auto-detect input language), and play back translated audio on same device in iOS Swift?
Hi everyone 👋 I’m building an iOS app in Swift where I want to do the following: Record the user’s voice Transcribe the spoken sentence (speech-to-text) Auto-detect the spoken language Translate it to another language selected by the user (e.g., English → Spanish or Hindi → English) Speak back (text-to-speech) the translated text on the same device Is this possible to record via phone mic and play the transcribe voice into headphone's audio?
0
0
289
Oct ’25
Question about Apple Vision Pro audio input sampling rate for research
I am a graduate student conducting research in speech/audio signal processing and multimodal interaction. Apple Vision Pro is widely recognized as a multimodal interactive system supporting voice, eye, and gesture inputs. However, I could not find detailed specifications or documentation about the audio input sampling rate used by the device’s built-in microphone array when capturing user audio. Specifically, I would like to understand: What is the default audio input sampling rate (e.g., 16 kHz, 44.1 kHz, 48 kHz, etc.) for the Vision Pro’s microphones? When developing with visionOS / AVAudioSession / AVAudioEngine, is there a documented or recommended sampling rate for audio capture? Are there any best practices or settings for enabling high-quality voice capture on Vision Pro (especially for voice research tasks)? For context, my work involves voice processing, analysis, and possibly on-device real-time speech recognition. Any pointers to relevant APIs, documentation or examples (especially regarding audio capture buffer size or available formats on visionOS) would be very helpful. Thank you in advance! Best regards.
0
0
198
Jan ’26
macOS sample for AVAudioEngine recording with playthrough
Hi, I'm still stuck getting a basic record-with-playthrouh pipeline to work. Has anyone a sample of setting up a AVAudioEngine pipeline for recording with playthrough? Plkaythrough works with AVPlayerNode as input but not with any microphone input. The docs mention the "enabled state" of the outputNode of the engine without explaining the concept, i.e. how to enable an output. When the engine renders to and from an audio device, the AVAudioSession category and the availability of hardware determines whether an app performs output. Check the output node’s output format (specifically, the hardware format) for a nonzero sample rate and channel count to see if output is in an enabled state. Well, in my setup the output is NOT enabled, and any attempt to switch (e.g. audioEngine.outputNode.auAudioUnit.setDeviceID(deviceID) )/ attach a dedicated device / ... results in exceptions / errors
0
0
412
Oct ’25
CarPlay: Voice Conversational Entitlement Details
With the Voice Conversational Entitlement, can a CarPlay app establish a turn-based audio interface that operates in two modes: Speaking mode: Audio Session configured for playback Buffered audio Listening mode: Switch Audio Session to .record or .playAndRecord Activate SFSpeechRecognizer And continue toggling back and forth. The app should listen for responses to questions or other audio cues, and assuming those answers are correct (based on analysis of results from SFSpeechRecognizer), continue this pattern of mode 1 and 2 alternating. This appears to be a valid use of this entitlement. Does this also require the Audio App Entitlement, or is the Voice Conversational Entitlement sufficient? Are there other obstacles to this type of app that I'm not seeing? Or perhaps this is technically possible, but unlikely to pass app store review?
0
0
203
4w
AVSpeechSynthesizer pulls words out of thin air.
Hi, I'm working on a project that uses the AVSpeechSynthesizer and AVSpeechUtterance. I discovered by chance that the AVSpeechSynthesizer automatically completes some words instead of just outputting what it's supposed to. These are abbreviations for days of the week or months, but not all of them. I don't want either of them automatically completed, but only the specified text. The completion transcends languages. I have written a short example program for demonstration purposes. import SwiftUI import AVFoundation import Foundation let synthesizer: AVSpeechSynthesizer = AVSpeechSynthesizer() struct ContentView: View { var body: some View { VStack { Button { utter("mon") } label: { Text("mon") } .buttonStyle(.borderedProminent) Button { utter("tue") } label: { Text("tue") } .buttonStyle(.borderedProminent) Button { utter("thu") } label: { Text("thu") } .buttonStyle(.borderedProminent) Button { utter("feb") } label: { Text("feb") } .buttonStyle(.borderedProminent) Button { utter("feb", lang: "de-DE") } label: { Text("feb DE") } .buttonStyle(.borderedProminent) Button { utter("wed") } label: { Text("wed") } .buttonStyle(.borderedProminent) } .padding() } private func utter(_ text: String, lang: String = "en-US") { let utterance = AVSpeechUtterance(string: text) let voice = AVSpeechSynthesisVoice(language: lang) utterance.voice = voice synthesizer.speak(utterance) } } #Preview { ContentView() } Thank you Christian
1
0
231
Nov ’25
watchOS longFormAudio cannot de active
My workout watch app supports audio playback during exercise sessions. When users carry both Apple Watch, iPhone, and AirPods, with AirPods connected to the iPhone, I want to route audio from Apple Watch to AirPods for playback. I've implemented this functionality using the following code. try? session.setCategory(.playback, mode: .default, policy: .longFormAudio, options: []) try await session.activate() When users are playing music on iPhone and trigger my code in the watch app, Apple Watch correctly guides users to select AirPods, pauses the iPhone's music, and plays my audio. However, when playback finishes and I end the session using the code below: try session.setActive(false, options:[.notifyOthersOnDeactivation]) the iPhone doesn't automatically resume the previously interrupted music playback—it requires manual intervention. Is this expected behavior, or am I missing other important steps in my code?
1
0
340
Nov ’25
coreaudiod display sleep
hi all, as soon an audio is played in a whatever app, coreaudiod inserts a sleep prevent assertion for both, the system AND the display. can i somehow stop the insertion of the display sleep assertion? pid 223(coreaudiod): [0x00004e9e00058dc2] 00:03:18 PreventUserIdleDisplaySleep named: "com.apple.audio.AppleGFXHDAEngineOutputDP:10001:0:{B31A-08C6-00000000}.context.preventuseridledisplaysleep" Created for PID: 4145. where PID 4145 is spotify. but it doesn't matter which app is playing the audio. any help would be appreciated thanks
0
0
89
Nov ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
0
0
585
Jul ’25
Audio System Trace: Zero Time Stamp
In Instruments, I'm seeing "Zero Time Stamp" events in the "Audio Server" lane. What does that mean?
Replies
1
Boosts
0
Views
191
Activity
Mar ’26
Unexpected AVAudioSession behavior after iOS 18.5 causing audio loss in VoIP calls
After updating to iOS 18.5, we’ve observed that outgoing audio from our app intermittently stops being transmitted during VoIP calls using AVAudioSession configured with .playAndRecord and .voiceChat. The session is set active without errors, and interruptions are handled correctly, yet audio capture suddenly ceases mid-call. This was not observed in earlier iOS versions (≤ 18.4). We’d like to confirm if there have been any recent changes in AVAudioSession, CallKit, or related media handling that could affect audio input behavior during long-running calls. func configureForVoIPCall() throws { try setCategory( .playAndRecord, mode: .voiceChat, options: [.allowBluetooth, .allowBluetoothA2DP, .defaultToSpeaker]) try setActive(true) }
Replies
1
Boosts
0
Views
323
Activity
Aug ’25
Why Does WebView Audio Get Quiet During RTC Calls? (AVAudioSession Analysis)
I developed an educational app that implements audio-video communication through RTC, while using WebView to display course materials during classes. However, some users are experiencing an issue where the audio playback from WebView is very quiet. I've checked that the AVAudioSessionCategory is set by RTC to AVAudioSessionCategoryPlayAndRecord, and the AVAudioSessionCategoryOption also includes AVAudioSessionCategoryOptionMixWithOthers. What could be causing the WebView audio to be suppressed, and how can this be resolved?
Replies
0
Boosts
0
Views
564
Activity
Jul ’25
Apple Device Sync Backup
When using the Apple Devices to sync Apple Music to iPhone where is the Apple Devices backup being written to? Apple Devices->music->sync. Not trying to backup the iPhone via Apple Devices app.
Replies
0
Boosts
0
Views
92
Activity
Jun ’25
USB microphone input : Mac "Designed for iPad"
My app - natively iOS but built with the "Designed for iPad" option to run on Mac - does not recognise an attached USB microphone when running on a Mac. This line int32_t items = (int32_t) [[[AVAudioSession sharedInstance] availableInputs] count ]; returns 1, which is the Mac internal mic. On iPad and iPhone it sees both the internal mic and the USB mic. Is this an inherent "Designed for iPad" restriction, and is there some trick I can pull to get the USB microphone to be recognised by the system?
Replies
1
Boosts
0
Views
275
Activity
Jan ’26
Play Audio and Recognize Speech in Car
Hello, I'm trying to determine the best/recommended AVAudioSession configuration (i.e category, mode, and options) for the following use-case. Essentially, I'd like to switch between periods of playing an audio file and then recognizing speech. The audio file is typically speech and I don't intend for playback and speech recognition to occur simultaneously. I'd like for the user to sill be able to interact with Siri and I'd like for it to work with CarPlay where navigation prompts can occur. I would assume the category to use is 'playAndRecord', but I'm not sure if it's better to just set that once for the entire lifecycle, or set to 'playback' for audio file playback and then switch to 'playAndRecord' for speech recognition . I'm also not sure on the best 'mode' and 'options' to set. Any suggestions would be appreciated. Thanks.
Replies
0
Boosts
0
Views
654
Activity
Sep ’25
Number of songs in the Apple Music Feed
Hello, I'm evaluating the Apple Music Feed dataset and I noticed that the total number of songs available in the feed is too small. As of today, the number of objects returned in each feed is: 51,198,712 albums 23,093,698 artists 173,235,315 songs This gives an average of 3.38 songs per album which is quite low. Also, iterating on the data I see that there are albums referencing songs that don't exist in the songs feed. I would like to know: Is the feed data incomplete? If so, in what situations an object may be missing from the feed? Thank you in advance!
Replies
0
Boosts
0
Views
337
Activity
Aug ’25
Bug: Channels erroneously populated when sending audio from an iPhone to a linux gadget audio device.
I have a device which is using linux gadget audio to receive audio input via USB, exposing 24 capture channels. This device works well with Mac, Windows, and Android phones. However, when sending audio from an iPhone (both USB-C iPhones and lightning iPhones using an official Apple lightning -> usb adaptor) I am seeing strange behaviour. Audio which is sent from the iPhone to any one of inputs 12, 19, 20, 21, or 22 appears in all of those channels, rather than only the channel to which audio is routed. I have confirmed on my linux device that these channels are not being erroneously populated by the software running on that device; the issue is visible in audio recorded directly from the gadget using arecord, meaning it is present in the audio being sent from the iPhone. I have confirmed that the gadget channel mask is correct for 24 channel audio (0xFFFFFF). As said above, audio routed to this device from any non-iPhone device (Mac, Windows, Android) works fine. The only sensible conclusion seems to be that the iPhone is populating the additional channels erroneously due to some bug in CoreAudio's handling of gadget audio devices. I would appreciate any insight on this from Apple developers, or from anyone else who has come across this issue and found a workaround.
Replies
0
Boosts
0
Views
253
Activity
2w
Unable to match music with shazamkit for Android
Hello, i can successfully match music using shazamkit on Apple using SwiftUI, a simple app that let user to load an audio file and exctracts the relative match, while i am unable to match music using shamzamkit on Android. I am trying to make the same simple app but i cannot match music as i get MATCH_ATTEMPT_FAILED every time i try to. I don't know what i am doing wrong but the shazam part in the kotlin Android code is in this method : suspend fun processAudioFileInBackground( filePath: String, developerTokenProvider: DeveloperTokenProvider ) = withContext(Dispatchers.IO) { val bufferSize = 1024 * 1024 val audioFile = FileInputStream(filePath) val byteBuffer = ByteBuffer.allocate(bufferSize) byteBuffer.order(ByteOrder.LITTLE_ENDIAN) var bytesRead: Int while (audioFile.read(byteBuffer.array()).also { bytesRead = it } != -1) { val signatureGenerator = (ShazamKit.createSignatureGenerator(AudioSampleRateInHz.SAMPLE_RATE_44100) as ShazamKitResult.Success).data signatureGenerator.append(byteBuffer.array(), bytesRead, System.currentTimeMillis()) val signature = signatureGenerator.generateSignature() println("Signature: ${signature.durationInMs}") val catalog = ShazamKit.createShazamCatalog(developerTokenProvider, Locale.ENGLISH) val session = (ShazamKit.createSession(catalog) as ShazamKitResult.Success).data val matchResult = session.match(signature) println("MatchResult : $matchResult") setMatchResult(matchResult) byteBuffer.clear() } audioFile.close() } I noticed that changing Locale in catalog creation results in different result as i get NoMatch without exception. Can you please help me with this? Do i need to create a custom catalog?
Replies
0
Boosts
0
Views
158
Activity
May ’25
failed to set category, reason: 未能完成操作。(OSStatus错误4097。)
When using the [AVAudioSession setCategory:withOptions:error:] API, the call hangs for a long time and eventually returns an error.This issue occurs on iOS 16, and did not appear in earlier versions. Thread 135: 0 libsystem_kernel.dylib 0x00000002478e3cd4 _mach_msg2_trap :8 (in libsystem_kernel.dylib) 1 libsystem_kernel.dylib 0x00000002478e7214 _mach_msg_overwrite :428 (in libsystem_kernel.dylib) 2 libsystem_kernel.dylib 0x00000002478e705c _mach_msg :24 (in libsystem_kernel.dylib) 3 libdispatch.dylib 0x00000001d63ffe84 __dispatch_mach_send_and_wait_for_reply :548 (in libdispatch.dylib) 4 libdispatch.dylib 0x00000001d6400224 _dispatch_mach_send_with_result_and_wait_for_reply :60 (in libdispatch.dylib) 5 libxpc.dylib 0x00000001b2114e04 _xpc_connection_send_message_with_reply_sync :256 (in libxpc.dylib) 6 Foundation 0x000000019b6249f0 ___NSXPCCONNECTION_IS_WAITING_FOR_A_SYNCHRONOUS_REPLY__ :16 (in Foundation) 7 Foundation 0x000000019c06d1b4 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] :2100 (in Foundation) 8 CoreFoundation 0x000000019dfcb1cc ____forwarding___ :1072 (in CoreFoundation) 9 CoreFoundation 0x000000019dfd3200 ___forwarding_prep_0___ :96 (in CoreFoundation) 10 AudioSession 0x00000001c77498b0 __ZN4avas6client11SessionCore10HandlePingEv :192 (in AudioSession) 11 AudioSession 0x00000001c77497b0 ____ZN4avas6client11SessionCore12DispatchPingEv_block_invoke :52 (in AudioSession) 12 libdispatch.dylib 0x00000001d63e4adc __dispatch_call_block_and_release :32 (in libdispatch.dylib) 13 libdispatch.dylib 0x00000001d63fe7ec __dispatch_client_callout :16 (in libdispatch.dylib) 14 libdispatch.dylib 0x00000001d63ed468 __dispatch_lane_serial_drain :740 (in libdispatch.dylib) 15 libdispatch.dylib 0x00000001d63edf78 __dispatch_lane_invoke :440 (in libdispatch.dylib) 16 libdispatch.dylib 0x00000001d63f6f48 __dispatch_root_queue_drain :364 (in libdispatch.dylib) 17 libdispatch.dylib 0x00000001d63f6d08 __dispatch_worker_thread :268 (in libdispatch.dylib) 18 libsystem_pthread.dylib 0x00000001f9ff144c __pthread_start :136 (in libsystem_pthread.dylib) 19 libsystem_pthread.dylib 0x00000001f9fed8cc _thread_start :8 (in libsystem_pthread.dylib) Thread 132: 0 libsystem_kernel.dylib 0x00000002478e3cd4 _mach_msg2_trap :8 (in libsystem_kernel.dylib) 1 libsystem_kernel.dylib 0x00000002478e7214 _mach_msg_overwrite :428 (in libsystem_kernel.dylib) 2 libsystem_kernel.dylib 0x00000002478e705c _mach_msg :24 (in libsystem_kernel.dylib) 3 libdispatch.dylib 0x00000001d63ffe84 __dispatch_mach_send_and_wait_for_reply :548 (in libdispatch.dylib) 4 libdispatch.dylib 0x00000001d6400224 _dispatch_mach_send_with_result_and_wait_for_reply :60 (in libdispatch.dylib) 5 libxpc.dylib 0x00000001b2114e04 _xpc_connection_send_message_with_reply_sync :256 (in libxpc.dylib) 6 Foundation 0x000000019b6249f0 ___NSXPCCONNECTION_IS_WAITING_FOR_A_SYNCHRONOUS_REPLY__ :16 (in Foundation) 7 Foundation 0x000000019c06d1b4 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] :2100 (in Foundation) 8 CoreFoundation 0x000000019dfcb1cc ____forwarding___ :1072 (in CoreFoundation) 9 CoreFoundation 0x000000019dfd3200 ___forwarding_prep_0___ :96 (in CoreFoundation) 10 AudioSession 0x00000001c7754198 __ZNK4avas6client11SessionCore18SetBatchPropertiesEP12NSDictionaryIP8NSStringPU25objcproto14NSSecureCoding11objc_objectEPU15__autoreleasingP7NSArrayIPS2_IS4_P8NSNumberEENS_30AVAudioSessionBatchSetStrategyEbb :548 (in AudioSession) 11 AudioSession 0x00000001c7753e58 __ZNK4avas6client11SessionCore20SetBatchPropertiesMXEP12NSDictionaryIP8NSStringPU25objcproto14NSSecureCoding11objc_objectE :92 (in AudioSession) 12 AudioSession 0x00000001c775179c __ZN4avas6client11SessionCore11setCategoryEP8NSStringS3_32AVAudioSessionRouteSharingPolicym :472 (in AudioSession) 13 AudioSession 0x00000001c7768f88 -[AVAudioSession setCategory:withOptions:error:] :68 (in AudioSession) 14 AlipayWallet 0x000000010140580c -[AVAudioSession(APMHook) apmhook_setCategory:withOptions:error:] APMHookAudioSession.m:35 (in AlipayWallet) 15 AlipayWallet 0x00000001014001a4 -[APMAudioSessionManager resume] APMAudioSessionManager.m:718 (in AlipayWallet) 16 libdispatch.dylib 0x00000001d63e4adc __dispatch_call_block_and_release :32 (in libdispatch.dylib) 17 libdispatch.dylib 0x00000001d63fe7ec __dispatch_client_callout :16 (in libdispatch.dylib) 18 libdispatch.dylib 0x00000001d63ed468 __dispatch_lane_serial_drain :740 (in libdispatch.dylib) 19 libdispatch.dylib 0x00000001d63edf44 __dispatch_lane_invoke :388 (in libdispatch.dylib) 20 libdispatch.dylib 0x00000001d63f83ec __dispatch_root_queue_drain_deferred_wlh :292 (in libdispatch.dylib) 21 libdispatch.dylib 0x00000001d63f7ce4 __dispatch_workloop_worker_thread :692 (in libdispatch.dylib) 22 libsystem_pthread.dylib 0x00000001f9fee3b8 __pthread_wqthread :292 (in libsystem_pthread.dylib) 23 libsystem_pthread.dylib 0x00000001f9fed8c0 _start_wqthread :8 (in libsystem_pthread.dylib)
Replies
2
Boosts
0
Views
844
Activity
Feb ’26
How to record voice, auto-transcribe, translate (auto-detect input language), and play back translated audio on same device in iOS Swift?
Hi everyone 👋 I’m building an iOS app in Swift where I want to do the following: Record the user’s voice Transcribe the spoken sentence (speech-to-text) Auto-detect the spoken language Translate it to another language selected by the user (e.g., English → Spanish or Hindi → English) Speak back (text-to-speech) the translated text on the same device Is this possible to record via phone mic and play the transcribe voice into headphone's audio?
Replies
0
Boosts
0
Views
289
Activity
Oct ’25
Question about Apple Vision Pro audio input sampling rate for research
I am a graduate student conducting research in speech/audio signal processing and multimodal interaction. Apple Vision Pro is widely recognized as a multimodal interactive system supporting voice, eye, and gesture inputs. However, I could not find detailed specifications or documentation about the audio input sampling rate used by the device’s built-in microphone array when capturing user audio. Specifically, I would like to understand: What is the default audio input sampling rate (e.g., 16 kHz, 44.1 kHz, 48 kHz, etc.) for the Vision Pro’s microphones? When developing with visionOS / AVAudioSession / AVAudioEngine, is there a documented or recommended sampling rate for audio capture? Are there any best practices or settings for enabling high-quality voice capture on Vision Pro (especially for voice research tasks)? For context, my work involves voice processing, analysis, and possibly on-device real-time speech recognition. Any pointers to relevant APIs, documentation or examples (especially regarding audio capture buffer size or available formats on visionOS) would be very helpful. Thank you in advance! Best regards.
Replies
0
Boosts
0
Views
198
Activity
Jan ’26
macOS sample for AVAudioEngine recording with playthrough
Hi, I'm still stuck getting a basic record-with-playthrouh pipeline to work. Has anyone a sample of setting up a AVAudioEngine pipeline for recording with playthrough? Plkaythrough works with AVPlayerNode as input but not with any microphone input. The docs mention the "enabled state" of the outputNode of the engine without explaining the concept, i.e. how to enable an output. When the engine renders to and from an audio device, the AVAudioSession category and the availability of hardware determines whether an app performs output. Check the output node’s output format (specifically, the hardware format) for a nonzero sample rate and channel count to see if output is in an enabled state. Well, in my setup the output is NOT enabled, and any attempt to switch (e.g. audioEngine.outputNode.auAudioUnit.setDeviceID(deviceID) )/ attach a dedicated device / ... results in exceptions / errors
Replies
0
Boosts
0
Views
412
Activity
Oct ’25
CarPlay: Voice Conversational Entitlement Details
With the Voice Conversational Entitlement, can a CarPlay app establish a turn-based audio interface that operates in two modes: Speaking mode: Audio Session configured for playback Buffered audio Listening mode: Switch Audio Session to .record or .playAndRecord Activate SFSpeechRecognizer And continue toggling back and forth. The app should listen for responses to questions or other audio cues, and assuming those answers are correct (based on analysis of results from SFSpeechRecognizer), continue this pattern of mode 1 and 2 alternating. This appears to be a valid use of this entitlement. Does this also require the Audio App Entitlement, or is the Voice Conversational Entitlement sufficient? Are there other obstacles to this type of app that I'm not seeing? Or perhaps this is technically possible, but unlikely to pass app store review?
Replies
0
Boosts
0
Views
203
Activity
4w
AVSpeechSynthesizer pulls words out of thin air.
Hi, I'm working on a project that uses the AVSpeechSynthesizer and AVSpeechUtterance. I discovered by chance that the AVSpeechSynthesizer automatically completes some words instead of just outputting what it's supposed to. These are abbreviations for days of the week or months, but not all of them. I don't want either of them automatically completed, but only the specified text. The completion transcends languages. I have written a short example program for demonstration purposes. import SwiftUI import AVFoundation import Foundation let synthesizer: AVSpeechSynthesizer = AVSpeechSynthesizer() struct ContentView: View { var body: some View { VStack { Button { utter("mon") } label: { Text("mon") } .buttonStyle(.borderedProminent) Button { utter("tue") } label: { Text("tue") } .buttonStyle(.borderedProminent) Button { utter("thu") } label: { Text("thu") } .buttonStyle(.borderedProminent) Button { utter("feb") } label: { Text("feb") } .buttonStyle(.borderedProminent) Button { utter("feb", lang: "de-DE") } label: { Text("feb DE") } .buttonStyle(.borderedProminent) Button { utter("wed") } label: { Text("wed") } .buttonStyle(.borderedProminent) } .padding() } private func utter(_ text: String, lang: String = "en-US") { let utterance = AVSpeechUtterance(string: text) let voice = AVSpeechSynthesisVoice(language: lang) utterance.voice = voice synthesizer.speak(utterance) } } #Preview { ContentView() } Thank you Christian
Replies
1
Boosts
0
Views
231
Activity
Nov ’25
Infrequent Sound inconsitency with Apple Music
Since MacOS 26 Apple Music has inconsitent drops to the Quality of some Tracks indiscrimantly. I don't know if others Expereinced it. It doesn't happen on the Speakers or connected via Bluetooth, but the AUX I/O has it quite often. It is more noticable on Headphones with 48kHz and higher Frequency Bandwidth. Here is the FB18062589
Replies
0
Boosts
0
Views
269
Activity
Jul ’25
watchOS longFormAudio cannot de active
My workout watch app supports audio playback during exercise sessions. When users carry both Apple Watch, iPhone, and AirPods, with AirPods connected to the iPhone, I want to route audio from Apple Watch to AirPods for playback. I've implemented this functionality using the following code. try? session.setCategory(.playback, mode: .default, policy: .longFormAudio, options: []) try await session.activate() When users are playing music on iPhone and trigger my code in the watch app, Apple Watch correctly guides users to select AirPods, pauses the iPhone's music, and plays my audio. However, when playback finishes and I end the session using the code below: try session.setActive(false, options:[.notifyOthersOnDeactivation]) the iPhone doesn't automatically resume the previously interrupted music playback—it requires manual intervention. Is this expected behavior, or am I missing other important steps in my code?
Replies
1
Boosts
0
Views
340
Activity
Nov ’25
CoreAudio: Specification of Private Aggregate or Tap
If a Tap or AggregateDevice with the Private property set is created, does it automatically disappear when the process ends? If not, how can I remove the Tap or AggregateDevice before the main process terminates?
Replies
0
Boosts
0
Views
288
Activity
Mar ’26
coreaudiod display sleep
hi all, as soon an audio is played in a whatever app, coreaudiod inserts a sleep prevent assertion for both, the system AND the display. can i somehow stop the insertion of the display sleep assertion? pid 223(coreaudiod): [0x00004e9e00058dc2] 00:03:18 PreventUserIdleDisplaySleep named: "com.apple.audio.AppleGFXHDAEngineOutputDP:10001:0:{B31A-08C6-00000000}.context.preventuseridledisplaysleep" Created for PID: 4145. where PID 4145 is spotify. but it doesn't matter which app is playing the audio. any help would be appreciated thanks
Replies
0
Boosts
0
Views
89
Activity
Nov ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
Replies
0
Boosts
0
Views
585
Activity
Jul ’25