Hi,
I am trying to remove the audio controls for my app on the lock screen. Since I use WKWebView, there are 3 audio tags in my html and I play and pause em via JS. However, if I do not play any sound since app launch, there are no audio controls on the lock screen. But if I play one of those 3 files (they are even less then 3 Sec sound effects e.g. for buttons) the audio controls appears on lock screen.
Note even when the sounds on pause() or not playing they were listed on the lock screen.
What I have tried so far without success
MPNowPlayingInfoCenter.default().nowPlayingInfo = [:]
and
``try audioSession.setCategory(.playback, mode: .default, options: [])
try audioSession.setActive(false, options: .notifyOthersOnDeactivation)``
and
UIApplication.shared.endReceivingRemoteControlEvents()
Another problem is that the app scales with iOS system settings "display zoom". Is there a way to deny it?
It is latest Xcode verion 16.3 and iOS 18.
I have no background mode in my Capabilities.
Nothing worked so far. Has anyone an idea?
Greetings
Audio
RSS for tagDive into the technical aspects of audio on your device, including codecs, format support, and customization options.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Context:
I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup.
I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate.
I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking.
Issue
When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block.
Details:
The audio session is already active and configured with .playAndRecord.
The input tap is already installed when the engine is started.
When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio.
Assumptions / Current Setup
Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio.
However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking.
Questions
Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine?
Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio?
Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer?
Would using separate engines for input and output improve responsiveness?
I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation.
Relevant Code Snippets
Engine Setup
func setup() {
let input = audioEngine.inputNode
do {
try input.setVoiceProcessingEnabled(true)
} catch {
print("Could not enable voice processing \(error)")
return
}
input.isVoiceProcessingAGCEnabled = false
let output = audioEngine.outputNode
let mainMixer = audioEngine.mainMixerNode
audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat)
audioEngine.connect(beepNode, to: mainMixer, format: outputFormat)
audioEngine.connect(mainMixer, to: output, format: outputFormat)
// Initialize converters
converter = AVAudioConverter(from: inputFormat, to: outputFormat)!
f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)!
audioEngine.prepare()
}
Input Tap Installation
func installTap() {
guard AudioHandler.shared.checkMicrophonePermission() else {
print("Microphone not granted for recording")
return
}
guard !isInputTapped else {
print("[AudioEngine] Input is already tapped!")
return
}
let input = audioEngine.inputNode
let microphoneFormat = input.inputFormat(forBus: 0)
let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)!
let desiredFormat = outputFormat
let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate)
input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in
guard let self = self else { return }
// Output buffer: 1920 frames at 16kHz
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return }
outputBuffer.frameLength = outputBuffer.frameCapacity
let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
outStatus.pointee = .haveData
return buffer
}
var error: NSError?
let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock)
if converterResult != .haveData {
DebugLogger.shared.print("Downsample error \(converterResult)")
} else {
self.handleDownsampledBuffer(outputBuffer)
}
}
isInputTapped = true
}
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes.
Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
Hi, I'm working on an AUv3 project.
The app itself displays my icon.
However the Auv3 extension does not display any icon in any host app (AUM, Drambo, etc.0).
I thought that the extension would inherit the host app icon but that it does not appear to be the case.
I tried to add the icon as a 1024x1024 file to the extension target and the update my extension plist file withe a CFBundleIconFile key but no luck either.
It must surely be really easy. What am I missing?
Thanks in advance for your help!
Topic:
Media Technologies
SubTopic:
Audio
So,
I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
I am having issues deploying my iOS app, that uses ShazamKit, to get working on a Mac with Apple silicon.
When uploading the archive to App Store Connect I do get
ITMS-90863: Macs with Apple silicon support issue - The app links with libraries that aren’t present in macOS:
/usr/lib/swift/libswiftShazamKit.dylib
Is ShazamKit not supported for iOS apps that can run on Macs with Apple silicon? Or is there something I should fix in my setup / deployment?
Hello,
I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background.
I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way.
My questions are:
Does iOS expose any API to detect if a call is being recorded?
If not, is there any indirect, Apple's policy compliant method (e.g., microphone usage events) that can be relied upon?
Or is this something that iOS explicitly prevents for privacyreasons?
Expecting solutions that align with Apple’s policies and would be accepted under the App Store Review Guidelines.
Thanks in advance for any guidance.
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages.
When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error:
Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download."
The ".ar" appears to be the language code, which in this case was Arabic.
When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here.
For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around.
Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself.
private func setUpAnalyzer() async throws {
guard let sourceLanguage else {
throw Error.languageNotSpecified
}
guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else {
throw Error.unsupportedLanguage
}
let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription)
self.transcriber = transcriber
let reservedLocales = await AssetInventory.reservedLocales
if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales {
if let oldest = reservedLocales.last {
await AssetInventory.release(reservedLocale: oldest)
}
}
do {
let status = await AssetInventory.status(forModules: [transcriber])
print("status: \(status)")
if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
try await installationRequest.downloadAndInstall()
}
}
...
I am trying to debug the AAX version of my plugin (MIDI effect) on Pro Tools, but I am getting the following error (Mac console) when attempting to load it:
dlsym cannot find symbol g_dwILResult in CFBundle etc..
I used Xcode 16.4 to build the plugin.
Has anybody come across the same or a similar message?
Best,
Achillefs
Axart Labs
Is there a recommended way on macOS 26 Tahoe to take a CoreAudio AudioObjectID and use it to lookup the underlying USB LocationID?
I previously used AudioObjectID to query the corresponding DeviceUID with kAudioDevicePropertyDeviceUID. Then I queried for the IOService matching kIOAudioEngineClassName with property kIOAudioEngineGlobalUniqueIDKey matching DeviceUID, and I loaded kUSBDevicePropertyLocationID from the result.
This fails on macOS 26, because the IO Registry for the device has an entry for usbaudiod rather than AppleUSBAudioEngine, and usbaudiod does not include a kIOAudioEngineGlobalUniqueIDKey property (or any other property to map it to a CoreAudio DeviceUID).
My use-case here is a piece of audio recording software that allows configuring a set of supported audio devices via USB HID prior to recording. I present the user with a list of CoreAudio devices to use, but without a way to lookup the underlying USB LocationID, I cannot guarantee that the configured device matches the selected device (e.g. if the user plugged in two identical microphones).
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVPlayerNode / AVAudioEngine an exception is thrown:
"[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868
(related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022)
If I convert the buffer to AVAudioPCMFormatFloat32 playback works.
My questions are:
Does AVAudioEngine / AVPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application?
If 1 is YES is this documented anywhere?
If 1 is YES is this required format subject to change at any point?
Thanks!
I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
iPhoneやiPadにおいて、画面上のボタンなどをタップした際に、特定の楽器音を発音させる方法をご存知の方いらっしゃいませんか?
現在音楽学習アプリを作成途中で、画面上の鍵盤や指板のボタン状のframeに、単音又は和音を割当て発音させる事を考えております
SwiftUIのcodeのみで実現できないでしょうか
嘗て、MIDIのlevel1の楽器の発音機能があった様に記憶していますが、現在のOS上では同様の機能を実装してないように思えます
皆様のお知恵をお貸しください
I might have misunderstood the docs, but is Call Translation going to be available for VOIP applications? Eg in an already connected VOIP call, would it be possible for Call Translations to be enabled on an iOS 26 and Apple Intelligence supported device?
I have personally tried it and it doesn’t look like it supported VOIP but would love to confirm this.
reference: https://developer.apple.com/documentation/callkit/cxsettranslatingcallaction/
Topic:
Media Technologies
SubTopic:
Audio
Hi,
I have just implemented an Audio Unit v3 host.
AgsAudioUnitPlugin *audio_unit_plugin;
AVAudioUnitComponentManager *audio_unit_component_manager;
NSArray<AVAudioUnitComponent *> *av_component_arr;
AudioComponentDescription description;
guint i, i_stop;
if(!AGS_AUDIO_UNIT_MANAGER(audio_unit_manager)){
return;
}
audio_unit_component_manager = [AVAudioUnitComponentManager sharedAudioUnitComponentManager];
/* effects */
description = (AudioComponentDescription) {0,};
description.componentType = kAudioUnitType_Effect;
av_component_arr = [audio_unit_component_manager componentsMatchingDescription:description];
i_stop = [av_component_arr count];
for(i = 0; i < i_stop; i++){
ags_audio_unit_manager_load_component(audio_unit_manager,
(gpointer) av_component_arr[i]);
}
/* instruments */
description = (AudioComponentDescription) {0,};
description.componentType = kAudioUnitType_MusicDevice;
av_component_arr = [audio_unit_component_manager componentsMatchingDescription:description];
i_stop = [av_component_arr count];
for(i = 0; i < i_stop; i++){
ags_audio_unit_manager_load_component(audio_unit_manager,
(gpointer) av_component_arr[i]);
}
But this doesn't show me Audio Unit v2 plugins, why?
regards, Joël
I've got a problem with my app where I'm testing it on my own phone.
I'm using audio kit to generate tones as part of the app. Everything seems to work fine. Sounds start, Stop, etc. They play when the app is closed and when the phone is locked, so background is working.
However, I'm seeing an issue where, even when STOP is pressed and the application exited, if I get a notification such as a text message, the base tone for the app starts to play.
If I then open the app, check the Start/Stop button - it says start so that. hasnt' been activated. If I click Start, then a 2nd tone starts. This one stops with the Stop button. However the original tone that was set off by an incoming message carries on playing.
Until I go to the Open Apps View on the phone and slide the application upwards.
For the life of me, I can't figure out whats happening here.
Hi,
I am getting into a trap. Please check stack-trace, howto fix this?
regards, Joël
stack-trace with ExtAudioFileWrite
I have a simple AVAudioEngine graph as follows:
AVAudioPlayerNode -> AVAudioUnitEQ -> AVAudioUnitTimePitch -> AVAudioUnitReverb -> Main mixer node of AVAudioEngine.
I noticed that whenever I have AVAudioUnitTimePitch or AVAudioUnitVarispeed in the graph, I noticed a very distinct crackling/popping sound in my Airpods Pro 2 when starting up the engine and playing the AVAudioPlayerNode and unable to find the reason why this is happening. When I remove the node, the crackling completely goes away. How do I fix this problem since i need the user to be able to control the pitch and rate of the audio during playback.
import AVKit
@Observable @MainActor
class AudioEngineManager {
nonisolated private let engine = AVAudioEngine()
private let playerNode = AVAudioPlayerNode()
private let reverb = AVAudioUnitReverb()
private let pitch = AVAudioUnitTimePitch()
private let eq = AVAudioUnitEQ(numberOfBands: 10)
private var audioFile: AVAudioFile?
private var fadePlayPauseTask: Task<Void, Error>?
private var playPauseCurrentFadeTime: Double = 0
init() {
setupAudioEngine()
}
private func setupAudioEngine() {
guard let url = Bundle.main.url(forResource: "Song name goes here", withExtension: "mp3") else {
print("Audio file not found")
return
}
do {
audioFile = try AVAudioFile(forReading: url)
} catch {
print("Failed to load audio file: \(error)")
return
}
reverb.loadFactoryPreset(.mediumHall)
reverb.wetDryMix = 50
pitch.pitch = 0 // Increase pitch by 500 cents (5 semitones)
engine.attach(playerNode)
engine.attach(pitch)
engine.attach(reverb)
engine.attach(eq)
// Connect: player -> pitch -> reverb -> output
engine.connect(playerNode, to: eq, format: audioFile?.processingFormat)
engine.connect(eq, to: pitch, format: audioFile?.processingFormat)
engine.connect(pitch, to: reverb, format: audioFile?.processingFormat)
engine.connect(reverb, to: engine.mainMixerNode, format: audioFile?.processingFormat)
}
func prepare() {
guard let audioFile else { return }
playerNode.scheduleFile(audioFile, at: nil)
}
func play() {
DispatchQueue.global().async { [weak self] in
guard let self else { return }
engine.prepare()
try? engine.start()
DispatchQueue.main.async { [weak self] in
guard let self else { return }
playerNode.play()
fadePlayPauseTask?.cancel()
playPauseCurrentFadeTime = 0
fadePlayPauseTask = Task { [weak self] in
guard let self else { return }
while true {
let volume = updateVolume(for: playPauseCurrentFadeTime / 0.1, rising: true)
// Ramp up volume until 1 is reached
if volume >= 1 { break }
engine.mainMixerNode.outputVolume = volume
try await Task.sleep(for: .milliseconds(10))
playPauseCurrentFadeTime += 0.01
}
engine.mainMixerNode.outputVolume = 1
}
}
}
}
func pause() {
fadePlayPauseTask?.cancel()
playPauseCurrentFadeTime = 0
fadePlayPauseTask = Task { [weak self] in
guard let self else { return }
while true {
let volume = updateVolume(for: playPauseCurrentFadeTime / 0.1, rising: false)
// Ramp down volume until 0 is reached
if volume <= 0 { break }
engine.mainMixerNode.outputVolume = volume
try await Task.sleep(for: .milliseconds(10))
playPauseCurrentFadeTime += 0.01
}
engine.mainMixerNode.outputVolume = 0
playerNode.pause()
// Shut down engine once ramp down completes
DispatchQueue.global().async { [weak self] in
guard let self else { return }
engine.pause()
}
}
}
private func updateVolume(for x: Double, rising: Bool) -> Float {
if rising {
// Fade in
return Float(pow(x, 2) * (3.0 - 2.0 * (x)))
} else {
// Fade out
return Float(1 - (pow(x, 2) * (3.0 - 2.0 * (x))))
}
}
func setPitch(_ value: Float) {
pitch.pitch = value
}
func setReverbMix(_ value: Float) {
reverb.wetDryMix = value
}
}
struct ContentView: View {
@State private var audioManager = AudioEngineManager()
@State private var pitch: Float = 0
@State private var reverb: Float = 0
var body: some View {
VStack(spacing: 20) {
Text("🎵 Audio Player with Reverb & Pitch")
.font(.title2)
HStack {
Button("Prepare") {
audioManager.prepare()
}
Button("Play") {
audioManager.play()
}
.padding()
.background(Color.green)
.foregroundColor(.white)
.cornerRadius(10)
Button("Pause") {
audioManager.pause()
}
.padding()
.background(Color.red)
.foregroundColor(.white)
.cornerRadius(10)
}
VStack {
Text("Pitch: \(Int(pitch)) cents")
Slider(value: $pitch, in: -2400...2400, step: 100) { _ in
audioManager.setPitch(pitch)
}
}
VStack {
Text("Reverb Mix: \(Int(reverb))%")
Slider(value: $reverb, in: 0...100, step: 1) { _ in
audioManager.setReverbMix(reverb)
}
}
}
.padding()
}
}
The AVB AVnu MILAN Convention has a groweing Population. Many big companies (Cisco, Meyer Sound, d&b Audio, l‘acoustics, Presonus, digico etc.) implements the AVB AVnu Milan Standards. Is there a plan on the Apple side to also implement AVnu Milan on top of the AVB Protocol?
The advantage for Apple Sound would be a great Integration in the professionell Audio market and a more stable intergration on top of the AVB protocol. The atdecc work, but Not that stable.
Topic:
Media Technologies
SubTopic:
Audio
I’m building a standalone Apple Watch metronome app for running.
My goal is for these 3 apps to work at the same time:
Runna owns the workout session
Spotify plays music
my app plays a metronome click in the background
So far this is what I've found:
Using HKWorkoutSession in my metronome app works well with Spotify, but conflicts with Runna and other workout apps, so I removed that.
Using watchOS background audio with longFormAudio allows my app run in the background, and it can coexist with Runna. However, it seems to conflict with Spotify playback, and one app tends to stop the other.
Is there any supported watchOS audio/background configuration that allows all 3 at once?
More specifically this is what I need:
another app owns HKWorkoutSession
Spotify keeps playing
my app keeps generating metronome clicks in the background
Or is this simply not supported by current watchOS session/background rules?
My metronome uses AVAudioEngine / AVAudioPlayerNode with generated click audio.
Thank you!
I work on an iOS app that records video and audio. We've been getting reports for a while from users who are experiencing their video recordings being cut off. After investigating, I found that many users are receiving the AVAudioSessionMediaServicesWereResetNotification (.mediaServicesWereResetNotification) notification while recording. It's associated with the AVFoundationErrorDomain[-11819] error, which seems to indicate that the system audio daemon crashed. We have a handler registered to end the recording, show the user a prompt, and restart our AV sessions. However, from our logs this looks to be happening to hundreds of users every day and it's not an ideal user experience, so I would like to figure out why this is happening and if it's due to something that we're doing wrong.
The debug menu option to trigger the audio session reset is not of much use, because it can't be triggered unless you leave the app and go to system settings. So our app can't be recording video when the debug reset is triggered. So far I haven't found a way to reproduced the issue locally, but I can see that it's happening to users from logs.
I've found some posts online from developers experiencing similar issues, but none of them seem to directly address our issue. The system error doesn't include a userInfo dictionary, and as far as I can tell it's a system daemon crash so any logs would need to be captured from the OS.
Is there any way that I could get more information about what may be causing this error that I may have missed?
Topic:
Media Technologies
SubTopic:
Audio