Since iOS 18, the system setting “Allow Audio Playback” (enabled by default) allows third-party app audio to continue playing while the user is recording video with the Camera app. This has created a problem for the app I’m developing.
➡️ The problem:
My app plays continuous audio in both foreground and background states. If the user starts recording video using the iOS Camera app, the app’s audio — still playing in the background — gets captured in the video — obviously an unintended behavior.
Yes, the user could stop the app manually before starting the video recording, but that can’t be guaranteed. As a developer, I need a way to stop the app’s audio before the video recording begins.
So far, I haven’t found a reliable way to detect when video recording starts if ‘Allow Audio Playback’ is ON.
➡️ What I’ve tried:
— AVAudioSession.interruptionNotification → doesn't fire (my observer setup is sketched below)
— devicesChangedEventStream → not triggered
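For context, this is roughly how the interruption observer is wired up (a minimal sketch; the session is assumed to already be configured for .playback):

import AVFoundation

// Minimal sketch of the observer I'm using; it never fires while the Camera app
// records with "Allow Audio Playback" turned on.
final class PlaybackInterruptionObserver {
    private var token: NSObjectProtocol?

    func start(pause: @escaping () -> Void) {
        token = NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: AVAudioSession.sharedInstance(),
            queue: .main
        ) { notification in
            guard
                let rawType = notification.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
                let type = AVAudioSession.InterruptionType(rawValue: rawType),
                type == .began
            else { return }
            pause() // stop the app's audio here
        }
    }
}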
I don’t want to request mic permission (app doesn’t use mic). also, disabling the app from playing audio in the background isn’t an option as it is a crucial part of the user experience
➡️ What I need:
A reliable, supported way to detect when the Camera app begins video recording, without requiring mic access — so I can stop audio and avoid unintentional overlap with the user’s recordings.
Any official guidance, workarounds, or AVFoundation techniques would be greatly appreciated.
Thanks.
I'm able to get text-to-speech into an audio file using the following code on iOS 12 (iPhone 8) to create a CAF file:
audioFile = try AVAudioFile(
    forWriting: saveToURL,
    settings: pcmBuffer.format.settings,
    commonFormat: .pcmFormatInt16,
    interleaved: false)
where pcmBuffer.format.settings is:
[AVAudioFileTypeKey: kAudioFileMP3Type,
AVSampleRateKey: 48000,
AVEncoderBitRateKey: 128000,
AVNumberOfChannelsKey: 2,
AVFormatIDKey: kAudioFormatLinearPCM]
However, this code does not work when I run the app on iOS 18 on an iPhone 13 Pro Max. The audio file is created, but it doesn't sound right: it has a lot of static, and the speech sounds very low-pitched.
Can anyone give me a hint or an answer?
Hi team,
With regard to Call (Live) Translation on VoIP:
Is it possible to invoke live translations within the app? (without going into the Call System UI)
Is it possible to navigate users from app to Call System UI via an API? (and also invoking the new live translations directly)
Will Apple support more languages apart from the current ones? (Currently I see 4 supported languages)
I have used AVQueuePlayer in my music app to play a sequence of audio files from a remote server. This is how I have defined the player in my ViewModel:
Variables
private var cancellables = Set<AnyCancellable>()
private let audioSession = AVAudioSession.sharedInstance()
private var avQueuePlayer: AVQueuePlayer?
@Published var playbackSpeed: Float = 1.0
Before starting playback, I make sure the audio session is set up properly; the code snippet used for that is:
do {
    try audioSession.setCategory(.playback, mode: .default, options: [])
    try audioSession.setActive(true, options: [])
} catch {
    return
}
and this is the function I am using to update playback speed
func updatePlaybackSpeed(_ newSpeed: Float) {
    if newSpeed > 0.0, newSpeed <= 2.0 {
        playbackSpeed = newSpeed
        avQueuePlayer?.rate = newSpeed
        print("requested speed is \(newSpeed) and actual speed is \(String(describing: avQueuePlayer?.rate))")
    }
}
Sometimes the player plays at whatever speed was set;
e.g. once I got "requested speed is 1.5 and actual speed is 1.5", and the player did play at a speed of 1.5,
but another time I got "requested speed is 2.0 and actual speed is 2.0", yet the player still seemed to play at a speed of 1.0.
to observe changes in rate, I used this
private func observeRateChanges() {
    guard let avQueuePlayer = self.avQueuePlayer else { return }
    NotificationCenter.default.publisher(for: AVQueuePlayer.rateDidChangeNotification, object: avQueuePlayer)
        .compactMap { $0.userInfo?[AVPlayer.rateDidChangeReasonKey] as? AVPlayer.RateDidChangeReason }
        .sink { reason in
            switch reason {
            case .appBackgrounded:
                print("The app transitioned to the background.")
            case .audioSessionInterrupted:
                print("The system interrupts the app's audio session.")
            case .setRateCalled:
                print("The app set the player's rate.")
            case .setRateFailed:
                print("An attempt to change the player's rate failed.")
            default:
                break
            }
        }
        .store(in: &cancellables)
}
When the rate was set properly, the function above printed "The app set the player's rate.", but when it wasn't, it printed "An attempt to change the player's rate failed."
Now I can't understand why the rate is not being set, and if updatePlaybackSpeed printed "requested speed is 2.0 and actual speed is 2.0", why does the player still seem to play at a speed of 1.0?
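One thing I still plan to try (a sketch only, assuming iOS 16 or later; I haven't confirmed it fixes this) is setting defaultRate alongside rate in the same ViewModel, so the requested speed is reapplied whenever playback restarts:

func updatePlaybackSpeed(_ newSpeed: Float) {
    guard newSpeed > 0.0, newSpeed <= 2.0 else { return }
    playbackSpeed = newSpeed
    if #available(iOS 16.0, *) {
        // defaultRate is the rate the player returns to whenever play() is called again.
        avQueuePlayer?.defaultRate = newSpeed
    }
    avQueuePlayer?.rate = newSpeed
}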
Using an iPhone 12 Pro running iOS 26.0.1 with AirPods Pro 3. The Camera app does capture video with what seems to be "Studio Quality Recording" (SQR).
I'm trying to replicate that SQR in my own camera-like app. While I can pull audio in from the APP3 mic, and my video-capture app records 48,000 Hz high-bitrate video, the audio still sounds non-SQR.
I'm seeing bluetoothA2DP, bluetoothLE, and bluetoothHFP as portType values, and I'm not sure whether SQR depends on one of those (how I'm checking the route is sketched below).
Is there sample code demonstrating an SQR capture? Never mind video and the camera, even just audio?
Also, I don't understand what SQR is doing between the APP3 and the iPhone. What codec is that? What bitrate is that? If I capture video using Capture and inspect the audio stream I see mono 74.14 kbit/s MPEG-4 AAC, 48000 Hz. But I assume that's been recompressed and not really giving me any insight into the APP3 H2 transmission?
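For reference, this is roughly how I'm checking the current route's port types (a minimal sketch that just logs the input and output ports of the shared audio session):

import AVFoundation

// Log the port types of the current audio route (e.g. bluetoothHFP vs. bluetoothA2DP vs. bluetoothLE).
func logCurrentRoute() {
    let route = AVAudioSession.sharedInstance().currentRoute
    for input in route.inputs {
        print("input:", input.portName, input.portType.rawValue)
    }
    for output in route.outputs {
        print("output:", output.portName, output.portType.rawValue)
    }
}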
Hello everyone,
I am working on an app that lets you review your own music using Apple Music. Currently I am running into an issue with skipping forward and backward outside of the app.
How it should work: When skipping forward or backwards on the lock or home screen of an iPhone, the next or previous song on an album should play and the information should change to reflect that in the app.
If you play a song in Apple Music, you can see a Now Playing view in the lock screen.
When you skip forward or backward, it performs the action, and that is reflected by the little frequency icon on the song's artwork image.
What it's doing: When skipping forward or backwards on the lock or home screen of an iPhone, the next or previous song is reflected outside of the app, but not in the app.
When skipping a song outside of the app, it works correctly to head to the next song.
But when I return to the app, the change is not reflected there.
NOTE: I am not using MusicKit types such as Track or Album to display the songs. Since I want to grab the songs and review them, I need a rating, so I created my own model that stores the MusicItemID, name, artist(s), etc.
NOTE: I am using ApplicationMusicPlayer.shared
Is there a way to get the song to reflect in my app?
(If it's easier, a simple example would be nice. No need to create an entire Xcode project.)
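For context, this is a minimal sketch of the kind of observation I'm experimenting with. It assumes the shared player's queue can be observed from SwiftUI (MusicPlayer.Queue as an ObservableObject); my own song model is omitted.

import MusicKit
import SwiftUI

// Sketch: observe the shared player's queue so the view updates when the
// current entry changes (e.g. after skipping from the Lock Screen).
struct NowPlayingView: View {
    @ObservedObject private var queue = ApplicationMusicPlayer.shared.queue

    var body: some View {
        Text(queue.currentEntry?.title ?? "Nothing playing")
    }
}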
Environment: Device: iPad (10th generation); OS: iOS 18.3.2
I'm using AVAudioSession to record sound in my application. I recently realized that when the app starts a recording session on the tablet, the OS automatically sets the tablet's volume to 50%, and after the recording ends it doesn't restore the previous volume level. Is this default OS behavior, or a bug?
If it's default behavior, I would much appreciate a link to the documentation.
I'm getting MatchError "MATCH_ATTEMPT_FAILED" every time I call matchStream in an Android Studio Java + Kotlin project. My project reads the samples from the mic input using the AudioRecord class and sends them to ShazamKit for matchStream. I created a Kotlin class to handle ShazamKit. The AudioRecord is built to be mono and 16-bit.
My Kotlin Class
class ShazamKitHelper {
    val shazamScope = CoroutineScope(Dispatchers.IO + SupervisorJob())
    lateinit var streaming_session: StreamingSession
    lateinit var signature: Signature
    lateinit var catalog: ShazamCatalog

    fun createStreamingSessionAsync(
        developerTokenProvider: DeveloperTokenProvider,
        readBufferSize: Int,
        sampleRate: AudioSampleRateInHz
    ): CompletableFuture<Unit> {
        return CompletableFuture.supplyAsync {
            runBlocking {
                runCatching {
                    shazamScope.launch {
                        createStreamingSession(developerTokenProvider, readBufferSize, sampleRate)
                    }.join()
                }.onFailure { throwable ->
                }.getOrThrow()
            }
        }
    }

    private suspend fun createStreamingSession(
        developerTokenProvider: DeveloperTokenProvider,
        readBufferSize: Int,
        sampleRateInHz: AudioSampleRateInHz
    ) {
        catalog = ShazamKit.createShazamCatalog(developerTokenProvider)
        streaming_session = (ShazamKit.createStreamingSession(
            catalog,
            sampleRateInHz,
            readBufferSize
        ) as ShazamKitResult.Success).data
    }

    fun startMatching() {
        val audioData = sharedAudioData ?: return // Return if sharedAudioData is null
        CoroutineScope(Dispatchers.IO).launch {
            runCatching {
                streaming_session.matchStream(audioData.data, audioData.meaningfulLengthInBytes, audioData.timestampInMs)
            }.onFailure { throwable ->
                Log.e("ShazamKitHelper", "Error during matchStream", throwable)
            }
        }
    }

    @JvmField
    var sharedAudioData: AudioData? = null

    data class AudioData(val data: ByteArray, val meaningfulLengthInBytes: Int, val timestampInMs: Long)

    fun startListeningForMatches() {
        CoroutineScope(Dispatchers.IO).launch {
            streaming_session.recognitionResults().collect { matchResult ->
                when (matchResult) {
                    is MatchResult.Match -> {
                        val match = matchResult.matchedMediaItems
                        println("Match found: ${match.get(0).title} by ${match.get(0).artist}")
                    }
                    is MatchResult.NoMatch -> {
                        println("No match found")
                    }
                    is MatchResult.Error -> {
                        val error = matchResult.exception
                        println("Match error: ${error.message}")
                    }
                }
            }
        }
    }
}
My Java code reads the samples from a thread:
shazam_create_session();
while (audioRecord.getRecordingState() == AudioRecord.RECORDSTATE_RECORDING) {
    if (shazam_session_created) {
        byte[] buffer = new byte[288000]; // max_shazam_seconds * sampleRate * 2
        audioRecord.read(buffer, 0, buffer.length, AudioRecord.READ_BLOCKING);
        helper.sharedAudioData = new ShazamKitHelper.AudioData(buffer, buffer.length, System.currentTimeMillis());
        helper.startMatching();
        if (!listener_called) {
            listener_called = true;
            helper.startListeningForMatches();
        }
    } else {
        SystemClock.sleep(100);
    }
}

private void shazam_create_session() {
    MyDeveloperTokenProvider provider = new MyDeveloperTokenProvider();
    AudioSampleRateInHz sample_rate = AudioSampleRateInHz.SAMPLE_RATE_48000;
    if (sampleRate == 44100)
        sample_rate = AudioSampleRateInHz.SAMPLE_RATE_44100;
    CompletableFuture<Unit> future = helper.createStreamingSessionAsync(provider, 288000, sample_rate);
    future.thenAccept(result -> {
        shazam_session_created = true;
    });
    future.exceptionally(throwable -> {
        Toast.makeText(mine, "Failure", Toast.LENGTH_SHORT).show();
        return null;
    });
}
I implemented the developer token provider in Java as follows:
public static class MyDeveloperTokenProvider implements DeveloperTokenProvider {
    DeveloperToken the_token = null;

    @NonNull
    @Override
    public DeveloperToken provideDeveloperToken() {
        if (the_token == null) {
            try {
                the_token = generateDeveloperToken();
                return the_token;
            } catch (NoSuchAlgorithmException | InvalidKeySpecException e) {
                throw new RuntimeException(e);
            }
        } else {
            return the_token;
        }
    }

    @NonNull
    private DeveloperToken generateDeveloperToken() throws NoSuchAlgorithmException, InvalidKeySpecException {
        PKCS8EncodedKeySpec priPKCS8 = new PKCS8EncodedKeySpec(Decoders.BASE64.decode(p8));
        PrivateKey appleKey = KeyFactory.getInstance("EC").generatePrivate(priPKCS8);
        Instant now = Instant.now();
        Instant expiration = now.plus(Duration.ofDays(90));
        String jwt = Jwts.builder()
                .header().add("alg", "ES256").add("kid", keyId).and()
                .issuer(teamId)
                .issuedAt(Date.from(now))
                .expiration(Date.from(expiration))
                .signWith(appleKey) // Specify algorithm explicitly
                .compact();
        return new DeveloperToken(jwt);
    }
}
Does anyone know how to play the sound of a specific instrument when the user taps a button on the screen of an iPhone or iPad? I'm in the middle of creating a music-learning app, and I'm thinking of assigning single notes or chords to the button-like frames on the on-screen keyboard and fingerboard. Can this be achieved with SwiftUI alone? I remember that General MIDI Level 1 had a way to trigger instrument sounds, but I'm not sure how to implement the same functionality on the current OS. Please lend me your wisdom.
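For what it's worth, this is the kind of approach I've been looking at: a minimal sketch (not confirmed as the right way) that drives an AVAudioUnitSampler from a button tap. The SoundFont URL and note numbers here are placeholders.

import AVFoundation

// Sketch: a sampler attached to an audio engine, playing a MIDI note when a button is tapped.
final class NotePlayer {
    private let engine = AVAudioEngine()
    private let sampler = AVAudioUnitSampler()

    init() throws {
        engine.attach(sampler)
        engine.connect(sampler, to: engine.mainMixerNode, format: nil)
        try engine.start()
        // Optional: load an instrument from a SoundFont (placeholder URL and program number).
        // try sampler.loadSoundBankInstrument(at: soundFontURL, program: 0,
        //                                     bankMSB: UInt8(kAUSampler_DefaultMelodicBankMSB),
        //                                     bankLSB: UInt8(kAUSampler_DefaultBankLSB))
    }

    func play(note: UInt8) {
        sampler.startNote(note, withVelocity: 100, onChannel: 0)
    }

    func stop(note: UInt8) {
        sampler.stopNote(note, onChannel: 0)
    }
}

A SwiftUI Button would then call play(note: 60) on press and stop(note: 60) on release; a chord would just be several startNote calls.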
Hello,
I've discovered a buffer initialization bug in AVAudioUnitSampler that happens when loading presets with multiple zones referencing different regions in the same audio file (monolith/concatenated samples approach).
Almost all zones output silence (i.e. zeros) at the beginning of playback instead of starting with actual audio data.
The Problem
Setup:
Single audio file (monolith) containing multiple concatenated samples
Multiple zones in an .aupreset, each with different sample start and sample end values pointing to different regions of the same file
All zones load successfully without errors
Expected Behavior:
All zones should play their respective audio regions immediately from the first sample.
Actual Behavior:
Last zone in the zone list: Works perfectly - plays audio immediately
All other zones: Output [0, 0, 0, 0, ..., _audio_data] instead of [real_audio_data]
The number of zeros varies from event to event for each zone. It can be a couple of samples (<30) up to several buffers.
After the initial zeros, the correct audio plays normally, so there is no shift in audio playback, just missing samples at the beginning.
Minimal Reproduction
1. Create Test Monolith Audio File
Create a single WAV file with 3 concatenated 1-second samples (44.1 kHz):
Sample 1: frames 0-44099 (constant amplitude 0.3)
Sample 2: frames 44100-88199 (constant amplitude 0.6)
Sample 3: frames 88200-132299 (constant amplitude 0.9)
2. Create Test Preset
Create an .aupreset with 3 zones all referencing the same file:
Pseudo code:
<Zone array>
    <zone 1> start sample: 0, end sample: 44099, note: 60, waveform: ref_to_monolith.wav;
    <zone 2> start sample: 44100, end sample: 88199, note: 62, waveform: ref_to_monolith.wav;
    <zone 3> start sample: 88200, end sample: 132299, note: 64, waveform: ref_to_monolith.wav;
</Zone array>
3. Load and Test
// Load the .aupreset into an AVAudioUnitSampler (engine attachment omitted for brevity;
// loadInstrument(at:) is the API that loads an .aupreset).
let sampler = AVAudioUnitSampler()
try sampler.loadInstrument(at: presetURL)

// Play each zone (MIDI notes C4 = 60, D4 = 62, E4 = 64).
sampler.startNote(60, withVelocity: 64, onChannel: 0) // Zone 1
sampler.startNote(62, withVelocity: 64, onChannel: 0) // Zone 2
sampler.startNote(64, withVelocity: 64, onChannel: 0) // Zone 3
4. Observed Result
Zone 1 (C4): [0, 0, 0, ..., 0.3, 0.3, 0.3] ❌ Zeros at beginning
Zone 2 (D4): [0, 0, 0, ..., 0.6, 0.6, 0.6] ❌ Zeros at beginning
Zone 3 (E4): [0.9, 0.9, 0.9, ...] ✅ Works correctly (last zone)
What I've Extensively Tested
What DOES Work
Separate files per zone:
Each zone references its own individual audio file
All zones play correctly without zeros
Problem: Not viable for iOS apps with 500+ sample libraries due to file handle limitations
What DOESN'T Work (All Tested)
1. Different Audio Formats:
CAF (Float32 PCM, Int16 PCM, both interleaved and non-interleaved)
M4A (AAC compressed)
WAV (uncompressed)
SF2 (SoundFont2)
Bug persists across all formats
2. CAF Region Chunks:
Created CAF files with embedded region chunks defining zone boundaries
Set zones with no sampleStart/sampleEnd in preset (nil values)
AVAudioUnitSampler completely ignores CAF region metadata
Bug persists
3. Unique Waveform IDs:
Gave each zone a unique waveform ID (268435456, 268435457, 268435458)
Each ID has its own file reference entry (all pointing to same physical file)
Hypothesized this might trigger separate buffer initialization
Bug persists - no improvement
4. Different Sample Rates:
Tested: 44.1kHz, 48kHz, 96kHz
Bug occurs at all sample rates
5. Mono vs Stereo:
Bug occurs with both mono and stereo files
Environment
macOS: Sonoma 14.x (tested across multiple minor versions)
iOS: Tested on iOS 17.x with same results
Xcode: 16.x
Frameworks: AVFoundation, AudioToolbox
Reproducibility: 100% reproducible with setup described above
Impact & Use Case
This bug severely impacts professional music applications that need:
Small file sizes: Monolith files allow sharing compressed audio data (AAC/M4A)
iOS file handle limits: Opening 400+ individual sample files is not viable on iOS
Performance: Single file loading is much faster than hundreds of individual files
Standard industry practice: Monolith/concatenated samples are used by EXS24, Kontakt, and most professional samplers
Current Impact:
Cannot use monolith files with AVAudioUnitSampler on iOS
Forced to choose between: unusable audio (zeros at start) OR hitting iOS file limits
No viable workaround exists
Root Cause Hypothesis
The bug appears to be in AVAudioUnitSampler's internal buffer initialization when:
Multiple zones share the same source audio file
Each zone specifies different sampleStart/sampleEnd offsets
Key observation: The last zone in the zone array always works correctly.
This is NOT related to:
File permissions or security-scoped resources (separate files work fine)
Audio codec issues (happens with uncompressed PCM too)
Preset parsing (preset loads correctly, all zones are valid)
Questions
Is this a known issue? I couldn't find any documentation, bug reports, or discussions about this.
Is there ANY workaround that allows monolith files to work with AVAudioUnitSampler?
Alternative APIs? Is there a different API or approach for iOS that properly supports monolith sample files?
Hello,
Has anyone else experienced variations in the accuracy of the playbackTime value? After a few seconds of playback, the reported time adjusts by a fraction of a second, making it difficult to calculate the actual playbackTime of the audio.
This can be recreated by playing a song in MusicKit, recording the start time of the audio, playing for at least 10-20 seconds, and then comparing the playbackTime value to one calculated using the start time of the audio. In my experience this jump occurs after about 10 seconds of playback.
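A rough sketch of the comparison I'm describing (assuming ApplicationMusicPlayer.shared, with the start timestamp captured right after calling play()):

import Foundation
import MusicKit

// Sketch: compare the player's reported playbackTime with elapsed wall-clock time.
let player = ApplicationMusicPlayer.shared
let startDate = Date() // captured when playback begins

func logDrift() {
    let elapsed = Date().timeIntervalSince(startDate)
    let reported = player.playbackTime
    print("elapsed: \(elapsed)s, playbackTime: \(reported)s, drift: \(reported - elapsed)s")
}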
Any help would be appreciated.
Thanks!
Hi everyone,
I wanted to bring up a question about Core Audio and its potential for future updates or improvements, specifically regarding latency optimization. As someone who relies on Core Audio for real-time audio processing, any enhancements in this area would be incredibly beneficial for professionals in the industry.
Does anyone know if Apple has shared any plans or updates regarding Core Audio’s performance, particularly for low-latency applications? I’d appreciate any insights or advice from the community!
Thanks so much!
Best,
Michael
ApplicationMusicPlayer is not available on watchOS, though it is on all other platforms. Is there a technical reason for that, such as battery life? The same goes for SystemMusicPlayer and MPMusicPlayerController. I have already filed feedback for this.
Hi everyone, I downloaded the source code EditingSpatialAudioWithAnAudioMix.zip from https://developer.apple.com/documentation/Cinematic/editing-spatial-audio-with-an-audio-mix. When I ran one of the actions, named "process", from the command line, the program crashed.
From the source code, I found that the value of componentType is set to kAudioUnitType_FormatConverter:
// The actual `AudioUnit`.
public var auAudioMix = AVAudioUnitEffect()

init() {
    // Generate a component description for the audio unit.
    let componentDescription = AudioComponentDescription(
        componentType: kAudioUnitType_FormatConverter,
        componentSubType: kAudioUnitSubType_AUAudioMix,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0,
        componentFlagsMask: 0)
    auAudioMix = AVAudioUnitEffect(audioComponentDescription: componentDescription)
}
But according to the documentation at https://developer.apple.com/documentation/avfaudio/avaudiouniteffect/init(audiocomponentdescription:), it seems that componentType cannot be set to kAudioUnitType_FormatConverter for an AVAudioUnitEffect.
Has anyone encountered this problem?
I found that an aggregate device correctly reports its input channels in the standard microphone mode. However, in Voice Isolation mode, it only reports the channels of the first sub-device in the aggregate device's list. How should I properly obtain the channel information in Voice Isolation mode? (The way I'm counting channels per device is sketched below.)
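For reference, this is roughly how I'm counting input channels per device (a minimal Core Audio sketch; deviceID would be the aggregate device or one of its sub-devices):

import CoreAudio

// Count the input channels a device exposes by reading its input stream configuration.
func inputChannelCount(of deviceID: AudioObjectID) -> Int {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioDevicePropertyStreamConfiguration,
        mScope: kAudioObjectPropertyScopeInput,
        mElement: kAudioObjectPropertyElementMain)

    var dataSize: UInt32 = 0
    guard AudioObjectGetPropertyDataSize(deviceID, &address, 0, nil, &dataSize) == noErr else { return 0 }

    let rawBufferList = UnsafeMutableRawPointer.allocate(byteCount: Int(dataSize),
                                                         alignment: MemoryLayout<AudioBufferList>.alignment)
    defer { rawBufferList.deallocate() }

    guard AudioObjectGetPropertyData(deviceID, &address, 0, nil, &dataSize, rawBufferList) == noErr else { return 0 }

    let bufferList = UnsafeMutableAudioBufferListPointer(rawBufferList.assumingMemoryBound(to: AudioBufferList.self))
    return bufferList.reduce(0) { $0 + Int($1.mNumberChannels) }
}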
Hi,
I've had a new deck installed in my car for about 1.5 weeks.
I'm having compatibility issues with my iPhone 15 Pro Max.
It happens both wired and wirelessly: I get the error "Accessory not supported by this device". It used to happen all the time; now it's 50/50 and sometimes it works.
I've removed and re-added the Bluetooth pairing multiple times on both the phone and the deck. I bought a Belkin USB-C to USB-A cable today and it seemed to fix it, but the problem comes back.
I've changed the setting "Face ID & Passcode > Allow Access When Locked > Accessories".
The car stereo installer reckons it's definitely an issue with the phone, not the deck; I'm inclined to believe him since the error says "by this device".
Any advice appreciated.
Among the millions of users of our online product, our data metrics show that the rate of silent captured audio data on iPadOS 18.4.1 and 18.5 has increased abnormally. However, we are unable to reproduce the issue. Has anyone encountered a similar issue? The parameters we use are as follows:
AudioSession:
category:AVAudioSessionCategoryPlayAndRecord
mode:AVAudioSessionModeDefault
option:77
preferredSampleRate:48000.000000
preferredIOBufferDuration:0.010000
AudioUnit
format.mFormatID = kAudioFormatLinearPCM;
format.mSampleRate = 48000.0;
format.mChannelsPerFrame = 2;
format.mBitsPerChannel = 16;
format.mFramesPerPacket = 1;
format.mBytesPerFrame = format.mChannelsPerFrame * 16 / 8;
format.mBytesPerPacket = format.mBytesPerFrame * format.mFramesPerPacket;
format.mFormatFlags = kAudioFormatFlagsNativeEndian | kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
component.componentType = kAudioUnitType_Output;
component.componentSubType = kAudioUnitSubType_RemoteIO;
component.componentManufacturer = kAudioUnitManufacturer_Apple;
component.componentFlags = 0;
component.componentFlagsMask = 0;
I'm developing an application that uses a UVC camera (a webcam) with the AVFoundation framework. When I call [self.mCaptureSession startRunning], I cannot get any sample buffers, even though I have already set the delegate. Any answer will help.
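For comparison, the equivalent setup in Swift looks roughly like this (a sketch only; device discovery for the UVC camera is simplified to the default video device):

import AVFoundation

// Sketch: a capture session delivering video sample buffers to a delegate.
final class CaptureController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let session = AVCaptureSession()
    private let output = AVCaptureVideoDataOutput()
    private let queue = DispatchQueue(label: "camera.frames")

    func start() throws {
        guard let device = AVCaptureDevice.default(for: .video) else { return }
        let input = try AVCaptureDeviceInput(device: device)

        session.beginConfiguration()
        if session.canAddInput(input) { session.addInput(input) }
        output.setSampleBufferDelegate(self, queue: queue) // without a delegate queue, no buffers arrive
        if session.canAddOutput(output) { session.addOutput(output) }
        session.commitConfiguration()

        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Each frame arrives here.
    }
}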
Hi, I'm trying to plan out development of an app and am wondering whether it is possible to have user-generated content automatically populate a custom ShazamKit catalog, and to query that catalog non-locally.
Storing all the submissions locally would obviously not scale.
We have the necessary background-recording entitlements, and for many users we do not run into any issues.
However, there is a subset of users whose recordings routinely end early. We have narrowed this down and believe it to be the work of the watchdog.
First, we removed the entire view hierarchy when the app is backgrounded; all that remains is Text("Recording").
This got CPU usage in the profiler down to 0%, and we saw massive improvements in the recording success rate.
We walked away assuming that was enough. However, we are still seeing the same sort of crashes, all in the background. We're using Observation to drive audio state changes to a Live Activity.
Are those Observations causing the problem? Why doesn't Apple provide a better API for background audio? The internet is full of weird issues:
https://stackoverflow.com/questions/76010213/why-is-my-react-native-app-sometimes-terminated-in-the-background-while-tracking
https://stackoverflow.com/questions/71656047/why-is-my-react-native-app-terminating-in-the-background-while-recording-ios-r
https://github.com/expo/expo/issues/16807
This is such a terrible user experience. And we have very little visibility into what is happening and why.
Nowhere does Apple's documentation state that, for background recording to work, the app can only be Text("Recording").
It does not outline a CPU or memory threshold. It just kills us.