I'm developing an audio player app that uses AVAudioFile to read PCM data from various formats. I'm experiencing severe performance issues when seeking in FLAC files, while other compressed formats (M4A/AAC) work correctly.
I don't intend to support MP3 in my app, but I tested MP3 files out of curiosity and they show the same issue.
Environment:
macOS 26 (Tahoe)
Xcode 26.3
Apple Silicon (M1)
The issue:
After setting AVAudioFile.framePosition to a position mid-file, the subsequent call to AVAudioFile.read(into:frameCount:) blocks for an unreasonable amount of time for FLAC and MP3 files. The delay scales linearly with the seek target: seeking near the beginning is fast, and seeking toward the end is proportionally slower. This suggests the decoder is decoding linearly from the beginning of the file rather than using any seek index.
(My app deals with “images” of Audio CDs ripped as a single long audio file.)
The issue is particularly severe when reading files from an SMB network share (server on Ethernet, client on Wi-Fi with the access point ~2 meters away in line of sight).
Quick Benchmark results:
I tested with the same 75-minute audio content (16-bit/44.1 kHz stereo, 200,502,708 frames) encoded in five formats, seeking to the midpoint.
Over SMB (local network, server on Ethernet, client on Wi-Fi):
Format         | Seek + Read Time
---------------|------------------
WAV            | 0.007 s
AIFF           | 0.009 s
Apple Lossless | 0.015 s
MP3            | 9.2 s
FLAC           | 30.2 s
Locally (MacBook Air M1 SSD):
Format         | Seek + Read Time
---------------|------------------
WAV            | 0.0005 s
AIFF           | 0.0004 s
Apple Lossless | 0.0011 s
MP3            | 0.1958 s
FLAC           | 0.7528 s
WAV, AIFF, and M4A all seek virtually instantly (15 ms or less, even over SMB). MP3 and FLAC exhibit linear-time behavior, with FLAC being the worst affected.
Note that M4A (AAC) is also a compressed format that requires decoding after seeking, yet it completes in 15 ms. This rules out any inherent limitation of compressed formats: the MP4 container's packet index (stts/stco) is clearly being used for fast random access. Both MP3 (Xing/LAME TOC) and FLAC (SEEKTABLE metadata block) have their own seek mechanisms that should provide similar performance.
Minimal CLI tool to reproduce:
import AVFoundation  // AVAudioFile and AVAudioPCMBuffer live here
import Foundation

guard CommandLine.arguments.count > 1 else {
    print("Usage: FLACSpeed <audio-file-path>")
    exit(1)
}

let path = CommandLine.arguments[1]
let fileURL = URL(fileURLWithPath: path)

do {
    let file = try AVAudioFile(forReading: fileURL)
    let format = file.processingFormat
    let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: 8192)!
    let totalFrames = file.length
    let seekTarget = totalFrames / 2  // midpoint of the file

    print("File: \(fileURL.lastPathComponent)")
    print("Format: \(format)")
    print("Total frames: \(totalFrames)")
    print("Seeking to frame: \(seekTarget)")

    // Setting framePosition returns immediately; the cost shows up in the next read.
    file.framePosition = seekTarget
    let start = CFAbsoluteTimeGetCurrent()
    try file.read(into: buffer, frameCount: 8192)
    let elapsed = CFAbsoluteTimeGetCurrent() - start
    print("Read after seek took \(elapsed) seconds")
} catch {
    print("Error: \(error.localizedDescription)")
    exit(1)
}
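To check the linear-scaling observation rather than a single midpoint, the same measurement can be repeated at several positions in the file. The following is only a sketch along the lines of the tool above: the function name timeSeeks and the chosen fractions are my own illustrative choices, not part of any API. The file is reopened for every measurement on the assumption that this avoids carrying decoder state between seeks.

```swift
import AVFoundation

/// Hypothetical helper: times a seek followed by one short read at several
/// fractions of the file length. Fractions should stay below 1.0, since a
/// read at end of file throws.
func timeSeeks(in url: URL,
               fractions: [Double] = [0.1, 0.25, 0.5, 0.75, 0.9]) throws
    -> [(fraction: Double, seconds: Double)] {
    var results: [(fraction: Double, seconds: Double)] = []
    for fraction in fractions {
        // Reopen the file each time (assumption: a fresh decoder per seek).
        let file = try AVAudioFile(forReading: url)
        guard let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                                            frameCapacity: 8192) else { continue }
        file.framePosition = AVAudioFramePosition(Double(file.length) * fraction)

        // Only the read is timed, matching the single-seek tool.
        let start = CFAbsoluteTimeGetCurrent()
        try file.read(into: buffer, frameCount: 8192)
        let elapsed = CFAbsoluteTimeGetCurrent() - start
        results.append((fraction: fraction, seconds: elapsed))
    }
    return results
}
```

If the decoder really is decoding from the start of the file, the times for a FLAC or MP3 file should grow roughly in proportion to the fraction, while WAV, AIFF, and M4A should stay flat.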
Expected behavior:
AVAudioFile.read(into:frameCount:) after setting framePosition should use the available seek mechanisms in FLAC and MP3 files for fast random access, as it already does for M4A (AAC). Even accounting for the fact that seek tables provide approximate (not sample-precise) positioning, the "jump to nearest index point + decode forward" approach should complete in milliseconds, not seconds.
Workaround:
For FLAC, I've worked around this by using libFLAC directly, which provides instant seeking via FLAC__stream_decoder_seek_absolute().
libFLAC Performance:
For comparison, libFLAC's FLAC__stream_decoder_seek_absolute() performs the same seek + read on the same FLAC file in around 0.015 s, using the FLAC seek table to jump to the nearest preceding seek point, then decoding forward a small number of frames to the exact target sample.
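The "nearest preceding seek point, then decode forward" step can be illustrated with a small sketch. This is not libFLAC's code and not anything AVAudioFile exposes: SeekPoint and nearestPrecedingSeekPoint are hypothetical names, modeled loosely on the sample number and byte offset stored in a FLAC SEEKTABLE entry, and the 10-second spacing and byte offsets in the usage lines are made-up values.

```swift
/// Hypothetical in-memory seek point, loosely modeled on a FLAC SEEKTABLE entry.
struct SeekPoint {
    let sampleNumber: Int64   // first sample of the frame this point refers to
    let byteOffset: UInt64    // offset of that frame from the first audio frame
}

/// Binary search for the last seek point at or before `target`.
/// Assumes `points` is sorted by ascending sampleNumber.
/// Returns nil if no point precedes the target.
func nearestPrecedingSeekPoint(for target: Int64, in points: [SeekPoint]) -> SeekPoint? {
    var low = 0
    var high = points.count
    while low < high {
        let mid = (low + high) / 2
        if points[mid].sampleNumber <= target {
            low = mid + 1
        } else {
            high = mid
        }
    }
    return low > 0 ? points[low - 1] : nil
}

// Usage with illustrative values: one seek point every 10 s at 44.1 kHz
// (441,000 samples) across the 200,502,708-frame file from the benchmark.
let points = (0..<455).map {
    SeekPoint(sampleNumber: Int64($0) * 441_000,
              byteOffset: UInt64($0) * 1_000_000)  // placeholder offsets
}
let target: Int64 = 200_502_708 / 2
if let point = nearestPrecedingSeekPoint(for: target, in: points) {
    // Only this many samples need decoding after jumping to point.byteOffset.
    print("Decode forward \(target - point.sampleNumber) samples")
}
```

With points spaced like this, the lookup costs a handful of comparisons and the decode-forward distance is bounded by the spacing between points (at most 10 seconds of audio here), regardless of how far into the file the target lies.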