I am implementing pan and zoom features for an iPadOS app that uses a custom USB camera device. I am using an update function (shown below) to apply scale and translation transforms, but they are not working. By re-enabling the default animation I can see that the scale transform seems to take effect initially, but then the image animates back to its original scale. This all happens in a fraction of a second, but I can see it. The translation transform seems to have no effect at all. Printing the value of AVCaptureVideoPreviewLayer.transform before and after does show that my values have been applied.
private func updateTransform() {
#if false
    // Disable default animation.
    CATransaction.begin()
    CATransaction.setDisableActions(true)
    defer { CATransaction.commit() }
#endif
    // Apply the transform.
    logger.debug("\(String(describing: self.videoPreviewLayer.transform))")
    let transform = CATransform3DIdentity
    let translate = CATransform3DTranslate(transform, translationX, translationY, 0)
    let scale = CATransform3DScale(transform, scale, scale, 1)
    videoPreviewLayer.transform = CATransform3DConcat(translate, scale)
    logger.debug("\(String(describing: self.videoPreviewLayer.transform))")
}
My question is this: how can I properly implement pan/zoom for an AVCaptureVideoPreviewLayer? Or better yet, if you see a problem with my current approach or understand why the transforms I am applying do not work, please share that information.
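For context, the scale and translation values are meant to be driven by pinch and pan gestures. Here is a simplified sketch of the kind of handlers I have in mind (illustrative only — no clamping, and the recognizers are assumed to be attached to the view hosting videoPreviewLayer):

@objc private func handlePinch(_ recognizer: UIPinchGestureRecognizer) {
    // Accumulate the pinch into the stored scale, then reset the recognizer.
    scale *= recognizer.scale
    recognizer.scale = 1.0
    updateTransform()
}

@objc private func handlePan(_ recognizer: UIPanGestureRecognizer) {
    // Accumulate the pan into the stored translation, then reset the recognizer.
    let delta = recognizer.translation(in: recognizer.view)
    translationX += delta.x
    translationY += delta.y
    recognizer.setTranslation(.zero, in: recognizer.view)
    updateTransform()
}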
AVFoundation: Strange error while trying to switch camera formats with the touch of a single button.
I'm getting the following output from my iOS app's debug console; note the error on the last line:
Capture format keys: ["600x600@25", "1200x1200@5", "1200x1200@30", "1600x1200@2", "1600x1200@30", "3200x2400@15", "3200x2400@2", "600x600@30"]
Start capture session for 1600x1200@30: <AVCaptureSession: 0x303c70190 [AVCaptureSessionPresetPhoto]>
Stop capture session: <AVCaptureSession: 0x303c70190 [AVCaptureSessionPresetInputPriority]>
<AVCaptureDeviceInput: 0x303ebb720 [Medwand S3 Camera]>[vide] -> <AVCaptureVideoDataOutput: 0x303edf1e0>
<AVCaptureDeviceInput: 0x303ebb720 [Medwand S3 Camera]>[vide] -> <AVCapturePhotoOutput: 0x303ee3e20>
<AVCaptureDeviceInput: 0x303ebb720 [Medwand S3 Camera]>[vide] -> <AVCaptureVideoPreviewLayer: 0x3030b33c0>
Start capture session for 600x600@30: <AVCaptureSession: 0x303c70190 [AVCaptureSessionPresetInputPriority]>
<AVCaptureDeviceInput: 0x303ebb720 [Medwand S3 Camera]>[vide] -> <AVCaptureVideoDataOutput: 0x303edf1e0>
<AVCaptureDeviceInput: 0x303ebb720 [Medwand S3 Camera]>[vide] -> <AVCapturePhotoOutput: 0x303ee3e20>
<AVCaptureDeviceInput: 0x303ebb720 [Medwand S3 Camera]>[vide] -> <AVCaptureVideoPreviewLayer: 0x3030b33c0>
<<<< FigSharedMemPool >>>> Fig assert: "blkHdr->useCount > 0" at (FigSharedMemPool.c:591) - (err=0)
This is in response to trying to switch capture formats between the two key modes that must regularly be used by my application. Below you will find the functions that I use to start and stop capturing frames to my preview layer. I have a UI with three buttons.
Off
Mode 1
Mode 2
If I tap the Off button in between tapping Mode 1 or Mode 2, all is well; I can do this all day. However, if I attempt to jump between Mode 1 and Mode 2 directly, I run into issues. I added a layer of software between the UI and the underlying functions so that I could make sure to turn off the camera before turning it back on in the opposite mode, and I was surprised to get this output. Can someone at Apple please tell me what is going on here?
For the rest of you, if anyone knows the magic incantation to safely switch camera formats, please paste that code here. Thanks. I've included my code below.
func start(for deviceFormat: String) {
    sessionQueue.async { [unowned self] in
        logger.debug("Start capture session for \(deviceFormat): \(self.captureSession)")
        do {
            guard let format = formatDict[deviceFormat] else { throw Error.captureFormatNotFound }
            captureSession.stopRunning()
            captureSession.beginConfiguration() // May not be necessary.
            try captureDevice.lockForConfiguration() // Without this we get an error.
            captureDevice.activeFormat = format
            captureDevice.unlockForConfiguration() // Matching function: Necessary.
            captureSession.commitConfiguration() // Matching function: May not be necessary.
            captureSession.startRunning()
        } catch {
            logger.fault("Failed to start camera: \(error.localizedDescription)")
            errorPublisher.send(error)
        }
    }
}
func stop() {
    sessionQueue.async { [unowned self] in
        logger.debug("Stop capture session: \(self.captureSession)")
        captureSession.stopRunning()
    }
}
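For completeness, the simpler variant I'm also considering is changing activeFormat in place without stopping the session at all. A sketch (using the same sessionQueue, formatDict, and error handling as above; whether this sidesteps the FigSharedMemPool assert is exactly what I don't know):

func switchFormat(to deviceFormat: String) {
    sessionQueue.async { [unowned self] in
        logger.debug("Switch capture format to \(deviceFormat): \(self.captureSession)")
        do {
            guard let format = formatDict[deviceFormat] else { throw Error.captureFormatNotFound }
            // Change the device format in place; the session keeps running.
            try captureDevice.lockForConfiguration()
            captureDevice.activeFormat = format
            captureDevice.unlockForConfiguration()
        } catch {
            logger.fault("Failed to switch camera format: \(error.localizedDescription)")
            errorPublisher.send(error)
        }
    }
}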
Here is some code I have to create an AVAudioFile instance based on Int16 samples.
let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 44100.0, channels: 2, interleaved: false)!
let audioFile = try AVAudioFile(forWriting: outputURL, settings: format.settings)
When writing to the file I get the following runtime error, presumably from CoreAudio.
CABufferList.h:184 ASSERTION FAILURE [(nBytes <= buf->mDataByteSize) != 0 is false]:
I read this as a size mismatch between what is specified in the format used to create the file and the file's own internal processingFormat property, which is read-only. Here is my debugger console output showing the input format I created, along with the resulting AVAudioFile fileFormat and processingFormat properties.
(lldb) po format
<AVAudioFormat 0x300e553b0: 2 ch, 44100 Hz, Int16, deinterleaved>
(lldb) po format.settings
▿ 7 elements
▿ 0 : 2 elements
- key : "AVNumberOfChannelsKey"
- value : 2
▿ 1 : 2 elements
- key : "AVLinearPCMBitDepthKey"
- value : 16
▿ 2 : 2 elements
- key : "AVFormatIDKey"
- value : 1819304813
▿ 3 : 2 elements
- key : "AVLinearPCMIsNonInterleaved"
- value : 1
▿ 4 : 2 elements
- key : "AVLinearPCMIsBigEndianKey"
- value : 0
▿ 5 : 2 elements
- key : "AVLinearPCMIsFloatKey"
- value : 0
▿ 6 : 2 elements
- key : "AVSampleRateKey"
- value : 44100
(lldb) po audioFile.fileFormat
<AVAudioFormat 0x300ea5400: 2 ch, 44100 Hz, Int16, interleaved>
(lldb) po audioFile.processingFormat
<AVAudioFormat 0x300ea5450: 2 ch, 44100 Hz, Float32, deinterleaved>
Please note that the input format I'm using does not match either the audio file's fileFormat or processingFormat properties. The file format is interleaved even though I specified de-interleaved. This makes sense to me, since working with growing audio files is much easier and more efficient with interleaved data. The head-scratcher is the processingFormat: I specified Int16 samples, but it expects Float32? According to the format settings dictionary, I am specifying the correct key/value pairs.
Is this expected behavior? Does Apple always insist on Float32 internally or is this a bug?
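One thing I still plan to try, on the assumption that the processing format can be pinned when the file is created, is the longer AVAudioFile initializer that takes a common format:

// Same destination and settings as above, but explicitly requesting an
// Int16, de-interleaved processing format instead of the default.
let audioFile = try AVAudioFile(forWriting: outputURL,
                                settings: format.settings,
                                commonFormat: .pcmFormatInt16,
                                interleaved: false)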
I would like to write a driver that supports our custom USB-C connected device, which provides a serial port interface. USBSerialDriverKit looks like the solution I need. Unfortunately, without a decent sample, I'm not sure how to accomplish this. The DriverKit documentation does a good job of telling me what APIs exist, but it is very light on semantic information and details about how to use all of these API elements. A function call with five unexplained parameters just isn't that useful to me.
Does anyone have or know of a resource that can help me figure out how to get started?
I'm trying to write 16-bit, interleaved, 2-channel data captured from a LiveSwitch audio source to an AVAudioFile. The buffer and file formats match, but I get a bad parameter error from the API. Does this API not support the specified format, or is there some other issue?
Here is the debugger output.
(lldb) po audioFile.url
▿ file:///private/var/mobile/Containers/Data/Application/1EB14379-0CF2-41B6-B742-4C9A80728DB3/tmp/Heart%20Sounds%201
- _url : file:///private/var/mobile/Containers/Data/Application/1EB14379-0CF2-41B6-B742-4C9A80728DB3/tmp/Heart%20Sounds%201
- _parseInfo : nil
- _baseParseInfo : nil
(lldb) po error
Error Domain=com.apple.coreaudio.avfaudio Code=-50 "(null)" UserInfo={failed call=ExtAudioFileWrite(_impl->_extAudioFile, buffer.frameLength, buffer.audioBufferList)}
(lldb) po buffer.format
<AVAudioFormat 0x302a12b20: 2 ch, 44100 Hz, Int16, interleaved>
(lldb) po audioFile.fileFormat
<AVAudioFormat 0x302a515e0: 2 ch, 44100 Hz, Int16, interleaved>
(lldb) po buffer.frameLength
882
(lldb) po buffer.audioBufferList
▿ 0x0000000300941e60
- pointerValue : 12894608992
This code handles the details of converting the LiveSwitch frame into an AVAudioPCMBuffer.
extension FMLiveSwitchAudioFrame {

    func convertedToPCMBuffer() -> AVAudioPCMBuffer {
        Self.convertToAVAudioPCMBuffer(from: self)!
    }

    static func convertToAVAudioPCMBuffer(from frame: FMLiveSwitchAudioFrame) -> AVAudioPCMBuffer? {
        // Retrieve the audio buffer and format details from the FMLiveSwitchAudioFrame
        guard
            let buffer = frame.buffer(),
            let format = buffer.format() as? FMLiveSwitchAudioFormat else { return nil }
        // Extract PCM format details from FMLiveSwitchAudioFormat
        let sampleRate = Double(format.clockRate())
        let channelCount = AVAudioChannelCount(format.channelCount())
        // Determine bytes per sample based on bit depth
        let bitsPerSample = 16
        let bytesPerSample = bitsPerSample / 8
        let bytesPerFrame = bytesPerSample * Int(channelCount)
        let frameLength = AVAudioFrameCount(Int(buffer.dataBuffer().length()) / bytesPerFrame)
        // Create an AVAudioFormat from the FMLiveSwitchAudioFormat
        guard let avAudioFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: sampleRate, channels: channelCount, interleaved: true) else {
            return nil
        }
        // Create an AudioBufferList to wrap the existing buffer
        let audioBufferList = UnsafeMutablePointer<AudioBufferList>.allocate(capacity: 1)
        audioBufferList.pointee.mNumberBuffers = 1
        audioBufferList.pointee.mBuffers.mNumberChannels = channelCount
        audioBufferList.pointee.mBuffers.mDataByteSize = UInt32(buffer.dataBuffer().length())
        audioBufferList.pointee.mBuffers.mData = buffer.dataBuffer().data().mutableBytes // Directly use LiveSwitch buffer
        // Transfer ownership of the buffer to AVAudioPCMBuffer
        let pcmBuffer = AVAudioPCMBuffer(pcmFormat: avAudioFormat, bufferListNoCopy: audioBufferList) /* { buffer in
            // Ensure the buffer is freed when AVAudioPCMBuffer is deallocated
            buffer.deallocate() // Only call this if LiveSwitch allows manual deallocation
        } */
        pcmBuffer?.frameLength = frameLength
        return pcmBuffer
    }
}
This is the handler that is invoked with every frame in order to convert it for use with AVAudioFile and optionally update a scrolling signal display on the screen.
private func onRaisedFrame(obj: Any!) -> Void {
    // Bail out early if no one is interested in the data.
    guard isMonitoring else { return }
    // Convert LS frame to AVAudioPCMBuffer (no-copy)
    let frame = obj as! FMLiveSwitchAudioFrame
    let buffer = frame.convertedToPCMBuffer()
    // Hand subscribers a reference to the buffer for rendering to display.
    bufferPublisher?.send(buffer)
    // If we have an output file, store the data there as well.
    guard let audioFile = self.audioFile else { return }
    do {
        try audioFile.write(from: buffer) // FIXME: This call is throwing error -50
    } catch {
        FMLiveSwitchLog.error(withMessage: "Failed to write buffer to audio file at \(audioFile.url): \(error)")
        self.audioFile = nil
    }
}
This is how the audio file is being set up.
static var recordingFormat: AVAudioFormat = {
    AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 44_100, channels: 2, interleaved: true)!
}()
let audioFile = try AVAudioFile(forWriting: outputURL, settings: Self.recordingFormat.settings)
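My working theory is that write(from:) expects buffers in the file's processingFormat (which I believe defaults to Float32, de-interleaved) rather than its fileFormat, so the workaround I'm considering is creating the file with its processing format pinned to match my buffers:

// Create the file so that its processingFormat matches the Int16,
// interleaved buffers produced by convertedToPCMBuffer().
let audioFile = try AVAudioFile(forWriting: outputURL,
                                settings: Self.recordingFormat.settings,
                                commonFormat: .pcmFormatInt16,
                                interleaved: true)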
I have an iPadOS M-processor application with two different running configurations.
In config1, the shared AVAudioSession is configured for .videoChat mode using the built-in microphone. The input/output nodes of the AVAudioEngine are configured with voice processing enabled. The built-in mic is formatted for 1 channel at 48 kHz.
In config2, the shared AVAudioSession is configured for .measurement mode using an external USB microphone. The input/output nodes of the AVAudioEngine are configured with voice processing disabled. The external mic is formatted for 2 channels at 44.1 kHz.
I've written a configuration manager designed to safely switch between these two configurations. It works by stopping the AVAudioEngine and detaching all but the input and output nodes, updating the shared audio session for the desired mic and sample rate, and setting voice processing to either true or false as required by the configuration. Finally, the new audio graph is constructed by attaching the appropriate nodes, connecting them, and restarting the AVAudioEngine.
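Condensed, the switch looks roughly like this (error handling and node-graph details omitted; AudioConfig is just an illustrative value type carrying the desired mode, preferred input, sample rate, and voice-processing flag):

func apply(_ config: AudioConfig) throws {
    audioEngine.stop()
    // ...detach everything except the input/output nodes here...

    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: config.mode, options: [])
    try session.setPreferredInput(config.preferredInput)
    try session.setPreferredSampleRate(config.sampleRate)
    try session.setActive(true)

    // This is where the VPIO <-> RIO swap happens, and where the race
    // appears to originate: the I/O nodes don't seem to have finished
    // switching by the time start() is called below.
    try audioEngine.inputNode.setVoiceProcessingEnabled(config.voiceProcessing)

    // ...re-attach and connect the processing nodes for this configuration...
    audioEngine.prepare()
    try audioEngine.start() // Crashes here with a sample-rate mismatch.
}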
I'm experiencing what I believe is a race condition between switching voice processing on or off and then trying to rebuild and start the new audio graph. Even though the notifications I dump to the console indicate that my requested input and sample-rate settings are in place, I crash when trying to start the audio engine because the sample rate is wrong. Investigating further, it looks like the switch from remote I/O to voice-processing I/O, or vice versa, has not actually completed yet. I introduced a 100 ms delay and that seems to help, but it is obviously not a reliable way to build software that must work consistently.
How can I make sure that what are apparently asynchronous configuration changes to the shared audio session and the input/output nodes have completed before I go on?
I tried using route change notifications from the shared AVAudioSession, but these lie. They say my preferred mic input and sample-rate setting are in place, but when I dump the AVAudioEngine graph to the debugger console, I still see the wrong sample rate assigned to the input/output nodes. These are also the wrong AU nodes; that is, VPIO is still in place when RIO should be, or vice versa.
How can I make the switch reliable without arbitrary time delays?
Is my configuration manager approach appropriate (question for Apple engineers)?