I am looping through an audio file; below is my very simple code.
I'm looping through 400 frames at a time, but I picked 400 as an arbitrary number.
I would prefer to read by time instead, let's say a quarter of a second. So I was wondering: how can I determine the time length of each frame in the audio file?
I am assuming that determining this might differ based on audio format? I know almost nothing about audio.
guard var buffer = AVAudioPCMBuffer(pcmFormat: input.processingFormat, frameCapacity: AVAudioFrameCount(input.length)) else {
    return nil
}

var myAudioBuffer = AVAudioPCMBuffer(pcmFormat: input.processingFormat, frameCapacity: 400)!

while input.framePosition < input.length - 1 {
    // Read at most 400 frames, or whatever is left at the end of the file.
    let framesLeft = input.length - input.framePosition
    let fcIndex = min(framesLeft, 400)
    try? input.read(into: myAudioBuffer, frameCount: AVAudioFrameCount(fcIndex))
    let volume = getVolume(from: myAudioBuffer, bufferSize: Int(myAudioBuffer.frameLength))
    // ...manipulation code
}
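For reference, here is a minimal sketch of the relationship I'm asking about, under my working assumption that the frame rate equals the format's sample rate (so each frame lasts 1 / sampleRate seconds):

// Rough sketch: frames per second equals the sample rate,
// so a quarter of a second is sampleRate / 4 frames.
let sampleRate = input.processingFormat.sampleRate            // e.g. 44100 Hz
let framesPerQuarterSecond = AVAudioFrameCount(sampleRate * 0.25)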
I would like to open an audio file on my iOS device and remove long silences. I already have the code for calculating volumes, so I am not pasting that here.
What I am unsure of is how to read the file in proper pieces so I can later get the volume of each piece.
I realize that this might be a matter of calculating the size of frames and whatnot, but I am totally green when it comes to audio.
I would seriously appreciate any guidance.
guard let input = try? AVAudioFile(forReading: url) else {
    return nil
}
guard let buffer = AVAudioPCMBuffer(pcmFormat: input.processingFormat, frameCapacity: AVAudioFrameCount(input.length)) else {
    return nil
}
do {
    try input.read(into: buffer)
} catch {
    return nil
}
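What I have in mind for the piece-by-piece version is something like this rough sketch, assuming a quarter-second chunk size, a volume function with a signature like getVolume(from:bufferSize:), and a silence threshold I would still need to tune:

// Rough sketch: read the file a quarter of a second at a time and
// note which chunks fall below some silence threshold.
let sampleRate = input.processingFormat.sampleRate
let chunkFrames = AVAudioFrameCount(sampleRate * 0.25)
let silenceThreshold: Float = 0.015   // placeholder value, would need tuning

guard let chunk = AVAudioPCMBuffer(pcmFormat: input.processingFormat, frameCapacity: chunkFrames) else {
    return nil
}

var silentChunkStarts: [AVAudioFramePosition] = []
while input.framePosition < input.length {
    let start = input.framePosition
    try? input.read(into: chunk, frameCount: chunkFrames)
    if chunk.frameLength == 0 { break }   // nothing left to read
    let volume = getVolume(from: chunk, bufferSize: Int(chunk.frameLength))
    if volume < silenceThreshold {
        silentChunkStarts.append(start)   // candidate stretch of silence
    }
}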
Updated info below
Full disclosure: I do have this question over on StackOverflow, but I am at a standstill till I find a way to move forward, debug, etc.
I am trying to recognize prerecorded speech in Swift. Essentially it either detects no speech, detects blank speech, or works on the one prerecorded file where I screamed a few words.
I can't tell where the headache lies and can't figure out if there's a more detailed way to debug this. I can't find any properties that give more detailed info.
Someone on SO did recommend I go through Apple's demo, here. This works just fine, and my code is very similar to it. Yet the question remains whether something about the way I save my audio files, or something else entirely, is causing my headaches.
If anyone has any insight into this I would very much appreciate any hints.
My question over on StackOverflow
Updated info below, and new code
Updated info: It appears that I was calling SFSpeechURLRecognitionRequest too often, and before the first request had completed. Perhaps I need to create a new instance of SFSpeechRecognizer? Unsure.
Regardless, I quickly/sloppily adjusted the code to only run it once the previous instance returned its results.
The results were much better, except one audio file still came up as no results. Not an error, just no text.
This file is the same as the previous file, in that I took an audio recording and split it in two. So the formats and volumes are the same.
So I still need a better way to debug this, to find out what is going wrong with that file.
The code where I grab the file and attempt to read it
func findAudioFiles() {
    let fm = FileManager.default
    print("\(urlPath)")
    do {
        let items = try fm.contentsOfDirectory(atPath: documentsPath)
        // Keep only the recordings of interest: .m4a files whose names contain "SS-X-".
        let filteredInterestArray1 = items.filter { $0.hasSuffix(".m4a") }
        let filteredInterestArray2 = filteredInterestArray1.filter { $0.contains("SS-X-") }
        let sortedItems = filteredInterestArray2.sorted()
        for item in sortedItems {
            audioFiles.append(item)
        }
        NotificationCenter.default.post(name: Notification.Name("goAndRead"), object: nil, userInfo: myDic)
    } catch {
        print("\(error)")
    }
}
@objc func goAndRead() {
    audioIndex += 1
    if audioIndex != audioFiles.count {
        let fileURL = NSURL.fileURL(withPath: documentsPath + "/" + audioFiles[audioIndex], isDirectory: false)
        transcribeAudio(url: fileURL, item: audioFiles[audioIndex])
    }
}
func requestTranscribePermissions() {
    SFSpeechRecognizer.requestAuthorization { [unowned self] authStatus in
        DispatchQueue.main.async {
            if authStatus == .authorized {
                print("Good to go!")
            } else {
                print("Transcription permission was declined.")
            }
        }
    }
}
func transcribeAudio(url: URL, item: String) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")) else { return }
    let request = SFSpeechURLRecognitionRequest(url: url)
    if !recognizer.supportsOnDeviceRecognition { print("offline not available"); return }
    if !recognizer.isAvailable { print("not available"); return }
    request.requiresOnDeviceRecognition = true
    request.shouldReportPartialResults = true
    recognizer.recognitionTask(with: request) { (result, error) in
        guard let result = result else {
            print("\(item) : There was an error: \(error.debugDescription)")
            return
        }
        if result.isFinal {
            print("\(item) : \(result.bestTranscription.formattedString)")
            NotificationCenter.default.post(name: Notification.Name("goAndRead"), object: nil, userInfo: self.myDic)
        }
    }
}
Sorry, my question was idiotic, due to my bad typing skills (not for the first time).
So I am erasing it, since I can't delete it.
Again, sorry.
I am looking for a proper tutorial on how to write, let's say, a messaging app.
In other words, the user doesn't have to be running the app to receive messages. I would like to build that type of structured app. I realize that push notifications appear to be the way to go.
But at this point I still can't find a decent tutorial that seems to cover all the bases.
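For reference, the client-side registration piece looks roughly like this minimal sketch; the server/provider side and the overall delivery flow are what I'm trying to find a tutorial for:

import UIKit
import UserNotifications

// Minimal sketch of the client-side push registration only.
func registerForPushNotifications() {
    UNUserNotificationCenter.current().requestAuthorization(options: [.alert, .badge, .sound]) { granted, _ in
        guard granted else { return }
        DispatchQueue.main.async {
            UIApplication.shared.registerForRemoteNotifications()
        }
    }
}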
Thank you
I have a UITextView that I include in all my UIViewControllers; it carries over data from the entire app's instance.
If I put the two lines that scroll to the bottom in viewDidAppear, the text scrolls to the bottom, but you see it happen, so it's not pleasant visually.
However, if I put the same two lines in viewWillAppear (as shown below), then for some reason the UITextView starts at the top of the text.
Am I somehow doing this incorrectly?
override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    myStatusWin.text = db.status
    let range = NSMakeRange(myStatusWin.text.count - 1, 0)
    myStatusWin.scrollRangeToVisible(range)
}

override func viewDidAppear(_ animated: Bool) {
    super.viewDidAppear(animated)
}
Within my code to fetch data from CoreData I have the following line:
let itemNoSort = NSSortDescriptor(key:"itemNo", ascending: false)
What I am not sure of, however, is whether the above is the same as saying descending: true.
Can't seem to find it in the documentation.
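For context, this is roughly how the descriptor fits into my fetch; the entity name here is just a placeholder:

import CoreData

// Placeholder entity name; the point is only the sort direction.
let request = NSFetchRequest<NSManagedObject>(entityName: "Item")
// ascending: false, which should give the reverse (descending) order —
// the behaviour I'm trying to confirm.
let itemNoSort = NSSortDescriptor(key: "itemNo", ascending: false)
request.sortDescriptors = [itemNoSort]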
I've had this issue before...
Under User-Defined I have created DEBUG_LEVEL_1
And within my code I have
#if DEBUG_LEVEL_1
self.status = printSimDir()
#endif
However, the printSimDir function is never called.
So obviously I am setting something up incorrectly here.
Within my UIViewController I have a UITextView which I use to dump current status and info into. Obviously, every time I add text to the UITextView I would like it to scroll to the bottom.
So I've created this function, which I call from UIViewController whenever I have new data.
func updateStat(status: String, tView: UITextView) {
    db.status = db.status + status + "\n"
    tView.text = db.status
    let range = NSMakeRange(tView.text.count - 1, 0)
    tView.scrollRangeToVisible(range)
    tView.flashScrollIndicators()
}
The only thing that does not work is the tView.scrollRangeToVisible. However, if from UIViewController I call:
updateStat(status: "...new data...", tView: mySession)
let range = NSMakeRange(mySession.text.count - 1, 0)
mySession.scrollRangeToVisible(range)
then the UITextView's scrollRangeToVisible does work.
I'm curious if anyone knows why this works when called within the UIViewController, but not when called from a function?
P.S. I have also tried making updateStat an extension on UIViewController, but that doesn't work either.
I am trying to add a UITextView to my app to output data to. Naturally the data will eventually be bigger than the size of the UITextView, and the view is a set size, so I would like the user to be able to scroll through its content.
However, I cannot scroll through the content in the app. Am I supposed to build the scrolling function myself? That seems weird, but I cannot seem to find the answer on the web.
I've also noticed that no vertical scroll bar shows up when the text is larger than the view, which makes me wonder if I am missing a property or two.
func createStatusField() -> UITextView {
    let myStatus = UITextView(frame: CGRect(x: 50, y: 50, width: 100, height: 300))
    myStatus.autocorrectionType = .no
    myStatus.text = "hello there"
    myStatus.backgroundColor = .secondarySystemBackground
    myStatus.textColor = .secondaryLabel
    myStatus.font = UIFont.preferredFont(forTextStyle: .body)
    myStatus.layer.zPosition = 1
    myStatus.isScrollEnabled = true
    myStatus.showsVerticalScrollIndicator = true
    return myStatus
}
I already posted about Xcode 13.4.1 not supporting iOS 15.6 on my iPhone. But the answer raised even more questions.
If the latest version of Xcode (13.4.1) won't support iOS 15.6, why should I think an earlier version of Xcode would?
What is the real solution for getting Xcode to run apps on that iOS version? GitHub does not have the support files past 15.5?
Does Xcode automatically update its supported iOS files behind the scenes?
Is there a planned date for Xcode to support iOS 15.6?
Thank you
So I've found out from other posts that Xcode 13.4.1 won't debug apps on iPhones with iOS 15.6.
The solution that everyone seems to agree on is to go back to Xcode 13.3.1.
While I am downloading the .xip file for that version, I want to check first on how to install the older version. I don't want to mess things up any worse than they are now.
I am at the beginning of a voice recording app. I store incoming voice data into a buffer array and write 50 of them at a time to a file. The code works fine; see Sample One.
However, I would like the recorded files to be smaller, so here I try to add an AVAudioMixerNode to downsample the audio. But this code gives me two errors; see Sample Two.
The first error I get is when I call audioEngine.attach(downMixer). The debugger gives me nine of these errors:
throwing -10878
The second error is a crash when I try to write to audioFile. Of course they might all be related, so I am looking to get the mixer attached successfully first.
But I do need help, as I am just trying to piece this all together from tutorials, and when it comes to audio, I know less than about anything else.
Sample One
//these two lines are in the init of the class that contains this function...
node = audioEngine.inputNode
recordingFormat = node.inputFormat(forBus: 0)

func startRecording() {
    audioBuffs = []
    x = -1
    node.installTap(onBus: 0, bufferSize: 8192, format: recordingFormat, block: { [self] (buffer, _) in
        x += 1
        audioBuffs.append(buffer)
        if x >= 50 {
            audioFile = makeFile(format: recordingFormat, index: fileCount)
            mainView?.setLabelText(tag: 3, text: "fileIndex = \(fileCount)")
            fileCount += 1
            for i in 0...49 {
                do {
                    try audioFile!.write(from: audioBuffs[i])
                } catch {
                    mainView?.setLabelText(tag: 4, text: "write error")
                    stopRecording()
                }
            }
            // ...cleanup buffer code
        }
    })
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch let error {
        print("oh catch \(error)")
    }
}
Sample Two
//these two lines are in the init of the class that contains this function
node = audioEngine.inputNode
recordingFormat = node.inputFormat(forBus: 0)

func startRecording() {
    audioBuffs = []
    x = -1

    // new code
    let format16KHzMono = AVAudioFormat.init(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 11025.0, channels: 1, interleaved: true)
    let downMixer = AVAudioMixerNode()
    audioEngine.attach(downMixer)

    // installTap on the mixer rather than the node
    downMixer.installTap(onBus: 0, bufferSize: 8192, format: format16KHzMono, block: { [self] (buffer, _) in
        x += 1
        audioBuffs.append(buffer)
        if x >= 50 {
            // use a different format in creating the audioFile
            audioFile = makeFile(format: format16KHzMono!, index: fileCount)
            mainView?.setLabelText(tag: 3, text: "fileIndex = \(fileCount)")
            fileCount += 1
            for i in 0...49 {
                do {
                    try audioFile!.write(from: audioBuffs[i])
                } catch {
                    stopRecording()
                }
            }
            // ...cleanup buffers...
        }
    })

    let format = node.inputFormat(forBus: 0)
    // new code
    audioEngine.connect(node, to: downMixer, format: format)   // use default input format
    audioEngine.connect(downMixer, to: audioEngine.outputNode, format: format16KHzMono)   // use new audio format
    downMixer.outputVolume = 0.0
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch let error {
        print("oh catch \(error)")
    }
}
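For what it's worth, the other route I've been considering for smaller files is to keep the tap at the hardware format and simply write into a compressed (AAC) file, on the assumption that AVAudioFile converts from its PCM processing format to the file's format on write. A hypothetical version of the file-creation code (my real makeFile is not shown here, and the name and URL below are made up):

// Hypothetical sketch: write AAC instead of uncompressed PCM to shrink the files.
// Assumption: AVAudioFile accepts buffers in the tap's PCM format and encodes on write.
func makeCompressedFile(recordingFormat: AVAudioFormat, index: Int) throws -> AVAudioFile {
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("chunk-\(index).m4a")
    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVSampleRateKey: recordingFormat.sampleRate,
        AVNumberOfChannelsKey: recordingFormat.channelCount,
        AVEncoderAudioQualityKey: AVAudioQuality.medium.rawValue
    ]
    return try AVAudioFile(forWriting: url, settings: settings)
}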
I am trying to distinguish the difference in volume between background noise and someone speaking, in Swift.
Previously, I came across a tutorial which had me looking at the power levels in each channel. It came out as the code listed in Sample One, which I call within the installTap closure. It was OK, but the variance between the background and the intended voice to record wasn't that great. Sure, it could have been the math used to calculate it, but since I have no experience with audio data, it was like reading another language.
Then I came across another demo. Its code was much simpler, and the difference in values between background noise and speaking voice was much greater, and therefore much more detectable. It's listed here as Sample Two, which I also call within the installTap closure.
My issue here is wanting to understand what is happening in the code. In all my experiences with other languages, voice was something I never dealt with before, so this is way over my head.
Not looking for someone to explain this to me line by line. But if someone could let me know where I can find decent documentation so I can better grasp what is going on, I would appreciate it.
Thank you
Sample One
func audioMetering(buffer: AVAudioPCMBuffer) {
    // buffer.frameLength = 1024
    let inNumberFrames: UInt = UInt(buffer.frameLength)
    if buffer.format.channelCount > 0 {
        let samples = (buffer.floatChannelData![0])
        var avgValue: Float32 = 0
        // vDSP_meamgv computes the mean of the absolute sample values (average magnitude).
        vDSP_meamgv(samples, 1, &avgValue, inNumberFrames)
        var v: Float = -100
        if avgValue != 0 {
            // Convert the average magnitude to decibels.
            v = 20.0 * log10f(avgValue)
        }
        // Exponential smoothing (low-pass) so the level doesn't jump around frame to frame.
        self.averagePowerForChannel0 = (self.LEVEL_LOWPASS_TRIG * v) + ((1 - self.LEVEL_LOWPASS_TRIG) * self.averagePowerForChannel0)
        self.averagePowerForChannel1 = self.averagePowerForChannel0
    }
    if buffer.format.channelCount > 1 {
        let samples = buffer.floatChannelData![1]
        var avgValue: Float32 = 0
        vDSP_meamgv(samples, 1, &avgValue, inNumberFrames)
        var v: Float = -100
        if avgValue != 0 {
            v = 20.0 * log10f(avgValue)
        }
        self.averagePowerForChannel1 = (self.LEVEL_LOWPASS_TRIG * v) + ((1 - self.LEVEL_LOWPASS_TRIG) * self.averagePowerForChannel1)
    }
}
Sample Two
private func getVolume(from buffer: AVAudioPCMBuffer, bufferSize: Int) -> Float {
    guard let channelData = buffer.floatChannelData?[0] else {
        return 0
    }
    let channelDataArray = Array(UnsafeBufferPointer(start: channelData, count: bufferSize))

    // Envelope follower: track the rectified (absolute) signal, rising quickly
    // (attack constant) and falling slowly (decay constant).
    var outEnvelope = [Float]()
    var envelopeState: Float = 0
    let envConstantAtk: Float = 0.16
    let envConstantDec: Float = 0.003

    for sample in channelDataArray {
        let rectified = abs(sample)
        if envelopeState < rectified {
            envelopeState += envConstantAtk * (rectified - envelopeState)
        } else {
            envelopeState += envConstantDec * (rectified - envelopeState)
        }
        outEnvelope.append(envelopeState)
    }

    // Simple noise gate: if the envelope never rises above the threshold,
    // treat it as microphone noise and report zero volume.
    if let maxVolume = outEnvelope.max(),
       maxVolume > Float(0.015) {
        return maxVolume
    } else {
        return 0.0
    }
}
I have no problem creating, and writing to AVAudioFiles.
do {
    audioFile = try AVAudioFile(forWriting: destinationPath, settings: format.settings)
    print("created: \(curFileLinkName)")
} catch {
    print("error creating file")
}
But is there a way to close the file?
I've searched and all I could find was this very post on StackOverflow, which makes me wonder.
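The closest thing I've found is the suggestion that the file is finalized when the AVAudioFile object goes away, so this is what I'm trying for now (assumption: releasing the last reference flushes and closes the underlying file):

// Assumption: there is no explicit close API here; the file appears to be
// flushed and closed when the AVAudioFile is deallocated, so dropping the
// last reference is the closest thing to "closing" it.
audioFile = nil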