I've been looking through Apple's sample code Building a Feature-Rich App for Sports Analysis - https://developer.apple.com/documentation/vision/building_a_feature-rich_app_for_sports_analysis and its associated WWDC video to learn to reason about AVFoundation and VNDetectTrajectoriesRequest - https://developer.apple.com/documentation/vision/vndetecttrajectoriesrequest. My goal is to allow the user to import videos (this part I have working, the user sees a UIDocumentBrowserViewController - https://developer.apple.com/documentation/uikit/uidocumentbrowserviewcontroller, picks a video file, and then a copy is made), but I only want segments of the original video copied where trajectories are detected from a ball moving.
I've tried as best I can to grasp the two parts, at the very least finding where the video copy is made and where the trajectory request is made.
The full video copy happens in CameraViewController.swift (I'm starting with just imported video for now and not reading live from the device's video camera), line 160:
func startReadingAsset(_ asset: AVAsset) {
videoRenderView = VideoRenderView(frame: view.bounds)
setupVideoOutputView(videoRenderView)
let displayLink = CADisplayLink(target: self, selector: #selector(handleDisplayLink(_:)))
displayLink.preferredFramesPerSecond = 0
displayLink.isPaused = true
displayLink.add(to: RunLoop.current, forMode: .default)
guard let track = asset.tracks(withMediaType: .video).first else {
AppError.display(AppError.videoReadingError(reason: "No video tracks found in AVAsset."), inViewController: self)
return
}
let playerItem = AVPlayerItem(asset: asset)
let player = AVPlayer(playerItem: playerItem)
let settings = [
String(kCVPixelBufferPixelFormatTypeKey): kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
]
let output = AVPlayerItemVideoOutput(pixelBufferAttributes: settings)
playerItem.add(output)
player.actionAtItemEnd = .pause
player.play()
self.displayLink = displayLink
self.playerItemOutput = output
self.videoRenderView.player = player
let affineTransform = track.preferredTransform.inverted()
let angleInDegrees = atan2(affineTransform.b, affineTransform.a) * CGFloat(180) / CGFloat.pi
var orientation: UInt32 = 1
switch angleInDegrees {
case 0:
orientation = 1 // Recording button is on the right
case 180, -180:
orientation = 3 // abs(180) degree rotation recording button is on the right
case 90:
orientation = 8 // 90 degree CW rotation recording button is on the top
case -90:
orientation = 6 // 90 degree CCW rotation recording button is on the bottom
default:
orientation = 1
}
videoFileBufferOrientation = CGImagePropertyOrientation(rawValue: orientation)!
videoFileFrameDuration = track.minFrameDuration
displayLink.isPaused = false
}
@objc
private func handleDisplayLink(_ displayLink: CADisplayLink) {
guard let output = playerItemOutput else {
return
}
videoFileReadingQueue.async {
let nextTimeStamp = displayLink.timestamp + displayLink.duration
let itemTime = output.itemTime(forHostTime: nextTimeStamp)
guard output.hasNewPixelBuffer(forItemTime: itemTime) else {
return
}
guard let pixelBuffer = output.copyPixelBuffer(forItemTime: itemTime, itemTimeForDisplay: nil) else {
return
}
// Create sample buffer from pixel buffer
var sampleBuffer: CMSampleBuffer?
var formatDescription: CMVideoFormatDescription?
CMVideoFormatDescriptionCreateForImageBuffer(allocator: nil, imageBuffer: pixelBuffer, formatDescriptionOut: &formatDescription)
let duration = self.videoFileFrameDuration
var timingInfo = CMSampleTimingInfo(duration: duration, presentationTimeStamp: itemTime, decodeTimeStamp: itemTime)
CMSampleBufferCreateForImageBuffer(allocator: nil,
imageBuffer: pixelBuffer,
dataReady: true,
makeDataReadyCallback: nil,
refcon: nil,
formatDescription: formatDescription!,
sampleTiming: &timingInfo,
sampleBufferOut: &sampleBuffer)
if let sampleBuffer = sampleBuffer {
self.outputDelegate?.cameraViewController(self, didReceiveBuffer: sampleBuffer, orientation: self.videoFileBufferOrientation)
DispatchQueue.main.async {
let stateMachine = self.gameManager.stateMachine
if stateMachine.currentState is GameManager.SetupCameraState {
// Once we received first buffer we are ready to proceed to the next state
stateMachine.enter(GameManager.DetectingBoardState.self)
}
}
}
}
}
Line 139, self.outputDelegate?.cameraViewController(self, didReceiveBuffer: sampleBuffer, orientation: self.videoFileBufferOrientation), is where the video sample buffer is handed off to the Vision framework subsystem for trajectory analysis (the second part). This delegate callback is implemented in GameViewController.swift on line 335:
// Perform the trajectory request in a separate dispatch queue.
trajectoryQueue.async {
do {
try visionHandler.perform([self.detectTrajectoryRequest])
if let results = self.detectTrajectoryRequest.results {
DispatchQueue.main.async {
self.processTrajectoryObservations(controller, results)
}
}
} catch {
AppError.display(error, inViewController: self)
}
}
Trajectories found are drawn over the video in self.processTrajectoryObservations(controller, results).
Where I'm stuck now is modifying this so that, instead of drawing the trajectories, only the parts of the original video where trajectories were detected get copied into the new video.
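My current thought is to stop drawing in processTrajectoryObservations and instead just record each observation's time range for a later assembly step. The stored property and confidence threshold below are my own additions, not from the sample:
import Vision
import CoreMedia
// Hypothetical stored property on GameViewController, not part of Apple's sample.
var timeRangesOfInterest = [CMTimeRange]()
func processTrajectoryObservations(_ controller: CameraViewController,
                                   _ results: [VNTrajectoryObservation]) {
    // Instead of updating trajectoryView, remember where in the asset each
    // trajectory occurred so those segments can be copied out later.
    // The 0.9 confidence threshold is arbitrary.
    for observation in results where observation.confidence > 0.9 {
        timeRangesOfInterest.append(observation.timeRange)
    }
}
What I'm unsure about is the assembly step itself, i.e. how to turn those collected ranges into a new video file.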
I'd like to perform VNDetectHumanBodyPoseRequest requests on a video that the user imports through the system photo picker or document view controller. I started looking at the Building a Feature-Rich App for Sports Analysis - https://developer.apple.com/documentation/vision/building_a_feature-rich_app_for_sports_analysis sample code since it has an example where video is imported from disk and then analyzed. However, my end goal is to filter for frames that contain certain poses, so that all frames without them are edited out / deleted (rather than drawing on frames with detected trajectories, as the sample code does). For pose detection I'm looking at Detecting Human Actions in a Live Video Feed - https://developer.apple.com/documentation/createml/detecting_human_actions_in_a_live_video_feed, but the live video capture part isn't quite relevant to my case.
I'm trying to break this down into smaller problems and have a few questions:
Should a full video file copy be made before analysis?
The Detecting Human Actions in a Live Video Feed - https://developer.apple.com/documentation/createml/detecting_human_actions_in_a_live_video_feed sample code uses a Combine pipeline for analyzing live video frames. Since I'm analyzing imported video, would Combine be overkill or a good fit here?
After I've detected which frames have a particular pose, how (in AVFoundation terms) do I filter for those frames or edit out / delete the frames without that pose?
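To make the second question concrete, the non-Combine version I have in mind is a plain AVAssetReader loop feeding the pose request frame by frame. All of the names below are mine, not from either sample:
import AVFoundation
import Vision
// Sketch of frame-by-frame pose detection on an imported file, without Combine.
func timeRangesContainingPoses(in asset: AVAsset) throws -> [CMTimeRange] {
    guard let track = asset.tracks(withMediaType: .video).first else { return [] }
    let reader = try AVAssetReader(asset: asset)
    let output = AVAssetReaderTrackOutput(
        track: track,
        outputSettings: [kCVPixelBufferPixelFormatTypeKey as String:
                         kCVPixelFormatType_420YpCbCr8BiPlanarFullRange])
    reader.add(output)
    reader.startReading()
    var ranges = [CMTimeRange]()
    let request = VNDetectHumanBodyPoseRequest()
    while let sampleBuffer = output.copyNextSampleBuffer() {
        let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, orientation: .up, options: [:])
        try handler.perform([request])
        if let observations = request.results, !observations.isEmpty {
            // Keep this frame's presentation time span; deciding which poses
            // actually "count" would go here.
            let start = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
            let duration = CMSampleBufferGetDuration(sampleBuffer)
            ranges.append(CMTimeRange(start: start, duration: duration))
        }
    }
    return ranges
}
If something like this is reasonable, then question 3 reduces to turning those time ranges into a new asset, which is what I'm still unsure about.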
For example,
Operation A both fetches model data over the network and updates a UICollectionView backed by it.
Operation B filters model data.
What is a good approach to executing B only after A is finished?
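For context, I'm aware of plain operation dependencies, roughly like this (operationA / operationB are placeholders standing in for my real operations):
import Foundation
// Placeholder operations standing in for A and B.
let operationA = BlockOperation {
    // Fetch model data and update the collection view (UI work dispatched to main).
}
let operationB = BlockOperation {
    // Filter the model data fetched by A.
}
// B will not start until A has finished.
operationB.addDependency(operationA)
let queue = OperationQueue()
queue.addOperations([operationA, operationB], waitUntilFinished: false)
That said, since A's network fetch is itself asynchronous, I suspect a plain BlockOperation would be marked finished before the network call completes, which is part of what I'm trying to understand.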
When synchronizing model objects, local CKRecords, and CKRecords in CloudKit during swipe-to-delete, how can I make this as robust as possible? Error handling omitted for the sake of the example.
override func tableView(_ tableView: UITableView, commit editingStyle: UITableViewCell.EditingStyle, forRowAt indexPath: IndexPath) {
if editingStyle == .delete {
let record = self.records[indexPath.row]
privateDatabase.delete(withRecordID: record.recordID) { recordID, error in
self.records.remove(at: indexPath.row)
}
}
}
Since indexPath could change due to other changes in the table view / collection view during the time it takes to delete the record from CloudKit, how could this be improved upon?
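One variation I've considered is capturing the record itself rather than the index path, and hopping back to the main queue before touching the data source. This is my own sketch, not something I've validated:
override func tableView(_ tableView: UITableView,
                        commit editingStyle: UITableViewCell.EditingStyle,
                        forRowAt indexPath: IndexPath) {
    guard editingStyle == .delete else { return }
    let record = records[indexPath.row]
    privateDatabase.delete(withRecordID: record.recordID) { deletedRecordID, error in
        DispatchQueue.main.async {
            // Re-derive the row from the record ID in case rows moved while the
            // CloudKit call was in flight.
            guard error == nil,
                  let index = self.records.firstIndex(where: { $0.recordID == deletedRecordID }) else {
                return
            }
            self.records.remove(at: index)
            tableView.deleteRows(at: [IndexPath(row: index, section: indexPath.section)], with: .automatic)
        }
    }
}
I'm still unsure whether looking the row up again like this is enough, or whether the local deletion should happen optimistically before the CloudKit call.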
What might be a good way to constrain a view's top anchor to be just at the edge of a device's Face ID sensor housing if it has one?
This view is a product photo that would be clipped too much if it ignored the top safe area inset, but positioning it relative to the top safe area margin isn't ideal either because of the slight gap between the sensor housing and the view (the view is a photo of pants cropped at the waist). What might be a good approach here?
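To show what I mean, this is the kind of heuristic I've been sketching, though the 30-point constant is purely a guess on my part since I don't know of a public API that exposes the sensor housing's frame:
// Rough sketch: pin the photo near the sensor housing on notched devices,
// otherwise fall back to the safe area. The 30-point constant is a guess.
let topInset = view.window?.safeAreaInsets.top ?? 0
let probablyHasSensorHousing = topInset > 24   // heuristic, not an official check
photoView.translatesAutoresizingMaskIntoConstraints = false
if probablyHasSensorHousing {
    photoView.topAnchor.constraint(equalTo: view.topAnchor, constant: 30).isActive = true
} else {
    photoView.topAnchor.constraint(equalTo: view.safeAreaLayoutGuide.topAnchor).isActive = true
}
I'd rather not hard-code device metrics like this, so any pointer to a more principled approach is welcome.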
In a SwiftUI scroll view with the page style, is it possible to change the page indicator color?
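For context, by "scroll view with the page style" I mean a TabView using the page tab view style. The only approach I've found is going through the UIKit appearance proxy, which I'm not sure is the intended way:
import SwiftUI
import UIKit
struct PagedView: View {
    var body: some View {
        TabView {
            Color.red
            Color.blue
        }
        .tabViewStyle(PageTabViewStyle())
        .onAppear {
            // UIKit appearance proxy affecting the underlying UIPageControl.
            UIPageControl.appearance().currentPageIndicatorTintColor = .white
            UIPageControl.appearance().pageIndicatorTintColor = UIColor.white.withAlphaComponent(0.4)
        }
    }
}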
I have an app that currently depends on fetching the model through CloudKit, and is composed of value types. I'm considering adding Core Data support so that record modifications are robust regardless of network conditions.
Core Data resources seem to always assume a model layer with reference semantics, so I'm not sure where to begin.
Should I keep my top-level model type a struct? Can I? If I move my model to reference semantics, how might I bridge from past model instances that are fetched through CloudKit and then decoded?
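Concretely, the kind of bridging I imagine is keeping the struct as the working model and mapping it into a managed object only for persistence, something like this (all names here are hypothetical, not my real model):
import CoreData
// Existing value-type model (decoded from a CKRecord today).
struct Item {
    var identifier: UUID
    var title: String
}
// Hypothetical managed-object counterpart defined in the Core Data model.
final class ItemMO: NSManagedObject {
    @NSManaged var identifier: UUID
    @NSManaged var title: String
}
extension Item {
    // Bridge a decoded value into Core Data.
    func makeManagedObject(in context: NSManagedObjectContext) -> ItemMO {
        let object = ItemMO(context: context)
        object.identifier = identifier
        object.title = title
        return object
    }
    // Bridge back out so the rest of the app keeps working with value types.
    init(_ object: ItemMO) {
        self.init(identifier: object.identifier, title: object.title)
    }
}
Is this two-layer approach reasonable, or is it fighting how Core Data wants to be used?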
Thank you in advance.
I observe when an AVPlayer finishes playing in order to present an alert at that end time.
NotificationCenter.default.addObserver(
self,
selector: #selector(presentAlert),
name: .AVPlayerItemDidPlayToEndTime,
object: nil
)
I've had multiple user reports of the alert appearing where it's not intended, such as in the middle of the video after replaying, and on other views. I'm unable to reproduce this myself, but my guess is that it's a threading issue, since the AVPlayerItemDidPlayToEndTime documentation says "the system may post this notification on a thread other than the one used to register the observer."
How then do I make sure the alert is presented on the main thread? Should I dispatch to the main queue from within my presentAlert function, or add the above observer with addObserver(forName:object:queue:using:) instead, passing in the main operation queue?
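For what it's worth, the variant I'm leaning toward is the block-based API with the main queue and the specific item as the object, which I think would also avoid reacting to other players' items. Here endTimeObserver and playerItem are placeholder properties for however I end up holding on to them:
// Keep a reference so the observer can be removed later (e.g. in deinit).
endTimeObserver = NotificationCenter.default.addObserver(
    forName: .AVPlayerItemDidPlayToEndTime,
    object: playerItem,   // scope to this specific item, not every AVPlayerItem in the app
    queue: .main          // deliver on the main queue so presenting UI is safe
) { [weak self] _ in
    self?.presentAlert()
}
Is this the recommended route, or is dispatching to main inside presentAlert just as good?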
I'd like a user's upload operation that's started in the foreground to continue when they leave the app. Apple's article Extending Your App's Background Execution Time has the following code listing:
func sendDataToServer( data : NSData ) {
// Perform the task on a background queue.
DispatchQueue.global().async {
// Request the task assertion and save the ID.
self.backgroundTaskID = UIApplication.shared.beginBackgroundTask(withName: "Finish Network Tasks") {
// End the task if time expires.
UIApplication.shared.endBackgroundTask(self.backgroundTaskID!)
self.backgroundTaskID = UIBackgroundTaskInvalid
}
// Send the data synchronously.
self.sendAppDataToServer( data: data)
// End the task assertion.
UIApplication.shared.endBackgroundTask(self.backgroundTaskID!)
self.backgroundTaskID = UIBackgroundTaskInvalid
}
}
The call to self.sendAppDataToServer(data: data) is unclear. Is this where the upload operation would go, wrapped in DispatchQueue.global().sync { }?
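My reading is that sendAppDataToServer(data:) is just a stand-in for whatever synchronous upload you have. If the upload is asynchronous (for example a URLSession task), I assume the task assertion would instead be ended in the completion handler, roughly like this (uploadURL is a placeholder):
func sendDataToServer(data: Data) {
    // Request the task assertion before kicking off the upload.
    backgroundTaskID = UIApplication.shared.beginBackgroundTask(withName: "Finish Network Tasks") {
        // End the task if the extra time expires.
        UIApplication.shared.endBackgroundTask(self.backgroundTaskID!)
        self.backgroundTaskID = UIBackgroundTaskIdentifier.invalid
    }
    var request = URLRequest(url: uploadURL)   // uploadURL is a placeholder
    request.httpMethod = "POST"
    let task = URLSession.shared.uploadTask(with: request, from: data) { _, _, _ in
        // The upload finished (or failed); release the assertion here rather than
        // immediately after starting the task.
        UIApplication.shared.endBackgroundTask(self.backgroundTaskID!)
        self.backgroundTaskID = UIBackgroundTaskIdentifier.invalid
    }
    task.resume()
}
Or, if the upload should survive suspension entirely, is a background URLSessionConfiguration the better fit than a background task assertion?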
Apple's sample code Identifying Trajectories in Video contains the following delegate callback:
func cameraViewController(_ controller: CameraViewController, didReceiveBuffer buffer: CMSampleBuffer, orientation: CGImagePropertyOrientation) {
let visionHandler = VNImageRequestHandler(cmSampleBuffer: buffer, orientation: orientation, options: [:])
if gameManager.stateMachine.currentState is GameManager.TrackThrowsState {
DispatchQueue.main.async {
// Get the frame of rendered view
let normalizedFrame = CGRect(x: 0, y: 0, width: 1, height: 1)
self.jointSegmentView.frame = controller.viewRectForVisionRect(normalizedFrame)
self.trajectoryView.frame = controller.viewRectForVisionRect(normalizedFrame)
}
// Perform the trajectory request in a separate dispatch queue.
trajectoryQueue.async {
do {
try visionHandler.perform([self.detectTrajectoryRequest])
if let results = self.detectTrajectoryRequest.results {
DispatchQueue.main.async {
self.processTrajectoryObservations(controller, results)
}
}
} catch {
AppError.display(error, inViewController: self)
}
}
}
}
However, instead of drawing UI whenever detectTrajectoryRequest.results exist (https://developer.apple.com/documentation/vision/vndetecttrajectoriesrequest/3675672-results), I'm interested in using the CMTimeRange provided by each result to construct a new video. In effect, this would filter down the original video to only frames with trajectories. How might I accomplish this, perhaps through writing only specific time ranges' frames from one AVFoundation video to a new AVFoundation video?
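The direction I'm imagining, but haven't validated, is accumulating each observation's timeRange and then assembling a composition and export from them. In this sketch, detectedRanges and the output URL are placeholders for values I'd collect elsewhere:
import AVFoundation
// Given the time ranges collected from VNTrajectoryObservation.timeRange,
// build a composition containing only those segments and export it.
func exportTrajectorySegments(from asset: AVAsset,
                              ranges detectedRanges: [CMTimeRange],
                              to outputURL: URL,
                              completion: @escaping (Error?) -> Void) {
    let composition = AVMutableComposition()
    do {
        for range in detectedRanges {
            // Appends each detected segment back-to-back at the end of the composition.
            try composition.insertTimeRange(range, of: asset, at: composition.duration)
        }
    } catch {
        completion(error)
        return
    }
    guard let export = AVAssetExportSession(asset: composition,
                                            presetName: AVAssetExportPresetHighestQuality) else {
        completion(nil)
        return
    }
    export.outputURL = outputURL
    export.outputFileType = .mov
    export.exportAsynchronously {
        completion(export.error)
    }
}
Is a composition plus export session like this the right tool, or should the frames be copied sample by sample with an asset reader and writer instead?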
Apple's sample code "AVReaderWriter: Offline Audio / Video Processing" has the following listing
let writingGroup = dispatch_group_create()
// Transfer data from input file to output file.
self.transferVideoTracks(videoReaderOutputsAndWriterInputs, group: writingGroup)
self.transferPassthroughTracks(passthroughReaderOutputsAndWriterInputs, group: writingGroup)
// Handle completion.
let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
dispatch_group_notify(writingGroup, queue) {
// `readingAndWritingDidFinish()` is guaranteed to call `finish()` exactly once.
self.readingAndWritingDidFinish(assetReader, assetWriter: assetWriter)
}
in CynanifyOperation.swift (an NSOperation subclass that stylizes imported video and exports it). How would I go about writing this part in modern Swift so that it compiles and works?
I've tried writing this as
let writingGroup = DispatchGroup()
// Transfer data from input file to output file.
self.transferVideoTracks(videoReaderOutputsAndWriterInputs: videoReaderOutputsAndWriterInputs, group: writingGroup)
self.transferPassthroughTracks(passthroughReaderOutputsAndWriterInputs: passthroughReaderOutputsAndWriterInputs, group: writingGroup)
// Handle completion.
writingGroup.notify(queue: .global()) {
// `readingAndWritingDidFinish()` is guaranteed to call `finish()` exactly once.
self.readingAndWritingDidFinish(assetReader: assetReader, assetWriter: assetWriter)
}
However, it's taking an extremely long time for self.readingAndWritingDidFinish(assetReader: assetReader, assetWriter: assetWriter) to be called, and my UI is stuck in the ProgressViewController with a loading spinner. Is there something I wrote incorrectly or missed conceptually in the Swift 5 version?
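One thing I want to confirm is the enter/leave balancing, since notify(queue:) only fires once every enter() has a matching leave(). As I understand it, each transfer function is expected to follow roughly this pattern (my paraphrase with hypothetical names, not the sample's actual code):
// `transferSamples` and `transferQueue` are placeholders for the sample's real helpers.
func transferVideoTracks(_ pairs: [(readerOutput: AVAssetReaderOutput,
                                    writerInput: AVAssetWriterInput)],
                         group: DispatchGroup) {
    for pair in pairs {
        group.enter()
        transferSamples(from: pair.readerOutput, to: pair.writerInput, onQueue: transferQueue) {
            // Runs once this track is fully transferred (or cancelled); it must
            // balance the enter() above, otherwise writingGroup.notify(queue:)
            // never fires.
            group.leave()
        }
    }
}
If readingAndWritingDidFinish is never reached, my current guess is either a missing leave() in one of my translated transfer functions or a requestMediaDataWhenReady callback that never calls its completion handler.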
Say you have a pinch gesture recognizer and pan gesture recognizer on an image view:
@IBAction func pinchPiece(_ pinchGestureRecognizer: UIPinchGestureRecognizer) {
guard pinchGestureRecognizer.state == .began || pinchGestureRecognizer.state == .changed,
let piece = pinchGestureRecognizer.view else {
// After pinch releases, zoom back out.
if pinchGestureRecognizer.state == .ended {
UIView.animate(withDuration: 0.3, animations: {
pinchGestureRecognizer.view?.transform = CGAffineTransform.identity
})
}
return
}
adjustAnchor(for: pinchGestureRecognizer)
let scale = pinchGestureRecognizer.scale
piece.transform = piece.transform.scaledBy(x: scale, y: scale)
pinchGestureRecognizer.scale = 1 // Clear scale so that it is the right delta next time.
}
@IBAction func panPiece(_ panGestureRecognizer: UIPanGestureRecognizer) {
guard panGestureRecognizer.state == .began || panGestureRecognizer.state == .changed,
let piece = panGestureRecognizer.view else {
return
}
let translation = panGestureRecognizer.translation(in: piece.superview)
piece.center = CGPoint(x: piece.center.x + translation.x, y: piece.center.y + translation.y)
panGestureRecognizer.setTranslation(.zero, in: piece.superview)
}
public func gestureRecognizer(_ gestureRecognizer: UIGestureRecognizer,
shouldRecognizeSimultaneouslyWith otherGestureRecognizer: UIGestureRecognizer) -> Bool {
true
}
The pinch gesture's view resets to its original state after the gesture is done, which occurs in its else clause. What would be a good way to do the same for the pan gesture recognizer? Ideally I'd like the gesture recognizers to be in an extension of UIImageView, which would also mean that I can't add a stored property to the extension for tracking the initial state of the image view.
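One idea I've had, which avoids a stored property in the UIImageView extension, is to express the pan as a translation on the view's transform instead of mutating center, so resetting is just animating back to .identity:
@IBAction func panPiece(_ panGestureRecognizer: UIPanGestureRecognizer) {
    guard panGestureRecognizer.state == .began || panGestureRecognizer.state == .changed,
          let piece = panGestureRecognizer.view else {
        // Mirror the pinch behavior: snap back once the gesture ends.
        if panGestureRecognizer.state == .ended {
            UIView.animate(withDuration: 0.3) {
                panGestureRecognizer.view?.transform = .identity
            }
        }
        return
    }
    let translation = panGestureRecognizer.translation(in: piece.superview)
    piece.transform = piece.transform.translatedBy(x: translation.x, y: translation.y)
    panGestureRecognizer.setTranslation(.zero, in: piece.superview)
}
One wrinkle is that the pinch handler's reset to .identity would now also clear the pan translation, since both gestures share the transform, so I'm not sure this is the cleanest route.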
Given an AVAsset, I'm performing a Vision trajectory request on it and would like to write out a video asset that only contains frames with trajectories (filter out downtime in sports footage where there's no ball moving).
I'm unsure what would be a good approach, but as a starting point I tried the following pipeline:
Copy sample buffer from the source AVAssetReaderOutput.
Perform trajectory request on a vision handler parameterized by the sample buffer.
For each resulting VNTrajectoryObservation (trajectory detected), use its associated CMTimeRange to configure a new AVAssetReader set to that time range.
Append the time range constrained sample buffer to one AVAssetWriterInput until the forEach is complete.
In code:
private func transferSamplesAsynchronously(from readerOutput: AVAssetReaderOutput,
to writerInput: AVAssetWriterInput,
onQueue queue: DispatchQueue,
sampleBufferProcessor: SampleBufferProcessor,
completionHandler: @escaping () -> Void) {
/*
The writerInput continuously invokes this closure until finished or
cancelled. It throws an NSInternalInconsistencyException if called more
than once for the same writer.
*/
writerInput.requestMediaDataWhenReady(on: queue) {
var isDone = false
/*
While the writerInput accepts more data, process the sampleBuffer
and then transfer the processed sample to the writerInput.
*/
while writerInput.isReadyForMoreMediaData {
if self.isCancelled {
isDone = true
break
}
// Get the next sample from the asset reader output.
guard let sampleBuffer = readerOutput.copyNextSampleBuffer() else {
// The asset reader output has no more samples to vend.
isDone = true
break
}
let visionHandler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, orientation: self.orientation, options: [:])
do {
try visionHandler.perform([self.detectTrajectoryRequest])
if let results = self.detectTrajectoryRequest.results {
try results.forEach { result in
let assetReader = try AVAssetReader(asset: self.asset)
assetReader.timeRange = result.timeRange
let trackOutput = AVTrackOutputs.firstTrackOutput(ofType: .video, fromTracks: self.asset.tracks,
withOutputSettings: nil)
assetReader.add(trackOutput)
assetReader.startReading()
guard let sampleBuffer = trackOutput.copyNextSampleBuffer() else {
// The asset reader output has no more samples to vend.
isDone = true
return
}
// Append the sample to the asset writer input.
guard writerInput.append(sampleBuffer) else {
/*
The writer could not append the sample buffer.
The `readingAndWritingDidFinish()` function handles any
error information from the asset writer.
*/
isDone = true
return
}
}
}
} catch {
print(error)
}
}
if isDone {
/*
Calling `markAsFinished()` on the asset writer input does the
following:
1. Unblocks any other inputs needing more samples.
2. Cancels further invocations of this "request media data"
callback block.
*/
writerInput.markAsFinished()
/*
Tell the caller the reader output and writer input finished
transferring samples.
*/
completionHandler()
}
}
}
private func readingAndWritingDidFinish(assetReaderWriter: AVAssetReaderWriter,
completionHandler: @escaping FinishHandler) {
if isCancelled {
completionHandler(.success(.cancelled))
return
}
// Handle any error during processing of the video.
guard sampleTransferError == nil else {
assetReaderWriter.cancel()
completionHandler(.failure(sampleTransferError!))
return
}
// Evaluate the result reading the samples.
let result = assetReaderWriter.readingCompleted()
if case .failure = result {
completionHandler(result)
return
}
/*
Finish writing, and asynchronously evaluate the results from writing
the samples.
*/
assetReaderWriter.writingCompleted { result in
completionHandler(result)
return
}
}
When run I get the following: no error is caught in the first catch clause, none is caught in private func readingAndWritingDidFinish(assetReaderWriter: AVAssetReaderWriter, completionHandler: @escaping FinishHandler), and the completion handler is called.
Help with any of the following questions would be appreciated:
What is causing what appears to be indefinite loading?
How might I isolate the problem further?
Am I misusing or misunderstanding how to selectively read from time ranges of AVAssetReader objects?
Should I forego the AVAssetReader / AVAssetWriter route entirely, and use the time ranges with AVAssetExportSession instead? I don't know how the two approaches compare, or what to consider when choosing between them.
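Regarding the last question, the export-session variant I have in mind would set timeRange per detected segment and write each segment to its own file, roughly like this (segmentURL(for:) is a placeholder):
// Sketch of the AVAssetExportSession alternative: one pass per detected range.
func exportSegments(of asset: AVAsset, ranges: [CMTimeRange]) {
    for (index, range) in ranges.enumerated() {
        guard let export = AVAssetExportSession(asset: asset,
                                                presetName: AVAssetExportPresetPassthrough) else { continue }
        export.timeRange = range              // only this segment is read and written
        export.outputURL = segmentURL(for: index)
        export.outputFileType = .mov
        export.exportAsynchronously {
            // Check export.status / export.error per segment here.
        }
    }
}
My rough understanding is that the reader/writer route gives per-sample control (and lets me run Vision in the same pass), while the export-session route is less code but coarser, and I'd still need a composition step to stitch the segments back together; corrections to that mental model would help.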
I am saving time ranges from an input video asset where trajectories are found, then exporting only those segments to an output video file.
Currently I track these time ranges in a stored property var timeRangesOfInterest: [Double : CMTimeRange], which is set in the trajectory request's completion handler
func completionHandler(request: VNRequest, error: Error?) {
guard let request = request as? VNDetectTrajectoriesRequest else { return }
if let results = request.results,
results.count > 0 {
for result in results {
var timeRange = result.timeRange
timeRange.start = timeRange.start - self.assetWriterStartTime
self.timeRangesOfInterest[timeRange.start.seconds] = timeRange
}
}
}
Then these time ranges of interest are used in an export session to only export those segments
/*
Finish writing, and asynchronously evaluate the results from writing
the samples.
*/
assetReaderWriter.writingCompleted { result in
self.exportVideoTimeRanges(timeRanges: self.timeRangesOfInterest.map { $0.1 }) { result in
completionHandler(result)
}
}
Unfortunately, I'm getting repeated trajectory video segments in the output video. Is this maybe because trajectory requests return "in progress" repeated trajectory results with slightly different time range start times? What might be a good strategy for avoiding or removing them? I've also noticed that trajectory segments appear out of order in the output.
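The workaround I'm considering is to sort the collected ranges and merge any that overlap before exporting, something like:
import CoreMedia
// Sort by start time, then coalesce overlapping or touching ranges so that
// "in progress" observations of the same trajectory collapse into one segment.
func mergedTimeRanges(_ ranges: [CMTimeRange]) -> [CMTimeRange] {
    let sorted = ranges.sorted { $0.start < $1.start }
    var merged = [CMTimeRange]()
    for range in sorted {
        if let last = merged.last, last.end >= range.start {
            merged[merged.count - 1] = CMTimeRangeGetUnion(last, otherRange: range)
        } else {
            merged.append(range)
        }
    }
    return merged
}
Sorting before merging should also address the out-of-order segments, since the dictionary I'm storing the ranges in doesn't preserve any order. Is this a reasonable strategy, or is there a better way to consume in-progress trajectory results?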
I'm building a feature to automatically edit out all the downtime of a tennis video. I have a partial implementation that stores the start and end times of Vision trajectory detections and writes only those segments to an AVFoundation export session.
I've encountered a major issue, which is that the returned trajectories end whenever the ball bounces, so each segment is just one tennis shot and nowhere close to an entire rally with multiple bounces. I'm unsure whether I should continue down the trajectory route, maybe stitching together the trajectories and somehow only splitting at the start and end of a rally.
Any general guidance would be appreciated.
Is there a different Vision or ML approach that would more accurately model the start and end time of a rally? I considered creating a custom action classifier to classify frames as either "playing tennis" or "inactivity," but I started with Apple's trajectory detection since it was already built and trained. Maybe a custom classifier is needed, but I'm not sure.
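If I do stay with trajectories, the only stitching idea I have so far is to merge segments whose gap is under some threshold, treating a short gap as "the rally continued":
import CoreMedia
// Merge trajectory segments separated by less than `maximumGap` (e.g. the time
// between a bounce and the next shot) into a single rally-level range.
// The 1.5-second default is an arbitrary guess I'd tune against real footage.
func rallyRanges(from segments: [CMTimeRange],
                 maximumGap: CMTime = CMTime(seconds: 1.5, preferredTimescale: 600)) -> [CMTimeRange] {
    let sorted = segments.sorted { $0.start < $1.start }
    var rallies = [CMTimeRange]()
    for segment in sorted {
        if let last = rallies.last, segment.start - last.end <= maximumGap {
            rallies[rallies.count - 1] = CMTimeRange(start: last.start, end: segment.end)
        } else {
            rallies.append(segment)
        }
    }
    return rallies
}
I'm not confident a fixed gap threshold generalizes across points, which is why I'm wondering whether an action classifier is the more robust route.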