How to calculate real world X and Y position for each point in AVDepthData

Hi All,
I am using the following code to convert an AVDepthData object to a list of real-world XYZ coordinates. However, I am getting strange results: the resulting point cloud is skewed.


I am unsure about the calculation of X and Y, although I took the code from the WWDC video on this topic.

Also, I am using the inverseLensDistortionLookupTable to rectify my depth map. Is this correct?


Thank you so much

private func getPoints(avDepthData: AVDepthData) -> [PointXYZ] {
        print(avDepthData.depthDataMap.pixelFormatName())
        let depthData = avDepthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
        guard let intrinsicMatrix = avDepthData.cameraCalibrationData?.intrinsicMatrix,
            // I am using the inverseLensDistortionLookupTable here. Is this correct?
        let depthDataMap = rectifyDepthData(avDepthData: depthData) else {
            return []
        }
    
        CVPixelBufferLockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
       
        let width = CVPixelBufferGetWidth(depthDataMap)
        let height = CVPixelBufferGetHeight(depthDataMap)
       
        var points = [PointXYZ]()

       
        for y in 0 ..< height{
            for x in 0 ..< width{
                let Z = getDistance(at: CGPoint(x: x, y: y) , depthMap: depthDataMap, depthWidth: width, depthHeight: height)
               
                if(Z == nil){
                    continue
                }
           
                // as seen in wwdc video -> https://developer.apple.com/videos/play/wwdc2018/503/?time=1498
                let X = (Float(x) - intrinsicMatrix[2][0]) * Z! / intrinsicMatrix[0][0]
                let Y = (Float(y) - intrinsicMatrix[2][1]) * Z! / intrinsicMatrix[1][1]
               
                let point = PointXYZ(x: X, y: Y, z: Z!)
                points.append(point)
            }
        }
        CVPixelBufferUnlockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
       
        return points
    }
Answered by dieselboris in 323331022

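I managed to solve this by bringing the focal lengths and the principal point into the same coordinate system as the depth map. The intrinsic matrix is expressed in pixels of the full-resolution photo (PHOTO_WIDTH by PHOTO_HEIGHT in the code below), so its entries have to be scaled to the resolution of the depth map.

Working code:
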
private func getPoints(avDepthData: AVDepthData) -> [PointXYZ] {
        let depthData = avDepthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
        guard let intrinsicMatrix = avDepthData.cameraCalibrationData?.intrinsicMatrix,
            let depthDataMap = rectifyDepthData(avDepthData: depthData) else {
            return []
        }
       
        CVPixelBufferLockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
       
        let width = CVPixelBufferGetWidth(depthDataMap)
        let height = CVPixelBufferGetHeight(depthDataMap)
       
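        var points = [PointXYZ]()
        // Scale the intrinsics from the photo's resolution to the depth map's resolution.
        // PHOTO_WIDTH and PHOTO_HEIGHT are constants defined elsewhere (not shown in this post), presumably of type Float.
        let focalX = Float(width) * (intrinsicMatrix[0][0] / PHOTO_WIDTH)
        let focalY = Float(height) * (intrinsicMatrix[1][1] / PHOTO_HEIGHT)
        let principalPointX = Float(width) * (intrinsicMatrix[2][0] / PHOTO_WIDTH)
        let principalPointY = Float(height) * (intrinsicMatrix[2][1] / PHOTO_HEIGHT)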
        for y in 0 ..< height{
            for x in 0 ..< width{
                guard let Z = getDistance(at: CGPoint(x: x, y: y) , depthMap: depthDataMap, depthWidth: width, depthHeight: height) else {
                    continue
                }
               
                let X = (Float(x) - principalPointX) * Z / focalX
                let Y = (Float(y) - principalPointY) * Z / focalY
                points.append(PointXYZ(x: X, y: Y, z: Z))
            }
        }
        CVPixelBufferUnlockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
       
        return points
    }
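
The scaling step in the answer can be isolated from AVFoundation and checked on its own. The following is only a minimal sketch under stated assumptions: PixelSize and PinholeIntrinsics are hypothetical helper types that do not appear in the original post, and the numbers in the usage lines are made up for illustration. On a device, the reference size could probably be read from cameraCalibrationData.intrinsicMatrixReferenceDimensions instead of hard-coded constants such as PHOTO_WIDTH and PHOTO_HEIGHT.

```swift
// Hypothetical helper types for illustration; not part of the original post.
struct PixelSize {
    var width: Float
    var height: Float
}

struct PinholeIntrinsics {
    var fx: Float  // focal length in pixels along x
    var fy: Float  // focal length in pixels along y
    var cx: Float  // principal point, x coordinate in pixels
    var cy: Float  // principal point, y coordinate in pixels

    /// Rescales intrinsics defined for `reference` pixels to an image of `target` pixels.
    func scaled(from reference: PixelSize, to target: PixelSize) -> PinholeIntrinsics {
        let sx = target.width / reference.width
        let sy = target.height / reference.height
        return PinholeIntrinsics(fx: fx * sx, fy: fy * sy, cx: cx * sx, cy: cy * sy)
    }

    /// Back-projects pixel (x, y) with depth z into camera space (pinhole model).
    func backProject(x: Int, y: Int, depth z: Float) -> (x: Float, y: Float, z: Float) {
        let worldX = (Float(x) - cx) * z / fx
        let worldY = (Float(y) - cy) * z / fy
        return (x: worldX, y: worldY, z: z)
    }
}

// Made-up example values: a 4032 x 3024 photo and a 640 x 480 depth map.
let photo = PixelSize(width: 4032, height: 3024)
let depthMap = PixelSize(width: 640, height: 480)
let full = PinholeIntrinsics(fx: 2880, fy: 2880, cx: 2016, cy: 1512)

let k = full.scaled(from: photo, to: depthMap)
let p = k.backProject(x: 420, y: 240, depth: 2.0)
print(k, p)
```

Using the unscaled matrix on the small depth map, as in the question, leaves the principal point far outside the image and the focal length several times too large, which would explain a skewed point cloud.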

I'm trying to do something very similar and just came across your thread. Are you trying to do this with the front- or back-facing camera? I was able to get through the distortion correction and adjust the intrinsic matrix to the depth map's pixel space, but my measurements still seem to be off. I believe this is because I'm using the dual camera and the depth map values are all relative rather than absolute. Was just wondering if you ran into this problem? Thanks in advance.

Hi dieselboris,


Could you kindly provide the code for the function rectifyDepthData and getDistance?

This would help me a lot!
Thanks and Kind regards,

Rico
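
The original rectifyDepthData and getDistance were never posted in this thread, so the following is only a sketch of what such helpers might do, not dieselboris's code. The names depthSample and lensDistortionPoint are hypothetical. The second function is modeled on the reference implementation in Apple's AVCameraCalibrationData documentation, ported to plain Swift so it can be run without a device. In a real app the buffer pointer and stride would come from CVPixelBufferGetBaseAddress and CVPixelBufferGetBytesPerRow, and the table from cameraCalibrationData. As I read Apple's documentation, rectifying means iterating over the output pixels and using lensDistortionLookupTable to find the matching source pixel, while inverseLensDistortionLookupTable maps in the other direction; please verify this against the current docs. The distortion center also appears to be given in the reference dimensions, so it would need the same scaling as the intrinsics.

```swift
/// Reads one Float32 depth sample from a row-major buffer whose rows may be padded.
/// Returns nil for out-of-range pixels and for invalid samples (NaN, infinite, <= 0).
func depthSample(x: Int, y: Int, width: Int, height: Int,
                 bytesPerRow: Int, base: UnsafeRawPointer) -> Float? {
    guard x >= 0, x < width, y >= 0, y < height else { return nil }
    let offset = y * bytesPerRow + x * MemoryLayout<Float>.stride
    let value = base.load(fromByteOffset: offset, as: Float.self)
    guard value.isFinite, value > 0 else { return nil }
    return value
}

/// Maps a point through a radial lookup table of relative magnifications.
/// The first table entry belongs to radius 0, the last to the largest radius in the image.
func lensDistortionPoint(x px: Float, y py: Float, lookupTable: [Float],
                         centerX: Float, centerY: Float,
                         imageWidth: Float, imageHeight: Float) -> (x: Float, y: Float) {
    // With an empty table there is nothing to apply; return the point unchanged.
    guard let last = lookupTable.last else { return (x: px, y: py) }

    // Largest distance from the distortion center to any image corner.
    let dxMax = max(centerX, imageWidth - centerX)
    let dyMax = max(centerY, imageHeight - centerY)
    let rMax = (dxMax * dxMax + dyMax * dyMax).squareRoot()

    // Vector from the distortion center to the point, and its length.
    let vx = px - centerX
    let vy = py - centerY
    let r = (vx * vx + vy * vy).squareRoot()

    // Linearly interpolate the magnification; fall back to the last entry at or beyond rMax.
    var magnification = last
    if r < rMax, lookupTable.count >= 2 {
        let position = r * Float(lookupTable.count - 1) / rMax
        let index = min(Int(position), lookupTable.count - 2)
        let fraction = position - Float(index)
        magnification = (1 - fraction) * lookupTable[index] + fraction * lookupTable[index + 1]
    }

    return (x: centerX + vx * (1 + magnification), y: centerY + vy * (1 + magnification))
}
```

A rectifyDepthData along these lines would loop over every pixel of an empty output buffer, call lensDistortionPoint to find where that pixel lies in the distorted depth map, and copy the sample found there.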
