Since updating to iOS 26.0 (and confirmed on 26.1), ARBodyTrackingConfiguration no longer detects a valid ARBodyAnchor on devices with LiDAR (e.g., iPhone 15 Pro, iPhone 17 Pro Max).
This issue reproduces in custom projects and Apple’s official sample “Capturing Body Motion in 3D”.
The AR session runs normally, but the delegate call:
func session(_ session: ARSession, didUpdate anchors: [ARAnchor])
never yields an ARBodyAnchor with valid joint transforms.
All joints return nil when calling:
body.skeleton.modelTransform(for: jointName)
resulting in 0 valid joints per frame.
Environment
• Device: iPhone 17 Pro Max (LiDAR)
• iOS: 26.0 / 26.1
• Xcode: 16.0 (stable)
• Framework: ARKit + RealityKit
• Configuration used:
let config = ARBodyTrackingConfiguration()
config.worldAlignment = .gravityAndHeading
config.isAutoFocusEnabled = true
config.environmentTexturing = .none
session.run(config)
Also tested: with and without frameSemantics = .bodyDetection
Expected Behavior
ARBodyAnchor should be detected and body.skeleton should contain ~89 valid joints with continuous updates.
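For reference, a minimal per-frame diagnostic along these lines (assuming the delegate wiring described above) can be used to count the valid joints:
func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
    for case let bodyAnchor as ARBodyAnchor in anchors {
        let skeleton = bodyAnchor.skeleton
        let jointNames = skeleton.definition.jointNames
        // Count joints for which modelTransform(for:) returns a transform this frame.
        let validJoints = jointNames.filter {
            skeleton.modelTransform(for: ARSkeleton.JointName(rawValue: $0)) != nil
        }
        print("isTracked: \(bodyAnchor.isTracked), valid joints: \(validJoints.count)/\(jointNames.count)")
    }
}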
I am using AccessoryTrackingProvider from ARKit to get the transform of the PSVR2 controller via originFromAnchorTransform of the AccessoryAnchor. I am also trying to attach an AnchorEntity to the controller using RealityKit.
However, none of the three options for Accessory.LocationName, which define the AnchorEntity target, seems to match the position on the controller that ARKit reports.
The picture attached is showing two transforms:
RealityKit - using .gripSurface to define the AnchoringComponent.Target.accessory location.
ARKit - using originFromAnchorTransform for AccessoryTrackingProvider.
They are not aligned at the same point.
As for the other options of Accessory.LocationName: .aim is located at the tip of the controller, and .grip is at the same position as .gripSurface but with a different orientation.
I am wondering why there is no Accessory.LocationName option that actually matches the transform reported by ARKit.
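To make the mismatch concrete, a small hypothetical helper like this (the name is mine; it only assumes you already have the AccessoryAnchor transform from ARKit and the entity anchored via .gripSurface) can quantify the offset between the two poses:
import RealityKit
import simd

func offsetBetween(arkitTransform: simd_float4x4, anchoredEntity: Entity) -> simd_float4x4 {
    // World-space pose of the entity driven by the AnchorEntity.
    let rkTransform = anchoredEntity.transformMatrix(relativeTo: nil)
    // Relative transform that maps the ARKit pose onto the RealityKit pose.
    return simd_mul(arkitTransform.inverse, rkTransform)
}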
I'd like to have a bit of control over the transparency of a VideoMaterial. Is there any way to prepare a ShaderGraph unlit shader and use it with a VideoMaterial?
Hi there,
I received an enterprise license file to include the enhanced object tracking configuration for the Vision Pro. My account is part of the team that was granted this capability by Apple. Unfortunately, although I followed the guide, I cannot find the Object Tracking capability when I try to add it to my project. There are other capabilities, like Main Camera on the Vision Pro, but none for Object Tracking. I am using Xcode 26.1 and visionOS 26.1. What am I missing here?
Thanks in advance,
Matthias
Problem Description:
I am developing an application that runs in the Shared Space on Apple Vision Pro using Unity. When using the Unity UI ScrollView component, I found that Mask / RectMask2D does not function in the Shared Space.
Scrolling content is not masked or cropped; it extends beyond the view boundary and is displayed directly.
The same UI works correctly across platforms such as Unity Editor, iOS, and macOS, but the issue only occurs in the shared space of Vision Pro.
Reproduction steps:
Create a ScrollView in Unity.
Add a Mask or RectMask2D to the viewport.
Deploy the application to Apple Vision Pro and run it in Shared Space mode.
Scrolled content is not clipped by the mask; the masked area is entirely ineffective.
Expected behavior:
The content of ScrollView should be properly clipped by Mask / RectMask2D and should not render outside the mask boundary.
Actual results:
In the shared space of Vision Pro, the mask is ineffective, causing scrolling content to extend beyond the designated area and resulting in severe UI distortion.
Environmental Information:
Device: Apple Vision Pro
Mode: Shared Space
Unity Version: 6000.0.40f1
visionOS version: visionOS 26.0
Unity PolySpatial Version: 2.0.4
Impact
This issue causes Unity UI to fail to display correctly on Vision Pro, preventing ScrollView from properly clipping content, which impacts the UI experience and interaction effects in practical applications.
Expected Result: When running a Unity app in the shared space of visionOS, the Mask / RectMask2D of ScrollView functions correctly
Why do the hand-tracking joints have these axes? I'm trying to create a virtual hand model, and the joint orientations make that a mess.
My development team admin requested the Enterprise API for camera access on the vision pro. We got that granted, got a license for usage, and got instructions for integrating it with next steps.
We did the following: added the capability, created a new provisioning profile with this access, added the required entries to the Info.plist and the .entitlements file, replaced the dummy license file with the one we were sent, and verified a matching bundle identifier and development certificate. It still does not show camera access for some reason.
Even when I download and run the sample project for "Accessing the Main Camera" and follow the exact instructions mentioned here: https://developer.apple.com/documentation/visionos/accessing-the-main-camera
I am still unable to receive camera frames.
"Main Camera Access" shows up in our Signing & Capabilities tab, and we also added the NSMainCameraDescription in the Info.plist and allow access while opening the app. None of this works. Not on my app, and not on the sample app that I just downloaded and tried to run on the Vision Pro after replacing the dummy license file.
I'm capturing a room via RoomPlan API and would like to access the DepthMap(sceneDepth) or SmoothDepthMap(smoothedSceneDepth) from my own provided ARSession for RoomCaptureSession.
But both depth maps come back empty (nil) in the delegate callbacks, and I have not found a solution yet. Is it even possible? I have not found any documentation on what RoomCaptureSession overrides in the ARSession when I provide my own ARSession instance.
Here is an example code snippet of what I'm trying to do:
import ARKit
import RoomPlan

private let arSession = ARSession()
private lazy var roomPlanCaptureSession = RoomCaptureSession(arSession: arSession)

let arConfig = ARWorldTrackingConfiguration()

// Create the frame semantics for the configuration used by the custom ARSession.
var semantics: ARConfiguration.FrameSemantics = []
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    semantics.insert(.sceneDepth)
}
if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
    semantics.insert(.smoothedSceneDepth)
}
arConfig.frameSemantics = semantics

// Set delegates.
roomPlanCaptureSession.delegate = self
arSession.delegate = self

// Run the ARSession only if the device supports scene depth.
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    arSession.run(arConfig)
} else {
    print(".sceneDepth is unsupported.")
}

// Run the RoomPlan capture session with its own configuration.
let captureConfig = RoomCaptureSession.Configuration()
roomPlanCaptureSession.run(configuration: captureConfig)

// Trying to read sceneDepth from the ARSession delegate.
public func session(_ session: ARSession, didUpdate frame: ARFrame) {
    print("session delegate capture: sceneDepth: \(String(describing: frame.sceneDepth))")
    // Prints: session delegate capture: sceneDepth: nil
}
Also, in this video from 2023 it is said that I can pass a custom ARSession to RoomPlan:
Explore enhancements to RoomPlan - Video
Quote 3:00: Here is the init and stop function in previous RoomPlan. And here is how you pass over a custom ARSession to init function. Any custom ARSession with ARWorldTrackingConfiguration will be honored inside RoomCaptureSession.
Anyway, I welcome any input. Maybe I'm doing something wrong. :)
Best approach for high-quality textured room reconstruction using ARKit / RoomPlan / Object Capture?
I am developing an iOS app that allows users to scan rooms, view the scans on device, and add notes. I need to preserve the actual geometry (odd angles, chamfers, fixtures), not simplified RoomPlan boxes.
Are there any easy ways to incorporate high quality texture mapping or PBR? Where is the documentation for scene reconstruction?
I have an iOS app that can display a USDZ model downloaded from the Internet (and cached locally) via an ARView.
I would like to light that model with an image based light (IBL) also downloaded from the Internet.
However, as far as I can tell, ARView can only create an IBL from a resource that has been compiled into the Xcode project and loaded with EnvironmentResource(named:in:) or EnvironmentResource.load(named:in:).
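For context, the bundled-resource path looks roughly like this (the resource name is just an example); what I'm after is the equivalent starting from a file URL to a downloaded HDRI:
import RealityKit

// Works today: the environment resource must be compiled into the app bundle.
let ibl = try EnvironmentResource.load(named: "studio_ibl")
arView.environment.lighting.resource = ibl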
Is there a way to create an EnvironmentResource from an HDRI via a file URL to use in ARView in iOS?
I downloaded the official sample project “Accessing the Main Camera”, but I found that it’s not able to retrieve the camera feed on visionOS 26.1. After checking the debug logs, it seems the issue is caused by the system being unable to find the expected format.
I tested on a device running visionOS 2, and the camera feed worked correctly — but only when using the sample code from the visionOS 2 version, not the current one. I also noticed that some of the APIs have changed between versions.
Has anyone managed to successfully access the camera feed on visionOS 26.1?
Hi team,
I believe I’ve found a registration issue between ARFrame.sceneDepth and ARFrame.capturedImage when using high-resolution frame capture on a 2022 iPad Pro (6th gen).
When enabling high-resolution capture:
if let highResFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
    config.videoFormat = highResFormat
}
…
arView.session.captureHighResolutionFrame { ... }
the depth map provided by ARFrame.sceneDepth no longer aligns correctly with the corresponding high-resolution capturedImage.
This misalignment results in consistently over-estimated distance measurements in my app (which relies on mapping depth to 2D pixel coordinates).
iPad Pro (6th gen): misalignment occurs only when capturing high-resolution frames.
iPhone 16 Pro: depth is correctly registered for both standard and high-resolution captures.
It appears the camera intrinsics, specifically the FOV, change between the “regular” resolution stream and the high-resolution capture on the iPad. My suspicion is that the depth data continues using the intrinsics of the lower resolution stream, resulting in an unregistered depth-to-RGB mapping.
Once I have the iPad in hand again, I will confirm whether camera.intrinsics or FOV differ between the low-res and high-res frames.
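Here is a rough sketch of that check (it assumes the same high-resolution setup shown above): compare the intrinsics and image resolution of the streaming frame with those of the high-resolution frame.
func compareIntrinsics(in arView: ARView) {
    guard let streamingFrame = arView.session.currentFrame else { return }
    print("Streaming intrinsics:\n\(streamingFrame.camera.intrinsics)")
    print("Streaming resolution: \(streamingFrame.camera.imageResolution)")

    arView.session.captureHighResolutionFrame { highResFrame, error in
        guard let highResFrame = highResFrame else {
            print("High-res capture failed: \(String(describing: error))")
            return
        }
        print("High-res intrinsics:\n\(highResFrame.camera.intrinsics)")
        print("High-res resolution: \(highResFrame.camera.imageResolution)")
        // If sceneDepth still follows the streaming intrinsics, the two matrices
        // (after scaling for resolution) will not agree.
    }
}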
Is this a known issue with high-resolution frame capture on the 2022 iPad Pro? If not, I’m happy to provide some more thorough sample code.
Thanks for your time!
I use ARKit for motion tracking. I get the skeleton joint coordinates and use them for animation. I didn't make any changes to the code, but I updated the iOS version from 18 to 26, and modelTransform now always returns nil.
https://developer.apple.com/documentation/arkit/arskeleton3d/modeltransform(for:)
For example
bodyAnchor.skeleton.modelTransform(for: .init(rawValue: "head_joint"))
bodyAnchor is ARBodyAnchor.
I see the default skeleton on the screen, but now I can't get the coordinates out of it.
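A diagnostic along these lines (it only assumes bodyAnchor is the ARBodyAnchor from the sample) can show whether the raw joint data is still present even though modelTransform(for:) returns nil:
let skeleton = bodyAnchor.skeleton
let definition = skeleton.definition
// Raw, indexed transforms, independent of modelTransform(for:).
let headIndex = definition.index(for: .head)
let rawHead = skeleton.jointModelTransforms[headIndex]
let apiHead = skeleton.modelTransform(for: .head)
print("head tracked: \(skeleton.isJointTracked(headIndex)), raw: \(rawHead), via modelTransform: \(String(describing: apiHead))")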
I'm using an example from Apple's WWDC presentation.
https://developer.apple.com/documentation/arkit/capturing-body-motion-in-3d
Are there any changes in the API, or is this just a bug?
Hi everyone! I am working on an AR app and wanted to implement object occlusion because it pretty much removes drift from the object. This works great with the RealityKit sample, but I am unable to replicate that behaviour with SceneKit, because SceneKit does not offer object occlusion. Can we say SceneKit is being deprecated, and should we rewrite the app in RealityKit (which is obviously a big task)?
I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?
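One thing I am considering, under the assumption that the fade comes from the system compositing the upper limbs over app content, is hiding the upper limbs for the whole immersive space (the view name below is hypothetical):
import SwiftUI

ImmersiveSpace(id: "RemoteControlSpace") {
    RemoteControlImmersiveView()   // hypothetical immersive content view
}
// Keeps entities fully visible over the hands, at the cost of never seeing the real hands.
.upperLimbVisibility(.hidden)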
While using apple's vision pro, we noticed that we can continue to use the visionOS keyboard when we no longer actually see it in passthrough.
In other words, when we focus on a field to type, visionOS displays the keyboard for us in such a way that we actually see it. Then, we noticed if we look away a little bit, either up, or down, or left, or right, in such a way that the keyboard is no longer visible by us in the passthrough, the keyboard still remains responsive to taps from our fingers at the location where it is. It seems the keyboard remains functional and responsive to taps even though we can no longer observe/see it.
We are trying to figure out how to implement similar functionality in our app whereby the user can continue to manipulate a 3d entity when the user can no longer actually observe it in passthrough (like the visionOS keyboard appears to allow).
I assume the visionOS keyboard has this functionality thanks to the downward-facing sensors on the hardware that allow hand tracking even though the hands can no longer be observed by the user. That is likely how we can rest our hands on our lap and still be able to interact with visionOS.
How can we implement a similar functionality for 3D entities?
Is there a way to tap into, or to allow hand tracking from, those downward-facing cameras?
Is it possible to manipulate a 3D entity when it is no longer observed by the user for example when they shift their attention somewhere else in the field of vision?
How does the visionOS keyboard achieve this?
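For what it's worth, the kind of setup we have in mind is sketched below; it assumes ARKit hand tracking keeps delivering anchor updates regardless of where the user is looking, and the names are illustrative:
import ARKit
import RealityKit

let session = ARKitSession()
let handTracking = HandTrackingProvider()

func startDrivingEntity(_ entity: Entity) async throws {
    try await session.run([handTracking])
    for await update in handTracking.anchorUpdates {
        let anchor = update.anchor
        guard anchor.chirality == .right, anchor.isTracked else { continue }
        // Follow the hand whether or not the user is currently looking at the entity.
        entity.setTransformMatrix(anchor.originFromAnchorTransform, relativeTo: nil)
    }
}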
https://developer.apple.com/documentation/arkit/arpointcloud
https://developer.apple.com/documentation/arkit/arframe/rawfeaturepoints
The point cloud's (collection of points/features) main intention is a debug visualization of what the underlying tracking algorithm processes; it is not designed for additional algorithms on top of it. But we are utilizing information contained in the points/features collected by ARKit.
Currently, the range of rawFeaturePoints is limited to about 10 meters from the device.
We see a great opportunity if that range limit were unlocked: global localization would become more robust and accurate.
Video: ARPointCloud - Apple ARKit - FindSurface (YouTube: SIdQRiLj2jY)
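To illustrate the limit we are talking about, a small delegate sketch like this (assuming a running ARWorldTrackingConfiguration session) reports how far the raw feature points are from the device:
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard let points = frame.rawFeaturePoints?.points else { return }
    let devicePosition = simd_make_float3(frame.camera.transform.columns.3)
    let distances = points.map { simd_distance($0, devicePosition) }
    if let maxDistance = distances.max() {
        print("Feature points: \(points.count), farthest: \(maxDistance) m")
    }
}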
I have a visionOS app where I instantiate ARKitSession and various providers (HandTrackingProvider and WorldTrackingProvider) in my appModel. That way, I can pass these providers to a Task which runs a gRPC server for sending the data from these providers to a client. When the users enters the immersive space of the app, the ARKitSession will run the providers if they are not running already.
I am now trying to implement the AccessoryTrackingProvider with the PSVR sense controllers but it does not fit with my current framework because the controllers may not be connected when the ARKitSession.run function is called. So I need to find a new place to start the session.
My question is, if I already have a session which is running the hand and world tracking providers, can I start another session to run the accessory tracking? Should they all be running on the same session?
Is there a way to stop the session and restart it when the controllers are connected? When I tried this, I got an error saying "It is not possible to re-run a stopped data provider (<ar_hand_tracking_provider_t: ". But if I instantiate a new HandTrackingProvider, then the one that got passed to the gRPC task would no longer be the one running in the new session.
Any advice on how best to manage the various providers and ARKit sessions would be greatly appreciated.
Using the example code posted here:
https://developer.apple.com/documentation/visionOS/tracking-images-in-3d-space
I can register multiple ReferenceImages with an ImageTrackingProvider, but only one updates at a time: to get real-time updates, I can only have one ImageAnchor in my field of view at a time.
Is it possible to track multiple ImageAnchors at the same time in the same field of view? That is, having several ImageAnchors tracked, with their entities updated to the anchors' transforms in the same frame/moment, on Apple Vision Pro?
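Roughly, this is what I am trying to do (a sketch; the image group name and entity bookkeeping are mine): keep one entity per reference image and update every anchor that reports in a frame.
import ARKit
import RealityKit

let imageTracking = ImageTrackingProvider(
    referenceImages: ReferenceImage.loadReferenceImages(inGroupNamed: "TrackedImages")
)
var entitiesByImage: [String: Entity] = [:]   // keyed by reference image name

func processImageUpdates() async {
    for await update in imageTracking.anchorUpdates {
        let anchor = update.anchor
        guard anchor.isTracked,
              let name = anchor.referenceImage.name,
              let entity = entitiesByImage[name] else { continue }
        entity.setTransformMatrix(anchor.originFromAnchorTransform, relativeTo: nil)
    }
}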
I’m using ARKit + SceneKit (Swift) with ARWorldTrackingConfiguration and detectionImages to place a 3D object (USDZ via SCNScene(named:)) when a reference image is detected. While the image is tracked, the object stays correctly aligned.
Goal: When the tracked image is no longer visible, I want the placed node to remain visible and fixed at its last known pose (no drifting) as I move the camera.
What works so far: detect image → add node → track updates. When the image disappears → keep showing the node at its last pose.
Problem: After the image is no longer tracked, the node drifts as I move the device/camera. It looks like it’s still influenced by the (now unreliable) image anchor or accumulating small world-tracking errors.
Question: What’s the correct way in ARKit to “freeze” the node at its last known world transform once ARImageAnchor stops tracking, so it doesn’t drift?
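For reference, the reparenting approach I have in mind looks roughly like this (sceneView and placedNode are my names for the ARSCNView and the placed node); I am not sure it is the intended way:
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let imageAnchor = anchor as? ARImageAnchor else { return }
    if !imageAnchor.isTracked, placedNode.parent !== sceneView.scene.rootNode {
        // Capture the last known world pose, then detach from the image anchor's node
        // so later (unreliable) anchor updates no longer move the placed node.
        let frozenTransform = placedNode.simdWorldTransform
        placedNode.removeFromParentNode()
        sceneView.scene.rootNode.addChildNode(placedNode)
        placedNode.simdWorldTransform = frozenTransform
    }
}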