Hi!
I attempted to run a sample project for detecting human pose in photos, which can be found here:
https://developer.apple.com/documentation/vision/detecting-human-body-poses-in-3d-with-vision
The project works perfectly when run on my Macbook Pro M1, but it fails on Apple Vision Pro. After selecting the photo an endless loading screen is presented and the following output is produced in the console:
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Network path is nil: (null)
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Unable to perform the request: Error Domain=com.apple.Vision Code=9 "Async status object reported as failed but without an error" UserInfo={NSLocalizedDescription=Async status object reported as failed but without an error}.
de-activating session 70138 after timeout
It seems that VNDetectHumanBodyPose3DRequest is failing on Vision Pro for some reason. Are there any additional requirements for running Vision framework on VisionOS, that I might be missing?
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I'm using ARKitSession and PlaneDetectionProvider to detect planes. I have a basics process to create an entity for each detected plane. Each one will get a random color for the material.
Each plane is sized based on the bounds of the anchor provided by ARKit.
let mesh = MeshResource.generatePlane(
width: anchor.geometry.extent.width,
depth: anchor.geometry.extent.height
)
Then I'm using this to position each entity.
entity.transform = Transform(matrix: anchor.originFromAnchorTransform)
This seems to be the right method, but many (not all) planes are not where they should be. The sizes look OK, but the X and Y positions off.
Take this large green plane on the wall. It should span the entire wall, but it is offset along the X position so that it is pushed to the left from where the center of the anchor is.
When I visualize surfaces using the Xcode debugging tools, that tool reports the planes where I'd expect them to be.
Can you see what I'm getting wrong here? Full code below
struct Example068: View {
@State var session = ARKitSession()
@State private var planeAnchors: [UUID: Entity] = [:]
@State private var planeColors: [UUID: Color] = [:]
var body: some View {
RealityView { content in
} update: { content in
for (_, entity) in planeAnchors {
if !content.entities.contains(entity) {
content.add(entity)
}
}
}
.task {
try! await setupAndRunPlaneDetection()
}
}
func setupAndRunPlaneDetection() async throws {
let planeData = PlaneDetectionProvider(alignments: [.horizontal, .vertical, .slanted])
if PlaneDetectionProvider.isSupported {
do {
try await session.run([planeData])
for await update in planeData.anchorUpdates {
switch update.event {
case .added, .updated:
let anchor = update.anchor
if planeColors[anchor.id] == nil {
planeColors[anchor.id] = generatePastelColor()
}
let planeEntity = createPlaneEntity(for: anchor, color: planeColors[anchor.id]!)
planeAnchors[anchor.id] = planeEntity
case .removed:
let anchor = update.anchor
planeAnchors.removeValue(forKey: anchor.id)
planeColors.removeValue(forKey: anchor.id)
}
}
} catch {
print("ARKit session error \(error)")
}
}
}
private func generatePastelColor() -> Color {
let hue = Double.random(in: 0...1)
let saturation = Double.random(in: 0.2...0.4)
let brightness = Double.random(in: 0.8...1.0)
return Color(hue: hue, saturation: saturation, brightness: brightness)
}
private func createPlaneEntity(for anchor: PlaneAnchor, color: Color) -> Entity {
let mesh = MeshResource.generatePlane(
width: anchor.geometry.extent.width,
depth: anchor.geometry.extent.height
)
var material = PhysicallyBasedMaterial()
material.baseColor.tint = UIColor(color)
let entity = ModelEntity(mesh: mesh, materials: [material])
entity.transform = Transform(matrix: anchor.originFromAnchorTransform)
return entity
}
}
Topic:
Spatial Computing
SubTopic:
ARKit
I am trying to create an object in immersive space that is partially transparent (~50% opacity). I have implemented this in a few different ways including creating a model entity and setting its opacity component to 0.5, and creating a custom material with blending set to a transparent opacity of 0.5. These both work partially, as they behaved as intended for many cases, but seemingly randomly would act like occlusion material and block any other immersive content behind them, showing the real world instead.
Some notes: I am using RealityKit to render the semi-transparent object and an opaque object that is behind the semi-transparent object. I am using VisionOS 2.1, and am updating the location of the semi-transparent object often. Both objects are ModelEntities.
I would appreciate any guidance on how to implement this. Please let me know if there are any other questions.
In several visionOS apps, we readjust our scenes to the user's eye level (their heads). But, we have encountered issues whereby the WorldTrackingProvider returns bad/incorrect positions for the first x number of frames.
See below code which you can copy paste in any Immersive Space. Relaunch the space and observe the numberOfBadWorldInfos value is inconsistent.
a. what is the most reliable way to get the devices's position?
b. is this indeed a bug?
c. are we using worldInfo improperly?
d. as a workaround, in our apps we set to 10 the number of frames to let pass before using worldInfo, should we set our threshold differently?
import ARKit
import Combine
import OSLog
import SwiftUI
import RealityKit
import RealityKitContent
let SUBSYSTEM = Bundle.main.bundleIdentifier!
struct ImmersiveView: View {
let logger = Logger(subsystem: SUBSYSTEM, category: "ImmersiveView")
let session = ARKitSession()
let worldInfo = WorldTrackingProvider()
@State var sceneUpdateSubscription: EventSubscription? = nil
@State var deviceTransform: simd_float4x4? = nil
@State var numberOfBadWorldInfos = 0
@State var isBadWorldInfoLoged = false
var body: some View {
RealityView { content in
try? await session.run([worldInfo])
sceneUpdateSubscription = content.subscribe(to: SceneEvents.Update.self) { event in
guard let pose = worldInfo.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else {
return
}
// `worldInfo` does not return correct values for the first few frames (exact number of frames is unknown)
// - known SO: https://stackoverflow.com/questions/78396187/how-to-determine-the-first-reliable-position-of-the-apple-vision-pro-device
deviceTransform = pose.originFromAnchorTransform
if deviceTransform!.columns.3.y < 1.6 {
numberOfBadWorldInfos += 1
logger.warning("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)")
} else {
if !isBadWorldInfoLoged {
logger.info("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)")
}
isBadWorldInfoLoged = true // stop logging.
}
}
}
}
}
Hello
RemoteDeviceIdentifier returns nil and therefore it crashes the HoverEffect sample project.
I have vision26 beta 2 on both devices
what the correct method of running this code sample ?
My ARViewContainer code is not working. I don't know how to debug the issue and I don't know or see where my results is going to. I need help to resolve this issue. please help debug. See code below:
Hi all,
I'm running into an issue with an app that previously worked fine on device using visionOS 2.0. After updating to visionOS 26, the same code runs fine in the simulator but crashes on the device with the following error:
-[MTLDebugComputeCommandEncoder _validateThreadsPerThreadgroup:]:1330:
failed assertion `(threadsPerThreadgroup.width(32) * threadsPerThreadgroup.height(32) * threadsPerThreadgroup.depth(1))(1024) must be <= 832. (kernel threadgroup size limit)`
Is there any documented way to check or increase the allowed threadsPerThreadgroup size on Apple Vision Pro? Or any recommended workaround for this regression?
Thanks in advance!
Hi, I called it "perspective problem", but I'm not quite sure what it is. I have a tag that I track with builtin camera. I calculate its pose, then use extrinsics and device anchor to calculate where to place entity with model.
When I place an entity that overlaps with physical object and start to look at it from different angles, the virtual object begins to move. Initially I thought that it's something wrong with calculations, or some image distortion closer to camera edges is affecting tag detection. To check, I calculated the position only once and displayed entity there, the physical tracked object is not moving. Now, when I move my head, so the object is more to the left, or right in my field of view, the virtual object becomes misaligned to the left, or right. It feels like a parallax effect, but distance from me to entity and to physical object are exactly the same.
Is that expected, because of some passthrough correction magic? And if so, can I somehow correct it back, so the entity always overlaps with object? I'm currently on v26 beta 5.
I also don't quite understand the camera extrinsics, because it seems that I need to flip it around X by 180 degrees to make it work in deviceAnchor * extrinsics.inverse * tag (shouldn't it be in same coordinates as all other RealityKit things?).
Is there any way to render a RealityView to an Image/UIImage like we used to be able to do using SCNView.snapshot() ?
ImageRenderer doesn't work because it renders a SwiftUI view hierarchy, and I need the currently presented RealityView with camera background and 3D scene content the way the user sees it
I tried UIHostingController and UIGraphicsImageRenderer like
extension View {
func snapshot() -> UIImage {
let controller = UIHostingController(rootView: self)
let view = controller.view
let targetSize = controller.view.intrinsicContentSize
view?.bounds = CGRect(origin: .zero, size: targetSize)
view?.backgroundColor = .clear
let renderer = UIGraphicsImageRenderer(size: targetSize)
return renderer.image { _ in
view?.drawHierarchy(in: view!.bounds, afterScreenUpdates: true)
}
}
}
but that leads to the app freezing and sending an infinite loop of
[CAMetalLayer nextDrawable] returning nil because allocation failed.
Same thing happens when I try
return renderer.image { ctx in
view.layer.render(in: ctx.cgContext)
}
Now that SceneKit is deprecated, I didn't want to start a new app using deprecated APIs.
We are working on a world scale AR app that leverages the device location and heading to place objects in the streets, so that they are correctly and stably anchored to certain locations.
Since the geo-tracking imagery is only available in certain cities and areas, we are trying to figure out how to fallback when geo-tracking is not available as the device move away, to still retain good AR camera accuracy. We might need to come up with some algorithm using the device GPS, to line up the ARCamera with our objects.
Question: Does geo-tracking always provide greater than or equal to the accuracy of world tracking, for a GPS outdoor AR experience?
If so, we can simply use the ARGeoTrackingConfiguration for the entire time, and rely on the ARView keeping itself aligned. Otherwise, we need to switch between it and ARWorldTrackingConfiguration when geo-tracking is not available and/or its accuracy is low, then roll our own algorithm to keep the camera aligned.
Thanks.
My development team admin requested the Enterprise API for camera access on the vision pro. We got that granted, got a license for usage, and got instructions for integrating it with next steps.
We did the following:
Even when I try to download and run the sample project for "Accessing the Main Camera", and follow all the exact instructions mentioned here: https://developer.apple.com/documentation/visionos/accessing-the-main-camera
I am just unable to receive camera frames.
I added the capabilities, created a new provisioning profile with this access, added the entitlements to info.plist and entitlements, replaced the dummy license file with the one we were sent, and also have a matching bundle identifier and development certificate, but it is still not showing camera access for some reason.
"Main Camera Access" shows up in our Signing & Capabilities tab, and we also added the NSMainCameraDescription in the Info.plist and allow access while opening the app. None of this works. Not on my app, and not on the sample app that I just downloaded and tried to run on the Vision Pro after replacing the dummy license file.
I am using AccessoryTrackingProvider from ARKit to get the transform of the PSVR2 controller via originFromAnchorTransform of the AccessoryAnchor. I also am trying to use AnchorEntity on the controller using RealityKit
However, none of the three options for Accessory.LocationName, which should be used to define the AnchorEntity target, seem to match the position on the controller which is being sent from ARKit.
The picture attached is showing two transforms:
RealityKit - using .gripSurface to define the AnchoringComponent.Target.accesssory location.
ARKit - using originFromAnchorTransform for AccessoryTrackingProvider.
They are not aligned at the same point.
As for the other options of Accessory.LocationName, using .aim is located at the tip of the controller and .grip is the same position as .gripSurface but with a different orientation.
I am wondering why there is not an option for Accessory.LocationName that actually matches the transform captured by ARKit?
Hello,
I was looking back into downloading the Tracking geographic locations in AR sample app from https://developer.apple.com/documentation/arkit/tracking-geographic-locations-in-ar
Unfortunately the Download links to the .zip of the DisplayingAPointCloudUsingSceneDepth sample project.
The exact same issue occurs when trying to download the sample code from https://developer.apple.com/documentation/ARKit/creating-a-fog-effect-using-scene-depth
Wondering if those links are deliberately broken because of possible deprecations.
Thanks to any Apple Engineer willing to look into that.
Hi folks, I’m new to Vision Pro stack, still trying to learn all the nuances. Here is a problem I can’t seem to find an answer.
I placed entity A( a small .02 radius sphere) inside entity B( size:.1 box). Both entities have HoverEffectComponent, and both inputcomponent is set to .direct. Entity A is NOT a child of Entity B. When I direct touch Entity B, I noticed that Entity A’s hover effect is fired as well. This only happens if Entity A‘s position is inside Entity B. The gesture that is only targeted at Entity A doesn’t work either. I double checked Entity A collider which sits inside entity B collider, my direct touch shouldn’t have trigger its hove effect. Having one collider inside another seems to produce unpredictable behavior? Thanks in advance 🙏🙏🙏
Context: I’m trying to create an invisible bound around Entity A, so when my hand approaches the bound to grab Entity A, a nice spotlight hover effect would fire first on the bound before hand reaching entity A.
I am developing an ARKit based application that requires plane detection of the tabletop at which the user is seated. Early testing was with an iPhone 8 and iPhone 8+. With those devices, ARKit rapidly detected the plane of the tabletop when it was only 8 to 10 inches away. Using iPhone 15 with the same code, it seems to require me to move the phone more like 15 to 16 inches away before detecting the plane of the table. This is an awkward motion for a user seated at a table. To validate that it was not necessarily a feature of my code, I determined that the same behavior results with Apple's sample AR Interaction application. Has anyone else experienced this, and if so, have suggestions to improve the situation?
Hello,
I have downloaded and run the sample object tracking app for visionos.
Now I'm working on my own objects for tracking. I have made a model using Create ML using images of my object.
However, I cannot see how to convert the Create ML output file (xxx.mlmodel) into a reference object like the files in the sample project.
is there a tool for converting them?
TIA
Topic:
Spatial Computing
SubTopic:
ARKit
As I understand it there are two ways I can track a hand, or a joint, in RealityKit:
either, create an AnchorEntity, for example AnchorEntity(.hand(.left, location: .palm))
or, set up an ARSession with a HandTrackingProvider ( a lot more code which I haven't repeated here).
Assuming this is correct, when would I want to use one over the other?
In ARKit for visionOS, I can track the user's head with a HeadAnchor, but it will not give the location. However, I can get the device's transform by calling queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) on a WorldTrackingProvider.
Why the difference? - if I know the device's transform, I effectively know the head's transform.
I am using Entity of RealityKit to display virtual content, however I find that sometimes the real object in front of the virtual content can not occulude the virtual content.
For example, I place an Entity in a room, but when I walk into another room, I can still see the Entity through the wall.
I wonder how should I fix the problem. Thank you!
I am currently creating an app where two people share an instance of an immersive space so that they are able to point to certain things in the immersive space. Right now, other people are hidden behind the immersive space, and even with people awareness enabled for everything, people are still too difficult to see. I've found this documentation (https://developer.apple.com/documentation/arkit/occluding-virtual-content-with-people) which describes what I want to do, but it is only listed as working on iOS an iPadOS. Is there anything similar to this that will work on VisionOS?