mcdopsa’s Profile | Apple Developer Forums

mcdopsa


In the new Xcode, we saw examples of Claude, OpenAI, and Google coding agents that you can start conversations with inside your project, giving them access to your project files as context. As far as I understand, this requires an API key for those models, and the processing runs on Anthropic / Google servers, not locally or on Private Cloud Compute.

Is it possible to instead use the LLM powering Foundation Models for a “Siri Code Agent” that operates in place of those models, but runs on device or in Private Cloud Compute? I like how this works for Siri AI requests, and would love to have a coding assistant agent that can operate in the same privacy-preserving way!

Is this possible with any of the open source frameworks or the command line tools? If not, what is the best way to request this feature?
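For context, the on-device model can already be called from your own app or tool through the Foundation Models framework. This is separate from Xcode's built-in coding assistant and does not answer whether Xcode itself can use that model as an agent. A minimal, untested sketch, assuming a deployment target of iOS 26 / macOS 26 and a device with Apple Intelligence enabled; the names `askOnDeviceModel` and `OnDeviceModelError`, and the instructions string, are hypothetical:

```swift
import Foundation
import FoundationModels

/// Hypothetical error type for this sketch.
enum OnDeviceModelError: Error {
    case modelUnavailable
}

/// Hypothetical helper: sends a coding question to the on-device system model
/// and returns the plain-text reply.
func askOnDeviceModel(_ question: String) async throws -> String {
    // The system model may be unavailable (unsupported device, Apple
    // Intelligence turned off, model still downloading), so check first.
    guard case .available = SystemLanguageModel.default.availability else {
        throw OnDeviceModelError.modelUnavailable
    }

    // Instructions steer the session; this wording is only an example.
    let session = LanguageModelSession(
        instructions: "You are a concise Swift coding assistant."
    )

    let response = try await session.respond(to: question)
    return response.content
}
```

A call such as `try await askOnDeviceModel("Explain what @State does in SwiftUI")` would run the request through the system model rather than a third-party API, but it has no access to project files unless you pass them in the prompt yourself.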

Machine Learning & AI Foundation Models

411

Jun ’26

Visual Intelligence for VisionOS in 3rd Party Apps

During the keynote, we saw an amazing example of Siri using Visual Intelligence to identify items in the user's physical space and make inferences based on their size. Do 3rd party apps have the ability to perform the same, or similar, actions?

For example: The user loads a photo of an item or product and clicks a button that says 'Find Item In My Space'. Apple Intelligence is then used to analyze the user's surroundings and notify the user whether the item is present, along with some positional or physical context. The response is shown in the user interface as text: "This item is in your room, 1 meter to your right."

Goal: For privacy reasons, developers currently cannot access the passthrough camera on Apple Vision Pro to run AI/ML vision processing models on it. If Apple Intelligence can look through the camera on the developer's behalf, in a privacy-preserving, isolated black box, without providing the image texture to the developer in any way, the user can make use of Visual Intelligence features based on their physical surroundings without sacrificing their privacy.

Purpose: Visual Intelligence is a key feature that exemplifies the benefits of Spatial Computing, and examples like the one shown in the keynote are a perfect use case for the medium. Since Siri now has this capability, users will come to expect that all apps across visionOS can perform the same kinds of actions. Developers don't generally want or need direct access to images of a user's surroundings, and a local, private method of processing these requests is ideal both for developers concerned with data privacy management and for users concerned about developers having too much access to their surroundings.

Wearable devices with cameras are a foundational accelerator for users adopting AI in useful ways in their daily lives. A camera is the most natural way to communicate with AI about what is relevant to you at any given time, removes the friction of manually capturing good data for AI inferencing, and brings purpose to wearing this class of device every day. As these devices become more common and capable, data privacy becomes even more important. Users will need reassurance that the devices they choose to wear can observe their surroundings only when they allow it, while retaining the powerful features that make the devices worthwhile.

Accessibility: Visual Intelligence is an extremely powerful accessibility tool (for example, for individuals who have low vision) and can meaningfully improve quality of life. Developers can design various applications beyond Siri AI with very specific inferencing capabilities powered by AI. Visually intelligent apps should have intentional, unique purposes that users can choose to incorporate into their lives. This will not be a one-size-fits-all approach to Visual Intelligence, and it will require specific design, training, and development to create meaningful capabilities.

If this is already possible, amazing! Any resources to learn more would be greatly appreciated. If this is not yet possible, please let us know what we can do to encourage Apple to consider it. Thank you.

Machine Learning & AI App Intents

249

Jun ’26

VisionOS Equivalent to ARWorldMap for Permanence & Accessibility

Currently visionOS has WorldAnchors, which are a great, privacy-preserving method to affix and localize entities in physical spaces. These are great for apps that work in Mixed immersive mode, as well as for augmented layers or virtual overlays on physical spaces. However, they have some fundamental limitations compared to iOS’s ARWorldMap that limit their usefulness as truly persistent anchors, which the platform would greatly benefit from.

The problem - Impermanence: WorldAnchors are on-device only, stored at a system level, and expose their ID for collection and use within apps. If the app which created them is removed, or if the device is reset, those saved WorldAnchors are removed and lost forever. Their ID will no longer point to an anchor in physical space, and it is as if the anchor had been deleted or never existed. This means that anchors effectively exist only temporarily, on a single device, while that specific device is in an undisturbed state. A second device running the same app cannot access that anchor. With no way to export the anchor, save it, or re-load it (from inside an app bundle, from a server, from on-site data storage, etc.), the anchor is 'trapped' on the device which created it. An ARWorldMap can be saved to a file and acquired in many ways, which circumvents this problem and allows it to exist as a single source of truth for that particular physical environment, which any iOS or iPadOS device can localize to as long as it has the ARWorldMap.

Imperfect Workaround #1 - SharedAnchors: After setting a WorldAnchor on Device #1, start a local SharePlay session with Device #1 & #2 in the same physical space. Then, use a 'Shared Anchor' with transform offset data matching the WorldAnchor. On Device #2, save that offset as a new WorldAnchor. This new WorldAnchor will have a transform effectively matching the original, though it will have a new ID and is entirely separate.
Problem with Workaround #1: This requires two visionOS devices to be in the same physical location simultaneously, with an active SharePlay session between them, in order to conduct the sharing operation. If Device #1 does not happen to be available in the physical space at the same time as Device #2 for any reason, Device #2 will never have access to this anchor. In a public or industrial space, it is not realistically viable to have a person with Device #1 available at all times. There are many situations where the user of Device #2 would want to enter the space and observe the anchor while alone, and they would never have access to it.

Workaround #2 - iOS / iPadOS Middleman: Once a visionOS WorldAnchor is created on Device #1, that same user can manually co-locate an iOS or iPadOS device through some shared session (for example, by showing an ImageTrigger on the iOS/iPadOS device's display, which the visionOS device reads and then determines an offset for). Then, create an ARWorldMap on that co-located iOS/iPadOS device, save it, and serve it. To localize visionOS Device #2, enter the space and first localize an iOS / iPadOS device with the ARWorldMap. Then, manually co-locate visionOS Device #2 with the iOS/iPadOS device and access the offset. Finally, save this offset as a WorldAnchor on Device #2; at this point the iOS/iPadOS device is no longer needed.

Problems With Workaround #2: At a minimum, this requires 3 devices, or more realistically 4 if Device #1 and #2 are not in the space at the same time. Having to operate an iOS/iPadOS device while wearing a visionOS device is a very poor user experience, and the multi-step process is complicated and not intuitive to most users. The accuracy of this process is also extremely poor, as tracking an image on the screen of an iOS/iPadOS device will never be as precise as the internal tracking system on visionOS.
It will always have some margin for drift and error, which can result in the anchor being offset very far from the intended position. This is antithetical to the purpose of WorldAnchors: the delta can be so inconsistent that Device #2 observes content attached to the anchor at an incorrect location. That could lead to unintended user behaviour, such as a user moving the attached content so that it looks correct for them, which would negatively impact all other devices viewing the content.

Desired Solution: An optimal upgrade to visionOS WorldAnchors would:
- Use the same underlying tracking & relocalization system that is currently in use.
- Allow exporting the WorldAnchor's data in an encrypted, privacy-preserving way (not simply a point cloud), which could be saved and shared as a file similar to ARWorldMap, then used for relocalization on ANY device with access to that file and application.
- Be interoperable across platforms, so that iOS, iPadOS, visionOS and any other spatial-capable platform can use that single file and localize themselves in the physical space. This would unify ARKit WorldAnchors for all platforms, ensuring that the same physical space can be localized to by all devices, anchors can be created by all devices, and the content exists in the same position on all devices. An iOS or iPadOS device could create a WorldAnchor that is then identical on visionOS, or vice versa.

These features unlock true persistence in anchoring content to real-world spaces, a critical component of a Spatial Computing platform that maximizes the unique capabilities and benefits of the medium. This creates an upward path for users to start on iOS / iPadOS today and upgrade to visionOS in the future.

Accessibility: Not all users are able to use visionOS, for various reasons including physical, regional and financial circumstances, among many others.
There is no reason why someone who is unable to use visionOS for any one of these reasons should be LOCKED-OUT from Spatial Computing applications. Spatial computing is the most human computing medium ever created, and applications need to allow all humans to engage with Spatial Computing experiences regardless of their level of access to visionOS devices. We as developers want to build everlasting Spatial Computing applications that accentuate the medium, maximize the benefits of Spatial Computing, include / invite all humans regardless of their Accessibility level, and establish virtual content that can outlive the individuals who create it. Please, take this request into serious consideration, as the feature as described would contribute to the Apple Ecosystem being the single greatest Spatial Computing platform of all time (and space), enabling permanent layering of physical spaces, preserving privacy for sensitive data, and maximizing accessibility across the spectrum of humans and devices. Thank you.
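
For readers unfamiliar with the iOS / iPadOS behaviour this request refers to, saving an ARWorldMap to a file and relocalizing from it can be sketched as follows. This is a minimal, untested sketch of the existing ARKit workflow on iOS, not something available on visionOS; the function names and the file URL parameter are hypothetical:

```swift
import ARKit

/// Hypothetical helper: captures the session's current world map and writes it to `url`.
func saveWorldMap(from session: ARSession, to url: URL) {
    session.getCurrentWorldMap { worldMap, error in
        guard let worldMap else {
            print("No world map available: \(error?.localizedDescription ?? "unknown error")")
            return
        }
        do {
            // ARWorldMap supports secure coding, so it can be archived to Data.
            let data = try NSKeyedArchiver.archivedData(
                withRootObject: worldMap,
                requiringSecureCoding: true
            )
            try data.write(to: url, options: .atomic)
        } catch {
            print("Failed to save world map: \(error)")
        }
    }
}

/// Hypothetical helper: reads a previously saved world map back from `url`.
func loadWorldMap(from url: URL) throws -> ARWorldMap? {
    let data = try Data(contentsOf: url)
    return try NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self, from: data)
}

/// Hypothetical helper: restarts tracking so the device relocalizes to the saved map.
func relocalize(_ session: ARSession, to worldMap: ARWorldMap) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.initialWorldMap = worldMap
    session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
```

Because the map is an ordinary file, it can be bundled with an app, downloaded from a server, or copied between devices, which is the portability the post asks for on visionOS.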

Spatial Computing ARKit

402

Jun ’26

Reality View Preserves Camera Transform when toggling Virtual & Spatial Tracking modes

When switching from RealityView’s .spatialTracking camera mode to .virtual camera mode, the camera’s orientation relative to the scene is preserved permanently, with no way to reset it to the default world-up orientation. Since the .spatialTracking camera will always have a non-default orientation, switching to .virtual camera mode means that the camera’s ‘UP’ direction will never match the device display’s ‘UP’ direction, as it does by default. This is especially noticeable when using .orbit camera controls, as the orbit’s UP direction matches the scene, not the camera, and all rotation directions give unexpected results.

Expected: When setting virtual camera mode after using spatialTracking camera mode, either:
1. The virtual camera orientation returns to default (world up), or
2. A 'content.camera.resetOrientation()' call is made available which resets the RealityView camera to its default orientation.

Reality: Switching from .spatialTracking -> .virtual camera mode permanently locks the .virtual camera’s orientation to the final frame of the .spatialTracking camera’s rotation (relative to the RealityView content scene).

One imperfect workaround is to reset / rebuild the entire RealityView after changing modes (by resetting .id() or otherwise). This is not ideal, as it causes everything inside the make closure to rerun, which not only costs performance and time but also causes a visible flicker, and it can be problematic when managing increasingly complicated views.

Another imperfect alternative is to use more than one RealityView, which is not ideal as it doubles the base RAM usage, significantly increases the amount of code, and seemingly goes against the intent of being able to change the camera's .virtual/.spatialTracking mode at will.
Code Sample:

import SwiftUI
import RealityKit

struct RKSpatialVirtualToggle: View {
    @State var showAR: Bool = false

    var body: some View {
        RealityView { content in
            // A single cube, one metre in front of the origin, used as the orbit target
            let cube = ModelEntity(mesh: .generateBox(size: 0.25), materials: [SimpleMaterial()])
            cube.position.z = -1
            content.add(cube)
            content.camera = showAR ? .spatialTracking : .virtual
            content.cameraTarget = cube
        } update: { content in
            // Switch camera mode whenever showAR changes
            content.camera = showAR ? .spatialTracking : .virtual
        }
        .realityViewCameraControls(.orbit)

        VStack {
            Spacer()
            Button("Toggle AR") {
                showAR.toggle()
            }
            .buttonStyle(.borderedProminent)
        }
    }
}

Xcode Version: Version 26.0 (17A324)
iOS Version: iOS 26.5 (23F75)
Tested on devices: iPhone 12 Pro, iPhone 15 Pro
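
The rebuild workaround mentioned in this post (resetting .id()) can be sketched as follows. This is a minimal, untested variation of the post's sample, with the same flicker and rebuild cost the post describes; the struct name `RKToggleWithRebuild` and the `viewID` property are hypothetical:

```swift
import SwiftUI
import RealityKit

struct RKToggleWithRebuild: View {
    @State private var showAR = false
    // Changing this identity makes SwiftUI discard and rebuild the RealityView,
    // which reruns the make closure and starts with a fresh camera.
    @State private var viewID = UUID()

    var body: some View {
        VStack {
            RealityView { content in
                let cube = ModelEntity(mesh: .generateBox(size: 0.25), materials: [SimpleMaterial()])
                cube.position.z = -1
                content.add(cube)
                // The make closure reruns on every rebuild, so it reads the current mode.
                content.camera = showAR ? .spatialTracking : .virtual
                content.cameraTarget = cube
            }
            .id(viewID)
            .realityViewCameraControls(.orbit)

            Button("Toggle AR") {
                showAR.toggle()
                // Force a rebuild so the virtual camera does not inherit
                // the last spatial-tracking orientation.
                viewID = UUID()
            }
            .buttonStyle(.borderedProminent)
        }
    }
}
```

Because the whole view is rebuilt, any entities or state created in the make closure are recreated as well, so this only suits simple scenes.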

Spatial Computing General RealityKit iOS Camera SwiftUI

324

Jun ’26

RealityView Camera Target Error when set while Orbiting

When interacting with RealityView’s realityViewCameraControls .orbit and setting a new RealityViewCameraContent .cameraTarget, the resulting camera target and camera orbit are incorrect. This can be demonstrated with one finger orbiting the RealityView while another pushes a button which changes the camera target. Instead of the camera facing the new target, some other point in the scene becomes the effective camera target and orbit point.

This only occurs while an orbit interaction is taking place. If you stop interacting with the orbit, change the target, then start orbiting again, everything works as expected. Though this example uses two touches, any change of the camera target conflicts with an ongoing orbit interaction. This means interacting with the orbit can result in the wrong camera view, which is unexpected for users and difficult for developers to detect or reconcile.

Expected: When orbiting the scene while setting a new camera target with the on-screen buttons (at the same time), the camera’s new target is centred in view, and the orbit revolves around the new target and continues to match my gestures.

Reality: When orbiting the scene while setting a new camera target with the on-screen buttons (at the same time), the camera’s new target is not centred in view, and the camera now orbits an unexpected point in the scene that is not my intended target.

One imperfect workaround is to force a rebuild of the view after setting a new cameraTarget. This sets all targets correctly but results in a flicker and a loss of orbit controls until re-touch. It is ultimately a poor user experience, but it is better than the wrong target being shown unexpectedly.

Code Sample:

import SwiftUI
import RealityKit

struct RKOribtTarget: View {
    @State private var target: Int = 0
    @State private var rcContent: RealityViewCameraContent?
    @State private var rkID: UUID = UUID()

    let root = Entity()
    let center = ModelEntity(mesh: .generateSphere(radius: 0.05), materials: [UnlitMaterial(color: UIColor(.gray.opacity(0.5)))])
    let red = ModelEntity(mesh: .generateBox(size: 0.1), materials: [SimpleMaterial(color: .red, isMetallic: false)])
    let blue = ModelEntity(mesh: .generateBox(size: 0.1), materials: [SimpleMaterial(color: .blue, isMetallic: false)])
    let green = ModelEntity(mesh: .generateBox(size: 0.1), materials: [SimpleMaterial(color: .green, isMetallic: false)])

    var body: some View {
        VStack {
            RealityView { content in
                red.position.x = 0.5
                blue.position.z = 0.5
                green.position.y = 0.5
                center.position = .init(repeating: 0.25)
                content.cameraTarget = target == 0 ? root : blue
                root.addChild(red)
                root.addChild(blue)
                root.addChild(green)
                root.addChild(center)
                content.add(root)
            } update: { content in
                switch target {
                case 0: content.cameraTarget = root
                case 1: content.cameraTarget = blue
                case 2: content.cameraTarget = red
                case 3: content.cameraTarget = green
                default: content.cameraTarget = root
                }
            }
            .id(rkID)
            .realityViewCameraControls(.orbit)

            VStack {
                Text("Target")
                Button("Default") {
                    target = 0
                    // Force rebuilding view resets orbit target and rotation
                    // But shows a flicker, interaction requires touch reset
                    // Not an ideal workaround
                    // rkID = UUID()
                }
                .buttonStyle(.bordered)
                Button("Blue") {
                    target = 1
                    // rkID = UUID()
                }
                .buttonStyle(.bordered)
                .tint(.blue)
                Button("Red") {
                    target = 2
                    // rkID = UUID()
                }
                .buttonStyle(.bordered)
                .tint(.red)
                Button("Green") {
                    target = 3
                    // rkID = UUID()
                }
                .buttonStyle(.bordered)
                .tint(.green)
            }
        }
    }
}

Xcode Version: Version 26.0 (17A324)
iOS Version: iOS 26.5 (23F75)
Tested on devices: iPhone 12 Pro, iPhone 15 Pro

Spatial Computing General iOS Camera RealityKit

900

May ’26

ManipulationComponent Not Translating using indirect input

When using the new RealityKit ManipulationComponent on entities, indirect input will never translate the entity, no matter what settings are applied. Direct manipulation works as expected for both translation and rotation.

Is this intended behaviour? This is different from how indirect manipulation works on Model3D. How else can we get translation from this component?

visionOS 26 Beta 2
Built from macOS 26 Beta 2 and Xcode 26 Beta 2

Attached is replicable sample code. I have tried this in other projects with the same results.

var body: some View {
    RealityView { content in
        // Add the initial RealityKit content
        if let immersiveContentEntity = try? await Entity(named: "MovieFilmReel", in: reelRCPBundle) {
            // Set up input target and collision for manipulation
            ManipulationComponent.configureEntity(
                immersiveContentEntity,
                allowedInputTypes: .all,
                collisionShapes: [ShapeResource.generateBox(width: 0.2, height: 0.2, depth: 0.2)]
            )
            immersiveContentEntity.position.y = 1
            immersiveContentEntity.position.z = -0.5

            // Keep the entity where it is released
            var mc = ManipulationComponent()
            mc.releaseBehavior = .stay
            immersiveContentEntity.components.set(mc)

            content.add(immersiveContentEntity)
        }
    }
}

Spatial Computing General Swift RealityKit visionOS

3.7k

Apr ’26

Presenting images in RealityKit sample No Longer Builds

After updating to the latest visionOS beta, visionOS 26 Beta 4 (23M5300g), the ‘Presenting images in RealityKit’ sample from the following link no longer runs on device due to an error.

https://developer.apple.com/documentation/RealityKit/presenting-images-in-realitykit

Expected / Previous: The application builds and runs on device, working as described in the documentation.

Reality: The application builds, but does not run on device due to an error (shown in screenshot): “Thread 1: EXC_BAD_ACCESS (code=1, address=0xb)”. The application still runs in the simulator, but not on device.

When launching the app from Xcode, it builds and installs correctly but hangs on this error. When launching the app from the Home Screen, the app does not load and immediately returns to the Home Screen.

This Xcode project previously ran, and no code has changed; the only change was updating the visionOS system software to the latest version, visionOS 26 Beta 4 (23M5300g).

Is anyone else experiencing this issue?

Spatial Computing General RealityKit visionOS

280

Aug ’25