So, our app imports CoreML and loads the CoreML models when starting up.
Locally all tests succeed consistently.
Running Tests on Xcode Cloud, they all fail with (UI and Unit Tests)
App (6846) encountered an error (Early unexpected exit, operation never finished bootstrapping - no restart will be attempted. (Underlying Error: Test crashed with signal ill before establishing connection.))
the test instance is busy for 35min and then just aborts, with all tests failing.
This sounds to me like the simulator is maybe showing some exception or is stuck?
Or is it possible that xcode server runs in a special environment that gets stuck on loading CoreML models?
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Created
In courses like Compose interactive 3D content in Reality Composer Pro Realitykit Engineers recommended working with Reality Composer Pro to create RealityKit packages to embed in our Realitykit Xcode projects.
And, comparing the workflow to Unity/Unreal, I can see the reasoning since it is nice to prepare scenes/materials/assets visually.
Now when we also want to run a Xcode Cloud CI/CD pipeline this seems to come into conflict:
When adding a basic *.usdz to the RealityKitContent.rkassets folder, every build we run on Xcode cloud fails with:
Compile Reality Asset RealityKitContent.rkassets
❌realitytool requires Metal for this operation and it is not available in this build environment
I have also found this related forum post here but it was specifically about compiling a *.skybox.
Basically, take just the Xcode 26 AR App template, where we put the ContentView as the detail end of a NavigationStack.
Opening app, the app uses < 20MB of memory. Tapping on Open AR the memory usage goes up to ~700MB for the AR Scene. Tapping back, the memory stays up at ~700MB.
Checking with Debug memory graph I can still see all the RealityKit classes in the memory, like ARView, ARRenderView, ARSessionManager.
Here's the sample app to illustrate the issue.
PS: To keep memory pressure on the system low, there should be a way of freeing all the memory the AR uses for apps that only occasionally show AR scenes.
Is there any standard way of efficiently showing a MTLTexture on a RealityKit Entity?
I can't find anything proper on how to , for example, generate a LowLevelTexture out of a MTLTexture. Closest match was this two year old thread.
In the old SceneKit app, we would just do
guard let material = someNode.geometry?.materials.first else { return }
material.diffuse.contents = mtlTexture
Our flow is as follows (for visualizing the currently detected object):
Camera-Stream -> CoreML Segmentation -> Send the relevant part of the MLShapedArray-Tensor to a MTLComputeShader that returns a MTLTexture -> Show the resulting texture on a 3D object to the user
Is there any way to render a RealityView to an Image/UIImage like we used to be able to do using SCNView.snapshot() ?
ImageRenderer doesn't work because it renders a SwiftUI view hierarchy, and I need the currently presented RealityView with camera background and 3D scene content the way the user sees it
I tried UIHostingController and UIGraphicsImageRenderer like
extension View {
func snapshot() -> UIImage {
let controller = UIHostingController(rootView: self)
let view = controller.view
let targetSize = controller.view.intrinsicContentSize
view?.bounds = CGRect(origin: .zero, size: targetSize)
view?.backgroundColor = .clear
let renderer = UIGraphicsImageRenderer(size: targetSize)
return renderer.image { _ in
view?.drawHierarchy(in: view!.bounds, afterScreenUpdates: true)
}
}
}
but that leads to the app freezing and sending an infinite loop of
[CAMetalLayer nextDrawable] returning nil because allocation failed.
Same thing happens when I try
return renderer.image { ctx in
view.layer.render(in: ctx.cgContext)
}
Now that SceneKit is deprecated, I didn't want to start a new app using deprecated APIs.
I thought the ARCoachingOverlayView was a nice touch, so each apps ARKit coaching was recognizable and I used it in my ARView/ARSCNView based apps.
Now with RealityView, is there any replacement planned?
Or should we just use UIViewRepresentable and wrap ARCoachingOverlayView?
I want an AR character to be able to look at a position while still playing the characters animation.
So far, I managed to manually adjust a single bone rotation using
skeletalComponent.poses.default = Transform(
scale: baseTransform.scale,
rotation: lookAtRotation,
translation: baseTransform.translation
)
which I run at every rendering update, while a full body animation is running.
But of course, hardcoding single joints to point into a direction (in my case the head) does not look as nice, as if I were to run some inverse cinematic that includes, hips + neck + head joints.
I found some good IKRig code in Composing interactive 3D content with RealityKit and Reality Composer Pro.
But when I try to adjust rigs while animations are playing, the animations are usually winning over the IKRig changes to the mesh.
Most models are only available as glb or fbx, so I usually reexport them into usdz using Blender.
When I import them into Reality Composer Pro, Mesh, Textures etc look great, but in the Animation Library subsection all I can see is one default subtree animation.
In Blender I can see all available animations and play them individually. The default subtree animation just plays the default idle animation.
In fact when I open the nonlinear animation view in Blender and select a different animation as the default animation, the exported usdz shows the newly selected animation as default subtree animation.
I can see in the Apple sample apps models can have multiple animations in their Animation Library.
I'm using the latest Blender 4.5 and the usdz exporter should be working properly?
Two issues:
No matter what I set in
try audioSession.setPreferredSampleRate(x)
the sample rate on both iOS and macOS is always 48000 when the output goes through the speaker, and 24000 when my Airpods connect to an iPhone/iPad.
Now, I'm checking the current output loudness to animate a 3D character, using
mixerNode.installTap(onBus: 0, bufferSize: y, format: nil) { [weak self] buffer, time in
Task { @MainActor in
// calculate rms and animate character accordingly
but any buffer size under 4800 is just ignored and the buffers I get are 4800 sized.
This is ok, when the sampleRate is currently 48000, as 10 samples per second lead to decent visual results.
But when AirPods connect, the samplerate is 24000, which means only 5 samples per second, so the character animation looks lame.
My AVAudioEngine setup is the following:
audioEngine.connect(playerNode, to: pitchShiftEffect, format: format)
audioEngine.connect(pitchShiftEffect, to: mixerNode, format: format)
audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: nil)
Now, I'd be fine if the outputNode runs at whatever if it needs, as long as my tap would get at least 10 samples per second.
PS: Specifying my favorite format in the
let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)!
mixerNode.installTap(onBus: 0, bufferSize: y, format: format)
doesn't change anything either
So, I was trying to animate a single bone using FromToByAnimation, but when I start the animation, the model instead does the full body animation stored in the availableAnimations.
If I don't run testAnimation nothing happens.
If I run testAnimation I see the same animation as If I had called
entity.playAnimation(entity.availableAnimations[0],..)
here's the full code I use to animate a single bone:
func testAnimation() {
guard let jawAnim = jawAnimation(mouthOpen: 0.4) else {
print("Failed to create jawAnim")
return
}
guard let creature, let animResource = try? AnimationResource.generate(with: jawAnim) else { return }
let controller = creature.playAnimation(animResource, transitionDuration: 0.02, startsPaused: false)
print("controller: \(controller)")
}
func jawAnimation(mouthOpen: Float) -> FromToByAnimation<JointTransforms>? {
guard let basePose else { return nil }
guard let index = basePose.jointNames.firstIndex(of: jawBoneName) else {
print("Target joint \(self.jawBoneName) not found in default pose joint names")
return nil
}
let fromTransforms = basePose.jointTransforms
let baseJawTransform = fromTransforms[index]
let maxAngle: Float = 40
let angle: Float = maxAngle * mouthOpen * (.pi / 180)
let extraRot = simd_quatf(angle: angle, axis: simd_float3(x: 0, y: 0, z: 1))
var toTransforms = basePose.jointTransforms
toTransforms[index] = Transform(
scale: baseJawTransform.scale * 2,
rotation: baseJawTransform.rotation * extraRot,
translation: baseJawTransform.translation
)
let fromToBy = FromToByAnimation<JointTransforms>(
jointNames: basePose.jointNames,
name: "jaw-anim",
from: fromTransforms,
to: toTransforms,
duration: 0.1,
bindTarget: .jointTransforms,
repeatMode: .none,
)
return fromToBy
}
PS: I can confirm that I can set this bone to a specific position if I use
guard let index = newPose.jointNames.firstIndex(of: boneName) ...
let baseTransform = basePose.jointTransforms[index]
newPose.jointTransforms[index] = Transform(
scale: baseTransform.scale,
rotation: baseTransform.rotation * extraRot,
translation: baseTransform.translation
)
skeletalComponent.poses.default = newPose
creatureMeshEntity.components.set(skeletalComponent)
This works for manually setting the bone position, so the jawBoneName and the joint-transformation can't be that wrong.
So,
I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
So experimenting with the new SpeechTranscriber, if I do:
let transcriber = SpeechTranscriber(
locale: locale,
transcriptionOptions: [],
reportingOptions: [.volatileResults],
attributeOptions: [.audioTimeRange]
)
only the final result has audio time ranges, not the volatile results.
Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.
I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while.
The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.
Today I was surprised that the latest version of Three.js (browser based 3D Engine) actually supports a transmission property for transparent materials that looks pretty good and is really performant:
So far I have only used this material property when rendering in Blender, not in a realtime engine.
But if THREE can run this at 60fps in the browser, there must be a way of achieving the same in ARKit/SceneKit/RealityKit?
I just haven't found anything in the SCNMaterial documentation about Transmission nor anyone mentioning relevant shader modifiers.
Especially if we want to present objects with frosted glas effects in AR this would be super useful?
(The Three.js app for reference: https://q2nl8.csb.app/ and I learned about this in an article by Kelly Mulligan: https://tympanus.net/codrops/2021/10/27/creating-the-effect-of-transparent-glass-and-plastic-in-three-js/)
Hey there,
When I run the following 50 lines of code in release mode, or turn Optimization on in Build-Settings Swift Compiler - Code Generation I will get the following crash.
Anyone any idea why that happens? (Xcode 13.4.1, happens on Device as well as simulator on iOS 15.5 and 15.6)
Example Project: https://github.com/Bersaelor/ResourceCrashMinimalDemo
#0 0x000000010265dd58 in assignWithCopy for Resource ()
#1 0x000000010265d73c in outlined init with copy of Resource<VoidPayload, String> ()
#2 0x000000010265d5dc in specialized Resource<>.init(url:method:query:authToken:headers:) [inlined] at /Users/konradfeiler/Source/ResourceCrashMinimalDemo/ResourceCrashMinimalDemo/ContentView.swift:51
#3 0x000000010265d584 in specialized ContentView.crash() at /Users/konradfeiler/Source/ResourceCrashMinimalDemo/ResourceCrashMinimalDemo/ContentView.swift:18
Code needed:
import SwiftUI
struct ContentView: View {
var body: some View {
Button(action: { crash() }, label: { Text("Create Resouce") })
}
/// crashes in `outlined init with copy of Resource<VoidPayload, String>`
func crash() {
let testURL = URL(string: "https://www.google.com")!
let r = Resource<VoidPayload, String>(url: testURL, method: .get, authToken: nil)
print("r: \(r)")
}
}
struct VoidPayload {}
enum HTTPMethod<Payload> {
case get
case post(Payload)
case patch(Payload)
}
struct Resource<Payload, Response> {
let url: URL
let method: HTTPMethod<Payload>
let query: [(String, String)]
let authToken: String?
let parse: (Data) throws -> Response
}
extension Resource where Response: Decodable {
init(
url: URL,
method: HTTPMethod<Payload>,
query: [(String, String)] = [],
authToken: String?,
headers: [String: String] = [:]
) {
self.url = url
self.method = method
self.query = query
self.authToken = authToken
self.parse = {
return try JSONDecoder().decode(Response.self, from: $0)
}
}
}
So, we use ARFaceTrackingConfiguration and ARKit for a magic mirror like experience in our apps, augmenting users faces with digital content.
On the iPad Pro 5gen customers are complaining that the camera image is too wide, I'm assuming that is because of the new wide-angle camera necessary for Apples center-stage Facetime calls?
I have looked through Tracking and Visualizing Faces and the WWDC 2021 videos, but I must have missed any API's that allow us to disable the wide-angle feature on the new iPads programmatically?