Is there any standard way of efficiently showing an MTLTexture on a RealityKit Entity?
I can't find anything proper on how to, for example, generate a LowLevelTexture out of an MTLTexture. The closest match was this two-year-old thread.
In the old SceneKit app, we would just do
guard let material = someNode.geometry?.materials.first else { return }
material.diffuse.contents = mtlTexture
Our flow is as follows (for visualizing the currently detected object):
Camera stream -> CoreML segmentation -> send the relevant part of the MLShapedArray tensor to a Metal compute shader that returns an MTLTexture -> show the resulting texture on a 3D object to the user.
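The direction I've been experimenting with is the LowLevelTexture route, but I'm not sure it's the intended pattern. Here's a minimal sketch of my assumption (iOS 18+ LowLevelTexture API; the function names, pixel format, sizes, and the exact descriptor fields are placeholders/guesses from my pipeline):

import Metal
import RealityKit

// Sketch, assuming LowLevelTexture + TextureResource(from:) is the right pairing.
// The descriptor initializer/fields below are my assumption about the API shape.
func makeStreamingTexture(width: Int, height: Int) throws -> (LowLevelTexture, TextureResource) {
    let descriptor = LowLevelTexture.Descriptor(
        pixelFormat: .rgba8Unorm,
        width: width,
        height: height,
        textureUsage: [.shaderRead, .shaderWrite]
    )
    let lowLevelTexture = try LowLevelTexture(descriptor: descriptor)
    let resource = try TextureResource(from: lowLevelTexture)
    return (lowLevelTexture, resource)
}

// Per frame: blit the compute shader's output into the texture backing the resource.
func update(_ lowLevelTexture: LowLevelTexture, from mtlTexture: MTLTexture, commandBuffer: MTLCommandBuffer) {
    let destination = lowLevelTexture.replace(using: commandBuffer)
    if let blit = commandBuffer.makeBlitCommandEncoder() {
        blit.copy(from: mtlTexture, to: destination)
        blit.endEncoding()
    }
}

and then assigning the TextureResource to, for example, an UnlitMaterial's color texture on the entity's ModelComponent. Is that the recommended approach, or is there something more direct?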
Is there any way to render a RealityView to an Image/UIImage like we used to be able to do using SCNView.snapshot() ?
ImageRenderer doesn't work because it renders a SwiftUI view hierarchy, and I need the currently presented RealityView, with the camera background and 3D scene content, the way the user sees it.
I tried UIHostingController and UIGraphicsImageRenderer like this:
extension View {
    func snapshot() -> UIImage {
        let controller = UIHostingController(rootView: self)
        let view = controller.view
        let targetSize = controller.view.intrinsicContentSize
        view?.bounds = CGRect(origin: .zero, size: targetSize)
        view?.backgroundColor = .clear
        let renderer = UIGraphicsImageRenderer(size: targetSize)
        return renderer.image { _ in
            view?.drawHierarchy(in: view!.bounds, afterScreenUpdates: true)
        }
    }
}
but that leads to the app freezing and logging an endless stream of
[CAMetalLayer nextDrawable] returning nil because allocation failed.
Same thing happens when I try
return renderer.image { ctx in
    view.layer.render(in: ctx.cgContext)
}
Now that SceneKit is deprecated, I didn't want to start a new app using deprecated APIs.
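The only fallback I can think of right now is hosting an ARView via UIViewRepresentable just for the capture, since ARView still has a snapshot API. A rough sketch of that assumption (the representable and session setup are omitted):

import RealityKit
import UIKit

// Sketch: use ARView.snapshot(saveToHDR:completion:) as a stand-in,
// assuming dropping down from RealityView to ARView is acceptable.
func capture(from arView: ARView, completion: @escaping (UIImage?) -> Void) {
    arView.snapshot(saveToHDR: false) { image in
        completion(image)
    }
}

But that defeats the point of moving to RealityView, so I'd prefer an official way to snapshot the RealityView itself.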
I thought ARCoachingOverlayView was a nice touch, making each app's ARKit coaching recognizable, and I used it in my ARView/ARSCNView-based apps.
Now with RealityView, is there any replacement planned?
Or should we just use UIViewRepresentable and wrap ARCoachingOverlayView?
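If wrapping it is the way to go, this is roughly what I had in mind, assuming I can still get hold of the underlying ARSession (which is the part I'm unsure about with RealityView):

import SwiftUI
import ARKit

// Minimal wrapper sketch; where the ARSession comes from in a RealityView
// setup is exactly my open question.
struct CoachingOverlay: UIViewRepresentable {
    let session: ARSession
    let goal: ARCoachingOverlayView.Goal

    func makeUIView(context: Context) -> ARCoachingOverlayView {
        let overlay = ARCoachingOverlayView()
        overlay.session = session
        overlay.goal = goal
        overlay.activatesAutomatically = true
        return overlay
    }

    func updateUIView(_ uiView: ARCoachingOverlayView, context: Context) {}
}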
I want an AR character to be able to look at a position while still playing the character's animation.
So far, I managed to manually adjust a single bone rotation using
skeletalComponent.poses.default = Transform(
    scale: baseTransform.scale,
    rotation: lookAtRotation,
    translation: baseTransform.translation
)
which I run on every render update while a full-body animation is playing.
But of course, hardcoding a single joint (in my case the head) to point in a direction does not look as nice as running some inverse kinematics that involves the hips + neck + head joints.
I found some good IKRig code in Composing interactive 3D content with RealityKit and Reality Composer Pro.
But when I try to adjust rigs while animations are playing, the animations usually win over the IKRig changes to the mesh.
Most models are only available as glb or fbx, so I usually re-export them to usdz using Blender.
When I import them into Reality Composer Pro, the mesh, textures, etc. look great, but in the Animation Library subsection all I can see is one default subtree animation.
In Blender I can see all available animations and play them individually. The default subtree animation just plays the default idle animation.
In fact when I open the nonlinear animation view in Blender and select a different animation as the default animation, the exported usdz shows the newly selected animation as default subtree animation.
I can see in the Apple sample apps models can have multiple animations in their Animation Library.
I'm using the latest Blender 4.5, so the usdz exporter should be working properly, shouldn't it?
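For debugging, a quick check like this should show what RealityKit itself sees in the exported usdz (the model name is a placeholder):

import RealityKit

// List every animation RealityKit can see in the imported usdz,
// to verify whether the other Blender actions made it into the file at all.
func printAvailableAnimations() async {
    guard let entity = try? await Entity(named: "MyCharacter") else { return }
    for animation in entity.availableAnimations {
        print("animation:", animation.name ?? "<unnamed>")
    }
}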
So, I was trying to animate a single bone using FromToByAnimation, but when I start the animation, the model instead plays the full-body animation stored in availableAnimations.
If I don't run testAnimation, nothing happens.
If I run testAnimation, I see the same animation as if I had called
entity.playAnimation(entity.availableAnimations[0],..)
here's the full code I use to animate a single bone:
func testAnimation() {
    guard let jawAnim = jawAnimation(mouthOpen: 0.4) else {
        print("Failed to create jawAnim")
        return
    }
    guard let creature, let animResource = try? AnimationResource.generate(with: jawAnim) else { return }
    let controller = creature.playAnimation(animResource, transitionDuration: 0.02, startsPaused: false)
    print("controller: \(controller)")
}

func jawAnimation(mouthOpen: Float) -> FromToByAnimation<JointTransforms>? {
    guard let basePose else { return nil }
    guard let index = basePose.jointNames.firstIndex(of: jawBoneName) else {
        print("Target joint \(self.jawBoneName) not found in default pose joint names")
        return nil
    }
    let fromTransforms = basePose.jointTransforms
    let baseJawTransform = fromTransforms[index]
    let maxAngle: Float = 40
    let angle: Float = maxAngle * mouthOpen * (.pi / 180)
    let extraRot = simd_quatf(angle: angle, axis: simd_float3(x: 0, y: 0, z: 1))
    var toTransforms = basePose.jointTransforms
    toTransforms[index] = Transform(
        scale: baseJawTransform.scale * 2,
        rotation: baseJawTransform.rotation * extraRot,
        translation: baseJawTransform.translation
    )
    let fromToBy = FromToByAnimation<JointTransforms>(
        jointNames: basePose.jointNames,
        name: "jaw-anim",
        from: fromTransforms,
        to: toTransforms,
        duration: 0.1,
        bindTarget: .jointTransforms,
        repeatMode: .none
    )
    return fromToBy
}
PS: I can confirm that I can set this bone to a specific position if I use
guard let index = newPose.jointNames.firstIndex(of: boneName) ...
let baseTransform = basePose.jointTransforms[index]
newPose.jointTransforms[index] = Transform(
    scale: baseTransform.scale,
    rotation: baseTransform.rotation * extraRot,
    translation: baseTransform.translation
)
skeletalComponent.poses.default = newPose
creatureMeshEntity.components.set(skeletalComponent)
This works for manually setting the bone position, so the jawBoneName and the joint-transformation can't be that wrong.
Two issues:
No matter what I set in
try audioSession.setPreferredSampleRate(x)
the sample rate on both iOS and macOS is always 48000 when the output goes through the speaker, and 24000 when my AirPods are connected to an iPhone/iPad.
Now, I'm checking the current output loudness to animate a 3D character, using
mixerNode.installTap(onBus: 0, bufferSize: y, format: nil) { [weak self] buffer, time in
    Task { @MainActor in
        // calculate rms and animate character accordingly
    }
}
but any buffer size under 4800 is just ignored, and the buffers I get are always 4800 frames long.
This is OK when the sample rate is 48000, as 10 updates per second lead to decent visual results.
But when the AirPods connect, the sample rate is 24000, which means only 5 updates per second, so the character animation looks lame.
My AVAudioEngine setup is the following:
audioEngine.connect(playerNode, to: pitchShiftEffect, format: format)
audioEngine.connect(pitchShiftEffect, to: mixerNode, format: format)
audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: nil)
Now, I'd be fine with the outputNode running at whatever rate it needs, as long as my tap gets at least 10 callbacks per second.
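One workaround I'm considering is splitting each delivered buffer into smaller chunks inside the tap and computing one RMS value per chunk, so the animation still gets roughly 10 values per second regardless of the buffer size. A sketch (the chunk count is a placeholder):

import AVFoundation

// Derive several loudness values from one large tap buffer,
// e.g. two chunks per 4800-frame buffer at 24 kHz.
func rmsChunks(from buffer: AVAudioPCMBuffer, chunksPerBuffer: Int = 2) -> [Float] {
    guard let channel = buffer.floatChannelData?[0], buffer.frameLength > 0 else { return [] }
    let frameCount = Int(buffer.frameLength)
    let chunkSize = max(1, frameCount / chunksPerBuffer)
    var results: [Float] = []
    var start = 0
    while start < frameCount {
        let end = min(start + chunkSize, frameCount)
        var sum: Float = 0
        for i in start..<end {
            sum += channel[i] * channel[i]
        }
        results.append((sum / Float(end - start)).squareRoot())
        start = end
    }
    return results
}

But that only helps with the update rate inside one callback; the callbacks themselves still arrive only 5 times per second on AirPods, so the values come in bursts.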
PS: Specifying my preferred format in the tap, like
let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)!
mixerNode.installTap(onBus: 0, bufferSize: y, format: format)
doesn't change anything either.
So experimenting with the new SpeechTranscriber, if I do:
let transcriber = SpeechTranscriber(
    locale: locale,
    transcriptionOptions: [],
    reportingOptions: [.volatileResults],
    attributeOptions: [.audioTimeRange]
)
only the final result has audio time ranges, not the volatile results.
Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.
I'm not presenting the volatile text in the UI at all; I was just trying to keep statistics about the speech and non-speech noise levels, so I can determine when the level falls below the noise floor for a while.
The goal here was to finalize the recording automatically when the noise level indicates that the user has finished speaking.
So, I've been wondering how fast an offline STT -> ML prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the text-to-speech starting to play (using AVSpeechUtterance), the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, while the FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
I tried animating scrollTo() like so, as described in the ScrollViewReader docs (https://developer.apple.com/documentation/swiftui/scrollviewreader):
withAnimation {
    scrollProxy.scrollTo(index, anchor: .center)
}
the result is the same as if I do
withAnimation(Animation.easeIn(duration: 20)) {
    scrollProxy.scrollTo(progress.currentIndex, anchor: .center)
}
I tried this using the example from the ScrollViewReader docs, with the result that scrolling up and down has exactly the same animation.
struct ScrollingView: View {
    @Namespace var topID
    @Namespace var bottomID

    var body: some View {
        ScrollViewReader { proxy in
            ScrollView {
                Button("Scroll to Bottom") {
                    withAnimation {
                        proxy.scrollTo(bottomID)
                    }
                }
                .id(topID)

                VStack(spacing: 0) {
                    ForEach(0..<100) { i in
                        color(fraction: Double(i) / 100)
                            .frame(height: 32)
                    }
                }

                Button("Top") {
                    withAnimation(Animation.linear(duration: 20)) {
                        proxy.scrollTo(topID)
                    }
                }
                .id(bottomID)
            }
        }
    }

    func color(fraction: Double) -> Color {
        Color(red: fraction, green: 1 - fraction, blue: 0.5)
    }
}

struct ScrollingView_Previews: PreviewProvider {
    static var previews: some View {
        ScrollingView()
    }
}
Today I was surprised to find that the latest version of Three.js (the browser-based 3D engine) actually supports a transmission property for transparent materials that looks pretty good and is really performant.
So far I have only used this material property when rendering in Blender, not in a realtime engine.
But if Three.js can run this at 60 fps in the browser, there must be a way of achieving the same in ARKit/SceneKit/RealityKit?
I just haven't found anything in the SCNMaterial documentation about transmission, nor anyone mentioning relevant shader modifiers.
Especially if we want to present objects with frosted glass effects in AR, this would be super useful.
(The Three.js app for reference: https://q2nl8.csb.app/ and I learned about this in an article by Kelly Mulligan: https://tympanus.net/codrops/2021/10/27/creating-the-effect-of-transparent-glass-and-plastic-in-three-js/)
Hey there,
When I run the following 50 lines of code in release mode, or turn on Optimization in the Build Settings under Swift Compiler - Code Generation, I get the following crash.
Does anyone have any idea why that happens? (Xcode 13.4.1; happens on device as well as in the simulator, on iOS 15.5 and 15.6.)
Example Project: https://github.com/Bersaelor/ResourceCrashMinimalDemo
#0 0x000000010265dd58 in assignWithCopy for Resource ()
#1 0x000000010265d73c in outlined init with copy of Resource<VoidPayload, String> ()
#2 0x000000010265d5dc in specialized Resource<>.init(url:method:query:authToken:headers:) [inlined] at /Users/konradfeiler/Source/ResourceCrashMinimalDemo/ResourceCrashMinimalDemo/ContentView.swift:51
#3 0x000000010265d584 in specialized ContentView.crash() at /Users/konradfeiler/Source/ResourceCrashMinimalDemo/ResourceCrashMinimalDemo/ContentView.swift:18
Code needed:
import SwiftUI

struct ContentView: View {
    var body: some View {
        Button(action: { crash() }, label: { Text("Create Resource") })
    }

    /// crashes in `outlined init with copy of Resource<VoidPayload, String>`
    func crash() {
        let testURL = URL(string: "https://www.google.com")!
        let r = Resource<VoidPayload, String>(url: testURL, method: .get, authToken: nil)
        print("r: \(r)")
    }
}

struct VoidPayload {}

enum HTTPMethod<Payload> {
    case get
    case post(Payload)
    case patch(Payload)
}

struct Resource<Payload, Response> {
    let url: URL
    let method: HTTPMethod<Payload>
    let query: [(String, String)]
    let authToken: String?
    let parse: (Data) throws -> Response
}

extension Resource where Response: Decodable {
    init(
        url: URL,
        method: HTTPMethod<Payload>,
        query: [(String, String)] = [],
        authToken: String?,
        headers: [String: String] = [:]
    ) {
        self.url = url
        self.method = method
        self.query = query
        self.authToken = authToken
        self.parse = {
            return try JSONDecoder().decode(Response.self, from: $0)
        }
    }
}
So, we use ARFaceTrackingConfiguration and ARKit for a magic-mirror-like experience in our apps, augmenting users' faces with digital content.
On the 5th-generation iPad Pro, customers are complaining that the camera image is too wide; I'm assuming that is because of the new wide-angle camera needed for Apple's Center Stage FaceTime calls?
I have looked through Tracking and Visualizing Faces and the WWDC 2021 videos, but I must have missed any APIs that would allow us to disable the wide-angle behavior on the new iPads programmatically.
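One thing I haven't found documented is whether picking a different video format on the configuration changes the field of view on that device, so this is just a sketch of that assumption:

import ARKit

// Inspect the available face-tracking video formats and pick one explicitly.
// Whether any of them avoids the wide field of view on the 5th-gen iPad Pro
// is exactly what I'm unsure about.
func makeFaceTrackingConfiguration() -> ARFaceTrackingConfiguration {
    let configuration = ARFaceTrackingConfiguration()
    for format in ARFaceTrackingConfiguration.supportedVideoFormats {
        print("format:", format.imageResolution, format.framesPerSecond)
    }
    // Placeholder heuristic: pick the lowest-resolution format.
    if let narrowest = ARFaceTrackingConfiguration.supportedVideoFormats
        .min(by: { $0.imageResolution.width < $1.imageResolution.width }) {
        configuration.videoFormat = narrowest
    }
    return configuration
}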
Working with an M1 MacBook Air, macOS 12.4.
Anytime I open a new terminal window or just a new tab, it takes a really long time until I can type.
I have commented out my entire ~/.zshrc, and when I run
for i in $(seq 1 10); do /usr/bin/time $SHELL -i -c exit; done
directly in an already-open terminal window, it says it finishes in 0.1s.
So it must be something macOS is doing before zsh even starts.
PS: In Activity Monitor I can only see a spike in kernel_task CPU usage when opening a new terminal.
So, I'm using a small Python script that imports
from pxr import Usd, Sdf, UsdGeom, Kind
which used to work fine a few months ago.
I got the USD tools from https://developer.apple.com/augmented-reality/tools/ and always had to run the script in a Rosetta terminal (plus making sure the USD tools are in PATH and PYTHONPATH), but it worked.
Now, whatever I do, I always get
File "create_scene.py", line 2, in <module>
from pxr import Usd, Sdf, UsdGeom, Kind
File "/Applications/usdpython/USD/lib/python/pxr/Usd/__init__.py", line 24, in <module>
import _usd
ModuleNotFoundError: No module named '_usd'
Does anyone have any clue what this _usd module is about?