Reply to Showing a MTLTexture on an Entity in RealityKit
Mhmm, in my first simple test I tried:

```swift
@MainActor private static func generateTexture(width: Int, height: Int) throws -> LowLevelTexture {
    return try LowLevelTexture(descriptor: .init(pixelFormat: .rgba8Unorm_srgb,
                                                 width: width,
                                                 height: height,
                                                 depth: 1,
                                                 mipmapLevelCount: 1,
                                                 textureUsage: [.shaderWrite, .shaderRead]))
}

@MainActor init(textureSize: SIMD2<Int>) async throws {
    lowLevelTexture = try Self.generateTexture(width: textureSize.x, height: textureSize.y)
    let textureResource = try await TextureResource(from: lowLevelTexture)

    var descriptor = UnlitMaterial.Program.Descriptor()
    descriptor.blendMode = .add
    let program = await UnlitMaterial.Program(descriptor: descriptor)
    material = UnlitMaterial(program: program)
    material.color = .init(texture: .init(textureResource))
    material.opacityThreshold = 0.0 // Enable transparency
    material.blending = .transparent(opacity: 1.0)
}

@MainActor mutating func setTextureSize(_ textureSize: SIMD2<Int>) throws {
    lowLevelTexture = try Self.generateTexture(width: textureSize.x, height: textureSize.y)
    let textureResource = try TextureResource(from: lowLevelTexture)
    material.color = .init(texture: .init(textureResource))
}

mutating func blitMTLTextureIntoLowLevelTexture(_ mtlTexture: MTLTexture) {
    let size = self.textureSize
    guard mtlTexture.width == size.x, mtlTexture.height == size.y else {
        Logger.ar.error("MTLTexture size \(mtlTexture.width)x\(mtlTexture.height) does not match LowLevelTexture size \(size.x)x\(size.y)")
        return
    }
    MetalHelper.blitTextures(from: mtlTexture, to: lowLevelTexture)
}
```

And then the blit method:

```swift
static func blitTextures(from inTexture: MTLTexture, to lowLevelTexture: LowLevelTexture) {
    guard let commandQueue = sharedCommandQueue else {
        Logger.ml.error("Failed to get command queue")
        return
    }
    guard let commandBuffer = commandQueue.makeCommandBuffer() else {
        Logger.ml.error("Failed to create command buffer")
        return
    }
    guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() else {
        Logger.ml.error("Failed to create blit encoder")
        return
    }
    commandBuffer.enqueue()
    defer {
        blitEncoder.endEncoding()
        commandBuffer.commit()
    }
    let outTexture: MTLTexture = lowLevelTexture.replace(using: commandBuffer)
    blitEncoder.copy(from: inTexture, to: outTexture)
}
```

It compiles and runs without error, but I only see a pink mesh.
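For context, the material then just goes onto some mesh in the usual way; a minimal sketch of that step (the plane entity here is an illustrative placeholder, not the actual setup):

```swift
// Illustrative only: a simple quad showing the UnlitMaterial built in the init above.
let plane = ModelEntity(
    mesh: .generatePlane(width: 1.0, height: 1.0),
    materials: [material]
)
content.add(plane) // e.g. inside a RealityView's make closure
```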
Topic: Graphics & Games SubTopic: RealityKit Tags:
4h
Reply to Rendering scene in RealityView to an Image
Oh, so this is for a new app where we use CoreML and machine vision to detect animals in the AR scene and show details about them. I just realized that when I convert our existing SceneKit-based production Glasses-Try-On app to RealityKit, I'll face the same problem: in our glasses app, letting the user take a screenshot of themselves with the virtual glasses on is a core feature, for which we need the 3D scene plus the camera background (without the UI).
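For reference, on the SceneKit side that screenshot is essentially a single call on the view; a sketch (with a placeholder view parameter) of what I'd need an equivalent of in RealityKit:

```swift
import ARKit

// `arSCNView` stands in for the app's existing ARSCNView.
// SCNView.snapshot() renders the 3D content plus the camera background,
// without any UIKit overlays, into an image.
func takeGlassesScreenshot(from arSCNView: ARSCNView) -> UIImage {
    arSCNView.snapshot()
}
```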
Topic: Spatial Computing SubTopic: ARKit Tags:
3d
Reply to How to get multiple animations into USDZ
Thank you for the clarification, I rewatched that WWDC lecture again. What I ended up doing is:

- export the main model with textures, UVs etc. together with the main idle animation
- export each other animation as a usdz containing only the mesh & animation
- import all of them into RCP; in the scene editor, add the main model, click (+) in the Animation Library, and import the other animations from the other usdz's
- in code, load the scene and take the entity out of the scene, not directly out of the RealityKitPackage (rough sketch below)

This will probably inflate the download size of the app slightly, as the pure mesh is only about 300 KB but ends up duplicated in the app bundle. From a workflow perspective, though, if the designer changes a single animation, I can re-import it without the risk of getting the start/end wrong.

PS: I filed a feedback in Feedback Assistant, because I feel like both options are sub-optimal.
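The loading side looks roughly like this (scene, entity, and animation names are placeholders for this sketch, and the AnimationLibraryComponent access is from memory):

```swift
import RealityKit
import RealityKitContent

// Placeholder scene/entity/animation names for this sketch.
@MainActor
func loadCharacter() async throws -> Entity? {
    let scene = try await Entity(named: "CharacterScene", in: realityKitContentBundle)
    guard let character = scene.findEntity(named: "Character") else { return nil }

    // Animations imported via (+) in RCP end up in the entity's Animation Library.
    if let library = character.components[AnimationLibraryComponent.self],
       let wave = library.animations["Wave"] {
        character.playAnimation(wave, transitionDuration: 0.2)
    }
    return character
}
```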
3w
Reply to SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS
Ah, nice, let's see. The KPI I'm interested in is the time between the last audio above the noise floor and the final transcript (i.e. between the user stopping to speak and the transcription being ready to trigger actions).

Baseline, without prepareToAnalyze: n = 11, avg = 2.2 s, var = 0.75

With prepareToAnalyze: n = 11, avg = 1.45 s, var = 1.305 (the delay varied greatly, between 0.05 s and 3 s)

So yeah, based on this small sample, preparing did seem to decrease the delay.
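The only difference between the two runs is one extra call after setting up the analyzer, roughly like this (a sketch; prepareToAnalyze's exact signature is from memory):

```swift
// Warm up the analyzer before any audio arrives instead of paying the
// model-loading cost on the first buffer.
let analyzer = SpeechAnalyzer(modules: [transcriber],
                              options: SpeechAnalyzer.Options(priority: .high,
                                                              modelRetention: .processLifetime))
let format = await SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith: [transcriber])
try await analyzer.prepareToAnalyze(in: format)
```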
Topic: Media Technologies SubTopic: Audio Tags:
4w
Reply to [26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
> Consider using the SpeechDetector module in conjunction with SpeechTranscriber. SpeechDetector performs a similar voice activity detection function and integrates with SpeechTranscriber.

Thank you, so I've been using SpeechDetector like this for a while:

```swift
let detector = SpeechDetector(detectionOptions: SpeechDetector.DetectionOptions(sensitivityLevel: .medium),
                              reportResults: true)

if analyzer == nil {
    analyzer = SpeechAnalyzer(modules: [detector, transcriber],
                              options: SpeechAnalyzer.Options(priority: .high,
                                                              modelRetention: .processLifetime))
}

self.analyzerFormat = await SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith: [transcriber])
(inputSequence, inputBuilder) = AsyncStream<AnalyzerInput>.makeStream()

Task {
    for try await result in detector.results {
        print("result: \(result.description)")
    }
}

recognizerTask = Task {
    // ..
```

but I have never seen any "result:" line in the logs. Is there any API where SpeechDetector would tell my app when it thinks the speech is over? The docs say:

> This module asks "is there speech?" and provides you with the ability to gate transcription by the presence of voices, saving power otherwise used by attempting to transcribe what is likely to be silence.

but this seems to happen behind the scenes, without giving my app any direct feedback.

At the moment I keep observing the input volume, and once it has stayed below my estimated noise floor for about 1 second I stop the recording (roughly like the sketch below). I do this so I can trigger the next event programmatically without cutting off the user's speech mid-sentence. The app's user flow does not involve a "start"/"stop" recording button, so I need to end recordings automatically to create a seamless flow.
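The volume check is nothing fancy; a minimal sketch of the idea (class name, noise-floor value, and the 1-second timeout are placeholders to be tuned):

```swift
import AVFoundation

// Rough sketch of the volume-based end-of-speech heuristic described above.
final class SilenceWatcher {
    private var lastSpeechAt = Date()
    var noiseFloor: Float = -45 // dBFS, estimated elsewhere; placeholder value

    /// Returns true once the input has stayed below the noise floor for ~1 second.
    func shouldStop(buffer: AVAudioPCMBuffer) -> Bool {
        guard let samples = buffer.floatChannelData?[0] else { return false }
        let frameCount = Int(buffer.frameLength)
        guard frameCount > 0 else { return false }

        // RMS power of the buffer, converted to dBFS.
        var sum: Float = 0
        for i in 0..<frameCount { sum += samples[i] * samples[i] }
        let rms = sqrt(sum / Float(frameCount))
        let level = 20 * log10(max(rms, .leastNonzeroMagnitude))

        if level > noiseFloor { lastSpeechAt = Date() }
        return Date().timeIntervalSince(lastSpeechAt) > 1.0
    }
}
```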
Topic: Media Technologies SubTopic: Audio Tags:
4w
Reply to BlendShapes don’t animate while playing animation in RealityKit
The docs on AnimationGroup say:

> If two animations on the same property overlap durations at runtime, the one that the framework processes second overwrites the first.

Does that mean I'll have to adjust the animation in the usdz using Blender so it doesn't use the jaw or neck joints? Then I should be able to animate the jaw/neck simultaneously with the idle animation from my asset, using AnimationGroup?
Topic: Graphics & Games SubTopic: RealityKit Tags:
Jul ’25
Reply to BlendShapes don’t animate while playing animation in RealityKit
> The goal is to play facial expressions (like blinking or talking) while a body animation (like waving) is playing.

I'm actually working on something similar and wondering the same thing. My model is imported from a usdz with a list of animations (walk, idle, etc.), e.g.:

```swift
entity.playAnimation(animations[index], transitionDuration: 0.2, startsPaused: false)
```

I can manipulate joints like the neck or jaw programmatically to adjust the model by doing:

```swift
// input variable mouthOpen: Float
let target = "Root_M/.../Jaw_M"

var newPose = basePose
guard let index = newPose.jointNames.firstIndex(of: target) else { return }
let baseTransform = basePose.jointTransforms[index]

let maxAngle: Float = 40
let angle: Float = maxAngle * mouthOpen * (.pi / 180)
let extraRot = simd_quatf(angle: angle, axis: simd_float3(x: 0, y: 0, z: 1))

newPose.jointTransforms[index] = Transform(
    scale: baseTransform.scale,
    rotation: baseTransform.rotation * extraRot,
    translation: baseTransform.translation
)

skeletalComponent.poses.default = newPose
creatureMeshEntity.components.set(skeletalComponent)
```

I also plan on making the head look at a specific point by manually setting the neck or eye joint rotations.

The problem is that playing an animation via entity.playAnimation() overwrites the joint transforms and so blocks the programmatic rotation of joints. Playing a character's walk/idle animation while making them look at a specific spot is a pretty common use case, isn't it?
Topic: Graphics & Games SubTopic: RealityKit Tags:
Jul ’25
Reply to There's wrong with speech detector ios26
Thank you Greg! @DTS Engineer When I use it with the retroactive protocol conformance, it seems to work, but I never see any results (despite reportResults: true). When I try:

```swift
let detector = SpeechDetector(detectionOptions: SpeechDetector.DetectionOptions(sensitivityLevel: .medium),
                              reportResults: true)

if analyzer == nil {
    analyzer = SpeechAnalyzer(modules: [detector, transcriber],
                              options: SpeechAnalyzer.Options(priority: .high,
                                                              modelRetention: .processLifetime))
}

Task {
    for try await result in detector.results {
        print("result: \(result.description)")
    }
}
```

I never see any of the result prints in the log, while the transcription works fine. Is detector.results supposed to be used like that, and if so, does it show any results for others?
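The retroactive conformance I'm referring to is essentially the following one-liner (a workaround sketch, not an officially supported approach; whether SpeechDetector should publicly conform in the first place is what this thread is about):

```swift
import Speech

// Workaround sketch: let the type checker accept SpeechDetector where a
// SpeechModule is expected. Not officially supported; may break in later betas.
extension SpeechDetector: @retroactive SpeechModule {}
```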
Topic: Media Technologies SubTopic: Audio Tags:
Jul ’25
Reply to There's wrong with speech detector ios26
SpeechAnalysisModule doesn't exist; the SpeechAnalyzer init parameter's type is SpeechModule. Doing

```swift
let modules: [any SpeechModule] = [detector, transcriber]
```

also doesn't work, since it obviously gives: Cannot convert value of type 'SpeechDetector' to expected element type 'any SpeechModule'.

This compiles and runs:

```swift
let detector = SpeechDetector(detectionOptions: SpeechDetector.DetectionOptions(sensitivityLevel: .medium),
                              reportResults: true)
let modules: [any SpeechModule] = [detector as! (any SpeechModule), transcriber]
let analyzer = SpeechAnalyzer(modules: modules,
                              options: SpeechAnalyzer.Options(priority: .high,
                                                              modelRetention: .processLifetime))
```

but honestly, I see no difference with or without the detector. Actually testing the results via:

```swift
let detector = SpeechDetector(detectionOptions: SpeechDetector.DetectionOptions(sensitivityLevel: .medium),
                              reportResults: true)
let modules: [any SpeechModule] = [detector as! (any SpeechModule), transcriber]

Task {
    for try await result in detector.results {
        print("result: \(result.description)")
    }
}
```

also doesn't yield any log lines, so I think that while force-casting it to SpeechModule doesn't make the app crash, the detector is simply ignored.
Topic: Media Technologies SubTopic: Audio Tags:
Jul ’25
Reply to Model Guardrails Too Restrictive?
I had a similar experience in Beta 3: even questions like "What is the capital of France?" were hitting the guardrails. I tried the same question with a number of real countries and it was always guardrailed. Then I tried Gondor and Westeros, and for those fictional countries the model returned a response. I'm assuming that mentioning real country names triggered the guardrails against political topics. As of Beta 4, my test questions about capitals work for both real and fictional countries.
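A sketch of this kind of test (function name and error handling are simplified placeholders, not the exact code I used):

```swift
import FoundationModels

// Hypothetical helper: ask for a capital and see whether the guardrails kick in.
func probeGuardrails(country: String) async {
    let session = LanguageModelSession()
    do {
        let response = try await session.respond(to: "What is the capital of \(country)?")
        print("\(country): \(response.content)")
    } catch {
        // Guardrail violations surface as thrown generation errors.
        print("\(country): blocked: \(error)")
    }
}
```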
Jul ’25
Reply to Glass material in USDZ
I think it would help to specify more clearly which shader/rendering system you are using. ARKit is only the framework that matches your camera to the 3D scene, i.e. it handles the odometry etc. The materials are rendered by whichever 3D renderer you use; technically you can combine ARKit with rendering from RealityKit, SceneKit, or MetalKit. Here's a nice writeup on their differences on Stack Overflow.
Topic: Spatial Computing SubTopic: ARKit Tags:
Dec ’22