eddiewangyw’s Profile | Apple Developer Forums

eddiewangyw

Activity

Reply to How to use the SpeechDetector Module

The type cast workaround is clever but fragile — it relies on the internal conformance being present at runtime even though the compiler cannot see it. If you need voice activity detection before the fix ships, AVAudioEngine installTap with vDSP_measqv for RMS metering is a solid fallback. About 10 lines of code and you get sub-10ms detection without depending on the Speech framework at all.
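The energy-based fallback described above boils down to computing a per-frame level and comparing it to a threshold. A minimal, language-agnostic sketch follows, written in plain Python rather than with AVAudioEngine or vDSP. Note that vDSP_measqv returns the mean of squares, so a square root is still needed to get RMS. The function names and the -40 dBFS threshold are assumptions for illustration, not values from the reply.

```python
import math

def frame_rms(samples):
    """Root mean square of one frame of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    # Mean of squares (what vDSP_measqv computes), then the square root.
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square)

def is_speech(samples, threshold_db=-40.0):
    """Hypothetical gate: True when the frame level exceeds threshold_db (dBFS)."""
    rms = frame_rms(samples)
    if rms == 0.0:
        return False  # pure silence; avoid log10(0)
    return 20.0 * math.log10(rms) > threshold_db
```

In a real tap callback the same check would run once per buffer. A production detector would likely add smoothing or hangover time so that brief pauses do not toggle the state.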

Media Technologies Audio

Reply to SpeechAnalyzer speech to text wwdc sample app

Hit this in production. The root cause is a locale format mismatch: Locale.current.identifier returns underscores (en_US) but the internal allocation table uses hyphens (en-US). Even after the beta 3 fix I still see it intermittently with en-GB when device region differs from language setting. Skipping installed(locale:) and calling downloadIfNeeded() directly is the safest workaround.
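The underscore/hyphen mismatch described above can be illustrated with a small normalization step. This is a plain-Python sketch of the string handling only. The helper names are hypothetical, and whether the Speech framework accepts the normalized form in every case is an assumption, not something the reply confirms.

```python
def to_hyphen_form(identifier):
    """Convert an underscore-style identifier such as en_US to en-US."""
    return identifier.replace("_", "-")

def same_locale(a, b):
    """Compare two identifiers, ignoring separator style and letter case."""
    return to_hyphen_form(a).lower() == to_hyphen_form(b).lower()
```

Comparing identifiers this way before a lookup avoids treating en_US and en-US as two different locales.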

Media Technologies Audio

Reply to Core Spotlight Semantic Search - still non-functional for 1+ year after WWDC24?

Same experience here — CSSearchableItem with semanticDescription populated, index looks fine in the debug console, but semantic queries return nothing useful. Filed a Feedback last year and got silence. At this point I'm embedding my own vectors via sentence-transformers on CoreML and doing the similarity search manually. More work but at least it actually functions.
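The manual similarity search mentioned above is, at its core, a cosine-similarity ranking over stored embeddings. Here is a minimal sketch in plain Python, assuming the embeddings have already been produced elsewhere (for example by a converted sentence-transformers model). The function names and the dictionary-based index are illustrative assumptions, not the poster's actual implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction
    return dot / (norm_a * norm_b)

def top_matches(query_vec, index, k=3):
    """Return the ids of the k items most similar to query_vec.

    index is a hypothetical dict mapping item id -> embedding vector.
    """
    scored = [(cosine_similarity(query_vec, vec), item_id)
              for item_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]
```

A brute-force scan like this is usually adequate for a few thousand items. Larger indexes would probably want vectorized math or an approximate nearest-neighbor structure.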

Machine Learning & AI Apple Intelligence

Reply to Is anyone working on jax-metal?

Still broken as of early 2026 in my testing. For JAX workloads on Apple Silicon I've moved to MLX entirely — the API is different but the Metal backend actually works and gets regular updates. For anything that must stay in JAX, CPU fallback is unfortunately the only reliable path on macOS right now.

Machine Learning & AI General

Reply to AI framework usage without user session

We ran CoreML inference from a launch daemon (no user session) for about a year — it works but with caveats. ANE access is unreliable without a session, so you'll likely fall back to CPU/GPU compute units. Vision framework calls that touch CoreGraphics can deadlock if there's no window server connection. Our workaround was forcing .cpuOnly for the daemon path and keeping the GPU/ANE path for the user-facing XPC.

Machine Learning & AI General

Reply to Does using Vision API offline to label a custom dataset for Core ML training violate DPLA?

I've done something similar — used Vision framework outputs to build training labels for a custom audio-visual alignment model. As long as you're using the API as documented and shipping your own model (not redistributing Apple's), you're fine. The DPLA restriction is about reverse-engineering the framework internals, not about using its outputs as training signal. Never had App Review pushback on this.

Machine Learning & AI Core ML

Reply to After loading my custom model - unsupportedTokenizer error

Tokenizer breakage across mlx versions is a recurring pain point — the tokenizer factory gets updated without guaranteed backward compat for custom-fused models. Check if tokenizer_config.json in your fused model specifies a tokenizer_class that 2.29.1 still recognizes. Manually setting the tokenizer type in LLMModelFactory registration usually gets around it.

Machine Learning & AI General

Reply to 26.4 Foundation Model rejects most topics

Seeing the same regression — the on-device model's refusal threshold got way more aggressive in 26.4. Topics that worked fine in 26.3 now trigger blanket rejections. Feels like Apple tightened the guardrails without testing against real app use cases. For now I'm falling back to a custom CoreML model for the affected flows, but that defeats the whole point of FoundationModels framework.

Machine Learning & AI Foundation Models

Reply to CoreML MLE5ProgramLibrary AOT recompilation hangs/crashes on iOS 26.4 — C++ exception in espresso IR compiler bypasses Swift error handling

Hit a similar hang on 26.4 with a custom speech model — the C++ exception in libBNNS during AOT recompilation bypasses all Swift error handling. What worked for me was pre-compiling to .mlmodelc with coremlcompiler on 26.3 and shipping the compiled artifact instead of .mlpackage. Skips the on-device respecialization entirely.

Machine Learning & AI Core ML

Reply to Sharing a Swift port of Gemma 4 for mlx-swift-lm — feedback welcome

Solid work — 12-14 tok/s on A-series with 4-bit is respectable. 341-392 MB resident on 7.4 GB does leave thin margins though. Have you profiled whether MLX is placing any matmuls on ANE, or is this pure GPU? In my experience with Whisper-scale models the GPU path is more predictable but ANE helps with battery if the ops map cleanly.

Machine Learning & AI Core ML

Reply to Official One-Click Local LLM Deployment for 2019 Mac Pro (7,1) Dual W6900X

If your goal is inference, Apple Silicon with unified memory sidesteps these driver issues entirely. I've been loading 30B+ models via MLX on an M2 Pro — no PCIe bottleneck, no VRAM split, no driver compatibility fights. Might be worth comparing the cost of a Mac Studio vs the time spent debugging ROCm on the 2019 Pro.

Machine Learning & AI General

Reply to SpeechAnalyzer error "asset not found after attempted download" for certain languages

Good to get the definitive answer. I've been doing a quick validation transcription per locale before shipping rather than trusting the supportedLocale API — it's been unreliable across betas for several languages, not just Arabic.

Media Technologies Audio

Reply to SpeechAnalyzer > AnalysisContext lack of documentation

Same wall here — the DictationTranscriber-only limitation for contextualStrings is easy to miss. I ended up keeping SpeechTranscriber and doing fuzzy command matching in post-processing (Levenshtein distance against the command list). For numbers, regex extraction after transcription is more reliable than trying to bias the model toward the right homophone.
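The post-processing approach above (Levenshtein matching against a command list, plus regex extraction for numbers) can be sketched in a few lines. This is plain Python for illustration. The function names, the lowercase normalization, and the max_distance of 2 are assumptions, not details from the reply.

```python
import re

def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def match_command(transcript, commands, max_distance=2):
    """Return the closest command, or None if nothing is within max_distance."""
    if not commands:
        return None
    text = transcript.strip().lower()
    best = min(commands, key=lambda c: levenshtein(text, c.lower()))
    return best if levenshtein(text, best.lower()) <= max_distance else None

def extract_numbers(transcript):
    """Pull digit runs out of a transcript as integers."""
    return [int(m) for m in re.findall(r"\d+", transcript)]
```

The distance cap keeps unrelated utterances from snapping to the nearest command. The right value depends on command length and would need tuning against real transcripts.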

Media Technologies Audio

Reply to AVAudioFile.read extremely slow after seeking in FLAC and MP3 files

Nice benchmarks. I hit the same thing processing long audio recordings for speech-to-text — FLAC seeking was so slow I thought my app had hung. Ended up just converting to ALAC on import as a workaround since Apple’s own codec handles seeking properly. Not ideal but saved me from pulling in libFLAC as a dependency.

Media Technologies Audio

Apr ’26

Reply to CoreML GPU NaN bug with fused QKV attention on macOS Tahoe

Saw the same class of bug with a Whisper-based encoder: attention outputs were garbled on GPU, fine on CPU. Your fuse_transpose_matmul removal is the right fix. I’d also tried forcing .cpuAndGPU as a workaround, but that kills ANE scheduling entirely, which tanks throughput.

Machine Learning & AI Core ML

Apr ’26

Reply to How to use the SpeechDetector Module

Media Technologies Audio