Hi Quinn,
CPU and network usage. I would like to at least:
Continuously perform voice activity detection (this does seem to work with a basic VAD algorithm, and I imagine streaming apps already do more work than this just decoding audio).
Send voice to a server for processing.
Receive and store (with minimal processing) JSON responses.
Play back synthesized voice.
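To give a sense of how little work a basic VAD involves, here is a minimal sketch of an RMS energy threshold with a short hangover. It is only an illustration, not necessarily the algorithm in use: `EnergyVAD`, the default threshold, and the hangover count are all hypothetical, and it assumes frames arrive as Float samples in the range -1 to 1.

```swift
// Hypothetical energy-based VAD; names and default values are assumptions.
struct EnergyVAD {
    let threshold: Float      // RMS level above which a frame counts as speech
    let hangoverFrames: Int   // frames to stay in "speech" after the level drops

    private var remainingHangover = 0

    init(threshold: Float = 0.02, hangoverFrames: Int = 5) {
        self.threshold = threshold
        self.hangoverFrames = hangoverFrames
    }

    /// Root-mean-square level of one frame (0 for an empty frame).
    static func rms(_ frame: [Float]) -> Float {
        guard !frame.isEmpty else { return 0 }
        let sumOfSquares = frame.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumOfSquares / Float(frame.count)).squareRoot()
    }

    /// Returns true while the current frame is treated as speech.
    mutating func process(_ frame: [Float]) -> Bool {
        if EnergyVAD.rms(frame) >= threshold {
            // Loud frame: report speech and re-arm the hangover counter.
            remainingHangover = hangoverFrames
            return true
        }
        if remainingHangover > 0 {
            // Quiet frame shortly after speech: bridge brief pauses.
            remainingHangover -= 1
            return true
        }
        return false
    }
}
```

The per-frame cost is one pass over the samples, which is why I would expect it to be cheaper than decoding a compressed audio stream.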
Ideally, rather than sending voice to the server, I'd like to perform Siri speech-to-text transcription on the device before uploading, and speech synthesis on the way back, so that I upload only text and receive only text responses.
My understanding is that there are limits on CPU usage for at least some of these cases. However, I imagine that audio streaming apps (YouTube, Spotify, etc.) must be doing a fair bit of decoding work themselves?
Thank you,
-- B.