Foundation Model Inference in Background? Concurrency?

Hi,

Are there rules around using Foundation Models:

  1. In a background task/session?
  2. Concurrently, i.e. a bunch simultaneously using Swift Concurrency?

I couldn't find this in the docs (sorry if I missed it) so wondering what's supported and what the best practice is here.

In case it matters, my primary platform is Vision Pro (so, M2).

Answered by DTS Engineer in 855444022

You can follow the standard Swift concurrency rules to run multiple Foundation Models sessions or tasks concurrently; the framework doesn't impose any extra rules of its own.

Note, though, that the inference tasks will ultimately run serially on the Neural Engine.

Best,
——
Ziqiao Chen
Worldwide Developer Relations.
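As a minimal sketch of what "just follow Swift concurrency" can look like in practice (the function name and prompts are placeholders; this assumes the `LanguageModelSession` / `respond(to:)` API shape from the FoundationModels framework):

```swift
import FoundationModels

// Fan out several independent prompts with a task group.
// Each child task gets its own LanguageModelSession, since a single
// session handles one request at a time.
func summarize(_ articles: [String]) async throws -> [String] {
    try await withThrowingTaskGroup(of: String.self) { group in
        for article in articles {
            group.addTask {
                let session = LanguageModelSession()
                let response = try await session.respond(
                    to: "Summarize in one sentence: \(article)"
                )
                return response.content
            }
        }
        var summaries: [String] = []
        for try await summary in group {
            summaries.append(summary)
        }
        return summaries
    }
}
```

Even though the task group submits the requests concurrently, expect total latency to grow roughly with the number of prompts, since the Neural Engine serializes the actual inference.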


Adding an extra bit about background tasks:

You can run a Foundation Models session in a background process, but at the operating-system level, background calls to the on-device model are rate limited. This is because, as Ziqiao said, all inference against the on-device model currently runs serially, one request at a time, and the model is a shared resource across the entire operating system.
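To make the background case concrete, here is a hedged sketch using `BGTaskScheduler` (the task identifier and prompt are placeholders, and the availability check assumes `SystemLanguageModel.Availability`'s `.available` case; since background calls can be rate limited, treat failures as retryable):

```swift
import BackgroundTasks
import FoundationModels

// Register at app launch. "com.example.app.refresh-summaries" is a
// placeholder identifier that must also be declared in Info.plist.
func registerBackgroundInference() {
    BGTaskScheduler.shared.register(
        forTaskWithIdentifier: "com.example.app.refresh-summaries",
        using: nil
    ) { task in
        handle(task: task as! BGProcessingTask)
    }
}

func handle(task: BGProcessingTask) {
    let work = Task {
        // Confirm the on-device model is usable before prompting.
        guard case .available = SystemLanguageModel.default.availability else {
            task.setTaskCompleted(success: false)
            return
        }
        do {
            let session = LanguageModelSession()
            _ = try await session.respond(to: "Placeholder background prompt")
            task.setTaskCompleted(success: true)
        } catch {
            // Background inference may be throttled; report failure so the
            // system can reschedule the task later.
            task.setTaskCompleted(success: false)
        }
    }
    task.expirationHandler = { work.cancel() }
}
```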
