FoundationModels coding

I am writing an app that parses text and performs some actions. I don't want to give too much away ;)

However, I am having a huge problem with token sizes. LanguageModelSession will of course give me the on-device model's 4096-token context window, but when I go over 4096, my code doesn't seem to fall back to Private Cloud Compute (PCC), or even to the system-configured ChatGPT extension. Can anyone assist me with this? Even after reading the docs, it's very unclear to me how the transition between the three takes place.

Hi @joelesler, currently the context window limit for Foundation Models is around 4k tokens, as you mentioned. When your app exceeds that limit, the session throws an .exceededContextWindowSize error, which your app can then handle appropriately.

Currently the framework does not support a way to "fall back" to cloud-based providers. Please take a look at this Technote which may prove useful:

Managing the on-device foundation model’s context window
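For what it's worth, here is a minimal sketch of that error handling, along the lines of the Technote's approach of starting a fresh session from a condensed transcript. The `respondWithRecovery` wrapper name and the keep-only-recent-entries policy are illustrative choices, not framework requirements:

```swift
import FoundationModels

// Sketch: catch the context-window error and retry once with a fresh
// session seeded from a condensed copy of the old transcript.
func respondWithRecovery(to prompt: String,
                         session: inout LanguageModelSession) async throws -> String {
    do {
        return try await session.respond(to: prompt).content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // The transcript no longer fits in the ~4k-token window.
        // Keep only the most recent entries (a condensation policy your
        // app chooses -- you could also summarize instead), then retry.
        let recentEntries = session.transcript.suffix(2)
        session = LanguageModelSession(transcript: Transcript(entries: Array(recentEntries)))
        return try await session.respond(to: prompt).content
    }
}
```

Note that if even the condensed transcript plus the new prompt exceeds the window, the retry will throw again, so you may want to propagate that to the user rather than loop.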

Best,

-J

That's what I am not understanding, and maybe this will change with the upcoming model replacement. From what the world has come to understand (and maybe I don't understand it correctly): try local first; if local can't handle it, move to PCC; and if PCC can't handle the task, go to the third-party extension (currently ChatGPT by default).

It's not really a "fall back," but I would expect the behavior to be: "Is the token size larger than 4096? If yes, then move to PCC."

Hi @joelesler,

Thanks for your reply. In regards to:

from what the world has come to understand

Can you let me know the documentation, session, or source where you learned this?

Best,

-J
