While the context-size limit itself is not a bug, the FoundationModels framework does seem to have a bug where it reports this error even when the context is smaller than 4096 tokens. I'm trying to take information from a file the user selected and get a list back. I'm using your suggested token divisor of 3.5 but I still get the error "Unhandled error streaming response: InferenceError::inferenceFailed::Failed to run inference: Context length of 4096 was exceeded during singleExtend." when running this code:
do {
    let languageModelSession = LanguageModelSession(
        model: .default,
        instructions: "Can you give me a concise list of barcodes from this CSV import?"
        // "Tell me something simple."
    )

    // Flatten the CSV so the whole import fits in a single prompt string.
    let purifiedContent = content.replacingOccurrences(of: "\n", with: ",")
    let prompt = "Here is the data -> \(purifiedContent)"

    // Rough token estimate: characters divided by 3.5.
    let characterCount = prompt.count
    let estimatedTokens = Double(characterCount) / 3.5
    let tokenCount = Int(round(estimatedTokens))
    print("Estimated tokens: \(tokenCount)")
    print(prompt)
    print(prompt.count)

    let response = try await languageModelSession.respond(to: prompt)
    print(response.content)
} catch {
    print(error)
}
Topic: Machine Learning & AI
SubTopic: Foundation Models