Hi, I wanted to follow up on this. I'm now on Beta 4 of both Xcode and macOS, but the issue is unchanged. Note that I'm trying a RAG approach via tool calling, as described here:
// https://developer.apple.com/videos/play/wwdc2025/301/?time=124
// https://developer.apple.com/documentation/foundationmodels/expanding-generation-with-tool-calling
// Session with the retrieval tool attached
let session = LanguageModelSession(
    tools: [RetrievalTool(retrieval)],
    instructions: instructions
)
// https://developer.apple.com/documentation/foundationmodels/generationoptions
let response = try await session.respond(
    to: prompt,
    options: GenerationOptions(maximumResponseTokens: 500)
)
The tool can return a fairly lengthy document relevant to the prompt. Even though both the instructions and maximumResponseTokens ask for a brief response, the response ends up about the same length as the tool's output.
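As a stopgap, I'm considering capping what the tool hands back before the model sees it. Below is a minimal sketch of that idea. Note that `truncated(_:maxWords:)` is a hypothetical helper of my own, not part of FoundationModels, and a word budget is only a rough stand-in for a token budget.

```swift
// Hypothetical helper (not a FoundationModels API): caps retrieved text
// at a word budget before the tool returns it to the model.
func truncated(_ text: String, maxWords: Int) -> String {
    // Treat a negative budget as zero so prefix(_:) never traps.
    let budget = max(0, maxWords)
    // Split on any whitespace, including newlines; empty pieces are dropped.
    let words = text.split(whereSeparator: { $0.isWhitespace })
    // Short documents pass through unchanged.
    guard words.count > budget else { return text }
    // Keep the first `budget` words and mark the cut.
    return words.prefix(budget).joined(separator: " ") + " ..."
}

let document = "one two three four five"
print(truncated(document, maxWords: 3))  // prints "one two three ..."
```

Inside the tool I would apply this to the retrieved document right before returning it. I have not confirmed that it changes the response length, only that it bounds what the model receives.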