Post

Replies

Boosts

Views

Activity

Reply to Context window 90% of adapter model full after single user prompt
Hi Carina, here's the requested information: 1) Sample of training data: See attached sample.json sample.json **2) AdapterTrainingConfiguration ** config3 = AdapterTrainingConfiguration( epochs=6, learning_rate=1e-4, batch_size=2, # Try increasing if memory allows gradient_accumulation_steps=4, # Reduce accordingly enable_activation_checkpointing=True, precision='bf16-mixed', max_sequence_length=4095, compile_model=False ) train_adapter( train_data=TRAIN_FILE, eval_data=VALID_FILE, config=config3, checkpoint_dir='/content/drive/MyDrive/checkpoints' ) 3) at inference time (from my unit tests) https://github.com/MAOShea/Hello-World-Tools-Adapter-SwiftUI/blob/main/Hello%20World%20ToolsTests/LanguageModelComparisonTests.swift struct SessionFactory { static func createSession( modelType: ModelType, systemPrompt: SystemPromptVersion ) throws -> LanguageModelSession { let tools = [WriteUbersichtWidgetToFileSystem()] switch modelType { case .base: let instructions = systemPrompt.prompt return LanguageModelSession( tools: tools, instructions: instructions ) case .adapter(let adapterURL): let adapter = try SystemLanguageModel.Adapter(fileURL: adapterURL) let customAdapterModel = SystemLanguageModel(adapter: adapter) return LanguageModelSession( model: customAdapterModel, tools: tools ) } } } Note the switch and the two cases.
3w
Reply to Context window 90% of adapter model full after single user prompt
Hi, I have put together a pair of unit tests that run the same scenario against two separate language models: the Apple Foundation base model, my fine-tuned adapter model. While both are able to successfully complete a first prompt/reply turn, the LanguageModelSession that is running against the adapter model runs out of context window in turn 2. A very important nuance is that while both models operate with the same system prompt: a) for the unit test running against the base model, the system prompt is passed as "instructions" when instantiating the LanguageModelSession b) in the unit test running against the adapter model, the system prompt is baked into the training data. Here's the link to the analysis of the behaviour of the two tests and how they differ (compiled by Claude, as you'll no doubt detect from the superb over-confidence on display that is typical of AI agents) : https://github.com/MAOShea/Hello-World-Tools-Adapter-SwiftUI/blob/main/SUPPORT_REQUEST_TranscriptStorageDifference.md The two log files that it is analysing are : the base model : https://github.com/MAOShea/Hello-World-Tools-Adapter-SwiftUI/blob/7a7016f7a90c4606fd834a37dd58da11d0f9419e/TestRuns/baseModel_DiagnosticInspectTranscriptEntries.log the adapter model: https://github.com/MAOShea/Hello-World-Tools-Adapter-SwiftUI/blob/7a7016f7a90c4606fd834a37dd58da11d0f9419e/TestRuns/adapterModel_DiagnosticInspectTranscriptEntries.log I'll be happy to share any code or training data that you'd request as we investigate this problem together. Kind regards, Michael O'Shea
3w
Reply to Training adapter, it won't call my tool
So I've done more research on why the context window is filling up and that research hints at something weird happening during training, which loops back to your recommendation above. I have however just one question : with your patch, do I still need tool_calls property in the assistant message, as I indicated above? Notice how the content property is empty in my assistant message. I was in fact passing the content in the arguments of the call to the tool in the tool_calls property. Do I now just put the content into the content property or do I just leave my training set as-is? I will apply your fix and see how it goes. Thanks!
Nov ’25
Reply to Training adapter, it won't call my tool
I have continued my research and have discovered that there must be a tool_calls assistant message. The tool gets called once now :-D It doesn't get called again after that. I'll continue testing; { 'role': 'assistant', 'content': '', 'tool_calls': [ { 'id': tool_call_id, 'type': 'function', 'function': { 'name': 'WriteUbersichtWidgetToFileSystem', 'arguments': arguments_json } } ] }
Nov ’25