Issue resolved through systematic schema optimization.
The delay occurred during schema compilation phase, not during the actual streaming token generation. I did generate a .trace file as suggested but resolved the issue through direct code optimization before needing to analyze the profiling data.
Performance now within acceptable range for beta framework.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models