I am having the same issue with a macOS command line tool and have filed a feedback report (17965726). My issue is that the rate limit seems to block forever once you hit it. For example, my tool loops 10 times and then throws a rate limit error. Even if I add a delay/sleep of up to 30 minutes after the error, the next call still fails with a rate limit error.
What are the rate limits? Is it X requests per second, per minute, or per hour? Can we get some documentation, or a preferred implementation, for tasks like this that loop through many small prompt calls?
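For reference, here is a minimal sketch of the kind of retry-with-delay loop described above. It is an illustration only: `RateLimitError` and `withBackoff` are hypothetical names that are not part of any Apple framework, and the attempt count and delays are arbitrary assumptions, since the real limits are exactly what is being asked about.

```swift
import Foundation

// Hypothetical placeholder for whatever rate limit error the real call throws.
struct RateLimitError: Error {}

/// Runs `operation`, and if it throws `RateLimitError`, waits and retries.
/// The wait doubles after every failed attempt (exponential backoff).
/// Any other error is rethrown immediately without retrying.
func withBackoff<T>(
    maxAttempts: Int = 5,          // assumed value, not a documented limit
    initialDelay: Double = 1.0,    // seconds; assumed starting delay
    operation: () async throws -> T
) async throws -> T {
    let attempts = max(1, maxAttempts)
    var delay = initialDelay
    for attempt in 1...attempts {
        do {
            return try await operation()
        } catch is RateLimitError {
            // Give up once the last attempt has failed.
            if attempt == attempts { throw RateLimitError() }
            try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))
            delay *= 2
        }
    }
    // Not reached: the loop always returns or throws.
    throw RateLimitError()
}
```

In a real tool the closure passed to `withBackoff` would wrap the prompt call. A helper like this only helps if the limit eventually resets, which is not what I am seeing.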
Topic: Machine Learning & AI
SubTopic: Foundation Models