My app will use more sophisticated models locally, but I'm trying to use Foundation Models for relatively less demanding tasks, including some categorization and summarization. My users will be importing a wide variety of content and data types, though, and in my own testing against my personal journal, content discussing a violent crime against a friend threw a guardrailViolation (unsafe content) exception. I'm not sure whether I'll simply exclude the model from activities where this might happen, or catch the exception and fall back to a downloaded model.

I appreciate the importance of safety, but this is not safety; it's blatant censorship (however well intentioned). People discuss unsafe things, and it's critically important that we can do so, for reasons related to safety itself, both personal and social. At the very least, provide an option to configure how this is handled, and include content-specific information in the error so we aren't left guessing how we crossed the guardrail.
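For now, the catch-and-fall-back option looks roughly like the sketch below. It assumes the FoundationModels `LanguageModelSession` API and its `GenerationError.guardrailViolation` case as I understand them; the `respondWithLocalModel(_:)` helper is a hypothetical placeholder for whatever downloaded model the app ships, not part of the framework.

```swift
import FoundationModels

/// Tries the on-device Foundation Model first; if the prompt trips the
/// guardrail, falls back to the app's own downloaded model instead.
func summarize(_ text: String) async throws -> String {
    // Assumption: LanguageModelSession accepts plain-string instructions.
    let session = LanguageModelSession(
        instructions: "Summarize the user's journal entry in two sentences."
    )
    do {
        let response = try await session.respond(to: text)
        return response.content
    } catch LanguageModelSession.GenerationError.guardrailViolation {
        // No content-specific detail is surfaced here, which is the crux of
        // the feedback above: we only know that *something* tripped it.
        return try await respondWithLocalModel(text)
    }
}

/// Hypothetical placeholder for the app's downloaded fallback model
/// (e.g. something hosted via MLX or llama.cpp).
func respondWithLocalModel(_ text: String) async throws -> String {
    // ... run the same prompt through the locally hosted model ...
    fatalError("Wire this up to your local model runtime")
}
```

Catching the error at least keeps the import flow working, but without content-specific information in the error there's still no way to tell the user (or myself) what actually crossed the guardrail.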
Topic: Machine Learning & AI
SubTopic: Foundation Models