Post

Replies

Boosts

Views

Activity

Foundational Model - Image as Input? Timeline
Hi all, I am interested in unlocking unique applications with the new foundational models. I have a few questions regarding the availability of the following features: Image Input: The update in June 2025 mentions "image" 44 times (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates) - however I can't seem to find any information about having images as the input/prompt for the foundational models. When will this be available? I understand that there are existing Vision ML APIs, but I want image input into a multimodal on-device LLM (VLM) instead for features like "Which player is holding the ball in the image", etc (image understanding) Cloud Foundational Model - when will this be available? Thanks! Clement :)
1
0
54
9h