Multimodal Workflows

Images

Pass --multimodal together with --image to send an image to a compatible model such as gemma3-12B.

ollm prompt --model gemma3-12B --multimodal --image ./diagram.png "Describe this image"
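
To describe several images in one go, the command above can be wrapped in a small shell function. This is only a sketch: `describe_images` is a hypothetical helper name, and it assumes nothing beyond the `ollm prompt` flags shown above (one image per invocation).

```shell
# Hypothetical helper (not part of ollm): run the documented
# single-image command once for each file passed as an argument.
describe_images() {
  local model="gemma3-12B"   # model name taken from the example above
  local img
  for img in "$@"; do
    # Skip anything that is not a regular file instead of calling ollm on it.
    if [ ! -f "$img" ]; then
      echo "skipping $img: not a file" >&2
      continue
    fi
    echo "== $img =="
    ollm prompt --model "$model" --multimodal --image "$img" "Describe this image"
  done
}
```

For example, `describe_images ./diagram.png ./chart.png` would call `ollm prompt` twice, printing a header line before each result.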

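For audio input, use voxtral-small-24B, which supports audio through the optimized-native path.

In an interactive chat session, attach a file with /image or /audio, then submit the prompt with /send.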

ollm chat --model gemma3-12B --multimodal
/image ./diagram.png
/send Describe this image

ollm chat --model voxtral-small-24B --multimodal
/audio ./sample.wav
/send What can you tell me about this audio?