Quickstart
Text prompt
ollm prompt --model llama3-8B-chat "Summarize this file"
Read from stdin
cat notes.txt | ollm prompt --stdin --model llama3-8B-chat
Inspect the runtime plan
ollm prompt --model llama3-8B-chat --plan-json
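Because the plan is emitted as JSON, it composes with standard tooling such as jq. A minimal sketch, where the sample object stands in for real --plan-json output (its keys are assumptions, not the tool's documented schema):

```shell
# Pull a single field out of a runtime plan with jq.
# The sample object below is illustrative -- in practice, pipe the output of
# `ollm prompt --model llama3-8B-chat --plan-json` into jq instead of echo.
echo '{"model": "llama3-8B-chat", "backend": "transformers-generic"}' \
  | jq -r '.backend'
```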
Force the generic path
ollm prompt --model llama3-8B-chat --backend transformers-generic --no-specialization "Summarize this file"
Discover available model references
ollm models list
ollm models list --installed
Run diagnostics
ollm doctor --json
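The JSON report can be filtered the same way, for example to list only failing checks. A sketch under an assumed shape -- the checks/ok keys below are illustrative, not the real report schema:

```shell
# Print the names of failing checks from a diagnostic report.
# The sample object is a stand-in for `ollm doctor --json` output; its
# structure is an assumption, so adapt the filter to the actual report.
echo '{"checks": [{"name": "gpu", "ok": false}, {"name": "disk", "ok": true}]}' \
  | jq -r '.checks[] | select(.ok | not) | .name'
```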
Start the local API server
uv sync --extra server
ollm serve
Once the server is running locally, inspect the schema at /openapi.json or
open the interactive docs at /docs.
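One way to check the endpoint from a shell, assuming the server listens on 127.0.0.1:8000 (the address is an assumption -- use whatever ollm serve prints at startup):

```shell
# Fetch the OpenAPI schema; fall back to a message if the server is down.
# Host and port are assumptions -- check the `ollm serve` startup output.
curl -sf --max-time 2 http://127.0.0.1:8000/openapi.json \
  || echo "server not reachable"
```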