Quickstart

Text prompt

ollm prompt --model llama3-8B-chat "Summarize this file"

Read from stdin

cat notes.txt | ollm prompt --stdin --model llama3-8B-chat

Inspect the runtime plan

ollm prompt --model llama3-8B-chat --plan-json

Force the generic path

ollm prompt --model llama3-8B-chat --backend transformers-generic --no-specialization "Summarize this file"

Discover available model references

ollm models list
ollm models list --installed

Run diagnostics

ollm doctor --json

Start the local API server

uv sync --extra server
ollm serve

Once the server is running locally, inspect the schema at /openapi.json or open the interactive docs at /docs.
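
As a rough sketch of what inspecting the schema can look like from a terminal, the helper functions below build the two URLs and list the routes the server exposes. The base address http://127.0.0.1:8000 is an assumption, not something stated above; substitute the host and port that ollm serve reports. The names OLLM_BASE_URL, ollm_url, and ollm_list_routes are hypothetical helpers introduced here for illustration, and the route listing assumes the schema follows the usual OpenAPI layout with a top-level "paths" object.

```shell
#!/usr/bin/env sh
# ASSUMPTION: host and port are placeholders; use the address that `ollm serve` reports.
OLLM_BASE_URL="${OLLM_BASE_URL:-http://127.0.0.1:8000}"

# Join the base URL (minus one trailing slash, if present) with a path such as /openapi.json.
ollm_url() {
  printf '%s%s\n' "${OLLM_BASE_URL%/}" "$1"
}

# Download the OpenAPI schema and print one route path per line, sorted.
# Relies on the standard top-level "paths" object of an OpenAPI document.
ollm_list_routes() {
  curl -fsS "$(ollm_url /openapi.json)" |
    python3 -c 'import json, sys; print("\n".join(sorted(json.load(sys.stdin)["paths"])))'
}
```

With the server running, calling ollm_list_routes prints the available paths, and ollm_url /docs prints the address of the interactive docs to open in a browser.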