# ollm serve

`ollm serve` starts oLLM's optional, local-only REST API server.
## Key options

- `--host`
- `--port`
- `--reload` / `--no-reload`
- `--log-level`
- `--response-store-backend`
- `--response-store-factory`
## Settings precedence
ollm serve uses the same precedence contract as the rest of oLLM:
- CLI flags
- `OLLM_SERVER__*` environment variables
- TOML config values under `[server]`
- built-in defaults
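The precedence chain above can be sketched as a small resolver. This is an illustrative model of the contract, not oLLM's actual implementation; the `resolve` helper and its parameter names are assumptions:

```python
import os

# Built-in defaults, last in the chain (mirroring the documented 127.0.0.1:8000 bind).
DEFAULTS = {"host": "127.0.0.1", "port": "8000"}

def resolve(key: str, cli_value=None, toml_value=None) -> str:
    """Effective setting: CLI flag, then OLLM_SERVER__* env var,
    then TOML [server] value, then built-in default."""
    env_value = os.environ.get(f"OLLM_SERVER__{key.upper()}")
    for candidate in (cli_value, env_value, toml_value):
        if candidate is not None:
            return candidate
    return DEFAULTS[key]

os.environ["OLLM_SERVER__PORT"] = "8123"
print(resolve("port"))                     # env var beats TOML and defaults
print(resolve("port", cli_value="9001"))   # CLI flag beats everything
```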
The default bind is 127.0.0.1:8000.
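As a sketch, overriding the bind via a TOML config file might look like the fragment below. Only `host` and `port` correspond directly to documented flags; the exact key names under `[server]` are assumptions:

```toml
[server]
host = "127.0.0.1"
port = 9001
```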
Responses storage is disabled by default. Use:
- `--response-store-backend memory` for process-scoped dev/test retrieval
- `--response-store-backend plugin --response-store-factory package.module:factory` for a custom backend
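The plugin contract is not specified here. As an illustrative sketch only, assume the factory is a zero-argument callable returning an object with `put`/`get` methods; the class, method names, and signatures below are assumptions, not oLLM's actual interface:

```python
# custom/module.py -- hypothetical response-store plugin.
# Assumes (not verified against oLLM) a zero-arg factory returning
# an object with put/get methods keyed by response id.

class DictResponseStore:
    """Toy in-process store backed by a dict."""

    def __init__(self):
        self._data = {}

    def put(self, response_id: str, payload: dict) -> None:
        self._data[response_id] = payload

    def get(self, response_id: str):
        return self._data.get(response_id)

def build_store() -> DictResponseStore:
    # Referenced as: --response-store-factory custom.module:build_store
    return DictResponseStore()
```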
## OpenAPI and docs endpoints
When the server is running locally, FastAPI publishes:
- `/openapi.json`
- `/docs`
- `/redoc`
## Examples

```shell
# Install the server extras, then start with defaults (127.0.0.1:8000)
uv sync --extra server
ollm serve

# Custom port and verbose logging
ollm serve --port 9001 --log-level debug

# In-memory response store for dev/test retrieval
ollm serve --response-store-backend memory

# Custom response-store backend via a plugin factory
ollm serve --response-store-backend plugin \
    --response-store-factory custom.module:build_store

# Override settings via environment variables
OLLM_SERVER__PORT=8123 ollm serve
```
See Local Server API for the current HTTP route surface.