Runtime Client API
RuntimeClient is the high-level public Python API. Use it when you want the same
resolver, planner, loader, and executor behavior that powers the CLI.
High-level runtime API shared by the CLI and the Python library.
Attributes:
| Name | Type | Description |
|---|---|---|
| `runtime_loader` | `RuntimeLoader` | Resolver, planner, materialization, and backend-loading boundary. |
| `runtime_executor` | `RuntimeExecutor` | Prompt execution boundary used once a runtime has been loaded. |
resolve
resolve(
model_reference: str, models_dir: Path = Path("models")
) -> ResolvedModel
Resolve a model reference without loading a runtime.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_reference` | `str` | User-facing model reference such as a built-in alias, Hugging Face repository, or local model path. | required |
| `models_dir` | `Path` | Local models root used for implicit path resolution. | `Path('models')` |
Returns:
| Name | Type | Description |
|---|---|---|
| `ResolvedModel` | `ResolvedModel` | Normalized model metadata for planning or inspection. |
discover_local_models
discover_local_models(
models_dir: Path = Path("models"),
) -> tuple[ResolvedModel, ...]
Discover local materialized models under a models directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `models_dir` | `Path` | Local models root to inspect. | `Path('models')` |
Returns:
| Type | Description |
|---|---|
| `tuple[ResolvedModel, ...]` | Materialized model directories discovered under the given root. |
plan
plan(runtime_config: RuntimeConfig) -> RuntimePlan
Build a runtime plan without loading a backend.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runtime_config` | `RuntimeConfig` | Execution configuration to inspect. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `RuntimePlan` | `RuntimePlan` | Planned backend, specialization, and capability result. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when the runtime configuration is invalid or no executable plan can be produced. |
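Because `plan` does not load a backend, it can serve as a preflight check before `load` or `prompt`. The sketch below shows one way to wrap that check. `try_plan` is a hypothetical helper, not part of the library; it assumes only that `client` exposes `plan(runtime_config)` and raises `ValueError` as documented above.

```python
from typing import Any, Optional, Tuple


def try_plan(client: Any, runtime_config: Any) -> Tuple[Optional[Any], Optional[str]]:
    """Return (plan, None) on success or (None, error message) on failure.

    Hypothetical helper: `client` is assumed to be a RuntimeClient-like
    object with a `plan(runtime_config)` method.
    """
    try:
        return client.plan(runtime_config), None
    except ValueError as exc:
        # Documented failure mode: invalid configuration or no executable plan.
        return None, str(exc)
```

Returning the error message instead of re-raising lets a caller report why a configuration cannot run without touching any backend.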
describe_plan
describe_plan(
runtime_config: RuntimeConfig,
) -> PlanJsonPayload
Return a JSON-serializable inspection payload for a runtime plan.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runtime_config` | `RuntimeConfig` | Execution configuration to inspect. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `PlanJsonPayload` | `PlanJsonPayload` | Serialized inspection payload for CLI or HTTP use. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when the runtime configuration is invalid or no executable plan can be produced. |
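Since the payload is described as JSON-serializable, it should be possible to pass it straight to the standard `json` module. The following is a minimal sketch under that assumption. `plan_as_json` is a hypothetical helper, and the exact shape of `PlanJsonPayload` is not specified here, so the code treats it as an opaque value that `json.dumps` accepts.

```python
import json
from typing import Any


def plan_as_json(client: Any, runtime_config: Any) -> str:
    """Render a plan inspection payload as stable, human-readable JSON.

    Hypothetical helper: `client` is assumed to be a RuntimeClient-like
    object with a `describe_plan(runtime_config)` method.
    """
    payload = client.describe_plan(runtime_config)
    # sort_keys gives deterministic output, which is handy for diffs and logs.
    return json.dumps(payload, indent=2, sort_keys=True)
```

A `ValueError` from `describe_plan` is left to propagate so the caller can decide how to report an unplannable configuration.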
load
load(runtime_config: RuntimeConfig) -> LoadedRuntime
Resolve and load a runtime backend for the given configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runtime_config` | `RuntimeConfig` | Execution configuration to load. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `LoadedRuntime` | `LoadedRuntime` | Loaded backend runtime bundle ready for execution. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when the model cannot be resolved, materialized, planned, or loaded. |
prompt
prompt(
prompt: str,
*,
runtime_config: RuntimeConfig,
generation_config: GenerationConfig | None = None,
system_prompt: str = DEFAULT_SYSTEM_PROMPT,
images: tuple[str, ...] = (),
audio: tuple[str, ...] = (),
sink: StreamSink | None = None,
) -> PromptResponse
Execute one prompt using text plus optional image or audio inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | Primary text prompt. | required |
| `runtime_config` | `RuntimeConfig` | Runtime configuration to execute. | required |
| `generation_config` | `GenerationConfig \| None` | Optional generation overrides. | `None` |
| `system_prompt` | `str` | System instruction prepended to the request when non-empty. | `DEFAULT_SYSTEM_PROMPT` |
| `images` | `tuple[str, ...]` | Optional image input paths or URIs. | `()` |
| `audio` | `tuple[str, ...]` | Optional audio input paths or URIs. | `()` |
| `sink` | `StreamSink \| None` | Optional streaming sink for incremental text callbacks. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `PromptResponse` | `PromptResponse` | Final prompt response and assistant message payload. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when the runtime or generation configuration is invalid or when no executable backend exists. |
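Everything after `prompt` in the signature is keyword-only, and `images` expects a tuple of strings rather than a bare string. The sketch below illustrates that call shape. `ask_about_image` and its default question are hypothetical and not part of the library; `client` is assumed to be a `RuntimeClient`-like object.

```python
from typing import Any


def ask_about_image(
    client: Any,
    runtime_config: Any,
    image_path: str,
    question: str = "Describe this image.",
) -> Any:
    """Send one text prompt with a single image attached.

    Hypothetical helper: returns whatever `client.prompt` returns
    (a PromptResponse, per the reference above).
    """
    return client.prompt(
        question,
        runtime_config=runtime_config,  # keyword-only in the documented signature
        images=(image_path,),  # trailing comma makes this a one-element tuple
    )
```

Writing `images=(image_path)` without the trailing comma would pass a plain string, which does not match the documented `tuple[str, ...]` type.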
prompt_parts
prompt_parts(
parts: list[ContentPart],
*,
runtime_config: RuntimeConfig,
generation_config: GenerationConfig | None = None,
system_prompt: str = DEFAULT_SYSTEM_PROMPT,
history: list[Message] | None = None,
sink: StreamSink | None = None,
) -> PromptResponse
Execute a prompt composed from explicit content parts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `parts` | `list[ContentPart]` | Prompt payload parts in final user-message order. | required |
| `runtime_config` | `RuntimeConfig` | Runtime configuration to execute. | required |
| `generation_config` | `GenerationConfig \| None` | Optional generation overrides. | `None` |
| `system_prompt` | `str` | System instruction prepended to the request when non-empty. | `DEFAULT_SYSTEM_PROMPT` |
| `history` | `list[Message] \| None` | Optional prior conversation messages to prepend before the new user message. | `None` |
| `sink` | `StreamSink \| None` | Optional streaming sink for incremental callbacks. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `PromptResponse` | `PromptResponse` | Final prompt response and assistant message payload. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when |
session
session(
*,
runtime_config: RuntimeConfig,
generation_config: GenerationConfig | None = None,
session_name: str = "default",
system_prompt: str = DEFAULT_SYSTEM_PROMPT,
messages: list[Message] | None = None,
autosave_path: Path | None = None,
) -> ChatSession
Create a reusable chat session over the shared runtime stack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runtime_config` | `RuntimeConfig` | Runtime configuration for the session. | required |
| `generation_config` | `GenerationConfig \| None` | Optional generation overrides. | `None` |
| `session_name` | `str` | Human-readable session label. | `'default'` |
| `system_prompt` | `str` | Session-wide system instruction. | `DEFAULT_SYSTEM_PROMPT` |
| `messages` | `list[Message] \| None` | Optional initial transcript messages. | `None` |
| `autosave_path` | `Path \| None` | Optional transcript autosave path. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `ChatSession` | `ChatSession` | Reusable session object bound to the shared runtime stack. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when the runtime or generation configuration is invalid. |
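All `session` arguments are keyword-only. As a rough sketch, a named session with transcript autosave could be opened as shown below. `open_named_session` is a hypothetical helper, and both the `.json` suffix and the one-file-per-session layout are assumptions, since the transcript format is not specified in this reference.

```python
from pathlib import Path
from typing import Any


def open_named_session(
    client: Any,
    runtime_config: Any,
    name: str,
    transcripts_dir: Path,
) -> Any:
    """Create a chat session that autosaves under `transcripts_dir`.

    Hypothetical helper: `client` is assumed to be a RuntimeClient-like
    object; the return value is whatever `client.session` returns
    (a ChatSession, per the reference above).
    """
    return client.session(
        runtime_config=runtime_config,
        session_name=name,
        # Assumed naming scheme: one file per session label.
        autosave_path=transcripts_dir / f"{name}.json",
    )
```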