Skip to content

Runtime Client API

RuntimeClient is the high-level public Python API. Use it when you want the same resolver, planner, loader, and executor behavior that powers the CLI.

High-level runtime API shared by the CLI and the Python library.

Attributes:

Name Type Description
runtime_loader RuntimeLoader

Resolver, planner, materialization, and backend-loading boundary.

runtime_executor RuntimeExecutor

Prompt execution boundary used once a runtime has been loaded.

resolve

resolve(
    model_reference: str, models_dir: Path = Path("models")
) -> ResolvedModel

Resolve a model reference without loading a runtime.

Parameters:

Name Type Description Default
model_reference str

User-facing model reference such as a built-in alias, Hugging Face repository, or local model path.

required
models_dir Path

Local models root used for implicit path resolution.

Path('models')

Returns:

Name Type Description
ResolvedModel ResolvedModel

Normalized model metadata for planning or inspection.

discover_local_models

discover_local_models(
    models_dir: Path = Path("models"),
) -> tuple[ResolvedModel, ...]

Discover local materialized models under a models directory.

Parameters:

Name Type Description Default
models_dir Path

Local models root to inspect.

Path('models')

Returns:

Type Description
ResolvedModel

tuple[ResolvedModel, ...]: Materialized model directories discovered

...

under the given root.

plan

plan(runtime_config: RuntimeConfig) -> RuntimePlan

Build a runtime plan without loading a backend.

Parameters:

Name Type Description Default
runtime_config RuntimeConfig

Execution configuration to inspect.

required

Returns:

Name Type Description
RuntimePlan RuntimePlan

Planned backend, specialization, and capability result.

Raises:

Type Description
ValueError

Raised when the runtime configuration is invalid or no executable plan can be produced.

describe_plan

describe_plan(
    runtime_config: RuntimeConfig,
) -> PlanJsonPayload

Return a JSON-serializable inspection payload for a runtime plan.

Parameters:

Name Type Description Default
runtime_config RuntimeConfig

Execution configuration to inspect.

required

Returns:

Name Type Description
PlanJsonPayload PlanJsonPayload

Serialized inspection payload for CLI or HTTP use.

Raises:

Type Description
ValueError

Raised when the runtime configuration is invalid or no executable plan can be produced.

load

load(runtime_config: RuntimeConfig) -> LoadedRuntime

Resolve and load a runtime backend for the given configuration.

Parameters:

Name Type Description Default
runtime_config RuntimeConfig

Execution configuration to load.

required

Returns:

Name Type Description
LoadedRuntime LoadedRuntime

Loaded backend runtime bundle ready for execution.

Raises:

Type Description
ValueError

Raised when the model cannot be resolved, materialized, planned, or loaded.

prompt

prompt(
    prompt: str,
    *,
    runtime_config: RuntimeConfig,
    generation_config: GenerationConfig | None = None,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    images: tuple[str, ...] = (),
    audio: tuple[str, ...] = (),
    sink: StreamSink | None = None,
) -> PromptResponse

Execute one prompt using text plus optional image or audio inputs.

Parameters:

Name Type Description Default
prompt str

Primary text prompt.

required
runtime_config RuntimeConfig

Runtime configuration to execute.

required
generation_config GenerationConfig | None

Optional generation overrides. Defaults to GenerationConfig() when omitted.

None
system_prompt str

System instruction prepended to the request when non-empty.

DEFAULT_SYSTEM_PROMPT
images tuple[str, ...]

Optional image input paths or URIs.

()
audio tuple[str, ...]

Optional audio input paths or URIs.

()
sink StreamSink | None

Optional streaming sink for incremental text callbacks.

None

Returns:

Name Type Description
PromptResponse PromptResponse

Final prompt response and assistant message payload.

Raises:

Type Description
ValueError

Raised when the runtime or generation configuration is invalid or when no executable backend exists.

prompt_parts

prompt_parts(
    parts: list[ContentPart],
    *,
    runtime_config: RuntimeConfig,
    generation_config: GenerationConfig | None = None,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    history: list[Message] | None = None,
    sink: StreamSink | None = None,
) -> PromptResponse

Execute a prompt composed from explicit content parts.

Parameters:

Name Type Description Default
parts list[ContentPart]

Prompt payload parts in final user-message order.

required
runtime_config RuntimeConfig

Runtime configuration to execute.

required
generation_config GenerationConfig | None

Optional generation overrides. Defaults to GenerationConfig() when omitted.

None
system_prompt str

System instruction prepended to the request when non-empty.

DEFAULT_SYSTEM_PROMPT
history list[Message] | None

Optional prior conversation messages to prepend before the new user message.

None
sink StreamSink | None

Optional streaming sink for incremental callbacks.

None

Returns:

Name Type Description
PromptResponse PromptResponse

Final prompt response and assistant message payload.

Raises:

Type Description
ValueError

Raised when parts is empty or when runtime/generation validation or backend loading fails.

session

session(
    *,
    runtime_config: RuntimeConfig,
    generation_config: GenerationConfig | None = None,
    session_name: str = "default",
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    messages: list[Message] | None = None,
    autosave_path: Path | None = None,
) -> ChatSession

Create a reusable chat session over the shared runtime stack.

Parameters:

Name Type Description Default
runtime_config RuntimeConfig

Runtime configuration for the session.

required
generation_config GenerationConfig | None

Optional generation overrides. Defaults to GenerationConfig() when omitted.

None
session_name str

Human-readable session label.

'default'
system_prompt str

Session-wide system instruction.

DEFAULT_SYSTEM_PROMPT
messages list[Message] | None

Optional initial transcript messages.

None
autosave_path Path | None

Optional transcript autosave path.

None

Returns:

Name Type Description
ChatSession ChatSession

Reusable session object bound to the shared runtime

ChatSession

stack.

Raises:

Type Description
ValueError

Raised when the runtime or generation configuration is invalid.