Runtime Loader API

RuntimeLoader owns materialization, runtime planning, and safe fallback during backend loading.

LoadedRuntime

Loaded runtime bundle containing the finalized backend and plan metadata.

Attributes:

resolved_model (ResolvedModel): Final resolved model metadata for the loaded runtime.
config (RuntimeConfig): Effective runtime configuration after selector application.
backend (BackendRuntime): Loaded backend runtime implementation.
model_path (Path | None): Local materialized model path when one exists.
plan (RuntimePlan): Final runtime plan used to load the backend.

capabilities property

capabilities: CapabilityProfile

Return capability information aligned with the finalized runtime plan.

Returns:

CapabilityProfile: Capability metadata adjusted to reflect the final support level and disk-cache behavior of the loaded runtime.

model property

model

Expose the backend-owned model object when one exists.

tokenizer property

tokenizer

Expose the backend-owned tokenizer when one exists.

processor property

processor

Expose the backend-owned processor when one exists.

device property

device: device

Expose the backend runtime device.
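
The model, tokenizer, and processor properties above are each exposed only "when one exists". A minimal sketch of that delegation pattern follows. It assumes a missing object surfaces as None, which this page does not state; StubBackend and LoadedRuntimeSketch are hypothetical stand-ins, not the library's classes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class StubBackend:
    """Hypothetical backend that owns a model but no tokenizer."""

    model: Any = "stub-model"
    device: str = "cpu"


@dataclass
class LoadedRuntimeSketch:
    """Illustrative stand-in for LoadedRuntime, not the real class."""

    backend: Any

    @property
    def model(self) -> Any | None:
        # Assumption: absent backend objects are reported as None.
        return getattr(self.backend, "model", None)

    @property
    def tokenizer(self) -> Any | None:
        return getattr(self.backend, "tokenizer", None)

    @property
    def device(self) -> Any:
        # The device is always delegated to the backend.
        return self.backend.device


runtime = LoadedRuntimeSketch(backend=StubBackend())
model = runtime.model          # backend-owned object
tokenizer = runtime.tokenizer  # None: this stub backend has no tokenizer
```

Callers written against this pattern would check for None before using the tokenizer or processor, since not every backend has one.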

get_or_create_kv_cache

get_or_create_kv_cache(
    cache_dir: Path,
    strategy: str,
    lifecycle: str,
    window_tokens: int | None,
) -> object | None

Reuse one KV-cache instance per resolved cache key.

Parameters:

cache_dir (Path, required): Cache root for the KV cache instance.
strategy (str, required): Resolved KV cache strategy ID.
lifecycle (str, required): Resolved cache lifecycle ID.
window_tokens (int | None, required): Sliding-window token budget when the strategy requires one.

Returns:

object | None: Existing or newly created cache object, or None when the backend does not expose a cache.

reset_kv_cache_instances

reset_kv_cache_instances() -> None

Drop any cached KV objects before a full-history re-execution.
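
The two KV-cache methods above amount to a keyed registry: one instance per resolved cache key, cleared on reset. The sketch below illustrates that behavior under stated assumptions. The key is taken to be the tuple of all four arguments, and the strategy and lifecycle IDs are invented for the example; the real key derivation is not documented on this page.

```python
from __future__ import annotations

from pathlib import Path


class KVCacheRegistrySketch:
    """Illustrative per-key cache registry, not the library's implementation."""

    def __init__(self, supports_cache: bool = True) -> None:
        self._supports_cache = supports_cache
        self._instances: dict[tuple[Path, str, str, int | None], object] = {}

    def get_or_create_kv_cache(
        self,
        cache_dir: Path,
        strategy: str,
        lifecycle: str,
        window_tokens: int | None,
    ) -> object | None:
        if not self._supports_cache:
            return None  # backend does not expose a cache
        # Assumption: the resolved cache key is the tuple of all arguments.
        key = (cache_dir, strategy, lifecycle, window_tokens)
        if key not in self._instances:
            self._instances[key] = object()  # placeholder for a real cache
        return self._instances[key]

    def reset_kv_cache_instances(self) -> None:
        # Drop every cached instance so the next call builds a fresh one.
        self._instances.clear()


registry = KVCacheRegistrySketch()
root = Path("/tmp/kv")  # hypothetical cache root
first = registry.get_or_create_kv_cache(root, "sliding_window", "session", 2048)
again = registry.get_or_create_kv_cache(root, "sliding_window", "session", 2048)
other = registry.get_or_create_kv_cache(root, "sliding_window", "session", 4096)
registry.reset_kv_cache_instances()
fresh = registry.get_or_create_kv_cache(root, "sliding_window", "session", 2048)
```

Here first and again are the same object, other is distinct because its window differs, and fresh is a new object because the reset discarded the earlier instance.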

RuntimeLoader

Resolve, plan, materialize, and load runtimes for model references.

Parameters:

resolver (ModelResolver | None, default None): Optional resolver override.
selector (BackendSelector | None, default None): Optional backend selector override.
backends (tuple[ExecutionBackend, ...] | None, default None): Optional registered backend implementations.
snapshot_downloader (Callable[[str, str, bool, str | None], None] | None, default None): Optional Hugging Face snapshot downloader override.
specialization_registry (SpecializationRegistry | None, default None): Optional specialization registry override.
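
The snapshot_downloader override is useful in tests, where a real network download is undesirable. The sketch below defines a recording callable that matches the documented type, Callable[[str, str, bool, str | None], None]. The meaning of each positional argument is not documented on this page, so the parameter names and the sample values are placeholders only.

```python
from __future__ import annotations

from typing import Callable, Optional

# Alias matching the documented override type.
SnapshotDownloader = Callable[[str, str, bool, Optional[str]], None]

calls: list[tuple[str, str, bool, Optional[str]]] = []


def recording_downloader(
    source: str,           # placeholder name: meaning not documented here
    target: str,           # placeholder name
    force: bool,           # placeholder name
    extra: Optional[str],  # placeholder name
) -> None:
    """Record the call instead of downloading anything."""
    calls.append((source, target, force, extra))


downloader: SnapshotDownloader = recording_downloader
downloader("org/model", "/tmp/models/org--model", False, None)

# Per the parameter list above, the callable would be handed to the loader as:
#   RuntimeLoader(snapshot_downloader=recording_downloader)
```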

resolve

resolve(
    model_reference: str, models_dir: Path
) -> ResolvedModel

Resolve a model reference without planning or loading.

Parameters:

model_reference (str, required): User-facing model reference.
models_dir (Path, required): Local models root used for implicit path resolution.

Returns:

ResolvedModel: Normalized model metadata.

discover_local_models

discover_local_models(
    models_dir: Path,
) -> tuple[ResolvedModel, ...]

Discover local materialized models under a models directory.

Parameters:

models_dir (Path, required): Local models root to inspect.

Returns:

tuple[ResolvedModel, ...]: Resolved local model directories found under the given root.

download

download(
    model_reference: str,
    models_dir: Path,
    force_download: bool = False,
) -> Path

Materialize a downloadable model reference locally.

Parameters:

model_reference (str, required): User-facing model reference to materialize.
models_dir (Path, required): Local models root used for materialization.
force_download (bool, default False): Whether to re-download even when a managed directory already exists.

Returns:

Path: Local materialized model directory.

Raises:

ValueError: Raised when the reference cannot be materialized or the resulting managed directory is incomplete.
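
The force_download flag and the ValueError on an incomplete directory can be sketched as follows. The helper materialize_sketch and its completeness rule (the directory must not be empty) are assumptions for this example; the library's actual checks are not described on this page.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Callable


def materialize_sketch(
    target: Path,
    fetch: Callable[[Path], None],
    force_download: bool = False,
) -> Path:
    """Hypothetical helper: fetch into target unless it already exists."""
    if force_download or not target.exists():
        target.mkdir(parents=True, exist_ok=True)
        fetch(target)
    # Assumed completeness rule for this sketch: directory must not be empty.
    if not any(target.iterdir()):
        raise ValueError(f"managed directory is incomplete: {target}")
    return target


fetched: list[str] = []


def fake_fetch(directory: Path) -> None:
    """Stand-in for a real download: writes one small file."""
    fetched.append(directory.name)
    (directory / "config.json").write_text("{}")


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "org--model"  # hypothetical managed directory name
    materialize_sketch(target, fake_fetch)                       # fetches
    materialize_sketch(target, fake_fetch)                       # reuses
    materialize_sketch(target, fake_fetch, force_download=True)  # fetches again
```

In this run the fetch function is called twice: once for the initial download and once for the forced re-download.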

load

load(config: RuntimeConfig) -> LoadedRuntime

Validate, plan, and load a runtime backend for execution.

Parameters:

config (RuntimeConfig, required): Runtime configuration to execute.

Returns:

LoadedRuntime: Loaded backend runtime bundle ready for execution.

Raises:

ValueError: Raised when planning, materialization, specialization, or backend loading fails without a truthful fallback path.
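
One way to read "fails without a truthful fallback path" is that the loader may fall back to another backend, but only if it reports which backend actually loaded. The sketch below illustrates that idea with a generic helper. The function load_with_fallback and the backend names are hypothetical and say nothing about how RuntimeLoader orders or selects candidates.

```python
from __future__ import annotations

from typing import Callable, Sequence


def load_with_fallback(
    candidates: Sequence[tuple[str, Callable[[], object]]],
) -> tuple[str, object]:
    """Try each (name, loader) in order and report which one loaded."""
    errors: list[str] = []
    for name, loader in candidates:
        try:
            return name, loader()
        except Exception as exc:  # sketch only: real code would be narrower
            errors.append(f"{name}: {exc}")
    # No candidate worked, so there is nothing truthful to report.
    raise ValueError("no backend could be loaded: " + "; ".join(errors))


def failing_loader() -> object:
    raise RuntimeError("accelerator not available")


name, backend = load_with_fallback(
    [
        ("optimized", failing_loader),
        ("generic", lambda: "generic-backend"),
    ]
)
```

Returning the name alongside the backend keeps the reported metadata aligned with what was really loaded, which is the role the plan and capabilities attributes play on LoadedRuntime.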

plan

plan(config: RuntimeConfig) -> RuntimePlan

Build a runtime plan without loading a backend.

Parameters:

config (RuntimeConfig, required): Runtime configuration to inspect.

Returns:

RuntimePlan: Planned backend, specialization, and strategy state.

Raises:

ValueError: Raised when the runtime configuration is invalid or no truthful plan can be produced.
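
Because plan() builds a runtime plan without loading a backend, it can serve as a cheap validation step before load(). The sketch below shows that call order with stand-in classes. FakeConfig, its model_reference field, and FakeLoader are hypothetical; only the plan(config) and load(config) call shapes and the ValueError come from this page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FakeConfig:
    """Hypothetical stand-in for RuntimeConfig; the field is illustrative."""

    model_reference: str


class FakeLoader:
    """Stand-in mirroring only the documented plan() and load() call shapes."""

    def plan(self, config: FakeConfig) -> str:
        if not config.model_reference:
            raise ValueError("model reference must not be empty")
        return f"plan-for:{config.model_reference}"

    def load(self, config: FakeConfig) -> str:
        return f"loaded:{self.plan(config)}"


def plan_then_load(loader: FakeLoader, config: FakeConfig) -> str | None:
    """Validate with plan() first, and only load when planning succeeds."""
    try:
        loader.plan(config)  # no backend is loaded at this point
    except ValueError as exc:
        print(f"cannot plan runtime: {exc}")
        return None
    return loader.load(config)


ok = plan_then_load(FakeLoader(), FakeConfig("org/model"))
bad = plan_then_load(FakeLoader(), FakeConfig(""))
```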