# Runtime Plan API

The runtime plan captures backend selection, support level, and specialization state before execution begins.

## SpecializationState

Bases: `str`, `Enum`

Describe whether specialization is absent, planned, applied, or replaced by fallback.
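
As a point of reference, a `str`-backed enum covering this lifecycle might look like the sketch below. The member names and values are assumptions for illustration; this reference does not show the actual definitions.

```python
from enum import Enum

class SpecializationState(str, Enum):  # assumed member names, not the real ones
    NONE = "none"          # specialization is absent
    PLANNED = "planned"    # specialization is planned but not yet applied
    APPLIED = "applied"    # specialization has been applied
    FALLBACK = "fallback"  # specialization was replaced by fallback

# Because the enum subclasses str, members compare equal to their values:
assert SpecializationState.APPLIED == "applied"
```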

## RuntimePlan

Describe how oLLM intends to execute a resolved model reference.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `resolved_model` | `ResolvedModel` | Final resolved model metadata for the plan. |
| `backend_id` | `str \| None` | Selected backend identifier when the plan is executable. |
| `model_path` | `Path \| None` | Local materialized model path when one exists. |
| `support_level` | `SupportLevel` | Planned support level. |
| `generic_model_kind` | `GenericModelKind \| None` | Generic execution family when one applies. |
| `supports_disk_cache` | `bool` | Whether the selected backend supports disk KV cache behavior. |
| `supports_cpu_offload` | `bool` | Whether CPU offload controls are supported. |
| `supports_gpu_offload` | `bool` | Whether GPU offload controls are supported. |
| `specialization_enabled` | `bool` | Whether specialization is enabled for the current request. |
| `specialization_applied` | `bool` | Whether specialization has already been applied. |
| `specialization_provider_id` | `str \| None` | Matching specialization provider identifier. |
| `specialization_state` | `SpecializationState` | Current specialization lifecycle state. |
| `reason` | `str` | Human-readable plan summary. |
| `specialization_pass_ids` | `tuple[SpecializationPassId, ...]` | Planned specialization passes. |
| `applied_specialization_pass_ids` | `tuple[SpecializationPassId, ...]` | Applied specialization passes. |
| `fallback_reason` | `str \| None` | Fallback reason when specialization failed. |
| `details` | `dict[str, str]` | Extra serialized inspection details. |
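
A caller might read these fields to summarize a plan before execution. The helper below is a minimal sketch: the `describe_plan` function is illustrative and not part of oLLM, and `plan` is assumed to come from the library's planning step.

```python
def describe_plan(plan) -> str:
    """Summarize a RuntimePlan for logs (illustrative helper, not part of oLLM)."""
    parts = [
        f"backend={plan.backend_id}",
        f"support={plan.support_level}",
        f"specialization={plan.specialization_state}",
        f"passes={list(plan.specialization_pass_ids)}",
    ]
    if plan.fallback_reason is not None:
        parts.append(f"fallback_reason={plan.fallback_reason}")
    return ", ".join(parts)
```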

### is_executable

```python
is_executable() -> bool
```

Return whether the plan resolved to a runnable backend.

Returns:

| Type | Description |
| --- | --- |
| `bool` | True when a backend ID was selected. |
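
A typical call site gates execution on this check. In the sketch below, `resolve_plan` is a hypothetical stand-in for whatever oLLM call produces a `RuntimePlan`; only `is_executable` and `reason` come from this reference.

```python
# `resolve_plan` is a hypothetical placeholder, not a documented oLLM API.
plan = resolve_plan("my-model")
if not plan.is_executable():
    # No backend was selected; `reason` carries the human-readable summary.
    raise RuntimeError(f"cannot execute plan: {plan.reason}")
```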

### as_dict

```python
as_dict() -> dict[str, object]
```

Return a JSON-serializable representation of the runtime plan.

Returns:

| Type | Description |
| --- | --- |
| `dict[str, object]` | Serialized runtime plan payload. |
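
Since the payload is JSON-serializable, it can be dumped directly for logging or bug reports. Here `plan` is again an assumed `RuntimePlan` instance from an earlier planning step.

```python
import json

# Serialize the plan for diagnostics; `plan` is an assumed RuntimePlan instance.
print(json.dumps(plan.as_dict(), indent=2, sort_keys=True))
```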