Runtime Plan API
The runtime plan captures backend selection, support level, and specialization state before execution begins.
Bases: str, Enum
Describe whether specialization is absent, planned, applied, or replaced by fallback.
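As a rough illustration of what the `str, Enum` bases give callers, here is a minimal sketch. The class name, member names, and string values are assumptions made for the example; only the four lifecycle states (absent, planned, applied, replaced by fallback) come from the description above.

```python
import json
from enum import Enum


class SpecializationStateSketch(str, Enum):
    """Hypothetical stand-in; member names and values are assumptions."""

    NOT_PLANNED = "not-planned"
    PLANNED = "planned"
    APPLIED = "applied"
    FALLBACK = "fallback"


state = SpecializationStateSketch.PLANNED

# Because of the str mixin, a member compares equal to its plain string value.
is_planned = state == "planned"

# json.dumps emits the underlying string without a custom encoder.
encoded = json.dumps({"specialization_state": state})

# Looking a member up by value returns the singleton member.
restored = SpecializationStateSketch("fallback")
```

Mixing in `str` is a common way to keep enum members directly usable in serialized payloads and string comparisons.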
Source code in src/ollm/runtime/plan.py (lines 13-19)
Describe how oLLM intends to execute a resolved model reference.
Attributes:
| Name | Type | Description |
|---|---|---|
| `resolved_model` | `ResolvedModel` | Final resolved model metadata for the plan. |
| `backend_id` | `str \| None` | Selected backend identifier when the plan is executable. |
| `model_path` | `Path \| None` | Local materialized model path when one exists. |
| `support_level` | `SupportLevel` | Planned support level. |
| `generic_model_kind` | `GenericModelKind \| None` | Generic execution family when one applies. |
| `supports_disk_cache` | `bool` | Whether the selected backend supports disk KV cache behavior. |
| `supports_cpu_offload` | `bool` | Whether CPU offload controls are supported. |
| `supports_gpu_offload` | `bool` | Whether GPU offload controls are supported. |
| `specialization_enabled` | `bool` | Whether specialization is enabled for the current request. |
| `specialization_applied` | `bool` | Whether specialization has already been applied. |
| `specialization_provider_id` | `str \| None` | Matching specialization provider identifier. |
| `specialization_state` | `SpecializationState` | Current specialization lifecycle state. |
| `reason` | `str` | Human-readable plan summary. |
| `specialization_pass_ids` | `tuple[SpecializationPassId, ...]` | Planned specialization passes. |
| `applied_specialization_pass_ids` | `tuple[SpecializationPassId, ...]` | Applied specialization passes. |
| `fallback_reason` | `str \| None` | Fallback reason when specialization failed. |
| `details` | `dict[str, str]` | Extra serialized inspection details. |
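The capability flags listed among the attributes can be read before execution to decide which controls to offer. The sketch below uses a trimmed, hypothetical stand-in (`PlanSketch`) rather than the real plan class; the field names are taken from the attribute list, while the dataclass shape, the `available_controls` helper, and the control labels are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlanSketch:
    """Hypothetical subset of the documented plan attributes."""

    backend_id: Optional[str]
    supports_disk_cache: bool
    supports_cpu_offload: bool
    supports_gpu_offload: bool
    reason: str


def available_controls(plan: PlanSketch) -> list[str]:
    """Return labels for the cache and offload controls this plan supports."""
    flags = {
        "disk-cache": plan.supports_disk_cache,
        "cpu-offload": plan.supports_cpu_offload,
        "gpu-offload": plan.supports_gpu_offload,
    }
    return [name for name, supported in flags.items() if supported]


plan = PlanSketch(
    backend_id="example-backend",  # illustrative identifier, not a real backend
    supports_disk_cache=True,
    supports_cpu_offload=True,
    supports_gpu_offload=False,
    reason="example plan",
)
controls = available_controls(plan)
```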
Source code in src/ollm/runtime/plan.py (lines 22-111)
is_executable
is_executable() -> bool
Return whether the plan resolved to a runnable backend.
Returns:
| Name | Type | Description |
|---|---|---|
| `bool` | `bool` | Whether the plan resolved to a runnable backend. |
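A caller would typically check `is_executable()` before dispatching work. The following is a minimal sketch under one stated assumption: that a plan is runnable exactly when `backend_id` is set, which is suggested by the `backend_id` description but not confirmed by this page. `PlanSketch` and `run_or_explain` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlanSketch:
    """Hypothetical stand-in for the runtime plan."""

    backend_id: Optional[str]
    reason: str

    def is_executable(self) -> bool:
        # Assumption: a plan is runnable exactly when a backend was selected.
        return self.backend_id is not None


def run_or_explain(plan: PlanSketch) -> str:
    """Guard execution on the plan and surface its reason otherwise."""
    if not plan.is_executable():
        return f"cannot run: {plan.reason}"
    return f"running on {plan.backend_id}"


runnable = PlanSketch(backend_id="example-backend", reason="backend selected")
blocked = PlanSketch(backend_id=None, reason="no backend supports this model")
```

Surfacing `reason` in the non-executable branch gives the user the human-readable plan summary instead of a bare failure.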
Source code in src/ollm/runtime/plan.py (lines 74-80)
as_dict
as_dict() -> dict[str, object]
Return a JSON-serializable representation of the runtime plan.
Returns:
| Type | Description |
|---|---|
| `dict[str, object]` | Serialized runtime plan payload. |
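To show what "JSON-serializable" implies for fields such as `Path` and tuples, here is a hedged sketch of one way such a method could be written. It is not the implementation in `plan.py`: `PlanSketch` is a hypothetical subset of the plan, the pass identifiers are represented as plain strings, and the payload keys are assumed to match the attribute names.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PlanSketch:
    """Hypothetical subset of the documented plan attributes."""

    backend_id: Optional[str]
    model_path: Optional[Path]
    specialization_pass_ids: tuple[str, ...] = ()
    details: dict[str, str] = field(default_factory=dict)

    def as_dict(self) -> dict[str, object]:
        # Convert values json.dumps cannot encode (Path) or would silently
        # reshape (tuple) into plain JSON-friendly types up front.
        return {
            "backend_id": self.backend_id,
            "model_path": None if self.model_path is None else str(self.model_path),
            "specialization_pass_ids": list(self.specialization_pass_ids),
            "details": dict(self.details),
        }


plan = PlanSketch(
    backend_id="example-backend",  # illustrative identifier
    model_path=Path("models/demo"),
    specialization_pass_ids=("pass-a", "pass-b"),
    details={"note": "example"},
)
payload = plan.as_dict()

# The payload survives a JSON round trip unchanged.
round_tripped = json.loads(json.dumps(payload))
```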
Source code in src/ollm/runtime/plan.py (lines 82-111)