update kepler-mdoel-server with v0.7
Signed-off-by: Sunyanan Choochotkaew <[email protected]>
sunya-ch committed Apr 4, 2024
1 parent 0a15d11 commit 9fcd5e6
Showing 4 changed files with 18 additions and 8 deletions.
2 changes: 1 addition & 1 deletion docs/kepler_model_server/get_started.md
@@ -57,7 +57,7 @@ data:
NODE_COMPONENTS_INIT_URL= < Static URL >
```

-The static URL from standard pipeline v0.6 (std_v0.6) are listed [here](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.6/nx12).
+The static URL from provided pipeline v0.7 are listed [here](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.7).

#### Dynamic via server API
The dynamic way lets the model server automatically select the power model that has the best accuracy and supports the running cluster environment. Similarly, it can be set via an environment variable or via the Kepler config map.
10 changes: 9 additions & 1 deletion docs/kepler_model_server/node_profile.md
@@ -2,4 +2,12 @@

We form a group of machines (nodes) called [node type](./pipeline.md#node-spec) based on processor model, the number of cores, the number of chips, memory size, and maximum CPU frequency. When collecting the data from the bare metal machine, these attributes are automatically extracted and kept as a machine spec in json format.

-A power model will be built per node type. For each group of node type, we make a profile composing of background power when the resource usage is almost constant without user workload, minimum, maximum power for each power components (e.g., core, uncore, dram, package, platform), and normalization scaler (i.e., MinMaxScaler), standardization scaler (i.e., StandardScaler) for each [feature group](./pipeline.md#available-metrics).
+A power model will be built per node type. For each group of node type, we make a profile composing of background power when the resource usage is almost constant without user workload, minimum, maximum power for each power components (e.g., core, uncore, dram, package, platform), and normalization scaler (i.e., MinMaxScaler), standardization scaler (i.e., StandardScaler) for each [feature group](./pipeline.md#available-metrics).
+
+Node specification is composed of:
+
+- processor *- CPU processor model name*
+- cores *- Number of CPU cores*
+- chips *- Number of chips*
+- memory_gb *- Memory size in GB*
+- cpu_freq_mhz *- Maximum CPU frequency in MHz*
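
As an illustration of the node specification listed in this hunk, the following is a minimal Python sketch of how such a spec might be held in memory and serialized to JSON. Only the five field names come from the documentation; the `NodeSpec` class, the helper functions, the field types, and the use of exact equality to decide whether two machines share a node type are assumptions made for this example, not the model server's actual implementation.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical container for the five documented node attributes."""
    processor: str      # CPU processor model name
    cores: int          # number of CPU cores
    chips: int          # number of chips
    memory_gb: int      # memory size in GB (integer type is an assumption)
    cpu_freq_mhz: int   # maximum CPU frequency in MHz


def to_json(spec: NodeSpec) -> str:
    """Serialize a spec to JSON with stable key ordering."""
    return json.dumps(asdict(spec), sort_keys=True)


def same_node_type(a: NodeSpec, b: NodeSpec) -> bool:
    """Assumption: two machines share a node type when all attributes match."""
    return a == b


if __name__ == "__main__":
    # All values below are made up for illustration.
    spec = NodeSpec(
        processor="example_cpu_model",
        cores=32,
        chips=2,
        memory_gb=256,
        cpu_freq_mhz=3500,
    )
    print(to_json(spec))
    print(same_node_type(spec, NodeSpec(**json.loads(to_json(spec)))))
```

A frozen dataclass keeps the spec hashable, so it could also serve as a dictionary key when collecting one profile per node type.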
4 changes: 2 additions & 2 deletions docs/kepler_model_server/pipeline.md
@@ -37,8 +37,8 @@ BPFOnly|BPF_FEATURES|[BPF](../design/metrics.md#base-metric)
IRQOnly|IRQ_FEATURES|[IRQ](../design/metrics.md#irq-metrics)
AcceleratorOnly|ACCELERATOR_FEATURES|[Accelerator](../design/metrics.md#Accelerator-metrics)
CounterIRQCombined|COUNTER_FEATURES, IRQ_FEATURES|BPF and Hardware Counter
-Basic|COUNTER_FEATURES, CGROUP_FEATURES, BPF_FEATURES, KUBELET_FEATURES|All except IRQ and node information
-WorkloadOnly|COUNTER_FEATURES, CGROUP_FEATURES, BPF_FEATURES, IRQ_FEATURES, KUBELET_FEATURES, ACCELERATOR_FEATURES|All except node information
+Basic|COUNTER_FEATURES, CGROUP_FEATURES, BPF_FEATURES|All except IRQ and node information
+WorkloadOnly|COUNTER_FEATURES, CGROUP_FEATURES, BPF_FEATURES, IRQ_FEATURES, ACCELERATOR_FEATURES|All except node information
Full|WORKLOAD_FEATURES, SYSTEM_FEATURES|All
||
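
To make the effect of this hunk easier to check, here is a small Python sketch that encodes the feature-group rows visible above as they read after the change, and looks up which groups use a given feature set. The dictionary name and the `groups_using` helper are hypothetical, and rows of the table that fall outside this hunk are not included.

```python
# Feature-group rows as they appear in this hunk after the change.
# Values are the names of the feature-set constants, kept as plain strings.
FEATURE_GROUPS = {
    "BPFOnly": ["BPF_FEATURES"],
    "IRQOnly": ["IRQ_FEATURES"],
    "AcceleratorOnly": ["ACCELERATOR_FEATURES"],
    "CounterIRQCombined": ["COUNTER_FEATURES", "IRQ_FEATURES"],
    "Basic": ["COUNTER_FEATURES", "CGROUP_FEATURES", "BPF_FEATURES"],
    "WorkloadOnly": [
        "COUNTER_FEATURES",
        "CGROUP_FEATURES",
        "BPF_FEATURES",
        "IRQ_FEATURES",
        "ACCELERATOR_FEATURES",
    ],
    "Full": ["WORKLOAD_FEATURES", "SYSTEM_FEATURES"],
}


def groups_using(feature_set: str) -> list[str]:
    """Return the names of the groups that list the given feature set."""
    return [name for name, sets in FEATURE_GROUPS.items() if feature_set in sets]


if __name__ == "__main__":
    # After this commit, no group in the hunk references KUBELET_FEATURES.
    print(groups_using("KUBELET_FEATURES"))
    print(groups_using("IRQ_FEATURES"))
```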

10 changes: 6 additions & 4 deletions docs/kepler_model_server/power_estimation.md
@@ -41,8 +41,10 @@ The Kepler General Estimator sidecar can update the model from the Kepler Model
![](../fig/full_integration.png)


-## Power model accuracy report
+## Provided power models on Kepler Model DB

-version|machine ID|pipeline|feature group|component power source|total power source|Local LR MAE in watts (Node Components/Total)|Estimator Sidecar MAE in watts (Node Components/Total)|Reference Power Range in watts
----|---|---|---|---|---|---|---|---
-[0.6](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.6)|[nx12](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.6/nx12)|[std_v0.6](https://github.com/sustainable-computing-io/kepler-model-db/blob/main/models/v0.6/.doc/std_v0.6.md)|BPFOnly|rapl|acpi|66.32/93.57|34.40/49.52|505.79
+version|power data source|pipeline|available energy sources|error report
+---|---|---|---|---
+[0.6](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.6)|[nx12](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.6/nx12)|[std_v0.6](https://github.com/sustainable-computing-io/kepler-model-db/blob/main/models/v0.6/.doc/std_v0.6.md)|rapl,acpi|[Link](https://github.com/sustainable-computing-io/kepler-model-db/blob/main/models/v0.6/nx12/README.md)
+[0.7](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.7)|[SPECpower](https://www.spec.org/power_ssj2008/)|[specpower](https://github.com/sustainable-computing-io/kepler-model-db/blob/main/models/v0.7/.doc/specpower.md)|acpi|[Link](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.7/specpower)
+[0.7](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.7)|[Training Playbook](https://github.com/sustainable-computing-io/kepler-model-training-playbook)|[ec2](https://github.com/sustainable-computing-io/kepler-model-db/blob/main/models/v0.7/.doc/ec2.md)|intel_rapl|[Link](https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.7/ec2)
