Ready for review (I think)
gieljnssns committed Apr 18, 2024
1 parent c51d540 commit 1590404
Showing 4 changed files with 133 additions and 44 deletions.
106 changes: 90 additions & 16 deletions docs/mlregressor.md
@@ -8,6 +8,7 @@ This API provides two main methods:

- predict: To obtain a prediction from a pre-trained model. This method is exposed with the `regressor-model-predict` end point.


## A basic model fit

To train a model use the `regressor-model-fit` end point.
@@ -45,28 +46,38 @@ A correct `curl` call to launch a model fit can look like this:
```
curl -i -H "Content-Type:application/json" -X POST -d '{}' http://localhost:5000/action/regressor-model-fit
```

After applying the `curl` command to fit the model the following information is logged by EMHASS:

2023-02-20 22:05:22,658 - __main__ - INFO - Training a LinearRegression model
2023-02-20 22:05:23,882 - __main__ - INFO - Elapsed time: 1.2236599922180176
2023-02-20 22:05:24,612 - __main__ - INFO - Prediction R2 score: 0.2654560762747957

## The predict method

To obtain a prediction using a previously trained model use the `regressor-model-predict` end point.
A Home Assistant `rest_command` can look like this:

```
curl -i -H "Content-Type:application/json" -X POST -d '{}' http://localhost:5000/action/regressor-model-predict
fit_heating_hours:
  url: http://127.0.0.1:5000/action/regressor-model-fit
  method: POST
  content_type: "application/json"
  payload: >-
    {
      "csv_file": "heating_prediction.csv",
      "features":["degreeday", "solar"],
      "target": "hours",
      "regression_model": "RandomForestRegression",
      "model_type": "heating_hours_degreeday",
      "timestamp": "timestamp",
      "date_features": ["month", "day_of_week"]
    }
```
After fitting the model the following information is logged by EMHASS:

If needed pass the correct `model_type` like this:
2024-04-17 12:41:50,019 - web_server - INFO - Passed runtime parameters: {'csv_file': 'heating_prediction.csv', 'features': ['degreeday', 'solar'], 'target': 'heating_hours', 'regression_model': 'RandomForestRegression', 'model_type': 'heating_hours_degreeday', 'timestamp': 'timestamp', 'date_features': ['month', 'day_of_week']}
2024-04-17 12:41:50,020 - web_server - INFO - >> Setting input data dict
2024-04-17 12:41:50,021 - web_server - INFO - Setting up needed data
2024-04-17 12:41:50,048 - web_server - INFO - >> Performing a machine learning regressor fit...
2024-04-17 12:41:50,049 - web_server - INFO - Performing a MLRegressor fit for heating_hours_degreeday
2024-04-17 12:41:50,064 - web_server - INFO - Training a RandomForestRegression model
2024-04-17 12:41:57,852 - web_server - INFO - Elapsed time for model fit: 7.78800106048584
2024-04-17 12:41:57,862 - web_server - INFO - Prediction R2 score of fitted model on test data: -0.5667567505914477
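
The negative R2 score in the last log line is worth a note. Assuming the logged value is the usual coefficient of determination, a score of 0 corresponds to always predicting the mean of the test targets, and a negative score means the fitted model did worse than that baseline. The sketch below computes the score by hand on made-up numbers (the values are illustrative, not taken from the EMHASS run above):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical test targets, for illustration only.
y_true = [2.0, 4.0, 6.0]

# Always predicting the mean gives exactly 0.
baseline = r2_score(y_true, [4.0, 4.0, 4.0])

# Predictions that run against the trend give a negative score.
reversed_trend = r2_score(y_true, [6.0, 4.0, 2.0])

print(baseline, reversed_trend)
```

A negative score on a small CSV may simply mean there are too few rows for the chosen features, so it can be worth collecting more data before changing the model.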

```
curl -i -H "Content-Type:application/json" -X POST -d '{"model_type": "load_forecast"}' http://localhost:5000/action/regressor-model-predict
```
## The predict method

It is possible to publish the predict method results to a Home Assistant sensor.
To obtain a prediction using a previously trained model use the `regressor-model-predict` end point.

The list of parameters needed to set the data publish task is:

@@ -89,3 +100,66 @@ runtimeparams = {
"model_type": "heating_hours_degreeday"
}
```

Pass the correct `model_type` like this:

```
curl -i -H "Content-Type:application/json" -X POST -d '{"model_type": "heating_hours_degreeday"}' http://localhost:5000/action/regressor-model-predict
```
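
The same call can be prepared from Python. This is a minimal sketch using only the standard library; the helper name `build_predict_request` and the `EMHASS_URL` constant are assumptions made for illustration and are not part of EMHASS. It also includes `new_values` in the payload, because the `treat_runtimeparams` change in this commit reads `runtimeparams["new_values"]` for `regressor-model-predict`.

```python
import json
import urllib.request

# Assumed base URL, matching the curl examples in this document.
EMHASS_URL = "http://localhost:5000"


def build_predict_request(model_type, new_values, base_url=EMHASS_URL):
    """Build (but do not send) a POST request for regressor-model-predict."""
    payload = {"model_type": model_type, "new_values": new_values}
    return urllib.request.Request(
        url=f"{base_url}/action/regressor-model-predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request("heating_hours_degreeday", [8.2, 7.23, 2, 6])

# Sending the request needs a running EMHASS instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes the payload easy to inspect before it reaches the server.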

A Home Assistant `rest_command` can look like this:

```
predict_heating_hours:
  url: http://localhost:5001/action/regressor-model-predict
  method: POST
  content_type: "application/json"
  payload: >-
    {
      "mlr_predict_entity_id": "sensor.predicted_hours",
      "mlr_predict_unit_of_measurement": "h",
      "mlr_predict_friendly_name": "Predicted hours",
      "new_values": [8.2, 7.23, 2, 6],
      "model_type": "heating_hours_degreeday"
    }
```
After running the prediction, EMHASS logs the following information:

```
2024-04-17 14:25:40,695 - web_server - INFO - Passed runtime parameters: {'mlr_predict_entity_id': 'sensor.predicted_hours', 'mlr_predict_unit_of_measurement': 'h', 'mlr_predict_friendly_name': 'Predicted hours', 'new_values': [8.2, 7.23, 2, 6], 'model_type': 'heating_hours_degreeday'}
2024-04-17 14:25:40,696 - web_server - INFO - >> Setting input data dict
2024-04-17 14:25:40,696 - web_server - INFO - Setting up needed data
2024-04-17 14:25:40,700 - web_server - INFO - >> Performing a machine learning regressor predict...
2024-04-17 14:25:40,715 - web_server - INFO - Performing a prediction for heating_hours_degreeday
2024-04-17 14:25:40,750 - web_server - INFO - Successfully posted to sensor.predicted_hours = 3.716600000000001
```
The predict method will publish the result to a Home Assistant sensor.


## How to store data in a CSV file from Home Assistant
First, define a `notify` entry that uses the `file` platform:
```
notify:
  - platform: file
    name: heating_hours_prediction
    timestamp: false
    filename: /share/heating_prediction.csv
```
Then create an automation that writes a row to this file:
```
alias: "Heating csv"
id: 157b1d57-73d9-4f39-82c6-13ce0cf42
trigger:
  - platform: time
    at: "23:59:32"
action:
  - service: notify.heating_hours_prediction
    data:
      message: >
        {% set degreeday = states('sensor.degree_day_daily') |float %}
        {% set heating_hours = states('sensor.heating_hours_today') |float | round(2) %}
        {% set solar = states('sensor.solar_daily') |float | round(3) %}
        {% set time = now() %}
        {{time}},{{degreeday}},{{solar}},{{heating_hours}}
```
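
Before fitting, `set_input_data_dict` checks that the CSV contains the `features`, `target` and (if given) `timestamp` columns. The sketch below mirrors that check with the standard library so a file can be verified ahead of time. The function name `missing_columns` and the sample rows are hypothetical, and EMHASS itself reads the file with pandas rather than the `csv` module. It assumes the file has a header row naming the columns; the message template above emits value rows only.

```python
import csv
import io


def missing_columns(csv_text, features, target, timestamp=None):
    """Return the required column names that are absent from the header row."""
    required = [*features, target]
    if timestamp is not None:
        required.append(timestamp)
    # The first row is treated as the header; an empty file yields no columns.
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [name for name in required if name not in header]


# Hypothetical file contents, for illustration only.
sample = (
    "timestamp,degreeday,solar,hours\n"
    "2024-04-17 23:59:32,12.5,3.141,6.25\n"
)

ok = missing_columns(sample, ["degreeday", "solar"], "hours", "timestamp")
bad = missing_columns(sample, ["degreeday", "solar"], "heating_hours", "timestamp")
print(ok, bad)
```

An empty list means every required column is present; any names returned are the ones that would trigger the "CSV file should contain the following columns" error.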
53 changes: 29 additions & 24 deletions src/emhass/command_line.py
@@ -246,34 +246,39 @@ def set_input_data_dict(
P_PV_forecast, P_load_forecast = None, None
params = json.loads(params)
days_list = None
csv_file = params["passed_data"]["csv_file"]
features = params["passed_data"]["features"]
target = params["passed_data"]["target"]
timestamp = params["passed_data"]["timestamp"]
if get_data_from_file:
base_path = base_path + "/data"
filename_path = pathlib.Path(base_path) / csv_file
csv_file = params["passed_data"].get("csv_file", None)
if "features" in params["passed_data"]:
features = params["passed_data"]["features"]
if "target" in params["passed_data"]:
target = params["passed_data"]["target"]
if "timestamp" in params["passed_data"]:
timestamp = params["passed_data"]["timestamp"]
if csv_file:
if get_data_from_file:
base_path = base_path + "/data"
filename_path = pathlib.Path(base_path) / csv_file

else:
filename_path = pathlib.Path(base_path) / csv_file
else:
filename_path = pathlib.Path(base_path) / csv_file

if filename_path.is_file():
df_input_data = pd.read_csv(filename_path, parse_dates=True)
if filename_path.is_file():
df_input_data = pd.read_csv(filename_path, parse_dates=True)

else:
logger.error("The CSV file was not found.")
raise ValueError("The CSV file " + csv_file + " was not found.")
required_columns = []
required_columns.extend(features)
required_columns.append(target)
if timestamp is not None:
required_columns.append(timestamp)
else:
logger.error("The CSV file was not found.")
raise ValueError("The CSV file " + csv_file + " was not found.")
required_columns = []
required_columns.extend(features)
required_columns.append(target)
if timestamp is not None:
required_columns.append(timestamp)

if not set(required_columns).issubset(df_input_data.columns):
logger.error("The CSV file does not contain the required columns.")
raise ValueError(
f"CSV file should contain the following columns: {', '.join(required_columns)}",
)
if not set(required_columns).issubset(df_input_data.columns):
logger.error("The CSV file does not contain the required columns.")
msg = f"CSV file should contain the following columns: {', '.join(required_columns)}"
raise ValueError(
msg,
)

elif set_type == "publish-data":
df_input_data, df_input_data_dayahead = None, None
3 changes: 2 additions & 1 deletion src/emhass/machine_learning_regressor.py
@@ -190,9 +190,10 @@ def get_regression_model(self: MLRegressor) -> tuple[str, str]:
param_grid = REGRESSION_METHODS["AdaBoostRegression"]["param_grid"]
else:
self.logger.error(
"Passed sklearn model %s is not valid",
"Passed model %s is not valid",
self.regression_model,
)
return None
return base_model, param_grid

def fit(self: MLRegressor, date_features: list | None = None) -> None:
15 changes: 12 additions & 3 deletions src/emhass/utils.py
@@ -228,12 +228,12 @@ def treat_runtimeparams(
params["passed_data"]["csv_file"] = csv_file
params["passed_data"]["features"] = features
params["passed_data"]["target"] = target
if "timestamp" not in runtimeparams.keys():
if "timestamp" not in runtimeparams:
params["passed_data"]["timestamp"] = None
else:
timestamp = runtimeparams["timestamp"]
params["passed_data"]["timestamp"] = timestamp
if "date_features" not in runtimeparams.keys():
if "date_features" not in runtimeparams:
params["passed_data"]["date_features"] = []
else:
date_features = runtimeparams["date_features"]
@@ -242,6 +242,15 @@ def treat_runtimeparams(
if set_type == "regressor-model-predict":
new_values = runtimeparams["new_values"]
params["passed_data"]["new_values"] = new_values
if "csv_file" in runtimeparams:
csv_file = runtimeparams["csv_file"]
params["passed_data"]["csv_file"] = csv_file
if "features" in runtimeparams:
features = runtimeparams["features"]
params["passed_data"]["features"] = features
if "target" in runtimeparams:
target = runtimeparams["target"]
params["passed_data"]["target"] = target

# Treating special data passed for MPC control case
if set_type == "naive-mpc-optim":
@@ -335,7 +344,7 @@ def treat_runtimeparams(
sklearn_model = runtimeparams["sklearn_model"]
params["passed_data"]["sklearn_model"] = sklearn_model
if "regression_model" not in runtimeparams.keys():
regression_model = "LinearRegression"
regression_model = "AdaBoostRegression"
else:
regression_model = runtimeparams["regression_model"]
params["passed_data"]["regression_model"] = regression_model