Merge pull request #5 from SETO2243/docker_runnable
Add runnable docker example and fix some bugs
emfdavid authored Sep 6, 2024
2 parents e84ae80 + 6cb2b27 commit 81e50f6
Showing 12 changed files with 10,164 additions and 79 deletions.
34 changes: 34 additions & 0 deletions .dockerignore
@@ -0,0 +1,34 @@
# Include any files or directories that you don't want to be copied to your
# container here (e.g., local build artifacts, temporary files, etc.).
#
# For more help, visit the .dockerignore file reference guide at
# https://docs.docker.com/engine/reference/builder/#dockerignore-file

**/.DS_Store
**/__pycache__
**/.venv
**/.classpath
**/.dockerignore
**/.env
**/.git
**/.gitignore
**/.project
**/.settings
**/.toolstarget
**/.vs
**/.vscode
**/*.*proj.user
**/*.dbmdl
**/*.jfm
**/bin
**/charts
**/docker-compose*
**/compose*
**/Dockerfile*
**/node_modules
**/npm-debug.log
**/obj
**/secrets.dev.yaml
**/values.dev.yaml
LICENSE
README.md
56 changes: 56 additions & 0 deletions Dockerfile
@@ -0,0 +1,56 @@
# syntax=docker/dockerfile:1

# Comments are provided throughout this file to help you get started.
# If you need more help, visit the Dockerfile reference guide at
# https://docs.docker.com/engine/reference/builder/

ARG PYTHON_VERSION=3.10.14
FROM python:${PYTHON_VERSION}-bullseye as base

# Prevents Python from writing pyc files.
ENV PYTHONDONTWRITEBYTECODE=1

# Keeps Python from buffering stdout and stderr to avoid situations where
# the application crashes without emitting any logs due to buffering.
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Create a non-privileged user that the app will run under.
# See https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#user
ARG UID=10001
RUN adduser \
    --disabled-password \
    --gecos "" \
    --home "/nonexistent" \
    --shell "/sbin/nologin" \
    --no-create-home \
    --uid "${UID}" \
    appuser

RUN apt-get -y update
RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC \
    apt-get -y dist-upgrade

RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC \
    apt-get -y install libeccodes-dev libhdf5-serial-dev pkg-config cmake g++-10 gcc-10

# Download dependencies as a separate step to take advantage of Docker's caching.
# Leverage a cache mount to /root/.cache/pip to speed up subsequent builds.
# Leverage a bind mount to requirements.txt to avoid having to copy it
# into this layer.
RUN --mount=type=cache,target=/root/.cache/pip \
    --mount=type=bind,source=requirements.txt,target=requirements.txt \
    python -m pip install -r requirements.txt

# Switch to the non-privileged user to run the application.
USER appuser

# Copy the source code into the container.
COPY . .

# Expose the port that the application listens on.
EXPOSE 8890

# Run the application.
CMD python -m example_forecast
129 changes: 119 additions & 10 deletions README.md
@@ -19,15 +19,119 @@ the composable model framework. Example PV model construction is shown below.
The API for fit, predict, and metrics is reduced to specifying start and end times for a given location.
The model must construct feature data using column transforms. Having done so, forecasting as a service becomes trivial.
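
As a rough illustration of the interface described above, the sketch below composes a toy model from mixins and exposes `fit` and `predict` methods that take only a start time, an end time, and a location. Every class and method name here (`HourOfDayFeatures`, `MeanByFeatureEstimator`, `ToyModel`, `build_features`, `load_target`) is hypothetical and stands in for the library's real processes and estimators; only the start/end/location call shape mirrors the library.

```python
from datetime import datetime, timedelta


class HourOfDayFeatures:
    """Hypothetical process mixin: builds features from a time range."""

    def build_features(self, start, end, location):
        # One row per hour in [start, end]; the only feature is hour of day.
        t0 = datetime.fromisoformat(start)
        t1 = datetime.fromisoformat(end)
        hours = (t1 - t0) // timedelta(hours=1) + 1
        return [(t0 + timedelta(hours=h)).hour for h in range(hours)]

    def load_target(self, start, end, location):
        # Stand-in for a data fetcher: a synthetic load that depends on the hour.
        return [100.0 + 10.0 * hour for hour in self.build_features(start, end, location)]


class MeanByFeatureEstimator:
    """Hypothetical estimator mixin: predicts the mean target per feature value."""

    def train(self, features, target):
        sums = {}
        for x, y in zip(features, target):
            total, count = sums.get(x, (0.0, 0))
            sums[x] = (total + y, count + 1)
        self.means_ = {x: total / count for x, (total, count) in sums.items()}

    def estimate(self, features):
        return [self.means_[x] for x in features]


class ToyModel(HourOfDayFeatures, MeanByFeatureEstimator):
    """Composed model: callers only pass start, end, and location."""

    def fit(self, start, end, location):
        features = self.build_features(start, end, location)
        self.train(features, self.load_target(start, end, location))
        return self

    def predict(self, start, end, location):
        return self.estimate(self.build_features(start, end, location))


model = ToyModel().fit("2021-01-15", "2021-01-31", "demo_meter")
predictions = model.predict("2021-02-01", "2021-02-01T03:00:00", "demo_meter")
print(predictions)
```

The point of the pattern is that feature construction and estimation live in separate mixins, so either one can be swapped without changing how callers invoke `fit` and `predict`.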

-## Installation
## Getting started

Users can verify the code works with the `example_forecast.py` script by running `docker compose up --build`.
The log output from the container will include feature and weather data as well as predicted values. The script
takes several minutes to run because the weather data is large.

The Dockerfile included in the project runs `example_forecast.py`, which demonstrates both the machine
learning model for AMI meter forecasting and the physics-based PV model using PySAM. Users can then choose between
their local working environment and the containerized environment to extend and experiment with the time series models
library.

The dockerized example is a great place to start for experimentation and further development.
### Expected output
Once the container layers are built and running (this will take several minutes), the console log will show:
```console
[+] Running 1/1
✔ Container seto_forecasting-server-1 Recreated 0.7s
Attaching to seto_forecasting-server-1
```

#### AMI Meter Model
The first example is a meter-level XGBoost estimator:
```console
seto_forecasting-server-1 | INFO:__main__:Starting forecast example for AMI meter forecast with XgBoost estimator!
seto_forecasting-server-1 | INFO:time_series_models.transformers:Constructing overfetched range pipeline using lags [ 0 168]
seto_forecasting-server-1 | INFO:time_series_models.processes:Instantiating RegularTimeSeriesModels with kwargs {'day_of_week': True, 'harmonics': array([ 24, 168, 8760], dtype='timedelta64[h]'), 'met_vars': ['t', 'r2'], 'met_horizon': 12, 'mapping': {'p2ulv18716': {'latitude': 35.0, 'longitude': -75.0}}}
```
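
The `harmonics` entry in the logged kwargs lists periods of 24, 168, and 8760 hours (daily, weekly, and yearly cycles). How the library turns these periods into features is not shown in this log; a common encoding, and the assumption made in this sketch, is one sine/cosine pair per period. The function name `harmonic_features` and the choice of epoch are hypothetical.

```python
import math
from datetime import datetime

PERIODS_HOURS = (24, 168, 8760)  # daily, weekly, yearly, as in the logged config
EPOCH = datetime(1970, 1, 1)  # arbitrary reference time (assumption)


def harmonic_features(timestamp):
    """Return a (sin, cos) pair per period for one timestamp (assumed encoding)."""
    hours = (timestamp - EPOCH).total_seconds() / 3600.0
    features = []
    for period in PERIODS_HOURS:
        angle = 2.0 * math.pi * hours / period
        features.extend((math.sin(angle), math.cos(angle)))
    return features


midnight = harmonic_features(datetime(2021, 2, 1, 0))
next_midnight = harmonic_features(datetime(2021, 2, 2, 0))

# The daily pair repeats after 24 hours; the weekly and yearly pairs do not.
print(midnight[:2], next_midnight[:2])
```

Encoding each period as a pair avoids the discontinuity a raw hour-of-day or day-of-year integer would have at the wrap-around point.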
There will be many more log messages during training and prediction (and a wait of a few minutes while weather data downloads),
followed by the (truncated) results of the prediction run for the `p2ulv18716` meter:
```console
seto_forecasting-server-1 | INFO:time_series_models.data_fetchers.fetcher:Finished 'HrrrFetcher' 'get_data' in 248.3485 secs
seto_forecasting-server-1 | INFO:time_series_models.data_fetchers.fetcher:Finished 'AmiFetcher' 'get_data' in 0.0837 secs
seto_forecasting-server-1 | INFO:__main__:Predicted:                                  predicted         true
seto_forecasting-server-1 | location   date_time
seto_forecasting-server-1 | p2ulv18716 2021-01-01 00:00:00   785.116394   600.156056
seto_forecasting-server-1 |            2021-01-01 01:00:00   717.555481  2579.214714
seto_forecasting-server-1 |            2021-01-01 02:00:00   817.579041  2720.881345
seto_forecasting-server-1 |            2021-01-01 03:00:00   507.064819  2341.922617
seto_forecasting-server-1 |            2021-01-01 04:00:00   444.800018  2124.941260
seto_forecasting-server-1 | ...                                     ...          ...
seto_forecasting-server-1 |            2021-02-04 20:00:00   513.341370   425.300693
seto_forecasting-server-1 |            2021-02-04 21:00:00   591.890686   459.267964
seto_forecasting-server-1 |            2021-02-04 22:00:00  2320.842773   546.954250
seto_forecasting-server-1 |            2021-02-04 23:00:00  2011.579346   599.035650
seto_forecasting-server-1 |            2021-02-05 00:00:00  2223.689941   652.819004
seto_forecasting-server-1 |
seto_forecasting-server-1 | [841 rows x 2 columns]
```
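
To put a number on how closely the forecast tracks the observations, one option is a mean absolute error over the `predicted` and `true` columns. The sketch below hard-codes only the ten rows visible in the truncated log above, so the result describes that excerpt and not the full 841-row run. The helper name `mean_absolute_error` is illustrative and is not taken from the library.

```python
# (predicted, true) pairs copied from the ten rows shown in the log excerpt.
ROWS = [
    (785.116394, 600.156056),
    (717.555481, 2579.214714),
    (817.579041, 2720.881345),
    (507.064819, 2341.922617),
    (444.800018, 2124.941260),
    (513.341370, 425.300693),
    (591.890686, 459.267964),
    (2320.842773, 546.954250),
    (2011.579346, 599.035650),
    (2223.689941, 652.819004),
]


def mean_absolute_error(rows):
    """Average of |predicted - true| over the given (predicted, true) pairs."""
    return sum(abs(predicted - true) for predicted, true in rows) / len(rows)


mae = mean_absolute_error(ROWS)
print(f"MAE over {len(ROWS)} visible rows: {mae:.2f}")
```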

#### PV Physics Model
The PV physics model example follows immediately with:
```console
seto_forecasting-server-1 | INFO:__main__:Starting forecast example for PV physical forecast!
```
Intermediate weather data, extracted for the time range, latitude, and longitude of the site, will be visible
during the training and prediction steps:
```console
seto_forecasting-server-1 | INFO:time_series_models.transformers_pv:Feature DF:                        ghi    dni    dhi    temp_air  wind_speed
seto_forecasting-server-1 | date_time
seto_forecasting-server-1 | 2021-02-01 00:00:00    0.0    0.0    0.0  275.616852    2.302358
seto_forecasting-server-1 | 2021-02-01 01:00:00    0.0    0.0    0.0  273.182281    3.805551
seto_forecasting-server-1 | 2021-02-01 02:00:00    0.0    0.0    0.0  271.327850    2.718800
seto_forecasting-server-1 | 2021-02-01 03:00:00    0.0    0.0    0.0  270.273697    4.061844
seto_forecasting-server-1 | 2021-02-01 04:00:00    0.0    0.0    0.0  270.467178    4.645654
seto_forecasting-server-1 | ...                    ...    ...    ...         ...         ...
seto_forecasting-server-1 | 2021-02-04 20:00:00  586.1  969.0   77.0  281.806671   12.966594
seto_forecasting-server-1 | 2021-02-04 21:00:00  485.4  924.0   72.1  282.243988   15.832714
seto_forecasting-server-1 | 2021-02-04 22:00:00  325.8  547.0  147.3  281.664047   16.190084
seto_forecasting-server-1 | 2021-02-04 23:00:00  128.7  236.0   88.2  279.410828   14.847732
seto_forecasting-server-1 | 2021-02-05 00:00:00    0.0    0.0    0.0  276.510498   12.012517
```
Finally, the predicted values for the `capybara` PV demo site will be logged.
```console
seto_forecasting-server-1 | INFO:time_series_models.transformers_pv:Trying to load from: /app/pv_site.json
seto_forecasting-server-1 | INFO:__main__:pv predictions:                                  predicted
seto_forecasting-server-1 | location date_time
seto_forecasting-server-1 | capybara 2021-02-01 00:00:00  2.555532e+02
seto_forecasting-server-1 |          2021-02-01 01:00:00  2.555532e+02
seto_forecasting-server-1 |          2021-02-01 02:00:00  2.555532e+02
seto_forecasting-server-1 |          2021-02-01 03:00:00  2.555532e+02
seto_forecasting-server-1 |          2021-02-01 04:00:00  2.555532e+02
seto_forecasting-server-1 | ...                                    ...
seto_forecasting-server-1 |          2021-02-04 20:00:00 -1.007929e+06
seto_forecasting-server-1 |          2021-02-04 21:00:00 -1.096734e+06
seto_forecasting-server-1 |          2021-02-04 22:00:00 -9.101299e+05
seto_forecasting-server-1 |          2021-02-04 23:00:00 -4.854458e+05
seto_forecasting-server-1 |          2021-02-05 00:00:00  2.555532e+02
seto_forecasting-server-1 |
seto_forecasting-server-1 | [97 rows x 1 columns]
seto_forecasting-server-1 | INFO:root:All done!
```

Some systems may return an error code on exit.
```console
seto_forecasting-server-1 | free(): invalid pointer
seto_forecasting-server-1 | Aborted
seto_forecasting-server-1 exited with code 134
```
This is likely related to an issue with the installed version of the PySAM library, but it does not affect the execution of the example.
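
For context on the exit status: on Linux, a process terminated by a signal is conventionally reported with exit code 128 plus the signal number, so 134 corresponds to signal 6, `SIGABRT`, which matches the `Aborted` line above. A small sketch of that arithmetic, assuming a Linux host with the standard signal numbering:

```python
import signal

exit_code = 134  # value reported by docker compose in the log above

# Shell convention: codes above 128 mean "terminated by signal (code - 128)".
signal_number = exit_code - 128
signal_name = signal.Signals(signal_number).name

print(signal_number, signal_name)
```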

### Development & Installation

This library is designed for use by technical engineers and data scientists. It takes advantage of the Python
data science ecosystem and therefore requires installation of many third-party open source libraries. It has
been developed and tested on a Linux operating system. Running in a Docker container such as
the [canonical Ubuntu image](https://hub.docker.com/_/ubuntu) is strongly recommended. The library was
developed using Ubuntu 22.04 (Jammy) with Python 3.10.6.

-### Installing system libraries
The Dockerfile included in the repository only runs the `example_forecast.py` file. The following code snippets
are provided as helpful steps toward building a developer environment where you can run unit tests, forecasting scripts,
and Jupyter notebooks.

#### Installing system libraries

After installing [Docker](https://docs.docker.com/engine/install/), run the following command to set up a basic Jammy container with this library:

@@ -62,6 +166,7 @@ python -m unittest

This will print "SUCCESS" near the end if the code works correctly in your new environment.

### Running a notebook
To start Jupyter Notebook, run:

```sh
@@ -70,8 +175,6 @@ jupyter notebook --NotebookApp.ip=0.0.0.0

This will print a URL, which you can open in your browser. Then open the example notebook and execute the cells in the demonstration to get acquainted with the functionality.

-<!-- TODO: add docker executable image in July 2024 -->

## Usage
Models can be composed of mixins for various estimators and forecast processes. These composable
pieces can be put together in different ways to solve many problems. The RegularTimeSeriesModel is the
@@ -159,11 +262,18 @@ using machine learning models like xgboost too.

```python
pv_config = dict(
-    site_config_mapping="RESOURCE_SELF",
-    site_meter_mapping=None,
-    site_latlong_mapping="RESOURCE_SELF",
-    source_mode="12_hour_horizon",
-    lags=None,
    lags=None,
    site_config_mapping={
        "capybara": ["/app/pv_site.json"],
    },
    site_latlong_mapping={
        "capybara": dict(
            latitude=40.0,
            longitude=-100.0,
        ),
    },
    site_meter_mapping=None,
    source_mode="12_hour_horizon",
)

class PVForecastModel(
@@ -186,7 +296,6 @@ Engineers and data scientists commonly use an interactive web-based development
An [example notebook](https://github.com/SETO2243/forecasting/blob/main/example.ipynb) is provided in this GitHub
repository which demonstrates the core capabilities of the time series models library developed for the SETO project.

-<!-- TODO: Add screen shot of dockerized output, July 2024-->

## Input Data

16 changes: 16 additions & 0 deletions compose.yaml
@@ -0,0 +1,16 @@
# Comments are provided throughout this file to help you get started.
# If you need more help, visit the Docker compose reference guide at
# https://docs.docker.com/compose/compose-file/

# Here the instructions define your application as a service called "server".
# This service is built from the Dockerfile in the current directory.
# You can add other services your application may depend on here, such as a
# database or a cache. For examples, see the Awesome Compose repository:
# https://github.com/docker/awesome-compose
services:
  server:
    build:
      context: .
    ports:
      - 8890:8890

90 changes: 90 additions & 0 deletions example_forecast.py
@@ -0,0 +1,90 @@
import logging
import numpy as np

from time_series_models.time_series_models import RegularTimeSeriesModel
from time_series_models.processes import AmiHourlyForecast, PVForecast
from time_series_models.estimators import (
    XgbRegressor,
    IdentityRegressor,
)

logger = logging.getLogger(__name__)


def run_forecast_example():

    logger.info(
        "Starting forecast example for AMI meter forecast with XgBoost estimator!",
    )

    class XgbModel(AmiHourlyForecast, XgbRegressor, RegularTimeSeriesModel):
        pass

    config = dict(
        lags=np.array([24, 48, 168], dtype="timedelta64[h]"),
        day_of_week=True,
        harmonics=np.array([24, 168, 365 * 24], dtype="timedelta64[h]"),
        met_vars=["t", "r2"],
        met_horizon=12,
        mapping=dict(p2ulv18716=dict(latitude=35.0, longitude=-75.0)),
    )
    instance = XgbModel(**config)

    instance.fit("2021-01-15", "2021-01-31", "p2ulv18716")

    logger.info("Trained instance: %s", instance.model)

    features_df = instance.features_dataframe("2021-02-01", "2021-02-05", "p2ulv18716")
    logger.info("Features data: %s", features_df)

    predicted_df = instance.predict_dataframe(
        "2021-01-01", "2021-02-05", "p2ulv18716", range=True
    )

    logger.info("Predicted: %s", predicted_df)

    logger.info(
        "Starting forecast example for PV physical forecast!",
    )

    pv_config = dict(
        lags=None,
        site_config_mapping={
            "capybara": ["/app/pv_site.json"],
        },
        site_latlong_mapping={
            "capybara": dict(
                latitude=40.0,
                longitude=-100.0,
            ),
        },
        site_meter_mapping=None,
        source_mode="12_hour_horizon",
    )

    class PVForecastModel(
        PVForecast,
        IdentityRegressor,
        RegularTimeSeriesModel,
    ):
        pass

    pv_instance = PVForecastModel(**pv_config)
    pv_instance.model

    pv_instance.fit("2021-01-15", "2021-01-16", "capybara")

    pv_hrrr_df = pv_instance.hrrr_fetcher.source_loader(
        np.datetime64("2021-02-01"), np.datetime64("2021-02-05"), "capybara"
    )
    logger.info("PV HRRR Data: %s", pv_hrrr_df)

    pv_df = pv_instance.predict_dataframe("2021-02-01", "2021-02-05", "capybara")
    logger.info("pv predictions: %s", pv_df)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    run_forecast_example()
    logging.info("All done!")
    exit(0)