# MLflow Container Setup

This setup focuses on experiment and artifact tracking using MLflow.

## Quick start

Requires poetry, docker, and docker compose.

```bash
poetry install
```

Build the image, set environment variables, and start the containers (within the docker folder):

```bash
cd docker && \
./build_image.sh \
--repository localhost/mlflow \
--tag latest && \
\
echo '#!/bin/bash

# mlflow settings
export MLFLOW_PORT=5000

export POSTGRES_DATA=$(pwd)/data/pgdata
export STORAGE_DATA=$(pwd)/data/storage

# db settings
export POSTGRES_USER=mlflow
export POSTGRES_PASSWORD=mlflow123

# (optional) mlflow s3 storage backend settings (e.g. can be minio)
# export MLFLOW_ARTIFACTS_DESTINATION=s3://yourbucketname/yourfolder
# export AWS_ACCESS_KEY_ID=youraccesskey
# export AWS_SECRET_ACCESS_KEY=yoursecretaccesskey
# export MLFLOW_S3_ENDPOINT_URL=https://minio.yourdomain.com
# export MLFLOW_S3_IGNORE_TLS=true' > .env.sh && \
\
source .env.sh && \
\
if [ ! -d "./data/pgdata" ] ; then mkdir -p $POSTGRES_DATA; fi && \
if [ ! -d "./data/storage" ] ; then mkdir -p $STORAGE_DATA; fi && \
\
docker compose up -d
```

Now check out http://localhost:5000.

### Samples

Run sample tracking script

```bash
poetry run python samples/tracking.py
```
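
The tracking sample is not reproduced in this README. As a rough, hypothetical sketch (the experiment and run names below are made up, and the real samples/tracking.py may differ), a minimal tracking script against the server started above could look like this:

```python
# Hypothetical sketch of experiment tracking against the local server;
# not the actual contents of samples/tracking.py.
import os

import mlflow

# point the client at the tracking server started via docker compose
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI", "http://localhost:5000"))
mlflow.set_experiment("container-setup-demo")  # hypothetical experiment name

with mlflow.start_run(run_name="demo-run"):
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.93)
    for step, loss in enumerate([0.9, 0.5, 0.3]):
        mlflow.log_metric("loss", loss, step=step)
```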

Run sample artifact script

```bash
poetry run python samples/artifacts.py
```
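
Likewise, a minimal artifact-logging script might look roughly like the following (again a hypothetical sketch, not the actual contents of samples/artifacts.py):

```python
# Hypothetical sketch of artifact logging; names and paths are made up.
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("container-setup-demo")  # hypothetical experiment name

with mlflow.start_run():
    # write a small local file and log it as a run artifact
    with open("report.txt", "w") as f:
        f.write("hello artifact store\n")
    mlflow.log_artifact("report.txt", artifact_path="reports")

    # dictionaries can be logged directly as JSON artifacts
    mlflow.log_dict({"threshold": 0.5}, "config/params.json")
```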

Navigate to http://localhost:5000 to see the MLflow UI and the tracked experiments.

## Local Setup

Using plain Python and mlflow server.

### Basic

Using poetry. Runs and artifacts are stored in the mlruns and mlartifacts directories.

```bash
poetry install && \
poetry run mlflow server --host 0.0.0.0
```

### Backends

#### Database

Using postgres as the backend store.

```bash
docker run -d --name ml-postgres -p 5432:5432 \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=postgres_password \
-e POSTGRES_DB=mlflow \
postgres:latest
```

Run mlflow server with the postgres backend (only psycopg2 is supported):

```bash
poetry run mlflow server --backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow --host 0.0.0.0
```

Run sample tracking script

```bash
poetry run python samples/tracking.py
```
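
As an optional sanity check (hypothetical, not part of the samples), you can query the server to confirm that experiment and run metadata is now served from the postgres backend:

```python
# Hypothetical check that the tracking server answers from its backend store.
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="http://localhost:5000")

# list all experiments known to the server
for exp in client.search_experiments():
    print(exp.experiment_id, exp.name)

# list runs of the default experiment (experiment id "0")
for run in client.search_runs(experiment_ids=["0"]):
    print(run.info.run_id, run.data.metrics)
```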

#### Artifacts Store

##### s3

Set S3 credentials and endpoint URL

```bash
echo '
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export MLFLOW_S3_ENDPOINT_URL=...
' > .env.sh
```

Start mlflow server with the s3 artifact store (via --default-artifact-root; clients access s3 directly):

```bash
source .env.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--default-artifact-root s3://my-bucket/mlflow/test \
--host 0.0.0.0
```

Run the sample (the client requires the s3 credentials):

```bash
source .env.sh && \
poetry run python samples/artifacts.py
```

Proxied s3 backend for artifacts (clients do not need to know the s3 credentials):

```bash
source .env.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--artifacts-destination s3://my-bucket/mlflow/test \
--host 0.0.0.0
```

Run the sample (the client does not need the s3 credentials):

```bash
poetry run python samples/artifacts.py
```
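
Because artifacts are proxied through the tracking server, a client can also fetch them back without any s3 configuration. A hypothetical sketch (the run id and artifact path are placeholders):

```python
# Hypothetical sketch: download artifacts through the proxied tracking server,
# no s3 credentials needed on the client side.
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

run_id = "..."  # placeholder: take a run id from the UI or a previous script
local_dir = mlflow.artifacts.download_artifacts(
    run_id=run_id,
    artifact_path="reports",  # placeholder artifact path
    dst_path="./downloaded",
)
print("artifacts downloaded to", local_dir)
```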

##### azure blob storage

Set Azure credentials and endpoint URL (see the MLflow artifact-store and Azure Blob Storage documentation for details).

echo "
export AZURE_STORAGE_CONNECTION_STRING='AccountName=<YOUR_ACCOUNT_NAME>;AccountKey=<YOUR_KEY>;EndpointSuffix=core.windows.net;DefaultEndpointsProtocol=https;'
export AZURE_STORAGE_ACCESS_KEY='<YOUR_KEY>'
" > .env_azure.sh

Proxied azure blob storage backend for artifacts (clients do not need to know the Azure credentials):

```bash
source .env_azure.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--artifacts-destination wasbs://my-container@mystorageaccount.blob.core.windows.net/my-folder \
--host 0.0.0.0
```

Run the sample (the client does not need the Azure credentials):

```bash
poetry run python samples/artifacts.py
```

### Metrics

Using prometheus as the metrics backend.

```bash
source .env.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--artifacts-destination s3://my-bucket/mlflow/test \
--expose-prometheus ./metrics \
--host 0.0.0.0
```

For running MLflow as a tracking service it is highly recommended to use gunicorn with gevent workers (optimized for non-blocking I/O). The following describes a corresponding ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mlflow-additional-config
data:
  MLFLOW_HOST: "0.0.0.0"
  MLFLOW_PORT: "5000"
  MLFLOW_ADDITIONAL_OPTIONS: "--gunicorn-opts '--worker-class gevent --threads 4 --timeout 300 --keep-alive 300 --log-level INFO'"
```

## Deployment Server

Create an env file containing API keys and secrets:

```bash
echo '#!/bin/bash

# openai
export OPENAI_API_KEY=yoursecretkey
export OPENAI_API_KEY2=yoursecretkey

# anthropic
export ANTHROPIC_API_KEY=yoursecretkey' > .env-deployments-server.sh
```

Start the mlflow deployments server with additional options:

```bash
source .env-deployments-server.sh && \
mlflow deployments start-server --config-path samples/config.yaml --workers 4
```

### Samples

Run the samples after starting the deployments server:

```bash
poetry run python samples/completions.py
poetry run python samples/embeddings.py
poetry run python samples/chat.py
```
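
The sample scripts themselves are not reproduced here. As a hypothetical sketch of how a client might query the deployments server (the endpoint names are assumed to match those defined in samples/config.yaml, and the actual samples may differ):

```python
# Hypothetical sketch of querying the deployments server with the
# mlflow.deployments client; endpoint names are assumptions.
from mlflow.deployments import get_deploy_client

# adjust the URI if your deployments server listens elsewhere
client = get_deploy_client("http://localhost:5000")

# chat-style request (assumes an endpoint named "chat")
response = client.predict(
    endpoint="chat",
    inputs={"messages": [{"role": "user", "content": "Hello, world!"}]},
)
print(response)

# embeddings request (assumes an endpoint named "embeddings")
embeddings = client.predict(endpoint="embeddings", inputs={"input": ["some text"]})
print(embeddings)
```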