This repository provides an example for converting the SQLServer integration PR submitted to the main MLflow repository by @avflor (mlflow/mlflow#1734) into an MLflow plugin.
To install this plugin package:
- Initialize the custom mlflow submodule by running the following command from the repository root:
$ git submodule update --init
- Invoke the install.sh script from the repository's root directory. This script uses pip to install the sqlplugin library (which provides a custom artifact repository implementation) as well as a custom version of the MLflow library that defines a [sqlserver] extras tag to automatically install the sqlplugin dependency when MLflow is installed (this is not strictly required for the plugin to work).
Once the plugin has been installed along with the custom version of MLflow, run the artifact logging test Python script located at tests/log_artifacts.py. If this Python script creates an mlruns directory and a SQLite database called testartifactdb without emitting errors, the plugin is working as expected.
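The success criteria above can be checked with a small helper. This is a minimal sketch, not part of the repository: the function name missing_outputs is hypothetical, and it assumes that both the mlruns directory and the testartifactdb file are created in the directory the test script was run from.

```python
from pathlib import Path


def missing_outputs(root="."):
    """Return the names of expected outputs that are absent under `root`.

    Assumption: tests/log_artifacts.py writes both outputs into the
    directory it was launched from; adjust `root` if that is not the case.
    """
    root = Path(root)
    expected = {
        "mlruns": (root / "mlruns").is_dir(),
        "testartifactdb": (root / "testartifactdb").is_file(),
    }
    return [name for name, present in expected.items() if not present]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print(f"missing outputs: {missing}")
    else:
        print("expected outputs found")
```

An empty list only confirms that the outputs exist; errors emitted by the test script itself still need to be checked separately.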
This repository contains two main Python packages:
sqlplugin: This package includes the DBArtifactRepository class that is used to read and write artifacts from SQL databases. This class sets the attribute is_plugin = True in order to indicate that the class is an MLflow artifact repository plugin. This package also includes the SQLAlchemy database models referenced by DBArtifactRepository. The files associated with this implementation were taken from the MLflow GitHub PR by @avflor (mlflow/mlflow#1734); a couple of small modifications were applied for correctness purposes that are not relevant to the plugin. Most critically, the package's setup.py file defines entrypoints that tell MLflow to automatically associate the mssql and sqlite URIs with the DBArtifactRepository implementation when the sqlplugin library is installed. The entrypoints are configured as follows:
    entry_points={
        "mlflow.artifact_repository": [
            "mssql=sqlplugin.store:DBArtifactRepository",
            "sqlite=sqlplugin.store:DBArtifactRepository",
        ]
    },
mlflow: This package refers to a branch of the MLflow repository based on the MLflow version 1.3.0 release. This custom branch applies the changes in this PR: dbczumar/mlflow#4. This PR simply defines a custom installation parameter called [sqlserver], which enables the user to install MLflow and sqlplugin together by running: pip install mlflow[sqlserver]. This is a nice-to-have feature that makes the sqlplugin library easier to install along with MLflow, but it is not strictly required for the plugin to work.
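Each entry point declared by sqlplugin has the form scheme=module:ClassName, which lets a URI scheme be mapped to an artifact repository class. The following is a minimal standard-library sketch of that mapping, not MLflow's actual registry code; build_registry and resolve are hypothetical helper names, and the example only returns the module and class names instead of importing them.

```python
from urllib.parse import urlparse

# The two entry point specs declared in sqlplugin's setup.py.
ENTRY_POINTS = [
    "mssql=sqlplugin.store:DBArtifactRepository",
    "sqlite=sqlplugin.store:DBArtifactRepository",
]


def build_registry(specs):
    """Map each URI scheme to a (module, class name) pair."""
    registry = {}
    for spec in specs:
        scheme, target = spec.split("=", 1)
        module, _, attr = target.partition(":")
        registry[scheme.strip()] = (module.strip(), attr.strip())
    return registry


def resolve(uri, registry):
    """Look up the repository implementation for an artifact URI."""
    scheme = urlparse(uri).scheme
    if scheme not in registry:
        raise ValueError(f"no artifact repository registered for {scheme!r}")
    return registry[scheme]


registry = build_registry(ENTRY_POINTS)
print(resolve("sqlite:///testartifactdb", registry))
```

Both schemes resolve to the same class, which matches the configuration above: a single DBArtifactRepository implementation serves mssql and sqlite URIs.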
Note: sqlite is only included here for testing purposes. We recommend that this plugin ultimately target mssql exclusively.
Note 2: sqlplugin is used as an example plugin package name. Feel free to use another name of your choosing.
We propose that the CISL team control the development and testing of the sqlplugin component in their own repository. The MLflow team will review and incorporate a PR similar to dbczumar/mlflow#4 in order to provide users with the ability to install both MLflow and the SQL Server plugin library at the same time.
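For illustration, an extras tag of this kind is typically declared through the setuptools extras_require argument. The fragment below is a hedged sketch of what such a declaration might look like; it is not copied from dbczumar/mlflow#4, and the variable name EXTRAS_REQUIRE is hypothetical.

```python
# Hypothetical excerpt from MLflow's setup.py. The actual change is in
# dbczumar/mlflow#4 and may differ (for example, by pinning a version).
EXTRAS_REQUIRE = {
    # `pip install mlflow[sqlserver]` would additionally install sqlplugin.
    "sqlserver": ["sqlplugin"],
}

# In setup.py this would be passed as: setup(..., extras_require=EXTRAS_REQUIRE)
```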
The proposed plugin structure and development workflow provide the following experience to the end user:
Users can simply install MLflow with the SQL Server plugin via pip install mlflow[sqlserver] and then use MLflow as normal. The SQLServer artifact support will be provided automatically using the previously-described setup entrypoints mechanism.
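To confirm that the entry points were registered after installation, the installed package metadata can be inspected with the standard library. This is a minimal sketch assuming Python 3.8 or later; installed_artifact_plugins is a hypothetical helper name, and it only lists what is registered without loading any plugin.

```python
from importlib.metadata import entry_points


def installed_artifact_plugins(group="mlflow.artifact_repository"):
    """Return {scheme: "module:Class"} for entry points in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for scheme, target in installed_artifact_plugins().items():
        print(f"{scheme} -> {target}")
```

With sqlplugin installed, the result should contain mssql and sqlite keys pointing at sqlplugin.store:DBArtifactRepository; an empty result suggests the plugin package is not installed in the active environment.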
The SQLServer integration PR submitted by @avflor (mlflow/mlflow#1734) defines some additional logic for parsing artifact URIs of the type ArtifactRepositoryType.DB (https://github.com/mlflow/mlflow/blob/c0e1fa56587f858dddf15f26852b2fa8c51c6d51/mlflow/tracking/artifact_utils.py#L59-L76, https://github.com/mlflow/mlflow/blob/c0e1fa56587f858dddf15f26852b2fa8c51c6d51/mlflow/store/artifact_repository_registry.py#L97-L102). The MLflow team will need to determine whether or not this additional logic is necessary and potentially merge artifact URI parsing changes into the main repository for plugin compatibility purposes.