We happily welcome contributions to DiscoverX. We use GitHub Issues to track community-reported issues and GitHub Pull Requests to accept changes.
To work on this project, you need Python 3.X and pip or conda for package management.
- Instantiate a local Python environment via a tool of your choice. This example is based on conda, but you can use any environment management tool:
conda create -n discoverx python=3.9
conda activate discoverx
- If you don't have a JDK installed on your local machine, install one (this example uses a conda-based installation):
conda install -c conda-forge openjdk=11.0.15
- Install the project locally (this also installs the dev requirements):
pip install -e ".[local,test]"
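After the setup steps above, a quick sanity check can confirm that the interpreter and the JDK are visible from the active environment. The following is only a minimal sketch: the helper name check_environment and the default minimum version (3, 9), which mirrors the conda example above, are assumptions and not part of DiscoverX.

```python
import shutil
import sys


def check_environment(min_python=(3, 9)):
    """Return a list of problems found; an empty list means all checks passed.

    `min_python` is an assumed threshold taken from the conda example,
    not a documented project requirement.
    """
    problems = []
    if sys.version_info < min_python:
        required = ".".join(str(part) for part in min_python)
        problems.append(f"Python {required}+ expected, found {sys.version.split()[0]}")
    # A local Spark instance needs a Java runtime on PATH.
    if shutil.which("java") is None:
        problems.append("No java executable found on PATH; install a JDK first")
    return problems


if __name__ == "__main__":
    found = check_environment()
    for problem in found:
        print(problem)
    if not found:
        print("Environment looks ready")
```

Run it inside the activated environment; any reported problem points at the setup step that still needs attention.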
For unit testing, please use pytest:
pytest tests/unit --cov
Please check the tests/unit directory for more details on how to use unit tests.
In tests/unit/conftest.py you'll also find useful testing primitives, such as a local Spark instance with Delta support, local MLflow, and a DBUtils fixture.
Please set the following secrets or environment variables for your CI provider:
DATABRICKS_HOST
DATABRICKS_TOKEN
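Before relying on CI, it can help to verify locally that both variables listed above are actually set. This is a small sketch under stated assumptions: the helper missing_variables is hypothetical and is not provided by DiscoverX or by any CI provider.

```python
import os

# The two variables named in this guide.
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_variables(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; passing a dict makes
    the check easy to exercise in a test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list from missing_variables() means both values are present; the function never prints the values themselves, so the token does not leak into logs.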
- To trigger the CI pipeline, simply push your code to the repository. If the CI provider is set up correctly, the push triggers the general testing pipeline.
- To trigger the release pipeline, get the current version from the discoverx/__init__.py file and tag the current code version:
git tag -a v<your-project-version> -m "Release tag for version <your-project-version>"
git push origin --tags
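The release steps above can be sketched in Python to avoid copying the version by hand. This assumes that discoverx/__init__.py declares the version as a `__version__ = "..."` string assignment, which this guide does not confirm; read_version and tag_commands are hypothetical helpers.

```python
import re

# Assumption: the version is declared as __version__ = "x.y.z" at the start of a line.
VERSION_PATTERN = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def read_version(init_text):
    """Extract the version string from the text of an __init__.py file."""
    match = VERSION_PATTERN.search(init_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def tag_commands(version):
    """Build the two git commands from this guide as argument lists."""
    tag = f"v{version}"
    return [
        ["git", "tag", "-a", tag, "-m", f"Release tag for version {version}"],
        ["git", "push", "origin", "--tags"],
    ]
```

To use it, pass the contents of discoverx/__init__.py to read_version and hand each resulting argument list to subprocess.run(command, check=True). Reviewing the commands before running them is a sensible precaution, since pushing a tag starts the release pipeline.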