A Python package that provides helper functions for common Spark ETL tasks such as cleaning, deduplication, and enrichment.
- Free software: MIT license
- Documentation: https://spark-etl-python.readthedocs.io.
- TODO
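
As a rough illustration of the kind of helpers described above, the sketch below combines a cleaning step (trimming string columns) with deduplication in PySpark. The `clean_and_deduplicate` function is hypothetical and not part of this package's actual API; consult the documentation for the real interface.

```python
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType


def clean_and_deduplicate(df: DataFrame, key_columns: list) -> DataFrame:
    """Trim whitespace in all string columns, then drop duplicate rows on the keys.

    Hypothetical helper, shown for illustration only.
    """
    string_columns = [field.name for field in df.schema.fields
                      if isinstance(field.dataType, StringType)]
    for column in string_columns:
        df = df.withColumn(column, F.trim(F.col(column)))
    return df.dropDuplicates(key_columns)


if __name__ == "__main__":
    spark = SparkSession.builder.master("local[*]").appName("spark-etl-example").getOrCreate()
    raw = spark.createDataFrame(
        [(" alice ", 1), ("alice", 1), ("bob", 2)],
        ["name", "customer_id"],
    )
    # " alice " is trimmed to "alice", making the first two rows duplicates.
    clean_and_deduplicate(raw, ["name", "customer_id"]).show()
    spark.stop()
```

In a real pipeline, a helper like this would typically be chained with enrichment steps (for example, joining reference data) before the result is written out.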
To set up a development environment for this package:
- Create a virtual environment
- Install pip-tools: `pip install pip-tools`
- Run `pip-sync requirements_dev.txt requirements.txt` to install the pinned runtime and development dependencies
To add or update dependencies, add them to `requirements.in` (for packages needed to run this package) or `requirements_dev.in` (for development-only tools). Then run `pip-compile requirements.in` or `pip-compile requirements_dev.in` to regenerate the pinned requirements files.
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.