A guide to building good data pipelines with Databricks Connect using best practices. Details: https://datathirst.net/blog/2019/9/20/series-developing-a-pyspark-application
This is a sample Databricks-Connect PySpark application, designed as a template for best practice and usability.
The project is designed for:
- Local Python development in an IDE (VSCode) using Databricks-Connect
- A well-structured PySpark application
- Simple data pipelines with reusable code
- Unit testing with Pytest (see the sketch after this list)
- Building a Python wheel
- A CI build with published test results
- Automated deployments/promotions
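To make the pipeline and testing goals concrete, here is a minimal sketch of the pattern this template encourages: pure DataFrame-in, DataFrame-out functions that can be reused across pipelines and exercised with Pytest. The function, column, and test names here are illustrative, not part of the template itself.

```python
from pyspark.sql import DataFrame, SparkSession
import pyspark.sql.functions as F

def add_ingest_date(df: DataFrame) -> DataFrame:
    # A pure DataFrame-in, DataFrame-out transformation: trivially
    # reusable across pipelines and easy to unit test in isolation.
    return df.withColumn("ingest_date", F.current_date())

def test_add_ingest_date():
    # With databricks-connect configured, getOrCreate() returns a session
    # attached to your remote cluster; the same test also runs on plain pyspark.
    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1,)], ["id"])
    assert "ingest_date" in add_ingest_date(df).columns
```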
Create a Conda Environment (open Conda prompt):
conda create --name dbconnectappdemo python=3.7
Activate the environment:
conda activate dbconnectappdemo
IMPORTANT: Open requirements.txt in the root folder and ensure the version of databricks-connect matches your cluster's Databricks runtime (for example, databricks-connect==6.4.* for a 6.4 runtime).
Install the requirements into your environment:
pip install -r requirements.txt
If you need to set up databricks-connect, run:
databricks-connect configure
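To verify the configuration end-to-end, you can run the built-in connectivity test:
databricks-connect test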
If you would like to deploy from your local PC to Databricks, create a file named MyBearerToken.txt in the root and paste in a bearer (personal access) token generated from the Databricks UI (User Settings > Access Tokens).
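As an illustration of what a local deployment step can do with that token, the sketch below uploads a built wheel (e.g. produced by python setup.py bdist_wheel) to DBFS via the Databricks REST API. The workspace URL, wheel path, and DBFS destination are assumptions, and the single-call dbfs/put endpoint is limited to files under 1 MB; this is not the template's actual deployment script.

```python
import base64
import pathlib
import requests

# Read the token saved from the Databricks UI (see above).
token = pathlib.Path("MyBearerToken.txt").read_text().strip()

# Assumptions: replace with your workspace URL and the wheel you built.
host = "https://<your-workspace>.azuredatabricks.net"
wheel = pathlib.Path("dist/dbconnectappdemo-0.1.0-py3-none-any.whl")

# Upload via the DBFS put API (single call, suitable for files < 1 MB).
resp = requests.post(
    f"{host}/api/2.0/dbfs/put",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "path": f"/FileStore/wheels/{wheel.name}",
        "contents": base64.b64encode(wheel.read_bytes()).decode(),
        "overwrite": True,
    },
)
resp.raise_for_status()
```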
Copyright Data Thirst Ltd (2019)