Stars
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data P…
Mock to use for Databricks Utils (dbutils) whenever it is not installed, e.g. on your local machine
A connector to ingest Azure Databricks lineage into Microsoft Purview
This repo is a local setup of self-hosted Vaultwarden with Traefik (using DuckDNS and Let's Encrypt) and ngrok.
An Open Standard for lineage metadata collection
Open, Multi-modal Catalog for Data & AI, written in Rust
An example Talos Linux Kubernetes cluster in Proxmox QEMU/KVM virtual machines using Terraform
Open, Multi-modal Catalog for Data & AI
Real-Time Trade Analysis Pipeline using Finnhub.io and Spark
A data warehousing application on NYC Food Inspection
A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev
Object and animal detection with a Wi-Fi camera and YOLO
Removes the enterprise repository so that Proxmox can be updated, and removes the no-subscription popup on login
A list of Free Software network services and web applications which can be hosted on your own servers
A list of Terraform modules to build your Azure Data IaC templates.
A Python package that helps Databricks Unity Catalog users read and query Delta Lake tables with Polars, DuckDB, or PyArrow.
Squid in a Docker container based on Alpine Linux
Terraform script to deploy almost all Azure Data Services
🕶️ A curated list of awesome tools for dealing with CSV.
To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a music streaming platform, let’s delve into the detailed wor…
Easy and Repeatable Kubernetes Development