A curated collection of streamlined and effective scripts and tools designed specifically for data engineering tasks.

Script | Description |
---|---|
aws-glueetl-costs-analysis | Crawl the cost data of AWS Glue ETL jobs, create scatter plots, and help find jobs with abnormal costs. |
databricks-sql-warehouse-unload | Unload data from a Databricks SQL warehouse and save it to a local file or AWS S3. Supported result formats: Parquet and CSV. |
snow-bench-app | Try snow-bench to automate batch querying on Snowflake. |
awsglue-rclone | awsglue-rclone is a wrapper for Rclone designed for use with AWS Glue Python Shell. Rclone everything within Glue! |
sqlgen | sqlgen is a small tool that automatically generates SQL statements for table creation from Excel document templates. It supports both command-line and Web UI usage, making it ideal for scenarios that frequently involve database design and table creation. |
simple-rotating-file-writer | Writes log messages to a file, automatically rotating the log file and managing backup files when a specified size limit is exceeded. |
pyspark-recipe | A collection of reusable code snippets and functions that simplify common data processing tasks in Apache Spark. |
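
To illustrate the size-based rotation idea described for simple-rotating-file-writer, here is a minimal sketch using only the Python standard library. It is not the tool's actual code or API: the class name `RotatingFileWriter` and the parameters `max_bytes` and `backup_count` are hypothetical names chosen for this example, and the backup naming scheme (`.1`, `.2`, ...) is an assumption.

```python
import os


class RotatingFileWriter:
    """Hypothetical sketch of size-based rotation; not the tool's real API."""

    def __init__(self, path, max_bytes=1024, backup_count=3):
        self.path = path                  # active log file
        self.max_bytes = max_bytes        # size limit that triggers rotation
        self.backup_count = backup_count  # number of rotated files to keep

    def _rotate(self):
        # Shift existing backups up by one (log.1 -> log.2, ...);
        # the oldest backup is overwritten by os.replace.
        for i in range(self.backup_count - 1, 0, -1):
            src = f"{self.path}.{i}"
            if os.path.exists(src):
                os.replace(src, f"{self.path}.{i + 1}")
        if self.backup_count > 0:
            os.replace(self.path, f"{self.path}.1")
        else:
            os.remove(self.path)

    def write(self, message):
        data = message + "\n"
        size = os.path.getsize(self.path) if os.path.exists(self.path) else 0
        # Rotate before writing if this message would push a non-empty
        # file past the limit.
        if size > 0 and size + len(data.encode("utf-8")) > self.max_bytes:
            self._rotate()
        with open(self.path, "a", encoding="utf-8", newline="") as f:
            f.write(data)
```

Calling `RotatingFileWriter("app.log", max_bytes=1_000_000, backup_count=5).write("job started")` would append to `app.log` and keep at most five rotated files next to it. For production logging, the standard library's `logging.handlers.RotatingFileHandler` offers the same behavior.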