feat: add Sparkle class #4

Merged 10 commits on Sep 5, 2024
1 change: 1 addition & 0 deletions .github/workflows/test.yml
@@ -9,6 +9,7 @@ on:
 jobs:
   tests:
     strategy:
+      fail-fast: false # Prevents the matrix from failing fast
       matrix:
         os: [ubuntu-latest, macos-latest]
     runs-on: ${{ matrix.os }}
100 changes: 99 additions & 1 deletion README.md
@@ -1 +1,99 @@
# sparkle
# Sparkle ✨

**Sparkle** is a meta-framework built on top of [Apache
Spark](https://spark.apache.org/), designed to streamline data
engineering workflows and accelerate the delivery of data
products. Developed by [**DataChef**](https://datachef.co), Sparkle
focuses on three main areas:

1. **Improving Developer Experience (DevEx) 🚀**
2. **Reducing Time to Market ⏱️**
3. **Easy Maintenance 🔧**

With these goals in mind, Sparkle has enabled DataChef to deliver
functional data products from day one, allowing for seamless handovers
to internal teams.

## Key Features

### 1. Improved Developer Experience 🚀

Sparkle enhances the developer experience by abstracting away
non-business-critical aspects of Spark application development. It
achieves this through:

- **Sophisticated Configuration Mechanism**: Simplifies the setup and
configuration of Spark applications, allowing developers to focus
solely on business logic.
- **Automatic Functional Tests 🧪**: Generates tests for each
application automatically, based on predefined input and output
fixtures. This ensures that the application behaves as expected
without requiring extensive manual testing.
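
The fixture-based testing idea above can be sketched in plain Python.
This is a minimal illustration, not Sparkle's actual API: the names
`check_against_fixtures`, `normalize`, and `total_per_customer` are
hypothetical, and a list of dicts (`Table`) stands in for a Spark
DataFrame so the sketch runs without a Spark installation.

```python
from typing import Any, Callable

# Assumption: a list of dicts stands in for a Spark DataFrame.
Row = dict[str, Any]
Table = list[Row]


def normalize(table: Table) -> list[str]:
    """Return an order-insensitive, comparable form of a table."""
    return sorted(repr(sorted(row.items())) for row in table)


def check_against_fixtures(
    transform: Callable[[list[Table]], Table],
    input_fixtures: list[Table],
    expected_output: Table,
) -> bool:
    """Run `transform` on the input fixtures and compare with the expected output."""
    actual = transform(input_fixtures)
    return normalize(actual) == normalize(expected_output)


def total_per_customer(inputs: list[Table]) -> Table:
    """Hypothetical business logic: sum order amounts per customer."""
    (orders,) = inputs
    totals: dict[str, int] = {}
    for order in orders:
        totals[order["customer"]] = totals.get(order["customer"], 0) + order["amount"]
    return [{"customer": name, "total": total} for name, total in totals.items()]


# Predefined input and output fixtures (hypothetical data).
orders_fixture = [
    {"customer": "ada", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "ada", "amount": 7},
]
expected_fixture = [
    {"customer": "bob", "total": 5},
    {"customer": "ada", "total": 17},
]

passed = check_against_fixtures(total_per_customer, [orders_fixture], expected_fixture)
```

Rows are compared without regard to order because distributed
engines such as Spark do not guarantee row order unless the data is
sorted explicitly.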

### 2. Reduced Time to Market ⏱️

Sparkle significantly reduces the time to market by automating the
deployment and testing processes. This allows data engineers to
concentrate exclusively on developing the business logic, with all
other aspects handled by Sparkle:

- **Automated Testing ✅**: Ensures that all applications are robust
and ready for deployment without manual intervention.
- **Seamless Deployment 🚢**: Automates the deployment pipeline,
reducing the time needed to bring new data products to market.

### 3. Enhanced Maintenance 🔧

Sparkle simplifies maintenance through extensive testing and by
abstracting away non-business requirements. The result is a reliable
system that is easy to maintain:

- **Abstraction of Non-Business Logic 📦**: By focusing on business
logic, Sparkle minimizes the complexity associated with maintaining
Spark applications.
- **Heavily Tested Framework 🔍**: All non-business functionalities
are thoroughly tested, reducing the risk of bugs and ensuring a
stable environment for data applications.

## How It Works 🛠️

The Sparkle framework operates on a principle similar to Function as a
Service (FaaS). Developers can instantiate a Sparkle application that
takes a list of input DataFrames and focuses solely on transforming
these DataFrames according to the business logic. The Sparkle
application then automatically writes the output of this
transformation to the desired destination.
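
The read, transform, write flow described above can be sketched as
follows. This is a simplified illustration under stated assumptions,
not Sparkle's real interface: `MiniApp`, `AdultsOnly`, `read_people`,
and `sink` are hypothetical names, and a list of dicts stands in for a
Spark DataFrame.

```python
from typing import Any, Callable

# Assumption: a list of dicts stands in for a Spark DataFrame.
Table = list[dict[str, Any]]


class MiniApp:
    """Hypothetical FaaS-style application: read inputs, transform, write output."""

    def __init__(
        self,
        readers: list[Callable[[], Table]],
        writer: Callable[[Table], None],
    ) -> None:
        self.readers = readers
        self.writer = writer

    def process(self, inputs: list[Table]) -> Table:
        """Business logic; the only part an application author implements."""
        raise NotImplementedError

    def run(self) -> None:
        # The framework side: load every input, apply the business
        # logic, then hand the result to the configured destination.
        inputs = [read() for read in self.readers]
        self.writer(self.process(inputs))


class AdultsOnly(MiniApp):
    """Example application that keeps only rows with age >= 18."""

    def process(self, inputs: list[Table]) -> Table:
        (people,) = inputs
        return [row for row in people if row["age"] >= 18]


def read_people() -> Table:
    return [{"name": "Ada", "age": 36}, {"name": "Tim", "age": 12}]


sink: Table = []  # stands in for the output destination
AdultsOnly(readers=[read_people], writer=sink.extend).run()
```

Only `process` contains business logic; reading and writing stay in
the base class, which mirrors the separation the paragraph above
describes.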

## Getting Started 🚀

Sparkle is currently under heavy development, and we are continuously
working on improving and expanding its capabilities.

To stay updated on our progress and access the latest information,
follow us on [LinkedIn](https://nl.linkedin.com/company/datachefco)
and [GitHub](https://github.com/DataChefHQ/Sparkle).

## Contributing 🤝

We welcome contributions from the community! If you're interested in
contributing to Sparkle, please check our [GitHub
repository](https://github.com/DataChefHQ/Sparkle) for more details on
how you can get involved.

## License 📄

Sparkle is licensed under the Apache License 2.0. See the
[LICENSE](LICENSE) file for more details.

## Contact 📬

For more information, questions, or feedback, feel free to reach out
to us on [LinkedIn](https://nl.linkedin.com/company/datachefco) or
open an issue on our
[GitHub](https://github.com/DataChefHQ/sparkle/issues) repository.

---

Thank you for your interest in Sparkle! We're excited to have you join
us on this journey to revolutionize data engineering with Apache
Spark. 🎉
64 changes: 51 additions & 13 deletions devenv.lock
@@ -3,11 +3,11 @@
     "devenv": {
       "locked": {
         "dir": "src/modules",
-        "lastModified": 1721817837,
+        "lastModified": 1725287955,
         "owner": "cachix",
         "repo": "devenv",
-        "rev": "44bfc26843694ab17ebae1d4922065e48d93f501",
-        "treeHash": "62b4ad814fcc952c5660916c9cdadc34927b3330",
+        "rev": "0ceddcb8040b72943d34af364cde79294222e1af",
+        "treeHash": "3950480113111b9ffacc5b7a2fb08d27e40ac0cd",
         "type": "github"
       },
       "original": {
@@ -33,6 +33,22 @@
         "type": "github"
       }
     },
+    "flake-compat_2": {
+      "flake": false,
+      "locked": {
+        "lastModified": 1696426674,
+        "owner": "edolstra",
+        "repo": "flake-compat",
+        "rev": "0f9255e01c2351cc7d116c072cb317785dd33b33",
+        "treeHash": "2addb7b71a20a25ea74feeaf5c2f6a6b30898ecb",
+        "type": "github"
+      },
+      "original": {
+        "owner": "edolstra",
+        "repo": "flake-compat",
+        "type": "github"
+      }
+    },
     "flake-utils": {
       "inputs": {
         "systems": "systems"
@@ -95,11 +111,11 @@
         ]
       },
       "locked": {
-        "lastModified": 1720642556,
+        "lastModified": 1724996935,
         "owner": "nlewo",
         "repo": "nix2container",
-        "rev": "3853e5caf9ad24103b13aa6e0e8bcebb47649fe4",
-        "treeHash": "a9c2f1d3f52f288515ca0fb11f9aed970fd869b6",
+        "rev": "fa6bb0a1159f55d071ba99331355955ae30b3401",
+        "treeHash": "a934d246fadcf8b36d28f3577fad413f5ab3f7d3",
         "type": "github"
       },
       "original": {
@@ -124,13 +140,34 @@
         "type": "github"
       }
     },
+    "nixpkgs-python": {
+      "inputs": {
+        "flake-compat": "flake-compat",
+        "nixpkgs": [
+          "nixpkgs"
+        ]
+      },
+      "locked": {
+        "lastModified": 1722978926,
+        "owner": "cachix",
+        "repo": "nixpkgs-python",
+        "rev": "7c550bca7e6cf95898e32eb2173efe7ebb447460",
+        "treeHash": "d9d38ef1b6fc92be18170b74e9889a7ab9174f6e",
+        "type": "github"
+      },
+      "original": {
+        "owner": "cachix",
+        "repo": "nixpkgs-python",
+        "type": "github"
+      }
+    },
     "nixpkgs-stable": {
       "locked": {
-        "lastModified": 1721821769,
+        "lastModified": 1725001927,
         "owner": "NixOS",
         "repo": "nixpkgs",
-        "rev": "d0907b75146a0ccc1ec0d6c3db287ec287588ef6",
-        "treeHash": "4db84624993e912855fe9497179a08cbe6893cce",
+        "rev": "6e99f2a27d600612004fbd2c3282d614bfee6421",
+        "treeHash": "1e85443cc9f0ba302df2cf61cacb8014943e2d19",
         "type": "github"
       },
       "original": {
@@ -142,19 +179,19 @@
     },
     "pre-commit-hooks": {
       "inputs": {
-        "flake-compat": "flake-compat",
+        "flake-compat": "flake-compat_2",
         "gitignore": "gitignore",
         "nixpkgs": [
           "nixpkgs"
         ],
         "nixpkgs-stable": "nixpkgs-stable"
       },
       "locked": {
-        "lastModified": 1721042469,
+        "lastModified": 1724857454,
         "owner": "cachix",
         "repo": "pre-commit-hooks.nix",
-        "rev": "f451c19376071a90d8c58ab1a953c6e9840527fd",
-        "treeHash": "91f40b7a3b9f6886bd77482cba5b5cd890415a2e",
+        "rev": "4509ca64f1084e73bc7a721b20c669a8d4c5ebe6",
+        "treeHash": "ced5a8df7c554ce10bf26223c002f41b31aff034",
         "type": "github"
       },
       "original": {
@@ -169,6 +206,7 @@
         "mk-shell-bin": "mk-shell-bin",
         "nix2container": "nix2container",
         "nixpkgs": "nixpkgs",
+        "nixpkgs-python": "nixpkgs-python",
         "pre-commit-hooks": "pre-commit-hooks"
       }
     },
4 changes: 4 additions & 0 deletions devenv.nix
@@ -45,6 +45,7 @@

     python = {
       enable = true;
+      version = "3.10.14";
       venv = {
         enable = true;
         requirements = ''
@@ -59,6 +60,9 @@
     };
   };

+  languages.java.enable = true;
+  languages.java.jdk.package = pkgs.jdk17;
+
   enterShell = ''
     hello
     pdm install
6 changes: 5 additions & 1 deletion devenv.yaml
@@ -8,4 +8,8 @@ inputs:
     url: github:rrbutani/nix-mk-shell-bin
   nixpkgs:
     url: github:cachix/devenv-nixpkgs/rolling
-
+  nixpkgs-python:
+    url: github:cachix/nixpkgs-python
+    inputs:
+      nixpkgs:
+        follows: nixpkgs