Releases: FRosner/drunken-data-quality
Drunken Data Quality 5.0.0
Drunken Data Quality 4.1.1
- make date format parsing strict instead of lenient for date format constraints, so that malformed rows are no longer reported as correct
- implement the missing number-of-rows constraints in the Python API
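
To illustrate the strict-versus-lenient distinction behind the date format fix above, here is a minimal sketch using the JDK's `SimpleDateFormat` directly. The class and method names (`StrictDateParsingSketch`, `parses`) are hypothetical and are not part of DDQ; the sketch only shows why lenient parsing can accept a malformed date.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Hypothetical helper for illustration only; not DDQ code.
class StrictDateParsingSketch {

    // Returns true if `value` can be parsed with `pattern` under the given leniency.
    static boolean parses(String pattern, String value, boolean lenient) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(lenient);
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Lenient parsing rolls the impossible date over into March and accepts it.
        System.out.println(parses("yyyy-MM-dd", "2016-02-30", true));  // prints true
        // Strict parsing rejects the same input.
        System.out.println(parses("yyyy-MM-dd", "2016-02-30", false)); // prints false
    }
}
```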
Drunken Data Quality 4.1.0
- Custom constraints on DataFrames
- Email reporter
Drunken Data Quality 4.0.0
- Change API to utilize the new `SparkSession` in Spark 2
- Fix a problem where reusing a `SimpleDateFormat` led to wrong results because it is not thread safe
- Add function to compare two data frames for equality
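
The `SimpleDateFormat` thread-safety problem mentioned in the 4.0.0 notes is a general JDK pitfall. One common way to avoid it, sketched below, is to give each thread its own formatter through a `ThreadLocal`. This is an assumption about a typical remedy, not necessarily how DDQ fixed it, and the names `ThreadSafeDateFormatSketch` and `roundTrip` are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.stream.IntStream;

// Hypothetical helper for illustration only; not DDQ code.
class ThreadSafeDateFormatSketch {

    // Each thread lazily creates its own formatter, so no instance is ever shared.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    // Parses the value and formats it back using the calling thread's formatter.
    static String roundTrip(String value) {
        try {
            SimpleDateFormat format = FORMAT.get();
            return format.format(format.parse(value));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + value, e);
        }
    }

    public static void main(String[] args) {
        // Run many round trips in parallel; each one should return its input unchanged.
        boolean allCorrect = IntStream.range(0, 1000)
            .parallel()
            .allMatch(i -> roundTrip("2016-08-01").equals("2016-08-01"));
        System.out.println(allCorrect);
    }
}
```

Creating a fresh `SimpleDateFormat` per call, or using the immutable `java.time.format.DateTimeFormatter`, would be alternative ways to achieve the same isolation.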
Drunken Data Quality 3.2.1
- fix standard reporter not working in IPython / Jupyter
- improve error logging in the Python stream wrappers
Drunken Data Quality 3.2.0
- Python API
Drunken Data Quality 3.1.0
- Zeppelin reporter
- Console.out as default PrintStream for ConsoleReporter
- Reporting showcase for ELK + Log4jReporter
Drunken Data Quality 3.0.0
- generalize number of rows constraint (not only equals but arbitrary condition)
- constraint result error state (e.g. if a column is not even existing)
- convertible checks have been reworked to take the target type as an argument
- add `Log4jReporter`, writing the results as JSON object strings to the Spark logger
- upgrade to Spark 1.6.x
Drunken Data Quality 2.1.0
- DDQ is now available as a spark-package
- Fix problem where IntelliJ could not compile the project properly
- Use singular and plural forms in messages
- Rework `isJoinableWith` message (showing also a match percentage)
Drunken Data Quality 2.0.0
Constraints
- Functional dependency constraint check
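
As a conceptual illustration of what a functional dependency check verifies: a dependency from one column to another holds when every value of the first column maps to exactly one value of the second. The sketch below works on plain in-memory rows and assumes non-null values; it is not DDQ's implementation or API, and the name `FunctionalDependencySketch` is hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not DDQ code.
class FunctionalDependencySketch {

    // Each row is a two-element array: {determinant, dependent}.
    // Returns true if every determinant value maps to exactly one dependent value.
    static boolean holds(List<String[]> rows) {
        Map<String, String> seen = new HashMap<>();
        for (String[] row : rows) {
            // putIfAbsent returns the value already stored for this determinant, if any.
            String previous = seen.putIfAbsent(row[0], row[1]);
            if (previous != null && !previous.equals(row[1])) {
                return false; // same determinant, different dependent value
            }
        }
        return true;
    }
}
```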
Checks and Reports
- Runner allows running multiple checks at once
- Reporter interfaces allow development and usage of custom reporters
- Added console reporter
- Added markdown reporter