Releases: HXLStandard/libhxl-python
libhxl-python v4.8.4
Interim release:
- add a new JSONFilter for extracting data from embedded JSON in a cell
- fix the representation of embedded JSON data inside a cell.
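
As a rough illustration of what extracting data from embedded JSON in a cell involves, the sketch below parses a cell's text and walks a list of keys and indexes. The helper `extract_from_cell` and its `path` argument are hypothetical and use only the standard library; this is not the library's JSONFilter, whose interface is not described in these notes.

```python
import json

def extract_from_cell(cell, path):
    """Parse a cell as JSON and follow a list of keys/indexes; return None on any failure."""
    try:
        value = json.loads(cell)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON
        return None
    for step in path:
        try:
            value = value[step]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or wrong container type
            return None
    return value

# Illustrative cell content (made up for this example)
cell = '{"org": {"name": "UNICEF", "sectors": ["WASH", "Health"]}}'
name = extract_from_cell(cell, ["org", "name"])
first_sector = extract_from_cell(cell, ["org", "sectors", 0])
missing = extract_from_cell("not json", ["org"])
```

Returning None instead of raising keeps one malformed cell from halting the rest of a dataset.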
libhxl-python v4.8.3
Interim release with critical bug fixes and features:
- handle Google Drive "open" and "file" URLs
- normalise whitespace for the count filter (so that "Guinea" and "Guinea " won't count separately)
- fix validation test for trailing whitespace
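
The whitespace fix for the count filter can be pictured with a small standalone sketch. It assumes normalisation means trimming both ends and collapsing internal runs of whitespace; the exact rules the library applies are not stated here, and `normalise` is a hypothetical helper.

```python
from collections import Counter

def normalise(value):
    """Trim the ends and collapse internal whitespace runs to a single space."""
    return " ".join(value.split())

# Illustrative column values with stray whitespace
values = ["Guinea", "Guinea ", " Guinea", "Sierra  Leone", "Sierra Leone"]

raw_counts = Counter(values)                          # every variant counted separately
clean_counts = Counter(normalise(v) for v in values)  # variants merged after normalising
```

Without normalisation the five values produce five separate counts; with it they collapse to two.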
libhxl-python v4.8.2
Hotfixes for release 4.8 (see https://github.com/HXLStandard/libhxl-python/releases/tag/4.8)
- there were some circular dependencies that blocked installation in a clean environment
- the default validation schema wasn't getting picked up in the distribution
libhxl-python v4.8
- module now has a __version__ attribute, per PEP 396
- validation enhancements:
  - major refactor of the hxl.validation module for better testing and maintainability
  - new default validation schema
  - ability to generate a JSON validation report
  - validation schema rules can support multiple tag patterns
  - test for spelling inconsistencies
  - test for numerical outliers
- filter changes:
  - refactor hxl.filters.CacheFilter to preserve source row numbers
  - hxl.filters.RowFilter no longer ignores empty cells
  - when multiple columns match a row query, it will succeed with at
- fix bug in the "is" operator for row queries
- handle more types of Google Sheets URLs
- when times are attached to dates, date parsing will still succeed
- new external module dependency: python-io-wrapper
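
These notes do not say which statistical method the new numerical-outlier test uses. Purely as an assumption, the sketch below flags values more than a chosen number of standard deviations from the mean; `find_outliers` and its `threshold` parameter are hypothetical and not part of hxl.validation.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=3.0):
    """Return the numbers lying more than `threshold` standard deviations from the mean."""
    numbers = [float(v) for v in values]
    mu = mean(numbers)
    sigma = pstdev(numbers)  # population standard deviation
    if sigma == 0:
        # All values identical: nothing can be an outlier
        return []
    return [n for n in numbers if abs(n - mu) / sigma > threshold]

# Cells arrive as text; twenty typical values and one suspicious one
column = ["10"] * 20 + ["1000"]
outliers = find_outliers(column)
```

A test of this kind needs a reasonable number of values: with very few rows, no single value can sit three standard deviations from the mean.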
libhxl-python v4.7.1
- hotfix for a rare bug in date parsing
libhxl-python v4.7
- added wildcard support to tag patterns, so that we can use patterns like "#*" or "#*+f-children"
- removed obsolete Python2 compatibility code
- added source_row_number and source_column_number to support validation
- revamped date handling to support partial dates like "2018-01" or "2018", and also special notation like "2018W05" or "2018Q1"
- row queries support is (not) min and is (not) max, including for numbers and dates
- added min() and max() methods to hxl.model.Dataset
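
To make the partial and special date notations concrete, here is a minimal sketch that recognises the forms quoted above and reports their precision. It is only an illustration using regular expressions; `classify_date` is a hypothetical helper, not the library's date parser, and it does not check that months or weeks fall in a valid range.

```python
import re

# Ordered from most to least specific
PATTERNS = [
    ("day", re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")),
    ("month", re.compile(r"^(\d{4})-(\d{2})$")),
    ("week", re.compile(r"^(\d{4})W(\d{2})$")),
    ("quarter", re.compile(r"^(\d{4})Q([1-4])$")),
    ("year", re.compile(r"^(\d{4})$")),
]

def classify_date(text):
    """Return (precision, numeric parts) for a recognised date string, else None."""
    text = text.strip()
    for precision, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return precision, tuple(int(part) for part in match.groups())
    return None

results = {s: classify_date(s) for s in ["2018-01", "2018", "2018W05", "2018Q1"]}
```

Keeping the precision alongside the parsed parts lets a caller decide how to compare, say, a month against a full date.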
Major enhancements to the validation engine:
- now accepts all parseable date formats
- new #valid_unique constraint (single value or compound key)
- new #valid_correlation constraint (e.g. make sure that #adm1+name is always consistent for any given value #adm1+code)
- new #valid_datatype+consistent test to infer datatypes in a column without explicit rules
- suggests spelling corrections when validating against a list with #valid_value+enum or #valid_value+url
- new #valid_value+whitespace test detects irregular whitespace
For more details, see the CHANGELOG
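
The ideas behind the #valid_unique and #valid_correlation constraints can be sketched in plain Python. The helpers `find_duplicates` and `find_inconsistent` are hypothetical, the rows are made-up sample data, and the validation engine's own implementation may differ.

```python
from collections import defaultdict

def find_duplicates(rows, key_cols):
    """Return compound keys (one value per key column) that occur in more than one row."""
    counts = defaultdict(int)
    for row in rows:
        counts[tuple(row[col] for col in key_cols)] += 1
    return sorted(key for key, n in counts.items() if n > 1)

def find_inconsistent(rows, key_col, value_col):
    """Return keys that appear with more than one distinct value in value_col."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key_col]].add(row[value_col])
    return {key: sorted(vals) for key, vals in seen.items() if len(vals) > 1}

# Made-up sample rows keyed by HXL hashtag
rows = [
    {"#adm1+code": "GN01", "#adm1+name": "Faranah"},
    {"#adm1+code": "GN02", "#adm1+name": "Conakry"},
    {"#adm1+code": "GN01", "#adm1+name": "Farana"},
    {"#adm1+code": "GN02", "#adm1+name": "Conakry"},
]

duplicates = find_duplicates(rows, ["#adm1+code", "#adm1+name"])
inconsistent = find_inconsistent(rows, "#adm1+code", "#adm1+name")
```

Here the uniqueness check reports the repeated GN02 row, while the correlation check reports that GN01 appears with two different names.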
libhxl-python v4.6
Note: Python2 no longer supported.
Core enhancements:
- Added Python logging support (will expand in next release).
- can now open data from a CKAN dataset URL (will look for the first resource)
Filter enhancements:
- clean filter can normalise lat/lon
- column filter can remove all columns without HXL hashtags
- 'patterns' parameter is now optional for JSON-encoded count filter
Bug fixes:
- Google Sheets now open properly from a CKAN resource URL
- Restore support for preserving original attribute order.
libhxl-python v4.5.1
Interim bug-fix release. A badly-formatted date was sending an exception to the top level and halting processing.
libhxl-python v4.5
Enhancements to filters:
- merge-data looks for key matches in all candidate columns
- cut-columns can remove all columns without HXL hashtags
- clean-data allows number formatting strings like 0.2f
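
A formatting string such as 0.2f has the same shape as a Python format specification. The sketch below shows how such a string could be applied to cell values; it assumes the standard `format()` built-in, which may or may not be what clean-data does internally, and `format_number` is a hypothetical helper.

```python
def format_number(value, spec):
    """Format a cell as a number using a format spec; leave non-numeric cells untouched."""
    try:
        return format(float(value), spec)
    except (TypeError, ValueError):
        # Not a number: return the original cell unchanged
        return value

cells = ["3.14159", "1234.5", "n/a"]
formatted = [format_number(cell, "0.2f") for cell in cells]
```

Leaving non-numeric cells unchanged avoids corrupting placeholders such as "n/a" during cleaning.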
Try to recognise JSON data even when it doesn't have a JSON MIME type or .json extension.
Generate JSON lists of objects as described in the HXL 1.1 beta spec
See the CHANGELOG for more details.
libhxl-python v4.4
For the HXL 1.1 JSON formats, support reading a JSON array of objects as well as an array of arrays.
Search recursively for HXL data inside a JSON dataset.
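
One way to picture the recursive search is a depth-first walk that stops at the first list resembling HXL data, in either of the two JSON shapes (array of arrays or array of objects). This is a simplified sketch with hypothetical helpers `looks_like_hxl` and `find_hxl`; in particular, it assumes a hashtag row contains only cells starting with "#", which is stricter than real HXL data requires.

```python
import json

def looks_like_hxl(node):
    """True for a list of arrays with a hashtag row, or a list of objects keyed by hashtags."""
    if not isinstance(node, list) or not node:
        return False
    if all(isinstance(item, list) for item in node):
        # Array of arrays: look for a row made up entirely of hashtags
        return any(
            row and all(isinstance(c, str) and c.startswith("#") for c in row)
            for row in node
        )
    if all(isinstance(item, dict) for item in node):
        # Array of objects: check the first object's keys for a hashtag
        return any(str(key).startswith("#") for key in node[0])
    return False

def find_hxl(node):
    """Depth-first search; return the first HXL-like list found, else None."""
    if looks_like_hxl(node):
        return node
    if isinstance(node, dict):
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        children = []
    for child in children:
        found = find_hxl(child)
        if found is not None:
            return found
    return None

# Made-up document with HXL rows nested two levels down
document = json.loads("""
{"meta": {"source": "example"},
 "result": {"data": [["Org", "Province"], ["#org", "#adm1"], ["UNICEF", "Coast"]]}}
""")
rows = find_hxl(document)
objects = find_hxl({"items": [{"#org": "UNICEF", "#adm1": "Coast"}]})
```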
Improve exception handling for HTTP errors.