Releases: metafacture/metafacture-core
Metafacture Runner Distribution 3.1.0
This release updates the metafacture-core dependency to version 3.1.0. Please see the release notes for metafacture-core for a list of changes.
Metafacture Core 3.1.0
Changed Behaviour
- Fix #216 changes the behaviour of Metamorph: Collectors are now reset on flushWith even if the condition was not met.
Maven Coordinates
Metafacture core is available on Maven Central:
<dependency>
<groupId>org.culturegraph</groupId>
<artifactId>metafacture-core</artifactId>
<version>3.1.0</version>
</dependency>
Changes
The release contains mostly bug fixes but also a couple of new features.
New Features
- #229: Support for marcxml in test cases
- #119: a tar reader (thanks to Pascal Christoph)
- #195: a pica-xml handler and reader (thanks to Pascal Christoph & Fabian Steeg)
- #222: a Unicode normalizer for handling the different Unicode normalization forms in streams
- @FluxCommand annotation (see commit eac0926 for details)
- #197: Support passing a URL to ResourceUtil.getStream(String) (thanks to Fabian Steeg)
- #196: Support optional FilenameFilter in DirReader (thanks to Fabian Steeg)
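To illustrate why a Unicode normalizer (#222) is useful in a stream, here is a minimal sketch using the JDK's java.text.Normalizer. It does not use the Metafacture module itself; the class name NormalizationSketch and its helper method are made up for this example.

```java
import java.text.Normalizer;

// Illustration only: plain JDK code, not the Metafacture normalizer module.
class NormalizationSketch {

    // Returns the input converted to the requested Unicode normalization form.
    static String normalize(String input, Normalizer.Form form) {
        return Normalizer.normalize(input, form);
    }

    public static void main(String[] args) {
        String composed = "\u00e9";     // one precomposed code point
        String decomposed = "e\u0301";  // base letter plus combining acute accent

        // Both strings render the same but differ as Java strings.
        System.out.println(composed.equals(decomposed));

        // Normalizing both to NFC makes them comparable.
        System.out.println(normalize(composed, Normalizer.Form.NFC)
                .equals(normalize(decomposed, Normalizer.Form.NFC)));
    }
}
```

Records from different sources often mix composed and decomposed forms, so normalizing to one form before comparing or indexing values avoids spurious mismatches.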
Bug Fixes
- #219, #227: Improved X-Include handling in Metamorph
- #223: The only attribute of the occurrence filter now accepts lessThan and moreThan in addition to lessThen and moreThen
- #216: Fixed handling of reset in if-conditions in collectors (NOTE: This may change the behaviour of Metamorph scripts)
- #217: Metamorph lost variable maps passed as constructor arguments.
Metafacture Runner Distribution 3.0.0
This release updates the metafacture-core dependency to version 3.0.0. Please see the release notes for metafacture-core for a list of changes.
Metafacture Core 3.0.0
This release is not compatible with the 2.x.x line of metafacture-core.
Changed Behaviour and Interfaces
- Formeta: Only accept allowed escape sequences (commit 162c1f1)
- Internal structure of the Metamorph classes changed:
Maven Coordinates
Metafacture core is available on Maven Central:
<dependency>
<groupId>org.culturegraph</groupId>
<artifactId>metafacture-core</artifactId>
<version>3.0.0</version>
</dependency>
All Changes
Bug Fixes
- Fixed #177: Condition was not reset when collector was reset (commit a20767d)
- Fixed #178: Reset conditions if sameEntity is true (commit ed9824c)
- Fix #179: Output message of wrapped exceptions (commit a55475c)
- Formeta: Escape leading and trailing whitespace (commit 432aca0)
- Fix #192: AbstractTripleSort has memory leak (commit dda2343)
- Improve error message when decoding CSV (commit 3d6fc92)
- Fix #204: High memory usage of Metamorph tests (commit 446d663)
New Features and Improvements
Metamorph
- Allow macros in entity statements
- Added framework for introspecting the data processing within a Metamorph script. See commit 17710f3 for a brief introduction on how to use this feature.
- Refactored the code for building pipelines in Metamorph (commit 3eca5ba)
- Added source file location annotations to Metamorph DOM (commit c16833e)
- Refactored code for loading morph scripts (commit 498730b)
- Added source location infos to the morph pipeline (commit 7e766ad)
- Simplified from interface NamedValueSource (commit efb39d5)
- Add a system for interception to Metamorph (commit 17710f3)
- Fix #205: Exception during Metamorph build (commit 3a509d9)
- Add reverse concatenation to concat in Metamorph (commit 048c6ba)
- Add new attributes era and removeLeadingZeros to DateFormat (commit cac5b94)
- Add sameEntity attribute to concat metamorph statement (commit a5d5023)
Stream Modules
- Logger and Exception catcher:
- Added new module for grouping Pica multiscript fields:
- Added an encoder for Marc21 records. The implementation covers the full ISO 2709:2008 standard. Only the Flux module is specific to Marc21, so additional instances of ISO 2709:2008 can easily be added.
- Add modules for reading and writing event streams from or into POJOs:
Miscellaneous
- Support for formeta as input and output format of testcases
- Adds reader for multiline formeta records
- Allow other data formats for result type than cg-xml:
- Added utility class for method argument checking
Code clean-up/improvements
- Minor change: removed duplicated code in flush method (commit b06b9a3)
- Fixed getStream method to not create the File object twice (commit bbc0d63)
- Fixed coding style (commit 8f3caeb)
- Added new test case for occurrence-function (commit a5df610)
- Small code quality improvements (commit 914b58d)
- Minor code improvements in PicaDecoder (commit e2152f4, commit 72a75ba)
- Improved code formatting and added documentation (commit 7931ba0)
- Removed old merge conflict in commit (commit fce4dbd)
- Improve code style (commit 56a7698)
- Cleanup: Remove dead code from JndiSqlMap (commit 34cffed)
- Relax return-count check (commit 7c1612d)
- Add comments and remove trailing whitespace (commit 97a0d54)
- Improve StringUtils.copyToBuffer (commit 16c3bcb)
- Update junit related PMD rules (commit 016c14d)
- Exclude check for boolean inversion from PMD (commit 973fa51)
Metafacture Core Distribution 1.2.2
This is a bug fix release for the 1.2.x branch of metafacture-core.
Bug fixes
- Fixed #178: Reset conditions if sameEntity is true
If a collector has sameEntity="true" set, it is reset whenever the current entity changes. As described in issue #178, the previous implementation failed to reset the condition during these reset operations. This commit fixes this.
Additionally, the test cases for the if-condition have been moved into a separate XML file, since quite a number of tests have accumulated.
Metafacture Core Distribution 1.2.1
This is a bug fix release for the 1.2.x branch of metafacture-core.
Bug fixes
- Fixed #177: Condition was not reset when collector was reset
If a collector has a reset="true" attribute, one would expect that this also resets the state of an if-condition in the collector. However, this was not the case. This commit fixes this.
Please note that this may change the behaviour of existing scripts if they relied on the condition not being reset with the reset of the collector.
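The behaviour described for #177 can be pictured with a small sketch. CollectorSketch and all of its methods are hypothetical stand-ins written for this example, not Metafacture classes. The sketch only assumes what the note states: a collector holds collected values plus the state of an if-condition, and after the fix a reset clears both.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a collector with an if-condition; not a Metafacture class.
class CollectorSketch {
    private final List<String> values = new ArrayList<>();
    private boolean conditionMet;

    void receive(String value) {
        values.add(value);
    }

    void conditionFired() {
        conditionMet = true;
    }

    // Emits the collected values only if the condition has fired.
    List<String> emit() {
        return conditionMet ? new ArrayList<>(values) : new ArrayList<>();
    }

    // Behaviour after the fix: the reset clears the condition state as well.
    void reset() {
        values.clear();
        conditionMet = false;
    }

    public static void main(String[] args) {
        CollectorSketch collector = new CollectorSketch();
        collector.receive("a");
        collector.conditionFired();
        System.out.println(collector.emit());

        collector.reset();
        collector.receive("b");
        // No condition has fired since the reset, so nothing is emitted.
        System.out.println(collector.emit());
    }
}
```

If reset() cleared only the values, the second emit would still return "b" because the stale condition state would carry over. Scripts that relied on that carry-over are the ones whose behaviour may change.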
Metafacture Runner Distribution 2.0.0
This is the initial release of metafacture-runner. The 2.0.0 version number was chosen to keep the versioning scheme in sync with the metafacture-core package.
Please see the release notes of the metafacture-core package for changes of the code which is now in metafacture-runner but was part of metafacture-core before.
Metafacture Core 2.0.0
This release is not compatible with the 1.x.x line of metafacture-core.
Incompatible changes
- Removed flux executable and runtime dependencies slf4j-log4j and mysql jdbc driver from metafacture-core. The flux command line application is now maintained in the culturegraph/metafacture-runner package (see issues #131, #130 and #168 and commits 41329a7 and ecdafbc).
- Removed eclipse project files from repository (see commit 27c2390)
- Reimplemented PicaDecoder: The records are now properly parsed. The new implementation does not do special processing of subfield "S" like the old class did. Additionally, multi-line pica records are supported (see issues #51, #109, #112, #137 and #139 and commits 3c75b41, 9e736df, 4483e5e, 89119a6, ae5a08a, c0eeb04, ec81279, bd30086, 5c8002e)
- Renamed the configure method in SimpleXmlWriter (now SimpleXmlEncoder) into setNamespaces to reflect what it is actually doing (see issue #99)
- Renamed org.culturegraph.mf.stream.sink.SimpleXmlWriter to org.culturegraph.mf.stream.converter.xml.SimpleXmlEncoder (see issue #100)
- The receiver interfaces no longer extend LifeCycle directly but extend an intermediate Receiver interface (see commit 7065cc0)
New features & improvements
- Updated dependencies to latest version (see commit 3ab7331)
- Modified IdChangePipe to accept nested literals as ids (see commit e81b230)
- Modified the Counter module to allow pipelining (see commit 43c52c3)
- Added pretty printing and configurable character escapes to the JsonEncoder (see commit 8cb7a08)
- Added a dateformat function to Metamorph for converting various date formats (see commit 6b9b7e1)
- Added a Metamorph function for generating timestamps (see commit 82be110)
- Added triple-to-stream module which converts triples into a stream (without collecting them into records as collect-triples does) (see commit 55fc144)
- Improvements to LineSplitter: Added flux-annotations to LineSplitter and added it to flux commands (see commit 34aed80)
- Added StreamExceptionCatcher module which is the stream counterpart of ObjectExceptionCatcher (see commit 59ff596)
Bug fixes
- Generate Flux parser and lexer as part of the build cycle (see commit a7b4d78)
- The flux lexer was failing on files which had an empty comment not followed by a new line as their last line (see issue #147)
- Place OreAggregationAdder and its test in same package (see issue #60)
- Adds the ability to escape the @-character in Metamorph names (see commit 0b470e5)
- Replaced binary or with boolean or in StreamLiteralFormatter (see commit bbe340c)
- Added fallbacks to flux.sh in case realpath is not available (see commit b1e1172)
- Moved the logic for creating a buffer to allow direct access to the characters in a string into the StringUtil class. The code was fixed to always create a buffer that is large enough (see issue #161)
- AbstractTripleSort threw NullPointerExceptions if it received a "memoryLow" message before the first record was processed (see issue #160)
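The fix in StreamLiteralFormatter replaced a binary or with a boolean or. The notes do not say which operands were involved, so the following is only a general Java illustration of the difference, assuming "binary or" refers to the `|` operator. The class OrOperatorSketch and its methods are invented for this example.

```java
// General Java illustration; not code from StreamLiteralFormatter.
class OrOperatorSketch {
    static int calls = 0;

    // Helper that records how often it was evaluated.
    static boolean sideEffect() {
        calls++;
        return true;
    }

    // '|' on booleans always evaluates both operands.
    static boolean nonShortCircuit(boolean left) {
        return left | sideEffect();
    }

    // '||' skips the right operand when the left one is already true.
    static boolean shortCircuit(boolean left) {
        return left || sideEffect();
    }

    public static void main(String[] args) {
        nonShortCircuit(true);
        System.out.println(calls); // right operand was evaluated once

        shortCircuit(true);
        System.out.println(calls); // unchanged: right operand was skipped
    }
}
```

Both operators yield the same truth value. They differ only in whether the right-hand side runs, which matters when it is expensive or has side effects.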
Metafacture Core Distribution 1.2.0
The new release should be fully compatible with release 1.1.0. It contains the following new features and bug fixes:
New features
- Added header, footer and separator settings to ObjectWriter (see issue #154)
- New Collector EqualsFilter (see issue #149)
- Set namespace for rdf output via remote configuration in the same way as Metamorph scripts are set (see issue #145)
- Added RecordReader for reading records from a reader. This module is available as as-records in Flux (see issues #140, #142)
- Add a wrapper for WildcardTrie enabling simple character classes in source statements in Metamorph (see issues #135, #143)
Experimental feature
- Added support for conditional activation of collectors. Additionally, a set of quantifier collectors allows Boolean conditions to be expressed in a clear and easily comprehensible way (see issues #151, #154)
Bug fixes
- Changed JsonEncoder to not prefix output with spaces (see issue #152)
- Partially reverted commit 23c05cf: This commit broke the plugin loading mechanism. This should be fixed again (see issue #148)
- Bugfix for RdfMacroPipe: An empty name parameter in the literal(..)-method resulted in a StringIndexOutOfBoundsException. The parameter is now checked with org.apache.commons.lang.StringUtils, so using an empty name parameter is possible again (see issues #146, #150)
- Fix wrong namespace (dcterm->dcterms) (see issue #136)
Metafacture Core Distribution 1.1.0
metafacture-core-1.1.0 [maven-release-plugin] copy for tag metafacture-core-1.1.0