MeTA v2.3.0
New features
- Forward and inverted indexes are now stored in one directory. To make use of your existing indexes, you will need to move their directories. For example, a configuration that used to look like the following
    dataset = "20newsgroups"
    corpus = "line.toml"
    forward-index = "20news-fwd"
    inverted-index = "20news-inv"
will now look like the following
    dataset = "20newsgroups"
    corpus = "line.toml"
    index = "20news-index"
and your folder structure should now look like
    20news-index
    ├── fwd
    └── inv
You can do this by simply moving the old folders around like so:
    mkdir 20news-index
    mv 20news-fwd 20news-index/fwd
    mv 20news-inv 20news-index/inv
- stats::multinomial can now report the number of unique event types counted (unique_events()).
- std::vector can now be hashed via hash_append.
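To illustrate what hashing a std::vector through hash_append can look like, here is a minimal, self-contained sketch of the general hash_append idiom (in the style of the N3980 proposal). It is a stand-in written for this note, not MeTA's implementation: the namespace sketch, the fnv1a hasher, and the hash_of helper are hypothetical names, and appending the vector's size after its elements is an assumption about one reasonable design rather than something these notes specify.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical namespace: illustrative stand-ins, not MeTA's declarations.
namespace sketch
{

// A simple FNV-1a hasher: consumes raw bytes, yields a 64-bit digest.
class fnv1a
{
  public:
    void operator()(const void* data, std::size_t len)
    {
        auto bytes = static_cast<const unsigned char*>(data);
        for (std::size_t i = 0; i < len; ++i)
        {
            state_ ^= bytes[i];
            state_ *= 1099511628211ULL; // FNV-1a 64-bit prime
        }
    }

    explicit operator std::uint64_t() const
    {
        return state_;
    }

  private:
    std::uint64_t state_ = 14695981039346656037ULL; // FNV offset basis
};

// Integral values: feed their bytes directly to the hasher.
template <class Hasher, class T>
typename std::enable_if<std::is_integral<T>::value>::type
hash_append(Hasher& h, const T& value)
{
    h(&value, sizeof(value));
}

// std::vector: append every element, then the size (an assumption here),
// so that nested vectors with different boundaries feed different bytes.
template <class Hasher, class T, class Alloc>
void hash_append(Hasher& h, const std::vector<T, Alloc>& vec)
{
    for (const auto& elem : vec)
        hash_append(h, elem);
    hash_append(h, vec.size());
}

// Convenience wrapper (hypothetical): hash any supported value in one call.
template <class T>
std::uint64_t hash_of(const T& value)
{
    fnv1a h;
    hash_append(h, value);
    return static_cast<std::uint64_t>(h);
}

} // namespace sketch
```

With these definitions, sketch::hash_of(std::vector<int>{1, 2, 3}) returns the same value for any two vectors holding the same elements in the same order, and the vector overload recurses naturally for nested vectors such as std::vector<std::vector<int>>. The point of the idiom is that a type only describes which bytes make up its state, while the hashing algorithm stays swappable.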
Bug fixes
- Fix rounding bug in language model-based rankers. This bug caused
severely degraded performance for these rankers with short queries. The
unit tests have been improved to prevent such a regression in the
future.
Enhancements
- The bundled ICU version has been bumped to ICU 57.1.
- MeTA will now attempt to build its own version of ICU on Windows if it
fails to find a suitable ICU installed.
- CI support for GCC 6.x was added for all three major platforms.
- CI now uses a fixed version of LLVM/libc++ instead of trunk.
Model File Checksums (sha256)
d29bf8b4cbeef21db087cf8042efe5afe25c7bd3c460997728d58b92c24ec283 beam-search-constituency-parser-4.tar.gz
ce44c7d96a8339ff4b597f35a35534ccf93ab99b7d45cbbdddffe7e362b9c20e crf.tar.gz
2a75ab9750ad2eabfe1b53889b15a31f79bd2315f71c2a4a62f6364586a6042d gigaword-embeddings-50d.tar.gz
40cd87901eb29b69e57e4bca14bc2539d7d6b4ad5c186d6f3b1532a60c5163b0 greedy-constituency-parser.tar.gz
a0a3814c1f82780f1296d600eba260f474420aa2d93f000e390c71a0ddac42d9 greedy-perceptron-tagger.tar.gz