MeTA v2.2.0
New features
- Parallelized versions of PageRank and Personalized PageRank have been
  added. A demo is available in `wiki-page-rank`; see the website for
  more information on obtaining the required data.
- Add a disk-based streaming minimal perfect hash function library. A
  sub-component of this is a small memory-mapped succinct data structure
  library for answering rank/select queries on bit vectors.
- Much of our CMake magic has been moved into a separate project included
  as a submodule: https://github.com/meta-toolkit/meta-cmake, which can
  now be used in other projects to simplify initial build system
  configuration.
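
For readers unfamiliar with rank/select queries on bit vectors, the following is a minimal in-memory sketch of the idea. It is not MeTA's implementation: the class name `rank_select_sketch`, the block size, and the use of `std::vector<bool>` are all assumptions made for illustration, whereas the library described above is memory-mapped and succinct. Here `rank1(i)` counts the set bits in positions `[0, i)` and `select1(k)` returns the position of the k-th set bit (counting from zero).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative block size (an assumption, not a MeTA constant).
constexpr std::size_t sketch_block_size = 64;

// Hypothetical illustration type; the name and layout are not MeTA's.
class rank_select_sketch
{
  public:
    explicit rank_select_sketch(const std::vector<bool>& bits)
        : bits_(bits),
          block_ranks_(bits.size() / sketch_block_size + 1, 0),
          total_(0)
    {
        // block_ranks_[b] = number of set bits before position
        // b * sketch_block_size
        for (std::size_t i = 0; i < bits_.size(); ++i)
        {
            if (i % sketch_block_size == 0)
                block_ranks_[i / sketch_block_size] = total_;
            if (bits_[i])
                ++total_;
        }
        // when the length is a multiple of the block size, the final
        // entry describes the end position and is not set by the loop
        if (bits_.size() % sketch_block_size == 0)
            block_ranks_[bits_.size() / sketch_block_size] = total_;
    }

    // Number of set bits in positions [0, i).
    std::size_t rank1(std::size_t i) const
    {
        assert(i <= bits_.size());
        std::size_t block = i / sketch_block_size;
        std::size_t count = block_ranks_[block];
        for (std::size_t j = block * sketch_block_size; j < i; ++j)
            if (bits_[j])
                ++count;
        return count;
    }

    // Position of the k-th set bit, counting from zero.
    std::size_t select1(std::size_t k) const
    {
        assert(k < total_);
        // last block whose prefix count is <= k
        auto it = std::upper_bound(block_ranks_.begin(),
                                   block_ranks_.end(), k);
        std::size_t block
            = static_cast<std::size_t>(it - block_ranks_.begin()) - 1;
        std::size_t count = block_ranks_[block];
        for (std::size_t j = block * sketch_block_size; j < bits_.size(); ++j)
        {
            if (bits_[j])
            {
                if (count == k)
                    return j;
                ++count;
            }
        }
        return bits_.size(); // not reached when k < total_
    }

  private:
    std::vector<bool> bits_;
    std::vector<std::size_t> block_ranks_;
    std::size_t total_;
};
```

The per-block prefix counts let `rank1` scan at most one block and let `select1` binary-search to the right block before scanning. A real succinct structure would typically replace the scans with popcount operations on packed words.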
Bug fixes
- Fix parameter settings in language model rankers not being range checked
  (issue #134).
- Fix incorrect incoming edge insertion in `directed_graph::add_edge()`.
- Fix `find_first_of` and `find_last_of` in `util::string_view`.
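
The notes above do not say what was wrong with `find_first_of` and `find_last_of`. As a reference point, the sketch below spells out the semantics these functions have on `std::string`, on the assumption that `util::string_view` is meant to match the standard library. The helpers `first_of`, `last_of`, and `self_check` are hypothetical names written for this example, not MeTA code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference helper: index of the first character at or
// after pos that appears anywhere in chars, or npos if there is none.
std::size_t first_of(const std::string& text, const std::string& chars,
                     std::size_t pos = 0)
{
    for (std::size_t i = pos; i < text.size(); ++i)
        if (chars.find(text[i]) != std::string::npos)
            return i;
    return std::string::npos;
}

// Hypothetical reference helper: index of the last character at or
// before pos that appears anywhere in chars, or npos if there is none.
std::size_t last_of(const std::string& text, const std::string& chars,
                    std::size_t pos = std::string::npos)
{
    if (text.empty())
        return std::string::npos;
    std::size_t i = std::min(pos, text.size() - 1);
    for (;;)
    {
        if (chars.find(text[i]) != std::string::npos)
            return i;
        if (i == 0)
            break;
        --i;
    }
    return std::string::npos;
}

// Checks both helpers against std::string for every start position.
void self_check(const std::string& text, const std::string& chars)
{
    for (std::size_t pos = 0; pos <= text.size(); ++pos)
    {
        assert(first_of(text, chars, pos) == text.find_first_of(chars, pos));
        assert(last_of(text, chars, pos) == text.find_last_of(chars, pos));
    }
}
```

Note that both functions match any single character from the set, not the set as a substring, which is a common source of confusion with `find` and `rfind`.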
Enhancements
- `forward_index` now knows how to tokenize a document down to a
  `feature_vector`, provided it was generated with a non-LIBSVM analyzer.
- Allow loading of an existing index where its corpus is no longer
  available.
- Data is no longer shuffled in `batch_train`. Shuffling the data
  causes horrible access patterns in the postings file, so the data
  should instead be shuffled before indexing.
- `util::array_view`s can now be constructed as empty.
- `util::multiway_merge` has been made more generic. You can now specify
  both the comparison function and merging criteria as parameters, which
  default to `operator<` and `operator==`, respectively.
- Simple utility classes `io::mifstream` and `io::mofstream` have been
  added for places where a moveable `ifstream` or `ofstream` is desired
  as a workaround for older standard libraries lacking these move
  constructors.
- The number of indexing threads can be controlled via the configuration
  key `indexer-num-threads` (which defaults to the number of threads on
  the system), and the number of threads allowed to concurrently write to
  disk can be controlled via `indexer-max-writers` (which defaults to 8).
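
To make the `util::multiway_merge` change concrete, here is a rough sketch of a multiway merge that takes the comparison function and the merging criterion as separate parameters. It does not reproduce MeTA's signature: the function name `multiway_merge_sketch`, the in-memory `runs` input, and the explicit `combine` callable are assumptions made so the example is self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical sketch; the name and signature are illustrative only.
// less:         ordering used to pick the next record across runs
// should_merge: decides whether two adjacent records are combined
// combine:      folds the second record into the first
template <class Record, class Compare, class ShouldMerge, class Combine>
std::vector<Record>
multiway_merge_sketch(const std::vector<std::vector<Record>>& runs,
                      Compare less, ShouldMerge should_merge,
                      Combine combine)
{
    // (run index, offset) of the next unread record in a run
    typedef std::pair<std::size_t, std::size_t> cursor;

    auto at = [&](const cursor& c) -> const Record& {
        return runs[c.first][c.second];
    };
    // std::priority_queue is a max-heap, so the comparison is inverted
    // to pop the smallest record first
    auto heap_cmp = [&](const cursor& a, const cursor& b) {
        return less(at(b), at(a));
    };
    std::priority_queue<cursor, std::vector<cursor>, decltype(heap_cmp)>
        heap(heap_cmp);

    for (std::size_t i = 0; i < runs.size(); ++i)
    {
        // precondition: every input run is already sorted under less
        assert(std::is_sorted(runs[i].begin(), runs[i].end(), less));
        if (!runs[i].empty())
            heap.push(cursor(i, 0));
    }

    std::vector<Record> out;
    while (!heap.empty())
    {
        cursor c = heap.top();
        heap.pop();
        const Record& rec = at(c);
        if (!out.empty() && should_merge(out.back(), rec))
            combine(out.back(), rec);
        else
            out.push_back(rec);
        if (c.second + 1 < runs[c.first].size())
            heap.push(cursor(c.first, c.second + 1));
    }
    return out;
}
```

Keeping the two predicates separate allows records to be ordered by one key while being merged on a different, stricter or looser, notion of equality. Passing a less-than comparison and an equality check gives the conventional behavior described in the note above.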
Model File Checksums (sha256)
d29bf8b4cbeef21db087cf8042efe5afe25c7bd3c460997728d58b92c24ec283 beam-search-constituency-parser-4.tar.gz
ce44c7d96a8339ff4b597f35a35534ccf93ab99b7d45cbbdddffe7e362b9c20e crf.tar.gz
2a75ab9750ad2eabfe1b53889b15a31f79bd2315f71c2a4a62f6364586a6042d gigaword-embeddings-50d.tar.gz
40cd87901eb29b69e57e4bca14bc2539d7d6b4ad5c186d6f3b1532a60c5163b0 greedy-constituency-parser.tar.gz
a0a3814c1f82780f1296d600eba260f474420aa2d93f000e390c71a0ddac42d9 greedy-perceptron-tagger.tar.gz