MeTA v2.2.0

skystrife released this 09 Apr 02:04

New features

  • Parallelized versions of PageRank and Personalized PageRank have been
    added. A demo is available in wiki-page-rank; see the website for
    more information on obtaining the required data.
  • A disk-based streaming minimal perfect hash function library has been
    added. A sub-component of this is a small memory-mapped succinct data
    structure library for answering rank/select queries on bit vectors.
  • Much of our CMake magic has been moved into a separate project included
    as a submodule: https://github.com/meta-toolkit/meta-cmake, which can
    now be used in other projects to simplify initial build system
    configuration.
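
To make the rank/select queries mentioned above concrete, here is a minimal in-memory sketch. It is not MeTA's succinct library: the class name rank_select_bits and its members are hypothetical, nothing is memory-mapped, and it keeps one cumulative count per 64-bit word rather than a space-efficient sampled index. rank1(pos) counts the set bits before position pos, and select1(i) returns the position of the i-th set bit (0-indexed).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the API of MeTA's succinct library.
class rank_select_bits
{
  public:
    explicit rank_select_bits(std::vector<std::uint64_t> words)
        : words_(std::move(words)), prefix_(words_.size() + 1, 0)
    {
        // prefix_[i] holds the number of set bits in words_[0..i).
        for (std::size_t i = 0; i < words_.size(); ++i)
            prefix_[i + 1] = prefix_[i] + popcount(words_[i]);
    }

    // Number of set bits in positions [0, pos); requires pos <= 64 * words.
    std::uint64_t rank1(std::size_t pos) const
    {
        std::size_t word = pos / 64;
        std::size_t offset = pos % 64;
        std::uint64_t count = prefix_[word];
        if (offset != 0)
        {
            std::uint64_t mask = (std::uint64_t{1} << offset) - 1;
            count += popcount(words_[word] & mask);
        }
        return count;
    }

    // Position of the i-th set bit (0-indexed); requires i < total set bits.
    std::size_t select1(std::uint64_t i) const
    {
        // Find the word whose cumulative range of ones contains i.
        auto it = std::upper_bound(prefix_.begin(), prefix_.end(), i);
        std::size_t word = static_cast<std::size_t>(it - prefix_.begin()) - 1;
        std::uint64_t remaining = i - prefix_[word];
        for (std::size_t bit = 0; bit < 64; ++bit)
        {
            if ((words_[word] >> bit) & 1)
            {
                if (remaining == 0)
                    return word * 64 + bit;
                --remaining;
            }
        }
        return words_.size() * 64; // only reached if the precondition fails
    }

  private:
    // Portable population count (clears the lowest set bit each step).
    static std::uint64_t popcount(std::uint64_t word)
    {
        std::uint64_t count = 0;
        while (word != 0)
        {
            word &= word - 1;
            ++count;
        }
        return count;
    }

    std::vector<std::uint64_t> words_;
    std::vector<std::uint64_t> prefix_;
};
```

Bit positions here are numbered from the least significant bit of the first word. A production structure would sample the cumulative counts more sparsely to save space; this sketch trades space for simplicity.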

Bug fixes

  • Fix parameter settings in language model rankers not being range checked
    (issue #134).
  • Fix incorrect incoming edge insertion in directed_graph::add_edge().
  • Fix find_first_of and find_last_of in util::string_view.

Enhancements

  • forward_index now knows how to tokenize a document down to a
    feature_vector, provided it was generated with a non-LIBSVM analyzer.
  • Allow loading of an existing index where its corpus is no longer
    available.
  • Data is no longer shuffled in batch_train. Shuffling the data
    causes horrible access patterns in the postings file, so the data
    should instead be shuffled before indexing.
  • util::array_views can now be constructed as empty.
  • util::multiway_merge has been made more generic. You can now specify
    both the comparison function and merging criteria as parameters, which
    default to operator< and operator==, respectively.
  • Simple utility classes io::mifstream and io::mofstream have been
    added for places where a moveable ifstream or ofstream is desired,
    as a workaround for older standard libraries lacking these move
    constructors.
  • The number of indexing threads can be controlled via the configuration
    key indexer-num-threads (which defaults to the number of threads on
    the system), and the number of threads allowed to concurrently write to
    disk can be controlled via indexer-max-writers (which defaults to 8).
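
The idea behind the more generic util::multiway_merge can be sketched as follows. This is not MeTA's implementation or signature: the function merge_sorted_runs, its parameter order, and the separate combine callback are all assumptions made for illustration. It merges several sorted runs with a min-heap, uses one callable to order records and a second to decide when two adjacent records should be merged, and defaults these to operator< and operator== via std::less and std::equal_to.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical sketch; not the signature of util::multiway_merge.
// Each run must already be sorted according to `less`.
template <class Record, class Combine, class Less = std::less<Record>,
          class ShouldMerge = std::equal_to<Record>>
std::vector<Record>
merge_sorted_runs(const std::vector<std::vector<Record>>& runs,
                  Combine combine, Less less = Less{},
                  ShouldMerge should_merge = ShouldMerge{})
{
    // A cursor is (run index, position within that run).
    using cursor = std::pair<std::size_t, std::size_t>;

    // priority_queue is a max-heap, so invert the comparison to pop the
    // smallest record first.
    auto later = [&](const cursor& a, const cursor& b) {
        return less(runs[b.first][b.second], runs[a.first][a.second]);
    };
    std::priority_queue<cursor, std::vector<cursor>, decltype(later)> heap(
        later);

    for (std::size_t i = 0; i < runs.size(); ++i)
        if (!runs[i].empty())
            heap.push({i, 0});

    std::vector<Record> out;
    while (!heap.empty())
    {
        cursor cur = heap.top();
        heap.pop();
        const Record& rec = runs[cur.first][cur.second];

        // Fold the record into the previous output if the merging
        // criterion holds; otherwise start a new output record.
        if (!out.empty() && should_merge(out.back(), rec))
            combine(out.back(), rec);
        else
            out.push_back(rec);

        if (cur.second + 1 < runs[cur.first].size())
            heap.push({cur.first, cur.second + 1});
    }
    return out;
}
```

For example, with records of the form (term id, count), passing a less that compares term ids, a should_merge that tests term id equality, and a combine that adds counts would collapse the runs into one sorted list with a single summed entry per term id.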

Model File Checksums (sha256)

d29bf8b4cbeef21db087cf8042efe5afe25c7bd3c460997728d58b92c24ec283  beam-search-constituency-parser-4.tar.gz
ce44c7d96a8339ff4b597f35a35534ccf93ab99b7d45cbbdddffe7e362b9c20e  crf.tar.gz
2a75ab9750ad2eabfe1b53889b15a31f79bd2315f71c2a4a62f6364586a6042d  gigaword-embeddings-50d.tar.gz
40cd87901eb29b69e57e4bca14bc2539d7d6b4ad5c186d6f3b1532a60c5163b0  greedy-constituency-parser.tar.gz
a0a3814c1f82780f1296d600eba260f474420aa2d93f000e390c71a0ddac42d9  greedy-perceptron-tagger.tar.gz