Add the GloVe algorithm for training word embeddings and a library class
word_embeddings for loading and querying trained embeddings. To facilitate
returning word embeddings, a simple util::array_view class was added.
Add a simple vector math library (and move fastapprox into the math
namespace).
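As a rough illustration of why a view type helps when returning embeddings, the sketch below shows a non-owning (pointer, length) view over one row of a flat embedding matrix. The names array_view_sketch and embedding_row are hypothetical and are not the actual util::array_view interface; the row-major matrix layout is also an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a util::array_view-style class: a non-owning
// (pointer, length) pair over contiguous storage. Not the library's code.
template <class T>
class array_view_sketch
{
  public:
    array_view_sketch(const T* start, std::size_t len)
        : start_{start}, len_{len}
    {
    }

    // Bounds are only checked in debug builds.
    const T& operator[](std::size_t idx) const
    {
        assert(idx < len_);
        return start_[idx];
    }

    const T* begin() const { return start_; }
    const T* end() const { return start_ + len_; }
    std::size_t size() const { return len_; }

  private:
    const T* start_;
    std::size_t len_;
};

// Returns a view of row `row` of a flat, row-major matrix with `dim`
// columns. No elements are copied; the view is valid only while `matrix`
// is alive and not reallocated.
inline array_view_sketch<double>
embedding_row(const std::vector<double>& matrix, std::size_t dim,
              std::size_t row)
{
    assert((row + 1) * dim <= matrix.size());
    return {matrix.data() + row * dim, dim};
}
```

The caller gets cheap access to a single embedding without a per-query vector allocation, at the cost of having to keep the underlying storage alive.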
Bug fixes
Fix probe_map::extract() for the inline_key_value_storage type; the old
implementation forgot to delete all sentinel values before returning the
vector.
Fix incorrect definition of l1norm() in sgd_model.
Fix gmap calculation where an average precision of 0 was ignored.
Fix progress output in multiway_merge.
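For reference, two of the fixes above concern short formulas. The sketch below gives the textbook definitions: the L1 norm as the sum of absolute values, and GMAP as the geometric mean of per-query average precision. The function names l1_norm and geometric_map are hypothetical, and returning 0 as soon as any query has zero average precision is an assumption that follows the plain mathematical definition; some evaluation tools substitute a small epsilon instead.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// L1 norm: the sum of the absolute values of the components.
double l1_norm(const std::vector<double>& weights)
{
    double total = 0.0;
    for (double w : weights)
        total += std::abs(w);
    return total;
}

// Geometric mean of per-query average precision, computed in log space.
// Assumption: a zero AP is not skipped; it forces the whole product, and
// therefore the result, to 0.
double geometric_map(const std::vector<double>& avg_precisions)
{
    if (avg_precisions.empty())
        return 0.0;

    double log_sum = 0.0;
    for (double ap : avg_precisions)
    {
        assert(ap >= 0.0); // average precision is never negative
        if (ap == 0.0)
            return 0.0;
        log_sum += std::log(ap);
    }
    return std::exp(log_sum / static_cast<double>(avg_precisions.size()));
}
```

Skipping zero-valued queries, as the old behavior is described, inflates the score because the hardest queries simply drop out of the mean.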
Enhancements
Improve performance of printing::progress. Before, progress::operator() in
tight loops could dramatically hurt performance, particularly due to frequent
calls to std::chrono::steady_clock::now(). Now, progress::operator()
simply sets an atomic iteration counter and a background thread periodically
wakes to update the progress output.
Allow full text storage in index as metadata field. If store-full-text = true (default false) in the corpus config, the string metadata field
"content" will be added. This is to simplify the creation of full text
metadata: the user doesn't have to duplicate their dataset in metadata.dat,
and metadata.dat will still be somewhat human-readable without large strings
of full text added.
Allow make_index to take a user-supplied corpus object.
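The printing::progress change described above follows a common pattern: the hot loop only stores into an atomic counter, and a separate thread wakes periodically to do the expensive output. The sketch below illustrates that pattern under stated assumptions; progress_sketch, its constructor parameters, and the output format are hypothetical and do not reproduce the real class.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <thread>

// Hypothetical illustration of the atomic-counter-plus-reporter pattern.
class progress_sketch
{
  public:
    progress_sketch(std::uint64_t total, std::chrono::milliseconds interval)
        : total_{total},
          interval_{interval},
          iter_{0},
          done_{false},
          reporter_{[this]() { report_loop(); }}
    {
        assert(interval.count() > 0);
    }

    // Signals the reporter thread to stop and waits for it to finish.
    ~progress_sketch()
    {
        {
            std::lock_guard<std::mutex> lock{mutex_};
            done_ = true;
        }
        cond_.notify_one();
        reporter_.join();
    }

    // Hot path: one relaxed atomic store, no clock reads and no I/O.
    void operator()(std::uint64_t iter)
    {
        iter_.store(iter, std::memory_order_relaxed);
    }

    std::uint64_t current() const
    {
        return iter_.load(std::memory_order_relaxed);
    }

  private:
    // Runs on the background thread: sleeps for `interval_` (or until
    // shutdown is requested), then prints the latest counter value.
    void report_loop()
    {
        std::unique_lock<std::mutex> lock{mutex_};
        while (!done_)
        {
            cond_.wait_for(lock, interval_, [this]() { return done_; });
            std::cerr << '\r' << current() << '/' << total_ << std::flush;
        }
        std::cerr << '\n';
    }

    std::uint64_t total_;
    std::chrono::milliseconds interval_;
    std::atomic<std::uint64_t> iter_;
    bool done_; // guarded by mutex_
    std::mutex mutex_;
    std::condition_variable cond_;
    std::thread reporter_; // declared last so it starts after the others
};
```

The displayed value can lag the true iteration count by up to one interval, which is usually an acceptable trade for removing clock calls from the inner loop.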
Miscellaneous
ZLIB is now a required dependency.
Switch to using the standalone ./unit-test executable instead of ctest. With
the new unit test framework, CTest no longer offers us many advantages, so we
just run our own unit test executable.