Tudocomp Datasets

A collection of scripts and sources for the generation and gathering of a comprehensive text corpus.

Standalone usage

TODO: Running tests

Using as external dependency

TODO: use case of copying, or using as a submodule

Dependencies

The CMake build process first looks for each external dependency on the system. Any dependency it cannot find is automatically downloaded and built from its official repository, so installing the dependencies beforehand is optional.
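The find-or-download behaviour can be illustrated with a generic CMake sketch. This is not taken from this repository's build files, which may use a different mechanism such as `ExternalProject`. The package name `ExampleDep`, the repository URL, and the tag are hypothetical placeholders.

```cmake
cmake_minimum_required(VERSION 3.14)
project(find_or_fetch_sketch LANGUAGES CXX)

include(FetchContent)

# Step 1: look for an installed copy on the system.
# QUIET suppresses the message CMake would print if the package is missing.
# "ExampleDep" is a hypothetical name, not one of this repository's dependencies.
find_package(ExampleDep QUIET)

if(NOT ExampleDep_FOUND)
  # Step 2: fall back to downloading and building the dependency from source.
  # The URL and tag below are placeholders for illustration only.
  FetchContent_Declare(
    exampledep
    GIT_REPOSITORY https://example.com/exampledep.git
    GIT_TAG        v1.0.0
  )
  FetchContent_MakeAvailable(exampledep)
endif()
```

With a pattern like this, a system-wide installation is used when one is present, and the build still succeeds on a machine where the dependency was never installed.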

The external dependencies are the following:

License

The code in this repository is published under the Apache License 2.0.
