rdftools

rdftools is a Python wrapper around a number of RDF-related tools:

RDF parsers / serializers
VoID utilities
LUBM generator
etc.

Important Notes

This software is the product of research carried out at the University of Zurich and comes with no warranty whatsoever. Have fun!

TODO's

The project is not documented (yet)

How to Compile/Install the Project

Ensure that libraptor2 v2.0.13+ and cityhash are installed on your system (either using the package manager of the OS or compiled from source).
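
Before installing, it can help to check whether the two native libraries are visible on the system. The snippet below is a minimal sketch using only the Python standard library. The short names "raptor2" and "cityhash" are assumptions about how the shared libraries are named on your platform, and the check does not verify the libraptor2 version.

```python
from ctypes.util import find_library


def missing_native_libs(names=("raptor2", "cityhash")):
    """Return the library names that the platform's library lookup cannot locate."""
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    missing = missing_native_libs()
    if missing:
        print("not found:", ", ".join(missing))
    else:
        print("libraptor2 and cityhash were both located")
```

A library reported as not found may still be installed in a non-standard location, so treat the result as a hint rather than a definitive answer.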

To install rdftools you have two options: 1) manual installation (install the requirements first) or 2) automatic installation with pip.

Manual installation:

$ git clone https://github.com/cosminbasca/rdftools
$ cd rdftools
$ python setup.py install

Install the project with pip:

$ pip install git+https://github.com/cosminbasca/rdftools

Also have a look at the build.sh, clean.sh and test.sh scripts included in the codebase.

To include the latest JVM RDF tools, update to the latest version of jvmrdftools and create an assembly:

$ sbt compile assembly

Copy the resulting jar from the target folder to the lib folder inside the rdftools.tools.jvmrdftools module, then reinstall the Python package.

The tools

To find out what a tool does, supply the --help command line argument to it.

Available tools:

rdfconvert, convert RDF files from source format to a destination format using the libraptor2 C RDF parser

usage: rdfconvert [-h] [--clear] [--dst_format DST_FORMAT]
                  [--buffer_size BUFFER_SIZE] [--version]
                  SOURCE

rdftools v0.9.2, rdf converter, based on libraptor2

positional arguments:
  SOURCE                the source file or location (of files) to be converted

optional arguments:
  -h, --help            show this help message and exit
  --clear               clear the original files (delete) - this action is
                        permanent, use with caution!
  --dst_format DST_FORMAT
                        the destination format to convert to. Supported
                        parsers: ['rdfxml', 'ntriples', 'turtle', 'trig',
                        'guess', 'rss-tag-soup', 'rdfa', 'nquads', 'grddl'].
                        Supported serializers ['rdfxml', 'rdfxml-abbrev',
                        'turtle', 'ntriples', 'rss-1.0', 'dot', 'html',
                        'json', 'atom', 'nquads'].
  --buffer_size BUFFER_SIZE
                        the buffer size in Mb of the input buffer (the parser
                        will only parse XX Mb at a time)
  --version             the current version
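
When rdfconvert is driven from a script, the options listed in the help text above can be assembled programmatically. The following is a minimal sketch: the helper name rdfconvert_command, its default of "ntriples", and the path "data/" are illustrative assumptions, not part of rdftools.

```python
import shutil
import subprocess


def rdfconvert_command(source, dst_format="ntriples", buffer_size=None, clear=False):
    """Build the argument list for rdfconvert from the options in its help text."""
    cmd = ["rdfconvert", "--dst_format", dst_format]
    if buffer_size is not None:
        # --buffer_size is given in Mb according to the help text
        cmd += ["--buffer_size", str(buffer_size)]
    if clear:
        # --clear deletes the original files permanently, so it is opt-in here
        cmd.append("--clear")
    cmd.append(source)
    return cmd


if __name__ == "__main__":
    cmd = rdfconvert_command("data/", dst_format="turtle", buffer_size=64)
    if shutil.which(cmd[0]):
        subprocess.run(cmd, check=True)
    else:
        print("rdfconvert is not on PATH; would run:", " ".join(cmd))
```

The helper always passes --dst_format explicitly because the help text does not state the tool's own default.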

rdfconvert2, convert RDF files from source format to a destination format using the rdf2rdf Java RDF parser

usage: rdfconvert2 [-h] [--clear] [--dst_format DST_FORMAT]
                   [--workers WORKERS] [--version]
                   SOURCE

rdftools v0.9.2, rdf converter (2), makes use of rdf2rdf bundled - requires
java

positional arguments:
  SOURCE                the source file or location (of files) to be converted

optional arguments:
  -h, --help            show this help message and exit
  --clear               clear the original files (delete) - this action is
                        permanent, use with caution!
  --dst_format DST_FORMAT
                        the destination format to convert to
  --workers WORKERS     the number of workers (default -1 : all cpus)
  --version             the current version

rdfencode, encode an ntriples file to a binary format (each S, P, O string is hashed with cityhash 64 bit)

usage: rdfencode [-h] [--version] SOURCE

rdftools v0.9.2, encode the RDF file(s)

positional arguments:
  SOURCE      the source file or location (of files) to be encoded

optional arguments:
  -h, --help  show this help message and exit
  --version   the current version
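
To illustrate the idea of hashing each S, P and O string to a 64-bit value, here is a rough sketch. It makes several assumptions: BLAKE2b truncated to 8 bytes stands in for CityHash (which is not in the Python standard library), the N-Triples splitting is naive, and the 24-byte little-endian record layout is hypothetical. The actual on-disk format written by rdfencode is not described here and may differ.

```python
import hashlib
import struct


def hash64(term):
    """Stand-in 64-bit hash (BLAKE2b truncated to 8 bytes), NOT CityHash."""
    digest = hashlib.blake2b(term.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def split_ntriple(line):
    """Naively split one N-Triples statement into its S, P, O strings."""
    body = line.strip()
    if not body.endswith("."):
        raise ValueError("not a complete N-Triples statement")
    body = body[:-1].rstrip()
    # Subject and predicate contain no whitespace; the object keeps the rest.
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


def encode_triple(line):
    """Pack the three term hashes as 24 bytes (hypothetical record layout)."""
    return struct.pack("<3Q", *(hash64(t) for t in split_ntriple(line)))
```

Fixed-width records like this allow a triple to be located by offset, which is one plausible motivation for encoding terms as hashes.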

genlubm, generate a LUBM dataset (in parallel)

usage: genlubm [-h] [--univ UNIV] [--index INDEX] [--seed SEED]
               [--ontology ONTOLOGY] [--workers WORKERS] [--version]
               OUTPUT

rdftools v0.9.2, lubm dataset generator wrapper (bundled) - requires java

positional arguments:
  OUTPUT               the location in which to save the generated
                       distributions

optional arguments:
  -h, --help           show this help message and exit
  --univ UNIV          number of universities to generate
  --index INDEX        start university
  --seed SEED          the seed
  --ontology ONTOLOGY  the lubm ontology
  --workers WORKERS    the number of workers (default -1 : all cpus)
  --version            the current version

genlubmdistro, generate a LUBM dataset (in parallel) and mix the universities to N sites with the specified distribution

usage: genlubmdistro [-h] [--distro DISTRO] [--univ UNIV] [--index INDEX]
                     [--seed SEED] [--ontology ONTOLOGY] [--pdist PDIST]
                     [--sites SITES] [--clean] [--workers WORKERS] [--version]
                     OUTPUT

rdftools v0.9.4, lubm dataset generator wrapper (bundled) - requires java

positional arguments:
  OUTPUT               the location in which to save the generated
                       distributions

optional arguments:
  -h, --help           show this help message and exit
  --distro DISTRO      the distibution to use, valid values are ['seedprop',
                       'uni2many', 'horizontal', 'uni2one']
  --univ UNIV          number of universities to generate
  --index INDEX        start university
  --seed SEED          the seed
  --ontology ONTOLOGY  the lubm ontology
  --pdist PDIST        the probabilities used for the uni2many distribution,
                       valid choices are ['3S', '7S', '5S'] or file with
                       probabilities split by line
  --sites SITES        the number of sites
  --clean              delete the generated universities
  --workers WORKERS    the number of workers (default -1 : all cpus)
  --version            the current version
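
The --distro values listed in the help text above can be validated before the generator is launched. This is a minimal sketch: the helper name genlubmdistro_command and its parameter defaults are illustrative assumptions, while the flags and the valid distribution names are taken from the help output.

```python
VALID_DISTROS = ("seedprop", "uni2many", "horizontal", "uni2one")


def genlubmdistro_command(output, distro, univ, sites, pdist=None, workers=None, clean=False):
    """Build a genlubmdistro argument list, rejecting unknown distributions early."""
    if distro not in VALID_DISTROS:
        raise ValueError(f"unknown distro {distro!r}; expected one of {VALID_DISTROS}")
    cmd = ["genlubmdistro", "--distro", distro, "--univ", str(univ), "--sites", str(sites)]
    if pdist is not None:
        # per the help text, --pdist applies to the uni2many distribution
        cmd += ["--pdist", pdist]
    if workers is not None:
        cmd += ["--workers", str(workers)]
    if clean:
        cmd.append("--clean")
    cmd.append(output)
    return cmd
```

The resulting list can be passed to subprocess.run once the tool is installed and Java is available.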

genvoid, generate VoID statistics from the source file

usage: genvoid [-h] [--version] SOURCE

rdftools v0.9.2, generate void statistics for RDF source file

positional arguments:
  SOURCE      the source file to be analized

optional arguments:
  -h, --help  show this help message and exit
  --version   the current version
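
As a rough illustration of what VoID statistics look like, the sketch below counts a few quantities that the VoID vocabulary defines (void:triples, void:distinctSubjects, void:properties, void:distinctObjects). It is an assumption that these are among the statistics genvoid reports; the function name void_counts is hypothetical and this is not the tool's implementation.

```python
def void_counts(triples):
    """Compute basic VoID-style counts from an iterable of (s, p, o) tuples."""
    subjects, predicates, objects = set(), set(), set()
    total = 0
    for s, p, o in triples:
        total += 1
        subjects.add(s)
        predicates.add(p)
        objects.add(o)
    return {
        "void:triples": total,
        "void:distinctSubjects": len(subjects),
        "void:properties": len(predicates),
        "void:distinctObjects": len(objects),
    }
```

The sets hold every distinct term in memory, so this approach only suits datasets that fit comfortably in RAM.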

genvoid2, generate VoID statistics from the RDF source file, using the nxparser VoID exporter

usage: genvoid2 [-h] [--dataset_id DATASET_ID] [--use_nx] [--version] SOURCE

rdftools v0.9.2, generate a VoiD descriptor using the nxparser java package

positional arguments:
  SOURCE                the source file to be analized

optional arguments:
  -h, --help            show this help message and exit
  --dataset_id DATASET_ID
                        dataset id
  --use_nx              if true (default false) use the nx parser builtin void
                        generator
  --version             the current version

ntround, round all numeric literals (typed or untyped) in an ntriples file with the given precision

usage: ntround [-h] [--prefix PREFIX] [--precision PRECISION] [--version] PATH

rdftools v0.9.2, rounds ntriple files in a folder, (rounds the floating point literals)

positional arguments:
  PATH                  location of the indexes

optional arguments:
  -h, --help            show this help message and exit
  --prefix PREFIX       the prefix used for files that are transformed, cannot
                        be the enpty string!
  --precision PRECISION
                        the precision to round to, if 0, floating point
                        numbers are rounded to long
  --version             the current version
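
The effect of rounding can be pictured with a small sketch that rewrites decimal literals in a single ntriples line. It assumes literals written as plain decimals such as "3.14159" (no exponent notation) and leaves any datatype suffix untouched; the function name round_literals is hypothetical and ntround itself may behave differently.

```python
import re

# A quoted lexical form that looks like a decimal number, e.g. "3.14159".
_FLOAT_LITERAL = re.compile(r'"(-?\d+\.\d+)"')


def round_literals(line, precision=2):
    """Round every quoted decimal literal in one N-Triples line."""
    def _round(match):
        value = float(match.group(1))
        if precision == 0:
            # precision 0 turns the literal into an integer
            return f'"{round(value)}"'
        return f'"{value:.{precision}f}"'
    return _FLOAT_LITERAL.sub(_round, line)
```

Literals that do not look like decimals, for example "hello", pass through unchanged.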

Thanks a lot to

University of Zurich and the Swiss National Science Foundation for generously funding the research that led to this software.
