Add SPARQL backend #31

Open
nichtich opened this issue Mar 2, 2023 · 3 comments
Labels
backend Requires work on the backend

Comments

@nichtich
Member

nichtich commented Mar 2, 2023

Add a SPARQL backend in addition to SQLite and PostgreSQL (#26). Loading the N-Triples dump into Fuseki takes considerably more time, and queries might be slower as well, but if query performance is acceptable, a SPARQL backend may support more flexible kinds of queries, such as transitive inclusion of narrower concepts.
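
Transitive inclusion could be expressed with a SPARQL 1.1 property path. The following is only a sketch: it assumes the triple store also contains `skos:broader` links between concepts, and the function name `occurrence_query` and its parameters are made up for illustration, not taken from the code base.

```python
def occurrence_query(concept: str,
                     predicate: str = "http://purl.org/dc/terms/subject",
                     transitive: bool = False) -> str:
    """Build a SPARQL query that counts records indexed with a concept.

    Hypothetical helper. With transitive=True, records indexed with any
    narrower concept are counted as well.
    """
    lines = [
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
        "SELECT (COUNT(DISTINCT ?doc) AS ?c) {",
    ]
    if transitive:
        # skos:broader* is a property path of zero or more steps, so
        # ?concept matches the concept itself and all of its descendants
        # (assumes skos:broader triples are loaded in the store).
        lines.append(f"  ?concept skos:broader* <{concept}> .")
        lines.append(f"  ?doc <{predicate}> ?concept .")
    else:
        lines.append(f"  ?doc <{predicate}> <{concept}> .")
    lines.append("}")
    return "\n".join(lines)


print(occurrence_query("http://d-nb.info/gnd/4062901-6", transitive=True))
```

Whether such a path query is fast enough on a full dump would have to be measured.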

@nichtich nichtich added the backend Requires work on the backend label Jul 18, 2023
@nichtich
Member Author

nichtich commented Aug 8, 2023

See https://labs.onb.ac.at/de/tool/sparql/ for a public, read-only SPARQL endpoint to experiment with. It covers ANNO (historical newspapers) and AKON (historical postcards), e.g.

Occurrence:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT (COUNT(?doc) AS ?c) {
  ?doc dc:subject <http://d-nb.info/gnd/4062901-6> .
}

Co-Occurrence: none (each title seems to have only one of 42 subjects?):

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?subject (COUNT(DISTINCT ?doc) AS ?count) {
  ?doc dc:subject ?subject .
  FILTER(isIRI(?subject))
} GROUP BY ?subject ORDER BY DESC(?count)
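
The query above counts titles per subject. For a given concept, a co-occurrence query would join two subject patterns on the same document. A minimal sketch, not run against the ONB endpoint; `cooccurrence_query` is a hypothetical name:

```python
def cooccurrence_query(concept: str,
                       predicate: str = "http://purl.org/dc/elements/1.1/subject") -> str:
    """Build a SPARQL query listing subjects that co-occur with a concept.

    Hypothetical helper. The default predicate is the dc:subject URI
    used by the endpoint mentioned above.
    """
    return "\n".join([
        "SELECT ?other (COUNT(DISTINCT ?doc) AS ?count) {",
        # Documents indexed with the given concept ...
        f"  ?doc <{predicate}> <{concept}> .",
        # ... and every other subject on the same document.
        f"  ?doc <{predicate}> ?other .",
        f"  FILTER(isIRI(?other) && ?other != <{concept}>)",
        "} GROUP BY ?other ORDER BY DESC(?count)",
    ])


print(cooccurrence_query("http://d-nb.info/gnd/4062901-6"))
```

If each title really has only one subject, this query returns no rows.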

@nichtich
Member Author

Unfortunately, https://labs.onb.ac.at/de/tool/sparql/ does not include skos:inScheme, and it uses http://purl.org/dc/elements/1.1/subject instead of http://purl.org/dc/terms/subject. The latter could be made configurable, though.

@nichtich
Member Author

Things to further adjust:

  • Configure subject predicate (dc:subject vs dct:subject)
  • Record URIs are hard-coded to http://uri.gbv.de/document/opac-de-627:ppn:$; arbitrary record URIs should be allowed as well
  • Partial and full import have not been implemented yet
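
The first two points could be covered by a small configuration object. This is a sketch under assumptions: the class name `SparqlBackendConfig` and its fields are hypothetical, and `$` is taken to be the placeholder for the record identifier, as in the hard-coded URI above.

```python
from dataclasses import dataclass


@dataclass
class SparqlBackendConfig:
    """Hypothetical settings for a SPARQL backend."""
    # Predicate linking a record to a concept (dct:subject by default,
    # could be set to http://purl.org/dc/elements/1.1/subject instead).
    subject: str = "http://purl.org/dc/terms/subject"
    # Template for record URIs, "$" is replaced by the record identifier.
    record_uri_template: str = "http://uri.gbv.de/document/opac-de-627:ppn:$"

    def record_uri(self, record_id: str) -> str:
        # Identifiers that already are URIs pass through unchanged,
        # which allows arbitrary record URIs.
        if record_id.startswith(("http://", "https://")):
            return record_id
        return self.record_uri_template.replace("$", record_id)


config = SparqlBackendConfig(subject="http://purl.org/dc/elements/1.1/subject")
print(config.record_uri("123456789"))
```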
