SPARQL Anything showcase: open data from the Tate Gallery

This showcase provides examples of using SPARQL Anything to query open data from the Tate Gallery collection. The Tate Gallery repository is included as a Git submodule in the folder collection.
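
Because the data lives in a submodule, a fresh clone has an empty collection folder until the submodule is initialised. A minimal sketch, assuming the submodule path is collection (as stated above) and that the presence of collection/artist_data.csv is a reasonable marker for a populated checkout; the helper name fetch_collection is hypothetical:

```shell
# Hypothetical helper: populate the `collection` submodule if it is still empty.
fetch_collection() {
  if [ -f collection/artist_data.csv ]; then
    # Marker file found: nothing to do.
    echo "collection already present"
  else
    # Standard Git command; the path `collection` is taken from the text above.
    git submodule update --init collection
  fi
}
```

Calling fetch_collection from the repository root before running any query should be enough to make the input files available.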

In what follows, fx is a placeholder for java -jar sparql-anything-<version>.jar. See the SPARQL Anything usage documentation for details on Java options such as enabling logging.
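
One way to make fx usable literally in a shell session is to define it as a function. A minimal sketch; the SPARQL_ANYTHING_JAR variable name is an assumption of this example and not part of SPARQL Anything, and the default value keeps the `<version>` placeholder, which has to be replaced with the jar you actually downloaded:

```shell
# Assumption: SPARQL_ANYTHING_JAR points at the downloaded jar. The variable
# name is specific to this sketch; <version> is left as a placeholder.
: "${SPARQL_ANYTHING_JAR:=sparql-anything-<version>.jar}"

# Forward all arguments unchanged to the SPARQL Anything command line.
fx() {
  java -jar "$SPARQL_ANYTHING_JAR" "$@"
}
```

With this definition in place, the commands in the rest of this document can be pasted as they are.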

Artists as Schema.org

The query generates a Schema.org description of artists from the CSV file.

Title Artists as Schema.org
Query queries/artists.sparql
Input collection/artist_data.csv
Output artists.ttl
Type CONSTRUCT
Options csv.headers=true
Formats CSV
Level Novice

Usage:

fx -q queries/artists.sparql -f TTL -o artists.ttl

Output excerpt from artists.ttl:

@prefix schema: <http://schema.org/> .
@prefix fx:    <http://sparql.xyz/facade-x/ns/> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xyz:   <http://sparql.xyz/facade-x/data/> .
@prefix tate:  <http://sparql.xyz/example/tate/> .
@prefix tsub:  <http://sparql.xyz/example/tate/subject/> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

tate:artist-1218  a        schema:Person ;
        rdfs:label         "Greiffenhagen, Maurice" ;
        schema:birthDate   "1862" ;
        schema:birthPlace  "London, United Kingdom" ;
        schema:deathDate   "1862" ;
        schema:deathPlace  "London, United Kingdom" ;
        schema:gender      "Male" ;
        schema:url         "http://www.tate.org.uk/art/artists/maurice-greiffenhagen-1218" .

tate:artist-10241  a       schema:Person ;
        rdfs:label         "Apóstol, Alexander" ;
        schema:birthDate   "1969" ;
        schema:birthPlace  "Barquisimeto, Venezuela" ;
        schema:deathDate   "1969" ;
        schema:deathPlace  "" ;
        schema:gender      "Male" ;
        schema:url         "http://www.tate.org.uk/art/artists/alexander-apostol-10241" .
...

Extract artworks and subjects

This query combines two x-sparql-anything transformations. The first iterates over the CSV file artworks_data.csv. For each artwork, the Tate Gallery open data includes a JSON file with further details, including a list of annotated subjects. The query finds the related JSON file and queries it to retrieve the subjects. The output is projected into a knowledge graph (KG) of artworks and subjects via a CONSTRUCT query.

Title Extract artworks and subjects
Query queries/arts-and-subjects.sparql
Input collection/artworks_data.csv, collection/artworks/
Output arts-and-subjects.ttl
Type CONSTRUCT
Options csv.headers=true
Formats CSV, JSON
Level Novice

Run the example as follows:

fx -q queries/arts-and-subjects.sparql -f TTL -o arts-and-subjects.ttl

Build a SKOS taxonomy of subjects

This example explores all the JSON files of the open data collection and generates a unified SKOS taxonomy of all artwork subject annotations of the Tate Gallery open dataset.

Title Build a SKOS taxonomy of subjects
Query queries/subjects-as-skos.sparql
Input collection/artworks_data.csv, collection/artworks/
Output subjects.ttl
Type CONSTRUCT
Options csv.headers=true
Formats CSV, JSON
Level Novice

Run the query as follows:

fx -q queries/subjects-as-skos.sparql -f TTL -o subjects.ttl

Output (excerpt, see subjects.ttl):

@prefix fx:    <http://sparql.xyz/facade-x/ns/> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xyz:   <http://sparql.xyz/facade-x/data/> .
@prefix tate:  <http://sparql.xyz/example/tate/> .
@prefix skos:  <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tsub:  <http://sparql.xyz/example/tate/subject/> .

tsub:852  a            skos:Concept ;
        rdfs:label     "condom" ;
        skos:broader   tsub:88 ;
        skos:inScheme  tsub:subjects .

tsub:2166  a           skos:Concept ;
        rdfs:label     "Caiaphas" ;
        skos:broader   tsub:134 ;
        skos:inScheme  tsub:subjects .

tsub:18356  a          skos:Concept ;
        rdfs:label     "New York, Rockefeller Center" ;
        skos:broader   tsub:0 ;
        skos:inScheme  tsub:subjects .

tsub:15852  a          skos:Concept ;
        rdfs:label     "Texel" ;
        skos:broader   tsub:111 ;
        skos:inScheme  tsub:subjects .
		...

Generate a CSV list of subjects

This example extracts all subjects from the JSON files in collection/artworks/ and returns a distinct set of subjects as CSV.

Title Generate a CSV list of subjects
Query queries/subjects-list.sparql
Input collection/artworks_data.csv, collection/artworks/
Output subjects.csv
Type SELECT
Options csv.headers=true
Formats CSV, JSON
Level Novice

Run the query as follows:

fx -q queries/subjects-list.sparql -f CSV -o subjects.csv

Generate a CSV list of the subjects hierarchy

This example extracts all subjects from the JSON files in collection/artworks/ and returns the whole hierarchy as a CSV table.

Title Generate a CSV list of the subjects hierarchy
Query queries/subjects-hierarchy.sparql
Input collection/artworks_data.csv, collection/artworks/
Output hierarchy.csv
Type SELECT
Options csv.headers=true
Formats CSV, JSON
Level Novice

Run the query as follows:

fx -q queries/subjects-hierarchy.sparql -f CSV -o hierarchy.csv

Generate a CSV of artworks + related subjects

This query combines two x-sparql-anything transformations. The first iterates over the CSV file artworks_data.csv. For each artwork, the Tate Gallery open data includes a JSON file with further details, including a list of annotated subjects. The query finds the related JSON file and queries it to retrieve the subjects. The output is a CSV table of artworks and their related subjects, produced by a SELECT query.

Title Generate a CSV of artworks + related subjects
Query queries/arts-and-subjects-list.sparql
Input collection/artworks_data.csv, collection/artworks/
Output arts-and-subjects-list.csv
Type SELECT
Options csv.headers=true
Formats CSV, JSON
Level Novice

Run the example as follows:

fx -q queries/arts-and-subjects-list.sparql -f CSV -o arts-and-subjects-list.csv

Count artworks for each subject

This example is a two-step process.

Step 1: Generate a table associating subjectId and artworkId. The table is produced by querying an RDF file of artworks and subjects. In this example, the input is an RDF file and the output is a CSV file.

Title Count artworks for each subject (Part 1)
Query queries/subjects-artworks-id.sparql
Input arts-and-subjects.ttl
Output subjects-artworks-id.csv
Type SELECT
Options
Formats -
Level Novice

Run the example as follows:

fx -q queries/subjects-artworks-id.sparql -o subjects-artworks-id.csv -f CSV -l arts-and-subjects.ttl

Step 2: Read the CSV table and count the number of artworks for each subject.

Title Count artworks for each subject (Part 2)
Query queries/subjects-artworks-count.sparql
Input subjects-artworks-id.csv
Output subjects-artworks-count.csv
Type SELECT
Options
Formats -
Level Novice

Run the example as follows:

fx -q queries/subjects-artworks-count.sparql -o subjects-artworks-count.csv -f CSV
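
Step 1 reads arts-and-subjects.ttl, which is the output of the "Extract artworks and subjects" example, and Step 2 reads the CSV written by Step 1, so the commands have to run in that order. A minimal sketch that chains them and stops at the first failure; it assumes fx is available as a command or shell function, and the function name count_artworks_per_subject is hypothetical:

```shell
# Hypothetical wrapper around the commands shown above.
count_artworks_per_subject() {
  # Prerequisite: build the RDF file only if it does not exist yet.
  if [ ! -f arts-and-subjects.ttl ]; then
    fx -q queries/arts-and-subjects.sparql -f TTL -o arts-and-subjects.ttl || return 1
  fi
  # Step 1: table of subjectId and artworkId, loaded from the RDF file.
  fx -q queries/subjects-artworks-id.sparql -o subjects-artworks-id.csv -f CSV -l arts-and-subjects.ttl || return 1
  # Step 2: count the artworks for each subject from the CSV of Step 1.
  fx -q queries/subjects-artworks-count.sparql -o subjects-artworks-count.csv -f CSV
}
```

Skipping the prerequisite when arts-and-subjects.ttl already exists is a design choice of this sketch; delete the file first to force a rebuild.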

Other queries

Generate CSV of artworks + subjects + links to images

This example shows how to generate a CSV file including artworks, subjects, and links to thumbnail images by querying the arts-and-subjects.ttl file.

Run the example as follows:

fx -q queries/subjects-artworks-images.sparql -o subjects-artworks-images.csv -f CSV -l arts-and-subjects.ttl 

Generate CSV of id + materials

Medium information from the artworks' JSON files, aggregated into a CSV file.

fx -q queries/classification-medium.sparql -o materials.csv -f CSV

Generate CSV of id + time

Time information extracted from the CSV file into another CSV file.

fx -q queries/time.sparql -o time.csv -f CSV

List of genres

Information is distributed in the artworks' JSON files.

fx -q queries/genres.sparql -o genres.csv -f CSV

List of decades

Information is distributed in the artworks' JSON files.

fx -q queries/decades.sparql -o decades.csv -f CSV
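
The time, genres and decades commands share the same shape: the query queries/NAME.sparql writes NAME.csv. A minimal sketch that runs such queries in a loop, assuming fx is available as a command or shell function; the function name run_csv_queries is hypothetical:

```shell
# Hypothetical helper: run queries/NAME.sparql into NAME.csv for each NAME given.
run_csv_queries() {
  for name in "$@"; do
    # Stop at the first query that fails.
    fx -q "queries/$name.sparql" -o "$name.csv" -f CSV || return 1
  done
}
```

For example, `run_csv_queries time genres decades` issues the three commands above in sequence. The materials query does not fit this pattern, because its output file name differs from the query name.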