Construct a knowledge graph of artists and artworks of the IMMA museum website

This showcase demonstrates the use of SPARQL Anything for constructing a Knowledge Graph from data encoded in HTML pages.

In what follows, fx refers to the following command line:

java -jar sparql-anything-<version>.jar

Knowledge graph construction pipeline

Step 1: list artists from the catalogue

This query extracts the list of artists from the Web page and builds an XML result set with the variables ?artistNickname and ?artistUrl. The next query uses this SPARQL result set file to iterate over the artists' pages.

Title	Step 1: list artists from the catalogue
Query	queries/imma-artists.sparql
Input	https://imma.ie/artists/
Output	imma-artists.xml
Type	SELECT
Options	`html.selector=#az-group`
Formats	HTML
Level	Novice

Run the example as follows:

fx -q queries/imma-artists.sparql -o imma-artists.xml -f xml
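
The result set written by this command follows the W3C SPARQL Query Results XML Format. As a quick sanity check before moving on, the following Python sketch lists the bindings in such a file. The helper name read_bindings and the inline sample are illustrative and not part of the showcase; the sample row reuses the artist values that appear later in this document.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def read_bindings(xml_data):
    """Return one dict per result row, mapping variable name to its value."""
    root = ET.fromstring(xml_data)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            # Each binding wraps a single <uri>, <literal> or <bnode> element.
            value = binding[0]
            row[binding.get("name")] = (value.text or "").strip()
        rows.append(row)
    return rows


# Hypothetical sample in the shape expected for imma-artists.xml.
SAMPLE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="artistNickname"/>
    <variable name="artistUrl"/>
  </head>
  <results>
    <result>
      <binding name="artistNickname"><literal>lambert-gene</literal></binding>
      <binding name="artistUrl"><uri>https://imma.ie/artists/gene-lambert/</uri></binding>
    </result>
  </results>
</sparql>"""

if __name__ == "__main__":
    for row in read_bindings(SAMPLE):
        print(row["artistNickname"], row["artistUrl"])
```

To inspect the real file, pass its bytes instead of the sample, for instance the result of pathlib.Path("imma-artists.xml").read_bytes().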

Step 2: iterate over artists' web pages and create a JSON-LD for each one of them

In this step we use a parametrized query that reads an artist's web page and extracts the relevant metadata. The query is executed once for each row of the SPARQL result set file produced in the previous step. Each execution generates one JSON-LD file, named after the artist nickname (one of the values provided by the result set). Crucially, the JSON-LD files include the URLs of the related artworks' web pages, which the next step relies on.

Title	Step 2: iterate over artists' web pages and create a JSON-LD for each one of them
Query	queries/imma-artist.sparql
Input	imma-artists.xml, `?_artistUrl`
Output	artists/*.jsonld
Type	CONSTRUCT
Options
Formats	HTML
Level	Novice

Run the example as follows:

fx -q queries/imma-artist.sparql -i imma-artists.xml -p "artists/?artistNickname.jsonld" -f json
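
The -p option names each output file after a value in the current row of the result set. The substitution can be pictured with the Python sketch below; expand_pattern is a hypothetical helper that mimics the behaviour described above and is not SPARQL Anything's actual implementation.

```python
import re


def expand_pattern(pattern, binding):
    """Replace each ?name in pattern with the value bound to name.

    Placeholders with no matching binding are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        return binding.get(name, match.group(0))

    return re.sub(r"\?(\w+)", substitute, pattern)


if __name__ == "__main__":
    row = {
        "artistNickname": "lambert-gene",
        "artistUrl": "https://imma.ie/artists/gene-lambert/",
    }
    print(expand_pattern("artists/?artistNickname.jsonld", row))
```

Because each row yields a different nickname, each execution of the query writes to its own file under artists/.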

Step 3: Generate the list of artworks

Next, we extract the list of artworks' Web pages from the artists' JSON-LD files. Since JSON-LD is an RDF serialization, the files can be queried directly after loading them into an in-memory dataset with the command-line option -l.

Title	Step 3: Generate the list of artworks
Query	queries/imma-artworks.sparql
Input	artists/
Output	imma-artworks.xml
Type	SELECT
Options	`-l`
Formats
Level	Novice

Run the example as follows:

fx -q queries/imma-artworks.sparql -l artists/ -o imma-artworks.xml -f xml
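
To cross-check the output of this step without a SPARQL engine, the JSON-LD files can also be scanned as plain JSON. The sketch below is a rough alternative to the query, not a replacement for it. It assumes that artwork pages are identified by URLs starting with https://imma.ie/collection/ (the prefix seen in the example URLs later in this document) and that those URLs appear in full rather than as compact IRIs. collect_artwork_urls is a hypothetical helper.

```python
import json
from pathlib import Path

# Assumed prefix of artwork pages, taken from the example URLs in this README.
ARTWORK_PREFIX = "https://imma.ie/collection/"


def find_urls(node, prefix, found):
    """Recursively collect every string value in node that starts with prefix."""
    if isinstance(node, dict):
        for value in node.values():
            find_urls(value, prefix, found)
    elif isinstance(node, list):
        for item in node:
            find_urls(item, prefix, found)
    elif isinstance(node, str) and node.startswith(prefix):
        found.add(node)


def collect_artwork_urls(folder, prefix=ARTWORK_PREFIX):
    """Scan all *.jsonld files in folder and return the sorted artwork URLs."""
    found = set()
    for path in sorted(Path(folder).glob("*.jsonld")):
        with path.open(encoding="utf-8") as handle:
            find_urls(json.load(handle), prefix, found)
    return sorted(found)


if __name__ == "__main__":
    for url in collect_artwork_urls("artists"):
        print(url)
```

If the number of URLs found this way differs widely from the number of rows in imma-artworks.xml, one of the assumptions above probably does not hold for the generated files.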

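Step 4: iterate over artworks' web pages and create a JSON-LD for each one of them

Next, we extract data from each artwork's Web page and build one JSON-LD file per artwork. The folder 'artworks' must exist before the command is run.

Title	Step 4: iterate over artworks' web pages and create a JSON-LD for each one of them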
Query	queries/imma-artwork.sparql
Input	imma-artworks.xml, `?_artworkUrl`
Output	artworks/*.jsonld
Type	CONSTRUCT
Options
Formats	HTML
Level	Novice

Run the example as follows:

fx -q queries/imma-artwork.sparql -i imma-artworks.xml -p "artworks/?artworkNickname.jsonld" -f json
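
Because the command writes into artworks/, that folder has to exist beforehand, as noted above. A minimal Python sketch for preparing the output folders of Steps 2 and 4 follows. ensure_output_dirs is a hypothetical helper, and creating artists/ as well is an assumption, since the text only mentions artworks.

```python
from pathlib import Path


def ensure_output_dirs(base=".", names=("artists", "artworks")):
    """Create the output folders under base if they do not exist yet."""
    created = []
    for name in names:
        target = Path(base) / name
        # exist_ok=True makes the call safe to repeat between runs.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    for folder in ensure_output_dirs():
        print("ready:", folder)
```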

Finally, the JSON-LD files in artists/ and artworks/ can be loaded into a triple store of choice.

Extract single artists / artworks

The same queries can also be run for a single artist or artwork. The commands below showcase the CLI option -v, which passes parameter values directly on the command line instead of reading them from a result set file.

Extract data from a specific artist Web page:

fx -q queries/imma-artist.sparql -v artistNickname=lambert-gene -v artistUrl=https://imma.ie/artists/gene-lambert/ -p "artists/?artistNickname.jsonld" -f json

Extract data from a specific artwork Web page:

fx -q queries/imma-artwork.sparql -v artworkNickname=naturaleza-desde-la-ventana -v artworkUrl=https://imma.ie/collection/naturaleza-desde-la-ventana/ -p "artworks/?artworkNickname.jsonld" -f json
fx -q queries/imma-artwork.sparql -v artworkNickname=berry-dress -v artworkUrl=https://imma.ie/collection/berry-dress/ -p "artworks/?artworkNickname.jsonld" -f json
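
When several single pages need to be re-extracted, the -v arguments can be assembled programmatically. The Python sketch below only builds the argument list and does not run anything. build_fx_args is a hypothetical helper, and the jar path is a placeholder that must be replaced with the actual file.

```python
def build_fx_args(jar_path, query, values, output_pattern, fmt="json"):
    """Build the argument list for one run, with one -v name=value pair per entry."""
    args = ["java", "-jar", jar_path, "-q", query]
    for name, value in values.items():
        args += ["-v", f"{name}={value}"]
    args += ["-p", output_pattern, "-f", fmt]
    return args


if __name__ == "__main__":
    args = build_fx_args(
        "sparql-anything.jar",  # placeholder: use the path of the downloaded jar
        "queries/imma-artwork.sparql",
        {
            "artworkNickname": "berry-dress",
            "artworkUrl": "https://imma.ie/collection/berry-dress/",
        },
        "artworks/?artworkNickname.jsonld",
    )
    print(" ".join(args))
```

The resulting list can be handed to subprocess.run(args, check=True), which avoids shell quoting issues with the ? in the output pattern.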
