Generators, slug rewrites and generic collections #2
Adds a few more hooks and fixes some bugs.
Note: This release will not have a `build --watch` option.
Generators
A generator is a script that can be added to a headless project and that will be called before the rest of the pipeline. A generator is tasked with creating a folder of IIIF manifests. It will be given a folder in the cache and, unless specified otherwise, the contents will automatically be added to the rest of the headless static site pipeline.
The generator is structured in a way that maximises caching and parallelism. However, these are optional and you can just run it as a single monolithic function. The built-in example uses the NASA photo archive to make a search query and build manifests from the results. It uses the parallel steps.
All the steps available are:
`fetch()` is provided.
`true` then the following generate steps will be called.
`generateEach` is called on them. Useful for async requests. Currently the generator does not have batching, so be careful if you are making lots of HTTP requests. In the generate step you are passed a directory to save the resource, along with the data returned in the prepare step and caches. Similar to other steps in the pipeline, you can return cache entries specific to this resource (by id) using `return { cache: { anything: '...' } }`.
There is also an `invalidateEach` step that will be passed the resource-specific cache from the `generateEach` step, so you can rebuild only the things that have changed.
The NASA example uses:
The prepare step makes the search request to the NASA API and returns a list of results. It uses the configuration from the `.iiifrc.yaml` to limit the results and pages. It then returns a list of resources (not IIIF yet), each with an identifier and some data. The identifier needs to be unique, but doesn't have to be a URL, and the data can be anything serialisable.
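As a rough illustration of the shape described here, the list returned from the prepare step might look like the following. The `GeneratorResource` type, the `id` and `data` field names, and the sample values are all assumptions made for this sketch; they are not taken from the NASA generator.

```typescript
// Hypothetical shape of a prepared resource: a unique identifier plus
// arbitrary serialisable data. Field names are assumed for illustration.
interface GeneratorResource {
  id: string;
  data: unknown;
}

// Made-up sample values; a real prepare step would build these from search results.
const resources: GeneratorResource[] = [
  { id: "example-item-1", data: { title: "First result", page: 1 } },
  { id: "example-item-2", data: { title: "Second result", page: 1 } },
];

// Identifiers must be unique, but they do not have to be URLs.
function hasUniqueIds(list: GeneratorResource[]): boolean {
  return new Set(list.map((r) => r.id)).size === list.length;
}

// "Serialisable" is interpreted here as surviving a JSON round trip.
function isSerialisable(value: unknown): boolean {
  try {
    return JSON.stringify(JSON.parse(JSON.stringify(value))) === JSON.stringify(value);
  } catch {
    return false;
  }
}
```

A check like `hasUniqueIds` is only a convenience for spotting duplicate identifiers early; nothing in this description says the pipeline performs it for you.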
The returned list of results is then passed to `generateEach`, which will make a further call to the NASA APIs to get image information and metadata. It will then construct a IIIF Manifest (using IIIF Builder, with an instance passed in as a helper) and save it to disk.
There is a helper for saving data to disk (save and forget, no await).
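The flow described above (prepare, then a per-resource invalidation check, then a per-resource generate step that returns cache entries) can be sketched conceptually as follows. This is not the real generator API: `SketchGenerator`, `runSketch` and the `seen` cache key are hypothetical names, and the assumption that returning `true` from `invalidateEach` means "regenerate" is inferred from the description.

```typescript
// Conceptual sketch only: this does not use the real generator API.
type Cache = Record<string, unknown>;

interface Resource {
  id: string;
  data: unknown;
}

interface SketchGenerator {
  // Returns the list of resources to build.
  prepare(): Promise<Resource[]>;
  // Assumed semantics: true means the resource needs to be (re)generated.
  invalidateEach(resource: Resource, cache: Cache): Promise<boolean>;
  // Builds one resource and returns cache entries specific to it.
  generateEach(resource: Resource, cache: Cache): Promise<{ cache: Cache }>;
}

// Runs the steps, keeping a per-resource cache keyed by resource id.
// Returns the ids that were actually regenerated.
async function runSketch(
  generator: SketchGenerator,
  caches: Map<string, Cache>
): Promise<string[]> {
  const resources = await generator.prepare();
  const results = await Promise.all(
    // No batching: every resource is processed at once.
    resources.map(async (resource) => {
      const cache = caches.get(resource.id) ?? {};
      if (!(await generator.invalidateEach(resource, cache))) {
        return null; // Cached result is still valid, so skip it.
      }
      const result = await generator.generateEach(resource, cache);
      caches.set(resource.id, { ...cache, ...result.cache });
      return resource.id;
    })
  );
  return results.filter((id): id is string => id !== null);
}

// Example generator: regenerate only when the data differs from the cached copy.
const exampleGenerator: SketchGenerator = {
  async prepare() {
    return [
      { id: "a", data: { version: 1 } },
      { id: "b", data: { version: 1 } },
    ];
  },
  async invalidateEach(resource, cache) {
    return cache["seen"] !== JSON.stringify(resource.data);
  },
  async generateEach(resource) {
    return { cache: { seen: JSON.stringify(resource.data) } };
  },
};
```

Running `runSketch(exampleGenerator, caches)` twice with the same `caches` map regenerates both resources on the first run and none on the second, which is the behaviour the per-resource cache is meant to enable.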
If there is no `output` in the `.iiifrc.yaml` configuration, then it will be automatically built into the static site using the `iiif-json` preset. You can see the "virtual" store it generates in the build folder (`.iiif/build/config/stores.json`). If you set an output folder, that will be used for building instead, and you can then wire it up manually if you want to save into source control or change the filter rules.
Example `.iiifrc.yaml` using the built-in example generator (NASA)
By default, generators will be run during a build, but you can pass `--no-generate` to prevent this. You can also run generate as a distinct step using `iiif-hss generate`.
Caching is aggressive, but `--no-cache` will disable the remote URL cache for generators.
To create a new generator, you can create a script, similar to extract/enrich.
Rewrites
A new hook was added for rewriting slugs. This is always called, so ideally the generation of the rewrite should be minimal and not make slow requests. There is a bundled rewrite for flattening manifests/collections, which could equally be implemented as a script in the `scripts/` folder. It rewrites the slug (the URL of the manifest, minus the `/manifest.json` at the end). Currently rewrites can only be added to the top-level `run:` in the config, and not per-store. This might change in the future, but for now you need to ensure you add it into the run configuration.
Generic collections
There is some work to do with collection processing. This feature is a functional, but not yet customisable, implementation of creating generic collections during the `extract` step.
In addition to returning indices, caches and meta from an extraction step, you can also return a list of collection slugs, e.g. `path/to/my/collection`. All manifests that are tagged in this way will be gathered together and a IIIF collection created. The labels are bad and you cannot customise them yet. However, there is a plan to improve the collection enrichment step, including these collections and index collections.
There is a built-in generic collection extraction: `folder-collections`. This will create a collection from each folder that contains manifests. For example, two folders holding 3 manifests each will create 2 collections, each with 3 manifests. If you pair this with flat manifests, the manifest slugs are rewritten so they are flat, but you have the folder structure preserved in the collections.
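To make the pairing of folder collections and flat slugs concrete, here is a small standalone sketch. The helper names, the sample slugs, and the flattening rule (keep only the last path segment under a fixed `manifests/` prefix) are assumptions for illustration; the bundled rewrite and the `folder-collections` extraction may behave differently in detail.

```typescript
// Hypothetical helpers; slugs are manifest URLs minus the trailing /manifest.json.

// Groups manifest slugs by their containing folder, one collection per folder.
function folderCollections(slugs: string[]): Map<string, string[]> {
  const collections = new Map<string, string[]>();
  for (const slug of slugs) {
    const index = slug.lastIndexOf("/");
    if (index === -1) continue; // Not inside a folder.
    const folder = slug.slice(0, index);
    const members = collections.get(folder) ?? [];
    members.push(slug);
    collections.set(folder, members);
  }
  return collections;
}

// Assumed flattening rule: keep only the last path segment under a fixed prefix.
function flattenSlug(slug: string): string {
  const lastPart = slug.slice(slug.lastIndexOf("/") + 1);
  return `manifests/${lastPart}`;
}

// Made-up sample slugs: two folders with three manifests each.
const slugs = [
  "manifests/folder-a/one",
  "manifests/folder-a/two",
  "manifests/folder-a/three",
  "manifests/folder-b/four",
  "manifests/folder-b/five",
  "manifests/folder-b/six",
];

const collections = folderCollections(slugs);
const flat = slugs.map(flattenSlug);
```

With these inputs, `collections` holds two entries of three slugs each, and `flat` holds `manifests/one` through `manifests/six`. Under this assumed rule, flattening only works cleanly when the last path segment is unique across folders.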