Home
The Thoth User Manual, for publishers and other creators of metadata records in Thoth, can be found here.
The wiki below provides an overview of Thoth's approach to Data and Metadata and its interactions with the Open Access Book Supply Chain. It also provides an overview of the Thoth Open Archiving Network.
In the digital realm, a Work usually consists of two constituent parts: data and metadata. The data comprise the contents of the publication: the information contained in it, targeted at human readers, machine readers, or both. The metadata comprise all the data about the publication, such as its author, title, and subject classification.
Metadata are frequently also part of the Work. For example, the title and author are often mentioned on the opening pages, and the ISBN numbers are usually listed in the colophon. Despite this partial overlap, it is useful to distinguish between data and metadata, as they are handled in distinct manners in the Open Access book supply chain. There are several international Metadata Standards setting baseline quality criteria for metadata.
There are specific digital Data Formats and Metadata Formats that are supported by Thoth. An important subset of metadata is formed by Persistent Identifiers.
Thoth operates at several levels in the Open Access book supply chain. We employ here the categorization of key stakeholders and intermediaries proposed in Michael Clarke and Laura Ricci's 2021 report OA Books Supply Chain Mapping (Clarke & Ricci 2021).
Work records in Thoth allow Content Creators to add information about Funding by referencing an Institution by means of Persistent Identifiers as well as further grant program and project information. Content Funders are able to harvest these data through one of the Thoth Metadata Formats or our Open API.
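As a rough illustration of how funding information attached to Work records could be filtered by a funder, the sketch below models a Work with a list of funding entries, each pointing at an Institution through a persistent identifier. The class names, field names, and identifier values are hypothetical placeholders chosen for this example; they are not Thoth's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Funding:
    # Hypothetical fields: a persistent identifier for the funding
    # institution plus optional grant program and project details.
    institution_pid: str
    program: str = ""
    project_name: str = ""


@dataclass
class Work:
    doi: str
    title: str
    fundings: list = field(default_factory=list)


def works_funded_by(works, institution_pid):
    """Return the works that list the given institution as a funder."""
    return [
        work
        for work in works
        if any(f.institution_pid == institution_pid for f in work.fundings)
    ]


# Placeholder records, invented purely for illustration.
funder = "https://ror.org/00example0"
works = [
    Work("10.0000/example.0001", "Funded Monograph",
         [Funding(funder, program="Example OA Programme")]),
    Work("10.0000/example.0002", "Unfunded Monograph"),
]

funded_titles = [work.title for work in works_funded_by(works, funder)]
print(funded_titles)
```

Matching on the persistent identifier rather than on the institution's name is the point of the sketch: name variants do not affect the result.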
Libraries, both University Libraries and National Libraries, have become increasingly important Content Funders in the OA Book Supply Chain. Thoth is partially funded by library subscriptions through the Open Book Collective; in return, Thoth provides high-quality metadata in a range of Metadata Formats, including MARC 21, that libraries can ingest into their Library Management Systems.
Additionally, Thoth is working with University Libraries in the context of the Thoth Open Archiving Network.
Thoth is primarily designed as a platform for Content Creators, in particular Open Access Publishers. Thoth provides integrated services for the maintenance, management, and dissemination of metadata records in a wide variety of Metadata Formats to a large selection of Content Platforms, Ebook Distributors, and Catalogs and Indices.
Publishers may use one of the available commercial Title Management Platforms or Publishing Platforms, which allow authors, editors, and publishers to collaborate in a digital, in-browser environment. Thoth is currently collaborating with Open Monograph Press and Janeway to improve integration with their in-platform metadata management functionalities.
Whereas this wiki focuses mainly on the digital OA book supply chain, many OA publishers also publish print books via one of the commercial Print Book Distributors.
Individual authors are not a targeted user group of Thoth. They may manage their private bibliographic metadata on one of the available commercial or open source Bibliographic Reference Management Platforms and upload their research directly to one of the Green OA Repositories. Thoth currently supports the export of metadata to all available Bibliographic Reference Management Platforms via BibTeX.
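To give a sense of what a BibTeX export of a book record looks like, here is a minimal sketch that renders a flat metadata record as a `@book` entry. The function `to_bibtex`, the record keys, and the sample values are assumptions made for this illustration; this is not Thoth's own exporter, and the entries Thoth produces may contain different fields.

```python
def to_bibtex(key, record):
    """Render a flat metadata record as a BibTeX @book entry."""
    field_order = ["author", "title", "publisher", "year", "isbn", "doi"]
    lines = [f"@book{{{key},"]
    for name in field_order:
        value = record.get(name)
        if value:  # skip fields the record does not provide
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder record, invented purely for illustration.
record = {
    "author": "Doe, Jane",
    "title": "An Example Open Access Monograph",
    "publisher": "Example Press",
    "year": "2024",
    "doi": "10.0000/example.0001",
}

entry = to_bibtex("doe2024", record)
print(entry)
```

The sketch does not escape characters that are special in BibTeX (such as `&` or `%`), which a real exporter would need to handle.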
Authors are also end users of the metadata provided by Thoth when they access Knowledge Graphs and Web-Scale Search Engines or use any of the Content Platforms to access publications during their research phase.
Ebook Aggregators "license and consolidate titles from many publishers into one combined database, [… and] often combine OA and paid-access titles for greater discoverability and convenience" (Clarke & Ricci 2021). Thoth currently supports the export of metadata to Baobab ebooks, EBSCO eBooks, JSTOR, Project MUSE, and ProQuest Ebook Central.
OA Platforms and Repositories "have no underlying infrastructure for the buying and selling of books, and are intended to host exclusively free or OA content" (Clarke & Ricci 2021). Thoth currently supports the export of metadata to OAPEN.
A special category of OA Platforms and Repositories is formed by Digital OER Libraries, which focus mainly on textbooks rather than scholarly publications.
Consumer Ebook Platforms "offer titles for an individual’s use and access, and do not actively support institutional or library integration" (Clarke & Ricci 2021). Thoth currently supports the export of metadata to Google Play Books.
Shadow Libraries are "online databases of readily available content that is normally obscured or otherwise not readily accessible. Such content may be inaccessible for a number of reasons, including the use of paywalls, copyright controls, or other barriers to accessibility placed upon the content by its original owners" (Wikipedia). Thoth does not currently support the export of metadata to any of the shadow libraries.
Ebook Distributors differ from Digital Libraries in the sense that they do not claim to offer a scholarly function, be that to research institutions or to the general public. Distributors repackage and normalize ebook metadata. Most ebook distributors operate some form of monetization scheme, which may not be hospitable to OA books. Thoth currently supports the export of metadata to OverDrive and RNIB Bookshare.
Third-party Content Indices are more specialized types of products that promote metadata curation and discovery. Thoth currently supports the export of metadata to DOAB.
OER Discovery Platforms provide similar services for open textbooks.
Knowledge Bases are library-agnostic global content indices. Thoth currently supports the export of metadata to BDSLive and EBSCO Knowledge Base.
Topic-specific Bibliographies are managed by scholarly organizations related to a specific field of inquiry.
Citation Indices, such as OpenCitations, provide specific indexing for citations and references.
The Thoth Open Archiving Network comprises an expanding number of institutional repositories that archive the metadata stored in Thoth and its linked data for long-term preservation purposes. These data and metadata are often preserved in specific Data Formats and Metadata Formats. Institutional repositories often operate through one of the available Repository Systems such as DSpace or Figshare. See also our blog posts here and here.
The following Preservation Repositories are currently connected to Thoth as part of the Thoth Open Archiving Network:
- Internet Archive
- Cambridge University Library "Apollo" Repository (via DSpace)
- Loughborough University Library Repository (via Figshare)
The accessibility of the eBook output of small publishers to print-disabled individuals is supported through metadata. This can take the form of metadata standards that include data on the accessibility of the work, or metadata standards used to provide accessibility features. In addition, various digital accessibility standards apply to websites and the documents available on them, and these form part of legal requirements in certain countries.
Metadata standards that include accessibility tags:
Standards that allow provision of accessibility features and tools:
Markup languages that allow provision of accessibility features and tools:
- MathML, also known as ISO/IEC 40314:2016
- Timed Text Markup Language
- Scalable Vector Graphics
- Voice Extensible Markup Language
- DocBook
- DTBook
Digital accessibility standards:
- Web Content Accessibility Guidelines (WCAG) 2.2 - Level AA is required by UK law
- Web Content Accessibility Guidelines (WCAG) 2.1
- Web Content Accessibility Guidelines (WCAG) 2.0 - Level AA is required by US law
- EN 301 549 Annex A - Similar to WCAG 2.1; required by EU law
The Thoth Wiki has been developed in the context of the COPIM (Community-led Open Publication Infrastructures for Monographs) project. Individual contributions to the wiki have been made by Tim Elfenbein, Joanne Fitzpatrick, Rupert Gatti, Ross Higman, Hannah Hillen, Brendan O'Connell, Tobias Steiner, and Vincent W.J. van Gerven Oei under the general editorship of Van Gerven Oei. All data are available under a CC-BY 4.0 license.