From b55a69642d92f41820c1816ca0cb8bcc817fe632 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Mon, 2 Dec 2024 18:14:28 +0000 Subject: [PATCH 01/16] grammar & style tweaks to introduction --- docs/_specification/1.2-DRAFT/introduction.md | 59 +++++++++---------- 1 file changed, 28 insertions(+), 31 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index 0618f7e7..31dca741 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -25,15 +25,15 @@ parent: RO-Crate 1.2-DRAFT # Introduction -This document specifies a method, known as _RO-Crate_ (Research Object Crate), of aggregating and describing data for distribution, re-use, publishing, preservation and archiving. RO-Crates aggregate data into a Dataset, and may describe any resource including files, URI-addressable resources, or use other addressing schemes to locate digital or physical data. Describing resources includes technical metadata such as file sizes and types as well as contextual information including how datasets and files were created, and where, how they were collated and collected, who was involved in the process, what equipment and software was used, who funded the work, how to cite it, and crucially, how it may be reused, and by whom. +This document specifies a method, known as _RO-Crate_ (Research Object Crate), of aggregating and describing data for distribution, re-use, publishing, preservation and archiving. RO-Crates aggregate data into a Dataset, and may describe any resource including files, URI-addressable resources, or use other addressing schemes to locate digital or physical data. 
Describing resources includes technical metadata such as file sizes and types as well as contextual information including how and where datasets and files were created, how they were collated and collected, who was involved in the process, what equipment and software was used, who funded the work, how to cite it, and crucially, how it may be reused, and by whom. -The core of RO-Crate is a machine-readable linked-data document in JSON-LD format known as an **RO-Crate Metadata Document**. RO-Crate metadata documents can to a large extent be created and processed just like any other JSON: knowledge of JSON-LD is not needed, unless extending RO-Crate with additional concepts or combining RO-Crate with other Linked Data technologies. +The core of RO-Crate is a machine-readable linked-data document in JSON-LD format known as an **RO-Crate Metadata Document**. RO-Crate metadata documents can, to a large extent, be created and processed just like any other JSON: knowledge of JSON-LD is not needed, unless extending RO-Crate with additional concepts or combining RO-Crate with other Linked Data technologies. This page introduces the general RO-Crate concepts through a running example, while the normative pages in the rest of the RO-Crate specification define in more detail these and other concepts using separate examples and recommendations. ## Walkthrough: An initial RO-Crate -In the simplest form, to describe some data on disk, an _RO-Crate Metadata Document_ named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories (this file is known as the _RO-Crate Metadata File_). +In the simplest form, to describe some data on disk, a file named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories. This `ro-crate-metadata.json` file is known as the _RO-Crate Metadata Document_. 
In the example below, a single file `data.csv` is placed with the RO-Crate Metadata Document in a directory named `crate1`: @@ -97,11 +97,11 @@ In this running example, the content of the _RO Crate Metadata Document_ is: ### JSON-LD preamble -The preamble of `@context` and `@graph` are JSON-LD structures that help provide global identifiers to the JSON keys and types used in the rest of the _RO-Crate Metadata Document_. These will largely map to definitions in the [schema.org](http://schema.org/) vocabulary, which can be used by RO-Crate extensions to provide additional metadata beyond the RO-Crate specifications. It is this feature of JSON-LD that helps make RO-Crate extensible for many different purposes -- this is explored further in [appendix on JSON-LD](appendix/jsonld). +The preamble of `@context` and `@graph` are JSON-LD structures that help provide global identifiers to the JSON keys and types used in the rest of the _RO-Crate Metadata Document_. These will largely map to definitions in the [schema.org](http://schema.org/) vocabulary, which can be used by RO-Crate extensions to provide additional metadata beyond the RO-Crate specification. It is this feature of JSON-LD that helps make RO-Crate extensible for many different purposes -- this is explored further in the [appendix on JSON-LD](appendix/jsonld). -However, in the general case it should be sufficient to follow the RO-Crate JSON examples directly without deeper JSON-LD understanding. In short, an _RO-Crate metadata Document_ contains a flat list of _entities_ as objects in the `@graph` array. These entities are cross-referenced using `@id` identifiers rather than being deeply nested. +However, in the general case it should be sufficient to follow the RO-Crate JSON examples directly without deeper JSON-LD understanding. In short, an _RO-Crate Metadata Document_ contains a flat list of _entities_ as objects in the `@graph` array. 
These entities are cross-referenced using `@id` identifiers rather than being deeply nested. -### RO-Crate Metadata descriptor +### RO-Crate Metadata Descriptor The first JSON-LD _entity_ in our example above has the `@id` `ro-crate-metadata.json`: @@ -117,7 +117,7 @@ The first JSON-LD _entity_ in our example above has the `@id` `ro-crate-metadata This required entity, known as the _RO-Crate Metadata Descriptor_, helps this file self-identify as an _RO-Crate Metadata Document_, which is conforming to (`conformsTo`) the RO-Crate specification version 1.2-DRAFT. -The descriptor also indicates via the `about` property which entity in the `@graph` array is the _RO-Crate Root Dataset_ -- the starting point of this RO-Crate. +The descriptor also indicates via the `about` property which entity in the `@graph` array is the _RO-Crate Root_ dataset -- the starting point of this RO-Crate. ### RO-Crate Root @@ -128,14 +128,13 @@ We can visualise how the above entity references the **RO-Crate Root** as:
Figure 2: showing RO-Crate Metadata descriptor's about property pointing at the RO-Crate Root entity with matching @id
-By convention, in RO-Crate the `@id` value of `./` means that this document describes the directory of content in which the RO-Crate metadata is located as in the example above. This reference from `ro-crate-metadata.json` is therefore marking the `crate1` directory as being the RO-Crate root. +By convention, in RO-Crate the `@id` value of `./` means that this document describes the directory of content in which the _RO-Crate Metadata Document_ is located, as in the example above. This reference from `ro-crate-metadata.json` is therefore marking the `crate1` directory as being the _RO-Crate Root_. The entity whose `@id` is the _RO-Crate Root_ is called the _Root Data Entity_. -{: .note } -This example is a directory-based RO-Crate stored on disk. If the crate is being served from a Web service, such as a data repository or database where files are not organized in directories, then the `@id` might be an absolute URI instead of `"./"` -- see section [Root Data Entity](root-data-entity) for details +{% include callout.html type="note" content="This example is a directory-based RO-Crate stored on disk. If the crate is being served from a Web service, such as a data repository or database where files are not organized in directories, then the `@id` might be an absolute URI instead of `\"./\"` -- see section [Root Data Entity](root-data-entity) for details." %} ### About cross-references -In _RO-Crate Metadata Documents_, entities are cross-referenced using `@id` reference objects, rather than using deeply nested JSON objects. In short, this _flattened JSON-LD_ style allows any entity to reference any other entity, and RO-Crate consumers to directly find all the descriptions of an entity within a single JSON object. So let's have a look at the Root Data Entity `./`: +In an _RO-Crate Metadata Document_, entities are cross-referenced using `@id` reference objects, rather than using deeply nested JSON objects. 
In short, this _flattened JSON-LD_ style allows any entity to reference any other entity, and RO-Crate consumers to directly find all the descriptions of an entity within a single JSON object. So let's have a look at the _Root Data Entity_ `./`: ```json @@ -147,20 +146,17 @@ In _RO-Crate Metadata Documents_, entities are cross-referenced using `@id` refe } ``` -The root is always typed `Dataset`, though it may have more than one type. It has several metadata properties that describe the RO-Crate as a whole, as a collection of resources. The section on [root data entity](root-data-entity) explores further the required and recommended properties of the root `./`. +The _Root Data Entity_ always has `@type` `Dataset`, though it may have more than one type. It has several metadata properties that describe the RO-Crate as a whole, as a collection of resources. The section on the [Root Data Entity](root-data-entity) explores further the required and recommended properties of this entity. ### Data entities -A main type of resources collected are _data_ -- simplifying, we can consider data as any kind of file that can be opened in other programs. These are aggregated by the Root Dataset with the `hasPart` property. In this example we have an array with a single value, a reference to the entity describing the file `data.csv`. +A main type of resources collected are _data_ -- simplifying, we can consider data as any kind of file that can be opened in other programs. These are aggregated by the _Root Data Entity_ with the `hasPart` property. In this example we have an array with a single value, a reference to the entity describing the file `data.csv`. -{: .tip} -RO-Crates can also contain data entities that are folders and Web resources, as well as non-File-like data like online databases -- see section on [data entities](data-entities). 
+{% include callout.html type="tip" content="RO-Crates can also contain _data entities_ that are folders and Web resources, as well as non-File-like data like online databases -- see the section on [data entities](data-entities) for more information." %}
- -JSON block with id ./ has an array under hasPart listing id data.csv. In second JSON block with id data.csv we see it is typed File and have other properties. - -
Figure 3: RO-Crate Root entity referencing the data entity with @id identifier data.csv
+ JSON block with id `./` has an array under  `hasPart` listing id `data.csv`. In second JSON block with id `data.csv` we see it is typed `File` and has other properties. +
Figure 3: RO-Crate Root entity referencing the data entity with @id identifier data.csv
If we now follow the `@id` reference for the corresponding _data entity_ JSON block, we see it has `@type` value of `File` and additional metadata such as `encodingFormat`. It is recommended that every entity has a human readable `name`, which as shown in this example, does not need to match the filename/identifier. The `encodingFormat` indicates the media file type so that consumers of the crate can open `data.csv` in an appropriate program. @@ -176,12 +172,12 @@ If we now follow the `@id` reference for the corresponding _data entity_ JSON bl }, ``` -For more information on describing files and directories, including their recommended and required attributes, see section on [data entities](data-entities). +For more information on describing files and directories, including their recommended and required attributes, see the section on [data entities](data-entities). ### Contextual entities -Moving back to the RO-Crate root `./`, the publisher of this Dataset should be indicated using the property `publisher` using a URI to identify the `Organization`, linking to what is known as a _Contextual Entity_ that provides some information about the Organization such as its name and web address. +Moving back to the RO-Crate _Root Data Entity_ (with `@id` `./`), the publisher of this Dataset should be indicated using the property `publisher` and using a URI to identify the publishing `Organization`, linking to what is known as a _Contextual Entity_ that provides some information about the Organization such as its name and web address. ```json @@ -201,17 +197,17 @@ Moving back to the RO-Crate root `./`, the publisher of this Dataset should be i } ``` -You may notice the subtle difference between a _data entity_ that is conceptually part of the RO-Crate and is file-like (containing bytes), while this _contextual entity_ is a representation of a real-life organization that can't be downloaded: following the URL, we would only get its _description_. 
The section [contextual entities](contextual-entities) explores several of the entities that can be added to the RO-Crate to provide it with a **context**, for instance how to link to authors and their affiliations. Simplifying slightly, a data entity is referenced from `hasPart` in a `Dataset`, while a contextual entity is referenced using any other defined property. +You may notice the subtle difference between a _data entity_ that is conceptually part of the RO-Crate and is file-like (containing bytes), while this _contextual entity_ is a representation of a real-life organization that can't be downloaded: following the URL, we would only get its _description_. The section on [contextual entities](contextual-entities) explores several of the entities that can be added to the RO-Crate to provide it with a **context**, for instance how to link to authors and their affiliations. Simplifying slightly, a _data entity_ is referenced from `hasPart` in a `Dataset`, while a _contextual entity_ is referenced using any other defined property. ## HTML preview -An RO-Crate can be distributed on disk, in packaged format such as a zip file or disk image, or placed on a static website. In any of these cases, an RO-Crate should have an accompanying HTML version (`ro-crate-metadata.html`) designed to be human-readable. The exact contents of the preview may vary but should correspond to the _RO-Crate Metadata Document_ content and link to the contained data entities. The preview may be generated automatically from the RO-Crate Metadata Document (see [RO-Crate tools](../../tools)), or even by hand (equivalent to a README). +An RO-Crate can be distributed on disk, in packaged format such as a zip file or disk image, or placed on a static website. In any of these cases, an RO-Crate should have an accompanying HTML version (`ro-crate-preview.html`) designed to be human-readable. 
The exact contents of the preview may vary but should correspond to the _RO-Crate Metadata Document_ content and link to the contained data entities. The preview may be generated automatically from the _RO-Crate Metadata Document_ (see [RO-Crate tools](../../tools)), or even by hand (equivalent to a README). -Below is a screenshot from the [preview of the running example](examples/rainfall-1.2.0/ro-crate-preview.html): +Below is a screenshot from the [preview of the running example](examples/rainfall-1.2.0/ro-crate-preview.html), which was generated using the [ro-crate-html](https://www.npmjs.com/package/ro-crate-html) package:
Screenshot of RO-Crate HTML preview. The metadata attributes are listed in a table with links to each connected entity, such as the Bureau of Meteorology. -
Figure 3: RO-Crate preview of the running example.
+
Figure 4: RO-Crate preview of the running example.
@@ -219,18 +215,19 @@ Below is a screenshot from the [preview of the running example](examples/rainfal The rest of this specification is structured as follows: -* [Terminology](terminology) defines terms such as _Entity_ used in the rest of the document. You may use this page as a quick-reference, but note that most of these are also covered in detail in separate pages. -* [RO-Crate structure](structure) defines further how the `ro-crate-metadata.json` and data files can be organized within an _RO-Crate Root_ directory +* [Terminology](terminology) defines terms such as _Entity_ used in this section and the rest of the specification. You may use this section as a quick-reference, but note that most of these are also covered in detail in separate sections. +* [RO-Crate Structure](structure) defines further how the `ro-crate-metadata.json` and data files can be organized within an _RO-Crate Root_ directory. * [Metadata of the RO-Crate](metadata) explains the connection to Linked Data principles and how RO-Crate keys are mapped to global identifiers. This is mainly of interest for readers already familiar with JSON-LD or ontologies, or which want to expand RO-Crate metadata keys. -* [Root Data Entity](root-data-entity) defines the entities _RO-Crate Metadata Descriptor_ (`ro-crate-metdata.json`) and _RO-Crate Root_ (`./`) including their required and recommended properties. +* [Root Data Entity](root-data-entity) defines the entities _RO-Crate Metadata Descriptor_ (`ro-crate-metadata.json`) and _Root Data Entity_ (`./`) including their required and recommended properties. * [Data Entities](data-entities) explores further how to describe data, including files, directories and Web references. Metadata such as file formats help inform RO-Crate consumers on which tools may be able to process the data. 
* [Contextual Entities](contextual-entities) shows how to describe entities used to annotate other entities, adding `People` and `Organization` referenced from `author`, `publication`, `affiliation` etc. Metadata like licensing, funding, locations and subjects can be described using contextual entities. +* [The focus of an RO-Crate](crate-focus) !!add description!! * [Provenance of Entities](provenance) explores how the history of making an entity can be added to the RO-Crate using a series of _actions_ -- this may include real-world activities and instruments, as well as software executions and modifications to the RO-Crate metadata itself. -* Subsection [Digital Library and Repository content](provenance#digital-library-and-repository-content) details how records in an existing repository (which may reference files, but also physical objects) can be described and published using RO-Crate. + * Subsection [Digital Library and Repository content](provenance#digital-library-and-repository-content) details how records in an existing repository (which may reference files, but also physical objects) can be described and published using RO-Crate. * [Workflows and Scripts](workflows) explains how computational software and code can be added to an RO-Crate, possibly as part of explaining provenance, but also for providing potential usage and further processing of the data. -* [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific profile, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, explored in this section.
+* [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific _RO-Crate profile_, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, which is also explored in this section. * [Appendixes](appendix/) contain more technical references and suggestions for developers, e.g. for deciding on `@id` [in JSON-LD](appendix/jsonld#describing-entities-in-json-ld) or [extending RO-Crate terms](appendix/jsonld#extending-ro-crate). The appendix also explores how an RO-Crate can be [packaged with BagIt](appendix/implementation-notes#combining-with-other-packaging-schemes) or used as part of a repository. -Throughout the specifications you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by pages like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crate can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. +Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by pages like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. 
When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. {% include references.liquid %} From b8b5dc6c67abc0cb2e6547279c2c7d2483c0cb2e Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Tue, 3 Dec 2024 14:53:29 +0000 Subject: [PATCH 02/16] minor additional changes to introduction --- docs/_specification/1.2-DRAFT/introduction.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index 31dca741..f79bf9e6 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -29,13 +29,13 @@ This document specifies a method, known as _RO-Crate_ (Research Object Crate), o The core of RO-Crate is a machine-readable linked-data document in JSON-LD format known as an **RO-Crate Metadata Document**. RO-Crate metadata documents can, to a large extent, be created and processed just like any other JSON: knowledge of JSON-LD is not needed, unless extending RO-Crate with additional concepts or combining RO-Crate with other Linked Data technologies. -This page introduces the general RO-Crate concepts through a running example, while the normative pages in the rest of the RO-Crate specification define in more detail these and other concepts using separate examples and recommendations. +This section introduces the general RO-Crate concepts through a running example, while the normative sections in the rest of the RO-Crate specification define in more detail these and other concepts using separate examples and recommendations. ## Walkthrough: An initial RO-Crate -In the simplest form, to describe some data on disk, a file named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories. This `ro-crate-metadata.json` file is known as the _RO-Crate Metadata Document_. 
+In the simplest form, to describe some data on disk, an _RO-Crate Metadata Document_ named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories (this file is known as the _RO-Crate Metadata File_). -In the example below, a single file `data.csv` is placed with the RO-Crate Metadata Document in a directory named `crate1`: +In the example below, a single file `data.csv` is placed with the _RO-Crate Metadata Document_ in a directory named `crate1`:
Folder listing of crate1, including data.csv and ro-crate-metadata.json @@ -228,6 +228,6 @@ The rest of this specification is structured as follows: * [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific _RO-Crate profile_, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, which is also explored in this section. * [Appendixes](appendix/) contain more technical references and suggestions for developers, e.g. for deciding on `@id` [in JSON-LD](appendix/jsonld#describing-entities-in-json-ld) or [extending RO-Crate terms](appendix/jsonld#extending-ro-crate). The appendix also explores how an RO-Crate can be [packaged with BagIt](appendix/implementation-notes#combining-with-other-packaging-schemes) or used as part of a repository. -Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by pages like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. +Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by sections like [Root Data Entity](root-data-entity). 
The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. {% include references.liquid %} From 1f7f4dd43a123388741406e02f4475dd0014d7ee Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Tue, 3 Dec 2024 15:11:06 +0000 Subject: [PATCH 03/16] add Entity definition, move entity terms close together --- docs/_specification/1.2-DRAFT/terminology.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/terminology.md b/docs/_specification/1.2-DRAFT/terminology.md index b7c6b740..0a63aeb2 100644 --- a/docs/_specification/1.2-DRAFT/terminology.md +++ b/docs/_specification/1.2-DRAFT/terminology.md @@ -44,18 +44,20 @@ _RO-Crate Website_: Human-readable HTML pages which describe the RO-Crate (i.e. _Type_: A classification of objects or their descriptions. The type (or "class") is given as a short-hand _key_, mapped by the _RO-Crate JSON-LD Context_ to a _URI_ that has the type definition. See appendix [RO-Crate JSON-LD](appendix/jsonld). -_Data Entity_: A JSON-LD representation (in the _RO-Crate Metadata Document_) of a directory, file, or other Web resource which is considered _contained_ by the _RO-Crate_. See section [Data entities](data-entities). - _Property_: A relationship from one _entity_ to another entity, or to a _value_. The type of relationship is identified by a _URI_, mapped to a _key_ by _JSON-LD_. See appendix [RO-Crate JSON-LD](appendix/jsonld). -_Root Data Entity_: A _Data Entity_ of _type_ [Dataset], representing the RO-Crate as a whole. See section [Root Data Entity](root-data-entity). +_Entity_: A JSON-LD representation of an object, which has a _type_ and may be described using a set of _properties_. 
There are two categories of entity: _data entities_ and _contextual entities_. -_JSON-LD_: A JSON-based file format for storing _Linked Data_. This document assumes [JSON-LD 1.0]. JSON-LD use a _context_ to map from JSON keys to _URIs_. See appendix [RO-Crate JSON-LD](appendix/jsonld). +_Data Entity_: A JSON-LD representation (in the _RO-Crate Metadata Document_) of a directory, file, or other Web resource which is considered _contained_ by the _RO-Crate_. See section [Data entities](data-entities). -_JSON_: The _JavaScript Object Notation (JSON) Data Interchange Format_ as defined by [RFC 7159]; a structured text file format that can be programmatically consumed and generated in a wide range of programming languages. The main JSON structures are _objects_ (`{}`) indexed by _keys_, sequential _arrays_ (`[]`) and literal _values_ (`""`). +_Root Data Entity_: A _Data Entity_ of _type_ [Dataset], representing the RO-Crate as a whole. See section [Root Data Entity](root-data-entity). _Contextual Entity_: A JSON-LD representation of an entity associated with another _Entity_, in order to adequately describe it. For example, a [Person], [Organization] (including research projects), item of equipment ([IndividualProduct]), [license] or any other _thing_ or _event_ that forms part of the metadata for a _Data Entity_. _Properties_ of contextual entities may refer to further entities. See section [Contextual Entities](contextual-entities). +_JSON-LD_: A JSON-based file format for storing _Linked Data_. This document assumes [JSON-LD 1.0]. JSON-LD uses a _context_ to map from JSON keys to _URIs_. See appendix [RO-Crate JSON-LD](appendix/jsonld). + +_JSON_: The _JavaScript Object Notation (JSON) Data Interchange Format_ as defined by [RFC 7159]; a structured text file format that can be programmatically consumed and generated in a wide range of programming languages. 
The main JSON structures are _objects_ (`{}`) indexed by _keys_, sequential _arrays_ (`[]`) and literal _values_ (`""`). + _Linked Data_: A data structure where properties, types and resources are identified with _URIs_, which if retrieved over the Web, further describe or provide the identified property/type/resource. _URI_: A _Uniform Resource Identifier_ as defined in [RFC 3986], for example `http://example.com/path/file.html` - commonly known as _URL_. In this document the term _URI_ includes _IRI_, which also permit international Unicode characters. The URI identifies a downloadable resource (e.g. an image) or a concept (e.g. a _type_ definition). @@ -64,7 +66,7 @@ _URI Path_: The relative _path_ element of an _URI_ as defined in [RFC3986 secti _RO-Crate JSON-LD Context_: A JSON-LD [context][JSON-LD context] that provides Linked Data mapping for RO-Crate metadata to vocabularies like [Schema.org]. This mapping assigns meaning to the JSON keys, see appendix [RO-Crate JSON-LD](appendix/jsonld). -_RO-Crate JSON-LD_: JSON-LD that use the _RO-Crate JSON-LD Context_ and contain RO-Crate metadata, written as if [flattened] and then [compacted] according to the rules in JSON-LD 1.0. The _RO-Crate JSON-LD_ for an _RO-Crate_ is stored or transmitted in the _RO-Crate Metadata Document. +_RO-Crate JSON-LD_: JSON-LD that use the _RO-Crate JSON-LD Context_ and contain RO-Crate metadata, written as if [flattened] and then [compacted] according to the rules in JSON-LD 1.0. The _RO-Crate JSON-LD_ for an _RO-Crate_ is stored or transmitted in the _RO-Crate Metadata Document_. @@ -74,6 +76,6 @@ Throughout this specification, RDF terms (_properties_, _types_) are referred to Following [Schema.org] practice, `property` names start with lowercase letters and `Type` names start with uppercase letters. 
-In the _RO-Crate Metadata Document_ the RDF terms use their RO-Crate JSON-LD names as defined in the _RO-Crate JSON-LD Context_, which is available at +In the _RO-Crate Metadata Document_ the RDF terms use their RO-Crate JSON-LD names as defined in the _RO-Crate JSON-LD Context_, which is available at . {% include references.liquid %} From 63b0e62ed573bdf2d07e2c25ccc90be2c643f4d4 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 10:33:28 +0000 Subject: [PATCH 04/16] update dependencies --- Makefile | 2 +- environment.yml | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile index 8c52c361..d97c1cfe 100644 --- a/Makefile +++ b/Makefile @@ -17,7 +17,7 @@ all: dependencies release dependencies: node_modules/.bin/rochtml scripts/schema-context.py --version node_modules/.bin/rochtml --help - pip --exists-action=s install 'panflute<2' + pip --exists-action=s install 'panflute==2.1.3' pandoc --version xelatex --version diff --git a/environment.yml b/environment.yml index d32556b7..e1c07999 100644 --- a/environment.yml +++ b/environment.yml @@ -11,10 +11,10 @@ dependencies: - sed=4.8 - python=3.8 - requests=2.24 - - panflute=1.12 + - panflute=2.1.3 - rb-jekyll=4.0 # for binary deps, use "bundle exec jekyll serve" for jekyll 5 - pandoc=2.11 - - nodejs=8.12 # for makehtml + - nodejs=14.8 # for makehtml # - gxx_linux-64=12.2 # - weasyprint=v51 # - xelatex 3.14159265-2.6-0.999991 From 34cb1d05487bd0006b4c838fbc1f7e421c175f9a Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 10:35:08 +0000 Subject: [PATCH 05/16] update custom rendering of message boxes for ETT style --- Makefile | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index d97c1cfe..52b0e903 100644 --- a/Makefile +++ b/Makefile @@ -89,12 +89,17 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} pandoc --from=markdown+gfm_auto_identifiers 
--to=markdown+gfm_auto_identifiers \ docs/_specification/${RELEASE}/.metadata.md \ `grep ^nav_order: docs/_specification/${RELEASE}/*.md | sort -n -k 2 | grep -v index.md| grep -v about.md | sed s/:.*//` \ - docs/_specification/${RELEASE}/appendix/*.md docs/_includes/references.liquid docs/_specification/${RELEASE}/.references.md |\ - grep -v '{%' > release/ro-crate-${TAG}.md + docs/_specification/${RELEASE}/appendix/*.md docs/_includes/references.liquid docs/_specification/${RELEASE}/.references.md \ + > release/ro-crate-${TAG}.md # Our own rendering of Note/Warning/Tip - sed -i -E 's/\{: ?\.note ?\} \\>/**Note**:/g' release/ro-crate-${TAG}.md - sed -i -E 's/\{: ?\.warning ?\} \\>/**Warning**:/g' release/ro-crate-${TAG}.md - sed -i -E 's/\{: ?\.tip ?\} \\>/**Tip**:/g' release/ro-crate-${TAG}.md + sed -i -E 's/\{% include callout.html //g' release/ro-crate-${TAG}.md + sed -i -E 's/\" %}//g' release/ro-crate-${TAG}.md + sed -i -E 's/type=\"note\" content=\"/**Note**: /g' release/ro-crate-${TAG}.md + sed -i -E 's/type=\"warning\" content=\"/**Warning** :/g' release/ro-crate-${TAG}.md + sed -i -E 's/type=\"tip\" content=\"/**Tip**: /g' release/ro-crate-${TAG}.md + sed -i -E 's/type=\"important\" content=\"/**Important**: /g' release/ro-crate-${TAG}.md + # remove any remaining lines beginning with {% + sed -i -E 's/\{%.*//g' release/ro-crate-${TAG}.md # Skip intermediate table-of-contents sed -i -E 's/1..*\{:toc\}//g' release/ro-crate-${TAG}.md sed -i -E 's/## Table of contents//g' release/ro-crate-${TAG}.md From 72ec4f3bfa94a8a960a9cf38155d97517450c8a0 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 12:19:25 +0000 Subject: [PATCH 06/16] fix internal links --- Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Makefile b/Makefile index 52b0e903..bb1524df 100644 --- a/Makefile +++ b/Makefile @@ -105,8 +105,8 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} sed -i -E 's/## Table of 
contents//g' release/ro-crate-${TAG}.md sed -i -E 's/\{:[^}]*\}//g' release/ro-crate-${TAG}.md # Fix internal links to work in single-page - sed -i -E 's,]\(([^:)]*/)*([^:)]*)\.md\),](#\2),g' release/ro-crate-${TAG}.md - sed -i -E 's,]\([^):]*\.md#([^)]*)\),](#\1),g' release/ro-crate-${TAG}.md + sed -r -i -E 's,]\(([^:)]*/)*([^:)]*)(\.md)?\),](#\2),g' release/ro-crate-${TAG}.md + sed -r -i -E 's,]\([^):]*(\.md)?#([^)]*)\),](#\2),g' release/ro-crate-${TAG}.md release/ro-crate-${TAG}.html: dependencies release/ release/ro-crate-${TAG}.md From ef943abe6727f09bd250ffa785c2d3487c8d097d Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 12:20:11 +0000 Subject: [PATCH 07/16] custom heading ids for intro sections which share titles with other sections --- docs/_specification/1.2-DRAFT/introduction.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index f79bf9e6..ab3a673c 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -101,7 +101,7 @@ The preamble of `@context` and `@graph` are JSON-LD structures that help provide However, in the general case it should be sufficient to follow the RO-Crate JSON examples directly without deeper JSON-LD understanding. In short, an _RO-Crate Metadata Document_ contains a flat list of _entities_ as objects in the `@graph` array. These entities are cross-referenced using `@id` identifiers rather than being deeply nested. 
-### RO-Crate Metadata Descriptor +### RO-Crate Metadata Descriptor {#intro-ro-crate-metadata-descriptor} The first JSON-LD _entity_ in our example above has the `@id` `ro-crate-metadata.json`: @@ -130,7 +130,7 @@ We can visualise how the above entity references the **RO-Crate Root** as: By convention, in RO-Crate the `@id` value of `./` means that this document describes the directory of content in which the _RO-Crate Metadata Document_ is located, as in the example above. This reference from `ro-crate-metadata.json` is therefore marking the `crate1` directory as being the _RO-Crate Root_. The entity whose `@id` is the _RO-Crate Root_ is called the _Root Data Entity_. -{% include callout.html type="note" content="This example is a directory-based RO-Crate stored on disk. If the crate is being served from a Web service, such as a data repository or database where files are not organized in directories, then the `@id` might be an absolute URI instead of `\"./\"` -- see section [Root Data Entity](root-data-entity) for details." %} +{% include callout.html type="note" content="This example is a directory-based RO-Crate stored on disk. If the crate is being served from a Web service, such as a data repository or database where files are not organized in directories, then the `@id` might be an absolute URI instead of `./` -- see section [Root Data Entity](root-data-entity) for details." %} ### About cross-references @@ -148,7 +148,7 @@ In an _RO-Crate Metadata Document_, entities are cross-referenced using `@id` re The _Root Data Entity_ always has `@type` `Dataset`, though it may have more than one type. It has several metadata properties that describe the RO-Crate as a whole, as a collection of resources. The section on the [Root Data Entity](root-data-entity) explores further the required and recommended properties of this entity. 
-### Data entities +### Data entities {#intro-data-entities} A main type of resources collected are _data_ -- simplifying, we can consider data as any kind of file that can be opened in other programs. These are aggregated by the _Root Data Entity_ with the `hasPart` property. In this example we have an array with a single value, a reference to the entity describing the file `data.csv`. @@ -175,7 +175,7 @@ If we now follow the `@id` reference for the corresponding _data entity_ JSON bl For more information on describing files and directories, including their recommended and required attributes, see the section on [data entities](data-entities). -### Contextual entities +### Contextual entities {#intro-contextual-entities} Moving back to the RO-Crate _Root Data Entity_ (with `@id` `./`), the publisher of this Dataset should be indicated using the property `publisher` and using a URI to identify the publishing `Organization`, linking to what is known as a _Contextual Entity_ that provides some information about the Organization such as its name and web address. From 14e806a7580b0665dd45051b6e64c36c5281d09d Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 13:49:44 +0000 Subject: [PATCH 08/16] more internal link fixes in makefile --- Makefile | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index bb1524df..6ec023b2 100644 --- a/Makefile +++ b/Makefile @@ -105,10 +105,14 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} sed -i -E 's/## Table of contents//g' release/ro-crate-${TAG}.md sed -i -E 's/\{:[^}]*\}//g' release/ro-crate-${TAG}.md # Fix internal links to work in single-page + # first change links to non-spec website pages, e.g. 
../../tools -> https://www.researchobject.org/ro-crate/tools + sed -r -i -E 's,]\(\\.\./\.\./([^:)]*)\),](https://www.researchobject.org/ro-crate/\1),g' release/ro-crate-${TAG}.md + sed -r -i -E 's,]\(([^:/)]*\.(json|html))\),](https://www.researchobject.org/ro-crate/specification/${RELEASE}/\1),g' release/ro-crate-${TAG}.md + # change links without a #, e.g. appendix/jsonld to #jsonld sed -r -i -E 's,]\(([^:)]*/)*([^:)]*)(\.md)?\),](#\2),g' release/ro-crate-${TAG}.md + # change links with a #, e.g. contextual-entities#people to #people sed -r -i -E 's,]\([^):]*(\.md)?#([^)]*)\),](#\2),g' release/ro-crate-${TAG}.md - release/ro-crate-${TAG}.html: dependencies release/ release/ro-crate-${TAG}.md egrep -v '^{:(\.no_)?toc}' release/ro-crate-${TAG}.md | \ pandoc --standalone --number-sections --toc --section-divs \ From 13cb7e2694edd7075c7859378123cf91c1e499c7 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 14:17:12 +0000 Subject: [PATCH 09/16] improve internal section links, put appendix in order in makefile --- Makefile | 5 +++-- docs/_specification/1.2-DRAFT/appendix/changelog.md | 2 +- .../1.2-DRAFT/appendix/implementation-notes.md | 2 +- docs/_specification/1.2-DRAFT/appendix/index.md | 8 +++++++- docs/_specification/1.2-DRAFT/appendix/jsonld.md | 4 +--- docs/_specification/1.2-DRAFT/appendix/relative-uris.md | 4 +--- docs/_specification/1.2-DRAFT/contextual-entities.md | 4 +--- docs/_specification/1.2-DRAFT/crate-focus.md | 3 +-- docs/_specification/1.2-DRAFT/data-entities.md | 2 +- docs/_specification/1.2-DRAFT/introduction.md | 2 +- docs/_specification/1.2-DRAFT/metadata.md | 4 +--- docs/_specification/1.2-DRAFT/profiles.md | 2 +- docs/_specification/1.2-DRAFT/provenance.md | 4 +--- docs/_specification/1.2-DRAFT/root-data-entity.md | 4 ++-- docs/_specification/1.2-DRAFT/structure.md | 4 +--- docs/_specification/1.2-DRAFT/terminology.md | 2 +- docs/_specification/1.2-DRAFT/workflows.md | 6 ++---- 17 files changed, 27 insertions(+), 35 
deletions(-) diff --git a/Makefile b/Makefile index 6ec023b2..5b46dc63 100644 --- a/Makefile +++ b/Makefile @@ -89,7 +89,8 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} pandoc --from=markdown+gfm_auto_identifiers --to=markdown+gfm_auto_identifiers \ docs/_specification/${RELEASE}/.metadata.md \ `grep ^nav_order: docs/_specification/${RELEASE}/*.md | sort -n -k 2 | grep -v index.md| grep -v about.md | sed s/:.*//` \ - docs/_specification/${RELEASE}/appendix/*.md docs/_includes/references.liquid docs/_specification/${RELEASE}/.references.md \ + `grep ^nav_order: docs/_specification/${RELEASE}/appendix/*.md | sort -n -k 2 | sed s/:.*//` \ + docs/_includes/references.liquid docs/_specification/${RELEASE}/.references.md \ > release/ro-crate-${TAG}.md # Our own rendering of Note/Warning/Tip sed -i -E 's/\{% include callout.html //g' release/ro-crate-${TAG}.md @@ -106,7 +107,7 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} sed -i -E 's/\{:[^}]*\}//g' release/ro-crate-${TAG}.md # Fix internal links to work in single-page # first change links to non-spec website pages, e.g. ../../tools -> https://www.researchobject.org/ro-crate/tools - sed -r -i -E 's,]\(\\.\./\.\./([^:)]*)\),](https://www.researchobject.org/ro-crate/\1),g' release/ro-crate-${TAG}.md + sed -r -i -E 's,]\(\.\./\.\./([^:)]*)\),](https://www.researchobject.org/ro-crate/\1),g' release/ro-crate-${TAG}.md sed -r -i -E 's,]\(([^:/)]*\.(json|html))\),](https://www.researchobject.org/ro-crate/specification/${RELEASE}/\1),g' release/ro-crate-${TAG}.md # change links without a #, e.g. 
appendix/jsonld to #jsonld sed -r -i -E 's,]\(([^:)]*/)*([^:)]*)(\.md)?\),](#\2),g' release/ro-crate-${TAG}.md diff --git a/docs/_specification/1.2-DRAFT/appendix/changelog.md b/docs/_specification/1.2-DRAFT/appendix/changelog.md index 8d5f99ba..67d46f54 100644 --- a/docs/_specification/1.2-DRAFT/appendix/changelog.md +++ b/docs/_specification/1.2-DRAFT/appendix/changelog.md @@ -26,7 +26,7 @@ excerpt: List of changes in releases of this specifications --> -# APPENDIX: Changelog +# APPENDIX: Changelog {#changelog} * RO-Crate 1.2.0 * Updated the Bioschemas namespace for properties from `https://bioschemas.org/ComputationalWorkflow#` to `https://bioschemas.org/properties/`. This change affects only the `input` and `output` properties in the [JSON-LD context](../ro-crate-metadata.json). diff --git a/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md b/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md index 8498afb9..699f8b8e 100644 --- a/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md +++ b/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md @@ -24,7 +24,7 @@ nav_order: 21 limitations under the License. --> -# APPENDIX: Implementation notes +# APPENDIX: Implementation notes {#implementation-notes} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/appendix/index.md b/docs/_specification/1.2-DRAFT/appendix/index.md index 1cd5e820..45c3a777 100644 --- a/docs/_specification/1.2-DRAFT/appendix/index.md +++ b/docs/_specification/1.2-DRAFT/appendix/index.md @@ -24,4 +24,10 @@ has_children: true limitations under the License. 
--> -# Appendixes +# Appendixes {#appendix} + +## Contents +* [Changelog](changelog) +* [Handling relative URI references](relative-uris) +* [Implementation Notes](implementation-notes) +* [RO-Crate JSON-LD](jsonld) diff --git a/docs/_specification/1.2-DRAFT/appendix/jsonld.md b/docs/_specification/1.2-DRAFT/appendix/jsonld.md index cea91e54..91fda08d 100644 --- a/docs/_specification/1.2-DRAFT/appendix/jsonld.md +++ b/docs/_specification/1.2-DRAFT/appendix/jsonld.md @@ -24,9 +24,7 @@ nav_order: 22 limitations under the License. --> -
- -# APPENDIX: RO-Crate JSON-LD +# APPENDIX: RO-Crate JSON-LD {#jsonld} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/appendix/relative-uris.md b/docs/_specification/1.2-DRAFT/appendix/relative-uris.md index 0441f4b6..04797677 100644 --- a/docs/_specification/1.2-DRAFT/appendix/relative-uris.md +++ b/docs/_specification/1.2-DRAFT/appendix/relative-uris.md @@ -24,9 +24,7 @@ nav_order: 23 limitations under the License. --> -
- -# APPENDIX: Handling relative URI references +# APPENDIX: Handling relative URI references {#relative-uris} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/contextual-entities.md b/docs/_specification/1.2-DRAFT/contextual-entities.md index 462063c4..d12e2b7c 100644 --- a/docs/_specification/1.2-DRAFT/contextual-entities.md +++ b/docs/_specification/1.2-DRAFT/contextual-entities.md @@ -30,9 +30,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -
- -# Representing Contextual Entities +# Representing Contextual Entities {#contextual-entities} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/crate-focus.md b/docs/_specification/1.2-DRAFT/crate-focus.md index 0a5386d3..876b358f 100644 --- a/docs/_specification/1.2-DRAFT/crate-focus.md +++ b/docs/_specification/1.2-DRAFT/crate-focus.md @@ -26,8 +26,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -# The focus of an RO-Crate -
+# The focus of an RO-Crate {#crate-focus} In addition to simple data packaging, Crates may have a "main" entry point or topic (referenced with a singleton `mainEntity` property), or function as a bundle of one or more Contextual Entities referenced via the `mentions` property. diff --git a/docs/_specification/1.2-DRAFT/data-entities.md b/docs/_specification/1.2-DRAFT/data-entities.md index f89277c5..531a6a36 100644 --- a/docs/_specification/1.2-DRAFT/data-entities.md +++ b/docs/_specification/1.2-DRAFT/data-entities.md @@ -23,7 +23,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -# Data Entities +# Data Entities {#data-entities} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index ab3a673c..43551965 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -226,7 +226,7 @@ The rest of this specification is structured as follows: * Subsection [Digital Library and Repository content](provenance#digital-library-and-repository-content) details how records in an existing repository (which may reference files, but also physical objects) can be described and published using RO-Crate. * [Workflows and Scripts](workflows) explains how computional software and code can be added to an RO-Crate, possibly as part of explaining provenance, but also for providing potential usage and further processing of the data. * [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific _RO-Crate profile_, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, which is also explored in this section. 
-* [Appendixes](appendix/) contain more technical references and suggestions for developers, e.g. for deciding on `@id` [in JSON-LD](appendix/jsonld#describing-entities-in-json-ld) or [extending RO-Crate terms](appendix/jsonld#extending-ro-crate). The appendix also explores how an RO-Crate can be [packaged with BagIt](appendix/implementation-notes#combining-with-other-packaging-schemes) or used as part of a repository. +* [Appendixes](appendix) contain more technical references and suggestions for developers, e.g. for deciding on `@id` [in JSON-LD](appendix/jsonld#describing-entities-in-json-ld) or [extending RO-Crate terms](appendix/jsonld#extending-ro-crate). The appendix also explores how an RO-Crate can be [packaged with BagIt](appendix/implementation-notes#combining-with-other-packaging-schemes) or used as part of a repository. Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by sections like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. diff --git a/docs/_specification/1.2-DRAFT/metadata.md b/docs/_specification/1.2-DRAFT/metadata.md index c869f143..db2d1485 100644 --- a/docs/_specification/1.2-DRAFT/metadata.md +++ b/docs/_specification/1.2-DRAFT/metadata.md @@ -29,9 +29,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -
- -# RO-Crate Metadata +# RO-Crate Metadata {#metadata} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/profiles.md b/docs/_specification/1.2-DRAFT/profiles.md index 337f63c8..4b762589 100644 --- a/docs/_specification/1.2-DRAFT/profiles.md +++ b/docs/_specification/1.2-DRAFT/profiles.md @@ -25,7 +25,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -# RO-Crate profiles +# RO-Crate profiles {#profiles} While RO-Crates can be considered general-purpose containers of arbitrary data and open-ended metadata, in practical use within a particular domain, application or framework, it will be beneficial to further constrain RO-Crate to a specific **profile**: a set of conventions, types and properties that one minimally can require and expect to be present in that subset of RO-Crates. diff --git a/docs/_specification/1.2-DRAFT/provenance.md b/docs/_specification/1.2-DRAFT/provenance.md index 06010d95..6f1b6ec1 100644 --- a/docs/_specification/1.2-DRAFT/provenance.md +++ b/docs/_specification/1.2-DRAFT/provenance.md @@ -23,9 +23,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -
- -# Detailing provenance of entities +# Detailing provenance of entities {#provenance} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/root-data-entity.md b/docs/_specification/1.2-DRAFT/root-data-entity.md index 8cfb506a..2be9ea20 100644 --- a/docs/_specification/1.2-DRAFT/root-data-entity.md +++ b/docs/_specification/1.2-DRAFT/root-data-entity.md @@ -23,7 +23,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -# Root Data Entity +# Root Data Entity {#root-data-entity} {: .no_toc } ## Table of contents @@ -118,7 +118,7 @@ In this case we can look for the root entity by executing a heuristic algorithm similar to the one shown above, with the only difference that step 2 must be replaced by: -2. .. if the `@id`'s last path segment is `ro-crate-metadata.json` +2\. .. if the `@id`'s last path segment is `ro-crate-metadata.json` It is possible to build an RO-Crate having more than one entity whose `@id` has `ro-crate-metadata.json` as its last path segment. For instance, the crate diff --git a/docs/_specification/1.2-DRAFT/structure.md b/docs/_specification/1.2-DRAFT/structure.md index 6845e2e7..1e87c2dd 100644 --- a/docs/_specification/1.2-DRAFT/structure.md +++ b/docs/_specification/1.2-DRAFT/structure.md @@ -23,9 +23,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -
- -# RO-Crate Structure +# RO-Crate Structure {#structure} {: .no_toc } ## Table of contents diff --git a/docs/_specification/1.2-DRAFT/terminology.md b/docs/_specification/1.2-DRAFT/terminology.md index 0a63aeb2..445a7704 100644 --- a/docs/_specification/1.2-DRAFT/terminology.md +++ b/docs/_specification/1.2-DRAFT/terminology.md @@ -23,7 +23,7 @@ parent: RO-Crate 1.2-DRAFT limitations under the License. --> -# Terminology +# Terminology {#terminology} _RO-Crate_: A dataset, which is described in a JSON-LD _RO-Crate Metadata Document_. diff --git a/docs/_specification/1.2-DRAFT/workflows.md b/docs/_specification/1.2-DRAFT/workflows.md index 00c515a2..918ca809 100644 --- a/docs/_specification/1.2-DRAFT/workflows.md +++ b/docs/_specification/1.2-DRAFT/workflows.md @@ -24,12 +24,10 @@ parent: RO-Crate 1.2-DRAFT distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and - limitations under the License. + limitations under the License. --> -
- -# Workflows and Scripts +# Workflows and Scripts {#workflows} {: .no_toc } ## Table of contents From 2babc8de2fc6f13005314831757a01c06b30408c Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 15:08:44 +0000 Subject: [PATCH 10/16] fix links in appendix and to examples/ --- Makefile | 4 ++-- .../1.2-DRAFT/appendix/changelog.md | 24 +++++++++---------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/Makefile b/Makefile index 5b46dc63..277cae7c 100644 --- a/Makefile +++ b/Makefile @@ -106,9 +106,9 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} sed -i -E 's/## Table of contents//g' release/ro-crate-${TAG}.md sed -i -E 's/\{:[^}]*\}//g' release/ro-crate-${TAG}.md # Fix internal links to work in single-page - # first change links to non-spec website pages, e.g. ../../tools -> https://www.researchobject.org/ro-crate/tools + # first change links to website pages outside the spec, e.g. ../../tools -> https://www.researchobject.org/ro-crate/tools sed -r -i -E 's,]\(\.\./\.\./([^:)]*)\),](https://www.researchobject.org/ro-crate/\1),g' release/ro-crate-${TAG}.md - sed -r -i -E 's,]\(([^:/)]*\.(json|html))\),](https://www.researchobject.org/ro-crate/specification/${RELEASE}/\1),g' release/ro-crate-${TAG}.md + sed -r -i -E 's,]\((\.\./)?([^:)]*\.(json|html))\),](https://www.researchobject.org/ro-crate/specification/${RELEASE}/\2),g' release/ro-crate-${TAG}.md # change links without a #, e.g. appendix/jsonld to #jsonld sed -r -i -E 's,]\(([^:)]*/)*([^:)]*)(\.md)?\),](#\2),g' release/ro-crate-${TAG}.md # change links with a #, e.g. 
contextual-entities#people to #people diff --git a/docs/_specification/1.2-DRAFT/appendix/changelog.md b/docs/_specification/1.2-DRAFT/appendix/changelog.md index 67d46f54..e95833a3 100644 --- a/docs/_specification/1.2-DRAFT/appendix/changelog.md +++ b/docs/_specification/1.2-DRAFT/appendix/changelog.md @@ -30,28 +30,28 @@ excerpt: List of changes in releases of this specifications * RO-Crate 1.2.0 * Updated the Bioschemas namespace for properties from `https://bioschemas.org/ComputationalWorkflow#` to `https://bioschemas.org/properties/`. This change affects only the `input` and `output` properties in the [JSON-LD context](../ro-crate-metadata.json). - * **Change**: Replaced [name-based algorithm for finding root](../root-data-entity.html#finding-the-root-data-entity) [#198](https://github.com/ResearchObject/ro-crate/issues/198) - * Updated [algorithm to always use string filter to find root](../appendix/relative-uris.html#finding-ro-crate-root-in-rdf-triple-stores) [#189](https://github.com/ResearchObject/ro-crate/issues/189) - (see [algorithm](../root-data-entity.html#finding-the-root-data-entity)) - * **Change**: [Files on the web](../data-entities.html#embedded-data-entities-that-are-also-on-the-web) should now use `contentUrl` for direct download [#259](https://github.com/ResearchObject/ro-crate/issues/259) + * **Change**: Replaced [name-based algorithm for finding root](../root-data-entity#finding-the-root-data-entity) [#198](https://github.com/ResearchObject/ro-crate/issues/198) + * Updated [algorithm to always use string filter to find root](../appendix/relative-uris#finding-ro-crate-root-in-rdf-triple-stores) [#189](https://github.com/ResearchObject/ro-crate/issues/189) + (see [algorithm](../root-data-entity#finding-the-root-data-entity)) + * **Change**: [Files on the web](../data-entities#embedded-data-entities-that-are-also-on-the-web) should now use `contentUrl` for direct download [#259](https://github.com/ResearchObject/ro-crate/issues/259) * 
[Clarify entity terminology](../contextual-entities#contextual-vs-data-entities) [#204](https://github.com/ResearchObject/ro-crate/issues/204) * Update [JSON-LD context](../ro-crate-metadata.json) to [schema.org 22.0](https://github.com/schemaorg/schemaorg/tree/main/data/releases/22.0/). Note that upstream adds >230 terms, and removed terms `AuthenticContent` `MissingContext`, `constrainingProperty` (now `constraintProperty`), `measuredValue`, `observedNode` . [#263](https://github.com/ResearchObject/ro-crate/issues/263) [#274](https://github.com/ResearchObject/ro-crate/issues/274) * Remove custom mapping of `funding` for Bioschemas (now officially ) * Updated for [ComputationalWorkflow 1.0 profile](../workflows#complying-with-bioschemas-computational-workflow-profile) [#185](https://github.com/ResearchObject/ro-crate/issues/185) - * Clarified [Directories on the web format](../data-entities.html#directories-on-the-web-dataset-distributions) is a media type, not extension [#205](https://github.com/ResearchObject/ro-crate/issues/235) - * **New**: Extended [introduction](../introduction.html) with a running example [#227](https://github.com/ResearchObject/ro-crate/issues/227) [#215](https://github.com/ResearchObject/ro-crate/issues/215) [#219](https://github.com/ResearchObject/ro-crate/issues/219) -- see also [RO-Crate tutorials](https://www.researchobject.org/ro-crate/tutorials.html). + * Clarified [Directories on the web format](../data-entities#directories-on-the-web-dataset-distributions) is a media type, not extension [#205](https://github.com/ResearchObject/ro-crate/issues/235) + * **New**: Extended [introduction](../introduction) with a running example [#227](https://github.com/ResearchObject/ro-crate/issues/227) [#215](https://github.com/ResearchObject/ro-crate/issues/215) [#219](https://github.com/ResearchObject/ro-crate/issues/219) -- see also [RO-Crate tutorials](https://www.researchobject.org/ro-crate/tutorials.html). 
* **New**: Added section [Profiles](../profiles) and the concept _Profile Crate_ [#250](https://github.com/ResearchObject/ro-crate/issues/250) [#251](https://github.com/ResearchObject/ro-crate/issues/251) [#255](https://github.com/ResearchObject/ro-crate/issues/255) [#256](https://github.com/ResearchObject/ro-crate/issues/256) - * Define [terms for Profiles](../metadata.html#additional-metadata-standards) [#248](https://github.com/ResearchObject/ro-crate/issues/248) - * Added subsection for [grouping extensions in a profile](../appendix/jsonld.html#grouping-extensions-as-an-ro-crate-profile) [#233](https://github.com/ResearchObject/ro-crate/issues/233) [#233](https://github.com/ResearchObject/ro-crate/issues/252) - * [Allow more types on root](../root-data-entity.html#ro-crate-metadata-descriptor) [#182](https://github.com/ResearchObject/ro-crate/issues/182) [#223](https://github.com/ResearchObject/ro-crate/issues/223) - * Added subsection on [root data entity identifier](../root-data-entity.html#root-data-entity-identifier) [#183](https://github.com/ResearchObject/ro-crate/issues/183) + * Define [terms for Profiles](../metadata#additional-metadata-standards) [#248](https://github.com/ResearchObject/ro-crate/issues/248) + * Added subsection for [grouping extensions in a profile](../appendix/jsonld#grouping-extensions-as-an-ro-crate-profile) [#233](https://github.com/ResearchObject/ro-crate/issues/233) [#233](https://github.com/ResearchObject/ro-crate/issues/252) + * [Allow more types on root](../root-data-entity#ro-crate-metadata-descriptor) [#182](https://github.com/ResearchObject/ro-crate/issues/182) [#223](https://github.com/ResearchObject/ro-crate/issues/223) + * Added subsection on [root data entity identifier](../root-data-entity#root-data-entity-identifier) [#183](https://github.com/ResearchObject/ro-crate/issues/183) * **New**: Introduces distinction of [Attached/Detached RO-Crate](../structure) [#248](https://github.com/ResearchObject/ro-crate/issues/248) 
[#189](https://github.com/ResearchObject/ro-crate/issues/189) [#183](https://github.com/ResearchObject/ro-crate/issues/183) * Included Attached/Detached RO-Crate [in terminology](../terminology) [#248](https://github.com/ResearchObject/ro-crate/issues/248) * Rephrased description of [payload files](../structure#payload-files-and-directories-attached-ro-crates) [#183](https://github.com/ResearchObject/ro-crate/issues/183) [#189](https://github.com/ResearchObject/ro-crate/issues/189) * Describes [ro-crate-preview.html as entity](../structure#ro-crate-website-ro-crate-previewhtml-and-ro-crate-preview_files) [#106](https://github.com/ResearchObject/ro-crate/issues/106) [#210](https://github.com/ResearchObject/ro-crate/issues/210) * Added usage [of DefinedTerm; rdfs:label and comment optional; replaced example](jsonld#adding-new-or-ad-hoc-vocabulary-terms) [#232](https://github.com/ResearchObject/ro-crate/issues/232) [#208](https://github.com/ResearchObject/ro-crate/issues/208) [#106](https://github.com/ResearchObject/ro-crate/issues/106) - * Added section on [converting Attached/Detached RO-Crates](../appendix/relative-uris.html#converting-from-attached-to-detached-ro-crate) [#189](https://github.com/ResearchObject/ro-crate/issues/189) - * Added [Common principles for RO-Crate entities](../metadata.html#common-principles-for-ro-crate-entities) [#225](https://github.com/ResearchObject/ro-crate/issues/225) [#260](https://github.com/ResearchObject/ro-crate/issues/260) + * Added section on [converting Attached/Detached RO-Crates](../appendix/relative-uris#converting-from-attached-to-detached-ro-crate) [#189](https://github.com/ResearchObject/ro-crate/issues/189) + * Added [Common principles for RO-Crate entities](../metadata#common-principles-for-ro-crate-entities) [#225](https://github.com/ResearchObject/ro-crate/issues/225) [#260](https://github.com/ResearchObject/ro-crate/issues/260) * [RO-Crate 1.1.2](https://github.com/ResearchObject/ro-crate/releases/tag/1.1.2) * 
Typo fixes in [data entity section](../data-entities) [#177](https://github.com/ResearchObject/ro-crate/issues/177), [workflow section](../workflows) [#180](https://github.com/ResearchObject/ro-crate/issues/180), [metadata section](../metadata) [#181](https://github.com/ResearchObject/ro-crate/issues/181) * Correct namespace for `rdfs:comment` on [ad-hoc terms](jsonld#add-local-definitions-of-ad-hoc-terms) [#164](https://github.com/ResearchObject/ro-crate/issues/164) From 23fb4e549273bfa97c3f3f77a72fa5d098e9a651 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 15:48:23 +0000 Subject: [PATCH 11/16] change {: .tip } style callouts to ETT style --- .../appendix/implementation-notes.md | 8 +++---- .../1.2-DRAFT/appendix/jsonld.md | 15 ++++-------- .../1.2-DRAFT/appendix/relative-uris.md | 24 +++++++------------ .../1.2-DRAFT/contextual-entities.md | 6 ++--- .../_specification/1.2-DRAFT/data-entities.md | 15 ++++-------- docs/_specification/1.2-DRAFT/metadata.md | 20 ++++++---------- docs/_specification/1.2-DRAFT/profiles.md | 16 +++++-------- docs/_specification/1.2-DRAFT/provenance.md | 9 +++---- .../1.2-DRAFT/root-data-entity.md | 24 +++++++------------ docs/_specification/1.2-DRAFT/structure.md | 12 ++++------ docs/_specification/1.2-DRAFT/workflows.md | 3 +-- 11 files changed, 53 insertions(+), 99 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md b/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md index 699f8b8e..630016c7 100644 --- a/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md +++ b/docs/_specification/1.2-DRAFT/appendix/implementation-notes.md @@ -96,8 +96,7 @@ e1105ed0…5e13 data/chipseq_20200910.json 37fd3a02…bb95 data/results/pipeline_info/design_reads.csv ``` -{: .note } -> The SHA-512 checksums have been shortened in the above example. +{% include callout.html type="note" content="The SHA-512 checksums have been shortened in the above example." 
%} Creating the manifest file without using BagIt tools/libraries can be done using the equivalent of: @@ -128,10 +127,9 @@ b0556450…8802 bag-info.txt 000b27e3…c52e manifest-sha512.txt ``` -{: .warning } -The BagIt manifest is intended to detect "bit rot" and accidental damage, +{% include callout.html type="warning" content='The BagIt manifest is intended to detect "bit rot" and accidental damage, it does not provide proof the RO-Crate has not been deliberately -tampered with, as a malicious actor can also update the checksums. +tampered with, as a malicious actor can also update the checksums.' %} Guarding against such scenarious would require additional cryptographic measures, e.g. diff --git a/docs/_specification/1.2-DRAFT/appendix/jsonld.md b/docs/_specification/1.2-DRAFT/appendix/jsonld.md index 91fda08d..5b16b46b 100644 --- a/docs/_specification/1.2-DRAFT/appendix/jsonld.md +++ b/docs/_specification/1.2-DRAFT/appendix/jsonld.md @@ -173,8 +173,7 @@ To check which RO-Crate version is used (in terms of properties and types expect RO-Crate consumers SHOULD NOT do the opposite substitution from an embedded context, but MAY use the [JSON-LD flattening] algorithm with _compaction_ to a referenced _RO-Crate JSON-LD context_ (see also notes on [handling relative URI references](relative-uris) below). -{: .tip } -> The [JSON-LD flattening & compaction](https://www.w3.org/TR/json-ld-api/#flattening-algorithm) algorithms can be used to rewrite to a different `@context`, e.g. to `https://schema.org/docs/jsonldcontext.jsonld` or a different version of the _RO-Crate JSON-LD Context_. +{% include callout.html type="tip" content="The [JSON-LD flattening & compaction](https://www.w3.org/TR/json-ld-api/#flattening-algorithm) algorithms can be used to rewrite to a different `@context`, e.g. to `https://schema.org/docs/jsonldcontext.jsonld` or a different version of the _RO-Crate JSON-LD Context_." 
%} ## RO-Crate JSON-LD Media type @@ -275,8 +274,7 @@ For projects that have their own web-presence, URIs MAY be defined there and SHO ``` -{: .tip } -> Ensure you have a consistent use of `http` or `https` (preferring https) as well as consistent path `/vocab` vs `/vocab/` vs `/vocab/index.html` (preferring the shortest that is also visible in browser). +{% include callout.html type="tip" content="Ensure you have a consistent use of `http` or `https` (preferring https) as well as consistent path `/vocab` vs `/vocab/` vs `/vocab/index.html` (preferring the shortest that is also visible in browser)." %} For ad hoc terms where the crate author does not have the resources to create and maintain an HTML page, authors may use the RO-Crate public namespace (`https://w3id.org/ro/terms/`) to reserve their terms. For example, an ad-hoc URI MAY be used in the form `https://w3id.org/ro/terms/some-project#myProperty` where `some-project` is acting as a _namespace_ for one or more related terms like `education`. Ad-hoc namespaces under `https://w3id.org/ro/terms/` are available on first-come-first-serve basis; to avoid clashes, namespaces SHOULD be registered by [submitting terms and definitions][ro-terms] to the RO-Crate terms project. @@ -312,8 +310,7 @@ Following the conventions used by Schema.org, ad-hoc terms SHOULD also include d ``` -{: .tip } -> It is **not** a requirement to use English for the terms, labels or comments. +{% include callout.html type="tip" content="It is **not** a requirement to use English for the terms, labels or comments." %} More information about the relationship of this term to other terms MAY be provided using [domainIncludes], [rangeIncludes], [rdfs:subClassOf], [rdfs:subPropertyOf], [sameAs] following the conventions used in the [Schema.org schema] -- *"Schema.org style schemas"*. 
For compatibility with RDFS/OWL tools, `name` and `description` SHOULD be duplicated using the RDFS properties `rdfs:label` and `rdfs:comment`: @@ -335,11 +332,9 @@ More information about the relationship of this term to other terms MAY be provi } ``` -{: .note } -> Schema.org also provides the types [Class] and [Property]. These MAY be used as an additional `@type` corresponding to `rdfs:Class` and `rdf:Property`, but as these are (for some reason) not used in Schema.org style schemas, they are also not required by RO-Crate. Likewise, an ontology defining such terms externally may be declaring properties there with more specific types like `owl:ObjectProperty` which do not need to be reflected in the RO-Crate reference. +{% include callout.html type="note" content="Schema.org also provides the types [Class] and [Property]. These MAY be used as an additional `@type` corresponding to `rdfs:Class` and `rdf:Property`, but as these are (for some reason) not used in Schema.org style schemas, they are also not required by RO-Crate. Likewise, an ontology defining such terms externally may be declaring properties there with more specific types like `owl:ObjectProperty` which do not need to be reflected in the RO-Crate reference." %} -{: .tip } -> For compatibility with the official schema.org JSON-LD context, make sure any referenced `@id` to schema.org terms starts with `http://` rather than `https://` as shown in the browser. +{% include callout.html type="tip" content="For compatibility with the official schema.org JSON-LD context, make sure any referenced `@id` to schema.org terms starts with `http://` rather than `https://` as shown in the browser." 
%} ## Grouping extensions as an RO-Crate profile diff --git a/docs/_specification/1.2-DRAFT/appendix/relative-uris.md b/docs/_specification/1.2-DRAFT/appendix/relative-uris.md index 04797677..266f86fb 100644 --- a/docs/_specification/1.2-DRAFT/appendix/relative-uris.md +++ b/docs/_specification/1.2-DRAFT/appendix/relative-uris.md @@ -159,8 +159,7 @@ To additionally save Web-based Data entities to become part of the Detached Crat As this procedure can be error-prone (e.g. a Web-based entity may not be accessible or may require authentication), the implementation should consider the new Attached Crate as a _fork_ and update `identifier` and `isDefinedBy` as specified above. -{: .tip } -> If you are archiving an [attached RO-Crate](../structure#attached-ro-crate) that is already on the Web, then first [establish the absolute URI](#establishing-absolute-uri-for-ro-crate-root) for the root, and retrieve all [payload](../structure#payload-files-and-directories-attached-ro-crates) files that are considered URI path-wise to be part the RO-Crate Root, creating corresponding local paths. In this scenario the above algorithm can be simplified and the rewriting of identifiers can be avoided if they are already relative URIs. +{% include callout.html type="tip" content="If you are archiving an [attached RO-Crate](../structure#attached-ro-crate) that is already on the Web, then first [establish the absolute URI](#establishing-absolute-uri-for-ro-crate-root) for the root, and retrieve all [payload](../structure#payload-files-and-directories-attached-ro-crates) files that are considered URI path-wise to be part the RO-Crate Root, creating corresponding local paths. In this scenario the above algorithm can be simplified and the rewriting of identifiers can be avoided if they are already relative URIs. 
" %} ## Handling relative URI references when using JSON-LD/RDF tools @@ -256,8 +255,7 @@ Results in a valid _RO-Crate JSON-LD_ (actual order in `@graph` may differ): } ``` -{: .note } -> The saved _RO-Crate JSON-LD_ SHOULD NOT include `{@base: null}` in its `@context`. +{% include callout.html type="note" content="The saved _RO-Crate JSON-LD_ SHOULD NOT include `{@base: null}` in its `@context`." %} ## Expanding/parsing JSON-LD keeping relative referencing @@ -359,11 +357,9 @@ Results in a [expanded form][JSON-LD expanded form] without `@context`, using ab ] ``` -{: .note } -> `@base: null` will not relativize existing absolute URIs that happen to be contained by the _RO-Crate Root_ (see section [Relativizing absolute URIs within RO-Crate Root](#relativizing-absolute-uris-within-ro-crate-root)). +{% include callout.html type="note" content="`@base: null` will not relativize existing absolute URIs that happen to be contained by the _RO-Crate Root_ (see section [Relativizing absolute URIs within RO-Crate Root](#relativizing-absolute-uris-within-ro-crate-root))." %} -{: .tip } -> Most RDF parsers supporting JSON-LD will perform this kind of expansion before generating triples, but not all RDF stores or serializations support relative URI references. Consider using an alternative `@base` as detailed in sections below. +{% include callout.html type="tip" content="Most RDF parsers supporting JSON-LD will perform this kind of expansion before generating triples, but not all RDF stores or serializations support relative URI references. Consider using an alternative `@base` as detailed in sections below." %} ## Establishing absolute URI for RO-Crate Root @@ -371,8 +367,7 @@ When loading _RO-Crate JSON-LD_ as RDF, or combining the crate's Linked Data int [base URI][JSON-LD base URI] to resolve URI references that are relative to the _RO-Crate Root_. 
-{: .note } -> When retrieving an RO-Crate over the web, servers might have performed HTTP redirections so that the base URI is different from what was requested. It is RECOMMENDED to follow section [Establishing a Base URI of RFC3986](http://tools.ietf.org/html/rfc3986#section-5.1) before resolving relative links from the _RO-Crate Metadata File_. +{% include callout.html type="note" content="When retrieving an RO-Crate over the web, servers might have performed HTTP redirections so that the base URI is different from what was requested. It is RECOMMENDED to follow section [Establishing a Base URI of RFC3986](http://tools.ietf.org/html/rfc3986#section-5.1) before resolving relative links from the _RO-Crate Metadata File_." %} For instance, consider this HTTP redirection from a permalink (simplified): @@ -456,8 +451,7 @@ When parsing a _RO-Crate Metadata File_ into [RDF triples], for instance uploadi * Web servers hosting `ro-crate-metadata.json` may not send the [JSON-LD _Content-Type_](jsonld#ro-crate-json-ld-media-type) * If base URI is not correct it may be difficult to find the corresponding file and directory paths from an RDF query returning absolute URIs -{: .tip } -> If the RDF library can parse the _RO-Crate JSON-LD_ directly by retrieving from a `http`/`https` URI of the _RO-Crate Metadata File_ it should calculate the correct base URI as detailed in section [Establishing absolute URI for RO-Crate Root](#establishing-absolute-uri-for-ro-crate-root) and you should **not** need to override the base URI as detailed here. +{% include callout.html type="tip" content="If the RDF library can parse the _RO-Crate JSON-LD_ directly by retrieving from a `http`/`https` URI of the _RO-Crate Metadata File_ it should calculate the correct base URI as detailed in section [Establishing absolute URI for RO-Crate Root](#establishing-absolute-uri-for-ro-crate-root) and you should **not** need to override the base URI as detailed here." 
%} If a web-based URI for the _RO-Crate root_ is known, then this can be supplied as a _base URI_. Most RDF tools support a `--base` option or similar. If this is not possible, then the `@context` of the `RO-Crate JSON-LD` can be modified by ensuring the `@context` is an array that sets the desired `@base`: @@ -596,8 +590,7 @@ Parsing this as RDF will generate triples including: Here consumers can assume `/` is the _RO-Crate Root_ and generating relative URIs can safely be achieved by search-replace as the arcp URI is unique. Saving _RO-Crate JSON-LD_ from the triples can be done by using the arcp URI to [relativize absolute URIs within RO-Crate Root](#relativizing-absolute-uris-within-ro-crate-root). -{: .tip } -> **Bagit**: The arcp specification suggests how [BagIt identifiers][ARCP BagIt] can be used to calculate the base URI. See also section [Combining with other packaging schemes](implementation-notes#combining-with-other-packaging-schemes) - note that in this approach the _RO-Crate Root_ will be the payload folder `/data/` under the calculated arcp base URI. +{% include callout.html type="tip" content="**Bagit**: The arcp specification suggests how [BagIt identifiers][ARCP BagIt] can be used to calculate the base URI. See also section [Combining with other packaging schemes](implementation-notes#combining-with-other-packaging-schemes) - note that in this approach the _RO-Crate Root_ will be the payload folder `/data/` under the calculated arcp base URI." %} ## Relativizing absolute URIs within RO-Crate Root @@ -686,7 +679,6 @@ Will output _RO-Crate JSON-LD_ with relative URIs: } ``` -{: .warning } -> This method would also relativize URIs outside the _RO-Crate Root_ that are on the same host, e.g. `http://example.com/crate255/other.txt` would become `../create255/other.txt` - this can particularly be a challenge with local `file:///` URIs. 
` +{% include callout.html type="warning" content="This method would also relativize URIs outside the _RO-Crate Root_ that are on the same host, e.g. `http://example.com/crate255/other.txt` would become `../create255/other.txt` - this can particularly be a challenge with local `file:///` URIs. `" %} {% include references.liquid %} diff --git a/docs/_specification/1.2-DRAFT/contextual-entities.md b/docs/_specification/1.2-DRAFT/contextual-entities.md index d12e2b7c..b53f41fb 100644 --- a/docs/_specification/1.2-DRAFT/contextual-entities.md +++ b/docs/_specification/1.2-DRAFT/contextual-entities.md @@ -51,8 +51,7 @@ RO-Crate distinguishes between _contextual entities_ and _data entities_. Some contextual entities can also be considered data entities -- for instance the [license](#licensing-access-control-and-copyright) property refers to a [CreativeWork] that can reasonably be downloaded, however a license document is not usually considered as part of research outputs and would therefore typically not be included in [hasPart] on the [root data entity](root-data-entity). -{: .tip } -> Files in the _RO-Crate Root_ are not necessarily data entities -- the [RO-Crate Metadata Descriptor](root-data-entity#ro-crate-metadata-descriptor) is a file in the _RO-Crate Root_, but is considered a _Contextual Entity_ as it is describing the RO-Crate, rather than being part of it. On the other hand, the [Root Data Entity](root-data-entity#root-data-entity) is a data entity within its own metadata file. +{% include callout.html type="tip" content="Files in the _RO-Crate Root_ are not necessarily data entities -- the [RO-Crate Metadata Descriptor](root-data-entity#ro-crate-metadata-descriptor) is a file in the _RO-Crate Root_, but is considered a _Contextual Entity_ as it is describing the RO-Crate, rather than being part of it. On the other hand, the [Root Data Entity](root-data-entity#root-data-entity) is a data entity within its own metadata file." 
%} Likewise, some data entities may also be described as contextual entities, for instance a `File` that is also a [ScholarlyArticle]. In such cases the _contextual data entity_ MUST be described as a single JSON object in the RO-Crate Metadata JSON `@graph` and SHOULD list both relevant data and contextual types in a `@type` array. @@ -297,8 +296,7 @@ The [Root Data Entity](root-data-entity) SHOULD have a [publisher] property. Thi To associate a research project with a [Dataset], the _RO-Crate JSON-LD_ SHOULD contain an entity for the project using type [Organization], referenced by a [funder] property. The project `Organization` SHOULD in turn reference any external [funder], either by using its URL as an `@id` or via a _Contextual Entity_ describing the funder. -{: .tip } -> To make it very clear where funding is coming from, the _Root Data Entity_ SHOULD also reference funders directly, as well as via a chain of references. +{% include callout.html type="tip" content="To make it very clear where funding is coming from, the _Root Data Entity_ SHOULD also reference funders directly, as well as via a chain of references." %} ```json diff --git a/docs/_specification/1.2-DRAFT/data-entities.md b/docs/_specification/1.2-DRAFT/data-entities.md index 531a6a36..1a046535 100644 --- a/docs/_specification/1.2-DRAFT/data-entities.md +++ b/docs/_specification/1.2-DRAFT/data-entities.md @@ -228,8 +228,7 @@ Some generic file formats like `application/json` may be specialized using a _pr ``` -{: .tip } -Profiles expressed in formal languages (e.g. XML Schema for validation) can have their own `encodingFormat` and `conformsTo` to indicate their file format. +{% include callout.html type="tip" content="Profiles expressed in formal languages (e.g. XML Schema for validation) can have their own `encodingFormat` and `conformsTo` to indicate their file format." 
%} {: .note} The [Metadata Descriptor](root-data-entity#ro-crate-metadata-descriptor) `ro-crate-metadata.json` is not a data entity, but is described with `conformsTo` to an _implicit contextual entity_ for the RO-Crate specification, a profile of [JSON-LD](appendix/jsonld). RO-Crates themselves can be specialized using [Profile Crates](profiles), specified with `conformsTo` on the root data entity. @@ -416,11 +415,9 @@ If the referenced RO-Crate B has an `identifier` declared as B's [Root Data Enti } ``` -{.tip } -> The `conformsTo` generic RO-Crate profile on a `Dataset` entity MUST be version-less. The referenced crate B is NOT required to conform to the same version of the RO-Crate specification as A's RO-Crate Metadata Document. +{% include callout.html type="tip" content="The `conformsTo` generic RO-Crate profile on a `Dataset` entity MUST be version-less. The referenced crate B is NOT required to conform to the same version of the RO-Crate specification as A's RO-Crate Metadata Document." %} -{.warning } -> It is NOT RECOMMENDED to declare the generic profile `https://w3id.org/ro/crate` on a referencing crate A's own [root data entity](root-data-entity.html#direct-properties-of-the-root-data-entity), see [metadata descriptor](root-data-entity.html#ro-crate-metadata-descriptor). +{% include callout.html type="warning" content="It is NOT RECOMMENDED to declare the generic profile `https://w3id.org/ro/crate` on a referencing crate A's own [root data entity](root-data-entity.html#direct-properties-of-the-root-data-entity), see [metadata descriptor](root-data-entity.html#ro-crate-metadata-descriptor). " %} Consumers that find a reference to a `Dataset` with the generic RO-Crate profile indicated MAY attempt to resolve the persistent identifier, but SHOULD NOT assume that the `@id` directly resolves to an RO-Crate Metadata Document. See section [Retrieving an RO-Crate](#retrieving-an-ro-crate) below for the recommended algorithm. 
@@ -457,8 +454,7 @@ If a referenced RO-Crate Metadata Document is known at a given URI or path, but } ``` -{.tip } -> Counter to [file format profile](data-entities.html#file-format-profiles) recommendations, the referenced RO-Crate metadata descriptor SHOULD NOT include its own `conformsTo` declarations to `https://w3id.org/ro/crate` or reference the dataset with `about`; this is to avoid confusion with the referencing RO-Crate's own [metadata descriptor](root-data-entity#ro-crate-metadata-descriptor). +{% include callout.html type="tip" content="Counter to [file format profile](data-entities.html#file-format-profiles) recommendations, the referenced RO-Crate metadata descriptor SHOULD NOT include its own `conformsTo` declarations to `https://w3id.org/ro/crate` or reference the dataset with `about`; this is to avoid confusion with the referencing RO-Crate's own [metadata descriptor](root-data-entity#ro-crate-metadata-descriptor). " %} ##### Profiles of referenced crates @@ -481,8 +477,7 @@ If the referenced crate conforms to a given [RO-Crate profile](profiles), this M } ``` -{.note} -> The profile declaration of a referenced crate is a hint. Consumers should check `conformsTo` as declared in the retrieved RO-Crate, as it may have been updated after this RO-Crate. +{% include callout.html type="note" content="The profile declaration of a referenced crate is a hint. Consumers should check `conformsTo` as declared in the retrieved RO-Crate, as it may have been updated after this RO-Crate." %} diff --git a/docs/_specification/1.2-DRAFT/metadata.md b/docs/_specification/1.2-DRAFT/metadata.md index db2d1485..726f04c1 100644 --- a/docs/_specification/1.2-DRAFT/metadata.md +++ b/docs/_specification/1.2-DRAFT/metadata.md @@ -83,13 +83,11 @@ For all entities listed in an _RO-Crate Metadata Document_ the following princip [Schema.org] is the base metadata standard for RO-Crate. 
Schema.org was chosen because it is widely used on the World Wide Web and supported by search engines, on the assumption that discovery is likely to be maximized if search engines index the content. -{: .note } -> As far as we know there is no alternative, well-maintained linked-data schema for research data with the coverage needed for this project - i.e. a single standard for expressing all the examples presented in this specification. +{% include callout.html type="note" content="As far as we know there is no alternative, well-maintained linked-data schema for research data with the coverage needed for this project - i.e. a single standard for expressing all the examples presented in this specification." %} RO-Crate relies heavily on [Schema.org], using a constrained subset of [JSON-LD], and this specification gives opinionated recommendations on how to represent the metadata using existing [linked data] best practices. -{: .tip } -> The main principle of RO-Crate is to use a [Schema.org] whenever possible, even if its official definition may seem broad or related to every day objects. For instance, [IndividualProduct] can describe scientific equipment and instruments (see [Provenance of entities](provenance)). RO-Crate implementers are free to use additional properties and types beyond this specification (see also appendix [Extending RO-Crate(appendix/jsonld#extending-ro-crate)]). +{% include callout.html type="tip" content="The main principle of RO-Crate is to use a [Schema.org] whenever possible, even if its official definition may seem broad or related to every day objects. For instance, [IndividualProduct] can describe scientific equipment and instruments (see [Provenance of entities](provenance)). RO-Crate implementers are free to use additional properties and types beyond this specification (see also appendix [Extending RO-Crate(appendix/jsonld#extending-ro-crate)])." 
%} ### Differences from Schema.org @@ -99,8 +97,7 @@ Generally, the standard _type_ and _property_ names (_terms_) from [Schema.org] * `File` is mapped to which was chosen as a compromise as it has many of the properties that are needed to describe a generic file. Future versions of Schema.org or a research data extension may re-define `File`. * `Journal` is mapped to . -{: .warning } -> JSON-LD examples given on the [Schema.org] website may not be in _flattened_ form; any nested entities in _RO-Crate JSON-LD_ SHOULD be described as separate contextual entities in the flat `@graph` list. +{% include callout.html type="warning" content="JSON-LD examples given on the [Schema.org] website may not be in _flattened_ form; any nested entities in _RO-Crate JSON-LD_ SHOULD be described as separate contextual entities in the flat `@graph` list. " %} To simplify processing and avoid confusion with string values, the _RO-Crate JSON-LD Context_ requires URIs and entity references to be given in the form `"author": {"@id": "http://example.com/alice"}`, even where [Schema.org] for some properties otherwise permit shorter forms like `"author": "http://example.com/alice"`. @@ -116,8 +113,7 @@ RO-Crate also uses the _Portland Common Data Model_ ([PCDM] version - `hasFile` mapped to -{: .note } -> The terms `RepositoryObject` and `RepositoryCollection` are renamed to avoid collision between other vocabularies and the PCDM terms `Collection` and `Object`. The term `RepositoryFile` is renamed to avoid clash with RO-Crate's `File` mapping to . +{% include callout.html type="note" content="The terms `RepositoryObject` and `RepositoryCollection` are renamed to avoid collision between other vocabularies and the PCDM terms `Collection` and `Object`. The term `RepositoryFile` is renamed to avoid clash with RO-Crate's `File` mapping to ." 
%} RO-Crate use the [Profiles Vocabulary](https://www.w3.org/TR/2019/NOTE-dx-prof-20191218/) to describe [profiles](profiles) using these terms and definitions: @@ -148,8 +144,7 @@ To support geometry in [Places](contextual-entities#places), these terms from th * `Geometry` mapped to * `asWKT` mapped to -{: .note } -> In this specification the proposed Bioschemas terms use the temporary namespace; future releases of RO-Crate may reflect mapping to the namespace. +{% include callout.html type="note" content="In this specification the proposed Bioschemas terms use the temporary namespace; future releases of RO-Crate may reflect mapping to the namespace." %} From [CodeMeta 3.0](https://w3id.org/codemeta/3.0): @@ -164,9 +159,8 @@ From [CodeMeta 3.0](https://w3id.org/codemeta/3.0): * `referencePublication` mapped to * `softwareSuggestions` mapped to -{: .warning } -> As of 2024-05-23, the CodeMeta URIs do not resolve correctly, but are used here to match the Codemeta JSON-LD context (issue [#275](https://github.com/ResearchObject/ro-crate/issues/275)). -> The CodeMeta terms `maintainer` and `funding` are not mapped, as these are already defined by schema.org. +{% include callout.html type="warning" content="As of 2024-05-23, the CodeMeta URIs do not resolve correctly, but are used here to match the Codemeta JSON-LD context (issue [#275](https://github.com/ResearchObject/ro-crate/issues/275)). +The CodeMeta terms `maintainer` and `funding` are not mapped, as these are already defined by schema.org." 
%} ## Summary of Coverage diff --git a/docs/_specification/1.2-DRAFT/profiles.md b/docs/_specification/1.2-DRAFT/profiles.md index 4b762589..825c2f7a 100644 --- a/docs/_specification/1.2-DRAFT/profiles.md +++ b/docs/_specification/1.2-DRAFT/profiles.md @@ -155,8 +155,7 @@ The rest of the [earlier requirements](#declaring-conformance-of-an-ro-crate-pro * SHOULD list related data entities using `hasPart` (see [below](#what-is-included-in-the-profile-crate)) * MAY list profile descriptors using `hasResource` (see [below](#declaring-the-role-within-the-crate)) -{: .tip} -> The base RO-Crate specification referenced by `isProfileOf` is a Profile Crate itself, see [ro-crate-metadata.json](ro-crate-metadata.json) or [ro-crate-preview.html](ro-crate-preview.html). +{% include callout.html type="tip" content="The base RO-Crate specification referenced by `isProfileOf` is a Profile Crate itself, see [ro-crate-metadata.json](ro-crate-metadata.json) or [ro-crate-preview.html](ro-crate-preview.html). " %} ### How to retrieve a Profile Crate @@ -173,8 +172,7 @@ If an RO-Crate declares conformance to a given profile crate with `conformsTo` o For instance, if a Profile Crate adds a `DefinedTerm` entity according to the [ad-hoc definitions](appendix/jsonld#adding-new-or-ad-hoc-vocabulary-terms), the term MAY be referenced in the conforming crate without making a contextual entity there. For archival purposes it MAY however still be preferrable to copy such entities across to each conforming crate. -{: .note } -> In the conforming crate, any terms defined in the profile using `DefinedTerm`, `rdfs:Class` and `rdf:Property` MUST either be used as full URIs matching the `@id`, or mapped to these URIs from the conforming crate's JSON-LD `@context`. Note that JSON-LD only expands keys from `@id` and `@type`. 
+{% include callout.html type="note" content="In the conforming crate, any terms defined in the profile using `DefinedTerm`, `rdfs:Class` and `rdf:Property` MUST either be used as full URIs matching the `@id`, or mapped to these URIs from the conforming crate's JSON-LD `@context`. Note that JSON-LD only expands keys from `@id` and `@type`." %} It is RECOMMENDED that `@id` of such shared entities are absolute URIs on both sides to avoid resolving relative paths, and that the profile's recommended [JSON-LD Context](#json-ld-context) used by conforming crates SHOULD have a mapping to the URIs, see section [Extending RO-Crate](appendix/jsonld#extending-ro-crate). @@ -403,10 +401,9 @@ Below are known schema types in their recommended media type, with suggested ide | OWL 2 (in RDF) | `text/turtle` | | | `vocabulary` | -{: .tip } -Some of the above schema languages are based on general data structure syntaxes +{% include callout.html type="tip" content="Some of the above schema languages are based on general data structure syntaxes like `application/json` and `text/turtle`, and therefore have a -generic `encodingFormat` with a specialized `conformsTo` _URI_, which itself is declared as a `Profile`. +generic `encodingFormat` with a specialized `conformsTo` _URI_, which itself is declared as a `Profile`." %} @@ -578,11 +575,10 @@ The JSON-LD Context entity: Note that the referenced context URI does _not_ have to match the `@context` of the Profile Crate itself. -{: .tip } -The `@context` MAY be the Profile Crate's Metadata JSON-LD file itself if +{% include callout.html type="tip" content="The `@context` MAY be the Profile Crate's Metadata JSON-LD file itself if it is [resolvable](appendix/jsonld#ro-crate-json-ld-media-type) as media type `application/ld+json` over HTTP. Make sure the crate includes the -defined terms both within its `@context` and ideally as entities in its `@graph`. +defined terms both within its `@context` and ideally as entities in its `@graph`." 
%} #### Multiple profiles diff --git a/docs/_specification/1.2-DRAFT/provenance.md b/docs/_specification/1.2-DRAFT/provenance.md index 6f1b6ec1..b3633953 100644 --- a/docs/_specification/1.2-DRAFT/provenance.md +++ b/docs/_specification/1.2-DRAFT/provenance.md @@ -145,8 +145,7 @@ In the below example, an image with the `@id` of `pics/2017-06-11%2012.56.14.jpg }, ``` -{: .tip } -> If representing command lines, double escape `\\` so that JSON preserves the `\` character. +{% include callout.html type="tip" content="If representing command lines, double escape `\\` so that JSON preserves the `\` character." %} If multiple [SoftwareApplication]s have been used in composition, such as from a script or workflow, then the `CreateAction`'s [instrument] SHOULD rather reference a [SoftwareSourceCode] which can be further described as explained in the [Workflows and scripts](workflows) section. @@ -240,11 +239,9 @@ A [Contextual Entity](contextual-entities) from a repository, representing an ab Objects MAY be grouped together in [RepositoryCollection]s with [hasMember] pointing to the [RepositoryObject]. -{: .note } -> The terms `RepositoryObject` and `RepositoryCollection` are renamed in RO-Crate to avoid collision between other vocabularies and the PCDM terms `Collection` and `Object`. The term `RepositoryFile` is renamed to avoid clash with RO-Crate's `File` mapping to . +{% include callout.html type="note" content="The terms `RepositoryObject` and `RepositoryCollection` are renamed in RO-Crate to avoid collision between other vocabularies and the PCDM terms `Collection` and `Object`. The term `RepositoryFile` is renamed to avoid clash with RO-Crate's `File` mapping to ." %} -{: .warning } -> PCDM specifies that files should have only technical metadata, not descriptive metadata, which is _not_ a restriction in RO-Crate. If the RO-Crate is to be imported into a strict PCDM repository, modeling of object/file relationships will be necessary. 
+{% include callout.html type="warning" content="PCDM specifies that files should have only technical metadata, not descriptive metadata, which is _not_ a restriction in RO-Crate. If the RO-Crate is to be imported into a strict PCDM repository, modeling of object/file relationships will be necessary." %} For example, this data is exported from an [Omeka] repository: diff --git a/docs/_specification/1.2-DRAFT/root-data-entity.md b/docs/_specification/1.2-DRAFT/root-data-entity.md index 2be9ea20..616c5022 100644 --- a/docs/_specification/1.2-DRAFT/root-data-entity.md +++ b/docs/_specification/1.2-DRAFT/root-data-entity.md @@ -67,17 +67,15 @@ property referencing the _Root Data Entity_'s `@id`. } ``` -{: .note} -> Even in [Detached RO-Crates](structure#detached-ro-crate) which do not have an _RO-Crate Metadata File_ present, the identifier `ro-crate-metadata.json` MUST be used. +{% include callout.html type="note" content="Even in [Detached RO-Crates](structure#detached-ro-crate) which do not have an _RO-Crate Metadata File_ present, the identifier `ro-crate-metadata.json` MUST be used." %} The [conformsTo] of the _RO-Crate Metadata Descriptor_ SHOULD be a versioned permalink URI of the RO-Crate specification that the _RO-Crate JSON-LD_ conforms to. The URI SHOULD start with `https://w3id.org/ro/crate/`. -{: .tip } -> The `conformsTo` property MAY be an array, to additionally indicate -specializing [RO-Crate profiles](profiles). +{% include callout.html type="tip" content="The `conformsTo` property MAY be an array, to additionally indicate +specializing [RO-Crate profiles](profiles)." 
%} If the root data entity `@id` is an absolute URI, the RO-Crate is considered web-based: in this case, the metadata descriptor SHOULD also have an absolute @@ -192,13 +190,11 @@ The _Root Data Entity_ MUST have the following properties: * `datePublished`: MUST be a single string value in [ISO 8601 date format][DateTime] and SHOULD be specified to at least the precision of a day, MAY be a timestamp down to the millisecond. * `license`: SHOULD link to a _Contextual Entity_ or _Data Entity_ in the _RO-Crate Metadata Document_ with a name and description (see section on [licensing](contextual-entities#licensing-access-control-and-copyright)). MAY, if necessary be a textual description of how the RO-Crate may be used. -{: .note } -> These requirements are stricter than those published -> for [Google Dataset Search](https://developers.google.com/search/docs/data-types/dataset) which -> requires a `Dataset` to have a `name` and `description`, +{% include callout.html type="note" content="These requirements are stricter than those published +for [Google Dataset Search](https://developers.google.com/search/docs/data-types/dataset) which +requires a `Dataset` to have a `name` and `description`," %} -{: .warning } -> The properties above are not sufficient to generate a [DataCite][DataCite Schema] citation. Advice on integrating with [DataCite] will be provided in a future version of this specification, or as an implementation guide. +{% include callout.html type="warning" content="The properties above are not sufficient to generate a [DataCite][DataCite Schema] citation. Advice on integrating with [DataCite] will be provided in a future version of this specification, or as an implementation guide." %} Additional properties of _schema.org_ types [Dataset] and [CreativeWork] MAY be added to further describe the RO-Crate as a whole, e.g. [author], [abstract], [publisher]. See sections [contextual entities](contextual-entities) and [provenance](provenance) for further details. 
@@ -211,15 +207,13 @@ If the `@id` of the Root Data Entity is an absolute URI, an _Attached RO-Crate_ RO-Crates that have been assigned a _persistent identifier_ (e.g. a DOI) SHOULD indicate this using [identifier] on the root data entity using the approach set out in the [Science On Schema.org guides], that is through a `PropertyValue`. -{: note} -> Earlier RO-Crate 1.1 and earlier recommended `identifier` to be plain string URIs. Clients SHOULD be permissive of an RO-Crate `identifier` being a string (which MAY be a URI), or a `@id` reference, which SHOULD be represented as an `PropertyValue` entity which MUST have a human readable `value`, and SHOULD have a `url` if the identifier is Web-resolvable. A citable representation of this persistent identifier MAY be given as a `description` of the `PropertyValue`, but as there are more than 10.000 known [citation styles], no attempt should be made to parse this string. +{% include callout.html type="note" content="Earlier RO-Crate 1.1 and earlier recommended `identifier` to be plain string URIs. Clients SHOULD be permissive of an RO-Crate `identifier` being a string (which MAY be a URI), or a `@id` reference, which SHOULD be represented as an `PropertyValue` entity which MUST have a human readable `value`, and SHOULD have a `url` if the identifier is Web-resolvable. A citable representation of this persistent identifier MAY be given as a `description` of the `PropertyValue`, but as there are more than 10.000 known [citation styles], no attempt should be made to parse this string." %} #### Resolvable persistent identifiers and citation text It is RECOMMENDED that resolving the `identifier` programmatically return the _RO-Crate Metadata Document_ or an archive (e.g. ZIP) that contain the _RO-Crate Metadata File_, using [content negotiation](data-entities#retrieving-an-ro-crate) and/or [Signposting]. 
With an RO-Crate identifier that is persistant and resolvable in this way from a URI, the root data entity SHOULD indicate this using the `cite-as` property according to [RFC8574]. Likewise, an HTTP/HTTPS server of the resolved RO-Crate Metadata Document or archive (possibly after redirection) SHOULD indicate that persistent identifier in its [Signposting] headers using `Link rel="cite-as"`. -{: .tip} -> The above `cite-as` MAY go to a repository landing page, and MAY require authentication, but MUST ultimately have the RO-Crate as a downloadable item, which SHOULD be programmatically accessible through content negotiation or [Signposting] (`Link rel="describedby"` for a _RO-Crate Metadata Document_, or `Link rel="item"` for an archive). To rather associate a textual scholarly citation for a crate (e.g. journal article), indicate instead a [publication via `citation` property](contextual-entities#publications-via-citation-property). +{% include callout.html type="tip" content='The above `cite-as` MAY go to a repository landing page, and MAY require authentication, but MUST ultimately have the RO-Crate as a downloadable item, which SHOULD be programmatically accessible through content negotiation or [Signposting] (`Link rel="describedby"` for a _RO-Crate Metadata Document_, or `Link rel="item"` for an archive). To rather associate a textual scholarly citation for a crate (e.g. journal article), indicate instead a [publication via `citation` property](contextual-entities#publications-via-citation-property).' %} Any entity which is a subclass of CreativeWork, including the _Root Data Entity_ MAY have a `creditText` property which provides a textual citation for the entity. 
diff --git a/docs/_specification/1.2-DRAFT/structure.md b/docs/_specification/1.2-DRAFT/structure.md index 1e87c2dd..bbfec520 100644 --- a/docs/_specification/1.2-DRAFT/structure.md +++ b/docs/_specification/1.2-DRAFT/structure.md @@ -80,13 +80,11 @@ These crates cannot carry their own data _payload_, but may reference data depos Any [data entities](data-entities) in a _Detached RO-Crate_ MUST be [Web-based Data Entities](data-entities.html#web-based-data-entities). -{: .warning } -> Using relative URI references like `example/data.txt` in a _Detached RO-Crate_ is NOT RECOMMENDED as this is considered ambigious and fragile. +{% include callout.html type="warning" content="Using relative URI references like `example/data.txt` in a _Detached RO-Crate_ is NOT RECOMMENDED as this is considered ambigious and fragile. " %} A _Detached RO-Crate_ can be identified by the [root data entity](root-data-entity) having an `@id` different from `./` in the JSON. -{: .note } -> [Finding the Root Data Entity](root-data-entity#finding-the-root-data-entity) can be harder for consumers of detached crates, particularly if the platform serving the _RO-Crate Metadata Document_ is unable to ensure the URI path ends with `…/ro-crate-metadata.json`. +{% include callout.html type="note" content="[Finding the Root Data Entity](root-data-entity#finding-the-root-data-entity) can be harder for consumers of detached crates, particularly if the platform serving the _RO-Crate Metadata Document_ is unable to ensure the URI path ends with `…/ro-crate-metadata.json`. " %} Note that a detached RO-Crate may still use `#`-based local identifiers for [contextual entities](contextual-entities). @@ -186,8 +184,7 @@ Metadata about parts of the _RO-Crate Website_ MAY be included in an RO-Crate as } ``` -{: .warning } -> In a _Detached RO-Crate_ it is **undefined** how to find the _RO-Crate Website_ from the _RO-Crate Metadata Document_ or vice versa; it is RECOMMENDED to describe both as contextual entities. 
+{% include callout.html type="warning" content="In a _Detached RO-Crate_ it is **undefined** how to find the _RO-Crate Website_ from the _RO-Crate Metadata Document_ or vice versa; it is RECOMMENDED to describe both as contextual entities." %} @@ -201,8 +198,7 @@ Payload files may appear directly in the _RO-Crate Root_ alongside the _RO-Crate A RO-Crate may also contain [Web-based Data Entities](data-entities.html#web-based-data-entities) that are not present as part of the payload and referenced using absolute URIs. These may require additional preservation measures. -{: .tip } -> A RO-Crate [packaged with BagIt](appendix/implementation-notes#adding-ro-crate-to-bagit) may be [referencing external files](appendix/implementation-notes#referencing-external-files) which are not present in the _RO-Crate Root_ hierarchy until the BagIt has been _completed_. This method can be used for files that are large, require authentication or otherwise inconvenient to transfer with the RO-Crate, but which should nevertheless still be considered part of the _payload_. +{% include callout.html type="tip" content="A RO-Crate [packaged with BagIt](appendix/implementation-notes#adding-ro-crate-to-bagit) may be [referencing external files](appendix/implementation-notes#referencing-external-files) which are not present in the _RO-Crate Root_ hierarchy until the BagIt has been _completed_. This method can be used for files that are large, require authentication or otherwise inconvenient to transfer with the RO-Crate, but which should nevertheless still be considered part of the _payload_." 
%} ## Self-describing and self-contained (_Attached RO-Crates_) diff --git a/docs/_specification/1.2-DRAFT/workflows.md b/docs/_specification/1.2-DRAFT/workflows.md index 918ca809..0664d748 100644 --- a/docs/_specification/1.2-DRAFT/workflows.md +++ b/docs/_specification/1.2-DRAFT/workflows.md @@ -232,8 +232,7 @@ A contextual entity conforming to the [FormalParameter profile][FormalParameter } ``` -{: .note } -> `input`, `output` and `FormalParameter` are at time of writing proposed by Bioschemas and not yet integrated in Schema.org +{% include callout.html type="note" content="`input`, `output` and `FormalParameter` are at time of writing proposed by Bioschemas and not yet integrated in Schema.org" %} ## Complete Workflow Example From 279d2ce1338c22f81290217c6f3dbd09ed684d13 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 16:23:46 +0000 Subject: [PATCH 12/16] Revert "add Entity definition, move entity terms close together" This reverts commit 1f7f4dd43a123388741406e02f4475dd0014d7ee. --- docs/_specification/1.2-DRAFT/terminology.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/terminology.md b/docs/_specification/1.2-DRAFT/terminology.md index 445a7704..3bbb0afb 100644 --- a/docs/_specification/1.2-DRAFT/terminology.md +++ b/docs/_specification/1.2-DRAFT/terminology.md @@ -44,20 +44,18 @@ _RO-Crate Website_: Human-readable HTML pages which describe the RO-Crate (i.e. _Type_: A classification of objects or their descriptions. The type (or "class") is given as a short-hand _key_, mapped by the _RO-Crate JSON-LD Context_ to a _URI_ that has the type definition. See appendix [RO-Crate JSON-LD](appendix/jsonld). -_Property_: A relationship from one _entity_ to another entity, or to a _value_. The type of relationship is identified by a _URI_, mapped to a _key_ by _JSON-LD_. See appendix [RO-Crate JSON-LD](appendix/jsonld). 
- -_Entity_: A JSON-LD representation of an object, which has a _type_ and may be described using a set of _properties_. There are two categories of entity: _data entities_ and _contextual entities_. - _Data Entity_: A JSON-LD representation (in the _RO-Crate Metadata Document_) of a directory, file, or other Web resource which is considered _contained_ by the _RO-Crate_. See section [Data entities](data-entities). -_Root Data Entity_: A _Data Entity_ of _type_ [Dataset], representing the RO-Crate as a whole. See section [Root Data Entity](root-data-entity). +_Property_: A relationship from one _entity_ to another entity, or to a _value_. The type of relationship is identified by a _URI_, mapped to a _key_ by _JSON-LD_. See appendix [RO-Crate JSON-LD](appendix/jsonld). -_Contextual Entity_: A JSON-LD representation of an entity associated with another _Entity_, in order to adequately describe it. For example, a [Person], [Organization] (including research projects), item of equipment ([IndividualProduct]), [license] or any other _thing_ or _event_ that forms part of the metadata for a _Data Entity_. _Properties_ of contextual entities may refer to further entities. See section [Contextual Entities](contextual-entities). +_Root Data Entity_: A _Data Entity_ of _type_ [Dataset], representing the RO-Crate as a whole. See section [Root Data Entity](root-data-entity). -_JSON-LD_: A JSON-based file format for storing _Linked Data_. This document assumes [JSON-LD 1.0]. JSON-LD uses a _context_ to map from JSON keys to _URIs_. See appendix [RO-Crate JSON-LD](appendix/jsonld). +_JSON-LD_: A JSON-based file format for storing _Linked Data_. This document assumes [JSON-LD 1.0]. JSON-LD use a _context_ to map from JSON keys to _URIs_. See appendix [RO-Crate JSON-LD](appendix/jsonld). 
_JSON_: The _JavaScript Object Notation (JSON) Data Interchange Format_ as defined by [RFC 7159]; a structured text file format that can be programmatically consumed and generated in a wide range of programming languages. The main JSON structures are _objects_ (`{}`) indexed by _keys_, sequential _arrays_ (`[]`) and literal _values_ (`""`). +_Contextual Entity_: A JSON-LD representation of an entity associated with another _Entity_, in order to adequately describe it. For example, a [Person], [Organization] (including research projects), item of equipment ([IndividualProduct]), [license] or any other _thing_ or _event_ that forms part of the metadata for a _Data Entity_. _Properties_ of contextual entities may refer to further entities. See section [Contextual Entities](contextual-entities). + _Linked Data_: A data structure where properties, types and resources are identified with _URIs_, which if retrieved over the Web, further describe or provide the identified property/type/resource. _URI_: A _Uniform Resource Identifier_ as defined in [RFC 3986], for example `http://example.com/path/file.html` - commonly known as _URL_. In this document the term _URI_ includes _IRI_, which also permit international Unicode characters. The URI identifies a downloadable resource (e.g. an image) or a concept (e.g. a _type_ definition). @@ -66,7 +64,7 @@ _URI Path_: The relative _path_ element of an _URI_ as defined in [RFC3986 secti _RO-Crate JSON-LD Context_: A JSON-LD [context][JSON-LD context] that provides Linked Data mapping for RO-Crate metadata to vocabularies like [Schema.org]. This mapping assigns meaning to the JSON keys, see appendix [RO-Crate JSON-LD](appendix/jsonld). -_RO-Crate JSON-LD_: JSON-LD that use the _RO-Crate JSON-LD Context_ and contain RO-Crate metadata, written as if [flattened] and then [compacted] according to the rules in JSON-LD 1.0. The _RO-Crate JSON-LD_ for an _RO-Crate_ is stored or transmitted in the _RO-Crate Metadata Document_. 
+_RO-Crate JSON-LD_: JSON-LD that use the _RO-Crate JSON-LD Context_ and contain RO-Crate metadata, written as if [flattened] and then [compacted] according to the rules in JSON-LD 1.0. The _RO-Crate JSON-LD_ for an _RO-Crate_ is stored or transmitted in the _RO-Crate Metadata Document. @@ -76,6 +74,6 @@ Throughout this specification, RDF terms (_properties_, _types_) are referred to Following [Schema.org] practice, `property` names start with lowercase letters and `Type` names start with uppercase letters. -In the _RO-Crate Metadata Document_ the RDF terms use their RO-Crate JSON-LD names as defined in the _RO-Crate JSON-LD Context_, which is available at . +In the _RO-Crate Metadata Document_ the RDF terms use their RO-Crate JSON-LD names as defined in the _RO-Crate JSON-LD Context_, which is available at {% include references.liquid %} From f71b1c624eb18f815b7c8c754fe133dd57f60cac Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 16:23:52 +0000 Subject: [PATCH 13/16] Revert "minor additional changes to introduction" This reverts commit b8b5dc6c67abc0cb2e6547279c2c7d2483c0cb2e. --- docs/_specification/1.2-DRAFT/introduction.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index 43551965..d1066f9c 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -29,13 +29,13 @@ This document specifies a method, known as _RO-Crate_ (Research Object Crate), o The core of RO-Crate is a machine-readable linked-data document in JSON-LD format known as an **RO-Crate Metadata Document**. RO-Crate metadata documents can, to a large extent, be created and processed just like any other JSON: knowledge of JSON-LD is not needed, unless extending RO-Crate with additional concepts or combining RO-Crate with other Linked Data technologies. 
-This section introduces the general RO-Crate concepts through a running example, while the normative sections in the rest of the RO-Crate specification define in more detail these and other concepts using separate examples and recommendations. +This page introduces the general RO-Crate concepts through a running example, while the normative pages in the rest of the RO-Crate specification define in more detail these and other concepts using separate examples and recommendations. ## Walkthrough: An initial RO-Crate -In the simplest form, to describe some data on disk, an _RO-Crate Metadata Document_ named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories (this file is known as the _RO-Crate Metadata File_). +In the simplest form, to describe some data on disk, a file named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories. This `ro-crate-metadata.json` file is known as the _RO-Crate Metadata Document_. -In the example below, a single file `data.csv` is placed with the _RO-Crate Metadata Document_ in a directory named `crate1`: +In the example below, a single file `data.csv` is placed with the RO-Crate Metadata Document in a directory named `crate1`:
Folder listing of crate1, including data.csv and ro-crate-metadata.json @@ -228,6 +228,6 @@ The rest of this specification is structured as follows: * [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific _RO-Crate profile_, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, which is also explored in this section. * [Appendixes](appendix) contain more technical references and suggestions for developers, e.g. for deciding on `@id` [in JSON-LD](appendix/jsonld#describing-entities-in-json-ld) or [extending RO-Crate terms](appendix/jsonld#extending-ro-crate). The appendix also explores how an RO-Crate can be [packaged with BagIt](appendix/implementation-notes#combining-with-other-packaging-schemes) or used as part of a repository. -Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by sections like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. +Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by pages like [Root Data Entity](root-data-entity). 
The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. {% include references.liquid %} From be0064090b4583b2b25244bd8e4d9ec5259bf3bb Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 16:26:31 +0000 Subject: [PATCH 14/16] Revert "grammar & style tweaks to introduction" This reverts commit b55a69642d92f41820c1816ca0cb8bcc817fe632. --- docs/_specification/1.2-DRAFT/introduction.md | 56 ++++++++++--------- 1 file changed, 29 insertions(+), 27 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index d1066f9c..593e73f6 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -25,15 +25,15 @@ parent: RO-Crate 1.2-DRAFT # Introduction -This document specifies a method, known as _RO-Crate_ (Research Object Crate), of aggregating and describing data for distribution, re-use, publishing, preservation and archiving. RO-Crates aggregate data into a Dataset, and may describe any resource including files, URI-addressable resources, or use other addressing schemes to locate digital or physical data. Describing resources includes technical metadata such as file sizes and types as well as contextual information including how and where datasets and files were created, how they were collated and collected, who was involved in the process, what equipment and software was used, who funded the work, how to cite it, and crucially, how it may be reused, and by whom. +This document specifies a method, known as _RO-Crate_ (Research Object Crate), of aggregating and describing data for distribution, re-use, publishing, preservation and archiving. 
RO-Crates aggregate data into a Dataset, and may describe any resource including files, URI-addressable resources, or use other addressing schemes to locate digital or physical data. Describing resources includes technical metadata such as file sizes and types as well as contextual information including how datasets and files were created, and where, how they were collated and collected, who was involved in the process, what equipment and software was used, who funded the work, how to cite it, and crucially, how it may be reused, and by whom. -The core of RO-Crate is a machine-readable linked-data document in JSON-LD format known as an **RO-Crate Metadata Document**. RO-Crate metadata documents can, to a large extent, be created and processed just like any other JSON: knowledge of JSON-LD is not needed, unless extending RO-Crate with additional concepts or combining RO-Crate with other Linked Data technologies. +The core of RO-Crate is a machine-readable linked-data document in JSON-LD format known as an **RO-Crate Metadata Document**. RO-Crate metadata documents can to a large extent be created and processed just like any other JSON: knowledge of JSON-LD is not needed, unless extending RO-Crate with additional concepts or combining RO-Crate with other Linked Data technologies. This page introduces the general RO-Crate concepts through a running example, while the normative pages in the rest of the RO-Crate specification define in more detail these and other concepts using separate examples and recommendations. ## Walkthrough: An initial RO-Crate -In the simplest form, to describe some data on disk, a file named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories. This `ro-crate-metadata.json` file is known as the _RO-Crate Metadata Document_. 
+In the simplest form, to describe some data on disk, an _RO-Crate Metadata Document_ named `ro-crate-metadata.json` is placed in a directory alongside a set of files or directories (this file is known as the _RO-Crate Metadata File_). In the example below, a single file `data.csv` is placed with the RO-Crate Metadata Document in a directory named `crate1`: @@ -97,11 +97,11 @@ In this running example, the content of the _RO Crate Metadata Document_ is: ### JSON-LD preamble -The preamble of `@context` and `@graph` are JSON-LD structures that help provide global identifiers to the JSON keys and types used in the rest of the _RO-Crate Metadata Document_. These will largely map to definitions in the [schema.org](http://schema.org/) vocabulary, which can be used by RO-Crate extensions to provide additional metadata beyond the RO-Crate specification. It is this feature of JSON-LD that helps make RO-Crate extensible for many different purposes -- this is explored further in the [appendix on JSON-LD](appendix/jsonld). +The preamble of `@context` and `@graph` are JSON-LD structures that help provide global identifiers to the JSON keys and types used in the rest of the _RO-Crate Metadata Document_. These will largely map to definitions in the [schema.org](http://schema.org/) vocabulary, which can be used by RO-Crate extensions to provide additional metadata beyond the RO-Crate specifications. It is this feature of JSON-LD that helps make RO-Crate extensible for many different purposes -- this is explored further in [appendix on JSON-LD](appendix/jsonld). -However, in the general case it should be sufficient to follow the RO-Crate JSON examples directly without deeper JSON-LD understanding. In short, an _RO-Crate Metadata Document_ contains a flat list of _entities_ as objects in the `@graph` array. These entities are cross-referenced using `@id` identifiers rather than being deeply nested. 
+However, in the general case it should be sufficient to follow the RO-Crate JSON examples directly without deeper JSON-LD understanding. In short, an _RO-Crate metadata Document_ contains a flat list of _entities_ as objects in the `@graph` array. These entities are cross-referenced using `@id` identifiers rather than being deeply nested. -### RO-Crate Metadata Descriptor {#intro-ro-crate-metadata-descriptor} +### RO-Crate Metadata Descriptor The first JSON-LD _entity_ in our example above has the `@id` `ro-crate-metadata.json`: @@ -117,7 +117,7 @@ The first JSON-LD _entity_ in our example above has the `@id` `ro-crate-metadata This required entity, known as the _RO-Crate Metadata Descriptor_, helps this file self-identify as an _RO-Crate Metadata Document_, which is conforming to (`conformsTo`) the RO-Crate specification version 1.2-DRAFT. -The descriptor also indicates via the `about` property which entity in the `@graph` array is the _RO-Crate Root_ dataset -- the starting point of this RO-Crate. +The descriptor also indicates via the `about` property which entity in the `@graph` array is the _RO-Crate Root Dataset_ -- the starting point of this RO-Crate. ### RO-Crate Root @@ -128,13 +128,13 @@ We can visualise how the above entity references the **RO-Crate Root** as:
Figure 2: showing RO-Crate Metadata descriptor's about property pointing at the RO-Crate Root entity with matching @id
-By convention, in RO-Crate the `@id` value of `./` means that this document describes the directory of content in which the _RO-Crate Metadata Document_ is located, as in the example above. This reference from `ro-crate-metadata.json` is therefore marking the `crate1` directory as being the _RO-Crate Root_. The entity whose `@id` is the _RO-Crate Root_ is called the _Root Data Entity_. +By convention, in RO-Crate the `@id` value of `./` means that this document describes the directory of content in which the RO-Crate metadata is located as in the example above. This reference from `ro-crate-metadata.json` is therefore marking the `crate1` directory as being the RO-Crate root. {% include callout.html type="note" content="This example is a directory-based RO-Crate stored on disk. If the crate is being served from a Web service, such as a data repository or database where files are not organized in directories, then the `@id` might be an absolute URI instead of `./` -- see section [Root Data Entity](root-data-entity) for details." %} ### About cross-references -In an _RO-Crate Metadata Document_, entities are cross-referenced using `@id` reference objects, rather than using deeply nested JSON objects. In short, this _flattened JSON-LD_ style allows any entity to reference any other entity, and RO-Crate consumers to directly find all the descriptions of an entity within a single JSON object. So let's have a look at the _Root Data Entity_ `./`: +In _RO-Crate Metadata Documents_, entities are cross-referenced using `@id` reference objects, rather than using deeply nested JSON objects. In short, this _flattened JSON-LD_ style allows any entity to reference any other entity, and RO-Crate consumers to directly find all the descriptions of an entity within a single JSON object. 
So let's have a look at the Root Data Entity `./`: ```json @@ -146,17 +146,20 @@ In an _RO-Crate Metadata Document_, entities are cross-referenced using `@id` re } ``` -The _Root Data Entity_ always has `@type` `Dataset`, though it may have more than one type. It has several metadata properties that describe the RO-Crate as a whole, as a collection of resources. The section on the [Root Data Entity](root-data-entity) explores further the required and recommended properties of this entity. +The root is always typed `Dataset`, though it may have more than one type. It has several metadata properties that describe the RO-Crate as a whole, as a collection of resources. The section on [root data entity](root-data-entity) explores further the required and recommended properties of the root `./`. ### Data entities {#intro-data-entities} -A main type of resources collected are _data_ -- simplifying, we can consider data as any kind of file that can be opened in other programs. These are aggregated by the _Root Data Entity_ with the `hasPart` property. In this example we have an array with a single value, a reference to the entity describing the file `data.csv`. +A main type of resources collected are _data_ -- simplifying, we can consider data as any kind of file that can be opened in other programs. These are aggregated by the Root Dataset with the `hasPart` property. In this example we have an array with a single value, a reference to the entity describing the file `data.csv`. -{% include callout.html type="tip" content="RO-Crates can also contain _data entities_ that are folders and Web resources, as well as non-File-like data like online databases -- see the section on [data entities](data-entities) for more information." %} +{: .tip} +RO-Crates can also contain data entities that are folders and Web resources, as well as non-File-like data like online databases -- see section on [data entities](data-entities).
- JSON block with id `./` has an array under  `hasPart` listing id `data.csv`. In second JSON block with id `data.csv` we see it is typed `File` and has other properties. -
Figure 3: RO-Crate Root entity referencing the data entity with @id identifier data.csv
+ +JSON block with id ./ has an array under hasPart listing id data.csv. In second JSON block with id data.csv we see it is typed File and have other properties. + +
Figure 3: RO-Crate Root entity referencing the data entity with @id identifier data.csv
If we now follow the `@id` reference for the corresponding _data entity_ JSON block, we see it has `@type` value of `File` and additional metadata such as `encodingFormat`. It is recommended that every entity has a human readable `name`, which as shown in this example, does not need to match the filename/identifier. The `encodingFormat` indicates the media file type so that consumers of the crate can open `data.csv` in an appropriate program. @@ -172,12 +175,12 @@ If we now follow the `@id` reference for the corresponding _data entity_ JSON bl }, ``` -For more information on describing files and directories, including their recommended and required attributes, see the section on [data entities](data-entities). +For more information on describing files and directories, including their recommended and required attributes, see section on [data entities](data-entities). ### Contextual entities {#intro-contextual-entities} -Moving back to the RO-Crate _Root Data Entity_ (with `@id` `./`), the publisher of this Dataset should be indicated using the property `publisher` and using a URI to identify the publishing `Organization`, linking to what is known as a _Contextual Entity_ that provides some information about the Organization such as its name and web address. +Moving back to the RO-Crate root `./`, the publisher of this Dataset should be indicated using the property `publisher` using a URI to identify the `Organization`, linking to what is known as a _Contextual Entity_ that provides some information about the Organization such as its name and web address. ```json @@ -197,17 +200,17 @@ Moving back to the RO-Crate _Root Data Entity_ (with `@id` `./`), the publisher } ``` -You may notice the subtle difference between a _data entity_ that is conceptually part of the RO-Crate and is file-like (containing bytes), while this _contextual entity_ is a representation of a real-life organization that can't be downloaded: following the URL, we would only get its _description_. 
The section on [contextual entities](contextual-entities) explores several of the entities that can be added to the RO-Crate to provide it with a **context**, for instance how to link to authors and their affiliations. Simplifying slightly, a _data entity_ is referenced from `hasPart` in a `Dataset`, while a _contextual entity_ is referenced using any other defined property. +You may notice the subtle difference between a _data entity_ that is conceptually part of the RO-Crate and is file-like (containing bytes), while this _contextual entity_ is a representation of a real-life organization that can't be downloaded: following the URL, we would only get its _description_. The section [contextual entities](contextual-entities) explores several of the entities that can be added to the RO-Crate to provide it with a **context**, for instance how to link to authors and their affiliations. Simplifying slightly, a data entity is referenced from `hasPart` in a `Dataset`, while a contextual entity is referenced using any other defined property. ## HTML preview -An RO-Crate can be distributed on disk, in packaged format such as a zip file or disk image, or placed on a static website. In any of these cases, an RO-Crate should have an accompanying HTML version (`ro-crate-preview.html`) designed to be human-readable. The exact contents of the preview may vary but should correspond to the _RO-Crate Metadata Document_ content and link to the contained data entities. The preview may be generated automatically from the _RO-Crate Metadata Document_ (see [RO-Crate tools](../../tools)), or even by hand (equivalent to a README). +An RO-Crate can be distributed on disk, in packaged format such as a zip file or disk image, or placed on a static website. In any of these cases, an RO-Crate should have an accompanying HTML version (`ro-crate-metadata.html`) designed to be human-readable. 
The exact contents of the preview may vary but should correspond to the _RO-Crate Metadata Document_ content and link to the contained data entities. The preview may be generated automatically from the RO-Crate Metadata Document (see [RO-Crate tools](../../tools)), or even by hand (equivalent to a README). -Below is a screenshot from the [preview of the running example](examples/rainfall-1.2.0/ro-crate-preview.html), which was generated using the [ro-crate-html](https://www.npmjs.com/package/ro-crate-html) package: +Below is a screenshot from the [preview of the running example](examples/rainfall-1.2.0/ro-crate-preview.html):
Screenshot of RO-Crate HTML preview. The metadata attributes are listed in a table with links to each connected entity, such as the Bureau of Meteorology. -
Figure 4: RO-Crate preview of the running example.
+
Figure 3: RO-Crate preview of the running example.
@@ -215,19 +218,18 @@ Below is a screenshot from the [preview of the running example](examples/rainfal The rest of this specification is structured as follows: -* [Terminology](terminology) defines terms such as _Entity_ used in this section and the rest of the specification. You may use this section as a quick-reference, but note that most of these are also covered in detail in separate sections. -* [RO-Crate Structure](structure) defines further how the `ro-crate-metadata.json` and data files can be organized within an _RO-Crate Root_ directory. +* [Terminology](terminology) defines terms such as _Entity_ used in the rest of the document. You may use this page as a quick-reference, but note that most of these are also covered in detail in separate pages. +* [RO-Crate structure](structure) defines further how the `ro-crate-metadata.json` and data files can be organized within an _RO-Crate Root_ directory * [Metadata of the RO-Crate](metadata) explains the connection to Linked Data principles and how RO-Crate keys are mapped to global identifiers. This is mainly of interest for readers already familiar with JSON-LD or ontologies, or which want to expand RO-Crate metadata keys. -* [Root Data Entity](root-data-entity) defines the entities _RO-Crate Metadata Descriptor_ (`ro-crate-metadata.json`) and _Root Data Entity_ (`./`) including their required and recommended properties. +* [Root Data Entity](root-data-entity) defines the entities _RO-Crate Metadata Descriptor_ (`ro-crate-metdata.json`) and _RO-Crate Root_ (`./`) including their required and recommended properties. * [Data Entities](data-entities) explores further how to describe data, including files, directories and Web references. Metadata such as file formats help inform RO-Crate consumers on which tools may be able to process the data. 
* [Contextual Entities](contextual-entities) shows how to describe entities used to annotate other entities, adding `People` and `Organization` referenced from `author`, `publication`, `affiliation` etc. Metadata like licensing, funding, locations and subjects can be described using contextual entities. -* [The focus of an RO-Crate](crate-focus) !!add description!! * [Provenance of Entities](provenance) explores how the history of making an entity can be added to the RO-Crate using a series of _actions_ -- this may include real-world activities and instruments, as well as software executions and modifications to the RO-Crate metadata itself. - * Subsection [Digital Library and Repository content](provenance#digital-library-and-repository-content) details how records in an existing repository (which may reference files, but also physical objects) can be described and published using RO-Crate. +* Subsection [Digital Library and Repository content](provenance#digital-library-and-repository-content) details how records in an existing repository (which may reference files, but also physical objects) can be described and published using RO-Crate. * [Workflows and Scripts](workflows) explains how computional software and code can be added to an RO-Crate, possibly as part of explaining provenance, but also for providing potential usage and further processing of the data. -* [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific _RO-Crate profile_, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, which is also explored in this section. 
+* [Profiles](profiles) formalises how a set of RO-Crates can indicate they are conforming to a specific profile, which may add additional requirements beyond this general RO-Crate specification. Profiles may add additional terms from `schema.org` and other vocabularies, or require a certain type of data entity used in a particular research domain. Profiles can themselves be expressed as an RO-Crate, explored in this section. * [Appendixes](appendix) contain more technical references and suggestions for developers, e.g. for deciding on `@id` [in JSON-LD](appendix/jsonld#describing-entities-in-json-ld) or [extending RO-Crate terms](appendix/jsonld#extending-ro-crate). The appendix also explores how an RO-Crate can be [packaged with BagIt](appendix/implementation-notes#combining-with-other-packaging-schemes) or used as part of a repository. -Throughout the specification you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by pages like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crates can use additional `schema.org` types and properties as needed. When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. +Throughout the specifications you will find references to the keys and types reused from `schema.org` through the JSON-LD context, for instance [Dataset], which define many more properties than the ones highlighted by pages like [Root Data Entity](root-data-entity). The intention is that the RO-Crate specification gives a common minimum of metadata, and that producers of RO-Crate can use additional `schema.org` types and properties as needed. 
When some patterns emerge from such extensions they can be formalized in a published [profile](profiles) to ensure they are also used consistently. {% include references.liquid %} From e5f0d6404b951810422ea044124e23ff2d7f79d1 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Wed, 4 Dec 2024 16:29:18 +0000 Subject: [PATCH 15/16] re-update tip in intro --- docs/_specification/1.2-DRAFT/introduction.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/_specification/1.2-DRAFT/introduction.md b/docs/_specification/1.2-DRAFT/introduction.md index 593e73f6..6dfa3c1a 100644 --- a/docs/_specification/1.2-DRAFT/introduction.md +++ b/docs/_specification/1.2-DRAFT/introduction.md @@ -152,8 +152,7 @@ The root is always typed `Dataset`, though it may have more than one type. It ha A main type of resources collected are _data_ -- simplifying, we can consider data as any kind of file that can be opened in other programs. These are aggregated by the Root Dataset with the `hasPart` property. In this example we have an array with a single value, a reference to the entity describing the file `data.csv`. -{: .tip} -RO-Crates can also contain data entities that are folders and Web resources, as well as non-File-like data like online databases -- see section on [data entities](data-entities). +{% include callout.html type="tip" content="RO-Crates can also contain data entities that are folders and Web resources, as well as non-File-like data like online databases -- see section on [data entities](data-entities)." %}
From aa0c30243934a31356a0378e872677ddf60da0f7 Mon Sep 17 00:00:00 2001 From: Eli Chadwick Date: Mon, 9 Dec 2024 15:24:07 +0000 Subject: [PATCH 16/16] remove outdated filter from pandoc call --- Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/Makefile b/Makefile index 277cae7c..af4d2092 100644 --- a/Makefile +++ b/Makefile @@ -117,14 +117,13 @@ release/ro-crate-${TAG}.md: dependencies release/ docs/_specification/${RELEASE} release/ro-crate-${TAG}.html: dependencies release/ release/ro-crate-${TAG}.md egrep -v '^{:(\.no_)?toc}' release/ro-crate-${TAG}.md | \ pandoc --standalone --number-sections --toc --section-divs \ - --filter scripts/pandoc-admonition.py \ --metadata pagetitle="RO-Crate Metadata Specification ${RELEASE}" \ --from=markdown+gfm_auto_identifiers -o release/ro-crate-${TAG}.html release/ro-crate-${TAG}.pdf: dependencies release/ release/ro-crate-${TAG}.md egrep -v '^{:(\.no_)?toc}' release/ro-crate-${TAG}.md | \ pandoc --pdf-engine xelatex --variable=hyperrefoptions:colorlinks=true,allcolors=blue \ - --variable papersize=a4 --filter scripts/pandoc-admonition.py \ + --variable papersize=a4 \ --number-sections --toc --metadata pagetitle="RO-Crate Metadata Specification ${RELEASE}" \ --from=markdown+gfm_auto_identifiers -o release/ro-crate-${TAG}.pdf