Commit

restructure knowledgebase
Freymaurer committed Nov 6, 2024
1 parent 27fe376 commit 91f568b
Showing 33 changed files with 82 additions and 36 deletions.
18 changes: 18 additions & 0 deletions astro.config.mts
@@ -51,6 +51,24 @@ export default defineConfig({
collapsed: true,
autogenerate: { directory: 'guides' },
},
{
label: 'Resources',
// Collapse the group by default.
collapsed: true,
autogenerate: { directory: 'resources' },
},
{
label: 'Git',
// Collapse the group by default.
collapsed: true,
autogenerate: { directory: 'git' },
},
{
label: 'CWL',
// Collapse the group by default.
collapsed: true,
autogenerate: { directory: 'cwl' },
},
{
label: 'Fundamentals',
// Collapse the group by default.
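Review note: the three groups added in this hunk share one shape (a label, `collapsed: true`, and an `autogenerate` directory). A minimal sketch of how that repetition could be factored out, assuming a hypothetical helper `collapsedGroup` and a local `SidebarGroup` type that are not part of the commit or of Starlight's API:

```typescript
// Hypothetical local type mirroring the object shape used in the hunk above.
// It is an assumption for illustration, not a type exported by Starlight.
type SidebarGroup = {
  label: string;
  collapsed: boolean;
  autogenerate: { directory: string };
};

// Hypothetical helper: builds one group that is collapsed by default and
// autogenerated from a content directory.
function collapsedGroup(label: string, directory: string): SidebarGroup {
  return { label, collapsed: true, autogenerate: { directory } };
}

// The three groups this commit adds to the sidebar.
const addedGroups: SidebarGroup[] = [
  collapsedGroup('Resources', 'resources'),
  collapsedGroup('Git', 'git'),
  collapsedGroup('CWL', 'cwl'),
];
```

Whether this is worth doing is a judgment call: the explicit objects in the commit are easier to scan and to diff, while the helper only pays off if more groups follow.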
4 changes: 2 additions & 2 deletions src/content/docs/arc-commander/before-we-start.mdx
@@ -7,7 +7,7 @@ lastUpdated: 2023-06-14
After the setup steps, you're all set and ready to start using the ARC Commander. 🎉

:::tip
We recommend trying the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/guides/arc-commander-quick-start) for your first steps with the ARC Commander.
We recommend trying the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/arc-commander/arc-commander-quick-start) for your first steps with the ARC Commander.
:::

:::tip
@@ -17,7 +17,7 @@ After the quickstarts, we strongly recommend to read the in-depth ARC Commander

## Notes on ARC Commander Guides

- For most steps in this manual and in the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/guides/arc-commander-quick-start), it is assumed, that you opened a [shell or command prompt](/nfdi4plants.knowledgebase/guides/tutorial-command-line) within a directory you want to initiate as an ARC
- For most steps in this manual and in the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/arc-commander/arc-commander-quick-start), it is assumed, that you opened a [shell or command prompt](/nfdi4plants.knowledgebase/fundamentals/tutorial-command-line) within a directory you want to initiate as an ARC
- In the shell, `arc` defines the path to the ARC Commander executable (e.g. on Windows "C:\Users\userA\programs\ArcCommander\arc.exe").
- Note that each input that contains non-literal characters must be encapsulated in "quotation marks" when entered within the shell. This also applies when using the editor for numerals that are no numbers (dates, phone numbers, etc.).

4 changes: 2 additions & 2 deletions src/content/docs/arc-commander/index.mdx
@@ -32,7 +32,7 @@ Unless you actively request it to, the ARC Commander does not delete, modify or
## Do I have to use the ARC Commander?

No. As with most tools and services developed in DataPLANT, you are not obliged to use the ARC Commander to benefit from DataPLANT's support in [FAIR RDM](/nfdi4plants.knowledgebase/fundamentals/research-data-management).
However, we'd highly recommend to check it following the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/guides/arc-commander-quick-start).
However, we'd highly recommend to check it following the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/arc-commander/arc-commander-quick-start).

The alternative would be to

@@ -48,7 +48,7 @@ The ARC Commander runs on current Windows, Mac and Linux operating systems. The

For details, please

- try out the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/guides/arc-commander-quick-start),
- try out the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/arc-commander/arc-commander-quick-start),
- check the [GitHub repository](https://github.com/nfdi4plants/ARCCommander) to download and install the latest ARC Commander release on your device, or
- explore this ARC Commander Manual for in-depth details.

Original file line number Diff line number Diff line change
@@ -79,7 +79,7 @@ The standardized ARC structure helps with routine computations:

- The ARC's simple [directory structure](/nfdi4plants.knowledgebase/core-concepts/arc) itself helps building routines, no matter whether you work with code or licensed software. Across projects, you and your collaborators know, where to find metadata and raw data, where to store processed data and results.
- The ARC facilitates task automation such as quality control and validation within one project or across multiple ARCs covering routine measurements
- Code-based computations can be designed as reusable and reproducible workflows using [Common Workflow Language (CWL)](/nfdi4plants.knowledgebase/guides/data-analysis/computational-workflows)
- Code-based computations can be designed as reusable and reproducible workflows using [Common Workflow Language (CWL)](/nfdi4plants.knowledgebase/cwl)

### Data publication

@@ -144,7 +144,7 @@ Following the exemplary scenario A, you could setup the ARC for your collaborati
Alternatively, collaborators already working with ARCs could invite you to "their" ARC (exemplary scenario B). They can independently set up the ARC and fill metadata (4) based on your prepared templates (3).

:::tip
In scenario B the collaborator might invite you to a very large ARC with data not really relevant for your platform-specific collaboration. In this case you might want to [exclude irrelevant data or avoid downloading large data](/nfdi4plants.knowledgebase/guides/arc-gitignore) when syncing the ARC.
In scenario B the collaborator might invite you to a very large ARC with data not really relevant for your platform-specific collaboration. In this case you might want to [exclude irrelevant data or avoid downloading large data](/nfdi4plants.knowledgebase/git/git-gitignore) when syncing the ARC.
:::

**Can I retain my established naming convention for project management and data storage?**
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -7,8 +7,7 @@ authors:

# CWL Metadata

Metadata plays a crucial role in enhancing the comprehensibility of CWL files. By embedding additional information about the performer and the process within the metadata,
researchers can create a more comprehensive and informative description of their workflows.
Metadata plays a crucial role in enhancing the comprehensibility of CWL files. By embedding additional information about the performer and the process within the metadata, researchers can create a more comprehensive and informative description of their workflows.

- Performer Metadata:

37 changes: 37 additions & 0 deletions src/content/docs/cwl/index.mdx
@@ -0,0 +1,37 @@
---
title: Computational Workflows
lastUpdated: 2023-02-05
authors:
- kevin-frey
---
import { CardGrid } from '@astrojs/starlight/components';
import { LinkCard } from '@astrojs/starlight/components';

Welcome to the section on Computational Workflows (CWL).

Here you can find documentation about CWL and best practices. If you want to get a picture of CWL first, please check out the [CWL Main Page](https://www.commonwl.org/).

Please explore the sections on the left to find guides on

<CardGrid>
<LinkCard
title="Introduction"
href="/nfdi4plants.knowledgebase/cwl/cwl-introduction"
description="General information about the common workflow language!"
/>
<LinkCard
title="Metadata"
href="/nfdi4plants.knowledgebase/cwl/cwl-metadata"
description="Metadata in your .cwl files."
/>
<LinkCard
title="Setup"
href="/nfdi4plants.knowledgebase/cwl/cwl-runner-installation"
description="Setup the cwl runner on your machine."
/>
<LinkCard
title="Examples"
href="/nfdi4plants.knowledgebase/cwl/cwl-examples"
description="Find some example files!"
/>
</CardGrid>
2 changes: 1 addition & 1 deletion src/content/docs/fundamentals/data-management-plan.mdx
@@ -48,7 +48,7 @@ DataPLANT helps the community by providing [DataPLAN](https://plan.nfdi4plants.o
## Follow-up

- If you are looking for DataPLAN, the Data Management Plan (DMP) generator of DataPLANT, you can find it [here](https://plan.nfdi4plants.org).
- If you are looking for an article about DataPLAN, you can find it [here](/nfdi4plants.knowledgebase/guides/dataplan).
- If you are looking for an article about DataPLAN, you can find it [here](/nfdi4plants.knowledgebase/resources/dataplan).

## Sources and further information

2 changes: 1 addition & 1 deletion src/content/docs/fundamentals/metadata.mdx
@@ -54,7 +54,7 @@ The diversity of metadata types, sources and stakeholders highlights that collec
Metadata stakeholders from different environments have different understandings of what metadata is required for comprehension of the annotated data. As plant biologists we probably agree that, when retrieving data from a public repository or publication, it is beneficial to know what type of measurement was performed on what species of plants. By contrast, a computational biologist and a librarian might emphasize the importance of the programming environment required to interpret a script or the contributing authors and licenses, respectively.

As plant biologists, we frequently experience how hard it is (if at all possible) to reproduce an experiment described with too little information in a publication. So, the more metadata the merrier &ndash; wouldn't it be great to capture *all* metadata about a project? Realistically we can only collect a portion of metadata. To guide users on what metadata is encouraged to collect, different domains of data experts have formulated these requirements into what is often referred to as "metadata standards" or "minimum information standards".
Examples for bibliographic and administrative metadata standards include [DublinCore](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/) and [DataCite](https://schema.datacite.org). Prominent standards to annotate data relevant to different plant science domains are grouped under the "Minimum Information for Biological and Biomedical Investigations" ([MIBBI](https://fairsharing.org/3518d)) and define e.g. minimum information about a high-throughput SEQuencing Experiment ([MINSEQE](https://www.fged.org/projects/minseqe)), Proteomics Experiment ([MIAPE](http://www.psidev.info/miape)) or a Plant Phenotyping Experiment ([MIAPPE](https://www.miappe.org)). There are many more metadata standards available which can be explored at [fairsharing.org]. You can also use DataPLANT's [Metadata recommendation quiz](/nfdi4plants.knowledgebase/guides/metadata-quiz) for finding suitable metadata standards, metadata checklists for repositories as well as corresponding [Swate](/nfdi4plants.knowledgebase/swate) templates for your data.
Examples for bibliographic and administrative metadata standards include [DublinCore](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/) and [DataCite](https://schema.datacite.org). Prominent standards to annotate data relevant to different plant science domains are grouped under the "Minimum Information for Biological and Biomedical Investigations" ([MIBBI](https://fairsharing.org/3518d)) and define e.g. minimum information about a high-throughput SEQuencing Experiment ([MINSEQE](https://www.fged.org/projects/minseqe)), Proteomics Experiment ([MIAPE](http://www.psidev.info/miape)) or a Plant Phenotyping Experiment ([MIAPPE](https://www.miappe.org)). There are many more metadata standards available which can be explored at [fairsharing.org]. You can also use DataPLANT's [Metadata recommendation quiz](/nfdi4plants.knowledgebase/resources/metadata-quiz) for finding suitable metadata standards, metadata checklists for repositories as well as corresponding [Swate](/nfdi4plants.knowledgebase/swate) templates for your data.
The metadata standards can be regarded as "checklists", which, when followed, provide that the data is annotated with the required metadata attributes to make it comprehensible at least in the current context.


2 changes: 1 addition & 1 deletion src/content/docs/fundamentals/public-data-repositories.mdx
@@ -46,7 +46,7 @@ Examples for general-purpose repositories include

The following resources provide good starting points to seek a suitable repository for your research data.

- DataPLANT's [Metadata recommendation quiz](/nfdi4plants.knowledgebase/guides/metadata-quiz)
- DataPLANT's [Metadata recommendation quiz](/nfdi4plants.knowledgebase/resources/metadata-quiz)
- FAIRsharing: https://fairsharing.org
- re3data (Registry of Research Data Repositories): https://www.re3data.org
- Overview of EMBL-EBI repositories: https://www.ebi.ac.uk/services/all
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -311,7 +311,7 @@ This might however pose a safety risk. Please read the details here: https://www

### Git LFS

[Git LFS](/nfdi4plants.knowledgebase/guides/arc-lfs) is basically the system in the back to simplify working with git and (ARCs containing) large data files.
[Git LFS](/nfdi4plants.knowledgebase/git/git-lfs) is basically the system in the back to simplify working with git and (ARCs containing) large data files.
ARC commander and ARCitect offer options to download (clone) an ARC without large files; speeding up the process and avoiding waste of data storage, if you are only interested e.g. in the metadata.

If you have downloaded (cloned) an ARC without large files and try to upload it to a new location (i.e. new remote due to a transfer to other user, group, etc.), you will see the following or similar error
4 changes: 2 additions & 2 deletions src/content/docs/guides/arc-adding-external-data.mdx
@@ -1,5 +1,5 @@
---
title: Adding external data to the ARC
title: External Data
lastUpdated: 2023-07-07
authors:
- dominik-brilhaus
@@ -39,5 +39,5 @@ As with any other routine used by researchers to share scientific results and da
:::

:::tip
You can add datasets to the [.gitignore](/nfdi4plants.knowledgebase/guides/arc-gitignore) file, if you are unsure about the conditions to reuse data from an external source.
You can add datasets to the [.gitignore](/nfdi4plants.knowledgebase/git/git-gitignore) file, if you are unsure about the conditions to reuse data from an external source.
:::
7 changes: 7 additions & 0 deletions src/content/docs/guides/arc-cwl.mdx
@@ -0,0 +1,7 @@
---
title: CWL
---

import CwlIndex from '../cwl/index.mdx'

<CwlIndex />
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: Managing ARCs across locations
title: Storing ARCs
lastUpdated: 2024-07-17
authors:
- dominik-brilhaus
@@ -22,8 +22,8 @@ A few things are important when maintaining ARCs in multiple locations:
1. Try to keep your ARC in sync via the DataHUB
2. Make sure to sync large files properly

As with any cloud service, when a single file is edited from multiple locations, you can run into merge conflicts. To avoid these, make sure to regularly [sync your ARC with the DataHUB](/nfdi4plants.knowledgebase/guides/arc-syncing-recommendation) and from there sync with your (other) locations before adding or editing data.
In order to have the large files only where you need them and not where you do not (e.g. your personal computer), the ARC and DataHUB implement the LFS (Large file storage) system. The ARCitect and ARC commander provide options to properly handle [LFS-tagged files](/nfdi4plants.knowledgebase/guides/arc-lfs).
As with any cloud service, when a single file is edited from multiple locations, you can run into merge conflicts. To avoid these, make sure to regularly [sync your ARC with the DataHUB](/nfdi4plants.knowledgebase/git/git-syncing-recommendation) and from there sync with your (other) locations before adding or editing data.
In order to have the large files only where you need them and not where you do not (e.g. your personal computer), the ARC and DataHUB implement the LFS (Large file storage) system. The ARCitect and ARC commander provide options to properly handle [LFS-tagged files](/nfdi4plants.knowledgebase/git/git-lfs).

<!--
TODO

This file was deleted.

File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion src/content/docs/start-here/data-analysis.mdx
@@ -76,7 +76,7 @@ Hence the data analysis can simply be packaged as an assay, with the computation
## Option II: Create a run and a workflow

If your data analysis is code-based, you likely aim to make it reusable and actionable in-place.
To achieve this, we recommend to wrap and annotate your workflow using [Common Workflow Language (CWL)](/nfdi4plants.knowledgebase/guides/data-analysis/computational-workflows/). Although CWL is out of the scope of this starters' guide, we want to share the basic concept here.
To achieve this, we recommend to wrap and annotate your workflow using [Common Workflow Language (CWL)](/nfdi4plants.knowledgebase/cwl). Although CWL is out of the scope of this starters' guide, we want to share the basic concept here.

![](@images/start-here/arc-prototypic-workflows-cwl1.svg)

2 changes: 1 addition & 1 deletion src/content/docs/swate/index.mdx
@@ -42,7 +42,7 @@ Ontology terms within the Swate database can not only be used to standardize the

## Templates for convenient metadata annotation
Metadata annotation as part of the data submission routine to public repositories is often bothersome due to a high variability between repository requirements. This can become particularly inconvenient when the same metadata is submitted repeatedly, e.g. to unrelated public repositories. To assist researchers in this process, DataPLANT provides a growing collection of templates as a starting point for their annotation tables. The template design process is initiated “backwards”, starting from the requirements of public repositories and thereby, compliance with metadata standards. Our Data stewards supervise the metadata harmonization between template and target repository and simultaneously contribute to the development of the DataPLANT biology ontology [(DPBO)](https://github.com/nfdi4plants/nfdi4plants_ontology).
From a technical perspective, these templates are ISA Protocols containing various Characteristics, Parameters, and the Study specific Factor. DataPLANT provides checklists and requirements of public repositories as templates that are considered useful for various technologies and common standards, e.g. MIAPPE. For finding metadata checklists and Swate templates that are suitable for your data, you might want to use the [Metadata recommendation quiz](/nfdi4plants.knowledgebase/guides/metadata-quiz). Swate templates can directly be integrated to the isa.study.xlsx and isa.assay.xlsx files using Swate. Once loaded into the table, they still can be modified to special needs in the sense of adding or deleting annotation building blocks. The modularity of the system also gives labs and institutions the possibility to create their own lab specific templates for experiments that are frequently run in the lab, e.g. a metabolomics experiment of a measurement facility. High flexibility is fostered by offering a manual or Swate-supported template customization, distribution, and use.
From a technical perspective, these templates are ISA Protocols containing various Characteristics, Parameters, and the Study specific Factor. DataPLANT provides checklists and requirements of public repositories as templates that are considered useful for various technologies and common standards, e.g. MIAPPE. For finding metadata checklists and Swate templates that are suitable for your data, you might want to use the [Metadata recommendation quiz](/nfdi4plants.knowledgebase/resources/metadata-quiz). Swate templates can directly be integrated to the isa.study.xlsx and isa.assay.xlsx files using Swate. Once loaded into the table, they still can be modified to special needs in the sense of adding or deleting annotation building blocks. The modularity of the system also gives labs and institutions the possibility to create their own lab specific templates for experiments that are frequently run in the lab, e.g. a metabolomics experiment of a measurement facility. High flexibility is fostered by offering a manual or Swate-supported template customization, distribution, and use.

![Templates](@images/swate/swate-add-template.png)

2 changes: 1 addition & 1 deletion src/content/docs/vault/arc-user-journey.mdx
@@ -68,4 +68,4 @@ After Viola generated her plots, she placed them in individual subdirectories, s

## Cheat sheet

We hope that these examples nicely illustrated the ARC structure and that you are now ready to produce your own ARCs. Use the figure below as a cheat sheet to remember where to store which files. Or follow the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/guides/arc-commander-quick-start) to try it out yourself.
We hope that these examples nicely illustrated the ARC structure and that you are now ready to produce your own ARCs. Use the figure below as a cheat sheet to remember where to store which files. Or follow the [ARC Commander QuickStart](/nfdi4plants.knowledgebase/arc-commander/arc-commander-quick-start) to try it out yourself.
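
Review note: the link changes across this commit follow a fixed old-to-new mapping. A minimal sketch of how remaining links could be migrated or checked, using only the path pairs visible in this diff. The names `BASE`, `movedPages`, and `migrateLink` are hypothetical and not part of the repository; the sketch does not handle anchor fragments or query strings.

```typescript
// Site base path as it appears in every internal link in this diff.
const BASE = '/nfdi4plants.knowledgebase';

// Old-to-new page paths, taken from the link changes in this commit.
const movedPages: Record<string, string> = {
  '/guides/arc-commander-quick-start': '/arc-commander/arc-commander-quick-start',
  '/guides/tutorial-command-line': '/fundamentals/tutorial-command-line',
  '/guides/data-analysis/computational-workflows': '/cwl',
  '/guides/arc-gitignore': '/git/git-gitignore',
  '/guides/arc-lfs': '/git/git-lfs',
  '/guides/arc-syncing-recommendation': '/git/git-syncing-recommendation',
  '/guides/dataplan': '/resources/dataplan',
  '/guides/metadata-quiz': '/resources/metadata-quiz',
};

// Hypothetical helper: returns the new location for a moved page and
// leaves every other link (external or not moved) unchanged.
function migrateLink(href: string): string {
  if (!href.startsWith(BASE)) return href; // external or unrelated link
  const path = href.slice(BASE.length).replace(/\/$/, ''); // drop one trailing slash
  const target = movedPages[path];
  return target === undefined ? href : BASE + target;
}
```

For example, `migrateLink('/nfdi4plants.knowledgebase/guides/arc-lfs')` yields the `git/git-lfs` path used in the updated pages, while links such as `/nfdi4plants.knowledgebase/swate` pass through untouched.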