Commit
Merge pull request #269 from thoelken/main
Corrected headings level and capitalization in some articles
Showing 25 changed files with 424 additions and 179 deletions.
240 changes: 240 additions & 0 deletions
docs/_Getting-Started/01-privacy-policy-english-translation.md
Large diffs are not rendered by default.
7 changes: 6 additions & 1 deletion
.../01-privacy-policy-english-translation.md → .../01-privacy-policy-english-translation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,7 +12,7 @@ Legend: | |
* END = no solution, this problem is unsolvable | ||
|
||
|
||
# RNA-seq | ||
## RNA-seq | ||
1. high peak at low bp in the electropherogram (intensity mV per Size bp) | ||
- **source**: documentation (PDF) | ||
- **possible reason(s)**: contamination e.g. adapter dimers (adapter+adapter, no DNA) | ||
|
@@ -149,7 +149,7 @@ Legend: | |
- **possible reason(s)**: humans are bad with ratios (0.01 = almost 0 and 100 is just large but not the largest bar ever) | ||
- **solution/measure**: use any log transformation (e.g. log10: 0.01 => -2, 100 => +2) | ||
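The effect of the log transformation mentioned above can be checked with a few lines of Python. This is only a minimal sketch: the function name `log10_ratios` is hypothetical and not part of the original article, and it assumes all ratios are strictly positive.

```python
import math


def log10_ratios(ratios):
    """Return log10 of each ratio so that up- and down-regulation are symmetric.

    Assumption: every ratio is strictly positive (log10 is undefined otherwise).
    """
    transformed = []
    for ratio in ratios:
        if ratio <= 0:
            raise ValueError(f"ratio must be positive, got {ratio}")
        transformed.append(math.log10(ratio))
    return transformed


if __name__ == "__main__":
    ratios = [0.01, 1, 100]
    for ratio, value in zip(ratios, log10_ratios(ratios)):
        # A 100-fold decrease and a 100-fold increase end up equally far from 0.
        print(f"{ratio:>6} -> {value:+.1f}")
```

After the transformation, a ratio of 0.01 and a ratio of 100 lie at the same distance from zero, which is what makes bar charts of ratios easier to read.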
|
||
# Single cell | ||
## Single cell | ||
|
||
### Quality check | ||
1. peak at left/right side in gene or reads per cell histogram or log10-cumulative-number of reads per cell id | ||
|
@@ -191,9 +191,5 @@ Legend: | |
- **possible reason(s)**: some gene names are interpreted as dates when Excel is used for data handling <https://doi.org/10.1126/science.aah4573> | ||
- **solution/measure**: avoid Excel for data handling, or at least make sure that the cell type is not "AUTO" | ||
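A gene list that has already passed through a spreadsheet can be screened for this kind of damage. The following is a rough sketch, not a complete detector: the function name `find_date_like_symbols` and the regular expression are assumptions made for illustration, and they only catch the common day-month form (for example `2-Sep` where `SEPT2` was intended).

```python
import re

# Matches values such as "2-Sep" or "1-Mar", the form a spreadsheet typically
# gives to gene symbols like SEPT2 or MARCH1 after automatic date conversion.
# Assumption: English month abbreviations; other locales are not covered.
_DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def find_date_like_symbols(symbols):
    """Return the entries that look like dates rather than gene symbols."""
    return [s for s in symbols if _DATE_LIKE.match(s.strip())]


if __name__ == "__main__":
    column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "SEPT2"]
    print(find_date_like_symbols(column))
```

Any hit is a hint that the table was auto-formatted at some point and that the original symbols need to be restored from the source data.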
|
||
# Get Help | ||
## Get Help | ||
If you have any further questions about the management and analysis of your microbial research data, please contact us: [[email protected]](mailto:[email protected]) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) | ||
|
||
# Further resources | ||
|
||
# References |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,12 +5,12 @@ layout: default | |
docs_css: markdown | ||
--- | ||
|
||
# Introduction | ||
## Introduction | ||
A Data Management Plan (DMP) is a formal and living document that defines responsibilities and provides guidance. It describes data and data management during the project and measures for archiving and making data and research results available, usable, and understandable after the project has ended. | ||
|
||
DMPs are required in [DFG funding proposals since 2022](https://www.dfg.de/en/research_funding/announcements_proposals/2022/info_wissenschaft_22_25/index.html) and in [EU Funding Programs 2021-2027](https://ec.europa.eu/info/funding-tenders/opportunities/docs/2021-2027/common/guidance/aga_en.pdf). For funders, DMPs serve as a reporting tool to hold grantees accountable for conducting good and open science, with regular updates or in case of changes. For researchers and other stakeholders, DMPs are meant to be a living document that accompanies them from proposal writing or project start to the sharing of their data and results. | ||
|
||
# Content of DMPs | ||
## Content of DMPs | ||
DMPs typically include the following information: | ||
* Administrative project-specific information (including a description of the research project) | ||
* Roles, responsibilities and obligations | ||
|
@@ -27,7 +27,7 @@ DMPs typically include the following information: | |
|
||
To find a TDR, see the [Data Repositories page of the Knowledge Base]({% link _RDM-Share/22-data-repositories.md %}). | ||
|
||
# DMP templates and examples | ||
## DMP templates and examples | ||
|
||
**Templates** | ||
* [NFDI4Microbiota's template](https://doi.org/10.5281/zenodo.13628589) | ||
|
@@ -38,7 +38,7 @@ To find a TDR, see the [Data Repositories page of the Knowledge Base]({% link _R | |
* [DD-DeCaF Bioinformatics Services for Data-Driven Design of Cell Factories and Communities](https://phaidra.univie.ac.at/o:1139495) | ||
* [METASTAVA](https://doi.org/10.5281/zenodo.5841166) | ||
|
||
# Benefits of a DMP | ||
## Benefits of a DMP | ||
If implemented correctly, a DMP can [benefit all stakeholders](https://doi.org/10.1371/journal.pcbi.1006750) in a research project, despite the initial cost of creating the DMP itself. | ||
|
||
A DMP can **save time and nerves** for yourself and others by planning ahead. DMPs define roles, responsibilities, and efforts regarding the data and its management. Writing a DMP will also get you in touch with IT staff and your institution's data protection officer at an early stage. Writing a DMP also ensures data quality and allows you to easily trace your processing steps, making your analysis and results reproducible. Writing a DMP also allows you to manage access rights and prevent security breaches. Finally, by writing your DMP, you may be able to identify gaps and vulnerabilities in your current data management strategy at an early stage and outline solutions to fill them. | ||
|
@@ -47,7 +47,7 @@ A DMP can also facilitate and **harmonize the coordination and shared use of dat | |
|
||
DMPs offer **other benefits**, such as enabling verification and control: researchers are accountable for how their data are managed during their research project. They also help to identify - and potentially minimize - time and money costs that need to be included in the proposal, such as for Research Data Management (RDM) activities. They also help to comply with Good Research Practice (GRP), support research integrity, and ensure that ethical and legal requirements are met. DMPs also help to meet institutional and funder requirements: funding agencies increasingly require information on the management of research data, and a DMP allows you to structure and formalize this information. Last but not least, DMPs facilitate data reuse, thereby increasing data citation and advancing scientific progress. | ||
|
||
# Writing a DMP | ||
## Writing a DMP | ||
|
||
**Who is involved in the creation of the DMP?** Entities involved in the creation of a DMP are researchers, RDM staff (check your institution's [research data policy](https://www.forschungsdaten.org/index.php/Forschungsdaten-Policies) and ask for [local support](https://www.forschungsdaten.org/index.php/FDM-Kontakte)) and central infrastructure (e.g. computer center, library). | ||
|
||
|
@@ -57,7 +57,7 @@ DMPs offer **other benefits**, such as enabling verification and control: resear | |
|
||
**DMP quality check:** A good DMP is well structured and distinguishes between actions to be taken during and after the project. It is a living document that needs to be updated regularly and is for the use of all project stakeholders. It should be started as early as possible, be as concise as possible, as long as necessary, and contain sufficient detail without being redundant. Ideally, the DMP will be published with the research data at the end of the project. | ||
|
||
# DMP tools | ||
## DMP tools | ||
Although it is generally possible to formulate a DMP in a text document, the use of more dynamic and machine-readable formats finally unlocks its full potential. | ||
|
||
* **[Research Data Management Organizer](https://rdmorganiser.github.io/) (RDMO)** is an open-source web application that has been widely adopted by institutes and consortia in Germany. RDMO supports the structured and collaborative planning and implementation of RDM and also enables the textual output of a DMP. | ||
|
@@ -69,7 +69,7 @@ RDMO organizes individual DMPs around predefined templates that reflect the requ | |
|
||
* **[DMPonline](https://dmponline.dcc.ac.uk/)** was developed by the [Digital Curation Centre](https://www.dcc.ac.uk/) (DCC) for the UK funding context but has also been used elsewhere. It is an open-source, web-based tool for researchers. It enables the creation, review, and sharing of DMPs that meet institutional and funder requirements. | ||
|
||
# Further resources | ||
## Further resources | ||
* Cessda - [Data Management Expert Guide](https://dmeg.cessda.eu/Data-Management-Expert-Guide) | ||
* [Content of a Data Management Plan](https://doi.org/10.18154/RWTH-2019-10064) | ||
* [Data Management Plan — the Turing Way - Data Management Plan](https://the-turing-way.netlify.app/reproducible-research/rdm/rdm-dmp.html) | ||
|
@@ -90,8 +90,8 @@ RDMO organizes individual DMPs around predefined templates that reflect the requ | |
* [SM Wizard](https://smw.ds-wizard.org/) | ||
* [Writing and using a software management plan](https://www.software.ac.uk/guide/writing-and-using-software-management-plan) | ||
|
||
# Get Help | ||
## Get Help | ||
If you have any further questions about the management and analysis of your microbial research data, please contact us: [[email protected]](mailto:[email protected]) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) | ||
|
||
# References | ||
## References | ||
{% bibliography --cited_in_order %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,10 +5,10 @@ layout: default | |
docs_css: markdown | ||
--- | ||
|
||
# Abstract | ||
## Abstract | ||
Aruna Object Storage (AOS) is a modern distributed storage platform designed to meet the increasing demand for effective data management and storage of scientific data. It is the central storage of the [Research Data Commons (RDC)](23-research-data-commons.html) cloud layer and the data foundation for the upper layers. It is a cloud-native, scalable system with an API and an S3-compatible interface. It allows resource organization into Objects, Datasets, Collections and Projects. Additionally, it provides an event-driven architecture which enables automation and data validation and improves the accessibility and reproducibility of scientific results. AOS is open-source and available at [https://aruna-storage.org](https://aruna-storage.org). | ||
|
||
# Factsheet | ||
## Factsheet | ||
* ![Aruna Object Storage Logo]({{ '/assets/img/aruna_dark_font.png' | relative_url }} "Aruna Object Storage Logo"){:width="20%"} | ||
* Status: V2.x BETA, V1.x deprecated | ||
* Current Version: V2.0.x beta | ||
|
@@ -18,7 +18,7 @@ Aruna Object Storage (AOS) is a modern distributed storage platform designed to | |
|
||
![AOS inside RDC]({{ '/assets/img/rdc_aruna.png' | relative_url }} "AOS inside RDC"){:width="70%"} | ||
|
||
# Overview | ||
## Overview | ||
AOS is a fast, secure and geo-redundant data storage system. It offers sophisticated metadata management according to the FAIR principles. It forms the foundation for the RDC's mediation and semantic layers and handles all stored data objects securely and data-agnostically. | ||
|
||
AOS key features are: | ||
|
@@ -33,21 +33,21 @@ Storing data in localized, domain-specific data silos has limited use for collab | |
|
||
![Aruna Object Storage Concept]({{ '/assets/img/concept_aruna.png' | relative_url }} "Aruna Object Storage Concept"){:width="40%"} | ||
|
||
# Getting started | ||
## Getting started | ||
AOS is located at [https://aruna-storage.org](https://aruna-storage.org). Users can log in there. Currently, the AAI of the GWDG is used for this purpose, which requires a user account at the GWDG, the DFN or at LifeScience AAI. Nevertheless, additional identity providers are possible. Thus, login via an SSO of NFDI4Biodiversity (and other NFDIs) will be supported when the service is established. After the AOS account has been activated, the user can create a project. Further users can then be activated for this project to enable data exchange and joint processing. The project can then be filled with data either via the API or via the S3 interface. | ||
|
||
![Aruna Object Storage Start Page]({{ '/assets/img/aruna-startpage-2023-7-28_8-24-10.png' | relative_url }} "Aruna Object Storage Start Page"){:width="60%"} | ||
|
||
# User Guide | ||
## User Guide | ||
AOS is primarily intended as a data backend for the RDC. For this reason, very few end users will use AOS directly. Data import, verification, transformation and processing are generally handled via the services in the mediation layer, which also ensures the consistency of the data. Users and services can be informed about changes to individual data objects or even entire projects via the AOS notification service and can thus react to these changes. | ||
|
||
# Developer Guide | ||
## Developer Guide | ||
The current documentation for using AOS is linked from the AOS home page at [https://aruna-storage.org](https://aruna-storage.org). This contains a complete description of the API. AOS consists of five main components: AOS Server, AOS Proxy, AOS API (and its S3 interface), AOS CLI and AOS Notification System. Of these components, the AOS team installs and maintains the servers and associated databases. AOS proxies can then be installed at various locations, which then communicate with the servers in each case. The actual data traffic from and to the storage backend then takes place via the AOS proxies. The interaction between a client and the proxies/servers takes place via the AOS API. To reduce the entry barrier, there is a command line interface called AOS CLI, which encapsulates API calls. Moreover, an S3 interface was implemented, since many software packages already support data storage via S3 as industry standard. Finally, the AOS notification system will soon be released to allow immediate response to changes in the AOS. This can be, for example, a data verification that is automatically initiated when a data upload is complete. | ||
|
||
## AOS infrastructure | ||
### AOS infrastructure | ||
The main component of AOS is a distributed database system. It synchronizes all data between several computers at different locations and thus provides fault tolerance through redundancy. This database is regularly backed up. The actual data can also be synchronized across multiple sites to provide redundancy. In any case, all data are stored at one location in a redundant system. Because data cannot be overwritten (new versions are created instead) and because data are stored redundantly at multiple levels, no backup of the data is currently performed. An implementation at a later date is being discussed. | ||
|
||
## AOS data structure | ||
### AOS data structure | ||
In version 1.x, AOS organizes data into Projects, Collections, Object Groups, and Objects. Starting with version 2.x, the data structure is more flexible: data are organized into Projects, Collections, Datasets, and Objects with a more flexible relation model. | ||
|
||
|![Aruna Object Storage Structure V1]({{ '/assets/img/aruna-1-structure.png' | relative_url }} "Aruna Object Storage Structure V1"){:width="50%"} | | ||
|
@@ -58,9 +58,9 @@ AOS organizes data in Version 1.x into Projects, Collections, Object Groups, and | |
|-| | ||
| UML diagram of the Aruna Object Storage data structure starting in version 2.0. All resources form a directed acyclic graph of belongs-to relationships (blue) with Projects as roots and Objects as leaves. Resources can also describe horizontal version relationships (orange), data/metadata relationships (yellow) or even custom user-defined relationships (green). | | ||
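The belongs-to hierarchy described in the caption can be modelled as a small directed acyclic graph. The sketch below is purely illustrative and is not the AOS API: the `Resource` class, its method names, and the three rules it enforces (Projects have no parents, Objects have no children, no cycles) are assumptions derived only from the caption text.

```python
from dataclasses import dataclass, field

# Resource kinds named in the article for version 2.x.
KINDS = ("Project", "Collection", "Dataset", "Object")


@dataclass(eq=False)  # compare resources by identity, not by field values
class Resource:
    name: str
    kind: str
    parents: list = field(default_factory=list)  # belongs-to edges

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown resource kind: {self.kind}")

    def ancestors(self):
        """Return every resource reachable by following belongs-to edges."""
        seen = set()
        stack = list(self.parents)
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(node.parents)
        return seen

    def add_parent(self, parent):
        """Add a belongs-to edge while keeping the graph acyclic."""
        if self.kind == "Project":
            raise ValueError("Projects are roots and cannot belong to anything")
        if parent.kind == "Object":
            raise ValueError("Objects are leaves and cannot contain anything")
        if parent is self or self in parent.ancestors():
            raise ValueError("this edge would create a cycle")
        self.parents.append(parent)


if __name__ == "__main__":
    project = Resource("demo-project", "Project")
    dataset = Resource("run-1", "Dataset")
    reads = Resource("reads.fastq.gz", "Object")
    dataset.add_parent(project)
    reads.add_parent(dataset)
    print(sorted(r.name for r in reads.ancestors()))
```

Version, data/metadata and user-defined relationships are left out here; they would be additional edge types on the same resources.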
|
||
# Get Help | ||
## Get Help | ||
If you have any further questions about the management and analysis of your microbial research data, please contact us: [[email protected]](mailto:[email protected]) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) | ||
|
||
# References | ||
## References | ||
* Documentation and Aruna start page: [https://aruna-storage.org](https://aruna-storage.org) | ||
* Source code: [https://github.com/ArunaStorage](https://github.com/ArunaStorage) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,10 +4,11 @@ category: RDM-Preserve | |
layout: default | ||
docs_css: markdown | ||
--- | ||
# Definition | ||
|
||
## Definition | ||
Digital preservation means taking certain measures to ensure that digital material can be found and can be accessed in the long term ("long-term accessibility of data"). It aims to preserve information in a way that is understandable and reusable for a specific community and to prove its authenticity. | ||
|
||
# Digital preservation for researchers | ||
## Digital preservation for researchers | ||
The sustainable handling of data by researchers naturally facilitates the long-term accessibility of data. Best practice methods are: | ||
* Cleaning data / data structures - see also: [Data Organisation](https://knowledgebase.nfdi4microbiota.de/RDM-Process/14-data-organization.html) | ||
* Validating data - see also: [Data Quality Control](https://knowledgebase.nfdi4microbiota.de/RDM-Collect/13-data-qc.html) | ||
|
@@ -18,7 +19,7 @@ The sustainable handling of data by researchers naturally facilitates the long-t | |
* Storing files on 2 different media types | ||
* Keeping at least 1 copy off site. | ||
|
||
## Data selection | ||
### Data selection | ||
To make a well-founded decision on data selection, we recommend reading the how-to guide of the Edinburgh Digital Curation Centre {% cite dcc_five_2014 %}. The suggested steps are: | ||
* **Step 1:** Identify purposes that the data could fulfill: consider the purpose or ‘reuse case’ of your data, including reuse outside your research group. | ||
* **Step 2:** Identify data that **must** be kept: consider legal or policy compliance risks, as well as funder requirements. | ||
|
@@ -27,7 +28,7 @@ To decide well-founded on data selection we recommend reading the how-to guide o | |
* **Step 5:** Complete the data appraisal, i.e. list what data must, should or could be kept to fulfill which potential reuse purposes. Summarize any actions needed to prepare the data for deposit - or justification for not keeping it. | ||
|
||
|
||
## Recommended file formats for preservation | ||
### Recommended file formats for preservation | ||
Making your research available in recommended file formats, in addition to the original software format, strongly supports the reusability and long-term accessibility of your data. | ||
Attributes of those file formats are: | ||
* Open rather than proprietary (examples of [open file formats](https://en.wikipedia.org/wiki/List_of_open_file_formats)) | ||
|
@@ -40,18 +41,18 @@ Attributes of those file formats are: | |
|
||
For biomaterial data, recommended formats are CSV, TXT and XML. | ||
|
||
# Digital preservation for repository operators | ||
## Digital preservation for repository operators | ||
|
||
Specific preservation measures depend on the digital objects, needs of the user community, and various other conditions. Repositories usually contain publications as files, making file format identification and validation relevant. | ||
|
||
## Bitstream preservation | ||
### Bitstream preservation | ||
Preservation on the bitstream level is the basis for digital preservation. It covers e. g. | ||
* Checking checksums of transferred files upon receiving them (or generating file checksums) and conducting regular fixity checks | ||
* Redundant storage of data | ||
* Generating backups (e. g. offline backups of the underlying repository database) | ||
* Strategies for updating storage media (according to e. g. server lifetime) | ||
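The checksum and fixity checks listed above can be sketched with Python's standard library. This is one possible approach under stated assumptions, not the procedure of any particular repository: SHA-256 is chosen as the algorithm, and the function names and the manifest layout (a mapping from file path to expected hex digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_check(manifest):
    """Return the paths that are missing or whose checksum has changed.

    `manifest` maps a file path to the hex digest recorded at ingest time
    (assumed layout, for illustration only).
    """
    failed = []
    for path, expected in manifest.items():
        file_path = Path(path)
        if not file_path.is_file() or sha256_of_file(file_path) != expected.lower():
            failed.append(str(path))
    return failed
```

A digest would be recorded once when a file is received and the check repeated on a schedule; an empty result means every file still matches its recorded digest.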
|
||
## Preservation beyond bitstream | ||
### Preservation beyond bitstream | ||
Preservation of file content, i.e. being able to open and render it correctly in software, is part of logical {% cite lindlar_2020_3672773 %} or technical preservation, also called digital curation. Semantic preservation is concerned with e. g. semantic drift impacting metadata. | ||
* Obtaining sufficient rights allowing e. g. format migrations, file repairs and re-use over the long-term like re-publication in other infrastructures | ||
* File format identification, based on format-specific bit patterns, e. g. via [DROID](https://coptr.digipres.org/index.php/DROID) during publication process | ||
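The idea of identifying a format by its bit pattern can be illustrated with a handful of well-known file signatures. This sketch is not how DROID works internally; DROID relies on a much larger signature registry. The `MAGIC_NUMBERS` table and the `identify_format` function are hypothetical names used for illustration only.

```python
# A few well-known leading byte sequences ("magic numbers").
MAGIC_NUMBERS = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"PK\x03\x04": "ZIP container",
    b"\x1f\x8b": "gzip",
}


def identify_format(data):
    """Return a format name if the leading bytes match a known signature."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return None


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.7\n..."))
    print(identify_format(b"plain text"))
```

Checking the content rather than the file extension catches files that were renamed or mislabelled before deposit.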
|
@@ -69,10 +70,10 @@ Preservation of file content, being able to open and render it correctly in a so | |
|
||
Many digital preservation criteria applying to repositories are also present in the certification criteria of the CoreTrustSeal and the nestor seal {% cite coretrustseal_standards_and_certificatio_2022_7051012 harmsen_henk_explanatory_2013 %}. | ||
|
||
# Get Help | ||
## Get Help | ||
If you have any further questions about the management and analysis of your microbial research data, please contact us: [[email protected]](mailto:[email protected]) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) | ||
|
||
# References | ||
## References | ||
{% bibliography --cited_in_order %} | ||
|
||
|