From f8936a1602ca219dad31d1121ecd7034fe9b3eb3 Mon Sep 17 00:00:00 2001 From: thoelken <5861076+thoelken@users.noreply.github.com> Date: Fri, 13 Sep 2024 11:06:01 +0200 Subject: [PATCH 1/2] Corrected headings level and capitalization in some articles --- .../01-privacy-policy-english-translation.md | 240 ++++++++++++++++++ docs/_Getting-Started/02-contributing.md | 2 +- .../01-governance-workflows.md | 2 +- docs/_RDM-Collect/13-data-qc.md | 10 +- docs/_RDM-Plan/01-dmp.md | 18 +- docs/_RDM-Preserve/24-aruna-object-storage.md | 20 +- docs/_RDM-Preserve/25-digital-preservation.md | 19 +- docs/_RDM-Process/14-data-organization.md | 60 ++--- docs/_RDM-Reuse/23-research-data-commons.md | 6 +- docs/_RDM-Reuse/27-data-reuse.md | 54 ++-- docs/_RDM-Share/19-collaboration-tools.md | 2 +- docs/_RDM-Share/20-pids.md | 34 +-- docs/_RDM-Share/22-data-repositories.md | 4 +- docs/_RDM-Share/26-licenses.md | 24 +- .../02-workflows.md | 2 +- .../03-software-containers.md | 22 +- .../04-resources.md | 30 +-- docs/_Research-Data-Management/01-rd.md | 10 +- docs/_Research-Data-Management/02-rdm.md | 3 +- docs/_Research-Data-Management/03-md.md | 22 +- docs/_Research-Data-Management/04-fair.md | 24 +- docs/_Resources/01-glossary.md | 2 +- .../02-external-training-resources.md | 2 +- docs/_Software-Development/02-toolsurvey.md | 4 +- 24 files changed, 427 insertions(+), 189 deletions(-) create mode 100644 docs/_Getting-Started/01-privacy-policy-english-translation.md diff --git a/docs/_Getting-Started/01-privacy-policy-english-translation.md b/docs/_Getting-Started/01-privacy-policy-english-translation.md new file mode 100644 index 00000000..bcb325c8 --- /dev/null +++ b/docs/_Getting-Started/01-privacy-policy-english-translation.md @@ -0,0 +1,240 @@ +# +--- +title: Privacy Policy +category: Getting-Started +layout: default +docs_css: markdown +hide: true +--- + +## DISCLAIMER +The following policy is an automated translation of the German text. Please refer to the German original for a legally binding document. + +## § 1 Information on data collection, controller, contacting us + +1. In the following, we provide information about the collection of personal data when using our websites www.nfdi4microbiota.de, www.nfdi4life.de, www.zbmed.de, www.livivo.de, www.publisso.de, https://books.publisso.de, https://repository.publisso.de and zbmedblog.de. Personal data is all data that can be related to you personally, e.g. name, address, e-mail addresses, user behavior. + +2. The controller pursuant to Art. 4 (7) of the EU General Data Protection Regulation (GDPR) is + + Deutsche Zentralbibliothek für Medizin (ZB MED) – Informationszentrum Lebenswissenschaften
+ Gleueler Straße 60
+ 50931 Köln
+ Tel.: +49 (0)221 478-5686 (Infocenter)
+ [E-mail](mailto:info@zbmed.de)
+ + Persons authorized to represent the company:
+ Commercial Managing Director: Gabriele Herrmann-Krotz, graduate economist
+ Scientific Director: Prof. Dr. Dietrich Rebholz-Schuhmann
+ + You can reach our data protection officer at Datenschutz@zbmed.de or our postal address with the addition “the data protection officer”. + +3. When you contact us by e-mail or via our contact forms, the data you provide (your e-mail address, your name and telephone number if applicable) will be stored by us in order to answer your questions. We delete the data arising in this context after storage is no longer required, or restrict processing if there are statutory retention obligations. + +4. If we use contracted service providers for individual functions of our offer or would like to use your data for advertising purposes, we will inform you in detail below about the respective processes. We will also state the specified criteria for the storage period. + +## § 2 Your rights + +1. You have the following rights vis-à-vis us with regard to your personal data: + +- Right of access, +- Right to rectification or erasure, +- Right to restriction of processing, +- Right to object to processing, +- Right to data portability. + +2. You also have the right to lodge a complaint with the competent data protection supervisory authority, the + + State Commissioner for Data Protection and Freedom of Information North Rhine-Westphalia
+ Kavalleriestr. 2-4
+ 40213 Düsseldorf
+ Telephone: 0211/38424-0
+ Fax: 0211/38424-10
+ [E-mail](mailto:poststelle@ldi.nrw.de)
+ [Website](https://www.ldi.nrw.de/index.php)
+ + to complain about the processing of your personal data by us. + +## § 3 What data we collect when you visit our website + +1. When using the website for information purposes only, i.e. if you do not register or otherwise provide us with information, we only collect the personal data that your browser transmits to our server. If you wish to view our website, we collect the following data, which is technically necessary for us to display our website to you and to ensure stability and security (legal basis is Art. 6 para. 1 sentence 1 lit. f GDPR): - IP address - date and time of the request - time zone difference to Greenwich Mean Time (GMT) - content of the request (specific page) - access status/HTTP status code - amount of data transferred in each case - website from which the request comes - browser - operating system and its interface - language and version of the browser software. + +2. In addition to the aforementioned data, cookies are stored on your computer when you use our website. Cookies are small text files that are stored locally in the cache of the visitor's Internet browser. Cookies enable the internet browser to be recognized and allow certain information to flow to the site that sets the cookie (in this case us). Cookies cannot execute programs or transfer viruses to your computer. They are used to make the website more user-friendly and effective overall. + +3. Use of cookies: + + a. For the websites mentioned under § 1 (1), we use the following types of cookies, the scope and function of which are explained below: - Transient cookies (see b) - Persistent cookies (see c). + + b. Transient cookies are automatically deleted when you close the browser. These include session cookies in particular. These store a so-called session ID, with which various requests from your browser can be assigned to the joint session. This allows your computer to be recognized when you return to our website. The session cookies are deleted when you log out or close the browser. + + c. Persistent cookies are automatically deleted after a specified period, which may vary depending on the cookie. You can delete the cookies in the security settings of your browser at any time. + + d. You can configure your browser settings according to your wishes and, for example, refuse to accept third-party cookies or all cookies. Please note that you may then not be able to use all the functions of this website. + + e. If you have an account at www.livivo.de, https://books.publisso.de or https://repository.publisso.de, we use cookies to identify you for subsequent visits. Otherwise you would have to log in again for each visit. + +## § 4 What other data we collect when you use the other functions and offers on our website and what we do with it + +1. In addition to the purely informational use of our website, we offer various services that you can use if you are interested. These include the use of www.livivo.de, https://books.publisso.de, https://repository.publisso.de and a library user account registered with us. To do so, you must generally provide additional personal data that we use to provide the respective service and to which the aforementioned data processing principles apply. + +2. In some cases, we use external service providers to process your data. These have been carefully selected and commissioned by us, are bound by our instructions and are regularly monitored. + +3. 
Furthermore, we may pass on your personal data to third parties if we offer participation in promotions, the conclusion of contracts or similar services together with partners. You will receive more detailed information on this when you provide your personal data or in the description of the offer below. + +4. If our service providers or partners are based in a country outside the European Economic Area (EEA), we will inform you of the consequences of this circumstance in the description of the offer. + +## § 5 Objection to/revocation of the processing of your data + +1. If you have given your consent to the processing of your data, you can withdraw this at any time. Such a revocation affects the permissibility of the processing of your personal data after you have expressed it to us. + +2. If we base the processing of your personal data on the balancing of interests, you can object to the processing. This is the case if, in particular, the processing is not necessary for the performance of a contract with you, which is described by us in the following description of the functions. When exercising such an objection, we ask you to explain the reasons why we should not process your personal data as we have done. In the event of your justified objection, we will examine the situation and will either stop or adapt the data processing or show you our compelling reasons worthy of protection on the basis of which we will continue the processing. + +3. Of course, you can object to the processing of your personal data for data analysis purposes at any time. + +## § 6 Use of our blog function, use of our web store, use of our portals + + Use of the blog functions In our blog (zbmedblog.de), in which we publish various articles on topics relating to our activities, you can make public comments. Your comment will be published with your specified user name next to the post. We recommend that you use a pseudonym instead of your real name. Your username and e-mail address are required, all other information is voluntary. If you leave a comment, we will continue to store your IP address. This storage is necessary for us to be able to defend ourselves against liability claims in the event of possible publication of unlawful content. We need your e-mail address in order to contact you if a third party objects to your comment as unlawful. The legal basis is Art. 6 para. 1 sentence 1 lit. b and f GDPR. Comments are checked before publication. We reserve the right to delete comments if they are objected to as unlawful by third parties. + +### Use of our web store + +1. If you wish to place an order in our web store at www.livivo.de, you must provide the personal data we require to process your order in order to conclude the contract. Mandatory information required for the processing of contracts is marked separately, other information is voluntary. We process the data you provide to process your order. For this purpose, we may forward your payment data to our house bank. The legal basis for this is Art. 6 para. 1 sentence 1 lit. b GDPR. + + We may also process the data you provide in order to inform you about other interesting products from our portfolio or to send you emails with technical information. + +2. Due to commercial and tax law requirements, we are obliged to store your address, payment and order data for a period of ten years. However, we restrict processing after two years, i.e. your data will only be used to comply with legal obligations. + +3. 
To prevent unauthorized access by third parties to your personal data, in particular financial data, the order process is encrypted using TLS technology. + +### Use of our portals www.livivo.de, www.publisso.de, https://books.publisso.de, https://repository.publisso.de + +1. If you wish to use our portals www.publisso.de, https://books.publisso.de and https://repository.publisso.de as an author, editor or reviewer, you must register by entering your e-mail address, a password of your choice and a user name of your choice. In order to be able to track whether and how you have been informed, we also collect the following data as part of the e-mail correspondence: Recipient, sender, time stamp, subject. On our portal https://books.publisso.de it is compulsory to use a clear name, pseudonymous use is not possible. + + If you wish to use www.livivo.de, you can register by entering your e-mail address, name and address as well as a password of your choice. Not all functions are available without registration. + + We use the so-called double opt-in procedure for registration on our portals, i.e. your registration is not complete until you have confirmed your registration by clicking on the link contained in a confirmation e-mail sent to you for this purpose. If you do not confirm your registration within 24 hours, your registration will be automatically deleted from our database. + + The provision of the aforementioned data is mandatory; you can provide all other information voluntarily by using our portal. + +2. If you use our portals, we store your data required for the fulfillment of the contract, for www.livivo.de also information on the method of payment, until you finally delete your access. Furthermore, we store the voluntary data you provide for the duration of your use of the portal, unless you delete it beforehand. You can manage and change all information in the protected customer area. The legal basis is Art. 6 para. 1 sentence 1 lit. f GDPR. + +3. If you use our portals https://books.publisso.de and https://repository.publisso.de, your data may be made accessible to other portal participants in accordance with the contractual service. All data provided is visible on our portal https://books.publisso.de. You have the option of deciding whether your data is displayed. Your nickname is visible on https://repository.publisso.de. Our moderators will also see the e-mail address you have entered. Members who are not logged in will not receive any information about you. Your user name and photo are visible to all registered members, regardless of whether you have shared them or not. In contrast, your entire profile with the data you have shared is visible to all members who have confirmed you as a personal contact. If you make content accessible to your personal contacts that you do not send by means of a private message, this content will be visible to third parties if your personal contact has approved it. If you post content in public groups, this content will be visible to all registered members of the portal. + +4. To prevent unauthorized access by third parties to your personal data, in particular financial data, the connection is encrypted using TLS technology. + +## § 7 Our newsletter + +1. With your consent, you can subscribe to our newsletter, with which we inform you about our current interesting offers. The advertised goods and services are named in the declaration of consent. + +2. We use the so-called double opt-in procedure to subscribe to our newsletter. 
This means that after you have registered, we will send you an e-mail to the specified e-mail address in which we ask you to confirm that you wish to receive the newsletter. If you do not confirm your registration within 24 hours, your information will be blocked and automatically deleted after one month. In addition, we store the IP addresses you use and the times of registration and confirmation. The purpose of this procedure is to be able to prove your registration and, if necessary, to clarify any possible misuse of your personal data. + +3. The only mandatory information for sending the newsletter is your e-mail address. The provision of further, separately marked data is voluntary and is used to address you personally. After your confirmation, we will store your e-mail address for the purpose of sending you the newsletter. The legal basis is Art. 6 para. 1 sentence 1 lit. a GDPR. + +4. You can revoke your consent to receive the newsletter at any time and unsubscribe from the newsletter. You can declare your revocation by clicking on the link provided in every newsletter e-mail, for our ZB MED newsletter by e-mail to webredaktion@zbmed.de or by sending a message to the contact details given in the imprint. + +## § 8 Use of web analytics + +### Use of Google Analytics + +1. Our websites www.zbmed.de, www.livivo.de, www.publisso.de use Google Analytics, a web analysis service of Google Inc. (“Google”). Google Analytics uses “cookies”, which are text files placed on your computer, to help the website analyze how users use the site. The information generated by the cookie about your use of this website is usually transmitted to a Google server in the USA and stored there. However, if IP anonymization is activated on this website, your IP address will first be truncated by Google within member states of the European Union or in other states party to the Agreement on the European Economic Area. Only in exceptional cases will the full IP address be transmitted to a Google server in the USA and truncated there. Google will use this information on behalf of the operator of this website for the purpose of evaluating your use of the website, compiling reports on website activity and providing other services relating to website activity and internet usage to the website operator. + +2. The IP address transmitted by your browser as part of Google Analytics will not be merged with other Google data. + +3. You may refuse the use of cookies by selecting the appropriate settings on your browser, however please note that if you do this you may not be able to use the full functionality of this website. You can also prevent Google from collecting the data generated by the cookie and relating to your use of the website (including your IP address) and from processing this data by Google by downloading and installing the browser plug-in available at the following link: http://tools.google.com/dlpage/gaoptout?hl=de. + +4. The websites named in paragraph (1) use Google Analytics with the extension “anonymizeIp()”. This means that IP addresses are further processed in abbreviated form, so that they cannot be linked to a specific person. If the data collected about you is personally identifiable, it is immediately excluded and the personal data is deleted immediately. + +5. We use Google Analytics to analyze and regularly improve the use of our website. We can use the statistics obtained to improve our offer and make it more interesting for you as a user. 
For the exceptional cases in which personal data is transferred to the USA, Google has submitted to the EU-US Privacy Shield, https://www.privacyshield.gov/EU-US-Framework. + + The legal basis for the use of Google Analytics is Art. 6 para. 1 sentence 1 lit. f GDPR. + +6. Information from the third-party provider: Google Dublin, Google Ireland Ltd, Gordon House, Barrow Street, Dublin 4, Ireland, Fax: +353 (1) 436 1001. Terms of use: http://www.google.com/analytics/terms/de.html, overview of data protection: http://www.google.com/intl/de/analytics/learn/privacy.html, as well as the privacy policy: http://www.google.de/intl/de/policies/privacy. + +### Use of Piwik/Matomo + +1. Our websites https://books.publisso.de and https://repository.publisso.de use the web analysis service Piwik/Matomo to analyze and regularly improve the use of our website. We can use the statistics obtained to improve our offer and make it more interesting for you as a user. The legal basis for the use of Piwik/Matomo is Art. 6 para. 1 sentence 1 lit. f GDPR. + +2. Cookies (see § 3 for more details) are stored on your computer for this analysis. The information collected in this way is stored by the controller exclusively on its server in Germany. You can stop the analysis by deleting existing cookies and preventing the storage of cookies. If you prevent the storage of cookies, we would like to point out that you may not be able to use this website to its full extent. You can prevent the storage of cookies by changing the settings in your browser. You can prevent the use of Piwik/Matomo by unchecking the following box to activate the opt-out plug-in: [Piwik/Matomo iFrame]. + +3. This website uses Piwik/Matomo with the “AnonymizeIP” extension. This means that IP addresses are further processed in abbreviated form, so that direct personal references can be ruled out. The IP address transmitted by your browser using Piwik/Matomo is not merged with other data collected by us. + +4. The Piwik/Matomo program is an open source project. Information from the third-party provider on data protection can be found at http://piwik.org/privacy/policy. + +### Use of Jetpack/former WordPress.com-Stats + +1. zbmedblog.de uses the web analysis service Jetpack (formerly: WordPress.com-Stats) to analyze and regularly improve the use of our website. We can use the statistics obtained to improve our offer and make it more interesting for you as a user. We also use the system for measures to protect the security of the website, e.g. to detect attacks or viruses. For the exceptional cases in which personal data is transferred to the USA, Automattic Inc. has submitted to the EU-US Privacy Shield, https://www.privacyshield.gov. The legal basis for the use of Jetpack is Art. 6 para. 1 sentence 1 lit. f GDPR. + +2. Cookies (see § 3 for more details) are stored on your computer for this analysis. The information collected in this way is stored on a server in the USA. If you prevent the storage of cookies, we would like to point out that you may not be able to use this website to its full extent. You can prevent the storage of cookies by changing the settings in your browser or by clicking the “Click here to Opt-out” button at http://www.quantcast.com/opt-out. + +3. This website uses Jetpack with an extension that processes IP addresses in abbreviated form immediately after they are collected in order to prevent them from being linked to individuals. + +4. 
Information from the third-party provider: Automattic Inc, 60 29 th Street #343, San Francisco, CA 94110-4929, USA, https://automattic.com/privacy, and the third-party provider of the tracking technology: Quantcast Inc, 201 3 rd St, Floor 2, San Francisco, CA 94103-3153, USA, https://www.quantcast.com/privacy. + +## § 9 Social media + +### Use of social media buttons + +1. We currently use the following social media buttons via “c't Shariff”: Youtube, Flickr, Google+, Twitter, Xing, LinkedIn. The c't Shariff project replaces the usual social network share buttons and protects your surfing behavior from prying eyes. Nevertheless, a single click on the button is enough to share information with others. The usual social media buttons transmit user data to Facebook & Co. every time you visit a page and provide the social networks with precise information about your surfing behavior (user tracking). You do not need to be logged in or a member of the network to do this. In contrast, a Shariff button only establishes direct contact between the social network and the visitor when the latter actively clicks on the share button. In this way, Shariff prevents you from leaving a digital trail on every page you visit. Thanks to Shariff, the display of “likes”, “+1s” or “tweets” comes from the operator of the page with the buttons. Shariff acts as an intermediate instance here: Instead of the browser, the website operator's server queries the number of “likes”, “+1s” or “tweets” - and only once a minute to keep traffic within limits. The visitor remains anonymous. Further information on “c't Shariff” can be found at: https://www.heise.de/ct/artikel/Shariff-Social-Media-Buttons-mit-Datenschutz-2467514.html and at: https://www.heise.de/ct/ausgabe/2014-26-Social-Media-Buttons-datenschutzkonform-nutzen-2463330.html. + +2. Address of the respective button provider and URL with its data protection information: Heise Medien GmbH & Co. KG, Karl-Wiechert-Allee 10, 30625 Hanover, Germany; https://www.heise.de/Privacy-Policy-der-Heise-Medien-GmbH-Co-KG-4860.html. + +### Integration of YouTube videos + +1. We have integrated YouTube videos into our online offering, which are stored on http://www.YouTube.com and can be played directly from our website. These are all integrated in “extended data protection mode”, i.e. no data about you as a user is transferred to YouTube if you do not play the videos. Only when you play the videos will the data mentioned in paragraph 2 be transmitted. We have no influence on this data transfer. + +2. By visiting the website, YouTube receives the information that you have accessed the corresponding subpage of our website. In addition, the data mentioned under § 3 of this declaration will be transmitted. This occurs regardless of whether YouTube provides a user account through which you are logged in or whether no user account exists. If you are logged in to Google, your data will be assigned directly to your account. If you do not wish your data to be associated with your YouTube profile, you must log out before activating the button. YouTube stores your data as usage profiles and uses them for the purposes of advertising, market research and/or the needs-based design of its website. Such an evaluation is carried out in particular (even for users who are not logged in) to provide needs-based advertising and to inform other users of the social network about your activities on our website. 
You have the right to object to the creation of these user profiles, whereby you must contact YouTube to exercise this right. + +3. Further information on the purpose and scope of data collection and its processing by YouTube can be found in YouTube's privacy policy. There you will also find further information on your rights and setting options to protect your privacy: https://www.google.de/intl/de/policies/privacy. Google also processes your personal data in the USA and has submitted to the EU-US Privacy Shield, https://www.privacyshield.gov/EU-US-Framework. + +### Integration of Google Maps + +1. We use the Google Maps service on our website www.zbmed.de. This allows us to show you interactive maps directly on the website and enables you to use the map function conveniently. + +2. When you visit the website, Google receives the information that you have accessed the corresponding subpage of our website. In addition, the data mentioned under § 3 of this declaration will be transmitted. This takes place regardless of whether Google provides a user account through which you are logged in or whether no user account exists. If you are logged in to Google, your data will be assigned directly to your account. If you do not wish your data to be associated with your Google profile, you must log out before activating the button. Google stores your data as usage profiles and uses them for the purposes of advertising, market research and/or the needs-based design of its website. Such an evaluation is carried out in particular (even for users who are not logged in) to provide needs-based advertising and to inform other users of the social network about your activities on our website. You have the right to object to the creation of these user profiles, whereby you must contact Google to exercise this right. + +3. Further information on the purpose and scope of data collection and its processing by the plug-in provider can be found in the provider's privacy policy. There you will also find further information on your rights in this regard and setting options to protect your privacy: http://www.google.de/intl/de/policies/privacy. Google also processes your personal data in the USA and has submitted to the EU-US Privacy Shield, https://www.privacyshield.gov/EU-US-Framework. + +## § 10 Online advertising + +### Use of Google Adwords Conversion6 + +1. At www.zbmed.de, www.publisso.de and www.livivo.de, we use Google Adwords to draw attention to our offers with the help of advertising material (so-called Google Adwords) on external websites. In relation to the advertising campaign data, we can determine how successful the individual advertising measures are. We are interested in showing you advertising that is of interest to you, making our website more interesting for you and achieving a fair calculation of advertising costs. + +2. These advertising materials are delivered by Google via so-called “ad servers”. For this purpose, we use ad server cookies, through which certain parameters for measuring success, such as the display of ads or clicks by users, can be measured. If you access our website via a Google ad, Google Adwords will store a cookie on your PC. This cookie generally loses its validity after 30 days and is not intended to identify you personally. 
The unique cookie ID, number of ad impressions per placement (frequency), last impression (re-relevant for post-view conversions) and opt-out information (marking that the user no longer wishes to be addressed) are usually stored as analysis values for this cookie. + +3. These cookies enable Google to recognize your internet browser. If a user visits certain pages of an Adwords customer's website and the cookie stored on their computer has not yet expired, Google and the customer can recognize that the user clicked on the ad and was redirected to this page. A different cookie is assigned to each Adwords customer. Cookies can therefore not be tracked via the websites of Adwords customers. We ourselves do not collect and process any personal data in the aforementioned advertising measures. We only receive statistical evaluations from Google. Based on these evaluations, we can recognize which of the advertising measures used are particularly effective. We do not receive any further data from the use of the advertising material; in particular, we cannot identify the users on the basis of this information. + +4. Due to the marketing tools used, your browser automatically establishes a direct connection with the Google server. We have no influence on the scope and further use of the data collected by Google through the use of this tool and therefore inform you according to our state of knowledge: Through the integration of AdWords Conversion, Google receives the information that you have accessed the corresponding part of our website or clicked on an advertisement from us. If you are registered with a Google service, Google can assign the visit to your account. Even if you are not registered with Google or have not logged in, it is possible that the provider will find out your IP address and store it. + +5. You can prevent participation in this tracking process in various ways: + + a. by setting your browser software accordingly, in particular by suppressing third-party cookies so that you do not receive any ads from third-party providers; + + b. by deactivating cookies for conversion tracking by setting your browser to block cookies from the domain www.googleadservices.com, https://www.google. de/settings/ads, whereby this setting will be deleted if you delete your cookies; + + c. by deactivating the interest-based ads of the providers that are part of the “About Ads” self-regulation campaign via the link http://www.aboutads.info/choices, whereby this setting will be deleted if you delete your cookies; + + d. by permanently deactivating them in your Firefox, Internet Explorer or Google Chrome browsers under the link http://www.google.com/settings/ads/plugin. We would like to point out that in this case you may not be able to use all functions of this website to their full extent. + +6. The legal basis for the processing of your data is Art. 6 para. 1 sentence 1 lit. f GDPR. Further information on data protection at Google can be found here: http://www.google.com/intl/de/policies/privacy and https://services.google.com/sitestats/de.html. Alternatively, you can visit the website of the Network Advertising Initiative (NAI) at http://www.networkadvertising.org. Google has submitted to the EU-US Privacy Shield, https://www.privacyshield.gov/EU-US-Framework. + + Google Remarketing In addition to Adwords Conversion, we use the Google Remarketing application on www.zbmed.de, www.publisso.de and www.livivo.de. This is a procedure with which we would like to address you again. 
This application allows our advertisements to be displayed to you when you continue to use the Internet after visiting our website. This is done by means of cookies stored in your browser, which are used by Google to record and evaluate your usage behavior when you visit various websites. This enables Google to determine your previous visit to our website. According to its own statements, Google does not merge the data collected in the context of remarketing with your personal data, which may be stored by Google. In particular, according to Google, pseudonymization is used in remarketing. + +### DoubleClick by Google8 + +1. On www.zbmed.de, www.publisso.de and www.livivo.de we continue to use the online marketing tool DoubleClick by Google. DoubleClick uses cookies to display ads that are relevant to users, to improve campaign performance reports or to prevent a user from seeing the same ads more than once. Google uses a cookie ID to record which ads are displayed in which browser and can thus prevent them from being displayed more than once. In addition, DoubleClick can use cookie IDs to record so-called conversions that are related to ad requests. This is the case, for example, when a user sees a DoubleClick ad and later visits the advertiser's website with the same browser and makes a purchase there. According to Google, DoubleClick cookies do not contain any personal information. + +2. Due to the marketing tools used, your browser automatically establishes a direct connection with the Google server. We have no influence on the scope and further use of the data collected by Google through the use of this tool and therefore inform you according to our state of knowledge: Through the integration of DoubleClick, Google receives the information that you have called up the corresponding part of our website or clicked on an advertisement from us. If you are registered with a Google service, Google can assign the visit to your account. Even if you are not registered with Google or have not logged in, there is a possibility that the provider will find out your IP address and store it. + +3. You can prevent participation in this tracking process in various ways: + + a. by setting your browser software accordingly, in particular by suppressing third-party cookies so that you do not receive ads from third-party providers; + + b. by deactivating cookies for conversion tracking by setting your browser to block cookies from the domain www.googleadservices.com, https://www.google.de/settings/ads, whereby this setting will be deleted if you delete your cookies; + + c. by deactivating the interest-based ads of the providers that are part of the “About Ads” self-regulation campaign via the link http://www.aboutads.info/choices, whereby this setting will be deleted if you delete your cookies; + + d. by permanently deactivating them in your Firefox, Internet Explorer or Google Chrome browsers via the link http://www.google.com/settings/ads/plugin. We would like to point out that in this case you may not be able to use all the functions of this website to their full extent. + +4. The legal basis for the processing of your data is Art. 6 para. 1 sentence 1 lit. f GDPR. Further information on DoubleClick by Google can be found at https://www.google.de/doubleclick and https://support.google.com/campaignmanager/answer/2839090, as well as on data protection at Google in general: https://www.google.de/intl/de/policies/privacy. 
Alternatively, you can visit the website of the Network Advertising Initiative (NAI) at http://www.networkadvertising.org. Google has submitted to the EU-US Privacy Shield, https://www.privacyshield.gov/EU-US-Framework. + Our privacy policy is based on the forms from: **Koreng, Ansgar/Lachenmann, Matthias et al. (eds.), „Formularhandbuch Datenschutzrecht“, C. H. Beck, München, 2nd edition, 2018.**<br/>
+ We have adapted these according to our needs. diff --git a/docs/_Getting-Started/02-contributing.md b/docs/_Getting-Started/02-contributing.md index 3c70ff42..438f2c57 100644 --- a/docs/_Getting-Started/02-contributing.md +++ b/docs/_Getting-Started/02-contributing.md @@ -1,5 +1,5 @@ --- -title: Contributing to the NFDI4Microbiota Knowledge Base +title: How to Contribute category: Getting-Started layout: default docs_css: markdown diff --git a/docs/_How-We-Operate/01-governance-workflows.md b/docs/_How-We-Operate/01-governance-workflows.md index 462b077d..890be086 100644 --- a/docs/_How-We-Operate/01-governance-workflows.md +++ b/docs/_How-We-Operate/01-governance-workflows.md @@ -1,5 +1,5 @@ --- -title: Governance workflows +title: Governance Workflows category: How-We-Operate layout: default docs_css: markdown diff --git a/docs/_RDM-Collect/13-data-qc.md b/docs/_RDM-Collect/13-data-qc.md index 355d5340..9d7d7307 100644 --- a/docs/_RDM-Collect/13-data-qc.md +++ b/docs/_RDM-Collect/13-data-qc.md @@ -12,7 +12,7 @@ Legend: * END = no solution, this problem is unsolvable -# RNA-seq +## RNA-seq 1. high peak at low bp in the electropherogram (intensity mV per Size bp) - **source**: documentation (PDF) - **possible reason(s)**: contamination e.g. adapter dimers (adapter+adapter, no DNA) @@ -149,7 +149,7 @@ Legend: - **possible reason(s)**: humans are bad with ratios (0.01 = almost 0 and 100 is just large but not the largest bar ever) - **solution/measure**: use any log transformation (e.g. log10: 0.01 => -2, 100 => +2) -# Single cell +## Single cell ### Quality check 1. peak at left/right side in gene or reads per cell histogram or log10-cummulative-number of reads per cell id @@ -191,9 +191,5 @@ Legend: - **possible reason(s)**: some genes can be interpreted as dates when using excel for data handling - **solution/measure**: never ever use excel or at least make sure that cell type is not "AUTO" -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) - -# Further resources - -# References diff --git a/docs/_RDM-Plan/01-dmp.md b/docs/_RDM-Plan/01-dmp.md index a1a47a95..fb0069fa 100644 --- a/docs/_RDM-Plan/01-dmp.md +++ b/docs/_RDM-Plan/01-dmp.md @@ -5,12 +5,12 @@ layout: default docs_css: markdown --- -# Introduction +## Introduction A Data Management Plan (DMP) is a formal and living document that defines responsibilities and provides guidance. It describes data and data management during the project and measures for archiving and making data and research results available, usable, and understandable after the project has ended. DMPs are required in [DFG funding proposals since 2022](https://www.dfg.de/en/research_funding/announcements_proposals/2022/info_wissenschaft_22_25/index.html) and in [EU Funding Programs 2021-2027](https://ec.europa.eu/info/funding-tenders/opportunities/docs/2021-2027/common/guidance/aga_en.pdf). For funders, DMPs serve as a reporting tool to hold grantees accountable for conducting good and open science, with regular updates or in case of changes. For researchers and other stakeholders, DMPs are meant to be a living document that accompanies them from proposal writing or project start to the sharing of their data and results. 
-# Content of DMPs +## Content of DMPs DMPs typically include the following information: * Administrative project-specific information (including a description of the research project) * Roles, responsibilities and obligations @@ -27,7 +27,7 @@ DMPs typically include the following information: To find a TDR, see the [Data Repositories page of the Knowledge Base]({% link _RDM-Share/22-data-repositories.md %}). -# DMP templates and examples +## DMP templates and examples **Templates** * [NFDI4Microbiota's template](https://doi.org/10.5281/zenodo.13628589) @@ -38,7 +38,7 @@ To find a TDR, see the [Data Repositories page of the Knowledge Base]({% link _R * [DD-DeCaF Bioinformatics Services for Data-Driven Design of Cell Factories and Communities](https://phaidra.univie.ac.at/o:1139495) * [METASTAVA](https://doi.org/10.5281/zenodo.5841166) -# Benefits of a DMP +## Benefits of a DMP If implemented correctly, a DMP can [benefit all stakeholders](https://doi.org/10.1371/journal.pcbi.1006750) in a research project, despite the initial cost of creating the DMP itself. A DMP can **save time and nerves** for yourself and others by planning ahead. DMPs define roles, responsibilities, and efforts regarding the data and its management. Writing a DMP will also get you in touch with IT staff and your institution's data protection officer at an early stage. Writing a DMP also ensures data quality and allows you to easily trace your processing steps, making your analysis and results reproducible. Writing a DMP also allows you to manage access rights and prevent security breaches. Finally, by writing your DMP, you may be able to identify gaps and vulnerabilities in your current data management strategy at an early stage and outline solutions to fill them. @@ -47,7 +47,7 @@ A DMP can also facilitate and **harmonize the coordination and shared use of dat DMPs offer **other benefits**, such as enabling verification and control: researchers are accountable for how their data are managed during their research project. They also help to identify - and potentially minimize - time and money costs that need to be included in the proposal, such as for Research Data Management (RDM) activities. They also help to comply with Good Research Practice (GRP), support research integrity, and ensure that ethical and legal requirements are met. DMPs also help to meet institutional and funder requirements: funding agencies increasingly require information on the management of research data, and a DMP allows you to structure and formalize this information. Last but not least, DMPs facilitate data reuse, thereby increasing data citation and advancing scientific progress. -# Writing a DMP +## Writing a DMP **Who is involved in the creation of the DMP?** Entities involved in the creation of a DMP are researchers, RDM staff (check your institution's [research data policy](https://www.forschungsdaten.org/index.php/Forschungsdaten-Policies) and ask for [local support](https://www.forschungsdaten.org/index.php/FDM-Kontakte)) and central infrastructure (e.g. computer center, library). @@ -57,7 +57,7 @@ DMPs offer **other benefits**, such as enabling verification and control: resear **DMP quality check:** A good DMP is well structured and distinguishes between actions to be taken during and after the project. It is a living document that needs to be updated regularly and is for the use of all project stakeholders. 
It should be started as early as possible, be as concise as possible, as long as necessary, and contain sufficient detail without being redundant. Ideally, the DMP will be published with the research data at the end of the project. -# DMP tools +## DMP tools Although it is generally possible to formulate a DMP in a text document, the use of more dynamic and machine-readable formats finally unlocks its full potential. * **[Research Data Management Organizer](https://rdmorganiser.github.io/) (RDMO)** is an open-source web application that has been widely adopted by institutes and consortia in Germany. RDMO supports the structured and collaborative planning and implementation of RDM and also enables the textual output of a DMP. @@ -69,7 +69,7 @@ RDMO organizes individual DMPs around predefined templates that reflect the requ * **[DMPonline](https://dmponline.dcc.ac.uk/)** was developed by the [Digital Curation Center](https://www.dcc.ac.uk/) (DCC) for the UK funding context but has also been used elsewhere. It is an open-source, web-based tool for researchers. It enables the creation, review, and sharing of DMPs that meet institutional and funder requirements. -# Further resources +## Further resources * Cessda - [Data Management Expert Guide](https://dmeg.cessda.eu/Data-Management-Expert-Guide) * [Content of a Data Management Plan](https://doi.org/10.18154/RWTH-2019-10064) * [Data Management Plan — the Turing Way - Data Management Plan](https://the-turing-way.netlify.app/reproducible-research/rdm/rdm-dmp.html) @@ -90,8 +90,8 @@ RDMO organizes individual DMPs around predefined templates that reflect the requ * [SM Wizard](https://smw.ds-wizard.org/) * [Writing and using a software management plan](https://www.software.ac.uk/guide/writing-and-using-software-management-plan) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) -# References +## References {% bibliography --cited_in_order %} diff --git a/docs/_RDM-Preserve/24-aruna-object-storage.md b/docs/_RDM-Preserve/24-aruna-object-storage.md index ca2fcfda..6cb3af2a 100644 --- a/docs/_RDM-Preserve/24-aruna-object-storage.md +++ b/docs/_RDM-Preserve/24-aruna-object-storage.md @@ -5,10 +5,10 @@ layout: default docs_css: markdown --- -# Abstract +## Abstract Aruna Object Storage (AOS) is a modern distributed storage platform designed to meet the increasing demand for effective data management and storage of scientific data. It is the central storage of the [Research Data Commons (RDC)](23-research-data-commons.html) cloud layer and the data foundation for the upper layers. It is a cloud-native, scalable system with an API and a S3-compatible interface. It allows resource organization into Objects, Datasets, Collections and Projects. Additionally, it provides an event-driven architecture which enables automation, data validation and improves accessibility and reproducibility of scientific results. AOS is open-source and available at [https://aruna-storage.org](https://aruna-storage.org). 
-# Factsheet +## Factsheet * ![Aruna Object Storage Logo]({{ '/assets/img/aruna_dark_font.png' | relative_url }} "Aruna Object Storage Logo"){:width="20%"} * Status: V2.x BETA, V1.x deprecated * Current Version: V2.0.x beta @@ -18,7 +18,7 @@ Aruna Object Storage (AOS) is a modern distributed storage platform designed to ![AOS inside RDC]({{ '/assets/img/rdc_aruna.png' | relative_url }} "AOS inside RDC"){:width="70%"} -# Overview +## Overview AOS is a fast, secure and geo-redundant data storage. It offers a sophisticated metadata management according to the FAIR principles. It builds the foundation for RDCs mediation and semantic layer and and handles all stored data objects secure, and data-agnostically. AOS key features are: @@ -33,21 +33,21 @@ Storing data in localized, domain-specific data silos has limited use for collab ![Aruna Object Storage Concept]({{ '/assets/img/concept_aruna.png' | relative_url }} "Aruna Object Storage Concept"){:width="40%"} -# Getting started +## Getting started AOS is located at [https://aruna-storage.org](https://aruna-storage.org). Users can log in there. Currently, the AAI of the GWDG is used for this purpose, which requires a user account at the GWDG, the DFN or at LifeScience AAI. Nevertheless, additional identity providers are possible. Thus, login via an SSO of NFDI4Biodiversity (and other NFDIs) will be supported when the service is established. After the AOS account has been activated, the user can create a project. Further users can then be activated for this project to enable data exchange and joint processing. The project can then be filled with data either via the API or via the S3 interface. ![Aruna Object Storage Start Page]({{ '/assets/img/aruna-startpage-2023-7-28_8-24-10.png' | relative_url }} "Aruna Object Storage Start Page"){:width="60%"} -# User Guide +## User Guide Basically, AOS is intended as a data backend for the RDC. For this reason, very few end users will use AOS directly. Data import, verification, transformation and processing is basically possible via the services in the mediation layer. This also ensures the consistency of the data. Users and services can be informed about changes to individual data objects or even entire projects via the AOS notification service and can thus react to these changes. -# Developer Guide +## Developer Guide The current documentation for using AOS is linked from the AOS home page at [https://aruna-storage.org](https://aruna-storage.org). This contains a complete description of the API. AOS consists of five main components: AOS Server, AOS Proxy, AOS API (and its S3 interface), AOS CLI and AOS Notification System. Of these components, the AOS team installs and maintains the servers and associated databases. AOS proxies can then be installed at various locations, which then communicate with the servers in each case. The actual data traffic from and to the storage backend then takes place via the AOS proxies. The interaction between a client and the proxies/servers takes place via the AOS API. To reduce the entry barrier, there is a command line interface called AOS CLI, which encapsulates API calls. Moreover, an S3 interface was implemented, since many software packages already support data storage via S3 as industry standard. Finally, the AOS notification system will soon be released to allow immediate response to changes in the AOS. This can be, for example, a data verification that is automatically initiated when a data upload is complete. 
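Since the developer guide above notes that AOS offers an S3-compatible interface alongside its native API, generic S3 client libraries can usually be pointed at an AOS proxy endpoint. The following is only a minimal sketch using the Python library `boto3`; the endpoint URL, credentials and bucket/key names are placeholders rather than official AOS values, so consult the documentation linked at [https://aruna-storage.org](https://aruna-storage.org) for the authoritative workflow.

```python
# Minimal sketch of data transfer via an S3-compatible endpoint with boto3.
# The endpoint URL, credentials and bucket/key names are placeholders,
# not official AOS values -- see https://aruna-storage.org for details.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://proxy.example-aos-instance.org",  # hypothetical AOS proxy URL
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Upload a data object; the bucket stands in for a project/collection here.
s3.upload_file(
    Filename="sample_reads.fastq.gz",
    Bucket="my-project-collection",
    Key="raw/sample_reads.fastq.gz",
)

# Download it again, e.g. to verify that the object is accessible.
s3.download_file(
    Bucket="my-project-collection",
    Key="raw/sample_reads.fastq.gz",
    Filename="sample_reads_copy.fastq.gz",
)
```

The AOS CLI mentioned above encapsulates the same API calls, so the command line can be used instead of a client library where that is more convenient.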
-## AOS infrastructure +### AOS infrastructure The main component of AOS is a distributed database system. It synchronizes all data between several computers at different locations and thus generates fail-safety via this redundancy. This database is regularly backed up. The actual data can also be synchronized across multiple sites to provide redundancy. Nevertheless, all data will also be stored at one location in a redundant system. Due to the fact that data cannot be overwritten, but new versions of the data are then created, in combination with the redundant data storage at multiple levels, no backup of the data is currently performed. An implementation at a later date is currently being discussed. -## AOS data structure +### AOS data structure AOS organizes data in Version 1.x into Projects, Collections, Object Groups, and Objects, starting with version 2.x the data structure will be even more flexible and are organized into Projects, Collections, Datasets, and Objects with a more flexible relation model. |![Aruna Object Storage Structure V1]({{ '/assets/img/aruna-1-structure.png' | relative_url }} "Aruna Object Storage Structure V1"){:width="50%"} | @@ -58,9 +58,9 @@ AOS organizes data in Version 1.x into Projects, Collections, Object Groups, and |-| | UML diagram of the Aruna Object Storage data structure starting in Version v2.0. All resources form a directed acyclic graph of belongs to relationships (blue) with Projects as roots and Objects as leaves. Resources can also describe horizontal version relationships (orange), data/metadata relationships (yellow) or even custom user-defined relationships (green). | -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) -# References +## References * Dokumentation and Aruna start page: [https://aruna-storage.org](https://aruna-storage.org) * Source-Code: [https://github.com/ArunaStorage](https://github.com/ArunaStorage) diff --git a/docs/_RDM-Preserve/25-digital-preservation.md b/docs/_RDM-Preserve/25-digital-preservation.md index 12cb42a8..4c60e372 100644 --- a/docs/_RDM-Preserve/25-digital-preservation.md +++ b/docs/_RDM-Preserve/25-digital-preservation.md @@ -4,10 +4,11 @@ category: RDM-Preserve layout: default docs_css: markdown --- -# Definition + +## Definition Digital preservation means taking certain measures to ensure that digital material can be found and can be accessed in the long term ("long-term accessibility of data"). It aims to preserve information in a way that is understandable and reusable for a specific community and to prove its authenticity. -# Digital preservation for researchers +## Digital preservation for researchers The sustainable handling of data by researchers naturally facilitates the long-term accessibility of data. Best practice methods are: * Cleaning data / data structures - see also: [Data Organisation](https://knowledgebase.nfdi4microbiota.de/RDM-Process/14-data-organization.html) * Validating data - see also: [Data Quality Control](https://knowledgebase.nfdi4microbiota.de/RDM-Collect/13-data-qc.html) @@ -18,7 +19,7 @@ The sustainable handling of data by researchers naturally facilitates the long-t * Storing files on 2 different media types * Keeping at least 1 copy off site. 
-## Data selection +### Data selection To decide well-founded on data selection we recommend reading the how-to guide of the Edinburgh Digital Curation Centre {% cite dcc_five_2014 %}. The suggested steps are: * **Step 1:** Identify purposes that the data could fulfill: consider the purpose or ‘reuse case’ of your data, including reuse outside your research group. * **Step 2:** Identify data that **must** be kept: consider legal or policy compliance risks, as well as funder requirements. @@ -27,7 +28,7 @@ To decide well-founded on data selection we recommend reading the how-to guide o * **Step 5:** Complete the data appraisal, i.e. list what data must, should or could be kept to fulfill which potential reuse purposes. Summarize any actions needed to prepare the data for deposit - or justification for not keeping it. -## Recommended file formats for preservation +### Recommended file formats for preservation Making your research available in recommended file formats additional to the original software format supports highly the reusability and long-term accessibility of your data. Attributes of those file formats are: * Open rather than proprietary (examples for [open files formats](https://en.wikipedia.org/wiki/List_of_open_file_formats)) @@ -40,18 +41,18 @@ Attributes of those file formats are: For biomaterial data, recommended formats are CSV, TXT and XML. -# Digital preservation for repository operators +## Digital preservation for repository operators Specific preservation measures depend on the digital objects, needs of the user community, and various other conditions. Repositories usually contain publications as files, making file format identification and validation relevant. -## Bitstream preservation +### Bitstream preservation Preservation on the bitstream level is the basis for digital preservation. It covers e. g. * Checking checksums of transferred files upon receiving them (or generating file checksums) and conducting regular fixity checks * Redundant storage of data * Generating backups (e. g. offline backups of the underlying repository database) * Strategies for updating storage media (according to e. g. server lifetime) -## Preservation beyond bitstream +### Preservation beyond bitstream Preservation of file content, being able to open and render it correctly in a software is part of logical {% cite lindlar_2020_3672773 %} or technical preservation, also called digital curation. Semantic preservation is concerned with e. g. semantic drift impacting metadata. * Obtaining sufficient rights allowing e. g. format migrations, file repairs and re-use over the long-term like re-publication in other infrastructures * File format identification, based format-specific bit patterns, e. g. via [DROID](https://coptr.digipres.org/index.php/DROID) during publication process @@ -69,10 +70,10 @@ Preservation of file content, being able to open and render it correctly in a so Many digital preservation criteria applying to repositories are also present in the certification criteria of the CoreTrustSeal and the nestor seal {% cite coretrustseal_standards_and_certificatio_2022_7051012 harmsen_henk_explanatory_2013 %}. 
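As an illustration of the checksum-based fixity checks mentioned under bitstream preservation above, the following sketch records SHA-256 checksums for the files in a directory and reports files whose checksums no longer match the stored manifest. The directory and manifest names are made up for the example.

```python
# Sketch of a simple fixity check: record SHA-256 checksums in a manifest
# and report files that have changed or disappeared. Paths are illustrative.
import hashlib
import json
from pathlib import Path

def sha256sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_fixity(data_dir: str, manifest_name: str = "checksums.json") -> None:
    base = Path(data_dir)
    manifest_path = base / manifest_name
    current = {
        str(path.relative_to(base)): sha256sum(path)
        for path in base.rglob("*")
        if path.is_file() and path != manifest_path
    }
    if manifest_path.exists():
        recorded = json.loads(manifest_path.read_text())
        for name, checksum in recorded.items():
            status = "OK" if current.get(name) == checksum else "CHANGED OR MISSING"
            print(f"{status}: {name}")
    # Write (or refresh) the manifest for the next fixity check.
    manifest_path.write_text(json.dumps(current, indent=2, sort_keys=True))

check_fixity("repository_storage")
```

Dedicated tools (e.g. implementations of the BagIt specification) cover the same task at larger scale and with established manifest formats.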
-# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) -# References +## References {% bibliography --cited_in_order %} diff --git a/docs/_RDM-Process/14-data-organization.md b/docs/_RDM-Process/14-data-organization.md index f66b2a29..68fcaa6f 100644 --- a/docs/_RDM-Process/14-data-organization.md +++ b/docs/_RDM-Process/14-data-organization.md @@ -5,18 +5,18 @@ layout: default docs_css: markdown --- -# Motivation +## Motivation -## 5S methodology +### 5S methodology “5S” {% cite Wikipedia:5S %} is a workplace organisation method that uses a list of five Japanese words translated into English as: sort, set in order, shine, standardise and sustain. In the context of organising research data, 'sort' would refer to deleting unnecessary files. 'Set in order' would refer to developing and documenting naming conventions and folder structures. 'Shine' would refer to following conventions and developing routines. 'Standardise' would refer to documenting rules and responsibilities and developing best practices and standard operating procedures (SOPs). And 'sustain' would refer to regularly checking that rules are being followed and making improvements where necessary {% cite assmann_2022 %}. -## Further resources on the 5S methodology +### Further resources on the 5S methodology * [The 5S Methodology in Research Data Management](https://doi.org/10.5281/zenodo.4494258) * [5S Data: Setz dich auf deine 5 Buchstaben und organisiere deine Daten! (Coffee Lecture)](https://youtu.be/73XzLsLrwMk) -# File naming +## File naming -## File naming convention +### File naming convention In order to maximise access to your data, to stay organised and to identify your files quickly, files and folders should be named in a meaningful and systematic way {% cite LMA_RDMWG:2024a rehwald_2022 %}. A file naming convention provides a framework for naming your files and/or folders in a way that describes what they contain and how they relate to other files. This framework will help you, your future self, and others in a shared or collaborative group file-sharing environment to navigate your work more easily {% cite LMA_RDMWG:2024a %}. @@ -25,21 +25,21 @@ Thus, within your research group, we recommend {% cite biernacka:2020 bobrov_202 2. Document your file and folder naming convention. 3. Stay consistent: the naming convention should be chosen in advance to ensure that it can be systematically followed and contains the same information (such as date and time) in the same order (e.g. yyyy-mm-dd) {% cite biernacka:2020 %}. -## Criteria for a good naming convention +### Criteria for a good naming convention Avoid automatically generated names (e.g. from digital cameras) as they can lead to conflicting names due to repetition {% cite biernacka:2020 %}. A good naming convention produces file names that are human-readable, machine-readable, and play well with your system's default ordering {% cite Goldman:2020a %}. -## Human-readability +### Human-readability File names should be descriptive and provide just enough contextual information to establish a link to a particular experiment or data collection {% cite bobrov_2021 LMA_RDMWG:2024a %}. To achieve this, you should choose names that reflect the content and are unique {% cite lindstädt_2019%}. 
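One way to keep such a convention consistent is to assemble file names from their components instead of typing them by hand. The helper below is only an illustrative sketch built from elements discussed in this article (date in ISO 8601 format, project, experiment, creator and version); the chosen components are an example, not a prescribed standard. It reproduces one of the example file names listed further below.

```python
# Illustrative sketch: assemble a file name from naming-convention components
# (ISO 8601 date, project, experiment, creator, version). The chosen
# components are an example, not a prescribed standard.
from datetime import date
from typing import Optional

def build_filename(project: str, experiment: str, creator: str,
                   version: str, extension: str,
                   day: Optional[date] = None) -> str:
    day = day or date.today()
    # Underscores separate components; no spaces or special characters.
    parts = [day.isoformat(), project, experiment, creator, f"v{version}"]
    return "_".join(parts) + f".{extension}"

print(build_filename("ProjectA", "Ex1Test1", "SmithE", "1.0", "xlsx",
                     day=date(2016, 1, 4)))
# -> 2016-01-04_ProjectA_Ex1Test1_SmithE_v1.0.xlsx
```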
-## Machine-readability +### Machine-readability In some operating systems, long file paths can cause technical problems. Therefore, file names should be as long as necessary and as short as possible to keep them concise and readable on any operating system. It is recommended to limit file names to ≤ 32 characters (32CharactersLooksExactlyLikeThis.txt). Avoid using special characters (e.g. {}[]<>()* % # ' ; " , : ? ! & @ $ ~), umlauts (ä, ö, ü, ß,...) or spaces {% cite biernacka:2020 %}. Periods should only be used before version numbers and file extensions, which should be preserved from the system (e.g. .ERL, .CSV, .TIF) {% cite lindstädt_2019 %}. You can use underscores (_), hyphens (-), or CamelCase instead to make file names both human- and machine-readable {% cite LMA_RDMWG:2024a %}. -## Play well with default ordering +### Play well with default ordering The computer organises files by name, character by character. To browse your files easily, you should choose names that can be sorted alphabetically, numerically or chronologically to ensure that the files appear in a logical order. If you want chronological order, start with a date in ISO 8601 format (YYYY-MM-DD or YYYYMMDD) {% cite Briney:2020 lindstädt_2019 %}. When using sequential numbering, make sure to use leading zeros. For a sequence of 1-10: 01-10 and for a sequence of 1-100: 001-010-100. Scalability should be taken into account (e.g. if a two-digit file number is chosen, the number of files is limited to 99) {% cite biernacka:2020 %}. Name components that are already part of the folder name do not have to be repeated in the file names {% cite biernacka:2020 %}. Also consider the system under which the file is stored for later access and retrieval of the data. -## Examples of file names +### Examples of file names Below are some examples of file names that are human-readable (if you know the code/abbreviations), machine-readable, and properly sortable {% cite bres_2022 %}: * 2016-01-04_ProjectA_Ex1Test1_SmithE_v1.0.xlsx * 2000_USNM_379221_01.tiff @@ -50,25 +50,25 @@ Here are some examples of file names that need improvement {% cite bres_2022 %}: * Meeting notes Jan 17.doc * Notes Eric.txt -## Tools for simultaneous renaming of files +### Tools for simultaneous renaming of files -### Multiple OS +#### Multiple OS * [Adobe Bridge](https://www.adobe.com/products/bridge.html) * [jExifToolGUI](https://github.com/hvdwolf/jExifToolGUI/blob/master/README.md) -### Linux +#### Linux * [Gnome Commander](https://gcmd.github.io/) * [GPRename](https://gprename.sourceforge.net/) -### Mac +#### Mac * [ExifRenamer](https://www.qdev.de/?location=mac/exifrenamer) * [NameChanger](https://mrrsoftware.com/namechanger/) * [Renamer 6](https://renamer.com/) -### Unix +#### Unix * mv command -### Windows +#### Windows * [Advanced Renamer](https://www.advancedrenamer.com/) * [Altap Salamander](https://www.altap.cz/) * [Ant Renamer](http://www.antp.be/software/renamer) @@ -78,7 +78,7 @@ Here are some examples of file names that need improvement {% cite bres_2022 %}: * [Total Commander](https://www.ghisler.com/deutsch.htm) * [WildRename](https://www.cylog.org/utilities/wildrename.jsp) -## Further resources on file naming +### Further resources on file naming * [File naming examples](https://doi.org/10.3897/rio.6.e56508) (Table 1) * Information and steps for creating [naming conventions](https://datamanagement.hms.harvard.edu/plan-design/file-naming-conventions) * Information about [file naming](https://rdm.elixir-belgium.org/file_naming) @@ -89,29 
+89,29 @@ Here are some examples of file names that need improvement {% cite bres_2022 %}: * [Checklist for FIle Naming Conventions](https://osf.io/dpu45) * A detailed [documentation of a File Naming Convention](https://www.data.cam.ac.uk/files/gdl_tilsdocnaming_v1_20090612.pdf) -# File versioning +## File versioning Versioning or version control is the practice of tracking and managing changes to a file or set of files over time so that you can later retrieve specific versions. We recommend that you meet with project partners to decide how versioning will be carried out, how version changes will be documented, and how a version change will be defined {% cite bres_2022 %}. -## Purpose and use of versioning +### Purpose and use of versioning Versioning helps you to keep a complete long-term change history of each file by tracking, tracing and annotating your steps (i.e. changes made to the file(s)) and also allows you to go back one step. Versioning also allows you to keep multiple versions of each file, and to create new versions of the same file - or even new results - by incorporating new data and/or changes to a file’s structure; this is particularly important in the case of software. Versioning also supports debugging in software. Overall, versioning makes your research easier to understand {% cite biernacka:2020 bres_2022 Di_Russo:2020 %}. -## Version control methods and tools +### Version control methods and tools Versioning can be done in the file name (see semantic versioning below), in the data (e.g. in the header or a column for comments), in a text file (e.g. in a README file), or using a version control system (VCS). A VCS is a software tool that helps to manage changes to one or more files over time. Examples of VCSs include Git (e.g. Bitbucket, GitHub, GitLab) and Apache Subversion {% cite Git:n.d. %}. For collaborative document and storage locations (e.g. wiki, Google Docs, cloud), versioning is available in situ {% cite biernacka:2020 %} (i.e. within the document/storage location and in real-time). -## Apply versioning methods +### Apply versioning methods Manual file versioning can be done using [semantic versioning](https://semver.org/). You can do this by adding a "v" to the end of each file name, followed by a maximum of three numbers separated by a period (note that these are the only periods allowed in a file name other than the one before the extension). The first number is called MAJOR and indicates important changes. The second number is called MINOR and indicates less drastic changes. The third number is called PATCH, and is mainly used by software developers to indicate bug fixes, but could also be used when fixing typos. Examples of semantic versioning would look like this {% cite bobrov_2021 bres_2022 %}: Filename_vMAJOR.MINOR.PATCH.FileExtension -* Ex1Test1_SmithE_v1.0.0.xlsx -* Ex1Test1_SmithE_v1.2.5.xlsx -* Ex1Test1_SmithE_v2.1.1.xlsx +* `Ex1Test1_SmithE_v1.0.0.xlsx` +* `Ex1Test1_SmithE_v1.2.5.xlsx` +* `Ex1Test1_SmithE_v2.1.1.xlsx` If you decide to use manual file versioning, it is recommended that you use a version control table (a version control table template from the University of Sydney Library can be downloaded [here](https://www.library.sydney.edu.au/content/dam/library/documents/support/doc_versioncontrol.docx)). It is also recommended that you assign responsibilities for completing files, store milestone versions, and store obsolete versions separately after backup. How many versions of a file will be kept, which versions (e.g. 
major versions instead of minor versions (version 2.0 but not 2.1)), for how long, and how the versions will be organised need to be decided in advance {% cite biernacka:2020 %}, ideally with project partners. -# Folder structure +## Folder structure To make it easier to find files, especially if you have a lot of data, you should avoid a chaotic or alphabetical approach to storing data. Instead, a proper folder structure is a hierarchical arrangement in which folders are created to make it easier to find data {% cite biernacka:2020 %}. A typical hierarchical folder structure has a root folder and several levels of subfolders. A carefully planned folder structure, with understandable folder names and an intuitive design, is the foundation of good data organisation. The folder structure provides an overview of what information can be found where, enabling both current and future contributors to understand what files have been produced in the project {% cite Mičetić:n.d. %}. -## General characteristics of an efficient folder structure +### General characteristics of an efficient folder structure An efficient folder structure allows "someone", perhaps your future self, to look at your files and immediately understand in detail what you have done and why {% cite Goldman:2020b %}. Therefore you should choose a folder structure that is hierarchical, clear, comprehensive, efficient and conclusive {% cite bres_2022 bres_2023 %}. To make it clear and comprehensive for other team members, make sure the structure is self-explanatory and has intuitive navigation {% cite biernacka:2020 bobrov_2021 bres_2022 %}. Short, meaningful folder names that follow a comprehensive naming convention make browsing a folder structure more efficient {% cite assmann_2022 RDM_Guide:n.d. %}. Sometimes it is a good idea to number the folders to ensure that they work well with the system's default order {% cite assmann_2022 %}. For clarity, the folder structure should be identical on servers and local devices {% cite biernacka:2020 %}. @@ -131,7 +131,7 @@ You should avoid using generic "current stuff" folders. Also, be careful about c Make sure you don't have overlapping categories, as you shouldn't have copies of files in different folders, since this can lead to confusion and make it difficult to keep track of different versions of the file {% cite Goldman:2020b %}. If you need to see a file in more than one folder, you can use shortcuts to the file instead. This allows you to keep a single reference file {% cite RDM_Guide:n.d. %}. In particular, make sure you have a 'raw data' folder for each type of data or experiment {% cite RDM_Guide:n.d. %}. It is important to store your raw data separately so that the original versions of the files or their documentation are preserved and the original files can be reconstructed {% cite biernacka:2020 %}. 
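One way to keep the folder structure identical on servers and local devices is to script the skeleton once and recreate it for every new project. A minimal Python sketch with purely illustrative folder names (compare the example structure in the next section):

```python
from pathlib import Path

# Illustrative skeleton; adapt the folder names to your project's conventions.
SKELETON = [
    "Data/Raw_data",
    "Data/Processed_data",
    "Documentation/Protocols",
    "Scripts",
    "Results/Figures",
    "Administrative_information",
]


def create_project(root: str) -> None:
    """Create the agreed folder structure below a new project root."""
    for folder in SKELETON:
        Path(root, folder).mkdir(parents=True, exist_ok=True)


create_project("2024_ProjectA")  # illustrative project name
```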
-## Example of folder structure +### Example of folder structure * Project * Data * Raw_data @@ -147,7 +147,7 @@ Make sure you don't have overlapping categories, as you shouldn't have copies of * Conference_reports * Administrative_information -## Further resources on folder structure +### Further resources on folder structure * [Checklist Directory Form](https://osf.io/fp9j5) * [Worksheet for Naming and Organizing Files and Folders](https://www.dropbox.com/scl/fi/1zd63iszw33rh4hjcu1dl/Worksheet_fileOrg.docx?rlkey=q0t25t1wttp4qx2p1ne39qfhd&e=1&dl=0) * [Information on File Naming and Folder Hierarchy](https://libraries.mit.edu/data-management/store/organize/) @@ -159,10 +159,10 @@ Make sure you don't have overlapping categories, as you shouldn't have copies of * [Template for research repositories](https://doi.org/10.5281/zenodo.4410128) * [Simple Open Data template](https://doi.org/10.5281/zenodo.4899847) -# Tools +## Tools * [Data Curation Tool](https://github.com/fair4health/data-curation-tool) (FAIR4Health) * [FAIRDOM](https://fair-dom.org/news/2021-11-30-covid-community-conference-fairdomhub.html): “Project space [...] used by the community to organize, share and publish data, documents, literature and computational models, as well as to list contributors” * G-Node Infrastructure ([GIN](https://gin.g-node.org/)) = Modern Research Data Management for Neuroscience (see Notes for more details) -# References +## References {% bibliography --cited_in_order %} diff --git a/docs/_RDM-Reuse/23-research-data-commons.md b/docs/_RDM-Reuse/23-research-data-commons.md index c9ea7bb6..e9ec9d09 100644 --- a/docs/_RDM-Reuse/23-research-data-commons.md +++ b/docs/_RDM-Reuse/23-research-data-commons.md @@ -5,11 +5,11 @@ layout: default docs_css: markdown --- -# Overview +## Overview The Research Data Commons (RDC) is conceptualized as an expandable, cloud-based research infrastructure that provides scientists, data providers, and data consumers with powerful tools for creating FAIR data products and facilitates the exchange of data and services in a collaborative manner, both within the German National Research Data Infrastructure (NFDI) and beyond. In NFDI4Microbiota, RDC development specifically serves to empower users from this domain to reuse heterogeneous data sources, correlate them, and conduct complex analyses to extract new research insight. In agreement with the FAIR principles, we work to make the offered data products and services computer-actionable, i.e., services will provide a FAIR application program interface (API). Based on an initial design of an architecture, RDC is developed incrementally, with the initial architecture becoming more specific and the first services of a reference implementation available. Selected components may be developed in cooperation with other NFDI consortia, and services will be continuously growing in number as the NFDI4Microbiota project progresses. -# Architecture +## Architecture A brief overview of the RDC architecture is outlined in the attached figure. In order to manage the complexity of the RDC, we decided to organize the software architecture in layers and software components that interact with each other via well-defined interfaces. There are a total of four layers entitled: * Cloud Layer @@ -33,5 +33,5 @@ The **Application Layer** consists of concrete applications and services develop In addition to these four layers, there are two other essential elements in the architecture. 
The first one **Management & Governance** features tools and policies to manage rules and access rights for the resources offered in the four horizontal layers, including user management and monitoring of usage of the technical resources. The second, called **External Data Interfaces**, features a collection of interfaces for accessing external data sets. Obviously, RDC requires connectivity to established large data providers without the need to manage copies of their data in the Cloud Layer. -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) diff --git a/docs/_RDM-Reuse/27-data-reuse.md b/docs/_RDM-Reuse/27-data-reuse.md index a7d3157c..d9d66ca4 100644 --- a/docs/_RDM-Reuse/27-data-reuse.md +++ b/docs/_RDM-Reuse/27-data-reuse.md @@ -1,12 +1,12 @@ --- -title: Data Reuse +title: Data Re-Use category: Research-Data-Management layout: default docs_css: markdown redirect_from: /Research-Data-Management --- -# Benefits and drawbacks +## Benefits and drawbacks Making data reusable benefits researchers who publish their data, researchers who reuse data, and society. @@ -22,21 +22,21 @@ For researchers who publish their data, preparing data sets for reuse is time-co For researchers reusing data, there are risks such as unknown quality and normalization (i.e. "the same data is stored multiple times in the same database under different names/identifiers"). There is also the challenge of comparing and integrating data sets from different sources {% cite sielemann_2020 %}. -# Successful Cases of Data Reuse +## Successful Cases of Data Reuse -## Case 1: FishBase {% cite pavone_2020 %} +### Case 1: FishBase {% cite pavone_2020 %} Various [data sources](https://web.archive.org/web/20111008223552/http://ichthyology.bio.auth.gr/files/tsikliras/d/d3.pdf) have been combined into a digital catalogue of fish, known as [FishBase](https://www.fishbase.us/). The data in FishBase were processed using a new algorithm to create a [new dataset](https://thredds.d4science.org/thredds/catalog/public/netcdf/AquaMaps_08_2016/catalog.html). This new dataset was combined with other data to create [AquaMaps](https://www.aquamaps.org/), a tool for predicting the natural occurrence of marine species based on environmental parameters. This led to an increase in citations of FishBase (e.g. [Coro _et al._ 2018](https://doi.org/10.1016/j.ecolmodel.2018.01.007)) and a [report](https://europe.oceana.org/en/our-work/froese-report/overview) on EU fish stocks,the evidence for which was debated in the European Parliament in 2017. In addition, climate change predictions from AquaMaps and NASA were merged to create a [climate change timeline](https://dlnarratives.eu/timeline/climate.html). -## Case 2: TerrestrialMetagenomeDB +### Case 2: TerrestrialMetagenomeDB [TerrestrialMetagenomeDB](https://web.app.ufz.de/tmdb/) is a public repository of curated and standardised metadata for terrestrial metagenomes. -## Further cases in microbiology +### Further cases in microbiology See [Sielemann *et al.* 2020](https://doi.org/10.7717/peerj.9954). -# Relevant licenses and terms of use +## Relevant licenses and terms of use See [Licenses]({% link _RDM-Share/26-licenses.md %}). 
-# Criteria for selection of trustworthy data sets +## Criteria for selection of trustworthy data sets Below is a list of criteria for selecting trustworthy data sets {% cite bres_2022 sielemann_2020 %}. As in Sielemann *et al.* 2020 {% cite sielemann_2020 %}, for each possible criterion, several questions to consider are listed. @@ -67,7 +67,7 @@ Below is a list of criteria for selecting trustworthy data sets {% cite bres_202 * Is the research purpose/(hypo-)thesis well documented? * Is it documented whether the data are raw or processed? -# Data Provenance +## Data Provenance The provenance of research data can be defined as “a documented trail that accounts for the origin of a piece of data and where it has moved from to where it is presently” {% cite National_Library_of_Medicine:2022 %}. As suggested by Schröder et al. 2022, it can be accounted for by answering questions based on the W7 provenance model {% cite Schroder:2022 %}: * W1: Who participated in the study? [List of all researchers involved in an experiment and their affiliations] * W2: Which biological and chemical resources and which equipment was used in the study? [Resources and the equipment used in an experiment including all details such as the lot number and the passage information] @@ -77,18 +77,18 @@ The provenance of research data can be defined as “a documented trail that acc * W6: Where was the experiment conducted? [Institution where the experiments was conducted] * W7: What was the order of the stimulation parameters in a particular experiment? -# Data discovery +## Data discovery -## Services to search for data +### Services to search for data -### Registries of data repositories +#### Registries of data repositories * Registry of Research Data Repositories ([re3data.org](https://www.re3data.org/)) * [OpenAIRE Explore](https://explore.openaire.eu/) * [OpenDOAR](https://v2.sherpa.ac.uk/opendoar/) * [FAIRsharing.org](https://fairsharing.org/) * [Master Data Repository List](https://clarivate.com/webofsciencegroup/master-data-repository-list/) -### Search engines +#### Search engines * [NCBI Data sets](https://www.ncbi.nlm.nih.gov/datasets/) * **Google** @@ -105,18 +105,18 @@ The provenance of research data can be defined as “a documented trail that acc * [Mendeley Data](https://data.mendeley.com/) -### (Meta)data aggregators +#### (Meta)data aggregators * [B2FIND](https://b2find.eudat.eu/) * [data.europa.eu](https://data.europa.eu/en) * [DataCite Commons](https://commons.datacite.org/) * [gesisDataSearch](https://datasearch.gesis.org/start) -### Services where data can be published +#### Services where data can be published * **Interdisciplinary and [discipline-specific]({% link _RDM-Share/22-data-repositories.md %}#well-established-repositories-for-data-deposition-in-microbiology) repositories** * **Data reports** * **Data journals** (see e.g. [here](https://www.forschungsdaten.org/index.php/Data_Journals)) -### Resources to facilitate data reuse in microbiology +#### Resources to facilitate data reuse in microbiology Below are listed widely used resources in microbiology that facilitate the reuse of raw data found in the data repositories (see section above). These so-called "secondary databases" provided added value through additional data types for example from data integration or from processing of raw data. For each resource and when available, the FAIRsharing and re3data pages are linked. 
On the FAIRsharing page, you will find information such as which journals endorse the resource (under "Collections & Recommendations" and then "In Policies"). On the re3data page, you will find information such as the above-mentioned criteria to select a trusted resource. DB = database. | Domain, Data Type | Data repository | FAIRsharing | re3data | @@ -137,7 +137,7 @@ Below are listed widely used resources in microbiology that facilitate the reuse | **All, Protein sequence search** | [InterPro](https://www.ebi.ac.uk/interpro/) | [FAIRsharing](https://fairsharing.org/FAIRsharing.pda11d) | [re3data](https://www.re3data.org/repository/r3d100010798) | {: .table .table-hover} -## Strategies to search for data +### Strategies to search for data The Consortium of European Social Science Data Archives (CESSDA) {% cite cessda_2017 %} has produced a list of steps in data discovery. The main ones are outlined below, and you can look at their [website](https://dmeg.cessda.eu/) for the sub-steps. 1. Develop a clear picture of the research data you need @@ -153,39 +153,39 @@ CESSDA also suggests three steps to adjust your search strategy {% cite cessda_2 Other tips and tricks from the [Center for Open Science 2023](https://mailchi.mp/osf/osf-tips-mar-1386252?e=38c1d6ec62) include citation chaining (i.e. the process of mining citations in relevant literature to find more sources), looking at previous reuse, and documenting your search strategy to avoid repetition in one repository while helping you to replicate the same strategies in other data. To properly document your search strategy, keep a record of the terms used, filters, other refinements, dates and repositories searched. -# Data citation +## Data citation -## Common standards for data citation +### Common standards for data citation -### Interdisciplinary +#### Interdisciplinary * **DataCite 2019**: Creator (PublicationYear): Title. Version. Publisher. (resourceTypeGeneral). Identifier * **FORCE 11**: Author(s), Year, Data set title, Data repository or archive, Version, Global persistent identifier (preferably as a link) * [BibGuru](https://app.bibguru.com/p/3420f069-22ea-42f6-ba23-4bc6b8ae37e4) * [DOI Citation Formatter](https://citation.crosscite.org/) * [How to Cite Data sets and Link to Publications](https://www.dcc.ac.uk/guidance/how-guides/cite-datasets) -### For nucleic acid sequences and functional genomics +#### For nucleic acid sequences and functional genomics * [How do I cite my ArrayExpress data sets in my publication?](https://www.ebi.ac.uk/biostudies/arrayexpress/help#cite) * [How to Cite Data in ENA](https://www.ebi.ac.uk/ena/browser/about/citing-ena) * [Citing and linking to the GEO database](https://www.ncbi.nlm.nih.gov/geo/info/linking.html) * [How do I cite NCBI services and databases?](https://support.nlm.nih.gov/knowledgebase/article/KA-03391/en-us) -## Code citation +### Code citation Code citation allows for greater recognition of research software. Some major platforms and tools offer code citation: GitHub, GitLab, JabRef, Zenodo, and Zotero {% cite escience_center_2021 %}. -# How-tos +## How-tos -## How to make your data reusable? +### How to make your data reusable? * Properly document your data with metadata {% cite pavone_2020 %}. * Use common metadata standards and terminologies {% cite pavone_2020 %}. * Standardise your data. * Share your raw data with an open license. -## How to maximize already existing data? +### How to maximize already existing data? 
See Wood-Charlson *et al.* 2022 {% cite wood-charlson_2022 %}. -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) -# References +## References {% bibliography --cited_in_order %} diff --git a/docs/_RDM-Share/19-collaboration-tools.md index d483e9f6..1cb3e41c 100644 --- a/docs/_RDM-Share/19-collaboration-tools.md +++ b/docs/_RDM-Share/19-collaboration-tools.md @@ -36,7 +36,7 @@ docs_css: markdown * Sielemann, K., Hafner, A., & Pucker, B. (2020). The reuse of public datasets in the life sciences: potential risks and rewards. In PeerJ (Vol. 8, p. e9954). PeerJ. [https://doi.org/10.7717/peerj.9954](https://doi.org/10.7717/peerj.9954) * Soiland-Reyes, S., Sefton, P., Crosas, M., Castro, L. J., Coppens, F., Fernández, J. M., Garijo, D., Grüning, B., La Rosa, M., Leo, S., Ó Carragáin, E., Portier, M., Trisovic, A., RO-Crate Community, Groth, P., & Goble, C. (2022). Packaging research artefacts with RO-Crate. In S. Peroni (Ed.), Data Science (Vol. 5, Issue 2, pp. 97–138). IOS Press. [https://doi.org/10.3233/ds-210053](https://doi.org/10.3233/ds-210053) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) ## References diff --git a/docs/_RDM-Share/20-pids.md index 138dddf4..1d09d43a 100755 --- a/docs/_RDM-Share/20-pids.md +++ b/docs/_RDM-Share/20-pids.md @@ -8,19 +8,19 @@ empty: false hide: false --- -# Definition +## Definition A persistent digital identifier (PID) is a "unique long-lasting reference to a digital object" {% cite Cousijn2021 %}. PIDs can reference people, datasets, or papers. PIDs are a primary way to meet the first standard in the FAIR principles, ensuring digital objects are Findable. -# Examples of PIDs -## ORCID +## Examples of PIDs +### ORCID Open Researcher and Contributor ID, or [ORCID](https://orcid.org/), is a non-profit organization which aims to connect researchers to their research, thus improving transparency and facilitating trust between researchers in the scientific community. This is a free service for researchers which provides users with a persistent digital identifier (PID). For example, Mathias Mueller is a fairly common name in Germany. However, with a PID we can identify the correct researcher and link them with all of their work. Just like a unique fingerprint, a PID distinguishes you from other researchers and allows you to connect your ID with your professional information including affiliations, publications, grants, and peer reviews {% cite gonzalez_rdm %}. -## DOI +### DOI The Digital Object Identifier, or DOI, is another form of persistent identifier. Papers, articles, and published datasets may have a DOI that links to these items. This makes it easier to find these digital objects online. Even if the publisher changes, the DOI of the published article, dataset, etc. will not change. 
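Because a DOI resolves both to the object and to machine-readable metadata, citation information can be retrieved programmatically. A small Python sketch using DOI content negotiation (the DOI below is one of the Zenodo records cited elsewhere in this knowledge base; the registration agency must support content negotiation, as Crossref and DataCite do):

```python
import requests


def doi_metadata(doi: str) -> dict:
    """Resolve a DOI and request its bibliographic metadata as CSL JSON."""
    response = requests.get(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


record = doi_metadata("10.5281/zenodo.7008911")  # a Zenodo record cited in this knowledge base
print(record.get("title"), record.get("publisher"))
```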
-# Benefits of assigning PIDs +## Benefits of assigning PIDs + Increase visibility + Connects the author with the work + You can use ORCID to login to other applications (ie. [Coscine](https://docs.coscine.de/en/)) @@ -28,29 +28,29 @@ Digital Object Identifier or DOI is another form of persistent identifier. Paper + Adheres to FAIR principle + Permanent link to digital object -# Use cases +## Use cases PIDs are seen as the first step towards making research FAIR through increasing the findability. A case study looks at the next step in this FAIRification process by linking PIDs with metadata. This will allow for linking digital resources that are assigned PIDs together. According to Cousijn and collaborators, the PID Graph establishes connections between different entities within the research landscape, thereby enabling both researchers and institutions to access new information {% cite Cousijn2021 --suppress_author %}. -# Link to the FAIR data principles +## Link to the FAIR data principles Wilkinson and collaborators discuss the FAIR principles for research data management in the first formal publication of the principles, they include the rationale behind them, and some exemplar implementations in the community {% cite wilkinson_2016 --suppress_author %}. {% comment %} -# Using PIDs to access resources +## Using PIDs to access resources -# Receiving a PID for research outputs +## Receiving a PID for research outputs -# Collaborations +## Collaborations -# Provenance and versioning +## Provenance and versioning -# PID graph +## PID graph -# Get Help -If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) - -# Further resources +## Further resources {% endcomment %} -# References +## Get Help +If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) + +## References {% bibliography --cited_in_order %} diff --git a/docs/_RDM-Share/22-data-repositories.md b/docs/_RDM-Share/22-data-repositories.md index e22250ba..da75a04c 100644 --- a/docs/_RDM-Share/22-data-repositories.md +++ b/docs/_RDM-Share/22-data-repositories.md @@ -166,10 +166,10 @@ For more details, see this [guide](https://www.openaire.eu/zenodo-guide). * To find a suitable interdisciplinary repository: [Generalist Repository Comparison Chart](https://doi.org/10.5281/zenodo.3946720) * To find Open Access repositories: [OpenDOAR](https://v2.sherpa.ac.uk/opendoar/): Directory of Open Access Repositories -## See Also +## Further Resources * [Data Deposition and Standardization](https://academic.oup.com/nar/pages/data_deposition_and_standardization) help page of the [Oxford Academic](https://academic.oup.com) Nucleic Acids Research ([NAR Journal](https://academic.oup.com/nar)). 
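For repositories that offer a REST API, such as Zenodo (see the guide linked above), deposition can also be scripted. The sketch below is a hedged outline only: the endpoint names follow Zenodo's published API documentation, the token, metadata and file names are placeholders, and the current documentation should be checked before relying on it.

```python
import requests

BASE = "https://zenodo.org/api"
TOKEN = "YOUR-ACCESS-TOKEN"  # placeholder; create a token in your Zenodo account settings

# 1. Create an empty deposition (a draft record).
deposition = requests.post(
    f"{BASE}/deposit/depositions",
    params={"access_token": TOKEN},
    json={},
    timeout=30,
).json()

# 2. Attach minimal descriptive metadata (fields follow Zenodo's deposit metadata schema).
metadata = {"metadata": {
    "title": "Example data set",
    "upload_type": "dataset",
    "description": "Illustrative deposition created via the API.",
    "creators": [{"name": "Smith, Erin"}],  # placeholder creator
}}
requests.put(
    f"{BASE}/deposit/depositions/{deposition['id']}",
    params={"access_token": TOKEN},
    json=metadata,
    timeout=30,
).raise_for_status()

# 3. Upload a file into the deposition's file bucket.
with open("results.csv", "rb") as handle:  # placeholder file
    requests.put(
        f"{deposition['links']['bucket']}/results.csv",
        params={"access_token": TOKEN},
        data=handle,
        timeout=300,
    ).raise_for_status()

# The draft can then be reviewed and published from the Zenodo web interface (or via the API).
```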
-# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) ## References diff --git a/docs/_RDM-Share/26-licenses.md index 4dfc3925..fb479152 100644 --- a/docs/_RDM-Share/26-licenses.md +++ b/docs/_RDM-Share/26-licenses.md @@ -6,21 +6,21 @@ docs_css: markdown --- -# Introduction +## Introduction In order to make data and software more accessible, licenses are an important tool for making clear what may be used and under which conditions. By default, any creator of data, software, writing or any other content involving a sufficient amount of creativity is the copyright owner of that content without having to declare the copyright explicitly. Defining or using a suitable license for published content usually has the benefit of giving all parties legal certainty and a clear understanding of the permissions granted. In science, two categories of licenses can be applied, to either software or to data and results, that explicitly describe whether and how others can use them. -# Properties of recommendable licenses +## Properties of recommendable licenses * Standardized, i.e. machine-readable, text * Available in an easy-to-read version * No transfer of exclusive rights * Compatible with many jurisdictions * Common -# Licenses for data and other creative works +## Licenses for data and other creative works Scientific data and output are also subject to copyright if their generation requires a sufficient amount of creativity (this might be contested in some contexts). However, since scientists mostly conduct experiments as employees of their funding institution, the copyright lies with both, and in high-profile cases it should ideally be discussed with the employer (the university's legal department). @@ -29,7 +29,7 @@ Similar to the publication of source code, a license communicates who can use th Publishing figures and articles in journals usually requires accepting the license agreement of the publisher and involves either a complete transfer of rights on your own work or picking an open access journal with acceptable permissive licenses. -## Recommendations +### Recommendations Because publicly funded science should make its data and results publicly available (often required by the employer or funder), choosing a permissive license is highly encouraged. The [same criteria as for software](#software-licenses) apply, and the [Creative Commons (CC)](https://creativecommons.org) licenses are most widely used as well as easily understandable. @@ -48,31 +48,31 @@ There are different versions of CC that consist of the core license with further ([Source](https://commons.wikimedia.org/wiki/File:Creative_commons_license_spectrum.svg), CC-BY Shaddim; original CC license symbols by Creative Commons) -## Usage +### Usage Creators can declare a work's license by selecting it during upload to a website/database/repository or by attaching a line stating the title, author and license to the published work. Here we look at some examples of license usage, found on [Zenodo](https://zenodo.org/). -### CC BY 4.0 example usage +#### CC BY 4.0 example usage We are going to be looking at two examples here: 1. 
[Metagenome-assembled genomes (MAGs), colorectal cancer (CRC)](https://zenodo.org/records/7008911) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7008911.svg)](https://doi.org/10.5281/zenodo.7008911) 2. [Metagenome assemblies and metagenome-assembled genomes from the Daphnia magna microbiota](https://zenodo.org/records/4435010) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4435010.svg)](https://doi.org/10.5281/zenodo.4435010) Both datasets have been submitted to Zenodo under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/). That means that the data that has been deposited is free to share and free to redistribute (including commercially) in any format or medium. It also enables the data user to build upon or transform it for any purpose (including commercial use). However, the license requires the data user to give appropriate credit to the submitter/data generator. In addition, the user must also provide a link to the license and disclose any changes made when licensing their work when derived from work already under license. -### CC0 example usage +#### CC0 example usage We are going to be looking at two examples here: 1. [Data from: Mining for NRPS and PKS genes revealed a high diversity in the Sphagnum bog metagenome](https://zenodo.org/records/4976456) [![DOI](https://zenodo.org/badge/DOI/10.5061/dryad.hf56v.svg)](https://doi.org/10.5061/dryad.hf56v) 2. [Data from: Generational conservation of composition and diversity of field-acquired midgut microbiota in Anopheles gambiae sensu lato during colonization in the laboratory](https://zenodo.org/records/5001400) [![DOI](https://zenodo.org/badge/DOI/10.5061/dryad.98jj7gk.svg)](https://doi.org/10.5061/dryad.98jj7gk) Both datasets have been submitted to Zenodo under the [CC0 license](https://creativecommons.org/publicdomain/zero/1.0/). That means that the data that has been deposited is in the public domain. That implies that the data deposited can be copied, modified, distributed, and used even for commercial purposes, and the depositor/generator of the data waives their right to the work. The user of data does not need to seek the permission of the data/material submitter or generator. -# Software licenses +## Software licenses When software sources are distributed, it is considered good practice to specify an already established license under which it can be used. Software developed in the context of science and funded by public money, usually needs to be made available free of charge and in open source by requirements of the funder or the governing institution's policies. -## Recommendations +### Recommendations When choosing a license, multiple aspects should be regarded depending on its application and intention: - **Standardized:** While you can formulate your own license or freely modify most available licenses to better suit your needs, keep in mind that anyone interested in using/supporting/modifying your software needs to be familiar with the terms and in doubt has to read it. @@ -96,15 +96,15 @@ The text is also a disclaimer that states software is distributed "as is" withou An exhaustive list of generally recommended licenses for Open Source is curated by the [Open Source Initiative](https://opensource.org/licenses). -## Usage +### Usage Typically, developers or distributors add a plain-text file called `LICENSE` to the source code or binary of their software that contains the chosen license text. 
Especially source repositories like github or gitlab will allow you to choose a license per project and automatically adding such a `LICENSE` file to the source code. The benefit of selecting a license on the code hosting platform is the machine readable interpretation of your permissions which can potentially increase visibility in search results across the platform. -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) -# Further resources +## Further resources - [Creative Commons license chooser](https://creativecommons.org/choose): Explanations and guide to choosing CC licenses for your work. - [tl;drLegal](https://tldrlegal.com/): Software licenses in plain English with a short feature list. diff --git a/docs/_Reproducible-Data-Analysis/02-workflows.md b/docs/_Reproducible-Data-Analysis/02-workflows.md index 7a87bd8f..fdd00371 100644 --- a/docs/_Reproducible-Data-Analysis/02-workflows.md +++ b/docs/_Reproducible-Data-Analysis/02-workflows.md @@ -74,5 +74,5 @@ Execution of workflows on Slurm based clusters is directly supported in the next - If your workflow reads large input data it should be possible to read such data directly from an S3 bucket by providing bucket URI and credentials. - It should also be possible to directly write all outputs to an S3 bucket. -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the privacy policy on our website: [Contact](https://nfdi4microbiota.de/contact-form/)) diff --git a/docs/_Reproducible-Data-Analysis/03-software-containers.md b/docs/_Reproducible-Data-Analysis/03-software-containers.md index 97ef94dc..9996e719 100644 --- a/docs/_Reproducible-Data-Analysis/03-software-containers.md +++ b/docs/_Reproducible-Data-Analysis/03-software-containers.md @@ -1,23 +1,23 @@ --- -title: Software containers +title: Software Containers category: Reproducible-Data-Analysis layout: default docs_css: markdown empty: true --- -# Software Containers +## Software Containers -## Introduction to Software Containers +### Introduction to Software Containers Software containers, such as [Apptainer](https://apptainer.org/) (formerly known as Singularity) and [Docker](https://www.docker.com/) provide a way to encapsulate an application and its environment for consistent, portable, and reproducible execution across various computing environments. This is crucial for scientific research, ensuring that analyses remain consistent regardless of the underlying infrastructure. -## Why Use Software Containers? +### Why Use Software Containers? - **Consistency and Reproducibility**: Containers ensure your analysis runs the same way, everywhere. - **Isolation**: Package your application with its dependencies to avoid conflicts. - **Portability**: Easily share your computational environment with others. -## Getting Started with Containers +### Getting Started with Containers Apptainer is a popular choice in scientific and high-performance computing (HPC) environments due to its ability to handle container privileges. It offers secure, user-friendly containerization, making it ideal for computational biology and bioinformatics. 
Based on the same technology, Docker images are compatible with Apptainer and most commands function similarly. @@ -30,9 +30,9 @@ For installation and quick start, always refer to the main documenation page fro [Docker Quick Start](https://docs.docker.com/guides/get-started/) -## Example of Working with Containers +### Example of Working with Containers -### Apptainer +#### Apptainer To get an idea of what a container actually is, it helps to look at some examples. A good example of software available as an Apptainer container is [Virsorter2](https://github.com/jiarong/VirSorter2), a multi-classifier with an expert-guided approach to detect diverse DNA and RNA virus genomes. @@ -44,11 +44,11 @@ You will get a file `virsorter2.sif`, which is a apptainer image that can be run You can use the absolute path of this file to replace Virsorter2 in commands. This image also has the database and dependencies included, so you can skip the download of databases and dependencies. -### Docker +#### Docker Similarly, with Docker, the user can find an example of running BLAST [here](https://biocontainers-edu.readthedocs.io/en/latest/running_example.html). -## Best Practices for Container Creation {best-practices} +### Best Practices for Container Creation {best-practices} When creating containers, incorporating best practices ensures efficiency, security, and reproducibility. Here's a concise guide, drawing from broader container best practices, including insights from [Google Cloud's recommendations](https://cloud.google.com/architecture/best-practices-for-building-containers): - **Use Specific Versions**: Specify exact versions of base images, software, and libraries in order to avoid the breaking changes that occur when updating with the `latest` tag and to ensure consistency across environments. @@ -74,7 +74,7 @@ Use volumes or bind mounts for data that needs to persist beyond the life of the - **Documentation**: Include a `%help` section in your definition file, providing users with information on how to use the container, including running the software and accessing data. -## Advanced Usage +### Advanced Usage #### [Integration with Nextflow](https://www.nextflow.io/docs/latest/container.html) - **Nextflow and Containers**: Simplifies complex workflows by executing each step in a container for consistency across environments. - **Configurations**: Supports managing containers through `nextflow.config`, streamlining execution. @@ -83,7 +83,7 @@ Use volumes or bind mounts for data that needs to persist beyond the life of the - **Container Orchestration**: Automates deployment, scaling, and management of containerized applications, essential for microservices architecture. - **Scalability and Management**: Provides tools for load balancing, auto-scaling, and efficient resource allocation across diverse infrastructures. -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) 
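To make the best practices above concrete, here is a minimal, illustrative Apptainer definition file; the base image, pinned package versions, script name and labels are placeholders rather than a recommendation:

```
Bootstrap: docker
# Pin a specific base image version rather than "latest".
From: python:3.11-slim

%files
    my_analysis.py /opt/my_analysis.py

%post
    # Install pinned dependency versions so rebuilds stay reproducible.
    pip install --no-cache-dir numpy==1.26.4 pandas==2.2.2

%labels
    Author your-name
    Version v0.1.0

%help
    Example analysis container (illustrative only). Run with:
        apptainer run example.sif input.csv
    Bind host data with --bind /path/to/data:/data.

%runscript
    exec python /opt/my_analysis.py "$@"
```

Such a file is typically built with `apptainer build example.sif example.def`, and the `%help` text is then shown by `apptainer run-help example.sif`.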
## Resources and Further Reading diff --git a/docs/_Reproducible-Data-Analysis/04-resources.md b/docs/_Reproducible-Data-Analysis/04-resources.md index 0a62e4b7..0e118498 100644 --- a/docs/_Reproducible-Data-Analysis/04-resources.md +++ b/docs/_Reproducible-Data-Analysis/04-resources.md @@ -6,9 +6,9 @@ docs_css: markdown --- -# Metainformation Template +## Metainformation Template -## CV terms +### CV terms | Issue 84 metadata key | Corresponding bio.tools CV term | Type of value or list of values | |---------------------------------|--------------------------------------------------------------------------------------------------|-----------------------------------| @@ -29,25 +29,25 @@ Modelling and simulation, Optimisation and refinement, Prediction and recognitio --- -## Template -### Homepage -### Maturity -### Topic -### License -### Description -### Input format -### Tool operation -### Output data -### Credits, support +### Template +#### Homepage +#### Maturity +#### Topic +#### License +#### Description +#### Input format +#### Tool operation +#### Output data +#### Credits, support - code: github/gitlab/sourceforge link - doi: [10.XXXX/YYYYYY](doi.org/10.xxxx/YYYYYYYY) -# Bioinformatic tools +## Bioinformatic tools --- -## Disclaimer: Changes in the tool display +### Disclaimer: Changes in the tool display We recently created a [NFDI4Microbiota domain](https://bio.tools/t?domain=nfdi4microbiota) on [the life sciences software registry bio.tools](https://bio.tools) and we will soon be displaying all the tools that NFDI4Microbiota created as well as the ones that NFDI4Microbiota consortium members endorse and highly recommend. @@ -416,5 +416,5 @@ PlasFlow is a set of scripts used for prediction of plasmid sequences in metagen - doi: [10.1093/nar/gkx1321](https://doi.org/10.1093/nar/gkx1321) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) diff --git a/docs/_Research-Data-Management/01-rd.md b/docs/_Research-Data-Management/01-rd.md index 90435d09..0651d3cb 100644 --- a/docs/_Research-Data-Management/01-rd.md +++ b/docs/_Research-Data-Management/01-rd.md @@ -6,7 +6,7 @@ docs_css: markdown redirect_from: /Research-Data-Management --- -# Definition of research data +## Definition of research data There is no consensus on the definition of research data as they are highly heterogeneous. Thus, the definition can vary considerably depending on the research funder, the scientific discipline or subject, and the research data itself {% cite lindstädt_2019 biernacka:2020 voigt_2022 %}. We propose the following definition, based on around 20 others: **research data** is the collection of digital and non-digital objects (excluding scientific publications) that are generated (e.g. through measurements, surveys, source work), studied and stored during or as a result of scientific research activities. These objects are commonly accepted in the scientific community as necessary for the production, validation and documentation of original research results. In the context of Research Data Management (RDM), research data also includes non-data objects such as software and simulations (see further examples below). 
The characteristics of research data depend strongly on the context (i.e. conditions of generation, methods used, perspective) {% cite biernacka:2020 %}. Nevertheless, we can try to classify them as follows: @@ -21,7 +21,7 @@ Data is differentiated from **information** (i.e. processed data that can be con ![Information pyramid]({{ '/assets/img/information_pyramid.png' | relative_url }} "Information pyramid"){:width="70%"} -# General data types +## General data types General data types include the following {% cite NC_State_University_Library:2020 steen:2022 voigt_2022 dfg:2015 %}: * Data files (e.g. text files, binary files) * Documents (e.g. word processing documents, spreadsheets) @@ -39,7 +39,7 @@ Data is differentiated from **information** (i.e. processed data that can be con * Methodologies and workflows * Standard Operating Procedures (SOPs) and protocols -# Common data types in microbiology +## Common data types in microbiology Data types in microbiology include the following: * Clinical data * Crystallographic data @@ -75,8 +75,8 @@ Data types in microbiology include the following: * Standardised bacterial information * Vertebrate-virus network -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) -# References +## References {% bibliography --cited_in_order %} diff --git a/docs/_Research-Data-Management/02-rdm.md b/docs/_Research-Data-Management/02-rdm.md index 69b36ef4..35fe382b 100644 --- a/docs/_Research-Data-Management/02-rdm.md +++ b/docs/_Research-Data-Management/02-rdm.md @@ -4,6 +4,7 @@ category: Research-Data-Management layout: default docs_css: markdown --- + ## Definition of Research Data Management (RDM) Research Data Management (RDM) is the care and maintenance required to (1) obtain high-quality data (whether produced or reused), (2) make the data available and usable in the long term, independent of the data producer and (3) make research results reproducible beyond the research project {% cite biernacka:2020 bres_2022 RfII_RD voigt_2022 pauls_2023 bres_2023 %}. It complements research from planning to data reuse and deletion. @@ -40,7 +41,7 @@ The research data life cycle is a model that illustrates the steps of RDM and de * [BEXIS2](https://demo.bexis2.uni-jena.de) by [NFDI4Biodiversity](https://www.nfdi4biodiversity.org/en/) at [FSU Jena](https://www.uni-jena.de) * [GfBio](https://www.gfbio.org) consortium services -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) 
## References diff --git a/docs/_Research-Data-Management/03-md.md b/docs/_Research-Data-Management/03-md.md index 65b17c61..0314ed48 100644 --- a/docs/_Research-Data-Management/03-md.md +++ b/docs/_Research-Data-Management/03-md.md @@ -1,5 +1,5 @@ --- -title: Metadata and Metadata Standards +title: Metadata & Metadata Standards category: Research-Data-Management layout: default docs_css: markdown @@ -8,13 +8,13 @@ empty: false hide: false --- -# Metadata +## Metadata Metadata Is a Love Note to the Future - Jason Scott (CC-BY - [cea +](https://www.flickr.com/people/33255628@N00), [Source](https://en.wikipedia.org/wiki/File:Metadata_is_a_love_note_to_the_future_(8071729256).jpg)) -## Metadata is data about data +### Metadata is data about data Before we delve into specifications on what metadata standards for the microbiology community are, let us explain what metadata is. @@ -60,7 +60,7 @@ In microbiology, metadata provides crucial contextual information about biologic For more details on the distinction between different types of metadata, we refer you to the FAIR Cookbook recipe [FAIR and the notion of metadata](https://w3id.org/faircookbook/FCB068) section. -## When should you collect your metadata +### When should you collect your metadata As is usually the case in sciences, your research (and the wider microbiological community) can benefit highly from the rigorous and timely planning of your experiments, including metadata collection. In this case, we refer you to other subsections of this Knowledge Base: [**Data Management Plans (DMPs)**](./08-dmp.md) that could help you plan your experiments. @@ -68,7 +68,7 @@ Metadata collection should be planned, but at the same time, it can be overwhelm These and other considerations should be thoroughly thought out before the start of your experimental procedures. Some of the metadata can even be collected and documented before starting the experiments if you already know how to collect your samples, process, sequence them (if sequencing is a part of the analysis), and analyze them. -## Metadata collection example +### Metadata collection example We will look into an example of microbiological environmental metadata, where we gather samples from a forest environment, specifically plant rhizosphere, and we will be doing amplicon and metagenomic sequencing. We will not dive specifically into all omics types and biological/environmental on this page. Instead, we encourage readers to read our [MetadataStandards](https://github.com/NFDI4Microbiota/MetadataStandards) resource repository. @@ -81,7 +81,7 @@ Alternatively, we can hop over to the [MetadataStandards/Plant-associated microb By now, we should have a rough estimation of what kind of biological/environmental metadata we can collect before sampling, during sampling, and what could be collected during the processing of samples. -# Metadata standards +## Metadata standards Once a community agrees to a set of relevant metadata for their field, they can devise metadata standards. A metadata standard is usually defined for a given type of data and by different stakeholders (e.g., users communities, data repositories). @@ -91,17 +91,17 @@ For every metadata field part of a metadata standard, one could expect a human-r At NFDI4Microbiota, we compiled a [list of widely used metadata standards in the field of microbiome research](https://github.com/NFDI4Microbiota/MetadataStandards) that you can browse and use for the different types of data collected during your investigations. 
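As a toy illustration of the machine-actionable side of such standards, the Python sketch below checks a sample record against a minimal list of required fields; the field names are merely examples loosely inspired by MIxS-style checklists, not an official standard.

```python
# Field names are illustrative, loosely inspired by MIxS-style terms; follow the
# checklist required by your target repository or standard.
REQUIRED_FIELDS = ["sample_name", "collection_date", "geo_loc_name", "env_medium", "seq_meth"]

sample = {
    "sample_name": "rhizosphere_plot3_rep1",
    "collection_date": "2024-06-12",
    "geo_loc_name": "Germany: Cologne",
    "env_medium": "rhizosphere soil",
    # "seq_meth" is still missing at this point
}

missing = [field for field in REQUIRED_FIELDS if not sample.get(field)]
if missing:
    print("Metadata record incomplete, missing:", ", ".join(missing))
else:
    print("All required fields are present.")
```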
-# Metadata management +## Metadata management -# Metadata quality control +## Metadata quality control Metadata quality control involves thorough validation and standardization of metadata attributes to minimize errors and inconsistencies. For instance, in studies involving microbial sequencing data, rigorous checks are needed to verify the accuracy of sample identifiers, ensuring that each sample is uniquely identified and correctly linked to corresponding experimental conditions. Moreover, metadata completeness is essential to provide sufficient context for data interpretation and reuse. Researchers should meticulously document sample collection details, including the source organism, sampling location, and environmental conditions, to facilitate cross-study comparisons and meta-analyses. For example, in a microbiome study investigating the gut microbial composition in patients with inflammatory bowel disease, comprehensive metadata would encompass clinical metadata such as patient demographics, disease severity scores, and medication history, alongside microbial metadata like taxonomic profiles and sequencing protocols. Furthermore, metadata quality control in microbiology extends to the curation of controlled vocabularies and ontologies to harmonize terminology and promote semantic interoperability across data sets. Standardized metadata terms ensure consistency in data annotation and facilitate data integration from diverse sources. For instance, ontologies such as the Environment Ontology (ENVO) provide a structured vocabulary for describing environmental parameters, enabling researchers to annotate microbial samples with terms like "soil pH," "temperature," and "moisture content" uniformly. Additionally, validation rules and automated workflows are employed to enforce data integrity and conformity to predefined metadata standards. For example, data submission portals for public repositories often incorporate validation checks to verify compliance with metadata schema requirements before data deposition, minimizing errors and enhancing data usability. By implementing robust metadata quality control measures, microbiology researchers can uphold data integrity, foster data harmonization and interoperability, and facilitate meaningful insights into microbial ecosystems and host-microbe -# Further resources +## Further resources -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) -# References +## References diff --git a/docs/_Research-Data-Management/04-fair.md b/docs/_Research-Data-Management/04-fair.md index 32e48e26..4227ebda 100644 --- a/docs/_Research-Data-Management/04-fair.md +++ b/docs/_Research-Data-Management/04-fair.md @@ -4,7 +4,8 @@ category: Research-Data-Management layout: default docs_css: markdown --- -# Introduction + +## Introduction The FAIR data principles are a concise and measurable set of principles that may act as a guideline for those wishing to enhance the reusability of their data holdings. FAIR stands for Findable, Accessible, Interoperable and Reusable {% cite wilkinson_2016 %}. 
The FAIR data principles aim at {% cite wilkinson_2016 lowenberg_2021 %}:
@@ -16,21 +17,21 @@ The FAIR data principles are a concise and measurable set of principles that may
 
 The principles {% cite wilkinson_2016 go_fair_2022 %} are reproduced below:
 
-# To be Findable
+## To be Findable
 
 * (Meta)data are assigned a globally unique and persistent identifier.
 * Data are described with rich metadata.
 * Metadata clearly and explicitly include the identifier of the data it describes.
 * (Meta)data are registered or indexed in a searchable resource (e.g. data repository).
 
-# To be Accessible
+## To be Accessible
 
 * (Meta)data are retrievable by their identifier using a standardized communications protocol (e.g. http(s)).
 * The protocol is open, free, and universally implementable.
 * The protocol allows for an authentication and authorization procedure, where necessary.
 * Metadata are accessible, even when the data are no longer available.
 
-# To be Interoperable
+## To be Interoperable
 
 Data interoperability is the ability of a dataset to work with other datasets or systems without special effort on the part of the user {% cite godan_action_2019_3588148 %}.
 
@@ -38,35 +39,34 @@ Data interoperability is the ability of a dataset to work with other datasets or
 * (Meta)data use vocabularies that follow the FAIR principles (e.g. using FAIR Data Point).
 * (Meta)data include qualified references to other (meta)data (e.g. specifying if one dataset builds on another one, properly citing all datasets).
 
-# To be Reusable
+## To be Reusable
 
 * Meta(data) are richly described with a plurality of accurate and relevant attributes (i.e. metadata that richly describes the context under which the data was generated such as the experimental protocols, the species used).
 * (Meta)data are released with a clear and accessible data usage license.
 * (Meta)data are associated with detailed provenance.
 
-# Further resources
+## Further resources
 
 * Introducing the FAIR Principles for research software: [Barker *et al.* 2022](https://doi.org/10.1038/s41597-022-01710-x)
 
-## Learning resources
+### Learning resources
 
 * Course on FAIR in (biological) practice: [The Carpentries Incubator](https://carpentries-incubator.github.io/fair-bio-practice/)
 * How to be FAIR with your data. A teaching and training handbook for higher education institutions: [Engelhardt *et al.* 2022](https://doi.org/10.5281/zenodo.6674301)
 * Unit on the benefits and challenges associated with sharing research data openly: [The University of Edinburgh](https://mantra.ed.ac.uk/fairsharingandaccess/)
 
-## How to make data FAIR?
+### How to make data FAIR?
 
 * Guidelines to FAIRify data management and make data reusable: [PARTHENOS](https://doi.org/10.5281/zenodo.2668479)
 * Preparing data for sharing: [Knight 2015](https://www.slideshare.net/lshtm/preparing-data-for-sharing-the-fair-principles)
 * Recipes that help you to make and keep data FAIR: [FAIR Cookbook](https://faircookbook.elixir-europe.org/content/home.html)
 * Top 10 FAIR Data & Software Things: [Martinez *et al.* 2019](https://doi.org/10.5281/zenodo.3409968)
 
-## How to assess the FAIRness of your datasets?
+### How to assess the FAIRness of your datasets?
 
 * FAIR data maturity model indicators: [Bahim *et al.* 2020](https://doi.org/10.5334/dsj-2020-041), Table 1
 * FAIR evaluator service: [Fraunhofer FIT](https://gitlab.fit.fraunhofer.de/abu.ibne.bayazid/fairevaluator)
 * How FAIR are your data?
[Jones and Grootveld 2017](https://doi.org/10.5281/zenodo.5111307) * Self-Assessment Tool to Improve the FAIRness of Your Dataset ([SATIFYD](https://satifyd.dans.knaw.nl/)) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) -# References - +## References {% bibliography --cited_in_order %} diff --git a/docs/_Resources/01-glossary.md b/docs/_Resources/01-glossary.md index b2619bae..3da330a6 100644 --- a/docs/_Resources/01-glossary.md +++ b/docs/_Resources/01-glossary.md @@ -89,5 +89,5 @@ redirect_from: /Resources * [Research Data Management Terminology](https://codata.org/initiatives/data-science-and-stewardship/rdm-terminology-wg/rdm-terminology/) * [GoFAIR - FAIR principles for metadata vocabulary usage](https://www.go-fair.org/fair-principles/i2-metadata-use-vocabularies-follow-fair-principles/) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) diff --git a/docs/_Resources/02-external-training-resources.md b/docs/_Resources/02-external-training-resources.md index 536c4770..39e0d2f2 100644 --- a/docs/_Resources/02-external-training-resources.md +++ b/docs/_Resources/02-external-training-resources.md @@ -38,5 +38,5 @@ redirect_from: /Resources * [FAIR Office Austria](https://fair-office.at/lernen-sie-mehr/?lang=en) * [Videos](https://www.rwth-aachen.de/cms/root/Forschung/Forschungsdatenmanagement/Weiterbildungsangebote/~udzt/Lehrvideos/?lidx=1) (RWTH Aachen University) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) diff --git a/docs/_Software-Development/02-toolsurvey.md b/docs/_Software-Development/02-toolsurvey.md index 92f70844..1c820eeb 100644 --- a/docs/_Software-Development/02-toolsurvey.md +++ b/docs/_Software-Development/02-toolsurvey.md @@ -1,5 +1,5 @@ --- -title: Available Tools of NFDI4Microbiota members +title: Available NFDI4Microbiota Tools category: Software Development layout: default docs_css: markdown @@ -267,5 +267,5 @@ license: NA link: [https://nfdi4microbiota.de/contact-form/](https://nfdi4microbiota.de/contact-form/) -# Get Help +## Get Help If you have any further questions about the management and analysis of your microbial research data, please contact us: [helpdesk@nfdi4microbiota.de](mailto:helpdesk@nfdi4microbiota.de) (by emailing us you agree to the [privacy policy - in German](https://nfdi4microbiota.de/legals/privacy-policy.html) on our website: [Contact](https://nfdi4microbiota.de/contact-form/).) 
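
The metadata quality-control text patched above mentions validation rules and automated workflows that check records against a predefined schema before deposition. As a rough illustration of what such a check can look like, here is a minimal, self-contained Python sketch; the field names (`sample_id`, `env_medium`, `collection_date`, `geo_loc_name`), the identifier pattern, and the controlled vocabulary are illustrative assumptions, not taken from any particular standard.

```python
# Illustrative sketch only: field names, ID pattern and vocabulary are assumptions,
# not prescribed by MIxS, ENVO or any other standard.
import re
from datetime import datetime

REQUIRED_FIELDS = {"sample_id", "collection_date", "env_medium", "geo_loc_name"}
ALLOWED_ENV_MEDIUM = {"soil", "rhizosphere", "gut", "water"}  # stand-in for an ontology-backed vocabulary
SAMPLE_ID_PATTERN = re.compile(r"^[A-Z]{2,5}_\d{3,}$")        # e.g. "RHIZO_001"

def validate_record(record: dict) -> list:
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    for field in REQUIRED_FIELDS - record.keys():
        problems.append(f"missing required field: {field}")
    sample_id = record.get("sample_id", "")
    if sample_id and not SAMPLE_ID_PATTERN.match(sample_id):
        problems.append(f"sample_id '{sample_id}' does not match the expected pattern")
    medium = record.get("env_medium")
    if medium is not None and medium not in ALLOWED_ENV_MEDIUM:
        problems.append(f"env_medium '{medium}' is not in the controlled vocabulary")
    date = record.get("collection_date")
    if date is not None:
        try:
            datetime.strptime(date, "%Y-%m-%d")
        except ValueError:
            problems.append(f"collection_date '{date}' is not in YYYY-MM-DD format")
    return problems

if __name__ == "__main__":
    record = {"sample_id": "RHIZO_001", "collection_date": "2024-06-03",
              "env_medium": "rhizosphere", "geo_loc_name": "Germany: Cologne"}
    print(validate_record(record) or "record passes all checks")
```

In practice such rules are usually expressed against a community schema (e.g. a MIxS checklist) and enforced by the submission portal itself rather than hand-written scripts, but the principle is the same: reject or flag records before they are deposited.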
From 5c58f34764636cf930da5b9e644f9b101c29dea1 Mon Sep 17 00:00:00 2001 From: thoelken <5861076+thoelken@users.noreply.github.com> Date: Fri, 13 Sep 2024 11:08:03 +0200 Subject: [PATCH 2/2] Moved privacy policy --- .../01-privacy-policy-english-translation.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) rename docs/{_Privacy-Policy-English-Translation => _How-We-Operate}/01-privacy-policy-english-translation.md (99%) diff --git a/docs/_Privacy-Policy-English-Translation/01-privacy-policy-english-translation.md b/docs/_How-We-Operate/01-privacy-policy-english-translation.md similarity index 99% rename from docs/_Privacy-Policy-English-Translation/01-privacy-policy-english-translation.md rename to docs/_How-We-Operate/01-privacy-policy-english-translation.md index c8d4133c..18e9babe 100644 --- a/docs/_Privacy-Policy-English-Translation/01-privacy-policy-english-translation.md +++ b/docs/_How-We-Operate/01-privacy-policy-english-translation.md @@ -1,4 +1,9 @@ -# Privacy policy +--- +title: Privacy Policy +category: How-We-Operate +layout: default +docs_css: markdown +--- ## DISCLAIMER: The following policy is an automated translation of the German text. Please refer to the German original for a legally binding document.
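
The "To be Accessible" bullets in the FAIR section above state that (meta)data should be retrievable by their identifier over a standardized protocol such as HTTP(S). A small sketch of what that can look like in practice is resolving a DOI (here, one of the DOIs cited in the resource list) via HTTPS content negotiation to obtain machine-readable metadata. This assumes the DOI's registration agency serves the `application/vnd.citationstyles.csl+json` media type, which Crossref and DataCite DOIs generally do.

```python
# Sketch: retrieve machine-readable metadata for a persistent identifier (DOI)
# over plain HTTPS with content negotiation. Uses only the standard library.
import json
import urllib.request

doi = "10.5281/zenodo.6674301"  # a DOI cited in the resource list above
request = urllib.request.Request(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},  # ask for metadata, not the landing page
)
with urllib.request.urlopen(request) as response:
    metadata = json.load(response)

print(metadata.get("title"))      # human-readable title of the record
print(metadata.get("publisher"))  # e.g. the hosting repository
```

The same identifier that a human can click on thus doubles as a machine-actionable access point, which is what the Findable and Accessible principles ask for.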