Methods section for metadata and ontologies #65
The level of detail for the individual fields looks like a good start. Returning some comments.
content/04.methods.md
Outdated
Additionally, ontology term identifiers were assigned to the following metadata categories for each sample:
- Age: Ontology term obtained from HsapDv [@url:https://www.ebi.ac.uk/ols4/ontologies/hsapdv]. For ages 0-11 months, the HsapDv for age in months was used. For ages 12 months and greater, the HsapDv for age in years was used.
- Sex: Ontology term obtained from PATO, either male (PATO:0000384), female (PATO:0000383), or unknown [@url:https://www.ebi.ac.uk/ols4/ontologies/pato].
- Organism: NCBI taxonomy term for organism. All current samples available on the Portal are from Homo sapiens or NCBITaxon:9606 [@url:https://www.ncbi.nlm.nih.gov/taxonomy].
- Diagnosis: The most appropriate MONDO term based on the provided diagnosis [@url:https://www.ebi.ac.uk/ols4/ontologies/mondo]. An exact match was identified for most samples, but in a handful of cases, the most closely related term was used.
- Tissue of origin: The most appropriate UBERON term based on the provided tissue of origin [@url:https://www.ebi.ac.uk/ols4/ontologies/uberon]. An exact match was identified for most samples, but in a handful of cases, the most closely related term was used.
- Ethnicity (if applicable): If the submitter provided ethnicity, the associated Hancestro term [@url:https://www.ebi.ac.uk/ols4/ontologies/hancestro]. If ethnicity is unavailable, `unknown` is used.
The human-readable metadata and the associated ontology term identifiers are available on the Portal for all samples.
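For illustration only, the sex and age rules quoted in the list above could be sketched as follows. This is a minimal sketch, not the code used for the Portal: the function names `sex_ontology_term` and `age_term_unit` are hypothetical, the age is assumed to be available as a whole number of months, and no HsapDv identifiers are looked up because the text does not list them.

```python
from typing import Optional, Tuple

# Identifiers quoted in the methods text above.
SEX_TERMS = {"male": "PATO:0000384", "female": "PATO:0000383"}
ORGANISM_TERM = "NCBITaxon:9606"  # Homo sapiens


def sex_ontology_term(sex: Optional[str]) -> str:
    """Return the PATO identifier for a submitted sex, or 'unknown'.

    Hypothetical helper: matching is case-insensitive, which is an
    assumption and not something the methods text specifies.
    """
    if sex is None:
        return "unknown"
    return SEX_TERMS.get(sex.strip().lower(), "unknown")


def age_term_unit(age_months: int) -> Tuple[str, int]:
    """Choose which kind of HsapDv term applies to an age.

    Ages of 0-11 months use a month-based term; ages of 12 months and
    greater use a year-based term. The input unit (whole months) is an
    assumption made for this sketch.
    """
    if age_months < 0:
        raise ValueError("age_months must be non-negative")
    if age_months < 12:
        return ("month", age_months)
    return ("year", age_months // 12)
```

Under these assumptions, `age_term_unit(11)` selects a month-based term and `age_term_unit(12)` selects a year-based term, which mirrors the 0-11 month boundary described in the Age bullet.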
This is going to bioRxiv first, so we don't have to worry about "journal-ready" formatting yet. Can we ignore any main display item limits and make this a table instead? I think that will be much more scannable/digestible.
If you don't use a table, this should still follow one sentence per line; there should be no impact on the bullet formatting.
content/04.methods.md
Outdated
Submitters were required to submit the age, sex, organism, diagnosis, subdiagnosis (if applicable), and tissue of origin for each sample.
The submitted metadata was standardized across projects, including converting all ages to years, removing abbreviations used in diagnosis, subdiagnosis, or tissue of origin, and using standard terms across projects as much as possible.
Additionally, ontology term identifiers were assigned to the following metadata categories for each sample:
Explain (and cite!) why, i.e., the CELLxGENE schema.
content/04.methods.md
Outdated
### Metadata

Submitters were required to submit the age, sex, organism, diagnosis, subdiagnosis (if applicable), and tissue of origin for each sample.
The submitted metadata was standardized across projects, including converting all ages to years, removing abbreviations used in diagnosis, subdiagnosis, or tissue of origin, and using standard terms across projects as much as possible.
I would say more about what terms tended to get standardized (e.g., disease timing). We should be able to figure that out from metadata cleaning PRs.
Do you mean which of the terms get standardized, like diagnosis, subdiagnosis, disease timing, and tissue type? Or do you mean mentioning specific examples, e.g., all samples collected at diagnosis were labeled with `Initial diagnosis`?
…nto allyhawkins/ontology-methods
@jaclyn-taroni I updated this to include a table rather than the bulleted list. I added some column titles for the table, but I'm 50/50 on them and think we could also just do without. This should be ready for another look.
Returning some comments with the expectation that my suggestions will be taken. I don't need to see this again 👍🏻
Co-authored-by: Jaclyn Taroni <[email protected]>
Click the link below to download the manuscript build as a ZIP file.
Closes #50
Stacked on #61
Here I'm adding a section to the methods on metadata, including ontology assignments. I added this section below the data processing and generation and above the processing-related methods.
I mentioned that metadata is standardized as much as possible across projects and listed the ontology terms that were assigned. I provided any relevant details on how exactly we assigned terms, but let me know if we should include more detail.
I'm requesting @jaclyn-taroni since she was the most involved other than me in the ontology assignment and metadata cleaning.