OntologyClass metadata additions: alternative_names, alternative_identifiers, relations #2239

Open
wants to merge 14 commits into
base: main
from

Conversation

sierra-moxon
Member

@sierra-moxon sierra-moxon commented Oct 31, 2024

Adds metadata to OntologyClass so that its collection in Mongo and its table in Postgres can hold cross-references (in CURIE form), synonyms, and term descriptions available from external ontologies. It also adds a new class, OntologyRelation, to capture the axioms/relationships relevant to NMDC search and display (in particular, subClassOf and part_of relations).

@turbomam @cmungall - in the example data in particular, I'd like some feedback on the difference between "is_a" and "BFO:0000050" as the value for "predicate" in an OntologyRelation. Using "is_a" follows the obographs convention, but I picked rdfs:subClassOf as the predicate for "is_a" because I think it lets downstream software process this field consistently.
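To make the predicate question concrete, here is a minimal Python sketch of how a downstream consumer might normalize shorthand predicates to CURIEs. It is an illustration only, not code from this PR: the `subject_id` and `object_id` field names, the `PREDICATE_ALIASES` table, and the `EX:` identifiers are all hypothetical. Only the `predicate` field, the "is_a" / rdfs:subClassOf pairing, and BFO:0000050 (part_of) come from the discussion above.

```python
from dataclasses import dataclass, replace

# Hypothetical alias table: obographs-style shorthand -> CURIE.
# BFO:0000050 is the BFO identifier for "part of".
PREDICATE_ALIASES = {
    "is_a": "rdfs:subClassOf",
    "part_of": "BFO:0000050",
}


@dataclass
class OntologyRelation:
    """Illustrative stand-in; only `predicate` is named in the PR text."""
    subject_id: str   # hypothetical field name
    predicate: str
    object_id: str    # hypothetical field name


def normalize_predicate(relation: OntologyRelation) -> OntologyRelation:
    """Return a copy whose predicate is a CURIE when an alias is known."""
    curie = PREDICATE_ALIASES.get(relation.predicate, relation.predicate)
    return replace(relation, predicate=curie)


shorthand = OntologyRelation("EX:0000002", "is_a", "EX:0000001")
already_curie = OntologyRelation("EX:0000003", "BFO:0000050", "EX:0000001")

print(normalize_predicate(shorthand).predicate)
print(normalize_predicate(already_curie).predicate)
```

If every stored predicate is already a CURIE, consumers would not need a step like this, which is the consistency argument for choosing rdfs:subClassOf over "is_a".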

These changes support generic ontology metadata loading into NMDC.

Some related tickets:
microbiomedata/nmdc-server#1388 - support more facets for GO in search
microbiomedata/pilot#17 - support metadata search from ontologies
microbiomedata/pilot#12 - support ENVO search facets
microbiomedata/nmdc-field-notes#187 - field notes ENV triad term selection

We also want to provide users with more details about our existing terms. For example, we could extend the widget on the data portal search to include ontology metadata like cross-references, definitions, etc, to enable more informed selections.

Screen Shot 2024-10-31 at 12 12 00 PM

Storing these metadata in Mongo and, subsequently, in Postgres might make this data more accessible for downstream tools.


github-actions bot commented Oct 31, 2024

PR Preview Action v1.4.8
🚀 Deployed preview to https://microbiomedata.github.io/nmdc-schema/pr-preview/pr-2239/
on branch gh-pages at 2024-11-27 20:39 UTC

@sierra-moxon sierra-moxon marked this pull request as ready for review November 19, 2024 22:15
@aclum aclum self-requested a review November 20, 2024 22:02
biolink: https://w3id.org/biolink/vocab/
owl: http://www.w3.org/2002/07/owl#

default_curi_maps:
Member

I stopped using default curie maps under @cmungall's guidance a long time ago. I'd like the three of us to discuss this before merging, but that would push it back to the first week of December.

Member

@turbomam turbomam left a comment

Thanks, this is very good, but I would like to talk through some subtleties. Apologies for the terse comments.

OntologyClass:
aliases: ["OntologyClass"]
Member

If I ask a blunt question, it doesn't mean I think your idea was crazy. I just want to succinctly start some conversations.

  • Why are you asserting the class name as an alias?
  • Why are you using [short list] notation? I don't think we have any precedent for that in nmdc-schema.

Member Author

@turbomam - I toyed with the idea of changing the name of the class. "Class" is redundant and inconsistent in the context of this schema, but after checking Biolink and other LinkML schemas, I see that it's a common name for this kind of class. I will remove the alias for now.

Short list notation is a convention I use in other models; happy to remove the list.

class_uri: nmdc:OntologyClass
description: >-
This class is used to represent ontology terms.
Member

I would like to stay away from self-referential definitions. Would something like "An optimized, local representation of a class defined in an external ontology" be OK?

Member Author

sure, it currently has no definition.

- owl:Class
- schema:Class

OntologyRelation:
Member

this whole thing is awesome

@@ -550,7 +550,7 @@ slots:
multivalued: true
description: >-
A list of alternative identifiers for the entity.
- pattern: '^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$'
+ pattern: '^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$'
Member

what are some examples of alternative identifiers using those additional characters?

Member Author

SMILES, InChI
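As a rough check of what the widened pattern accepts, the following Python sketch runs both regular expressions from the diff against a few sample values. The `smiles:` and `inchi:` prefixes and the sample identifiers are assumptions chosen for illustration; the PR does not say which prefixes or values the real data uses.

```python
import re

# Patterns copied from the diff above (old, then new).
OLD_PATTERN = re.compile(
    r'^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$'
)
NEW_PATTERN = re.compile(
    r'^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$'
)

# Hypothetical sample identifiers; prefixes are assumed, not taken from the PR.
samples = [
    "GO:0008150",                                     # ordinary CURIE
    "smiles:CC(=O)O",                                 # parentheses and '='
    "smiles:C#N",                                     # '#' (triple bond)
    "inchi:InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)",    # '=', '(', ')'
    "smiles:[Na+].[Cl-]",                             # bracket atoms
]

for value in samples:
    old_ok = OLD_PATTERN.match(value) is not None
    new_ok = NEW_PATTERN.match(value) is not None
    print(f"{value!r}: old={old_ok} new={new_ok}")
```

Under these assumptions, the ordinary CURIE passes both patterns, while the values containing `(`, `)`, `=`, or `#` pass only the new one. A SMILES string that starts with a bracket atom is still rejected by the new pattern, because `[` is not in either character class.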

2 participants