Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converter: OWL to FHIR CodeSystem #320

Closed
wants to merge 3 commits into from
Closed

Conversation

joeflack4
Copy link
Contributor

@joeflack4 joeflack4 commented Oct 18, 2022

Updates

OWL to FHIR
- Add: New converter for OWL to FHIR CodeSystem JON (WIP)
- Add: Tests for new converter (WIP)

Misc
- Update: .gitignore: added: .DS_Store

Related

@joeflack4 joeflack4 marked this pull request as draft October 18, 2022 22:19
notebooks/api-key.txt

.template.db
.venv
.tox/
.coverage.*

.DS_Store
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprised this wasn't in .gitignore

@@ -0,0 +1,20 @@
"""Convert OWL to FHIR CodeSystem JSON

TODO's
Copy link
Contributor Author

@joeflack4 joeflack4 Oct 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major and minor/later todo's listed here.

src/oaklib/converters/owl_to_fhir.py Outdated Show resolved Hide resolved
"""


def convert(inpath: str):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just going to do things quick and dirty at first, as opposed to trying to conform to codebase conventions straight off the bat.

@joeflack4 joeflack4 force-pushed the fhir branch 2 times, most recently from 33d5aad to 51cbd17 Compare October 18, 2022 23:34
"""Convert

Side effects:
- Writes file at `outpath`"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the content the same as Dict? I would choose one or the other

[sorry to comment on a draft I can't help it!]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than writing to dicts you could have an actual python datamodel for the FHIR code system and serialize this. This has numerous advantages such as type safety, clarity, ease of refactoring. See the src/datamodels folder for examples

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting on the draft is good. The input is useful and it could take a long time to complete this PR if I get pulled in other directions.

This is a sound suggestion. I've added it to the list of "todo's minor" at the top of the file. My primary objective is to complete this issue for TIMS so that I can continue working on other issues, but if I can improve the typing at all in a quick way, I will do that earlier on.

@@ -27,8 +27,9 @@ kgcl-rdflib = "^0.3.0"
pystow = "^0.4.4"
class-resolver = "^0.3.10"
ontoportal-client = "0.0.3"
curies = "^0.1.5"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Update: curies = "^0.1.5" --> curies = "^0.3.0"
  • Add: bioontologies = "^0.2.0"

For now I'm using bioontologies until I can fix issues I'm having with getting OAK to do the conversion.
The curies bit changed because, for some reason, even though semver syntax should support download of 0.3.0 via ^0.1.5, it would simply not install unless I made this change. Error messages and discussion of that in are in this slack thread.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's best to do these in separate PRs, you will have conflicts now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. This change probably won't be included anyway. Also, Damien explained to me why semver was not doing as I expected here.

src/oaklib/converters/owl_to_fhir.py Outdated Show resolved Hide resolved
"""Convert

Side effects:
- Writes file at `outpath`"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting on the draft is good. The input is useful and it could take a long time to complete this PR if I get pulled in other directions.

This is a sound suggestion. I've added it to the list of "todo's minor" at the top of the file. My primary objective is to complete this issue for TIMS so that I can continue working on other issues, but if I can improve the typing at all in a quick way, I will do that earlier on.


# todo: This is for development purposes. can delete after or modify if needed
# todo: would be nice if faster to load pickle https://stackoverflow.com/questions/66348333/speed-up-reading-multiple-pickle-files
def _load_obograph(inpath: str, cache_dir=CACHE_DIR, save_cache=False, load_cache=True):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: Caching. May or may not keep some variation of this, turned off by default, in the final PR. This is especially useful for me right now since simply loading the bioontologies.robot module is taking me about ~24 seconds right now.

@joeflack4 joeflack4 force-pushed the fhir branch 3 times, most recently from aa2f64a to 41b50e5 Compare October 19, 2022 02:53
}
for syn in node.meta.synonyms:
# todo: will each Synonym realistically only have 1 xref? It supports a list. But could it be 0 or >1?
for xref in syn.xrefs:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that synonym.xrefs is for the provenance of the synonym - it's not the code system of the synonym.

E.g.

meta:
  synonyms:
    val: "foo disease type 2"
    xrefs:
      - PMID:1234   ### a publications
      - ORCID:4567-897 ### Nicole's/curators orcid
      - ...

Copy link
Contributor Author

@joeflack4 joeflack4 Oct 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for pointing that out. I might have ended up making a mistake here.

bioregistry = "^0.5.64"
bioontologies = "^0.2.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't be necessary

Copy link
Contributor Author

@joeflack4 joeflack4 Oct 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree. I'll remove this. I think I only added this here temporarily because I was using an interface and not an implementation and having issues because of that.


TODO's
* Move this into GitHub issue (and link here) so I can get input on some of these more difficult parts / parts I don't yet understand
1. Write converter MVP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also look at pyrophen as a model of how to do this

Copy link
Contributor Author

@joeflack4 joeflack4 Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, I checked how Pyrophen does a couple days ago. It is simple enough to understand. My intuition is still leaning towards Obographs being faster and easier to do.

4.3. Obograph Node stuff
4.3.1. See what all I can include from:
- alternative_ids[], created_by, creation_date, definition, definition_provenance, id, lbl (label), synonyms[],
- xrefs[] (I see the CURIE, but nothing else... does it not parse other xref info? or is it in axioms?),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the mappings - think of what we are doing with mondo, omim, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are explicitly comparing this mapping case (xrefs between ontologies) with this mapping case (data models). I agree with you hear that this is the higher priority for using a mapping standard. In this case, if I do do that (and I'm not sure I will yet), I would reach for SSSOM.

@joeflack4 joeflack4 force-pushed the fhir branch 11 times, most recently from 25442c7 to 064bd0d Compare November 4, 2022 22:19
@joeflack4 joeflack4 force-pushed the fhir branch 2 times, most recently from 581bb2e to 0e9b94a Compare November 6, 2022 21:43
- Add: New converter for OWL to FHIR CodeSystem JON (WIP)
- Add: Tests for new converter (WIP)

Misc
- Update: .gitignore: added: .DS_Store
@cmungall
Copy link
Collaborator

Hi @joeflack4 - I had a shot at a simplified version of this here:

cmungall added a commit that referenced this pull request Nov 30, 2022
* Adding FHIR CodeSystem dumping.

Fixes #367

May replace #320

* making logical def retrieval more efficient

* Hacking with @joeflack4

* lint
@joeflack4 joeflack4 closed this Dec 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants