-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Converter: OWL to FHIR CodeSystem #320
Conversation
notebooks/api-key.txt | ||
|
||
.template.db | ||
.venv | ||
.tox/ | ||
.coverage.* | ||
|
||
.DS_Store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surprised this wasn't in .gitignore
src/oaklib/converters/owl_to_fhir.py
Outdated
@@ -0,0 +1,20 @@ | |||
"""Convert OWL to FHIR CodeSystem JSON | |||
|
|||
TODO's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Major and minor/later todo's listed here.
src/oaklib/converters/owl_to_fhir.py
Outdated
""" | ||
|
||
|
||
def convert(inpath: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just going to do things quick and dirty at first, as opposed to trying to conform to codebase conventions straight off the bat.
33d5aad
to
51cbd17
Compare
src/oaklib/converters/owl_to_fhir.py
Outdated
"""Convert | ||
|
||
Side effects: | ||
- Writes file at `outpath`""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the content the same as Dict? I would choose one or the other
[sorry to comment on a draft I can't help it!]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than writing to dicts you could have an actual python datamodel for the FHIR code system and serialize this. This has numerous advantages such as type safety, clarity, ease of refactoring. See the src/datamodels folder for examples
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commenting on the draft is good. The input is useful and it could take a long time to complete this PR if I get pulled in other directions.
This is a sound suggestion. I've added it to the list of "todo's minor" at the top of the file. My primary objective is to complete this issue for TIMS so that I can continue working on other issues, but if I can improve the typing at all in a quick way, I will do that earlier on.
@@ -27,8 +27,9 @@ kgcl-rdflib = "^0.3.0" | |||
pystow = "^0.4.4" | |||
class-resolver = "^0.3.10" | |||
ontoportal-client = "0.0.3" | |||
curies = "^0.1.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Update:
curies = "^0.1.5"
-->curies = "^0.3.0"
- Add: bioontologies = "^0.2.0"
For now I'm using bioontologies
until I can fix issues I'm having with getting OAK to do the conversion.
The curies
bit changed because, for some reason, even though semver syntax should support download of 0.3.0
via ^0.1.5
, it would simply not install unless I made this change. Error messages and discussion of that in are in this slack thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's best to do these in separate PRs, you will have conflicts now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. This change probably won't be included anyway. Also, Damien explained to me why semver was not doing as I expected here.
src/oaklib/converters/owl_to_fhir.py
Outdated
"""Convert | ||
|
||
Side effects: | ||
- Writes file at `outpath`""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commenting on the draft is good. The input is useful and it could take a long time to complete this PR if I get pulled in other directions.
This is a sound suggestion. I've added it to the list of "todo's minor" at the top of the file. My primary objective is to complete this issue for TIMS so that I can continue working on other issues, but if I can improve the typing at all in a quick way, I will do that earlier on.
src/oaklib/converters/owl_to_fhir.py
Outdated
|
||
# todo: This is for development purposes. can delete after or modify if needed | ||
# todo: would be nice if faster to load pickle https://stackoverflow.com/questions/66348333/speed-up-reading-multiple-pickle-files | ||
def _load_obograph(inpath: str, cache_dir=CACHE_DIR, save_cache=False, load_cache=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: Caching. May or may not keep some variation of this, turned off by default, in the final PR. This is especially useful for me right now since simply loading the bioontologies.robot
module is taking me about ~24 seconds right now.
aa2f64a
to
41b50e5
Compare
} | ||
for syn in node.meta.synonyms: | ||
# todo: will each Synonym realistically only have 1 xref? It supports a list. But could it be 0 or >1? | ||
for xref in syn.xrefs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that synonym.xrefs is for the provenance of the synonym - it's not the code system of the synonym.
E.g.
meta:
synonyms:
val: "foo disease type 2"
xrefs:
- PMID:1234 ### a publications
- ORCID:4567-897 ### Nicole's/curators orcid
- ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for pointing that out. I might have ended up making a mistake here.
bioregistry = "^0.5.64" | ||
bioontologies = "^0.2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. I'll remove this. I think I only added this here temporarily because I was using an interface and not an implementation and having issues because of that.
|
||
TODO's | ||
* Move this into GitHub issue (and link here) so I can get input on some of these more difficult parts / parts I don't yet understand | ||
1. Write converter MVP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could also look at pyrophen as a model of how to do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, I checked how Pyrophen does a couple days ago. It is simple enough to understand. My intuition is still leaning towards Obographs being faster and easier to do.
4.3. Obograph Node stuff | ||
4.3.1. See what all I can include from: | ||
- alternative_ids[], created_by, creation_date, definition, definition_provenance, id, lbl (label), synonyms[], | ||
- xrefs[] (I see the CURIE, but nothing else... does it not parse other xref info? or is it in axioms?), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the mappings - think of what we are doing with mondo, omim, ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are explicitly comparing this mapping case (xrefs between ontologies) with this mapping case (data models). I agree with you hear that this is the higher priority for using a mapping standard. In this case, if I do do that (and I'm not sure I will yet), I would reach for SSSOM.
25442c7
to
064bd0d
Compare
581bb2e
to
0e9b94a
Compare
- Add: New converter for OWL to FHIR CodeSystem JON (WIP) - Add: Tests for new converter (WIP) Misc - Update: .gitignore: added: .DS_Store
Hi @joeflack4 - I had a shot at a simplified version of this here: |
* Adding FHIR CodeSystem dumping. Fixes #367 May replace #320 * making logical def retrieval more efficient * Hacking with @joeflack4 * lint
Updates
Related