test instantiating biosamples from NCBI #149

Draft
wants to merge 3 commits into
base: main
from


turbomam
Member

@turbomam turbomam commented Sep 5, 2023

Instantiating against submission-schema instead of nmdc-schema because submission-schema is already flat, like NCBI Biosample attributes.

  • There are some required slots in SoilInterface (for example) that are unlikely to be asserted for NCBI Biosamples
    • so I'm overlaying NCBI Biosample attributes over a known good data file
  • Some NCBI Biosample attributes would require only minor tweaking, like a lat_lon of '71.323 N 156.6114 W'
  • Other NCBI Biosample attributes may be much further from our specification
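As a rough illustration of the "minor tweaking" case, here is a minimal sketch that rewrites a hemisphere-suffixed lat_lon such as '71.323 N 156.6114 W' as signed decimal degrees. The helper name `normalize_lat_lon` is hypothetical, and the target format (signed latitude, a space, signed longitude) is an assumption about what submission-schema expects, so it should be checked against the schema's lat_lon pattern before relying on it.

```python
import re

# Hypothetical helper, not part of submission-schema or this PR.
# Matches values such as '71.323 N 156.6114 W'.
_LAT_LON = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*([NS])\s+(\d+(?:\.\d+)?)\s*([EW])\s*$",
    re.IGNORECASE,
)


def normalize_lat_lon(value: str) -> str:
    """Convert 'LAT N|S LON E|W' into signed decimal degrees, 'lat lon'.

    The output format is an assumption about the target schema.
    """
    match = _LAT_LON.match(value)
    if match is None:
        raise ValueError(f"unrecognized lat_lon value: {value!r}")
    lat_text, lat_hemi, lon_text, lon_hemi = match.groups()
    if float(lat_text) > 90 or float(lon_text) > 180:
        raise ValueError(f"lat_lon out of range: {value!r}")
    # Keep the original digits as text so no precision is lost or invented.
    lat = lat_text if lat_hemi.upper() == "N" else f"-{lat_text}"
    lon = lon_text if lon_hemi.upper() == "E" else f"-{lon_text}"
    return f"{lat} {lon}"


print(normalize_lat_lon("71.323 N 156.6114 W"))
```

Values that do not match the pattern raise a `ValueError` instead of being guessed at, which keeps the attributes that are further from the specification visible for manual review.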

@turbomam turbomam requested a review from aclum September 5, 2023 23:29
@turbomam
Member Author

turbomam commented Sep 5, 2023

@pkalita-lbl have you worked on anything like this yet?

I'm getting the Attributes for one NCBI Biosample here from the efetch API. We also have a Postgres version of the NCBI Biosample database on SPIN.
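For readers who want to try the same approach, here is a minimal sketch of pulling the Attributes out of a BioSample XML record and overlaying them on a known good record. It is not the code in this PR (which appears to use xmltodict); it uses only the standard library. The function names, the sample accession `SAMN00000000`, the trimmed XML fragment, and the `samp_name` and `collection_date` values are all made up for illustration; only the lat_lon value comes from this thread. The choice to overlay only slots that already exist in the known good record is also an assumption.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Hand-written, trimmed stand-in for an efetch response (illustrative only).
SAMPLE_XML = """
<BioSampleSet>
  <BioSample accession="SAMN00000000">
    <Attributes>
      <Attribute attribute_name="lat_lon" harmonized_name="lat_lon">71.323 N 156.6114 W</Attribute>
      <Attribute attribute_name="collection date" harmonized_name="collection_date">2010-07-01</Attribute>
      <Attribute attribute_name="lab notebook page">17</Attribute>
    </Attributes>
  </BioSample>
</BioSampleSet>
"""


def efetch_url(biosample_id: str) -> str:
    """Build an efetch URL for one BioSample (no request is made here)."""
    query = urlencode({"db": "biosample", "id": biosample_id, "retmode": "xml"})
    return f"{EFETCH_URL}?{query}"


def attributes_from_xml(xml_text: str) -> dict[str, str]:
    """Map each Attribute's harmonized (or raw) name to its text value."""
    attributes = {}
    for element in ET.fromstring(xml_text).iter("Attribute"):
        name = element.get("harmonized_name") or element.get("attribute_name")
        if name:
            attributes[name] = (element.text or "").strip()
    return attributes


def overlay(known_good: dict, attributes: dict) -> dict:
    """Copy known_good, replacing only slots it already defines (assumption)."""
    merged = dict(known_good)
    for name, value in attributes.items():
        if name in merged:
            merged[name] = value
    return merged


known_good = {"samp_name": "example sample", "lat_lon": "0 0"}
print(overlay(known_good, attributes_from_xml(SAMPLE_XML)))
```

Restricting the overlay to existing slots keeps the required slots from the known good record in place and ignores NCBI attributes that have no counterpart in it; the overlaid values would still need normalizing before they validate.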

@turbomam
Member Author

turbomam commented Sep 5, 2023

There's some code in https://github.com/INCATools/biosample-analysis that tries to clean up some NCBI Biosample attributes. I don't think it uses nmdc-schema or submission-schema classes as its target.

It was some of the earliest and worst code I wrote for BBOP.

Maybe in https://github.com/INCATools/biosample-analysis/blob/mam-envo-mapping/src/mixs-envo-mapping/biosample-triad-mapping.py?

It might not be much use doing any more research in there.

@turbomam
Member Author

turbomam commented Sep 5, 2023

I don't know why the actions are saying

Because nmdc-submission-schema depends on xmltodict (^0.13.0) which doesn't match any versions, version solving failed.
Error: Process completed with exit code 1.

https://pypi.org/project/xmltodict/0.13.0/

test with printing and minimal assertions
@turbomam
Member Author

turbomam commented Oct 28, 2024

I have the sense that team members like @aclum and @sujaypatil96 are developing code for this in nmdc-runtime or other repos, so I propose closing this PR.

@aclum
Collaborator

aclum commented Oct 28, 2024

Efforts of the current squad have been focused on nmdc biosamples -> ncbi samples + SRA records, not the other way around.
