Add One Health Enteric BioSample Package + misc bugfixes #38
…rough database names when querying for submission updates. Changed navigation in get_ncbi_process_report to step through parent directories individually before entering test or production, because the NCBI FTP server is configured to hide child directories until you access the parents. Added some very preliminary handling for the One Health Enteric metadata package in SRA and BioSample.
…s submitting to GenBank or GISAID. Added all mandatory and optional fields for the One Health Enteric BioSample package to main_config.yaml and added metadata/config templates. Added handling to remove empty optional metadata columns if they are not filled out at submission time.
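The "remove empty optional metadata columns" step can be pictured with a minimal sketch. This is not SeqSender's actual implementation: it uses plain Python dictionaries instead of the project's own data handling, and the function name `drop_empty_optional_columns` and the column names are hypothetical, chosen only for illustration.

```python
def drop_empty_optional_columns(rows, required):
    """Return copies of `rows` without optional columns that are blank in every row.

    `rows` is a list of dicts (one per sample); `required` is a set of column
    names that must be kept even when empty. Both are illustrative assumptions.
    """
    def is_blank(value):
        return value is None or str(value).strip() == ""

    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    # Keep required columns, plus optional ones filled in for at least one row.
    keep = [
        name for name in columns
        if name in required or any(not is_blank(row.get(name)) for row in rows)
    ]
    return [{name: row.get(name, "") for name in keep} for row in rows]


# Hypothetical metadata: "collected_by" is optional and empty for every sample.
samples = [
    {"sample_name": "S1", "organism": "Salmonella enterica", "collected_by": "", "host_age": ""},
    {"sample_name": "S2", "organism": "Escherichia coli", "collected_by": "", "host_age": "34"},
]
cleaned = drop_empty_optional_columns(samples, required={"sample_name", "organism"})
```

Here `collected_by` is dropped because no sample fills it in, while `host_age` is kept because one sample does.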
… - fields with * also must be unique. Probably have to revisit this for GISAID/GENBANK
Correcting this: not all FTP accounts have the "submit" folder, so it now automatically detects the folder and steps into it if it exists.
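A rough sketch of that navigation, using Python's standard `ftplib`. It is an illustration under assumptions, not the code in this pull request: the helper name `enter_submission_dir` is hypothetical, and the environment folder name (for example "Test" or "Production") is passed in by the caller rather than taken from SeqSender.

```python
from ftplib import error_perm


def enter_submission_dir(ftp, environment, submit_dir="submit"):
    """Walk into the submission folder one level at a time.

    `ftp` is a connected, logged-in ftplib.FTP object (or anything with the
    same nlst()/cwd() methods). Returns the directories entered, in order.
    """
    entered = []
    try:
        # nlst() may return bare names or full paths depending on the server.
        names = {entry.rstrip("/").rsplit("/", 1)[-1] for entry in ftp.nlst()}
    except error_perm:
        # Some servers answer with a 550 error when the listing is empty.
        names = set()

    # Only step into the optional parent folder when the account has one.
    if submit_dir in names:
        ftp.cwd(submit_dir)
        entered.append(submit_dir)

    # Enter the environment folder in a separate step, since child folders
    # may stay hidden until the parent has been accessed.
    ftp.cwd(environment)
    entered.append(environment)
    return entered
```

Changing directory one level per `cwd()` call, instead of a single `cwd("submit/Test")`, is the point of the sketch: it covers both accounts with and without the parent folder.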
library_name, not library_ID, is the correct attribute value to use, based on their examples: https://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/submit/public-docs/sra/samples/sra.submission.run.xml?revision=71838&view=markup
If bs-description is empty, don't build the descriptor with an empty string. Remove it so NCBI generates it automatically.
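The idea of omitting an element instead of writing it with an empty string can be sketched with the standard library's `xml.etree.ElementTree`. The helper `add_optional_child` and the element names below are placeholders for illustration; they are assumptions and do not reproduce the layout of NCBI's submission XML or SeqSender's builder.

```python
import xml.etree.ElementTree as ET


def add_optional_child(parent, tag, value):
    """Append <tag>value</tag> to `parent` only when `value` has content.

    `value` is expected to be a string or None. Returns the new element,
    or None when nothing was added.
    """
    text = (value or "").strip()
    if not text:
        # Leave the element out entirely so the receiving side can fill it in.
        return None
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


# Placeholder element names, not the real submission schema.
sample = ET.Element("BioSample")
add_optional_child(sample, "Description", "")  # skipped: empty description
add_optional_child(sample, "Title", "Isolate S1")
xml_text = ET.tostring(sample, encoding="unicode")
# xml_text == "<BioSample><Title>Isolate S1</Title></BioSample>"
```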
Hey @erikwolfsohn, thanks for contributing to SeqSender and incorporating enteric pathogens. I made a couple of changes based on my review to fix some issues I found, but everything looks good. I'm still running a few more tests to make sure there aren't any other issues, and I should have all your changes merged in within the next few days. If you have any other contributions you'd like to make, please open a pull request anytime, or suggest features you'd like to see in our issues.
Awesome, thank you! I saw in another issue you were talking about implementing Pandera metadata validation as a way to support new pathogens and BioSample packages in a future release. I think that's a great idea, and I'd love to contribute if possible. I started working on a Pandera schema for the One Health Enteric package and I really like it as an alternative to validating against that main YAML config file. Feel free to shoot me an email at [email protected] if you have some time to chat about your plans for that and possible ways I can contribute - I think this submission pipeline is going to be incredibly useful for our lab, so I definitely want to help in any way I can.
Yes, I'm working quickly to get it added. The different requirements for One Health Enteric BioSample attributes have caused some issues when testing, so instead of reinventing the wheel with a temporary fix, I'm going to move the Pandera validation up to the next version update to resolve this properly. I don't think you'll need to manually create a One Health Enteric-specific schema, as I'm currently testing a way to automatically generate it from NCBI's website. I should have this available on the version update branch later this week. I've already added the Enteric XML to the test set I'm working on. I do have a couple of other questions that I'm pooling together, so once I have the update live I'll send you an email to let you know, with my other questions included. Once you get my email, if you could test it with the automatically generated schema, that would be a major help.
Hi Dakota,
I just wanted to check in and make sure I didn't miss any emails from you.
I was focusing on some other projects and this dropped off my radar
a little bit. Let me know if there's any way I can contribute currently or
if anything is ready for testing. Pulling the metadata templates directly
from NCBI sounds fantastic, I'm definitely excited for that feature. I'll
be at a conference next week so I won't be available to do much testing,
but I'll be back on May 13th.
…On Tue, Mar 19, 2024 at 8:10 AM Dakota Howard ***@***.***> wrote:
Hey @erikwolfsohn, sorry for not reaching out sooner; it took me a bit longer than anticipated to finish working through the update. The update is mostly complete, and I'm in the process of finalizing the updated instructions for the documentation. I'll go ahead and send you an email now to connect, as I'm definitely in need of users to test this new version. The updated documentation will be available on the branch https://github.com/CDCgov/seqsender/tree/v1.2.0 by the time you're back on the 13th. I'll also send you an email when I upload the documentation.
SeqSender V1.2.0 is now out and supports the One Health Enteric Package. Use the documentation to select the package and get the correct metadata and config files.
This is awesome, thank you for all your hard work on this! We're planning
to scale up sequencing significantly at our lab, so I cannot overstate how
excited I am about this new release. Since we do our analysis and
submission from inside the Terra.bio cloud platform, I'm working on a Terra
workflow to use seqsender in that environment and will definitely share it
with you when I'm done - hopefully by the end of this week.
SRA submission has worked great for me so far. I am having some trouble
with the GISAID submission, but I'm not sure if I'm encountering a bug or
it's just user error. I'll open a separate issue shortly with what I've
found so far.
…-Erik
On Tue, Aug 6, 2024 at 1:09 PM Dakota Howard ***@***.***> wrote:
Hi! I just found out about this fantastic submission pipeline you all built and I'm really excited to start using it. I had to make some updates since the majority of my NCBI submissions are enteric pathogens, so I figured I'd submit a pull request in case any of these changes are useful for you all.
📋 Updates
🛠️ Fixes