-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connect ingest and phylogenetic workflows to follow pathogen-repo-guide #19
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kimandrews!
I skimmed through the code changes and they look alright, but didn't review in depth. Thanks for uploading the new metadata/sequences to S3.
I ran through the phylogenetic workflow and it works as expected. The steps to upload the dataset to s3://nextstrain-data
(i.e. to make it live) aren't yet here, but with the new ingest & phylogenetics being part of the master branch I think the new dataset should be made live. You could do this manually now and add the code in a later PR, but we should try to keep the pipeline in-sync with the live dataset.
Auspice flags the following error, which is not part of your PR but should be fixed at some point:
[Genome annotation] V has length 899 which is not a multiple of 3
An aside I noticed here: We should allow augur tree
and augur align
to run with more than one thread. You can see an example of how to connect snakemake and augur's --nthreads
argument here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes here look good to me, I've only left a couple of non-blocking comments.
Feel free to merge and get other fixes in via separate PRs.
51c63ec
to
5390f10
Compare
The same samples are used for example data as were previously used, but now they have been pulled from the ingest output. Ambiguous dates were manually fixed in the example data for samples JN635406.1, JN635408.1, and EF565859.1, based on dates found here: https://github.com/nextstrain/fauna/blob/dd20a1fad51433e0bdc206f4300635c0f93f8599/source-data/measles_date_fix.tsv
5390f10
to
9e09647
Compare
Description of proposed changes
Connect ingest and phylogenetic workflows, following the pathogen-repo-guide:
This PR focuses on making changes that are necessary to move forward with creating a locus-specific workflow in a later PR.
Related issue(s)
#11
Checklist