phylogenetic: Use inline root sequence #42

Merged
merged 1 commit into from
Apr 16, 2024

@joverlee521 joverlee521 commented Apr 16, 2024

Description of proposed changes

Explicitly state that the root-sequence.json file is an expected output of the core phylogenetic workflow.

This also ensures that the Nextstrain automation rule deploy_all will include the root-sequence.json in the upload.

Based on feedback from @jameshadfield in
nextstrain/zika#56 (comment)

Looking at the existing dataset files on S3,
the 5-6 KiB root-sequence.jsons are pretty negligible when the main
Auspice JSONs are 600-800 KiB. Nextstrain datasets are limited by the
500MB memory cap in Chrome,¹ so we'd be fine adding the
root sequence inline.

This ensures that our uploads will include the root sequence so that
they don't get out-of-sync with the main Auspice JSON.

¹ nextstrain/auspice#1622
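
The size argument above can be checked with a few lines of arithmetic. This is only a sketch using the ranges quoted in the description. It takes the comparison between file size and the Chrome memory cap at face value, and it assumes the cap means 500 * 1000 * 1000 bytes. The function names are hypothetical.

```python
# Size ranges quoted in the description, in KiB.
ROOT_SEQUENCE_KIB = (5, 6)
MAIN_JSON_KIB = (600, 800)

# Assumption: "500MB" is read as 500 * 1000 * 1000 bytes.
CHROME_CAP_BYTES = 500 * 1000 * 1000


def relative_growth(root_kib: float, main_kib: float) -> float:
    """Fraction by which the main JSON grows when the root sequence is inlined."""
    return root_kib / main_kib


def fraction_of_cap(total_kib: float, cap_bytes: int = CHROME_CAP_BYTES) -> float:
    """Share of the cap taken up by a file of total_kib KiB."""
    return total_kib * 1024 / cap_bytes


# Worst case for growth: largest root sequence added to the smallest main JSON.
worst_growth = relative_growth(max(ROOT_SEQUENCE_KIB), min(MAIN_JSON_KIB))

# Largest combined file: biggest main JSON plus biggest root sequence.
largest_inline = fraction_of_cap(max(MAIN_JSON_KIB) + max(ROOT_SEQUENCE_KIB))

print(f"worst-case growth of the main JSON: {worst_growth:.1%}")
print(f"largest inlined file as a share of the cap: {largest_inline:.2%}")
```

With these numbers the main JSON grows by about 1% in the worst case, and the largest combined file stays well under 1% of the cap.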

Related issue(s)

Follow up to #37
Similar to nextstrain/zika#56

Checklist

  • Checks pass

@joverlee521 joverlee521 requested review from j23414 April 16, 2024 00:32

@j23414 j23414 left a comment

Thanks! 😄

@joverlee521 joverlee521 changed the title from "phylogenetic: Add root-sequence.json to rule all" to "phylogenetic: Use inline root sequence" Apr 16, 2024

joverlee521 commented Apr 16, 2024

Merging since the CI run's outputs include root_sequence in the Auspice JSONs and the datasets look good in auspice.us
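
A check like the one described in this comment could be scripted. The sketch below assumes that the inline root sequence sits under a top-level `root_sequence` key of the Auspice JSON and that it is a non-empty object such as `{"nuc": "..."}`. The helper names and the example dataset are hypothetical.

```python
import json


def has_inline_root_sequence(dataset: dict) -> bool:
    """Return True if the parsed Auspice JSON carries a non-empty root_sequence object."""
    root = dataset.get("root_sequence")
    return isinstance(root, dict) and len(root) > 0


def file_has_inline_root_sequence(path: str) -> bool:
    """Load an Auspice JSON from disk and run the same check."""
    with open(path, encoding="utf-8") as handle:
        return has_inline_root_sequence(json.load(handle))


# Hypothetical minimal datasets, for illustration only.
with_root = {"version": "v2", "meta": {}, "tree": {}, "root_sequence": {"nuc": "ACGT"}}
without_root = {"version": "v2", "meta": {}, "tree": {}}

print(has_inline_root_sequence(with_root))
print(has_inline_root_sequence(without_root))
```

Running `file_has_inline_root_sequence` over each JSON in the CI artifacts would flag any dataset that still depends on a sidecar file.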

@joverlee521 joverlee521 merged commit afa97a4 into main Apr 16, 2024
32 checks passed
@joverlee521 joverlee521 deleted the fix-all-output branch April 16, 2024 18:51

joverlee521 commented Apr 16, 2024

Manually deleted the cache and triggered a re-run of the ingest-to-phylo workflow.

Once complete:

  • Remove the existing s3://nextstrain-data/dengue*_root-sequence.json files

@joverlee521

Removed the following files from s3://nextstrain-data/:

  • dengue_all_genome_root-sequence.json
  • dengue_denv1_genome_root-sequence.json
  • dengue_denv2_genome_root-sequence.json
  • dengue_denv3_genome_root-sequence.json
  • dengue_denv4_genome_root-sequence.json

Left the E gene root-sequence.json files since they are not being updated by this PR.
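
The selection above (genome sidecar files removed, E gene sidecar files kept) can be expressed as a glob filter over a bucket listing. This is a sketch, not the procedure actually used: the function name is hypothetical, the listing is a hand-written example, and the E gene file name in it is an assumed name for illustration.

```python
from fnmatch import fnmatchcase

# Matches only the genome sidecar files, unlike the broader
# dengue*_root-sequence.json pattern, which would also catch E gene files.
GENOME_SIDECAR_PATTERN = "dengue_*_genome_root-sequence.json"


def stale_genome_sidecars(keys: list[str]) -> list[str]:
    """Return the keys that are genome root-sequence sidecar files, sorted."""
    return sorted(k for k in keys if fnmatchcase(k, GENOME_SIDECAR_PATTERN))


# Hypothetical bucket listing; the E gene key is an assumed name.
listing = [
    "dengue_all_genome.json",
    "dengue_all_genome_root-sequence.json",
    "dengue_denv1_genome_root-sequence.json",
    "dengue_denv1_E_root-sequence.json",
]

for key in stale_genome_sidecars(listing):
    print(key)
```

Printing the selected keys before deleting anything gives a dry run that can be compared against the list above.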

@j23414 j23414 left a comment

Downloaded the auspice files, and they look good to me!


2 participants