phylogenetic: Use inline root sequence #42

joverlee521 · 2024-04-16T00:20:13Z

Description of proposed changes

~~Explicitly state that the root-sequence.json file is an expected output of the core phylogenetic workflow.~~

~~This also ensures that the Nextstrain automation rule deploy_all will include the root-sequence.json in the upload.~~

Based on feedback from @jameshadfield in
nextstrain/zika#56 (comment)

Looking at the existing dataset files on S3,
the 5-6 KiB root-sequence.jsons are pretty negligible when the main
Auspice JSONs are 600-800 KiB. Nextstrain datasets are limited by the
500MB memory cap in Chrome,¹ so we'd be fine adding the
root sequence inline.

This ensures that our uploads will include the root sequence so that
they don't get out-of-sync with the main Auspice JSON.

¹ nextstrain/auspice#1622
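To make the size argument above concrete, here is a minimal sketch of the arithmetic. It uses only the figures quoted in this description and assumes the inlined root sequence adds roughly as many bytes as the separate root-sequence.json file did; the function name is illustrative and not part of the workflow.

```python
def inline_overhead(root_kib: float, main_kib: float) -> float:
    """Fractional growth of the main Auspice JSON if the root sequence is inlined.

    Assumes the inlined data is about the same size as the sidecar file.
    """
    return root_kib / main_kib


# Worst case: largest sidecar (6 KiB) added to the smallest main JSON (600 KiB).
worst = inline_overhead(6, 600)
# Best case: smallest sidecar (5 KiB) added to the largest main JSON (800 KiB).
best = inline_overhead(5, 800)

print(f"Relative size increase: {best:.1%} to {worst:.1%}")
```

Under these assumptions the increase is about 1% or less of the main JSON, which is far from the 500MB limit mentioned above.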

Related issue(s)

Follow up to #37
Similar to nextstrain/zika#56

Checklist

Checks pass

j23414

Thanks! 😄


joverlee521 · 2024-04-16T18:51:02Z

Merging since the CI run's outputs include root_sequence in the Auspice JSONs and the datasets look good in auspice.us.
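For anyone who wants to repeat this check locally, a minimal sketch follows. It assumes the inline root sequence is stored under a top-level `root_sequence` key in the main Auspice JSON, as the comment above describes; the function name and the command-line usage are illustrative only.

```python
import json
import sys


def has_inline_root_sequence(auspice_json_path: str) -> bool:
    """Return True if the Auspice JSON has a non-empty top-level root_sequence object."""
    with open(auspice_json_path, encoding="utf-8") as handle:
        dataset = json.load(handle)
    root_sequence = dataset.get("root_sequence")
    return isinstance(root_sequence, dict) and len(root_sequence) > 0


if __name__ == "__main__":
    # Pass one or more Auspice JSON paths on the command line.
    for path in sys.argv[1:]:
        status = "inline root sequence found" if has_inline_root_sequence(path) else "MISSING root sequence"
        print(f"{path}: {status}")
```

This only confirms that the key is present and non-empty; it does not validate the sequence itself.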

joverlee521 · 2024-04-16T18:52:25Z

Manually deleted cache and triggered a re-run of the ingest-to-phylo workflow.

Once complete:

Remove the existing s3://nextstrain-data/dengue*_root-sequence.json files

joverlee521 · 2024-04-16T20:34:44Z

Removed the following files from s3://nextstrain-data/

dengue_all_genome_root-sequence.json
dengue_denv1_genome_root-sequence.json
dengue_denv2_genome_root-sequence.json
dengue_denv3_genome_root-sequence.json
dengue_denv4_genome_root-sequence.json

Left the E gene root-sequence.json files since they are not being updated by this PR.
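The selection described above (remove the genome sidecar files, keep the E gene ones) can be sketched as a simple filter. This is not the command that was actually run: the glob pattern is an assumption inferred from the file names listed, and the E gene file name in the example is hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed pattern, inferred from the removed file names listed above.
GENOME_SIDECAR_PATTERN = "dengue_*_genome_root-sequence.json"


def sidecars_to_remove(keys):
    """Return only the genome root-sequence sidecar files from a list of object keys."""
    return [key for key in keys if fnmatchcase(key, GENOME_SIDECAR_PATTERN)]


example_keys = [
    "dengue_all_genome_root-sequence.json",
    "dengue_denv1_genome_root-sequence.json",
    "dengue_denv1_E_root-sequence.json",  # hypothetical E gene sidecar name, should be kept
    "dengue_all_genome.json",             # main Auspice JSON, should be kept
]

for key in sidecars_to_remove(example_keys):
    print(key)
```

Filtering on the full `_genome_root-sequence.json` suffix leaves both the main dataset JSONs and any E gene sidecars untouched.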

j23414

Downloaded the auspice files, and they look good to me!

joverlee521 requested review from j23414 April 16, 2024 00:32

j23414 approved these changes Apr 16, 2024


joverlee521 force-pushed the fix-all-output branch from b194908 to 0220a50 Compare April 16, 2024 18:04

joverlee521 changed the title ~~phylogenetic: Add root-sequence.json to rule all~~ phylogenetic: Use inline root sequence Apr 16, 2024

joverlee521 merged commit afa97a4 into main Apr 16, 2024
32 checks passed

joverlee521 deleted the fix-all-output branch April 16, 2024 18:51

j23414 reviewed Apr 16, 2024
