Finish modernization #20

Merged
merged 20 commits into from
Nov 27, 2024

genehack
Contributor

@genehack genehack commented Nov 16, 2024

Description of proposed changes

Preview builds on staging:

This PR adds the following changes:

  • Pull Nextclade dataset during ingest; use that for genotype/clade assignment
  • Add upload action to ingest workflow
  • Convert nextclade and phylogenetic workflows to download data from S3
  • Set an explicit, hard-coded clock rate of 2e-04±1e-05 for both genome and prM-E builds
  • Modify builds so that only genome gets run with --timetree
  • Fix authors versus abbreviated authors so that Auspice tips are labeled with the abbreviated version
  • Add frequencies panel to both builds
  • Filter out probable Clade I travel cases (country=China and dates around 2016-2017)
  • Add color.tsv generation based on what is in the dengue repo, for better country-level chroma/geography matchup
  • Add CI and ingest-to-phylo GitHub Actions
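
For reviewers who want to reason about the travel-case filter, here is a minimal Python sketch of the rule as described in the list above. It is not the code in this PR: the record layout (`strain`, `country`, `date` keys), the function names, and the exact year set are assumptions, since the description only says "dates around 2016-2017".

```python
# Hypothetical year window; the PR description only says "around 2016-2017".
TRAVEL_CASE_YEARS = {2016, 2017}


def parse_year(date_str):
    """Return the year of an ISO-like date string, or None if the year is masked."""
    year = date_str[:4]
    return int(year) if year.isdigit() else None


def is_probable_travel_case(record):
    """Flag records from China whose collection year falls in the assumed window."""
    year = parse_year(record.get("date", ""))
    return record.get("country") == "China" and year in TRAVEL_CASE_YEARS


def drop_travel_cases(records):
    """Keep every record that is not flagged as a probable travel case."""
    return [r for r in records if not is_probable_travel_case(r)]
```

Records with a masked year (for example `XXXX-XX-XX`) are kept in this sketch, because there is no way to tell whether they fall inside the window; whether the actual workflow treats them the same way is not stated here.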

Related issue(s)

Checklist

  • Checks pass

@genehack genehack marked this pull request as ready for review November 21, 2024 22:13

@joverlee521 joverlee521 left a comment

I only reviewed the general workflows, leaving scientific review to others.

Review threads:

  • ingest/build-configs/ci/copy_example_data.smk (outdated, resolved)
  • .github/workflows/ci.yaml (resolved)
  • ingest/rules/nextclade.smk (resolved)
  • ingest/build-configs/ci/config.yaml (outdated, resolved)
  • phylogenetic/defaults/color_orderings.tsv (outdated, resolved)

Also small script tweak
Also add custom rule import bits to phylogenetic/Snakefile
This is the full NCBI yellow-fever dataset (taxon ID 11089) as
downloaded on 21 Nov 2024, and the ingested sequences and metadata as
downloaded from Nextstrain on 21 Nov 2024.

CI _could_ work just fine without embedding this data in the repo, but
having a hard-coded example-data file means that we don't depend on
NCBI or Nextstrain.org for successful CI runs.
@genehack genehack force-pushed the finish-modernization-2 branch from f1a35a6 to 04b92ca Compare November 26, 2024 02:25
@genehack
Contributor Author

genehack commented Nov 26, 2024

Updated in response to feedback — thanks @joverlee521!

Rebuilt previews: genome, prM-E

@genehack genehack merged commit 0972a30 into main Nov 27, 2024
5 checks passed
@genehack genehack deleted the finish-modernization-2 branch November 27, 2024 01:10
Linked issue: Modernize repo
2 participants