Merge pull request #13 from nextstrain/relabel-clades-12
Relabel clades numerically, rather than geographically [#12]
genehack authored Aug 22, 2024
2 parents 0d74fbd + 5051b96 commit 1bf51d9
Showing 5 changed files with 85 additions and 48 deletions.
18 changes: 17 additions & 1 deletion nextclade/README.md
@@ -1,7 +1,7 @@
# Yellow Fever Virus Nextclade Dataset Tree

This workflow creates a phylogenetic tree that can be used as part of
-a Nextclade dataset to assign genotypes to yellow fever virus samples
+a Nextclade dataset to assign clades to yellow fever virus samples
based on [Mutebi et al.][] (J Virol. 2001 Aug;75(15):6999-7008) and
[Bryant et al.][] (PLoS Pathog. 2007 May 18;3(5):e75).

@@ -14,6 +14,22 @@ based on [Mutebi et al.][] (J Virol. 2001 Aug;75(15):6999-7008) and
* Provide the following coloring options on the tree:
* Genotype assignment from `augur clades`

+The clades we annotate (Clade I-VII) are roughly equivalent to the
+following genotypes as described in the aforementioned two papers:
+
+| Clade     | Genotype            |
+|-----------|---------------------|
+| Clade I   | Angola              |
+| Clade II  | East Africa         |
+| Clade III | East/Central Africa |
+| Clade IV  | West Africa I       |
+| Clade V   | West Africa II      |
+| Clade VI  | South America I     |
+| Clade VII | South America II    |
+
+(N.b., this table is available as a TSV in this repo, at
+`nextclade/defaults/clade-to-genotype.tsv`.)
+
## How to create a new tree

* Run the workflow: `nextstrain build .`
8 changes: 8 additions & 0 deletions nextclade/defaults/clade-to-genotype.tsv
@@ -0,0 +1,8 @@
+Clade	Genotype
+Clade I	Angola
+Clade II	East Africa
+Clade III	East/Central Africa
+Clade IV	West Africa I
+Clade V	West Africa II
+Clade VI	South America I
+Clade VII	South America II
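
A minimal sketch of how the new mapping file could be consumed, assuming it is tab-separated with the `Clade` and `Genotype` header shown above. The function name `load_clade_to_genotype` is hypothetical and not part of the repository, and the inlined rows are only an excerpt of the table.

```python
import csv
import io


def load_clade_to_genotype(handle):
    """Return a dict mapping clade label -> genotype name from a TSV stream.

    Assumes a header row with the columns 'Clade' and 'Genotype'.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Clade"]: row["Genotype"] for row in reader}


# Excerpt of the table, inlined so the example is self-contained.
# In the repo one would open nextclade/defaults/clade-to-genotype.tsv instead.
EXCERPT = (
    "Clade\tGenotype\n"
    "Clade I\tAngola\n"
    "Clade IV\tWest Africa I\n"
    "Clade VI\tSouth America I\n"
)

clade_to_genotype = load_clade_to_genotype(io.StringIO(EXCERPT))
print(clade_to_genotype["Clade IV"])  # prints "West Africa I"
```

Keeping the lookup in a file rather than in code means the numeric labels can be translated back to the published genotype names without touching the workflow.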
78 changes: 39 additions & 39 deletions nextclade/defaults/clades.tsv
@@ -1,40 +1,40 @@
clade	gene	site	alt
-Angola	nuc	111	G
-Angola	nuc	219	T
-Angola	nuc	240	C
-Angola	nuc	246	A
-Angola	nuc	252	A
-Angola	nuc	255	A
-Angola	nuc	291	G
-Angola	nuc	294	A
-Angola	nuc	300	A
-Angola	nuc	315	G
-Angola	nuc	327	G
-Angola	nuc	372	A
-Angola	nuc	420	A
-Angola	nuc	432	A
-Angola	nuc	453	T
-Angola	nuc	492	G
-Angola	nuc	651	T
-Angola	nuc	72	A
-Angola	nuc	81	G
-Angola	nuc	88	C
-Angola	nuc	90	A
-Angola	nuc	99	T
-East Africa	nuc	171	G
-East Africa	nuc	438	G
-East Africa	nuc	45	A
-East Africa	nuc	468	T
-East/Central Africa	nuc	228	G
-South America I	nuc	219	A
-South America I	nuc	532	A
-South America II	nuc	114	C
-South America II	nuc	193	T
-South America II	nuc	249	A
-South America II	nuc	639	G
-West Africa I	nuc	183	G
-West Africa I	nuc	255	C
-West Africa II	nuc	270	A
-West Africa II	nuc	321	T
-West Africa II	nuc	477	A
-West Africa II	nuc	93	T
+Clade I	nuc	111	G
+Clade I	nuc	219	T
+Clade I	nuc	240	C
+Clade I	nuc	246	A
+Clade I	nuc	252	A
+Clade I	nuc	255	A
+Clade I	nuc	291	G
+Clade I	nuc	294	A
+Clade I	nuc	300	A
+Clade I	nuc	315	G
+Clade I	nuc	327	G
+Clade I	nuc	372	A
+Clade I	nuc	420	A
+Clade I	nuc	432	A
+Clade I	nuc	453	T
+Clade I	nuc	492	G
+Clade I	nuc	651	T
+Clade I	nuc	72	A
+Clade I	nuc	81	G
+Clade I	nuc	88	C
+Clade I	nuc	90	A
+Clade I	nuc	99	T
+Clade II	nuc	171	G
+Clade II	nuc	438	G
+Clade II	nuc	45	A
+Clade II	nuc	468	T
+Clade III	nuc	228	G
+Clade VI	nuc	219	A
+Clade VI	nuc	532	A
+Clade VII	nuc	114	C
+Clade VII	nuc	193	T
+Clade VII	nuc	249	A
+Clade VII	nuc	639	G
+Clade IV	nuc	183	G
+Clade IV	nuc	255	C
+Clade V	nuc	270	A
+Clade V	nuc	321	T
+Clade V	nuc	477	A
+Clade V	nuc	93	T
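
The relabeling in this file is mechanical: only the `clade` column changes, while `gene`, `site`, and `alt` stay as they were. As an illustration only (the commit does not say how the file was actually edited), the sketch below applies a genotype-to-clade mapping to a small excerpt. `GENOTYPE_TO_CLADE` and `relabel_clades` are hypothetical names, and the mapping shown is partial.

```python
import csv
import io

# Hypothetical, partial genotype -> clade mapping taken from the table
# in this commit; a full run would need all seven entries.
GENOTYPE_TO_CLADE = {
    "Angola": "Clade I",
    "West Africa I": "Clade IV",
}


def relabel_clades(tsv_text, mapping):
    """Rewrite the 'clade' column of a clades TSV using `mapping`.

    Raises KeyError for any label missing from the mapping, so a
    misspelled genotype name fails loudly instead of passing through.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        row["clade"] = mapping[row["clade"]]
        writer.writerow(row)
    return out.getvalue()


# Two rows from the old file, inlined so the example is self-contained.
OLD = (
    "clade\tgene\tsite\talt\n"
    "Angola\tnuc\t111\tG\n"
    "West Africa I\tnuc\t183\tG\n"
)
NEW = relabel_clades(OLD, GENOTYPE_TO_CLADE)
print(NEW)
```

Failing on an unknown label is a deliberate choice here: the genotype names contain slashes and spaces that are easy to mistype, and a silent pass-through would leave a mix of old and new labels in the output.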
14 changes: 7 additions & 7 deletions nextclade/defaults/colors.tsv
@@ -1,8 +1,8 @@
# genotypes assigned by augur clades
-clade_membership	Angola	#3F63CF
-clade_membership	East Africa	#529AB6
-clade_membership	East/Central Africa	#75B681
-clade_membership	South America I	#A6BE55
-clade_membership	South America II	#D4B13F
-clade_membership	West Africa I	#E68133
-clade_membership	West Africa II	#DC2F24
+clade_membership	Clade I	#3F63CF
+clade_membership	Clade II	#529AB6
+clade_membership	Clade III	#75B681
+clade_membership	Clade IV	#A6BE55
+clade_membership	Clade V	#DC2F24
+clade_membership	Clade VI	#E68133
+clade_membership	Clade VII	#D4B13F
15 changes: 14 additions & 1 deletion nextclade/defaults/nextclade-dataset/README.md
@@ -10,7 +10,7 @@

## Scope of this dataset

-This dataset assigns genotypes to yellow fever virus samples based on
+This dataset assigns clades to yellow fever virus samples based on
strain and genotype information from [Mutebi et al.][] (J Virol. 2001
Aug;75(15):6999-7008) and [Bryant et al.][] (PLoS Pathog. 2007 May 18;3(5):e75)

@@ -21,6 +21,19 @@ comprises the 3' end of the pre-membrane protein (prM) gene, the
entire membrane protein (M) gene, and the 5' end of the envelope
protein (E) gene.

+The clades we annotate (Clade I-VII) are roughly equivalent to the
+following genotypes as described in the aforementioned two papers:
+
+| Clade     | Genotype            |
+|-----------|---------------------|
+| Clade I   | Angola              |
+| Clade II  | East Africa         |
+| Clade III | East/Central Africa |
+| Clade IV  | West Africa I       |
+| Clade V   | West Africa II      |
+| Clade VI  | South America I     |
+| Clade VII | South America II    |
+
(N.b., the reference sequence used in this dataset is actually 672 nt
long, from bases 641-1312 of the genome reference. The 2 extra bases
make the reference a complete open reading frame.)
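
The coordinate arithmetic in the note above can be checked directly. This is a small sketch that assumes the stated coordinates are 1-based and inclusive; the 670 nt figure is only what the note implies (672 minus the 2 extra bases), and the variable names are illustrative.

```python
# Coordinates quoted in the note, assumed 1-based and inclusive.
region_start = 641
region_end = 1312

# Inclusive range length: end - start + 1.
region_length = region_end - region_start + 1
print(region_length)  # prints 672

# A complete open reading frame must be a whole number of codons.
codons, remainder = divmod(region_length, 3)
print(codons, remainder)  # prints "224 0"

# Without the 2 extra bases the region would be 670 nt, which is not
# a multiple of 3 and so could not be translated as a complete frame.
shorter_length = region_length - 2
print(shorter_length % 3)  # prints 1
```

Divisibility by 3 is a necessary condition for a complete reading frame, not a sufficient one; it does not show that the frame starts at the right base.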