Commit
Showing 14 changed files with 607 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Yellow Fever Virus Nextclade Dataset Tree | ||
|
||
This workflow creates a phylogenetic tree that can be used as part of | ||
a Nextclade dataset to assign genotypes to yellow fever virus samples based on | ||
FIXME reference to those two papers goes here. | ||
|
||
* Build a tree using samples from the `ingest` output, with the following | ||
sampling criteria: | ||
* Force-include the following samples: | ||
* genotype reference strains | ||
* Assign genotypes to each sample and internal nodes of the tree with | ||
`augur clades`, using clade-defining mutations in | ||
`defaults/clades.tsv` | ||
* Provide the following coloring options on the tree: | ||
* Genotype assignment from `augur clades` | ||
|
||
## How to create a new tree | ||
|
||
* Run the workflow: `nextstrain build .` | ||
* Inspect the output tree by comparing genotype assignments from the following sources: | ||
* `augur clades` output | ||
* If unwanted samples are present in the tree, add them to | ||
`defaults/dropped_strains.tsv` and re-run the workflow | ||
* If any changes are needed to the clade-defining mutations, add | ||
changes to `defaults/clades.tsv` and re-run the workflow | ||
* Repeat as needed |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
configfile: "defaults/config.yaml" | ||
|
||
rule all: | ||
input: | ||
auspice_json = config["files"]["auspice_json"], | ||
|
||
include: "rules/prepare_sequences.smk" | ||
include: "rules/construct_phylogeny.smk" | ||
include: "rules/annotate_phylogeny.smk" | ||
include: "rules/export.smk" | ||
|
||
rule clean: | ||
params: | ||
targets = [ | ||
".snakemake", | ||
"auspice", | ||
"benchmarks", | ||
"data", | ||
"logs", | ||
"results", | ||
] | ||
shell: | ||
""" | ||
rm -rfv {params.targets} | ||
""" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
{ | ||
"title": "Real-time tracking of yellow fever virus full genome evolution", | ||
"maintainers": [ | ||
{"name": "John SJ Anderson", "url": "https://bedford.io/team/john-sj-anderson/"}, | ||
{"name": "the Nextstrain team", "url": "https://nextstrain.org/team"} | ||
], | ||
"data_provenance": [ | ||
{ | ||
"name": "GenBank", | ||
"url": "https://www.ncbi.nlm.nih.gov/genbank/" | ||
} | ||
], | ||
"build_url": "https://github.com/nextstrain/yellow-fever", | ||
"colorings": [ | ||
{ | ||
"key": "gt", | ||
"title": "Genotype", | ||
"type": "categorical" | ||
}, | ||
{ | ||
"key": "num_date", | ||
"title": "Date", | ||
"type": "continuous" | ||
}, | ||
{ | ||
"key": "region", | ||
"title": "Region", | ||
"type": "categorical" | ||
}, | ||
{ | ||
"key": "country", | ||
"title": "Country", | ||
"type": "categorical" | ||
}, | ||
{ | ||
"key": "host", | ||
"title": "Host", | ||
"type": "categorical" | ||
} | ||
], | ||
"geo_resolutions": [ | ||
"country", | ||
"region" | ||
], | ||
"display_defaults": { | ||
"map_triplicate": true, | ||
"color_by": "region" | ||
}, | ||
"filters": [ | ||
"clade", | ||
"region", | ||
"country", | ||
"author", | ||
"host" | ||
], | ||
"metadata_columns": [ | ||
"author" | ||
] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
clade gene site alt | ||
Angola nuc 72 A | ||
Angola nuc 81 G | ||
Angola nuc 88 C | ||
Angola nuc 90 A | ||
Angola nuc 99 T | ||
Angola nuc 111 G | ||
Angola nuc 219 T | ||
Angola nuc 240 C | ||
Angola nuc 246 A | ||
Angola nuc 252 A | ||
Angola nuc 255 A | ||
Angola nuc 291 G | ||
Angola nuc 294 A | ||
Angola nuc 300 A | ||
Angola nuc 315 G | ||
Angola nuc 327 G | ||
Angola nuc 372 A | ||
Angola nuc 420 A | ||
Angola nuc 432 A | ||
Angola nuc 453 T | ||
Angola nuc 492 G | ||
Angola nuc 651 T | ||
East Africa nuc 45 A | ||
East Africa nuc 171 G | ||
East Africa nuc 438 G | ||
East Africa nuc 468 T | ||
East/Central Africa nuc 228 G | ||
West Africa I nuc 183 G | ||
West Africa I nuc 255 C | ||
West Africa II nuc 93 T | ||
West Africa II nuc 270 A | ||
West Africa II nuc 321 T | ||
West Africa II nuc 477 A | ||
South America I nuc 219 A | ||
South America I nuc 532 A | ||
South America II nuc 114 C | ||
South America II nuc 193 T | ||
South America II nuc 249 A | ||
South America II nuc 639 G |
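
Each row of the table above names a genotype, the nucleotide site (1-based, relative to the prM-E reference) and the alternate base that helps define it. As a rough illustration of the file's shape only, the sketch below parses rows of this form and counts how many defining mutations an aligned sequence carries for each genotype. This is not how `augur clades` works (it assigns clades on the tree, and descendants inherit the assignment); `parse_clades` and `count_matches` are hypothetical helper names, the sample rows are a small subset copied from the table, and the toy sequence is invented.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from defaults/clades.tsv (assumed tab-separated on disk).
CLADES_TSV = (
    "clade\tgene\tsite\talt\n"
    "East Africa\tnuc\t45\tA\n"
    "East Africa\tnuc\t171\tG\n"
    "West Africa I\tnuc\t183\tG\n"
    "West Africa I\tnuc\t255\tC\n"
)


def parse_clades(text):
    """Return {clade: [(site, alt), ...]} with 1-based sites."""
    definitions = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        definitions[row["clade"]].append((int(row["site"]), row["alt"]))
    return dict(definitions)


def count_matches(aligned_seq, definitions):
    """Count, per clade, how many defining mutations the sequence carries."""
    counts = {}
    for clade, mutations in definitions.items():
        counts[clade] = sum(
            1
            for site, alt in mutations
            # Sites are 1-based; Python strings are 0-based.
            if site <= len(aligned_seq) and aligned_seq[site - 1].upper() == alt
        )
    return counts


# Toy aligned sequence (hypothetical): all "N" except two East Africa alleles.
seq = ["N"] * 300
seq[45 - 1] = "A"
seq[171 - 1] = "G"

definitions = parse_clades(CLADES_TSV)
matches = count_matches("".join(seq), definitions)
print(matches)
```

A count like this can help spot a sample whose tree-based assignment disagrees with the alleles it carries, which is one reason to revisit `defaults/clades.tsv`.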
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# genotypes assigned by augur clades | ||
clade_membership Angola #FCF007 | ||
clade_membership East Africa #4B26B1 | ||
clade_membership East/Central Africa #E307FC | ||
clade_membership West Africa I #2CFC07 | ||
clade_membership West Africa II #9EFC07 | ||
clade_membership South America I #996633 | ||
clade_membership South America II #FC0740 |
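
The color map only helps if its values match the clade names in `defaults/clades.tsv` exactly, so a small consistency check can catch a renamed or misspelled genotype. The sketch below is one possible check, not part of the workflow: `parse_colors` is a hypothetical helper, the rows are copied from the two files shown in this commit, and tab separation is assumed.

```python
import re

# Clade names as they appear in defaults/clades.tsv.
CLADE_NAMES = [
    "Angola",
    "East Africa",
    "East/Central Africa",
    "West Africa I",
    "West Africa II",
    "South America I",
    "South America II",
]

# Rows copied from defaults/colors.tsv (assumed tab-separated on disk).
COLORS_TSV = (
    "# genotypes assigned by augur clades\n"
    "clade_membership\tAngola\t#FCF007\n"
    "clade_membership\tEast Africa\t#4B26B1\n"
    "clade_membership\tEast/Central Africa\t#E307FC\n"
    "clade_membership\tWest Africa I\t#2CFC07\n"
    "clade_membership\tWest Africa II\t#9EFC07\n"
    "clade_membership\tSouth America I\t#996633\n"
    "clade_membership\tSouth America II\t#FC0740\n"
)


def parse_colors(text):
    """Return {(field, value): hex_color}, skipping comments and blank lines."""
    colors = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        field, value, color = line.split("\t")
        if not re.fullmatch(r"#[0-9A-Fa-f]{6}", color):
            raise ValueError(f"not a six-digit hex color: {color!r}")
        colors[(field, value)] = color
    return colors


colors = parse_colors(COLORS_TSV)
missing = [name for name in CLADE_NAMES if ("clade_membership", name) not in colors]
print(missing)
```

An empty `missing` list means every genotype defined in the clades file has a color assigned.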
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
files: | ||
auspice_config: "defaults/auspice_config.json" | ||
auspice_json: "auspice/yellow-fever-virus_prM-E.json" | ||
clades: "defaults/clades.tsv" | ||
colors: "defaults/colors.tsv" | ||
include: "defaults/include_strains.txt" | ||
reference_prM-E_fasta: "defaults/yellow-fever-virus-reference_prM-E.fasta" | ||
reference_prM-E_gff: "defaults/yellow-fever-virus-reference_prM-E.gff" | ||
strain_id_field: "accession" | ||
align_and_extract_prM-E: | ||
min_length: 650 | ||
min_seed_cover: 0.01 | ||
filter: | ||
group_by: "region year" | ||
subsample_max_sequences: 500 | ||
min_date: 1927 | ||
min_length: 650 | ||
refine: | ||
coalescent: "opt" | ||
date_inference: "marginal" | ||
clock_filter_iqd: 4 | ||
ancestral: | ||
inference: "joint" | ||
export: | ||
metadata_columns: "strain division location region year host" |
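
To make the `filter` thresholds concrete, here is a minimal sketch of what `min_date` and `min_length` would exclude. It assumes these values are passed to `augur filter` as `--min-date` and `--min-length` (the rule files are not shown in this excerpt) and that length counts only A, C, G and T. The records and the `passes_filter` helper are hypothetical.

```python
# Thresholds copied from the `filter` section of defaults/config.yaml.
FILTER_CONFIG = {"min_date": 1927, "min_length": 650}


def acgt_length(sequence):
    """Length counting only unambiguous bases (assumed to match the filter)."""
    upper = sequence.upper()
    return sum(upper.count(base) for base in "ACGT")


def passes_filter(record, config):
    """True if the record is recent enough and has enough unambiguous bases."""
    return (
        record["year"] >= config["min_date"]
        and acgt_length(record["sequence"]) >= config["min_length"]
    )


# Hypothetical records; accessions and sequences are made up for illustration.
records = [
    {"accession": "toy-1", "year": 1965, "sequence": "ACGT" * 168},
    {"accession": "toy-2", "year": 1965, "sequence": "ACGT" * 100 + "N" * 272},
    {"accession": "toy-3", "year": 1900, "sequence": "ACGT" * 168},
]

kept = [r["accession"] for r in records if passes_filter(r, FILTER_CONFIG)]
print(kept)
```

Here `toy-2` is 672 characters long but has only 400 unambiguous bases, and `toy-3` predates 1927, so only `toy-1` is kept.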
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
# FIXME provide reference | ||
AF369669 | ||
AF369670 | ||
AF369671 | ||
AY540431 | ||
AY540432 | ||
AY540433 | ||
AY540434 | ||
AY540435 | ||
U52390 | ||
AY540437 | ||
AY540438 | ||
AY540439 | ||
AY540440 | ||
AY540441 | ||
AY540442 | ||
AY540443 | ||
AY540444 | ||
AY540445 | ||
AY540446 | ||
AY540447 | ||
AY540448 | ||
AY540449 | ||
AY540450 | ||
AY540451 | ||
AY540452 | ||
AY540453 | ||
U23570 | ||
AY540454 | ||
AY540455 | ||
AY540456 | ||
AY540457 | ||
AY540458 | ||
AY540459 | ||
AY540460 | ||
AY540461 | ||
AY540462 | ||
AY540463 | ||
AY540464 | ||
AY540465 | ||
AY540466 | ||
AY540467 | ||
AY540468 | ||
AY540469 | ||
AY540470 | ||
AY540471 | ||
AY540472 | ||
AY540473 | ||
AY540436 | ||
U52392 | ||
U52395 | ||
AF369672 | ||
AF369673 | ||
AY540475 | ||
AY540476 | ||
AY540474 | ||
U52399 | ||
AY540477 | ||
AY540478 | ||
AF369674 | ||
AF369675 | ||
AY572535 | ||
AY640589 | ||
AF369686 | ||
U54798 | ||
AY603338 | ||
AF369676 | ||
U52403 | ||
AF369677 | ||
AF369678 | ||
AF369679 | ||
AF369680 | ||
AF369681 | ||
AF369682 | ||
AF369683 | ||
AF369684 | ||
AF369685 | ||
AY540479 | ||
AY540480 | ||
AY161927 | ||
AY161928 | ||
AY161929 | ||
AY161930 | ||
AY161931 | ||
U52411 | ||
AY161933 | ||
AY161934 | ||
AY161935 | ||
U52405 | ||
U52407 | ||
AY161938 | ||
AY161939 | ||
AY161940 | ||
AY161941 | ||
AY161942 | ||
AY161943 | ||
AY161944 | ||
AY161945 | ||
AY161946 | ||
AY161947 | ||
AY161948 | ||
AY161949 | ||
AY161950 | ||
AY161951 | ||
GI694115 | ||
U89338 | ||
AF369687 | ||
AF369688 | ||
U52413 | ||
AF369689 | ||
AF369690 | ||
AF369691 | ||
AF369692 | ||
AF369693 | ||
AY690831 | ||
AY690832 | ||
AY690833 | ||
DQ872411 | ||
DQ872412 | ||
AY540481 | ||
AY540482 | ||
AY540483 | ||
AY540484 | ||
AY540485 | ||
AY540486 | ||
AF369694 | ||
U52422 | ||
AF369695 | ||
AF369696 | ||
AY540487 | ||
AY540488 | ||
AY540489 | ||
AY540490 | ||
AF369697 |
13 changes: 13 additions & 0 deletions
nextclade/defaults/yellow-fever-virus-reference_prM-E.fasta
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
> prM-E region (genome 641-1312, 672 nt) | ||
CCAAGAGAGGAGCCAGATGACATTGATTGCTGGTGCTATGGGGTGGAAAACGTTAGAGTC | ||
GCATATGGTAAGTGTGACTCAGCAGGCAGGTCTAGGAGGTCAAGAAGGGCCATTGACTTG | ||
CCTACGCATGAAAACCATGGTTTGAAGACCCGGCAAGAAAAATGGATGACTGGAAGAATG | ||
GGTGAAAGGCAACTCCAAAAGATTGAGAGATGGTTCGTGAGGAACCCCTTTTTTGCAGTG | ||
ACGGCTCTGACCATTGCCTACCTTGTGGGAAGCAACATGACGCAACGAGTCGTGATTGCC | ||
CTACTGGTCTTGGCTGTTGGTCCGGCCTACTCAGCTCACTGCATTGGAATTACTGACAGG | ||
GATTTCATTGAGGGGGTGCATGGAGGAACTTGGGTTTCAGCTACCCTGGAGCAAGACAAG | ||
TGTGTCACTGTTATGGCCCCTGACAAGCCTTCATTGGACATCTCACTAGAGACAGTAGCC | ||
ATTGATAGACCTGCTGAGGTGAGGAAAGTGTGTTACAATGCAGTTCTCACTCATGTGAAG | ||
ATTAATGACAAGTGCCCCAGCACTGGAGAGGCCCACCTAGCTGAAGAGAACGAAGGGGAC | ||
AATGCGTGCAAGCGCACTTATTCTGATAGAGGCTGGGGCAATGGCTGTGGCCTATTTGGG | ||
AAAGGGAGCATT |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
##sequence-region prM-E 1 672 | ||
NC_002031.1 feature source 1 672 . + . gene=nuc | ||
NC_002031.1 feature gene 1 334 . + . gene_name=prM | ||
NC_002031.1 feature gene 110 334 . + . gene_name=M | ||
NC_002031.1 feature gene 335 672 . + . gene_name=E |
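
GFF coordinates are 1-based and inclusive, which is easy to get wrong when slicing sequences in code. The sketch below converts the gene rows above into Python slices and reports each feature's length; it also checks that the genome span in the FASTA header (641-1312) agrees with the 672 nt region. `feature_length` and `to_slice` are hypothetical helpers, and a placeholder string stands in for the reference sequence.

```python
# (name, start, end) copied from the GFF gene rows; 1-based, inclusive.
FEATURES = [("prM", 1, 334), ("M", 110, 334), ("E", 335, 672)]

REGION_LENGTH = 672                   # from the ##sequence-region line
GENOME_START, GENOME_END = 641, 1312  # from the reference FASTA header


def feature_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def to_slice(start, end):
    """Convert a 1-based, inclusive interval into a 0-based Python slice."""
    return slice(start - 1, end)


# The genome span named in the FASTA header should cover the whole region.
span = feature_length(GENOME_START, GENOME_END)

# Placeholder standing in for the 672 nt reference sequence.
placeholder = "N" * REGION_LENGTH
lengths = {name: len(placeholder[to_slice(s, e)]) for name, s, e in FEATURES}
print(span, lengths)
```

Note that `M` lies entirely within `prM`, so the two features overlap by design, and `E` is cut off at the end of the region rather than at the end of the gene.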