fixup replace: parameterize the nextclade key fields
j23414 committed Oct 10, 2023
1 parent 89c89c7 commit 1a68589
Showing 2 changed files with 5 additions and 2 deletions.
2 changes: 2 additions & 0 deletions ingest/config/config.yaml
@@ -44,6 +44,8 @@ transform:
annotations: 'source-data/annotations.tsv'
# ID field used to merge annotations
annotations_id: 'accession'
+ # Field to use as the sequence ID in the Nextclade file
+ nextclade_id_field: 'seqName'
# Field to use as the sequence ID in the FASTA file
id_field: 'accession'
# Field to use as the sequence in the FASTA file
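
Note: the rule in this commit reads the new key with a plain `config["transform"]["nextclade_id_field"]` lookup, so a config file that lacks the key will raise a `KeyError`. The commit itself adds no fallback. A minimal sketch of one, assuming a hypothetical helper named `transform_param` and a plain dict standing in for Snakemake's `config`, might look like this:

```python
def transform_param(config, name, default=None):
    """Return config["transform"][name], or `default` when the key is unset.

    `transform_param` is a hypothetical helper, not part of this repository.
    """
    return config.get("transform", {}).get(name, default)


# An older config that predates this commit and has no nextclade_id_field.
old_config = {"transform": {"id_field": "accession"}}

# Falls back to the value that was hard-coded before this commit.
nextclade_id_field = transform_param(old_config, "nextclade_id_field", default="seqName")
```

Whether a silent default is preferable to failing loudly on a missing key is a design choice; the commit as written takes the strict route.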
5 changes: 3 additions & 2 deletions ingest/workflow/snakemake_rules/nextclade.smk
@@ -61,6 +61,7 @@ rule join_metadata_clades:
metadata="data/metadata.tsv",
params:
id_field=config["transform"]["id_field"],
+ nextclade_id_field=config["transform"]["nextclade_id_field"],
shell:
"""
export SUBSET_FIELDS=`awk 'NR>1 {{print $1}}' {input.nextclade_field_map} | tr '\n' ',' | sed 's/,$//g'`
@@ -75,11 +76,11 @@ rule join_metadata_clades:
-k {input.nextclade_field_map} \
| tsv-join -H \
--filter-file - \
- --key-fields seqName \
+ --key-fields {params.nextclade_id_field} \
--data-fields {params.id_field} \
--append-fields '*' \
--write-all ? \
{input.metadata} \
- | tsv-select -H --exclude seqName \
+ | tsv-select -H --exclude {params.nextclade_id_field} \
> {output.metadata}
"""
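
Note: for readers unfamiliar with `tsv-join`, the net effect of the pipeline above can be approximated in plain Python. This is a rough sketch under stated assumptions, not the workflow's implementation: the function name `join_metadata_clades` is borrowed from the rule purely for illustration, rows are represented as lists of dicts, Nextclade IDs are assumed to be unique, and no column names are assumed to collide between the two tables. Every metadata row is kept, matched rows gain the Nextclade columns, unmatched rows get `?` in those columns, and the Nextclade ID column does not appear in the output.

```python
def join_metadata_clades(metadata, nextclade,
                         id_field="accession",
                         nextclade_id_field="seqName",
                         fill="?"):
    """Left-join Nextclade rows onto metadata rows.

    Illustrative only: approximates the combined effect of the
    `tsv-join ... --write-all ?` and `tsv-select --exclude` steps.
    """
    # Index Nextclade rows by their ID column (assumes unique IDs).
    by_id = {row[nextclade_id_field]: row for row in nextclade}

    # Columns to append: everything except the Nextclade ID column,
    # which the pipeline drops again after the join.
    appended = [name for name in (nextclade[0] if nextclade else {})
                if name != nextclade_id_field]

    joined = []
    for row in metadata:
        match = by_id.get(row[id_field])
        out = dict(row)
        for name in appended:
            # Unmatched metadata rows are kept and padded with `fill`.
            out[name] = match[name] if match is not None else fill
        joined.append(out)
    return joined


metadata = [
    {"accession": "A1", "country": "X"},
    {"accession": "B2", "country": "Y"},
]
nextclade = [{"seqName": "A1", "clade": "DENV1"}]

rows = join_metadata_clades(metadata, nextclade)
```

With the sample data, the row for `A1` gains `clade` set to `DENV1`, while `B2` has no Nextclade record and gets `?`. Passing a different `nextclade_id_field` mirrors what the new config key makes possible in the rule.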
