Add support for STR prioritisation from ExpansionHunter calls #563

Open
julesjacobsen opened this issue Jul 4, 2024 · 0 comments
julesjacobsen commented Jul 4, 2024

ExpansionHunter is used at Genomics England to detect STR expansions from short-read sequencing. This is the example output from its documentation: https://github.com/Illumina/ExpansionHunter/blob/master/docs/06_OutputVcfFiles.md#example

The following VCF entry describes the state of the C9orf72 repeat in a sample with name/barcode LP6005616-DNA_A03.

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  LP6005616-DNA_A03
chr9    27573526        .       C       <STR2>,<STR349> .       PASS    SVTYPE=STR;END=27573544;REF=3;RL=18;RU=GGCCCC;REPID=ALS GT:SO:CN:CI:AD_SP:AD_FL:AD_IR   1/2:SPANNING/INREPEAT:2/349:2-2/323-376:19/0:3/6:0/459

This line tells us that the first allele spans 2 repeat units while the second allele spans 349 repeat units. The repeat unit is GGCCCC (RU INFO field), so the sequence of the first allele is GGCCCCGGCCCC and the sequence of the second allele is GGCCCC x 349. The repeat spans three repeat units in the reference (REF INFO field). The length of the short allele was estimated from spanning reads (SPANNING) while the length of the expanded allele was estimated from in-repeat reads (INREPEAT). The confidence interval for the size of the expanded allele is (323,376). There are 19 spanning and 3 flanking reads consistent with the repeat allele of size 2 (that is, 19 reads fully contain the repeat of size 2 and 3 flanking reads overlap at most 2 repeat units). There are also 6 flanking and 459 in-repeat reads consistent with the repeat allele of size 349.
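A minimal sketch of how a record like the one above could be read into a structure suitable for prioritisation. It assumes a single-sample line with the FORMAT keys shown in the example and no missing (`.`) values. The names `StrAllele`, `StrCall` and `parse_expansion_hunter_record` are hypothetical and not part of Exomiser or ExpansionHunter.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StrAllele:
    repeat_count: int        # CN: estimated number of repeat units
    ci: Tuple[int, int]      # CI: confidence interval for repeat_count
    source: str              # SO: read type used for the estimate
    spanning_reads: int      # AD_SP
    flanking_reads: int      # AD_FL
    inrepeat_reads: int      # AD_IR


@dataclass
class StrCall:
    chrom: str
    start: int
    end: int
    repeat_unit: str         # RU
    ref_repeat_count: int    # REF INFO field (not the REF column)
    rep_id: str              # REPID
    alleles: List[StrAllele]


def parse_expansion_hunter_record(line: str) -> StrCall:
    """Parse one single-sample ExpansionHunter VCF data line (hypothetical helper)."""
    fields = line.split()  # accepts tabs or runs of spaces
    chrom, pos, info_str, fmt, sample = fields[0], fields[1], fields[7], fields[8], fields[9]
    # partition() tolerates flag-style INFO entries that have no '='
    info = dict(item.partition("=")[::2] for item in info_str.split(";"))
    values = dict(zip(fmt.split(":"), sample.split(":")))

    def per_allele(key: str) -> List[str]:
        return values[key].split("/")

    alleles = []
    for cn, ci, so, sp, fl, ir in zip(
        per_allele("CN"), per_allele("CI"), per_allele("SO"),
        per_allele("AD_SP"), per_allele("AD_FL"), per_allele("AD_IR"),
    ):
        low, high = ci.split("-")
        alleles.append(StrAllele(int(cn), (int(low), int(high)), so, int(sp), int(fl), int(ir)))

    return StrCall(chrom, int(pos), int(info["END"]), info["RU"],
                   int(info["REF"]), info["REPID"], alleles)


record = (
    "chr9\t27573526\t.\tC\t<STR2>,<STR349>\t.\tPASS\t"
    "SVTYPE=STR;END=27573544;REF=3;RL=18;RU=GGCCCC;REPID=ALS\t"
    "GT:SO:CN:CI:AD_SP:AD_FL:AD_IR\t"
    "1/2:SPANNING/INREPEAT:2/349:2-2/323-376:19/0:3/6:0/459"
)
call = parse_expansion_hunter_record(record)
```

Splitting the per-allele values on `/` rather than assuming two alleles means a haploid call with a single value per field would be read the same way.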

PanelApp has information on the pathogenicity of STRs, e.g.
https://panelapp.genomicsengland.co.uk/panels/entities/C9orf72_GGGGCC
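One detail to handle when linking a call to a PanelApp entity: the PanelApp entity above is named with the motif GGGGCC, while ExpansionHunter reports RU=GGCCCC, which is its reverse complement. A rough sketch of how the two could be matched and a call compared against repeat-count thresholds follows. The functions `canonical_motif` and `classify_repeat` and the parameters `normal_max` and `pathogenic_min` are hypothetical; the actual threshold values would need to come from the PanelApp entity and are deliberately not filled in here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif: str) -> str:
    """Return a strand- and rotation-independent form of a repeat motif.

    The canonical form is the lexicographically smallest string among all
    rotations of the motif and of its reverse complement.
    """
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(rotations)


def classify_repeat(ci_low: int, ci_high: int,
                    normal_max: int, pathogenic_min: int) -> str:
    """Compare a repeat-count confidence interval with two thresholds.

    normal_max and pathogenic_min are assumed inputs (e.g. taken from a
    curated source); this sketch does not define them.
    """
    if ci_low >= pathogenic_min:
        return "PATHOGENIC_RANGE"   # whole interval at or above the pathogenic threshold
    if ci_high <= normal_max:
        return "NORMAL_RANGE"       # whole interval at or below the normal threshold
    return "UNCERTAIN"              # interval falls between or straddles the thresholds
```

With this, `canonical_motif("GGCCCC")` and `canonical_motif("GGGGCC")` both return `"CCCCGG"`, so the ExpansionHunter call and the PanelApp entity can be matched on gene plus canonical motif. Using the confidence interval rather than the point estimate keeps imprecisely sized alleles from being over-called.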
