Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VCF coordinates appear to be 0-based instead of 1-based #13

Open
djhshih opened this issue Aug 17, 2016 · 0 comments
Open

VCF coordinates appear to be 0-based instead of 1-based #13

djhshih opened this issue Aug 17, 2016 · 0 comments

Comments

@djhshih
Copy link

djhshih commented Aug 17, 2016

The VCF output file appears to contain 0-based coordinates.
As per the specification, VCF files should contain 1-based coordinates (https://samtools.github.io/hts-specs/VCFv4.1.pdf).

(FYI: VCF also specify that "telomeres are indicated by using positions 0 or N+1, where N is the length of the corresponding chromosome or contig.")

In the below example, the event in the VCF file should have position 16376517 instead of 16376516.

breakseq.vcf.gz:

#CHROM  POS ID  REF ALT QUAL  FILTER  INFO  FORMAT  PB0418-N
1 16376516  . C <DEL> . LowQual SVLEN=-9741;SVTYPE=DEL;END=16386257 GT:ABC:PE:REFCOUNTS 1/1:0,0,1:1:0,0

breakseq.out:

1 1KG_Phase1  Deletion  16376517  16386257  1 . C NAME C1WLGACXX130310:1:1301:1852:84776$1;MAPQ 37;CIGAR 76M;POS 61;END 137;LIBLEN 200;LSPAN 39;RSPAN 37;PE Y;SVLEN 9741

breakseq_genotyped.gff:

1 BreakSeq  Deletion  16376517  16386257  1 . . QUAL LowQual;ABC 0,0,1;PE 1;SVLEN 9741;GT 1/1;COUNTS 0,0

The breakpoint example above is annotated in the library as:

$ grep 163765 breakseq2_bplib_20150129.gff
1       1KG_Phase1      Deletion        16376517        16386257        .       .       .

I am assuming that the breakpoint GFF library contains 1-based coordinate, as per the specification (http://useast.ensembl.org/info/website/upload/gff.html).

Invocation:

run_breakseq2.py \
        --reference /seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta \
        --work work \
        --bwa bwa-0.7.12/bwa \
        --samtools samtools-0.1.19/samtools \
        --bplib_gff bplib/breakseq2_bplib_20150129.gff \
        --nthreads 4 \
        --sample PB0418-N \
        --bams /seq/picard_aggregation/C932/PB0418-N/v7/PB0418-N.bam \

The run log reports success on the last line:

INFO 2016-08-17 10:31:56,875 breakseq2_workflow   Done! (342.241 s)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant