Better GTF support

@pmelsted pmelsted released this 28 Jun 11:03
· 8 commits to master since this release

GTF

This version includes bug fixes that improve GTF parsing. pizzly now supports Ensembl and Gencode annotations and has been tested with the latest versions of both.

Note that for Gencode the FASTA files must be modified so that they match the GTF files: Gencode separates the fields in its FASTA sequence names with pipes (|) rather than spaces. This can be fixed by running:

zcat gencode.v26.transcripts.fa.gz  | tr '|' ' ' | gzip -1 >  gencode.v26.transcripts.fixed.fa.gz
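The command above replaces every pipe in the stream. As a minimal sketch of a more conservative variant, the helper below rewrites pipes on header lines only (those starting with `>`). The name `fix_gencode_headers` is hypothetical and not part of pizzly.

```shell
# Hypothetical helper (not part of pizzly): read FASTA on stdin and
# replace pipes with spaces on header lines only; sequence lines pass
# through unchanged.
fix_gencode_headers() {
    sed '/^>/ s/[|]/ /g'
}
```

It can take the place of `tr` in the pipeline, for example `zcat gencode.v26.transcripts.fa.gz | fix_gencode_headers | gzip -1 > gencode.v26.transcripts.fixed.fa.gz`.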

Protein coding annotation

pizzly limits fusion reports to transcripts that are annotated as protein coding. If the annotation does not carry this information, use the --ignore-protein option to lift the requirement. Running pizzly this way will most likely increase the number of false positives reported.
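Before deciding whether --ignore-protein is needed, it can help to check whether the GTF carries a protein coding tag at all. The sketch below assumes the tag appears as a `transcript_type` or `transcript_biotype` attribute on `transcript` records. This is an assumption about the annotation layout, not a description of how pizzly reads it, and `count_protein_coding` is a hypothetical helper.

```shell
# Hypothetical helper (not part of pizzly): count transcript records in an
# uncompressed GTF that are tagged as protein coding.
# Assumption: the tag is stored as transcript_type or transcript_biotype.
count_protein_coding() {
    awk -F'\t' '
        $3 == "transcript" && /transcript_(bio)?type "protein_coding"/ { n++ }
        END { print n + 0 }   # print 0 rather than an empty line when nothing matched
    ' "$1"
}
```

A count of zero suggests the annotation has no usable protein coding information under these assumptions.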

Warnings

pizzly now warns when the FASTA file contains sequences with no corresponding annotation, and exits if none of the sequences have annotation. It also warns if no transcripts are annotated as protein coding.
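To see ahead of time which sequences would trigger the first warning, a rough check is to compare the first word of each FASTA header against the `transcript_id` values in the GTF. The sketch below does an exact string comparison on uncompressed files. pizzly's own matching may differ (for instance, if the GTF keeps version numbers in a separate attribute, they would need to be stripped from the FASTA names first), and `unannotated_ids` is a hypothetical helper.

```shell
# Hypothetical helper (not part of pizzly): list FASTA sequence names that
# have no transcript_id entry in the GTF.
# Usage: unannotated_ids transcripts.fa annotation.gtf
unannotated_ids() {
    awk '
        FNR == NR {
            # First file (GTF): remember every transcript_id value.
            if (match($0, /transcript_id "[^"]+"/)) {
                # Skip the 15 characters of: transcript_id "
                # and drop the closing quote.
                seen[substr($0, RSTART + 15, RLENGTH - 16)] = 1
            }
            next
        }
        /^>/ {
            # Second file (FASTA): the name is the first word after ">".
            id = substr($1, 2)
            if (!(id in seen)) print id
        }
    ' "$2" "$1"
}
```

An empty result means every sequence name was found in the GTF under this exact-match assumption.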