Michael G. Campana & Jacob A. West-Roberts, 2017-2024
Smithsonian's National Zoo and Conservation Biology Institute
Contact: [email protected]
Added --inferref option to infer reference when base is missing
Fixed bug in reference inclusion when missing sites are not skipped
Added option to output reference sequence as part of the alignment
Fixed glitch where the region ID was added to the sequence length value
Fixed glitch outputting a zero-length partition at beginning of table
Added option to output contig partition table for concatenated alignments
Added handling for diploid missing-data calls in haploid VCFs
Corrected "encdoed" typo
Fixed zlib requirement bug
Now compatible with GATK deletions coded by *
Improvement to array summation
Fixed bug in splitting regions by length that caused misnumbering and overwriting of previous regions
Fixed bug in which haploid missing data were not properly counted by the minimum calls filter
Added method fix_name to resolve reserved characters in locus names
Read/write gzipped files
Probabilistic pseudohaplotype calling
Minimum samples called as a percentage option
Minimum alignment length to retain parameter
Sample-specific filters no longer filter whole line
Completely filtered sites are excluded by the skip option
Renamed filters to better match standard tags
--annotfilter option controls FILTER value filtration
--split_regions is now functional
vcf2aln can read streamed uncompressed VCF
Removed extraneous debugging output from get_GT_tags
get_GT_tags gets tag information from the VCF itself rather than assuming VCF 4.2 standard tags
Can read GATK HaplotypeCaller PGT phasing information
GT/PGT information does not need to be in the first slot of output
Speed increase using write-cycle controls
Indexing bug fix
Haploid VCF bug fix
Onehap bug fix
Onehap concatenation bug fix
Now gets all type fields in VCF
Onehap flag and bug fixes
GLE & ambiguity code handling
Cleaned up help screen
New filters
Ability to identify VCF tags
Separation of methods
Preliminary script to generate FASTA alignment from multi-sample VCF