flusort.py duplicate sequence_ID handling #8
Labels: bug, enhancement, good first issue, help wanted
Context:
IV24Run11 and IV24Run17 contained unique segment sequences with duplicate FASTA headers. As flusort.py moved through the file, the blastp assignment output for each duplicated header was overwritten by the next sequence carrying the same header. This resulted in incorrect type and subtype assignments for isolates.
Currently, flusort.py cannot handle duplicate header strings in the input FASTA. An enhancement would add warnings that are written to a log file and printed to the console. It should potentially also halt the program unless a --allow-duplicates flag is provided.
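A minimal sketch of the proposed check follows. It is not flusort.py's actual code. The function names, the logger name, and the `allow_duplicates` parameter (standing in for the proposed --allow-duplicates flag) are hypothetical. It also assumes the sequence ID is the first whitespace-delimited token after `>`; if flusort.py keys on the full header string, the comparison would need to change accordingly.

```python
import logging
from collections import Counter

# Hypothetical logger name; flusort.py may organise logging differently.
logger = logging.getLogger("flusort.duplicates")


def configure_logging(log_path):
    """Send warnings to both a log file and the console."""
    logger.setLevel(logging.WARNING)
    logger.addHandler(logging.FileHandler(log_path))
    logger.addHandler(logging.StreamHandler())


def find_duplicate_headers(fasta_lines):
    """Return {sequence_id: count} for IDs that occur more than once.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    counts = Counter(
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    )
    return {seq_id: n for seq_id, n in counts.items() if n > 1}


def check_duplicates(fasta_lines, allow_duplicates=False):
    """Warn about duplicate IDs and halt unless they are explicitly allowed."""
    duplicates = find_duplicate_headers(fasta_lines)
    for seq_id, n in duplicates.items():
        logger.warning("Duplicate FASTA header %r appears %d times", seq_id, n)
    if duplicates and not allow_duplicates:
        raise SystemExit(
            "Duplicate FASTA headers found; rerun with --allow-duplicates to continue."
        )
    return duplicates
```

With argparse, the flag could be declared as `parser.add_argument("--allow-duplicates", action="store_true")` and passed through as `allow_duplicates=args.allow_duplicates`. Note that allowing duplicates only suppresses the halt. The overwriting described above would still occur unless results are keyed on something unique, such as the header plus its position in the file.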