I tested my own dataset, and the program assigned a genus even to bacterial contigs (RNA model). It would be great to have an entropy cutoff for filtering out false positives, for example, less than 0.5. The command I ran:
python3 predict.py --model_path /home/sergey/VirusTaxo/Dataset/vt_db_rna_virus_kmer_17.pkl --seq /home/VirusTaxo/My_Data/contigs.fasta > /home/VirusTaxo/My_Data/Results.txt
Why not include the complete taxonomic lineage in the output file?
I got a correct assignment for only one contig (out of 15,000 sequences), which had an entropy of -3.15E-12; where the entropy was 0, the taxonomic assignment was not correct.
Dear authors, could you please clarify what I'm doing wrong?
Thank you for testing the tool and sharing your valuable feedback! I’d like to address your observations and questions:
The tool employs a k-mer matching strategy, meaning that any random overlap of k-mers between the query sequence and the database could lead to a genus assignment, even if the taxonomy (e.g., RNA viruses) is not as expected. To mitigate this, we’ve introduced a new metric called the "Enrichment Score," which helps reduce the likelihood of random k-mer matches affecting the predictions.
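To illustrate why a chance k-mer overlap can produce a genus call, here is a minimal toy sketch of k-mer matching. It is not VirusTaxo's actual implementation: the function names, the tiny reference sequences, the value k=4, and the use of normalized Shannon entropy over per-genus hit counts are all assumptions made for illustration.

```python
from collections import Counter
import math


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of genera whose reference contains it."""
    index = {}
    for genus, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(genus)
    return index


def classify(query, index, k):
    """Count k-mer hits per genus and return (best genus, hits, entropy).

    Entropy here is Shannon entropy of the hit distribution, divided by
    log2(number of genera hit) when more than one genus is hit. This
    definition is an assumption, not taken from the tool's source.
    """
    hits = Counter()
    for kmer in kmers(query, k):
        for genus in index.get(kmer, ()):
            hits[genus] += 1
    if not hits:
        return None, hits, None  # no shared k-mer, so no assignment
    total = sum(hits.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in hits.values())
    if len(hits) > 1:
        entropy /= math.log2(len(hits))
    best = hits.most_common(1)[0][0]
    return best, hits, entropy


# Hypothetical toy references; real databases use much longer k (e.g. 17).
references = {"GenusA": "ACGTACGTTAGC", "GenusB": "TTGACCGGATAC"}
index = build_index(references, k=4)

# An unrelated query that happens to share the single k-mer "ACGT" with GenusA.
best, hits, entropy = classify("GGGGACGTGGGG", index, k=4)
print(best, dict(hits), entropy)
```

In this toy case a single shared k-mer is enough to assign GenusA, and because only one genus is hit the entropy is zero. Under these assumptions, low entropy alone does not rule out a spurious match, which is the kind of case a separate score based on how many k-mers matched is meant to catch.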
Additionally, this model is specifically designed for predicting viral sequences. Applying it to non-viral sequences may result in incorrect taxonomic assignments. To provide further clarity, we’ve included a new section in the README titled "Method Limitations and Interpretation" to elaborate on these points.
In future updates, we will add the full taxonomic lineage (e.g., family, order, genus, species) to the output file and provide arguments for choosing cutoffs for both Entropy and Enrichment_Score.
Update: the full taxonomic lineage (e.g., family, order, genus, species) in the output file and the arguments for choosing cutoffs for both Entropy and Enrichment_Score have been added.
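The exact argument names are not given in this thread, so check the README for them. For results that were already generated, a post-hoc filter might look like the sketch below. It assumes the output is tab-separated with a header row containing columns named `Entropy` and `Enrichment_Score`, that lower entropy and higher enrichment indicate more reliable calls, and that the cutoff values shown are arbitrary. The function name, the `Id` and `Genus` columns, and the sample rows are made up for illustration.

```python
import csv
import io


def filter_predictions(lines, max_entropy, min_enrichment,
                       entropy_col="Entropy",
                       enrichment_col="Enrichment_Score"):
    """Keep rows with entropy <= max_entropy and enrichment >= min_enrichment.

    `lines` is any iterable of text lines (an open file or io.StringIO).
    Column names are assumptions; adjust them to the real output header.
    Rows whose scores are missing or non-numeric are skipped.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    kept = []
    for row in reader:
        try:
            entropy = float(row[entropy_col])
            enrichment = float(row[enrichment_col])
        except (KeyError, TypeError, ValueError):
            continue  # unclassified or malformed row
        if entropy <= max_entropy and enrichment >= min_enrichment:
            kept.append(row)
    return kept


# Made-up example rows, not real VirusTaxo output.
sample = io.StringIO(
    "Id\tGenus\tEntropy\tEnrichment_Score\n"
    "contig_1\tGenusA\t0.0\t0.02\n"
    "contig_2\tGenusB\t0.10\t0.90\n"
    "contig_3\tGenusC\t0.80\t0.95\n"
    "contig_4\tUnclassified\tNA\tNA\n"
)

kept = filter_predictions(sample, max_entropy=0.5, min_enrichment=0.5)
kept_ids = [row["Id"] for row in kept]
print(kept_ids)
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` sample, for example `with open("Results.txt", newline="") as fh: kept = filter_predictions(fh, 0.5, 0.5)`, after confirming the column names and delimiter match the actual output.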