Questions and suggestions about the program. #2

SergeyBaikal · 2023-05-31T23:46:41Z

I tested the program on my own dataset, and it assigned a genus even to bacterial contigs (RNA model). It would be great to have an entropy cutoff setting to skip false positives, for example less than 0.5.
python3 predict.py --model_path /home/sergey/VirusTaxo/Dataset/vt_db_rna_virus_kmer_17.pkl --seq /home/VirusTaxo/My_Data/contigs.fasta > /home/VirusTaxo/My_Data/Results.txt
Why not include the complete taxonomic lineage in the output file?
Out of 15,000 sequences, I got the correct assignment for only one contig, which had an entropy of -3.15E-12. Where the entropy was 0, the taxonomic assignment was not correct.

Dear authors, could you clarify please what I'm doing wrong?


Rashedul · 2024-11-21T01:41:31Z

Dear Sergey,

Thank you for testing the tool and sharing your valuable feedback! I’d like to address your observations and questions:

The tool employs a k-mer matching strategy, meaning that any random overlap of k-mers between the query sequence and the database can lead to a genus assignment, even when the query does not belong to the expected group (e.g., RNA viruses). To mitigate this, we’ve introduced a new metric called the "Enrichment Score," which helps reduce the likelihood of random k-mer matches affecting the predictions.
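
To illustrate how a chance k-mer overlap can produce a confident-looking genus call, here is a minimal sketch. It is not VirusTaxo's actual code: the function names, the toy database, the tiny k of 4, and the normalized Shannon entropy are all assumptions made for illustration.

```python
from collections import Counter
from math import log

def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def genus_votes(query, kmer_to_genus, k):
    """Count, per genus, how many query k-mers are found in the database."""
    votes = Counter()
    for km in kmers(query, k):
        genus = kmer_to_genus.get(km)
        if genus is not None:
            votes[genus] += 1
    return votes

def normalized_entropy(votes, n_genera):
    """Shannon entropy of the vote distribution, scaled to [0, 1]."""
    total = sum(votes.values())
    if total == 0 or n_genera < 2:
        return 0.0
    h = sum(-(c / total) * log(c / total) for c in votes.values())
    return h / log(n_genera)

# Hypothetical toy database (k = 4) with made-up genus labels.
toy_db = {"ACGT": "GenusA", "CGTA": "GenusA", "TTTT": "GenusB"}

# An unrelated query that shares a single k-mer ("ACGT") by chance.
votes = genus_votes("GGACGTGG", toy_db, k=4)
entropy = normalized_entropy(votes, n_genera=2)
```

In this toy case a single chance hit puts every vote on one genus, so the entropy is zero even though the evidence is one k-mer out of five. A low entropy alone therefore does not rule out a spurious match, which is the kind of case a second metric based on the number of matches is meant to catch.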

Additionally, this model is specifically designed for predicting viral sequences. Applying it to non-viral sequences may result in incorrect taxonomic assignments. To provide further clarity, we’ve included a new section in the README titled "Method Limitations and Interpretation" to elaborate on these points.

In future updates, we will add the full taxonomic lineage (e.g., family, order, genus, species) to the output file and will provide arguments to set cutoffs for both Entropy and Enrichment_Score.

Please let us know if you have further questions!

Rashedul · 2024-12-03T06:37:11Z

The full taxonomic lineage (e.g., family, order, genus, species) has been added to the output file, along with arguments to set cutoffs for both Entropy and Enrichment.
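
For results produced without those cutoffs, the same kind of filtering can be applied afterwards. The sketch below is only an illustration under stated assumptions: it assumes a tab-separated output with columns named `Entropy` and `Enrichment_Score`, assumes that lower entropy and higher enrichment indicate a more reliable call, and uses a hypothetical function name and default thresholds. Check the actual column names and the tool's own arguments in the README before relying on it.

```python
import csv
import io

def filter_predictions(tsv_text, max_entropy=0.5, min_enrichment=0.0):
    """Keep rows with Entropy <= max_entropy and Enrichment_Score >= min_enrichment.

    Column names are assumptions; rows with missing or non-numeric
    values in either column are dropped.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            entropy = float(row["Entropy"])
            enrichment = float(row["Enrichment_Score"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that cannot be scored
        if entropy <= max_entropy and enrichment >= min_enrichment:
            kept.append(row)
    return kept

# Usage with a results file (path taken from the command earlier in the thread):
# with open("/home/VirusTaxo/My_Data/Results.txt") as handle:
#     confident = filter_predictions(handle.read(), max_entropy=0.5, min_enrichment=0.1)
```

Requiring both conditions means a zero-entropy call backed by very few k-mer matches is still discarded, provided the enrichment threshold is set above the level expected from random overlap.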
