Processing of Data Slow #893

Open
jhnath21 opened this issue Nov 27, 2024 · 1 comment

Comments

@jhnath21

As datasets have grown with new chemistry and instrumentation, and the reference databases have grown as well, processing has become very slow. We have tried various thread counts (96, 128, 192), but this has not helped. Larger datasets also require more memory on the processing machine.

Is there a way to speed up the analysis that we have missed, and a way to avoid needing large amounts of RAM for these larger datasets? For example, a file containing >20M reads now takes 4+ days to process, where in the past it would take only ~6 hrs (~5M reads/hr with just 16 threads). Currently we can't use a server with less than 512 GB of RAM.
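To put a number on the regression, here is a rough sketch using only the file-level figures quoted above (">20M reads", "4+ days", "~6 hrs"). The values are lower bounds taken as exact for illustration, so the real slowdown may be larger; the variable and function names are made up for this example.

```python
# Figures quoted in the post, treated as exact for a rough estimate.
reads = 20_000_000       # ">20M reads"
hours_now = 4 * 24       # "4+ days"
hours_before = 6         # "~6 hrs"


def reads_per_hour(n_reads, hours):
    """Average throughput over the whole run."""
    return n_reads / hours


slowdown = hours_now / hours_before

print(f"now:      {reads_per_hour(reads, hours_now):,.0f} reads/hr")
print(f"before:   {reads_per_hour(reads, hours_before):,.0f} reads/hr")
print(f"slowdown: at least {slowdown:.0f}x")
```

The per-hour rate quoted in the post (~5M reads/hr) comes from the poster's own measurements and is not used here.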


You can use KrakenUniq with the new low-memory option, and then you can run on a server with any amount of memory, even just 16 GB. There's a time penalty, but it's not bad. Our short paper describes it: https://pubmed.ncbi.nlm.nih.gov/37602140/
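As a rough illustration of what such a run might look like, the sketch below assembles a KrakenUniq command line in Python. It assumes the low-memory option is the `--preload-size` flag as I recall it from the KrakenUniq README; please confirm against `krakenuniq --help` for your installed version. The helper name and all paths are hypothetical placeholders, and the database is assumed to be one built for KrakenUniq.

```python
import shlex


def build_krakenuniq_command(db_dir, reads_path, report_path, output_path,
                             preload_size="16G", threads=16):
    """Assemble a KrakenUniq command line as an argument list.

    Assumption: --preload-size caps how much of the database is held in
    memory at once, so the database is processed in chunks of that size.
    """
    return [
        "krakenuniq",
        "--db", db_dir,
        "--preload-size", preload_size,
        "--threads", str(threads),
        "--report-file", report_path,
        "--output", output_path,
        reads_path,
    ]


# Hypothetical paths, for illustration only.
cmd = build_krakenuniq_command(
    db_dir="/data/krakenuniq_db",
    reads_path="sample.fastq",
    report_path="sample.report.tsv",
    output_path="sample.classified.tsv",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where KrakenUniq is installed. The sketch only builds the command and does not run it.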
