When there is a lot of contigs, makeFurDb is very slow #14

Open
wangzhichao1990 opened this issue Sep 14, 2023 · 1 comment


wangzhichao1990 commented Sep 14, 2023

Hi,

When there are a lot of contigs, makeFurDb is very slow.
The following figure shows the statistical results of the neighbor genomes.
[image]
Is there a way to increase speed? I am using the latest docker version.
Thanks.

Contributor

haubold commented Sep 18, 2023

To a first approximation, each neighbor sequence is turned into a suffix array. Since computing a suffix array carries a performance overhead, analyzing very many neighbor sequences slows makeFurDb down. One way to speed things up is to concatenate the sequences into fewer, longer chunks. In the limit of concatenating all neighbors into a single sequence, memory consumption is maximal and might exceed the available RAM. Hope this helps.
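As a rough illustration of the concatenation idea, the following Python sketch greedily merges consecutive contigs into chunks of at most `max_len` bases. It is not part of Fur. The function names, the `chunk_N` headers, and the optional `spacer` between contigs are all assumptions made for this example, and whether makeFurDb needs a spacer at the artificial junctions is not stated in this thread.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header, parts = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def concat_contigs(
    records: Iterable[Tuple[str, str]], max_len: int, spacer: str = ""
) -> Iterator[Tuple[str, str]]:
    """Greedily merge consecutive contigs into chunks of at most max_len bases.

    A single contig longer than max_len is emitted as its own chunk.
    The chunk names and the spacer are hypothetical choices for this sketch.
    """
    chunk, size, n = [], 0, 0
    for _, seq in records:
        added = len(seq) + (len(spacer) if chunk else 0)
        if chunk and size + added > max_len:
            # Current chunk is full: emit it and start a new one.
            n += 1
            yield f"chunk_{n}", spacer.join(chunk)
            chunk, size = [], 0
            added = len(seq)
        chunk.append(seq)
        size += added
    if chunk:
        n += 1
        yield f"chunk_{n}", spacer.join(chunk)


if __name__ == "__main__":
    # Tiny in-memory example; real input would be a neighbor FASTA file.
    demo = io.StringIO(">c1\nACGT\n>c2\nGG\n>c3\nTTTTT\n")
    for name, seq in concat_contigs(read_fasta(demo), max_len=6):
        print(f">{name}\n{seq}")
```

A larger `max_len` means fewer sequences and therefore fewer suffix arrays, at the cost of more memory per sequence, so the value would need to be tuned to the available RAM.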
