ENH: use functools.lru_cache to speed up #42

Open
Zeroto521 opened this issue Nov 22, 2022 · 1 comment

Zeroto521 commented Nov 22, 2022

The Levenshtein distance algorithm can't be vectorized,
so the calculation is very slow on large data.

One idea to speed it up is to cache the results.
Here is an accumulate example to show the effect of the cache.

def accumulate(x):
    return sum(range(x))

Without lru_cache, accumulate(100000000) takes about 5s on my local machine, whether it is the first run or the second.

from functools import lru_cache

@lru_cache
def accumulate(x):
    return sum(range(x))

After adding lru_cache, the first call to accumulate(100000000) still takes about 5s.
But the second call to accumulate(100000000) takes about 0s, because the result comes from the cache.
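
To check the timings above on your own machine, here is a minimal sketch. The `timed` helper and the smaller input size are my own assumptions for illustration, not part of the original report. `cache_info()` is the standard counter that `functools.lru_cache` attaches to the wrapped function, so it shows directly that the second call was served from the cache.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def accumulate(x):
    return sum(range(x))


def timed(func, *args):
    """Hypothetical helper: return (result, elapsed seconds) for one call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Smaller than the 100000000 used in the issue so the sketch runs quickly.
n = 1_000_000

first_result, first_seconds = timed(accumulate, n)    # cache miss: does the work
second_result, second_seconds = timed(accumulate, n)  # cache hit: dictionary lookup

info = accumulate.cache_info()
print(f"first call:  {first_seconds:.6f}s")
print(f"second call: {second_seconds:.6f}s")
print(f"hits={info.hits}, misses={info.misses}")
```

The speedup only appears when the exact same argument is passed again; a call with a new argument is a miss and pays the full cost.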

Zeroto521 changed the title from "ENH: use functools.lru_cache to cache result" to "ENH: use functools.lru_cache to speed up" on Nov 22, 2022
maxbachmann (Contributor) commented

This is not really a good idea for this library, since in most use cases it is uncommon to call the function twice with the same two strings. If this really is common in your specific case, it makes more sense to write yourself a wrapper function that adds the lru_cache in your own project.
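
A wrapper of the kind suggested here could look roughly like the sketch below. Both `levenshtein_distance` and `cached_distance` are hypothetical names: the first is a plain-Python stand-in so the example is self-contained, and in a real project you would call the library's own distance function inside the wrapper instead. The `maxsize` value is an arbitrary assumption.

```python
from functools import lru_cache


def levenshtein_distance(a: str, b: str) -> int:
    """Plain-Python stand-in for the library's distance function (hypothetical)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


@lru_cache(maxsize=4096)  # assumed size; tune to your data
def cached_distance(a: str, b: str) -> int:
    """Project-level wrapper: caches results for repeated (a, b) pairs."""
    return levenshtein_distance(a, b)


first = cached_distance("kitten", "sitting")   # computed
second = cached_distance("kitten", "sitting")  # served from the cache
info = cached_distance.cache_info()
print(first, second, info.hits, info.misses)
```

Note that lru_cache keys on the arguments in order, so `cached_distance(a, b)` and `cached_distance(b, a)` are stored as separate entries. The cache only pays off if your data really does contain many repeated pairs.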
