ENH: use `functools.lru_cache` to speed up #42

Zeroto521 · 2022-11-22T01:40:34Z

The Levenshtein distance algorithm can't be vectorized.
So the calculation is very slow on large data.

One idea to speed it up is to cache results.
The accumulate example here shows the effect of the cache.

def accumulate(x):
    return sum(range(x))

Without lru_cache, accumulate(100000000) takes about 5s on my local machine, whether it is the first call or the second.

from functools import lru_cache

@lru_cache
def accumulate(x):
    return sum(range(x))

After adding lru_cache, the first call to accumulate(100000000) still takes about 5s.
But the second call to accumulate(100000000) takes effectively 0s, because the result is returned from the cache.
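
A minimal sketch of how the cache effect could be checked, assuming a smaller input than the one in the post so it finishes quickly. The `timed` helper is hypothetical and only for illustration. `cache_info()` is the standard counter that `lru_cache` attaches to the wrapped function.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def accumulate(x):
    return sum(range(x))


def timed(n):
    """Hypothetical helper: return the result and the elapsed wall time."""
    start = time.perf_counter()
    result = accumulate(n)
    return result, time.perf_counter() - start


# Smaller input than in the post, chosen only to keep the run short.
first_result, first_seconds = timed(1_000_000)
second_result, second_seconds = timed(1_000_000)

info = accumulate.cache_info()
print(f"first call:  {first_seconds:.6f}s")
print(f"second call: {second_seconds:.6f}s")
print(f"hits={info.hits} misses={info.misses}")
```

The first call is recorded as a miss and the second as a hit, so the second call skips the summation entirely.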


maxbachmann · 2023-01-13T13:53:26Z

This is not really a good idea for this library, since in most use cases it is uncommon to call the function twice with the same two strings. If this is really common in your specific case, it makes more sense to write yourself a wrapper function that adds the lru_cache in your own project.
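
A project-level wrapper along those lines might look like the sketch below. It is only an illustration: `levenshtein` is a plain-Python stand-in for whatever distance function the project actually calls, and `cached_distance` and the `maxsize` value are hypothetical choices, not part of this library.

```python
from functools import lru_cache


def levenshtein(a: str, b: str) -> int:
    """Plain-Python edit distance, standing in for the library call."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


@lru_cache(maxsize=4096)  # assumed bound; tune to the project's memory budget
def cached_distance(a: str, b: str) -> int:
    """Hypothetical wrapper: repeated (a, b) pairs are served from the cache."""
    return levenshtein(a, b)


print(cached_distance("kitten", "sitting"))  # computed
print(cached_distance("kitten", "sitting"))  # served from the cache
print(cached_distance.cache_info())
```

Note that the cache key is the ordered argument pair, so `("a", "b")` and `("b", "a")` are stored separately. The wrapper only pays off when the same pairs really do recur.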

Zeroto521 changed the title ~~ENH: use functools.lru_cache to cache result~~ ENH: use functools.lru_cache to speed up Nov 22, 2022

Zeroto521 mentioned this issue Nov 22, 2022

ENH: use functools.lru_cache to speed up rapidfuzz/RapidFuzz#292

Closed
