Finally, a suffix-array match finder that has seen enough love to be actually usable! Thank you for that!
After playing with it for a bit, I realized that esa_matchfinder_find_all_matches drops all matches that point back to the very beginning of the input.
With data like this:
hello, hello
the match at position 7 (distance 7, length 5) is silently dropped, because it points back to position 0, so its offset is zero.
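To make the example concrete, here is a minimal brute-force sketch that finds the longest earlier occurrence for a given position. It is only an illustration: `naive_match` and `naive_longest_match` are hypothetical names, not part of esa-matchfinder, and the search is a quadratic scan rather than a suffix-array lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of esa-matchfinder. */
typedef struct {
    size_t source;  /* absolute start of the earlier occurrence */
    size_t length;  /* number of matching bytes; 0 means no match */
} naive_match;

/* Brute-force search for the longest earlier occurrence of the text that
   starts at `pos`. Overlapping matches are allowed. Ties keep the
   earliest source. */
static naive_match naive_longest_match(const unsigned char *text, size_t n, size_t pos)
{
    naive_match best = {0, 0};
    assert(text != NULL && pos <= n);

    for (size_t src = 0; src < pos; ++src) {
        size_t len = 0;
        /* src < pos, so src + len stays in bounds whenever pos + len does. */
        while (pos + len < n && text[src + len] == text[pos + len]) {
            ++len;
        }
        if (len > best.length) {
            best.source = src;
            best.length = len;
        }
    }
    return best;
}
```

For the 12-byte input `hello, hello` and `pos = 7`, this scan finds a 5-byte match whose source is position 0, which is the match described above.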
Is that intended behavior?
For me, the parse up to position 7 looks like this; the one marked with + is the one reported back.
@hegdi Unfortunately, this is by design, and I noted it in the readme: "due to implementation details the esa-matchfinder can not find any matches with offset 0." This can certainly be fixed, but at a performance cost, and based on my testing this limitation does not make any difference in compression ratio. Alternatively, you can extend the input text with an additional symbol at the beginning. The actual symbol does not matter, because it won't be matched anyway.
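The padding workaround could be sketched roughly as follows. `pad_front` and `unpad_position` are hypothetical helper names, not esa-matchfinder functions, and the sketch does not call the library at all. The only assumption is that the padded buffer is what gets handed to the match finder, so any absolute position reported for it is one greater than the corresponding position in the original text, while match lengths and distances are unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `text` into a new buffer with one dummy byte
   in front. The caller owns the result and must free() it. */
static unsigned char *pad_front(const unsigned char *text, size_t n)
{
    unsigned char *padded = malloc(n + 1);
    if (padded == NULL) {
        return NULL;
    }
    padded[0] = 0;                /* arbitrary value, per the remark above */
    memcpy(padded + 1, text, n);  /* original byte i now lives at index i + 1 */
    return padded;
}

/* Hypothetical helper: map an absolute position in the padded buffer
   back to the original text. Position 0 is the dummy byte itself. */
static size_t unpad_position(size_t padded_pos)
{
    assert(padded_pos >= 1);
    return padded_pos - 1;
}
```

With `hello, hello` padded this way, the second `hello` starts at position 8 and its earlier occurrence at position 1, so the source is no longer at offset 0.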