filter2 command is too slow #269

Open
y9c opened this issue Apr 11, 2024 · 8 comments

Comments

@y9c
Contributor

y9c commented Apr 11, 2024

Compared with the filter command or awk, the filter2 command is much slower, especially for rules with multiple conditions.

It might be related to this call inside the for-loop, which parses the expression again on every iteration.

csvtk/csvtk/cmd/filter2.go

Lines 370 to 376 in 9407f73

// evaluate
if containCustomFuncs {
	expression, err = govaluate.NewEvaluableExpressionWithFunctions(filterStr1, functions)
} else {
	expression, err = govaluate.NewEvaluableExpression(filterStr1)
}
checkError(err)

@shenwei356
Owner

Yes, I noticed that. It is slow :(

@y9c
Contributor Author

y9c commented Apr 23, 2024

Can we move the Expression parsing function outside the for-loop and run it only once?
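To make the question concrete, here is a minimal sketch of "parse once, evaluate per row". It is not csvtk's code and does not use govaluate: `compiledFilter`, `compile`, and `match` are hypothetical names, and the parser only handles the form `$column <op> number`. The point is only that the parsing cost is paid once, and each row pays for a lookup and a comparison.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compiledFilter is a hypothetical pre-parsed form of a simple filter
// such as "$age > 18". It is an illustration, not a csvtk type.
type compiledFilter struct {
	column    string  // column name without the leading '$'
	op        string  // one of ">", "<", ">=", "<=", "==", "!="
	threshold float64 // numeric right-hand side
}

// compile parses "$column <op> <number>" once, before the row loop.
func compile(expr string) (*compiledFilter, error) {
	fields := strings.Fields(expr)
	if len(fields) != 3 || !strings.HasPrefix(fields[0], "$") {
		return nil, fmt.Errorf("unsupported expression: %q", expr)
	}
	switch fields[1] {
	case ">", "<", ">=", "<=", "==", "!=":
	default:
		return nil, fmt.Errorf("unsupported operator: %q", fields[1])
	}
	v, err := strconv.ParseFloat(fields[2], 64)
	if err != nil {
		return nil, err
	}
	return &compiledFilter{column: fields[0][1:], op: fields[1], threshold: v}, nil
}

// match evaluates the pre-parsed filter against one row's values.
func (f *compiledFilter) match(row map[string]string) (bool, error) {
	raw, ok := row[f.column]
	if !ok {
		return false, fmt.Errorf("unknown column %q", f.column)
	}
	x, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return false, err
	}
	switch f.op {
	case ">":
		return x > f.threshold, nil
	case "<":
		return x < f.threshold, nil
	case ">=":
		return x >= f.threshold, nil
	case "<=":
		return x <= f.threshold, nil
	case "==":
		return x == f.threshold, nil
	case "!=":
		return x != f.threshold, nil
	default:
		return false, fmt.Errorf("unsupported operator: %q", f.op)
	}
}

func main() {
	// Parsing happens exactly once, outside the loop.
	f, err := compile("$age > 18")
	if err != nil {
		panic(err)
	}
	rows := []map[string]string{
		{"name": "amy", "age": "17"},
		{"name": "bob", "age": "30"},
	}
	for _, row := range rows {
		keep, err := f.match(row)
		if err != nil {
			panic(err)
		}
		fmt.Println(row["name"], keep)
	}
}
```

govaluate's `Evaluate` also accepts a parameter map, so a similar split may be possible there, but whether that fits the `$column` syntax used by filter2 is an open question in this sketch.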

@shenwei356
Owner

shenwei356 commented May 17, 2024

It is slow, but it has to be done that way, because filterStr1 is different in each iteration.

@y9c
Contributor Author

y9c commented May 17, 2024

Why is filterStr1 different? Can we cache the parsed results?

@shenwei356
Owner

It's the expression itself: in something like '$age > 18', $age needs to be replaced with the value from each row.
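A rough sketch of that substitution, to show why the expression text changes from row to row. `substitute` is a hypothetical helper written for this illustration, not the function csvtk uses, and it does naive text replacement (a column name that is a prefix of another, such as `age` and `ageGroup`, would need more care).

```go
package main

import (
	"fmt"
	"strings"
)

// substitute returns expr with every "$name" replaced by that row's value.
// Hypothetical helper for illustration; header and row must have equal length.
func substitute(expr string, header, row []string) string {
	pairs := make([]string, 0, 2*len(header))
	for i, name := range header {
		pairs = append(pairs, "$"+name, row[i])
	}
	return strings.NewReplacer(pairs...).Replace(expr)
}

func main() {
	header := []string{"name", "age"}
	rows := [][]string{
		{"amy", "17"},
		{"bob", "30"},
		{"cat", "30"},
	}
	// Collect the distinct expression strings produced across all rows.
	seen := map[string]bool{}
	for _, row := range rows {
		s := substitute("$age > 18", header, row)
		seen[s] = true
		fmt.Println(s)
	}
	fmt.Println("distinct expressions:", len(seen))
}
```

Because the substituted text depends on the row's values, a cache keyed on the expression string only hits when rows happen to share the same values in the referenced columns, as the two rows with age 30 do here.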

@y9c
Contributor Author

y9c commented May 17, 2024

Yes. I mean, can we parse the expression into something like '$1>18' and reuse the code of the filter command to handle the computation afterward?
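As a sketch of what that rewrite could look like: resolve column names to 1-based positions once, using the header, so that nothing name-related is left to do per row. `resolveColumns` and `columnRef` are hypothetical names, and the pattern assumes column names made of letters, digits, and underscores. Whether the filter command's code could consume the result is not something this sketch establishes.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// columnRef matches named references such as $age or $first_name.
// It does not match already-numeric references such as $1.
var columnRef = regexp.MustCompile(`\$[A-Za-z_][A-Za-z0-9_]*`)

// resolveColumns rewrites each "$name" in expr to "$N", where N is the
// 1-based position of that column in header. Unknown names are an error.
func resolveColumns(expr string, header []string) (string, error) {
	index := make(map[string]int, len(header))
	for i, name := range header {
		index[name] = i + 1
	}
	var missing string
	out := columnRef.ReplaceAllStringFunc(expr, func(ref string) string {
		n, ok := index[ref[1:]]
		if !ok {
			if missing == "" {
				missing = ref
			}
			return ref
		}
		return "$" + strconv.Itoa(n)
	})
	if missing != "" {
		return "", fmt.Errorf("unknown column %s", missing)
	}
	return out, nil
}

func main() {
	header := []string{"name", "age", "score"}
	resolved, err := resolveColumns("$age > 18 && $score >= 60", header)
	if err != nil {
		panic(err)
	}
	fmt.Println(resolved)
}
```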

@shenwei356
Owner

> parsed the expression as something like '$1>18' and reuse the code of the filter command

I don't think so.

God, it's really slow~ I used it a lot recently. Have to improve it, when I have time ~

@shenwei356
Owner

shenwei356 commented Nov 22, 2024

It's fixed. Please update to v0.31.1: https://github.com/shenwei356/csvtk/releases/tag/v0.31.1

The problem is in the parameter preparation step, not the expression evaluation. d20aa89
