speedup #13
Comments
A header-only option: https://github.com/mmarkeloff/cpp-url-decode. ada itself also has something similar, but the implementation above is simple enough to be understood.
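and here I am writing my own function :D
#include <Rcpp.h>
#include <cstdio>
using namespace Rcpp;

// [[Rcpp::export]]
CharacterVector url_decode_cpp(CharacterVector URLs) {
  return sapply(URLs, [](const String& URL) {
    std::string input = URL;
    std::string output;
    size_t i = 0;
    while (i < input.length()) {
      unsigned int value = 0;  // %x expects an unsigned int
      // Decode "%XX" only when two characters follow and sscanf reads a hex value from them.
      if (input[i] == '%' && i + 2 < input.length() &&
          sscanf(input.substr(i + 1, 2).c_str(), "%2x", &value) == 1) {
        output += static_cast<char>(value);
        i += 3;
      } else {
        // Copy ordinary characters and malformed escapes through unchanged.
        output += input[i];
        i++;
      }
    }
    return output;
  });
}
bench::mark(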
url_decode_cpp("https://www.google.co.jp/search?q=\u30c9\u30a4\u30c4"),
URLdecode("https://www.google.co.jp/search?q=\u30c9\u30a4\u30c4"),
iterations = 1000, check=FALSE
)
# A tibble: 2 × 13
expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result
<bch:expr> <bch:> <bch:t> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list>
1 "url_deco… 4.4µs 5.29µs 164215. 2.49KB 0 1000 0 6.09ms <NULL>
2 "URLdecod… 31.3µs 36.53µs 25980. 0B 26.0 999 1 38.45ms <NULL>
# ℹ 3 more variables: memory <list>, time <list>, gc <list>
But I will check out the header-only thing.
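For comparison, here is a minimal sketch of the same decoding loop without `sscanf` or `substr`, written against plain `std::string` so it compiles without Rcpp. The names `hex_value` and `percent_decode` are hypothetical and are not taken from ada or from the header-only library linked above. The choice to copy malformed escapes through unchanged is an assumption, not something decided in this thread.

```cpp
#include <cstddef>
#include <string>

// Map one hex digit to its value, or -1 if the character is not a hex digit.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: percent-decode a single URL string.
// Assumption: malformed escapes (a trailing "%", or "%zz") are copied through unchanged.
std::string percent_decode(const std::string& input) {
    std::string output;
    output.reserve(input.size());  // decoded text is never longer than the input
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            const int hi = hex_value(input[i + 1]);
            const int lo = hex_value(input[i + 2]);
            if (hi >= 0 && lo >= 0) {
                output += static_cast<char>(hi * 16 + lo);
                i += 2;  // skip the two hex digits; the loop's ++i skips the '%'
                continue;
            }
        }
        output += input[i];
    }
    return output;
}
```

Calling `percent_decode("a%20b")` yields `"a b"`. Wrapping this in an Rcpp function would still need the same per-element conversion as `url_decode_cpp`, so any gain would have to be confirmed with `bench::mark`.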
@chainsawriot OK, it is surprising to me that my implementation is as fast. I will take that, but will do some robustness checks.
urls <- rep("https://www.google.co.jp/search?q=\u30c9\u30a4\u30c4", 5000)
bench::mark(
cpp_david = url_decode_cpp(urls),
URLdecode = URLdecode(urls),
cpp_header = url_decode_header(urls),
iterations = 1, check=FALSE
)
# A tibble: 3 × 13
expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory
<bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list>
1 cpp_david 4.38ms 4.38ms 228. 41.6KB 0 1 0 4.38ms <NULL> <Rprofmem>
2 URLdecode 153.79ms 153.79ms 6.50 39.1KB 13.0 1 2 153.79ms <NULL> <Rprofmem>
3 cpp_header 4.98ms 4.98ms 201. 41.6KB 0 1 0 4.98ms <NULL> <Rprofmem>
# ℹ 2 more variables: time <list>, gc <list>
URLdecode speedup done with #25

Benchmark datasets:

remove utf8 (need to set R>=4.2)

won't speed up
Runtime is OK, but given how fast ada-url is by itself, there is room for improvement in a) the R/C++ interface and b) the URL encoding used to fix UTF-8 support (see #1).