-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix conversion of non-utf8 sequences to AnyValue
#1253
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1253 +/- ##
=========================================
Coverage 84.62% 84.62%
- Complexity 2136 2140 +4
=========================================
Files 284 284
Lines 6054 6062 +8
=========================================
+ Hits 5123 5130 +7
- Misses 931 932 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 1 file with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
f107d6e
to
228102c
Compare
} | ||
|
||
return $result; | ||
} | ||
|
||
private static function isUtf8(string $value): bool | ||
{ | ||
return \extension_loaded('mbstring') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically this condition might not be necessary since symfony/polyfill-mbstring
is pulled in transitively from the SDK. Not sure if it's best practice to rely on that though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Nevay should we rely on the polyfill, or leave it as-is. Otherwise, I'm happy to approve and merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should leave it as-is. The polyfill is slower and exports can contain a large number of attributes.
Creating and exporting spans with 10 string attributes (16 bytes ea.; 8 valid and 2 invalid utf8 sequences) to local collector w/ default batch size was around 10% slower with the polyfill compared to the preg_match()
fallback (fallback was incorrect until latest push).
Benchmark results for creating AnyValue
from a 16 bytes string (polyfill is esp. costly for invalid utf8 sequences due to its iconv
fallback):
+------------------------------------+---------+---------+--------+---------+
| subject | memory | mode | rstdev | stdev |
+------------------------------------+---------+---------+--------+---------+
| benchCheckEncoding (valid) | 1.959mb | 0.245μs | ±1.95% | 0.005μs |
| benchCheckEncoding (invalid-begin) | 1.959mb | 0.248μs | ±1.68% | 0.004μs |
| benchCheckEncoding (invalid-end) | 1.959mb | 0.250μs | ±1.94% | 0.005μs |
| benchPregMatch (valid) | 1.959mb | 0.307μs | ±2.06% | 0.006μs |
| benchPregMatch (invalid-begin) | 1.959mb | 0.262μs | ±1.53% | 0.004μs |
| benchPregMatch (invalid-end) | 1.959mb | 0.270μs | ±1.62% | 0.004μs |
| benchIsUtf8 (valid) | 1.959mb | 0.250μs | ±1.62% | 0.004μs |
| benchIsUtf8 (invalid-begin) | 1.959mb | 0.252μs | ±1.64% | 0.004μs |
| benchIsUtf8 (invalid-end) | 1.959mb | 0.253μs | ±1.31% | 0.003μs |
| benchPolyfill (valid) | 1.959mb | 0.383μs | ±1.67% | 0.006μs |
| benchPolyfill (invalid-begin) | 1.959mb | 0.830μs | ±1.51% | 0.013μs |
| benchPolyfill (invalid-end) | 1.959mb | 0.945μs | ±1.51% | 0.014μs |
+------------------------------------+---------+---------+--------+---------+
> String values which are not valid Unicode sequences SHOULD be converted to AnyValue's bytes_value with the bytes representing the string in the original order and format of the source string.
228102c
to
a9b76b2
Compare
Converting to AnyValue: