Is StringDeserializer the right default to work correctly with the codec charset? #13

Open
colinsurprenant opened this issue Feb 13, 2020 · 0 comments
Labels
bug Something isn't working enhancement New feature or request

Comments

@colinsurprenant
Contributor

I recently reviewed an issue where the data in Kafka was encoded in ISO8859-1 and could not be decoded correctly using charset => "ISO8859-1" in the codec.

It appears that when using org.apache.kafka.common.serialization.StringDeserializer (the default), the Kafka library assumes the data is UTF-8, so the kafka input receives incorrectly decoded strings.
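To illustrate why the codec charset cannot repair this afterwards, here is a minimal sketch using only the JDK. It does not use the Kafka client; `CharsetMismatchDemo` and its method names are hypothetical, and `decodeAsUtf8` only stands in for what a UTF-8 string deserializer would do with the raw bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetMismatchDemo {
    // Stand-in for a producer that wrote the text as ISO-8859-1 bytes.
    static byte[] produceLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Stand-in for a deserializer that assumes the bytes are UTF-8.
    static String decodeAsUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Decoding with the charset the producer actually used.
    static String decodeAsLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // True if the original bytes can still be recovered from the decoded string.
    static boolean roundTrips(byte[] wire, String decoded) {
        return Arrays.equals(wire, decoded.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";            // "café"
        byte[] wire = produceLatin1(original);    // 0xE9 is not valid UTF-8 on its own

        String wrong = decodeAsUtf8(wire);
        String right = decodeAsLatin1(wire);

        System.out.println("UTF-8 decode matches original:      " + original.equals(wrong));
        System.out.println("ISO-8859-1 decode matches original: " + original.equals(right));
        System.out.println("bytes recoverable after UTF-8 decode: " + roundTrips(wire, wrong));
    }
}
```

Once the invalid byte has been replaced during the UTF-8 decode, the original byte is gone, so a later charset conversion has nothing left to convert.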

Per the Kafka docs: https://kafka.apache.org/10/javadoc/org/apache/kafka/common/serialization/StringDeserializer.html

String encoding defaults to UTF8 and can be customized by setting the property key.deserializer.encoding, value.deserializer.encoding or deserializer.encoding. The first two take precedence over the last.

  • I believe (not tested) that setting the property value.deserializer.encoding to ISO8859-1 would have worked.
  • OTOH, using the org.apache.kafka.common.serialization.ByteArrayDeserializer and setting charset => "ISO8859-1" in the codec worked correctly.
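The two options above can be sketched side by side. This is a JDK-only illustration, not the Kafka implementation: `resolveEncoding` is a hypothetical helper that mirrors the precedence quoted from the docs, `decodeValue` stands in for a string deserializer configured that way, and `codecDecode` stands in for a codec applying its charset to untouched bytes. The code uses the canonical Java name `ISO-8859-1`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class DeserializerEncodingSketch {
    // Hypothetical helper: the specific property wins over the generic one,
    // and UTF8 is the fallback, as described in the quoted documentation.
    static String resolveEncoding(Map<String, String> configs, boolean isKey) {
        String specific = configs.get(isKey ? "key.deserializer.encoding"
                                            : "value.deserializer.encoding");
        if (specific != null) {
            return specific;
        }
        String generic = configs.get("deserializer.encoding");
        return generic != null ? generic : "UTF8";
    }

    // Option 1: a string deserializer that honours the configured encoding.
    static String decodeValue(Map<String, String> configs, byte[] data) {
        return new String(data, Charset.forName(resolveEncoding(configs, false)));
    }

    // Option 2: bytes are passed through unchanged and the codec decodes them.
    static String codecDecode(byte[] rawBytes, String charset) {
        return new String(rawBytes, Charset.forName(charset));
    }

    public static void main(String[] args) {
        byte[] wire = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        Map<String, String> configs = new HashMap<>();
        configs.put("value.deserializer.encoding", "ISO-8859-1");

        System.out.println("option 1 ok: " + "caf\u00e9".equals(decodeValue(configs, wire)));
        System.out.println("option 2 ok: " + "caf\u00e9".equals(codecDecode(wire, "ISO-8859-1")));
    }
}
```

With option 2 the charset is configured in one place only (the codec), which is the argument for making the byte-array path the default.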

This leads me to think that we should probably use the ByteArrayDeserializer by default if we want the input to be compatible out of the box with our codecs and their charset conversion.

In any case, we should also add a note about this to the docs.

colinsurprenant added the bug and enhancement labels Feb 13, 2020