Reading messages from a C# producer #409

Open
dobryakov opened this issue Dec 14, 2020 · 10 comments

Comments

@dobryakov

Hello, we have a problem decoding messages that contain Decimal values (in terms of C# data types).

For example, when the producer (a C# application) sends us a value like "1234.56", we receive the message and get something like "\000\000\000\000�Ĥ���]Hf47b6ffc" in the PHP data.

Any idea how to solve this?

@nick-zh
Collaborator

nick-zh commented Dec 14, 2020

Does the rest of the data look fine?

@dobryakov
Author

> Does the rest of the data look fine?

In most cases, no :( The rest of the data looks like �Ĥ��� until the end of the body.

@nick-zh
Collaborator

nick-zh commented Dec 14, 2020

Are you maybe using compression in C# without having configured it on the PHP side?

@dobryakov
Author

> are you maybe using compression in c# and did not configure it on the php side?

Compression? Hmm... Could you please point me to the documentation describing this?

@nick-zh
Collaborator

nick-zh commented Dec 14, 2020

php-rdkafka is basically just a wrapper around librdkafka. The configuration settings can be found here:
https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
For compression you can set compression.codec. It looks like your C# producer is using it.

@nick-zh nick-zh changed the title Kafka Decimal support issue Reading messages from a C# producer Dec 14, 2020
@dobryakov
Author

Unfortunately, no.
As far as we have investigated, in the best case the library gives us the decimal value as a set of bytes. It looks like { value: "�Ĥ���" } in JSON.
The only working solution at the moment is calling hexdec(bin2hex(message['value'])), and you need to know the position of the decimal point (AKA the "exponent"). Typically it is described in the Avro schema, and the deserializer should ask the Schema Registry for it. In this case we have to do it manually, for example converting $1234567 to $12345.67.
Very strange.
I will continue to investigate tomorrow and come back to you with details.
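
For reference, if the field uses the Avro `decimal` logical type, the Avro specification defines its bytes as the big-endian two's-complement form of the unscaled integer, with the scale taken from the schema. A minimal sketch of that decoding in PHP follows. It assumes the producer really uses that logical type and that the scale is already known. `decodeAvroDecimal` is a hypothetical helper, not part of php-rdkafka or php-kafka-lib, and it only handles values that fit in a native PHP integer. Unlike `hexdec(bin2hex(...))`, it also handles negative values.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decode an Avro-style decimal into a plain string.
 * Assumes $bytes is the big-endian two's-complement unscaled integer and
 * $scale is the number of fractional digits declared in the schema.
 */
function decodeAvroDecimal(string $bytes, int $scale): string
{
    $length = strlen($bytes);
    if ($length === 0 || $length > PHP_INT_SIZE) {
        throw new InvalidArgumentException('Expected between 1 and ' . PHP_INT_SIZE . ' bytes');
    }
    if ($scale < 0) {
        throw new InvalidArgumentException('Scale must not be negative');
    }

    // Sign-extend: start with all bits set when the top bit of the first byte is set.
    $unscaled = (ord($bytes[0]) & 0x80) ? -1 : 0;
    for ($i = 0; $i < $length; $i++) {
        $unscaled = ($unscaled << 8) | ord($bytes[$i]);
    }

    // Work on the digit string so that no float arithmetic is involved.
    $negative = $unscaled < 0;
    $digits = ltrim((string) $unscaled, '-');

    if ($scale > 0) {
        // Left-pad so at least one digit remains before the decimal point.
        $digits = str_pad($digits, $scale + 1, '0', STR_PAD_LEFT);
        $digits = substr($digits, 0, -$scale) . '.' . substr($digits, -$scale);
    }

    return ($negative ? '-' : '') . $digits;
}

// 0x01E240 is 123456; with scale 2 this is the "1234.56" from the report.
echo decodeAvroDecimal("\x01\xE2\x40", 2), PHP_EOL; // 1234.56
echo decodeAvroDecimal("\xFE\x1D\xC0", 2), PHP_EOL; // -1234.56
```

Returning a string rather than a float avoids rounding; larger values would need an arbitrary-precision extension such as GMP or BCMath.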

@nick-zh
Collaborator

nick-zh commented Dec 15, 2020

I see the official C# client is also just a wrapper for librdkafka, so I am sure we can get it to work. Do you use Avro schemas for producing messages?

@Steveb-p
Contributor

I think it's just the fact that C# in this case saves the floating point number in a binary format, which is not something PHP can recover on its own (@dobryakov even mentioned that the exponent has to be known).

I think your library @nick-zh (https://github.com/jobcloud/php-kafka-lib) supports avro schemas, so it might work in this case?
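
One thing that may be worth checking: the leading `\000\000\000\000` plus one more byte in the reported payload is consistent with the Confluent Schema Registry wire format, which puts a zero magic byte and a 4-byte big-endian schema ID in front of the Avro body. Whether the C# producer actually uses that framing is an assumption here. The sketch below only splits such a header off; `splitConfluentFrame` is a hypothetical helper and does not decode the Avro body itself.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split a payload framed as
 * [magic byte 0x00][4-byte big-endian schema ID][Avro binary body].
 *
 * @return array{schemaId: int, avroBody: string}
 */
function splitConfluentFrame(string $payload): array
{
    if (strlen($payload) < 5 || $payload[0] !== "\x00") {
        throw new UnexpectedValueException('Payload does not start with the expected 5-byte header');
    }

    // 'N' reads an unsigned 32-bit big-endian integer; the result is stored under key 1.
    $schemaId = unpack('N', substr($payload, 1, 4))[1];

    return ['schemaId' => $schemaId, 'avroBody' => substr($payload, 5)];
}

// Made-up payload: schema ID 42 followed by three body bytes.
$frame = splitConfluentFrame("\x00\x00\x00\x00\x2A" . "abc");
echo $frame['schemaId'], PHP_EOL; // 42
echo $frame['avroBody'], PHP_EOL; // abc
```

If the schema ID resolves in the registry, the schema found there should state the scale needed to interpret any decimal fields.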

@dobryakov
Author

dobryakov commented Dec 16, 2020

@nick-zh @Steveb-p We are using the jobcloud library too. It doesn't work out of the box at the moment. We use the hack with hexdec(bin2hex()) and convert values manually ($1234567 to $12345.67, using the exponent that we read from the Avro schema by eye). I would be happy if someone improved it to do this conversion out of the box, including the exponent :)

@nick-zh
Collaborator

nick-zh commented Dec 16, 2020

Thanks guys for the feedback. I will have some time over the holidays, but there are a lot of topics I want to tackle, so I can't promise anything. I will try to find out where things are being done differently and whether we can recover from it.
If it is something that the C# client does differently from other official clients, I think we won't fix it.
We are using a Java producer as well in one of our projects; I will check whether we have a similar issue there (not sure if we use doubles there).
@dobryakov if you can provide more input (like a sample schema and a sample producer to reproduce it), that would be great; it would help me make progress faster on this ✌️
Are you using logical types in Avro?

3 participants