Decoding avro messages from a topic with different avro schemas #211
Comments
Is your issue similar to #75? It sounds like you may have implemented the
After reading that issue I am not sure it is that similar, since I did not get any errors when fetching a different schema with the same name. That said, the solution in the second comment by ggobbe is very similar to mine.
Could this be the source of my issue? I am not really familiar with how this typeHook is used; I just followed some default implementations.
Hi! I still see exactly the behaviour you describe in your comment. Did you end up finding a suitable workaround or fix for this issue? Thanks!
@XavRsl I don't know if it's still relevant, but I haven't found a better solution than the one I described in my first post: creating two separate schema registry instances.
Hello,
I've encountered a problem when trying to decode Avro messages whose schemas have the same namespace and name but differ in content.
I have the following scenario:
A Kafka topic contains Avro messages produced with two different schemas. The schemas are not completely different: they share the same namespace and name, but one is an updated version of the other.
I am using the following code to create the SchemaRegistry:
For the sake of simplicity I'll use these two schemas as an example.
So let's say schema 1:
{ type: 'record', name: 'Pet', fields: [ { name: 'kind', type: {type: 'enum', name: 'PetKind', symbols: ['CAT', 'DOG']} }, {name: 'name', type: 'string'} ] }
Schema 2 (updated schema):
{ type: 'record', name: 'Pet', fields: [ { name: 'kind', type: {type: 'enum', name: 'PetKind', symbols: ['CAT', 'DOG', 'MOUSE']} }, {name: 'name', type: 'string'}, ] }
Now let's assume that an Avro message encoded with schema 1 has been consumed. The registry fetches the schema using the registryId held within the binary message and saves it. As long as we keep receiving messages encoded with schema 1, everything is fine. Once we receive a message encoded with schema 2, schema 2 is fetched by its registryId as well. However, since we already have a schema with the name 'Pet', the new one is ignored and the schema obtained earlier (schema 1) is used instead. We therefore try to decode a message that was encoded with schema 2 using schema 1, which results in the following error messages: 'trailing data' and 'truncated data'.
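To make the behaviour I am describing concrete, here is a toy model of it. This is only a sketch of what I observe, not the library's actual internals: `makeNameKeyedCache` and `resolve` are hypothetical names I made up for illustration. It shows how a cache keyed by the type name keeps the first 'Pet' definition and silently ignores the second.

```javascript
// Toy model only: NOT the real implementation of the schema registry client.
// It mimics the observed behaviour: the first schema seen under a given name wins.
const schemaV1 = {
  type: 'record',
  name: 'Pet',
  fields: [
    { name: 'kind', type: { type: 'enum', name: 'PetKind', symbols: ['CAT', 'DOG'] } },
    { name: 'name', type: 'string' },
  ],
};

const schemaV2 = {
  type: 'record',
  name: 'Pet',
  fields: [
    { name: 'kind', type: { type: 'enum', name: 'PetKind', symbols: ['CAT', 'DOG', 'MOUSE'] } },
    { name: 'name', type: 'string' },
  ],
};

// Hypothetical cache keyed by schema name instead of registryId.
function makeNameKeyedCache() {
  const byName = new Map();
  return {
    resolve(schema) {
      // An existing entry with the same name is kept; the new schema is dropped.
      if (!byName.has(schema.name)) {
        byName.set(schema.name, schema);
      }
      return byName.get(schema.name);
    },
  };
}

const nameCache = makeNameKeyedCache();
const resolvedFirst = nameCache.resolve(schemaV1);
const resolvedSecond = nameCache.resolve(schemaV2);

// The second lookup still hands back schema 1, so 'MOUSE' is unknown to the decoder.
const symbolsUsedForV2 = resolvedSecond.fields[0].type.symbols;
console.log(symbolsUsedForV2);
```

In this model the message written with schema 2 would be read with the symbol list of schema 1, which matches the mismatch I am seeing.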
After realizing the problem I created two SchemaRegistry instances: one for Avro messages encoded with schema 1 and one for Avro messages encoded with schema 2. Right before decoding, I extract the registryId from the buffer, where it is stored in 4 bytes near the beginning (directly after the leading magic byte in the Confluent wire format), and based on that id I decide which registry to use. That way registry 1 only ever holds schema 1, and registry 2 only ever holds schema 2.
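A minimal sketch of that routing step, assuming the Confluent wire format (byte 0 is a magic byte with value 0, bytes 1 to 4 hold the schema id as a big-endian integer). The helper names `extractRegistryId` and `pickRegistry`, the ids 1 and 2, and the placeholder registry objects are all hypothetical; in my real code the map values are the two SchemaRegistry instances.

```javascript
// Assumption: Confluent wire format = [magic byte 0][4-byte big-endian schema id][Avro payload].
const MAGIC_BYTE = 0;

// Read the schema id from the message header without decoding the payload.
function extractRegistryId(buffer) {
  if (buffer.length < 5) {
    throw new Error('message too short to contain a schema id');
  }
  if (buffer.readUInt8(0) !== MAGIC_BYTE) {
    throw new Error('unexpected magic byte');
  }
  return buffer.readInt32BE(1);
}

// Choose the registry instance dedicated to the schema id found in the message.
function pickRegistry(buffer, registriesById) {
  const id = extractRegistryId(buffer);
  const registry = registriesById.get(id);
  if (!registry) {
    throw new Error(`no registry configured for schema id ${id}`);
  }
  return registry;
}

// Hypothetical stand-ins for the two SchemaRegistry instances.
const registryForSchema1 = { label: 'registry-1' };
const registryForSchema2 = { label: 'registry-2' };
const registriesById = new Map([
  [1, registryForSchema1],
  [2, registryForSchema2],
]);

// Build a fake message: magic byte, schema id 2, then some payload bytes.
const header = Buffer.alloc(5);
header.writeUInt8(MAGIC_BYTE, 0);
header.writeInt32BE(2, 1);
const fakeMessage = Buffer.concat([header, Buffer.from([0x04, 0x06])]);

const chosenRegistry = pickRegistry(fakeMessage, registriesById);
console.log(chosenRegistry.label);
```

With real instances, the chosen registry would then decode the whole buffer (header included), for example `await chosenRegistry.decode(message.value)`.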
My final thoughts and questions: