[BUG][Spark] INSERT INTO struct evolution in map/arrays breaks when a column is renamed #3227
Comments
Hello @johanl-db, |
@Richard-code-gig thanks for volunteering for this. I believe this only impacts INSERT by position:
INSERT by name ( Once you have a test, head over to DeltaAnalysis.resolveQueryColumnsByOrdinal where the logic aligning schema for INSERT by position is. I believe the issue is that |
Okay thanks. I am picking it up now. |
I hope this message finds you well. I have reviewed the issue regarding schema evolution in Delta Lake, particularly the problem that arises when renaming a column while trying to insert new data into a table with nested structures. I believe I can contribute a fix for this bug independently. Before I get started, I wanted to check if there are any specific guidelines or preferred practices for contributing to the Delta Lake codebase that I should be aware of. I am excited about the opportunity to contribute to the project and help improve the functionality. |
@Pshak-20000 You can take a look at https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md for general information about contributions to this project. You may want to start by checking with @Richard-code-gig to see if he's still working on this issue or not. |
Hello @johanl-db, @Pshak-20000 I am working on it and almost done. Johanl I need your help on something minor to complete the assignment, please let me know how I can reach you. Thanks |
You can reach me here or through my email ([email protected]) if that’s more convenient. Let me know what you need help with. Looking forward to hearing from you. |
You can join the Delta slack workspace and reach me there: https://go.delta.io/slack |
Hi, |
@Richard-code-gig If you want some support/feedback from my side, you can open a PR even if it's not entirely finalized yet and we can discuss there |
…mn renaming Resolved the issue described in [Bug delta-io#3227](delta-io#3227) where adding a field inside a struct (nested within a map) while renaming a top column caused the operation to fail. The fix focuses on handling schema changes without affecting the integrity of existing data structures, specifically avoiding issues with nested fields within a map and renamed columns. Signed-off-by: Sola Richard Olorunfemi <[email protected]>
…mn renaming Resolved the issue described in [Bug delta-io#3227](delta-io#3227) where adding a field inside a struct (nested within a map) while renaming a top column caused the operation to fail. The fix focuses on handling schema changes without affecting the integrity of existing data structures, specifically avoiding issues with nested fields within a map and renamed columns. Signed-off-by: Sola Richard Olorunfemi <[email protected]> fix!:renamed the added DeltaWriteExample to EvolutionWithMap
Hi Johanl-db, |
Hi @johanl-db, |
…mn renaming Resolved the issue described in [Bug delta-io#3227](delta-io#3227) where adding a field inside a struct (nested within a map) while renaming a top column caused the operation to fail. The fix focuses on handling schema changes without affecting the integrity of existing data structures, specifically avoiding issues with nested fields within a map and renamed columns. fix!:renamed the added DeltaWriteExample to EvolutionWithMap fix!: Modified TypeWideningInsertSchemaEvolutionSuite to accommodate that schema evolution is now allowed for maps Signed-off-by: Sola Richard Olorunfemi <[email protected]>
…mn renaming Resolved the issue described in [Bug delta-io#3227](delta-io#3227) where adding a field inside a struct (nested within a map) while renaming a top column caused the operation to fail. The fix focuses on handling schema changes without affecting the integrity of existing data structures, specifically avoiding issues with nested fields within a map and renamed columns. fix!:renamed the added DeltaWriteExample to EvolutionWithMap fix!: Modified TypeWideningInsertSchemaEvolutionSuite to accommodate that schema evolution is now allowed for maps Signed-off-by: Sola Richard Olorunfemi <[email protected]> fix!: addCastToMap to handle complex types. Added tests to cover new abilities
…mn renaming Resolved the issue described in [Bug delta-io#3227](delta-io#3227) where adding a field inside a struct (nested within a map) while renaming a top column caused the operation to fail. The fix focuses on handling schema changes without affecting the integrity of existing data structures, specifically avoiding issues with nested fields within a map and renamed columns. fix!:renamed the added DeltaWriteExample to EvolutionWithMap fix!: Modified TypeWideningInsertSchemaEvolutionSuite to accommodate that schema evolution is now allowed for maps Signed-off-by: Sola Richard Olorunfemi <[email protected]> fix!: addCastToMap to handle complex types. Added tests to cover new abilities fix: resolved scalaStyle error
…mn renaming Resolved the issue described in [Bug delta-io#3227](delta-io#3227) where adding a field inside a struct (nested within a map) while renaming a top column caused the operation to fail. The fix focuses on handling schema changes without affecting the integrity of existing data structures, specifically avoiding issues with nested fields within a map and renamed columns. fix!:renamed the added DeltaWriteExample to EvolutionWithMap fix!: Modified TypeWideningInsertSchemaEvolutionSuite to accommodate that schema evolution is now allowed for maps Signed-off-by: Sola Richard Olorunfemi <[email protected]> fix!: addCastToMap to handle complex types. Added tests to cover new abilities fix: resolved scalaStyle error fix: yet another scalaStyle issue fix!:made some schema evolution for maps tests in DeltaInsertIntoTableSuite more flexible fix: DeltaAnalysis
Bug
Describe the problem
Schema evolution in INSERT doesn't always work properly when the new column is added to a struct nested within an array or map. If another column is renamed, the operation fails when it should succeed.
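To make the expected behaviour concrete, here is a minimal sketch in plain Python of how an INSERT by position could align a query schema with a table schema while still evolving a struct nested in a map. It is a toy model, not Delta's or Spark's code: the schema encoding and the helper names (merge_type, merge_struct_by_name, evolve_by_position) are hypothetical, and it assumes that top-level columns are paired by ordinal, that nested struct fields are matched by name, and that both schemas have the same number of top-level columns.

```python
# Toy schema model (an assumption for illustration, not a Delta/Spark API):
#   struct    -> list of (field_name, type) pairs
#   map       -> ("map", key_type, value_type)
#   array     -> ("array", element_type)
#   primitive -> a type name such as "int" or "string"


def merge_type(target, source):
    """Evolve `target` so it also holds fields that only `source` has."""
    if isinstance(target, list) and isinstance(source, list):
        return merge_struct_by_name(target, source)
    if (isinstance(target, tuple) and isinstance(source, tuple)
            and target[0] == source[0]):
        if target[0] == "map":
            return ("map",
                    merge_type(target[1], source[1]),
                    merge_type(target[2], source[2]))
        if target[0] == "array":
            return ("array", merge_type(target[1], source[1]))
    # Primitives (or mismatched kinds): keep the table's type unchanged.
    return target


def merge_struct_by_name(target, source):
    """Match nested struct fields by name and append fields new in `source`."""
    source_fields = dict(source)
    target_names = {name for name, _ in target}
    merged = []
    for name, field_type in target:
        if name in source_fields:
            merged.append((name, merge_type(field_type, source_fields[name])))
        else:
            merged.append((name, field_type))
    merged += [(name, t) for name, t in source if name not in target_names]
    return merged


def evolve_by_position(table, query):
    """Pair top-level columns by ordinal: the query's column names are
    ignored, the table's names are kept, and nested types are evolved."""
    assert len(table) == len(query), "sketch assumes equal column counts"
    return [(table_name, merge_type(table_type, query_type))
            for (table_name, table_type), (_, query_type) in zip(table, query)]


# Schemas taken from this issue: the query renames `key` and adds `comment`.
table = [("key", "int"),
         ("metrics", ("map", "string", [("id", "int"), ("value", "int")]))]
query = [("renamed_key", "int"),
         ("metrics", ("map", "string",
                      [("id", "int"), ("value", "int"), ("comment", "string")]))]

evolved = evolve_by_position(table, query)
```

In this model the rename of the first column has no effect on the second one: the evolved schema keeps the name key and the struct inside the map gains the comment field, which is the outcome the issue expects from the real operation.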
Steps to reproduce
For example with a map, in Python, renaming column key to renamed_key and adding a field comment in the struct inside the map:
Note that the struct inside the map isn't evolved to add the new field. Without the unrelated column being renamed, this works well:
Observed results
The operation fails.
Expected results
The operation succeeds, the table schema is changed to
key int, metrics map<string, struct<id: int, value: int, comment: string>>
and the data is inserted.
Willingness to contribute
The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?