Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(format): introduce AdbcDriver110 #681

Closed
wants to merge 1 commit into from

Conversation

lidavidm
Copy link
Member

Fixes #317.

@lidavidm lidavidm marked this pull request as ready for review May 17, 2023 12:24
@lidavidm
Copy link
Member Author

Passing tests here: https://github.com/lidavidm/arrow-adbc/tree/spec-1.1.0

@lidavidm lidavidm requested a review from zeroshade May 17, 2023 12:24
Comment on lines +684 to +686
struct ADBC_EXPORT AdbcDriver110 {
struct AdbcDriver base;
};
Copy link
Member

@zeroshade zeroshade May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a new struct? Isn't the purpose of the version to be so that we don't need a new struct?

This feels like it's going to be a never-ending escalator if we have to always keep embedding the previous one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't add items to a struct without breaking binary compatibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i'd rather break binary compatibility once here with a better solution (i guess ADBC 2.0?) than start down this path of continuously having to embed and embed and embed until we end up with stuff like driver->base.base.base.base.....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's appetite for 2.0.0, we could do that. Most of the proposals so far are not drastic changes: https://github.com/apache/arrow-adbc/milestone/3 except for possibly arbitrarily nested database schemas (which doesn't fit well into the current nested data hierarchy from GetObjects, since Arrow doesn't have recursive types).

We could use the driver manager to still provide compatibility if we want to do 2.0.0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, my hope is that these changes would be infrequent and so we wouldn't need to have so much nesting. (Or, we could just reserve a table up front here for future expansion.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer reserving something for future expansion rather than the embedding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paleolimbot wonder if you have any thoughts here?

Copy link
Member Author

@lidavidm lidavidm May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also do something where based on the passed in version we could assume the presence/absence of new pointers and avoid accessing the "new part" of the struct, UCX-style (as mentioned below). This is rather janky, but would be OK if we assume this mechanism is primarily for the driver/driver manager to coordinate.

I can go ahead and prototype that out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the approach here look? #692

If it works, I'll figure out how to set up all the compatibility tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually like that approach better. Will add comments to that PR

Comment on lines -482 to +488
struct AdbcDriver* private_driver;
void* private_driver;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this technically a breaking change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; I think we could avoid this if we wanted, but it would require acrobatics

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(basically: have the new function table be embedded as the private_data of the AdbcDriver)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other libraries that you're aware of that have done something like this without an embedding 'treadmill' that we could use for inspiration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UCX has you pass in sizeof(struct) everywhere, basically, so it knows what fields are safe to access or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then ODBC, basically, assumes you will use the driver manager. I'm not familiar with the driver implementation side but from what I see there's no use of a table like this (which would have placed different constraints on us).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

2 participants