-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(format): introduce AdbcDriver110 #681
Conversation
Passing tests here: https://github.com/lidavidm/arrow-adbc/tree/spec-1.1.0 |
struct ADBC_EXPORT AdbcDriver110 { | ||
struct AdbcDriver base; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a new struct? Isn't the purpose of the version to be so that we don't need a new struct?
This feels like it's going to be a never-ending escalator if we have to always keep embedding the previous one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't add items to a struct without breaking binary compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think i'd rather break binary compatibility once here with a better solution (i guess ADBC 2.0?) than start down this path of continuously having to embed and embed and embed until we end up with stuff like driver->base.base.base.base.....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's appetite for 2.0.0, we could do that. Most of the proposals so far are not drastic changes: https://github.com/apache/arrow-adbc/milestone/3 except for possibly arbitrarily nested database schemas (which doesn't fit well into the current nested data hierarchy from GetObjects, since Arrow doesn't have recursive types).
We could use the driver manager to still provide compatibility if we want to do 2.0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though, my hope is that these changes would be infrequent and so we wouldn't need to have so much nesting. (Or, we could just reserve a table up front here for future expansion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer reserving something for future expansion rather than the embedding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paleolimbot wonder if you have any thoughts here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also do something where based on the passed in version we could assume the presence/absence of new pointers and avoid accessing the "new part" of the struct, UCX-style (as mentioned below). This is rather janky, but would be OK if we assume this mechanism is primarily for the driver/driver manager to coordinate.
I can go ahead and prototype that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the approach here look? #692
If it works, I'll figure out how to set up all the compatibility tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually like that approach better. Will add comments to that PR
struct AdbcDriver* private_driver; | ||
void* private_driver; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this technically a breaking change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; I think we could avoid this if we wanted, but it would require acrobatics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(basically: have the new function table be embedded as the private_data of the AdbcDriver)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any other libraries that you're aware of that have done something like this without an embedding 'treadmill' that we could use for inspiration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UCX has you pass in sizeof(struct) everywhere, basically, so it knows what fields are safe to access or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then ODBC, basically, assumes you will use the driver manager. I'm not familiar with the driver implementation side but from what I see there's no use of a table like this (which would have placed different constraints on us).
Fixes #317.