Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(format): introduce ADBC API revision 1.1.0 #692

Merged
merged 12 commits into from
Jun 6, 2023

Conversation

lidavidm
Copy link
Member

Fixes #317.

adbc.h Outdated Show resolved Hide resolved
@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from b860a60 to e97c30f Compare May 18, 2023 16:29
@lidavidm
Copy link
Member Author

I've addressed feedback + added APIs for most of the new proposals. If this is OK I'll follow up with Go (with a set of interfaces, one per method or so) and Java (which is very easy).

I think we will have to punt on #320 and #621. I don't think #630 is worth doing. #685 is too speculative at this point.

adbc.h Outdated
AdbcStatusCode AdbcStatementCancel(struct AdbcStatement* statement,
struct AdbcError* error);

/// \brief Get a double property of the statement.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would this be used for? is it just to avoid consumers having to call strtof/strtod themselves?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a constant above motivating it. Also, 1) there's no Get* function at all right now and 2) returning strings is really annoying due to allocations.

I could change this to return a string, the caveats would be either: 1) we have to add some release-callback thing (C Data Interface-ish) or 2) we return a const char*, but calling this again or doing other things with the statement will invalidate the pointer (SQLite-ish) or 3) we expect a caller-allocated and/or fixed-size buffer (ODBC-ish)

Comment on lines +849 to +855
if (version >= ADBC_VERSION_1_1_0) {
auto* driver = reinterpret_cast<struct AdbcDriver*>(raw_driver);
FILL_DEFAULT(driver, StatementExecuteSchema);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this also have the same for the other new functions? AdbcStatementCancel and AdbcStatementGetDouble?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to defer implementation, just wanted to demonstrate the basic idea worked. I just want to get the API sort of in the right shape.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agh, let me retarget this PR against the branch, not main.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There - the intent is to build this out on a branch then manually ff-merge it at the end. If the approach looks reasonable I'm also going to build out more serious compatibility tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, okay then in that case I'm in favor of this approach.

@lidavidm lidavidm changed the base branch from main to spec-1.1.0 May 18, 2023 20:31
@lidavidm
Copy link
Member Author

Interestingly the CI failure exposes what does look like a compatibility issue: the new fields in AdbcDriver make the Python test crash (because it mixes a driver using the new definition and a wheel using the old definition)

@lidavidm
Copy link
Member Author

Ah, because the drivers do this

  auto* driver = reinterpret_cast<struct AdbcDriver*>(raw_driver);
  std::memset(driver, 0, sizeof(*driver));

and now that the struct has changed, they need to actually check the size of the struct they're about to manipulate.

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from e97c30f to bd5d608 Compare May 19, 2023 18:20
@lidavidm
Copy link
Member Author

I'm quite tempted to just remove cpplint, it seems to mostly just give noise and it's basically unconfigurable/unmaintained

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from bd5d608 to f89fb97 Compare May 19, 2023 18:23
@lidavidm
Copy link
Member Author

Some other thoughts: maybe we shouldn't overload getInfo for things like current catalog/schema, and instead add a get/set option pair? That would be consistent with how Java does this

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from f89fb97 to 9ee2b8d Compare May 22, 2023 14:24
@lidavidm
Copy link
Member Author

Added Java and Go, though I think I'd like to go back and align things a bit.

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from 5d4f31d to 7761b84 Compare May 22, 2023 17:11
@lidavidm lidavidm marked this pull request as ready for review May 22, 2023 17:22
@lidavidm
Copy link
Member Author

There's a macOS error in CI that I'll try to track down, but otherwise I think this is ready (for C, Java, Go)

@lidavidm lidavidm changed the title feat(format): introduce ADBC_VERSION_1_1_0 feat(format): introduce ADBC API revision 1.1.0 May 22, 2023
@lidavidm
Copy link
Member Author

lidavidm commented May 22, 2023

TODOs

  • Reserve more space for AdbcDriver?
  • Figure out macOS CI error (doesn't replicate under Rosetta...)

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from 7761b84 to 6cae1a4 Compare May 22, 2023 17:38
@lidavidm
Copy link
Member Author

I never really learned amd64 assembly but I think the CI failure I'm seeing is possibly a miscompilation or something odd? I see basically an unconditional call to UBSAN's abort here:

    0x103770820 <+64>:   movq   -0x10(%rbp), %rax
    0x103770824 <+68>:   movq   %rax, -0x20(%rbp)
    0x103770828 <+72>:   movq   -0x20(%rbp), %rax
    0x10377082c <+76>:   movq   %rax, -0x28(%rbp)
    0x103770830 <+80>:   xorl   %eax, %eax
    0x103770832 <+82>:   testb  $0x1, %al
    0x103770834 <+84>:   jne    0x10377084a               ; <+106> at postgresql.cc
    0x10377083a <+90>:   leaq   0x109f3f(%rip), %rdi      ; _dyld_private + 158976
    0x103770841 <+97>:   xorl   %eax, %eax
    0x103770843 <+99>:   movl   %eax, %esi
    0x103770845 <+101>:  callq  0x1037f75ae               ; symbol stub for: __ubsan_handle_type_mismatch_v1_abort

aka (if I'm reading it right):

  • -0x10(%rbp) is raw_driver,
  • 4 instructions of shuffling registers/stack,
  • a comparison that always fails (since we zero out EAX, then bitwise-AND it with 1 before doing a conditional jump),
  • an unconditional call to __ubsan_handle_type_mismatch_v1_abort

Ghidra seems to agree:

  if (param_1 == 1000000) {
    if (param_2 == 0) {
      local_9 = 5;
    }
    else {
      ___ubsan_handle_type_mismatch_v1_abort(&DAT_00194780,0);

I'm not sure why this is happening, or if there's UB in the code and this is a consequence of it...

In the debugger, I can confirm raw_driver is not NULL despite UBSAN reporting a null-pointer-dereference.

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from 6cae1a4 to c296e17 Compare May 22, 2023 19:32
@lidavidm
Copy link
Member Author

For now let's see if writing it differently helps...

@lidavidm
Copy link
Member Author

Ah, I think I know what happened.

@lidavidm lidavidm force-pushed the spec-1.1.0-newstyle branch from c296e17 to 0a3ba63 Compare May 22, 2023 20:05
if (!statement->private_driver) {
return ADBC_STATUS_INVALID_STATE;
}
return statement->private_driver->StatementExecuteSchema(statement, schema, error);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the pointer-to-function be checked for null? Or is it the caller's job?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's assumed that if you're using these functions, you're using the driver manager, and that the driver manager has initialized all the functions. (Obviously, not true right now, since I decided to split implementation out.)

@lidavidm
Copy link
Member Author

lidavidm commented Jun 5, 2023

I'm going to merge this soon (into the branch) and then update/continue the follow-up PRs.

@lidavidm lidavidm merged commit de770ef into apache:spec-1.1.0 Jun 6, 2023
@lidavidm lidavidm deleted the spec-1.1.0-newstyle branch June 6, 2023 14:11
@lidavidm
Copy link
Member Author

lidavidm commented Jun 6, 2023

Ah...let me file a follow up PR for some of Antoine's last comments

@lidavidm
Copy link
Member Author

lidavidm commented Jun 6, 2023

#731

lidavidm added a commit that referenced this pull request Jun 6, 2023
lidavidm added a commit to lidavidm/arrow-adbc that referenced this pull request Jun 19, 2023
Fixes apache#317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit to lidavidm/arrow-adbc that referenced this pull request Jun 19, 2023
lidavidm added a commit that referenced this pull request Jun 23, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Jun 23, 2023
lidavidm added a commit that referenced this pull request Jun 27, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Jun 27, 2023
lidavidm added a commit that referenced this pull request Jul 11, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Jul 11, 2023
lidavidm added a commit that referenced this pull request Jul 20, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Jul 20, 2023
lidavidm added a commit that referenced this pull request Jul 21, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Jul 21, 2023
lidavidm added a commit that referenced this pull request Aug 3, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Aug 3, 2023
lidavidm added a commit that referenced this pull request Aug 10, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Aug 10, 2023
lidavidm added a commit that referenced this pull request Aug 10, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Aug 10, 2023
lidavidm added a commit that referenced this pull request Aug 28, 2023
Fixes #317.

---------

Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
lidavidm added a commit that referenced this pull request Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

[Format][C] Introducing new API functions in a backwards compatible way
4 participants