Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binding: fix by-name parameter binding and enable corresponding IT #187

Merged
merged 11 commits into from
Dec 11, 2024

Conversation

muzarski
Copy link
Collaborator

@muzarski muzarski commented Oct 9, 2024

Motivation

Before this PR, binding values by name to unprepared queries was buggy:

  • Main issue was that we were not even reading name-to-index mapping (SimpleQuery::name_to_bound_index) - only writing to it. In result, we were not really binding by name, but rather by position, in the same order as cass_statement_bind_* calls appeared. The bound_values vector was directly passed to Session::query_* calls.
  • CassStatement::bind_cql_value would return CASS_OK when size of SimpleQuery::name_to_bound_index map exceeded the declared size of parameters (CassStatement::bound_values.len())
  • We wouldn't unquote the parameter names for unprepared queries in CassStatement::bind_cql_value

This PR solves this issue, and enables another integration test suite - NamedParametersTests.

Changes

Apart from the fixes to CassStatement::bind_cql_value, I also introduced a custom implementation of SerializeRow for unprepared queries execution. I created a SimpleQueryRowSerializer struct which holds necessary information to perform serialization. Now, with following context:

  • bound_values vector - directly taken from CassStatement::bound_values
  • an optional name-to-index mapping (from SimpleQuery::name_to_bound_index) - if it's non-empty, it means that we bind by names, otherwise we preserve the original order of bound_values.
  • column specs from RowSerializationContext

we are able to serialize values in correct order for unprepared queries.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • PR description sums up the changes and reasons why they should be introduced.
  • [ ] I have implemented Rust unit tests for the features/changes introduced.
  • I have enabled appropriate tests in .github/workflows/build.yml in gtest_filter.
  • I have enabled appropriate tests in .github/workflows/cassandra.yml in gtest_filter.

@muzarski muzarski self-assigned this Oct 9, 2024
@Lorak-mmk
Copy link
Collaborator

4 out of 6 tests from this suite are passing. We can enable them.

When it comes to SimpleStatementAnyOrder and SimpleStatementInvalidName, I don't think we will ever enable them. The reason is that rust-driver no longer supports 'WITH_NAMES_FOR_VALUES' flag (https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec#L379) since serialization refactor, thus does not support "by name" value binding.

I don't get it. Rust Driver prepares the statements underneath and should be able to handle named bind markers in queries.

The reason that analogous tests for prepared statements pass is because "name-to-indexes" mapping can be retrieved from the prepared statement's metadata received from the server upon preparation. Thanks to that, we can bind the values by name in correct order (in cpp-rust-driver layer).

@muzarski
Copy link
Collaborator Author

muzarski commented Oct 9, 2024

I don't get it. Rust Driver prepares the statements underneath and should be able to handle named bind markers in queries.

You are right. Probably, the custom SerializeRow implementation will do the job. I'll try it out

@muzarski muzarski marked this pull request as draft October 9, 2024 10:55
@muzarski muzarski force-pushed the enable-named-parameters-tests branch 5 times, most recently from 93fdd2f to f7bb42a Compare October 9, 2024 16:00
@muzarski muzarski changed the title ci: enable passing NamedParameterTests ci: enable NamedParameterTests Oct 9, 2024
@muzarski muzarski marked this pull request as ready for review October 9, 2024 16:40
@muzarski
Copy link
Collaborator Author

muzarski commented Oct 9, 2024

I updated the description.

scylla-rust-wrapper/src/statement.rs Outdated Show resolved Hide resolved
scylla-rust-wrapper/src/statement.rs Show resolved Hide resolved
scylla-rust-wrapper/src/statement.rs Outdated Show resolved Hide resolved
scylla-rust-wrapper/src/cass_error.rs Outdated Show resolved Hide resolved
scylla-rust-wrapper/src/cass_error.rs Outdated Show resolved Hide resolved
@muzarski
Copy link
Collaborator Author

v2: rebased on #200. We don't match the specific SerializationError by stringification anymore - downcast_ref() method is used.

@muzarski muzarski force-pushed the enable-named-parameters-tests branch from f9a965e to 9fe9d95 Compare October 30, 2024 12:55
@muzarski
Copy link
Collaborator Author

v3: rewritten the bind_cql_value_by_name method according to @wprzytula suggestions - thank you! Now, the method logic is much clearer since we got rid of early returns.

@muzarski
Copy link
Collaborator Author

Rebased on master

@wprzytula
Copy link
Collaborator

The PR's name is a bit misleading. It sounds like it changes only tests and enables them; instead, it fixes bugs in the driver! This should definitely be indicated in the name, and also prioritised higher.

@wprzytula wprzytula added the bug Something isn't working label Dec 1, 2024
@muzarski muzarski force-pushed the enable-named-parameters-tests branch from 0b7cf56 to e6d4f31 Compare December 2, 2024 15:44
@muzarski muzarski changed the title ci: enable NamedParameterTests binding: fix by-name parameter binding and enable corresponding IT Dec 2, 2024
@muzarski muzarski force-pushed the enable-named-parameters-tests branch 2 times, most recently from a7e6028 to f36a5f3 Compare December 4, 2024 00:55
@muzarski
Copy link
Collaborator Author

muzarski commented Dec 4, 2024

v3.1: Applied some of the suggestions by @Lorak-mmk

@muzarski muzarski requested a review from Lorak-mmk December 4, 2024 10:55
@muzarski muzarski force-pushed the enable-named-parameters-tests branch from f36a5f3 to 95015a1 Compare December 5, 2024 02:42
@muzarski
Copy link
Collaborator Author

muzarski commented Dec 5, 2024

v4:

  • Applied a suggestion by @Lorak-mmk. Moved bound_values out of CassStatement, to new structs BoundSimpleQuery and BoundPreparedStatement. It reduces the complexity of binding functions, as we do not have to match against &self.statement in multiple places.
  • Defined two new integration test cases that perform by-name binding to multiple columns using the same bind marker

muzarski and others added 4 commits December 9, 2024 13:33
Current binding code is fairly complicated. This is because we need
to match against the &[mut] self.statement in each function.

To prevent this, we move the bound_values to lower layer - i.e.
BoundPreparedStatement and BoundSimpleQuery.

This commit does not change the logic at all. It is equivalent to the
previous version.
`cass_statement_reset_paramters` should reset the
name-to-index mapping used in simple queries.
I reduced mutability scope of some variables, renamed some variables
so they have more descriptive names, and added some comments.
I refactored the function, so we get rid of early returns.

Thanks to that, we can discover a bug:
Previously, if we did not find a free index
for a value with given name, we returned CASS_OK.
cpp-driver returns an error in such case.

Co-authored-by: Wojciech Przytuła <[email protected]>
@muzarski muzarski force-pushed the enable-named-parameters-tests branch from 95015a1 to c6a1e6b Compare December 9, 2024 12:33
@muzarski
Copy link
Collaborator Author

muzarski commented Dec 9, 2024

Rebased on master

@muzarski muzarski added this to the 0.4 milestone Dec 9, 2024
@muzarski muzarski force-pushed the enable-named-parameters-tests branch from c6a1e6b to 658f1cf Compare December 10, 2024 19:53
Previously, we would search for the first value in `bound_values`
vector that is `Unset` to prevent overwriting the values, that were
previously possibly bound via by index binding.

This however, is not consistent with cpp-driver. The cpp-driver
overwrites the value in such case.

To allocate new index, we simply get the current size of `name_to_bound`
map size.

Consider following example, where user declared 2 parameters (thus bound_values.len() == 2):
- by_name("foo", 5) <- bound_values[0] = 5, name_to_bound_index["foo"] = 0
- by_index(1, 19) <- bound_values[1] = 19
- by_name("bar", 13) <- previously: returns an error, because we can't find
  an index where value is Unset. Now: bound_values[1] = 13, name_to_bound_index["bar"] = 1

I'm adding an integration test case for analogous scenario at the end of this PR.
Previously, we would store a quoted parameter name for simple queries.
This is a bug.
The quotes are only introduces to decide whether the name
is case sensitive for prepared statements.

To fix this, I extracted the common logic of unquoting to
`CassStatement::bind_cql_value_by_name`. It will pass the unquoted
parameter name to methods on corresponding statement kind.
In this commit, we also adjust one of the NamedParameterTests to this change.
We are going to enable this test suite in the next commit.
In these tests, we use the same bind marker (:abc) to bind
values to 3 different columns. We then validate the results.

There are two test cases - one for prepared statements, and other one
for unprepared statements.
@muzarski muzarski force-pushed the enable-named-parameters-tests branch from 658f1cf to 658ebb0 Compare December 10, 2024 19:54
@muzarski
Copy link
Collaborator Author

v4.1: Thanks to @Lorak-mmk I discovered a small inconsistency between cpp-rust and cpp-driver in regards to interleaving by name and by index bindings. I fixed it (commit: binding: adjust finding new index to cpp-driver semantics) and introduced integration test cases that verify this (last commit). I checked that the same test cases pass for cpp-driver.

Copy link
Collaborator

@wprzytula wprzytula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd feel more secure about these changes with at least a number of unit tests of the BoundStatement. There are integration tests, that's true, but for some reason I tend to trust Rust unit tests targetting the specific area more.

@muzarski
Copy link
Collaborator Author

muzarski commented Dec 11, 2024

I think I'd feel more secure about these changes with at least a number of unit tests of the BoundStatement. There are integration tests, that's true, but for some reason I tend to trust Rust unit tests targetting the specific area more.

I think it would be possible to implement unit tests for simple queries, if I extracted another struct containing binding payload (everything from BoundSimpleQuery apart from Query). It would be possible because Query is not required during binding, and we could mock metadata for serialization.

However, I don't see how we could implement unit tests for prepared statements. Does rust driver allow to somehow mock PreparedStatement structure?

Anyhow, I'll merge it as is. There are multiple places where Rust unit tests would be useful, but currently (AFAIK) we don't have a way to mock structures from rust-driver.

@muzarski muzarski merged commit cb0077b into scylladb:master Dec 11, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants