Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserialization derive macros #1024

Merged
merged 29 commits into from
Aug 6, 2024
Merged

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Jun 27, 2024

Fixes #962

Derive macros for DeserializeValue and DeserializeRow are introduced.

An example feature-complete use

#[derive(scylla_macros::DeserializeValue, PartialEq, Eq, Debug)]
#[scylla(enforce_order, skip_name_checks, forbid_excess_udt_fields)]
struct Udt<'a> {
    a: &'a str,
    #[scylla(skip)]
    x: String,
    #[scylla(allow_missing)]
    b: Option<i32>,
    #[scylla(rename = "d", default_when_null)]
    c: i64,
}

#[derive(scylla_macros::DeserializeRow, PartialEq, Eq, Debug)]
#[scylla(enforce_order)
struct MyRow<'a> {
    a: &'a str,
    #[scylla(skip)]
    x: String,
    b: Option<i32>,
    #[scylla(rename = "d")]
    c: i64,
}

Features

Flavours

Both macros support both unordered (the default) and enforce_order flavours. These are analogous to Serialize{Value,Row} macros flavours, i.e.

  • unordered matches by name and allows for arbitrary order of fields in Rust struct and of fields in CQL UDT or columns in CQL row;
  • enforce_order matches by position, verifying that names match (unless skip_name_checks attribute is given). This is more strict, but also more performant (because the code, by expecting a specific order, can be more compact and less dynamic).

Struct attributes specific to DeserializeValue:

  • forbid_excess_udt_fields - by default, if a UDT definition contains more fields than the Rust struct (in unordered flavour: anywhere, in enforce_order: in suffix), those excess fields are ignored. With forbid_excess_udt_fields attribute set, type check fails with an error in case such fields are present. For comparison, DeserializeRow always requires the same number of Rust fields and CQL columns, effectively rejecting rows with excess columns.

Field attributes common to DeserializeValue and DeserializeRow

  • skip - a Rust struct field is skipped during type check and deserialization, and initialised with Default::default();
  • rename = "X" - a Rust struct field is matched to CQL UDT field / row column with the name specified instead of its own name.

Field attributes specific to DeserializeValue

The following attributes are very important in production when adding new fields to a UDT, if one first updates clients, then the cluster. They allow for seamless transition in such case.

  • allow_missing - if a Rust struct field's name is missing from CQL UDT definition, the field is initialised with Default::default();
  • default_when_null - if a CQL UDT field contains null, but the Rust field's type expects non-null, deserialization would normally fail. With this attribute, the field is initialised with Default::default() instead.

Errors

The returned errors are rich and insightful, using the new deserialization error framework. They are well-tested.

Testing

All flavours and attributes should be tested in various combinations. Please verify that I didn't forget about any.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@wprzytula wprzytula added enhancement New feature or request performance Improves performance of existing features labels Jun 27, 2024
@wprzytula wprzytula added this to the 0.14.0 milestone Jun 27, 2024
@wprzytula wprzytula self-assigned this Jun 27, 2024
@wprzytula wprzytula force-pushed the deser-derive-macros branch from accce36 to 2edbbfa Compare June 27, 2024 08:32
@wprzytula wprzytula requested review from Lorak-mmk and muzarski June 27, 2024 08:33
@github-actions github-actions bot added the semver-checks-breaking cargo-semver-checks reports that this PR introduces breaking API changes label Jun 27, 2024
Copy link

github-actions bot commented Jun 27, 2024

cargo semver-checks detected some API incompatibilities in this PR.
Checked commit: f470d1d

See the following report for details:

cargo semver-checks output
./scripts/semver-checks.sh --baseline-rev 36d5edd3a06a6808ce19dd594ea4904c8b48c0c2
+ cargo semver-checks -p scylla -p scylla-cql --baseline-rev 36d5edd3a06a6808ce19dd594ea4904c8b48c0c2
     Cloning 36d5edd3a06a6808ce19dd594ea4904c8b48c0c2
     Parsing scylla v0.13.0 (current)
      Parsed [  21.923s] (current)
     Parsing scylla v0.13.0 (baseline)
      Parsed [  20.612s] (baseline)
    Checking scylla v0.13.0 -> v0.13.0 (no change)
     Checked [   0.072s] 76 checks: 76 pass, 0 skip
     Summary no semver update required
    Finished [  42.657s] scylla
     Parsing scylla-cql v0.2.0 (current)
      Parsed [  10.588s] (current)
     Parsing scylla-cql v0.2.0 (baseline)
      Parsed [  10.367s] (baseline)
    Checking scylla-cql v0.2.0 -> v0.2.0 (no change)
     Checked [   0.070s] 76 checks: 75 pass, 1 fail, 0 warn, 0 skip

--- failure auto_trait_impl_removed: auto trait no longer implemented ---

Description:
A public type has stopped implementing one or more auto traits. This can break downstream code that depends on the traits being implemented.
        ref: https://doc.rust-lang.org/reference/special-types-and-traits.html#auto-traits
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.33.0/src/lints/auto_trait_impl_removed.ron

Failed in:
  type UdtTypeCheckErrorKind is no longer UnwindSafe, in /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/types/deserialize/value.rs:1549
  type UdtTypeCheckErrorKind is no longer RefUnwindSafe, in /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/types/deserialize/value.rs:1549

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  21.076s] scylla-cql
make: *** [Makefile:61: semver-rev] Error 1

@wprzytula wprzytula force-pushed the deser-derive-macros branch 2 times, most recently from 9579ead to b090a75 Compare June 27, 2024 08:42
@wprzytula
Copy link
Collaborator Author

Cassandra failure in CI is unrelated to the PR. @Lorak-mmk @muzarski will you take a look?

@wprzytula wprzytula force-pushed the deser-derive-macros branch from b090a75 to db49d5d Compare June 28, 2024 06:49
@wprzytula
Copy link
Collaborator Author

v1.1: more attribute verification & tests for that.

@Lorak-mmk Lorak-mmk self-assigned this Jul 9, 2024
@muzarski muzarski self-assigned this Jul 22, 2024
scylla-macros/src/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/row.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/row.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/row.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/row.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/value.rs Outdated Show resolved Hide resolved
@Lorak-mmk Lorak-mmk removed their assignment Jul 31, 2024
@muzarski muzarski removed their assignment Jul 31, 2024
@wprzytula wprzytula force-pushed the deser-derive-macros branch from db49d5d to 6ce6ca3 Compare August 1, 2024 07:41
@wprzytula
Copy link
Collaborator Author

Rebased on main.

@wprzytula wprzytula force-pushed the deser-derive-macros branch from 6ce6ca3 to b2d5e1b Compare August 1, 2024 09:38
@wprzytula
Copy link
Collaborator Author

v1.2: addressed most of the review comments by @muzarski. Thank you very much for spotting bugs!

@wprzytula wprzytula requested a review from muzarski August 1, 2024 09:39
@muzarski
Copy link
Contributor

muzarski commented Aug 1, 2024

v1.2: addressed most of the review comments by @muzarski. Thank you very much for spotting bugs!

Remember that previously, the doc tests were not executed in CI (the CI passed, even though there was a buggy test). Please, make sure that we include doc tests in CI. I briefly checked, and see that there is no cargo test --doc step in CI.

@wprzytula
Copy link
Collaborator Author

v1.2: addressed most of the review comments by @muzarski. Thank you very much for spotting bugs!

Remember that previously, the doc tests were not executed in CI (the CI passed, even though there was a buggy test). Please, make sure that we include doc tests in CI. I briefly checked, and see that there is no cargo test --doc step in CI.

The doc tests have always been executed in CI. The CI passed even though the test was buggy, because the tests failed to compile (as expected by the tests) because of different reasons than expected.

cargo test runs a superset of tests run by cargo test --doc.

@muzarski
Copy link
Contributor

muzarski commented Aug 1, 2024

v1.2: addressed most of the review comments by @muzarski. Thank you very much for spotting bugs!

Remember that previously, the doc tests were not executed in CI (the CI passed, even though there was a buggy test). Please, make sure that we include doc tests in CI. I briefly checked, and see that there is no cargo test --doc step in CI.

The doc tests have always been executed in CI. The CI passed even though the test was buggy, because the tests failed to compile (as expected by the tests) because of different reasons than expected.

cargo test runs a superset of tests run by cargo test --doc.

Oh, makes sense. I forgot that all tests failed because of crate attribute value.

Copy link
Collaborator

@Lorak-mmk Lorak-mmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, but big PR, took some time to review.

scylla-macros/src/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/row.rs Show resolved Hide resolved
scylla-macros/src/deserialize/row.rs Outdated Show resolved Hide resolved
scylla/src/macros.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-macros/src/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/value.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/row.rs Outdated Show resolved Hide resolved
scylla-cql/src/types/deserialize/value.rs Show resolved Hide resolved
wprzytula and others added 2 commits August 5, 2024 12:04
syn provides strong typing on syntax items, which is better for us.

Co-authored-by: Piotr Dulikowski <[email protected]>
Copy link
Contributor

@muzarski muzarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I totally forgot about my comment regarding panicking in case of a name mismatch. I think that we should do the check in DeserializeRow::deserialize, and panic just like in DeserializeValue::deserialize for udts. Otherwise, LGTM. Good job!

wprzytula and others added 5 commits August 6, 2024 17:33
DeserializeRow enforce_order flavour is added, analogous to SerializeRow
enforce_order flavour.
The flavour requires that fields in Rust struct are ordered the same way
as columns in CQL row definition.

Co-authored-by: Piotr Dulikowski <[email protected]>
DeserializeRow enforce_order flavour is tested in the following aspects:
- the macro executes properly on a struct,
- the generated type_check() and deserialize() implementations are
    correct both in valid and invalid cases (i.e. return error in
    invalid cases and expected value in valid cases).

Co-authored-by: Piotr Dulikowski <[email protected]>
DeserializeRow enforce_order flavour is tested in the following aspects:
- the generated type_check() and deserialize() implementations produce
    meaningful, appropriate errors in invalid cases.
`skip_name_checks` is allowed only in `enforce_order` flavour.
If enabled, it turns off name match verification (e.g., for performance
purposes) between Rust struct and CQL UDT definition.

Co-authored-by: Piotr Dulikowski <[email protected]>
New tests are added for `enforce_order` flavour with name match
verification turned off.

Co-authored-by: Piotr Dulikowski <[email protected]>
@wprzytula wprzytula force-pushed the deser-derive-macros branch from 44d261a to 9fe650d Compare August 6, 2024 15:33
@wprzytula
Copy link
Collaborator Author

v1.3.1: added a field-column name check in DeserializeRow in enforce_order flavour.

wprzytula and others added 12 commits August 6, 2024 17:45
`skip_name_checks` is allowed only in `enforce_order` flavour.
If enabled, it turns off name match verification (e.g., for performance
purposes) between Rust struct fields and CQL row columns.

Co-authored-by: Piotr Dulikowski <[email protected]>
New tests are added for `enforce_order` flavour with name match
verification turned off.

Co-authored-by: Piotr Dulikowski <[email protected]>
`skip_name_checks` requires `enforce_order`, so let's ensure this at
compile time by panicking the macro with an insightful message.
The same with `skip_name_checks` excluding `rename`.
And we check that `rename`s do not introduce any name clashes.
By default, if a UDT definition contains more fields than the Rust
struct (in unordered flavour: anywhere, in enforce_order: in suffix),
those excess fields are ignored. The `forbid_excess_udt_fields`
attribute is added to fail with an error in case such fields are
present.
For comparison, DeserializeRow always requires the same number of Rust
fields and CQL columns, effectively rejecting rows with excess columns.
New tests are added for both flavours that show how the attribute fails
type check phase in case that excess fields are present.
By default, if a UDT definition does not contain a field that the Rust
struct requires, type checking fails. Instead, if `allow_missing` is
specified for a field, type check lets it pass and deserialization
yields `Default::default()`.
This is important in production: if adding a field to a UDT and clients
are updated before the cluster, then using this attribute clients can
operate correctly until the cluster gets updated (by extending the UDT
definition with the field expected by clients).

Co-authored-by: Piotr Dulikowski <[email protected]>
`enforce_order` flavour with `skip_name_checks` can't support
`allow_missing` attribute in certain cases. Namely:

* Fields with `allow_missing` are only permitted at the end of the
struct, i.e. no field without `allow_missing` and `skip` is allowed to
be after any field with `allow_missing`.

If the condition is not upheld, the problem of matching fields becomes
unsolvable. Thus, we don't support such cases, so let's ensure this
condition at compile time by panicking the macro with an insightful
message.
New tests are added for both flavours that show how the attribute
allows for successful type check and deserialization when the
corresponding field is missing from the CQL UDT definition.

Co-authored-by: Piotr Dulikowski <[email protected]>
By default, if DB provides null value in serialized data and the
corresponding Rust type expects non-null value, deserialization fails.
Instead, if `default_when_null` is specified for a field,
deserialization yields `Default::default()`.
This is important in production: when added a field to a UDT,
the existing UDT instances have the new field filled with null, even if
the data represented makes no sense to allow nulls.
By using `default_when_null`, clients can handle such situation without
using `Option` in their Rust struct.
New tests are added for both flavours that show how the attribute
allows for successful deserialization when the field contains null and
the corresponding Rust type expects non-null value.
value.rs already got huge, better to put tests aside. Also, if only
tests are modified when put in the same file as library code, Cargo will
rebuild the lib crate anyway. Conversely, when tests are in a separate
file, the lib crate won't be rebuilt and this saves precious time.
If only tests are modified when put in the same file as library code,
Cargo will rebuild the lib crate anyway. Conversely, when tests are in
a separate file, the lib crate won't be rebuilt and this saves precious
time.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/deserialization enhancement New feature or request performance Improves performance of existing features semver-checks-breaking cargo-semver-checks reports that this PR introduces breaking API changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deserialization refactor: macros for the new traits
3 participants