Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Prevent over span #3258

Merged

Conversation

fredcarle
Copy link
Collaborator

@fredcarle fredcarle commented Nov 19, 2024

Relevant issue(s)

Resolves #3242

Description

This PR solves the situation where deleted documents in the immediate next collection by ID were returned on a full collection query. The reason for this behaviour was due to the fetcher start method redefining the spans based on wanting deleted docs or not and was defining an end key that might have been "prefix-ended" based on a shorter prefix (i.e. collection instead of instance type). The solution was to always redefine the end key as the prefix end of the start key.

To fix this we removed the concept of spans and replaced it with a list of prefixes. This results in the fetcher being asked, for example, for all docs in collection 1 with a prefix of /data/1 instead of a span from /data/1 to /data/2. Furthermore, when the fetcher checks if it need to get deleted docs or non-deleted docs, the resulting prefix becomes /data/1/< v or d > instead of the span from /data/1/< v or d > to /data/2/< v or d > (the span is wrong).

The first commit documents the bug with an integration test.

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

make test

Specify the platform(s) on which this was tested:

  • MacOS

@fredcarle fredcarle added bug Something isn't working area/db-system Related to the core system related components of the DB area/planner Related to the planner system labels Nov 19, 2024
@fredcarle fredcarle added this to the DefraDB v0.15 milestone Nov 19, 2024
@fredcarle fredcarle requested a review from a team November 19, 2024 19:24
@fredcarle fredcarle self-assigned this Nov 19, 2024
Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test or two for the fix.

Copy link

codecov bot commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 74.83871% with 39 lines in your changes missing coverage. Please review.

Project coverage is 77.39%. Comparing base (fa0d92b) to head (a8c67e1).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
internal/db/fetcher/fetcher.go 90.24% 4 Missing ⚠️
internal/planner/arbitrary_join.go 0.00% 3 Missing ⚠️
internal/planner/scan.go 81.25% 2 Missing and 1 partial ⚠️
internal/planner/values.go 0.00% 3 Missing ⚠️
internal/planner/view.go 0.00% 3 Missing ⚠️
internal/planner/delete.go 0.00% 2 Missing ⚠️
internal/planner/group.go 0.00% 2 Missing ⚠️
internal/planner/lens.go 0.00% 2 Missing ⚠️
internal/planner/operation.go 0.00% 2 Missing ⚠️
internal/planner/top.go 0.00% 2 Missing ⚠️
... and 12 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3258      +/-   ##
===========================================
- Coverage    77.43%   77.39%   -0.04%     
===========================================
  Files          382      381       -1     
  Lines        35290    35138     -152     
===========================================
- Hits         27325    27194     -131     
+ Misses        6325     6316       -9     
+ Partials      1640     1628      -12     
Flag Coverage Δ
all-tests 77.39% <74.84%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/db/collection_get.go 79.31% <100.00%> (ø)
internal/db/collection_index.go 87.47% <100.00%> (-0.07%) ⬇️
internal/db/fetcher/errors.go 24.00% <ø> (-8.00%) ⬇️
internal/db/fetcher/indexer.go 83.80% <100.00%> (-0.11%) ⬇️
internal/db/fetcher/versioned.go 78.57% <100.00%> (ø)
internal/lens/fetcher.go 71.63% <100.00%> (+1.40%) ⬆️
internal/planner/explain.go 59.40% <ø> (ø)
internal/planner/multi.go 80.30% <100.00%> (ø)
internal/planner/planner.go 80.56% <ø> (ø)
internal/planner/select.go 84.42% <100.00%> (-0.19%) ⬇️
... and 23 more

... and 17 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa0d92b...a8c67e1. Read the comment docs.

---- 🚨 Try these New Features:

@fredcarle
Copy link
Collaborator Author

Please add a test or two for the fix.

Right... I forgot to commit them...

@fredcarle fredcarle force-pushed the fredcarle/fix/3242-over-span branch from c9bc6ed to 0dd5ccf Compare November 19, 2024 20:08
)
span.Start = span.Start.(keys.DataStoreKey).WithValueFlag()
// The end key should always be the prefix end of the start key
span.End = span.Start.PrefixEnd()
Copy link
Contributor

@AndrewSisley AndrewSisley Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: I think this is incorrect if the user provides two docIDs that are next to each other, there is a function that is called earlier that merges spans that sit next to each other - the change here will mean that the second doc would not be fetched.

e.g.:

User (docIDs: [baefooooooa, baefooooob]{
...
}

We currently have no integration tests for this as it is quite painful to find docs with docIds next to each other, but there are unit tests for the func responsible IIRC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should create a test action testUtils.MineDocWithID :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're talking about core.MergeAscending, it's called within the same function after this change to the span and should have no impact when querying two adjacent docs.

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It solves an urgent and bad miss from us, but introduces another (less urgent bug) and well as kind of abuses spans.

I think we should find another fix, probably one that handles whatever is causing the span.End to be invalid.

Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I will leave the approval to Andy

)
span.Start = span.Start.(keys.DataStoreKey).WithValueFlag()
// The end key should always be the prefix end of the start key
span.End = span.Start.PrefixEnd()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should create a test action testUtils.MineDocWithID :D

Comment on lines 261 to 264
"_docID": "bae-1ef746f8-821e-586f-99b2-4cb1fb9b782f",
},
{
"_docID": "bae-22dacd35-4560-583a-9a80-8edbf28aa85c",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: use testUtils.NewDocIndex (or whatever the type was)

@fredcarle
Copy link
Collaborator Author

I think we should find another fix, probably one that handles whatever is causing the span.End to be invalid.

@AndrewSisley I don't think. there is anything other than what was there before the change that is making the span.End invalid. I initially thought that the problem was somewhere else but the span coming in is never invalid per say. It's not precise but not invalid. What made it invalid was that the function param span.End was already prefix-ended, as it should be, and prefix-ended once more after the instance type being assigned. This means that different path parameters where being prefix-ended and thus causing an over span.

@AndrewSisley
Copy link
Contributor

as it should be, and prefix-ended once more after the instance type being assigned.

We can fix this then, and prefix-end the correct key property, it shouldn't be terribly difficult, and then we wont be introducing a new bug/ignoring the span.end

@fredcarle
Copy link
Collaborator Author

We can fix this then, and prefix-end the correct key property, it shouldn't be terribly difficult, and then we wont be introducing a new bug/ignoring the span.end

It is fixed with this change. As stated above, there is no new bug I believe.

@AndrewSisley
Copy link
Contributor

AndrewSisley commented Nov 20, 2024

@fredcarle I had a bit of a think, and chased down the references to Spans.

  1. I think tech. debt wise the change in this PR is not great. Fetchers take a Spans object with a start and an end, and this PR essentially causes the end value to be ignored. This PR does not document this in any space that would make that clear to stuff (like plan nodes) dealing with Spans that the normal case (this fetcher) ignores half of the properties it is assigning. If the solution is left as-is I think it will cost us plenty in the future.

However:

  1. Spans has always been pretty ugly, both in fetcher and in planner. I consider it to notably confuse the flow of things in the otherwise moderately confused planNode interface in particular.
  2. Most of Spans is already ignored anyway in the other, more niche fetchers.
  3. core.MergeAscending is a well meaning but highly ineffective optimisation that almost certainly costs more computing resources than it will ever save. And it far from cheap maintenance-wise.
  4. You as team lead picked this bug up from another developer, who picked this up from a partner. It seems like people are not terribly keen to work on this right now, however the bug is not great and needs a quick fix.
  5. Open PRs aside, I am between tasks right now.

Technically I think the best thing to do is to remove the Spans concept entirely, and I think it will be pretty quick to do (a day or so tops). This includes the Spans type, and the planNode.Spans functions (as a bonus). We only ever need to care about an array of prefixes, not an array of ranges. Even if we want to care about ranges in the future we can do so in a much nicer, localised way that doesn't involve spreading the mess across every implementation of the fetcher and planNode interfaces. This should also allow us to delete the PrefixEnd functions, and the newish keys.Walkable interface.

I suggest you merge the fix as-is, and then I carry out the proposed refactor.

If you disagree with the proposed refactor, I remain against the proposed solution for the reasons in (1). Let me know 😁 Sorry if this feels a little hostage-like 😅

@fredcarle
Copy link
Collaborator Author

  1. I think tech. debt wise the change in this PR is not great

I don't disagree.

Technically I think the best thing to do is to remove the Spans concept entirely.

This is what I thought of doing at some point but I wanted a quick fix first because I wasn't sure how deep of a rabbit hole I would be getting into. I should have documented that in the PR description.

I'm keen on trying to do the "spans to array of prefix" change in this PR. Unless you prefer taking that on.

@fredcarle fredcarle force-pushed the fredcarle/fix/3242-over-span branch from 0dd5ccf to 471ab9c Compare November 20, 2024 17:12
Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fantastic, thank you very much for this Fred! And thanks for renaming the various span related variables/docs along the way :)

@fredcarle fredcarle merged commit 3fa579e into sourcenetwork:develop Nov 20, 2024
42 of 43 checks passed
@fredcarle fredcarle deleted the fredcarle/fix/3242-over-span branch November 20, 2024 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/db-system Related to the core system related components of the DB area/planner Related to the planner system bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

After doc deletion the doc can still be fetched from another collection
4 participants