Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Aggregate filter alias targeting #3252

Merged

Conversation

nasdf
Copy link
Member

@nasdf nasdf commented Nov 18, 2024

Relevant issue(s)

Resolves #3195

Description

This PR enables aggregate alias targeting in filters.

Blocked by #3253

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

Added and updated integration tests.

Specify the platform(s) on which this was tested:

  • MacOS

@nasdf nasdf added the feature New feature or request label Nov 18, 2024
@nasdf nasdf added this to the DefraDB v0.15 milestone Nov 18, 2024
@nasdf nasdf self-assigned this Nov 18, 2024
Copy link

codecov bot commented Nov 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.06%. Comparing base (2776f1e) to head (4e7207c).
Report is 1 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3252      +/-   ##
===========================================
+ Coverage    78.06%   78.06%   +0.01%     
===========================================
  Files          382      382              
  Lines        35364    35371       +7     
===========================================
+ Hits         27604    27612       +8     
  Misses        6123     6123              
+ Partials      1637     1636       -1     
Flag Coverage Δ
all-tests 78.06% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/planner/average.go 84.85% <100.00%> (+0.23%) ⬆️
internal/planner/count.go 95.12% <100.00%> (+0.04%) ⬆️
internal/planner/max.go 91.86% <100.00%> (+0.05%) ⬆️
internal/planner/min.go 91.86% <100.00%> (+0.05%) ⬆️
internal/planner/select.go 84.57% <100.00%> (+0.14%) ⬆️
internal/planner/sum.go 85.99% <100.00%> (ø)
internal/planner/top.go 67.35% <100.00%> (ø)

... and 17 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2776f1e...4e7207c. Read the comment docs.

@nasdf nasdf requested a review from a team November 18, 2024 18:30
@@ -102,7 +106,7 @@ func (n *averageNode) Next() (bool, error) {
return false, client.NewErrUnhandledType("sum", sumProp)
}

return true, nil
return mapper.RunFilter(n.currentValue, n.aggregateFilter)
Copy link
Contributor

@AndrewSisley AndrewSisley Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why are you running the filter in the aggregate nodes, and not the select nodes they sit on top of (that also run the aggregate filters)?

Copy link
Contributor

@AndrewSisley AndrewSisley Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, is this more like a scan level filter? And it is filtering the value produced by the aggregate node? In which case would it not be better to run the filter in the select node that sits on top of the aggregates, allowing the aggregate nodes to remain focused on aggregating, and not adding an additional code-location in which filters may execute?

Or perhaps we could argue that the select node is doing too much, and we should in fact remove filtering from the Select node's remit, and instead introduce a filter node, removing filtering from select and aggregate nodes (and anywhere else in the planner that filters may have leaked into).

EDIT: Sorry for the notification spam, please consider extracting this and the select-filters to now to be a 'todo'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on board with moving it into a new Filter node but not within this same PR.

Copy link
Contributor

@AndrewSisley AndrewSisley Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the rush on this feature? IMO refactors like this should be done before the feature, not after (or, worst case, during and then broken out to merge first). We need to think about the long term implications of how we build features before we build them.

The tech. debt here is basically the entirety of the production code change - there appears to be very little cost in delaying it and doing it in a way that doesn't build up tech. debt.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no rush. I'm trying to keep my PRs more focused, so I'll be doing the refactor in a different PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, please merge that PR before you merge this one.

Happy to chat about this with the team over discord, we dont have to wait until the next standup.

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with the solution and see it as introducing/propagating architectural tech. debt that should be dealt with instead of worsened.

Please either rework this, or fight very hard back against my todo :)

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm quite ok with this change. If future improvements to the filtering logic end up reverting the changes done in this PR, it will be trivial to do so.

@nasdf nasdf dismissed AndrewSisley’s stale review December 5, 2024 21:48

will revisit filter alias refactor in the future

@nasdf nasdf merged commit e4599e8 into sourcenetwork:develop Dec 5, 2024
41 of 43 checks passed
@islamaliev
Copy link
Contributor

islamaliev commented Dec 11, 2024

bug bash result:
played with lots of different scenarios. Found some quirk behavior. The rest works fine.

@islamaliev
Copy link
Contributor

this one is probably doesn't not pertain to the task, but found during the bug bash

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregate filter alias targeting
4 participants