Index Scan + Index Join Limit #1422
base: master
- [ ] Add information for subquery transformation above
- [ ] Add information about aggregations in `SELECT` here
- [ ] Add information about `SelectDistinct` in `SELECT` here
- If a `LIMIT` exists in the select statement, the value of the limit and associated offset, along with any `ORDER_BY` clauses, are stored in a `LogicalLimit` node that is added as the parent of the existing `output_expr_`
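As a rough illustration of the `LogicalLimit` wrapping described in the last bullet, here is a minimal sketch. `ToyExpr` and `WrapInLimit` are hypothetical names invented for this example and are not the optimizer's actual classes; the sketch only assumes that the limit node records the limit, the offset, and the `ORDER_BY` clauses, and becomes the new root above the existing output expression.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified operator-tree node (not the real optimizer class).
struct ToyExpr {
  std::string op;                                  // e.g. "LogicalGet", "LogicalLimit"
  std::vector<std::unique_ptr<ToyExpr>> children;  // child expressions
  // The fields below are only meaningful when op == "LogicalLimit".
  std::size_t limit = 0;
  std::size_t offset = 0;
  std::vector<std::string> order_by;  // e.g. {"a ASC"}
};

// Wrap the existing output expression in a limit node that carries the limit,
// the offset, and any ORDER BY clauses. The old root becomes the only child.
std::unique_ptr<ToyExpr> WrapInLimit(std::unique_ptr<ToyExpr> output_expr,
                                     std::size_t limit, std::size_t offset,
                                     std::vector<std::string> order_by) {
  auto node = std::make_unique<ToyExpr>();
  node->op = "LogicalLimit";
  node->limit = limit;
  node->offset = offset;
  node->order_by = std::move(order_by);
  node->children.push_back(std::move(output_expr));
  return node;
}
```

Calling `WrapInLimit` on a tree rooted at a `LogicalGet` returns a new root whose single child is the original tree, so later rules can see the limit, offset, and ordering in one place.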
I think it would be helpful here to briefly explain why `ORDER_BY` information is stored within a `LIMIT` operator.
- `GroupExpression`: a class to track a specific operator (logical or physical) in a group which tracks the transaction, group number, whether the stats have been derived, the lowest cost satisfying the required properties of an expression, and the details of the expression.
- `Group`: a class to collect group expressions representing logically equivalent expressions including logical and physical transformations.
- `LogicalExpression`: a group expression which represents a logical expression i.e. the logical operator that specifies the general behavior of a node
- `PhysicalExpression`: a group expression which represents a logical expression i.e. the physical operator that specifies the implementation details of the expression
Did you mean to say "logical expression" here?
Which line is this about?
"`PhysicalExpression`: a group expression which represents a logical expression..." I think you meant to put "physical expression" instead of "logical expression".
- `LogicalExpression`: a group expression which represents a logical expression i.e. the logical operator that specifies the general behavior of a node
- `PhysicalExpression`: a group expression which represents a logical expression i.e. the physical operator that specifies the implementation details of the expression
- `Rule`: definition of a transformation from one expression to another (may be logical transformation, physical implementation, or rewrite)
- `Property`: interface to define a physical property, currently of which only sort properties exist
I think it would be helpful if we had a couple of sentences explaining what a property is and why we have them.
Hey @thepinetree, I'm just going to consolidate everything we discussed on Slack here so that you have a single place to go to when you return to this PR. Let me know if I'm missing anything. I'm not sure which of these belong in this PR and which should be done in later PRs; I guess that's up to you/Will/whoever makes those decisions.
Index Scan + Index Join Limit
Description
Limit clauses are currently not propagated to the `IndexScanPlanNode` or the `IndexJoinPlanNode`, and as a result the execution engine can't take advantage of pushing down the limit during operation. Instead, the limit is applied after the fact, by a `LimitPlanNode` that runs once the index scan has completed.

This PR adds functionality for the limit value to be pushed down to index scans, and this pushdown is used in TPC-C. Limit values are pushed down to their child `LogicalGet` via a transformation rule and converted to values in the `PhysicalIndexScan`, which are then set in the `IndexScanPlanNode`. The PR also moves the `OrderByOrderingType` from the optimizer to the catalog as a precursor to further changes that involve the sort direction of columns when creating or scanning an index.

The final implementation effort is to introduce optional properties to push down ORDER BY sort properties into index scans whenever possible. There are a few alternative possible implementations:
We choose method 1 in this implementation, though switching to method 2 would not be difficult (see PR Index Scan Limit #1031).
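To make the benefit of pushing the limit into the scan concrete, here is a minimal sketch that compares scanning and then truncating against stopping the scan early. It uses a `std::map` as a stand-in for an ordered index; `ToyIndex`, `ScanStats`, `IndexScan`, and `ScanThenLimit` are hypothetical names for illustration only and do not correspond to the plan nodes or execution engine code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an ordered index: key -> tuple payload.
using ToyIndex = std::map<int, std::string>;

// Counts how many index entries a scan touched, to make the saving visible.
struct ScanStats {
  std::size_t entries_visited = 0;
};

// Scan keys in [lo, hi]. If `limit` is set, stop as soon as that many tuples
// have been produced (the pushed-down behaviour).
std::vector<std::string> IndexScan(const ToyIndex &index, int lo, int hi,
                                   std::optional<std::size_t> limit,
                                   ScanStats *stats) {
  std::vector<std::string> out;
  for (auto it = index.lower_bound(lo); it != index.end() && it->first <= hi; ++it) {
    if (limit.has_value() && out.size() >= *limit) break;  // early exit
    ++stats->entries_visited;
    out.push_back(it->second);
  }
  return out;
}

// Limit applied after the scan: run the full range scan, then truncate.
std::vector<std::string> ScanThenLimit(const ToyIndex &index, int lo, int hi,
                                       std::size_t limit, ScanStats *stats) {
  auto rows = IndexScan(index, lo, hi, std::nullopt, stats);
  if (rows.size() > limit) rows.resize(limit);
  return rows;
}
```

Both functions return the same rows for the same range and limit, but in this toy model the pushed-down scan visits only as many index entries as the limit allows, while the scan-then-limit version visits every entry in the range.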
Additionally, a guide to optimizer development is included for future developers.
Further work
A description of further work is included in Issue #1421