Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EXCLUDE to partiql-eval #1320

Merged
merged 7 commits into from
Jan 16, 2024
Merged

Conversation

alancai98
Copy link
Member

@alancai98 alancai98 commented Dec 20, 2023

Relevant Issues

Description

  • Adds EXCLUDE to the partiql-eval evaluator
  • Different modeling of EXCLUDE in the plan representation. This change in modeling was done to better align with the redundant path elimination and evaluation approaches
  • Adds some path subsumption tests that were previously commented out in 4ac1b3d
    • More tests were added beyond the previous set of tests

Other Information

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alancai98 alancai98 self-assigned this Dec 20, 2023
@alancai98 alancai98 requested a review from RCHowell December 20, 2023 01:39
@@ -0,0 +1,251 @@
package org.partiql.eval.internal.exclude
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are the same as the commented out tests from partiql-eval/src/test/kotlin/org/partiql/lang/eval/EvaluatingCompilerExcludeTests.kt

@@ -264,6 +264,35 @@ class PartiQLEngineDefaultTest {
assertEquals(expected, output)
}

@OptIn(PartiQLValueExperimental::class)
@Test
fun testExclude() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test succeeds when @Disabled is removed from L29.

Copy link

github-actions bot commented Dec 20, 2023

Conformance comparison report

Base (d64784c) a496bac +/-
% Passing 92.54% 92.54% 0.00%
✅ Passing 5384 5384 0
❌ Failing 434 434 0
🔶 Ignored 0 0 0
Total Tests 5818 5818 0

Number passing in both: 5384

Number failing in both: 434

Number passing in Base (d64784c) but now fail: 0

Number failing in Base (d64784c) but now pass: 0

@alancai98 alancai98 force-pushed the add-exclude-partiql-eval branch from 672a09d to 910dec0 Compare December 20, 2023 01:46
@codecov-commenter
Copy link

codecov-commenter commented Dec 20, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (partiql-eval@d64784c). Click here to learn what that means.

Additional details and impacted files
@@               Coverage Diff               @@
##             partiql-eval    #1320   +/-   ##
===============================================
  Coverage                ?   49.26%           
  Complexity              ?     1046           
===============================================
  Files                   ?      166           
  Lines                   ?    13396           
  Branches                ?     2504           
===============================================
  Hits                    ?     6600           
  Misses                  ?     6139           
  Partials                ?      657           
Flag Coverage Δ
CLI 11.86% <0.00%> (?)
EXAMPLES 80.28% <0.00%> (?)
LANG 54.71% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alancai98 alancai98 force-pushed the add-exclude-partiql-eval branch from 910dec0 to 5bc681e Compare December 20, 2023 19:34
}
}

@OptIn(PartiQLValueExperimental::class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than having a bunch of top-level functions, could you please place these in the appropriate spots? Compile related functions in the Compiler, and evaluation related items wihtin the RelExclude operator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed locations of the top-level functions in 9e9a040

@alancai98 alancai98 requested a review from RCHowell December 21, 2023 00:59
Comment on lines 136 to 142
val collWithRemoved = when (coll) {
is BagValue -> coll
is ListValue, is SexpValue -> coll.filterIndexed { index, _ ->
!indexesToRemove.contains(index)
}
}
val finalColl = collWithRemoved.mapIndexed { index, element ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the difference between collWithRemoved and finalColl? I think you should be applying the exclusions with a mapIndexedNotNull and return null if the index is skipped otherwise apply the exlcusion. This way you don't traverse the collection more than once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collWithRemoved would remove the indexes at that level (i.e. the leaf nodes with ExcludeStep.CollIndex). finalColl would remove the values at deeper nested levels (i.e. the branch nodes).

I think you should be applying the exclusions with a mapIndexedNotNull and return null if the index is skipped otherwise apply the exlcusion. This way you don't traverse the collection more than once.

Good point. I will change to use mapIndexedNotNull so only one iteration of coll is necessary.

Comment on lines 153 to 159
// apply collection wildcard exclusions for lists, bags, and sexps
val collectionWildcardKey = ExcludeStep.CollWildcard
branches.find {
it.step == collectionWildcardKey
}?.let {
expr = excludeOnPartiQLValue(expr, it)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of what this is for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll edit the comment to clarify. This basically applies all the collection wildcard exclude steps that have deeper exclude steps (e.g. t.a.b.c[*].field_x in

SELECT *
EXCLUDE
    t.a.b.c[*].field_x
FROM [{
    'a': {
        'b': {
            'c': [
                {                    -- c[0]; field_x to be removed
                    'field_x': 0, 
                    'field_y': 0
                },
                {                    -- c[1]; field_x to be removed
                    'field_x': 1,
                    'field_y': 1
                },
                {                    -- c[2]; field_x to be removed
                    'field_x': 2,
                    'field_y': 2
                }
            ]
        }
    }
}] AS t

When we get to the c list, the collection wildcard step will apply the remainder of the step (i.e. .field_x) on each of the elements of c.

Copy link
Member Author

@alancai98 alancai98 Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ported above example to another test:

SuccessTestCase(
input = """
SELECT *
EXCLUDE
t.a.b.c[*].field_x
FROM [{
'a': {
'b': {
'c': [
{ -- c[0]; field_x to be removed
'field_x': 0,
'field_y': 0
},
{ -- c[1]; field_x to be removed
'field_x': 1,
'field_y': 1
},
{ -- c[2]; field_x to be removed
'field_x': 2,
'field_y': 2
}
]
}
}
}] AS t
""".trimIndent(),
expected = bagValue(
structValue(
"a" to structValue(
"b" to structValue(
"c" to bagValue( // TODO: should be ListValue; currently, Rex.ExprCollection doesn't return lists
structValue(
"field_y" to int32Value(0)
),
structValue(
"field_y" to int32Value(1)
),
structValue(
"field_y" to int32Value(2)
)
)
)
)
)
)
)
)
}

Though the inner collection should be a list rather than bag. I believe this is a bug (or not yet implemented) in Rex.ExprCollection which currently always returning a bag.

@alancai98 alancai98 requested a review from RCHowell December 21, 2023 02:46
@alancai98 alancai98 marked this pull request as draft January 11, 2024 02:30
Comment on lines 34 to 40
while (true) {
val row = input.next() ?: return null
val newRecord = exclusions.fold(row) { curRecord, expr ->
excludeOnRecord(curRecord, expr)
}
return newRecord
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the loop because you aren't processing multiple input rows.

Suggested change
while (true) {
val row = input.next() ?: return null
val newRecord = exclusions.fold(row) { curRecord, expr ->
excludeOnRecord(curRecord, expr)
}
return newRecord
}
val record = input.next() ?: return null
val result = exclusions.fold(record) { curr, path -> curr.exlcude(path) }
return result

Something I've noticed which is a total nitpicky thing is that in Kotlin (unlike rust) functions don't need unique names so you can overload the name because the parameters are different.

Instead of

excludeOnRecord(record: Record. ...)
excludeOnStructValue(structValue: StructValue<*>, ...)
excludeOnCollValue(collValue: CollectionValue<*>, ...)
excludeOnPartiQLValue(initialPartiQLValue: PartiQLValue, ...)

You can do

exclude(record: Record, ...)
exclude(value: PartiQLValue, ...) // usually declare the base before the others, but it doesn't matter
exclude(struct: StructValue<*>, ...)
exclude(collection: CollectionValue<*>, ...)

This is only a style thing to cut out verbosity, but I can see how the naming is carried over from Rust.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the loop because you aren't processing multiple input rows.

Ah, true. I will change it do process just one row.


For naming of the exclude, I'll change all of the excludeOn... to exclude. This will be more consistent with how the TypeUtils models the similar functions for exclude on static type.

Comment on lines +383 to +389
val subsumedPaths = newPaths
.groupBy(keySelector = { it.root }, valueTransform = { it.steps }) // combine exclude paths with the same resolved root before subsumption
.map { (root, allSteps) ->
val nonRedundant = ExcludeRepr.toExcludeRepr(allSteps.flatten()).removeRedundantSteps()
relOpExcludePath(root, nonRedundant.toPlanRepr())
}
val op = relOpExclude(input, subsumedPaths)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this in case the AST representations aren't equivalent, but perhaps they resolve to the same?

SELECT * EXCLUDE t.a.*, "t".a FROM { 'a': { 'x': 1 } ... } AS t

I think the only time the root (which is a var) would differ is based on the case-sensitivity of the identifier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the ast representations differed since the root (identifier) of t.a.* was case-insensitive while "t".a was case-sensitive. They would resolve to the same in this example and should be under the same exclude path root in the resolved plan.

@alancai98 alancai98 marked this pull request as ready for review January 12, 2024 00:24
@alancai98 alancai98 requested a review from RCHowell January 12, 2024 00:30
@alancai98 alancai98 requested a review from RCHowell January 13, 2024 00:13
Copy link
Member

@RCHowell RCHowell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@alancai98 alancai98 merged commit bb0342d into partiql-eval Jan 16, 2024
10 checks passed
@alancai98 alancai98 deleted the add-exclude-partiql-eval branch January 16, 2024 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants