-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(query) Fixes for split partition queries #1701
fix(query) Fixes for split partition queries #1701
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved (with a couple questions)
val timeRange = TimeRange(1000 * qParams.startSecs, 1000 * qParams.endSecs) | ||
val stepMsOpt = if (qParams.startSecs == qParams.endSecs) None else Some(1000 * qParams.stepSecs) | ||
val partitions = getPartitions(logicalPlan, qParams).distinct.sortBy(_.timeRange.startMs) | ||
require(partitions.nonEmpty, s"Partition assignments is not expected to be empty for query ${qParams.promQl}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming: this won't affect, say, scalar queries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point, this forced me to try few cases with (OR vector(0)
)which failed with this. I pushed a fix for that. As far as the above line is concerned we are good.
Caused by: java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:268)
at filodb.query.BinaryJoin.<init>(LogicalPlan.scala:462)
at filodb.query.BinaryJoin.copy(LogicalPlan.scala:460)
at filodb.coordinator.queryplanner.PlannerUtil$.rewritePlanWithRemoteRawExport(DefaultPlanner.scala:666)
if (lastTimeRange.endMs < timeRange.endMs) { | ||
// this means we need to add the missing time range to the end to execute the bit on Query service | ||
val (gapStartTimeMs, gapEndTimeMs) = stepMsOpt match { | ||
case Some(step) => (snapToStep(lastTimeRange.endMs + 1, step, timeRange.startMs), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't these timestamps need the same * 1000 / 1000 + 1
treatment as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are doing snapToStep now which will take care to align the start exactly at the time where it needs to start.
// have end date of 1:10. Then the period of 1 - 1:20 will not have any results, this is due to that fact the | ||
// period 1 - 1:10 can not be served from one partition alone and needs to be computed on query service. Here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is due to that fact the period 1 - 1:10 can not be served from one partition alone and needs to be computed on query service.
Should we callout the getQueryTimeRanges
method? Really that just needs to be updated to drop the old "empty result at partition splits" logic.
@@ -153,7 +153,7 @@ class StreamingResultsExecSpec extends AnyFunSpec with Matchers with ScalaFuture | |||
resp(2).asInstanceOf[StreamQueryResultFooter].queryStats.getTimeSeriesScannedCounter().get() shouldEqual 2 | |||
} | |||
|
|||
it("should execute Aggregation Exec Plans from Memstore in result streaming mode using actor plan dispatcher") { | |||
ignore("should execute Aggregation Exec Plans from Memstore in result streaming mode using actor plan dispatcher") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional?
Pull Request checklist
Current behavior :
The query processing assumes the partition split happens exactly at a given point in time, however, in reality there is a delay around that time and this period of uncertainty needs to be considered while computing the partition assignments
New behavior :
Fixes the issue by adding this period of uncertainty around the partition split