optimizing batch sizing based on cost #1416
Closed
Optimize EvaDB
Changes Overview:
evadb/executor/storage_executor.py: Adjusted the execution of reading data for TableType.NATIVE_DATA by incorporating the specified batch memory size. This ensures efficient resource utilization during data retrieval.
evadb/optimizer/cost_model.py: Implemented a zero_cost property within the CostModel class, providing a baseline cost entry for comparison. Revised the calculate_cost method to accept a list of children costs, enabling a more accurate representation of overall costs. Enhanced the cost calculation for various plan types, taking into account child costs and adjusting accordingly.
evadb/optimizer/group_expression.py: Improved the representation of GroupExpression instances for better clarity and debugging.
evadb/optimizer/optimizer_tasks.py: Modified the execution logic within the OptimizeInputs task to consider child costs when calculating the overall cost. This adjustment ensures a more accurate representation of the optimization process.
evadb/optimizer/rules/rules.py: Refined the application of rules related to LogicalGet instances, specifically addressing batch memory size configuration within the SeqScanPlan. Experimentation with heuristics for optimal batch memory size is also noted for future consideration.
evadb/plan_nodes/vector_index_scan_plan.py: Improved the __str__ method for better readability and understanding of VectorIndexScan instances.
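
For readers unfamiliar with the cost_model.py change, here is a minimal sketch of a cost model that has a zero_cost baseline and folds children costs into a node's cost. It is illustrative only and is not the code from this PR: the Cost dataclass, the local_cost parameter, and the purely additive combination are assumptions; only the names CostModel, zero_cost, and calculate_cost come from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cost:
    """Hypothetical cost record; the real EvaDB cost type may differ."""

    value: float

    def __add__(self, other: "Cost") -> "Cost":
        return Cost(self.value + other.value)


class CostModel:
    @property
    def zero_cost(self) -> Cost:
        # Baseline entry: adding it to any cost leaves that cost unchanged.
        return Cost(0.0)

    def calculate_cost(self, local_cost: float, children_costs: List[Cost]) -> Cost:
        # Assumption: costs combine additively. Start from the baseline,
        # fold in every child's cost, then add the node's own work.
        total = self.zero_cost
        for child in children_costs:
            total = total + child
        return total + Cost(local_cost)


model = CostModel()
scan = model.calculate_cost(10.0, [])            # leaf node: no children
filt = model.calculate_cost(2.0, [scan])         # filter on top of the scan
join = model.calculate_cost(5.0, [filt, scan])   # join of two inputs
```

With child costs included, a cheap operator sitting on an expensive input no longer looks cheap to the optimizer, which is the motivation the OptimizeInputs change describes.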