-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: optimize database queries #7191
Conversation
Deployment Summary
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #7191 +/- ##
=======================================
Coverage 92.73% 92.74%
=======================================
Files 191 191
Lines 16066 16083 +17
=======================================
+ Hits 14899 14916 +17
Misses 1167 1167
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
dataset_versions.append(dv) | ||
else: | ||
dataset_version_ids.append(dv) | ||
dataset_versions.extend(business_logic.database_provider.get_dataset_versions_by_id(dataset_version_ids)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add an empty list check, to avoid opening a session / querying for nothing in case all are DatasetVersion instances
update(DatasetTable).where(DatasetTable.id.in_(dataset_ids_to_tombstone)).values(tombstone=True) | ||
) | ||
session.execute(tombstone_dataset_statement) | ||
dataset_all_version_ids = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tests are passing so maybe i'm misunderstanding, but this query looks like we're getting List[DatasetVersion] rather than List[DatasetVersionId], no? And dataset_version_ids_to_delete_from_s3 expects Ids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're totally right, but that is how the original code was written. I change it to query for DatasetVersion.id and it works. Not sure how it managed to work before, because it was returning a list[row] not a list[string].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test in rdev and ensure no regressions but looks good to me!
I added some timing metric to the queries and the two slowed queries are in
I also did some performance and found the endpoint on average took over 60 seconds to return |
This reverts commit ad999ae.
Reason for Change
Changes
Testing steps
Note