Missing SQL indexes #184

Open
6 of 16 tasks
jpmckinney opened this issue Oct 23, 2023 · 2 comments
Labels
performance topic: models Relating to the ORM or database integration

Comments

@jpmckinney
Member

jpmckinney commented Oct 23, 2023

We could start with the slow queries listed at https://open-contracting-partnership.sentry.io/performance/?project=4505799907672064&statsPeriod=14d

For single-column indexes, we would need to add index=True to the relevant Field() calls.

I'm not sure whether SQLModel (https://sqlmodel.tiangolo.com) supports composite indexes. We might need to use SQLAlchemy or Alembic directly.

JOIN

Found using:

(?<!os\.path)\.join\(
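To show what this pattern does and does not catch, here is a small check with Python's re module. The sample lines are invented for illustration and are not taken from the codebase.

```python
import re

# The JOIN pattern from above: any ".join(" call not preceded by "os.path".
JOIN_CALL = re.compile(r"(?<!os\.path)\.join\(")

# Hypothetical source lines mapped to whether a match is expected.
samples = {
    "session.query(Application).join(Award)": True,
    "os.path.join(base_dir, name)": False,
    # str.join() is still matched, so hits need a manual review.
    '", ".join(names)': True,
}

results = {line: bool(JOIN_CALL.search(line)) for line in samples}
```

The negative lookbehind only excludes os.path.join(), so other non-SQL .join() calls such as str.join() show up as false positives.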

WHERE

Found using these regexes:

\.filter\((?:\n +)?\b(?:\w+\.)+(?!id\b\s)\w+\b\s
\.filter\((?:\n +)?\b(?:and_\(|or_\(|cast\(|text\()
\.filter\((?!(?:\n +)?\S+\.\w+)
\.where\(
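A rough sketch of how the four patterns divide up the matches, using Python's re module. The labels and the sample query lines are my own inventions for illustration, not names from the codebase.

```python
import re

# The four WHERE patterns from above, with made-up labels.
WHERE_PATTERNS = {
    # .filter() whose first argument is a column other than "id"
    "non_id_column": re.compile(r"\.filter\((?:\n +)?\b(?:\w+\.)+(?!id\b\s)\w+\b\s"),
    # .filter() whose first argument is and_(), or_(), cast() or text()
    "sql_function": re.compile(r"\.filter\((?:\n +)?\b(?:and_\(|or_\(|cast\(|text\()"),
    # .filter() whose first argument contains no Model.column reference
    "no_column_first": re.compile(r"\.filter\((?!(?:\n +)?\S+\.\w+)"),
    # any .where() call
    "where_call": re.compile(r"\.where\("),
}


def matching_patterns(source):
    """Return the labels of the patterns that match the given source snippet."""
    return [label for label, pattern in WHERE_PATTERNS.items() if pattern.search(source)]


by_column = matching_patterns("session.query(Application).filter(Application.status == status)")
by_id = matching_patterns("session.query(Application).filter(Application.id == 1)")
by_function = matching_patterns("session.query(Application).filter(or_(Application.status == a))")
by_splat = matching_patterns("session.query(Application).filter(*conditions)")
by_where = matching_patterns("select(Application).where(Application.status == status)")
```

Filters on the primary key alone (by_id) are deliberately skipped by all of the .filter() patterns, since the primary key is already indexed.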

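To avoid creating a ton of indexes, I grouped some WHERE matches under a single composite index. Under each index below, I list the column prefixes that are filtered on and the functions that do the filtering.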
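As a rough illustration of why one composite index can serve several of these filters, the sketch below uses SQLite from the Python standard library as a stand-in. The project's database and its query planner may behave differently, so this only demonstrates the general leftmost-prefix idea. The table is cut down, the index name and the "STARTED" status value are made up, and award_id is included only so that SELECT * cannot be answered from the index alone.

```python
import sqlite3

INDEX_NAME = "ix_application_status_archived_at_lender_id"  # hypothetical name

conn = sqlite3.connect(":memory:")
conn.executescript(
    f"""
    CREATE TABLE application (
        id INTEGER PRIMARY KEY,
        award_id INTEGER,
        status TEXT NOT NULL,
        archived_at TEXT,
        lender_id INTEGER
    );
    CREATE INDEX {INDEX_NAME}
        ON application (status, archived_at, lender_id);
    """
)


def query_plan(where_clause, params):
    """Return SQLite's query plan for a SELECT with the given WHERE clause."""
    sql = "EXPLAIN QUERY PLAN SELECT * FROM application WHERE " + where_clause
    # The fourth column of each plan row is the human-readable detail.
    return " | ".join(row[3] for row in conn.execute(sql, params))


# Filters on a leading prefix of the index columns can search the index.
status_only = query_plan("status = ?", ("STARTED",))
status_and_archived = query_plan("status = ? AND archived_at IS NULL", ("STARTED",))
all_three = query_plan(
    "status = ? AND archived_at IS NULL AND lender_id = ?", ("STARTED", 1)
)

# A filter that skips the leading column cannot search this index.
lender_only = query_plan("lender_id = ?", (1,))
```

In this sketch, the first three plans search the composite index, while the lender_id-only filter falls back to scanning the table. That is the reasoning behind listing the status and (status, archived_at) filters under the (status, archived_at, lender_id) index rather than giving each its own.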

Application

  • (status, archived_at, lender_id)
    • status: get_all_applications_with_status()
    • (status, archived_at): get_all_active_applications()
    • (status, archived_at, lender_id): get_all_fi_applications_emails(), get_all_FI_user_applications()
  • (award_borrower_identifier, status)
    • award_borrower_identifier: get_existing_application()
    • (award_borrower_identifier, status): get_previous_lenders(), get_previous_documents()
  • (id, award_id, archived_at): remove_dated_data()

ApplicationAction

  • (application_id, type): get_modified_data_fields(), check_if_application_was_already_copied(), get_application_days_passed()

Award

Borrower

  • borrower_identifier: get_borrower() (already has unique=True)

BorrowerDocument

  • (application_id, type)
    • application_id: download_application()
    • (application_id, type): create_or_update_borrower_document(), get_previous_documents()

CreditProduct

  • (lender_id, borrower_size, lower_limit, upper_limit, type)
    • (lender_id, borrower_size, lower_limit, upper_limit): reject_application()
    • (lender_id, borrower_size, lower_limit, upper_limit, type): credit_product_options()

Message

  • type: get_applications_to_remind_submit(), get_applications_to_remind_intro()

User

get_general_statistics() and get_msme_opt_in_stats() have a lot of filters, but I'm not sure how important it is to optimize these functions. I didn't add some of the matches from statistics.py or update_statistic.py.

I think the queries in application_utils.py are too complex for an index.

@jpmckinney
Member Author

Note that automatically generated migrations incorrectly detect nullable changes. This seems to be fixed in more recent 1.4.x versions of SQLAlchemy, but we need SQLModel to release 0.0.9 to unpin SQLAlchemy: fastapi/sqlmodel#434 (comment)

@jpmckinney
Member Author

Application
(status, archived_at, lender_id)

Sentry reports this as a slow query on /applications.

Status: 📋 Backlog