Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Reader Performance Queries for /datafiles and /datasets #488

Merged
merged 18 commits into from
Oct 8, 2024

Conversation

MRichards99
Copy link
Collaborator

This PR will close #{issue number}

Description

These changes allow efficiency gains to be made when either the datafiles or datasets are being queried, with a WHERE filter for an ID of the parent entity (e.g. WHERE dataset.id = 4 on /datafiles). The changes have been implemented for the count endpoints of datafiles and datasets.

If a relevant request comes in, the API queries the parent entity directly, to check whether the user can see it (i.e. has the permissions to). If the user can, their original query is sent to ICAT (JPQL query constructed by Python ICAT as normal) but it is executed as the 'reader' user (as configured in config.yaml). This should bypass the additional complexity added to queries by ICAT Server, which is causing performance issues. As I don't have rules setup in my ICAT, I wasn't able to test this, but hopefully testing next week done by others should test that.

This functionality is completely optional - if the config isn't in config.yaml or it is disabled, the API will behave as normal (i.e. pass user queries onto ICAT as the user of the API request).

Testing Instructions

Add a set up instructions describing how the reviewer should test the code

  • Review code
  • Check GitHub Actions build
  • If icatdb Generator Script Consistency Test CI job fails, is this because of a deliberate change made to the script to change generated data (which isn't actually a problem) or is here an underlying issue with the changes made?
  • Review changes to test coverage
  • Does this change mean a new patch, minor or major version should be made? If so, does one of the commit messages feature fix:, feat: or BREAKING CHANGE: so a release is automatically made via GitHub Actions upon merge?
  • {more steps here}

Agile Board Tracking

Connect to #{issue number}

Copy link

codecov bot commented Sep 25, 2024

Codecov Report

Attention: Patch coverage is 7.36842% with 88 lines in your changes missing coverage. Please review.

Project coverage is 43.45%. Comparing base (a9f336c) to head (b3807ae).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...i/src/datagateway_api/icat/reader_query_handler.py 0.00% 57 Missing ⚠️
...atagateway_api/src/datagateway_api/icat/helpers.py 0.00% 28 Missing ⚠️
...atagateway_api/src/datagateway_api/icat/backend.py 0.00% 2 Missing ⚠️
datagateway_api/src/common/filters.py 50.00% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (a9f336c) and HEAD (b3807ae). Click for more details.

HEAD has 3 uploads less than BASE
Flag BASE (a9f336c) HEAD (b3807ae)
4 1
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #488       +/-   ##
===========================================
- Coverage   96.66%   43.45%   -53.22%     
===========================================
  Files          39       40        +1     
  Lines        3242     3325       +83     
  Branches      317      326        +9     
===========================================
- Hits         3134     1445     -1689     
- Misses         80     1872     +1792     
+ Partials       28        8       -20     
Flag Coverage Δ
43.45% <7.36%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@kevinphippsstfc kevinphippsstfc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having reviewed the code, made a number of changes to it and tested it on preprod, I'm happy that it is functioning as required. I think there are just a few things to complete the code to a "production" level: complete the TODOs, add docstrings to each function, and add some tests for the new functionality. That last one may not be trivial given that it probably requires starting up a new instance of the API using a different config.

@MRichards99 MRichards99 force-pushed the reader-performance-query branch from cad8b4f to e361e2b Compare October 2, 2024 16:01
@MRichards99
Copy link
Collaborator Author

This is ready for review. I've addressed the TODOs, added documentation in the form of docstrings and added a test class to test the new functionality. Quite a bit of time went into fixing the CI which was broken, presumably because a PR hasn't gone through this repo for a little while. I've also edited a couple of the commit messages so a new release will be made.

The generator script CI job fails but this is intentional - final diff in the CI will still fail because main doesn't contain two dropped columns that I've added to an SQL script, but the initial diff to check that two runs produce identical data passes. This is what I expected to happen :)

@MRichards99 MRichards99 marked this pull request as ready for review October 2, 2024 16:32
Copy link
Contributor

@kevinphippsstfc kevinphippsstfc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Thanks for tidying up the code and adding the tests.

And thanks for implementing this bit of functionality which appears to have been successful in reducing database load.

@MRichards99 MRichards99 merged commit 7fba647 into main Oct 8, 2024
17 of 18 checks passed
@MRichards99 MRichards99 deleted the reader-performance-query branch October 8, 2024 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants