Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StudyConfig infrastructure #233

Merged
merged 6 commits into from
May 9, 2024
Merged

StudyConfig infrastructure #233

merged 6 commits into from
May 9, 2024

Conversation

dogversioning
Copy link
Contributor

This PR adds a StudyConfig object, which transports values from the CLI down to individual studies, for customizing behavior of study runs.

It also contains several tweaks in support of the UMLS study, and moves the upload logic for databases into each database class.

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added
  • Update template repo if there are changes to study configuration

def download_umls_files(
self,
target: str = "umls-metathesaurus-mrconso-file",
target: str = "umls-metathesaurus-full-subset",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the new target of the UMLS study.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you do with Mr. Conso?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mr. Conso brought all his friends along! He's kind of an inconsiderate guest, I didn't get enough snacks.

cumulus_library/base_utils.py Show resolved Hide resolved
cumulus_library/cli.py Show resolved Hide resolved
cumulus_library/databases.py Show resolved Hide resolved
cumulus_library/db_config.py Show resolved Hide resolved
@dogversioning dogversioning marked this pull request as ready for review May 7, 2024 18:24
def download_umls_files(
self,
target: str = "umls-metathesaurus-mrconso-file",
target: str = "umls-metathesaurus-full-subset",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you do with Mr. Conso?!

self.cursor, self.schema_name, verbose=self.verbose
self.cursor, self.schema_name, verbose=self.verbose, config=config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an early comment before I've read the rest of the PR - but looks like you went light-touch with this first cut. Like, you could have removed self.cursor and self.schema_name args here and had run_protected_table_builder grab them from config.db_backend, yeah? And that sort of refactor might be in play in the future?

Not a criticism, just trying to get a handle on the current PR does and what the future vision might be, as I see a lot of similar args that could be removed around the place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a future refactor to remove those is the plan, because that is a breaking change to the interface that studies use. Right now, table builders in other studies can just take the new config object and ignore it.

cumulus_library/base_utils.py Outdated Show resolved Hide resolved
cumulus_library/cli_parser.py Outdated Show resolved Hide resolved
help="An API Key for the UMLS API",
)
build.add_argument(
"--replace-existing",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This naming is also lightly tricky - as a user not steeped in the file uploading feature, I might reasonably think this arg does what clean does - i.e. it refers to whether or not to delete existing tables. I know that's not what you had in mind when naming this arg, but if you could somehow scope this arg down to just file uploads? Maybe something more elegant than --replace-existing-uploads?

Or perversely, maybe going less verbose but in a way that doesn't create confusion with clean - like --force/-f, which is often used for this sort of thing when dealing with file-oriented commands like cp or rm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Though --force may get overloaded once we have two kinds of things we want to force through, so it's not necessarily a good idea, just an idea.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is so weird because it's only affecting a slice of the total state. lemme sleep on it, i'll try to get something a bit more nuanced, but it may just be --replace-existing-uploads - :hopefully: this is developer only, though there are edge cases for users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--force-uploads splits the difference

cumulus_library/databases.py Show resolved Hide resolved
cumulus_library/base_utils.py Show resolved Hide resolved
cumulus_library/base_utils.py Outdated Show resolved Hide resolved
cumulus_library/study_parser.py Outdated Show resolved Hide resolved
tests/test_psm.py Outdated Show resolved Hide resolved
@mikix
Copy link
Contributor

mikix commented May 7, 2024

Nice refactoring - excited to use it 😄

@dogversioning dogversioning merged commit 528f3c0 into main May 9, 2024
3 checks passed
@dogversioning dogversioning deleted the mg/umls-support branch May 9, 2024 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants