Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it easier to run studies from git checkouts #235

Merged
merged 1 commit into from
May 13, 2024
Merged

Conversation

mikix
Copy link
Contributor

@mikix mikix commented May 10, 2024

  • When loading a builder, we add the study dir's parent folder into sys.path, so that builders can import helper code. This is basically equivalent to doing PYTHONPATH=. from a checkout.
  • When searching for the study dir, and we find a manifest.toml, actually read the manifest to find the study_prefix instead of using the directory name for the study name. This allows more natural python package layouts (rather than module_name/study_name, you can just do module_name)

This also bumps ruff's version to 0.4.4 and has the commit hook just auto-commit any formatting changes.

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added
  • Update template repo if there are changes to study configuration

@mikix mikix force-pushed the mikix/study-search branch from 34cf6ca to b8258f8 Compare May 10, 2024 20:10
Comment on lines -29 to +30
class BaseConfig(abc.ABC):
"""Abstract ase class for handling table detection/denormalization"""
class BaseConfig:
"""Base class for handling table detection/denormalization"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ruff complained about this - there are no abstract methods or properties now that this is a simple dataclass.

pyproject.toml Outdated
"ruff == 0.2.1",
"ruff",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to unpin this here, maybe I shouldn't have.

My thinking was:

  • The commit hook has its own checkout of ruff, so the developer's version could diverge, though they may see different errors if they manually run ruff.
  • But I liked that oddity more than I liked developers having to manage different packages wanting different ruff versions.
  • Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, though then what version should the CI install...? Should that be hardcoded in the CI to match the hook? Or should we hardcode the version here in the pyproject.toml... hmmm.... Maybe I should leave this alone. But I don't love it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started doing explicit pins on these in response to a SQLFluff linting error that caused malformed code in some cases. I think the astral folks are less likely to make the same mistake, but i sort of like 'a human has decided to increment the version on the autoformatting tools'. could be talked out of this.

Comment on lines -65 to -66
[tool.ruff]
target-version = "py310"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the docs for this, ruff will look at requires-python and actually recommends just setting that instead of this.

@mikix mikix force-pushed the mikix/study-search branch from b8258f8 to b05c1dc Compare May 10, 2024 20:16
- When loading a builder, we add the study dir's parent folder
  into sys.path, so that builders can import helper code.
  This is basically equivalent to doing PYTHONPATH=. from a checkout.
- When searching for the study dir, and we find a manifest.toml,
  actually read the manifest to find the study_prefix instead of using
  the directory name for the study name. This allows more natural
  python package layouts (rather than `module_name/study_name`, you can
  just do `module_name`)

This also bumps ruff's version to 0.4.4 and has the commit hook just
auto-commit any formatting changes.
@mikix mikix force-pushed the mikix/study-search branch from b05c1dc to 45d712f Compare May 10, 2024 20:21
Comment on lines -328 to +332
manifest_paths[path.name] = path
try:
manifest = study_parser.StudyManifestParser(path)
manifest_paths[manifest.get_study_prefix()] = path
except errors.StudyManifestParsingError as exc:
rich.print(f"[bold red] Ignoring study in '{path}': {exc}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not offer any kind of compatibility / migration for the old way of doing things. Like I could have also added the directory name into manifest_paths.

But... I figured any existing studies (outside of some of our test data studies) would have realistically needed to have the study_prefix match the directory already. If you didn't have that setup, that meant that your --target name didn't match your study_prefix, and that seems like something we don't really want to encourage / support?

Comment on lines +1 to -2
from study_python_local_template import local_template

from cumulus_library.base_table_builder import BaseTableBuilder
from tests.test_data.study_python_local_template import local_template
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done to test the new sys.path injection

@mikix
Copy link
Contributor Author

mikix commented May 10, 2024

I didn't see any changes needed in the template repo, mostly because (1) it doesn't include any builders that try to import anything, (2) it doesn't use the two-level directory setup that some released studies like covid have had to use, and (3) its instructions talk about renaming the study folder already to match a python module name and to also rename study_prefix.

So I think it should be fine both before and after this change, but your naming can be more flexible after this PR

@dogversioning
Copy link
Contributor

I didn't see any changes needed in the template repo, mostly because (1) it doesn't include any builders that try to import anything, (2) it doesn't use the two-level directory setup that some released studies like covid have had to use, and (3) its instructions talk about renaming the study folder already to match a python module name and to also rename study_prefix.

So I think it should be fine both before and after this change, but your naming can be more flexible after this PR

We should maybe chat about whether maintaining the template repo even makes sense - The example config file in the docs might be the 'right' way to telegraph how to make studies at this point.

@mikix mikix merged commit 1932655 into main May 13, 2024
3 checks passed
@mikix mikix deleted the mikix/study-search branch May 13, 2024 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants