Better support for loading a consistent python module #4059

Closed
billsacks opened this issue Aug 2, 2021 · 2 comments


@billsacks
Member

It is currently easy to inadvertently use different python versions for different parts of the operation of a given CIME command: When you run a CIME command, it will start by using whatever python3 version it finds in your path. However, at some point, it may call load_env (e.g., in the course of doing a build). At that point, your module environment is reset, and – depending on what's defined in this machine's section of config_machines.xml – the same or a different python module may be loaded, or there will be no python module loaded and it will use the machine defaults. From that point forward, any subprocess call that invokes python – notably, calls to components' buildlib commands, but possibly others – will use this possibly different version of python. As long as the two python versions are close enough, the user probably won't notice and there will be no ill effects. But it seems like this operation opens the door to some really subtle issues.
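To make the mismatch described above concrete, here is a minimal sketch of how one could compare the interpreter running the current script against the interpreter a subprocess would pick up. The function names `python_version_of` and `versions_match` are hypothetical and not part of CIME; the sketch only assumes the candidate executable accepts `-c`.

```python
import subprocess
import sys


def python_version_of(executable):
    """Ask the given python executable for its (major, minor, micro) version."""
    result = subprocess.run(
        [executable, "-c", "import sys; print('%d.%d.%d' % sys.version_info[:3])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return tuple(int(part) for part in result.stdout.strip().split("."))


def versions_match(executable, compare=2):
    """True if `executable` agrees with the running interpreter on the
    first `compare` version components (major.minor by default)."""
    return python_version_of(executable)[:compare] == tuple(sys.version_info[:compare])
```

One possible use would be to call `versions_match(shutil.which("python3"))` right after the module environment has been reset and warn if it returns False.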

The idea @mnlevy1981 raised at a CSEG meeting in 2017 (but I think was never recorded in a CIME issue) is: Have the cime scripts initially determine the desired python module, load that, then do everything else in a python subprocess. This strategy would ensure that you get a well-defined, consistent operation of the cime scripts across all users of a given machine: no need for them to load a python module or virtual environment before running the cime scripts. This would be particularly helpful as we add support for 3rd party python packages like yaml and netcdf. For better or for worse (probably mostly for better, but there could be some downsides), that would mean that your python environment is in the hands of CIME's config_machines, and not in your own hands.
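As a rough illustration of the "load the desired python, then do everything else in a python subprocess" idea, the sketch below re-runs the current script under a chosen interpreter. Everything here is an assumption for illustration: the guard variable `CIME_PYTHON_REEXEC`, the function names, and the idea that the desired interpreter path has already been worked out somehow (the issue does not say how that would be derived from config_machines.xml).

```python
import os
import subprocess
import sys

# Hypothetical guard variable; prevents the child from re-launching itself again.
_GUARD = "CIME_PYTHON_REEXEC"


def needs_reexec(desired_python, environ=None):
    """Decide whether the script should be re-run under `desired_python`."""
    environ = os.environ if environ is None else environ
    if environ.get(_GUARD) == "1":
        return False  # already running inside the re-launched subprocess
    return os.path.abspath(desired_python) != os.path.abspath(sys.executable)


def run_under(desired_python, argv=None):
    """Re-run the current command in a subprocess using `desired_python`,
    then exit with the child's return code. Returns normally if no
    re-launch is needed."""
    argv = sys.argv if argv is None else argv
    if not needs_reexec(desired_python):
        return
    env = dict(os.environ)
    env[_GUARD] = "1"
    result = subprocess.run([desired_python] + list(argv), env=env)
    sys.exit(result.returncode)
```

Comparing paths is only a crude test; two different paths can point at the same interpreter, so a real implementation would probably want something more robust.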

If that strategy doesn't work or is too difficult to implement, then an alternative strategy that would address the problem in the first paragraph, but not the broader issues addressed by @mnlevy1981's suggestion, would be to somehow maintain the currently-loaded python environment across calls to load_env. I'm not sure how feasible that is, though. (Could you query the currently-loaded python module before resetting the module environment, then reload that python module afterwards?)
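For the "query the currently-loaded python module" part, one possible starting point is the `LOADEDMODULES` environment variable, which Environment Modules and Lmod maintain as a colon-separated list of loaded modules. The sketch below assumes python modules are named `python` or `python/<version>`, which may not hold on every machine; `loaded_python_modules` is a hypothetical helper, not an existing CIME function.

```python
import os


def loaded_python_modules(loadedmodules=None, prefix="python"):
    """Return the loaded modules whose name is `prefix` or starts with `prefix/`.

    `loadedmodules` is a colon-separated string in the format of the
    LOADEDMODULES environment variable; it defaults to the current value.
    """
    if loadedmodules is None:
        loadedmodules = os.environ.get("LOADEDMODULES", "")
    names = [name for name in loadedmodules.split(":") if name]
    return [name for name in names if name == prefix or name.startswith(prefix + "/")]
```

The list captured before the module reset could then be reloaded afterwards using whatever mechanism load_env already uses to issue module commands.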

Another alternative would be to move towards requiring or strongly encouraging the use of python virtual environments. At least on cheyenne, if you have already loaded a python virtual environment then it seems that doing a module reset doesn't impact the python version you're using. In the past, there has been some (reasonable, in my mind) resistance to requiring users to load a python virtual environment before running cime scripts, but if it's too difficult to ensure a consistent python environment in other ways, we may want to reconsider this.
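If we went the virtual-environment route, the scripts could at least detect whether one is active. A minimal sketch, assuming a standard `venv`-style environment (where `sys.prefix` differs from `sys.base_prefix`) or a conda environment (where `CONDA_PREFIX` is set); the function name is hypothetical:

```python
import os
import sys


def in_isolated_python_env(environ=None):
    """Best-effort check for an active venv or conda environment."""
    environ = os.environ if environ is None else environ
    # In a venv, sys.prefix points at the environment and sys.base_prefix
    # at the interpreter it was created from.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    # conda sets CONDA_PREFIX when an environment (including base) is active.
    in_conda = bool(environ.get("CONDA_PREFIX"))
    return in_venv or in_conda
```

This only tells you that some environment is active, not that it contains the packages cime needs.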

@billsacks
Member Author

As mentioned in ESCOMP/CESM#188 (comment), it might make sense to wait to resolve this until after the CIME7 reorganization (#3886), as long as that isn't too far out.

@billsacks
Member Author

From discussion at last week's cime meeting:

  • We will support two methods for ensuring you have a working python environment for CESM/CIME: (1) containers, and (2) manually loading the appropriate python environment on your system prior to running any cime scripts (via a conda or pip environment).
    • Note that, for the sake of (2), we will soon remove the module loads of python on our machines, since these break the ability to control your own python environment.
  • We will not spend time trying to work out how we would support a separate pip/conda installation of cime: cime's relationship to the rest of the model will remain as it currently is (just some inline code).

From discussion at today's cseg meeting, this strategy proposed in 2017 no longer feels like the right thing to do:

"Have the cime scripts initially determine the desired python module, load that, then do everything else in a python subprocess."

Since our proposed solution involves changes to the workflow but not really to cime itself, I am going to close this as a wontfix.
