Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test support for custom wrapper install directory #654

Merged
merged 17 commits into from
Sep 19, 2023
Merged

Conversation

vsoch
Copy link
Member

@vsoch vsoch commented Jun 26, 2023

This will close #653

@vsoch
Copy link
Member Author

vsoch commented Jun 26, 2023

@marcodelapierre I'm going to need a second pair of eyes on this one! Testing locally (albeit without module software) it all looks ok - the recipe I generate has the wrapperDir instead of the moduleDir, and it seems to be added to the path, and it corresponds with the wrapper I see written. I don't know why it doesn't find the executable in the test - I think likely there is a bug I'm not seeing.

And definitely no rush! Please take your time if you are able to look!

@marcodelapierre
Copy link
Contributor

@vsoch I am finally on this one!

Code review is OK to me.

Now I am performing some real tests on a VM with Singularity and Lmod modules.

@marcodelapierre
Copy link
Contributor

Testing shpc install python:3.9.2-slim with both old approach (modules and wrappers all together) and new approach (distinct wrappers dir)

@marcodelapierre
Copy link
Contributor

ah!!

@vsoch found it, such a silly little typo .... with a mystery though

so, look at the template for the wrapper scripts: https://github.com/singularityhub/singularity-hpc/blob/main/shpc/main/wrappers/templates/bases/shell-script-base.sh

looks fine, right?

but then if I look at any generated wrapper, I get a spurious curly bracket (what the heck?!?!??!?!):

script=`realpath $0`
}wrapper_bin="/home/ubuntu/singularity-hpc/wrappers/python/3.9.2-slim"
moduleDir=`dirname $wrapper_bin`

I have some comments here:

  1. I don't understand why this bracket pops out in that position, really
  2. with fixing the typo, the wrapper still wouldn't work, because in the current implementation moduleDir there has to be the real moduleDir, because that is where the script 99-shpc.sh sits
  3. I would make a call actually to install the 99-shpc.sh in the wrapperDir, it sounds like a more sound place to be (it is a dependency to the wrappers not a modulefile). As a by-product, the syntax in the wrapper template can be reverted back to its original, as 99-shpc.sh would sit in the same tree as the wrapper.

What do you think?

@marcodelapierre
Copy link
Contributor

marcodelapierre commented Aug 28, 2023

PS but seriously, where does that spurious curly bracket comes from?! 😂

@vsoch
Copy link
Member Author

vsoch commented Aug 28, 2023

@marcodelapierre okay let's debug! The curly bracket is first priority.

  1. First item of action is rebase with current main - I just pushed so we are good there.
  2. My first instinct is to totally nuke the current setup and start from scratch - I am looking at my local install and I don't see the curly bracket. Maybe you butt dialed a script somewhere?
  3. If it persists, let's check if it's related to a particular container base (e.g., try install with singularity and then podman)
  4. Then I'll add some cat to the testing here to see if there is any difference across the matrix

If one of the above doesn't give us insight I'll need to try to reproduce exactly as you have! I don't have an actual HPC module system (just containers /local machine) so I'll need to figure something out - maybe I can spin up a cloud instance on my personal GCP.

@marcodelapierre
Copy link
Contributor

I have just nuked and re-tested, and can confirm the issue.

I have a very basic, non-HPC setup on an Ubuntu VM:

  • Ubuntu 22.04
  • Python 3.10.x
  • pip and setuptools
  • installed Lmod myself with https://github.com/marcodelapierre/hpc-middleware-scripts/blob/main/modules/install-lmod.sh
  • setup SHPC with the scripts below.

I have installed a python and a blast container with both singularity and docker, and still there is the spurious curly bracket.

File use-shpc.sh:

#!/bin/bash

python_ver="$( python3 -V | cut -d ' ' -f 2 | cut -d . -f 1,2 )"

base_dir="$HOME"
install_dir="$base_dir/singularity-hpc"

lib_dir="$install_dir/local/lib/python${python_ver}/dist-packages"
bin_dir="$install_dir/local/bin"
con_dir="$install_dir/containers"
mod_dir="$install_dir/modules"
wrap_dir="$install_dir/wrappers"

export PYTHONPATH="$lib_dir:$PYTHONPATH"
export PATH="$bin_dir:$PATH"

module use $mod_dir

File install-shpc.sh:

#!/bin/bash

shpc_checkout="custom-wrapper-scripts"
. ~/use-shpc.sh

#git clone https://github.com/singularityhub/singularity-hpc
cd $install_dir
git checkout $shpc_checkout
pip3 install -e . --prefix="$(pwd)"

mkdir -p $mod_dir
shpc config set container_base:\$root_dir/containers
shpc config set wrapper_base:\$root_dir/wrappers

cd -

Then, bash install-shpc.sh.

@marcodelapierre
Copy link
Contributor

On the failures of the CI tests:

  1. one is about this:

    =========================== short test summary info ============================
    FAILED test_wrappers.py::test_wrapper_install[lmod-module.lua-singularity] - AssertionError: assert 'bin' in ['module.lua', 
    '99-shpc.sh']
    

    so the test is looking for bin in the wrong list.

  2. then, downloading the Lmod tarball fails .. worth checking version ID and URL.

  3. finally, the Tcl test fails because it is looking for 99-shpc.sh in the wrong place.

@marcodelapierre
Copy link
Contributor

Now waiting on your feedback @vsoch, happy to keep contributing to this process.

@vsoch
Copy link
Member Author

vsoch commented Aug 31, 2023

Thanks @marcodelapierre ! I'm fairly busy tomorrow (Thursday) and Friday but I'll try to take a look over the weekend. I think with your scripts I should be able to reproduce in a VM. Thank you for doing that! I hope we will get to the bottom of this mysterious curly guy... 😆

@marcodelapierre
Copy link
Contributor

My pleasure!

Please keep me posted, I might have spare time for this, too, next week.

@vsoch
Copy link
Member Author

vsoch commented Sep 1, 2023

@marcodelapierre I won't have time this weekend (or maybe next) - I have to make and record a full talk, plus write a full paper (that I haven't started yet) in the next two weeks, plus a second talk and entire set of experiments (for late October) that also aren't started yet, on top of regular work, so I don't think I'll have time for my fun extra projects! But I'm adding to my calendar for after those are done.

@marcodelapierre
Copy link
Contributor

No worries at all Vanessa, I suggest we update each other whenever convenient.
On this specific one, I think there is no rush from a user perspective, neither, because the reported slowness was in the end due to poor Lmod caching.

PS: I do have something baking myself as well .. you will know soon!

@vsoch
Copy link
Member Author

vsoch commented Sep 11, 2023

Actually the current bug looks to be lmod related: https://github.com/singularityhub/singularity-hpc/actions/runs/6119042830/job/16608191872?pr=654. It's not finding the wrapper bin, and note the tests here don't change it yet.

@vsoch
Copy link
Member Author

vsoch commented Sep 11, 2023

well that made it worse 😆

we have more cases than I anticipated - using a custom wrapper,
the default wrapper ($root/modules), havint it unset (empty string)
and then entirely disabled.

Signed-off-by: vsoch <[email protected]>
Signed-off-by: vsoch <[email protected]>
Signed-off-by: vsoch <[email protected]>
@vsoch
Copy link
Member Author

vsoch commented Sep 11, 2023

It's all green? What? Magic? That was an adventure!

@marcodelapierre
Copy link
Contributor

sounds great!

@marcodelapierre
Copy link
Contributor

any extra tests required on my end you reckon ? or does the update testing scheme for wrappers account already for the cases we discussed?

@vsoch
Copy link
Member Author

vsoch commented Sep 11, 2023

any extra tests required on my end you reckon ? or does the update testing scheme for wrappers account already for the cases we discussed?

I think the big test I added for the custom wrapper directory covers most of it - the one missing piece would be to run the equivalent of loading the module, but with a custom wrapper directory. I also don't test the case where wrappers are disabled but the wrapper directory variable is set - that would actually generate the scripts but not add logic to the templates to use them. So based on that, what do you think?

@vsoch
Copy link
Member Author

vsoch commented Sep 11, 2023

But that said - the default for wrappers (that we test) is the module dir, so technically it's set, and the only change with a custom directory would be changing that path. There could be some bug where the module directory is set (and it should be the wrapper) but I think we would have caught that in the other tests. But I'm not 100% on that! I would say the above is OK, and then a manual test by someone that the module loads is good. If we then see an issue pop up again, it would be worth adding the automation. What do you think?

@marcodelapierre
Copy link
Contributor

I will try and run a quick test today, tomorrow at latest. I should have all setup

@vsoch vsoch merged commit 21370ba into main Sep 19, 2023
11 checks passed
@vsoch vsoch deleted the custom-wrapper-scripts branch September 19, 2023 01:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change location of wrapper scripts
2 participants