Looking for input on weekly releases to keep online (lsst_distrib and lsst_sims) #2

Open
airnandez opened this issue Jul 23, 2019 · 15 comments

@airnandez
Owner

airnandez commented Jul 23, 2019

As stated in the documentation, our policy is to keep online the 12 most recent weekly releases and all stable releases.

As of now, we have kept online more weekly releases than we originally intended. This is the current situation:

| Platform | Distribution | Number of Online Weekly Releases |
| --- | --- | --- |
| linux-x86_64 | lsst_distrib | 52 |
| linux-x86_64 | lsst_sims | 32 |
| darwin-x86_64 | lsst_distrib | 51 |
| darwin-x86_64 | lsst_sims | 27 |

You can browse the current contents of the repository here to see exactly what releases are currently available.

Given the size of each release, stable or weekly (currently ~10 GB), and in spite of the deduplication features of CernVM-FS, it is necessary to enforce a purge policy for weekly releases that takes into account the real usage of those releases by you, the users of this distribution.
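As a rough back-of-the-envelope illustration of the figures above, the sketch below totals the online weekly releases and counts how many exceed a 12-per-distribution limit. The size figure is only an upper bound: it assumes nothing is shared between releases, whereas CernVM-FS deduplication will reduce the real footprint by an amount not stated here.

```python
# Weekly release counts copied from the table above.
online_weeklies = {
    ("linux-x86_64", "lsst_distrib"): 52,
    ("linux-x86_64", "lsst_sims"): 32,
    ("darwin-x86_64", "lsst_distrib"): 51,
    ("darwin-x86_64", "lsst_sims"): 27,
}

APPROX_RELEASE_SIZE_GB = 10   # "~10 GB" per release, before deduplication
KEEP_PER_DISTRIBUTION = 12    # the documented limit for weekly releases

total = sum(online_weeklies.values())
surplus = sum(max(0, n - KEEP_PER_DISTRIBUTION) for n in online_weeklies.values())
upper_bound_gb = total * APPROX_RELEASE_SIZE_GB

print(f"{total} weekly releases online, {surplus} beyond the limit")
print(f"at most ~{upper_bound_gb} GB if no content were deduplicated")
```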

I'm therefore looking for input on which weekly releases you really need us to keep online. Please bear in mind that it is not possible for us to keep all of them indefinitely. If no input is received, we will implement the originally intended policy, that is to say, we will keep online only the 12 most recent weekly releases.

@heather999

I would suggest we retain the versions of lsst_distrib used during DC2 DM DRP processing: w_2018_39 and w_2019_19. We can access the Run1.2i data using w_2019_19 so w_2018_39 may not be absolutely required.
These are not necessarily the same versions we should retain for lsst_sims, which I would naively think should correspond to the versions of imSim/lsst_sims used for the simulation productions. Currently that includes lsst_sims w_2019_23, w_2019_19, and w_2019_10. I'd need a little help to go back further for the Run1.2i production, but others can speak to that.

@cwwalter

Personally I use some set of the most recent weeklies plus (mostly) the matched versions of _distrib and _sims that are used for the official imSim production at NERSC and in the docker images. Currently this is _23. I tend to keep my laptop, Duke machines, and OSG jobs all pinned to that version.

Perhaps it would be a good idea to have:

  • All major releases.
  • N most recent weeklies of matched _distrib+sims (N = 12?)
  • A small set of "pinned" versions agreed and documented between @heather999 and @airnandez.

The idea would be that the small number of pinned versions being used wouldn't be deleted if they happened to roll off the end.

@luckyjim

As part of the simulation challenge (SC456) and of the processing for the Euclid project, I use release w_2018_31. For reasons of compatibility between the raw image format and the processEImage function, I cannot move to newer versions.
So I would like this release, w_2018_31, to be kept for one year (until 6/2020).
Sorry for this constraint.
JM Colley, member of the EXT-LSST team for the Euclid project at the APC laboratory / University of Paris

@airnandez
Owner Author

airnandez commented Jul 29, 2019

@heather999 @cwwalter @luckyjim Thank you all for your input.

So, in summary, this could be our policy for selecting the releases of both lsst_distrib and lsst_sims distributions to make available via cvmfs:

  1. All reasonably recent stable releases (e.g. v17.0, v18.0.0, etc.)

  2. The 12 most recent weekly releases of both lsst_distrib and lsst_sims: the reference document is the list of available release tags published by the LSST project here

  3. A set of well identified weekly releases needed for reproducibility purposes, which are not covered by rule number 2. We call them pinned releases.
    As of today, the weekly releases to be pinned are presented in the table below:

    | Distribution | Release Tag | Comment |
    | --- | --- | --- |
    | lsst_distrib | w_2018_31 | Requested by @luckyjim |
    | lsst_distrib | w_2018_39 | Requested by @heather999 |
    | lsst_distrib | w_2019_19 | Requested by @heather999 |
    | lsst_sims | sims_w_2019_10 | Requested by @heather999 |
    | lsst_sims | sims_w_2019_19 | Requested by @heather999 |
    | lsst_sims | sims_w_2019_23 | Requested by @heather999 |
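The three rules above could be applied mechanically along the following lines. This is only a sketch under stated assumptions, not the tooling actually used for the repository: the function names, the tag-parsing regular expression, and the list of available tags are all hypothetical, and every non-weekly tag is treated as a stable release to keep (rule 1 only says "reasonably recent" stable releases, which is not modelled here).

```python
import re

# Weekly tags look like "w_2019_23" (lsst_distrib) or "sims_w_2019_23" (lsst_sims).
WEEKLY_TAG = re.compile(r"^(?:sims_)?w_(\d{4})_(\d{2})$")


def weekly_key(tag):
    """Return (year, week) for a weekly tag, or None for any other tag."""
    match = WEEKLY_TAG.match(tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def releases_to_keep(available, pinned, n_recent=12):
    """Apply rules 1-3 to the tags of a single distribution."""
    available = set(available)
    weeklies = sorted((t for t in available if weekly_key(t) is not None), key=weekly_key)
    # Rule 1 (assumption: every non-weekly tag is a stable release worth keeping).
    stable = available - set(weeklies)
    # Rule 2: the n_recent most recent weekly releases.
    recent = set(weeklies[-n_recent:]) if n_recent > 0 else set()
    # Rule 3: pinned releases that are actually online.
    return stable | recent | (set(pinned) & available)


# Hypothetical list of online lsst_distrib tags, for illustration only.
available = ["v17.0", "v18.0.0", "w_2018_31", "w_2018_39"]
available += [f"w_2019_{week:02d}" for week in range(10, 31)]
# lsst_distrib pins taken from the table above.
pinned = {"w_2018_31", "w_2018_39", "w_2019_19"}

keep = releases_to_keep(available, pinned)
to_remove = sorted(set(available) - keep, key=weekly_key)
print(to_remove)
```

With this made-up input, the weeklies older than the 12 most recent that are not pinned (w_2019_10 through w_2019_18) are the candidates for removal, while w_2018_31 and w_2018_39 survive because they are pinned.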

@cwwalter @heather999 Are the releases of lsst_sims used for the official imSim production campaigns (at NERSC or elsewhere) documented somewhere? If so, I would refer to that document in our policy.

Additional question: is it necessary (or useful) to keep matched versions of lsst_distrib and lsst_sims? For instance, according to the table above, we would pin lsst_sims sims_w_2019_23 but not lsst_distrib w_2019_23. Does that make sense or should we rather aim at keeping pinned releases of both distributions in sync if at all possible?

Once we converge on the policy, I would document the pinned releases in a separate issue of this repository and start implementing this policy (i.e. progressively removing no-longer-needed weekly releases) from September on.

@heather999

We should seek to keep the lsst_distrib and lsst_sims distributions in sync when possible, so that would imply we also need to retain lsst_distrib w_2019_10 and w_2019_23.
I'm hoping Chris or Antonio might help provide an official list of imSim versions used for production.

@cwwalter

> Additional question: is it necessary (or useful) to keep matched versions of lsst_distrib and lsst_sims? For instance, according to the table above, we would pin lsst_sims sims_w_2019_23 but not lsst_distrib w_2019_23. Does that make sense or should we rather aim at keeping pinned releases of both distributions in sync if at all possible?

Yes, as Heather said we need the matched pair since, in order to use packages in both sets, they need to be a consistent build.
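The matched-pair requirement can be checked with a small script. The sketch below assumes, based on the tags quoted in this thread, that an lsst_sims weekly tag is the lsst_distrib weekly tag prefixed with `sims_`; the helper name is hypothetical. It lists the lsst_distrib weeklies that would have to be pinned to match the lsst_sims pins proposed so far.

```python
SIMS_PREFIX = "sims_"


def matching_distrib_tag(sims_tag):
    """Return the lsst_distrib weekly tag built alongside an lsst_sims tag."""
    if not sims_tag.startswith(SIMS_PREFIX):
        raise ValueError(f"not an lsst_sims tag: {sims_tag!r}")
    return sims_tag[len(SIMS_PREFIX):]


# Pins proposed earlier in this thread.
pinned_distrib = {"w_2018_31", "w_2018_39", "w_2019_19"}
pinned_sims = {"sims_w_2019_10", "sims_w_2019_19", "sims_w_2019_23"}

# lsst_distrib tags needed to pair with every pinned lsst_sims tag.
missing_distrib = sorted(
    {matching_distrib_tag(tag) for tag in pinned_sims} - pinned_distrib
)
print(missing_distrib)  # ['w_2019_10', 'w_2019_23']
```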

@airnandez
Owner Author

Taking into account the input above, as of 2019-07-30 these are the weekly releases to be pinned:

| Distribution | Release Tag | Comment |
| --- | --- | --- |
| lsst_distrib | w_2018_31 | Requested by @luckyjim |
| lsst_distrib | w_2018_39 | Requested by @heather999 |
| lsst_distrib | w_2019_10 | To match lsst_sims sims_w_2019_10 |
| lsst_distrib | w_2019_19 | Requested by @heather999 |
| lsst_distrib | w_2019_23 | To match lsst_sims sims_w_2019_23 |
| lsst_sims | sims_w_2019_10 | Requested by @heather999 |
| lsst_sims | sims_w_2019_19 | Requested by @heather999 |
| lsst_sims | sims_w_2019_23 | Requested by @heather999 |

@heather999

@airnandez On behalf of DESC, we would like to request that lsst_distrib w_2019_42 and lsst_sims sims_w_2019_42 be pinned, as this is the version used for the Run2.2i simulation production.

@airnandez
Owner Author

@heather999 Noted. The list of pinned releases is now updated to include lsst_distrib tag w_2019_42 and lsst_sims tag sims_w_2019_42. You can find the list here.

@heather999

@airnandez I'd like to add lsst_distrib w_2020_15 and lsst_sims sims_w_2020_15 to the list of pinned releases. This is in support of sn_pipe, which currently references this version in its tutorials.

@airnandez
Owner Author

@heather999 I updated the list of pinned releases to include lsst_distrib tag w_2020_15 and lsst_sims tag sims_w_2020_15.

You can find the most recent list of pinned releases here.

@BenjaminRacine

Hi,

As discussed on #in2p3 on Slack, I would like to have weekly w_2020_23 restored.
This is for a project to analyse HSC data using the LSST pipeline.
(Most of it is based on v20.0.0, but a small change in weekly w_2020_23 was necessary, so I need to set up that version.)
Thanks,

Ben

@airnandez
Owner Author

@BenjaminRacine I updated the list of pinned releases to include lsst_distrib tag w_2020_23.

You can find the most recent list of pinned releases here.

@airnandez
Owner Author

@pgris I'm adding lsst_sims sims_w_2020_43 to the list of pinned releases as per your request via this issue.

You can find the most recent list of pinned releases here.

@airnandez
Owner Author

airnandez commented May 21, 2023

In this comment of issue #3 we have the list of currently pinned releases. As part of the maintenance operation of the sw.lsst.eu CernVM-FS repository we need to remove releases that are no longer needed.

Could @cwwalter, @luckyjim, @heather999, @BenjaminRacine and @pgris please confirm that you still need those releases to be pinned by commenting on this issue?
