Add v1 data #43

Open
forsyth2 opened this issue Aug 2, 2024 · 7 comments · May be fixed by #46

Comments

@forsyth2
Collaborator

forsyth2 commented Aug 2, 2024

v1 data should also be included on e3sm_data_docs.

Currently, v1 data is linked from https://e3sm.org/data/get-e3sm-data/released-e3sm-data/. Ideally, we should move all those links to e3sm_data_docs so all the data is linked from one coherent site. Note https://acme-climate.atlassian.net/wiki/spaces/ED/pages/4495441922 provides some v1 data as well.

@forsyth2
Collaborator Author

forsyth2 commented Aug 6, 2024

@chengzhuzhang @TonyB9000 Is the plan to leave the HPSS path as-is for the v1 data? E.g., /home/g/golaz/2018/E3SM_simulations/repaired/20180129.DECKv1b_piControl.ne30_oEC.edison rather than the more uniform /home/projects/e3sm/www/ prefix?

Reasons to switch:

  • It would simplify automatically extracting simulation information for E3SM Data Docs
  • It would provide consistency for users
  • It doesn't rely on a single user's HPSS path

Reasons to keep as-is:

  • It could be burdensome/time-consuming to move all the v1 data to new directories. (Especially considering we're already on to v3 now).
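To illustrate the first reason to switch, here is a minimal sketch of how a uniform prefix could simplify automated extraction. It only checks whether a path sits under the `/home/projects/e3sm/www/` prefix and takes the last path component as the simulation name. The function names are hypothetical, and treating the last component as the simulation name is an assumption, not something E3SM Data Docs is confirmed to do.

```python
from pathlib import PurePosixPath

# Prefix mentioned in this thread; everything else here is illustrative.
PROJECT_PREFIX = PurePosixPath("/home/projects/e3sm/www")


def is_under_project_prefix(hpss_path: str) -> bool:
    """Return True if hpss_path is the project prefix or lies beneath it."""
    path = PurePosixPath(hpss_path)
    return path == PROJECT_PREFIX or PROJECT_PREFIX in path.parents


def simulation_name(hpss_path: str) -> str:
    """Return the last path component (assumed to be the simulation name)."""
    return PurePosixPath(hpss_path).name


v1_path = (
    "/home/g/golaz/2018/E3SM_simulations/repaired/"
    "20180129.DECKv1b_piControl.ne30_oEC.edison"
)
print(is_under_project_prefix(v1_path))  # the v1 example lives under a user path
print(simulation_name(v1_path))
```

With a single prefix, a script could walk one directory tree instead of keeping a per-simulation table of user-specific paths.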

@forsyth2 forsyth2 linked a pull request Aug 6, 2024 that will close this issue
@chengzhuzhang
Collaborator

Another consideration is our HPSS project quota: we may run out of space after all the v1 data is copied. I think using the original path is fine. We should use a consistent structure for v3 for sure.

@chengzhuzhang
Collaborator

@forsyth2 my assessment above about using the original HPSS path is not correct, because these datasets can only be accessed world-wide (from https://portal.nersc.gov/archive/home/projects/e3sm/www/) if they are placed under `/home/projects/e3sm/www/`.
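As a rough sketch of that constraint: the portal URL above is the HPSS prefix appended to `https://portal.nersc.gov/archive`. The helper below assumes the same concatenation holds for every path beneath the prefix, which this thread does not confirm. The function name is hypothetical, and the sketch does not normalize `..` components.

```python
from typing import Optional

# Both values are taken from the comment above.
PORTAL_ROOT = "https://portal.nersc.gov/archive"
WWW_PREFIX = "/home/projects/e3sm/www/"


def portal_url(hpss_path: str) -> Optional[str]:
    """Return the assumed public URL for hpss_path, or None if the path
    is outside the world-accessible prefix."""
    if not hpss_path.startswith(WWW_PREFIX):
        return None
    # Assumption: the URL is the portal root followed by the HPSS path.
    return PORTAL_ROOT + hpss_path


print(portal_url("/home/projects/e3sm/www/"))
print(portal_url("/home/g/golaz/2018/E3SM_simulations/repaired/"))
```

A path under a user directory returns `None`, which matches the point that v1 data left in place would not be reachable through the portal.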

@forsyth2
Collaborator Author

@chengzhuzhang Oh that is a good point. So should we begin a process of moving data already on HPSS to be under that path?

@chengzhuzhang
Collaborator

We can wait a little bit and work with Tony to figure out the best option to create the backup...

@TonyB9000
Collaborator

@chengzhuzhang @forsyth2 Please see my latest comment at the bottom of the Confluence page

https://acme-climate.atlassian.net/wiki/spaces/IPD/pages/4503633923/2024-08-02+IG+data+publication+Continue+publication+archive+backup

Let me know if I can proceed with further archive creation. To save space, we probably want to transfer and delete as we go, or else incur a (temporary) 130+TB footprint on disk.
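To make the space trade-off concrete, here is a small sketch comparing peak disk usage for the two approaches. The per-archive sizes are made-up placeholders chosen only to total 130 TB; they are not real archive sizes. The sketch also assumes archives are created, transferred, and deleted one at a time.

```python
def peak_all_at_once(sizes_tb):
    """Peak disk usage if every archive is created before any is deleted."""
    return sum(sizes_tb)


def peak_transfer_and_delete(sizes_tb):
    """Peak disk usage if each archive is transferred and deleted before
    the next one is created (one archive on disk at a time)."""
    return max(sizes_tb, default=0)


# Hypothetical archive sizes in TB, for illustration only.
sizes_tb = [40, 50, 40]
print(peak_all_at_once(sizes_tb))
print(peak_transfer_and_delete(sizes_tb))
```

Under these assumptions, the peak footprint drops from the total size of all archives to the size of the largest single archive.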

@forsyth2
Collaborator Author

> To save space, we probably want to transfer and delete as we go, or else incur a (temporary) 130+TB footprint on disk.

I think that is fine. It looks like many simulations on https://acme-climate.atlassian.net/wiki/spaces/ED/pages/4495441922/V1+Simulation+backfill+WIP in fact have an HPSS path and backup path already -- i.e., there already is a copy for many simulations.
