[ERROR] superfluous daily files #367

Open
YanchunHe opened this issue Aug 29, 2024 · 7 comments

@YanchunHe
Collaborator

Describe the error
Some daily files in realm day are superfluous: their time period (1 Jan xx to 30 Dec yy) is completely covered by the corresponding files that also include 31 Dec yy. These files can be deleted, as they break automatic merging commands. I provide an example with tas, but more variables and files are affected. These files can be unpublished.

Reported by:
https://errata.ipsl.fr/static/view.html?uid=47bfb675-a02d-b050-83b6-5fb74882d510

Way to correct
Retract the superfluous daily files

@YanchunHe YanchunHe added the error error in the dataset label Aug 29, 2024
@YanchunHe
Collaborator Author

YanchunHe commented Aug 29, 2024

I had a look, and indeed quite a few files contain overlapping data.

Related to #134

yanchun@ipcc:/projects/NS9034K/CMIP6/.cmorout/NorESM2-LM/ssp126/v20191108
$ ll tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20*
-rw-rw-r-- 1 yanchun ns9034k  69M Dec  6  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20150101-20201231.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Nov 22  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20210101-20301231.nc
-rw-rw-r-- 1 jang    ns9034k 115M Dec 18  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20310101-20401230.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Dec  4  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20310101-20401231.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Nov 23  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20410101-20501231.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Dec  2  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20510101-20601231.nc
-rw-rw-r-- 1 jang    ns9034k 115M Dec 19  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20610101-20701230.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Dec  3  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20610101-20701231.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Nov 23  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20710101-20801231.nc
-rw-rw-r-- 1 jang    ns9034k 115M Dec 19  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20810101-20901230.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Dec  6  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20810101-20901231.nc
-rw-rw-r-- 1 yanchun ns9034k 115M Nov 24  2019 tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20910101-21001231.nc

More files are affected; they can be listed with:

yanchun@ipcc:/projects/NS9034K/CMIP6/.cmorout/NorESM2-LM/ssp126/v20191108
$ ll *_day_*-2*1230.nc
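
The overlap described in this issue (a file is superfluous when its period is completely covered by another file of the same variable) can also be checked from the file names alone. The following is a minimal Python sketch, not the procedure used for the retraction. The function name `find_superfluous` and the regular expression are assumptions made for illustration, and the check looks only at the date range in each file name, not at the time axis stored inside the files.

```python
import re
from collections import defaultdict

# Assumed naming scheme, as in the listing above:
#   <stem>_<YYYYMMDD>-<YYYYMMDD>.nc
PATTERN = re.compile(r"^(?P<stem>.+)_(?P<start>\d{8})-(?P<end>\d{8})\.nc$")


def find_superfluous(filenames):
    """Return files whose date range lies inside a sibling file's range."""
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if match:  # skip names without a YYYYMMDD-YYYYMMDD suffix
            groups[match["stem"]].append((match["start"], match["end"], name))

    superfluous = []
    for entries in groups.values():
        for start, end, name in entries:
            for other_start, other_end, other_name in entries:
                if other_name == name:
                    continue
                # Zero-padded YYYYMMDD strings compare correctly as text.
                covers = other_start <= start and end <= other_end
                differs = (other_start, other_end) != (start, end)
                if covers and differs:
                    superfluous.append(name)
                    break
    return sorted(superfluous)


listing = [
    "tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20310101-20401230.nc",
    "tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20310101-20401231.nc",
    "tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20410101-20501231.nc",
]
print(find_superfluous(listing))
# ['tas_day_NorESM2-LM_ssp126_r1i1p1f1_gn_20310101-20401230.nc']
```

Grouping by the part of the name before the date range keeps variables, tables and members separate, so a file is only compared with its own siblings.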

@monsieuralok Could you please retract all these duplicated datasets and republish the correct ones?

@YanchunHe YanchunHe pinned this issue Aug 29, 2024
@YanchunHe YanchunHe unpinned this issue Aug 29, 2024
@monsieuralok
Collaborator

@YanchunHe is this a problem only with the "day" frequency, or should we also fix "Eday", "CFday" and "AERday"?

@YanchunHe
Collaborator Author

> @YanchunHe is this a problem only with the "day" frequency, or should we also fix "Eday", "CFday" and "AERday"?

OK, all files that end with '1230.nc' should be retracted.

@monsieuralok
Collaborator

@YanchunHe if we retract the datasets, we need to republish them, and republishing requires a new version number, which you would need to set. Alternatively, we can open an Errata issue and ask people to ignore these extra files. What would you suggest?

@monsieuralok
Collaborator

> @YanchunHe if we retract the datasets, we need to republish them, and republishing requires a new version number, which you would need to set. Alternatively, we can open an Errata issue and ask people to ignore these extra files. What would you suggest?

@YanchunHe could you provide your suggestion?

@YanchunHe
Collaborator Author

> @YanchunHe if we retract the datasets, we need to republish them, and republishing requires a new version number, which you would need to set. Alternatively, we can open an Errata issue and ask people to ignore these extra files. What would you suggest?
>
> @YanchunHe could you provide your suggestion?

Hi @monsieuralok

I have now revised the datasets, so they have a new version, v20191109.

Could you retract the whole 'v20191108', and republish as v20191109?

Please refer to #134 for detailed information on the datasets to be retracted and republished.

@monsieuralok
Collaborator
