#5 adds the scripts used to transfer 1 degree and 1/4 degree FV3 Replay datasets. There are a number of hard coded values that I used in order to get the wheels turning, but we have now discussed ways that this can be generalized. Here are some examples pointed out by @frolovsa and @danielabdi-noaa
Runtime arguments
The scripts like examples/replay/move_quarter_degree.py have hard coded runtime, SLURM, and user related options. One way around this would be to read in a yaml file with all of these specifications. For starters, the runtime options, as well as the slurm options, could easily be put in a yaml file.
The conda related arguments could be generalized with os.getlogin() as suggested by @danielabdi-noaa, or we could also just put them in a yaml file. My preference is for the latter, since we'd have other things in a yaml file anyway, and it removes the requirements that 1) we all have miniconda3, 2) it's in the same relative path, and 3) the conda environment name is the same.
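As a rough sketch of the yaml idea, something like the following could work. The section names (mover, slurm, conda), every key and value inside them, and the helpers load_options and slurm_header are all hypothetical placeholders, not what the scripts currently use; it assumes PyYAML is available.

```python
import yaml  # PyYAML

# Hypothetical layout: every key and value below is a placeholder.
EXAMPLE_OPTIONS = """
mover:
  n_jobs: 15
slurm:
  partition: compute
  time: "02:00:00"   # quoted to make sure it is read as a string
conda:
  activate: /path/to/conda/bin/activate
  environment: my-env
"""

REQUIRED_SECTIONS = ("mover", "slurm", "conda")


def load_options(text):
    """Parse the yaml text and check that each expected section is present."""
    options = yaml.safe_load(text)
    missing = [name for name in REQUIRED_SECTIONS if name not in options]
    if missing:
        raise KeyError(f"missing sections in options file: {missing}")
    return options


def slurm_header(slurm):
    """Turn the slurm section into '#SBATCH --key=value' lines."""
    return "\n".join(f"#SBATCH --{key}={value}" for key, value in slurm.items())


options = load_options(EXAMPLE_OPTIONS)
print(slurm_header(options["slurm"]))
```

Because the conda activate path and environment name are spelled out per user, nothing here depends on the login name or on where miniconda3 lives.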
Replay Mover script options
Here's a list of specifics for the subset of replay data we wanted:
xcycles
    @property
    def xcycles(self):
        """These are the DA cycle timestamps, which are every 6 hours. There is one s3 directory per cycle for replay."""
        cycles = pd.date_range(start="1994-01-01", end="1999-06-13T06:00:00", freq="6h")
        return xr.DataArray(cycles, coords={"cycles": cycles}, dims="cycles")
This hard codes the start date, the end date, and the 6 hour frequency.
xtime
    @property
    def xtime(self):
        """These are the time stamps of the resulting dataset, assuming we are grabbing fhr00 and fhr03"""
        time = pd.date_range(start="1994-01-01", end="1999-06-13T09:00:00", freq="3h")
        iau_time = time - timedelta(hours=6)
        return xr.DataArray(iau_time, coords={"time": iau_time}, dims="time", attrs={"long_name": "time", "axis": "T"})
This relies first on the assumptions in xcycles, and then on the choice to grab the fhr00 and fhr03 files, as well as the timing adjustment due to the IAU. I don't know how to generalize this though; I just wrote it after gaining an understanding of the mapping from cycle to fhr timestamps. Note that similar assumptions are baked into add_time_coords.
The property ods_kwargs is specific to s3, and could be moved so that it's passed in like the runtime yaml options above.
Inside the method move_single_dataset is the assumption that we're only grabbing two timestamps per DA cycle (it doesn't matter that they are fhr00 and fhr03 though); see the line defining tslice.
The cached_path staticmethod is specific to the replay data, but I think this is the one thing that can't be generalized, since it is specific to the dataset.
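One possible way to generalize xcycles, xtime, and the two-timestamps-per-cycle assumption is to treat the start date, end date, cycle frequency, forecast hours, and IAU offset as inputs. This is only a sketch under that assumption: the function names cycle_times, output_times, and time_slice are hypothetical, time_slice is a guess at what tslice needs rather than a copy of it, and the forecast hours are assumed to be sorted and shorter than the cycle spacing so that the time axis stays monotonic.

```python
import pandas as pd


def cycle_times(start, end, freq="6h"):
    """DA cycle timestamps, one per replay cycle."""
    return pd.date_range(start=start, end=end, freq=freq)


def output_times(cycles, forecast_hours=(0, 3), iau_offset="6h"):
    """Each cycle plus each forecast hour, shifted back by the IAU offset."""
    leads = pd.to_timedelta(list(forecast_hours), unit="h")
    stamps = [cycle + lead for cycle in cycles for lead in leads]
    return pd.DatetimeIndex(stamps) - pd.Timedelta(iau_offset)


def time_slice(cycle_index, n_per_cycle):
    """Positions along the time axis that a single cycle fills."""
    start = cycle_index * n_per_cycle
    return slice(start, start + n_per_cycle)


# Two cycles, fhr00 and fhr03 for each, with the 6 hour IAU shift
cycles = cycle_times("1994-01-01", "1994-01-01T06:00:00")
times = output_times(cycles)
print(times[time_slice(1, 2)])
```

With the defaults this reproduces the 3 hourly axis that xtime builds today, and the number of timestamps per cycle becomes len(forecast_hours) rather than a fixed two.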
It specifies that it wants an fv3dataset, and this should obviously be generalized. This also pertains to the "region" definition in these lines: