
Feature/soil harvesters #68

Open
wants to merge 68 commits into develop
Conversation

sherrieF
Collaborator

This is a new harvester. The tests for this harvester are
test_harvester_global_soil_moisture.py
test_harvester_global_soil_temperature.py
These two tests check the return values from daily_bfg.py for the global values of
soilt4, tg3, soill4, and soilm. No region was requested. It should be noted that these soil fields
are automatically masked: the ocean and ice values are set to NaN.
The tests below:
test_harvester_regional_soil_moisture.py
test_harvester_regional_soil_temperature.py
test the regional values of soilt4, tg3, soill4, and soilm.
The regions tested were:
'regions': {
'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
}
All of the tests pass under pytest.
The following classes are also part of this branch:
mask_utils.py - The class for masking methods.
region_utils.py - The class for subsetting the global variable and weight data into subregions.
stats_utils.py - The class for calculating the user-requested statistics.

daily_bfg.py
This is the main python script for populating the harvested_data, which is returned to the tests and to any other methods that call daily_bfg.py.
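For orientation, below is a hypothetical sketch of how daily_bfg.py might wire these helper classes together. The constructor arguments are inferred from the class summaries later in this description, and the file path is illustrative only; the actual code may differ.

    import xarray as xr
    from score_hv.stats_utils import VarStatsCatalog
    from score_hv.region_utils import GeoRegionsCatalog
    from score_hv.mask_utils import MaskCatalog

    # Hypothetical wiring; constructor arguments are inferred from the class
    # summaries later in this description and may not match the actual code.
    dataset = xr.open_mfdataset('/path/to/bfg_*.nc')       # illustrative path only
    stats = VarStatsCatalog(['mean', 'variance', 'minimum', 'maximum'])
    regions = GeoRegionsCatalog(dataset)                    # dataset opened with xarray
    masks = MaskCatalog('land', dataset['sotyp'].values)    # user mask value + soil type field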

Added the lhtfl control files to the directory.
The lhtfl control files are the files that the test and
the harvester daily_bfg.py use to test the surface latent heat flux values
returned from the harvester.
test_harvester_daily_bfg_prateb.py
Changed the name of test_harvester_daily_bfg_lhtfl_ave.py to
test_harvester_surface_latent_heat_flux.py.
sherrieF added 23 commits August 6, 2024 12:11
longitude region to west_long and east_long
…ast_lon')

DEFAULT_REGION = {'global': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0}}
It tests the mean, variance, min, and max values for two variables,
soill4 and soilm.  The bfg files found in the data directory were
used for the test.  This test was for the global region; no subregion
was requested.
It tests the mean, variance, min, and max values for two variables, soilt4 and tg3.
The bfg files found in the data directory were used for the test.
This test was for the global region; no subregion was requested.
It tests the mean, variance, min, and max values for two variables, soill4 and soilm.
The bfg files found in the data directory were used for the test.
This test was for the four regions as follows:
'regions': {
           'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
           'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
           'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
           'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
          }
Values for the statistics mentioned above were compared, for all four regions, against those
returned from the daily_bfg harvester.
It tests the mean, variance, min, and max values for two variables, soilt4 and tg3.
The bfg files found in the data directory were used for the test.
This test was for the four regions as follows:
'regions': {
           'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
           'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
           'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
           'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
          }
Values for the statistics mentioned above were compared, for all four regions, against those
returned from the daily_bfg harvester.
    The class lists that are initialized for this python script are:
        self.weighted_averages=[]
        self.variances=[]
        self.maximum=[]
        self.stats=stats_list

    def clear_requested_statistics(self):
        This method clears out the class lists so the statistics for
        multiple variables can be calculated and returned.

    def calculate_requested_statistics(self,weights,temporal_mean):
        This method takes the weights and the temporal mean for
        the variable passed in from the calling routine and
        calculates the user-requested statistics for that
        variable. The following methods are called from this
        method.  The statistics that are calculated by the
        methods below are put into the class lists (a sketch of the
        weighted average and variance calculations follows this list).
        def calculate_weighted_average(self,weights,temporal_mean):
            This method takes the weights and temporal mean of a
            variable and calculates a weighted sum.
        def calculate_var_variance(self,weights,temporal_mean):
            This method takes the weights and temporal mean of a
            variable  and calculates the variance.
            variance = sum_R{ w_i * (x_i - xbar)^2 }
        def find_minimum_value(self,temporal_mean):
            This method finds the minimum value of the temporal
            mean of a variable.
        def find_maximum_value(self,temporal_mean):
            This method finds the maximum value of the temporal
            mean of a variable.
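For orientation, here is a minimal sketch of how the weighted average and weighted variance described above could be computed with numpy. Everything other than the quantities named above (weights, temporal_mean, xbar) is illustrative and is not the actual implementation in stats_utils.py.

    import numpy as np

    def weighted_average_sketch(weights, temporal_mean):
        # xbar = sum_i( w_i * x_i ) / sum_i( w_i ), ignoring NaN-masked grid cells
        w = np.where(np.isnan(temporal_mean), np.nan, weights)
        return np.nansum(w * temporal_mean) / np.nansum(w)

    def weighted_variance_sketch(weights, temporal_mean):
        # variance = sum_R{ w_i * (x_i - xbar)^2 } with the w_i normalized to sum to 1
        w = np.where(np.isnan(temporal_mean), np.nan, weights)
        w = w / np.nansum(w)
        xbar = np.nansum(w * temporal_mean)
        return np.nansum(w * (temporal_mean - xbar)**2)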
The class initialization:
    def __init__(self,dataset):
        """
          Here we initialize the region class as a dictionary.
          Parameter: dataset - This is a dataset that has been
                               opened with xarray.
          """
        self.name = []        The list to store the name of the user region.  This is a keyword and is needed.
        self.north_lat = []   The list to store the user-requested northern latitude of the region.
        self.south_lat = []   The list to store the user-requested southern latitude of the region.
        self.west_long = []   The list to store the western longitude of the user region.  In degrees East.
        self.east_long = []   The list to store the eastern longitude of the user region.  In degrees East.
        self.region_indices = []  The list to store the region indices.  These are passed back to the calling
                                  routine.
        self.latitude_values = dataset['grid_yt'].values   This is the array of latitude values on the original
                                                            dataset.
        self.longitude_values = dataset['grid_xt'].values  This is the array of longitude values on the
                                                            original dataset.
        The methods called from within this class:
            def test_user_latitudes(self,north_lat,south_lat):
                 This method tests the user input latitudes to make sure they are reasonable.
                 It exits with an error if the latitudes are out of bounds.
                 If the values pass the tests they are added to the region dictionary defined
                 in the __init__ method above.
            def test_user_longitudes(self,west_long,east_long):
                This method tests the user input longitudes to make sure they are reasonable.
                It exits with an error if the longitudes are out of bounds.
                If the values pass the tests they are added to the region dictionary defined
                in the __init__ method above.
            def get_region_indices(self,region_index):
                This method is called from the method get_region_data, which is a member of
                this class.
                It calculates the start and end indices in the latitude and longitude
                arrays for the region index passed in from get_region_data.

       Methods called from an external python script:
       def add_user_region(self,dictionary):
            This method is called from the main calling python script.  It tests the
            region dictionary passed in from the calling script for validity.  It calls
            test_user_latitudes and test_user_longitudes to make sure the user-defined region
            is valid.  If the user-defined region is valid it populates the region dictionary
            as defined above.
       def get_region_indices(self,region_index):
           This method is called from the main calling python script.
           It calculates the start and end indices in the latitude and longitude
           arrays for the region index passed in.  It returns the indices to the calling python script.
       def get_region_data(self,region_index,data):
           This method subsets the full grid variable data into the requested region.
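To illustrate the index calculation and subsetting described above, here is a minimal sketch using numpy boolean masks. It assumes latitudes come from grid_yt and longitudes from grid_xt (degrees East, 0 to 360) and does not reflect the actual GeoRegionsCatalog internals, which may differ (for example in how regions crossing the 0/360 boundary are handled).

    import numpy as np

    def region_indices_sketch(latitudes, longitudes, north_lat, south_lat, west_long, east_long):
        # Boolean masks selecting the latitude rows and longitude columns inside the region.
        lat_idx = np.where((latitudes >= south_lat) & (latitudes <= north_lat))[0]
        lon_idx = np.where((longitudes >= west_long) & (longitudes <= east_long))[0]
        # Start and end indices into the latitude and longitude arrays.
        return lat_idx[0], lat_idx[-1], lon_idx[0], lon_idx[-1]

    def region_data_sketch(data, lat_start, lat_end, lon_start, lon_end):
        # Subset a (lat, lon) variable array to the requested region.
        return data[lat_start:lat_end + 1, lon_start:lon_end + 1]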
 The valid masks are land, ocean, sea, and ice.

    def __init__(self,user_mask_value,soil_type_values):
        """
          Here we initialize the MaskCatalog class.
          """
        self.name = None
        self.user_mask = user_mask_value
        self.data_mask = soil_type_values

   def initial_mask_of_variable(self,var_name,variable_data,dataset):
       This method sets the ocean and ice grid cells to missing for the soil variables.
       This is done automatically for the soil variables soill4, soilm, soilt4, and tg3.
       This method is called from the main python script.

   def replace_bad_values_with_nan(self,variable_data):
       This method replaces missing or fill values with NaN.
       This is done so any statistics the user has requested will
       be calculated correctly.
       This method is called from the main python script.

   def user_mask_array(self,region_mask):
       This method takes the sotyp variable data from the dataset.
       It returns an array of boolean values.  The grid points that
       the user wants are set to True and the grid points the user
       does not want are set to False.
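To make the automatic masking concrete, here is a minimal sketch of setting ocean and ice grid cells to NaN using the sotyp (soil type) field. The specific soil-type codes used here for water (0) and land ice (16) are assumptions about the bfg files, not values taken from mask_utils.py.

    import numpy as np

    # Assumed soil-type codes in the bfg files (not taken from mask_utils.py).
    WATER_SOTYP = 0.0   # ocean grid cells
    ICE_SOTYP = 16.0    # land-ice grid cells

    def mask_ocean_and_ice_sketch(variable_data, soil_type_values):
        # Set ocean and ice grid cells to NaN so they drop out of the statistics.
        return np.where((soil_type_values == WATER_SOTYP) |
                        (soil_type_values == ICE_SOTYP),
                        np.nan, variable_data)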
This python script is the main script for the harvesters in score-hv.
This script uses the following classes:
from score_hv.config_base  import ConfigInterface
from score_hv.stats_utils  import VarStatsCatalog
from score_hv.region_utils import GeoRegionsCatalog
from score_hv.mask_utils import MaskCatalog

This script reads the VALID_CONFIG_DICT that is set up in the
harvester tests. At present the VALID_CONFIG_DICT has the following values:
VALID_CONFIG_DICT = {'harvester_name': hv_registry.DAILY_BFG,
                     'filenames' : BFG_PATH,
                     'statistic': ['mean','variance', 'minimum', 'maximum'],
                     'variable': ['var1', ..., 'varn'],
                     'regions': {name, latitude values, longitude values}
                                There can be more than one region.
                     'surface_mask': land, ocean, or ice
                     }
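For illustration, a filled-in configuration of this form might look like the sketch below. The file path, the variable list, and the exact value types (e.g. list vs. single string) are hypothetical, and the harvest entry point shown is an assumption; the actual tests may construct and call the harvester differently.

    from score_hv import hv_registry
    from score_hv.harvester_base import harvest   # assumed entry point

    # Hypothetical example configuration; the path and variables are illustrative only.
    config = {'harvester_name': hv_registry.DAILY_BFG,
              'filenames': ['/path/to/bfg_example.nc'],
              'statistic': ['mean', 'variance', 'minimum', 'maximum'],
              'variable': ['soill4', 'soilm'],
              'regions': {'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0,
                                         'west_long': 0.0, 'east_long': 360.0}},
              'surface_mask': 'land'}

    harvested_data = harvest(config)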

The daily_bfg.py then opens and reads the dataset requested
by the user.  The path to the data files is in the VALID_CONFIG_DICT 'filenames' entry.
The python package xarray is used to open and read in the dataset.
The script then reads the rest of the VALID_CONFIG_DICT.
A general rundown of the processing in daily_bfg.py is as follows:
    The gridcell area weight file is read in.
    Each variable that has been requested is processed one at a time.
    If a soil variable has been requested it is masked.
    If a surface mask has been requested the variable grid points
    and weights grid points are masked.
    If a region or regions have been requested they are applied
    to the variable data and weights.
    The requested user statistics are then calculated.
    The following information is sent back to the harvester caller that invoked daily_bfg.py:
    harvested_data.append(HarvestedData(
                          self.config.harvest_filenames,
                          statistic,
                          variable,
                          np.float32(value),
                          units,
                          dt.fromisoformat(median_cftime.isoformat()),
                          longname,
                          self.config.surface_mask,
                          self.config.regions))
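The records appended above are returned to the caller as a list, so a test can iterate over them and compare the harvested values against known answers. A minimal sketch, assuming HarvestedData exposes fields named after the constructor arguments shown above (statistic, variable, value); the expected value here is a placeholder, not an actual test answer.

    expected_soilm_mean = 123.45   # placeholder reference value
    for record in harvested_data:
        if record.statistic == 'mean' and record.variable == 'soilm':
            assert abs(record.value - expected_soilm_mean) < 1.0e-5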

The following are methods that are called from this main python script:
    def get_gridcell_area_data_path():
        returns the path to the gridcell area data file.

    def get_median_cftime(xr_dataset):
        returns the median cftime from the xr_dataset.

    def check_variable_exists(var_name,dataset_variable_names):
        Makes sure the requested variable is in the user's dataset.

    def calculate_surface_energy_balance(xr_dataset,dataset_variable_names):
        This method calculates the surface energy balance.  The surface energy balance
        is a derived field.

    def calculate_toa_radative_flux(xr_dataset,dataset_variable_names):
        This method calculates the top-of-the-atmosphere radiative energy flux (netrf_avetoa).
        This is a derived field.

    def check_array_dimensions(region_variable,region_weights):
        This method makes sure that the region variable and the region weights
        have the same dimensions.  If their dimensions are different we exit
        the main script.  The dimensions must be the same to calculate the
        statistics requested by the user.

    def calculate_and_normalize_solid_angle(sum_global_weights,region_weights):
        This method calculates the solid angle for the regional weights
        and normalizes them.  The normalized regional weights are returned.
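A minimal sketch of one way the regional weight normalization described above could work: the regional gridcell area weights are converted to solid angles by assuming the global weights sum to the full sphere (4*pi steradians), and are then renormalized so the regional weights sum to one before the weighted statistics are computed. The actual arithmetic in daily_bfg.py may differ.

    import numpy as np

    def normalize_region_weights_sketch(sum_global_weights, region_weights):
        # Convert regional area weights to solid angles (full sphere = 4*pi sr),
        # then renormalize so the regional weights sum to one.
        solid_angle = region_weights * (4.0 * np.pi / sum_global_weights)
        return solid_angle / np.nansum(solid_angle)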