Feature/soil harvesters #68
Open
sherrieF wants to merge 68 commits into develop from feature/soil_harvesters
Added the lhtfl control files to the directory. The lhtfl control files are the files that the test and the harvester daily_bfg.py use.
and the stats_utils class.
in the bfg fluxes control netcdf files.
test the surface latent heat flux values returned from the harvester.
test_harvester_daily_bfg_prateb.py changed the name of test_harvester_daily_bfg_lhtfl_ave.py to test_harvester_surface_latent_heat_flux.py
west_long and east_long.
longitude region to west_long and east_long
…ast_lon') DEFAULT_REGION = {'global': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0}}
It tests the mean, variance, minimum, and maximum values for two variables, soill4 and soilm. The bfg files found in the data directory were used for the test. This test was for the global region; no subregion was requested.
It tests the mean, variance, minimum, and maximum values for two variables, soilt4 and tg3. The bfg files found in the data directory were used for the test. This test was for the global region; no subregion was requested.
It tests the mean, variance, minimum, and maximum values for two variables, soill4 and soilm. The bfg files found in the data directory were used for the test. This test was for the following four regions:
'regions': {
    'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
    'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
    'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
    'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
}
For all four regions, the statistics listed above were compared with the values returned by the daily_bfg harvester.
It tests the mean, variance, minimum, and maximum values for two variables, soilt4 and tg3. The bfg files found in the data directory were used for the test. This test was for the following four regions:
'regions': {
    'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
    'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
    'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
    'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
}
For all four regions, the statistics listed above were compared with the values returned by the daily_bfg harvester.
The class attributes that are initialized for this Python script are:
    self.weighted_averages=[]
    self.variances=[]
    self.maximum=[]
    self.stats=stats_list
def clear_requested_statistics(self):
    This method clears out the class lists so the statistics for multiple variables can be calculated and returned.
def calculate_requested_statistics(self,weights,temporal_mean):
    This method takes the weights and the temporal mean for the variable passed in from the calling routine and calculates the user requested statistics for that variable. The methods below are called from this method, and the statistics they calculate are put into the class lists.
def calculate_weighted_average(self,weights,temporal_mean):
    This method takes the weights and temporal mean of a variable and calculates a weighted sum.
def calculate_var_variance(self,weights,temporal_mean):
    This method takes the weights and temporal mean of a variable and calculates the variance.
    variance = sum_R{ w_i * (x_i - xbar)^2 }
def find_minimum_value(self,temporal_mean):
    This method finds the minimum value of the temporal mean of a variable.
def find_maximum_value(self,temporal_mean):
    This method finds the maximum value of the temporal mean of a variable.
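The statistics described above can be sketched with NumPy as follows. This is only an illustration, not the code in stats_utils.py: the function name `weighted_statistics` is hypothetical, and it assumes that masked cells are NaN and that the weights are renormalized over the remaining cells so they sum to 1.

```python
import numpy as np


def weighted_statistics(weights, temporal_mean):
    """Return the weighted mean, weighted variance, minimum and maximum.

    Assumption: NaN cells in temporal_mean are masked cells and are skipped,
    and the weights of the remaining cells are renormalized to sum to 1.
    """
    values = np.asarray(temporal_mean, dtype=float)
    w = np.asarray(weights, dtype=float)

    valid = ~np.isnan(values)
    x_valid = values[valid]
    w_valid = w[valid] / w[valid].sum()

    # Weighted mean: xbar = sum_R{ w_i * x_i }
    mean = float(np.sum(w_valid * x_valid))
    # Weighted variance: sum_R{ w_i * (x_i - xbar)^2 }
    variance = float(np.sum(w_valid * (x_valid - mean) ** 2))

    return {
        "mean": mean,
        "variance": variance,
        "minimum": float(x_valid.min()),
        "maximum": float(x_valid.max()),
    }
```

The minimum and maximum are taken over the unweighted values, which matches the descriptions of find_minimum_value and find_maximum_value above, since neither takes the weights as an argument.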
The class initialization:
def __init__(self,dataset):
    """
    Here we initialize the region class as a dictionary.
    Parameter:
    dataset - This is a dataset that has been opened with xarray.
    """
    self.name = []
        The list to store the name of the user region. This is a keyword and is required.
    self.north_lat = []
        The list to store the user requested northern latitude of the region.
    self.south_lat = []
        The list to store the user requested southern latitude of the region.
    self.west_long = []
        The list to store the western longitude of the user region, in degrees East.
    self.east_long = []
        The list to store the eastern longitude of the user region, in degrees East.
    self.region_indices = []
        The list to store the region indices. These are passed back to the calling routine.
    self.latitude_values = dataset['grid_yt'].values
        This is the array of latitude values on the original dataset.
    self.longitude_values = dataset['grid_xt'].values
        This is the array of longitude values on the original dataset.
The methods called from within this class:
def test_user_latitudes(self,north_lat,south_lat):
    This method tests the user input latitudes to make sure they are reasonable. It exits with an error if the latitudes are out of bounds. If the values pass the tests they are added to the region dictionary defined in the __init__ method above.
def test_user_longitudes(self,west_long,east_long):
    This method tests the user input longitudes to make sure they are reasonable. It exits with an error if the longitudes are out of bounds. If the values pass the tests they are added to the region dictionary defined in the __init__ method above.
def get_region_indices(self,region_index):
    This method is called from the method get_region_data, which is a member of this class. It calculates the start and end indices in the latitude and longitude arrays for the region index passed in from get_region_data.
Methods called from an external Python script:
def add_user_region(self,dictionary):
    This method is called from the main calling Python script. It tests the region dictionary passed in from the calling script for validity. It calls test_user_latitudes and test_user_longitudes to make sure the user defined region is valid. If the user defined region is valid it populates the region dictionary as defined above.
def get_region_indices(self,region_index):
    This method is called from the main calling Python script. It calculates the start and end indices in the latitude and longitude arrays for the region index passed in. It returns the indices to the calling Python script.
def get_region_data(self,region_index,data):
    This method subsets the full grid variable data into the requested region.
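The validation, index lookup, and subsetting steps described above could look roughly like the sketch below. The function names are hypothetical and are not the methods in region_utils.py. The sketch assumes monotonic coordinate arrays, longitudes in degrees East from 0 to 360, no region that wraps across 0 degrees, and data whose last two dimensions are latitude and longitude. It raises ValueError where the description above says the class exits with an error.

```python
import numpy as np


def validate_region(region):
    """Raise ValueError if the region bounds are out of range or reversed."""
    north, south = region["north_lat"], region["south_lat"]
    west, east = region["west_long"], region["east_long"]
    if not (-90.0 <= south < north <= 90.0):
        raise ValueError(f"invalid latitudes: south={south}, north={north}")
    if not (0.0 <= west < east <= 360.0):
        raise ValueError(f"invalid longitudes: west={west}, east={east}")


def region_index_bounds(latitudes, longitudes, region):
    """Return (lat_start, lat_end, lon_start, lon_end), all inclusive."""
    validate_region(region)
    lat = np.asarray(latitudes)
    lon = np.asarray(longitudes)
    lat_idx = np.where((lat >= region["south_lat"]) & (lat <= region["north_lat"]))[0]
    lon_idx = np.where((lon >= region["west_long"]) & (lon <= region["east_long"]))[0]
    if lat_idx.size == 0 or lon_idx.size == 0:
        raise ValueError("region contains no grid points")
    return int(lat_idx[0]), int(lat_idx[-1]), int(lon_idx[0]), int(lon_idx[-1])


def subset_region(data, bounds):
    """Slice the last two dimensions (latitude, longitude) of data."""
    lat_start, lat_end, lon_start, lon_end = bounds
    return data[..., lat_start:lat_end + 1, lon_start:lon_end + 1]
```

For example, with the 'north_hemi' region from the tests, `region_index_bounds(lat, lon, region)` gives the index range that `subset_region` then applies to both the variable data and the weights, so the two stay aligned.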
The valid masks are land, ocean, sea and ice.
def __init__(self,user_mask_value,soil_type_values):
    """
    Here we initialize the MaskCatalog class.
    """
    self.name = None
    self.user_mask = user_mask_value
    self.data_mask = soil_type_values
def initial_mask_of_variable(self,var_name,variable_data,dataset):
    This method sets the ocean and ice grid cells to missing for the soil variables. This is done automatically for the soil variables soill4, soilm, soilt4 and tg3. This method is called from the main Python script.
def replace_bad_values_with_nan(self,variable_data):
    This method replaces missing or fill values with NaN. This is done so that any statistics the user has requested will be calculated correctly. This method is called from the main Python script.
def user_mask_array(self,region_mask):
    This method takes the sotyp variable data from the dataset. It returns an array of boolean values. The grid points that the user wants are set to True and the grid points the user does not want are set to False.
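A minimal sketch of the masking steps described above is shown below. It is not the MaskCatalog implementation. The function names are hypothetical, and the soil-type codes used for ocean and ice (`OCEAN_CODE`, `ICE_CODE`) are assumptions made only for illustration; the actual sotyp coding should be checked against the dataset. Only the land, ocean and ice masks are covered.

```python
import numpy as np

# Hypothetical sotyp codes, assumed for this sketch only.
OCEAN_CODE = 0
ICE_CODE = 16


def replace_fill_with_nan(variable_data, fill_value):
    """Return a float copy of the data with fill values replaced by NaN."""
    data = np.array(variable_data, dtype=float)
    data[data == fill_value] = np.nan
    return data


def surface_mask(soil_type, surface):
    """Return a boolean array that is True where the grid cell is wanted."""
    soil = np.asarray(soil_type)
    if surface == "ocean":
        return soil == OCEAN_CODE
    if surface == "ice":
        return soil == ICE_CODE
    if surface == "land":
        return (soil != OCEAN_CODE) & (soil != ICE_CODE)
    raise ValueError(f"unknown surface mask: {surface}")


def apply_mask(variable_data, keep):
    """Set every grid cell where keep is False to NaN."""
    return np.where(keep, variable_data, np.nan)
```

Applying the same boolean array to the variable and to the gridcell weights keeps the two consistent, which matters because the statistics are weighted.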
This Python script is the main script for the harvesters in score-hv.
This script uses the following classes:
    from score_hv.config_base import ConfigInterface
    from score_hv.stats_utils import VarStatsCatalog
    from score_hv.region_utils import GeoRegionsCatalog
    from score_hv.mask_utils import MaskCatalog
This script reads the VALID_CONFIG_DICT that is set up in the harvester tests. At present the VALID_CONFIG_DICT has the following values:
    VALID_CONFIG_DICT = {'harvester_name': hv_registry.DAILY_BFG,
                         'filenames' : BFG_PATH,
                         'statistic': ['mean','variance', 'minimum', 'maximum'],
                         'variable': ['var1',...'varn'],
                         'regions': {name,latitude values, longitude values} There can be more than one region.
                         'surface_mask': land,or ocean or ice
                         }
The daily_bfg.py script then opens and reads the dataset requested by the user. The path to the data files is in VALID_CONFIG_DICT:filenames. The Python package xarray is used to open and read in the dataset. The script then reads the rest of the VALID_CONFIG_DICT.
A general rundown of the processing in daily_bfg.py is as follows:
    The gridcell area weight file is read in.
    Each variable that has been requested is processed one at a time.
    If a soil variable has been requested it is masked.
    If a surface mask has been requested the variable grid points and weights grid points are masked.
    If a region or regions have been requested they are applied to the variable data and weights.
    The requested user statistics are then calculated.
The following information is sent back to the harvester that has called daily_bfg.py:
    harvested_data.append(HarvestedData(
        self.config.harvest_filenames,
        statistic,
        variable,
        np.float32(value),
        units,
        dt.fromisoformat(median_cftime.isoformat()),
        longname,
        self.config.surface_mask,
        self.config.regions))
The following are methods that are called from this main Python script:
def get_gridcell_area_data_path():
    Returns the path to the gridcell area data file.
def get_median_cftime(xr_dataset):
    Returns the median cftime from the xr_dataset.
def check_variable_exists(var_name,dataset_variable_names):
    Makes sure the requested variable is in the user's dataset.
def calculate_surface_energy_balance(xr_dataset,dataset_variable_names):
    This method calculates the surface energy balance. The surface energy balance is a derived field.
def calculate_toa_radative_flux(xr_dataset,dataset_variable_names):
    This method calculates the top of the atmosphere radiative energy flux (netrf_avetoa). This is a derived field.
def check_array_dimensions(region_variable,region_weights):
    This method makes sure that the region variable and the region weights have the same dimensions. If their dimensions are different we exit the main script. The dimensions must be the same to calculate the statistics requested by the user.
def calculate_and_normalize_solid_angle(sum_global_weights,region_weights):
    This method calculates the solid angle for the regional weights and normalizes them. The normalized regional weights are returned.
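The last two helpers above can be illustrated with the sketch below. It is one possible reading, not the code in daily_bfg.py. The names `check_shapes_match` and `normalize_region_weights` are hypothetical. The sketch assumes that the region's solid angle is its share of the global weight sum scaled to the 4*pi steradians of a sphere, and that "normalized" means the regional weights are divided by their own sum. It raises ValueError rather than exiting the script.

```python
import numpy as np


def check_shapes_match(region_variable, region_weights):
    """Raise ValueError if the variable and weights differ in shape."""
    if np.shape(region_variable) != np.shape(region_weights):
        raise ValueError(
            f"shape mismatch: variable {np.shape(region_variable)} "
            f"vs weights {np.shape(region_weights)}"
        )


def normalize_region_weights(sum_global_weights, region_weights):
    """Return (solid_angle, normalized_weights) for a region.

    Assumption: solid angle is the region's fraction of the global weight
    sum times 4*pi, and the normalized weights sum to 1 over the region.
    """
    w = np.asarray(region_weights, dtype=float)
    region_sum = w.sum()
    solid_angle = 4.0 * np.pi * region_sum / sum_global_weights
    return solid_angle, w / region_sum
```

With weights normalized this way, the weighted sum of a regional field is directly its area-weighted mean, so no further division by the sum of the weights is needed.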
/Users/sfredrick/adam/score-hv/src/score_hv/data for all soil tests
This is a new harvester. The tests for this harvester are:
test_harvester_global_soil_moisture.py
test_harvester_global_soil_temperature.py
The two tests above check the values returned by daily_bfg.py for the global values of
soilt4, tg3, soill4, and soilm. No region was requested. It should be noted that these soil fields
are automatically masked: the ocean and ice values are set to NaN.
The tests below:
test_harvester_regional_soil_moisture.py
test_harvester_regional_soil_temperature.py
check the regional values of soilt4, tg3, soill4, and soilm.
The regions tested were:
'regions': {
    'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
    'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
    'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
    'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
}
All of the tests pass under pytest.
The following classes are also a part of this branch:
mask_utils.py - The class for masking methods.
region_utils.py - The class for subsetting the global variable and weight data into subregions.
stats_utils.py - The class for calculating the user requested statistics.
daily_bfg.py - The main Python script for populating the harvested_data that is returned to the tests and to other methods that call daily_bfg.py.