Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/soil harvesters #68

Open
wants to merge 68 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
68 commits
Select commit Hold shift + click to select a range
b751e80
Deleted __init__.py
sherrieF Dec 19, 2023
0a1e8e7
deleted prateb_control files from the directory.
sherrieF Dec 19, 2023
55579ee
changed precip to lhftl
sherrieF Dec 20, 2023
38667e2
No change was made
sherrieF Dec 20, 2023
f8c63ea
Updated the test to work with the daily_bfg.py harvester.
sherrieF Apr 25, 2024
35bfa58
Updated the harvester to work with the daily_bfg.py harvester.
sherrieF Apr 25, 2024
a60e898
Update the test to work with the daily_bfg.py harvester.
sherrieF Apr 25, 2024
2f836dc
Added the region_utils.py to this branch.
sherrieF Apr 25, 2024
0ff1940
Added the stats_utils.py to this branch.
sherrieF Apr 25, 2024
7dfeb23
Up dated daily_bfg.py to work with the region_utils class
sherrieF Apr 25, 2024
f453737
Merged all of the flux variables from the original bfg control files
sherrieF Apr 25, 2024
79f3e85
added the file test_harvester_surface_latent_heat_flux.py to
sherrieF Apr 25, 2024
9414a14
Removed test files that do not test fluxes.
sherrieF Apr 25, 2024
28503d0
added test data for soil.
sherrieF Apr 30, 2024
6fc4006
working on test for soil moisture
sherrieF Apr 30, 2024
e7566d8
working on daily_bfg.py
sherrieF Apr 30, 2024
44b3549
Working on soil daily_bfg
sherrieF May 2, 2024
1ce891b
added column soil moisture, land mask and tg3 to bfg files.
sherrieF May 2, 2024
09beea9
Working on test for harvesters
sherrieF May 2, 2024
5320f41
added some documentation.
sherrieF May 2, 2024
03a817c
Working on daily_bfg
sherrieF May 2, 2024
c0b7231
working on adding masks
sherrieF Jun 3, 2024
ed6b8cf
working on adding masks
sherrieF Jun 3, 2024
3cb7766
working on adding masks
sherrieF Jun 3, 2024
850ab47
no changes
sherrieF Jun 3, 2024
09290b8
Added a mask variable to the VALID_CONFIG_DICT.
sherrieF Jun 5, 2024
8bc2ad2
added surface_mask to VALID_CONFIG_DICT
sherrieF Jun 5, 2024
186ce6d
working on test_harvester_global_soilt4.py
sherrieF Jun 17, 2024
4490101
working on adding masking
sherrieF Jun 17, 2024
8bd3577
deleted un-used parameter var_data
sherrieF Jun 17, 2024
8fe32c6
working on masking
sherrieF Jul 1, 2024
9764443
working on masking
sherrieF Jul 1, 2024
7118ac9
working on masking
sherrieF Jul 1, 2024
5b7d37b
working on masking
sherrieF Jul 1, 2024
bef82a4
working on adding masking
sherrieF Jul 24, 2024
f1b2576
working on adding masking
sherrieF Jul 24, 2024
354fbc5
working on masking
sherrieF Jul 24, 2024
47b05d5
working on adding masks
sherrieF Jul 24, 2024
74b2706
changed weights and temporal means arrays to numpy arrays
sherrieF Jul 25, 2024
71ad8db
added a method to get rid of missing and fill values in
sherrieF Jul 25, 2024
22f9112
adding masking
sherrieF Aug 6, 2024
b1b5e80
adding masking
sherrieF Aug 6, 2024
0cae4b3
working on masking
sherrieF Aug 6, 2024
3272d12
no changes
sherrieF Aug 6, 2024
62db58f
changed to latitude and longitude tuples to north_lat,south_lat
sherrieF Aug 14, 2024
0e2abaf
adding tests for masking
sherrieF Aug 19, 2024
d841075
changed latitude region to north_lat and south_lat. Changes
sherrieF Aug 19, 2024
18630df
added VALID_REGION_BOUND_KEYS = ('min_lat', 'max_lat', 'west_lon', 'e…
sherrieF Aug 19, 2024
4c22eca
working on masking
sherrieF Aug 19, 2024
05b6d44
This class contains methods for processing masking requests.
sherrieF Aug 29, 2024
ea46379
working on masking
sherrieF Aug 29, 2024
a21ed0e
working on masking of land variables
sherrieF Aug 29, 2024
3bff0c4
small changes
sherrieF Aug 29, 2024
303a1f1
updates
sherrieF Sep 11, 2024
58054f1
updates
sherrieF Sep 11, 2024
9110b9c
This python script tests the return values of the daily_bfg harvester.
sherrieF Sep 11, 2024
1e9d163
This python script tests the return values of the daily_bfg harvester.
sherrieF Sep 11, 2024
3e91a6c
This python script tests the return values of the daily_bfg harvester.
sherrieF Sep 11, 2024
ef4df65
This python script tests the return values of the daily_bfg harvester.
sherrieF Sep 11, 2024
19fa49a
The following methods are in the python script stats_utils.py.
sherrieF Sep 12, 2024
62abd3d
The following methods are in the python script region_utils.py
sherrieF Sep 16, 2024
80aeb0e
The following methods are in the python script mask_utils.py
sherrieF Sep 16, 2024
eaba1b6
daily_bfg.py
sherrieF Sep 16, 2024
0df4676
Commiting all test files
sherrieF Sep 17, 2024
5d91e52
Changed the gridcell area weighting Netcdf file path to
sherrieF Sep 19, 2024
2553f67
got rid of the --pycache__ directory
sherrieF Sep 19, 2024
5108d13
Deleted the __pycache__ directory
sherrieF Sep 19, 2024
e1eb08b
Added the data direction which now has the gridcell area weights file
sherrieF Sep 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions README.md

This file was deleted.

Empty file removed src/score_hv/__init__.py
Empty file.
Empty file.
454 changes: 373 additions & 81 deletions src/score_hv/harvesters/daily_bfg.py
100755 → 100644

Large diffs are not rendered by default.

103 changes: 103 additions & 0 deletions src/score_hv/mask_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
"""
Copyright 2024 NOAA
All rights reserved.

Collection of methods to work with masking of user requested variables
"""
import sys,os
import numpy as np
import xarray as xr
import pytest
import pdb

VALID_MASKS = ['land','ocean','sea', 'ice']

class MaskCatalog:
def __init__(self,user_mask_value,soil_type_values):
"""
Here we initalize the MaskCatalog class.
"""
self.name = None
self.user_mask = user_mask_value
"""The variable name on the data set
that is used for masking is "sotyp".
The sotyp array values are passed in from the
calling routine where it is read from
the data set.
ocean = sotyp has vaues of 0
sea = sotyp has values of 0
ice = sotyp has values of 16
land = sotyp has values between ocean and ice.
"""
self.data_mask = soil_type_values

def initial_mask_of_variable(self,var_name,variable_data):
"""
This method does the initial masking special variables.
The land variables that are masked initially are:
soill4,soilm,soilt4 and tg3.
It uses the sotyp(soil type) variable in the data set.
Parameters:
var_name - The variable name.
varaiable_data - The initial variable data with no masking.
return - The variable data is returned with the ocean and ice
values set to missing.
"""

self.name = var_name
"""
The values of 0 and 16 in the sotyp variable are used to delete values
over ocean and ice.
"""
masked_data = variable_data.where((self.data_mask != 0) & (self.data_mask != 16))
return(masked_data)

def replace_bad_values_with_nan(self,variable_data):
"""
Check for _FillValue or missing_values in the variable data.
This method will replace the missing or fill values with
NaN.
Parameters;
variable_data - The variable field data. Variable_data is of
type class 'xarray.core.dataarray.DataArray.
Return - The variable field data with NaN's for where the
grid values are missing or fill values. The returned
masked_variable_data is of type
class 'xarray.core.dataarray.DataArray.
"""
fill_value = variable_data.encoding.get('_FillValue', None)
missing_value = variable_data.encoding.get('missing_value', None)

"""
Create a combined mask for fill_values,missing_value and non
finite values.
"""
if fill_value is not None and missing_value is not None:
mask = (variable_data != fill_value) & (variable_data != missing_value)
elif fill_value is not None:
mask = variable_data != fill_value
elif missing_value is not None:
mask = variable_data != missing_value
else:
mask = np.isfinite(variable_data)

# Apply the mask
masked_variable = variable_data.where(mask,np.nan)
return(masked_variable)


def user_mask_array(self,region_mask):
"""
The user has requested a mask. Here we keep only the
type of data that the user wants.
Parameters:
region_mask - This is the sotyp variable read in from the
bfg data file.
Return - The mask array is returned. This will contain
boolean values. True for grid points we want and False
for grid points we do not want.
"""
if 'land' in self.user_mask:
masked_array = ~region_mask.isin([0, 16])
return(masked_array)

242 changes: 242 additions & 0 deletions src/score_hv/region_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,242 @@
"""
Copyright 2024 NOAA
All rights reserved.

Collection of methods to work with user requested
geographic regions.
"""
import sys,os
import numpy as np
import xarray as xr
import math
from datetime import datetime
import pytest
import pdb
from netCDF4 import Dataset
from pathlib import Path
"""
The class GeoRegions has functions to allow the user to request specific
geographical regions.
"""

"""
Here we define default latitude and longitude ranges.
min lat is -90 and max lat is 90.
NOTE: Our longitude on the BFG files goes from 0 to 360 degrees east, circular.
west long is 0 and east_long is 360.
"""
NORTH_LAT = 90
SOUTH_LAT = -90
WEST_LONG = 0
EAST_LONG = 360

class GeoRegionsCatalog:
def __init__(self,dataset):
"""
Here we initalize the region class as a dictionary.
Parameter: datset - This is a dataset that has been
opened with xarray.
"""
self.name = []
self.north_lat = []
self.south_lat = []
self.west_long = []
self.east_long = []
self.region_indices = []
self.latitude_values = dataset['grid_yt'].values
self.longitude_values = dataset['grid_xt'].values

def test_user_latitudes(self,north_lat,south_lat):
"""
This method tests to make sure the latitude values
entered by the user are within the bounds of -90 to 90.
"""
if south_lat < -90 or north_lat > 90:
msg = f'southern latitude must be greater than -90. and less than 90. '\
f'south_lat: {south_lat}'
raise ValueError(msg)

if north_lat < -90 or north_lat > 90:
msg = f'northern latitude must be greater than -90 and less than 90. ' \
f'north_lat: {north_lat}'
raise ValueError(msg)

if south_lat > north_lat:
msg = f'southern latitude must be less than northern latitude ' \
f'south_lat: {south_lat}, north_lat: {north_lat}'
raise ValueError(msg)

def test_user_longitudes(self,west_long,east_long):
"""
This method tests to make sure the longitude values
entered by the user are within the bounds of 0 to 360.
"""
if east_long < 0 or east_long > 360:
msg = f'east longitude must be greater than 0 and less than 360 '\
f'east_long: {east_long}'
raise ValueError(msg)

if west_long < 0 or west_long > 360:
msg = f'west longitude must be greater than 0 and less than 360 '\
f'west_long: {west_lon}'
raise ValueError(msg)


def add_user_region(self,dictionary):
"""
Parameters:
dictionary - The dictionary should contain the following keys.
name - name of the region.
north_lat - northern most latitude.
south_lat - southern most latitude.
west_long - westmost longitude.
east_long - eastmost longitude.
If the dictionay is empty we exit with a message.
If the region name key word is missing then we exit with a message.
If either the latitude or the longitude keys are missing we supply
a default.
"""
if not dictionary:
msg = f'The dictionary passed in to add_user_region is empty'
raise ValueError(msg)

for region_name, input_dictionary in dictionary.items():
if not region_name:
msg = 'No region name was given. Please enter a name for your region.'
raise ValueError(msg)
self.name.append(region_name)

required_keywords = {'north_lat','south_lat','west_long','east_long'}
missing_keys = required_keywords - set(input_dictionary.keys())
if "north_lat" in missing_keys:
print("north_lat is missing. Using the default of north latitude = 90")
north_lat = NORTH_LAT
else:
north_lat = input_dictionary.get("north_lat")

if "south_lat" in missing_keys:
print("south_lat is missing. Using the default of south latitude = -90")
south_lat = SOUTH_LAT
else:
south_lat = input_dictionary.get("south_lat")

if "west_long" in missing_keys:
print("west_long is missing. Using the default of west longitude = 0")
west_long = WEST_LONG
else:
west_long = input_dictionary.get("west_long")

if "east_long" in missing_keys:
print("east_long is missing. Using the default of east longitude = 360")
east_long = EAST_LONG
else:
east_long = input_dictionary.get("east_long")

self.test_user_latitudes(north_lat,south_lat)
self.north_lat.append(north_lat)
self.south_lat.append(south_lat)

self.test_user_longitudes(west_long,east_long)
self.west_long.append(west_long)
self.east_long.append(east_long)

def get_region_indices(self,region_index):
"""
This method calculates the indices in the latitude values and
longitude values that are passed in from the calling function.
The latitude and longitude values assigned in the
method, add_user_region, above are used to determine the
indices.
Parameters:
region_index - The number of the region that the user
has defined by the key word-name. There
can be multiple regions.
returns - The lat_indices and lon_indices. These lists
return the start and end indices as tuples.
"""
num_long = len(self.longitude_values)
num_lat = len(self.latitude_values)
step_size = self.longitude_values[num_long-1]/num_long

# Find latitude indices within the region.
name = self.name[region_index]

north_lat = self.north_lat[region_index]
south_lat = self.south_lat[region_index]
latitude_indices = [index for index, lat in enumerate(self.latitude_values) if south_lat <= lat <= north_lat ]
if latitude_indices:
lat_start_index = latitude_indices[0]
lat_end_index = latitude_indices[-1]
else:
msg=f"No latitude values were found within the specified range of {south_lat} and {max_lat}."
raise KeyError(msg)


"""Find longitue indices within the region.
We have two cases to consider here.
Ranges that do not cross the Prime Meridian and ranges that do cross the
prime meridian. Our longitude array is 0 to 360 degrees east circular.
"""
east_long = self.east_long[region_index]
west_long = self.west_long[region_index]
if east_long <= west_long:
#region crosses the prime meridian. So we need to get two separate regions.
long1_start_index = int(west_long/step_size)
long1_end_index = num_long - 1
long2_start_index = 0
long2_end_index = int(east_long/step_size)
else:
long1_start_index = int(west_long / step_size)
long1_end_index = int(east_long / step_size)
long2_start_index = -999
long2_end_index = -999
return (lat_start_index,lat_end_index,long1_start_index,long1_end_index,long2_start_index,long2_end_index)

def get_region_data(self,region_index,data):
"""
This method returns the data which corresponds to the user selected
region depending on the region indicies. The region is a subset of the
original variable data.
The data.isel is a xarray method that selects a range of indices
along the grid_yt and grid_xt dimension.
Parameters:
region_index - The number of the region to be sub-regioned..
data - The full grid variable data to be sub-regioned based on the
start and ending indices returned from get_region_indices.
return - The variable data sub-setted into the region.
"""
lat_start_index,lat_end_index,long1_start_index,long1_end_index,long2_start_index, \
long2_end_index = self.get_region_indices(region_index)

# Check dimensions of the specific variable in the dataset
var_dims = data.dims
if 'time' in var_dims:
if long2_start_index != -999:
region1_data = data.isel(time=slice(None),
grid_yt=slice(lat_start_index, lat_end_index + 1),
grid_xt=slice(long1_start_index, long1_end_index))

region2_data = data.isel(time=slice(None),
grid_yt=slice(lat_start_index, lat_end_index + 1),
grid_xt=slice(long2_start_index, long2_end_index))
region_data = xr.concat([region1_data,region2_data],dim='grid_xt')

else:
region_data = data.isel(time=slice(None),
grid_yt=slice(lat_start_index, lat_end_index + 1),
grid_xt=slice(long1_start_index, long1_end_index))
else:
if long2_start_index != -999:
region1_data = data.isel(grid_yt=slice(lat_start_index, lat_end_index + 1),
grid_xt=slice(long1_start_index,long1_end_index))

region2_data = data.isel(grid_yt=slice(lat_start_index, lat_end_index + 1),
grid_xt=slice(long2_start_index,long2_end_index))

region_data = xr.concat([region1_data,region2_data],dim='grid_xt')
else:
region_data = data.isel(grid_yt=slice(lat_start_index, lat_end_index + 1),
grid_xt=slice(long1_start_index,long1_end_index))
return(region_data)


Loading