various minor changes, and updated opendap mapper for Arome Arctic forecast data #533

Open
wants to merge 71 commits into base: master

Commits (71)
56996cc
test that history metadata is added correctly
mortenwh Oct 23, 2020
7e5a1f6
updated baseURLs to s1 thredds
mortenwh Jun 11, 2021
fa71c83
Merge branch 'nansencenter:master' into master
mortenwh Jun 11, 2021
9677223
Merge branch 'nansencenter:master' into master
mortenwh Aug 12, 2021
fceb4b5
removed print line which hampers readability - the line was printed e…
mortenwh Jan 26, 2022
6d715b9
added exception handling of value error
mortenwh Jan 26, 2022
68c01b7
Merge branch 'master' of github.com:nansencenter/nansat
mortenwh Jan 26, 2022
1a53f4b
use != instead of is not. Mappername default changed to None would be…
mortenwh Jan 26, 2022
f754a8a
Merge branch 'master' of github.com:mortenwh/nansat
mortenwh Jan 26, 2022
d90e2e3
This is not an error. Might be a warning but it is still annoying, so…
mortenwh Jan 27, 2022
18fc477
Merge remote-tracking branch 'nersc/master'
mortenwh Sep 16, 2022
365ec55
Merge remote-tracking branch 'nersc/master'
mortenwh Dec 22, 2022
21e4334
#525: added function to export with xarray, some cleaning, and tests
mortenwh Dec 23, 2022
0ca6093
#525: ipdb lines were not meant to be committed..
mortenwh Dec 23, 2022
d1be8e9
#525: removed ipdb lines
mortenwh Dec 23, 2022
110789c
#525: remover xarray based export function and modified the export fu…
mortenwh Dec 25, 2022
2c45363
#525: removed unnecessary test
mortenwh Dec 25, 2022
d4d6495
#525: cleaned test code and added one test to demonstrate issue when …
mortenwh Dec 25, 2022
f276d0e
#525: static methods
mortenwh Dec 26, 2022
a671dc8
#525: adjusted netcdf-cf mapper to account for new attribute names wi…
mortenwh Dec 27, 2022
54f17ab
#525: gdal adds its own Conventions attribute with value 'CF-1.5' whi…
mortenwh Jan 4, 2023
cfc5a28
#525: history is cut by gdal CreateCopy. This needs to be overridden.
mortenwh Jan 5, 2023
5562633
get proj4 string from grid mapping variable
mortenwh Jun 1, 2023
f797b68
specify bands in opendap arome mapper since files are too big
mortenwh Jun 6, 2023
eb5e6fb
update to new netcdf attributes
mortenwh Jun 6, 2023
aebc93c
working mapper for opendap arome arctic
mortenwh Jun 13, 2023
7876ece
opendap-arome now works
mortenwh Jun 14, 2023
d1bffc6
update box
mortenwh Jun 21, 2023
1a62f3d
some error handling
mortenwh Jul 3, 2023
4ed3fae
Merge branch 'master' of github.com:mortenwh/nansat
mortenwh Jul 3, 2023
918e0ec
merged with nersc, and resolved conflicts
mortenwh Nov 20, 2023
6c77c9a
Merge remote-tracking branch 'nersc/master'
mortenwh Feb 27, 2024
3f488fb
start meps mapper
mortenwh Feb 27, 2024
0e8447f
Should work now but does not..
mortenwh Feb 28, 2024
4e71e74
update
mortenwh Feb 28, 2024
6cdea78
reorganise
mortenwh Feb 28, 2024
90e09f9
MEPS mapper works
mortenwh Mar 8, 2024
2e7dedf
Add MET Nordic mapper
mortenwh Apr 9, 2024
b26dedb
add source filename to metadata
mortenwh Apr 15, 2024
f4306c5
Now add MET Nordic mapper
mortenwh Apr 17, 2024
474f05b
Merge branch 'master' of github.com:mortenwh/nansat
mortenwh Apr 17, 2024
f43e9b5
fix encoding of metadata
mortenwh Apr 17, 2024
251dba1
reorganise and change apostrophes
mortenwh Apr 17, 2024
a9556bb
Handle non-string attributes
mortenwh Apr 17, 2024
cd92904
Remove warnings
mortenwh Apr 29, 2024
731e5fb
Make it possible to avoid creating gcps
mortenwh Apr 30, 2024
f410d5b
Add time metadata to bands
mortenwh Apr 30, 2024
913c7aa
allow missing history, and set timezone to utc
mortenwh May 3, 2024
7097cfd
find correct nc-file based on input time
mortenwh May 4, 2024
e222105
Assert correct gcp shape
mortenwh May 13, 2024
a3457f1
better handle input dict
mortenwh May 13, 2024
cdb110d
remove ipdb lines
mortenwh May 13, 2024
03e579a
bug fix
mortenwh May 14, 2024
a559d61
started
mortenwh May 14, 2024
2744857
Fix gcp shape by looping
mortenwh May 15, 2024
d0d4f72
improve gcp reading
mortenwh May 16, 2024
a7c92af
log gcp shape
mortenwh May 16, 2024
09c6079
Init from lonlat
mortenwh Jun 7, 2024
cf52890
Merge branch 'master' of github.com:mortenwh/nansat
mortenwh Jun 7, 2024
7c5af87
Use lat/lon grids for initialization
mortenwh Jun 9, 2024
a6af99f
bug fix?
mortenwh Jun 20, 2024
ed63700
bug fix
mortenwh Jun 25, 2024
4be193c
resolve conflict
mortenwh Jun 25, 2024
7e576c3
Adjust gcp sizes
mortenwh Jun 25, 2024
3e98589
get rid of potential norwegian characters
mortenwh Aug 30, 2024
ae61f24
Merge branch 'master' of github.com:mortenwh/nansat
mortenwh Aug 30, 2024
6b244c6
new mapper
mortenwh Oct 25, 2024
d17666b
Merge branch 'master' of github.com:mortenwh/nansat
mortenwh Oct 25, 2024
b1d7ede
correct docstring
mortenwh Oct 29, 2024
191914f
fix docstring
mortenwh Oct 29, 2024
2d634ce
reorganise arrays
mortenwh Oct 30, 2024
4 changes: 2 additions & 2 deletions Vagrantfile
@@ -6,8 +6,8 @@ VAGRANTFILE_API_VERSION = "2"
 
 Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
 
-  config.vm.box = "ubuntu/trusty64"
-  config.vm.box_url = "https://atlas.hashicorp.com/ubuntu/trusty64"
+  config.vm.box = "ubuntu/bionic64"
+  #config.vm.box_url = "https://atlas.hashicorp.com/ubuntu/bionic64"
 
   config.vm.define "nansat", primary: true do |nansat|
   end
4 changes: 2 additions & 2 deletions nansat/domain.py
@@ -526,9 +526,9 @@ def _create_extent_dict(extent_str):
                 raise ValueError('<extent_dict> must contains exactly 2 parameters '
                                  '("-te" or "-lle") and ("-ts" or "-tr")')
             key, extent_dict = Domain._add_to_dict(extent_dict, option)
-            if key is 'te' or key is 'lle':
+            if key == 'te' or key == 'lle':
Review comment (Member):
This change highlights a problem in this test:
the arguments to -te are given in the wrong order.
Could you change "-te -180 180 60 90 -ts 500 500" to "-te -180 60 180 90 -ts 500 500"?

                 Domain._validate_te_lle(extent_dict[key])
-            elif key is 'ts' or key is 'tr':
+            elif key == 'ts' or key == 'tr':
                 Domain._validate_ts_tr(extent_dict[key])
 
         return extent_dict
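To make the review comment above concrete, here is a minimal sketch of the corrected extent string in use. It assumes the usual Domain(srs, ext) constructor and that "-te" takes the extent as xmin ymin xmax ymax (as in gdalwarp); the numbers are the ones suggested by the reviewer.

from nansat.domain import Domain

# "-te" expects xmin ymin xmax ymax, so the corrected test string is
# "-te -180 60 180 90" (lon 180W to 180E, lat 60N to 90N),
# not "-te -180 180 60 90".
d = Domain(srs="+proj=longlat +datum=WGS84 +no_defs",
           ext="-te -180 60 180 90 -ts 500 500")
print(d)  # should describe a 500x500 grid covering the corrected extent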
49 changes: 34 additions & 15 deletions nansat/exporter.py
@@ -15,9 +15,11 @@
 from __future__ import print_function, absolute_import, division
 
 import os
+import pytz
 import tempfile
 import datetime
 import warnings
+import importlib
 
 from nansat.utils import gdal
 import numpy as np
@@ -30,6 +32,11 @@
 
 from nansat.exceptions import NansatGDALError
 
+try:
+    import xarray as xr
+except:
+    warnings.warn("'xarray' needs to be installed for Exporter.xr_export to work.")
+
 
 class Exporter(object):
     """Abstract class for export functions """
@@ -40,7 +47,7 @@ class Exporter(object):
                          '_FillValue', 'type', 'scale', 'offset', 'NETCDF_VARNAME']
 
     def export(self, filename='', bands=None, rm_metadata=None, add_geolocation=True,
-               driver='netCDF', options='FORMAT=NC4', hardcopy=False):
+               driver='netCDF', options='FORMAT=NC4', hardcopy=False, add_gcps=True):
         """Export Nansat object into netCDF or GTiff file
 
         Parameters
@@ -106,10 +113,11 @@ def export(self, filename='', bands=None, rm_metadata=None, add_geolocation=True
         if self.filename == filename or hardcopy:
             export_vrt.hardcopy_bands()
 
-        if driver == 'GTiff':
-            add_gcps = export_vrt.prepare_export_gtiff()
-        else:
-            add_gcps = export_vrt.prepare_export_netcdf()
+        if add_gcps:
+            if driver == 'GTiff':
+                add_gcps = export_vrt.prepare_export_gtiff()
+            else:
+                add_gcps = export_vrt.prepare_export_netcdf()
 
         # Create output file using GDAL
         dataset = gdal.GetDriverByName(driver).CreateCopy(filename, export_vrt.dataset,
@@ -123,25 +131,33 @@ def export(self, filename='', bands=None, rm_metadata=None, add_geolocation=True
         # Rename variable names to get rid of the band numbers
         self.rename_variables(filename)
         # Rename attributes to get rid of "GDAL_" added by gdal
-        self.rename_attributes(filename)
+        try:
+            history = self.vrt.dataset.GetMetadata()['history']
+        except KeyError:
+            history = None
+        self.correct_attributes(filename, history=history)
 
         self.logger.debug('Export - OK!')
 
     @staticmethod
-    def rename_attributes(filename):
+    def correct_attributes(filename, history=None):
         """ Rename global attributes to get rid of the "GDAL_"-string
-        added by gdal.
+        added by gdal, remove attributes added by gdal that are
+        already present in the Nansat object, and correct the history
+        attribute (the latter may be reduced in length because
+        gdal.GetDriverByName(driver).CreateCopy limits the string
+        length to 161 characters).
         """
         GDAL = "GDAL_"
         del_attrs = []
         rename_attrs = []
         # Open new file to edit attribute names
         with Dataset(filename, 'r+') as ds:
             """ The netcdf driver adds the Conventions attribute with
-            value CF-1.5. This may be wrong, so it is better to use the
-            Conventions metadata from the Nansat object. Other attributes
-            added by gdal that are already present in Nansat, should also
-            be deleted."""
+            value CF-1.5. This is in most cases wrong, so it is better
+            to use the Conventions metadata from the Nansat object.
+            Other attributes added by gdal that are already present in
+            Nansat should also be deleted."""
             for attr in ds.ncattrs():
                 if GDAL in attr:
                     if attr.replace(GDAL, "") in ds.ncattrs():
@@ -157,6 +173,9 @@ def rename_attributes(filename):
             # Rename attributes:
             for attr in rename_attrs:
                 ds.renameAttribute(attr, attr.replace(GDAL, ""))
+            # Correct the history
+            if history is not None:
+                ds.history = history
 
     @staticmethod
     def rename_variables(filename):
@@ -354,7 +373,7 @@ def _create_dimensions(self, nc_inp, nc_out, time):
             nc_out.createDimension(dim_name, dim_shapes[dim_name])
 
         # create value for time variable
-        td = time - datetime.datetime(1900, 1, 1)
+        td = time - datetime.datetime(1900, 1, 1).replace(tzinfo=pytz.timezone("utc"))
         days = td.days + (float(td.seconds) / 60.0 / 60.0 / 24.0)
         # add time dimension
         nc_out.createDimension('time', 1)
@@ -418,8 +437,8 @@ def _post_proc_thredds(self,
             fill_value = None
             if '_FillValue' in inp_var.ncattrs():
                 fill_value = inp_var._FillValue
-            if '_FillValue' in band_metadata[inp_var_name]:
-                fill_value = band_metadata['_FillValue']
+            elif '_FillValue' in band_metadata[inp_var_name]:
+                fill_value = band_metadata[inp_var_name]['_FillValue']
             dimensions = ('time', ) + inp_var.dimensions
             out_var = Exporter._copy_nc_var(inp_var, nc_out, inp_var_name, inp_var.dtype,
                                             dimensions, fill_value=fill_value, zlib=zlib)
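Below is a short usage sketch of the new add_gcps flag added to Exporter.export; the input and output filenames are hypothetical, and the behaviour described in the comments is the one introduced by this diff.

from nansat import Nansat

n = Nansat("input_swath.nc")  # hypothetical GCP-based input file

# Default behaviour, unchanged: GCPs are prepared before gdal CreateCopy.
n.export("with_gcps.nc")

# New in this PR: skip the GCP preparation step entirely, e.g. when the
# data are on a regular grid and GCPs would only add overhead.
n.export("without_gcps.nc", add_gcps=False)

# After CreateCopy, correct_attributes() strips the "GDAL_" prefixes, drops
# gdal-added duplicates such as Conventions, and restores the full history
# string that CreateCopy may have truncated.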
20 changes: 9 additions & 11 deletions nansat/mappers/mapper_arome.py
@@ -8,23 +8,21 @@
 
 class Mapper(NetcdfCF):
 
-    def __init__(self, *args, **kwargs):
+    def __init__(self, filename, gdal_dataset, gdal_metadata, *args, **kwargs):
 
-        mm = args[2] # metadata
-        if not mm:
+        if not gdal_metadata:
             raise WrongMapperError
-        if 'NC_GLOBAL#source' not in list(mm.keys()):
+        if 'source' not in list(gdal_metadata.keys()):
             raise WrongMapperError
-        if not 'arome' in mm['NC_GLOBAL#source'].lower() and \
-                not 'meps' in mm['NC_GLOBAL#source'].lower():
+        if not 'arome' in gdal_metadata['title'].lower():
             raise WrongMapperError
 
-        super(Mapper, self).__init__(*args, **kwargs)
+        super(Mapper, self).__init__(filename, gdal_dataset, gdal_metadata, *args, **kwargs)
 
-        self.dataset.SetMetadataItem('time_coverage_start',
-                                     (parse(mm['NC_GLOBAL#min_time'], ignoretz=True, fuzzy=True).isoformat()))
-        self.dataset.SetMetadataItem('time_coverage_end',
-                                     (parse(mm['NC_GLOBAL#max_time'], ignoretz=True, fuzzy=True).isoformat()))
+        #self.dataset.SetMetadataItem('time_coverage_start', parse(
+        #    gdal_metadata['NC_GLOBAL#min_time'], ignoretz=True, fuzzy=True).isoformat())
+        #self.dataset.SetMetadataItem('time_coverage_end', parse(
+        #    gdal_metadata['NC_GLOBAL#max_time'], ignoretz=True, fuzzy=True).isoformat()))
 
         # Get dictionary describing the instrument and platform according to
         # the GCMD keywords
112 changes: 112 additions & 0 deletions nansat/mappers/mapper_meps.py
@@ -0,0 +1,112 @@
import json
import pytz
import datetime

import pythesint as pti

from osgeo import gdal
from pyproj import CRS
from netCDF4 import Dataset

from nansat.nsr import NSR
from nansat.exceptions import WrongMapperError
from nansat.mappers.mapper_netcdf_cf import Mapper as NetcdfCF
from nansat.mappers.opendap import Opendap


class Mapper(NetcdfCF, Opendap):

    def __init__(self, url, gdal_dataset, gdal_metadata, file_num=0, bands=None, *args,
                 **kwargs):

        if not url.endswith(".nc"):
            raise WrongMapperError

        try:
            ds = Dataset(url)
        except OSError:
            raise WrongMapperError

        if "title" not in ds.ncattrs() or "meps" not in ds.getncattr("title").lower():
            raise WrongMapperError

        metadata = {}
        for attr in ds.ncattrs():
            content = ds.getncattr(attr)
            if isinstance(content, str):
                content = content.replace("æ", "ae").replace("ø", "oe").replace("å", "aa")
            metadata[attr] = content

        self.input_filename = url

        xsize = ds.dimensions["x"].size
        ysize = ds.dimensions["y"].size

        # Pick 10 meter height dimension only
        height_dim = "height6"
        if height_dim not in ds.dimensions.keys():
            raise WrongMapperError
        if ds.dimensions[height_dim].size != 1:
            raise WrongMapperError
        if ds.variables[height_dim][0].data != 10:
            raise WrongMapperError

        varnames = []
        for var in ds.variables:
            var_dimensions = ds.variables[var].dimensions
            if var_dimensions == ("time", height_dim, "y", "x"):
                varnames.append(var)

        # Projection
        try:
            grid_mapping = ds.variables[ds.variables[varnames[0]].grid_mapping]
        except KeyError:
            raise WrongMapperError

        grid_mapping_dict = {}
        for index in grid_mapping.ncattrs():
            grid_mapping_dict[index] = grid_mapping.getncattr(index)
        crs = CRS.from_cf(grid_mapping_dict)
        nsr = NSR(crs.to_proj4())

        # Geotransform
        xx = ds.variables["x"][0:2]
        yy = ds.variables["y"][0:2]
        gtrans = xx[0], xx[1]-xx[0], 0, yy[0], 0, yy[1]-yy[0]

        self._init_from_dataset_params(xsize, ysize, gtrans, nsr.wkt)

        meta_dict = []
        if bands is None:
            bands = varnames
        for band in bands:
            if band not in ds.variables.keys():
                continue
            dimension_names, dim_sizes = self._get_dimension_info(band)
            self._pop_spatial_dimensions(dimension_names)
            index = self._get_index_of_dimensions(dimension_names, {}, dim_sizes)
            fn = self._get_sub_filename(url, band, dim_sizes, index)
            band_metadata = self.get_band_metadata_dict(fn, ds.variables[band])
            # Add time stamp to band metadata
            tt = datetime.datetime.fromisoformat(str(self.times()[index["time"]["index"]]))
            if tt.tzinfo is None:
                tt = pytz.utc.localize(tt)
            band_metadata["dst"]["time"] = tt.isoformat()
            meta_dict.append(band_metadata)

        self.create_bands(meta_dict)

        # Copy metadata
        for key in metadata.keys():
            self.dataset.SetMetadataItem(str(key), str(metadata[key]))

        # Get dictionary describing the instrument and platform according to
        # the GCMD keywords
        mm = pti.get_gcmd_instrument("computer")
        ee = pti.get_gcmd_platform("models")

        self.dataset.SetMetadataItem("instrument", json.dumps(mm))
        self.dataset.SetMetadataItem("platform", json.dumps(ee))

        # Set input filename
        self.dataset.SetMetadataItem("nc_file", self.input_filename)
40 changes: 40 additions & 0 deletions nansat/mappers/mapper_meps_ncml.py
@@ -0,0 +1,40 @@
import os
import pytz
import netCDF4
import datetime

import numpy as np

from nansat.exceptions import WrongMapperError
from nansat.mappers.mapper_meps import Mapper as NCMapper

class Mapper(NCMapper):


    def __init__(self, ncml_url, gdal_dataset, gdal_metadata, netcdf_dim=None, *args, **kwargs):

        if not ncml_url.endswith(".ncml"):
            raise WrongMapperError

        dt = datetime.timedelta(0)
        if netcdf_dim is not None and "time" in netcdf_dim.keys():
            ds = netCDF4.Dataset(ncml_url)
            time = netcdf_dim["time"].astype(datetime.datetime).replace(
                tzinfo=pytz.timezone("utc"))
            dt = time - datetime.datetime.fromisoformat(ds.time_coverage_start.replace(
                "Z", "+00:00"))
        url = self._get_odap_url(ncml_url, np.round(dt.total_seconds()/3600))

        super(Mapper, self).__init__(url, gdal_dataset, gdal_metadata, *args, **kwargs)


    def _get_odap_url(self, fn, file_num=0):
        """ Get the opendap url to file number 'file_num'. The
        default file number is 0, and yields the forecast time.
        """
        url = (
            "" + os.path.split(fn)[0] + "/member_%02d"
            "/meps_" + os.path.basename(fn).split("_")[2] +
            "_%02d_" + os.path.basename(fn).split("_")[3][:-2]
        ) % (int(os.path.basename(fn)[8:11]), file_num)
        return url
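The time handling in __init__ above reduces to counting whole hours between the requested time and the dataset's time_coverage_start; that hour count is then passed to _get_odap_url as the file number. A self-contained sketch of that computation, with illustrative values:

import datetime

import numpy as np
import pytz

# Requested time, as Nansat's netcdf_dim argument would carry it.
requested = np.datetime64("2024-05-04T06:00:00")

# time_coverage_start as stored in the global attributes (illustrative).
time_coverage_start = "2024-05-04T00:00:00Z"

time = requested.astype(datetime.datetime).replace(tzinfo=pytz.timezone("utc"))
start = datetime.datetime.fromisoformat(time_coverage_start.replace("Z", "+00:00"))

file_num = int(np.round((time - start).total_seconds() / 3600))
print(file_num)  # 6 -> the mapper picks the file six hours into the run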