Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3-d spherical regridding and regridding to DSG feature types #726

Merged
merged 31 commits into from
Mar 6, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions Changelog.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,12 @@ version NEXT

**2024-??-??**

* Added spherical regridding to discrete sampling geometry destination
grids (https://github.com/NCAS-CMS/cf-python/issues/716)
* Added 3-d spherical regridding to `cf.Field.regrids`, and the option
to regrid the vertical axis in logarithmic coordinates to
`cf.Field.regrids` and `cf.Field.regridc`
(https://github.com/NCAS-CMS/cf-python/issues/715)
* Fix misleading error message when it is not possible to create area
weights requested from `cf.Field.collapse`
(https://github.com/NCAS-CMS/cf-python/issues/731)
Expand Down
24 changes: 22 additions & 2 deletions cf/data/dask_regrid.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,8 +54,7 @@ def regrid(
source grid cells i. Note that it is typical that for a
given j most w_ji will be zero, reflecting the fact only a
few source grid cells intersect a particular destination
grid cell. I.e. *weights* is typically a very sparse
matrix.
grid cell. I.e. *weights* is usually a very sparse matrix.

If the destination grid has masked cells, either because
it spans areas outside of the source grid, or by selection
Expand Down Expand Up @@ -314,6 +313,27 @@ def regrid(
# [0,1,3,4,2]
raxis0, raxis = axis_order[-2:]
axis_order = [i if i <= raxis else i - 1 for i in axis_order[:-1]]
elif n_src_axes == 3 and n_dst_axes == 1:
# The regridding operation decreased the number of data axes
# by 2 => modify 'axis_order' to remove the removed axes.
#
# E.g. regular Z-lat-lon -> DSG could change 'axis_order' from
# [0,2,5,1,3,4] to [0,2,3,1], or [0,2,4,5,3,1] to
# [0,1,2,3]
raxis0, raxis1 = axis_order[-2:]
if raxis0 > raxis1:
raxis0, raxis1 = raxis1, raxis0

new = []
for i in axis_order[:-2]:
if i <= raxis0:
new.append(i)
elif raxis0 < i <= raxis1:
new.append(i - 1)
else:
new.append(i - 2)

axis_order = new
elif n_src_axes != n_dst_axes:
raise ValueError(
f"Can't (yet) regrid from {n_src_axes} dimensions to "
Expand Down
14 changes: 12 additions & 2 deletions cf/data/data.py
Original file line number Diff line number Diff line change
Expand Up @@ -3840,7 +3840,7 @@ def _regrid(
The positions of the regrid axes in the data, given in
the relative order expected by the regrid
operator. For spherical regridding this order is [Y,
X].
X] or [Z, Y, X].

*Parameter example:*
``[2, 3]``
Expand Down Expand Up @@ -3902,7 +3902,7 @@ def _regrid(
new_axis.extend(range(n + 1, n + n_sizes))
n += n_sizes - 1
else:
regridded_chunks.extend(c)
regridded_chunks.append(c)

n += 1

Expand Down Expand Up @@ -3948,6 +3948,16 @@ def _regrid(
min_weight=min_weight,
)

# Performance note:
#
# The function 'regrid_func' is copied into every Dask
# task. If we included the large 'weights_dst_mask' in the
# 'partial' definition then it would also be copied to every
# task, which "will start to be a pain in a few parts of the
# pipeline" definition. Instead we can pass it in via a
# keyword argument to 'map_blocks'.
# github.com/pangeo-data/pangeo/issues/334#issuecomment-403787663

dx = dx.map_blocks(
regrid_func,
weights_dst_mask=weights_dst_mask,
Expand Down
128 changes: 73 additions & 55 deletions cf/docstring/docstring.py
Original file line number Diff line number Diff line change
Expand Up @@ -115,43 +115,45 @@
* ``'bilinear'``: Deprecated alias for ``'linear'``.

* ``'conservative_1st'``: First order conservative
interpolation. Preserves the area integral of the
data across the interpolation from source to
destination. It uses the proportion of the area of
the overlapping source and destination cells to
determine appropriate weights.
interpolation. Preserves the integral of the source
field across the regridding. Weight calculation is
based on the ratio of source cell area overlapped
with the corresponding destination cell area.

* ``'conservative'``: Alias for ``'conservative_1st'``

* ``'conservative_2nd'``: Second-order conservative
interpolation. As with first order conservative
interpolation, preserves the area integral of the
field between source and destination using a
weighted sum, with weights based on the
proportionate area of intersection. In addition the
second-order conservative method takes the source
gradient into account, so it yields a smoother
destination field that typically better matches the
source data.

* ``'patch'`` Patch recovery interpolation. A second
degree 2-d polynomial regridding method, which uses
a least squares algorithm to calculate the
polynomials. This method typically results in
better approximations to values and derivatives when
interpolation. Preserves the integral of the source
field across the regridding. Weight calculation is
based on the ratio of source cell area overlapped
with the corresponding destination cell area. The
second-order conservative calculation also includes
the gradient across the source cell, so in general
it gives a smoother, more accurate representation of
the source field. This is particularly true when
going from a coarse to finer grid.

* ``'patch'`` Patch recovery interpolation. Patch
rendezvous method of taking the least squares fit of
the surrounding surface patches. This is a higher
order method that may produce interpolation weights
that may be slightly less than 0 or slightly greater
than 1. This method typically results in better
approximations to values and derivatives when
compared to bilinear interpolation.

* ``'nearest_stod'``: Nearest neighbour interpolation
for which each destination point is mapped to the
closest source point. Useful for extrapolation of
categorical data. Some destination cells may be
unmapped.
* ``'nearest_stod'``: Nearest neighbour source to
destination interpolation for which each destination
point is mapped to the closest source point. A
source point can be mapped to multiple destination
points. Useful for regridding categorical data.

* ``'nearest_dtos'``: Nearest neighbour interpolation
for which each source point is mapped to the
destination point. Useful for extrapolation of
categorical data. All destination cells will be
mapped.
* ``'nearest_dtos'``: Nearest neighbour destination to
source interpolation for which each source point is
mapped to the closest destination point. A
destination point can be mapped to multiple source
points. Some destination points may not be
mapped. Useful for regridding of categorical data.

* `None`: This is the default and can only be used
when *dst* is a `RegridOperator`.""",
Expand Down Expand Up @@ -493,7 +495,9 @@

The computation of the weights can be much more costly
than the regridding itself, in which case reading
pre-calculated weights can improve performance.""",
pre-calculated weights can improve performance.

Ignored if *dst* is a `RegridOperator`.""",
# aggregated_units
"{{aggregated_units: `str` or `None`, optional}}": """aggregated_units: `str` or `None`, optional
The units of the aggregated array. Set to `None` to
Expand Down Expand Up @@ -587,6 +591,20 @@
"{{weights auto: `bool`, optional}}": """auto: `bool`, optional
If True then return `False` if weights can't be found,
rather than raising an exception.""",
# ln_z
"{{ln_z: `bool` or `None`, optional}}": """ln_z: `bool` or `None`, optional
If True when *z*, *src_z* or *dst_z* are also set,
calculate the vertical component of the regridding
weights using the natural logarithm of the vertical
coordinate values. This option should be used if the
quantity being regridded varies approximately linearly
with logarithm of the vertical coordinates. If False,
then the weights are calculated using unaltered
vertical values. If `None`, the default, then an
exception is raised if any of *z*, *src_z* or *dst_z*
have also been set.

Ignored if *dst* is a `RegridOperator`.""",
# pad_width
"{{pad_width: sequence of `int`, optional}}": """pad_width: sequence of `int`, optional
Number of values to pad before and after the edges of
Expand All @@ -603,30 +621,30 @@
True, or a tuple of both if *item* is True.""",
# regrid RegridOperator
"{{regrid RegridOperator}}": """* `RegridOperator`: The grid is defined by a regrid
operator that has been returned by a previous call
with the *return_operator* parameter set to True.

Unlike the other options, for which the regrid
weights need to be calculated, the regrid operator
already contains the weights. Therefore, for cases
where multiple fields with the same source grids
need to be regridded to the same destination grid,
using a regrid operator can give performance
improvements by avoiding having to calculate the
weights for each source field. Note that for the
other types of *dst* parameter, the calculation of
the regrid weights is not a lazy operation.

.. note:: The source grid of the regrid operator is
immediately checked for compatibility with
the grid of the source field. By default
only the computationally cheap tests are
performed (checking that the coordinate
system, cyclicity and grid shape are the
same), with the grid coordinates not being
checked. The coordinates check will be
carried out, however, if the
*check_coordinates* parameter is True.""",
operator that has been returned by a previous call
with the *return_operator* parameter set to True.

Unlike the other options, for which the regrid weights
need to be calculated, the regrid operator already
contains the weights. Therefore, for cases where
multiple fields with the same source grids need to be
regridded to the same destination grid, using a regrid
operator can give performance improvements by avoiding
having to calculate the weights for each source
field. Note that for the other types of *dst*
parameter, the calculation of the regrid weights is
not a lazy operation.

.. note:: The source grid of the regrid operator is
immediately checked for compatibility with
the grid of the source field. By default
only the computationally cheap tests are
performed (checking that the coordinate
system, cyclicity and grid shape are the
same), with the grid coordinates not being
checked. The coordinates check will be
carried out, however, if the
*check_coordinates* parameter is True.""",
# Returns cfa_file_substitutions
"{{Returns cfa_file_substitutions}}": """The CFA-netCDF file name substitutions in a dictionary
whose key/value pairs are the file name parts to be
Expand Down
Loading