how to use metpy.calc.divergence with chunked Xarray? #2225
One possible workaround may be to get the grid spacing in meters or kilometers from the longitude and latitude coordinates of the xarray DataArray. It seems that [...] But it would be much better if there were a way to let [...] Thanks!
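A key issue here is the following:
As explained in the documentation, xr.apply_ufunc applies a function for unlabeled arrays. So, when used this way, mpcalc.divergence doesn't see an xarray object, but instead just the data inside, which means that none of MetPy's xarray coordinate-aware features will work. Also, you may have noticed that when you tried it this way, the pint.Quantity ended up inside the Dask array rather than the other way around, which comes from a failure of Dask to properly recognize upcast types. All in all, this approach won't work.
That being said, since your data are chunked with the horizontal being contiguous, there are several alternatives for making this work. (If instead your data were chunked in the horizontal, things would get a lot more complicated due to the need to overlap blocks.)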
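As a small illustration of why xr.apply_ufunc loses the coordinate information, the sketch below records what type the wrapped function actually receives. The helper name record_and_double and the toy q array are made up for this example; only xr.apply_ufunc itself comes from the discussion.

```python
import numpy as np
import xarray as xr

seen_types = []  # filled in by the wrapped function


def record_and_double(arr):
    # Hypothetical helper: note what kind of object arrives, then do some math.
    seen_types.append(type(arr))
    return arr * 2


# Toy stand-in for a field such as q, with labeled lat/lon coordinates.
da = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("lat", "lon"),
    coords={"lat": [10.0, 20.0], "lon": [100.0, 110.0, 120.0]},
    name="q",
)

result = xr.apply_ufunc(record_and_double, da)

# The function only ever saw a bare NumPy array, never the DataArray.
print(seen_types[0])
# xarray re-attaches the labels afterwards, but the function could not use them.
print(result.dims)
```

Because the function never sees lat and lon, a coordinate-aware calculation such as mpcalc.divergence has nothing to derive grid spacing from when called this way.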
MetPy's Dask support is very provisional; however, there is a chance that this could work as-is:
div = mpcalc.divergence(u10 * q, v10 * q)
If something in MetPy's internals isn't playing nicely with Dask, then a possible workaround is to let xarray split up the DataArrays based on the Dask chunks. This is a bit more complicated than it otherwise could be, because only one Dask-based object is permitted.
# xr.merge takes a list of objects, so wrap the two components in a list
inputs = xr.merge([(u10 * q).rename('x_component'), (v10 * q).rename('y_component')])
def wrapped_divergence(ds):
    # ds is the Dataset block for a single chunk
    return mpcalc.divergence(ds['x_component'], ds['y_component'])
div = xr.map_blocks(wrapped_divergence, inputs, template=q)
div.attrs = ...  # will likely need to fix the attrs based on difference from template
Based on pydata/xarray#4208, this should work, but if there is a
If the xarray > pint > Dask structure isn't working with either of the two methods mentioned above and you still want to try this with Dask, you may need to temporarily drop out of xarray data structures and do the calculation a bit more manually:
from metpy.xarray import grid_deltas_from_dataarray
# Compute the grid spacing once from the coordinates on q
dx, dy = grid_deltas_from_dataarray(q, kind='actual')
div = mpcalc.divergence((u10 * q).metpy.unit_array, (v10 * q).metpy.unit_array, dx=dx, dy=dy)
# Re-wrap the result with the original dimensions and coordinates
div = xr.DataArray(div, dims=q.dims, coords=q.coords)
There's a chance that having Dask arrays inside pint Quantities still won't play nicely with MetPy's calculations, since proper Dask support is still a work in progress and there are still some things being worked out in the ecosystem as a whole. Since you are just chunking over time in this example problem, you could instead design an alternative parallelization that loops over time, loading just one time step at a time, saving to disk between steps (such as appending to a Zarr store), and then collecting the results upon completion.
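The loop-over-time idea could look roughly like the sketch below. It is deliberately simplified and rests on several assumptions: plain NumPy arrays stand in for the xarray data, a hypothetical divergence_2d helper based on np.gradient on a uniform grid stands in for mpcalc.divergence, and per-step .npy files stand in for the Zarr store. In a real workflow the save step would more likely be something like Dataset.to_zarr with append_dim='time'.

```python
import tempfile
from pathlib import Path

import numpy as np


def divergence_2d(u, v, dx, dy):
    """Stand-in for mpcalc.divergence on a uniform grid: du/dx + dv/dy.

    Assumes axis 0 is y and axis 1 is x, with constant spacing dx and dy.
    """
    return np.gradient(u, dx, axis=1) + np.gradient(v, dy, axis=0)


def divergence_by_time_step(u, v, dx, dy, out_dir):
    """Process one time step at a time, saving each result to disk."""
    out_dir = Path(out_dir)
    paths = []
    for t in range(u.shape[0]):
        # Only the slice for this time step is used in the calculation.
        div_t = divergence_2d(u[t], v[t], dx, dy)
        path = out_dir / f"div_{t:04d}.npy"
        np.save(path, div_t)  # save between steps
        paths.append(path)
    # Collect the per-step results once everything is finished.
    return np.stack([np.load(p) for p in paths])


# Synthetic (time, y, x) fields with a known answer: du/dx = 1 and dv/dy = 2.
nt, ny, nx = 3, 4, 5
dx = dy = 1000.0
y, x = np.meshgrid(np.arange(ny) * dy, np.arange(nx) * dx, indexing="ij")
u = np.broadcast_to(x, (nt, ny, nx)).copy()
v = np.broadcast_to(2 * y, (nt, ny, nx)).copy()

with tempfile.TemporaryDirectory() as tmp:
    div = divergence_by_time_step(u, v, dx, dy, tmp)

print(div.shape)
```

Since each time step is independent, the loop body could also be handed to separate workers, as long as each worker writes to its own file or region of the store.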