Add dask_awkward.Array.__array_function__ to dispatch NumPy functions to their dask-awkward equivalents #490

Open
jpivarski opened this issue Mar 28, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@jpivarski
Collaborator

dask_awkward.Array currently has

  • __awkward_function__ to dispatch ak.* functions to their equivalent dak.* functions
  • __array_ufunc__ to properly handle NumPy ufuncs via NEP-13 (Awkward and dask-awkward equivalents are not explicitly implemented)

but no

  • __array_function__ to dispatch np.* functions to their equivalent dak.* functions via NEP-18.

For this reason, an eager or Coffea 0.7 analysis that is cavalier about the distinction between np.where and ak.where, for example, has to be changed by hand to call dak.where when it is ported to dask-awkward; NumPy will not find the dak.* equivalent automatically.
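To make the request concrete, here is a minimal sketch of NEP-18 dispatch on a toy class. It is not dask-awkward code: `LazyArray`, `implements`, `_unwrap`, and the `_IMPLEMENTATIONS` registry are hypothetical names invented for illustration, and the toy `where` computes eagerly with NumPy, whereas a real implementation would return a lazy collection.

```python
import numpy as np

# Hypothetical registry: NumPy function object -> replacement implementation.
_IMPLEMENTATIONS = {}


def implements(numpy_name):
    """Register the decorated function as the override for np.<numpy_name>."""
    def decorator(func):
        _IMPLEMENTATIONS[getattr(np, numpy_name)] = func
        return func
    return decorator


class LazyArray:
    """Toy stand-in for a collection type such as dask_awkward.Array."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook when a LazyArray is passed to an np.* function.
        impl = _IMPLEMENTATIONS.get(func)
        if impl is None:
            # No override registered: NumPy turns this into a TypeError.
            return NotImplemented
        return impl(*args, **kwargs)


def _unwrap(x):
    """Return the underlying NumPy array of a LazyArray, or x unchanged."""
    return x.data if isinstance(x, LazyArray) else x


@implements("where")
def where(condition, x, y):
    # Assumption: eager evaluation, only to keep the sketch self-contained.
    return LazyArray(np.where(_unwrap(condition), _unwrap(x), _unwrap(y)))


cond = LazyArray([True, False, True])
result = np.where(cond, LazyArray([1, 2, 3]), LazyArray([10, 20, 30]))
print(type(result).__name__, result.data.tolist())
```

With a hook like this, code written against np.where keeps working on the wrapped type, while NumPy functions that have no registered override fail loudly with a TypeError rather than silently converting the collection.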

There are 23 functions that would benefit from this immediately: of the 38 np.* functions that Awkward overrides,

% fgrep -r numpy.implements src/awkward
src/awkward/operations/ak_min.py:@ak._connect.numpy.implements("amin")
src/awkward/operations/ak_min.py:@ak._connect.numpy.implements("nanmin")
src/awkward/operations/ak_count_nonzero.py:@ak._connect.numpy.implements("count_nonzero")
src/awkward/operations/ak_mean.py:@ak._connect.numpy.implements("mean")
src/awkward/operations/ak_mean.py:@ak._connect.numpy.implements("nanmean")
src/awkward/operations/ak_concatenate.py:@ak._connect.numpy.implements("concatenate")
src/awkward/operations/ak_sum.py:@ak._connect.numpy.implements("sum")
src/awkward/operations/ak_sum.py:@ak._connect.numpy.implements("nansum")
src/awkward/operations/ak_argsort.py:@ak._connect.numpy.implements("argsort")
src/awkward/operations/ak_copy.py:@ak._connect.numpy.implements("copy")
src/awkward/operations/ak_nan_to_num.py:@ak._connect.numpy.implements("nan_to_num")
src/awkward/operations/ak_ones_like.py:@ak._connect.numpy.implements("ones_like")
src/awkward/operations/ak_zeros_like.py:@ak._connect.numpy.implements("zeros_like")
src/awkward/operations/ak_prod.py:@ak._connect.numpy.implements("prod")
src/awkward/operations/ak_prod.py:@ak._connect.numpy.implements("nanprod")
src/awkward/operations/ak_sort.py:@ak._connect.numpy.implements("sort")
src/awkward/operations/ak_var.py:@ak._connect.numpy.implements("var")
src/awkward/operations/ak_var.py:@ak._connect.numpy.implements("nanvar")
src/awkward/operations/ak_round.py:@ak._connect.numpy.implements("round")
src/awkward/operations/ak_ptp.py:@ak._connect.numpy.implements("ptp")
src/awkward/operations/ak_any.py:@ak._connect.numpy.implements("any")
src/awkward/operations/ak_imag.py:@ak._connect.numpy.implements("imag")
src/awkward/operations/ak_real.py:@ak._connect.numpy.implements("real")
src/awkward/operations/ak_broadcast_arrays.py:@ak._connect.numpy.implements("broadcast_arrays")
src/awkward/operations/ak_std.py:@ak._connect.numpy.implements("std")
src/awkward/operations/ak_std.py:@ak._connect.numpy.implements("nanstd")
src/awkward/operations/ak_isclose.py:@ak._connect.numpy.implements("isclose")
src/awkward/operations/ak_full_like.py:@ak._connect.numpy.implements("full_like")
src/awkward/operations/ak_all.py:@ak._connect.numpy.implements("all")
src/awkward/operations/ak_max.py:@ak._connect.numpy.implements("amax")
src/awkward/operations/ak_max.py:@ak._connect.numpy.implements("nanmax")
src/awkward/operations/ak_argmax.py:@ak._connect.numpy.implements("argmax")
src/awkward/operations/ak_argmax.py:@ak._connect.numpy.implements("nanargmax")
src/awkward/operations/ak_where.py:@ak._connect.numpy.implements("where")
src/awkward/operations/ak_angle.py:@ak._connect.numpy.implements("angle")
src/awkward/operations/ak_ravel.py:@ak._connect.numpy.implements("ravel")
src/awkward/operations/ak_argmin.py:@ak._connect.numpy.implements("argmin")
src/awkward/operations/ak_argmin.py:@ak._connect.numpy.implements("nanargmin")

dask-awkward is only missing 15 of them:

>>> import dask_awkward as dak
>>> for name in [
...     "amin",
...     "nanmin",
...     "count_nonzero",
...     "mean",
...     "nanmean",
...     "concatenate",
...     "sum",
...     "nansum",
...     "argsort",
...     "copy",
...     "nan_to_num",
...     "ones_like",
...     "zeros_like",
...     "prod",
...     "nanprod",
...     "sort",
...     "var",
...     "nanvar",
...     "round",
...     "ptp",
...     "any",
...     "imag",
...     "real",
...     "broadcast_arrays",
...     "std",
...     "nanstd",
...     "isclose",
...     "full_like",
...     "all",
...     "amax",
...     "nanmax",
...     "argmax",
...     "nanargmax",
...     "where",
...     "angle",
...     "ravel",
...     "argmin",
...     "nanargmin",
... ]:
...     if not hasattr(dak, name):
...         print(name)
... 
amin
nanmin
nanmean
nansum
nanprod
nanvar
round
imag
real
nanstd
amax
nanmax
nanargmax
angle
nanargmin

(4 of which were just added in scikit-hep/awkward#3053.)
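The counts quoted above (38 overridden, 15 missing, 23 immediately usable) can be cross-checked with plain Python, without importing dask-awkward. The two lists are copied from the output shown in this issue; the variable names are only illustrative.

```python
# Names registered with ak._connect.numpy.implements, from the fgrep output.
overridden = [
    "amin", "nanmin", "count_nonzero", "mean", "nanmean", "concatenate",
    "sum", "nansum", "argsort", "copy", "nan_to_num", "ones_like",
    "zeros_like", "prod", "nanprod", "sort", "var", "nanvar", "round",
    "ptp", "any", "imag", "real", "broadcast_arrays", "std", "nanstd",
    "isclose", "full_like", "all", "amax", "nanmax", "argmax", "nanargmax",
    "where", "angle", "ravel", "argmin", "nanargmin",
]

# Names that the hasattr(dak, name) loop printed as missing.
missing = [
    "amin", "nanmin", "nanmean", "nansum", "nanprod", "nanvar", "round",
    "imag", "real", "nanstd", "amax", "nanmax", "nanargmax", "angle",
    "nanargmin",
]

# Functions that could be dispatched right away, in the original order.
available = [name for name in overridden if name not in set(missing)]

print(len(overridden), len(missing), len(available))  # 38 15 23
print(available)
```

The `available` list is the set of names that an `__array_function__` implementation could forward to an existing dak.* attribute of the same name.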

@jpivarski jpivarski added the enhancement New feature or request label Mar 28, 2024