Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make rotate available #213

Merged
merged 27 commits into from
Feb 21, 2024
Merged

Make rotate available #213

merged 27 commits into from
Feb 21, 2024

Conversation

martinschorb
Copy link
Contributor

This adds the rotate function to ndimage.

I am struggling with the tests.
#210

@jakirkham
Copy link
Member

cc @m-albert (in case this is of interest 🙂)

@m-albert
Copy link
Collaborator

m-albert commented May 7, 2021

@martinschorb Thanks a lot for sharing your dask implementation of rotate and this PR!

It's great that you added tests. I'm copying some of your comments from #210 here to address them in this thread.

I have some weird behaviour that I don't fully understand (this is my first time using pytest).
when I do not pass output_chunks on to affine_transform the test run fine, except for the timeouts. I always get Unknown pytest.mark.timeout - is this a typo?. Something seems to be missing in my pytest configuration because the timeout tests run fine with your affine_transform.

I had a look at the code and think that the problem regarding the timeouts is the following: rotate in it's scipy implementation, as opposed to affine_transform, doesn't allow setting an output_shape. Therefore, in the test test_rotate_large_input_small_output_cpu it takes very long to create a large output array that is of the same size as the input array but with chunksize 1. In the analogous affine_transform test, the size of the output array is set to [1,1,1] and therefore there's no time spent on creating a large output array.

These two ways of going ahead would come to my mind:

  1. Stick to the scipy API and don't support differently shaped output arrays (and exclude the above test) or
  2. Include an output_shape argument to dask_image.ndinterp.rotate as this could be a useful functionality.

I'd tend to the second option, as anyways the dask_image API doesn't exactly mirror the scipy.ndimage API here. On the other hand, somebody requiring this functionality could simply use affine_transform with equivalent parameters.

when I pass output_chunks without any modifications, many tests fail:
The reason for this: the images from ndimage and dask-image simply are not the same. There is a very small number of voxels that significantly differ.
Why is that the case, when the only thing that differs is output_chunks being present?

I looked into the tests and found that the problem comes from affine_transform. Namely, the choice of transformation parameters in your test seem to require a slightly larger slice of the input array for a given output chunk. Despite testing using different matrices, this hadn't been caught by existing tests.

I fixed this behaviour here #216 and improved the affine_transform tests. If you include the small change the PR introduces, these tests should pass for you, too.

@martinschorb
Copy link
Contributor Author

Hi Marvin,

  • Stick to the scipy API and don't support differently shaped output arrays (and exclude the above test) or

  • Include an output_shape argument to dask_image.ndinterp.rotate as this could be a useful functionality.

The reason why there is no way of explicitly setting output_shape is that for rotate there are two main use cases. These are defined by the reshape parameter and it will set the output size to what is required to keep all the transformed image in the FOV.

Probably we could deal with both options and also enable output_shape (that would include checking the offset calculation). Then I would think of a good way of also testing for the reshape parameter.

What is the proper way of doing this? I change it in my repo and create a new PR?

Cheers

@m-albert
Copy link
Collaborator

m-albert commented May 7, 2021

Hi Martin,

The reason why there is no way of explicitly setting output_shape is that for rotate there are two main use cases. These are defined by the reshape parameter and it will set the output size to what is required to keep all the transformed image in the FOV.

True that reshape affects the output shape.

Probably we could deal with both options and also enable output_shape (that would include checking the offset calculation). Then I would think of a good way of also testing for the reshape parameter.

I agree that the best would probably be to have both arguments there and functional. Of course, conflicting parameters such as reshape=True together with a given output_shape would need to lead to an exception.

What is the proper way of doing this? I change it in my repo and create a new PR?

Any changes and commits to your branch martinschorb:main are automatically included here in the PR. Is this what you meant?

One thing I'm wondering about is whether the testing strategy is the most convenient like this. Given that a large part of the tests actually tests the functionality of affine_transform, maybe there'd be a way of reducing this overlap. This would come in handy especially as parallel PRs such as #215 are currently improving affine_transform and also significantly changing its tests. Hmm not sure yet what the best solution here would be.

Cheers!

@martinschorb
Copy link
Contributor Author

martinschorb commented May 10, 2021

I am still getting slightly wrong results when comparing an image that is transformed with the output_shape set and the respective sub-area of a transformed image without. It looks like some interpolation is screwing up things...

from scipy import  misc
from dask_image.ndinterp import rotate
img = da.from_array(misc.ascent())
img = img[::4,::4]

im2 = rotate(img,angle)

output_shape = (9,9)
cx=img.shape[0]/2
cy=img.shape[1]/2
rx=output_shape[0]/2
ry=output_shape[1]/2

im3 = rotate(img,angle,output_shape = output_shape)
im4 = im2[cx-np.floor(rx):cx+np.ceil(rx),cy-np.floor(ry):cy+np.ceil(ry)]

these have clearly different values and it is not just a translation of pixels...

I'd love to have this checked by a test as well, but first need to understand where this difference comes from.

@m-albert
Copy link
Collaborator

Hey @martinschorb, I just found back our conversation here :)

It seems that back then I was a bit confused on what was going on with the output_shape parameter, I'm sorry for that!

@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@m-albert
Copy link
Collaborator

these have clearly different values and it is not just a translation of pixels...
I'd love to have this checked by a test as well, but first need to understand where this difference comes from.

I just ran your nice code example and realised that in the case of even values for output_shape, im3 and im4 are identical as expected. In the case of odd values, I think the rotated image is simply not directly comparable to im2 sliced by integer values? So I suspect your implementation of rotate actually behaves as expected.

@GenevieveBuckley
Copy link
Collaborator

add to allowlist

@GenevieveBuckley
Copy link
Collaborator

I've resolved some minor conflicts between this PR branch and the dask-image main branch.

@m-albert
Copy link
Collaborator

Thanks a lot @GenevieveBuckley!

I'll try to have a look at what's up with the failing tests in the next days.

@gcaria
Copy link

gcaria commented Oct 27, 2023

Hi all, thanks for this, it looks great, and I'm already using it in my workflow :)

I see that not passing the output_chunks parameter yields just one big chunk for the rotate operation even the input array was chunked (so no parallelization).

Any thoughts about adding:

    if output_chunks is None:
        output_chunks = input_arr.chunksize

@m-albert
Copy link
Collaborator

@gcaria I think that's a good idea! In the case of dask_image.ndinterp.affine_transform the shapes of the input and output images can strongly vary, and there's no reason to assume that input and output would have the same chunksizes. However, in the case of rotate the input and output stacks have similar sizes and assuming the input and output chunksizes to be the same seems reasonable to me.

@martinschorb How's your availability and willingness to continue working on this PR? I think we're not far from finishing a useful addition to dask-image and it'd be great to continue with it one way or another.

@martinschorb
Copy link
Contributor Author

@gcaria I think that's a good idea! In the case of dask_image.ndinterp.affine_transform the shapes of the input and output images can strongly vary, and there's no reason to assume that input and output would have the same chunksizes. However, in the case of rotate the input and output stacks have similar sizes and assuming the input and output chunksizes to be the same seems reasonable to me.

@martinschorb How's your availability and willingness to continue working on this PR? I think we're not far from finishing a useful addition to dask-image and it'd be great to continue with it one way or another.

Hi @m-albert ,
will look into it. Let's see how we can get this through...

@martinschorb
Copy link
Contributor Author

Now it needs 5 more tests to get everything covered. Very simple cases, working on it ...

@m-albert
Copy link
Collaborator

m-albert commented Nov 7, 2023

Great to see all the progress on this :) Just FYI I'm traveling this week and will be happy to give feedback beginning of next week.

@martinschorb
Copy link
Contributor Author

these details are still not perfect:

  • output parameter: I have implemented the dtype option here only. I don't fully understand how to place the result into an existing dask array. Also, the original implementation only works if the provided output array has the exact shape of the rotation result.
  • order and prefilter parameters are restricted. This is now explained in the docs.

@martinschorb
Copy link
Contributor Author

I have no idea what is the problem with the GPU CI...

https://gpuci.gpuopenanalytics.com/job/dask/job/dask-image/job/prb/job/dask-image-prb/169/CUDA_VER=11.5.2,LINUX_VER=ubuntu20.04,PYTHON_VER=3.10/console#L783

tests/test_dask_image/test_ndfilters/test_cupy_ndfilters.py::test_cupy_order[rank_filter-5-3] Fatal Python error: Aborted

This line/file was never touched in any of the commits.

@martinschorb
Copy link
Contributor Author

I wonder if that CI failure is related to the change in versions here:
#345

In the commits there I don't see the gpuCI build check...

@m-albert
Copy link
Collaborator

Thanks for all your work here @martinschorb.

I have no idea what is the problem with the GPU CI...
This line/file was never touched in any of the commits.

I think the GPU CI problems are unrelated to this PR and might relate to the intermittent failures observed by @GenevieveBuckley here. I tested this PR branch with cupy installed and all tests are passing locally for me. I think we can ignore this GPU CI failure in the context of this PR.

Regarding the code:

output parameter: I have implemented the dtype option here only. I don't fully understand how to place the result into an existing dask array. Also, the original implementation only works if the provided output array has the exact shape of the rotation result.

Supporting an output array into which the result is directly written is probably of limited use here and I'm not sure if it's even feasible. I think you're right in that determining an output dtype is related to this. However, since 1) conversion to a different dtype can be easily achieved by casting the output array and 2) to be consistent with dask_image.ndinterp.affine_transform I suggest to leave out this parameter completely.

Another parameter we can remove is output_shape. This way we're consistent with scipy.ndimage.rotate (and don't interfere with the reshape argument).

@m-albert m-albert self-assigned this Nov 13, 2023
Copy link
Collaborator

@m-albert m-albert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment next to the code

Comment on lines 370 to 371
if not all([isinstance(ax, Number) for ax in axes]):
raise TypeError('axes should contain only integer values')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason for why you extend here on the scipy.ndimage.rotate implementation?

https://github.com/scipy/scipy/blob/f990b1d2471748c79bc4260baf8923db0a5248af/scipy/ndimage/_interpolation.py#L901

@m-albert
Copy link
Collaborator

@martinschorb, if you'd like I could work on my comments above. Would you agree to collaborate on this PR? If you agree I'd push related changes here directly and combine them with your work. We're almost there to get this through :)

@martinschorb
Copy link
Contributor Author

Yes, let's get this done.

@m-albert
Copy link
Collaborator

Alright, I took out the parameters discussed above and outsourced others to affine_transform to simplify testing and propagation of potential future changes.

All tests are passing (the failing doc test is unrelated to this PR) and in my view we're ready to go.

@m-albert
Copy link
Collaborator

Merging now 🎉

Thank you for your wonderful contribution to dask_image @martinschorb.

Also thank you to @gcaria for your feedback and a great suggestion regarding output chunks.

@m-albert m-albert merged commit 849955a into dask:main Feb 21, 2024
14 of 15 checks passed
@m-albert m-albert mentioned this pull request Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants