-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modifies the squaring bbox and tweaks PIL input shapes #1030
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have started looking at this but not finished, will complete review tomorrow.
pytest.param(1, 6, 7, 8, (10, 10), (1, 3, 7, 9), id="constrained right"), | ||
pytest.param(1, 1, 2, 6, (10, 10), (0, 1, 5, 6), id="constrained top"), | ||
pytest.param(6, 1, 8, 7, (10, 10), (3, 1, 9, 7), id="constrained bottom"), | ||
pytest.param(117, 20, 521, 603, (608, 608), (24, 20, 607, 603), id="constrained top and bottom"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I've tried looking at this in detail I find the dimensions quite hard to think about, are there smaller image_shape
and co-ordinates that would be constrained top and bottom?
I think if image_shape
is always square, which isn't necessarily the case since I recall @SylviaWhittle writing code to handle none-square images, then we'd only ever see a constraint on one side. Something that would be constrained top and bottom would be a wide rectangular bounding box within a wide rectangular image.
EDIT: I added just such a set of test parameters to the above mentioned commit and marked it as pytest.mark.xfail()
with a reason explaining why. If cherry picking it will be included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some broad comments as I don't think I saw/reviewed this unet code when it was originally added.
Test ID's
These are useful they give us (and our future selves) a marker/indication of what is going on.
Maybe I'm misunderstanding the intent/meaning but I think the id
's are perhaps misleading. Taking the second parameterised test (which I can't comment on directly in this PR) it is...
pytest.param(3, 4, 8, 8, (10, 10), (3, 4, 8, 9), id="free space single min col decrease"),
The minimum column though remains at 4
and its the last co-ordinate which is the max_col
which has gone from 8 > 9
here.
Across these I think they should perhaps be...
pytest.param(3, 4, 8, 8, (10, 10), (3, 4, 8, 9), id="free space single max col increase"),
pytest.param(4, 3, 8, 8, (10, 10), (4, 3, 9, 8), id="free space single max row increase"),
pytest.param(4, 4, 7, 8, (10, 10), (4, 4, 8, 8), id="free space single max row increase"),
pytest.param(4, 4, 8, 7, (10, 10), (4, 4, 8, 8), id="free space single max col increase"),
pytest.param(4, 2, 8, 8, (10, 10), (3, 2, 9, 8), id="free space double min row decrease, max row increase"),
pytest.param(2, 4, 8, 8, (10, 10), (2, 3, 8, 9), id="free space double min col decrease, max col increase"),
pytest.param(4, 4, 8, 6, (10, 10), (4, 3, 8, 7), id="free space double min col decrease, max col increase"),
pytest.param(4, 4, 6, 8, (10, 10), (3, 4, 7, 8), id="free space double min row decrease, max row increase"),
Additional assert
ions
The tests currently assert that the returned co-ordinates match pre-defined variables, but I think we can be a bit more savvy here and add in the following checks (can't make a direct suggestion on the code as its outside the scope of changes for the PR). The first of actually checks that the returned co-ordinates give a square...
assert (result[2] - result[0]) == (result[3] - result[1])
We can then check that each of the co-ordinates is within the image_shape
with...
assert result[0] >= 0
assert result[1] >= 0
assert result[2] <= image_shape[0]
assert result[3] <= image_shape[1]
This almost makes the test as to whether the result == expected_indices
redundant but no harm in keeping it in.
Is it always needed?
This function is called on line 644 of grains.py
and it is called unconditionally...
# Make the bounding box square within the confines of the image
bounding_box = make_bounding_box_square(
crop_min_row=bounding_box[0],
crop_min_col=bounding_box[1],
crop_max_row=bounding_box[2],
crop_max_col=bounding_box[3],
image_shape=(image.shape[0], image.shape[1]),
)
...and therefore on every single bounding box. I wouldn't be surprised if it is rare to see a square bounding box and so they almost always need squaring but the logic of testing whether it is square could be used to check whether all of the grains need to be checked by this function. That can be done with...
if (bounding_box[2] - bounding_box[0]) != (bounding_box[3] - bounding_box[1]):
bounding_box = make_bounding_box_square(
crop_min_row=bounding_box[0],
crop_min_col=bounding_box[1],
crop_max_row=bounding_box[2],
crop_max_col=bounding_box[3],
image_shape=(image.shape[0], image.shape[1]),
This led me to think about the logical checks on how the box is squared because there are two checks made the first to see if the height
> width
and then if the width
> height
. So if we only call this when we have non-square boxes the elif width > height
can simply be else
as the existing else
(which returns a square bounding box) isn't required as we wouldn't be calling this function with a square bounding box (removes five lines of code!).
But there are still a lot of if: ... else:...
statements being checked and lots of calculations so I wondered if there might be another way of doing this and I realised there is.
The broad algorithm is to...
- Find the mid-point of the existing bounding box and its largest radius.
- Expand in each direction from the mid-point by the radius (taking floor for minimum and ceilings for maximum).
- Adjust to the image boundary if this extends beyond the edges.
This doesn't always return a square bounding box though (the tests failed, which is why tests are so useful to have in place and the proposed additional assert to check the height == width
would capture this nicely).
These changes are on branch ns-rse/unet-crop-alternative
at commit d2187cb3215ca52fd5439de8ad8eae9c1cb5c02e
. You can check them out at...
git switch main
git pull
git checkout d2187cb3215ca52fd5439de8ad8eae9c1cb5c02e
A quick search threw up this thread which has some simpler logic and I've had a got at coding it up on the same branch ns-rse/unet-crop-alternative
(it's at HEAD
so you can just git switch ns-rse/unet-crop-alternative
to look at the changes and test). It too isn't quite right though as some of the tests still fail. and I've spent too long on this.
Why go through all of this? Its the essence of reviewing code.I like to keep code simple and avoid unnecessary calls if possible as at some point the problem will arise that "Its not fast enough" so if we can at the very least avoid making a function call if we don't need to that is useful (although I'm not sure perfectly square boxes will turn up that often). The proposed change is I think a bit easier to reason about there are less conditional clauses to work through in the various stages, its just the maths that need reasoning about. This might make the code easier to understand in the future.
Don't need to switch to these alternative solutions (especially as neither are complete and are still works in progress!) but I would suggest making a couple of changes at the very least...
- Only calling
make_bounding_box_square()
fromgrains.py
if it isn't already and in turn remove the currentelse:
and make theelif:
theelse:
. - Adding the suggested
assert
additions to the tests.
You can, I think, get the change to just the tests/test_unet_masking.py
and/or topostats/grains.py
by using...
git show d2187cb3215ca52fd5439de8ad8eae9c1cb5c02e -- tests/test_unet_masking.py | git apply --index -
Which shows the changes made to that file then pipes them into git apply --index -
which will apply it to the current branch.
Repeat for grains.py
.
You then need to make a git commit
to commit those changes on your branch.
pytest.param(1, 6, 7, 8, (10, 10), (1, 3, 7, 9), id="constrained right"), | ||
pytest.param(1, 1, 2, 6, (10, 10), (0, 1, 5, 6), id="constrained top"), | ||
pytest.param(6, 1, 8, 7, (10, 10), (3, 1, 9, 7), id="constrained bottom"), | ||
pytest.param(117, 20, 521, 603, (608, 608), (24, 20, 607, 603), id="constrained top and bottom"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I've tried looking at this in detail I find the dimensions quite hard to think about, are there smaller image_shape
and co-ordinates that would be constrained top and bottom?
I think if image_shape
is always square, which isn't necessarily the case since I recall @SylviaWhittle writing code to handle none-square images, then we'd only ever see a constraint on one side. Something that would be constrained top and bottom would be a wide rectangular bounding box within a wide rectangular image.
EDIT: I added just such a set of test parameters to the above mentioned commit and marked it as pytest.mark.xfail()
with a reason explaining why. If cherry picking it will be included.
- Corrects the `id` of the parameterised tests for `test_make_bounding_box_square()` to reflect the `row`/`col` and whether they `increase`/`decrease`. - Adds a rectangular image with a large rectangular `bbox` that can not be squared and marks the test as a fail. I recall, but can't find, that we added functionality to handle non-square scans (I think @SylviaWhittle wrote the code to handle it) and so its possible that instances may arise where the `bbox` can't be squared. Ideally we should add a `try: ... except:` to handle such cases but for now adding a test which is marked as expected to fail should serve as an _aide memoire_ that sometimes things won't work.
tests(unet_masking): Correct ids and add expected fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We went through this earlier and agreed its good to go. The test that has been introduced that is expected to fail if a rectangular image with a large bounding box which can not be squared is considered sufficient for now.
Thanks @MaxGamill-Sheffield and @SylviaWhittle
Closes #1029
Very quick one as I had to do this for the topology paper which modifies how grains are cropped upon entry to the unet - turns out if these were too large, they weren't properly handled by the cropping and thus forced a grain failure.
Also quickly modified the PIL input shapes as sequential steps in the unet processing gives weird values (x, y) vs (y, x) and so I swapped these and now my images work well. I assume this wasn't a problem before as all images were square anyway.
Have updated tests too :)