Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: create a single inverted polygon when no exteriors found #2470

Merged
merged 1 commit into from
Nov 22, 2024

Conversation

rcomer
Copy link
Member

@rcomer rcomer commented Nov 2, 2024

Rationale

When _rings_to_multipolygon finds interior rings without associated exteriors, for each of these interiors it creates a new polygon as the difference between ~the whole map and that interior. Intuitively, if there are multiple interiors, then all of those whole map polygons will overlay each other and you won't see the holes. Instead it should create a single ~whole map polygon which uses all of those left over interiors.

I also switched out use of .buffer(0) for shapely.make_valid as that seems like the more official way to do that.

Fixes #1149, fixes #1674, fixes #2160 (all ocean feature examples) and fixes #1449 (contourf example)
Also fixes the original case from #2030 (but not the second example so not linking that issue)
Also fixes the contourf examples at #1076 (comment) and #1076 (comment)

I turned #1149 into an image test. Really there should be something more specific in test_polygon.py, but I'm at a loss how to create an artificial example for this.

Implications

if polygon_bits:
multi_poly = sgeom.MultiPolygon(polygon_bits)
else:
multi_poly = sgeom.MultiPolygon()
Copy link
Member Author

@rcomer rcomer Nov 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that MultiPolygon will work fine with an empty list.

interior_polys.append(polygon)
elif isinstance(polygon, sgeom.MultiPolygon):
interior_polys.extend(polygon.geoms)
elif isinstance(polygon, sgeom.GeometryCollection):
Copy link
Member Author

@rcomer rcomer Nov 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do get a GeometryCollection for the #1149 case.

@rcomer
Copy link
Member Author

rcomer commented Nov 3, 2024

I note that _rings_to_multipolygon is called within _project_polygon. If you start with a MultiPolygon it will just loop through the polygons and call _project_polygon on each, so we might plausibly create multiple overlapping whole map polygons that way. So (plausibly) we might fix more cases by applying this logic at the MultiPolygon level. I propose to get this fix in first and dig into the MultiPolygon question later though.

@rcomer
Copy link
Member Author

rcomer commented Nov 3, 2024

I appear to be reinventing part of #1739...

@greglucas
Copy link
Contributor

Really there should be something more specific in test_polygon.py, but I'm at a loss how to create an artificial example for this.

I agree. Simplifying these down would really help us debug future issues too because I feel like it is confusing what the root cause of the issue is sometimes. A few thoughts in case this sparks anything.

  • All geometries causing issues are coming from the default PlateCarree, so we should be able to create something in degrees.
  • Most of the issues seem to be around Lambert* transforms, so going to an r/theta coordinate frame.
  • Are there situations where the bottom left corner of a box gets transformed into the upper right corner of a box when you are right around the central longitude / maximum latitude? (possibly a rhombus/diamond shape where one point is on the opposite side of a boundary)

I appear to be reinventing part of #1739...

That has a lot going on. Maybe this is a small piece of that and we can keep doing incremental updates?

@rcomer
Copy link
Member Author

rcomer commented Nov 3, 2024

OK here's the simplest I have so far by studying #1149 (comment)

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import shapely.geometry as sgeom

sudo_ocean = sgeom.Polygon([(-180, -80), (180, -80), (180, 90), (-180, 90)],
                           [[(-50, -50), (-50, 0), (0, 0), (0, -50)]])

fig = plt.figure(figsize=(6, 8))

ax1 = fig.add_subplot(2, 1, 1, projection=ccrs.PlateCarree())
proj = ccrs.LambertAzimuthalEqualArea(central_latitude=45, central_longitude=-100)
ax2 = fig.add_subplot(2, 1, 2, projection=proj)

for ax in [ax1, ax2]:
    ax.add_geometries(sudo_ocean, crs=ccrs.PlateCarree())
    ax.set_global()
    
plt.show()

main:
image

branch:
image

Copy link
Member Author

@rcomer rcomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I remove the image test?

multi_polygon = proj.project_geometry(poly)
# Should project to single polygon with multiple holes
assert len(multi_polygon.geoms) == 1
assert len(multi_polygon.geoms[0].interiors) >= 2
Copy link
Member Author

@rcomer rcomer Nov 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually get 36 holes. I assume most of them are responsible for the seam down the dateline visible in #2470 (comment). I didn't assert the exact number because I've no idea how stable that would be.

Against main, this test fails at the first assert since we have two polygons in our multipolygon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the image test since it is downloading the natural earth ocean for every runner, or replace it with this example as an image comparison test that doesn't require external downloading.

The 36 holes is interesting and worth investigating IMO. For your example here, it looks like there are many small loops around the boundary points that are created. (your image has a small white curve/line in it protruding from the south pole area which are these linear rings I think)

import matplotlib.pyplot as plt

ax = plt.figure().add_subplot()

interior = multi_polygon.geoms[0].interiors
for i in range(len(interior)):
    ax.plot(*interior[i].xy)

plt.show()

I wonder if these extra loops/rings are what is slowing down some of our projecting of geometries even 🤔

I'm also happy to defer these investigations to future PRs because this is already a pretty substantial improvement I think.

@rcomer rcomer marked this pull request as ready for review November 3, 2024 15:47
@rcomer rcomer force-pushed the multipolygon branch 2 times, most recently from 30a704f to b57388e Compare November 4, 2024 19:03
@greglucas
Copy link
Contributor

I dug into the multiple rings issue a bit and I think the main issue is projecting the seam along the dateline. On one side of the boundary we go up and the other we go down, and because our intermediate point bisection routine is not the same going from A->B as B->A. A quick example with just line segments being plotted and then zooming in you can see that the points aren't exactly aligned and then the segments overlap one another making invalid polygons.

import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import shapely.geometry as sgeom

# Adapted from 1149
proj = ccrs.LambertAzimuthalEqualArea(
    central_latitude=45, central_longitude=-100)
x = 179
l0 = sgeom.LineString([(-180, -80), (-180, 80)])
l1 = sgeom.LineString([(-180, 80), (-180, -80)])
p0 = proj.project_geometry(l0).geoms[0]
p1 = proj.project_geometry(l1).geoms[0]

ax = plt.figure().add_subplot()
ax.plot(p0.xy[0], p0.xy[1])
ax.plot(p1.xy[0], p1.xy[1])
plt.show()

I thought there might be an easy way of modifying project_segment() and swapping the start/end points so that it is always going the same direction regardless of inputs. But, new lines are started/stopped within that routine so the reversal isn't as trivial as I was hoping it would be... I think ultimately we need a new interpolation/projection routine which is a much bigger task.

@rcomer
Copy link
Member Author

rcomer commented Nov 16, 2024

I had a go at #2470 (comment) with rcomer@f440872. It did not break any exiting tests but also did not fix any issue. So back to the drawing board! Also it does make sense that the projection of a multipolygon should give the same result as projecting those polygons individually, so if it had worked it would have been more of a workaround than an actual fix I think.

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good improvement and fixes many of the cases. We shouldn't let perfect be the enemy of good here. There is room for follow-up PRs after this to keep improving the situation here.

Thanks @rcomer for digging into all of this!

@greglucas greglucas merged commit d8e8bdf into SciTools:main Nov 22, 2024
23 checks passed
@QuLogic QuLogic added this to the Next Release milestone Nov 22, 2024
@rcomer rcomer deleted the multipolygon branch November 22, 2024 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment