Consider handling NANs #212

Open
sjsrey opened this issue May 17, 2024 · 3 comments

Comments

@sjsrey
Member

sjsrey commented May 17, 2024

yeah, agree. I think the nan-handle logic is only like those three lines. Without touching any of the classification code, we could maybe sneak it into the first step of the binning function so nans are ignored from the outset?

Originally posted by @knaaptime in #211 (comment)

@sjsrey
Member Author

sjsrey commented May 17, 2024

Since the current philosophy in mapclassify is to assume away NaNs, geopandas is doing the heavy lifting of dealing with NaNs for choropleths.

I've been exploring some approaches to handling NaNs in mapclassify. It isn't as simple as I initially thought, but it is certainly possible. Doing so fully would require discussions with @martinfleis in order to keep in sync with geopandas.

So this issue is a channel to flesh out the thinking on whether we should do this in mapclassify, or not.

@knaaptime
Member

knaaptime commented May 17, 2024

I started looking at swapping in numpy's NaN-aware functions (e.g. nanmean instead of mean) to see about making the classifiers agnostic to NaNs, but decided that would probably be more trouble than it's worth. Probably best to let the classifiers operate, conceptually, on 'pure arrays', then use pandas indices to keep track of where the real observations live, and reinsert the NaNs on the other side.
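For reference, a minimal sketch of what the NaN-aware swap looks like in plain numpy. The array `y` is made up for illustration, and nothing here is taken from the mapclassify code base:

```python
import numpy as np

# Toy data with one missing observation (illustrative values only).
y = np.array([1.0, 2.0, np.nan, 4.0, 10.0])

plain_mean = np.mean(y)                 # NaN propagates through the reduction
nan_mean = np.nanmean(y)                # NaN is skipped: (1 + 2 + 4 + 10) / 4
median_break = np.nanpercentile(y, 50)  # percentile over the 4 valid values

print(plain_mean, nan_mean, median_break)
```

Every reduction inside a classifier would need this kind of swap, which is part of why it looks like more trouble than masking once up front.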

The idea would be that if a classifier is given an array with NaNs, then the resulting y and yb attributes would also include NaNs in the appropriate places, but the classifier would ignore them when assigning bins.

If we went that route, I think it (a) would not induce any breaking behavior here in mc and (b) could probably be dropped in over at geopandas?
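A rough sketch of the mask-and-reinsert idea, under some loud assumptions: `classify_ignoring_nans` and `toy_quantiles` are hypothetical names, and `toy_quantiles` is only a stand-in for a real classifier, not mapclassify's implementation. The labels come back as floats because NaN cannot live in a plain integer array.

```python
import numpy as np
import pandas as pd


def toy_quantiles(y, k=2):
    """Stand-in classifier (hypothetical): label values by k quantile bins."""
    breaks = np.quantile(y, np.linspace(0, 1, k + 1)[1:])
    return np.searchsorted(breaks, y, side="left")


def classify_ignoring_nans(values, classify):
    """Hypothetical wrapper: classify valid observations, then reinsert NaNs."""
    s = pd.Series(values, dtype="float64")
    valid = s.dropna()                    # the 'pure array', original index kept
    labels = classify(valid.to_numpy())   # classifier never sees a NaN
    yb = pd.Series(labels, index=valid.index, dtype="float64")
    return yb.reindex(s.index)            # NaN wherever the input was NaN


yb = classify_ignoring_nans([1.0, 2.0, np.nan, 4.0, 10.0], toy_quantiles)
print(yb.tolist())
```

Because the wrapper only touches the input and output, the classification code itself stays untouched, which is the non-breaking property described above.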

@martinfleis
Member

I'll have to take a dive into our plotting code to get a better understanding of how it could help geopandas. It's been a while since I touched that module.
