Geocoding methods without restrictions #1

Open
MattTriano opened this issue Aug 31, 2018 · 1 comment

Comments

@MattTriano
Contributor

As David mentioned on Tuesday, they (ProPublica) couldn't include the latitude and longitude (lat and lon) values in the data sets they've made publicly available, because the geocoding service (Google) placed that restriction on the data. These lat and lon values are necessary if you want to map features of the data (for example, all locations where vehicles of a specific make were ticketed, or all locations where a specific officer wrote tickets).

I've been exploring some techniques for geocoding the addresses in the sample dataset and, using Python's Geocoder library, I've put together a notebook that could geocode about 150 addresses per minute without an API key. It took about 2 hours to get through all 20k unique addresses in the sample dataset. This ran in the background while I did other things, and computationally it was a very light operation, but at this rate there may be issues scaling up to a much larger number of addresses.
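For anyone who wants to try this, here is a minimal sketch of that kind of loop, not the notebook itself. It assumes the OpenStreetMap provider in the `geocoder` package (`geocoder.osm`), which needs no API key; the provider choice, the function names, the address normalization, and the one-second delay are all my assumptions for illustration, so check the usage policy of whichever provider you pick.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

LatLng = Tuple[float, float]


def osm_lookup(address: str) -> Optional[LatLng]:
    """Look up one address via geocoder's OpenStreetMap provider (assumed choice)."""
    import geocoder  # third-party package: pip install geocoder

    result = geocoder.osm(address)
    if result.ok and result.latlng:
        lat, lng = result.latlng
        return (lat, lng)
    return None  # no match found for this address


def geocode_unique(
    addresses: Iterable[str],
    lookup: Callable[[str], Optional[LatLng]],
    delay_seconds: float = 1.0,  # assumed pause between requests
) -> Dict[str, Optional[LatLng]]:
    """Geocode each distinct address once, pausing between lookups."""
    cache: Dict[str, Optional[LatLng]] = {}
    for address in addresses:
        key = address.strip().upper()  # simple normalization (hypothetical)
        if key in cache:
            continue  # already looked up, skip the network call
        cache[key] = lookup(key)
        time.sleep(delay_seconds)
    return cache


def estimated_minutes(n_unique: int, per_minute: float = 150.0) -> float:
    """Rough run time in minutes at a given lookup rate."""
    return n_unique / per_minute
```

Calling `geocode_unique(addresses, osm_lookup)` returns a dict keyed by the normalized address. `estimated_minutes(20000)` comes out to roughly 133 minutes, which lines up with the 2 hours or so I saw, and the estimate grows linearly with the number of unique addresses.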

Does anyone know of other geocoding implementations that are faster while still being free (as in beer and as in speech)?

@eads
Contributor

eads commented Aug 31, 2018

The problem is that the Google geocoder terms of service are pretty clear that you can't use it for stuff like this. I think Texas A&M might be an option! I realize I probably should have been a little slower on the draw with your PR, since it included the output file.
