As David mentioned on Tuesday, they (ProPublica) couldn't include the latitude and longitude (lat and lon) values in the data sets they've made publicly available, because the geocoding service (Google) placed that restriction on the data. These lat and lon values are necessary if you want to map features of the data (for example, all locations where a specific vehicle make was ticketed, or all locations where a specific officer issued tickets).
I've been exploring some techniques for geocoding the addresses in the sample dataset and, using Python's Geocoder library, I've put together a notebook that could geocode about 150 addresses per minute without an API key. It took about 2 hours to get through all 20k unique addresses in the sample dataset. This ran in the background while I did other things, and computationally it was a very light operation, but at this rate the run time grows linearly with the number of addresses, so scaling up to a much larger dataset may be a problem.
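For anyone who wants to try something similar, here is a minimal sketch of the general approach (not the actual notebook): look up each distinct address once and pause between requests. The function names, the `min_interval` parameter, and the choice of the OpenStreetMap backend in `osm_lookup` are my own assumptions for illustration; check the usage policy of whichever provider you plug in before running it at this rate.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

LatLon = Optional[Tuple[float, float]]


def geocode_unique(
    addresses: Iterable[str],
    lookup: Callable[[str], LatLon],
    min_interval: float = 0.4,  # 0.4 s per request is roughly 150 per minute
) -> Dict[str, LatLon]:
    """Geocode each distinct address once, pausing between requests."""
    results: Dict[str, LatLon] = {}
    for raw in addresses:
        address = raw.strip()
        if not address or address in results:
            continue  # skip blanks and addresses already looked up
        results[address] = lookup(address)
        time.sleep(min_interval)  # throttle so the service is not hammered
    return results


def osm_lookup(address: str) -> LatLon:
    """Hypothetical provider hook (assumption: geocoder's OSM backend)."""
    import geocoder  # third-party: pip install geocoder

    match = geocoder.osm(address)
    return tuple(match.latlng) if match.ok and match.latlng else None


def estimated_minutes(n_addresses: int, per_minute: float = 150.0) -> float:
    """Rough run time at a fixed throughput."""
    return n_addresses / per_minute
```

Calling `geocode_unique(addresses, osm_lookup)` would return a dict mapping each address to a `(lat, lon)` pair, or `None` when no match was found. At 150 addresses per minute, `estimated_minutes(20000)` comes out to a bit over 133 minutes, which lines up with the roughly 2 hours reported above.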
Does anyone know of other geocoding implementations that are faster while still being free (as in beer and as in speech)?
The problem is that the Google geocoder terms of service are pretty clear that you can't use it for stuff like this. I think Texas A&M might be an option! I realize I should have probably been a little slower on the draw with your PR by including the output file.