forked from elastic/elasticsearch-definitive-guide
Commit
1 parent 0e79bfa, commit 8f1ed17
Showing 22 changed files with 1,606 additions and 28 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,32 +1,56 @@ | ||
[[geoloc]] | ||
== Geolocation (TODO) | ||
:ref: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/ | ||
|
||
The web is increasingly location aware – users expect to see local results, | ||
or to be able to filter results by their position on a map. | ||
include::310_Geolocation/10_Intro.asciidoc[] | ||
|
||
This chapter explains how to use geolocation in Elasticsearch, including | ||
optimization tips. | ||
include::310_Geolocation/20_Geopoints.asciidoc[] | ||
|
||
include::310_Geolocation/30_Filter_by_geopoint.asciidoc[] | ||
|
||
=== Adding geolocation to your documents | ||
* Mapping the geo-point type | ||
* Indexing documents with geo-points | ||
include::310_Geolocation/32_Bounding_box.asciidoc[] | ||
|
||
[[geoloc-filters]] | ||
=== Geolocation-aware search | ||
* geo-distance and geo-distance-range filters | ||
* geo-bounding-box filter | ||
* geo-polygon filter | ||
include::310_Geolocation/34_Geo_distance.asciidoc[] | ||
|
||
=== Sorting by distance | ||
. | ||
include::310_Geolocation/36_Caching_geofilters.asciidoc[] | ||
|
||
include::310_Geolocation/38_Reducing_memory.asciidoc[] | ||
|
||
=== Geo-shapes | ||
. | ||
include::310_Geolocation/40_Geohashes.asciidoc[] | ||
|
||
include::310_Geolocation/50_Sorting_by_distance.asciidoc[] | ||
|
||
=== Optimizing geo-queries | ||
. | ||
include::310_Geolocation/60_Geo_aggs.asciidoc[] | ||
|
||
include::310_Geolocation/62_Geo_distance_agg.asciidoc[] | ||
|
||
include::310_Geolocation/64_Geohash_grid_agg.asciidoc[] | ||
|
||
include::310_Geolocation/66_Geo_bounds_agg.asciidoc[] | ||
|
||
include::310_Geolocation/70_Geoshapes.asciidoc[] | ||
|
||
include::310_Geolocation/72_Mapping_geo_shapes.asciidoc[] | ||
|
||
include::310_Geolocation/74_Indexing_geo_shapes.asciidoc[] | ||
|
||
include::310_Geolocation/76_Querying_geo_shapes.asciidoc[] | ||
|
||
include::310_Geolocation/78_Indexed_geo_shapes.asciidoc[] | ||
|
||
include::310_Geolocation/80_Caching_geo_shapes.asciidoc[] | ||
|
||
|
||
//////// | ||
|
||
|
||
|
||
geo_shape: | ||
mapping | ||
tree | ||
precision | ||
type of shapes | ||
indexing | ||
indexed shapes | ||
filters | ||
geoshape | ||
|
||
//////// |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
[[geoloc]] | ||
== Geolocation | ||
|
||
Gone are the days when we wandered around a city with paper maps. Thanks to | ||
smartphones, we now know exactly where we are all of the time, and we expect | ||
websites to use that information. I'm not interested in restaurants in | ||
Greater London -- I want to know about restaurants within a 5-minute walk of my | ||
current location. | ||
|
||
But geolocation is only one part of the puzzle. The beauty of Elasticsearch | ||
is that it allows you to combine geolocation with full text search, structured | ||
search, and analytics. | ||
|
||
For instance: show me restaurants that mention _vitello tonnato_, are within a | ||
5-minute walk, and are open at 11pm, and rank them by a combination of user | ||
rating, distance and price. Another example: show me a map of holiday rental | ||
properties available in August throughout the city, and calculate the average | ||
price per zone. | ||
|
||
Elasticsearch offers two ways of representing geolocations: latitude-longitude | ||
points using the `geo_point` field type, and complex shapes defined in | ||
http://en.wikipedia.org/wiki/GeoJSON[GeoJSON], using the `geo_shape` field | ||
type. | ||
|
||
Geo-points allow you to find points within a certain distance of another | ||
point, to calculate distances between two points for sorting or relevance | ||
scoring, or to aggregate into a grid to display on a map. Geo-shapes, on the | ||
other hand, are used purely for filtering. They can be used to decide whether | ||
two shapes overlap or not, or whether one shape completely contains another | ||
shape. | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
[[indexing-geopoints]] | ||
=== Indexing geo-points | ||
|
||
Geo-points cannot be automatically detected with | ||
<<dynamic-mapping,dynamic mapping>>. Instead, geo-point fields should be | ||
mapped explicitly: | ||
|
||
[source,json] | ||
----------------------- | ||
PUT /attractions | ||
{ | ||
"mappings": { | ||
"restaurant": { | ||
"properties": { | ||
"name": { | ||
"type": "string" | ||
}, | ||
"location": { | ||
"type": "geo_point" | ||
} | ||
} | ||
} | ||
} | ||
} | ||
----------------------- | ||
|
||
[[lat-lon-formats]] | ||
==== Lat/Lon formats | ||
|
||
With the `location` field defined as a `geo_point`, we can proceed to index | ||
documents containing latitude/longitude pairs, which can be formatted as | ||
strings, arrays, or objects: | ||
|
||
[source,json] | ||
----------------------- | ||
PUT /attractions/restaurant/1 | ||
{ | ||
"name": "Chipotle Mexican Grill", | ||
"location": "40.715, -74.011" <1> | ||
} | ||
PUT /attractions/restaurant/2 | ||
{ | ||
"name": "Pala Pizza", | ||
"location": { <2> | ||
"lat": 40.722, | ||
"lon": -73.989 | ||
} | ||
} | ||
PUT /attractions/restaurant/3 | ||
{ | ||
"name": "Mini Munchies Pizza", | ||
"location": [ -73.983, 40.719 ] <3> | ||
} | ||
----------------------- | ||
<1> A string representation, with `"lat,lon"`. | ||
<2> An object representation with `lat` and `lon` explicitly named. | ||
<3> An array representation with `[lon,lat]`. | ||
|
||
[IMPORTANT] | ||
======================== | ||
Everybody gets caught at least once: string geo-points are | ||
`"latitude,longitude"`, while array geo-points are `[longitude,latitude]` -- | ||
the opposite order! | ||
Originally, both strings and arrays in Elasticsearch used latitude followed by | ||
longitude. However, it was decided early on to switch the order for arrays in | ||
order to conform with GeoJSON. | ||
The result is a bear trap that captures all unsuspecting users on their | ||
journey to full geo-location nirvana. | ||
======================== | ||
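
To make the ordering difference concrete, here is a minimal Python sketch that converts each of the three formats shown above into a `(lat, lon)` pair. The `to_lat_lon` helper is hypothetical and written only for illustration; it is not part of Elasticsearch or of any client library, and it handles only the three formats listed in this section.

```python
def to_lat_lon(location):
    """Return (lat, lon) from a string, object, or array geo-point.

    Hypothetical helper for illustration only.
    """
    if isinstance(location, str):
        # String form is "lat,lon".
        lat, lon = (float(part) for part in location.split(","))
    elif isinstance(location, dict):
        # Object form names both values explicitly.
        lat, lon = float(location["lat"]), float(location["lon"])
    else:
        # Array form is [lon, lat], the GeoJSON order.
        lon, lat = (float(value) for value in location)
    return lat, lon


print(to_lat_lon("40.715, -74.011"))
print(to_lat_lon({"lat": 40.722, "lon": -73.989}))
print(to_lat_lon([-73.983, 40.719]))
```

Whatever the input format, the latitude ends up first in the returned pair, which is the easiest way to check that an array was not written in string order by mistake.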
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
[[filter-by-geopoint]] | ||
=== Filtering by geo-point | ||
|
||
Four geo-filters can be used to include or exclude documents by | ||
geo-location: | ||
|
||
<<geo-bounding-box,`geo_bounding_box`>>:: | ||
|
||
Find geo-points which fall within the specified rectangle. | ||
|
||
<<geo-distance,`geo_distance`>>:: | ||
|
||
Find geo-points within the specified distance of a central point. | ||
|
||
<<geo-distance-range,`geo_distance_range`>>:: | ||
|
||
Find geo-points within a specified minimum and maximum distance from a | ||
central point. | ||
|
||
`geo_polygon`:: | ||
|
||
Find geo-points which fall within the specified polygon. *This filter is | ||
very expensive*. If you find yourself wanting to use it, you should be | ||
looking at <<geo-shapes,geo-shapes>> instead. | ||
|
||
All of these filters work in a similar way: the `lat/lon` values are loaded | ||
into memory for *all documents in the index*, not just the documents which | ||
match the query (see <<fielddata-intro>>). Each filter performs a slightly | ||
different calculation to check whether a point falls into the containing area | ||
or not. | ||
|
||
[TIP] | ||
============================ | ||
Geo-filters are expensive -- they should be used on as few documents as | ||
possible. First remove as many documents as you can with cheaper filters, like | ||
`term` or `range` filters, and apply the geo filters last. | ||
The <<bool-filter,`bool` filter>> will do this for you automatically. First it | ||
applies any bitset-based filters (see <<filter-caching>>) to exclude as many | ||
documents as it can as cheaply as possible. Then it applies the more | ||
expensive geo or script filters to each remaining document in turn. | ||
============================ |
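
As a rough sketch of the advice in the tip, the following Python function builds a search body that combines a cheap `term` filter with a `geo_distance` filter inside a `bool` filter. The function name and the `cuisine` field are assumptions made for this example (the `attractions` mapping shown earlier defines only `name` and `location`), and the body is only constructed here, not sent to a cluster.

```python
import json


def nearby_by_cuisine(cuisine, lat, lon, distance="1km"):
    """Build a filtered query: a cheap term filter plus a geo filter.

    `cuisine` is a hypothetical field used only for illustration.
    """
    return {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            # Cheap filter that can exclude many documents.
                            {"term": {"cuisine": cuisine}},
                            # More expensive geo filter on the rest.
                            {
                                "geo_distance": {
                                    "distance": distance,
                                    "location": {"lat": lat, "lon": lon},
                                }
                            },
                        ]
                    }
                }
            }
        }
    }


body = nearby_by_cuisine("pizza", 40.715, -74.011)
print(json.dumps(body, indent=2))
```

The order of the clauses in the `must` array is for readability only; as the tip explains, the `bool` filter decides for itself which clauses to apply first.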
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
[[geo-bounding-box]] | ||
=== `geo_bounding_box` filter | ||
|
||
This is by far the most performant geo-filter because its calculation is very | ||
simple. You provide it with the `top`, `bottom`, `left`, and `right` | ||
coordinates of a rectangle and all it does is compare the latitude with the | ||
top and bottom coordinates, and the longitude with the left and right | ||
coordinates. | ||
|
||
[source,json] | ||
--------------------- | ||
GET /attractions/restaurant/_search | ||
{ | ||
"query": { | ||
"filtered": { | ||
"filter": { | ||
"geo_bounding_box": { | ||
"location": { <1> | ||
"top": 40.8, | ||
"bottom": 40.7, | ||
"left": -74.0, | ||
"right": -73.0 | ||
} | ||
} | ||
} | ||
} | ||
} | ||
} | ||
--------------------- | ||
<1> These coordinates can also be specified as `top_left` and `bottom_right` | ||
pairs, or `bottom_left` and `top_right` pairs. | ||
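
The check itself can be sketched in a few lines of Python. This is an illustration of the comparison only, not the code that Elasticsearch runs: the latitude is tested against `top` and `bottom`, and the longitude against `left` and `right`. The `in_bounding_box` name is hypothetical, and the sketch assumes a box that does not cross the 180° meridian.

```python
def in_bounding_box(lat, lon, top, bottom, left, right):
    """Return True if the point lies inside the rectangle.

    Assumes left <= right, i.e. the box does not cross the 180° meridian.
    """
    return bottom <= lat <= top and left <= lon <= right


# The same rectangle as in the query above.
box = {"top": 40.8, "bottom": 40.7, "left": -74.0, "right": -73.0}

# The three restaurants indexed earlier, as (lat, lon) pairs.
restaurants = {
    "Chipotle Mexican Grill": (40.715, -74.011),
    "Pala Pizza": (40.722, -73.989),
    "Mini Munchies Pizza": (40.719, -73.983),
}

for name, (lat, lon) in restaurants.items():
    print(name, in_bounding_box(lat, lon, **box))
```

With these values, the first restaurant falls just outside the box because its longitude of `-74.011` lies west of the `left` edge at `-74.0`, while the other two fall inside.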
|
||
[[optimize-bounding-box]] | ||
==== Optimizing bounding boxes | ||
|
||
The `geo_bounding_box` is the one geo-filter which doesn't require all | ||
geo-points to be loaded into memory. Because all it has to do is to check | ||
whether the `lat` and `lon` values fall within the specified ranges, it can | ||
use the inverted index to do a glorified `range` filter. | ||
|
||
In order to use this optimization, the `geo_point` field must be mapped to | ||
index the `lat` and `lon` values separately: | ||
|
||
[source,json] | ||
----------------------- | ||
PUT /attractions | ||
{ | ||
"mappings": { | ||
"restaurant": { | ||
"properties": { | ||
"name": { | ||
"type": "string" | ||
}, | ||
"location": { | ||
"type": "geo_point", | ||
"lat_lon": true <1> | ||
} | ||
} | ||
} | ||
} | ||
} | ||
----------------------- | ||
<1> The `location.lat` and `location.lon` fields will be indexed separately. | ||
These fields can be used for searching, but their values cannot be retrieved. | ||
|
||
Now, when we run our query, we have to tell Elasticsearch to use the indexed | ||
`lat` and `lon` values: | ||
|
||
[source,json] | ||
--------------------- | ||
GET /attractions/restaurant/_search | ||
{ | ||
"query": { | ||
"filtered": { | ||
"filter": { | ||
"geo_bounding_box": { | ||
"type": "indexed", <1> | ||
"location": { | ||
"top": 40.8, | ||
"bottom": 40.7, | ||
"left": -74.0, | ||
"right": -73.0 | ||
} | ||
} | ||
} | ||
} | ||
} | ||
} | ||
--------------------- | ||
<1> Setting the `type` parameter to `indexed` (instead of the default | ||
`memory`) tells Elasticsearch to use the inverted index for this filter. | ||
|
||
IMPORTANT: While a `geo_point` field can contain multiple geo-points, the | ||
`lat_lon` optimization can only be used on fields which contain a single | ||
geo-point. | ||
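
To see why this optimization amounts to a "glorified `range` filter", the sketch below expresses the same rectangle as two `range` filters on the separately indexed `location.lat` and `location.lon` fields. It is a conceptual illustration under that assumption, not a description of what Elasticsearch executes internally, and the `bounding_box_as_ranges` helper is hypothetical.

```python
def bounding_box_as_ranges(field, top, bottom, left, right):
    """Express a bounding box as two range filters on <field>.lat and <field>.lon.

    Conceptual sketch only; assumes the field was mapped with lat_lon enabled.
    """
    return {
        "bool": {
            "must": [
                # Latitude must lie between the bottom and top edges.
                {"range": {field + ".lat": {"gte": bottom, "lte": top}}},
                # Longitude must lie between the left and right edges.
                {"range": {field + ".lon": {"gte": left, "lte": right}}},
            ]
        }
    }


ranges = bounding_box_as_ranges("location", 40.8, 40.7, -74.0, -73.0)
print(ranges)
```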
|