Documents are indexed (stored and made searchable) by using the index API.
But first, we need to decide where the document lives. As we just
discussed, a document’s _index, _type, and _id uniquely identify the
document. We can either provide our own _id value or let the index API
generate one for us.
If your document has a natural identifier (for example, a user_account field
or some other value that identifies the document), you should provide
your own _id, using this form of the index API:
PUT /{index}/{type}/{id}
{
  "field": "value",
  ...
}
For example, if our index is called website, our type is called blog,
and we choose the ID 123, then the index request looks like this:
PUT /website/blog/123
{
  "title": "My first blog entry",
  "text": "Just trying this out...",
  "date": "2014/01/01"
}
Elasticsearch responds as follows:
{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 1,
  "created": true
}
The response indicates that the document has been successfully created.
It includes the _index, _type, and _id metadata, along with a new
element: _version.
Every document in Elasticsearch has a version number. Every time a change is
made to a document (including deleting it), the _version number is
incremented. In [version-control], we discuss how to use the _version
number to ensure that one part of your application doesn’t overwrite changes
made by another part.
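As a rough illustration, a client could read these metadata fields straight from the response body shown earlier. The following is a minimal Python sketch that only parses the JSON text; the variable names are assumptions made for illustration, and no request is sent to Elasticsearch.

```python
import json

# Response body copied from the index request example in this section.
response_body = """
{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 1,
  "created": true
}
"""

response = json.loads(response_body)

# _index, _type, and _id together identify the document.
doc_address = (response["_index"], response["_type"], response["_id"])

# In this response the document was just created and is at version 1.
is_new = response["created"]
version = response["_version"]

print(doc_address, version, is_new)
```

An application that wants to detect later changes would keep the version value alongside the document address.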
If our data doesn’t have a natural ID, we can let Elasticsearch autogenerate
one for us. The structure of the request changes: instead of using the PUT
verb ("store this document at this URL"), we use the POST verb ("store this
document under this URL"). The URL now contains just the _index and the
_type:
POST /website/blog/
{
  "title": "My second blog entry",
  "text": "Still trying this out...",
  "date": "2014/01/01"
}
The response is similar to what we saw before, except that the _id field
has been generated for us:
{
  "_index": "website",
  "_type": "blog",
  "_id": "AVFgSgVHUP18jI2wRx0w",
  "_version": 1,
  "created": true
}
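The choice between the two request forms can be sketched in a few lines of Python. The helper index_request_line is hypothetical, introduced here only for illustration; it is not part of any Elasticsearch client, and it builds the verb and URL path without sending anything.

```python
from typing import Optional, Tuple


def index_request_line(
    index: str, doc_type: str, doc_id: Optional[str] = None
) -> Tuple[str, str]:
    """Return the (HTTP verb, URL path) for indexing a document.

    Hypothetical helper for illustration only.
    """
    if doc_id is not None:
        # We supply the ID: store the document at this exact URL.
        return ("PUT", f"/{index}/{doc_type}/{doc_id}")
    # No ID supplied: store the document under this URL and let
    # Elasticsearch generate the _id.
    return ("POST", f"/{index}/{doc_type}/")


print(index_request_line("website", "blog", "123"))
print(index_request_line("website", "blog"))
```

The first call corresponds to the PUT /website/blog/123 request and the second to the POST /website/blog/ request shown in this section.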
Autogenerated IDs are 20-character, URL-safe, Base64-encoded GUID strings. These GUIDs are generated by a modified FlakeID scheme that allows multiple nodes to generate unique IDs in parallel with essentially zero chance of collision.
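The 20-character length follows from the encoding: Base64 packs 6 bits into each character, so 20 characters carry 120 bits, or 15 bytes, with no padding needed. The Python sketch below checks this arithmetic against the example _id above. It only illustrates the format: the random bytes are an assumption standing in for the real ID bytes, and this is not how Elasticsearch generates IDs.

```python
import base64
import os

# Assumption: 15 random bytes stand in for an ID's raw bytes.
# Elasticsearch derives the real bytes from its own scheme.
raw = os.urandom(15)

# URL-safe Base64 uses '-' and '_' instead of '+' and '/'.
candidate = base64.urlsafe_b64encode(raw).decode("ascii")

# 15 bytes = 120 bits = 20 groups of 6 bits, so no '=' padding appears.
print(len(candidate))

# Decoding the example _id from the response gives 15 bytes back.
example_id = "AVFgSgVHUP18jI2wRx0w"
decoded = base64.urlsafe_b64decode(example_id)
print(len(decoded))
```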