AWS Elasticsearch Service Troubleshooting and Resolution Guide
A parking space for Elasticsearch troubleshooting tips that can help fix some of the issues that may come up with the service.
Because AWS gives only limited access to the underlying infrastructure of the ESS, there are only a few things we can do or reach in order to get things working again without involving AWS Support. I am hoping that some of the information provided here will help others.
NOTE
- The items in this document are based on issues that I have had in my own implementations of the AWS Elasticsearch Service (ESS). This is not meant to be an all-inclusive troubleshooting guide. What you find here may or may not work with stand-alone Elasticsearch installs.
You can use curl or the built-in Dev Tools console in Kibana to work directly with the Elasticsearch API.
When using curl, your PUT (-XPUT) commands must contain:
-H "Content-Type: application/json" -d
The request body that follows -d is sent as a JSON-formatted block.
You can also send commands from the Dev Tools console; they must be JSON-formatted blocks as well.
You can access the Dev Tools by clicking on the "Wrench" icon in Kibana.
Cluster in Read Only State
- As of Elasticsearch version 7.x, whenever the cluster loses quorum, the cluster is put in a read-only state as a precautionary measure.
- If quorum loss occurs and your cluster has more than one node, Amazon ES restores quorum and places the cluster into a read-only state. You have two options:
- Remove the read-only state and use the cluster as-is (below).
- Restore the cluster or individual indices from a snapshot.
- If quorum loss occurs and your cluster has only one node, Amazon ES replaces the node and does not place the cluster into a read-only state.
- Please note that the cluster is put in a read-only state only when it loses quorum. It is not put in a read-only state in any other scenario where the cluster goes down, and is brought back to a healthy status either by you, or by AWS.
You can tell that the cluster is in a Read Only state when no new logs are being ingested, or by checking the cluster settings as described below.
You can query the cluster to get the cluster settings:
curl https://<ElasticSearch_Endpoint>/_cluster/settings
Using the default command above will yield results in the following format:
{"persistent":{"cluster":{"routing":{"allocation":{"cluster_concurrent_rebalance":"2","node_concurrent_recoveries":"2","disk":{"watermark":{"low":"2.85gb","flood_stage":"0.95gb","high":"1.9gb"}},"node_initial_primaries_recoveries":"4"}},"blocks":{"read_only":"false"},"metadata":{"unsafe-bootstrap":"true"}},"indices":{"recovery":{"max_bytes_per_sec":"60mb"}}},"transient":{}}
If you append the following to the end of the CURL command, you will get an output similar to the Kibana Console output below:
?pretty
Example:
curl https://<ElasticSearch_Endpoint>/_cluster/settings?pretty
You can add the '?pretty' option to most of the commands you send to the stack.
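If you script these calls, one detail matters: when a URL already carries a query string, the flag has to be joined with `&` rather than a second `?`. A minimal Python sketch of that rule follows. The helper name `with_pretty` and the host `search-example.invalid` are hypothetical and only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def with_pretty(url: str) -> str:
    """Return url with the `pretty` flag added to its query string (hypothetical helper)."""
    parts = urlsplit(url)
    flags = parts.query.split("&") if parts.query else []
    # Add the flag only once, whether it was given as `pretty` or `pretty=true`.
    if not any(flag.split("=")[0] == "pretty" for flag in flags):
        flags.append("pretty")
    return urlunsplit(parts._replace(query="&".join(flags)))


# The host below is a placeholder, not a real endpoint.
print(with_pretty("https://search-example.invalid/_cluster/settings"))
print(with_pretty("https://search-example.invalid/_cat/indices?v"))
```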
GET /_cluster/settings
{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "cluster_concurrent_rebalance" : "2",
          "node_concurrent_recoveries" : "2",
          "disk" : {
            "watermark" : {
              "low" : "2.85gb",
              "flood_stage" : "0.95gb",
              "high" : "1.9gb"
            }
          },
          "node_initial_primaries_recoveries" : "4"
        }
      },
      "blocks" : {
        "read_only" : "true"
      },
      "metadata" : {
        "unsafe-bootstrap" : "true"
      }
    },
    "indices" : {
      "recovery" : {
        "max_bytes_per_sec" : "60mb"
      }
    }
  },
  "transient" : { }
}
"blocks" : {
"read_only" : "true"
},
This section shows that the cluster is in Read Only mode
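If you would rather check this from a script than read the JSON by eye, the lookup can be sketched in Python roughly as follows. This is only a sketch: the function names are hypothetical, it assumes the nested response layout shown above (not the `flat_settings` form), and it relies on transient settings taking precedence over persistent ones.

```python
import json


def _as_bool(value) -> bool:
    # The settings API returns values as strings ("true"/"false"); accept real booleans too.
    return str(value).lower() == "true"


def is_cluster_read_only(settings: dict) -> bool:
    """Return True if cluster.blocks.read_only is effectively set (hypothetical helper)."""
    # Transient settings override persistent ones, so check them first.
    for section in ("transient", "persistent"):
        blocks = settings.get(section, {}).get("cluster", {}).get("blocks", {})
        if "read_only" in blocks:
            return _as_bool(blocks["read_only"])
    return False


# Trimmed-down copy of the /_cluster/settings response format shown above.
sample = json.loads(
    '{"persistent":{"cluster":{"blocks":{"read_only":"true"}}},"transient":{}}'
)
print(is_cluster_read_only(sample))
```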
You can also see this in the logs for the Lambda Function that ships logs to CloudWatch, as in this example:
2019-11-06T16:29:15.337Z 5e062bfb-b852-43e7-8729-48101a3605eb ERROR Invoke Error{
"errorType": "Error",
"errorMessage": "{\"statusCode\":403,\"responseBody\":{\"error\":{\"root_cause\":[{\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\",\"suppressed\":[{\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\"}]}],\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\",\"suppressed\":[{\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\"}]},\"status\":403}}",
"stack": [
"Error: {\"statusCode\":403,\"responseBody\":{\"error\":{\"root_cause\":[{\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\",\"suppressed\":[{\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\"}]}],\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\",\"suppressed\":[{\"type\":\"cluster_block_exception\",\"reason\":\"blocked by: [FORBIDDEN/6/cluster read-only (api)];\"}]},\"status\":403}}",
" at _homogeneousError (/var/runtime/CallbackContext.js:13:12)",
" at postError (/var/runtime/CallbackContext.js:30:51)",
" at done (/var/runtime/CallbackContext.js:57:7)",
" at fail (/var/runtime/CallbackContext.js:69:7)",
" at Object.fail (/var/runtime/CallbackContext.js:105:16)",
" at /var/task/index.js:42:25",
" at IncomingMessage.<anonymous> (/var/task/index.js:176:13)",
" at IncomingMessage.emit (events.js:203:15)",
" at endReadableNT (_stream_readable.js:1145:12)",
" at process._tickCallback (internal/process/next_tick.js:63:19)"
]
}
blocked by: [FORBIDDEN/6/cluster read-only (api)];
This is the error that tells us the cluster is in Read Only mode.
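To flag this condition automatically, for example from a log filter, the `errorMessage` string can be parsed and checked for the block exception. The sketch below is a rough illustration rather than part of the original Lambda: the function name is hypothetical, and it assumes `errorMessage` always holds JSON shaped like the example above.

```python
import json


def is_read_only_rejection(error_message: str) -> bool:
    """Return True if an errorMessage string reports a cluster read-only block."""
    try:
        payload = json.loads(error_message)
    except json.JSONDecodeError:
        return False  # Not JSON, so not the error format shown above.
    body = payload.get("responseBody") if isinstance(payload, dict) else None
    error = body.get("error") if isinstance(body, dict) else None
    causes = error.get("root_cause", []) if isinstance(error, dict) else []
    # Match on both the exception type and the read-only reason text.
    return any(
        isinstance(cause, dict)
        and cause.get("type") == "cluster_block_exception"
        and "cluster read-only" in cause.get("reason", "")
        for cause in causes
    )


# Shortened version of the errorMessage from the log entry above.
example = json.dumps({
    "statusCode": 403,
    "responseBody": {
        "error": {
            "root_cause": [{
                "type": "cluster_block_exception",
                "reason": "blocked by: [FORBIDDEN/6/cluster read-only (api)];",
            }],
        },
        "status": 403,
    },
})
print(is_read_only_rejection(example))
```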
You can use the following command to return the cluster to Read/Write mode:
curl -XPUT https://<ElasticSearch_Endpoint>/_cluster/settings -H 'Content-Type: application/json' -d '{"persistent": {"cluster.blocks.read_only": false }}'
In the Kibana Dev Tools console, the equivalent command is:
PUT /_cluster/settings
{"persistent": {"cluster.blocks.read_only": false }}
{"acknowledged":true,"persistent":{"cluster":{"blocks":{"read_only":"false"}}},"transient":{}}
You can verify this by checking the cluster settings again. The /_cluster/settings output should now show the following:
"blocks" : {
"read_only" : "false"
},
The information provided in this Repo is licensed under the Apache 2.0 license. Please be respectful. Thanks!