Elasticsearch Monitoring Cheatsheet
Note:
- Windows users should download cURL to use the commands below.
- Some commands require jq to parse JSON for relevant metrics.
- For more info, visit dtdg.co/monitoring-elasticsearch
General monitoring API endpoints
METRIC DESCRIPTION           COMMAND
Stats from all nodes         curl 'localhost:9200/_nodes/stats'
Stats from specific nodes    curl 'localhost:9200/_nodes/node1,node2/stats'
Stats from a specific index  curl 'localhost:9200/<INDEX_NAME>/_stats'
Cluster-wide stats           curl 'localhost:9200/_cluster/stats'

Thread pool queues & rejections
METRIC DESCRIPTION                           COMMAND
Number of queued threads in a thread pool    curl 'localhost:9200/_nodes/stats/thread_pool' | jq '.nodes[] | {node_name: .name, bulk_queue: .thread_pool.bulk.queue, search_queue: .thread_pool.search.queue, index_queue: .thread_pool.index.queue}'
Number of rejected threads in a thread pool  curl 'localhost:9200/_nodes/stats/thread_pool' | jq '.nodes[] | {node_name: .name, bulk_rejected: .thread_pool.bulk.rejected, search_rejected: .thread_pool.search.rejected, index_rejected: .thread_pool.index.rejected}'
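The rejected-threads command above returns one JSON object per node. When only the pools with rejections matter, a text filter is sometimes easier to scan. The sketch below is an assumption rather than part of the original cheatsheet: it relies on the `_cat/thread_pool` endpoint accepting the column names `name`, `rejected`, and `node_name`, which may not hold on older Elasticsearch versions. The function name `rejected_pools` is hypothetical.

```shell
# Keep only the thread pool rows that report at least one rejection.
# Reads `_cat/thread_pool` text on stdin and assumes the columns were
# requested in the order name, rejected, node_name, with a header row
# (the `v` flag). Matching rows are printed unchanged.
rejected_pools() {
  awk 'NR > 1 && $2 + 0 > 0'
}

# Hypothetical usage against a local node (not executed here):
#   curl -s 'localhost:9200/_cat/thread_pool?v&h=name,rejected,node_name' | rejected_pools
```

Rejection counts are cumulative since node start, so a nonzero value shows that rejections happened at some point, not necessarily that they are happening now.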
Cluster health
METRIC DESCRIPTION                  COMMAND
Cluster status & unassigned shards  curl 'localhost:9200/_cat/health?v'

Search performance
METRIC DESCRIPTION                       COMMAND
Total number of queries                  curl 'localhost:9200/_cat/nodes?v&h=name,searchQueryTotal'
Total time spent on queries              curl 'localhost:9200/_cat/nodes?v&h=name,searchQueryTime'
Number of queries currently in progress  curl 'localhost:9200/_cat/nodes?v&h=name,searchQueryCurrent'
Total number of fetches                  curl 'localhost:9200/_cat/nodes?v&h=name,searchFetchTotal'
Total time spent on fetches              curl 'localhost:9200/_cat/nodes?v&h=name,searchFetchTime'
Number of fetches currently in progress  curl 'localhost:9200/_cat/nodes?v&h=name,searchFetchCurrent'

Indexing performance
METRIC DESCRIPTION                            COMMAND
Total time spent indexing documents           curl 'localhost:9200/_cat/nodes?v&h=name,indexingIndexTime'
Number of documents currently being indexed   curl 'localhost:9200/_cat/nodes?v&h=name,indexingIndexCurrent'
Total number of index flushes to disk         curl 'localhost:9200/_cat/nodes?v&h=name,flushTotal'
Total time spent on flushing indices to disk  curl 'localhost:9200/_cat/nodes?v&h=name,flushTotalTime'

Fielddata cache usage
METRIC DESCRIPTION                                                                 COMMAND
Size of the fielddata cache (bytes)                                                curl 'localhost:9200/_cat/nodes?v&h=name,fielddataMemory'
Number of evictions from the fielddata cache                                       curl 'localhost:9200/_cat/nodes?v&h=name,fielddataEvictions'
Number of times the fielddata circuit breaker has been tripped (ES version >=1.3)  curl 'localhost:9200/_nodes/stats/breaker' | jq '.nodes[] | {node_name: .name, fielddata: .breakers.fielddata}'

Host-level network and system metrics
METRIC DESCRIPTION                                        COMMAND
Disk space total, free, available                         curl 'localhost:9200/_nodes/stats/fs' | jq '.nodes[] | {node_name: .name, disk_total_in_bytes: .fs.total.total_in_bytes, disk_free_in_bytes: .fs.total.free_in_bytes, disk_available_in_bytes: .fs.total.available_in_bytes}'
I/O utilization                                           Consult a tool like iostat
Percentage of file descriptors in use                     curl 'localhost:9200/_cat/nodes?v&h=host,name,fileDescriptorPercent'
Network bytes sent/received                               curl 'localhost:9200/_nodes/stats/transport' | jq '.nodes[] | {node_name: .name, network_bytes_sent: .transport.tx_size_in_bytes, network_bytes_received: .transport.rx_size_in_bytes}'
HTTP connections currently open & total opened over time  curl 'localhost:9200/_nodes/stats/http' | jq '.nodes[] | {node_name: .name, http_current_open: .http.current_open, http_total_opened: .http.total_opened}'
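The disk space command in the host-level metrics table returns raw byte counts, which are hard to compare across nodes of different sizes. As a rough sketch, the helper below turns those counts into a percentage. It assumes the same `.fs.total` fields the cheatsheet queries, reshaped by jq into tab-separated lines; the function name `disk_used_percent` and the definition of "used" as total minus available are illustrative choices, not something the cheatsheet prescribes.

```shell
# Convert per-node disk byte counts into a "percent used" figure.
# Reads tab-separated lines on stdin: node name, total bytes, available bytes.
# "Used" here means total minus available. Lines with a zero or missing
# total are skipped to avoid dividing by zero.
disk_used_percent() {
  awk -F '\t' '$2 > 0 { printf "%s\t%.1f\n", $1, 100 * ($2 - $3) / $2 }'
}

# Hypothetical usage against a local node (not executed here):
#   curl -s 'localhost:9200/_nodes/stats/fs' |
#     jq -r '.nodes[] | [.name, .fs.total.total_in_bytes, .fs.total.available_in_bytes] | @tsv' |
#     disk_used_percent
```

Tab separation is used so that node names containing spaces stay intact.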
Default directories
DIRECTORY      DEBIAN/UBUNTU                RHEL/CENTOS             ZIP OR TAR INSTALLATION
Configuration  /etc/elasticsearch           /etc/elasticsearch      <ELASTICSEARCH INSTALLATION HOME DIRECTORY>/config
Logs           /var/log/elasticsearch       /var/log/elasticsearch  <ELASTICSEARCH INSTALLATION HOME DIRECTORY>/logs
Data           /var/lib/elasticsearch/data  /var/lib/elasticsearch  <ELASTICSEARCH INSTALLATION HOME DIRECTORY>/data

JVM heap usage
METRIC DESCRIPTION                         COMMAND
Garbage collection frequency and duration  curl 'localhost:9200/_nodes/stats/jvm' | jq '.nodes[] | {node_name: .name, young_gc_count: .jvm.gc.collectors.young.collection_count, young_gc_time: .jvm.gc.collectors.young.collection_time_in_millis, old_gc_count: .jvm.gc.collectors.old.collection_count, old_gc_time: .jvm.gc.collectors.old.collection_time_in_millis}'
Percent of JVM heap currently in use       curl 'localhost:9200/_cat/nodes?v&h=name,heapPercent'
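The `heapPercent` command from the JVM heap usage table prints one row per node, so a small filter can pick out nodes running close to their heap limit. This is only a sketch: the function name `high_heap_nodes` is hypothetical, and the default threshold of 85 is an illustrative assumption, not a value recommended by this cheatsheet.

```shell
# List nodes whose heap usage is at or above a threshold (default 85).
# Reads the output of `_cat/nodes?v&h=name,heapPercent` on stdin:
# a header row, then one row per node with the percentage in the last column.
# Node names may contain spaces, so the name is everything before that column.
high_heap_nodes() {
  threshold="${1:-85}"
  awk -v limit="$threshold" '
    NR > 1 && NF >= 2 && $NF + 0 >= limit + 0 {
      heap = $NF
      sub(/[ \t]+[^ \t]+[ \t]*$/, "")   # drop the last column, keep the name
      print $0 ": " heap "%"
    }'
}

# Hypothetical usage against a local node (not executed here):
#   curl -s 'localhost:9200/_cat/nodes?v&h=name,heapPercent' | high_heap_nodes 90
```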
Pending tasks
METRIC DESCRIPTION       COMMAND
Number of pending tasks  curl 'localhost:9200/_cluster/pending_tasks'
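The command above returns the pending tasks themselves rather than a count. One way to get a single number is to count the rows of the text-based `_cat/pending_tasks` endpoint. This is a sketch under two assumptions: that this endpoint is available in your Elasticsearch version, and that the request omits `v` so the output has no header row. The function name `count_pending_tasks` is hypothetical.

```shell
# Count pending cluster tasks from `_cat/pending_tasks` text on stdin.
# Assumes no header row (request made without `v`), so every non-empty
# line is one task. Prints 0 when the input is empty.
count_pending_tasks() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Hypothetical usage against a local node (not executed here):
#   curl -s 'localhost:9200/_cat/pending_tasks' | count_pending_tasks
```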
Cheatsheet: Elasticsearch Tuning
Note:
- Windows users should download cURL to use the commands below.
Results of each suggested action may vary depending on your particular use case and setup.
Please test them out before implementing in production. For more info, visit dtdg.co/tuning-elasticsearch
5. Pending tasks
METRIC DESCRIPTION       DATADOG METRIC NAME
Number of pending tasks  elasticsearch.pending_tasks_total