Elasticsearch
TRANSCRIPT
Current Architecture

Logstash Cluster: 3 master / 9 data. Logs only.

Custer Cluster: 3 master / 10 data. Processing data, mission critical. Soon to be firewalled off.

Winston Cluster: 3 master / 3 data. "Prod"-quality playground. Kibana access requires CCB approval to create an index/dashboard.
Stats by Cluster

Logstash Cluster: 12 nodes, 1,251 indices, 1,158 shards, 436M docs, 716 GB data (1,158 closed indices)
Custer Cluster: 13 nodes, 1,187 indices, 1,995 shards, 115M docs, 1.75 TB data (559 closed indices)
Winston Cluster: 6 nodes, 2 indices, 3 shards, 10M docs, 5.39 GB data (0 closed indices)
Elastizabbix: Monitoring
● Written Angrily (...Friday night)
● Old fashioned
● Auto-discovers nodes and indices
● Dot-notation syntax to collect anything
● Managed from the Zabbix user interface
● Will not overload the cluster with data collection
● Works surprisingly well
Elastizabbix: Monitoring

Elasticsearch Stats API:

GET _cluster/stats

"indices": {
  "docs": {
    "count": 418156163,
    "deleted": 2278242
  }
}
Zabbix Item (avoids scripting):

elastizabbix[cluster, indices.docs.count] = 418156163
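The "dot-notation syntax to collect anything" amounts to walking the nested JSON of a stats response along a dot-separated path. A minimal sketch of that idea, using the `_cluster/stats` fragment from the slide; the `resolve` function and the trimmed payload are illustrative, not the actual Elastizabbix code:

```python
# Hypothetical sketch: resolve a dot-notation item key such as
# "indices.docs.count" against the JSON body of GET _cluster/stats.

def resolve(stats, dotted_key):
    """Walk a nested dict using a dot-separated path."""
    value = stats
    for part in dotted_key.split("."):
        value = value[part]
    return value

# Trimmed-down _cluster/stats response from the slide.
cluster_stats = {
    "indices": {
        "docs": {"count": 418156163, "deleted": 2278242}
    }
}

print(resolve(cluster_stats, "indices.docs.count"))  # 418156163
```

This is why no per-metric scripting is needed: any value the stats APIs expose is addressable by key alone.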
Elastizabbix: Alerting

Triggers (get an adult!):

{elastizabbix[nodes,nodes.{#NODE}.jvm.mem.heap_used_percent].last()}>95 = Disaster!
● Escalate to operations (email, XMPP, Slack, Kibana, etc.)
● Look at your favorite monitoring tool (Zabbix, Marvel, HQ, Kopf, etc.)
● Do something about it before the API becomes unreliable.
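What the heap trigger above checks can be sketched in a few lines: take a `_nodes/stats`-shaped payload and flag every node whose `jvm.mem.heap_used_percent` is past the disaster threshold. The function name and sample values here are assumptions for illustration, not Elastizabbix internals:

```python
# Illustrative sketch of the >95% heap trigger evaluated per node
# against a _nodes/stats-shaped payload.

DISASTER_THRESHOLD = 95

def nodes_over_threshold(nodes_stats, threshold=DISASTER_THRESHOLD):
    """Return ids of nodes whose JVM heap usage exceeds the threshold."""
    return [
        node_id
        for node_id, node in nodes_stats["nodes"].items()
        if node["jvm"]["mem"]["heap_used_percent"] > threshold
    ]

sample = {
    "nodes": {
        "node-a": {"jvm": {"mem": {"heap_used_percent": 72}}},
        "node-b": {"jvm": {"mem": {"heap_used_percent": 97}}},
    }
}

print(nodes_over_threshold(sample))  # ['node-b']
```

Acting early matters because a node pinned above ~95% heap spends its time in GC, which is exactly when the stats API itself starts timing out.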
The Quest for MBeans
Relying on the Elasticsearch API for monitoring/statistics is the equivalent of relying on the patient for info during surgery.
Bulk Indexing

● Tune for payload size, not doc count (~5-15 MB)
● EsRejectedExecutionException or TOO_MANY_REQUESTS (429)
● Handling failures
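Batching by payload size rather than document count can be sketched as below: accumulate actions until the serialized NDJSON body approaches the target size, then flush. The function name and the 10 MB default are assumptions (the slide's guidance is ~5-15 MB), not a specific client API:

```python
# Sketch of size-based bulk batching: flush when the serialized
# payload would exceed the byte budget, never on a fixed doc count.
import json

MAX_PAYLOAD_BYTES = 10 * 1024 * 1024  # aim for the 5-15 MB range

def batch_by_size(actions, max_bytes=MAX_PAYLOAD_BYTES):
    """Yield lists of actions whose combined NDJSON size stays under max_bytes."""
    batch, size = [], 0
    for action in actions:
        line = json.dumps(action).encode() + b"\n"
        if batch and size + len(line) > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(action)
        size += len(line)
    if batch:
        yield batch
```

A production indexer would also handle the failure modes above: on a 429 (the HTTP face of EsRejectedExecutionException), back off and retry the same batch rather than dropping it, and inspect per-item results since a bulk response can be 200 overall while individual actions failed.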