Document Oriented Database Infrastructure for Monitoring HEP Data Systems Applications Carlos Fernando Gamboa RACF, BNL HEPiX Brookhaven National Laboratory, NY, USA October 2015



Overview

1. Brief ELK framework review
2. ELK test deployment to monitor storage-related applications


The Elasticsearch, Logstash, Kibana (ELK) Ecosystem

- Logstash: data collection and formatting
- Elasticsearch: data storage
- Kibana: visualization and data analysis


The Elasticsearch, Logstash, Kibana (ELK) Ecosystem: Logstash

An event is shipped via the logstash-forwarder client, then collected and processed sequentially at the Logstash server, i.e.

- Client: File, Logstash-forwarder (lumberjack), with compression and encryption
- Server input: Logstash-forwarder (lumberjack)
- Server filter: Grok(), Date(), GeoIP()
- Server output: elasticsearch
- Visualization: Kibana
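
The sequential input, filter, output flow described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not how Logstash is implemented: the function names (`tag_type`, `add_received_from`, `run_pipeline`) are hypothetical, and a plain list stands in for the Elasticsearch output.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]
Filter = Callable[[Event], Event]


def tag_type(event: Event) -> Event:
    """Hypothetical filter: label the event so later stages can route it."""
    return {**event, "type": "secure_bestman"}


def add_received_from(event: Event) -> Event:
    """Hypothetical filter: copy the shipping host into a new field."""
    return {**event, "received_from": event.get("host", "unknown")}


def run_pipeline(lines: Iterable[str], host: str, filters: List[Filter]) -> List[Event]:
    """Input -> filters (in order) -> output, one event at a time."""
    output: List[Event] = []
    for line in lines:                    # input stage: one event per log line
        event: Event = {"message": line.rstrip("\n"), "host": host}
        for apply_filter in filters:      # filter stage: strictly sequential
            event = apply_filter(event)
        output.append(event)              # output stage: stand-in for Elasticsearch
    return output


events = run_pipeline(
    ["sudo: example line\n"],
    "aws01.racf.bnl.gov",
    [tag_type, add_received_from],
)
print(events[0])
```

Each filter receives the event produced by the previous one, which is why the order of the filter list matters.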


The Elasticsearch, Logstash, Kibana (ELK) Ecosystem: Elasticsearch

A horizontally scalable document-oriented database:

- Built on Apache Lucene (Java).
- A mapping is comparable to a schema definition in SQL databases.
- If the mapping has not been created, the server infers the field types from the field values.
- The query language, called Query DSL, is based on JSON; queries can also be issued via the URL API, i.e.:

    [user@racprodb07 ~]# curl -XGET 'http://localhost:9200/aws*/secure_bestman/_search?q=sec_target:"/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1"&pretty=true'
    {
      "took" : 6,
      "timed_out" : false,
      "_shards" : {
        "total" : 44,
        "successful" : 44,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 47.9029,
        "hits" : [ {
          "_index" : "aws-se-2015.09.23",
          "_type" : "secure_bestman",
          "_id" : "AU_6UcW9_b_e2-r1bS0Y",
          "_score" : 47.9029,
          "_source":{"message":"Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; USER=usatlas3 ; COMMAND=/bin/rm /mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1","@version":"1","@timestamp":"2015-09-23T13:08:27.193Z","type":"secure_bestman","file":"/var/log/secure","host":"aws01.racf.bnl.gov","offset":"3116045","sec_timestamp":"Sep 23 08:57:54","sec_host":"aws01","sec_oper":"sudo","sec_sudo_user":"bestman","sec_path":"/tmp","sec_user":"usatlas3","sec_command":"/bin/rm","sec_target":"/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1","syslog_received_at":"2015-09-23T13:08:27.193Z","received_from":"aws01.racf.bnl.gov"}
        } ]
      }
    }

Data model labels (SQL analogue in parentheses): Index (database), Document Type (table), File, Field
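
The URL search in the Elasticsearch example can also be expressed as a Query DSL request body. The sketch below builds both forms for the same `sec_target` lookup without contacting a server; the `q=` parameter and a `query_string` query accept the same expression. The helper names (`uri_search`, `dsl_search`) are hypothetical, and the host, index pattern, and type are simply copied from the slide.

```python
import json
from urllib.parse import quote

ES_URL = "http://localhost:9200"   # host and port as shown on the slide
INDEX_PATTERN = "aws*"
DOC_TYPE = "secure_bestman"
TARGET = "/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1"


def uri_search(field: str, value: str) -> str:
    """Build the URL-API form of the search: ?q=field:"value" (percent-encoded)."""
    q = quote(f'{field}:"{value}"', safe="")
    return f"{ES_URL}/{INDEX_PATTERN}/{DOC_TYPE}/_search?q={q}&pretty=true"


def dsl_search(field: str, value: str) -> str:
    """Build a JSON request body carrying the same expression as a query_string query."""
    body = {"query": {"query_string": {"query": f'{field}:"{value}"'}}}
    return json.dumps(body)


url = uri_search("sec_target", TARGET)
body = dsl_search("sec_target", TARGET)
print(url)
print(body)
```

The JSON body would be sent to the same `_search` endpoint, for example with `curl -XGET ... -d '<body>'`; the request-body form is the one that extends to filters and aggregations.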


The Elasticsearch, Logstash, Kibana (ELK) Ecosystem: Kibana

Kibana is an analytics and visualization platform designed to work with Elasticsearch. The input field allows interactive queries to be issued.

Discover page: Index, Fields, Results, Input field

Dashboard: Visualization 1, Visualization 2, Visualization 3, ..., Visualization N


The Elasticsearch, Logstash, Kibana (ELK) Ecosystem: Kibana

Visualization: provides dynamic creation of individual visualizations:

- Based on individual searches (interactive or saved) or on other visualizations
- Pie charts, histograms, bar charts, and tile maps are available to create the visualization

Dashboard: displays a group of stored visualizations. A search field and a time filter are enabled by default in the dashboard.

Example dashboard: Visualization 1 (pie charts), Visualization 2 (histograms), Visualization 3 (bar chart), tile maps, search field, time filter


The Elasticsearch, Logstash, Kibana (ELK) Ecosystem

The event

    [root@aws01 ~]# tail -1 /var/log/secure
    Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; USER=usatlas3 ; COMMAND=/bin/rm /mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1

The Filter

    filter {
      # Apply only to events whose type is "secure_bestman"
      if [type] == "secure_bestman" {
        grok {
          patterns_dir => "/etc/logstash/patterns"
          match => { "message" => "%{SECURE}" }
          # Record when and from which host the event was received
          add_field => [ "syslog_received_at", "%{@timestamp}" ]
          add_field => [ "received_from", "%{host}" ]
        }
      }
    }

The output (Kibana)

Visualized on Kibana (events aggregated)
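
The `SECURE` pattern itself is not shown on the slide. As a rough illustration of what a grok pattern does, the sketch below uses a hand-written Python regular expression (an assumption, not the pattern used at BNL) to split the example event into the `sec_*` fields that appear in the stored Elasticsearch document.

```python
import re
from typing import Dict, Optional

# Hypothetical stand-in for the SECURE grok pattern: named groups become fields.
SECURE_RE = re.compile(
    r"^(?P<sec_timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<sec_host>\S+) "
    r"(?P<sec_oper>\w+):\s+"
    r"(?P<sec_sudo_user>\S+) : "
    r"TTY=\S+ ; "
    r"PWD=(?P<sec_path>\S+) ; "
    r"USER=(?P<sec_user>\S+) ; "
    r"COMMAND=(?P<sec_command>\S+)(?: (?P<sec_target>.+))?$"
)


def parse_secure(message: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the extracted fields, or None when the line does not match."""
    match = SECURE_RE.match(message)
    return match.groupdict() if match else None


line = (
    "Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; "
    "USER=usatlas3 ; COMMAND=/bin/rm "
    "/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1"
)
fields = parse_secure(line)
print(fields)
```

Lines that do not match return `None`; in Logstash the comparable case is an event tagged with a grok parse failure instead of being enriched.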


ELK test deployment to monitor storage related applications.


Monitoring selected storage services

Application logfiles are monitored using the Elasticsearch, Logstash and Kibana (ELK) framework.

[Diagram: Amazon Web Services (Simple Storage Service (S3); AWS SE with Bestman, SRM, Gridftp 1, Gridftp 2) and the BNL dCache SE, connected to BNL ELK monitoring over WAN and LAN links. Annotations: "Consolidated into the BILLING logs"; "No central collection of information".]


BNL Test ELK layout

3 AWS VMs and 3 physical servers monitored.

[Diagram: Server 1, Server 2, ..., Server N each run Logstash-forwarder on a logfile and connect over WAN or LAN to the BNL ELK test server, which runs Logstash input (lumberjack), Logstash filters, Logstash output (elasticsearch), and Kibana.]


BNL Test ELK layout: Test Dashboards

Intended to be used by the site admin. Nginx is used to serve/proxy access to the dashboards.

Link to interactive query dashboard


dCache Billing Monitoring Dashboard

- Dashboard ported to Kibana 4.1, using previous work done for Kibana 3 as a reference [2].
- Data collected using the published grok filter patterns [2].
- Integrated tile maps, error charts, and stats, among other improvements.

Dashboard panels: Read/Writes per Sunit type, 15 Top Pools, Event Dist. per Transfer Protocol, Top Errors per Transfer Protocol, Detail record


AWS SE Bestman Monitoring Dashboard

Visualization created using grok filter patterns [1]

Dashboard panels: Total size buckets, Gridftp transfers, SRM File Deletion


dCache Billing Dashboard performance with a 5-minute refresh period

ELK test server

Current stable configuration

No major client overhead on the monitored hosts.

Tuning effort is concentrated on Elasticsearch and Kibana, working with different parameters, such as:

- Thread pool search memory
- Kibana timeouts


dCache Billing dashboard aggregated report performance

Report ranges shown: last 7 days, last 30 days, last 60 days, last 90 days

dCache Billing document size is ~400M
Total size 320 GB
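
As a back-of-the-envelope check on these two figures, the sketch below computes the average size per stored document. It rests on two assumptions that the slide does not state: that "~400M" is the number of dCache billing documents, and that "320 GB" means decimal gigabytes of index data.

```python
DOC_COUNT = 400_000_000       # "~400M", read here as a document count (assumption)
TOTAL_BYTES = 320 * 10**9     # "320 GB", read as decimal gigabytes (assumption)


def average_document_bytes(total_bytes: int, doc_count: int) -> float:
    """Average stored size per document, in bytes."""
    return total_bytes / doc_count


avg = average_document_bytes(TOTAL_BYTES, DOC_COUNT)
print(f"{avg:.0f} bytes per document")   # prints "800 bytes per document"
```

Under those assumptions each billing record occupies roughly 800 bytes once indexed; a different reading of either figure changes the result proportionally.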


BNL Test ELK Software/Hardware

1 ELK node deployed

ELK software:
- Logstash 1.5.4
- Elasticsearch 1.5.2-1 1.7
- Kibana 4.1.1
- Logstash-forwarder 0.40

OS: RHEL 6.6

Legacy hardware used:
- Head node: IBM x3650 M3 node, CPUs: 16 x 2.53GHz, 49GB memory, 10Gbps network interconnectivity
- External storage: IBM DS3500

[Diagram: ELK, Node 1, Node 2, DS3500, three DS3500 expansions; 12 SAS 15krpm, 600 GB/disk.]


Sources/References

1. Peter Love, https://github.com/ptrlv/logstash

2. dCache Development Team, https://github.com/dCache/logstash4dcache

3. General reference information, https://www.elastic.co

4. Johan Guldmyr, rich presentation about ELK, https://indico.desy.de/contributionDisplay.py?contribId=4&confId=11773

5. Ilija Vukotic, example of Elasticsearch and Kibana with a different data collector infrastructure, https://docs.google.com/presentation/d/1oFWLLCP7XxUxrccEH45JYDORsFEQQxtyrsZ9fs677bM/edit#slide=id.p


Thank you


Backup slide


Logstash

Software functionality is distributed as a modular, pluggable pipeline infrastructure: INPUT, FILTER, OUTPUT.

INPUT
- stdin(): testing, troubleshooting
- Logstash-forwarder(): compression, transmission
- Redis(), RabbitMQ(): large clusters, queuing
- file(), Syslog(), Rsyslog()

FILTER
- Grok(): extract data using pattern matching
- Date(): parse timestamps from fields using an assigned time format and apply them to the processed event
- Mutate(): manipulate, modify event field data
- Geoip(): find IP address geolocation using the MaxMind database

OUTPUT
- Storage: File, S3, MongoDB, Elasticsearch, …
- Relay: RabbitMQ, TCP
- Notifications: email, Nagios
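
To make the role of a date-style filter concrete, the sketch below parses a syslog-style timestamp from an event field and stores the result as the event timestamp. It is a simplified illustration in Python, not Logstash's implementation: the function name `apply_date_filter` is hypothetical, the year is supplied by the caller because syslog timestamps omit it, and time zones are ignored.

```python
from datetime import datetime
from typing import Dict

# %b expects English month abbreviations, as in the default C locale.
SYSLOG_FORMAT = "%Y %b %d %H:%M:%S"


def apply_date_filter(event: Dict[str, str], field: str, year: int) -> Dict[str, str]:
    """Parse event[field] and return a copy with the result stored in "@timestamp"."""
    parsed = datetime.strptime(f"{year} {event[field]}", SYSLOG_FORMAT)
    return {**event, "@timestamp": parsed.isoformat()}


event = apply_date_filter({"sec_timestamp": "Sep 23 08:57:54"}, "sec_timestamp", 2015)
print(event["@timestamp"])   # prints 2015-09-23T08:57:54
```

Because the function takes an event and returns an event, it could be chained with other filters in the same INPUT, FILTER, OUTPUT order shown on this slide.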