c97-717209-00 © 2012 cisco and/or its affiliates. all rights reserved. 1
TRANSCRIPT
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 1
Large scale near real-time log indexing with Apache Flume and SolrCloud
Hadoop Summit 2013
Ari FlinkOperations ArchitectCisco/WebEx
June 27th, 2013
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 2
Agenda
1. Intro
2. Problem to solve?
3. How does Flume/Solr help?
4. Syslog indexing example
5. HA, DR & scalability
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 3
Me
Ops Architect at Cisco CCATG (WebEx)Ensure operational readiness for complex distributed services
HA, DR, monitoring, config, deployment
Previously eBay, Excite@Home, IBM, VISAOperations architecture, monitoring, event correlation
© 2012 Cisco and/or its affiliates. All rights reserved. 5
Cisco CCATG - Cloud Collaboration Application Technology Group
Cisco WebEx Meetings
• Voice, video, desktop sharing• Meeting/Event/Support/Training• Centers• Integration with TelePresence
Cisco WebEx Social
• Social networking• Content creation• Integrated IM
Cisco WebEx Messenger
• IM, presence• Integrate with voice, video• XMPP
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 6© 2010 Cisco and/or its affiliates. All rights reserved. 6
Participants from over 231 countries, 52% market share
2.2 Billion meeting minutes per month
40.5 Million meeting attendees per month
9.4 million registered hosts worldwide
4 Million mobile downloads
Cisco WebEx - Leader for SaaS Web Conferencing
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 7© 2010 Cisco and/or its affiliates. All rights reserved. 7
Cisco WebEx Collaboration Cloud
Datacenter / PoP
Leased network link
Global Scale: 13 datacenters & iPoPs around the globe
Dedicated network: dual path 10G circuits between DCs
Multi-tenant: 95k sites
Real-time collaboration: voice, desktop sharing, video, chat
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 8© 2010 Cisco and/or its affiliates. All rights reserved. 8
S#!% happens ..
Datacenter / PoP
Leased network link
People make mistakesHardware failsSoftware failsEven failovers sometimes fail
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 9
Recovery-Oriented Computing Philosophy
“If a problem has no solution, it may not be a problem, but a fact, not to be solved, but to be coped with over time”
— Shimon Peres (“Peres’s Law”)
People/HW/SW failures are facts, not problems
Operations main goal is to maintain high service availability• Recovery/repair is how we cope with above facts• Improving recovery/repair improves availability
UnAvailability = MTTR / MTBF1/10th MTTR just as valuable as 10x MTBF
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 10
How could we make recovery & repair faster?
Even better: proactive
Good: reactive
Your search – What is the root cause of the outage? – did not match any documents.
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 11
The goal: NRT full text log search
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 12
Cisco WebEx log collection overview
Flume
Log4j
File
Avro
Syslog
Other SinksSolrSink
App
lica
tion
stat
e &
AP
Is
HDFS
Thrift
AMQP RDBMS
Sqoop
HTTP/REST
MySQL
Unstructured/semi-structured data Structured data
Cisco UCS C240 M3 servers
12 x 3TB = 36 TB / server
HDFSSink
SolrCloud
Raw dataSolr index
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 13
Global Flume topology
DC 1
HDFS
Flume
SolrCloud
FlumeFlume
DC 2
HDFS
Flume
SolrCloud
FlumeFlume
DC 1
FlumeFlume
Flume
syslog log4j file
DC N
FlumeFlume
Flume
syslog log4j file
… Collectortier
Storagetier
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 14
Flume Collector tier
agent agent agent
File Channel 1
Avro src
DC1 Avro sink
DC2 Avro sink
File Channel 2
…
Replicatingfan-out
flow Flume Collector server
Failover & load balancing agents
Flume Storage tier
All events replicated to both Channels
DC1 DC2
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 15
Global Flume topology
DC 1
HDFS
Flume
SolrCloud
FlumeFlume
DC 2
HDFS
Flume
SolrCloud
FlumeFlume
DC 1
FlumeFlume
Flume
syslog log4j file
DC N
FlumeFlume
Flume
syslog log4j file
… Collectortier
Storagetier
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 16
Flume Storage tier
File Channel 1
Avro src
SolrSink
HDFSsink
File Channel 2
…
Multiplexingfan-out
flow Flume Storage tier server
Failover & load balancing agents
FlumeCollector
FlumeCollector
FlumeCollector
HDFSSolrCloud
Routing to Solr by Flume event header
All events to HDFS
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 17
Schema or no schema ..
Isn’t Big Data “schema on read”?• Why does Solr require a schema on write?• Dirty little secret: there’s always a schema• Performance & functionality vs flexibility• Optimize operations and storage based on field type - that's how you
get sub second response times
There’s always a schema• Application code vs. central location
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 18
Cloudera Morphlines: streaming ETL
Cloudera Morphlines
• Framework to simplify event transformation • Compatible with existing grok patterns• Reusable across multiple index workloads:
Flume & M/R
Command: readLine
Command: grok
Command: loadSolr
Solr
Flume event = headers + body
Record
Document matching schema.xml
Command: tryRules
Command: addValues
…Record
Record
Record
Record
SolrSink
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 19
Flume Morphline example
Convert syslog message..
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
.. into Solr schema fields
Severity=[3] Facility=[22] host=[colo01-wxp00-ace01b-connect.webex.com] timestamp=[2013-06-16T04:36:49.000Z]syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234] severity_label=[error] access_token=[54asdf654] id=[b2f839c3-dece-404f-a535-e0141ad549bf] cisco_product=[ACE] cisco_level=[3] cisco_id=[251008] cisco_code=[%ACE-3-251008]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 20
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 1: readLine reads in Flume event headers and body
timestamp=[1371357409000]host=[colo01-wxp00-ace01b-connect.webex.com]category=[545f5sfsd5sf]Severity=[3] Facility=[22] message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
Headers
Body
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 21
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 2: convertTimestamp converts epoch to ISO 8601 format
timestamp=[2013-06-16T04:36:49.000Z]host=[colo01-wxp00-ace01b-connect.webex.com]access_token=[545f5sfsd5sf]message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]Severity=[3] Facility=[22]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 22
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 3: addValues creates new field access_token
timestamp=[2013-06-16T04:36:49.000Z] category=[545f5sfsd5sf] access_token=[545f5sfsd5sf] message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234] host=[colo01-wxp00-ace01b-connect.webex.com] Severity=[3] Facility=[22]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 23
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 4: tryRules creates field severity_label for severity
timestamp=[2013-06-16T04:36:49.000Z] severity_label=[error] access_token=[545f5sfsd5sf] message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234] host=[colo01-wxp00-ace01b-connect.webex.com] category=[545f5sfsd5sf] Severity=[3] Facility=[22]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 24
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 5: tryRules creates new fields
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234] cisco_product=[ACE] cisco_level=[3] cisco_id=[251008] cisco_code=[%ACE-3-251008]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 25
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 6: sanitizeUnknownSolrFields drops non-schema fields
timestamp=[2013-06-16T04:36:49.000Z]syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234] severity_label=[error] access_token=[545f5sfsd5sf] host=[colo01-wxp00-ace01b-connect.webex.com] cisco_product=[ACE] cisco_level=[3] cisco_id=[251008] cisco_code=[%ACE-3-251008]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 26
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 7: generateUUID creates an unique id for the document
timestamp=[2013-06-16T04:36:49.000Z]syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234] severity_label=[error] access_token=[545f5sfsd5sf] id=[b2f839c3-dece-404f-a535-e0141ad549bf] host=[colo01-wxp00-ace01b-connect.webex.com] cisco_product=[ACE] cisco_level=[3] cisco_id=[251008] cisco_code=[%ACE-3-251008]
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 27
Flume Morphline example
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 8: loadSolr loads a record into a Solr server
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 28
Morphlines recap
Command: readLine
Command: grok
Command: loadSolr
SolrCloud
Flume syslog event = headers + body
Record
Document matching schema.xml
Command: tryRules
Command: addValues
…Record
Record
Record
Record
SolrSink
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 29
SolrCloud
ZooKeeper
leader1
replica1
Shard1
leader2
replica2
Shard2
leader3
replica3
Shard3
SolrCloud cluster
zk1
zk2
zk3
Pluggable filesystem(local, HDFS)
Add doc to syslog index
• Collections, shards & replicas• Pluggable file system• Central config & coordination with
ZK • Full HA, automatic fail-over• NRT indexing• Automatic routing
Where can I index data?
leader3
Collection
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 30
SolrCloud
Collection “syslog” with three shards
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 31
Solr index management
Special case of search• Logs are time series data: timestamp + data
• High indexing rate, no updates
• New data is more frequently searched than old
Collection aliases• Time partitioned collections – e.g. one collection per day
• Reduces the workload to near-real-time data only
• One-to-many collection mapping: queries go to a logical representation mapped to multiple, same-schema collection
• Simplifies for hot-warm-cold migration of data
Index expiration• Old data is aged out by Collection Aliases
• Remap only the latest collection to an alias
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 32
Solr & HDFS DR
Solr• No multi-datacenter cluster support
HDFS• No multi-datacenter cluster support
Options?• All our services must survive DC outage
• . . so should logging and indexing
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 33
DR option 1: Flume dual writes
DC 1
HDFS
Flume
SolrCloud
FlumeFlume
DC 2
HDFS
Flume
SolrCloud
FlumeFlume
DC 1
FlumeFlume
Flume
syslog log4j file
DC 2
FlumeFlume
Flume
syslog log4j file
DC N
FlumeFlume
Flume
syslog log4j file
…Collector
tier
StoragetierPlanned or
unplanned outage
Flume Collector disk channel buffering DC1 events
DC1 Hadoop cluster back online after outage
Replicate aggregate
data
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 34
DR option 2: distcp + M/R
DC 1
HDFS
Flume
SolrCloud
FlumeFlume
DC 2
HDFSSolrCloud
DC 1
FlumeFlume
Flume
syslog log4j file
DC 2
FlumeFlume
Flume
syslog log4j file
DC N
FlumeFlume
Flume
syslog log4j file
… Collectortier
Storagetier
FlumeFlume
Flume
distcp
Manual CNAME change to DC2
DC1 back online, sync data from DC2
Data sent only to a single DC
distcp
DNS CNAME change back to DC1
Flip distcp the other way
Flume buffering events at collector tier
Create indexes with M/R
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 35
Tiers to scale
• Flume Collector tier• Flume Storage tier• SolrCloud
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 36
Flume Collector tier throughput100 – 5000 servers per a datacenter
agent agent agent
File Channel 1
Avro src
DC1 Avro sink
DC2 Avro sink
File Channel 2
…
Replicatingfan-out
flow
agent agent agent …
…Flume Collector
More agents and data
FileChannel:14MB/sec
NIC:100MB/sec
NIC:100MB/sec
File Channel 1
Avro src
DC1 Avro sink
DC2 Avro sink
File Channel 2
Replicatingfan-out
flow
Max per server:
14MB/s1.2 TB/day
70k events/s
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 37
Scalability: Flume Storage tierDC 1 collectors
DC 1storage tier
Flume 1
DC 2 storage tier
Avro sink
1
Avro sink
2
Avro sink
N…
DC 2 collectors
Avro sink
1
Avro sink
2
Avro sink
N…
DC N collectors
Avro sink
1
Avro sink
2
Avro sink
N……
File Chan1
Avro src
HDFSsink
Solrsink
File Chan2
Multiplexingfan-out
flowFile
Chan1
Avro src
HDFSsink
Solrsink
File Chan2
Multiplexingfan-out
flowFile
Chan1
Avro src
HDFSsink
Solrsink
File Chan2
Multiplexingfan-out
flowFile
Chan1
Avro src
HDFSsink
Solrsink
File Chan2
Multiplexingfan-out
flow
Max per server:14MB/s
1.2 TB/day70k events/s
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 38
SolrCloud scalablity
ZooKeeper
leader1
replica1
Shard1
leader2
replica2
Shard2
leader3
replica3
Shard3
SolrCloud cluster
zk1
zk2
zk3
Pluggable filesystem(local, HDFS)
New logsto index
Searchqueries
1000 tx/sec/core
2x8 cores 16k tx/sec
3 shards3 x 16k =
48k tx/sec
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 39
Syslog indexing recap
Central syslog servers• Network and OS system messages forwarded to several central syslog
servers
Forward syslog to Solr using Flume Morphline SolrSink• Parse messages with Morphline and grok patterns
SolrCloud • Index log lines as documents into a Collection (i.e. index)
HUE Solr search • Simple UI to build a customized search page layout with faceting, sorting.
• Easy drill down with multiple facets: severity, datacenter, hostname, etc
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 41
Search by time
Sort by select field
Facets by selected fields
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 42
Wildcard query by field
Highlight the query keywords
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 43
Data sources: REST/JSON, log4j, syslog, Avro, Thrift
Parsing: Cloudera Morphlines
NRT Indexing: SolrCloud embedded in CDH
Batch indexing: MapReduce
Analytics: Use your favorite tool, raw detailed data stored in HDFS
Summary
C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 44
email: [email protected]: @raaka
Questions ?