mining your logs - gaining insight through visualization
Post on 18-Oct-2014
Mining Your Logs - Gaining Insight Through Visualization
Google TechTalk March 2011
Raffael Marty - @zrlram
© by Raffael Marty | Logging as a Service
Raffael Marty
• Founder @
• Chief Security Strategist and Product Manager @ Splunk
• Manager Solutions @ ArcSight
• Intrusion Detection Research @ IBM Research
• IT Security Consultant @ PriceWaterhouse Coopers
Applied Security Visualization
Publisher: Addison Wesley (August, 2008)
ISBN: 0321510100
Agenda
• Log Analysis
• History
• Log Architectures
• What’s Working and What’s Not?
• Future Needs
• Data Visualization
• Visualization Concepts
• Security Visualization Use-Cases
Log Analysis
10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/ HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" TYhzVH8AAAEAAGOkBOQAAADA 655268
2010-12-28T18:12:10.031+00:00 frontend2-raffy syslog-ng[19600]: syslog-ng starting up; version='3.1.1'
2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160
History
• 1980 Eric Allman develops syslogd(8)
• 1996 Intellitactics
• 1997 Tivoli Risk Manager developed by IBM Research in Zurich (later Zurich Correlation Engine, ZCE)
• 1999 - 2010 A number of log management / SIEM players enter the market (software, appliances)
• 2000 ArcSight - 2010 sold for $1.65bn to HP
• 2009 Loggly (logging as a service)
History - The Other View
• Network management (SNMP)
• IDS false positive reduction
• Security monitoring (multiple data sources)
• Unification of NOC and SOC (failed?)
• Application monitoring (moving up the stack)
  - original tools failed due to architectural constraints
  - new approaches have been presented
Log Management Today
Where are you?
Log Management Today
DIY
• grep
• Perl
• SQL
Log Management
• Open source
• Commercial
CEP and SIEM
• Open source
• Commercial
MapReduce
• Open source
Advanced Analytics
• Not log specific!
fewer tools
Open Source Tools
• graylog2
• logstash
• swatch
• tenshi
• logwatch
• OSSEC
• snare
• lasso
• lire
• LogSurfer
• SEC
• LogHound
• slct
• log2timeline
• logzilla
• OSSIM
• MS Logparser
• Sguil
• Octopussy
• Sagan
this list is likely incomplete!
© PixlCloud LLC 2011 | pixlcloud | Visualization in the Cloud
Commercial Tools
this list is likely incomplete!
Log Architectures
Log Mgmt Architecture
Collection: syslog, OPSEC, SDEE, netflow, database
Storage: on board, external storage array, clusters
Processing: indexing, context storage, clustering
Log Mgmt Architecture
Collection: syslog, OPSEC, SDEE, netflow, database
Processing: indexing, context storage, clustering
Data Access: free-text search, field-based search, tagging schemas
[Diagram labels: normalized or raw; raw]
Agents and Connectors
• piece of code to transport logs to a central location
• features
  - batch
  - compress
  - encrypt
  - sign
  - fail-over
• often additional features:
  - parse
  - normalize
  - aggregate
  - enrichment (context)
• special protocols:
  - OPSEC, SDEE
  - Windows
• file-based collection
• database collection
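Three of the agent features listed on this slide (batch, compress, sign) can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not how any particular product's agent works: `SECRET_KEY`, `make_batches`, `pack_batch`, and `unpack_batch` are hypothetical names, and HMAC-SHA256 stands in for whatever signing scheme a real agent uses. Encryption and fail-over are left out.

```python
import gzip
import hashlib
import hmac

# Hypothetical shared secret between agent and collector (assumption).
SECRET_KEY = b"example-key"


def make_batches(lines, batch_size):
    """Group log lines into lists of at most batch_size entries."""
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


def pack_batch(batch):
    """Compress one batch and sign the compressed bytes with HMAC-SHA256."""
    payload = gzip.compress("\n".join(batch).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload, signature


def unpack_batch(payload, signature):
    """Verify the signature, then decompress back into log lines."""
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return gzip.decompress(payload).decode("utf-8").split("\n")
```

The collector side rejects a batch whose signature does not match before it touches the contents, which is the point of signing at the agent.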
SIEM Architecture
[Diagram labels: normalized, raw, asset context, identity context, ..., RDBMS, context / tagging]
SIEM Architecture
• RDBMS schema
  - Fixed number and type of fields
  - New data sources with new fields?
    ‣ overloading
• RDBMS clusters are expensive and scale poorly
• Need a parser for every data source
• Slow historical data queries
• Hard to configure database efficiently
  - because of different use-cases
SIEM Architecture Benefits
• Parsed data enables
  - real-time correlation
  - real-time statistics
  - data augmentation (context) close to source
• Unified data access language
  - over a fixed set of fields
• Real-time dashboards
Search vs. SIEM
• Full-text indexing
• Parsing at search time
Example search: denied
• use index to find occurrences of ‘denied’
Example search: user=rmarty
• use index to find ALL occurrences of ‘rmarty’
• apply parser to results
• remove results where user is not rmarty
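The two example searches on this slide can be sketched as a small full-text index plus a parser that runs only on the hits. The log lines, the tokenizer, and the `key=value` parser below are assumptions made for illustration; real search products index and parse far more carefully.

```python
import re
from collections import defaultdict

# Hypothetical log lines for illustration (not from the talk).
LOGS = [
    "sshd: accepted password user=rmarty src=10.0.20.9",
    "sshd: failed password user=admin note=reported-by-rmarty",
    "firewall: denied src=10.0.20.109 dst=10.0.20.255",
]


def build_index(logs):
    """Full-text index: token -> set of line numbers containing it."""
    index = defaultdict(set)
    for lineno, line in enumerate(logs):
        for token in re.findall(r"[A-Za-z0-9.]+", line.lower()):
            index[token].add(lineno)
    return index


def parse_fields(line):
    """Search-time parser: extract key=value pairs from one line."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def search(logs, index, term, field=None):
    """Look up term in the index; if field is given, parse the hits
    and keep only the lines where that field equals the term."""
    hits = sorted(index.get(term.lower(), set()))
    if field is None:
        return [logs[i] for i in hits]
    return [logs[i] for i in hits if parse_fields(logs[i]).get(field) == term]
```

Searching for `rmarty` without a field returns both sshd lines, because the token appears in each. Adding `field="user"` parses those two hits and drops the one where the user is `admin`.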
New SIEM - Hybrid Models
• Use parsers for known data sources
• Collect everything else
• Index all data and use index for search
• Correlate parsed data
Categorization and Tagging
• How do you find all failed logins across any data source?
  security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...
• Does not scale
  - for new data sources
  - for new events of existing sources
• Define a ‘taxonomy’ for all events
• Map events into taxonomy
  id -> object, action, status
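The `id -> object, action, status` mapping can be sketched as a lookup table. The three failed-login ids come from the search on this slide; the `Category` type, the success entry, and the function names are hypothetical and only show the shape of the idea.

```python
from typing import NamedTuple, Optional


class Category(NamedTuple):
    object: str
    action: str
    status: str


TAXONOMY = {
    "security:538": Category("authentication", "login", "failure"),
    "sshd authentication failure": Category("authentication", "login", "failure"),
    "sshd failed password": Category("authentication", "login", "failure"),
    # Hypothetical extra entry, not on the slide.
    "sshd accepted password": Category("authentication", "login", "success"),
}


def categorize(event_id: str) -> Optional[Category]:
    """Return the taxonomy triple for an event id, or None if unmapped."""
    return TAXONOMY.get(event_id.lower())


def is_failed_login(event_id: str) -> bool:
    """One taxonomy test replaces the growing OR-list of source-specific ids."""
    return categorize(event_id) == Category("authentication", "login", "failure")
```

An event id that nobody has mapped yet comes back as `None`, so every new data source or event type needs its own entry before taxonomy-based content can see it.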
Content Creation
• Rules, dashboards, reports, searches can use taxonomy:
  object=authentication AND action=login AND status=success
• All failures related to files:
  object=file AND status=failure
• Mixing with other fields:
  action=login AND user=rmarty
• Approach scales well
• Huge effort to build and maintain mappings
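The three taxonomy queries on this slide can be evaluated with a few lines of code once events carry `object`, `action`, and `status` fields. The events and the tiny AND-only query parser below are assumptions for illustration, not the query language of any real product.

```python
# Hypothetical events, already tagged with taxonomy fields (assumption).
EVENTS = [
    {"object": "authentication", "action": "login", "status": "success", "user": "rmarty"},
    {"object": "authentication", "action": "login", "status": "failure", "user": "admin"},
    {"object": "file", "action": "read", "status": "failure", "user": "rmarty"},
]


def parse_query(query):
    """Turn 'a=b AND c=d' into {'a': 'b', 'c': 'd'}. Only AND is supported."""
    conditions = {}
    for term in query.split(" AND "):
        field, _, value = term.strip().partition("=")
        conditions[field] = value
    return conditions


def run_query(events, query):
    """Return the events whose fields match every condition in the query."""
    conditions = parse_query(query)
    return [e for e in events
            if all(e.get(f) == v for f, v in conditions.items())]
```

Because taxonomy fields and source fields such as `user` live in the same record, the mixed query `action=login AND user=rmarty` needs no special handling.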
Logging as a Service (LaaS)
• Economically advantageous - think about TCO
• Pay as you go
• Elastic infrastructure scales with your needs
• No installation needed
• No setup costs / time for logging solution
• Open platform with RESTful APIs
Loggly
[Architecture diagram labels: Data Sources, Consumers, API, Proxies, Distributed data store, Distributed indexing and processing, Data collection, Data access, mobile-166, My syslog, Loggly user interface, Indexers and Search Machines, Log Archive, UI extensions]
Tool Usage
                      | DIY | MR | Log Mgmt | SIEM | LaaS
data sources          | known, only a few | known, only a few | unknown, many | known, many | -
analysis use-cases    | known, one or a few | exploration, large-scale | unknown, many | unknown, many | extend platform
dynamic use-cases     | no | no | yes | yes | yes
real-time correlation | no | no | no | yes | extend platform
cost                  | engineer, hardware, maintenance | engineers, hardware, maintenance | license, (hardware), maintenance | license, hardware, maintenance | subscription
Should you rather do it yourself (DIY)?
What is Working and What is not?
What’s Working
• Log collection
• Log centralization
• Alerting on a priori known patterns
• Solving specific, known use-cases for sets of known data sources, e.g.,
  - monitoring privileged access to financial servers
  - generating compliance reports
  - security forensics
What’s Not Working
• Log formats are all over the place and not documented
  Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576
• No logging guidelines / developer education
• Parsing is broken
  - based on regexes
  - numerous mistakes
  - doesn’t scale
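One way regex-based parsing goes wrong can be shown with the kernel line from this slide. The pattern below is a deliberately naive assumption ("timestamp host message"), and the well-formed line is hypothetical; as transcribed here, the kernel line has no host field, so the parser quietly produces a wrong one.

```python
import re

# A naive pattern that assumes every line is "timestamp host message".
NAIVE_SYSLOG = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) (?P<msg>.*)$"
)


def parse(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = NAIVE_SYSLOG.match(line)
    return match.groupdict() if match else None


# Hypothetical well-formed line (assumption): the host is where we expect it.
ok = parse("Mar 16 08:09:58 frontend2 sshd[4242]: Failed password for rmarty")

# The line from the slide: the tag "kernel:" lands in the host field.
bad = parse("Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576")
```

The second call does not fail; it returns `kernel:` as the host. Mistakes of this kind are hard to spot because nothing raises an error, and a line with a different timestamp format would not match at all.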
What’s Not Working
• Normalization is broken:
  - IP to hostnames (when to do DNS lookup)
  - usernames (rmarty vs. ram vs. raffy)
• Categorization / Taxonomy
  - doesn’t scale
  - is buggy
  - is always out of date
  - expensive
• Prioritization has no working formula
• Anomaly detection is voodoo!
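The username case from the normalization bullet (rmarty vs. ram vs. raffy) comes down to an alias table. The sketch below is an assumption about how such a table might look; `ALIASES` and `normalize_user` are hypothetical names.

```python
# Hypothetical alias table (assumption): which names belong to one person
# has to be maintained by hand or pulled from an identity store.
ALIASES = {
    "rmarty": "rmarty",
    "ram": "rmarty",
    "raffy": "rmarty",
}


def normalize_user(name):
    """Map a username to its canonical form; unknown names pass through lowercased."""
    key = name.strip().lower()
    return ALIASES.get(key, key)
```

The code is trivial; the hard part is keeping the table complete and current, which is why normalization tends to break in practice.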
What Does It Mean?
• We don’t understand our data
• Security Operations Center (SOC) monitors all corporate data sources. Analysts
  - don’t know all the applications
  - don’t know all the setups
  - don’t know what log records are ‘normal’ behavior
--> Need tools to enable log owners to work with their data
Future Needs
We Need Better Tools
• We will have more and more data and need to deal with larger amounts of data
  - SIEM needs to support new distributed, scalable data management technologies
• More and more application layer data
  - How are we going to deal with all the parsing / entity extraction?
  - We need logging standards and guidelines
• How do we help analysts understand the data?
  - What is important and what is not?
  - Mapping problems to business process, business risk!
Data Visualization
Data/Log Visualization
• Exploration and Discovery
• Answer Questions
• Communicate Information
• Support Decisions
Security Visualization
• We are nowhere!
• Visualization is an afterthought
• Sec Viz dichotomy
• Tools are lacking fundamental capabilities
• Users don’t understand data, how can they understand visuals?
Visualization Concepts
The Analysis Approach
Overview first → Zoom → Details on demand
Principle by Ben Shneiderman
Simultaneous Views
Dynamic Coloring
Linked Views
Legible / Usable Graphs
Reducing non-data ink!
Choosing the Right Chart
Ode to the Pie
Careful With Interpretations
SecViz Examples
Situational Awareness
• Treemap
• Protovis.JS
• Size: Amount
• Brightness: Variance
• Color: Sensor
• Shows: Scans - bright spots
• Thanks to Chris Horsley
Firewall Treemap
Firewall Log
Port, Source IP, Destination IP
IDS Sig Tuning - Treemap
Hierarchy: Source, Destination, Signature, Number of Events
Color: Priority
Size: Number of alerts
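The data preparation behind a treemap like this one can be sketched as a nested count. The alerts below are made up, and the sketch assumes each signature carries a single priority; the actual tool and data used for the slide are not known from the transcript.

```python
from collections import defaultdict

# Hypothetical IDS alerts as (source, destination, signature, priority) tuples.
ALERTS = [
    ("10.0.20.9", "10.0.20.255", "portscan", 2),
    ("10.0.20.9", "10.0.20.255", "portscan", 2),
    ("10.0.20.9", "10.0.20.1", "ssh brute force", 1),
    ("10.0.20.109", "10.0.20.255", "portscan", 2),
]


def build_hierarchy(alerts):
    """Nest alerts as source -> destination -> signature -> leaf.

    Each leaf holds the alert count (treemap size) and priority (treemap color).
    """
    tree = defaultdict(lambda: defaultdict(dict))
    for src, dst, sig, priority in alerts:
        leaf = tree[src][dst].setdefault(sig, {"count": 0, "priority": priority})
        leaf["count"] += 1
    return tree
```

Large boxes with a low priority are then natural candidates for signature tuning, since they account for many alerts of little importance.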
Vulnerability Data by Host
Visualization Future
• A solution to entity extraction
• Dynamic and interactive displays
• Computer aided intelligence / visualization
  - Computer supported exploration
  - Highly interactive
• Expert system that captures domain knowledge
  - Collaborative
Share, discuss, challenge, and learn about security visualization.
http://secviz.org
• List: secviz.org/mailinglist
• Twitter: @secviz