mining your logs - gaining insight through visualization

Post on 18-Oct-2014


DESCRIPTION

In this two-part presentation we will explore log analysis and log visualization. We will look at the history of log analysis: where it stands today, what tools are available to process logs, what is working, and, more importantly, what is not. What will the future bring? Do our current approaches hold up under future requirements? We will discuss a number of issues and try to figure out how to address them. By looking at various log analysis challenges, we will explore how visualization can help address a number of them, keeping in mind that log visualization is not just a science but also an art. We will apply a security lens to look at a number of use-cases in the area of security visualization. From there we will discuss what else is needed in the area of visualization, where the challenges lie, and where we should continue putting our research and development efforts.

TRANSCRIPT

Mining Your Logs: Gaining Insight Through Visualization

Google TechTalk March 2011

Raffael Marty - @zrlram

© by Raffael Marty | Logging as a Service

Raffael Marty

• Founder @
• Chief Security Strategist and Product Manager @ Splunk
• Manager Solutions @ ArcSight
• Intrusion Detection Research @ IBM Research
• IT Security Consultant @ PriceWaterhouse Coopers

Applied Security Visualization
Publisher: Addison Wesley (August, 2008)

ISBN: 0321510100

Agenda

• Log Analysis
  - History
  - Log Architectures
  - What’s Working and What’s Not?
  - Future Needs
• Data Visualization
  - Visualization Concepts
  - Security Visualization Use-Cases

Log Analysis


10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/ HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" TYhzVH8AAAEAAGOkBOQAAADA 655268

2010-12-28T18:12:10.031+00:00 frontend2-raffy syslog-ng[19600]: syslog-ng starting up; version='3.1.1'

2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160
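To make these formats concrete, here is a minimal Python sketch that pulls fields out of two of the example lines above. The regular expressions and the function names `parse_access` and `parse_firewall` are assumptions made for illustration; they are not taken from the talk. The sketch only covers the leading fields of the access log line and the KEY=VALUE part of the firewall line.

```python
import re

# Pattern for the leading fields of an Apache "combined"-style access log line.
# Trailing, site-specific fields (such as the two extra tokens in the slide's
# example) are deliberately left unparsed.
ACCESS_RE = re.compile(
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Upper-case KEY=VALUE tokens as seen in the firewall line (values may be empty).
KV_RE = re.compile(r'\b([A-Z]+)=(\S*)')


def parse_access(line):
    """Return the leading access-log fields as a dict, or None if no match."""
    match = ACCESS_RE.match(line)
    return match.groupdict() if match else None


def parse_firewall(line):
    """Collect KEY=VALUE pairs; if a key repeats (LEN), the first value wins."""
    fields = {}
    for key, value in KV_RE.findall(line):
        fields.setdefault(key, value)
    return fields


access = parse_access(
    '10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] '
    '"GET /admin/customer/customer/612/ HTTP/1.1" 200 2261'
)
firewall = parse_firewall(
    'blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 '
    'SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 '
    'PROTO=UDP SPT=17500 DPT=17500 LEN=160'
)
```

Note that even this small example has to make a choice the log format does not document: the firewall line carries `LEN` twice, and the sketch simply keeps the first value.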


History
• 1980 Eric Allman develops syslogd(8)
• 1996 Intellitactics
• 1997 Tivoli Risk Manager developed by IBM Research in Zurich (later Zurich Correlation Engine, ZCE)
• 1999 - 2010 A number of log management / SIEM players enter the market (software, appliances)
• 2000 ArcSight - 2010 sold for $1.65bn to HP
• 2009 Loggly (logging as a service)

History - The Other View
• Network management (SNMP)
• IDS false positive reduction
• Security monitoring (multiple data sources)
• Unification of NOC and SOC (failed?)
• Application monitoring (moving up the stack)
  - original tools failed due to architectural constraints
  - new approaches have been presented

Log Management Today

Where are you?


Log Management Today

DIY
• grep
• Perl
• SQL

Log Management
• Open source
• Commercial

CEP and SIEM
• Open source
• Commercial

MapReduce
• Open source

Advanced Analytics
• Not log specific!

less tools

Open Source Tools
• graylog2
• logstash
• swatch
• tenshi
• logwatch
• OSSEC
• snare
• lasso
• lire
• LogSurfer
• SEC
• LogHound
• slct
• log2timeline
• logzilla
• OSSIM
• MS Logparser
• Sguil
• Octopussy
• Sagan

this list is likely incomplete!

© PixlCloud LLC 2011 | pixlcloud | Visualization in the Cloud

Commercial Tools


this list is likely incomplete!

Log Architectures


Log Mgmt Architecture

Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database

Storage:
- on board
- external storage array
- clusters

Processing:
- indexing
- context storage
- clustering

Data Access:
- free-text search
- field-based search
- tagging schemas

[Diagram labels: normalized or raw; raw]

Agents and Connectors
• piece of code to transport logs to a central location
• features
  - batch
  - compress
  - encrypt
  - sign
  - fail-over
• often additional features:
  - parse
  - normalize
  - aggregate
  - enrichment (context)
• special protocols:
  - OPSEC, SDEE
  - Windows
• file-based collection
• database collection
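Three of the agent features listed above (batch, compress, sign) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular agent works: `SHARED_KEY`, `batches`, `pack`, and `unpack` are hypothetical names, an HMAC over the payload stands in for signing, and encryption and fail-over are left out entirely.

```python
import gzip
import hashlib
import hmac
import json

SHARED_KEY = b"example-key"  # hypothetical; a real agent would load a managed secret


def batches(lines, size):
    """Yield consecutive groups of at most `size` log lines."""
    for start in range(0, len(lines), size):
        yield lines[start:start + size]


def pack(batch, key=SHARED_KEY):
    """Compress one batch and authenticate the compressed bytes with HMAC-SHA256."""
    payload = gzip.compress(json.dumps(batch).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, signature


def unpack(payload, signature, key=SHARED_KEY):
    """Verify the signature, then decompress; raise ValueError on a mismatch."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return json.loads(gzip.decompress(payload).decode("utf-8"))


# Simulate an agent shipping five lines in batches of two to a central receiver.
lines = [f"event {i}" for i in range(5)]
received = []
for batch in batches(lines, size=2):
    payload, signature = pack(batch)
    received.extend(unpack(payload, signature))
```

The receiver rejects any batch whose payload was altered in transit, which is the point of signing logs before they leave the source.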

SIEM Architecture

[Diagram: SIEM architecture. Labels: raw, normalized, context / tagging (asset context, identity context, ...), RDBMS]

SIEM Architecture
• RDBMS schema
  - Fixed number and type of fields
  - New data sources with new fields?
    ‣ overloading
• RDBMS clusters are expensive and scale poorly
• Need a parser for every data source
• Slow historical data queries
• Hard to configure database efficiently
  - because of different use-cases

SIEM Architecture Benefits
• Parsed data enables
  - real-time correlation
  - real-time statistics
  - data augmentation (context) close to source
• Unified data access language
  - over a fixed set of fields
• Real-time dashboards

Search vs. SIEM
• Full-text indexing
• Parsing at search time

Example search: denied
• use index to find occurrences of ‘denied’

Example search: user=rmarty
• use index to find ALL occurrences of ‘rmarty’
• apply parser to results
• remove results where user is not rmarty
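The two example searches can be sketched with a toy inverted index in Python. Everything here is an assumption made for illustration: the log lines are made up, and `build_index`, `free_text_search`, and `field_search` are hypothetical names rather than the API of any search product.

```python
import re
from collections import defaultdict

LOGS = [
    "sshd: accepted password for user=rmarty from 10.0.0.5",
    "firewall: denied tcp connection, user=admin",
    "backup: copying /home/rmarty/notes.txt as user=root",
    "sshd: denied login, user=rmarty",
]  # made-up log lines for illustration

TOKEN_RE = re.compile(r"[a-z0-9]+")


def build_index(lines):
    """Map every lowercase token to the set of line numbers containing it."""
    index = defaultdict(set)
    for number, line in enumerate(lines):
        for token in TOKEN_RE.findall(line.lower()):
            index[token].add(number)
    return index


def free_text_search(index, lines, term):
    """Full-text search: the index alone answers the query."""
    return [lines[n] for n in sorted(index.get(term.lower(), ()))]


def field_search(index, lines, field, value):
    """Field search: index lookup first, then parse and filter at search time."""
    field_re = re.compile(rf"\b{re.escape(field)}=(\S+)")
    hits = []
    for line in free_text_search(index, lines, value):
        match = field_re.search(line)
        if match and match.group(1) == value:
            hits.append(line)
    return hits


INDEX = build_index(LOGS)
denied = free_text_search(INDEX, LOGS, "denied")
candidates = free_text_search(INDEX, LOGS, "rmarty")
by_user = field_search(INDEX, LOGS, "user", "rmarty")
```

In this data the index returns three lines containing "rmarty", but only two of them have `user=rmarty`; the backup line mentions the name in a file path and is dropped by the search-time parser.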

New SIEM - Hybrid Models
• Use parsers for known data sources
• Collect everything else
• Index all data and use index for search
• Correlate parsed data

Categorization and Tagging
• How do you find all failed logins across any data source?

security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...

• Does not scale
  - for new data sources
  - for new events of existing sources
• Define a ‘taxonomy’ for all events
• Map events into taxonomy

id -> object, action, status
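The mapping from an event to an (object, action, status) triple might look like the following Python sketch. The rule table, the `Tag` type, and the `categorize` function are hypothetical and only cover a handful of made-up messages; building and maintaining a real table is exactly the effort the slide refers to.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    """One taxonomy entry: what was acted on, what was done, how it ended."""
    object: str
    action: str
    status: str


# Hypothetical rules: a substring that identifies an event, and the tag it maps to.
RULES = [
    ("sshd authentication failure", Tag("authentication", "login", "failure")),
    ("sshd failed password", Tag("authentication", "login", "failure")),
    ("sshd accepted password", Tag("authentication", "login", "success")),
    ("permission denied opening", Tag("file", "open", "failure")),
]


def categorize(message: str) -> Optional[Tag]:
    """Return the taxonomy tag of the first matching rule, or None if unmapped."""
    lowered = message.lower()
    for needle, tag in RULES:
        if needle in lowered:
            return tag
    return None


events = [
    "sshd failed password for rmarty from 10.0.20.9",
    "sshd authentication failure; user=raffy",
    "sshd accepted password for rmarty",
    "cron job started",
]
failed_logins = [
    e for e in events
    if categorize(e) == Tag("authentication", "login", "failure")
]
unmapped = [e for e in events if categorize(e) is None]
```

Once events carry tags, "all failed logins" becomes a single condition instead of a growing OR list. Any event without a rule, like the cron line here, stays invisible to taxonomy-based content until someone adds a mapping.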

Content Creation
• Rules, dashboards, reports, searches can use taxonomy:

object=authentication AND action=login AND status=success

• All failures related to files:

object=file AND status=failure

• Mixing with other fields:

action=login AND user=rmarty

• Approach scales well
• Huge effort to build and maintain mappings
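As a rough illustration of how such queries evaluate against tagged events, the Python sketch below supports only `field=value` clauses joined by `AND`. The events and the helper names `parse_query` and `search` are made up for this example and do not reflect any product's query language.

```python
# Made-up events that already carry taxonomy tags plus ordinary parsed fields.
EVENTS = [
    {"object": "authentication", "action": "login", "status": "success", "user": "rmarty"},
    {"object": "authentication", "action": "login", "status": "failure", "user": "raffy"},
    {"object": "file", "action": "open", "status": "failure", "user": "rmarty"},
    {"object": "file", "action": "delete", "status": "success", "user": "root"},
]


def parse_query(query):
    """Turn 'a=b AND c=d' into {'a': 'b', 'c': 'd'} (AND is the only operator)."""
    conditions = {}
    for clause in query.split(" AND "):
        field, _, value = clause.strip().partition("=")
        conditions[field] = value
    return conditions


def search(events, query):
    """Return the events whose fields satisfy every condition in the query."""
    conditions = parse_query(query)
    return [
        event for event in events
        if all(event.get(field) == value for field, value in conditions.items())
    ]


successful_logins = search(
    EVENTS, "object=authentication AND action=login AND status=success"
)
file_failures = search(EVENTS, "object=file AND status=failure")
rmarty_logins = search(EVENTS, "action=login AND user=rmarty")
```

The queries never mention a data source, which is why content written against the taxonomy keeps working when a new source is mapped into it.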

Logging as a Service (LaaS)
• Economically advantageous - think about TCO
• Pay as you go
• Elastic infrastructure scales with your needs
• No installation needed
• No setup costs / time for logging solution
• Open platform with RESTful APIs

Loggly

[Diagram: Loggly architecture. Labels: Data Sources, Consumers, API, Proxies, Distributed data store, Distributed indexing and processing, Data collection, Data access, mobile-166, My syslog, Loggly user interface, Indexers and Search Machines, Log Archive, UI extensions]

Tool Usage

data sources
- DIY: known, only a few
- MR: known, only a few
- Log Mgmt: unknown, many
- SIEM: known, many
- LaaS: -

analysis use-cases
- DIY: known, one or a few
- MR: exploration, large-scale
- Log Mgmt: unknown, many
- SIEM: unknown, many
- LaaS: extend platform

dynamic use-cases
- DIY: no
- MR: no
- Log Mgmt: yes
- SIEM: yes
- LaaS: yes

real-time correlation
- DIY: no
- MR: no
- Log Mgmt: no
- SIEM: yes
- LaaS: extend platform

cost
- DIY: engineer, hardware, maintenance
- MR: engineers, hardware, maintenance
- Log Mgmt: license, (hardware), maintenance
- SIEM: license, hardware, maintenance
- LaaS: subscription

Should you rather do it yourself (DIY)?

What is Working and What is not?


What’s Working
• Log collection
• Log centralization
• Alerting on a priori known patterns
• Solving specific, known use-cases for sets of known data sources, e.g.,
  - monitoring privileged access to financial servers
  - generating compliance reports
  - security forensics

What’s Not Working
• Log formats are all over and not documented
• No logging guidelines / developer education
• Parsing is broken
  - based on regexes
  - numerous mistakes
  - doesn’t scale

Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576
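A short Python sketch shows how easily regex-based parsing breaks on a line like this one. The pattern below is a plausible hand-written one for classic syslog lines, written for this illustration only; it is not taken from any specific tool. It assumes every line has a hostname after the timestamp, which the example line does not.

```python
import re

# A typical hand-written pattern for classic BSD-style syslog lines:
# "<month> <day> <time> <host> <program>[pid]: <message>"
SYSLOG_RE = re.compile(
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)"
)


def parse(line):
    """Return the parsed fields as a dict, or None when the pattern fails."""
    match = SYSLOG_RE.match(line)
    return match.groupdict() if match else None


# A made-up line in the expected layout parses as intended.
good = parse("Mar 16 08:09:58 frontend2 sshd[4242]: session opened for user rmarty")

# The slide's example has no hostname before "kernel:", so the same pattern
# finds no match and the event is silently dropped (parse returns None).
bad = parse("Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576")
```

Fixing this one case means adding another special-case pattern, and every new source or format variation needs the same treatment, which is why the approach does not scale.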


What’s Not Working
• Normalization is broken:
  - IP to hostnames (when to do DNS lookup)
  - usernames (rmarty vs. ram vs. raffy)
• Categorization / Taxonomy
  - doesn’t scale
  - is buggy
  - is always out of date
  - expensive
• Prioritization has no working formula
• Anomaly detection is voodoo!
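The username problem can be made concrete with a small Python sketch. The alias table and the `normalize_user` helper are hypothetical; the three account names come from the slide, and the assumption that they all belong to one person is made purely for illustration.

```python
from collections import Counter

# Hypothetical alias table; in practice this mapping has to be built and
# maintained by someone who knows the accounts.
ALIASES = {"ram": "rmarty", "raffy": "rmarty"}


def normalize_user(name):
    """Map a raw account name to its canonical identity (unknown names pass through)."""
    cleaned = name.strip().lower()
    return ALIASES.get(cleaned, cleaned)


# The same person shows up under different names, casings, and spacing.
raw_users = ["rmarty", "ram", "Raffy", "root", "raffy "]
raw_counts = Counter(raw_users)
normalized_counts = Counter(normalize_user(u) for u in raw_users)
```

Without the alias table, per-user statistics split one person's activity across several names. With it, the counts are only as good as the table, which goes stale whenever accounts change.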


What Does It Mean?
• We don’t understand our data
• Security Operations Center (SOC) monitors all corporate data sources. Analysts
  - don’t know all the applications
  - don’t know all the setups
  - don’t know what log records are ‘normal’ behavior

--> Need tools to enable log owners to work with their data

Future Needs


We Need Better Tools
• We will have more and more data and need to deal with larger amounts of data
  - SIEM needs to support new distributed, scalable data management technologies
• More and more application layer data
  - How are we going to deal with all the parsing / entity extraction?
  - We need logging standards and guidelines
• How do we help analysts understand the data?
  - What is important and what is not?
  - Mapping problems to business process, business risk!

Data Visualization


Data/Log Visualization
• Exploration and Discovery
• Answer Questions
• Communicate Information
• Support Decisions

Security Visualization
• We are nowhere!
• Visualization is an afterthought
• Sec Viz dichotomy
• Tools are lacking fundamental capabilities
• Users don’t understand data, how can they understand visuals?

Visualization Concepts


The Analysis Approach

Overview first, zoom, details on demand

Principle by Ben Shneiderman

Simultaneous Views

Dynamic Coloring

Linked Views

Legible / Usable Graphs

Reducing non-data ink!

Choosing the Right Chart

Ode to the Pie

Careful With Interpretations

SecViz Examples


Situational Awareness
• Treemap
• Protovis.JS
• Size: Amount
• Brightness: Variance
• Color: Sensor
• Shows: Scans - bright spots
• Thanks to Chris Horsley

Firewall Treemap


Firewall Log

[Figure labels: Port, Source IP, Destination IP]

IDS Sig Tuning - Treemap

Hierarchy: Source, Destination, Signature, Number of Events

Color: Priority
Size: Number of alerts
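Before a treemap like this can be drawn, the alerts have to be aggregated into the hierarchy. The Python sketch below shows one way to do that; it stops short of the layout and drawing step. The alerts are made up, `build_hierarchy` and `subtree_size` are hypothetical names, and it assumes that a lower priority number means a more severe alert.

```python
from collections import defaultdict

# Made-up IDS alerts: (source, destination, signature, priority).
ALERTS = [
    ("10.0.20.9", "10.0.20.1", "ICMP PING", 3),
    ("10.0.20.9", "10.0.20.1", "ICMP PING", 3),
    ("10.0.20.9", "10.0.20.7", "SSH brute force", 1),
    ("10.0.20.33", "10.0.20.1", "ICMP PING", 3),
]


def build_hierarchy(alerts):
    """Nest alerts as source -> destination -> signature -> leaf statistics."""
    tree = defaultdict(lambda: defaultdict(dict))
    for source, destination, signature, priority in alerts:
        leaf = tree[source][destination].setdefault(
            signature, {"size": 0, "priority": priority}
        )
        leaf["size"] += 1  # box size: number of alerts
        # Box color: keep the most severe priority (assumed: lower is more severe).
        leaf["priority"] = min(leaf["priority"], priority)
    return tree


def subtree_size(node):
    """Total number of alerts under a node (the area of its treemap box)."""
    if "size" in node:
        return node["size"]
    return sum(subtree_size(child) for child in node.values())


TREE = build_hierarchy(ALERTS)
noisy_source_total = subtree_size(TREE["10.0.20.9"])
```

For signature tuning, the interesting boxes are the large, low-severity ones: they show which source, destination, and signature combinations generate most of the noise.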

Vulnerability Data by Host

Visualization Future
• A solution to entity extraction
• Dynamic and interactive displays
• Computer aided intelligence / visualization
  - Computer supported exploration
  - Highly interactive
• Expert system that captures domain knowledge
  - Collaborative

Share, discuss, challenge, and learn about security visualization.

http://secviz.org

• List: secviz.org/mailinglist

• Twitter: @secviz


about.me/raffy
