Data Mining with Splunk

Data Mining and Exploration David Carasso, Office of CTO, Chief Mind


Post on 29-Nov-2014


TRANSCRIPT

Page 1: Data Mining with Splunk

Data Mining and Exploration

David Carasso, Office of CTO, Chief Mind

Page 2: Data Mining with Splunk

AGENDA

What is data mining?

What’s the plan of attack?

What type of events do I have?

How do I mine fields?

How do I detect anomalous events?

Why do I need to visualize my data?

Page 3: Data Mining with Splunk

What is Data Mining?


Page 4: Data Mining with Splunk

Is this data mining?


This is an orange

Page 5: Data Mining with Splunk

What is Data Mining?

Extracting implicit, previously unknown, and potentially useful information from data.


Page 6: Data Mining with Splunk

Better


Page 7: Data Mining with Splunk

Data Preparation

Data Exploration

Data Mining

Understanding

Page 8: Data Mining with Splunk

What’s the plan of attack?


Page 9: Data Mining with Splunk

Preparing the data

You've been thrown data you aren't familiar with…

Mar  7 12:40:01 willLaptop crond(pam_unix)[10696]: session opened for user root by (uid=0)
Mar  7 12:40:01 willLaptop crond(pam_unix)[10695]: session closed for user root
Mar  7 12:40:02 willLaptop crond(pam_unix)[10696]: session closed for user root
Mar  7 12:44:47 willLaptop gconfd (root-10750): starting (version 2.10.0), pid 10750 user 'root'
Mar  7 12:44:47 willLaptop gconfd (root-10750): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only config...
Mar  7 12:44:47 willLaptop gconfd (root-10750): Resolved address "xml:readwrite:/root/.gconf"…
Mar  7 12:45:01 willLaptop crond(pam_unix)[10754]: session opened for user root by (uid=0)
Mar  7 12:45:02 willLaptop crond(pam_unix)[10754]: session closed for user root
....

Anomalies (unexpected address)

Transactions (open-close)

Fields (pid)

Eventtypes (closed sessions)

Page 10: Data Mining with Splunk

Is Understanding Linear?

Diagram labels: Event Groups, Events, Fields, Anomalies, Reports

No.

Page 11: Data Mining with Splunk

What type of events do I have?


Page 12: Data Mining with Splunk

Given Some Unknown Data

Mar  7 12:40:01 willLaptop crond(pam_unix)[10696]: session opened for user root by (uid=0)
Mar  7 12:40:01 willLaptop crond(pam_unix)[10695]: session closed for user root
Mar  7 12:40:02 willLaptop crond(pam_unix)[10696]: session closed for user root
Mar  7 12:44:47 willLaptop gconfd (root-10750): starting (version 2.10.0), pid 10750 user 'root'
Mar  7 12:44:47 willLaptop gconfd (root-10750): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only config...
Mar  7 12:44:47 willLaptop gconfd (root-10750): Resolved address "xml:readwrite:/root/.gconf"…
Mar  7 12:44:47 willLaptop gconfd (root-10750): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration ...
Mar  7 12:45:01 willLaptop crond(pam_unix)[10754]: session opened for user root by (uid=0)
Mar  7 12:45:02 willLaptop crond(pam_unix)[10754]: session closed for user root
....

Page 13: Data Mining with Splunk

Find Broad Categories of Events

Group Events by Content, Format, and Time


Page 14: Data Mining with Splunk

Group Events by Content

Cluster events with similar values.

Show 3 examples from each cluster, from the most common cluster to the least:

… | cluster labelonly=t showcount=t | dedup 3 cluster_label sortby -cluster_count, cluster_label, _time

Page 15: Data Mining with Splunk

Events By Content

count  label  _raw
-----  -----  ------------------------------------------------------------
1339   3      Mar 7 11:05:01 willLaptop crond(pam_unix)[6785]: session opened for user root by…
1339   3      Mar 7 11:10:01 willLaptop crond(pam_unix)[1769]: session opened for user root by …
1339   3      Mar 7 11:10:01 willLaptop crond(pam_unix)[1766]: session opened for user root by …
1324   2      Mar 7 11:05:02 willLaptop crond(pam_unix)[6785]: session closed for user root
1324   2      Mar 7 11:10:01 willLaptop crond(pam_unix)[1766]: session closed for user root
1324   2      Mar 7 11:10:02 willLaptop crond(pam_unix)[1769]: session closed for user root
136    13     Mar 7 20:05:08 willLaptop kernel: SELinux: initialized (dev selinuxfs, type selinuxfs)…
136    13     Mar 7 20:05:09 willLaptop kernel: SELinux: initialized (dev usbfs, type usbfs), uses …
136    13     Mar 7 20:05:09 willLaptop kernel: SELinux: initialized (dev sysfs, type sysfs), uses …

Page 16: Data Mining with Splunk

Group by $%#! Format

Cluster events by first 7 punctuation chars:

… | rex field=punct "(?<smallpunct>.{7})" | eventstats count by smallpunct | sort -count, smallpunct | dedup 3 smallpunct

Page 17: Data Mining with Splunk

Events by Format

count  smallpunct  raw
-----  ----------  ------------------------------------------------------------
637    __::__(     Mar 10 16:50:02 willLaptop crond(pam_unix)[9639]: session closed for user root
637    __::__(     Mar 10 16:50:01 willLaptop crond(pam_unix)[9638]: session closed for user root
637    __::__(     Mar 10 16:50:01 willLaptop crond(pam_unix)[9639]: session opened for user root by …
367    __::__:     Mar 10 15:30:25 willLaptop dhclient: bound to 10.1.1.194 -- renewal in 5788 seconds.
367    __::__:     Mar 10 15:30:25 willLaptop dhclient: DHCPACK from 10.1.1.50
367    __::__:     Mar 10 15:30:25 willLaptop dhclient: DHCPREQUEST on eth0 to 10.1.1.50 port 67
57     __::__[     Mar 10 16:46:32 willLaptop ntpd[2544]: synchronized to 138.23.180.126, stratum 2
57     __::__[     Mar 10 16:46:27 willLaptop ntpd[2544]: synchronized to LOCAL(0), stratum 10
57     __::__[     Mar 10 16:42:09 willLaptop ntpd[2544]: time reset -0.236567 s
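For readers who want to see the mechanics outside Splunk, here is a rough Python sketch of grouping by format. It assumes the punct field can be approximated by dropping letters and digits and turning each whitespace character into an underscore; Splunk's real punct extraction may differ in details. The function names are made up for illustration.

```python
import re
from collections import Counter


def punct_signature(event, width=7):
    """Approximate a Splunk-style punct value, truncated to `width` chars.

    Assumption: punct is roughly the event with letters and digits removed
    and each whitespace character replaced by an underscore.
    """
    without_alnum = re.sub(r"[A-Za-z0-9]", "", event)
    return re.sub(r"\s", "_", without_alnum)[:width]


def count_by_signature(events, width=7):
    """Count events per truncated signature, most common first."""
    return Counter(punct_signature(e, width) for e in events).most_common()


# Sample lines copied from the slide above.
SAMPLE_EVENTS = [
    "Mar 10 16:50:02 willLaptop crond(pam_unix)[9639]: session closed for user root",
    "Mar 10 16:50:01 willLaptop crond(pam_unix)[9638]: session closed for user root",
    "Mar 10 15:30:25 willLaptop dhclient: DHCPACK from 10.1.1.50",
    "Mar 10 16:46:32 willLaptop ntpd[2544]: synchronized to 138.23.180.126, stratum 2",
]

if __name__ == "__main__":
    for signature, count in count_by_signature(SAMPLE_EVENTS):
        print(count, signature)
```

Seven characters is just enough here to get past the timestamp and host and reach the first character that distinguishes crond, dhclient, and ntpd lines.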


Page 18: Data Mining with Splunk

Group by Time

Look for bursts of events

• Turn on computer
• Load a web page
• Detects speeding car
• Print document
• Scan security badge

Page 19: Data Mining with Splunk

Group by Time Bursts

… | transaction maxpause=2s | search eventcount>1

Mar 10 16:50:01 willLaptop crond(pam_unix)[9638]: session opened for user root by (uid=0)
Mar 10 16:50:01 willLaptop crond(pam_unix)[9639]: session opened for user root by (uid=0)
Mar 10 16:50:01 willLaptop crond(pam_unix)[9638]: session closed for user root
Mar 10 16:50:02 willLaptop crond(pam_unix)[9639]: session closed for user root

Mar 10 15:30:25 willLaptop dhclient: DHCPREQUEST on eth0 to 10.1.1.50 port 67
Mar 10 15:30:25 willLaptop dhclient: DHCPACK from 10.1.1.50
Mar 10 15:30:25 willLaptop dhclient: bound to 10.1.1.194 -- renewal in 5788 seconds.

Mar 10 16:45:01 willLaptop crond(pam_unix)[9553]: session opened for user root by (uid=0)
Mar 10 16:45:02 willLaptop crond(pam_unix)[9553]: session closed for user root
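The burst idea can be sketched in a few lines of Python. This is only an illustration of the grouping rule, not how Splunk implements transaction: it assumes a new burst starts whenever the gap to the previous event is longer than the pause, and the helper names are invented. The sample events are abbreviated from lines shown in this deck, all from Mar 10.

```python
from datetime import datetime, timedelta


def parse_time(hms):
    """Parse HH:MM:SS; the date is ignored because every sample is from Mar 10."""
    return datetime.strptime(hms, "%H:%M:%S")


def group_bursts(events, max_pause=timedelta(seconds=2)):
    """Split (timestamp, text) pairs into bursts.

    Events are sorted by time; a new burst starts whenever the gap to the
    previous event exceeds max_pause (an assumed reading of maxpause=2s).
    """
    bursts = []
    for event in sorted(events, key=lambda e: e[0]):
        if bursts and event[0] - bursts[-1][-1][0] <= max_pause:
            bursts[-1].append(event)
        else:
            bursts.append([event])
    return bursts


SAMPLE = [(parse_time(t), text) for t, text in [
    ("16:50:01", "crond[9638]: session opened for user root"),
    ("16:50:01", "crond[9639]: session opened for user root"),
    ("16:50:01", "crond[9638]: session closed for user root"),
    ("16:50:02", "crond[9639]: session closed for user root"),
    ("15:30:25", "dhclient: DHCPREQUEST on eth0"),
    ("15:30:25", "dhclient: DHCPACK"),
    ("15:30:25", "dhclient: bound"),
    ("16:45:01", "crond[9553]: session opened for user root"),
    ("16:45:02", "crond[9553]: session closed for user root"),
    ("16:46:32", "ntpd[2544]: synchronized"),
    ("16:46:27", "ntpd[2544]: synchronized to LOCAL(0)"),
    ("16:42:09", "ntpd[2544]: time reset"),
]]

if __name__ == "__main__":
    bursts = group_bursts(SAMPLE)
    multi = [b for b in bursts if len(b) > 1]     # like eventcount>1
    singles = [b for b in bursts if len(b) == 1]  # like eventcount=1
    print(len(multi), "bursts with several events;", len(singles), "bursts of one")
```

Filtering on burst size gives both views used in this talk: larger bursts suggest transactions, and bursts of one are candidates for anomalies.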


Page 20: Data Mining with Splunk

Multiple Sources


(not really correct)

Page 21: Data Mining with Splunk

Now what?

1. ✓ group your data
2. tell Splunk!

Page 22: Data Mining with Splunk

Telling Splunk (about your groups of events)

Add eventtypes and tags

Huh?

Page 23: Data Mining with Splunk

SURPRISE TANGENT!

What is an eventtype?


Page 24: Data Mining with Splunk

Eventtype

A dynamic “tag” added to events, if they would match the search that defines the eventtype.


Page 25: Data Mining with Splunk

Eventtype:
  Name: closed_root
  Definition: "session closed" root

Event: … session closed for user root …

=> eventtype=closed_root
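To make the "dynamic tag" idea concrete, here is a small Python sketch. It assumes an eventtype can be modeled as a name plus a list of terms that must all appear in the event, using case-insensitive substring matching; real Splunk search matching is richer than this. Only closed_root comes from the slide; opened_root and the function names are hypothetical.

```python
EVENTTYPES = {
    # From the slide: Name closed_root, Definition "session closed" root
    "closed_root": ["session closed", "root"],
    # Hypothetical second eventtype, added only for illustration
    "opened_root": ["session opened", "root"],
}


def eventtypes_for(event, eventtypes=EVENTTYPES):
    """Return the names of all eventtypes whose terms all occur in the event."""
    text = event.lower()
    return [name for name, terms in eventtypes.items()
            if all(term.lower() in text for term in terms)]


def unclassified(events, eventtypes=EVENTTYPES):
    """Return events that match no eventtype at all."""
    return [e for e in events if not eventtypes_for(e, eventtypes)]


SAMPLE_EVENTS = [
    "Mar  7 12:40:01 willLaptop crond(pam_unix)[10695]: session closed for user root",
    "Mar  7 12:45:01 willLaptop crond(pam_unix)[10754]: session opened for user root by (uid=0)",
    "Mar 10 16:42:09 willLaptop ntpd[2544]: time reset -0.236567 s",
]

if __name__ == "__main__":
    for event in SAMPLE_EVENTS:
        print(eventtypes_for(event), event)
```

Nothing is stored on the event itself: the tag is recomputed from the definition each time, which is why changing a definition immediately changes which events carry the eventtype.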


Page 26: Data Mining with Splunk


Create an Eventtype

Page 27: Data Mining with Splunk

Independent searches will return events already tagged with the eventtypes you defined earlier, which helps classify those events.

Page 28: Data Mining with Splunk


Create reports on the classifications you’ve made

Ok, it wasn’t a tangent.

Page 29: Data Mining with Splunk

How do I mine fields?


Page 30: Data Mining with Splunk

Fields Correlation

Discover correlations to remove uninteresting fields and narrow in on promising reports.


haiku

Page 31: Data Mining with Splunk

Fields Correlation Haiku

Discover patterns in fields with a correlation: co-occurring fields.


indulgence

Page 32: Data Mining with Splunk

Splunkd.log Sample File

09-05-2012 15:34:11.886 -0700 INFO ExecProcessor - Ran script: python /opt/splunk/etc/apps/...
09-05-2012 15:34:02.467 -0700 ERROR TcpOutputProc - Can't find or illegal IP address or ...
09-05-2012 15:32:03.397 -0700 INFO ProcessTracker - Process ran long; type=SplunkOptimize ...
09-05-2012 15:30:20.016 -0700 WARN DispatchCommand - The system is approaching the maximum ...

fascinating

Page 33: Data Mining with Splunk

Field Correlation

… | correlate

RowField   C     CN    Component  Context  L     ...
---------  ----  ----  ---------  -------  ----
C          1.00  1.00  0.00       0.00     1.00
CN         1.00  1.00  0.00       0.00     1.00
Component  0.00  0.00  1.00       0.06     0.00
Context    0.00  0.00  0.06       1.00     0.00
L          1.00  1.00  0.00       0.00     1.00
Log_Level  0.00  0.00  1.00       0.06     0.00
…
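A co-occurrence matrix like this one can be approximated in Python. The formula here is an assumption, not Splunk's documented computation: for each pair of fields it takes the number of events containing both, divided by the number containing either. That choice is symmetric and gives 1.00 on the diagonal, which is consistent with the table, but the real command may compute something different. The sample events and function names are hypothetical.

```python
def cooccurrence(events, field_a, field_b):
    """Share of events with either field that have both (assumed formula)."""
    both = sum(1 for e in events if field_a in e and field_b in e)
    either = sum(1 for e in events if field_a in e or field_b in e)
    return both / either if either else 0.0


def cooccurrence_matrix(events):
    """Build a field-by-field matrix of co-occurrence scores."""
    fields = sorted({name for e in events for name in e})
    return {a: {b: cooccurrence(events, a, b) for b in fields} for a in fields}


# Hypothetical events, each a dict of extracted fields.
SAMPLE_EVENTS = [
    {"Component": "loader", "Log_Level": "INFO"},
    {"Component": "HotDBManager", "Log_Level": "INFO", "Context": "source"},
    {"C": "US", "CN": "host01", "L": "SF"},
]

if __name__ == "__main__":
    matrix = cooccurrence_matrix(SAMPLE_EVENTS)
    for row, scores in matrix.items():
        print(row, " ".join(f"{scores[col]:.2f}" for col in matrix))
```

Pairs that score 1.00 always travel together, so one of them is usually redundant for reporting; pairs that score 0.00 never meet in the same event and probably belong to different kinds of events.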


Page 34: Data Mining with Splunk

Field Associations

Automatically deduce correlations and implications of field values:

… | associate Log_Level Component

Page 35: Data Mining with Splunk

Field Association Summary

                                                          Uncond   Cond     Entropy
Ref_Key    Ref_Value                 Target_Key  Support  Entropy  Entropy  Increase  Top_Conditional_Value
---------  ------------------------  ----------  -------  -------  -------  --------  ------------------------
Component  DatabaseDirectoryManager  Log_Level   34.67%   1.182    0.000    1.182201  WARN (62.25% -> 100.00%)
Component  HotDBManager              Log_Level   38.25%   1.182    0.000    1.182201  INFO (33.15% -> 100.00%)
Component  SavedSplunker             Log_Level   394.31%  1.182    0.000    1.182201  WARN (62.25% -> 100.00%)
Component  databasePartitionPolicy   Log_Level   95.50%   1.182    0.417    0.765017  INFO (33.15% -> 91.57%)
Component  loader                    Log_Level   79.17%   1.182    0.050    1.131883  INFO (33.15% -> 99.44%)
Component  timeinvertedIndex         Log_Level   44.28%   1.182    0.000    1.182201  INFO (33.15% -> 100.00%)

Page 36: Data Mining with Splunk

Top Fields by Fields

Most common Log_Level by Component:

... | top Log_Level by Component

Component                 Log_Level  count  percent
------------------------  ---------  -----  ----------
AdminManager              WARN       1      100.000000
DatabaseDirectoryManager  WARN       153    100.000000
DateParserVerbose         WARN       262    100.000000
DedupProcessor            ERROR      1      100.000000
DeploymentClient          DEBUG      60     85.714286
DeploymentClient          WARN       5      7.142857
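The same kind of breakdown can be sketched in Python, assuming each event is a dict of extracted fields. The percent column is taken to be the count divided by all events in that group, which matches the DeploymentClient rows above (60 and 5 out of 70). The function name and the sample data are invented for illustration.

```python
from collections import Counter, defaultdict


def top_by(events, field, by):
    """Return (group, value, count, percent) rows, most common value first."""
    groups = defaultdict(Counter)
    for event in events:
        if field in event and by in event:
            groups[event[by]][event[field]] += 1
    rows = []
    for group in sorted(groups):
        total = sum(groups[group].values())
        for value, count in groups[group].most_common():
            rows.append((group, value, count, 100.0 * count / total))
    return rows


# Hypothetical events; the real table comes from splunkd.log.
SAMPLE_EVENTS = (
    [{"Component": "A", "Log_Level": "INFO"}] * 3
    + [{"Component": "A", "Log_Level": "WARN"}]
    + [{"Component": "B", "Log_Level": "ERROR"}]
)

if __name__ == "__main__":
    for component, level, count, percent in top_by(SAMPLE_EVENTS, "Log_Level", "Component"):
        print(f"{component:<10} {level:<6} {count:>4} {percent:10.6f}")
```

Groups where a single value sits at 100 percent are the ones where the second field adds no information beyond the first.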


Page 37: Data Mining with Splunk

How do I detect anomalous events?

Page 38: Data Mining with Splunk

Types of Anomalies

Anomalies you know about

Anomalies you don’t know about


Page 39: Data Mining with Splunk

Handling Known Anomalies

Easy. Define a search for the anomalous condition and make an alert to detect it.

ip=10.* NOT domain=mycompany.com … | stats perc99(spent) 500ms.

Alert on “spent>500”

Page 40: Data Mining with Splunk

Finding Unknown Anomalies

Look for Abnormal
• Single-Field Values
• Multi-Field Values
• Contexts
• Visual Inspections…

Page 41: Data Mining with Splunk

Anomalies by Single Field Values

Identify anomalous values in a given field either by frequency of occurrence or number of standard deviations from the mean.

… | anomalousvalue action=summary pthresh=0.02 | search isNum=YES
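The two criteria named on this slide, rarity and distance from the mean, can be illustrated with a short Python sketch. It is not a reimplementation of anomalousvalue, whose scoring is not described here: the pthresh name is borrowed from the command above, while max_z and the function names are assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def rare_values(values, pthresh=0.02):
    """Values whose relative frequency falls below pthresh."""
    counts = Counter(values)
    total = len(values)
    return sorted(v for v, c in counts.items() if c / total < pthresh)


def zscore_outliers(numbers, max_z=3.0):
    """Numbers lying more than max_z standard deviations from the mean."""
    mu = mean(numbers)
    sigma = pstdev(numbers)
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [x for x in numbers if abs(x - mu) / sigma > max_z]


if __name__ == "__main__":
    # Hypothetical field values for illustration.
    levels = ["INFO"] * 99 + ["FATAL"]
    durations = [10] * 20 + [100]
    print(rare_values(levels))
    print(zscore_outliers(durations))
```

The frequency test works for any field, while the standard-deviation test only makes sense for numeric fields, which is presumably why the search above filters on isNum=YES.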


Page 42: Data Mining with Splunk

Anomalies by Single Field Values


Page 43: Data Mining with Splunk

Anomalous by Many Values

Look for small clusters – by content, format, and time – to find anomalies. For example…

…| cluster …| sort cluster_count


Page 44: Data Mining with Splunk

Smallest Clusters by Content

count  label  uri
-----  -----  ------------------------------------------------------------
1      7      /img/skins/default/bolt.png
1      37     /en-US/search/inspector?sid=1345075042.125&namespace=search
1      45     /services/admin/summarization?count=10
1      53     /services/pdfgen/is_available?viewId=index_status_health&...
1      57     /static/splunkrc_cmds.xml

Page 45: Data Mining with Splunk

Small Clusters: Bursts of One

Find bursts of just a single event, separated from its neighbors by a pause of more than 2 seconds.

… | transaction maxpause=2s | search eventcount=1

Mar 10 16:46:32 willLaptop ntpd[2544]: synchronized to 138.23.180.126…
Mar 10 16:46:27 willLaptop ntpd[2544]: synchronized to LOCAL(0), stratum…
Mar 10 16:42:09 willLaptop ntpd[2544]: time reset -0.236567…

Page 46: Data Mining with Splunk

Burst of One

Same idea, different data source: Splunk

[11:58:08] "POST /services/search/jobs/export HTTP/1.1" 200 201630 …

[11:12:51] "POST /services/search/jobs/export HTTP/1.1" 200 459441 …

[10:00:58] "GET /servicesNS/nobody/SplunkDeploymentMonitor/backfill/…


Page 47: Data Mining with Splunk

Anomalous by Context

Identify values not expected by the context of other events.

… | anomalies field=file labelonly=true maxvalues=10

Page 48: Data Mining with Splunk

Anomalous by Context

Unexpectedness  file
--------------  ----------------
0.00            shelper
0.16            shelper
0.00            1345502591.356
0.00            1345502591.356
0.00            1345074401.191
0.00            1345074031.153
0.03            1345074328.186
0.00            1345502591.356
0.35            conf-dm_backfill
0.00            1345074309.185
0.00            1345502591.356

time

Page 49: Data Mining with Splunk

Surprise Eventtype: Part Deux!

Classified the major categories of your data with eventtypes? Then just search for events that don’t match any of those eventtypes.

Page 50: Data Mining with Splunk


Page 51: Data Mining with Splunk

Once you can describe anomalous behavior as a search…


Page 52: Data Mining with Splunk


Page 53: Data Mining with Splunk

Other mining commands
• kmeans: Performs k-means clustering on selected fields.
• outlier: Removes outlying numerical values.
• af (analyze fields): Analyzes numerical fields for their ability to predict another discrete field.
• fieldsummary: Generates summary information for fields.
• shape: Produces a symbolic 'shape' attribute describing the shape of a numeric multivalued field.

Page 54: Data Mining with Splunk

Why do I need to visualize my data?


Page 55: Data Mining with Splunk

Data Mining by Visualization

Visualization can capture nuances in the data that numerical or linguistic summaries cannot easily capture.

Page 56: Data Mining with Splunk


These data points are radically different.

*Source: Anscombe’s Quartet (Anscombe 1973)

Page 57: Data Mining with Splunk

Why visualize?

Because they all have the exact same
• average (7.50)
• standard deviation (2.03)
• least-squares fit (3 + 0.5x).

Do not just rely on numerical summarization.
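These figures can be checked directly. The sketch below uses the published values of Anscombe's quartet and computes, for each of the four data sets, the mean and sample standard deviation of y and an ordinary least-squares line. The helper names are made up; the statistics agree with the slide once rounded to two decimals, not to every digit.

```python
from statistics import mean, stdev

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def least_squares(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def summarize(xs, ys):
    """Mean of y, sample standard deviation of y, and the fitted line."""
    intercept, slope = least_squares(xs, ys)
    return mean(ys), stdev(ys), intercept, slope


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        avg, sd, intercept, slope = summarize(xs, ys)
        print(f"{name:>3}: mean={avg:.2f} sd={sd:.2f} fit={intercept:.2f} + {slope:.2f}x")
```

Plotting the four sets side by side is what reveals the differences: a noisy line, a curve, a line with one outlier, and a vertical stack with one far-off point.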

Page 58: Data Mining with Splunk

But I already have charts!

You don’t graph enough.

Data Exploration

Don’t decide ahead of time what graphs you want

Regularly do out-of-the-box scenarios with graphs

Page 59: Data Mining with Splunk

Variations:
• Subsets of Events (paying customers vs lookers)
• Fields by Fields (including eventtypes and tags)
• Ignored fields
• Min/max/avg/count
• Compare to other time windows
• Transactions

Data Exploration

Page 60: Data Mining with Splunk

Visual Arrangement

Sorting data, changing scales (linear/log), and adjusting min/max can make a huge difference when looking at the same data.

Page 61: Data Mining with Splunk

Visual Considerations


Pick representations that make obvious the distinctions you need to care about.

Page 62: Data Mining with Splunk

Summary


Page 63: Data Mining with Splunk

Summary
• Discovery is an iterative process.
• Group events by content, format, and time, and define classifications with eventtypes and tags.
• Focus on promising fields with correlations.
• Discover unknown anomalies with small clusters.
• Visualize your data, from a dozen angles.

Page 64: Data Mining with Splunk

But wait!


Page 65: Data Mining with Splunk

More to come: Predictive Analytics


… | forecast foo

Page 66: Data Mining with Splunk

The End


.,`...,`...,`...,`...,`...,`...,`...,`...,`...,`...,`...,`...
.,`......_.,`...,`...,`...,`...,`...,`...,`...,`...,`....._..
...___..|.|...__._..._.__.,`..._.__.,`..___...__.,`...__.|.|.
../.__|.|.|../._`.|.|.'_.\....|.'_.\.../._.\..\.\./\././.|.|.
.|.(__..|.|.|.(_|.|.|.|_).|...|.|.|.|.|.(_).|..\.V..V./..|_|.
..\___|.|_|..\__,_|.|..__/....|_|.|_|..\___/....\_/\_/...(_).
.,`...,`...,`...,`..|_|.,`...,`...,`...,`...,`...,`...,`.....
.,`...,`...,`...,`...,`...,`...,`...,`...,`...,`...,`...,`...

Golf clapping at #datamining

Mine the Gap.