Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
Copyright © 2014 Splunk Inc.
Splunk Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
Mark Groves
Sr. Director, Product Management
Splunk Developer Platform
The Accelerating Pace of Data
Volume | Velocity | Variety | Variability
GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops
Machine data is the fastest growing, most complex, most valuable area of big data
Platform for Machine Data
Any Machine Data
Online Services, Web Services, Servers, Security, GPS Location, Storage, Desktops, Networks, Packaged Applications, Custom Applications, Messaging, Telecoms, Online Shopping Cart, Web Clickstreams, Databases, Energy Meters, Call Detail Records, Smartphones and Devices, RFID
Datacenter | Private Cloud | Public Cloud
Enterprise Scalability
Search and Investigation
Proactive Monitoring
Operational Visibility
Real-time Business Insights
Operational Intelligence
What Does Machine Data Look Like?
Sources
Care IVR
Middleware Error
Order Processing
Fields highlighted across these sources: Customer ID, Order ID, Product ID, Twitter ID, Company’s Twitter ID, Customer’s Tweet, Time Waiting On Hold
How does this relate to MongoDB?
Hunk… enables you to combine time series event data with leading big data stores
What does this look like? Demo…
Cell Tower Monitoring App
Merging Machine Data with MongoDB
Concepts
• Splunk Index != Database Index
• Schema on Read
• Time is a first-class citizen in Splunk
Components of Hunk Server
64-bit Linux OS
REST API | Command Line | ODBC
Explore | Analyze | Visualize | Dashboards | Share
splunkd
Hadoop Interface
• Hadoop Client Libraries
• Java
Streaming Resource Libraries
• NoSQL & Other Stores
splunkweb
Web and Application Server
Virtual Indexes
Python, AJAX, CSS, XSLT, XML
Search Head
C++, Web Services
Powerful Platform for Enterprise Developers
REST API
Build Splunk Apps | Extend and Integrate Splunk
Web Framework: Simple XML, JavaScript, Django
SDKs: Java, JavaScript, Python, C#, Ruby, PHP
Virtual Indexes – Connector into MongoDB
• Enables seamless use of almost the entire Splunk stack on data
• Automatically handles query execution to Mongo, Hadoop, etc.
Examples of Virtual Indexes
Hunk Search Head
index = syslog (/home/syslog/…)
index = apache_logs
index = sensor_data
index = twitter
External System 1
External System 2
External System 3
Hunk Search Architecture
Diagram labels: Hunk Search Head; Search Processor; Query per Index/Virtual Index; Splunk Distributed Search; Hadoop External Results Provider; MongoDB Streaming Resource Library; MongoDBProvider; JSON Config; MongoDB (three instances); Results Reduction
Hunk applies schema for all fields – including transactions – at search time
Hunk Applies Schema on the Fly
• Structure applied at search time
• No brittle schema to work around
• Automatically find patterns and trends
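As a rough illustration of structure being applied at search time, the Python sketch below keeps raw events as plain strings and extracts key=value fields only when a query runs. The event text, field names, and helper names are made up for this example, and it is not meant to describe how Hunk itself performs extraction.

```python
import re

# Hypothetical raw events, stored exactly as they arrived (no schema up front).
RAW_EVENTS = [
    "2014-06-01T12:00:00 action=purchase customer_id=42 order_id=A17",
    "2014-06-01T12:00:05 action=hold customer_id=42 wait_secs=310",
    "2014-06-01T12:01:10 status=error component=middleware order_id=A17",
]

# One possible read-time rule: every key=value pair becomes a field.
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_event):
    """Apply the key=value rule to one raw event when it is read."""
    return dict(KEY_VALUE.findall(raw_event))


def search(raw_events, **criteria):
    """Return the extracted fields of events that match every criterion."""
    matches = []
    for raw in raw_events:
        fields = extract_fields(raw)
        if all(fields.get(name) == value for name, value in criteria.items()):
            matches.append(fields)
    return matches


if __name__ == "__main__":
    for hit in search(RAW_EVENTS, order_id="A17"):
        print(hit)
```

Because nothing is fixed at write time, a different extraction rule can be swapped in later without reloading the stored events.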
Integration
Install via GUI
Install via Command Line
Go to <apps.splunk.com URL>
Download MongoDBProvider.spl
Either:
– Copy MongoDBProvider.spl to $SPLUNK_HOME/etc/apps
– tar -zxvf MongoDBProvider.spl
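The unpack step can also be scripted. The sketch below assumes, as the tar -zxvf step suggests, that the .spl file is a gzipped tarball. The function name install_spl and the paths in the usage example are illustrative, not part of Splunk.

```python
import tarfile
from pathlib import Path


def install_spl(spl_path, splunk_home):
    """Unpack a .spl archive (assumed to be a gzipped tarball) into etc/apps."""
    apps_dir = Path(splunk_home) / "etc" / "apps"
    apps_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(spl_path, "r:gz") as archive:
        # Only unpack archives from a source you trust.
        archive.extractall(apps_dir)
    return apps_dir


if __name__ == "__main__":
    # Hypothetical locations; adjust to your own download path and SPLUNK_HOME.
    print(install_spl("MongoDBProvider.spl", "/opt/hunk"))
```

A restart of Hunk is typically needed before a newly unpacked app is picked up.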
Configure Indexes.conf - Overview
Indexes.conf defines indexes, both physical and virtual
You need two configuration items: a provider and a virtual index
– A provider should map 1:1 to your MongoDB server
– There can be multiple virtual indexes per provider
Indexes.conf can live in any Splunk app; it is probably easiest to put it in the MongoDBProvider folder
Configure Indexes.conf
Provider
[provider:local-mongodb]
vix.family = mongodb_erp_family
vix.splunk.search.debug = 0
vix.mongodb.host = localhost:27017
• [provider:local-mongodb]: Provider Name (referenced in Virtual Indexes)
• vix.family: Family
• vix.splunk.search.debug: Disable Debugging
• vix.mongodb.host: Hostname:Port
Virtual Index 1
[mongodb_vix]
vix.provider = local-mongodb
vix.mongodb.db = hunk
vix.mongodb.collection = test
vix.mongodb.field.time = _id
vix.mongodb.field.time.format = ObjectId
• [mongodb_vix]: Name of the Virtual Index (used by users)
• vix.provider: Provider Name (matches earlier stanza)
• vix.mongodb.db: MongoDB DB Name
• vix.mongodb.collection: MongoDB Collection Name
• vix.mongodb.field.time: Field to extract time from
• vix.mongodb.field.time.format: Format of the field to extract time from (valid options are ObjectID, Date, or Epoch)
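Using _id with the ObjectId format works as a time field because a MongoDB ObjectId begins with a 4-byte big-endian timestamp in seconds since the Unix epoch. The sketch below decodes that timestamp from the 24-character hex form. The function names are illustrative, and this shows the encoding only, not the provider's own code.

```python
from datetime import datetime, timezone


def objectid_to_epoch(object_id_hex):
    """Return the creation time, in epoch seconds, embedded in an ObjectId."""
    if len(object_id_hex) != 24:
        raise ValueError("an ObjectId is 24 hexadecimal characters")
    # The first 4 bytes (8 hex characters) hold the timestamp.
    return int(object_id_hex[:8], 16)


def objectid_to_datetime(object_id_hex):
    """Return the embedded creation time as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(objectid_to_epoch(object_id_hex), tz=timezone.utc)


if __name__ == "__main__":
    print(objectid_to_datetime("507f1f77bcf86cd799439011").isoformat())
```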
Configure Indexes.conf
Virtual Index 2
[wocorders]
vix.provider = local-mongodb
vix.mongodb.db = demo
vix.mongodb.collection = wocorders
vix.mongodb.field.time = timestamp
vix.mongodb.field.time.format = date
• [wocorders]: Name of the Virtual Index (used by users)
• vix.provider: Provider Name (matches earlier stanza)
• vix.mongodb.db: MongoDB DB Name
• vix.mongodb.collection: MongoDB Collection Name
• vix.mongodb.field.time: Field to extract time from
• vix.mongodb.field.time.format: Format of the field to extract time from (valid options are ObjectID, Date, or Epoch)
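Since each virtual index must name a provider defined elsewhere in the file, a quick consistency check can catch typos before restarting. The sketch below reads the stanzas from the slides with Python's configparser and groups virtual indexes by provider. It assumes the file is simple enough to parse as INI, which is true of this fragment but is not a general claim about Splunk .conf files. The function name is illustrative.

```python
import configparser

# The provider and virtual index stanzas shown on the slides.
INDEXES_CONF = """\
[provider:local-mongodb]
vix.family = mongodb_erp_family
vix.splunk.search.debug = 0
vix.mongodb.host = localhost:27017

[mongodb_vix]
vix.provider = local-mongodb
vix.mongodb.db = hunk
vix.mongodb.collection = test
vix.mongodb.field.time = _id
vix.mongodb.field.time.format = ObjectId

[wocorders]
vix.provider = local-mongodb
vix.mongodb.db = demo
vix.mongodb.collection = wocorders
vix.mongodb.field.time = timestamp
vix.mongodb.field.time.format = date
"""


def virtual_indexes_by_provider(conf_text):
    """Map each provider name to the virtual indexes that reference it."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names exactly as written
    parser.read_string(conf_text)

    providers = {
        name.split(":", 1)[1]: []
        for name in parser.sections()
        if name.startswith("provider:")
    }
    for name in parser.sections():
        if name.startswith("provider:"):
            continue
        provider = parser[name].get("vix.provider")
        if provider is None:
            continue  # not a virtual index stanza
        if provider not in providers:
            raise ValueError(f"[{name}] references unknown provider {provider!r}")
        providers[provider].append(name)
    return providers


if __name__ == "__main__":
    print(virtual_indexes_by_provider(INDEXES_CONF))
```

With the two stanzas above, one provider ends up with two virtual indexes, matching the "multiple virtual indexes per provider" rule.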
How to query Mongo
index=mongodb (foo=xyz OR other=val) | fields foo, bar, baz
• Query your MongoDB Virtual Index
• Match any fields by specifying the field name and matching parameters
• Minimize results returned by projecting down only the fields you want returned
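The three parts of that search (index, field matches, projected fields) can be assembled programmatically. The helper below is a hypothetical convenience function, not part of any Splunk SDK, and it does no quoting or escaping, so it only suits simple values like those in the example.

```python
def build_search(index, any_of=None, fields=None):
    """Compose a simple search string from an index, OR-ed matches, and fields."""
    parts = [f"index={index}"]
    if any_of:
        # Each (name, value) pair becomes name=value; pairs are OR-ed together.
        terms = " OR ".join(f"{name}={value}" for name, value in any_of)
        parts.append(f"({terms})")
    query = " ".join(parts)
    if fields:
        # Project down to only the named fields.
        query += " | fields " + ", ".join(fields)
    return query


if __name__ == "__main__":
    print(build_search(
        "mongodb",
        any_of=[("foo", "xyz"), ("other", "val")],
        fields=["foo", "bar", "baz"],
    ))
```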
Mongo Specific Integration Highlights
index=mongodb foo=xyz | timechart avg(bar) by baz
Predicate Pushdown
Filtering terms are processed on the MongoDB side, so only results where the field foo matches xyz are returned
Projections
Only the fields mentioned in the particular search are returned, in this case _time, bar and baz
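To make the two ideas concrete, the sketch below builds a filter and a projection in the dictionary shapes that MongoDB queries use (for example, the two arguments to pymongo's Collection.find), then applies them to an in-memory list so the effect is visible without a database. How the provider actually translates a search is not shown in the slides. The function names, the sample documents, and the assumption that _time maps to a field called timestamp are illustrative.

```python
def pushdown_plan(filters, wanted_fields, time_field="timestamp"):
    """Build a MongoDB-style filter and projection for a search.

    filters: equality terms from the search, e.g. {"foo": "xyz"}
    wanted_fields: fields the rest of the search uses, e.g. ["bar", "baz"]
    time_field: document field assumed to back _time
    """
    mongo_filter = dict(filters)
    projection = {time_field: 1}
    for name in wanted_fields:
        projection[name] = 1
    return mongo_filter, projection


def apply_plan(documents, mongo_filter, projection):
    """Emulate the filter and projection on a list of dicts."""
    results = []
    for doc in documents:
        if all(doc.get(key) == value for key, value in mongo_filter.items()):
            results.append({key: doc[key] for key in projection if key in doc})
    return results


if __name__ == "__main__":
    docs = [
        {"timestamp": 1, "foo": "xyz", "bar": 10, "baz": "a", "extra": "unused"},
        {"timestamp": 2, "foo": "abc", "bar": 20, "baz": "b"},
    ]
    mongo_filter, projection = pushdown_plan({"foo": "xyz"}, ["bar", "baz"])
    print(apply_plan(docs, mongo_filter, projection))
```

Doing both steps on the database side means non-matching documents and unused fields never cross the network.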
Roadmap for the Future
• Full text search engine
• BSON support
Get The Bits!
Hunk
– http://splunk.com/download
MongoDB App
– http://apps.splunk.com/app/1810/
– Or search for “MongoDB” on apps.splunk.com
Where to go for More Info
Contact Me: [email protected] - @markgrovs
SplunkDev - http://dev.splunk.com/
Splunk Apps - https://apps.splunk.com
GitHub - https://github.com/splunk/
Twitter - https://twitter.com/splunkdev
Blogs - http://blogs.splunk.com/dev/