Upload: duongmien

Post on 12-Feb-2018


Page 1: Visualizing the Elasticsearch Graph with KeyLines · PDF file2 What is Elasticsearch? Elasticsearch is a fast and scalable open source search engine. Its power and out-of-the-box simplicity


Visualizing Graphs with Elasticsearch and KeyLines

 

What is Elasticsearch?

Elastic Graph: The Elasticsearch graph engine

Kibana: The Elasticsearch visualization tool

Logstash – a data management tool

Why visualize Elasticsearch with KeyLines?

A KeyLines / Elasticsearch Architecture

Getting started with KeyLines and Elasticsearch

Step 1: Download your files

Step 2: Set up your file structure

Step 3: Load data into Elasticsearch

Step 4: Embed KeyLines in your webpage

Step 5: Fetch some data from Elastic Graph API

Step 6: Parse our result in the KeyLines format

Step 7: Visualize the data in KeyLines

Step 8: Performing more sophisticated searches

Next steps: Extending the UI

Try it yourself!

Who should read this white paper?

This white paper is aimed at:

• Project managers and non-technical staff looking for a detailed introduction to visualizing data from Elasticsearch with KeyLines.

• Developers and technical staff seeking a non-technical introduction to visualizing data from Elasticsearch with KeyLines.

If you require more information we recommend contacting us to discuss your project.


What is Elasticsearch?

Elasticsearch is a fast and scalable open source search engine. Its power and out-of-the-box simplicity have made it a popular option for organizations needing a way to search very large volumes of data. It can support near-real-time searching of data on a petabyte scale, using a system of sharding and routing to scale outwards from the beginning.

The Elasticsearch engine itself is built on the Apache Lucene software library. Lucene is a high-performance technology for searching and indexing data, but it is also very complex. Elasticsearch makes the power of Lucene more readily usable by pre-selecting some sensible defaults and providing a more intuitive REST API.

Elasticsearch powers the search functionality of some very data-rich organizations, including Facebook, Wikimedia and Stack Exchange. It is also increasingly popular with KeyLines developers looking for a powerful and scalable back-end technology for their graph applications.

In this Getting Started guide we are going to explain how you can use the KeyLines toolkit to build a UI for your Elasticsearch server.

Throughout the document, we will refer to a number of different technologies in the Elastic Stack:

• Elasticsearch – the core search technology

• Graph – a new API for performing searches across connected data

• Kibana – an open source visualization web application

• Logstash – a tool for streaming, munging and loading data into Elasticsearch

Elastic Graph: The Elasticsearch graph engine

Released in Elasticsearch v2.3, Elastic Graph provides a way to discover and understand connections in your Elasticsearch index. It infers two kinds of element from your data: vertices (nodes) and connections (links). The Elastic Graph API then allows you to query and explore these vertices and connections as a graph.

Out of the box, Elastic Graph uses relevance scoring to help identify the most meaningful connections. This simple analysis can be enhanced with KeyLines’ visual graph analysis functionality, making it easier for users to understand complex network trends and uncover outliers.

Kibana: The Elasticsearch visualization tool

Kibana is Elasticsearch’s open source data visualization platform. It provides a dashboard of charts and maps to help users interpret their data and search results:


A screengrab of a Kibana dashboard, via http://elastic.co

Kibana includes a Graph plugin, allowing users to visually explore data connections:

A screengrab of the Kibana graph plugin, via http://elastic.co

As both Kibana and KeyLines are web technologies, they complement each other perfectly.


Logstash – a data management tool

One of the easiest ways to load data into Elasticsearch is to use Logstash, a command line tool. With this approach you can supply your data as a CSV file and leave Logstash to parse the dataset and load it into your Elasticsearch instance.

Why visualize Elasticsearch with KeyLines?

Graph visualization is a great way to make large and complex connected data easy to understand.

A well-designed visualization means users can:

• Find and interpret patterns and outliers

• Explore connections in an intuitive way

• Answer questions more quickly using data insight.

Extending Kibana’s graph visualization functionality with KeyLines provides access to an extensive library of functionality for deeper graph insight, including:

• Social network analysis

• Automated graph layouts

• The KeyLines Time Bar and dynamic network visualization

• KeyLines Geospatial to view network data on geographic maps

• WebGL for faster and more powerful visualization

In this Getting Started guide we are going to follow the steps required to build a simple KeyLines component to visualize and explore your Elasticsearch graph data.

Let’s get started…


A KeyLines / Elasticsearch Architecture

Elasticsearch provides a REST API and exchanges data as JSON, so the KeyLines integration architecture is very simple.

In this scenario, users interact with KeyLines, which runs in the web browser, and raise events (for example click, hover or right-click). These interactions with the graph interface trigger requests to the Elasticsearch REST API. Elasticsearch returns the data as a JSON object, which is then styled and presented in KeyLines.
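The request and response loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the KeyLines or Elasticsearch client APIs: `buildRequest` and `postToElasticsearch` are hypothetical helper names, the base URL and index are placeholders supplied by the caller, and the sketch assumes a browser (or other runtime) that provides the standard `fetch` function.

```javascript
// Hypothetical helper: describe a JSON POST to an Elasticsearch REST endpoint.
// baseUrl, index and action are placeholders supplied by the caller.
function buildRequest(baseUrl, index, action, body) {
  return {
    // Strip trailing slashes so we never produce "//" in the path
    url: baseUrl.replace(/\/+$/, "") + "/" + index + "/" + action,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body)
    }
  };
}

// Hypothetical helper: send the request and resolve with the parsed JSON.
// Assumes the standard fetch() function is available.
function postToElasticsearch(baseUrl, index, action, body) {
  var request = buildRequest(baseUrl, index, action, body);
  return fetch(request.url, request.options).then(function (response) {
    if (!response.ok) {
      throw new Error("Elasticsearch returned HTTP " + response.status);
    }
    return response.json();
  });
}
```

A chart event handler would call `postToElasticsearch(...)` and pass the resolved object on to the parsing code. If the page is served from a different origin than Elasticsearch, the server has to allow cross-origin requests for this to work from the browser.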


Getting started with KeyLines and Elasticsearch

In this tutorial, we will create a simple KeyLines application to perform a search of our Elasticsearch data. This is just the starting point. Once you have a functioning integration, you can incorporate additional KeyLines visualization and analysis functionality.

If you have any problems following these instructions, get in touch.

Step 1: Download your files

To build a KeyLines-Elasticsearch integration you will need the following files:

• keylines.js – request trial account

• Elasticsearch – we used v2.3.3 – installation guide

• Elastic Graph API plugin – installation guide

• Logstash – installation guide

Step 2: Set up your file structure

For our KeyLines/Elasticsearch JavaScript app, we will use the following structure:

• app.js contains the main functions to initialize KeyLines and the controllers for Elasticsearch and app-graph-search.js.

• elasticsearch.js will contain the functions required to send queries to, and generally interact with, the server.

• app-graph-search.js will contain the controller for our search function with the Graph API.

• index.htm will contain the KeyLines chart and some customization code to describe the general UI.

Step 3: Load data into Elasticsearch

This step can be omitted if your instance is pre-populated.

We used a random data generator to produce a fake dataset of users. Then we imported the generated users into Elasticsearch with Logstash, as documents of type “user” inside a “users” index.

Users have the following structure:

    user: {
      id: "number",
      firstname: "string",
      lastname: "string",
      gender: "string",
      company: "string",
      eyes_color: "string"
    }

Step 4: Embed KeyLines in your webpage

We won’t go into detail on this, but you can find sample applications on the KeyLines SDK website, or create your own using the Getting Started guide in the SDK documentation.

To give you some idea of how this works, here is some of the HTML we would need on our page to load the KeyLines component.

We have to include the KeyLines stylesheet and our own styles (the keylines.js file from Step 1 must also be loaded on the page):

    <link rel="stylesheet" type="text/css" href="css/keylines.css"/>
    <link rel="stylesheet" type="text/css" href="css/style.css"/>

We also need a container element in which KeyLines will start:

    <!-- This is the HTML element that will be used to render the KeyLines component -->
    <div id="kl"></div>

After that, the rest will be UI to interact with Elasticsearch.

Our KeyLines chart with some UI

Step 5: Fetch some data from Elastic Graph API

The Graph API is a REST service, so we request data by calling the “_graph/explore” action on our index. Our endpoint is therefore http://localhost:9200/users/_graph/explore


Because we imported the data with Logstash, each user has an extra field: message. It holds the raw line used for the import, and looks like this:

    100|Noelle|Frye|Sodales Purus In Company|gray

We will use this field to search with the Graph API.

For a graph search for the term ‘brown’, our data query would look like this:

    {
      "query": {
        "query_string": {
          "default_field": "_all",
          "query": "brown"
        }
      },
      "controls": {
        "use_significance": true,
        "sample_size": 2000,
        "timeout": 5000
      },
      "connections": {
        "vertices": [
          {
            "field": "message",
            "size": 20,
            "min_doc_count": 3
          }
        ]
      },
      "vertices": [
        {
          "field": "message",
          "size": 20,
          "min_doc_count": 3
        }
      ]
    }

In response to this we would receive a JSON object, which we can parse into KeyLines’ own JSON format.
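To avoid hand-writing this request body for every search, it can be generated from the search term. The sketch below is illustrative only: `buildGraphQuery` and its `options` argument are hypothetical names, and the default values are simply copied from the example query above.

```javascript
// Hypothetical helper: build a Graph API request body for a search term.
// The defaults mirror the example query in this guide; field, size and
// minDocCount are illustrative option names, not part of any API.
function buildGraphQuery(term, options) {
  var opts = options || {};
  var vertex = {
    field: opts.field || "message",
    size: opts.size || 20,
    min_doc_count: opts.minDocCount || 3
  };

  return {
    query: {
      query_string: { default_field: "_all", query: term }
    },
    controls: { use_significance: true, sample_size: 2000, timeout: 5000 },
    // Separate copies, so later changes to one clause do not leak into the other
    connections: { vertices: [Object.assign({}, vertex)] },
    vertices: [Object.assign({}, vertex)]
  };
}
```

Calling `buildGraphQuery("brown")` produces an object with the same content as the query shown above, ready to be serialized with `JSON.stringify` and posted to the endpoint.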


Step 6: Parse our result in the KeyLines format

The Elasticsearch response contains all the information we need to create a KeyLines input, so parsing the JSON is a relatively simple process.

The search results are received in the following structure:

    {
      connections: [],
      failures: [],
      timed_out: false,
      took: 0,
      vertices: []
    }

Inside the connections attribute, we will find the links, for example:

    { doc_count: 14, source: 10, target: 2, weight: 0.005304290380952548 }

The source and target attributes are indices into the vertices array.

Inside the vertices attribute, we will find the objects themselves, for example:

    { depth: 0, field: "message", term: "blue", weight: 0.8421388547845717 }

More details are in the documentation.

To convert these, we just use the makeNode() and makeLink() functions to get our KeyLines input, e.g.:

    // Convert one Elastic Graph vertex into a KeyLines node.
    // getNodeWidth() is a helper from our app (not shown here).
    var makeNode = function (index, item) {
      var e = getNodeWidth(item);

      return {
        id: item.term,               // the term doubles as the node id
        type: "node",
        t: item.term,                // label
        e: e,                        // enlargement (node size)
        c: "green",                  // fill color
        d: Object.assign({}, item)   // keep a copy of the original vertex
      };
    };

    // Convert one Elastic Graph connection into a KeyLines link.
    // nodes is the vertices array from the response; getLinkWidth() is a
    // helper from our app (not shown here).
    var makeLink = function (index, item, nodes) {
      var w = getLinkWidth(item);
      var node1 = nodes[item.source];   // source and target are array indices
      var node2 = nodes[item.target];

      return {
        type: "link",
        id: "link_" + node1.term + "_" + node2.term,
        id1: node1.term,
        id2: node2.term,
        w: w,                        // link width
        d: Object.assign({}, item)   // keep a copy of the original connection
      };
    };
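Putting the two conversions together, a whole Graph API response can be turned into a single array of KeyLines items. The sketch below is self-contained for illustration: `toKeyLinesItems`, `nodeSize` and `linkWidth` are hypothetical names, and the two scaling helpers are simple stand-ins, since this guide does not show how getNodeWidth() and getLinkWidth() are implemented.

```javascript
// Stand-in scaling helpers (assumptions, not the guide's own functions).
function nodeSize(vertex) {
  return 1 + vertex.weight;                      // used for the "e" property
}
function linkWidth(connection) {
  return 1 + Math.log(1 + connection.doc_count); // used for the "w" property
}

// Convert a Graph API response into an array of KeyLines nodes and links.
function toKeyLinesItems(response) {
  var vertices = response.vertices || [];
  var connections = response.connections || [];

  var nodes = vertices.map(function (vertex) {
    return {
      id: vertex.term,
      type: "node",
      t: vertex.term,
      e: nodeSize(vertex),
      c: "green",
      d: Object.assign({}, vertex)
    };
  });

  var links = connections
    // Skip connections whose indices do not point at a known vertex
    .filter(function (c) {
      return vertices[c.source] && vertices[c.target];
    })
    .map(function (c) {
      var from = vertices[c.source];
      var to = vertices[c.target];
      return {
        id: "link_" + from.term + "_" + to.term,
        type: "link",
        id1: from.term,
        id2: to.term,
        w: linkWidth(c),
        d: Object.assign({}, c)
      };
    });

  return nodes.concat(links);
}
```

As in the functions above, ids are built from the term alone. That is sufficient while every vertex comes from the same field; if several fields are queried, combining field and term in the id would avoid collisions.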

Step 7: Visualize the data in KeyLines

Now that we have our JSON object, we can load it into KeyLines and run a layout from the load callback, like this:

    function loadChart(items) {
      chart.load({
        type: 'LinkChart',
        items: items
      }, function () {
        chart.layout("standard");
      });
    }

Success!


In this example, we have added another request to count users returned in our search result. This allows us to scale nodes and weight the links.
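The guide does not show this extra request. One possible approach, offered purely as an assumption, is to ask Elasticsearch's `_count` endpoint how many documents match each vertex term and then derive a node size from the result. In the sketch below, `buildCountRequest` and `sizeFromCount` are hypothetical helpers, the `match` query on the `message` field is our choice, and the square-root scaling is arbitrary.

```javascript
// Hypothetical helper: describe a _count request for documents whose
// "message" field matches a vertex term.
function buildCountRequest(baseUrl, index, term) {
  return {
    url: baseUrl + "/" + index + "/_count",
    body: { query: { match: { message: term } } }
  };
}

// Hypothetical helper: map a document count to a node enlargement factor
// between 1 (no matches) and 3 (the largest count seen).
function sizeFromCount(count, maxCount) {
  if (!maxCount || count <= 0) {
    return 1;
  }
  // Square root keeps very frequent terms from dwarfing the rest
  return 1 + 2 * Math.sqrt(count / maxCount);
}
```

The `_count` response carries the number of matching documents in its `count` property, which can be passed to `sizeFromCount` together with the largest count seen across all terms.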

Step 8: Performing more sophisticated searches

Our example above is just the starting point. Now that your infrastructure is working, you can begin to perform more sophisticated searches.

For example, you may want to pull in nodes with their full relationships. This would be managed by performing another server request asking for all elements in the relationships found. You should also ask it to omit nodes that have already been returned; otherwise you will keep returning the original node over and over.
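A follow-up request of this kind might be built as in the sketch below. It relies on the `include` and `exclude` properties that the Graph API accepts in vertex definitions; `buildExpandQuery` is a hypothetical helper, the control values are copied from the Step 5 example, and the behaviour should be checked against the Elasticsearch version you run.

```javascript
// Hypothetical helper: build a Graph API request that starts from one term
// already on the chart and asks for terms connected to it, leaving out
// everything that has already been loaded.
function buildExpandQuery(term, knownTerms, field) {
  var f = field || "message";

  // Always exclude the starting term itself, plus every term we already have
  var exclude = knownTerms.indexOf(term) === -1
    ? knownTerms.concat([term])
    : knownTerms.slice();

  return {
    controls: { use_significance: true, sample_size: 2000, timeout: 5000 },
    // Start the exploration from the chosen term only
    vertices: [{ field: f, include: [term] }],
    connections: {
      vertices: [{ field: f, exclude: exclude, size: 20, min_doc_count: 3 }]
    }
  };
}
```

For example, `buildExpandQuery("blue", ["blue", "brown"])` asks for terms connected to "blue" while skipping "blue" and "brown", so the response should only contain the starting vertex and vertices that are new to the chart.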

Next steps: Extending the UI

In our example, we included some controls to run KeyLines’ automatic layouts and a selection detail tool to show information about the selected elements.

You should now be ready to extend these with other functionality to help users explore and understand their data. The KeyLines SDK site contains a fully documented API of functionality for you to incorporate.

Try it yourself!

To find out more about KeyLines, or to start a free trial, just get in touch: http://cambridge-intelligence.com/contact.