be elastic: leapset innovation session 06-08-2015
“Where in the world is Elastic”
Innovation Session
Data & Analytics Team
Try SQL???
SELECT *,
  IF(`discount` > 10, 1, 0) AS `has_discounts`,
  (6371000 * ACOS(COS(RADIANS(cur_lat)) * COS(RADIANS(lat)) * COS(RADIANS(lng) - RADIANS(cur_lng)) + SIN(RADIANS(cur_lat)) * SIN(RADIANS(lat)))) AS `distance`
FROM `Restaurants`
WHERE `type` = 'Pizza'
  AND `price` <= 1500
HAVING `distance` < 500
ORDER BY `WiFi` DESC, `has_discounts` DESC
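The distance term in the query is the spherical law of cosines. A minimal Python sketch of the same arithmetic, assuming coordinates in degrees and a mean Earth radius of 6,371,000 m so the result is in metres (the function names and the `limit_m` parameter are illustrative, not from the session):

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius, in metres


def distance_m(cur_lat, cur_lng, lat, lng):
    """Great-circle distance via the spherical law of cosines, in metres."""
    phi1, phi2 = math.radians(cur_lat), math.radians(lat)
    dlng = math.radians(lng) - math.radians(cur_lng)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(dlng)
                 + math.sin(phi1) * math.sin(phi2))
    # Clamp to [-1, 1]: rounding can push the value slightly out of range.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_M * math.acos(cos_angle)


def within(cur, place, limit_m=500):
    """Hard cutoff, like a `distance < 500` condition in SQL."""
    return distance_m(*cur, *place) < limit_m
```

With a hard cutoff like this, a place a fraction of a metre past the limit is excluded outright, however good it is otherwise.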
What If
● Requirements
  o Pizzas
  o Under 1500
  o Within 500m
● (Optional)
  o Wifi
  o Discount
[Figure labels: 501m, 1501, 500m]
ElasticSearch Approach
{"bool": {
  "must": {"multi_match": {"query": "pizza", "fields": ["type^2", "restaurant"]}},
  "must_not": [],
  "should": [
    {"term": {"features": "wifi"}},
    {"range": {"discounts": {"gt": 10}}}
  ]
}}
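One way to assemble such a query from application code is to build it as a plain dictionary and serialize it. The sketch below assumes the field names from the slide; the function name `pizza_query` and its parameters are hypothetical. In a `bool` query that has a `must` clause, the `should` clauses are optional and only raise the score of documents that match them.

```python
import json


def pizza_query(keyword="pizza", min_discount=10):
    """Build a bool query: `must` is required, `should` only boosts the score."""
    return {
        "bool": {
            "must": [
                {"multi_match": {"query": keyword,
                                 "fields": ["type^2", "restaurant"]}}
            ],
            "should": [
                {"term": {"features": "wifi"}},
                {"range": {"discounts": {"gt": min_discount}}},
            ],
        }
    }


# Serialize the request body; sending it to a cluster is left out here.
body = json.dumps({"query": pizza_query()}, indent=2)
```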
Via ElasticSearch
● Requirements
  o Pizzas
  o Under 1500
  o Within 500m
● (Optional)
  o Wifi
  o Discount
{"gauss": {
  "location": {"origin": "<lat>,<lon>", "offset": "0.5km", "decay": 0.5},
  "price": {"origin": 0, "offset": 1500, "decay": 0.5}
}}
[Figure labels: 501m, 1501, 500m]
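The effect of a Gaussian decay on the 500m vs. 501m case can be worked through numerically. The sketch below follows the documented shape of Elasticsearch's `gauss` decay: the score is 1.0 within `offset` of `origin` and falls to `decay` at a distance of `offset + scale`. Elasticsearch's decay functions also take a `scale`, which the slide does not show, so the scale of 100 m used here is a made-up value for illustration.

```python
import math


def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 within `offset` of `origin`, `decay` at `offset + scale`."""
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    excess = max(0.0, abs(value - origin) - offset)
    return math.exp(-excess ** 2 / (2.0 * sigma_sq))


# Hypothetical scale of 100 m beyond the 500 m offset.
for metres in (500, 501, 600):
    print(metres, round(gauss_decay(metres, origin=0, scale=100, offset=500), 4))
```

A restaurant at 501m scores only marginally below one at 500m instead of being filtered out, and the score reaches 0.5 at 600m.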
Outline
● Introducing ElasticSearch
● Naive comparison
● Lucene: The Architecture
● Plug & Play
● How we use ElasticSearch
● Summary
Introducing ElasticSearch
“Your data, your search”
ElasticSearch
● Open-Source Search & Analytics engine
  o Structured & Unstructured Data
  o (Near) Real Time
  o Analytics capabilities (facets)
  o REST based
● Distributed
  o Designed for the Cloud
  o Designed for Big Data
● Lightweight
● Popular: ~200K downloads/month
Naïve Comparison
ElasticSearch vs. Solr
Commonalities
● Based on Lucene
● Full-text search
● Structured and Unstructured
● Queries, filters, caches, facets
● Cloud-ready
Differences
● Download size
● ZooKeeper vs. ES's own algorithm
● Release process (LGTM vs. Apache)
● Config vs. Magic
Architecture behind ElasticSearch
Inverted Index at Apache Lucene
Elasticsearch Storage Architecture
● Analysis process
Elasticsearch Storage Architecture
● Inverted Indexes
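The two storage slides (analysis, then inverted indexes) can be illustrated with a toy version in Python. This is a sketch of the idea only, not Lucene's implementation: the tokenizer, the stopword list, and the sample documents are all made up for illustration.

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "the", "and", "of"}  # illustrative, not Lucene's list


def analyze(text):
    """Toy analysis chain: lowercase, split on non-alphanumerics, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = {1: "The Pizza Place", 2: "Pizza and Pasta", 3: "House of Noodles"}
index = build_inverted_index(docs)
print(index["pizza"])
```

A search for a term is then a dictionary lookup on the analyzed term rather than a scan over every document.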
Plug & Play
ElasticSearch Cluster
Cluster State
● Index Mappings
● Shard Routing Tables
● Nodes’ metadata
[Diagram labels: Node 1, Node 2, Node 3; Shard A, Shard B, Shard C (each shown twice); ElasticSearch Index, Lucene Index, Segments, Inverted Index, Replicas; Zone A, Zone B]
Index Request
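When a document is indexed, the target shard is derived from a hash of its routing value (by default the document id) modulo the number of primary shards. A minimal sketch of that rule, using CRC32 as a stand-in because Elasticsearch uses its own hash function; the document ids and the three shard names are illustrative:

```python
import zlib


def pick_shard(routing_key, num_primary_shards):
    """shard = hash(routing) % number_of_primary_shards (CRC32 as a stand-in hash)."""
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


shard_names = ["A", "B", "C"]
for doc_id in ("restaurant-1", "restaurant-2", "restaurant-3"):
    shard = shard_names[pick_shard(doc_id, len(shard_names))]
    print(doc_id, "->", "Shard", shard)
```

Because the shard depends on the number of primary shards, that number cannot simply be changed later without re-indexing.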
Search Request
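A search request is typically handled scatter-gather style: every shard computes its own top hits and a coordinating node merges them. The sketch below assumes a toy score (term frequency) and made-up shard contents; real scoring is done by Lucene and is more involved.

```python
import heapq


def search_shard(shard_docs, term, k):
    """Local phase: each shard returns its own top-k (score, doc_id) pairs."""
    hits = [(text.lower().split().count(term), doc_id)
            for doc_id, text in shard_docs.items()]
    return heapq.nlargest(k, [h for h in hits if h[0] > 0])


def search(shards, term, k=2):
    """Coordinator phase: merge the per-shard results into a global top-k."""
    partial = [hit for shard in shards for hit in search_shard(shard, term, k)]
    return heapq.nlargest(k, partial)


shards = [
    {"a1": "pizza pizza pasta", "a2": "noodles"},
    {"b1": "pizza"},
    {"c1": "pizza pizza pizza", "c2": "burger"},
]
top = search(shards, "pizza")
print(top)
```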
How we use ElasticSearch
Use-cases at Leapset
“Can you check the CouchDB errors from last Monday between 1:00 and 1:05 a.m.?”
You gotta be kidding me, Ale!!!!
Centralized Logs
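With logs centralized in Elasticsearch, a question like the one above becomes a filtered query over a time window. The sketch below assumes Logstash-style documents with an `@timestamp` field and the `bool`/`filter` syntax of recent Elasticsearch versions; the `service` and `level` field names and the date are hypothetical placeholders.

```python
from datetime import datetime


def log_window_query(service, level, start, end):
    """Filter logs by service and level inside a [start, end] time window."""
    return {
        "bool": {
            "filter": [
                {"term": {"service": service}},  # hypothetical field name
                {"term": {"level": level}},      # hypothetical field name
                {"range": {"@timestamp": {"gte": start.isoformat(),
                                          "lte": end.isoformat()}}},
            ]
        }
    }


# Placeholder date; substitute the Monday in question.
query = log_window_query("couchdb", "error",
                         datetime(2015, 1, 5, 1, 0), datetime(2015, 1, 5, 1, 5))
```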
Log Monitor with Kibana
*GrayLog by LPG is also powered by Elasticsearch
“Can you detect the slow queries from last Sunday night?”
Network Monitoring - PacketBeat + Kibana
● A packet sniffer that collects and scans packets
● ElasticSearch
  o As the storage backend
  o As the index/search backend
Up & Running at TST2 master DB
“Can you compare restaurant sales within X distance in a given hour?”
Graph Model
Inside story: Comparison Reports
Technology Stack
● Search Backend: Elasticsearch
  o Vertex centric indices
    ▪ Sort & index edges per vertex
  o Enables efficient focused traversals
    ▪ Only retrieve edges that matter
  o Uses pushdown predicates for quick index driven retrieval
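The vertex-centric index idea above can be sketched in a few lines: keep each vertex's edges sorted by a key so that a traversal fetches only the matching range instead of scanning every edge. This is an in-memory illustration, not the stack used in the session; the class name, the `hour` sort key, and the sample edges are hypothetical.

```python
import bisect
from collections import defaultdict


class VertexCentricIndex:
    """Per-vertex edge lists kept sorted by a key (here: hour) for range lookups."""

    def __init__(self):
        self._edges = defaultdict(list)  # vertex -> sorted [(hour, neighbour)]

    def add_edge(self, vertex, hour, neighbour):
        bisect.insort(self._edges[vertex], (hour, neighbour))

    def edges_between(self, vertex, start_hour, end_hour):
        """Return only the edges with start_hour <= hour < end_hour."""
        edges = self._edges.get(vertex, [])
        lo = bisect.bisect_left(edges, (start_hour,))
        hi = bisect.bisect_left(edges, (end_hour,))
        return edges[lo:hi]


idx = VertexCentricIndex()
idx.add_edge("restaurant-1", 13, "sale-b")
idx.add_edge("restaurant-1", 12, "sale-a")
idx.add_edge("restaurant-1", 18, "sale-c")
lunch = idx.edges_between("restaurant-1", 12, 14)
```

The range predicate is evaluated against the sorted edge list itself, which is the spirit of pushing the predicate down to the index.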
Summary