delhi elasticsearch meetup

Upload: bharvi-dixit

Post on 13-Jul-2015

TRANSCRIPT

Delhi Elasticsearch Meetup

Bharvi Dixit@d_bharvi

Nov 29, 2014 Delhi Elasticsearch Meetup: Talk About the Most Advanced Search Engine of the World, "Elasticsearch"

Agenda

• What is a search engine?
• Lucene Overview and Indexing Pipeline.
• Data Driven Approaches & Problems.
• Elasticsearch Comes to Rescue.
• Understanding Elasticsearch Architecture.
• Logstash & Kibana Overview.
• The ELK stack together.
• Some tips.

About Me

• Software engineer @Orkash.
• Loves Java, Data, Elasticsearch, MongoDB, Eclipse.
• Interested in all things scale, search, security & DevOps.
• Creator: CIBET Pro Manager
• Working on Elasticsearch for more than a year.
• Social Media and News Media Intelligence. (Complex schemas & Query designs)

What is a search engine?

• An information retrieval system designed to find information stored in a computer system.

A search engine has different modules:

– Data collected from various sources
– Indexing
– Data stored in indexes
– Data is queried

• But what about relevant versus irrelevant results?

What is a search engine?

• Auto completion
• Did-You-Mean
• Spell correction
• Multi-lingual
• Stemming
• Synonyms
• Highlighting
• More-Like-This

Lucene Overview

Lucene:
• Open source, fast, high-performance search/IR library.
• Written in Java.
• Initially developed by Doug Cutting (also the author of Hadoop).
• Indexing and searching.
• Inverted index of documents.
• Provides advanced search options like synonyms, stopwords, similarity, proximity.

Lucene Internals- Inverted Index

Credit: https://developer.apple.com/library/mac/documentation/userexperience/conceptual/SearchKitConcepts/searchKit_basics/searchKit_basics.html
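
Since the diagram for this slide did not survive, here is a minimal Python sketch of the idea behind an inverted index: each term maps to the documents that contain it, so a query becomes a lookup plus a set intersection. This is an illustration only, not Lucene's actual data structure; the function names and sample documents are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return ids of documents that contain every term of the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents keyed by document id.
docs = {
    1: "Elasticsearch is built on Lucene",
    2: "Lucene is a search library",
    3: "Kibana visualizes search results",
}
index = build_inverted_index(docs)
matches = search(index, "search lucene")
```

The lookup cost depends on the number of query terms and the size of their posting lists rather than on the total number of documents, which is why this layout suits full-text search.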

Lucene Internals- Continued

• Defines a document model.
• An index contains documents.
• Each document consists of fields.
• Each field has attributes:
– What is the data type (FieldType)
– How to handle the content (Analyzers, Filters)
– Is it a stored field (stored="true") or an indexed field (indexed="true")

Indexing Pipeline

• Analyzer: creates tokens using a Tokenizer and/or by applying Filters (Token Filters).
• Each field can define an Analyzer at index time, at query time, or both.

Document → [Analysis Phase: Tokenizer → Token Filter] → Document Writer → Inverted Index
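
The analysis phase on this slide can be sketched in a few lines of Python: a tokenizer splits the text, then each token filter transforms the token list in turn. This is a toy model of the concept, not Lucene's Analyzer API; the stopword list and function names are assumptions chosen for illustration.

```python
import re

# Illustrative stopword list; real analyzers ship much longer ones.
STOPWORDS = {"a", "an", "the", "is", "of"}


def tokenize(text):
    """Tokenizer: split text into alphanumeric word tokens."""
    return re.findall(r"[A-Za-z0-9]+", text)


def lowercase_filter(tokens):
    """Token filter: normalize case so 'Lucene' and 'lucene' match."""
    return [t.lower() for t in tokens]


def stopword_filter(tokens):
    """Token filter: drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOPWORDS]


def analyze(text, filters=(lowercase_filter, stopword_filter)):
    """Analyzer: run the tokenizer, then apply each token filter in order."""
    tokens = tokenize(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


tokens = analyze("The Power of Lucene!")
```

Filter order matters here: the stopword filter compares against lowercase words, so it has to run after the lowercase filter. Using the same chain at index time and at query time keeps the terms on both sides comparable.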

Everything starts with a problem..!!

• Data Driven Decisions
• Logfiles for scaling up/down
• Warehouse withdrawal triggers orders
• History for fraud detection
• Assembly line, throughput improvement

... data explosion

Everything starts with a problem..!!

Better decisions == more data?

Data

Big Data

BIG DATA

Big Data Problem goes on..

• I need BIG DATA.
• I need to analyze this data.
• I need to enrich this big data & make it even bigger.
• I need fast searching.
• I need real-time analytics.
• Ohh wait.. I need relational queries on this big data to get more insights..
• I need .. I need .. I need..

And I guess this is why someone nailed it..

Elasticsearch comes to rescue..

What is Elasticsearch:
• “you know, for search”
• Schema-free, REST & JSON based distributed full-text search engine & document store.
• Written in Java & built on top of Lucene.
• Highly reliable, scalable, fault tolerant.
• Supports distributed indexing, replication, and load-balanced querying.
• Powerful geo-spatial queries.
• Latest release: 1.4.1

Wait..!! Schema free?? The real gotcha..
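
To make "REST & JSON based" concrete, the sketch below builds (but does not send) an HTTP request that would index one JSON document. It assumes a node on the default HTTP port 9200 and uses the `/{index}/{type}/{id}` URL layout of the 1.x releases current at the time of this talk; later versions dropped mapping types, so the path differs there. The index name, type name, and document fields are hypothetical.

```python
import json
from urllib.request import Request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build a PUT request that stores `document` as JSON under index/type/id."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index, type, and document, for illustration only.
request = build_index_request(
    "http://localhost:9200",
    "meetups",
    "talk",
    1,
    {"title": "Delhi Elasticsearch Meetup", "speaker": "Bharvi Dixit", "date": "2014-11-29"},
)
# To send it to a running node: urllib.request.urlopen(request)
```

No mapping is declared before the document is sent. When none exists, Elasticsearch infers field types from the first document it sees, which is likely the "gotcha" the slide alludes to: a wrong guess can be awkward to correct later.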

Elasticsearch comes to rescue..

What does it add to Lucene:
• REST service: JSON APIs over HTTP.
• High availability & performance: clustering & replication.
• A powerful query DSL.
• Interoperation with non-Java/JVM languages.
• More and more resilience.
• Multitenancy.
• And the best one: it lets you maintain relationships among documents.
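
The query DSL is plain JSON, so a search body can be assembled as an ordinary dictionary. The sketch below combines a full-text `match` clause with an optional `range` clause inside a `bool` query. The helper name and the `title` and `date` field names are assumptions for illustration; the resulting body would be sent to an index's `_search` endpoint.

```python
import json


def build_search_body(text, field="title", since=None, size=10):
    """Build a bool query: full-text match on `field`, optionally limited by date."""
    must = [{"match": {field: text}}]
    if since is not None:
        # Hypothetical `date` field; keep only documents on or after `since`.
        must.append({"range": {"date": {"gte": since}}})
    return {"query": {"bool": {"must": must}}, "size": size}


body = build_search_body("elasticsearch meetup", since="2014-01-01")
print(json.dumps(body, indent=2))
```

Because the body is just data, clients in any language that can produce JSON can issue the same query, which is the interoperability point made on this slide.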

The Elasticsearch Open Source Model

The Popularity of Elasticsearch

10M downloads in 2 years and counting..

The Popularity of Elasticsearch

Have a look at the case studies here: http://www.elasticsearch.org/case-studies/

Understanding Elasticsearch Structure

A live demo is better than nothing

Logstash

• Tool for receiving, processing and outputting logs. (Input → Filter → Output)
• All kinds of logs: system logs, error logs, webserver logs, application logs & just about anything you can throw at it.
• Open Source: Apache License 2.0.
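
The input, filter, output shape can be mimicked with three small Python functions. This is only a conceptual toy, not Logstash configuration or its plugin API, and the log line format (`<timestamp> <LEVEL> <message>`) is an assumption made for the example.

```python
import json
import re

# Hypothetical log format: "<ISO timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def read_input(lines):
    """Input stage: yield raw, non-empty lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def parse_filter(lines):
    """Filter stage: turn each matching line into a structured event, drop the rest."""
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            yield match.groupdict()


def json_output(events):
    """Output stage: serialize each event as one JSON string."""
    return [json.dumps(event, sort_keys=True) for event in events]


raw = [
    "2014-11-29T10:00:00 ERROR disk full",
    "",
    "not a log line",
    "2014-11-29T10:00:05 INFO recovered",
]
events = json_output(parse_filter(read_input(raw)))
```

Each stage only knows about the one before it, so inputs, filters, and outputs can be swapped independently, which is the design idea behind the Input → Filter → Output pipeline.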

Kibana

• Execute queries on your data & visualize results.
• Add/remove widgets.
• Share/Save/Load dashboards.
• No need to know coding.
• Open Source: Apache License 2.0.

The ELK Stack Together

meetup.com RSVP stream

• All RSVPs are written out to an HTTP stream.
• Each line is a JSON document.
• Available at http://stream.meetup.com/2/rsvps
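
A stream where each line is a JSON document can be consumed one line at a time. The sketch below parses such lines from any iterable of strings, so it runs without a network connection; the `response` and `guests` field names in the sample are assumptions, not a statement of the stream's actual schema.

```python
import json


def parse_rsvp_lines(lines):
    """Yield one parsed document per non-empty line, skipping invalid JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # partial or corrupt line; this sketch simply skips it


# Hypothetical sample lines; the last one is deliberately truncated.
sample = [
    '{"response": "yes", "guests": 0}',
    "",
    '{"response": "no", "guests": 1}',
    '{"response": "ye',
]
rsvps = list(parse_rsvp_lines(sample))
yes_count = sum(1 for rsvp in rsvps if rsvp.get("response") == "yes")
```

Because the function is a generator, it can be pointed at a long-lived HTTP response object as well, handling each RSVP as it arrives instead of waiting for the stream to end.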

meetup.com RSVP stream

In the end..

• Look out for best practices. (Proper cluster formation, bulk indexing)
• Continuous monitoring: Marvel, Bigdesk, HQ.
• Open-JDK strictly prohibited.
• Elasticsearch is always hungry: Give me more RAM..!!
• Benchmark your data before creating indexes/shards. (Once created, shards can't be split.)
• And don't forget to create mappings.
• Manage your security.. But now it's coming soon.. Elasticsearch Shield.. “you know, for security”
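
As an illustration of the bulk indexing tip, the sketch below builds the newline-delimited body that the `_bulk` endpoint expects: one action line followed by one source line per document, with a trailing newline. It includes `_type` in the action metadata to match the 1.x releases of the talk's era; later versions dropped types. The helper name, index name, and documents are hypothetical.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build a _bulk request body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(source))  # document source line
    if not lines:
        return ""
    # The bulk format requires every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical documents, for illustration only.
bulk_body = build_bulk_body(
    "meetups",
    "rsvp",
    [(1, {"response": "yes"}), (2, {"response": "no"})],
)
```

Sending many documents in one request avoids a network round trip per document, which is the usual motivation for bulk indexing over single-document calls.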

Thank You for Listening

[email protected]
twitter.com/d_bharvi
slideshare.net/bharvidixit/
