Search and Analytics (using Elasticsearch)

Costin Leau

Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited.

Upload: bigdatalondon

Post on 06-May-2015

DESCRIPTION

Slides from Costin Leau's talk on Search and Analytics (using Elasticsearch) at the 18th Big Data London meetup

TRANSCRIPT

Page 1: Search and Analytics (using Elasticsearch)

Search and Analytics

(using Elasticsearch)

Costin Leau

Page 2: Search and Analytics (using Elasticsearch)

Why search?

Page 3: Search and Analytics (using Elasticsearch)

Search – what’s the big deal?

Basic/Metadata retrieval

“Find banks with more than (x) accounts”
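
A metadata lookup like this one maps naturally onto a range query. The following is a minimal sketch in Python that only builds the JSON search body; the field name `accounts` and the helper name are assumptions for illustration, not something the slides specify.

```python
import json


def banks_with_more_accounts_than(x):
    """Build a search body matching documents whose `accounts` value exceeds x.

    `accounts` is a hypothetical numeric field name.
    """
    return {"query": {"range": {"accounts": {"gt": x}}}}


body = banks_with_more_accounts_than(1000)
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of whichever index holds the bank documents.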

Page 4: Search and Analytics (using Elasticsearch)

Search – what’s the big deal?

Basic/Metadata retrieval

“Find banks near my location”
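
A "near my location" lookup can be expressed as a distance filter. The sketch below builds such a search body in Python. It assumes a hypothetical field called `location` that is mapped as a geo point, and it uses the `bool`/`filter` wrapper found in recent Elasticsearch releases; older releases wrapped filters differently.

```python
import json


def banks_near(lat, lon, distance="2km"):
    """Build a search body keeping only documents within `distance` of a point.

    `location` is a hypothetical field name and must be mapped as a geo point.
    """
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


print(json.dumps(banks_near(51.5074, -0.1278), indent=2))
```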

Page 5: Search and Analytics (using Elasticsearch)

Search – What we’re all about

Page 7: Search and Analytics (using Elasticsearch)

Search categories

Basic/Metadata retrieval

Full-text search

Highlighting

Geolocation

Fuzzy search (“did-you-mean”)

Natural Language

data stores

search engines
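
Several of the categories above can be combined in a single request. As a rough illustration, the Python sketch below builds a search body that performs a full-text match, tolerates small typos (fuzzy search), and asks for highlighted snippets. The field name `description` and the helper name are assumptions made for this example.

```python
import json


def fuzzy_highlighted_search(text, field="description"):
    """Build a full-text search body with typo tolerance and highlighting.

    `field` is a hypothetical text field; "AUTO" lets the engine choose the
    allowed edit distance based on term length.
    """
    return {
        "query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}},
        "highlight": {"fields": {field: {}}},
    }


print(json.dumps(fuzzy_highlighted_search("elasticsaerch"), indent=2))
```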

Page 8: Search and Analytics (using Elasticsearch)

‘Players’ in the search market

Search engines

- Google/Bing/Yahoo!/Ask.com/Yandex/Baidu

Open-Source

- Sphinx

- Apache Lucene

- Elasticsearch

- Solr

- Sensei

Enterprise Search

- Oracle Endeca / MDEX

- HP Autonomy

- Exalead

- IBM Enterprise Search

Page 9: Search and Analytics (using Elasticsearch)

Elasticsearch

Page 12: Search and Analytics (using Elasticsearch)

Elasticsearch

Open-Source Search & Analytics engine

- Structured & Unstructured Data

- Real Time

- Analytics capabilities (facets)

- REST based

Distributed

- Designed for the Cloud

- Designed for Big Data

Lightweight

Popular: >200K downloads/month
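
To make "REST based" and "analytics capabilities" concrete, here is a small Python sketch that prepares (but does not send) an HTTP search request. The index name `banks`, the field `city`, and the host `http://localhost:9200` are assumptions for illustration. The slides mention facets; later Elasticsearch releases replaced facets with aggregations, so the sketch uses an `aggs` section instead.

```python
import json
import urllib.request


def build_search_request(index, body, host="http://localhost:9200"):
    """Prepare a POST to the index's _search endpoint without sending it."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Count documents per city; "size": 0 skips the individual hits.
body = {"size": 0, "aggs": {"by_city": {"terms": {"field": "city"}}}}
req = build_search_request("banks", body)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would execute the search against a running node; any other HTTP client works equally well, since the interface is plain JSON over HTTP.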

Page 13: Search and Analytics (using Elasticsearch)

Users

Page 15: Search and Analytics (using Elasticsearch)

Platform Adoption

http://www.thoughtworks.com/radar#platforms 2013

Page 18: Search and Analytics (using Elasticsearch)

Use Case - Geolocation

Searches 50,000,000 venues every day using Elasticsearch

Page 19: Search and Analytics (using Elasticsearch)

Use Case – Support/Reporting

Page 20: Search and Analytics (using Elasticsearch)

Use Case - Centralized Logging

Page 21: Search and Analytics (using Elasticsearch)

Use Case - Pure Analytics

Page 22: Search and Analytics (using Elasticsearch)

Search and Big Data

Page 23: Search and Analytics (using Elasticsearch)

A Holistic View of a Big Data System

- ETL

- Real-Time Streams

- Unstructured Data (HDFS)

- RT Semi-structured Database (HBase, Cassandra, Mongo)

- Big SQL (Greenplum, AsterData, etc…)

- Batch Processing

- Real-Time Processing (S4, Storm)

- Analytics

Page 25: Search and Analytics (using Elasticsearch)

Hadoop eco-system

Hadoop Distributed File System (HDFS)

Map Reduce Framework (MapRed)

Page 27: Search and Analytics (using Elasticsearch)

Elasticsearch + Hadoop

[Figure: two bar charts, "Writing" and "Reading / Querying". Each compares M/R, Pig and Hive in two series, "Raw" and "w/ ES". Vertical axis runs from 0 to 60; units are not shown.]

Page 28: Search and Analytics (using Elasticsearch)

Explore data through (Elastic)Search

Page 29: Search and Analytics (using Elasticsearch)

Thank you! @costinl

http://www.elasticsearch.org/