Elasticsearch Quick Introduction

Upload: imotov

Post on 26-Jan-2015

Page 1: Elasticsearch Quick Introduction

Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited

Elasticsearch and MIT Sloan Data Analytics Hackathon Cambridge, MA - May 10, 2014

Elasticsearch Quick Introduction

Page 2: Elasticsearch Quick Introduction

About Me

• Igor Motov

• Developer at Elasticsearch Inc.

• Github: imotov

• Twitter: @imotov

Page 3: Elasticsearch Quick Introduction

About Elasticsearch Inc.

• Founded in 2012
  - By the people behind Elasticsearch and Apache Lucene
  - http://www.elasticsearch.com
  - Headquarters: Amsterdam and Los Altos, CA

• We provide
  - Training (public & onsite)
  - Development support
  - Production support subscription (SLA)

Page 4: Elasticsearch Quick Introduction

About Elasticsearch

• Real time search and analytics engine
  - JSON-oriented, Apache Lucene-based

• Automatic Schema Detection
  - The schema can still be controlled explicitly when needed

• Distributed
  - Scales Up+Out, Highly Available

• Multi-tenancy
  - Dynamically create/delete indices

• API centric
  - Most functionality is exposed through an API

Page 5: Elasticsearch Quick Introduction

Basic Concepts

• Cluster: a group of nodes sharing the same set of indices

• Node: a running Elasticsearch instance (typically a JVM process)

• Index: a set of documents of possibly different types, stored in one or more shards

• Type: a set of documents in an index that share the same schema

• Shard: a Lucene index, allocated on one of the nodes

Page 6: Elasticsearch Quick Introduction

Basic Concepts - Document

• JSON Object

    {
      "rank": 21,
      "city": "Boston",
      "state": "Massachusetts",
      "population2010": 617594,
      "land_area": 48.277,
      "density": 12793,
      "ansi": 619463,
      "location": {
        "lat": 42.332,
        "lon": 71.0202
      },
      "abbreviation": "MA"
    }

• Identified by index/type/id
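As a minimal sketch of the index/type/id addressing, the Python below builds the REST path for one document. The helper name `document_path` is an assumption made for illustration, not part of any Elasticsearch client.

```python
from urllib.parse import quote


def document_path(index, doc_type, doc_id):
    """Build the /index/type/id path that addresses a single document."""
    parts = (index, doc_type, str(doc_id))
    # Percent-encode each segment so a character such as '/' stays inside it.
    return "/" + "/".join(quote(part, safe="") for part in parts)


print(document_path("test-data", "cities", 21))  # /test-data/cities/21
```

The index, type and id used here are the ones the later slides use for the Boston document.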

Page 7: Elasticsearch Quick Introduction

Downloading Elasticsearch

• http://www.elasticsearch.org/download/

(Screenshot of the download page; labels: "Windows", "Everything else")

Page 8: Elasticsearch Quick Introduction

What’s in a distribution?

    .
    ├── LICENSE.txt
    ├── NOTICE.txt
    ├── README.textile
    ├── bin                          (executable scripts)
    │   ├── elasticsearch
    │   ├── elasticsearch.in.sh
    │   └── plugin
    ├── config                       (node config files)
    │   ├── elasticsearch.yml
    │   └── logging.yml
    ├── data                         (data storage)
    │   └── elasticsearch
    ├── lib                          (libs)
    │   ├── elasticsearch-x.y.z.jar
    │   └── ...
    └── logs                         (log files)
        ├── elasticsearch.log
        └── elasticsearch_index_search_slowlog.log

Page 9: Elasticsearch Quick Introduction

Configuration (multicast)

• Configuration: config/elasticsearch.yml

    cluster.name: "elasticsearch-imotov"    # unique name

Page 10: Elasticsearch Quick Introduction

Configuration (stand-alone)

• Configuration: config/elasticsearch.yml

    # unique name
    cluster.name: "elasticsearch-imotov"
    # listen only on localhost
    network.host: "127.0.0.1"
    # disable multicast
    discovery.zen.ping.multicast.enabled: false
    # search for other nodes on localhost
    discovery.zen.ping.unicast.hosts: ["localhost:9300", "localhost:9301", "localhost:9302"]

Page 11: Elasticsearch Quick Introduction

Starting elasticsearch

• Foreground

    $ bin/elasticsearch

• Background

    $ bin/elasticsearch -d

Page 12: Elasticsearch Quick Introduction

Is it running?

    $ curl -XGET "http://localhost:9200/?pretty"

    {
      "status" : 200,
      "name" : "Kamal",
      "version" : {
        "number" : "1.1.1",
        "build_hash" : "f1585f096d3f3985e73456debdc1a0745f512bbc",
        "build_timestamp" : "2014-04-16T14:27:12Z",
        "build_snapshot" : false,
        "lucene_version" : "4.7"
      },
      "tagline" : "You Know, for Search"
    }
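The same check can be sketched in Python with only the standard library. The function names `cluster_info` and `summarize` are assumptions made for this example. `cluster_info` needs a node listening on localhost:9200, so the demonstration below runs `summarize` on the response copied from the slide instead.

```python
import json
from urllib.request import urlopen


def cluster_info(base_url="http://localhost:9200"):
    """Fetch the root endpoint; raises URLError if nothing is listening."""
    with urlopen(base_url + "/", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(info):
    """Reduce the root response to the fields worth checking at a glance."""
    return "{name}: Elasticsearch {es} (Lucene {lucene})".format(
        name=info["name"],
        es=info["version"]["number"],
        lucene=info["version"]["lucene_version"],
    )


# Response copied from the slide, so this part runs without a node.
sample = {
    "status": 200,
    "name": "Kamal",
    "version": {"number": "1.1.1", "lucene_version": "4.7"},
    "tagline": "You Know, for Search",
}
print(summarize(sample))  # Kamal: Elasticsearch 1.1.1 (Lucene 4.7)
```

With a node running, `summarize(cluster_info())` would report that node's own name and version.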

Page 13: Elasticsearch Quick Introduction

Communicating with Elasticsearch

• REST API: Curl, Ruby, Python, PHP, Perl, JavaScript (community supported)

• Binary Protocol: Java

Page 14: Elasticsearch Quick Introduction

Pick your client

• Java: included in distribution

• Ruby, PHP, Perl, Python: http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/

• Everything Else: http://www.elasticsearch.org/guide/clients/

Page 15: Elasticsearch Quick Introduction

Indexing a document

    $ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{
      "rank": 21,
      "city": "Boston",
      "state": "Massachusetts",
      "population2010": 617594,
      "land_area": 48.277,
      "density": 12793,
      "ansi": 619463,
      "location": {
        "lat": 42.332,
        "lon": 71.0202
      },
      "abbreviation": "MA"
    }'

    {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":1}
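A rough Python equivalent of the curl call, using only the standard library. The helper name `index_request` is an assumption for this sketch. It only constructs the PUT request, so it runs without a node. The Content-Type header is an addition that the curl command on the slide does not send.

```python
import json
from urllib.request import Request


def index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) the PUT request that indexes one document."""
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


boston = {
    "rank": 21,
    "city": "Boston",
    "state": "Massachusetts",
    "population2010": 617594,
}
request = index_request("http://localhost:9200", "test-data", "cities", 21, boston)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it once a node is listening on that address.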

Page 16: Elasticsearch Quick Introduction

Getting a document

    $ curl -XGET "http://localhost:9200/test-data/cities/21?pretty"

    {
      "_index" : "test-data",
      "_type" : "cities",
      "_id" : "21",
      "_version" : 1,
      "exists" : true,
      "_source" : {
        "rank": 21,
        "city": "Boston",
        "state": "Massachusetts",
        "population2010": 617594,
        "land_area": 48.277,
        "density": 12793,
        "ansi": 619463,
        "location": {
          "lat": 42.332,
          "lon": 71.0202
        },
        "abbreviation": "MA"
      }
    }

Page 17: Elasticsearch Quick Introduction

Updating a document

    $ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{
      "rank": 21,
      "city": "Boston",
      "state": "Massachusetts",
      "population2010": 617594,
      "population2012": 636479,
      "land_area": 48.277,
      "density": 12793,
      "ansi": 619463,
      "location": {
        "lat": 42.332,
        "lon": 71.0202
      },
      "abbreviation": "MA"
    }'

    {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":2}
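The update is a second PUT to the same index/type/id: the whole document is replaced and the version goes from 1 to 2. The toy class below imitates only that behaviour in memory. `ToyIndex` is a made-up illustration and says nothing about how Elasticsearch stores documents internally.

```python
class ToyIndex:
    """In-memory stand-in that mimics replace-and-bump-version on PUT."""

    def __init__(self):
        self._docs = {}  # (type, id) -> (version, source)

    def put(self, doc_type, doc_id, source):
        previous_version = self._docs.get((doc_type, doc_id), (0, None))[0]
        version = previous_version + 1
        # The new source replaces the old one entirely; nothing is merged.
        self._docs[(doc_type, doc_id)] = (version, dict(source))
        return {"_type": doc_type, "_id": doc_id, "_version": version}

    def get(self, doc_type, doc_id):
        version, source = self._docs[(doc_type, doc_id)]
        return {"_version": version, "_source": source}


index = ToyIndex()
first = index.put("cities", "21", {"city": "Boston", "population2010": 617594})
second = index.put("cities", "21", {"city": "Boston", "population2010": 617594,
                                    "population2012": 636479})
print(first["_version"], second["_version"])  # 1 2
```

Because the document is replaced rather than merged, the second request has to resend every field, as the curl command on the slide does.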

Page 18: Elasticsearch Quick Introduction

Searching

    $ curl -XGET 'http://localhost:9200/test-data/cities/_search?pretty' -d '{
      "query": {
        "match": { "city": "Boston" }
      }
    }'

Page 19: Elasticsearch Quick Introduction

Searching

    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 6.1357985,
        "hits" : [ {
          "_index" : "test-data",
          "_type" : "cities",
          "_id" : "21",
          "_score" : 6.1357985,
          "_source" : {"rank":"21","city":"Boston",...}
        } ]
      }
    }
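A short sketch of how a client might walk this response in Python. The function name `hit_summaries` is an assumption, and the sample keeps only the `_source` fields visible on the slide, since the rest is elided there.

```python
def hit_summaries(response):
    """Return (id, score, source) for every hit in a search response."""
    return [
        (hit["_id"], hit["_score"], hit["_source"])
        for hit in response["hits"]["hits"]
    ]


# Trimmed copy of the response shown on the slide.
sample_response = {
    "took": 5,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "failed": 0},
    "hits": {
        "total": 1,
        "max_score": 6.1357985,
        "hits": [
            {
                "_index": "test-data",
                "_type": "cities",
                "_id": "21",
                "_score": 6.1357985,
                "_source": {"rank": "21", "city": "Boston"},
            }
        ],
    },
}

for doc_id, score, source in hit_summaries(sample_response):
    print(doc_id, score, source["city"])
```

The outer `hits` object carries the totals, and the inner `hits` array carries the matching documents themselves.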

Page 20: Elasticsearch Quick Introduction

Range Queries

    $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
      "query": {
        "range": {
          "population2012": {
            "from": 500000,
            "to": 1000000
          }
        }
      }
    }'

Page 21: Elasticsearch Quick Introduction

Boolean Queries

    $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
      "query": {
        "bool": {
          "should": [
            { "match": { "state": "Texas" } },
            { "match": { "state": "California" } }
          ],
          "must": {
            "range": {
              "population2012": {
                "from": 500000,
                "to": 1000000
              }
            }
          },
          "minimum_should_match": 1
        }
      }
    }'
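To make the must / should / minimum_should_match logic concrete, here is a plain-Python imitation of this one query. It is a simplification: `match` is reduced to an exact string comparison, scoring is ignored, and the range is treated as inclusive at both ends. The two "Example" documents are invented for illustration; only the Boston figures come from the slides.

```python
def in_range(doc, field, low, high):
    """Inclusive range check; a document without the field does not match."""
    value = doc.get(field)
    return value is not None and low <= value <= high


def matches(doc, minimum_should_match=1):
    """Mimic the bool query above for a single document."""
    must = in_range(doc, "population2012", 500000, 1000000)
    should = [
        doc.get("state") == "Texas",
        doc.get("state") == "California",
    ]
    # Every must clause has to hold, plus at least N of the should clauses.
    return must and sum(should) >= minimum_should_match


docs = [
    {"city": "Boston", "state": "Massachusetts", "population2012": 636479},
    {"city": "Example A", "state": "Texas", "population2012": 700000},       # invented
    {"city": "Example B", "state": "California", "population2012": 2000000},  # invented
]
print([d["city"] for d in docs if matches(d)])  # ['Example A']
```

Boston passes the range clause but neither state clause, and "Example B" passes a state clause but not the range, so only "Example A" is returned.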

Page 22: Elasticsearch Quick Introduction

MatchAll Query

    $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
      "query": {
        "match_all": { }
      }
    }'

Page 23: Elasticsearch Quick Introduction

Sorting and Paging

    $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
      "query": {
        "match_all": { }
      },
      "sort": [
        { "state": { "order": "asc" } },
        { "population2010": { "order": "desc" } }
      ],
      "from": 0,
      "size": 20
    }'
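The effect of `sort`, `from` and `size` can be imitated on a plain list in Python. This is a sketch under assumptions: the sample records are invented, and the state is compared as one whole string, which may differ from how an analyzed string field is sorted on a real node.

```python
def sort_and_page(docs, start=0, size=20):
    """Order by state ascending, then population2010 descending, then slice."""
    ordered = sorted(docs, key=lambda d: (d["state"], -d["population2010"]))
    # "from" is the offset of the first hit; "size" is the page length.
    return ordered[start:start + size]


# Invented records, for illustration only.
docs = [
    {"city": "A", "state": "Texas", "population2010": 100},
    {"city": "B", "state": "California", "population2010": 300},
    {"city": "C", "state": "Texas", "population2010": 200},
    {"city": "D", "state": "California", "population2010": 50},
]
print([d["city"] for d in sort_and_page(docs, start=0, size=3)])  # ['B', 'D', 'C']
```

The next page would use `start=3`, just as the query would use `"from": 3`.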

Page 24: Elasticsearch Quick Introduction

Analysis

• By default, strings are
  - Divided into words (tokens)
  - Converted to lower-case, token by token

Page 25: Elasticsearch Quick Introduction

Analysis Example

• “Elasticsearch is a powerful open source search and analytics engine.”

  1. elasticsearch
  2. is
  3. a
  4. powerful
  5. open
  6. source
  7. search
  8. and
  9. analytics
  10. engine
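A rough approximation of this default behaviour in Python: split on non-word characters and lower-case the result. The function `simple_analyze` is an assumption made for illustration. The real analyzer follows more detailed word-boundary rules, so it can differ on other inputs, but for this sentence the sketch yields the same ten tokens.

```python
import re


def simple_analyze(text):
    """Lower-case the text, then return its runs of word characters."""
    return re.findall(r"\w+", text.lower())


sentence = "Elasticsearch is a powerful open source search and analytics engine."
for position, token in enumerate(simple_analyze(sentence), start=1):
    print(position, token)
```

A search for "Powerful" goes through the same steps at query time, which is why it finds a document indexed with "powerful".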

Page 26: Elasticsearch Quick Introduction

Customizing the mapping

    $ curl -XPUT 'http://localhost:9200/my_index/' -d '{
      "settings": {
        "index": {
          "number_of_shards": 1,
          "number_of_replicas": 0
        }
      },
      "mappings": {
        "my_type": {
          "properties": {
            "description": { "type": "string" },
            "sku": { "type": "string", "index": "not_analyzed" },
            "count": { "type": "integer" },
            "price": { "type": "float" },
            "location": { "type": "geo_point" }
          }
        }
      }
    }'

• description: analyzed text

• sku: exact match

• location: geo location
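The difference between analyzed text and an exact-match (`not_analyzed`) field can be sketched in Python. `index_terms` and the sample values "Red Widget" and "RW-100" are invented for illustration, and the analyzed branch is only a regex approximation of the default analysis.

```python
import re


def index_terms(value, analyzed=True):
    """Return the terms a string field would contribute to the index."""
    if analyzed:
        # Approximation of default analysis: split into words, lower-case.
        return re.findall(r"\w+", value.lower())
    # not_analyzed: the whole value is kept as one unmodified term.
    return [value]


print(index_terms("Red Widget"))              # ['red', 'widget']
print(index_terms("RW-100", analyzed=False))  # ['RW-100']
print(index_terms("RW-100", analyzed=True))   # ['rw', '100']
```

Keeping `sku` unanalyzed means an exact lookup for "RW-100" matches that document only, while an analyzed `sku` would be broken into pieces that other SKUs could share.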

Page 27: Elasticsearch Quick Introduction

Elasticsearch Reference

• http://www.elasticsearch.org/guide/

Page 28: Elasticsearch Quick Introduction

Ideas for hackathon

• Explore data: Wikipedia, Twitter, Enron emails

• Play with Kibana

• Build Elasticsearch plugins

• Get prizes

Page 29: Elasticsearch Quick Introduction

Elasticsearch Meetup

http://www.meetup.com/Elasticsearch-Boston/

Page 30: Elasticsearch Quick Introduction

We are hiring

http://www.elasticsearch.com/about/jobs/