elasticsearch 5 in amazon elasticsearch service

Post on 16-Feb-2017

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Elasticsearch 5 in Amazon Elasticsearch Service

Darin Briskman, Amazon Web Services Technical Evangelist

briskman@amazon.com or @briskmad

15 Feb 2017

Jon Handler, AWS Principal Solutions Architect

handler@amazon.com or @_searchgeek

Get started at https://aws.amazon.com/elasticsearch-service/

Amazon Search Services

Amazon CloudSearch

Amazon Elasticsearch Service

Open source distributed index

Managed service using Elasticsearch and Kibana

Fully managed; zero admin

Highly available and reliable

RESTful API for easy integration

Amazon Elasticsearch Service

Amazon Elasticsearch Service Leading Use Cases

Log Analytics & Operational Monitoring

• Monitor the performance of applications, web servers, and hardware

• Easy to use, powerful data visualization tools to detect issues quickly

• Dig into logs in an intuitive, fine-grained way

• Kibana provides fast, easy visualization

Search

• Application or website provides search capabilities over diverse documents

• Tasked with making this knowledge base searchable and accessible

• Text matching, faceting, filtering, fuzzy search, auto complete, highlighting, and other search features

• Query API to support application search

Leading enterprises trust Amazon Elasticsearch Service for their search and analytics applications

Media & Entertainment

Online Services

Technology

Other

Adobe Developer Platform (Adobe I/O)

PROBLEM

• Cost-effective monitoring for XL amounts of log data

• Over 200,000 API calls per second at peak: destinations, response times, bandwidth

• Integrate seamlessly with other components of the AWS ecosystem

SOLUTION

• Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using AES Kibana

• Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges

BENEFITS

• Management and operational simplicity

• Flexibility to try out different cluster configurations during dev and test

Diagram labels: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service

McGraw Hill Education

PROBLEM

• Supporting a wide catalog across multiple services in multiple jurisdictions

• Over 100 million learning events each month

• Tests, quizzes, learning modules begun / completed / abandoned

SOLUTION

• Search and analyze test results, student/teacher interaction, teacher effectiveness, student progress

• Analytics of applications and infrastructure are now integrated to understand operations in real time

BENEFITS

• Confidence to scale throughout the school year: from 0 to 32 TB in 9 months

• Focus on their business, not their infrastructure

Amazon Elasticsearch Service Benefits

Easy to Use

Deploy a production-ready Elasticsearch cluster in minutes

Simplifies time-consuming management tasks such as software patching, failure recovery, backups, and monitoring

Open

Get direct access to the Elasticsearch open-source API

Fully compatible with the open source Elasticsearch API, for all code and applications

Secure

Secure Elasticsearch clusters with AWS Identity and Access Management (IAM) policies, with fine-grained access control for users and endpoints

Automatically applies security patches without disruption, keeping Elasticsearch environments secure

Available

Provides high availability using Zone Awareness, which replicates data between two Availability Zones

Monitors the health of clusters and automatically replaces failed nodes, without service disruption

AWS Integrated

Integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion

AWS CloudTrail for auditing, AWS Identity and Access Management (IAM) for security, and AWS CloudFormation for cloud orchestration

Scalable

Scale clusters from a single node up to 20 nodes

Configure clusters to meet performance requirements by selecting from a range of instance types and storage options, including SSD-powered EBS volumes

Easy to use and scalable

AWS SDK

AWS CLI

AWS CloudFormation

Elastic Load Balancing

AWS IAM

Amazon CloudWatch

AWS CloudTrail

Open

• Drop-in replacement

• Zero-change, no-risk migration to or from open source Elasticsearch

Secure

• Control access based on originating IP or Principal

• Mix policies to provide application access and Kibana access

• Use IAM roles to provide access for other services
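
As a minimal sketch of the first two bullets, a resource-based access policy can mix an IP-restricted statement (for example, for Kibana users) with a principal-restricted statement (for an application role). The account ID, domain name, CIDR range, and role name below are placeholder assumptions, not values from the talk.

```python
import json

# Hypothetical placeholder: replace with your own account, region, and domain.
DOMAIN_ARN = "arn:aws:es:us-east-1:123456789012:domain/example-domain/*"


def ip_statement(cidr_blocks):
    """Allow unsigned requests only from the given source IP ranges."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": "es:*",
        "Resource": DOMAIN_ARN,
        "Condition": {"IpAddress": {"aws:SourceIp": list(cidr_blocks)}},
    }


def principal_statement(principal_arn):
    """Allow signed requests from one IAM principal (user or role)."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": principal_arn},
        "Action": "es:*",
        "Resource": DOMAIN_ARN,
    }


def build_access_policy(cidr_blocks, principal_arn):
    """Mix both statement kinds in one policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [ip_statement(cidr_blocks), principal_statement(principal_arn)],
    }


policy = build_access_policy(
    ["192.0.2.0/24"], "arn:aws:iam::123456789012:role/example-app-role"
)
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document that would be attached to the domain as its access policy; narrowing `es:*` to specific actions is a natural next step.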

Available

Diagram: an Amazon Elasticsearch Service cluster of four instances (Instance 1 to Instance 4) spread across Availability Zone 1 and Availability Zone 2, with copies of shards 1, 2, and 3 distributed among the instances.

AWS integrated

Diagram: data sources and ingestion paths feeding Amazon Elasticsearch Service. Labels: Logstash, REST, CWL Agent, EC2 Instances, Amazon ECS, Amazon Kinesis, Amazon Kinesis Firehose, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT.

Dedicated master nodes improve stability

Diagram: an Amazon ES cluster with dedicated master nodes alongside three data nodes (Instance 1 to Instance 3). The data nodes hold copies of shards 1, 2, and 3 and serve queries and updates.
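
A rough sketch of how zone awareness and dedicated master nodes might be expressed as a cluster configuration. The key names follow the `ElasticsearchClusterConfig` structure accepted by boto3's `create_elasticsearch_domain`; the helper itself, its defaults, and the even-node check are illustrative assumptions rather than anything shown in the talk.

```python
def cluster_config(data_nodes, instance_type, dedicated_masters=3, zone_awareness=True):
    """Build an ElasticsearchClusterConfig-style dict (hypothetical helper)."""
    if zone_awareness and data_nodes % 2 != 0:
        # Assumption: zone awareness splits data nodes evenly across two
        # Availability Zones, so an even node count is expected.
        raise ValueError("zone awareness expects an even number of data nodes")
    config = {
        "InstanceType": instance_type,
        "InstanceCount": data_nodes,
        "ZoneAwarenessEnabled": zone_awareness,
        "DedicatedMasterEnabled": dedicated_masters > 0,
    }
    if dedicated_masters > 0:
        # Dedicated masters only manage cluster state; data nodes serve
        # queries and updates.
        config["DedicatedMasterType"] = instance_type
        config["DedicatedMasterCount"] = dedicated_masters
    return config


config = cluster_config(4, "m3.medium.elasticsearch")
print(config)
```

The resulting dict could be passed as the `ElasticsearchClusterConfig` argument when creating a domain; instance sizes for masters and data nodes would normally be chosen separately.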

Firehose delivery architecture with transformations

Diagram labels: data source, source records, Firehose delivery stream, transformed records, Amazon Elasticsearch Service, intermediate Amazon S3 bucket, backup S3 bucket, transformation failure, delivery failure.

Repository Search

• File metadata and possibly file contents for traditional search

• Lambda to keep the repository current

• Good for up to ~60TB of metadata/source data (current limits)

See also: Indexing S3 Metadata blog post by Amit Sharma
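
One way the "Lambda to keep the repository current" idea could look, as a sketch only: a pure function that turns an S3 event notification into index or delete actions for a metadata index. The document field names and the `bucket/key` ID scheme are assumptions; sending the actions to the domain (for example through `_bulk`) is left out.

```python
from urllib.parse import unquote_plus


def s3_event_to_actions(event):
    """Turn an S3 event notification into (action, doc_id, document) tuples."""
    actions = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        doc_id = f"{bucket}/{key}"
        if record["eventName"].startswith("ObjectRemoved"):
            actions.append(("delete", doc_id, None))
        else:
            document = {
                "bucket": bucket,
                "key": key,
                "size": record["s3"]["object"].get("size"),
                "etag": record["s3"]["object"].get("eTag"),
                "last_event": record["eventTime"],
            }
            actions.append(("index", doc_id, document))
    return actions


# Hypothetical sample event with one created object.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "eventTime": "2017-02-15T12:00:00.000Z",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/q1+summary.txt", "size": 1024, "eTag": "abc123"},
            },
        }
    ]
}
actions = s3_event_to_actions(sample_event)
print(actions)
```

A Lambda handler subscribed to the bucket's notifications would call a function like this and forward the resulting actions to the search domain.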

Amazon Elasticsearch Service support for Elasticsearch 5

What to do with a terabyte of logs?

Visualize it with Kibana 5!

Scripting with Amazon Elasticsearch Service

Scripting is fully supported using the Painless language. With scripts you can:

• Change the precedence of search results

• Delete index fields by query

• Modify search results to return specific fields

• Alter elements in a field

Painless is explicitly designed for Elasticsearch and is both performant and secure.
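
Two of the bullets above, sketched as request bodies. The field names (`obsolete_field`, `title`, `popularity`) and the weight are made up for illustration; the script object uses the `inline` key, which is the form Elasticsearch 5.1 accepts.

```python
import json


def painless(source, params=None):
    """Wrap Painless source in an Elasticsearch 5.1 style script object."""
    script = {"lang": "painless", "inline": source}
    if params:
        script["params"] = params
    return script


# Delete a field from every matching document.
# Intended as the body of: POST <index>/_update_by_query
remove_field_body = {
    "query": {"exists": {"field": "obsolete_field"}},
    "script": painless("ctx._source.remove('obsolete_field')"),
}

# Change the precedence of search results by scaling the relevance score
# with a numeric field. Intended as the body of: POST <index>/_search
boost_body = {
    "query": {
        "function_score": {
            "query": {"match": {"title": "elasticsearch"}},
            "script_score": {
                "script": painless(
                    "_score * params.weight * doc['popularity'].value",
                    {"weight": 1.5},
                )
            },
        }
    }
}

print(json.dumps(remove_field_body, indent=2))
print(json.dumps(boost_body, indent=2))
```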

Ingest Pipelines and Processors

When you index documents, you can specify a pipeline. The pipeline can have a series of processors that pre-process the data before indexing. Twenty processors are available. Some are simple:

{ "append": { "field": "field1", "value": ["item2", "item3", "item4"] } }

Others are more complex, like the Grok processor for regex with aliased expressions.
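
To make the pipeline idea concrete, here is a small sketch that defines a pipeline with a Grok processor and an append processor, then indexes a document through it. The pipeline ID, index name, type name, and field names are assumptions chosen for the example; the requests are only built, not sent.

```python
import json

# Hypothetical pipeline: parse an Apache access log line, then tag the document.
pipeline = {
    "description": "Parse Apache log lines, then append tags",
    "processors": [
        {"grok": {"field": "message", "patterns": ["%{COMMONAPACHELOG}"]}},
        {"append": {"field": "tags", "value": ["apache", "ingested"]}},
    ],
}


def pipeline_requests(pipeline_id, index, doc_type, document):
    """Return (method, path, body) tuples: define the pipeline, then index through it."""
    return [
        ("PUT", f"/_ingest/pipeline/{pipeline_id}", pipeline),
        ("POST", f"/{index}/{doc_type}?pipeline={pipeline_id}", document),
    ]


log_line = '192.0.2.10 - - [15/Feb/2017:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043'
requests_to_send = pipeline_requests("apache-logs", "weblogs", "log", {"message": log_line})
for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body))
```

The `pipeline` query parameter on the indexing request is what routes the document through the processors before it is stored.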

Lots of New Elasticsearch APIs

/_alias
/_aliases
/_all
/_analyze
/_bulk
/_cache/clear (Index only)
/_cat
/_cluster/allocation/explain
/_cluster/health
/_cluster/pending_tasks
/_cluster/settings (PUT only): indices.breaker.fielddata.limit, indices.breaker.request.limit, indices.breaker.total.limit

/_cluster/state
/_cluster/stats
/_count
/_delete_by_query*
/_explain
/_field_stats
/_flush
/_forcemerge (Index only)
/_mapping
/_mget
/_msearch
/_mtermvectors
/_nodes
/_plugin/kibana
/_recovery (Index only)

/_refresh
/_reindex*
/_rollover
/_search
/_search profile
/_segments (Index only)
/_shard_stores
/_shrink
/_snapshot
/_stats
/_status
/_tasks
/_template
/_termvectors
/_update_by_query*
/_validate

Shrink and Rollover

Shrink an index to a single shard:

POST source_index/_shrink/target_index

Very useful for time-series indexes once ingestion is done!

Roll over an index based on number of documents:

POST logs_index/_rollover

{ "conditions": { "max_docs": 100000 } }
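
The two calls above can be sketched as request builders. The write-block step reflects the Elasticsearch requirement that a source index be read-only before shrinking (co-locating a copy of every shard on one node is another prerequisite that is not shown here). Index and alias names are placeholders, and nothing is sent over the network.

```python
import json


def shrink_requests(source_index, target_index):
    """(method, path, body) tuples to shrink an index to a single shard."""
    return [
        # Block writes first: shrinking needs a read-only source index.
        ("PUT", f"/{source_index}/_settings", {"settings": {"index.blocks.write": True}}),
        (
            "POST",
            f"/{source_index}/_shrink/{target_index}",
            {"settings": {"index.number_of_shards": 1}},
        ),
    ]


def rollover_request(alias, max_docs):
    """(method, path, body) tuple to roll over once the document count is reached."""
    return ("POST", f"/{alias}/_rollover", {"conditions": {"max_docs": max_docs}})


shrink = shrink_requests("source_index", "target_index")
rollover = rollover_request("logs_index", 100000)
for method, path, body in shrink + [rollover]:
    print(method, path, json.dumps(body))
```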

Supported Elasticsearch 5 Plugins

• Smart Chinese Analysis plugin

• Stempel Polish Analysis plugin

• Ingest Processor Attachment plugin

• Ingest Geoip Processor plugin

• Ingest User Agent Processor plugin

• Mapper Murmur3 plugin

中文 Polskie

Testing Ingest Performance

• Load generator: m4.large, single process, single thread

• Amazon Elasticsearch Service: 1 instance, 1 primary, no replicas, EBS gp2 storage

• Data: 1.8m Apache web log lines, comprising 196 MB

• _bulk API calls with 10K lines per call

• Monitoring data gathered from the load generator process and from the Amazon Elasticsearch Service domain
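
A load generator along these lines could batch log lines into `_bulk` request bodies. This is a sketch, not the tool used in the test: the index name, type name, and the single `message` field are assumptions, and the default batch size mirrors the 10K lines per call mentioned above.

```python
import json


def bulk_bodies(lines, index, doc_type, batch_size=10000):
    """Yield newline-delimited _bulk request bodies with batch_size documents each."""
    batch = []
    for line in lines:
        # Each document takes two lines: an action line, then the source line.
        batch.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        batch.append(json.dumps({"message": line.rstrip("\n")}))
        if len(batch) == 2 * batch_size:
            yield "\n".join(batch) + "\n"  # _bulk bodies must end with a newline
            batch = []
    if batch:
        yield "\n".join(batch) + "\n"


sample_lines = ["line one\n", "line two\n", "line three\n"]
bodies = list(bulk_bodies(sample_lines, "weblogs", "log", batch_size=2))
for body in bodies:
    print(body)
```

Each yielded string would be sent as the body of one `POST /_bulk` call.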

Ingest Performance Test Results

Amazon Elasticsearch Service with v2.3 Engine

Instance      Avg Index   Docs/sec
m3.medium     3.93 ms     2811
m3.2xlarge    11.83 ms    3966
r3.large      8.87 ms     3932
r3.8xlarge    10.58 ms    4404
i2.2xlarge    11.2 ms     5305

Amazon Elasticsearch Service with v5.1 Engine

Instance      Avg Index   Docs/sec
m3.medium     3.12 ms     3629
m3.2xlarge    11.1 ms     5816
r3.large      8.76 ms     7221
r3.8xlarge    9.59 ms     7726
i2.2xlarge    10.3 ms     9676

Up to 82% more documents per second!
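
The headline figure can be checked against the two tables with a few lines of arithmetic. The docs/sec values below are copied from the slide; the helper name is just for illustration.

```python
# (v2.3 docs/sec, v5.1 docs/sec) per instance type, as listed in the tables.
docs_per_sec = {
    "m3.medium": (2811, 3629),
    "m3.2xlarge": (3966, 5816),
    "r3.large": (3932, 7221),
    "r3.8xlarge": (4404, 7726),
    "i2.2xlarge": (5305, 9676),
}


def gain_percent(before, after):
    """Relative throughput improvement, in percent."""
    return (after - before) / before * 100


gains = {name: gain_percent(*pair) for name, pair in docs_per_sec.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% more documents per second")
```

With these numbers the gains range from roughly 29% on m3.medium to a little over 80% on r3.large and i2.2xlarge; the i2.2xlarge row is the one that works out to about 82%.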

Migrating from v2.3 to v5.1

The easy way:

1. Create a new Amazon Elasticsearch Service v5.1 cluster

2. Snapshot your v2.3 indexes

3. Restore the indexes to the v5.1 cluster

… but this won’t get most of the benefits of v5.1

There are many breaking changes in v5, documented at https://www.elastic.co/guide/en/elasticsearch/reference/5.1/breaking-changes.html
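
Steps 2 and 3 of the easy way might translate into snapshot API calls like the following. This is a sketch under assumptions: the repository name, snapshot name, bucket, region, and role ARN are placeholders, the S3 repository is assumed to be registered on both domains, and the requests are only built here (on Amazon Elasticsearch Service they would also need to be signed).

```python
import json


def migration_requests(repo, snapshot, bucket, region, role_arn, indices):
    """Group (method, path, body) tuples by the domain they would be sent to."""
    register = (
        "PUT",
        f"/_snapshot/{repo}",
        {"type": "s3", "settings": {"bucket": bucket, "region": region, "role_arn": role_arn}},
    )
    take = ("PUT", f"/_snapshot/{repo}/{snapshot}", {"indices": ",".join(indices)})
    restore = ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", {"indices": ",".join(indices)})
    return {
        "v2.3 source domain": [register, take],
        "v5.1 target domain": [register, restore],
    }


plan = migration_requests(
    repo="migration-repo",
    snapshot="pre-upgrade",
    bucket="example-snapshot-bucket",
    region="us-east-1",
    role_arn="arn:aws:iam::123456789012:role/example-snapshot-role",
    indices=["logs-2017.01", "logs-2017.02"],
)
for domain, requests_for_domain in plan.items():
    for method, path, body in requests_for_domain:
        print(domain, method, path, json.dumps(body))
```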

Three Things to Remember

• Amazon Elasticsearch Service is a drop-in replacement for new and existing Elasticsearch workloads

• Deploy, manage, and scale Elasticsearch more easily in the AWS cloud

• Support for Elasticsearch 5.1 brings scripting, additional plugins and additional performance to Amazon Elasticsearch Service

Find out more: https://aws.amazon.com/elasticsearch-service/

AWS Centralized Logging: https://aws.amazon.com/answers/logging/centralized-logging/

Elasticsearch at the AWS Database Blog: https://aws.amazon.com/blogs/database/category/elasticsearch/

Or ask your Solutions Architect!

Amazon Elasticsearch Service
