AWS re:Invent 2016: AWS Database State of the Union (DAT320)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Raju Gulabani, VP Database Services, AWS November 2016 DAT320 AWS Database Services State of the Union

Upload: amazon-web-services

Post on 16-Apr-2017


TRANSCRIPT

Page 1: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Raju Gulabani, VP Database Services, AWS

November 2016

DAT320

AWS Database Services State of the Union

Page 2: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

What to Expect from the Session

• Learn our strategy and overview of our key services

• Get a sense of our scale and key customers per service

• Understand when to use which services for your apps

Page 3: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Strategy

• Start from the customer and work backwards

• Offer managed services

• Leverage the cloud architecture

• Support migration of apps and data from/to on-premises

• Multiple services, each optimized for a different use case

Page 4: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Comprehensive Product Portfolio

Traditional Apps

Relational Databases

NoSQL & In-Memory

Big Data

RDS

Aurora

Database Migration Service

Relational Databases

DynamoDB

ElastiCache

NoSQL & In-Memory

Amazon Redshift

EMR

Data Pipeline

Athena

Big Data

QuickSight

Elasticsearch

Amazon ML

Analytics

Page 5: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Database Services Usage

• Amazon Aurora is the fastest growing service in AWS history

• More than 14,000 databases have been migrated using AWS Database Migration Service

• DynamoDB served over 56 billion extra requests worldwide on Prime Day compared to the same day the previous week.

Page 6: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Select DB Services Customers

Page 7: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Relational Databases

Amazon RDS

Amazon Aurora

Database Migration Service

Page 8: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon RDS

• Multi-engine support: Aurora, MySQL, MariaDB, PostgreSQL, Oracle, SQL Server

• Automated provisioning, patching, scaling, backup/restore, failover

• Use with GP2 or Provisioned IOPS storage

• High availability with RDS Multi-AZ

– 99.95% SLA for Multi-AZ deployments

Amazon Aurora
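To put the 99.95% Multi-AZ SLA figure in concrete terms, the short sketch below converts an availability percentage into a downtime budget. The 30-day period and the function name are assumptions made for illustration; the actual SLA terms define their own measurement period.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime budget, in minutes, that an availability target leaves over `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # 99.95% is the Multi-AZ SLA on the slide; 99.99% is shown only for comparison.
    for sla in (99.95, 99.99):
        print(f"{sla}% over 30 days: {allowed_downtime_minutes(sla):.1f} minutes")
```

Under these assumptions, 99.95% leaves roughly 21.6 minutes of downtime in a 30-day period.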

Page 9: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Key Insight: Relational Databases are Complex

• Our experience running Amazon.com taught us that

relational databases can be a pain to manage and

operate with high availability

• Poorly-managed relational databases are a leading

cause of lost sleep and downtime in the IT world!

Page 10: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

We Made Things Cheaper, Easier, and Better

• Lower TCO because we manage the muck

• Get more leverage from your teams

• Focus on the things that differentiate you

• Built-in high availability and cross region replication across multiple data centers

• Available on all engines, including base/standard editions, not just for enterprise editions

• Now even a small startup can leverage multiple data centers to design highly available apps with over 99.95% availability.

Page 11: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

High Availability Multi-AZ Deployments

Enterprise-grade fault tolerance solution for production databases

Automatic failover

Synchronous replication

Inexpensive & enabled with one click

Page 12: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon RDS Customers

Page 13: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Airbnb – Amazon RDS for MySQL

• Airbnb moved its main MySQL database to Amazon RDS with only 15 minutes of downtime

• RDS simplifies many of the time-consuming administrative tasks associated with databases so engineers can spend more time on features

• Uses asynchronous master-slave replication, launched via the RDS console or an API call, to improve website performance

• Leverages multi-Availability Zone (Multi-AZ) for high availability

Page 14: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Reinventing the Relational Database

Page 15: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Key Questions We Asked

• What if we started from a clean sheet of paper with the only constraint being that the database was a relational database?

• Could we offer much better performance by leveraging the massive scale of our cloud?

• Could we give you a database with designed durability indistinguishable from 100% and availability of 99.99%?

• …And could we be better and cheaper than the 30-year-old commercial databases in use today?

Page 16: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Yes, We Can. Answer = Amazon Aurora

• A new relational database engine, built from the ground up to leverage AWS

• For all new apps that require SQL, we recommend Amazon Aurora

• Commercial-grade performance and availability at open source prices

• Retains compatibility with MySQL 5.6

Page 17: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon RDS for Aurora

• MySQL compatible with up to 5x better performance on the

same hardware: 100,000 writes/sec & 500,000 reads/sec

• Scalable with up to 64 TB in single database, up to 15 read

replicas

• Highly available, durable, and fault-tolerant custom SSD

storage layer: 6-way replicated across 3 Availability Zones

• Transparent encryption for data at rest using AWS KMS

• Stored procedures in Amazon Aurora can invoke AWS

Lambda functions

Page 18: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Fastest growing service

in AWS history

Amazon Aurora Customers

Page 19: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Use case: Near real-time analytics and reporting

[Diagram: a master and three read replicas on a shared distributed storage volume, with a reader end-point in front of the replicas]

A customer in the travel industry migrated to Aurora for their core reporting application accessed by ~1,000 internal users.

Replicas can be created, deleted and scaled within minutes based on load.

Read-only queries are load balanced across replica fleet through a DNS endpoint – no application configuration needed when replicas are added or removed.

Low replication lag allows mining for fresh data with no delays, immediately after the data is loaded.

Significant performance gains for core analytics queries - some of the queries executing in 1/100th the original time.

► Up to 15 promotable read replicas

► Low replica lag – typically < 10ms

► Reader end-point with load balancing
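The reader end-point idea above can be sketched from the application side: the app keeps two fixed hostnames and sends read-only statements to the reader end-point, so adding or removing replicas needs no configuration change. The hostnames and the prefix-based classification below are hypothetical simplifications, not Aurora behavior; a real application would decide per transaction rather than per statement.

```python
# Hypothetical endpoint names; real ones come from your own Aurora cluster.
WRITER_ENDPOINT = "mycluster.cluster-example.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com"

# Simplifying assumption: a statement is read-only if it starts with one of these words.
READ_ONLY_PREFIXES = ("select", "show", "explain")


def endpoint_for(sql: str) -> str:
    """Pick the reader end-point for read-only statements, the writer otherwise."""
    stripped = sql.strip()
    first_word = stripped.split(None, 1)[0].lower() if stripped else ""
    return READER_ENDPOINT if first_word in READ_ONLY_PREFIXES else WRITER_ENDPOINT


if __name__ == "__main__":
    print(endpoint_for("SELECT count(*) FROM bookings"))
    print(endpoint_for("INSERT INTO bookings VALUES (1)"))
```

Because the replicas sit behind a single DNS name, the routing table in the application stays at two entries regardless of fleet size.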

Page 20: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Aurora is now PostgreSQL-compatible

• PostgreSQL 9.6 compatibility with support for PostGIS

• All the features you expect from Amazon Aurora

including 15 read replicas with <10ms lag, shared

storage, failover without data loss, 6-way replication

across 3 Availability Zones, encryption with AWS KMS

• Available now in preview

Page 21: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Performance Insights for Amazon RDS

Simplify monitoring from the AWS Management Console

Database load: Identifies database bottlenecks

Easy

Powerful

Identifies source of bottlenecks

Top SQL

Adjustable time frame: hour, day, week, and longer

Max CPU

Page 22: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

AWS Database Migration Service

• Fully managed service for migration from on-premises to

the AWS Cloud with minimal downtime

• Migrates data to and from all widely used commercial and

open source DBs

• Schema Conversion Tool that converts source DB schemas,

stored procedures and application code to a different target

format

• Supports homogeneous and heterogeneous data replication

• A terabyte-sized DB can be migrated for as little as $3

Page 23: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Database Conversion Capabilities in SCT

Source Database                  Target Database
Microsoft SQL Server             Amazon Aurora, MySQL, PostgreSQL
MySQL                            PostgreSQL
Oracle                           Amazon Aurora, MySQL, PostgreSQL
Oracle Data Warehouse            Amazon Redshift
PostgreSQL                       Amazon Aurora, MySQL
Teradata, Netezza, Greenplum     Amazon Redshift

Page 24: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

AWS Database Migration Service Customers

Page 25: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Heterogeneous Migration

• Oracle private DC to RDS PostgreSQL migration

• Used the AWS Schema Conversion Tool to convert their database schema

• Used on-going replication (CDC) to keep databases in sync until they reached the cutover window

• Benefits:

– Improved reliability of the cloud environment

– Savings on Oracle licensing costs

• SCT Assessment Report let them understand the scope of the migration
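The on-going replication (CDC) step can be illustrated with a toy model: after an initial copy, change events captured on the source are replayed in order on the target until cutover. This is only a conceptual sketch with made-up event shapes and names, not how AWS Database Migration Service is implemented.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]
# Hypothetical event shape: (operation, primary key, row data).
Change = Tuple[str, int, Row]


def apply_changes(target: Dict[int, Row], changes: Iterable[Change]) -> None:
    """Replay captured changes, in order, against an in-memory stand-in for the target table."""
    for operation, key, row in changes:
        if operation in ("insert", "update"):
            target[key] = dict(row)
        elif operation == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation}")


if __name__ == "__main__":
    target = {1: {"name": "alice"}}  # state after the initial full load
    captured = [
        ("insert", 2, {"name": "bob"}),
        ("update", 1, {"name": "alice b."}),
        ("delete", 2, {}),
    ]
    apply_changes(target, captured)
    print(target)
```

Once the target has caught up with the captured changes, the application can be switched over inside the cutover window.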

Page 26: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

NoSQL & In-Memory

DynamoDB

ElastiCache

Page 27: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Fast, Flexible, Scalable NoSQL

Amazon DynamoDB

Page 28: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

History of NoSQL at Amazon

Page 29: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Key Questions We Asked

• Aurora was designed with a single constraint

– SQL compatibility and relational database semantics

• What if we said no to this constraint?

– No to SQL = NoSQL

• Could we eliminate the things we didn’t like about relational databases?

Page 30: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Yes, We Can. Answer = Amazon DynamoDB

• Database that can scale beyond a single box without any changes to your app

• You can start small but know that there is no limit to how successful your app can be

• If your app is running fast today with 10 users, it will always run fast, even when you have 1M, 10M or 100M users using your app

• No need to spend time tuning queries and diagnosing why your app is running slow

• Deliver availability and durability indistinguishable from 100%.

• 99.99% and 60-second failover are not good enough

• You don’t have to manage anything. You don’t even need to know what a database instance is

• No schema. All you need to tell us is the number of reads/sec and writes/sec you want to execute. We do the rest
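As a rough illustration of "tell us the number of reads/sec and writes/sec", the sketch below turns a request rate and item size into provisioned capacity units. It assumes the documented unit sizes for provisioned throughput (one read unit covers a strongly consistent read of up to 4 KB per second, an eventually consistent read costs half, and one write unit covers a write of up to 1 KB per second); the function names are made up for this example.

```python
import math


def read_capacity_units(reads_per_sec: int, item_kb: float, strongly_consistent: bool = True) -> int:
    """Read units needed: each read costs one unit per 4 KB, halved for eventually consistent reads."""
    units_per_read = math.ceil(item_kb / 4)
    total = reads_per_sec * units_per_read
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(writes_per_sec: int, item_kb: float) -> int:
    """Write units needed: each write costs one unit per 1 KB."""
    return writes_per_sec * math.ceil(item_kb)


if __name__ == "__main__":
    print(read_capacity_units(100, 6))                             # 6 KB items, strongly consistent
    print(read_capacity_units(100, 6, strongly_consistent=False))  # same load, eventually consistent
    print(write_capacity_units(50, 2.5))
```

These two numbers are the "knobs" a table owner sets; partitioning and placement are left to the service.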

Page 31: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon DynamoDB Customers

Page 32: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Lyft Easily Scales Up its Ride Location Tracking System using DynamoDB

“It was so simple to scale out. We had two knobs. One was for reads and one was for writes.”

Chris Lambert, CTO, Lyft

• Lyft serves up to 8x more rides during peak times

• The GPS location for all rides was tracked in the ride location tracking system.

• In June 2014, Lyft deployed DynamoDB in production.

• Lyft has since moved many of its other data stores over to DynamoDB as well.

Page 33: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon ElastiCache

In-memory cache

Memcached or Redis

Fully managed; zero admin

Page 34: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Key ElastiCache Features

Memcached

• Fully managed

• Cache node auto-discovery

• Multi-AZ node placement

Redis

• Fully managed

• Persistence

• Read replicas

• Multi-AZ with auto-failover

• Redis cluster

Page 35: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Gaming AdTech Media Mobile Other

Amazon ElastiCache Customers

Page 36: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

RDS and ElastiCache are Behind Grab’s Taxi-Booking App

“The latency of a cab call must be low, and remain low even in times of peak traffic of hundreds of thousands of cab requests per minute. We use ElastiCache for Redis in front of RDS MySQL to keep our systems’ real time performance at any scale.”

Ryan Ooi, Sr. Devops Engineer, Grab

• Grab is a popular taxi-hailing app in Southeast Asia.

• Average response time of the API layer is <40ms, mandating an in-memory layer to achieve such performance.

• A small devops team tried running Redis on EC2 before, but that was too much work. Using both RDS and ElastiCache in Multi-AZ allowed them to outsource all the management to AWS.
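Putting a cache "in front of" a relational database is commonly done with a cache-aside read path, sketched below. The slide does not say which caching pattern Grab uses, so this is an assumed pattern; a plain dictionary stands in for Redis and a callable stands in for the MySQL query, and all names are hypothetical.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Read-through-on-miss cache in front of a slower data source."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]):
        self._cache: Dict[str, str] = {}     # stand-in for a Redis client
        self._load_from_db = load_from_db    # stand-in for a MySQL lookup
        self.db_reads = 0                    # counts how often the database was hit

    def get(self, key: str) -> Optional[str]:
        # Serve from memory when possible; fall back to the database on a miss.
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value
        return value

    def invalidate(self, key: str) -> None:
        # Call after writing to the database so the next read reloads fresh data.
        self._cache.pop(key, None)


if __name__ == "__main__":
    database = {"driver:42": "available"}
    store = CacheAside(database.get)
    store.get("driver:42")
    store.get("driver:42")
    print(store.db_reads)
```

Repeated reads of a hot key are answered from memory, which is what keeps latency flat when request volume spikes.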

Page 37: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Big Data

Amazon Redshift

Amazon EMR

Amazon Athena

Data Pipeline

Page 38: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Redshift

• Petabyte-scale, relational, MPP, data

warehousing

• Fully managed with SSD and HDD platforms

• Built-in end-to-end security, including

customer-managed keys

• $1,000/TB/year; start at $0.25/hour

Page 39: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Why we built Amazon Redshift

• Customers were generating data in the cloud but moving

it on-premises to analyze it using a data warehouse

• Customers had migrated everything to AWS except their on-premises data warehouses.

– They wanted to shut down these data centers but could not until we offered them a solution in the cloud

Page 40: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Key Insight: Most Data Falls on the Floor

[Chart: generated data vs. data available for analysis, 1990 to 2020]

90% of the data in a company is never analyzed

High costs and complexity of traditional DW systems make it hard to justify the capital expense

Sources:

Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011

IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares

Page 41: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Key Questions We Asked

• Could we design a system cheap and scalable enough

to let you analyze all your data?

• Could we build a service that was faster, cheaper, and

easier to use than traditional DW systems?

Page 42: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Yes, We Can. Answer = Amazon Redshift

• A massively parallel processing (MPP) system with up to 128 compute nodes to store and process up to 2PB of compressed data

• At $1,000/TB/year, it's so cheap that you can analyze all your data

• You can provision a petabyte in under three minutes and pay for it by the hour

• 10x performance and 1/10 the price of other solutions

• Fully managed with automated provisioning, patching, securing, backup, restore, and built-in fault tolerance
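The slide's figures lend themselves to some quick arithmetic, sketched below. The $1,000/TB/year price and the 128-node, 2PB ceiling are taken from the slide at face value; reading 2PB as 2 × 1024 TB is an assumption, and the function names are illustrative.

```python
# Figures quoted on the slide, taken at face value.
PRICE_PER_TB_YEAR_USD = 1_000
MAX_COMPUTE_NODES = 128
# Assumption: 2 PB is read as 2 * 1024 TB.
MAX_COMPRESSED_TB = 2 * 1024


def yearly_cost_usd(terabytes: float) -> float:
    """Yearly cost of keeping `terabytes` of data at the quoted per-TB price."""
    return terabytes * PRICE_PER_TB_YEAR_USD


def storage_per_node_tb() -> float:
    """Compressed storage each node holds when the largest cluster is full."""
    return MAX_COMPRESSED_TB / MAX_COMPUTE_NODES


if __name__ == "__main__":
    print(f"10 TB for a year: ${yearly_cost_usd(10):,.0f}")
    print(f"Per-node share at full scale: {storage_per_node_tb():.0f} TB")
```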

Page 43: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Redshift Customers

Page 44: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

NTT Docomo: Japan’s largest mobile provider

• 68 million customers

• 10s of TBs per day of data across mobile network

• 6PB of total data (uncompressed)

• Data science for marketing operations, logistics etc.

• Greenplum on-premises

• 125 node DS2.8XL cluster

• 4,500 vCPUs, 30TB RAM

• 6 PB uncompressed data

• 10x faster analytic queries

• 50% reduction in time for new BI app. deployment

• Significantly less ops. overhead

Page 45: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon EMR

• Hadoop, Hive, Presto, Spark, Tez, Impala etc.

• Release 5.2: Hadoop 2.7.3, Hive 2.1, Spark 2.0.2, Zeppelin, Presto, HBase 1.2.3 and HBase on S3, Phoenix, Tez, Flink

• New applications added within 30 days of their open source release

• Fully managed, automatically scaling clusters with support for On-Demand and Spot pricing

• Support for HDFS and S3 filesystems enabling separated compute and storage; multiple clusters can run against the same data in S3

• HIPAA-eligible. Support for end-to-end encryption, IAM/VPC, S3 client-side encryption with customer managed keys and AWS KMS

Page 46: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Why we built Amazon EMR

• Customers wanted to use the latest open source analytic frameworks to analyze and transform their data

• Customers wanted to use technologies like Spark and Presto in conjunction with AWS services like Amazon S3 and features like EC2 Spot Instances

• Customers wanted to benefit from the elasticity that AWS offers

Page 47: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon EMR Customers

Page 48: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Athena

• Serverless query service for querying data in S3 with no

infrastructure to manage.

• No data loading required; query directly from Amazon S3

• Use standard ANSI SQL queries with support for joins,

JSON, and window functions.

• Support for multiple data formats including text, CSV, TSV, JSON, Avro, ORC, Parquet

• Pay per query only when you’re running queries; $5/TB

scanned; if you compress your data, your queries cost less
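The pay-per-query pricing can be made concrete with a short calculation. The $5/TB rate is from the slide; treating 1 TB as 10**12 bytes and the 4:1 compression ratio in the example are assumptions for illustration, and any per-query minimums are ignored.

```python
PRICE_PER_TB_SCANNED_USD = 5.0   # rate quoted on the slide
BYTES_PER_TB = 10**12            # assumption: decimal terabytes


def query_cost_usd(bytes_scanned: float) -> float:
    """Cost of one query that scans `bytes_scanned` bytes of data in S3."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_SCANNED_USD


if __name__ == "__main__":
    raw_bytes = 10**12                 # a query that scans 1 TB of uncompressed data
    compression_ratio = 4              # hypothetical ratio, depends on data and format
    print(f"uncompressed: ${query_cost_usd(raw_bytes):.2f}")
    print(f"compressed:   ${query_cost_usd(raw_bytes / compression_ratio):.2f}")
```

Since the charge follows bytes scanned, anything that shrinks the scan (compression here) lowers the cost of the same query.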

Page 49: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Why we built Amazon Athena

• Customers wanted an easy way to run ad-hoc queries

on data in Amazon S3 with no infrastructure to manage

• Customers wanted a service that could complement their

use of Amazon Redshift and Amazon EMR

• Customers wanted to give this capability to anyone in

their company and only pay per query

Page 50: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Athena Customers

Page 51: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Analytics

QuickSight

Amazon ES

Amazon ML

Page 52: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon QuickSight

Fast, easy to use business analytics service at 1/10th the cost of traditional BI solutions.

As a native cloud service, QuickSight combines the speed, scalability, and ease of deployment that our customers have come to depend on with the value and cost effectiveness you expect from AWS.

Page 53: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon QuickSight

• Auto-Discover AWS data sources like Amazon Redshift, RDS, and S3

• Connect to third-party sources like Excel, Salesforce, and other

hosted/on-premises databases

• Super-fast performance with SPICE

• Instant visualizations with Autograph

• Securely share and collaborate on analyses, dashboards and stories

• Native iPhone experience and web based access from all other devices

• Governed datasets

• User access controls

• Active Directory Integration

Page 54: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

QuickSight providing real-time insights at MLB Advanced Media

QuickSight provides us with a real-time, 360 degree view of our business without being constrained by pre-built dashboards and metrics expanding our use of data to make informed decisions.

Brandon Sangiovanni, Sr. BI Development Manager

Page 55: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Elasticsearch Service

Distributed search and analytics engine

Managed service using Elasticsearch and Kibana

Fully managed; zero admin

Highly available and reliable

Tightly integrated with other AWS services

Page 56: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Amazon Elasticsearch Service Leading Use-Cases

Log Analytics & Operational Monitoring

• Monitor the performance of your application, web servers, and hardware

• Easy to use, yet powerful data visualization tools to detect issues in near real-time

• Ability to dig into your logs in an intuitive, fine-grained way

• Kibana provides fast, easy visualization

Traditional Search

• Application or website provides search capabilities over diverse documents

• Tasked with making this knowledge base searchable and accessible

• Key search features including text matching, faceting, filtering, fuzzy search, auto complete, and highlighting

• Query API to support application search

Page 57: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Media and Entertainment

Online Services

Technology Other

Amazon Elasticsearch Customers

Page 58: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Case Study: Adobe Developer Platform (Adobe I/O)

Over 200,000 API calls per second peak

– destinations, response times, bandwidth

Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using AES Kibana

Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges

[Diagram: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service]

Page 59: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Which Service Should You Use?

Situation                                           Solution
Existing application                                Use your existing engine on RDS
  • MySQL                                           Amazon Aurora, RDS for MySQL
  • PostgreSQL                                      RDS for PostgreSQL
  • Oracle, SQL Server                              RDS for Oracle, RDS for SQL Server
New application
  • If you can avoid relational features            DynamoDB
  • If you need relational features                 Amazon Aurora
Data Warehouse & BI                                 Amazon Redshift and Amazon QuickSight
Ad hoc analysis of data in S3                       Amazon Athena and Amazon QuickSight
Spark, Hadoop, Hive, HBase                          Amazon EMR
Log analytics, operational monitoring and search    Amazon Elasticsearch Service
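The guidance on this slide is essentially a lookup table, which the sketch below encodes directly. The service names are transcribed from the slide; the dictionary keys and function names are hypothetical labels chosen for this example.

```python
# Lookup tables transcribed from the slide; the keys are hypothetical labels.
EXISTING_APP_TARGETS = {
    "mysql": ["Amazon Aurora", "RDS for MySQL"],
    "postgresql": ["RDS for PostgreSQL"],
    "oracle": ["RDS for Oracle"],
    "sql server": ["RDS for SQL Server"],
}

WORKLOAD_TARGETS = {
    "data warehouse & bi": ["Amazon Redshift", "Amazon QuickSight"],
    "ad hoc analysis of data in s3": ["Amazon Athena", "Amazon QuickSight"],
    "spark, hadoop, hive, hbase": ["Amazon EMR"],
    "log analytics, operational monitoring and search": ["Amazon Elasticsearch Service"],
}


def for_existing_app(engine: str) -> list:
    """Services suggested for an application already running on `engine`."""
    return EXISTING_APP_TARGETS[engine.strip().lower()]


def for_new_app(needs_relational_features: bool) -> list:
    """Services suggested for a new application."""
    return ["Amazon Aurora"] if needs_relational_features else ["DynamoDB"]


def for_workload(workload: str) -> list:
    """Services suggested for an analytics-style workload named as on the slide."""
    return WORKLOAD_TARGETS[workload.strip().lower()]


if __name__ == "__main__":
    print(for_existing_app("MySQL"))
    print(for_new_app(needs_relational_features=False))
    print(for_workload("Data Warehouse & BI"))
```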

Page 60: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Thank you!

Page 61: AWS re:Invent 2016: AWS Database State of the Union (DAT320)

Remember to complete your evaluations!