© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Ganesh Raja
Specialist Solutions Architect – Data & Analytics
Launching your Big Data Project on AWS
Traditionally, Analytics Used to Look Like This
Sources: OLTP, ERP, CRM, LOB
Data Warehouse
Business Intelligence
• Relational data
• TBs–PBs scale
• Schema defined prior to data load
• Operational reporting and ad hoc
• Large initial CAPEX + $10K–$50K/TB/Year
Data Lakes Extend the Traditional Approach
Sources: OLTP, ERP, CRM, LOB; Devices, Web, Sensors, Social
Data Lake
Data Warehouse and Business Intelligence
Big Data processing, real-time, Machine Learning
• Relational and non-relational data
• TBs–EBs scale
• Diverse analytical engines
• Low-cost storage & analytics
Data Lakes and Analytics from AWS
Cost-effective
Scalable and durable
Secure
Open and comprehensive
Analytics
Machine Learning
Real-time Data Movement
On-premises Data Movement
Data Lake on AWS: Amazon S3, Amazon Glacier, AWS Glue
Store Data in the Format You Want
Open and comprehensive
• Text files like CSV
• Columnar formats like Apache Parquet and Apache ORC
• Logstash-style patterns like Grok
• JSON (simple, nested), Avro
• And more…
AWS Provides the Highest Levels of Security
Secure
Compliance: AWS Artifact, Amazon Inspector, AWS CloudHSM, Amazon Cognito, AWS CloudTrail
Security: Amazon GuardDuty, AWS Shield, AWS WAF, Amazon Macie, VPC
Encryption: AWS Certificate Manager, AWS Key Management Service, encryption at rest, encryption in transit, bring your own keys, HSM support
Identity: AWS IAM, AWS SSO, Amazon Cloud Directory, AWS Directory Service, AWS Organizations
Customers need multiple levels of security, identity and access management, encryption, and compliance to secure their data lake.
Security: Machine Learning-Powered Security
Secure
Amazon Macie
• Machine learning to discover, classify, and protect data
• Continuously monitors data access for anomalies
• Generates alerts when it detects unauthorized access
• Recognizes PII or intellectual property
Encryption: Data-at-Rest and in Motion
Secure
• Only cloud that offers three forms of encryption
  • Server-side encryption
  • Encryption with keys managed by the AWS Key Management Service
  • Encryption with keys that customers manage
• Only cloud that encrypts data in transit when replicating across regions
• Data movement services can use the same Key Management Service
• SSL endpoints
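As a minimal sketch, the three forms of encryption listed above can be read as three sets of parameters on an S3 PutObject request. The helper name `put_object_encryption_args` and the mode labels are hypothetical; the parameter names (`ServerSideEncryption`, `SSEKMSKeyId`, `SSECustomerAlgorithm`, `SSECustomerKey`) are the ones the boto3 `put_object` call accepts. Mapping the slide's three bullets onto these three modes is an assumption, not something the slide states.

```python
def put_object_encryption_args(mode, kms_key_id=None, customer_key=None):
    """Return extra PutObject keyword arguments for one encryption mode.

    mode is one of (labels are illustrative, not AWS terms from the slide):
      "sse-s3"  - server-side encryption with S3-managed keys
      "sse-kms" - keys managed by the AWS Key Management Service
      "sse-c"   - keys that the customer supplies and manages
    """
    if mode == "sse-s3":
        return {"ServerSideEncryption": "AES256"}
    if mode == "sse-kms":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            # Optional: a specific KMS key instead of the account default.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "sse-c":
        if customer_key is None:
            raise ValueError("sse-c requires a customer-provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")
```

The returned dictionary would be merged into the keyword arguments of an upload call, alongside the bucket, key, and body.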
Compliance: Virtually Every Regulatory Agency
Global
• CSA: Cloud Security Alliance Controls
• ISO 9001: Global Quality Standard
• ISO 27001: Security Management Controls
• ISO 27017: Cloud Specific Controls
• ISO 27018: Personal Data Protection
• PCI DSS Level 1: Payment Card Standards
• SOC 1: Audit Controls Report
• SOC 2: Security, Availability, & Confidentiality Report
• SOC 3: General Controls Report
United States
• CJIS: Criminal Justice Information Services
• DoD SRG: DoD Data Processing
• FedRAMP: Government Data Standards
• FERPA: Educational Privacy Act
• FIPS: Government Security Standards
• FISMA: Federal Information Security Management
• GxP: Quality Guidelines and Regulations
• FFIEC: Financial Institutions Regulation
• HIPAA: Protected Health Information
• ITAR: International Arms Regulations
• MPAA: Protected Media Content
• NIST: National Institute of Standards and Technology
• SEC Rule 17a-4(f): Financial Data Standards
• VPAT/Section 508: Accountability Standards
Asia Pacific
• FISC [Japan]: Financial Industry Information Systems
• IRAP [Australia]: Australian Security Standards
• K-ISMS [Korea]: Korean Information Security
• MTCS Tier 3 [Singapore]: Multi-Tier Cloud Security Standard
• My Number Act [Japan]: Personal Information Protection
Europe
• C5 [Germany]: Operational Security Attestation
• Cyber Essentials Plus [UK]: Cyber Threat Protection
• G-Cloud [UK]: UK Government Standards
• IT-Grundschutz [Germany]: Baseline Protection Methodology
Any Scale
Scalable and durable
• S3 has trillions of objects and exabytes of data
• Built to store any amount of data
• Run analytic engines at the largest scale by spinning up any amount of compute resources in minutes
• Runs on the world’s largest global cloud infrastructure
Unmatched Durability and Availability
Scalable and durable
• Designed to deliver 99.999999999% durability
• Geographic redundancy & automatic replication
• Stores data in multiple data centers across 3 AZs in a single region
• Seamlessly replicates data between any two regions
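To make the eleven-nines figure concrete, here is a small worked calculation. It assumes the durability figure is read as the probability that a single object survives one year, and the function names are illustrative. Exact fractions are used so the arithmetic is not blurred by floating-point rounding.

```python
from fractions import Fraction

# 99.999999999% durability, read as an annual per-object survival probability.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_loss(num_objects):
    """Expected number of objects lost per year at the design durability."""
    return num_objects * (1 - DURABILITY)


def years_per_lost_object(num_objects):
    """Average number of years between single-object losses."""
    return 1 / expected_annual_loss(num_objects)
```

Under that reading, `years_per_lost_object(10_000_000)` evaluates to 10000: with ten million stored objects, the expected loss is one object every ten thousand years.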
Tiered Storage to Optimize Price/Performance
Lowest Cost
• Tiered storage to optimize price/performance
  • S3 Standard
  • S3 Standard-Infrequent Access
  • S3 One Zone-Infrequent Access
  • Amazon Glacier
• Migrate between tiers based on lifecycle policies
• Store data at $0.023/GB/month with S3
• Store data at $0.004/GB/month with Glacier
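The effect of a lifecycle policy on the storage bill can be sketched with the two prices quoted on the slide. This is a rough model only: the 90-day archive threshold and the function names are assumptions, the Infrequent Access tiers are left out because the slide gives no price for them, and request, retrieval, and transition charges are ignored.

```python
from decimal import Decimal

# Per-GB monthly storage prices as quoted on the slide (2018).
PRICE_PER_GB_MONTH = {
    "S3 Standard": Decimal("0.023"),
    "Glacier": Decimal("0.004"),
}


def tier_for_age(age_days, archive_after_days=90):
    """Pick a tier the way a simple age-based lifecycle rule would."""
    return "Glacier" if age_days >= archive_after_days else "S3 Standard"


def monthly_cost(objects, archive_after_days=90):
    """Sum the monthly storage cost of (size_gb, age_days) pairs."""
    total = Decimal("0")
    for size_gb, age_days in objects:
        tier = tier_for_age(age_days, archive_after_days)
        total += Decimal(size_gb) * PRICE_PER_GB_MONTH[tier]
    return total
```

For example, 1,000 GB of recent data plus 9,000 GB older than the threshold costs 59 dollars a month in this model, against 230 dollars if all 10,000 GB stayed in S3 Standard.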
Data Lakes, Analytics, and ML Portfolio from AWS
Broadest, deepest set of analytic services
Machine Learning: Amazon SageMaker, AWS Deep Learning AMIs, Amazon Rekognition, Amazon Lex, AWS DeepLens, Amazon Comprehend, Amazon Translate, Amazon Transcribe, Amazon Polly
Analytics: Amazon Athena, Amazon EMR, Amazon Redshift, Amazon Elasticsearch Service, Amazon Kinesis, Amazon QuickSight
On-premises Data Movement: AWS Direct Connect, AWS Snowball, AWS Snowmobile, AWS Database Migration Service, AWS Storage Gateway
Real-time Data Movement: AWS IoT Core, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams
Data Lake on AWS: Storage | Archival Storage | Data Catalog
Data Sources
Files | Logs | Streams | Databases
Data Sources - Databases
Databases → Amazon S3
Change Data Capture – Database Logs
LOG_FILE_HDR_SIZE
OS_FILE_LOG_BLOCK_SIZE
FORMAT
CHECKSUM
LOG_CHECKPOINT_1
LOG_CHECKPOINT_2
Checkpoint_lsn
Checkpoint_no
Log.buf_size
LOG BLOCK
LOG_BLOCK_HDR_SIZE
Hdr_no
Flush_bit
Data_len
[…]
???
Tx001.log
Database Migration Service
(Also good for ingestion!)
AWS Database Migration Service (DMS) lets you easily and securely migrate and/or replicate your databases and data warehouses to AWS.
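The idea behind replication through change data capture can be illustrated with a few lines of code: an ordered list of change events is replayed against a snapshot of the table. This is a conceptual sketch only, not how DMS is implemented; the event shape (`op`, `key`, `row`) and the function name are hypothetical.

```python
def apply_changes(table, changes):
    """Replay ordered change events onto a {primary_key: row} snapshot."""
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            # Merge the changed columns into the existing row, if any.
            table[key] = {**table.get(key, {}), **change["row"]}
        elif op == "delete":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return table
```

Because the events are applied in log order, the target converges on the same state as the source without re-reading the whole table.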
DMS – Supported Data Sources
On-Premises or EC2 | Amazon RDS | On Azure
Oracle * | Oracle * | Azure SQL Database (no CDC)
MS SQL Server * | MS SQL Server * |
MySQL (5.5+) | MySQL (CDC on 5.6+) |
MariaDB | MariaDB |
PostgreSQL (9.4+) | PostgreSQL (CDC on 9.4.9+, 9.5.4+) |
SAP Adaptive Server Enterprise | Amazon Aurora – MySQL |
MongoDB (2.6.x, 3.x+) | |
Db2 LUW | |
DMS – Deployment
Availability Zone (VPC subnet): Replication Master
Availability Zone (VPC subnet): Replication Slave
Amazon S3
Data Sources - Files
Files → Amazon S3
Uploading to Amazon S3
• Amazon S3 supports both a single-part upload and a multi-part upload API
• The single-part upload supports objects up to 5 GB in size
• The multi-part upload supports objects up to 5 TB in size
• The multi-part upload also enables you to maximize your throughput by using parallel threads
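The size limits above translate into a simple planning step before an upload. The sketch below is illustrative: `plan_upload`, the 64 MiB default part size, and the 100 MiB multipart threshold are assumptions, and "GB"/"TB" are treated as binary units. The 5 MiB minimum part size and the 10,000-part ceiling come from the S3 documentation rather than from the slide.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

SINGLE_PART_LIMIT = 5 * GIB        # slide: single-part upload up to 5 GB
MAX_OBJECT_SIZE = 5 * 1024 * GIB   # slide: multi-part upload up to 5 TB
MIN_PART_SIZE = 5 * MIB            # S3 documentation, not on the slide
MAX_PARTS = 10_000                 # S3 documentation, not on the slide


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def plan_upload(size_bytes, part_size=64 * MIB, multipart_threshold=100 * MIB):
    """Choose single-part or multi-part and size the parts."""
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TB limit")
    # Never let the threshold exceed what a single PUT can carry.
    threshold = min(multipart_threshold, SINGLE_PART_LIMIT)
    if size_bytes <= threshold:
        return {"method": "single-part", "parts": 1, "part_size": size_bytes}
    # Grow the part size if needed so the part count stays within MAX_PARTS.
    part_size = max(part_size, MIN_PART_SIZE, ceil_div(size_bytes, MAX_PARTS))
    return {
        "method": "multi-part",
        "parts": ceil_div(size_bytes, part_size),
        "part_size": part_size,
    }
```

Each part in the resulting plan can be handed to a separate thread, which is where the throughput gain mentioned in the last bullet comes from.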
S3 Transfer Acceleration
• PUT requests go through the nearest AWS edge location
• Data transits over the AWS private network rather than the public internet
• The AWS private network optimizes throughput and latency to the AWS Region
• Data is not stored in the edge cache
Uploader → AWS edge location → S3 bucket
Data Sources - Streams
Streams → Amazon S3
Amazon Kinesis - Stream Processing on AWS
Kinesis Streams
• Capture streaming data for downstream processing
• Allow multiple processors to read streams at their own rate
Kinesis Firehose
• Buffer records in a stream into a single output for more efficient storage
• Automatic flushing of buffer to S3, Elasticsearch, Redshift, or Splunk
Kinesis Analytics
• Create time windows over streams and perform aggregate operations using SQL
• Join together multiple streams and output to new streams
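A time-window aggregate of the kind described for Kinesis Analytics can be sketched in plain Python. This is the logic of a fixed (tumbling) window count grouped by key, not Kinesis Analytics SQL; the function name and the `(timestamp, key)` event shape are assumptions made for the example.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) over fixed, non-overlapping windows.

    events is an iterable of (timestamp_seconds, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

With 60-second windows, events at seconds 0, 30, and 59 fall into the window starting at 0, and an event at second 60 opens the next window.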
Kinesis – How it works
• Millions of sources producing hundreds of terabytes per hour
• Front end: authenticate and authorize
• Durable, highly consistent storage replicates data across three AWS Availability Zones
• Ordered stream of events supports multiple readers:
  • Aggregate and archive to S3
  • Real-time dashboards and alarms
  • Machine learning algorithms or sliding window analytics
  • Aggregate analysis in Hadoop or a data warehouse
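The "ordered stream with multiple readers" idea can be shown with a toy in-memory model. It is a conceptual sketch, not the Kinesis API: `OrderedStream` and `Reader` are hypothetical classes, and a list index stands in for a sequence number. Each reader keeps its own position, so a slow consumer does not hold back a fast one.

```python
class OrderedStream:
    """Append-only log; a record's index acts as its sequence number."""

    def __init__(self):
        self._records = []

    def put(self, record):
        self._records.append(record)
        return len(self._records) - 1

    def read(self, position, limit=10):
        """Return up to `limit` records from `position` and the next position."""
        batch = self._records[position:position + limit]
        return batch, position + len(batch)


class Reader:
    """A consumer that tracks its own position in the stream."""

    def __init__(self, stream):
        self.stream = stream
        self.position = 0

    def poll(self, limit=10):
        batch, self.position = self.stream.read(self.position, limit)
        return batch
```

An archiving reader and a dashboard reader can both be attached to the same stream and will each see every record, in order, at their own pace.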
Data Sources - Logs
Logs → Amazon S3
Logs
Collecting and Analyzing
• CloudWatch
• Amazon Kinesis
• Other Options
Logs – CloudWatch Agent
EC2 Instances → CloudWatch Log Stream → AWS Lambda → Amazon S3
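The Lambda step in this pipeline mostly unpacks what CloudWatch Logs delivers. A subscription event carries a base64-encoded, gzip-compressed JSON document under `awslogs.data`, with the log lines in its `logEvents` list. The sketch below decodes that payload and renders it as JSON lines; the function names and output field names are illustrative, and the final write to S3 is left out.

```python
import base64
import gzip
import json


def decode_log_events(event):
    """Unpack a CloudWatch Logs subscription event into flat records."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    return [
        {
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
            "timestamp": log_event["timestamp"],
            "message": log_event["message"],
        }
        for log_event in payload["logEvents"]
    ]


def to_json_lines(records):
    """Serialize records one JSON object per line, ready to store as an object."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records) + "\n"
```

A Lambda handler would call `decode_log_events(event)` and then upload the `to_json_lines` output under a log prefix in the data lake bucket.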
Summary - Ingestion
s3://datalake/
  /vendorfeeds
    /vendorA
    /vendorB
  /clickstream
    /orders
    /vendors
    /customers
  /app_logs
    /instance1
    /instance2
  /syslogs
    /instance1
    /instance2
  /databases
    /customers
    /orders
    /vendors
Sources: Files, Streams, Logs, Databases
Ingestion into Amazon S3 via: File Gateway, API Gateway, Kinesis Agent, DMS, Kinesis Firehose
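A prefix layout like the one on this slide is easiest to keep consistent when object keys are built by one helper. The sketch below is an assumption-heavy illustration: `object_key` is a hypothetical function, the top-level prefixes are taken from the slide, and the optional `dt=YYYY-MM-DD` date partition is an addition that the slide does not show.

```python
from datetime import date

# Top-level prefixes as they appear under s3://datalake/ on the slide.
TOP_LEVEL_PREFIXES = {"vendorfeeds", "clickstream", "app_logs", "syslogs", "databases"}


def object_key(category, source, filename, day=None):
    """Build an object key of the form category/source[/dt=YYYY-MM-DD]/filename."""
    if category not in TOP_LEVEL_PREFIXES:
        raise ValueError(f"unknown top-level prefix: {category}")
    parts = [category, source]
    if day is not None:
        # Hypothetical date partition, not part of the slide's layout.
        parts.append(f"dt={day.isoformat()}")
    parts.append(filename)
    return "/".join(parts)
```

Rejecting unknown prefixes at write time keeps ad hoc folders from creeping into the bucket.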
Amazon S3—The Data Lake
Security and Compliance
Three different forms of encryption; encrypts data in transit when replicating across regions; log and monitor with CloudTrail; use ML to discover and protect sensitive data with Macie
Flexible Management
Classify, report, and visualize data usage trends; objects can be tagged to see storage consumption, cost, and security; build lifecycle policies to automate tiering and retention
Durability, Availability & Scalability
Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; can be automatically replicated to any other AWS region
Query in Place
Run analytics & ML on the data lake without data movement; S3 Select can retrieve a subset of data, improving analytics performance by up to 400%
Amazon Glacier—Backup and Archive
Durability, Availability & Scalability
Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; can be automatically replicated to any other AWS region
Secure
Log and monitor with CloudTrail; Vault Lock enables WORM storage capabilities, helping satisfy compliance requirements
Retrieves data in minutes
Three retrieval options to fit your use case; expedited retrievals with Glacier Select can return data in minutes
Inexpensive
Lowest-cost AWS object storage class, allowing you to archive large amounts of data at a very low cost
Storing is Not Enough, Data Needs to Be Discoverable
“Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).”
Gartner IT Glossary, 2018
https://www.gartner.com/it-glossary/dark-data
Data sources shown: CRM, ERP, Data warehouse, Mainframe data, Web, Social, Log files, Machine data
Data types shown: Semi-structured, Unstructured
AWS Glue: Data Catalog
Make data discoverable
• Automatically discovers data and stores schema
• Catalog makes data searchable and available for ETL
• Catalog contains table and job definitions
• Computes statistics to make queries efficient
Glue Data Catalog: discover data and extract schema
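What "discover data and extract schema" means in practice can be sketched with a toy schema inference over a CSV sample. This is only a conceptual illustration of what a crawler does, not Glue's classifier logic; the function names are hypothetical and the type names (`bigint`, `double`, `string`) are chosen to resemble Hive-style column types.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type name that every sampled value fits."""
    for caster, type_name in ((int, "bigint"), (float, "double")):
        try:
            for value in values:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text):
    """Infer {column: type} from a CSV sample that has a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    return {column: infer_type([row[column] for row in rows]) for column in rows[0]}
```

A catalog entry is essentially this mapping plus the data location and format, stored so that query engines do not have to rediscover it.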
Glue: Data Catalog – Queryable by Many Services
Services that query the Glue Data Catalog: Glue ETL, Amazon Athena, Redshift Spectrum, EMR (Hadoop/Spark)
AWS Glue: ETL Service
Make ETL scripting and deployment easy
• Automatically generates ETL code
• Code is customizable with Python and Spark
• Endpoints provided to edit, debug, test code
• Jobs are scheduled or event-based
• Serverless
AWS Glue: Job Execution - Serverless
• Auto-configure VPC and role-based access
• Customers can specify the capacity that gets allocated to each job
• Automatically scale resources (on post-GA roadmap)
• You pay only for the resources you consume while consuming them
• There is no need to provision, configure, or manage servers
Diagram: compute instances connected to customer VPCs
AWS Glue: Overall Flow
1. Crawl your raw data
2. Create your desired targets
3. Generate and prep your ETL
4. Execute your job
Amazon Athena—Interactive Analysis
Interactive query service to analyze data in Amazon S3 using standard SQL
No infrastructure to set up or manage and no data to load
Ability to run SQL queries on data archived in Amazon Glacier (coming soon)
Query Instantly
Zero setup cost; just point to S3 and start querying
Open
ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types
Easy
Serverless: zero infrastructure, zero administration; integrated with QuickSight
Pay per query
Pay only for queries run; save 30–90% on per-query costs through compression
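The savings from compression follow from per-query pricing being tied to the amount of data a query reads. The sketch below works that through under stated assumptions: the cost is taken to be proportional to bytes scanned, the price per terabyte is supplied by the caller (the slide does not quote one), and the function names are illustrative.

```python
from decimal import Decimal

TB = Decimal(10) ** 12  # bytes per terabyte, decimal units assumed


def query_cost(bytes_scanned, price_per_tb):
    """Cost of one query when billing is proportional to bytes scanned."""
    return (Decimal(bytes_scanned) / TB) * Decimal(price_per_tb)


def compression_savings(raw_bytes, compression_ratio, price_per_tb):
    """Return (cost_before, cost_after, fraction_saved) for a compressed dataset."""
    before = query_cost(raw_bytes, price_per_tb)
    after = query_cost(Decimal(raw_bytes) / Decimal(compression_ratio), price_per_tb)
    return before, after, (before - after) / before
```

With a hypothetical price of 5 dollars per terabyte, a query over 2 TB of raw data costs 10 dollars; at a 4:1 compression ratio the same query scans 0.5 TB and costs 2.50 dollars, a 75% saving, which sits inside the 30–90% range quoted above.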
Amazon Redshift—Data Warehousing
Fast at scale
Columnar storage technology to improve I/O efficiency and scale query performance
Secure
Audit everything; encrypt data end-to-end; extensive certification and compliance
Open file formats
Analyze optimized data formats on the latest SSD, and all open data formats in Amazon S3
Inexpensive
As low as $1,000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions; start at $0.25 per hour
Amazon Redshift Spectrum
Extend the data warehouse to exabytes of data in S3 data lake
Redshift data | Redshift Spectrum query engine | S3 data lake
• Exabyte-scale Redshift SQL queries against S3
• Join data across Redshift and S3
• Scale compute and storage separately
• Stable query performance and unlimited concurrency
• CSV, ORC, Grok, Avro, & Parquet data formats
• Pay only for the amount of data scanned
Amazon EMR—Big Data Processing
Low cost
Flexible billing with per-second billing, EC2 Spot, Reserved Instances, and auto-scaling to reduce costs 50–80%
Easy
Launch fully managed Hadoop & Spark in minutes; no cluster setup, node provisioning, or cluster tuning
Latest versions
Updated with the latest open source frameworks within 30 days of release
Use S3 storage
Process data directly in the S3 data lake securely with high performance using the EMRFS connector
Amazon Elasticsearch Service
Easy to Use
Fully managed; deploy production-ready clusters in minutes
Secure
Secure access with VPC to keep all traffic within the AWS network
Open
Direct access to Elasticsearch open-source APIs; supports Logstash and Kibana
Available
Zone awareness replicates data between two AZs; automatically monitors & replaces failed nodes
Amazon QuickSight
Easy | Empower everyone | Seamless connectivity | Fast analysis | Serverless
One product for all of your users
QuickSight covers all of your users from casual data consumers, to dashboard creators, to power users and analysts that need self-serve analytics.
Explore
Give power users and analysts the freedom to do their own self-serve data discovery and analysis on governed data you control
Create
Create and publish rich, interactive dashboards to all of your users
Consume
With the new Reader Role, you can provide everyone in your organization with secure, easy access to interactive dashboards and reports, on any device
Introducing Pay-per-Session pricing for Readers!
Pay-per-Session pricing for Readers starts at $0.30 per session up to a max of $5/user/month for unlimited sessions for data consumers that interact with published dashboards and reports.
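The pricing rule above is a per-session charge with a monthly cap, which a few lines of arithmetic make explicit. The two constants come from the slide; the function name is illustrative, and anything beyond those two figures (taxes, other user roles) is outside this sketch.

```python
from decimal import Decimal

SESSION_PRICE = Decimal("0.30")  # per reader session, from the slide
MONTHLY_CAP = Decimal("5.00")    # maximum per reader per month, from the slide


def reader_monthly_charge(sessions):
    """Monthly charge for one reader: per-session price, capped at the maximum."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    return min(SESSION_PRICE * sessions, MONTHLY_CAP)
```

The cap takes effect at the 17th session: 16 sessions cost 4.80 dollars, and every session count from 17 upward costs 5 dollars.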
AWS offers a range of tools to make AI/ML more accessible
Amazon AI/ML Services: Polly, Lex, Rekognition, Rekognition Video, Connect, Transcribe, Translate, Comprehend
Machine Learning Platforms: SageMaker, DeepLens, Amazon ML, Spark & EMR, Kinesis, Batch, ECS
Deep Learning Frameworks: Apache MXNet, TensorFlow, Caffe/Caffe2, Theano, Keras, Torch, Cognitive Toolkit
The range runs from usability/simplicity (leverages AWS AI/ML expertise) to greater control (customer-specific models)
These solutions are underpinned by proven, scalable AWS products and services: AWS Greengrass, AWS IoT, AWS Lambda, Amazon EC2 (P2 and G2 GPUs), Amazon EC2 (CPUs), Amazon EC2 (ENA), Amazon S3, Amazon DynamoDB, Amazon Redshift
Amazon SageMaker (NEW!)
The quickest and easiest way to get ML models from idea to production
• Zero setup
• Flexible Model Training
• End-to-End Machine Learning Platform
• Pay by the second
These tools come together to form a full AI/ML stack
Services: Lex, Polly, Rekognition, Rekognition Video, Connect, Transcribe, Translate, Comprehend
Platforms: SageMaker, DeepLens, Amazon ML, Spark & EMR, Kinesis, Batch, ECS
Frameworks (AWS Deep Learning AMI): Apache MXNet, TensorFlow, Caffe2 & Caffe, Theano, Keras, Torch, Cognitive Toolkit
Infrastructure: GPU, CPU, IoT, Mobile
ML Solutions Lab
We support all major frameworks to provide our customers with the best tool for the job.
ML Solutions Lab lets you leverage Amazon expertise
• Companies have numerous opportunities for Machine Learning
• But lack ML expertise or scale
• And are unable to unlock business potential
Amazon ML Lab provides the missing ML expertise: Brainstorming, Modeling, Teaching
Leverage Amazon experts with decades of ML experience with technologies like Amazon Echo, Amazon Alexa, Prime Air, and Amazon Go
Engage the ML Solutions Lab to harness the business value of your data
In Summary…
Processing & Analytics
Categories: Transactional & RDBMS, BI & Data Visualization, Batch, Predictive, Real-time
Services shown:
• DynamoDB (NoSQL DB)
• Aurora (Relational Database)
• ElastiCache, DAX
• Kinesis Streams & Firehose
• EMR (Hadoop, Spark, Presto)
• Redshift (Data Warehouse)
• Athena (Query Service)
• AWS Batch
• AWS Lambda
• Apache Storm on EMR, Apache Flink on EMR, Spark Streaming on EMR
• Elasticsearch Service
• Kinesis Analytics, Kinesis Streams
AWS Training Offer
Make your data-driven decisions count, and make a career in Big Data on AWS. Follow the Big Data Specialty learning path and become a specialist in Big Data:
• Implement core AWS Big Data services according to best practices
• Design and maintain Big Data
• Leverage tools to automate data analysis
Learning path:
• Certified Cloud Practitioner
• Associate-level Certification
• AWS Certified Big Data - Specialty
• Free AWS digital training: Foundational knowledge
• Free AWS digital training: Big Data Technology Fundamentals
• Big Data on AWS – 3-day Classroom Training
Who should attend
• Enterprise solutions architects
• Data scientists
• Big Data solutions architects
• Data analysts
Visit www.aws.training to find out more.
We hope you found it interesting! A kind reminder to complete the survey.
Let us know what you thought of today’s event and how we can improve the event
experience for you in the future.
twitter.com/AWSCloud
facebook.com/AmazonWebServices
youtube.com/user/AmazonWebServices
slideshare.net/AmazonWebServices
twitch.tv/aws
Thank You For Attending
AWS Data Driven Decisions Webinar Series.