![Page 1: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/1.jpg)
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Data Replication Options in AWS
Thomas Park – Manager, Solutions Architecture
November 15, 2013
![Page 2: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/2.jpg)
Thomas Jefferson first acquired the letter-copying device he called "the finest invention of the present age" in March of 1804.
http://www.monticello.org/site/house-and-gardens/polygraph
![Page 3: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/3.jpg)
Agenda
• Data replication design options in AWS
• Replication design factors and challenges
• Use cases
  – Do-it-yourself options
  – Managed and built-in AWS options
• Demos
![Page 4: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/4.jpg)
Data Replication Capabilities
AWS Global Infrastructure
Application Services
Networking
Deployment & Administration
Database Storage Compute
AWS Data Replication Capabilities
Partner Data Replication Capabilities
![Page 5: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/5.jpg)
Data Replication Solution Architecture
• Business continuity
• Disaster recovery
• Customer experience
• Productivity
• Mismatched SLA
• Compliance
• Reducing cost
• Information security risks
• Global expansion
• Performance/availability
Business Drivers AWS
Capabilities
Solutions
Architecture
• Multiple Availability Zones
• Multiple regions
• AMI
• Amazon EBS and DB snapshots
• AMI copy
• Amazon EBS and DB snapshot copy
• Multi-AZ DBs
• Read replica DBs
• Provisioned IOPS
• Offline backups and archives
• Data lifecycle policies
• Highly durable storage
Measure
Metrics Evaluate
Options
![Page 6: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/6.jpg)
Focus of Our Discussion Today
Data Preservation Databases
Performance
Storage/Content
(Files and Objects)
Data
Type
Business Drivers
Replication
Options
Design
Factors
Design
Options
![Page 8: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/8.jpg)
Design Options in AWS
Multi-AZ Cross
Region
Hybrid
IT
Single
AZ
Databases
Files and
Objects
AZ AZ AZ Region Region
![Page 9: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/9.jpg)
Focus of Our Discussion Today
Data Preservation Databases
Performance
Storage/Content
(Files and Objects)
Data
Type
Business Drivers
Replication
Options
Design
Factors
Design
Options
![Page 10: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/10.jpg)
DR Metrics – RTO/RPO
Timeline: Last Backup (2:00am) → Event (6:00am) → Data Restored (11:00am)
RPO: 4 hours (last backup to event)
RTO: 5 hours (event to data restored)
Physical vs. Logical
Synchronous vs. Asynchronous
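The timeline on this slide (last backup at 2:00am, event at 6:00am, data restored at 11:00am) can be checked with a few lines of arithmetic. The sketch below is only an illustration; the helper name `hours_between` is hypothetical and it assumes all three times fall on the same day.

```python
from datetime import datetime


def hours_between(start: str, end: str) -> float:
    """Return the hours between two same-day clock times such as '2:00am'."""
    fmt = "%I:%M%p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 3600


# Times taken from the slide's timeline.
last_backup, event, restored = "2:00am", "6:00am", "11:00am"

# RPO: how much data is at risk (time since the last good backup).
rpo_hours = hours_between(last_backup, event)
# RTO: how long the service is down (time from the event to restored data).
rto_hours = hours_between(event, restored)

print(f"RPO = {rpo_hours} hours, RTO = {rto_hours} hours")
```

With the slide's values this gives an RPO of 4 hours and an RTO of 5 hours.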
![Page 11: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/11.jpg)
Data Replication Options in AWS
Multi-AZ Cross
Region
Hybrid
IT
Single
AZ
Databases
Files and
Objects
AZ AZ AZ Region Region
Preservation
![Page 12: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/12.jpg)
Focus of Our Discussion Today
Databases
Storage/Content
(Files and Objects)
Data
Type
Business Drivers
Replication
Options
Design
Factors
Design
Options
Data Preservation
Performance
![Page 13: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/13.jpg)
Performance Metric – Total Time
Estimated DB
Size
~35 TB
Estimated DB
Size
~48 TB
Daily and Weekly
Updates
ETL Source
DB Target
DB
Can we do this in 10 hours?
600M Records and
320 GB in Size
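One way to approach "Can we do this in 10 hours?" is to turn it into a required sustained rate. This is a rough sketch, assuming the 600M records and 320 GB are the volume that must move within the 10-hour window (the slide does not say so explicitly) and using decimal units (1 GB = 1000 MB). The function name `required_rate` is hypothetical.

```python
def required_rate(total: float, window_hours: float) -> float:
    """Return the per-second rate needed to move `total` units in the window."""
    return total / (window_hours * 3600)


records = 600_000_000   # 600M records, from the slide
size_gb = 320           # 320 GB, from the slide
window_hours = 10       # target window, from the slide

records_per_sec = required_rate(records, window_hours)
mb_per_sec = required_rate(size_gb * 1000, window_hours)  # assumes 1 GB = 1000 MB

print(f"{records_per_sec:,.0f} records/s and {mb_per_sec:.2f} MB/s sustained")
```

Under those assumptions the replication path has to sustain roughly 16,667 records per second and about 8.89 MB per second for the full window, before any allowance for retries or transformation time.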
![Page 14: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/14.jpg)
Performance Metric – Total Time
Estimated DB
Size
~70 TB Estimated DB
Size
~48 TB
ETL
Source
DB
Target
DB
Can we STILL do this in 10 hours?
![Page 15: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/15.jpg)
Data Replication Options in AWS
Multi-AZ Cross
Region
Hybrid
IT
Single
AZ
Databases
Files and
Objects
AZ AZ AZ Region Region
Preservation
Performance
![Page 16: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/16.jpg)
Focus of Our Discussion Today
Databases
Storage/Content
(Files and Objects)
Data
Type
Business Drivers
Replication
Options
Design
Factors
Design
Options
Data preservation
Performance
![Page 17: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/17.jpg)
Factors Affecting Replication Designs
Source (Read/Write) → Replicate → Target (Read/Write)
Factors (numbered 1–6 on the slide):
• Latency
• Bandwidth
• Throughput
• Data change rate
• Size of data
• Consistency
![Page 18: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/18.jpg)
Challenges in Replication
Availability &
Performance Data
Size
Consistency
Change
Rate
Database Compute Network Storage
Infrastructure Capabilities
![Page 19: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/19.jpg)
Challenges in Replication
Availability &
Performance
Data
Size
Consistency
Change
Rate
Infrastructure Capabilities
Database Compute Network Storage
![Page 20: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/20.jpg)
Challenges in Replication
Availability &
Performance
Data
Size
Change
Rate
Database Compute Network Storage
![Page 21: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/21.jpg)
Replication Design Options in AWS
Flexibility Options
The right tool for the right job
![Page 22: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/22.jpg)
Common Data Replication Scenarios
• Hybrid IT
• Database migration
• HA databases
• Increase throughput
• Cross regions
• Data warehousing
![Page 23: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/23.jpg)
Please Meet Bob
• DBA for a large enterprise company
• 10 years of IT experience
• What is AWS?
![Page 24: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/24.jpg)
Disaster Recovery
Bob, DBA Sue, DBA
I can’t find
archlog_002
file!!!!!!
![Page 25: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/25.jpg)
We Need a Better Way….
MySQL
SQL
Server
Oracle
Daily
5 - 6 hours
RTO is 8 hours
RPO is 1 hour
Bob, DBA
![Page 26: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/26.jpg)
Demo – AWS S3 Upload
Corporate Data Center
Amazon S3
Bucket
Generic Database
DB
Full Backup
![Page 27: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/27.jpg)
Think Parallel
2 Seconds
Multipart
![Page 28: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/28.jpg)
Think Parallel
2 Seconds 2 Seconds 2 Seconds 2 Seconds
8 Seconds
# Assumes $files holds file objects, e.g. from Get-ChildItem
foreach ($file in $files) { Write-S3Object -BucketName mybucket -Key $file.Name -File $file.FullName }
![Page 29: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/29.jpg)
Think Parallel
Nearly 3 Days
![Page 30: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/30.jpg)
Think Parallel
2 Seconds 2 Seconds 2 Seconds 2 Seconds
2 Seconds 2 Seconds 2 Seconds 2 Seconds
2 Seconds 2 Seconds 2 Seconds 2 Seconds
120,000 files @ 15,000 TPS = 8 seconds
Multiple Machines, Multiple Threads, Multiple Parts
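The "Nearly 3 Days" and "8 seconds" figures on the Think Parallel slides follow from simple arithmetic. The sketch below reproduces them, assuming 120,000 files at 2 seconds each for the one-at-a-time case, and treating each file as a single transaction against an aggregate 15,000 transactions per second for the parallel case.

```python
SECONDS_PER_DAY = 86_400

files = 120_000          # file count, from the slide
seconds_per_file = 2     # per-file upload time, from the slide
aggregate_tps = 15_000   # aggregate transactions per second, from the slide

# One file at a time: total time is simply files x seconds per file.
serial_days = files * seconds_per_file / SECONDS_PER_DAY

# Fully parallel: assume one transaction per file at the aggregate rate.
parallel_seconds = files / aggregate_tps

print(f"serial: {serial_days:.2f} days, parallel: {parallel_seconds:.0f} seconds")
```

The serial case works out to about 2.78 days, and the parallel case to 8 seconds, which matches the slides. Reaching that aggregate rate in practice depends on the number of machines, threads, and parts available.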
![Page 31: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/31.jpg)
Demo – AWS S3 Multipart Upload
Corporate Data center
Amazon S3
Bucket
Generic Database
DB
Full Backup
![Page 32: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/32.jpg)
Think Parallel
• Use multipart upload (API/SDK or command line tools)
  – Minimum part size is 5 MB
• Use multiple threads
  – GNU Parallel: parallel -j0 -N2 --progress /usr/bin/s3cmd put {1} {2}
  – Python multiprocessing, .NET parallel extensions, etc.
• Use multiple machines
  – Limited by host CPU / memory / network / IO
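The multipart and multiple-threads advice can be sketched in Python. This is a minimal illustration, not the demo's code: `plan_parts`, `upload_parts`, and `fake_upload` are hypothetical names, and the uploader is a stand-in so the example runs without AWS credentials. A real uploader would call the S3 multipart upload API for each part.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024  # 5 MB minimum part size quoted on the slide

Part = Tuple[int, int, int]  # (part number, byte offset, length)


def plan_parts(total_size: int, part_size: int = MIN_PART_SIZE) -> List[Part]:
    """Split total_size bytes into numbered parts; only the last may be short."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size is below the 5 MB minimum")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def upload_parts(parts: List[Part],
                 upload_one: Callable[[int, int, int], str],
                 threads: int = 8) -> List[str]:
    """Run upload_one for every part on a thread pool; results keep part order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda part: upload_one(*part), parts))


def fake_upload(number: int, offset: int, length: int) -> str:
    """Stand-in uploader (assumption): returns a receipt instead of calling S3."""
    return f"part-{number}:{offset}+{length}"


parts = plan_parts(12 * 1024 * 1024)        # a 12 MB backup file
receipts = upload_parts(parts, fake_upload)
print(receipts)
```

A 12 MB file splits into two 5 MB parts and one 2 MB tail. Threads suit this workload because each upload spends most of its time waiting on the network; beyond one host's CPU, memory, network, or IO limits, the same plan can be divided across machines.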
![Page 33: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/33.jpg)
Database Replication Options
Bob, DBA Tom, Sys Admin
How are you going
to replicate
databases?
![Page 34: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/34.jpg)
Database Replication Options in AWS
MySQL
SQL
Server
Oracle
Amazon RDS
Replication
Bob’s Office
![Page 35: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/35.jpg)
Non-RDS to RDS Database Replication
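Corporate Data Center (Bob’s Office): MySQL → Availability Zone A: Amazon RDS
1. Configure the corporate MySQL database to be a master
2. Dump the database with mysqldump
3. Copy the dump to an Amazon S3 bucket (AWS S3 CP)
4. Initialize Amazon RDS from the mysqldump output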
![Page 36: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/36.jpg)
Non-RDS to RDS Database Replication
Availability Zone A Corporate Data Center
MySQL
Run mysql.rds_set_external_master
5
Bob’s Office
Amazon RDS
![Page 37: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/37.jpg)
Non-RDS to RDS Database Replication
Availability Zone A Corporate Data Center
MySQL
Bob’s Office
Run mysql.rds_start_replication
6
Amazon RDS
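Steps 5 and 6 run two stored procedures on the Amazon RDS instance. The sketch below only builds the SQL text for them so it can be reviewed before it is run. The procedure names come from the slides; the argument order (host, port, replication user, password, binary log file, binary log position, SSL flag) follows the Amazon RDS MySQL documentation as I understand it and should be verified against the current reference. The host name, user, password, and log coordinates are placeholder assumptions.

```python
def sql_literal(value: str) -> str:
    """Quote a string for a MySQL statement, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def external_master_call(host: str, port: int, user: str, password: str,
                         log_file: str, log_pos: int,
                         use_ssl: bool = False) -> str:
    """Build the step 5 call that points RDS at the external master."""
    args = [
        sql_literal(host),
        str(int(port)),
        sql_literal(user),
        sql_literal(password),
        sql_literal(log_file),
        str(int(log_pos)),
        "1" if use_ssl else "0",
    ]
    return "CALL mysql.rds_set_external_master(" + ", ".join(args) + ");"


# Step 6: start replication once the external master is configured.
START_REPLICATION = "CALL mysql.rds_start_replication;"

# Placeholder values (assumptions); the binary log file and position would
# come from the master at the time the dump was taken.
step5 = external_master_call("onprem-db.example.com", 3306, "repl_user",
                             "secret", "mysql-bin.000031", 107)
print(step5)
print(START_REPLICATION)
```

Both statements would be executed through an ordinary MySQL client session connected to the RDS instance.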
![Page 38: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/38.jpg)
Database Replication Options
MySQL
SQL
Server
Oracle
Amazon S3
Bucket
Amazon RDS
Log Shipping SQL
Server
Restore
Bob’s Office
![Page 39: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/39.jpg)
Database Replication Options
MySQL
SQL
Server
Oracle
O
S
B
Amazon RDS
OSB Cloud
Module Oracle
SQL
Server
RMAN
restore
Bob’s Office
Amazon S3
Bucket
![Page 40: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/40.jpg)
Database Replication Options
• Amazon RDS MySQL
  – Replication
• SQL Server and Oracle on EC2
  – SQL Server log shipping, AlwaysOn, mirroring, etc.
  – Oracle RMAN/OSB, Active Data Guard, etc.
Bob, DBA
![Page 41: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/41.jpg)
HA Database Replication Options
Bob, DBA Katie, Director
We need a highly
available solution.
![Page 42: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/42.jpg)
HA DB Replication Options
Oracle
SQL
Server
Availability Zone A Availability Zone B
SQL
Server
Amazon RDS DB
Instance Standby
(Multi-AZ)
Oracle
Standby Data Guard
Data Guard Configuration
Prepare primary database
1. Enable logging
2. Add standby redo logs
3. Add Data Guard parameters to init.ora/spfile
4. Update tnsnames.ora and listener.ora
Prepare standby database environment
1. Install or clone the Oracle home
2. Copy password file (orapwSID) from primary database
3. Add Data Guard parameters to init.ora/spfile
4. Update tnsnames.ora and listener.ora
Create standby database using RMAN
1. Duplicate target database for standby
Configure Data Guard broker
1. Set up database parameters on primary and standby database init.ora/spfile
2. Create Data Guard configuration for primary and standby using dgmgrl
3. Set up StaticConnectIdentifier for primary and standby
4. Enable Data Guard configuration
5. Show configuration – should return success
![Page 43: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/43.jpg)
HA DB Replication Options
Oracle
SQL
Server
Availability Zone A Availability Zone B
Amazon RDS DB
Instance Standby
(Multi-AZ)
Physical
Synchronous Replication
Amazon RDS MySQL Multi-AZ
Amazon RDS Oracle Multi-AZ
![Page 44: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/44.jpg)
Increase Throughput
Bob, DBA Manager Hannah, Finance
The order
system is
running
slowly.
Disk
I/O?
![Page 45: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/45.jpg)
Increase Throughput Options
• Amazon EC2 instance type – Amazon RDS MySQL
• PIOPS – Amazon RDS MySQL
• Read replicas – Amazon RDS MySQL
Bob, DBA Manager
![Page 46: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/46.jpg)
Amazon RDS Performance Options
m1.small m2.4xlarge
Amazon
RDS DB
Instance
Read
Replica
Availability Zone A Availability Zone B
Provisioned IOPS
Asynchronous
![Page 47: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/47.jpg)
Availability Zone A Availability Zone B
Web Web Web
AS
Web
us-east-1
Availability Zone C
Provisioned IOPS
SQL
Server
Oracle
Standby
SQL
Server
Oracle
![Page 48: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/48.jpg)
Increase Throughput Options
• Amazon CloudFront – Large objects
Bob, DBA Manager
![Page 49: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/49.jpg)
Availability Zone A Availability Zone B
SQL
Server
Oracle
Standby
Web Web Web
AS
Web
us-east-1
Availability Zone C
CloudFront
SQL
Server
Oracle
Logs
![Page 50: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/50.jpg)
Increase Throughput Options
• Amazon DynamoDB – Sessions, orders
Bob, DBA Manager
![Page 51: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/51.jpg)
Availability Zone A Availability Zone B
SQL
Server
Oracle
Standby
Web Web Web
AS
Web
us-east-1
Availability Zone C
Automatic Replication
CloudFront
SQL
Server
Oracle
Amazon DynamoDB
Logs
![Page 52: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/52.jpg)
![Page 53: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/53.jpg)
Cross-region Replication Options
Bob, Architect Bella, VP
We are
opening a new
development
site in….
![Page 54: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/54.jpg)
Tokyo us-east-1
• Replicate AMIs and Amazon EBS snapshots
• Replicate Amazon DynamoDB tables
• Replicate Amazon RDS snapshots
• Replicate Amazon S3 buckets
![Page 55: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/55.jpg)
Demo Cross-region Replication Options
Amazon DynamoDB Amazon DynamoDB
ap-northeast-1 us-east-1
AWS Data Pipeline
AMI
Amazon EBS
AMI
Copy
Copy
Copy
RDS Snapshot Amazon RDS Snapshot
Amazon EBS Amazon EBS Snapshot Amazon EBS Snapshot
Snapshot Restore
EC2 EC2
Restore Create
Snapshot Restore
![Page 56: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/56.jpg)
Replicate Amazon S3 Bucket
Source
Bucket with
Objects
Destination
Bucket
![Page 57: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/57.jpg)
Source
Bucket with
Objects
Destination
Bucket
Dequeue
Task
Agent(s)
Task Queue
Controller
List bucket (S) List bucket (D)
Copy Queue List(S-D)
S3 Copy API
Amazon S3 Copy
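The controller and agent pattern in this diagram can be sketched as follows: the controller lists both buckets, queues the keys that exist only in the source, and agents dequeue tasks and copy each object. This is an illustrative sketch under assumptions, not the demo's implementation. `keys_to_copy`, `run_agents`, and `fake_copy` are hypothetical names, the key lists are made up, and the copy function is a stand-in for a call to the S3 Copy API.

```python
import queue
import threading
from typing import Callable, Iterable, List


def keys_to_copy(source_keys: Iterable[str],
                 destination_keys: Iterable[str]) -> List[str]:
    """Controller step: keys listed in the source but not in the destination."""
    return sorted(set(source_keys) - set(destination_keys))


def run_agents(tasks: List[str], copy_object: Callable[[str], None],
               agents: int = 4) -> None:
    """Fill a task queue, then let agent threads drain it by copying each key."""
    task_queue = queue.Queue()
    for key in tasks:
        task_queue.put(key)

    def agent() -> None:
        while True:
            try:
                key = task_queue.get_nowait()
            except queue.Empty:
                return  # queue drained, agent exits
            try:
                copy_object(key)
            finally:
                task_queue.task_done()

    workers = [threading.Thread(target=agent) for _ in range(agents)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


copied: List[str] = []
copied_lock = threading.Lock()


def fake_copy(key: str) -> None:
    """Stand-in for the S3 Copy API call (assumption): records the key."""
    with copied_lock:
        copied.append(key)


todo = keys_to_copy(["a.bak", "b.bak", "c.bak"], ["b.bak"])
run_agents(todo, fake_copy)
print(sorted(copied))
```

Because the copy happens inside S3, the agents only issue requests and never move object data through their own hosts, so adding agents mainly raises the request rate.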
![Page 58: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/58.jpg)
Hive Script to Compare Amazon S3 Buckets
Create external table sourcekeys (key string)
location 's3://mybucket/sourcebucketlist';
Create external table destinationkeys (key string)
location 's3://mybucket/destinationbucketlist';
Create table differencelist (key string)
location 's3://mybucket/differencelist';
-- Keys present in the source listing but missing from the destination listing
Insert overwrite table differencelist
Select sourcekeys.key
From sourcekeys
Left outer join destinationkeys
On (sourcekeys.key = destinationkeys.key)
Where destinationkeys.key is null;
![Page 59: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/59.jpg)
Data Warehouse Replication
Bob, Architect
Gus, Marketing
We need to
understand
impact of price
changes.
![Page 60: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/60.jpg)
Data Warehouse Replication
Amazon Redshift
Amazon
DynamoDB
Bucket with
Objects
Oracle
BI
Reports
![Page 61: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/61.jpg)
AWS Data Pipeline
Data Pipeline
DynamoDB Amazon S3 Amazon Redshift JDBC
Copy EMR Hive Pig Shell
command
Redshift
copy SQL
Hive
copy
Scheduler
DB on
Instance
![Page 62: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/62.jpg)
Demo Data Warehouse Replication
Amazon Redshift
Amazon
DynamoDB
AWS Data Pipeline Bucket with
Objects
Oracle
BI
Reports
![Page 63: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/63.jpg)
Attunity CloudBeam for Amazon Redshift – Incremental Load (CDC)
Source Database (Oracle DB) → Replication Server → S3 (change files in customer’s S3 account) → Amazon Redshift (AWS region; Data Tables and CDC Table)
1. Generate change files (CDC): net changes file
2. Beam files to S3
3. Validate file content upon arrival
4. Execute ‘copy’ command to load data tables from S3
5. ‘Copy’ data to CDC table
6. Execute SQL commands to ‘merge’ changes into data tables
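The final "merge" step applies staged changes from the CDC table to the data tables. The sketch below illustrates that idea on in-memory dictionaries. It is not Attunity's implementation and not Redshift SQL; the change format (operation, primary key, changed columns) and the name `merge_changes` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Change = Tuple[str, int, Row]  # (operation, primary key, changed columns)


def merge_changes(data_table: Dict[int, Row],
                  cdc_table: List[Change]) -> Dict[int, Row]:
    """Apply staged changes in order and return a merged copy of the table."""
    merged = {key: dict(row) for key, row in data_table.items()}
    for operation, key, row in cdc_table:
        if operation == "delete":
            merged.pop(key, None)
        elif operation in ("insert", "update"):
            # Overlay changed columns on the existing row, if there is one.
            merged[key] = {**merged.get(key, {}), **row}
        else:
            raise ValueError(f"unknown operation: {operation}")
    return merged


orders = {1: {"price": 10}, 2: {"price": 20}}
staged = [
    ("update", 1, {"price": 12}),
    ("delete", 2, {}),
    ("insert", 3, {"price": 30}),
]
merged_orders = merge_changes(orders, staged)
print(merged_orders)
```

Staging changes in a separate table first means the data tables are touched only by the merge, so a bad or partial change file can be rejected at validation without affecting them.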
![Page 64: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/64.jpg)
Business Intelligence
![Page 65: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/65.jpg)
AZ A AZ B
Web Web Web
AS
Web
us-east-1
AZ C
Automatic Replication
CloudFront
Oracle
Amazon DynamoDB
Logs
Amazon
S3
Bucket
ap-northeast-1
AWS
Data
Pipeline
AMI
Amazon
EBS
snapshot
Oracle
STBY
SQL
Server
SQL
Server
Amazon
RDS
snapshot
Amazon
DynamoDB
Amazon
Redshift
![Page 66: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/66.jpg)
Bob, Chief Architect
![Page 67: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/67.jpg)
Key Takeaways
• Consider design factors and make trade-offs, if possible
• Think parallel
• Pick the right tool for the right job
![Page 68: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/68.jpg)
Please give us your feedback on this
presentation
As a thank you, we will select prize
winners daily for completed surveys!
ARC302
![Page 69: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/69.jpg)
Extra
![Page 70: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/70.jpg)
Amazon DynamoDB Built-in Replication
[Diagram: in Region A, an Amazon DynamoDB table with provisioned throughput is kept in three copies across Availability Zones A, B, and C (Automatic 3-way Replication). AWS Data Pipeline and an Amazon S3 bucket connect it to an Amazon DynamoDB table in Region B.]
![Page 71: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/71.jpg)
Amazon Redshift Replication Patterns
[Diagram: an Amazon Redshift cluster in Availability Zone A (one leader node and three compute nodes on 10GigE) exchanges data with an Amazon S3 bucket through Unload and Copy; Region B is the replication target.]
![Page 72: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/72.jpg)
WAN Acceleration Test
AP-Southeast-1 EU-West-1 US-East-1
Instance Instance Instance
![Page 73: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/73.jpg)
Tsunami-UDP: Here's How
Install Tsunami
Origin Server
$ cd /path/to/files
$ tsunamid *
Destination Server
$ cd /path/to/receive/files
$ tsunami
tsunami> connect ec2-XX-XX-XX-83.compute-1.amazonaws.com
tsunami> get *
*** Note: firewall ports need to be open between servers
![Page 74: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/74.jpg)
BBCP: Here's How
Install BBCP
$ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_linux26/bbcp
$ sudo cp bbcp /usr/bin/
Transmitting with 64 parallel streams
$ bbcp -P 2 -V -w 8m -s 64 /local/files/* ec2-your-instance.ap-southeast-1.compute.amazonaws.com:/mnt/
Notes:
• SSH key pairs are used to authenticate between systems
• TCP port 5031 needs to be open between machines
• Does NOT encrypt data in transit
• Instance type matters: the better the instance type, the better the performance
• Many dials and options to tweak for better performance, plus friendly features like retries and restarts
*** Note: firewall ports need to be open between servers
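The `-w 8m` window and `-s 64` streams in the bbcp command matter because a single TCP stream cannot move more than one window of data per round trip. The Python sketch below works through that bound. It assumes the `k`/`m`/`g` suffixes are binary multiples, and the round-trip time and link speed in the usage note are hypothetical values, not measurements from the talk.

```python
def parse_size(text):
    """Parse a size such as '8m' or '64k' into bytes (binary multiples assumed)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def max_stream_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound for one TCP stream: window / RTT, in megabits per second."""
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6


def aggregate_bound_mbps(window_bytes, rtt_ms, streams, link_mbps):
    """Window-limited bound across all streams, capped at the link rate."""
    per_stream = max_stream_throughput_mbps(window_bytes, rtt_ms)
    return min(link_mbps, streams * per_stream)
```

With a hypothetical 200 ms round trip, a 64 KB window caps one stream at about 2.6 Mbit/s, while an 8 MB window raises the cap to about 335 Mbit/s. Adding streams multiplies the window-limited bound until the link itself becomes the limit.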
![Page 75: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/75.jpg)
Minutes to Send a 30GB File from US-East-1
[Bar chart: minutes on the y-axis (0 to 60); destinations: To Dublin, To Singapore; series: SCP, BBCP, Tsunami]
*** Note using hs1.xl
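To compare tools on a chart like this one, it can help to convert minutes into an effective transfer rate. The Python sketch below does that arithmetic, assuming decimal units (1 GB = 1000 MB). The timings in the usage note are hypothetical and are not read from the chart.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Effective rate in MB/s for sending size_gb gigabytes in the given minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return size_gb * 1000 / (minutes * 60)


def speedup(baseline_minutes, tool_minutes):
    """How many times faster a tool is than the baseline for the same file."""
    if baseline_minutes <= 0 or tool_minutes <= 0:
        raise ValueError("durations must be positive")
    return baseline_minutes / tool_minutes
```

For example, a 30 GB file sent in a hypothetical 10 minutes works out to `throughput_mb_per_s(30, 10)`, which is 50 MB/s. If the baseline took a hypothetical 50 minutes, `speedup(50, 10)` is 5.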
![Page 76: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/76.jpg)
Environment & Configuration
#Primary init.ora:
LOG_ARCHIVE_DEST_1='LOCATION=/data/oracle/Prod/db/archive VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=Prod'
LOG_ARCHIVE_CONFIG='DG_CONFIG=(prod,stb)'
DB_FILE_NAME_CONVERT='prod','prod'
FAL_CLIENT='prod'
FAL_SERVER='stb'
log_archive_dest_2='SERVICE=stb VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stb'
LOG_ARCHIVE_DEST_STATE_1='ENABLE'
log_archive_dest_state_2='ENABLE'
log_archive_format='%t_%s_%r.arc'
LOG_FILE_NAME_CONVERT='prod','prod'
remote_login_passwordfile='EXCLUSIVE'
SERVICE_NAMES='prod'
STANDBY_FILE_MANAGEMENT='AUTO'
db_unique_name=prod
global_names=TRUE
DG_BROKER_START=TRUE
DG_BROKER_CONFIG_FILE1='/data/oracle/prod/db/tech_st/11.1.0/dbs/prod1.dat'
DG_BROKER_CONFIG_FILE2='/data/oracle/prod/db/tech_st/11.1.0/dbs/prod2.dat'
#Standby init.ora:
LOG_ARCHIVE_DEST_1='LOCATION=/data/oracle/prod/db/archive VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=stb'
LOG_ARCHIVE_CONFIG='DG_CONFIG=(prod,stb)'
DB_FILE_NAME_CONVERT='prod','prod'
FAL_CLIENT='stb'
FAL_SERVER='prod'
log_archive_dest_2='SERVICE=prod VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=prod'
LOG_ARCHIVE_DEST_STATE_1='ENABLE'
log_archive_dest_state_2='DEFER'
log_archive_format='%t_%s_%r.arc'
LOG_FILE_NAME_CONVERT='prod','prod'
remote_login_passwordfile='EXCLUSIVE'
SERVICE_NAMES='stb'
STANDBY_FILE_MANAGEMENT='AUTO'
db_unique_name=stb
global_names=TRUE
DG_BROKER_START=TRUE
DG_BROKER_CONFIG_FILE1='/data/oracle/prod/db/tech_st/11.1.0/dbs/prod1.dat'
DG_BROKER_CONFIG_FILE2='/data/oracle/prod/db/tech_st/11.1.0/dbs/prod2.dat'
Primary DB: prod Standby DB: stb
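The primary and standby parameter files have to mirror each other: each side's FAL_SERVER points at the other database, and both share the same DG_CONFIG. The Python sketch below checks that symmetry for simple NAME=VALUE text. It is an illustrative helper, not an Oracle tool. The function names are hypothetical, and it assumes the net service names match `db_unique_name`, as they do in the configuration above.

```python
import re

PARAM_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$")


def parse_init_ora(text):
    """Return {lowercased parameter name: raw value} for simple NAME=VALUE lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = PARAM_RE.match(line)
        if match:
            name, value = match.groups()
            params[name.lower()] = value
    return params


def unquote(value):
    """Strip surrounding whitespace and single quotes from a parameter value."""
    return value.strip().strip("'")


def check_pair(primary, standby):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if primary.get("log_archive_config") != standby.get("log_archive_config"):
        problems.append("LOG_ARCHIVE_CONFIG differs between primary and standby")
    # Assumption: the service name in FAL_SERVER equals the peer's db_unique_name.
    if unquote(primary.get("fal_server", "")) != unquote(standby.get("db_unique_name", "")):
        problems.append("primary FAL_SERVER does not name the standby")
    if unquote(standby.get("fal_server", "")) != unquote(primary.get("db_unique_name", "")):
        problems.append("standby FAL_SERVER does not name the primary")
    return problems
```

Calling `check_pair(parse_init_ora(primary_text), parse_init_ora(standby_text))` on the two files returns an empty list when these three relationships hold.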
![Page 77: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/77.jpg)
Listener.ora
#Primary listener.ora:
prod =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1526))
    )
  )
SID_LIST_prod =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /data/oracle/prod/db/tech_st/11.1.0)
      (SID_NAME = prod)
    )
    (SID_DESC =
      (ORACLE_HOME = /data/oracle/prod/db/tech_st/11.1.0)
      (SID_NAME = prod)
      (GLOBAL_DBNAME = prod_DGMGRL)
    )
    (SID_DESC =
      (ORACLE_HOME = /data/oracle/prod/db/tech_st/11.1.0)
      (SID_NAME = prod)
      (GLOBAL_DBNAME = prod_DGB)
    )
  )
#Standby listener.ora:
prod =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = ec2)(PORT = 1526))
    )
  )
SID_LIST_prod =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /data/oracle/prod/db/tech_st/11.1.0)
      (SID_NAME = prod)
    )
    (SID_DESC =
      (ORACLE_HOME = /data/oracle/prod/db/tech_st/11.1.0)
      (SID_NAME = prod)
      (GLOBAL_DBNAME = stb_DGMGRL)
    )
    (SID_DESC =
      (ORACLE_HOME = /data/oracle/prod/db/tech_st/11.1.0)
      (SID_NAME = prod)
      (GLOBAL_DBNAME = stb_DGB)
    )
  )
![Page 78: Data Replication Options in AWSawsmedia.s3.amazonaws.com/ARC302.pdf · Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases](https://reader034.vdocuments.net/reader034/viewer/2022051523/5a78ee7f7f8b9a43758b55ac/html5/thumbnails/78.jpg)
tnsnames.ora
#Primary tnsnames.ora:
STB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby)(PORT = 1526))
    (CONNECT_DATA =
      (SID = prod)
    )
  )
prod =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1526))
    (CONNECT_DATA =
      (SID = VIS)
    )
  )
#Standby tnsnames.ora:
STB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby)(PORT = 1526))
    (CONNECT_DATA =
      (SID = prod)
    )
  )
prod =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1526))
    (CONNECT_DATA =
      (SID = VIS)
    )
  )
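When both sides carry their own tnsnames.ora, it is easy for an alias to drift. The Python sketch below pulls the host, port, and SID out of each alias so the two files can be compared. It is a simplified parser written for this illustration, not Oracle's own. It handles only plain `alias = ( ... )` entries like the ones above, and the function name `parse_tnsnames` is hypothetical.

```python
import re

ALIAS_RE = re.compile(r"\s*([\w.]+)\s*=\s*")
FIELD_RE = re.compile(
    r"\(\s*(HOST|PORT|SID|SERVICE_NAME)\s*=\s*([^()]+?)\s*\)", re.IGNORECASE
)


def parse_tnsnames(text):
    """Return {alias: {field: value}} for simple tnsnames.ora entries."""
    # Drop comment lines such as '#Primary tnsnames.ora:'.
    text = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("#")
    )
    entries = {}
    pos = 0
    while pos < len(text):
        match = ALIAS_RE.match(text, pos)
        if not match or match.end() >= len(text) or text[match.end()] != "(":
            break
        # Walk forward to the parenthesis that closes this entry.
        depth, i = 0, match.end()
        while i < len(text):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    break
            i += 1
        body = text[match.end():i + 1]
        entries[match.group(1)] = {
            key.lower(): value for key, value in FIELD_RE.findall(body)
        }
        pos = i + 1
    return entries
```

Run against the entries above, the `STB` alias resolves to host `standby`, port `1526`, SID `prod`, and the `prod` alias to host `primary` with SID `VIS`. The listener.ora shown earlier registers `SID_NAME = prod`; the slides do not say whether `VIS` is intentional, so a comparison like this is one way to surface such a difference for review.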