(BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent...
DESCRIPTION
"Take a look into how NordstromRack.com | HauteLook and Nasdaq OMX are using Amazon Redshift for data warehousing and supporting business intelligence workloads one year after they moved to Amazon Redshift. We will cover why HauteLook chose Redshift, how they built the architecture, what data is being stored and accessed, and, overall, how that data is powering the HauteLook business. We will also discuss how Nasdaq migrated from an on-premises data warehouse to Amazon Redshift, and how they have been able to take advantage of Redshift's array of security features such as hardware security modules (HSM), encryption, and audit logging."
TRANSCRIPT
![Page 1: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/1.jpg)
November 12, 2014 | Las Vegas, NV
BDT206
See How Amazon Redshift is Powering Business Intelligence in the Enterprise
Rahul Pathak, Amazon Redshift
Jason Timmes, Nasdaq
Kevin Diamond, HauteLook
![Page 2: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/2.jpg)
![Page 3: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/3.jpg)
[Diagram: AWS services grouped under Collect, Store, and Analyze. Services shown: AWS Direct Connect, Amazon Kinesis, Amazon S3, Amazon DynamoDB, Amazon Glacier, AWS Data Pipeline, Amazon Redshift, Amazon Elastic MapReduce, Amazon EC2]
![Page 4: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/4.jpg)
[Diagram labels: JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore]
![Page 5: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/5.jpg)
[Diagram labels: JDBC/ODBC; Customer VPC; Internal VPC; 10 GigE (HPC); Ingestion, Backup, Restore]
![Page 6: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/6.jpg)
![Page 7: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/7.jpg)
![Page 8: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/8.jpg)
[Diagram labels: Data Source, ET, Direct Connect, Client, Forwarder, Loader, State Management, Sandbox, Amazon Redshift, S3]
![Page 9: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/9.jpg)
![Page 10: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/10.jpg)
![Page 11: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/11.jpg)
• Leading index provider with 41,000+ indexes across asset classes and geographies
• Over 10,000 corporate clients in 60 countries
• Our technology powers over 70 marketplaces, regulators, CSDs and clearinghouses in over 50 countries
• 100+ data product offerings supporting 2.5+ million investment professionals and users in 98 countries
• 26 markets, 3 clearing houses, 5 central securities depositories
• Lists more than 3,500 companies in 35 countries, representing more than $8.8 trillion in total market value
![Page 12: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/12.jpg)
![Page 13: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/13.jpg)
Our warehouse can be used to analyze market share, client activity, surveillance, power our billing, and more…
![Page 14: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/14.jpg)
![Page 15: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/15.jpg)
![Page 16: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/16.jpg)
![Page 17: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/17.jpg)
![Page 18: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/18.jpg)
• Pay close attention to manifest mandatory flag!
  – Amazon Redshift UNLOAD always sets this to false!!!
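A minimal sketch of why the flag matters, assuming the standard COPY manifest layout (an `entries` list whose items have a `url` and an optional `mandatory` flag). When `mandatory` is false, COPY does not fail if a listed file is missing, so a partial load can go unnoticed. The bucket name, keys, and helper names below are hypothetical.

```python
import json


def build_manifest(urls, mandatory=True):
    """Build a COPY manifest dict; every entry carries an explicit mandatory flag."""
    return {"entries": [{"url": url, "mandatory": mandatory} for url in urls]}


def force_mandatory(manifest_text):
    """Return a manifest JSON string with every entry marked mandatory."""
    manifest = json.loads(manifest_text)
    for entry in manifest.get("entries", []):
        entry["mandatory"] = True
    return json.dumps(manifest, indent=2)


# Hypothetical bucket and keys, for illustration only.
load_manifest = build_manifest([
    "s3://example-bucket/orders/part-0000.gz",
    "s3://example-bucket/orders/part-0001.gz",
])

# Shape of a manifest as UNLOAD might write it (per the slide, mandatory is false).
unloaded = '{"entries": [{"url": "s3://example-bucket/unload/0000_part_00", "mandatory": false}]}'
fixed = json.loads(force_mandatory(unloaded))
```

Rewriting an UNLOAD-produced manifest this way before reusing it with COPY is one option; generating load manifests yourself with the flag set is another.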
![Page 19: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/19.jpg)
• TableIngestStatus
  – We originally put this table in Amazon Redshift itself
  – Turns out Amazon Redshift is not efficient on really small data sets
  – Significantly impacted performance, and increased concurrency contention
• Solution: Moved TableIngestStatus to a separate transactional RDBMS (MySQL)
  – We were already using a MySQL instance to persist workflow states
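The idea of keeping small, frequently updated ingest state in a transactional store can be sketched as follows. The slides do not show the real schema, so the table layout, column names, and state values here are hypothetical, and `sqlite3` stands in for MySQL only so the example runs without a server.

```python
import sqlite3

# sqlite3 stands in for the MySQL instance so the sketch runs anywhere.
# Table and column names below are hypothetical, not Nasdaq's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE table_ingest_status (
        table_name TEXT NOT NULL,
        load_date  TEXT NOT NULL,
        state      TEXT NOT NULL,
        PRIMARY KEY (table_name, load_date)
    )
    """
)


def set_state(conn, table_name, load_date, state):
    """Insert or overwrite the ingest state for one table and load date."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO table_ingest_status VALUES (?, ?, ?)",
            (table_name, load_date, state),
        )


def get_state(conn, table_name, load_date):
    """Return the recorded state, or None if the load has not been seen."""
    row = conn.execute(
        "SELECT state FROM table_ingest_status WHERE table_name = ? AND load_date = ?",
        (table_name, load_date),
    ).fetchone()
    return row[0] if row else None


set_state(conn, "trades", "2014-11-11", "LOADING")
set_state(conn, "trades", "2014-11-11", "LOADED")
```

`INSERT OR REPLACE` is SQLite syntax; on MySQL the equivalent would be `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`. Single-row upserts like these are cheap in a row-oriented transactional database, which is the workload the slide says did not suit Amazon Redshift.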
![Page 20: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/20.jpg)
• Direct Connect (private lines)
• VPC
• Encryption in flight (HTTPS/SSL/TLS on API, JDBC)
  – Parameter Group: require_ssl = true
  – Use Amazon Redshift cluster SSL certificate to verify cluster identity
• Encryption at rest
  – AES-256 encrypt files prior to loading to S3 (not using S3 SSE)
  – Amazon Redshift encryption
    • Specified at cluster creation, applies to backups/snapshots too
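On the client side, verifying cluster identity usually means asking the driver to check the server certificate against a trusted CA bundle. The sketch below builds a libpq-style connection string with `sslmode` and `sslrootcert`, which are standard PostgreSQL client parameters; the host name, user, and certificate path are hypothetical, and the slides use JDBC, where the property names differ.

```python
def redshift_dsn(host, dbname, user, ca_bundle_path, port=5439, sslmode="verify-full"):
    """Build a libpq-style connection string that requires a verified TLS session."""
    params = {
        "host": host,
        "port": port,                   # 5439 is the default Amazon Redshift port
        "dbname": dbname,
        "user": user,
        "sslmode": sslmode,             # verify-full checks the certificate chain and host name
        "sslrootcert": ca_bundle_path,  # CA bundle used to verify the cluster certificate
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


# Hypothetical endpoint and paths, for illustration only.
dsn = redshift_dsn(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    dbname="warehouse",
    user="etl_loader",
    ca_bundle_path="/etc/ssl/redshift-ca-bundle.crt",
)
```

Setting `require_ssl = true` in the parameter group rejects unencrypted connections on the server side; the client-side check is what protects against connecting to an impostor endpoint.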
![Page 21: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/21.jpg)
• Amazon Redshift will store the cluster key in a single customer-premises HSM (or CloudHSM)
  – SafeNet Luna SA HSM, firmware version should match CloudHSM
  – Requires certificate exchange between cluster and HSM
  – Requires the cluster to have an EIP
    • On our side, required static 1-to-1 NAT of HSM private IP
    • VPC Security Groups still apply; can still isolate cluster from others
  – Encrypted database key decrypted in HSM, passed over encrypted channel to cluster on startup, stored in memory to decrypt data encryption (block) keys
  – If running an HSM HA group, must synchronize keys after creation
![Page 22: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/22.jpg)
• HSM integration was critical to Nasdaq adoption
• Monitor cluster access, react to any unauthorized connections
  – STL_CONNECTION_LOG
    • Query system table on a timed basis, alert to any unexpected access
  – CloudTrail to Splunk Amazon Redshift connection & user logs
    • Captures all API calls, not activity inside Amazon Redshift
  – STL_DDLTEXT
    • Audits all schema changes in the cluster
• In response to an alert, Amazon Redshift/HSM connectivity is severed, and cluster is immediately shut down
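The timed check against STL_CONNECTION_LOG might look roughly like the sketch below: a query for recent authenticated sessions, plus a filter that flags any source host or user not on an allowlist. The one-hour window, the allowlist approach, and the sample rows are assumptions for illustration; the slides do not describe the actual alerting logic.

```python
# Query to run on a schedule; the one-hour window is an arbitrary example.
RECENT_CONNECTIONS_SQL = """
SELECT recordtime, remotehost, username, dbname
FROM stl_connection_log
WHERE event = 'authenticated'
  AND recordtime > DATEADD(hour, -1, GETDATE())
"""


def unexpected_connections(rows, allowed_hosts, allowed_users):
    """Return rows whose source host or user is not on the allowlist."""
    flagged = []
    for recordtime, remotehost, username, dbname in rows:
        # System table character columns can come back space-padded.
        host, user = remotehost.strip(), username.strip()
        if host not in allowed_hosts or user not in allowed_users:
            flagged.append((recordtime, host, user, dbname.strip()))
    return flagged


# Sample rows standing in for a query result (values are made up).
sample_rows = [
    ("2014-11-12 09:00:01", "10.0.1.15 ", "etl_loader ", "warehouse "),
    ("2014-11-12 09:03:44", "203.0.113.9 ", "etl_loader ", "warehouse "),
]
alerts = unexpected_connections(sample_rows, {"10.0.1.15"}, {"etl_loader"})
```

Any non-empty `alerts` list would then feed whatever response is in place, such as the connectivity cut-off and shutdown described on the slide.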
![Page 23: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/23.jpg)
• With validation, data integrity, and security requirements met, the challenge remains to optimize ingest
• Why?
  – Concurrency is a huge performance factor; can’t afford to be loading yesterday’s data when clients are running queries
![Page 25: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/25.jpg)
[Chart: S3 (over HTTPS) Multithreaded Throughput. X axis: Concurrent Threads (1 to 18). Y axis: Throughput (MB/sec), 0 to 140.]
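Multithreaded upload of load files can be sketched with a thread pool. The actual uploader used in the talk is not shown, so `upload_one` is a placeholder callable (here a fake that only computes a key), and the default of 8 threads is arbitrary; the right number depends on measurements like those in the chart.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(paths, upload_one, max_threads=8):
    """Upload files concurrently; returns {path: result} once all uploads finish."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        results = pool.map(upload_one, paths)  # results come back in input order
        return dict(zip(paths, results))


def fake_upload(path):
    """Stand-in for a real HTTPS PUT to S3; returns the key it would write."""
    return "ingest/" + path.rsplit("/", 1)[-1]


uploaded = upload_all(["/data/a.gz", "/data/b.gz", "/data/c.gz"], fake_upload, max_threads=3)
```

Threads suit this workload because each upload spends most of its time waiting on the network; any exception raised by an individual upload surfaces when its result is read.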
![Page 26: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/26.jpg)
![Page 27: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/27.jpg)
[Diagram: ingest architecture. Columns: On premises; AWS Regional (Multi-AZ) Scope; AWS (US-East, primary AZ/VPC). Components: RMS Input Sources (multiple systems), Data Ingest Process, MySQL, HSM Key Appliance Cluster, S3 (Redshift load files/manifests; Redshift snapshots/backups), Amazon SNS (Data Loaded topic), Redshift Database Cluster]
![Page 28: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/28.jpg)
![Page 29: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/29.jpg)
![Page 30: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/30.jpg)
![Page 31: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/31.jpg)
![Page 32: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/32.jpg)
![Page 33: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/33.jpg)
![Page 34: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/34.jpg)
![Page 35: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/35.jpg)
Kevin Diamond, Nordstromrack.com | HauteLook
![Page 36: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/36.jpg)
![Page 37: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/37.jpg)
![Page 38: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/38.jpg)
![Page 39: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/39.jpg)
![Page 40: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/40.jpg)
![Page 41: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/41.jpg)
![Page 42: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/42.jpg)
![Page 43: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/43.jpg)
![Page 44: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/44.jpg)
![Page 45: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/45.jpg)
![Page 46: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/46.jpg)
Amazon Redshift
![Page 47: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/47.jpg)
![Page 48: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/48.jpg)
![Page 49: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/49.jpg)
[Diagram labels: Staging, Prod, EMR, Data Pipeline (one per environment)]
![Page 50: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/50.jpg)
Staging Prod
![Page 51: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/51.jpg)
![Page 52: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/52.jpg)
![Page 53: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/53.jpg)
• medium speed, medium storage, $3.7k/month, awesome support
• small storage, $3.7k/month, awesome support
• medium concurrency, $10k/month, awesome support
![Page 54: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/54.jpg)
![Page 55: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/55.jpg)
![Page 56: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/56.jpg)
![Page 57: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/57.jpg)
Total Storage
Daily Transfer
Monthly Growth
Monthly Spend
Estimated 3yr Savings
![Page 58: (BDT206) See How Amazon Redshift is Powering Business Intelligence in the Enterprise | AWS re:Invent 2014](https://reader034.vdocuments.net/reader034/viewer/2022042816/559445861a28ab1a738b458e/html5/thumbnails/58.jpg)