AWS Big Data Analytics IP Expo 2013
DESCRIPTION
Many companies see data analytics as an opportunity to understand their customers better and gain a lead on their competition. Better insight from vast amounts of unstructured data, coming from a multitude of sources, can give a business the advantage in an industry where even the smallest improvement makes a big difference. Amazon Web Services offers a range of big data, analytics and storage solutions that companies such as NASDAQ, Bankinter and S&P Capital use to deliver a highly secure and agile platform. Join this session to learn how AWS lets customers start small and grow as their business requires, giving them the agility to deliver cutting-edge solutions to their customers without any upfront CAPEX investment.
TRANSCRIPT
![Page 1: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/1.jpg)
Big Data Analytics
David de Santiago
Business Development Manager, Analytics EMEA
![Page 2: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/2.jpg)
Overview
1. Introducing Big Data
2. From data to actionable information
3. Analytics and Cloud Computing
![Page 3: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/3.jpg)
1. Introducing Big Data
![Page 4: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/4.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 5: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/5.jpg)
The cost of data generation is falling
![Page 6: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/6.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Lower cost, higher throughput
![Page 7: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/7.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Lower cost, higher throughput
Highly constrained
![Page 8: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/8.jpg)
Data volume: generated data vs. data available for analysis
Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
![Page 9: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/9.jpg)
Elastic and highly scalable
+ No upfront capital expense
+ Only pay for what you use
+ Available on-demand
= Remove constraints
![Page 10: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/10.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Lower cost, higher throughput
Highly constrained
![Page 11: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/11.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Accelerated
![Page 12: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/12.jpg)
Big Data
Technologies and techniques for working productively with data, at any scale.
![Page 13: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/13.jpg)
2. From data to actionable information
![Page 14: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/14.jpg)
![Page 15: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/15.jpg)
Per day:
3.5 billion records
13 TB of click stream logs
71 million unique cookies
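To put the daily figures above in perspective, the following is a small back-of-the-envelope sketch. It assumes (this is not stated on the slide) that the 3.5 billion records are the click stream log entries, that "TB" means decimal terabytes, and that traffic is spread evenly over the day.

```python
# Back-of-the-envelope scale check for the daily figures on the slide.
# Assumptions (not from the talk): records are the click stream entries,
# 1 TB = 10**12 bytes, and load is uniform across the day.
SECONDS_PER_DAY = 24 * 60 * 60

records_per_day = 3_500_000_000
log_bytes_per_day = 13 * 10**12
unique_cookies_per_day = 71_000_000

records_per_second = records_per_day / SECONDS_PER_DAY
avg_bytes_per_record = log_bytes_per_day / records_per_day
records_per_cookie = records_per_day / unique_cookies_per_day

print(f"{records_per_second:,.0f} records per second on average")
print(f"{avg_bytes_per_record:,.0f} bytes per record on average")
print(f"{records_per_cookie:.1f} records per unique cookie")
```

Under those assumptions the workload averages tens of thousands of records per second, which gives a sense of why a single database server is a poor fit.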
![Page 16: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/16.jpg)
User recently bought a home theatre system
And is now looking at sports games
Targeted Ad
![Page 17: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/17.jpg)
Results:
500% return on ad spend
17,000% reduction in procurement time
“We couldn’t have done it”
![Page 18: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/18.jpg)
![Page 19: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/19.jpg)
Finding signal in the noise of logs
Identified early mobile usage
Invested heavily in mobile development
![Page 20: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/20.jpg)
In January 2013, 9,432,061 unique mobile devices used the Yelp mobile app.
Other features powered by EMR:
• People Who Viewed this Also Viewed
• Review highlights
• Autocomplete as you type on search
• Search spelling suggestions
• Top searches
• Ads
![Page 21: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/21.jpg)
Open web index.
3.4 billion records.
Available to all.
![Page 22: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/22.jpg)
You Are What You Tweet: Analyzing Twitter for Public Health. M. J. Paul and M. Dredze, 2011
Tweeting about Flu
![Page 23: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/23.jpg)
Full parse for impact of social networks
300 lines of Ruby code.
14 hours.
$100.
![Page 24: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/24.jpg)
3. Analytics and Cloud Computing
![Page 25: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/25.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 26: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/26.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
S3, Glacier, Storage Gateway, DynamoDB, Redshift, RDS, HBase
![Page 27: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/27.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
EC2 & Elastic MapReduce
![Page 28: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/28.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
EC2 & S3, CloudFormation, Elastic MapReduce, RDS, DynamoDB, Redshift
![Page 29: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/29.jpg)
Amazon Redshift
Fully Managed Data Warehouse
Scales to 1.6PB
Faster, Simpler, Cheaper
![Page 30: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/30.jpg)
Amazon Redshift

| Pricing option | Effective hourly price per TB | Effective annual price per TB |
| --- | --- | --- |
| On-Demand | $0.425 | $3,723 |
| 1 Year Reservation | $0.250 | $2,190 |
| 3 Year Reservation | $0.114 | $999 |
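The annual figures above follow from the hourly ones. A minimal sketch of the arithmetic, assuming a 365-day year (8,760 hours) and rounding to whole dollars; the function and variable names are illustrative only:

```python
# Reproduce the effective annual price per TB from the hourly price.
# Assumption: a year is 365 days, i.e. 8,760 billable hours.
HOURS_PER_YEAR = 24 * 365

hourly_price_per_tb = {
    "On-Demand": 0.425,
    "1 Year Reservation": 0.250,
    "3 Year Reservation": 0.114,
}

def annual_price_per_tb(hourly_price: float) -> int:
    """Return the annual price per TB, rounded to the nearest dollar."""
    return round(hourly_price * HOURS_PER_YEAR)

annual = {option: annual_price_per_tb(p) for option, p in hourly_price_per_tb.items()}

# Relative saving of the 3-year reservation over on-demand pricing.
saving = 1 - hourly_price_per_tb["3 Year Reservation"] / hourly_price_per_tb["On-Demand"]

for option, price in annual.items():
    print(f"{option}: ${price:,} per TB per year")
print(f"3-year reservation saves about {saving:.0%} versus on-demand")
```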
![Page 31: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/31.jpg)
“Two months to migrate to Amazon Redshift.”
Greg Johnson, Head of Analytics, Nokia
“Towards the end of last year our data volumes literally broke the existing database. We were no longer able to scale the database or do anything useful, like running queries.”
![Page 32: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/32.jpg)
Elastic MapReduce: How does it work?
1. Put the data into S3 (or HDFS)
2. Launch your cluster. Choose:
• Hadoop distribution
• How many nodes
• Node type (hi-CPU, hi-memory, etc.)
• Hadoop apps (Hive, Pig, HBase)
3. Get the results
Diagram: EMR cluster, S3
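The launch choices above (distribution, node count, node type, applications) can be sketched as a request description. This is a hedged illustration, not something shown in the talk: the dictionary mirrors the keyword arguments of boto3's `run_job_flow` call on the EMR client, and the cluster name, bucket, release label, instance type and role names are all placeholder assumptions. The code only builds the request; it does not contact AWS.

```python
# Build (but do not send) an EMR launch request.
# Assumptions: keys follow boto3's EMR run_job_flow keyword arguments;
# the release label, instance type, roles and bucket are placeholders.
def build_emr_request(name, log_uri, core_nodes,
                      instance_type="m5.xlarge",
                      apps=("Hive", "Pig", "HBase"),
                      release_label="emr-6.15.0"):
    if core_nodes < 1:
        raise ValueError("a cluster needs at least one core node")

    def group(role, count):
        # One instance group per role, all on-demand in this sketch.
        return {
            "Name": role.lower(),
            "InstanceRole": role,
            "InstanceType": instance_type,
            "InstanceCount": count,
            "Market": "ON_DEMAND",
        }

    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": release_label,                      # Hadoop distribution
        "Applications": [{"Name": app} for app in apps],    # Hadoop apps
        "Instances": {
            "InstanceGroups": [group("MASTER", 1), group("CORE", core_nodes)],
            # Shut the cluster down once its steps finish, so billing stops.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }

request = build_emr_request("clickstream-analysis",
                            "s3://example-bucket/emr-logs/",
                            core_nodes=4)

# To actually launch (requires AWS credentials and incurs cost):
#   import boto3
#   boto3.client("emr").run_job_flow(**request)
```

Changing `core_nodes` or `instance_type` is all it takes to describe a differently sized cluster, which is the point of step 2.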
![Page 33: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/33.jpg)
Elastic MapReduce: How does it work?
You can easily resize the cluster
Diagram: EMR cluster, S3
![Page 34: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/34.jpg)
Elastic MapReduce: How does it work?
Use Spot nodes to save time and money
Diagram: EMR cluster, S3
![Page 35: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/35.jpg)
Elastic MapReduce: How does it work?
Launch parallel clusters against the same data source (tune for the workload)
Diagram: EMR clusters, S3
![Page 36: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/36.jpg)
Elastic MapReduce: How does it work?
When the work is complete, you can terminate the cluster (and stop paying)
Diagram: EMR cluster, S3
![Page 37: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/37.jpg)
Thousands of Customers, 5+ Million Clusters
![Page 38: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/38.jpg)
Give it a try.
Cost to run a 100-node EMR cluster:
£4.90 / hour
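The quoted figure implies a price per node-hour, which makes it easy to estimate what a job costs when the cluster is terminated afterwards. A minimal sketch, assuming the price scales linearly with node count and that billing is proportional to run time; the slide does not state the instance type or purchasing option, so treat the numbers as illustrative:

```python
from decimal import Decimal

# Figures from the slide: a 100-node cluster costs £4.90 per hour.
CLUSTER_PRICE_PER_HOUR = Decimal("4.90")
CLUSTER_NODES = 100

# Assumption: price scales linearly with the number of nodes.
PRICE_PER_NODE_HOUR = CLUSTER_PRICE_PER_HOUR / CLUSTER_NODES

def cluster_cost(nodes: int, hours) -> Decimal:
    """Estimated cost in pounds of running `nodes` nodes for `hours` hours."""
    return PRICE_PER_NODE_HOUR * nodes * Decimal(str(hours))

print(f"Per node-hour: £{PRICE_PER_NODE_HOUR}")
print(f"100 nodes for 1 hour: £{cluster_cost(100, 1):.2f}")
print(f"10 nodes for 10 hours: £{cluster_cost(10, 10):.2f}")
print(f"100 nodes for 14 hours: £{cluster_cost(100, 14):.2f}")
```

Under these assumptions, 100 nodes for one hour cost the same as 10 nodes for ten hours, so a job that parallelises well can finish sooner at no extra cost.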
![Page 39: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/39.jpg)
Generation
Collection & storage: S3, Glacier, Storage Gateway, DynamoDB, Redshift, RDS, HBase
Analytics & computation: EC2 & Elastic MapReduce
Collaboration & sharing: EC2 & S3, CloudFormation, Elastic MapReduce, RDS, DynamoDB, Redshift
AWS Data Pipeline
![Page 40: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/40.jpg)
AWS Data Pipeline
Data-intensive orchestration and automation
Reliable and scheduled
Easy to use, drag and drop
Execution and retry logic
Map data dependencies
Create and manage temporary compute resources
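Two of the ideas listed above, mapping data dependencies and execution with retry logic, can be illustrated with a few lines of plain Python. This is a conceptual sketch only and does not use or represent the AWS Data Pipeline API; the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

def run_pipeline(tasks, dependencies, max_attempts=3):
    """Run tasks in dependency order, retrying each failed task.

    tasks:        maps task name -> zero-argument callable
    dependencies: maps task name -> set of task names it depends on
    Returns the number of attempts each task needed.
    """
    attempts = {}
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(1, max_attempts + 1):
            attempts[name] = attempt
            try:
                tasks[name]()
                break  # task succeeded, move on to the next one
            except Exception:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
    return attempts

# Hypothetical three-stage pipeline with one transient failure.
log = []
calls = {"aggregate": 0}

def copy_logs():
    log.append("copy_logs")

def aggregate():
    calls["aggregate"] += 1
    if calls["aggregate"] < 2:
        raise RuntimeError("transient failure")
    log.append("aggregate")

def load_warehouse():
    log.append("load_warehouse")

tasks = {"copy_logs": copy_logs, "aggregate": aggregate, "load_warehouse": load_warehouse}
dependencies = {"aggregate": {"copy_logs"}, "load_warehouse": {"aggregate"}}

attempts = run_pipeline(tasks, dependencies)
print(log)
print(attempts)
```

A managed service adds scheduling and provisioning of temporary compute on top of this, but the ordering and retry behaviour is the core of what "reliable" means here.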
![Page 41: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/41.jpg)
Anatomy of a pipeline
![Page 42: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/42.jpg)
Arbitrarily complex pipelines
![Page 43: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/43.jpg)
Thanks. [email protected]
To Learn More:
aws.amazon.com/elasticmapreduce
aws.amazon.com/datapipeline
aws.amazon.com/big-data
aws.amazon.com/redshift
aws.amazon.com/rds
![Page 44: AWS Big Data Analytics IP Expo 2013](https://reader033.vdocuments.net/reader033/viewer/2022052410/54b6c7554a795996608b45e2/html5/thumbnails/44.jpg)
Thank you!