Big Data: Turning your data problem into a competitive advantage
Barak Regev, Head of Cloud Platform - EMEA
Managing big data is hard
20 min in 1 minute
Put your data to work for you.
There is a better way
How we do it - Google Infrastructure
4 billion hours of video per month
425 million Gmail users
100,000,000 GB web Index
0.25 secs to search results
Defining Big Data: Practical problems & opportunities
“How are hotel reservations for Spain from New York compared with this time last year?”
“Do we need to adjust our marketing campaign? Where?”
CenterParcs - European hospitality
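A question like the reservation comparison above reduces to a filtered count grouped by year. The following is a minimal in-memory sketch of that idea, not the customer's actual pipeline; the record layout, the sample rows, and the function names are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical reservation records: (booking_date, origin_city, destination_country)
reservations = [
    (date(2012, 5, 3), "New York", "Spain"),
    (date(2012, 5, 9), "New York", "Spain"),
    (date(2012, 5, 14), "London", "Spain"),
    (date(2013, 5, 2), "New York", "Spain"),
    (date(2013, 5, 7), "New York", "Spain"),
    (date(2013, 5, 21), "New York", "Spain"),
    (date(2013, 5, 25), "New York", "France"),
]

def bookings_by_year(rows, origin, destination, month):
    """Count reservations per year for one origin/destination pair in a given month."""
    counts = Counter()
    for booked_on, row_origin, row_destination in rows:
        if (row_origin, row_destination) == (origin, destination) and booked_on.month == month:
            counts[booked_on.year] += 1
    return counts

def year_over_year_change(counts, year):
    """Relative change versus the previous year, or None if there is no baseline."""
    previous = counts.get(year - 1, 0)
    if previous == 0:
        return None
    return (counts.get(year, 0) - previous) / previous

counts = bookings_by_year(reservations, "New York", "Spain", month=5)
change = year_over_year_change(counts, 2013)
print(counts[2012], counts[2013], change)
```

At real scale the same filter-and-group step would run in a query engine rather than a Python loop, but the shape of the question is the same.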
“Which users who signed up last quarter have also advanced at least 3 levels and purchased an item worth more than $5?”
Claritics - mobile & social user analytics
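The user question above combines three conditions: sign-up quarter, levels advanced, and purchase value. As a rough sketch of that segment filter, assuming a hypothetical `User` record (the field names and sample data are illustrative, not Claritics' schema):

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class User:
    # Hypothetical user record; field names are assumptions for illustration.
    user_id: str
    signed_up: date
    levels_advanced: int
    purchases: list = field(default_factory=list)  # item prices in USD

def in_quarter(day, year, quarter):
    """True if the date falls in the given calendar quarter (1-4) of the year."""
    return day.year == year and (day.month - 1) // 3 + 1 == quarter

def matching_users(users, year, quarter, min_levels=3, min_price=5.0):
    """IDs of users who signed up in the quarter, advanced enough levels,
    and bought at least one item priced above min_price."""
    return [
        u.user_id
        for u in users
        if in_quarter(u.signed_up, year, quarter)
        and u.levels_advanced >= min_levels
        and any(price > min_price for price in u.purchases)
    ]

users = [
    User("a", date(2012, 8, 1), 4, [0.99, 7.50]),  # meets all three conditions
    User("b", date(2012, 9, 15), 2, [9.99]),       # too few levels
    User("c", date(2012, 7, 20), 5, [4.99]),       # no purchase above $5
    User("d", date(2012, 4, 2), 6, [12.00]),       # signed up in Q2, not Q3
]
result = matching_users(users, year=2012, quarter=3)
print(result)
```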
Business & IT trends driving Big Data
Opportunities
Data is a core business asset
Increasingly data is out in the Cloud (e.g. social, CRM)
New things are possible in the Cloud (unique algorithms, scale)
Greatly increased speed of sharing and iteration
Challenges
Information is growing faster than the ability to leverage it
Tough for enterprises to capture all the data they generate
Scaling traditional BI for Big Data can be hard
Skills: requires IT, analytics, software development
What does Big Data look like?
Diverse industries
Retail point of sales transactions
User activity logs (mobile & social)
Mobile telemetry & smart devices
Industrial & manufacturing
Financial trading
Medical research (e.g. genomics)
Movie rendering & production
Some common characteristics
Structured, semi-structured, unstructured
Millions if not billions of rows
Too large to process on a single machine
Too large to store on a single machine
High rate of growth
More daily
Put the Data to work: Google cloud services for Big Data
Composable cloud services
Focus on the solution rather than on the infrastructure
Do new things that weren't possible before
Pay for what you use.
Use the cloud
BIG DATA LOG ANALYSIS
[Diagram: log analysis architecture]
Data sources: POS, Clickstream, RFID, Customer Loyalty, Ad clickthroughs.., Corporate data, 3rd party data
Store all your data in the cloud: Scalable Storage
Analyze interactively (Product Affinity, Market Basket etc): BigQuery, SQL, API
Securely share/distribute the results: Google Spreadsheets, App Engine App, Other BI Tools, Data sets for further Analysis
Consumers: Marketing, Merchandising, Local Stores, Partners
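Market basket and product affinity analysis, named in the diagram above, starts from counting how often two products appear in the same transaction. Here is a small single-machine sketch of that counting step; the basket data and function names are hypothetical, and a production workload would push the same logic into a query engine.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def support(counts, pair, n_baskets):
    """Fraction of all baskets that contain both products of the pair."""
    return counts[tuple(sorted(pair))] / n_baskets

# Hypothetical point-of-sale baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
counts = pair_counts(baskets)
print(counts.most_common())
print(support(counts, ("milk", "bread"), len(baskets)))
```

Support is only the simplest affinity measure; confidence or lift can be derived from the same pair counts plus per-item counts.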
Scaling large ads reporting
Customer load test: On-prem MySQL vs BigQuery
[Chart: latency (seconds) vs. # days of data]
Business: ads authoring tools and reporting
Data: ad serving logs for 500 websites, ~300M rows/day
Problem solved: interactively finding new trends and patterns
A New Hadoop Terasort World Record
What did we learn?
Store data with reliability, redundancy and consistency
Go from Data to Meaning
At Scale
...fast
Google white papers
Google File System (2003)
MapReduce: Simplified Data Processing on Large Clusters (2004)
BigTable: A Distributed Storage System for Structured Data (2006)
Dremel: Interactive Analysis of Web-Scale Datasets (2010)
Machine Translation (2004-2011)
The virtuous cycle of data
Build application (GAE / GCE)
Collect Data (Cloud Storage, Datastore, Logstore)
Process Data (App Engine, GCE)
Analyze Data (BigQuery)
(improve)
Build the Next Generation of Data-Centric Applications
BigQuery use cases in industry
Ad Spend Attribution (online travel reservations): Mash up AdWords + Google Analytics data + customer reservations for high volume attribution analysis
Media consulting (global top-5 media agency): Analyze 20GB/day of DoubleClick display ads performance metrics for F500 clients
Ad authoring tools (online ads authoring): Deliver x-platform performance analytics dashboards to 100s of ads authoring customers
Revenue optimization (holiday/travel properties): Measure x-media campaign effectiveness to maximize occupancy rates
Social gaming (data analytics vendor): Cohort analysis on million+ gamers to monetize massive online social gaming
Business Requirements
A single place to capture growing data
Combine data from different sources
Ad hoc detection of patterns and correlations
Easily share data insights with org
Distribute data-based decision making
Interactively analyze 450M rows of sales data
BIME + BigQuery
Mobile & social gaming user analysis
Notice trend change
Slice user data, identify segments
Compare segments vs general population
Revenue optimization - hospitality industry
New solution for real-time decision making
Saves more than $150,00 a year
[Diagram components: Oracle DB, Netezza appliance, Cloud Storage, BigQuery, AppEngine; users: Analysts, Execs, BI team, Regional Sales]
cloud.google.com
Thank you