Connecta Event: Big Query and data analytics with Google Cloud Platform

© Connecta - Confidential

Upload: connectadigital

Post on 27-Jan-2015

Category: Data & Analytics



DESCRIPTION

Advanced data analytics and "big data" have climbed the trend lists in recent years and are now among the highest-priority areas in the development of new services and products for leading companies in the digital landscape. The information that builds up in systems as customer interactions are digitalized has proven to be worth its weight in gold. It holds everything we need to know to make our business more efficient.

Since the summer of 2013, Connecta has had an established partnership with Google to help our customers move to cloud services for, among other things, advanced data analytics. To prepare ourselves to help our customers, we have spent a number of years building both knowledge and experience of Google's various cloud products, such as "Big Query". Big Query is a cloud-based analytics tool and part of Google Cloud Platform. It makes it possible to run queries against enormous datasets in just a few seconds. Big Query and Google Cloud Platform offer ready-made solutions for setting up and maintaining the infrastructure that makes all of this possible with simple means.

At Connecta Digital Consulting's third event of the spring, we introduced our customers and partners to the concepts of data analytics and Big Query. The event covered the following points:
- Big Data and Business Intelligence (BI)
- "The Google Big Data tools": success factors and how to get started
- Google Cloud Platform and how to carry out a successful cloud initiative

We presented cases and shared important lessons learned from our work with Google and our customers.

TRANSCRIPT

Page 1: Connecta Event: Big Query and data analytics with Google Cloud Platform

Page 2:

1) A real customer need: our customers find it hard to produce meaningful analyses
2) Great potential and business opportunity in today's enormous volumes of data
3) Innovation and knowledge are developing fast, and it is happening now!

Page 3:

■  What is Big Data?

■  The Google Cloud Platform

■  Big Data on the Google Cloud Platform - Big Query

■  Case study - Casual gaming

■  Demo - Swedish election with Big Query and Tableau

■  Summary - The benefits of Big Data

Agenda

Page 4:

•  Swedish consultancy that exists to realize the items on management's agenda. From strategy to transformation and value creation

•  About 700 consultants in
   -  Digital Consulting
   -  Management Consulting
   -  Enterprise Consulting
   -  AM and Infrastructure

•  Revenue of about SEK 800 million; listed on the Nordic Exchange.

•  We make our customers more competitive by combining business-strategy thinking, technical expertise and the ability to move from words to action.

Page 5:

“90% of the data in the world today was created in the last 2 years alone”

http://www.forbes.com/sites/ciocentral/2013/01/15/big-data-get-ready-for-the-2013-big-bang/

Page 6:

Page 7:

Big Data on the top of the agenda

Page 8:

Top technology priority The 2013 CIO agenda (and 2012, 2009, 2008, 2007…)

Page 9:

Page 10:

data is the oil of the 21st century

Page 11:

What is Big Data?

Page 12:

▪  “Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications”

▪  The 3 V’s of Big Data

Introduction to Big Data

Page 13:

“Now that we have all this data we have to ask the pivotal question; can it be trusted? This is the essence of Veracity.”

The 4th V: Veracity

Edd Dumbill. Planning for Big Data: A CIO’s Handbook to the Changing Data Landscape. O’Reilly Media, 2012

Page 14:

Big data is about the business value it provides

▪  Unless business needs are met, the data and the plan it drives are missing the vital element of value

▪  Value comes when you find insights you wouldn’t have found otherwise and when you start making better decisions

▪  Try to quantify the value and communicate it across the organization

Page 15:

Page 16:

Page 17:

Challenges

Page 18:

Page 19:

Key Challenges in Big Data

Information Strategy:

■  What is your plan with Big Data?

Enterprise & External Information Management:

■  Information is everywhere – volume, variety, velocity – and it keeps growing!

Technical threshold and competence

■  How will you start the work and who will do it?

Page 20:

Solution

Page 21:

Information Strategy:

■  Make it a top management issue and make somebody take responsibility for the effort

■  Connect your corporate strategy with your information strategy

■  Transforming company culture to be data-driven

Enterprise & External Information Management:

■  Ensuring reliable and consistent data by structured work with Master Data Management (MDM)

■  The information must be used in the organization; veracity is crucial

Solution to Key Challenges in Big Data

Page 22:

Technical threshold and competence

■  Choose the technical solution that fits your needs and resources

■  Secure people whose competence covers the overall picture, so that the work can start

■  Start with small pilot projects to show the business value it can bring

Solution to Key Challenges in Big Data

Page 23:

Cloud Platform Big Data Session with Connecta, April 24, 2014
Guillaume Leygues, Enterprise Cloud Platform Sales Engineer, Benelux & Nordics
André Hoekzema, Enterprise Cloud Platform Lead, Benelux & Nordics

Page 24:

“Enabling Technology for Disruptive Business Models”

Page 25:

Agenda 25th, 2014

1  Google Cloud Platform Introduction, Gaining Momentum
2  Big Data on Google Cloud Platform
3  Discussion

Page 26:

- Google’s Mission Statement

“Organize the world’s information and make it universally accessible and useful.”

Page 27:

Building Products that Scale

Google Maps Gmail Google Drive

Page 28:

Developing at Google scale means encountering Google-sized challenges.

Page 29:

For the past 15 years, Google has been building out the world’s fastest, most powerful, highest quality cloud infrastructure on the planet.

Images by Connie Zhou

Page 30:

Google has been running some of the world’s largest distributed systems with unique and stringent requirements.

Images by Connie Zhou

Page 31:

A Network that Spans the Globe

Page 32:

Google's Global OpenFlow Network

Page 33:

Innovating Software & Driving Technology Forward

[Timeline figure, axis 2002 to 2013: GFS, MapReduce, Big Table, Dremel, Colossus, Spanner, Compute Engine]

Page 34:

Compute: Compute Engine, App Engine

Storage: Cloud Storage, Cloud SQL, Cloud Datastore

App Services: BigQuery, Cloud Endpoints

Page 35:

The Last Year in the Cloud Platform

May 2013: Google Compute Engine (Preview), PHP for App Engine (Preview), Big JOIN in BigQuery

July 2013: Dedicated Memcache, Offline Disk Import

August 2013: Layer 3 Load Balancing, Encryption at Rest for Cloud Storage

November 2013: Cloud Endpoints GA, Dedicated Memcache GA

December 2013: Compute Engine GA, Live Migration, Persistent Disks

February 2014: HIPAA Support, Cloud SQL GA

Page 36:

Source: Google Internal Data

4.75 Million active applications

Page 37:

Investments in Cloud Platform

Page 38:

We can do better

Lower and simplify pricing

Make developers more productive

Page 39:

Prices are falling

•  Public cloud prices have dropped 6-8% annually

[Chart: Public Cloud Prices, 2006 to 2014]

Source: Google Internal Data

Page 40:

But prices are not falling fast enough

•  Hardware costs have dropped 20-30% annually

•  Public cloud prices have dropped 6-8% annually

[Chart: Hardware Cost and Public Cloud Prices, 2006 to 2014]

Source: Google Internal Data
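To make the gap between the two annual rates concrete, the following sketch compounds them over the chart's 2006 to 2014 span. It assumes a constant rate applied once per year for eight years, which the slide does not state; the function name is illustrative.

```python
def cumulative_drop(annual_drop: float, years: int) -> float:
    """Fraction by which a price has fallen after compounding a yearly drop."""
    return 1.0 - (1.0 - annual_drop) ** years


# Assumption: eight yearly steps, matching the 2006-2014 axis of the chart.
YEARS = 2014 - 2006

for label, low, high in [
    ("Public cloud prices", 0.06, 0.08),  # annual range from the slide
    ("Hardware costs", 0.20, 0.30),       # annual range from the slide
]:
    print(
        f"{label}: {cumulative_drop(low, YEARS):.0%} to "
        f"{cumulative_drop(high, YEARS):.0%} cumulative drop"
    )
```

Under these assumptions, public cloud prices fall by roughly 39% to 49% over the period, while hardware costs fall by roughly 83% to 94%.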

Page 41:

Pricing Updates (Effective April 1st, 2014)

35% price drop on Compute Engine, across all sizes, regions, and classes

37% price drop on App Engine frontend instance hours, 33% on Datastore writes and 50% on Dedicated Memcache

68% price drop on Cloud Storage

On Demand pricing reduced by 85% - $5/TB

Page 42:

You should get the best price with...

No Upfront Payments

No Lock-in

No Complexity

Page 43:

Sustained-use discounts

[Chart: Net Price Per Hour ($0.03 to $0.11) against Sustained Use (0% to 100%), comparing Previous On Demand with New On Demand]

Page 44:

Sustained-Use Pricing

30% net reduction on Compute Engine instances with 24x7 use
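One way a 30% net reduction at 24x7 use can arise is through incremental tiers, where each further quarter of the month is billed at a lower rate. The tier values below (100%, 80%, 60%, 40% of the base rate) are an assumption for illustration and do not appear on the slide; the function name is hypothetical.

```python
# ASSUMPTION (not on the slide): each successive quarter of the month is
# billed at 100%, 80%, 60% and 40% of the base hourly rate.
TIER_MULTIPLIERS = (1.0, 0.8, 0.6, 0.4)


def net_price_fraction(usage: float) -> float:
    """Average fraction of the base rate paid when an instance runs for
    `usage` (a value in (0, 1]) of the billing month."""
    if not 0.0 < usage <= 1.0:
        raise ValueError("usage must be in (0, 1]")
    tier_width = 1.0 / len(TIER_MULTIPLIERS)
    billed = 0.0
    for i, multiplier in enumerate(TIER_MULTIPLIERS):
        tier_start = i * tier_width
        # Portion of this tier that the instance actually used.
        used_in_tier = min(max(usage - tier_start, 0.0), tier_width)
        billed += used_in_tier * multiplier
    return billed / usage


for usage in (0.25, 0.5, 0.75, 1.0):
    print(f"{usage:.0%} of month -> pays {net_price_fraction(usage):.0%} of base rate")
```

With these assumed tiers, an instance running the whole month pays 70% of the base rate on average, which matches the 30% net reduction quoted above.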

Page 45:

•  Managed VMs
•  The flexibility of Compute Engine
•  The productivity of App Engine
•  Provides best of both worlds
•  IaaS + PaaS

Flexibility and Management

Managed VMs

Page 46:

•  Use the tools you know and love
•  Fast, reliable deployments
•  Isolate and fix issues in production with Continuous Integration

Time to Market and Robust Design

Developer Productivity

Page 47:

1000X BigQuery Streaming

•  Near real-time analysis
•  High fidelity, low latency
•  Focus on results, not sharding and transforming

$0.01 per 100,000 rows
Real time availability of data
100,000 rows per second
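As a rough illustration of what these figures mean for a workload, the sketch below applies the quoted price and rate to a daily event volume. The volume of 5 million rows per day is a hypothetical example, not a number from the presentation.

```python
PRICE_PER_100K_ROWS_USD = 0.01  # from the slide
MAX_ROWS_PER_SECOND = 100_000   # from the slide


def streaming_cost_usd(rows: int) -> float:
    """Streaming cost for `rows` rows at the quoted price."""
    return rows / 100_000 * PRICE_PER_100K_ROWS_USD


def seconds_at_max_rate(rows: int) -> float:
    """Time needed to stream `rows` rows at the quoted maximum rate."""
    return rows / MAX_ROWS_PER_SECOND


daily_rows = 5_000_000  # hypothetical daily event volume
print(f"cost per day: ${streaming_cost_usd(daily_rows):.2f}")
print(f"seconds at full rate: {seconds_at_max_rate(daily_rows):.0f}")
```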

Page 48:

•  Deployment Manager
•  Replica Pools
•  Cloud DNS
•  Windows Server, SuSE, RHEL support

and so much more...

Page 49:

Agenda 25th, 2014

1  Google Cloud Platform Introduction, Gaining Momentum
2  Big Data on Google Cloud Platform
3  Discussion

Page 50:

http://www.google.org/flutrends/

Detecting Flu Trends

Page 51:

Speech Recognition

Page 52:

Data availability
•  Applications at the heart of business interactions
•  Devices and sensors
•  Lower cost of storage & ingestion

Ability to process
•  New programming models
•  New scale and capabilities for SQL
•  Easily available software (Open Source)

Cloud consumption model
•  Easy on-ramp, cost effective experimentation
•  Unlimited scale, low TCO
•  Combine Open Source software and platform services

Key drivers in the growth of Big Data

Page 53:

BigQuery and Datastore Connectors

Mix and match storage and computation from OSS and Google Cloud Platform

[Diagram: Hadoop Applications (HBase, Hive, Pig) on Hadoop, linked through the BigQuery Connector, Datastore Connector and Cloud Storage Connector to BigQuery, Datastore and Google Cloud Storage]

Hadoop, Pig, HBase, and Hive are trademarks of the Apache Software Foundation.

Page 54:

BigQuery Innovation Momentum

[Timeline figure, Q2 2012 to Q4 2013 and "Today". Features shown, in extraction order:]
•  Launch
•  1000x Streaming rate, Table Views, Table Wildcards, JSON functions, SQL Improvements
•  Google Analytics Integration
•  Streaming API, Table Decorators
•  Large Query Results, Query Caching
•  Analytic functions, Big JOIN
•  Big Aggregates, Timestamp
•  JSON Import, Nested / Repeated Fields
•  Datastore Import, Batch Processing
•  Excel Connector

Page 55:

BigQuery Ecosystem

Chartio

Page 56:

BigQuery Streaming

Ease of use
•  Simplified infrastructure for realtime use cases
•  Stream events row-by-row via simple API

Use cases
•  Server Logs, Mobile apps, Gaming, In-App real time analytics

Low cost: $0.01 per 100,000 rows
Real time availability of data
100,000 rows per second

Customer example:
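To show what "row-by-row via simple API" can look like, here is a minimal sketch that builds a request body in the shape used by BigQuery's `tabledata.insertAll` REST method, as far as I understand it (a `rows` list whose entries carry an `insertId` and a `json` object). It only constructs and prints the payload; it does not call the service. The event fields and the helper name are hypothetical, and the official API reference should be checked before relying on the exact shape.

```python
import json
import uuid


def build_insert_all_body(events):
    """Wrap event dicts in an insertAll-style streaming request body.

    Each row gets a random insertId so that a retried request can be
    de-duplicated on a best-effort basis.
    """
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [
            {"insertId": str(uuid.uuid4()), "json": event}
            for event in events
        ],
    }


# Hypothetical game events; field names are illustrative only.
events = [
    {"player_id": "p1", "event": "level_complete", "ts": "2014-04-24T10:00:00Z"},
    {"player_id": "p2", "event": "ad_shown", "ts": "2014-04-24T10:00:01Z"},
]

body = build_insert_all_body(events)
print(json.dumps(body, indent=2))
```

In practice a client library would normally handle authentication and the HTTP call; the point here is only that each event is sent as one self-describing row.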

Page 57:

Google Analytics + BigQuery

Google Analytics Premium Platform Google BigQuery Data Pipeline

Native Data Pipeline to Load Data into BigQuery Project

Page 58:

Google Analytics + BigQuery Customers

Page 59:

BigQuery in Action

“The interactive performance of Google BigQuery, combined with Tableau’s intuitive visualization tools, enabled our analysts to interactively explore huge quantities of data – hundreds of millions of rows – with incredible efficiency. Previously, analyses would require hours or days to complete, if they would even complete at all. With Google BigQuery it takes minutes, if that, to process. This time-to-insight was previously impossible.”

– Giovanni DeMeo, Vice President, Global Marketing and Analytics

Page 60:

The simulation cluster ran for nearly two months as part of the ATLAS distributed compute grid, logging over 5 million core-hours, completing 458,000 computationally intensive jobs and processing about 214 million events. The cluster achieved sustained peak throughput of 15,000 jobs per day. “We had a great experience with Google Compute Engine … and think that it is modern cloud infrastructure that can serve as a stable, high performance platform for scientific computing.”

– Dr. Panitkin, CERN Atlas Project

CERN Atlas Compute Grid Extended on GCE

Page 61:

•  1.5TB in 60 seconds

•  8,412 cores

•  Google Compute Engine

MapR Breaks Minute Record Sort

Page 62:

Thank You

Page 63:

Agenda 25th, 2014

1  Google Cloud Introduction, Gaining Momentum
2  Big Data on Google Cloud Platform
3  Discussion

Page 64:

28 Billion requests per day on App Engine

Page 65:

6.3 Trillion Cloud Datastore operations per month

Page 66:

“[Google's] ability to build, organize, and operate a huge network of servers and fiber-optic cables with an efficiency and speed that rocks physics on its heels.

This is what makes Google Google: its physical network, its thousands of fiber miles, and those many thousands of servers that, in aggregate, add up to the mother of all clouds.”

- Wired

Images by Connie Zhou

Page 67:

Big Data in practice - Understanding player behavior in a Casual game

- Patrik Gottfridsson

Page 68:

■  Simple rules, easy to learn

■  Play in short bursts

■  No long-term commitment

■  Targets a mass audience

What is casual gaming?

Page 69:

Very small revenue per user

●  (Paid)

●  In-App Purchase

●  Ads

Business model

Page 70:

Make it sticky
■  Measure 2nd day retention
■  Optimize across game versions

Reactivate
■  Find the “stales”
■  Send a “miss you” push notification

Encourage
■  Find the “spiders”, the socially connected players
■  Drop their rate of ad shows

Fact-based revenue optimization
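A minimal sketch of how 2nd day retention per game version could be computed from raw session records. It assumes a definition the slides do not spell out: a player counts as retained if they play again on the calendar day after their first recorded session, and is attributed to the version of that first session. The record layout and function name are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def second_day_retention(sessions):
    """sessions: iterable of (player_id, game_version, play_date) tuples.

    Returns {game_version: share of players who played again on the day
    after their first session}.
    """
    days_played = defaultdict(set)
    first_version = {}
    # Process in date order so the first version seen is the first played.
    for player, version, day in sorted(sessions, key=lambda s: s[2]):
        days_played[player].add(day)
        first_version.setdefault(player, version)

    players = defaultdict(int)
    retained = defaultdict(int)
    for player, days in days_played.items():
        version = first_version[player]
        players[version] += 1
        if min(days) + timedelta(days=1) in days:
            retained[version] += 1
    return {v: retained[v] / players[v] for v in players}


# Hypothetical sample data.
sessions = [
    ("a", "1.0", date(2014, 4, 1)),
    ("a", "1.0", date(2014, 4, 2)),  # came back the next day
    ("b", "1.0", date(2014, 4, 1)),  # never returned
    ("c", "1.1", date(2014, 4, 3)),
    ("c", "1.1", date(2014, 4, 4)),  # came back the next day
]
retention = second_day_retention(sessions)
print(retention)
```

Comparing the resulting shares between versions is one simple way to approach "optimize across game versions"; at the data volumes discussed in the presentation, the same grouping would more likely be expressed as a query than run in application code.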

Page 71:

Make it sticky (Big Data)
■  Measure 2nd day retention
■  Optimize across game versions

Reactivate (Big Data)
■  Find the “stales”
■  Send a “miss you” push notification

Encourage (Big Data)
■  Find the “spiders”, the socially connected players
■  Drop their rate of ad shows

Fact-based revenue optimization

Page 72:

CSV upload

Cron import

Google spreadsheets

High level technical solution

Page 73:

Quickly up and running

Avoid upfront license costs

Avoid on-premise hardware

Process millions of events per day

Challenges

Page 74:

Collect everything you can

Segmentation of the data model

Validate your analytical queries

Visualize graphically (obviously)

Success factors

Page 75:

Immediate discoveries about gamer behavior

New campaigns launched to revive “stales” and encourage “spiders”

Continuous follow-up of player statistics at the board level

All in all, better optimized games and an increased profitability

Results

Page 76:

Demo How to make data useful using Google Cloud Platform

Page 77:

60% potential increase in operating margins for retail

Page 78:

> 2x competitive advantage

5-6% higher productivity and profitability

Significantly higher return on equity and market value

Data-driven decision-making

Page 79:

What’s your next step?

Page 80:

Connecta offers:

■  BigQuery Quickstart - Initial analysis, workshops and a running BigQuery solution

■  Cloud Code Workshop - Get your team up to speed on the Google Cloud Platform

■  Cloud Assessment - Analysis, workshops and identification of where a Cloud solution would make your company more competitive

What’s your next step?