70-534: Architecting Microsoft Azure Solutions


Upload: bopomaji

Post on 07-Aug-2015



Big Data Briefing: How to implement Big Data on Windows Azure

Big Data: running cheaply and efficiently on Windows Azure to discover business value from the data you already hold about your business.

Introduction

Agenda

• Windows Azure Intro
• HDInsight Intro
• Architecture
• BI and Visualisations
• Case Studies

Who: Andy Cross, Director of Microsoft Partner Elastacloud, Windows Azure MVP, Big Data specialist and international speaker.

Time until next coffee: 55:00

Timings: 30 minutes ▪ 30 minutes ▪ 30 minutes ▪ 15 minutes ▪ 15 minutes

Find out more at http://www.Microsoft.com

Introduction to Windows Azure

Today’s Transformation – Cloud

Usage Based ▪ Elastic ▪ Self-Service ▪ Pooled Resources

Economics ▪ Agility ▪ Focus

IT Continuum

Evolution toward highly-virtual and beyond to cloud

Physical ▪ Virtual / Private ▪ IaaS ▪ PaaS ▪ SaaS

Microsoft On The Internet

One of the world’s three largest private networks


400m active accounts

More than 3 billion queries per month

1 billion+ authentications per day

750 million unique visitors at any given time

Microsoft account

Global Foundation Services

The Microsoft Cloud: Globally Distributed Data Centers

Quincy, WA ▪ Chicago, IL ▪ San Antonio, TX ▪ Dublin, Ireland ▪ Generation 4 DCs

Inside a Microsoft Data Center

Cloud Computing Patterns

On and Off – Start/End Semester

[Chart: Compute over time (t), showing an inactivity period]

• On & off workloads (e.g. batch job)
• Over-provisioned capacity is wasted
• Time to market can be cumbersome

Unpredictable Bursting – Web demand

[Chart: Compute over time (t)]

• Unexpected/unplanned peak in demand
• Sudden spike impacts performance
• Can’t over-provision for extreme cases

Predictable Burst – Registration

[Chart: Compute over time (t)]

• Services with micro-seasonality trends
• Peaks due to periodic increased demand
• IT complexity and wasted capacity

Growing Fast – Research Project

[Chart: Compute over time (t)]

• Successful services need to grow/scale
• Keeping up with growth is a big IT challenge
• Cannot provision hardware fast enough
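The trade-off behind these four patterns can be sketched with a little arithmetic. The following is a minimal illustration, not taken from the slides: the demand curve, the capacity figures and the function names are all hypothetical. It compares a fixed provision (sized for the peak, or for something nearer the average) against capacity that follows demand.

```python
from typing import Dict, Sequence


def fixed_capacity_report(demand: Sequence[float], capacity: float) -> Dict[str, float]:
    """Compare a fixed provision against a demand curve (one value per time slot)."""
    wasted = sum(max(capacity - d, 0.0) for d in demand)  # idle, paid-for units
    unmet = sum(max(d - capacity, 0.0) for d in demand)   # demand that could not be served
    return {"provisioned": capacity * len(demand), "wasted": wasted, "unmet": unmet}


def elastic_capacity_report(demand: Sequence[float]) -> Dict[str, float]:
    """Idealised elastic provision: capacity tracks demand exactly (an assumption)."""
    return {"provisioned": float(sum(demand)), "wasted": 0.0, "unmet": 0.0}


# Hypothetical "predictable burst" demand: quiet, a two-slot peak, quiet again.
demand = [2, 2, 2, 10, 10, 2, 2, 2]

sized_for_peak = fixed_capacity_report(demand, capacity=10)
sized_below_peak = fixed_capacity_report(demand, capacity=4)
elastic = elastic_capacity_report(demand)

print(sized_for_peak)    # wasted: 48 of 80 provisioned units sit idle
print(sized_below_peak)  # wasted: 12 units idle, yet 12 units of demand go unmet
print(elastic)           # provisioned: 32, nothing wasted or unmet
```

Sizing for the peak wastes capacity; sizing below it fails users during the burst. Real elasticity is never this exact, since scaling takes time, but the gap between the three reports is the point the slides are making.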

Application building blocks

Storage

Big data

Caching

CDN

Database

Identity

Media

Messaging

Networking

Traffic

Big Data Technologies

HDInsight Summary

• GA, recent preview of Hadoop 2
• Managed Hadoop on Windows Azure
• Familiar tools such as Hive, Pig, Oozie
• Additional BoB Microsoft ecosystem tooling with .NET SDK
• PowerShell and .NET for provisioning
• Execution with .NET and PowerShell for Hive
• Paired with Hortonworks HDP for on-premises Hadoop; compatible with all major Hadoop implementations
• Combined with Excel and traditional Microsoft BI stack for compelling solutions

Position in Cloud: PaaS

Centralised Resources

State – a Windows Azure SQL DB stores metadata to allow identical clusters to be spawned

Data – all your data resides on Windows Azure Blob Storage; resilient, secure and cost effective

Logs – machine state and cluster uptime are stored in Windows Azure Table Storage

Provision hundreds of machines against your data; gain understanding rapidly

Speed to understanding

500 servers.

20 minutes.

40,000,000,000 data points
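To put those three figures in proportion, here is a rough back-of-envelope calculation. It rests on assumptions the slide does not state: that the 20 minutes is processing time, and that the work is spread evenly across all 500 servers.

```python
# Figures from the slide.
servers = 500
minutes = 20
data_points = 40_000_000_000

# Assumption: even distribution of work and 20 minutes of pure processing.
seconds = minutes * 60
points_per_server = data_points / servers
points_per_server_per_second = points_per_server / seconds
cluster_points_per_second = data_points / seconds

print(f"{points_per_server:,.0f} data points per server")               # 80,000,000
print(f"{points_per_server_per_second:,.0f} per server per second")     # 66,667
print(f"{cluster_points_per_second:,.0f} per second across the cluster")  # 33,333,333
```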

Buy the Look

Drive Investment

Exploring Sentiment, Social and Metadata Graphs

[Diagram: graph of Person and Product nodes linked by “Owns”, “Alike” and “Alike?” edges]

Social Media

Microsoft Offerings

Hive ODBC Driver & Hive Add-in for Excel

Integration with Microsoft PowerPivot

Hadoop based distribution for Windows Server & Azure

Strategic Partnership with Hortonworks

JavaScript framework for Hadoop

RTM of Hadoop connectors for SQL Server and PDW

Hadoop on Windows: Differentiation

Insights to all users by activating new types of data

• Integrate with Microsoft Business Intelligence
• Choice of deployment on Windows Server + Windows Azure
• Integrate with Windows Components (AD, System Center)
• Easy installation and configuration of Hadoop on Windows
• Simplified programming with .NET & JavaScript integration
• Integrate with SQL Server Data Warehousing

Microsoft Big Data Roadmap

To accelerate the delivery of Microsoft’s Hadoop based solution for Windows Server and service for Windows Azure, Microsoft is announcing a partnership with Hortonworks.

Microsoft is committed to broadening accessibility and usage of Hadoop to end users, developers and IT professionals in organizations of all sizes.

Microsoft is announcing an end-to-end roadmap for Big Data that embraces Apache Hadoop™ by distributing enterprise class Hadoop based solutions on both Windows Server and Windows Azure.

Microsoft is extending its leadership in business intelligence and data warehousing to provide insights to all users by activating new types of data of any size.

Microsoft’s value adds: Why Elastacloud recommend Microsoft’s platform

Skill reuse

• Express elegant solutions in C#: your existing development team can immediately realise value
• Familiar Unit Testing patterns: the frameworks facilitate deterministic testing for highly reliable queries
• Concise programmatic terseness: complex logic is best expressed in programmatic form

Commoditised query

Provision

Execute

De-provision

[Chart: value and cost over time; labels: Action, Cost, Value of query]
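The provision, execute, de-provision cycle can be sketched as a small Python context manager. This is an illustration of the pattern only: `EphemeralCluster`, `provisioned` and the hourly rate are hypothetical names and numbers, not part of any Azure SDK, and no real cluster is created.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@dataclass
class EphemeralCluster:
    """Hypothetical stand-in for a cluster that only costs money while it exists."""
    nodes: int
    hourly_rate_per_node: float  # illustrative price, not a real tariff
    hours_billed: float = 0.0
    active: bool = False

    def run(self, query: Callable[[], T], hours: float) -> T:
        """Execute a query and record how long the cluster was busy."""
        if not self.active:
            raise RuntimeError("cluster is not provisioned")
        self.hours_billed += hours
        return query()

    @property
    def cost(self) -> float:
        return self.nodes * self.hourly_rate_per_node * self.hours_billed


@contextmanager
def provisioned(nodes: int, hourly_rate_per_node: float) -> Iterator[EphemeralCluster]:
    cluster = EphemeralCluster(nodes, hourly_rate_per_node)
    cluster.active = True        # provision
    try:
        yield cluster            # execute
    finally:
        cluster.active = False   # de-provision, even if the query fails


with provisioned(nodes=4, hourly_rate_per_node=0.5) as cluster:
    result = cluster.run(lambda: sum(range(10)), hours=2.0)

print(result, cluster.cost, cluster.active)  # 45 4.0 False
```

The `finally` clause is the design choice that matters: the cost of a query is bounded by its own run time, because the cluster cannot outlive the `with` block.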

HDInsight in Azure

Hybrid Compatibility

Hadoop On Premises

[Diagram: example records “Name=Andy Pnid=123456” and “123456 4712”]

Microsoft are the only vendor with enterprise on premises and cloud big data offerings

Just as Big Data is more than Hadoop, Windows Azure is more than HDInsight

HDP ▪ Spark ▪ Storm

Elastacloud provisioned a bespoke 500 core virtual cluster on Windows Azure, allowing us to cut down a compute intensive task from 20 days over 8 cores to under 12 hours over 500 cores.

Jedidiah Francis, Data Scientist, ASOS
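The figures in that quote can be unpacked with some simple arithmetic. The sketch below treats "under 12 hours" as exactly 12 hours, so the speedup and efficiency it prints are lower bounds; "parallel efficiency" here is simply speedup divided by the increase in cores.

```python
# Figures from the quote.
baseline_days, baseline_cores = 20, 8
cluster_hours, cluster_cores = 12, 500   # "under 12 hours", taken as an upper bound

baseline_hours = baseline_days * 24
baseline_core_hours = baseline_hours * baseline_cores   # compute consumed before
cluster_core_hours = cluster_hours * cluster_cores      # compute consumed after

speedup = baseline_hours / cluster_hours                # wall-clock improvement
core_ratio = cluster_cores / baseline_cores             # how many times more cores
efficiency = speedup / core_ratio                       # fraction of ideal scaling

print(baseline_core_hours)  # 3840
print(cluster_core_hours)   # 6000
print(speedup)              # 40.0
print(efficiency)           # 0.64
```

The job used more total core-hours on the cluster, but the answer arrived at least 40 times sooner, which is the trade an elastic platform lets you make.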

Architectural Considerations: Solving Big Data challenges cohesively

The first challenge you face with Big Data is often the hardest

Elastacloud’s tooling aims to solve these challenges

Ingress, Compute, Egress automated; allowing focus on providing value

Workflows: Big Data solutions are rarely the result of a single piece of code running; instead, jobs transform data into intermediate domains before completing an overall piece of work.
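As a minimal sketch of that idea, the workflow below chains three small jobs, each consuming the intermediate output of the one before. The job names and the sample records are hypothetical; on a real cluster each step would be a separate Hive, Pig or MapReduce job, possibly coordinated by a scheduler such as Oozie.

```python
from collections import Counter
from functools import reduce
from typing import Any, Callable, List, Sequence


def parse(lines: Sequence[str]) -> List[List[str]]:
    """Job 1: raw text lines -> [user, action] records."""
    return [line.split(",") for line in lines]


def keep_purchases(records: Sequence[Sequence[str]]) -> List[Sequence[str]]:
    """Job 2: records -> only the purchase events."""
    return [r for r in records if r[1] == "purchase"]


def count_by_user(records: Sequence[Sequence[str]]) -> Counter:
    """Job 3: purchase events -> purchases per user."""
    return Counter(r[0] for r in records)


def run_workflow(data: Any, jobs: Sequence[Callable[[Any], Any]]) -> Any:
    """Feed each job the intermediate result of the previous one."""
    return reduce(lambda intermediate, job: job(intermediate), jobs, data)


raw = ["andy,view", "andy,purchase", "sam,purchase", "andy,purchase"]
totals = run_workflow(raw, [parse, keep_purchases, count_by_user])
print(dict(totals))  # {'andy': 2, 'sam': 1}
```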

The overall architectural challenge of Big Data is that just as Data can vary, so must architectures.

• Batch: Hadoop can operate with O(n) over petabytes of data
• Interactive: Drill, Stinger and Tez bring much quicker querying
• Real Time: Storm allows enormous scalable throughput

[Diagram: Batch, Serving and Speed layers. Components: Static Data, Data Stream, Batch Process, Real Time, Operational Data Store, Precompute, Increment, User Query]
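One way to read the batch, serving and speed layers in this diagram is sketched below: a batch job precomputes a view from static data, a speed layer increments its own view as stream events arrive, and a user query merges the two. This is an interpretation of the labels, not the presenter's implementation; the function names and sample events are hypothetical, and plain in-memory counters stand in for the real stores.

```python
from collections import Counter
from typing import Iterable


def batch_precompute(static_data: Iterable[str]) -> Counter:
    """Batch layer: recompute the whole view from all static data."""
    return Counter(static_data)


class SpeedLayer:
    """Speed layer: keep a small view up to date, one stream event at a time."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def increment(self, event: str) -> None:
        self.view[event] += 1


def user_query(key: str, batch_view: Counter, speed_view: Counter) -> int:
    """Serving layer: merge the precomputed view with the recent increments."""
    return batch_view[key] + speed_view[key]


# Static data already processed by the last batch run.
batch_view = batch_precompute(["a", "b", "a"])

# Events that arrived on the data stream since that run.
speed = SpeedLayer()
for event in ["a", "c"]:
    speed.increment(event)

print(user_query("a", batch_view, speed.view))  # 3
print(user_query("c", batch_view, speed.view))  # 1
```

When the next batch run absorbs the streamed events, the speed view can be reset, so the incremental state only ever covers the gap since the last precompute.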

Technologies possible? Any.

• Batch: Hadoop, Spark
• Interactive: SQL / MySQL, Hadoop 2 / Tez / Drill
• Real Time: Storm, Spark
• Also listed: Lucene, Mongo, Cassandra, Windows Azure Table Storage, Spark

[Diagram: Batch, Serving and Speed layers mapped to technologies. Components: Static Data, Data Stream, HDInsight, Storm, SQL/NoSQL Database, Precompute, Increment, User Query, Windows Azure Service Bus, Blob Persist, Elastacloud Service Bus Spout]

Moving Mediatonic to Windows Azure

AWS didn’t work

• Mediatonic are a leading UK games company
• They had deployed an analytics platform to AWS
• They required quite a bit of manpower to maintain their system operationally
• Their biggest cost was an Amazon Redshift data warehouse which was a bottleneck in their design
• Their system was built so that if a server went down whilst collecting gaming messages they could lose 15 minutes of data
• They found the “Big Data” part so complex that only highly-paid specialists could improve the system

How Windows Azure Helped

• A joint collaboration between Microsoft, Elastacloud and Mediatonic migrated their Game Fuel platform to Windows Azure

• With the Windows Azure Service Bus we were able to ensure that their messaging was continuous and not discrete so no messages were ever lost

• With HDInsight we were able to teach their staff how to build “Big Data” applications

• We were able to deliver a system with unique components of Windows Azure, removing the expensive data warehousing bottleneck but delivering the same functionality
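The difference between discrete and continuous messaging can be illustrated with a toy simulation. It is not the Windows Azure Service Bus API and not Mediatonic's code: `BufferingCollector`, `DurableQueue` and `consume` are hypothetical classes written for this sketch. The first holds messages in memory until a periodic flush; the second keeps each message queued until the consumer explicitly completes it.

```python
from collections import deque
from typing import Deque, List, Optional


class BufferingCollector:
    """Holds messages in memory and persists them in discrete batches."""

    def __init__(self) -> None:
        self.buffer: List[str] = []
        self.stored: List[str] = []

    def receive(self, message: str) -> None:
        self.buffer.append(message)

    def flush(self) -> None:
        self.stored.extend(self.buffer)
        self.buffer.clear()

    def crash(self) -> None:
        self.buffer.clear()  # anything not yet flushed is lost


class DurableQueue:
    """Keeps every message until the consumer explicitly completes it."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()

    def send(self, message: str) -> None:
        self.pending.append(message)

    def peek(self) -> Optional[str]:
        return self.pending[0] if self.pending else None

    def complete(self) -> None:
        self.pending.popleft()


def consume(queue: DurableQueue, store: List[str], fail_on: Optional[str] = None) -> bool:
    """Process messages one by one; return False if the consumer 'crashes'."""
    while (message := queue.peek()) is not None:
        if message == fail_on:
            return False           # crash before completing: message stays queued
        store.append(message)
        queue.complete()
    return True


messages = ["m1", "m2", "m3"]

collector = BufferingCollector()
for m in messages:
    collector.receive(m)
collector.crash()                   # server goes down before the next flush
print(collector.stored)             # []

queue, store = DurableQueue(), []
for m in messages:
    queue.send(m)
consume(queue, store, fail_on="m2")  # consumer crashes part-way through
consume(queue, store)                # restarted consumer picks up where it left off
print(store)                         # ['m1', 'm2', 'm3']
```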

“I can’t believe that I could get up and running with HDInsight doing real work in 30 minutes. It took me days to do the same thing with Elastic Map Reduce on Amazon”

Adam Fletcher – Gamefuel Project Lead

Buy the Look

Initial State

Traditional Analytics

Limited

Traditional Analytics allows for the tracing of single actions on a website; who added a product to a bag, where did they come from, where did they go?

Buy the Look

Advancement Cross Sell

Improvements possible

Beyond immediacy is the ongoing workflow of a system: who added items to a basket after using the Buy the Look functionality? However, this lacks comprehension of customer behaviour.

Cutting edge: Who bought this?

Cutting edge – understand your customer

[Chart: purchase propensity over time, January to February]

Big Data technologies prove that engaged customers return following life events such as pay-days to continue their engagement. The business can then be built on this basis.

Find out more

www.elastacloud.com


Some imagery © nounproject, CC