Big Data - in the Cloud or Rather On-Premises?

BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
Big Data - in the Cloud or Rather On-Premises?
Guido Schmutz, Kassel, 21.9.2017
@gschmutz [email protected]

Upload: guido-schmutz

Post on 21-Jan-2018

TRANSCRIPT

Page 2: Big Data - in the cloud or rather on-premises?

Guido Schmutz

• Working at Trivadis for more than 20 years
• Oracle ACE Director for Fusion Middleware and SOA
• Consultant, Trainer, Software Architect for Java, Oracle, SOA and Big Data / Fast Data
• Head of Trivadis Architecture Board
• Technology Manager @ Trivadis

More than 30 years of software development experience

Contact: [email protected]
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz

Page 3: Big Data - in the cloud or rather on-premises?

Agenda

1. Cloud Primer
2. Big Data and IoT Architecture
3. Big Data in the Cloud
4. Various Models for Big Data in Cloud
5. Big Data On-Premises
6. Hybrid Big Data Solutions

Page 4: Big Data - in the cloud or rather on-premises?

Cloud Primer

Page 5: Big Data - in the cloud or rather on-premises?

Cloud Primer

Instance
• the thing running in the cloud provider’s infrastructure
• can be a VM but does not have to be

Instance Type
• the size of the instance (combination of CPU, memory and disk storage => cost)
• Azure: instance sizes

Instance Control
• lifecycle of an instance
• instances can be stopped or terminated (deleted)

Page 6: Big Data - in the cloud or rather on-premises?

Cloud Primer

Images
• the template used for provisioning an instance

Serverless
• run code “without” servers => only specify functions (Java, C#, Python, Node.js)
• pay only for the compute time you consume
• easy scale-out
• management and capacity planning decisions are made by the provider

Regions and Availability Zones
• represent the geographic distribution of a cloud provider
• Regions are the geographic areas where a service is offered
• Availability Zones (AZ) add high availability within a Region
• communication between AZs within the same region costs less than communication across regions

Page 7: Big Data - in the cloud or rather on-premises?

Cloud Primer – Specific Instances

On-Demand Instance
• flexible, on-demand usage
• billing increment depends on the provider

Temporary Instance
• can disappear at any time (bid price)
• charged significantly less
• well suited for Hadoop workloads (if storage and compute are separated)
• AWS: spot instances

Reserved Instance
• capacity reserved in advance
• reduced pricing (up to 75% less than on-demand)

Dedicated Instance
• pay for instances
• run on hardware dedicated to you
• Amazon decides placement

Dedicated Host
• pay for the entire physical server
• full flexibility in placing instances (VMs)
• solves issues with existing server-bound licenses

Bare Metal
• bare hardware resources, no virtualization by the cloud provider
• full flexibility / full control
• almost no automation provided
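The pricing differences between these instance kinds can be made concrete with a little arithmetic. The following is only a sketch: the hourly rate and the temporary-instance discount are hypothetical values chosen for illustration, and only the 75% upper bound for reserved instances comes from the slide.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH,
                 discount: float = 0.0) -> float:
    """Cost of one instance running `hours` at `hourly_rate`, less a fractional discount."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return hourly_rate * hours * (1.0 - discount)


# Hypothetical figures, except the 75% reserved-instance upper bound from the slide.
ON_DEMAND_RATE = 0.40      # assumed price per instance hour
RESERVED_DISCOUNT = 0.75   # "up to 75%" means this is the best case
TEMPORARY_DISCOUNT = 0.70  # assumed; real spot prices fluctuate with demand

on_demand = monthly_cost(ON_DEMAND_RATE)
reserved = monthly_cost(ON_DEMAND_RATE, discount=RESERVED_DISCOUNT)
temporary = monthly_cost(ON_DEMAND_RATE, discount=TEMPORARY_DISCOUNT)

for name, cost in [("on-demand", on_demand), ("reserved", reserved), ("temporary", temporary)]:
    print(f"{name:>10}: {cost:8.2f} per month")
```

The comparison ignores the commitment that a reservation implies and the risk that a temporary instance disappears mid-job, both of which matter when choosing between the options.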

Page 8: Big Data - in the cloud or rather on-premises?

Cloud Primer - Storage

Block Storage
• most common type offered by a cloud provider
• disk-like storage
• comes with each instance when provisioned
• accessed as filesystem mounts => volumes, disks
• persistent volumes survive beyond the lifetime of the instance that spawned them
• ephemeral volumes are limited to the life of the instance to which they are attached
• AWS: EBS
• Azure: VHDs & Azure File Storage
• Oracle: Block Storage

Object Storage
• each chunk of data is treated as its own entity, independent of any instance
• content of each object is opaque to the provider
• API or URL is used to access data (no mount)
• well suited for Big Data
• hot and cold storage options
• AWS: S3 & Glacier
• Azure: Azure Blob Storage
• Oracle: Object Storage & Archive Storage
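To illustrate the object storage access model (data addressed by key through an API rather than through a mounted filesystem), here is a minimal in-memory sketch. `ToyObjectStore`, its method names, and the bucket and key names are all hypothetical and do not correspond to any provider's actual API.

```python
import hashlib
from typing import Dict, List


class ToyObjectStore:
    """In-memory stand-in for an object storage service (hypothetical, not a real API)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, bucket: str, key: str, data: bytes) -> str:
        """Store opaque bytes under bucket/key and return a checksum of the content."""
        self._objects[f"{bucket}/{key}"] = data
        return hashlib.sha256(data).hexdigest()

    def get(self, bucket: str, key: str) -> bytes:
        """Fetch an object by key; there is no mount and no directory traversal."""
        return self._objects[f"{bucket}/{key}"]

    def list_keys(self, bucket: str, prefix: str = "") -> List[str]:
        """List keys in a bucket that start with `prefix` (prefixes mimic folders)."""
        start = f"{bucket}/{prefix}"
        return sorted(k.split("/", 1)[1] for k in self._objects if k.startswith(start))


store = ToyObjectStore()
store.put("raw-data", "2017/09/21/events.json", b'{"sensor": 1}')
store.put("raw-data", "2017/09/22/events.json", b'{"sensor": 2}')

payload = store.get("raw-data", "2017/09/21/events.json")
september_21 = store.list_keys("raw-data", prefix="2017/09/21/")
```

The store never interprets the bytes it holds, which mirrors the "opaque to the provider" point above; any structure lives in the key naming convention and in the client that reads the data.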

Page 9: Big Data - in the cloud or rather on-premises?

Cloud Primer – Usage Patterns

Short Lived (Transient)
👍 minimal maintenance, high efficiency
👎 spin-up time, higher resource demand
👎 data transfer to permanent storage

Long Lived (Long Running)
👍 less time waiting for clusters to start/stop
👍 lower resource demand
👎 wasted idle time (if any)
👎 maintenance burden, growing size over time

Self-Service
👍 efficiency of on-demand creation
👎 need to maintain tooling

Managed
👍 easy alignment with budget constraints
👎 waiting time for usage, admin effort

Cloud-Only
👍 data transfers stay within the cloud, minimal on-premises costs, integration with provider
👎 higher cloud expenditure

Hybrid
👍 lower cloud expenditure, local resources available
👎 complex workflows, data transfer costs

Page 10: Big Data - in the cloud or rather on-premises?

Big Data & IoT Architecture

Page 11: Big Data - in the cloud or rather on-premises?

Big Data & IoT Reference Architecture

[Diagram: reference architecture. Labels: Bulk Source, Event Source, Location, Weather, IoT Data, Mobile Apps, File, DB Extract, DB CDC, File CDC, Import, Edge Node, Edge Cluster, Rule Engine, Event Hub, Serverless, Processing, Stream, Stream Analytics, Stream Processing, Reference / Models, State / Results, Batch Analytics, Core Processing, Parallel Processing, Storage (Raw, Refined, Results), SQL / Stream, Search, SQL / Export, Service / Stream / Export, BI Tools, Enterprise Data Warehouse, Search / Explore, Enterprise Apps]

Page 12: Big Data - in the cloud or rather on-premises?

Big Data & IoT Reference Architecture

[Diagram: the reference architecture from page 11, annotated with deployment zones: Cloud / On-Premises, Edge, Internet / Cloud / On-Premises]

Page 13: Big Data - in the cloud or rather on-premises?

1) Bulk Source – Bulk Processing

[Diagram: the reference architecture from page 11]

Page 14: Big Data - in the cloud or rather on-premises?

2) Bulk Source – Edge & Bulk Processing

[Diagram: the reference architecture from page 11]

Page 15: Big Data - in the cloud or rather on-premises?

3) Event Source – Stream & Bulk Processing

[Diagram: the reference architecture from page 11]

Page 16: Big Data - in the cloud or rather on-premises?

4) Event Source – Edge & Stream & Bulk Processing

[Diagram: the reference architecture from page 11]

Page 17: Big Data - in the cloud or rather on-premises?

5) Stream Ingestion – Edge & Stream Processing

[Diagram: the reference architecture from page 11]

Page 18: Big Data - in the cloud or rather on-premises?

Big Data & IoT Reference Architecture

[Diagram: the reference architecture from page 11]

Page 19: Big Data - in the cloud or rather on-premises?

Big Data in the Cloud

Page 20: Big Data - in the cloud or rather on-premises?

Big Data in the Cloud – two usage patterns

Short Lived Cluster (Transient)
• data is repurposed and used for a specific use case in a specific workload
• cluster is spun up only when needed

Flexibility
• spin up an arbitrary number of nodes quickly
• expand quickly from very small to very large

Simplicity
• use as is, solve the problem and move on

Long Lived Cluster (Long Running)
• data is acquired and augmented continuously
• cluster is in permanent use for mixed workloads

Performance
• raw compute performance across a wide range of workloads
• time of availability
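Which of the two patterns is cheaper depends largely on how many hours per month the cluster actually works. The sketch below compares compute cost only; the node count, hourly rate, job length and spin-up time are hypothetical, and storage and data transfer costs are deliberately left out.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def transient_cost(jobs_per_month: int, hours_per_job: float, startup_hours: float,
                   nodes: int, node_rate: float) -> float:
    """Compute cost when a cluster is created per job and deleted afterwards."""
    return jobs_per_month * (hours_per_job + startup_hours) * nodes * node_rate


def long_lived_cost(nodes: int, node_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Compute cost when the same cluster runs all month, busy or idle."""
    return hours * nodes * node_rate


def break_even_jobs(hours_per_job: float, startup_hours: float,
                    hours: float = HOURS_PER_MONTH) -> float:
    """Number of jobs per month at which both patterns cost the same."""
    return hours / (hours_per_job + startup_hours)


# Hypothetical workload: 20 three-hour jobs a month on 10 nodes at 0.50 per node hour,
# with 15 minutes of spin-up time per transient cluster.
transient = transient_cost(20, hours_per_job=3, startup_hours=0.25, nodes=10, node_rate=0.5)
long_lived = long_lived_cost(nodes=10, node_rate=0.5)
threshold = break_even_jobs(hours_per_job=3, startup_hours=0.25)

print(f"transient:  {transient:8.2f}")
print(f"long-lived: {long_lived:8.2f}")
print(f"break-even at about {threshold:.0f} jobs per month")
```

Under these assumptions the transient cluster wins by a wide margin; the long-lived cluster only pays off once it is busy most of the month, which matches the "permanent use for mixed workloads" description above.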

Page 21: Big Data - in the cloud or rather on-premises?

BDaaS – Possible Cost Optimizations

Autoscaling
• scale up when a query comes in
• scale down when jobs finish
• match utilization with job demand
• benchmark: auto-scaling saves 33% in compute costs compared to a static-sized cluster

Excess capacity
• Hadoop is fault tolerant and can take advantage of unreliable instances such as temporary instances
• benchmark: if 50% is done on spot nodes, save 80% compared to normal nodes

[Figure: Common workload distribution with Big Data applications]
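Both optimizations can be expressed as simple node-hour arithmetic. The sketch below is illustrative only: the hourly demand profile, the rate and the discount are hypothetical, and it does not attempt to reproduce the benchmark figures quoted on the slide.

```python
from typing import Sequence


def static_node_hours(demand: Sequence[int]) -> int:
    """A static cluster is sized for peak demand and runs at that size every hour."""
    return max(demand) * len(demand)


def autoscaled_node_hours(demand: Sequence[int], min_nodes: int = 1) -> int:
    """An autoscaled cluster follows demand but never drops below `min_nodes`."""
    return sum(max(nodes, min_nodes) for nodes in demand)


def blended_cost(node_hours: float, rate: float,
                 spot_fraction: float, spot_discount: float) -> float:
    """Cost when `spot_fraction` of the node hours run on discounted temporary instances."""
    regular = node_hours * (1.0 - spot_fraction) * rate
    spot = node_hours * spot_fraction * rate * (1.0 - spot_discount)
    return regular + spot


# Hypothetical demand: nodes needed in each of 12 consecutive hours.
demand = [2, 2, 2, 4, 8, 10, 10, 8, 4, 2, 2, 2]

static_hours = static_node_hours(demand)
auto_hours = autoscaled_node_hours(demand)
autoscaling_saving = 1.0 - auto_hours / static_hours

# Hypothetical pricing: rate 1.0 per node hour, half the work on spot nodes at 80% off.
mixed = blended_cost(100, rate=1.0, spot_fraction=0.5, spot_discount=0.8)

print(f"static: {static_hours} node hours, autoscaled: {auto_hours} node hours")
print(f"autoscaling saves {autoscaling_saving:.0%} of compute")
print(f"blended cost for 100 node hours: {mixed:.2f}")
```

The saving from autoscaling depends entirely on how peaky the demand is, so the percentage for a real workload has to be measured rather than assumed.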

Page 22: Big Data - in the cloud or rather on-premises?

Data Locality vs. Compute/Storage Separation

[Diagram, Data Local Compute: a Master Node and Workers #1 to #3, each with its own Processing and Disk, connected by a Network]

[Diagram, Separate Compute and Storage: a Master Node and Compute nodes #1 to #3 (Processing only), connected over the Network to a Storage tier holding the Disks]

Separation of compute and storage – the fundamental difference
• store data in Object Storage instead of DFS
• bring up Compute nodes only for data processing
• multiple workloads on separate clusters can access the same data
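The three points above can be sketched in a few lines: data lives in a shared store, compute exists only while a job runs, and independent workloads read the same objects. This is a toy model under stated assumptions; a dictionary stands in for object storage, a thread pool stands in for a cluster, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Stand-in for object storage: it outlives every "cluster" created below.
SHARED_STORE: Dict[str, str] = {
    "logs/part-0": "error ok ok",
    "logs/part-1": "ok error error",
    "logs/part-2": "ok",
}


def run_transient_cluster(store: Dict[str, str], prefix: str, workers: int,
                          task: Callable[[str], int]) -> Dict[str, int]:
    """Bring up `workers` compute workers, process every object under `prefix`, shut down."""
    keys = sorted(key for key in store if key.startswith(prefix))
    # The pool plays the role of the compute nodes; it exists only inside this block.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda key: task(store[key]), keys))
    return dict(zip(keys, results))


def count_errors(text: str) -> int:
    return text.split().count("error")


def count_words(text: str) -> int:
    return len(text.split())


# Two separate workloads on differently sized "clusters" read the same data.
error_counts = run_transient_cluster(SHARED_STORE, "logs/", workers=3, task=count_errors)
word_counts = run_transient_cluster(SHARED_STORE, "logs/", workers=2, task=count_words)
```

Because nothing is stored on the workers, each workload can pick its own cluster size and lifetime without copying or rebalancing data, which is the property the data-local design cannot offer.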

Page 23: Big Data - in the cloud or rather on-premises?

A new way to Manage Big Data

Big Data: Traditional Assumptions
• Bare-metal
• Data Locality
• HDFS on local disks

Big Data: A New Approach
• Containers and VMs
• Compute and storage separation
• Shared storage

Benefits and Value
• Big-Data-as-a-Service
• Agility and cost savings
• Faster time-to-insights

Page 24: Big Data - in the cloud or rather on-premises?

5 ½ ways to get Big Data in the Cloud

1. “Bring your own Hadoop” (MapR, Cloudera, Hortonworks) on Bare Metal

2. “Bring your own Hadoop” (MapR, Cloudera, Hortonworks) on VM

3. Hadoop PaaS from Cloud Provider’s Marketplace

4. Dedicated (Long-Running) Big-Data-as-a-Service

5. Elastic (Transient) Big-Data-as-a-Service (storage and compute separated)

6. “Cloud on Premises” (Cloud Stack from Vendors on Premises)

Page 25: Big Data - in the cloud or rather on-premises?

Various Models for Big Data in Cloud

Page 26: Big Data - in the cloud or rather on-premises?

Various Models for Big Data in Cloud

1. Bare Metal Cloud (Bring Your Own Hadoop - BYOH)

2. IaaS with any Hadoop Distribution (Bring Your Own Hadoop)

3. PaaS with Hadoop (from Marketplace)

4. Dedicated (Long-Running) BDaaS

5. Elastic (Transient) BDaaS

6. BDaaS + Analytics SaaS

Page 27: Big Data - in the cloud or rather on-premises?

1) Bare Metal Cloud (BYOH)

Layers: Compute (Bare Metal), Storage (Bare Metal), Big Data (Custom), Analytics (Custom), Intelligence (Custom)

Amazon
• n.a. (Dedicated Host comes close, but runs VMs)
• n.a.

Azure
• n.a. (Dedicated Host comes close, but runs VMs)
• n.a.

Oracle
• Compute: Oracle Compute
• Storage: Oracle Block Volume & Object Storage, Data Transfer Service
• Big Data: Bring Your Own Hadoop (BYOH)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Custom (Image-, Speech-Recognition, Bots, …)

Page 28: Big Data - in the cloud or rather on-premises?

2) IaaS (Bring Your Own Hadoop)

Layers: Compute (Bare Metal), Storage (Bare Metal), Big Data (Custom), Analytics (Custom), Intelligence (Custom)

Amazon
• Compute: Amazon EC2 & EC2
• Storage: S3, EBS, Glacier, Snowball, Snowball Edge, Snowmobile
• Big Data: Bring Your Own Hadoop (BYOH)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Custom (Image-, Speech-Recognition, Bots, …)

Azure
• Compute: Azure VM
• Storage: Storage (Blob), Data Lake Store, Import/Export
• Big Data: Bring Your Own Hadoop (BYOH)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Custom (Image-, Speech-Recognition, Bots, …)

Oracle
• Compute: General Purpose Compute & Dedicated Compute
• Storage: Oracle Object & Archive Storage, Data Transfer Service
• Big Data: Bring Your Own Hadoop (BYOH)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Custom (Image-, Speech-Recognition, Bots, …)

Page 29: Big Data - in the cloud or rather on-premises?

3) PaaS (Hadoop from Marketplace)

Layers: Compute (Bare Metal), Storage (Bare Metal), Big Data (Custom), Analytics (Custom), Intelligence (Custom)

Amazon
• Compute: Amazon EC2
• Storage: S3, EBS, Glacier, Snowball, Snowball Edge, Snowmobile
• Big Data: Hadoop (Hortonworks, MapR)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Custom (Image-, Speech-Recognition, Bots, …)

Azure
• Compute: Azure VM
• Storage: Azure Storage (Blob, Block, Disk, File), Azure Data Lake Store
• Big Data: Hadoop (Cloudera, Hortonworks, MapR)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Custom (Image-, Speech-Recognition, Bots, …)

Oracle
• Compute: General Purpose Compute & Dedicated Compute
• Storage: Oracle Object & Archive Storage, Data Transfer Service
• Big Data, Analytics, Intelligence: n.a.

Page 30: Big Data - in the cloud or rather on-premises?

4) Dedicated BDaaS

Layers: Compute (Bare Metal), Storage (Bare Metal), Big Data (Custom), Analytics (Custom), Intelligence (Custom)

Amazon
• Compute: Amazon EC2
• Storage: S3, EBS, Glacier
• Big Data: Amazon EMR
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Image-, Speech-Recognition, Bots, …

Azure
• Compute: Azure VM
• Storage: Azure Storage (Blob, Block, Disk, File), Azure Data Lake Store
• Big Data: Azure HDInsight (Hortonworks)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Image-, Speech-Recognition, Bots, …

Oracle
• Compute: General Purpose Compute & Dedicated Compute
• Storage: Oracle Object & Archive Storage, Data Transfer Service
• Big Data: Big Data CS (Cloudera)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Image-, Speech-Recognition, Bots, …

Page 31: Big Data - in the cloud or rather on-premises?

5) Elastic BDaaS

Layers: Compute (Bare Metal), Storage (Bare Metal), Big Data (Custom), Analytics (Custom), Intelligence (Custom)

Amazon
• Compute: Amazon EC2
• Storage: S3, EBS, Glacier
• Big Data: Amazon EMR
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Image-, Speech-Recognition, Bots, …

Azure
• Compute: Azure VM
• Storage: Azure Storage (Blob, Block, Disk, File), Azure Data Lake Store
• Big Data: Azure HDInsight (Hortonworks)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Image-, Speech-Recognition, Bots, …

Oracle
• Compute: General Purpose Compute & Dedicated Compute
• Storage: Oracle Object & Archive Storage, Data Transfer Service
• Big Data: Big Data CS Compute Edition (Hortonworks)
• Analytics: Custom (SQL, Machine Learning, ..)
• Intelligence: Image-, Speech-Recognition, Bots, …

Page 32: Big Data - in the cloud or rather on-premises?

6) BDaaS + Analytics SaaS

Layers: Compute (Bare Metal), Storage (Bare Metal), Big Data (Custom), Analytics (Custom), Intelligence (Custom)

Amazon
• Compute: Amazon EC2 & EC2 Dedicated Hosts
• Storage: S3, EBS, Glacier
• Big Data: Amazon EMR
• Analytics: Machine Learning, Polly, …
• Intelligence: Alexa, Lex, Polly

Azure
• Compute: Azure VM
• Storage: Azure Storage (Blob, Block, Disk, File), Azure Data Lake Store
• Big Data: Azure HDInsight (Hortonworks)
• Analytics: Machine Learning, Data Lake Analytics, …
• Intelligence: Cortana, Speech API, Computer Vision API, Video API, ...

Oracle
• Compute: General Purpose Compute & Dedicated Compute
• Storage: Oracle Object & Archive Storage, Data Transfer Service
• Big Data: Big Data CS Compute Edition / Big Data CS
• Analytics: Big Data Discovery CS, Analytics Cloud, Data Spatial & Graph
• Intelligence: n.a.

Page 33: Big Data - in the cloud or rather on-premises?

Oracle Cloud

[Diagram: the reference architecture from page 11, mapped to Oracle Cloud services: IoT CS, Event Hub CS, Stream Analytics, Big Data CS, Big Data CS – Compute, Big Data SQL, Big Data Discovery CS, Big Data Preparation CS, NoSQL CS, Object Storage, Archive Storage, Block Storage, Data Transfer Service, Data Spatial & Graph, Integration CS, Messaging CS, Process CS, Mobile CS, Container CS, Application Container CS, GoldenGate, Oracle Data Integrator CS, Visual Builder, BICS, Analytics CS, Data Visualization CS]

Page 34: Big Data - in the cloud or rather on-premises?

Amazon AWS

[Diagram: the reference architecture from page 11, mapped to AWS services: AWS IoT Platform, Greengrass, Rules Engine, Lambda, Kinesis Streams, Kinesis Firehose, Kinesis Analytics, Elastic MapReduce (EMR), EC2 Auto Scaling, EC2 Container Service, EC2 Container Registry, S3, Glacier, EBS, EFS, DynamoDB, ElastiCache, Elasticsearch, Redshift, Athena, Glue, Data Pipeline, QuickSight, ML, TensorFlow, Polly, Lex, Rekognition, Alexa, Snowball, Snowball Edge, Snowmobile, Direct Connect, Mobile Hub, Mobile SDK, SQS, SNS, Email, Pinpoint, API Gateway]

Page 35: Big Data - in the cloud or rather on-premises?

Microsoft Azure

[Diagram: the reference architecture from page 11, mapped to Azure services: IoT Suite, IoT Hub, IoT Edge, Event Hub, Event Grid, Stream Analytics, Functions, HDInsight, Data Lake Store, Data Lake Analytics, Storage Blob, Storage Block, Table Storage, Cosmos DB, Redis Cache, SQL Data Warehouse, Time Series Insights, Machine Learning, Speech API, Vision API, Cortana, Bot Service, Container Service, Container Registry, Container Instances, Import/Export, Service Bus, Notification Hub, API Management, BizTalk Services, Power BI]

Page 36: Big Data - in the cloud or rather on-premises?

Big Data On-Premises

Page 37: Big Data - in the cloud or rather on-premises?

On-Premises – Oracle Cloud Machine

[Diagram: the reference architecture from page 11, mapped to Oracle Cloud Machine services: IoT CS, Event Hub CS, Stream Analytics, Big Data CS, Big Data CS – Compute, Big Data SQL, Big Data Discovery CS, NoSQL CS, Object Storage, Archive Storage, Block Storage, Data Transfer Service, Data Spatial & Graph, Integration CS, Messaging CS, Process CS, Mobile CS, Container CS, Application Container CS, BICS]

Page 38: Big Data - in the cloud or rather on-premises?

On-Premises – Open Source

[Diagram: the reference architecture from page 11]

Page 39: Big Data - in the cloud or rather on-premises?

Hybrid Big Data Solutions

Page 40: Big Data - in the cloud or rather on-premises?

Hybrid Big Data Solutions

[Diagram: the reference architecture from page 11, split into zones: Cloud, On-Prem, On-Prem/Edge/Internet]

Page 41: Big Data - in the cloud or rather on-premises?

Hybrid Big Data Solutions

[Diagram: the reference architecture from page 11, split into zones: Cloud, On-Prem, On-Prem/Edge/Internet]

Page 42: Big Data - in the cloud or rather on-premises?

Hybrid Big Data Solutions

[Diagram: the reference architecture from page 11, split into zones: Cloud, On-Prem/Edge/Internet, On-Prem]

Page 43: Big Data - in the cloud or rather on-premises?

Hybrid Big Data Solutions

[Diagram: the reference architecture from page 11, split into zones: Cloud, On-Prem/Edge]

Page 44: Big Data - in the cloud or rather on-premises?

Guido Schmutz
Technology Manager

[email protected]

@gschmutz guidoschmutz.wordpress.com