SOA & Big Data

Arnon Rotem-Gal-Oz

Upload: arnon-rotem-gal-oz

Post on 10-May-2015


DESCRIPTION

Some of the challenges and opportunities

TRANSCRIPT

Page 1: SOA & Big Data

SOA & Big data  

Arnon Rotem-Gal-Oz

Page 2: SOA & Big Data

Sept 2012: iOS 6 launched with a new Maps application

Page 3: SOA & Big Data

But something went terribly wrong...

http://theamazingios6maps.tumblr.com/

Page 4: SOA & Big Data

•  It isn't just about getting all the data there

•  Algorithms are cool but we need humans in the loop

•  Hire the right people

•  Test! Test! Test!

http://theamazingios6maps.tumblr.com/

Page 5: SOA & Big Data

http://theamazingios6maps.tumblr.com/

It isn't just one pile of data

Page 6: SOA & Big Data

Integrating Big Data & SOA

Yoel Ben Avraham - http://www.flickr.com/photos/epublicist/3546059144/

Page 7: SOA & Big Data
Page 8: SOA & Big Data

Data Refinery

Ofer Berger  http://www.haifacity.com/allsites/allpic/a/A1738/A1738Pic3326.jpg

Page 9: SOA & Big Data
Page 10: SOA & Big Data

•  ETL integration

•  DB integration

•  File-based integration

•  Online integration

Department  Server  DB  

Page 11: SOA & Big Data

[Diagram: many objects labeled with three-letter codes such as ASB, BLT, HDL, AFT, TGI, FRY]

Object soup

Page 12: SOA & Big Data

[Diagram: the same three-letter objects as on the previous page]

Services: Invoices, Customer, Promotions, Orders

Page 13: SOA & Big Data

[Diagram: SOA components and the relations between them]

Components: Service, Endpoint, Messages, Contracts, Policy, Service consumer

Relations: describes, exposes, sends/receives, binds to, implements, governed by, adheres to, understands, serves

Key: Component, Relation

Page 14: SOA & Big Data

Interactions

Customer  

Categories  Agents  

Page 15: SOA & Big Data

Integrating Big Data & SOA

Yoel Ben Avraham - http://www.flickr.com/photos/epublicist/3546059144/

Page 16: SOA & Big Data

               

               

               

               

[Diagram: Saga pattern]

Labels: Service consumer; Initiator Service; Coordinator*; Participator; Protocol; Create context; Register; Registration; Perform activity; Compensate; Prepare/commit/undo; Activities and replies

Key: SOA component; Pattern component; Concern/attribute; Relation

Saga
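The Saga slide names a coordinator, registered participants, activities, and compensation. As a rough illustration only, the sketch below shows one way those pieces could fit together in plain Python. The class names, the `register`/`run` methods, and the Orders/Invoices participants are assumptions made for this example; they are not taken from the deck or from any library.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Participant:
    """A service taking part in the saga (hypothetical shape)."""
    name: str
    perform: Callable[[], None]      # the activity
    compensate: Callable[[], None]   # undoes the activity


@dataclass
class SagaCoordinator:
    """Runs activities in order and compensates on failure (a sketch)."""
    participants: List[Participant] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def register(self, participant: Participant) -> None:
        self.participants.append(participant)

    def run(self) -> bool:
        completed: List[Participant] = []
        for p in self.participants:
            try:
                p.perform()
                self.log.append(f"performed {p.name}")
                completed.append(p)
            except Exception as exc:
                self.log.append(f"failed {p.name}: {exc}")
                # Undo the activities that already finished, newest first.
                for done in reversed(completed):
                    done.compensate()
                    self.log.append(f"compensated {done.name}")
                return False
        return True


def fail() -> None:
    raise RuntimeError("invoice service unavailable")


saga = SagaCoordinator()
saga.register(Participant("orders", lambda: None, lambda: None))
saga.register(Participant("invoices", fail, lambda: None))
ok = saga.run()
print(ok, saga.log)
```

The point of the sketch is that no distributed lock is held across services: each participant commits its own work and the coordinator only asks for compensation when a later activity fails.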

Page 17: SOA & Big Data

[Diagram: Hadoop Cluster]

Labels: NIM; Interaction Recordings; ETL; Raw (HDFS); Customer (HBase); Interactions; Data Management; HCatalog; Categories (HBase); Resolved Interactions (HDFS); HBase (several stores)

Page 18: SOA & Big Data

So, what's the problem?

Page 19: SOA & Big Data

 &  Big  data    can’t  move  

Page 20: SOA & Big Data

Performance of joins in a distributed system sucks!

Node 1

customers A-H

Interactions 0-99

Node 2

customers I-M

Interactions 100-199

Node 3

customers N-Z

Interactions 200-299

{"Interaction": {
    "id": "5",
    "participants": {
        "customer": [
            {"surname": "McDonalds", "name": "Old"}
        ]
    }
}}
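To make the join problem concrete, here is a minimal sketch that mimics the partitioning on this slide: customers split by surname range (A-H, I-M, N-Z) and interactions split by ID range (0-99, 100-199, 200-299). The function names and the sample interactions are assumptions for illustration. It counts how many interaction-to-customer lookups would have to leave the node that holds the interaction.

```python
from typing import Iterable, Tuple


def customer_node(surname: str) -> str:
    """Node holding a customer, by first letter of the surname."""
    first = surname[0].upper()
    if first <= "H":
        return "node1"
    if first <= "M":
        return "node2"
    return "node3"


def interaction_node(interaction_id: int) -> str:
    """Node holding an interaction; assumes IDs in the range 0-299."""
    return f"node{interaction_id // 100 + 1}"


def remote_lookups(pairs: Iterable[Tuple[int, str]]) -> int:
    """Count joins whose two sides live on different nodes."""
    return sum(
        1
        for interaction_id, surname in pairs
        if interaction_node(interaction_id) != customer_node(surname)
    )


# (interaction ID, customer surname) pairs, invented for this example.
interactions = [(5, "McDonalds"), (42, "Adams"), (150, "Smith"), (250, "Young")]
hops = remote_lookups(interactions)
print(hops)
```

Interaction 5 lives on node 1 while its customer McDonalds lives on node 2, so a normalized join needs a network hop. Embedding the customer inside the interaction document, as the JSON on the slide does, keeps that read local at the cost of duplicating customer data.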

Page 21: SOA & Big Data

Cookie cutter scalability

Page 22: SOA & Big Data

Cell  architecture  

Node  2  

Node  3  

Node  1  

Node  N  

Page 23: SOA & Big Data

Cell Architecture

[Diagram: cell architecture]

Labels: BUS; Categories; Customers; Interactions; Reference Data; ORCA; HDFS; HBase (several stores)

Page 24: SOA & Big Data

                 

[Diagram: Orchestration pattern]

Labels: Workflow engine; Workflow instance; Endpoint; Service; Initiate business process; Route request; Invoke services; Schedule; Manage process; Host workflows; Manage workflows; Monitor workflows

Orchestration

Page 25: SOA & Big Data

Map Reduce processing pipeline

[Diagram: Hadoop Map/Reduce job with a map pipeline and a reduce pipeline]

Map (input: InteractionID, Segment Row; uses a local cache of Customers)

•  Retrieve segment data - create segment document (Interaction)
•  Resolve Customer IDs (Customer)
•  Categorize Segment (Categorization)
•  Update Segment document (Interaction)
•  Write Categories Results (Categorization)
•  Write Interaction (Interaction)

Reduce (input: Interaction & Segments)

•  Categorize Interaction (Categorization)
•  Update Interaction document (Interaction)
•  Prepare data mart Export (Datamart)
•  Write Categories Results (Categorization)
•  Write Interaction (Interaction)

Hadoop Map/Reduce
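The pipeline on this slide can be approximated in a few lines of plain Python to show the shape of the computation: the map side works on one segment row at a time, keyed by interaction ID, and the reduce side sees all segments of one interaction together. This is a sketch under stated assumptions, not the author's Hadoop job: the cache contents, the toy `categorize` rule, and the document fields are all invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical stand-in for the "Customers local cache" on the slide.
CUSTOMER_CACHE = {"Old McDonalds": "cust-1"}


def categorize(text: str) -> str:
    """Toy categorizer (assumption): flags text that mentions a refund."""
    return "refund" if "refund" in text.lower() else "general"


def map_segment(interaction_id: str, row: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Map side: build a segment document keyed by interaction ID."""
    doc = {
        "customer_id": CUSTOMER_CACHE.get(row["customer"], "unknown"),
        "category": categorize(row["text"]),
        "text": row["text"],
    }
    return interaction_id, doc


def reduce_interaction(interaction_id: str, segments: List[Dict[str, str]]) -> Dict[str, object]:
    """Reduce side: combine all segments of one interaction."""
    return {
        "id": interaction_id,
        "customer_id": segments[0]["customer_id"],
        "categories": sorted({s["category"] for s in segments}),
        "segments": len(segments),
    }


def run(rows: List[Tuple[str, Dict[str, str]]]) -> List[Dict[str, object]]:
    grouped: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for interaction_id, row in rows:
        key, doc = map_segment(interaction_id, row)
        grouped[key].append(doc)  # stands in for the shuffle phase
    return [reduce_interaction(k, v) for k, v in sorted(grouped.items())]


rows = [
    ("5", {"customer": "Old McDonalds", "text": "I want a refund"}),
    ("5", {"customer": "Old McDonalds", "text": "Thanks, bye"}),
    ("7", {"customer": "Jane Doe", "text": "Where is my order?"}),
]
results = run(rows)
print(results)
```

Resolving customer IDs from a local cache inside the map step is what avoids a remote join per row; the grouping by interaction ID is the only point where data moves between nodes.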

Page 26: SOA & Big Data

Map  Reduce  processing  pipeline  


Page 27: SOA & Big Data

Data  Facets  

Page 28: SOA & Big Data

[Diagram: landscape of data stores, grouped by category]

Categories: In-memory; Data grid; Columnar (shown twice); Graph; Indexing; NewSQL; Caching; Document; Relational; Analytics/MPP; Key-value store; Distributed file systems

Products shown: HBase, Hypertable, Neo4j, Apache Solr, Attivio, IndexTank, RavenDB, Cassandra, MongoDB, CouchDB, ScaleBase, VoltDB, Amazon RDS, HP Vertica, EMC Greenplum, IBM Netezza, Microsoft PDW, Aster Data, ParAccel, Memcached, GigaSpaces, Redis, GridGain, Oracle Coherence, WebSphere eXtreme Scale, Pregel, Hama, SAP HANA, Oracle Exadata, Accumulo, Hadoop, GlusterFS

Page 29: SOA & Big Data

Multi-tiered data

•  Data warehouse (Hadoop/HBase): 20 years, detailed and aggregated

•  Datamart(s) (RDBMS): 6-12 months detailed, 1-3 years aggregated

•  Cube (MOLAP): 6-? months aggregated

•  Real-time (in memory): 1-7 days detailed

Data is multi-tiered
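One consequence of tiering is that a query has to be routed to a tier that still holds data of the requested age and granularity. The sketch below illustrates that idea using the upper retention bounds from this slide. Everything else is an assumption: the `Tier` type, the `pick_tier` function, the day conversions, and the reading of the warehouse as holding 20 years of both detailed and aggregated data. The cube tier is left out because its retention is not fully stated on the slide.

```python
from typing import List, NamedTuple, Optional

YEAR = 365  # days, approximate


class Tier(NamedTuple):
    name: str
    detailed_days: Optional[int]    # None: tier holds no detailed data
    aggregated_days: Optional[int]  # None: tier holds no aggregated data


# Ordered from fastest to slowest; bounds are the upper ends from the slide.
TIERS: List[Tier] = [
    Tier("real-time (in memory)", 7, None),
    Tier("datamart (RDBMS)", YEAR, 3 * YEAR),
    Tier("data warehouse (Hadoop/HBase)", 20 * YEAR, 20 * YEAR),
]


def pick_tier(age_days: int, detailed: bool) -> Optional[str]:
    """Return the first tier that still covers data of the given age."""
    for tier in TIERS:
        limit = tier.detailed_days if detailed else tier.aggregated_days
        if limit is not None and age_days <= limit:
            return tier.name
    return None  # older than anything retained


print(pick_tier(3, detailed=True))
print(pick_tier(400, detailed=True))
print(pick_tier(400, detailed=False))
```

A request for detailed data from three days ago can be served from memory, while detailed data older than a year falls through to the warehouse.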

Page 30: SOA & Big Data

Multi-tiered data

•  Data warehouse (Hadoop/HBase): 20 years, detailed and aggregated

•  Real-time: 1-7 days detailed

•  Datamart(s) (Columnar): 6-12 months detailed

Data is multi-tiered

Page 31: SOA & Big Data

SOA  leaves  us  with  a  lot  of  isolated  data  

Page 32: SOA & Big Data

                         

                         

[Diagram: Aggregated Reporting pattern]

Labels: Service; Endpoint; Request; Report; Subscribed/pulled data; Pull data; Data backend; Landing area; Raw data; ODS/DM; SQL endpoint; Load; Out

Activities: Ingest; Clean; Join; Transform; Transpose; Produce reports

Aggregated Reporting
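The activities named on this slide (ingest, clean, join, transform, produce reports) can be sketched as a tiny pipeline over data pulled from two services. This is an illustration only: the order and customer records, the field names, and the helper functions are assumptions made for the example, and a real reporting service would read from a landing area rather than in-memory lists.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical rows, as if pulled from an Orders and a Customer service.
orders = [
    {"order_id": 1, "customer_id": "c1", "total": "20.5"},
    {"order_id": 2, "customer_id": "c2", "total": None},  # dirty row
    {"order_id": 3, "customer_id": "c1", "total": "4.5"},
    {"order_id": 4, "customer_id": "c2", "total": "10"},
]
customers = [
    {"customer_id": "c1", "name": "Old McDonalds"},
    {"customer_id": "c2", "name": "Jane Doe"},
]


def clean(rows: List[dict]) -> List[dict]:
    """Drop rows without a total and convert totals to floats."""
    return [dict(r, total=float(r["total"])) for r in rows if r["total"] is not None]


def join(order_rows: List[dict], customer_rows: List[dict]) -> List[dict]:
    """Attach the customer name to each order (data from two services)."""
    names = {c["customer_id"]: c["name"] for c in customer_rows}
    return [
        dict(o, name=names[o["customer_id"]])
        for o in order_rows
        if o["customer_id"] in names
    ]


def produce_report(rows: List[dict]) -> Dict[str, float]:
    """Aggregate order totals per customer name."""
    totals: Dict[str, float] = defaultdict(float)
    for r in rows:
        totals[r["name"]] += r["total"]
    return dict(totals)


report = produce_report(join(clean(orders), customers))
print(report)
```

The join happens once, inside the reporting service, so the operational services never have to answer cross-service queries themselves.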

Page 33: SOA & Big Data

[Diagram: numbered data flow, steps 1-5]

Labels: Landing; Raw data; DW/ODS; Views; Transformation service; Load service; Report service

Page 34: SOA & Big Data

[Diagram: numbered data flow, steps 1-10]

Labels: Report tool; Data mart; Raw data (HDFS); Aggregation map/reduce; HBase; ETL (map/reduce + ETL); Drill through REST API; Details; Aggregates

Page 35: SOA & Big Data

Takeaways

SOA & Big Data are better together

Page 36: SOA & Big Data

Arnon Rotem-Gal-Oz

[email protected]  http://www.nice.com

http://arnon.me/soa-patterns

[email protected]  http://arnon.me

@arnonrgo