10-Step Methodology to Building a Single View with MongoDB

10-Step Methodology to Building a Single View Mat Keep, Director of Product & Market Analysis. [email protected] @matkeep Jon Rangel, Director of Professional Services, EMEA. [email protected]

Upload: mat-keep

Post on 07-Apr-2017

What You Will Learn

1. Single View: Opportunities & Challenges

2. Repeatable 10-Step Methodology

3. Required Technical Capabilities

Why Single View

Single View Defined

•  What
–  Single, real-time representation of a business entity or domain
–  Customer, product, supply chain, financial asset class, & more

•  How
–  Gathers and organizes data from multiple, disconnected sources
–  Aggregates information into a standardized format and joint information model

•  Why
–  Improves business visibility
–  Serves operational applications
–  Provides a foundation for analytics

Single View Use Cases

Finance
•  Comparative view of traders or products
•  Firm-wide view of asset exposure
•  Aggregated transactions for fraud models

Retail
•  Omni-channel view of customers for personalized marketing
•  Inventory control & management
•  Single view of product across channels & demographics

Healthcare
•  Management of patient medical records for treatment plans
•  Macro-analysis view for public health
•  Medical history to identify insurance risk

Challenges

•  Current State
–  Data dispersed across a multitude of systems
–  Different structures, different attributes
–  Apps built to meet specific business requirements, not integrated
–  New data sources from new apps, M&A

•  Governance Processes
–  How to deliver & maintain the single view in the face of constant business change

•  Technology Limitations
–  Traditional databases are not well suited to the capabilities a single view requires

10-Step Methodology

High Level Architecture

[Diagram: Source Systems (Web, Mobile, CRM, Mainframe) load data via ETL or Message Queue into the Single View; Consuming Systems (Call Center, Analytics, Technical Support, Billing) read from it.]

10-Step Methodology

Discover
•  Step 1: Define Scope
•  Step 2: Identify Data Consumers
•  Step 3: Identify Data Producers

Develop
•  Step 4: Appoint Data Stewards
•  Step 5: Develop Data Model
•  Step 6: Load & Standardize
•  Step 7: Merge, Test & Reconcile

Deploy
•  Step 8: Infrastructure Design
•  Step 9: Modify Consuming Systems
•  Step 10: Maintenance Processes

Step 1: Define Scope & Sponsorship

•  Scope needs to be realistic and defined by a specific success metric
–  Long term: aggregate all customer data into a single view, serving all business functions
–  Initial phase: collect all customer interactions on digital channels over the past 3 months to improve call center MTTR

•  Appoint executive sponsors
–  Senior: able to allocate resources and command credibility
–  A combination of senior titles from the business and from the technology group

Discover

[Diagram: Source Systems (Web, Mobile, CRM, Mainframe)]

Steps 2 & 3: Identify Data Consumers & Producers

Step 2: Data Consumers

•  Single view consumers define
–  Typical queries and SLAs
–  Required data attributes
–  Current data sources

Step 3: Data Producers

•  Identify apps generating the source data
–  Identify application owners + associated databases
–  Profile apps: operational, analytical

Discover

Step 4: Appoint Data Stewards

•  Data steward appointed for each data source.

•  Deep knowledge of:
–  Source system schema
–  Which tables store required attributes, and in what format
–  Clients and apps that generate & consume the source data

•  Advise on data loading strategies

Develop

Step 5: Develop Single View Data Model

•  Key inputs
–  Required data attributes
–  Query patterns

•  Define common fields & data types
–  Create rules to validate common data

•  Define primary & secondary indexes

•  Identify dynamic fields
–  No need to pre-declare when using a document database

•  Localize data into a single document (where appropriate)

{
  _id : "[email protected]",
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    { number : "1-212-777-1212", dnc : true, type : "home" },
    { number : "1-212-777-1213", type : "cell" }
  ]
}
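The "rules to validate common data" could be prototyped in application code before any database-level validation is configured. The following is a minimal sketch under stated assumptions: the set of common fields is taken from the sample document above, the helper names `validate_common_fields` and `dynamic_fields` are hypothetical, and the e-mail address is a placeholder because the original value is not legible in the slide.

```python
# Hypothetical common fields and expected types, taken from the sample
# customer document; any other top-level field is treated as dynamic.
COMMON_FIELDS = {
    "_id": str,
    "first_name": str,
    "last_name": str,
    "city": str,
    "phones": list,
}


def validate_common_fields(doc):
    """Return a list of validation errors; an empty list means the document passes."""
    errors = []
    for name, expected in COMMON_FIELDS.items():
        if name not in doc:
            errors.append(f"missing common field: {name}")
        elif not isinstance(doc[name], expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(doc[name]).__name__}"
            )
    return errors


def dynamic_fields(doc):
    """Return the top-level fields that are not part of the common schema."""
    return sorted(set(doc) - set(COMMON_FIELDS))


customer = {
    "_id": "mark.smith@example.com",  # placeholder value
    "first_name": "Mark",
    "last_name": "Smith",
    "city": "San Francisco",
    "phones": [{"number": "1-212-777-1212", "dnc": True, "type": "home"}],
    "loyalty_tier": "gold",  # dynamic field, not declared anywhere in advance
}
print(validate_common_fields(customer), dynamic_fields(customer))
```

Keeping the common fields in one small table makes it cheap to revise the rules as new sources are onboarded, while dynamic fields pass through untouched.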

Single View

Develop

Resources to Support Schema Design

MongoDB Documentation

MongoDB Development Rapid Start

Develop

Step 6: Load

•  2 phases: Initial Load & Delta Load
•  Emit JSON from the source systems; use Extended JSON to preserve data types

•  Initial Load
–  ETL tools
–  Custom loaders

•  Delta Load
–  Batch loads: use the tools above
–  Real-time loads: message queue

[Diagram: data is loaded into the Single View via ETL or Message Queue]

Develop
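To illustrate why Extended JSON matters for the load phase, here is a small sketch of a custom loader step that wraps dates and decimals so their types survive serialization. It assumes the canonical MongoDB Extended JSON wrappers `$date`/`$numberLong` and `$numberDecimal`; the function name `to_extended_json` and the sample record are hypothetical, and a real loader would more likely rely on a driver's own Extended JSON support.

```python
import json
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_extended_json(value):
    """Recursively wrap datetimes and decimals in Extended JSON type markers."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes from the source system are in UTC.
            value = value.replace(tzinfo=timezone.utc)
        millis = (value - EPOCH) // timedelta(milliseconds=1)
        return {"$date": {"$numberLong": str(millis)}}
    if isinstance(value, Decimal):
        return {"$numberDecimal": str(value)}
    if isinstance(value, dict):
        return {key: to_extended_json(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_extended_json(item) for item in value]
    return value


# Hypothetical record emitted by a source system.
record = {
    "customer_id": 14,
    "updated_at": datetime(2017, 4, 7, tzinfo=timezone.utc),
    "balance": Decimal("10.50"),
}
print(json.dumps(to_extended_json(record)))
```

Plain JSON would flatten the timestamp and the decimal into strings or floats; the wrappers let the loading tool restore them as native date and decimal types.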

Step 6 (cont’d): Standardize

Data Source A (record 14)
–  cust_id: 14
–  f_name: James
–  l_name: Bond
–  dob: 07/14/1968
–  eMail: [email protected]

Data Source B (record 77)
–  fno: 77
–  first: Jim
–  last: Bond
–  born: 1968-07-14
–  email: [email protected]

Data Source C (record 26)
–  xc_id: 26
–  name: James Bind
–  bdate: July 14, 68
–  Email: [email protected]
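The three source records above could be standardized with a per-source field map and date format. This is only a sketch: the mapping tables, the `standardize` and `parse_dob` helpers, and the e-mail values are assumptions made for illustration, and the target field names follow the standardized records shown in Step 7. Note the explicit handling of the two-digit year in source C, which Python's `%y` would otherwise parse as 2068.

```python
from datetime import date, datetime

# Hypothetical per-source mappings from source field names to common names.
FIELD_MAP = {
    "A": {"cust_id": "id", "f_name": "first_name", "l_name": "last_name",
          "dob": "dob", "eMail": "email"},
    "B": {"fno": "id", "first": "first_name", "last": "last_name",
          "born": "dob", "email": "email"},
    "C": {"xc_id": "id", "name": "name", "bdate": "dob", "Email": "email"},
}
DATE_FORMATS = {"A": "%m/%d/%Y", "B": "%Y-%m-%d", "C": "%B %d, %y"}


def parse_dob(raw, fmt, today=None):
    """Parse a date of birth; a result in the future means a two-digit year
    was expanded into the wrong century, so shift it back 100 years."""
    parsed = datetime.strptime(raw, fmt).date()
    if parsed > (today or date.today()):
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


def standardize(source, record, today=None):
    """Rename fields, split combined names, and normalize dates to ISO 8601."""
    mapping = FIELD_MAP[source]
    renamed = {mapping[k]: v for k, v in record.items() if k in mapping}
    if "name" in renamed:  # source C stores the full name in one field
        first, _, last = renamed.pop("name").partition(" ")
        renamed["first_name"], renamed["last_name"] = first, last
    return {
        "source_id": f"{source}_{renamed['id']}",
        "first_name": renamed["first_name"],
        "last_name": renamed["last_name"],
        "dob": parse_dob(renamed["dob"], DATE_FORMATS[source], today).isoformat(),
        "eMail": renamed["email"].lower(),
    }


# Placeholder e-mail address; the real values are not legible in the slide.
print(standardize("C", {"xc_id": 26, "name": "James Bind",
                        "bdate": "July 14, 68", "Email": "JBond@example.com"}))
```

The typo in the last name ("Bind") is deliberately left alone at this stage; resolving it is the job of the matching and reconciliation step.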

Develop

Step 7: Match, Merge & Reconcile

Develop

Source Data
–  A: cust_id: 14, f_name: James, l_name: Bond, dob: 07/14/1968, eMail: [email protected]
–  B: fno: 77, first: Jim, last: Bond, born: 1968-07-14, email: [email protected]
–  C: xc_id: 26, name: James Bind, bdate: July 14, 68, Email: [email protected]

Standardized Data (field names & data types)
–  source_id: A_14, first_name: James, last_name: Bond, dob: 1968-07-14, eMail: [email protected]
–  source_id: B_77, first_name: Jim, last_name: Bond, dob: 1968-07-14, eMail: [email protected]
–  source_id: C_26, first_name: James, last_name: Bind, dob: 1968-07-14, eMail: [email protected]

Single View (data merged, tested & reconciled)
–  _id: [email protected], first_name: James, last_name: Bond, dob: 1968-07-14

Step 7 (cont’d): Match, Merge & Reconcile

•  Use iterative grouping functions to cluster records with similar attributes
1.  Match against unique, authoritative attributes (email address, credit card #)
2.  Match by combining attributes (last name, DoB, zip code)
3.  Use fuzzy matching to catch errors in source data (e.g. different spellings of a customer name)

•  Apply a confidence factor to dictate merging
–  Automatically merge records with 95%+ confidence
–  Manually inspect records with lower confidence
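The matching cascade and the confidence threshold can be sketched as follows. Only the 95% threshold comes from the slide; the confidence values assigned to each rule, the `zip` field, and the function names are hypothetical, and the fuzzy step uses Python's `difflib.SequenceMatcher` as a stand-in for whatever similarity measure a real project would choose.

```python
from difflib import SequenceMatcher

AUTO_MERGE_THRESHOLD = 0.95  # from the slide: merge automatically at 95%+


def match_confidence(a, b):
    """Score how likely two standardized records describe the same customer."""
    # 1. Unique, authoritative attribute: identical e-mail address.
    if a.get("eMail") and a.get("eMail") == b.get("eMail"):
        return 1.0
    # 2. Combination of attributes (the 0.97 weight is a hypothetical choice).
    combo = ("last_name", "dob", "zip")
    if all(a.get(key) and a.get(key) == b.get(key) for key in combo):
        return 0.97
    # 3. Fuzzy match on the full name to catch spelling errors.
    name_a = f"{a.get('first_name', '')} {a.get('last_name', '')}".lower()
    name_b = f"{b.get('first_name', '')} {b.get('last_name', '')}".lower()
    similarity = SequenceMatcher(None, name_a, name_b).ratio()
    # Hypothetical penalty: similar names with different birth dates are weak evidence.
    return similarity if a.get("dob") == b.get("dob") else similarity * 0.5


def merge_decision(a, b):
    """Return 'merge' for automatic merging, 'review' for manual inspection."""
    return "merge" if match_confidence(a, b) >= AUTO_MERGE_THRESHOLD else "review"


# Placeholder e-mail addresses; the slide's values are not legible.
bond = {"first_name": "James", "last_name": "Bond",
        "dob": "1968-07-14", "eMail": "jbond@example.com"}
bind = {"first_name": "James", "last_name": "Bind",
        "dob": "1968-07-14", "eMail": "james.b@example.com"}
print(merge_decision(bond, bind))
```

With these illustrative weights, a misspelled surname alone falls just short of the automatic threshold and is routed to manual review, while a shared e-mail address merges immediately.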

Develop

Step 7 (cont’d): MongoDB Tools

•  Workers framework to parallelize document comparisons
•  Grouping tool to cluster documents based on attribute similarity
–  Levenshtein to calculate distances, single-linkage clustering for matching
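The internals of the grouping tool are not shown in the slides, so the following is only an illustrative sketch of the two named techniques: a standard dynamic-programming Levenshtein distance, and single-linkage clustering implemented with a union-find structure. The function names, the sample names, and the distance threshold are assumptions.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b), # substitution or match
            ))
        previous = current
    return previous[-1]


def single_linkage_clusters(names, max_distance):
    """Group names so that any two names within max_distance share a cluster.
    Single linkage: clusters merge if ANY pair of their members is close."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if levenshtein(names[i], names[j]) <= max_distance:
                parent[find(i)] = find(j)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return sorted(clusters.values())


names = ["James Bond", "James Bind", "Jim Bond", "Mark Smith"]
print(single_linkage_clusters(names, max_distance=3))
```

In this example "Jim Bond" and "James Bind" are four edits apart, yet they end up in the same cluster because each is within three edits of "James Bond". That chaining behaviour is characteristic of single linkage, and it is one reason a confidence check before merging remains useful.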

Develop

Step 8: Architecture Design

Deploy

•  Deployment infrastructure
•  MongoDB Production Readiness Consulting Package provides recommendations:
–  Hardware sizing
–  HA/DR strategies
–  Scaling
–  Security for corporate and regulatory compliance
•  Follow-on services for implementation

Step 9: Modify Consuming Systems

Deploy

•  Modify the apps that consume the single view
–  Create an API that exposes the single view (e.g. a RESTful web service)
–  Re-point apps to the web service (reads initially)

•  Modify one consuming application at a time

[Diagram: Consuming Systems (Call Center, Analytics, Technical Support, Billing) read from the Single View]
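A read-only web service in front of the single view might look like the sketch below. Everything here is an assumption made for illustration: the `/customers/<id>` route, the in-memory `SINGLE_VIEW` dictionary standing in for the database, and the handler names. It uses only Python's standard `http.server` module; a production service would sit on a proper web framework and query the database directly.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory stand-in for the single view collection.
SINGLE_VIEW = {
    "c-1001": {"first_name": "James", "last_name": "Bond", "dob": "1968-07-14"},
}


def read_customer(path, store=SINGLE_VIEW):
    """Map GET /customers/<id> to an (HTTP status, JSON body) pair."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "customers":
        return 404, json.dumps({"error": "unknown resource"})
    doc = store.get(parts[1])
    if doc is None:
        return 404, json.dumps({"error": "customer not found"})
    return 200, json.dumps({"_id": parts[1], **doc})


class SingleViewHandler(BaseHTTPRequestHandler):
    """Read-only endpoint: consuming apps are re-pointed here for reads first."""

    def do_GET(self):
        status, body = read_customer(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the service; call serve() explicitly to run it."""
    HTTPServer(("127.0.0.1", port), SingleViewHandler).serve_forever()
```

Keeping the lookup logic in a plain function (`read_customer`) separate from the HTTP handler makes it easy to test, and easy to swap the dictionary for real database queries as each consuming application is migrated.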

Step 10: Implement Maintenance Processes

Deploy

•  Frequency of application launch & evolution is accelerating

•  Impacts to single view
–  Adding new attributes from source systems
–  Onboarding new data sources or digital channels
–  Creating new apps that consume the single view

•  Single view team needs to institutionalize governance around ongoing maintenance
–  Repeat the 10-step process
–  Dynamic schema is HUGE: new attributes do not need to be pre-declared

Single View Maturity Model

Transforming the role of the single view (chart axes: Scope vs. Business Benefits)

Read Centric
•  Single View Application: Records are copied via ETL or message queue mechanisms from the source systems into the single view, serving read queries. The single view serves one specific application.
•  Single View Platform: The single view data model is enriched with additional sources to serve more applications, including real-time analytics. The single view becomes a platform serving multiple applications.

Reads & Writes
•  Dual Writes: Writes are performed concurrently to the source systems as well as the single view.
•  Single View First: Transactions are written first to the single view, which propagates the data back to the source system of record.

Single View Maturity Model

•  Advantages of writing to the single view
–  Fresher data
–  Reduced app complexity
–  Improved application agility

Architecture for Writes to the Single View

[Diagram: Source Systems (Web, Mobile, CRM, Mainframe) load data into the Single View via ETL or Message Queue. Consuming Systems (Call Center, Analytics, Technical Support, Billing) issue reads and writes against the Single View. An Update Queue sits on the write path.]
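The "Single View First" write path can be sketched as follows, assuming the Update Queue is what carries each write back to the source system of record. The class name, method names, and the in-memory queue are hypothetical stand-ins; a real deployment would use a durable message queue and handle failures and retries.

```python
from collections import deque


class SingleViewFirstWriter:
    """Apply writes to the single view first, then queue them for the
    source system of record (hypothetical, in-memory illustration)."""

    def __init__(self):
        self.single_view = {}        # stand-in for the single view collection
        self.update_queue = deque()  # stand-in for a durable message queue

    def write(self, doc_id, changes, source_system):
        """Update the single view immediately and enqueue the change."""
        self.single_view.setdefault(doc_id, {}).update(changes)
        self.update_queue.append((source_system, doc_id, dict(changes)))

    def drain(self, apply_to_source):
        """Propagate queued changes in order; returns how many were applied."""
        applied = 0
        while self.update_queue:
            source_system, doc_id, changes = self.update_queue.popleft()
            apply_to_source(source_system, doc_id, changes)
            applied += 1
        return applied


writer = SingleViewFirstWriter()
writer.write("c-1001", {"city": "San Francisco"}, source_system="CRM")

propagated = []
writer.drain(lambda system, doc_id, changes: propagated.append((system, doc_id, changes)))
print(writer.single_view, propagated)
```

Readers of the single view see the new value as soon as `write` returns, while the source system catches up asynchronously; this is the "fresher data" advantage listed above, at the cost of having to manage the queue.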

Required Capabilities for Single View

Single View with a Relational Database

Required Database Capabilities

•  Data model flexibility with a dynamic schema
•  Real-time analytics
•  Performance, scale & always-on
•  Enterprise deployment model

Enterprise Deployment Model

•  MongoDB Ops Manager: Monitoring & Alerting, Query Optimization, Backup & Recovery, Automation & Configuration
•  MongoDB Compass: Schema Visualization, Data Exploration, Ad-Hoc Queries
•  MongoDB Connector for BI: Visualization, Analysis, Reporting
•  MongoDB Enterprise Server: Authorization, Auditing, Encryption (In Flight & at Rest), Authentication
•  REST API
•  24 x 7 Support (1 hour SLA)
•  Commercial License (No AGPL Copyleft Restrictions)
•  Platform Certifications
•  Emergency Patches
•  Customer Success Program
•  On-Demand Online Training
•  Warranty
•  Limitation of Liability
•  Indemnification

Single View In Action

Single View of Customer
Insurance leader generates coveted single view of customers in 90 days – “The Wall”

Problem
•  No single view of customer, leading to poor customer experience and churn
•  145 years of policy data, 70+ systems, 24 toll-free 800 numbers, 15+ front-end apps that are not integrated
•  Spent 2 years and $25M trying to build a single view with an RDBMS – failed

Solution
•  Built “The Wall,” pulling in disparate data and serving a single view to customer service reps in real time
•  Flexible data model to aggregate disparate data into a single data store
•  Expressive query language and secondary indexes to serve any field in real time

Results
•  Prototyped in 2 weeks
•  Deployed to production in 90 days
•  Decreased churn and improved ability to upsell/cross-sell

Single View of LHC Analytics
Data aggregation system to accelerate scientific research & discovery

Problem
•  Raw data from LHC & experiments distributed across a multitude of source systems
•  Scientists don’t know the location of source data, or how to extract it
•  Relational databases’ rigid data model prevented aggregation of data from different sources

Solution
•  Data Aggregation System built on MongoDB, consolidating analytics into a single view
•  Dynamic schema represents data of any structure
•  MongoDB query language supports simple lookups to complex search, traversals & analytics

Results
•  A single query to MongoDB can return 10,000 documents from different data sources for real-time analytics
•  Accelerates scientific time to insight
•  Accessed by 3,000 physicists from 200 research institutions across the globe

Wrap Up

Where to Go from Here?

•  Single view projects are challenging
–  Partner with a vendor offering proven methodology, tools & technologies

•  Learn More
–  Download the whitepaper: 10-Step Methodology to Building a Single View

•  Engage
–  MongoDB Global Consulting Services can help you scope the project and get started
–  Book a workshop

10-Step Methodology to Building a Single View

Single View of the Customer
Global Airline: 360° view of the customer increases customer satisfaction, cross-sell & up-sell with MongoDB, Spark, & Hadoop

Problem
•  Customer data scattered across 100+ different systems
•  Poor customer experience: no personalization, no consistent experience across brands or devices
•  No way to analyze customer behavior to deliver targeted offers

Solution
•  Single View application on MongoDB: flexible data model, expressive query language, secondary indexes, & horizontal scalability
•  Data from old relational systems fed into Spark for analysis and then stored in MongoDB to support real-time CRM
•  Customer data synced from MongoDB to Hadoop for nightly batch jobs, then fed back to MongoDB for personalized recommendations

Results
•  Single view serves customers from any channel
•  Stores 10s of TBs of customer data across multiple data centers
•  Increased revenues from improved customer intimacy, driving cross-sell and upsell

Data Model Flexibility

[Diagram: data sources (Mobile App, Web, Call Centre, CRM, Social Feed, …) feed the Single View]

COMMON FIELDS: CustomerID | eMail |

DYNAMIC FIELDS: can vary from record to record (location, action)

Real-Time Analytics

[Diagram: the Single View is held in a MongoDB replica set with one primary replica and several secondary replicas. Labeled workloads: Customer Service Application (queries & updates), BI & Reporting (visualisations), REST Data Services (real-time data services for regulators & partners), and a Data Analytics Pipeline (aggregates, predictive analytics).]

Predictable Scale & Always-On

[Diagram: Horizontally Scalable, with data distributed across Shard 1, Shard 2, Shard 3, … Shard n]