Page 1: Provided by MapR Data Scientists · 2019-04-08 · MapR Quick Start Solutions: Speeding Time-to-Value Data Warehouse Offload, Optimization and Analytics Real-Time Security Log Analytics,

© 2017 MapR Technologies · MapR Confidential

Introduction to Professional Services Provided by MapR Data Scientists

정인철, Director, Ph.D.

Data Engineer, MapR

2017.06.15

Page 2:

Professional Services

Focus on Customer Success

Key Institutional Knowledge

• Data Science

• Data Engineer

• Solution Design

Measurable Results

• Fast execution

• ROI

• Risk management

Page 3:

Professional Services Areas

Hadoop Core Services

• Installation
• Migrations
• SLA Plans
• Best Practices
• Performance Tuning

IT / Infrastructure
Linux
Networking
Data Center
Storage
Operations

Big Data Workflows

• Hive/Pig
• Oozie/Sqoop
• Flume
• M7/HBase
• Data Flow

BI / DBA
BI / ETL / Reporting
Scripting / Java
Hadoop MR
Eco Projects (HBase, Hive, …)

Solution Design

• HBase/M7
• Map/Reduce
• Application Development
• Integration Development

Java
Hadoop Developer
Architectural Design

Advanced Analytics

• Use case Discovery
• Use case Modeling
• POC
• Workshops

Modeler / Analyst
PhD
Statistics/Math
MatLab / R / SAS
Scripting / Java
BI / ETL / Reporting

Row labels: ENGAGEMENTS, AUDIENCE, SKILLS

Group labels: Data Engineering, Data Science

Page 4:

MapR Global PS Team

Data Scientists / Data Engineers (DS/DE) located globally:

• North America: Data Scientists / Data Engineers (Korean)
• EMEA: Data Scientists / Data Engineers
• Asia Pacific: Data Scientists / Data Engineers (Korean)

Page 5:

Types of Services Offered

• Hadoop Core Service

– Hadoop Operations

• Big Data Workflows

– Custom Big Data ETL Workflows

• Solution Design

– Application Design, Implementation, Integration

• Advanced Analytics

– Data Science

Page 6:

Hadoop Core Service

Delivery of general Hadoop Core Services

• Implementation / Deployment

• Cluster Migration

– From other Distributions to MapR

– Development to Stage to Production Cluster

• MapR Upgrades

• Cluster Tuning and Optimization

• Cluster Health Check / Best Practices

• SLA/DR Strategies

Page 7:

Big Data Workflows

• Architect and develop custom Big Data ETL workflow:

– Hive/Pig

– Flume/Oozie

– HBase/M7

– NFS

– Etc …

• Custom ETL Big Data workflow includes:

– Ingest

– Processing

– Access
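The three stages named above (ingest, processing, access) can be illustrated with a toy pipeline. This is only a sketch in plain Python with no Hadoop tooling; the sample data and the function names `ingest`, `process`, and `access` are hypothetical and are not taken from the slides.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input standing in for an ingested source (e.g. server logs).
RAW_EVENTS = """user,action,amount
alice,purchase,30
bob,view,0
alice,purchase,12
bob,purchase,5
"""


def ingest(raw_text):
    """Ingest stage: parse raw delimited text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def process(records):
    """Processing stage: keep purchases and total the amount per user."""
    totals = defaultdict(int)
    for record in records:
        if record["action"] == "purchase":
            totals[record["user"]] += int(record["amount"])
    return dict(totals)


def access(totals, user):
    """Access stage: serve a lookup against the processed result."""
    return totals.get(user, 0)


records = ingest(RAW_EVENTS)
totals = process(records)
print(access(totals, "alice"))
```

In a real engagement each stage would presumably map to one of the tools listed on the slide (for example Flume for ingest and Hive for processing), but that mapping is an assumption here.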

Page 8:

Big Data Workflow On MapR

Diagram (MapR Data Platform), labels by group:

• Data Sources: Clickstream, Sensor Data, Billing Data, CRM / ERP, Product Catalog, Social Media, Server Logs, Merchant Listings, Online Chat, Call Detail Records
• Ingest: Sqoop, Flume, HDFS, NFS
• Processing and Analytics: M7, HBase, MapReduce v1 & v2, Storm, Cascading, Pig, Solr, Mahout, YARN, Oozie, Hive, MLlib
• Access: Tez, Drill, Hive, Pig, Impala
• Visualization

Page 9:

Solution Design

• Customizing Map/Reduce jobs

• Customizing Spark jobs

• Application development on MapR's unified data platform

• HBase application development

• Application development using the big data ecosystem
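For readers unfamiliar with what a Map/Reduce job looks like, the following is a minimal single-process sketch of the map, shuffle, and reduce phases using a word count. It does not use the Hadoop API; the function names are hypothetical, and the shuffle step is written out explicitly even though a real framework performs it for you.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map over every line, shuffle the pairs, then reduce each key."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = run_job(["Hive Pig Hive", "pig Drill"])
print(counts)
```

Customizing a job in this model usually means changing what `map_phase` emits and how `reduce_phase` combines values, while the shuffle stays the same.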

Page 10:

Advanced Analytics

• Use case Discovery

• Use case Modeling

• Machine Learning

• POC of Modeling solutions

• Workshops

Page 11:

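Service Formats

• Services are delivered as a combination of formats

– Weekly engagements (1 to 4 weeks)

– Daily engagements (1 to 3 days)

• e.g., training, workshops, use cases

– Hourly engagements

– Project-based engagements

– QSS (Quick Start Service)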

Page 12:

Quick Start Solutions

Page 13:

What/Why Quick Start Solutions?

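"We want to get a basic setup in place first. Where can we start?"

"We don't know what we should be doing."

"We want to finish the build-out within 1 to 2 months and even run a useful use case."

"Once we know how it's done, we could take it from there on our own…"

A service suited to cases where staff resources, scope, and budget are limited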

Page 14:

MapR Quick Start Solutions: Speeding Time-to-Value

Data Warehouse Offload, Optimization and Analytics

Real-Time Security Log Analytics, Production Log analysis

Customer 360, Social Media Analysis, Recommendation Engine

Time Series Analytics, NoSQL Webstore Applications

Deep Learning on GPUs for Image Analytics

Solution templates

Fast delivery

Knowledge transfer

Financial Services – Fraud Detection, Anti-Money Laundering

Complex Event Processing with Drools / Stream Processing

Self Service Data Exploration and BI Analytics on Hadoop

Page 15:

Enterprise Data Hub

Page 16:

Data Warehouse Optimization QSS

Data Transformation/ETL on Hadoop

Offloading “cold” data to Hadoop

• Restores storage capacity
• One-time offload capitalizes on historic underused data
• Minimal impact to existing data pipelines
• Present new data for exploration
• ETL work includes incremental updates
• Restores CPU capacity and storage to DW

Page 17:

Offload Cold Data to Hadoop

Diagram labels: Incoming Data, ETL, Structured Data, Data Warehouse, MapR Data Platform, Cold Data Offload, Log Archive

▪ Process:
  ▪ One-time Migration of Cold Data
  ▪ Demonstration of Data upload to DW
▪ Data Access:
  ▪ ODBC
  ▪ Thrift
  ▪ Standard Connectors
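The one-time migration of cold data amounts to partitioning warehouse rows by some coldness criterion and moving the cold part. The sketch below assumes "cold" means "not accessed since a cutoff date"; the slides do not define the criterion, and the `last_accessed` field, the cutoff, and the in-memory lists are all hypothetical stand-ins.

```python
from datetime import date


def split_cold(rows, cutoff):
    """Partition rows into (hot, cold); rows last accessed before `cutoff` are cold."""
    hot = [row for row in rows if row["last_accessed"] >= cutoff]
    cold = [row for row in rows if row["last_accessed"] < cutoff]
    return hot, cold


# Hypothetical warehouse contents.
warehouse = [
    {"id": 1, "last_accessed": date(2017, 5, 1)},
    {"id": 2, "last_accessed": date(2014, 1, 15)},
    {"id": 3, "last_accessed": date(2015, 11, 30)},
]

hot, cold = split_cold(warehouse, cutoff=date(2016, 1, 1))
archive = list(cold)  # stand-in for the offload target on the data platform
warehouse = hot       # the warehouse keeps only the hot rows

print([row["id"] for row in archive], [row["id"] for row in warehouse])
```

Every row lands in exactly one of the two partitions, so the row counts of `archive` and `warehouse` should always add up to the original count.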

Page 18:

Cold Data Offload - Deliverables

| Phases | Task Summary/Deliverables | Timeline |
|---|---|---|
| Cluster Preparation | Properly Installed, Configured and validated MapR Cluster ready for QSS | 5 Days |
| Requirements Review | Review and agree upon requirements and goals | 1 Day |
| Data Selection | Data Set Identification and Justification; Data Transfer Plan | 1 Day |
| Historical Data Migration | Configuration of new data source; Data Migration; Data Access (ODBC or other Tool); Metadata Creation and Service Setup | 10 Days |
| Data and Services Validation | Validation of the following: Transferred Data parity; ODBC availability; Metadata Service accuracy | 3 Days |
| TOTAL SCOPE | | 20 Days |
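The validation phase lists "Transferred Data parity" without saying how it is checked. One common approach, shown here purely as an assumption, is to compare row counts plus an order-independent digest of the rows on both sides. The helper names `table_fingerprint` and `has_parity` are hypothetical.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, digest); the digest does not depend on row order."""
    row_hashes = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256("".join(row_hashes).encode("utf-8")).hexdigest()
    return len(row_hashes), combined


def has_parity(source_rows, target_rows):
    """True when both sides hold the same multiset of rows."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)


source = [(1, "alice", 30), (2, "bob", 5)]
migrated = [(2, "bob", 5), (1, "alice", 30)]  # same rows, different order
truncated = [(1, "alice", 30)]                # one row lost in transfer

print(has_parity(source, migrated))
print(has_parity(source, truncated))
```

Because the digest is built from `repr`, values must have the same types on both sides (the integer 30 and the string "30" would not match), so a real check would normalize types before hashing.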

Page 19:

Offload ETL onto Hadoop

Diagram labels: Incoming Data, ETL, Low Latency Data, Bulk Data, Data Warehouse, MapR, Restored CPU and Disk

▪ Process:
  ▪ One-time Migration of Historical Data
  ▪ Redirection on New Data
  ▪ Migration of ETL onto Hadoop
  ▪ Demonstration of Data upload to DW
▪ Data Access:
  ▪ ODBC
  ▪ Thrift
  ▪ Standard Connectors
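Once historical data has been migrated, new data has to be picked up incrementally. A simple way to sketch that, assuming each row carries a monotonically increasing `updated_at` value (a hypothetical field not mentioned on the slides), is a high-watermark loop: each run takes only rows newer than the last watermark and then advances it.

```python
def incremental_batch(rows, watermark):
    """Return (rows newer than `watermark`, advanced watermark)."""
    new_rows = [row for row in rows if row["updated_at"] > watermark]
    new_watermark = max((row["updated_at"] for row in new_rows), default=watermark)
    return new_rows, new_watermark


# Hypothetical source table; `updated_at` is an integer timestamp for simplicity.
source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
    {"id": 3, "updated_at": 30},
]

first_batch, watermark = incremental_batch(source, watermark=0)

source.append({"id": 4, "updated_at": 40})
second_batch, watermark = incremental_batch(source, watermark)

third_batch, watermark = incremental_batch(source, watermark)

print(len(first_batch), len(second_batch), len(third_batch), watermark)
```

The first run behaves like the one-time migration, the second picks up only the new row, and the third finds nothing and leaves the watermark unchanged.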

Page 20:

Offload ETL onto Hadoop - Deliverables

| Phases | Task Summary/Deliverables | Timeline |
|---|---|---|
| Cluster Preparation | Properly Installed, Configured and validated MapR Cluster ready for QSS | 5 Days |
| Requirements Review | Review and agree upon requirements and goals | 1 Day |
| Architecture Design | Development and Implementation Plan | 2 Days |
| Data Selection | Data Set Identification and Justification; Data Transfer Plan | 1 Day |
| Historical Data Migration | Configuration of cluster to support new data source; Data Migration of existing data | 3 Days |
| Transient Data Ingestion | Configuration of cluster to support new data source; Data ingest in current/new data | 3 Days |
| Workflow Implementation | ETL Process Development; Workflow Management Setup; Service Implementation | 5 Days |
| Data and Workflow Validation | Validation of the following: Transferred Data parity; ODBC availability; Metadata Service accuracy; Workflow | 5 Days |
| TOTAL SCOPE | | 25 Days |
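As a quick arithmetic check, the phase durations listed above can be summed to confirm the stated total. The conversion to weeks assumes the phases run back to back at five working days per week, which the slide does not state.

```python
# Phase durations in days, copied from the deliverables listed above.
phases = [
    ("Cluster Preparation", 5),
    ("Requirements Review", 1),
    ("Architecture Design", 2),
    ("Data Selection", 1),
    ("Historical Data Migration", 3),
    ("Transient Data Ingestion", 3),
    ("Workflow Implementation", 5),
    ("Data and Workflow Validation", 5),
]

total_days = sum(days for _, days in phases)

# Assumption (not on the slide): sequential phases, five working days per week.
WORKING_DAYS_PER_WEEK = 5
calendar_weeks = total_days / WORKING_DAYS_PER_WEEK

print(total_days, calendar_weeks)
```

The sum matches the stated total scope of 25 days, which under the five-day-week assumption is about five weeks of elapsed time.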

Page 21:

Q&A

@mapr

[email protected]

ENGAGE WITH US