Enterprise Conference 2013 Microsoft BigData case study presentation
DESCRIPTION
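This presentation from Microsoft's Enterprise Conference introduces Microsoft's Big Data solution stack and a case study of its adoption by a customer (Shinsegae). The case combines a low-cost, open-source Big Data management layer with an existing DW/BI architecture, which compensates for the open-source ecosystem's current weakness in putting Big Data to use. PolyBase, the core component of this combination, was applied here for the first time in Korea. More details on PolyBase are available from the Microsoft Gray Systems Lab: http://gsl.azurewebsites.net/Projects/Polybase.aspx. The result is a Big Data hybrid architecture that combines the Hadoop ecosystem as back-end infrastructure, PDW as the MPP appliance, and the highly usable Microsoft BI stack. Similar architectures are expected to remain mainstream for a considerable time.
TRANSCRIPT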
![Page 1: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/1.jpg)
Microsoft
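Enterprise Conference 2013: Reimagining Your Enterprise with Microsoft!
Microsoft's Big Data strategy and a customer case study for retail/service customers
Microsoft Korea: Park Myung-eun (박명은), General Manager; Shinsegae: Kim Hoon-dong (김훈동), Manager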
![Page 2: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/2.jpg)
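Business analytics needs for new data
• 70% of US smartphone users shop online via mobile devices
• 44% of users access Facebook via mobile devices (350M people)
• 50% of millennials research products via mobile devices
• 60% of US mobile data is expected to be audio/video streaming data (2014)
• 2/3 of worldwide mobile data traffic is expected to be video data (by 2016)
• 33% of BI will be delivered through handheld devices
• Game consoles are used an average of 1.5 hrs/wk to access the internet
• Unstructured data is expected to grow by 80% (within the next 5 years)
• 1.8 zettabytes of digital data were used (as of 2011), a 30% increase over 2010
• 1 in 4 Facebook users post their address on Facebook (2B/month)
• 500M tweets are sent every day
• 38% of users recommend brands through "follow" or "like" on social media
• Brands get 100M Facebook "likes"/day
Mobility / Big Data / Social / Mobile / Cloud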
![Page 3: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/3.jpg)
Challenges of traditional analytics platforms
![Page 4: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/4.jpg)
Modernized Data Warehouse
Software
Collaboration
Self-service BI
Virtualization
“Big Data”
“In-Memory”
Massively parallel processing
HW appliance
Cloud
![Page 5: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/5.jpg)
Modern Data Warehouse
![Page 6: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/6.jpg)
Modern Data Warehouse
![Page 7: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/7.jpg)
Unstructured
Structured
Power Pivot
Power View
Power Query
IT provisioning
Self-service
Analytical
Relational
Non-relational
In-memory/OLAP
All data
Power Map
Rapid data analysis
Social web API
![Page 8: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/8.jpg)
Microsoft’s Modern Data Warehouse
![Page 9: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/9.jpg)
The Modern Data Warehouse (architecture diagram)
Columns: DATA SOURCES, DATA TYPES, INFRASTRUCTURE, TOOLS, USERS
Data types: structured (e.g. MM/DD/YYYY); semi-structured (e.g. web logs, RFID, “the internet of things”)
Other labels: Apps, Biz process, ERP, CRM, Business, Machines, Social Media, Web Logs, ETL, EDW, DM, PDW, PolyBase, Hadoop, HDInsight, HDP on Windows, Power BI, “Whiteboard”, “Data Scientists”, “Quants”, “Known, known”, “Unknown, unknown”, Big Data
![Page 10: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/10.jpg)
![Page 11: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/11.jpg)
Fast data access: parallelism and scalability
MPP-based SQL Server 2012 Parallel Data Warehouse
![Page 12: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/12.jpg)
Fast data access: in-memory technology for real-time analytics
SQL Server 2012 Parallel Data Warehouse column-oriented data storage
Customer, Sales, Country, Supplier, Products
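The idea behind column-oriented storage can be sketched with a toy example. This is not how PDW stores data internally. It is a minimal Python illustration, assuming a small in-memory table, of why an aggregate over one column touches less data in a column layout. Only the column names come from the slide; the rows and the `to_columns` helper are hypothetical.

```python
# Toy illustration of row-oriented vs column-oriented layout.
# Only the column names come from the slide; the values and helper
# names below are hypothetical.

rows = [
    {"Customer": "A", "Sales": 120, "Country": "KR", "Supplier": "S1", "Products": "P1"},
    {"Customer": "B", "Sales": 80, "Country": "US", "Supplier": "S2", "Products": "P2"},
    {"Customer": "C", "Sales": 200, "Country": "KR", "Supplier": "S1", "Products": "P3"},
]


def to_columns(row_store):
    """Pivot a list of row dicts into one list per column."""
    columns = {name: [] for name in row_store[0]}
    for row in row_store:
        for name, value in row.items():
            columns[name].append(value)
    return columns


columns = to_columns(rows)

# Row layout: every full row is visited just to read one field.
total_from_rows = sum(row["Sales"] for row in rows)

# Column layout: only the "Sales" column is visited.
total_from_columns = sum(columns["Sales"])
```

Both totals are the same; the difference is that the column layout reads a single contiguous list, which is the property analytic scans benefit from.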
![Page 13: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/13.jpg)
All data types: “Big Data”
HDInsight in Azure and HDInsight in PDW
![Page 14: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/14.jpg)
![Page 15: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/15.jpg)
Easy data access with familiar tools
Power View and PowerPivot visualization and modeling tools
![Page 16: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/16.jpg)
Intuitive visualization
Preview: Project codename “GeoFlow”
![Page 17: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/17.jpg)
Ease of management
SharePoint online/offline
![Page 18: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/18.jpg)
![Page 19: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/19.jpg)
Appliance
SQL Server 2012 Parallel Data Warehouse
![Page 20: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/20.jpg)
Microsoft Big Data: a deployment case in Korea
![Page 21: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/21.jpg)
SHINSEGAE BigData – Our Team
Shinsegae Group -> S.COM -> DataLab team (tentative name) -> ‘Money Mall’ Project
A project to build “a shopping mall that makes money and speaks in numbers”
![Page 22: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/22.jpg)
SHINSEGAE BigData – Our Goal
Our Big Picture
Real-time collection
![Page 23: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/23.jpg)
SHINSEGAE BigData – Background for Adoption
What is it? Why it? (from an ROI perspective)
Traditional Method vs BigData Method
Investment side:
• Volume (25:1): DW Appliance 30 TB vs Hadoop Cluster 420 TB (SW = 0, only HW)
• Velocity (10:1): In-Memory Appliance 128 GB + Real-time Engine vs In-Memory NoSQL 1 TB Real-time Analytics (SW = 0, only HW)
• Variety & Value (15+α:1): Mining SW + Text Mining SW + Weblog SW + Mining Model Development Cost vs Open Source R + Mahout (SW = 0) + R + Mahout Development Cost = ??
Return side:
• Traditional Method: Very High Cost & Middle-High Return
• BigData Method: Open Source… Free… But, ?? Cost & ?? Return. And, High Risk & High Return
![Page 24: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/24.jpg)
SHINSEGAE BigData – Architecture1
TO-BE information systems & BigData ecosystem
S.com ODS
OLAP / BI
Mining
Campaign
Mart
EDW
BigData (Hadoop)
NoSQL cache & random access
ETL / Batch
Operational systems
Analytical systems
Off-line
shin.mall
e.mall
MS Solution
![Page 25: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/25.jpg)
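SHINSEGAE BigData – Architecture2
Why the MS solution was chosen
DW / ETL: MS PDW
• Management UI and usability: excellent
• Cost ROI: excellent
• MPP performance and scalability: good
• Concurrent-connection handling: good
• Backup and failure handling: good
• Column index and ROLAP + MOLAP + RDBMS integration: excellent
BI (OLAP + EIS) + α: MS BI
• A combination of the five technologies below, extended with In-Memory (Tabular) + Local-Memory (PowerPivot)
• PowerPivot + PowerView + SharePoint + Silverlight + .NET Framework
• Productivity + visual polish + integration & convergence + customization & detail + scalability & performance & varied UX & performance & simplicity & sharing & cross-browser support & rich APIs
BigData integration (with PolyBase)
MS PDW, Hadoop: Combine (with PolyBase), Insight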
![Page 26: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/26.jpg)
SHINSEGAE BigData – Architecture3
Why the MS solution was chosen (2)
Development Benefit
• Developer & power data scientist environment: Windows OS + Windows HDP + Java M.R. or C# Streaming + Linq To Hive & Avro support (http://hadoopsdk.codeplex.com)
• Visual Studio 2013 IDE: Python support (+ Django web framework), Python Streaming, C# Streaming, run-time remote debugging.
• Serves as an exploring mart that puts no load on production. Local Power Pivot visualization.
Hadoop Eco System + MS Tech Mash-up
• STORM vs MS-SQL StreamInsight
• R, Mahout vs MS-SQL SSAS Data Mining
• Redis, Memcached vs Tabular
• R, D3, jQuery vs PowerPivot, PowerView, Silverlight
Cloud Infra / SaaS
• HDInsight (PDW + HDFS) option
• Windows Hyper-V virtualization: a clear strength in the staging & dev areas, even if not in production.
• Windows Azure Storage: an option to consider for out-of-date data (Hadoop connection)
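The "Python Streaming" item refers to Hadoop Streaming, where the mapper and reducer are ordinary programs that read lines and write tab-separated key/value lines. The following is a minimal sketch of that pattern, not code from the presentation. The record layout `item_id,cat_lvl1,qty` and the function names are assumptions made for illustration, and `run_locally` only simulates the shuffle/sort step in one process.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'category<TAB>1' per record.

    Assumes a hypothetical CSV layout: item_id,cat_lvl1,qty
    """
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 3:
            continue  # skip malformed records
        yield f"{fields[1]}\t1"


def reducer(sorted_lines):
    """Sum the counts per key.

    Hadoop Streaming hands the reducer its input sorted by key,
    so consecutive lines with the same key can be grouped.
    """
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def run_locally(records):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    return list(reducer(sorted(mapper(records))))
```

In a real streaming job the two functions would be wired to standard input and output in separate scripts; keeping them as generators makes them easy to test locally first.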
![Page 27: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/27.jpg)
SHINSEGAE BigData – Expected Benefits 1
BigData Visualization
& In-Memory Performance
Hadoop Eco Systems
NoSQL
PDW ( with PolyBase )
Tabular
Power Pivot / Silverlight
![Page 28: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/28.jpg)
SHINSEGAE BigData – Expected Benefits 2
Performance & Productivity
• Test systems: PolyBase (MS PDW ver2 Half Rack), Hive (24 TB, 48 GB, 12 cores × 5 nodes + 4 TB, 64 GB, 12 cores × 2 nodes)
• Test full-scan average query (Hive version):
• SELECT avg(scm.qty) FROM scm JOIN item ON (item.item_id = scm.item_id AND item.cat_lvl1 > 5);
• Test full-scan GROUP BY query (Hive version):
• SELECT item.cat_lvl1, count(*) FROM scm JOIN item ON (item.item_id = scm.item_id AND item.cat_lvl1 > 5) GROUP BY item.cat_lvl1;
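To see what the two test queries compute, the same query shapes can be run on a toy dataset. This is a sketch using Python's built-in `sqlite3` module rather than Hive or PolyBase. The table and column names come from the slide; the schema types and the sample rows are assumptions, and the `ORDER BY` is added only to make the output order deterministic.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, cat_lvl1 INTEGER);
    CREATE TABLE scm (item_id INTEGER, qty INTEGER);
    INSERT INTO item VALUES (1, 3), (2, 6), (3, 7);
    INSERT INTO scm VALUES (1, 10), (2, 4), (2, 6), (3, 8);
    """
)

# Full-scan average over the join, keeping only categories above 5.
avg_qty = conn.execute(
    """
    SELECT avg(scm.qty)
    FROM scm JOIN item ON (item.item_id = scm.item_id AND item.cat_lvl1 > 5)
    """
).fetchone()[0]

# Full-scan row count per top-level category over the same join.
group_counts = conn.execute(
    """
    SELECT item.cat_lvl1, count(*)
    FROM scm JOIN item ON (item.item_id = scm.item_id AND item.cat_lvl1 > 5)
    GROUP BY item.cat_lvl1
    ORDER BY item.cat_lvl1
    """
).fetchall()

conn.close()
```

With these sample rows, item 1 (category 3) is filtered out by the join condition, so only the three `scm` rows for items 2 and 3 contribute.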
| Engine | Full Scan Avg Join (726 MB, 21 MB) | Full Scan Group By Join (726 MB, 21 MB) | Full Scan Avg Join (7.2 GB, 212 MB) | Full Scan Group By Join (7.2 GB, 212 MB) |
|---|---|---|---|---|
| Hive | 35.469 s | 88.21 s | 33.884 s | 85.147 s |
| PDW PolyBase | 16 s | 51 s | 18 s | 48 s |
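The relative speedups implied by these timings can be worked out directly. The short Python sketch below assumes that the four timings in each row correspond, in order, to the four test labels; the dictionary keys are shorthand introduced here.

```python
# Timings in seconds, as listed in the results above.
# Assumption: values map to the four test labels in the order given.
hive_seconds = {
    "avg_join_726MB": 35.469,
    "group_by_join_726MB": 88.21,
    "avg_join_7.2GB": 33.884,
    "group_by_join_7.2GB": 85.147,
}
polybase_seconds = {
    "avg_join_726MB": 16.0,
    "group_by_join_726MB": 51.0,
    "avg_join_7.2GB": 18.0,
    "group_by_join_7.2GB": 48.0,
}

# Speedup = Hive time / PolyBase time for the same test.
speedup = {
    name: hive_seconds[name] / polybase_seconds[name] for name in hive_seconds
}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

Under that assumption, PDW PolyBase finishes each test roughly 1.7 to 2.2 times faster than Hive on these queries.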
![Page 29: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/29.jpg)
SHINSEGAE BigData – Expected Benefits 3
Performance & Productivity
![Page 30: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/30.jpg)
SHINSEGAE BigData – Expected Benefits 3
BigData Collection Infra & Usage
![Page 31: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/31.jpg)
SHINSEGAE BigData – Road Map
Road Map & Future Work
An omni-channel that integrates online and offline
• Mart & department store
• Online & offline
• Internal & external
![Page 32: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/32.jpg)
Microsoft’s Modern Data Warehouse
HDInsight Service
![Page 33: Enterprise conference 2013 Microsoft BigData 사례발표자료](https://reader034.vdocuments.net/reader034/viewer/2022051818/54b3850d4a79597d218b458a/html5/thumbnails/33.jpg)
Thank you.