The Challenge of Big Data at an E-commerce Company and...
TRANSCRIPT
-
Rakuten Inc. RIT. Masaya Mori Nov. 7th, 2012
E-commerce
-
2
Rakuten Open Data
IT
http://rit.rakuten.co.jp/rdr/index.html
-
3
Introduction
Introduction
SuperDB
BigData
-
4
Introduction
Masaya Mori
Twitter: @emasha
-
5
Rakuten Group
Introduction
SuperDB
BigData
-
6
3,209 / 7,615
1997217
IPO: 2000/4/19
1,079 (2011/12)
3,799 (2011)
756 (2011)
-
7
13 EC
2 (Tokyo, New York)
Free Cause(USA) Linkshare(USA) Tradoria(Germany)
-
8
Next Reality
R&D
Tokyo & NY
-
9
Personalize Platform Recommender Engine
(working on) Data Mining, NLP, Semantic Web
Recommender Platform
SPDB: item DB, user DB, purchase history DB, page-view history DB
[recommender logic] Collaborative filtering, retargeting, basket
Search Tech
Global Catalogue Creation Noise Detection
Next E-Commerce Platform
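The slide above names collaborative filtering over a purchase history DB as part of the recommender logic. The slides do not show how it is implemented, so the following is only a minimal sketch of one common variant (item-based collaborative filtering with cosine similarity). The `purchases` data, the function names, and the scoring rule are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase history: user id -> set of purchased item ids.
purchases = {
    "u1": {"camera", "sd_card", "tripod"},
    "u2": {"camera", "sd_card"},
    "u3": {"camera", "tripod"},
    "u4": {"novel"},
}

def item_similarities(purchases):
    """Cosine similarity between items, based on which users bought them."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a != b:
                common = len(buyers[a] & buyers[b])
                if common:
                    sims[a][b] = common / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims

def recommend(user, purchases, sims, top_n=3):
    """Score items the user has not bought by summed similarity to owned items."""
    owned = purchases[user]
    scores = defaultdict(float)
    for item in owned:
        for other, s in sims.get(item, {}).items():
            if other not in owned:
                scores[other] += s
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

sims = item_similarities(purchases)
print(recommend("u2", purchases, sims))
```

In this toy data, `u2` has bought a camera and an SD card, and the only co-purchased item left to suggest is the tripod. A user with no overlapping purchases, such as `u4`, gets an empty list.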
-
10
SuperDB
BigData
Introduction
-
11
Amazon,
Pandora Radio
PDCA
-
12
SuperDB
SuperDB
BigData
Introduction
-
13
E-Commerce Portal and Media
Telecommunications
Securities
Credit Card
Professional Sports
Banking
E-money
-
14
78,000,000+ 800,000,000+ 68,000,000+ 3,000,000+ 1 37,000+ 60,000+ . 1Access Log etc
-
15
-
16
DB
Rakuten runs a large number of businesses, and so holds many kinds of business data. The data is diversified.
We aggregate this data into one big data warehouse (DWH).
Rakuten Super DB
It is an important core asset that generates revenue.
-
17
DB
Mosaic
-
18
-
19
CVR
-
20
TOHO
Recommender Platform
DB
DB for service
-
21
DVD
-
22
SuperDB
BigData
Introduction
-
23
DB DB
-
24
NLP
Global Catalogue Creation Noise Detection
DB
-
25
PDCA
Plan (Hypothesis)
Do (Learning)
Check (Understanding)
Action (Prediction)
-
26
SuperDB
BigData
Introduction
-
27
K-Means, pLSI, LSH (Locality-Sensitive Hashing)
Collaborative Filtering, Basket Analysis
Text Matching, Clustering
Cluster Coefficient
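LSH appears in the list above without further detail. As a rough illustration of the idea, here is a minimal sketch of MinHash, one LSH family for Jaccard similarity between sets. The slides do not say which LSH variant is used, so the choice of MinHash, the sample product titles, and all function names are assumptions.

```python
import hashlib

def _hash(seed, token):
    """Deterministic 256-bit hash of a token under a given seed."""
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).hexdigest()
    return int(digest, 16)

def minhash_signature(tokens, num_hashes=64):
    """MinHash signature: the minimum hash value under each seeded hash function."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]

def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)

def band_keys(signature, bands=16):
    """Split a signature into bands; sets sharing any band key are candidate pairs."""
    rows = len(signature) // bands
    return {(i, tuple(signature[i * rows:(i + 1) * rows])) for i in range(bands)}

# Hypothetical product titles, tokenized into word sets.
title_a = set("rakuten ichiba red wine 750ml".split())
title_b = set("rakuten ichiba red wine 750ml gift".split())
title_c = set("digital camera 16gb sd card".split())

sig_a, sig_b, sig_c = (minhash_signature(t) for t in (title_a, title_b, title_c))
print(estimated_jaccard(sig_a, sig_b))  # near-duplicate titles: high estimate
print(estimated_jaccard(sig_a, sig_c))  # unrelated titles: low estimate
print(bool(band_keys(sig_a) & band_keys(sig_b)))  # True if the pair is a candidate
```

Banding is what makes this a hashing scheme rather than a pairwise comparison: only sets that collide in at least one band need to be compared in full, which matters when the catalogue is large.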
-
28
-
29
CRF
-
30
-
31
CRF
6040
CatID: 2034500167
-
32
IP, SVM / Passive-Aggressive
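The slide names the Passive-Aggressive algorithm alongside SVM but its context is lost in the transcript. For readers unfamiliar with it, the following is a minimal sketch of the basic Passive-Aggressive update for binary classification with labels in {-1, +1}. The toy data stream and the function names are assumptions, and this is not a description of how it is applied at Rakuten.

```python
def pa_update(weights, features, label):
    """One Passive-Aggressive step for a binary label in {-1, +1}."""
    margin = label * sum(w * x for w, x in zip(weights, features))
    loss = max(0.0, 1.0 - margin)  # hinge loss
    norm_sq = sum(x * x for x in features)
    if loss == 0.0 or norm_sq == 0.0:
        return weights  # passive: the example already has margin >= 1
    tau = loss / norm_sq  # aggressive: smallest step that reaches margin 1
    return [w + tau * label * x for w, x in zip(weights, features)]

def predict(weights, features):
    """Sign of the linear score."""
    return 1 if sum(w * x for w, x in zip(weights, features)) >= 0 else -1

# Hypothetical toy stream of (feature vector, label) pairs.
stream = [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([1.0, 1.0], 1), ([0.2, 1.0], -1)]
weights = [0.0, 0.0]
for features, label in stream:
    weights = pa_update(weights, features, label)
print(weights)
```

Because each example is processed once and then discarded, this kind of online learner fits streaming data where retraining a batch SVM on every change would be too slow.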
-
33
SNS SOM EM OK/NG
FFNN
No Image
-
34
RSGP
-
35
-
36
BigData
SuperDB
BigData
Introduction
-
37
Along with this, processing the data is becoming increasingly difficult.
-
38
Big Data
It is getting more and more difficult to handle.
-
39
Hadoop, NoSQL
OSS
-
40
DB
1/1
1/300GB
M/R
1
70
RAN DB
Calculate
Rakuten Product
-
41
Batch
NGS Hive
Shared Hadoop Cluster
dictionary batch server
suggest batch server
NGS common platform for Hive
Dictionary Index
Suggest Index
update search index
sync analyzed data
Hive
300GB
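The diagram labels above describe batch jobs that analyze data and update a suggest index. The slides give no job definitions, so the following is only a sketch of the general pattern in plain Python: a map step and a reduce step that count queries (roughly what a Hive GROUP BY with COUNT would do), followed by building a prefix-to-suggestions index. The log contents, the function names, and the top-k rule are all assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical search-query log; the real log format is not given in the slides.
query_log = ["wine", "wine glass", "wine", "camera", "wine glass", "wine", "camera bag"]

def map_phase(lines):
    """Map: emit (query, 1) for each log line."""
    for line in lines:
        yield line.strip().lower(), 1

def reduce_phase(pairs):
    """Reduce: sum the counts per query."""
    counts = Counter()
    for query, n in pairs:
        counts[query] += n
    return counts

def build_suggest_index(counts, top_k=2):
    """For every prefix, keep the top_k most frequent queries that start with it."""
    by_prefix = defaultdict(list)
    for query, n in counts.items():
        for end in range(1, len(query) + 1):
            by_prefix[query[:end]].append((n, query))
    return {
        prefix: [q for _, q in sorted(cands, key=lambda c: (-c[0], c[1]))[:top_k]]
        for prefix, cands in by_prefix.items()
    }

counts = reduce_phase(map_phase(query_log))
suggest_index = build_suggest_index(counts)
print(suggest_index["wi"])
```

Precomputing the index in a batch keeps the online lookup to a single dictionary access per keystroke, at the cost of suggestions being only as fresh as the last batch run.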
-
42
-
43
For closing
SuperDB
BigData
Introduction
-
44
Rakuten Open Data
IT
http://rit.rakuten.co.jp/rdr/index.html
-
Rakuten Inc. RIT. Masaya Mori Nov. 7th, 2012
E-commerce