WMFRA # 46: Case Study - In-Store Analysis
DESCRIPTION
This is the case study presentation of an in-store analysis in retail, presented at WebMonday Frankfurt #46.
TRANSCRIPT
![Page 1: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/1.jpg)
CC 2.0 by Per Olesen | http://flic.kr/p/7pVCgZ
![Page 2: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/2.jpg)
CC 2.0 by Franck BLAIS | http://flic.kr/p/cwVnSy
![Page 3: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/3.jpg)
CC 2.0 by John Steven Fernandez | http://flic.kr/p/a8uTzz
![Page 4: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/4.jpg)
CC 2.0 by Ian Carroll | http://flic.kr/p/6NWoGm
![Page 5: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/5.jpg)
CC 2.0 by Perry French | http://flic.kr/p/8wDMJS
![Page 6: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/6.jpg)
CC 2.0 by John Mitchell | http://flic.kr/p/5UaPg8
![Page 7: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/7.jpg)
How do we answer these questions?
Before we started designing a blueprint solution, we first asked ourselves:
1 Who would be asked to answer questions like this?
2 Who is this person?
3 What tools does this person expect to use?
4 And what is a typical skill set of this person?
5 How do they work?
Preparation
March 8, 2013
![Page 8: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/8.jpg)
So, how do we answer these questions as a Data Scientist?
From a high level of abstraction the answer is simple. We need a data management system with three pieces: ingest, store and process.
Traditional Data Management System Approach
[Diagram components: Data Source, Data Ingestion, Data Processing, Data Storage]
![Page 9: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/9.jpg)
So, how do we answer these questions as a Data Scientist?
We take this basic architecture and replace the generic terms by mapping it onto the Hadoop ecosystem. With this Hadoop architecture a Data Scientist should be able to answer the questions without any programming environment. He/she can also use familiar BI, analysis and reporting tools.
Blueprint for a Data Management System with Hadoop
[Diagram components: Data Source, Flume, HDFS, Hive and Impala, BI/Analysis/Reporting]
![Page 10: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/10.jpg)
Ingredients
1 Two WiFi access points running OpenWRT, a Linux-based firmware for routers, to simulate two different stores
2 Flume to move all log messages to HDFS without any manual intervention (no transformation, no filtering)
3 A 4-node CDH4 cluster
4 Pentaho Data Integration's graphical designer for data transformation, parsing, filtering and loading to the warehouse
5 Hive as data warehouse system on top of Hadoop to project structure onto data
6 Impala for querying data from Hive in real time
7 A tool to visualize results
Setup
![Page 11: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/11.jpg)
CC 2.0 by Qi Wei Fong | http://flic.kr/p/7w8vfq
![Page 12: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/12.jpg)
Visits for stores number one & two
The plot indicates that about 85% of the visits were detected in store number one and about 15% in store number two. One might draw the conclusion that store number one is in a much better location with more occasional customers.
But let’s gain more insights by analysing the number of unique visitors.
Analysis Result
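The share of visits per store described on this slide can be sketched as a simple count over visit records. The record layout `(store, device_id)` and the sample rows are assumptions made for illustration; they are not the data or the schema from the talk, where this kind of query would more likely run in Hive or Impala.

```python
from collections import Counter

# Hypothetical visit records: (store, device_id).
# Illustrative rows only, not the measurements from the talk.
visits = [
    ("store_1", "aa"),
    ("store_1", "aa"),
    ("store_1", "bb"),
    ("store_2", "cc"),
]


def visit_share(records):
    """Return {store: fraction of all visits} for the given records."""
    counts = Counter(store for store, _ in records)
    total = sum(counts.values())
    return {store: n / total for store, n in counts.items()}


shares = visit_share(visits)
```

Each row counts as one visit, so a device that comes back several times raises its store's share every time. That is why the visit share alone says little about how many people actually came in.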
![Page 13: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/13.jpg)
Unique visitors
This plot gives us more details about the customers. It turns out that the 135 visits in store number one came from just 9 unique visitors, while store number two saw 5 unique visitors.
Analysis Result
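Counting unique visitors amounts to counting distinct device identifiers per store. The sketch below assumes a hypothetical `(store, device_id)` record layout with made-up sample rows; only the figures 135 and 9 are taken from the slide.

```python
from collections import defaultdict


def unique_visitors(records):
    """Return {store: number of distinct device ids seen in that store}."""
    devices = defaultdict(set)
    for store, device_id in records:
        devices[store].add(device_id)
    return {store: len(ids) for store, ids in devices.items()}


# Hypothetical sample rows, not the data from the talk.
sample = [
    ("store_1", "aa"),
    ("store_1", "aa"),
    ("store_1", "bb"),
    ("store_2", "cc"),
]
uniques = unique_visitors(sample)

# Figures quoted on the slide for store number one.
visits_store_one = 135
unique_store_one = 9
visits_per_visitor = visits_store_one / unique_store_one
```

With the slide's numbers, store number one averages 15 visits per unique visitor, which suggests a small group of frequently returning devices.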
![Page 14: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/14.jpg)
New vs. returning users
This plot indicates that we have more returning than new users in both stores. In store number two we didn’t see a new user over the past 4 days at all. It’s probably a good idea to start a marketing campaign which aims at new customers, e.g. to give out vouchers for the first purchase.
Analysis Result
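One possible way to separate new from returning users is to compare each device's first recorded visit with the start of the reporting window. The talk does not state which definition was used, so the rule below, the function name and the sample dates are all assumptions.

```python
from datetime import date


def split_new_returning(first_seen, active, window_start):
    """Split active device ids into (new, returning) sets.

    first_seen:   {device_id: date of the first visit ever recorded}
    active:       device ids seen inside the reporting window
    window_start: first day of the reporting window
    """
    new = {d for d in active if first_seen[d] >= window_start}
    returning = set(active) - new
    return new, returning


# Hypothetical sample data, not the measurements from the talk.
first_seen = {
    "aa": date(2013, 2, 20),
    "bb": date(2013, 3, 6),
    "cc": date(2013, 3, 1),
}
new, returning = split_new_returning(
    first_seen, active={"aa", "bb", "cc"}, window_start=date(2013, 3, 4)
)
```

Here only `bb` counts as new, because its first visit falls inside the window; the other two devices were already known before it started.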
![Page 15: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/15.jpg)
Visit duration over the past 4 days
The plot for the last 4 days shows that the visit duration in store number one was evenly distributed, while the distribution in store number two shows some peaks. We can also see that visitors tend to stay in store number one much longer.
Analysis Result
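Visit duration can be derived from the start and end timestamp of each visit. The following sketch averages the duration per store; the `(store, start, end)` layout and the sample timestamps are hypothetical and only illustrate the calculation.

```python
from datetime import datetime
from statistics import mean


def mean_duration_seconds(visits):
    """Return {store: mean visit duration in seconds}."""
    by_store = {}
    for store, start, end in visits:
        by_store.setdefault(store, []).append((end - start).total_seconds())
    return {store: mean(durations) for store, durations in by_store.items()}


# Hypothetical sample visits, not the data from the talk.
sample = [
    ("store_1", datetime(2013, 3, 5, 10, 0), datetime(2013, 3, 5, 10, 30)),
    ("store_1", datetime(2013, 3, 5, 11, 0), datetime(2013, 3, 5, 11, 10)),
    ("store_2", datetime(2013, 3, 5, 12, 0), datetime(2013, 3, 5, 12, 5)),
]
durations = mean_duration_seconds(sample)
```

A mean hides the shape of the distribution, so peaks like the ones mentioned for store number two would only become visible when the individual durations are plotted over time.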
![Page 16: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/16.jpg)
Avg. Duration Between Visits of one particular user
There is a lot of useful information that can be derived from this plot.
1. There is a repeating pattern of step-ins and step-outs within a short period of time.
2. There was a step-out of store number one and a step-in into store number two within just 28 seconds.
Analysis Result
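The gap between consecutive visits of a single device can be computed by sorting its visits by start time and subtracting each visit's end from the next visit's start. The record layout and the timestamps below are assumptions; the sample is constructed to mirror the 28-second store switch mentioned on the slide and is not the original data.

```python
from datetime import datetime


def gaps_between_visits(visits):
    """Return (from_store, to_store, gap_seconds) for consecutive visits.

    visits: list of (store, start, end) tuples for one device.
    """
    ordered = sorted(visits, key=lambda v: v[1])
    return [
        (prev[0], nxt[0], (nxt[1] - prev[2]).total_seconds())
        for prev, nxt in zip(ordered, ordered[1:])
    ]


# Hypothetical visits of one device, listed out of order on purpose.
one_user = [
    ("store_2", datetime(2013, 3, 5, 10, 5, 28), datetime(2013, 3, 5, 10, 10, 0)),
    ("store_1", datetime(2013, 3, 5, 10, 0, 0), datetime(2013, 3, 5, 10, 5, 0)),
]
gaps = gaps_between_visits(one_user)
```

Keeping the source and target store next to each gap makes quick switches between stores easy to spot, alongside repeated step-ins and step-outs at the same store.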
![Page 17: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/17.jpg)
CC 2.0 by Aurelien Guichard | http://flic.kr/p/cjg9yw
![Page 18: WMFRA # 46: Case Study - In-Store Analysis](https://reader034.vdocuments.net/reader034/viewer/2022052601/5594760d1a28ab476e8b470c/html5/thumbnails/18.jpg)
Links
1 Presentation, Video and Post Series • http://bit.ly/YgtIMK
2 http://sentric.ch
3 http://www.bigdata-usergroup.ch
4 http://about.me/jpkoenig