Scaling up Business Intelligence
from scratch to 15 countries
Sergii Khomenko, Data Scientist, @lc0d3r
Budapest BI Forum - October 15, 2015
Sergii Khomenko
Data scientist at one of the biggest fashion communities, Stylight.
Data analysis and visualisation hobbyist who works on data problems not only at work but also in his free time, for fun and for personal data visualisations.
Speaker at Berlin Buzzwords 2014, ApacheCon Europe 2014, Puppet Camp London, Berlin Buzzwords 2015, and Tableau Conference on Tour.
Upcoming talks: October 28-30, 2015 - Crunch Practical Big Data Conference, Budapest - Building data pipelines: from simple to more advanced - hands-on experience
Profitable Leads
Stylight provides its partners with high-quality leads, enabling partner shops to leverage Stylight as an ROI-positive traffic channel.
Inspiration
Stylight offers shoppable inspiration that makes it easy to know what to buy and how to style it.
Branding & Reach
Stylight offers a unique opportunity for brands to reach an audience that is actively looking for style online.
Shopping
Stylight helps users search and shop fashion and lifestyle products smarter across hundreds of shops.
Stylight – Make Style Happen
Core Target Group
Stylight helps aspiring women between 18 and 35 to evolve their style through shoppable inspiration.
Stylight – acting on a global scale
Experienced & Ambitious Team
Innovative cross-functional organisation with flat hierarchy builds a unique team spirit.
• 200+ employees
• 40 PhDs/Engineers
• 28 years average age
• 63% female
• 23 nationalities
• 0 suits
Early days
Different tools, different approaches
Pros and Cons
• Data consistency
• Inflexible structure – report changes
• Difficult to scale
• Time-consuming – for new ad-hoc requests
• Maintenance and support – in-house development
Pros and Cons
In the short term
• More flexible - add that advanced feature
• Easy to add alerting
Tableau days
Simple, Cheap, Just works
Pros and Cons
• Easy to start using
• Works for free
• All datasources in one place
• Unified routine
Pros and Cons
• Hard to scale
• Not production ready
• No backups
• No control over data
• No control over failures
Migrating data from Tableau Online
Be the owner of your data
• Tableau is a good data source (DS) editor
• We already have so many data sources
• Current tasks and sprints
Picture of the old setup
• We have all DS accessible
• We know where data comes from
• Structure re-creation
• Migration without any manual input
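The "structure re-creation" step can be pictured as generating table definitions from data source metadata. The following is only a minimal sketch, assuming each data source can be described as a list of (column name, type) pairs. The `TYPE_MAP` table, the helper names, and the sample data source are hypothetical and not taken from the talk; the target types are ones that both PostgreSQL and Amazon Redshift accept.

```python
# Hypothetical mapping from generic data source types to SQL column types.
TYPE_MAP = {
    "string": "VARCHAR(256)",
    "integer": "BIGINT",
    "real": "DOUBLE PRECISION",
    "date": "DATE",
    "datetime": "TIMESTAMP",
    "boolean": "BOOLEAN",
}


def quote_ident(name):
    """Double-quote an identifier so names containing spaces remain valid."""
    return '"' + name.replace('"', '""') + '"'


def build_create_table(table, columns):
    """Return a CREATE TABLE statement for a list of (name, type) pairs."""
    if not columns:
        raise ValueError("a table needs at least one column")
    lines = []
    for name, source_type in columns:
        if source_type not in TYPE_MAP:
            raise ValueError("unknown type %r for column %r" % (source_type, name))
        lines.append("    %s %s" % (quote_ident(name), TYPE_MAP[source_type]))
    return "CREATE TABLE %s (\n%s\n);" % (quote_ident(table), ",\n".join(lines))


# Hypothetical data source description, used for illustration only.
ddl = build_create_table(
    "orders_by_country",
    [("country", "string"), ("order_date", "date"), ("revenue", "real")],
)
print(ddl)
```

Generating the statements from metadata, rather than typing them by hand, is what would let such a migration run without manual input.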
DWH with AWS Redshift
Be the owner of your data
Picture of the old setup
Benefits
• Control over backups
• Control over refreshes
• Scale DWH up to petabyte scale
• Easy to add new ETL stages (EMR)
• More open for new challenges
https://github.com/stylight/postgresql-rest-api
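"Control over refreshes" could, for example, mean that a table is reloaded on your own schedule from files staged in S3. The sketch below only builds the SQL statements for such a full refresh with Redshift's `COPY` command; it does not connect to a database. The table name, bucket, and IAM role are placeholders, and the assumption that data is staged as CSV files in S3 is not from the talk.

```python
import re


def build_refresh_statements(table, s3_path, iam_role):
    """Return SQL statements that replace a table's rows with a fresh S3 load."""
    # Accept only plain (optionally schema-qualified) table names.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_.]*", table):
        raise ValueError("unexpected table name: %r" % table)
    # Reject quotes so the values cannot break out of the SQL string literals.
    for value in (s3_path, iam_role):
        if "'" in value:
            raise ValueError("single quotes are not allowed in %r" % value)
    copy = (
        "COPY {table} FROM '{path}' IAM_ROLE '{role}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    ).format(table=table, path=s3_path, role=iam_role)
    # DELETE and COPY run in one transaction, so readers never see an empty table.
    return ["BEGIN;", "DELETE FROM {0};".format(table), copy, "COMMIT;"]


# Placeholder values, for illustration only.
statements = build_refresh_statements(
    "analytics.clicks",
    "s3://example-bucket/clicks/2015-10-15/",
    "arn:aws:iam::123456789012:role/example-redshift-load",
)
for statement in statements:
    print(statement)
```

The statements would be executed with whatever PostgreSQL-compatible client the team already uses.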
Scale your organisation
Cross-Functional Team
Department: mission oriented team with all resources and the least dependencies
Product Team: builds the software the department or its customers use
Squad: team that executes the product development
Diagram: Department, Product Team and Squad, with the roles PO, Engineer, Engineer, Designer, Data Scientist, Head of, Business Role, Business Role
Cross-Functional Team
• You build it - you run it
• You check your numbers (domain knowledge)
• You provide your data as an interface layer
• Data reports come after data tracking
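"You check your numbers" can be made routine with a small automated comparison between tracked data and what a report shows. This is a minimal sketch under stated assumptions: the `check_numbers` helper, the 1% tolerance, and the per-country click counts are all hypothetical.

```python
def check_numbers(source_total, report_total, tolerance=0.01):
    """Return True when the report is within a relative tolerance of the source."""
    if source_total == 0:
        return report_total == 0
    return abs(report_total - source_total) / abs(source_total) <= tolerance


# Hypothetical daily click counts: raw tracking data vs. the published report.
tracked_clicks = {"DE": 1200, "FR": 800, "UK": 950}
reported_clicks = {"DE": 1195, "FR": 800, "UK": 700}

# Countries whose reported number deviates from tracking by more than 1%.
mismatches = sorted(
    country
    for country in tracked_clicks
    if not check_numbers(tracked_clicks[country], reported_clicks.get(country, 0))
)
print(mismatches)
```

The team that owns the tracking has the domain knowledge to pick a sensible tolerance for each metric.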
Future plans
Make it even more awesome
• Data definition unification - ibis?
• Pipeline unification - Luigi?
• Flexible to integrate new things
• Open-source our Python toolchain
• Tableau replacement - re:dash, AWS QuickSight
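The idea behind "pipeline unification" is that every job declares what it depends on and one runner works out the order. The sketch below illustrates this with Python's standard `graphlib` module (Python 3.9+). It is not Luigi's API, and the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run each task after the tasks it depends on; return the execution order."""
    # `dependencies` maps a task name to the set of tasks that must run first.
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order


log = []

# Hypothetical three-stage pipeline: each task only records that it ran.
tasks = {
    "extract": lambda: log.append("extracted"),
    "load": lambda: log.append("loaded"),
    "report": lambda: log.append("reported"),
}
dependencies = {"extract": set(), "load": {"extract"}, "report": {"load"}}

order = run_pipeline(tasks, dependencies)
print(order)
```

A dedicated workflow tool would add scheduling, retries, and output tracking on top of this ordering step.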
Related talks
• Helping Data Teams with Puppet / Puppet Camp London
• Secure Data Scalability at Stylight with Tableau Online and Amazon Redshift / Tableau Conference on Tour - Berlin