Web-Scale Recommendation Engines
CS130 Fall 2012
Andrew Look, Senior Software Engineer
Shopzilla
About Shopzilla
● Comparison Shopping Engine
● Lots of traffic
  ○ > 200 million products
  ○ > 100 million impressions / day
  ○ > 8000 searches / second
  ○ > 30M unique visitors / month
● Operate web properties in US, Europe
  ○ Shopzilla.com
  ○ Bizrate.com
  ○ Beso.com
Beso.com
Beso.com: Personalization
● Focused on personalization for users
● Users can pick favorite...
  ○ Brands
  ○ Stores
  ○ Products
What's missing?
● Recommendations
● Relevancy
● ...but we have data!
Recommender: System Overview
Extracting User Preferences
● Shopzilla Engineers built this component
● Important in tuning recommender algorithm
Extracting User Preferences
-- which brands did each user favorite?
SELECT user_id, brand_id
FROM favorites;
-- which brands did they click on?
SELECT C.user_id, P.brand_id
FROM clicks C
JOIN products P ON C.product_id = P.id;
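One way the two signals above could be folded into a single preference score per user and brand is sketched below. It runs the slide's queries against an in-memory SQLite database. The toy rows, the column types, and the weights (a favorite counting three times as much as a click) are assumptions made for illustration and do not come from the talk.

```python
import sqlite3

# Hypothetical toy schema and rows; only the table and column names used in
# the slide's queries (favorites, clicks, products) come from the source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products  (id INTEGER PRIMARY KEY, brand_id INTEGER);
    CREATE TABLE favorites (user_id INTEGER, brand_id INTEGER);
    CREATE TABLE clicks    (user_id INTEGER, product_id INTEGER);
    INSERT INTO products  VALUES (10, 1), (11, 2), (12, 2);
    INSERT INTO favorites VALUES (100, 1), (101, 2);
    INSERT INTO clicks    VALUES (100, 11), (100, 12), (101, 10);
""")

# Assumed weights: an explicit favorite counts more than a single click.
FAVORITE_WEIGHT = 3.0
CLICK_WEIGHT = 1.0

def user_brand_scores(conn):
    """Return {(user_id, brand_id): score} combining favorites and clicks."""
    scores = {}
    for user_id, brand_id in conn.execute(
            "SELECT user_id, brand_id FROM favorites"):
        key = (user_id, brand_id)
        scores[key] = scores.get(key, 0.0) + FAVORITE_WEIGHT
    for user_id, brand_id in conn.execute(
            "SELECT C.user_id, P.brand_id "
            "FROM clicks C JOIN products P ON C.product_id = P.id"):
        key = (user_id, brand_id)
        scores[key] = scores.get(key, 0.0) + CLICK_WEIGHT
    return scores

scores = user_brand_scores(conn)
```

Choosing the weights is the kind of tuning decision the slide refers to; the values here are placeholders.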
Recommendations: Overview
● I like a brand. Which other brands do I like?
● Use latest available user data to answer this
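A minimal sketch of one way to answer that question: treat two brands as similar when many of the same users like both. The slides do not say which algorithm the project uses, so the Jaccard similarity below is an assumed stand-in, and the function name and toy users are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def similar_brands(user_brands, top_n=2):
    """user_brands: {user_id: set of brands}. Returns {brand: [(other, score), ...]}."""
    fans = defaultdict(set)  # brand -> set of users who like it
    for user, brands in user_brands.items():
        for brand in brands:
            fans[brand].add(user)

    result = defaultdict(list)
    for a, b in combinations(sorted(fans), 2):
        shared = len(fans[a] & fans[b])
        if shared == 0:
            continue  # no common users, so no evidence of similarity
        jaccard = shared / len(fans[a] | fans[b])
        result[a].append((b, jaccard))
        result[b].append((a, jaccard))

    # Highest score first; ties broken alphabetically for stable output.
    return {brand: sorted(pairs, key=lambda p: (-p[1], p[0]))[:top_n]
            for brand, pairs in result.items()}

# Made-up preference data for illustration only.
toy = {
    "u1": {"Nike", "Adidas"},
    "u2": {"Nike", "Adidas", "Reebok"},
    "u3": {"Prada", "Gucci"},
    "u4": {"Nike", "Reebok"},
}
recs = similar_brands(toy)
```

This version holds everything in memory, which only works for small data; at the scale described in the talk the same counting would need to be distributed.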
Recommender: What will you learn?
● Leverage Machine Learning at scale
● Use Hadoop on BigData
● How to evaluate solutions to data problems
Ex. Recommendations With Hadoop Streaming
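As a rough illustration of the Hadoop Streaming style, the sketch below counts how often two brands are liked by the same user, written as a mapper and a reducer over tab-separated text lines. The input layout (`user_id<TAB>brand1,brand2,...`), the `|` pair separator, and the function names are assumptions. In a real streaming job the mapper and reducer would be separate scripts reading standard input, and Hadoop would do the sort between them; `run_local` only imitates that step.

```python
from itertools import combinations, groupby

def mapper(lines):
    """Emit 'brandA|brandB<TAB>1' for every pair of brands one user likes."""
    for line in lines:
        user_id, brands = line.rstrip("\n").split("\t")
        for a, b in combinations(sorted(set(brands.split(","))), 2):
            yield f"{a}|{b}\t1"

def reducer(sorted_lines):
    """Sum the counts for each pair; input must already be sorted by key."""
    parsed = (line.split("\t") for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(count) for _, count in group)}"

def run_local(lines):
    """Local stand-in for the shuffle phase: sort mapper output, then reduce."""
    return list(reducer(sorted(mapper(lines))))

# Made-up input records for illustration only.
sample = ["u1\tNike,Adidas", "u2\tAdidas,Nike,Reebok", "u3\tPrada,Gucci"]
pair_counts = run_local(sample)
```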
Recommender: Why is it cool?
● Run recommender on cluster of 332 CPUs
● Patterns emerge from behavioral data
  ○ Nike <---> Adidas, Reebok
  ○ Prada <---> Gucci, Louis Vuitton
  ○ Levi <---> Wrangler, Lee
● Data from 100s of millions of users
● Creative ways to represent user preferences
Web Service: Overview
● Load recs into NoSQL cache
● Build web service for recs
● Build client for Beso.com
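The shape of such a service can be sketched in a few lines. The course project targets Java, so the Python below is only an illustration kept in the same language as the other sketches here. A plain dict stands in for the NoSQL cache, and the `/recs?brand=NAME` URL, the JSON layout, and all names are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlparse, parse_qs

# Stand-in for the NoSQL cache: a dict keyed by brand. Contents are made up.
RECS_CACHE = {"Nike": ["Adidas", "Reebok"]}

def lookup(path, cache=RECS_CACHE):
    """Map a request path such as /recs?brand=Nike to (status, JSON body)."""
    url = urlparse(path)
    brand = parse_qs(url.query).get("brand", [None])[0]
    if url.path != "/recs" or brand is None:
        return 400, json.dumps({"error": "expected /recs?brand=NAME"})
    if brand not in cache:
        return 404, json.dumps({"brand": brand, "recs": []})
    return 200, json.dumps({"brand": brand, "recs": cache[brand]})

class RecsHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around lookup()."""
    def do_GET(self):
        status, body = lookup(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def make_server(port=8080):
    """Create (but do not start) the server; call .serve_forever() to run it."""
    return ThreadingHTTPServer(("127.0.0.1", port), RecsHandler)
```

Keeping the lookup logic in a plain function, separate from the HTTP handler, makes it easy to test without opening a socket and to swap the dict for a real cache client later.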
Web Service: What will you learn?
● Build a high-performance web service
  ○ 100s of requests per second
● Learn enterprise Java best practices
● Understand Service-Oriented Architecture
Web Service: Why is it cool?
● Scaling
  ○ how many servers do we need?
  ○ how fast does cache need to be?
  ○ how much storage do we need?
● NoSQL Technology selection
  ○ read/write performance
  ○ reliability
  ○ availability
  ○ consistency
● See evolving data affect web layer
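The scaling questions above usually start as back-of-envelope arithmetic. The sketch below shows one possible form of that estimate. Every input (peak request rate, per-server throughput, headroom, record size, replication factor) is an illustrative assumption, not a Shopzilla figure, and the formulas are deliberately simple.

```python
import math

def servers_needed(peak_rps, per_server_rps, headroom=0.5, spares=1):
    """Servers required if each runs at (1 - headroom) of capacity, plus spares."""
    usable = per_server_rps * (1.0 - headroom)
    return math.ceil(peak_rps / usable) + spares

def cache_bytes(num_keys, recs_per_key, bytes_per_rec, replication=3):
    """Rough storage footprint of the recommendations cache, in bytes."""
    return num_keys * recs_per_key * bytes_per_rec * replication

# All inputs below are illustrative assumptions.
n_servers = servers_needed(peak_rps=500, per_server_rps=200)
storage = cache_bytes(num_keys=1_000_000, recs_per_key=20, bytes_per_rec=16)
```

Measured numbers from a load test would replace these guesses, but the structure of the calculation stays the same.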
Evaluation Tool: Overview
● Build UI to compare algorithms side-by-side
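Behind such a UI there is typically a small comparison step. A minimal sketch, assuming each algorithm returns a ranked list of brand names: pair the lists row by row for display and compute how much their top results overlap. The function names, the overlap metric, and the sample lists are hypothetical.

```python
from itertools import zip_longest

def side_by_side(recs_a, recs_b):
    """Pair two ranked lists row by row; the shorter one is padded with None."""
    return list(zip_longest(recs_a, recs_b))

def overlap_at_k(recs_a, recs_b, k):
    """Fraction of the top-k positions filled by items both lists share."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(recs_a[:k]) & set(recs_b[:k])) / k

# Made-up outputs of two algorithms for the same input brand.
algo_a = ["Adidas", "Reebok", "Puma"]
algo_b = ["Reebok", "Adidas"]
rows = side_by_side(algo_a, algo_b)
shared = overlap_at_k(algo_a, algo_b, k=2)
```

Each tuple in `rows` maps to one table row in the UI, and a low overlap score flags inputs where the two algorithms disagree and are worth inspecting by hand.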
Evaluation Tool: What will you learn?
● Build enterprise Java webapp from scratch
● Use frontend technologies (CSS/JS/AJAX)
● Consume web services using REST client
Evaluation Tool: Why is it cool?
● Like CS144, with modern Java webapp tools
● Visualizing data problems
● Inform recommendation tuning process
Logistics
● Pre-configured VM to get you started
● Knowledgeable project mentors
  ○ available for questions / help daily
● Agile Tracking system
● GitHub / Continuous Integration build
Schedule
● Week 1: bootcamp
● Week 2: technology selection
● Week 3: interface definition
● Week 4: mock implementation
● Week 5: component buildout
● Week 6: component integration
● Week 7: hardening, performance
● Week 8: deployment
Big Picture
● Technology selection & platform rollout
● Mirrors process used by Shopzilla's Senior Architects and Engineers
● Program designed to teach enterprise engineering techniques
● Equips you with skills for your first job
Questions?
alook [at] shopzilla [dot] com
@andrewlook