PROJECT AURA
RECOMMENDATION FOR THE REST OF US
Paul Lamere, Senior Staff Engineer, Sun Labs
Stephen Green, Senior Staff Engineer, Sun Labs
TS-5841
2008 JavaOneSM Conference | java.sun.com/javaone | 2
In this talk you will learn how you can use Project Aura to deliver better recommendations for your content
Agenda
• What is recommendation and why do we care
• What is wrong with existing recommendations
• How Project Aura fixes these problems
• How to use Project Aura
• Demonstration
The Problem: Long Tail Content
Huge amount of content on the web:
• Videos, Music, Podcasts, Images
• News, Blogs, Wikis, Books
Much of this content is user-generated:
• YouTube, Flickr, Digg, last.fm
• Technorati, Delicious, eBay
Even more content is coming: IPTV
Key to enabling Long Tail
• Make everything available
• Help me find it
How Important is Recommendation?
Netflix
• 2/3 of movies rented were recommended
• Offering $1 million prize for the first group to improve their recommendations by 10%
Amazon
• 35% of product sales result from recommendations
• Recommendations generated a couple of orders of magnitude more sales than just showing top sellers
Google News
• Recommendations generate 38% greater click through
Extreme Commercial Interest
Image courtesy of Jadam Kahn
Using Aura – Quick Start – Code
// Look up the user and the items
User fan = aura.getUser("http://openid.sun.com/plamere");
Item weezer = aura.findItemByName("Weezer");
Item greenday = aura.findItemByName("Green Day");
Item paris = aura.findItemByName("Paris Hilton");

// Pay attention to the items
aura.attend(fan, weezer, Attention.RATE, Rating.FIVE_STARS);
aura.attend(fan, greenday, Attention.PLAY);
aura.attend(fan, paris, Attention.SKIP);

// Get recommendations
List<Recommendation> recs = aura.recommend(fan, ARTISTS, 3);
for (Recommendation r : recs) {
    log("Recommended: " + r.getItem() + "\n reason: " + r.getExplanation());
}
Using Aura – Quick Start – Output
Recommended: CAKE
 reason: geek rock, nerd rock, emo, indie pop, fun, 90s
Recommended: Hot Hot Heat
 reason: power pop, emo, fun, indie pop, pop punk
Recommended: Ben Folds Five
 reason: geek rock, Nerd Rock, power pop, emo, 90s
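The reasons in this output are lists of tags. One plausible way to produce such a reason is to report the tags that a liked item and the recommended item have in common, strongest first. The sketch below assumes exactly that; the slides do not say how Aura builds its explanations, and the class name, method name, and tag weights are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: explain a recommendation by the tags two items share.
// This is an assumption about how a tag-based reason could be built,
// not Project Aura's actual implementation.
final class TagExplanation {

    // Returns up to maxTags shared tags, strongest combined weight first.
    static List<String> explain(Map<String, Double> likedItemTags,
                                Map<String, Double> recommendedItemTags,
                                int maxTags) {
        List<String> shared = new ArrayList<>();
        for (String tag : likedItemTags.keySet()) {
            if (recommendedItemTags.containsKey(tag)) {
                shared.add(tag);
            }
        }
        // Sort by the product of the two weights, descending; break ties by name.
        shared.sort((a, b) -> {
            double weightA = likedItemTags.get(a) * recommendedItemTags.get(a);
            double weightB = likedItemTags.get(b) * recommendedItemTags.get(b);
            int byWeight = Double.compare(weightB, weightA);
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });
        return new ArrayList<>(shared.subList(0, Math.min(maxTags, shared.size())));
    }

    public static void main(String[] args) {
        // Illustrative weights only; they are not taken from the talk.
        Map<String, Double> weezer = Map.of("geek rock", 1.0, "emo", 0.5, "90s", 0.3);
        Map<String, Double> cake = Map.of("geek rock", 0.9, "emo", 0.8, "fun", 1.0);
        System.out.println("reason: " + String.join(", ", explain(weezer, cake, 3)));
    }
}
```

Because the explanation is made of the same tags that drive the similarity, a user can see why an item was suggested and could, in principle, steer results by changing which tags matter.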
Project Aura Quick Start
Project Aura API is very simple
• Define Users
• Define Items
• Add attention data
• Get Recommendations
But Aura is not just the API
In Project Aura, we try to solve the problems encountered in traditional recommender systems
Look At All the Things That Can Go Wrong
If you like Britney Spears you might like... Report on Pre-War Intelligence
Let's look at some of the issues...
Collaborative Filtering
Item similarity based upon the halo of users
[Figure: items drawn with halos of user names around them (mary, sue, bill, steve, jim, jack, john, karl, nicole, ...); one item is labeled "thrash".]
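The "halo of users" idea can be sketched in a few lines of Java. This is a minimal illustration rather than Project Aura's implementation: it assumes each item is represented by the set of users who paid attention to it, and it uses Jaccard overlap as the similarity measure, which the slides do not specify. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: item-to-item similarity from overlapping user "halos".
final class HaloSimilarity {

    // Jaccard overlap: |A intersect B| / |A union B|. Returns 0.0 when both halos are empty.
    static double jaccard(Set<String> usersOfA, Set<String> usersOfB) {
        if (usersOfA.isEmpty() && usersOfB.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(usersOfA);
        intersection.retainAll(usersOfB);
        Set<String> union = new HashSet<>(usersOfA);
        union.addAll(usersOfB);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> bandA = Set.of("mary", "sue", "bill", "steve");
        Set<String> bandB = Set.of("sue", "bill", "jack", "john");
        Set<String> newBand = Set.of(); // a brand-new item: nobody has listened yet

        System.out.println(jaccard(bandA, bandB));   // 2 shared users out of 6 distinct
        System.out.println(jaccard(bandA, newBand)); // empty halo, so no similarity signal
    }
}
```

An item with an empty halo scores zero against everything, which is the situation a purely user-based measure faces with any newly added item.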
Collaborative Filtering – Cold Start
A big problem for new items
[Figure: the same halos of user names as on the previous slide, plus new "thrash" items marked "???" that have no users around them.]
Cold Start in the Wild
If you like Gregorian Chants you might like Green Day
Strange Connections
More Strange Recommendations
Collaborative Filtering – Cold Start
A bigger problem for a new recommender
[Figure: "thrash" items shown with no surrounding users.]
Social Recommenders: Hacking
Courtesy of Bamshad Mobasher
Censored
The Scale of the Problem
● Millions of items and users, billions of taste data points
● Recommendation is the secret sauce
● Scaling is hard

● Thousands of items, users, and taste data points
● Data is sparse
● Collaborative Filtering techniques don't work as well
Agenda
• What is recommendation and why do we care
• What is wrong with existing recommendations
• How Project Aura fixes these problems
• Using Project Aura
• Demonstration
Recommendation for the Rest of Us
Good recommendation needs big data
• Millions of users, billions of datapoints
• Fine for Amazon, Netflix, iTunes and Google
But what if you don't have big data?
• Poor recommendations
• No recommendations
We want to give everyone the opportunity to provide good recommendations, regardless of size
Our solution: Augment This...
[Figure: the collaborative filtering halos of user names, with items labeled "thrash" and "loud" and new "thrash" items marked "???".]
With This
the text aura
Similarity based upon the aura surrounding the music
[Figure: items surrounded by clouds of descriptive tags. Some carry tags such as rock, indie, cute, guitar, drums, 90s, fast, weird, twee, quirky, noise pop, playful, sweet, pop, fun; others carry tags such as thrash, metal, rock, edgy, guitar, fierce, loud, concert, death, gothic, heavy metal, frenzied.]
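Similarity "based upon the aura" can be illustrated by comparing two items' tag clouds directly, with no user data involved. The sketch below assumes each item is a map from tag to weight and uses cosine similarity; the slides do not state which measure Aura uses, and the class name, method names, and weights are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: similarity between two items based on their weighted tags.
final class TagAuraSimilarity {

    // Cosine similarity between two sparse tag-weight vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an item with no tags has no aura to compare
        }
        return dot / (normA * normB);
    }

    // Euclidean length of a tag-weight vector.
    private static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Illustrative weights only; they are not taken from the talk.
        Map<String, Double> metalBand = Map.of("thrash", 1.0, "loud", 0.8, "metal", 0.9);
        Map<String, Double> newThrashBand = Map.of("thrash", 1.0, "loud", 1.0);
        Map<String, Double> tweeBand = Map.of("twee", 1.0, "cute", 0.7, "quirky", 0.5);

        System.out.println(cosine(metalBand, newThrashBand)); // high: shared tags
        System.out.println(cosine(metalBand, tweeBand));      // zero: no tags in common
    }
}
```

A new band that has tags but no listeners still gets a meaningful score here, which is why a text aura helps with cold start.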
Where Does the Text Aura Come From?
Crawling the web
Analysis of content
[Figure: an item with its tags (thrash, rock, edgy, gothic, heavy metal, 90s, weird, concert, loud), fed by these sources: Content, Reviews, Lyrics, Blogs, Social Tags, Bios, Users.]
Autotagging Music Directly From Audio
[Figure: autotagging pipeline in three stages. Feature Extraction: Decode, Windowing, FFT, MEL Scale, Log, DCT, MFCC. Training: Labeled Examples, Machine Learning, Model. Tagging: Unknown Examples, Model.]
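The last feature-extraction stages named on this slide (MEL Scale, Log, DCT, MFCC) follow the standard MFCC recipe. The sketch below covers only those stages, assuming the per-band mel energies have already been computed from the FFT. It uses the common 2595 * log10(1 + f/700) mel formula and an unnormalized DCT-II; whether Aura uses these exact variants is not stated in the talk, and the class and method names are hypothetical.

```java
// Hypothetical sketch of the final MFCC stages: mel energies -> log -> DCT.
final class MfccSketch {

    // Common mel-scale formula: mel = 2595 * log10(1 + hz / 700).
    static double hzToMel(double hz) {
        return 2595.0 * Math.log10(1.0 + hz / 700.0);
    }

    // Takes the natural log of each mel-band energy, then applies an unnormalized DCT-II:
    // c[k] = sum over i of log(e[i]) * cos(pi * k * (i + 0.5) / N)
    static double[] mfccFromMelEnergies(double[] melEnergies, int numCoefficients) {
        int n = melEnergies.length;
        double[] logEnergies = new double[n];
        for (int i = 0; i < n; i++) {
            // A small floor avoids log(0) on silent bands.
            logEnergies[i] = Math.log(Math.max(melEnergies[i], 1e-10));
        }
        double[] coefficients = new double[numCoefficients];
        for (int k = 0; k < numCoefficients; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += logEnergies[i] * Math.cos(Math.PI * k * (i + 0.5) / n);
            }
            coefficients[k] = sum;
        }
        return coefficients;
    }

    public static void main(String[] args) {
        // Illustrative band energies only.
        double[] melEnergies = {0.9, 1.4, 2.2, 1.1, 0.4, 0.2};
        double[] mfcc = mfccFromMelEnergies(melEnergies, 4);
        for (double c : mfcc) {
            System.out.println(c);
        }
    }
}
```

The resulting coefficient vectors are the kind of compact audio features that a classifier can be trained on with the labeled examples.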
Autotagging Music
Bohemian Rhapsody
• Classic rock • Favorite artist • Glam Rock • Happy • Psychedelic Rock • 70s • UK • Favourites • England • Prog
Stairway to Heaven
• Awesome • Proto-punk • Rock and Roll • Folk Rock • 70s • Genius • Blues Rock • Classic Rock • Great Lyricists • Art Rock
Take Five (Brubeck)
• Fusion • Saxophone • Avant-Garde • Trumpet • Jazz Fusion • Instrumental • Jazz • Bossa Nova • Cool • Gentle
Using the Aura to Determine Similarity
Artist similarity based on the tag aura for The Beatles
Top Tags
• classic rock • rock • pop • british • 60s • oldies • psychedelic • alternative • indie • britpop
Distinctive Tags
• The Beatles • 60s • liverpool • british • british psychedelia • oldies • britrock • psychedelic • classic rock • Rock and Roll
Similar Artists via Tags
• John Lennon • Rolling Stones • Paul McCartney • The Kinks • The Who • Pink Floyd • Queen • The Police • Led Zeppelin • David Bowie
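The difference between "top" and "distinctive" tags can be illustrated with TF-IDF weighting: a tag applied often to one artist but rarely across the whole collection scores high, while a tag applied to nearly everyone scores low. The talk does not say which weighting Aura uses, so TF-IDF is an assumption here; the class name, method names, and counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rank an artist's tags by raw count vs. by TF-IDF.
final class DistinctiveTags {

    // "Top tags": most frequently applied to this artist first.
    static List<String> rankByCount(Map<String, Integer> tagCounts) {
        List<String> tags = new ArrayList<>(tagCounts.keySet());
        tags.sort((a, b) -> {
            int byCount = Integer.compare(tagCounts.get(b), tagCounts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return tags;
    }

    // TF-IDF: count on this artist times log(total artists / artists carrying the tag).
    static double tfIdf(int count, int artistsWithTag, int totalArtists) {
        return count * Math.log((double) totalArtists / artistsWithTag);
    }

    // "Distinctive tags": highest TF-IDF first.
    static List<String> rankByTfIdf(Map<String, Integer> tagCounts,
                                    Map<String, Integer> artistsWithTag,
                                    int totalArtists) {
        List<String> tags = new ArrayList<>(tagCounts.keySet());
        tags.sort((a, b) -> {
            double scoreA = tfIdf(tagCounts.get(a), artistsWithTag.getOrDefault(a, 1), totalArtists);
            double scoreB = tfIdf(tagCounts.get(b), artistsWithTag.getOrDefault(b, 1), totalArtists);
            int byScore = Double.compare(scoreB, scoreA);
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return tags;
    }

    public static void main(String[] args) {
        // Invented counts for illustration; they are not data from the talk.
        Map<String, Integer> beatlesTags = Map.of("rock", 100, "60s", 60, "liverpool", 20);
        Map<String, Integer> artistsWithTag = Map.of("rock", 900, "60s", 100, "liverpool", 5);

        System.out.println(rankByCount(beatlesTags));
        System.out.println(rankByTfIdf(beatlesTags, artistsWithTag, 1000));
    }
}
```

With these invented numbers, "rock" leads the count-based ranking but drops to last under TF-IDF, because almost every artist in the made-up collection carries it.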
Project Aura Technology Summary
• No cold start problem
• Transparent recommendations
• Steerable recommendations
• Less susceptible to hacking / shilling
• Immune to typical biases
How do Project Aura recommendations compare to other systems?
Recommender Performance
Recommender System    Average Rating
Sun Aura               4.02
Rec Sys A              3.68
Rec Sys B              3.50
Sun CF                 3.48
Rec Sys C              3.26
Critic 1               2.89
Critic 2               2.76
Rec Sys D              2.59
Rec Sys E              2.06
Rec Sys F              1.82
Critic 3               1.64
Critic 4               1.59
Critic 5               1.14
Rec Sys G              0.89
Rec Sys H              0.82
Rec Sys I             -2.39

Web Survey:
• 200 Participants
• > 10,000 data points
Ranked:
• 2 Sun Labs recommenders
• 9 Commercial recommenders
• 5 Professional music critics
Performance:
• 280,000 artists
• 7.2 million tags
• Find Similar Artist: 200 ms
Agenda
• What is recommendation and why do we care
• What is wrong with existing recommendations
• How Project Aura fixes these problems
• Using Project Aura
• Demonstration
Project Aura Java™ APIs
Open Source – available soon at tastekeeper.com
Host it yourself
• Advantages
  • You keep your data to yourself
  • You can customize it
• Disadvantages
  • You have to buy computers
  • You can't benefit from other big data
Using Aura – Defining Items and Users
// Creating and adding a new artist
Item artist = aura.newItem(ARTIST, "6fe07", "Weezer");
artist.addField("BIOGRAPHY", "Weezer is a Rock Band...");
aura.putItem(artist);

// Creating and adding a new track
Item track = aura.newItem(TRACK, "39173", "Hash Pipe");
track.addField("ARTIST_KEY", artist.getKey());
track.addField("REVIEW", "The opening guitar riff of...");
track.addField("LYRICS", "I can't help my feelings...");
aura.putItem(track);

// Creating and adding a user
User user = aura.newUser("http://openid.sun.com/plamere", "Paul");
aura.addUser(user);
Using Aura – Adding Attention Data
// Look up the user and the items
User fan = aura.getUser("http://openid.sun.com/plamere");
Item weezer = aura.findItemByName("Weezer");
Item paris = aura.findItemByName("Paris Hilton");

// Pay attention to the items
aura.attend(fan, weezer, Attention.PLAY);
aura.attend(fan, weezer, Attention.RATE, Rating.FIVE_STARS);
aura.attend(fan, weezer, Attention.TAG, "nerd rock");
aura.attend(fan, weezer, Attention.TAG, "seen live");
aura.attend(fan, paris, Attention.SKIP);
Using Aura – Item Similarity
// Look up the item
Item weezer = aura.findItemByName("Weezer");

// find 10 artists most similar to Weezer
SortedSet<Scored<Item>> items = aura.findSimilar(weezer, 10);

// find 10 items most similar based on tags
items = aura.findSimilar(weezer, TAGS, 10);

// find 10 most similar based on tags, biography and decade
WeightedFields[] tagsBioDecade = {...};
items = aura.findSimilar(weezer, tagsBioDecade, 10);
Using Aura – Item Recommendation
// find 10 artists to recommend to a user
List<Recommendation> recommendations = aura.recommend(user, ARTISTS, 10);
for (Recommendation r : recommendations) {
    log("Recommended " + r.getItem() + " for user " + user + " reason: " + r.getExplanation());
}

// find 10 tracks suitable for exercise
RecommenderProfile joggingProfile = new RecommenderProfile();
joggingProfile.add("BPM", "> 120");
joggingProfile.add("PERIOD", "70s");
// a separate variable name avoids redeclaring "recommendations" in the same scope
List<Recommendation> joggingRecommendations = aura.recommend(user, TRACKS, 10, joggingProfile);
Project Aura Web Services
Open / Free Web Services for Recommendation
• You provide taste data
• Project Aura provides recommendations
Advantages:
• Simple recommendation solution for Web startups
• We solve the scaling problems
  • Built on top of Project Caroline
  • Reliable / Scalable
  • Ready for millions of users / billions of taste datapoints
Disadvantages:
• You can't customize it
• Your taste data is shared with others
Web-scale Recommendation
[Figure: architecture diagram. Components shown: multiple Web Crawlers; a Data Store made up of Partition Clusters, each with Cluster Replicas that pair a DB with a Search Index; and Collaborative Filtering, Content-Based, and Hybrid Recommenders.]
Using Aura – Web Services
Simple Web Service:
• Input:
  • Attention Profile Markup representing taste
  • Output format
  • Type of items to recommend
  • Number of recommendations desired
  • APML Profile selector
  • Recommendation Profile selector
• Output:
  • APML with recommendations embedded
• Example:
  http://tastekeeper.com/RecommenderService
    ?apmlURL=http://example.com/lamere.apml
    &outputFormat=apml
    &type=artist
    &num=10
    &apmlProfile=MyMusic
    &recProfile=standard
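The example request on this slide can be assembled programmatically. The sketch below only builds the URL string from the parameters shown on the slide and percent-encodes the values; it does not call the service, and whether the service expects encoded values is an assumption. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles the example request URL from its parameters.
final class RecommenderRequest {

    // Appends the parameters to the endpoint as a percent-encoded query string,
    // preserving the iteration order of the map.
    static String buildUrl(String endpoint, Map<String, String> params) {
        StringBuilder url = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey()))
               .append('=')
               .append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Parameter names and values are the ones shown on the slide.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("apmlURL", "http://example.com/lamere.apml");
        params.put("outputFormat", "apml");
        params.put("type", "artist");
        params.put("num", "10");
        params.put("apmlProfile", "MyMusic");
        params.put("recProfile", "standard");

        System.out.println(buildUrl("http://tastekeeper.com/RecommenderService", params));
    }
}
```

A `LinkedHashMap` keeps the parameters in insertion order, so the generated URL lists them in the same order as the slide.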
Attention Profile Markup Language
Recommendation Input
<apml>
  <Profile name="MyMusic">
    <ExplicitData>
      <Concepts>
        <Concept key="Breaking Benjamin" value="1.0"/>
        <Concept key="Deerhoof" value="0.455"/>
        <Concept key="The Arcade Fire" value="0.395"/>
        <Concept key="Radiohead" value="0.384"/>
        <Concept key="Björk" value="0.358"/>
        <Concept key="Weezer" value="0.347"/>
        <Concept key="The Postal Service" value="0.324"/>
        <Concept key="Rodrigo y Gabriela" value="0.205"/>
        <Concept key="Fiona Apple" value="0.201"/>
      </Concepts>
    </ExplicitData>
  </Profile>
</apml>
More info at apml.org
Attention Profile Markup Language
Recommendation output
<apml>
  <Profile name="Music-Recommendations">
    <ImplicitData>
      <Concepts>
        <Concept key="Elefant" value="0.933"/>
        <Concept key="Finger Eleven" value="0.917"/>
        <Concept key="The New Amsterdams" value="0.913"/>
        <Concept key="Gomez" value="0.902" />
        <Concept key="Ocean Colour Scene" value="0.902"/>
        <Concept key="Three Days Grace" value="0.901" />
        <Concept key="Marcy Playground" value="0.900" />
        <!-- elements omitted -->
      </Concepts>
    </ImplicitData>
  </Profile>
</apml>
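A client that receives APML like the output above needs to pull out the recommended concepts and their scores. The sketch below does this with the JDK's built-in DOM parser, assuming only the `Concept` elements with `key` and `value` attributes shown on the slide; the class and method names are hypothetical, and a production parser would also harden the XML factory against untrusted input.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: extract Concept key/value pairs from an APML document.
final class ApmlConcepts {

    // Returns concept keys mapped to their values, in document order.
    static Map<String, Double> parse(String apml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(apml.getBytes(StandardCharsets.UTF_8)));
            NodeList concepts = doc.getElementsByTagName("Concept");
            Map<String, Double> result = new LinkedHashMap<>();
            for (int i = 0; i < concepts.getLength(); i++) {
                Element concept = (Element) concepts.item(i);
                result.put(concept.getAttribute("key"),
                           Double.parseDouble(concept.getAttribute("value")));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse APML", e);
        }
    }

    public static void main(String[] args) {
        String apml = "<apml><Profile name=\"Music-Recommendations\"><ImplicitData><Concepts>"
                + "<Concept key=\"Elefant\" value=\"0.933\"/>"
                + "<Concept key=\"Finger Eleven\" value=\"0.917\"/>"
                + "</Concepts></ImplicitData></Profile></apml>";
        for (Map.Entry<String, Double> e : parse(apml).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```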
Aura Technology Summary
Content-based Recommendation
● Use the content of the item
● Use the text aura of the items
  > Social Tags, Reviews, blog posts, ...
● Use autotagging to populate the aura
Collaborative Filtering
● Use taste data to find similar users, items
● Works best with lots of taste data

● Reduce reliance on big taste data
● Reduce cold start and feedback problems
● Provide a way to explain why and how recommendations were made
For More Information
Aura Links
• tastekeeper.com – general information about Project Aura
  • blogs.tastekeeper.com – a blog recommender based upon Aura
  • music.tastekeeper.com – a music recommender based upon Aura
• Project Caroline – projectcaroline.net
Further reading
• apml.org – attention profile markup language
• Team Blogs
  • blogs.sun.com/plamere
  • blogs.sun.com/searchguy
Alternatives
• taste.sourceforge.net – collaborative filtering in the Java programming language