Artificial Intelligence with Web Applications Dell Zhang Birkbeck, University of London 2010/11

Upload: adele-allison

Post on 05-Jan-2016

Artificial Intelligence with Web Applications

Dell Zhang

Birkbeck, University of London

2010/11

CI Chapter 2

Making Recommendations

Recommendation Systems

State of the Art

An Ensemble
    The Neighbourhood Approach
        User-based Collaborative Filtering
        Item-based Collaborative Filtering
    The Latent-Factor Approach
    ……

The Neighbourhood Approach

The nearest-neighbour method works on the principle that a person tends to give similar ratings to similar movies. Joe likes the three movies on the left, so to make a prediction for him, find users who also liked those movies and see what other movies they liked.

Here the three other viewers all liked Saving Private Ryan, so that is the top recommendation. Two of them liked Dune, so that’s ranked second, and so on.

The Latent-Factor Approach

A second, complementary method scores both a given movie and viewer according to latent factors, themselves inferred from the ratings given to all the movies by all the viewers.

The factors define a space that at once measures the characteristics of movies and the viewer’s interest in those characteristics. Here we would expect the fellow in the southeast corner of the graph to love Norbit, to hate Dreamgirls, and, perhaps, to rate Braveheart about average.
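
One common way to turn such a factor space into a score is to take the dot product of the viewer's factor vector and the movie's factor vector. The sketch below assumes that scoring rule; the two-dimensional vectors, the movie names, and the function name `predict_affinity` are made up for illustration and are not taken from the slide's graph.

```python
# Hypothetical 2-D latent factors; all numbers are illustrative only.
def predict_affinity(viewer, movie):
    """Score a movie for a viewer as the dot product of their factor vectors."""
    return sum(v * m for v, m in zip(viewer, movie))

# A viewer in the "southeast": high on factor 1, low on factor 2.
viewer = (1.0, -1.0)

movies = {
    "movie_near_viewer": (0.9, -0.8),   # same corner as the viewer
    "movie_opposite": (-0.9, 0.8),      # opposite corner
    "movie_neutral": (0.1, 0.1),        # close to the origin
}

# Rank movies from highest to lowest predicted affinity.
ranked = sorted(movies,
                key=lambda name: predict_affinity(viewer, movies[name]),
                reverse=True)
print(ranked)
```

A movie in the viewer's own corner scores highest, one in the opposite corner scores lowest, and one near the origin lands in between.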

Collaborative Filtering (user-based)

Collecting Preferences

Possible mappings of user actions to numerical scores
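
The table from this slide did not survive in the transcript, so the following is only a sketch of the idea. The action names, the scores in `ACTION_SCORES`, and the function `to_preferences` are hypothetical; when a user has several actions on the same item, the sketch keeps the highest score.

```python
# Hypothetical mapping from user actions to numerical scores (illustrative values).
ACTION_SCORES = {"bought": 2.0, "browsed": 1.0, "did_not_buy": 0.0}

def to_preferences(events):
    """Turn (user, item, action) events into a nested {user: {item: score}} dict."""
    prefs = {}
    for user, item, action in events:
        score = ACTION_SCORES[action]
        user_prefs = prefs.setdefault(user, {})
        # Keep the strongest signal seen so far for this user and item.
        user_prefs[item] = max(score, user_prefs.get(item, score))
    return prefs

events = [
    ("ann", "book", "browsed"),
    ("ann", "book", "bought"),
    ("bob", "book", "did_not_buy"),
]
prefs = to_preferences(events)
print(prefs)
```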

Collecting Preferences

A set of movie critics and their ratings of movies

'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5, 'The Night Listener': 3.0}

'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5, 'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5}

'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0, 'Superman Returns': 3.5, 'The Night Listener': 4.0}

'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'The Night Listener': 4.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 2.5}

'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Just My Luck': 2.0, 'Superman Returns': 3.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.0}

'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'The Night Listener': 3.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5}

'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0, 'Superman Returns': 4.0}

Finding Similar Users

Euclidean Distance Score Pearson Correlation Score

Euclidean Distance Score

The straight-line distance between two points in a multidimensional space, which is the kind of distance you measure with a ruler.

Given the Euclidean distance d between two points, their similarity score can be defined as 1/(d+1).
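
As a minimal sketch of this definition, the function below computes 1/(d+1) for two critics. It assumes that only the movies both critics have rated are compared and that critics with no movies in common score 0; the function name `euclidean_score` is also an assumption. The ratings are Lisa Rose's and Gene Seymour's from the Collecting Preferences slide.

```python
from math import sqrt

def euclidean_score(ratings_a, ratings_b):
    """Similarity 1/(d+1), where d is the Euclidean distance over shared items."""
    shared = [item for item in ratings_a if item in ratings_b]
    if not shared:
        return 0.0  # assumption: nothing in common means no similarity
    d = sqrt(sum((ratings_a[item] - ratings_b[item]) ** 2 for item in shared))
    return 1.0 / (1.0 + d)

lisa = {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
        'Superman Returns': 3.5, 'You, Me and Dupree': 2.5,
        'The Night Listener': 3.0}
gene = {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5,
        'Superman Returns': 5.0, 'The Night Listener': 3.0,
        'You, Me and Dupree': 3.5}

score = euclidean_score(lisa, gene)
print(round(score, 4))                      # 0.2943
print(euclidean_score(lisa, lisa))          # 1.0 (identical ratings, d = 0)
```

The score is 1 for identical ratings and falls towards 0 as the distance grows.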

Euclidean Distance Score

The movie critics in (Euclidean) preference space

Pearson Correlation Score

A measure of how correlated two variables are (how well their values fit on a straight line). It is a value between –1 and 1:

1: the variables are perfectly correlated
0: the variables are not correlated
–1: the variables are perfectly inversely correlated

It tends to work better than the Euclidean distance score when the data isn’t well normalized, for example when some critics routinely rate more harshly than others.
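
The score can be sketched with the standard Pearson formula: the covariance of the two rating lists divided by the product of their standard deviations. The function name `pearson_score`, the restriction to shared movies, and returning 0 when the score is undefined are assumptions of this sketch. The ratings are again Lisa Rose's and Gene Seymour's.

```python
from math import sqrt

def pearson_score(ratings_a, ratings_b):
    """Pearson correlation of two rating dicts over the items both have rated."""
    shared = [item for item in ratings_a if item in ratings_b]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [ratings_a[item] for item in shared]
    ys = [ratings_b[item] for item in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0  # assumption: undefined counts as 0

lisa = {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
        'Superman Returns': 3.5, 'You, Me and Dupree': 2.5,
        'The Night Listener': 3.0}
gene = {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5,
        'Superman Returns': 5.0, 'The Night Listener': 3.0,
        'You, Me and Dupree': 3.5}

r = pearson_score(lisa, gene)
print(round(r, 4))  # 0.3961

# A critic who always rates one point higher is still perfectly correlated.
generous = pearson_score({'a': 1.0, 'b': 2.0, 'c': 3.0},
                         {'a': 2.0, 'b': 3.0, 'c': 4.0})
print(generous)     # 1.0
```

The second call illustrates the normalization point: a constant offset between two critics changes their Euclidean distance but not their Pearson correlation.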

Pearson Correlation Score

Comparing two movie critics on a scatter plot

low positive correlation

Pearson Correlation Score

Comparing two movie critics on a scatter plot

high positive correlation

User-based Collaborative Filtering

Step 1: Look for users whose rating patterns are similar to those of the active user (the user for whom the prediction is made), e.g., using the k-Nearest Neighbours algorithm.

Step 2: Use the ratings from those like-minded users to calculate a prediction for the active user.
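
The two steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's reference code: the names `pearson` and `recommend` and the optional `k` parameter are hypothetical, similarity is measured with the Pearson correlation score, and neighbours with zero or negative similarity are ignored. The data is the critics' ratings from the Collecting Preferences slide.

```python
from math import sqrt

critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                  'Just My Luck': 3.0, 'Superman Returns': 3.5,
                  'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                     'Just My Luck': 1.5, 'Superman Returns': 5.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
    'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                         'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                     'The Night Listener': 4.5, 'Superman Returns': 4.0,
                     'You, Me and Dupree': 2.5},
    'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                     'Just My Luck': 2.0, 'Superman Returns': 3.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
    'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                      'The Night Listener': 3.0, 'Superman Returns': 5.0,
                      'You, Me and Dupree': 3.5},
    'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0,
             'Superman Returns': 4.0},
}

def pearson(prefs, a, b):
    """Pearson correlation between users a and b over their shared items."""
    shared = [item for item in prefs[a] if item in prefs[b]]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [prefs[a][item] for item in shared]
    ys = [prefs[b][item] for item in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denom = sqrt(sum((x - mean_x) ** 2 for x in xs) *
                 sum((y - mean_y) ** 2 for y in ys))
    return cov / denom if denom else 0.0

def recommend(prefs, user, k=None):
    """Rank items the user has not rated by similarity-weighted mean rating."""
    # Step 1: find like-minded users (positively correlated, optionally top k).
    sims = [(pearson(prefs, user, other), other)
            for other in prefs if other != user]
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)
    if k is not None:
        neighbours = neighbours[:k]
    # Step 2: predict each unseen item's rating as a weighted mean.
    totals, sim_sums = {}, {}
    for sim, other in neighbours:
        for item, rating in prefs[other].items():
            if item in prefs[user]:
                continue
            totals[item] = totals.get(item, 0.0) + sim * rating
            sim_sums[item] = sim_sums.get(item, 0.0) + sim
    return sorted(((totals[i] / sim_sums[i], i) for i in totals), reverse=True)

recs = recommend(critics, 'Toby')
for score, movie in recs:
    print(f"{movie}: {score:.2f}")
# The Night Listener: 3.35
# Lady in the Water: 2.83
# Just My Luck: 2.53
```

Passing `k=3` would restrict step 1 to Toby's three nearest neighbours instead of using every positively correlated critic.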

User-based Collaborative Filtering

The user-based recommendations for Toby

Weighted Mean

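A type of average in which every observation being averaged has its own weight: (w1·x1 + … + wn·xn) / (w1 + … + wn). In user-based collaborative filtering, the observations are the other users' ratings of an item and the weights are those users' similarity to the active user.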
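
A weighted mean is short enough to sketch directly. The function name `weighted_mean` and the choice to raise an error when the weights sum to zero are assumptions of this sketch; the numbers are illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# The second rating counts three times as much as the first.
result = weighted_mean([3.0, 4.0], [1.0, 3.0])
print(result)  # 3.75
```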

Finding Similar Items

Just switch users and items in the previous slides!
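
With ratings stored as a nested dictionary, switching users and items amounts to flipping the two levels of keys. A minimal sketch follows; the function name `transpose_prefs` is an assumption, and only two critics are used to keep the example short.

```python
def transpose_prefs(prefs):
    """Flip {user: {item: score}} into {item: {user: score}}."""
    flipped = {}
    for user, ratings in prefs.items():
        for item, score in ratings.items():
            flipped.setdefault(item, {})[user] = score
    return flipped

critics = {
    'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                         'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0,
             'Superman Returns': 4.0},
}

by_movie = transpose_prefs(critics)
print(by_movie['Snakes on a Plane'])
# {'Michael Phillips': 3.0, 'Toby': 4.5}
```

Any user-to-user similarity function can then be applied to `by_movie` unchanged to compare movies instead of critics.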

Finding Similar Items

Comparing two movies on a scatter plot

negative correlation

Item-based Collaborative Filtering

Step 1: Build an item-item matrix determining relationships between pairs of items. The similarities between items will not change as often as the similarities between users, so the matrix can be computed in advance.

Step 2: Infer the taste of the active user, using their top-rated items and the above item-item matrix.
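
The two steps might look like the sketch below. Several choices here are assumptions rather than the course's method: item similarity uses the Euclidean distance score 1/(d+1), every item the user has rated is used (not only the top-rated ones), and the names `build_item_table` and `recommend_items` are hypothetical. The exact scores depend on these choices, so no expected numbers are given.

```python
from math import sqrt

critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                  'Just My Luck': 3.0, 'Superman Returns': 3.5,
                  'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                     'Just My Luck': 1.5, 'Superman Returns': 5.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
    'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                         'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                     'The Night Listener': 4.5, 'Superman Returns': 4.0,
                     'You, Me and Dupree': 2.5},
    'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                     'Just My Luck': 2.0, 'Superman Returns': 3.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
    'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                      'The Night Listener': 3.0, 'Superman Returns': 5.0,
                      'You, Me and Dupree': 3.5},
    'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0,
             'Superman Returns': 4.0},
}

def transpose_prefs(prefs):
    """Flip {user: {item: score}} into {item: {user: score}}."""
    flipped = {}
    for user, ratings in prefs.items():
        for item, score in ratings.items():
            flipped.setdefault(item, {})[user] = score
    return flipped

def euclidean_score(ratings_a, ratings_b):
    """Similarity 1/(d+1) over the raters two items have in common."""
    shared = [k for k in ratings_a if k in ratings_b]
    if not shared:
        return 0.0
    d = sqrt(sum((ratings_a[k] - ratings_b[k]) ** 2 for k in shared))
    return 1.0 / (1.0 + d)

def build_item_table(prefs, n=10):
    """Step 1: for every item, precompute its n most similar items."""
    by_item = transpose_prefs(prefs)
    table = {}
    for item in by_item:
        scores = [(euclidean_score(by_item[item], by_item[other]), other)
                  for other in by_item if other != item]
        table[item] = sorted(scores, reverse=True)[:n]
    return table

def recommend_items(prefs, table, user):
    """Step 2: score unseen items by a similarity-weighted mean of the
    user's own ratings of related items."""
    rated = prefs[user]
    totals, sim_sums = {}, {}
    for item, rating in rated.items():
        for sim, other in table[item]:
            if other in rated:
                continue
            totals[other] = totals.get(other, 0.0) + sim * rating
            sim_sums[other] = sim_sums.get(other, 0.0) + sim
    return sorted(((totals[i] / sim_sums[i], i)
                   for i in totals if sim_sums[i] > 0), reverse=True)

item_table = build_item_table(critics)      # can be computed ahead of time
item_recs = recommend_items(critics, item_table, 'Toby')
for score, movie in item_recs:
    print(f"{movie}: {score:.2f}")
```

Only `recommend_items` needs to run when a recommendation is requested; the table from step 1 can be rebuilt occasionally as new ratings arrive.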

Item-based Collaborative Filtering

The item-based recommendations for Toby

User-based vs Item-based

Efficiency: The item-based approach is significantly faster than the user-based approach on large datasets, because it allows many calculations to be performed in advance, but it has the additional overhead of maintaining the item similarity table.

Effectiveness: The item-based approach usually outperforms the user-based approach on sparse datasets, while the two perform about equally on dense datasets.

Simplicity: The user-based approach is simpler to implement, as it doesn’t have the extra steps of building and maintaining the item similarity table.