TRANSCRIPT
Part 15: Knowledge-Based Recommender Systems
Francesco Ricci
2
Content
- Knowledge-based recommenders: definition and examples
- Case-Based Reasoning
- Instance-Based Learning
- A recommender system exploiting a “simple” case model (the product is a case)
- A more complex CBR recommender system for travel planning
3
“Core” Recommendation Techniques
[Burke, 2007]
U is a set of users; I is a set of items/products
4
Knowledge-Based Recommender
- Suggests products based on inferences about a user’s needs and preferences
- Functional knowledge: about how a particular item meets a particular user need
- The user model can be any knowledge structure that supports this inference:
  - A query, i.e., the set of preferred features for a product
  - A case (in a case-based reasoning system)
  - An adapted similarity metric (for matching)
  - A part of an ontology
- Domain knowledge, encoded in a knowledge representation language/approach, is used extensively.
5
ActiveBuyersGuide
Wizard: My Product Advisor
The system decides what the wizard says
Possible user requests
11
Trip.com
15
Matching in TripleHop
Example: TripleHop
[Figure: user model C-UM:00341, represented as a tree of activities (relaxing → lying on a beach, sitting in cafes; shopping) plus constraints (meat = beef, budget = 200), matched against a Catalogue of Destinations.]
[Delgado and Davidson, 2002]
16
TripleHop and Content-Based RS
- The content (destination description) is exploited in the recommendation process
- A classical content-based method would have used a “simpler” content model, e.g., keywords or TF-IDF
- Here a more complex knowledge structure – a tree of concepts – is used to model the product (and the query)
- The query is the user model, and it is acquired every time the user asks for a new recommendation (not exactly – more details later)
  - Stress on ephemeral needs rather than on building a persistent user model
- This is typical of knowledge-based RSs: they focus on ephemeral users, because collaborative filtering and content-based methods cannot cope with such users.
17
Learning User Profile: query mining

[Figure: query mining example]
- Old query (user model) C-UM:00341: activities = relaxing (lying on a beach, sitting in cafes), shopping; constraints: meat = beef, budget = 200; selected destination: Crete.
- New user request C-UM:00357: activities = relaxing (lying on a beach), adventure (hiking); constraint: meat = pork.
- New user request as computed by the system, C-UM:00357bis: activities = relaxing (lying on a beach), adventure (hiking); constraint: meat = pork; plus the preferences inferred from the old query, shown shadowed, i.e., less important: budget = 200, sitting in cafes, shopping.
18
Query Augmentation
- Personalization in search is not only “information filtering”
- Query augmentation: when a query is entered, it can be compared against contextual and individual information to refine it
  - Ex. 1: if the user is searching for a restaurant and enters the keyword “Thai”, the query can be augmented to “Thai food” (see Part 8 – query expansion based on co-occurrence analysis in the corpus of documents)
  - Ex. 2: if the query “Thai food” does not retrieve any restaurant, the query can be relaxed to “Asian food”
  - Ex. 3: if the query “Asian food” retrieves too many restaurants, and the user searched in the past for “Chinese” food, the query can be refined to “Chinese food”.
19
Query Augmentation in TripleHop
1. The current query is compared with previous queries of the same user
2. Preferences expressed in past (similar) queries are identified
3. A new query is built by combining the short-term preferences contained in the query with the “inferred” preferences extracted from the persistent user model (past queries)
4. When the query is matched against the items (destinations), if two destinations have the same degree of matching on the explicit preferences, the “inferred” preferences are used to break the tie
- This is another example of the cascade approach: the two combined RSs are based on the same knowledge but use two different definitions of the user model.
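The four steps above can be sketched in code. This is an illustrative sketch, not TripleHop's actual implementation: the set-based preference model, the Jaccard similarity between queries, and all names are assumptions.

```python
# Illustrative sketch of TripleHop-style query augmentation: explicit
# preferences drive the match, inferred ones only break ties (cascade).

def jaccard(a, b):
    """Similarity between two preference sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def augment(query, past_queries, k=1):
    """Steps 1-3: find the k most similar past queries and extract the
    preferences they contain but the current query does not."""
    ranked = sorted(past_queries, key=lambda q: jaccard(query, q), reverse=True)
    return set().union(*ranked[:k]) - query if ranked else set()

def rank(items, query, inferred):
    """Step 4: sort by explicit matches first; inferred preferences
    only decide between items with the same explicit match degree."""
    return sorted(items,
                  key=lambda item: (len(query & item), len(inferred & item)),
                  reverse=True)
```

Here each item and each query is simply a set of preferred features; tuple comparison in `rank` implements the cascade.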
20
What is Case-Based Reasoning?
- A case-based reasoner solves new problems by adapting solutions that were used to solve old problems (Riesbeck & Schank, 1989)
- CBR problem-solving process:
  - store previous experiences (cases) in memory
  - to solve new problems:
    - Retrieve from memory experiences about similar situations
    - Reuse the experience in the context of the new situation: complete or partial reuse, or adaptation according to the differences
    - Store the new experience in memory (learning)
[Aamodt and Plaza, 1994]
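The retrieve/reuse/retain cycle can be sketched minimally. The case representation (a problem/solution pair) and the similarity function are illustrative assumptions, not a specific system's API.

```python
# A minimal sketch of the CBR cycle (retrieve / reuse / retain).

class CaseBase:
    def __init__(self, sim):
        self.sim = sim           # similarity between two problem descriptions
        self.cases = []          # stored experiences: (problem, solution)

    def retrieve(self, problem):
        """Find the stored case whose problem is most similar to the new one."""
        return max(self.cases, key=lambda case: self.sim(problem, case[0]))

    def reuse(self, case, problem):
        """Complete reuse: adopt the old solution unchanged (a real system
        would adapt it according to the differences)."""
        return case[1]

    def retain(self, problem, solution):
        """Learning: store the new experience in memory."""
        self.cases.append((problem, solution))
```

For example, with numeric problems one could use `CaseBase(sim=lambda a, b: -abs(a - b))`.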
21
Case-Based Reasoning
[Aha, 1998]
22
CBR Assumption
- A new problem can be solved by:
  - retrieving similar problems
  - adapting the retrieved solutions
- Assumption: similar problems have similar solutions
[Figure: a problem space (P) and a solution space (S); a new problem (?) is mapped to a solution (X) via the solutions of its nearest neighbours in the problem space.]
23
Examples of CBR
- Classification: “The patient’s ear problems are like this prototypical case of otitis media”
- Compiling solutions: “Patient N’s heart symptoms can be explained in the same way as previous patient D’s”
- Assessing values: “My house is like the one that sold down the street for $250,000 but has a better view”
- Justifying with precedents: “This Missouri case should be decided just like Roe v. Wade, where the court held that a state’s limitations on abortion are illegal”
- Evaluating options: “If we attack Cuban/Russian missile installations, it would be just like Pearl Harbor”
24
Instance-based learning – Lazy Learning
- One way of solving tasks of approximating discrete- or real-valued target functions
- Given training examples: (xn, f(xn)), n = 1, ..., N
- Key idea:
  - just store the training examples
  - when a test example is given, find the closest matches
  - use the closest matches to guess the value of the target function on the test example.
25
The distance between examples
- We need a measure of distance (or similarity) in order to know which examples are the neighbours
- Assume that the learning problem has T attributes; then an example point x has elements xt ∈ ℝ, t = 1, ..., T
- The distance between two points x and y is often defined as the Euclidean distance:

  d(x, y) = sqrt( Σt=1..T (xt − yt)² )
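A lazy learner based on this distance can be sketched in a few lines; the data and names here are illustrative, not from the slides.

```python
import math
from collections import Counter

# Sketch of instance-based (lazy) learning with the Euclidean distance above.

def euclidean(x, y):
    # d(x, y) = sqrt( sum_t (x_t - y_t)^2 )
    return math.sqrt(sum((xt - yt) ** 2 for xt, yt in zip(x, y)))

def knn_predict(training, x, k=3):
    """training: list of (x_n, f(x_n)) pairs. Nothing is learned up front;
    all work happens at query time: find the k closest stored examples
    and take a majority vote over their labels."""
    neighbours = sorted(training, key=lambda ex: euclidean(ex[0], x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]
```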
26
[Figure: eight paintings, numbered one to eight; the first seven are labelled as Mondrian or not (yes/no), the eighth is the test instance (?).]
27
Training data

Number  Lines  Line types  Rectangles  Colours  Mondrian?
1       6      1           10          4        No
2       4      2           8           5        No
3       5      2           7           4        Yes
4       5      1           8           4        Yes
5       5      1           10          5        No
6       6      1           8           6        Yes
7       7      1           14          5        No

Test instance

Number  Lines  Line types  Rectangles  Colours  Mondrian?
8       7      2           9           4        ?
28
Feature values are not normalized:

        Lines  LinesT  Rect  Colors  Class  Distance to test
Train1  4      2       8     5       no     3.32
Train2  5      2       7     4       yes    2.83
Train3  5      1       8     4       yes    2.45
Train4  5      1       10    5       no     2.65
Train5  6      1       8     6       yes    2.65
Train6  7      1       14    5       no     5.20
test    7      2       9     4

Feature values are normalized with x' = (x − avg(X)) / (4 · stdev(X)), where x is a value of the feature X:

        Lines  LinesT  Rect   Colors  Class  Distance to test
Train1  -0.32  0.32    -0.11  0.06    no     0.80
Train2  -0.08  0.32    -0.21  -0.28   yes    0.52
Train3  -0.08  -0.16   -0.11  -0.28   yes    0.69
Train4  -0.08  -0.16   0.08   0.06    no     0.77
Train5  0.16   -0.16   -0.11  0.39    yes    0.86
Train6  0.40   -0.16   0.47   0.06    no     0.76
test    0.40   0.32    -0.02  -0.28

What is the difference between this feature-value normalization and vector normalization in IR?
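The unnormalized distances in the first table can be recomputed directly (feature order: Lines, Line types, Rectangles, Colours). The `normalize` helper follows the slide's formula; which examples enter the per-feature statistics is an assumption here.

```python
import math
import statistics

# Recomputing the unnormalized Euclidean distances to the test instance.

train = {
    "Train1": (4, 2, 8, 5), "Train2": (5, 2, 7, 4), "Train3": (5, 1, 8, 4),
    "Train4": (5, 1, 10, 5), "Train5": (6, 1, 8, 6), "Train6": (7, 1, 14, 5),
}
test = (7, 2, 9, 4)

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def normalize(value, column):
    # x' = (x - avg(X)) / (4 * stdev(X))
    return (value - statistics.mean(column)) / (4 * statistics.stdev(column))

distances = {name: round(euclidean(x, test), 2) for name, x in train.items()}
```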
29
Example of CBR Recommender System
- Entrée is a restaurant recommender system – it finds restaurants:
  1. matching some user goals (case features)
  2. or similar to restaurants the user knows and likes
30
The Product is the Case
- In Entrée a case is a restaurant – the case is the product
- The problem component is the description of the restaurant given by the user
- The user will input a partial description of it – this is the only difficulty
- The solution part of the case is the restaurant itself – i.e., the name of the restaurant
- The assumption is that the needs of the user can be modeled as the features of the product description …
31
Partial Match
- In general, only a subset of the preferences will be matched by the recommended restaurant.
32
Nearest Neighbor
33
Recommendation in Entree
- The system first selects from the database the set of all restaurants that satisfy the largest number of the logical constraints generated from the types and values of the input features
- If necessary, it implicitly relaxes the least important constraints until some restaurants can be retrieved
  - Typically the relaxation of constraints will produce many restaurants in the result set
- It then sorts the retrieved cases using a similarity metric that takes into account all the input features.
34
Similarity in Entree
- This similarity metric assumes that the user goals, corresponding to the input features (or to the features of the source case), can be sorted to reflect the importance of those goals from the user's point of view
- Hence the global similarity metric (algorithm) sorts the products first with respect to the most important goal and then, iteratively, with respect to the remaining goals (multi-level sort)
- Attention: it does not work as a maximization of a utility/similarity defined as the sum of local utilities.
35
Example
- If the user query q is: price = 9 AND cuisine = B AND atmosphere = B
- And the weights (importance) of the features are: 0.5 for price, 0.3 for cuisine, and 0.2 for atmosphere
- Then Entrée will suggest Dolce first (and then Gabbana)
- A more traditional CBR system will suggest Gabbana first, because the similarities are (30 is the price range):
  - Sim(q, Dolce) = 0.5 · (1 − 1/30) + 0.3 · 0 + 0.2 · 0 = 0.48
  - Sim(q, Gabbana) = 0.5 · (1 − 3/30) + 0.3 · 1 + 0.2 · 1 = 0.45 + 0.3 + 0.2 = 0.95

Restaurant  Price  Cuisine  Atmosphere
Dolce       10     A        A
Gabbana     12     B        B
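The two ranking strategies can be compared directly on this example. This is a sketch under the slide's assumptions (price range 30, exact-match similarity for the nominal features); the data structures and names are illustrative.

```python
# Entrée's multi-level sort vs. a weighted-sum ("traditional CBR") similarity.

query = {"price": 9, "cuisine": "B", "atmosphere": "B"}
weights = {"price": 0.5, "cuisine": 0.3, "atmosphere": 0.2}
restaurants = {
    "Dolce":   {"price": 10, "cuisine": "A", "atmosphere": "A"},
    "Gabbana": {"price": 12, "cuisine": "B", "atmosphere": "B"},
}

def local_sim(feature, r):
    if feature == "price":
        return 1 - abs(query["price"] - r["price"]) / 30
    return 1.0 if r[feature] == query[feature] else 0.0

def weighted_sum(r):
    # sum of weighted local utilities
    return sum(w * local_sim(f, r) for f, w in weights.items())

# Multi-level sort: compare on the most important feature first; less
# important features only break ties (tuple comparison does exactly this).
by_importance = sorted(weights, key=weights.get, reverse=True)
entree_order = sorted(restaurants,
                      key=lambda n: tuple(local_sim(f, restaurants[n])
                                          for f in by_importance),
                      reverse=True)
cbr_order = sorted(restaurants, key=lambda n: weighted_sum(restaurants[n]),
                   reverse=True)
```

Dolce wins the multi-level sort because its price similarity (29/30) beats Gabbana's (27/30), so the less important features are never consulted.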
36
http://www.visitfinland.com/web/guest/travel-planner/home
37
38
Query Tightening
39
40
[Ricci et al., 2002]
41
NutKing as a CBR System
- Problem = recommend a set of tourism-related products and build a travel plan
- Cases = all the travel plans that users have built with the system (how they were built and what they contain)
- Retrieval = search the memory for travel plans built during “similar” recommendation sessions
- Reuse =
  1. extract elementary components (items) from previous travel plans and use them to build the new plan
  2. rank the items found in the catalogues
42
Travel Plan and Interaction Session Model
[Figure: the case model of a travel plan and interaction session]
- Collaborative component 1 – the travel wish clf, e.g., (family, bdg_medium, 7, Hotel)
- Queries on content attributes, cnq, e.g., (category=3 AND Health=True), (Golfing=True AND Nightlife=True)
- Collaborative component 2 – the travel bag of selected products, with ratings, e.g., (Kitzbühel, True, True, ...), (Hotel Schwarzer, 3, True, ...)
43
Item Ranking
[Figure: the item-ranking process]
1. Search the catalogue with the input query Q, retrieving travel components (locations loc1, loc2, loc3); interactive query management may suggest changes to Q
2. Search the case base for cases similar to the current case (tb, u, r, twc)
3. Output the reference set of retrieved cases
4. Sort the locations loci by similarity to the locations contained in the reference cases, and output the ranked items
44
Two-fold Similarity
[Figure: the target user's target session is compared with the stored sessions s1–s6 (sessions similarity); the products contained in the most similar sessions are then compared with the candidate products (product similarity).]
45
Rank using Two-Fold Similarity
- Given the current session case c and a set of products R retrieved using the interactive query management facility (IQM):
  1. retrieve the 10 cases (c1, ..., c10) from the repository of stored cases (recommendation sessions managed by the system) that are most similar to c with respect to the collaborative features
  2. extract from the cases (c1, ..., c10) the products (p1, ..., p10) of the same type as those in R
  3. for each product r in R, compute Score(r) as the maximum over i of the product of: a) the similarity of r to pi, and b) the similarity of the current case c to the retrieved case ci containing pi
  4. sort and display the products in R according to Score(r).
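The scoring step can be sketched with the numbers of the worked example on the next slide (two destinations D1, D2; two retrieved cases C1, C2); the dictionary-based representation is an illustrative assumption.

```python
# Sketch of the two-fold similarity score:
# Score(r) = max_i Sim(c, c_i) * Sim(r, p_i)

session_sim = {"C1": 0.2, "C2": 0.6}     # Sim(c, c_i) on collaborative features
product_sim = {                          # Sim(r, p_i), p_i from case c_i
    ("D1", "C1"): 0.4, ("D1", "C2"): 0.7,
    ("D2", "C1"): 0.5, ("D2", "C2"): 0.3,
}

def score(r, retrieved_cases):
    return max(session_sim[c] * product_sim[(r, c)] for c in retrieved_cases)

scores = {d: round(score(d, ["C1", "C2"]), 2) for d in ("D1", "D2")}
```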
46
Example: Scoring Two Destinations
[Figure: destinations D1 and D2 match the user's query in the current case CC; the case base contains two similar cases, C1 (containing destination CD1) and C2 (containing destination CD2).]

Sim(CC, C1) = 0.2, Sim(CC, C2) = 0.6
Sim(D1, CD1) = 0.4, Sim(D1, CD2) = 0.7, Sim(D2, CD1) = 0.5, Sim(D2, CD2) = 0.3

Score(Di) = Maxj {Sim(CC, Cj) · Sim(Di, CDj)}

Score(D1) = Max{0.2 · 0.4, 0.6 · 0.7} = 0.42
Score(D2) = Max{0.2 · 0.5, 0.6 · 0.3} = 0.18
47
Tree-based Case Representation

- A case is a rooted tree, and each node has:
  - a node-type: similarity between two nodes in two cases is defined only for nodes with the same node-type
  - a metric-type: the node content structure – how to measure the similarity of the node with another node in a second case

[Figure: example case tree]
c1 (nt: case, mt: vector)
  clf1
  cnq1
  cart1 (nt: cart, mt: vector)
    dests1 (nt: destinations, mt: set)
      dest1 (nt: destination, mt: vector) – the ITEM
        X1 (nt: location, mt: hierarchical), X2, X3, X4
      dest2
    accs1
    acts1
48
Item Representation
Node  Type       Metric Type                            Example: Canazei
X1    LOCATION   Set of hierarchically related symbols  Country=ITALY, Region=TRENTINO, TouristArea=FASSA, Village=CANAZEI
X2    INTERESTS  Array of Booleans                      Hiking=1, Trekking=1, Biking=1
X3    ALTITUDE   Numeric                                1400
X4    LOCTYPE    Array of Booleans                      Urban=0, Mountain=1, Riverside=0

TRAVELDESTINATION = (X1, X2, X3, X4)

dest1: X1 = (Italy, Trentino, Fassa, Canazei), X2 = (1, 1, 1), X3 = 1400, X4 = (0, 1, 0)
49
Item Query Language

- For querying purposes, items are represented as simple feature vectors x = (x1, ..., xn)
- A query is a conjunction of constraints over the features: q = c1 ∧ c2 ∧ ... ∧ cm, where m ≤ n and

  ci =  (xi = true)       if xi is boolean
        (xi = vi)         if xi is nominal
        (li ≤ xi ≤ ui)    if xi is numerical

Example: dest1, with X1 = (Italy, Trentino, Fassa, Canazei), X2 = (1, 1, 1), X3 = 1400, X4 = (0, 1, 0), is flattened into the vector (Italy, Trentino, Fassa, Canazei, 1, 1, 1, 1400, 0, 1, 0)
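Evaluating such a conjunctive query can be sketched as follows; the feature names and the encoding of the three constraint types are illustrative assumptions.

```python
# Sketch of evaluating a conjunctive query over an item's feature vector.

def satisfies(item, query):
    """item: dict feature -> value; query: list of (feature, constraint),
    where a constraint is True (boolean feature), a (lo, hi) range
    (numerical feature), or a plain value (nominal feature)."""
    for feature, constraint in query:
        value = item[feature]
        if constraint is True:                 # x_i = true
            if value not in (True, 1):
                return False
        elif isinstance(constraint, tuple):    # l_i <= x_i <= u_i
            lo, hi = constraint
            if not lo <= value <= hi:
                return False
        elif value != constraint:              # x_i = v_i
            return False
    return True

dest1 = {"Village": "Canazei", "Hiking": 1, "Altitude": 1400, "Mountain": 1}
```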
50
Item Similarity

If X and Y are two items with the same node-type:

  d(X, Y) = (1/Σi wi)^(1/2) · [Σi wi · di(Xi, Yi)²]^(1/2)

where 0 ≤ wi ≤ 1 and i = 1, ..., n (the number of features). The local distances are:

  di(Xi, Yi) =  1                     if Xi or Yi is unknown
                overlap(Xi, Yi)       if Xi is symbolic
                |Xi − Yi| / rangei    if Xi is a finite integer or real
                Jaccard(Xi, Yi)       if Xi is an array of Booleans
                Hierarchical(Xi, Yi)  if Xi is a hierarchy
                Modulo(Xi, Yi)        if Xi is a circular feature (e.g., month)
                Date(Xi, Yi)          if Xi is a date

  Sim(X, Y) = 1 − d(X, Y)   or   Sim(X, Y) = exp(−d(X, Y))

Example item dest1: X1 = (Italy, Trentino, Fassa, Canazei), X2 = (1, 1, 1), X3 = 1400, X4 = (0, 1, 0)
51
Item Similarity Example

dest1: X1 = (I, TN, Fassa, Canazei), X2 = (1, 1, 1), X3 = 1400, X4 = (0, 1, 0)
dest2: Y1 = (I, TN, Fassa, ?),       Y2 = (1, 0, 1), Y3 = 1200, Y4 = (1, 1, 0)

Sim(dest1, dest2) = exp( −√( (1/4) · [d1(X1, Y1)² + ... + d4(X4, Y4)²] ) )
                  = exp( −√( (1/4) · [(0.3)² + (1 − 2/3)² + ((1400 − 1200)/2000)² + (1 − 1/2)²] ) )
                  = exp( −√( (1/4) · 0.461 ) ) = exp(−0.339) = 0.712

(The hierarchical distance takes the values 1, 0.7, 0.5, 0.3 depending on the deepest level at which the two values still match; for the Jaccard distances, X2 and Y2 share 2 of the 3 elements in their union, and X4 and Y4 share 1 of the 2.)
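The example can be recomputed in code. The local metrics below are simplified reconstructions: the level distances 1/0.7/0.5/0.3 and the altitude range 2000 come from the slide, while the exact handling of unknown values is an assumption.

```python
import math

# Recomputing the item similarity example with all weights w_i = 1.

def jaccard_dist(x, y):
    inter = sum(1 for a, b in zip(x, y) if a == b == 1)
    union = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    return 1 - inter / union

def hierarchical_dist(x, y):
    # distance depends on the deepest level at which the values still match;
    # an unknown value ('?') or a mismatch stops the matching
    level_dist = [1.0, 0.7, 0.5, 0.3, 0.0]
    matched = 0
    for a, b in zip(x, y):
        if a != b or a == "?":
            break
        matched += 1
    return level_dist[matched]

def numeric_dist(x, y, rng=2000):
    return abs(x - y) / rng

dest1 = [("I", "TN", "Fassa", "Canazei"), (1, 1, 1), 1400, (0, 1, 0)]
dest2 = [("I", "TN", "Fassa", "?"),       (1, 0, 1), 1200, (1, 1, 0)]

local = [hierarchical_dist, jaccard_dist, numeric_dist, jaccard_dist]
d = math.sqrt(sum(f(x, y) ** 2 for f, x, y in zip(local, dest1, dest2)) / 4)
sim = math.exp(-d)
```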
52
Case Distance

[Figure: two case trees, c1 and c2, with the structure of the previous slides. Case c2 has components clf2, cnq2 and cart2; cart2 contains dests2 = {dest3, dest4, dest5}, accs2 and acts2, and each destination has features Y1, Y2, Y3, Y4.]
53
Case Distance

The distance between two cases combines the weighted distances of their top-level components:

  d(c1, c2) = √( (1/Σi Wi) · [W1 · d(cart1, cart2)² + W2 · d(clf1, clf2)² + W3 · d(cnq1, cnq2)²] )
54
The distance between the two carts (mt: vector) combines the distances of their components:

  d(cart1, cart2)² = d(dests1, dests2)² + d(accs1, accs2)² + d(acts1, acts2)²
55
The distance between the two destination sets (mt: set) is the average of the pairwise destination distances; here dests1 = {dest1, dest2} and dests2 = {dest3, dest4, dest5}:

  d(dests1, dests2) = (1/(2·3)) · [d(dest1, dest3) + d(dest1, dest4) + d(dest1, dest5) + d(dest2, dest3) + d(dest2, dest4) + d(dest2, dest5)]
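The set metric (average of all pairwise element distances) is a one-liner; the function and parameter names are illustrative.

```python
# Sketch of the set metric used for the destinations node: the distance
# between two sets is the average of all pairwise element distances.

def set_distance(s1, s2, d):
    return sum(d(a, b) for a in s1 for b in s2) / (len(s1) * len(s2))
```

With |dests1| = 2 and |dests2| = 3 this averages the 6 pairwise distances, matching the 1/(2·3) factor above.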
56
CBR Knowledge Containers
- CBR is a knowledge-based approach to problem solving
- The knowledge is “contained” in four containers:
  - Cases: the instances belonging to our case base
  - Case representation language: the representation language that we decided to use to represent cases
  - Retrieval knowledge: the knowledge encoded in the similarity metric and in the retrieval algorithm
  - Adaptation knowledge: how to reuse a retrieved solution to solve the current problem.
57
Conclusions
- Knowledge-based systems exploit knowledge to map a user to the products she likes
- KB systems use a variety of techniques
- Knowledge-based systems require a significant effort in terms of knowledge extraction, representation and system design
- Many KB recommender systems are rooted in Case-Based Reasoning
- Similarity of complex data objects is often required in KB RSs
- NutKing is a hybrid case-based recommender system
- In NutKing the case is the recommendation session.
58
Questions

- What are the main differences between a CF recommender system and a KB RS (such as activebuyers.com or Entrée)?
- What is the role of query augmentation?
- What is the basic rationale of a CBR recommender system?
- What is a case in a CBR recommender system such as Entrée?
- How does a CBR recommender system learn to recommend?
- What are the knowledge containers in a CBR RS?
- What are the main differences between a “classical” CBR recommender system such as Entrée and NutKing?
- What are the motivations for the introduction of the two-fold similarity ranking method?
- What are the types of local similarity metrics used in NutKing?