Recommender Systems: Challenges and Opportunities Ayse Bener January 22, 2015


Page 1: Recommender systems   bener

Recommender Systems: Challenges and Opportunities

Ayse Bener January 22, 2015

Page 2: Recommender systems   bener

From Search to Recommendation

- “Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn’t know how to ask for, finds you.” – CNN Money, “The race to create ‘smart’ Google.”

Page 3: Recommender systems   bener

Recommender problem

Estimate a utility function to predict how a user will like an item

The user is:
- Consumer
- Subscriber
- Member

The item is:
- Movie
- Apps
- Travel destinations
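As a minimal sketch of what "estimating a utility function" can mean, the snippet below fits a simple baseline predictor (global mean plus a per-user and a per-item offset). This is an illustrative choice, not a method taken from the slides; the ratings, user names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ratings: (user, item, rating on a 1-5 scale).
ratings = [
    ("ann", "movie_a", 5), ("ann", "movie_b", 3),
    ("bob", "movie_a", 4), ("bob", "movie_c", 2),
    ("cat", "movie_b", 4), ("cat", "movie_c", 1),
]

def fit_baseline(ratings):
    """Estimate the global mean, per-user offsets, and per-item offsets."""
    mu = sum(r for _, _, r in ratings) / len(ratings)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for user, item, r in ratings:
        by_user[user].append(r - mu)
        by_item[item].append(r - mu)
    user_bias = {u: sum(v) / len(v) for u, v in by_user.items()}
    item_bias = {i: sum(v) / len(v) for i, v in by_item.items()}
    return mu, user_bias, item_bias

def predict_utility(user, item, model):
    """Predicted utility = global mean + user offset + item offset."""
    mu, user_bias, item_bias = model
    # Unknown users or items fall back to a zero offset.
    return mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)

model = fit_baseline(ratings)
print(round(predict_utility("ann", "movie_c", model), 2))  # about 2.33
```

A user or item with no ratings simply receives the global mean, which is one crude way to produce an estimate when no history exists.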

Page 4: Recommender systems   bener

Recommender

- A good recommendation
  - Relevant to the user
  - Personalized
  - Diverse
  - Expands the user's taste into neighboring areas (serendipity – unsought finding)

Page 5: Recommender systems   bener

Paradigm of Recommender Systems

- Recommender systems reduce information overload by estimating relevance
- Collaborative filtering: What is popular in a community
  - User profile & community information
- Content Based: Provides more of what the user liked before
  - User profile & item profile
- Knowledge Based: What is best based on the users’ needs
  - User profile & item profile & knowledge model
- Hybrid Method: Combination of inputs and/or composition of different methods
  - User profile & item profile & knowledge model & community information
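To make the collaborative filtering paradigm concrete, here is a small sketch of one common variant: user-based neighborhood filtering with cosine similarity. It is an assumption for illustration only; the slides do not say which algorithm is meant, and the data and function names are hypothetical.

```python
import math

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "ann": {"a": 5, "b": 4, "c": 1},
    "bob": {"a": 4, "b": 5, "d": 5},
    "cat": {"a": 1, "c": 5, "d": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None

cf_score = predict("ann", "d", ratings)
print(round(cf_score, 2))
```

Only the community's ratings are used here. A content-based method would instead compare item attributes against the user's own history, and a hybrid would combine both signals.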

Page 6: Recommender systems   bener

Recommender Systems Challenges

- Dealing with Big Data problems
- Lack of useful data
- Unstructured data
- Missing data
- New user and new item
- Cold start problem
- Temporality
- Changing data
- Changing user preferences and biases
- Negative choices
- Evaluating recommenders
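For the last challenge, evaluating recommenders, a common offline starting point is precision and recall over the top-k recommended items. The sketch below assumes a ranked recommendation list and a set of items the user actually found relevant; both are hypothetical, and the slides do not name a specific metric.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user's ranked recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ranked output of a recommender and held-out relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"b", "d", "f"}

p, r = precision_recall_at_k(recommended, relevant, k=3)
print(p, r)
```

Accuracy metrics like these say nothing about diversity or serendipity, which need separate measures.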

Page 7: Recommender systems   bener

Main Research Issues

- Understanding the context and modeling context
- Algorithms
- Evaluation
- Engineering

Page 8: Recommender systems   bener

Bayesian Networks For Evidence-Based Decision-Making in Software Engineering

Ayse Tosun Misirli and Ayse Bener, IEEE Transactions on Software Engineering, vol. 40, no. 6, June 2014

Page 9: Recommender systems   bener

Recommendation systems for software engineering (RSSE)

- Recommendation systems / prediction models should be designed so that they can integrate evidence, i.e., facts and probabilities systematically collected or measured from real data and observations, into practitioners’ experience.

- In this study, we follow the lead of computational biology and healthcare decision-making, and investigate the applications of Bayesian networks (BNs) in software engineering (SE)

Page 10: Recommender systems   bener

The Bayesian Approach

- Provides a natural statistical framework for evidence-based decision-making by incorporating an integrated summary of the available evidence and associated uncertainty (of consequences)
  - Maintaining observations, statistical distributions, prior assumptions, and expert judgment in a single model
  - Encoding causal relationships among variables for predicting future actions
  - “information propagation through the network”, i.e., gaming over the network to see all possible scenarios and their outcomes to give the best action
- Imitating the process of human thinking, while going beyond the capabilities of human reasoning with a fact-based, error-free intelligence through the usage of enormous amounts of historical data
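The idea of propagating information through a network can be illustrated with a tiny discrete Bayesian network queried by brute-force enumeration. The structure (Complexity -> Defect <- Testing) and every probability below are made-up numbers for illustration; they are not the models or data from the study.

```python
from itertools import product

# Hypothetical network: Complexity -> Defect <- Testing.
# All probabilities are invented for illustration only.
p_complex = {True: 0.3, False: 0.7}   # P(complexity is high)
p_tested = {True: 0.6, False: 0.4}    # P(testing is thorough)
p_defect = {                          # P(defect | complexity, testing)
    (True, True): 0.4, (True, False): 0.8,
    (False, True): 0.1, (False, False): 0.3,
}

def joint(c, t, d):
    """Joint probability of one full assignment of the three variables."""
    pd = p_defect[(c, t)]
    return p_complex[c] * p_tested[t] * (pd if d else 1.0 - pd)

def query(target, evidence):
    """P(target is True | evidence), summing over all consistent assignments."""
    names = ("complex", "tested", "defect")
    num = den = 0.0
    for values in product([True, False], repeat=3):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(*values)
        den += p
        if world[target]:
            num += p
    return num / den

prior = query("defect", {})                      # before any evidence
posterior = query("complex", {"defect": True})   # reasoning from effect to cause
print(round(prior, 3), round(posterior, 3))
```

Observing a defect raises the belief that complexity was high from 0.3 to roughly 0.57, which is the kind of scenario analysis the slide describes. Enumeration only scales to very small networks; larger ones need approximate inference.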

Page 11: Recommender systems   bener

Example of a simple BN with different variable types

Page 12: Recommender systems   bener

Systematic Mapping of BNs in SWE

- To investigate the applications of BNs in SE
  - main software engineering challenges addressed
  - techniques used to learn causal relationships among variables
  - techniques used to infer the parameters
  - variable types used as BN nodes

Page 13: Recommender systems   bener

Empirical Analysis on Bayesian Decision-Making

- Hybrid Bayesian Network that would solve a specific software engineering challenge
  - predicting software reliability in terms of post-release defects

- a ‘mixed data’ model to represent software life cycle phases by incorporating expert judgment (qualitative data through surveys) into quantitative data collected from software repositories

- a ‘hybrid’ BN that incorporates both continuous and categorical variables

Page 14: Recommender systems   bener

Demographics for Two Software Companies

Page 15: Recommender systems   bener

BN Models in this Study

Page 16: Recommender systems   bener

Model Representation

Model #1

Model #2

Model #3

Page 17: Recommender systems   bener

Graphical Representation of BN (Co. A)

Page 18: Recommender systems   bener

Graphical Representation of BN (Co. B)

Page 19: Recommender systems   bener

Setting Prior Distributions

- Model #1
  - expert knowledge
- Model #2
  - Lilliefors significance test on all variables and on post-release defects
  - normal probability plots
- Model #3
  - The requirements specification subnet, whose distributions were set based on expert knowledge, is combined with the development and testing subnet from Model #2, whose variables are assigned different distributions based on the significance tests
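The Lilliefors test used for Model #2 checks normality when the mean and variance are estimated from the data. The sketch below computes only the test statistic with the standard library; the samples are synthetic, and the function name is hypothetical. It is not the authors' analysis script.

```python
import math
import random

def lilliefors_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF whose
    mean and standard deviation are estimated from the same sample."""
    n = len(sample)
    mean = sum(sample) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))
        # Check the gap just after and just before each step of the ECDF.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Synthetic data for illustration: one roughly normal, one skewed metric.
random.seed(0)
normal_like = [random.gauss(10.0, 2.0) for _ in range(200)]
skewed = [random.expovariate(1.0) for _ in range(200)]

d_normal = lilliefors_statistic(normal_like)
d_skewed = lilliefors_statistic(skewed)
print(d_normal, d_skewed)
```

A larger statistic means a worse fit to a normal distribution. For samples larger than about 30, a commonly quoted 5% critical value is approximately 0.886 / sqrt(n). In practice a library routine such as `lilliefors` in `statsmodels.stats.diagnostic` would also return a p-value.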

Page 20: Recommender systems   bener

Structure Learning

- Expert judgement
- Chi-plot
  - Independence between two variables
  - Copula models: a transformation of data with marginal distributions
  - Prior to modeling, it is necessary to check for the presence of dependence

There is a positive monotone dependence between test cases and post-release defects, as data pairs are shifted towards the right from the center
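A chi-plot can be computed directly from paired observations. The sketch below follows the usual Fisher and Switzer construction, where each data point yields a pair (lambda, chi); the test-case and defect counts are invented for illustration and are not the study's data.

```python
import math

def chi_plot_points(xs, ys):
    """Return (lambda_i, chi_i) pairs for a chi-plot of the paired data."""
    n = len(xs)
    points = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Empirical joint and marginal distribution values at point i.
        h = sum(xs[j] <= xs[i] and ys[j] <= ys[i] for j in others) / (n - 1)
        f = sum(xs[j] <= xs[i] for j in others) / (n - 1)
        g = sum(ys[j] <= ys[i] for j in others) / (n - 1)
        denom = f * (1 - f) * g * (1 - g)
        if denom <= 0:
            continue  # chi is undefined at the extreme points; skip them
        chi = (h - f * g) / math.sqrt(denom)
        sign = 1.0 if (f - 0.5) * (g - 0.5) >= 0 else -1.0
        lam = 4 * sign * max((f - 0.5) ** 2, (g - 0.5) ** 2)
        points.append((lam, chi))
    return points

# Hypothetical data: number of test cases and post-release defects.
test_cases = [12, 25, 31, 40, 48, 55, 63, 70, 82, 90]
defects = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11]

points = chi_plot_points(test_cases, defects)
for lam, chi in points:
    print(round(lam, 2), round(chi, 2))
```

Under independence the chi values scatter around zero; values consistently above zero indicate positive dependence. An approximate 95% band of plus or minus 1.78 / sqrt(n) is often drawn around zero for reference.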

Page 21: Recommender systems   bener

Inference

- Bayesian learning for complex models using Monte Carlo methods, especially Gibbs sampling
  - insufficient statistics
  - incomplete data
  - successively sample from the posterior distribution of each node in a Bayesian model given all the others as full conditionals
- successful when estimating the unknown parameters of probability distributions or when conducting empirical analysis to infer true values of a given sample
- makes it possible to predict future scenarios even when some of the input variables are missing
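The core loop of Gibbs sampling, drawing each variable in turn from its full conditional given the others, can be shown on a textbook case: a standard bivariate normal with correlation rho, where both conditionals are known in closed form. This is a generic sketch under that assumption, not the inference procedure used for the BN models in the study.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Sample (x, y) from a standard bivariate normal with correlation rho
    by alternating draws from the two full conditionals."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x = y = 0.0
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_sd)   # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_sd)   # y | x ~ N(rho * x, 1 - rho^2)
        if step >= burn_in:
            samples.append((x, y))
    return samples

def correlation(samples):
    """Sample Pearson correlation of the (x, y) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxy = sum((x - mx) * (y - my) for x, y in samples)
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    syy = sum((y - my) ** 2 for _, y in samples)
    return sxy / math.sqrt(sxx * syy)

samples = gibbs_bivariate_normal(rho=0.8, n_samples=5000)
estimated_rho = correlation(samples)
print(round(estimated_rho, 2))
```

The same scheme handles missing inputs by treating each missing value as one more variable to sample from its conditional, which is why predictions remain possible with incomplete data.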

Page 22: Recommender systems   bener

Prediction Performance of the Models

Page 23: Recommender systems   bener

Threats to Validity

- Internal validity
  - biases during data collection
  - Used scripts to extract data
  - Eliminated outliers
  - BNs for causality and to avoid over-fitting
- Construct validity
  - A large set of metrics was chosen
  - Well-known performance measures are used
- Conclusion validity
  - Non-parametric test (Mann-Whitney U-test), ANOVA, and t-test were used
- External validity
  - we aim to transfer the methodology behind BN construction to enhance the usage of these graphical, probabilistic models in software engineering

Page 24: Recommender systems   bener

Conclusions

- Similar to computational biology and healthcare, we need to make decisions under uncertainty using multiple data sources

- As we understand the dynamics of BNs and the techniques used for model learning, these models would enable us to uncover hidden relationships between variables, which cannot be easily identified by experts

- Understanding the theory behind BNs also gives us the opportunity to adapt these models to different industrial settings by changing the set of metrics, their distributions, and causal relationships among variables

Page 25: Recommender systems   bener

Conclusions

- Integrated tool support (intelligent software delivery platform)
  - Dione – to be integrated into IBM Rational