Improving Social Recommender Systems



IT Pro, July/August 2009. Published by the IEEE Computer Society. 1520-9202/09/$25.00 © 2009 IEEE

Social Computing

Ofer Arazy, University of Alberta
Nanda Kumar, City University of New York
Bracha Shapira, Deutsche Telekom Laboratories at Ben-Gurion University

Recommender systems play a significant role in reducing information overload for people visiting online sites, but their accuracy could be improved by using data from online social networks and electronic communication tools.

Recommender systems are a key component of successful online stores such as Amazon.com, Epinions.com, and Netflix as they help users sort through a site and find relevant information (we discuss the approach behind each of these examples in the “Commercial Social Recommender Systems” sidebar). Since the emergence of social (or collaborative) filtering techniques in the mid-1990s, the industry has adopted a wide variety of collaborative filtering (CF) designs to generate recommendations. Typically, CF works by identifying recommendation sources with preferences similar to the user, identifying items that these sources like (but which the user hasn’t purchased yet), predicting the relevance of these items (based on ratings and the source’s similarity to the user), and recommending the most relevant items.

Today, online communities, with their strong ties and built-in relationships, present an opportunity for enhancing the design of social recommender systems and increasing system prediction accuracy. We can use the various relationships captured in these communities (phrased as “trust” on Epinions and “reputation” on eBay) in new ways, by incorporating better indicators of relationship information. The potential impact of these social recommender systems is not restricted to the public domain: the recent advent of Enterprise 2.0 (the application of Web 2.0 approaches in enterprises) is expected to bring social recommendation techniques to corporate settings.

In this article, we present a framework for social recommender systems that is intended to enhance recommendation accuracy. We model our approach after Arazy and Woo [1], who proposed that the design of systems should be grounded in theoretical foundations. In the context of recommender systems, we believe that designers should consider behavioral theories of persuasion and advice taking when they design social recommender systems. Although the design of existing CF systems assumes that similarities in preferences (as captured in users’ consumption profiles) determine recommendation quality, behavioral theory suggests that other characteristics, such as the source’s trustworthiness and reputation, determine the recipient’s perception of the recommendation.

Should I Take Your Advice?

Online relationships are useful for a variety of purposes, including social (such as those in MySpace), job searching (LinkedIn), information access (Slashdot.org), and commerce (eBay). Although these online ties weren’t established for the purpose of advice taking, recommender systems could use them to link a user with relevant sources. Using previous research in marketing, applied psychology, and organization, we identified four salient constructs that impact a recipient’s advice-taking decision: homophily, tie strength, trust, and social capital. We argue that these constructs are relevant for the design of recommender systems.

Homophily refers to the similarity between source and recipient, and marketing research has investigated it for word-of-mouth recommendations. Homophily (particularly, similarity in knowledge and preferences) is a key determinant of whether a recipient would accept a source’s advice [2], specifically in domains such as movie and book recommendations.

From a system design perspective, we could estimate similarity in preferences between various users by recording their consumption patterns and comparing these patterns. Early recommender systems adopted this CF approach [3], which has quickly become the industry standard (an example is Amazon’s recommender system). This approach works well for large user communities where sufficient information is available about each user. Recent CF research provides enhancements along various dimensions, such as automatically eliciting accurate user feedback, employing algorithms to measure users’ similarities, and improving prediction methods [4]. The main advantage of this approach is that it requires little effort from users: they might need to rate the items they’ve consumed, but they aren’t required to explicitly define their relationships to other users. Its limitation is that in cases where little information is available about users and items (referred to as a cold start), prediction accuracy suffers.
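As a rough illustration of comparing consumption patterns, the sketch below computes a Pearson correlation over the items two users have both rated. The function name, the dictionary-based profiles, and the `min_overlap` threshold are assumptions made for this example; the article does not prescribe a particular similarity measure.

```python
from math import sqrt


def preference_similarity(ratings_u, ratings_k, min_overlap=2):
    """Pearson correlation over co-rated items.

    ratings_u and ratings_k map item ids to numeric ratings.
    Returns None when too few items are shared (a cold-start case).
    """
    common = sorted(set(ratings_u) & set(ratings_k))
    if len(common) < min_overlap:
        return None

    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_k = sum(ratings_k[i] for i in common) / len(common)

    cov = sum((ratings_u[i] - mean_u) * (ratings_k[i] - mean_k) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_k = sum((ratings_k[i] - mean_k) ** 2 for i in common)

    # A user who rated every shared item identically carries no signal.
    if var_u == 0 or var_k == 0:
        return 0.0
    return cov / sqrt(var_u * var_k)


alice = {"a": 5, "b": 3, "c": 1}
bob = {"a": 4, "b": 3, "c": 2, "d": 5}
carol = {"a": 1, "b": 3, "c": 5}
newcomer = {"z": 4}
```

Returning `None` instead of a number for sparse profiles is a design choice for this sketch: it lets a caller tell "no evidence" apart from "no correlation".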

Sidebar: Commercial Social Recommender Systems

Since the introduction of collaborative filtering (CF) algorithms in the mid-1990s, social-based recommendation techniques have played a significant role in shaping consumer Web-based recommendation applications.

The first large-scale implementation of CF is attributed to Amazon.com, which launched its book recommendation application in 1995. It later extended recommendations to additional products, such as music CDs and consumer goods. Amazon has been a leader in adopting social approaches to recommendations, and it provided user reviews for its products at an early stage. Recently, Amazon upgraded its review system to incorporate user ratings of reviews and a reputation system that establishes reviewer credibility.

Netflix, a Web-based movie rental service, relies heavily on its CF system to recommend movies to users. The company has been extremely effective at matching users with movies and using these recommendations to push items on the long-tail portion of its inventory. In addition to CF, Netflix lets users define a social network of friends, allowing them to view each other’s preferences. However, this social network data isn’t incorporated into Netflix’s CF algorithm yet.

Epinions.com is a successful product recommendation site launched in 1999 to let users rate products; its CF algorithm then uses these ratings to make product recommendations. Additionally, users can associate themselves with others whose opinions they trust. Epinions then forms a “web of trust,” propagating this trust information across a network and incorporating it into its CF algorithm. Thus, Epinions is a pioneer in developing a social recommender system that incorporates two types of social relations: shared preferences and trust.

Behavioral researchers have studied tie strength (the intensity of the relationship between the recipient and source) and identified it as a key determinant in a recipient’s likelihood to take advice [5]. Tie strength has several facets, including the relationship’s duration, interaction frequency, and feeling of closeness. Empirical findings suggest that frequency and closeness can impact a recipient’s advice-taking decision [6].

In the design of recommender systems, we can easily calculate the frequency of users’ electronic communications (such as email or text messaging) by installing a tracking utility on their computers or electronic devices (with their permission). Assuming that consumption data is available for these users, communication frequency data can link users to sources, thus potentially improving prediction accuracy. Although this approach would require little effort from users, it could pose a risk to user privacy. To the best of our knowledge, no commercial application has implemented this approach yet.
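One way to turn communication frequency into a tie-strength score is sketched below. The log format (a list of sender and recipient pairs) and the scaling by the busiest pair are assumptions for illustration only; a real tracking utility would also need the consent and privacy safeguards noted above.

```python
from collections import Counter


def interaction_frequency(message_log):
    """Count messages exchanged by each unordered pair of users.

    message_log is an iterable of (sender, recipient) tuples.
    """
    counts = Counter()
    for sender, recipient in message_log:
        if sender != recipient:  # ignore notes-to-self
            counts[frozenset((sender, recipient))] += 1
    return counts


def tie_strength(counts, u, k):
    """Scale a pair's message count to [0, 1] by the busiest pair (assumed normalization)."""
    if not counts:
        return 0.0
    return counts[frozenset((u, k))] / max(counts.values())


log = [("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("bob", "ann")]
counts = interaction_frequency(log)
```

Treating pairs as unordered assumes that a message in either direction is equal evidence of a tie; a directed count would be a reasonable alternative.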

A recipient’s trust in a recommendation source is yet another important indicator of his or her likelihood of accepting a recommendation [5, 7]. The construct of trust includes both cognitive and affective dimensions, and both dimensions play an important role in advice taking [5]. Researchers have investigated the impact of trust primarily in the context of organizational advice networks.

Online social networks provide ample evidence of trust relationships. If we can harvest this relational information and incorporate it into a recommender system, we could obtain a more accurate representation of recipient–source relationships. Alternatively, instead of harvesting data from online communities, the system might ask users to explicitly define the extent to which they trust other users. The first CF system, introduced in 1992 [8], employed this approach and required users to define explicit trust relations. At that time, the explicit trust approach failed to gain acceptance. However, this approach is now gaining momentum, and recent studies [9] demonstrate its potential. The trust-based approach’s main limitation (besides requiring users to spend time explicitly defining their online relationships) is that users often have only a few links, resulting in insufficient data for improving recommendation quality. We could potentially alleviate this limitation by propagating trust across relationships: for example, if user A trusts B, and B trusts C, then we could assume that A trusts C (at least to some extent). Researchers have explored various trust propagation algorithms [10], and existing trust-based recommender systems often employ some variation of trust propagation. Again, the big drawback of this approach is the potential risk to users’ privacy.
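The A-trusts-B, B-trusts-C example can be sketched as follows. This is only one of many possible propagation schemes, not the algorithm of any cited system: it assumes trust scores in [0, 1], multiplies them along a path so that inferred trust decays with distance, keeps the strongest path, and stops after a hypothetical `max_hops` limit.

```python
def propagate_trust(trust, source, max_hops=2):
    """Infer trust from `source` to users reachable within `max_hops` links.

    trust maps each user to a dict of {trusted_user: score in [0, 1]}.
    Inferred trust along a path is the product of its link scores
    (an assumed decay rule); the strongest path found wins.
    """
    best = {}
    frontier = {source: 1.0}
    for _ in range(max_hops):
        next_frontier = {}
        for user, path_trust in frontier.items():
            for neighbor, score in trust.get(user, {}).items():
                if neighbor == source:
                    continue  # never infer trust in oneself
                candidate = path_trust * score
                if candidate > best.get(neighbor, 0.0):
                    best[neighbor] = candidate
                    next_frontier[neighbor] = candidate
        frontier = next_frontier
    return best


web_of_trust = {
    "A": {"B": 0.8},
    "B": {"C": 0.5, "A": 1.0},
    "C": {"D": 1.0},
}
```

The hop limit reflects the "at least to some extent" caveat: the further a source is from the recipient, the weaker the inferred trust, and beyond some distance it is ignored entirely.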

Finally, a source’s social capital (that is, the source’s reputation or opinion leadership) has also been shown to affect the recipient’s decision-making process. A person’s social capital represents his or her ability to influence others, and stems from that person’s structural positioning in the social network. Sociology and management research have investigated this construct [2].

In designing online recommender systems, we can use two approaches to calculate social capital (or reputation). The first is based on a system that records user ratings on others’ recommendations and accumulates these ratings to calculate recommenders’ reputation scores [10]. Online commerce sites (such as eBay) were among the first to adopt this reputation system approach, and it has now been adopted by many other sites (such as Amazon.com). The alternative approach for estimating a user’s reputation is based on the structural analysis of online social networks. This technique, referred to as social network analysis (SNA), assigns various centrality measures to each user, based on his or her position in the network. We can apply SNA in a variety of situations, including management consulting, analyzing the Web structure, and evaluating citations. To date, no commercial recommender system has capitalized on these possibilities to incorporate social capital information.
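Both reputation approaches can be illustrated with a short sketch. The first function averages the ratings that recommenders receive; the second uses in-degree centrality, chosen here only as the simplest of the centrality measures SNA offers. The data shapes and function names are assumptions for this example.

```python
def rating_based_reputation(feedback):
    """Average the ratings each recommender has received.

    feedback is an iterable of (recommender, rating) pairs.
    """
    totals = {}
    for recommender, rating in feedback:
        total, count = totals.get(recommender, (0.0, 0))
        totals[recommender] = (total + rating, count + 1)
    return {r: total / count for r, (total, count) in totals.items()}


def in_degree_centrality(trust_edges):
    """Fraction of the other users who link to each user.

    trust_edges is an iterable of (truster, trusted) pairs.
    """
    users = set()
    incoming = {}
    for truster, trusted in trust_edges:
        users.update((truster, trusted))
        if truster != trusted:  # self-links do not count as endorsements
            incoming.setdefault(trusted, set()).add(truster)
    if len(users) < 2:
        return {u: 0.0 for u in users}
    return {u: len(incoming.get(u, ())) / (len(users) - 1) for u in users}


feedback = [("ann", 4), ("ann", 2), ("bob", 5)]
edges = [("ann", "cat"), ("bob", "cat"), ("cat", "ann")]
```

The rating-based score needs explicit feedback from users, while the centrality score can be computed from an existing network without asking anyone to rate anything.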

Table 1. Key social recommendation research studies.

| Social dimensions | Implementation approach | Task domain |
|---|---|---|
| Trust [8] | Established a social trust network. | N/A |
| Shared preferences [3] | Original collaborative filtering (CF) work. | N/A |
| Shared preferences, trust (local trust), and reputation (global trust) [10] | In addition to standard CF, used trust from Epinions’ web-of-trust data; propagation of trust; reputation based on a user’s average trust scores. | Product recommendations |
| Shared preferences and trust [11] | In addition to standard CF, used trust (which is extracted automatically based on the accuracy of the user’s past predictions); used MovieLens data. | Movie recommendations |
| Shared preferences and trust [9] | In addition to standard CF, established a social trust network; propagation of trust. | Movie recommendations |


Table 1 summarizes some relevant research projects that explore the use of social approaches to design recommender systems.

Research on social recommendation systems is in its early phases, and most current attempts to incorporate relationship information into recommender systems employ only a subset of the available indicators. Furthermore, it seems that the design choices in these works are somewhat ad hoc and are often not informed by current knowledge and theories of human behavior.

Our Proposed Framework

We propose a social recommendation framework that borrows from advice-taking theory by integrating the aforementioned relationship indicators between users and recommendation sources (homophily, tie strength, trust, and reputation). A social recommender system based on this framework would employ various mechanisms for capturing relationship information:

• track user consumption patterns, construct user profiles, and compare profiles (to detect homophily, as in CF systems);
• establish social networks and propagate links to form indirect links (to establish users’ trust in each other);
• record user communication patterns and interaction frequency (as evidence for tie strength); and
• establish reputation mechanisms based on either ratings of recommendations or on analysis of the social network’s structure.

Figure 1 presents a conceptual design of a recommender system based on our proposed framework.

As Figure 1 shows, once the system records the various relationship indicators, the system source qualification component calculates a weighted average of the indicators to arrive at a single qualification score for each source. We expect that the task domain (that is, leisure versus work-related tasks) will affect the relative importance (weights) of the various source qualification indicators. For example, based on results from behavioral studies, we expect that for movie recommendation tasks, users will deem shared preferences as more important than interaction frequency. Next, the system prediction component takes sources’ qualifications and their history of ratings as input to predict an item’s relevancy to the recipient and produces a recommendation. We present an algorithm for a possible implementation of this framework in the sidebar “Algorithm for Implementing Our Framework.”

[Figure 1. Conceptual recommender system design based on our proposed framework. Rectangles represent input (red) or output (blue) information, trimmed rectangles (orange) represent system processes, and the green rectangle is the final output. Labeled inputs: receiver’s and source’s consumption history, social network, ratings of recommendation, online communications. Labeled processes: calculate profile similarity, trust propagation, reputation mechanisms, social network analysis, calculate interaction frequency, system’s source qualification component, system’s prediction component. Labeled outputs: shared preferences, trust, source’s reputation, tie strength, source’s qualifications, system’s prediction (recommendation).]

Using Social Relationship Data to Alleviate the Cold-Start Problem

Because research on social recommender systems is still in its infancy, both industry and academia have experiments in progress on how to incorporate various indicators of social relationships into recommender systems. We grounded our proposed framework on behavioral theory, utilizing a series of relationship indicators that we can extract in online settings.

We expect this framework to provide accuracy enhancements beyond traditional CF, especially in cold-start situations. This problem is critical in commercial recommender systems [4, 12] because in the early phases of CF system deployment, relatively little information on user tastes is available, making it difficult to provide accurate recommendations [13]. For example, two of the most popular commercial CF applications, GroupLens and Epinions, suffer from the cold-start problem [10, 13].

Advice-taking literature suggests that relationship indicators such as trust and tie strength are highly correlated with homophily. It makes sense, then, that data extracted from a social network could serve as a proxy for preference similarities in cold-start situations and ensure that the system associates a recipient with appropriate sources.
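The proxy idea might look like the sketch below: use the CF similarity when enough co-rated items back it up, and otherwise fall back to the social indicators. The threshold `min_co_rated` and the choice of a plain mean over the social scores are assumptions for illustration, not values or rules taken from the article.

```python
def cold_start_similarity(homophily, social_scores, n_co_rated, min_co_rated=5):
    """Pick a similarity estimate for a recipient-source pair.

    homophily     -- CF similarity, or None if it could not be computed
    social_scores -- list of social indicators in [0, 1] (e.g. trust, tie strength)
    n_co_rated    -- number of items both users have rated
    """
    if homophily is not None and n_co_rated >= min_co_rated:
        return homophily  # enough consumption data: trust the CF estimate
    if not social_scores:
        return 0.0  # nothing known about this pair
    # Cold start: let the social indicators stand in for preference similarity.
    return sum(social_scores) / len(social_scores)
```

A hard switch keeps the sketch simple; blending the two estimates in proportion to the amount of consumption data would be a natural refinement.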

Effort and Privacy

Our proposed framework is somewhat generic in the sense that it includes all available relationship indicators. However, any implementation of this framework is likely to use a subset of indicators. We can choose which indicators to use based on the domain in which we deploy the system and based on efficiency considerations. The impact of the various indicators on system efficiency is independent of task domain. Efficiency depends on three key factors:

• effort required by users,
• effort required by system administrators, and
• privacy concerns.

Table 2 summarizes these considerations for the various relationship indicators.

The effort required from users plays a large role in determining system adoption. To keep user effort down to a minimum, the system can calculate shared preferences based on users’ consumption records. It can also capture and calculate communication frequency automatically. Establishing a social network (whether to calculate trust or indicate reputation) requires users to invite and accept invitations from other users, whereas a reputation system requires them to rate the quality of the recommendations they received.

The effort required from system administrators, too, might play a part in decisions about which relationship indicators to use. Calculating shared preferences requires recording user profiles and matching them. Although calculating direct trust relationships from a given social network is straightforward, propagating trust to indirect relationships requires additional calculations. We can calculate reputation scores from a social network using SNA, but implementing a reputation mechanism requires setting up technical and social controls to combat fraud and assure normative user behavior.

Sidebar: Algorithm for Implementing Our Framework

We calculate the qualification of source k for user u, $Q_{u,k}$, as a weighted average of various indicators. A simple formula is

$$Q_{u,k} = W_H \times H_{u,k} + W_T \times T_{u,k} + W_{TS} \times TS_{u,k} + W_R \times R_{u,k},$$

where $H_{u,k}$ is the homophily (shared preferences) score for users u and k, $T_{u,k}$ is the trust score, $TS_{u,k}$ is the tie strength (interaction frequency) score, and $R_{u,k}$ is the reputation score. W represents the relative weight assigned to each indicator: $W_H$ for homophily, $W_T$ for trust, $W_{TS}$ for tie strength, and $W_R$ for reputation.

Alternative formulas, such as harmonic mean, are also possible. The system prediction component’s output is a prediction of item relevancy to users. The system computes it as an aggregation of the recommendations of the n most qualified sources, where the effect of each of n sources on the final recommendation is relative to their qualification. The recommendation function of an item i to a user u could be based on various algorithms, the gold standard [1] in CF systems being

$$P_{u,i} = \bar{r}_u + \frac{\sum_{k=1}^{n} Q_{u,k} \times (r_{k,i} - \bar{r}_k)}{\sum_{k=1}^{n} Q_{u,k}},$$

where $P_{u,i}$ is the prediction score of item i to user u, $\bar{r}_u$ is the average of all past ratings provided by user u, $r_{k,i}$ is the rating assigned to item i by user k, and $\bar{r}_k$ is the average of all past ratings provided by user k.

Reference
1. J. Herlocker et al., “Evaluating Collaborative Filtering Recommender Systems,” ACM Trans. Information Systems, vol. 22, no. 1, 2004, pp. 5–53.
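The sidebar’s two steps, source qualification and prediction, can be sketched in a few lines. The weights below are placeholders rather than values from the article, and the prediction function assumes the common mean-centered CF form in which each source’s deviation from its own average rating is weighted by its qualification score; treat that form, and every name in the code, as an assumption of this sketch.

```python
def qualification(h, t, ts, r, weights=(0.4, 0.3, 0.1, 0.2)):
    """Q_{u,k}: weighted combination of homophily, trust, tie strength, reputation.

    The default weights are hypothetical; the article expects them to
    vary by task domain.
    """
    w_h, w_t, w_ts, w_r = weights
    return w_h * h + w_t * t + w_ts * ts + w_r * r


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def predict(user_ratings, sources, item, n=10):
    """Predict user u's rating of `item` from the n most qualified sources.

    user_ratings -- {item: rating} for user u (assumed non-empty)
    sources      -- list of (Q_uk, {item: rating}) pairs, one per source k
    """
    r_u = mean(user_ratings.values())
    # Only sources that rated the item and have a positive qualification count.
    rated = [(q, ratings) for q, ratings in sources if item in ratings and q > 0]
    top = sorted(rated, key=lambda s: s[0], reverse=True)[:n]
    denom = sum(q for q, _ in top)
    if denom == 0:
        return r_u  # no usable source: fall back to the user's own average
    num = sum(q * (ratings[item] - mean(ratings.values())) for q, ratings in top)
    return r_u + num / denom


user = {"x": 4, "y": 2}
sources = [
    (0.75, {"m": 5, "x": 3}),  # rates "m" one point above own average
    (0.25, {"m": 1, "y": 3}),  # rates "m" one point below own average
]
```

Mean-centering each source’s rating compensates for users who rate systematically high or low, so only the direction and size of a source’s deviation influences the prediction.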

Privacy is a major issue for both users and system administrators. Users are reluctant to provide personal details for fear of misuse, and system administrators are concerned about the legal issues associated with protecting user privacy. Calculating shared preferences requires tracking consumption data and gathering data about consumed item ratings (whenever collected). Tracking communication frequency, as well as collecting social network data, might pose a larger threat to privacy because users might consider their social relations with others to be confidential information. However, the information that reputation systems use (ratings of recommendations and reputation scores) is often considered public knowledge.

The analysis we mention here highlights the advantages of the shared-preferences approach in light of user effort and privacy concerns. Nevertheless, the use of additional indicators for social relationships has potential benefits. First, incorporating additional information sources will tackle the cold-start problem and increase prediction reliability. Second, even in cases where shared preference scores are reliable, we need to incorporate additional indicators of social relationships because behavioral theory suggests that shared preferences are just one of several factors that determine a recipient’s likelihood of accepting advice. Moreover, extracting relationship indicators might not require much effort from users, especially if we can harvest this information from existing online social networks.

For more than a decade now, the ad hoc standard in recommendation systems has been based on users’ shared preferences. Recent advances in academia and industry suggest that we can employ alternative sources of relationship information to enhance recommender system performance. By considering these different approaches and grounding our analysis in behavioral theory, we propose a conceptual design for a social recommender system that has the potential to alleviate the cold-start problem and improve recommendation accuracy. We hope that others will investigate similar approaches to employ social relationship information in the design of recommender systems. Notwithstanding the potential benefits, our approach has some limitations associated with administration costs, usability, and user privacy. In implementing social recommender systems and choosing which types of relationship indicators to employ, system designers should consider the risks associated with each indicator.

Table 2. Effort and privacy considerations for extracting relationship indicators.

| Evidence | User effort | System administration effort | Privacy concerns |
|---|---|---|---|
| Shared preferences | Low (if based on purchase history) or medium (when ratings of items are required) | Low (existing CF available) | Low (only rating of items) |
| Communication frequency | Low (automatic) | Low (monitoring electronic communication) | Medium (social relations) |
| Social network: direct relations | High (establishing a social network) | Low (social network) | Medium (social relations) |
| Social network: indirect relations | High (establishing a social network) | Medium (social network and trust propagation) | Medium (social relations) |
| Social network: social network analysis (SNA) | High (establishing a social network) | Medium (social network and SNA calculations) | Medium (social relations) |
| Reputation system | Medium (rating of others’ recommendations) | High (reputation mechanism and fraud control) | Low (rating of others’ recommendations) |

References

1. O. Arazy and C. Woo, “Enhancing Information Retrieval through Statistical Natural Language Processing: A Study of Collocation Indexing,” Management Information Systems Quarterly, vol. 31, no. 3, 2007, pp. 525–546.
2. M. Gilly et al., “A Dyadic Study of Interpersonal Information Search,” J. Academy of Marketing Science, vol. 26, no. 2, 1998, pp. 83–100.
3. U. Shardanand and P. Maes, “Social Information Filtering: Algorithms for Automating Word of Mouth,” Proc. Conf. Human Factors in Computing Systems, ACM Press, 1995, pp. 210–217.
4. J. Herlocker et al., “Evaluating Collaborative Filtering Recommender Systems,” ACM Trans. Information Systems, vol. 22, no. 1, 2004, pp. 5–53.
5. D. Levin and R. Cross, “The Strength of Weak Ties You Can Trust: The Mediating Role of Trust in Effective Knowledge Transfer,” Management Science, vol. 40, no. 11, 2004, pp. 1477–1490.
6. P. Marsden and K. Campbell, “Measuring Tie Strength,” Social Forces, vol. 63, no. 2, 1984, pp. 482–501.
7. D. Smith, S. Menon, and K. Sivakumar, “Online Peer and Editorial Recommendations, Trust, and Choice in Virtual Markets,” J. Interactive Marketing, vol. 19, no. 3, 2005, pp. 15–37.
8. D. Goldberg et al., “Using Collaborative Filtering to Weave an Information Tapestry,” Comm. ACM, vol. 35, no. 12, 1992, pp. 61–70.
9. J. Golbeck and J. Hendler, “Filmtrust: Movie Recommendations Using Trust in Web-Based Social Networks,” Proc. Consumer Comm. and Networking Conf., IEEE CS Press, 2006, pp. 282–286.
10. P. Massa and P. Avesani, “Trust-Aware Collaborative Filtering for Recommender Systems,” LNCS, vol. 3290, Springer, 2004, pp. 492–508.
11. J. O’Donovan and B. Smyth, “Trust in Recommender Systems,” Proc. 10th Int’l Conf. Intelligent User Interfaces, ACM Press, 2005, pp. 167–174.
12. K. Goldberg et al., “Eigentaste: A Constant Time Collaborative Filtering Algorithm,” Information Retrieval, vol. 4, no. 2, 2001, pp. 133–151.
13. D.A. Maltz and K. Ehrlich, “Pointing the Way: Active Collaborative Filtering,” Proc. Computer–Human Interaction, ACM Press, 1995, pp. 202–209.

Ofer Arazy is an assistant professor in the School of Business at the University of Alberta. His research interests are in knowledge management and social computing. Arazy has a PhD in management information systems from the University of British Columbia. Contact him at ofer. [email protected].

Nanda Kumar is an associate professor in the computer information systems department at Baruch College, City University of New York. His research interests include human–computer interaction, digital government, and the impact of IT on the organization of work and leisure. Kumar has a PhD in management information systems from the University of British Columbia. Contact him at [email protected].

Bracha Shapira is a project manager in the Deutsche Telekom Laboratories at Ben-Gurion University, where she leads a project that deals with personalized content on mobile devices. She’s also a senior lecturer in the Department of Information Systems Engineering at Ben-Gurion, where she leads the Information Retrieval Laboratory. Shapira’s research interests include information retrieval and filtering, especially for user modeling, profiling, and personalization. She has a PhD in information systems from Ben-Gurion University. Contact her at [email protected].


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.