a multi-purpose ontology-based approach

Upload: tran-van-long

Post on 05-Apr-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/31/2019 A Multi-Purpose Ontology-Based Approach

    1/6

    A Multi-Purpose Ontology-Based Approach

    for Personalized Content Filtering and Retrieval

    David Vallet, Ivn Cantador, Miriam Fernndez, and Pablo Castells

    Universidad Autnoma de Madrid, Escuela Politcnica Superior

    {david.vallet,ivan.cantador,miriam.fernandez,pablo.castells}@uam.es

    Abstract

    Personalized multimedia access aims at enhancing

    the retrieval process by complementing explicit user

    requests with implicit user preferences. We propose

    and discuss the benefits of the introduction of ontolo-

    gies for an enhanced representation of the relevantknowledge about the user, the context, and the domain

    of discourse, as a means to enable improvements in the

    retrieval process and the performance of adaptive ca-

    pabilities. We develop our proposal by describing

    techniques in several areas that exemplify the explota-

    tion of the richness and power of formal and explicit

    semantics descriptions, and the improvements therein.

    1. Introduction

    Personalized multimedia access aims at enhancing

    the retrieval process by complementing explicit user

    requests with implicit user preferences, to better meetindividual user needs [5]. Automatic user modeling

    and personalization has been a thriving area of re-

    search for nearly two decades, gaining significant pres-

    ence in commercial applications around the mid-90s.

    Personalizing a content retrieval system involves

    considerable complexity, mainly because finding im-

    plicit evidence of user needs and interests through their

    behavior is not an easy task. This difficulty is often

    considerably increased by an imprecise and vague rep-

    resentation of the semantics involved in user actions

    and system responses, which makes it even more diffi-

    cult to properly pair user interests and content descrip-

    tions. The ambiguity of terms used in this representa-tion, the unclear relationships between them, their het-

    erogeneity, especially in current ever-growing large-

    scale networked environments such as the WWW,

    often constitute a major obstacle for achieving an accu-

    rate personalization, e.g. when comparing user prefer-

    ences to content items, or users among themselves.

    In this paper we argue for the introduction of on-

    tologies [8] as an enhanced representation of the rele-

    vant knowledge about the domain of discourse, about

    users, about contextual conditions, involved in the re-

    trieval process, as a means to enable significant im-

    provements in the performance of adaptive content

    retrieval services. We exemplify our point by describ-ing the development of advanced features and en-

    hancements in specific areas related to personalization

    where the ontology-based approach shows its benefit,

    including:

    Basic personalized content search and browsingbased on user preferences.

    Dynamic contextualization of user preferences.

    Dynamic augmented social networking and collabo-rative filtering.

    Domain ontologies and rich knowledge bases play a

    key role in the models and techniques that we propose

    in the above areas, as will be described in the sequel.

    The approaches presented in this paper share and ex-ploit a common representation framework, thus obtain-

    ing multiple benefits from a shared single ontology-

    rooted grounding. Furthermore, it will be shown that

    modular semantic processing strategies, such as infer-

    ence, graph processing, or clustering, over networked

    ontology concepts, may be reused and combined to

    serve multiple purposes.

    The rest of the paper is organized as follows. Sec-

    tion 2 introduces the basic approach for the ontology-

    oriented representation of semantic user preferences,

    and its application to personalized content search and

    retrieval. Following this, section 3 describes an ap-

    proach for the dynamic contextualization of semanticuser preferences, and section 4 shows the extension of

    the techniques described in previous sections to multi-

    user environments, based on collaborative personaliza-

    tion strategies. Finally, some conclusions are given in

    section 5.

    First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)0-7695-2692-6/06 $20.00 2006

    Authorized licensed use limited to: IEEE Xplore. Downloaded on November 1, 2008 at 07:41 from IEEE Xplore. Restrictions apply.

  • 7/31/2019 A Multi-Purpose Ontology-Based Approach

    2/6

    2. Ontology-based personalization for con-

    tent retrieval

    Most personalized retrieval techniques (e.g. col-

    laborative filtering) keep and process long records of

    accessed documents by each user, in order to infer

    potential preferences for new documents (e.g. by find-ing similarities between documents, or between users).

    The data handled by these techniques have been rather

    low-level and simple: document IDs, text keywords

    and topic categories at most [9], [12]. The recent pro-

    posals and achievements towards the enrichment of

    multimedia content by formal, ontology-based, seman-

    tic descriptions open new opportunities for improve-

    ment in the personalization field from a new, richer

    representational level [2], [5]. We see the introduction

    of ontology-based technology in the area of personal-

    ization as a promising research direction [7]. Ontolo-

    gies enable the formalization of user preferences in a

    common underlying, interoperable representation,whereby user interests can be matched to content

    meaning at a higher level, suitable for conceptual rea-

    soning.

    An ontology-based representation is richer, more

    precise, less ambiguous than a keyword-based model.

    It provides an adequate grounding for the representa-

    tion of coarse to fine-grained user interests (e.g. inter-

    est for individual items such as a sports team, an actor,

    a stock value) in a hierarchical way, and can be a key

    enabler to deal with the subtleties of user preferences.

    An ontology provides further formal, computer-

    processable meaning on the concepts (who is coaching

    a team, an actors filmography, financial data on astock), and makes it available for the personalization

    system to take advantage of. Moreover, an ontology-

    rooted vocabulary can be agreed and shared (or

    mapped) between different systems, or different mod-

    ules of the same system, and therefore user prefer-

    ences, represented this way, can be more easily shared

    by different players. For instance, a personalization

    framework may share a domain ontology with a

    knowledge-based content analysis tool that extracts

    semantic metadata from a/v content, conforming to the

    ontology [2]. On this basis, it is easier to build algo-

    rithms that match preference to content, through the

    common domain ontology.In an ontology-based approach, semantic user pref-

    erences may be represented as a vector of weights

    (numbers from 0 to 1), representing the intensity of the

    user interest for each concept in a domain ontology [5].

    If a content analysis tool identifies, for instance, a cat

    in a picture, and the user is known to like cats, the per-

    sonalization module can find how the user may like the

    picture by comparing the metadata of the picture, and

    the preferred concepts in the user profile. Based on

    preference weights, measures of user interest for con-

    tent units can be computed, with which it is possible to

    discriminate, prioritize, filter and rank contents (a col-

    lection, a catalog section, a search result) in a personal

    way.Furthermore, ontology standards backed by interna-

    tional consortiums (such as the W3C), and the corre-

    sponding available processing tools, support inference

    mechanisms that can be used to further enhance per-

    sonalization, so that, for instance, a user interested in

    animals (superclass of cat) is also recommended pic-

    tures of cats. Inversely, a user interested in lizards,

    snakes, and chameleons can be inferred to be inter-

    ested in reptiles with a certain confidence. Also, a user

    keen of Sicily can be assumed to like Palermo, through

    the transitive locatedIn relation. In fact, it is even pos-

    sible to express complex preferences based on generic

    conditions, such as athletes that have won a goldmedal in the Olympic Games.

    3. Contextual personalization

    The shallowest consideration is sufficient to notice

    that human preferences are complex, variable and het-

    erogeneous, and that not all preferences are relevant in

    every situation [13]. For instance, if a user is consis-

    tently looking for some contents in the Formula 1 do-

    main, it would not make much sense that the system

    prioritizes some Formula 1 picture with a helicopter in

    the background just because the user happens to have a

    general interest for aircrafts. In other words, in thecontext ofFormula 1, aircrafts are out of (or at least far

    from) context. Context is a difficult notion to grasp and

    capture in a software system, and the elements than

    can, and have been considered in the literature under

    the notion of context are manifold: user tasks and

    goals, computing platform, network conditions, social

    environment, physical environment, location, time,

    noise, external events, text around a word, visual con-

    text of a graphic region, to mention a few.

    Complementarily to the ones mentioned, we pro-

    pose a particular notion, for its tractability and useful-

    ness in semantic content retrieval: that ofsemantic

    runtime context, which we define as the backgroundthemes under which user activities occur within a

    given unit of time. Using this notion, a finer, qualita-

    tive, context-sensitive activation of user preferences

    can be defined. Instead of a uniform level of personal-

    ization, user interests related to the context are priori-

    tized, discarding the preferences that are out of focus.

    The problems to be addressed include how to represent

    First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)0-7695-2692-6/06 $20.00 2006

    Authorized licensed use limited to: IEEE Xplore. Downloaded on November 1, 2008 at 07:41 from IEEE Xplore. Restrictions apply.

  • 7/31/2019 A Multi-Purpose Ontology-Based Approach

    3/6

    such context and determine it at runtime, and how the

    activation of user preferences should be related to it,

    predicting the drift of user interests over time. Our

    approach is based on a concept-oriented context repre-

    sentation, and the definition of distance measures be-

    tween context and preferences as the basis for the dy-

    namic selection of relevant preferences [13].A runtime context is represented (is approximated)

    in our approach as a set of weighted concepts from the

    domain ontology. This set is obtained by collecting the

    concepts that have been involved, directly or indi-

    rectly, in the interaction of the user (e.g. issued queries

    and accessed items) with the system during a retrieval

    session. The context is built in such a way that the im-

    portance of concepts fades away with time (number of

    user requests back when the concept was referenced)

    by a decay factor.

    Once a context is thus built, the contextual activa-

    tion of preferences achieved by a computation of the

    semantic similarity between each user preference andthe set of concepts in the context. In spirit, the ap-

    proach consists of finding semantic paths linking pref-

    erences to context, where the considered paths are

    made of existing semantic relations between concepts

    in the domain ontology. The shorter, stronger, and

    more numerous such connecting paths, the more in

    contexta preference is considered. The proposed tech-

    niques to find these paths use a form of Constraint

    Spreading Activation (CSA) strategy [6]. In fact, in the

    proposed approach a semantic expansion of both user

    preferences and the context takes place, during which

    the involved concepts are assigned preference weights

    and contextual weights, which decay as the expansion

    grows farther from the initial sets. This process can

    also be understood as finding a sort of fuzzy semantic

    intersection between user preferences and the semantic

    runtime context, where the final computed weight of

    each concepts represents the degree to which it belongs

    to each set (see Figure 1).

    Initial user preferences

    Extended user preferences

    Initial runtime context

    Extended context

    Domain

    concepts

    Semantic

    user preferences

    Semantic

    runtime context

    Contextualised

    user preferences

    Contextualised

    user preferences

    Figure 1. Contextual activation of semantic user pref-erences.

    Finally, the perceived effect of contextualization is

    that user interests that are out of focus, under a given

    context, are disregarded, and only those that are in the

    semantic scope of the ongoing user activity (the inter-

    section of user preferences and runtime context) are

    considered for personalization. The inclusion or exclu-

    sion of preferences is in fact not binary, but ranges ona continuum scale, where the contextual weight of a

    preference decreases monotonically with the semantic

    distance between the preference and the context.

    Contextualized preferences can be understood as an

    improved, more precise, dynamic, and reliable

    representation of user preferences, and as such they

    can be used directly for the personalized ranking of

    content items and search results, as described in the

    previous section, or they can be input to any system

    that exploits this information in other ways, such as the

    one described in the next section.

    4. Augmented social networking and col-laborative filtering

    When the system perspective is widened to take in

    contextual aspects of the user, it is often relevant to

    consider that in most cases the user does not work in

    isolation. Indeed, the proliferation of virtual communi-

    ties, computer-supported social networks, and collec-

    tive interaction (e.g. several users in front of a Set-

    Top-Box), call for further research on group modeling,

    opening new problems and complexities. A variety of

    group-based personalization functionalities can be en-

    abled by combining, comparing, or merging prefer-

    ences from different users, where the expressive powerand inference capabilities supported by ontology-based

    technologies can act as a fundamental piece towards

    higher levels of abstraction [3], [4].

    4.1. Semantic group profiling

    Group profiling can be understood under the ex-

    plicit presence of a-priori given user groups, or as an

    activity that involves the automatic detection of im-

    plicit links between users by the system, in order to put

    users in contact with each other, or to help them bene-

    fit from each others experience. In the first view, col-

    laborative applications may be required to adapt togroups of people who interact with the system. These

    groups may be quite heterogeneous, e.g. age, gender,

    intelligence and personality influence on the percep-

    tion and demands on system outputs that each member

    of the groups may have. The question that arises is

    how can the system adapt itself to the group in such a

    way that each individual benefits from the results.

    First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)0-7695-2692-6/06 $20.00 2006

    Authorized licensed use limited to: IEEE Xplore. Downloaded on November 1, 2008 at 07:41 from IEEE Xplore. Restrictions apply.

  • 7/31/2019 A Multi-Purpose Ontology-Based Approach

    4/6

    Figure 2. Group profiling by aggregation of individualuser profiles.

    We have explored the combination of the ontology-

    based profiles defined in section 2 to meet this pur-

    pose, on a per concept basis, following differentstrategies from social choice theory [11] for combining

    multiple individual preferences. In our approach, user

    profiles are merged to form a shared group profile, so

    that common content recommendations are generated

    according to this new profile (see Figure 2). Our pre-

    liminary experiments have shown that improved re-

    sults can be obtained from the accuracy and expressiv-

    ity of the ontology-based representation as proposed in

    this approach [3].

    4.2. Semantic social networking

    Even when explicit groups are not defined, usersmay take advantage of the experience of other users

    with common interests, without needing to know each

    other. The issue of finding hidden links between users

    based on the similarity of their preferences or historic

    behavior is not a new idea. In fact, this is the essence

    of the well-known collaborative recommender systems

    [1], where items are recommended to a certain user

    concerning those of her interests shared with other

    users or according to opinions, comparatives, and rat-

    ings of items given by similar users.

    However, in typical approaches, the comparison be-

    tween users and items is done globally, in such a way

    that partial, but strong and useful similarities may be

    missed. For instance, two people may have a highly

    coincident taste in cinema, but a very divergent one insports. The opinions of these people on movies could

    be highly valuable for each other, but risk to be ig-

    nored by many collaborative recommender systems,

    because the global similarity between the users might

    be low.

    In recommendation environments there is an under-

    lying need to distinguish different layers within the

    interests and preferences of the users. Depending on

    the current context, only a specific subset of the seg-

    ments (layers) of a user profile should be considered in

    order to establish her similarities with other people

    when a recommendation has to be performed. Models

    of social networks partitioned into different commonsemantic layers can achieve more accurate and con-

    text-sensitive results.

    The definition and generation of such models can

    be facilitated by a more accurate semantic description

    of user preferences, as supported by ontologies. A

    multi-layered approach to social networking can be

    developed by dividing user profiles into clusters of

    cohesive interests, so that several layers of social net-

    works are found. This provides a richer model of in-

    terpersonal links, which better represents the way peo-

    ple find common interests in real life.

    Taking advantage of the relations between con-

    cepts, and the (weighted) preferences of users for the

    concepts, we have defined a system that clusters the

    semantic space based on the correlation of concepts

    appearing in the preferences of individual users [4].

    After this, user profiles are partitioned by projecting

    the concept clusters into the set of preferences of each

    user (see Figure 3). Then, users can be compared on

    the basis of the resulting subsets of interests, in such a

    way that several, rather than just one, (weighted) links

    can be found between two users.

    Figure 3. Multilayer generation of social links between users: 1) the initial sets of individual interests are expanded, 2)domain concepts are clustered based on the vector space of user preferences, and 3) users are clustered in order toidentify the closest class to each user.

    First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)0-7695-2692-6/06 $20.00 2006

    Authorized licensed use limited to: IEEE Xplore. Downloaded on November 1, 2008 at 07:41 from IEEE Xplore. Restrictions apply.

  • 7/31/2019 A Multi-Purpose Ontology-Based Approach

    5/6

    Multilayered social networks are potentially useful

    for many purposes. For instance, users may share pref-

    erences, items, knowledge, and benefit from each

    others experience in focused or specialized conceptual

    areas, even if they have very different profiles as a

    whole. Such semantic subareas need not be defined

    manually, as they emerge automatically with our pro-posed method. Users may be recommended items or

    direct contacts with other users for different aspects of

    day-to-day life.

    In addition to these possibilities, our two-way space

    clustering, which finds clusters of users based on the

    clusters of concepts found in a first pass, offers a rein-

    forced partition of the user space that can be exploited

    to build group profiles for sets of related users. These

    group profiles enable efficient strategies for collabora-

    tive recommendation in real-time, by using the merged

    profiles as representatives of classes of users.

    The degree of membership of the obtained subpro-

    files to the clusters, and the similarities among them,can be used to define social links to be exploited by

    collaborative filtering systems. We report early ex-

    periments with real subjects in [4], where the emergent

    social networks are applied to a variety of collabora-

    tive filtering models, showing the feasibility of the

    clustering strategy.

    4.3. Semantic profile expansion for collabora-

    tive group profiling

    In real scenarios, user profiles tend to be very scat-

    tered, especially in those applications where user profiles

    have to be manually defined. Users are usually not will-ing to spend time describing their detailed preferences to

    the system, even less to assign weights to them, espe-

    cially if they do not have a clear understanding of the

    effects and results of this input. On the other hand, ap-

    plications where an automatic preference learning algo-

    rithm is applied tend to recognize the main characteris-

    tics of user preferences, thus yielding profiles that may

    entail a lack of expressivity. To overcome this problem,

    the semantic preference spreading mechanism described

    in section 3 has proved highly useful for improving our

    group profiling techniques as well.

    Previous experiments without the semantic spreading

    feature showed considerably poorer results. The profileswere very simple and the matching between the prefer-

    ences of different users was low. Typically, the basic

    user profiles provide a good representative sample of

    user preferences, but do not reflect the real extent of user

    interests, which results in low overlaps between the

    preferences of different users. Therefore, the extension is

    not only important for the performance of individual

    personalization, but is essential for the clustering strat-

    egy described in the previous subsection.

    In very open collaborative environments, it is also

    the case that not only direct evidence of user interests

    needs to be properly completed in their semantic con-

    text, but that they are not directly comparable with the

    input from other users in its initial form. If the envi-ronment is very heterogeneous, the potential disparity

    of vocabularies and syntax used by different users or

    subsystems pose an additional barrier for collaborative

    techniques. One of the major purposes for which on-

    tologies are conceived is that of reflecting or achieving

    a consensus between different parties in a common

    knowledge space [8]. Therefore, they provide special-

    purpose facilities to ensure the required interoperabil-

    ity between semantic user spaces, and match descrip-

    tions that are syntactically different but semantically

    related.

    5. Conclusion

    In this paper we propose the introduction of ontolo-

    gies as a key tool for moving beyond current state of

    the art in the area of personalization. We show ways in

    which ontology-driven representations can be used to

    improve the effectiveness of different personalization

    techniques, and we describe a set of functionalities

    where ontologies are essential to achieve qualitative

    enhancements. In our approach, ontologies are used to

    model the domain of discourse in terms of which user

    interests, content meaning, retrieval context, and social

    relationships, can be described and analyzed.

    The advantages of ontology-driven representations(expressiveness and precision, formal properties, infer-

    ence capabilities, interoperability) enable further de-

    velopments that exploit such capabilities, beyond the

    ones proposed here, on top of the basic personalization

    framework described in this paper.

    A trade-off of our proposals is the cost and diffi-

    culty of building well-defined ontologies and populat-

    ing large-scale knowledge bases, which is not ad-

    dressed in this paper. Recent research on these areas is

    yielding promising results [10], in a way that any ad-

    vancement on these problems can be played to the

    benefit of our proposed achievements.

    6. Acknowledgements

    The research leading to this document has received

    funding from the European Communitys Sixth Frame-

    work (FP6-027685 MESH), and the Spanish

    Ministry of Science and Education (TIN2005-06885).

    However, it reflects only the authors views, and the

    First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)0-7695-2692-6/06 $20.00 2006

    Authorized licensed use limited to: IEEE Xplore. Downloaded on November 1, 2008 at 07:41 from IEEE Xplore. Restrictions apply.

  • 7/31/2019 A Multi-Purpose Ontology-Based Approach

    6/6

    European Community is not liable for any use that may

    be made of the information contained therein.

    7. References

    [1] M. Balabanovic and Y. Shoham, Content-Based Col-

    laborative Recommendation, Communications of theACM40(3), March 1997, pp. 66-72.

    [2] S. Bloehdorn, K. Petridis, C. Saathoff, N. Simou, V.Tzouvaras, Y. Avrithis, S. Handschuh, Y. Kompatsiaris,

    S. Staab, M. G. Strintzis, Semantic Annotation of Im-

    ages and Videos for Multimedia, 2nd European Seman-

    tic Web Conference (ESWC 2005). Springer Verlag Lec-

    ture Notes in Computer Science, Vol. 3532, 2005.

    [3] I. Cantador, P. Castells, and D. Vallet, EnrichingGroup Profiles with Ontologies for Knowledge-Driven

    Collaborative Content Retrieval, 1st International

    Workshop on Semantic Technologies in Collaborative

    Applications (STICA 2006), at the 15th IEEE Interna-

    tional Workshops on Enabling Technologies (WETICE2006), Manchester, UK, June 2006.

    [4] I. Cantador and P. Castells, Multilayered SemanticSocial Network Modeling by Ontology-Based User Pro-

    files Clustering: Application to Collaborative Filtering,

    15th International Conference on Knowledge Engineer-

    ing and Knowledge Management (EKAW 2006), Po-

    debrady, Czech Republic, October 2006, Springer Ver-

    lag Lecture Notes in Computer Science, to appear.

    [5] P. Castells, M. Fernndez, D. Vallet, P. Mylonas, andY. Avrithis, Self-Tuning Personalized Information Re-

    trieval in an Ontology-Based Framework, 1st Interna-

    tional Workshop on Web Semantics (SWWS 2005), AgiaNapa, Cyprus, July 2005. Springer Verlag Lecture

    Notes in Computer Science, Vol. 3762, pp. 977-986.

    [6] F. Crestani, Application of Spreading ActivationTechniques in Information Retrieval, Artificial Intelli-

    gence Review 11, 1997, pp. 453-482.

    [7] S. Gauch, J. Chaffee, and A. Pretschner, Ontology-based personalized search and browsing, WIAS Journal

    1(3-4), 2003, pp. 219-234.

    [8] T. R. Gruber, A Translation Approach to Portable On-tology Specification, Knowledge Acquisition 5, 1993,

    pp. 199-220.

    [9] G. Jeh, J. Widom, Scaling Personalized Web Search,International World Wide Web Conference (WWW

    2003), Budapest, Hungary, 2003, pp. 271-279.

    [10]A. Kiryakov, B. Popov, I. Terziev, D. Manov, D. Ogn-yanoff, Semantic Annotation, Indexing, and Retrieval,

    Journal of Web Sematics 2(1), Elsevier, 2004, pp. 47-

    49.

    [11]J. Masthoff, Group Modeling: Selecting a Sequence ofTelevision Items to Suit a Group of Viewers, User

    Modeling and User-Adapted Interaction 14(1), 2004,

    pp. 37-85.

    [12]A. Micarelli, F. Sciarrone, Anatomy and EmpiricalEvaluation of an Adaptive Web-Based Information Fil-

    tering System, User Modelling and User-Adapted In-

    teraction 14(2-3), 2004, pp. 159-200.

    [13]D. Vallet, M. Fernndez, P. Castells, P. Mylonas, andY. Avrithis, Personalized Information Retrieval in

    Context, 3rd International Workshop on Modeling and

    Retrieval of Context (MRC 2006) at the 21st National

    Conference on Artificial Intelligence (AAAI 2006), Bos-

    ton, USA, July 2006.

    First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)0-7695-2692-6/06 $20.00 2006