Extraction of Adaptation Knowledge from Internet Communities *

Norman Ihle, Alexandre Hanft, and Klaus-Dieter Althoff

University of Hildesheim
Institute for Computer Science
Intelligent Information Systems Lab
[lastname]@iis.uni-hildesheim.de

* This is an extended version of the paper presented at the Workshop "WebCBR: Reasoning from Experiences on the Web" at ICCBR'09

FGWM'09 Fachgruppentreffen Wissensmanagement at LWA09, TU Darmstadt, 2009-09-22

Presentation of the paper "Extraction of Adaptation Knowledge from Internet Communities"

Outline

Motivation

CookIIS

CommunityCook: a system for model-based knowledge extraction from Internet communities

Evaluation

Conclusion & Outlook

Motivation

Adaptation is the "Reasoning" in CBR [Kolodner 1997]

Most CBR systems avoid adaptation [Schmidt et al. 2003; Minor 2006]

Adaptation Knowledge Acquisition (AKA) is cost-intensive and time-consuming
− Experts are hardly available
− Small number of research papers and systems
− Most systems focus on the case base as source of knowledge

The Internet is a large source of knowledge, especially user-generated content

The Cooking Domain

The cooking domain is well suited for adaptation, because:

The context can be described easily:
1. all ingredients can be listed with exact amount and quality
2. ingredients can be obtained in standardized quantity and in comparable quality
3. kitchen machines and tools are available in a standardized manner
4. (in case of a failure) the preparation of a meal can start over every time again from the same initial situation (except that we have more experience in cooking after each failure)

Cooking is about creativity and variation

CookIIS

System for the retrieval and adaptation of recipes in the cooking domain: http://cookiis.iis.uni-hildesheim.de

Competes in the Computer Cooking Contest
− Given recipes
− Different tasks and requirements
  − Identification of negations, type of meal and origin of the dish
  − Handling of certain diets
  − Creation of a three-course menu

Developed using the empolis:Information Access Suite (e:IAS)

CookIIS knowledge model

Most important component: modelled ingredients
− 11 different classes, about 1000 concepts
− Modelled in English and German with synonyms
− Concepts organised in taxonomies
− Combined similarity

Other components: tools, origins, methods, etc.
− Overall about 2000 modelled concepts

Rules for the recognition of the origin of the dish

Rules for the recognition of the type of meal

Adaptation in CookIIS

Model-based approach:
− Replace unwanted ingredients with similar ones
− Similarity is mainly based on taxonomies and uses a set function offered by the e:IAS Rule Engine:
  − Parent and child concepts are retrieved as well as sibling concepts
  − Too many similar ingredients are retrieved

In many cases the approach is not appropriate
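
The taxonomy-based retrieval described on this slide can be sketched roughly as follows. This is only an illustration and not the e:IAS Rule Engine set function: the tiny taxonomy, the `PARENT` table and the function `taxonomy_neighbours` are made up for the example.

```python
# Hypothetical mini-taxonomy: child concept -> parent concept.
# The real CookIIS model has about 1000 ingredient concepts; these few are invented.
PARENT = {
    "cream": "dairy",
    "milk": "dairy",
    "yoghurt": "dairy",
    "double cream": "cream",
}


def taxonomy_neighbours(concept, parent_of=PARENT):
    """Return parent, child and sibling concepts as replacement candidates."""
    parent = parent_of.get(concept)
    children = {c for c, p in parent_of.items() if p == concept}
    siblings = set()
    if parent is not None:
        siblings = {c for c, p in parent_of.items() if p == parent and c != concept}
    candidates = children | siblings
    if parent is not None:
        candidates.add(parent)
    return candidates


print(sorted(taxonomy_neighbours("cream")))
```

Even in this toy taxonomy a single unwanted ingredient yields four candidates, which hints at why the slide notes that too many similar ingredients are retrieved.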

Cooking Communities

A number of Internet communities deal with cooking knowledge

Users upload recipes and discuss them in comments

They express affirmation, criticism and what they changed for their own variation of the recipe (their personal adaptation)

If they vary the recipe, they name ingredients

Idea: using the CookIIS knowledge model to extract those ingredients

CommunityCook: Classification Idea

Comments can be classified according to extracted ingredients into three categories:

NEW: all ingredients that are discussed, but are not part of the recipe
− Add some ingredient

OLD: all ingredients that are discussed and are part of the recipe
− "more" / "less" of an ingredient, explanation for an ingredient

OLDANDNEW: some ingredients that are discussed are part of the recipe and others not
− Replacement of ingredients == adaptation
− Specialisation ("for cheese I took parmesan")

The latter category can be an adaptation suggestion, especially if the ingredients are of the same class of the knowledge model
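
A minimal sketch of the three-way classification, assuming the ingredient concepts of a comment and of its recipe have already been extracted as sets. The function name `classify_comment` and the `None` result for comments without any known ingredient are assumptions made for the example.

```python
def classify_comment(comment_ingredients, recipe_ingredients):
    """Classify a comment by comparing its ingredient concepts with the recipe's."""
    mentioned = set(comment_ingredients)
    if not mentioned:
        return None  # no known ingredient found, nothing to classify
    in_recipe = mentioned & set(recipe_ingredients)
    not_in_recipe = mentioned - in_recipe
    if in_recipe and not_in_recipe:
        return "OLDANDNEW"
    return "OLD" if in_recipe else "NEW"


recipe = {"cheese", "pasta", "cream"}
print(classify_comment({"cheese", "parmesan"}, recipe))
print(classify_comment({"cream"}, recipe))
print(classify_comment({"basil"}, recipe))
```

Only the OLDANDNEW comments carry a candidate pair (one recipe ingredient, one foreign ingredient) and are therefore the ones inspected further for adaptation suggestions.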

CommunityCook: Crawling

Crawling of a large German cooking community:
− About 76,000 recipes with 286,000 related comments
− HTML source code

Extraction of necessary data by building filters with the help of an open source tool (HTMLParser)
− Recipe title
− Single ingredients (amount, measurement, name)
− Comments
− Statistics

Saved into a database
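
The filter idea can be illustrated with a small sketch. It uses Python's standard `html.parser` module rather than the HTMLParser tool named on the slide, and the markup (`class="comment"`) is a pure assumption, since the community's real page structure is not shown here.

```python
from html.parser import HTMLParser


class CommentFilter(HTMLParser):
    """Collect the text of every element carrying class="comment" (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._depth = 0  # > 0 while inside a comment element

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Nested tag inside a comment. Unclosed void tags such as <br>
            # would need special handling, which this sketch leaves out.
            self._depth += 1
        elif ("class", "comment") in attrs:
            self._depth = 1
            self.comments.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.comments[-1] += data


page = '<ul><li class="comment">I used <b>yoghurt</b> instead of cream.</li><li>ad</li></ul>'
parser = CommentFilter()
parser.feed(page)
parser.close()
print(parser.comments)
```

In the same way, further filters for recipe title, ingredient rows and statistics would each target their own part of the page before the results are written to the database.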

CommunityCook: Text Mining case bases

Configuring e:IAS with two case bases: recipe, comment

Case representation is based on the modelled ingredients of the CookIIS knowledge model

Use e:IAS TextMiner to fill cases with concepts from text

CommunityCook: Perform Classification

In the next step we retrieved one recipe and all comments relating to that recipe

Each comment is classified into one of the three categories

Additionally we tried to find phrases in the text that support and specify the classification
− We assigned a score to determine the confidence of the classification

If a pair of ingredients of the same class is found, we also analyse whether one concept is the parent concept of the other
− If so: no adaptation, but specialisation

About 35,000 comments classified as OLDANDNEW, 16,000 with the subcategory "adaptation"
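
The same-class and parent-concept checks can be sketched like this. The two lookup tables are invented fragments standing in for the knowledge model, and the label "other" for pairs from different classes is an assumption of the example, not a category named on the slides.

```python
# Hypothetical fragments of the knowledge model.
CLASS_OF = {
    "cheese": "milk products",
    "parmesan": "milk products",
    "cream": "milk products",
    "yoghurt": "milk products",
    "basil": "herbs",
}
PARENT_OF = {"parmesan": "cheese"}


def subcategory(old, new):
    """Label an (old, new) ingredient pair taken from an OLDANDNEW comment."""
    old_class = CLASS_OF.get(old)
    if old_class is None or old_class != CLASS_OF.get(new):
        return "other"  # not of the same class, no suggestion derived here
    if PARENT_OF.get(new) == old or PARENT_OF.get(old) == new:
        return "specialisation"  # one concept is the parent of the other
    return "adaptation"


print(subcategory("cheese", "parmesan"))
print(subcategory("cream", "yoghurt"))
print(subcategory("cream", "basil"))
```

With the example from the classification slide, "for cheese I took parmesan" comes out as a specialisation, whereas cream replaced by yoghurt counts as an adaptation.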

CommunityCook: Aggregation

One way:
− Aggregation of all classified comments belonging to the recipe
− We counted the number of same classifications per recipe and aggregated the score by calculating the average and assigning a bonus for every classification

Second way:
− We also aggregated all classified ingredients without regard to the recipe (statistical)
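
Both aggregation ways can be sketched with one function. The slides give no formula, so the reading "average score plus a fixed bonus per supporting comment" and the bonus value are assumptions, as are the tuple layout and the name `aggregate`.

```python
from collections import defaultdict

BONUS_PER_COMMENT = 0.125  # assumed value; the slides do not state the bonus


def aggregate(classified, per_recipe=True):
    """Aggregate (recipe_id, old, new, score) tuples into scored suggestions.

    per_recipe=True groups by recipe (first way); per_recipe=False ignores
    the recipe (second, statistical way). Returns {key: (count, score)}.
    """
    groups = defaultdict(list)
    for recipe_id, old, new, score in classified:
        key = (recipe_id, old, new) if per_recipe else (old, new)
        groups[key].append(score)
    return {
        key: (len(scores), sum(scores) / len(scores) + BONUS_PER_COMMENT * len(scores))
        for key, scores in groups.items()
    }


rows = [
    (1, "cream", "yoghurt", 0.5),
    (1, "cream", "yoghurt", 0.75),
    (2, "cream", "yoghurt", 0.25),
]
print(aggregate(rows))
print(aggregate(rows, per_recipe=False))
```

The bonus lets a suggestion that many users made independently outrank one that appears only once with a high confidence score.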

CommunityCook: Realization

Transformation of data:
− Building of adaptation suggestions in database rows to retrieve them easily
− With regard to the recipe and without ("independent")

CommunityCook: Integration into CookIIS

6200 different adaptation suggestions are available for 570 different ingredients

Using the two most common adaptation suggestions per ingredient (without regard to the recipe) to create adaptation suggestions

Integration into CookIIS workflows:
− If no adaptation suggestion is created with community data, the model-based adaptation is used
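
The lookup with fallback might look roughly like this. The counts, the function name `suggest_replacements` and the stand-in `model_based` callback are invented for illustration; the actual CookIIS workflow is not shown on the slides.

```python
from collections import Counter


def suggest_replacements(ingredient, community_counts, model_based, top_n=2):
    """Return the top_n most common community replacements, else the model-based ones."""
    counts = community_counts.get(ingredient)
    if counts:
        return [name for name, _ in counts.most_common(top_n)]
    return model_based(ingredient)


# Made-up frequencies of recipe-independent suggestions.
community = {"cream": Counter({"yoghurt": 12, "milk": 7, "creme fraiche": 3})}


def model_based(ingredient):
    """Stand-in for the taxonomy-based adaptation of CookIIS."""
    return ["<taxonomy neighbours of %s>" % ingredient]


print(suggest_replacements("cream", community, model_based))
print(suggest_replacements("tofu", community, model_based))
```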

CommunityCook: Realization

Query: Chicken, but no cream

Starting Evaluation

First look:
− The class "supplement" of the knowledge model
  − Too many different kinds of ingredients are in this class, so the adaptation suggestions are not adequate
− The class "basic" of the knowledge model
  − Basic ingredients like flour or egg are just hard to replace
− Both are not in the review and not used in CookIIS

Two different evaluations:
− One evaluation to review the classification scheme
  − Do the classified ingredients represent what was expressed in the original comment?
− One evaluation to review the extracted knowledge
  − Are the created adaptation suggestions applicable?

Evaluation

Evaluation of the extracted knowledge:
− Expert survey
  − Real chefs review the adaptation suggestions for recipes
  − Questionnaire with recipe and adaptation suggestions
  − One adaptation suggestion that was extracted from comments belonging to that recipe ("dependent")
  − Two adaptation suggestions without regard to the recipe ("independent"), each with two ingredients as replacement suggestion (as in CookIIS)

50 questionnaires with 50 dependent and 100 pairs of independent ingredients

Evaluation: Overall Applicability

Evaluation: 1st vs. 2nd suggestion

Only 11 of the 100 independent adaptation suggestions included no ingredient that can be used as a substitute (so 89 of 100 offered at least one usable replacement)

Evaluation: Quality

Future work

Further improvements on the knowledge model

Usage of adaptation suggestions that were extracted from recipes similar to the current one

Adding some semantic analysis to improve accuracy

Usage of comments with other classifications for building variations of recipes

Check the applicability in other domains

Related work

Systems that give preparation advice for meals:
− CHEF [Hammond 1986]
− JULIA [Hinrichs 1992]

Adaptation knowledge acquisition:
− DIAL [Leake et al. 1996]
− CABAMAKA [d'Aquin et al. 2007]
− IAKA [Cordier et al. 2008]

Using the web as knowledge source in CBR:
− SEASALT [Bach et al. 2007]
− EDIR [Plaza 2008]

Conclusion

Adaptation knowledge is hard to acquire

The World Wide Web is a large source of knowledge

CommunityCook is a system that extracts adaptation knowledge from web communities in the domain of cooking
− It uses an existing knowledge model

The evaluation shows the applicability of the extracted knowledge

Thank you for your attention!

Questions?

Literature

[Bach et al., 2007] Kerstin Bach, Meike Reichle, and Klaus-Dieter Althoff. A domain independent system architecture for sharing experience. In Alexander Hinneburg, editor, Proceedings of LWA 2007, Workshop Wissens- und Erfahrungsmanagement, pages 296–303, Sep. 2007.

[Cordier et al., 2008] Amélie Cordier, Béatrice Fuchs, Léonardo Lana de Carvalho, Jean Lieber, and Alain Mille. Opportunistic acquisition of adaptation knowledge and cases - the IakA approach. In Althoff et al. [2008], pages 150–164.

[d'Aquin et al., 2007] Mathieu d'Aquin, Fadi Badra, Sandrine Lafrogne, Jean Lieber, Amedeo Napoli, and Laszlo Szathmary. Case base mining for adaptation knowledge acquisition. In Manuela M. Veloso, editor, IJCAI, pages 750–755. Morgan Kaufmann, 2007.

[Hammond, 1986] Kristian J. Hammond. Chef: A model of case-based planning. In American Association for Artificial Intelligence, AAAI-86, Philadelphia, pages 267–271, 1986.

[Hinrichs, 1992] Thomas R. Hinrichs. Problem solving in open worlds. Lawrence Erlbaum, 1992.

[Leake et al., 1996] D. Leake, A. Kinley, and D. Wilson. Acquiring case-adaptation knowledge: A hybrid approach. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 684–689. AAAI Press, 1996.

[Minor, 2006] M. Minor. Erfahrungsmanagement mit fallbasierten Assistenzsystemen. Dissertation, Humboldt-Universität zu Berlin, 2006.

[Plaza, 2008] Enric Plaza. Semantics and experience in the future web. In Althoff et al. [2008], pages 44–58. Invited talk.

[Schmidt et al., 2003] R. Schmidt, O. Vorobieva, and L. Gierl. Case-based adaptation problems in medicine. In U. Reimer, editor, Proceedings of WM2003: Professionelles Wissensmanagement – Erfahrungen und Visionen. Köllen-Verlag, 2003.