concept modeling:

28
CONCEPT MODELING: Đorđe Popović, Ognjen Šćekić, Đorđe Popović, Ognjen Šćekić, Veljko Milutinović Veljko Milutinović A Proposed Hybrid Approach to A Proposed Hybrid Approach to Patent Modeling Patent Modeling

Upload: kathy

Post on 22-Jan-2016

39 views

Category:

Documents


0 download

DESCRIPTION

CONCEPT MODELING:. A Proposed Hybrid Approach to Patent Modeling. Đorđe Popović, Ognjen Šćekić, Veljko Milutinović. What is concept modeling?. A way of modeling reality: Identifying concepts Identifying relations among concepts - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: CONCEPT MODELING:

CONCEPT MODELING:Đorđe Popović, Ognjen Šćekić,Đorđe Popović, Ognjen Šćekić, Veljko MilutinovićVeljko Milutinović

A Proposed Hybrid Approach to Patent ModelingA Proposed Hybrid Approach to Patent Modeling

Page 2: CONCEPT MODELING:

22//2828

What is concept modeling?

• A way of modeling reality:A way of modeling reality: Identifying conceptsIdentifying concepts Identifying relations among conceptsIdentifying relations among concepts Organizing the concepts in a knowledge-base, Organizing the concepts in a knowledge-base,

allowing an "intelligent" way to search and process this data.allowing an "intelligent" way to search and process this data.

• Why do we need concept modeling?Why do we need concept modeling?To make electronic resources not only machine-processable, To make electronic resources not only machine-processable, but also but also machine-understandablemachine-understandable!!

Page 3: CONCEPT MODELING:

33//2828

Challenges

• How to create a model that has a uniform structure, How to create a model that has a uniform structure, and is powerful enough to capture the essence of any concept?and is powerful enough to capture the essence of any concept?

• How should these models be linked into an efficient structure?How should these models be linked into an efficient structure?

• How can we bridge the gap between natural languageHow can we bridge the gap between natural languageand a machine-processable model?and a machine-processable model?

Page 4: CONCEPT MODELING:

44//2828

Ultimate Goal:Ultimate Goal:From a specific to a general model!From a specific to a general model!

7 Ws – PROs and CONs

concept

Which

When

What

Why

Where

Who

(W)How

WHAT associations provide general facts about WHAT associations provide general facts about anyany concept.concept.

Page 5: CONCEPT MODELING:

55//2828

Why start with patents?

• Described by a very formal, structured language – Described by a very formal, structured language – claimsclaims..

• Each patent is a novel concept.Each patent is a novel concept.

• Definition of one patent is usually based on another one.Definition of one patent is usually based on another one.

Page 6: CONCEPT MODELING:

66//2828

General info about General info about the the patentpatent(can be used for Which, When, Why, Where, Who and How)(can be used for Which, When, Why, Where, Who and How)

Structure of a Patent Document

Abstract of the patentAbstract of the patent

Claims – primary target for WhatClaims – primary target for WhatDescription – not well structuredDescription – not well structuredReferenceReferences tos to related patent related patentss

Page 7: CONCEPT MODELING:

77//2828

Conceptual Indexing (1)

• What is conceptual indexing?What is conceptual indexing?““New technique for organizing information to support subsequent access New technique for organizing information to support subsequent access that can dramatically improve your ability to find the information you need,that can dramatically improve your ability to find the information you need,with less hassle and with better results.”with less hassle and with better results.”

William William A. WoodsA. Woods

• Conceptual indexing combines techniques of:Conceptual indexing combines techniques of: Knowledge representationKnowledge representation Natural language processingNatural language processing Classical techniques for indexing words and phrasesClassical techniques for indexing words and phrases

• BBridgeridgess the gap between natural language the gap between natural languageand a machine processable modeland a machine processable model..

Page 8: CONCEPT MODELING:

88//2828

Conceptual Indexing (2)

Conceptual indexing technology is a combination of:Conceptual indexing technology is a combination of:

• Concept extractorConcept extractorIIdentifies phrases to be indexeddentifies phrases to be indexed..

• Concept Concept aassimilatorssimilatorAAnalyzes nalyzes a a concept phrase to determine concept phrase to determine its place in the conceptual taxonomyits place in the conceptual taxonomy..

• Conceptual retriConceptual retrieeval systemval systemUUsseses conceptual taxonomy conceptual taxonomy to maketo make connections connections between requestbetween requesteded and indexed items and indexed items..

Target Text

Concept ExtractorConcept

Assimilator

Conceptual Retrieval System

RequestConnections

between requests and indexed items

Figure 1 –Figure 1 – Main components of a conceptual indexer Main components of a conceptual indexer

Page 9: CONCEPT MODELING:

99//2828

Hybrid Approach: Indices + RDF/OWL

• Conceptual indicesConceptual indices• RDF/OWLRDF/OWL

• Motivation:Motivation: Use the advantages of one approach Use the advantages of one approach to eliminate the drawbacks of the other.to eliminate the drawbacks of the other.

Page 10: CONCEPT MODELING:

1010//2828

Conceptual Indices vs. RDF/OWL

Conceptual indicesConceptual indices RDF/OWL ontologiesRDF/OWL ontologies

MajorMajoradvantages:advantages:

Linear-complexity structures Very expressive and precise

Provide basic subsumption relations Based on First-Order Logic

Provide built-in knowledgeon low-level concepts Supported by W3C

MajorMajordrawbacks:drawbacks:

Incapability of establishingexplicit relations among

high-level concepts Great complexityIncapability to create

precise models

Page 11: CONCEPT MODELING:

1111//2828

Why not use ontologies alone?

• If we want to use an ontology we have 2 choices:If we want to use an ontology we have 2 choices: Use an existing, well-established ontology that might not suite our needs. Use an existing, well-established ontology that might not suite our needs. Create a new ontology which does suit our needs:Create a new ontology which does suit our needs:

– We can create several different ontologies,We can create several different ontologies,depending on depending on howhow we want to capture the information. we want to capture the information.

– Problems arise when we want to merge ontologies.Problems arise when we want to merge ontologies.

• This approach works fine within a closed communityThis approach works fine within a closed communitywith specific needs:with specific needs: There already exists a well-defined basic ontology structure.There already exists a well-defined basic ontology structure. Community members have a good knowledge of how to model new conceptsCommunity members have a good knowledge of how to model new concepts

in terms of the existing ones.in terms of the existing ones.

Page 12: CONCEPT MODELING:

1212//2828

Why not use indices alone?

• For example, let us take the simplest possible definition, for a bird:For example, let us take the simplest possible definition, for a bird:

bird bird 11 – – a creature with wings and feathers that lays eggs a creature with wings and feathers that lays eggs and can usually fly. and can usually fly.

• Our index might then contain the following associations:Our index might then contain the following associations:

creature, wings, feathers, eggs, flycreature, wings, feathers, eggs, fly..

• A conceptual index does not offer the possibility A conceptual index does not offer the possibility to state the fact that to state the fact that some birds do not flysome birds do not fly!!

11 - Word definition taken from - Word definition taken from Longman Dictionary of Contemporary EnglishLongman Dictionary of Contemporary English , 3rd edition, 1995., 3rd edition, 1995.

Page 13: CONCEPT MODELING:

1313//2828

Hybrid Approach (1)

• An index of associations represents a simple model,An index of associations represents a simple model,similar to what humans have on their mindsimilar to what humans have on their mindwhen they first think of a bird.when they first think of a bird.

• Having enough associations, one can create a model Having enough associations, one can create a model with a considerable degree of accuracy.with a considerable degree of accuracy.

• RDF/OWL statements provide a means RDF/OWL statements provide a means for expressing additional (but very important) informationfor expressing additional (but very important) information(e.g. there are birds that cannot fly!)(e.g. there are birds that cannot fly!)

• We believe this is good enough for most applications.We believe this is good enough for most applications.

Page 14: CONCEPT MODELING:

1414//2828

Hybrid Approach (2)

• It is important to keep track of how many times a term is mentioned,It is important to keep track of how many times a term is mentioned,because it affects its descriptive power.because it affects its descriptive power.

Example: Example:

U.S. Patent 6,989,179 – “U.S. Patent 6,989,179 – “Synthetic grass sport surfacesSynthetic grass sport surfaces”, claims section:”, claims section:

1. synthetic grass1. synthetic grass [10][10]

2. playing surface2. playing surface [9] [9]

… …

• These terms represent the essence of what is being described!These terms represent the essence of what is being described!

Page 15: CONCEPT MODELING:

1515//2828

Hybrid Approach (3)

• However, this is only because we However, this is only because we knowknow what “synthetic grass” what “synthetic grass” and “playing surface” are!and “playing surface” are!

At some level, we need to have some intrinsic,At some level, we need to have some intrinsic, built-in knowledge-base of basic concepts! built-in knowledge-base of basic concepts!

• All the other concepts can then be described All the other concepts can then be described in terms of these basic concepts.in terms of these basic concepts.

• Solution: Solution: Conceptual indexers are equipped Conceptual indexers are equipped with a knowledge base of basic terms.with a knowledge base of basic terms.

Page 16: CONCEPT MODELING:

1616//2828

Patent Model – Conceptual Index

• A patent’s Claims section is scanned and processedA patent’s Claims section is scanned and processedby a by a conceptual indexerconceptual indexer..

• The result is a The result is a descriptive indexdescriptive index,, associated with the patent associated with the patent (it size is approx. 1-5% of the full text).(it size is approx. 1-5% of the full text).

• This index can be seen as an ordered list This index can be seen as an ordered list of the patent’s of the patent’s WHAT associationsWHAT associations (terms, phrases, sentence fragments).(terms, phrases, sentence fragments).

• An entry in the descriptive index contains a low-level concept,An entry in the descriptive index contains a low-level concept,and the number of its occurrences.and the number of its occurrences.

Page 17: CONCEPT MODELING:

1717//2828

Patent Model – RDF/OWL

• For a different application, For a different application, a different RDF/OWL model needs to be devised.a different RDF/OWL model needs to be devised.

• For describing patents this model could be used For describing patents this model could be used to capture explicitly stated information:to capture explicitly stated information:

• Patent number and other numbersPatent number and other numbers (( WHICH) WHICH)• Inventor, examiner, attorney, …Inventor, examiner, attorney, … (( WHO) WHO)• Date when the patent was filedDate when the patent was filed (( WHEN) WHEN)• Explicit references to similar patentsExplicit references to similar patents (( WHICH) WHICH)• etc…etc…

• Each W can have multiple sub-categories Each W can have multiple sub-categories that are application-specific!that are application-specific!

Page 18: CONCEPT MODELING:

1818//2828

Patent Model – Creation

Patent

XML /RDF

Figure 2 –Figure 2 – Creation of a patent model: Creation of a patent model:

Claims section is processed by the conceptual indexer to produce an index associated with the patent.Claims section is processed by the conceptual indexer to produce an index associated with the patent.

Additional information about the concept is captured by RDF/OWL statements,Additional information about the concept is captured by RDF/OWL statements,into a predefined, application-specific structure.into a predefined, application-specific structure.

ConceptualIndexer

“descriptive index”

Page 19: CONCEPT MODELING:

1919//2828

Patent Model – Result

Figure 3 – Figure 3 – Patent model:Patent model:

WHAT associations are contained in a descriptive index. WHAT associations are contained in a descriptive index. Other Ws are expressed through RDF/OWL statements.Other Ws are expressed through RDF/OWL statements.

XML /RDF

XML /RDF

XML /RDF

WH

O

WHEN

WHERE

WHAT

application-specificinformation

“descriptive index”

Page 20: CONCEPT MODELING:

2020//2828

Patent model – Big Picture

• Descriptive indices are re-processed by the Conceptual indexer,Descriptive indices are re-processed by the Conceptual indexer,to form the to form the system indexsystem index..

• Each entry in the system index retains links Each entry in the system index retains links to the descriptive indices it originates from,to the descriptive indices it originates from,and vice-versa.and vice-versa.

• This structure allows us to:This structure allows us to: Perform quick searches of the existing patentsPerform quick searches of the existing patents Add/remove patents easilyAdd/remove patents easily

Page 21: CONCEPT MODELING:

2121//2828

RDF/OWL INDICES

XML /RDF

XML /RDF

XML /RDF

WHO

WHEN

WHERE

WHAT

XML /RDF

XML /RDF

XML /RDF

WHO

WHEN

WHERE

WHAT

AutomatedReasoner

ConceptualIndexer

”System Index”

XML /RDF

XML /RDF

XML /RDF

WHO

WHEN

WHERE

WHAT

Figure 4 – Figure 4 – Top-level schemeTop-level scheme

Page 22: CONCEPT MODELING:

2222//2828

Patent Model – Patent Relations

Two ways of establishing relations among patents:Two ways of establishing relations among patents:

• Via RDF/OWL statements, using automated reasonersVia RDF/OWL statements, using automated reasoners Problem: Referential integrity & ConsistencyProblem: Referential integrity & Consistency

• Via System index (implicit links)Via System index (implicit links) Problem: Inexact, based on probabilityProblem: Inexact, based on probability

Page 23: CONCEPT MODELING:

2323//2828

Patent Model – Implicit Links (1)

• Descriptions of similar concepts (patents) Descriptions of similar concepts (patents) usually make a frequent use of similar or even same terms.usually make a frequent use of similar or even same terms.

• By determining overlapping terms we createBy determining overlapping terms we createdynamic, implicit links among similar concepts.dynamic, implicit links among similar concepts.

• The number of such implicit links can be used The number of such implicit links can be used to express similarity among concepts.to express similarity among concepts.

• The algorithm for determining the similarity The algorithm for determining the similarity needs to be tweaked empirically.needs to be tweaked empirically.

Page 24: CONCEPT MODELING:

2424//2828

Patent Model – Implicit Links (2)

• For example:For example:When describing two different vaccines When describing two different vaccines we would probably make a frequent use of terms like: we would probably make a frequent use of terms like: vaccine, inactivated antigens, immune response, etc.vaccine, inactivated antigens, immune response, etc.

Vaccine (59)

Immune response (34)

Inactivated antigens (23)

etc.

Antigen

Immune response

Vaccine

...

Vaccine (39)

Inactivated antigens (19)

Laboratory test (17)

Immune response (16)

etc.

Vaccine #1 Vaccine #2

“System Index”

Page 25: CONCEPT MODELING:

2525//2828

Advantages & Drawbacks

• AdvantagesAdvantages Reduced complexityReduced complexity (a great reduction of direct links between concepts)(a great reduction of direct links between concepts)

Fast search and retrievalFast search and retrieval (as the result of using indices)(as the result of using indices)

ScalabilityScalability

• DrawbacksDrawbacks Use of indices implies loss of precisionUse of indices implies loss of precision

Page 26: CONCEPT MODELING:

2626//2828

Conclusion

• Our idea is still in the first stage of development.Our idea is still in the first stage of development.

• Its key advantages are:Its key advantages are:its general applicability and reduced complexity.its general applicability and reduced complexity.

• Further research is needed to explore Further research is needed to explore the quality and feasibility of the proposed solution.the quality and feasibility of the proposed solution.

• However, we expect that the combination However, we expect that the combination of OWL/RDF structures and indices might produce of OWL/RDF structures and indices might produce a satisfactory performance/exactness ratio.a satisfactory performance/exactness ratio.

Page 27: CONCEPT MODELING:

2727//2828

References

W. A. Woods, L. A. Bookman, A. Houston, R. J. Kuhns, P. Martin, S. Green, W. A. Woods, L. A. Bookman, A. Houston, R. J. Kuhns, P. Martin, S. Green, "Linguistic Knowledge Can Improve Information Retrieval","Linguistic Knowledge Can Improve Information Retrieval",Proc. of the Applied Natural Language Processing Conference (ANLP-2000),Seattle, 2000.Proc. of the Applied Natural Language Processing Conference (ANLP-2000),Seattle, 2000.

O. Scekic, P. Bojic, O. Scekic, P. Bojic, "An Overview of OWL and its Role in Semantic Web Architecture""An Overview of OWL and its Role in Semantic Web Architecture",,YU-INFO 06, Kopaonik, Serbia&Montenegro, 2006.YU-INFO 06, Kopaonik, Serbia&Montenegro, 2006.

Boris V. Dobrov, Natalia V. Loukachevitch, Tatyana N. Yudina, Boris V. Dobrov, Natalia V. Loukachevitch, Tatyana N. Yudina, "Conceptual Indexing Using Thematic Representation of Texts“"Conceptual Indexing Using Thematic Representation of Texts“,,Scientific Research Computer Center of Moscow State University, Moscow,Scientific Research Computer Center of Moscow State University, Moscow, 1998. 1998.

S. Omerovic, D. Savic, S. Tomazic,S. Omerovic, D. Savic, S. Tomazic,"A Survey of Concept Modeling""A Survey of Concept Modeling",,Faculty of Electrical Engineering, University of Ljubljana, Slovenia (to appear)Faculty of Electrical Engineering, University of Ljubljana, Slovenia (to appear)..

William A. Woods, William A. Woods, “Conceptual Indexing:“Conceptual Indexing: A Better Way to Organize Knowledge“A Better Way to Organize Knowledge“,, Technical report, Technical report, Sun Microsystems Laboratories, 1998.Sun Microsystems Laboratories, 1998.

http://www.uspto.govhttp://www.uspto.gov – – U.S. Patent officeU.S. Patent office

Page 28: CONCEPT MODELING:

CONCEPT MODELING:Đorđe PopovićĐorđe Popović Ognjen ŠćekićOgnjen Šćekić Veljko Veljko MilutinovićMilutinović [email protected]@ptt.yu [email protected]@cg.yu [email protected]@etf.bg.ac.yu

A Proposed Hybrid Approach to Patent ModelingA Proposed Hybrid Approach to Patent Modeling

Thank you !Thank you !