Progress on Building the Component Library
Bruce Porter, Peter Clark
Ken Barker, Art Souther, John Thompson
James Fan, Dan Tecuci, Peter Yeh
Charles Benton, Marwan Elrakabawy, Cheyenne Kohnlein
November 1, 2000
The Purpose of the Component Library
• To represent the set of common actions, states, objects, and properties so that SMEs can build KBs simply by instantiating and assembling them.
• Representing actions has been our primary focus for four months.
• Most team members have used a few prototype components to build relatively simple scenarios. Now we’re trying to properly build a more comprehensive set of components.
Refresher…
Slides from the kickoff meeting in New Orleans
Representation of Bioremediation
[Diagram: concept graph for Bioremediation. Nodes: Bioremediation, Oil, Fertilizer, Microbes, Bio-technologist, Soil, Amount (two), Rate, and a Script of the steps Get, Apply, BreakDown, Absorb linked by “then”. Edge labels: environment, contains, pollutant, remediator, agent, patient, product, absorbed, script, se, amount, rate. Qualitative influences: Q+, I-, Q-, I-.]
An underlying abstraction...
[Diagram: the Bioremediation graph shown with its abstraction, Conversion. Conversion nodes: Conversion, Substance (two), Amount (two), Rate. Edge labels: raw-materials, product, amount, rate. Qualitative influences: Q+, I-, Q-, I-.]
Another abstraction...
[Diagram: the Bioremediation graph shown with its abstraction, Digest. Digest nodes: Digest, Substance, Agent, and a Script of the steps BreakDown, Absorb linked by “then”. Edge labels: eater, food, agent, patient, absorbed, script, se.]
Another abstraction...
[Diagram: the Bioremediation graph shown with its abstraction, Treatment. Treatment nodes: Treatment, Agent, substance, and a Script of the steps Get, Apply linked by “then”. Edge labels: patient, script, se.]
The Space of Actions
• Based on various linguistic resources and an analysis of 2 texts by Alberts, we’re working toward this set of about 190 action components.
• We’ve built components for about half of them, as shown here.
• Our coding rate has increased significantly, and we’re now able to productively add more personnel.
Schedule
• Through the end of 2000:
– Focus on action components, completing about 90% of those currently planned.
– Start coding pump-priming knowledge, building basic representations of about 200 objects and events.
• January through March 2001:
– Focus on exercising the component library by encoding significant portions of Alberts. This work doubles as essential pump-priming.
– Begin to represent generic objects, especially “role concepts” (more on this later).
– Integrate the component library with core knowledge developed by other team members (more on this later).
What’s in a Component?
• The specification gives the definition, slot constraints, and links to standard linguistic sources. Here’s an example.
• The KM code gives the axioms and an explicit interface to the user. Here’s an example. Note that the code includes only local axioms; KM infers the rest. Here’s the complete expansion.
Our Process for Building a Component
• Form initial clusters of actions (e.g. transfer) based on an analysis of Alberts, Roget’s clusters, Cyc, and other linguistic sources.
• Write a specification for each action.
• Search Alberts for all occurrences (including all morphological variants) of each action, and make sure that the representation will accommodate them. Here’s the result of analyzing the actions in one chapter. These “coded examples” will be useful for training SMEs.
• Organize the actions taxonomically and pull out commonalities that can be handled with various types of composition.*
• Code the actions in KM along with simple test cases, commit them to the CVS-managed library, and run all test cases daily. Larger scenarios will provide the next level: integration testing.*
* These points will be elaborated below.
How to access the Component Library
• Click here to visit the component library.
• It’s updated every day unless some test case fails.
• We’ll add a feature to download the entire library via FTP.
The Dictionary of Slots
• We want a simple, small, and slow-growing set of slots. Ours currently has 78 slots (53 relations and 25 properties) and is inspired by well-studied sets of semantic roles from linguistics (surveyed in Ken’s dissertation).
• Slots should apply intuitively to knowledge expressed informally. We have early evidence based on 3 large experiments.
• The semantics of the slots must be axiomatized. Here are some examples.
• Slots must make the distinctions necessary for inferencing (at least to the fidelity of the KR language).
• The slot language must continue to evolve.
Non-taxonomic composition: Clichés
• A cliché is a small pattern of axioms that recurs throughout the hierarchy. For example:
• Reflexive:
– required slots: agent, object
– agent = object
• Reciprocal:
– required slots: agent, object
– agent is the object of an instance of this action having this object as agent
• Undo(A):
– precondition: object is the object of the resulting-state of action A
– postcondition: object is no longer the object of the resulting-state of action A
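The three clichés above can be read as small reusable checks. The following is a minimal Python sketch of that reading, not the KM axioms themselves; the `Action` class, the field name `obj` (standing in for the object slot), and the state dictionary are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical action instance carrying the two slots the clichés require."""
    kind: str   # name of the action concept, e.g. "Greet"
    agent: str
    obj: str    # the "object" slot (renamed to avoid Python's built-in name)


def is_reflexive(a: Action) -> bool:
    # Reflexive: the agent and the object are the same individual.
    return a.agent == a.obj


def is_reciprocal(a: Action, instances: list) -> bool:
    # Reciprocal: another instance of the same action has a's object as its
    # agent and a's agent as its object.
    return any(
        b is not a and b.kind == a.kind and b.agent == a.obj and b.obj == a.agent
        for b in instances
    )


def undo(resulting_states: dict, state: str, obj: str) -> dict:
    # Undo(A) precondition: obj is the object of A's resulting state.
    if obj not in resulting_states.get(state, set()):
        raise ValueError(f"{obj!r} is not the object of state {state!r}")
    # Undo(A) postcondition: obj is no longer the object of that state.
    updated = {name: set(objs) for name, objs in resulting_states.items()}
    updated[state].discard(obj)
    return updated


# Illustrative instances (assumed examples, not taken from the library).
wash = Action("Wash", "alice", "alice")
greet_ab = Action("Greet", "alice", "bob")
greet_ba = Action("Greet", "bob", "alice")
states = {"Be-Open": {"door"}}
after_undo = undo(states, "Be-Open", "door")
```

Because each cliché is written once and applied to any action with agent and object slots, the same pattern can recur across the hierarchy without being restated per action.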
Non-taxonomic composition: Utility Concepts
• Concepts that have natural homes within the hierarchy, but also form a part of the semantics of concepts across the hierarchy
• Copy:
– reasonable as a standalone concept
– also part of Transcribe, Forge, Encode, Reproduce, etc.
Non-taxonomic composition: model-as
• Many concepts in the KB are “role concepts”
– e.g., container, nutrient
– are generic
– are highly reusable (can be applied in many concepts)
• “If the DNA containing the 5S rRNA genes is …”
• “many DNA sequences produce two or more distinct proteins”
• “The DNA guides the synthesis of specific RNA molecules…”
• “The DNA is enclosed in …”
• “The idea that DNA transfers information…”
• By separating the “model” (e.g. container) from its application (e.g. to DNA), we can apply and reuse the same model in many ways.
Applying models
• Traditional: “Hard-wire” models to the modeled things
– Cell generalizations: Container, Consumer, …?
• Better: Define machine-selectable “views”
– Cell model-as: Container (wall = membrane, ..), Consumer (consumes = organic molecules, ..), Vehicle (transported = DNA, …), ….
• Control when and how components apply
• Allows generic components to be used multiple ways (more reuse) - difficult in the traditional approach!
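To make the contrast concrete, here is a minimal Python sketch of the “views” idea: a generic model is defined once, and a concept attaches it through an explicit slot mapping instead of inheriting from it. The classes `Model` and `Concept` and the methods `model_as` and `view` are hypothetical names for illustration only; the slot bindings are the ones listed for Cell above, and each model is given only the slot the slide names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    """A generic, reusable role concept such as Container."""
    name: str
    slots: tuple  # slot names the model exposes


@dataclass
class Concept:
    """A modeled thing that can be viewed through several models."""
    name: str
    views: dict = field(default_factory=dict)  # model name -> slot bindings

    def model_as(self, model: Model, **bindings: str) -> None:
        # Reject bindings for slots the model does not define.
        unknown = set(bindings) - set(model.slots)
        if unknown:
            raise ValueError(f"{model.name} has no slots {sorted(unknown)}")
        self.views[model.name] = dict(bindings)

    def view(self, model_name: str) -> dict:
        # Select one view; the other views stay out of the way.
        return self.views[model_name]


# Generic models, each defined once and reusable by any concept.
container = Model("Container", ("wall",))
consumer = Model("Consumer", ("consumes",))
vehicle = Model("Vehicle", ("transported",))

# Cell model-as Container, Consumer, and Vehicle, as on the slide.
cell = Concept("Cell")
cell.model_as(container, wall="membrane")
cell.model_as(consumer, consumes="organic molecules")
cell.model_as(vehicle, transported="DNA")
```

Under this arrangement, asking for `cell.view("Container")` brings in only the container bindings, which is one way to control when and how a component applies.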
How others can contribute to the Component Library
• Because the Library is only 4 months old and we’ve focused on particular types of knowledge, much remains to be done. We have several suggestions for how it might be usefully expanded.
How SMEs might index the Component Library
• SMEs will undoubtedly adjust to our tools somewhat, but they start with English. We should index the Library by English terms.
• Here’s a simple way to do that ... (next slide)
SME: I would like to use transport.
Shaken: Which of these senses of transport would you like?
- v. send from one person or place to another (see: Transfer)
- v. move while supporting … (see: Carry)
- v. hold spellbound
- v. transport commercially
- v. move something or somebody around (see: Move)
- n. the commercial enterprise of transporting goods and materials
- n. something that serves as a means of transportation (see: Transport-Device)
- n. a mechanism to transport magnetic tape over the head …
- n. an exchange of molecules across a membrane (see: Molecular-Transport)
- n. a state of being carried away by overwhelming emotion
We get “for free” the mapping from transport to Transfer, Carry, Move, Transport-Device, and Molecular-Transport by linking our components to synsets in WordNet.
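The lookup in this dialogue can be sketched as a small index from English terms to senses, where some senses carry a link to a component. The Python below is only an illustration under stated assumptions: the sense list is copied by hand from the slide rather than read from WordNet, and the names `SENSE_INDEX`, `menu`, and `components_for` are hypothetical.

```python
# Each sense: (part of speech, gloss, linked component or None).
# Glosses and links are transcribed from the slide, elisions included.
SENSE_INDEX = {
    "transport": [
        ("v", "send from one person or place to another", "Transfer"),
        ("v", "move while supporting …", "Carry"),
        ("v", "hold spellbound", None),
        ("v", "transport commercially", None),
        ("v", "move something or somebody around", "Move"),
        ("n", "the commercial enterprise of transporting goods and materials", None),
        ("n", "something that serves as a means of transportation", "Transport-Device"),
        ("n", "a mechanism to transport magnetic tape over the head …", None),
        ("n", "an exchange of molecules across a membrane", "Molecular-Transport"),
        ("n", "a state of being carried away by overwhelming emotion", None),
    ],
}


def menu(term: str, index: dict = SENSE_INDEX) -> list:
    """Format the senses of a term the way Shaken presents them to the SME."""
    lines = []
    for pos, gloss, component in index.get(term.lower(), []):
        suffix = f" (see: {component})" if component else ""
        lines.append(f"- {pos}. {gloss}{suffix}")
    return lines


def components_for(term: str, index: dict = SENSE_INDEX) -> list:
    """Return the components reachable from a term, in sense order."""
    return [c for _, _, c in index.get(term.lower(), []) if c is not None]


for line in menu("transport"):
    print(line)
```

Senses with no linked component still appear in the menu, so an SME can see that a sense exists even when the Library has nothing for it yet.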
Mapping from Verbs to Actions
The red components are currently in the Library; the blue components are planned.
Other types of Knowledge we’re Encoding
• Properties usually surface as adjectives. We have a framework for representing them, and a plan for populating the KB.
• Pump-priming knowledge. We have proposed a scenario for Jan’01 and started to represent knowledge of biological objects. We start with taxonomies and partonomies (like SMEs build), then convert them automatically to KM.
Coordinating our efforts on developing Core Knowledge
• The Core Knowledge Workshop in Austin next month
• Proposed agenda:
– Address representation challenges: continuous processes, modes of existence, time, space, causality, modals and counterfactuals, …
– Develop a detailed plan for integrating other core theories, such as ‘Everyday Semantics’
– Design the Core Knowledge for Shaken 1.0
• Schedule:
– Duration: we suggest 3 days
– Dates: we suggest mid-December