Progress on Building the Component Library

Bruce Porter, Peter Clark

Ken Barker, Art Souther, John Thompson

James Fan, Dan Tecuci, Peter Yeh

Charles Benton, Marwan Elrakabawy, Cheyenne Kohnlein

November 1, 2000

The Purpose of the Component Library

• To represent the set of common actions, states, objects, and properties so that SME’s can build KB’s by simply instantiating and assembling them.

• Representing actions has been our primary focus for four months.

• Most team members have used a few prototype components to build relatively simple scenarios. Now we’re trying to properly build a more comprehensive set of components.

Refresher…

Slides from the kickoff meeting in New Orleans

Representation of Bioremediation

[Figure: concept graph of Bioremediation. Concepts: Bioremediation, Oil, Fertilizer, Microbes, Soil, Bio-technologist, Amount (twice), Rate, Script, Get, Apply, BreakDown, Absorb. Slot labels: environment, contains, pollutant, remediator, agent, patient, product, absorbed, amount, rate, script, se, then; qualitative influences Q+, I-, Q-, I-.]


An underlying abstraction...

[Figure: the Bioremediation graph shown beside a Conversion component. Conversion concepts: Conversion, Substance (twice), Amount (twice), Rate. Slot labels: raw-materials, product, amount, rate; qualitative influences Q+, I-, Q-, I-.]


Another abstraction...

[Figure: the Bioremediation graph shown beside a Digest component. Digest concepts: Digest, Substance, Agent, Script, BreakDown, Absorb. Slot labels: food, eater, agent, patient, absorbed, script, se, then.]


Another abstraction...

[Figure: the Bioremediation graph shown beside a Treatment component. Treatment concepts: Treatment, Agent, Script, Get, Apply. Slot labels: substance, patient, script, se, then.]

The Space of Actions

• Based on various linguistic resources and an analysis of 2 texts by Alberts, we’re working toward this set of about 190 action components: http://www.cs.utexas.edu/users/mfkb/RKF/all-actions.html

• We’ve built components for about half of them, as shown here: http://www.cs.utexas.edu/users/mfkb/RKF/current-actions.html

• Our coding rate has increased significantly, and we’re now able to productively add more personnel.

Schedule

• Through the end of 2000:

– Focus on action components, completing about 90% of those currently planned.

– Start coding pump-priming knowledge, building basic representations of about 200 objects and events.

• January through March 2001:

– Focus on exercising the component library by encoding significant portions of Alberts. This work doubles as essential pump-priming.

– Begin to represent generic objects, especially “role concepts” (more on this later).

– Integrate the component library with core knowledge developed by other team members (more on this later).

What’s in a Component?

• The specification gives the definition, slot constraints, and links to standard linguistic sources. Here’s an example: http://www.cs.utexas.edu/users/mfkb/RKF/tree/specs/Carry.spec.html

• The KM code gives the axioms and an explicit interface to the user. Here’s an example: http://www.cs.utexas.edu/users/mfkb/RKF/tree/km/Carry.km.html. Note that the code includes only local axioms; KM infers the rest. Here’s the complete expansion: http://www.cs.utexas.edu/users/mfkb/RKF/example-expanded.html
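To make the "local axioms only" point concrete, here is a minimal Python sketch (not KM). It assumes a component can be approximated as a name, a parent, and a table of locally stated slot constraints. The `Component` class, the `expand` helper, the hierarchy, and every constraint value are hypothetical and chosen only for illustration; they are not taken from the library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Component:
    """A toy stand-in for a library component (hypothetical structure)."""
    name: str
    parent: Optional["Component"] = None
    # Only the constraints stated locally on this component.
    local_slots: Dict[str, str] = field(default_factory=dict)


def expand(component: Component) -> Dict[str, str]:
    """Return local plus inherited slot constraints; local values override."""
    inherited = expand(component.parent) if component.parent else {}
    return {**inherited, **component.local_slots}


# Illustrative hierarchy; the parent links and constraints are invented.
event = Component("Event", local_slots={"agent": "Entity"})
move = Component("Move", parent=event, local_slots={"object": "Tangible-Entity"})
carry = Component("Carry", parent=move, local_slots={"agent": "Animate-Entity"})

# Carry states one constraint locally but ends up with two after expansion.
expanded_carry = expand(carry)
```

The sketch covers only inheritance of slot constraints; KM's actual inference over axioms is richer than this dictionary merge.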

Our Process for Building a Component

• Form initial clusters of actions (e.g. transfer) based on an analysis of Alberts, Roget’s clusters, Cyc, and other linguistic sources.

• Write a specification for each action.

• Search Alberts for all occurrences (including all morphological variants) of each action, and make sure that the representation will accommodate them. Here’s the result of analyzing the actions in one chapter. These “coded examples” will be useful for training SME’s.

• Organize the actions taxonomically and pull out commonalities that can be handled with various types of composition.*

• Code the actions in KM along with simple test cases, commit them to the CVS-managed library, and run all test cases daily. Larger scenarios will provide the next level: integration testing.*

* These points will be elaborated below.

http://www.cs.utexas.edu/users/porter/Private/event-list-alberts.doc

http://www.cs.utexas.edu/users/porter/Private/alberts-text

http://www.cs.utexas.edu/users/porter/Private/Alberts-7-verb-case.rtf

How to access the Component Library

• Visit the component library at http://www.cs.utexas.edu/users/mfkb/RKF/tree/

• It’s updated every day unless some test case fails.

• We’ll add a feature to download the entire library via FTP.

The Dictionary of Slots

• We want a simple, small, and slow-growing set of slots. Ours currently has 78 slots (53 relations and 25 properties) and is inspired by well-studied sets of semantic roles from linguistics (surveyed in Ken’s dissertation).

• Slots should apply intuitively to knowledge expressed informally. We have early evidence based on 3 large experiments.

• The semantics of the slots must be axiomatized. Here are some examples: http://www.cs.utexas.edu/users/mfkb/RKF/tree/km/Event.km.html

• Slots must make the distinctions necessary for inferencing (at least to the fidelity of the KR language).

• The slot language must continue to evolve.

Non-taxonomic composition: Clichés

• A cliché is a small pattern of axioms that recurs throughout the hierarchy. For example:

• Reflexive:

– required slots: agent, object

– agent = object

• Reciprocal:

– required slots: agent, object

– agent is object of an instance of this action having this object as agent

• Undo(A):

– precondition: object is the object of the resulting-state of action A

– postcondition: object is no longer the object of the resulting-state of action A
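The three clichés can be read as small, reusable checks. The Python sketch below is one possible reading, not the library's KM encoding. The `Action` record, the function names, and the sample actions (Wash, Greet, Be-Locked) are hypothetical; `obj` stands for the "object" slot.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Action:
    """One action instance with the two slots the clichés require."""
    kind: str
    agent: str
    obj: str  # the "object" slot


def is_reflexive(action: Action) -> bool:
    """Reflexive: the agent and the object are the same entity."""
    return action.agent == action.obj


def is_reciprocal(action: Action, others: List[Action]) -> bool:
    """Reciprocal: another instance of the same action has the roles swapped."""
    return any(
        o.kind == action.kind and o.agent == action.obj and o.obj == action.agent
        for o in others
    )


def undo(facts: Set[Tuple[str, str]], resulting_state: str, obj: str) -> Set[Tuple[str, str]]:
    """Undo(A) over a set of (state, object) facts.

    Precondition: obj is the object of A's resulting state.
    Postcondition: obj is no longer the object of that state.
    """
    fact = (resulting_state, obj)
    if fact not in facts:
        raise ValueError("precondition failed: object is not in the resulting state")
    return facts - {fact}


wash_self = Action("Wash", "Pat", "Pat")
greet_ab = Action("Greet", "Ann", "Bob")
greet_ba = Action("Greet", "Bob", "Ann")
after_undo = undo({("Be-Locked", "Door")}, "Be-Locked", "Door")
```

Because each check is independent of where an action sits in the taxonomy, the same pattern can be attached to any action that has the required slots.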

Non-taxonomic composition: Utility Concepts

• Concepts that have natural homes within the hierarchy, but also form a part of the semantics of concepts across the hierarchy

• Copy:

– reasonable as a standalone concept

– also part of Transcribe, Forge, Encode, Reproduce, etc.

Non-taxonomic composition: model-as

• Many concepts in the KB are “role concepts”

– e.g., container, nutrient

– are generic

– are highly reusable (can be applied in many concepts)

• “If the DNA containing the 5S rRNA genes is …”

• “many DNA sequences produce two or more distinct proteins”

• “The DNA guides the synthesis of specific RNA molecules…”

• “The DNA is enclosed in …”

• “The idea that DNA transfers information…”

• By separating the “model” (e.g. container) and its application (e.g. to DNA), we can apply & reuse the same model in many ways.

Applying models

• Traditional: “Hard-wire” models to the modeled things

Cell generalizations: Container Consumer …?

• Better: Define machine-selectable “views”

Cell model-as: Container (wall = membrane, ..)

Consumer (consumes = organic molecules, ..)

Vehicle (transported = DNA, …) ….

• Control when and how components apply

• Allows generic components to be used multiple ways (more reuse) - difficult in the traditional approach!
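A minimal Python sketch of the "views" idea follows, assuming a view is just a generic model name plus bindings from the model's slots to concept-specific fillers. The Cell bindings come from the slide; the `MODELS` table, the `view_as` function, and the assumption that each model has exactly the one slot shown are hypothetical.

```python
from typing import Dict, Tuple

# Generic role models and the slots each expects (assumed for illustration).
MODELS: Dict[str, Tuple[str, ...]] = {
    "Container": ("wall",),
    "Consumer": ("consumes",),
    "Vehicle": ("transported",),
}

# model-as declarations for Cell, using the bindings shown on the slide.
CELL_VIEWS: Dict[str, Dict[str, str]] = {
    "Container": {"wall": "membrane"},
    "Consumer": {"consumes": "organic molecules"},
    "Vehicle": {"transported": "DNA"},
}


def view_as(views: Dict[str, Dict[str, str]], model: str) -> Dict[str, str]:
    """Select one view: return the model's slots bound to the concept's fillers."""
    if model not in views:
        raise KeyError(f"no model-as declaration for {model}")
    bindings = views[model]
    missing = [slot for slot in MODELS[model] if slot not in bindings]
    if missing:
        raise ValueError(f"unbound slots for {model}: {missing}")
    return dict(bindings)


cell_as_container = view_as(CELL_VIEWS, "Container")
cell_as_vehicle = view_as(CELL_VIEWS, "Vehicle")
```

Keeping the bindings outside the Cell definition is what lets the same Container model be selected, or left unselected, per use, rather than being inherited unconditionally.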

How others can contribute to the Component Library

• Because the Library is only 4 months old and we’ve focused on particular types of knowledge, much remains to be done. We have several suggestions for how it might be usefully expanded.

http://www.cs.utexas.edu/users/mfkb/RKF/cl-growth.html

How SME’s might index the Component Library

• SME’s will undoubtedly adjust to our tools somewhat, but they start with English. We should index the Library by English terms.

• Here’s a simple way to do that ... (next slide)

Mapping from Verbs to Actions

SME: I would like to use transport.

Shaken: Which of these senses of transport would you like?

- v. send from one person or place to another (see: Transfer)
- v. move while supporting … (see: Carry)
- v. hold spellbound
- v. transport commercially
- v. move something or somebody around (see: Move)
- n. the commercial enterprise of transporting goods and materials
- n. something that serves as a means of transportation (see: Transport-Device)
- n. a mechanism to transport magnetic tape over the head …
- n. an exchange of molecules across a membrane (see: Molecular-Transport)
- n. a state of being carried away by overwhelming emotion

We get “for free” the mapping from transport to Transfer, Carry, Move, Transport-Device, and Molecular-Transport by linking our components to synsets in WordNet.

The red components are currently in the Library; the blue components are planned.
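The lookup in the transport dialogue can be sketched as a table from English words to senses, where some senses carry a link to a component. The Python below is a hand-typed stand-in for the WordNet synset links, using only the senses listed in the dialogue. The `SENSES` table layout and the `components_for` function are assumptions, not part of Shaken.

```python
from typing import Dict, List, Optional, Tuple

# (part of speech, gloss, linked component or None), copied from the dialogue.
Sense = Tuple[str, str, Optional[str]]

SENSES: Dict[str, List[Sense]] = {
    "transport": [
        ("v", "send from one person or place to another", "Transfer"),
        ("v", "move while supporting …", "Carry"),
        ("v", "hold spellbound", None),
        ("v", "transport commercially", None),
        ("v", "move something or somebody around", "Move"),
        ("n", "the commercial enterprise of transporting goods and materials", None),
        ("n", "something that serves as a means of transportation", "Transport-Device"),
        ("n", "a mechanism to transport magnetic tape over the head …", None),
        ("n", "an exchange of molecules across a membrane", "Molecular-Transport"),
        ("n", "a state of being carried away by overwhelming emotion", None),
    ],
}


def components_for(word: str) -> List[str]:
    """Return the components linked to any sense of the word, in sense order."""
    senses = SENSES.get(word.lower(), [])
    return [component for _, _, component in senses if component is not None]


transport_components = components_for("transport")
```

Senses without a linked component are still worth showing to the SME, since they mark where the Library has no coverage yet.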

Other types of Knowledge we’re Encoding

• Properties usually surface as adjectives. We have a framework for representing them, and a plan for populating the KB.

• Pump-priming knowledge. We have proposed a scenario for Jan’01 (http://www.cs.utexas.edu/users/porter/Private/Scenario-selections.doc) and started to represent knowledge of biological objects (http://www.cs.utexas.edu/users/porter/Private/object-hierarchy.doc). We start with taxonomies and partonomies (like SME’s build), then convert them automatically to KM.
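The first half of the automatic conversion can be sketched as follows, assuming the taxonomy is written as an indented outline with two spaces per level. The outline format, the `parse_taxonomy` function, and the concept names are hypothetical. Emitting actual KM frames is left out because the generated KM is not shown in these slides.

```python
from typing import List, Tuple


def parse_taxonomy(outline: str, indent: int = 2) -> List[Tuple[str, str]]:
    """Turn an indented outline into (concept, superclass) pairs."""
    pairs: List[Tuple[str, str]] = []
    stack: List[str] = []  # stack[d] is the most recent concept at depth d
    for line in outline.splitlines():
        if not line.strip():
            continue  # skip blank lines
        depth = (len(line) - len(line.lstrip(" "))) // indent
        name = line.strip()
        del stack[depth:]  # drop concepts at this depth or deeper
        if stack:
            pairs.append((name, stack[-1]))
        stack.append(name)
    return pairs


# Invented example taxonomy, for illustration only.
OUTLINE = """\
Cell
  Eukaryotic-Cell
    Animal-Cell
  Prokaryotic-Cell
"""

pairs = parse_taxonomy(OUTLINE)
```

A partonomy could be handled by the same parser, reading each pair as (part, whole) instead of (concept, superclass).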

Coordinating our efforts on developing Core Knowledge

• The Core Knowledge Workshop in Austin next month

• Proposed agenda:

– Address representation challenges: continuous processes, modes of existence, time, space, causality, modals and counterfactuals, …

– Develop a detailed plan for integrating other core theories, such as ‘Everyday Semantics’

– Design the Core Knowledge for Shaken 1.0

• Schedule:

– Duration: we suggest 3 days

– Dates: we suggest mid-December
