
CSC 599: Computational Scientific Discovery

Lecture 9: Introduction to the Scienceomatic Architecture

Outline

Motivation: CSD thus far

Scienceomatic Architecture

First Trend in CSD

1. Data structures that are more predictive

Single simple equations: BACON, late 1980s

List of mechanisms: MECHEM, mid 1990s

Differential equations: Lagramge, mid 1990s

Process network: IPM, mid 2000s

Second Trend in CSD

2. Better application of domain knowledge

“Better” in the sense that it:
  1. Provides more efficiency for limiting search
  2. Is provided in a scientist-friendly format

Examples:
  Ad hoc: BACON, late 1980s
  Grammar: Lagramge, mid 1990s
  Domain constraints of acceptable solutions: MECHEM, mid 1990s
  Abstract processes: IPM, mid 2000s

Emphases of CSD

Predictive data structures:

1. More structurally complex

2. More embedded in knowledge scientists already have

  Use domain knowledge
  More “understandable”

3. Integrates simulation and exhaustive search

2 strengths of computers

But what do scientists do?

Give reasons why! (explanations)

Templates for solving problems:

  Philosophy of science: Kuhn's exemplars
  Artificial Intelligence: Explanation-Based Learning

Explanations need:
  1. Assertions
  2. Reasoning method(s) to tie them together

About Explanations

Assertions come in (at least) two flavors:
  What Prolog would call “facts”
    Measurements (thermometer1 read 20.6 C at time t0)
    Fundamental properties (c = 299,792,458 m/s)
  What Prolog would call “rules”
    F = ma
    Modern philosophers of science don't like this
    They think it smells too much like logical empiricism

Reasoning comes in several flavors:
  Deduction: A; A->B; therefore B
  Abduction: B; A->B; therefore A
  Analogy: f(A); g(a); relates(A,a); f(B); g(b); therefore relates(B,b)
  Maybe Induction: f(1); f(2); f(3); therefore ∀n: f(n)
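
The difference between deduction and abduction can be sketched in a few lines of Python. This is only an illustration: the rule names (`rain`, `sprinkler`, `wet_grass`) and both function names are hypothetical and do not come from the lecture.

```python
# Each rule is a (premise, conclusion) pair standing for "premise -> conclusion".
# The rule contents are made-up illustrations.
RULES = [("rain", "wet_grass"), ("sprinkler", "wet_grass")]

def deduce(facts, rules):
    """Deduction: from A and A->B conclude B, repeated until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def abduce(observation, rules):
    """Abduction: from B and A->B propose every A as a candidate explanation."""
    return {premise for premise, conclusion in rules if conclusion == observation}

print(sorted(deduce({"rain"}, RULES)))
print(sorted(abduce("wet_grass", RULES)))
```

Deduction is truth-preserving, while abduction only proposes candidates: here it returns two competing explanations for the same observation.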

Explanation-based Learning

Deductive learning from one training example

Requires:
  1. The training example: the world provides proof of one legal configuration
  2. A goal concept: high-level description of what to learn
  3. An operationality criterion: tells which concepts are usable
  4. A domain theory: tells relationship between rules and actions in the domain

EBL generalizes the example to describe the goal concept and satisfy the operationality criterion:
  1. Explanation: remove unimportant details from the training example with respect to the goal concept
  2. Generalization: generalize as much as possible while still describing the goal concept

EBL Applied to Scientific Reasoning

We have Newton's law of gravity: F = GMm/r^2 (domain knowledge)

(Example):
  Mass of an apple
  Force on apple due to gravity (i.e. its weight)
  mass[earth] >> mass[apple]; r = radius[earth]

Force of weight (goal concept to learn)

Data struct outlining how (operationality criterion):
  weight[apple] = (GM/radius[earth]^2) * mass[apple]

Generalize data struct to anything fitting criteria:
  mass[earth] >> mass[X], r = radius[earth]
  weight[X] = GM*mass[X]/radius[earth]^2
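
A minimal sketch of the generalization step, assuming Newton's law with an inverse-square dependence on r. The function names, the numeric constants (standard textbook values), and the threshold used to test "mass[earth] >> mass[X]" are assumptions made for illustration, not part of the lecture.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the earth, kg
R_EARTH = 6.371e6    # mean radius of the earth, m

def newton_force(m1, m2, r):
    """Domain theory: F = G * m1 * m2 / r^2."""
    return G * m1 * m2 / r**2

def make_weight_rule(big_mass, radius, ratio=1e6):
    """Generalize the apple example into a rule for any object X.

    `ratio` is a hypothetical threshold standing in for "mass[earth] >> mass[X]".
    """
    k = G * big_mass / radius**2   # computed once, reused for every X
    def weight(mass_x):
        if big_mass < ratio * mass_x:
            raise ValueError("condition mass[earth] >> mass[X] does not hold")
        return k * mass_x
    return weight

weight_on_earth = make_weight_rule(M_EARTH, R_EARTH)
print(weight_on_earth(0.1))   # weight of a 0.1 kg apple, in newtons
```

The learned rule trades generality for cost: it only applies under the stated conditions, but it replaces a full application of the domain theory with one multiplication.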

But what else do scientists do?

Remember what has been tried, and why!
  Historical trajectory of scientific effort

Reason where to put effort
  Human science: funding agency
  Artificial Intelligence: reinforcement learning

Issues:
  1. Strategy vs. tactics
  2. Tried-and-true vs. brand new

Science under limited resources

Ranking (priority queue) of operators to try

Funding agencies: limited resource = money (and time)

Reinforcement learning: limited resource = CPU time (and memory)
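
One way to picture a priority queue of operators under a limited budget is the sketch below. The class name, the operator names, and the "expected payoff per unit cost" ranking are all assumptions chosen for illustration; the lecture does not specify how priorities are computed.

```python
import heapq

class OperatorQueue:
    """Ranks candidate operators and runs the best ones that fit the budget."""

    def __init__(self):
        self._heap = []
        self._count = 0   # tie-breaker so equal priorities pop in insertion order

    def push(self, name, expected_payoff, cost):
        priority = expected_payoff / cost            # assumed ranking rule
        # heapq is a min-heap, so store the negated priority.
        heapq.heappush(self._heap, (-priority, self._count, name, cost))
        self._count += 1

    def run(self, budget):
        """Pop operators best-first, keeping those that still fit the budget."""
        chosen = []
        while self._heap:
            _, _, name, cost = heapq.heappop(self._heap)
            if cost <= budget:
                budget -= cost
                chosen.append(name)
        return chosen

queue = OperatorQueue()
queue.push("fit_linear_equation", expected_payoff=0.6, cost=1.0)
queue.push("search_mechanisms", expected_payoff=0.9, cost=5.0)
queue.push("fit_differential_equation", expected_payoff=0.8, cost=2.0)
schedule = queue.run(budget=3.0)
print(schedule)
```

Here the budget plays the role of CPU time: the most promising operator overall is skipped because it does not fit in what remains.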

What is Reinforcement Learning?

Is a type of learning
  Not a particular algorithm!

Agent always acting in an environment

Gets reward at the end, or as it goes along
  Can be a delay between an action and its payoff

Goal: maximize the payoff

Strategy vs. Tactics (Military)

(Definitions from Compact Oxford English Dictionary)

Strategy: “a plan designed to achieve a particular long-term aim”

Examples:
  “Destroy enemy's forces”
  “Destroy enemy's economy/industrial base”
  “Destroy enemy's morale”

Tactics: “the art of disposing armed forces in order of battle and of organizing operations.”

Examples:
  Frontal assault
  Siege
  Pincer
  Hit and run

Strategy (Scientific Discovery)

Strategy: “What should the long-term process of science be?”

Topic in contemporary philosophy of science

Examples:
  Lakatos
    Minimize number of unpredicted phenomena
    Cumulatively build upon research programmes' hard core
  Laudan
    Maximize number of predicted attributes
    Research traditions less structured, not necessarily cumulative

Tactics (Scientific Discovery)

Tactics:

“What should this scientist be doing right now?”

Related to inductive bias in machine learning

Examples:
  Information gain
  Minimize cross-validation error
  Maximize conditional independence
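
As an illustration of the first tactic, information gain can be computed as the entropy of the target minus the weighted entropy left after splitting on an attribute. The tiny data set and its attribute names are invented for this sketch.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical observations: does a sample conduct electricity?
samples = [
    {"metal": "yes", "color": "grey", "conducts": "yes"},
    {"metal": "yes", "color": "red",  "conducts": "yes"},
    {"metal": "no",  "color": "grey", "conducts": "no"},
    {"metal": "no",  "color": "red",  "conducts": "no"},
]
print(information_gain(samples, "metal", "conducts"))
print(information_gain(samples, "color", "conducts"))
```

A tactic based on information gain would examine `metal` first, because it removes all uncertainty about the target while `color` removes none.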

Strategy vs. Tactics: related issues

Tried-and-true vs. Brand new

When does the strategy switch from conventional tactics to unconventional ones?

Philosophy of science:
  Kuhn: normal science vs. revolution
  Lakatos: progressive vs. degenerate research programmes

Artificial Intelligence: exploration vs. exploitation
  Exploration (revolution: look for the brand new)
  Exploitation (normal science: get what one can from known structure)

Issue studied in reinforcement learning community
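
A common and simple way to balance exploration and exploitation in reinforcement learning is the epsilon-greedy rule, sketched below. It is named here as one standard technique, not as the method the Scienceomatic uses; the operator names and payoff numbers are hypothetical.

```python
import random

def choose_operator(avg_payoff, epsilon, rng):
    """Epsilon-greedy choice over a dict mapping operator name -> average payoff."""
    if rng.random() < epsilon:
        # Exploration: try any operator, even one that has done poorly so far.
        return rng.choice(sorted(avg_payoff))
    # Exploitation: use the operator with the best record.
    return max(avg_payoff, key=avg_payoff.get)

payoffs = {"refine_known_law": 0.7, "propose_new_process": 0.2}
rng = random.Random(0)
picks = [choose_operator(payoffs, epsilon=0.1, rng=rng) for _ in range(1000)]
print(picks.count("refine_known_law"), picks.count("propose_new_process"))
```

Raising `epsilon` shifts effort toward the brand new; lowering it toward the tried-and-true.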

Can Implement each object separately

Assertion data structure: directly uses assertions

Explanation data structure: gives explanations

Historical data structure: gives historical context to justify what to do next

Assertion Usage Object

Sample of important methods:
  Retrieve assertion a1
    Show assertion
    Edit assertion
  Predict object o1's attribute attr1
    Plot these values
    Compare predicted and recorded values
  Justify (e.g. logical resolution) assertion a1

Explanation Usage Object

Sample of important methods:
  Predict o1's attribute attr1
    Satisfy with assertion usage obj
    Satisfy with solved problem library
      Philosophy of science justification: Kuhnian exemplar (what scientists do)
      Artificial Intelligence justification: EBL (cheaper than de novo reasoning)
  Give trace why object o1's attribute attr1 is value v1
  Give trace how assertion a1 is justified (e.g. derived)
  Refine reasoning method: “I like traces like this over traces like that because . . .”

Historical Trajectory Object

Sample of important methods:
  Predict o1's attribute attr1
    Show vs. edit (assertion object)
    De novo vs. exemplar (explanation)
    Why this trace? Previous traces?
  Change strategy
    Lakatos, Laudan or other?
  Change tactics
    Which inductive bias
    When to use operator op1
  Change operator library
    Add/delete/modify operators
  Examine history
    How well do ops work, and when?
    Selectively erase history
  Change priority queue
    Reorder operator instances

Three objects under Manual or Autonomous Control

Ontology

Is-a hierarchy: single inheritance (except for processes)

Instance-of leaves: each instance only belongs to one class

Inherited properties from classes: override-able at instance or derived class level
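
Python's own class mechanism gives a rough analogue of these three points. The class names and the `phase` property are made up for this sketch, and the actual Scienceomatic ontology representation may differ.

```python
class Entity:
    phase = "unknown"      # default property, inherited by everything below

class Substance(Entity):   # single inheritance: exactly one parent class
    pass

class Water(Substance):
    phase = "liquid"       # overridden at the derived-class level

glass_of_water = Water()   # an instance belongs to exactly one class
ice_cube = Water()
ice_cube.phase = "solid"   # overridden at the instance level

print(Substance().phase, glass_of_water.phase, ice_cube.phase)
```

The instance-level override affects only `ice_cube`; other instances of `Water` still inherit the class value.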

Assertion Usage Object

Types of assertions:

“Assertions of state”
  “Facts” (in the Prolog sense)
  “Rules” (in the Prolog sense)
    Relations (e.g. equations): numeric computation
    Decision trees: symbolic computation

“Assertions of motion”
  Process classes
  Process instances: analogous to “facts”
  Rules
    Numeric relations for processes
    Decision trees for processes

About assertions

Assertions have:
  Name
  List of entities (things they interrelate)
  Conditions (when they hold)
  Expression
  <entity,attribute> pair that they define (optional)
  Authorship
    Who is responsible for putting them in the kb
    When they were placed in
    Where they came from (Operator? User edit?)
    List of assertions that they depend on (if created by operator)

Numeric Relation example

Numeric relation: Ideal gas law

Name: ideal_gas_law

List of entities (things they interrelate): [gas_ent, container_ent, molecule_ent]
  gas_ent is the gas sample being described
  container_ent is the container holding the gas sample
  molecule_ent is the

Conditions (when they hold), e.g.: “Gas phase molecules are not attractive or repulsive”

Expression: PV = nRT

<entity,attribute> pair that they define (optional): Gas's thermal energy = PV = nRT (?)

Numeric Relation Example (2)

Authorship

Of discovery
  Who discovered it: person(some_scientist), operator(bacon3)
  When it was discovered: date(century19), date(1990)

On inclusion
  Who included it: person(joseph_perry_phillips)
  When it was included: date(2008,may,27)
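
The fields listed for an assertion map naturally onto a record type. The sketch below uses a Python dataclass; the field names, the `compute` callable, and the choice to solve PV = nRT for pressure are assumptions for illustration, while the values filled in are taken from the ideal gas law example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Assertion:
    name: str
    entities: list                     # things the assertion interrelates
    conditions: list                   # when it holds
    expression: str
    defines: Optional[tuple] = None    # optional <entity, attribute> pair
    discovered_by: str = ""
    discovered_when: str = ""
    included_by: str = ""
    included_when: str = ""
    depends_on: list = field(default_factory=list)
    compute: Optional[Callable] = None # hypothetical executable form

R = 8.314462618  # molar gas constant, J / (mol K)

ideal_gas_law = Assertion(
    name="ideal_gas_law",
    entities=["gas_ent", "container_ent", "molecule_ent"],
    conditions=["Gas phase molecules are not attractive or repulsive"],
    expression="PV = nRT",
    defines=None,   # left unset here
    discovered_by="person(some_scientist)",
    discovered_when="date(century19)",
    included_by="person(joseph_perry_phillips)",
    included_when="date(2008,may,27)",
    compute=lambda n, T, V: n * R * T / V,   # pressure in Pa (assumed use)
)

pressure = ideal_gas_law.compute(n=1.0, T=273.15, V=0.022414)
print(pressure)   # roughly one atmosphere, in pascals
```

Keeping authorship and dependencies on the record itself is what lets a later component ask where an assertion came from and what it rests on.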

Processes

Describe changes over time

Process classes
  Langley et al. call them “abstract processes”
  Whole class of similar events
  Arranged hierarchically
  Have assertions associated with them

Process instances
  Instance of process class
  Single event
  May be decomposed into finer process instances

Processes example

Motion
  Very abstract

1-D motion
  Specifies that motion is along one dimension only
  abstract means “fnc to be given in derived class”

1-D uniform acceleration
  Specifies uniform accel.
  abstract_const means “constant to be given in derived class”

1-D gravitational accel.
  Gives conditions
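
The chain from 1-D motion down to gravitational acceleration resembles an abstract class hierarchy, sketched here with Python's `abc` module. The class names, the value of g, and the listed conditions are assumptions; only the roles of "abstract" and "abstract_const" come from the slide.

```python
from abc import ABC, abstractmethod

class Motion1D(ABC):
    """1-D motion: the position function is 'abstract' (given in a derived class)."""
    @abstractmethod
    def position(self, t):
        ...

class UniformAcceleration1D(Motion1D):
    """Fixes the form of the function; the constant is still to be supplied."""
    acceleration = None   # plays the role of 'abstract_const'

    def __init__(self, x0, v0):
        self.x0, self.v0 = x0, v0

    def position(self, t):
        if self.acceleration is None:
            raise NotImplementedError("acceleration must be set in a derived class")
        return self.x0 + self.v0 * t + 0.5 * self.acceleration * t**2

class GravitationalAcceleration1D(UniformAcceleration1D):
    """Supplies the constant and the conditions under which it holds."""
    acceleration = -9.81  # m/s^2, assumed value near the earth's surface
    conditions = ["near the earth's surface", "air resistance negligible"]

drop = GravitationalAcceleration1D(x0=10.0, v0=0.0)   # a process instance
print(drop.position(1.0))
```

Each level adds only what it knows: the shape of the function, then the constant, then the conditions.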

Process assertion example

Decision tree to stochastically compute a child's genotype from its parents
  Non-leaves are tests
    Some are random
  Leaves are answers
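
A small sketch of such a tree, assuming a single gene with two alleles and simple Mendelian inheritance (an assumption; the slide's actual tree is not reproduced in the text). The function names and the two-letter genotype encoding are hypothetical.

```python
import random

def allele_from(parent, rng):
    """Walk the tree for one parent; `parent` is a two-letter genotype like 'Aa'."""
    if parent[0] == parent[1]:      # deterministic test: is the parent homozygous?
        return parent[0]            # leaf
    if rng.random() < 0.5:          # random test: fair coin flip
        return parent[0]            # leaf
    return parent[1]                # leaf

def child_genotype(mother, father, rng):
    """Combine one allele from each parent, written in a canonical order."""
    return "".join(sorted(allele_from(mother, rng) + allele_from(father, rng)))

rng = random.Random(42)
print([child_genotype("Aa", "Aa", rng) for _ in range(8)])
```

Passing the random generator in explicitly keeps the stochastic tests reproducible, which matters when comparing predicted and recorded values.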

Explanation Usage Object

Returns traces of reasoning

Akin to resolution refutation traces
  Given: A; B; A∧B -> C (or not(A)∨not(B)∨C)
  Prove: C
  Method:
    Assume not(C)
    Show contradiction
    C must be true!
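
The refutation method can be sketched for propositional clauses as follows. This is a naive saturation loop written for illustration: it reports only whether a contradiction is reached and does not build the trace itself. Literals are strings, with a leading `~` standing for not(...), which is a notational assumption.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """All resolvents of two clauses; a clause is a frozenset of literals."""
    return [frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
            for lit in c1 if negate(lit) in c2]

def refutes(clauses, goal):
    """True if the clauses together with not(goal) yield the empty clause."""
    known = set(clauses) | {frozenset({negate(goal)})}   # assume not(goal)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for resolvent in resolve(c1, c2):
                if not resolvent:
                    return True       # contradiction: goal must be true
                new.add(resolvent)
        if new <= known:
            return False              # nothing new can be derived
        known |= new

# Given: A; B; not(A) or not(B) or C
kb = [frozenset({"A"}), frozenset({"B"}), frozenset({"~A", "~B", "C"})]
print(refutes(kb, "C"))
```

The loop terminates because only finitely many clauses can be formed from a fixed set of literals.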

Explanation Usage Object (2)

Can look up in library
  If not found, calls assertion usage object

Works for:
  Justifying single values
  Justifying whole assertions

Optionally allow more than deduction
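
The look-up-then-fall-back behaviour might be sketched as below. The class name, the `derive` callback standing in for the assertion usage object, and the decision to store each new result back into the library are all assumptions made for this illustration.

```python
class ExplanationLibrary:
    """Solved-problem library with a fallback to de novo derivation."""

    def __init__(self, derive):
        self._derive = derive   # stands in for the assertion usage object
        self._solved = {}       # query -> stored explanation
        self.derivations = 0    # how often the fallback was needed

    def explain(self, query):
        if query not in self._solved:
            # Not in the library: derive from scratch, then remember it (assumed).
            self.derivations += 1
            self._solved[query] = self._derive(query)
        return self._solved[query]

def toy_derive(query):
    """Placeholder derivation that just labels the query."""
    return "derived(" + query + ")"

library = ExplanationLibrary(toy_derive)
first = library.explain("o1.attr1")
second = library.explain("o1.attr1")
print(first, library.derivations)
```

The second request is answered from the library, which is the sense in which reusing a solved problem is cheaper than de novo reasoning.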

History Trajectory Object

Does several things:

1. Decides which operator to do next based on:
  a) How successful they have been (operator id)
  b) Type of data (data id)
  c) Tactics
  d) Strategy

2. Keeps track of what's been tried before
  a) operator/data
  b) success/failure
  c) “by how much”
  d) who/when/why/etc.

3. Modifiable
  a) Learns best operators for given data
  b) PROGRAMMABLE?!? (Under these conditions create an operator that does this . . .)
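
Points 1 and 2 can be pictured as a log of trials plus a query over it. In this sketch the class name, the record fields, the operator names, and the neutral 0.5 score for untried operators are hypothetical choices, and only success rate and data type are used for ranking (tactics and strategy are left out).

```python
class HistoryTrajectory:
    """Records what was tried and uses the record to pick the next operator."""

    def __init__(self):
        self.records = []

    def record(self, operator, data_type, success, gain, who, when):
        self.records.append({
            "operator": operator, "data_type": data_type,
            "success": success, "gain": gain,      # gain: "by how much"
            "who": who, "when": when,
        })

    def success_rate(self, operator, data_type):
        trials = [r for r in self.records
                  if r["operator"] == operator and r["data_type"] == data_type]
        if not trials:
            return 0.5   # assumed neutral score for an untried combination
        return sum(r["success"] for r in trials) / len(trials)

    def best_operator(self, operators, data_type):
        return max(operators, key=lambda op: self.success_rate(op, data_type))

history = HistoryTrajectory()
history.record("fit_equation", "numeric", True, 0.30, "user", "2008-05-27")
history.record("fit_equation", "numeric", True, 0.10, "user", "2008-05-27")
history.record("search_mechanisms", "numeric", False, 0.00, "user", "2008-05-27")
print(history.best_operator(["fit_equation", "search_mechanisms"], "numeric"))
```

Because the choice depends on the data type as well as the operator, the same operator can be preferred for one kind of data and avoided for another.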

Next time

1. More detail about kb structure
  A “culture” for science
  Value hierarchy
  States and time
  Java/C++ simulators

2. Writing programs in Scienceomatic architecture
  Dynamically configured discovery operators in history trajectory object

3. Scienceomatic in action
  a) What might “normal science” look like?
  b) What might “revolution” look like?
