Knowledge Representation and Machine Learning

Stephen J. Guy
Overview
Recap of knowledge representation: history, first-order logic
Machine learning: ANNs, Bayesian networks, reinforcement learning
Summary
Knowledge Representation?
An ambiguous term: “the study of how to put knowledge into a form that a computer can reason with” (Russell and Norvig).
Originally coupled with linguistics, which led to philosophical analysis of language.
Knowledge Representation?
Cool Robots Futuristic Robots
Early Work
Blocks world (1972): SHRDLU. “Find a block which is taller than the one you are holding and put it in the box.”
SAINT (1963): closed-form calculus problems.
STUDENT (1967): “If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?”
Early Work - Theme
Limited domains (“microworlds”) allow precise rules.
Problems: generality and size.
1) Making rules is hard. 2) The state space is unbounded.
Generality
First-order logic is able to capture simple Boolean relations and facts:
∀x ∀y Brother(x,y) ⇒ Sibling(x,y)
∀x ∃y Loves(x,y)
Can capture lots of commonsense knowledge.
Not a cure-all.
First-order Logic - Problems
Faithfully captures facts, objects, and relations.
Problems:
Does not capture temporal relations. Does not handle probabilistic facts. Does not handle facts with degrees of truth.
Has been extended to: temporal logic, probability theory, fuzzy logic.
First-order Logic - Bigger Problem
Still requires lots of human effort (“knowledge engineering”):
Time consuming. Difficult to debug.
Size is still a problem.
Automated acquisition of knowledge is important.
Machine Learning
Sidesteps all of the previous problems
Represent Knowledge in a way that is immediately useful for decision making
3 specific examples Artificial Neural Networks (ANN) Bayesian Networks Reinforcement Learning
Artificial Neural Networks (ANN)
One of the first works in AI (McCulloch & Pitts, 1943). An attempt to mimic brain neurons: several binary inputs, one binary output.
Inputs: I1, I2, …; response weights: R1, R2, …; output: O
O = 1 if Σ_{n=0}^{N} Rn·In > threshold, else 0
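The threshold unit above can be sketched in a few lines of Python; the particular weight and threshold values below are illustrative choices for And and Or gates, not values from the slides.

```python
def mcp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fire (1) when the weighted input sum exceeds the threshold."""
    total = sum(r * i for r, i in zip(weights, inputs))
    return 1 if total > threshold else 0

# Illustrative weight/threshold settings for two logical connectives:
AND = lambda a, b: mcp_neuron([a, b], [1, 1], 1.5)  # needs both inputs on
OR = lambda a, b: mcp_neuron([a, b], [1, 1], 0.5)   # either input suffices
```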
Artificial Neural Networks (ANN)
Can be chained together to represent logical connectives (and, or, not) and to compute any computable function.
Hebb (1949) introduced a simple rule to modify connection strength (Hebbian learning).
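Hebb's rule can be sketched as follows: each connection is strengthened in proportion to the product of its input and the resulting output. The learning rate and input pattern here are illustrative.

```python
def hebbian_update(weights, inputs, output, rate=0.1):
    """Hebb's rule: strengthen each connection by rate * input * output."""
    return [w + rate * i * output for w, i in zip(weights, inputs)]

weights = [0.0, 0.0]
# Repeatedly present a pattern where input 0 and the output are both active:
for _ in range(5):
    weights = hebbian_update(weights, inputs=[1, 0], output=1)
# The active input's connection grows; the silent input's does not.
```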
Single-Layer Feed-forward ANNs (Perceptrons)
An input layer feeds directly into the output unit.
Can easily represent otherwise complex (linearly separable) functions: And, Or, the majority function.
Can learn based on gradient descent.
Cannot tell if 2 inputs are different! (Minsky, 1969)
Learning in Perceptrons
Replace the threshold function with a sigmoid g(x).
Define an error metric (sum of squared differences) and calculate its gradient with respect to each weight, which is proportional to Err · g′(in) · Xj.
Update rule: Wj ← Wj + α · Err · g′(in) · Xj
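The update rule above can be sketched for a single sigmoid unit learning the (linearly separable) Or function; the learning rate, epoch count, and training data are illustrative choices.

```python
import math

def g(x):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def train_unit(data, n_inputs, alpha=0.5, epochs=2000):
    """Gradient descent on one sigmoid unit: Wj += alpha * Err * g'(in) * Xj."""
    w = [0.0] * (n_inputs + 1)          # last weight acts as the bias
    for _ in range(epochs):
        for x, y in data:
            xb = list(x) + [1.0]        # append the bias input
            out = g(sum(wi * xi for wi, xi in zip(w, xb)))
            err = y - out
            gprime = out * (1.0 - out)  # g'(in) for the sigmoid
            w = [wi + alpha * err * gprime * xi for wi, xi in zip(w, xb)]
    return w

# Learn Or, which is linearly separable:
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_unit(data, 2)
predict = lambda x: g(sum(wi * xi for wi, xi in zip(w, list(x) + [1.0])))
```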
Multi Layer feed-forward ANNs
Breaks free of the limitations of perceptrons.
Simple gradient descent no longer works for learning.
Structure: input layer, hidden layer, output unit.
Learning in Multilayer ANNs (1/2)
Backpropagation: treat the top level just like a single-layer ANN, then diffuse the error down the network based on the input strength from each hidden node.
Learning in Multilayer ANNs (2/2)
Δi = Erri · g′(ini)
Wj,i ← Wj,i + α · aj · Δi
Δj = g′(inj) · Σi Wj,i · Δi
Wk,j ← Wk,j + α · ak · Δj
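A sketch of these updates on a tiny 2-2-1 network learning Xor (the function a single-layer network cannot represent); the network shape, random seed, learning rate, and epoch count are all illustrative choices.

```python
import math, random

random.seed(0)
g  = lambda x: 1.0 / (1.0 + math.exp(-x))
dg = lambda out: out * (1.0 - out)  # g'(in), expressed via the unit's output

# Weights: input->hidden (each row includes a bias) and hidden->output (incl. bias).
W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
W2 = [random.uniform(-1, 1) for _ in range(3)]
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # Xor

def forward(x):
    a = [g(w[0] * x[0] + w[1] * x[1] + w[2]) for w in W1]  # hidden activations
    return a, g(W2[0] * a[0] + W2[1] * a[1] + W2[2])

def sse():
    """Sum of squared errors over the training set."""
    return sum((y - forward(x)[1]) ** 2 for x, y in data)

before = sse()
alpha = 0.5
for _ in range(5000):
    for x, y in data:
        a, out = forward(x)
        delta_o = (y - out) * dg(out)                      # Err_i * g'(in_i)
        deltas_h = [dg(a[j]) * W2[j] * delta_o for j in range(2)]
        for j in range(2):                                 # W_{j,i} += alpha * a_j * delta_i
            W2[j] += alpha * a[j] * delta_o
        W2[2] += alpha * delta_o                           # output bias
        for j in range(2):                                 # W_{k,j} += alpha * a_k * delta_j
            for k in range(2):
                W1[j][k] += alpha * x[k] * deltas_h[j]
            W1[j][2] += alpha * deltas_h[j]                # hidden bias
after = sse()
# Training error drops as backpropagation fits Xor.
```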
ANN - Summary
Single-layer ANNs (perceptrons) can capture linearly separable functions.
Multi-layer ANNs can capture much more complex functions and can be effectively trained using back-propagation.
Not a silver bullet: How to avoid over-fitting? What shape should the network be? Network values are meaningless to humans.
ANN – In Robots (Simple)
Can easily be set up as a robot brain: input = sensors, output = motor control.
A simple robot learns to avoid bumps.
ANN – In Robots (Complex)
Autonomous Land Vehicle In a Neural Network (ALVINN): a CMU project that learned to drive from humans.
A 32x30 “retina”, 5 hidden units, 30 output nodes.
Capable of driving itself after 2-3 minutes of training.
Bayesian Networks
Combines advantages of basic logic and ANNs.
Allows for “efficient representation of, and rigorous reasoning with, uncertain knowledge” (R&N).
Allows for learning from experience.
Bayes’ Rule
P(b|a) = P(a|b)·P(b)/P(a) = nrm⟨P(a|b)·P(b), P(a|~b)·P(~b)⟩
Meningitis example (from R&N): s = stiff neck, m = has meningitis.
P(s|m) = 0.5, P(m) = 1/50000, P(s) = 1/20
P(m|s) = P(s|m)·P(m)/P(s) = 0.5·(1/50000)/(1/20) = 0.0002
Diagnostic knowledge is more fragile than causal knowledge.
Bayesian Networks
Allows us to chain together more complex relations.
Creating the network is not necessarily easy: create a fully connected network, cluster groups with high correlation together, and find the probabilities using rejection sampling.
Example network: Meningitis → Stiff Neck
P(M) = 1/50000
P(S | M = T) = 0.5, P(S | M = F) = 1/20
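Rejection sampling on this two-node network can be sketched as follows: draw (M, S) samples from the prior and conditional table, keep only samples consistent with the evidence (a stiff neck), and estimate P(M|S) from the survivors. The sample count and seed are illustrative.

```python
import random

random.seed(1)
P_M = 1 / 50000                         # prior P(Meningitis)
P_S_GIVEN = {True: 0.5, False: 1 / 20}  # P(StiffNeck | Meningitis)

def sample():
    """Draw one (meningitis, stiff_neck) sample from the network."""
    m = random.random() < P_M
    s = random.random() < P_S_GIVEN[m]
    return m, s

# Keep only samples where the evidence (stiff neck) holds:
kept = [m for m, s in (sample() for _ in range(1_000_000)) if s]
estimate = sum(kept) / len(kept)  # noisy estimate of P(M|S); true value is 0.0002
```

Because the evidence is rare and the query event rarer still, this estimate is noisy; that inefficiency is the usual argument for smarter samplers such as likelihood weighting.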
Bayesian Networks (Temporal Models)
More complex Bayesian networks are possible; time can be taken into account.
Imagine predicting if it will rain tomorrow, based only on whether your co-worker brings in an umbrella.
Network: Rain_{t-1} → Rain_t → Rain_{t+1}, with Rain_t → Umbrella_t at each time step.
Bayesian Networks (Temporal Models)
4 possible inference tasks based on this knowledge:
Filtering: computing a belief about the current state.
Prediction: computing a belief about a future state.
Smoothing: improving knowledge of past states using hindsight (forward-backward algorithm).
Most likely explanation: finding the single most likely explanation for a set of observations (Viterbi).
Bayesian Networks (Temporal Models)
Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1).
P(R0) = ⟨0.5, 0.5⟩ (0.5 for R0 = T, 0.5 for R0 = F)
P(R1) = P(R1|R0)·P(R0) + P(R1|~R0)·P(~R0) = 0.7·0.5 + 0.3·0.5 → ⟨0.5, 0.5⟩
P(R1|U1) = nrm(P(U1|R1)·P(R1)) = nrm⟨0.9·0.5, 0.2·0.5⟩ = nrm⟨0.45, 0.1⟩ = ⟨0.818, 0.182⟩
Bayesian Networks (Temporal Models)
Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1).
P(R2|U1) = P(R2|R1)·P(R1|U1) + P(R2|~R1)·P(~R1|U1) = 0.7·0.818 + 0.3·0.182 = 0.627 → ⟨0.627, 0.373⟩
P(R2|U2,U1) = nrm(P(U2|R2)·P(R2|U1)) = nrm⟨0.9·0.627, 0.2·0.373⟩ = nrm⟨0.564, 0.075⟩ = ⟨0.883, 0.117⟩
On the 2nd day of seeing the umbrella, we are more confident that it is raining.
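The two-day calculation above is an instance of forward filtering (predict, then update); a sketch with the same transition and sensor probabilities:

```python
T = {True: 0.7, False: 0.3}       # P(Rain_t = T | Rain_{t-1})
SENSOR = {True: 0.9, False: 0.2}  # P(Umbrella | Rain_t)

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

belief = {True: 0.5, False: 0.5}   # P(R0)
for saw_umbrella in [True, True]:  # U1 = 1, U2 = 1
    # Prediction step: sum over the previous state.
    predicted = {
        r: sum(belief[p] * (T[p] if r else 1 - T[p]) for p in belief)
        for r in (True, False)
    }
    # Update step: weight by the sensor model and renormalize.
    likelihood = SENSOR if saw_umbrella else {k: 1 - v for k, v in SENSOR.items()}
    belief = normalize({r: likelihood[r] * predicted[r] for r in predicted})
# belief[True] comes out near 0.883, matching the slide.
```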
Bayesian Networks - Summary
Bayesian networks are able to capture some important aspects of human knowledge representation and use: uncertainty and adaptation.
There are still difficulties in network design, but overall it is a powerful tool:
Meaningful values in the network. Probabilistic logical reasoning.
Bayesian Networks in Robotics
Applications: speech recognition, inference, sensors, computer vision, SLAM, estimating human poses.
Example: a robot going through a doorway using Bayesian networks (University of the Basque Country).
Reinforcement Learning
How much can we take the human out of the loop?
How do humans/animals do it? Genes, pain, pleasure.
Simply define rewards/punishments and let the agent figure out all the rest.
Reinforcement Learning - Example
R(s) = reward of state s: R(goal) = 1, R(pitfall) = -1, R(anything else) = ?
Attempts to move forward succeed with probability 0.8 but may slip left or right with probability 0.1 each.
Many (~262,000) possible policies; different policies are optimal depending on the value of R(anything else).
Reinforcement Learning - Policy
The slide shows the optimal policy for R(s) = -0.04.
Given a policy, how can an agent evaluate U(s), the utility of a state? (Passive reinforcement learning)
Adaptive Dynamic Programming (ADP). Temporal Difference learning (TD).
With only an environment, how can an agent develop a policy? (Active reinforcement learning)
Q-learning.
Reinforcement Learning - Utility
U(s) = R(s) + Σ_{s'} P(s')·U(s')
ADP: update all U(s) based on each new observation.
TD: update U(s) only for the last state change.
Ideally U(s) = R(s) + U(s'), but s' is probabilistic, so use:
U(s) ← U(s) + α·(R(s) + U(s') - U(s))
α decays from 1 to 0 as a function of the number of times the state is visited, and U(s) is then guaranteed to converge to the correct value.
[Figure: the 4x3 grid world with learned utilities such as 0.812, 0.868, 0.918 in the row nearest the +1 terminal, plus the +1 and -1 terminal states.]
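The TD update can be sketched on a toy problem; the 3-state chain below (with the slides' -0.04 step reward and a +1 terminal) is an illustrative stand-in for the 4x3 grid, and the episode count is an illustrative choice.

```python
R = {"A": -0.04, "B": -0.04, "goal": 1.0}
NEXT = {"A": "B", "B": "goal"}         # transitions under a fixed policy
U = {"A": 0.0, "B": 0.0, "goal": 1.0}  # a terminal state's utility is its reward
visits = {"A": 0, "B": 0}

for _ in range(200):                   # 200 episodes starting from A
    s = "A"
    while s != "goal":
        s2 = NEXT[s]
        visits[s] += 1
        alpha = 1.0 / visits[s]        # decays from 1 toward 0, as on the slide
        U[s] += alpha * (R[s] + U[s2] - U[s])  # TD update
        s = s2
# U("B") converges to R(B) + U(goal) = 0.96, and U("A") to about R(A) + U(B) = 0.92.
```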
Reinforcement Learning – Policy
Ideally agents can create their own policies.
Exploration: agents must be rewarded for exploring as well as for taking the best known path.
Adaptive Dynamic Programming (ADP): can be achieved by changing U(s) to U′(s), where U′(s) = (n < N ? Max_Reward : U(s)). The agent must also update its transition model.
Temporal Difference learning (TD): no changes to the utility calculation! Can explore by balancing utility and novelty (like ADP), or can choose random directions at a decreasing rate over time.
Both converge on the optimal value.
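Q-learning with an exploration rate that decays over time (the "random directions at a decreasing rate" idea above) can be sketched on an illustrative 5-state corridor, where the agent must learn to walk right toward a +1 goal; the corridor, seed, and parameters are all illustrative.

```python
import random

random.seed(0)
N, GOAL = 5, 4
ACTIONS = (-1, +1)                      # step left or step right
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma = 0.5, 0.9

for episode in range(500):
    eps = 1.0 / (1 + episode)           # exploration rate decays over time
    s = 0
    while s != GOAL:
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N - 1)  # walls clamp the position
        r = 1.0 if s2 == GOAL else 0.0
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The greedy policy learned from Q points right in every non-goal state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)}
```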
Reinforcement Learning in Robotics
Robot control: discretize the workspace, then search over policies.
The Pegasus system (Ng, Stanford) learned how to control robots better than human pilots with remote control.
Summary
3 different general learning approaches:
Artificial Neural Networks: good for learning correlations between inputs and outputs; little human work.
Bayesian Networks: good for handling uncertainty and noise; human work optional.
Reinforcement Learning: good for evaluating and generating policies/behaviors; can handle complex tasks; little human work.
References
1. Russell S, Norvig P (1995) Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. Englewood Cliffs, New Jersey. (http://aima.cs.berkeley.edu/)
2. Mitchell, Thomas. Machine Learning. McGraw Hill, 1997. (http://www.cs.cmu.edu/~tom/mlbook.html)
3. Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning. Cambridge, MA: MIT Press, 1998. (http://www.cs.ualberta.ca/~sutton/book/the-book.html)
4. Hecht-Nielsen, R. "Theory of the backpropagation neural network." Neural Networks 1 (1989): 593-605. (http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=3401&arnumber=118638)
5. Batavia, P., Pomerleau, D., and Thorpe, C. Tech. report CMU-RI-TR-96-31, Robotics Institute, Carnegie Mellon University, October 1996. (http://www.ri.cmu.edu/projects/project_160.html)
6. Jung, D.J., Kwon, K.S., and Kim, H.J. (Korea). "Bayesian Network based Human Pose Estimation." (http://www.actapress.com/PaperInfo.aspx?PaperID=23199)
7. Lewis, Frank L. "Neural Network Control of Robot Manipulators." IEEE Expert: Intelligent Systems and Their Applications, vol. 11, no. 3, pp. 64-75, June 1996. (http://doi.ieeecomputersociety.org/10.1109/64.506755)