Instance Based Learning
Bob DurrantSchool of Computer Science
University of Birmingham
(Slides: Dr Ata Kabán)
Instance-based learning

• One way of solving tasks of approximating discrete- or real-valued target functions
• We have training examples (xn, f(xn)), n = 1..N
• Key idea:
  – just store the training examples
  – when a test example is given, find the closest matches
“Nearest Neighbours”

• 1-Nearest neighbour:
  – given a query instance xq
  – locate the nearest training example xn
  – then f(xq) := f(xn)
• K-Nearest neighbour:
  – given a query instance xq
  – locate the k nearest training examples
  – if the target function is discrete-valued, take a vote among its k nearest neighbours
  – if the target function is real-valued, take the mean of the f values of the k nearest neighbours:

      f(xq) := (1/k) · Σ_{i=1}^{k} f(xi)
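The two variants above can be sketched in a few lines of pure Python. This is a minimal illustration, not the lecture's code; the function and variable names (`knn_classify`, `knn_regress`, etc.) are our own.

```python
import math
from collections import Counter

def euclidean(a, b):
    # straight-line distance between two points given as tuples of numbers
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def k_nearest(query, examples, k):
    # examples: list of (x, f(x)) pairs; return the k pairs closest to query
    return sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]

def knn_classify(query, examples, k):
    # discrete-valued target: majority vote among the k nearest neighbours
    votes = Counter(fx for _, fx in k_nearest(query, examples, k))
    return votes.most_common(1)[0][0]

def knn_regress(query, examples, k):
    # real-valued target: mean of the k nearest neighbours' f values
    return sum(fx for _, fx in k_nearest(query, examples, k)) / k
```

For example, with training points clustered around (0, 0) labelled "A" and around (5, 5) labelled "B", `knn_classify((0.5, 0.5), examples, 3)` returns "A".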
The distance between examples

• We need a measure of distance in order to know which examples are the neighbours
• Assume that we have T attributes for the learning problem. Then one example point x has elements xt ∈ R, t = 1,…,T.
• The distance between two points xi and xj is usually defined to be the Euclidean distance:

      d(xi, xj) = sqrt( Σ_{t=1}^{T} (xit − xjt)² )
Characteristics of Instance-Based Learning

• An instance-based learner is a so-called lazy learner, which does all the work when the test example is presented. This is as opposed to eager learners, which build a parameterised compact model of the target.
• It produces a local approximation to the target function (a different one for each test instance)
When to consider Nearest Neighbour algorithms?
• Instances map to points in R^n
• Not more than, say, 20 attributes per instance
• Lots of training data
• Advantages:
  – Training is very fast
  – Can learn complex target functions
  – Don’t lose information
• Disadvantages:
  – ? (we will see them shortly…)
Training data

Number | Lines | Line types | Rectangles | Colours | Mondrian?
1      | 6     | 1          | 10         | 4       | No
2      | 4     | 2          | 8          | 5       | No
3      | 5     | 2          | 7          | 4       | Yes
4      | 5     | 1          | 8          | 4       | Yes
5      | 5     | 1          | 10         | 5       | No
6      | 6     | 1          | 8          | 6       | Yes
7      | 7     | 1          | 14         | 5       | No

Test instance:
8      | 7     | 2          | 9          | 4       | ?
Keep data in normalised form

One way to normalise the data is to replace each attribute value xt by

      x't = (xt − μt) / σt

where μt is the mean of the t-th attribute and σt is its standard deviation, both computed over the training data.
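As a quick sanity check of this formula, the sketch below normalises the "Lines" attribute of the seven training examples. The slides' table is reproduced when the *population* standard deviation (divide by N, not N−1) is used; that detail is our inference from the numbers, not stated on the slides.

```python
import math

# "Lines" attribute of the 7 training examples from the table above
lines = [6, 4, 5, 5, 5, 6, 7]

mean = sum(lines) / len(lines)
# population standard deviation: divide by N
std = math.sqrt(sum((v - mean) ** 2 for v in lines) / len(lines))

normalised = [(v - mean) / std for v in lines]
# example 1 (6 lines) comes out near 0.632, example 7 (7 lines) near 1.739,
# matching the normalised table on the next slide
```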
Normalised training data

Number | Lines  | Line types | Rectangles | Colours | Mondrian?
1      | 0.632  | -0.632     | 0.327      | -1.021  | No
2      | -1.581 | 1.581      | -0.588     | 0.408   | No
3      | -0.474 | 1.581      | -1.046     | -1.021  | Yes
4      | -0.474 | -0.632     | -0.588     | -1.021  | Yes
5      | -0.474 | -0.632     | 0.327      | 0.408   | No
6      | 0.632  | -0.632     | -0.588     | 1.837   | Yes
7      | 1.739  | -0.632     | 2.157      | 0.408   | No

Test instance:
8      | 1.739  | 1.581      | -0.131     | -1.021  | ?
Distances of the test instance from the training data

Example | Distance from test | Mondrian?
1       | 2.517              | No
2       | 3.644              | No
3       | 2.395              | Yes
4       | 3.164              | Yes
5       | 3.472              | No
6       | 3.808              | Yes
7       | 3.490              | No

Classification:
1-NN | Yes
3-NN | Yes
5-NN | No
7-NN | No
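The whole worked example can be reproduced mechanically. The sketch below takes the normalised attribute values from the table above, ranks the training examples by Euclidean distance from the test instance, and votes for k = 1, 3, 5, 7; names such as `ranked` and `vote` are our own.

```python
import math

# normalised training data: (Lines, Line types, Rectangles, Colours) -> Mondrian?
train = [
    ((0.632, -0.632, 0.327, -1.021), "No"),
    ((-1.581, 1.581, -0.588, 0.408), "No"),
    ((-0.474, 1.581, -1.046, -1.021), "Yes"),
    ((-0.474, -0.632, -0.588, -1.021), "Yes"),
    ((-0.474, -0.632, 0.327, 0.408), "No"),
    ((0.632, -0.632, -0.588, 1.837), "Yes"),
    ((1.739, -0.632, 2.157, 0.408), "No"),
]
test = (1.739, 1.581, -0.131, -1.021)  # normalised test instance (number 8)

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# training examples sorted by distance from the test instance
ranked = sorted(train, key=lambda ex: dist(test, ex[0]))

def vote(k):
    # majority label among the k nearest neighbours
    labels = [lab for _, lab in ranked[:k]]
    return max(set(labels), key=labels.count)
```

Running `vote(k)` for k = 1, 3, 5, 7 gives Yes, Yes, No, No, and the nearest example (number 3) is at distance ≈ 2.395, agreeing with the tables above.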
What if the target function is real valued?

• The k-nearest neighbour algorithm then just calculates the mean of the f values of the k nearest neighbours
Variant of kNN: Distance-Weighted kNN

• We might want to weight nearer neighbours more heavily:

      f(xq) := ( Σ_{i=1}^{k} wi f(xi) ) / ( Σ_{i=1}^{k} wi ),   where wi = 1 / d(xq, xi)²

• Then it makes sense to use all training examples instead of just k (Shepard’s method)
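The weighted formula above can be sketched as follows. This is an illustrative implementation with our own names; note the guard for the case d = 0, where the query coincides with a training point and the weight would be undefined.

```python
import math

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def weighted_knn(query, examples, k):
    # distance-weighted k-NN regression with w_i = 1 / d(x_q, x_i)^2;
    # with k = len(examples) every training point contributes (Shepard's method)
    ranked = sorted(examples, key=lambda ex: dist(query, ex[0]))[:k]
    if dist(query, ranked[0][0]) == 0:
        return ranked[0][1]  # exact match: return the stored f value
    weights = [1.0 / dist(query, x) ** 2 for x, _ in ranked]
    return sum(w * fx for w, (_, fx) in zip(weights, ranked)) / sum(weights)
```

For example, with training points f(0) = 0 and f(2) = 2, a query at 0.5 is pulled strongly towards the much closer point: the prediction is 0.2 rather than the unweighted mean of 1.0.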
Difficulties with k-nearest neighbour algorithms
• Have to calculate the distance of the test case from all training cases
• There may be irrelevant attributes amongst the attributes – curse of dimensionality
Case-based reasoning (CBR)

• CBR is an advanced form of instance-based learning, applied to more complex instance objects
• Objects may include complex structural descriptions of cases & adaptation rules
Case-based Reasoning (CBR)

• CBR cannot use Euclidean distance measures
• Must define distance measures for those complex objects instead (e.g. semantic nets)
• CBR tries to model human problem-solving
  – uses past experience (cases) to solve new problems
  – retains solutions to new problems
• CBR is an ongoing area of machine learning research with many applications
Applications of CBR

• Design
  – landscape, building, mechanical, conceptual design of aircraft sub-systems
• Planning
  – repair schedules
• Diagnosis
  – medical
• Adversarial reasoning
  – legal
CBR process

[Flowchart: a New Case is matched against the Case Base to Retrieve the Closest Case from the Matched Cases. If adaptation is needed, Knowledge and Adaptation rules are used to Adapt it (Reuse) and Suggest a solution; the solution is Revised, and the confirmed case is Retained (Learn) back into the Case Base.]
CBR example: Property pricing

Case | Location code | Bedrooms | Recep. rooms | Type     | Floors | Condition | Price (£)
1    | 8             | 2        | 1            | terraced | 1      | poor      | 20,500
2    | 8             | 2        | 2            | terraced | 1      | fair      | 25,000
3    | 5             | 1        | 2            | semi     | 2      | good      | 48,000
4    | 5             | 1        | 2            | terraced | 2      | good      | 41,000

Test instance:
5    | 7             | 2        | 2            | semi     | 1      | poor      | ???
How rules are generated

• There is no unique way of doing it. Here is one possibility:
• Examine cases and look for ones that are almost identical
  – case 1 and case 2:
    R1: If recep-rooms changes from 2 to 1 then reduce price by £5,000
  – case 3 and case 4:
    R2: If Type changes from semi to terraced then reduce price by £7,000
Matching

• Comparing the test instance with each case:
  – matches(5,1) = 3
  – matches(5,2) = 3
  – matches(5,3) = 2
  – matches(5,4) = 1

Estimate: the price of case 5 is £25,000
Adapting

• Reverse rule 2:
  – if type changes from terraced to semi then increase price by £7,000
• Apply reversed rule 2:
  – new estimate of the price of property 5 is £32,000
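The matching and adapting steps above can be sketched as code. This is a toy reconstruction: the dictionary keys (`loc`, `bed`, `recep`, …) are our own shorthand for the table's columns, and the tie between cases 1 and 2 is broken in favour of case 2 to follow the slides; how to break such ties is a genuine design choice in CBR.

```python
# property cases from the pricing table
cases = [
    {"loc": 8, "bed": 2, "recep": 1, "type": "terraced", "floors": 1, "cond": "poor", "price": 20500},
    {"loc": 8, "bed": 2, "recep": 2, "type": "terraced", "floors": 1, "cond": "fair", "price": 25000},
    {"loc": 5, "bed": 1, "recep": 2, "type": "semi", "floors": 2, "cond": "good", "price": 48000},
    {"loc": 5, "bed": 1, "recep": 2, "type": "terraced", "floors": 2, "cond": "good", "price": 41000},
]
query = {"loc": 7, "bed": 2, "recep": 2, "type": "semi", "floors": 1, "cond": "poor"}

def matches(case, query):
    # count attributes on which the case agrees with the query
    return sum(case[k] == query[k] for k in query)

best_score = max(matches(c, query) for c in cases)
candidates = [c for c in cases if matches(c, query) == best_score]  # cases 1 and 2 tie at 3
best = candidates[-1]  # the slides pick case 2 (price £25,000)

estimate = best["price"]
# Adapt: apply rule R2 in reverse (terraced -> semi adds £7,000)
if best["type"] == "terraced" and query["type"] == "semi":
    estimate += 7000  # new estimate: £32,000
```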
Learning

• So far we have a new case and an estimated price
  – nothing is added yet to the case base
• If we later find the house sold for £35,000, then the case would be added
  – we could also add a new rule:
    if location changes from 8 to 7, increase price by £3,000
Problems with CBR

• How should cases be represented?
• How should cases be indexed for fast retrieval?
• How can good adaptation heuristics be developed?
• When should old cases be removed?
Advantages

• A local approximation is found for each test case
• Knowledge is in a form understandable to human beings
• Fast to train
Lazy and Eager Learning

• Lazy: wait for the query before generalising
  – k-Nearest Neighbour, Case-based reasoning
• Eager: generalise before seeing the query
  – Radial Basis Function Networks, ID3, …
• Does it matter?
  – An eager learner must create a global approximation
  – A lazy learner can create many local approximations