
Page 1: Geometric Learning Algorithms

Geometric Learning Algorithms

Stephen M. Omohundro

International Computer Science Institute

Berkeley, California

• Density estimation, classification, mappings

• Average case algorithms

• K-d trees and nearest neighbors

• Balltrees and boxtrees

• Robot arm task

• Bumptrees and mapping learning

Page 2: Geometric Learning Algorithms

Geometric Learning Algorithms

• Geometric - Mathematics

• Learning - Statistics

• Algorithms - Computer Science

Page 3: Geometric Learning Algorithms

Some Basic Tasks

• Density Estimation

• Classification learning

• Smooth nonlinear mapping learning

[Figure: examples of classification learning (classes A, B, C) and mapping learning (output vs. input).]

Page 4: Geometric Learning Algorithms

Density estimation by Histogram

[Figure: scattered sample points and their histogram estimate.]

• Bins too large -> washes out the fine structure

• Bins too small -> only one or no samples per bin

• Correct size for one region may be incorrect for another
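As a concrete illustration of these trade-offs, here is a minimal fixed-bin histogram estimator in one dimension; the bimodal sample and the bin counts are illustrative choices, not data from the talk:

```python
import numpy as np

def histogram_density(samples, num_bins, low=0.0, high=1.0):
    """Fixed-bin histogram density estimate on [low, high]."""
    bin_width = (high - low) / num_bins
    counts, _ = np.histogram(samples, bins=num_bins, range=(low, high))
    # Normalize so the estimate integrates to one over [low, high].
    return counts / (len(samples) * bin_width)

# A narrow peak next to a broad one: no single bin width serves both regions.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.3, 0.02, 500), rng.normal(0.7, 0.10, 500)])
coarse = histogram_density(data, num_bins=4)     # washes out the narrow peak
fine = histogram_density(data, num_bins=400)     # most bins hold 0 or 1 samples
```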

Page 5: Geometric Learning Algorithms

Volume Density Estimator

• Instead of taking regions of fixed volume and counting samples, take regions with a fixed number of samples and measure their volume

• E.g. take the inverse volume of the n-th nearest neighbor sphere

• Get small regions in high density parts of the space

• Large regions in low density parts of the space

• Need to compute the distance to the nth nearest neighbor.
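A minimal sketch of such an estimator, using a brute-force neighbor search in place of the tree structures developed later in the talk; the function and variable names are illustrative:

```python
import numpy as np
from math import gamma, pi

def knn_density(query, samples, n=5):
    """Estimate the density at `query` as n / (N * volume of the n-th nearest neighbor ball)."""
    samples = np.atleast_2d(samples)
    query = np.asarray(query, dtype=float)
    N, d = samples.shape
    # Distance to the n-th nearest neighbor (brute force; a k-d tree makes this O(log N) expected).
    r = np.sort(np.linalg.norm(samples - query, axis=1))[n - 1]
    # Volume of a d-dimensional ball of radius r.
    volume = (pi ** (d / 2) / gamma(d / 2 + 1)) * r ** d
    return n / (N * volume)

rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 2))          # dense near the origin, sparse far away
print(knn_density([0.0, 0.0], points))       # large estimate: a small ball suffices
print(knn_density([3.0, 3.0], points))       # small estimate: the ball must grow
```

The regions adapt automatically: the ball is small where samples are dense and large where they are sparse, which is exactly the behavior the fixed-bin histogram lacks.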

Page 6: Geometric Learning Algorithms

The beta distribution and non-parametric statistics.

Let S_α be a nested family of sets parameterized by α such that p(S_α) = α. If we draw N points and choose the smallest set S_α containing exactly n points, then the α's are distributed according to the Beta distribution:

p_n(\alpha) = \frac{N!}{(n-1)!\,(N-n)!}\,\alpha^{n-1}(1-\alpha)^{N-n},
\qquad
E(\alpha) = \frac{n}{N+1},
\qquad
\sigma^2(\alpha) \approx \frac{n}{N^2}.
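A small simulation of the simplest case, nested intervals S_α = [0, α] under a uniform distribution on [0, 1], checking the stated mean and variance; the parameters N, n, and the trial count are arbitrary illustrative choices:

```python
import numpy as np

# S_alpha = [0, alpha] with p uniform on [0, 1], so p(S_alpha) = alpha.
# The smallest S_alpha containing exactly n of N draws is [0, X_(n)], the n-th
# order statistic, which should follow a Beta(n, N - n + 1) distribution.
rng = np.random.default_rng(0)
N, n, trials = 100, 5, 20000
draws = rng.uniform(size=(trials, N))
alpha = np.sort(draws, axis=1)[:, n - 1]                 # n-th smallest draw per trial

print(alpha.mean(), n / (N + 1))                          # empirical mean vs. n/(N+1)
print(alpha.var(), n * (N - n + 1) / ((N + 1)**2 * (N + 2)))   # exact Beta variance, roughly n/N^2
```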

Page 7: Geometric Learning Algorithms

Nearest Neighbor Classification

[Figure: two-class example with regions labeled class 1 and class 2.]

• The Bayes-optimal approach would choose the class for which the posterior distribution is largest.

• Typically, however, we don't know the distributions.

• Nearest neighbor assigns a point the class of the nearest example.

• Asymptotically, its error is less than twice that of the Bayes-optimal classifier.

• Non-parametric.
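A minimal 1-nearest-neighbor classifier along these lines, again with brute-force search standing in for the tree-based search developed below; the toy data are illustrative:

```python
import numpy as np

def nearest_neighbor_classify(query, examples, labels):
    """Assign `query` the class of the closest stored example."""
    dists = np.linalg.norm(np.atleast_2d(examples) - np.asarray(query, dtype=float), axis=1)
    return labels[int(np.argmin(dists))]

# Two classes separated along the diagonal.
examples = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
labels = np.array([1, 1, 2, 2])
print(nearest_neighbor_classify([0.15, 0.15], examples, labels))   # -> 1
print(nearest_neighbor_classify([0.85, 0.85], examples, labels))   # -> 2
```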

Page 8: Geometric Learning Algorithms

Double bucket sort: an adaptive algorithm

• Sorting n values in [0,1] takes O(n log n) worst case.

• Would like good average-case performance

• But what distribution do you average over?

• Assume samples come from an unknown distribution, want good average with respect to that distribution.

• Bucket sort: Break interval into equal sized buckets, sort each one.


• Double bucket sort: Further break up intervals into a number of subintervals proportional to the samples which fell into them.

• Linear expected time O(n); beats quicksort by a factor of 2 in practice (see the sketch below).

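A sketch of the two-level idea described above, assuming the values lie in [0, 1); the choice of n first-level buckets and the per-subinterval sort are illustrative details rather than the talk's exact algorithm:

```python
def double_bucket_sort(values):
    """Two-level bucket sort for values in [0, 1)."""
    n = len(values)
    if n < 2:
        return sorted(values)
    # First pass: n equal-width buckets covering [0, 1).
    buckets = [[] for _ in range(n)]
    for v in values:
        buckets[min(int(v * n), n - 1)].append(v)
    result = []
    for i, bucket in enumerate(buckets):
        m = len(bucket)
        if m <= 1:
            result.extend(bucket)
            continue
        # Second pass: split bucket i's interval [i/n, (i+1)/n) into m
        # subintervals, i.e. a number proportional to its sample count.
        lo, width = i / n, 1.0 / n
        sub = [[] for _ in range(m)]
        for v in bucket:
            sub[min(int((v - lo) / width * m), m - 1)].append(v)
        for s in sub:
            result.extend(sorted(s))   # subintervals hold few items on average
    return result

print(double_bucket_sort([0.53, 0.12, 0.99, 0.51, 0.50, 0.131, 0.13]))
```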


Page 10: Geometric Learning Algorithms

K-d trees can find n nearest neighbors in log expected time.

If N samples are drawn from a non-vanishing, smooth distribution on a compact region, then we can use a k-d tree to find the n nearest neighbors of a new sample in O(log N) expected time, asymptotically for large N.

Idea: Asymptotically, the volume of the n nearest neighbor ball is Beta distributed. With the k-d construction, the tree regions are also Beta distributed, so the n nearest neighbor ball overlaps only a constant number of tree regions on average. Using branch and bound, we do a log-time initial search and then a constant amount of extra work.
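A compact sketch of this scheme: a k-d tree built by splitting the most spread dimension at its median, searched for a single nearest neighbor with branch and bound. It follows the idea on this slide but is an illustrative reimplementation, not the talk's code:

```python
import numpy as np

class KDNode:
    """K-d tree node: split the most spread dimension at its median."""
    def __init__(self, points):
        points = np.atleast_2d(np.asarray(points, dtype=float))
        if len(points) == 1:
            self.point, self.left, self.right = points[0], None, None
            return
        self.point = None
        self.dim = int(np.argmax(points.max(axis=0) - points.min(axis=0)))
        order = np.argsort(points[:, self.dim])
        mid = len(points) // 2
        self.split = points[order[mid], self.dim]
        self.left = KDNode(points[order[:mid]])
        self.right = KDNode(points[order[mid:]])

    def nearest(self, query, best=None, best_dist=np.inf):
        """Branch-and-bound nearest neighbor search."""
        if self.point is not None:                       # leaf
            d = np.linalg.norm(self.point - query)
            return (self.point, d) if d < best_dist else (best, best_dist)
        if query[self.dim] < self.split:
            near, far = self.left, self.right
        else:
            near, far = self.right, self.left
        best, best_dist = near.nearest(query, best, best_dist)
        # Descend the far side only if the splitting plane is closer than the best found so far.
        if abs(query[self.dim] - self.split) < best_dist:
            best, best_dist = far.nearest(query, best, best_dist)
        return best, best_dist

rng = np.random.default_rng(0)
tree = KDNode(rng.uniform(size=(1000, 3)))
point, dist = tree.nearest(rng.uniform(size=3))
```

The log-time behavior comes from the bound: once a good candidate is found near the leaf containing the query, most far branches fail the plane-distance test and are never visited.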

Page 11: Geometric Learning Algorithms

K-d Tree Search Dimension Dependence

[Plots for 4, 8, 9, 10, 11, and 12 dimensions: average nearest neighbor search time (secs) vs. number of points (up to 10000), comparing K-d search with brute-force search (the brute-force curve truncated).]


Page 13: Geometric Learning Algorithms

Balltree queries

Pruning queries: • Return leaf balls containing the query point (sketched in code below).

Branch and bound queries: • Return nearest leaf to query point.

Distribution independent average performance:

If leaves and queries are drawn from an underlying distribution p, then we would like good performance on average with respect to p.
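A minimal sketch of the pruning query, assuming a balltree whose nodes store a center, a radius, optional children, and an optional leaf payload; all names here are illustrative, and construction is deferred to the algorithms on the next slide:

```python
import numpy as np

class BallNode:
    """A balltree node: a ball (center, radius) enclosing every leaf ball below it."""
    def __init__(self, center, radius, left=None, right=None, leaf=None):
        self.center = np.asarray(center, dtype=float)
        self.radius = float(radius)
        self.left, self.right, self.leaf = left, right, leaf

def leaves_containing(node, query, found=None):
    """Pruning query: collect the leaf balls that contain the query point."""
    if found is None:
        found = []
    # If the query lies outside this node's ball, no leaf below it can contain the query.
    if np.linalg.norm(np.asarray(query, dtype=float) - node.center) > node.radius:
        return found
    if node.leaf is not None:
        found.append(node.leaf)
        return found
    leaves_containing(node.left, query, found)
    leaves_containing(node.right, query, found)
    return found

# Two leaf balls under a parent ball that encloses both of them.
a = BallNode([0.0, 0.0], 1.0, leaf="A")
b = BallNode([3.0, 0.0], 1.0, leaf="B")
root = BallNode([1.5, 0.0], 2.5, left=a, right=b)
print(leaves_containing(root, [0.2, 0.1]))     # -> ['A']
```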

Page 14: Geometric Learning Algorithms

Five balltree construction algorithms.

• K-d algorithm: split the most spread dimension at the median (sketched in code below).

• Top-down algorithm: split the best dimension at the point that minimizes the new ball volume.

• Insertion algorithm: insert each ball on-line at the minimum-volume insertion location.

• Cheap insertion algorithm: insert each ball on-line at a heuristically good location.

• Bottom-up algorithm: repeatedly pair the best two balls.
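A sketch of the first of these, the k-d style construction: recursively split the leaf-ball centers on the most spread dimension at the median, then bound each pair of children with their smallest enclosing ball. The other four algorithms differ only in how the split or insertion point is chosen; everything here is an illustrative reconstruction rather than the original implementation:

```python
import numpy as np

class Ball:
    def __init__(self, center, radius, left=None, right=None):
        self.center, self.radius = np.asarray(center, dtype=float), float(radius)
        self.left, self.right = left, right

def enclosing_ball(a, b):
    """Smallest ball containing balls a and b."""
    d = np.linalg.norm(b.center - a.center)
    if d + b.radius <= a.radius:                 # b already inside a
        return Ball(a.center, a.radius)
    if d + a.radius <= b.radius:                 # a already inside b
        return Ball(b.center, b.radius)
    radius = (d + a.radius + b.radius) / 2.0
    center = a.center + (radius - a.radius) * (b.center - a.center) / d
    return Ball(center, radius)

def build_kd_balltree(leaves):
    """K-d style construction: split leaf centers on the most spread dimension at the median."""
    if len(leaves) == 1:
        return leaves[0]
    centers = np.array([leaf.center for leaf in leaves])
    dim = int(np.argmax(centers.max(axis=0) - centers.min(axis=0)))
    order = np.argsort(centers[:, dim])
    mid = len(leaves) // 2
    left = build_kd_balltree([leaves[i] for i in order[:mid]])
    right = build_kd_balltree([leaves[i] for i in order[mid:]])
    node = enclosing_ball(left, right)
    node.left, node.right = left, right
    return node

rng = np.random.default_rng(0)
tree = build_kd_balltree([Ball(c, 0.05) for c in rng.uniform(size=(64, 2))])
```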

Page 15: Geometric Learning Algorithms

The balltrees produced from a Cantor set by the five construction algorithms.

[Figure panels: Insertion and K-d; Cheap insertion; Top down; Bottom up.]

Page 16: Geometric Learning Algorithms

Balltree volume and construction time vs. number of 2-d uniformly distributed point leaves.

[Plots: construction time and balltree volume vs. number of leaves (0 to 500), with curves for the top-down, cheap insertion, k-d, insertion, and bottom-up algorithms.]

Page 17: Geometric Learning Algorithms

Balltree volume and construction time vs. number of 2-d uniformly distributed point leaves.

[Plots: construction time and balltree volume vs. number of leaves (0 to 500); the top-down and bottom-up curves are labeled.]

Page 18: Geometric Learning Algorithms

Octbox image trees

289 leaves, depth 12, average depth 8.7

[Figure panels: edges; levels 0 through 7; level 12.]

Page 19: Geometric Learning Algorithms

Robot Arm Task

[Diagram: arm and camera system]

Kinematic space → Visual space

R³ → R⁶

Setup studied by Bartlett Mel in "Connectionist Robot Motion Planning"

Page 20: Geometric Learning Algorithms

Kinematic to Visual map testbed

[Screenshot of the testbed interface: sliders for the first joint [52], second joint [94], and third joint [135] (each 0–180), plus controls for the number of samples and the number of neighbors; predicted and actual arm views are shown side by side.]





Page 25: Geometric Learning Algorithms

RBF vs. Bumptree Learning Time

[Plot: learning time (secs) vs. number of training samples (50 to 350); curves for the Gaussian RBF and linear RBF learners are labeled.]



Page 28: Geometric Learning Algorithms

Function Evaluation Time

[Two plots: evaluation time (secs) vs. number of training samples, one up to 300 samples and one up to 10000; the Gaussian RBF and bumptree curves are labeled.]

Page 29: Geometric Learning Algorithms


What is a Bumptree?

A bumptree is a complete binary tree with a real valued function associated to each node such that an interior node's function is everywhere larger than the functions associated with the leaves beneath it.

[Figure: e.g. a ball-supported bump; panels show the 2-d leaf functions, the tree structure, and the tree functions for leaves a through f.]

Page 30: Geometric Learning Algorithms

Some bumptree functions

• k-d trees, oct-trees, boxtrees: 1 inside a rectangle, 0 outside

• balltrees: 1 inside a ball, 0 outside; or smooth bumps supported in a ball, 0 outside

• Gaussian mixtures: really store the quadratic form in the exponent

For fast access, parent functions can be a quadratic function of the distance from a coordinate partition on one side and 0 on the other.

Page 31: Geometric Learning Algorithms

Bumptree Queries

Pruning queries: • Return leaf functions which are larger than a given value at the query point (sketched in code below).

Branch and bound queries: • Return leaf functions which are within a given factor or constant of the largest leaf function value.

Can find nearest neighbors in log expected time over a wide variety of probability distributions. In practice, works well up to about 10 input dimensions for a few thousand uniformly distributed data points. Structure in the distribution can give speedups in arbitrarily high dimensions.
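A self-contained sketch of a bumptree with ball-supported bumps as the node functions, together with the pruning query described above; the representation and names are illustrative:

```python
import numpy as np

class BumpNode:
    """A bumptree node whose function is a ball-supported bump: value `height`
    inside the ball (center, radius) and 0 outside. For an interior node, the
    ball and height are chosen to dominate every leaf function beneath it."""
    def __init__(self, center, radius, height, left=None, right=None):
        self.center = np.asarray(center, dtype=float)
        self.radius, self.height = float(radius), float(height)
        self.left, self.right = left, right

    def value(self, x):
        inside = np.linalg.norm(np.asarray(x, dtype=float) - self.center) <= self.radius
        return self.height if inside else 0.0

def leaves_above(node, x, threshold, found=None):
    """Pruning query: leaf functions whose value at x exceeds `threshold`."""
    if found is None:
        found = []
    # The node function bounds every leaf below it, so a small value here prunes the subtree.
    if node.value(x) <= threshold:
        return found
    if node.left is None and node.right is None:
        found.append(node)
        return found
    leaves_above(node.left, x, threshold, found)
    leaves_above(node.right, x, threshold, found)
    return found

# Two leaf bumps under a root bump that dominates both of them.
leaf_a = BumpNode([0.0, 0.0], 0.5, 1.0)
leaf_b = BumpNode([2.0, 0.0], 0.5, 0.8)
root = BumpNode([1.0, 0.0], 1.6, 1.0, left=leaf_a, right=leaf_b)
print(leaves_above(root, [0.1, 0.0], threshold=0.5) == [leaf_a])   # True
```

The branch-and-bound query works the same way, except the pruning threshold is tightened as better leaf values are found.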


Page 33: Geometric Learning Algorithms

Bumptrees for Density Estimation, Classification and Constraint Learning

Use a bumptree to hold a normal mixture.

Density estimation by an adaptive kernel estimator (with merging).

Classification by Bayes' decision rule on density.

Constraint queries such as partial match:

(the product of Gaussians is Gaussian, so normal mixtures may be composed to form normal mixtures)
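The identity behind that closure property, stated for two Gaussian factors (a standard result quoted for reference, not taken from the slides):

```latex
\mathcal{N}(x;\mu_1,\Sigma_1)\,\mathcal{N}(x;\mu_2,\Sigma_2)
  = c\,\mathcal{N}(x;\mu,\Sigma),
\qquad
\Sigma = \bigl(\Sigma_1^{-1}+\Sigma_2^{-1}\bigr)^{-1},
\quad
\mu = \Sigma\bigl(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2\bigr),
\quad
c = \mathcal{N}(\mu_1;\mu_2,\Sigma_1+\Sigma_2).
```

Since c does not depend on x, multiplying a normal mixture term by term against a Gaussian constraint factor yields another (unnormalized) normal mixture, which is what makes partial-match queries composable.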