Lecture Slides for Introduction to Machine Learning 2e (Chapter 3)
ETHEM ALPAYDIN © The MIT Press, 2010
[email protected]
http://www.cmpe.boun.edu.tr/~ethem/i2ml2e



Probability and Inference

Result of tossing a coin is ∈ {Heads, Tails}
Random variable X ∈ {1, 0}
Bernoulli: P{X = 1} = p_o^X (1 − p_o)^(1 − X)
Sample: X = {x^t}, t = 1, …, N
Estimation: p̂_o = #{Heads} / #{Tosses} = Σ_t x^t / N
Prediction of next toss: Heads if p̂_o > 1/2, Tails otherwise
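As a quick illustration, the estimate and prediction rule above can be sketched in Python (the sample below is made up):

```python
# Estimate the Bernoulli parameter p_o from a sample of coin tosses
# (1 = Heads, 0 = Tails), then predict the next toss.
sample = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical sample of N = 10 tosses

p_hat = sum(sample) / len(sample)  # #{Heads} / #{Tosses}
prediction = "Heads" if p_hat > 0.5 else "Tails"
print(p_hat, prediction)  # 0.7 Heads
```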

Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)

Classification

Credit scoring: Inputs are income and savings. Output is low-risk vs high-risk.
Input: x = [x1, x2]^T, Output: C ∈ {0, 1}
Prediction:
  choose C = 1 if P(C = 1 | x1, x2) > 0.5, C = 0 otherwise
or
  choose C = 1 if P(C = 1 | x1, x2) > P(C = 0 | x1, x2), C = 0 otherwise


Bayes’ Rule

P(C | x) = p(x | C) P(C) / p(x)

posterior = likelihood × prior / evidence

P(C = 0 | x) + P(C = 1 | x) = 1
p(x) = p(x | C = 1) P(C = 1) + p(x | C = 0) P(C = 0)
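A minimal numeric sketch of the rule for two classes, with made-up priors and likelihoods:

```python
# Posterior P(C|x) = p(x|C) P(C) / p(x) for a two-class problem.
prior = {0: 0.6, 1: 0.4}       # P(C); hypothetical values
likelihood = {0: 0.2, 1: 0.5}  # p(x|C) at the observed x; hypothetical

evidence = sum(likelihood[c] * prior[c] for c in (0, 1))  # p(x)
posterior = {c: likelihood[c] * prior[c] / evidence for c in (0, 1)}
print(round(posterior[1], 3))  # 0.625
```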


Bayes’ Rule: K > 2 Classes

P(C_i | x) = p(x | C_i) P(C_i) / p(x) = p(x | C_i) P(C_i) / Σ_{k=1..K} p(x | C_k) P(C_k)

P(C_i) ≥ 0 and Σ_{i=1..K} P(C_i) = 1

choose C_i if P(C_i | x) = max_k P(C_k | x)
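The same computation for K classes, ending with the argmax choice (all numbers hypothetical):

```python
# P(C_i|x) = p(x|C_i) P(C_i) / sum_k p(x|C_k) P(C_k), then pick the argmax.
priors = [0.5, 0.3, 0.2]       # P(C_i); hypothetical, sum to 1
likelihoods = [0.1, 0.4, 0.3]  # p(x|C_i) at the observed x; hypothetical

evidence = sum(l * p for l, p in zip(likelihoods, priors))
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]
chosen = max(range(len(posteriors)), key=lambda i: posteriors[i])
print(chosen)  # 1
```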


Losses and Risks

Actions: α_i
Loss of α_i when the state is C_k: λ_ik
Expected risk (Duda and Hart, 1973):
  R(α_i | x) = Σ_{k=1..K} λ_ik P(C_k | x)
  choose α_i if R(α_i | x) = min_k R(α_k | x)
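With asymmetric losses the min-risk action can differ from the most probable class. A sketch with a hypothetical loss matrix and posteriors:

```python
# Expected risk R(α_i|x) = Σ_k λ_ik P(C_k|x); choose the min-risk action.
loss = [[0.0, 1.0],     # λ_1k: losses for action α_1
        [5.0, 0.0]]     # λ_2k: action α_2's mistake costs five times more
posterior = [0.8, 0.2]  # P(C_k|x); hypothetical

risks = [sum(l * p for l, p in zip(row, posterior)) for row in loss]
best = min(range(len(risks)), key=lambda i: risks[i])
print(best)  # 0
```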


Losses and Risks: 0/1 Loss

λ_ik = 0 if i = k, 1 if i ≠ k

R(α_i | x) = Σ_{k=1..K} λ_ik P(C_k | x) = Σ_{k≠i} P(C_k | x) = 1 − P(C_i | x)

For minimum risk, choose the most probable class.


Losses and Risks: Reject

λ_ik = 0 if i = k; λ if i = K + 1 (reject); 1 otherwise, with 0 < λ < 1

R(α_{K+1} | x) = Σ_{k=1..K} λ P(C_k | x) = λ
R(α_i | x) = Σ_{k≠i} P(C_k | x) = 1 − P(C_i | x)

choose C_i if P(C_i | x) > P(C_k | x) for all k ≠ i and P(C_i | x) > 1 − λ; otherwise reject
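A sketch of the reject rule, with hypothetical posteriors and λ:

```python
# Reject option: accept the argmax class only if its posterior exceeds
# 1 - λ; otherwise rejecting (with expected risk λ) is the cheaper action.
lam = 0.3                    # reject loss λ, 0 < λ < 1; hypothetical
posterior = [0.5, 0.3, 0.2]  # P(C_i|x); hypothetical

best = max(range(len(posterior)), key=lambda i: posterior[i])
decision = f"C{best + 1}" if posterior[best] > 1 - lam else "reject"
print(decision)  # reject, since 0.5 <= 1 - 0.3
```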


Discriminant Functions

g_i(x), i = 1, …, K
choose C_i if g_i(x) = max_k g_k(x)

g_i(x) can be:
  −R(α_i | x)
  P(C_i | x)
  p(x | C_i) P(C_i)

K decision regions R_1, …, R_K:
  R_i = { x | g_i(x) = max_k g_k(x) }


K = 2 Classes

Dichotomizer (K = 2) vs polychotomizer (K > 2)
g(x) = g_1(x) − g_2(x)
  choose C_1 if g(x) > 0, C_2 otherwise

Log odds: log [ P(C_1 | x) / P(C_2 | x) ]
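The log-odds dichotomizer in a couple of lines (posteriors hypothetical):

```python
import math

# g(x) = log P(C1|x)/P(C2|x); choose C1 if g(x) > 0.
p1, p2 = 0.7, 0.3  # hypothetical posteriors P(C1|x), P(C2|x)
g = math.log(p1 / p2)
choice = "C1" if g > 0 else "C2"
print(choice)  # C1
```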


Utility Theory

Probability of state k given evidence x: P(S_k | x)
Utility of α_i when the state is k: U_ik
Expected utility:
  EU(α_i | x) = Σ_k U_ik P(S_k | x)
  choose α_i if EU(α_i | x) = max_j EU(α_j | x)


Association Rules

Association rule: X → Y
People who buy/click/visit/enjoy X are also likely to buy/click/visit/enjoy Y.
A rule implies association, not necessarily causation.


Association Measures

Support (X → Y):
  P(X, Y) = #{customers who bought X and Y} / #{customers}

Confidence (X → Y):
  P(Y | X) = P(X, Y) / P(X)
           = #{customers who bought X and Y} / #{customers who bought X}

Lift (X → Y):
  P(X, Y) / (P(X) P(Y)) = P(Y | X) / P(Y)
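The three measures can be computed directly from a transaction list. A sketch over a hypothetical set of market baskets:

```python
# Support, confidence, and lift for the rule X -> Y,
# computed from a hypothetical list of market baskets.
baskets = [{"X", "Y"}, {"X"}, {"X", "Y", "Z"}, {"Y"}, {"X", "Y"}]
n = len(baskets)

p_x = sum("X" in b for b in baskets) / n          # P(X) = 4/5
p_y = sum("Y" in b for b in baskets) / n          # P(Y) = 4/5
p_xy = sum({"X", "Y"} <= b for b in baskets) / n  # P(X, Y) = 3/5

support = p_xy                   # 0.6
confidence = p_xy / p_x          # 0.75
lift = p_xy / (p_x * p_y)        # 0.9375: < 1, so X and Y are
print(support, confidence, lift)  # negatively associated here
```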


Apriori Algorithm (Agrawal et al., 1996)

For (X, Y, Z), a 3-item set, to be frequent (have enough support), (X, Y), (X, Z), and (Y, Z) should all be frequent.
If (X, Y) is not frequent, none of its supersets can be frequent.
Once we find the frequent k-item sets, we convert them to rules: X, Y → Z, … and X → Y, Z, …
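The pruning idea above can be sketched in a few lines of Python. This is a minimal level-wise sketch, not the paper's full algorithm; the transactions and support threshold are hypothetical:

```python
from itertools import combinations

# Level-wise frequent item-set mining with Apriori pruning:
# a candidate k-set is kept only if all its (k-1)-subsets are frequent.
transactions = [{"X", "Y"}, {"X", "Y", "Z"}, {"X", "Z"},
                {"Y", "Z"}, {"X", "Y", "Z"}]
min_support = 3  # absolute count; hypothetical threshold

def support(itemset):
    return sum(itemset <= t for t in transactions)

items = {i for t in transactions for i in t}
level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
all_frequent = set(level)
k = 2
while level:
    candidates = {a | b for a in level for b in level if len(a | b) == k}
    # Apriori pruning: every (k-1)-subset must itself be frequent.
    candidates = {c for c in candidates
                  if all(frozenset(s) in level for s in combinations(c, k - 1))}
    level = {c for c in candidates if support(c) >= min_support}
    all_frequent |= level
    k += 1

print(sorted(sorted(f) for f in all_frequent))
```

Here the three 2-item sets survive, but (X, Y, Z) appears in only two baskets and is pruned at k = 3, so the loop stops.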