perceptrons - university of texas at dallasvgogate/ai/fall16/grad/slides/perceptron.pdf · problems...
TRANSCRIPT
![Page 1: Perceptrons - University of Texas at Dallasvgogate/ai/fall16/grad/slides/perceptron.pdf · Problems with the Perceptron § Noise: if the data isn’t separable, weights might thrash](https://reader033.vdocuments.net/reader033/viewer/2022051723/5aabcd007f8b9a59658c4cf0/html5/thumbnails/1.jpg)
CS 6364: Perceptrons
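Linear Classifiers
§ Inputs are feature values
§ Each feature has a weight
§ Sum is the activation
§ If the activation is:
  § Positive, output +1
  § Negative, output -1
[Figure: features f1, f2, f3 are multiplied by weights w1, w2, w3, summed (Σ), and the sum is tested against > 0.]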
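The rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function names and the example numbers are made up, and treating an activation of exactly zero as -1 is an assumption, since the slide only covers the positive and negative cases.

```python
def activation(weights, features):
    """Weighted sum of the feature values: sum_i w_i * f_i."""
    return sum(w * f for w, f in zip(weights, features))


def classify(weights, features):
    """Output +1 for a positive activation, -1 otherwise.

    Assumption: an activation of exactly 0 is mapped to -1.
    """
    return 1 if activation(weights, features) > 0 else -1


# Hypothetical weights w1..w3 and feature values f1..f3, for illustration only.
w = [0.5, -1.0, 2.0]
f = [1.0, 1.0, 0.5]
print(activation(w, f))  # 0.5 - 1.0 + 1.0 = 0.5
print(classify(w, f))    # 1
```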
Weights
§ Binary case: compare features to a weight vector
§ Learning: figure out the weight vector from examples
# free : 2    YOUR_NAME : 0     MISSPELLED : 2    FROM_FRIEND : 0     ...
# free : 4    YOUR_NAME : -1    MISSPELLED : 1    FROM_FRIEND : -3    ...
# free : 0    YOUR_NAME : 1     MISSPELLED : 1    FROM_FRIEND : 1     ...
Dot product positive means the positive class
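As a worked example, the dot products for the three vectors on this slide can be computed as follows. The transcript does not say which vector plays which role, so this sketch assumes the one with negative entries is the weight vector and the other two are feature vectors of two examples. The names `example_a` and `example_b` are hypothetical, and the elided "..." entries are ignored.

```python
def dot(weights, features):
    """Dot product of two sparse vectors stored as dicts keyed by feature name."""
    return sum(weights.get(name, 0) * value for name, value in features.items())


# Assumption: the vector containing negative values is the weight vector.
weights = {"# free": 4, "YOUR_NAME": -1, "MISSPELLED": 1, "FROM_FRIEND": -3}
example_a = {"# free": 2, "YOUR_NAME": 0, "MISSPELLED": 2, "FROM_FRIEND": 0}
example_b = {"# free": 0, "YOUR_NAME": 1, "MISSPELLED": 1, "FROM_FRIEND": 1}

for name, feats in [("example_a", example_a), ("example_b", example_b)]:
    score = dot(weights, feats)
    label = "positive class" if score > 0 else "negative class"
    print(name, score, label)
# example_a: 4*2 + 1*2 = 10   -> positive class
# example_b: -1 + 1 - 3 = -3  -> negative class
```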
Decision Rules
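Binary Decision Rule
§ In the space of feature vectors
  § Examples are points
  § Any weight vector is a hyperplane
  § One side corresponds to Y = +1
  § Other corresponds to Y = -1
BIAS : -3    free : 4    money : 2    ...
[Figure: feature space with axes "free" and "money" (ticks 0, 1, 2); the decision boundary separates the +1 = SPAM side from the -1 = HAM side.]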
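To make the hyperplane concrete, the sketch below evaluates the slide's weights (BIAS : -3, free : 4, money : 2) at a few points. It assumes the BIAS feature always has value 1 and ignores the elided "..." weights, so the boundary is the line 4·free + 2·money = 3. The function name `predict` is hypothetical.

```python
WEIGHTS = {"BIAS": -3, "free": 4, "money": 2}


def predict(counts):
    """Classify a message from its word counts, e.g. {"free": 1, "money": 0}."""
    feats = dict(counts, BIAS=1)  # assumption: the bias feature is always 1
    score = sum(WEIGHTS.get(name, 0) * value for name, value in feats.items())
    return "SPAM" if score > 0 else "HAM"


print(predict({"free": 1, "money": 0}))  # -3 + 4 = 1  -> SPAM
print(predict({"free": 0, "money": 1}))  # -3 + 2 = -1 -> HAM
print(predict({"free": 0, "money": 2}))  # -3 + 4 = 1  -> SPAM
```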
Weight Updates
Learning: Binary Perceptron
§ Start with weights = 0
§ For each training instance:
  § Classify with current weights
  § If correct (i.e., y = y*), no change!
  § If wrong: adjust the weight vector by adding or subtracting the feature vector. Subtract if y* is -1.
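The steps above can be sketched as a short training loop. This is an illustrative sketch under stated assumptions: the function names, the `max_passes` stopping rule, the tie-break (activation 0 is classified as -1), and the toy data are all hypothetical. Adding `y_star * f` covers both cases: it adds the feature vector when y* is +1 and subtracts it when y* is -1.

```python
def predict(w, f):
    """Return +1 if the activation w . f is positive, else -1 (ties go to -1)."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else -1


def train_binary_perceptron(data, num_features, max_passes=100):
    """data: list of (feature_vector, y_star) pairs with y_star in {+1, -1}."""
    w = [0.0] * num_features                  # start with weights = 0
    for _ in range(max_passes):
        mistakes = 0
        for f, y_star in data:
            if predict(w, f) != y_star:       # wrong: adjust the weight vector
                w = [wi + y_star * fi for wi, fi in zip(w, f)]
                mistakes += 1
        if mistakes == 0:                     # a full pass with no mistakes
            break
    return w


# Toy separable data; the first feature is a constant bias of 1.
data = [([1, 2, 1], 1), ([1, 1, 2], 1), ([1, -1, -1], -1), ([1, -2, 0], -1)]
w = train_binary_perceptron(data, num_features=3)
print(w)                                         # [1.0, 2.0, 1.0]
print(all(predict(w, f) == y for f, y in data))  # True
```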
Examples: Perceptron
§ Separable Case
Multiclass Decision Rule
§ If we have multiple classes:
  § A weight vector for each class: $w_y$
  § Score (activation) of a class y: $w_y \cdot f(x)$
  § Prediction: highest score wins, $y = \arg\max_y\, w_y \cdot f(x)$
Binary = multiclass where the negative class has weight zero
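A minimal sketch of this decision rule follows, assuming (as is standard for linear classifiers) that a class's score is the dot product of its weight vector with the feature vector. The class names and weight values are invented for illustration. The second half illustrates the last line of the slide: a binary classifier behaves like a two-class one whose negative class has an all-zero weight vector.

```python
def score(w_y, f):
    """Activation of one class: dot product of its weights with the features."""
    return sum(w_y.get(name, 0) * value for name, value in f.items())


def predict(weights, f):
    """weights maps each class to its weight dict; the highest score wins."""
    return max(weights, key=lambda y: score(weights[y], f))


# Hypothetical classes and weights, for illustration only.
weights = {
    "sports":   {"BIAS": 1, "win": 2, "game": 3},
    "politics": {"BIAS": 0, "win": 1, "vote": 4},
}
f = {"BIAS": 1, "win": 1, "vote": 1}
print(predict(weights, f))  # sports scores 3, politics scores 5 -> politics

# Binary as a special case: the negative class has weight zero, so the
# positive class wins exactly when its activation is greater than 0.
binary = {"-1": {}, "+1": {"BIAS": -3, "free": 4, "money": 2}}
print(predict(binary, {"BIAS": 1, "free": 1}))   # score 1 > 0  -> +1
print(predict(binary, {"BIAS": 1, "money": 1}))  # score -1 < 0 -> -1
```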
Learning: Multiclass Perceptron
§ Start with all weights = 0
§ Pick up training examples one by one
§ Predict with current weights
§ If correct, no change!
§ If wrong: lower score of wrong answer, raise score of right answer
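The procedure above might look like the following in Python. This is a sketch under assumptions: lowering and raising a score is implemented by subtracting the feature vector from the wrongly predicted class and adding it to the correct class, which is the usual multiclass perceptron update. The bag-of-words `featurize` helper, the class labels, and the fixed number of passes are hypothetical.

```python
from collections import defaultdict


def featurize(text):
    """Hypothetical bag-of-words features plus a constant BIAS feature."""
    feats = {"BIAS": 1}
    for word in text.split():
        feats[word] = feats.get(word, 0) + 1
    return feats


def score(w_y, f):
    return sum(w_y.get(name, 0.0) * value for name, value in f.items())


def predict(weights, classes, f):
    # Ties are broken in favor of the class listed first.
    return max(classes, key=lambda y: score(weights[y], f))


def train(data, classes, passes=10):
    weights = {y: defaultdict(float) for y in classes}  # all weights = 0
    for _ in range(passes):
        for f, y_star in data:
            y = predict(weights, classes, f)
            if y != y_star:
                for name, value in f.items():
                    weights[y][name] -= value       # lower score of wrong answer
                    weights[y_star][name] += value  # raise score of right answer
    return weights


# Hypothetical labels, for illustration only.
classes = ["sports", "politics"]
data = [(featurize("win the vote"), "politics"),
        (featurize("win the election"), "politics"),
        (featurize("win the game"), "sports")]
weights = train(data, classes)
print(all(predict(weights, classes, f) == y for f, y in data))
```

Note that the shared words ("win", "the") end up with no net weight in this toy run; the distinguishing words carry the decision.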
Example: Multiclass Perceptron
BIAS : 1    win : 0    game : 0    vote : 0    the : 0    ...
BIAS : 0    win : 0    game : 0    vote : 0    the : 0    ...
BIAS : 0    win : 0    game : 0    vote : 0    the : 0    ...
“win the vote”
“win the election”
“win the game”
Properties of Perceptrons
§ Separability: true if some parameters get the training set perfectly correct
§ Convergence: if the training set is separable, the perceptron will eventually converge (binary case)
§ Mistake Bound: the maximum number of mistakes (binary case) is related to the margin or degree of separability
[Figure: two scatter plots, one labeled "Separable" and one labeled "Non-Separable".]
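For reference, the classical form of the mistake bound (Novikoff's perceptron convergence theorem) can be written as below. The symbols R, γ, and u do not appear in the slides and are introduced here only for illustration: R bounds the length of every feature vector, and γ is the margin achieved by some unit-length separator u.

```latex
\[
\lVert f(x_i) \rVert \le R \ \ \forall i,
\qquad
\exists\, u,\ \lVert u \rVert = 1:\ \
y_i^{*}\,\bigl(u \cdot f(x_i)\bigr) \ge \gamma > 0 \ \ \forall i
\quad\Longrightarrow\quad
\#\text{mistakes} \ \le\ \left(\frac{R}{\gamma}\right)^{2}
\]
```

A larger margin therefore means fewer possible mistakes, which matches the slide's statement that the bound depends on the degree of separability.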
Examples: Perceptron
§ Non-Separable Case
Improving the Perceptron
Problems with the Perceptron
§ Noise: if the data isn’t separable, weights might thrash
  § Averaging weight vectors over time can help (averaged perceptron)
§ Mediocre generalization: finds a “barely” separating solution
§ Overtraining: test/held-out accuracy usually rises, then falls
  § Overtraining is a kind of overfitting
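One way to realize the averaging idea is sketched below for the binary case. The details are assumptions, since the slides only name the technique: here the weight vector is accumulated after every training example, whether or not it changed, and the average is returned. The function names, the fixed number of passes, and the toy data (whose last example is deliberately mislabeled) are hypothetical.

```python
def predict(w, f):
    """Return +1 if the activation w . f is positive, else -1."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else -1


def train_averaged_perceptron(data, num_features, passes=10):
    """Run the ordinary perceptron, but return the average of all the
    weight vectors seen over time instead of the final one."""
    w = [0.0] * num_features
    total = [0.0] * num_features   # running sum of w after each example
    count = 0
    for _ in range(passes):
        for f, y_star in data:
            if predict(w, f) != y_star:
                w = [wi + y_star * fi for wi, fi in zip(w, f)]
            total = [ti + wi for ti, wi in zip(total, w)]
            count += 1
    return [ti / count for ti in total]


# Toy data: [bias, x]; the last example is a noisy (mislabeled) point,
# so the plain perceptron weights keep changing from pass to pass.
noisy = [([1, 2.0], 1), ([1, 1.0], 1), ([1, -1.0], -1),
         ([1, -2.0], -1), ([1, 1.5], -1)]
print(train_averaged_perceptron(noisy, num_features=2))
```

The averaged vector weights each intermediate solution by how long it survived, so a brief swing caused by one noisy example has limited influence on the result.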
Fixing the Perceptron
§ Idea: adjust the weight update to mitigate these effects
§ MIRA*: choose an update size that fixes the current mistake…
§ …but, minimizes the change to w
§ The +1 helps to generalize
* Margin Infused Relaxed Algorithm
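The equations for this slide did not survive in the transcript. A common way to state the MIRA update, given here as an assumption rather than as the slide's exact notation, is the following, where w' denotes the current weights, y the wrongly predicted class, y* the correct class, and τ the update size:

```latex
\[
\min_{w}\ \frac{1}{2} \sum_{y} \lVert w_y - w'_y \rVert^{2}
\quad \text{subject to} \quad
w_{y^*} \cdot f(x) \ \ge\ w_y \cdot f(x) + 1
\]
\[
w_y = w'_y - \tau f(x), \qquad w_{y^*} = w'_{y^*} + \tau f(x)
\]
```

In this formulation the "+1" is the margin in the constraint: the correct class must beat the wrong one by at least 1, not merely tie it.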
Minimum Correcting Update
The minimum is not at τ = 0 (otherwise we would not have made an error), so the minimum is where equality holds.
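The step size can be derived as follows, under the assumption that the update subtracts τ f from the weights w'_y of the wrongly predicted class y, adds τ f to the weights w'_{y*} of the correct class y*, and must make the correct class win by a margin of 1. The notation w' for the current weights is introduced here for illustration.

```latex
\begin{align*}
(w'_{y^*} + \tau f) \cdot f &\ \ge\ (w'_y - \tau f) \cdot f + 1 \\
2\tau\,(f \cdot f) &\ \ge\ (w'_y - w'_{y^*}) \cdot f + 1 \\
\tau &\ =\ \frac{(w'_y - w'_{y^*}) \cdot f + 1}{2\, f \cdot f}
\end{align*}
```

Because a mistake was made, w'_y · f ≥ w'_{y*} · f, so the right-hand side of the second line is positive and τ = 0 is infeasible. The total change to the weights, τ² ‖f‖², grows with τ, so the smallest feasible τ is the one where the constraint holds with equality.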
Maximum Step Size
§ In practice, it’s also bad to make updates that are too large
  § Example may be labeled incorrectly
  § You may not have enough features
  § Solution: cap the maximum possible value of τ with some constant C
  § Corresponds to an optimization that assumes non-separable data
  § Usually converges faster than perceptron
  § Usually better, especially on noisy data
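A capped update could be sketched as below. The step-size formula used here, τ = ((w_y − w_y*) · f + 1) / (2 f · f), is the usual minimum correcting step for MIRA and is an assumption, since the slide's equations are missing from the transcript. The function name `mira_update`, the default value of C, and the toy weights are hypothetical, and the feature vector is assumed to be nonzero.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mira_update(weights, f, y_star, C=1.0):
    """One MIRA step on a single example; updates `weights` in place.

    weights: dict mapping each class to a list of weights.
    Returns the step size tau that was applied (0.0 if no mistake).
    """
    y = max(weights, key=lambda c: dot(weights[c], f))  # current prediction
    if y == y_star:
        return 0.0                                      # correct: no change
    # Minimum step that makes y_star beat y by a margin of 1 (assumes f != 0).
    tau = (dot(weights[y], f) - dot(weights[y_star], f) + 1) / (2 * dot(f, f))
    tau = min(tau, C)                                   # cap the step size
    weights[y] = [w - tau * fi for w, fi in zip(weights[y], f)]
    weights[y_star] = [w + tau * fi for w, fi in zip(weights[y_star], f)]
    return tau


f = [1.0, 1.0]

uncapped = {"a": [0.0, 0.0], "b": [0.0, 0.0]}
print(mira_update(uncapped, f, "b", C=1.0))  # 0.25: (0 - 0 + 1) / (2 * 2)
print(uncapped)  # {'a': [-0.25, -0.25], 'b': [0.25, 0.25]}

capped = {"a": [0.0, 0.0], "b": [0.0, 0.0]}
print(mira_update(capped, f, "b", C=0.1))    # 0.1: the cap C limits the step
```

With the cap in place, a single mislabeled or unusual example can move the weights by at most C times the feature vector.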
Linear Separators
§ Which of these linear separators is optimal?
Support Vector Machines
§ Maximizing the margin: good according to intuition, theory, practice
§ Only support vectors matter; other training examples are ignorable
§ Support vector machines (SVMs) find the separator with max margin
§ Basically, SVMs are MIRA where you optimize over all examples at once
[Figure: the MIRA and SVM optimization problems, shown side by side.]
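As a hedged illustration of "optimizing over all examples at once", a standard hard-margin multiclass SVM can be written as below. The notation (weight vectors w_y, features f(x_i), correct labels y_i*) is assumed for illustration and may differ from the slide's own formulas, which are not in the transcript.

```latex
\[
\min_{w}\ \frac{1}{2} \sum_{y} \lVert w_y \rVert^{2}
\quad \text{subject to} \quad
w_{y_i^*} \cdot f(x_i) \ \ge\ w_y \cdot f(x_i) + 1
\quad \forall i,\ \forall y \ne y_i^*
\]
```

The margin constraint has the same shape as in a single MIRA step, but it is imposed on every training example simultaneously, and the objective penalizes the size of the weights rather than their distance from the previous weights.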
Classification: Comparison
§ Naïve Bayes
  § Builds a model of the training data
  § Gives prediction probabilities
  § Strong assumptions about feature independence
  § One pass through data (counting)
§ Perceptrons / MIRA:
  § Makes fewer assumptions about data
  § Mistake-driven learning
  § Multiple passes through data (prediction)
  § Often more accurate