Classification: Perceptron
Prof. Seungchul Lee
Industrial AI Lab.
Classification
• Classification: the output 𝑦 is a discrete value
– Develop a classification algorithm to determine which class a new input should fall into
• Start with the binary class problem
– Later look at the multiclass classification problem, which is just an extension of binary classification
• We could use linear regression
– Then threshold the classifier output (i.e., anything over some value is "yes", else "no")
– Linear regression with thresholding seems to work
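The linear-regression-plus-threshold idea can be sketched in a few lines of NumPy; a minimal illustration in which the toy data, the ±1 labels, and the zero threshold are all assumptions made up for this example:

```python
import numpy as np

# Toy 1-D data with a bias column; labels y in {-1, +1} (made up for illustration)
X = np.array([[1.0, 0.5], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

# Fit ordinary least squares, then threshold the real-valued output at 0
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = np.sign(X @ w)
print(pred)  # recovers the training labels on this separable toy set
```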
Classification
• We will learn
– Perceptron
– Support vector machine (SVM)
– Logistic regression
• to find a classification boundary
Perceptron
• For input 𝑥 = [𝑥1, ⋯ , 𝑥𝑑]𝑇, the 'attributes of a customer'
• Weights 𝜔 = [𝜔1, ⋯ , 𝜔𝑑]𝑇
Perceptron
• Introduce an artificial coordinate 𝑥0 = 1
• In vector form, the perceptron implements ℎ(𝑥) = sign(𝜔𝑇𝑥)
Perceptron
• Works for linearly separable data
• Hyperplane
– Separates a 𝑑-dimensional space into two half-spaces
– Defined by an outward-pointing normal vector 𝜔
– 𝜔 is orthogonal to any vector lying on the hyperplane
– Assuming the hyperplane passes through the origin, 𝜔𝑇𝑥 = 0 with 𝑥0 = 1
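The orthogonality property is easy to check numerically; the normal vector 𝜔 and the two on-line points below are arbitrary examples chosen for this sketch:

```python
import numpy as np

# A hyperplane through the origin: all x with w @ x == 0 (w chosen arbitrarily)
w = np.array([1.0, 2.0])

# Two points on the line x1 + 2*x2 = 0
p = np.array([2.0, -1.0])
q = np.array([-4.0, 2.0])
assert np.isclose(w @ p, 0) and np.isclose(w @ q, 0)

# The difference p - q lies in the hyperplane, so it is orthogonal to w
print(w @ (p - q))  # 0.0
```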
Distance from a Line
𝝎
• If 𝑝 and 𝑞 are on the decision line, then 𝜔𝑇𝑝 = 𝜔𝑇𝑞 = 0, so 𝜔𝑇(𝑝 − 𝑞) = 0: 𝜔 is orthogonal to the decision line
Signed Distance 𝒅 from the Origin
• If 𝑥 is on the line and 𝑥 = 𝑑 𝜔/‖𝜔‖ (where 𝑑 is the normal distance from the origin to the line)
Distance from a Line: 𝒉
• For any vector 𝑥
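As a hedged sketch of the distance computation (writing the line as 𝜔𝑇𝑥 + 𝜔0 = 0 with an explicit bias term; the numbers below are made up for illustration):

```python
import numpy as np

def signed_distance(w, w0, x):
    """Signed distance of point x from the line w @ x + w0 == 0.

    Positive on the side the normal vector w points toward, negative on the other.
    """
    return (w @ x + w0) / np.linalg.norm(w)

w = np.array([3.0, 4.0])   # normal vector, ||w|| = 5
w0 = -5.0
print(signed_distance(w, w0, np.array([0.0, 0.0])))  # (0 - 5)/5  = -1.0
print(signed_distance(w, w0, np.array([3.0, 4.0])))  # (25 - 5)/5 =  4.0
```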
Sign
• Sign with respect to a line
How to Find 𝝎
• All data in class 1 (𝑦 = +1) satisfy 𝑔(𝑥) > 0
• All data in class 0 (𝑦 = −1) satisfy 𝑔(𝑥) < 0
Perceptron Algorithm
• The perceptron implements ℎ(𝑥) = sign(𝜔𝑇𝑥)
• Given the training set (𝑥1, 𝑦1), ⋯ , (𝑥𝑁, 𝑦𝑁):
1) pick a misclassified point: sign(𝜔𝑇𝑥𝑛) ≠ 𝑦𝑛
2) and update the weight vector: 𝜔 ← 𝜔 + 𝑦𝑛𝑥𝑛
Perceptron Algorithm
• Why do perceptron updates work?
• Look at a misclassified positive example (𝑦𝑛 = +1)
– The perceptron (wrongly) thinks 𝜔𝑜𝑙𝑑𝑇𝑥𝑛 < 0
– The update is 𝜔𝑛𝑒𝑤 = 𝜔𝑜𝑙𝑑 + 𝑦𝑛𝑥𝑛 = 𝜔𝑜𝑙𝑑 + 𝑥𝑛
– Thus 𝜔𝑛𝑒𝑤𝑇𝑥𝑛 = 𝜔𝑜𝑙𝑑𝑇𝑥𝑛 + ‖𝑥𝑛‖² is less negative than 𝜔𝑜𝑙𝑑𝑇𝑥𝑛
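A quick numerical check of this argument (the example vectors are made up): after the update, the score on the misclassified point moves toward positive by exactly ‖𝑥𝑛‖².

```python
import numpy as np

# A misclassified positive example: y = +1 but w_old @ x < 0
w_old = np.array([-1.0, 0.5])
x = np.array([2.0, 1.0])
y = 1
assert y * (w_old @ x) < 0  # -1.5 < 0, so x is misclassified

# Perceptron update on the misclassified point
w_new = w_old + y * x

# The score increases by exactly ||x||^2 = 5
print(w_old @ x)  # -1.5
print(w_new @ x)  # -1.5 + 5.0 = 3.5
```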
Iterations of Perceptron
1. Randomly assign 𝜔
2. One iteration of the PLA (perceptron learning algorithm): 𝜔 ← 𝜔 + 𝑦𝑥, where (𝑥, 𝑦) is a misclassified training point
3. At iteration 𝑖 = 1, 2, 3, ⋯, pick a misclassified point from the training set
4. And run a PLA iteration on it
5. That's it!
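The steps above can be sketched as a short NumPy loop; this is a minimal illustration, not the lecture's own code, and the toy data with labels in {−1, +1} are made up:

```python
import numpy as np

def pla(X, y, max_iter=1000):
    """Perceptron learning algorithm on augmented inputs (x0 = 1 prepended).

    X: (n, d) data, y: (n,) labels in {-1, +1}. Returns the weight vector w.
    Assumes linearly separable data; otherwise it stops after max_iter updates.
    """
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])   # artificial coordinate x0 = 1
    rng = np.random.default_rng(0)
    w = rng.standard_normal(Xa.shape[1])            # 1. randomly assign w
    for _ in range(max_iter):
        mis = np.flatnonzero(np.sign(Xa @ w) != y)  # misclassified points
        if mis.size == 0:
            break                                   # all points correct: done
        i = rng.choice(mis)                         # pick a misclassified point
        w = w + y[i] * Xa[i]                        # one PLA update
    return w

X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([-1, -1, 1, 1])
w = pla(X, y)
print(np.sign(np.hstack([np.ones((4, 1)), X]) @ w))  # matches y on this separable set
```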
Diagram of Perceptron
Perceptron Loss Function
• Loss = 0 on examples where the perceptron is correct, i.e., 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 > 0
• Loss > 0 on examples where the perceptron is wrong, i.e., 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 < 0
• Note: sign(𝜔𝑇𝑥𝑛) ≠ 𝑦𝑛 is equivalent to 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 < 0
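A loss with these properties can be written as Σₙ max(0, −𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛), which is zero exactly when every point is classified correctly; a small sketch with made-up toy values:

```python
import numpy as np

def perceptron_loss(w, X, y):
    """Sum of max(0, -y_n * (w @ x_n)): zero on correct points, positive on errors."""
    margins = y * (X @ w)
    return np.maximum(0.0, -margins).sum()

X = np.array([[1.0, 2.0], [1.0, -1.0]])
y = np.array([1.0, -1.0])
w = np.array([0.0, 1.0])
# Both points have y_n * (w @ x_n) > 0, so the loss is zero
print(perceptron_loss(w, X, y))  # 0.0
# Flipping w misclassifies both points, so the loss becomes positive
print(perceptron_loss(-w, X, y))  # 3.0
```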
Perceptron Algorithm in Python
• Unknown parameters 𝜔
• Update 𝜔 ← 𝜔 + 𝑦𝑥, where (𝑥, 𝑦) is a misclassified training point
Scikit-Learn for Perceptron
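A minimal usage sketch with scikit-learn's `Perceptron` class (the toy data are made up, and the hyperparameter values shown are illustrative defaults, not the lecture's own settings):

```python
import numpy as np
from sklearn.linear_model import Perceptron

# Made-up separable toy data; scikit-learn accepts class labels like {0, 1}
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([0, 0, 1, 1])

clf = Perceptron(max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)

# Predict classes for two new points; the learned hyperplane is in coef_/intercept_
print(clf.predict([[0.5, 0.0], [2.5, 2.5]]))
print(clf.coef_, clf.intercept_)
```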
The Best Hyperplane Separator?
• The perceptron finds one of the many possible hyperplanes separating the data, if one exists
• Of the many possible choices, which one is the best?
• Utilize distance information
• Intuitively, we want the hyperplane with the maximum margin
• A large margin leads to good generalization on the test data
– We will see this formally when we discuss the support vector machine (SVM)
• Utilize distance information from all data samples
– We will see this formally when we discuss logistic regression
• The perceptron will later be shown to be a basic building block of neural networks and deep learning