rami n. mahdi 1 eric c....
TRANSCRIPT
![Page 1: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/1.jpg)
REDUCED HYPERBF NETWORKS: REGULARIZATION BY EXPLICIT COMPLEXITY REDUCTION AND SCALED RPROP BASED TRAINING
Rami N. Mahdi, Eric C. Rouchka
Bioinformatics Lab
Department of Computer Engineering and Computer Science
University of Louisville
![Page 2: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/2.jpg)
PATTERN RECOGNITION
Classify data samples based on either:
• A priori knowledge
• Statistical information extracted from available labeled data
Different methods learn the boundaries using different approaches.
![Page 3: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/3.jpg)
SUPPORT VECTOR MACHINE
Transform samples to a new space
Find points at the boundary
Maximize the separation margin
![Page 4: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/4.jpg)
RBF - NN
Learn significant clusters
Class samples are distinctively described by a sum of weighted Gaussians
![Page 5: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/5.jpg)
RBF - NN
[Figure panels: Diagonal Scaling Matrices; Full Scaling Matrices]
Results are interpretable
Significant neurons represent significant clusters
![Page 6: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/6.jpg)
HyperBF Networks
[Figure panels: Regular RBF; HyperBF]
![Page 7: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/7.jpg)
Locally Scaled RBF (HyperBF)
Simplified Notation
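The equations on this slide did not survive in the transcript. As a minimal sketch, assuming the commonly used HyperBF form in which each neuron has its own center, its own per-dimension (diagonal) scale, and an output weight, the network output could be computed as below. The function name, argument layout, and the diagonal-only scaling are illustrative assumptions; a full scaling matrix would replace the per-dimension sum with a quadratic form.

```python
import math

def hyperbf_output(x, centers, scales, weights, bias=0.0):
    """Weighted sum of elliptical Gaussians with diagonal scaling (illustrative)."""
    total = bias
    for c, r, w in zip(centers, scales, weights):
        # Squared distance with a separate non-negative scale per dimension.
        # A scale of zero makes that dimension inactive for this neuron.
        d2 = sum(r_i * (x_i - c_i) ** 2 for x_i, c_i, r_i in zip(x, c, r))
        total += w * math.exp(-d2)
    return total

# One neuron centered at the origin; the first dimension is switched off.
print(hyperbf_output([1.0, 0.0], [[0.0, 0.0]], [[0.0, 1.0]], [2.0]))
```

A regular RBF network is the special case in which every neuron shares the same scale in every dimension.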
![Page 8: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/8.jpg)
HYPERBF-NN (ELLIPTICAL GAUSSIANS)
Training
• Perform clustering
• Initialize neurons
• Initialize weights
• Estimate all variables simultaneously using gradient optimization
![Page 9: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/9.jpg)
CHALLENGES
Challenging Optimization
Example: MNIST handwritten digits (784 features)
100 neurons would contain 156,900 parameters.
Optimization Function Not Convex
Over-Fitting (very complex model)
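The slide does not show how the 156,900 figure is derived. One decomposition that reproduces it, offered here as an assumption, is one center vector, one diagonal scale vector, and one output weight per neuron, using the 784 MNIST features listed on the datasets slide. The helper name and the `full_matrix` option are hypothetical.

```python
def hyperbf_param_count(n_neurons, n_features, full_matrix=False):
    """Count trainable parameters under the assumed per-neuron layout."""
    if full_matrix:
        # Assumption: a symmetric scaling matrix, stored as its upper triangle.
        scale_params = n_features * (n_features + 1) // 2
    else:
        # Assumption: one scale per input dimension (diagonal scaling).
        scale_params = n_features
    # center + scaling + one output weight, for every neuron
    return n_neurons * (n_features + scale_params + 1)

print(hyperbf_param_count(100, 784))  # 156900
```

Under the same assumptions, full scaling matrices would grow the count quadratically in the number of features, which is part of why the optimization is hard.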
![Page 10: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/10.jpg)
RPROP ALGORITHM
It uses a separate learning factor η for every variable
It uses the direction of the first derivative and not the magnitude
η increases if the direction of the derivative stays the same as in the previous iteration.
η decreases if the direction changes.
[Equations: Gradient Descent update; RProp update, subject to two constraints not recovered in this transcript]
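The update rule described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the iRprop- convention (skip the update for a variable whose derivative just changed sign), the customary factors 1.2 and 0.5, and lower and upper bounds on each step size as a plausible reading of the lost "Subject to" constraints. All names are hypothetical.

```python
def sign(v):
    return (v > 0) - (v < 0)

def rprop_minimize(grad_fn, x0, n_iter=200, step_init=0.1,
                   step_min=1e-6, step_max=1.0, inc=1.2, dec=0.5):
    """Minimize using only the sign of each partial derivative."""
    x = list(x0)
    steps = [step_init] * len(x)      # one step size per variable
    prev_grad = [0.0] * len(x)
    for _ in range(n_iter):
        grad = grad_fn(x)
        for i, g in enumerate(grad):
            if g * prev_grad[i] > 0:
                # Same direction as last time: grow the step (bounded above).
                steps[i] = min(steps[i] * inc, step_max)
            elif g * prev_grad[i] < 0:
                # Direction flipped: shrink the step (bounded below)
                # and skip this variable's update for one iteration.
                steps[i] = max(steps[i] * dec, step_min)
                g = 0.0
            x[i] -= sign(g) * steps[i]
            prev_grad[i] = g
    return x

# Example: f(x) = (x0 - 3)^2 + 10 * (x1 + 1)^2
result = rprop_minimize(lambda x: [2 * (x[0] - 3), 20 * (x[1] + 1)], [0.0, 0.0])
print(result)
```

Because only the sign of the derivative is used, the badly scaled second coordinate does not slow down the first, which is the practical difference from plain gradient descent.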
![Page 11: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/11.jpg)
SCALED RPROP
Adaptive Estimation of Ti-Init and Ti-Max
Ti-Init and Ti-Max are estimated by bounding the change to the output
![Page 12: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/12.jpg)
SCALED RPROP WITH PARTIAL BACKTRACKING
Init Network (hierarchical clustering)
Loop (iSRProp)
    Compute weights derivatives
    Update network
    For every neuron j
        Compute all … and … derivatives.
        Update network using Rprop
        if Error Increases
            • Roll back 25% of the last updates to neuron j.
        End if
    End for
Until Convergence.
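The per-neuron loop with partial backtracking can be sketched on a toy problem as below. This is an illustration under stated assumptions rather than the iSRProp implementation: parameters are grouped into one list per (hypothetical) neuron, each group takes a sign-based RProp step, and "roll back 25%" is read as undoing a quarter of that group's most recent step when the error rises. The adaptive estimation of the initial and maximum step sizes is not modeled; fixed bounds are used instead.

```python
def sign(v):
    return (v > 0) - (v < 0)

def train_with_partial_backtracking(loss_fn, grad_fn, groups, n_sweeps=100,
                                    step_init=0.1, step_min=1e-6, step_max=1.0,
                                    inc=1.2, dec=0.5, rollback=0.25):
    """groups: list of parameter lists, one per neuron (modified in place)."""
    steps = [[step_init] * len(g) for g in groups]
    prev = [[0.0] * len(g) for g in groups]
    history = [loss_fn(groups)]
    for _ in range(n_sweeps):
        for j, params in enumerate(groups):
            before = loss_fn(groups)
            grad = grad_fn(groups, j)          # derivatives for neuron j only
            delta = [0.0] * len(params)
            for i, g in enumerate(grad):
                if g * prev[j][i] > 0:
                    steps[j][i] = min(steps[j][i] * inc, step_max)
                elif g * prev[j][i] < 0:
                    steps[j][i] = max(steps[j][i] * dec, step_min)
                delta[i] = -sign(g) * steps[j][i]
                params[i] += delta[i]
                prev[j][i] = g
            if loss_fn(groups) > before:
                # Error increased: undo a quarter of neuron j's last update.
                for i in range(len(params)):
                    params[i] -= rollback * delta[i]
        history.append(loss_fn(groups))
    return groups, history

# Toy error: squared distance of every parameter from a target value.
targets = [[1.0, 2.0], [-1.0]]
loss = lambda gs: sum((p - t) ** 2 for g, ts in zip(gs, targets) for p, t in zip(g, ts))
grad = lambda gs, j: [2 * (p - t) for p, t in zip(gs[j], targets[j])]
final, hist = train_with_partial_backtracking(loss, grad, [[0.0, 0.0], [5.0]])
print(hist[0], hist[-1])
```

Rolling back only part of the step keeps most of the progress made by the variables that moved in a useful direction, instead of discarding the whole update.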
![Page 13: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/13.jpg)
DATASETS
| Data Set | # of samples | # of classes | # of test samples | # of features |
|---|---|---|---|---|
| MNIST | 60,000 | 10 | 10,000 | 784 |
| USPS | 7,291 | 10 | 2,007 | 256 |
| TSS | 93,550 | 2 | N/A | 1024 |
| ISOLET | 6,238 | 26 | 1,559 | 617 |
| Wis. Breast Cancer | 569 | 2 | N/A | 32 |
| Protein | 17,766 | 3 | 6,621 | 357 |
| SatImage | 4,435 | 6 | 2,000 | 36 |
![Page 14: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/14.jpg)
ISRPROP VS. IRPROP+ VS. BPVS
[Figure] (a) USPS net: 100 neurons, (b) MNIST net: 100 neurons, (c) TSS net: 30 neurons, (d) Breast Cancer net: 40 neurons, (e) Protein net: 30 neurons, and (f) Satimage net: 60 neurons
![Page 15: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/15.jpg)
REGULARIZATION (ANTI – OVER-FITTING)
Simple models need fewer examples to approximate
Statistical Learning Theory: Generalization ∝ 1 / Complexity
![Page 16: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/16.jpg)
REDUCED HYPERBF
![Page 17: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/17.jpg)
HyperBF
Reduced HyperBF
![Page 18: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/18.jpg)
RESULTS
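| Data Set | k-Folds | CV Error % (HBF) | CV Error % (R-HBF) | CV Error % (SVM) | Test Error % (HBF) | Test Error % (R-HBF) | Test Error % (SVM) |
|---|---|---|---|---|---|---|---|
| USPS | 10 | 2.47 | 1.37 | 1.74 | 5.83 | 4.38 | 4.78 |
| MNIST | 5 | 3.33 | 2.29 | 1.52 | 3.23 | 2.05 | 1.42 |
| ISOLET | 10 | 4.44 | 3.03 | 2.45 | 6.54 | 3.78 | 3.21 |
| Breast Cancer | 10 | 4.04 | 1.67 | 1.93 | N/A | N/A | N/A |
| Protein | 10 | 38.61 | 32.03 | 29.56 | 38.07 | 29.9 | 29.9 |
| Satimage | 10 | 9.8 | 8.71 | 7.86 | 10.7 | 9.5 | 8.8 |

| TSS Validation auROC % | HBF | R-HBF | SVM |
|---|---|---|---|
| TSS | 88.5 | 94.06 | 94.42 |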
![Page 19: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/19.jpg)
COMPARISON OF MODEL STRUCTURE
| DataSet | # of support vectors | # of neurons | % of active dims in R-HBF | ~Size Ratio |
|---|---|---|---|---|
| USPS | 1464 | 200 | 0.36 | 1:10 |
| MNIST | 16523 | 200 | 0.24 | 1:172 |
| ISOLET | 3956 | 260 | 0.29 | 1:26 |
| Breast Cancer | 79 | 40 | 0.084 | 1:12 |
| Protein | 12019 | 30 | 0.22 | 1:910 |
| SATIMAGE | 1322 | 60 | 0.46 | 1:24 |
| TSS | 14554 | 30 | 0.13 | 1:1900 |

MNIST-HBF is about 172 times smaller
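The slide does not say how model size is measured. One reading that approximately reproduces the listed ratios, offered as an assumption, counts one stored feature vector per support vector for the SVM, and a center plus a scale on each active dimension per neuron for the Reduced HyperBF network. It treats the active-dims column as a fraction and takes the feature counts from the datasets slide. The function name is hypothetical.

```python
def size_ratio(n_support_vectors, n_neurons, active_fraction, n_features):
    """Approximate SVM-to-R-HBF size ratio under the assumed size measure."""
    # Assumption: the SVM stores every support vector in full.
    svm_size = n_support_vectors * n_features
    # Assumption: each neuron stores a center and a scale on active dims only.
    rhbf_size = n_neurons * 2 * active_fraction * n_features
    return svm_size / rhbf_size

print(round(size_ratio(16523, 200, 0.24, 784)))  # MNIST: 172
print(round(size_ratio(1464, 200, 0.36, 256)))   # USPS: 10
```

The same calculation lands close to the other rows as well, which suggests the ratios on the slide are rounded estimates of this kind.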
![Page 20: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/20.jpg)
SENSITIVITY TO REGULARIZATION PARAMETERS
[Figure] a) ISOLET, b) USPS, and c) Protein. Starred boxes are the ones with the highest accuracy
![Page 21: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/21.jpg)
AVAILABILITY
The HyperBF optimization tool, with source code, is made available at: http://bioinformatics.louisville.edu/HyperBFLib.html
The important classes in HyperBFLib are:
• RHyperBFNet_2Class: Trains networks for two-class problems.
• RHyperBFNet_MultiClass: Trains networks for multi-class problems.
• HeirarchalAgglomerative: Performs hierarchical clustering with moving centers.
• DataLoader: Loads or saves data objects of different types, including arrays, clusters, and objects.
• USPS_Client: A sample implementation that uses the above classes to train HyperBF networks to classify the USPS dataset in two cases: multi-class and two-class classification.
A formatted USPS dataset is made available as an example of data formatting.
For further questions about the package, send email to: [email protected]
![Page 22: Rami N. Mahdi 1 Eric C. Rouchkabioinformatics.louisville.edu/localresources/software/HyperBFLib.pdfRami N. Mahdi Eric C. Rouchka Bi i f ti L b 1 Bioinformatics Lab Department of Computer](https://reader034.vdocuments.net/reader034/viewer/2022050506/5f97e07dc164ea727867a794/html5/thumbnails/22.jpg)
CONCLUSION
iSRprop is shown to be a practical and convergent optimization method for training HyperBF networks
The proposed regularization improved the generalization of HyperBF networks significantly
Reduced HyperBF is shown to be competitive with SVM, with a significantly smaller model structure (1-3 orders of magnitude)
Reduced HyperBF is shown to facilitate higher-level analysis.