Machine Learning for Adaptive Multi-Core Machines
Noel Lopes
Supervisor: Prof. Dr. Bernardete Ribeiro
University of Coimbra, Portugal
September 17, 2013
Outline
- Introduction
- Objectives
- Contributions
- High-performance Deep Learning
- Conclusions
Noel Lopes Machine Learning for Adaptive Multi-Core Machines
Machine Learning: Need to Scale up
High-throughput Machine Learning implementations
- Large datasets
- High-dimensional inputs
- Inference time constraints
- Algorithm complexity
- Adequate model selection
- Cascade predictors
Machine Learning: Big Data
Data sources:
- Real data
- Data streams
- Computer simulation models
- Artificial data
Large volumes of data
Persistent repositories of (accumulated) data
Challenge: extract useful and relevant information from data that vastly exceeds our capacity to analyze it
Machine Learning: Big Data
Data sources:
- Real data
- Data streams
- Computer simulation models
- Artificial data
Large volumes of data
Persistent repositories of (accumulated) data
ML algorithms
Extracted information
Scientific Contributions
- Machine Learning
  - Supervised Learning
  - Semi-supervised Learning
  - Unsupervised Learning
- GPUMLib – GPU ML Library
Scientific Contributions: Supervised Learning
- Machine Learning
  - Supervised Learning
    - Autonomous Training System (ATS)
    - Neural Selective Input Model (NSIM)
    - Incremental Hypersphere Classifier (IHC)
  - Semi-supervised Learning
  - Unsupervised Learning
- GPUMLib – GPU ML Library
[Figure: main network with selective actuation neurons and a space network; inputs x1, x2, x3, outputs y1, y2, plus bias.]
[Figure: Neural Selective Input Model (NSIM). Physical model: inputs x1, x2, x3 and outputs y1, y2 (weights wij, biases bj); x3 enters through a selective input neuron, in which a multiplier combines xi with κi.]
Conceptual models:
- Model 1, when x3 is missing: κ3 = 0 (inputs x1, x2; outputs y1, y2)
- Model 2, when the value of x3 is known: κ3 = 1 (inputs x1, x2, x3; outputs y1, y2)
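The two conceptual models (κ3 = 0 when x3 is missing, κ3 = 1 when it is known) can be read as a single network whose inputs are gated by a multiplier. The Python sketch below illustrates only that gating idea. The function names, the single layer of sigmoid outputs, and the weights are assumptions made for illustration and are not taken from the slides; the actual NSIM neuron may apply its own weight and activation before the multiplier.

```python
import math

def selective_inputs(x):
    """Return (values, kappas): kappa_i = 0 where x_i is missing (None), else 1."""
    kappas = [0 if xi is None else 1 for xi in x]
    values = [0.0 if xi is None else float(xi) for xi in x]
    return values, kappas

def forward(x, weights, biases):
    """Outputs y_j = sigmoid(b_j + sum_i kappa_i * x_i * w_ij)."""
    values, kappas = selective_inputs(x)
    outputs = []
    for j, b_j in enumerate(biases):
        total = b_j
        for i, (x_i, k_i) in enumerate(zip(values, kappas)):
            total += k_i * x_i * weights[i][j]  # a missing input contributes nothing
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs

# Hypothetical weights for 3 inputs and 2 outputs (illustration only).
W = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b = [0.0, 0.1]

y_model_2 = forward([1.0, 2.0, 3.0], W, b)   # x3 known: kappa_3 = 1
y_model_1 = forward([1.0, 2.0, None], W, b)  # x3 missing: kappa_3 = 0
```

With κ3 = 0 the output matches that of a network that never had x3 as an input, so the same set of weights serves both conceptual models.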
Scientific Contributions: Semi-Supervised Learning
- Machine Learning
  - Supervised Learning
  - Semi-supervised
    - Semi-supervised Non-Negative Matrix Factorization
  - Unsupervised Learning
- GPUMLib – GPU ML Library
[Figure: NMF decomposition V ≈ W H_train, with V of size D × N, W of size D × r and H_train of size r × N. In the semi-supervised variant the data are split by class into V1, V2, ..., VC, with matching blocks W1, W2, ..., WC (of r1, r2, ..., rC columns) and H1, H2, ..., HC (of r1, r2, ..., rC rows).]
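As a rough illustration of the factorization V ≈ W H, the sketch below uses the standard multiplicative update rules for the Euclidean distance (Lee and Seung). This is an assumption: the slides do not say which update rule the semi-supervised variant uses, and the function names and the toy matrix are hypothetical. A per-class variant could call `nmf` once on each block Vc with its own rank rc.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)) ** 0.5

def nmf(V, r, iterations=200, seed=0, eps=1e-9):
    """Factorize a non-negative D x N matrix V as W (D x r) times H (r x N)."""
    rng = random.Random(seed)
    D, N = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(D)]
    H = [[rng.random() + 0.1 for _ in range(N)] for _ in range(r)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H

# Toy non-negative matrix (D = 4, N = 3) of rank 2, for illustration only.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
W, H = nmf(V, r=2)
error = frobenius_error(V, W, H)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative as long as they start that way.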
Scientific Contributions: Unsupervised Learning
- Machine Learning
  - Supervised Learning
  - Semi-supervised
  - Unsupervised Learning
    - Deep Belief Networks (Adaptive Step Size technique)
- GPUMLib – GPU ML Library
Scientific Contributions: Case studies and Benchmarks
- Case studies
  - biomedical
  - finance and business
  - bio-informatics
[Figures: Yale face database; ORL face database; MNIST hand-written digits; HHreco multi-stroke images.]
GPUMLib – GPU ML Library: Graphics Processing Unit (GPU)
[Figure: Historical single-/double-precision peak compute rates, in GFLOPS (0 to 4000), by date (2002 to 2012), for AMD (GPU), NVIDIA (GPU), Intel (CPU) and Intel Xeon Phi, in single precision (SP) and double precision (DP).]
Scientific Contributions: GPUMLib – GPU ML Library
[Figure: GPUMLib architecture.]
- Host (CPU) and device (GPU) memory access framework: HostArray, HostMatrix, CudaArray, DeviceArray, DeviceMatrix, ...
- C++ classes (algorithms): Back-Propagation, Radial Basis Functions, Deep Belief Networks, Restricted Boltzmann Machines, Multiple Back-Propagation, Support Vector Machines, Non-Negative Matrix Factorization, ...
- Common host (CPU) classes
- Common CUDA kernels
- CUDA (GPU) kernels: Multiple Back-Propagation, Support Vector Machines, Non-Negative Matrix Factorization, Nonlinear Dimension Reduction, Radial Basis Functions, Restricted Boltzmann Machines, Self-Organizing Maps, ...
- Common device (GPU) functions
http://gpumlib.sourceforge.net/
Deep Belief Networks: Deep architecture
Restricted Boltzmann Machines (RBMs)
For the binary units $h_j \in \{0, 1\}$ and $v_i \in \{0, 1\}$, the energy function of the whole network is:
$$E(\mathbf{v}, \mathbf{h}) = -\sum_{i,j} W_{ij} v_i h_j - \sum_i c_i v_i - \sum_j b_j h_j \tag{1}$$
where $W$ is the matrix of weights, and $b$ and $c$ are the biases of the hidden and visible layers, respectively.
[Figure: RBM with visible units v1, ..., vI and hidden units h1, ..., hJ, each layer with a bias unit; the visible-to-hidden direction acts as encoder and the hidden-to-visible direction as decoder.]
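Equation (1) can be evaluated directly for one configuration of the units. The sketch below is a plain transcription of the formula; the function name and the toy weights are hypothetical, and `W[i][j]` is assumed to couple visible unit i with hidden unit j.

```python
def rbm_energy(v, h, W, b, c):
    """E(v, h) = -sum_ij W[i][j] v_i h_j - sum_i c_i v_i - sum_j b_j h_j."""
    interaction = sum(W[i][j] * v[i] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    visible_bias = sum(c_i * v_i for c_i, v_i in zip(c, v))
    hidden_bias = sum(b_j * h_j for b_j, h_j in zip(b, h))
    return -interaction - visible_bias - hidden_bias

# Toy RBM with 3 visible and 2 hidden units (values chosen for illustration).
W = [[1.0, -1.0], [0.5, 0.0], [0.0, 2.0]]
b = [0.25, -0.5]      # hidden biases
c = [0.5, 0.0, -1.0]  # visible biases

energy = rbm_energy([1, 0, 1], [1, 1], W, b, c)  # -2.0 + 0.5 + 0.25 = -1.25
```

Lower energy corresponds to configurations the network considers more probable.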
Restricted Boltzmann Machines (RBMs)
Given a random training vector $\mathbf{v}$, the state of a given hidden unit $j$ is set to 1 with probability:
$$p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big) \tag{2}$$
Similarly:
$$p(v_i = 1 \mid \mathbf{h}) = \sigma\Big(c_i + \sum_j h_j W_{ij}\Big) \tag{3}$$
where $\sigma(x) = 1 / (1 + e^{-x})$ is the sigmoid squashing function.
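The two conditional probabilities translate almost line by line into code. The sketch below is illustrative only: the function names are hypothetical, `W[i][j]` is assumed to couple visible unit i with hidden unit j, and units are sampled one at a time rather than in parallel as a GPU implementation would.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def p_hidden_given_visible(v, W, b):
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W[i][j]) for every hidden unit j."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def p_visible_given_hidden(h, W, c):
    """p(v_i = 1 | h) = sigmoid(c_i + sum_j h_j W[i][j]) for every visible unit i."""
    return [sigmoid(c[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(c))]

def sample_binary(probabilities, rng):
    """Set each unit to 1 with its given probability."""
    return [1 if rng.random() < p else 0 for p in probabilities]

# Toy RBM with 2 visible units and 1 hidden unit (illustration only).
rng = random.Random(0)
W = [[0.0], [0.0]]
b = [0.0]
c = [0.0, 0.0]
h_probs = p_hidden_given_visible([1, 0], W, b)  # zero weights give probability 0.5
h_state = sample_binary(h_probs, rng)
v_probs = p_visible_given_hidden(h_state, W, c)
```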
Training an RBM: Alternating Gibbs Sampling
[Figure: alternating Gibbs sampling chain. Starting from $v^{(0)} = x$, the hidden units $h^{(0)}$ are sampled, giving $\langle v_i h_j \rangle_0$; the chain then alternates $v^{(1)}, h^{(1)}, v^{(2)}, h^{(2)}, \ldots$ up to $v^{(\infty)}, h^{(\infty)}$, giving $\langle v_i h_j \rangle_\infty$.]
Hidden units are sampled with
$$p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_{i=1}^{I} v_i W_{ij}\Big)$$
and visible units with
$$p(v_i = 1 \mid \mathbf{h}) = \sigma\Big(c_i + \sum_{j=1}^{J} h_j W_{ij}\Big)$$
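Training an RBM: Contrastive Divergence (CD–k)
- Running the Gibbs chain until equilibrium to estimate $\langle \cdot \rangle_\infty$ is too slow in practice. To solve this problem, Hinton proposed the Contrastive Divergence algorithm.
- CD–k replaces $\langle \cdot \rangle_\infty$ by $\langle \cdot \rangle_k$ for small values of $k$:
$$\Delta W_{ij} = \gamma \left( \langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_k \right) \tag{4}$$
where $\gamma$ is the step size.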
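A minimal sketch of one CD–k update for a single training vector follows. Several choices here are assumptions rather than details from the slides: the function names, the use of hidden probabilities instead of sampled states when estimating the correlations, the bias updates (the slides give only the weight update), and the toy data. It is a sequential illustration, not the GPU implementation.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(probabilities, rng):
    return [1 if rng.random() < p else 0 for p in probabilities]

def hidden_probs(v, W, b):
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def visible_probs(h, W, c):
    return [sigmoid(c[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(c))]

def cd_k_update(x, W, b, c, k=1, gamma=0.1, rng=None):
    """One CD-k step for training vector x; updates W, b and c in place."""
    rng = rng or random.Random(0)
    v0 = list(x)
    h0_probs = hidden_probs(v0, W, b)       # statistics at step 0
    vk, hk_probs = v0, h0_probs
    for _ in range(k):                      # k steps of alternating Gibbs sampling
        hk = sample(hk_probs, rng)
        vk = sample(visible_probs(hk, W, c), rng)
        hk_probs = hidden_probs(vk, W, b)
    for i in range(len(v0)):
        for j in range(len(b)):
            # Single-sample estimate of gamma * (<v_i h_j>_0 - <v_i h_j>_k)
            W[i][j] += gamma * (v0[i] * h0_probs[j] - vk[i] * hk_probs[j])
    for i in range(len(c)):                 # assumed bias updates
        c[i] += gamma * (v0[i] - vk[i])
    for j in range(len(b)):
        b[j] += gamma * (h0_probs[j] - hk_probs[j])
    return vk                               # reconstruction after k steps

# Toy data: 4 visible units, 2 hidden units (illustration only).
rng = random.Random(42)
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
W = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(4)]
b, c = [0.0, 0.0], [0.0, 0.0, 0.0, 0.0]
for epoch in range(100):
    for x in data:
        cd_k_update(x, W, b, c, k=1, gamma=0.1, rng=rng)
```

In practice the correlations are averaged over a batch of training vectors, which is where a GPU implementation gains most of its parallelism.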
Deep Belief Networks (DBNs)
[Figure: a DBN built up one layer at a time. First x and h1, linked by p(h1|x) and p(x|h1); then h2 is added, linked to h1 by p(h2|h1) and p(h1|h2); then h3 is added, linked to h2 by p(h3|h2) and p(h2|h3).]
Deep Belief Networks (DBNs): GPU Implementation Results
[Figure: MNIST average training time per epoch (N = 60,000). Time in seconds (log scale) versus number of hidden units (0 to 900), for a GTX 460 (GPU) and a dual-core i5 (CPU). Speedups annotated on the plot: 42.73×, 43.46×, 38.64×, 41.83×, 46.07×.]
Deep Belief Networks (DBNs): Adaptive Step Size
[Figure: average reconstruction error (RMSE) versus epoch (0 to 1000), in three panels for α = 0.1, α = 0.4 and α = 0.7. Each panel compares the adaptive step size with γ = 0.1, γ = 0.4 and γ = 0.7.]
Restricted Boltzmann Machines: Receptive Fields
Deep Belief Networks (DBNs)
Demonstration
Conclusions: Future Work
- Big Data problem:
  - Novel ML algorithms
  - Scale up existing algorithms
- High-performance (GPU) ML implementations
- Size matters:
  - Enhancing GPUMLib algorithms with Big Data in mind
Publications: First author
- 5 Journal Articles
- 15 Conference Articles
- 30+ Citations
Publications: Journal Articles
Noel Lopes and Bernardete Ribeiro. Towards adaptive learning with improved convergence of deep belief networks on graphics processing units. Pattern Recognition, 2013.
Noel Lopes and Bernardete Ribeiro. Towards a hybrid NMF-based neural approach for face recognition on GPUs. International Journal of Data Mining, Modelling and Management (IJDMMM), 4(2):138–155, 2012.
Noel Lopes and Bernardete Ribeiro. Handling missing values via a neural selective input model. Neural Network World, 22(4):357–370, 2012.
Noel Lopes and Bernardete Ribeiro. GPUMLib: An efficient open-source GPU machine learning library. International Journal of Computer Information Systems and Industrial Management Applications, 3:355–362, 2011.
Noel Lopes and Bernardete Ribeiro. An evaluation of multiple feed-forward networks on GPUs. International Journal of Neural Systems (IJNS), 21(1):31–47, 2011.
Publications: Proceedings Articles (page 1 of 4)
Noel Lopes, Bernardete Ribeiro, and Joao Goncalves. Restricted Boltzmann machines and deep belief networks on multi-core processors. In The 2012 International Joint Conference on Neural Networks (IJCNN), 2012.
Noel Lopes and Bernardete Ribeiro. Improving convergence of restricted Boltzmann machines via a learning adaptive step size. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, LNCS 7441, pages 511–518. Springer Berlin / Heidelberg, 2012.
Noel Lopes, Daniel Correia, Carlos Pereira, Bernardete Ribeiro, and Antonio Dourado. An incremental hypersphere learning framework for protein membership prediction. In 7th International Conference on Hybrid Artificial Intelligent Systems, LNCS 7208, pages 429–439. Springer Berlin / Heidelberg, 2012.
Noel Lopes and Bernardete Ribeiro. A robust learning model for dealing with missing values in many-core architectures. In 10th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA 2011), Part II, LNCS 6594, pages 108–117. Springer Berlin, 2011.
Publications: Proceedings Articles (page 2 of 4)
Noel Lopes and Bernardete Ribeiro. Incremental learning for non-stationary patterns. In 17th edition of the Portuguese Conference on Pattern Recognition (RECPAD 2011), 2011.
Noel Lopes and Bernardete Ribeiro. An incremental class boundary preserving hypersphere classifier. In International Conference on Neural Information Processing (ICONIP 2011), Part II, LNCS 7063, pages 690–699. Springer Berlin Heidelberg, 2011.
Noel Lopes and Bernardete Ribeiro. A fast optimized semi-supervised non-negative matrix factorization algorithm. In IEEE International Joint Conference on Neural Networks (IJCNN 2011), pages 2495–2500, 2011.
Noel Lopes, Bernardete Ribeiro, and Ricardo Quintas. GPUMLib: A new library to combine machine learning algorithms with graphics processing units. In IEEE 10th International Conference on Hybrid Intelligent Systems (HIS 2010), pages 229–232, August 2010.
Publications: Proceedings Articles (page 3 of 4)
Noel Lopes and Bernardete Ribeiro. A strategy for dealing with missing values by using selective activation neurons in a multi-topology framework. In IEEE World Congress on Computational Intelligence (WCCI 2010), 2010.
Noel Lopes and Bernardete Ribeiro. Stochastic GPU-based multithread implementation of multiple back-propagation. In Second International Conference on Agents and Artificial Intelligence (ICAART 2010), pages 271–276, 2010.
Noel Lopes and Bernardete Ribeiro. Non-negative matrix factorization implementation using graphic processing units. In 11th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2010), LNCS 6283, pages 275–283. Springer, 2010.
Noel Lopes and Bernardete Ribeiro. A hybrid face recognition approach using GPUMLib. In 15th Iberoamerican Congress on Pattern Recognition (CIARP 2010), LNCS 6419, pages 96–103. Springer, 2010.
Publications: Proceedings Articles (page 4 of 4)
Noel Lopes and Bernardete Ribeiro. Fast pattern classification of ventricular arrhythmias using graphics processing units. In 14th Iberoamerican Congress on Pattern Recognition (CIARP 2009), LNCS 5856, pages 603–610. Springer, 2009.
Noel Lopes and Bernardete Ribeiro. GPU implementation of the multiple back-propagation algorithm. In 10th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2009), LNCS 5788, pages 449–456. Springer, 2009.
Noel Lopes and Bernardete Ribeiro. MBPGPU: A supervised pattern classifier for graphical processing units. In 15th edition of the Portuguese Conference on Pattern Recognition (RECPAD 2009), 2009.
GPUMLib: Over 2000 downloads
[Figure: breakdown of downloads, with shares of 19.6%, 11.7% and 5.6%.]