[harvard cs264] 09 - machine learning on big data: lessons learned from google projects (max lin,...
DESCRIPTION
Abstract: Machine learning researchers and practitioners develop computer algorithms that "improve performance automatically through experience". At Google, machine learning is applied to solve many problems, such as prioritizing emails in Gmail, recommending tags for YouTube videos, and identifying different aspects from online user reviews. Machine learning on big data, however, is challenging. Some "simple" machine learning algorithms with quadratic time complexity, while running fine with hundreds of records, are almost impractical to use on billions of records.

In this talk, I will describe lessons drawn from various Google projects on developing large scale machine learning systems. These systems build on top of Google's computing infrastructure such as GFS and MapReduce, and attack the scalability problem through massively parallel algorithms. I will present the design decisions made in these systems, and strategies for scaling and speeding up machine learning systems on web scale data.

Speaker biography: Max Lin is a software engineer with Google Research in the New York City office. He is the tech lead of the Google Prediction API, a machine learning web service in the cloud. Prior to Google, he published research work on video content analysis, sentiment analysis, machine learning, and cross-lingual information retrieval. He holds a PhD in Computer Science from Carnegie Mellon University.

TRANSCRIPT
![Page 1: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/1.jpg)
Machine Learning on Big Data: Lessons Learned from Google Projects
Max Lin, Software Engineer | Google Research
Massively Parallel Computing | Harvard CS 264 Guest Lecture | March 29th, 2011
![Page 2: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/2.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 3: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/3.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 4: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/4.jpg)
“Machine Learning is a study of computer algorithms that
improve automatically through experience.”
![Page 5: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/5.jpg)
![Page 6: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/6.jpg)
![Page 7: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/7.jpg)
![Page 8: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/8.jpg)
![Page 9: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/9.jpg)
![Page 10: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/10.jpg)
Training

| Input X | Output Y |
| --- | --- |
| The quick brown fox jumped over the lazy dog. | English |
| To err is human, but to really foul things up you need a computer. | English |
| No hay mal que por bien no venga. | Spanish |
| La tercera es la vencida. | Spanish |

Model f(x)

Testing

| Input X | Output Y |
| --- | --- |
| To be or not to be -- that is the question | ? |
| La fe mueve montañas. | ? |

f(x') = y'
![Page 11: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/11.jpg)
The quick brown fox jumped over the lazy dog.
Linear Classifier
Features: [ 'a', 'aardvark', ..., 'dog', ..., 'the', ..., 'montañas', ... ]
x = [ 0, 0, ..., 1, ..., 1, ..., 0, ... ]
w = [ 0.1, 132, ..., 150, 200, ..., -153, ... ]
$f(x) = w \cdot x = \sum_{p=1}^{P} w_p x_p$
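As a minimal sketch of the dot product on this slide, the score of a sentence can be computed from a binary bag-of-words vector. The five-word vocabulary, the weight values, and the helper names `featurize` and `score` are assumptions made for illustration and do not come from the talk.

```python
# Hypothetical vocabulary and weights, chosen only for illustration.
VOCAB = ["a", "aardvark", "dog", "the", "montañas"]
WEIGHTS = [0.1, 0.2, 1.5, 2.0, -1.5]


def featurize(sentence, vocab=VOCAB):
    """Binary bag-of-words vector: x_p = 1 if vocabulary word p occurs."""
    words = set(sentence.lower().replace(".", "").split())
    return [1 if term in words else 0 for term in vocab]


def score(w, x):
    """f(x) = w . x = sum over p of w_p * x_p."""
    return sum(wp * xp for wp, xp in zip(w, x))


x = featurize("The quick brown fox jumped over the lazy dog.")
print(x, score(WEIGHTS, x))
```

With these made-up weights, a positive score would be read as English and a negative score as Spanish.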
![Page 12: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/12.jpg)
Training Data
[Figure: the training data as a matrix with N rows and P columns (Input X) next to a column of labels (Output Y).]
![Page 13: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/13.jpg)
http://www.flickr.com/photos/mr_t_in_dc/5469563053/
Typical machine learning data at Google
N: 100 billion / 1 billion (mean / median)
P: 1 billion / 10 million (mean / median)
![Page 14: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/14.jpg)
Classifier Training
• Training: Given {(x, y)} and f, minimize the following objective function
$\arg\min_w \sum_{i=1}^{N} L(y_i, f(x_i; w)) + R(w)$
![Page 15: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/15.jpg)
http://www.flickr.com/photos/visitfinland/5424369765/
Use Newton's method?
$w_{t+1} \leftarrow w_t - H(w_t)^{-1} \nabla J(w_t)$
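One plausible reason the slide poses this as a question is the size of the Hessian H, which has P × P entries. The back-of-the-envelope sketch below uses the mean and median P quoted earlier in the talk and assumes a dense matrix of 8-byte floats. The function name and the storage assumption are illustrative only.

```python
def dense_hessian_bytes(num_features, bytes_per_entry=8):
    """Memory for a dense P x P Hessian, assuming 8-byte floats."""
    return num_features * num_features * bytes_per_entry


for label, p in [("median P", 10_000_000), ("mean P", 1_000_000_000)]:
    petabytes = dense_hessian_bytes(p) / 1e15
    print(label, petabytes, "PB")
```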
![Page 16: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/16.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 17: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/17.jpg)
Scaling Up
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 18: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/18.jpg)
Subsampling
[Figure: Big Data is split into Shard 1, Shard 2, Shard 3, ..., Shard M; subsampling reduces N so that the Model can be trained on a single Machine.]
![Page 19: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/19.jpg)
Why not Small Data?
[Banko and Brill, 2001]
![Page 20: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/20.jpg)
Scaling Up
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 21: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/21.jpg)
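Parallelize Estimates
• Naive Bayes Classifier
$\arg\min_w \; -\prod_{i=1}^{N} \prod_{p=1}^{P} P(x_{ip} \mid y_i; w) \, P(y_i; w)$
• Maximum Likelihood Estimates
$w_{\text{the}|\text{EN}} = \frac{\sum_{i=1}^{N} 1_{\text{EN},\text{the}}(x_i)}{\sum_{i=1}^{N} 1_{\text{EN}}(x_i)}$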
![Page 22: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/22.jpg)
Word Counting
Map
X: "The quick brown fox ...", Y: EN
('the|EN', 1), ('quick|EN', 1), ('brown|EN', 1)
Reduce
[ ('the|EN', 1), ('the|EN', 1), ('the|EN', 1) ]
C('the'|EN) = SUM of values = 3
$w_{\text{'the'}|\text{EN}} = \frac{C(\text{'the'}|\text{EN})}{C(\text{EN})}$
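The map and reduce steps on this slide can be sketched in plain Python, run sequentially in one process rather than on a MapReduce cluster. The function names, the two-sentence data set, and the reading of C(EN) as the total number of EN tokens are assumptions for illustration.

```python
from collections import defaultdict


def map_example(text, label):
    """Map: emit ('word|label', 1) for every token of one labeled example."""
    for word in text.lower().split():
        yield (f"{word.strip('.,')}|{label}", 1)


def reduce_counts(pairs):
    """Reduce: sum the values that share a key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


data = [
    ("The quick brown fox jumped over the lazy dog.", "EN"),
    ("No hay mal que por bien no venga.", "ES"),
]
pairs = [kv for text, label in data for kv in map_example(text, label)]
counts = reduce_counts(pairs)

# C(EN) is taken here as the number of EN tokens (an assumption).
total_en = sum(c for key, c in counts.items() if key.endswith("|EN"))
w_the_en = counts["the|EN"] / total_en
print(counts["the|EN"], total_en, w_the_en)
```

Because every count is a sum, the mappers need no coordination, which is what makes this estimate embarrassingly parallel.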
![Page 23: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/23.jpg)
Word Counting
[Figure: Map: Big Data is split into Shard 1, Shard 2, Shard 3, ..., Shard M, read by Mapper 1, Mapper 2, Mapper 3, ..., Mapper M, which emit pairs such as ('the' | EN, 1), ('fox' | EN, 1), ..., ('montañas' | ES, 1). Reduce: a Reducer tallies counts and updates w to produce the Model.]
![Page 24: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/24.jpg)
Parallelize Optimization
• Maximum Entropy Classifiers
• Good: J(w) is concave
• Bad: no closed-form solution like NB
• Ugly: Large N
$\arg\min_w \prod_{i=1}^{N} \frac{\exp\left(\sum_{p=1}^{P} w_p x_{ip}\right)^{y_i}}{1 + \exp\left(\sum_{p=1}^{P} w_p x_{ip}\right)}$
![Page 25: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/25.jpg)
Gradient Descent
http://www.cs.cmu.edu/~epxing/Class/10701/Lecture/lecture7.pdf
![Page 26: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/26.jpg)
Gradient Descent
• w is initialized as zero
• for t in 1 to T
  • Calculate gradients $\nabla J(w)$
  • $w_{t+1} \leftarrow w_t - \eta \nabla J(w)$
$\nabla J(w) = \sum_{i=1}^{N} P(w, x_i, y_i)$
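The loop above can be sketched for a binary maximum entropy (logistic regression) model. This sketch assumes J(w) is the negative log-likelihood, so that the per-example term is (sigmoid(w · x_i) − y_i) x_i. The toy data, step size `eta`, and iteration count are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(w, data):
    """Sum of per-example gradients of the negative log-likelihood."""
    grad = [0.0] * len(w)
    for x, y in data:
        error = sigmoid(sum(wp * xp for wp, xp in zip(w, x))) - y
        for p, xp in enumerate(x):
            grad[p] += error * xp
    return grad


def gradient_descent(data, num_features, eta=0.5, steps=200):
    w = [0.0] * num_features            # w is initialized as zero
    for _ in range(steps):              # for t in 1 to T
        grad = gradient(w, data)        # calculate gradients
        w = [wp - eta * gp for wp, gp in zip(w, grad)]
    return w


# Toy data: [bias, value], label 1 when value > 0.
data = [([1.0, -2.0], 0), ([1.0, -1.0], 0), ([1.0, 1.0], 1), ([1.0, 2.0], 1)]
w = gradient_descent(data, num_features=2)
print(w)
```

Each call to `gradient` touches all N examples, which is the cost the following slides distribute.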
![Page 27: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/27.jpg)
Distribute Gradient
• w is initialized as zero
• for t in 1 to T
  • Calculate gradients in parallel
  • $w_{t+1} \leftarrow w_t - \eta \nabla J(w)$
• Training CPU: O(TPN) to O(TPN / M)
![Page 28: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/28.jpg)
Distribute Gradient
[Figure: Map: Big Data is split into Shard 1, Shard 2, Shard 3, ..., Shard M on Machine 1, Machine 2, Machine 3, ..., Machine M; each machine emits (dummy key, partial gradient sum). Reduce: sum and update w to produce the Model. Repeat M/R until convergence.]
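A minimal single-process sketch of this map and reduce cycle follows, assuming a logistic-regression gradient and a fixed number of rounds instead of a convergence test. The function names, the toy data, and the round-robin sharding are illustrative assumptions. A real deployment would run each `partial_gradient` call on a different machine.

```python
import math


def partial_gradient(w, shard):
    """Map: one machine sums the logistic-loss gradients over its shard."""
    grad = [0.0] * len(w)
    for x, y in shard:
        z = sum(wp * xp for wp, xp in zip(w, x))
        error = 1.0 / (1.0 + math.exp(-z)) - y
        grad = [gp + error * xp for gp, xp in zip(grad, x)]
    return ("dummy", grad)   # every mapper emits the same dummy key


def sum_and_update(w, emitted, eta):
    """Reduce: add the partial sums, then take one gradient step."""
    total = [sum(parts) for parts in zip(*(grad for _, grad in emitted))]
    return [wp - eta * gp for wp, gp in zip(w, total)]


data = [([1.0, -2.0], 0), ([1.0, -1.0], 0), ([1.0, 1.0], 1), ([1.0, 2.0], 1)]
M = 2
shards = [data[m::M] for m in range(M)]   # round-robin split into M shards

w = [0.0, 0.0]
for _ in range(100):                      # repeat M/R (fixed T here)
    emitted = [partial_gradient(w, shard) for shard in shards]
    w = sum_and_update(w, emitted, eta=0.5)
print(w)
```

Since the full gradient is a sum over examples, the summed partial gradients equal the single-machine gradient, so the result matches plain gradient descent up to floating-point rounding.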
![Page 29: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/29.jpg)
Scaling Up
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 30: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/30.jpg)
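Parallelize Subroutines
• Support Vector Machines
$\arg\min_{w, b, \zeta} \; \frac{1}{2} \|w\|_2^2 + C \sum_{i=1}^{n} \zeta_i \quad \text{s.t.} \quad 1 - y_i (w \cdot \phi(x_i) + b) \le \zeta_i, \; \zeta_i \ge 0$
• Solve the dual problem
$\arg\min_{\alpha} \; \frac{1}{2} \alpha^T Q \alpha - \alpha^T 1 \quad \text{s.t.} \quad 0 \le \alpha \le C, \; y^T \alpha = 0$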
![Page 31: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/31.jpg)
http://www.flickr.com/photos/sea-turtle/198445204/
The computational cost for the Primal-Dual Interior Point Method is O(n^3) in time and O(n^2) in memory
![Page 32: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/32.jpg)
Parallel SVM
• Parallel, row-wise incomplete Cholesky Factorization for Q
• Parallel interior point method
• Time O(n^3) becomes O(n^2 / M)
• Memory O(n^2) becomes O(n√N / M)
• Parallel Support Vector Machines (psvm) http://code.google.com/p/psvm/
• Implemented in MPI
[Chang et al, 2007]
![Page 33: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/33.jpg)
Parallel ICF
• Distribute Q by row into M machines
• For each dimension n < √N
  • Send local pivots to master
  • Master selects the largest local pivot and broadcasts the global pivot to workers
[Figure: rows of Q distributed across machines: Machine 1 holds row 1 and row 2, Machine 2 holds row 3 and row 4, Machine 3 holds row 5 and row 6, ...]
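The pivot exchange described above can be sketched with a pivoted incomplete Cholesky factorization in which the machines are simulated sequentially in one process. This is an illustrative sketch under stated assumptions, not the psvm implementation: the function name, the contiguous row assignment, and the small test matrix are all hypothetical.

```python
import math


def parallel_icf(Q, rank, num_machines):
    """Pivoted incomplete Cholesky: returns H with Q ~= H H^T."""
    n = len(Q)
    chunk = -(-n // num_machines)             # ceil(n / M) rows per machine
    owned = [list(range(s, min(s + chunk, n))) for s in range(0, n, chunk)]
    H = [[0.0] * rank for _ in range(n)]
    diag = [Q[i][i] for i in range(n)]        # residual diagonal
    for k in range(rank):
        # Each machine sends its best local pivot (value, row) to the master.
        local = [max((diag[i], i) for i in rows) for rows in owned]
        pivot_value, j = max(local)           # master picks the global pivot
        if pivot_value <= 1e-12:
            break
        # Master broadcasts pivot row j; every machine updates its own rows.
        pivot_row = H[j][:k]
        for rows in owned:
            for i in rows:
                dot = sum(a * b for a, b in zip(H[i][:k], pivot_row))
                H[i][k] = (Q[i][j] - dot) / math.sqrt(pivot_value)
                diag[i] -= H[i][k] ** 2
    return H


Q = [[4.0, 2.0, 0.0, 0.0],
     [2.0, 5.0, 1.0, 0.0],
     [0.0, 1.0, 3.0, 1.0],
     [0.0, 0.0, 1.0, 2.0]]
H = parallel_icf(Q, rank=4, num_machines=2)
error = max(abs(Q[i][j] - sum(a * b for a, b in zip(H[i], H[j])))
            for i in range(4) for j in range(4))
print(error)
```

With `rank` equal to the matrix size the factorization is exact up to rounding. A smaller rank, such as the √N on the slide, gives an approximation while each machine stores only its own rows of H.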
![Page 34: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/34.jpg)
![Page 35: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/35.jpg)
Scaling Up
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 36: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/36.jpg)
Majority Vote
[Figure: Map: Big Data is split into Shard 1, Shard 2, Shard 3, ..., Shard M on Machine 1, Machine 2, Machine 3, ..., Machine M; the machines produce Model 1, Model 2, Model 3, Model 4.]
![Page 37: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/37.jpg)
Majority Vote
• Train individual classifiers independently
• Predict by taking majority votes
• Training CPU: O(TPN) to O(TPN / M)
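A small sketch of majority voting follows, assuming each shard trains a simple binary perceptron. The choice of perceptron, the toy data, and the function names are illustrative assumptions; the slide does not name a base classifier.

```python
from collections import Counter


def train_perceptron(shard, num_features, epochs=10):
    """Train one binary perceptron (labels +1 / -1) on a single shard."""
    w = [0.0] * num_features
    for _ in range(epochs):
        for x, y in shard:
            if y * sum(wp * xp for wp, xp in zip(w, x)) <= 0:
                w = [wp + y * xp for wp, xp in zip(w, x)]
    return w


def predict(w, x):
    return 1 if sum(wp * xp for wp, xp in zip(w, x)) > 0 else -1


def majority_vote(models, x):
    """Each model votes; the most common label wins."""
    votes = Counter(predict(w, x) for w in models)
    return votes.most_common(1)[0][0]


# Toy data: [bias, value], label +1 when value > 0.
data = [([1.0, v], 1 if v > 0 else -1) for v in (-3, -2, -1, 1, 2, 3)]
M = 3
shards = [data[m::M] for m in range(M)]          # one shard per machine
models = [train_perceptron(s, num_features=2) for s in shards]
print([majority_vote(models, x) for x, _ in data])
```

In this toy run the third model misclassifies the example with value 1, and the vote of the other two corrects it.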
![Page 38: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/38.jpg)
Parameter Mixture
[Figure: Map: Big Data is split into Shard 1, Shard 2, Shard 3, ..., Shard M on Machine 1, Machine 2, Machine 3, ..., Machine M; the machines emit (dummy key, w1), (dummy key, w2), .... Reduce: average w to produce the Model.]
[Mann et al, 2009]
![Page 39: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/39.jpg)
http://www.flickr.com/photos/annamatic3000/127945652/
Much less network usage than distributed gradient descent: O(MN) vs. O(MNT)
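Parameter mixture can be sketched as follows: every machine trains to completion on its own shard, and the weight vectors cross the network only once, when they are averaged. The logistic-regression trainer, the uniform average, the toy data, and the names used here are assumptions for illustration.

```python
import math


def train_on_shard(shard, num_features, eta=0.5, steps=100):
    """One machine: gradient descent for logistic regression on its shard."""
    w = [0.0] * num_features
    for _ in range(steps):
        grad = [0.0] * num_features
        for x, y in shard:
            z = sum(wp * xp for wp, xp in zip(w, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            grad = [gp + error * xp for gp, xp in zip(grad, x)]
        w = [wp - eta * gp for wp, gp in zip(w, grad)]
    return w


def average(weight_vectors):
    """Reduce: uniform average of the M weight vectors."""
    m = len(weight_vectors)
    return [sum(ws) / m for ws in zip(*weight_vectors)]


# Toy data: [bias, value], label 1 when value > 0.
data = [([1.0, v], 1 if v > 0 else 0) for v in (-3, -2, -1, 1, 2, 3)]
M = 3
shards = [data[m::M] for m in range(M)]
emitted = [("dummy", train_on_shard(s, num_features=2)) for s in shards]
w = average([w_m for _, w_m in emitted])   # the only network round
print(w)
```

Distributed gradient descent, by contrast, sends gradients over the network in each of its T iterations.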
![Page 40: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/40.jpg)
![Page 41: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/41.jpg)
Iterative Param Mixture
[Figure: Map: Big Data is split into Shard 1, Shard 2, Shard 3, ..., Shard M on Machine 1, Machine 2, Machine 3, ..., Machine M; the machines emit (dummy key, w1), (dummy key, w2), .... Reduce after each epoch: average w to produce the Model.]
[McDonald et al., 2010]
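The iterative variant can be sketched by averaging after every epoch and restarting each machine from the shared average. The perceptron update, the fixed epoch count, the toy data, and the function names are assumptions made for this sketch.

```python
def perceptron_epoch(w, shard):
    """One pass of perceptron updates (labels +1 / -1) over a shard."""
    w = list(w)
    for x, y in shard:
        if y * sum(wp * xp for wp, xp in zip(w, x)) <= 0:
            w = [wp + y * xp for wp, xp in zip(w, x)]
    return w


def iterative_parameter_mixture(shards, num_features, epochs=5):
    w = [0.0] * num_features
    for _ in range(epochs):
        # Map: every machine starts the epoch from the shared w.
        emitted = [("dummy", perceptron_epoch(w, s)) for s in shards]
        # Reduce after each epoch: average, then send w back out.
        vectors = [w_m for _, w_m in emitted]
        w = [sum(ws) / len(vectors) for ws in zip(*vectors)]
    return w


# Toy data: [bias, value], label +1 when value > 0.
data = [([1.0, v], 1 if v > 0 else -1) for v in (-3, -2, -1, 1, 2, 3)]
shards = [data[m::3] for m in range(3)]
w = iterative_parameter_mixture(shards, num_features=2)
print(w)
```

Compared with a single final average, this costs one network round per epoch but lets every shard benefit from what the others have learned.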
![Page 42: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/42.jpg)
![Page 43: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/43.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 44: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/44.jpg)
http://www.flickr.com/photos/mr_t_in_dc/5469563053/
Scalable
![Page 45: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/45.jpg)
http://www.flickr.com/photos/aloshbennett/3209564747/
Parallel
![Page 46: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/46.jpg)
http://www.flickr.com/photos/wanderlinse/4367261825/
Accuracy
![Page 47: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/47.jpg)
http://www.flickr.com/photos/imagelink/4006753760/
![Page 48: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/48.jpg)
http://www.flickr.com/photos/brenderous/4532934181/
Binary Classification
![Page 49: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/49.jpg)
http://www.flickr.com/photos/mararie/2340572508/
Automatic Feature
Discovery
![Page 50: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/50.jpg)
http://www.flickr.com/photos/prunejuice/3687192643/
Fast Response
![Page 51: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/51.jpg)
http://www.flickr.com/photos/jepoirrier/840415676/
Memory is the new hard disk.
![Page 52: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/52.jpg)
http://www.flickr.com/photos/neubie/854242030/
Algorithm + Infrastructure
![Page 53: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/53.jpg)
Design for Multicores
http://www.flickr.com/photos/geektechnique/2344029370/
![Page 54: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/54.jpg)
Combiner
![Page 55: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/55.jpg)
![Page 56: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/56.jpg)
Multi-shard Combiner
[Chandra et al., 2010]
![Page 57: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/57.jpg)
Machine Learning on Big Data
![Page 58: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/58.jpg)
Parallelize ML Algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 59: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/59.jpg)
Parallel Accuracy
Fast Response
![Page 60: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.vdocuments.net/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/60.jpg)
Google APIs
• Prediction API
• machine learning service on the cloud
• http://code.google.com/apis/predict
• BigQuery
• interactive analysis of massive data on the cloud
• http://code.google.com/apis/bigquery