Intro to the Distributed Version of TensorFlow
Dr. Miha Pelko, NorCom*
@mpelko
*views are my own
WE ARE HIRING!
Configuration at Yahoo!:
“We avoid unnecessary data movement between Hadoop clusters and separate deep learning clusters.”
“YARN works well for deep learning. Multiple experiments of deep learning can be conducted concurrently on a single cluster. It makes deep learning extremely cost effective as opposed to conventional methods. In the past, we had teams use “notepad” to schedule GPU resources manually, which was painful and worked only for a small number of users.”
From: http://yahoohadoop.tumblr.com/post/129872361846/large-scale-distributed-deep-learning-on-hadoop
See: https://www.tensorflow.org/versions/r0.9/how_tos/image_retraining/index.html
Inception-v3
RETRAIN INSTEAD OF DISTRIBUTE
ERLKÖNIG RECOGNITION
3 SPECIFIC CATEGORIES
erlkönig
car
road
Cut the last layer and train a new one
~30 minutes on a desktop CPU
> 90% accuracy
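The retraining idea (keep the pretrained layers frozen, fit only a new final layer) can be sketched without TensorFlow. This is a toy illustration, not the retraining script from the how-to: `frozen_features` is a hypothetical stand-in for the pretrained network below the cut, and the 2-D points are made-up data rather than images.

```python
import math
import random

CLASSES = ["erlkönig", "car", "road"]  # the three categories on the slide

def frozen_features(x):
    """Hypothetical stand-in for the pretrained layers; never updated."""
    return [x[0], x[1], 1.0]  # the constant 1.0 serves as a bias input

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, features):
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def train_last_layer(samples, steps=200, learning_rate=0.1):
    """Fit only the new final layer with full-batch gradient descent."""
    # Features are computed once because the frozen layers never change.
    cached = [(frozen_features(x), label) for x, label in samples]
    dim = len(cached[0][0])
    weights = [[0.0] * dim for _ in CLASSES]
    for _ in range(steps):
        grads = [[0.0] * dim for _ in CLASSES]
        for features, label in cached:
            probs = softmax(class_scores(weights, features))
            for k in range(len(CLASSES)):
                error = probs[k] - (1.0 if k == label else 0.0)
                for j in range(dim):
                    grads[k][j] += error * features[j] / len(cached)
        for k in range(len(CLASSES)):
            for j in range(dim):
                weights[k][j] -= learning_rate * grads[k][j]
    return weights

def predict(weights, x):
    scores = class_scores(weights, frozen_features(x))
    return scores.index(max(scores))

# Synthetic 2-D points, one cluster per class (made-up data, not images).
rng = random.Random(0)
centers = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0)]
samples = [
    ((cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)), label)
    for label, (cx, cy) in enumerate(centers)
    for _ in range(20)
]
weights = train_last_layer(samples)
accuracy = sum(predict(weights, x) == y for x, y in samples) / len(samples)
print("training accuracy: %.2f" % accuracy)
```

Only the small weight matrix is trained, which is why this kind of retraining is cheap enough for a desktop CPU.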
Tasks
Jobs
One server per task!
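A minimal sketch of how jobs, tasks, and servers relate. A cluster is described as a mapping from job names to lists of task addresses, which is the shape `tf.train.ClusterSpec` accepts. The host names are hypothetical, and the helper `enumerate_tasks` is illustrative only, not part of TensorFlow.

```python
# Hypothetical host names; the slides do not name any machines.
CLUSTER = {
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
}

def enumerate_tasks(cluster):
    """Return one (job_name, task_index, address) tuple per task."""
    return [
        (job, index, address)
        for job, addresses in sorted(cluster.items())
        for index, address in enumerate(addresses)
    ]

for job, index, address in enumerate_tasks(CLUSTER):
    # In TensorFlow each entry would be its own process that starts
    # tf.train.Server(cluster_spec, job_name=job, task_index=index).
    print("/job:%s/task:%d -> %s" % (job, index, address))
```

Two jobs with three tasks in total means three server processes, one per task.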
tf.train.Supervisor
§ Wrapper over a Coordinator, a Saver, and a SessionManager
§ Variable initialization
§ Checkpointing
§ Writing summaries to the log
§ Automatic initialization from the most recent checkpoint
§ is_chief flag in replica-type models
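The bullets above appear to describe `tf.train.Supervisor`. The startup decision it makes for each replica can be sketched in plain Python. This is a simplification under stated assumptions, not the library's implementation: the function names and the `checkpoint_steps` input are hypothetical, and the rule that worker task 0 acts as chief follows the convention used in the distributed how-to.

```python
def is_chief(job_name, task_index):
    """Convention assumed here: worker task 0 is the chief replica."""
    return job_name == "worker" and task_index == 0

def startup_action(job_name, task_index, checkpoint_steps):
    """Decide what a replica does when it starts.

    checkpoint_steps: step numbers of the checkpoints found in the log
    directory (hypothetical input; an empty list means none exist).
    """
    if not is_chief(job_name, task_index):
        # Non-chief replicas wait until the chief has prepared the model.
        return "wait for chief"
    if checkpoint_steps:
        # The chief resumes from the most recent checkpoint.
        return "restore step %d" % max(checkpoint_steps)
    return "initialize variables"

print(startup_action("worker", 0, []))
print(startup_action("worker", 0, [100, 250]))
print(startup_action("worker", 1, [100, 250]))
```

The point of the is_chief flag is that exactly one replica owns initialization and checkpointing, so the others never race to do the same work.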
Source: http://download.tensorflow.org/paper/whitepaper2015.pdf
Performance of Distributed TensorFlow: A Multi-Node and Multi-GPU Configuration
http://www.altoros.com/performance-benchmark-distributed-tensorflow.html
§ Putting it all together (including deployment management)
§ See: https://www.tensorflow.org/versions/r0.9/how_tos/distributed/index.html
§ See: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/dist_test
§ See: https://github.com/bwahn/tensorflow-kr-docker
§ In-graph replication vs. between-graph replication
§ See: https://www.tensorflow.org/versions/r0.9/how_tos/distributed/index.html#replicated-training
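In between-graph replication each worker process builds its own copy of the graph, keeps the shared variables on the parameter-server tasks, and runs the compute ops locally. `tf.train.replica_device_setter` handles that placement and by default spreads variables over the ps tasks in round-robin order. The sketch below mimics only the placement rule in plain Python; the helper names and variable names are hypothetical.

```python
def place_variables(variable_names, num_ps_tasks):
    """Assign variables to parameter-server tasks in round-robin order."""
    return {
        name: "/job:ps/task:%d" % (i % num_ps_tasks)
        for i, name in enumerate(variable_names)
    }

def worker_device(task_index):
    """Device on which a worker replica runs its compute ops."""
    return "/job:worker/task:%d" % task_index

# Hypothetical variable names for a two-layer model.
placement = place_variables(["w1", "b1", "w2", "b2"], num_ps_tasks=2)
for name, device in placement.items():
    print("%s -> %s" % (name, device))
print("compute ops for replica 1 -> %s" % worker_device(1))
```

With in-graph replication, by contrast, a single client builds one graph that contains every replica and assigns the devices itself.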
§ Specific hardware components
§ How to handle GPUs?
§ Other hardware?
§ Model-splitting parallelization
§ You’re on your own
§ See also: https://www.youtube.com/watch?v=YAkdydqUE2c
Thank you.