DIY Deep Learning with Caffe Workshop

TRANSCRIPT

  1. 1. DIY DEEP LEARNING WITH CAFFE. Kate Saenko, UMass Lowell. Open Data Science Conference, Boston 2015. @opendatasci
  2. 2. Caffe: Open Source Deep Learning Library. caffe.berkeleyvision.org github.com/BVLC/caffe. Based on tutorials by the Caffe creators at UC Berkeley: Evan Shelhamer, Jon Long, Sergio Guadarrama, Jeff Donahue, Ross Girshick, and Yangqing Jia.
  3. 3. Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell Caffe crew ...plus the cold-brew and over 100 open source contributors!
  4. 4. See also: upcoming Caffe Tutorial at CVPR 2015 in Boston, June 7 http://tutorial.caffe.berkeleyvision.org/ Caffe Tutorial Deep learning framework tutorial by Evan Shelhamer, Jeff Donahue, Jon Long, Yangqing Jia and Ross Girshick
  5. 5. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  6. 6. Why Deep Learning? End-to-End Learning for Many Tasks Pinterest used deep learning-based visual search to find pinned images of bags similar to the one on the left. Image Credit: Pinterest
  7. 7. Why Deep Learning? Compositional Models Learned End-to-End. Hierarchy of representations, from concrete to abstract: vision: pixel, motif, part, object; text: character, word, clause, sentence; speech: audio, band, phone, word. (Figure credit: Yann LeCun, ICML '13 tutorial.)
  8. 8. Why Deep Learning? Compositional Models Learned End-to-End. Back-propagation jointly learns all of the model parameters to optimize the output for the task. (Figure credit: Yann LeCun, ICML '13 tutorial.)
  9. 9. http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/
  10. 10. Why Deep Learning? All in a day's work with Caffe: http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/
  11. 11. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  12. 12. What is Caffe? Prototype, Train, Deploy. Open framework, models, and worked examples for deep learning: - 1.5 years old - 300+ citations, 100+ contributors - 2,000+ forks, >1 pull request / day average - focus has been vision, but branching out: sequences, reinforcement learning, speech + text
  13. 13. What is Caffe? Prototype Train Deploy Open framework, models, and worked examples for deep learning - Pure C++ / CUDA architecture for deep learning - Command line, Python, MATLAB interfaces - Fast, well-tested code - Tools, reference models, demos, and recipes - Seamless switch between CPU and GPU
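  The seamless CPU/GPU switch mentioned above is a single call per interface in practice; as a minimal pycaffe sketch (assuming the compiled caffe Python module is on your path):

      import caffe

      caffe.set_mode_cpu()       # run nets on the CPU
      # ... define and run a net ...
      caffe.set_device(0)        # select GPU 0
      caffe.set_mode_gpu()       # same nets and weights, now on the GPU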
  14. 14. Caffe is a Community. [GitHub project pulse screenshot]
  15. 15. Reference Models. Caffe offers the model definitions, optimization settings, and pre-trained weights so you can start right away. The BVLC models are licensed for unrestricted use. The community shares models in our Model Zoo. (Example: GoogLeNet, the ILSVRC14 winner.)
  16. 16. Brewing by the Numbers... Speed with Krizhevsky's 2012 model: 2 ms / image on a K40 GPU. Training from the command line: caffe train -solver lenet_solver.prototxt -gpu 0. Solver methods: Stochastic Gradient Descent (SGD) + momentum, Adaptive Gradient (ADAGRAD), Nesterov's Accelerated Gradient (NAG). A pycaffe equivalent is sketched below.
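  The caffe train command above has a pycaffe counterpart; a rough sketch, assuming a solver file like the lenet_solver.prototxt shipped in caffe/examples/mnist (the solver method itself, SGD vs. ADAGRAD vs. NAG, is selected inside the solver prototxt):

      import caffe

      caffe.set_device(0)
      caffe.set_mode_gpu()

      # load the solver configuration: net, learning rate, momentum, schedule
      solver = caffe.SGDSolver('lenet_solver.prototxt')

      solver.step(100)   # run 100 SGD iterations explicitly...
      # solver.solve()   # ...or run the full schedule from the solver file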
  17. 50. Layer Protocol. Setup: run once for initialization. Forward: make output given input. Backward: make gradient of output w.r.t. bottom and w.r.t. parameters (if needed). Reshape: set dimensions. Compositional Modeling: the Net's forward and backward passes are composed of the layers' steps. (See also: Layer Development Checklist.)
  18. 51. Layer Protocol == Class Interface. Define a class in C++ or Python to extend Layer. Include your new layer type in a network and keep brewing:
      layer {
        type: "Python"
        python_param {
          module: "layers"
          layer: "EuclideanLoss"
        }
      }
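  To make the layer protocol from the previous slide concrete, here is a rough Python sketch of a Euclidean loss layer in the spirit of caffe/examples/pycaffe; the class and module names are illustrative, not necessarily the exact ones referenced in the prototxt fragment above:

      import caffe
      import numpy as np

      class EuclideanLoss(caffe.Layer):
          """L2 loss: 0.5 * sum((bottom[0] - bottom[1])^2) / N."""

          def setup(self, bottom, top):
              # run once for initialization: sanity-check the wiring
              if len(bottom) != 2:
                  raise Exception("Need two bottoms to compute a distance.")

          def reshape(self, bottom, top):
              # set dimensions: scratch space for the difference, scalar loss
              self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
              top[0].reshape(1)

          def forward(self, bottom, top):
              # make output given input
              self.diff[...] = bottom[0].data - bottom[1].data
              top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

          def backward(self, top, propagate_down, bottom):
              # make gradient of the loss w.r.t. each bottom (if needed)
              for i in range(2):
                  if not propagate_down[i]:
                      continue
                  sign = 1 if i == 0 else -1
                  bottom[i].diff[...] = sign * self.diff / bottom[i].num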
  19. 52. Loss. What kind of model is this? Define the task by the loss: loss (LOSS_TYPE).
      - Classification: SoftmaxWithLoss, HingeLoss
      - Linear Regression: EuclideanLoss
      - Attributes / Multiclassification: SigmoidCrossEntropyLoss
      - Others...
      - New Task: NewLoss
  20. 53. Recipe for Brewing.
      - Convert the data to Caffe format: lmdb, leveldb, hdf5 / .mat, list of images, etc. (sketch below)
      - Define the Net
      - Configure the Solver
      - caffe train -solver solver.prototxt -gpu 0
      - Examples are your friends: caffe/examples/mnist, cifar10, imagenet; caffe/examples/*.ipynb; caffe/models/*
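  For the data-conversion step, HDF5 is often the quickest route for non-image data; Caffe's HDF5Data layer reads a text file listing .h5 files that contain 'data' and 'label' datasets. A minimal sketch with made-up toy data (all file names here are placeholders):

      import h5py
      import numpy as np

      # toy data: 100 examples, 3x32x32, float32, with float labels
      X = np.random.rand(100, 3, 32, 32).astype(np.float32)
      y = (np.arange(100) % 10).astype(np.float32)

      with h5py.File('train.h5', 'w') as f:
          f.create_dataset('data', data=X)
          f.create_dataset('label', data=y)

      # the HDF5Data layer's source is a list file, one .h5 path per line
      with open('train_h5_list.txt', 'w') as f:
          f.write('train.h5\n')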
  21. 54. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  22. 55. Classification: instant recognition, the Caffe way (see notebook).
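  The notebook referred to here is the classification example shipped with Caffe (caffe/examples/00-classification.ipynb); stripped down, instant recognition looks roughly like the sketch below, where the deploy prototxt, weights file, and image path are placeholders for the reference CaffeNet files used in the notebook:

      import numpy as np
      import caffe

      caffe.set_mode_cpu()
      net = caffe.Net('deploy.prototxt',        # model architecture
                      'caffenet.caffemodel',    # pre-trained weights
                      caffe.TEST)

      # preprocess: HxWxC [0,1] RGB -> CxHxW [0,255] BGR as the net expects
      transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
      transformer.set_transpose('data', (2, 0, 1))
      transformer.set_raw_scale('data', 255)
      transformer.set_channel_swap('data', (2, 1, 0))

      image = caffe.io.load_image('cat.jpg')
      net.blobs['data'].data[...] = transformer.preprocess('data', image)

      out = net.forward()
      print('predicted class: %d' % out['prob'][0].argmax())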
  23. 56. Brewing Models: from logistic regression to non-linearity (see notebook).
  24. 57. Latest Roast.
      - Parallelism (open pull requests): synchronous SGD #2114; data parallelism; ~linear speedup on multiple GPUs when computation >> communication; Flickr + NVIDIA collaboration on distributed optimization + asynchronous SGD
      - Python Net Specification #2086 (sketched below)
      - New MATLAB Interface #2505
      These will be bundled in the next release shortly.
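  The Python net specification in #2086 generates prototxt from Python code; a sketch in the spirit of the MNIST example accompanying that work (layer choices and file names here are illustrative, and exact names may shift before the release):

      import caffe
      from caffe import layers as L, params as P

      n = caffe.NetSpec()
      n.data, n.label = L.Data(batch_size=64, backend=P.Data.LMDB,
                               source='mnist_train_lmdb',
                               transform_param=dict(scale=1. / 255), ntop=2)
      n.ip1 = L.InnerProduct(n.data, num_output=500,
                             weight_filler=dict(type='xavier'))
      n.relu1 = L.ReLU(n.ip1, in_place=True)
      n.ip2 = L.InnerProduct(n.relu1, num_output=10,
                             weight_filler=dict(type='xavier'))
      n.loss = L.SoftmaxWithLoss(n.ip2, n.label)

      # write the generated prototxt, ready for caffe train
      with open('net.prototxt', 'w') as f:
          f.write(str(n.to_proto()))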
  25. 58. Help Brewing.
      - Documentation: tutorial documentation, worked examples
      - Modeling, Usage, and Install: caffe-users group, gitter.im chat, Caffe @ CVPR15
      - Convolutional Nets: CS231n online convnet class by Andrej Karpathy and Fei-Fei Li; Deep Learning for Vision tutorial from CVPR '14
  26. 59. Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell Thanks to the Caffe crew ...plus the cold-brew and open source contributors!
  27. 60. Acknowledgements Thank you to the Berkeley Vision and Learning Center Sponsors. Thank you to NVIDIA for GPU donation and collaboration on cuDNN. Thank you to our 75+ open source contributors and vibrant community. Thank you to A9 and AWS for a research grant for Caffe dev and reproducible research.
  28. 61. References
      [ DeCAF ] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. ICML, 2014.
      [ R-CNN ] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014.
      [ Zeiler-Fergus ] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014.
      [ LeNet ] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
      [ AlexNet ] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. NIPS, 2012.
      [ OverFeat ] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014.
      [ Image-Style ] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, and H. Winnemoeller. Recognizing Image Style. BMVC, 2014.
      [ Karpathy14 ] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. CVPR, 2014.
      [ Sutskever13 ] I. Sutskever. Training Recurrent Neural Networks. PhD thesis, University of Toronto, 2013.
      [ Chopra05 ] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.