PyData NYC by Akira Shibata
DESCRIPTION
Deck used for my talk at PyData NYC, in which I described how we improved thumbnail cropping in our news app, Kamelio. We used Deep Learning object detection to identify the interesting regions of the image, which were subsequently fed into the image cropping logic.
TRANSCRIPT
![Page 1: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/1.jpg)
Copyright 2014 Shiroyagi Corporation. All rights reserved.
Akira Shibata, PhD
Putting Together World's Best Data Processing Research with Python
Shiroyagi Corporation
![Page 2: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/2.jpg)
Who am I
Akira Shibata, PhD.
TW: @punkphysicist
CEO, Shiroyagi Corporation (shiroyagi.co.jp)
Kamelio: Personalised News Curation
Kamect: Contents Discovery Platform
2004 - 2010:
Data Scientist @ NYU
Statistical data modelling @ LHC, CERN
2010 - 2013
Boston Consulting Group
![Page 3: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/3.jpg)
![Page 4: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/4.jpg)
Statistical modelling of Physics data
Confirmatory: Highly theory driven model building
![Page 5: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/5.jpg)
Telling discovery from noise
The model tells you the expected uncertainty
![Page 6: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/6.jpg)
![Page 7: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/7.jpg)
![Page 8: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/8.jpg)
![Page 9: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/9.jpg)
![Page 10: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/10.jpg)
![Page 11: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/11.jpg)
Kamelio
“Deep Learning”
“Internet of Things”
“Global Strategy”
“Medical IT”
Collects news across more than 3M topics to choose from
![Page 12: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/12.jpg)
3
“Cats”
“Anime”
“Cats reaction to sighting dogs for the first time”
![Page 13: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/13.jpg)
Python puts all our tools together
Steps: 0. Image in, 1. Detect regions, 2. Object recog., 3. Scoring, 4. Cropping
Tools: IPython and Python script, Matlab + Scipy, C++ + libraries, Numpy, PIL
![Page 14: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/14.jpg)
Our approach is heavily influenced by Berkeley Vision and Learning Center
Acknowledgement
![Page 15: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/15.jpg)
Step 1: Detect regions
![Page 16: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/16.jpg)
Region detection: telling where to look
How do we find regions to feed into object recognition? The default strategy was to look at the center.
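The default strategy of looking at the center can be sketched as below. This is only an illustration, not the app's code: the function name `center_square_box` is hypothetical, and a square thumbnail is assumed.

```python
def center_square_box(width, height):
    """Return (x1, y1, x2, y2) of the largest centred square in a width x height image."""
    side = min(width, height)
    x1 = (width - side) // 2
    y1 = (height - side) // 2
    return (x1, y1, x1 + side, y1 + side)


# A landscape image keeps its middle square; anything off-centre is cut away.
print(center_square_box(640, 480))
```

The weakness is visible in the arithmetic: the box depends only on the image size, never on where the subject is.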
![Page 17: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/17.jpg)
Exhaustive windows -> segmentation
Search over position, scale, aspect ratio
Grouping parts of the image at different scales
Exhaustive search is far too time-consuming for use with Deep Learning
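To get a feel for the cost of exhaustive search, the sketch below counts sliding windows over position, scale and aspect ratio. The stride, scales and aspect ratios are illustrative assumptions, not values from the talk, and `count_windows` is a hypothetical helper.

```python
def count_windows(width, height, scales, aspect_ratios, stride):
    """Count sliding windows that fit inside a width x height image."""
    total = 0
    for scale in scales:                # window height in pixels
        for ratio in aspect_ratios:     # window width / window height
            win_w = int(round(scale * ratio))
            win_h = scale
            if win_w > width or win_h > height:
                continue                # this window shape does not fit
            n_x = (width - win_w) // stride + 1
            n_y = (height - win_h) // stride + 1
            total += n_x * n_y
    return total


n = count_windows(640, 480, scales=[64, 128, 256],
                  aspect_ratios=[0.5, 1.0, 2.0], stride=8)
print(n, "candidate windows for a single 640x480 image")
```

Every one of those windows would need its own pass through the network, which is why a proposal method that returns far fewer regions is attractive.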
![Page 18: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/18.jpg)
Region detection: in practice
1. Install Matlab and the Selective Search algorithm from the author
2. Run Matlab as a subprocess

    import shlex
    import subprocess

    # Matlab command line: run Selective Search on the image, then exit
    mc = "matlab -nojvm -r \"try; selective_search({'image_file.jpg'}, 'output.mat'); catch; exit; end; exit\""
    pid = subprocess.Popen(shlex.split(mc), stdout=open('/dev/null', 'w'), cwd=script_dirname)

3. Import the output using scipy.io

    import numpy as np
    import scipy.io

    all_boxes = list(scipy.io.loadmat('output.mat')['all_boxes'][0])
    # Subtract 1 from the first two box coordinates (Matlab indexing is 1-based)
    subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]
    all_boxes = [boxes - subtractor for boxes in all_boxes]
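The `subtractor` step is easier to see on a toy example. The sketch below uses made-up boxes and assumes the column order (ymin, xmin, ymax, xmax); neither comes from the talk.

```python
import numpy as np

# Two made-up proposals for one image, in Matlab's 1-based inclusive convention.
# Assumed column order: (ymin, xmin, ymax, xmax).
boxes_matlab = np.array([[1, 1, 4, 6],
                         [3, 2, 5, 5]])

# Subtract 1 from the min corner only: the max corner then serves as an
# exclusive slice end in NumPy.
subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]
boxes = boxes_matlab - subtractor

image = np.arange(5 * 6).reshape(5, 6)   # stand-in for a 5x6 image
ymin, xmin, ymax, xmax = boxes[0]
patch = image[ymin:ymax, xmin:xmax]
print(boxes.tolist(), patch.shape)
```

Under that assumption, a box spanning rows 1 to 4 and columns 1 to 6 in Matlab becomes the slice `[0:4, 0:6]`, which covers the same 4x6 pixels.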
![Page 19: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/19.jpg)
Region detection: proposals generated
~200 proposals generated per image
![Page 20: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/20.jpg)
Step 2: Object recog.
![Page 21: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/21.jpg)
Object recognition
Deep Blue beat Kasparov at chess in 1997…
![Page 22: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/22.jpg)
Deep Learning: Damn good at it
![Page 23: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/23.jpg)
Convolutional Neural Network
…
![Page 24: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/24.jpg)
Caffe: open framework for CNNs and R-CNN detection, under rapid development
C++/CUDA with Python wrapper
![Page 25: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/25.jpg)
Pre-trained models published
We used the 200-category object recognition model developed for the 2013 ImageNet Challenge
![Page 26: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/26.jpg)
Object recognition: in practice
1. Install a bunch of libraries and Caffe: CUDA, Boost, OpenCV, BLAS…
2. Import the wrapper and configure

    import numpy as np
    import caffe

    MODEL_FILE = 'models/bvlc_…_ilsvrc13/deploy.prototxt'
    PRETRAINED_FILE = 'models/…/bvlc_…_ilsvrc13.caffemodel'
    MEAN_FILE = 'caffe/imagenet/ilsvrc_2012_mean.npy'
    detector = caffe.Detector(MODEL_FILE, PRETRAINED_FILE,
                              mean=np.load(MEAN_FILE),
                              raw_scale=255, channel_swap=[2, 1, 0])

3. Pass the found regions for object detection

    # One (image file name, list of windows) pair per image
    detections = detector.detect_windows(zip(image_fnames, windows_list))
![Page 27: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/27.jpg)
Object recognition: Result
Takes minutes to detect all windows

| # | Obj | Score |
|---|-----|-------|
| 0 | domestic cat | 1.03649377823 |
| 1 | domestic cat | 0.0617411136627 |
| 2 | domestic cat | -0.097744345665 |
| 3 | domestic cat | -0.738470971584 |
| 4 | chair | -0.988844156265 |
| 5 | skunk | -0.999914288521 |
| 6 | tv or monitor | -1.00460898876 |
| 7 | rubber eraser | -1.01068615913 |
| 8 | chair | -1.04896986485 |
| 9 | rubber eraser | -1.09035253525 |
| 10 | band aid | -1.09691572189 |
![Page 28: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/28.jpg)
Object recognition: Result

| # | Obj | Score |
|---|-----|-------|
| 0 | person | 0.126184225082 |
| 1 | person | 0.0311727523804 |
| 2 | person | -0.0777613520622 |
| 3 | neck brace | -0.39757412672 |
| 4 | person | -0.415030777454 |
| 5 | drum | -0.421649754047 |
| 6 | neck brace | -0.481261610985 |
| 7 | tie | -0.649109125137 |
| 8 | neck brace | -0.719438135624 |
| 9 | face powder | -0.789100408554 |
| 10 | face powder | -0.838757038116 |
![Page 29: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/29.jpg)
Step 3: Scoring
![Page 30: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/30.jpg)
Scoring
1. For every pixel, sum up the score from all detections

    for i in range(len(detections)):
        # ymin, ymax, xmin, xmax, score: window and score of detection i
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
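The per-pixel accumulation can be sketched end to end on synthetic data. The detection tuples, their (xmin, ymin, xmax, ymax, score) layout and the function name `score_heatmap` are assumptions for illustration; the slide does not show how windows and scores are unpacked.

```python
import math

import numpy as np


def score_heatmap(shape, detections):
    """Add exp(score) to every pixel covered by each detection window."""
    arr = np.zeros(shape, dtype=float)
    for xmin, ymin, xmax, ymax, score in detections:
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
    return arr


# Made-up detections on a 6x8 (rows x columns) image.
detections = [
    (1, 1, 5, 4, 1.0),    # strong detection
    (3, 2, 8, 6, -1.0),   # weak detection overlapping the first
]
heat = score_heatmap((6, 8), detections)

# Location of the hottest pixel (first one, if several tie).
yp, xp = np.unravel_index(heat.argmax(), heat.shape)
print(yp, xp)
```

Exponentiating keeps every contribution positive, so windows with negative scores still add a little weight while high-scoring windows dominate the map.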
![Page 31: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/31.jpg)
Score heatmap
We used the 200-category object recognition model developed for the 2013 ImageNet Challenge
![Page 32: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/32.jpg)
Step 4: Cropping
![Page 33: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/33.jpg)
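Cropping
1. Generate all possible crop areas

    # Each window is a square of side hws inside the w x h image
    while y+hws <= h:
        while x+hws <= w:
            window_locs = np.vstack((window_locs, [x, y, x+hws, y+hws]))
            # x, then y, must advance here (increments not shown on the slide)

2. Find the crop that encloses the highest point of interest in the centre

    for i, window_loc in enumerate(window_locs):
        x1, y1, x2, y2 = window_loc
        if max_val != np.max(arr_con[y1:y2, x1:x2]):
            scores[i] = np.nan   # window misses the peak
        else:
            # squared distance from the window centre to the peak (xp, yp)
            scores[i] = ((x1+x2)/2.-xp)**2 + ((y1+y2)/2.-yp)**2

3. Crop and save!

    img_pil = Image.open(fn)
    # the smallest distance wins; nanargmin skips the NaN windows
    crop_area = tuple(int(v) for v in window_locs[np.nanargmin(scores)])
    img_crop = img_pil.crop(crop_area)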
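Putting the three cropping steps together, a self-contained sketch might look like the following. The stride `step`, the function name `best_square_crop` and the use of `np.nanargmin` to pick the window whose centre is closest to the heatmap peak are assumptions made here; the slide elides the loop increments and does not show how `xp`, `yp` or `arr_con` are computed.

```python
import numpy as np


def best_square_crop(heat, hws, step=1):
    """Return (x1, y1, x2, y2) of the hws-sized square that contains the
    heatmap peak and whose centre lies closest to that peak."""
    h, w = heat.shape
    max_val = heat.max()
    yp, xp = np.unravel_index(heat.argmax(), heat.shape)

    # 1. Generate all candidate crop windows.
    window_locs = [(x, y, x + hws, y + hws)
                   for y in range(0, h - hws + 1, step)
                   for x in range(0, w - hws + 1, step)]

    # 2. Score each window: NaN if it misses the peak, otherwise the squared
    #    distance between the window centre and the peak.
    scores = np.full(len(window_locs), np.nan)
    for i, (x1, y1, x2, y2) in enumerate(window_locs):
        if heat[y1:y2, x1:x2].max() == max_val:
            scores[i] = ((x1 + x2) / 2.0 - xp) ** 2 + ((y1 + y2) / 2.0 - yp) ** 2

    # 3. Pick the window with the smallest distance, ignoring NaNs.
    return window_locs[int(np.nanargmin(scores))]


heat = np.zeros((6, 8))
heat[2, 5] = 1.0                      # single made-up point of interest
x1, y1, x2, y2 = best_square_crop(heat, hws=4)
crop = heat[y1:y2, x1:x2]             # with PIL: Image.open(fn).crop((x1, y1, x2, y2))
print((x1, y1, x2, y2))
```

With a stride larger than 1 the peak can fall outside every window; `np.nanargmin` then raises `ValueError`, so a real pipeline would need a fallback such as the centre crop.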
![Page 34: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/34.jpg)
Finally
![Page 35: PyData NYC by Akira Shibata](https://reader033.vdocuments.net/reader033/viewer/2022052316/559ee49d1a28ab420a8b4735/html5/thumbnails/35.jpg)
Future improvements
Aspect detection: square or rectangle?
Magnification
Fast face/human detection
Unseen object
Object weighting