Deck used for my talk at PyData NYC, in which I described how we improved thumbnail cropping in our news app, Kamelio. We used Deep Learning object detection to identify the interesting regions of an image, which were then fed into the image-cropping logic.
PyData NYC by Akira Shibata
1. Putting Together World's Best Data Processing Research with Python
Copyright 2014 Shiroyagi Corporation. All rights reserved.
Akira Shibata, PhD
Shiroyagi Corporation
2. Who am I
Akira Shibata, PhD.
TW: @punkphysicist
CEO, Shiroyagi Corporation (shiroyagi.co.jp)
Kamelio: Personalised News Curation
Kamect: Contents Discovery Platform
2004 - 2010:
Data Scientist @ NYU
Statistical data modelling @ LHC, CERN
2010 - 2013
Boston Consulting Group
4. Statistical modelling of Physics data
Confirmatory:
Highly theory driven model building
5. Telling discovery from noise
The model tells you the expected uncertainty
11. Kamelio
“Deep Learning”
“Internet of Things”
“Medical IT”
“Global Strategy”
Collects news through >3M topics to choose from
12. 3
“Cats”
“Anime”
“Cats' reaction to sighting dogs for the first time”
13. Python puts all our tools together
Pipeline: 0 Image in -> 1 Detect regions (Matlab + Scipy) -> 2 Object recog. (C++ + libraries) -> 3 Scoring (Numpy) -> 4 Cropping (PIL)
All stages are driven from IPython and Python scripts
14. Acknowledgement
Our approach is heavily influenced by the Berkeley Vision and Learning Center
15. Detect regions
16. Region detection: Telling where to look
How do we find regions to feed into object recognition?
Default strategy was to look at the center
17. Exhaustive windows -> segmentation
Exhaustive windows: search over position, scale and aspect ratio
Segmentation: grouping parts of the image at different scales
Exhaustive search is far too slow for use with Deep Learning
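To make the cost argument concrete, here is a minimal sketch that counts how many windows an exhaustive search over position, scale and aspect ratio would produce. The function name, the scales, the aspect ratios and the stride are illustrative assumptions, not values from the talk.

```python
def count_exhaustive_windows(width, height, scales, aspect_ratios, stride):
    """Count sliding windows over every position, scale and aspect ratio.

    scales are window heights in pixels; aspect_ratios are width / height.
    All parameter values are hypothetical and chosen only for illustration.
    """
    total = 0
    for scale in scales:
        for ratio in aspect_ratios:
            win_h = scale
            win_w = int(round(scale * ratio))
            if win_w > width or win_h > height:
                continue  # this window shape does not fit in the image
            n_x = (width - win_w) // stride + 1   # horizontal positions
            n_y = (height - win_h) // stride + 1  # vertical positions
            total += n_x * n_y
    return total


n = count_exhaustive_windows(640, 480,
                             scales=[64, 128, 256],
                             aspect_ratios=[0.5, 1.0, 2.0],
                             stride=8)
print(n)
```

Even with these coarse assumed settings, a 640x480 image yields tens of thousands of windows, each of which would need its own pass through the network. That is the gap the segmentation-based proposals are meant to close.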
18. 1 Region detection: in practice
1 Install Matlab and the Selective Search algorithm from its author
2 Run Matlab as a subprocess (mc holds the Matlab command line shown below)
    pid = subprocess.Popen(shlex.split(mc), stdout=open('/dev/null', 'w'), cwd=script_dirname)
    matlab -nojvm -r "try; selective_search({'image_file.jpg'}, 'output.mat'); catch; exit; end; exit"
3 Import the output using scipy.io
    all_boxes = list(scipy.io.loadmat('output.mat')['all_boxes'][0])
    subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]
    all_boxes = [boxes - subtractor for boxes in all_boxes]
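The subtraction of (1, 1, 0, 0) in step 3 is easy to misread, so here is a small sketch of what it appears to do: Matlab reports 1-based, inclusive box corners, and shifting only the top-left corner by one turns them into 0-based, end-exclusive Python slices. The column order (ymin, xmin, ymax, xmax) and both helper names are assumptions for illustration; the slide does not state the order. Plain lists are used instead of NumPy arrays so the sketch runs on its own.

```python
def matlab_to_python_boxes(boxes):
    """Shift the (ymin, xmin) corner by one; leave (ymax, xmax) unchanged.

    Assumes each box is (ymin, xmin, ymax, xmax) in Matlab's 1-based,
    inclusive convention (the order is an assumption, not from the slide).
    """
    return [(ymin - 1, xmin - 1, ymax, xmax)
            for (ymin, xmin, ymax, xmax) in boxes]


def crop_rows(image_rows, box):
    """Cut a converted box out of an image stored as a list of rows."""
    ymin, xmin, ymax, xmax = box
    return [row[xmin:xmax] for row in image_rows[ymin:ymax]]


# A 3x4 toy "image"; the Matlab box covers rows 1..2 and columns 1..3.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]
converted = matlab_to_python_boxes([(1, 1, 2, 3)])
patch = crop_rows(image, converted[0])
```

Because the bottom-right corner is inclusive in Matlab and exclusive in a Python slice, leaving it untouched keeps the box the same size after conversion.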
19. 1 Region detection: proposals generated
~200 proposals generated per image
20. Object recog.
21. Object recognition
Deep Blue beat Kasparov at chess in 1997…
22. 2 Deep Learning: Damn good at it
23. 2 Convolutional Neural Network
24. Caffe: open-source deep learning framework with R-CNN support, under rapid development
C++/CUDA with Python wrapper
25. Pre-trained models published
We used the 200-category object recognition model developed for the 2013 ImageNet Challenge
26. 2 Object recognition: in practice
1 Install a bunch of libraries and Caffe
    CUDA, Boost, OpenCV, BLAS…
2 Import the wrapper and configure
    import numpy as np
    import caffe
    MODEL_FILE = 'models/bvlc_…_ilsvrc13/deploy.prototxt'
    PRETRAINED_FILE = 'models/…/bvlc_…_ilsvrc13.caffemodel'
    MEAN_FILE = 'caffe/imagenet/ilsvrc_2012_mean.npy'
    detector = caffe.Detector(MODEL_FILE, PRETRAINED_FILE,
        mean=np.load(MEAN_FILE), raw_scale=255, channel_swap=[2,1,0])
3 Pass the found regions for object detection
    detector.detect_windows(zip(image_fnames, windows_list))
27. 2 Object recognition: Result
#   Obj             Score
0   domestic cat     1.03649377823
1   domestic cat     0.0617411136627
2   domestic cat    -0.097744345665
3   domestic cat    -0.738470971584
4   chair           -0.988844156265
5   skunk           -0.999914288521
6   tv or monitor   -1.00460898876
7   rubber eraser   -1.01068615913
8   chair           -1.04896986485
9   rubber eraser   -1.09035253525
10  band aid        -1.09691572189
Takes minutes to detect all windows
28. 2 Object recognition: Result
#   Obj           Score
0   person         0.126184225082
1   person         0.0311727523804
2   person        -0.0777613520622
3   neck brace    -0.39757412672
4   person        -0.415030777454
5   drum          -0.421649754047
6   neck brace    -0.481261610985
7   tie           -0.649109125137
8   neck brace    -0.719438135624
9   face powder   -0.789100408554
10  face powder   -0.838757038116
29. Scoring
30. 3 Scoring
1 For every pixel, sum exp(score) over all detections that cover it
    for i in xrange(len(detections)):
        # ymin, ymax, xmin, xmax and score belong to detection i
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
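The scoring step can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the code from the talk: the function name and the (box, score) tuple layout are hypothetical, the box order (ymin, xmin, ymax, xmax) is assumed, and a nested list stands in for the NumPy array `arr` so the example runs without dependencies. Taking exp(score) turns the raw detector scores, which are often negative as the result tables show, into positive weights.

```python
import math


def score_heatmap(width, height, detections):
    """Sum exp(score) over every detection window covering each pixel.

    detections is a list of ((ymin, xmin, ymax, xmax), score) pairs;
    this layout is an assumption made for the sketch.
    """
    heat = [[0.0] * width for _ in range(height)]
    for (ymin, xmin, ymax, xmax), score in detections:
        weight = math.exp(score)  # always positive, larger for better scores
        # Clip the window to the image before accumulating.
        for y in range(max(ymin, 0), min(ymax, height)):
            for x in range(max(xmin, 0), min(xmax, width)):
                heat[y][x] += weight
    return heat


# Two overlapping toy detections on a 3x3 image.
detections = [((0, 0, 2, 2), 1.0), ((1, 1, 3, 3), -1.0)]
heat = score_heatmap(3, 3, detections)
```

The pixel at row 1, column 1 lies inside both windows, so it collects both weights and ends up as the hottest point of the map.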
31. Score heatmap
We used the 200-category object recognition model developed for the 2013 ImageNet Challenge
32. Cropping
33. 4 Cropping
1 Generate all possible crop areas
    while y+hws <= h:
        while x+hws <= w:
            window_locs = np.vstack((window_locs, [x, y, x+hws, y+hws]))
            x += step  # step: hypothetical stride, not shown on the slide
        x = 0
        y += step
2 Find the crop that encloses the highest point of interest, as close to the centre as possible
    for i, window_loc in enumerate(window_locs):
        x1, y1, x2, y2 = window_loc
        if max_val != np.max(arr_con[y1:y2, x1:x2]):
            scores[i] = np.nan  # window misses the global maximum
        else:
            # squared distance from the window centre to the peak (xp, yp)
            scores[i] = ((x1+x2)/2.-xp)**2 + ((y1+y2)/2.-yp)**2
3 Crop and save!
    img_pil = Image.open(fn)
    # smallest distance wins; nanargmin skips the rejected (NaN) windows
    crop_area = map(lambda x: int(x), window_locs[np.nanargmin(scores)])
    img_crop = img_pil.crop(crop_area)
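The three cropping steps can be condensed into one self-contained sketch. It is an approximation under assumptions rather than the talk's code: the function name, the `step` stride and the nested-list heatmap are hypothetical, and a window is accepted when it contains the peak pixel, which is a simpler test than comparing window maxima as the slide does.

```python
def best_square_crop(heat, size, step=1):
    """Return (x1, y1, x2, y2) of the size x size window that contains the
    heatmap peak and whose centre is closest to it, or None if none fits.

    heat is a list of rows; size and step are in pixels (assumed names).
    """
    height, width = len(heat), len(heat[0])
    # Locate the peak; the first occurrence wins if several pixels tie.
    yp, xp = max(((y, x) for y in range(height) for x in range(width)),
                 key=lambda p: heat[p[0]][p[1]])
    best, best_dist = None, None
    for y in range(0, height - size + 1, step):
        for x in range(0, width - size + 1, step):
            if not (x <= xp < x + size and y <= yp < y + size):
                continue  # window does not enclose the peak
            # Squared distance from the window centre to the peak,
            # using the same centre definition as the slide: (x1+x2)/2.
            dist = (x + size / 2.0 - xp) ** 2 + (y + size / 2.0 - yp) ** 2
            if best_dist is None or dist < best_dist:
                best, best_dist = (x, y, x + size, y + size), dist
    return best


# 5x5 toy heatmap with a single hot pixel at row 1, column 3.
heat = [[0.0] * 5 for _ in range(5)]
heat[1][3] = 5.0
crop = best_square_crop(heat, size=2)
```

The returned tuple uses the (left, upper, right, lower) order that PIL's `Image.crop` expects, so it could be passed on directly once converted to integers.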