20141127 py datatokyomeetup2

Akira Shibata, Shiroyagi Corporation: I Went to PyData NYC. Copyright 2014 Shiroyagi Corporation. All rights reserved.

Upload: akira-shibata

Post on 10-Jul-2015


DESCRIPTION

Slides from my talk at the second PyData Tokyo Meetup, where I reported highlights from PyData NYC, which was held a week earlier.

TRANSCRIPT

Page 1: 20141127 py datatokyomeetup2

Copyright 2014 Shiroyagi Corporation. All rights reserved.

Akira Shibata

I Went to PyData NYC

Shiroyagi Corporation

Page 2: 20141127 py datatokyomeetup2


Who am I

Akira Shibata, PhD.

TW: @punkphysicist

CEO, Shiroyagi Corporation (shiroyagi.co.jp)

Kamelio: Personalised News Curation

Kamect: Contents Discovery Platform

2004 - 2010:

Data Scientist @ NYU

Statistical data modelling @ LHC, CERN

2010 - 2013

Boston Consulting Group

Page 3: 20141127 py datatokyomeetup2

November 2014. Copyright 2014 Shiroyagi Corporation. All rights reserved.

PyData NYC

http://pydata.org/nyc2014/schedule/

Page 4: 20141127 py datatokyomeetup2

Where I stayed (right by Kickstarter)

PyDataNYC (WeWork)

NYU

Page 5: 20141127 py datatokyomeetup2

Overview: Andy Terrel (Continuum Analytics)

https://speakerdeck.com/aterrel/pydata-nyc-2014-keynote

Mainly a full picture of how to do distributed processing from Python, going beyond the “PyData Stack”.

Page 6: 20141127 py datatokyomeetup2

Python connects a wide range of tools and backends

Page 7: 20141127 py datatokyomeetup2

Blaze: Matthew Rocklin (Continuum Analytics)

http://blaze.pydata.org/presentations/

“high level thinking, low-level computation”

Page 8: 20141127 py datatokyomeetup2

Separate the data from the operations

Page 9: 20141127 py datatokyomeetup2

You can move back and forth between multiple data backends
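The idea on these slides, describing an operation once and running it against different data backends, can be sketched in plain Python. This is not Blaze's API: `GroupSum`, `run_on_rows`, and `run_on_sqlite` are hypothetical names invented for this illustration, and the sample data is made up.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupSum:
    """Describes an operation without saying where the data lives (hypothetical)."""
    key: str    # column to group by
    value: str  # column to sum


def run_on_rows(expr, rows):
    """Evaluate the expression against an in-memory list of dicts."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[expr.key]] += row[expr.value]
    return dict(totals)


def run_on_sqlite(expr, conn, table):
    """Evaluate the same expression by translating it to SQL.

    Identifiers are interpolated directly into the query, so this is only
    safe for trusted, hard-coded names as in this sketch.
    """
    sql = f"SELECT {expr.key}, SUM({expr.value}) FROM {table} GROUP BY {expr.key}"
    return dict(conn.execute(sql).fetchall())


# Made-up sample data, stored in two different backends.
rows = [
    {"name": "Alice", "amount": 100},
    {"name": "Bob", "amount": 200},
    {"name": "Alice", "amount": 50},
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, amount INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (:name, :amount)", rows)

# One expression, two backends.
expr = GroupSum(key="name", value="amount")
from_rows = run_on_rows(expr, rows)
from_sql = run_on_sqlite(expr, conn, "accounts")
print(from_rows, from_sql)
```

The expression object carries no data, so supporting another backend only means writing one more `run_on_...` function.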

Page 10: 20141127 py datatokyomeetup2

A unified interface to databases

Page 11: 20141127 py datatokyomeetup2

Load data pulled out of Mongo into Spark

Page 12: 20141127 py datatokyomeetup2

SymPy: something like Sage

Page 13: 20141127 py datatokyomeetup2

Symbolic computations can be translated into various languages

Impressive, but I'm not sure who would actually use it
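As a small illustration of translating a symbolic computation into other languages, SymPy's code printers turn an expression into source strings. The slides do not record which printers the talk showed, so the choice of `ccode`, `fcode`, and `jscode` here is an assumption, and the expression is made up.

```python
import sympy as sp

# A made-up symbolic expression.
x, y = sp.symbols("x y")
expr = sp.sin(x) ** 2 + sp.sqrt(y)

# Each printer returns the expression as source code in the target language.
c_source = sp.ccode(expr)
fortran_source = sp.fcode(expr, source_format="free")
js_source = sp.jscode(expr)

for label, source in [("C", c_source), ("Fortran", fortran_source), ("JavaScript", js_source)]:
    print(f"{label}: {source}")
```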

Page 14: 20141127 py datatokyomeetup2

Scikit-Learn: Andreas Muller

https://github.com/amueller/pydata-nyc-advanced-sklearn

A fairly hefty tutorial. One to work through over the winter break, perhaps.

Page 15: 20141127 py datatokyomeetup2


Beaker: Scott Draves

Page 16: 20141127 py datatokyomeetup2

Apparently a software artist

Fractal Flames

Page 17: 20141127 py datatokyomeetup2

A notebook in which multiple languages can be used side by side

Page 18: 20141127 py datatokyomeetup2


Demo

http://beakernotebook.com/examples

Page 19: 20141127 py datatokyomeetup2

Agenda

Page 20: 20141127 py datatokyomeetup2

This is the kind of problem I presented on

“Cats”

“Anime”

“Cats reaction to sighting dogs for the first time”

Page 21: 20141127 py datatokyomeetup2

Approach

Pipeline steps:
0. Image in
1. Detect regions
2. Object recog.
3. Scoring
4. Cropping

Tools: IPython and Python script; Matlab + SciPy; C++ + libraries; NumPy; PIL

Page 22: 20141127 py datatokyomeetup2

1. Region detection: telling the model where to look

How do we find regions to feed into object recognition? The default strategy was to look at the center of the image.
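The default "look at the center" strategy amounts to computing one centered box per image. A minimal sketch follows; `center_box` is a hypothetical helper, and the 50% `fraction` default is an arbitrary illustrative value, not one taken from the talk.

```python
def center_box(width, height, fraction=0.5):
    """Return (left, top, right, bottom) of a box centered in the image.

    `fraction` is the share of each dimension the box covers. The default
    of 0.5 is an assumption made for this sketch.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    box_w = int(width * fraction)
    box_h = int(height * fraction)
    left = (width - box_w) // 2
    top = (height - box_h) // 2
    return (left, top, left + box_w, top + box_h)


box = center_box(640, 480)
print(box)  # (160, 120, 480, 360)
```

The tuple uses the same (left, upper, right, lower) order that PIL's `Image.crop` expects.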

Page 23: 20141127 py datatokyomeetup2

1. Region detection: proposals generated

~200 proposals generated per image

Page 24: 20141127 py datatokyomeetup2

2. Object recognition: Result

Takes minutes to run detection on all windows

#    Obj              Score
0    domestic cat     1.03649377823
1    domestic cat     0.0617411136627
2    domestic cat     -0.097744345665
3    domestic cat     -0.738470971584
4    chair            -0.988844156265
5    skunk            -0.999914288521
6    tv or monitor    -1.00460898876
7    rubber eraser    -1.01068615913
8    chair            -1.04896986485
9    rubber eraser    -1.09035253525
10   band aid         -1.09691572189
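A ranked list like the one above can be produced by sorting per-window (label, score) pairs. The sketch below uses a few of the values from the slide; `rank` and `keep_label` are hypothetical helpers, and the `min_score` cutoff of 0.0 is an assumption rather than a threshold stated in the talk.

```python
# (label, score) pairs copied from the slide, deliberately out of order.
detections = [
    ("chair", -0.988844156265),
    ("domestic cat", 1.03649377823),
    ("skunk", -0.999914288521),
    ("domestic cat", 0.0617411136627),
    ("domestic cat", -0.097744345665),
]


def rank(detections):
    """Sort detections from highest to lowest score."""
    return sorted(detections, key=lambda d: d[1], reverse=True)


def keep_label(detections, label, min_score=0.0):
    """Keep ranked detections of one label at or above min_score (assumed cutoff)."""
    return [d for d in rank(detections) if d[0] == label and d[1] >= min_score]


ranked = rank(detections)
cats = keep_label(detections, "domestic cat")
print(ranked[0], len(cats))
```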

Page 25: 20141127 py datatokyomeetup2

2. Object recognition: Result

#    Obj            Score
0    person         0.126184225082
1    person         0.0311727523804
2    person         -0.0777613520622
3    neck brace     -0.39757412672
4    person         -0.415030777454
5    drum           -0.421649754047
6    neck brace     -0.481261610985
7    tie            -0.649109125137
8    neck brace     -0.719438135624
9    face powder    -0.789100408554
10   face powder    -0.838757038116

Page 26: 20141127 py datatokyomeetup2

3. Score heatmap

We used a 200-category object recognition model developed for the 2013 ImageNet Challenge.
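The slides do not say how the heatmap was computed. One plausible way, sketched here under that assumption, is to add each window's score to every pixel the window covers and then take the bounding box of the hottest cells as the crop. `score_heatmap` and `hot_region` are hypothetical names, and the windows and threshold are made up.

```python
def score_heatmap(width, height, windows):
    """Sum window scores into a per-pixel grid (a list of rows)."""
    heat = [[0.0] * width for _ in range(height)]
    for (left, top, right, bottom), score in windows:
        if score <= 0:
            continue  # assumption: only positively scored windows contribute
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                heat[y][x] += score
    return heat


def hot_region(heat, threshold):
    """Bounding box (left, top, right, bottom) of cells >= threshold, or None."""
    hits = [(x, y) for y, row in enumerate(heat)
            for x, value in enumerate(row) if value >= threshold]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Made-up windows on an 8x8 image: ((left, top, right, bottom), score).
windows = [
    ((1, 1, 5, 5), 1.0),
    ((3, 3, 7, 7), 0.5),
    ((0, 0, 8, 8), -1.0),  # negative score, ignored
]
heat = score_heatmap(8, 8, windows)
crop_box = hot_region(heat, threshold=1.5)
print(crop_box)  # (3, 3, 5, 5)
```

The resulting box marks where the two positive windows overlap, which is the region a final cropping step would keep.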

Page 27: 20141127 py datatokyomeetup2

4. Finally

Page 28: 20141127 py datatokyomeetup2

I also talked about PyData Tokyo

The big boss: Travis Oliphant

Page 29: 20141127 py datatokyomeetup2

The organizers: very supportive

“We'll introduce speakers to you!”

Page 30: 20141127 py datatokyomeetup2


SymPy NumPy Cython Pandas

scikit-learn Julia etc. etc.