Structural Human Action Recognition from Still Images
Moin Nabi
Computer Vision Lab.
©IPM - Oct. 2010
Problem Definition
How can we recognize a human action from a single image?
Problem Definition
Pose as a Latent Variable
Application
• News/sports image retrieval and analysis
• An important cue for video-based action recognition
Previous Works
• Global template-based representation (image -> action label)
e.g. HOG by Dalal and Triggs, and Ikizler-Cinbis et al. ICCV09
• Pose estimation -> action recognition
e.g. Ramanan and Forsyth NIPS03, Ferrari et al. CVPR09
Our Work
• Exemplar-based representation
Using the poselet as a new definition of a part
• Pose estimation + action recognition
Discriminative Pose
Golfing?
Walking?
• Not all elements of pose are equally important
• Develop an integrated learning framework to estimate pose for action recognition
Pose Representation
• We use a coarse non-parametric pose representation
– An action-specific variant of the poselet [Bourdev & Malik ICCV09]
• A poselet is a set of patches not only with similar pose configuration, but also from the same action class.
Poselets
• Poselets obtained by clustering ground-truth joint positions of body parts for each action
Poselets
Visualization of the poselets for running images
For every action class:
1. Divide the annotation into 4 parts
2. Cluster on normalized x, y
3. Remove small clusters
4. Crop that part of the image
Learn an SVM for every poselet with HOG features
+: patches from that action
-: the same part from other actions
5 (actions) x 4 (parts) x 5 (clusters) = 100 candidate poselets; 100 - 10 (removed) = 90 = 26 (leg) + 20 (L-arm) + 20 (R-arm) + 24 (upper body)
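The clustering steps for one body part of one action class can be sketched as follows. This is only an illustration: the slides do not name the clustering algorithm, so plain k-means is an assumption here, and `normalize_joints`, `kmeans`, `cluster_part`, and the `min_size` threshold are hypothetical names and values chosen for the example.

```python
import numpy as np

def normalize_joints(joints):
    """Center one part annotation and scale it so position and size are ignored."""
    joints = np.asarray(joints, dtype=float)      # shape (n_joints, 2): x, y
    centered = joints - joints.mean(axis=0)
    scale = np.abs(centered).max()
    return centered / scale if scale > 0 else centered

def _assign(points, centers):
    """Index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed here; the slides only say 'cluster')."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _assign(points, centers)
        for c in range(k):
            if np.any(labels == c):               # keep old center if cluster is empty
                centers[c] = points[labels == c].mean(axis=0)
    return _assign(points, centers), centers

def cluster_part(annotations, k=5, min_size=3, seed=0):
    """Cluster the annotations of ONE body part from ONE action class.

    Returns {cluster_id: indices of member annotations}; clusters with fewer
    than min_size members are removed (min_size is a hypothetical threshold).
    """
    feats = np.stack([normalize_joints(a).ravel() for a in annotations])
    labels, _ = kmeans(feats, k, seed=seed)
    clusters = {}
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if len(members) >= min_size:
            clusters[c] = members
    return clusters
```

Each surviving cluster would then define one poselet: the image patches of its members are cropped and used as positive examples for that poselet's SVM.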
Model Formation
• Using the pictorial structure model of Pedro Felzenszwalb
Training: Test:
Model Formation
• Develop a scoring function
– Should have a high score for the correct action label
– Low score for other action labels
– Model parameters
Model Formation
Pose
Action Label
Image
Choose best pose L
Model Formation
Pose
Action Label
Image
Large score for Running
Model Formation
Pose
Action Label
Image
Small score for Sitting
Model Formulation
Pairwise Relation
Part Appearance
Part Appearance Potential
Pose
Action Label
Image
Poselet matching
Pairwise Potential
Pose
Action Label
Image
Relative body part locations
Full Model
Pose
Action Label
Image
Model parameters learned using max-margin
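The model equations did not survive in this transcript. As a rough sketch of how a score built from part appearance and pairwise potentials can be used, the example below sums both kinds of terms for a candidate pose, picks the best pose for each action, and predicts the action with the highest score. The data layout (`unary`, `pairwise`, `edges`), the function names, and the brute-force search over poses are all assumptions made for illustration; each part's state stands for a discrete poselet/location choice, and a real pictorial structure model would normally use dynamic programming on the tree instead of enumeration.

```python
import itertools
import numpy as np

def score(unary, pairwise, edges, pose, action):
    """Score of one (pose, action) pair for a fixed image.

    unary[action]            : array (n_parts, n_states), part appearance potentials
    pairwise[action][(p, q)] : array (n_states, n_states), pairwise potentials
    edges                    : list of connected part pairs (p, q)
    pose                     : tuple with the chosen state of every part
    """
    s = sum(unary[action][p][pose[p]] for p in range(len(pose)))
    s += sum(pairwise[action][(p, q)][pose[p], pose[q]] for (p, q) in edges)
    return s

def best_pose(unary, pairwise, edges, action, n_states):
    """Choose the best pose L for a fixed action by brute-force enumeration."""
    n_parts = len(unary[action])
    candidates = itertools.product(range(n_states), repeat=n_parts)
    best = max(candidates, key=lambda L: score(unary, pairwise, edges, L, action))
    return best, score(unary, pairwise, edges, best, action)

def predict(unary, pairwise, edges, n_states):
    """Return (action, pose, score) for the action whose best pose scores highest."""
    results = {a: best_pose(unary, pairwise, edges, a, n_states) for a in unary}
    action = max(results, key=lambda a: results[a][1])
    pose, s = results[action]
    return action, pose, s

# Toy example with 2 actions, 2 parts, 2 states per part (made-up numbers).
unary = {
    "running": np.array([[1.0, 0.0], [0.0, 2.0]]),
    "sitting": np.array([[0.5, 0.0], [0.0, 0.5]]),
}
pairwise = {
    "running": {(0, 1): np.array([[0.0, 1.0], [0.0, 0.0]])},
    "sitting": {(0, 1): np.zeros((2, 2))},
}
edges = [(0, 1)]
action, pose, s = predict(unary, pairwise, edges, n_states=2)
```

In the toy example the running model wins because both its appearance terms and its pairwise term favor the same pose.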
Learning and Inference
Latent SVM
We should minimize the loss function.
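The loss formula on this slide is not preserved in the transcript. The sketch below shows one common form of a latent SVM objective, a multiclass hinge loss in which the score of each action is first maximized over the latent pose, plus an L2 regularizer. It is not necessarily the exact loss used in the talk; the feature layout `feats[y, L, :]`, the unit margin, and the names `latent_hinge_loss` and `objective` are assumptions.

```python
import numpy as np

def latent_hinge_loss(w, feats, true_label, margin=1.0):
    """Hinge loss for ONE image with a latent pose.

    feats[y, L, :] is an assumed joint feature vector for action y and pose L,
    so feats @ w gives a score table of shape (n_actions, n_poses).
    """
    scores = feats @ w
    best_per_action = scores.max(axis=1)          # maximize over the latent pose
    true_score = best_per_action[true_label]
    others = np.delete(best_per_action, true_label)
    return max(0.0, margin + others.max() - true_score)

def objective(w, dataset, C=1.0):
    """0.5 * ||w||^2 + C * sum of per-image latent hinge losses."""
    data_term = sum(latent_hinge_loss(w, feats, y) for feats, y in dataset)
    return 0.5 * float(w @ w) + C * data_term

# Toy check with 2 actions, 2 poses, 2-dimensional features (made-up numbers).
w = np.array([1.0, 0.0])
feats = np.array([[[2.0, 0.0], [0.5, 0.0]],
                  [[1.5, 0.0], [0.0, 0.0]]])
loss_if_label_0 = latent_hinge_loss(w, feats, true_label=0)
loss_if_label_1 = latent_hinge_loss(w, feats, true_label=1)
total = objective(w, [(feats, 0)], C=2.0)
```

Because of the inner maximization over poses the objective is not convex in `w`, which is why latent SVM training typically alternates between inferring the latent pose and updating the parameters.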
Latent SVM
Results
• Still image action dataset (Internet images)
– Five action categories
– 2458 images total
– Train using 1/3 of the images from each category
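A per-category split like the one described above could be produced along these lines. This is a minimal sketch: the function name `split_per_category`, the `(image_id, category)` input format, and the fixed random seed are assumptions, and the slides do not say how the training third was actually selected.

```python
import random
from collections import defaultdict

def split_per_category(items, train_fraction=1 / 3, seed=0):
    """Split (image_id, category) pairs so each category gives the same
    fraction of its images to the training set."""
    by_cat = defaultdict(list)
    for image_id, cat in items:
        by_cat[cat].append(image_id)
    rng = random.Random(seed)
    train, test = [], []
    for cat in sorted(by_cat):                    # sorted for reproducibility
        ids = by_cat[cat][:]
        rng.shuffle(ids)
        n_train = round(len(ids) * train_fraction)
        train += [(i, cat) for i in ids[:n_train]]
        test += [(i, cat) for i in ids[n_train:]]
    return train, test
```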
Visualization of latent pose
Successful classification examples
Unsuccessful classification examples
Any questions?