![Page 1: School of Electronic Information Engineering, Tianjin University Human Action Recognition by Learning Bases of Action Attributes and Parts Jia pingping](https://reader038.vdocuments.net/reader038/viewer/2022110401/56649ddd5503460f94ad4d7a/html5/thumbnails/1.jpg)
School of Electronic Information Engineering , Tianjin University
Human Action Recognition by Learning Bases of Action Attributes and Parts
Jia pingping
Outline:
1 Action Classification in Still Images
2 Intuition: Action Attributes and Parts
3 Algorithm: Learning Bases of Attributes and Parts
4 Experiments: PASCAL & Stanford 40 Actions
5 Conclusion
Action Classification in Still Images
Low-level feature → Riding bike
Action Classification in Still Images
Low-level feature → High-level representation → Riding bike
High-level representation:
- Semantic concepts – Attributes
- Objects (parts)
- Human poses (parts)
- Contexts of attributes & parts
Example for "Riding bike": riding a bike, sitting on a bike seat, wearing a helmet, pedaling the pedals, …
Benefits: incorporate human knowledge; more understanding of image content; more discriminative classifier.
Action Attributes and Parts
Attributes:
… …
semantic descriptions of human actions
Riding bike
Not riding bike
Discriminative classifier, e.g. SVM
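As a minimal sketch of the "discriminative classifier, e.g. SVM" idea, the following trains a linear SVM for one attribute ("riding bike" vs. "not riding bike") by subgradient descent on the hinge loss. The 2-D feature vectors, the hyperparameters, and the omission of a bias term are assumptions made for illustration; the slides do not specify the features or the SVM variant.

```python
import random

def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=50, seed=0):
    """Subgradient descent on hinge loss + L2 penalty (no bias term, by assumption)."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Shrink w (gradient of the L2 penalty).
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge-loss subgradient step only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
    return w

def predict(w, x):
    """Return +1 (attribute present) or -1 (attribute absent)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Hypothetical toy features: +1 = "riding bike", -1 = "not riding bike".
X = [[2.0, 2.0], [3.0, 1.0], [2.5, 3.0], [-2.0, -1.0], [-1.0, -3.0], [-3.0, -2.0]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
print([predict(w, x) for x in X])
```

In practice a library SVM would replace this hand-written trainer; the point is only that each attribute gets its own binary classifier.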
Action Attributes and Parts
Attributes:
… …
Parts-Objects:
… …
Parts-Poselets:
… …
A pre-trained detector
Action Attributes and Parts
Attributes → Attribute classification
Parts-Objects → Object detection
Parts-Poselets → Poselet detection
a: Image feature vector (built from the attribute classification, object detection, and poselet detection outputs)
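One way to picture the image feature vector a is as a concatenation of the three groups of outputs. The sketch below assumes each attribute classifier and each part detector contributes a single confidence score per image; the function name and the toy scores are hypothetical.

```python
def build_feature_vector(attribute_scores, object_scores, poselet_scores):
    """Concatenate attribute, object, and poselet confidence scores into one vector a."""
    return list(attribute_scores) + list(object_scores) + list(poselet_scores)

# Toy example: 2 attributes, 2 objects, 3 poselets (hypothetical scores).
a = build_feature_vector([0.9, -0.2], [0.7, 0.1], [0.3, -0.5, 0.8])
print(len(a))  # 7
```

Under this one-score-per-detector assumption, the PASCAL setting of 14 attributes, 27 objects, and 150 poselets would give a 191-dimensional a.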
…
Action bases Φ
Action Attributes and Parts
Action bases Φ
Bases coefficients w
a: Image feature vector
a ≈ Φw
SVM → Riding bike
Bases of Atr. & Parts: Training
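a ≈ Φw
• Input: $\{a_1, \dots, a_N\}$
• Output: $\Phi = [\Phi_1, \dots, \Phi_M]$ and sparse $W = [w_1, \dots, w_N]$
• Jointly estimate $\Phi$ and $W$:
$$\min_{\Phi, W} \sum_{i=1}^{N} \left( \tfrac{1}{2} \|a_i - \Phi w_i\|_2^2 + \lambda \|w_i\|_1 \right) \quad \text{s.t. } \forall j,\ \|\Phi_j\|_1 + \tfrac{\gamma}{2} \|\Phi_j\|_2^2 \le 1$$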
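To make the training problem concrete, here is a small sketch that only evaluates it: the sum over images of the squared reconstruction error of a ≈ Φw plus an L1 penalty on w, together with a per-basis feasibility check. The penalty weights `lam` and `gamma`, the elastic-net form of the basis constraint, and the storage of Φ as a list of basis vectors are assumptions; this is not the optimization procedure itself.

```python
def reconstruct(Phi, w):
    """Compute Phi w, with Phi stored as a list of M basis vectors."""
    dim = len(Phi[0])
    return [sum(Phi[j][d] * w[j] for j in range(len(Phi))) for d in range(dim)]

def training_objective(A, Phi, W, lam):
    """Sum over images of 0.5 * ||a_i - Phi w_i||_2^2 + lam * ||w_i||_1."""
    total = 0.0
    for a, w in zip(A, W):
        r = reconstruct(Phi, w)
        squared_error = sum((ad - rd) ** 2 for ad, rd in zip(a, r))
        total += 0.5 * squared_error + lam * sum(abs(wj) for wj in w)
    return total

def bases_feasible(Phi, gamma, tol=1e-12):
    """Check ||Phi_j||_1 + (gamma / 2) * ||Phi_j||_2^2 <= 1 for every basis."""
    return all(
        sum(abs(v) for v in phi) + 0.5 * gamma * sum(v * v for v in phi) <= 1.0 + tol
        for phi in Phi
    )

# Toy example: two bases in a 2-D feature space, one training image.
Phi = [[0.5, 0.0], [0.0, 0.5]]
A = [[1.0, 0.0]]
W = [[2.0, 0.0]]
objective = training_objective(A, Phi, W, lam=0.1)
feasible = bases_feasible(Phi, gamma=0.15)
print(objective, feasible)
```

A joint solver would alternate between updating W with Φ fixed and updating Φ with W fixed, using a function like `training_objective` to monitor progress.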
Bases of Atr. & Parts: Testing
a ≈ Φw
• Input: $a$ and the learned bases $\Phi = [\Phi_1, \dots, \Phi_M]$
• Output: sparse $w$
• Estimate $w$:
$$\min_{w} \tfrac{1}{2} \|a - \Phi w\|_2^2 + \lambda \|w\|_1$$
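The test-time problem is an L1-regularized least-squares fit of w with Φ fixed. A minimal sketch of one standard solver, iterative soft-thresholding (ISTA), is shown below; the slides do not say which solver was used, so the choice of ISTA, the step size, and the iteration count are assumptions. The step size must not exceed the reciprocal of the largest eigenvalue of ΦᵀΦ for the iteration to converge.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def sparse_code(a, Phi, lam, step, iters=50):
    """ISTA for min_w 0.5 * ||a - Phi w||_2^2 + lam * ||w||_1.

    Phi is a list of M basis vectors, each with len(a) entries.
    """
    M, dim = len(Phi), len(a)
    w = [0.0] * M
    for _ in range(iters):
        recon = [sum(Phi[j][d] * w[j] for j in range(M)) for d in range(dim)]
        resid = [recon[d] - a[d] for d in range(dim)]
        # Gradient of the squared-error term with respect to each coefficient.
        grad = [sum(Phi[j][d] * resid[d] for d in range(dim)) for j in range(M)]
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(M)]
    return w

# Toy example with orthonormal bases, where the solution is soft_threshold(a, lam).
Phi = [[1.0, 0.0], [0.0, 1.0]]
w = sparse_code([1.0, 0.05], Phi, lam=0.1, step=1.0)
print(w)
```

In the toy example the small second entry of a falls below the threshold, so its coefficient is exactly zero, which is the sparsity the L1 term is meant to produce.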
1. PASCAL Action Dataset
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2008/
• Contains 9 classes, with 21,738 images in total;
• Randomly select 50% of each class for training/validation and the remaining images for testing;
• 14 attributes, 27 objects, 150 poselets;
• The number of action bases is set to 400 and 600, respectively. The λ and γ values are set to 0.1 and 0.15, respectively.
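The per-class random split described above can be sketched as follows. The function name, the fixed seed, and the toy labels are hypothetical; only the "50% of each class" rule comes from the slide.

```python
import random
from collections import defaultdict

def split_per_class(labels, train_fraction=0.5, seed=0):
    """Randomly assign a fraction of each class to training and the rest to testing."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        cut = int(len(idxs) * train_fraction)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test

# Toy example: 4 "phoning" images and 6 "reading" images.
labels = ["phoning"] * 4 + ["reading"] * 6
train_idx, test_idx = split_per_class(labels)
print(len(train_idx), len(test_idx))  # 5 5
```

Splitting inside each class keeps the class proportions the same in both subsets, which a single global shuffle would not guarantee.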
Classification Result
[Figure: average precision (y-axis, 0.3 to 0.9) for the 9 PASCAL action classes: Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking. Methods compared: Our method (use "a"), POSELETS, SURREY_MK, UCLEAR_DOSP.]
Classification Result
[Figure: average precision (y-axis, 0.3 to 0.9) for the 9 PASCAL action classes: Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking. Methods compared: Our method (use "a"), Our method (use "w"), Poselet (Maji et al., 2011), SURREY_MK, UCLEAR_DOSP. Annotations: 400 action bases; legend: attributes, objects, poselets.]
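The results are reported as average precision per class. As a rough illustration of the metric, the sketch below computes a non-interpolated average precision from classifier scores for one class. The PASCAL evaluation uses its own AP definition, so this simpler variant and the toy scores are assumptions, not the exact evaluation code.

```python
def average_precision(scores, is_positive):
    """Mean of the precision values measured at the rank of each positive example."""
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Toy example: positives are ranked 1st and 3rd, so AP = (1/1 + 2/3) / 2.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
print(round(ap, 4))  # 0.8333
```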
Control Experiment
[Figure: results using "a" versus using "w". Legend: A: attribute; O: object; P: poselet.]
2. Stanford 40 Actions
Applauding, Blowing bubbles, Brushing teeth, Calling, Cleaning floor, Climbing wall, Cooking, Cutting trees, Cutting vegetables, Drinking,
Feeding horse, Fishing, Fixing bike, Gardening, Holding umbrella, Jumping, Playing guitar, Playing violin, Pouring liquid, Pushing cart,
Reading, Repairing car, Riding bike, Riding horse, Rowing, Running, Shooting arrow, Smoking cigarette, Taking photo, Texting message,
Throwing frisbee, Using computer, Using microscope, Using telescope, Walking dog, Washing dishes, Watching television, Waving hands, Writing on board, Writing on paper
http://vision.stanford.edu/Datasets/40actions.html
• Contains 40 diverse daily human actions;
• 180 to 300 images for each class, 9532 real-world images in total;
• All the images are obtained from Google, Bing, and Flickr;
• Large variations in human pose, appearance, and background clutter.
[Figure: example images from the dataset, including Drinking, Gardening, and Smoking cigarette.]
Result:
• Randomly select 100 images in each class for training, and the remaining images for testing.
• 45 attributes, 81 objects, 150 poselets. The number of action bases is set to 400 and 600, respectively. The λ and γ values are set to 0.1 and 0.15, respectively.
• Compare our method with the Locality-constrained Linear Coding (LLC, Wang et al., CVPR 2010) baseline.
[Figure: average precision (y-axis, 0 to 0.9) for each of the Stanford 40 Actions classes, comparing LLC with Our Method. Classes on the x-axis: Riding a horse, Rowing a boat, Riding a bike, Climbing mountain, Jumping, Cleaning the floor, Walking a dog, Shooting an arrow, Playing guitar, Fishing, Holding up an umbrella, Running, Throwing a frisbee, Writing on a board, Watching TV, Cutting trees, Feeding a horse, Gardening, Writing on a book, Repairing a car, Looking through a microscope, Cutting vegetables, Blowing bubbles, Playing violin, Brushing teeth, Repairing a bike, Pushing a cart, Using a computer, Applauding, Cooking, Smoking cigarette, Looking through a telescope, Washing dishes, Drinking, Calling, Waving hands, Pouring liquid, Reading a book, Taking photos, Texting message.]
Control Experiment
[Figure: results using "a" versus using "w". Legend: A: attribute; O: object; P: poselet.]
• Part-wise Bag-of-Words (PBoW) Representation: Local feature → Body part localization → PBoW generation
head-wise BoW
limb-wise BoW
leg-wise BoW
foot-wise BoW
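The PBoW generation step can be sketched as building one visual-word histogram per body part. The sketch assumes the earlier stages have already quantized each local feature to a visual-word index and assigned it to a part; the `PARTS` names, the input format, and the vocabulary size are hypothetical.

```python
PARTS = ("head", "limb", "leg", "foot")

def partwise_bow(assignments, vocab_size):
    """Build one bag-of-words histogram per body part.

    assignments: iterable of (part_name, visual_word_index) pairs.
    """
    hists = {part: [0] * vocab_size for part in PARTS}
    for part, word in assignments:
        hists[part][word] += 1
    return hists

# Toy example: five local features, vocabulary of 3 visual words.
features = [("head", 0), ("head", 2), ("leg", 1), ("foot", 1), ("foot", 1)]
hists = partwise_bow(features, vocab_size=3)
print(hists["head"], hists["foot"])  # [1, 0, 1] [0, 2, 0]
```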
• Local Action Attribute Method:
1. Label the action samples according to different parts. For each part, we define a new set of low-level semantics to re-classify the training action samples:
Head: static, vertical move, horizontal move
Limb: static, swing, …
Leg: static, …
Foot: static, …
• Local Action Attribute Method:
2. For each part, train a set of attribute classifiers according to the set of semantics we defined.
• Local Action Attribute Method:
3. For each action sample, map its low-level representation to a middle-level representation through the following framework:
One action sample → Head-wise BoW, Limb-wise BoW, Leg-wise BoW, Foot-wise BoW
Combine these four parts to build a new histogram representation of the sample.
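A possible reading of this mapping is that each part-wise BoW histogram is passed through that part's attribute classifiers and the resulting scores are concatenated. The sketch below assumes linear classifiers given as (weights, bias) pairs and uses made-up values; the slides do not state the classifier type or how the four parts are combined.

```python
def midlevel_descriptor(part_hists, part_classifiers,
                        part_order=("head", "limb", "leg", "foot")):
    """Score each part-wise histogram with its attribute classifiers and concatenate."""
    descriptor = []
    for part in part_order:
        hist = part_hists[part]
        for weights, bias in part_classifiers[part]:
            descriptor.append(sum(wk * hk for wk, hk in zip(weights, hist)) + bias)
    return descriptor

# Toy example: one hypothetical attribute classifier per part, vocabulary of 3 words.
part_hists = {"head": [1, 0, 1], "limb": [0, 0, 0], "leg": [0, 1, 0], "foot": [0, 2, 0]}
part_classifiers = {
    "head": [([1.0, 0.0, 0.0], 0.0)],
    "limb": [([1.0, 1.0, 1.0], 0.5)],
    "leg": [([0.0, 1.0, 0.0], 0.0)],
    "foot": [([0.0, 0.5, 0.0], -1.0)],
}
descriptor = midlevel_descriptor(part_hists, part_classifiers)
print(descriptor)  # [1.0, 0.5, 1.0, 0.0]
```

With several attribute classifiers per part, the descriptor length equals the total number of attributes across the four parts.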
• Local Action Attribute Method:
4. Thus, based on local action attributes, we construct a new descriptor of action samples, which can be used for classification.
Classifiers: SVM, K-NN (trained on the training set, evaluated on the testing set)
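As a small illustration of the K-NN option, the sketch below classifies a test descriptor by majority vote among its nearest training descriptors. The squared Euclidean distance, k = 3, and the toy descriptors and labels are assumptions.

```python
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training descriptors closest to the query."""
    dists = sorted(
        (sum((x - q) ** 2 for x, q in zip(vec, query)), i)
        for i, vec in enumerate(train_vectors)
    )
    votes = Counter(train_labels[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]

# Toy example with hypothetical 2-D descriptors and action labels.
train_vectors = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_labels = ["walk", "walk", "run", "run"]
prediction = knn_predict(train_vectors, train_labels, [0.2, 0.3])
print(prediction)  # walk
```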
Thank you