capturing complex spatio-temporal relations among facial...
TRANSCRIPT
QUICK DESIGN GUIDE (--THIS SECTION DOES NOT PRINT--)
This PowerPoint 2007 template produces a 36x56 inch
professional poster. You can use it to create your
research poster and save valuable time placing titles,
subtitles, text, and graphics.
We provide a series of online tutorials that will guide
you through the poster design process and answer your
poster production questions.
To view our template tutorials, go online to
PosterPresentations.com and click on HELP DESK.
When you are ready to print your poster, go online to
PosterPresentations.com.
Need Assistance? Call us at 1.866.649.3004
Object Placeholders
Using the placeholders
To add text, click inside a placeholder on the poster and type
or paste your text. To move a placeholder, click it once (to
select it). Place your cursor on its frame, and your cursor
will change to this symbol . Click once and drag it to a
new location where you can resize it.
Section Header placeholder
Click and drag this preformatted section header placeholder
to the poster area to add another section header. Use section
headers to separate topics or concepts within your
presentation.
Text placeholder
Move this preformatted text placeholder to the poster to add
a new body of text.
Picture placeholder
Move this graphic placeholder onto your poster, size it first,
and then click it to add a picture to the poster.
QUICK TIPS (--THIS SECTION DOES NOT PRINT--)
This PowerPoint template requires basic PowerPoint (version 2007 or newer) skills. Below is a list of commonly asked questions specific to this template. If you are using an older version of PowerPoint some template features may not work properly.
Template FAQs
Verifying the quality of your graphics
Go to the VIEW menu and click on ZOOM to set your
preferred magnification. This template is at 100% the
size of the final poster. All text and graphics will be
printed at 100% their size. To see what your poster will
look like when printed, set the zoom to 100% and
evaluate the quality of all your graphics before you
submit your poster for printing.
Modifying the layout
This template has four different
column layouts. Right-click
your mouse on the background
and click on LAYOUT to see the
layout options. The columns in
the provided layouts are fixed and cannot be moved
but advanced users can modify any layout by going to
VIEW and then SLIDE MASTER.
Importing text and graphics from external sources
TEXT: Paste or type your text into a pre-existing
placeholder or drag in a new placeholder from the left
side of the template. Move it anywhere as needed.
PHOTOS: Drag in a picture placeholder, size it first,
click in it and insert a photo from the menu.
TABLES: You can copy and paste a table from an
external document onto this poster template. To
adjust the way the text fits within the cells of a table
that has been pasted, right-click on the table, click
FORMAT SHAPE then click on TEXT BOX and change
the INTERNAL MARGIN values to 0.25.
Modifying the color scheme
To change the color scheme of this template go to the
DESIGN menu and click on COLORS. You can choose
from the provided color combinations or create your
own.
© 2013 PosterPresentations.com 2117 Fourth Street , Unit C Berkeley CA 94710 [email protected]
Student discounts are available on our Facebook page.
Go to PosterPresentations.com and click on the FB icon.
Rensselaer Polytechnic Institute1 University of Science and Technology of China2
Ziheng Wang1 Shangfei Wang2 Qiang Ji1
Capturing Complex Spatio-Temporal Relations among Facial Muscles for Facial Expression Recognition
2. Main Idea
3. Related Work
Existing approaches
o Time-sliced graphical models such as hidden Markov models
(HMMs) and dynamic Bayesian networks (DBNs)
o Syntactic and description-based approaches
Limitations
o Can only model a sequence of instantaneously occurring events
o Only offer three time point relations: precedes, follows and
equals
o Typically assume first order Markov property and local
stationary transition.
o Syntactic and description-based models lack the expressive
power to capture uncertainties
The proposed model
o Model both sequential and overlapping events
o Do no reply on local assumptions and capture global relations
o Capture a larger variety of complex relations
o Remains fully probabilistic
8. Experimental Results
Data Sets
o CK+: 7 expressions, 327 video sequences
o MMI: 6 expressions, 205 video sequences
Performance Vs Number of Relation Nodes
Performance Vs Related works
o ITBN outperforms time-slice based models such as HMM
o ITBN achieves comparable and even better performance than
the related works, even though it is based purely on the
tracking results
Relationship Analysis
Existing Dynamic Models Proposed Model
Sequential events Sequential or overlapping events
Local stationary relations Global relations
3 relations (precede, follow, equal) 13 complex relations
4. Interval Temporal Bayesian Network
Node: Temporal Entity (Event)
o A temporal entity is characterized by a pair Σ, Ω where Σ is a
set of all possible states for the temporal entity, and Ω ={ 𝑎, 𝑏 ∈ ℝ: 𝑎 < 𝑏} is the period of time spanned by the
temporal entity.
Link: Temporal Dependency
o A temporal dependency denoted as 𝐼𝑋𝑌 describes a temporal
relation between two temporal entities 𝑋 = Σ𝑋, Ω𝑋 and
𝑌 = Σ𝑌, Ω𝑌 with 𝑋 as the reference.
o The strength of 𝐼𝑋𝑌 is quantified by the conditional probability
𝑃(𝐼𝑋𝑌|Σ𝑋, Σ𝑌)
o 13 temporal dependencies are defined according to Allen’s
Interval Algebra
B A
C
IAC IBC
IAB
Primitive Facial Muscle Event
o Definition: each primitive event is the movement of one facial
feature point
o Duration: the interval from the time the point starts to move
till the time it finally comes back
o State: one of the moving patterns
o Temporal relation: based on the durations of the events, their
temporal relations can be uniquely determined
Facial Expression Recognition with ITBN
o To recognize 𝑁 expressions, we build 𝑁 ITBN’s {𝑀𝑦: 𝑦 =
1,… , 𝑁}, each of which models one expression.
o Given a query sample 𝑥, its expression is classified as:
P1
P2
T1
E1
y
t
T2 E2
y
t
R12: E2 overlaps E1
S1
S2
S3
Sm
………………
6. Modeling Facial Expression with ITBN
7. Learning and Inference
Step 1: Temporal Relation Node Selection
o It is not necessary to consider the relations among all possible
pairs of events
o A selection routine is performed to select those that maximize
discrimination
o The following score is used to rank all the relation nodes. The
top L nodes are selected and instantiated in the model
Step 2: Structure Learning
o Temporal nodes are linked to their corresponding event nodes
o Spatial links are learned by finding a network 𝐺 that maximizes
the BIC score on the training data 𝐷
Step 3: Parameter Estimation
o Parameters include the conditional probability distribution
(CPD) 𝑃 𝐸𝑖 𝜋 𝐸𝑖 , and the CPD 𝑃 𝐼𝑘 𝜋 𝐼𝑘
o Tree structured CPD is introduced to reduce the number of
parameters
o Parameters are learned by maximizing the log likelihood
𝑦∗ = argmaxy
𝑃 𝑥 𝑀𝑦
𝐶𝑜𝑚𝑝𝑙𝑒𝑥𝑖𝑡𝑦(𝑀𝑦)
𝑆(𝐼𝐴𝐵) = 𝐷𝐾𝐿 𝑃𝑖 ∥ 𝑃𝑗 + 𝐷𝐾𝐿(𝑃𝑗 ∥ 𝑃𝑖)
𝑖>𝑗
max𝐺log 𝑃 𝐷 𝐺, Θ −
Θ log𝑁
2
Θ∗ = argmaxΘlog 𝑃(𝐷|Θ)
CK+
ITBN HMM Lucey et al.
CK+ 86.3% 83.5% 83.3%
ITBN HMM Zhong et al.
CPL CSPL ADL
MMI 59.7% 51.5% 49.4% 73.5% 47.8%
9. Conclusions
Model a facial expression as a complex activity of temporally
sequential or overlapping facial events
Propose a novel ITBN to capture global and a larger variety of
complex spatio-temporal relations among facial events
The proposed model achieves comparable and even better results
than existing methods, without using appearance information
ITBN can be widely applied for analyzing other complex activities
1. Problem
Facial Expression Recognition
Happiness Surprise Disgust Sadness
A
B
C
IAB
IAC
IBC
B
B
B
B
B
C
B
C
C
C
C
C
B
C
before
meet
overlap
start
during
finish
equal
𝑃 ℰ,ℛ = 𝑃 𝐸𝑖 𝜋 𝐸𝑖
𝑛
𝑖=1
𝑃 𝐼𝑘 𝜋 𝐼𝑘
𝐾
𝑘=1
Spatial Relations Temporal Relations
MMI
5. Implementation of ITBN
We propose to implement ITBN with a corresponding Bayesian
Network (BN) to exploit the well developed BN mathematical
machinery
o Circular node: state of each temporal event
o Square node: type of temporal relation
o Solid links: capture the spatial dependencies among events
o Dotted links: connect the relation node with the corresponding
temporal event nodes and capture the temporal dependencies
Given a set of temporal events ℰ = {𝐸𝑖}𝑖=1𝑛 and their pairwise
relations ℐ = {𝐼𝑘}𝑘=1𝐾 , the joint distribution can be written as
t
E1
E2
E3
E4
Model Facial expression as a complex activity consisting of
sequential or overlapping facial muscle events
Propose an Interval Temporal Bayesian Network (ITBN) to
explicitly capture a larger variety of complex spatio-temporal
relations among facial muscle events for expression recognition