Gesture Recognition using Microsoft Kinect
Outline:
1. Show an example of gesture recognition
2. Describe the framework of Kinect gesture recognition
3. Divide the framework into several stages and explain the techniques used in each stage
Example for gesture recognition
XBOX 360
Dragon Ball Z for Kinect
Gestures can be used both to attack and to defend
Question:
Suppose we want to use “release” to attack and “hold” to defend. How can we recognize these gestures?
General framework for gesture recognition
1) identifying the pixels in the image that constitute the hand we’re interested in
2) extracting features from those identified pixels in order to classify the hand into one of a set of predefined poses
3) recognizing the occurrence of specific pose sequences as gestures.
IDENTIFICATION OF ‘HAND PIXELS’
1. Kinect collects color and depth information
2. Differentiate between ‘hand pixels’ and ‘background pixels’.
Solutions:
1) threshold based on depth
2) label hand pixels using RGB data
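As a minimal sketch of the depth-threshold option, the function below labels as hand every pixel that lies within a fixed band behind the nearest valid depth reading. It assumes the hand is the object closest to the sensor, that depth is given in millimetres, and that 0 means "no reading"; the name `threshold_hand_mask` and the 150 mm band are hypothetical choices, not taken from the slides.

```python
def threshold_hand_mask(depth_mm, band_mm=150):
    """Return a boolean mask: True where a pixel is within band_mm of the nearest reading."""
    # Collect valid readings only; 0 is treated as "no depth available" (assumption).
    valid = [d for row in depth_mm for d in row if d > 0]
    if not valid:
        return [[False] * len(row) for row in depth_mm]
    nearest = min(valid)
    # Keep pixels that are valid and no farther than nearest + band_mm.
    return [[0 < d <= nearest + band_mm for d in row] for row in depth_mm]


# Toy 3x4 depth image: a hand at about 0.8 m in front of a wall at 2 m.
depth = [
    [2000, 2000, 2000, 2000],
    [2000,  800,  820, 2000],
    [2000,  810,    0, 2000],
]
mask = threshold_hand_mask(depth)
```

A pure threshold like this fails when another object is as close as the hand, which is one motivation for also using the RGB data.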
Identification based on RGB data
- RGB indicates a specific set of RGB values
- S is a binary random variable indicating whether or not a pixel is that of skin
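The formula on this slide is not present in the transcript. One standard way to relate the two variables defined above, offered here as an assumption rather than as the slide's own equation, is Bayes' rule, with the class-conditional colour distributions estimated from labeled skin and non-skin pixels:

```latex
P(S = 1 \mid RGB) =
  \frac{P(RGB \mid S = 1)\, P(S = 1)}
       {P(RGB \mid S = 1)\, P(S = 1) + P(RGB \mid S = 0)\, P(S = 0)}
```

A pixel would then be labeled as skin when this posterior exceeds a chosen threshold such as 0.5.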
Identification based on RGB data and depth data
- RGB indicates a specific set of RGB values
- S is a binary random variable indicating whether or not a pixel is that of skin
- D represents a pixel’s depth value
- H is a binary random variable indicating whether or not a pixel is that of a hand
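The transcript does not show how the four variables are combined. The sketch below shows one possible combination, under two labeled assumptions: H depends on the colour only through S, and S depends only on RGB. Under those assumptions the hand probability is obtained by summing over the two values of S. The function name and all numbers are hypothetical.

```python
def hand_probability(p_skin_given_rgb, p_hand_given_skin_d, p_hand_given_nonskin_d):
    """P(H=1 | RGB, D) = sum over s of P(H=1 | S=s, D) * P(S=s | RGB).

    Assumes H is independent of RGB given (S, D), and S is independent of D given RGB.
    """
    return (p_hand_given_skin_d * p_skin_given_rgb
            + p_hand_given_nonskin_d * (1.0 - p_skin_given_rgb))


# Made-up numbers for a single pixel: likely skin, at a depth where hands are common.
p = hand_probability(p_skin_given_rgb=0.9,
                     p_hand_given_skin_d=0.8,
                     p_hand_given_nonskin_d=0.1)
is_hand = p > 0.5  # label the pixel as hand if the posterior exceeds 0.5
```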
FEATURE EXTRACTION
Given a labeling of hand pixels, we now want to classify the hand as one of a set of predefined poses
Solution: Radial histogram
1) find the center of mass of the hand pixels
2) for each hand pixel, calculate its angle offset from this center and add it to a histogram over angles.
The idea here is that for any hand image, the corresponding radial histogram will have distinct spikes corresponding to extended fingers.
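The two steps above can be sketched as follows. This is a minimal illustration, assuming the hand labeling is a 2D boolean mask and that angles are binned uniformly; the bin count of 8 and the name `radial_histogram` are hypothetical. Because image rows grow downward, angles here increase clockwise as seen on screen.

```python
import math


def radial_histogram(mask, bins=8):
    """Normalized histogram of hand-pixel angles around the hand's center of mass."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not pixels:
        return [0.0] * bins
    # Step 1: center of mass of the hand pixels.
    cy = sum(r for r, _ in pixels) / len(pixels)
    cx = sum(c for _, c in pixels) / len(pixels)
    hist = [0] * bins
    for r, c in pixels:
        if r == cy and c == cx:
            continue  # the angle is undefined exactly at the center
        # Step 2: angle offset from the center, mapped into [0, 2*pi).
        angle = math.atan2(r - cy, c - cx) % (2 * math.pi)
        index = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[index] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * bins


# Toy mask: a palm (2x3 block) with one extended finger pointing up the image.
hand = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
features = radial_histogram(hand)
```

The resulting vector can be fed to any classifier trained on the predefined poses; the slides do not say which classifier is used.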
GESTURE RECOGNITION
An open hand followed by a closed hand can be labeled a ‘grasping’ gesture, while a closed hand followed by an open hand can be labeled a ‘dropping’ gesture.
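A minimal sketch of this rule, assuming the per-frame poses are already available as the strings "open" and "closed" (hypothetical labels), is a lookup on consecutive pose pairs:

```python
# Pose transitions that count as gestures, as described above.
GESTURES = {
    ("open", "closed"): "grasping",
    ("closed", "open"): "dropping",
}


def detect_gestures(poses):
    """Return (frame_index, gesture) for every pose transition that matches a gesture."""
    events = []
    for i in range(1, len(poses)):
        gesture = GESTURES.get((poses[i - 1], poses[i]))
        if gesture is not None:
            events.append((i, gesture))
    return events


events = detect_gestures(["open", "open", "closed", "closed", "open"])
```

With raw per-frame poses, a single misclassified frame would trigger a spurious gesture, which is why the pose estimates are usually smoothed over time first.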
exploiting the temporal dependence between poses
Let E_t be the estimated pose at time t, and let S_t be the actual pose (or state) at time t.
Use p(S_t | E_t) as our final estimate of the hand pose at time t.
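One common way to exploit temporal dependence between poses is a hidden Markov model forward filter, sketched below. This is an assumption about the model, not something the slides specify: it conditions the belief about S_t on all estimates E_1, ..., E_t seen so far, and the transition and emission probabilities are made-up numbers.

```python
def forward_filter(estimates, states, prior, transition, emission):
    """Return, for each frame, a dict mapping each state to its filtered probability.

    prior:      belief over the pose just before the first frame
    transition: transition[prev][cur] = P(S_t = cur | S_{t-1} = prev)
    emission:   emission[state][est]  = P(E_t = est | S_t = state)
    """
    belief = dict(prior)
    history = []
    for est in estimates:
        # Predict: push the previous belief through the transition model.
        predicted = {s: sum(belief[p] * transition[p][s] for p in states) for s in states}
        # Update: weight by how likely the observed estimate is under each state.
        unnorm = {s: predicted[s] * emission[s][est] for s in states}
        z = sum(unnorm.values())
        belief = {s: v / z for s, v in unnorm.items()} if z > 0 else predicted
        history.append(belief)
    return history


states = ["open", "closed"]
prior = {"open": 0.5, "closed": 0.5}
transition = {"open": {"open": 0.9, "closed": 0.1},
              "closed": {"open": 0.1, "closed": 0.9}}
emission = {"open": {"open": 0.8, "closed": 0.2},
            "closed": {"open": 0.2, "closed": 0.8}}

# The third frame is a one-off "closed" estimate in a run of "open" estimates.
beliefs = forward_filter(["open", "open", "closed", "open"],
                         states, prior, transition, emission)
smoothed = [max(b, key=b.get) for b in beliefs]
```

With these numbers the isolated "closed" estimate lowers the belief in "open" but does not flip the most likely pose, so no spurious gesture would be reported for that frame.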