microsoft kinect gesture recognition usingchang/231/y14/seminarqi.ppt · gesture recognition an...

Gesture Recognition using Microsoft Kinect

Outline:

1. Show an example of gesture recognition

2. Describe the framework of Kinect gesture recognition

3. Divide the framework into several stages and explain the techniques used in each stage

Example for gesture recognition

XBOX 360

Dragon Ball Z for Kinect

Gestures can be used both to attack and to defend

Question:

Suppose we want to use “release” to attack and “hold” to defend. How can we recognize the gestures?

General framework for gesture recognition

1) Identify the pixels in the image that constitute the hand we’re interested in.

2) Extract features from those pixels in order to classify the hand into one of a set of predefined poses.

3) Recognize the occurrence of specific pose sequences as gestures.

IDENTIFICATION OF ‘HAND PIXELS’

1. Kinect collects color and depth information.

2. Differentiate between ‘hand pixels’ and ‘background pixels’.

Solutions:

1) Threshold based on depth

2) Label hand pixels using RGB data
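The depth-threshold option can be sketched as follows. This is a minimal illustration, assuming the hand is the object closest to the sensor, that depth is given in millimetres, and that a value of 0 means "no reading". The function name `hand_mask_by_depth` and the 150 mm band are hypothetical choices, not values from the slides.

```python
def hand_mask_by_depth(depth, band_mm=150):
    """Label as hand every valid pixel within band_mm of the nearest valid depth.

    depth: 2-D list of depth values in millimetres; 0 marks a missing reading.
    Assumption: the hand is the closest object to the sensor.
    """
    valid = [d for row in depth for d in row if d > 0]
    if not valid:
        return [[False] * len(row) for row in depth]
    nearest = min(valid)
    return [[0 < d <= nearest + band_mm for d in row] for row in depth]


# Toy 3x3 depth image: a hand at roughly 0.8 m in front of a wall at 2 m.
depth = [
    [2000, 2000, 2000],
    [2000,  800,  850],
    [   0,  820, 2000],
]
mask = hand_mask_by_depth(depth)
```

A pure depth threshold fails when another object is closer than the hand, which is one motivation for also using the color data.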

Identification based on RGB data

- RGB denotes a specific triple of red, green, and blue values

- S is a binary random variable indicating whether or not a pixel is that of skin
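With these two variables, a per-pixel skin probability p(S = 1 | RGB) can be obtained from Bayes' rule. The sketch below is one possible realization, assuming a small set of labeled pixels is available and that colors are quantized into coarse bins. The helper names, the bin count, and the fallback to the prior for unseen colors are all illustrative assumptions.

```python
from collections import Counter


def quantize(rgb, levels=8):
    """Map an (r, g, b) triple with components in 0..255 to a coarse color bin."""
    step = 256 // levels
    return tuple(c // step for c in rgb)


def train(samples):
    """samples: iterable of ((r, g, b), is_skin). Returns per-class bin counts."""
    counts = {True: Counter(), False: Counter()}
    for rgb, is_skin in samples:
        counts[is_skin][quantize(rgb)] += 1
    return counts


def p_skin_given_rgb(rgb, counts):
    """Bayes' rule: p(S=1 | RGB) = p(RGB | S=1) p(S=1) / p(RGB).

    Assumes at least one training sample. Falls back to the prior p(S=1)
    when the color bin was never observed (an illustrative choice).
    """
    n_skin = sum(counts[True].values())
    n_bg = sum(counts[False].values())
    total = n_skin + n_bg
    b = quantize(rgb)
    joint_skin = counts[True][b] / total   # p(RGB | S=1) p(S=1)
    joint_bg = counts[False][b] / total    # p(RGB | S=0) p(S=0)
    evidence = joint_skin + joint_bg       # p(RGB)
    return joint_skin / evidence if evidence > 0 else n_skin / total


# Made-up training pixels: three skin and one background pixel share a bin.
samples = (
    [((222, 172, 142), True)] * 3
    + [((200, 165, 130), False)]
    + [((20, 30, 40), False)] * 4
)
counts = train(samples)
p = p_skin_given_rgb((220, 170, 140), counts)
```

Thresholding this probability per pixel gives a skin mask, which also fires on faces and skin-colored background objects.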

Identification based on RGB data and depth data

- RGB denotes a specific triple of red, green, and blue values

- S is a binary random variable indicating whether or not a pixel is that of skin

- D represents a pixel’s depth value

- H is a binary random variable indicating whether or not a pixel is that of a hand
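One way to tie the four variables together is to marginalize over the skin variable: p(H = 1 | RGB, D) = Σ_s p(H = 1 | S = s, D) · p(S = s | RGB). The sketch below assumes that, given S and D, the hand label does not depend further on the raw color. The table inside `p_hand_given_skin_and_depth` and the 600 to 1200 mm working range are hypothetical numbers for illustration, not values from the slides.

```python
def p_hand_given_skin_and_depth(is_skin, depth_mm, near_mm=600, far_mm=1200):
    """Hypothetical p(H=1 | S, D): skin pixels inside the working range are likely hand."""
    in_range = near_mm <= depth_mm <= far_mm
    if is_skin and in_range:
        return 0.9
    if is_skin or in_range:
        return 0.2
    return 0.01


def p_hand(p_skin, depth_mm):
    """Marginalize over S: p(H=1 | RGB, D) = sum_s p(H=1 | S=s, D) p(S=s | RGB).

    p_skin is p(S=1 | RGB) for the pixel, e.g. from a color model.
    """
    return (
        p_hand_given_skin_and_depth(True, depth_mm) * p_skin
        + p_hand_given_skin_and_depth(False, depth_mm) * (1.0 - p_skin)
    )


near_pixel = p_hand(0.75, 800)    # skin-colored and within reach
far_pixel = p_hand(0.75, 2000)    # skin-colored but far away, e.g. a face
```

Under these assumed numbers, a skin-colored pixel far from the sensor receives a much lower hand probability than one at arm's length, which is the benefit of combining both cues.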

FEATURE EXTRACTION

Given a labeling of hand pixels, we now want to classify the hand as one of a set of predefined poses

Solution: Radial histogram

1) Find the center of mass of the hand pixels.

2) For each hand pixel, calculate its angle offset from this center and accumulate the angles into a histogram.

The idea here is that for any hand image, the corresponding radial histogram will have distinct spikes corresponding to extended fingers.
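The two steps can be sketched in a few lines. This is a minimal illustration, assuming the hand labeling is a 2-D boolean mask. The bin count and the normalization to unit sum are assumptions added here so that hands of different sizes are comparable.

```python
import math


def radial_histogram(mask, bins=36):
    """Histogram of hand-pixel angles around the hand's center of mass.

    mask: 2-D list of booleans, True for hand pixels.
    Returns a list of `bins` fractions that sum to 1 (all zeros if no usable pixels).
    """
    pixels = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not pixels:
        return [0.0] * bins
    # Step 1: center of mass of the hand pixels.
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    # Step 2: angle offset of each pixel from the center, binned over [0, 2*pi).
    hist = [0] * bins
    for x, y in pixels:
        if x == cx and y == cy:
            continue  # the angle is undefined exactly at the center
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        hist[int(angle / (2 * math.pi) * bins) % bins] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * bins


# A horizontal bar of five pixels: mass lies only to the left and right of the center.
bar = radial_histogram([[True] * 5], bins=4)
```

A pose classifier could then compare such a histogram against stored templates for each predefined pose, for example by nearest-neighbor distance.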

GESTURE RECOGNITION

An open hand followed by a closed hand can be labeled a ‘grasping’ gesture, while a closed hand followed by an open hand can be labeled a ‘dropping’ gesture.
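Labeling gestures from pose changes can be sketched as a lookup over consecutive per-frame poses. The pose names `"open"` and `"closed"` and the function `detect_gestures` are illustrative, and the input is assumed to be one pose label per frame.

```python
# Pose transitions that count as gestures, taken from the two examples above.
TRANSITIONS = {
    ("open", "closed"): "grasping",
    ("closed", "open"): "dropping",
}


def detect_gestures(poses):
    """Return (frame_index, gesture) for every pose change that matches a known transition."""
    events = []
    for i in range(1, len(poses)):
        gesture = TRANSITIONS.get((poses[i - 1], poses[i]))
        if gesture:
            events.append((i, gesture))
    return events


events = detect_gestures(["open", "open", "closed", "closed", "open"])
```

Because every pose change triggers an event, a single misclassified frame produces a spurious gesture pair, so the per-frame poses should be stabilized first.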

Exploiting the temporal dependence between poses

Let E_t be the estimated pose at time t, and let S_t be the actual pose (or state) at time t.

Use p(S_t | E_t) as our final estimate of the hand pose at time t.
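One common way to exploit this dependence is a hidden-Markov-style filter, in which S_t depends on S_(t-1) and E_t is a noisy observation of S_t. The sketch below is an assumption about how such an estimate could be computed, not necessarily the method in the slides: it conditions on all estimates up to time t rather than on E_t alone, and the transition and emission probabilities are made-up numbers.

```python
def forward_step(belief, observed, transition, emission):
    """One filtering update: predict with the transition model, then weight by the observation.

    belief:     dict state -> probability at time t-1
    observed:   the estimated pose E_t
    transition: transition[prev][cur] = p(S_t = cur | S_(t-1) = prev)
    emission:   emission[state][obs]  = p(E_t = obs | S_t = state)
    """
    states = list(belief)
    predicted = {s: sum(belief[p] * transition[p][s] for p in states) for s in states}
    unnorm = {s: predicted[s] * emission[s][observed] for s in states}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()} if z > 0 else predicted


# Hypothetical model: poses tend to persist, and the classifier is right 80% of the time.
transition = {
    "open": {"open": 0.9, "closed": 0.1},
    "closed": {"open": 0.1, "closed": 0.9},
}
emission = {
    "open": {"open": 0.8, "closed": 0.2},
    "closed": {"open": 0.2, "closed": 0.8},
}

belief = {"open": 0.5, "closed": 0.5}
filtered = []
for e in ["open", "open", "closed", "open", "open"]:
    belief = forward_step(belief, e, transition, emission)
    filtered.append(max(belief, key=belief.get))
```

With these numbers the isolated "closed" estimate in the third frame is outvoted by its neighbors, so the filtered sequence stays "open" throughout and no false ‘grasping’ gesture is triggered.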
