Real-time Fingertip Tracking and Detection using Kinect Depth Sensor for a New Writing-in-the Air System
Ziyong Feng, Shaojie Xu, Xin Zhang, Lianwen Jin, Zhichao Ye, and Weixin Yang
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012


Post on 23-Feb-2016


TRANSCRIPT

Page 1: Ziyong Feng,   Shaojie  Xu,  Xin Zhang , Lianwen  Jin,  Zhichao Ye,  and  Weixin Yang

Real-time Fingertip Tracking and Detection using Kinect Depth Sensor for a New Writing-in-the Air System
Ziyong Feng, Shaojie Xu, Xin Zhang, Lianwen Jin, Zhichao Ye, and Weixin Yang

Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Page 2: Outline
• Introduction
• Related Work
• Proposed Method
• Experimental Results
• Conclusion

Page 3: Introduction

Page 4: Introduction
• Fingertip detection plays a very important role in natural HCI (Human Computer Interaction).
• Challenges:
  • Variety of hand poses
  • Occlusion
• In this paper:
  • Propose a real-time finger-writing character recognition system using depth information
  • Accurate and fast

Page 5: Related Work

Page 6: Related Work
• Template matching [3]
• Curvature fitting [6]

[3] L. Jin, D. Yang, L. Zhen, and J. Huang. A novel vision based finger-writing character recognition system. Journal of Circuits, Systems, and Computers (JCSC), 16(3):421–436, 2007.
[6] D. Lee and S. Lee. Vision-based finger action recognition by angle detection and contour analysis. Electronics and Telecommunications Research Institute Journal, 33(3):415–422, 2011.

Page 7: Proposed Method

Page 8: Flow Chart
Hand Segmentation → Data Conversion → Region Clustering → Fingertip Identification

Page 9: Hand Segmentation
• Extract the human body from the background:
  • User ID map (provided by Open Natural Interaction (OpenNI))
  • User Generator

Page 10: Hand Segmentation
• Two kinds of hand-torso relationship:
  • 1) The hand is held up in front of the body.
  • 2) The hand is close to the body.
(Figure: depth histogram)

Page 11: Hand Segmentation (Two-Component)
• Characterize the depth histogram by two models:
  • 1) Two-component Gaussian mixture model
  • 2) Single Gaussian model
• Two-component model, fitted with the expectation-maximization algorithm. Its parameters:
  • weight of the k-th component
  • mean of the k-th component
  • variance of the k-th component
  • d: depth value
• Hand pixels:
  • Belong to the Gaussian component with the smaller mean
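The two-component fit on this slide can be sketched as a small 1D expectation-maximization loop. This is a minimal illustration, not the authors' implementation: the function names, the initialisation, the iteration count, and the variance floor are all assumptions, and the sample depth values are invented for the example.

```python
import math

def gaussian_pdf(d, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(d - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_gaussians(depths, iters=50):
    """Fit weights, means, and variances of a two-component mixture with EM."""
    lo, hi = min(depths), max(depths)
    # Assumed initialisation: means at the quartile positions of the depth range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    spread = max((hi - lo) ** 2 / 16.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each depth value.
        resp = []
        for d in depths:
            p = [weights[k] * gaussian_pdf(d, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / len(depths)
            means[k] = sum(r[k] * d for r, d in zip(resp, depths)) / nk
            var = sum(r[k] * (d - means[k]) ** 2 for r, d in zip(resp, depths)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances

def hand_mask(depths, weights, means, variances):
    """Mark depths that are more likely under the smaller-mean component."""
    near = 0 if means[0] <= means[1] else 1
    far = 1 - near
    return [
        weights[near] * gaussian_pdf(d, means[near], variances[near])
        > weights[far] * gaussian_pdf(d, means[far], variances[far])
        for d in depths
    ]
```

With depth values clustered around a near hand and a farther torso, `hand_mask` selects the near cluster, matching the rule that hand pixels belong to the component with the smaller mean.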

Page 12: Hand Segmentation (One-Component)
• One Gaussian fitting:
  • Used when the means of the two Gaussians are too close
  • Distribution:
• Hand pixels:
  • Compared with the torso, the hand takes up only a small area.
  • Lower part of p:

Page 13: Data Conversion
• Convert to real-world coordinates:
  • The accuracy of the world coordinates is about 1 mm.
  • The following discussions are all based on real-world coordinates.
• Quantities used in the conversion:
  • projected point coordinate
  • d: depth value
  • camera's focal length along the x and y axes
  • x: real-world coordinate
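The conversion formula itself is not present in the transcript. A minimal sketch, assuming the standard pinhole camera model, is shown below. The principal point `(cx, cy)`, the function name, and all numeric values in the usage are assumptions that do not come from the slide, which only names the projected coordinate, the depth value, and the focal lengths.

```python
def depth_pixel_to_world(u, v, d, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth d (mm) to camera-frame coordinates (mm).

    fx, fy: focal lengths in pixels along the x and y axes.
    cx, cy: principal point in pixels (assumed; not mentioned on the slide).
    """
    x = (u - cx) * d / fx
    y = (v - cy) * d / fy
    return (x, y, d)
```

For example, with hypothetical intrinsics `fx = fy = 500` and principal point `(320, 240)`, a pixel 100 columns to the right of the principal point at a depth of 1000 mm maps to a point 200 mm to the side of the optical axis.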

Page 14: Region Clustering
• Clustering algorithm: K-means
  • Finger part vs. non-finger part (K=2)
• Minimize the distortion measure J, defined in terms of:
  • an indicator of whether the n-th sample is assigned to the k-th cluster
  • the mean of the k-th cluster
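The clustering step can be sketched with a plain Lloyd iteration for K=2. The formula for J is missing from the transcript, so the sketch assumes the usual K-means objective: the sum of squared distances from each sample to the mean of its assigned cluster. The function name, the caller-supplied initial centers, and the iteration count are illustrative assumptions.

```python
import math

def kmeans_two(points, init_centers, iters=20):
    """Lloyd's algorithm with K=2 on 3D points; returns (labels, centers, J)."""
    centers = [tuple(c) for c in init_centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each sample goes to its nearest cluster mean.
        labels = [min((0, 1), key=lambda k: math.dist(p, centers[k])) for p in points]
        # Update step: each mean becomes the centroid of its assigned samples.
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    # Distortion J: sum of squared distances to the assigned cluster means.
    distortion = sum(math.dist(p, centers[label]) ** 2 for p, label in zip(points, labels))
    return labels, centers, distortion
```

Each of the two steps can only lower J or leave it unchanged, which is why the alternation converges to a local minimum.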

Page 15: Fingertip Identification
• After clustering, the hand-related region is separated into two parts.
• The fingertip:
  • The farthest point from one cluster to the center of the other cluster
• Arm point:
  • The mean of the points that have the same maximum depth
• The fingertip:
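The farthest-point rule can be sketched as follows. The sketch assumes the caller already knows which of the two clusters is the finger part; how the arm point enters that decision is not recoverable from the transcript. The function names are illustrative.

```python
import math

def centroid(points):
    """Mean of a non-empty list of 3D points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def find_fingertip(finger_points, other_points):
    """Return the finger-cluster point farthest from the other cluster's center."""
    other_center = centroid(other_points)
    return max(finger_points, key=lambda p: math.dist(p, other_center))
```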

Page 16: Experimental Results

Page 17: Experimental Results
• Resolution: 480 × 640
• 30 fps using OpenNI (Kinect)
• Dataset:
  • 2 subjects
  • 6 categories
  • 8185 frames in total

Page 18: Experimental Results

Page 19: Experimental Results
• Near mode (1 m)
• Far mode (1.5 m)

Page 20: Experimental Results
• The distribution of errors from a sequence:
  • Fast movement
  • Finger is orthogonal to the camera plane.

Page 21: Experimental Results
• Smoothed trajectory: mean filter
• 90% recognition rate on English characters
• 80% on Chinese characters
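The trajectory smoothing can be illustrated with a centered moving average. This is a sketch only: the window size and the clipping at the trajectory ends are assumptions, since the slide names the filter but gives no parameters.

```python
def mean_filter(trajectory, window=5):
    """Smooth a list of (x, y) fingertip positions with a centered moving average."""
    half = window // 2
    smoothed = []
    for i in range(len(trajectory)):
        # The window is clipped at both ends of the trajectory.
        segment = trajectory[max(0, i - half): i + half + 1]
        smoothed.append(tuple(sum(c) / len(segment) for c in zip(*segment)))
    return smoothed
```

A wider window removes more jitter but also rounds off sharp corners in the written stroke, so the choice is a trade-off for the recognizer.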

Page 22: Conclusion

Page 23: Conclusion
• Proposes a novel real-time fingertip detection and tracking method.
• Uses depth sequences.
• Accurate and fast in fingertip detection and character recognition.

Page 24: Real-time Hand Tracking on Depth Images
Chia-Ping Chen, Yu-Ting Chen, Ping-Han Lee, Yu-Pao Tsai, and Shawmin Lei
Visual Communications and Image Processing (VCIP), 2011 IEEE

Page 25: Outline
• Introduction
• Proposed Method
• Experimental Results
• Conclusion

Page 26: Introduction

Page 27: Introduction
• Most previous works tracked the hand position on color images and relied heavily on skin color information.
  • Vulnerable to lighting variations and skin color
• In this paper:
  • Propose a hand tracking algorithm that uses depth images only
  • Real-time and accurate
  • Hand click detection method

Page 28: Proposed Method

Page 29: Hand Position Detection
• Predict the new hand position based on the hand moving velocity:
  • H: hand moving velocity (estimated from hand positions tracked in previous frames)
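The prediction formula is missing from the transcript. One simple reading, assumed here, is a constant-velocity model: the velocity is the average per-frame displacement over the recently tracked positions, and the prediction adds it to the last tracked position. Both function names and the averaging scheme are illustrative assumptions.

```python
def estimate_velocity(prev_positions):
    """Average per-frame displacement over the tracked positions (needs at least 2)."""
    steps = len(prev_positions) - 1
    first, last = prev_positions[0], prev_positions[-1]
    return tuple((b - a) / steps for a, b in zip(first, last))

def predict_position(prev_positions):
    """Predicted position = last tracked position + estimated velocity."""
    velocity = estimate_velocity(prev_positions)
    return tuple(p + v for p, v in zip(prev_positions[-1], velocity))
```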

Page 30: Hand Region Segmentation
• Hand region:
  • Connected component in the 3D point cloud P (from the 2D depth image)
• Seed point:
  • The nearest point in the point cloud P to the predicted hand position
  • d(.,.): Euclidean distance
(Figure labels: seed point, predicted hand position)
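The seed point selection is a nearest-neighbor query under the Euclidean distance d(.,.). A brute-force sketch, with an illustrative function name, looks like this:

```python
import math

def find_seed_point(point_cloud, predicted_position):
    """Return the point of the cloud closest (Euclidean) to the predicted hand position."""
    return min(point_cloud, key=lambda p: math.dist(p, predicted_position))
```

A linear scan is enough for a sketch; a real-time system would more likely restrict the search to a window around the predicted position.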

Page 31: Hand Region Segmentation
• Connectivity:
• Entire hand region:
  • Using standard region growing techniques
  • The hand region grows incrementally and stops when:
    • 1) Two neighboring points are no longer connected
    • 2) The geodesic distance to the seed point <
(Figure labels: Ω, ε, seed point, 250 mm, 30 mm)
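Region growing with a geodesic limit can be sketched as a shortest-path expansion from the seed. Several points are assumptions rather than facts from the slide: that two points count as connected when they lie within `connect_mm` of each other, that growth stops once the path length from the seed exceeds `max_geodesic_mm`, and that the 30 mm and 250 mm values shown in the figure correspond to these two parameters in that order. The brute-force neighbor search stands in for the pixel-grid neighborhood a depth image would provide.

```python
import heapq
import math

def grow_hand_region(points, seed_index, connect_mm=30.0, max_geodesic_mm=250.0):
    """Grow a region from the seed; returns {point index: geodesic distance in mm}."""
    geodesic = {seed_index: 0.0}
    heap = [(0.0, seed_index)]
    while heap:
        dist, i = heapq.heappop(heap)
        if dist > geodesic.get(i, math.inf):
            continue  # stale heap entry, a shorter path was already found
        for j, q in enumerate(points):
            step = math.dist(points[i], q)
            if j == i or step > connect_mm:
                continue  # stop condition 1: the two points are not connected
            new_dist = dist + step
            if new_dist > max_geodesic_mm:
                continue  # stop condition 2: too far from the seed along the surface
            if new_dist < geodesic.get(j, math.inf):
                geodesic[j] = new_dist
                heapq.heappush(heap, (new_dist, j))
    return geodesic
```

Using the geodesic rather than the straight-line distance keeps the region on the hand surface, so a nearby but disconnected surface such as the torso is not absorbed.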

Page 32: Hand Region Segmentation
• A) Rough hand center:
  • The point with the maximum number of boundary points in its neighborhood
  • There should be more boundary points around the palm.
• B) Refined hand center:
(Figure labels: Ω, ε (12 mm))
Mean-Shift (one iteration)
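A single mean-shift iteration can be sketched with a flat kernel: the center moves to the mean of the points that fall within a radius of its current position. The radius parameter and the flat kernel are assumptions; the transcript does not say which kernel or bandwidth the authors used.

```python
import math

def mean_shift_step(points, center, radius_mm):
    """One mean-shift iteration with a flat kernel: move to the mean of nearby points."""
    nearby = [p for p in points if math.dist(p, center) <= radius_mm]
    if not nearby:
        return center  # nothing in range, so the center stays where it is
    return tuple(sum(c) / len(nearby) for c in zip(*nearby))
```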

Page 33: Hand Region Segmentation
• C) Hand center after Mean-Shift:
(Figure labels: Ω, ε)

Page 34: Experimental Results

Page 35: Experimental Results
• Resolution: 320 × 240
• 3 GHz Intel Core 2 Duo E8400
• Computational complexity:

Page 36: Experimental Results

Page 37: Experimental Results
• Ground truth vs. tracked position (in millimeters)

Page 38: Conclusion

Page 39: Conclusion
• Proposes a real-time hand tracking algorithm on depth images.
• Using:
  • Region growing
  • Geodesic distance
  • Mean-shift
• Can be further extended to two-hand tracking: