

Introduction
Kinect for Xbox 360, hereafter Kinect, is a peripheral developed by Microsoft for the Xbox 360 video game console and for Windows PCs.

Hardware
The Kinect sensor is a motion-sensing device capable of estimating the depth of a scene. It comprises an IR projector, an IR camera, an RGB camera and a multi-array microphone. The IR projector casts a speckle pattern of infrared dots onto the scene; the IR camera reads this pattern, and the depth of objects in the scene is calculated from the pattern's deformation.
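To make this concrete, here is a minimal C# sketch (not from the project itself) that enables the depth stream in Kinect SDK v1.7 and reads the per-pixel depth the sensor derives from the IR pattern; the resolution and the centre-pixel sampling are arbitrary choices for illustration:

```csharp
using System;
using Microsoft.Kinect;

class DepthDemo
{
    static void Main()
    {
        // First attached sensor; a real application should check Status first.
        KinectSensor sensor = KinectSensor.KinectSensors[0];
        sensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);

        sensor.DepthFrameReady += (s, e) =>
        {
            using (DepthImageFrame frame = e.OpenDepthImageFrame())
            {
                if (frame == null) return;   // frames can be dropped

                var pixels = new DepthImagePixel[frame.PixelDataLength];
                frame.CopyDepthImagePixelDataTo(pixels);

                // Depth of the centre pixel, in millimetres from the sensor.
                DepthImagePixel centre =
                    pixels[(frame.Height / 2) * frame.Width + frame.Width / 2];
                if (centre.IsKnownDepth)
                    Console.WriteLine("Centre depth: {0} mm", centre.Depth);
            }
        };

        sensor.Start();
        Console.ReadLine();   // keep receiving frames until Enter is pressed
        sensor.Stop();
    }
}
```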

Software
- Microsoft Visual Studio 2012, C#
- Kinect SDK v1.7: this SDK exposes the full capabilities of the Kinect sensor, including human tracking and motion gesture recognition tools.
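For context, a minimal sketch of locating a connected sensor with this SDK and enabling the streams the project relies on; the stream formats shown are common defaults, not values confirmed by the poster:

```csharp
using System.Linq;
using Microsoft.Kinect;

static class SensorSetup
{
    public static KinectSensor Start()
    {
        // Pick the first sensor that is actually connected.
        KinectSensor sensor = KinectSensor.KinectSensors
            .FirstOrDefault(s => s.Status == KinectStatus.Connected);

        sensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
        sensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
        sensor.SkeletonStream.Enable();   // 20-joint human tracking
        sensor.Start();
        return sensor;
    }
}
```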

Simulation Software & Results

Conclusion & Future WorkThe system works reasonably well. A large number of simulation tests were performed with a number of different candidates; the system produced reliable results most of the time. Voice commands works with small sentences and the results are adequate. The Validation process has a great potential to be improved. By employing the second generation of Kinect, the system performance can be vastly improved in terms of reliability and performance. Also with the new Kinect, voice commands can be taken to a next level by incorporating longer sentences.

Supervised by Dr. Brett Wilkinson, Flinders University, Adelaide, Australia


Human Tracking
The Kinect depth sensor can locate a person in the scene using the IR depth camera. It captures depth information about the environment, looks for the largest moving object in the scene, and then infers body parts from a decision tree trained on a large number of examples. A set of 20 points representing 20 joint positions is marked; each joint value describes the position of a specific joint in 3D relative to the Kinect sensor. The system allows access to an authorised person only: it records a person's skeletal features, namely height, arm length and width, which helps identify the authorised person when there are many people in the scene.
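The skeletal features used for validation (height, arm length, width) can be approximated by summing distances between tracked joints. The following is a hypothetical sketch of that idea, not the authors' implementation; the joint chains and the tolerance value are assumptions:

```csharp
using System;
using Microsoft.Kinect;

static class SkeletonValidator
{
    // Euclidean distance between two joint positions, in metres.
    static float Dist(SkeletonPoint a, SkeletonPoint b)
    {
        float dx = a.X - b.X, dy = a.Y - b.Y, dz = a.Z - b.Z;
        return (float)Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }

    static SkeletonPoint P(Skeleton s, JointType j) { return s.Joints[j].Position; }

    // Approximate the three features named in the text from a tracked skeleton.
    public static void Measure(Skeleton s,
        out float height, out float armLength, out float shoulderWidth)
    {
        // Height: head to foot along the joint chain.
        height = Dist(P(s, JointType.Head), P(s, JointType.ShoulderCenter))
               + Dist(P(s, JointType.ShoulderCenter), P(s, JointType.Spine))
               + Dist(P(s, JointType.Spine), P(s, JointType.HipCenter))
               + Dist(P(s, JointType.HipCenter), P(s, JointType.HipLeft))
               + Dist(P(s, JointType.HipLeft), P(s, JointType.KneeLeft))
               + Dist(P(s, JointType.KneeLeft), P(s, JointType.AnkleLeft))
               + Dist(P(s, JointType.AnkleLeft), P(s, JointType.FootLeft));

        // Arm length: shoulder -> elbow -> wrist -> hand.
        armLength = Dist(P(s, JointType.ShoulderLeft), P(s, JointType.ElbowLeft))
                  + Dist(P(s, JointType.ElbowLeft), P(s, JointType.WristLeft))
                  + Dist(P(s, JointType.WristLeft), P(s, JointType.HandLeft));

        // Width: shoulder-to-shoulder span.
        shoulderWidth = Dist(P(s, JointType.ShoulderLeft), P(s, JointType.ShoulderRight));
    }

    // Accept the user when all features are within a tolerance (assumed 5 cm)
    // of the enrolled reference profile.
    public static bool Matches(float h, float a, float w,
                               float refH, float refA, float refW)
    {
        const float tol = 0.05f;
        return Math.Abs(h - refH) < tol &&
               Math.Abs(a - refA) < tol &&
               Math.Abs(w - refW) < tol;
    }
}
```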

Simulation Software & Results

The simulation uses the following hand-gesture rules (a code sketch follows the list):

- The person can move the Tank with either hand; the Tank moves in accordance with the motion of that hand.
- The Tank can be fixed at a position by raising the opposite hand to the chest.
- Once the position is fixed, the Tank will not move, irrespective of the motion of the hand.
- The Tank can be made to move again by raising both hands.
- Control of the Tank can be passed to the opposite hand by touching both hands together.
- The simulation scenario can be changed by raising both hands up.
- Any of the targets in any scenario can be fired at by voice commands.
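A hypothetical sketch of how the first few rules might be evaluated once per skeleton frame. The Tank type, the chest/head height thresholds, and the choice of the right hand as the driving hand are illustrative assumptions; hand-over and scenario switching are omitted:

```csharp
using Microsoft.Kinect;

// Illustrative stand-in for the simulation's tank object.
class Tank
{
    public void MoveTo(float x, float z) { /* update on-screen position */ }
}

class TankGestureController
{
    bool locked = false;

    // Called once per skeleton frame with the tracked skeleton.
    public void Update(Skeleton s, Tank tank)
    {
        SkeletonPoint hand  = s.Joints[JointType.HandRight].Position;     // driving hand
        SkeletonPoint other = s.Joints[JointType.HandLeft].Position;      // opposite hand
        SkeletonPoint chest = s.Joints[JointType.ShoulderCenter].Position;
        SkeletonPoint head  = s.Joints[JointType.Head].Position;

        if (hand.Y > head.Y && other.Y > head.Y)
            locked = false;              // both hands raised: the Tank may move again
        else if (other.Y > chest.Y)
            locked = true;               // opposite hand at chest: fix the Tank in place

        if (!locked)
            tank.MoveTo(hand.X, hand.Z); // the Tank follows the driving hand
    }
}
```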

Test results showing the accuracy of the software:

Environment 1

Person   Motion Gesture   Voice Recognition   Validation
         Accuracy (%)     Accuracy (%)        Accuracy (%)
A             95                70                 75
B             90                80                 70
C             80                70                 80
D             95               100                 75
E             90                90                 80
F             90                80                 85

Environment 2

Person   Motion Gesture   Voice Recognition   Validation
         Accuracy (%)     Accuracy (%)        Accuracy (%)
A             80                70                 65
B             75                80                 70
C             85                70                 75
D             80                90                 80
E             70                70                 75
F             75                80                 70

The simulation software was designed to exploit the motion gesture capabilities of the Kinect sensor. A system was built in which a Tank is moved with either hand and actions are taken based on the person's gestures and voice commands. The motivation of this project is to give new soldiers an overview of an actual training-field scenario by demonstrating it in the simulation software.

The cost and setup time for in-field training exercises can be considerable. By developing a simulation, it is expected that costs can be reduced and immediate feedback can be supplied to the trainees.

The simulation software has been designed to provide new soldiers with an emulation of the battlefield. The scenario allows soldiers to understand issues surrounding command, manoeuvres, fields of view and firing lines. The simulation represents a strategic planning exercise and is not intended as a 3D game. The Kinect sensor provides a solid foundation for the simulation system: the RGB camera provides a view of the scene, while the 3D depth sensor lets the program understand what the scene looks like so that actions can be taken based on its content. In this system, the user is placed in an environment where all interaction is controlled via hand motion and voice commands.
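Finally, a hedged sketch of wiring short voice commands to the Kinect microphone array, following the pattern of the Kinect SDK v1.7 speech samples (Microsoft.Speech runtime). The command phrases and confidence threshold are invented for illustration; the poster does not list its grammar:

```csharp
using System;
using Microsoft.Kinect;
using Microsoft.Speech.AudioFormat;
using Microsoft.Speech.Recognition;

class VoiceCommands
{
    static void Main()
    {
        KinectSensor sensor = KinectSensor.KinectSensors[0];
        sensor.Start();

        // Find the Kinect-adapted acoustic model installed with the SDK.
        RecognizerInfo info = null;
        foreach (RecognizerInfo ri in SpeechRecognitionEngine.InstalledRecognizers())
        {
            string value;
            ri.AdditionalInfo.TryGetValue("Kinect", out value);
            if ("True".Equals(value, StringComparison.OrdinalIgnoreCase))
            {
                info = ri;
                break;
            }
        }

        var engine = new SpeechRecognitionEngine(info.Id);

        // Small fixed grammar: short sentences work best, as noted above.
        var commands = new Choices("fire", "fire target one", "fire target two");
        var builder = new GrammarBuilder(commands) { Culture = info.Culture };
        engine.LoadGrammar(new Grammar(builder));

        engine.SpeechRecognized += (s, e) =>
        {
            if (e.Result.Confidence > 0.6)   // reject low-confidence recognitions
                Console.WriteLine("Command: " + e.Result.Text);
        };

        // Route the Kinect's beam-formed audio into the recogniser
        // (16 kHz, 16-bit mono PCM, the format the SDK samples use).
        engine.SetInputToAudioStream(
            sensor.AudioSource.Start(),
            new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
        engine.RecognizeAsync(RecognizeMode.Multiple);

        Console.ReadLine();
        sensor.Stop();
    }
}
```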