
Hidden Markov Model based Automatic Arabic Sign Language Translator using Kinect
Omar Amin†, Hazem Said‡, Ahmed Samy†, Hoda El Korashy¥
†Teaching Assistant, Computer Engineering Department Ain Shams University.
†Software Developer at Robovics.
‡Assistant Professor, Computer Engineering Department, Ain Shams University.
¥Professor, Computer Engineering Department, Ain Shams University.

Outline
• Introduction
• Problem Statement
• Related Work
• Proposed System
• System Description
• Experimental Work
• Conclusion
2

Problem Introduction
Source : http://wfdeaf.org/human-rights/crpd/sign-language 3
• There are about 70 million deaf people who use sign language as their first language or mother tongue.

Research Effort
4
• Data Source
  • Sensor Based Systems
  • Camera Based Systems
• Research Focus
  • Isolated SLR (Sign Language Recognition)
  • Continuous SLR
  • Scalable SLR
  • Signer Independence
  • Posture Recognition

Sensor Based Systems
5
• Electromyography-based sensors measure the electrical activity of muscles at rest and during contraction; these measurements are then used to detect the sign being performed.

Sensor Based Systems
6
• Data gloves (e.g., the CyberGlove) capture finger positions and orientation, which are used to recognize hand shape and signs.

Camera Based Systems
7
• Normal RGB Camera (Usually using colored gloves)
• Stereo System (2 RGB Cameras)
• Kinect Sensor

• Algorithms used
  • Hidden Markov Model
  • Conditional Random Fields
  • Dynamic Time Warping
  • Recurrent Neural Networks
Research Effort
8

Proposed System Block Diagram
9

Kinect
10

Kinect
11
A Kinect sensor (also called a Kinect) is a physical device that contains cameras, a microphone array, and an accelerometer as well as a software pipeline that processes color, depth, and skeleton data.

Kinect Skeleton Tracking
12
Kinect provides data about 20 different skeleton joints, including:
• An accurate 3D position for each joint.
• Each joint's orientation.

Go-Stop Detector
13

Go-Stop Detector
14
• Detects the start and end of each sign, using a threshold to differentiate between the signing and non-signing space.

Go-Stop Detector
15
• A threshold on the hands' 3D positions differentiates between the signing space and the non-signing space.
• Three consecutive frames in the signing (or non-signing) space are required to flag the start (or end) of a sign.
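The two rules above can be sketched as a small state machine. This is a minimal illustration, assuming the signing space is defined by the hand's height above the hip; the slides do not give the actual threshold axis or value, so both are illustrative assumptions.

```python
# Sketch of the Go-Stop detector: a hand is "in the signing space" when it
# rises above a threshold relative to the hip, and a sign start/end is
# flagged only after three consecutive frames agree. Threshold value and
# the choice of the vertical axis are assumptions, not from the slides.

SIGNING_THRESHOLD = 0.15  # metres above the hip centre (assumed value)
CONSECUTIVE_FRAMES = 3    # frames required to flag a transition

class GoStopDetector:
    def __init__(self):
        self.signing = False  # current state: inside a sign or not
        self.streak = 0       # consecutive frames disagreeing with the state

    def update(self, hand_y, hip_y):
        """Feed one skeleton frame; return 'start', 'stop', or None."""
        in_space = (hand_y - hip_y) > SIGNING_THRESHOLD
        if in_space != self.signing:
            self.streak += 1
            if self.streak >= CONSECUTIVE_FRAMES:
                self.signing = in_space
                self.streak = 0
                return "start" if in_space else "stop"
        else:
            self.streak = 0
        return None

# Example: the hand stays raised for three frames, so a start is flagged
# on the third raised frame.
det = GoStopDetector()
events = [det.update(y, 0.0) for y in [0.0, 0.2, 0.2, 0.2, 0.2]]
# events == [None, None, None, "start", None]
```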

Go-Stop Detector
16

Sign Recorder
17

Preprocessing System
18

Preprocessing System
19

Feature Extraction
20
• Features captured from the skeleton stream:
1. Right hand joint x, y, and depth.
2. Left hand joint x, y, and depth.
3. Hip Center joint x, y, and depth.

Feature Vector
21
• The feature vector consists of 6 values per skeleton frame.

Feature Number | Feature Value
1 | Right Hand x – Hip Center x
2 | Right Hand y – Hip Center y
3 | Right Hand depth – Hip Center depth
4 | Left Hand x – Hip Center x
5 | Left Hand y – Hip Center y
6 | Left Hand depth – Hip Center depth

The Hip Center joint is needed to express the hands' positions relative to a static point, compensating for the signer's position in front of the Kinect.
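The table above translates directly into code. A minimal sketch, assuming each skeleton frame is a mapping from joint name to an (x, y, depth) triple; the joint names and sample coordinates are illustrative:

```python
# Build the 6-value feature vector: hand coordinates expressed relative to
# the Hip Center joint, so the features do not depend on where the signer
# stands in front of the Kinect.

def feature_vector(frame):
    """frame maps joint name -> (x, y, depth); returns the 6 features."""
    hx, hy, hd = frame["hip_center"]
    rx, ry, rd = frame["right_hand"]
    lx, ly, ld = frame["left_hand"]
    return [rx - hx, ry - hy, rd - hd,   # features 1-3: right hand
            lx - hx, ly - hy, ld - hd]   # features 4-6: left hand

# Illustrative frame (coordinates in metres, depth from the sensor):
frame = {"hip_center": (0.1, 0.9, 2.0),
         "right_hand": (0.4, 1.2, 1.7),
         "left_hand": (-0.2, 1.1, 1.8)}
fv = feature_vector(frame)
```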

Linear Resampling
22
The Kinect camera records the skeleton at a rate of 30 frames/second. However, this is only the average rate: in practice, the time measured between two consecutive samples varies from 30 ms to 100 ms.
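Because of that jitter, each feature trajectory is interpolated onto a uniform time grid before training. A minimal sketch of linear resampling for one feature component; the 30 ms step follows the nominal frame rate above, while the sample timestamps and values are illustrative:

```python
# Linearly resample an irregularly-timed trajectory onto a uniform grid.
# timestamps are in milliseconds; values are one feature component.

def resample(timestamps, values, step=30.0):
    """Linearly interpolate (timestamps, values) onto a uniform grid."""
    out, t, i = [], timestamps[0], 0
    while t <= timestamps[-1]:
        while timestamps[i + 1] < t:      # advance to the bracketing segment
            i += 1
        t0, t1 = timestamps[i], timestamps[i + 1]
        v0, v1 = values[i], values[i + 1]
        frac = (t - t0) / (t1 - t0)       # position of t inside [t0, t1]
        out.append(v0 + frac * (v1 - v0))
        t += step
    return out

# Gaps of 40, 60, and 20 ms become a uniform 30 ms grid:
uniform = resample([0.0, 40.0, 100.0, 120.0], [0.0, 4.0, 10.0, 12.0])
```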

Trajectory Smoothing
23
• To decrease the effect of noisy sensor measurements (spikes).
• Next slide: demo of trajectory smoothing for one component.
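One common way to implement such smoothing is a centred moving average; the slides do not state which filter was used, so the filter choice and window size here are assumptions:

```python
# Illustrative moving-average smoother for one trajectory component,
# damping isolated sensor spikes. Window size is an assumed parameter.

def smooth(values, window=3):
    """Centred moving average; the window shrinks at the boundaries."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

# A spike at index 2 is pulled back toward its neighbours:
smoothed = smooth([1.0, 1.0, 5.0, 1.0, 1.0])
```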

Trajectory Smoothing
24

Hidden Markov Model Classifier
25

Hidden Markov Model
26

Hidden Markov Model
27
• To build a Hidden Markov model we need:

Hidden Markov Model
28
• Each hidden Markov model has a Topology

Hidden Markov Model
29
• Each hidden state's emission probability distribution is a 6-D Gaussian distribution.
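Concretely, each state scores a 6-D feature vector with a Gaussian log-density. The sketch below assumes a diagonal covariance for simplicity; the slides do not specify the covariance structure:

```python
# Emission model sketch: log density of a 6-D diagonal-covariance Gaussian,
# evaluated at one feature vector. Diagonal covariance is an assumption.
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal multivariate Gaussian at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

# At the mean of a 6-D unit-variance Gaussian the log density is -3*log(2*pi):
x = [0.0] * 6
logp = log_gaussian(x, mean=[0.0] * 6, var=[1.0] * 6)
```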

Training Set Generation
30
• For each of the 40 signs, a long video containing 60 samples has been recorded and segmented by the Go-Stop detector into 60 annotated samples per sign, generating the training set and the test set.
• These annotated samples are used as observation sequences from which the HMMs are trained using the Baum-Welch algorithm.
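The bookkeeping for the data sets can be sketched as follows. The 60-sample pool per sign and the 20-sample test size come from the slides (the results tables use 20 test samples per sign); the shuffling, seed, and dictionary layout are illustrative:

```python
# Split each sign's segmented samples into a training set (for Baum-Welch)
# and a held-out test set. Each sample would be one observation sequence
# of 6-D feature vectors; integers stand in for them here.
import random

def split_samples(samples_per_sign, test_size=20, seed=0):
    """Shuffle each sign's samples and hold out `test_size` for testing."""
    rng = random.Random(seed)
    train, test = {}, {}
    for sign, samples in samples_per_sign.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        test[sign] = shuffled[:test_size]
        train[sign] = shuffled[test_size:]
    return train, test

# 40 signs x 60 dummy samples each, as described above:
data = {f"sign_{i}": list(range(60)) for i in range(40)}
train, test = split_samples(data)
```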

Hidden Markov Model
31
• In the Sign Language context:
  Observation: a single skeleton frame (the hands' positions in 3D space).
  Hidden state emission: a 6-D Gaussian distribution.

Hidden Markov Model
32
• Evaluation Algorithm

Hidden Markov Model Classifier
33

Experimental Results
34
• Go-Stop Detector
  • Reliable segmentation of long videos.
  • Minimum transition time: 300 ms.

Experimental Results
35
• Hidden Markov Model Classifier performance (online mode)

Person | Test Set Size per Sign | Classification Accuracy
Original Signer | 20 | 95.125%
Different Signer | 20 | 92.5%

• Hidden Markov Model Classifier performance (offline mode)

Person | Test Set Size per Sign | Classification Accuracy
Original Signer | 20 | 99.25%

Experimental Results
36
• Hidden Markov Model Classifier timing.
• The algorithm used for classification is the Forward-Backward algorithm.

Sign Timing | Time Needed to Classify
Average | 12.68 ms
Maximum | 20.2 ms
Minimum | 8.75 ms
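For classification, the forward pass alone suffices to score a sequence under each sign's HMM; the sign whose model gives the highest likelihood wins. A minimal sketch with discrete emissions for brevity (the real system evaluates 6-D Gaussian emissions, and the toy two-sign models below are invented for illustration):

```python
# Classify an observation sequence by evaluating it under each sign's HMM
# with the forward algorithm and taking the highest-scoring sign.

def forward_likelihood(obs, pi, A, B):
    """P(obs | HMM) via the forward algorithm (discrete emissions).
    pi: initial state probs, A: transition matrix, B: emission matrix."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[s] * A[s][t] for s in range(n)) * B[t][o]
                 for t in range(n)]
    return sum(alpha)

def classify(obs, models):
    """Return the sign whose HMM scores the observation highest."""
    return max(models, key=lambda sign: forward_likelihood(obs, *models[sign]))

# Two toy 2-state HMMs whose states prefer different symbols:
models = {
    "hello":  ([1.0, 0.0], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.1, 0.9]]),
    "thanks": ([1.0, 0.0], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.9, 0.1]]),
}
```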

Experimental Results
37
• Hidden Markov Model Hidden state count.

Conclusion
38
• A system has been developed that automatically segments a live video stream into isolated signs using Kinect and translates these signs into text.
• Signer-dependent performance is 95.125%; signer-independent performance is 92.5%.

Thank you!
39