synopsis voice recognition
7/27/2019 Synopsis voice recognition
HARDENING VOICE BASED AUTHENTICATION SYSTEMS
BY ENHANCING CONFIDENTIALITY AND INTEGRITY
FOR SECURE ENTRY POINTS:
Research Objective to Improve Integrity and Confidentiality
using
A
Synopsis
Submitted By
XXX-XXX
Supervised By
Xxx. Xxxxxx Xxxxxx
Assistant Professor (Computer Science)
Department of Computer Science,
Punjabi University Regional Center of Information & Technology,
Mohali
ABSTRACT
The human voice is a distinctive biometric property that can be used for many purposes.
Considerable research has already been done in this field, covering speech analysis and
synthesis, voice-based authentication, and related areas. High integrity and confidentiality
of the voice print are essential when speech analysis is applied to voice-based
authentication systems. Such systems can be used on their own or in combination with other
biometric or non-biometric authentication techniques. In this master's thesis research we
will experiment with various voice integrity methods from speech analysis research, select
the most appropriate method, and improve it to a higher level of accuracy. If none of the
existing techniques proves adequate, the research will lead to the design of a new algorithm.
1. INTRODUCTION
People have always kept secrets and protected their possessions. This protection has
evolved from the physical to the technological: technology is now used to restrict
access to our resources, and user authentication is the key to the door.
Authentication techniques have developed along with it, and access to our
resources is now controlled using three main methods:
Knowledge-based authentication relies on information that authorized individuals
know and unauthorized individuals do not, e.g. a PIN or a password.
Object-based authentication relies on possessing a token or tool that permits the
person access to the controlled resource, e.g. keys, pass cards or a secure ID.
Biometric-based authentication measures an individual's unique physical or behavioural
characteristics. It exists today in various forms such as fingerprint verification, retinal
scans, facial analysis, analysis of vein structures and voice authentication.
1.1 Speech
Now that we are aware of the technologies voice authentication employs, we need to
look at how the voice is produced, the characteristics that allow us to extract
meaning from it, and the method by which it can be converted into a form that can be
handled by computer systems. In doing this, particular attention will be paid to those
characteristics of the voice that render it unique for each individual, therefore
allowing their identification. To do this, we can examine the physiological component
of human speech, which is produced by the human vocal tract.
In simple terms, the voice is created by air passing over the larynx or other parts of
the vocal tract. The larynx vibrates, creating an acoustic wave, essentially a hum,
which is modified by the motion of the palate, tongue and lips. Sounds produced by
the larynx are called phonated or voiced sounds. Examples of voiced sounds would
be the "m" in "mud" or the "r" in "ram". Simultaneously, other sounds are produced by
other parts of the vocal tract; for instance, whispered sounds are created by air
rushing over the arytenoid cartilage at the back of the throat. Sounds not originating
in the larynx are called unvoiced sounds. Examples of these would be the "f" in "fish"
or the "s" in "sea".
All sounds produced are, at the same time, fundamentally influenced by the actual
shape of the vocal tract. This shape is brought about both as a consequence of
hereditary and developmental factors, and of environmental factors.
In parallel to these physiological characteristics, speech contains a behavioural
component. This manifests itself as the accent of the voice, and affects how quickly
words are spoken, how sounds are pronounced and emphasized, and what other
mannerisms are applied to speech.
Together these physiological and behavioural factors combine to produce voice
patterns that are essentially unique for every individual, and are difficult or
impossible to duplicate short of recording the voice.
When analyzed using modern technology, human speech appears to be rather
inefficient in terms of the time and energy expended to transfer information. Speech is
constructed out of various sounds, termed phonemes. Common English usage
utilises around 40 phonemes, analogous to the characters of the phonetic alphabet
seen in most dictionaries. For instance, the word "mud" uses three phonemes,
denoted /m/ (the "mmm" sound), /u/ (the "uh"), and /d/.
1.2 Digitizing human speech
The human voice produces a highly complex acoustic wave, which, fortunately, the
human ear and brain have evolved to interpret effectively. In technical terms the
voice is an analogue signal. An analogue signal is defined as one with a
continuously variable physical value. A single-frequency tone, such as the dial tone
on a telephone, is a simple analogue signal. Such a simple signal takes the
form of a sine wave.
In explaining how analogue signals can be converted to digital data, I will use the
example of the sine wave. This is appropriate because of a basic rule of signal
theory called Fourier's principle. In essence, this rule states that any signal, even
one as complex as a voiceprint, is in fact a combination of many simple sine waves
of varying frequencies (cycles per second) and amplitudes (strengths), added
together. This means that a linear process such as sampling, once shown to work for a
sine wave, will work for any other kind of signal as well.
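Fourier's principle can be illustrated with a minimal Python sketch. It builds a signal by adding two sine waves and then uses a naive discrete Fourier transform to recover the amplitude of each component. The frequencies (100 Hz and 250 Hz), the amplitudes, the sample rate and the function names are assumptions chosen for illustration; none of them come from this synopsis.

```python
import cmath
import math

SAMPLE_RATE = 1000  # samples per second (assumed for illustration)
N = 1000            # one second of signal, so DFT bin k corresponds to k Hz


def synthesize(components, n=N, rate=SAMPLE_RATE):
    """Sum sine waves given as (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2 * math.pi * f * t / rate) for f, a in components)
        for t in range(n)
    ]


def dft_amplitude(signal, k):
    """Amplitude of the sinusoid found in DFT bin k (naive, one bin at a time)."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))
    return 2 * abs(coeff) / n


# A "complex" signal made of two simple sine waves added together.
signal = synthesize([(100, 5.0), (250, 2.0)])

# Probe three frequencies: the two that were mixed in and one that was not.
spectrum = {k: dft_amplitude(signal, k) for k in (50, 100, 250)}
for freq, amplitude in spectrum.items():
    print(f"{freq} Hz: amplitude {amplitude:.3f}")
```

The bins at 100 Hz and 250 Hz report the amplitudes that were mixed in, while the 50 Hz bin is essentially empty. A production system would use a fast Fourier transform rather than this per-bin sum.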
The initial problem is to convert this analogue sound signal into a digital signal (i.e. a
sequence of numbers that a computer can input and manipulate). The first stage,
converting it from a sound to an electrical wave, is simple and is done with a microphone.
The second stage involves converting the signal from a continuous wave to a series
of discrete voltage measurements. This is done by a process called sampling.
Sampling involves measuring the voltage of the signal at regular intervals, many
times per second.
If we use the sine wave of Figure 2 as an example: say the wave has a peak-to-peak
amplitude of 10 volts, that is to say its peak positive voltage is +5 volts and its
correspondingly lowest negative voltage is -5 volts. We see that the wave goes through a
complete cycle every 10 milliseconds (ms); this means that it completes 100 cycles every
second, so it has a frequency of 100 Hz. (Hz, hertz, is the measure of cycles or
measured events per second.)
To effectively sample this signal we need to measure its voltage at least every 5 ms, a
sampling frequency of 200 Hz. We know that we must use at least this sampling
frequency because of a fundamental rule of communications called Shannon's sampling
theorem, which says that in order to fully sample any signal, without losing any part of
it, you must have a sampling frequency of at least double the highest frequency contained
in the signal. In practice the rate is chosen somewhat above this limit, because samples
taken at exactly twice the frequency can all fall on the wave's zero crossings. The
accompanying figure shows the sampling of the signal.
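The effect of the sampling frequency can be sketched in a few lines of Python. The example samples the 100 Hz, 5 volt peak sine wave described above at three rates. The rates of 1000 Hz and 150 Hz, the sample count and the function name are assumptions added for illustration.

```python
import math


def sample_sine(freq_hz, rate_hz, count, peak_volts=5.0):
    """Measure a sine wave's voltage at regular intervals of 1/rate_hz seconds."""
    return [
        peak_volts * math.sin(2 * math.pi * freq_hz * n / rate_hz)
        for n in range(count)
    ]


# 100 Hz tone sampled well above twice its frequency: the wave shape survives.
fine = sample_sine(100, 1000, 10)

# Sampled at exactly 200 Hz, starting on a zero crossing: every sample is ~0 V.
critical = sample_sine(100, 200, 10)

# Sampled at 150 Hz (too slow): the samples match an inverted 50 Hz tone,
# so the original 100 Hz tone can no longer be told apart from it (aliasing).
too_slow = sample_sine(100, 150, 10)
alias = [-v for v in sample_sine(50, 150, 10)]

print([round(v, 2) for v in fine])
print([round(abs(v), 2) for v in critical])
print(all(abs(a - b) < 1e-9 for a, b in zip(too_slow, alias)))
```

The second case is why practical systems sample comfortably above twice the highest frequency of interest rather than exactly at it.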
2. LITERATURE SURVEY
3. STATEMENT OF THE PROBLEM
High integrity is a critical property of speech recognition and voice authentication
systems. Low integrity leads to a higher False Rejection Rate (FRR) or False Acceptance
Rate (FAR), either of which indicates a failure of the system.
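As a minimal sketch of how these two error rates are usually computed, the Python below counts false accepts and false rejects at a given decision threshold. FAR is taken as the fraction of impostor attempts that are accepted and FRR as the fraction of genuine attempts that are rejected. The score lists are hypothetical values invented for illustration, not measured data, and the function name and the 0.70 threshold are likewise assumptions.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a system that accepts scores >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Hypothetical similarity scores in [0, 1]; not measured data.
genuine = [0.91, 0.84, 0.77, 0.66, 0.95]   # true speaker attempts
impostor = [0.12, 0.35, 0.58, 0.71, 0.20]  # other speakers' attempts

far, frr = far_frr(genuine, impostor, threshold=0.70)
print(f"threshold 0.70: FAR={far:.2f} FRR={frr:.2f}")

# Raising the threshold trades false accepts for false rejects.
strict_far, strict_frr = far_frr(genuine, impostor, threshold=0.80)
print(f"threshold 0.80: FAR={strict_far:.2f} FRR={strict_frr:.2f}")
```

The two rates move in opposite directions as the threshold changes, so any comparison between systems needs both values reported at the same operating point.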
4. OBJECTIVES OF THE PROJECT WORK
In order to satisfy the goal of the research, the following objectives have been identified:
1. Study speech processing techniques
2. Study voice authentication systems
3. Study speech processing research methods
4. Study various hybrid authentication systems based on voice authentication
5. Use and test various speech recognition models
6. Use and test speech integrity in existing speech recognition models
7. Select and create an improved speech recognition technique with high integrity
8. Test and compare the results with existing systems
9. Test in real time on local and online voice print data
5. WORK DONE SO FAR
To achieve the goal of the research, the following work has been done so far:
1. Literature survey and problem identification
2. Studied various methods of speech recognition and voice authentication
3. Studied the short-time Fourier analysis and cepstral analysis algorithms
4. Studied various pitch extraction techniques such as the autocorrelation technique,
the cepstrum technique, tracking and smoothing, etc.
5. Studied the MATLAB simulation environment
6. Studied the existing speech/voice recognition and authentication systems
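One of the pitch extraction techniques listed above, the autocorrelation technique, can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB work referred to in the list. The sample rate, the 50 to 400 Hz search range, the synthetic test frame and the function name are all assumptions.

```python
import math


def autocorrelation_pitch(frame, rate_hz, min_hz=50, max_hz=400):
    """Estimate a frame's fundamental frequency from its autocorrelation peak."""
    min_lag = int(rate_hz / max_hz)  # shortest period considered
    max_lag = int(rate_hz / min_hz)  # longest period considered
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return rate_hz / best_lag


RATE = 8000  # samples per second (assumed)

# Synthetic "voiced" frame: a 125 Hz fundamental plus a weaker second harmonic.
frame = [
    math.sin(2 * math.pi * 125 * n / RATE)
    + 0.5 * math.sin(2 * math.pi * 250 * n / RATE)
    for n in range(800)
]

pitch = autocorrelation_pitch(frame, RATE)
print(f"estimated pitch: {pitch:.1f} Hz")
```

The frame correlates most strongly with itself when the delay equals one pitch period, which is why the lag of the largest peak gives the fundamental. Real speech would additionally need framing, windowing and the tracking and smoothing step mentioned in the list.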