synopsis voice recognition

Upload: amardeepsinghseera

Post on 14-Apr-2018


  • 7/27/2019 Synopsis voice recognition

    1/9

    HARDENING VOICE BASED AUTHENTICATION SYSTEMS

    BY ENHANCING CONFIDENTIALITY AND INTEGRITY

    FOR SECURE ENTRY POINTS:

    Research Objective to Improve Integrity and Confidentiality

    using

    A

    Synopsis

    Submitted By

    XXX-XXX

    Supervised By

    Xxx. Xxxxxx Xxxxxx

    Assistant Professor (Computer Science)

    Department of Computer Science,

    Punjabi University Regional Center of Information & Technology,


    Mohali


    ABSTRACT

    Speech is a unique biometric property of human beings, which can be used for

    various purposes. Some research has already been done in this field, covering areas

    such as speech analysis and synthesis and voice-based authentication. High integrity

    and confidentiality of the voice print are very important when applying speech analysis

    to voice-based authentication systems. Voice recognition based authentication systems

    can be used both individually and in combination with other biometric or other

    authentication techniques. In this master's thesis research we will experiment with

    various voice integrity methods under speech analysis, select the best and most

    appropriate method, and improve it to a higher accuracy level. If none of the

    techniques is found to be up to the mark, the research will lead to the creation of an

    entirely new algorithm.

    1. INTRODUCTION

    People have always kept secrets and protected their possessions. This protection has

    evolved from the physical to the technological: technology is now used to restrict

    access to our resources, and user authentication is the key to the door.

    Authentication techniques have developed alongside it, and access to our

    resources is now controlled using three main methods:

    Knowledge-based authentication is based on information that authorized individuals will

    know and unauthorized individuals will not, e.g. a PIN or a password.

    Object-based authentication is based on possessing a token or tool that permits the

    person access to the controlled resource, e.g. keys, pass cards or a secure ID.

    Biometric-based authentication measures individuals' unique physical or behavioral

    characteristics. It exists today in various forms such as fingerprint verification, retinal

    scans, facial analysis, analysis of vein structures and voice authentication.

    1.1 Speech

    Now that we are aware of the technologies voice authentication employs, we need to

    look at how the voice is produced, the characteristics that allow us to extract

    meaning from it, and the method by which it can be converted into a form that can be

    handled by computer systems. In doing this, particular attention will be paid to those

    characteristics of the voice that render it unique for each individual, therefore

    allowing their identification. To do this, we can examine the physiological component

    of human speech, which is produced by the human vocal tract.

    In simple terms, the voice is created by air passing over the larynx or other parts of

    the vocal tract. The larynx vibrates, creating an acoustic wave, essentially a hum,

    which is modified by the motion of the palate, tongue and lips. Sounds produced by

    the larynx are called phonated or voiced sounds. Examples of voiced sounds would

    be the "m" in "mud" or the "r" in "ram". Simultaneously, other sounds are produced by

    other parts of the vocal tract; for instance, whispered sounds are created by air

    rushing over the arytenoid cartilage at the back of the throat. Sounds not originating

    in the larynx are called unvoiced sounds. Examples of these would be the "f" in "fish"

    or the "s" in "sea".

    All sounds produced are, at the same time, fundamentally influenced by the actual

    shape of the vocal tract. This shape is brought about both as a consequence of

    hereditary and developmental factors, and of environmental factors.

    In parallel to these physiological characteristics, speech contains a behavioural

    component. This manifests itself as the accent of the voice, and affects how quickly

    words are spoken, how sounds are pronounced and emphasized, and what other

    mannerisms are applied to speech.

    Together these physiological and behavioural factors combine to produce voice

    patterns that are essentially unique for every individual, and are difficult or

    impossible to duplicate short of recording the voice.

    When analyzed using modern technology, human speech appears to be rather

    inefficient, in terms of time and energy expended to transfer information. Speech is

    constructed out of various sounds, termed phonemes. Common English usage

    utilises around 40 phonemes, analogous to the characters of the phonetic alphabet

    seen in most dictionaries. For instance, the word "mud" uses three phonemes,

    denoted /m/ (the "mmm" sound), /u/ (the "uh"), and /d/.

    1.2 Digitizing human speech.

    The human voice produces a highly complex acoustic wave, which, fortunately, the

    human ear and brain have evolved to interpret effectively. In technical terms the

    voice is an analogue signal. An analogue signal is defined as one with a

    continuously variable physical value. A single frequency tone, such as the dial tone

    on a telephone, will be a simple analogue signal. Such a simple signal will take the

    form of a sine wave.

    In explaining how analogue signals can be converted to digital data, I will use the

    example of the sine wave. This is appropriate because of a basic rule of signal

    theory called Fourier's principle. In essence, this rule states that any signal, even

    one as complex as a voiceprint, is in fact a combination of many simple sine waves,

    of varying frequencies (cycles per second) and amplitudes (strengths), added

    together. This means that any process that works for a sine wave will work for any

    other kind of signal as well.
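
    Fourier's principle can be illustrated with a short sketch. The example below is not
    part of the proposed research method: it assumes a made-up test signal (a 100 Hz tone
    plus a 250 Hz tone, sampled 1000 times per second) and uses a naive discrete Fourier
    transform to recover the two sine waves it was built from. The function names and all
    parameter values are illustrative assumptions.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the amplitude of each frequency bin of a real signal (naive O(n^2) DFT)."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        # Scale so that a sine wave of amplitude A shows up as A in its bin.
        scale = 1 / n if k == 0 else 2 / n
        amplitudes.append(abs(total) * scale)
    return amplitudes


def dominant_frequencies(samples, sample_rate, threshold=0.1):
    """List (frequency_hz, amplitude) pairs whose amplitude exceeds the threshold."""
    bin_width = sample_rate / len(samples)
    return [
        (k * bin_width, round(a, 3))
        for k, a in enumerate(dft_magnitudes(samples))
        if a > threshold
    ]


# Hypothetical test signal: a 100 Hz tone (amplitude 5) plus a 250 Hz tone (amplitude 2).
SAMPLE_RATE = 1000  # samples per second (assumed value)
signal = [
    5 * math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + 2 * math.sin(2 * math.pi * 250 * t / SAMPLE_RATE)
    for t in range(200)
]
components = dominant_frequencies(signal, SAMPLE_RATE)
print(components)
```

    The two tones were chosen so that each falls exactly on a frequency bin (5 Hz apart
    here); real speech would spread across many bins and would normally be analysed with
    a fast Fourier transform instead of this quadratic-time loop.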

    The initial problem is to convert this analogue sound signal into a digital signal (i.e. a

    sequence of numbers that a computer can input and manipulate). The first stage,

    converting it from a sound to an electrical wave, is done simply with a microphone.

    The second stage involves converting the signal from a continuous wave to a series

    of discrete voltage measurements. This is done by a process called sampling.

    Sampling involves measuring the voltage of the signal at regular intervals, many

    times per second.
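
    As a minimal sketch of the sampling step, the code below measures a signal at regular
    intervals and stores the readings as a list of numbers. The microphone is replaced by
    a hypothetical function returning a voltage for a given time, and the 100 Hz tone and
    8000 samples-per-second rate are assumed values chosen only for illustration.

```python
import math


def sample_signal(signal, duration_s, sample_rate_hz):
    """Measure signal(t) at regular intervals and return the list of readings."""
    count = round(duration_s * sample_rate_hz)
    interval = 1.0 / sample_rate_hz
    return [signal(i * interval) for i in range(count)]


def tone(t):
    """Hypothetical microphone voltage: a 100 Hz sine swinging between -5 V and +5 V."""
    return 5.0 * math.sin(2 * math.pi * 100 * t)


# 8000 measurements per second over one 10 ms cycle gives 80 discrete voltages.
readings = sample_signal(tone, duration_s=0.010, sample_rate_hz=8000)
print(len(readings), round(max(readings), 3), round(min(readings), 3))
```

    A real converter would also round each reading to a fixed number of bits; that step
    is left out here to keep the focus on the regular measurement interval.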

    If we use the sine wave of Figure 2 as an example: say the wave has a peak-to-peak

    amplitude of 10 Volts, that is to say its peak positive voltage is +5 Volts and its

    correspondingly lowest negative voltage is -5 Volts. We see that the wave goes through

    a complete cycle every 10 milliseconds (ms); this means that it completes 100 cycles

    every second, so it has a frequency of 100 Hz. (Hz, Hertz, is the measure of cycles or

    measured events per second.)

    To effectively sample this signal we need to measure its voltage at least every 5 ms, a

    sampling frequency of 200 Hz. We know that we must use at least this sampling

    frequency because of a fundamental rule of communications called Shannon's sampling

    theorem, which says that in order to fully sample any signal, without losing any part

    of it, you must have a sampling frequency of at least double the highest frequency

    contained in the signal (in practice, somewhat more than double). The accompanying

    figure shows the sampling of the signal.
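
    The sampling rule can be checked numerically. The sketch below uses hypothetical
    helper names and assumed sample rates; it computes the minimum sampling frequency for
    the 100 Hz example and shows what happens when the rule is broken: the tone "folds
    back" and appears at a lower frequency (aliasing). It also shows why exactly double is
    a borderline case, since samples taken at exactly 200 Hz can all land on the zero
    crossings of the wave.

```python
import math


def minimum_sample_rate(highest_frequency_hz):
    """Twice the highest frequency present in the signal."""
    return 2 * highest_frequency_hz


def apparent_frequency(signal_hz, sample_rate_hz):
    """Frequency that a sampled pure tone appears to have after sampling.

    Tones above half the sample rate fold back ("alias") to a lower frequency.
    """
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(minimum_sample_rate(100))       # required rate for the 100 Hz example
print(apparent_frequency(100, 1000))  # sampled fast enough: tone is preserved
print(apparent_frequency(100, 150))   # sampled too slowly: tone is misread

# Borderline case: a 5 V, 100 Hz sine sampled at exactly 200 Hz, starting at t = 0.
boundary = [5 * math.sin(2 * math.pi * 100 * i / 200) for i in range(10)]
all_near_zero = max(abs(v) for v in boundary) < 1e-9
print(all_near_zero)
```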

    2. LITERATURE SURVEY


    3. STATEMENT OF THE PROBLEM

    High integrity is a very important and critical property of speech recognition and

    authentication systems. Low integrity can lead to a higher false rejection rate (FRR)

    or false acceptance rate (FAR), either of which indicates the failure of the system.
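
    For concreteness, the two error rates can be computed as follows. This is only a
    sketch: it assumes a verifier that outputs a similarity score and accepts a speaker
    when the score reaches a threshold, and the scores below are made up purely for
    illustration, not measured results.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a verifier that accepts scores >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Made-up similarity scores (not experimental data).
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]   # attempts by the enrolled speaker
impostor = [0.30, 0.55, 0.72, 0.41, 0.20]  # attempts by other speakers

for threshold in (0.5, 0.7, 0.8):
    far, frr = error_rates(genuine, impostor, threshold)
    print(threshold, far, frr)
```

    Raising the threshold lowers the FAR but raises the FRR, so the two values have to be
    balanced against each other when a system is tuned.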

    4. OBJECTIVES OF THE PROJECT WORK

    In order to satisfy the goal of the research, the following objectives have been identified:

    1. Study the Speech Processing techniques

    2. Study the Voice authentication systems

    3. Study the Speech processing research methods

    4. Study various hybrid authentication systems based on voice authentication

    5. Use and test various speech recognition models

    6. Use and test speech integrity in existing speech recognition models

    7. Selection and Creation of an improved speech recognition technique with high

    integrity

    8. Test and compare the results with existing systems

    9. Test in real time on local and online voice print data


    5. WORK DONE SO FAR

    To achieve the goal of the research, the following work has been done so far:

    1. Literature Survey and Problem Identification.

    2. Studied various methods of speech recognition and voice authentication techniques.

    3. Studied the Short-time Fourier analysis and Cepstral analysis algorithms.

    4. Studied various Pitch extraction techniques such as the Autocorrelation technique,

    Cepstrum technique, Tracking and Smoothing, etc.

    5. Studied the MATLAB simulator.

    6. Studied the existing speech/voice recognition and authentication systems.
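
    To illustrate one of the techniques listed above, the sketch below estimates pitch by
    autocorrelation. It is a simplified illustration in Python and not the MATLAB code
    used in the study: the frame is a synthetic 125 Hz signal standing in for a voiced
    speech frame, and the search range of 50 to 400 Hz, the 8000 Hz sample rate and the
    function name are all assumptions.

```python
import math


def autocorrelation_pitch(samples, sample_rate_hz, min_hz=50, max_hz=400):
    """Estimate the fundamental frequency of a voiced frame by autocorrelation.

    The frame must be longer than sample_rate_hz / min_hz samples.
    """
    min_lag = int(sample_rate_hz / max_hz)
    max_lag = int(sample_rate_hz / min_hz)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # The best-matching delay is one pitch period, measured in samples.
    return sample_rate_hz / best_lag


# Synthetic stand-in for a voiced frame: 125 Hz fundamental plus one harmonic.
RATE = 8000  # samples per second (assumed value)
frame = [
    math.sin(2 * math.pi * 125 * n / RATE) + 0.5 * math.sin(2 * math.pi * 250 * n / RATE)
    for n in range(800)
]
pitch = autocorrelation_pitch(frame, RATE)
print(pitch)
```

    Real speech is noisier than this synthetic frame, which is why the estimate is usually
    followed by the tracking and smoothing step mentioned in item 4.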
