audio captcha

7/31/2019 Audio CaptCha

AN IMPROVED AUDIO CAPTCHA



Contents

What are CAPTCHAs?
Problem with current audio CAPTCHAs.
Testing of current CAPTCHAs.
Categories of audio CAPTCHA.
Algorithm used and its details.
Need for audio reCAPTCHA.
Applications.
Pitfalls.
Conclusion.


WHAT ARE CAPTCHAS?

CAPTCHAs are tests generated by computers that humans can generally pass but current computer programs cannot.


THE PROBLEM WITH CURRENT AUDIO CAPTCHAS

In some cases the human passing rate is only 70%!

To make the CAPTCHAs secure, noise was injected into the audio files, making them harder for both computers and humans to pass.


HOW DID WE TEST THE CURRENT AUDIO CAPTCHAs?

Selected three different types of audio CAPTCHAs: Google, reCAPTCHA, and Digg.

Collected 1000 CAPTCHAs per type of audio CAPTCHA to use for training and testing.

Created an ASR system using machine learning techniques.


Three categories of audio CAPTCHA

reCAPTCHA audio CAPTCHA: multiple voices, digits, and background noise that is backwards speech

Google audio CAPTCHA: digits, single voice, backwards speech

Digg audio CAPTCHA: digits and letters, static/water for noise


THE ALGORITHM

Given the .wav file of an audio CAPTCHA:

Segmentation: select the portions of the audio that are most likely digits or letters.

Recognition: extract features from each segment, classify the segment as a digit, letter, or noise, and output the label.

Stop once a maximum number of segments have been classified.


ALGORITHM DETAILS - SEGMENTATION

CAPTCHAs were manually labeled and segmented. We created training segments using this information.

For testing, we chose the highest energy peaks in the test CAPTCHA and selected fixed-size segments roughly centered at the peaks.
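
The test-time segmentation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame length, the segment length, and the rule that a later peak is skipped when its window would overlap an earlier pick are all assumptions, and the function names are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Sum of squared amplitudes for each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def select_segments(samples, frame_len, seg_len, max_segments):
    """Pick up to max_segments fixed-size windows roughly centered on the
    highest-energy frames, skipping peaks that overlap an earlier pick
    (the overlap rule is an assumption, not taken from the slides)."""
    energies = frame_energies(samples, frame_len)
    # Frame indices ordered from highest to lowest energy.
    order = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    chosen = []
    for idx in order:
        center = idx * frame_len + frame_len // 2
        # Clamp the window so it stays inside the signal.
        start = max(0, min(center - seg_len // 2, len(samples) - seg_len))
        if all(abs(start - s) >= seg_len for s in chosen):
            chosen.append(start)
        if len(chosen) == max_segments:
            break
    # Return (start, segment) pairs in temporal order.
    return [(s, samples[s:s + seg_len]) for s in sorted(chosen)]


# Usage: a silent signal with two bursts of different loudness.
signal = [0.0] * 100
signal[20:25] = [1.0] * 5
signal[70:75] = [2.0] * 5
segments = select_segments(signal, frame_len=5, seg_len=10, max_segments=2)
```

Both bursts end up inside a selected window, and the windows are returned in the order they occur in the audio.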


ALGORITHM DETAILS - FEATURES

We used three popular techniques for extracting features from speech to derive 5 sets of features from the audio.

Mel-frequency cepstral coefficients (MFCC)

Perceptual linear prediction (PLP)

Relative spectral transform with PLP (RASTA-PLP)
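
As an illustration of the first technique only, here is a simplified MFCC computation for a single frame. The slides do not say how the 5 feature sets were derived from the three techniques, so everything specific here is an assumption: the Hamming window, the naive DFT, 20 triangular mel filters, and 12 output coefficients. PLP and RASTA-PLP are not covered.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies, mapped to DFT bin indices.
    points = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * hz / sample_rate) for hz in points]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Log mel filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energies so empty filters do not produce log(0).
    log_energies = [math.log(max(sum(f * p for f, p in zip(filt, spec)), 1e-10))
                    for filt in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Usage: one 64-sample frame of a 1 kHz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(64)]
coeffs = mfcc(tone, sample_rate=8000)
```

A real system would compute these coefficients over many overlapping frames per segment; a single frame is used here to keep the sketch short.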


ALGORITHM DETAILS - AdaBoost

Used decision stumps for weak classifiers.

For each type of audio CAPTCHA we created enough classifiers to label a segment as a digit, letter, or noise.

Created 11 to 37 classifiers.

Each classifier returns a value which represents its confidence that the segment should be labeled as a digit, letter, or noise.
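
A minimal sketch of AdaBoost with decision stumps that returns a confidence score, under stated assumptions. The range of 11 to 37 classifiers would be consistent with one classifier per label (10 digits plus noise, or 10 digits and 26 letters plus noise), but the slides do not say so; the one-vs-rest layout, the number of boosting rounds, and all function names below are illustrative choices.

```python
import math


def stump_predict(row, f, thr, pol):
    """A decision stump: threshold a single feature."""
    return pol if row[f] >= thr else -pol


def train_stump(X, y, w):
    """Find the (feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def train_adaboost(X, y, rounds=10):
    """Binary AdaBoost; y holds +1/-1. Returns a list of (alpha, f, thr, pol)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep alpha finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Raise the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, f, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def confidence(model, row):
    """Signed score: larger means more confident that the label applies."""
    return sum(alpha * stump_predict(row, f, thr, pol) for alpha, f, thr, pol in model)


def train_one_vs_rest(X, labels, rounds=10):
    """One boosted classifier per label (assumed layout)."""
    return {lab: train_adaboost(X, [1 if l == lab else -1 for l in labels], rounds)
            for lab in set(labels)}


def classify(models, row):
    """Pick the label whose classifier reports the highest confidence."""
    return max(models, key=lambda lab: confidence(models[lab], row))


# Usage with a toy one-dimensional feature.
X = [[0.0], [0.2], [0.9], [1.1]]
labels = ["digit", "digit", "noise", "noise"]
models = train_one_vs_rest(X, labels, rounds=3)
```

Taking the label with the highest confidence is one straightforward way to turn the per-classifier scores into a single decision.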


ALGORITHM DETAILS - SVM

Created a single multiclass classifier using all the training segments (from 900 CAPTCHAs).


ALGORITHM DETAILS - k-NN

Created 5 classifiers, one for each of the feature sets.
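
A minimal k-NN sketch with one classifier per feature set. The slides do not state the value of k, the distance measure, or how the 5 classifiers' outputs were combined; Euclidean distance, k = 3, and a majority vote across feature sets are assumptions made for illustration, and the feature-set names are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_vectors, train_labels, query, k=3):
    """Label query by majority vote among its k nearest training vectors."""
    nearest = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classify_with_feature_sets(train_sets, train_labels, query_sets, k=3):
    """Run one k-NN classifier per feature set and take a majority vote
    (the combination rule is an assumption, not from the slides)."""
    votes = [knn_classify(train_sets[name], train_labels, query_sets[name], k)
             for name in train_sets]
    return Counter(votes).most_common(1)[0][0]


# Usage with two toy feature sets describing the same four segments.
train_sets = {
    "set_a": [[0.0], [0.1], [1.0], [1.1]],
    "set_b": [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]],
}
train_labels = ["3", "3", "noise", "noise"]
query_sets = {"set_a": [0.05], "set_b": [0.0, 0.1]}
label = classify_with_feature_sets(train_sets, train_labels, query_sets)
```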


THE ALGORITHM

Input: Audio CAPTCHA as an audio file

Segmentation: find the highest energy peak, and extract a fixed-size segment centered at that peak.

Recognition: extract features from the segment, give the segment to the classifier, and obtain a label.

Stop extracting segments once all segments have been labeled or a max solution size is reached.
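
The stopping rule can be sketched as a short loop. This assumes segments arrive ordered from highest to lowest peak energy as (start, segment) pairs, that segments labeled as noise are left out of the solution, and that the final answer is read in temporal order; the slides confirm none of these details, and `solve_captcha` and the `classify` callback are hypothetical names.

```python
def solve_captcha(ranked_segments, classify, max_solution_size):
    """Label segments in peak order, keep non-noise labels, and stop once
    every segment is labeled or the solution reaches max_solution_size."""
    kept = []
    for start, segment in ranked_segments:
        label = classify(segment)
        if label != "noise":
            kept.append((start, label))
        if len(kept) == max_solution_size:
            break
    # Read the kept labels in the order they occur in the audio.
    return "".join(label for _, label in sorted(kept))


# Usage with a stand-in classifier that looks labels up in a table.
fake_labels = {"seg_a": "4", "seg_b": "7", "seg_n": "noise", "seg_c": "1"}
ranked = [(50, "seg_b"), (10, "seg_a"), (30, "seg_n"), (70, "seg_c")]
short_answer = solve_captcha(ranked, fake_labels.get, max_solution_size=2)
full_answer = solve_captcha(ranked, fake_labels.get, max_solution_size=5)
```

With a limit of 2 the loop stops after the two strongest peaks; with a limit of 5 it runs until every segment has been labeled.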


ANALYSIS OF CURRENT AUDIO CAPTCHAs

Used three machine learning techniques to perform ASR on the CAPTCHAs:

AdaBoost

Support Vector Machines (SVM)

k-Nearest Neighbor (k-NN)

[Figure: bar chart of exact match rate (%, axis 0 to 80) for Google, reCAPTCHA, and Digg audio CAPTCHAs, with one bar each for AdaBoost, SVM, and k-NN.]
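
The slides do not define the exact match rate. A natural reading, assumed here, is the percentage of test CAPTCHAs whose predicted solution is identical to the true solution. The sample strings below are made up for illustration and are not results from the study.

```python
def exact_match_rate(predicted, expected):
    """Percentage of CAPTCHAs whose predicted solution equals the true one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * matches / len(expected)


# Usage: three of four hypothetical solutions match exactly.
rate = exact_match_rate(["1234", "5678", "90", "42"],
                        ["1234", "5679", "90", "42"])
```

Under this definition a solution with a single wrong character counts as a complete miss, which makes the metric stricter than per-character accuracy.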



THE GOAL

Make a secure audio CAPTCHA which will be easier for a human to pass and harder for a computer to pass.

Equate solving a CAPTCHA with doing some useful work. In other words, create an audio reCAPTCHA.


WHAT IS reCAPTCHA?

reCAPTCHA helps digitize text on which OCR fails by using the text as its CAPTCHA.

Since millions of people solve CAPTCHAs each day, millions of words get digitized each day!


THE AUDIO RECAPTCHA

Takes advantage of the human ability to understand words through context.

Will help transcribe digital audio on which ASR systems fail.

The audio being used was originally recorded with the intention that it should be easily understood by humans.


Applications

Preventing Comment Spam in Blogs.

Protecting Website Registration.

Protecting Email Addresses From Scrapers.

Online Polls.

Preventing Dictionary Attacks.

Worms and Spam.



ANALYSIS OF SECURITY

Speaker-independent recognition is difficult.

Open vocabularies make it even more difficult for ASR systems.

AM broadcasts and .mp3 compression cause the loss of important data needed for automatic analysis.


CONCLUSION

CAPTCHAs need to be more accessible, yet remain secure and not too difficult for humans.

Deploy audio reCAPTCHA through the reCAPTCHA site.

Help make knowledge captured in audio available in text form.


Thank you
