
Enhancing security and privacy in biometrics-based authentication systems

by N. K. Ratha, J. H. Connell, and R. M. Bolle

Because biometrics-based authentication offers several advantages over other authentication methods, there has been a significant surge in the use of biometrics for user authentication in recent years. It is important that such biometrics-based authentication systems be designed to withstand attacks when employed in security-critical applications, especially in unattended remote applications such as e-commerce. In this paper we outline the inherent strengths of biometrics-based authentication, identify the weak links in systems employing biometrics-based authentication, and present new solutions for eliminating some of these weak links. Although, for illustration purposes, fingerprint authentication is used throughout, our analysis extends to other biometrics-based methods.

Reliable user authentication is becoming an increasingly important task in the Web-enabled world. The consequences of an insecure authentication system in a corporate or enterprise environment can be catastrophic, and may include loss of confidential information, denial of service, and compromised data integrity. The value of reliable user authentication is not limited to just computer or network access. Many other applications in everyday life also require user authentication, such as banking, e-commerce, and physical access control to computer resources, and could benefit from enhanced security.

The prevailing techniques of user authentication, which involve the use of either passwords and user IDs (identifiers), or identification cards and PINs (personal identification numbers), suffer from several limitations. Passwords and PINs can be illicitly acquired by direct covert observation. Once an intruder acquires the user ID and the password, the intruder has total access to the user's resources. In addition, there is no way to positively link the usage of the system or service to the actual user, that is, there is no protection against repudiation by the user ID owner. For example, when a user ID and password is shared with a colleague there is no way for the system to know who the actual user is. A similar situation arises when a transaction involving a credit card number is conducted on the Web. Even though the data are sent over the Web using secure encryption methods, current systems are not capable of assuring that the transaction was initiated by the rightful owner of the credit card. In the modern distributed systems environment, the traditional authentication policy based on a simple combination of user ID and password has become inadequate.

Fortunately, automated biometrics in general, and fingerprint technology in particular, can provide a much more accurate and reliable user authentication method. Biometrics is a rapidly advancing field that is concerned with identifying a person based on his or her physiological or behavioral characteristics. Examples of automated biometrics include fingerprint, face, iris, and speech recognition. User authentication methods can be broadly classified into three categories,1 as shown in Table 1.


Because a biometric property is an intrinsic property of an individual, it is difficult to surreptitiously duplicate and nearly impossible to share. Additionally, a biometric property of an individual can be lost only in case of serious accident.

Biometric readings, which range from several hundred bytes to over a megabyte, have the advantage that their information content is usually higher than that of a password or a pass phrase. Simply extending the length of passwords to get equivalent bit strength presents significant usability problems. It is nearly impossible to remember a 2K phrase, and it would take an annoyingly long time to type such a phrase (especially without errors). Fortunately, automated biometrics can provide the security advantages of long passwords while retaining the speed and characteristic simplicity of short passwords.

Even though automated biometrics can help alleviate the problems associated with the existing methods of user authentication, hackers will still find that there are weak points in the system, vulnerable to attack. Password systems are prone to brute force dictionary attacks. Biometric systems, on the other hand, require substantially more effort for mounting such an attack. Yet there are several new types of attacks possible in the biometrics domain. This may not apply if biometrics is used as a supervised authentication tool. But in remote, unattended applications, such as Web-based e-commerce applications, hackers may have the opportunity and enough time to make several attempts, or even physically violate the integrity of a remote client, before detection.

A problem with biometric authentication systems arises when the data associated with a biometric feature has been compromised. For authentication systems based on physical tokens such as keys and badges, a compromised token can be easily canceled and the user can be assigned a new token. Similarly, user IDs and passwords can be changed as often as required. Yet, the user only has a limited number of biometric features (one face, ten fingers, two eyes). If the biometric data are compromised, the user may quickly run out of biometric features to be used for authentication.

In this paper, we discuss in more detail the problems unique to biometric authentication systems and propose solutions to several of these problems. Although we focus on fingerprint recognition throughout this paper, our analysis can be extended to other biometric authentication methods. In the next section, "Fingerprint authentication," we detail the stages of the fingerprint authentication process. In the following section, "Vulnerable points of a biometric system," we use a pattern recognition framework for a generic biometric system to help identify the possible attack points. The section "Brute force attack directed at matching fingerprint minutiae" analyzes the resilience of a minutiae-based fingerprint authentication system in terms of the probability of a successful brute force attack. The next two sections, "WSQ-based data hiding" and "Image-based challenge/response method," propose two methods that address some of the vulnerable points of a biometric system. The section "Cancelable biometrics" introduces the concept of "cancelable biometrics" and discusses its application to authentication. Finally, the section "Conclusions" recapitulates the issues discussed and summarizes the proposed new approaches.

Table 1 Existing user authentication techniques

Method                            Examples          Properties
What you know                     User ID           Shared
                                  Password          Many passwords easy to guess
                                  PIN               Forgotten
What you have                     Cards             Shared
                                  Badges            Can be duplicated
                                  Keys              Lost or stolen
What you know and what you have   ATM card + PIN    Shared
                                                    PIN a weak link (writing the PIN on the card)
Something unique about the user   Fingerprint       Not possible to share
                                  Face              Repudiation unlikely
                                  Iris              Forging difficult
                                  Voice print       Cannot be lost or stolen



Fingerprint authentication

We present here a brief introduction to fingerprint authentication. Readers familiar with fingerprint authentication may skip to the next section.

Fingerprints are a distinctive feature and remain invariant over the lifetime of a subject, except for cuts and bruises. As the first step in the authentication process, a fingerprint impression is acquired, typically using an inkless scanner. Several such scanning technologies are available.2 Figure 1A shows a fingerprint obtained with a scanner using an optical sensor. A typical scanner digitizes the fingerprint impression at 500 dots per inch (dpi) with 256 gray levels per pixel. The digital image of the fingerprint includes several unique features in terms of ridge bifurcations and ridge endings, collectively referred to as minutiae.

The next step is to locate these features in the fingerprint image, as shown in Figure 1B, using an automatic feature extraction algorithm. Each feature is commonly represented by its location (x, y) and the ridge direction at that location (θ). However, due to sensor noise and other variability in the imaging process, the feature extraction stage may miss some minutiae and may generate spurious minutiae. Further, due to the elasticity of the human skin, the relationship between minutiae may be randomly distorted from one impression to the next.

In the final stage, the matcher subsystem attempts to arrive at a degree of similarity between the two sets of features after compensating for the rotation, translation, and scale. This similarity is often expressed as a score. Based on this score, a final decision of match or no-match is made. A decision threshold is first selected. If the score is below the threshold, the fingerprints are determined not to match; if the score is above the threshold, a correct match is declared. Often the score is simply a count of the number of the minutiae that are in correspondence. In a number of countries, 12 to 16 correspondences (performed by a human expert) are considered legally binding evidence of identity.
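The decision logic just described can be illustrated with a minimal sketch. It assumes minutiae are given as (x, y, θ) tuples that have already been aligned; a real matcher must first compensate for rotation, translation, and scale, and the tolerances and helper names below are illustrative assumptions, not the authors' implementation.

```python
from math import hypot

def count_correspondences(query, reference, dist_tol=15.0, angle_tol=22.5):
    # Score: number of reference minutiae paired with a query minutia that
    # lies within a distance and ridge-angle tolerance (greedy pairing).
    matched, score = set(), 0
    for (qx, qy, qa) in query:
        for j, (rx, ry, ra) in enumerate(reference):
            if j in matched:
                continue
            diff = abs(qa - ra) % 360
            if hypot(qx - rx, qy - ry) <= dist_tol and min(diff, 360 - diff) <= angle_tol:
                matched.add(j)
                score += 1
                break
    return score

def is_match(query, reference, threshold=12):
    # Declare a match when the correspondence count reaches the decision threshold.
    return count_correspondences(query, reference) >= threshold
```

Raising the threshold makes false accepts rarer but false rejects more common, which is the trade-off discussed next.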

Figure 1 Fingerprint recognition; (A) input image, (B) features

The operational issues in an automated fingerprint identification system (AFIS) are somewhat different from those in a more traditional password-based system. First, there is a system performance issue known as the "fail to enroll" rate to be considered. Some people have very faint fingerprints, or no fingers at all, which makes the system unusable for them. A related issue is a "Reject" option in the system based on input image quality. A poor quality input is not accepted by the system during enrollment and authentication. Note that poor quality inputs can be caused by noncooperative users, improper usage, dirt on the finger, or bad input scanners. This has no analog in a password system. Then there is the fact that in a biometric system the matching decision is not clear-cut. A password system always provides a correct response: if the passwords match, it grants access, but otherwise it refuses access. However, in a biometric system, the overall accuracy depends on the quality of input and enrollment data along with the basic characteristics of the underlying feature extraction and matching algorithm.

For fingerprints, and biometrics in general, there are two basic types of recognition errors, namely the false accept rate (FAR) and the false reject rate (FRR). If a nonmatching pair of fingerprints is accepted as a match, it is called a false accept. On the other hand, if a matching pair of fingerprints is rejected by the system, it is called a false reject. The error rates are a function of the threshold as shown in Figure 2. Often the interplay between the two errors is presented by plotting FAR against FRR with the decision threshold as the free variable. This plot is called the ROC (Receiver Operating Characteristic) curve. The two errors are complementary in the sense that if one makes an effort to lower one of the errors by varying the threshold, the other error rate automatically increases.

Figure 2 Error trade-off in a biometric system (genuine and imposter match-score distributions; the decision threshold determines the false reject and false accept regions)

In a biometric authentication system, the relative false accept and false reject rates can be set by choosing a particular operating point (i.e., a detection threshold). Very low (close to zero) error rates for both errors (FAR and FRR) at the same time are not possible. By setting a high threshold, the FAR error can be close to zero, and similarly by setting a significantly low threshold, the FRR rate can be close to zero. A meaningful operating point for the threshold is decided based on the application requirements, and the FAR versus FRR error rates at that operating point may be quite different. To provide high security, biometric systems operate at a low FAR instead of the commonly recommended equal error rate (EER) operating point where FAR = FRR. High-performance fingerprint recognition systems can support error rates in the range of 10^-6 for false accept and 10^-4 for false reject.3 The performance numbers reported by vendors are based on test results using private databases and, in general, tend to be much better than what can be achieved in practice. Nevertheless, the probability that the fingerprint signal is supplied by the right person, given a good matching score, is quite high. This confidence level generally provides better nonrepudiation support than passwords.
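As a concrete illustration of how an operating point is chosen, the following sketch estimates FAR and FRR empirically from sets of genuine and imposter match scores; the function names and data are hypothetical, not measurements from the paper.

```python
def error_rates(genuine_scores, imposter_scores, threshold):
    # False accept: an imposter score at or above the threshold.
    far = sum(s >= threshold for s in imposter_scores) / len(imposter_scores)
    # False reject: a genuine score below the threshold.
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

def roc_points(genuine_scores, imposter_scores, thresholds):
    # Sweeping the threshold traces out the ROC curve; raising the threshold
    # lowers FAR at the expense of FRR, and vice versa.
    return [(t, *error_rates(genuine_scores, imposter_scores, t)) for t in thresholds]
```

A high-security deployment would pick the smallest threshold whose estimated FAR meets the requirement and then report the FRR observed at that operating point.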

Vulnerable points of a biometric system

A generic biometric system can be cast in the framework of a pattern recognition system. The stages of such a generic system are shown in Figure 3. Excellent introductions to automated biometric systems can be found in References 1 and 4.

The first stage involves biometric signal acquisition from the user (e.g., the inkless fingerprint scan). The acquired signal typically varies significantly from presentation to presentation; hence, pure pixel-based matching techniques do not work reliably. For this reason, the second signal processing stage attempts to construct a more invariant representation of this basic input signal (e.g., in terms of fingerprint minutiae). The invariant representation is often a spatial domain characteristic or a transform (frequency) domain characteristic, depending on the particular biometric.

During enrollment of a subject in a biometric authentication system, an invariant template is stored in a database that represents the particular individual. To authenticate the user against a given ID, the corresponding template is retrieved from the database and matched against the template derived from a newly acquired input signal. The matcher arrives at a decision based on the closeness of these two templates while taking into account geometry, lighting, and other signal acquisition variables.

Note that password-based authentication systems can also be set in this framework. The keyboard becomes the input device. The password encryptor can be viewed as the feature extractor and the comparator as the matcher. The template database is equivalent to the encrypted password database.

We identified eight places in the generic biometric system of Figure 3 where attacks may occur. In addition, Schneier5 describes several types of abuses of biometrics. The numbers in Figure 3 correspond to the items in the following list.

Figure 3 Possible attack points in a generic biometrics-based system (sensor, feature extraction, matcher, and stored templates, with attack points 1 through 8 marked and a final yes/no decision output)

1. Presenting fake biometrics at the sensor: In this mode of attack, a possible reproduction of the biometric feature is presented as input to the system. Examples include a fake finger, a copy of a signature, or a face mask.

2. Resubmitting previously stored digitized biometrics signals: In this mode of attack, a recorded signal is replayed to the system, bypassing the sensor. Examples include the presentation of an old copy of a fingerprint image or the presentation of a previously recorded audio signal.

3. Overriding the feature extraction process: The feature extractor is attacked using a Trojan horse, so that it produces feature sets preselected by the intruder.

4. Tampering with the biometric feature representation: The features extracted from the input signal are replaced with a different, fraudulent feature set (assuming the representation method is known). Often the two stages of feature extraction and matcher are inseparable and this mode of attack is extremely difficult. However, if minutiae are transmitted to a remote matcher (say, over the Internet) this threat is very real. One could "snoop" on the TCP/IP (Transmission Control Protocol/Internet Protocol) stack and alter certain packets.

5. Corrupting the matcher: The matcher is attacked and corrupted so that it produces preselected match scores.

6. Tampering with stored templates: The database of stored templates could be either local or remote. The data might be distributed over several servers. Here the attacker could try to modify one or more templates in the database, which could result either in authorizing a fraudulent individual or denying service to the persons associated with the corrupted template. A smartcard-based authentication system,6 where the template is stored in the smartcard and presented to the authentication system, is particularly vulnerable to this type of attack.

7. Attacking the channel between the stored templates and the matcher: The stored templates are sent to the matcher through a communication channel. The data traveling through this channel could be intercepted and modified.

8. Overriding the final decision: If the final match decision can be overridden by the hacker, then the authentication system has been disabled. Even if the actual pattern recognition framework has excellent performance characteristics, it has been rendered useless by the simple exercise of overriding the match result.

There exist several security techniques to thwart attacks at these various points. For instance, finger conductivity or fingerprint pulse at the sensor can stop simple attacks at point 1. Encrypted communication channels7 can eliminate at least remote attacks at point 4. However, even if the hacker cannot penetrate the feature extraction module, the system is still vulnerable. The simplest way to stop attacks at points 5, 6, and 7 is to have the matcher and the database reside at a secure location. Of course, even this cannot prevent attacks in which there is collusion. Use of cryptography8 prevents attacks at point 8.

We observe that the threats outlined in Figure 3 are quite similar to the threats to password-based authentication systems. For instance, all the channel attacks are similar. One difference is that there is no "fake password" equivalent to the fake biometric attack at point 1 (although, perhaps, if the password was in some standard dictionary it could be deemed "fake"). Furthermore, in a password- or token-based authentication system, no attempt is made to thwart replay attacks (since there is no expected variation of the "signal" from one presentation to another). However, in an automated biometric-based authentication system, one can check the liveness of the entity originating the input signal.

Brute force attack directed at matching fingerprint minutiae

In this section we attempt to analyze the probability that a brute force attack at point 4 of Figure 3, involving a set of fraudulent fingerprint minutiae, will succeed in matching a given stored template. Figure 4 shows one such randomly generated minutiae set. In a smart card system where the biometrics template is stored in the card and presented to the authentication system, a hacker could present these random sets to the authentication system, assuming that the hacker has no information about the stored templates. Note that an attack at point 2 of Figure 3, which involves generating all possible fingerprint images in order to match a valid fingerprint image, would have an even larger search space and consequently would be much more difficult.

A naive model. For the purpose of analyzing the "naive" matching minutiae attack, we assume the following.

● The system uses a minutia-based matching method and the number of paired minutiae reflects the degree of match.
● The image size S = 300 pixels × 300 pixels.
● A ridge plus valley spread T = 15 pixels.
● The total number of possible minutiae sites K = S/T² = 20 × 20 = 400.
● The number of orientations allowed for the ridge angle at a minutia point d = 4, 8, 16.
● The minimum number of corresponding minutiae in query and reference template m = 10, 12, 14, 16, 18.

These values are based on a standard fingerprint scanner with 500 dpi scanning resolution covering an area of 0.6 × 0.6 inches. A ridge and valley can span about 15 pixels on average at this scanning resolution. The other two variables, d and m, are used as parameters to study the brute force attack. We start with 10 matching minutiae since often a threshold of 12 minutiae is used in matching fingerprints in manual systems. Ridge angles in an automated system can be quantized depending on the tolerance supported in the matcher. A minimum of four quantization levels provides a 45 degree tolerance, while 16 levels provides roughly an 11 degree tolerance.

Figure 4 Example of a randomly generated minutiae set

Then, the number of possible ways to place m minutiae in K possible locations is

$\binom{K}{m}$   (1)

and the number of possible ways to assign directions to the minutiae is $d^m$.

Hence, the total number of possible minutiae combinations equals

$\binom{K}{m} \times d^m$   (2)

Note that it is assumed that the matcher will tolerate shifts between query and reference minutiae of at most a ridge and valley pixel width, and an angular difference of up to half a quantization bin (±45 degrees for d = 4).

Plugging these values into Equation 2, for d = 4 and m = 10, the probability of randomly guessing the exact feature set is 3.6 × 10^-26 = 2^-84.5. The log2 of the probability of randomly guessing a correct feature set through a brute force attack for different values of d and m is plotted in Figure 5. We refer to this measure (in bits) as "strength," and it represents the equivalent number of bits in a password authentication system. This should convince the reader that a brute force attack in the form of a random image or a random template attempting to impersonate an authorized individual will, on average, require a very large number of attempts before succeeding.

The foregoing analysis assumes that each fingerprint has exactly m minutiae, that only m minutiae are generated, and that all of these minutiae have to match. A realistic strength is much lower because one can generate more than m query minutiae, say $N_{\text{total}}$, and only some fraction of these must match m minutiae of the reference fingerprint. This leads to a factor of about $\binom{N_{\text{total}}}{m}^2$, or a loss of nearly 64 bits in strength for m = 10 with $N_{\text{total}}$ = 50. The equivalent strength thus is closer to 20 bits for this parameter set. A more realistic model, which carefully incorporates this effect, is described below.
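The naive strength figure quoted above can be reproduced with a few lines of code. This is only a sketch of Equation 2 under the assumptions listed earlier; the function name and defaults are ours.

```python
from math import comb, log2

def naive_strength_bits(K=400, d=4, m=10):
    # Equation 2: choose m of the K minutiae sites and assign each one of
    # d ridge directions; strength is log2 of the number of combinations.
    return log2(comb(K, m) * d ** m)

print(naive_strength_bits())  # ~84.5 bits, i.e., a guessing probability near 3.6e-26
```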

A more realistic model. In the naive approach, we made several simplifying assumptions. In this more realistic model, we will make assumptions that are more realistic and will analyze the brute force attack model in more detail.

Let the reference print have $N_r$ minutiae, and let each feature of the minutiae include a ridge direction, which takes d possible values, and a location, which takes K possible values. Then the probability that a randomly generated minutia will match one of the minutiae in the reference print in both location and direction can be approximated as:

$p_{\text{est}} = \dfrac{N_r}{K \times d}$   (3)

Figure 5 Bit strength in the naive model (strength in bits plotted against m for d = 4, 8, and 16)

A more accurate model would require that we consider the probability of a minutiae site being populated as a function of the distance to the center of the print (they are more likely in the middle). In addition, such a model would require that the directional proclivities depend on location (they tend to swirl around the core). In this model, however, we ignore such dependencies and use the simpler formulation.

While the expression above is valid for the first generated minutia, when creating the full synthetic set it is undesirable to generate two minutiae with the same location. So after j − 1 minutiae have been generated, the probability that the jth minutia will match (assuming the previous j − 1 minutiae all fail to match) is bounded from above by:

$\dfrac{N_r}{(K - j + 1)d}$   (4)

Thus, while generating $N_q$ random minutiae we can conservatively assume each minutia has matching probability:

$p = p_{\text{hi}} = \dfrac{N_r}{(K - N_q + 1)d}$   (5)

Typical parameter values are K = 400, $N_q$ = $N_r$ = 50, and d = 4. Note that brute force attacks with $N_q$ excessively large (close to the value K) would be easy to detect and reject out of hand. For this reason there is an upper bound on $N_q$ that still enables an attacker to generate the facsimile of a real finger. Using the values above we find $p_{\text{est}}$ = 0.03125 while $p_{\text{hi}}$ = 0.03561 (14 percent higher). This is a relatively small effect in itself, but important in the overall calculation.

Therefore, the probability of getting exactly t of $N_q$ generated minutiae to match is about:

$P_{\text{thresh}} = p^t (1 - p)^{N_q - t}$   (6)

This derivation breaks down for small K because the minutiae matching probability changes depending on how many other minutiae have already been generated as well as on how many of those minutiae have matched. However, for the large values of K typically encountered (e.g., 400) it is reasonably close.

Now there are a number of ways of selecting which t out of the $N_r$ minutiae in the reference print are the ones that match. Thus, the total match probability becomes:

$P_{\text{exact}} = \binom{N_r}{t} p^t (1 - p)^{N_q - t}$   (7)

But matches of m or more minutiae typically count as a verification, so we get:

$P_{\text{ver}} = \sum_{t=m}^{N_q} \binom{N_r}{t} p^t (1 - p)^{N_q - t}$   (8)

For convenience, let us assume that $N_q = N_r = N$, so the above equation can be rewritten as:

$P_{\text{ver}} = \sum_{t=m}^{N} \binom{N}{t} p^t (1 - p)^{N - t}$   (9)

Since p is fairly small in our case, we can use the Poisson approximation to the above binomial probability density function:

$P_{\text{ver}} = \sum_{t=m}^{N} \dfrac{(Np)^t e^{-Np}}{t!}$   (10)

This summation is usually dominated by its first term (where t = m). For typical parameter values the second term is 10 to 20 times smaller than the first. Neglecting all but the first term may make the overall estimate approximately 20 percent lower, but for order-of-magnitude calculations this is fine. Thus, we rewrite the expression as simply:

$P_{\text{ver}} = \dfrac{(Np)^m e^{-Np}}{m!}$   (11)

Because m is moderately large, we can use Stirling's approximation for the factorial and further rewrite the equation as:

$P_{\text{ver}} = \dfrac{(Np)^m e^{-Np}}{\sqrt{2\pi m}\, e^{-m} m^m}$   (12)

and regrouping to emphasize the exponential dependency:

$P_{\text{ver}} = \dfrac{e^{-Np}}{\sqrt{2\pi m}} \left(\dfrac{eNp}{m}\right)^m$   (13)

The log2 of $P_{\text{ver}}$ (bit strength) is plotted in Figure 6 for N = 40, d = 4, K = 400, with m (the number of minutiae required to match) between 10 and 35. For a value of m = 10, we have about 22 bits of information (close to the prediction of the revised naive model). For the legal threshold of m = 15, we have around 40 bits of information (representing a number of distinct binary values equal to about 140 times the population of the earth). For a more typical value of m = 25, we have roughly 82 bits of information content in this representation. This is equivalent to a 16-character nonsense password (such as "m4yus78xpmks3bc9").

Figure 6 Bit strength in the more realistic model (strength in bits plotted against m for N = 40, d = 4, K = 400)
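These figures can be checked numerically with a short script. The sketch below evaluates the exact binomial sum of Equation 9 (rather than the Poisson and Stirling approximations) under the stated parameters; the helper name is ours.

```python
from math import comb, log2

def p_ver(N=40, K=400, d=4, m=10):
    # Conservative per-minutia match probability (Equation 5) with Nq = Nr = N,
    # then the probability that at least m of the N generated minutiae match
    # (Equation 9).
    p = N / ((K - N + 1) * d)
    return sum(comb(N, t) * p**t * (1 - p)**(N - t) for t in range(m, N + 1))

for m in (10, 15, 25):
    print(m, -log2(p_ver(m=m)))   # roughly 22, 40, and 82 bits, as quoted above
```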

Studies similar to ours have been reported in the literature, and these studies evaluate the individuality of a fingerprint based on the minutiae information.9,10 These analyses were based on the minutiae frequency data collected and interpreted by a human expert and involving a small set of fingers. Furthermore, these studies used all the ten types of Galton characteristics,11 whereas our study is based on just one type of feature (with no differentiation between ridge endings and bifurcations). The purpose of these studies was to quantify the information content of a fingerprint (similar to our naive method) rather than set thresholds for matching in the face of brute force attacks.

Examining the final equation (Equation 13), we make two important observations. First, in both the naive and the more realistic model, it can be seen that adding extra feature information at every minutia (e.g., raising d) increases significantly the strength of the system. Similarly, if the spatial domain extent is increased or the number of minutiae sites K is increased, the strength also increases. Both these factors directly affect p, the single minutia matching probability, which shows up inside the exponential term of $P_{\text{ver}}$. Second, there is a strong dependence on N, the overall number of minutiae in a fingerprint. For high security, this number needs to be kept as low as possible. This is one reason why the probability of break-ins is much smaller when good quality fingerprint images are enrolled as opposed to using poor quality images with many spurious minutiae (yielding a higher overall N). Often practical systems reject a bad quality fingerprint image for this reason instead of taking a hit on the accuracy of the system.

It should be pointed out that the brute force attack break-in probability is not dependent in any way on the FAR. That is, if the FAR is 10^-6, this does not mean that, on average, the system is broken into after 500,000 trials. The FAR is estimated using actual human fingers and is typically attributable to errors in feature extraction (extra or missing features) and, to a lesser extent, to changes in geometry such as finger rolling or skin deformations due to twisting. The statistics governing the occurrence of these types of errors are different from those describing a brute force attack.

WSQ-based data hiding

In both Web-based and other on-line transaction processing systems, it is undesirable to send uncompressed fingerprint images to the server due to bandwidth limitations. A typical fingerprint image is of the order of 512 × 512 pixels with 256 gray levels, resulting in a file size of 256 Kbytes. This would take nearly 40 seconds to transmit at 53 Kbaud. Unfortunately, many standard compression methods, such as JPEG (Joint Photographic Experts Group), have a tendency to distort the high-frequency spatial and structural ridge features of a fingerprint image. This has led to several research proposals regarding domain-specific compression methods. As a result, an open Wavelet Scalar Quantization (WSQ) image compression scheme proposed by the FBI12 has become the de facto standard in the industry, because of its low image distortion even at high compression ratios (over 10:1).

Typically, the compressed image is transmitted over a standard encrypted channel as a replacement for (or in addition to) the user's PIN. Yet, because of the open compression standard, transmitting a WSQ-compressed image over the Internet is not particularly secure. If a compressed fingerprint image bitstream can be freely intercepted (and decrypted), it can be decompressed using readily available software. This potentially allows the signal to be saved and fraudulently reused (attack point 2 in Figure 3).

One way to enhance security is to use data-hiding techniques to embed additional information directly in compressed fingerprint images. For instance, if the embedding algorithm remains unknown, the service provider can look for the appropriate standard watermark to check that a submitted image was indeed generated by a trusted machine (or sensor). Several techniques have been proposed in the literature for hiding digital watermarks in images.13,14

Bender et al.15 and Swanson et al.16 present excellent surveys of data-hiding techniques. Petitcolas et al.14 provide a nice survey and taxonomy of information-hiding techniques. Hsu and Wu17 describe a method for hiding watermarks in JPEG compressed images. Most of the research, however, addresses issues involved in resolving piracy or copyright issues, not authentication. An exception is the invisible watermarking technique for fingerprints proposed by Yeung and Pankanti.18 Their study involves examining the accuracy after an invisible watermark is inserted in the image domain. Our proposed solution is different because, first, it operates directly in the compressed domain and, second, it causes no performance degradation.

The approach is motivated by the desire to create on-line fingerprint authentication systems for commercial transactions that are secure against replay attacks. To achieve this, the service provider issues a different verification string for each transaction. The string is mixed in with the fingerprint image before transmission. When the image is received by the service provider it is decompressed and the image is checked for the presence of the correct one-time verification string. The method we propose here hides such messages with minimal impact on the appearance of the decompressed image. Moreover, the message is not hidden in a fixed location (which would make it more vulnerable to discovery) but is, instead, deposited in different places based on the structure of the image itself. Although our approach is presented in the framework of fingerprint image compression, it can be easily extended to other biometrics such as wavelet-based compression of facial images.

Our information hiding scheme works in conjunction with the WSQ fingerprint image encoder and decoder, which are shown in Figures 7A and 7B, respectively. In the first step of the WSQ compression, the input image is decomposed into 64 spatial frequency subbands using perfect reconstruction multirate filter banks19 based on discrete wavelet transformation filters. The filters are implemented as a pair of separable 1D filters. The two filters specified for encoder 1 of the FBI standard are plotted in Figures 7C and 7D. The subbands are the filter outputs obtained after a desired level of cascading of the filters as described in the standard. For example, subband 25 corresponds to the cascading path of "00, 10, 00, 11" through the filter bank. The first digit in each binary pair represents the row operation index. A zero specifies low pass filtering using h0 on the row (column) while a one specifies high pass filtering using h1 on the row (column). Thus for the 25th subband, the image is first low pass filtered in both row and column; followed by high pass filtering in rows, then low pass filtering in columns; the output of which is then low pass filtered in rows and columns; and ending with high pass filtering in rows and columns. Note that there is appropriate down sampling and the symmetric extension transform is applied at every stage as specified in the standard. The 64 subbands of the gray-scale fingerprint image shown in Figure 8A are shown in Figure 8C.

There are two more stages to WSQ compression. The second stage is a quantization process where the Discrete Wavelet Transform (DWT) coefficients are transformed into integers with a small number of discrete values. This is accomplished by uniform scalar quantization for each subband. There are two characteristics for each band: the zero of the band (Zk) and the width of the bins (Qk). These parameters must be chosen carefully to achieve a good compression ratio without introducing significant information loss that will result in distortions of the images. The Zk and Qk for each band are transmitted directly to the decoder. The third and final stage is Huffman coding of the integer indices for the DWT coefficients. For this purpose, the bands are grouped into three blocks. In each block, the integer coefficients are remapped to numbers between 0 and 255 prescribed by the translation table described in the standard. This translation table encodes run lengths of zeros and large values. Negative coefficients are translated in a similar way by this table.
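To make the role of Zk and Qk concrete, the sketch below implements a generic deadzone uniform scalar quantizer of this kind: a widened zero bin of width Zk and uniform bins of width Qk elsewhere. It is an illustrative assumption about the shape of the quantizer, not the exact rounding rule or entropy coding of the WSQ specification.

```python
def quantize(coeff, Qk, Zk):
    # Coefficients inside the central "deadzone" of width Zk map to index 0;
    # outside it, uniform bins of width Qk determine the signed integer index.
    if abs(coeff) <= Zk / 2:
        return 0
    index = int((abs(coeff) - Zk / 2) / Qk) + 1
    return index if coeff > 0 else -index

def dequantize(index, Qk, Zk):
    # Reconstruct at the middle of the selected bin.
    if index == 0:
        return 0.0
    value = Zk / 2 + (abs(index) - 0.5) * Qk
    return value if index > 0 else -value
```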

Our data-hiding algorithm works on the quantized indices before this final translation (i.e., between stages 2 and 3). We assume the message size is very small compared to the image size (or, equivalently, the number of DWT coefficients). Note, however, that the Huffman coding characteristics and tables are not changed; the tables are computed as for the original coefficients, not after the coefficient-altering steps described next.

Figure 7A, B WSQ algorithm; (A) compression: discrete wavelet transform (wavelet filter coefficients), scalar quantizer (quantization table), and Huffman encoder (tables); (B) decompression: Huffman decoder (tables), scalar dequantizer (quantization table), and inverse discrete wavelet transform (IDWT filter coefficients)

Figure 7C Analysis filter h0

Figure 7D Analysis filter h1


As mentioned, our method is intended for messages which are very small (in terms of bits) compared to the number of pixels in the image. The basic principle is to find and slightly alter certain of the DWT coefficients. However, care must be taken to avoid corrupting the reconstructed image. To hide a message during the image encoding process, we perform three (or, optionally, four) basic steps:

● Selecting a set of sites S: Given the partially converted quantized integer indices, this stage collects the indices of all possible coefficient sites where a change in the least significant bit is tolerable. Typically, all sites in the low frequency bands are excluded. Even small changes in these coefficients can affect large regions of the image because of the low frequencies. For the higher frequencies, candidate sites are selected if they have coefficients of large magnitude. Making small changes to the larger coefficients leads to relatively small percentage changes in the values and hence minimal degradation of the image. Note that among the quantizer indices there are special codes to represent run lengths of zeros, large integer values, and other control sequences. All coefficient sites incorporated into these values are avoided. In our implementation, we only select sites with translated indices ranging from 107 to 254, but excluding 180 (an invalid code).

● Generating a seed for random number generation and then choosing sites for modification: Sites from the candidate set S that are modified are selected in a pseudorandom fashion. To ensure that the encoder actions are invertible in the decoder, the seed for the random number generator is based on the subbands that are not considered for alteration. For example, in the selection process the contents of subbands 0–6 are left unchanged in order to minimize distortion. Typically, fixed sites within these bands are selected, although in principle any statistic from these bands may be computed and used as the seed. Selecting the seed in this way ensures that the message is embedded at varying locations (based on the image content). It further ensures that the embedded message can only be read if the proper seed selection algorithm is known by the decoder.

● Hiding the message at selected sites by bit setting: The message to be hidden is translated into a sequence of bits. Each bit will be incorporated into a site chosen pseudorandomly by a random number generator seeded as described above. That is, for each bit a site is selected from the set S based on the next output of the seeded pseudorandom number generator. If the selected site has already been used, the next randomly generated site is chosen instead. The low order bit of the value at the selected site is changed to be identical to the current message bit. On average, half the time this results in no change at all of the coefficient value.

● Appending the bits to the coded image: Optionally, all the original low order bits can be saved and appended to the compressed bit stream as a user comment field (an appendix). The appended bits are a product of randomly selected low-order coefficient bits and hence these bits are uncorrelated with the hidden message.

The steps performed by the decoder correspond to the encoder steps above. The first two steps are identical to the first steps of the encoder. These steps construct the same set S and compute the same seed for the random number generator. The third step uses the pseudorandom number generator to select specific sites in S in the prescribed order. The least significant bits of the values at these sites are extracted and concatenated to recover the original message.
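The site selection, seeding, and bit-setting steps can be summarized in a compact sketch. This is a simplified illustration on a flat list of translated quantizer indices, not an actual WSQ codec: the helper names are ours, the seed is a stand-in for a statistic over the untouched subbands, and candidacy is judged on the index with its low-order bit masked off so that the encoder and decoder derive identical site lists even after embedding (a simplification of the 107 to 254, not-180 rule quoted above).

```python
import random

def candidate_sites(indices):
    # Candidate sites: large-magnitude translated indices where flipping the
    # least significant bit is tolerable (LSB masked off so the test is
    # unaffected by embedding).
    return [i for i, v in enumerate(indices)
            if 108 <= (v & ~1) <= 252 and (v & ~1) != 180]

def embed(indices, message_bits, seed):
    rng = random.Random(seed)                 # seed derived from untouched subbands
    sites = rng.sample(candidate_sites(indices), len(message_bits))
    out = list(indices)
    for site, bit in zip(sites, message_bits):
        out[site] = (out[site] & ~1) | bit    # set the low-order bit to the message bit
    return out

def extract(indices, n_bits, seed):
    rng = random.Random(seed)                 # the decoder recomputes the same seed
    sites = rng.sample(candidate_sites(indices), n_bits)
    return [indices[site] & 1 for site in sites]
```

Because the decoder rebuilds the same candidate set and reseeds the same generator, it visits the sites in the same order and reads back the embedded bits.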

Figure 8C 64 subbands of the image in Figure 8A

RATHA, CONNELL, AND BOLLE IBM SYSTEMS JOURNAL, VOL 40, NO 3, 2001626

Page 14: Enhancing security and privacy in biometrics-based ...govind/CSE717/papers/... · tions, “WSQ-based data hiding” and “Image-based challenge/response method,” propose two methods

If the appendix restoration is to be included, the decoder can optionally restore the original low-order bits while reconstructing the message. This allows perfect reconstruction of the image (up to the original compression) despite the embedded message. Because the modification sites S are carefully selected, the decompressed image even with the message still embedded will be nearly the same as the restored decompressed image. In practice, the error due to the embedded message is not perceptually significant and does not affect subsequent processing and authentication. Figures 8A and 8B show the original and the reconstructed images, respectively.

Figure 8A WSQ data-hiding results; (A) original image

Figure 8B WSQ data-hiding results; (B) reconstructed image

Using this process, only a specialized decoder can locate and extract the message from the compressed image during the decoding process. This message might be a fixed authentication stamp, personal ID information which must match some other part of the record (which might have been sent in the clear), or some time stamp. Thus, if the bit stream does not contain an embedded message or the bit stream is improperly coded, the specialized decoder will fail to extract the expected message and will thus reject the image. If instead an unencoded WSQ-compressed fingerprint image is submitted to the special decoder, it will still extract a garbage message, which can be rejected by the server.

Many implementations of the same algorithm are possible by using different random number generators or partial seeds. This means it is possible to make every implementation unique without much effort; the output of one encoder need not be compatible with another version of the decoder. This has the advantage that cracking one version will not compromise any other version.

This method can also be extended to other biometric signals using a wavelet compression scheme, such as facial images or speech. While the filters and the quantizer in the WSQ standard have been designed to suit the characteristics of fingerprint images, wavelet-based compression schemes for other signals are also available.20 It is relatively straightforward to design techniques similar to ours for such schemes.

Image-based challenge/response method

Besides interception of network traffic, more insidious attacks might be perpetrated against an automated biometric authentication system. One of these is a replay attack on the signal from the sensor (attack point 2 in Figure 3). We propose a new method to thwart such attempts based on a modified challenge/response system. Conventional challenge/response systems are based either on challenges to the user, such as requesting the user to supply the mother's maiden name, or challenges to a physical device, such as a special-purpose calculator that computes a numerical response. Our approach is based on a challenge to the sensor. The sensor is assumed to have enough intelligence to respond to the challenge. Silicon fingerprint scanners21 can be designed to exploit the proposed method using an embedded processor.

Note that standard cryptographic techniques are not a suitable substitute. While these are mathematically strong, they are also computationally intensive and could require maintaining secret keys for a large number of sensors. Moreover, the encryption techniques cannot check for liveness of a signal. A stored image could be fed to the encryptor, which will happily encrypt it. Similarly, the digital signature of a submitted signal can be used to check only for its integrity, not its liveness.

Our system computes a response string, which depends not only on the challenge string, but also on the content of the returned image. The changing challenges ensure that the image was acquired after the challenge was issued. The dependence on image pixel values guards against substitution of data after the response has been generated.

The proposed solution works as shown in Figure 9. A transaction is initiated at the user terminal or system. First, the server generates a pseudorandom challenge for the transaction and the sensor. Note that we assume that the transaction server itself is secure. The client system then passes the challenge on to the intelligent sensor. Now, the sensor acquires a new signal and computes the response to the challenge that is based in part on the newly acquired signal. Because the response processor is tightly integrated with the sensor (preferably on the same chip), the signal channel into the response processor is assumed ironclad and inviolable. It is difficult to intercept the true image and to inject a fake image under such circumstances.

As an example of an image-based response, consider the function "x+1", which operates by appending pixel values of the image (in scan order) to the end of the challenge string. A typical challenge might be "3, 10, 50." In response to this, the integrated processor then selects the 3rd, 10th, and 50th pixel value from this sequence to generate an output response such as "133, 92, 176." The complete image as well as the response is then transmitted to the server, where the response can be verified and checked against the image.

Other examples of responder functions include computing a checksum of a segment of the signal, a set of pseudorandom samples, a block of contiguous samples starting at a specified location and with a given size, a hash of signal values, and a specified known function of selected samples of the signal. A combination of these functions can be used to achieve arbitrarily complex responder functions. The important point is that the response depends on the challenge and the image itself.
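A minimal sketch of the pixel-appending responder described above, together with the server-side check, is given below. The function names are hypothetical, and a deployed sensor would keep its (possibly keyed) response function secret.

```python
def respond(challenge, pixels):
    # Append the requested pixel values (1-based positions, in scan order) to
    # the challenge, so the response depends on both the challenge and the image.
    return list(challenge) + [pixels[i - 1] for i in challenge]

def server_verify(challenge, received_pixels, response):
    # The server recomputes the expected response from the transmitted image
    # and compares it with what the sensor returned.
    return response == respond(challenge, received_pixels)

# Example: challenge (3, 10, 50) selects the 3rd, 10th, and 50th pixel values.
image = list(range(256)) * 4                  # stand-in for a flattened scan
reply = respond((3, 10, 50), image)
print(reply, server_verify((3, 10, 50), image, reply))
```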

The responder can also incorporate several different response functions, which the challenger could select among. For instance, the integrated processor might be able to compute either of two selectable functions, "x+1" and "x10+1". The function "x10+1" is similar to "x+1" except that it multiplies the requested pixel values by 10 before appending them. Financial institution A might use function "x+1" in all its units, while institution B might use "x10+1" in all of its units. Alternatively, for even-numbered transactions, function "x10+1" might be used, and for odd-numbered transactions "x+1" might be used. This variability makes it even harder to reconstruct the structure and parameters of the response function. Large numbers of such response functions are possible because we have a large number of pixels and many simple functions can be applied to these pixels.

Cancelable biometrics

Deploying biometrics in a mass market, like credit card authorization or bank ATM access, raises additional concerns beyond the security of the transactions. One such concern is the public's perception of a possible invasion of privacy. In addition to personal information such as name and date of birth, the user is asked to surrender images of body parts, such as fingers, face, and iris. These images, or other such biometric signals, are stored in digital form in various databases. This raises the concern of possible sharing of data among law enforcement agencies, or commercial enterprises.

The public is concerned about the ever-growing body of information that is being collected about individuals in our society. The data collected encompass many applications and include medical records and biometric data. A related concern is the coordination and sharing of data from various databases. In relation to biometric data, the public is, rightfully or not, worried about data collected by private companies being matched against databases used by law enforcement agencies. Fingerprint images, for example, can be matched against the FBI or INS (Immigration and Naturalization Service) databases with ominous consequences.

Figure 9 Signal authentication based on challenge/response (the client passes the server's challenge over the communication network to a combined sensor-processor; the processor computes the response from the newly acquired image, whose path into the processor is difficult to intercept, and the image and response are returned to the server)


These concerns are aggravated by the fact that a person's biometric data are given and cannot be changed. One of the properties that makes biometrics so attractive for authentication purposes, namely their invariance over time, is also one of its liabilities. When a credit card number is compromised, the issuing bank can just assign the customer a new credit card number. When the biometric data are compromised, replacement is not possible.

In order to alleviate this problem, we introduce the concept of "cancelable biometrics." It consists of an intentional, repeatable distortion of a biometric signal based on a chosen transform. The biometric signal is distorted in the same fashion at each presentation, for enrollment and for every authentication. With this approach, every instance of enrollment can use a different transform, thus rendering cross-matching impossible. Furthermore, if one variant of the transformed biometric data is compromised, then the transform function can simply be changed to create a new variant (transformed representation) for re-enrollment as, essentially, a new person. In general, the distortion transforms are selected to be noninvertible. So even if the transform function is known and the resulting transformed biometric data are known, the original (undistorted) biometrics cannot be recovered.
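The enrollment/cancellation workflow this implies can be sketched as follows; the per-(user, service) transform registry, the random parameter generation, and every name here are assumptions made purely for illustration.

```python
import secrets

transform_registry = {}   # (user_id, service_id) -> transform parameters
template_store = {}       # (user_id, service_id) -> distorted template

def enroll(user_id, service_id, biometric, distort):
    # Each enrollment gets its own transform parameters, so templates held
    # by different services cannot be cross-matched.
    params = secrets.token_bytes(16)
    transform_registry[(user_id, service_id)] = params
    template_store[(user_id, service_id)] = distort(biometric, params)

def authenticate(user_id, service_id, biometric, distort, matcher):
    # The live sample is distorted with the *same* parameters used at
    # enrollment and matched against the stored distorted template.
    params = transform_registry[(user_id, service_id)]
    return matcher(distort(biometric, params), template_store[(user_id, service_id)])

def cancel(user_id, service_id, biometric, distort):
    # Cancellation: discard the compromised variant and re-enroll with
    # freshly chosen transform parameters.
    enroll(user_id, service_id, biometric, distort)
```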

Example distortion transforms. In the proposed method, distortion transforms can be applied in either the signal domain or the feature domain. That is, either the biometric signal can be transformed directly after acquisition, or the signal can be processed as usual and the extracted features can then be transformed. Moreover, extending a template to a larger representation space via a suitable transform can further increase the bit strength of the system. Ideally, the transform should be noninvertible so that the true biometric of a user cannot be recovered from one or more of the distorted versions stored by various agencies.

Examples of transforms at the signal level include grid morphing and block permutation. The transformed images cannot be successfully matched against the original images, or against similar transforms of the same image using different parameters. While a deformable template method might be able to find such a match, the residual strain energy is likely to be as high as that of matching the template to an unrelated image. In Figure 10, the original image is shown with an overlaid grid aligned with the features of the face. In the adjacent image, we show the morphed grid and the resulting distortion of the face. In Figure 11, a block structure is imposed on the image aligned with characteristic points. The blocks in the original image are subsequently scrambled randomly but repeatably. Further examples of image morphing algorithms are described in References 22 and 23.

Figure 10  Distortion transform based on image morphing (figure placeholder: a face image with an overlaid grid, and the same image after the grid is morphed)
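As an illustration of the block-permutation idea, the following sketch (block size, the seeded NumPy permutation, and the function name are assumptions) scrambles an image repeatably, so enrollment and authentication images are distorted identically while a new seed yields an unrelated variant.

```python
import numpy as np

def scramble_blocks(image, block, seed):
    """Repeatably permute non-overlapping block x block tiles of a 2-D image.

    The same seed always produces the same permutation; choosing a new seed
    produces a different (cancelable) variant of the same image.
    """
    h, w = image.shape
    rows, cols = h // block, w // block
    perm = np.random.default_rng(seed).permutation(rows * cols)

    out = np.empty_like(image[: rows * block, : cols * block])
    for dst, src in enumerate(perm):
        sr, sc = divmod(int(src), cols)
        dr, dc = divmod(dst, cols)
        out[dr * block:(dr + 1) * block, dc * block:(dc + 1) * block] = \
            image[sr * block:(sr + 1) * block, sc * block:(sc + 1) * block]
    return out
```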

An example of a transform in the feature domain is a set of random, repeatable perturbations of feature points. This can be done within the same physical space as the original, or while increasing the range of the axes. The second case provides more brute-force strength, as was noted earlier (it effectively increases the value of K). An example of such a transform is shown in Figure 12. Here the blocks on the left are randomly mapped onto blocks on the right, where multiple blocks can be mapped onto the same block. Such transforms are noninvertible, hence the original feature sets cannot be recovered from the distorted versions. For instance, it is impossible to tell which of the two blocks the points in composite block B, D originally came from. Consequently, the owner of the biometrics cannot be identified except through the information associated with that particular enrollment.
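A sketch of the many-to-one block mapping of Figure 12 might look like this; the grid size, the seeded block map, and the representation of features as (x, y, θ) tuples are assumptions for illustration.

```python
import random

def perturb_features(minutiae, grid, extent, seed):
    """Map each (x, y, theta) feature into a randomly chosen destination block.

    The block map is many-to-one: several source blocks may land on the same
    destination block, so the original block of a point (and therefore the
    original feature set) cannot be recovered from the transformed set.
    """
    rng = random.Random(seed)
    # One destination block for every source block; collisions are allowed.
    block_map = [rng.randrange(grid * grid) for _ in range(grid * grid)]
    cell = extent / grid

    out = []
    for x, y, theta in minutiae:
        col = min(int(x // cell), grid - 1)
        row = min(int(y // cell), grid - 1)
        dst_row, dst_col = divmod(block_map[row * grid + col], grid)
        # The point keeps its offset within the block; only the block moves.
        out.append((dst_col * cell + x % cell, dst_row * cell + y % cell, theta))
    return out
```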

Note that for the transform to be repeatable, we need to have the biometric signal properly registered before the transformation. Fortunately, this problem has been partially answered by a number of techniques available in the literature (such as finding the "core" and "delta" points in a fingerprint, or eye and nose detection in a face).

Feature domain transforms. We present here an example of a noninvertible transform of a point pattern. Such a point pattern could, for example, be a fingerprint minutiae set

$S = \{(x_i, y_i, \theta_i),\; i = 1, \ldots, M\}$    (14)

However, this point set could also represent other biometrics, for example, the quantized frequencies and amplitudes of a speech pattern. A noninvertible transform maps this set S into a new set S′ in such a fashion that the original set S cannot be recovered from S′, i.e.,

$S = \{(x_i, y_i, \theta_i),\; i = 1, \ldots, M\} \rightarrow S' = \{(X_i, Y_i, \Theta_i),\; i = 1, \ldots, M\}$    (15)

Figure 11  Distortion transform based on block scrambling (figure placeholder: a numbered grid of image blocks and the same blocks after a random but repeatable rearrangement)

Figure 13 shows how the x coordinates of the point set S can be transformed through a mapping x → X, or X = F(x). This function of x can, for example, be a high-order polynomial

$X = F(x) = \sum_{n=0}^{N} a_n x^n = \prod_{n=0}^{N} (x - b_n)$    (16)

Each input x maps to a single output X, as is seen from Figure 13. However, the inverse mapping X → x is one-to-many. For instance, the output value X1 could be generated from three different input x's. Hence, this transform is noninvertible and the original features x cannot be recovered from the X values.

Similar polynomial noninvertible transforms

$Y = G(y) \quad \text{and} \quad \Theta = H(\theta)$    (17)

can be used for the other coordinates of the point set.
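A direct transcription of equations (16) and (17) could look like the following sketch; the specific root values b_n (and their counterparts for y and θ) are arbitrary choices that a deployment would keep as part of the transform parameters.

```python
def poly_transform(value, roots):
    # X = F(x) = product over n of (x - b_n): a high-order polynomial that is
    # not monotonic, so several inputs can yield the same output and the
    # mapping cannot be inverted uniquely.
    result = 1.0
    for b in roots:
        result *= value - b
    return result

def transform_minutiae(minutiae, roots_x, roots_y, roots_theta):
    # Apply independent polynomials F, G, and H to the x, y, and angle
    # coordinates of each minutia, as in equations (16) and (17).
    return [
        (poly_transform(x, roots_x),
         poly_transform(y, roots_y),
         poly_transform(t, roots_theta))
        for x, y, t in minutiae
    ]
```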

Figure 12  Distortion transform based on feature perturbation (figure placeholder: blocks A, B, C, D on the left are scrambled onto blocks on the right, with B and D mapped onto the same composite block)

Figure 13  Example of noninvertible feature transform (figure placeholder: inputs x1 through x4 map through a high-order polynomial to outputs X1 through X4, with several inputs sharing the same output)

Encryption and transform management. The techniques presented here for transforming biometric signals differ from simple compression using signal or image processing techniques. While compression of the signal causes it to lose some of its spatial domain characteristics, it strives to preserve the overall geometry. That is, two points in a biometric signal before compression are likely to remain at a comparable distance when decompressed. This is usually not the case with our distortion transforms. Our technique also differs from encryption. The purpose of encryption is to allow a legitimate party to regenerate the original signal. In contrast, distortion transforms permanently obscure the signal in a noninvertible manner.

When employing cancelable biometrics, there are several places where the transform, its parameters, and identification templates could be stored. This leads to a possible distributed process model as shown in Figure 14. The "merchant" is where the primary interaction starts in our model. Based on the customer ID, the relevant transform is first pulled from one of the transform databases and applied to the biometrics. The resulting distorted biometrics is then sent for authentication to the "authorization" server. Once the user's identity has been confirmed, the transaction is finally passed on to the relevant commercial institution for processing.
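The message flow just described can be sketched as three cooperating parties; every class, method, and database name below is a hypothetical placeholder, and the matcher and distortion functions are assumed to be supplied from elsewhere.

```python
class TransformServer:
    """Holds per-customer transform parameters (a 'transform database')."""
    def __init__(self):
        self.params = {}

    def get_transform(self, customer_id):
        return self.params[customer_id]


class AuthorizationServer:
    """Stores only distorted templates and matches incoming distorted samples."""
    def __init__(self, matcher):
        self.templates = {}   # customer_id -> distorted template
        self.matcher = matcher

    def verify(self, customer_id, distorted_biometric):
        return self.matcher(distorted_biometric, self.templates[customer_id])


class Merchant:
    """Starting point of the interaction: distorts the live sample, asks the
    authorization server to confirm the identity, and only then forwards
    the transaction to the financial institution."""
    def __init__(self, transform_server, authorization_server, distort, institution):
        self.transform_server = transform_server
        self.authorization_server = authorization_server
        self.distort = distort
        self.institution = institution   # assumed to expose process(transaction)

    def handle_transaction(self, customer_id, live_biometric, transaction):
        params = self.transform_server.get_transform(customer_id)
        distorted = self.distort(live_biometric, params)
        if not self.authorization_server.verify(customer_id, distorted):
            raise PermissionError("biometric authentication failed")
        return self.institution.process(transaction)
```

Note that the authorization server in this sketch never sees the true biometric, only the distorted version, which is the privacy property the scheme aims for.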

Figure 14  Authentication process based on cancelable biometrics (figure placeholder: a customer interacts with merchants over the communication network; transform databases supply the distortion transforms, authorization servers hold the ID plus distorted-biometrics databases, and financial institutions such as a bank or credit card company hold the account data)

Note that an individual user may be subscribed to multiple services, such as e-commerce merchants or banks. The authentication for each transaction might be performed either by the service provider itself, or by an independent third party. Similarly, the distortion transform might be managed either by the authenticator or by still another independent agency. Alternatively, for the best privacy, the transform might remain solely in the possession of the user, stored, say, on a smart card. If the card is lost or stolen, the stolen transform applied to another person's biometrics will have very little impact. However, if the transform is applied to a stored original biometric signal of the genuine user, it will match against the stored template of that person. Hence "liveness" detection techniques (such as those described earlier) should be added to prevent such misuse.

Conclusions

Biometrics-based authentication has many usability advantages over traditional systems such as passwords. Specifically, users can never lose their biometrics, and the biometric signal is difficult to steal or forge. We have shown that the intrinsic bit strength of a biometric signal can be quite good, especially for fingerprints, when compared to conventional passwords.

Yet any system, including a biometric system, is vulnerable when attacked by determined hackers. We have highlighted eight points of vulnerability in a generic biometric system and have discussed possible attacks. We suggested several ways to alleviate some of these security threats. Replay attacks have been addressed using data-hiding techniques to secretly embed a telltale mark directly in the compressed fingerprint image. A challenge/response method has been proposed to check the liveness of the signal acquired from an intelligent sensor.

Finally, we have touched on the often-neglected problems of privacy and revocation of biometrics. It is somewhat ironic that the greatest strength of biometrics, the fact that the biometrics does not change over time, is at the same time its greatest liability. Once a set of biometric data has been compromised, it is compromised forever. To address this issue, we have proposed applying repeatable, noninvertible distortions to the biometric signal. Cancellation simply requires the specification of a new distortion transform. Privacy is enhanced because different distortions can be used for different services, and the true biometrics are never stored or revealed to the authentication server. In addition, such intentionally distorted biometrics cannot be used for searching legacy databases and will thus alleviate some privacy violation concerns.

Cited references

1. B. Miller, "Vital Signs of Identity," IEEE Spectrum 31, No. 2, 22–30 (1994).
2. L. O'Gorman, "Practical Systems for Personal Fingerprint Authentication," IEEE Computer 33, No. 2, 58–60 (2000).
3. R. Germain, A. Califano, and S. Colville, "Fingerprint Matching Using Transformation Parameter Clustering," IEEE Computational Science and Engineering 4, No. 4, 42–49 (1997).
4. A. Jain, L. Hong, and S. Pankanti, "Biometrics Identification," Communications of the ACM 43, No. 2, 91–98 (2000).
5. B. Schneier, "The Uses and Abuses of Biometrics," Communications of the ACM 42, No. 8, 136 (1999).
6. N. K. Ratha and R. M. Bolle, "Smart Card Based Authentication," in Biometrics: Personal Identification in Networked Society, A. K. Jain, R. M. Bolle, and S. Pankanti, Editors, Kluwer Academic Press, Boston, MA (1999), pp. 369–384.
7. B. Schneier, "Security Pitfalls in Cryptography," Proceedings of the CardTech/SecureTech Conference, CardTech/SecureTech, Bethesda, MD (1998), pp. 621–626.
8. B. Schneier, Applied Cryptography, John Wiley & Sons, Inc., New York (1996).
9. J. W. Osterburg, T. Parthasarathy, T. E. S. Raghavan, and S. L. Sclove, "Development of a Mathematical Formula for the Calculation of Fingerprint Probabilities Based on Individual Characteristics," Journal of the American Statistical Association 72, 772–778 (1977).
10. S. L. Sclove, "The Occurrence of Fingerprint Characteristics as a Two Dimensional Process," Journal of the American Statistical Association 74, 588–595 (1979).
11. D. A. Stoney, J. I. Thornton, and D. Crim, "A Critical Analysis of Quantitative Fingerprint Individuality Models," Journal of Forensic Sciences 31, No. 4, 1187–1216 (1986).
12. WSQ Gray-Scale Fingerprint Image Compression Specification, IAFIS-IC-0110v2, Federal Bureau of Investigation, Criminal Justice Information Services Division (1993).
13. N. Memon and P. W. Wong, "Protecting Digital Media Content," Communications of the ACM 41, No. 7, 35–43 (1998).
14. F. A. Petitcolas, R. J. Anderson, and M. G. Kuhn, "Information Hiding—A Survey," Proceedings of the IEEE 87, No. 7, 1062–1078 (1999).
15. W. Bender, D. Gruhl, N. Morimoto, and A. Lu, "Techniques for Data Hiding," IBM Systems Journal 35, Nos. 3&4, 313–336 (1996).
16. M. D. Swanson, M. Kobayashi, and A. H. Tewfik, "Multimedia Data Embedding and Watermarking Technologies," Proceedings of the IEEE 86, No. 6, 1064–1087 (1998).
17. C. T. Hsu and J. L. Wu, "Hidden Digital Watermarks in Images," IEEE Transactions on Image Processing 8, No. 1, 58–68 (1999).
18. M. Yeung and S. Pankanti, "Verification Watermarks on Fingerprint Recognition and Retrieval," Journal of Electronic Imaging 9, No. 4, 468–476 (2000).
19. C. M. Brislawn, J. N. Bradley, R. J. Onyshczak, and T. Hopper, "The FBI Compression Standard for Digitized Fingerprint Images," Proceedings of SPIE 2847, 344–355 (1996).
20. M. Vetterli and J. Kovacevic, Wavelets and Subband Coding, Prentice Hall, Englewood Cliffs, NJ (1995).
21. T. Rowley, "Silicon Fingerprint Readers: A Solid State Approach to Biometrics," Proceedings of the CardTech/SecureTech Conference, CardTech/SecureTech, Bethesda, MD (1997), pp. 152–159.
22. G. Wolberg, "Image Morphing: A Survey," The Visual Computer 14, 360–372 (1998).
23. T. Beier and S. Neely, "Feature-Based Image Metamorphosis," Proceedings of SIGGRAPH, ACM, New York (1992), pp. 35–42.


Accepted for publication April 24, 2001.

Nalini K. Ratha IBM Research Division, Thomas J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, New York 10532 (electronic mail: [email protected]). Dr. Ratha is a research staff member in the Exploratory Computer Vision Group. He received his Ph.D. degree in computer science from Michigan State University in 1996, working in the Pattern Recognition and Image Processing Laboratory. His research interests include automated biometrics, computer vision, image processing, reconfigurable computing architectures, and performance evaluation.

Jonathan H. Connell IBM Research Division, Thomas J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, New York 10532 (electronic mail: [email protected]). Dr. Connell is a research staff member in the Exploratory Computer Vision Group. He received his Ph.D. degree in 1989 at the MIT Artificial Intelligence Laboratory, working with Rod Brooks on behavior-based mobile robot control. His research interests include robotics, vision, natural language, and complete artificial intelligence systems.

Ruud M. Bolle IBM Research Division, Thomas J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, New York 10532 (electronic mail: [email protected]). Dr. Bolle is the founding manager of the Exploratory Computer Vision Group. He received his Ph.D. degree in electrical engineering from Brown University, Providence, Rhode Island in 1984. He is a Fellow of the IEEE and the AIPR and is a member of the IBM Academy of Technology. His research interests are focussed on video database indexing, video processing, visual human-computer interaction, and biometrics.
