ieee transactions on information forensics …€¦ · captcha as graphical passwords—a new...

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 9, NO. 6, JUNE 2014 891

Captcha as Graphical Passwords—A New SecurityPrimitive Based on Hard AI Problems

Bin B. Zhu, Jeff Yan, Guanbo Bao, Maowei Yang, and Ning Xu

Abstract— Many security primitives are based on hard math-ematical problems. Using hard AI problems for security isemerging as an exciting new paradigm, but has been under-explored. In this paper, we present a new security primitivebased on hard AI problems, namely, a novel family of graph-ical password systems built on top of Captcha technology,which we call Captcha as graphical passwords (CaRP). CaRPis both a Captcha and a graphical password scheme. CaRPaddresses a number of security problems altogether, such asonline guessing attacks, relay attacks, and, if combined withdual-view technologies, shoulder-surfing attacks. Notably, a CaRPpassword can be found only probabilistically by automatic onlineguessing attacks even if the password is in the search set.CaRP also offers a novel approach to address the well-knownimage hotspot problem in popular graphical password systems,such as PassPoints, that often leads to weak password choices.CaRP is not a panacea, but it offers reasonable security andusability and appears to fit well with some practical applicationsfor improving online security.

Index Terms— Graphical password, password, hotspots, CaRP,Captcha, dictionary attack, password guessing attack, securityprimitive.

I. INTRODUCTION

AFUNDAMENTAL task in security is to create crypto-graphic primitives based on hard mathematical problems

that are computationally intractable. For example, the problemof integer factorization is fundamental to the RSA public-keycryptosystem and the Rabin encryption. The discrete logarithmproblem is fundamental to the ElGamal encryption, the Diffie-Hellman key exchange, the Digital Signature Algorithm, theelliptic curve cryptography and so on.

Using hard AI (Artificial Intelligence) problems forsecurity, initially proposed in [17], is an exciting new par-adigm. Under this paradigm, the most notable primitiveinvented is Captcha, which distinguishes human users fromcomputers by presenting a challenge, i.e., a puzzle, beyond

Manuscript received April 15, 2013; revised July 22, 2013 andOctober 14, 2013; accepted February 21, 2014. Date of publication March19, 2014; date of current version April 21, 2014. This work was done whenG. Bao and M. Yang worked as interns at Microsoft Research Asia. Theassociate editor coordinating the review of this manuscript and approving itfor publication was Prof. Carlo Blundo.

B. B. Zhu and N. Xu are with Microsoft Research Asia, Beijing 100080,China (e-mail: [email protected]; [email protected]).

J. Yan is with Newcastle University, Newcastle NE 1 7RU, U.K. (e-mail:[email protected]).

G. Bao is with the Institute of Automation, Chinese Academy of Sciences,Beijing 100190, China (e-mail: [email protected]).

M. Yang is with Sichuan University, Chengdu 610207, China (e-mail:[email protected]).

Color versions of one or more of the figures in this paper are availableonline at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIFS.2014.2312547

the capability of computers but easy for humans. Captcha isnow a standard Internet security technique to protect onlineemail and other services from being abused by bots.

However, this new paradigm has achieved just a limitedsuccess as compared with the cryptographic primitives basedon hard math problems and their wide applications. Is itpossible to create any new security primitive based on hardAI problems? This is a challenging and interesting open prob-lem. In this paper, we introduce a new security primitive basedon hard AI problems, namely, a novel family of graphical pass-word systems integrating Captcha technology, which we callCaRP (Captcha as gRaphical Passwords). CaRP is click-basedgraphical passwords, where a sequence of clicks on an image isused to derive a password. Unlike other click-based graphicalpasswords, images used in CaRP are Captcha challenges, anda new CaRP image is generated for every login attempt.

The notion of CaRP is simple but generic. CaRP can havemultiple instantiations. In theory, any Captcha scheme relyingon multiple-object classification can be converted to a CaRPscheme. We present exemplary CaRPs built on both textCaptcha and image-recognition Captcha. One of them is a textCaRP wherein a password is a sequence of characters likea text password, but entered by clicking the right charactersequence on CaRP images.

CaRP offers protection against online dictionary attacks onpasswords, which have been for long time a major securitythreat for various online services. This threat is widespreadand considered as a top cyber security risk [13]. Defenseagainst online dictionary attacks is a more subtle problem thanit might appear. Intuitive countermeasures such as throttlinglogon attempts do not work well for two reasons:

1) It causes denial-of-service attacks (which were exploitedto lock highest bidders out in final minutes of eBayauctions [12]) and incurs expensive helpdesk costs foraccount reactivation.

2) It is vulnerable to global password attacks [14] wherebyadversaries intend to break into any account rather thana specific one, and thus try each password candidate onmultiple accounts and ensure that the number of trials oneach account is below the threshold to avoid triggeringaccount lockout.

CaRP also offers protection against relay attacks, an increas-ing threat to bypass Captchas protection, wherein Captchachallenges are relayed to humans to solve. Koobface [33]was a relay attack to bypass Facebook’s Captcha in creatingnew accounts. CaRP is robust to shoulder-surfing attacks ifcombined with dual-view technologies.

CaRP requires solving a Captcha challenge in every login.This impact on usability can be mitigated by adapting theCaRP image’s difficulty level based on the login history ofthe account and the machine used to log in.

Typical application scenarios for CaRP include:

1) CaRP can be applied on touch-screen devices whereontyping passwords is cumbersome, esp. for secure Internetapplications such as e-banks. Many e-banking systemshave applied Captchas in user logins [39]. For example,ICBC (www.icbc.com.cn), the largest bank in the world,requires solving a Captcha challenge for every onlinelogin attempt.

2) CaRP increases spammer’s operating cost and thus helpsreduce spam emails. For an email service providerthat deploys CaRP, a spam bot cannot log into anemail account even if it knows the password. Instead,human involvement is compulsory to access an account.If CaRP is combined with a policy to throttle the numberof emails sent to new recipients per login session, a spambot can send only a limited number of emails beforeasking human assistance for login, leading to reducedoutbound spam traffic.

The remaining paper is organized as follows: Backgroundand related work are presented in Section II. We outline CaRPin Section III, and present a variety of CaRP schemes inSections IV and V. Security analysis is provided in Section VI.A usability study on two CaRP schemes that we have imple-mented is reported in Section VII. Balance of security andusability is discussed in Section VIII. We conclude the paperwith Section IX.

II. BACKGROUND AND RELATED WORK

A. Graphical Passwords

A large number of graphical password schemes have beenproposed. They can be classified into three categories accord-ing to the task involved in memorizing and entering passwords:recognition, recall, and cued recall. Each type will be brieflydescribed here. More can be found in a recent review ofgraphical passwords [1].

A recognition-based scheme requires identifying amongdecoys the visual objects belonging to a password portfolio.A typical scheme is Passfaces [2] wherein a user selects aportfolio of faces from a database in creating a password.During authentication, a panel of candidate faces is presentedfor the user to select the face belonging to her portfolio. Thisprocess is repeated several rounds, each round with a differentpanel. A successful login requires correct selection in eachround. The set of images in a panel remains the same betweenlogins, but their locations are permuted. Story [20] is similarto Passfaces but the images in the portfolio are ordered, anda user must identify her portfolio images in the correct order.Déjà Vu [21] is also similar but uses a large set of computer-generated “random-art” images. Cognitive Authentication [22]requires a user to generate a path through a panel of images asfollows: starting from the top-left image, moving down if theimage is in her portfolio, or right otherwise. The user identifiesamong decoys the row or column label that the path ends.

This process is repeated, each time with a different panel.A successful login requires that the cumulative probabilitythat correct answers were not entered by chance exceeds athreshold within a given number of rounds.

A recall-based scheme requires a user to regeneratethe same interaction result without cueing. Draw-A-Secret(DAS) [3] was the first recall-based scheme proposed. A userdraws her password on a 2D grid. The system encodes thesequence of grid cells along the drawing path as a user-drawn password. Pass-Go [4] improves DAS’s usability byencoding the grid intersection points rather than the grid cells.BDAS [23] adds background images to DAS to encourageusers to create more complex passwords.

In a cued-recall scheme, an external cue is provided to helpmemorize and enter a password. PassPoints [5] is a widelystudied click-based cued-recall scheme wherein a user clicksa sequence of points anywhere on an image in creating apassword, and re-clicks the same sequence during authenti-cation. Cued Click Points (CCP) [18] is similar to PassPointsbut uses one image per click, with the next image selectedby a deterministic function. Persuasive Cued Click Points(PCCP) [19] extends CCP by requiring a user to select apoint inside a randomly positioned viewport when creating apassword, resulting in more randomly distributed click-pointsin a password.

Among the three types, recognition is considered the easiestfor human memory whereas pure recall is the hardest [1].Recognition is typically the weakest in resisting guessingattacks. Many proposed recognition-based schemes practicallyhave a password space in the range of 213 to 216 passwords [1].A study [6] reported that a significant portion of passwords ofDAS and Pass-Go [4] were successfully broken with guessingattacks using dictionaries of 231 to 241 entries, as compared tothe full password space of 258 entries. Images contain hotspots[7], [8], i.e., spots likely selected in creating passwords.Hotspots were exploited to mount successful guessing attackson PassPoints [8]–[11]: a significant portion of passwords werebroken with dictionaries of 226 to 235 entries, as compared tothe full space of 243 passwords.

B. Captcha

Captcha relies on the gap of capabilities between humansand bots in solving certain hard AI problems. There are twotypes of visual Captcha: text Captcha and Image-RecognitionCaptcha (IRC). The former relies on character recogni-tion while the latter relies on recognition of non-characterobjects. Security of text Captchas has been extensively studied[26]–[30]. The following principle has been established: textCaptcha should rely on the difficulty of character segmenta-tion, which is computationally expensive and combinatoriallyhard [30].

Machine recognition of non-character objects is far lesscapable than character recognition. IRCs rely on the difficultyof object identification or classification, possibly combinedwith the difficulty of object segmentation. Asirra [31] relies onbinary object classification: a user is asked to identify all thecats from a panel of 12 images of cats and dogs. Security of

IRCs has also been studied. Asirra was found to be susceptibleto machine-learning attacks [24]. IRCs based on binary objectclassification or identification of one concrete type of objectsare likely insecure [25]. Multi-label classification problems areconsidered much harder than binary classification problems.

Captcha can be circumvented through relay attacks wherebyCaptcha challenges are relayed to human solvers, whoseanswers are fed back to the targeted application.

C. Captcha in Authentication

It was introduced in [14] to use both Captcha and passwordin a user authentication protocol, which we call Captcha-basedPassword Authentication (CbPA) protocol, to counter onlinedictionary attacks. The CbPA-protocol in [14] requires solvinga Captcha challenge after inputting a valid pair of user ID andpassword unless a valid browser cookie is received. For aninvalid pair of user ID and password, the user has a certainprobability to solve a Captcha challenge before being deniedaccess. An improved CbPA-protocol is proposed in [15] bystoring cookies only on user-trusted machines and applyinga Captcha challenge only when the number of failed loginattempts for the account has exceeded a threshold. It is furtherimproved in [16] by applying a small threshold for failedlogin attempts from unknown machines but a large thresholdfor failed attempts from known machines with a previoussuccessful login within a given time frame.

Captcha was also used with recognition-based graphicalpasswords to address spyware [40], [41], wherein a textCaptcha is displayed below each image; a user locates her ownpass-images from decoy images, and enters the characters atspecific locations of the Captcha below each pass-image asher password during authentication. These specific locationswere selected for each pass-image during password creationas a part of the password.

In the above schemes, Captcha is an independent entity, usedtogether with a text or graphical password. On the contrary,a CaRP is both a Captcha and a graphical password scheme,which are intrinsically combined into a single entity.

D. Other Related Work

Captcha is used to protect sensitive user inputs on anuntrusted client [35]. This scheme protects the communicationchannel between user and Web server from keyloggers andspyware, while CaRP is a family of graphical passwordschemes for user authentication. The paper [35] did notintroduce the notion of CaRP or explore its rich propertiesand the design space of a variety of CaRP instantiations.

III. CAPTCHA AS GRAPHICAL PASSWORDS

A. A New Way to Thwart Guessing Attacks

In a guessing attack, a password guess tested in an unsuc-cessful trial is determined wrong and excluded from subse-quent trials. The number of undetermined password guessesdecreases with more trials, leading to a better chance of findingthe password. Mathematically, let S be the set of passwordguesses before any trial, ρ be the password to find, T denote

a trial whereas Tn denote the n-th trial, and p(T = ρ) bethe probability that ρ is tested in trial T . Let En be the setof password guesses tested in trials up to (including) Tn . Thepassword guess to be tested in n-th trial Tn is from set S\En−1,i.e., the relative complement of En−1 in S. If ρ ∈ S, then wehave

p (T = ρ|T1 �= ρ, . . . , Tn−1 �= ρ) > p(T = ρ), (1)

and

En → Sp(T = ρ|T1 �= ρ, . . . , Tn−1 �= ρ)→ 1

}with n→ |S|, (2)

where |S| denotes the cardinality of S. From Eq. (2), thepassword is always found within |S| trials if it is in S;otherwise S is exhausted after |S| trials. Each trial determinesif the tested password guess is the actual password or not, andthe trial’s result is deterministic.

To counter guessing attacks, traditional approaches indesigning graphical passwords aim at increasing the effectivepassword space to make passwords harder to guess and thusrequire more trials. No matter how secure a graphical passwordscheme is, the password can always be found by a brute forceattack. In this paper, we distinguish two types of guessingattacks: automatic guessing attacks apply an automatic trialand error process but S can be manually constructed whereashuman guessing attacks apply a manual trial and error process.

CaRP adopts a completely different approach to counterautomatic guessing attacks. It aims at realizing the followingequation:

p(T = ρ|T1, . . . , Tn−1) = p(T = ρ), ∀n (3)

in an automatic guessing attack. Eq. (3) means that each trialis computationally independent of other trials. Specifically,no matter how many trials executed previously, the chanceof finding the password in the current trial always remainsthe same. That is, a password in S can be found onlyprobabilistically by automatic guessing (including brute-force)attacks, in contrast to existing graphical password schemeswhere a password can be found within a fixed number of trials.

How to achieve the goal? If a new image is used for eachtrial, and images of different trials are independent of eachother, then Eq. (3) holds. Independent images among differentlogin attempts must contain invariant information so that theauthentication server can verify claimants. By examining theecosystem of user authentication, we noticed that human usersenter passwords during authentication, whereas the trial anderror process in guessing attacks is executed automatically.The capability gap between humans and machines can beexploited to generate images so that they are computationally-independent yet retain invariants that only humans can iden-tify, and thus use as passwords. The invariants among imagesmust be intractable to machines to thwart automatic guessingattacks. This requirement is the same as that of an idealCaptcha [25], leading to creation of CaRP, a new family ofgraphical passwords robust to online guessing attacks.

B. CaRP: An Overview

In CaRP, a new image is generated for every login attempt,even for the same user. CaRP uses an alphabet of visualobjects (e.g., alphanumerical characters, similar animals) togenerate a CaRP image, which is also a Captcha challenge.A major difference between CaRP images and Captcha imagesis that all the visual objects in the alphabet should appearin a CaRP image to allow a user to input any password butnot necessarily in a Captcha image. Many Captcha schemescan be converted to CaRP schemes, as described in the nextsubsection.

CaRP schemes are clicked-based graphical passwords.According to the memory tasks in memorizing and enter-ing a password, CaRP schemes can be classified into twocategories: recognition and a new category, recognition-recall,which requires recognizing an image and using the recog-nized objects as cues to enter a password. Recognition-recallcombines the tasks of both recognition and cued-recall, andretains both the recognition-based advantage of being easyfor human memory and the cued-recall advantage of a largepassword space. Exemplary CaRP schemes of each type willbe presented later.

C. Converting Captcha to CaRP

In principle, any visual Captcha scheme relying on recogniz-ing two or more predefined types of objects can be convertedto a CaRP. All text Captcha schemes and most IRCs meet thisrequirement. Those IRCs that rely on recognizing a singlepredefined type of objects can also be converted to CaRPsin general by adding more types of objects. In practice,conversion of a specific Captcha scheme to a CaRP schemetypically requires a case by case study, in order to ensureboth security and usability. We will present in Sections IVand V several CaRPs built on top of text and image-recognitionCaptcha schemes.

Some IRCs rely on identifying objects whose types are notpredefined. A typical example is Cortcha [25] which relieson context-based object recognition wherein the object to berecognized can be of any type. These IRCs cannot be convertedinto CaRP since a set of pre-defined object types is essentialfor constructing a password.

D. User Authentication With CaRP Schemes

Like other graphical passwords, we assume that CaRPschemes are used with additional protection such as securechannels between clients and the authentication server throughTransport Layer Security (TLS). A typical way to apply CaRPschemes in user authentication is as follows. The authentica-tion server AS stores a salt s and a hash value H (ρ, s) foreach user ID, where ρ is the password of the account andnot stored. A CaRP password is a sequence of visual objectIDs or clickable-points of visual objects that the user selects.Upon receiving a login request, AS generates a CaRP image,records the locations of the objects in the image, and sendsthe image to the user to click her password. The coordinatesof the clicked points are recorded and sent to AS along

Fig. 1. Flowchart of basic CaRP authentication.

with the user ID. AS maps the received coordinates onto theCaRP image, and recovers a sequence of visual object IDs orclickable points of visual objects, ρ′, that the user clicked onthe image. Then AS retrieves salt s of the account, calculatesthe hash value of ρ′ with the salt, and compares the resultwith the hash value stored for the account. Authenticationsucceeds only if the two hash values match. This process iscalled the basic CaRP authentication and shown in Fig. 1.

Advanced authentication with CaRP, for example,challenge-response, will be presented in Section V-B. Weassume in the following that CaRP is used with the basicCaRP authentication unless explicitly stated otherwise.

To recover a password successfully, each user-clicked pointmust belong to a single object or a clickable-point of anobject. Objects in a CaRP image may overlap slightly withneighboring objects to resist segmentation. Users should notclick inside an overlapping region to avoid ambiguity inidentifying the clicked object. This is not a usability concern inpractice since overlapping areas generally take a tiny portionof an object.

IV. RECOGNITION-BASED CaRP

For this type of CaRP, a password is a sequence of visualobjects in the alphabet. Per view of traditional recognition-based graphical passwords, recognition-based CaRP seemsto have access to an infinite number of different visualobjects. We present two recognition-based CaRP schemes anda variation next.

A. ClickText

ClickText is a recognition-based CaRP scheme built on topof text Captcha. Its alphabet comprises characters without anyvisually-confusing characters. For example, Letter “O” anddigit “0” may cause confusion in CaRP images, and thus onecharacter should be excluded from the alphabet. A ClickTextpassword is a sequence of characters in the alphabet, e.g.,ρ =“AB#9CD87”, which is similar to a text password. AClickText image is generated by the underlying Captchaengine as if a Captcha image were generated except that allthe alphabet characters should appear in the image. Duringgeneration, each character’s location is tracked to produceground truth for the location of the character in the generatedimage. The authentication server relies on the ground truth toidentify the characters corresponding to user-clicked points.In ClickText images, characters can be arranged randomly

Fig. 2. A ClickText image with 33 characters.

Fig. 3. Captcha Zoo with horses circled red.

Fig. 4. A ClickAnimal image (left) and 6 × 6 grid (right) determined byred turkey’s bounding rectangle.

on 2D space. This is different from text Captcha challengesin which characters are typically ordered from left to rightin order for users to type them sequentially. Fig. 2 shows aClickText image with an alphabet of 33 characters. In enteringa password, the user clicks on this image the characters in herpassword, in the same order, for example “A”, “B”, “#”, “9”,“C”, “D”, “8”, and then “7” for password ρ = “AB#9CD87”.

B. ClickAnimal

Captcha Zoo [32] is a Captcha scheme which uses3D models of horse and dog to generate 2D animals withdifferent textures, colors, lightings and poses, and arrangesthem on a cluttered background. A user clicks all the horsesin a challenge image to pass the test. Fig. 3 shows a samplechallenge wherein all the horses are circled red.

ClickAnimal is a recognition-based CaRP scheme built ontop of Captcha Zoo [32], with an alphabet of similar animalssuch as dog, horse, pig, etc. Its password is a sequence ofanimal names such as ρ = “Turkey, Cat, Horse, Dog,….”For each animal, one or more 3D models are built. TheCaptcha generation process is applied to generate ClickAnimalimages: 3D models are used to generate 2D animals byapplying different views, textures, colors, lightning effects,and optionally distortions. The resulting 2D animals are thenarranged on a cluttered background such as grassland. Someanimals may be occluded by other animals in the image, buttheir core parts are not occluded in order for humans to identifyeach of them. Fig. 4 shows a ClickAnimal image with analphabet of 10 animals. Note that different views applied inmapping 3D models to 2D animals, together with occlusion in

the following step, produce many different shapes for the sameanimal’s instantiations in the generated images. Combinedwith the additional anti-recognition mechanisms applied in themapping step, these make it hard for computers to recognizeanimals in the generated image, yet humans can easily identifydifferent instantiations of animals.

C. AnimalGrid

The number of similar animals is much less than the numberof available characters. ClickAnimal has a smaller alphabet,and thus a smaller password space, than ClickText. CaRPshould have a sufficiently-large effective password space toresist human guessing attacks. AnimalGrid’s password spacecan be increased by combining it with a grid-based graphicalpassword, with the grid depending on the size of the selectedanimal.

DAS [3] is a candidate but requires drawing on the grid.To be consistent with ClickAnimal, we change from drawingto clicking: Click-A-Secret (CAS) wherein a user clicks thegrid cells in her password. AnimalGrid is a combination ofClickAnimal and CAS. The number of grid-cells in a gridshould be much larger than the alphabet size. Unlike DAS,grids in our CAS are object-dependent, as we will see next.It has the advantage that a correct animal should be clickedin order for the clicked grid-cell(s) on the follow-up grid tobe correct. If a wrong animal is clicked, the follow-up gridis wrong. A click on the correctly labeled grid-cell of thewrong grid would likely produce a wrong grid-cell at theauthentication server side when the correct grid is used.

To enter a password, a ClickAnimal image is displayed first.After an animal is selected, an image of n × n grid appears,with the grid-cell size equaling the bounding rectangle of theselected animal. Each grid-cell is labeled to help users identify.Fig. 4 shows a 6 × 6 grid when the red turkey in the left imageof Fig. 4 was selected. A user can select zero to multiplegrid-cells matching her password. Therefore a password is asequence of animals interleaving with grid-cells, e.g., ρ =“Dog, Grid〈2〉, Grid〈1〉; Cat, Horse, Grid〈3〉”, where Grid〈1〉means the grid-cell indexed as 1, and grid-cells after an animalmeans that the grid is determined by the bounding rectangleof the animal. A password must begin with an animal.

When a ClickAnimal image appears, the user clicks theanimal on the image that matches the first animal in herpassword. The coordinates of the clicked point are recorded.The bounding rectangle of the clicked animal is then foundinteractively as follows: a bounding rectangle is calculated anddisplayed, e.g., the white rectangle shown in Fig. 4. The userchecks the displayed rectangle and corrects inaccurate edgesby dragging if needed. This process is repeated until the user issatisfied with the accuracy of the bounding rectangle. In mostcases, the calculated bounding rectangle is accurate enoughwithout needing manual correction.

Once the bounding rectangle of the selected animal isidentified, an image of n×n grid with the identified boundingrectangle as its grid-cell size is generated and displayed. If thegrid image is too large or too small for a user to view, thegrid image is scaled to a fitting size. The user then clicksa sequence of zero to multiple grid-cells that match the grid-

cells following the first animals in her password, and then getsback to the ClickAnimal image. For the example password ρgiven previously, she clicks a point inside grid-cell〈2〉, andthen a point inside grid-cell〈1〉 to select the two grid-cells.The coordinates of user-clicked points on the grid image (theoriginal one before scaling if the grid image is scaled) arerecorded. The above process is repeated until the user hasfinished entering her password. The resulting sequenceof coordinates of user-clicked points, e.g., “AP〈150,50〉,GP〈30,66〉, GP〈89,160〉, AP〈135,97〉,…” where “AP〈x,y〉”denotes the point with coordinates 〈x,y〉 on a ClickAnimalimage, and “GP〈x,y〉” denotes the point with coordinates 〈x,y〉on a grid image, is sent to the authentication server.

Using the ground truth, the server recovers the first animalfrom the received sequence, regenerates the grid image fromthe animal’s bounding rectangle, and recovers the clickedgrid-cells. This process is repeated to recover the passwordthe user clicked. Its hash is then calculated and compared withthe stored hash.

V. RECOGNITION-RECALL CaRP

In recognition-recall CaRP, a password is a sequence ofsome invariant points of objects. An invariant point of anobject (e.g. letter “A”) is a point that has a fixed relativeposition in different incarnations (e.g., fonts) of the object,and thus can be uniquely identified by humans no matterhow the object appears in CaRP images. To enter a password,a user must identify the objects in a CaRP image, and thenuse the identified objects as cues to locate and click theinvariant points matching her password. Each password pointhas a tolerance range that a click within the tolerance rangeis acceptable as the password point. Most people have a clickvariation of 3 pixels or less [18]. TextPoint, a recognition-recall CaRP scheme with an alphabet of characters,is presented next, followed by a variation for challenge-response authentication.

A. TextPoints

Characters contain invariant points. Fig. 5 shows someinvariant points of letter “A”, which offers a strong cue tomemorize and locate its invariant points. A point is saidto be an internal point of an object if its distance to theclosest boundary of the object exceeds a threshold. A set ofinternal invariant points of characters is selected to form aset of clickable points for TextPoints. The internality ensuresthat a clickable point is unlikely occluded by a neighboringcharacter and that its tolerance region unlikely overlaps withany tolerance region of a neighboring character’s clickablepoints on the image generated by the underlying Captchaengine. In determining clickable points, the distance betweenany pair of clickable points in a character must exceed athreshold so that they are perceptually distinguishable andtheir tolerance regions do not overlap on CaRP images.In addition, variation should also be taken into consideration.For example, if the center of a stroke segment in one characteris selected, we should avoid selecting the center of a similarstroke segment in another character. Instead, we should select

Fig. 5. Some invariant points (red crosses) of “A”.

a different point from the stroke segment, e.g., a point atone-third length of the stroke segment to an end. This variationin selecting clickable points ensures that a clickable point iscontext-dependent: a similarly structured point may or may notbe a clickable point, depending on the character that the pointlies in. Character recognition is required in locating clickablepoints on a TextPoints image although the clickable pointsare known for each character. This is a task beyond a bot’scapability.

A password is a sequence of clickable points. A charactercan typically contribute multiple clickable points. ThereforeTextPoints has a much larger password space than ClickText.

Image Generation. TextPoints images look identical toClickText images and are generated in the same way exceptthat the locations of all the clickable points are checked toensure that none of them is occluded or its tolerance regionoverlaps another clickable point’s. We simply generate anotherimage if the check fails. As such failures occur rarely dueto the fact that clickable points are all internal points, therestriction due to the check has a negligible impact on thesecurity of generated images.

Authentication. When creating a password, all clickablepoints are marked on corresponding characters in a CaRPimage for a user to select. During authentication, the user firstidentifies her chosen characters, and clicks the password pointson the right characters. The authentication server maps eachuser-clicked point on the image to find the closest clickablepoint. If their distance exceeds a tolerable range, login fails.Otherwise a sequence of clickable points is recovered, and itshash value is computed to compare with the stored value.

It is worth comparing potential password points betweenTextPoints and traditional click-based graphical passwordssuch as PassPoints [5]. In PassPoints, salient points shouldbe avoided since they are readily picked up by adversariesto mount dictionary attacks, but avoiding salient points wouldincrease the burden to remember a password. This conflictdoes not exist in TextPoints. Clickable points in TextPointsare salient points of their characters and thus help remembera password, but cannot be exploited by bots since they are bothdynamic (as compared to static points in traditional graphicalpassword schemes) and contextual:• Dynamic: locations of clickable points and their contexts

(i.e., characters) vary from one image to another. Theclickable points in one image are computationally inde-pendent of the clickable points in another image, as wewill see in Section VI-B.

• Contextual: Whether a similarly structured point is aclickable point or not depends on its context. It is onlyif within the right context, i.e., at the right location of aright character.

These two features require recognizing the correct contexts,i.e., characters, first. By the very nature of Captcha, recogniz-ing characters in a Captcha image is a task beyond computer’scapability. Therefore, these salient points of characters cannotbe exploited to mount dictionary attacks on TextPoints.

B. TextPoints4CR

For the CaRP schemes presented up to now, the coordinatesof user-clicked points are sent directly to the authenticationserver during authentication. For more complex protocols, saya challenge-response authentication protocol, a response is sentto the authentication server instead. TextPoints can be modifiedto fit challenge-response authentication. This variation is calledTextPoints for Challenge-Response or TextPoints4CR.

Unlike TextPoints wherein the authentication server storesa salt and a password hash value for each account, theserver in TextPoints4CR stores the password for each account.Another difference is that each character appears only oncein a TextPoints4CR image but may appear multiple times ina TextPoints image. This is because both server and clientin TextPoints4CR should generate the same sequence ofdiscretized grid-cells independently. That requires a uniqueway to generate the sequence from the shared secret,i.e., password. Repeated characters would lead to several pos-sible sequences for the same password. This unique sequenceis used as if the shared secret in a conventional challenge-response authentication protocol.

In TextPoints4CR, an image is partitioned into a fixed gridwith the discretization grid-cell of size μ along both directions.The minimal distance between any pair of clickable pointsshould be larger than μ by a margin exceeding a thresholdto prevent two clickable points from falling into a singlegrid-cell in an image. Suppose that a guaranteed toleranceof click errors along both x-axis and y-axis is τ , we requirethat μ ≥ 4τ .

Image Generation. To generate a TextPoints4CR image,the same procedure to generate a TextPoints image is applied.Then the following procedure is applied to make every click-able point at least τ distance from the edges of the grid-cell itlies in. All the clickable points, denoted as set �, are locatedon the image. For every point in �, we calculate its distancealong x-axis or y-axis to the center of the grid-cell it liesin. A point is said to be an internal point if the distance isless than 0.5μ−τ along both directions; otherwise a boundarypoint. For each boundary point in �, a nearby internal pointin the same grid-cell is selected. The selected point is calleda target point of the boundary point. After processing all thepoints in �, we obtain a new set �′ comprising internal points;these are either internal clickable points or target points ofboundary clickable points. Mesh warping [36], widely used ingenerating text Captcha challenges, is then used to warp theimage so that � maps to �′. The result is a TextPoint4CRimage wherein every clickable point would tolerate at leastτ of click errors. Selection of target points should try to reducewarping distortion caused by mapping � to �′.

Authentication. In entering a password, a user-clicked pointis replaced by the grid-cell it lies in. If click errors are within τ ,

each user-clicked point falls into the same grid-cell as theoriginal password point. Therefore the sequence of grid-cellsgenerated from user-clicked points is identical to the one thatthe authentication server generates from the stored passwordof the account. This sequence is used as if the shared secretbetween the two parties in a challenge-response authenticationprotocol.

Unlike other CaRP schemes presented in this paper, Text-Points4CR requires the authentication server to store pass-words instead of their hash values. Stored passwords must beprotected from insider attacks; for example, they are encryptedwith a master key that only the authentication server knows.A password is decrypted only when its associated accountattempts to log in.

VI. SECURITY ANALYSIS

A. Security of Underlying Captcha

Computational intractability in recognizing objects in CaRPimages is fundamental to CaRP. Existing analyses on Captchasecurity were mostly case by case or used an approximateprocess. No theoretic security model has been establishedyet. Object segmentation is considered as a computationally-expensive, combinatorially-hard problem [30], which moderntext Captcha schemes rely on. According to [30], the com-plexity of object segmentation, C , is exponentially dependentof the number M of objects contained in a challenge, andpolynomially dependent of the size N of the Captcha alphabet:C = αM P(N), where α > 1 is a parameter, and P() is apolynomial function. A Captcha challenge typically contains6 to 10 characters, whereas a CaRP image typically contains30 or more characters. The complexity to break a Click-Text image is about α30 P(N)/(α10 P(N)) = α20 times thecomplexity to break a Captcha challenge generated by itsunderlying Captcha scheme. Therefore ClickText is muchharder to break than its underlying Captcha scheme. Fur-thermore, characters in a CaRP scheme are arranged two-dimensionally, further increasing segmentation difficulty dueto one more dimension to segment. As a result, we canreduce distortions in ClickText images for improved usabilityyet maintain the same security level as the underlying textCaptcha. ClickAnimal relies on both object segmentationand multiple-label classification. Its security remains an openquestion.

As a framework of graphical passwords, CaRP does not relyon any specific Captcha scheme. If one Captcha scheme getsbroken, a new and more robust Captcha scheme may appearand be used to construct a new CaRP scheme. In the remainingsecurity analysis, we assume that it is intractable for computersto recognize any objects in any challenge image generated bythe underlying Captcha of CaRP. More accurately, the Captchais assumed to be chosen-pixel attack (CPA)-secure definedwith the following experiment: an adversary A first learns froman arbitrary number of challenge images by querying a ground-truth oracle O as follows: A selects an arbitrary number ofinternal object-points and sends to O, which responds with theobject that each point lies in. Then A receives a new challengeimage and selects an internal object-point to query O again.

This time O chooses a random bit b ← {0, 1} to determinewhat to return: It returns the true object if b = 1; otherwisea false object selected with a certain strategy. A is asked todetermine whether the returned object is the true object thatthe internal object-point lies in or not. A Captcha scheme issaid to be CPA-secure if A cannot succeed with a probabilitynon-negligibly higher than ½, the probability of a randomguess.

B. Automatic Online Guessing Attacks

In automatic online guessing attacks, the trial and errorprocess is executed automatically whereas dictionaries can beconstructed manually. If we ignore negligible probabilities,CaRP with underlying CPA-secure Captcha has the followingproperties:

1. Internal object-points on one CaRP image arecomputationally-independent of internal object-points onanother CaRP image. Particularly, clickable points onone image are computationally-independent of clickablepoints on another image.

2. Eq. (3) holds, i.e., trials in guessing attacks are mutuallyindependent.

The first property can be proved by contradiction. Assumethat the property does not hold, i.e., there exists an inter-nal object-point α on one image A that is non-negligiblydependent of an internal object-point β on another imageB . An adversary can exploit this dependency to launch thefollowing chosen-pixel attack. In the learning phase, imageA is used to learn the object that contains point α. In thetesting phase, point β on image B is used to query the oracle.Since point α is non-negligibly dependent of point β, thisCPA-experiment would result in a success probability non-negligibly higher than a random guess, which contradicts theCPA-secure assumption. We conclude that the first propertyholds.

The second property is a consequence of the firstproperty since user-clicked internal object-points in onetrial are computationally-independent of user-clicked internalobject-points in another trial due to the first property. We haveignored background and boundary object-points since clickingany of them would lead to authentication failure.

Eq. (3) indicates that automatic online guessing attackscan find a password only probabilistically no matter howmany trials are executed. Even if the password guess tobe tested in a trial is the actual password, the trial has aslim chance to succeed since a machine cannot recognizethe objects in the CaRP image to input the password cor-rectly. This is a great contrast to automatic online guessingattacks on existing graphical passwords which are determin-istic, i.e., that each trial in a guessing attack can alwaysdetermine if the tested password guess is the actual passwordor not, and all the password guesses can be determined bya limited number of trials. Particularly, brute-force attacksor dictionary attacks with the targeted password in the dic-tionary would always succeed in attacking existing graphicalpasswords.

C. Human Guessing Attacks

In human guessing attacks, humans are used to enterpasswords in the trial and error process. Humans are muchslower than computers in mounting guessing attacks. For8-character passwords, the theoretical password space is338 ≈ 240 for ClickText with an alphabet of 33 characters,108 ≈ 226 for ClickAnimal with an alphabet of 10 animals,and 10× 467 ≈ 242 for AnimalGrid with the setting asClickAnimal plus 6× 6 grids. If we assume that 1000 peopleare employed to work 8 hours per day without any stop in ahuman guessing attack, and that each person takes 30 secondsto finish one trial. It would take them on average 0.5 · 338 ·30/(3600 · 8 · 1000 · 365) ≈ 2007 years to break a ClickTextpassword, 0.5 · 108 · 30/(3600 · 8 · 1000) ≈ 52 days to breaka ClickAnimal password, or 0.5 · 10 · 467 · 30/(3600 · 8 ·1000 · 365) ≈ 6219 years to break an AnimalGrid password.Human guessing attacks on TextPoints require a much longertime than those on ClickText since TextPoints has a muchlarger password space.

Just like any password scheme, a longitudinal evaluationis needed to establish the effective password space for eachCaRP instantiation. This requires a separate study similar towhat Bonneau [42] did for text passwords.

A recent study on text passwords [42] indicates that userstend to choose passwords of 6–8 characters and have astrong dislike of using non-alphanumeric characters, and thatan acceptable benchmark of effective password space is theexpected number of optimal guesses per account needed tobreak 50% of accounts, which is equivalent to 21.6 bits forYahoo! users. If we assume that ClickText has roughly thesame effective password space as text passwords, it requireson average 1000 people to work 1.65 days or one person towork 4.54 years to find a ClickText password.

D. Relay Attacks

Relay attacks may be executed in several ways. Captchachallenges can be relayed to a high-volume Website hackedor controlled by adversaries to have human surfers solve thechallenges in order to continue surfing the Website, or relayedto sweatshops where humans are hired to solve Captchachallenges for small payments. Is CaRP vulnerable to relayattacks? We make the same assumption as Van Oorschot andStubblebine [15] in discussing CbPA-protocol’s robustness torelay attacks: a person will not deliberately participate in relayattacks unless paid for the task. The task to perform and theimage used in CaRP are very different from those used tosolve a Captcha challenge. This noticeable difference makesit hard for a person to mistakenly help test a password guess byattempting to solve a Captcha challenge. Therefore it would beunlikely to get a large number of unwitting people to mounthuman guessing attacks on CaRP. In addition, human inputobtained by performing a Captcha task on a CaRP image isuseless for testing a password guess.

If sweatshops are hired to mount human guessing attack, wecan make a rough estimation of the cost. We assume that thecost to click one password on a CaRP image is the same assolving a Captcha challenge. Using the lowest retail price, $1,

reported [34] to solve 1000 Captcha challenges, the averagecost to break a 26-bit password is 0.5 · 226 · 1/1000, or about33.6 thousand US dollars.

E. Shoulder-Surfing Attacks

Shoulder-surfing attacks are a threat when graphicalpasswords are entered in a public place such as bank ATMmachines. CaRP is not robust to shoulder-surfing attacksby itself. However, combined with the following dual-viewtechnology, CaRP can thwart shoulder-surfing attacks.

By exploiting the technical limitation that commonly-usedLCDs show varying brightness and color depending on theviewing angle, the dual-view technology can use softwarealone to display two images on a LCD screen concurrently,one public image viewable at most view-angles, and the otherprivate image viewable only at a specific view-angle [38].When a CaRP image is displayed as the “private” image by thedual-view system, a shoulder-surfing attacker can capture user-clicked points on the screen, but cannot capture the “private”CaRP image that only the user can see. However, the obtaineduser-clicked points are useless for another login attempt, wherea new, computationally-independent image will be used andthus the captured points will not represent the correct passwordon the new image anymore.

To the contrary, common implementations of graphicalpassword schemes such as PassPoints use a static input imagein the same location of the screen for each login attempt.Although this image can be hidden as the private image bythe dual-view technology from being captured by a shoulder-surfer, the user-clicked points captured in a successful loginare still the valid password for next login attempt. That is,capturing the points alone is sufficient for an effective attackin this case.

In general, the higher the correlation of user-clicked pointsbetween different login attempts is, the less effective protectionthe dual-view technology would provide to thwart shoulder-surfing attacks.

F. Others

CaRP is not bulletproof to all possible attacks. CaRP isvulnerable if a client is compromised such that both the imageand user-clicked points can be captured. Like many othergraphical passwords such as CCP and PCCP, CaRP schemesusing the basic CaRP authentication are vulnerable to phishingsince user-clicked points are sent to the authentication server.However, CaRP schemes such as TextPoints4CR used withchallenge-response authentication are robust to phishing toa certain level: a phishing adversary has to mount offlineguessing attacks to find out the password using the verifiabledata obtained through a successful phishing attack.

VII. EMPIRICAL EVALUATIONS

A. Implementations

ClickText and AnimalGrid were implemented usingASP.NET. ClickText was implemented by calling a config-urable text Captcha engine commercially used by Microsoft.

This Captcha engine accepts only capital letters. As a result,we chose the following 33 characters in our usability studies:capital letters except I, J, O, and Z, digits except 0 and 1,and three special characters “#”, “@”, and “&”. The last threespecial characters were chosen to balance security and users’strong dislike of using non-alphanumeric characters in textpasswords [42]. Characters were arranged in 5 rows. Eachcharacter was randomly rotated from −30◦ to 30◦ and scaledfrom 60% to 120%. Neighboring characters could overlap upto 3 pixels. Warping effect was set to the light level. Eachimage was set to 400 by 400 pixels. Fig. 2 in Section IV-Ashows an image generated with the above setting.

In our implementation of AnimalGrid, we used an alphabetof 10 animals: bird, cow, horse, dog, giraffe, pig, rabbit, camel,element, and dinosaur. Each animal had three 3D models. Thenumber of animals in a ClickAnimal image ranged randomlyfrom 10 to 12, with the extra animals randomly selectedfrom the alphabet. In generating an animal object, one ofthe three 3D animal models was randomly selected, andposed at a random view in generating a 2D object. Eachanimal was assigned a color randomly selected from a setof 12 colors. Generated 2D objects were placed randomlyon a grass background, with the main part of each animalnot occluded by other animals. Each ClickAnimal image wasalso set to 400 by 400 pixels. A 6× 6 grid was used forCAS. Cells were labeled clockwise starting from cell 0. Fig. 4in Section IV shows an example of generated ClickAnimalimages and an example of grid images. There was a crossicon on top of a grid image that a user could click to closethe grid image.

B. Usability Study

1) Experimental SettingsWe conducted an in-lab usability study to compare Click-

Text, AnimalGrid, PassPoints, text password (Text), and textpassword combined with text Captcha (P + C). P + C wasused to simulate a CbPA-protocol when a Captcha challengewas used in login. In P + C , a user was asked to entera password and solve a Captcha challenge generated withthe same Captcha engine used in ClickText. Each Captchachallenge contained 6 to 8 random characters. Keyboard inputwas used to create and enter passwords for Text and P + Cas well as to enter user IDs for all the schemes. As explainedlater, Text and P + C were conducted as if they were a singlescheme to participants.

We recruited 40 (30 males and 10 females) voluntary seniorand graduate students majoring in engineering and sciences,with ages ranging from 20 to 28 years (the average age = 23.4and the standard deviation = 1.74). For pragmatic rea-sons, they were recruited from interns working at MicrosoftResearch Asia. None of them had studied security or wasinvolved in any security usability study before. They wereinvolved in this work solely as participants in our usabilitystudy. All participants were trained to get familiar with eachauthentication scheme and their experimental tasks before ourdata collection. During the experiment, one of the authors goteach participant when it was time for the participant to take

a test, which ensures that we could collect the required datafrom every participant.

Each scheme was tested in the following setting: a partic-ipant used a web browser to interact with an authenticationserver, creating passwords or logging into the server. Oncea participant submitted his/her credentials to the server, thebrowser would show the login result.

The schemes were classified into two categories accordingto their types of passwords: AnimalGrid and PassPoints inthe first category, and the remaining schemes in the secondcategory. A password for the schemes in the second categorywas a string of characters.

Each participant was asked to create a new password neverused previously for each scheme, 4 in total, and a userID for all the schemes. Each created password consisted of8 characters or click-points. We also made it explicit thatparticipants were not allowed to write down their passwords.

Each password must meet the following minimum com-plexity requirements. A password must contain at least oneletter, one digit, and one non-alphanumerical character forboth Text and ClickText, and at least three different animalsfor AnimalGrid. No repeating patterns such as “A#A#…” or“Dog, Dog,…” were allowed. For PassPoints, click-points ina password must be distinct (i.e., no click-point was insideanother click-point’s tolerance range). Each password wasverified immediately after creation.

The study was partitioned into two stages. Two schemeswere tested in each stage. In the first stage, two schemes, onefrom each category, were randomly selected for a participant totest. One scheme had a string of text characters as a passwordwhile the other had a string of animals and grid cells or click-points as a password.

During the study, each participant was asked to log inwith the following intervals between two consecutive logintests: one hour after creation, one day, one week, and threeweeks. In each test, a participant was allowed three tries tolog in. If he/she failed three attempts, his/her password wasconsidered forgotten, and no more test would be conductedwith the participant for that specific scheme.

In the second stage, the remaining two schemes were testedin the same way as above.

In the end, each participant was required to fill a question-naire to compare ClickText and AnimalGrid with PassPointsand Text, and to compare ClickText with P+C, in terms ofease of use as a password system, taking both memorizingand entering a password into consideration.

A participant’s login time in each trial was recorded bythe server. We define the login time as the duration from thetime when the server received a login request to the timewhen the server gave its response to the login request, whichincludes the time to enter user ID and password, to generatea CaRP image, and to communicate between the server and aparticipant’s browser.

For Text and P + C, a participant was asked to enter apassword. If successful, the server recorded the time as thelogin time for Text, and then generated a Captcha challengeand sent to the user to solve. If the participant failed withthe challenge, another challenge was generated and used. This

process was repeated until the server received a correct answerto a challenge. Then the server recorded the time as the logintime for P + C, which included the time that the participantfailed to solve a challenge.

2) Experimental ResultsUsability. Among all the recorded login attempts, 24.4%

failed. Tests after a larger interval tended to have morefailed attempts. Some participants contributed significantlymore failed attempts than others. At the end of tests, 40(100%) participants remembered their PassPoints passwords,39 (97.5%) remembered their passwords of both ClickText andAnimalGrid, and 34 (85%) remembered their Text passwords.One participant forgot the AnimalGrid password at the one-hour test, and another one forgot the ClickText passwordat the one-week test. For Text, two participants forgot theirpasswords at the one-week test, and four forgot at the three-week test. PassPoints scored the best in memorability whereasText scored the worst. This may be partially due to the fact thathotspots were allowed for PassPoints passwords, and that Textpasswords had a much larger alphabet than both ClickText andAnimalGrid.

Table I shows the login time averaged over the40 participants’ successful login attempts and the samplestandard deviation as well as the maximum and minimumlogin times for each scheme. ClickText, AnimalGrid andP + C had similar average login time whereas PassPoints hada little shorter average login time. Text had a much shorteraverage login time than the other schemes. Each schemehad a large sample standard deviation relative to the averagelogin time, indicating large variations of login time for eachscheme, which is confirmed by the great difference betweenthe minimum and maximum login times in each columnshown in Table I. This is mainly caused by large individualdifferences. We did not detect obvious patterns indicating thata test with a longer interval had a larger login time than a testwith a shorter interval. We did notice that some participantshad a much larger login time when the preceding trial failed,but many other participants didn’t follow this observation.

The passwords in our tests were used much less fre-quently than typical usage of a password in practice since wewould like to test password memorability for each scheme.We expect improved results when a password is used morefrequently.

Table II shows the comparison results of different schemefor ease of use as a password system. We assign a valueranging from 1 to 5 to each category, indicating the spectrumfrom “much more difficult” to “much easier”. ClickText hasa mean value of 3.2 and a median value of 3 as compared toPassPoints, and a mean of 2.85 and a median of 2 as comparedto Text. AnimalGrid has a mean of 3.325 and a median of 4 ascompared to PassPoints, and a mean of 3.5 and a median of4 as compared to Text. ClickText has a mean of 3.875 and amedian of 4 as compared to P + C.

Security. We also analyzed the security of the passwordswe collected for Text and ClickText, with a popular password-cracking tool, John the Ripper version 1.7.9 [43]. Given oursample size of 40 users, the distribution of passwords will notbe as complete as a larger study with significantly more users,

TABLE I

LOGIN TIME FOR DIFFERENT SCHEMES: AVERAGE (T ), SAMPLE

STANDARD DEVIATION (σ ), MAX. AND MIN.

TABLE II

COMPARING DIFFERENT SCHEMES FOR EASE OF USE

but such an analysis can provide at least an indication of theirsecurity.

John the Ripper has three operation modes: “single crack”,“wordlist”, and “incremental”. In the “single crack” mode,login names and other account information as well as alarge set of mangling rules are used to generate passwordguesses. In the “wordlist” mode, a list of words and wordmangling rules are used to generate password guesses. In the“incremental” mode, a brute force attack is applied, withguesses being tested at the descending order of their likelihoodto be a password.

In our study, the default parameters and settings were usedfor John the Ripper except that length of a password was setto 8 characters, and that the recommended wordlist “all.lst”from [44] was used in the “wordlist” mode. When operat-ing in both “single crack” and “wordlist” modes, John theRipper didn’t find any password for either Text or ClickText.When running in the “incremental” mode for 24 hours on aHP Compaq Elite 8100 PC with Intel Core i7-870 CPU, 8GBRAM and 64-bit Windows 7 OS, John the Ripper found twoof 40 (i.e., 5.0%) passwords for Text but didn’t find anypassword for ClickText. The small difference in the resultsof the “incremental” mode is probably because the unusualalphabet set in ClickText increased a user’s chance to choosemore random passwords in order to meet the same passwordcomplexity requirement applied to both Text and ClickText.A separate longitudinal study is needed to fully understandthe distribution and security of ClickText passwords that userswould select.

We conclude from the cracking results that the pass-words the participants selected for Text and ClickText werereasonably strong, which matches our expectation of the

password complexity requirement described in the previoussubsection.

3) DiscussionsWiedenbeck et al. [5] studied memorability of text pass-

words and PassPoints with 20 participants for each scheme:their recall rate after one week was 65% and 70%, respectively.Our memorability results are higher than theirs for both textand PassPoints passwords.

The sample sizes in both [5] and our studies were too smallto explain conclusively the difference in the results. However,this difference could probably be explained by the differenceof the subjects in both studies. First, although randomlychosen, our participants were from the pool of interns workingat Microsoft Research Asia, who were selected typically fromleading universities. Second, our participants were used tostrong text passwords as their Microsoft accounts were strictlyenforced with strong password policies (e.g. each account mustuse a complex password, and a password has to be changedregularly, with the new password having to be significantlydifferent from previously used ones). Third, our participantswere much younger, with an average age of 23.4 years ascompared to the average of 32.9 years in Wiedenbeck et al.’sexperiment.

However, we do not think this sample bias introduced anundue impact on the main goal of our experiment, namely,comparing the performance of CaRP with other authenticationschemes. Nonetheless, user studies with a diverse sample poolare needed for CaRP, which are our future work.

C. Computation Load on Server

Compared with many graphical password schemes, gener-ation of CaRP images is an extra load on the server side.We tested our implementations of ClickText and AnimalGridon the same HP Compaq Elite 8100 PC we used to runJohn the Ripper, as described in Section VII-B. For imagesof 400x400 pixels, the average speed was 10.68 imagesper second in generating a ClickText image with 33 charactersand 0.86 images/s in generating an AnimalGrid image with10 to 12 animals. Our implementations were single-threadedwithout code optimization, and did not take any advantageof multi-core capability of the test machine. Much fasterimage generation should be viable by exploiting multi-corearchitecture of today’s servers and by optimizing the code.

VIII. BALANCE OF SECURITY AND USABILITY

Some configurations of CaRP offer acceptable usabilityacross common device types, e.g. our usability studies used400 × 400 images, which fit displays of smart phones, iPads,and PCs. While CaRP may take a similar time to entera password as other graphical password schemes, it takesa longer time to enter a password than widely used textpasswords. We discuss two approaches for balancing CaRP’ssecurity and usability.

A. Alphabet Size

Increasing alphabet size produces a larger password space,and thus is more secure, but also leads to more complex CaRP

images. When the complexity of CaRP images gets beyond acertain point, humans may need a significant amount of timeto recognize the characters in a CaRP image and may getfrustrated. The optimal alphabet size for a CaRP scheme suchas ClickText remains an open question.

It is possible to use a fixed subset of the alphabet to generateCaRP images for a user if the server receives her user IDbefore sending an image. In this case, the authentication serverallows a user to create her password from the full alphabet.Once the password is created, the server finds a suitable subsetof a reasonable size, which contains all the symbols in thepassword. The server stores the subset or its index for theaccount, and retrieves it later when the account attempts tolog in to generate a CaRP image. This scheme is suitablewhen the alphabet must be large while some people wouldlog in on small-screen devices for which an image using thefull alphabet would be too complex to quickly identify theobjects in the image.

B. Advanced Mechanisms

The CbPA-protocols described in Section II-C require auser to solve a Captcha challenge in addition to inputting apassword under certain conditions. For example, the schemedescribed in [16] applies a Captcha challenge when the numberof failed login attempts has reached a threshold for an account.A small threshold is applied for failed login attempts fromunknown machines but a large threshold is applied for failedattempts from known machines on which a successful loginoccurred within a given time frame. This technique can beintegrated into CaRP to enhance usability:

1. A regular CaRP image is applied when an account hasreached a threshold of failed login attempts. As in [16],different thresholds are applied for logins from knownand unknown machines.

2. Otherwise an “easy” CaRP image is applied.An “easy” CaRP image may take several forms depending

on the application requirements. It can be an image generatedby the underlying Captcha generator with less distortion oroverlapping, a permuted “keypad” wherein undistorted visualobjects (e.g. characters) are permuted, or even a regular“keypad” wherein each visual object (e.g., character) is alwayslocated at a fixed position. These different forms of “easy”CaRP images allow a system to adjust the level of difficultyto fit its needs.

With such a modified CaRP, a user would always enter apassword on an image for both cases listed above. No extratask is required. The only difference between the two casesis that a hard image is used in the first case whereas an easyimage is used in the second case.

IX. CONCLUSION

We have proposed CaRP, a new security primitive relyingon unsolved hard AI problems. CaRP is both a Captcha anda graphical password scheme. The notion of CaRP intro-duces a new family of graphical passwords, which adoptsa new approach to counter online guessing attacks: a newCaRP image, which is also a Captcha challenge, is used

for every login attempt to make trials of an online guessingattack computationally independent of each other. A passwordof CaRP can be found only probabilistically by automaticonline guessing attacks including brute-force attacks, a desiredsecurity property that other graphical password schemes lack.Hotspots in CaRP images can no longer be exploited to mountautomatic online guessing attacks, an inherent vulnerability inmany graphical password systems. CaRP forces adversariesto resort to significantly less efficient and much more costlyhuman-based attacks. In addition to offering protection fromonline guessing attacks, CaRP is also resistant to Captcharelay attacks, and, if combined with dual-view technologies,shoulder-surfing attacks. CaRP can also help reduce spamemails sent from a Web email service.

Our usability study of two CaRP schemes we haveimplemented is encouraging. For example, more participantsconsidered AnimalGrid and ClickText easier to use thanPassPoints and a combination of text password and Captcha.Both AnimalGrid and ClickText had better password memora-bility than the conventional text passwords. On the other hand,the usability of CaRP can be further improved by using imagesof different levels of difficulty based on the login history ofthe user and the machine used to log in. The optimal tradeoffbetween security and usability remains an open question forCaRP, and further studies are needed to refine CaRP for actualdeployments.

Like Captcha, CaRP utilizes unsolved AI problems.However, a password is much more valuable to attackers thana free email account that Captcha is typically used to protect.Therefore there are more incentives for attackers to hack CaRPthan Captcha. That is, more efforts will be attracted to thefollowing win-win game by CaRP than ordinary Captcha:If attackers succeed, they contribute to improving AI byproviding solutions to open problems such as segmenting2D texts. Otherwise, our system stays secure, contributingto practical security. As a framework, CaRP does not relyon any specific Captcha scheme. When one Captcha schemeis broken, a new and more secure one may appear and beconverted to a CaRP scheme.

Overall, our work is one step forward in the paradigm ofusing hard AI problems for security. Of reasonable securityand usability and practical applications, CaRP has goodpotential for refinements, which call for useful future work.More importantly, we expect CaRP to inspire new inventionsof such AI based security primitives.

REFERENCES

[1] R. Biddle, S. Chiasson, and P. C. van Oorschot, “Graphical passwords:Learning from the first twelve years,” ACM Comput. Surveys, vol. 44,no. 4, 2012.

[2] (2012, Feb.). The Science Behind Passfaces [Online]. Available:http://www.realuser.com/published/ScienceBehindPassfaces.pdf

[3] I. Jermyn, A. Mayer, F. Monrose, M. Reiter, and A. Rubin, “The designand analysis of graphical passwords,” in Proc. 8th USENIX SecuritySymp., 1999, pp. 1–15.

[4] H. Tao and C. Adams, “Pass-Go: A proposal to improve the usability ofgraphical passwords,” Int. J. Netw. Security, vol. 7, no. 2, pp. 273–292,2008.

[5] S. Wiedenbeck, J. Waters, J. C. Birget, A. Brodskiy, and N. Memon,“PassPoints: Design and longitudinal evaluation of a graphical passwordsystem,” Int. J. HCI, vol. 63, pp. 102–127, Jul. 2005.

[6] P. C. van Oorschot and J. Thorpe, “On predictive models and user-drawn graphical passwords,” ACM Trans. Inf. Syst. Security, vol. 10,no. 4, pp. 1–33, 2008.

[7] K. Golofit, “Click passwords under investigation,” in Proc. ESORICS,2007, pp. 343–358.

[8] A. E. Dirik, N. Memon, and J.-C. Birget, “Modeling user choice in thepasspoints graphical password scheme,” in Proc. Symp. Usable PrivacySecurity, 2007, pp. 20–28.

[9] J. Thorpe and P. C. van Oorschot, “Human-seeded attacks and exploitinghot spots in graphical passwords,” in Proc. USENIX Security, 2007,pp. 103–118.

[10] P. C. van Oorschot, A. Salehi-Abari, and J. Thorpe, “Purely automatedattacks on passpoints-style graphical passwords,” IEEE Trans. Inf.Forensics Security, vol. 5, no. 3, pp. 393–405, Sep. 2010.

[11] P. C. van Oorschot and J. Thorpe, “Exploiting predictability in click-based graphical passwords,” J. Comput. Security, vol. 19, no. 4,pp. 669–702, 2011.

[12] T. Wolverton. (2002, Mar. 26). Hackers Attack eBay Accounts[Online]. Available: http://www.zdnet.co.uk/news/networking/2002/03/26/hackers-attack-ebay-accounts-2107350/

[13] HP TippingPoint DVLabs, Vienna, Austria. (2010). Top Cyber SecurityRisks Report, SANS Institute and Qualys Research Labs [Online].Available: http://dvlabs.tippingpoint.com/toprisks2010

[14] B. Pinkas and T. Sander, “Securing passwords against dictionaryattacks,” in Proc. ACM CCS, 2002, pp. 161–170.

[15] P. C. van Oorschot and S. Stubblebine, “On countering online dictionaryattacks with login histories and humans-in-the-loop,” ACM Trans. Inf.Syst. Security, vol. 9, no. 3, pp. 235–258, 2006.

[16] M. Alsaleh, M. Mannan, and P. C. van Oorschot, “Revisitingdefenses against large-scale online password guessing attacks,” IEEETrans. Dependable Secure Comput., vol. 9, no. 1, pp. 128–141,Jan./Feb. 2012.

[17] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford, “CAPTCHA:Using hard AI problems for security,” in Proc. Eurocrypt, 2003,pp. 294–311.

[18] S. Chiasson, P. C. van Oorschot, and R. Biddle, “Graphical passwordauthentication using cued click points,” in Proc. ESORICS, 2007,pp. 359–374.

[19] S. Chiasson, A. Forget, R. Biddle, and P. C. van Oorschot, “Influencingusers towards better passwords: Persuasive cued click-points,” in Proc.Brit. HCI Group Annu. Conf. People Comput., Culture, Creativity,Interaction, vol. 1. 2008, pp. 121–130.

[20] D. Davis, F. Monrose, and M. Reiter, “On user choice in graphicalpassword schemes,” in Proc. USENIX Security, 2004, pp. 1–11.

[21] R. Dhamija and A. Perrig, “Déjà Vu: A user study using images forauthentication,” in Proc. 9th USENIX Security, 2000, pp. 1–4.

[22] D. Weinshall, “Cognitive authentication schemes safe against spyware,”in Proc. IEEE Symp. Security Privacy, May 2006, pp. 300–306.

[23] P. Dunphy and J. Yan, “Do background images improve ‘Draw a Secret’graphical passwords,” in Proc. ACM CCS, 2007, pp. 1–12.

[24] P. Golle, “Machine learning attacks against the Asirra CAPTCHA,” inProc. ACM CCS, 2008, pp. 535–542.

[25] B. B. Zhu et al., “Attacks and design of image recognition CAPTCHAs,”in Proc. ACM CCS, 2010, pp. 187–200.

[26] J. Yan and A. S. El Ahmad, “A low-cost attack on a microsoftCAPTCHA,” in Proc. ACM CCS, 2008, pp. 543–554.

[27] G. Mori and J. Malik, “Recognizing objects in adversarial clutter,”in Proc. IEEE Comput. Society Conf. Comput. Vis. Pattern Recognit.,Jun. 2003, pp. 134–141.

[28] G. Moy, N. Jones, C. Harkless, and R. Potter, “Distortion estimationtechniques in solving visual CAPTCHAs,” in Proc. IEEE Comput. Soc.Conf. Comput. Vis. Pattern Recognit., Jul. 2004, pp. 23–28.

[29] K. Chellapilla, K. Larson, P. Simard, and M. Czerwinski, “Computersbeat humans at single character recognition in reading-based humaninteraction proofs,” in Proc. 2nd Conf. Email Anti-Spam, 2005, pp. 1–3.

[30] K. Chellapilla, K. Larson, P. Simard, and M. Czerwinski, “Buildingsegmentation based human-friendly human interaction proofs,” in Proc.2nd Int. Workshop Human Interaction Proofs, 2005, pp. 1–10.

[31] J. Elson, J. R. Douceur, J. Howell, and J. Saul, “Asirra: A CAPTCHAthat exploits interest-aligned manual image categorization,” in Proc.ACM CCS, 2007, pp. 366–374.

[32] R. Lin, S.-Y. Huang, G. B. Bell, and Y.-K. Lee, “A new CAPTCHAinterface design for mobile devices,” in Proc. 12th Austral. User Inter.Conf., 2011, pp. 3–8.

[33] N. Joshi. (2009, Nov. 29). Koobface Worm Asks for CAPTCHA [Online].Available: http://blogs.mcafee.com/mcafee-labs/koobface-worm-asks-for-CAPTCHA

[34] M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G. M. Voelker,and S. Savage, “Re: CAPTCHAs—Understanding CAPTCHA-SolvingServices in an Economic Context,” in Proc. USENIX Security, 2010,pp. 435–452.

[35] M. Szydlowski, C. Kruegel, and E. Kirda, “Secure input for webapplications,” in Proc. ACSAC, 2007, pp. 375–384.

[36] G. Wolberg, “2-pass mesh warping,” in Digital Image Warping.Hoboken, NJ, USA: Wiley, 1990.

[37] HP TippingPoint DVLabs, New York, NY, USA. (2011). TheMid-Year Top Cyber Security Risks Report [Online]. Available:http://h20195.www2.hp.com/v2/GetPDF.aspx/4AA3-7045ENW.pdf

[38] S. Kim, X. Cao, H. Zhang, and D. Tan, “Enabling concurrent dual viewson common LCD screens,” in Proc. ACM Annu. Conf. Human FactorsComput. Syst., 2012, pp. 2175–2184.

[39] S. Li, S. A. H. Shah, M. A. U. Khan, S. A. Khayam, A.-R. Sadeghi,and R. Schmitz, “Breaking e-banking CAPTCHAs,” in Proc. ACSAC,2010, pp. 1–10.

[40] H. Gao, X. Liu, S. Wang, and R. Dai, “A new graphical password schemeagainst spyware by using CAPTCHA,” in Proc. Symp. Usable PrivacySecurity, 2009, pp. 760–767.

[41] L. Wang, X. Chang, Z. Ren, H. Gao, X. Liu, and U. Aickelin, “Againstspyware using CAPTCHA in graphical password scheme,” in Proc. IEEEInt. Conf. Adv. Inf. Netw. Appl., Jun. 2010, pp. 1–9.

[42] J. Bonneau, “The science of guessing: Analyzing an anonymized corpusof 70 million passwords,” in Proc. IEEE Symp. Security Privacy,Jun. 2012, pp. 20–25.

[43] John the Ripper Password Cracker [Online]. Available: http://www.openwall.com/john/

[44] Openwall Wordlists Collection [Online]. Available: http://www.openwall.com/wordlists/

Bin B. Zhu received the B.S. degree in physicsfrom the University of Science and Technologyof China, Hefei, China, and the M.S. and Ph.D.degrees in electrical engineering from the Universityof Minnesota, Minneapolis, MN, in 1986, 1993,and 1998, respectively. From 1997 to 2001, hewas a Lead Scientist with Cognicity, Inc., a startupcompany he cofounded with his dissertation Advisorand a Colleague in 1997. Since 2001, he has been aResearcher with Microsoft Research Asia, Beijing,China. He has published four book chapters and

more than 60 journal and conference papers. He has also received 38 U.S.patents. His research interests include Internet security, applied cryptography,data security and privacy, and multimedia processing.

Jeff Yan received the Ph.D. degree in computersecurity from Cambridge University. He is on thefaculty of computer science, Newcastle Univer-sity, England. His recent research includes breakingvarious graphical passwords and CAPTCHAs, andinventing new ones. He has published novel attacksthat broke reCAPTCHA, and CAPTCHAs designedby Microsoft, Yahoo, and Google. A variant ofhis graphical password design, Background Draw aSecret, is deployed in Microsoft Windows 8.

Guanbo Bao received the Ph.D. degree in computerscience from the Institute of Automation, ChineseAcademy of Sciences, in 2012, where he is anAssistant Professor. His research interests includehigh performance rendering and image editing.

Maowei Yang was born in Chongqing in 1986. Hereceived the B.S. and M.S. degrees in computerscience and technology from Sichuan University,China, in 2009 and 2012, respectively. He joinedTencent, Guangzhou, China, in 2012, and is cur-rently working on mobile Internet applications.

Ning Xu received the B.S. degree from the NanjingUniversity of Science and Technology and the M.S.degree from Shanghai Jiao Tong University, both inautomation, in 2005 and 2008, respectively. He is aResearch Software Development Engineer with theInnovation Engineering Group, Microsoft ResearchAsia. He is currently working on image processingand computer vision related projects.

ieee transactions on information forensics …€¦ · captcha as graphical passwords—a new...

Documents