Smudge Noise for Quality Estimation of Fingerprints and its Validation

Robin Richter, Carsten Gottschlich, Lucas Mentch, Duy H. Thai and Stephan F. Huckemann

Abstract—Automated biometric identification systems are inherently challenged to optimize false (non-)match rates. This can be addressed either by directly improving comparison subsystems, or indirectly by allowing only "good quality" biometric queries to be compared. We are interested in the latter, where the challenge lies in relating "good quality" of a query to its utility with respect to a comparison subsystem. First, we propose a new general robust biometric quality validation scheme (RBQ VS) that, mimicking the use-case, robustly quantifies comparison improvement obtained by employing a specific quality estimator. For this purpose we robustify an existing validation scheme by repeated random subsampling cross-validation. Secondly, specifically for the task of fingerprint comparison, we propose a novel biometric feature for quality estimation. Since comparison subsystems based on fingerprint minutiae, which are ridge endings and bifurcations, appear to miss minutiae or detect spurious minutiae, especially in the presence of smudge noise, we propose an algorithm aiming at measuring corruption by smudge. To this end, we employ a recently developed three-part image decomposition and link our new smudge noise quality estimator (SNoQE) to the structure of the texture part found. Lastly, using the FVC databases and a NIST database, we compare the SNoQE with the popular NFIQ 2.0 estimator and its predecessor. Experimental results show that the single-feature SNoQE can compete with the multi-feature NFIQ 2.0 and, in fact, adds new information not sufficiently reproduced by the NFIQ 2.0. Indeed, a simple combination of SNoQE and NFIQ 2.0 tends to outperform on all databases included in the comparison study. An implementation of the RBQ VS and the SNoQE can be found online.

Index Terms—Biometrics, fingerprint recognition, quality assessment, system validation

I. INTRODUCTION

THERE is an abundance of biometric features that are used in biometric identification and verification systems in commercial, governmental, and forensics applications. In

R. Richter and C. Gottschlich are with the Institute for Mathematical Stochastics at the University of Göttingen, 37077 Göttingen, Germany (e-mails: [email protected], [email protected])

L. Mentch is with the Department of Statistics at the University of Pittsburgh, PA 15260 Pittsburgh, United States (e-mail: [email protected])

D.H. Thai is with the Department of Statistical Science at Duke University, NC 27708-0251 Durham, United States (e-mail: [email protected])

S.F. Huckemann is with the Felix-Bernstein-Institute for Mathematical Statistics in the Biosciences at the University of Göttingen, 37077 Göttingen, Germany (e-mail: [email protected])

This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. The material includes an additional online supplement section detailing the comparison study in Section V further, as well as the MATLAB code for the proposed SNoQE and RBQ VS. Contact [email protected] for further questions about this work.

all of these applications, there is a strong demand to optimize specific goals, typically a trade-off between security and usability over parameters such as false match / non-match rates and processing time. A direct method for optimization lies in improving comparison subsystems. This, however, can only lead to optimal results under the assumption that the processed biometric queries are of "sufficiently good quality". Thus, a second method does not touch such comparison subsystems; rather, it lowers error rates by disallowing biometric queries without "sufficient quality" from partaking in the comparison in the first place, see [1].

Following this second method requires first a focus on the specific biometric features a biometric comparison subsystem relies on. For instance, in fingerprint analysis these are usually minutiae loci and their orientations, which are subject to various sources of errors, making quantifying a query's quality a highly non-trivial task.

Although there is, of course, a quantification of biometric verification and identification experiments, optimization of an automated comparison subsystem by quality estimation is not in itself a clearly-posed problem. Such subsystems are obviously prone to various algorithm-inherent errors, stemming, for example, from scanners, digitization and distortion. Thus there is a lack of quantification of the performance of quality estimators. In consequence we identify two main challenges for this path to overcome:

(a) Design quality features such that improved quality relates with improved false match / non-match rates, and

(b) design a validation scheme that, given a quality feature/estimator and a comparison subsystem, robustly quantifies the improvement of the error rates achieved by optimization with a quality threshold.

A. Design of the Quality Feature "Smudge"

Automated fingerprint comparison subsystems are usually based on comparing (marked) point clouds obtained from automated extraction of minutiae loci (and orientations) templates of fingerprints, which encode ridge bifurcations or ridge endings. A key factor that corrupts fingerprint images is the presence of large-scale noise, caused by too much pressure, called smudge, or too little pressure, called dryness, as both obscure existing minutiae or add spurious minutiae. At times smudge/dryness can leave the fingerprint's fringe pattern only partly visible, possibly resulting in wrongly detected ridge connections that obscure present ridge endings or create spurious ridge bifurcations, and vice versa, see Fig. 1. For the two

0000–0000/00$00.00 © 2018 IEEE

This article has been accepted in a future issue of IEEE Transactions on Information Forensics and Security, but has not been fully edited. Content may change prior to final publication.

Citation information: DOI 10.1109/TIFS.2018.2889258.

(c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Figure 1: Examples how dry (a) or smudge (b) regions of a fingerprint obscure or even alter ridge patterns and minutiae.

comparison subsystems used in this study, bozorth3 (based on mindtct) and the commercial "FingerCode 3" of the Matching SDK by Dermalog GmbH, it turns out, however, that mainly smudge noise relates to improved error rates, and we believe that this is the case for most comparison subsystems currently used. For this reason, we provide here only an implementation for our Smudge Noise Quality Estimator (SNoQE), while we detail the ideas underlying both the SNoQE and the closely related Dryness Noise Quality Estimator (DNoQE). An implementation of the latter can be obtained by straightforward adaptation of the former's.

Smudge and dryness are the absence of a clear oscillatory fringe pattern in the presence of background and noise. To this end, we require a decomposition method that yields an oscillatory component. A general mathematical model for decomposition of images into cartoon (local background), texture (local fringe pattern) and noise has been developed in [2, 3], based on the two-part Rudin-Osher-Fatemi model [4]. Specifically, we employ here the global three-part decomposition (G3PD) by Thai and Gottschlich [5], a specific adaptation that has been trained on fingerprint images to extract the region of interest (ROI) via additional morphological operations (cf. also [6]). In our first step we do the same, namely extracting the ROI, the image region that contains the actual fingerprint, and, using an intermediate step of this calculation, the regions that convey texture information, cf. Fig. 2. Only within these latter regions do we have reliable information on possible minutiae locations; the others correspond to smudge or dry areas. Thus the G3PD precisely serves our purpose, yielding even a binary fringe pattern indicator.

In our second step we assess the amount of smudge/dryness at pixels where texture, i.e. oscillation, is observed. More precisely, we deduce the amount of smudge or dryness by comparing at each such pixel the sum of grey-values in a neighbourhood with an expected lower/upper bound on that sum, governed by quantiles of the averaged inter-ridge distance distribution. By their estimation (not tuning) we obtain suitable parameters ensuring that minutiae are not mistaken for smudge, or dryness, respectively. It turns out that for the purpose of quality assessment based on rolled fingers, as well as the optical, capacitive and thermal sensors investigated here, smudge noise is decisive, whereas dryness noise is of less interest.

Figure 2: An original fingerprint image (left); region of interest (ROI, center); and outline of ROI and regions of smudge within the ROI (right).

B. Design of the Robust Biometric Quality Validation Scheme (RBQ VS)

We aim at robustly comparing biometric quality estimators, as elaborated above, under the paradigm that excluding low quality lowers comparison errors. For instance, one may want to train a quality threshold that robustly guarantees a lowered, at best minimal, equal error rate (EER) of a comparison subsystem. Indeed, the comparison scores of a specified comparison subsystem and the resulting error rates for each comparison threshold can be viewed as the ground truth for this evaluation. To this end, Lee et al. [7] separated a database into a training (80 out of 880 fingerprints) and a test set, trained a quality threshold minimizing the EER, thresholded the test set accordingly and compared the respective EERs of different quality estimators. While their scheme is intuitive, it turns out that it suffers from high variation under different separations into training and test sets, so that their single experiment lacks significance. As not even k-fold cross-validation (11 runs in the example of [7], say) sufficiently reduces variation, we have employed repeated random subsampling cross-validation, which allows many more runs (100 runs) and larger training and test sets, cf. [8, Sec. 8.3]. This provides the robustification necessary, especially on small data sets.
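The validation loop just described can be sketched as follows. This is our own illustration, not the authors' MATLAB code: the helper names (rbq_vs, subset_eer, eer), the toy score format (finger id pairs with a score and a genuine/impostor flag) and the discrete-threshold EER proxy are all assumptions made for the sketch.

```python
# Sketch of the repeated random subsampling validation idea (RBQ VS):
# repeatedly split the fingerprints into training/test sets, train a quality
# threshold minimizing the EER on the training part, then record the test EER.
import random

def eer(genuine, impostor):
    """Crude equal error rate over the discrete thresholds present in the data
    (higher comparison score = more similar)."""
    best = 1.0
    for t in sorted(set(genuine) | set(impostor)):
        fnmr = sum(g < t for g in genuine) / len(genuine)
        fmr = sum(i >= t for i in impostor) / len(impostor)
        best = min(best, max(fnmr, fmr))
    return best

def subset_eer(allowed, quality, q_thresh, scores):
    """EER using only attempts where both queries are allowed and pass the
    quality threshold; scores holds tuples (finger_a, finger_b, score, is_genuine)."""
    gen = [s for (a, b, s, g) in scores if g and a in allowed and b in allowed
           and quality[a] >= q_thresh and quality[b] >= q_thresh]
    imp = [s for (a, b, s, g) in scores if not g and a in allowed and b in allowed
           and quality[a] >= q_thresh and quality[b] >= q_thresh]
    if not gen or not imp:
        return 1.0  # threshold leaves nothing to compare
    return eer(gen, imp)

def rbq_vs(fingers, quality, scores, runs=100, train_frac=0.8, seed=0):
    rng = random.Random(seed)
    test_eers = []
    for _ in range(runs):
        train = set(rng.sample(fingers, max(1, int(train_frac * len(fingers)))))
        test = set(fingers) - train
        # train a quality threshold minimizing the EER on the training set
        cand = sorted({quality[f] for f in train})
        t_best = min(cand, key=lambda t: subset_eer(train, quality, t, scores))
        test_eers.append(subset_eer(test, quality, t_best, scores))
    return sum(test_eers) / len(test_eers), test_eers
```

With 100 runs, the mean and spread of the test EERs quantify the estimator's benefit far more stably than a single split.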

C. Literature on Validation Schemes

To the best of our knowledge there is a lack of standardization for the performance evaluation of biometric query quality estimation. In the following we describe some of the most prominent methods in the literature that have been used in the past (an overview over some can be found in [9]). Several studies (e.g., [10, 11, 12, 13], the latter involving 11 quality estimators and 3 comparison subsystems) assessing correlation of quality feature scores with each other and with comparison scores of genuine attempts yield mixed outcomes of various positive and negative correlations, and no ranking of the quality features can be derived. Because comparison subsystems (should) implement expert human matching expertise, [14, 15] have compared quality estimators with human expert quality assignment. As long as variability of expert human assignment is not evaluated, however, such studies have limited authority. More closely related to our paradigm is the assessment of



monotonicity of error rates in quality thresholds. While [9, 16, 17, 18] consider monotonicity only in single bins of quality, [10, 19, 20, 21] consider all bins above a given quality score and monotonicity therein. Moreover, [9, 22] validate by comparing EERs due to random enrolment against best quality enrolment. Notably, as a non-ranking but multivalued feature giving detailed insight into how employing the specified quality feature relates to lowered error rates, the error versus reject characteristic (ERC) curves by [9] plot false non-match rates against the fraction of genuine attempts rejected by the quality estimator (here, an initial false non-match rate, often 0.1, yields a fixed comparison threshold).
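The ERC construction just cited can be computed in a few lines. The sketch below is illustrative, not the reference implementation of [9]; in particular, fixing the comparison threshold via the empirical quantile of the genuine scores is a simplifying assumption.

```python
# Sketch of an error-versus-reject characteristic (ERC) curve: fix a comparison
# threshold so that the initial FNMR equals a target (often 0.1), then reject
# genuine attempts in order of increasing quality and track the FNMR.
def erc_curve(genuine, target_fnmr=0.1):
    """genuine: list of (comparison_score, pair_quality); higher score = better match."""
    scores = sorted(s for s, _ in genuine)
    # comparison threshold giving (approximately) the initial target FNMR
    thresh = scores[int(target_fnmr * len(scores))]
    by_quality = sorted(genuine, key=lambda sq: sq[1])  # worst quality first
    curve = []
    for k in range(len(genuine)):
        kept = by_quality[k:]  # reject the k lowest-quality genuine attempts
        fnmr = sum(s < thresh for s, _ in kept) / len(kept)
        curve.append((k / len(genuine), fnmr))
    return curve
```

A useful quality estimator produces a curve that decreases as the rejection fraction grows, since the rejected low-quality attempts should be exactly the ones that fail to match.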

D. Literature on Fingerprint Quality

Other quality estimators aim at measuring general fingerprint features and their plausibility rather than directly measuring the above described kind of noise that impedes minutiae matching (overviews can be found in [10, 13, 23]). Of the general quality features of fingerprints that have been proposed we mention some typical ones. A prominent quality feature is directionality of local blocks, which is assessed via correlation matrices of the discrete gradient by Lim et al. [24], and via the polar-transformed Short-Time-Fourier-Transform by Phromsuthirak and Areekul [21]. The application of Gabor filters [25] for fingerprint quality estimation has been considered by Shen et al. [14] and Olsen et al. [12]. Apart from directionality, another quality feature that is often estimated is consistency, via differences of orientations of neighbouring blocks, see [24, 26], via reduction to a 1-D structure and binarization by Lim et al. [27], or, more elaborately, by Chen et al. [26] in two dimensions. Alternatively, variances of grey-levels have been assessed by Joun et al. [28] in order to obtain clearness of the ridge-valley structure. To this end, Fronthaler et al. [16] computed responses of symmetry filters, and more recently Tao et al. [18] assessed principal components of local blocks. Additionally, Yoon et al. focus on image quality of latent fingerprints [29], Teixeira and Leite [30] recently constructed a quality estimator for high resolution images, and Alonso-Fernandez et al. analyse the influence of fingerprint sensors on quality [31]. With increased computational power, various quality features can be combined and machine learning techniques trained on suitable combinations, see [15, 17, 19, 20, 32, 33].

Notably, many of these quality features often associate bad quality with rapid change of direction, which, unfortunately, can also occur in perfect quality fingerprints, e.g. near singularities (cores, deltas and whorls) and arches, and sometimes also near minutiae. The NIST Fingerprint Quality (NFIQ) is a highly popular, publicly available quality estimator, see [32]. Its second version, NFIQ 2.0 [33], has been designed using many of the above mentioned quality features, as well as an orientation field estimator from [34]. After a survey of 155 quality features from the literature, fourteen were selected and implemented to give a multi-dimensional feature vector. This feature vector was then used in a random forest binary classification that classified fingerprints into high and low utility. Assigning to each classified fingerprint the probability of being of high

utility, multiplying with 100 and rounding to the closest integer yields the NFIQ 2.0 quality estimate. The random forest classifier has been trained on several large databases (in total 6629 images) for optical sensors. The NFIQ 2.0 quality estimator has been validated by its ERC curves and by DET curves over the comparison threshold for three quality bins.

E. Organization of the Paper

In Section II, we briefly recall the G3PD model and its outputs, utilized in the SNoQE, which is defined and discussed along with the DNoQE in Section III. In particular, we emphasize the parameters that need to be predefined or tuned for the SNoQE algorithm, and possibly adjusted for different databases/sensors. In Section IV, we introduce our algorithmic RBQ validation scheme for assessing quality estimators. Finally, in Section V we apply the RBQ VS and provide ERC curves for our SNoQE, NFIQ and NFIQ 2.0 over the nine FVC 2000/2002/2004 databases containing real fingers (see [35]) and the NIST SD4, using the bozorth3 and the commercial "FingerCode 3" of the Matching SDK by Dermalog GmbH. Since the SNoQE's underlying quality feature of smudge noise seems less reflected by the NFIQ 2.0, we also evaluate a combination of NFIQ 2.0 with SNoQE. We conclude with a discussion and suggestions for possible extensions.

II. THE G3PD MODEL

A. Constrained Minimization

The G3PD model introduced by Thai and Gottschlich [5] decomposes a given fingerprint image, described by a real-valued (grey-level) function f observed over a grid X := {1, 2, . . . , N} × {1, 2, . . . , M} (N, M ∈ N), into three additive components: the cartoon u, the texture v, and the residual noise ε,

f(i, j) = u(i, j) + v(i, j) + ε(i, j) , for (i, j) ∈ X . (II.1)

For fingerprint images, where v takes the role of fringe patterns, Thai and Gottschlich adapted the minimization problem of Rudin et al. [4] and its extensions by Meyer [2] and Aujol and Chambolle [3], using the TV-norm for the cartoon, a fast computable ℓ1-curvelet-norm for the texture component (see [36]) and penalizing the residual noise by an ℓ∞-curvelet-norm of the same curvelet type. This leads to the objective function

JG3PD(u, v) := ||∇u||L1 + µ1 ||C(v)||ℓ1 + µ2 ||v||L1 ,

for u ∈ BV, v ∈ GB, to be minimized under the constraints

max_{(i,ℓ,k) ∈ I} |C_{i,ℓ} ε[k]| ≤ δ , f = u + v + ε ,

where BV is the space of functions with bounded variation, GB is some generalized Besov space (see [37]) designed for the curvelet transform, µ1, µ2, δ ∈ R and I ⊂ Z × Z × Z² is a suitable, pre-defined finite index set, see [36]. The constrained minimization is solved by an augmented Lagrangian algorithm, and, as a further novelty, the parameter µ2 is updated in each iteration. The parameter µ1, as well as many more parameters, listed in Tab. I, are trained beforehand for



TABLE I: Trainable parameters of the G3PD, detailed in [5, Tab. 1]

Description              | Parameters
Model parameters         | µ1, N
Algorithm parameters     | C, β1, β2, β3, γ, p
Morphological parameters | s, t, b

Figure 3: Example fingerprint (a), its inverted ROI 1 − 1ROI (b) and vfringe (c) (FVC 2004 DB2, finger 1, replica 8).

each database and remain fixed. In Tab. VII in the online supplement we report their values from [5], trained for each of the FVC databases. For the NIST SD4, as the images are similar to those of FVC 2000 DB3, we took the same values. Notably, elaborate tuning seems not necessary.

B. ROI and Fringe Pattern Extraction

As a major advantage of the G3PD method, the texture component v is exactly zero when locally no oscillation is detected (in that case the variation around u is totally absorbed in ε). Thus, setting

vbin(i, j) = 0 if v(i, j) = 0, and vbin(i, j) = 1 if v(i, j) ≠ 0, for all (i, j) ∈ X

gives an indicator for the existence of a local fringe pattern. As it may happen that small and isolated fringe-like patterns are also detected outside the fingerprint area, the following morphological operation, also from [5], introduces a new binary function that removes these. For parameters s, t, b (cf. Tab. I) consider for each pixel (i0, j0) an s × s block centred at pixel (i0, j0) along with the 8 neighbouring blocks of the same size, and determine for each of the 9 blocks whether the number of pixels with vbin(i, j) = 1 exceeds s²/t. If at least b of the 9 blocks exceed this threshold (this pixel is marked as contributing to the fringe pattern), set

vfringe(i0, j0) = 1 , else set vfringe(i0, j0) = 0 .

In particular, the convex hull of vfringe(i, j) = 1 gives the ROI with index function 1ROI(i, j). Notably, the dry and smudge patterns inside the ROI are characterized by vfringe = 0.

III. THE SMUDGE NOISE QUALITY ESTIMATOR (SNOQE)

Assume that all fingerprint images are given by functions f taking grey-level values in Ω := {0, 1, 2, . . . , 255} over the pixel grid X. Here 0 stands for black and 255 for white. As described in the previous section, after application of the G3PD we obtain the two binary functions 1ROI and vfringe. As the central rationale of SNoQE, the fringe pattern in f, encoded by vfringe, must exhaust the full grey-scale spectrum in the ROI. This is achieved by the following histogram spreading pre-processing step.

A. Contrast Enhancement

For every single fingerprint image, denote by cblack and 255 − cwhite the 0.05 and 0.95 quantiles of the empirical grey value distribution over the ROI, i.e. cblack ∈ Ω is the minimal grey value such that at least 5% of all ROI pixels attain grey levels in {0, 1, . . . , cblack} and cwhite ∈ Ω is the minimal value such that at least 5% of all ROI pixels attain grey values in {255 − cwhite, . . . , 255}. Then for (i, j) ∈ X the contrast enhanced image is given by

fCE(i, j) := 255 min{ max{f(i, j) − cblack, 0} / (255 − cwhite − cblack) , 1 } .

After this pre-processing we have in particular that fCE is close to 255 for both valley and dry areas and close to 0 for ridge and smudge areas.

B. Smudge Noise Estimation

The SNoQE algorithm calculates a smudge component fSNoQE of fCE, and the SNoQE score qSNoQE is then given by the mean of grey-values of fSNoQE within the ROI. For this purpose, the algorithm differentiates between two types of pixels in the ROI. Pixels of the first type do not contribute to the fringe pattern, i.e. vfringe(i, j) = 0, and we set fCE(i, j) as their smudge value. For smudge-like pixels, as explained above, these values are close to zero, indicating low quality. Pixels of the second type contribute to the fringe pattern, i.e. vfringe(i, j) = 1. We assign such pixels good quality (a value close to 255) if they contain within a block of integer radius r (chosen such that 2r + 1 pixels slightly exceed average ridge width) at least as many valley pixels (α times block size) as expected for a good quality fingerprint pixel located in the "worst" position, namely in the center of a ridge, cf. Fig. 4, that may even be a minutia center. For detailed pseudo-code see Algorithm 1.

Dryness noise estimation is performed in the same way, with the following obvious difference. For the dryness noise quality estimator (DNoQE) the routine of Algorithm 1 is applied to the grey-scale inverted 255 − fCE(i, j). If we simultaneously penalize for smudge and dryness and assign to each pixel the minimum of the calculated smudge value and of its dryness counterpart, this leads to the smudge and dryness noise quality estimator (SaDNoQE) given as:

fSaDNoQE(i, j) := min{fSNoQE(i, j), fDNoQE(i, j)} .

For almost all of the databases investigated in Section V, among the quality features introduced here, SNoQE yields by far the best results. Thus, we conjecture that with respect to the scanning devices and comparison subsystems used, spurious and missed minutiae detection is much more linked to smudge noise and much less to dryness noise.



Figure 5: SNoQE for the example in Fig. 3, with r = 3 and α = 3.0093, showing f, fCE and fSNoQE; here qSNoQE(f) = 0.77.

Algorithm 1 Smudge Noise Quality Estimator (SNoQE)
Input: fCE, vfringe, 1ROI, r, α
for (i, j) ∈ X s.t. 1ROI(i, j) = 1 do
    if vfringe(i, j) = 1 then
        fSNoQE(i, j) = min{ (1 / (α(2r + 1)²)) Σ_{k=i−r}^{i+r} Σ_{ℓ=j−r}^{j+r} fCE(k, ℓ) , 255 }
    else
        fSNoQE(i, j) = fCE(i, j)
    end if
end for
qSNoQE(f) := Σ_{i,j} fSNoQE(i, j) 1ROI(i, j) / ( 255 Σ_{i,j} 1ROI(i, j) )
Output: fSNoQE, qSNoQE(f)

Figure 4: Schematic illustration for SNoQE parameters (ridge between two valleys, with block radius r) in case of RMed = 10 and R0.95 = 12, yielding r = 3 and α = 14/49 (the light gray square represents (i, j), the dark gray square represents Ω(i,j)).

C. Parameter choices for r and α

Recall that the inter-ridge distance (IRD) is a local feature of a fingerprint, giving the local distance between two neighbouring ridge lines. Depending on scanner and fingerprint, this feature varies over a small but considerable interval. For each fingerprint we compute its average IRD, and, for a given database, denote by RMed the median and by R0.95 the 95 % quantile of the distribution of averaged IRDs. These numbers form the basis for our choices of r and α, which are chosen such that for a good quality fingerprint every (2r + 1) × (2r + 1) block contains at least a fraction α of valley pixels (i, j), i.e. pixels with fCE(i, j) = 255. Setting

r := ⌈R0.95/4⌉, (III.1)

and under the simplifying assumption that locally the fingerprint ridges are straight lines, as in Fig. 4, we ensure that in a (2r + 1) × (2r + 1) box of a good quality fingerprint at least α(2r + 1)² pixels feature fCE(i, j) = 255. In Fig. 4 we depict the "(second) worst" case, namely that the center of the box is the center of a ridge line, where fCE assumes the value zero. Here we have the equivalent of 2(2r + 1) white pixels (including the 2(2r + 1) half white pixels accounting for 2r + 1 white pixels). In the "worst case" the box's center is also a minutia center, such that most of the upper valley part in Fig. 4 is no longer white and we can only ensure the equivalent of 2r + 1 white pixels, i.e. α = (2r + 1)⁻¹ in this case. It turns out, due to overestimation of r by the 0.95 quantile in (III.1), that this number is too small, and the following is a good and robust choice:

α := (1 + 2⌈R0.95/4⌉ − RMed/2) / (2r + 1) ≥ 1/(2r + 1). (III.2)

A detailed analysis of the parameters r and α in the online supplement shows that they are mildly robust against misspecification, see Fig. 18 and 19.
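The two parameter choices can be sketched directly from (III.1) and (III.2); the function name is ours:

```python
import math

def snoqe_parameters(r_med: float, r_95: float):
    """Compute the SNoQE window radius r and valley fraction alpha from
    the median (r_med) and 0.95 quantile (r_95) of a database's averaged
    inter-ridge distances, following (III.1) and (III.2)."""
    r = math.ceil(r_95 / 4.0)                            # (III.1)
    alpha = (1.0 + 2.0 * r - r_med / 2.0) / (2 * r + 1)  # (III.2)
    return r, alpha

# Example: the FVC2000 DB1 statistics from Table II
r, alpha = snoqe_parameters(r_med=9.2353, r_95=10.6226)
```

Plugging in the Table II statistics reproduces the tabulated values, e.g. r = 3 and α ≈ 0.3403 for FVC2000 DB1.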

IV. ROBUST BIOMETRIC QUALITY VALIDATION SCHEME (RBQ VS)

Validating the performance of a quality estimator is a non-trivial task, since there is no ground truth for the quality of a fingerprint image. We propose to use a procedure that assesses a biometric quality estimator by its ability to lower errors of a given comparison subsystem, making the comparison data our ground truth. As discussed in the introduction, there are many approaches to validate whether the assignment of good quality to two biometric queries with respect to a quality estimator relates to a lowered probability of comparison errors. While feature vectors, like, for example, ERC curves, serve well for descriptive purposes, for our task at hand we require a single scalar quantification of its performance. However, to employ a quality estimator, a quality


threshold needs to be trained on a training set beforehand, with regard to lowering, for example, the EER of a comparison subsystem.

To mimic this use-case and validate a candidate pair of a quality estimator and a comparison subsystem, we divide a given database into a training and a test set, then train a quality threshold on the training set and impose this trained threshold on the test set. This was done in [7] for a small training set as one validation among others. Often, however, due to high variation, the outcome depends on the specific division of the database into training and test set, and indeed, this happens for our use-case, see Section V. In consequence, we robustify the procedure by K runs (K sufficiently large) of repeated random sub-sampling.

Here is the formal definition of our scheme: Assuming that biometric samples (fingerprints in our case) have been taken from varying individuals, and all measurements have been collected in a database Y with Nindv ∈ N different individuals (fingers), where each individual (finger) contributes Nrepl ∈ N replicate samples (prints), view a comparison subsystem as a map m : Y × Y → [0, 1] assigning each pair a comparison score indicating the degree of similarity between the two measurements. For a given threshold tmatch, two measurements i and j match whenever m(i, j) > tmatch.

For assessment, we define the well known false matches (FM) and false non-matches (FNM) with respect to tmatch, as well as the ground truth of genuine attempts (GA) and impostor attempts (IA), respectively, see [38], as

FM(tmatch) = #{(i, j) : m(i, j) > tmatch and i and j are not from the same individual},

FNM(tmatch) = #{(i, j) : m(i, j) ≤ tmatch and i and j are from the same individual},

GA = #{(i, j) ∈ Y × Y : i and j are from the same individual},

IA = #{(i, j) ∈ Y × Y : i and j are not from the same individual}.

Since, in general, IA is not equal to GA, we consider the false match rate and the false non-match rate,

FMR(tmatch) := FM(tmatch) / IA,   FNMR(tmatch) := FNM(tmatch) / GA.

Then

{ (FMR(tmatch), FNMR(tmatch)) : tmatch ∈ [0, 1] }

defines the receiver operating characteristic (ROC) curve. In practice, one aims at meeting a pre-specified FMR or FNMR, or at maximizing the area under the ROC curve (AUC), or at minimizing the equal error rate (EER), e.g. [35], given by

tEER := arg min_{tmatch ∈ [0, 1]} |FMR(tmatch) − FNMR(tmatch)|.
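Under the definitions above, FMR, FNMR and a grid-based approximation of tEER can be sketched as follows; the representation of comparison outcomes as (score, same_individual) pairs and the finite threshold grid are our assumptions:

```python
def error_rates(scores, t):
    """FMR and FNMR at threshold t, computed from a list of
    (score, same_individual) pairs, per the definitions above."""
    fm = sum(1 for s, same in scores if s > t and not same)
    fnm = sum(1 for s, same in scores if s <= t and same)
    ia = sum(1 for _, same in scores if not same)
    ga = sum(1 for _, same in scores if same)
    return fm / ia, fnm / ga

def eer_threshold(scores, grid=None):
    """Approximate t_EER by minimizing |FMR - FNMR| over a finite
    grid of candidate thresholds (a discretization of the arg-min)."""
    grid = grid or [k / 100 for k in range(101)]
    return min(grid, key=lambda t: abs(error_rates(scores, t)[0]
                                       - error_rates(scores, t)[1]))
```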

If q : Y → [0, 1] is now a quality estimator, here is the protocol of the RBQ VS meeting the objective of minimizing the EER. Protocols for meeting other objectives are then straightforward.

(1) Input: Number of runs K ∈ N, size X of the training set with 0 < X < Nindv, and the number of quality bins L ∈ N.

(2) Do for k = 1, 2, . . . , K:

(2.1) Randomly select X individuals. All replicate measurements from the selected individuals constitute the training set Ytrain; the replicate measurements from all other individuals comprise the test set Ytest.

(2.2) Do for j = 0, 1/L, 2/L, . . . , 1:

(2.2.1) Define the quality enhanced training set Y(j)train of all replicate measurements with quality higher than j:

Y(j)train := {f ∈ Ytrain : q(f) ≥ j}.

(2.2.2) If for Y(j)train either GA = 0 or IA = 0, set EERj := 1.

(2.2.3) Else, compute the EER of Y(j)train and denote it by EERj.

(2.3) Compute

j* := arg min_j (EERj). (IV.1)

(2.4) Define the quality enhanced test set Y*test of all replicate measurements with quality higher than j*:

Y*test := {f ∈ Ytest : q(f) ≥ j*}. (IV.2)

(2.5) If neither the IA nor the GA of Y*test is 0, the run is valid and the ROC curve and EER for Y*test are computed.

(3) Average the results of all valid runs to obtain an averaged EER and ROC curve.
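The protocol can be sketched as follows; `quality` and `eer` are assumed callables standing in for a quality estimator and for the EER computation of a comparison subsystem (returning None when no genuine or impostor attempts remain), and the data layout is ours:

```python
import random

def rbq_vs(db, quality, eer, K=100, X=40, L=100):
    """Sketch of the RBQ VS protocol above (EER objective).
    db: dict individual -> list of replicate samples;
    quality: sample -> [0, 1]; eer(samples): EER of a comparison
    subsystem on the given samples, or None if no genuine or
    impostor attempts remain."""
    results = []
    individuals = list(db)
    for _ in range(K):                                        # step (2)
        random.shuffle(individuals)                           # (2.1)
        train = [s for i in individuals[:X] for s in db[i]]
        test = [s for i in individuals[X:] for s in db[i]]

        def eer_at(j, samples):
            kept = [s for s in samples if quality(s) >= j]
            e = eer(kept)
            return 1.0 if e is None else e                    # (2.2.2)

        thresholds = [j / L for j in range(L + 1)]            # (2.2)
        j_star = min(thresholds, key=lambda j: eer_at(j, train))  # (IV.1)
        kept = [s for s in test if quality(s) >= j_star]      # (IV.2)
        e = eer(kept)
        if e is not None:                                     # (2.5) valid run
            results.append(e)
    return sum(results) / len(results) if results else None   # step (3)
```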

Upon completion, we can obtain and compare the resulting empirical distributions of the EER, or the ROC, or the AUC, as well as their mean or their median, et cetera.

Notably, in order to obtain empirical distributions that lead to statistically robust comparisons of quality estimators, it is decisive to choose the number K of runs sufficiently high.

V. COMPARISON STUDY

For comparison we use the RBQ VS based on mean EERs, allowing for scalar ranking. In the online supplement we show box plots of the underlying EER distributions in Fig. 16 and mean DET curves (recall that EERs are single points on DET curves) in Fig. 17. Further, we provide ERC curves linking FNMRs to fractions of genuine attempts excluded.

We conduct our study on the popular and open source FVC databases and on the NIST SD4 database of rolled imprints. The FVC databases have been obtained employing various sensors: low (2000 DB1) and high cost optical sensors (2000 DB2, 2002 DB1/DB2 and 2004 DB1/DB2), low (2000 DB3) and high cost capacitive sensors (2002 DB3), and a thermal sensor (2004 DB3), cf. [35] and Tab. VI in the online supplement. Each of the nine FVC databases contains 880 fingerprints with Nindv = 110 individual fingers, each with Nrepl = 8 replicates. The NIST SD4 contains 4000 fingerprints with Nindv = 2000 individual fingers and Nrepl = 2 replicate fingerprints each. Typical images of the FVC databases and NIST SD4 are displayed in Fig. 6. Furthermore, in the online


Figure 6: Example fingerprints from the FVC databases and the NIST SD4: optical (a), thermal (b), capacitive (c) and rolled (d).

supplement, highest and lowest scoring examples of the FVC databases with respect to SNoQE and DNoQE are presented, see Fig. 14.

For quality estimators we use SNoQE as well as NFIQ 2.0. Since it turns out that its first version, NFIQ, performs differently and sometimes outperforms NFIQ 2.0, we have also included it. Moreover, we study a simple combination of SNoQE with NFIQ 2.0, denoted by SNoQE∧NFIQ2. For the RBQ VS this combination is obtained by setting

Y*test := {f ∈ Ytest : qSNoQE(f) ≥ j*SNoQE and qNFIQ2(f) ≥ j*NFIQ2}, (V.1)

enforcing both independently trained quality thresholds at once. For the ERC curves considered below, this combination is obtained by first performing histogram spreading for qSNoQE and qNFIQ2, to exhaust the full interval [0, 1], and then setting

qSNoQE∧NFIQ2(f) := min{qSNoQE(f), qNFIQ2(f)}. (V.2)
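The ERC combination can be sketched as follows, under the assumption that histogram spreading amounts to min-max rescaling of the observed scores (the exact spreading used is not spelled out above):

```python
import numpy as np

def combine_min(q_a, q_b):
    """SNoQE∧NFIQ2-style combination for ERC diagnostics: spread each
    score array to exhaust [0, 1], then take the element-wise minimum,
    per (V.2). q_a, q_b: 1D arrays of quality scores."""
    def spread(q):
        lo, hi = q.min(), q.max()
        return (q - lo) / (hi - lo) if hi > lo else np.zeros_like(q)
    return np.minimum(spread(q_a), spread(q_b))
```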

Recall from Section II that the G3PD parameters have been specifically trained in [5] for ROI extraction on the nine FVC databases; as the NIST SD4 is visually similar to FVC 2000 DB3, we have used the same parameters, cf. Tab. VII in the online supplement, which turn out to be quite satisfactory. Recall further from Section III that SNoQE's parameters r and α are based on the median RMed of the distribution of averaged IRDs of the database and the corresponding 0.95 quantile R95. For the nine FVC databases, the latter quantities have been determined (cf. [25]), and for the NIST SD4, they have been heuristically estimated, cf. Tab. II and Fig. 13 in the online supplement.

As comparison subsystems (CS) we use the publicly available bozorth3 of the NIST [32] and, additionally, the commercial "FingerCode 3" of the Matching SDK by Dermalog GmbH, in the following denoted by commercial.

A. RBQ VS Results

For the RBQ VS we set the number of runs to K = 100 in order to guarantee robustness. We choose the portion X of the training set as around 1/4 of Nindv, but not too small: X = 40 for the FVC databases and X = 500 for the NIST SD4. By construction, NFIQ has only L = 5 quality bins, NFIQ 2.0 has L = 100, and for SNoQE we use the same number L = 100. We embed these bins in [0, 1] with 1 for highest quality. Notably, for small databases such as the FVC, employing too fine a binning risks obtaining runs for which, after a quality threshold has been trained, the set

TABLE II
IRD statistics defining the parameters r and α in (III.1) and (III.2), using histograms of [25] for the FVC databases and heuristic estimates for NIST SD4, cf. Fig. 13 in the online supplement.

Database R95 RMed α r

FVC2000 DB1 10.6226 9.2353 0.3403 3

FVC2000 DB2 10.6332 9.1034 0.3496 3

FVC2000 DB3 12.0301 10.5852 0.4119 4

FVC2002 DB1 9.9851 8.8957 0.2868 3

FVC2002 DB2 12.0931 10.3204 0.4266 4

FVC2002 DB3 10.5428 8.9698 0.3593 3

FVC2004 DB1 10.7047 9.0918 0.3506 3

FVC2004 DB2 11.0235 9.3484 0.3323 3

FVC2004 DB3 10.2965 9.1005 0.3500 3

NIST SD4 17 14 0.3636 5

Figure 7: Replicates 2 (a) and 7 (b) of finger 52 from FVC 2000 DB1. Both prints have a high NFIQ 2.0 score (75 and 59, respectively) but a low comparison score, resulting in a false non-match. Replicates 1 (c) and 2 (d) of finger 106 from DB2. Both prints have high SNoQE quality (0.9767 and 0.9637, respectively), while only 3 (or 5, including the miniature island near the core) minutiae overlap. This leads to a reject by both CSs.

Y*test of (IV.2) has no remaining genuine attempts, cf. the comment below Tab. III.

In Tab. III the means of the resulting EER distributions over the quality enhanced test sets (from Section IV) are displayed for each database, CS and quality estimator, respectively. As baseline of the RBQ VS, the mean EER of all test sets without quality assessment is shown in the fourth column. Out of 20 cases considered, SNoQE performs best in 6, NFIQ 2.0 in 8 and, surprisingly, NFIQ in the remaining 6 cases. Upon closer inspection, in 2 cases all three quality estimators perform similarly. Furthermore, in 2 cases SNoQE and NFIQ 2.0 outperform NFIQ, in 2 cases NFIQ and NFIQ 2.0 outperform SNoQE, and in one case SNoQE and NFIQ outperform NFIQ 2.0. For each one of the three, there are also cases where it strongly outperforms the other two. Since NFIQ 2.0 has been trained on optical sensors, there is a tendency for it to be outperformed by SNoQE on thermal and capacitive sensors. Our proposed combination SNoQE∧NFIQ2 from (V.1), however, clearly outperforms all competitors. Thus, by construction of


TABLE III
Mean EERs for the quality enhanced test sets. No quality thresholding gives the baseline. The lowest scoring single quality estimator is in bold. SNoQE∧NFIQ2 scores are always lower and marked italic.

Database Comp. Subsystem Sensor type Baseline NFIQ NFIQ 2.0 SNoQE SNoQE∧NFIQ2

FVC2000 DB1 bozorth3 optical 0.0490 0.0139 0.0248 0.0043 0.0012

FVC2000 DB1 commercial optical 0.0138 0.0036 0.0021 0.0033 0.0008

FVC2000 DB2 bozorth3 capacitive 0.0482 0.0144 0.0056 0.0022 0.0001

FVC2000 DB2 commercial capacitive 0.0100 0.0001 0.0008 0.0012 0.0003

FVC2000 DB3 bozorth3 optical 0.1324 0.0118 0.0035 0.0614 0.0001

FVC2000 DB3 commercial optical 0.0410 0.0089 0.0037 0.0146 0.0037

FVC2002 DB1 bozorth3 optical 0.0333 0.0117 0.0125 0.0345 0.0030

FVC2002 DB1 commercial optical 0.0129 0.0053 0.0052 0.0137 0.0036

FVC2002 DB2 bozorth3 optical 0.0280 0.0059 0.0037 0.0032 0

FVC2002 DB2 commercial optical 0.0067 0.0018 0.0008 0.0017 0

FVC2002 DB3 bozorth3 capacitive 0.3236 0.0232 0.0815 0.0672 0

FVC2002 DB3 commercial capacitive 0.0271 0.0066 0.0063 0.0074 0.0037

FVC2004 DB1 bozorth3 optical 0.1118 0.079 0.0265 0.0527 0.0020

FVC2004 DB1 commercial optical 0.0569 0.0500 0.0219 0.0054 0.0002

FVC2004 DB2 bozorth3 optical 0.1125 0.0012 0.0182 0.0540 0.0001

FVC2004 DB2 commercial optical 0.0427 0.0004 0.0042 0.0155 0.0001

FVC2004 DB3 bozorth3 thermal 0.0743 0.0347 0.0075 0.0043 0.0032

FVC2004 DB3 commercial thermal 0.0266 0.0115 0.0007 0.0009 0.0001

NIST SD4 bozorth3 rolled 0.0457 0.0002 0.0019 0.0011 10^−6

NIST SD4 commercial rolled 0.0265 0.0005 0.0013 0.0004 3 × 10^−6

Comment: Whenever, on a database, a run had no remaining genuine attempts for one of the three single quality estimators, this run was disregarded for all three, including the baseline (13 runs in 2000 DB3 with bozorth3 and 6 runs in 2000 DB3 with commercial, 2 in 2002 DB3 with bozorth3, 11 in 2004 DB1 with bozorth3, 14 in 2004 DB2 with bozorth3 and 6 in 2004 DB3 with commercial). For the combination (last column) we excluded only runs producing no remaining genuine attempts for the combination (5 in 2000 DB1 with commercial, 49 with bozorth3 and 44 with commercial in 2000 DB3, 32 in 2002 DB2 with bozorth3, 92 with bozorth3 and 34 with commercial in 2002 DB3, 23 in 2004 DB1 with bozorth3, 10 in 2004 DB1 with commercial, 69 in 2004 DB2 with bozorth3, 83 in 2004 DB2 with commercial and 3 in 2004 DB3 with bozorth3).

SNoQE∧NFIQ2, employing thresholds on the NFIQ 2.0 and the SNoQE quality scores lowers comparison errors over all databases. In view of usability, we note that, by construction, SNoQE∧NFIQ2 tends to reject more prints from matching than the other estimators, as can be seen in the comment of Tab. III, and in Tab. V for the NIST SD4.

B. ERC Curves

Following [9], we calculate ERC curves by disregarding genuine attempts from matching if one of the two samples does not exceed the quality threshold. This is one of many validation schemes discussed in [9] and it conforms with the rationale underlying the RBQ VS. For each database we compute the comparison threshold at 0.1 FNMR and plot the FNMR against the fraction of genuine attempts excluded

from comparison as dictated by the quality estimator. For the combined quality estimator SNoQE∧NFIQ2, cf. (V.2).

In Fig. 8, 9 and 15 (the latter can be found in the online supplement) we present the ERC curves for NFIQ, SNoQE, NFIQ 2.0 and SNoQE∧NFIQ2, the combination of the latter two, using bozorth3 and commercial, respectively. The results are very mixed, endorsing the hypothesis that SNoQE and NFIQ 2.0 measure different quality aspects. For example, on FVC 2002 DB3 (capacitive sensor) SNoQE consistently outperforms NFIQ 2.0 for bozorth3, and vice versa NFIQ 2.0 consistently outperforms SNoQE on FVC 2002 DB1 (optical sensor) for both CSs. On FVC 2000 DB1 (optical sensor) with bozorth3, for high exclusion NFIQ 2.0 consistently outperforms SNoQE, while for low exclusion SNoQE consistently outperforms NFIQ 2.0. This relationship


Figure 8: ERC curves for the FVC databases under bozorth3 (first row: FVC 2000, second row: FVC 2002, third row: FVC 2004).

is reversed for FVC 2000 DB3 (thermal sensor) with bozorth3, and to a lesser extent also with commercial. For the other databases (most are optical), ERC curves of SNoQE and NFIQ 2.0 can be (especially with bozorth3) very close, often winding around each other, giving no clear "winner"; their combination SNoQE∧NFIQ2, however, often outperforms both. Changing the CS from bozorth3 to commercial, NFIQ 2.0 tends to perform better under ERC diagnostics in comparison with SNoQE, with the exception of FVC 2004 DB1. However, their combination continues to perform reasonably well, for example on FVC 2002 DB1. Notably, for some databases and CSs, ERC points of NFIQ are below those of NFIQ 2.0, sometimes also below SNoQE, and occasionally even below their combination. ERC curves of DNoQE are shown in Fig. 12 in the online supplement, exemplifying that for the databases of concern, dryness noise, as measured by DNoQE, is of little interest for quality assessment.

For the NIST SD4 the result in Fig. 9 is surprising: the old NFIQ outperforms the new NFIQ 2.0 for both CSs. Since the

underlying rolled fingers have been carefully taken following a forensic protocol, this database contains little smudge, and hence both NFIQs outperform SNoQE. Still, all three prove to relate quality to false non-matches. While the RBQ VS shows that a combination of NFIQ 2.0 and SNoQE outperforms each single quality estimator, this effect is almost invisible in Fig. 9, suggesting the study of more advanced combinations. We stress that, also for NIST SD4, NFIQ 2.0 and SNoQE measure different quality aspects, see Fig. 10 in the online supplement.

C. Delineation of RBQ VS from ERC Diagnostics

Recall that a quality estimator is deemed favorable under ERC diagnostics if it features a quickly decaying ERC curve. In contrast, it is deemed favorable under the RBQ VS if it results in a low EER, where there is no penalty on the number of prints rejected as long as genuine attempts remain possible. In consequence, ERC diagnostics and RBQ ranking validate different aspects of a quality estimator. However, the following


Figure 9: ERC curves for NIST SD4 with bozorth3 (a) and commercial (b).

TABLE IV
Number of runs in which the resulting EER of NFIQ 2.0 or SNoQE, respectively, was lower under the bozorth3 comparison subsystem.

Database NFIQ 2.0 Equal EER SNoQE

FVC2000 DB1 0 37 63

FVC2000 DB2 35 5 60

FVC2000 DB3 75 5 14

FVC2002 DB1 99 0 1

FVC2002 DB2 50 19 31

FVC2002 DB3 41 2 55

FVC2004 DB1 76 0 8

FVC2004 DB2 63 7 19

FVC2004 DB3 21 11 68

NIST SD4 28 1 71

commonality of tail behavior of ERC curves with the RBQ VS can be observed in Fig. 8 and 15 (see online supplement) as well as Tab. III: if, for a quality estimator, FNMRs at high exclusion rates are considerably above zero, it cannot perform well under the RBQ VS. Notably, for small databases such as the FVC databases, "bad tail behaviors", observable in FVC 2000 DB1 with bozorth3 for NFIQ 2.0, in FVC 2004 DB2 with bozorth3 for SNoQE, as well as for all quality estimators on FVC 2004 DB1 with commercial, can result from "outlier" prints that have a high quality score but cause a non-match error, see Fig. 7.

At this point we remark that the RBQ VS is a conservative validation scheme, as it imposes effectively no penalty on the number of prints excluded from comparison. Hence, it models a high-security use-case in which the number of rejected prints can be very high (cf. the comment to Tab. III and Tab. V), resulting in a high number of expected retries. However, if usability is

TABLE V
Average number of prints rejected by employment of a quality threshold within the RBQ VS.

Comp. Subsystem (NIST SD4)    bozorth3    commercial

NFIQ                          1894        1554

NFIQ 2.0                      2184        1360

SNoQE                         2403        2332

SNoQE∧NFIQ2                   2713        2492

of concern (i.e. low-security), the RBQ VS can be relaxed if a penalty is introduced in Step (2.2.2/2.2.3), see Section IV.

An advantage of the RBQ VS lies in the robust quantification of its results for the use-case modelled. Recall that, in order to guarantee robustness, we used repeated random subsampling. This allows, in contrast to ERC diagnostics, to state assertions with statistical confidence. For example, we conclude from Tab. IV that, with high confidence, SNoQE outperforms NFIQ 2.0 on FVC 2000 DB1, and NFIQ 2.0 outperforms SNoQE on FVC 2002 DB1 (both with bozorth3). For the other 8 databases paired with bozorth3, each quality estimator outperforms the other in a comparable number of runs, without statistical significance, however. In particular, due to the high variability of outcomes over the runs, k-fold cross-validation is not applicable, especially on small databases such as the FVC, underscoring the need for robustification by repeated random subsampling.

D. SNoQE’s Strenghts and Weaknesses

In addition to providing an important quality feature not sufficiently reflected by NFIQ and NFIQ 2.0, SNoQE is also computationally up to approximately five times faster than NFIQ 2.0, cf. Tab. VIII in the online supplement.

By design, however, SNoQE can detect neither insufficient numbers of minutiae nor insufficient minutiae overlap. This is


particularly the case for FVC 2002 DB1 with rather smudge-free prints, cf. Fig. 11 in the online supplement, some of which feature little overlap. This causes high FNMRs under high SNoQE quality, see Fig. 8. The same holds true for other databases in which bad tail behavior of SNoQE can be found; exemplary are prints (c) and (d) of Fig. 7, which have little overlap.

VI. DISCUSSION

In this paper we have achieved two goals. First, we have provided a robust biometric quality validation scheme (RBQ VS) that, with statistical significance, ranks the performance of different quality estimators and is specifically apt for a conservative use-case. Secondly, we have provided a new fingerprint quality feature that is based on the occurrence of smudge and adopts a novel image decomposition technique for fingerprint quality estimation. The resulting one-feature quality estimator SNoQE can compete with the popular NFIQ 2.0 and its previous version NFIQ on the FVC databases and the NIST SD4 in terms of lowering error rates of comparison subsystems, while even outperforming the multi-feature quality estimators on some databases. With the robustification proposed for the RBQ VS it is possible to arrive at a statistically (highly) significant (or insignificant) superiority of one of two estimators. Furthermore, detailed analysis shows that the smudge noise component is not sufficiently reproduced by either the NFIQ or the NFIQ 2.0 feature vector and thus adds additional information.

In conclusion, we would like to ponder possible improvements of the validation scheme and the quality feature. Recall that, due to minimizing the EER over a challenging database, we exclude considerably large numbers of prints from matching. When excluding a large proportion of requests is not an option, at biometric border control for example, other objectives for the RBQ VS can be employed, for instance, using the lowest quality threshold for which a certain FAR, FRR or EER is reached. Also, additional penalization in Step (2.2.2/2.2.3) of the RBQ VS (see Section IV), taking into account the number of rejected prints, could be included.

At this point we remark that SNoQE was built directly from an image decomposition method. While we have used the G3PD, which is geared toward fingerprint decomposition, the usability of other decomposition methods could be explored; a good candidate is the DG3PD, which showed promising results on fingerprints, cf. [39], although an automated parameter choice is not available to date. While SNoQE fared very well in comparison to the NFIQ 2.0 and NFIQ, further parameter training for SNoQE as well as for the G3PD (no longer optimizing for segmentation but for quality estimation) may further improve SNoQE. In order to obtain improvement beyond our results, training on much larger suitable databases is necessary.

Finally, we recall that a naive combination of SNoQE and NFIQ 2.0 outperforms each single one. This hints towards a high potential for improvement by exploring more sophisticated combinations, for example adding minutiae quantity and overlap estimation and/or directly including the smudge noise quality feature in the NFIQ 2.0 random forest classifier.

ACKNOWLEDGEMENTS

We thank the three anonymous referees and the associate editor for their valuable comments improving our contribution. The idea and first try-outs for the SNoQE and the RBQ VS originated in two of the working groups of the SAMSI 2015 Forensics Program under NSF Grant DMS-1127914, to the support of which all authors are indebted. The first author would also like to thank the Deutsche Forschungsgemeinschaft (DFG) for support within the RTG 2088; the last author gratefully acknowledges support from the Niedersachsen Vorab of the Volkswagen Foundation.

REFERENCES

[1] C. Neumann, D. Armstrong, and T. Wu, "Determination of AFIS "sufficiency" in friction ridge examination," Forensic Science International, vol. 263, Supplement C, pp. 114–125, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0379073816301475

[2] Y. Meyer, Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, ser. University Lecture Series. American Mathematical Society, Providence, RI, 2001, vol. 22, the fifteenth Dean Jacqueline B. Lewis memorial lectures. [Online]. Available: http://dx.doi.org/10.1090/ulect/022

[3] J.-F. Aujol and A. Chambolle, "Dual norms and image decomposition models," International Journal of Computer Vision, vol. 63, no. 1, pp. 85–104, 2005.

[4] L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, no. 1-4, pp. 259–268, 1992.

[5] D. H. Thai and C. Gottschlich, "Global variational method for fingerprint segmentation by three-part decomposition," IET Biometrics, vol. 5, no. 2, pp. 120–130, 2016.

[6] D. Thai, S. Huckemann, and C. Gottschlich, "Filter design and performance evaluation for fingerprint image segmentation," PLoS ONE, vol. 11, no. 5, p. e0154160, 2016.

[7] S. Lee, H. Choi, K. Choi, and J. Kim, "Fingerprint-quality index using gradient components," IEEE Transactions on Information Forensics and Security, vol. 3, no. 4, pp. 792–800, 2008.

[8] W. Dubitzky, M. Granzow, and D. Berrar, Fundamentals of Data Mining in Genomics and Proteomics. Springer Science, 2007.

[9] P. Grother and E. Tabassi, "Performance of biometric quality measures," IEEE Trans. Pattern Anal. Mach. Intell., pp. 531–543, 2007.

[10] F. Alonso-Fernandez, J. Fierrez, J. Ortega-Garcia, J. Gonzalez-Rodriguez, H. Fronthaler, K. Kollreider, and J. Bigun, "A comparative study of fingerprint image-quality estimation methods," IEEE Transactions on Information Forensics and Security, vol. 2, no. 4, pp. 734–743, 2007.

[11] J. Fierrez-Aguilar, Y. Chen, J. Ortega-Garcia, and A. Jain, "Incorporating image quality in multi-algorithm fingerprint verification," in Proc. ICB, Hong Kong, China, Jan. 2006, pp. 213–220.

[12] M. A. Olsen, H. Xu, and C. Busch, “Gabor filters as candidate quality measure for NFIQ 2.0,” in Proc. 2012 5th IAPR International Conference on Biometrics (ICB). IEEE, 2012, pp. 158–163.

[13] M. A. Olsen, V. Smida, and C. Busch, “Finger image quality assessment features – definitions and evaluation,” IET Biometrics, vol. 5, pp. 47–64, 2016.

[14] L. Shen, A. Kot, and W. Koo, “Quality measures of fingerprint images,” in Proc. AVBPA, Springer LNCS 2091, 2001, pp. 266–271.

[15] J. Wu, S. Xie, D.-H. Seo, and W. Lee, “A new approach for classification of fingerprint image quality,” in Proc. 7th IEEE International Conference on Cognitive Informatics (ICCI 2008). IEEE, 2008, pp. 375–383.

[16] H. Fronthaler, K. Kollreider, and J. Bigun, “Automatic image quality assessment with application in biometrics,” in IEEE Workshop on Biometrics, in association with CVPR-06, 2006, pp. 30–35.

[17] Z. Li, Z. Han, and B. Fu, “A novel method for the fingerprint image quality evaluation,” in Proc. International Conference on Computational Intelligence and Software Engineering (CiSE 2009). IEEE, 2009, pp. 1–4.

[18] X. Tao, X. Yang, Y. Zang, X. Jia, and J. Tian, “A novel measure of fingerprint image quality using principal component analysis (PCA),” in Proc. 2012 5th IAPR International Conference on Biometrics (ICB). IEEE, 2012, pp. 170–175.

[19] C. Tada, A. Zaghetto, and B. Macchiavello, “Fingerprint image quality estimation using a fuzzy inference system,” in Proc. Seventh International Conference on Forensic Computer Science (ICoFCS 2012), vol. 10, 2012, p. C2012007.

[20] S. Xie, J. Yang, H. Gong, S. Yoon, and D. Park, “Intelligent fingerprint quality analysis using online sequential extreme learning machine,” Soft Computing – A Fusion of Foundations, Methodologies and Applications, vol. 16, no. 9, pp. 1555–1568, 2012.

[21] K. Phromsuthirak and V. Areekul, “Fingerprint quality assessment using frequency and orientation subbands of block-based Fourier transform,” in Proc. 2013 International Conference on Biometrics (ICB). IEEE, 2013, pp. 1–7.

[22] Z. Yao, J.-M. Le Bars, C. Charrier, and C. Rosenberger, “Fingerprint quality assessment combining blind image quality, texture and minutiae features,” in Proc. 2015 International Conference on Information Systems Security and Privacy (ICISSP), 2015.

[23] Z. Yao, J.-M. Le Bars, C. Charrier, and C. Rosenberger, “Literature review of fingerprint quality assessment and its evaluation,” IET Biometrics, vol. 5, no. 3, pp. 243–251, Sep. 2016.

[24] E. Lim, X. Jiang, and W. Yau, “Fingerprint quality and validity analysis,” in Proc. Int. Conf. on Image Processing, 2002, pp. 469–472.

[25] C. Gottschlich, “Curved-region-based ridge frequency estimation and curved Gabor filters for fingerprint image enhancement,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2220–2227, Apr. 2012.

[26] T. Chen, X. Jiang, and W. Yau, “Fingerprint image quality analysis,” in Proc. Int. Conf. on Image Processing, vol. 2, 2004, pp. 1253–1256.

[27] E. Lim, K. Toh, P. Suganthan, X. Jiang, and W. Yau, “Fingerprint image quality analysis,” in Proc. Int. Conf. on Image Processing, 2004, pp. 1241–1244.

[28] S. Joun, H. Kim, Y. Chung, and D. Ahn, “An experimental study on measuring image quality of infant fingerprints,” in Proc. KES, 2003, pp. 1261–1269.

[29] S. Yoon, K. Cao, E. Liu, and A. Jain, “LFIQ: Latent fingerprint image quality,” in Proc. BTAS, Arlington, VA, USA, Sep. 2013, pp. 1–8.

[30] R. F. Teixeira and N. J. Leite, “A new framework for quality assessment of high-resolution fingerprint images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 10, pp. 1905–1917, 2017.

[31] F. Alonso-Fernandez, F. Roli, G. Marcialis, J. Fierrez, J. Ortega-Garcia, and J. Gonzalez-Rodriguez, “Performance of fingerprint quality measures depending on sensor technology,” Journal of Electronic Imaging, vol. 17, no. 1, p. 011008, Jan. 2008.

[32] “NIST fingerprint quality (NFIQ),” https://www.nist.gov/services-resources/software/nist-biometric-image-software-nbis, 2015, accessed: 2017-12-04.

[33] “NFIQ 2.0 – NIST fingerprint image quality,” April 2016.

[34] M. Kass and A. Witkin, “Analyzing oriented patterns,” Computer Vision, Graphics, and Image Processing, vol. 37, no. 3, pp. 362–385, 1987.

[35] D. Maio, D. Maltoni, J. Wayman, and A. Jain, “FVC2000: Fingerprint verification competition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 402–412, 2002.

[36] E. Candes, L. Demanet, D. Donoho, and L. Ying, “Fast discrete curvelet transforms,” Multiscale Model. Simul., vol. 5, no. 3, pp. 861–899, 2006. [Online]. Available: http://dx.doi.org/10.1137/05064182X

[37] L. Borup and M. Nielsen, “Frame decomposition of decomposition spaces,” J. Fourier Anal. Appl., vol. 13, no. 1, pp. 39–70, 2007. [Online]. Available: https://doi.org/10.1007/s00041-006-6024-y

[38] D. Maltoni, D. Maio, A. K. Jain, and S. Prabhakar, Handbook of Fingerprint Recognition. Springer Science & Business Media, 2009.

[39] D. H. Thai and C. Gottschlich, “Directional global three-part image decomposition,” EURASIP Journal on Image and Video Processing, vol. 12, pp. 1–20, 2016.

This article has been accepted for publication in a future issue of IEEE Transactions on Information Forensics and Security, but has not been fully edited. Content may change prior to final publication.

Citation information: DOI 10.1109/TIFS.2018.2889258.

(c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.