BioSecure & COST 2101 – Smart Cards and Biometric – Lausanne, 2007
Sabah Jassim, University of Buckingham, UK
SecurePhone: A Multi-Modal Biometric Verifier for Constrained Devices
BioSecure + COST 2101 - March 2007
Outline
• The SecurePhone project
• Fusion approaches to biometric-based identification
• SecurePhone multi-modal biometric verifier
  • PDA implementation constraints
  • Modalities
  • Fusion strategy
• Performance: Match on Host (MoH) & Match on Card (MoC)
• Challenges and potential solutions
• Conclusion
The SecurePhone Project
Aims to produce a prototype of a new mobile communication system that enables biometrically authenticated users to conclude legally binding m-contracts during a mobile phone call, in an easy yet highly dependable and secure way, using a biometric recogniser that fuses face, voice and handwritten signature.
The SP consortium
SecurePhone aim 1: secure exchange
Secure PKI (Public Key Infrastructure)
Deal secure m-contracts during a mobile phone call
• secure: private key stored on SIM card
• user-friendly: intuitive, non-intrusive
• flexible: legally binding text/audio transactions
• dynamic: mobile e-signing “on the fly”
Project aim 2: biometric verification
[Diagram: face, voice and signature each pass through preprocessing and modelling; fusion uses client & impostor joint-score models; outcome is either accept user (release private key) or reject user]
Zero-Knowledge Authentication.
Implementation constraints
• The PDA's main processor is much slower than a PC's, so verification must be very efficient even on the PDA.
• The device's own applications gave an inadequate audio-visual sample rate (only 8 kHz audio and 10 fps video). This was improved: current SP sampling and real-time pre-processing run at 22 kHz audio and 20 fps video.
• Only data on the SIM is secure, so the biometric models/templates must be stored and processed on the SIM. Yet the SIM has very limited computational resources and processing support.
• SIM model storage is limited to 40 K, hence text-dependent prompts.
  Note: text-independent prompts or varied text-dependent prompts are more secure, but would require 200-400 K.
• Enrolment should be based on a short session (acceptability).
Voice verification (SU / GET ENST)
• Fixed 5-digit prompt: conceptually neutral, easily extendable, requires few Gaussians
• 22 kHz sampling
• Online energy-based non-speech frame removal
• MFCCs with online CMS and first time-difference features: slow to compute, but fixed point is faster than floating point
• Features modelled by a 100-Gaussian GMM pdf, with UBM for model initialisation and score normalisation
• Training on data from 2 indoor and 2 outdoor recordings from one session; testing on similar data from another session
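The front-end steps listed on this slide (energy-based frame removal, online CMS, first time differences) can be sketched as follows. This is a minimal illustration, not the SecurePhone code: the MFCC computation itself is omitted, and the function names, the peak-relative energy rule and the `ratio` parameter are assumptions made for the example.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def drop_low_energy_frames(frames, ratio=0.1):
    """Keep frames whose energy is at least `ratio` times the peak energy.

    The peak-relative rule and the default ratio are assumptions.
    """
    energies = [frame_energy(f) for f in frames]
    floor = ratio * max(energies)
    return [f for f, e in zip(frames, energies) if e >= floor]


def online_cms(vectors):
    """Cepstral mean subtraction with a running mean updated frame by frame."""
    mean = [0.0] * len(vectors[0])
    out = []
    for n, v in enumerate(vectors, start=1):
        # Update the running mean with the current frame, then subtract it.
        mean = [m + (x - m) / n for m, x in zip(mean, v)]
        out.append([x - m for x, m in zip(v, mean)])
    return out


def add_deltas(vectors):
    """Append first time differences (current minus previous frame).

    The first frame has no predecessor, so its differences are zero.
    """
    out = []
    prev = vectors[0]
    for v in vectors:
        out.append(list(v) + [x - p for x, p in zip(v, prev)])
        prev = v
    return out
```

Because the running mean only needs the frames seen so far, this form of CMS can run while the user is still speaking, which is the point of doing it online.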
Signature verification (GET INT)
• 2D coordinates (100 Hz) augmented by time-difference features, curvature, etc.: 19 features in total
  Note: no pressure or angles available, since the data come from the PDA’s touch screen, not from a writing pad
• Shift normalisation, but no rotation or scaling
• Features modelled by a 100-Gaussian GMM pdf; UBM used for model initialisation and score normalisation
• Fast to compute
• Training and testing on data from one session
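A minimal sketch of two of the signature steps above, assuming that shift normalisation means subtracting the centroid of the pen trajectory and that time-difference features are first differences scaled by the 100 Hz sampling rate. Both readings, and the function names, are assumptions for illustration; the remaining features (curvature, etc.) are not shown.

```python
def shift_normalise(points):
    """Translate (x, y) samples so that their centroid sits at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]


def velocities(points, rate_hz=100.0):
    """First time differences in units per second; the first sample gets zero."""
    out = [(0.0, 0.0)]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        out.append(((x1 - x0) * rate_hz, (y1 - y0) * rate_hz))
    return out
```

With this normalisation a signature written anywhere on the touch screen maps to the same coordinates, while its size and orientation are left untouched, matching "no rotation or scaling".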
Face Wavelet Feature Representation (BU)
The Discrete Wavelet Transform (DWT) decomposes an image into a set of frequency subbands with different resolutions.
At a resolution depth of k, the pyramidal scheme decomposes an image I into 3k + 1 subbands: (LLk, HLk, LHk, HHk, ..., HL1, LH1, HH1). The lowest-pass subband LLk is the k-level resolution approximation of the image I. The subbands HL1, LH1 and HH1 contain the finest-scale wavelet coefficients; the coefficients get coarser as the level increases, LLk being the coarsest.
Each subband of a DWT-decomposed face image represents the person’s face at a different frequency range and scale, i.e. a distinct stream for face recognition, with its own accuracy rate, that can be fused with the others for improved accuracy.
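The pyramidal scheme can be sketched with the Haar filter in plain Python. This is an illustrative sketch only: it uses averages rather than the orthonormal 1/sqrt(2) scaling, it assumes image sides divisible by 2 at every level, and the assignment of the names HL and LH to the two mixed subbands follows one common convention (conventions differ between texts).

```python
def haar_step(image):
    """One 2D Haar level on an image given as a list of rows.

    Returns (LL, HL, LH, HH), each half the size in both directions.
    """
    ll, hl, lh, hh = [], [], [], []
    for r in range(0, len(image), 2):
        top, bottom = image[r], image[r + 1]
        ll_row, hl_row, lh_row, hh_row = [], [], [], []
        for c in range(0, len(top), 2):
            a, b = top[c], top[c + 1]
            d, e = bottom[c], bottom[c + 1]
            ll_row.append((a + b + d + e) / 4.0)  # low-pass in both directions
            hl_row.append((a - b + d - e) / 4.0)  # difference between columns
            lh_row.append((a + b - d - e) / 4.0)  # difference between rows
            hh_row.append((a - b - d + e) / 4.0)  # difference in both directions
        ll.append(ll_row)
        hl.append(hl_row)
        lh.append(lh_row)
        hh.append(hh_row)
    return ll, hl, lh, hh


def haar_pyramid(image, depth):
    """Return the 3 * depth + 1 subbands as a dict keyed 'LLk', 'HLj', 'LHj', 'HHj'."""
    bands = {}
    approx = image
    for level in range(1, depth + 1):
        # Only the approximation is decomposed further at the next level.
        approx, hl, lh, hh = haar_step(approx)
        bands["HL%d" % level] = hl
        bands["LH%d" % level] = lh
        bands["HH%d" % level] = hh
    bands["LL%d" % depth] = approx
    return bands
```

Each level halves both sides, so a depth-k subband holds 1/4^k of the original pixel count.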
Face verification (BU)
• Static face recognition: 10 grey-scale images selected at random from a video, face area 160x192 pixels
• Histogram equalisation and z-score standardisation of features are applied as a simple, fast lighting normalisation.
• Haar wavelet low-low-4 (or low-high) subband as feature vector
  Note: other wavelet filters were tested, but Haar is the fastest to compute
• Features modelled by a GMM pdf with only 4 Gaussians; UBM used for model initialisation and score normalisation
• Training on data from 2 indoor and 2 outdoor recordings from one session; testing on similar data from another session
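The two normalisation steps named on this slide can be sketched as below. It is a hedged illustration: pixels are assumed to be integers in 0..255 given as a flat list, and z-score standardisation is assumed to be applied across the elements of one feature vector; the slide does not say which of these choices the project made. For scale, a 160x192 face area reduced to a depth-4 subband gives 10x12 = 120 coefficients.

```python
import math


def equalise_histogram(pixels, levels=256):
    """Map integer grey levels through the normalised cumulative histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Constant image: nothing to spread out.
        return list(pixels)
    return [round((cdf[p] - cdf_min) * (levels - 1) / (n - cdf_min)) for p in pixels]


def z_score(features):
    """Standardise a feature vector to zero mean and unit standard deviation."""
    mean = sum(features) / len(features)
    std = math.sqrt(sum((f - mean) ** 2 for f in features) / len(features))
    if std == 0.0:
        return [0.0] * len(features)
    return [(f - mean) / std for f in features]
```

Both steps are a single pass over the data with no per-user parameters, which fits the requirement that normalisation be simple and fast on the PDA.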
Fusion (GET INT)
• For each modality i: S(i) = log p(Xi|C) - log p(Xi|I), where C is the client model and I the impostor model
• Score fusion was tested by:
  • Optimal linear weighted sum: Fused-score = Σ_i w(i) * S(i), where the sum is taken over the 3 modalities
  • GMM score modelling, i.e. modelling both the client and the impostor joint-score pdfs by diagonal-covariance GMMs: Fused-score = log p(S|C) - log p(S|I), where S = (S(1), S(2), S(3)) is the joint score vector
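Both fusion rules can be sketched in a few lines. This is an illustration under assumptions: a GMM is represented here as a list of (weight, mean, variance) tuples with diagonal covariance, and all function names are invented for the example. How the weights w(i) and the GMM parameters are trained is outside the sketch.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_pdf(x, gmm):
    """Log density of a mixture; `gmm` is a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)
    # Log-sum-exp keeps the sum stable when the terms are very negative.
    return top + math.log(sum(math.exp(t - top) for t in terms))


def weighted_sum_fusion(scores, weights):
    """Linear fusion: sum over modalities of w(i) * S(i)."""
    return sum(w * s for w, s in zip(weights, scores))


def gmm_fusion(scores, client_gmm, impostor_gmm):
    """Fused score log p(S|C) - log p(S|I) on the joint score vector S."""
    return gmm_log_pdf(scores, client_gmm) - gmm_log_pdf(scores, impostor_gmm)
```

The weighted sum draws a single linear boundary in score space, whereas the GMM version can model curved boundaries between client and impostor score clouds at the cost of more parameters.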
User verification system
• User requests the PDA to verify their identity
• PDA requests the user to
  • read the prompt (face in box)
  • sign their signature
• Feature processing is applied to each modality [silence removal, histogram equalisation, MFCC or Haar wavelets, online CMS, delta features, etc.]
• For each modality: S(i) = log p(Xi|C) - log p(Xi|I)
• If S(i) < θ(i) for any i: please repeat; else fused-score = log p(S|C) - log p(S|I)
• If fused-score > φ: user accepted; else user rejected
[Screenshot of the PDA interface: digit prompt "7 9 8 5 1", "Press to start/stop speaking", start/stop button]
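The two-stage decision on this slide can be written as a small function. It is a sketch only: the function name, the string return values and the `fuse` callable (standing in for whichever fusion rule is used) are assumptions for illustration.

```python
def verify(scores, thresholds, fuse, accept_threshold):
    """Return 'repeat', 'accept' or 'reject' for one verification attempt.

    scores           -- per-modality scores S(i)
    thresholds       -- per-modality thresholds theta(i)
    fuse             -- callable mapping the score list to a fused score
    accept_threshold -- the global threshold phi
    """
    # Stage 1: any single modality below its threshold asks for a retry.
    for s, t in zip(scores, thresholds):
        if s < t:
            return "repeat"
    # Stage 2: fuse the scores and compare with the global threshold.
    return "accept" if fuse(scores) > accept_threshold else "reject"
```

For example, `verify([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], sum, 5.0)` uses a plain sum as a stand-in fusion rule.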
Speaking face & Forgery (GET ENST)
• Investigated possible attacks and forgery scenarios:
  • Using synthesised voice and face: difficult to create because of synchronisation problems
  • Replay attacks: devised a successful attack that presents the client's voice and face images, but not taken from the same video. Using a coupled HMM for voice and face greatly reduced the effect of this attack.
PDA Database (PDAtabase)
• Initial development used many databases [TIMIT(V), CSLU(V), BANCA(V,F), ORL(F), BIOMET(V,F,S), NIST(V)]
• A CSLU/BANCA-like database was then recorded on a Qtek2020 PDA for realistic conditions (sensors, environment)
• 60 English subjects: 24 for UBM, 18 for g1, 18 for g2. Accept/reject threshold optimised on g1 and evaluated on g2, and vice versa
• Video (voice + face): 18 prompts (5-digit, 10-digit and phrase); 3 sessions, with 2 inside and 2 outside recordings per session
• Signatures in one session, with 20 expert impostorisations for each
• Virtual couplings of audio-visual with signature data (independent)
• Automatic test script allows many possible configurations to be tested
• User just provides executables for feature modelling, score generation and score fusion
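The g1/g2 protocol (tune the threshold on one group, measure errors on the other) can be sketched as follows. The slide does not state the optimisation criterion; picking the threshold where false acceptance and false rejection rates are closest on the tuning group is an assumption, as are the function names.

```python
def error_rates(client_scores, impostor_scores, threshold):
    """Return (FRR, FAR) when scores strictly above the threshold are accepted."""
    frr = sum(1 for s in client_scores if s <= threshold) / len(client_scores)
    far = sum(1 for s in impostor_scores if s > threshold) / len(impostor_scores)
    return frr, far


def eer_threshold(client_scores, impostor_scores):
    """Pick the observed score at which |FAR - FRR| is smallest."""
    candidates = sorted(set(client_scores) | set(impostor_scores))

    def gap(threshold):
        frr, far = error_rates(client_scores, impostor_scores, threshold)
        return abs(far - frr)

    return min(candidates, key=gap)


def cross_group_rates(tune_group, test_group):
    """Tune on one (clients, impostors) pair, report (FRR, FAR) on the other."""
    threshold = eer_threshold(*tune_group)
    return error_rates(test_group[0], test_group[1], threshold)
```

Calling `cross_group_rates(g1, g2)` and then `cross_group_rates(g2, g1)` gives the two directions of the protocol; because the threshold is never tuned on the group it is tested on, the reported errors are not optimistically biased.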
Match on Host (MoH): complementarity of modalities

Modality         5 digits   10 digits
Voice (V)           6.1        3.4
Face (F)           28.6       29.9
Signature (S)       6.2        6.2
V + F               4.8        3.0
V + S               1.1        0.7
S + F               4.8        4.7
V + F + S           0.9        0.6

Result table with improved results for 5-digit and 10-digit prompts in PDAtabase (SPIE 2006).
Results are for the LL subband. Improved results have already been obtained for the LH subband!
Match on Card (MoC)
• Implementation of the MoH system on the SIM card (MoC)
  • No problem in terms of storage
  • But not feasible because of verification time (matching plus host/SIM communication = one hour)
• The verification time can be reduced by
  • reducing the vector size
  • reducing the frame rate
  • reducing the number of Gaussians of the client and background models
• Matching time was still not acceptable
MoC bottleneck
• Not in preprocessing, since this is still all done on the PDA, as in the MoH system.
• Not in face: although the feature vectors are large, only a few (10) of them are used in testing, and only 4 Gaussians are needed (client model and UBM).
• The bottleneck is caused by voice and signature data: the vectors are relatively small, but there is a large number of frames and a large number of Gaussians.
MoC solution
Only a drastic measure can solve the problem: globalised features
• Features represent the whole signature: a single vector of 41 parameters representing correlation and variation in x-y coordinates, velocity and acceleration parameters
• Idea generalised to voice: use of means (cf. Long-Term Average Spectrum) and standard deviations per vector parameter across all frames
• Works well for signature
• Improvement: use up to four equal subparts of the signature/voice signal
• Implementation: 2 equal subparts
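The voice variant of globalised features (per-parameter means and standard deviations over all frames, computed on equal subparts) can be sketched as below. This is a hedged illustration with invented function names; it does not reproduce the 41-parameter signature vector, whose exact definition is not given on the slide.

```python
import math


def global_stats(frames, use_std=True):
    """Collapse a list of frame vectors into per-parameter means (and sds)."""
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    if not use_std:
        return means
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds


def globalise(frames, parts=2, use_std=True):
    """Split the frames into `parts` consecutive, nearly equal segments and
    concatenate the global statistics of each segment into one vector.

    Requires at least `parts` frames.
    """
    vector = []
    total = len(frames)
    for p in range(parts):
        start = (p * total) // parts
        end = ((p + 1) * total) // parts
        vector.extend(global_stats(frames[start:end], use_std))
    return vector
```

The matcher on the SIM then scores one short vector per utterance instead of one vector per frame, which is where the saving in matching time comes from.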
MoC-emulated results
EER (percent) for globalised means (columns 2-5) and means plus standard deviations (columns 6-9), for voice and signature divided into two equal subparts

Global feat.   Means only                      Means + sd
#Gauss.          1       2       4       8       1       2       4       8
Voice          22.13   21.09   20.87   21.86   20.88   19.72   17.68   18.49
Face           32.26   31.78   29.06   29.19   32.26   31.78   29.06   29.19
Signature      38.29   27.58   22.58   17.86   28.14   22.16   17.59   16.45
Fused          12.89   12.48   10.49    9.32   12.56   10.48    8.28    9.15
Solving the capacity problem
Possible options for improving the performance of the SecurePhone:
• Use match-on-server (MoS): raises security and privacy concerns.
• Implement the biometric recogniser and encryption on a chip (more costly than the current solution).
• Build a secure PDA with sufficient storage and processing power (a dedicated device that would be more costly and less ubiquitous).
• Split matching (hybrid MoC/MoH): considered but not implemented. Initial work is being done and results are encouraging. Promising implications for the security and privacy of biometric data (templates/models) without cryptography.
Conclusion and Future Work
• Natural, non-intrusive biometrics guarantee high user acceptance
• Biometric data never leave the SIM card: high security
• Fusion of multiple streams of a single trait can lead to improved performance (a pilot for face was tested but not implemented in SP)
• MoH is efficient with high accuracy, but vulnerable.
• MoC is secure, but efficiency and high accuracy cannot be achieved together!
Future work includes:
• Designing hybrid mixed client-server matching.
• Investigating the privacy and security of biometric data using cancellable biometrics, especially for “Match on Server”.
• Improving the performance of single modalities through multi-classifier and multi-stream strategies, e.g. face by mixing a larger number of subbands at different depths.
Acknowledgement
Thanks to the EU for funding this research through the SecurePhone (IST-2002-506883) project.