
Privacy-preserving fingercode authentication

2010

Privacy-Preserving Fingercode Authentication∗†

Mauro Barni, Dipartimento di Ingegneria dell’Informazione, Università di Siena, barni@dii.unisi.it
Tiziano Bianchi, Dipartimento di Elettronica e Telecomunicazioni, Università di Firenze, tiziano.bianchi@unifi.it
Dario Catalano, Dipartimento di Matematica e Informatica, Università di Catania, catalano@dmi.unict.it
Mario Di Raimondo, Dipartimento di Matematica e Informatica, Università di Catania, diraimondo@dmi.unict.it
Ruggero Donida Labati, Dipartimento di Tecnologie dell’Informazione, Università di Milano, ruggero.donida@unimi.it
Pierluigi Failla, Dipartimento di Ingegneria dell’Informazione, Università di Siena, pierluigi.failla@gmail.com

ABSTRACT
We present a privacy-preserving protocol for fingerprint-based authentication. We consider a scenario where a client equipped with a fingerprint reader is interested in learning whether the acquired fingerprint belongs to the database of authorized entities managed by a server. For security, it is required that the client does not learn anything about the database and that the server does not get any information about the requested biometry or the outcome of the matching process. The proposed protocol follows a multi-party computation approach and makes extensive use of homomorphic encryption as the underlying cryptographic primitive. To keep the protocol complexity as low as possible, a particular representation of fingerprint images, named Fingercode, is adopted. Although previous works on privacy-preserving biometric identification focus on selecting the best-matching identity in the database, our main solution is a generic identification protocol that selects and reports all the enrolled identities whose distance to the user’s fingercode is under a given threshold. Variants for simple authentication purposes are provided. Our protocols achieve a notable bandwidth saving (about 8-24%) compared with the best previous work [1], and their computational complexity is still low and suitable for practical applications. Moreover, even if such protocols are presented in the context of a fingerprint-based system, they can be generalized to any biometric system that shares the same matching methodology, namely distance computation and thresholding.

Categories and Subject Descriptors
E.3 [Data Encryption]: Public key cryptosystems; H.2.0 [Database Management]: General - Security, integrity, and protection; K.4.1 [Computers and Society]: Public Policy Issues - Privacy

General Terms
Algorithms, Security

∗Sponsored by MIUR under project “Priv-Ware” (contract n. 2007JXH7ET).
†This is a revised version with a few corrections to Section 5 with respect to the proceedings version.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MM&Sec’10, September 9-10, 2010, Roma, Italy. Copyright 2010 ACM 978-1-4503-0286-9/10/09 ...$10.00.

1. INTRODUCTION
Biometric-based identification is receiving more and more attention as an extremely reliable way of identifying people. This interest is mainly due to the high reliability ensured by biometric identification, its universality (most biometric traits are such that every person owns them), uniqueness (it is very rare, if not impossible, that two persons own the same biometric trait), and permanence (good biometric traits are virtually time invariant) [2]. As a matter of fact, biometric templates are uniquely associated with each user and thus represent the strongest form of personally identifiable information.
For the same reason, however, the possibility that a biometric template could be stolen or exchanged raises concerns on its uses and abuses. In this sense, a common concern is the possibility that a government agency or a company which maintains personal data might monitor and track the actions and the behavior of each individual. Another basic concern is about the anonymity loss implied by the massive use of biometrics for identification or authentication purposes. Hence the widespread use of biometric systems asks for a careful policy specifying to which party biometric data can be revealed. It is even more important to note that the biometric matching process may involve a central server or be adopted in partially untrusted environments. It is therefore clear that developing techniques to process biometric data in a privacy-preserving way would have a great impact on the use of biometric-based authentication systems in every-day life. In this framework the use of multiparty computation and secure function evaluation techniques has been lately advanced as a viable way to process encrypted biometric data as they preserve the privacy of the parties involved in the computation. For example, the works in [3, 1] propose two privacy-preserving face recognition protocols that may be seen as the basis for a biometric-based identification service where faces are used as the underlying biometric trait. Although face images are widely used in many applications, they are known to be quite weak biometric traits. Therefore more reliable traits like fingerprint, iris code or DNA are likely to be used in applications that need higher reliability. In this paper, we propose a privacy-preserving system for fingerprint-based authentication. 
We consider the following typical scenario: a client C equipped with a specific biometric device (a fingerprint reader) is interested in learning whether the just-acquired fingerprint belongs to the database of authorized entities managed by a server S. For privacy, we require that the client trust the server to correctly perform the matching algorithm for fingerprint recognition, and that the client learn nothing about the database managed by the server beyond the outcome of the matching process. On the other hand, the server should not get any information about the requested biometry or the outcome of the matching process.

As to the template used to represent the user’s fingerprint, we adopt the Fingercode representation introduced in [4]. While other kinds of representations, notably those based on minutiae [5], are more common in practical applications, we chose the fingercode representation because it is better suited to implementation in a multi-party computation setting. In fact, after the feature extraction step, its matching phase needs only distance computation and thresholding. Furthermore, it keeps good accuracy when quantized to work with integer vectors.

While the works in [3, 1] focus on selecting the best-matching identity in the database managed by the server (giving the client a specific identifier), our main solution is an identification protocol that selects and reports the identifiers of all the enrolled identities (if more than one is present) whose distance to the user’s fingercode is lower than a given threshold. We also propose the following variants for authentication purposes. The first one considers applications where the client is interested only in knowing whether the user’s fingerprint is in the database or not (without an identifier).
The second one handles the case of a client who wishes to verify whether a given alleged identity is in the database and whether the just-acquired fingerprint matches that identity. These scenarios are detailed in the next sections.

Our protocols are entirely based on the use of homomorphic cryptosystems and achieve a notable bandwidth saving (about 8-24%) compared with the best previous work [1]. The computational complexity is still low and suitable for practical applications (as shown in Section 5). Moreover, even if such protocols are presented in the context of a fingerprint-based system, they can be generalized to any biometric system that shares the same matching methodology, namely distance computation and thresholding.

The rest of the paper is organized as follows. In Section 2, we briefly describe the Fingercode approach for fingerprint-based authentication. In Section 3 we give a rigorous description of the scenario considered in the paper. In Section 4 we introduce our protocols, whose complexity and security are analyzed in Section 5. A few technical parts are discussed in the appendices for lack of space.

2. FINGERCODE-BASED ID MATCHING
Various approaches for automatic fingerprint matching have been proposed in the literature. The most popular ones are based on the minutiae pattern of the fingerprint and are collectively called minutiae-based approaches [5]. Although rather different from one another, most of these methods require extensive preprocessing operations in order to reliably extract the minutia features [6]. Another class of fingerprint matching approaches directly matches the fingerprint images [7], or tries to match features extracted from the image by means of certain filtering or transform operations [8]. The algorithm this paper focuses on is based on a particular representation of the fingerprints which yields a relatively short, fixed-length code, called Fingercode [4], suitable for matching as well as storage on a smartcard.
The matching step is particularly simple since it boils down to the computation of the Euclidean distance between the to-be-matched fingercodes and its comparison against a threshold τ. The fingercode representation exploits some well-known peculiarities of fingerprints to generate a short fixed-length code while maintaining a high recognition accuracy. The fingercode representation was selected because of its good performance in terms of accuracy and speed, and because the template used and the matching step are particularly suited to a secure multi-party implementation.

The accuracy of the fingercode system is related to the quality of the input image and to the user’s experience in enrolling correctly. In the literature, implementations of the fingercode algorithm report Equal Error Rates in the range of 3-5% on standard fingerprint datasets [4, 9]. Experiments show that filter-based matchers such as the fingercode tend to perform slightly worse than state-of-the-art minutiae-based matchers on the same databases, but the fingercode matching function has a much lower computational complexity.

The feature extraction algorithm can be split into four main steps, as shown in Figure 1. Referring to that figure, given the gray-scale fingerprint image, the algorithm works as follows:
1. determine a reference point;
2. tessellate the region of interest around the reference point;
3. filter the region of interest in eight different directions using a bank of Gabor filters (more details in [10]);
4. compute the average absolute deviation from the mean of gray values in individual sectors in the filtered images to define the feature vector, or fingercode.

The result of the above procedure is the fingercode, which is a k-dimensional feature vector. Referring again to Figure 1, the top left disk represents the details for the 0° orientation of the Gabor filter, while the bottom right disk represents the 157.5° component.
Each disk corresponds to one particular Gabor filter that enhances the details in a given direction. In the original configuration, there are 5 bands and 16 sectors for each disk, which results in a total of 640 (5 × 16 × 8) components for each fingercode. Matching is easily achieved by computing the Euclidean distance between the to-be-matched fingercodes and comparing it against a matching threshold.

[Figure 1: System diagram of the fingerprint authentication system. Blocks shown: input image; locate the reference point; divide image in sectors; normalize each sector; filtering; compute A.A.D. feature; input fingercode; template fingercode; Euclidean distance; matching result.]

The translation invariance in the fingercode is due to the proper choice of the reference point. However, features are not rotationally invariant. An approximate rotation invariance is achieved by cyclically rotating the features in the fingercode itself. To do so, for each fingerprint in the database, it is necessary to store m fingercode templates corresponding to a given set of m rotations of the fingerprint image (usually m = 5). The input fingercode is matched against the m templates stored in the database to obtain m different matching scores. The matching scores are compared with a threshold τ, and if at least one of them is below τ the matching succeeds. To improve the matching accuracy, biometric systems may also use a personalized threshold ($\tau^i$) for each user, so as to ensure better accuracy and a lower false positive rate.

To adapt the plain version of this algorithm to work on encrypted data, we need to quantize the data involved in the computation. This is necessary because the cryptographic techniques work only on integers (i.e., on $\mathbb{Z}_n$). Moreover, we can also change some parameters (e.g., the number of bands and sectors) in order to reduce the length of the fingercode.
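As an illustration of the matching rule and the quantization step just described, the following is a minimal plaintext sketch. The function names, the uniform quantizer, and the `max_value` parameter are assumptions made for the example; the source does not specify how features are quantized.

```python
import math

def quantize(features, bits, max_value):
    """Uniformly map real features in [0, max_value] to integers in [0, 2**bits - 1].

    The uniform quantizer is an assumption of this sketch.
    """
    levels = (1 << bits) - 1
    return [min(levels, max(0, round(f / max_value * levels))) for f in features]

def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length integer vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def matches(x, templates, tau):
    """Matching succeeds if at least one of the m rotated templates
    lies at Euclidean distance below the threshold tau from x."""
    return any(math.sqrt(squared_distance(x, y)) < tau for y in templates)
```

A personalized threshold for a given user would simply be passed as `tau` when matching against that user's templates.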
There are several possible choices for such parameters; our approach is based on finding, via experiments, the solution that guarantees the lowest number of bits and components without significantly affecting the performance of the overall system. The subsequent privacy transformation will not affect the accuracy of the underlying biometric identification system.

3. OUR SCENARIO
In the following we discuss our scenario as well as some known constructions of privacy-preserving identification using biometric measurements. In particular we focus on the works of Erkin et al. [3] and of Sadeghi et al. [1], as our solution employs some basic building blocks already used in [3, 1]. In a nutshell, both works implement a secure face recognition protocol using standard Eigenfaces [11]. The main difference between the two solutions is that the former relies entirely on homomorphic encryption (HE) while the latter adopts a hybrid approach where Garbled Circuits (GC) are used in conjunction with HE.

In this paper we consider the following scenario: a client C is equipped with a biometric device (e.g., a fingerprint reader). The device is used to read some biometric data ID to be transmitted (in some encrypted form) to a server S where a database of authorized identities is stored. The privacy requirement that we impose is that C should not be able to get any information beyond whether ID is in the database or not (as stated later, in our work we also consider the support of more than one matching record). At the same time, S should not get any information about ID (not even whether it is in the database or not). In our scenario C owns a pair of (matching) keys $(pk_C, sk_C)$ for a public-key cryptosystem, and we assume that the server has a certified copy of $pk_C$ [footnote 1].

Footnote 1: Jumping ahead, the protocols of Section 4 require the use of two different encryption schemes (Paillier and EC-ElGamal, see below): to simplify the presentation we assume that $pk_C$ contains the different public keys of the required cryptosystems.

As in [3, 1], our solutions adopt the following three-step approach:
• vector extraction: in a first stage the target biometry (i.e., the information acquired by the biometric device) is “converted” into a quantized characteristic feature vector $\bar{x}$; in our specific case, the fingerprint image is processed as described in Section 2 in order to extract the fingercode vector; similar processes are available in the literature for other biometric systems;
• distances computation: the distances (with respect to some appropriate metric) between the target vector $\bar{x}$ and the vectors corresponding to each ID in the database are computed; in our case, we use the Euclidean distance as required by the fingercode system;
• selection of the matching identities: one (or more) IDs matching the target ID are selected.

Regarding the last step, in order to open new application scenarios, we slightly change the original semantics of the problem: instead of querying for the nearest matching enrolled identity in the database as in [3, 1], we are interested in getting all the matching enrolled identities. In other words, the required outcome for the client is the list of all the identities in the database whose characteristic feature vectors are “near enough” to be considered a successful match (i.e., the distance is lower than the threshold τ).

Are the two problems equivalent? With some biometric systems, if we assume well-chosen parameters (like the threshold τ), one may assume that a measure of a specific biometry matches just one person: the owner. If, for some application-related reason, the same person is enrolled in the database more than once, it should be fine to return all these identities to the client. However, for specific biometric systems or applications, the two problems may not be equivalent, or returning all the matches may not be desirable.
To adapt our main construction to the identification problem treated in [3, 1], we also propose two further variants in Section 4.4: one is suitable for applications where the client is only interested in knowing whether a specific biometry is in the database or not (without an identifier); the second one allows a client to verify whether a given alleged identity $\hat{id}$ is in the database and whether a given biometric measure matches that identity.

3.1 Parameters and Model
We will denote the symmetric security parameter by t and the asymmetric one (i.e., the bit-length of RSA moduli) by T. Recommended parameters for short-term security are t = 80 and T = 1024, whereas for long-term security t = 128 and T = 3072 are recommended [12].

In all the scenarios we consider a server S with a database of n enrolled entities, where each of them is represented by a characteristic feature vector of k ℓ-bit integers. We will denote by τ the biometric threshold that, given a specific metric, determines whether two biometric measures match or not. In order to support the specific matching logic of the fingercode (see Section 2), we will assume that for each enrolled identity m = 5 different vectors are stored in the database, possibly together with an identity-specific threshold $\tau^i$.

The values of the k, ℓ and m parameters must be tuned according to the fingerprint dataset at hand. For example, working with a dataset of n = 900 fingerprints captured with a standard fingerprint sensor, a suitable parameter configuration can be the following: 2 to 5 concentric bands, 4 to 16 sectors, 2 to 8 Gabor filters, quantized with 4 to 8 bits and stored with five different orientations (k = 16 to 640, ℓ = 4 to 8 and m = 5). Typical bit lengths of the fingercode range from 64 to 5120 bits.
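The parameter ranges quoted above can be checked with a few lines of arithmetic. The sketch below assumes, as in Section 2, that the number of components is k = bands × sectors × filters and that a fingercode occupies k·ℓ bits; the function name is hypothetical.

```python
def fingercode_size(bands, sectors, filters, bits):
    """Return (k, total_bits) for one fingercode: k components of `bits` bits each."""
    k = bands * sectors * filters
    return k, k * bits

# Smallest and largest configurations in the quoted ranges.
smallest = fingercode_size(bands=2, sectors=4, filters=2, bits=4)
largest = fingercode_size(bands=5, sectors=16, filters=8, bits=8)
```

With these inputs the two extremes reproduce the stated ranges, k = 16 to 640 and 64 to 5120 bits per fingercode.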
Finally, we work in the honest-but-curious model (as in [3, 1]), where parties are assumed to follow the protocol but may try to learn additional information from the protocol trace beyond what can be derived from the inputs and outputs of the algorithm when used as a black box.

4. OUR CONSTRUCTIONS
In this section we present our proposals to efficiently solve the problems introduced in Section 3. Our constructions strongly rely on the notion of (additively) Homomorphic Encryption (HE) schemes.

4.1 Homomorphic Encryption
A public-key cryptosystem is said to be additively homomorphic if, given the encryptions [footnote 2] $\llbracket a \rrbracket$ and $\llbracket b \rrbracket$, the ciphertext $\llbracket a + b \rrbracket$ can be easily computed as $\llbracket a + b \rrbracket = \llbracket a \rrbracket \circ \llbracket b \rrbracket$, where $\circ$ denotes some efficiently computable operator (e.g., plain multiplication on the underlying ring). As a consequence, it is also possible to compute the multiplication of an encryption $\llbracket a \rrbracket$ by a constant c in clear as $\llbracket c \cdot a \rrbracket = \llbracket a \rrbracket^c$.

Footnote 2: In the rest of the paper we will denote by $\llbracket x \rrbracket$ the encryption of the plaintext x; the public key used for the encryption is generally deducible from the context.

In our protocol we will make extensive use of semantically secure [footnote 3] additively homomorphic encryption schemes. In particular we will adopt Paillier’s encryption scheme and a variant of the well-known ElGamal encryption scheme (the latter is detailed in Appendix A).

4.1.1 Paillier cryptosystem.
Paillier presented in [13] the following efficient scheme. Let $N = pq$ be a T-bit RSA modulus, with p, q primes, and let g be an element of $\mathbb{Z}^*_{N^2}$ whose order is a multiple of N. The probabilistic encryption of a message $x \in \mathbb{Z}_N$ is computed as $\llbracket x \rrbracket = g^x r^N \bmod N^2$, where $r \in \mathbb{Z}_N$ is chosen at random. Such a scheme is clearly additively homomorphic: given $\llbracket x \rrbracket$ and $\llbracket y \rrbracket$ we have that $\llbracket x + y \rrbracket = \llbracket x \rrbracket \cdot \llbracket y \rrbracket$.
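To make the homomorphic properties concrete, here is a toy sketch of Paillier encryption. It is not the authors' implementation: the function names are hypothetical, the generator g = N + 1 is an assumed (commonly used) choice whose order in $\mathbb{Z}^*_{N^2}$ is N, and the small primes used in the usage note are only for illustration and offer no security. It requires Python 3.9 or later for `math.lcm`.

```python
import math
import secrets

def keygen(p, q):
    """Toy key generation from two distinct primes (assumed: g = N + 1)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)  # valid decryption constant when g = N + 1
    return (n, g), (lam, mu)

def _random_unit(n):
    """Pick r in [1, n-1] that is invertible modulo n."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r

def encrypt(pk, x):
    """[[x]] = g^x * r^N mod N^2 with fresh randomness r."""
    n, g = pk
    n2 = n * n
    return pow(g, x, n2) * pow(_random_unit(n), n, n2) % n2

def decrypt(pk, sk, c):
    """Recover x from c using L(u) = (u - 1) / N."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def add(pk, c1, c2):
    """[[x + y]] = [[x]] * [[y]] mod N^2."""
    n, _ = pk
    return c1 * c2 % (n * n)

def scalar_mul(pk, c, k):
    """[[k * x]] = [[x]]^k mod N^2 for a constant k known in clear."""
    n, _ = pk
    return pow(c, k, n * n)

def rerandomize(pk, c):
    """Multiply by r^N for a fresh r; the plaintext is unchanged."""
    n, _ = pk
    n2 = n * n
    return c * pow(_random_unit(n), n, n2) % n2
```

For example, with `pk, sk = keygen(1009, 1013)`, decrypting `add(pk, encrypt(pk, 20), encrypt(pk, 22))` recovers the sum of the two plaintexts, and `scalar_mul` plays the role of multiplication by a clear constant.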
The homomorphic operation is deterministic and, for security reasons, it will occasionally be necessary to “re-randomize” the resulting ciphertext $\llbracket z \rrbracket$ using a fresh random value $r \in \mathbb{Z}_N$ as $\llbracket z \rrbracket \cdot r^N$. Paillier’s scheme is efficient enough to be used in practice, but the ciphertext is twice as long as the original plaintext. The semantic security of the scheme is proven under the decisional composite residuosity assumption (DCRA) [13].

4.2 Vector Extraction
During a preliminary phase the acquired fingerprint image is converted into a quantized fingercode vector. We assume that this phase is done in clear (i.e., not in the encrypted domain) by the client. Notice that this is not an issue in our honest-but-curious setting, where the client (i.e., the biometric device) already has the fingerprint data. Moreover, given our current state of knowledge, such an assumption seems to be necessary for our protocol to be practical. Indeed, for many biometric systems (e.g., fingerprint, iris, ...) the analysis of the biometric measures, and their corresponding quantization process, are too complex to be efficiently done in the encrypted domain.

4.3 An HE-based Solution for the Identification of Matching Identities
Here we outline the details of our main construction. As stated above, we assume that the client C has already processed the fingerprint image to get a characteristic feature vector (fingercode) $\bar{x}$. On the other side, the server S manages a database of n pairs $(id_i, \bar{y}^i)$, where $id_i$ is a unique numeric identifier associated with the specific enrolled identity and $\bar{y}^i$ is the related precomputed feature vector. Our solution requires the use of specific values for these identifiers: $id_i = 2^i$ (powers of 2).

In this phase we deliberately ignore some technical details that are fingercode-specific. In particular we do not consider here the presence of m different fingercodes for each identity and the use of identity-specific thresholds $\tau^i$.
In this way we get a more general protocol that could be used for other biometric systems as well. Fingercode-specific aspects are discussed in Section 4.5.

4.3.1 Secure vector submission.
Following Erkin et al.’s approach, in the first step the client C sends an encrypted element-by-element version of the integer vector $\bar{x}$ to the server S: more specifically, k Paillier encryptions $\llbracket x_0 \rrbracket, \ldots, \llbracket x_{k-1} \rrbracket$ jointly with a further encryption $\llbracket \sum_{j=0}^{k-1} x_j^2 \rrbracket$. The latter value will be used to complete the computation of the distances in the ciphertext domain as described later.

Footnote 3: A cryptosystem is said to be semantically secure if it is infeasible for a “passive” adversary to derive significant information about the plaintext when it is given only the corresponding ciphertext and the public key used.

4.3.2 Computation of distances.
In this step, it is required to compute the distances between the target vector $\bar{x}$ and the characteristic feature vectors $\bar{y}^i$ extracted from each identity in the database. The fingercode system, as well as other biometric systems, uses the Euclidean distance as the underlying metric. In particular we consider the squared distance to reduce the complexity of the protocol [footnote 4]. Denoting by $D^i$ the square of the Euclidean distance between $\bar{x}$ and the stored vector $\bar{y}^i$, the server can non-interactively compute $\llbracket D^i \rrbracket$ by exploiting the homomorphic properties of the Paillier cryptosystem, its knowledge of $\bar{y}^i$ and the ciphertexts received from C as follows:

$$
\begin{aligned}
\llbracket D^i \rrbracket
&= \Big\llbracket \sum_{j=0}^{k-1} (x_j - y_j^i)^2 \Big\rrbracket \\
&= \Big\llbracket \sum_{j=0}^{k-1} x_j^2 \Big\rrbracket \cdot \Big\llbracket -2 \sum_{j=0}^{k-1} x_j y_j^i \Big\rrbracket \cdot \Big\llbracket \sum_{j=0}^{k-1} (y_j^i)^2 \Big\rrbracket \\
&= \Big\llbracket \sum_{j=0}^{k-1} x_j^2 \Big\rrbracket \cdot \prod_{j=0}^{k-1} \llbracket x_j \rrbracket^{-2 y_j^i} \cdot \Big\llbracket \sum_{j=0}^{k-1} (y_j^i)^2 \Big\rrbracket
\end{aligned}
$$

Footnote 4: We are of course using the fact that the square function is monotonically increasing on positive inputs. We implicitly assume that the threshold values $\tau, \tau^i$ are properly adapted to accommodate this.

4.3.3 Identities selection.
This step uses a protocol, which we call bit-MIN, that obliviously [footnote 5] computes an encryption of the predicate bit of “X < Y”: i.e., $\llbracket b \rrbracket = \text{bit-MIN}(\llbracket X \rrbracket, \llbracket Y \rrbracket)$ where b is such that $(b = 0 \Leftrightarrow X < Y)$. Here we assume that the server S holds the encryptions of the two inputs $\llbracket X \rrbracket, \llbracket Y \rrbracket$ and that it is interested in receiving the outcome $\llbracket b \rrbracket$. The security of our protocol requires that the client C should not learn anything about X, Y and that the returned predicate bit b should not be revealed to S. For this task we simplify a protocol given in [3], which in turn builds on a solution originally given by Damgård et al. [14]. In these schemes, in order to save bandwidth, a protocol-tailored variant of the Benaloh-Fischer cryptosystem [15] is adopted [footnote 6] as the underlying additively homomorphic encryption scheme. Our solution is simpler than [3] in that we can allow the client to know b [footnote 7]. For lack of space, it is described in Appendix A.

Footnote 5: Here by oblivious we mean that the output of the protocol is hidden from at least one of the two parties.

Footnote 6: We will not discuss the details of this cryptosystem here. The interested reader is referred to [14].

Footnote 7: The functionality of the bit-MIN sub-protocol is exploited in the work of Erkin et al. [3] in the identity selection phase. The security of their final solution requires an even stronger variant of it: the protocol has to be oblivious to both parties (i.e., not even the client can learn the bit b).

Once the distance computation phase is over, S holds the distances $\llbracket D^1 \rrbracket, \ldots, \llbracket D^n \rrbracket$. Consider the following n pairs: $(id_1, \llbracket D^1 \rrbracket), \ldots, (id_n, \llbracket D^n \rrbracket)$. The server randomly permutes [footnote 8] these pairs as $(id_{j_1}, \llbracket D^{j_1} \rrbracket), \ldots, (id_{j_n}, \llbracket D^{j_n} \rrbracket)$ and then computes, using parallel executions of bit-MIN, the values $\llbracket b^{j_i} \rrbracket = \text{bit-MIN}(\llbracket \tau \rrbracket, \llbracket D^{j_i} \rrbracket)$ for $i \in \{1, \ldots, n\}$. The following invariant holds:

$$
\forall i \in \{1, \ldots, n\}: \quad b^i = 1 \Leftrightarrow D^i < \tau \qquad (1)
$$

Finally, the server computes and returns to the client the following encrypted value:

$$
\llbracket R \rrbracket = \Big\llbracket \sum_{i=1}^{n} b^i \cdot id_i \Big\rrbracket = \prod_{i=1}^{n} \llbracket b^i \rrbracket^{id_i}
$$

As a consequence of invariant (1), it is easy to check that the final value R is the sum of the numeric identifiers associated with the enrolled identities that match the target biometry. In other words, the bit at position i in R is set to 1 if and only if the i-th identity matches. The client can easily extract R and reconstruct the list of matching identities. The complete protocol is shown in Figure 2.

Comparing our approach with that of Erkin et al. [3], our identities-selection methodology leads to a solution that is more efficient both in terms of round complexity (our protocol requires a constant number of rounds, as in [1]) and in terms of overall bit (bandwidth) complexity (see Section 5.1). Moreover, a further bandwidth saving comes from the fact that, in the bit-MIN protocol, we use additive ElGamal (over Elliptic Curves) as the underlying homomorphic encryption scheme, which is perfectly suited to our application. Additive ElGamal differs from the standard one [16] in that it is additively homomorphic. This comes at the cost of requiring a relatively small message space (for our application, however, this limitation is not a problem at all, as we use additive ElGamal to encrypt bits). Moreover, the scheme becomes extremely bandwidth-efficient when implemented over a suitably chosen Elliptic Curve [footnote 9]. In the following we will refer to this scheme as additively homomorphic EC-ElGamal (additional details are deferred to Appendix A).

4.4 Variants for Specific Scenarios
Here we collect a few variants of the novel protocol of Figure 2 suitable for some specific application scenarios.

4.4.1 Simple authentication.
In applications where the client is willing to accept a simple boolean outcome, like “authenticated/rejected”, our solution is functionally equivalent to the prior works in [3, 1]. It is sufficient to change the way the value $\llbracket R \rrbracket$ is computed:

$$
\llbracket R \rrbracket = \Big\llbracket r \cdot \sum_{i=1}^{n} b^i \Big\rrbracket = \Big( \prod_{i=1}^{n} \llbracket b^i \rrbracket \Big)^{r}
$$

where r is a fresh random integer. The client C will output rejected if R = 0, authenticated otherwise.

Footnote 8: The randomization is only used to hide from the client the relation between the positions of the identities in the DB and the order of querying. A fresh permutation is strictly required at each new session.

Footnote 9: For instance using the curves from the SECG standard [17, 18].

[Figure 2: An HE-based solution for the identification of matching identities (our main identification protocol).
Inputs of S: $pk_C$, $id_i = 2^i$, $\bar{y}^i$. Output for S: nothing.
Inputs of C: $pk_C$, $sk_C$, $\bar{x}$. Output for C: set of matching identities.
1. C sends to S: $\llbracket x_0 \rrbracket, \ldots, \llbracket x_{k-1} \rrbracket, \llbracket \sum_{j=0}^{k-1} x_j^2 \rrbracket$.
2. S computes, for all $i \in \{1, \ldots, n\}$: $\llbracket D^i \rrbracket = \llbracket \sum x_j^2 \rrbracket \cdot \prod_j \llbracket x_j \rrbracket^{-2 y_j^i} \cdot \llbracket \sum (y_j^i)^2 \rrbracket$.
3. S chooses a random permutation $j_1, \ldots, j_n$.
4. C and S run, for all $i \in \{1, \ldots, n\}$: $\llbracket b^{j_i} \rrbracket = \text{bit-MIN}(\llbracket \tau \rrbracket, \llbracket D^{j_i} \rrbracket)$; S gets the bit $\llbracket b^{j_i} \rrbracket$.
5. S sends to C: $\llbracket R \rrbracket = \prod_{i=1}^{n} \llbracket b^i \rrbracket^{id_i}$.
6. C extracts R and decodes it.]

4.4.2 Authentication with identity confirmation.
Consider the following high-security authentication scenario: the person who is going to authenticate is doubly checked through some kind of hardware token (or a simple card with a bar code) and some specific biometry (e.g., fingercode). In this case the client (the biometric reader) is able to send to the server an alleged identity $\hat{id}$ read from the hardware token. The final boolean outcome will be positive (authenticated) if and only if the submitted biometry matches one of the enrolled identities and that identity is the alleged identity $\hat{id}$. The corresponding protocol is shown in Figure 3.
After the computation of the encrypted distances $\llbracket D^i \rrbracket$, the server computes the auxiliary values

$$\llbracket m_i \rrbracket = \llbracket r_i \cdot (\hat{id} - id^i) \rrbracket = \big( \llbracket \hat{id} \rrbracket \cdot \llbracket id^i \rrbracket^{-1} \big)^{r_i}$$

where the $r_i$ are fresh random integers. All the values $m_i$ are different from zero except the one for the alleged identity $\hat{id}$. The values $\llbracket m_i \rrbracket$ are sent to the client during the executions of the sub-protocol bit-MIN: C returns the exact outcome $\llbracket b^i \rrbracket$ of bit-MIN only if the corresponding $m_i$ is not null, otherwise a dummy outcome $\llbracket 0 \rrbracket$ is sent. In this way only a single bit $b^i$ can be non-null, and only if it matches the alleged identity $\hat{id}$.

4.5 Fingercode-specific Adaptation

The fingercode biometric system has a few peculiarities that have to be specifically addressed in our proposals. The main issue comes from the biometric matching algorithm which, in order to keep the error rate low, requires storing $m$ different fingercodes (at different inclinations) for each enrolled identity. The algorithm considers the $m$ Euclidean distances between the target fingercode and these $m$ vectors: the final reference distance is the minimum one. In the case of a successful match with a given biometry, it is quite probable that more than one of the $m$ (quite similar) fingercodes has a distance under the threshold $\tau$. Consequently, if we use the same identifier $id^i$ for all the $m$ fingercode variants, the identification protocol of Figure 2 could yield as final outcome $R = k_{i_1} \cdot id^{i_1} + \dots + k_{i_q} \cdot id^{i_q}$, where $id^{i_1}, \dots, id^{i_q}$ are $q$ different matching identities and $k_{i_1}, \dots, k_{i_q}$ are constants that denote the number of matching fingercodes for each identity. Such an $R$ would not allow a reliable decoding. To overcome the problem we can use $n \cdot m$ different identifiers, one for each stored fingercode: each biometry has $m$ different identifiers with a publicly known relation. For example, the $i$-th biometry could use the identifiers $id^{mi} = 2^{mi}, id^{mi+1} = 2^{mi+1}, \dots, id^{mi+(m-1)} = 2^{mi+(m-1)}$. Given a matching identifier $id^w = 2^w$, a unique identifier for the biometry is $\lfloor w / m \rfloor$.

The two variants presented in Section 4.4 do not require $n \cdot m$ different identifiers: the former does not use identifiers at all, and in the latter the use of the same identifier for all the $m$ fingercodes does not raise specific issues.

As stated in Section 2, another peculiarity of the quantized version of the fingercode system is the possibility of using a personalized threshold $\tau^i$ for each enrolled identity. This is easily handled in our selection proposals: the threshold $\tau^i$ is stored in the database record and used in the invocation $\llbracket b^i \rrbracket = \text{bit-MIN}(\llbracket \tau^i \rrbracket, \llbracket D^i \rrbracket)$. It is important to note that the $\tau^i$ values (or any related information) for the non-matching identities are not revealed to the client.

4.6 More Scalability

The main proposal in Figure 2 strictly requires the use of powers of 2 as identifiers: in real application scenarios with a large set of enrolled people this could limit its scalability. Indeed, the maximum number of different identifiers is equal to the bit-length $T$ of the Paillier plaintext. For a security level of $t = 128$, we can handle at most $T = 3072$ different identifiers¹⁰. This can be handled by clustering the $n > T$ identifiers and using multiple outcomes $R_j$; more specifically, for $j = 0, \dots, \lceil n/T \rceil - 1$ the server computes

$$\llbracket R_j \rrbracket = \prod_{i=1}^{n_j} \llbracket b^{jT+i} \rrbracket^{2^i}$$

where $n_0 = T, n_1 = T, \dots, n_{\lceil n/T \rceil - 1} = n \bmod T$ are the cluster cardinalities. In this way the $i$-th bit of $R_j$ is associated with the identity $id^{jT+i}$. The ciphertexts $\llbracket R_j \rrbracket$ are sent to the client in the last exchange of the protocol. These changes to the protocol do not imply any further leakage of information. As stated above, this scalability issue on the number of identifiers does not apply to either of the variants presented in Section 4.4.

10 The fingercode system reduces this availability by a factor of $m$.
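On the client side, the decoding implied by Sections 4.5 and 4.6 is plain integer arithmetic on the decrypted outcomes. The sketch below assumes the identifier layout $id^{mi+v} = 2^{mi+v}$ for variant $v$ of biometry $i$, so that the integer quotient of $w$ by $m$ gives the biometry and the remainder gives the variant; it also assumes that bit $i$ of $R_j$ stands for identifier index $jT + i$. The function name and the small values of `T` and `m` are hypothetical and chosen only to keep the example readable.

```python
def decode_outcomes(outcomes, T, m):
    """Map decrypted cluster outcomes R_0, R_1, ... to {biometry: [matching variants]}."""
    matches = {}
    for j, R_j in enumerate(outcomes):
        for i in range(1, T + 1):
            if (R_j >> i) & 1:               # bit i of R_j <-> identifier index w = j*T + i
                w = j * T + i
                matches.setdefault(w // m, []).append(w % m)
    return matches


# Hypothetical toy setting: clusters of T = 8 identifiers, m = 2 fingercodes per identity.
outcomes = [(1 << 4) | (1 << 5), (1 << 3)]   # w = 4 and w = 5 in cluster 0, w = 11 in cluster 1
result = decode_outcomes(outcomes, T=8, m=2) # {2: [0, 1], 5: [1]}
```

Because every stored fingercode owns a distinct bit, several matching variants of the same biometry no longer collide in $R$: they simply show up as several entries under the same key.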
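A variant with identity confirmation
inputs of C: $pk_C$, $sk_C$, $\hat{id}$, $\bar{x}$; output for C: authenticated or rejected
inputs of S: $pk_C$, $id^i = 2^i$, $\bar{y}^i$; output for S: nothing
1. C → S: $\llbracket x_0 \rrbracket, \dots, \llbracket x_{k-1} \rrbracket, \llbracket \sum_j x_j^2 \rrbracket, \llbracket \hat{id} \rrbracket$.
2. S: for all $i \in \{1, \dots, n\}$ compute $\llbracket D^i \rrbracket = \llbracket \sum_j x_j^2 \rrbracket \cdot \prod_j \llbracket x_j \rrbracket^{-2 y_j^i} \cdot \llbracket \sum_j (y_j^i)^2 \rrbracket$.
3. S: for all $i \in \{1, \dots, n\}$ choose $r_i$ at random and compute $\llbracket m_i \rrbracket = \big( \llbracket \hat{id} \rrbracket \cdot \llbracket id^i \rrbracket^{-1} \big)^{r_i}$.
4. S: choose a random permutation $j_1, \dots, j_n$.
5. C ↔ S: for all $i \in \{1, \dots, n\}$ run $\llbracket b^{j_i} \rrbracket = \text{bit-MIN}(\llbracket \tau \rrbracket, \llbracket D^{j_i} \rrbracket)$, with S also sending $\llbracket m_{j_i} \rrbracket$; C: if $m_{j_i} = 0$ return $\llbracket 0 \rrbracket$, else return $\llbracket b^{j_i} \rrbracket$; S gets the bit $\llbracket b^{j_i} \rrbracket$.
6. S: compute $\llbracket R \rrbracket = \prod_{i=1}^{n} \llbracket b^i \rrbracket$.
7. S → C: $\llbracket R \rrbracket$.
8. C: if $R = 0$ reject, else authenticated.
Figure 3: A protocol for authentication with identity confirmation

5. ANALYSIS OF THE PROPOSALS

5.1 Bandwidth Usage

In order to evaluate the performance of our new proposals, we carry out an analytical comparison of the bandwidth usage of all the available solutions. More specifically, we consider our protocol in Figure 2 as well as the solutions in [3, 1]. Due to the different underlying biometric systems used in the prior works, we exclude the preliminary phase related to vector extraction. We consider a scenario where the client C holds an already-computed vector $\bar{x}$ of $k$ integers of $\ell$ bits each and the server manages a DB with $n$ identities. Below we use the maximum bit-length of the squared Euclidean distance computed on such vectors: $\ell' = 2\ell + \lceil \log_2 k \rceil + 1$.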
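As a quick worked check of the bound $\ell'$, the following sketch evaluates it for the fingercode parameters used in this analysis ($k = 16$ components of $\ell = 7$ bits) and compares it with the largest squared distance that such vectors can produce. The function names are illustrative only.

```python
def distance_bit_length(ell, k):
    """l' = 2*l + ceil(log2(k)) + 1, with ceil(log2(k)) computed on integers."""
    return 2 * ell + (k - 1).bit_length() + 1


def max_squared_distance(ell, k):
    """Largest squared Euclidean distance between two vectors of k l-bit integers."""
    return k * (2 ** ell - 1) ** 2


ell_prime = distance_bit_length(7, 16)      # 2*7 + 4 + 1 = 19
worst_case = max_squared_distance(7, 16)    # 16 * 127**2 = 258064
fits = worst_case.bit_length() <= ell_prime
```

The worst-case distance fits in 18 bits here, so $\ell' = 19$ is a safe bit-length for the inputs of bit-MIN in this scenario.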
The respective analyses are detailed below:

• our protocol: it requires the exchange of $k + 2$ Paillier encryptions and $n$ invocations of our bit-MIN sub-protocol; each invocation of the latter uses $2\ell'$ EC-ElGamal ciphertexts and 3 Paillier encryptions (see Section 4 and Appendix A); we also consider the use of the "packing" technique (see Appendix C in [1]) to save bandwidth during the parallel executions of the bit-MIN sub-protocol;

• Erkin et al.: their HE-based solution exchanges $k + 3$ Paillier ciphertexts, makes $n$ invocations of their fully oblivious bit-MIN sub-protocol and $n$ of a specific multiplication sub-protocol: the former requires $2\ell' + 1$ DGK ciphertexts and 3 Paillier ones; the latter makes use of 2 Paillier encryptions (see [3] for details);

• Sadeghi et al.: for this work we consider all the optimizations¹¹ described in [1]. Here the preliminary delivery of the encrypted vector uses $k + 1$ Paillier encryptions, then a sub-protocol for the data conversion is invoked: it exchanges a certain number of Paillier ciphertexts (fewer than $n$ thanks to packing; more details in [1]) and makes use of $n\ell'$ parallel executions of an Oblivious-Transfer (OT) protocol. The GC evaluation requires the exchange of the garbled values for the server's inputs ($n \cdot \ell'$ bits), $n \cdot 3\ell'$ table entries for the conversion protocol as well as $6\ell'(n - 1) + 3(n + 1)$ entries for the minimum selection. Each garbled value and each table entry requires $t + 1$ bits. For the OT protocol we consider the optimal implementations in [22] with the OT extension of [23].

The remaining global parameters are chosen according to a plausible fingercode-based scenario¹²: a DB with $n = 900 \cdot m$ fingercodes ($m = 5$ fingercodes for each identity) and vectors of $k = 16$ integers of $\ell = 7$ bits. The total bandwidth usage (in kilobytes) for the recommended [12] short/mid/long-term security parameters ($t = 80, 112, 128$) is reported in Table 1. The round complexity is reported in the last column. Our protocol provides a notable bandwidth saving: about 67–79% over Erkin et al. and 8–24% over Sadeghi et al. Moreover, the constant round complexity yields a better communication latency during the protocol executions.

Table 1: Bandwidth usage (in kilobytes) and rounds

protocol          t = 80     t = 112    t = 128    rounds
our protocol      9116.7     14034.8    17627.7    O(1)
Sadeghi et al.    11957.4    16693.5    19064.8    O(1)
Erkin et al.      27567.2    55134.5    82701.7    O(log n)

11 The analysis takes into account the following techniques: "free XOR" gates [19] for the "free" evaluation of XOR gates (no communication and negligible computation), garbled row reduction on non-XOR gates [20], point-and-permute [21] for fast GC evaluation, and the packing of multiple values in a Paillier ciphertext. They also consider the possibility of carrying out a heavy on-line (interactive) preprocessing phase between the client and the server. We exclude this feature from the analysis: it restricts the range of possible application scenarios.

12 Our advantage in bandwidth usage is not strictly related to these fingercode-specific parameters: we still enjoy optimal bandwidth usage with different parameters.

5.2 Time Efficiency

Our solution is mainly designed to save bandwidth and round complexity; nevertheless its computational efficiency is quite practical and comparable with the others. Its time complexity is clearly linear in the number of identities in the DB. We fully implemented our main protocol in order to gather information about its real efficiency: we consider a local run of the authentication protocol on a database of 64 identities with $m = 5$ fingercodes each ($n = 320$) with parameters $t = 80$, $k = 16$ and $\ell = 7$. Exploiting off-line precomputation¹³, one execution requires about 16 seconds (25 without precomputation) on a common PC (Intel Core 2 Duo at 2.4GHz).
For the sake of comparison with the previous works: Erkin et al. report an experiment where the distance computation and minimum extraction on 320 pre-computed feature vectors require about 18 seconds; the efficient solution of Sadeghi et al. requires about 8 seconds in the same setting.

5.3 Security Argument

In this section we sketch a security argument for our protocol in Figure 2. In particular we want to argue that, in the honest-but-curious setting, no party is able to get any information about the other party's input. In other words, the client C should not learn anything about the database held by S (beyond what is revealed by the functionality implemented by the protocol), whereas S should not learn anything about the fingercode and the outcome of the authentication process. We discuss each phase of the protocol separately. The vector extraction phase is done entirely by C, so no information is leaked to S. Security of the distance computation phase can be proved easily following the same approach used by Erkin et al. in [3] (recall that our distance computation protocol is the same as that used in [3]). It remains to discuss the phase that selects the matching identities. Intuitively, the protocol is private with respect to the server, as all the messages it receives are encrypted under C's public key (using a semantically secure cryptosystem). Things are a bit trickier for the client, as the latter knows the private key corresponding to the public key under which the ciphertexts are created. Still, we argue that this does not allow C to get more information than what is prescribed by the protocol. This is because, whenever C receives a ciphertext, the encrypted message has been masked by S with an information-theoretically secure mask. For instance, in the bit-MIN protocol (see Appendix A) C receives an encryption of $d$, which is statistically indistinguishable from a uniformly distributed random integer of $101 + \ell$ bits. As a final note we point out that even though C gets $b$¹⁴ in the clear at the end of the protocol bit-MIN, this is not a hazard (in an honest-but-curious setting) because all the pairs are randomly permuted by S before executing the bit-MIN protocol. The two variants in Section 4.4 enjoy the same security level: the client does not get any information about the other identities in the server's DB.

13 Such precomputation is completely non-interactive and is suitable for scenarios where the server knows in advance the identities of the clients that are allowed to initiate authentication transactions.

14 More specifically, the client C does not directly get $b$ but a related piece of information: the bit $\lambda$.

6. ADDITIONAL AUTHORS

Additional authors: Dario Fiore (École Normale Supérieure – CNRS-INRIA, email: dario.fiore@ens.fr), Riccardo Lazzeretti (Dip. di Ing. dell'Informazione – Università di Siena, email: riccardo.lazzeretti@gmail.com), Vincenzo Piuri (Dip. di Tecnologie dell'Informazione – Università di Milano, email: vincenzo.piuri@unimi.it), Fabio Scotti (Dip. di Tecnologie dell'Informazione – Università di Milano, email: fabio.scotti@unimi.it), Alessandro Piva (Dip. di Elettronica e Telecomunicazioni – Università di Firenze, email: alessandro.piva@unifi.it).

7. REFERENCES

[1] A. Sadeghi, T. Schneider, and I. Wehrenberg, "Efficient privacy-preserving face recognition," in ICISC '09: Proceedings of the 12th Annual International Conference on Information Security and Cryptology, ser. LNCS, vol. 5984. Springer-Verlag, December 2-4, 2009, pp. 235–253, full version available at http://eprint.iacr.org/2009/507.
[2] R. Bolle and S. Pankanti, Biometrics: Personal Identification in Networked Society, A. K. Jain, Ed. Norwell, MA, USA: Kluwer Academic Publishers, 1998.
[3] Z. Erkin, M. Franz, J. Guajardo, S. Katzenbeisser, I. Lagendijk, and T. Toft, "Privacy-preserving face recognition," in PETS '09: Proceedings of the 9th International Symposium on Privacy Enhancing Technologies. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 235–253.
[4] A. Jain, S. Prabhakar, L. Hong, and S. Pankanti, "Filterbank-based fingerprint matching," Image Processing, IEEE Transactions on, vol. 9, no. 5, pp. 846–859, May 2000.
[5] D. Maltoni, D. Maio, A. K. Jain, and S. Prabhakar, Handbook of Fingerprint Recognition. Springer Publishing Company, Incorporated, 2009.
[6] A. Jain, L. Hong, S. Pankanti, and R. Bolle, "An identity-authentication system using fingerprints," Proc. IEEE, 85, (9), pp. 1364–1388, 1997.
[7] C. L. Wilson, C. Watson, and E. Peak, "Effect of resolution and image quality on combined optical and neural network fingerprint matching," Pattern Recognit., 33, (2), pp. 317–331, 2000.
[8] C. Lee and S. Wang, "Fingerprint feature extraction using Gabor filters," Electron. Lett., 35, (4), pp. 288–290, 1999.
[9] H.-W. Sun, K.-Y. Lam, M. Gu, and J.-G. Sun, "An efficient algorithm for fingercode-based biometric identification," in OTM Workshops (1), 2006, pp. 469–478.
[10] A. Jain, S. Prabhakar, and L. Hong, "A multichannel approach to fingerprint classification," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 21, no. 4, pp. 348–359, Apr 1999.
[11] M. Turk and A. Pentland, "Face recognition using eigenfaces," in Computer Vision and Pattern Recognition, 1991. Proceedings CVPR '91., IEEE Computer Society Conference on, 1991, pp. 586–591.
[12] "NIST recommendation for key management," ser. NIST Special Publication, vol. 800-57, August 2005.
[13] P. Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in EUROCRYPT'99, ser. LNCS, J. Stern, Ed., vol. 1592. Springer-Verlag, Berlin, Germany, May 1999, pp. 223–238.
[14] I. Damgård, M. Geisler, and M. Krøigard, "Efficient and secure comparison for on-line auctions," in ACISP, ser. Lecture Notes in Computer Science, J. Pieprzyk, H. Ghodosi, and E. Dawson, Eds., vol. 4586. Springer, 2007, pp. 416–430.
[15] J. Cohen and M. Fischer, "A robust and verifiable cryptographically secure election scheme," in 26th FOCS. IEEE Computer Society Press, Oct. 1985.
[16] T. ElGamal, "A public key cryptosystem and a signature scheme based on discrete logarithms," in CRYPTO'84, ser. LNCS, G. R. Blakley and D. Chaum, Eds., vol. 196. Springer-Verlag, Berlin, Germany, Aug. 1985, pp. 10–18.
[17] "Standard for efficient cryptography, SEC1: Elliptic curves cryptography," Technical report, Certicom Research, 2000, available at http://www.secg.org.
[18] "Standard for efficient cryptography, SEC2: Recommended elliptic curves domain parameters," Technical report, Certicom Research, 2000, available at http://www.secg.org.
[19] V. Kolesnikov and T. Schneider, "Improved garbled circuit: Free XOR gates and applications," in ICALP '08: Proceedings of the 35th International Colloquium on Automata, Languages and Programming, Part II. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 486–498.
[20] B. Pinkas, T. Schneider, N. Smart, and S. Williams, "Secure two-party computation is practical," in ASIACRYPT, ser. Lecture Notes in Computer Science, M. Matsui, Ed., vol. 5912. Springer, 2009, pp. 250–267.
[21] D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella, "Fairplay - a secure two-party computation system," in SSYM'04: Proceedings of the 13th Conference on USENIX Security Symposium. Berkeley, CA, USA: USENIX Association, 2004, pp. 20–20.
[22] M. Naor and B. Pinkas, "Efficient oblivious transfer protocols," in SODA '01: Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2001, pp. 448–457.
[23] Y. Ishai, J. Kilian, K. Nissim, and E. Petrank, "Extending oblivious transfers efficiently," in CRYPTO 2003, ser. LNCS, D. Boneh, Ed., vol. 2729. Springer-Verlag, Berlin, Germany, Aug. 2003, pp. 145–161.

APPENDIX A.
THE SUB-PROTOCOL BIT-MIN

In this section we recall the sub-protocol bit-MIN used in Section 4, which is a variant of the one proposed in [3]. As in the rest of the paper, we consider a client C and a server S. The latter holds the encryptions $\llbracket X \rrbracket$ and $\llbracket Y \rrbracket$ of two $\ell$-bit integers¹⁵. The protocol bit-MIN allows S to compute the encrypted bit $\llbracket b \rrbracket$ such that $b = 0 \Leftrightarrow X < Y$, and uses as a building block a variant of the comparison protocol proposed by Damgård et al. in [14] (see the next section).

The protocol bit-MIN is given in Figure 4 and works as follows. As a first step the server homomorphically computes $\llbracket z \rrbracket = \llbracket 2^\ell + X - Y \rrbracket$. Since $X$ and $Y$ are $\ell$ bits long, $z$ is an integer of $\ell + 1$ bits. Moreover, the most significant bit of $z$ (which we denote $z_\ell$) is 0 if and only if $X < Y$. Thus, in order to learn whether $X < Y$, it suffices to compute $z_\ell$. This can be done as follows. S additively blinds $z$ with a suitable random value $r$, obtaining $\llbracket d \rrbracket = \llbracket z + r \rrbracket$. S sends $\llbracket d \rrbracket$ to C and then they run the sub-protocol DGK (see next section), after which S learns $\llbracket \lambda \rrbracket$ such that $\lambda = 1 \Leftrightarrow \hat{d} < \hat{r}$ (where $\hat{d}$ and $\hat{r}$ are, respectively, $d \bmod 2^\ell$ and $r \bmod 2^\ell$). The information about "$\hat{d} < \hat{r}$" is what is needed to compute $z_\ell$. In fact observe that:

$$b = z_\ell = 2^{-\ell}(z - \hat{z}) = 2^{-\ell}\big(z - ((d - r) \bmod 2^\ell)\big)$$

where $\hat{z} = z \bmod 2^\ell$ and it is possible to compute $(d - r) \bmod 2^\ell = (d \bmod 2^\ell) - (r \bmod 2^\ell) + \lambda \cdot 2^\ell$. Since $\lambda = 1$ exactly when $\hat{d} < \hat{r}$, that is, when the difference $\hat{d} - \hat{r}$ is negative and $2^\ell$ has to be added back, the correctness of $z_\ell$ follows.

A.1 The sub-protocol DGK

The DGK comparison protocol of [14] allows both parties (i.e. the client C and the server S) to learn the bit $\lambda$ of the predicate "$d < r$", where $d$ and $r$ are two $\ell$-bit integers owned by C and S respectively. The original DGK protocol is given in Figure 5 and works as follows. As in the other protocols, the client C has a pair of keys $(pk_C, sk_C)$ for an additively homomorphic cryptosystem: the original protocol uses the DGK [14] cryptosystem, whereas we use a different scheme, as stated later.
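We use a different notation for such ciphertexts: $[x]$. The inputs of the parties are, respectively, an $\ell$-bit integer $d$ for the client and another $\ell$-bit integer $r$ for the server. After the run of the original protocol, the server (as well as the client) learns the decision bit $\lambda$ of "$d < r$" (i.e. $\lambda = 1 \Leftrightarrow d < r$), while $d$ and $r$ remain hidden from the server and the client respectively. In our bit-MIN we use a slightly different version of this protocol where the client sends $\llbracket \lambda \rrbracket$ (encrypted with its Paillier public key) instead of $\lambda$: in this way the value of the decision bit remains hidden from the server.

The protocol consists of three rounds during which $2\ell$ ciphertexts are exchanged. In more detail, the server computes the values $[w_i] = [d_i \oplus r_i]$ and $[c_i] = [d_i - r_i + 1 + \sum_{j=i+1}^{\ell-1} w_j]$. The values $c_i$ carry the information on whether or not $d < r$; in particular, one of the $c_i$'s is 0 if and only if $d < r$. To see the correctness of this, consider all possible cases. If $d = r$, then we clearly have $c_i = 1$ for all $i = 0, \dots, \ell - 1$.

15 We assume that the encrypted input and output values of this protocol are produced with the Paillier public key of the client. Furthermore we note that, in the context of our protocol, bit-MIN is usually applied to inputs with a bit-length of $\ell' = 2\ell + \lceil \log_2 k \rceil + 1$. In this section we assume $\ell$-bit integers in order to simplify the protocol description.

The protocol bit-MIN
inputs of S: $pk_C$, $\llbracket X \rrbracket$, $\llbracket Y \rrbracket$; output for S: $\llbracket b \rrbracket$ such that $b = 0 \Leftrightarrow X < Y$
inputs of C: $pk_C$, $sk_C$; output for C: nothing
1. S: choose $r \in \{0, \dots, 2^{101+\ell} - 1\}$ at random and encrypt it as $\llbracket r \rrbracket$; compute $\llbracket z \rrbracket = \llbracket 2^\ell \rrbracket \llbracket X \rrbracket \llbracket Y \rrbracket^{-1}$ and $\llbracket d \rrbracket = \llbracket z \rrbracket \llbracket r \rrbracket$.
2. S → C: $\llbracket d \rrbracket$.
3. C: decrypt $\llbracket d \rrbracket$ to $d$, set $\hat{d} = d \bmod 2^\ell$ and encrypt it as $\llbracket \hat{d} \rrbracket$. S: set $\hat{r} = r \bmod 2^\ell$.
4. C ↔ S: run $\llbracket \lambda \rrbracket = \text{DGK}(\hat{d}, \hat{r})$; C also sends $\llbracket \hat{d} \rrbracket$.
5. S: encrypt $r \bmod 2^\ell$ as $\llbracket \hat{r} \rrbracket$, compute $\llbracket -\hat{d} \rrbracket = \llbracket \hat{d} \rrbracket^{-1}$ and $\llbracket b \rrbracket = \big( \llbracket z \rrbracket \llbracket -\hat{d} \rrbracket \llbracket \hat{r} \rrbracket \llbracket \lambda \rrbracket^{(-2^\ell)} \big)^{2^{-\ell}}$; output $\llbracket b \rrbracket$.
Figure 4: A protocol that outputs the encrypted predicate bit of "X < Y"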
If $d \neq r$, assume that the $m$-th bit (starting from the most significant) is the first one where they differ. Then $c_{\ell-1}, \dots, c_{m+1}$ are equal to 1 while $c_m = d_m - r_m + 1$ (as $\sum_{j=m+1}^{\ell-1} w_j = 0$). Moreover, since $w_m = 1$, we have $\sum_{j=i+1}^{\ell-1} w_j \geq 1$ and $c_i \geq 1$ for all $i \in \{0, \dots, m - 1\}$. Thus $c_m$ depends only on $d_m$ and $r_m$, and it is 0 only if $d_m < r_m$. Finally, since the $c_i$'s might contain information about $d$ and $r$, they are randomized (creating $e_i$) so that when the client decrypts $e_i$ it obtains either 0 (if $c_i = 0$) or a random value¹⁶. Therefore C sets $\lambda = 1$ if one of the $e_i$'s decrypts to 0. We refer the interested reader to [14] for more details about this protocol.

16 It is sufficient to check whether the plaintext is equal to 0: the DGK cryptosystem, as well as the one that we adopt, has a decryption procedure based on an exhaustive search in the plaintext space.

A.2 Our Changes

As stated in Section 4, we changed the protocol in Figure 5 in a few points: the final bit $\lambda$ is encrypted as $\llbracket \lambda \rrbracket$ in our solution, so that the bit value is hidden from the server S. Furthermore we use a different cryptosystem instead of the one used in [14, 3]. The chosen cryptosystem is a known variant of the well-known ElGamal cryptosystem [16]. This scheme differs from the original in two points: it is additively homomorphic and all the computation is carried out over a suitably chosen EC. We call it additively homomorphic EC-ElGamal. Let $p$ be the $2t$-bit prime order of the working field and $g$ a generator: the private key is a random value $a \in \mathbb{Z}_p$ and the public key is $(p, g, h = g^a)$. The encryption of a plaintext $x \in \mathbb{Z}_p$ is computed as the pair $(g^x h^r, g^r)$, where $r \in \mathbb{Z}_p$ is chosen at random. Given a ciphertext $(c_1, c_2)$, the original plaintext can be recovered by exhaustive search, as in DGK, on the value $c_1 / c_2^a$. The resulting cryptosystem is additively homomorphic.
The use of EC yields a large bandwidth saving: exploiting point compression [17], a ciphertext can be transmitted using $2 \cdot (2t + 1)$ bits. For example, for a security parameter $t = 80$, the ciphertext sizes for the considered cryptosystems are: Paillier 2048 bits, DGK [14] 1024 bits and EC-ElGamal 322 bits. On the other hand, the use of EC usually requires slightly more complex computations. These are the same EC groups exploited in the efficient OT implementations used in the GC-based protocols [1].

DGK Comparison Protocol
inputs of C: $pk_C$, $sk_C$, $d$; output for C: $\lambda$ such that $\lambda = 1 \Leftrightarrow d < r$
inputs of S: $pk_C$, $r$; output for S: $\lambda$ (in our version $\llbracket \lambda \rrbracket$)
1. C: extract the bits $d_0, \dots, d_{\ell-1}$ of $d$ and compute $[d_i]$ for all $i = 0, \dots, \ell - 1$.
2. C → S: $[d_0], \dots, [d_{\ell-1}]$.
3. S: extract the bits $r_0, \dots, r_{\ell-1}$ of $r$; for $i = 0$ to $\ell - 1$ compute $[w_i] = [d_i] \cdot [r_i] \cdot [d_i]^{-2 r_i}$.
4. S: generate random values $R_0, \dots, R_{\ell-1} \neq 0$; for $i = 0$ to $\ell - 1$ compute $[c_i] = [d_i] \cdot [1 - r_i] \cdot \prod_{j=i+1}^{\ell-1} [w_j]$ and $[e_i] = ([c_i]^{R_i})_{\text{re-rand}}$.
5. S: choose a random permutation $\pi(\cdot)$. S → C: $\pi([e_0], \dots, [e_{\ell-1}])$.
6. C: for $i = 0$ to $\ell - 1$ decrypt $[e_i]$ to $e_i$; if $0 \in \{e_0, \dots, e_{\ell-1}\}$ set $\lambda = 1$, else set $\lambda = 0$.
7. C → S: $\lambda$.
Figure 5: A protocol that publicly computes the predicate bit of "d < r".
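The arithmetic behind Figures 4 and 5 can be checked on plaintexts, leaving out every encryption, randomization and permutation step. The sketch below is only such a plaintext model, not the protocol itself: it takes $\lambda = 1$ exactly when $\hat{d} < \hat{r}$, as in the client step of Figure 5 and as required by the identity $(d - r) \bmod 2^\ell = \hat{d} - \hat{r} + \lambda \cdot 2^\ell$. The function names `dgk_bit` and `bit_min` are hypothetical.

```python
import random


def bits_lsb_first(v, ell):
    """Bits v_0 (least significant) .. v_{ell-1} (most significant) of v."""
    return [(v >> i) & 1 for i in range(ell)]


def dgk_bit(d, r, ell):
    """Plaintext model of Figure 5: returns 1 iff some c_i is zero, i.e. iff d < r."""
    db, rb = bits_lsb_first(d, ell), bits_lsb_first(r, ell)
    w = [di ^ ri for di, ri in zip(db, rb)]                 # w_i = d_i XOR r_i
    c = [db[i] - rb[i] + 1 + sum(w[i + 1:]) for i in range(ell)]
    return 1 if 0 in c else 0


def bit_min(X, Y, ell):
    """Plaintext model of Figure 4: returns b with b = 0 iff X < Y."""
    z = 2 ** ell + X - Y
    r = random.randrange(2 ** (101 + ell))                  # additive blinding value
    d = z + r
    d_hat, r_hat = d % 2 ** ell, r % 2 ** ell
    lam = dgk_bit(d_hat, r_hat, ell)                        # borrow bit: 1 iff d_hat < r_hat
    z_hat = d_hat - r_hat + lam * 2 ** ell                  # equals z mod 2^ell
    return (z - z_hat) >> ell                               # most significant bit z_ell


b_small = bit_min(3, 9, 4)    # 3 < 9, so b = 0
b_large = bit_min(9, 3, 4)    # 9 >= 3, so b = 1
```

Running the model over all pairs of small inputs is a cheap way to confirm that the borrow bit $\lambda$ is the only information S needs beyond $\hat{d}$ and $\hat{r}$ to recover $z_\ell$.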