Face recognition: A literature review

A.A. El-Harby

Face recognition: A literature review

2005

Abstract—The task of face recognition has been actively researched in recent years. This paper provides an up-to-date review of major human face recognition research. We first present an overview of face recognition and its applications. Then, a literature review of the most recent face recognition techniques is presented. Description and limitations of face databases which are used to test the performance of these face recognition algorithms are given. A brief summary of the face recognition vendor test (FRVT) 2002, a large scale evaluation of automatic face recognition technology, and its conclusions are also given. Finally, we give a summary of the research results. Keywords—Combined classifiers, face recognition, graph matching, neural networks. I. INTRODUCTION ACE recognition is an important research problem spanning numerous fields and disciplines. This because face recognition, in additional to having numerous practical applications such as bankcard identification, access control, Mug shots searching, security monitoring, and surveillance system, is a fundamental human behaviour that is essential for effective communications and interactions among people. A formal method of classifying faces was first proposed in [1]. The author proposed collecting facial profiles as curves, finding their norm, and then classifying other profiles by their deviations from the norm. This classification is multi-modal, i.e. resulting in a vector of independent measures that could be compared with other vectors in a database. Progress has advanced to the point that face recognition systems are being demonstrated in real-world settings [2]. The rapid development of face recognition is due to a combination of factors: active development of algorithms, the availability of a large databases of facial images, and a method for evaluating the performance of face recognition algorithms. In the literatures, face recognition problem can be formulated as: given static (still) or video images of a scene, identify or verify one or more persons in the scene by comparing with faces stored in a database. When comparing person verification to face recognition, there are several aspects which differ. First, a client – an authorized user of a personal identification system – is Manuscript received February 22, 2005. A. S. Tolba is with the Information Systems Department, Mansoura University, Egypt, (e-mail: tolba1954@)yahoo.com). A. H. EL-Baz is with the Mathematics Department, Damietta Faculty of Science, New Damietta, Egypt, and doing PhD research on pattern recognition (phone: 0020-57-403980; Fax: 0020-57–403868; e-mail: ali_elbaz@yahoo.com). A. H. EL-Harby is with the Mathematics Department, Damietta Faculty of Science, New Damietta, Egypt, (e-mail: elharby@yahoo.co.uk). assumed to be co-operative and makes an identity claim. Computationally this means that it is not necessary to consult the complete set of database images (denoted model images below) in order to verify a claim. An incoming image (referred to as a probe image) is thus compared to a small number of model images of the person whose identity is claimed and not, as in the recognition scenario, with every image (or some descriptor of an image) in a potentially large database. Second, an automatic authentication system must operate in near-real time to be acceptable to users. Finally, in recognition experiments, only images of people from the training database are presented to the system, whereas the case of an imposter (most likely a previously unseen person) is of utmost importance for authentication. Face recognition is a biometric approach that employs automated methods to verify or recognize the identity of a living person based on his/her physiological characteristics. In general, a biometric identification system makes use of either physiological characteristics (such as a fingerprint, iris pattern, or face) or behaviour patterns (such as hand-writing, voice, or key-stroke pattern) to identify a person. Because of human inherent protectiveness of his/her eyes, some people are reluctant to use eye identification systems. Face recognition has the benefit of being a passive, non intrusive system to verify personal identity in a “natural” and friendly way. In general, biometric devices can be explained with a three- step procedure (1) a sensor takes an observation. The type of sensor and its observation depend on the type of biometric devices used. This observation gives us a “Biometric Signature” of the individual. (2) a computer algorithm “normalizes” the biometric signature so that it is in the same format (size, resolution, view, etc.) as the signatures on the system’s database. The normalization of the biometric signature gives us a “Normalized Signature” of the individual. (3) a matcher compares the normalized signature with the set (or sub-set) of normalized signatures on the system's database and provides a “similarity score” that compares the individual's normalized signature with each signature in the database set (or sub-set). What is then done with the similarity scores depends on the biometric system’s application? Face recognition starts with the detection of face patterns in sometimes cluttered scenes, proceeds by normalizing the face images to account for geometrical and illumination changes, possibly using information about the location and appearance of facial landmarks, identifies the faces using appropriate classification algorithms, and post processes the results using model-based schemes and logistic feedback [3]. The application of face recognition technique can be categorized into two main parts: law enforcement application and commercial application. Face recognition technology is Face Recognition: A Literature Review A. S. Tolba, A.H. El-Baz, and A.A. El-Harby F International Journal of Signal Processing Volume 2 Number 2 88

primarily used in law enforcement applications, especially Mug shot albums (static matching) and video surveillance (real-time matching by video image sequences). The commercial applications range from static matching of photographs on credit cards, ATM cards, passports, driver’s licenses, and photo ID to real-time matching with still images or video image sequences for access control. Each application presents different constraints in terms of processing. All face recognition algorithms consistent of two major parts: (1) face detection and normalization and (2) face identification. Algorithms that consist of both parts are referred to as fully automatic algorithms and those that consist of only the second part are called partially automatic algorithms. Partially automatic algorithms are given a facial image and the coordinates of the center of the eyes. Fully automatic algorithms are only given facial images. On the other hand, the development of face recognition over the past years allows an organization into three types of recognition algorithms, namely frontal, profile, and view- tolerant recognition, depending on the kind of images and the recognition algorithms. While frontal recognition certainly is the classical approach, view-tolerant algorithms usually perform recognition in a more sophisticated fashion by taking into consideration some of the underlying physics, geometry, and statistics. Profile schemes as stand-alone systems have a rather marginal significance for identification, (for more detail see [4]). However, they are very practical either for fast coarse pre-searches of large face database to reduce the computational load for a subsequent sophisticated algorithm, or as part of a hybrid recognition scheme. Such hybrid approaches have a special status among face recognition systems as they combine different recognition approaches in an either serial or parallel order to overcome the shortcoming of the individual components. Another way to categorize face recognition techniques is to consider whether they are based on models or exemplars. Models are used in [5] to compute the Quotient Image, and in [6] to derive their Active Appearance Model. These models capture class information (the class face), and provide strong constraints when dealing with appearance variation. At the other extreme, exemplars may also be used for recognition. The ARENA method in [7] simply stores all training and matches each one against the task image. As far we can tell, current methods that employ models do not use exemplars, and vice versa. This is because these two approaches are by no means mutually exclusive. Recently, [8] proposed a way of combining models and exemplars for face recognition. In which, models are used to synthesize additional training images, which can then be used as exemplars in the learning stage of a face recognition system. Focusing on the aspect of pose invariance, face recognition approaches may be divided into two categories: (i) global approach and (ii) component-based approach. In global approach, a single feature vector that represents the whole face image is used as input to a classifier. Several classifiers have been proposed in the literature e.g. minimum distance classification in the eigenspace [9,10], Fisher’s discriminant analysis [11], and neural networks [12]. Global techniques work well for classifying frontal views of faces. However, they are not robust against pose changes since global features are highly sensitive to translation and rotation of the face. To avoid this problem an alignment stage can be added before classifying the face. Aligning an input face image with a reference face image requires computing correspondence between the two face images. The correspondence is usually determined for a small number of prominent points in the face like the center of the eye, the nostrils, or the corners of the mouth. Based on these correspondences, the input face image can be warped to a reference face image. In [13], an affine transformation is computed to perform the warping. Active shape models are used in [14] to align input faces with model faces. A semi-automatic alignment step in combination with support vector machines classification was proposed in [15]. An alternative to the global approach is to classify local facial components. The main idea of component based recognition is to compensate for pose changes by allowing a flexible geometrical relation between the components in the classification stage. In [16], face recognition was performed by independently matching templates of three facial regions (eyes, nose and mouth). The configuration of the components during classification was unconstrained since the system did not include a geometrical model of the face. A similar approach with an additional alignment stage was proposed in [17]. In [18], a geometrical model of a face was implemented by a 2D elastic graph. The recognition was based on wavelet coefficients that were computed on the nodes of the elastic graph. In [19], a window was shifted over the face image and the DCT coefficients computed within the window were fed into a 2D Hidden Markov Model. Face recognition research still face challenge in some specific domains such as pose and illumination changes. Although numerous methods have been proposed to solve such problems and have demonstrated significant promise, the difficulties still remain. For these reasons, the matching performance in current automatic face recognition is relatively poor compared to that achieved in fingerprint and iris matching, yet it may be the only available measuring tool for an application. Error rates of 2-25% are typical. It is effective if combined with other biometric measurements. Current systems work very well whenever the test image to be recognized is captured under conditions similar to those of the training images. However, they are not robust enough if there is variation between test and training images [20]. Changes in incident illumination, head pose, facial expression, hairstyle (include facial hair), cosmetics (including eyewear) and age, all confound the best systems today. As a general rule, we may categorize approaches used to cope with variation in appearance into three kinds: invariant features, canonical forms, and variation- modeling. The first approach seeks to utilize features that are invariant to the changes being studied. For instance, the Quotient Image [5] is (by construction) invariant to illumination and may be used to recognize faces (assumed to be Lambertian) when lighting conditions change. The second approach attempts to “normalize” away the variation, either by clever image transformations or by synthesizing a new image (from the given test image) in some International Journal of Signal Processing Volume 2 Number 2 89

International Journal of Signal Processing Volume 2 Number 2 Face Recognition: A Literature Review A. S. Tolba, A.H. El-Baz, and A.A. El-Harby assumed to be co-operative and makes an identity claim. Computationally this means that it is not necessary to consult the complete set of database images (denoted model images below) in order to verify a claim. An incoming image (referred to as a probe image) is thus compared to a small number of model images of the person whose identity is claimed and not, as in the recognition scenario, with every image (or some descriptor of an image) in a potentially large database. Second, an automatic authentication system must operate in near-real time to be acceptable to users. Finally, in recognition experiments, only images of people from the training database are presented to the system, whereas the case of an imposter (most likely a previously unseen person) is of utmost importance for authentication. Face recognition is a biometric approach that employs automated methods to verify or recognize the identity of a living person based on his/her physiological characteristics. In general, a biometric identification system makes use of either physiological characteristics (such as a fingerprint, iris pattern, or face) or behaviour patterns (such as hand-writing, voice, or key-stroke pattern) to identify a person. Because of human inherent protectiveness of his/her eyes, some people are reluctant to use eye identification systems. Face recognition has the benefit of being a passive, non intrusive system to verify personal identity in a “natural” and friendly way. In general, biometric devices can be explained with a threestep procedure (1) a sensor takes an observation. The type of sensor and its observation depend on the type of biometric devices used. This observation gives us a “Biometric Signature” of the individual. (2) a computer algorithm “normalizes” the biometric signature so that it is in the same format (size, resolution, view, etc.) as the signatures on the system’s database. The normalization of the biometric signature gives us a “Normalized Signature” of the individual. (3) a matcher compares the normalized signature with the set (or sub-set) of normalized signatures on the system's database and provides a “similarity score” that compares the individual's normalized signature with each signature in the database set (or sub-set). What is then done with the similarity scores depends on the biometric system’s application? Face recognition starts with the detection of face patterns in sometimes cluttered scenes, proceeds by normalizing the face images to account for geometrical and illumination changes, possibly using information about the location and appearance of facial landmarks, identifies the faces using appropriate classification algorithms, and post processes the results using model-based schemes and logistic feedback [3]. The application of face recognition technique can be categorized into two main parts: law enforcement application and commercial application. Face recognition technology is Abstract—The task of face recognition has been actively researched in recent years. This paper provides an up-to-date review of major human face recognition research. We first present an overview of face recognition and its applications. Then, a literature review of the most recent face recognition techniques is presented. Description and limitations of face databases which are used to test the performance of these face recognition algorithms are given. A brief summary of the face recognition vendor test (FRVT) 2002, a large scale evaluation of automatic face recognition technology, and its conclusions are also given. Finally, we give a summary of the research results. Keywords—Combined classifiers, matching, neural networks. F face recognition, graph I. INTRODUCTION ACE recognition is an important research problem spanning numerous fields and disciplines. This because face recognition, in additional to having numerous practical applications such as bankcard identification, access control, Mug shots searching, security monitoring, and surveillance system, is a fundamental human behaviour that is essential for effective communications and interactions among people. A formal method of classifying faces was first proposed in [1]. The author proposed collecting facial profiles as curves, finding their norm, and then classifying other profiles by their deviations from the norm. This classification is multi-modal, i.e. resulting in a vector of independent measures that could be compared with other vectors in a database. Progress has advanced to the point that face recognition systems are being demonstrated in real-world settings [2]. The rapid development of face recognition is due to a combination of factors: active development of algorithms, the availability of a large databases of facial images, and a method for evaluating the performance of face recognition algorithms. In the literatures, face recognition problem can be formulated as: given static (still) or video images of a scene, identify or verify one or more persons in the scene by comparing with faces stored in a database. When comparing person verification to face recognition, there are several aspects which differ. First, a client – an authorized user of a personal identification system – is Manuscript received February 22, 2005. A. S. Tolba is with the Information Systems Department, Mansoura University, Egypt, (e-mail: tolba1954@)yahoo.com). A. H. EL-Baz is with the Mathematics Department, Damietta Faculty of Science, New Damietta, Egypt, and doing PhD research on pattern recognition (phone: 0020-57-403980; Fax: 0020-57–403868; e-mail: ali_elbaz@yahoo.com). A. H. EL-Harby is with the Mathematics Department, Damietta Faculty of Science, New Damietta, Egypt, (e-mail: elharby@yahoo.co.uk). 88 International Journal of Signal Processing Volume 2 Number 2 primarily used in law enforcement applications, especially Mug shot albums (static matching) and video surveillance (real-time matching by video image sequences). The commercial applications range from static matching of photographs on credit cards, ATM cards, passports, driver’s licenses, and photo ID to real-time matching with still images or video image sequences for access control. Each application presents different constraints in terms of processing. All face recognition algorithms consistent of two major parts: (1) face detection and normalization and (2) face identification. Algorithms that consist of both parts are referred to as fully automatic algorithms and those that consist of only the second part are called partially automatic algorithms. Partially automatic algorithms are given a facial image and the coordinates of the center of the eyes. Fully automatic algorithms are only given facial images. On the other hand, the development of face recognition over the past years allows an organization into three types of recognition algorithms, namely frontal, profile, and viewtolerant recognition, depending on the kind of images and the recognition algorithms. While frontal recognition certainly is the classical approach, view-tolerant algorithms usually perform recognition in a more sophisticated fashion by taking into consideration some of the underlying physics, geometry, and statistics. Profile schemes as stand-alone systems have a rather marginal significance for identification, (for more detail see [4]). However, they are very practical either for fast coarse pre-searches of large face database to reduce the computational load for a subsequent sophisticated algorithm, or as part of a hybrid recognition scheme. Such hybrid approaches have a special status among face recognition systems as they combine different recognition approaches in an either serial or parallel order to overcome the shortcoming of the individual components. Another way to categorize face recognition techniques is to consider whether they are based on models or exemplars. Models are used in [5] to compute the Quotient Image, and in [6] to derive their Active Appearance Model. These models capture class information (the class face), and provide strong constraints when dealing with appearance variation. At the other extreme, exemplars may also be used for recognition. The ARENA method in [7] simply stores all training and matches each one against the task image. As far we can tell, current methods that employ models do not use exemplars, and vice versa. This is because these two approaches are by no means mutually exclusive. Recently, [8] proposed a way of combining models and exemplars for face recognition. In which, models are used to synthesize additional training images, which can then be used as exemplars in the learning stage of a face recognition system. Focusing on the aspect of pose invariance, face recognition approaches may be divided into two categories: (i) global approach and (ii) component-based approach. In global approach, a single feature vector that represents the whole face image is used as input to a classifier. Several classifiers have been proposed in the literature e.g. minimum distance classification in the eigenspace [9,10], Fisher’s discriminant analysis [11], and neural networks [12]. Global techniques work well for classifying frontal views of faces. However, they are not robust against pose changes since global features are highly sensitive to translation and rotation of the face. To avoid this problem an alignment stage can be added before classifying the face. Aligning an input face image with a reference face image requires computing correspondence between the two face images. The correspondence is usually determined for a small number of prominent points in the face like the center of the eye, the nostrils, or the corners of the mouth. Based on these correspondences, the input face image can be warped to a reference face image. In [13], an affine transformation is computed to perform the warping. Active shape models are used in [14] to align input faces with model faces. A semi-automatic alignment step in combination with support vector machines classification was proposed in [15]. An alternative to the global approach is to classify local facial components. The main idea of component based recognition is to compensate for pose changes by allowing a flexible geometrical relation between the components in the classification stage. In [16], face recognition was performed by independently matching templates of three facial regions (eyes, nose and mouth). The configuration of the components during classification was unconstrained since the system did not include a geometrical model of the face. A similar approach with an additional alignment stage was proposed in [17]. In [18], a geometrical model of a face was implemented by a 2D elastic graph. The recognition was based on wavelet coefficients that were computed on the nodes of the elastic graph. In [19], a window was shifted over the face image and the DCT coefficients computed within the window were fed into a 2D Hidden Markov Model. Face recognition research still face challenge in some specific domains such as pose and illumination changes. Although numerous methods have been proposed to solve such problems and have demonstrated significant promise, the difficulties still remain. For these reasons, the matching performance in current automatic face recognition is relatively poor compared to that achieved in fingerprint and iris matching, yet it may be the only available measuring tool for an application. Error rates of 2-25% are typical. It is effective if combined with other biometric measurements. Current systems work very well whenever the test image to be recognized is captured under conditions similar to those of the training images. However, they are not robust enough if there is variation between test and training images [20]. Changes in incident illumination, head pose, facial expression, hairstyle (include facial hair), cosmetics (including eyewear) and age, all confound the best systems today. As a general rule, we may categorize approaches used to cope with variation in appearance into three kinds: invariant features, canonical forms, and variation- modeling. The first approach seeks to utilize features that are invariant to the changes being studied. For instance, the Quotient Image [5] is (by construction) invariant to illumination and may be used to recognize faces (assumed to be Lambertian) when lighting conditions change. The second approach attempts to “normalize” away the variation, either by clever image transformations or by synthesizing a new image (from the given test image) in some 89 International Journal of Signal Processing Volume 2 Number 2 “canonical” or “prototypical” form. Recognition is then performed using this canonical form. Examples of this approach include [21,22]. In [21], for instance, the test image under arbitrary illumination is re-rendered under frontal illumination, and then compared against other frontallyilluminated prototypes. The third approach of variation-modeling is self explanatory: the idea is to learn, in some suitable subspace, the extent of the variation in that space. This usually leads to some parameterization of the subspace(s). Recognition is then performed by choosing the subspace closest to the test image, after the latter has been appropriately mapped. In effect, the recognition step recovers the variation (e.g. pose estimation) as well as the identity of the person. For examples of this technique, see [18, 23, 24 and 25]. Despite the plethora of techniques, and the valiant effort of many researchers, face recognition remains a difficult, unsolved problem in general. While each of the above approaches works well for the specific variation being studied, performance degrades rapidly when other variations are present. For instance, a feature invariant to illumination works well as long as pose or facial expression remains constant, but fails to be invariant when pose or expression is changed. This is not a problem for some applications, such as controlling access to a secured room, since both the training and test images may be captured under similar conditions. However, for general, unconstrained recognition, none of these techniques are robust enough. Moreover, it is not clear that different techniques can be combined to overcome each other’s limitations. Some techniques, by their very nature, exclude others. For example, the Symmetric Shape-from-Shading method of [22] relies on the approximate symmetry of a frontal face. It is unclear how this may be combined with a technique that depends on side profiles, where the symmetry is absent. We can make two important observations after surveying the research literature: (1) there does not appear to be any feature, set of features, or subspace that is simultaneously invariant to all the variations that a face image may exhibit, (2) given more training images, almost any technique will perform better. These two factors are the major reasons why face recognition is not widely used in real-world applications. The fact is that for many applications, it is usual to require the ability to recognize faces under different variations, even when training images are severely limited. A. Eigenfaces Eigenface is one of the most thoroughly investigated approaches to face recognition. It is also known as KarhunenLoève expansion, eigenpicture, eigenvector, and principal component. References [26, 27] used principal component analysis to efficiently represent pictures of faces. They argued that any face images could be approximately reconstructed by a small collection of weights for each face and a standard face picture (eigenpicture). The weights describing each face are obtained by projecting the face image onto the eigenpicture. Reference [28] used eigenfaces, which was motivated by the technique of Kirby and Sirovich, for face detection and identification. In mathematical terms, eigenfaces are the principal components of the distribution of faces, or the eigenvectors of the covariance matrix of the set of face images. The eigenvectors are ordered to represent different amounts of the variation, respectively, among the faces. Each face can be represented exactly by a linear combination of the eigenfaces. It can also be approximated using only the “best” eigenvectors with the largest eigenvalues. The best M eigenfaces construct an M dimensional space, i.e., the “face space”. The authors reported 96 percent, 85 percent, and 64 percent correct classifications averaged over lighting, orientation, and size variations, respectively. Their database contained 2,500 images of 16 individuals. As the images include a large quantity of background area, the above results are influenced by background. The authors explained the robust performance of the system under different lighting conditions by significant correlation between images with changes in illumination. However, [29] showed that the correlation between images of the whole faces is not efficient for satisfactory recognition performance. Illumination normalization [27] is usually necessary for the eigenfaces approach. Reference [30] proposed a new method to compute the covariance matrix using three images each was taken in different lighting conditions to account for arbitrary illumination effects, if the object is Lambertian. Reference [31] extended their early work on eigenface to eigenfeatures corresponding to face components, such as eyes, nose, and mouth. They used a modular eigenspace which was composed of the above eigenfeatures (i.e., eigeneyes, eigennose, and eigenmouth). This method would be less sensitive to appearance changes than the standard eigenface method. The system achieved a recognition rate of 95 percent on the FERET database of 7,562 images of approximately 3,000 individuals. In summary, eigenface appears as a fast, simple, and practical method. However, in general, it does not provide invariance over changes in scale and lighting conditions. Recently, in [32] experiments with ear and face recognition, using the standard principal component analysis approach , showed that the recognition performance is essentially identical using ear images or face images and combining the two for multimodal recognition results in a statistically significant performance improvement. For example, the difference in the rank-one recognition rate for the day variation experiment using the 197-image training sets is II. LITERATURE REVIEW OF FACE RECOGNITION TECHNIQUES This section gives an overview on the major human face recognition techniques that apply mostly to frontal faces, advantages and disadvantages of each method are also given. The methods considered are eigenfaces (eigenfeatures), neural networks, dynamic link architecture, hidden Markov model, geometrical feature matching, and template matching. The approaches are analyzed in terms of the facial representations they used. 90 International Journal of Signal Processing Volume 2 Number 2 90.9% for the multimodal biometric versus 71.6% for the ear and 70.5% for the face. There is substantial related work in multimodal biometrics. For example [33] used face and fingerprint in multimodal biometric identification, and [34] used face and voice. However, use of the face and ear in combination seems more relevant to surveillance applications. misclassified to the wrong subnet, the rightful subnet will tune its parameters so that its decision-region can be moved closer to the misclassified sample. PDBNN-based biometric identification system has the merits of both neural networks and statistical approaches, and its distributed computing principle is relatively easy to implement on parallel computer. In [39], it was reported that PDBNN face recognizer had the capability of recognizing up to 200 people and could achieve up to 96% correct recognition rate in approximately 1 second. However, when the number of persons increases, the computing expense will become more demanding. In general, neural network approaches encounter problems when the number of classes (i.e., individuals) increases. Moreover, they are not suitable for a single model image recognition test because multiple model images per person are necessary in order for training the systems to “optimal” parameter setting. B. Neural Networks The attractiveness of using neural networks could be due to its non linearity in the network. Hence, the feature extraction step may be more efficient than the linear Karhunen-Loève methods. One of the first artificial neural networks (ANN) techniques used for face recognition is a single layer adaptive network called WISARD which contains a separate network for each stored individual [35]. The way in constructing a neural network structure is crucial for successful recognition. It is very much dependent on the intended application. For face detection, multilayer perceptron [36] and convolutional neural network [37] have been applied. For face verification, [38] is a multi-resolution pyramid structure. Reference [37] proposed a hybrid neural network which combines local image sampling, a self-organizing map (SOM) neural network, and a convolutional neural network. The SOM provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimension reduction and invariance to minor changes in the image sample. The convolutional network extracts successively larger features in a hierarchical set of layers and provides partial invariance to translation, rotation, scale, and deformation. The authors reported 96.2% correct recognition on ORL database of 400 images of 40 individuals. The classification time is less than 0.5 second, but the training time is as long as 4 hours. Reference [39] used probabilistic decision-based neural network (PDBNN) which inherited the modular structure from its predecessor, a decision based neural network (DBNN) [40]. The PDBNN can be applied effectively to 1) face detector: which finds the location of a human face in a cluttered image, 2) eye localizer: which determines the positions of both eyes in order to generate meaningful feature vectors, and 3) face recognizer. PDNN does not have a fully connected network topology. Instead, it divides the network into K subnets. Each subset is dedicated to recognize one person in the database. PDNN uses the Guassian activation function for its neurons, and the output of each “face subnet” is the weighted summation of the neuron outputs. In other words, the face subnet estimates the likelihood density using the popular mixture-of-Guassian model. Compared to the AWGN scheme, mixture of Guassian provides a much more flexible and complex model for approximating the time likelihood densities in the face space. The learning scheme of the PDNN consists of two phases, in the first phase; each subnet is trained by its own face images. In the second phase, called the decision-based learning, the subnet parameters may be trained by some particular samples from other face classes. The decision-based learning scheme does not use all the training samples for the training. Only misclassified patterns are used. If the sample is C. Graph Matching Graph matching is another approach to face recognition. Reference [41] presented a dynamic link structure for distortion invariant object recognition which employed elastic graph matching to find the closest stored graph. Dynamic link architecture is an extension to classical artificial neural networks. Memorized objects are represented by sparse graphs, whose vertices are labeled with a multiresolution description in terms of a local power spectrum and whose edges are labeled with geometrical distance vectors. Object recognition can be formulated as elastic graph matching which is performed by stochastic optimization of a matching cost function. They reported good results on a database of 87 people and a small set of office items comprising different expressions with a rotation of 15 degrees. The matching process is computationally expensive, taking about 25 seconds to compare with 87 stored objects on a parallel machine with 23 transputers. Reference [42] extended the technique and matched human faces against a gallery of 112 neutral frontal view faces. Probe images were distorted due to rotation in depth and changing facial expression. Encouraging results on faces with large rotation angles were obtained. They reported recognition rates of 86.5% and 66.4% for the matching tests of 111 faces of 15 degree rotation and 110 faces of 30 degree rotation to a gallery of 112 neutral frontal views. In general, dynamic link architecture is superior to other face recognition techniques in terms of rotation invariance; however, the matching process is computationally expensive. D. Hidden Markov Models (HMMs) Stochastic modeling of nonstationary vector time series based on (HMM) has been very successful for speech applications. Reference [43] applied this method to human face recognition. Faces were intuitively divided into regions such as the eyes, nose, mouth, etc., which can be associated with the states of a hidden Markov model. Since HMMs require a one-dimensional observation sequence and images are two-dimensional, the images should be converted into either 1D temporal sequences or 1D spatial sequences. 91 International Journal of Signal Processing Volume 2 Number 2 In [44], a spatial observation sequence was extracted from a face image by using a band sampling technique. Each face image was represented by a 1D vector series of pixel observation. Each observation vector is a block of L lines and there is an M lines overlap between successive observations. An unknown test image is first sampled to an observation sequence. Then, it is matched against every HMMs in the model face database (each HMM represents a different subject). The match with the highest likelihood is considered the best match and the relevant model reveals the identity of the test face. The recognition rate of HMM approach is 87% using ORL database consisting of 400 images of 40 individuals. A pseudo 2D HMM [44] was reported to achieve a 95% recognition rate in their preliminary experiments. Its classification time and training time were not given (believed to be very expensive). The choice of parameters had been based on subjective intuition. In summary, geometrical feature matching based on precisely measured distances between features may be most useful for finding possible matches in a large database such as a Mug shot album. However, it will be dependent on the accuracy of the feature location algorithms. Current automated face feature location algorithms do not provide a high degree of accuracy and require considerable computational time. F. Template Matching A simple version of template matching is that a test image represented as a two-dimensional array of intensity values is compared using a suitable metric, such as the Euclidean distance, with a single template representing the whole face. There are several other more sophisticated versions of template matching on face recognition. One can use more than one face template from different viewpoints to represent an individual's face. A face from a single viewpoint can also be represented by a set of multiple distinctive smaller templates [49,52]. The face image of gray levels may also be properly processed before matching [53]. In [49], Bruneli and Poggio automatically selected a set of four features templates, i.e., the eyes, nose, mouth, and the whole face, for all of the available faces. They compared the performance of their geometrical matching algorithm and template matching algorithm on the same database of faces which contains 188 images of 47 individuals. The template matching was superior in recognition (100 percent recognition rate) to geometrical matching (90 percent recognition rate) and was also simpler. Since the principal components (also known as eigenfaces or eigenfeatures) are linear combinations of the templates in the data basis, the technique cannot achieve better results than correlation [49], but it may be less computationally expensive. One drawback of template matching is its computational complexity. Another problem lies in the description of these templates. Since the recognition system has to be tolerant to certain discrepancies between the template and the test image, this tolerance might average out the differences that make individual faces unique. In general, template-based approaches compared to feature matching are a more logical approach. In summary, no existing technique is free from limitations. Further efforts are required to improve the performances of face recognition techniques, especially in the wide range of environments encountered in real world. E. Geometrical Feature Matching Geometrical feature matching techniques are based on the computation of a set of geometrical features from the picture of a face. The fact that face recognition is possible even at coarse resolution as low as 8x6 pixels [45] when the single facial features are hardly revealed in detail, implies that the overall geometrical configuration of the face features is sufficient for recognition. The overall configuration can be described by a vector representing the position and size of the main facial features, such as eyes and eyebrows, nose, mouth, and the shape of face outline. One of the pioneering works on automated face recognition by using geometrical features was done by [46] in 1973. Their system achieved a peak performance of 75% recognition rate on a database of 20 people using two images per person, one as the model and the other as the test image. References [47,48] showed that a face recognition program provided with features extracted manually could perform recognition apparently with satisfactory results. Reference [49] automatically extracted a set of geometrical features from the picture of a face, such as nose width and length, mouth position, and chin shape. There were 35 features extracted form a 35 dimensional vector. The recognition was then performed with a Bayes classifier. They reported a recognition rate of 90% on a database of 47 people. Reference [50] introduced a mixture-distance technique which achieved 95% recognition rate on a query database of 685 individuals. Each face was represented by 30 manually extracted distances. Reference [51] used Gabor wavelet decomposition to detect feature points for each face image which greatly reduced the storage requirement for the database. Typically, 35-45 feature points per face were generated. The matching process utilized the information presented in a topological graphic representation of the feature points. After compensating for different centroid location, two cost values, the topological cost, and similarity cost, were evaluated. The recognition accuracy in terms of the best match to the right person was 86% and 94% of the correct person's faces was in the top three candidate matches. G. 3D Morphable Model The morphable face model is based on a vector space representation of faces [54] that is constructed such that any convex combination of shape and texture vectors of a set of examples describes a realistic human face. Fitting the 3D morphable model to images can be used in two ways for recognition across different viewing conditions: Paradigm 1. After fitting the model, recognition can be based on model coefficients, which represent intrinsic shape and texture of faces, and are independent of the imaging conditions: Paradigm 2. Three-dimension face reconstruction can also be employed to generate synthetic views from gallery probe images [55-58]. The synthetic views are then 92 International Journal of Signal Processing Volume 2 Number 2 transferred to a second, viewpoint-dependent recognition system. More recently, [59] combines deformable 3 D models with a computer graphics simulation of projection and illumination. Given a single image of a person, the algorithm automatically estimates 3D shape, texture, and all relevant 3D scene parameters. In this framework, rotations in depth or changes of illumination are very simple operations, and all poses and illuminations are covered by a single model. Illumination is not restricted to Lambertian reflection, but takes into account specular reflections and cast shadows, which have considerable influence on the appearance of human skin. This approach is based on a morphable model of 3D faces that captures the class-specific properties of faces. These properties are learned automatically from a data set of 3D scans. The morphable model represents shapes and textures of faces as vectors in a high-dimensional face space, and involves a probability density function of natural faces within face space. The algorithm presented in [59] estimates all 3D scene parameters automatically, including head position and orientation, focal length of the camera, and illumination direction. This is achieved by a new initialization procedure that also increases robustness and reliability of the system considerably. The new initialization uses image coordinates of between six and eight feature points. The percentage of correct identification on CMU-PIE database, based on side-view gallery, was 95% and the corresponding percentage on the FERET set, based on frontal view gallery images, along with the estimated head poses obtained from fitting, was 95.9%. edge map to line segments. After thinning the edge map, a polygonal line fitting process [62] is applied to generate the LEM of a face. An example of a human frontal face LEM is illustrated in Fig. 1. The LEM representation reduces the storage requirement since it records only the end points of line segments on curves. Also, LEM is expected to be less sensitive to illumination changes due to the fact that it is an intermediate-level image representation derived from low level edge map representation. The basic unit of LEM is the line segment grouped from pixels of edge map. A face prefilering algorithm is proposed that can be used as a preprocess of LEM matching in face identification application. The prefilering operation can speed up the search by reducing the number of candidates and the actual face (LEM) matching is only carried out on a subset of remaining models. Experiments on frontal faces under controlled /ideal conditions indicate that the proposed LEM is consistently superior to edge map. LEM correctly identify 100% and 96.43% of the input frontal faces on face databases [63,64], respectively. Compared with the eigenface method, LEM performed equally as the eigenface method for faces under ideal conditions and significantly superior to the eigenface method for faces with slight appearance variations (see Table I). Moreover, the LEM approach is much more robust to size variation than the eigenface method and edge map approach (see Table II) . In [61], the LEM approach is shown to be significantly superior to the eigenface approach for identifying faces under varying lighting condition. The LEM approach is also less sensitive to pose variations than the eigenface method but more sensitive to large facial expression changes. III. RECENT TECHNIQUES A. Line Edge Map (LEM) Edge information is a useful object representation feature that is insensitive to illumination changes to certain extent. Though the edge map is widely used in various pattern recognition fields, it has been neglected in face recognition except in recent work reported in [60]. Edge images of objects could be used for object recognition and to achieve similar accuracy as gray-level pictures. Reference [60] made use of edge maps to measure the similarity of face images. A 92% accuracy was achieved. Takács argued that process of face recognition might start at a much earlier stage and edge images can be used for the recognition of faces without the involvement of high-level cognitive functions. A Line Edge Map approach, proposed by [61], extracts lines from a face edge map as features. This approach can be considered as a combination of template matching and geometrical feature matching. The LEM approach not only possesses the advantages of feature-based approaches, such as invariance to illumination and low memory requirement, but also has the advantage of high recognition performance of template matching. Line Edge Map integrate the structural information with spatial information of a face image by grouping pixels of face Fig. 1 An illustration of a face LEM TABLE I FACE RECOGNITION RESULTS OF EDGE MAP. EIGINFACE (20- EIGINVECTORS), AND LEM [61] Bern database AR database Method EM Recognition 97.7% rate 93 Eigenface LEM 100% 100% EM Eigenface LEM 88.4% 55.4% 96.4% International Journal of Signal Processing Volume 2 Number 2 Classification (NCC) criterion. Both approaches start with the eigenface feature, but different in the classification algorithm. The error rates are calculated as the function of the number of eigenface, i.e., the feature dimension. The minimum error of SVM is 8.79%, which is much better than the 15.14% of NCC. In [68], the face recognition problem is formulated as a problem in difference space, which models dissimilarities between two facial images. In different space they formulate face recognition as a two class problem. The cases are: (i) Dissimilarities between faces of the same person, and (ii) Dissimilarities between faces of different people. By modifying the interpretation of the decision surface generated a similarity metric between faces, that is learned from examples of differences between faces. The SVM-based algorithm is compared with a principal component analysis (PCA) based algorithm on a difficult set of images from the FERET database. Performance was measured for both verification and identification scenarios. The identification performance for SVM is 77-78% versus 54% for PCA. For verification, the equal error rate is 7% for SVM and 13% for PCA. Reference [69] presented a component-based technique and two global techniques for face recognition and evaluated their performance with respect to robustness against pose changes. The component-based system detected and extracted a set of 10 facial components and arranged them in a single feature vector that was classified by linear SVMs. In both global systems the whole face is detected, extracted from the image and used as input to the classifiers. The first global system consisted of a single SVM for each person in the database. In the second system, the database of each person is clustered and trained on a set of view-specific SVM classifiers. The systems were tested on a database consisting of 8.593 gray faces mages which included faces rotated in depth up to about 400. In all experiments the component-based system outperformed the global systems even though a more powerful classifier is used (i.e. non-linear instead of linear SVMs) for the global system. This shows that using facial components instead of the whole face pattern as input features significantly simplifies the test of face recognition. Reference [70] presented a new development in component based face recognition by incorporation a 3D morphable model into the training process. Based on two face images of a person and a 3D morphable model into they computed the 3D face model of each person in the database. By rendering the 3D models under varying poses and lighting conditions, a large number of synthetic face images is used to train the component based recognition system. A component based recognition rates around 98% is achieved for faces rotated up to ± 360 in depth. A major drawback of the system was the need of a large number of training images taken from viewpoints and under different lighting conditions. In [71], a client-specific solution is adopted which requires learning client-specific support vectors. This representation is different from the one given in [68]. Where in [68], as mentioned before, SVM was trained to distinguish between the populations of within-client and between-client difference images respectively. Moreover, they investigate the inherent TABLE II RECOGNITION RESULTS WITH SIZE VARIATIONS [61] Edge map Top 1 43.3% Top 5 56.0% Top 10 64.7% Eigenface (112-eigenvectors) 44.9% 68.8% 75.9% LEM (pLHD) 53.8% 67.6% 71.9% LEM (LHD) 66.5% 75.9% 79.7% B. Support Vector Machine (SVM) SVM is a learning technique that is considered an effective method for general purpose pattern recognition because of its high generalization performance without the need to add other knowledge [65]. Intuitively, given a set of points belonging to two classes, a SVM finds the hyperplane that separates the largest possible fraction of points of the same class on the same side, while maximizing the distance from either class to the hyperplane. According to [65], this hyperplane is called Optimal Separating Hyperplane (OSH) which minimizes the risk of misclassifying not only the examples in the training set but also the unseen example of the test set. SVM can also be viewed as a way to train polynomial neural networks or Radial Basis function classifiers. The training techniques used here are based on the principle of Structure Risk Minimization (SRM), which states that better generalization capabilities are achieved through a minimization of the bound on the generalization error. Indeed, this learning technique is just equivalent to solving a linearly constrained Quadratic Programming (QP) problem. SVM is suitable for average size face recognition systems because normally those systems have only a small number of training samples. But in a large number of QP problems, Reference [66] presented a decomposition algorithm that guarantees global optimality, and can be used to train SVMs over very large data set. In summary, the main characteristics of SVMs are: (1) that they minimize a formally proven upper bound on the generalization error; (2) that they work on high-dimensional feature spaces by means of a dual formulation in terms of kernels; (3) that the prediction is based on hyperplanes in these feature spaces, which may correspond to quite involved classification criteria on the input data; and (4) that outliers in the training data set can be handled by means of soft margins. The application of SVMs to computer vision problem have been proposed recently. Reference [67] used the SVMs with a binary tree recognition strategy to tackle the face recognition problem. After the features are extracted, the discrimination functions between each pair are learned by SVMs. Then, the disjoint test set enters the system for recognition. They propose to construct a binary tree structure to recognize the testing samples. Two sets of experiments were presented. The first experiment is on the Cambridge Olivetti Research Lab (ORL) face database of 400 images of 40 individuals. The second is on a larger data set of 1079 images of 137 individuals. The SVM based recognition was compared with standard eigenfaces approach using the Nearest Center 94 International Journal of Signal Processing Volume 2 Number 2 Experiments were made on two different face databases (Yale and AR databases). The results obtained appear in Table IV. The SVM was used only with polynomial (up to degree 3) and Guassian kernels (while varying the kernel parameter σ). potential of SVM to extract the relevant discriminatory information from the training data irrespective of representation and pre-processing. In order to achieve this object, they have designed experiments in which faces are represented in both Principal Component (PC) and Linear Discriminant (LD) subspace. The latter basis (Fisherfaces) is used as an example of a face representation with focus on discriminatory feature extraction while the former achieves simply data compression. They also study the effect of image photometric normalization on performance of the SVM method, the experimental results showing superior performance in comparison with benchmark methods. However, when the representation space already captures and emphasizes the discriminatory information, SVMs loose their superiority . The results also indicate that the SVMs are robust against changes in illumination provided these are adequately represented in the training data. The proposed system is evaluated on a large database of 295 people obtaining highly competitive results: an equal rate of 1% for verification and a rank-one error rate of 2% for recognition. In [72], a novel structure is proposed to tackle multi-class classification problem for a K-class classification test, an array of K optimal pairwise coupling classifier (O-PWC) is constructed, each of which is the most reliable and optimal for the corresponding class in the sense of cross entropy of square error. The final decision will be got through combining the results of these K O-PWC. This algorithm is applied on the ORL face database, which consists of 400 images of 40 individuals, containing quite a high degree of variability in expression, pose and facial details. The training set included 200 samples (5 for each individual). The remaining 200 samples are used as the test set. The results show that, the accuracy rate is improved while the computational cost will not increase too much. Table III shows the comparison of different recognition methods on ORL database. TABLE IV RECOGNITION RATES OBTAINED FOR YALE AND AR IMAGES USING THE NEAREST MEAN CLASSIFIER (NMC) AND SVM. FOR SVM, A VALUE OF 1000 WAS USED AS MISCLASSIFICATION WEIGHT. THE LAST COLUMN REPRESENTS Yale AR THE RESULTS OBTAINED BY VARYING σ NMC using SVM Euclidean distance P=1 P=2 P=3 PCA 92.73% 98.79% 98.79% 98.79% Gaussian 99.39% ICA 95.76% 99.39% 99.39% 99.39% 99.39% PCA 48.33% 92% 91.67% 91% 92.67% ICA 70.33% 93.33% 93.33% 92.67% 94% A Support Vector Machine based multi-view face detection and recognition framework is described in [74]. Face detection is carried out by constructing several detectors, each of them in charge of one specific view. The symmetrical property of face images is employed to simplify the complexity of the modeling. The estimation of head pose, which is achieved by using the Support Vector Regression technique, provides crucial information for choosing the appropriate face detector. This helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods. For video sequences, further computational reduction can be achieved by using Pose Change Smoothing strategy. When face detectors find a face in frontal view, a Support Vector Machine based multi-class classifier is activated for face recognition. All the above issues are integrated under a Support Vector Machine framework. An important characteristic of this approach is that it can obtain a robust performance in a poorly constrained environment, especially for low resolution, large scale changes, and rotation in depth. Test results on four video sequences are presented, among them, detection rate is above 95%, recognition accuracy is above 90%, and the full detection and recognition speed is up to 4 frames/second on a PentiumII300 PC. In [75], a new face recognition method, which combines several SVM classifiers and a NN arbitrator is presented. The proposed method does not use any explicit feature extraction scheme. Instead the SVMs receive the gray level values of raw pixels as the input pattern. The rationale for this configuration is that a SVM has the capability of learning in highdimensional space, such as gray-level face-image space. Furthermore, the use of SVMs with a local correlation kernel (modified form of polynomial kernel method) provides an effective combination of feature extraction and classification, thereby eliminating the need for a carefully designed feature extractor. The scaling problem that occurs when arbitrating multiple SVMs is resolved by adopting a NN as a trainable scalier. From experimental results using the ORL database (see Fig. 3), the proposed method resulted in a 97.9% recognition rate with an average processing time of 0.22 seconds for a face TABLE III RECOGNITION ACCURACY RATE COMPARISON O-PWC Max O-PWC Method PWC (Cross (Square Error) Voting Entropy) Rate 94% 95.13 96.79% 98.11% Reference [73] combine SVM and Independent Component Analysis (ICA) techniques for the face recognition problem. ICA can be considered as a generalization of Principle Component Analysis. Fig. 2 shows the difference between PCA and ICA basis images. Fig. 2 Some original (left), PCA (center) and ICA (right) basis images for the Yale Face Database 95 International Journal of Signal Processing Volume 2 Number 2 pattern with 40 classes. Moreover, comparison with other known results on the same database. Table V shows a summary of the performance of various systems for which results using the ORL database are available. The proposed method showed the best performance and significant reduction of error rate (44.7%) from the second best performing system– convolutional NN. and correlation coefficient classifiers. However, from the practical point of view the difference is insignificant. Reference [77] describes an approach for the problem of face pose discrimination using SVM. Face pose discrimination means that one can label the face image as one of several known poses. Face images are drawn from the standard FERET database, see Fig. 4. Fig. 3 Sample images obtained from ORL database TABLE V ERROR RATES OF VARIOUS SYSTEMS Method Eigenfaces Psudo-2DHMM Convolutional NN SVMs with local correlation kernel Error rate (%) 10.0 5.0 3.8 2.1 Fig. 4 Examples of (a) training and (b) test images The training set consists of 150 images equally distributed among frontal, approximately 33.75o rotated left and right poses, respectively, and the test set consists of 450 images again equally distributed among the three different types of poses. SVM achieved perfect accuracy - 100% discriminating between the three possible face poses on unseen test data, using either polynomials of degree 3 or Radial Basis Functions (RBFs) as kernel approximation functions. Experimental results using polynomial kernels and RBF kernels are given in Tables VI-VII respectively. On the other hand, [76] studied SVMs in the context of face authentication (verification). Their study supports the hypothesis that the SVM approach is able to extract the relevant discriminatory information from the training data and this is the main reason for its superior performance over benchmark methods. When the representation space already captures and emphasizes the discriminatory information content as in the case of Fisherfaces, SVMs loose their superiority. SVMs can also cope with illumination changes, provided these are adequately represented in the training data. However, on data which has been sanitized by feature extraction (Fisherfaces) and/or normalization, SVMs can get over-trained, resulting in the loss of the ability to generalize. The following conclusion can be drawn from their work: (1) the SVM approach is able to extract the relevant discriminatory information from the data fully automatically. It can also cope with illumination changes. The major role in this characteristic is played by the SVMs ability to learn nonlinear decision boundaries, (2) on data which has been sanitised by feature extraction (Fisherfaces) and/or normalization, SVMs can get over-trained, resulting in the loss of the ability to generalize. (3) SVMs involve many parameters and can employ different kernels. This makes the optimization space rather extensive, without the guarantee that it has been fully explored to find the best solution. (4) a SVM takes about 5 seconds to train per client (on a Sun Ultra Enterprise 450). This is about an order of magnitude longer than determining client-specific thresholds for the Euclidean TABLE VI EXPERIMENT RESULTS USING POLYNOMIAL KERNELS Testing Number Training Testing Accuracy Classifiers Of Accuracy Accuracy Using max. type Support On 150 On 450 output from three vectors examples examples classifiers Frontal vs 33 100% 99.33% others Left 33.750 vs 25 100% 99.56% 100% others Right 37 100% 99.78% 33.750 vs others 96 International Journal of Signal Processing Volume 2 Number 2 multiple classifiers has emerged over recent years and represented a departure from the traditional strategy. This approach goes under various names such as MCS or committee or ensemble of classifiers, and has been developed to address the practical problem of designing automatic pattern recognition systems with improved accuracy. A parameter-based combined classifier has been developed in [79] in order to improve the generalization capability and hence the system performance of face recognition system. A combination of three LVQ neural networks that are trained on different parameters proved successful in generalization for invariant face recognition. The combined classifier resulted in improved system accuracy compared to the component classifiers. With only three training faces, the system performance in the case of the KUFB is 100%. Reference [80] presents a system for invariant face recognition. A combined classifier uses the generalization capabilities of both LVQ and Radial Basis Function (RBF) neural networks to build a representative model of a face from a variety of training patterns with different poses, details and facial expressions. The combined generalization error of the classifier is found to be lower than that of each individual classifier. A new face synthesis method is implemented for reducing the false acceptance rate and enhancing the rejection capability of the classifier. The system is capable of recognizing a face in less than one second. The well-known ORL database is used for testing the combined classifier. In the case of the ORL database, a correct recognition rate of 99.5% at 0.5% rejection rate is achieved. Reference [81] represents a face recognition committee machine (FRCM), which assembles the outputs of various face recognition algorithms, Eigenface, Fisherface, Elastic Graph Matching (EGM), SVM and neural network, to obtain a unified decision with improved accuracy. This FRCM outperforms all the individuals on average. It achieves 86.1% on Yale face database and 98.8% on ORL face database. In [82], a hybrid face recognition method that combines holistic and feature analysis-based approach using a Markov random field (MRF) model is presented. The face images are divided into small patches, and the MRF model is used to represent the relationship between the image patches and the patch ID's. The MRF model is first learned from the training image patches, given a test image. The most probable patch ID's are then inferred using the belief propagation (BP) HM. Finally, the ID of the image is determined by a voting scheme from the estimated patch ID's. This method achieved 96.11% on Yale face database and 86.95% on ORL face database. In [83], a combined classifier system consisting of an ensemble of neural networks is based on varying the parameters related to the design and training of classifiers. The boosted algorithm is used to make perturbation of the training set employing MLP as base classifier. The final result is combined by using simple majority vote rule. This system achieved 99.5% on Yale face database and 100% on ORL face database. To the best of our knowledge, these results are the best in the literatures. TABLE VII EXPERIMENT RESULTS USING RBF KERNELS Classifiers type Frontal vs others Left 33.750 vs others Right 33.750 vs others Number Of Support vectors Training Accuracy On 150 examples Testing Accuracy On 450 examples 47 100% 100% 38 100% 100% 43 100% 100% Testing Accuracy Using max. output from three classifiers 100% Reference [78] presents a method for authenticating an individual’s membership in a dynamic group without revealing the individuals and without restricting the group size and/or the members of the group. They treat the membership authentication as a two-class face classification problem to distinguish a small size set (membership) from its complementary set (non-membership) in the universal set. In the authentication, the false-positive error is the most critical. Fortunately, the error can be validly removed by using SVM ensemble, where each SVM acts as an independent membership/ non-membership classifier and several SVMs are combined in a plurality voting scheme that chooses the classification made by more than half of SVMs. For a good encoding of face images, the Gabor filtering, principal component analysis and linear discrimination analysis have been applied consecutively to the input face image for achieving effective representation, efficient reduction of the data dimension and storing separation of different faces, respectively. Next, the SVM ensemble is applied to authenticate an input face image whether it is included in the membership group or not. Experiment results showed that the SVM ensemble has the ability to recognize non-membership and a stable robustness to cope with the variations of either different group sizes or different group members. The correct authentication rate is almost constant in the range from 97% to 98.5% without regard to the variation of members in the group in the same group size. However, one problem with the proposed authentication method is that the correct classification rate for the membership is highly degraded when the size of members is small (<20), due to the limited training data set. Nevertheless, simulation results show that the authentication performance of the proposed method can keep stable for the member group with a size of less than 50 persons. C. Multiple Classifier Systems (MCSs) Recently, MCSs based on the combination of outputs of a set of different classifiers have been proposed in the field of face recognition as a method of developing high performance classification systems. Traditionally, the approach used in the design of pattern recognition systems has been to experimentally compare the performance of several classifiers in order to select the best one. However, an alternative approach based on combining 97 International Journal of Signal Processing Volume 2 Number 2 III. COMPARISON OF DIFFERENT FACE DATABASES The FRVT 2002 [86] was a large-scale evaluation of automatic face recognition technology. The primary objective of the FRVT 2002 was to provide performance measures for assessing the ability of automatic face recognition systems to meet real-world requirements. From a scientific point of view, FRVT 2002 will have an impact on future directions of research in the computer vision and pattern recognition, psychology, and statistics fields. The heart of the FRVT 2002 was the high computational intensity test (HCInt). The HCInt consisted of 121,589 operational images of 37,437 people. From this date, realworld performance figures on a very large data set were computed. Performance statistics were computed for verification, identification, and watch list tests. The conclusions from FRVT 2002 are summarized below: In Section 2, a number of face recognition algorithms have been described. In Table VIII, we give a comparison of face databases which were used to test the performance of these face recognition algorithms. The description and limitations of each database are given. While existing publicly-available face databases contain face images with a wide variety of poses, illumination angles, gestures, face occlusions, and illuminant colors, these images have not been adequately annotated, thus limiting their usefulness for evaluating the relative performance of face detection algorithms. For example, many of the images in existing databases are not annotated with the exact pose angles at which they were taken. In order to compare the performance of various face recognition algorithms presented in the literature there is need for a comprehensive, systematically annotated database populated with face images that have been captured (1) at variety of pose angles (to permit testing of pose invariance), (2) with a wide variety of illumination angles (to permit testing of illumination invariance), and (3) under a variety of commonly encountered illumination color temperatures (permit testing of illumination color invariance). Reference [84] presents a methodology for creating such an annotated database that employs a novel set of apparatus for the rapid capture of face images from a wide variety of pose angles and illumination angles. Four different types of illumination are used, including daylight, skylight, incandescent and fluorescent. The entire set of images, as well as the annotations and the experimental results, is being placed in the public domain, and made available for download over the worldwide web [85]. • Indoor face recognition performance has substantially improved since FRVT 2000. • Face recognition performance decreases approximately linearly with elapsed time, database and new images. • Better face recognition systems do not appear to be sensitive to normal indoor lighting changes. • Three-dimensional morphable models substantially improve the ability to recognize non-frontal faces. • On FRVT 2002 imagery, recognition from video sequences was not better than from still images. • Males are easier to recognize than females. • Younger people are harder to recognize than older people. • Outdoor face recognition performance needs improvement. • For identification and watch list tests, performance decreases linearly in the logarithm of the database or watch list size. V. SUMMARY OF THE RESEARCH RESULTS IV. THE FACE RECOGNITION-VENDOR TEST (FRVT) In Table IX, a summary of performance evaluations of face recognition algorithms on different databases is given. TABLE VIII COMPARISON OF DIFFERENT FACE DATABASES Database AT&T [87] (formerly ORL) Oulu Physics [88] XM2VTS [89] Description contains face images of 40 persons, with 10 images of each. For most subjects, the 10 images were shot at different times and with different lighting conditions, but always against a dark background. includes frontal color images of 125 different faces. Each face was photographed 16 times, using 1 of 4 different illuminants (horizon, incandescent, fluorescent, and daylight) in combination with 1 of 4 different camera calibrations (color balance settings). The images were captured under dark room conditions, and a gray screen was placed behind the participant. The spectral reflectance (over the range from 400 nm to 700 nm) was measured at the forehead, left cheek, and right cheek of each person with a spectrophotometer. The spectral sensitivities of the R, G and B channels of the camera, and the spectral power of the four illuminants were also recorded over the same spectral range. consists of 1000 GBytes of video sequences and speech recordings taken of 295 subjects at one-month intervals over a period of 4 months (4 recording sessions). Significant variability in appearance of clients (such as changes of hairstyle, facial hair, shape and presence or absence of glasses) is present in the recordings. During each of the 4 sessions a “speech” video sequence and a “head rotation” video sequence was captured. This database is designed to test systems designed to do multimodal (video + audio) identification of humans by facial and voice features. 98 Limitation (1) limited number of people (2) illumination conditions are not consistent from image to image. (3) the images are not annotated for different facial expressions, head rotation, or lighting conditions. (1) although this database contains images captured under a good variety of illuminant colors, and the images are annotated for illuminant, there are no variations in the lighting angle. (2) all of the face images are basically frontal (with some variations in pose angle and distance from the camera) it does not include any information about the image acquisition parameters, such as illumination angle, illumination color, or pose angle. International Journal of Signal Processing Volume 2 Number 2 Database Yale [90] Yale B [91] MIT [92] CMU Pose, Illumination, and Expression (PIE) [93] UMIST [94] Bern University face database [63] Purdue AR [64] The University of Stirling online database [95] The FERET [96] Kuwait University face database (KUFDB ) [97] Description contains frontal grayscale face images of 15 people, with 11 face images of each subject, giving a total of 165 images. Lighting variations include left-light, center-light, and right-light. Spectacle variations include with-glasses and without-glasses. Facial expression variations include normal, happy, sad, sleepy, surprised, and wink. Limitation (1) limited number of people (2) while the face images in this database were taken with 3 different lighting angles (left, center, and right) the precise positions of the light sources are not specified. (3) since all images are frontal, there are no pose angle variations. (4) Environmental factors (such as the presence or absence of ambient light) are also not described. contains grayscale images of 10 subjects with 64 different lighting angles and (1) limited number of Subjects. 9 different poses angles, for a total of 5760 images. Pose 0 is a frontal view, in (2) the background in these images is not homogeneous, which the subject directs his/her gaze directly into the camera lens. In poses 1, and is cluttered. (3) The 9 different pose angles in these 2, 3, 4, and 5 the subject is gazing at 5 points on a semicircle about 12 degrees images were not precisely controlled. Where the exact head away from the camera lens, in the left visual field. In poses 6, 7, and 8 the orientation (both vertically and horizontally) for each pose subject is gazing at 3 different points on a semicircle about 24 degrees away was chosen by the subject. from the camera lens, again in the left visual field. The images were captured with an overhead lighting structure which was fitted with 64 computercontrolled xenon strobe lights. For each pose, 64 images were captured of each subject at a rate of 30 frames/sec, over a period of about 2 seconds. Contains 16 subjects. Each subject sat on a couch and was photographed 27 times, while varying head orientation. The lighting direction and the camera zoom were also varied during the sequence. The resulting 480 x 512 grayscale images were then filtered and sub sampled by factors of 2, to produce six levels of a binary Gaussian pyramid. The six “pyramid levels” are annotated by an X-by-Y pixel count, which ranged from 480x512 down to 15x16. contains images of 68 subjects that were captured with 13 different poses, 43 different illumination conditions, and 4 different facial expressions, for a total of 41,368 color images with a resolution of 640 x 486. Two sets of images were captured – one set with ambient lighting present, and another set with ambient lighting absent. consists of 564 grayscale images of 20 people of both sexes and various races. (Image size is about 220 x 220.) Various pose angles of each person are provided, ranging from profile to frontal views. contains frontal views of 30 people. Each person has 10 gray-level images with different head pose variations (two front parallel pose, two looking to the right, two looking to the left, two looking downwards, and two looking upwards). All images are taken under controlled/ideal conditions. contains over 4,000 color frontal view images of 126 people's faces (70 men and 56 women) that were taken during two different sessions separated by 14 days. Similar pictures were taken during the two sessions. No restrictions on clothing, eyeglasses, make-up, or hair style were imposed upon the participants. Controlled variations include facial expressions (neutral, smile, anger, and screaming), illumination (left light on, right light on, all side lights on), and partial facial occlusions (sun glasses or a scarf). was created for use in psychology research, and contains pictures of faces, objects, drawings, textures, and natural scenes. A web-based retrieval system allows a user to select from among the 1591 face images of over 300 subjects based on several parameters, including male, female, grayscale, color, profile view, frontal view, or 3/4 view. contains face images of over 1000 people. It was created by the FERET program, which ran from 1993 through 1997. The database was assembled to support government monitored testing and evaluation of face recognition algorithms using standardized tests and procedures. The final set of images consists of 14051 grayscale images of human heads with views that include frontal views, left and right profile views, and quarter left and right views. It contains many images of the same people taken with time-gaps of one year or more, so that some facial features have changed. This is important for evaluating the robustness of face recognition algorithms over time. The in-house built database consists of 250 face acquired from 50 people with five images per face. There is a total 250 gray level images (5 images x 50 people). Facial images are normalized to sizes 24 x 24, 32 x 32, and 64 x 64). Images were acquired without any control of the laboratory illumination. Variations in lighting, facial expression, size, and rotation, are considered. 99 (1) Although this database contains images that were captured with a few different scale variations, lighting variations, and pose variations, these variations were not very extensive, and were not precisely measured. (2) There was also apparently no effort made to prevent the subjects from moving between pictures. (1) there was clutter visible in the backgrounds of these images. (2) The exact pose angle for each image is not specified. (1) No absolute pose angle is provided for each image. (2) No information is provided about the illumination used – either its direction or its color temperature. (1) limited number of subjects. (2) the exact pose angle for each image is not specific. (3) there is not variation in illumination conditions. the placement of those light sources, the color temperature of those light sources, and whether they were diffuse of point light sources is not specified. (The placement of the two light sources produces objectionable glare in the spectacles of some subjects.) (1) no information is provided about the illumination used during the image capture. (2) Most of these images were also captured in front of a black background, making it difficult to discern the boundaries of the head of those subjects with dark hair. (1) it does not provide a very wide variety of pose variations. (2) there is no information about the lighting used to capture the images. (1) limited number of people. (2) it does not include any information about image acquisition parameter, such as pose angle. International Journal of Signal Processing Volume 2 Number 2 TABLE IX SUMMARY OF THE RESEARCH RESULTS Database References Method Percentage of correct classification(PCC) [31] Eigenfeatures 95% [28] Eigenface [42] Graph matching [50] Geometrical feature matching and Template matching FERET [68] AR [70] SVM + 3D morphable model [71] SVM+PC+LD [61] LEM SVM+PCA SVM+ICA SVM+PCA SVM+ICA [73] [73] [81] Yale [82] [83] ORL SVM Build face recognition committee machine (FRCM) of Eigenface, Fisherface, Elastic Graph Matching (EGM), SVM, and Neural network Combines holistic and feature analysis-based approaches using a Markov random field (MRF) method Boosted parameter-based combined classifier Notes This method would be less sensitive to appearance changes than standard eigenface method. The DB contained 7,562 images of approximately 3,000 individuals. 95% , 85% , 64% correct classifications DB contained 2,500 images of 16 individuals; the images include averaged over lighting, orientation, and a large quality of background area. size variation, respectively. 86.5% and 66.4% for the matching tests of 111 faces of 15 degree rotation and 110 faces of 30 degree rotation to a gallery of 112 neutral frontal views Template matching achieved 100% to These two matching algorithms occurred on the same DB which 90% for Geometrical feature matching. contained 188 images of 47 individuals. Identification performance is 77.78% versus 54% for PCA. Verification performance is 93%versus 87% for PCA. Face rotation up to ±360 in depth. 98% 99% for verification and 98% recognition. 96.43% 92.67% 94% for 99.39% 99.39% DB contained 295 people. DB contained frontal faces under controlled condition. SVM was used only with polynomial (up to degree 3) and Guassian kernel. DB contained 165 images of 15 individuals. The DB divided into 90 images (6 for each person) for training and 75 for testing (5 for each person) 1) They adopt leaving-one-out cross validation method. FRCM gives 86.1% and it outperforms all 2) Without the lighting variations, FRCM achieves 97.8% the individuals on average accuracy. 96.11 (when using 5 images for training and 6 for testing) 99.5% [37] Hybrid NN: SOM+a convolution NN 96.2% [44] Hidden Markov model (HMMs) 87% [44] A pseudo 2DHMM [76] SVM with a binary tree [72] Optimal-Pairwise coupling (O-PWC) SVM [75] Several SVM+NN arbitrator [28] Eigenface [39] PDBNN [80] A combined classifier uses the generalization capabilities of both Learning Vector Quantization (LVQ) and Radial Basis Function (RBF) neural networks to build a representative model of a face from a variety of training patterns with different poses, details and facial expressions They tested the recognition accuracy with different numbers of training samples. K(k=1,2,…10) images of each subject were randomly selected for training and the remaining 11-k images for testing The DB is divided into 75 images (5 for each person) for training and 90 for testing (6 for each person) DB contained 400 images of 40 individuals. The classification time less than .5 second for recognizing one facial image, but training time is 4 hours. Its classification time and training time were not given (believe to be very expensive.) 91.21% for SVM and 84.86% for Nearst They compare the SVMs with standard eigenface approach using Center Classification (NCC) the NCC PWC achieved 95.13% , O-PWC They select 200 samples (5 for each individual) randomly as (cross entropy) achieved 96.79% and Otraining set. The remaining 200 samples are used as the test set. PWC (square error) achieved 98.11% An average processing time of .22 second for face pattern with 40 97.9% classes. On the same DB, PCC for eigenfaces is 90% and for pseudo-2D HMM is 95% and for convolutional NN is 96.2% 90% PDBNN face recognizing up to 200 people in approximately 1 96% second and the training time is 20 minutes. 95% 99.5% 100 A new face synthesis method is implemented for reducing the false acceptance rate and enhancing the rejection capability of the classifier. The system is capable of recognizing a face in less than one second. International Journal of Signal Processing Volume 2 Number 2 Database Bern University face database Kuwait University face database (KUFDB ) References Method [81] Build face recognition committee machine (FRCM) of Eigenface, Fisherface, Elastic Graph Matching (EGM), SVM, and Neural network Percentage of correct classification(PCC) [82] MRF [83] Boosted parameter-based combined classifier [61] LEM 100% [79] Combined LVQ neural network 100% FRCM gives 98.8% and it outperforms all They adopt leaving-one-out cross validation method. the individual on average 86.95 (when using 5 images for training and 6 for testing) 100% REFERENCES [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] Notes The KUFDB includes 250 images acquired from 50 people with five images/person. The training set has 3 images x 50 subjects and the testing set has 2 images x 50 subjects. [18] Francis Galton, “Personal identification and description,” In Nature, pp. 173-177, June 21, 1888. W. Zaho, “Robust image based 3D face recognition,” Ph.D. Thesis, Maryland University, 1999. R. Chellappa, C.L. Wilson and C. Sirohey, ”Humain and machine recognition of faces: A survey,” Proc. IEEE, vol. 83, no. 5, pp. 705740, may 1995. T. Fromherz, P. Stucki, M. Bichsel, “A survey of face recognition,” MML Technical Report, No 97.01, Dept. of Computer Science, University of Zurich, Zurich, 1997. T. Riklin-Raviv and A. Shashua, “The Quotient image: Class based recognition and synthesis under varying illumination conditions,” In CVPR, P. II: pp. 566-571,1999. G.j. Edwards, T.f. Cootes and C.J. Taylor, “Face recognition using active appearance models,” In ECCV, 1998. T. Sim, R. Sukthankar, M. Mullin and S. Baluja, “Memory-based face recognition for vistor identification,” In AFGR, 2000. T. Sim and T. Kanade, “Combing models and exemplars for face recognition: An illuminating example,” In Proceeding Of Workshop on Models Versus Exemplars in Computer Vision, CUPR 2001. L. Sirovitch and M. Kirby, “Low-dimensional procedure for the characterization of human faces,” Journal of the Optical Society of America A, vol. 2, pp. 519–524, 1987. M. Turk and A. Pentland “Face recognition using eigenfaces,” In Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–591, 1991. P. Belhumeur, P. Hespanha, and D. Kriegman, “Eigenfaces vs fisherfaces: Recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997. M. Fleming and G. Cottrell, “Categorization of faces using unsupervised feature extraction,” In Proc. IEEE IJCNN International Joint Conference on Neural Networks, pp. 65–70, 1990. B. Moghaddam, W. Wahid, and A. Pentland, “Beyond eigenfaces: Probabilistic matching for face recognition,” In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pp. 30–35, 1998. A. Lanitis, C. Taylor, and T. Cootes, “Automatic interpretation and coding of face images using flexible models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 743 – 756, 1997. K. Jonsson, J. Matas, J. Kittler, and Y. Li, “Learning support vectors for face verification and recognition,” In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pp. 208–213, 2000. R. Brunelli and T. Poggio, “Face recognition: Features versus templates,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042–1052, 1993. D. J. Beymer, “Face recognition under varying pose,” A.I. Memo 1461, Center for Biological and Computational Learning, M.I.T., Cambridge, MA, 1993. [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] 101 The DB is divided into 200 images (10 for each person) for training and 200 for testing (10 for each person) L. Wiskott, J.-M. Fellous, N. Kruger, and C. von der Malsburg, “Face recognition by elastic bunch graph matching,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 775 – 779, 1997. A. Nefian and M. Hayes, ” An embedded hmm-based approach for face detection and recognition,” In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 6, pp. 3553–3556, 1999. U.S. Department of Defense, ”Facial Recognition Vendor Test, 2000,” Available: http://www.dodcounterdrug.com/facialrecognition/FRVT2000/frvt200 0.htm. W. Zhao and R. Chellappa, “ Robust face recognition using symmetric shape-from-hading,” Technical Report CARTR -919, 1999., Center for Automation Research, University of Maryland, College Park, MD, 1999. L. Zheng, “A new model-based lighting normalization algorithm and its application in face recognition,” Master’s thesis, National University of Singapore, 2000. G.J. Edwards, T.F. Cootes, and C.J. Taylor, “Face recognition using active appearance models,” In ECCV, 1998. A.S. Georghiades, P.N. Belhumeur, and D.J. Kriegman, “From few to many: Generative models for recognition under variable pose and illumination,” In AFGR, 2000. D.B. Graham and N.M., “Allinson face recognition from unfamiliar views: Subspace methods and pose dependency,” In AFGR, 1998. L. Sirovich and M. Kirby, “Low-Dimensional procedure for the characterisation of human faces,” J. Optical Soc. of Am., vol. 4, pp. 519-524, 1987. M. Kirby and L. Sirovich, “Application of the Karhunen- Loève procedure for the characterisation of human faces,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, pp. 831-835, Dec. 1990. M. Turk and A. Pentland, “Eigenfaces for recognition,” J. Cognitive Neuroscience, vol. 3, pp. 71-86, 1991. M.A. Grudin, “A compact multi-level model for the recognition of facial images,” Ph.D. thesis, Liverpool John Moores Univ., 1997. L. Zhao and Y.H. Yang, “Theoretical analysis of illumination in pcabased vision systems,” Pattern Recognition, vol. 32, pp. 547-564, 1999. A. Pentland, B. Moghaddam, and T. Starner, “View-Based and modular eigenspaces for face recognition,” Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, pp. 84-91, 1994. K. Chang, K.W. Bowyer and S. Sarkar, “Comarison and combination of ear and face images in appearance-based biometrics,” IEEE Trans. On Pattern analysis and machine intelligence, vol. 25, no. 9, September 2003. L. Hong and A. Jain, “Integrating faces and fingerprints for personal identification,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 12, pp. 1295-1307, Dec. 1998. International Journal of Signal Processing Volume 2 Number 2 [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] [56] [57] [58] [59] P. Verlinde, G. Matre, and E. Mayoraz, “Decision fusion using a multilinear classifier,” Proc. Int’l Conf. Multisource-Multisensor Information Fusion, vol. 1, pp. 47-53, July 1998. T.J. Stonham, “Practical face recognition and verification with WISARD,” Aspects of Face Processing, pp. 426-441, 1984. K.K. Sung and T. Poggio, “Learning human face detection in cluttered scenes,” Computer Analysis of Image and patterns, pp. 432-439, 1995. S. Lawrence, C.L. Giles, A.C. Tsoi, and A.D. Back, “Face recognition: A convolutional neural-network approach,” IEEE Trans. Neural Networks, vol. 8, pp. 98-113, 1997. J. Weng, J.S. Huang, and N. Ahuja, “Learning recognition and segmentation of 3D objects from 2D images,” Proc. IEEE Int'l Conf. Computer Vision, pp. 121-128, 1993. S.H. Lin, S.Y. Kung, and L.J. Lin, “Face recognition/detection by probabilistic decision-based neural network,” IEEE Trans. Neural Networks, vol. 8, pp. 114-132, 1997. S.Y. Kung and J.S. Taur, “Decision-Based neural networks with signal/image classification applications,” IEEE Trans. Neural Networks, vol. 6, pp. 170-181, 1995. M. Lades, J.C. Vorbruggen, J. Buhmann, J. Lange, C. Von Der Malsburg, R.P. Wurtz, and M. Konen, “Distortion Invariant object recognition in the dynamic link architecture,” IEEE Trans. Computers, vol. 42, pp. 300-311, 1993. L. Wiskott and C. von der Malsburg, “Recognizing faces by dynamic link matching,” Neuroimage, vol. 4, pp. 514-518, 1996. F. Samaria and F. Fallside, “Face identification and feature extraction using hidden markov models,” Image Processing: Theory and Application, G. Vernazza, ed., Elsevier, 1993. F. Samaria and A.C. Harter, “Parameterisation of a stochastic model for human face identification,” Proc. Second IEEE Workshop Applications of Computer Vision, 1994. S. Tamura, H. Kawa, and H. Mitsumoto, “Male/Female identification from 8_6 very low resolution face images by neural network,” Pattern Recognition, vol. 29, pp. 331-335, 1996. T. Kanade, “Picture processing by computer complex and recognition of human faces,” technical report, Dept. Information Science, Kyoto Univ., 1973. A.J. Goldstein, L.D. Harmon, and A.B. Lesk, “Identification of human faces,” Proc. IEEE, vol. 59, pp. 748, 1971. Y. Kaya and K. Kobayashi, “A basic study on human face recognition,” Frontiers of Pattern Recognition, S. Watanabe, ed., pp. 265, 1972. R. Bruneli and T. Poggio, “Face recognition: features versus templates,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, pp. 1042-1052, 1993. I.J. Cox, J. Ghosn, and P.N. Yianios, “Feature-Based face recognition using mixture-distance,” Computer Vision and Pattern Recognition, 1996. B.S. Manjunath, R. Chellappa, and C. von der Malsburg, “A Feature based approach to face recognition,” Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, pp. 373-378, 1992. R.J. Baron, “Mechanism of human facial recognition,” Int'l J. Man Machine Studies, vol. 15, pp. 137-178, 1981. M. Bichsel, “Strategies of robust object recognition for identification of human faces,” Ph.D. thesis, Eidgenossischen Technischen Hochschule, Zurich, 1991. T. Vetter and T. Poggio, "Linear object classes and image synthesis from a single example image," IEEE Trans. Pattern Analysis and Machin Intelligence, Vol. 19, no. 7, pp. 733-742, July 1997. D. Beymer and T. Poggio, “Face recognition from one model view,” Proc. Fifth Int’l Conf. Computer Vision, 1995. T. Vetter and V. Blanz, “Estimating coloured 3D face models from fingle images: An example based approach,” Proc. Conf. Computer Vision (ECCV ’98), vol. II, 1998. A.S. Georghiades, P.N. Belhumeur, and D.J. Kriegman, “From few to many: Illumination cone models for face recognition under variable lighting and pose,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 643-660, 2001. W. Zhao and R. Chellappa, “SFS based view synthesis for robust face recognition,” Proc. Int’l Conf. Automatic Face and Gesture Recognition, pp. 285-292, 2000. [60] [61] [62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] [77] [78] [79] [80] [81] [82] [83] [84] 102 V. Blanz and T. Vetter, “Face recognition based on fitting a 3D morphable model,” IEEE Trans. On Pattern Analysis and Machine Intelligence, vol. 25, no. 9, September 2003. B. Takács, “Comparing face images using the modified hausdorff distance,” Pattern Recognition, vol. 31, pp. 1873-1881, 1998. Y. Gao and K.H. Leung, “Face recognition using line edge map,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 6, June 2002. M.K.H. Leung and Y.H. Yang, “Dynamic two-strip algorithm in curve fitting,” Pattern Recognition, vol. 23, pp. 69-79, 1990. Bern Univ. Face Database, ftp://iamftp.unibe.ch/pub/Images/FaceImages/, 2002. Purdue Univ. Face Database, http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html, 2002. V.N. Vapnik, “The nature of statistical learning theory,” New York: Springverlag, 1995. C.J. Lin, “On the convergence of the decomposition method for support vector machines,” IEEE Transactions on Neural Networks,2001. G. Guo, S.Z. Li, and K. Chan, “Face recognition by support vector machines,” In proc. IEEE International Conference on Automatic Face and Gesture Recognition, pp. 196-201, 2000. P.J. Phillips, “Support vector machines applied to face recognition,” Processing system 11, 1999. B. Heisele, P. Ho, and T. Poggio, “Face recognition with support vector machines: Global versus component-based approach,” in International Conference on Computer Vision (ICCV'01), 2001. J. Huang, V. Blanz, and B. Heisele, “Face recognition using Component-Based support vector machine Classification and Morphable models,” LNCS 2388, pp. 334-341, 2002. K.Jonsson, J. Mates, J. Kittler and Y.P. Li, “Learning support vectors for face verification and recognition,” Fourth IEEE International Conference on Automatic Face and Gesture Recognition 2000, pp. 208-213, Los Alamitos, USA, March 2000. G.D. Guo, H.J. Zhang, S.Z. Li. "Pairwise face recognition". In Proceedings of 8th IEEE International Conference on Computer Vision. Vancouver, Canada. July 9-12, 2001. O. Deniz, M. Castrillon, M. Hernandez, “Face recognition using independent component analysis and support vector machines,” Pattern Recognition Letters, vol. 24, pp. 2153-2157, 2003. Y. Li, S. Gong and H. Liddell. Support vector regression and classification based multi-view face detection and recognition. In Proc. IEEE International Conference on Face and Gesture Recognition, Grenoble, France, March 2000. K. I. Kim, K. Jung, and J. Kim, “Face recognition using support vector machines with local correlation kernels,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 16 no. 1, pp. 97111, 2002. K. Jonsson, J. Kittler, Y. P. Li, and J. Matas, “Support vector machines for face authentication,” in T. Pridmore and D. Elliman, editors, British Machine Vision Conference, pp. 543–553, 1999. Huang J., X. Shao, and H. Wechsler, "Face pose discrimination using support vector machines,” 14th International Conference on Pattern Recognition, (ICPR), Brisbane, Queensland, Aus, 1998. S. Pang, D. Kim, S.Y. Bang, “Membership authentication in the dynamic group by face classification using SVM ensemble,” Pattern Recognition Letters vol. 24, pp. 215-225, 2003. A.S. Tolba, " A parameter–based combined classifier for invariant face recognition," Cybernetics and Systems, vol. 31, pp. 289-302, 2000. A.S. Tolba, and A.N. Abu-Rezq, "Combined classifiers for invariant face recognition," Pattern Anal. Appl. Vol. 3, no. 4, pp. 289-302, 2000. Ho-Man Tang, Michael Lyu, and Irwin King, "Face recognition committee machine," In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003), pp. 837840, April 6-10, 2003. Rui Huang, Vladimir Pavlovic, and Dimitris N. Metaxas, "A hybrid face recognition method using Markov random fields," ICPR (3) , pp. 157-160, 2004. A.S. Tolba, A.H. El-Baz, and A.A. El-Harby, "A robust boosted parameter- based combined classifier for pattern recognition," submitted for publication. John A. Black, M. Gargesha, K. Kahol, P. Kuchi, Sethuraman Panchanathan,” A Framework for performance evaluation of face International Journal of Signal Processing Volume 2 Number 2 [85] [86] [87] [88] [89] [90] [91] [92] [93] [94] [95] [96] [97] recognition algorithms,” in Proceedings of the International Conference on ITCOM, Internet Multimedia Systems II, 2002. The FacePix reference image set is in the public domain, Available: http://cubic.asu.edu/vccl/imagesets/facepix. P.J. Phillips, P. Grother, R.J. Michaels, D.M. Blackburn, E. Tabassi, and M. Bone, “Face recognition Vendor Test 2002: Evaluation Report,” NISTIR 6965, NAT. Inst. Of Standards and Technology 2003. The At & T Database of Faces , Available: http://www.uk.research.att.com/facedatabase.html The Oulu Physics database, Available: http://www.ee.oulu.fi/research/imag/color/pbfd.html The XM2VTS database, Available: http://www.ee.surrey.ac.uk/Reseach/VSSP/xm2vtsdb/ The Yale database, Available: http://cvc.yale.edu/ The Yale B database, Available: http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html The MIT face database, Available: ftp://whitehapel.media.mit.edu/pub/images/ The CMU PIE database, Available: http://www.ri.cmu.edu/projects/project_418.html The UMIST database, Available: http://images.ee.umist.ac.uk/danny/database.html The University of Stirling online database , Available: http://pics.psych.stir.ac.uk/ The FERET database, Available: http://www.it1.nist.gov/iad/humanid/feret/ Kuwait University Face Database, Available: http://www.sc.kuniv.edu.kw/lessons/9503587/dina.htm A.S. Tolba received his B.Sc. with honors and M.Sc. from Mansoura University (Egypt) in 1978 and 1981, respectively. He received his Ph.D. from Wuppertal University (Germany) in 1988. Since 2000, he has been a full professor of computer engineering at University of Suez-Canal (Egypt). He was Secondment at the Department of Applied Physics at Kuwait University. He is currently a dean of the of the Faulty of Computer and Information systems at Mansoura University (Egypt). He has done research in computer vision, biometric identification, human-computer interaction, autonomous vehicles, neural networks, and laser interferometry. He has published over 50 papers in these areas. He is coauthor of two edited books: "Intelligent Robotic Systems," (Marcel Dekker, New York, 1991) and "Laser Technology and its Application," (Publication of ISESCO, 1997). His most recent research focus is face/gesture recognition. Prof. Tolba is a member of the IEEE and AMSE. He is currently a member of the editorial bard of Modeling, Measurement, and Control. A.H. El-Baz received his B.Sc. with honors and M.Sc. from Mathematical Department, Damietta Faculty of Science, New Damietta, Egypt in 1997 and 2002, respectively. He is working toward the Ph.D. at Mansoura University, Egypt. His research interest is in the area of pattern recognition, signal/image processing, and computer vision, especially automated face recognition. vision, image retrieval. A.H. El-Harby received his B.Sc. and M.Sc. degrees from computer science Department, Suez-Canal University, Egypt and Mathematical Department, Damietta Faculty of Science, New Damietta, Egypt, respectively. He received his Ph.D. degree in computer engineering from Keele University, UK. His thesis is on automatic extraction of vector representations of line features from remotely sensed images. His research interest include remote sensing, image processing, pattern recognition, computer 103 View publication stats

蔚山大学毕业证样本定制[美国大学毕业证模版定制]美国加州州立理工大学波莫那分校毕业证书如何办理&北卡罗来纳大学格林波若分校毕业证购买多久拿到【Q/微:158697755】办美国大学文凭|美国毕业证成绩单|加拿大文凭毕业证成绩单|澳洲文凭毕业证真实留信认证|英国文凭毕业证学位证|海外留学文凭学历证书国外假文凭办理|fake diploma修改成绩单GPA|留学移民转学申请学校|德国法国新西兰文凭毕业证成绩单|温哥华文凭制作|文凭办理吗？哪儿有办得好的文凭，哪儿有高质量的文凭办理？一手留信认证办理,代办留信入库认证|使馆认证|留学文凭补办咨询【Q/微:158697755】本公司拥有大量国际上的大学文凭样本，主要从事制作国外大学毕业证，国外大学文凭购买，国外大学成绩单GPA的排版制作，涵盖美国、英国、澳大利亚、日本、韩国、法国、德国、新加坡、新西兰、加拿大、爱尔兰、意大利、菲律宾以及香港澳门台湾等等各个国家的本科、硕士、博士学历文凭的制作。另外毕业证的辅助材料如：录取通知书，毕业信，留学人员回国证明、国外学历学位认证书、雅思托福成绩单、韩语等级证书、日语等级证书、意大利语言证书、CIA、CFA、ACCA、CMA等国外各类证书文凭制作，港澳台学历学位认证书等等都可以受理，并可以按照客户提供的样本制作一切高难度防伪文凭证件，专业、快速、诚信办理国外文凭证书。出国留学拿国外毕业证的用途与价值一、提升个人能力出国留学可以让人接受不同文化的教育，拓宽视野，提升跨文化交流能力。在国外的学习环境中，学生需要适应不同的教育方式和学习节奏，这对个人的自学能力、时间管理能力和团队协作能力都是一次很好的锻炼。二、提高就业竞争力国外毕业证书在就业市场上具有一定的优势。首先，国外高校在国内外都具有一定的知名度和影响力，拿到这些学校的毕业证书能够增加求职者的竞争力。其次，留学经历本身就是一个很好的个人背景，能在面试中引起招聘官的兴趣。最后，出国留学者往往具备较强的生活能力和适应能力，这些都是企业所看重的。【Q/微:158697755】三、便于出国发展国外毕业证书对于想要在国外发展的同学来说，是一块敲门砖。很多国家和地区都对于国外高校的学历有较高的认可度，持有国外毕业证书的人更容易获得工作机会和签证。四、丰富人生经历留学生活是一次丰富的人生经历。在这个过程中，学生不仅可以学到知识，还能体验不同的文化，结交来自世界各地的朋友，这些都是国内学习无法比拟的。五、学术研究对于一些从事学术研究的人来说，国外的高校往往拥有更好的学术资源和研究环境。在国外的高校获得毕业证书，有助于在学术研究领域获得更好的发展。【Q/微:158697755】出国留学拿国外毕业证书具有一定的价值和用途。但值得注意的是，国外毕业证书并非万能，它只是一个敲门砖，真正决定个人发展的是自身的实力和努力。因此，在选择出国留学时，要结合自身的情况和兴趣，做出明智的决策。

Log In

Face recognition: A literature review