
3D Face Reconstruction: The Road to Forensics

Published: 21 October 2023

Abstract

3D face reconstruction algorithms from images and videos are applied to many fields, from plastic surgery to the entertainment sector, thanks to their advantageous features. However, when looking at forensic applications, 3D face reconstruction must observe strict requirements that still make its possible role in bringing evidence to a lawsuit unclear. An extensive investigation of the constraints, potential, and limits of its application in forensics is still missing. Shedding some light on this matter is the goal of the present survey, which starts by clarifying the relation between forensic applications and biometrics, with a focus on face recognition. Therefore, it provides an analysis of the achievements of 3D face reconstruction algorithms from surveillance videos and mugshot images and discusses the current obstacles that separate 3D face reconstruction from an active role in forensic applications. Finally, it examines the underlying datasets, with their advantages and limitations, while proposing alternatives that could substitute or complement them.

1 Introduction

In the past few decades, much attention has been paid to the use of 3D data in facial image processing applications. This technology has proven promising for robust facial feature extraction [10, 51, 189]. In uncontrolled environments, it limits the effects of adverse factors such as unfavorable illumination conditions and non-frontal poses of the face with respect to the camera [51, 148, 176].
Among the various scenarios, developing personal recognition based on 3D data appears to be a “hot topic” due to the accuracy and efficiency obtainable from comparing faces, thanks to the complementary information of shape and texture [12, 16, 97]. However, acquiring such data requires expensive hardware; moreover, the enrollment process is much more complex [143, 148, 184, 219, 225]. Thus, face recognition technology was mainly developed in the 2D domain. The acquisition of 2D images is more straightforward than that of 3D ones, as it does not require specific hardware, but often makes the recognition task challenging due to the significant variability in facial appearance [35, 148]. 3D face reconstruction (3DFR) from 2D images and videos may overcome these limits, combining the ease of acquiring 2D data with the robustness of 3D ones (Figure 1).
Fig. 1. Example of reconstruction of a 3D facial model from a single input 2D image (obtained through the framework proposed by Reference [104]).
One of the fields that could benefit from these advantageous characteristics is forensics, which often deals with probe images of unidentified people’s faces captured in non-frontal views, in uncontrolled environments, and without the subject’s cooperation, as is typical of CCTV (Closed-Circuit Television) cameras. Although some frameworks for acquiring 3D face models of suspects have been proposed (e.g., Reference [126]), in such a context it is still common to rely on 2D mugshots, that is, frontal and, usually, profile images routinely captured by law enforcement agencies [131] for the recognition of people of interest, such as suspects or witnesses (Figure 2).
Fig. 2. Example of forensic facial recognition from a mugshot reference gallery and a probe image (images from the SCface dataset [80]).
Unfortunately, a reference gallery composed of frontal and profile images cannot provide effective coverage of all possible conditions, such as a probe image in an arbitrary pose that does not match the view angle of any of the available mugshot images [230]. Therefore, since the first attempt at face recognition from mugshots [210], 3D reconstruction techniques have also been exploited to address some of the issues typical of the considered forensic cases, in which the identity of unknown individuals is established against a reference dataset of known individuals, either in verification mode (1 to 1) or identification mode (1 to N). Hence, the research community proposed to employ this approach in facial recognition from probe videos and images acquired in unconstrained environments, providing more information about the individual faces through the generation of multiple views or the “correction” of the pose in probe data. This makes the comparison with reference data more robust to the appearance variations typical of forensic cases.
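The two matching modes just mentioned can be sketched in a few lines of code. The embeddings, identity labels, and threshold below are purely illustrative assumptions; real systems compare high-dimensional deep features:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two face embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe: np.ndarray, reference: np.ndarray, threshold: float = 0.8) -> bool:
    """Verification (1 to 1): does the probe match one claimed identity?"""
    return cosine(probe, reference) >= threshold

def identify(probe: np.ndarray, gallery: dict) -> list:
    """Identification (1 to N): rank all enrolled identities by similarity."""
    return sorted(gallery, key=lambda k: cosine(probe, gallery[k]), reverse=True)

# Hypothetical 4-D embeddings (real systems use, e.g., 512-D vectors).
gallery = {
    "subject_a": np.array([1.0, 0.0, 0.0, 0.0]),
    "subject_b": np.array([0.0, 1.0, 0.0, 0.0]),
}
probe = np.array([0.9, 0.1, 0.0, 0.0])
print(identify(probe, gallery)[0])          # subject_a
print(verify(probe, gallery["subject_a"]))  # True
```

In identification mode, the ranked list (rather than a single accept/reject decision) is what a system such as a mugshot search tool would hand to a human examiner.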
In particular, to be suitable for real-world forensic applications, any system of this kind should satisfy strict constraints that grant legal validity to its conclusions during a lawsuit or in the investigation phase [27, 110]. For this reason, it is necessary to analyze the methods that employ 3DFR to shed some light on their admissibility in the forensic scenario. Although other authors have investigated the state-of-the-art of 3DFR from 2D images or videos [61, 73, 148, 234] and its applications to face recognition [61, 148, 156], none of them considered the requirements such methods have to satisfy to be potentially employed in this context, nor how forensics can benefit from their adoption. Moreover, the validity of the proposed face recognition systems in the considered application scenarios strongly depends on the datasets on which they have been evaluated, since these provide the basis for measuring and comparing their performance with the state-of-the-art. In other words, data representativeness is fundamental, and the algorithms’ adoption is bounded by the available data [40, 174].
A specific investigation highlighting the potential and limits of 3D facial reconstruction in forensics is still missing and, in our opinion, is necessary to direct research toward its real-world application. To pursue this goal, this work analyzes the potential of employing 3D face reconstruction in forensics and the approaches proposed by the research community for its integration in a common face recognition casework, while considering the core challenges of the legal admissibility of automated systems that include it. The central premise is to shed some light on the requirements that should be satisfied to fill the gap between biometric recognition and forensic comparison when reconstructing a facial image into 3D space for the recognition of an individual from 2D videos or images. Investigating the potential benefit of this technique to forensics is, in summary, the aim of our work.
This article is the follow-up of Reference [123], which was a first step toward the objectives listed above. To our knowledge, it represents the first investigation focused on the state-of-the-art in applications and potential of 3D face reconstruction in forensics and the novelties introduced to date (Figure 3), as well as on the requirements that any of the related systems must satisfy to be considered admissible in criminal investigations or judicial cases. With respect to Reference [123], this article extends that analysis, especially in relation to the comparison among the proposed methods and the admissibility constraints that have to be satisfied for effective integration into the reference scenario. Moreover, this survey also provides an analysis of the datasets employed in the reviewed studies, which could further highlight their strengths and limits, suggesting how they can be used in the design and evaluation of forensic facial recognition algorithms and what issues may arise. Finally, some state-of-the-art datasets that could be alternative or complementary to those already used are proposed and analyzed as well, to provide suitable ground truth for future studies, with the main focus on the types of data considered so far, namely, facial images, videos, and 3D scans of the face.
Fig. 3. Milestones of forensic identification based on 3D face reconstruction.
The article’s structure is as follows: Section 2 analyzes the relationship between forensics and biometrics, mainly focusing on facial traits and the integration of 3DFR. The state-of-the-art assessment of 3DFR methods for face recognition from mugshot images is reported in Section 3. A review of other proposed forensic-related applications of 3DFR from facial images and videos is carried out in Section 4. Section 5 explores the underlying datasets of facial images, videos, and 3D scans, proposing others that could be suitable as well for future research on the analyzed topic. Finally, Section 6 discusses how all the aspects above converge in a unified view.

2 Face Recognition and Forensics

The face represents a valuable clue in many criminal investigations due to its advantageous characteristics with respect to other biometrics [109, 164] and the growing number of surveillance cameras in both private and public places [52, 102, 140]. Over the years, various methods have been proposed to check whether the individual’s identity in a probe image or video matches that of a person of interest, namely, an individual related to the event under investigation, such as a suspect, a victim, or a witness. In particular, these represent a subset of the approaches widely explored in traditional biometric recognition and implemented in the related automated face recognition systems [109, 120, 185]. These methods can be summarized into various qualitative or quantitative examination approaches, which can be employed or are preferred under different conditions [60, 62].
A first approach processes the face globally in a holistic form. However, it is recommended only if other, more effective approaches are not suitable, and it is highly inaccurate when faces belong to unfamiliar people, when faces are partially occluded [32, 62, 64, 216, 226], or when CCTV footage is severely distorted [34].
A second approach is based on a set of facial fiducial points, named landmarks [28, 49], employed to derive distances and proportions between facial features. This choice is generally not recommended either, owing to the subjectivity of their manual estimation in uncontrolled images, where adverse factors such as large head poses, the distance from the camera, facial expressions, and lighting conditions come into play [62, 118, 150, 151, 208]. Some of these issues could be mitigated by means of preprocessing techniques (e.g., super-resolution methods [101]).
A third approach is superimposition. It handles the discrepancies arising from differences in the position of the face with respect to the camera in two aligned images or videos by combining them through various methods, such as a reduced-opacity overlay or rapid blinking between them. This approach has proven unreliable when comparing data acquired in uncontrolled scenarios, even in previous judicial cases [5, 24, 62, 137, 150, 192, 193, 226].
A fourth approach is morphological comparison, in which a generally predefined list of facial regions, and features extracted from them related to shape, appearance, presence, and/or location (such as the width of the mouth relative to the distance between the eyes, or the asymmetry of the mouth [83]), are compared to determine differences and similarities between the probe and reference data [226].
In particular, the latter approach can improve examiners’ identification accuracy, thanks in part to the higher physical stability of its features over time compared with many photoanthropometric and holistic features [86, 151]. However, the stability of the evaluated features can also be affected by extrinsic factors, such as lighting and the position of the subject’s face with respect to the camera, which introduce different levels of variability and contribute to the unreliability of certain features [119, 151, 224].
Despite their differences in reliability and acceptance, these approaches are not alternatives to each other. The choice among them generally depends on the probe image or video, and they can even be used jointly in the identification task to carry out a more exhaustive analysis [15, 108, 151]. Furthermore, even when these approaches cannot be used as evidence for a confirmatory identification due to the acquisition conditions of the probe image or video, they can still be employed in an attempt to exclude possible suspects, or provide limited, but not worthless, support for reaching a conclusion through other evidence [84, 119, 137, 151].
Although both biometric recognition and forensic identification seek to link evidence to a particular individual [112], research in these fields has been pursued independently for many years due to their different goals and requirements, as well as the difficulties in achieving significant scientific contributions in this cross-domain research field [123]. Thus, despite the employment of approaches that are common between them, the underlying methods and the automated systems integrating them must satisfy strict constraints to be considered suitable for forensic casework.

2.1 Automated Forensic Facial Recognition: The Italian Case

Due to the stringent requirements of the analyzed field, automatic recognition systems have only recently been introduced. For example, in 2017, the Italian police introduced the ordinary use of an automatic image recognition system, S.A.R.I. (from the Italian “Sistema Automatico di Riconoscimento delle Immagini”), as an innovative tool aimed at supporting investigative activities [17, 173]. This system automatically compares a facial probe image with millions of mugshots to reduce the number of candidates, which are then ranked by similarity. Furthermore, the system is also able to work in real-time on a gallery on the order of hundreds of thousands of individuals to enforce security and control over the territory. SARI’s outcome is a set of potential candidates that must be examined by the specialized experts of the scientific police in charge of verifying the process [22, 173]. Despite the effectiveness and speed of this automatic system, it cannot yet be used in the criminal field, as it does not allow the defense to access and repeat the recognition, thus precluding cross-examination of the specific functioning of the software in question [38, 70, 173, 179]. Moreover, its functioning lacks the transparency required for any criminal case, thus precluding its compatibility with the constitutional procedural guarantees granted to the suspect [173, 179].
If this is the state of things in face recognition, then what about 3D face reconstruction? 3D reconstruction is already employed for enhancing the views of crime scenes (e.g., Reference [142]) as computer-generated evidence [22]. Thanks to the 3D representation of the scene, obtained from one or more reference photographs, it is possible to recreate the scenario of interest, for example, by inserting moving objects and simulating people’s behavior while respecting physical laws. However, depending on the task for which it has to be employed, this technology could be considered inadmissible due to the still experimental nature of the underlying method [22]. Furthermore, the accuracy of the reconstruction of the human body, and of the face in particular, is low: it is strongly influenced by the resolution of the reference images as well as by the subjectivity of the operator in positioning the characterizing points for the reconstruction [141].
Therefore, a fully automated 3D reconstruction, such as the one integrated into biometric systems, could reduce the errors caused by the operator, standardize the process, and speed up the analysis, provided that sufficient quality of the resulting 3D model can be guaranteed. These advantages led the research community to propose methods and approaches strictly focused on reconstructing the body or even single parts, such as the face. In particular, the 3D reconstruction of the face could be crucial for some forensic recognition tasks, strongly enhancing the recognition accuracy with respect to recognition from raw images, especially for faces in non-frontal poses. Moreover, this technology could even be integrated into the previously cited scene reconstruction technology to strengthen the reliability of the related computer-generated evidence and make it employable for real recognition tasks.
These factors could be crucial in the introduction of this technology in the forensic recognition task. However, it must comply with the technical and admissibility requirements, summarized in Figure 4 and discussed in the following subsections, which any system must satisfy to be considered suitable to be employed in such a field.
Fig. 4. Forensic admissibility evaluation of an automatic biometric system in a casework.

2.2 Biometric Systems and Forensic Admissibility

Techniques and systems designed for biometrics, especially automated ones, are appealing for their potential to address some of the forensic domain’s problems concerning crime prevention, crime investigation, and judicial trials in a more efficient, “scientifically objective,” and standardized way [15, 112, 149, 162, 176, 190, 220]. In the case of face recognition, the related technology has a role in many forensic and security applications, such as identifying people of interest (e.g., terrorists) and searching for missing people, even in real-time [71, 75, 99]. In particular, concepts behind biometric facial recognition could be beneficial in various tasks underlying forensic applications. For example, person re-identification and face identification could aid forensic practitioners in the search task, that is, the collection of evidence from crime scene images acquired by surveillance cameras [197], and in the investigation, that is, linking traces between crime scenes by generating and testing likely explanations [197]. Face recognition could also aid the individualization (or forensic evaluation) step, in which an evidential value is computed and assigned to the collected traces [197], with a noticeable parallelism with the similarity scores assigned by most automated face recognition systems in biometric recognition tasks.
However, although several groups, such as the FISWG (Facial Identification Scientific Working Group) [9] and the ENFSI (European Network of Forensic Science Institutes) [4], are currently working in this direction, there is still no standardized and validated method in forensics [15, 146, 149]. For example, in the United States, the admissibility of scientific evidence obtained through face recognition is generally evaluated through two guidelines:
the “Frye’s rule” gives the judges the task of assessing whether the technique or technology is accepted in a relevant scientific community [1];
the “Daubert’s rule” adds to the previous one the constraints that the technique has been tested, that its error rate is documented, and that it is maintained and adheres to standards [2, 6, 7, 15, 63].
In many judicial systems beyond the U.S.A., no specific admissibility rule for the evaluation of scientific evidence is given; in the European judicial system, for example, judges are generally responsible for its assessment in individual cases [15]. Another issue is the general acceptance of the biometric trait itself, especially the face, to the point that some governments have banned or limited its usage even by law enforcement agencies (e.g., References [48, 76, 136, 145, 218]). The concern particularly relates to positive identification, due to the severe consequences of a false match in forensic cases combined with previous failures of face recognition systems in that direction [31, 100].
Therefore, a robust and transparent methodology must be provided for forensic recognition, whose effectiveness can be quantitatively assessed in statistical and probabilistic terms. The goal is to provide guidelines for quantifying the value of biometric evidence and its strength based on assumptions, operating conditions, and the casework’s implicit uncertainty [72, 136, 197]. Besides, a set of interpretation methods must be defined independently of the baseline biometric system and integrated into the considered algorithm [153, 197]. This allows reaching conclusions in court trials in agreement with three constraints (Figure 5): performance evaluation, understandability, and forensic evaluation [27, 54, 110]. Closely related to these constraints, the quality of the probe and reference data should also be considered in the admissibility assessment [27, 176, 220].
Fig. 5. Taxonomy of forensic recognition methods based on 3DFR with respect to the evaluation levels for forensic purposes.

2.2.1 Performance Evaluation.

Performance evaluation concerns the basic trust level of the system and its performance for a specific purpose; therefore, it supports the forensic practitioner’s decision when using such a system to perform a given task. For instance, a biometric system could be considered suitable for a specific task whenever it is tested and achieves a performance acknowledged as “good” on data representative of the system’s working context; for example, a face recognition system designed to perform well on high-resolution frontal images is not required to achieve the same performance on images acquired by CCTV cameras with arbitrary head poses [27]. In a statistical evaluation, the definition of “good performance” depends on the context, the data, and the end-users’ requirements set in the design process. The performance parameters differ according to the system itself and the specific task for which it should be employed. For example, the accuracy, namely, the percentage of correctly classified samples [113], could be considered in evaluating the performance of classification problems, such as the face recognition task. Distance-based metrics can instead be used for evaluating the error between the predicted values and the real ones in regression problems such as 3D reconstruction tasks. An example of the latter is the Root Mean Square Error (RMSE), which considers the distance between a reconstructed facial part and the corresponding ground truth in terms of pixels (e.g., Reference [229]). As previously mentioned, understanding the metrics employed requires basic statistical knowledge, which legal decision-makers often lack. This makes it difficult to justify the use of a particular system by such metrics in a law court [27]. Thus, a certain level of confidence in the underlying technical aspects is necessary to interpret the performance parameters adopted.
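As an illustration, an RMSE of this kind can be computed over corresponding points of a reconstruction and its ground truth. The coordinates below are toy values; the units reported in practice (pixels, millimeters) depend on the evaluation protocol:

```python
import numpy as np

def rmse(reconstructed: np.ndarray, ground_truth: np.ndarray) -> float:
    """Root Mean Square Error over corresponding 3D points."""
    diff = reconstructed - ground_truth
    # Mean of squared Euclidean distances between paired points, then root.
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=-1))))

# Toy example: three corresponding facial landmarks.
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
rec = np.array([[0.1, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.1]])
print(round(rmse(rec, gt), 3))  # 0.1
```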
Another issue for the trust of biometric recognition systems in forensics is that of biased performance against certain demographic groups, meaning that the performance parameters may depend, on average, on the demographic groups present in the system’s dataset [110, 212]. For example, biased performance across age, gender, and ethnic groups was recently reported [33, 110, 166]. In face recognition, bias is a severe problem, since facial regions contain rich information strictly correlated with many demographic attributes, which could lead to biased performance [117]. This issue has often been overlooked when face recognition systems were employed by law enforcement agencies [82]. Thus, failing to analyze this aspect, or the demographic group representative of the casework, could lead to the inadmissibility of the biometric system in judicial trials or, simply, to unreliable support for the human expert’s decision. In other words, the choice of the datasets employed for training and evaluating the system is one of the factors that must be considered in the performance evaluation [108]. Furthermore, fairness, interpretability, and even performance could benefit from the ability of a system to provide information about how biased its decisions could be [72, 157] (see also Section 2.2.2).
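A minimal sketch of such a per-group check, assuming a hypothetical evaluation log of genuine comparison trials, could report the false non-match rate separately for each demographic group:

```python
from collections import defaultdict

def fnmr_by_group(trials):
    """Per-group false non-match rate from (group, is_genuine, accepted)
    trial records; large gaps across groups signal biased performance
    that should be reported alongside aggregate accuracy."""
    genuine = defaultdict(int)
    rejected = defaultdict(int)
    for group, is_genuine, accepted in trials:
        if is_genuine:
            genuine[group] += 1
            if not accepted:
                rejected[group] += 1
    return {g: rejected[g] / genuine[g] for g in genuine}

# Hypothetical log: (demographic group, genuine pair?, system accepted?).
trials = [
    ("group_1", True, True), ("group_1", True, True),
    ("group_1", True, False), ("group_1", True, True),
    ("group_2", True, False), ("group_2", True, False),
    ("group_2", True, True), ("group_2", True, True),
]
print(fnmr_by_group(trials))  # {'group_1': 0.25, 'group_2': 0.5}
```

The same breakdown can be computed for false match rates on impostor pairs; reporting both per group is what exposes the kind of demographic bias discussed above.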

2.2.2 Understandability.

Understandability (also known as interpretability [27]) is the ability of a human to understand the functioning of a system, its purpose, its features, as well as its outcome and the (computational) steps that led to such a result. In particular, the understandability evaluation supports the decision of whether the outcome of the system is suitable. This is particularly relevant for legal decision-makers (e.g., judges) who are typically not experts in those topics [11, 19, 27, 88, 110, 125].
A first step toward making a system understandable is to design it to be “explainable” in its decision-making process. This facilitates traceability, which, in turn, could help prevent or deal with erroneous decisions by revealing possible points of failure and the most appropriate data and architecture [11, 47, 72]. The main difference between understandability and explainability is that the latter focuses on the system’s design [27], while the former focuses on the end-user experience. Therefore, a system’s understandability requires an explainable design process.
A factor that can improve a system’s understandability is its transparency, that is, the ability of the forensic practitioner to access the details of the functioning of such a system [27]. For example, a fully open-source system is entirely transparent. However, even full transparency does not imply understandability, as in the case of image processing algorithms whose effects cannot be reversed: they cause a loss of detail or an irreversible/random addition that could even compromise the reproducibility required for any automated system to be employable by forensic practitioners in reaching conclusions [139]. Moreover, even details about the algorithms and the implementation of very complex systems like neural networks could be insufficient for their understanding [27].
Therefore, for both complex and black-box systems, such as those based on Artificial Intelligence, it is necessary to add sufficient local and/or global interpretations through dedicated metrics and mechanisms [11, 27, 72, 87, 125, 127]. For example, the forensic practitioner must be able to determine whether the system is using the face area rather than the background when computing the related outcome. Moreover, understandability aids legal decision-makers in cases where the prosecution and the defense present contradictory results based on their own black-box systems [110]. Some examples of approaches for enhancing explainability and, in particular, spatial understandability in the context of face recognition are the extraction of features in different areas of the face [222] and the use of model-agnostic methods (i.e., not tied to a particular type of system [11, 87]) that visualize the salient areas contributing to the similarity between pairs of faces [194]. Other approaches are the estimation of the uncertainty of features through the analysis of the distributional representation of each input facial image in the feature space, assessing the uncertainty through the variance of such distributions [186], and the analysis of the effect of features such as facial angle and non-facial elements on the resulting outcomes [55, 196]. However, black-box systems such as deep neural networks still lack the interpretability needed to be effectively employed in forensic processing. In particular, understanding what information is encoded from the input image into deep face representations would also help address possible biases of the system (e.g., toward a demographic group) [110, 156].

2.2.3 Forensic Evaluation.

Forensic evaluation is the assignment of a relative plausibility of information over a set of competing hypotheses (or “propositions”) [27]. It supports the forensic practitioner’s opinion regarding the level of confidence and the weight (i.e., the strength) of evidence when the system makes a decision according to its outcome [27, 108, 125]. The system’s performance and understandability are taken into account in forensic evaluation, together with contextual information (e.g., additional cues or supporting evidence from other sources) and general knowledge, that is, additional information that could be either included in the decision process or formalized into the automated system itself [27, 58, 111, 171]. Therefore, forensic evaluation combines the above elements to drive forensic practitioners toward an appropriate decision (e.g., identification) that could be either conclusive or inconclusive according to the assessed level of confidence [27, 46, 56, 57, 58].
From a technical perspective, forensic evaluation is quantitatively given by a statistical approach based on likelihood ratio (LR) values [146, 167, 176, 197, 209, 217]. In particular, it is acknowledged that the LR allows for a transparent, testable, and quantitative assessment of the probability assigned to the evidence of a face match by forensic practitioners, based on personal experience, experiments, and academic research, against the probability of a non-match [27, 171]. A semi-quantitative scale could also be employed, in which values are aligned with ranges of likelihood ratios (e.g., weak/medium/strong) or express the relative strength of forensic observations in light of each proposition [27, 39, 42]. Therefore, thanks to its transparency, testability, and formal correctness, the LR allows a clear separation of responsibilities between the forensic examiner and the court. This makes it compliant with the requirements of evidence-based forensic science when quantifying the value of the evidence for the law court [168, 175]. However, it must be remarked that calibrating a biometric score into an LR requires a substantial amount of case-relevant data, that is, data representative of the analyzed scenario in terms of quality (see Section 2.2.4) and demographic group (see Section 2.2.1).
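As a toy illustration of this idea, a similarity score from a face matcher can be converted into an LR by modeling the score densities under the same-source and different-source propositions. The Gaussian fits and calibration scores below are illustrative assumptions only; operational calibration requires case-relevant data and more robust methods (e.g., logistic regression or pool-adjacent-violators):

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def likelihood_ratio(score, genuine_scores, impostor_scores):
    """Naive LR: density of the score under the same-source proposition
    divided by its density under the different-source proposition, each
    modeled as a Gaussian fitted to calibration scores."""
    def fit(xs):
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
        return mean, math.sqrt(var)
    mg, sg = fit(genuine_scores)
    mi, si = fit(impostor_scores)
    return gaussian_pdf(score, mg, sg) / gaussian_pdf(score, mi, si)

# Hypothetical calibration scores from a face matcher:
genuine = [0.82, 0.88, 0.79, 0.91, 0.85]
impostor = [0.31, 0.42, 0.38, 0.29, 0.35]
print(likelihood_ratio(0.87, genuine, impostor) > 1)  # True: favors same source
print(likelihood_ratio(0.33, genuine, impostor) < 1)  # True: favors different sources
```

An LR above 1 supports the same-source proposition and an LR below 1 the different-source one; the examiner reports the strength of this support, while the court weighs it against the other evidence.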

2.2.4 Quality Evaluation.

As previously pointed out, the characteristics of the acquired data are also relevant. First, they should meet minimal requirements in terms of quality [176, 220]. Although not rigorously defined, this term refers to factors that lead to blurriness, distortion, and artifacts in images. These may be caused by (1) the camera employed, whose sensor, optics, and analog-to-digital converter affect the image resolution, the dynamics of gray levels, and the ability to focus on the target [85, 108, 128, 137, 170, 187]; (2) environmental conditions, such as the illumination and the background of the scene, or the weather conditions (e.g., rain or clouds) [105, 128, 170]; (3) the subject, whose distance from the camera adds scaling and out-of-focus problems, along with camouflage adopted to evade recognition (sunglasses, beard/mustache, hat/cap, makeup, jewelry), the speed and direction of movement, and the position of the face with respect to the camera, which can lead to non-frontal views and incomplete data [85, 102, 124, 137, 152, 213]; and (4) the image processing embedded into the camera or applied right after the raw data acquisition, such as compression and re-sizing [85, 108, 128, 170]. Therefore, quality must be evaluated for both probe and reference facial data to assess whether the proposed face recognition system is compliant with data of the kind [10, 27, 105, 159]. Second, the amount of data is crucial, both for training and fine-tuning a new classification system [105, 220] and for calibrating LR frameworks for the evaluation of new and existing systems (see Section 2.2.3), leading to the creation of large-scale datasets for the evaluation of face recognition algorithms (e.g., Reference [103]).
While the acquisition of mugshot images by law enforcement agencies is usually subject to strict control to ensure a truthful representation of appearance, this is often not the case for probe images and videos. Therefore, the quality of the available data must be assessed to determine whether it can fulfill the intended biometric function, including the 3D reconstruction task and the subsequent recognition. The final goal is to employ the system’s outcome in the forensic investigation and the subsequent judicial decision. In between, quality evaluation would allow assessing the confidence level of decisions based on such data, or ranking and selecting the samples with the best quality (e.g., single frames from a surveillance video) [98, 181, 198, 200]. To the best of our knowledge, a global standard for quality assessment is currently missing [98, 181], probably also due to the human subjectivity involved in the task, and international standards are still under development (e.g., References [106, 107]). However, a score based on the Mean Opinion Score (MOS) was proposed [182] to justify the legal acceptance or rejection of a potential probe image, video, or part thereof. Unfortunately, the MOS method is often impractical, since it is slow, expensive, and, in general, inconvenient. Although other quality assessment methods have been proposed, most of them are not representative of human perception [92, 214]. In our opinion, specific expertise in agreement with the law court process should be included (e.g., References [182, 198]). Furthermore, quality measures on the “partial results” of the system should be integrated as well.
For example, in the case of forensic recognition based on 3D face reconstruction from 2D images or videos, the 3D model reconstructed from either reference or probe data could be corrupted due to inaccurate localization of facial landmarks [23], requiring the localization process to be repeated or even the sample to be discarded as unusable. Therefore, quality measures could be integrated into forensic recognition as complementary features [74, 98, 163]. This means that quality assessment would have to satisfy the previously described requirements (Sections 2.2.1–2.2.2) to be admissible in the analyzed context [181] according to the forensic evaluation process (Section 2.2.3).

2.3 3D Face Reconstruction in Forensics

During the investigation phase, the subject’s identity is unknown, and the possible identities within a suspect reference set need to be rendered and sorted [220] in terms of likelihood with respect to the evidence (e.g., a frame captured from a CCTV camera) [14]. In addition to the classic challenges of facial recognition in uncontrolled environments, such as low resolution, large poses, and occlusions [89], forensic recognition faces even harder ones. Examples are cheaply set-up acquisition systems and subjects who actively try not to be captured by cameras, which exacerbate the previously cited issues and introduce new problems such as heavy compression, distortions, and aberrations [226]. Thanks to the greater representational power of 3D over 2D facial data, 3DFR can alleviate some of these problems: 3D data provides a representation of the facial geometry that reduces the adverse impact of non-optimal pose and illumination. Depending on the characteristics of the probe image and of the reference set narrowed down by police and forensic investigation, whenever the investigator must compare these images and it is necessary or advantageous to use an automatic face recognition system, 3DFR can be employed by following two different approaches, namely, view-based and model-based (Figure 6), to improve the performance of facial recognition systems and, therefore, enhance their admissibility in legal trials.
Fig. 6. Taxonomy of performance enhancement methods through 3DFR for forensic recognition.
In a view-based approach, images containing frontal faces are adapted to non-frontal ones; it is thus typically applied to the reference set, adapting the faces within mugshots so that their pose matches the one in the probe image [160] (Figure 7). Although it allows comparing facial images under similar poses, this approach requires either a reference set containing images of suspects captured in such poses or the synthesis of such views through a 3D model of each suspect. In the latter case, each 3D model can be adapted after applying a pose estimation algorithm to the probe image, before employing the actual recognition system [59, 138, 228, 229]. Another proposed strategy is a gallery enlargement phase, which consists of projecting the 3D model into the 2D domain in various predefined poses to enhance the representation capability of each subject and then employing the synthesized images in the recognition task [93, 131, 132, 230]. The view-based approach is a suitable choice whenever multi-view face images of suspects are captured during enrollment for the purpose of highly accurate authentication, as in the verification task in face recognition [93], although it usually involves a higher computational cost in both time and memory than its model-based counterpart.
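The gallery enlargement idea can be illustrated with a minimal geometric sketch: rotate the reconstructed 3D points to each predefined yaw angle and project them to 2D. This is an illustrative toy (orthographic projection, rotation about the vertical axis only, no texture rendering), not the pipeline of any specific surveyed method:

```python
import math

def rotate_yaw(points, angle_deg):
    """Rotate 3D points (x, y, z) about the vertical (y) axis."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a + z * sin_a, y, -x * sin_a + z * cos_a)
            for x, y, z in points]

def project_orthographic(points):
    """Drop the depth coordinate to obtain a 2D view."""
    return [(x, y) for x, y, _ in points]

def enlarge_gallery(model_points, yaw_angles):
    """Synthesize one 2D point set per predefined pose."""
    return {a: project_orthographic(rotate_yaw(model_points, a))
            for a in yaw_angles}

# Toy "model" of a single landmark (e.g., the nose tip) at unit depth.
nose_tip = [(0.0, 0.0, 1.0)]
views = enlarge_gallery(nose_tip, [-30, 0, 30])
```

Each synthesized view would then be rendered with texture and added to the gallery alongside the original mugshots.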
Fig. 7. Example of face recognition through a view-based approach (the 3D model was obtained through Reference [104]).
In a model-based approach, the adaptation phase is performed on non-frontal faces, synthesizing a frontal-view face through the reconstructed 3D face [93] (Figure 8). The normalized (or “frontalized”) face is then compared to the frontal faces within the gallery set to determine the subject’s identity in the probe image [69, 95]. This approach suits real-world scenarios in which the identity of an unknown person in a probe image or video must be sought in a large-scale mugshot dataset [93], as in the so-called face identification task in biometric recognition, maximizing the likelihood of returning the potential candidates. Despite its generally lower computational cost, this approach is only applicable when good-quality frontal views can be synthesized with the original texture, since texture can provide information complementary to shape for recognition [16, 134]. Following the discussion in Section 2.2.4, the minimum quality requirements for the probe images must be met, which is often not the case in real forensic scenarios. Furthermore, it could be necessary to handle possible textural artifacts in the resulting frontal image [36, 95, 233].
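Once a probe face has been frontalized, identification reduces to comparing its feature vector against each gallery identity and ranking by similarity. A minimal sketch, assuming feature vectors have already been extracted by some face comparator (the two-dimensional vectors and identity names below are purely illustrative):

```python
import math

def cosine_similarity(a, b):
    """Similarity between two feature vectors (1 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(probe_features, gallery):
    """Return the gallery identity most similar to the frontalized probe."""
    return max(gallery, key=lambda sid: cosine_similarity(probe_features, gallery[sid]))

# Toy gallery of pre-extracted feature vectors, keyed by (hypothetical) identity.
gallery = {"suspect_a": [1.0, 0.1], "suspect_b": [0.1, 1.0]}
match = best_match([0.9, 0.2], gallery)
```

In a forensic setting, the raw similarity scores would additionally be calibrated into likelihood ratios rather than reported as-is.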
Fig. 8. Example of face recognition through a model-based approach (based on Reference [95]).
Hence, applying a view-based approach changes the scenario from a more traditional 2D-to-2D recognition to a 3D-to-2D recognition, in which the reconstructed 3D face representation is typically used to generate synthetic facial views matched with the probe image [35]. This can be achieved by rotating the 3D model so that its pose matches the one in the compared image, possibly after applying similar lighting conditions to the model to ease the comparison (e.g., Reference [211]). Similarly, a model-based approach can be exploited both to aid the 2D-to-2D face recognition task, through the synthesis of non-frontal faces (as probe images typically are) in the frontal view [93], and in the 2D-to-3D recognition scenario, where several synthetic views provide a set of potential probe images [206] in agreement with the reference ones. Together, these approaches would also allow a 3D-to-3D recognition scenario: The 3D representation of the face reconstructed from the reference images is compared with the one reconstructed from a probe video sequence [35]. The view-based approach typically involves reconstruction from mugshots and the model-based approach reconstruction from probe images, mainly due to the typical qualitative characteristics of the data. Nonetheless, both approaches can be employed on both sets of data, according to the specific task (e.g., it could be possible and convenient to apply a view-based approach to a surveillance video to ease the comparison). However, the potential bias towards the average geometry must be taken into account when reconstructing the 3D faces [205], especially when the reconstruction is performed from single images.

3 3D Face Reconstruction for Mugshot-based Recognition

Although many attempts have been made in the past years to reconstruct faces in the 3D domain, either from a single image or from multiple images of the same subject [148], only a few have been evaluated for potential forensic applications. Among them, we focus on those exploiting mugshot images captured by law enforcement agencies, because methods following this approach come closest to satisfying the previously discussed criteria for potential admissibility in forensic cases.
To our knowledge, the earliest study on 3DFR from mugshot images for forensic recognition was proposed in 2008 by Zhang et al. [230], who employed a view-based gallery enlargement approach to recognize probe face images in arbitrary views with the aid of a 3D face model reconstructed for each subject from mugshot images (Figure 9). To reconstruct the shape of such a model, they proposed a multilevel variation minimization approach that requires a set of landmarks specified on a pair of frontal-side views to be used as constraining points (i.e., eyes, eyebrows, nose profiles, lips, ears, and points interpolated between them [232]). Finally, they recovered the corresponding facial texture through a photometric method. They evaluated their approach on the CMU PIE dataset [188] using a holistic face comparator (or matcher) [202] and a local one typically employed in biometrics for textural classification [13], restricting the rotation angles of the probe images to \(\pm 70^{\circ }\). This analysis revealed a significant improvement in average recognition accuracy with respect to the original mugshot gallery, especially when the rotation angle of the face in the probe image is larger than 30\(^{\circ }\). However, the limit on the rotation angle of faces in probe images and the use of traditional rather than state-of-the-art face comparators do not allow assessing the actual improvement brought by 3DFR from mugshot images in terms of forensic recognition [93, 131]. Other drawbacks of the proposed method are the possible artifacts caused by the assumed model [228] and the poorly explored image texture. Furthermore, the analysis was performed on a small-scale dataset containing only 68 subjects.
Finally, despite the improved performance and the usage of a local face comparator that enhances understandability [222], expressing the similarity between single facial parts rather than providing a global similarity and allowing the assessment of the salient areas that led to the outcome of the system, the authors did not adopt any strategy for facilitating the forensic evaluation. Moreover, the analysis of local patterns could also help address the presence of occlusions. Another aspect to consider is the computational time required for the gallery enlargement, which appears to make the method unsuitable for applications with strict time constraints, even accounting for the age of the hardware on which it was tested (Table 1). We discuss this factor further in Section 6.
Table 1. Methods Based on 3DFR for Face Recognition from Mugshots (FMR is False Match Rate).

Zhang et al. [230]
  Input data: frontal and profile
  Datasets: CMU PIE [188]
  System: Intel Pentium IV 2.8 GHz with 1 GB of memory
  Time complexity (s): shape reconstruction 985.8; texture reconstruction 225.6; recognition training 1,048.64 (68 subjects); single test 0.635
  3DFR enhancement approach: gallery enlargement
  Performance enhancement: from 74.27% to 93.45% face recognition rate by introducing synthesized virtual views
  Understandability enhancement: local binary patterns

Han and Jain [93]
  Input data: frontal and profile
  Datasets: FERET [161] and PCSO [3]
  System: N.A.
  Time complexity (s): N.A.
  3DFR enhancement approach: probe frontalization or gallery enlargement
  Performance enhancement: from 72.5% to 82.1% rank-1 accuracy with probe frontalization; from 0.1% to 65.5% verification rate at FMR = 0.1% with gallery enlargement
  Understandability enhancement: local binary patterns

Dutta et al. [59]
  Input data: frontal
  Datasets: CMU PIE [188] and Multi-PIE [81]
  System: N.A.
  Time complexity (s): N.A.
  3DFR enhancement approach: gallery adaptation
  Performance enhancement: N.A.
  Understandability enhancement: N.A.

Zeng et al. [228, 229]
  Input data: frontal, left profile, and right profile
  Datasets: Color FERET [161], Bosphorus [178], and CMU PIE [188]
  System: Intel Core i5 2.60 GHz with 4 GB of memory
  Time complexity (s): shape and texture reconstruction N.A.; recognition training N.A. (68 subjects); single test 9
  3DFR enhancement approach: gallery adaptation
  Performance enhancement: mean accuracy of 97.8%, 2.3% higher than Zhang et al. [230]
  Understandability enhancement: local binary patterns on landmark-based patches

Liang et al. [131, 132]
  Input data: frontal, left profile, and right profile
  Datasets: Multi-PIE [81] and Color FERET [161]
  System: Intel Core i7-4710 with 16 GB of memory
  Time complexity (s): shape reconstruction 0.04; texture reconstruction 1.11; recognition training 33 (1,000 samples); single test N.A.
  3DFR enhancement approach: gallery enlargement
  Performance enhancement: rank-1 identification rate from 70.35%–88.22% to 87.94%–94.88% with DL comparators on Multi-PIE [81] (86.30%–94.41% with the comparator of Han and Jain [93])
  Understandability enhancement: N.A.
Fig. 9. Representation of the gallery enlargement method (the 3D model was obtained through Reference [104]).
Four years later, Han and Jain [93] proposed to employ the frontalization approach in the considered scenario, as it had already shown its effectiveness in biometric recognition from non-frontal faces [25]. They proposed a 3DFR method from a pair of frontal-profile views based on a 3D Morphable Model (3DMM) [26], a generative model of realistic face shape and appearance, to aid the reconstruction process. They reconstructed the 3D face shape through the correspondence between landmarks within the frontal image and those on the profile one and extracted the texture by mapping the facial image onto the 3D shape. A view-based gallery enlargement approach and a model-based probe frontalization approach (Figure 10) were employed to enhance performance through the proposed reconstruction approach. They evaluated both on subsets of the PCSO [3] and FERET [161] datasets through a local face comparator and a commercial one, revealing improved recognition accuracy in both cases. One of the most evident limits of the reconstruction approach in a forensic context is that the involved 3DMM is a global statistical model that is limited in recovering facial details [148], as it can be dominated by the mean 3D face model, potentially biasing the outcome towards the underlying model [206]. This effect can be further exacerbated by the relatively low quality of the employed images. Furthermore, the involved 3DMM can cause evident distortion when the model is rotated by large angles [132, 229]. Other limits of this work are that the authors did not fully explore the texture and did not use state-of-the-art face comparators [131, 228]. Therefore, as in the previous case, despite the improvement in performance and the enhanced understandability thanks to local features, the authors did not employ any framework for easing the forensic evaluation of their method. Finally, no information about the computational time was reported.
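The core of a 3DMM can be illustrated compactly: a face shape is the statistical mean shape plus a weighted sum of learned deformation basis vectors, and fitting amounts to estimating those weights from observations such as landmarks. The sketch below is a deliberately simplified, hypothetical illustration (flattened coordinate vectors, a single basis vector fitted in closed form), not the actual model of Reference [26]:

```python
def reconstruct_shape(mean_shape, basis, coeffs):
    """3DMM shape: mean plus a linear combination of basis deformations.

    mean_shape: flattened per-vertex coordinates (x, y, z, x, y, z, ...)
    basis: list of deformation vectors, each the same length as mean_shape
    coeffs: one weight per basis vector
    """
    shape = list(mean_shape)
    for c, b in zip(coeffs, basis):
        for i, v in enumerate(b):
            shape[i] += c * v
    return shape

def fit_coefficient(mean_shape, basis_vec, observed):
    """Least-squares weight for a single basis vector (1D closed form)."""
    residual = [o - m for o, m in zip(observed, mean_shape)]
    num = sum(r * b for r, b in zip(residual, basis_vec))
    den = sum(b * b for b in basis_vec)
    return num / den

# Hypothetical one-basis model: the observed landmarks are mean + 0.5 * basis.
mean_shape = [0.0, 0.0, 0.0]
basis_vec = [1.0, 2.0, 2.0]
observed = [0.5, 1.0, 1.0]
coeff = fit_coefficient(mean_shape, basis_vec, observed)
```

The forensic concern discussed above follows directly from this formulation: when observations are weak or noisy, the fitted coefficients shrink and the result is dominated by the mean shape.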
Fig. 10. Representation of the probe normalization (or “frontalization”) method (based on Reference [95]).
In the same year, Dutta et al. [59] proposed a method based on 3DFR for improving face recognition from non-frontal view images through a view-based gallery adaptation approach (Figure 11). They applied existing recognition systems to the 16 common subjects of the CMU PIE [188] and Multi-PIE [81] datasets, containing frontal and surveillance images, respectively. Adapting the reconstructed model to the pose estimated from a probe image can be particularly advantageous when poor-quality probe data are acquired but the 3D model can be obtained from higher-quality images, as in the case of mugshots (Figure 11). However, this approach requires an accurate estimate of the pose of the face in the probe image. Furthermore, the small number of subjects involved in the study should be increased to simulate a forensic case and to evaluate the extent of the improvement, so as to assess applicability in real-case scenarios. Despite the advantages in some application contexts in terms of performance, the authors did not take into account understandability or forensic evaluation, nor did they assess the required computational time.
Fig. 11. Representation of the gallery adaptation method (the pose was estimated through References [90, 91], and the 3D model was obtained through Reference [104]).
Similarly, Zeng et al. [228, 229] reconstructed 3D faces from 2D forensic mugshot images, employing frontal, left profile, and right profile reference images together with multiple reference models, to obtain more accurate outcomes and enhance recognition performance through a view-based gallery adaptation approach. To this aim, they used a coarse-to-fine 3D shape reconstruction approach based on the three views, a photometric method, and multiple reference 3D face models. The use of multiple reference models is an attempt to limit the homogeneity of reconstructed 3D face shape models and to increase the probability of finding the most similar candidate for the single parts of the input face. The reconstructed 3D face shapes were then used in the recognition task to establish correspondence between the local semantic patches around seven landmarks on the arbitrary-view probe image and those on the gallery of mugshot face images, assuming that patches deform according to the head pose angles. The authors [228] tested their approach on the CMU PIE [188] and Color FERET [161] datasets. They showed that deforming semantic patches is effective [13] and compared the performance with a commercial face recognition system [154] and the previously described method proposed by Zhang et al. [230]. The authors [229] also evaluated the enhancement using a machine learning (ML) classifier on different poses within the Bosphorus [178] and Color FERET [161] datasets. As the authors suggested, the improvement in recognition capability from arbitrary-pose face images is due to the greater robustness of semantic patches to pose variation and the higher inter-class variation introduced by the subject-specific 3D face model. A limitation of this work is the use of out-of-date face comparators [131]. Furthermore, although the method employs multiple reference models, the outcome could still be biased toward them [206].
Finally, although the method enhances the performance of a recognition approach that is understandable thanks to its local nature, the authors did not perform any forensic evaluation. Moreover, while they assessed the test time on a single probe image, they reported neither the computational time required for reconstructing the models in the reference gallery nor that for training the recognition system (Table 1).
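The local binary patterns used as understandable features in these works can be sketched in a few lines: each pixel is encoded by thresholding its 8 neighbours against its own value, and a patch is then described by the histogram of those codes. This is the basic LBP operator on a toy patch, not the exact landmark-patch pipeline of Zeng et al.:

```python
def lbp_code(patch, y, x):
    """8-neighbour local binary pattern code at pixel (y, x)."""
    center = patch[y][x]
    neighbours = [patch[y - 1][x - 1], patch[y - 1][x], patch[y - 1][x + 1],
                  patch[y][x + 1], patch[y + 1][x + 1], patch[y + 1][x],
                  patch[y + 1][x - 1], patch[y][x - 1]]
    code = 0
    for i, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << i  # set bit i when the neighbour is not darker
    return code

def lbp_histogram(patch):
    """256-bin histogram of LBP codes over the patch interior."""
    hist = [0] * 256
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            hist[lbp_code(patch, y, x)] += 1
    return hist

# On a uniform patch every neighbour equals the center, so every bit is set.
uniform = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
```

Because histograms are computed per patch, comparing them patch by patch lets an examiner point at which facial regions drove the similarity, which is the understandability benefit cited above.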
In 2018, Liang et al. [131] proposed an approach for arbitrary-view face recognition based on 3DFR from mugshot images that fully explores image texture. The proposed shape reconstruction is based on cascaded linear regression from 2D facial landmarks estimated in frontal and profile images. After reconstructing the 3D shape, they recovered the texture through a coarse-to-fine approach. They then employed the proposed method in a recognition task on a subset of images from each subject of the Multi-PIE dataset [81], using a view-based gallery enlargement approach with state-of-the-art comparators based on deep learning (DL). Furthermore, they compared the performance before and after the gallery enlargement, and after fine-tuning the comparators with the generated multi-view images. The results highlighted improved recognition accuracy on large-pose images, especially with fine-tuned comparators. In particular, this method provides better results than the one proposed by Han and Jain [93], probably because of its stronger focus on reconstructing texture information [131]. Hence, the most significant novelties introduced by this work are the textured full 3D faces reconstructed from mugshot images and the analysis with DL-based comparators, inherently more robust to pose variations than traditional ones [131]. Furthermore, fine-tuning those comparators with the enlarged gallery revealed even better performance than the previous gallery enlargement approaches. The authors also assessed the computational time required for reconstructing the 3D models, revealing a huge improvement over the previous study reporting it, albeit on a physical system with different capabilities (Table 1). Although the reconstruction method appears suitable for real-time applications [131], the authors did not report the computational time required for training and testing the recognition system.
A limit of the proposed method is that it does not work consistently across all pose directions, revealing worse performance for some poses than the original gallery (e.g., the frontal pose). Furthermore, the evaluated performance could suffer from demographic bias due to the unbalanced demographic distribution of the dataset employed in the experiments [81]. Finally, the authors did not take into account any understandability or forensic evaluation.
In 2020, the same authors published an extension of this work [132], in which they also proposed a DL-based shape reconstruction. They extended the evaluation of the face recognition capability of the method based on linear shape reconstruction to a subset of the Color FERET dataset [161], obtaining a higher recognition accuracy on average, as with the Multi-PIE dataset [81]. Furthermore, they addressed the drawback of their previous work, namely, the worse recognition performance for some poses with respect to the original gallery, through a fusion of the similarity scores obtained from the original mugshot images and from the synthesized ones. The improvements previously observed by combining 2D images and 3D face models in multi-modal approaches [10, 29, 30, 43, 44] were thereby confirmed. This approach, evaluated on the Multi-PIE dataset [81], revealed consistently better performance at all pose angles. With respect to their previous study, the authors also reported the computational time required for training the recognition system (Table 1). Despite the proposed novelties, the authors did not assess whether the proposed DL-based shape reconstruction approach is able to enhance recognition capability. Finally, the study did not consider understandability or forensic evaluation.
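Score fusion of this kind admits a very compact illustration: each gallery identity receives a convex combination of the score obtained against the original mugshots and the score obtained against the synthesized views, and identities are ranked by the fused value. A minimal sketch with hypothetical scores and identity names (the actual fusion rule and weights of Reference [132] may differ):

```python
def fuse_scores(score_original, score_synthetic, weight=0.5):
    """Convex combination of similarity scores from the original mugshot
    gallery and the synthesized multi-view gallery."""
    return weight * score_original + (1 - weight) * score_synthetic

def identify(probe_scores):
    """Rank gallery identities by fused score, highest first."""
    fused = {sid: fuse_scores(orig, synth)
             for sid, (orig, synth) in probe_scores.items()}
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical per-identity (original, synthesized) similarity scores.
scores = {"id_1": (0.40, 0.90), "id_2": (0.70, 0.50), "id_3": (0.30, 0.20)}
ranking = identify(scores)
```

Because the fused score never discards the original-gallery evidence, it cannot fall below the weaker of the two sources by much, which is why fusion removed the pose-dependent regressions observed in the earlier work.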
A quantitative comparison among the reviewed methods would require using the same face comparators, the same ground-truth data, and the same performance metrics, which is often unfeasible due to many factors, such as the datasets available at the time each work was proposed. Similarly, a comparison in terms of computational time is not meaningful, due both to unreported time-complexity information and to differences among the physical systems on which the methods were tested. However, a qualitative comparison is provided in Table 1 and discussed in Section 6.

4 Other Applications of 3D Face Reconstruction in Forensics

In addition to recognition from mugshot images, 3DFR could be a valuable aid in other forensic contexts that require recognizing a subject. An example is the search for missing persons. In such a scenario, Ferková et al. [68] proposed a method that includes demographic information to improve the outcome of the reconstruction from a single frontal image and, at the same time, speed up the computation. In particular, the method estimates the 3D shape of the missing person’s face from their age, gender, and the similarity between the landmarks of reference depth images and those previously annotated in the input image. Then, planar meshes are generated by triangulating between the input image and the depth image. The authors reported that their reconstruction method requires less than 3 seconds of computation and depends strongly on the underlying landmark estimation algorithm. Despite the good geometrical results, the reconstructed face is usually overstretched in width, and the generated 3D model does not include the forehead. Furthermore, the authors did not quantitatively evaluate the contribution of their method to recognition capability or its potential admissibility in forensic scenarios.
Similarly to some of the previous studies, Rahman et al. [165] highlighted how 3D face models could enhance forensic recognition from CCTV camera footage. In particular, they reconstructed 3D face models from single frames by optimizing an Active Appearance Model (AAM), an algorithm that matches a statistical model of shape and appearance to an image [115]. They then evaluated the improvement in the recognition capability of different ML models with respect to 2D AAMs. However, this study on the possible application of 3DFR to forensic recognition from surveillance videos is limited to a dataset of a few subjects, which is not publicly available. Finally, the authors did not assess the recognition performance and did not investigate its admissibility in terms of understandability and forensic evaluation.
With a similar purpose, Van Dam et al. [204] proposed a method based on projective reconstruction of facial landmarks, with an auto-calibration step added to obtain the 3D face model from CCTV camera footage. The authors considered the specific case of fraud at an Automated Teller Machine (ATM) with an uncalibrated camera under very-short-distance acquisitions with a distorted perspective [201]. They analyzed how the quality of the resulting 3D face model is affected by the number of frames and the number of landmarks, assessing the minimum values for a precise perspective shape reconstruction, which can, however, be affected by errors in the estimated landmark coordinates introduced by noise. However, the authors did not quantitatively assess the method’s improvement over its 2D counterpart in face recognition. Neither understandability nor forensic evaluation was addressed.
In 2016, the same authors proposed another method to reconstruct a 3D face from multiple frames for application in the forensic context [206]. The method employs a photometric algorithm to estimate both the texture and the 3D shape of the face. The goal is to avoid generating an outcome biased towards any facial model, thus enhancing suitability for a forensic comparison process. The proposed method is a coarse-to-fine shape estimation process: It first provides a coarse 3D shape [205] and pose parameters from landmarks in multiple frames; a refined shape is then computed by estimating the photometric parameters for every point in the 3D model. The last step also estimates the texture information, thus providing a dense 3D model. The authors evaluated their method in a recognition task on a homemade dataset of single-camera video recordings of 48 people containing frames with different facial views. The reconstructed textures were compared with the ground-truth images through FaceVACS [79], increasing the number of considered frames across iterations, which enhanced recognition results in most cases. Furthermore, using the likelihood ratio framework, they highlighted that in more than 60% of the cases, data initially unsuitable for forensic cases became meaningful through the proposed method. As the authors suggested, the outcomes can be used to generate faces under different poses, while they are not suitable for shape-based 3D face recognition. Despite the enhanced suitability for forensic scenarios, one of the most significant drawbacks is that the model-free reconstruction approach is computationally more burdensome than a model-based one and requires multiple images. Furthermore, the authors did not quantitatively evaluate their method on publicly available datasets.
Although the authors did not assess understandability, they introduced a forensic evaluation of their method based on 3DFR; thus, in our opinion, this is the most significant work on 3DFR applied to forensics.
Unlike all previous approaches, Loohuis [135] proposed to employ 3DFR to address the scarcity of facial images for training ML and DL models for face recognition tasks, for example, in a surveillance scenario. The author combined a method for generating face images with rendering techniques to simulate such adverse conditions and assessed the impact of the resulting synthetic images on existing face recognition systems. In particular, the method proposed by Deng et al. [53] for reconstructing the 3D model of the face, based on a DL model [96] and a 3DMM [158], was applied to single images of a subset of the ForenFace dataset [227] to generate images simulating different levels of image degradation. Unfortunately, the proposed method does not perform well on very low-quality images. However, a level of degradation reasonable for many forensic scenarios can still be mimicked, because the generated images show a high degree of similarity with the reference ones. Moreover, a similar approach employing 3DFR to generate degraded synthetic views has already been demonstrated to enhance the recognition performance of automatic face recognition systems on low-quality videos, such as those acquired by surveillance cameras, with holistic, local, and DL approaches [102]. Furthermore, despite the human subjectivity in perceiving the quality of an image, such an approach could even be employed in the development of quality assessment algorithms for facial images, since it would allow comparing the degraded image against a known reference version thereof, thus aiding the selection of potentially suitable samples for either the reconstruction or the recognition task [181].
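Simulating surveillance-like degradation can be as simple as chaining a few image operations on a clean reference, e.g., average-pooling to mimic a low-resolution sensor and grey-level quantization to mimic coarse encoding. A minimal pure-Python sketch of this idea on a grey-level image (illustrative only; the surveyed works render degradations from the reconstructed 3D model):

```python
def downsample(img, factor):
    """Average-pool a grey-level image (list of lists) by an integer factor,
    mimicking a low-resolution camera."""
    h, w = len(img) // factor, len(img[0]) // factor
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            block = [img[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out

def quantize(img, levels):
    """Reduce the number of grey levels, mimicking coarse encoding."""
    step = 256 // levels
    return [[(v // step) * step for v in row] for row in img]

# A 4x4 half-black, half-white test image degraded to 2x2 with 4 grey levels.
clean = [[0, 0, 255, 255],
         [0, 0, 255, 255],
         [0, 0, 255, 255],
         [0, 0, 255, 255]]
degraded = quantize(downsample(clean, 2), 4)
```

Pairs of clean and degraded images produced this way can serve both as training data for recognition under adverse conditions and as reference pairs for developing quality assessment algorithms.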

5 Datasets for Face Recognition Based on 3D Face Reconstruction

Public datasets provide a way to test and compare the performance of face recognition systems within a common evaluation framework. Therefore, in this section, we focus on the characteristics of the available datasets from the perspective of forensic facial recognition based on 3D face reconstruction. Some of them have already been introduced in Sections 3 and 4. Furthermore, we propose some not-yet-employed datasets that, in our opinion, are suitable for the forensic task.
We subdivided the available datasets into two categories. The first one (Section 5.1) comprises sets of 2D images containing mugshot-like facial images and, possibly, in-the-wild images (i.e., images acquired in an uncontrolled environment). These include data to test reconstruction algorithms and recognition methods in realistic forensic scenarios. The second category (Section 5.2) includes sets of 3D facial scans and videos, which could be employed to evaluate the accuracy of 3D reconstruction algorithms and possible 3D-to-2D or 3D-to-3D face recognition systems, thus extending the application scenarios to realistic surveillance videos.

5.1 Image Datasets

Five different datasets containing 2D images were employed in the previous studies. These datasets contain either RGB (i.e., color) images or grayscale images acquired in controlled or semi-controlled scenarios. Consequently, most of them can be considered mugshot-like datasets (Table 2) and employed in studies related to mugshot-based face recognition (Section 3). However, some of them also contain facial images in different poses and expressions, suitable for evaluating the robustness of the proposed algorithms to such factors. We provide their description and suggest some of their possible uses in studies related to forensic face recognition based on 3D face reconstruction. Besides these, we indicate datasets not employed in the previous studies but of great potential in our view, which address some shortcomings of the other datasets or were even explicitly designed for realistic forensic scenarios.
Table 2.

| Dataset | Image types | Subjects | Forensic features | Acquisition context | Used by |
| --- | --- | --- | --- | --- | --- |
| Color FERET [161] | RGB | 994 | None | Semi-controlled | [132, 228, 229] |
| FERET [161] | Grayscale | 1,199 | None | Semi-controlled | [93] |
| CMU PIE [188] | RGB | 68 | None | Controlled | [59, 228, 230] |
| Multi-PIE [81] | RGB | 337 | Landmarks | Controlled | [59, 132] |
| PCSO [3] | RGB | 28,557 | None | Controlled | [93] |
| NIST MID [215] | Grayscale | 1,573 | None | Controlled | |
| Morph (Academic) [169] | RGB | 13,618 | Eye coordinates | Controlled | |
| SCface [80] | RGB & IR | 130 | Landmarks | Controlled & uncontrolled | |
| ATVS Forensic DB [207] | RGB | 50 | Landmarks | Controlled | |
| LFW [103] | RGB | 5,749 | None | Uncontrolled | |

Table 2. Mugshot-like Datasets
The Color FERET dataset [161] contains multi-pose, multi-expression, and multi-session facial images captured in a semi-controlled environment during 15 sessions across nearly three years, intended to aid the development of the forensic field. It contains RGB images of size 512 \(\times\) 768 pixels. The face of each individual was captured in up to 13 different poses and sometimes on different dates, with an average of about 11 samples per subject. These images represent the frontal pose with different facial expressions, the right and left profiles at different angles with respect to the frontal one, and extra irregular positions. Furthermore, some images were captured while individuals were wearing eyeglasses or pulling their hair back, adding further intra-subject variability to the samples. This dataset has been analyzed by some of the previously reviewed studies [132, 228, 229] for its application to biometric recognition based on 3D reconstruction from multi-view facial images, treating them as mugshots, while using other samples of the same subjects for evaluating the recognition performance and the robustness of the system to pose and facial expression. Despite the absence of entirely uncontrolled acquisition, the variations in scale, pose, expression, and illumination, together with the relatively low quality of the images, make the dataset potentially suitable for studies on 3D face reconstruction for surveillance-related tasks, such as the mugshot-based recognition of a suspect captured by a CCTV camera. Furthermore, its multi-session characteristic also makes the dataset suitable for studies on aging, aimed at making the system more robust to changes in the appearance of a person’s face over time [40].
The dataset referenced as FERET is the grayscale version of the Color FERET dataset. It has been used for evaluating the recognition performance of 3D face reconstruction based on a pair of frontal-profile facial images [93]. Since the images are grayscale, they carry significantly less information about the subjects’ appearance, making the potential applications of this dataset more limited than those of its color version; its main advantage is the lower memory requirement, since a single channel is stored.
The CMU-PIE dataset [188] is made up of multi-pose, multi-expression, and multi-illumination face images. It contains 41,368 RGB images of size 640 \(\times\) 486 pixels, collected through a common setup composed of 13 fixed cameras and 21 flashes. Therefore, the faces of individuals were acquired in up to 13 different poses, under 43 different illumination conditions, and with 4 different expressions. Furthermore, a background image from each of the 13 cameras was acquired at each recording session to ease face localization. Subsets of CMU-PIE were involved in some of the previously described studies, aiming to evaluate how the recognition performance could benefit from the 3D face reconstruction obtained from a single frontal image [59], a pair of frontal-profile images [230], or the frontal image and both left and right profile images [228], evaluating the robustness of the system to large poses. Furthermore, due to its nature, CMU-PIE can also be used for evaluating the robustness of such systems to illumination conditions and facial expressions. The controlled environment could limit the usage of a system based on this dataset, making it unsuitable for in-the-wild applications. However, its multi-camera setting makes it possible to obtain more images of the same subjects under the same environmental conditions as in a registration-like scenario, possibly making the 3D reconstruction from multiple images more straightforward and aiding detailed geometric and photometric modeling of the faces [78, 188]. Other limits of this dataset are the relatively low number of subjects (Table 2), which represents a shortcoming for evaluating the inter-subject discriminability, and the limited intra-subject variability due to the single-session scenario and the small range of expressions [81].
The CMU Multi-PIE dataset was collected to address such issues [81]. In particular, it provides 755,370 facial RGB images of size 3,072 \(\times\) 2,048 pixels collected by 15 cameras using 18 different flashes in a controlled environment, with settings similar to those used for collecting the CMU PIE dataset [188]. Hence, it represents a multi-pose, multi-expression, multi-illumination, and multi-session face dataset, which introduces more variability, thanks to up to 6 facial expressions and 4 sessions, while increasing the quality of the images as well. This dataset has been used to evaluate the performance of a recognition system based on 3D face reconstruction from frontal and profile images [132]. Furthermore, it has been used to evaluate the biometric performance on non-frontal images of 16 subjects shared with the CMU-PIE [188], from which the 3D face was reconstructed to synthesize a non-frontal view of the subject to be compared with the tested image [59]. Hence, in addition to appearing, on the whole, suitable for the tasks already described while discussing the CMU-PIE [188], CMU Multi-PIE offers the possibility to evaluate the aging robustness [40], especially jointly with its predecessor, which was acquired about four years earlier, even if in a limited way due to the small number of common subjects. The two datasets would also allow evaluating performance under different acquisition parameters. Another interesting feature of this dataset is the presence of 68 annotated facial landmark points for images in the range 0\(^{\circ }\) to 45\(^{\circ }\) both left and right and 39 points for profile images, which could be exploited in both reconstruction and recognition algorithms. The most evident disadvantage of this dataset is still the collection in a strictly controlled environment, which makes it unsuitable for in-the-wild applications.
Finally, its demographic distributions could lead to a bias, since the subjects were predominantly men (69.7%) and European-Americans (60%) [81].
The PCSO dataset contains mugshot images collected as part of the booking process. One or more RGB images per subject are complemented by metadata such as age, sex, and ethnicity. Despite some variations in lighting conditions and head positions, photographic parameters are relatively consistent, and the quality of the images is quite good, with good contrast between the background and the individual, who is photographed with a frontal face and neutral expression [122]. The most significant advantage of this dataset with respect to the previously described ones is its large number of subjects (Table 2), making it suitable for longitudinal research. However, there is no intuitive way to relate multiple arrest records from the same individual [122], making it challenging to perform multi-session analyses. Moreover, it appears to be unsuitable for studies on robustness in the wild due to its semi-controlled nature. Finally, the dataset does not appear to be currently available to researchers other than those already granted access in the past [121].
Due to this availability issue, a possible alternative to the PCSO is the NIST-MID [215], containing frontal and profile facial views. Some subjects were not acquired in the profile view, while others were acquired more than once in both frontal and profile views. However, these mugshot images are in 8-bit grayscale and, therefore, do not allow fully exploiting the information provided by the face’s texture. Furthermore, the ratio between male and female subjects (i.e., about 19 males for each female) could lead to a demographic bias.
Another alternative is the academic version of the MORPH dataset [169], which contains scanned frontal and profile mugshot images of 13,618 subjects [8]. The images were acquired over time spans of up to 1,681 days, with an average longitudinal time between photos of 164 days and an average of 4 acquisitions per subject [221], thus allowing the evaluation of the robustness of a recognition system to time progression. MORPH also allows analyzing the system’s robustness to age variation across different subjects, since ages range from 16 to 77. Moreover, it also contains annotations of the locations of the eyes [227], which could be required by some recognition algorithms or employed for evaluating their automatic detection. However, the algorithms tested on this dataset could also suffer from a demographic bias due to the imbalance between male and female individuals and in terms of ethnicity.
A dataset simulating realistic forensic scenarios is SCface [80], providing both mugshot images, acquired through a high-quality photo camera in controlled conditions, and surveillance images, acquired through five different commercial cameras placed at the same height (e.g., Figure 2). NIR (near-infrared) mugshot images are included as well. The probe images were acquired indoors using the outdoor light coming through a window on one side as the only illumination source. The observed head poses are the ones typically found in footage acquired by a regular commercial surveillance system, with the camera placed slightly above the subject’s head [41]. In total, 21 images of each subject were taken at three different distances from each surveillance camera, between 1 and 4.2 meters. The RGB mugshot images were also cropped by following the ANSI 385-2004 standard recommendations [18]. Furthermore, SCface provides 21 manually annotated facial landmarks [199], which could be employed in both reconstruction and recognition algorithms, as well as metadata on demographic information and the presence of glasses and mustache [41]. Therefore, this dataset could be suitable for studies on photoanthropometry. To summarize, SCface could be employed to analyze the effect of cameras of different quality and resolution on face recognition performance and the robustness to different illumination conditions, distances, and head poses. SCface also allows studies on recognition from NIR images, which are inherently more robust to illumination changes than images acquired in the visible spectrum [174]. However, this aspect could find limited application in real-world scenarios due to the specific hardware required to acquire NIR images [129]. From a 3D reconstruction perspective, the information provided by the nine different poses makes this dataset also suitable for evaluating the performance of the related algorithms in a realistic scenario.
Although its characteristics make it suitable for low-resolution face recognition in forensic research, traces in the SCface dataset only consist of frontal surveillance camera images [227]. Furthermore, the difference in the distribution between male and female individuals (i.e., 114 and 16, respectively) and the absence of non-Caucasian people could lead to a demographic bias.
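Photoanthropometric comparison of the kind enabled by such landmark annotations typically relies on scale-invariant ratios between inter-landmark distances. The following is a minimal sketch of the idea; the landmark names and ratio choices are illustrative and do not reproduce the SCface annotation scheme.

```python
import numpy as np

def landmark_ratios(landmarks):
    """Scale-invariant ratios between inter-landmark distances,
    normalized by the inter-ocular distance. `landmarks` maps
    hypothetical landmark names to (x, y) pixel coordinates."""
    def dist(a, b):
        return float(np.linalg.norm(np.subtract(landmarks[a], landmarks[b])))

    inter_ocular = dist("eye_left", "eye_right")  # normalizing baseline
    return {
        "nose_chin_ratio": dist("nose_tip", "chin") / inter_ocular,
        "mouth_width_ratio": dist("mouth_left", "mouth_right") / inter_ocular,
    }

# Toy frontal face: eyes 60 px apart, nose-to-chin 80 px, mouth 50 px wide.
toy = {
    "eye_left": (100, 100), "eye_right": (160, 100),
    "nose_tip": (130, 140), "chin": (130, 220),
    "mouth_left": (105, 180), "mouth_right": (155, 180),
}
print(landmark_ratios(toy))
```

Being ratios, such features are invariant to image scale, which matters when probe and reference images are acquired at different distances, as in SCface.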
Another set of images containing forensic annotations (i.e., 21 landmarks on frontal faces) is the ATVS-Forensic DB [207]. Despite the relatively low number of subjects (i.e., 32 men and 18 women) and the limited range of application scenarios, since the dataset only consists of high-quality mugshot images, it allows evaluating the robustness of the recognition system to distance, thanks to acquisitions at three different distances between 1 and 3 meters from the camera. Furthermore, it provides a lateral view of the full body and the face. All the images were acquired in each of the two sessions held per subject, thus simulating forensic scenarios in which the mugshot images of a suspect were acquired on a different day than the probe image.
The LFW dataset [103] also deserves a mention. Although not explicitly designed for forensic applications, it has been employed in many face recognition algorithms that can cope with uncontrolled settings. It contains images acquired in unconstrained scenarios, with variations in pose, expression, hairstyle, camera parameters, background, lighting, and demographic attributes. Due to the variability in the number of images per subject, up to 530, LFW is suitable for both identification and verification scenarios.
The considerable differences among the reviewed datasets make them suitable for different purposes. Therefore, future studies should consider these differences to assess whether a specific dataset is suitable for the performance evaluation of the proposed system, starting from the representativeness of the images contained with respect to the aimed scenario.

5.2 3D Scan and Video Datasets

Despite the smaller number of available datasets, videos and 3D face scans can be effectively employed to evaluate the proposed 3D face reconstruction algorithms. Their characteristics are summarized in Table 3. In particular, the acquisition context of the analyzed datasets could make them suitable for different scenarios characteristic of the forensic field, either as reference images or as probe data. Moreover, most of them contain annotations traditionally employed in forensic cases (e.g., landmarks). These features motivate their potential for simulating face comparison from surveillance footage.
Table 3.

| Dataset | Data types | Subjects | Forensic features | Acquisition context | Used by |
| --- | --- | --- | --- | --- | --- |
| Bosphorus [178] | 3D scans & images | 105 | Landmarks | Controlled | [132, 229] |
| ForenFace [227] | 3D scans, videos & images | 97 | Annotated facial parts | Controlled & uncontrolled | [135] |
| Quis-Campi [155] | 3D scans, videos, images & gait | 320 | Eye coordinates | Controlled & uncontrolled | |
| Wits Face Database [20] | Videos & images | 622 | None | Controlled & uncontrolled | |
| IJB-C [144] | Videos & images | 3,531 | None | Uncontrolled | |
| FIDENTIS (Licensed) [203] | 3D scans | 200 | Landmarks | Controlled | |
| NoW benchmark [177] | 3D scans & images | 100 | None | Controlled & uncontrolled | |
| Florence 2D/3D [21] | 3D scans & videos | 53 | None | Controlled & uncontrolled | |

Table 3. 3D Scan and Video Datasets
The Bosphorus dataset [178] contains both multi-pose and multi-expression 3D data representing the shape of the face and the corresponding RGB texture images of size 1,600 \(\times\) 1,200 pixels. It comprises 4,666 face scans of 60 men and 45 women, mainly Caucasian and aged between 25 and 35. The scans were acquired in a single view using a structured-light-based 3D system while the subjects were sitting at a distance of about 1.5 meters. Several face scans are available per subject: up to 13 head poses with different yaw and pitch angles, up to 4 deliberate occlusions of the eyes or mouth through beard, mustache, hair, hand, or eyeglasses, and up to 34 different facial expressions. Furthermore, it provides up to 24 facial landmarks manually annotated on 2D and 3D images, making it suitable for studies based on photoanthropometry and landmark estimation. Bosphorus was involved in the evaluation of the performance of a recognition system based on 3D face reconstruction from multi-view facial images [229] and the assessment of the accuracy of the reconstruction from frontal and profile images [132, 229]. This dataset is suitable for studies on robustness to occlusions and adverse conditions such as different poses and expressions, thanks to its large intra-subject variability. One of its main disadvantages is the low ethnic diversity [50]. Moreover, the acquisitions under uniform illumination do not allow investigating the effects of light variations on reconstruction and recognition. Finally, it contains corrupted data due to subject movements during acquisitions and self-occlusions.
The ForenFace dataset [227] contains 3D scans, videos, and high-quality mugshot images and has been specifically designed to represent realistic forensic scenarios. In particular, ForenFace contains images of five views per subject, photos of an identity document (i.e., employee cards taken months or even years before), and the related frontal and semi-profile 3D scans as reference material. The CCTV videos and stills of visible and partially occluded subjects were instead acquired indoors through six different models of surveillance cameras in various locations, positions, and distances from the subject. ForenFace also includes a large set of anthropometric features that forensic facial practitioners employ during forensic work, such as those proposed by FISWG [83]; this makes it suitable for studies on morphological comparison and valuable given the lack of datasets of facial features allowing quantitative, statistical evaluation of face comparison evidence [150]. This dataset is very flexible in its usage and suitable for studies in various application scenarios. For example, it is suitable for evaluating errors with different models of surveillance cameras. Another potential use is in evaluating the robustness to age differences through passport-style images. The acquired videos allow for assessing the robustness to partial occlusions of the face, due to eyeglasses, beards, and baseball caps, and evaluating the reconstruction from probe videos and frontal/profile images. It could also be employed to evaluate methods for extracting facial features and comparing them with the annotated ones. Finally, ForenFace allows the recognition task across different types of facial data (e.g., probe video vs. mugshot image or 3D scan). Despite being particularly useful for forensic research, the size of ForenFace is relatively small from a biometric perspective [227]. Furthermore, the predominance of the Caucasian ethnicity could lead to demographic bias.
The Quis-Campi dataset [155] is made up of videos and images taken from modern surveillance systems, which typically have a higher resolution than traditional ones. Compared to the previous dataset, Quis-Campi contains data of more subjects (Table 3), captured outdoors in unconstrained conditions through a camera about 50 meters from the subject. It also contains 3D scans of the face and reference images acquired indoors. Furthermore, it provides gait recordings as full-body video sequences, which could be employed in a multimodal recognition system. Annotations of the locations of the eyes in each frame were also added, which can be useful for evaluating the performance of eye detection or head-pose estimation algorithms. In summary, Quis-Campi can be adopted to assess the robustness to the key adverse factors of forensic face recognition in the wild, namely, expression, occlusion, illumination, pose, motion blur, and out-of-focus, in a realistic outdoor scenario through an automated image acquisition of a non-cooperative subject on-the-move and at-a-distance [155]. However, it lacks a good set of reference images [227] and could lead to demographic bias due to the predominance of the Caucasian ethnicity.
To perform CCTV-based recognition, including both identification and verification scenarios, it is possible to employ the Wits Face Database [20]. It includes African male individuals aged between 18 and 35, each captured in 10 photos covering 5 different frontal and profile views, with a neutral expression and facing straight ahead, under both natural outdoor lighting and artificial indoor fluorescent lighting. CCTV video recordings were acquired from 334 subjects in indoor or outdoor environments, allowing the evaluation of the difference in a face recognition system’s performance between the two. Furthermore, some of the recordings depict subjects wearing caps or sunglasses, thus allowing the evaluation of the robustness to such partial occlusions. One critical issue of the Wits Face Database is demographic bias, since only images and videos of male subjects were acquired.
In the context of face recognition from videos, it is worth mentioning the IARPA Janus Benchmark (IJB) datasets, which contain facial images and videos varying in pose, illumination, expression, resolution, and occlusion, mainly acquired in uncontrolled scenarios. The most recent of these datasets is IJB-C [144]. In particular, it provides 21,294 facial images and 11,779 face videos of 3,531 subjects. Furthermore, all media have manually annotated facial bounding boxes, and the dataset includes 10,040 non-facial images, allowing studies on face detection as well. Finally, attribute metadata on age, gender, occlusion, capture environment, skin tone, facial hair, and face yaw is provided, allowing further examinations such as occlusion detection and analysis of demographic bias.
Considering the evaluation of the reconstruction task, datasets containing 3D facial scans acquired in controlled conditions may be of some use. An example is the FIDENTIS dataset [203], the licensed version of which provides one or more textured 3D scans of 83 males and 117 females. In particular, it contains raw frontal, profile, and merged models, the latter with and without ears. Furthermore, the models with ears are provided with 42 associated landmarks, making them suitable for studies on photoanthropometry and the estimation of such landmarks (e.g., Reference [67]). Moreover, it is also suitable for the analysis of multi-session differences. However, a system evaluated on this dataset could suffer from biased performance since, despite ages ranging between 18 and 67, 75% of the subjects are aged between 21 and 29.
To evaluate the reconstruction methods under variations in lighting, occlusions, facial expression, acquisition environment, and viewing angle, it is possible to employ the NoW benchmark [177]. This dataset contains 2,054 2D images of 100 subjects (45 males and 55 females), captured with an iPhone X, and a 3D head scan for each subject as ground truth, captured through an active stereo system with the individual in a neutral expression. However, further demographic information about the subjects is not provided.
A dataset that allows evaluating the reconstruction from videos is the Florence 2D/3D [21], providing 3D scans of 53 subjects (39 males and 14 females) and indoor and outdoor videos acquired in controlled and uncontrolled settings. In particular, it is composed of four 3D models for each subject (i.e., two frontal, a right-side, and a left-side) and a further model with glasses for subjects who wear them. The HD videos (1,280 \(\times\) 720) were acquired indoors, at 25 FPS and four levels of zoom, while asking the subject to perform specific head rotations. The uncontrolled videos were acquired indoors at 25 FPS (704 \(\times\) 576) and outdoors at 5-7 FPS (736 \(\times\) 544), both at three levels of zoom and with the subject asked to behave spontaneously. Hence, this dataset could be employed in studies on reconstruction and recognition in realistic surveillance conditions, although it lacks the occlusions typical of such scenarios. Moreover, demographic bias could represent an issue, since all the subjects are Caucasian and mostly aged between 20 and 30.
All the datasets described in this subsection consider one or more of the issues found in realistic forensic scenarios, in terms of the environment (light conditions, indoor/outdoor), the subject (presence of occlusions, adverse poses, facial expressions), and technological factors (lower probe resolution, motion blur, out-of-focus). Some of them also provide annotations of forensic features (eye coordinates, facial parts), which could be useful in actual law courts [83, 217]. What is currently missing in the state-of-the-art datasets is the presence of occlusions of the lower face, such as facial masks, which would aid research on robustness to such occlusions.

6 Discussions

In this article, we reviewed the state-of-the-art of 3D face reconstruction (3DFR) from 2D images and videos for forensic recognition, evaluating the proposed approaches with respect to the requirements of a potential forensics-related system. Furthermore, the proposed approaches for enhancing forensic recognition in terms of performance were analyzed together with their potential application scenarios (Figure 6).
The previously described studies mainly focus on enhancing the performance of recognition tasks in different contexts, such as the identification or verification of suspects within a gallery of mugshot images or the search for missing persons. They revealed the potential advantages of fusing the reconstructed model and the original images, which allows taking advantage of the characteristics of a 3D facial model while limiting the possible loss of information in the reconstruction [132]. So far, researchers have proposed employing 3DFR either on the reference data or on the probe material by re-projecting the 3D model into 2D images to aid a 2D-to-2D recognition. In particular, the first approach could find application in adapting the pose of the model to the face in the probe material, easing both the visual comparison for investigative purposes and the automatic comparison of the so-obtained image with the probe face, as preliminarily proposed for forensic scenarios by Dutta et al. [59] and then further investigated by Zeng et al. [228, 229]. Similarly, the projection of reference 3D faces into the 2D domain in various poses has been shown to improve the recognition performance of such systems, especially their robustness to pose variation, when introduced as an augmentation step for training feature-based [93, 230] and DL systems [131, 132]. Moreover, Loohuis [135] suggested that 3DFR could be successfully employed for mimicking the degraded quality of the probe data when coupled with rendering techniques that simulate such adverse conditions. The 3DFR from probe material finds applications in many scenarios as well, easing the comparison from a single probe image by rendering the face to match the pose of a reference image [93, 165] or by reconstructing it from multiple frames of a surveillance video [204, 206].
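The re-projection step underlying these approaches can be illustrated with a minimal pinhole-camera sketch that rotates a reconstructed 3D point set to a target pose and projects it to image coordinates. All camera parameters below are illustrative placeholders, not values used by the cited systems.

```python
import numpy as np

def project_points(points_3d, yaw_deg, f=800.0, cx=320.0, cy=240.0, tz=1000.0):
    """Rotate a 3D face point set (n, 3) by `yaw_deg` around the vertical
    axis, then project it with a pinhole camera of focal length f,
    principal point (cx, cy), and depth offset tz (placeholder values)."""
    t = np.radians(yaw_deg)
    R = np.array([[np.cos(t), 0.0, np.sin(t)],
                  [0.0,       1.0, 0.0      ],
                  [-np.sin(t), 0.0, np.cos(t)]])
    rotated = points_3d @ R.T
    z = rotated[:, 2] + tz          # push the face in front of the camera
    u = f * rotated[:, 0] / z + cx  # perspective division
    v = f * rotated[:, 1] / z + cy
    return np.stack([u, v], axis=1)

# A point on the optical axis projects onto the principal point at any yaw.
print(project_points(np.array([[0.0, 0.0, 0.0]]), yaw_deg=30.0))
```

Rendering the rotated model for several yaw angles is, in essence, how a gallery can be augmented with synthesized poses before 2D-to-2D matching.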
Despite the promising results, especially concerning the robustness to pose variations in various probe and reference data types, most of the previously described studies did not evaluate their methods against other requirements of an automated system supporting forensic analysis (Figure 5) related to understandability and forensic evaluation [27, 110], as summarized in Figure 4. Moreover, the proposed methods were not assessed for robustness to some typical issues of forensic cases, such as the presence of occlusions [94, 114], making them inherently unsuitable for recognition scenarios involving such factors (Section 2.2). However, some of them implicitly used a face recognition algorithm based on local descriptors [93, 228, 229, 230], which supports the understandability of the output [190, 222]. Furthermore, a single study [206] employed a framework for easing forensic evaluation.
Although most of the proposed methods aim to enhance face recognition performance, they are not quantitatively comparable due to the variability in the considered settings. One of the most relevant differences is related to the involved datasets, which differ in acquisition environment, size, and availability (Section 5). The differences in data type and quality represent another factor that makes them suitable for different tasks. Thus, it is necessary to address and compare these datasets separately (Section 2.2) in terms of the recognition approach (Figure 12) and application scenarios (Sections 3 and 4). Some differences are of course due to the time of publication, but the recent withdrawal of several datasets, caused by stricter privacy rules on biometric data, has made comparisons even more complex. For example, the General Data Protection Regulation (GDPR) rules in the European Union strongly differ from those of other countries [116, 235]. In particular, future studies should be based on datasets suitable for forensic research. In our opinion, the model to follow is the ForenFace dataset [227], because it takes realistic circumstances into account and also provides a set of anthropometric features proposed by the FISWG [83]. Furthermore, future studies should evaluate the face reconstruction accuracy on large-scale 3D face datasets, such as FIDENTIS [203]. Some forensic use cases are not yet included in any benchmark dataset; for example, the special case of CCTV-based recognition from images recorded at ATMs, with a very short distance from the subject and a distorted perspective [108].
Fig. 12.
Fig. 12. Taxonomy of 3DFR approaches in forensic scenarios.
For both reconstruction and recognition tasks, a demographic analysis of the performance should be conducted to assess the bias against some demographic groups, an undesired issue in forensics that is sometimes overlooked even in current research [110]. To this aim, explicit demographic information about the subjects represented in the datasets could help face such an issue [180]. However, this useful data may be difficult to assemble and recover due to the privacy rules mentioned above. Moreover, the source of this bias could be related to the imbalance of the underlying data. This issue could be mitigated by employing synthetic datasets, like the FAIR benchmark [66]. However, the employment of synthetic data still requires more investigation to be fully validated [45] and then accepted in the forensic context.
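One simple form such a demographic analysis can take is computing an error rate separately per group and inspecting the gap between groups; the sketch below uses the false non-match rate on genuine comparisons, with hypothetical group labels and an arbitrary threshold.

```python
import numpy as np

def per_group_fnmr(scores, groups, threshold):
    """False non-match rate of genuine (mated) comparison scores,
    computed separately for each demographic group; a large gap
    between groups signals a demographic bias."""
    scores, groups = np.asarray(scores), np.asarray(groups)
    return {g: float(np.mean(scores[groups == g] < threshold))
            for g in np.unique(groups)}

# Toy genuine similarity scores for two hypothetical groups, threshold 0.5.
rates = per_group_fnmr([0.9, 0.4, 0.8, 0.2, 0.7],
                       ["A", "A", "B", "B", "B"], 0.5)
print(rates)
```

In a full evaluation, the same per-group breakdown would be applied to other metrics (e.g., false match rate or reconstruction error) before drawing conclusions about bias.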
Any underlying 3D reference model could be affected by bias problems as well [110], which may affect the face recognition system, making it unsuitable in forensic cases [206]. Therefore, a model-free reconstruction approach should be employed whenever possible. An example of such an approach is stereophotogrammetry, which allows capturing craniofacial morphology in high quality [97], to a level of detail that is often less important in generic recognition applications but becomes crucial in the forensic context. Although it may not be suitable for 3D-to-3D recognition scenarios, especially those based on shape comparison, such a reconstruction approach could be exploited in the generation of synthetic views for comparison with the reference material [206] and, therefore, employed in a 2D-to-3D scenario. However, a drawback is the requirement of multiple images of the suspects [206], which cannot be acquired in every forensic case. Another disadvantage of a model-free reconstruction approach is the significantly higher computational time required, making it unaffordable for real-time applications. Nonetheless, this represents a minor issue for many forensic applications, such as the ones related to lawsuits.
Thus, when a photometric reconstruction approach is unsuitable, a choice between approaches based on 3DMM and DL must be made [148], even among those not strictly proposed for forensic applications, taking into account their suitability, advantages, and drawbacks. For example, the methods based on 3DMM allow generating an arbitrary number of facial expressions, while those based on DL provide high-quality face texture synthesis [77, 148]. Therefore, the morphable model could be employed either for adapting the expression of the reference model or for imposing a neutral expression on the normalized face in the probe image, while a detailed reconstruction could be obtained through a DL network [77]. However, it must be pointed out that heavy manipulations, such as expression modification, may not be allowed in most evaluative contexts, although they remain a valid aid for investigative purposes. These approaches have technical limits, namely, the focus of the morphable model on global characteristics rather than fine details and the requirement of a large number of 3D scans for the training of DL networks [148]. The lack of understandability is an issue for the DL approach as well [27]. However, combining two or more reconstruction approaches could help limit some of the drawbacks of each single approach. For example, previous studies highlighted that it is possible to reconstruct highly detailed 3D faces even from a single image by combining the prior knowledge of the global facial shape encoded in the 3DMM and refining it through a photometric approach [37, 130, 172]. Similarly, the combination of a morphable model and one or more DL networks has been proposed as well [65].
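The core of a 3DMM is a linear statistical model: a new face shape is the mean shape plus a weighted combination of principal components learned from a corpus of 3D scans. A toy sketch (the arrays below are stand-ins, not a real learned model):

```python
import numpy as np

def morphable_shape(mean_shape, shape_basis, coeffs):
    """3DMM linear shape model: S = mean + basis @ coeffs.
    mean_shape : (3n,) flattened mean face vertices
    shape_basis: (3n, k) principal components learned from 3D scans
    coeffs     : (k,) identity/expression coefficients"""
    return mean_shape + shape_basis @ coeffs

# Toy model with 2 vertices (6 coordinates) and 2 components.
mean = np.zeros(6)
basis = np.eye(6)[:, :2]  # each toy component displaces one coordinate
shape = morphable_shape(mean, basis, np.array([1.0, 2.0]))
print(shape)
```

Fitting a 3DMM to an image then amounts to searching for the coefficients (together with pose and illumination parameters) that best explain the observed face, which is why the model captures global shape well but struggles with fine details.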
State-of-the-art methods not explicitly proposed for forensic applications should also be further investigated in terms of their potential and suitability, especially those based on DL, which have proven promising in addressing some typical forensic issues such as occlusion removal (e.g., References [147, 183]), 3DFR from one or multiple in-the-wild images (e.g., References [133, 231]), and face frontalization (e.g., Reference [223]), thus potentially representing an aid in many investigative scenarios.
Computational time is one of the main reasons why automated systems should be employed in forensics. It is an important feature in some specific applications, such as real-time identification through surveillance cameras (e.g., Reference [71]). In this regard, the online computational time must be assessed: the time required to process a single probe image and, thus, to recognize the captured individual. It depends on both the recognition algorithm and any strategy applied to the probe to enhance the recognition task (e.g., mapping to some "canonical" representation). In these terms, a reasonable computational time for some applications related to surveillance and lawsuits was reported by Zhang et al. [230] and Zeng et al. [228] (Table 1). Zhang et al. [230] and Liang et al. [131, 132] also evaluated the offline computational time: the time required to apply the proposed enhancement approach based on 3D face reconstruction (e.g., gallery enlargement) and to train the recognition system. The reported values suggest a notable improvement with respect to earlier proposals. Although these represent the most time-consuming processes, the offline computational time is generally of less concern, since it does not impact real-time operations.
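The online/offline split described above can be illustrated with a minimal timing harness. The two pipeline stages below are hypothetical placeholders (a trivial "gallery enlargement" and a nearest-neighbour probe match), not an actual reconstruction or recognition system; only the measurement pattern is the point:

```python
import time

def measure(fn, *args, repeats=10):
    """Average wall-clock time of fn over several runs."""
    t0 = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - t0) / repeats

# Hypothetical pipeline stages (placeholders for illustration only):
def enlarge_gallery(gallery):
    # Offline: synthesize extra views per gallery entry (here, a mirrored copy).
    return [view for feat in gallery for view in (feat, feat[::-1])]

def match_probe(probe, gallery):
    # Online: per-probe comparison against every gallery entry (L1 distance).
    return min(gallery, key=lambda g: sum(abs(a - b) for a, b in zip(probe, g)))

gallery = [[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]]
probe = [0.12, 0.21, 0.29]

offline_t = measure(enlarge_gallery, gallery)                       # paid once
online_t = measure(match_probe, probe, enlarge_gallery(gallery))    # paid per probe

print(f"offline: {offline_t:.2e}s, online: {online_t:.2e}s")
```

The offline cost grows with the gallery and the enhancement strategy but is paid once; the online cost is what bounds real-time identification, which is why the two are reported separately in the works cited above.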
It is worth remarking that the key requirement in forensics is generally the reconstruction accuracy [15, 68, 195], a requirement that is often stricter than in generic recognition tasks. In the literature, the quality of a 3D model is evaluated in terms of shape error by estimating the distance between the model and the corresponding ground truth. However, the quality of the extracted texture should be assessed as well, due to its role in the recognition task [12, 16, 97]. For example, the texture could allow exploiting facial marks, such as scars and tattoos, whose exploitation would enhance both performance and understandability in forensic comparison [15, 83, 157, 190, 191]. Furthermore, such facial marks are becoming even more valuable, thanks to the availability of higher-resolution sensors, the growing size of face image databases, and their capability to improve the speed and performance of recognition systems [111]. Hence, future research should take these additional features into account to assess their permanence in the generated 3D models. This also holds for other morphological features, which forensic examiners evaluate to justify the outcome of the facial comparison (e.g., the decision whether the suspect is likely to be the one represented in a probe image) [9]. In addition to holistic ones (e.g., the overall shape), local characteristics relate to the proportions and positions of facial features, such as the relative size of the ears with respect to the eyes, nose, and mouth [83]. The asymmetry between facial components should also be considered [83], owing to its greater physical stability over time compared with other features: for example, the overall shape of the face could change due to weight gain [86], whereas the asymmetry between facial components is less affected.
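The shape-error evaluation mentioned above typically reduces to a per-vertex distance between the reconstructed mesh and the ground truth. Below is a minimal sketch of two common variants, assuming simple point lists as input: an RMSE for registered meshes with known vertex correspondence, and a symmetric nearest-neighbour (chamfer-style) distance when no correspondence is available. The sample points are illustrative only:

```python
import numpy as np

def vertex_rmse(pred, gt):
    """RMSE over corresponding vertices (requires registered meshes)."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.sqrt(np.mean(np.sum((pred - gt) ** 2, axis=1))))

def chamfer(pred, gt):
    """Symmetric nearest-neighbour distance for unregistered point sets."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    # Pairwise Euclidean distances between all pred/gt point pairs.
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=2)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

pred = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]   # reconstructed vertices (toy)
gt   = [[0.0, 0.0, 0.1], [1.0, 0.0, 0.0]]   # ground-truth vertices (toy)

print(round(vertex_rmse(pred, gt), 4))  # 0.0707
print(round(chamfer(pred, gt), 4))      # 0.1
```

Note that both metrics score only geometry; as argued above, they say nothing about texture quality or the permanence of facial marks, which would need separate assessment.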
Therefore, these features could be an effective aid for forensic examiners, even when justifying their conclusions on the comparison in courts of law.
To sum up, we expect that great attention will be paid to improving recognition capability in forensic scenarios through 3DFR. Extremely unfavorable conditions, typically encountered in criminal cases, could become more tractable by appropriately modelling both shape and texture. To this goal, data representative of forensic traces and reference material are necessary, also considering robustness to other common factors altering facial appearance, such as facial hair and the presence of occlusions. Bias toward any demographic group should be avoided in the datasets, favoring the system's fairness. In our opinion, the understandability of the proposed algorithms should go hand in hand with data availability: data and algorithms will play a central role in effectively integrating 3D face reconstruction from 2D images and videos into the forensic field. Similarly, the employment of frameworks that ease forensic evaluation by non-expert professionals should become standard practice, strengthening the admissibility of the proposed methods in real cases. To this aim, an interdisciplinary approach involving computer science and law experts would speed up the process. We therefore believe that the involvement of 3DFR in real-world forensic applications is not far off and that this survey contributes a step toward this scenario.

References

[1]
Court of Appeals of the District of Columbia. 1923. Frye v. United States. 293 Fed. 1013 (1923).
[2]
United States Supreme Court. 1993. Daubert v. Merrell Dow Pharmaceuticals, Inc. 509 U.S. 579 (1993).
[3]
Pinellas County Florida Sheriff’s Office (PCSO). 1994. Retrieved from http://www.pcsoweb.com/
[4]
European Network of Forensic Science Institutes (ENFSI). 1995. Retrieved from http://www.enfsi.eu/
[5]
Court of Appeal (Criminal Division, United Kingdom). 1995. R v. Clarke., 425 pages.
[6]
United States Supreme Court. 1997. General Electric Co. v. Joiner. 522 US 136, 118 S. Ct. 512 (1997).
[7]
United States Supreme Court. 1999. Kumho Tire Co. v. Carmichael. 119 S. Ct. 1167 (1999).
[8]
University of North Carolina Wilmington. 2003. MORPH Non-Commercial Release Whitepaper. Retrieved from http://people.uncw.edu/vetterr/MORPH-NonCommercial-Stats.pdf
[9]
Facial Identification Scientific Working Group (FISWG). 2008. Retrieved from http://www.fiswg.org/
[10]
A. F. Abate, M. Nappi, D. Riccio, and G. Sabatino. 2007. 2D and 3D face recognition: A survey. Pattern Recognition Letters 28, 14 (2007), 1885–1906.
[11]
Amina Adadi and Mohammed Berrada. 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6 (2018), 52138–52160.
[12]
H. M. Rehan Afzal, Suhuai Luo, M. Kamran Afzal, Gopal Chaudhary, Manju Khari, and Sathish A. P. Kumar. 2020. 3D face reconstruction from single 2D image using distinctive features. IEEE Access 8 (2020), 180681–180689.
[13]
Timo Ahonen, Abdenour Hadid, and Matti Pietikainen. 2006. Face description with local binary patterns: Application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28, 12 (2006), 2037–2041.
[14]
Tauseef Ali. 2014. Biometric score calibration for forensic face recognition. Ph.D. dissertation. University of Twente.
[15]
Tauseef Ali, Raymond Veldhuis, and Luuk Spreeuwers. 2010. Forensic Face Recognition: A Survey. Centre for Telematics and Information Technology Technical Report TR-CTIT-10-40. University of Twente.
[16]
B. Ben Amor, Karima Ouji, Mohsen Ardabilian, and Liming Chen. 2005. 3D Face Recognition by ICP-based Shape Matching. LIRIS Lab, Lyon Research Center for Images and Intelligent Information Systems, UMR 5205.
[17]
ANSA. 2018. Ladri individuati grazie al nuovo sistema di riconoscimento facciale. ANSA (2018). Retrieved from https://www.ansa.it/lombardia/notizie/2018/09/07/ladri-individuati-grazie-a-software-ps_cd3a5272-5a52-4999-9138-4b976d4e5738.html
[18]
ANSI. 2004. ANSI INCITS 385-2004, face recognition format for data interchange. Available at http://webstore.ansi.org
[19]
A. B. Arrieta, N. Diaz-Rodriguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, R. Chatila, and F. Herrera. 2020. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58 (2020), 82–115.
[20]
Nicholas Bacci, Joshua Davimes, Maryna Steyn, and Nanette Briers. 2021. Development of the wits face database: An African database of high-resolution facial photographs and multimodal closed-circuit television (CCTV) recordings. F1000Research 10 (2021).
[21]
Andrew D. Bagdanov, Alberto Del Bimbo, and Iacopo Masi. 2011. The Florence 2D/3D hybrid face dataset. In Joint ACM Workshop on Human Gesture and Behavior Understanding. 79–80.
[22]
Giovanni Barroccu. 2013. La prova scientifica nel processo penale. Diritto@Storia, Nuova serie 11 (2013).
[23]
Lacey Best-Rowden, Hu Han, Charles Otto, Brendan F. Klare, and Anil K. Jain. 2014. Unconstrained face recognition: Identifying a person of interest from a media collection. IEEE Trans. Inf. Forens. Secur. 9, 12 (2014), 2144–2157.
[24]
Hitoshi Biwasaka, Takuya Tokuta, Yoshitoshi Sasaki, Kei Sato, Takashi Takagi, Toyohisa Tanijiri, Sachio Miyasaka, Masataka Takamiya, and Yasuhiro Aoki. 2010. Application of computerised correction method for optical distortion of two-dimensional facial image in superimposition between three-dimensional and two-dimensional facial images. Forens. Sci. Int. 197, 1-3 (2010), 97–104.
[25]
Volker Blanz, Patrick Grother, P. Jonathon Phillips, and Thomas Vetter. 2005. Face recognition based on frontal views generated from non-frontal images. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2. IEEE, 454–461.
[26]
Volker Blanz and Thomas Vetter. 2003. Face recognition based on fitting a 3D morphable model. IEEE Trans. Pattern Anal. Mach. Intell. 25, 9 (2003), 1063–1074.
[27]
Timothy Bollé, Eoghan Casey, and Maëlig Jacquet. 2020. The role of evaluations in reaching decisions using automated systems supporting forensic analysis. Forens. Sci. Int.: Digit. Investig. 34 (2020), 301016.
[28]
Fred L. Bookstein. 1989. Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Trans. Pattern Anal. Mach. Intell. 11, 6 (1989), 567–585.
[29]
Kevin W. Bowyer, Kyong Chang, and Patrick Flynn. 2004. A survey of 3D and multi-modal 3D+2D face recognition. Technical Report TR 2004-22. Notre Dame Computer Science Department.
[30]
Kevin W. Bowyer, Kyong Chang, and Patrick Flynn. 2006. A survey of approaches and challenges in 3D and multi-modal 3D+2D face recognition. Comput. Vis. Image Underst. 101, 1 (2006), 1–15.
[31]
Ali Breland. 2017. How white engineers built racist code—and why it’s dangerous for black people. Guardian 4 (2017).
[32]
Vicki Bruce, Zoë Henderson, Craig Newman, and A. Mike Burton. 2001. Matching identities of familiar and unfamiliar faces caught on CCTV images. J. Experim. Psychol.: Appl. 7, 3 (2001), 207.
[33]
Joy Buolamwini and Timnit Gebru. 2018. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency. PMLR, 77–91.
[34]
A. Mike Burton, Stephen Wilson, Michelle Cowan, and Vicki Bruce. 1999. Face recognition in poor-quality video: Evidence from security surveillance. Psychol. Sci. 10, 3 (1999), 243–248.
[35]
Marinella Cadoni, Andrea Lagorio, Enrico Grosso, and Massimo Tistarelli. 2010. Exploiting 3D faces in biometric forensic recognition. In 18th European Signal Processing Conference. IEEE, 1670–1674.
[36]
Jie Cao, Yibo Hu, Hongwen Zhang, Ran He, and Zhenan Sun. 2020. Towards high fidelity face frontalization in the wild. Int. J. Comput. Vis. 128, 5 (2020), 1485–1504.
[37]
Xuan Cao, Zhang Chen, Anpei Chen, Xin Chen, Shiying Li, and Jingyi Yu. 2018. Sparse photometric 3D face reconstruction guided by morphable models. In IEEE Conference on Computer Vision and Pattern Recognition. 4635–4644.
[38]
Gaetano Carlizzi and Giovanni Tuzet. 2018. La Prova Scientifica Nel Processo Penale. G. Giappichelli Editore.
[39]
Eoghan Casey. 2020. Standardization of forming and expressing preliminary evaluative opinions on digital evidence. Forens. Sci. Int.: Digit. Investig. 32 (2020), 200888.
[40]
Gabriel Castaneda and Taghi M. Khoshgoftaar. 2015. A survey of 2D face databases. In IEEE International Conference on Information Reuse and Integration. IEEE, 219–224.
[41]
Helder F. Castro, Jaime S. Cardoso, and Maria T. Andrade. 2021. A systematic survey of ML datasets for prime CV research areas—Media and metadata. Data 6, 2 (2021), 12.
[42]
Christophe Champod, Alex Biedermann, Joëlle Vuille, Sheila Willis, and Jan De Kinder. 2016. ENFSI guideline for evaluative reporting in forensic science: A primer for legal practitioners. Crim. Law Just. Week. 180, 10 (2016), 189–193.
[43]
Kyong I. Chang, Kevin W. Bowyer, and Patrick J. Flynn. 2003. Face recognition using 2D and 3D facial data. In Workshop in Multidimensional User Authentication. Citeseer, 25–32.
[44]
Kyong I. Chang, Kevin W. Bowyer, and Patrick J. Flynn. 2005. An evaluation of multimodal 2D+3D face biometrics. IEEE Trans. Pattern Anal. Mach. Intell. 27, 4 (2005), 619–624.
[45]
Laurent Colbois, Tiago de Freitas Pereira, and Sébastien Marcel. 2021. On the use of automatically generated synthetic image datasets for benchmarking face recognition. In IEEE International Joint Conference on Biometrics (IJCB’21). IEEE, 1–8.
[46]
Simon A. Cole and Barry C. Scheck. 2017. Fingerprints and miscarriages of justice: Other types of error and a post-conviction right to database searching. Alb. L. Rev. 81 (2017), 807.
[47]
European Commission. 2019. Ethics Guidelines for Trustworthy AI. Retrieved from https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
[48]
Kate Conger, Richard Fausset, and Serge F. Kovaleski. 2019. San Francisco bans facial recognition technology. New York Times 14 (2019). Retrieved from https://www.nytimes.com/2019/05/14/us/facial-recognition-ban-san-francisco.html
[49]
Timothy F. Cootes, Christopher J. Taylor, David H. Cooper, and Jim Graham. 1995. Active shape models—Their training and application. Comput. Vis. Image Underst. 61, 1 (1995), 38–59.
[50]
Ciprian Adrian Corneanu, Marc Oliu Simón, Jeffrey F. Cohn, and Sergio Escalera Guerrero. 2016. Survey on RGB, 3D, thermal, and multimodal approaches for facial expression recognition: History, trends, and affect-related applications. IEEE Trans. Pattern Anal. Mach. Intell. 38, 8 (2016), 1548–1568.
[51]
Clement Creusot, Nick Pears, and Jim Austin. 2013. A machine-learning approach to keypoint detection and landmarking on 3D meshes. Int. J. Comput. Vis. 102, 1 (2013), 146–179.
[52]
Miguel De-la Torre, Eric Granger, Paulo V. W. Radtke, Robert Sabourin, and Dmitry O. Gorodnichy. 2015. Partially-supervised learning from facial trajectories for face recognition in video surveillance. Inf. Fusion 24 (2015), 31–53.
[53]
Yu Deng, Jiaolong Yang, Sicheng Xu, Dong Chen, Yunde Jia, and Xin Tong. 2019. Accurate 3D face reconstruction with weakly-supervised learning: From single image to image set. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops.
[54]
Damien Dessimoz and Christophe Champod. 2008. Linkages between biometrics and forensic science. In Handbook of Biometrics. Springer, 425–459.
[55]
Prithviraj Dhar, Ankan Bansal, Carlos D. Castillo, Joshua Gleason, P. Jonathon Phillips, and Rama Chellappa. 2020. How are attributes expressed in face DCNNs? In 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG’20). IEEE, 85–92.
[56]
Itiel E. Dror. 2020. The error in “error rate”: Why error rates are so needed, yet so elusive. J. Forens. Sci. 65, 4 (2020), 1034–1039.
[57]
Itiel E. Dror and Glenn Langenburg. 2019. “Cannot decide”: The fine line between appropriate inconclusive determinations versus unjustifiably deciding not to decide. J. Forens. Sci. 64, 1 (2019), 10–15.
[58]
Itiel E. Dror and Nicholas Scurich. 2020. (Mis) use of scientific measurements in forensic science. Forens. Sci. Int.: Synerg. 2 (2020), 333–338.
[59]
Abhishek Dutta, Raymond N. J. Veldhuis, and Lieuwe Jan Spreeuwers. 2012. Non-frontal model based approach to forensic face recognition. In (ICT’12). Retrieved from https://research.utwente.nl/files/27715527/Dutta12non.pdf
[60]
Gary Edmond, Katherine Biber, Richard Kemp, and Glenn Porter. 2009. Law’s looking glass: Expert identification evidence derived from photographic and video images. Curr. Iss. Crim. Just. 20, 3 (2009), 337–377.
[61]
Bernhard Egger, William A. P. Smith, Ayush Tewari, Stefanie Wuhrer, Michael Zollhoefer, Thabo Beeler, Florian Bernard, Timo Bolkart, Adam Kortylewski, Sami Romdhani, Christian Theobalt, Blanz Volker, and Thomas Vetter. 2020. 3D morphable face models—Past, present, and future. ACM Trans. Graph. 39, 5 (2020), 1–38.
[62]
ENFSI. 2018. ENFSI-BPM-DI-01 Version 01 - January 2018. Best Practice Manual for Facial Image Comparison. Retrieved from https://enfsi.eu/wp-content/uploads/2017/06/ENFSI-BPM-DI-01.pdf
[63]
Rosemary J. Erickson and Rita James Simon. 1998. The Use of Social Science Data in Supreme Court Decisions. University of Illinois Press.
[64]
Alejandro J. Estudillo, Peter Hills, and Hoo Keat Wong. 2021. The effect of face masks on forensic face matching: An individual differences study. J. Appl. Res. Mem. Cog. 10, 4 (2021), 554–563.
[65]
Xin Fan, Shichao Cheng, Kang Huyan, Minjun Hou, Risheng Liu, and Zhongxuan Luo. 2020. Dual neural networks coupling data regression with explicit priors for monocular 3D face reconstruction. IEEE Trans. Multim. 23 (2020), 1252–1263.
[66]
Haiwen Feng, Timo Bolkart, Joachim Tesch, Michael J. Black, and Victoria Abrevaya. 2022. Towards racially unbiased skin tone estimation via scene disambiguation. In 17th European Conference on Computer Vision (ECCV’22). Springer, 72–90.
[67]
Zuzana Ferková and Petr Matula. 2019. Multimodal point distribution model for anthropological landmark detection. In IEEE International Conference on Image Processing (ICIP’19). IEEE, 2986–2990.
[68]
Zuzana Ferková, Petra Urbanová, Dominik Černỳ, Marek Žuži, and Petr Matula. 2020. Age and gender-based human face reconstruction from single frontal image. Multim. Tools Applic. 79, 5 (2020), 3217–3242.
[69]
Claudio Ferrari, Giuseppe Lisanti, Stefano Berretti, and Alberto Del Bimbo. 2016. Effective 3D based frontalization for unconstrained face recognition. In 23rd International Conference on Pattern Recognition (ICPR’16). IEEE, 1047–1052.
[70]
Paolo Ferrua. 2008. Metodo scientifico e processo penale. Diritto penale e processo (2008), 12–19.
[71]
Agence France-Presse. 2017. From ale to jail: Facial recognition catches criminals at China beer festival. Guardian 1 (2017). Retrieved from https://www.theguardian.com/world/2017/sep/01/facial-recognition-china-beer-festival
[72]
Danilo Franco, Luca Oneto, Nicolò Navarin, and Davide Anguita. 2021. Toward learning trustworthily from data combining privacy, fairness, and explainability: An application to face recognition. Entropy 23, 8 (2021), 1047.
[73]
Haibin Fu, Shaojun Bian, Ehtzaz Chaudhry, Andres Iglesias, Lihua You, and Jian Jun Zhang. 2021. State-of-the-art in 3D face reconstruction from a single RGB image. In International Conference on Computational Science. Springer, 31–44.
[74]
Giorgio Fumera, Gian Luca Marcialis, Battista Biggio, Fabio Roli, and Stephanie Caswell Schuckers. 2014. Multimodal anti-spoofing in biometric recognition systems. In Handbook of Biometric Anti-spoofing. Springer, 165–184.
[75]
Mary Grace Galterio, Simi Angelic Shavit, and Thaier Hayajneh. 2018. A review of facial biometrics security for smart devices. Computers 7, 3 (2018), 37.
[76]
Clare Garvie. 2016. The Perpetual Line-up: Unregulated Police Face Recognition in America. Georgetown Law, Center on Privacy & Technology.
[77]
Zhenglin Geng, Chen Cao, and Sergey Tulyakov. 2020. Towards photo-realistic facial expression manipulation. Int. J. Comput. Vis. 128, 10 (2020), 2744–2761.
[78]
Athinodoros S. Georghiades, Peter N. Belhumeur, and David J. Kriegman. 2000. From few to many: Generative models for recognition under variable pose and illumination. In 4th IEEE International Conference on Automatic Face and Gesture Recognition. IEEE, 277–284.
[79]
Cognitec Systems GmbH. 2013. FaceVACS SDK 8.8.0. Retrieved from http://www.cognitec-systems
[80]
Mislav Grgic, Kresimir Delac, and Sonja Grgic. 2011. SCface—Surveillance cameras face database. Multim. Tools Applic. 51, 3 (2011), 863–879.
[81]
Ralph Gross, Iain Matthews, Jeffrey Cohn, Takeo Kanade, and Simon Baker. 2010. Multi-PIE. Image Vis. Comput. 28, 5 (2010), 807–813.
[82]
Patrick J. Grother, P. Jonathon Phillips, and George W. Quinn. 2011. Report on the Evaluation of 2D Still-image Face Recognition Algorithms. US Department of Commerce, National Institute of Standards and Technology.
[83]
FISWG. 2018. Facial image comparison feature list for morphological analysis. Retrieved from https://fiswg.org/FISWG_Morph_Analysis_Feature_List_v2.0_20180911.pdf
[84]
FISWG. 2019. Facial comparison overview and methodology guidelines. Retrieved from https://fiswg.org/fiswg_facial_comparison_overview_and_methodology_guidelines_V1.0_20191025.pdf
[85]
FISWG. 2021. Image factors to consider in facial image comparison. Retrieved from https://fiswg.org/fiswg_image_factors_to_consider_in_facial_img_comparison_v1.0_2021.05.28.pdf
[86]
FISWG. 2021. Physical stability of facial features of adults. Retrieved from https://fiswg.org/fiswg_physical_stability_of_facial_features_of_adults_v2.0_2021.05.28.pdf
[87]
Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. 2018. A survey of methods for explaining black box models. ACM Comput. Surv. 51, 5 (2018), 1–42.
[88]
David Gunning, Mark Stefik, Jaesik Choi, Timothy Miller, Simone Stumpf, and Guang-Zhong Yang. 2019. XAI–Explainable artificial intelligence. Sci. Robot. 4, 37 (2019), eaay7120.
[89]
Guodong Guo and Na Zhang. 2019. A survey on deep learning based face recognition. Comput. Vis. Image Underst. 189 (2019), 102805.
[90]
Jianzhu Guo, Xiangyu Zhu, and Zhen Lei. 2018. 3DDFA. Retrieved from https://github.com/cleardusk/3DDFA
[91]
Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yang, Zhen Lei, and Stan Z. Li. 2020. Towards fast, accurate and stable 3D dense face alignment. In European Conference on Computer Vision (ECCV’20).
[92]
Linfeng Guo and Yan Meng. 2006. What is wrong and right with MSE? In 8th IASTED International Conference on Signal and Image Processing. 212–215.
[93]
Hu Han and Anil K. Jain. 2012. 3D face texture modeling from uncalibrated frontal and profile images. In IEEE 5th International Conference on Biometrics: Theory, Applications and Systems (BTAS’12). IEEE, 223–230.
[94]
M. Hassaballah and Saleh Aly. 2015. Face recognition: Challenges, achievements and future directions. IET Comput. Vis. 9, 4 (2015), 614–626.
[95]
Tal Hassner, Shai Harel, Eran Paz, and Roee Enbar. 2015. Effective face frontalization in unconstrained images. In IEEE Conference on Computer Vision and Pattern Recognition. 4295–4304.
[96]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition. 770–778.
[97]
Carrie L. Heike, Kristen Upson, Erik Stuhaug, and Seth M. Weinberg. 2010. 3D digital stereophotogrammetry: A practical guide to facial image acquisition. Head Face Med. 6, 1 (2010), 1–11.
[98]
Javier Hernandez-Ortega, Javier Galbally, Julian Fiérrez, and Laurent Beslay. 2020. Biometric quality: Review and application to face recognition with FaceQnet. arXiv preprint arXiv:2006.03298 (2020).
[99]
Dallas Hill, Christopher D. O’Connor, and Andrea Slane. 2022. Police use of facial recognition technology: The potential for engaging the public through co-constructed policy-making. Int. J. Police Sci. Manag. 24, 3 (2022).
[100]
Kashmir Hill. 2020. Wrongfully accused by an algorithm. In Ethics of Data and Analytics. Auerbach Publications, 138–142.
[101]
Yu-Jin Hong. 2022. Facial identity verification robust to pose variations and low image resolution: Image comparison based on anatomical facial landmarks. Electronics 11, 7 (2022), 1067.
[102]
Xiao Hu, Shaohu Peng, Li Wang, Zhao Yang, and Zhaowen Li. 2017. Surveillance video face recognition with single sample per person based on 3D modeling and blurring. Neurocomputing 235 (2017), 46–58.
[103]
Gary B. Huang, Marwan Mattar, Tamara Berg, and Eric Learned-Miller. 2008. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Workshop on Faces in “Real-Life” Images: Detection, Alignment, and Recognition.
[104]
Patrik Huber, Guosheng Hu, Rafael Tena, Pouria Mortazavian, P. Koppen, William J. Christmas, Matthias Ratsch, and Josef Kittler. 2016. A multiresolution 3D morphable face model and fitting framework. In 11th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications.
[105]
Lucas Introna and Helen Nissenbaum. 2010. Facial recognition technology a survey of policy and implementation issues. Lancaster University: The Department of Organisation, Work and Technology (2010).
[106]
ISO/IEC JTC 1/SC 37 Biometrics. ISO/IEC DIS 29794-1 Information Technology – Biometric Sample Quality – Part 1: Framework. Retrieved from https://www.iso.org/standard/79519.html
[107]
ISO/IEC JTC 1/SC 37 Biometrics. ISO/IEC WD TS 24358 Face-aware capture subsystem specifications. Retrieved from https://www.iso.org/standard/78489.html
[108]
Maëlig Jacquet and Christophe Champod. 2020. Automated face recognition in forensic science: Review and perspectives. Forens. Sci. Int. 307 (2020), 110124.
[109]
Rabia Jafri and Hamid R. Arabnia. 2009. A survey of face recognition techniques. J. Inf. Process. Syst. 5, 2 (2009), 41–68.
[110]
Anil K. Jain, Debayan Deb, and Joshua J. Engelsma. 2021. Biometrics: Trust, but verify. IEEE Trans. Biomet., Behav. Ident. Sci. 4, 3 (2021).
[111]
Anil K. Jain, Brendan Klare, and Unsang Park. 2011. Face recognition: Some challenges in forensics. In IEEE International Conference on Automatic Face & Gesture Recognition (FG’11). IEEE, 726–733.
[112]
Anil K. Jain and Arun Ross. 2015. Bridging the gap: From biometrics to forensics. Philos. Trans. R. Soc. B: Biol. Sci. 370, 1674 (2015), 20140254.
[113]
László A. Jeni, Jeffrey F. Cohn, and Fernando De La Torre. 2013. Facing imbalanced data—Recommendations for the use of performance metrics. In Humaine Association Conference on Affective Computing and Intelligent Interaction. IEEE, 245–251.
[114]
Felix Juefei-Xu, Dipan K. Pal, Karanhaar Singh, and Marios Savvides. 2015. A preliminary investigation on the sensitivity of COTS face recognition systems to forensic analyst-style face processing for occlusions. In IEEE Conference on Computer Vision and Pattern Recognition Workshops. 25–33.
[115]
Dervis Karaboga et al. 2005. An Idea Based on Honey Bee Swarm for Numerical Optimization. Technical Report TR06. Erciyes University.
[116]
Jane Kaye, Linda Briceño Moraia, Liam Curren, Jessica Bell, Colin Mitchell, Sirpa Soini, Nils Hoppe, Morten Øien, and Emmanuelle Rial-Sebbag. 2016. Consent for biobanking: The legal frameworks of countries in the BioSHaRE-EU project. Biopreserv. Biobank. 14, 3 (2016), 195–200.
[117]
Brendan F. Klare, Mark J. Burge, Joshua C. Klontz, Richard W. Vorder Bruegge, and Anil K. Jain. 2012. Face recognition performance: Role of demographic information. IEEE Trans. Inf. Forens. Secur. 7, 6 (2012), 1789–1801.
[118]
Krista F. Kleinberg and Peter Vanezis. 2007. Variation in proportion indices and angles between selected facial landmarks with rotation in the Frankfort plane. Med., Sci. Law 47, 2 (2007), 107–116.
[119]
Krista F. Kleinberg, Peter Vanezis, and A. Mike Burton. 2007. Failure of anthropometry as a facial identification technique using high-quality photographs. J. Forens. Sci. 52, 4 (2007), 779–783.
[120]
Yassin Kortli, Maher Jridi, Ayman Al Falou, and Mohamed Atri. 2020. Face recognition systems: A survey. Sensors 20, 2 (2020), 342.
[121]
K. S. Krishnapriya, Vítor Albiero, Kushal Vangara, Michael C. King, and Kevin W. Bowyer. 2020. Issues related to face recognition accuracy varying based on race and skin tone. IEEE Trans. Technol. Soc. 1, 1 (2020), 8–20.
[122]
Kelsey M. Kyllonen and Keith L. Monson. 2020. Depiction of ethnic facial aging by forensic artists and preliminary assessment of the applicability of facial averages. Forens. Sci. Int. 313 (2020), 110353.
[123]
Simone Maurizio La Cava, Giulia Orrù, Tomáš Goldmann, Martin Drahansky, and Gian Luca Marcialis. 2022. 3D face reconstruction for forensic recognition—A survey. In 26th International Conference on Pattern Recognition (ICPR’22). IEEE, 930–937.
[124]
Napa Lakshmi and Megha P. Arakeri. 2018. Face recognition in surveillance video for criminal investigations: A review. In International Conference on Communication, Networks and Computing. Springer, 351–364.
[125]
Timothy Lau and Alex Biedermann. 2019. Assessing AI output in legal decision-making with nearest neighbors. Penn St. L. Rev. 124 (2019), 609.
[126]
Anja Leipner, Zuzana Obertová, Martin Wermuth, Michael Thali, Thomas Ottiker, and Till Sieberth. 2019. 3D mug shot—3D head models from photogrammetry for forensic identification. Forens. Sci. Int. 300 (2019), 6–12.
[127]
Jiawei Li, Yiming Li, Xingchun Xiang, Shu-Tao Xia, Siyi Dong, and Yun Cai. 2020. Tnt: An interpretable tree-network-tree learning framework using knowledge distillation. Entropy 22, 11 (2020), 1203.
[128]
Pei Li, Patrick J. Flynn, Loreto Prieto, and Domingo Mery. 2019. Face recognition in low quality images: A survey. ACM Comput. Surv. 1, 1 (2019).
[129]
Stan Z. Li, RuFeng Chu, ShengCai Liao, and Lun Zhang. 2007. Illumination invariant face recognition using near-infrared images. IEEE Trans. Pattern Anal. Mach. Intell. 29, 4 (2007), 627–639.
[130]
Yue Li, Liqian Ma, Haoqiang Fan, and Kenny Mitchell. 2018. Feature-preserving detailed 3D face reconstruction from a single image. In 15th ACM SIGGRAPH European Conference on Visual Media Production. 1–9.
[131]
Jie Liang, Feng Liu, Huan Tu, Qijun Zhao, and Anil K. Jain. 2018. On mugshot-based arbitrary view face recognition. In 24th International Conference on Pattern Recognition (ICPR’18). IEEE, 3126–3131.
[132]
Jie Liang, Huan Tu, Feng Liu, Qijun Zhao, and Anil K. Jain. 2020. 3D face reconstruction from mugshots: Application to arbitrary view face recognition. Neurocomputing 410 (2020), 12–27.
[133]
Jiangke Lin, Yi Yuan, Tianjia Shao, and Kun Zhou. 2020. Towards high-fidelity 3D face reconstruction from in-the-wild images using graph convolutional networks. In IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5891–5900.
[134]
Feng Liu, Ronghang Zhu, Dan Zeng, Qijun Zhao, and Xiaoming Liu. 2018. Disentangling features in 3D face shapes for joint face reconstruction and recognition. In IEEE Conference on Computer Vision and Pattern Recognition. 5216–5225.
[135]
J. E. Loohuis. 2021. Synthesising Security Camera Images for Face Recognition. B.S. thesis. University of Twente.
[136]
Nessa Lynch, Liz Campbell, Joe Purshouse, and Marcin Betkier. 2020. Facial recognition technology in New Zealand: Towards a legal and ethical framework. (2020).
[137]
Xanthé Mallett and Martin P. Evison. 2013. Forensic facial comparison: Issues of admissibility in the development of novel analytical technique. J. Forens. Sci. 58, 4 (2013), 859–865.
[138]
Gian Luca Marcialis, Fabio Roli, and Gianluca Fadda. 2014. A novel method for head pose estimation based on the “Vitruvian Man.” Int. J. Mach. Learn. Cybern. 5, 1 (2014), 111–124.
[139]
Giulia Margagliotti and Timothy Bollé. 2019. Machine learning & forensic science. Forens. Sci. Int. 298 (2019), 138–139.
[140]
MarketsandMarkets. 2021. Video Surveillance Market with COVID-19 Impact Analysis, By Offering (Hardware (Camera, Storage Device, Monitor), Software (Video Analytics, VMS), Service (VSaaS)), System (IP, Analog), Vertical, and Geography - Global Forecast to 2026. Retrieved from https://www.marketsandmarkets.com/Market-Reports/video-surveillance-market-645.html
[141]
Vincenzo Mastronardi and M. Dellisanti Fabiano Vilardi. 2014. Ricostruzione della Scena del Crimine in 3D. Mondo Digitale 13, 53 (2014).
[142]
Vincenzo Maria Mastronardi and Giuseppe Castellini. 2009. Meredith: Luci e ombre a Perugia. Armando Editore.
[143]
Lerato Masupha, Tranos Zuva, Seleman Ngwira, and Omobayo Esan. 2015. Face recognition techniques, their advantages, disadvantages and performance evaluation. In International Conference on Computing, Communication and Security (ICCCS’15). IEEE, 1–5.
[144]
Brianna Maze, Jocelyn Adams, James A. Duncan, Nathan Kalka, Tim Miller, Charles Otto, Anil K. Jain, W. Tyler Niggel, Janet Anderson, Jordan Cheney, and Patrick Grother. 2018. IARPA Janus benchmark - C: Face dataset and protocol. In International Conference on Biometrics (ICB’18). 158–165.
[145]
Rachel Metz. 2020. Portland passes broadest facial recognition ban in the US. CNN. Retrieved from https://edition.cnn.com/2020/09/09/tech/portland-facial-recognition-ban/index.html
[146]
Didier Meuwly, Daniel Ramos, and Rudolf Haraksim. 2017. A guideline for the validation of likelihood ratio methods used for forensic evidence evaluation. Forens. Sci. Int. 276 (2017), 142–153.
[147]
Hoda Mohaghegh, Farid Boussaid, Hamid Laga, Hossein Rahmani, and Mohammed Bennamoun. 2023. Robust monocular 3D face reconstruction under challenging viewing conditions. Neurocomputing 520 (2023), 82–93.
[148]
Araceli Morales, Gemma Piella, and Federico M. Sukno. 2021. Survey on 3D face reconstruction from uncalibrated images. Comput. Sci. Rev. 40 (2021), 100400.
[149]
Emilio Mordini. 2017. Ethics and policy of forensic biometrics. In Handbook of Biometrics for Forensic Science. Springer, 353–365.
[150]
Reuben Moreton. 2021. Forensic face matching. In Forensic Face Matching: Research and Practice. Oxford University Press.
[151]
Reuben Moreton and Johanna Morley. 2011. Investigation into the use of photoanthropometry in facial image comparison. Forens. Sci. Int. 212, 1-3 (2011), 231–237.
[152]
Vidhyashree Nagaraju and Lance Fiondella. 2016. A survey of homeland security biometrics and forensics research. In IEEE Symposium on Technologies for Homeland Security (HST’16). IEEE, 1–7.
[153]
Cedric Neumann, Ian W. Evett, and James Skerrett. 2012. Quantifying the weight of evidence from a forensic fingerprint comparison: A new paradigm. J. R. Stat. Soc.: Series A (Stat. Soc.) 175, 2 (2012), 371–415.
[154]
NeuroTechnology. 2004. VeriLook. Retrieved from http://www.neurotechnology.com
[155]
Joao Neves, Juan Moreno, and Hugo Proença. 2018. QUIS-CAMPI: An annotated multi-biometrics data feed from surveillance scenarios. IET Biomet. 7, 4 (2018), 371–379.
[156]
Xin Ning, Fangzhe Nan, Shaohui Xu, Lina Yu, and Liping Zhang. 2020. Multi-view frontal face image generation: A survey. Concurr. Comput.: Pract. Exper. 35, 18 (2020).
[157]
Unsang Park and Anil K. Jain. 2010. Face matching and retrieval using soft biometrics. IEEE Trans. Inf. Forens. Secur. 5, 3 (2010), 406–415.
[158]
Pascal Paysan, Reinhard Knothe, Brian Amberg, Sami Romdhani, and Thomas Vetter. 2009. A 3D face model for pose and illumination invariant face recognition. In 6th IEEE International Conference on Advanced Video and Signal-based Surveillance. IEEE, 296–301.
[159]
Yuxi Peng. 2019. Face recognition at a distance: Low-resolution and alignment problems. University of Twente.
[160]
Alex Pentland, Baback Moghaddam, and Thad Starner. 1994. View-based and modular eigenspaces for face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (1994).
[161]
P. Jonathon Phillips, Hyeonjoon Moon, Syed A. Rizvi, and Patrick J. Rauss. 2000. The FERET evaluation methodology for face-recognition algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 22, 10 (2000), 1090–1104.
[162]
P. J. Phillips, A. N. Yates, Y. Hu, C. A. Hahn, E. Noyes, K. Jackson, J. G. Cavazos, G. Jeckeln, R. Ranjan, S. Sankaranarayanan, J. C. Chen, C. D. Castillo, R. Chellappa, D. White, and A. J. O’Toole. 2018. Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms. Proc. Nat. Acad. Sci. 115, 24 (2018), 6171–6176.
[163]
Paulo Henrique Pisani, Abir Mhenni, Romain Giot, Estelle Cherrier, Norman Poh, André Carlos Ponce de Leon Ferreira de Carvalho, Christophe Rosenberger, and Najoua Essoukri Ben Amara. 2019. Adaptive biometric systems: Review and perspectives. ACM Comput. Surv. 52, 5 (2019), 1–38.
[164]
Bo Qiu. 2020. Application analysis of face recognition technology in video investigation. In Journal of Physics: Conference Series, Vol. 1651. IOP Publishing, 012132.
[165]
Siti Zaharah Abd Rahman, Siti Norul Huda Sheikh Abdullah, Lim Eng Hao, Mohammed Hasan Abdulameer, Nazri Ahmad Zamani, and Mohammad Zaharudin A. Darus. 2016. Mapping 2D to 3D forensic facial recognition via bio-inspired active appearance model. Jurnal Teknologi 78, 2-2 (2016).
[166]
Inioluwa Deborah Raji and Joy Buolamwini. 2019. Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In AAAI/ACM Conference on AI, Ethics, and Society. 429–435.
[167]
Daniel Ramos and Joaquin Gonzalez-Rodriguez. 2013. Reliable support: Measuring calibration of likelihood ratios. Forens. Sci. Int. 230, 1-3 (2013), 156–169.
[168]
Daniel Ramos, Ram P. Krish, Julian Fierrez, and Didier Meuwly. 2017. From biometric scores to forensic likelihood ratios. In Handbook of Biometrics for Forensic Science. Springer, 305–327.
[169]
Karl Ricanek and Tamirat Tesafaye. 2006. Morph: A longitudinal image database of normal adult age-progression. In 7th International Conference on Automatic Face and Gesture Recognition (FGR’06). IEEE, 341–345.
[170]
Edward M. Robinson. 2016. Crime Scene Photography. Academic Press.
[171]
Andrea Macarulla Rodriguez, Zeno Geradts, and Marcel Worring. 2022. Calibration of score based likelihood ratio estimation in automated forensic facial image comparison. Forens. Sci. Int. 334 (2022), 111239.
[172]
Gemma Rotger Moll, Francesc Moreno-Noguer, Felipe Lumbreras, and Antonio Agudo Martínez. 2019. Detailed 3D face reconstruction from a single RGB image. J. WSCG (Plzen, Print) 27, 2 (2019), 103–112.
[173]
Ernestina Sacchetto. 2020. Face to face: Il complesso rapporto tra automated facial recognition technology e processo penale. La legislazione penale (2020), 1–14.
[174]
Debanjan Sadhya and Sanjay Kumar Singh. 2019. A comprehensive survey of unimodal facial databases in 2D and 3D domains. Neurocomputing 358 (2019), 188–210.
[175]
Michael J. Saks and Jonathan J. Koehler. 2005. The coming paradigm shift in forensic identification science. Science 309, 5736 (2005), 892–895.
[176]
Angelo Salici and Claudio Ciampini. 2017. Automatic face recognition and identification tools in the forensic science domain. In International Tyrrhenian Workshop on Digital Communication. Springer, 8–17.
[177]
Soubhik Sanyal, Timo Bolkart, Haiwen Feng, and Michael Black. 2019. Learning to regress 3D face shape and expression from an image without 3D supervision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR’19). 7763–7772.
[178]
Arman Savran, Neşe Alyüz, Hamdi Dibeklioğlu, Oya Çeliktutan, Berk Gökberk, Bülent Sankur, and Lale Akarun. 2008. Bosphorus database for 3D face analysis. In European Workshop on Biometrics and Identity Management. Springer, 47–56.
[179]
A. Scalfati. 2019. Le indagini atipiche (2nd ed.). G. Giappichelli Editore.
[180]
Morgan Klaus Scheuerman, Kandrea Wade, Caitlin Lustig, and Jed R. Brubaker. 2020. How we’ve taught algorithms to see identity: Constructing race and gender in image databases for facial analysis. Proc. ACM Hum.-comput. Interact. 4, CSCW1 (2020), 1–35.
[181]
Torsten Schlett, Christian Rathgeb, Olaf Henniger, Javier Galbally, Julian Fierrez, and Christoph Busch. 2022. Face image quality assessment: A literature survey. ACM Comput. Surv. 54, 10s (2022), 1–49.
[182]
Mohamad Firham Efendy Md Senan, Siti Norul Huda Sheikh Abdullah, Wafa Mohd Kharudin, and Nur Afifah Mohd Saupi. 2017. CCTV quality assessment for forensics facial recognition analysis. In 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence. IEEE, 649–655.
[183]
Sahil Sharma and Vijay Kumar. 2021. 3D landmark-based face restoration for recognition using variational autoencoder and triplet loss. IET Biomet. 10, 1 (2021), 87–98.
[184]
Biao Shi, Huaijuan Zang, Rongsheng Zheng, and Shu Zhan. 2019. An efficient 3D face recognition approach using Frenet feature of iso-geodesic curves. J. Vis. Commun. Image Represent. 59 (2019), 455–460.
[185]
Jiazheng Shi, Ashok Samal, and David Marx. 2006. How effective are landmarks and their geometry for face recognition? Comput. Vis. Image Underst. 102, 2 (2006), 117–133.
[186]
Yichun Shi and Anil K. Jain. 2019. Probabilistic face embeddings. In IEEE/CVF International Conference on Computer Vision. 6902–6911.
[187]
Raymond P. Siljander and Lance W. Juusola. 2012. Clandestine Photography: Basic to Advanced Daytime and Nighttime Manual Surveillance Photography Techniques-For Military Special Operations Forces, Law Enforcement, Intelligence Agencies, and Investigators. Charles C. Thomas Publisher.
[188]
Terence Sim, Simon Baker, and Maan Bsat. 2002. The CMU pose, illumination, and expression (PIE) database. In 5th IEEE International Conference on Automatic Face Gesture Recognition. IEEE, 53–58.
[189]
Sima Soltanpour, Boubakeur Boufama, and Q. M. Jonathan Wu. 2017. A survey of local feature methods for 3D face recognition. Pattern Recog. 72 (2017), 391–406.
[190]
Nicole A. Spaun. 2007. Forensic biometrics from images and video at the Federal Bureau of Investigation. In 1st IEEE International Conference on Biometrics: Theory, Applications, and Systems. IEEE, 1–3.
[191]
Nicole A. Spaun. 2009. Facial comparisons by subject matter experts: Their role in biometrics and their training. In International Conference on Biometrics. Springer, 161–168.
[192]
Ailsa Strathie and Allan McNeill. 2016. Facial wipes don’t wash: Facial image comparison by video superimposition reduces the accuracy of face matching decisions. Appl. Cog. Psychol. 30, 4 (2016), 504–513.
[193]
Ailsa Strathie, Allan McNeill, and David White. 2012. In the dock: Chimeric image composites reduce identification accuracy. Appl. Cog. Psychol. 26, 1 (2012), 140–148.
[194]
Abby Stylianou, Richard Souvenir, and Robert Pless. 2019. Visualizing deep similarity networks. In IEEE Winter Conference on Applications of Computer Vision (WACV’19). IEEE, 2029–2037.
[195]
Ambika Suman. 2008. Using 3D pose alignment tools in forensic applications of face recognition. In IEEE 2nd International Conference on Biometrics: Theory, Applications and Systems. IEEE, 1–6.
[196]
Philipp Terhörst, Jan Niklas Kolf, Marco Huber, Florian Kirchbuchner, Naser Damer, Aythami Morales Moreno, Julian Fierrez, and Arjan Kuijper. 2021. A comprehensive study on face recognition biases beyond demographics. IEEE Trans. Technol. Soc. 3, 1 (2021), 16–30.
[197]
Massimo Tistarelli, Enrico Grosso, and Didier Meuwly. 2014. Biometrics in forensic science: Challenges, lessons and new technologies. In International Workshop on Biometric Authentication. Springer, 153–164.
[198]
Pedro Tome, Julian Fierrez, Ruben Vera-Rodriguez, and Mark S. Nixon. 2014. Soft biometrics and their application in person recognition at a distance. IEEE Trans. Inf. Forens. Secur. 9, 3 (2014), 464–475.
[199]
Pedro Tome, Julian Fierrez, Ruben Vera-Rodriguez, and Daniel Ramos. 2013. Identification using face regions: Application and assessment in forensic scenarios. Forens. Sci. Int. 233, 1-3 (2013), 75–83.
[200]
Pedro Tome, Ruben Vera-Rodriguez, Julian Fierrez, and Javier Ortega-Garcia. 2015. Facial soft biometric features for forensic face recognition. Forens. Sci. Int. 257 (2015), 271–284.
[201]
Bill Triggs. 1997. Autocalibration and the absolute quadric. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 609–614.
[202]
Matthew A. Turk and Alex P. Pentland. 1991. Face recognition using eigenfaces. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 586–587.
[203]
Petra Urbanová, Zuzana Ferková, Marie Jandová, Mikoláš Jurda, Dominik Černý, and Jiří Sochor. 2018. Introducing the FIDENTIS 3D face database. Anthropol. Rev. 81, 2 (2018), 202–223.
[204]
Chris van Dam, Raymond Veldhuis, and Luuk Spreeuwers. 2012. Towards 3D facial reconstruction from uncalibrated CCTV footage. In Information Theory in the Benelux and the 2nd Joint WIC/IEEE Symposium on Information Theory and Signal Processing in the Benelux. 228.
[205]
Chris van Dam, Raymond Veldhuis, and Luuk Spreeuwers. 2013. Landmark-based model-free 3D face shape reconstruction from video sequences. In International Conference of the BIOSIG Special Interest Group (BIOSIG’13). IEEE, 1–5.
[206]
Chris van Dam, Raymond Veldhuis, and Luuk Spreeuwers. 2016. Face reconstruction from image sequences for forensic face comparison. IET Biomet. 5, 2 (2016), 140–146.
[207]
Ruben Vera-Rodriguez, Pedro Tome, Julian Fierrez, Nicomedes Expósito, and Francisco Javier Vega. 2013. Analysis of the variability of facial landmarks in a forensic scenario. In International Workshop on Biometrics and Forensics (IWBF’13). IEEE, 1–4.
[208]
Ruben Vera-Rodriguez, Pedro Tome, Julian Fierrez, and Javier Ortega-Garcia. 2013. Comparative analysis of the variability of facial landmarks for forensics using CCTV images. In Pacific-Rim Symposium on Image and Video Technology. Springer, 409–418.
[209]
Rajesh Verma, Navdha Bhardwaj, Arnav Bhavsar, and Kewal Krishan. 2022. Towards facial recognition using likelihood ratio approach to facial landmark indices from images. Forens. Sci. Int.: Rep. 5 (2022), 100254.
[210]
Frank Wallhoff, Stefan Muller, and Gerhard Rigoll. 2001. Recognition of face profiles from the MUGSHOT database using a hybrid connectionist/HMM approach. In IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. IEEE, 1489–1492.
[211]
Haitao Wang, Yangsheng Wang, and Hong Wei. 2003. Face representation and reconstruction under different illumination conditions. In 7th International Conference on Information Visualization. IEEE, 72–78.
[212]
Mei Wang, Weihong Deng, Jiani Hu, Xunqiang Tao, and Yaohai Huang. 2019. Racial faces in the wild: Reducing racial bias by information maximization adaptation network. In IEEE/CVF International Conference on Computer Vision. 692–702.
[213]
Wei Wang, Jing Dong, and Bo Tieniu Tan. 2017. Position determines perspective: Investigating perspective distortion for image forensics of faces. In IEEE Conference on Computer Vision and Pattern Recognition Workshops. 1–9.
[214]
Andrew B. Watson, Quingmin J. Hu, and John F. McGowan III. 2001. Digital video quality metric based on human vision. J. Electron. Imag. 10, 1 (2001), 20–29.
[215]
Craig I. Watson. 1993. NIST special database 18. NIST Mugshot Identification Database (MID). US National Institute of Standards and Technology.
[216]
David White, Kristin Norell, P. Jonathon Phillips, and Alice J. O’Toole. 2017. Human factors in forensic face identification. In Handbook of Biometrics for Forensic Science. Springer, 195–218.
[217]
S. M. Willis, L. McKenna, S. McDermott, G. O’Donell, A. Barrett, B. Rasmusson, A. Nordgaard, C. E. H. Berger, M. J. Sjerps, J. Lucena-Molina, G. Zadora, C. Aitken, L. Lunt, C. Champod, A. Biedermann, T. Hicks, and F. Taroni. 2015. ENFSI Guideline for Evaluative Reporting in Forensic Science. ENFSI. Retrieved from https://enfsi.eu
[218]
Sarah Wu. 2019. Somerville City Council passes facial recognition ban. Retrieved from https://www.bostonglobe.com
[219]
Wei Xiong, Hongyu Yang, Pei Zhou, Keren Fu, and Jiangping Zhu. 2021. Spatiotemporal correlation-based accurate 3D face imaging using speckle projection and real-time improvement. Appl. Sci. 11, 18 (2021), 8588.
[220]
Yanjun Yan and Lisa Ann Osadciw. 2008. Bridging biometrics and forensics. In Security, Forensics, Steganography, and Watermarking of Multimedia Contents X, Vol. 6819. SPIE, 278–285.
[221]
Moi Hoon Yap, Nazre Batool, Choon-Ching Ng, Mike Rogers, and Kevin Walker. 2021. A survey on facial wrinkles detection and inpainting: Datasets, methods, and challenges. IEEE Trans. Emerg. Topics Comput. Intell. 5, 4 (2021).
[222]
Bangjie Yin, Luan Tran, Haoxiang Li, Xiaohui Shen, and Xiaoming Liu. 2019. Towards interpretable face recognition. In IEEE/CVF International Conference on Computer Vision. 9348–9357.
[223]
Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, and Manmohan Chandraker. 2017. Towards large-pose face frontalization in the wild. In IEEE International Conference on Computer Vision. 3990–3999.
[224]
Mineo Yoshino, Hideaki Matsuda, Satoshi Kubota, Kazuhiko Imaizumi, and Sachio Miyasaka. 2000. Computer-assisted facial image identification system using a 3-D physiognomic range finder. Forens. Sci. Int. 109, 3 (2000), 225–237.
[225]
Dilovan Asaad Zebari, Araz Rajab Abrahim, Dheyaa Ahmed Ibrahim, Gheyath M. Othman, and Falah Y. H. Ahmed. 2021. Analysis of dense descriptors in 3D face recognition. In IEEE 11th International Conference on System Engineering and Technology (ICSET’21). IEEE, 171–176.
[226]
Chris G. Zeinstra, Didier Meuwly, A. C. C. Ruifrok, Raymond N. J. Veldhuis, and Lieuwe Jan Spreeuwers. 2018. Forensic face recognition as a means to determine strength of evidence: A survey. Forens. Sci. Rev. 30, 1 (2018), 21–32.
[227]
Chris G. Zeinstra, Raymond N. J. Veldhuis, Luuk J. Spreeuwers, Arnout C. C. Ruifrok, and Didier Meuwly. 2017. ForenFace: A unique annotated forensic facial image dataset and toolset. IET Biomet. 6, 6 (2017), 487–494.
[228]
Dan Zeng, Shuqin Long, Jing Li, and Qijun Zhao. 2016. A novel approach to mugshot based arbitrary view face recognition. J. Optic. Soc. Korea 20, 2 (2016), 239–244.
[229]
Dan Zeng, Qijun Zhao, Shuqin Long, and Jing Li. 2017. Examplar coherent 3D face reconstruction from forensic mugshot database. Image Vis. Comput. 58 (2017), 193–203.
[230]
Xiaozheng Zhang, Yongsheng Gao, and Maylor K. H. Leung. 2008. Recognizing rotated faces from frontal and side views: An approach toward effective use of mugshot databases. IEEE Trans. Inf. Forens. Secur. 3, 4 (2008), 684–697.
[231]
Zhenyu Zhang, Yanhao Ge, Renwang Chen, Ying Tai, Yan Yan, Jian Yang, Chengjie Wang, Jilin Li, and Feiyue Huang. 2021. Learning to aggregate and personalize 3D face from in-the-wild photo collection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition. 14214–14224.
[232]
Sanqiang Zhao, Wen Gao, Shiguang Shan, and Baocai Yin. 2004. Enhance the alignment accuracy of active shape models using elastic graph matching. In International Conference on Biometric Authentication. Springer, 52–58.
[233]
Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu, and Xiaogang Wang. 2020. Rotate-and-render: Unsupervised photorealistic face rotation from single-view images. In IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5911–5920.
[234]
Michael Zollhöfer, Justus Thies, Pablo Garrido, Derek Bradley, Thabo Beeler, Patrick Pérez, Marc Stamminger, Matthias Nießner, and Christian Theobalt. 2018. State of the art on monocular 3D face reconstruction, tracking, and applications. In Computer Graphics Forum, Vol. 37. Wiley Online Library, 523–550.
[235]
Jin Zou, Xu Fu, Chi Gong, and Yi Shi. 2021. Is face recognition being abused? A case study from the perspective of personal privacy. In 7th Annual International Conference on Network and Information Systems for Computers (ICNISC’21). IEEE, 957–962.

Published In

ACM Computing Surveys  Volume 56, Issue 3
March 2024
977 pages
EISSN:1557-7341
DOI:10.1145/3613568
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 October 2023
Online AM: 24 September 2023
Accepted: 18 September 2023
Revised: 24 August 2023
Received: 01 August 2022
Published in CSUR Volume 56, Issue 3


Author Tags

  1. 3D face reconstruction
  2. forensics
  3. recognition

Qualifiers

  • Survey

Cited By

  • (2024) Implications of the forthcoming forensic sciences standard ISO/IEC 21043 for forensic biometrics. In 12th International Workshop on Biometrics and Forensics (IWBF’24), 1–6. DOI: 10.1109/IWBF62628.2024.10701603
  • (2024) A brief review of recent advances in AI-based 3D modeling and reconstruction in medical, education, surveillance and entertainment. In 29th International Conference on Automation and Computing (ICAC’24), 1–6. DOI: 10.1109/ICAC61394.2024.10718770
  • (2024) Texture and artifact decomposition for improving generalization in deep-learning-based deepfake detection. Engineering Applications of Artificial Intelligence 133 (2024). DOI: 10.1016/j.engappai.2024.108450
  • (2024) Self-supervised learning for fine-grained monocular 3D face reconstruction in the wild. Multimedia Systems 30, 4 (2024). DOI: 10.1007/s00530-024-01436-3
  • (2024) Selection of rapid classifier development methodology used to implement a screening study based on children’s behavior during school lessons. In Human-Centric Decision and Negotiation Support for Societal Transitions, 77–88. DOI: 10.1007/978-3-031-59373-4_7
