Camera calibration is essential for automobile driver fatigue detection under a single-sample condition; in this work, calibration is performed with Zhang Zhengyou's calibration method.
2.1. Camera Calibration for Automobile Driver Fatigue Detection
Zhang Zhengyou's calibration method assumes a planar calibration board. A closed-form linear analysis first yields an initial solution for the camera parameters, the maximum likelihood method is then used for nonlinear refinement, and the effect of lens distortion is taken into account during calibration. The specific steps are as follows.
The calculation of the homography matrix: the mapping between a point on the template plane and its image point is called the homography matrix H. Suppose that:

s·m̃ = H·M̃, with H = A[r1 r2 t] (1)

where the homogeneous coordinates are:

m̃ = [u, v, 1]^T, M̃ = [X, Y, 1]^T (2)

Among them, s is the proportionality coefficient, the third homogeneous component is the weight parameter, u represents the abscissa, and v represents the ordinate of the image point.

Let H = [h1 h2 h3]. Then:

[h1 h2 h3] = λA[r1 r2 t] (3)

where λ represents a scale factor, r1 and r2 are the orthonormal columns of the rotation, and r1, r2, and t represent the external parameters of the camera; then:

h1^T A^{−T} A^{−1} h2 = 0
h1^T A^{−T} A^{−1} h1 = h2^T A^{−T} A^{−1} h2 (4)

where A represents the camera's internal parameter matrix, the superscript T denotes transposition, −T denotes the transpose of the inverse, and −1 denotes the matrix inverse. Formula (4) provides two constraints on the internal parameters of the camera; together with the 6 degrees of freedom occupied by the external parameters, they exactly account for the 8 degrees of freedom of the homography of the planar calibration template.
Then, the camera parameters are solved. Suppose that the symmetric matrix B is:

B = A^{−T} A^{−1} =
[B11 B12 B13]
[B12 B22 B23]
[B13 B23 B33] (5)

In Formula (5), the symmetric matrix B has 6 degrees of freedom, and the vector b is defined as b = [B11, B12, B22, B13, B23, B33]^T; then:

h_i^T B h_j = v_ij^T b (6)

According to the constraint condition of Formula (4), we can get the homogeneous equations in the vector v_ij:

[v12^T; (v11 − v22)^T] b = 0 (7)

On the basis of Formula (7), a set of linear equations can be obtained:

V b = 0 (8)

In Formula (8), V represents a matrix of size 2n × 6 stacked from n template images. Assuming that n = 3 template images are available, 6 equations can be listed and b is solved as the null vector of V. The matrix B can be obtained from b, and the key parameters of the camera in each direction can then be obtained through Formula (5).
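The linear step just described can be sketched in a few lines of numpy, under the assumption that the homographies have already been estimated: the two constraints of Formula (4) are stacked per view into V, and b is taken as the right null vector of V via singular value decomposition.

```python
import numpy as np

def v_ij(H, i, j):
    # Row vector v_ij such that h_i^T B h_j = v_ij . b,
    # with b = [B11, B12, B22, B13, B23, B33].
    hi, hj = H[:, i], H[:, j]
    return np.array([
        hi[0] * hj[0],
        hi[0] * hj[1] + hi[1] * hj[0],
        hi[1] * hj[1],
        hi[2] * hj[0] + hi[0] * hj[2],
        hi[2] * hj[1] + hi[1] * hj[2],
        hi[2] * hj[2],
    ])

def solve_b(homographies):
    # Stack the two constraints of Formula (4) per view into V (2n x 6)
    # and take the right null vector of V via SVD (solves V b = 0).
    V = []
    for H in homographies:
        V.append(v_ij(H, 0, 1))                  # h1^T B h2 = 0
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))  # h1^T B h1 = h2^T B h2
    _, _, Vt = np.linalg.svd(np.asarray(V))
    return Vt[-1]  # b is recovered up to scale (and sign)
```

With three or more views under distinct orientations, the null space of V is one-dimensional and b is determined up to scale, as the text states.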
In Formula (9), α, β, and (u0, v0) are, respectively, the scale factors of the image along its two axes and the coordinates of the principal point (together with the skew γ they form the internal parameter matrix A); they are extracted in closed form from the elements of B. The camera's external parameters at different viewpoints can then be obtained by the following formula:

r1 = λA^{−1}h1, r2 = λA^{−1}h2, r3 = r1 × r2, t = λA^{−1}h3, with λ = 1/‖A^{−1}h1‖ (10)
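A minimal numpy sketch of recovering the external parameters from a homography and the intrinsic matrix. The final re-orthogonalization through SVD is a standard practical addition for noisy data, not something stated in the text.

```python
import numpy as np

def extrinsics_from_homography(A, H):
    # Recover R and t from H = A [r1 r2 t]:
    # r1 = lam * A^-1 h1, r2 = lam * A^-1 h2, r3 = r1 x r2, t = lam * A^-1 h3.
    Ainv = np.linalg.inv(A)
    lam = 1.0 / np.linalg.norm(Ainv @ H[:, 0])
    r1 = lam * (Ainv @ H[:, 0])
    r2 = lam * (Ainv @ H[:, 1])
    r3 = np.cross(r1, r2)
    t = lam * (Ainv @ H[:, 2])
    R = np.column_stack([r1, r2, r3])
    # With noisy data R is not exactly orthonormal; project it onto the
    # nearest rotation matrix via SVD (standard practice).
    U, _, Vt = np.linalg.svd(R)
    return U @ Vt, t
```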
Finally, the maximum likelihood estimation method is used to refine the solution. Suppose there are n images of the calibration template, each image has m points, and the data of each point are corrupted by independently distributed noise. The objective function is set up as follows:

min Σ_{i=1}^{n} Σ_{j=1}^{m} ‖ m_ij − m̂(A, k1, k2, R_i, t_i, M_j) ‖² (11)

In Formula (11), m_ij represents the observed image coordinates of the j-th point in the i-th image, R_i and t_i represent the rotation and translation of the i-th image, m̂(·) represents the projected image point coordinates computed from the current parameter estimate (including the distortion coefficients k1, k2), and M_j represents the world coordinates of the j-th point. Error characteristic curves were used to evaluate the positioning accuracy of the proposed method on different datasets, with the formula:

E = (1/N) Σ_{i=1}^{N} e_i (12)

In Formula (12), e_i represents the error in the coordinates of the i-th test sample, and N represents the total number of test samples.
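Reading Formula (12) as the mean Euclidean positioning error over the N test samples (an assumption on our part, since only the error terms and sample count are described), it can be computed as:

```python
import numpy as np

def mean_positioning_error(pred, gt):
    # Mean Euclidean error over N test samples: E = (1/N) * sum of per-sample errors.
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return np.linalg.norm(pred - gt, axis=1).mean()
```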
To sum up, Zhang Zhengyou's calibration method does not require a precision-machined calibration block, the calibration process is robust, and the internal and external parameters of the camera and the distortion coefficients can all be obtained at one time with high precision and strong practicability [10,11].
The binocular vision calibration method is used in this paper, and the binocular calibration experiment is divided into two parts: acquisition of the automobile driver's face images and image calibration. The image calibration part is composed of a calibration board and calibration software. As shown in Figure 1, the left and right cameras are fixed in parallel on a bracket to form a binocular camera, and the calibration board is placed about 1 to 2 m away from the cameras. The world coordinate system is assumed to coincide with the left camera coordinate system, and the lens focal length is adjusted so that image sharpness reaches its best state, improving the precision of automobile driver fatigue detection under the single-sample condition. The calibration board used in the experiment is made from the calibration board template provided by the HALCON machine vision software. HALCON, developed by a German company, is one of the most versatile machine vision packages available today [12]. It provides a variety of high-performance operator libraries, and users can exploit its open architecture to quickly develop programs in the fields of image processing and machine vision. The calibration board has 8 × 8 dots with a center distance of 40 mm. The upper-right corner of the board's rectangular border carries an orientation mark, which allows the camera calibration method to recover the displacement direction of the calibration board. The circular marks allow the center coordinates to be extracted very accurately, and because the circular marking points are arranged in a rectangular array, the calibration method can conveniently extract the pixels corresponding to the mark points.
According to the characteristics of the calibration board, the circle area inside the calibrated plate can be obtained by the threshold segmentation operation. Once the internal region of the calibrated object is found, the edge of each circle on the calibrated board can be extracted by the sub pixel edge extraction method, and the coordinates of each center can be obtained. The specific calibration process is as follows:
- (1)
Keep the binocular camera fixed and move the calibration board, capturing a number of calibration board images at different angles.
- (2)
Detect the centers of the calibration marks in each captured image.
- (3)
Solve all internal and external parameters of the camera using the closed-form solution.
- (4)
Refine all parameters, including the distortion parameters, by minimizing the objective function of Formula (11).
2.2. Image Enhancement of Automobile Driver’s Face
As noted in Section 2.1, noise is distributed over the target points, and the image of the driver's face is also noisy. This section analyzes and improves the Retinex approach to image enhancement. On this basis, a color image enhancement method suited to human vision, multi-scale Retinex with color restoration (MSRCR), is adapted using the image histogram. When the MSRCR method is applied to the face image enhancement of the automobile driver, the gap between the scale parameters is minimized to obtain the best effect, and the visual characteristics of the human eye are taken into account: the color, brightness, and contrast of the image are matched against the visual response of the human eye. The detailed process is as follows:
- (1)
Global processing of the automobile driver's facial image is performed in the logarithmic domain, and the dynamic range of the image is obtained and preliminarily adjusted.
- (2)
Multi-scale processing is applied to adjust the local dynamic range of the automobile driver's facial images.
- (3)
Adaptive color restoration is performed under the Grey World assumption.
- (4)
Histogram correction is used to further improve the global contrast of the automobile driver’s facial images.
Because of an assumption made in the algorithm, part of the image obtained through step (3) is dark and cannot achieve the best visual effect. Therefore, the global contrast of the image is adjusted by histogram interception in step (4) [13,14]. In this method, the parameters are mainly set by hand on the basis of experiments, and the choice of parameters has little effect on the quality of the image enhancement. By using the histogram information of the image, step (4) can be improved as follows.
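Steps (1) and (2) above can be sketched as a single-channel multi-scale Retinex, using a numpy-only separable Gaussian blur. The scale values and equal weights in the sketch are illustrative assumptions; the color restoration and histogram steps are omitted here.

```python
import numpy as np

def gaussian_blur(img, sigma):
    # Separable Gaussian blur with reflective padding (numpy only).
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    out = np.asarray(img, dtype=float)
    for axis in (0, 1):
        out = np.apply_along_axis(
            lambda m: np.convolve(np.pad(m, radius, mode="reflect"), k, mode="valid"),
            axis, out)
    return out

def multi_scale_retinex(img, sigmas=(15, 80, 250), weights=None):
    # MSR: weighted sum over scales of log(I) - log(Gaussian * I).
    # The sigma values are illustrative assumptions, not taken from the text.
    img = np.asarray(img, dtype=float) + 1.0  # avoid log(0)
    weights = weights or [1.0 / len(sigmas)] * len(sigmas)
    out = np.zeros_like(img)
    for w, s in zip(weights, sigmas):
        out += w * (np.log(img) - np.log(gaussian_blur(img, s)))
    return out
```

On a uniform image the response is zero; bright local features produce positive responses, which is what makes the subsequent contrast adjustment of step (4) necessary.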
Among image enhancement methods, linear contrast stretching is a simple and effective algorithm:

I_out = (I_in − x_low) · D / (x_high − x_low) (13)

Here, I_out and I_in represent the output and the input (the image obtained in step (3)), x_low represents the minimum value taken from the image, x_high the maximum, and D represents the dynamic range of the display device. This is a common linear contrast stretching method that accounts for the dynamic range of the display device, but if it is applied directly in step (4) of the MSRCR method without considering the specific content of the image, the enhancement effect is not obvious. Thus, an adaptive selection of the interception points x_low and x_high from the histogram distribution of the image is given below.
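The linear stretch can be sketched as follows, with x_low and x_high as the interception points and d_max as the display maximum (the parameter names are ours):

```python
import numpy as np

def linear_stretch(img, x_low, x_high, d_max=255.0):
    # Map [x_low, x_high] linearly onto the display range [0, d_max],
    # saturating values outside the interception points.
    img = np.clip(np.asarray(img, dtype=float), x_low, x_high)
    return (img - x_low) * d_max / (x_high - x_low)
```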
The probability of each gray value i in the image of the automobile driver is p(i) = n_i / N, where n_i represents the number of pixels with gray level i and N represents the total number of pixels in the image. p(i) is the normalized value of the image histogram, as shown in Figure 2, where the low-end interception point and the high-end interception point are the x_low and x_high mentioned above. The cumulative distribution is:

c(x) = Σ_{i=0}^{x} p(i) (14)

Formula (14) is called the cumulative distribution function (CDF); it gives the proportion of an image's pixels at or below a given intensity. In contrast enhancement, a certain proportion of pixels is allowed to reach upper or lower saturation, expressed by p_high and p_low, and these two points can be found by analyzing the relationship between the CDF and the histogram. Specifically, with p_high and p_low denoting the probabilities of upper and lower saturation, the corresponding image intensities are x_high and x_low, as shown in Figure 3.
When the CDF first exceeds the probability p_low at some gray level, x_low is obtained:

x_low = min{ x : c(x) ≥ p_low } (15)

Starting from the lowest gray level, the histogram is accumulated until the first x satisfying the requirement is found. In the same way, x_high can be obtained:

x_high = min{ x : c(x) ≥ 1 − p_high } (16)

This time the search starts from the highest gray level and decreases until the smallest x satisfying the condition is found. Thus, the cumulative distribution function can be used to obtain the saturation interception points adaptively and complete the enhancement of the automobile driver's face image.
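The adaptive selection of the interception points from the CDF can be sketched as follows; the default saturation proportions are illustrative, and the function assumes integer gray levels.

```python
import numpy as np

def adaptive_intercepts(img, p_low=0.01, p_high=0.01, n_levels=256):
    # Build the histogram and its CDF, then pick:
    # x_low  = first gray level whose CDF reaches p_low,
    # x_high = first gray level whose CDF reaches 1 - p_high.
    hist = np.bincount(np.asarray(img).ravel().astype(int), minlength=n_levels)
    cdf = np.cumsum(hist) / hist.sum()
    x_low = int(np.searchsorted(cdf, p_low))
    x_high = int(np.searchsorted(cdf, 1.0 - p_high))
    return x_low, x_high
```

The resulting x_low and x_high can then be fed into the linear contrast stretch described above.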
In order to further enhance the image, texture mapping is used to map texture images onto spatial entities. The key to this operation is establishing the correspondence between object-space coordinates and texture-space coordinates. The image of a complex object can be pasted onto a simple geometric surface by texture mapping; when displaying the scene in real time, translation and rotation of the 3D image can also be used so that the complex object appears to rotate as the viewing direction changes. The texture mapping process is to create texture patterns in a texture space, determine the mapping relationship between points on the surface of the space object and points in the texture, and map the texture patterns of the texture space onto the object according to a chosen technique.
Many texture picture formats can be used; JPG and RGB are common. Since the work is carried out in ArcGIS under the Windows system, the texture format chosen in this test is BMP. Mapping alone only gives the spatial frame of the automobile driver's face image; to enhance its realism, the texture map must be assigned to the surface of the model, so the quality of the map is very important and directly affects the visual effect. The image is enhanced by the image enhancement of Section 2.2, the brightness of the texture image is improved, and the texture map is realized in the material editor of the facial image recognition software, as shown in Figure 4.
Through the above procedure, the "material editor" is used to simulate the luster, texture, and shading of the driver's face, so that the rendered face image comes closer to reality; at the same time, the texture and related information under the single-sample condition are expressed [15,16,17,18]. It is worth noting that it is difficult to make the map match the correct coordinates and size on the different surfaces of the object, yet this must be done well to achieve a true simulation effect. Therefore, the "UVW expansion" map modifier is used to manage the map coordinates. "UVW" refers to the map coordinate system, which is distinct from the XYZ spatial coordinate system of the image recognition scene; the U, V, and W axes are parallel to the X, Y, and Z axes, respectively. Viewed in map space, U corresponds to X, the horizontal direction of the map; V corresponds to Y, the vertical direction of the map; and W corresponds to Z, the direction perpendicular to the UV plane of the map.
The UVW expansion map modifier expands the UVW coordinates into plane coordinates, so that the surfaces can be unfolded into planes and the relationship between the map and the object surface can be controlled visually. Using the UVW expansion modifier, texture mapping can be controlled accurately and the texture map applied to the object surface precisely, which is especially suitable for automobile driver fatigue detection under single-sample conditions and further enhances the face image of the automobile driver.
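As a generic illustration of looking up texture values from map coordinates (not the HALCON or material-editor implementation), a bilinear texture sampler over normalized (u, v) coordinates can be sketched as:

```python
import numpy as np

def sample_texture(texture, u, v):
    # Bilinear lookup: (u, v) in [0, 1] map to texture columns / rows.
    h, w = texture.shape[:2]
    x = np.clip(np.asarray(u, dtype=float) * (w - 1), 0, w - 1)
    y = np.clip(np.asarray(v, dtype=float) * (h - 1), 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = texture[y0, x0] * (1 - fx) + texture[y0, x1] * fx
    bot = texture[y1, x0] * (1 - fx) + texture[y1, x1] * fx
    return top * (1 - fy) + bot * fy
```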
2.3. Automobile Driver Fatigue Detection under Single Sample Condition Based on Face Image Recognition
This section first carries out facial image recognition, uses the vertical projection method to determine the comfort value of the face, and establishes the constraints of fatigue detection, thereby realizing automobile driver fatigue detection under a single-sample condition [19,20,21].
In the enhanced image, the background is largely filtered out, leaving only the face and some small non-face areas, and the face region occupies a large part of the whole image [22]. At this point, vertical projection can be used to determine the comfort value C of the face (Formula (17)). In Formula (17), P(x) denotes the vertical projection function. When the face comfort value C is greater than 0, the automobile driver is in a fatigued state; when C equals 0, the driver is in a normal condition; and when C is less than 0, the driver is in a comfortable state.
The comfort value of the face image differs markedly from that of the background objects. The comfort distribution over the face also varies, but this has little effect on the average comfort value, and the rest of the face is relatively uniform. Consequently, in the area where the face is located, the projection curve is relatively flat, while at the boundary between face and background there is an abrupt change, i.e., a large gradient at the boundary points. In the vertical projection, the comfort value of the face can therefore be determined by finding the coordinates with the largest change of gray level. The next step is to delete the remaining non-face regions, using connected-region labeling for further processing. The connected-labeling operation on the enhanced image extracts, from the dot-matrix image composed of white and black pixels, the sets of pixels with value "1" and then fills the different connected regions with different numeric labels, yielding the detection value of the statistical connected regions (Formula (18)). In the formula, the quantities are the output parameters and the partial nodes of the image group diagram, respectively. The face is a connected domain, and it is the connected domain with the largest area in the whole image; the remaining non-face regions are filled with black. The upper and lower boundaries of the face are then sought in the image, which can be done by scanning from the top down to the first white pixel. At this point, the face comfort value has been found, and the face area can be cut from the original image.
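The connected-region step described above (label the white-pixel regions, keep the largest, and fill the rest with black) can be sketched with a numpy-only breadth-first labeling:

```python
import numpy as np
from collections import deque

def largest_connected_region(binary):
    # Label 4-connected regions of pixels with value 1 and keep only the
    # largest, filling everything else with 0 (black).
    binary = np.asarray(binary).astype(bool)
    labels = np.zeros(binary.shape, dtype=int)
    sizes, current = {}, 0
    for start in zip(*np.nonzero(binary)):
        if labels[start]:
            continue
        current += 1
        labels[start] = current
        queue, n = deque([start]), 0
        while queue:
            r, c = queue.popleft()
            n += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < binary.shape[0] and 0 <= cc < binary.shape[1]
                        and binary[rr, cc] and not labels[rr, cc]):
                    labels[rr, cc] = current
                    queue.append((rr, cc))
        sizes[current] = n
    if not sizes:
        return labels
    biggest = max(sizes, key=sizes.get)
    return (labels == biggest).astype(int)
```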
From the cut face area, it can be seen that under translation the line connecting any two points on the automobile driver's face remains parallel and of equal length, and the position vectors of different points on the driver's face differ by a constant vector. Therefore, the elements of driver face fatigue detection under a single-sample condition can be divided into three major categories: spot, linear, and surface elements.
Spot element analysis. The spot elements include organs such as the eyes and mouth of the automobile driver. In facial image recognition, the spots of the driver's face are constructed by the texture mapping method. First, two images under the same single-sample condition are prepared, one of them black and white; a box is created and a plane is added. A spotlight is used as the main light source and a reflector as an auxiliary light. The material editor is used for editing, and the spotlight is then adjusted in the modify panel.
Analysis of linear elements. The linear elements include the automobile driver's facial lines, connections, and so on. In facial image recognition, the line is first extended, and the subsequent texture map is then made.
Analysis of surface elements. The surface elements comprise the large areas other than the eyes, eyebrows, lines, etc. In facial image recognition, because the main object of the detection model is a regular set of faces, it is usually analyzed with polygons. The analysis method can be summed up in the following two points:
- (1)
Through analysis of the automobile driver's facial appearance under a single-sample condition, the main body is constructed with as few simple geometric primitives as possible.
- (2)
Local details are omitted; the local appearance is highlighted with characteristic marks, editable lines are used for outline drawing, the lines are turned into faces, and the shape is formed by extrusion.
After the two steps of spatial construction and texture mapping, the automobile driver's face fatigue detection under a single-sample condition is basically complete, and the image is output. To apply the detection method within the ArcGIS software, the material must be matched when outputting the test result, and the texture map must be placed in the same directory as the map; otherwise, the material's texture effect will not be displayed in ArcGIS.
To sum up, automobile driver fatigue detection under a single-sample condition is completed.