1. Introduction
An aircraft spin is a specific flight condition that can occur in all types of aviation. In this state, the trajectory has the characteristic form of a spiral line. Specific actions are required to recover from the spin and avoid a crash. There are two types of spin: flat and steep. In a flat spin, the aircraft’s pitch angle is less than 45 degrees, whereas in a steep spin it is between 45 and 90 degrees. This article concerns the steep case, in which the pilot sees the image of the rotating earth and can return to normal flight by taking appropriate actions.
When recovering from a spin, three phases of flight are defined: rotation, diving, and recovery (see
Figure 1).
Each of them requires a different action from the pilot. This paper proposes a vision-based method for identifying which of these phases the aircraft is in.
The presented solution could be part of the pilot assistance system and, in the future, the basis of the method for automatic spin recovery. The spin recovery procedure does not seem difficult, but the problem is the spatial disorientation of the pilot [
1,
2,
3]. According to the Air Safety Institute of the Aircraft Owners and Pilots Association (AOPA), in the years 2000–2014, 30 percent of stall-related accidents in commercial flights caused fatalities [
4]. Therefore, such a solution would significantly improve safety.
Research on the spin phenomenon, conducted since the beginning of the 20th century, concerns the dynamics of aircraft in this state [
5,
6,
7,
8,
9], or recovery procedures [
10,
11,
12,
13,
14,
15]. Control algorithms are also being developed to enable automatic spin recovery but mainly for military or experimental applications [
16,
17,
18,
19]. However, these methods assume that we can precisely determine the instantaneous state of the aircraft. Proposed solutions are mainly based on inertial sensors, which measure the aircraft state indirectly, for example, through the analysis of angular velocities. Such analysis may sometimes lead to ambiguous results. Therefore, direct measurement using a vision sensor is a desirable and innovative solution.
Vision systems are increasingly used in aviation to detect threats from intruder objects appearing in the operating space [
20,
21,
22], or in navigation [
23,
24,
25]. There are also known solutions that use cameras for spin analysis. Aircraft models are observed in specially designed wind tunnels [
26,
27]. However, these solutions use the view from the perspective of an external observer and are designed to show how the different structural elements of the aircraft influence the character of the spin.
Several works regarding the estimation of flying object state can be found in the literature. In [
28], the attitude of an aircraft model placed in a vertical wind tunnel is measured using stereo vision. To achieve high robustness, markers are attached to the surface of the model. A vision system for estimating the six-degrees-of-freedom pose of a helicopter model is proposed in [
29]. It uses a pan/tilt/zoom ground camera and another small onboard imager. The algorithm is based on tracking five colored blobs placed on the aircraft and a single marker attached to the ground camera. A system that estimates the roll and pitch of precision projectiles by interpreting data from a strapped-down, forward-facing imager is described in [
30]. The solution is based on the horizon detection algorithm, employing the Hough transform and an intensity standard deviation method. Robust, real-time state estimation of micro air vehicles is proposed in [
31]. The method is based on tracking the feature points, such as lines and planes, and the implicit extended Kalman filter. According to the authors, a vision-based estimation is an attractive option, especially in urban environments. In [
32], a vision-based method of aircraft approach angle estimation is presented. Several sequential images are used to determine the horizon and the focus-of-expansion, and then to derive the angle value. A glider control system with vision-based feedback is presented in [
33]. The proposed navigation algorithm allows for reaching the predetermined location. The position of the target in the image is determined by integrating the pixel intensities across the image and performing a cascade of feature matching functions. Then a Kalman filter is used to estimate attitude and glideslope.
The approach proposed in this work is original. To the best of the authors’ knowledge, no other studies have been published in which images from an on-board camera are used to determine the state of an aircraft in a spin. An additional advantage of the vision-based method is that the measurement is passive. It also does not require significant modification of the aircraft structure.
The main novelty and contributions of this paper are: (i) a unique application based on the vision sensor only; (ii) mappings from the image sequence (space-time domain) to the parameter space that determine the rotation axis and the movement direction by a voting technique and maxima detection in the accumulator matrices; (iii) analysis of the accumulator matrices using the histogram spread measure; (iv) a set of rules for estimating the aircraft spinning state; (v) a unique dataset, annotated by a human expert, containing various simulation data as well as preliminary flight recordings, made available to the research community for fair comparisons; (vi) experimental verification of the method using data from the simulator and real recordings from in-flight tests; and (vii) an original application of ELAN, a multimedia data annotation package, for qualitative analysis of the results.
The structure of this paper is as follows.
Section 1 defines the problem and gives the research background and relevant references.
Section 2 describes the proposed method. Experiments are presented in
Section 3.
Section 4 concludes the paper and indicates directions for future work.
3. Experiments
Spin is a dangerous phenomenon. Deliberately performing the spin-entry procedure when testing an experimental method, especially at lower altitudes and with poor visibility, would be extremely risky. Performing some experiments in flight is also impossible because Polish aviation law prohibits aerobatic flights over settlements and other population centers. Therefore, the evaluation of the new approach began with simulation tests, which additionally ensure repeatability of weather conditions.
3.1. Laboratory Setup
The X-Plane 10 professional flight simulator was used [
42,
43]. The simulator operates on the basis of an analytical model of aircraft dynamics and provides images from a virtual camera, taking into account geographical location, terrain diversity, time of year and day, cruising altitude, and atmospheric conditions, including visibility. Obtaining such diverse data under real conditions would, in addition to raising safety issues, be very expensive. The camera was attached close to the aircraft nose. The experiments were carried out using two computers with the following parameters: Intel Core i7-6700K @ 4 GHz, 64 GB RAM, Nvidia GTX 750 Ti. The simulator ran on one of them and the MATLAB/Simulink computing environment on the other. The first flight tests were also performed for selected conditions. Test videos used in the experiments are available at
http://vision.kia.prz.edu.pl/.
3.3. Results Evaluation Methods
Each frame of the manually extracted video fragment corresponding to the entire spin recovery procedure was processed. Manual annotations created by an expert in ELAN, a popular annotation tool, were used as the ground truth [
44,
45]. The results returned by the described method, implemented in MATLAB, were automatically saved in the ELAN file using the annotation API [
46]. Qualitative assessment of the results was made by visual comparison of both annotation layers (
Figure 13).
For the quantitative assessment, the Jaccard index was used, defined as the length of the intersection divided by the length of the union of the ‘human expert’ and ‘our method’ layers:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad (9)$$
where
$A$ is the ground truth (‘human expert’ layer),
$B$ is the prediction (‘our method’ layer),
$|A \cap B|$ is the length of the layers’ overlap, and
$|A \cup B|$ is the length of the layers’ union.
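As an illustration of how the Jaccard index of two annotation layers can be computed, a minimal Python sketch is given below. It assumes that each layer is a list of (start, end) time intervals carrying a single label, which is a simplification. The function names and the interval representation are hypothetical and are not taken from the ELAN API or from the implementation used in the experiments.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds, with start <= end


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge those that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> float:
    """Total duration covered by a layer (overlaps counted once)."""
    return sum(end - start for start, end in merge_intervals(intervals))


def intersection_length(a: List[Interval], b: List[Interval]) -> float:
    """Duration covered by both layers, using a two-pointer sweep."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = 0
    overlap = 0.0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            overlap += hi - lo
        # Advance the interval that finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlap


def jaccard_index(a: List[Interval], b: List[Interval]) -> float:
    """Length of the intersection divided by the length of the union."""
    inter = intersection_length(a, b)
    union = total_length(a) + total_length(b) - inter
    # Assumption: two empty layers are treated as identical.
    return inter / union if union > 0 else 1.0
```

For example, an expert interval of 0 to 4 s and a predicted interval of 1 to 5 s overlap for 3 s out of a 5 s union, so `jaccard_index([(0.0, 4.0)], [(1.0, 5.0)])` evaluates to 0.6.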
3.4. Parameter Selection
The developed method has several parameters characterized in
Table 2.
The fixed step size random search (FSSRS) [
50], with a fitness function based on the average Jaccard index calculated over the entire dataset, was used for parameter selection. The following formula was minimized:
$$f(x) = 1 - \frac{1}{N} \sum_{i=1}^{N} J_i(x),$$
where
$J_i$ is the Jaccard index (see Equation (9)) estimated for the test video $i$,
$N$ is the number of test videos, and
$x$ is the vector of decision variables composed of the method parameters (see
Table 2). The initial decision vector
(a first approximation of method parameters) was selected randomly from the set of allowable values defined in the third column of
Table 2. For the first five parameters, related to the SURF algorithm, this set was defined based on suggestions given in the MathWorks documentation [
47,
48,
49]. For the remaining ones, it was determined experimentally by trial and error. A limit of 100 steps was used as the termination criterion. The lowest value of the fitness function was obtained for the set of parameter values given in
Table 3.
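The search procedure can be sketched in Python as follows. This is a minimal illustration of fixed step size random search under stated assumptions, not the implementation used in the experiments: the decision variables are treated as continuous, candidates are clipped to box bounds, and the names `fssrs`, `bounds`, and `step` are hypothetical. In practice the fitness callable would run the method on every test video and aggregate the Jaccard indices.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def fssrs(
    fitness: Callable[[List[float]], float],
    x0: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    step: float,
    n_steps: int = 100,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `fitness` with fixed step size random search.

    At each step a candidate is drawn at distance `step` from the current
    point in a uniformly random direction. The candidate replaces the
    current point only if it lowers the fitness value.
    """
    rng = random.Random(seed)
    best_x = list(x0)
    best_f = fitness(best_x)
    for _ in range(n_steps):
        # A normalized Gaussian vector gives a uniformly random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in best_x]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        # Assumption: candidates are clipped to the allowable range.
        candidate = [
            min(max(x + step * d / norm, lo), hi)
            for x, d, (lo, hi) in zip(best_x, direction, bounds)
        ]
        candidate_f = fitness(candidate)
        if candidate_f < best_f:
            best_x, best_f = candidate, candidate_f
    return best_x, best_f
```

A call such as `fssrs(f, x0, bounds, step=0.5, n_steps=100)` mirrors the 100-step termination criterion. Because a candidate is accepted only when it improves the fitness, the returned value is never worse than the value at the initial decision vector.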
3.5. Results
The results obtained for the selected parameters are shown in
Table 4.
The graphs obtained for the selected test video are shown in
Figure 14. In
Figure 14a,b, the red lines show the histogram spread measure for $A_R$ and $A_T$, respectively. The green lines correspond to the selected threshold values $T_R$ and $T_T$. The blue ones show the thresholds increased by the deadbands, $T_R + \Delta_R$ and $T_T + \Delta_T$.
Figure 14c shows the state of the aircraft during spin recovery.
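A threshold combined with a deadband is commonly used as a hysteresis comparator, which prevents the decision from toggling when the measure fluctuates around the threshold. The Python sketch below illustrates this general idea only. It assumes that the condition becomes active when the measure rises above the threshold plus the deadband and inactive when it falls below the threshold; which side of the threshold corresponds to which spin phase is not specified here, and the function name `hysteresis` is hypothetical.

```python
from typing import Iterable, List


def hysteresis(
    values: Iterable[float],
    threshold: float,
    deadband: float,
    initial: bool = False,
) -> List[bool]:
    """Apply a two-level comparator to a sequence of measure values.

    The state switches on only above `threshold + deadband` and switches
    off only below `threshold`. Values between the two levels keep the
    previous state.
    """
    state = initial
    states: List[bool] = []
    for value in values:
        if not state and value > threshold + deadband:
            state = True
        elif state and value < threshold:
            state = False
        states.append(state)
    return states
```

With a threshold of 0.3 and a deadband of 0.1, a value of 0.35 leaves the state unchanged whether it is reached from below or from above, which is the intended effect of the deadband.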
For the first group, the Jaccard index was greater than or equal to 0.90 in 11 out of 18 cases. This result is promising, given the range of changes in image brightness (compare
Figure 9a–f). Moreover, surprisingly good results were observed for videos recorded after 21:00, because street and square lighting had a positive effect on the number of keypoints detected. However, the results may be worse for areas with less variation in background brightness.
The results obtained for the second group confirm this hypothesis. It turns out that the method depends on the diversity of the scene. If the spin occurs over areas with a homogeneous structure and small variations in brightness, we get smooth, texture-free images. In such cases, the number of detected and matched keypoints is significantly lower (see
Figure 15a). Therefore, the number of votes for the possible rotation center and the dominant displacement direction is also lower. As a result, the method does not accurately infer the actual motion tendency in the processed video. In 3 of 11 cases, the results were weaker (Jaccard index lower than 0.80). For video sequences recorded over a smooth ocean surface, it was impossible to reliably determine the aircraft state due to the small number of matched keypoints. Spin recovery in such background conditions is also problematic for the pilot.
In the third group, some regularity can be seen. The results are weaker at low and high altitudes. At 2000 feet, the objects seen become quite large, and the edges and corners between them move apart (see
Figure 11a). Because the keypoints are associated with high-frequency elements of the image, their density becomes markedly lower, which results in a lower number of votes (see
Figure 15b). At altitudes of 10,000 and 12,000 feet, the edges and corners are so close together that they begin to “merge” into aggregate objects, which also adversely affects the number of detected keypoints. A possible solution to this problem is a camera with fast-changing zoom, controlled adaptively depending on the altitude of the aircraft. At such high altitudes, the results can also be affected by the transparency of the atmosphere through which the light passes before it reaches the camera lens (see
Figure 11e,f).
Changes in visibility are particularly severe in the last phase of the spin recovery process when the aircraft is in a position close to horizontal (see
Figure 16).
The differences in results observed for group 4 are the effect of different lengths of this phase in individual test videos.
Preliminary experiments on test videos recorded during glider flights were also performed. Flight tests were carried out in September, from 17:00 to 19:00, over an agricultural and forest area, at an altitude of 1500–500 m AGL (Above Ground Level), in CAVOK (Ceiling and Visibility OK) meteorological conditions. The camera was attached to the nose. Its optical axis was approximately parallel to the longitudinal axis of the glider (
Figure 17).
Flights were made just before sunset. The sun was low above the horizon, which resulted in rapid changes in image brightness, depending on the spatial orientation of the aircraft, reflections in the lens, and the presence of underexposed areas on the ground due to long shadows. The recorded videos were used for preliminary testing of the method’s robustness in adverse lighting conditions.
Individual rows of
Figure 18 show selected frames from consecutive phases of five spin executions.
Table 5 summarizes the results obtained for these videos.
The results obtained for demanding real images recorded in flight are promising. It turned out that when the spin is performed during a glider flight, the radius of the spiral line traced by the aircraft is larger. The position of the instantaneous rotation axis, determined by the algorithm, was often outside the image. Therefore, the size of the $A_R$ matrix was doubled. It was also observed that, due to the nonuniform scene illumination, the number of keypoints in some parts of the image was too small. The problem was solved by setting the
MetricThreshold parameter value to 1. The worse results for the first two videos are caused by unwanted glare appearing in the lens when it is in full sunlight (see the second row of
Figure 18). This problem could perhaps be solved by using adaptive image processing algorithms.
In our tests, the single-frame processing time was 250 ms in MATLAB and 30 ms in the C++ implementation for Full HD (1920 × 1080) frames downscaled by a factor of four. The calculations could be sped up further through a parallel implementation or the use of an embedded computing system dedicated to vision applications.