In this section, the feature extraction is described in detail. Feature extraction plays a key role in the automatic recognition system: its algorithm determines not only the RSR but also the robustness of the system. The features include signal features and image features. The section is organized as follows. First, the signal features based on second-order statistics, the power spectral density (PSD), and instantaneous properties are described. Then, the Choi–Williams distribution is introduced, and the eight types of radar waveforms are shown as CWD images. After that, image preprocessing based on image morphology is addressed. Finally, the image features of the waveforms are estimated and extracted from the CWD images.
Table 1 lists the necessary features that are presented in network1 and network2. Meanwhile, in order to keep the classifiers as concise as possible, other features were considered but are not listed, for example, higher-order moments and cumulants (up to the fourth order), Pseudo–Zernike moments (up to the eighth order), and other instantaneous properties. However, these features were not found to be sufficiently discriminative for recognition. The selection of the final features is described in Section 5.
4.2. Choi–Williams Distribution (CWD)
The Choi–Williams distribution is a member of Cohen's class [26] and can effectively reduce the interference caused by cross terms:

$$\mathrm{CWD}(t,\omega)=\frac{1}{4\pi^{2}}\iiint x\!\left(\mu+\frac{\tau}{2}\right)x^{*}\!\left(\mu-\frac{\tau}{2}\right)\phi(\theta,\tau)\,e^{\,j(\theta\mu-\theta t-\omega\tau)}\,d\mu\,d\tau\,d\theta,$$

where $t$ and $\omega$ are the time and frequency axes, and $\phi(\theta,\tau)$ is the kernel function given by

$$\phi(\theta,\tau)=\exp\!\left(-\frac{\theta^{2}\tau^{2}}{\sigma}\right).$$

The kernel function acts as a low-pass filter that suppresses the cross terms. $\sigma$ is a controllable factor: the larger $\sigma$ is, the more pronounced the cross terms become. A fixed value of $\sigma$ is used in this paper to balance cross-term suppression against resolution. The CWD transformations of the eight types of waveforms are shown in Figure 4. A method for the fast calculation of the CWD can be found in [10]. The structure of the fast CWD method is based on the standard fast Fourier transform (FFT); therefore, the number of sampling points is recommended to be a power of two, such as 128, 256, or 512. In this paper, 1024 × 1024 points are selected.
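To make the definition concrete, a brute-force discrete CWD can be sketched as follows. This is a minimal sketch, not the fast FFT-based method of [10]: the window lengths `mu_win`/`lag_win`, the default `sigma = 1.0`, and the discrete kernel normalisation are illustrative assumptions.

```python
import numpy as np

def cwd(x, sigma=1.0, mu_win=10, lag_win=None):
    """Brute-force discrete Choi-Williams distribution (for short signals).

    Returns an (N x N) time-frequency array; frequency bin k corresponds to
    k/(2N) cycles/sample because the lag advances in steps of two samples.
    """
    N = len(x)
    L = lag_win if lag_win is not None else N // 4
    W = np.zeros((N, N), dtype=complex)
    mus = np.arange(-mu_win, mu_win + 1)
    for n in range(N):
        acf = np.zeros(N, dtype=complex)   # kernel-smoothed local autocorrelation
        acf[0] = abs(x[n]) ** 2            # lag 0: the kernel collapses to a delta
        for l in range(1, L + 1):
            # Gaussian kernel in mu; normalised discretely instead of analytically
            k = np.exp(-sigma * mus.astype(float) ** 2 / (4.0 * l * l))
            k /= k.sum()
            for sign in (+1, -1):
                s = 0.0 + 0.0j
                for m, w in zip(mus, k):
                    i, j = n + m + sign * l, n + m - sign * l
                    if 0 <= i < N and 0 <= j < N:
                        s += w * x[i] * np.conj(x[j])
                acf[(sign * l) % N] = s    # negative lags wrap to the top bins
        W[n] = np.fft.fft(acf)             # FFT over the lag axis
    return W
```

For a pure complex sinusoid at 0.125 cycles/sample, the energy of the resulting image concentrates at the corresponding frequency bin, as expected of a Cohen-class distribution.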
4.3. Image Preprocessing
In the following parts, the real part of the CWD result of each waveform is treated as a 2D image, and digital image processing is used to extract the features of interest. In this part, the CWD image is converted into a binary image through three operations. First, the detected signal is shorter than 1024 points in most cases, so zero padding is applied before the 1024-point CWD transformation is taken. Then, the CWD image is resized to a smaller size to reduce the computational load. Finally, the resized image is converted to a binary image based on a global thresholding algorithm [27].
The operation steps are as follows:
(a) transform the resized image into a grey-scale image with values in $[0, 1]$;
(b) estimate the initial threshold $T$, which can be obtained as the average of the minimum and maximum values of the image;
(c) divide the image into two pixel groups $G_{1}$ and $G_{2}$ by comparison with the threshold $T$: $G_{1}$ includes all pixels of the image whose values are greater than $T$, and $G_{2}$ includes all pixels whose values are less than or equal to $T$;
(d) calculate the average values $m_{1}$ and $m_{2}$ of the two pixel groups $G_{1}$ and $G_{2}$, respectively;
(e) update the threshold value, $T = (m_{1} + m_{2})/2$;
(f) repeat (b–e) and calculate the change of the threshold, $\Delta T$, until $\Delta T$ is smaller than a predefined convergence value (0.001 is used in this paper);
(g) output the final binary image.
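The iterative thresholding steps above can be sketched in a few lines. This is a minimal sketch: the 0.001 convergence value follows the text, while the function name and the assumption of a non-constant input image are illustrative.

```python
import numpy as np

def iterative_threshold(img, eps=1e-3):
    """Basic iterative global thresholding of steps (a)-(g).

    Assumes a non-constant input image. Returns a boolean binary image.
    """
    img = np.asarray(img, dtype=float)
    g = (img - img.min()) / (img.max() - img.min())  # (a) grey image in [0, 1]
    T = 0.5 * (g.min() + g.max())                    # (b) initial threshold
    while True:
        g1, g2 = g[g > T], g[g <= T]                 # (c) two pixel groups
        m1 = g1.mean() if g1.size else T             # (d) group averages
        m2 = g2.mean() if g2.size else T
        T_new = 0.5 * (m1 + m2)                      # (e) update the threshold
        if abs(T_new - T) < eps:                     # (f) convergence check
            return g > T_new                         # (g) final binary image
        T = T_new
```

On a bimodal image the loop converges in one or two iterations, splitting the two grey-level populations cleanly.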
After the image binarization, however, some isolated noise and processing noise remain in the binary images. Isolated noise is generated because the signal is transmitted in a noisy environment. In addition, processing noise is generated by the kernel of the CWD itself; it appears as special straight lines, thin but long: the width of such a line is less than three pixels, whereas the majority of these lines are longer than half of the image length. In order to remove the noise, a morphological opening (erosion followed by dilation) is applied. Erosion and dilation are the basic operations of morphological image processing. Morphological techniques probe an image with a small shape or template called a structuring element, which is positioned at all possible locations in the image and compared with the corresponding neighbourhood of pixels. The structuring element is said to fit the image if, for each of its pixels set to 1, the corresponding image pixel is also 1; it is said to hit, or intersect, the image if, for at least one of its pixels set to 1, the corresponding image pixel is also 1. The opening is so called because it can open up a gap between objects connected by a thin bridge of pixels, and any regions that survive the erosion are restored to their original size by the dilation. In this paper, a structuring element of 3 × 3 pixels is used. After the opening operation, groups smaller than 10% of the size of the largest group are removed. The image processing is illustrated in
Figure 5.
4.4. Image Features
The number of objects in the binary image is a useful feature. For example, the Frank code and P3 each produce two objects, while LFM and P1 produce only one; Costas codes, however, produce many objects at different locations. In order to distinguish the different waveforms, two features are used in this paper: the numbers of objects whose sizes are larger than 20% and larger than 50% of the size of the largest object, respectively.
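The two object-count features can be sketched as follows (a minimal sketch; the function name is illustrative, the 20%/50% thresholds come from the text):

```python
import numpy as np
from scipy import ndimage

def object_count_features(binary):
    """Counts of objects larger than 20% and 50% of the largest object."""
    labels, n = ndimage.label(binary)
    if n == 0:
        return 0, 0
    sizes = np.asarray(ndimage.sum(binary, labels, index=range(1, n + 1)))
    largest = sizes.max()
    return int((sizes > 0.2 * largest).sum()), int((sizes > 0.5 * largest).sum())
```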
A feature is also extracted from the location of the maximum energy of the image along the time axis, where the image considered is a resized version of the CWD image and $N$ is the length of the sampled data.
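The exact normalisation in the paper's equation was not recoverable; the sketch below assumes the feature is the column index of maximum energy, normalised by the number of time bins, with time running along axis 1.

```python
import numpy as np

def time_of_max_energy(img):
    """Normalised time location of the maximum energy of a CWD image."""
    energy = (np.asarray(img, dtype=float) ** 2).sum(axis=0)  # energy per time column
    return np.argmax(energy) / energy.size                    # normalised to [0, 1)
```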
The standard deviation of the width of the signal objects and the rotation angle of the largest object are appropriate features for discriminating the polyphase codes. Namely, the width-deviation feature is suitable for separating the “stepped” waveforms (Frank code and P1) from the “linear” waveforms (LFM, P3, and P4). Among the eight types of waveforms, only P2 has a negative slope; therefore, P2 can easily be picked out from the others by the rotation-angle parameter. The features are calculated as follows:
- * Nearest neighbor interpolation is used in the rotation processing.
Next, the largest object is selected and the others are removed; the skeleton of the largest object is extracted, and its linear trend is estimated by the method of minimizing the squared errors. The linear trend is subtracted from the skeleton of the object to obtain the residual vector $d$. The standard deviation of $d$ is given by

$$\sigma_{d}=\sqrt{\frac{1}{M}\sum_{i=1}^{M}\left(d_{i}-\bar{d}\right)^{2}},$$

where $M$ is the length of $d$.
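The detrending step can be sketched as follows (a minimal sketch; `np.polyfit` performs the least-squares line fit described in the text, and the function name is illustrative):

```python
import numpy as np

def skeleton_residual_std(skeleton_y):
    """Subtract a least-squares linear trend from the skeleton curve and
    return the standard deviation of the residual vector d."""
    y = np.asarray(skeleton_y, dtype=float)
    t = np.arange(y.size)
    a, b = np.polyfit(t, y, 1)    # least-squares line fit
    d = y - (a * t + b)           # residual vector d
    return d.std()
```

A perfectly linear skeleton (LFM-like) yields a residual deviation near zero, whereas a stepped skeleton (Frank-like) yields a clearly positive value.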
In order to express the randomness of $d$, a statistical test is proposed. The details of the test are as follows:
(a) calculate the average $\bar{d}$ of $d$;
(b) binarize $d$: $b_{i}=1$ if $d_{i}>\bar{d}$ and $b_{i}=0$ otherwise, $i=1,\dots,M$;
(c) a consecutive sequence of 0’s or 1’s is called a unit, and $R$ is the number of units. Let $N_{1}$ and $N_{0}$ denote the numbers of 1’s and 0’s, respectively, with $N=N_{0}+N_{1}$;
(d) calculate the mean of units, i.e., $\mu_{R}=\frac{2N_{0}N_{1}}{N}+1$;
(e) calculate the variance of units, i.e., $\sigma_{R}^{2}=\frac{2N_{0}N_{1}\left(2N_{0}N_{1}-N\right)}{N^{2}(N-1)}$;
(f) calculate the value of the test statistic $Y$, i.e., $Y=\frac{R-\mu_{R}}{\sigma_{R}^{2}}$;
(g) output the probability feature, i.e., $P=2\left(1-\Phi\left(\lvert Y\rvert\right)\right)$.
- * where $\Phi(\cdot)$ is the standard normal cumulative distribution function. The value of $P$ is between 0 and 1.
- ** Note that $P$ is no longer a probability; it is a measure of the similarity with a Gaussian distribution, because the standard deviation value is too small for machine precision and is therefore replaced by the variance in (f).
When $P$ is closer to 1, the test signal is more similar to a Gaussian distribution; when $N$ is greater than 50, the distribution of $R$ approaches a normal distribution. However, the values of the standard deviation are too small for machine precision in the case of the P4 code; therefore, $R$ is standardized with the variance as described above. The feature $P$ is used in network2 only.
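The runs test above can be sketched as follows. The mean and variance of the run count follow the standard Wald–Wolfowitz formulas; the variance-for-standard-deviation substitution follows the footnote, and the final Φ-based mapping is a reconstruction of the lost equation, not the paper's verbatim formula.

```python
import numpy as np
from math import erf, sqrt

def runs_test_feature(d, use_variance=True):
    """Wald-Wolfowitz-style runs test on the residual vector d."""
    d = np.asarray(d, dtype=float)
    b = (d > d.mean()).astype(int)             # (a)-(b) binarize around the mean
    R = 1 + int((np.diff(b) != 0).sum())       # (c) number of units (runs)
    n1 = int(b.sum())
    n0 = b.size - n1
    N = n0 + n1
    mu = 2.0 * n0 * n1 / N + 1.0               # (d) mean of units
    var = 2.0 * n0 * n1 * (2.0 * n0 * n1 - N) / (N * N * (N - 1.0))  # (e)
    scale = var if use_variance else sqrt(var) # footnote: variance replaces std
    Y = (R - mu) / scale                       # (f) test statistic
    PhiY = 0.5 * (1.0 + erf(abs(Y) / sqrt(2.0)))  # standard normal CDF
    return 2.0 * (1.0 - PhiY)                  # (g) in [0, 1]; near 1 if Gaussian-like
```

A strictly alternating sequence has far more runs than chance allows, so the classical (standard-deviation) form of the test drives the feature toward 0.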
The other three features are obtained from the autocorrelation of $d$. Figure 6 indicates the differences between P1 and P4; the following features are introduced in order to characterize these differences.
The first is the ratio of the main maximum to the sidelobe maximum of the autocorrelation, where the sidelobe maximum is searched beyond the location of the first minimum of the lag values in the autocorrelation sequence. The FFT result is selected to characterize the power of oscillation of the autocorrelation; the FFT magnitude is normalized by its own maximum value, so the range of the resulting feature is $[0, 1]$.
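The autocorrelation features can be sketched as follows. The paper's exact equations were lost in extraction, so this follows the prose only: the main-to-sidelobe ratio and a normalised FFT-based oscillation measure; the search for the first local minimum is an assumption.

```python
import numpy as np

def acf_features(d):
    """Main/sidelobe ratio and oscillation power of the autocorrelation of d."""
    d = np.asarray(d, dtype=float) - np.mean(d)
    A = np.correlate(d, d, mode="full")[len(d) - 1:]  # non-negative lags
    A = A / A[0]                                      # main peak normalised to 1
    # walk down to the first local minimum after lag 0
    i_min = 1
    while i_min < len(A) - 1 and A[i_min + 1] < A[i_min]:
        i_min += 1
    sidelobe = A[i_min:].max()
    ratio = 1.0 / sidelobe if sidelobe > 0 else np.inf
    # oscillation power: non-DC FFT peak, normalised by the maximum of the spectrum
    spec = np.abs(np.fft.rfft(A))
    osc = spec[1:].max() / spec.max()                 # in [0, 1]
    return ratio, osc
```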
The final features are members of the Pseudo–Zernike moments. Pseudo–Zernike moments are invariant to translation, rotation, scaling, and mirroring, which makes them well suited to pattern-recognition problems [28,29,30]. These invariant features reduce the amount of data needed in training, which simplifies recognition. The $(p+q)$th-order geometric moments of the image are defined as

$$m_{pq}=\sum_{x}\sum_{y}x^{p}y^{q}f(x,y),$$

where $f(x,y)$ is the binary image. The scaling- and translation-invariant central geometric moments are given by

$$\mu_{pq}=\frac{1}{m_{00}^{(p+q)/2+1}}\sum_{x}\sum_{y}(x-\bar{x})^{p}(y-\bar{y})^{q}f(x,y),$$

where $\bar{x}=m_{10}/m_{00}$ and $\bar{y}=m_{01}/m_{00}$.

The scaling- and translation-invariant radial geometric moments are given by

$$R_{pq}=\frac{1}{m_{00}^{(p+q+3)/2}}\sum_{x}\sum_{y}\sqrt{(x-\bar{x})^{2}+(y-\bar{y})^{2}}\,(x-\bar{x})^{p}(y-\bar{y})^{q}f(x,y),$$

with the same $\bar{x}$ and $\bar{y}$.
The Pseudo–Zernike moments are defined as:

$$A_{nl}=\frac{n+1}{\pi}\iint_{x^{2}+y^{2}\le 1}R_{nl}(r)\,e^{-jl\theta}f(x,y)\,dx\,dy,$$

where $r=\sqrt{x^{2}+y^{2}}$, $\theta=\arctan(y/x)$, $n\ge 0$, $|l|\le n$, and the radial polynomials are

$$R_{nl}(r)=\sum_{s=0}^{n-|l|}(-1)^{s}\,\frac{(2n+1-s)!}{s!\,(n+|l|+1-s)!\,(n-|l|-s)!}\,r^{n-s}.$$

The dynamic range can be reduced by calculating the logarithm, i.e., $\ln\lvert A_{nl}\rvert$. Certain members of the Pseudo–Zernike moments are selected. These features are used in network2 only, because the features in network1 already distinguish LFM, Costas codes, BPSK, and the polyphase codes very well.
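The building block of these moments, the scaling- and translation-invariant central geometric moment, can be sketched directly from its definition (a minimal sketch; the function name is illustrative, and the full Pseudo–Zernike computation built on top of these moments is omitted):

```python
import numpy as np

def central_moment(img, p, q):
    """Scaling- and translation-invariant central geometric moment
    mu_pq / m00^((p+q)/2 + 1) of a (binary) image."""
    f = np.asarray(img, dtype=float)
    ys, xs = np.mgrid[: f.shape[0], : f.shape[1]]
    m00 = f.sum()                       # zeroth-order moment (object area)
    xbar = (xs * f).sum() / m00         # centroid coordinates
    ybar = (ys * f).sum() / m00
    mu = ((xs - xbar) ** p * (ys - ybar) ** q * f).sum()
    return mu / m00 ** ((p + q) / 2.0 + 1.0)
```

Because the coordinates are measured from the centroid and normalised by a power of $m_{00}$, the same object shifted inside the image yields exactly the same moment value.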