Matrix Separation and Poisson Multi-Bernoulli Mixture Filtering for Extended Multi-Target Tracking with Infrared Images

Su, Jian; Zhou, Haiyin; Yu, Qi; Zhu, Jubo; Liu, Jiying

doi:10.3390/electronics13132613


by Jian Su ¹, Haiyin Zhou ¹, Qi Yu ¹, Jubo Zhu ² and Jiying Liu ¹,*

¹ College of Science, National University of Defense Technology, Changsha 410073, China

² College of Artificial Intelligence, Sun Yat-Sen University, Zhuhai 519082, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(13), 2613; https://doi.org/10.3390/electronics13132613

Submission received: 11 May 2024 / Revised: 20 June 2024 / Accepted: 27 June 2024 / Published: 3 July 2024

(This article belongs to the Section Circuit and Signal Processing)


Abstract

Multi-target tracking using infrared images is receiving increasing attention. Among the many state-of-the-art methods, deep learning networks and low-rank and sparse matrix separation are two families with high accuracy. However, the former requires a large number of training samples, and the latter requires high-dimensional processing, so its computational cost is high. In this work, a unified detection and tracking method combining matrix separation and Poisson multi-Bernoulli mixture (PMBM) filtering is proposed. In the detection process, a low-rank and sparse matrix separation algorithm with a differentiable form based on a single image is constructed. In the filtering process, the multi-target state is modeled as a PMBM distribution, which is conjugate in the Bayesian framework. The two processes interact: detection provides measurements, and filtering offers prior information for the next detection to improve accuracy. A theoretical analysis of the computational complexity shows a significant reduction. A numerical analysis on a practical dataset verifies an enhancement in the background suppression factor (BSF) and signal-to-clutter ratio gain (SCRG) metrics and in the receiver operating characteristic (ROC) curves.

Keywords:

target tracking; infrared image; Poisson multi-Bernoulli mixture; low-rank and sparse matrix separation

1. Introduction

Driven by the demands of applications such as video surveillance and space target monitoring, multi-target tracking technology based on infrared images has made great progress [1,2,3]. In these applications, one often encounters challenges such as the coexistence of the target of interest and other interfering targets or complex image backgrounds. Existing target tracking and detection algorithms mainly include two categories: Track-Before-Detect (TBD) [4] and Detect-Before-Track (DBT) algorithms [5]. Particle filtering is the main technology in the TBD category [6]. Since sampling of the probability density function is required, the computational complexity of the method is high, and the performance will be seriously degraded when the background is complex. The corresponding technologies in the DBT group include the transform domain [7], deep learning [8], image separation [9,10] and other technologies. The current research mainly focuses on the latter two categories because of their superior performance. However, since deep learning methods require a large number of labeled training samples [8] and the existing technology based on low-rank and sparse matrix separation requires joint spatial–temporal processing of multiple frames [11,12,13,14,15,16], the computational complexity is high. In addition, due to the improvement in current imaging sensors, targets often show “extended” characteristics; i.e., they occupy several pixels in the image rather than “point” targets [17]. Reasonable exploitation of this feature will further improve the accuracy of target tracking methods.

The main innovations of this work are as follows:

(1): Low-rank and sparse matrix separation and a multi-target filtering method based on random finite sets (RFSs) are integrated into a single framework. The framework exploits the physical characteristics of the target and background in the image and avoids the purely data-driven nature of deep learning methods; at the same time, filtering makes it considerably more efficient than traditional spatial–temporal, multi-frame, high-dimensional processing.
(2): The target obtained in the detection process of low-rank and sparse separation naturally has position and scale information, which is modeled as measurement by RFSs; meanwhile, the filtering process also uses RFSs to describe state parameters. The whole tracking process is realized within a conjugate Bayesian framework.
(3): In the filtering process, the continuity of the target’s motion is introduced through the state equation so that any false alarm that may occur in the detection process can be effectively suppressed; in addition, the result obtained by filtering can be used as prior information for detection in the next frame. Thus, deep fusion is achieved, and this makes the entire framework more accurate.

2. Differential Low-Rank and Sparse Matrix Separation

In target detection for infrared images, the different physical properties of the target and background, namely sparsity and low rank, respectively, are often used to construct a separation model, as shown in (1) [18]:

$$\min_{L, O \in \mathbb{R}^{m \times n}} \left\{ \frac{1}{2} \| Y - L - O \|_{F}^{2} + \lambda_{*} \| L \|_{*} + \lambda_{1} \| O \|_{1} \right\} \tag{1}$$

where $Y$ is a “patch image” matrix whose columns are formed by a series of sliding windows of size $n_{w} \times n_{w}$ taken from the top left to the bottom right of the original image (see Figure 1). Here, $L$ and $O$ are the low-rank and sparse matrices, which represent the background and the targets, respectively. $\| \cdot \|_{*}$ is the nuclear norm of the matrix, which reflects the low-rank constraint, while $\| \cdot \|_{1}$ is the $l_{1}$-norm, which describes the sparsity. The parameters $\lambda_{*}$ and $\lambda_{1}$ adjust the weights of the low-rank and sparse terms. Without losing mathematical equivalence, the nuclear norm of $L$ in (1) can be written in the factorized form

$$\| L \|_{*} = \min_{A, B} \frac{1}{2} \left\{ \| A \|_{F}^{2} + \| B \|_{F}^{2} \right\} \quad \text{s.t.} \quad A B = L \tag{2}$$

where $\| \cdot \|_{F}$ is the Frobenius norm of the matrix. If $L = U \Sigma V^{T}$ is the singular value decomposition of $L$, the minimum is attained at $A = U \Sigma^{\frac{1}{2}}$ and $B = \Sigma^{\frac{1}{2}} V^{T}$.
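The factorization in (2) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `balanced_factors` and the random test matrix are illustrative choices and do not come from the paper.

```python
import numpy as np

def balanced_factors(L):
    """Return A = U Σ^{1/2} and B = Σ^{1/2} V^T from the thin SVD of L."""
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    root = np.sqrt(s)
    A = U * root              # scales the columns of U
    B = root[:, None] * Vt    # scales the rows of V^T
    return A, B, s

rng = np.random.default_rng(0)
L = rng.normal(size=(6, 3)) @ rng.normal(size=(3, 8))   # rank-3 test matrix
A, B, s = balanced_factors(L)

nuclear_norm = s.sum()                                   # sum of singular values
factored_cost = 0.5 * (np.linalg.norm(A, "fro") ** 2 +
                       np.linalg.norm(B, "fro") ** 2)
print(np.allclose(A @ B, L), np.isclose(nuclear_norm, factored_cost))
```

Both checks should hold up to floating-point error, because the columns of $U$ and the rows of $V^{T}$ are orthonormal, so each squared Frobenius norm equals the sum of the singular values.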

Inspired by (2), the low-rank matrix is decomposed into two parts: the dictionary, $D_{0}$, and the coefficient matrix, $S$. Thus, (1) becomes

$$\min_{D_{0}, S, O} \left\{ \frac{1}{2} \| Y - D_{0} S - O \|_{F}^{2} + \frac{\lambda_{*}}{2} \left( \| D_{0} \|_{F}^{2} + \| S \|_{F}^{2} \right) + \lambda \| O \|_{1} \right\} \tag{3}$$

where $L = D_{0} S$, and the dimensions of the matrices are $D_{0} \in \mathbb{R}^{m \times q}$, $S \in \mathbb{R}^{q \times n}$ and $O \in \mathbb{R}^{m \times n}$. A unified fast algorithm can be used to solve (3); the specific process is shown in Algorithm 1. In this algorithm, the increment $(Z^{k + 1} - Z^{k})$ is multiplied by the matrix $H$. This form follows from the differentiable form of the objective function in (3) and greatly improves the computational efficiency. The operator $\pi_{t} (B^{k})$ acts on each component separately; its $i$-th component is $\tau_{t_{i}} (B^{k}_{i}) = \max \{0, B^{k}_{i} - t_{i}\}$, which is only a threshold operation. Reference [19] proves that if a solution obtained by Algorithm 1 satisfies $\| Y - D_{0} S - O \|_{2} \leq \lambda_{*}$, then it is globally optimal.

Algorithm 1 Algorithm for solving the differential form of low-rank and sparse matrix separation.

Input: Data $Y$, dictionary $D_{0}$, parameters $\lambda, \lambda_{*}$ and step size controller $\alpha$

Output: Low-rank matrix $L$, sparse matrix $O$

Define:

$$H = I - \frac{1}{\alpha} \begin{pmatrix} D_{0}^{T} D_{0} + \lambda_{*} I & D_{0}^{T} \\ D_{0} & (1 + \lambda_{*}) I \end{pmatrix}, \quad W = \frac{1}{\alpha} \begin{pmatrix} D_{0}^{T} \\ I \end{pmatrix}, \quad t = \frac{\lambda}{\alpha} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

Initialize: $Z^{0} = 0$, $B^{0} = W Y$

For $k = 0, 1, 2, \ldots$ (until convergence)

$$Z^{k + 1} = \pi_{t} (B^{k}) \tag{4}$$

$$B^{k + 1} = B^{k} + H (Z^{k + 1} - Z^{k}) \tag{5}$$

End

where $Z^{k + 1} = (S; O)$ and $L = D_{0} S$.
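The iteration in Algorithm 1 can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the choice of $\alpha$ as the largest eigenvalue of the block matrix inside $H$, the fixed iteration count, and the synthetic test data are all assumptions. The text gives the threshold as $\max \{0, B^{k}_{i} - t_{i}\}$; the sketch instead uses the standard two-sided soft-thresholding operator, which leaves the rows of $S$ unchanged where the threshold is zero.

```python
import numpy as np

def soft_threshold(B, t):
    """Two-sided soft-thresholding (assumed form); identity where t == 0."""
    return np.sign(B) * np.maximum(np.abs(B) - t, 0.0)

def separate(Y, D0, lam, lam_star, n_iter=500):
    """Run the (4)-(5) recursion and return (L, O) with L = D0 @ S."""
    m, q = D0.shape
    # Block matrix from the "Define" step of Algorithm 1.
    Q = np.block([[D0.T @ D0 + lam_star * np.eye(q), D0.T],
                  [D0, (1.0 + lam_star) * np.eye(m)]])
    alpha = np.linalg.eigvalsh(Q).max()      # assumed step-size rule
    H = np.eye(q + m) - Q / alpha
    W = np.vstack([D0.T, np.eye(m)]) / alpha
    # Threshold 0 on the S rows and lam / alpha on the O rows.
    t = np.concatenate([np.zeros(q), np.full(m, lam / alpha)])[:, None]
    Z = np.zeros((q + m, Y.shape[1]))
    B = W @ Y
    for _ in range(n_iter):
        Z_new = soft_threshold(B, t)         # step (4)
        B = B + H @ (Z_new - Z)              # step (5)
        Z = Z_new
    S, O = Z[:q], Z[q:]
    return D0 @ S, O

# Synthetic check: low-rank background plus one bright "target" entry.
rng = np.random.default_rng(0)
m, q, n = 20, 2, 15
D0, _ = np.linalg.qr(rng.normal(size=(m, q)))   # dictionary with orthonormal columns
Y = D0 @ rng.normal(size=(q, n))
Y[5, 7] += 10.0
L, O = separate(Y, D0, lam=0.5, lam_star=0.1)
print(np.unravel_index(np.abs(O).argmax(), O.shape))
```

On this synthetic input the largest entry of the sparse part is expected at the planted position, while the background is absorbed by $L = D_{0} S$. In practice the loop would stop on a convergence criterion rather than a fixed number of iterations.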

3. PMBM for Extended Multi-Target Tracking

In recent years, as infrared sensors have developed, imaging resolution has steadily increased, so even small targets occupy several pixels in an image and show the characteristics of “extended” targets. To make better use of this information, extended target filtering augments the target state, which originally contains only kinematic information (such as position and velocity), with shape (scale) information.

3.1. PMBM for State Modeling

Extended target tracking requires appropriate mathematical tools to model the kinematic and shape information of the target. In this paper, the Bernoulli RFS in (6) is first introduced to describe a single extended target.

$$p (X) = \begin{cases} 1 - r, & X = \emptyset \\ r \, p (x), & X = \{x\} \\ 0, & |X| \geq 2 \end{cases} \tag{6}$$

In (6), the state set $X$ is empty with a probability of $1 - r$, which means that the target does not exist (it has “disappeared”). The target “exists”, and $X$ is the single-element set $\{x\}$, with a probability of $r$. The function $p (x)$ is a probability density function describing the position (through the expectation) and the extension (through the standard deviation) of the target. Generally, $p (x)$ can be chosen to be a Gaussian distribution.

Furthermore, in order to represent multiple targets, an index set, $I$, is introduced, with one Bernoulli component per index. In this case, the number of targets in $X$ is at most the cardinality of $I$, and a multi-Bernoulli (MB) RFS is obtained:

$$p (X) = \begin{cases} \sum_{X = \cup_{i \in I} X^{i}} \prod_{i \in I} p^{i} (X^{i}), & |X| \leq |I| \\ 0, & |X| > |I| \end{cases} \tag{7}$$

where $X^{i} \cap X^{j} = \emptyset$ for all $i, j \in I$ with $i \neq j$, and $X = \cup_{i \in I} X^{i}$ represents the union of the multiple targets. From (6) and (7), one can conclude that the key parameters of the MB model are $\{(r^{i}, p^{i} (\cdot))\}_{i \in I}$, which represent the existence probability of the $i$-th target and the probability density of its “extended” state, respectively.
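One consequence of (6) and (7) is that the number of existing targets follows the distribution of a sum of independent Bernoulli variables with parameters $r^{i}$. The sketch below computes that cardinality distribution; the function name and the example probabilities are illustrative and not taken from the paper.

```python
def mb_cardinality_pmf(existence_probs):
    """Return [P(|X| = 0), ..., P(|X| = |I|)] for a multi-Bernoulli RFS."""
    pmf = [1.0]                      # with no components, |X| = 0 surely
    for r in existence_probs:
        nxt = [0.0] * (len(pmf) + 1)
        for n, p_n in enumerate(pmf):
            nxt[n] += p_n * (1.0 - r)    # this component does not exist
            nxt[n + 1] += p_n * r        # this component exists
        pmf = nxt
    return pmf

print(mb_cardinality_pmf([0.5, 0.5]))    # [0.25, 0.5, 0.25]
```

The expected number of targets is the sum of the $r^{i}$, and the probability of more than $|I|$ targets is zero, in agreement with the second case of (7).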

Data association is one of the key steps in multi-target tracking; its purpose is to establish a connection between the measurement data and the target states. The mathematical representation of data association is therefore also considered in state modeling. Let $W^{j}, j \in J$ be the weights of several MB RFSs; i.e., for a fixed $j$, there is an MB RFS $\{(r^{j, i}, p^{j, i} (\cdot))\}_{i \in I^{j}}$, and their weighted combination forms a multi-Bernoulli mixture (MBM) RFS. The key parameters of the MBM RFS can be listed as $\{(W^{j}, \{(r^{j, i}, p^{j, i} (\cdot))\}_{i \in I^{j}})\}_{j \in J}$. The weight $W^{j}$ represents the probability of the $j$-th data association hypothesis, i.e., of one particular assignment of the measurements to the target states.

When multiple targets exist, some targets may be missed at some time steps. In (8), the undetected targets, $X^{u}$, are described through a Poisson point process (PPP):

$$p^{u} (X^{u}) = e^{- \langle D^{u} (x), 1 \rangle} \prod_{x \in X^{u}} D^{u} (x) \tag{8}$$

where $D^{u} (x) = \mu^{u} p^{u} (x)$ is the PPP intensity, $\mu^{u}$ is a constant, $p^{u} (x)$ is the probability density of undetected targets, and $\langle D^{u} (x), 1 \rangle = \int D^{u} (x) \, d x = \mu^{u}$ is the inner product.
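As a short worked step, the density in (8) can be checked to integrate to one, and the number of undetected targets is Poisson distributed with mean $\mu^{u}$. The derivation assumes the standard set-integral convention of RFS theory, which the text does not state explicitly.

```latex
\int p^{u}(X)\,\delta X
  = \sum_{n=0}^{\infty} \frac{1}{n!}
    \int e^{-\mu^{u}} \prod_{i=1}^{n} D^{u}(x_{i})\, dx_{1} \cdots dx_{n}
  = e^{-\mu^{u}} \sum_{n=0}^{\infty} \frac{(\mu^{u})^{n}}{n!}
  = 1,
\qquad
\Pr\left(|X^{u}| = n\right) = e^{-\mu^{u}} \frac{(\mu^{u})^{n}}{n!}.
```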

By combining (7) and (8), the state model for extended multi-target tracking, the Poisson multi-Bernoulli mixture (PMBM), can be obtained:

$$p (X) = \sum_{X^{u} \uplus X^{d} = X} p^{u} (X^{u}) \sum_{j \in J} W^{j} p^{j} (X^{d}) \tag{9}$$

where $X^{d}$ and $X^{u}$ ($X^{u} \cap X^{d} = \emptyset$) are the detected and undetected targets, respectively, and

$$p^{j} (X^{d}) = \sum_{X^{d} = \cup_{i \in I^{j}} X^{i}} \prod_{i \in I^{j}} p^{j, i} (X^{i}) \tag{10}$$

In addition to the state model, the probability density function $f_{k + 1, k} (x_{k + 1} | x_{k})$ is used to represent the state transition, $p_{S} (x_{k})$ represents the probability that the target still “exists” from time $k$ to time $k + 1$, and a PPP model with intensity $D_{k + 1}^{b} (x)$ represents the emergence of new targets.

3.2. Measurement Modeling

The measurement refers to the “target part” of the infrared image obtained through the detection process. Let $p_{D} (x)$ be the probability that the target is detected. Given detection, the measurement model is

$$l_{Z} (x) = p_{D} (x) \, p (Z | x) = p_{D} (x) \, e^{- \gamma (x)} \prod_{z \in Z} \gamma (x) \phi (z | x) \tag{11}$$

which is a PPP model with an intensity of $\gamma (x) \phi (z | x)$. In (11), $\gamma (x)$ is the Poisson rate (the expected number of measurements generated by the target) and $\phi (z | x)$ is the spatial distribution of the measurement. On the other hand, the probability that the target generates no measurement is

$$q_{D} (x) = 1 - p_{D} (x) + p_{D} (x) e^{- \gamma (x)} \tag{12}$$
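Equation (12) adds two ways of obtaining no measurement: the target is not detected, or it is detected but the Poisson number of measurements happens to be zero. A minimal sketch follows; the function name is illustrative, and the rate $\gamma = 3$ in the example is an assumed value, while 0.95 is the detection probability reported later in the experiments.

```python
import math

def missed_detection_prob(p_d, gamma):
    """Effective probability of no measurement, following (12)."""
    return 1.0 - p_d + p_d * math.exp(-gamma)

# p_D = 0.95 as in the experiments; gamma = 3 is an assumed Poisson rate.
print(round(missed_detection_prob(0.95, 3.0), 4))   # 0.0973
```

For an extended target that produces many measurements on average, the second term vanishes and $q_{D}$ approaches $1 - p_{D}$.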

Assuming the measurement set is $Z = \{z^{m}\}_{m \in M}$ and $M \cup I^{j}$ is the union of the measurement index set $M$ and the target state index set $I^{j}$, data association is essentially a partition of $M \cup I^{j}$, and the set of all possible partitions is denoted by $A^{j}$. As a simple illustration, if $M = \{m_{1}, m_{2}, m_{3}\}$ and $I^{j} = \{i_{1}^{j}, i_{2}^{j}\}$, then data association assigns these three measurements to the two targets, where each target can have multiple measurements or none. Thus, one result may be $\{m_{1}, m_{2}, i_{1}^{j}\}, \{m_{3}\}, \{i_{2}^{j}\}$, where the measurements $m_{1}$ and $m_{2}$ are associated with the target $i_{1}^{j}$, the target $i_{2}^{j}$ is not measured, and the measurement $m_{3}$ is not associated with any detected target. The cell $\{m_{3}\}$ indicates that $m_{3}$ may be a measurement of a new target or clutter. In multi-target tracking, clutter is also modeled as a PPP with an intensity of $\kappa (z) = \lambda c (z)$.
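The example above can be enumerated exhaustively. The sketch below lists every partition of $M \cup I^{j}$ in which no cell contains more than one target index; this constraint is an assumption that matches the example (each measurement cell belongs to one target, a new target, or clutter), and the function names are illustrative.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of cells (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition                    # `first` in its own cell
        for k, cell in enumerate(partition):           # or joined to an existing cell
            yield partition[:k] + [[first] + cell] + partition[k + 1:]

def associations(measurements, targets):
    """Partitions of M ∪ I in which no cell holds more than one target."""
    target_set = set(targets)
    return [p for p in set_partitions(list(measurements) + list(targets))
            if all(sum(e in target_set for e in cell) <= 1 for cell in p)]

M = ["m1", "m2", "m3"]
I = ["i1", "i2"]
A = associations(M, I)
print(len(A))   # 37
```

Of the 52 partitions of a five-element set, 15 place both targets in the same cell, which leaves 37 valid associations. The rapid growth of this number with the size of $M$ is the reason practical filters prune or cluster the candidate associations.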

3.3. Conjugate Bayesian Filtering

The process of filtering in multi-target tracking refers to obtaining the target state at time $k$ by combining the state transition information with the target state at time $k - 1$ and the measurement at time $k$. Since the states at time $k - 1$ and time $k$ are the prior and posterior distributions, respectively, it is a Bayesian filtering process.

As discussed above, the target state is modeled as a PMBM distribution. It has been proven that in Bayesian filtering, if the prior distribution is a PMBM, then the posterior distribution also satisfies a PMBM. Therefore, the filtering process only needs to update the parameters of the model, eliminating the calculation of the density function itself, thus significantly improving efficiency.

The filtering process can be divided into two steps: prediction and update. Prediction uses the state at time $k - 1$ to “predict” the state at time $k$ through the state transition density $f_{k + 1, k} (x_{k + 1} | x_{k})$. Since the distributional forms in a PMBM are fixed (Gaussian and Poisson), the parameters

$$D^{u} (x), \ \{(W^{j}, \{(r^{j, i}, p^{j, i} (x))\}_{i \in I^{j}})\}_{j \in J} \tag{13}$$

are enough to describe the target state at time $k - 1$. Furthermore, the predicted state also satisfies the PMBM, and the corresponding parameters are [20,21]

$$D_{+}^{u} (x), \ \{(W_{+}^{j}, \{(r_{+}^{j, i}, p_{+}^{j, i} (x))\}_{i \in I^{j}})\}_{j \in J} \tag{14}$$

where

$$\begin{aligned} D_{+}^{u} (x) &= D^{b} (x) + \langle D^{u} (x), p_{S} (x) f_{k + 1, k} (x_{k + 1} | x_{k}) \rangle \\ r_{+}^{j, i} &= \langle p^{j, i} (x), p_{S} (x) \rangle \, r^{j, i} \\ p_{+}^{j, i} (x) &= \frac{\langle p^{j, i} (x), p_{S} (x) f_{k + 1, k} (x_{k + 1} | x_{k}) \rangle}{\langle p^{j, i} (x), p_{S} (x) \rangle} \end{aligned} \tag{15}$$

Since the prediction step is independent of the measurement, $W_{+}^{j}$ equals $W^{j}$.
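The inner products in (15) have a closed form in a common special case. The sketch below assumes, beyond what the text specifies, a constant survival probability $p_{S}$, a Gaussian density $p^{j, i} (x) = \mathcal{N} (x; m, P)$, and a linear Gaussian transition $x_{k + 1} = F x_{k} + w$ with $w \sim \mathcal{N} (0, Q)$. Under these assumptions, the prediction of one Bernoulli component reduces to a Kalman prediction. It covers only the kinematic part of the state; the extent part, the function name, and the constant-velocity model are illustrative.

```python
import numpy as np

def predict_bernoulli(r, mean, cov, F, Q, p_s):
    """Predict one Bernoulli component (r, N(mean, cov)) following (15)."""
    r_pred = p_s * r                    # <p, p_S> = p_S for constant p_S
    mean_pred = F @ mean                # mean of the transitioned Gaussian
    cov_pred = F @ cov @ F.T + Q        # covariance of the transitioned Gaussian
    return r_pred, mean_pred, cov_pred

# Constant-velocity model, state = (x, y, vx, vy), one frame per step.
dt = 1.0
F = np.array([[1.0, 0.0, dt, 0.0],
              [0.0, 1.0, 0.0, dt],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Q = 0.01 * np.eye(4)
r_pred, m_pred, P_pred = predict_bernoulli(
    0.8, np.array([0.0, 0.0, 1.0, 1.0]), np.eye(4), F, Q, p_s=0.99)
print(r_pred, m_pred)
```

The existence probability shrinks by the factor $p_{S}$, the mean moves along the motion model, and the covariance grows, which reflects the loss of certainty between frames.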

The update step uses the predicted state in (14), combined with the measurement set $Z = \{z^{m}\}_{m \in M}$, to obtain the final target state at time $k$. It has been shown that under the PPP measurement model, the updated state also satisfies the PMBM [20,21]:

$$\begin{aligned} p (X | Z) &= \sum_{X^{u} \uplus X^{d} = X} p^{u} (X^{u}) \sum_{j \in J} \sum_{A \in A^{j}} W_{A}^{j} p_{A}^{j} (X^{d}) \\ p^{u} (X^{u}) &= e^{- \langle D^{u} (x), 1 \rangle} \prod_{x \in X^{u}} D^{u} (x) \\ p_{A}^{j} (X^{d}) &= \sum_{X^{d} = \cup_{C \in A} X^{C}} \prod_{C \in A} p_{C}^{j} (X^{C}) \end{aligned} \tag{16}$$

where

$$W_{A}^{j} = \frac{W_{+}^{j} \prod_{C \in A} L_{C}}{\sum_{j' \in J} \sum_{A' \in A^{j'}} W_{+}^{j'} \prod_{C' \in A'} L_{C'}} \tag{17}$$

Here, $L_{C}$ is the likelihood of the cell $C$ of the partition $A$ [20,21], and the updated PPP intensity is $D^{u} (x) = q_{D} (x) D_{+}^{u} (x)$.
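The normalization in (17) can be sketched as follows. The data layout, a list of predicted weights and, for each hypothesis, a list of associations given by their cell likelihoods $L_{C}$, is a hypothetical structure chosen for illustration, and the numbers in the example are invented.

```python
import math

def update_weights(prior_weights, cell_likelihoods):
    """Normalize W_+^j * prod_C L_C over all (j, A) pairs, following (17).

    prior_weights[j] is W_+^j; cell_likelihoods[j][a] is the list of L_C
    values for the cells of the a-th association under hypothesis j.
    """
    unnormalized = {}
    for j, w in enumerate(prior_weights):
        for a, cells in enumerate(cell_likelihoods[j]):
            unnormalized[(j, a)] = w * math.prod(cells)
    total = sum(unnormalized.values())
    return {key: value / total for key, value in unnormalized.items()}

weights = update_weights(
    [0.6, 0.4],
    [[[0.5, 0.2], [0.1]],    # two associations under hypothesis 0
     [[0.3]]])               # one association under hypothesis 1
print(weights)
```

Each pair $(j, A)$ becomes one hypothesis of the updated mixture. For long products of small likelihoods, an implementation would typically work with logarithms to avoid underflow.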

4. The United Framework and Complexity Analysis

4.1. Detection and Filtering Framework

The infrared small-target tracking method in this paper tightly integrates single-frame low-rank and sparse matrix separation with the extended multi-target filtering algorithm. The whole framework is shown in Figure 1. First, sliding windows (small red rectangle) traverse the original image, and the windows are rearranged into the input patch image, $Y$, for low-rank and sparse matrix separation. Second, the algorithm in Section 2 separates the “target pixels” in the image, which have specific positions and shapes. Third, taking the “target pixels” as the measurement for extended multi-target filtering, a state estimate of the multiple targets is obtained by using the state prediction and update steps given in Section 3. Finally, the target state is used as prior information in the low-rank and sparse separation of the next frame, and a new round of detection and filtering is carried out. It is worth noting that this work combines multiple frames through the Bayesian filtering process to improve the accuracy of detection and tracking, rather than through high-dimensional processing as in the existing spatial–temporal methods, so the computational complexity is significantly reduced.
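The first step, rearranging sliding windows into the patch image $Y$, can be sketched as follows. The stride is an assumption, since the text gives only the window size, and the function name is illustrative.

```python
import numpy as np

def patch_image(image, n_w, stride):
    """Stack vectorized n_w x n_w windows, scanned from the top left to
    the bottom right, as the columns of the patch image Y."""
    rows = range(0, image.shape[0] - n_w + 1, stride)
    cols = range(0, image.shape[1] - n_w + 1, stride)
    columns = [image[r:r + n_w, c:c + n_w].reshape(-1)
               for r in rows for c in cols]
    return np.stack(columns, axis=1)

frame = np.zeros((256, 256))                       # placeholder 256 x 256 frame
print(patch_image(frame, n_w=16, stride=16).shape)  # (256, 256)
print(patch_image(frame, n_w=16, stride=8).shape)   # (256, 961)
```

Each column has $n_{w}^{2}$ entries, so the number of rows of $Y$ is fixed by the window size, while the number of columns depends on the stride. Overlapping windows (a stride smaller than $n_{w}$) give more columns and therefore a larger separation problem.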

4.2. Computational Complexity

Recall the dimensions of the matrices in Algorithm 1, i.e., $D_{0} \in \mathbb{R}^{m \times q}$, $S \in \mathbb{R}^{q \times n}$ and $O \in \mathbb{R}^{m \times n}$. The computational complexity mainly comes from the matrix multiplication in Algorithm 1, i.e., the multiplication of the matrices $H$ and $Z^{k + 1} - Z^{k}$ in (5). From their construction, it can be seen that their dimensions are $q (m + q) \times q (m + q)$ and $q (m + q) \times n$, respectively. Thus, the complexity of each iteration is $O [(q (m + q))^{2} \times n]$. A threshold operation is also required during the iteration in (4). Since this operation is performed elementwise, its time complexity is negligible compared to the above matrix multiplication. In PMBM filtering, the measurement is the “sparse” part obtained from the low-rank and sparse decomposition, so its size is very small compared to the original matrix. At the same time, the dimensions of the state vectors in the filtering process are similar to those of the measurement; therefore, the complexity of the filtering process is negligible compared with that of the low-rank and sparse matrix decomposition. In summary, the computational complexity of the entire algorithm is of the order of $O [K ((q (m + q))^{2} \times n)]$, where $K$ is the number of iterations in the low-rank and sparse decomposition process. This complexity allows the algorithm to satisfy the needs of real-time applications.

5. Experiments

In this section, an infrared image dataset from an unmanned aerial vehicle (UAV) is used to verify the performance of the algorithm proposed in this paper. This dataset has the following characteristics: (1) Due to the huge difference in distance between the sensor and the UAV, the scale of the target in the images varies greatly. At a closer distance, the UAV appears as an extended target, and at a longer distance, it appears as a point target. Therefore, it matches the features of the extended target filtering algorithm discussed in this paper. (2) The dataset contains multiple flight processes, and each flight process forms an image sequence. These sequences include both single UAV flights and multiple UAVs flying simultaneously. Meanwhile, there are both sky backgrounds and complex ground backgrounds, which enhances the difficulty of detection and tracking.

After comprehensive analysis, three representative state-of-the-art methods were selected for comparison: four-dimensional spatial–temporal tensor decomposition with a block term decomposition-based norm and multidirectional derivative-based priors (4DST-BTMD) [14]; a weighted adaptive Schatten p-norm and spatial–temporal tensor transpose variability (WASpN-STTTV) model [13]; and the spatial–temporal tensor modeling with saliency filter regularization (STTM-SFR) algorithm [10]. It has been shown that these algorithms outperform multi-frame sequential detection methods such as the spatial–temporal local difference measure (STLDM) [22] and the edge and corner awareness-based spatial–temporal tensor (ECASTT) model [23]. In addition, the advantages of PMBM filtering compared with other RFS filters have been verified in [24].

There are some control parameters in the process of low-rank and sparse matrix decomposition, and their values are chosen as follows. In the construction of the patch image matrix, $Y$, the size of the sliding window is 16 × 16 pixels. The balance parameters, $\lambda_{*}$ and $\lambda$, in (3) are chosen according to the recommendations in [13,14] with fine-tuning for the infrared dataset. Specifically, the sparsity weight (denoted $\lambda_{1}$ in (1) and $\lambda$ in (3)) is set to $\lambda_{1} = c / \sqrt{\min (n_{1}, n_{2}) \times n_{3}}$, where $n_{1}$ and $n_{2}$ are the 2D patch sizes, and $n_{3}$ is the number of patches from the original image. The constant $c$ is initially set to 2.2 and is adjusted adaptively in real time by the PMBM filter, based on the result of the previous time step. The value of $\lambda_{*}$ is set to $0.05 \cdot \lambda_{1}$ to balance the measurements and clutter in the filtering. For PMBM tracking, the detection probability is 0.95, and the false alarm probability is $8 \times 10^{- 4}$.
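The parameter rule above is simple to evaluate. In the sketch below, the function name is illustrative and the patch count $n_{3} = 100$ is a hypothetical value chosen for easy arithmetic; the text does not state the actual count.

```python
import math

def balance_parameters(n1, n2, n3, c=2.2, ratio=0.05):
    """Return (lambda_1, lambda_star) from the rule stated in the text."""
    lam_1 = c / math.sqrt(min(n1, n2) * n3)
    return lam_1, ratio * lam_1

# 16 x 16 patches as in the text; n3 = 100 is a hypothetical patch count.
lam_1, lam_star = balance_parameters(16, 16, 100)
print(lam_1, lam_star)
```

With these inputs the square root equals 40, so $\lambda_{1} = 2.2 / 40 = 0.055$ and $\lambda_{*} = 0.00275$, up to floating-point rounding. A larger number of patches lowers both weights.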

The metrics used for the performance comparison are taken from the studies of the above methods; they include the background suppression factor (BSF) and the signal-to-clutter ratio gain (SCRG) [13]. The BSF is a commonly used indicator for infrared image target detection, defined as

$$\mathrm{BSF} = \sigma_{in} / \sigma_{out} \tag{18}$$

where $\sigma_{in}$ and $\sigma_{out}$ are the standard deviations of the input and the detected image, respectively. The SCRG is defined as follows:

$$\mathrm{SCRG} = 20 \cdot \log_{10} (\mathrm{SCR}_{out} / \mathrm{SCR}_{in}) \tag{19}$$

$$\mathrm{SCR} = (I_{\max} - I_{mean}) / \sigma \tag{20}$$

where $I_{\max}$, $I_{mean}$, and $\sigma$ are the maximum, mean, and standard deviation of the image intensity, respectively. The receiver operating characteristic (ROC) curve [13] is also used to compare the performance of the different algorithms, where the x-axis is the false alarm rate $P_{f}$ and the y-axis is the detection rate $P_{d}$:

$$P_{d} = \frac{\text{number of detected true pixels}}{\text{number of ground-truth target pixels}} \tag{21}$$

$$P_{f} = \frac{\text{number of false alarm pixels}}{\text{total number of pixels}} \tag{22}$$
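The metrics in (18) to (22) can be sketched directly from their definitions. The sketch computes the statistics over the whole array passed in, which is an assumption; evaluations in the literature often restrict them to a neighborhood of the target. The function names are illustrative.

```python
import numpy as np

def bsf(img_in, img_out):
    """Background suppression factor, (18)."""
    return np.std(img_in) / np.std(img_out)

def scr(img):
    """Signal-to-clutter ratio, (20)."""
    img = np.asarray(img, dtype=float)
    return (img.max() - img.mean()) / img.std()

def scrg(img_in, img_out):
    """Signal-to-clutter ratio gain in dB, (19)."""
    return 20.0 * np.log10(scr(img_out) / scr(img_in))

def pd_pf(detected, truth):
    """Pixel-level detection and false alarm rates, (21) and (22)."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    p_d = np.logical_and(detected, truth).sum() / truth.sum()
    p_f = np.logical_and(detected, ~truth).sum() / truth.size
    return p_d, p_f

# Two target pixels, one of them detected, plus one false alarm.
print(pd_pf([[1, 0], [1, 0]], [[1, 1], [0, 0]]))
```

In the example, one of the two target pixels is detected and one of the four pixels is a false alarm, so $P_{d} = 0.5$ and $P_{f} = 0.25$. An ROC curve is obtained by sweeping the detection threshold and recording the resulting $(P_{f}, P_{d})$ pairs. The BSF is undefined when the output image is constant, which a practical implementation would need to guard against.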

In the dataset [11], six image sequences (each containing about 300 image frames of 256 × 256 pixels) were selected for performance verification. Sequences 1 and 2 contain multiple extended targets against a sky background (see the two frames in Figure 2a,b, respectively). The trajectories of the targets overlap, which makes the tracking process difficult. Sequences 3 to 6 contain complex backgrounds (see the four frames in Figure 2c,g–i, respectively). Their backgrounds contain objects with different characteristics, which can cause confusion and make the detection process difficult. The corresponding results of low-rank and sparse matrix separation are shown below each original image frame (see Figure 2d–f,j–l). One can see that the targets are detected effectively.

For a quantitative comparison, the means of the BSF and SCRG metrics are calculated over each of the six image sequences and listed in Table 1. Higher values mean better background suppression and better detection accuracy. The results show that the proposed algorithm outperforms the state-of-the-art algorithms STTM-SFR, WASpN-STTTV, and 4DST-BTMD, except in one case (see Seq. 4). It is worth mentioning that the performance was enhanced not only by the low-rank and sparse separation algorithm itself but also by the use of accurate prior information provided by the filtering process.

Finally, the ROC curve is used to compare the values of $P_{f}$ and $P_{d}$. In Figure 3, the six panels (a) to (f) are plotted for image sequences 1 to 6, respectively. The horizontal axis represents the $P_{f}$ value after logarithmic transformation ($\log_{10}$), and the vertical axis represents the $P_{d}$ value. There are four curves in each panel, one for each of the four compared methods. The red curve in each panel represents the proposed method, and it is almost always above the other curves.

6. Conclusions

A unified framework for small-target tracking in infrared images is proposed in this work. The framework tightly combines a fast low-rank and sparse matrix separation algorithm with a PMBM-based Bayesian multi-target filtering process. The low-rank and sparse matrix separation performs target detection on a single image, and it is fast owing to its differentiable form with global convergence. The extended multi-target filtering matches the characteristics of targets in high-resolution infrared images. In the filtering process, the state of multiple targets is modeled by the PMBM distribution. Because the PMBM is conjugate in the Bayesian framework, only the parameters need to be updated rather than the whole distribution, so the computational complexity is significantly reduced. Unlike the simultaneous spatial–temporal processing of multiple frames in the existing literature, the enhancement in accuracy comes from the sequential use of detection and filtering, where the former provides the measurement for the latter and the latter provides precise prior information for the former. A theoretical complexity analysis is given in this work, and the complexity is found to be lower than that of existing 3D or 4D low-rank and sparse decomposition over multiple frames, which explains the efficiency of the whole framework. Finally, verification on a practical infrared dataset was carried out, and the corresponding quantitative metrics (BSF and SCRG) and ROC curves demonstrated the accuracy of the proposed method in clutter suppression and target detection. One limitation of the unified framework is that its parameters need fine-tuning based on the theoretical criteria.

Although the proposed framework has led to some improvements in terms of computational efficiency and accuracy, there is still some room for improvement. For example, the current combination is feature-level. Future work will consider a direct integration of matrix decomposition and PMBM filtering; that is, we will integrate the matrix decomposition process into the state modeling, transition, and update of the PMBM and develop a framework that advances from the current feature-level fusion to signal-level fusion.

Author Contributions

Conceptualization, J.S., H.Z. and J.L.; methodology, J.S. and J.L.; software, J.S.; validation, J.S. and J.L.; formal analysis, J.Z.; investigation, J.Z.; resources, Q.Y.; data curation, J.L.; writing—original draft preparation, J.S. and J.L.; writing—review and editing, Q.Y.; visualization, J.S.; supervision, H.Z.; project administration, J.L.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (2020YFA0713504).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Tian, F.; Guo, X.; Fu, W. Target Tracking Algorithm Based on Adaptive Strong Tracking Extended Kalman Filter. Electronics 2024, 13, 652.
2. Yong, Y.; Kang, J.; Oh, H. Detection-Free Object Tracking for Multiple Occluded Targets in Plenoptic Video. Electronics 2024, 13, 590.
3. Yang, S.-Y.; Cheng, H.-Y.; Yu, C.-C. Real-Time Object Detection and Tracking for Unmanned Aerial Vehicles Based on Convolutional Neural Networks. Electronics 2023, 12, 4928.
4. Yi, W.; Fang, Z.; Li, W.; Hoseinnezhad, R.; Kong, L. Multi-Frame Track-before-Detect Algorithm for Maneuvering Target Tracking. IEEE Trans. Veh. Technol. 2020, 69, 4104–4118.
5. Wan, M.; Gu, G.; Cao, E.; Hu, X.; Qian, W.; Ren, K. In-Frame and Inter-Frame Information Based Infrared Moving Small Target Detection under Complex Cloud Backgrounds. Infrared Phys. Technol. 2016, 76, 455–467.
6. Tian, M.; Chen, Z.; Wang, H.; Liu, L. An Intelligent Particle Filter for Infrared Dim Small Target Detection and Tracking. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 5318–5333.
7. Deng, L.; Xu, G.; Zhang, J.; Zhu, H. Entropy-Driven Morphological Top-Hat Transformation for Infrared Small Target Detection. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 962–975.
8. McIntosh, B.; Venkataramanan, S.; Mahalanobis, A. Infrared Target Detection in Cluttered Environments by Maximization of a Target to Clutter Ratio (TCR) Metric Using a Convolutional Neural Network. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 485–496.
9. Zhang, C.; He, Y.; Tang, Q.; Chen, Z.; Mu, T. Infrared Small Target Detection via Interpatch Correlation Enhancement and Joint Local Visual Saliency Prior. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5001314.
10. Pang, D.; Ma, P.; Shan, T.; Li, W.; Tao, R.; Ma, Y.; Wang, T. STTM-SFR: Spatial–Temporal Tensor Modeling with Saliency Filter Regularization for Infrared Small Target Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5623418.
11. Yao, J.; Hong, D.; Chanussot, J.; Meng, D.; Zhu, X.; Xu, Z. Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution. In Computer Vision—ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 208–224.
12. Yao, J.; Cao, X.; Zhao, Q.; Meng, D.; Xu, Z. Robust Subspace Clustering via Penalized Mixture of Gaussians. Neurocomputing 2018, 278, 4–11.
13. Ma, D.; Dong, L.; Zhang, M.; Gao, R.; Xu, W. Weighted Adaptive Schatten P-Norm and Transpose Variability Model for Infrared Maritime Small Target Detection. IEEE Trans. Geosci. Remote Sens. 2023. early access.
14. Luo, Y.; Li, X.; Chen, S.; Xia, C. 4DST-BTMD: An Infrared Small Target Detection Method Based on 4D Data-Sphered Space. IEEE Trans. Geosci. Remote Sens. 2023, 62, 5000520.
15. Yao, J.; Hong, D.; Wang, H.; Liu, H.; Chanussot, J. UCSL: Toward Unsupervised Common Subspace Learning for Cross-Modal Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5514212.
16. Yao, J.; Zhang, B.; Li, C.; Hong, D.; Chanussot, J. Extended Vision Transformer (ExViT) for Land Use and Land Cover Classification: A Multimodal Deep Learning Framework. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5514415.
17. Hui, B.; Song, Z.; Fan, H.; Zhong, P. A Dataset for Infrared Detection and Tracking of Dim-Small Aircraft Targets under Ground/Air Background. Sci. Data Bank 2019, 5, 291–302.
18. Sprechmann, P.; Bronstein, A.M.; Sapiro, G. Learning Efficient Sparse and Low Rank Models. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1821–1833.
19. Mardani, M.; Mateos, G.; Giannakis, G.B. Decentralized Sparsity-Regularized Rank Minimization: Algorithms and Applications. IEEE Trans. Signal Process. 2013, 61, 5374–5388.
20. Granström, K.; Fatemi, M.; Svensson, L. Poisson Multi-Bernoulli Mixture Conjugate Prior for Multiple Extended Target Filtering. IEEE Trans. Aerosp. Electron. Syst. 2020, 56, 208–225.
21. García-Fernández, Á.F.; Williams, J.L.; Svensson, L.; Xia, Y. A Poisson Multi-Bernoulli Mixture Filter for Coexisting Point and Extended Targets. IEEE Trans. Signal Process. 2021, 69, 2600–2610.
22. Du, P.; Hamdulla, A. Infrared Moving Small-Target Detection Using Spatial–Temporal Local Difference Measure. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1817–1821.
23. Zhang, P.; Zhang, L.; Wang, X.; Shen, F.; Pu, T.; Fei, C. Edge and Corner Awareness-Based Spatial–Temporal Tensor Model for Infrared Small-Target Detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 10708–10724.
24. Wang, Y.; Chen, X.; Gong, C.; Rao, P. Non-Ellipsoidal Infrared Group/Extended Target Tracking Based on Poisson Multi-Bernoulli Mixture Filter and B-Spline. Remote Sens. 2023, 15, 606.

Figure 1. The united framework of the extended infrared target tracking algorithm based on low-rank and sparse matrix separation and PMBM multi-target filtering.

Figure 2. Representative infrared images from the six sequences in the dataset and the corresponding targets detected by matrix separation.

Figure 3. A comparison of the ROC curves of the four methods on each of the six selected infrared image sequences.

Table 1. A quantitative comparison of BSF and SCRG metrics among four methods on the six selected image sequences.

Metric	Sequence	STTM-SFR	WAS_pN-STTTV	4DST-BTMD	Proposed
BSF	Seq. 1	43.54	45.33	45.74	46.07
BSF	Seq. 2	42.90	44.33	45.06	45.46
BSF	Seq. 3	27.62	28.44	28.23	28.53
BSF	Seq. 4	35.11	36.61	37.62	37.35
BSF	Seq. 5	36.54	37.60	38.52	38.94
BSF	Seq. 6	34.05	35.11	35.53	36.06
SCRG	Seq. 1	37.01	38.71	39.27	40.29
SCRG	Seq. 2	36.69	37.99	38.58	39.14
SCRG	Seq. 3	29.92	30.71	30.45	31.28
SCRG	Seq. 4	37.70	38.03	38.94	38.62
SCRG	Seq. 5	38.72	39.04	39.39	39.51
SCRG	Seq. 6	36.37	36.52	36.97	37.23
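
As a reading aid, the rankings in Table 1 can be tallied with a short script. The following is a minimal sketch and not part of the original work: the names METHODS, TABLE_1, best_methods and win_counts are hypothetical, the numbers are transcribed from Table 1, and it assumes that larger BSF and SCRG values indicate better performance.

```python
# Values transcribed from Table 1, one row per sequence (Seq. 1 to Seq. 6).
# Assumption: for both BSF and SCRG, a larger value is better.
METHODS = ["STTM-SFR", "WAS_pN-STTTV", "4DST-BTMD", "Proposed"]

TABLE_1 = {
    "BSF": [
        [43.54, 45.33, 45.74, 46.07],
        [42.90, 44.33, 45.06, 45.46],
        [27.62, 28.44, 28.23, 28.53],
        [35.11, 36.61, 37.62, 37.35],
        [36.54, 37.60, 38.52, 38.94],
        [34.05, 35.11, 35.53, 36.06],
    ],
    "SCRG": [
        [37.01, 38.71, 39.27, 40.29],
        [36.69, 37.99, 38.58, 39.14],
        [29.92, 30.71, 30.45, 31.28],
        [37.70, 38.03, 38.94, 38.62],
        [38.72, 39.04, 39.39, 39.51],
        [36.37, 36.52, 36.97, 37.23],
    ],
}


def best_methods(rows):
    """Return the name of the top-scoring method for each sequence (row)."""
    return [METHODS[max(range(len(row)), key=row.__getitem__)] for row in rows]


def win_counts(table):
    """Count, per metric, how many sequences each method ranks first on."""
    counts = {}
    for metric, rows in table.items():
        winners = best_methods(rows)
        counts[metric] = {name: winners.count(name) for name in METHODS}
    return counts


if __name__ == "__main__":
    for metric, counts in win_counts(TABLE_1).items():
        print(metric, counts)
```

Under that assumption, the proposed method ranks first on five of the six sequences for both metrics, with 4DST-BTMD ahead on Seq. 4.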

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
