1. Introduction
Millimeter-wave (mmWave) wireless communications have drawn great attention in the industry and academia [
1] thanks to the large bandwidth available in the 30–300 GHz band. To compensate for the significant path loss in this band and also thanks to the short wavelength, massive MIMO have been suggested for mmWave systems. In particular, large-scale symmetric antenna arrays, such as the uniform linear arrays (ULAs) and uniform planer arrays (UPAs) have been extensively considered for transmitters and receivers due to their neat structures and high gains for directional transmissions [
2,
3,
4]. However, the coherence time in the millimeter-wave system is suggested to be short and as the number of antennas increases, the complexity of channel estimation increases. Therefore, it is challenging to acquire instantaneous channel state information (CSI) for a mmWave massive MIMO.
MmWave channels are often dominated by a small number of propagation paths, indicating that the channel is sparse in the angular domain [
5]. The channel matrix can be expressed in terms of dictionary matrices, which are formed by the transmitting and receiving array response vectors, and path gains. Compressive sensing (CS) [
6] can then be applied to search for the dominant paths [
7,
8,
9,
10,
11,
12,
13]. Different measurement matrices can be used by choosing the precoders and combiners, as well as various recovery algorithms such as orthogonal matching pursuit (OMP) [
7,
14] and the adaptive CS [
8,
9,
10] can be applied. In general, the above mentioned CS schemes require knowledge of the array response, which depends on the array geometry and calibration of the antenna arrays. Such knowledge can be inaccurate when there are unknown hardware impairments, e.g., due to phase and gain errors, and imperfect calibration of the antenna arrays.
In the meantime, a small number of propagation paths also indicate that the channel is low-rank [
15] and can be depicted in low-dimensional subspace. Furthermore, such low-rankness is independent of the array response and calibration errors. In [
16,
17], both the sparsity and low-rank property of the mmWave channel are exploited to enhance CS-based channel estimators. In [
16], a two-stage estimator is proposed, where the low-rankness is exploited at the first stage while sparsity in the angular domain is exploited at the second stage. In [
17], the improved alternating direction method of multipliers (ADMM) method [
18] is applied to exploit the low-rankness and sparsity while at the same time enhancing the performance. These estimators, however, still require knowledge of the array response vectors. They can still suffer from performance loss when there are uncertainties in the array response.
To achieve robust channel estimation, matrix completion (MC) methods exploiting only the low-rank property of the mmWave channel have recently been proposed [
19,
20]. The analysis in [
21] shows that the rank of the channel matrix is generally very low, which is usually much smaller than the antenna dimension. In [
19], the singular value projection (SVP) algorithm [
22] is adopted to solve the mmWave channel estimation problem. Later, the GCG-Alt method is developed in [
20] by applying the generalized conditional gradient (GCG) framework [
23] together with the alternating minimum (AltMin) method [
24]. There are also other widely-studied MC algorithms that can be used for mmWave channel estimation, such as the singular value thresholding algorithm (SVT) [
25] and the fixed point continuation algorithm (FPC) [
26]. In this paper, we discuss several mmWave channel estimators based on MC, focusing on their performance and complexity comparisons with alternative methods based on CS. We aim to examine the pros and cons for several MC estimators and the factors that influence their performances.
The rest of this paper is organized as follows.
Section 2 introduces the mmWave MIMO channel model and formulates the channel estimation problem.
Section 3 introduces channel estimators based on MC, including their detailed implementation.
Section 4 presents simulation results, in terms of the mean squared error (MSE) and computational complexity.
Section 5 concludes the paper.
Notation: , , and denote transpose, conjugate, and conjugate transpose of matrix , respectively. denotes the norm. and represent the identity matrix and zero matrix/vector, respectively. and denote the Kronecker product and the Hadamard product, respectively. is the trace of and denotes the inner product of matrices and . denotes the statistical expectation and represents taking element-wise absolute value. ∇ denotes the gradient of a function. For a matrix denotes the vectorization of and denotes the inverse of vectorization. and denote the real and imaginary part of a number or vector, respectively. denotes complex Gaussian distribution with mean a and variance .
2. System Model
Consider a point-to-point, switch-based mmWave hybrid MIMO system, with the receiver at the mobile station (MS) shown in
Figure 1. For simplicity and clarity, this paper assumes switch-based mmWave systems to investigate MC-based channel estimators. MC-based estimators can also be applied to phase shifter-based mmWave systems, when the hybrid precoders/combiners are properly designed, as shown in [
20]. Therefore, the discussion in this paper can be easily extended to phase shifter-based mmWave systems. At the receiver, each of the
antennas is equipped with a switch used to select one of the
RF chains. The base station (BS) has the same structure with
antennas and
RF chains. Assume that
data streams are transmitted, with
[
14,
27]. The switching operation can be represented as a precoder
, where the nonzero entries indicate the entries of the channel matrix that are sampled. A symbol
with
is precoded, resulting in the transmitted signal
. We consider a narrow-band flat fading channel whose channel matrix
satisfies
. The received signal is expressed as:
where
indicates the average received power and
is a noise vector with i.i.d. entries distributed as
. Applying a combiner
to the received signal at the MS, the processed received signal is given by:
In the switch-based system, the combiner has a similar structure with the precoder .
Following [
8], the ray-clustering model of
is given as:
where
is the number of clusters with
as the mean of the Poisson distribution, and
R is the number of rays in each cluster. The complex small-scale fading gain on the
r-th ray of the
c-th cluster is
with
, where
is the sub-power on the
c-th cluster. In Equation (
3),
and
represent the array response vector for the receiver and transmitter, respectively, where
and
represent the corresponding azimuth AoA and AoD, which follow the Laplacian distribution [
28].
Considering a uniform linear array (ULA) with distance between adjacent antennas being
d, the array response is given by:
where
is the wavelength of the carrier wave. The array response
is constructed in the same manner as
.
3. Compressive Sensing-Based Channel Estimation
It has been shown that, without considering quantization errors, the mmWave channel estimation problem can be formulated as a sparse recovery problem by modeling the channel as [
29,
30,
31]:
where
is a unitary dictionary matrix when
and it is an overcomplete dictionary matrix when
, and
with
. Each column in
and
consists of a predefined array response vector.
is a sparse matrix with only
L non-zero values, with each of its non-zero values corresponding to the complex gain of a channel path. Vectorization of the channel matrix (
5) produces:
where
is a
sparse vector with
L non-zero values. We define
as a
dictionary matrix. Sparse recovery schemes can then be used to estimate the channel, which transforms the task of estimating
to estimating the non-zero coefficients in
.
A widely used method to estimate
from the the received signal is orthogonal matching pursuit (OMP) [
7,
14]. Using OMP,
L path directions from the
candidates in the dictionary are determined. The computational complexity of the OMP method is approximately
, where
N is the length of the received signal. In general, a larger dictionary leads to better performance but also higher computational complexity.
The above mentioned CS approach uses a discretized approximation of the channel. It may suffer from the off-grid issue if the physical propagation paths are off the assumed grid of the angles. In this case, the number of non-zero entries in the beamspace channel may not be exactly equal to L, leading to a power leakage. Another challenge is that the knowledge of the array response is required, which may be imperfect in practice due to unknown hardware impairments and imperfect calibrations.
4. Matrix Completion-Based Channel Estimation
In this section, we introduce MC-based estimation methods for the mmWave channel by exploiting the low-rankness of the channel matrix. By appropriately choosing the training scheme with proper precoders and combiners, the received signal provides noisy observations of a subset of the entries of
:
where
is the perturbed channel matrix,
is a noise matrix,
denotes a sample domain, and
is the
th entry of
. Define
as the sampling density, where
N is the total number of samples observed.
It is discussed in [
19] that when the mmWave channel matrix has strong non-coherent characteristics, it can be recovered from a subset samples of the channel matrix. We can thus formulate the channel estimation problem as a low-rank matrix completion problem as:
where
is the error tolerance parameter and the sampling operator
is defined as:
where
denotes the
-th entry of
. The sampling operator
significantly influences the performance of the algorithm [
32]. Bernoulli and uniform sampling models are proposed in [
32] and a uniform spatial sampling model (USS), which improves the performance, is proposed in [
33]. With USS,
samples are taken for each column of the target matrix.
4.1. MC Estimators
In the following, we discuss several MC estimators that can be used to solve the problem in Equation (
8).
4.1.1. SVT Estimator
Before presenting the SVT algorithm, let us define the matrix shrinkage operator:
where
denotes the singular value matrix of
and
is the element-wise shrinkage operator:
where
is the threshold.
The SVT algorithm [
25] can be applied to provide a heuristic solution to Equation (
8), which consists of two major steps,
where
,
is a step size, and
. The iteration is stopped when a stopping criterion is met or a maximum number of iterations
is reached. Singular value decomposition (SVD) is required at each iteration. Some comments regarding the implementation of the SVT algorithm to the mmWave channel estimation problem are as follows:
SVT Estimator is shown in below Algorithm 1:
Algorithm 1 SVT Estimator |
Require: , , , , , . |
- 1:
Set ; - 2:
for to do - 3:
Set - 4:
Set - 5:
if then break; - 6:
end if - 7:
end for - 8:
return
|
From [
25], SVT is effective for completing large matrices with low ranks. Its performance degrades as the rank increases. The computational complexity of the SVT algorithm mainly arises from Step 3.
4.1.2. FPC Estimator
The FPC algorithm [
26] reformulates the MC problem using the nuclear norm, which is the summation of the singular values,
where
is the regularization parameter. The algorithm consists of two steps similar to SVT:
where the threshold of the singular value thresholding operator is set as a variable
rather than a fixed value as that in SVT. A continuous strategy [
34] is used to accelerate the convergence by adapting
. The details are presented in Algorithm 2. Some comments are as below:
We can set the step size
according to [
26], where
is the maximum eigenvalue;
decreases as
where
M, determined by
, controls the step size and the estimation accuracy,
is small (e.g.,
), and the parameter
determines the decreasing rate for consecutive
.
Algorithm 2 FPC Estimator |
Require: , , , , , and |
- 1:
Initialization: , , - 2:
whiledo - 3:
- 4:
for do - 5:
- 6:
- 7:
if then break; - 8:
end if - 9:
end for - 10:
end while - 11:
return
|
The main computational cost of the FPC algorithm is in Step 6 of Algorithm 2 due to the SVD. In addition, the FPC algorithm needs to choose the step size by calculating the maximum eigenvalue of and has a higher computational complexity per iteration than that of the SVT algorithm.
4.1.3. SVP Estimator
The SVP algorithm [
22] is based on the projected gradients and is detailed in Algorithm 3. This algorithm requires that the rank
L of the channel matrix to be known. The step size can be chosen empirically as
with
. The stopping criterion is based on the norm of the difference in the sampled channel matrix, where the small threshold
can be set such as
. The SVP algorithm also needs to calculate the SVD in Step 4, which is the most computationally expensive step of the algorithm.
Algorithm 3 SVP Estimator |
Require: , L, , |
- 1:
: - 2:
whiledo - 3:
- 4:
Compute the L principal singular vectors of : . - 5:
- 6:
; - 7:
end while - 8:
return
|
4.1.4. GCG-Alt Estimator
In [
20], a generalized conditional gradient framework with alternating minimization (GCG-Alt) is developed for the MC problem. The nuclear norm is used to promote low-rankness as:
where
is a regularization factor. Let:
The gradient direction of the
kth iteration of
[
23]:
where
is the top singular vector pair of
which can be found iteratively. The channel matrix is updated as:
where
is the step size and
is adaptively chosen.
By using a property of nuclear norm [
20], the optimization problem can be reformulated as:
where
and
with
being the rank of
. Alternating minimization can then be used to optimize Equation (
19). The details of solving the alternate minimization problem can be found in [
20]. The overall algorithm is summarized in Algorithm 4.
Algorithm 4 GCG-Alt Estimator |
Require: |
- 1:
Initialization:, - 2:
whiledo - 3:
Compute the top singular vector pair of - 4:
- 5:
- 6:
- 7:
- 8:
- 9:
- 10:
- 11:
- 12:
: , . - 13:
while do - 14:
- 15:
Find that minimizes Equation ( 19) given ; - 16:
Find that minimizes Equation ( 19) given ; - 17:
- 18:
end while - 19:
- 20:
Calculate - 21:
- 22:
end while - 23:
return
|
The above MC methods have different computational complexities. SVT, SVP, and FPC all need SVD, which can be implemented using the PROPACK [
35] based on the iterative Lanczos bidiagonalization algorithm with partial reorthogonalization. The FPC has a higher complexity as SVD is repeated for different values of
. The GCG-Alt has the least complexity as the full SVD is not required. SVP is effective for large matrix completion problems with very low ranks, while the FPC, SVT, and GCG-Alt estimators allow higher ranks. The SVP estimator requires rank knowledge, while the FPC and GCG-Alt estimators implicitly determine the rank through choosing regularization parameters or thresholds.
4.2. MC-Based Hybrid Estimators
Next we discuss two MC-based hybrid methods that jointly exploit the sparsity and low-rankness of the channel.
4.2.1. ADMM Estimator
In [
17], the low-rankness of
and the sparsity of the beamspace channel
are jointly exploited and an ADMM method is proposed. Leveraging the side information that
has a sparse virtual representation given by Equation (
5), the channel estimation problem is formulated following [
36] as:
where the nuclear norm and
-norm together with the regularization parameters
and
are used to promote low-rankness and sparsity, respectively. The above problem is then reformulated by incorporating the constraints as penalty terms as:
where
and
are two auxiliary matrix variables. This problem is then solved by using ADMM which involves the iterative updates of the variables and Lagrangian multipliers. The augmented Lagrangian function of Equation (
21) is given by:
where
and
are dual variables (the Lagrange multipliers) and
is the step size. The estimator is summarized in Algorithm 5, where:
with in Step 3;
denotes the vectorization of , and similarly for other variables;
where
is composed of
N ones and
zeros, the value 1 indicates the position of a sample from the channel matrix, and
denotes the
i-th row of
, and
is the
matrix that the value at its
-th position is 1 and the remaining position is 0 [
17];
The parameters in Equation (
20) are chosen empirically as
and
, where
is the noise power.
The computational cost of the ADMM algorithm is mainly due to the SVD in Step 3 and the matrix operations in Steps 4, 5, 6, and 7.
Algorithm 5 ADMM Estimator |
Require:
, t, , , , , , . |
- 1:
: ; their vectorizations ; - 2:
fordo - 3:
Update , where - 4:
- 5:
Update
with and . - 6:
- 7:
- 8:
end for; - 9:
return
|
4.2.2. Two-Stage Estimator
The two-stage estimator [
16] also exploits the sparsity and low-rankness of the channel matrix. In the first stage, MC is applied to provide a denoised channel estimation based on the low rankness of the channel matrix and then the second stage employs CS to refine the estimation based on the array response and the virtual representation of the channel matrix.
The SVT algorithm [
25] introduced above solves the low-rank matrix completion problem:
The fast iterative shrinkage threshold algorithm (FISTA) proposed by [
37] is used to solve the sparse vector recovery problem:
The estimator is summarized in Algorithm 6, where:
The parameters of the first stage, SVT, is the same with Algorithm 1;
;
is a constant stepsize, e.g., ;
is the top eigenvalue of .
The complexity of SVT in the first stage has been anlyzed. The cost of the FISTA algorithm mainly consists of applying the sensing matrix in Step 4, which has a complexity of .
Algorithm 6 Two-Stage Estimator |
Require: |
- 1:
Use SVT to recover as
Require: , the top eigenvalue of - 2:
Initialize , - 3:
fordo - 4:
- 5:
- 6:
- 7:
- 8:
- 9:
- 10:
if then - 11:
Break - 12:
end if - 13:
end for - 14:
return
|
6. Conclusions
In this paper, we compared the performance of several MC-based channel estimators for mmWave massive MIMO systems. It was observed that the hybrid ADMM algorithm exhibited the best performance in general, which jointly exploited the low rank property of the channel matrix and the sparsity in the angular domain. However, it also exhibited the highest complexity among the estimators compared. The MC-based estimators (using GCG-Alt, SVT, SVP, or FPC) were robust against array impairments as they did not rely on array response vectors. Among them the GCG-Alt estimator exhibited the lowest complexity, better performance, and provided a competitive solution when the arrays were not perfectly calibrated.
In this work, we considered a point-to-point mmWave system. The estimators could also be applied to multiuser systems when orthogonal training schemes are deployed. The comparison of these methods when nonorthogonal training is used would be an interesting study for future work.