To introduce our notation, we assume throughout that the rows of the system matrix \(A\) that are to be treated as dense have been permuted to the end. With a conformal partitioning of the vector \(b\) (and omitting the row permutation matrix for simplicity of notation) we have
\[
A = \begin{pmatrix} A_s \\ A_d \end{pmatrix}, \qquad b = \begin{pmatrix} b_s \\ b_d \end{pmatrix}, \tag{1}
\]
where \(m_s\) and \(m_d\) denote the number of sparse and dense rows of \(A\), respectively, with \(m = m_s + m_d\), \(m_s \ge n\), and \(m_d \ge 1\) small (\(m_d \ll m_s\)). \(A_s\) and \(A_d\) are referred to as the sparse and dense row blocks of \(A\), and the rows of \(A_d\) are termed dense rows. The linear LS problem that we are interested in solving is then
\[
\min_{x \in \mathbb{R}^n} \| b - Ax \|_2 . \tag{2}
\]
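As a concrete (purely illustrative) instance of this splitting, the sketch below builds a small over-determined matrix whose last rows are dense and classifies rows by their nonzero counts; the sizes, random data, and density threshold here are our own assumptions, not values from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m_s, m_d, n = 200, 2, 50          # assumed sizes: m_d << m_s and m_s >= n

# Sparse row block A_s: ~3 nonzeros per row (stored densely for simplicity).
A_s = np.zeros((m_s, n))
for i in range(m_s):
    cols = rng.choice(n, size=3, replace=False)
    A_s[i, cols] = rng.standard_normal(3)

# Dense row block A_d: every entry nonzero.
A_d = rng.standard_normal((m_d, n))

# Dense rows permuted to the end, with a conformal partition of b.
A = np.vstack([A_s, A_d])
b = rng.standard_normal(m_s + m_d)

nnz_per_row = np.count_nonzero(A, axis=1)
dense = nnz_per_row > n // 2      # assumed classification threshold
print(np.flatnonzero(dense))      # indices of the m_d trailing dense rows
```

In practice the threshold separating "sparse" from "dense" rows is a tuning choice; here any row with more than \(n/2\) nonzeros is treated as dense.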
We assume that \(A\) has full column rank, in which case the solution of Equation (2) is unique and is given by the solution to the system of normal equations
\[
Cx = A^Tb, \qquad C = A^TA . \tag{3}
\]
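A minimal numerical sketch of the normal-equations approach, on toy random data of our own choosing, is the following; it also checks the well-known fact (discussed next) that the condition number of \(C = A^TA\) is the square of that of \(A\).

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 100, 20
A = rng.standard_normal((m, n))   # assumed full column rank
b = rng.standard_normal(m)

C = A.T @ A                       # n x n symmetric positive definite
x = np.linalg.solve(C, A.T @ b)   # in practice, a Cholesky-based solve

# cond(C) equals cond(A)^2 (in the 2-norm).
print(np.isclose(np.linalg.cond(C), np.linalg.cond(A) ** 2))

# On this well-conditioned example, the result agrees with a QR-based solve.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ls))
```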
It is well understood that there are a number of possible problems associated with the normal equations. First, there is a potential loss of information in explicitly computing the \(n \times n\) symmetric positive definite normal matrix \(C\) and the vector \(A^Tb\). Second, if \(A\) contains just a single dense row, then \(C\) is not sparse and thus, if \(n\) is large, it cannot be stored or factorized by a direct solver that computes a Cholesky factorization. Third, the condition number of \(C\) is the square of that of \(A\), so that an accurate solution may be difficult to compute if \(A\) is poorly conditioned. If \(A\) is not of full rank, then the Cholesky factorization of \(C\) breaks down; near rank degeneracy causes similar numerical problems in finite precision arithmetic. One way to lessen the numerical issues is to avoid computing \(C\) and to obtain its Cholesky factor \(R\) directly from \(A\) by computing its QR factorization. An orthogonal matrix \(Q\) is computed such that
\[
AP = Q \begin{pmatrix} R \\ 0 \end{pmatrix}, \tag{4}
\]
where \(R \in \mathbb{R}^{n\times n}\) is upper triangular (and nonsingular if \(A\) is of full rank) and \(P \in \mathbb{R}^{n\times n}\) is a permutation matrix that performs column interchanges to limit the fill in \(R\). Since the Euclidean norm is invariant under orthogonal transformation, the solution to Equation (2) may be obtained by solving the system
\[
R\,P^Tx = c_1, \tag{5}
\]
where \(c_1\) comprises the first \(n\) components of \(Q^Tb\).
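The QR-based solve can be sketched as follows with SciPy's column-pivoted QR (again on toy data of our own choosing); here pivoting serves accuracy rather than sparsity, since the example is dense.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

rng = np.random.default_rng(2)
m, n = 100, 20
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Economic QR with column pivoting: A[:, P] = Q @ R, Q is m x n.
Q, R, P = qr(A, mode="economic", pivoting=True)

# Solve R z = (first n components of) Q^T b, then undo the column permutation.
z = solve_triangular(R, Q.T @ b)
x = np.empty(n)
x[P] = z

# Matches the reference least-squares solution.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ls))
```

Note that with the economic factorization, \(Q^Tb\) already consists of exactly the \(n\) components needed for the triangular solve.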
Over the years, there has been significant work on QR factorizations for solving large sparse LS problems (see, for example, the book by Björck [11] and the references therein, as well as References [4, 12, 15, 47]). Unfortunately, for large problems of the form of Equation (2), a straightforward application of the QR algorithm will fail because, as already observed, the block \(A_d\) of dense rows causes the factor \(R\) to fill in, limiting the usefulness of black-box sparse QR solvers.
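The destructive effect of a single dense row is easy to demonstrate on the normal matrix, which inherits the fill that would also appear in \(R\). The sketch below (a toy construction of our own, not from this paper) compares the number of nonzeros of \(A_s^TA_s\) with that of \(A^TA\) when one dense row is appended.

```python
import numpy as np
import scipy.sparse as sp

n = 1000
# Sparse row block: a tridiagonal matrix, 3 nonzeros per row (toy example).
A_s = sp.diags([np.ones(n - 1), 2 * np.ones(n), np.ones(n - 1)],
               offsets=[-1, 0, 1], format="csr")
a_d = sp.csr_matrix(np.ones((1, n)))   # a single dense row
A = sp.vstack([A_s, a_d], format="csr")

C_sparse = (A_s.T @ A_s).tocsr()       # normal matrix of the sparse block only
C_full = (A.T @ A).tocsr()             # one dense row fills C completely

print(C_sparse.nnz)                    # O(n): pentadiagonal, stays sparse
print(C_full.nnz)                      # n*n: completely dense
```

Since \(A^TA = A_s^TA_s + A_d^TA_d\) and here \(A_d^TA_d\) is a full rank-one matrix, every entry of the normal matrix becomes nonzero, exactly as the text describes.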
In Section 2, we discuss the software packages and test examples that are used in this study. We then commence our study by considering direct methods. Section 3 recalls the use of updating [27] to handle dense rows. We then consider two preprocessing approaches that extend the applicability of updating when \(A_s\) is rank deficient: partial matrix stretching [42] and regularization [36]. Both avoid breakdown of the QR algorithm by enlarging the problem to which it is applied. In Section 4, we discuss a hybrid approach that combines a QR factorization of the (possibly enlarged) sparse row block with an iterative solver. An alternative method based on the augmented system formulation of the LS problem is considered in Section 5. Our proposed extension to sparse-dense LS problems allows the incorporation of iterative refinement with a preconditioned Krylov subspace solver.

Numerical experiments illustrating the performance of the different approaches are presented in Section 6. These show that updating works well if \(A_s\) is of full rank: it is straightforward to implement, is robust, and offers significant savings compared with applying the QR solver with no special handling of dense rows (even if the rows that are classified as dense are far from being full). For the more challenging case of rank-deficient \(A_s\), we find that preprocessing using either regularization or partial stretching is effective, with the former being easier to implement using existing software packages while the latter obtains more accurate solutions. A key attraction of the augmented system approach is that it can employ multi-precision arithmetic, which has the potential to reduce the solution cost (in terms of memory and/or time) while still returning the requested accuracy. In Section 7, we summarise our findings and propose possible future directions.