COMS 6998 Lec 1
Disclaimer: This draft may be incomplete or have errors. Consult the course webpage for the most
up-to-date version.
1 Logistics
Prerequisites.
• Linear algebra.
Grading.
• Scribe notes (15%): Scribe for one lecture. Draft due two days after class.
• Final project (50%): Choose between (1) A reading-based project to survey one or more papers.
(2) A research project.
Final report of 5-15 pages + presentation.
2 Overview
We will study applications of algebraic techniques in TCS. Here “TCS” mainly means algorithms and
complexity. We will briefly mention other areas as well, e.g., using the polynomial method in learning
theory. This course mainly covers four topics:
• Algebraic graph algorithms.
• The polynomial method.
• Matrix rigidity.
• Matrix multiplication.
2.1 Algebraic Graph Algorithms
We study algebraic tools for faster graph algorithms. Some examples:
• Graph problems: All-pairs shortest paths (APSP), subgraph isomorphism, maximum matching,
longest path.
• Algebraic tools: Polynomial identity testing, fast matrix multiplication (FMM), algorithms for
determinant/inverse.
2.2 The Polynomial Method
The AND function is computed exactly by the polynomial p(x) = x_1 · x_2 · · · x_n, which has degree n. A polynomial threshold function only needs to agree with the Boolean function in sign, which can allow much lower degree. For example, the following polynomial q is a polynomial threshold function for the AND function:
q(x) = x_1 + x_2 + · · · + x_n − (n − 1/2).
This is because for x_1 = x_2 = · · · = x_n = 1, q(x) = n − (n − 1/2) = 1/2 ≥ 0. And if some x_i = 0,
q(x) ≤ (n − 1) − (n − 1/2) = −1/2 < 0.
Note that the degree of q is deg(q) = 1. This is much smaller than the degree n of the polynomial p above.
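As a quick sanity check, the claim is easy to verify exhaustively. Below is a small Python sketch (our illustration, not part of the lecture; the choice n = 4 is arbitrary):

from itertools import product

n = 4  # small illustrative choice

# q(x) = x_1 + ... + x_n - (n - 1/2) should satisfy q(x) >= 0 iff AND(x) = 1.
def q(x):
    return sum(x) - (n - 0.5)

for x in product([0, 1], repeat=n):
    assert (q(x) >= 0) == all(x)
print("q sign-represents AND on all", 2**n, "inputs")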
2.2.2 Applications
1. Lower bounds. The polynomial method has been used to prove lower bounds in circuit complexity
and communication complexity, e.g., circuit lower bounds for AC^0.
The high-level idea is to use a reduction from known lower bounds on the degree of polynomials.
2. Faster algorithms. The polynomial method is also useful for designing faster algorithms, e.g.,
nearest neighbor search (NNS), all-pairs shortest paths (APSP).
The high-level idea behind these algorithms is to first convert the original problem into a Boolean
function, then approximate the Boolean function using a polynomial with good properties, and
finally design algorithms to evaluate the polynomial efficiently.
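As a toy instance of this pipeline (our own sketch with made-up parameters, not one of the algorithms above), the degree-1 threshold polynomial q for AND from the previous subsection can be evaluated on a whole batch of inputs with a single matrix-vector product; in the actual algorithms, this batch-evaluation step is where fast matrix multiplication enters.

import numpy as np

n, m = 8, 5                              # input length and batch size (arbitrary)
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(m, n))      # m Boolean inputs, one per row
X[0] = 1                                 # force one all-ones row

# Step 1: replace AND by its threshold polynomial q(x) = sum(x) - (n - 1/2).
# Step 2: evaluate q on all rows at once via one matrix-vector product.
q_vals = X @ np.ones(n) - (n - 0.5)
# Step 3: recover the Boolean answers from the signs.
and_vals = q_vals >= 0
print(and_vals)                          # True for the all-ones row (and only for all-ones rows)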
2.3 Matrix Rigidity
Recall that the rigidity R_M(r) of a matrix M is the minimum number of entries of M that must be changed in order to decrease its rank to at most r.
2.3.1 Examples
Identity matrix. The first example is the simplest, the identity matrix:
I_N = \begin{pmatrix} I_r & 0 \\ 0 & I_{N-r} \end{pmatrix} ⟹ \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.
We can change N − r diagonal entries of I_N to zero to decrease its rank to r. Thus R_{I_N}(r) ≤ N − r.
In fact, R_{I_N}(r) = N − r: changing one entry of a matrix can decrease its rank by at most 1, so we
must change at least N − r entries of I_N to decrease its rank to r.
Upper triangular matrix. A more rigid example is the upper triangular matrix:
U_N = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ & 1 & \cdots & 1 \\ & & \ddots & \vdots \\ & & & 1 \end{pmatrix}.
A naive upper bound is R_{U_N}(r) ≤ \sum_{i=1}^{N-r} i = O((N − r)^2), where we change all the ones in the lower
N − r rows to zero.
Here is a more efficient way to decrease the rank. We divide the rows of U_N into r groups of size N/r,
and in each group we change the \sum_{i=1}^{N/r-1} i = O((N/r)^2) zeroes in the bottom left corner to one.
In this way all the rows in one group become the same, so the rank of the modified matrix is r. Below
we show an illustration for N = 6 and r = 2:
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
∗ & 1 & 1 & 1 & 1 & 1 \\
∗ & ∗ & 1 & 1 & 1 & 1 \\
  &   &   & 1 & 1 & 1 \\
  &   &   & ∗ & 1 & 1 \\
  &   &   & ∗ & ∗ & 1
\end{pmatrix}
(r groups of size N/r; ∗ marks the entries changed from 0 to 1).
We get a better upper bound R_{U_N}(r) ≤ O(r · (N/r)^2) = O(N^2/r). This bound is actually tight.
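The grouping construction is easy to check numerically. Here is a short Python sketch (our illustration) that performs the modification for N = 6 and r = 2 and verifies the number of changed entries and the resulting rank:

import numpy as np

N, r = 6, 2
U = np.triu(np.ones((N, N), dtype=int))   # upper triangular all-ones matrix

M = U.copy()
g = N // r                                # group size N/r (assume r divides N)
changed = 0
for k in range(r):
    top = k * g                           # first row of the k-th group
    for i in range(top + 1, top + g):
        changed += np.count_nonzero(M[i] != M[top])
        M[i] = M[top]                     # copy the top row, filling the group's
                                          # bottom-left zeroes with ones

print("entries changed:", changed)        # r * (1 + 2 + ... + (N/r - 1)) = 6
print("rank:", np.linalg.matrix_rank(M))  # drops from N = 6 to r = 2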
The upper triangular matrix is far from being Valiant-rigid:
R_{U_N}(N/\log\log N) = O(N^2 / (N/\log\log N)) = O(N \log\log N) ≪ N^{1+ε}.
In [Val77], Valiant showed that if there exists an explicit construction of a family of matrices that is
Valiant-rigid, then we can prove a major breakthrough result in circuit complexity. Currently we still do
not have any explicit construction of matrices that are Valiant-rigid.
Meanwhile, it is known that a uniformly random {0, 1}-matrix M satisfies
R_M(N/2) ≥ Ω(N^2).
2.4 Matrix Multiplication
For completeness we include Strassen's identity for multiplying 2 × 2 (block) matrices here. First compute M_1 to M_7:
M_1 = (A_{11} + A_{22})(B_{11} + B_{22}),
M_2 = (A_{21} + A_{22}) B_{11},
M_3 = A_{11}(B_{12} − B_{22}),
M_4 = A_{22}(B_{21} − B_{11}),
M_5 = (A_{11} + A_{12}) B_{22},
M_6 = (A_{21} − A_{11})(B_{11} + B_{12}),
M_7 = (A_{12} − A_{22})(B_{21} + B_{22}).
Then the four blocks of the product C = AB are
C_{11} = M_1 + M_4 − M_5 + M_7,
C_{12} = M_3 + M_5,
C_{21} = M_2 + M_4,
C_{22} = M_1 − M_2 + M_3 + M_6.
The recursive algorithm. For larger matrices, Strassen's algorithm divides each matrix into 4 blocks
of size N/2 × N/2, and computes the multiplications of the submatrices recursively. In order to
multiply two N × N matrices, the algorithm plugs these submatrices into Strassen's identity above, so it
computes 18 additions of N/2 × N/2 submatrices, and 7 multiplications of N/2 × N/2 submatrices. Fortunately,
even though we are doing many additions, we can add matrices very quickly, in just O(N^2) time. Thus
we get the following recursive formula for the running time:
T(N) = 7 · T(N/2) + 18 · O(N^2)
=⇒ T(N) = O(N^{\log_2 7}) ≤ O(N^{2.81}).
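Here is a compact Python sketch of the recursive algorithm (our illustration, for N a power of two; the cutoff below which we fall back to the naive product is an arbitrary tuning choice):

import numpy as np

def strassen(A, B, cutoff=64):
    n = A.shape[0]
    if n <= cutoff:                       # base case: naive product
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # 7 recursive multiplications of N/2 x N/2 submatrices
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)

    # combine the blocks of C (10 additions above + 8 here = 18 in total)
    C = np.empty((n, n), dtype=A.dtype)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
assert np.allclose(strassen(A, B), A @ B)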
Current fast matrix multiplication algorithms. Currently, the fastest matrix multiplication algo-
rithm runs in O(N^{2.373}) time.
Even though Strassen's algorithm is used in practice, the later, theoretically faster algorithms have
exceedingly large constant factors, so they are not used in practice.
2.4.2 Applications
1. Matrix multiplication is used in all three previous problems.
In this course, we will first use matrix multiplication as a black box to design algorithms for other
problems, and at the end we will cover fast matrix multiplication algorithms themselves.
2. Many other linear algebra tasks can be performed in the same time as matrix multiplication, including
computing determinants and inverses, solving linear systems, and solving some linear programs.
3 Graph Algorithms Using MM
Now we delve into our first topic (algebraic graph algorithms). In this section we consider designing
graph algorithms using the algebraic tool of matrix multiplication.
The first problem we consider is triangle detection: given a graph G on N nodes, decide whether G contains a triangle.
Algorithm. The algorithm first forms the adjacency matrix A ∈ {0, 1}^{N×N} of G:
A_{ij} = \begin{cases} 1, & \text{if } (i, j) ∈ E(G), \\ 0, & \text{otherwise.} \end{cases}
The algorithm then computes the matrix A^2, and checks if there exists a pair (a, b) ∈ E(G) with A^2[a, b] > 0.
If so, the algorithm outputs “yes” (there exists a triangle), and otherwise the algorithm outputs “no”.
Correctness follows because A^2[a, b] > 0 means there exists a length-2 path from a to b, i.e., there exists another node c with
(a, c), (b, c) ∈ E(G). Thus (a, b, c) is a triangle iff (a, b) ∈ E(G) and A^2[a, b] > 0.
The bottleneck of this algorithm is computing A^2, which takes O(N^{2.373}) time. All other computation
can be performed in O(N^2) time.
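Here is a direct Python implementation of this algorithm (our illustration; numpy's built-in product stands in for a fast matrix multiplication routine):

import numpy as np

def has_triangle(A):
    # A is the N x N 0/1 adjacency matrix of an undirected graph.
    A2 = A @ A                 # A2[a, b] = number of length-2 paths from a to b
    return bool(np.any((A > 0) & (A2 > 0)))

# Example: a triangle on nodes {0, 1, 2} plus an isolated node 3.
A = np.zeros((4, 4), dtype=int)
for u, v in [(0, 1), (1, 2), (0, 2)]:
    A[u, v] = A[v, u] = 1
print(has_triangle(A))         # True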
[Figure 1: Target subgraph: nodes a, b, c, d with every edge among them present except (c, d).]
However, this naive algorithm doesn't work, because it doesn't rule out the existence of the edge (c, d):
a correct algorithm must distinguish our target subgraph from the 4-clique K_4.
This is because
\sum_{(a,b) ∈ E(G)} \binom{A^2[a, b]}{2}    (1)
counts the total number of pairs of “parallel” length-2 paths. A target
subgraph contributes 1 pair (the pair a − c − b and a − d − b), while a 4-clique contributes
\binom{4}{2} = 6 pairs (one pair between any two nodes in {a, b, c, d}).
Define R(G) := \sum_{(a,b) ∈ E(G)} \binom{A^2[a, b]}{2}. Eq. (1) implies the following:
• If R(G) is a multiple of 6, then it is not clear whether G does not contain the target subgraph, or
whether the number of target subgraphs in G is a multiple of 6.
In order to truly determine whether G contains the target subgraph, we design a randomized algorithm. The algorithm
first samples a subgraph G′ of G, where G′ keeps each node of G independently with probability 1/2. In the next lecture,
we will show that if G contains the target subgraph, then with high probability R(G′) is not a multiple of 6.
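To make the quantities concrete, here is a short Python sketch (our illustration; the analysis is deferred to the next lecture) of computing R(G) and the node-sampling step:

import numpy as np
from math import comb

def R(A):
    # R(G) = sum over edges (a, b) of binom(A^2[a, b], 2).
    A2 = A @ A
    rows, cols = np.nonzero(np.triu(A))   # each undirected edge counted once
    return sum(comb(int(A2[a, b]), 2) for a, b in zip(rows, cols))

def sample_subgraph(A, rng):
    # Keep each node independently with probability 1/2.
    keep = rng.random(A.shape[0]) < 0.5
    return A[np.ix_(keep, keep)]

# Example: a 4-clique has R = 6.
A = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
print(R(A))                               # 6
rng = np.random.default_rng(0)
print([R(sample_subgraph(A, rng)) for _ in range(3)])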
References
[Str69] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13(4):354–356,
1969.
[Val77] Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In Proceedings of the 6th
International Symposium on Mathematical Foundations of Computer Science (MFCS), pages 162–176, 1977.