Parity check matrices and product representations of squares

Assaf Naor Jacques Verstraëte


Microsoft Research University of Waterloo
anaor@microsoft.com jverstra@math.uwaterloo.ca

Abstract
Let $N_{\mathbb{F}}(n, k, r)$ denote the maximum number of columns in an n-row matrix with entries in
a finite field $\mathbb{F}$ in which each column has at most r nonzero entries and every k columns are
linearly independent over $\mathbb{F}$. We obtain near-optimal upper bounds for $N_{\mathbb{F}}(n, k, r)$ in the case
k > r. Namely, we show that $N_{\mathbb{F}}(n, k, r) \ll n^{\frac{r}{2} + \frac{cr}{k}}$, where c ≈ 4/3 for large k. Our method is
based on a novel reduction of the problem to the extremal problem for cycles in graphs, and
yields a fast algorithm for finding short linear dependences. We present additional applications
of this method to problems in extremal hypergraph theory and combinatorial number theory.

“It is certainly odd to have an instruction in an algorithm asking you to play with some numbers
to find a subset with product a square . . . Why should we expect to find such a subsequence, and,
if it exists, how can we find it efficiently?”

Carl Pomerance [32].

1 Introduction
Since the mid-1930s it has been well established that there is a tight connection between combi-
natorial number theory and extremal combinatorics (see [33, 12] and the references therein). The
basic paradigm is that for certain number theoretical problems one can construct a combinatorial
object (e.g. a graph or a hypergraph), and prove that it cannot contain certain configurations (e.g.
cycles). Thus, in many cases one can use theorems on excluded configurations in extremal combi-
natorics to bound the size of the combinatorial structure that was constructed, and this translates
back to give number theoretical consequences. The present paper develops a novel reduction of
this type, and applies it to two problems in algebraic combinatorics, namely to coding theory and
combinatorial number theory. Additionally, our proofs are constructive, and thus yield the best
known algorithms for several natural computational problems.
Our original motivation comes from a problem in coding theory. Low density parity check
codes were introduced by Gallager in the 1960s, and have since found numerous theoretical and
practical applications in engineering and computer science (see [22, 29, 30] for an account of this
theory. We also refer to [11] for a nice introduction to the geometry of codes). Given a linear code
$C \subseteq \mathbb{F}_2^m$ of dimension $\ell$ and minimum Hamming weight t, an $(m - \ell) \times m$ matrix H is called a
parity check matrix of the code C if $C = \{v \in \mathbb{F}_2^m : Hv = 0\}$. We shall say that H is r-sparse
if every column of H has at most r non-zero entries. The Syndrome Decoding Algorithm for such
codes works as follows: given a corrupted signal z one computes the vector x of minimal weight
satisfying Hz = Hx, and decodes z to z − x (this algorithm corrects at most t/2 errors). As such
computations are faster if the sparseness of H is exploited, it is desirable to obtain codes with
sparse parity check matrices. Indeed, sparse parity check matrices occur in many of the known
constructions of codes, e.g. codes based on bounded degree graphs such as expander codes [35, 36],
and we also refer to [28] for theoretical and experimental coding theory applications of very sparse
matrices (we stress here that the present paper deals with a different range of parameters – our
bounds will be for codes in which the minimal weight is not proportional to the dimension. Such
codes occur in several contexts, e.g. certain BCH and Reed-Solomon codes [29], Turbo and Turbo-
like codes [8, 25, 14, 7]). Additionally, the above discussion makes sense for parity check matrices
over arbitrary finite fields, which are also used in coding theory (see [29, 30] for basic information
on this topic, and [16] for empirical results on such codes). Finally, sparse parity check matrices
are the key ingredient in the construction of small probability spaces and deterministic simulations
of k-wise independent random variables, which are a key tool in derandomization [1, 2, 4, 10].
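
The Syndrome Decoding Algorithm described above can be illustrated by a minimal sketch over $\mathbb{F}_2$. The function names and the choice of the [7,4] Hamming code as a test case are assumptions made for illustration only; the search below is a naive exhaustive one and does not exploit the sparseness of H in any way.

```python
from itertools import combinations

def syndrome(H, v):
    """Compute H v over F_2; H is a list of rows, v a list of bits."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def syndrome_decode(H, z):
    """Return z - x, where x is a minimum-weight vector with Hx = Hz."""
    m = len(z)
    target = syndrome(H, z)
    # Try error patterns in order of increasing weight (exhaustive search).
    for weight in range(m + 1):
        for support in combinations(range(m), weight):
            x = [1 if i in support else 0 for i in range(m)]
            if syndrome(H, x) == target:
                return [(a - b) % 2 for a, b in zip(z, x)]

# Illustrative parity check matrix: the [7,4] Hamming code,
# whose j-th column is the binary expansion of j.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
```

For instance, `syndrome_decode(H, [1, 1, 1, 0, 0, 0, 1])` recovers the codeword `[1, 1, 1, 0, 0, 0, 0]` after a single corrupted bit.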
Somewhat surprisingly, in spite of their importance, there was a large gap between the known
upper and lower bounds for the maximal number of columns of sparse parity check matrices. For
a finite field $\mathbb{F}$ let $N_{\mathbb{F}}(n, k, r)$ be the maximal number of vectors in $\mathbb{F}^n$ with at most r non-zero
coordinates such that no k of them are linearly dependent (observe that the linear independence
condition corresponds to the fact that the kernel of the matrix whose columns are the given vectors
is a code with minimal distance at least k + 1). When $\mathbb{F} = \mathbb{F}_2$ we use the notation $N_{\mathbb{F}_2}(n, k, r) = N(n, k, r)$.
The problem dealt with in this paper, namely that of estimating $N_{\mathbb{F}}(n, k, r)$, differs
from the classical Gilbert-Varshamov bounds (see e.g. [29]), since the classical bounds on sizes of
codes are geometric packing bounds which depend only on the minimum distance of the code. Here
we introduce an additional algebraic restriction on the code (the existence of a sparse parity check
matrix) which is motivated by computational issues. Thus, we are dealing with a mixture of a
geometric and algebraic problem. In this paper we are primarily interested in the case that k and
r are fixed and n → ∞, although the results are valid for arbitrary k and r.
Throughout this paper we use the following notation: given two non-negative sequences $\{a_n\}_{n=1}^{\infty}$,
$\{b_n\}_{n=1}^{\infty}$, we write $a_n \ll b_n$ (equivalently, $b_n \gg a_n$) if there exists a constant C > 0 such that for all n, $a_n \le C \cdot b_n$.

1.1 Bounds on $N_{\mathbb{F}}(n, k, r)$


A probabilistic construction [27] (using the first moment method) shows that

$$N(n, k, r) \gg n^{\frac{r}{2} + \frac{r}{2k-2}},$$

and this was generalized to arbitrary finite fields in [26] to $N_{\mathbb{F}}(n, k, r) \gg n^{\frac{r}{2} + \frac{r}{2k-2}}$ for even k and
$N_{\mathbb{F}}(n, k, r) \gg n^{\frac{r}{2} + \frac{r}{2k-4}}$ for odd k. When k ≥ 4 is even, and gcd(k − 1, r) = 1, the probabilistic lower
bound above was improved in [9] to

$$N(n, k, r) \gg n^{\frac{r}{2} + \frac{r}{2k-2}} \cdot (\log n)^{\frac{1}{k-1}}.$$

This lower bound was generalized to arbitrary finite fields in [26] (in which case the constant also
depends on the size of the field).
In [27] it was shown that when k is a power of 2, $N(n, k, r) \ll n^{\frac{1}{2}\lceil r + \frac{r}{k-1} \rceil}$, which coincides with
the probabilistic lower bound (up to factors independent of n) when k − 1 divides r. This upper
bound was generalized to arbitrary finite fields in [26]. Observe that in the important case k > r,
i.e. when the number of correctible errors is greater than the weight, this upper bound becomes:
for k a power of 2, $N_{\mathbb{F}}(n, k, r) \ll n^{\frac{r}{2} + \frac{1}{2}}$. Thus the gap between the exponent of n in this bound and
the probabilistic lower bounds deteriorates as k grows. Here we resolve this problem by proving
the following theorem in Section 3:

Theorem 1.1. For every integer k ≥ 8 and every finite field $\mathbb{F}$,

$$N_{\mathbb{F}}(n, k, r) \ll n^{\frac{r}{2} + \frac{\lceil r/3 \rceil}{2\lfloor k/8 \rfloor}},$$

where the implied constant depends only on k, r and $|\mathbb{F}|$.

The exponent in the displayed inequality behaves roughly like $\frac{r}{2} + \frac{4r}{3k}$ when k is large. This
should be compared with the exponent $\frac{r}{2} + \frac{r}{2k-2}$ in the probabilistic lower bound on $N_{\mathbb{F}}(n, k, r)$. In
particular it follows from Theorem 1.1 that for any positive integer r,

$$\lim_{k\to\infty} \liminf_{n\to\infty} \left[\frac{\log N_{\mathbb{F}}(n, k, r)}{\log n}\right] = \lim_{k\to\infty} \limsup_{n\to\infty} \left[\frac{\log N_{\mathbb{F}}(n, k, r)}{\log n}\right] = \frac{r}{2}.$$

It is worthwhile to note here that the proof of Theorem 1.1 when r is even differs markedly from
the proof in the case of odd r. In fact, it turns out that the case of odd r involves a substantially
more subtle argument. The difference between these cases will be explained in section 2. It is likely
to be difficult to determine the exact value of the bracketed terms above for all k and r (the answer
probably depends on arithmetic properties of k and r).
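
To make the comparison of exponents concrete, the following sketch evaluates the exponent of n in Theorem 1.1 and in the probabilistic lower bound (for even k) with exact rational arithmetic. The function names are hypothetical and introduced only for this illustration; the formulas are the ones stated above.

```python
from fractions import Fraction

def upper_exponent(k, r):
    """Exponent of n in Theorem 1.1: r/2 + ceil(r/3) / (2 * floor(k/8)), for k >= 8."""
    ceil_r3 = -(-r // 3)
    return Fraction(r, 2) + Fraction(ceil_r3, 2 * (k // 8))

def lower_exponent(k, r):
    """Exponent of n in the probabilistic lower bound for even k: r/2 + r/(2k-2)."""
    return Fraction(r, 2) + Fraction(r, 2 * k - 2)

# Both exponents approach r/2 as k grows; here r = 3.
gaps = [upper_exponent(k, 3) - lower_exponent(k, 3) for k in (16, 160, 1600)]
```

For r = 3 and k = 16 the two exponents are 7/4 and 8/5, and the gap between them shrinks as k increases.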
The proof of Theorem 1.1 is based on a novel reduction of the problem to the following Turán
type problem: What is the maximum number of edges in an n-vertex graph which doesn’t contain
an even cycle of length 2k? We then employ recent results on this problem [37, 3, 24, 31] to deduce
bounds on NF (n, k, r). Some of the previous results on NF (n, k, r) reduced the problem to the study
of certain Turán type questions on hypergraphs (see [9, 7]). Since very little is known on hypergraph
Turán problems, our main contribution is the method of reducing such questions to a problem on
graphs. We believe that this approach is of independent interest. Indeed, as an example we apply
our result to a problem in combinatorial number theory, improving a theorem of Erdős, Sárközy
and Sós [21] (this application relies heavily on the more difficult part of Theorem 1.1, namely the
case of odd r). We also apply our result to the even cover problem for hypergraphs.

1.2 Even covers in hypergraphs


The methods used to prove Theorem 1.1 in the case $\mathbb{F} = \mathbb{F}_2$ extend the Even Cycle Theorem of
Erdős [13] to hypergraphs. If we rephrase the theorem in the terminology of hypergraphs, then we
define an even cover to be a non-empty collection of sets each point of which is in an even number
of sets. For example, a graph contains an even cover of size at most k if and only if it has girth at
most k. Applying Theorem 1.1 with $\mathbb{F} = \mathbb{F}_2$ to the incidence vectors of edges in a hypergraph, we
obtain the following theorem:
Theorem 1.2. Let S be a hypergraph on n points whose edges have size at most r each, and which
does not contain an even cover of size k. Then

$$|S| \ll n^{\frac{r}{2} + \frac{\lceil r/3 \rceil}{2\lfloor k/8 \rfloor}}.$$
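
The notion of an even cover can be made concrete with a small brute-force search. This is only an illustrative sketch (the function name is hypothetical, edges are assumed distinct, and the search is exponential); it is not the method of this paper. It uses the fact that a collection of sets is an even cover exactly when the symmetric difference of its members, that is, the sum of their incidence vectors over $\mathbb{F}_2$, is empty.

```python
from itertools import combinations

def smallest_even_cover(edges, max_size):
    """Return a smallest even cover with at most max_size edges, or None."""
    edges = [frozenset(e) for e in edges]
    for size in range(1, max_size + 1):
        for subset in combinations(edges, size):
            total = frozenset()
            for e in subset:
                total = total ^ e  # symmetric difference = sum over F_2
            if not total:
                return list(subset)
    return None

# The Fano plane, viewed as a Steiner triple system on 7 points.
fano = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]
```

For a graph, the search returns the edges of a shortest cycle, in line with the remark on girth above; for the Fano plane it returns four triples.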

As an example of an application to extremal hypergraph theory, it is notoriously difficult to
determine which configurations of triples are guaranteed to appear in every large enough Steiner
triple system (for example, many of the questions discussed in [19] remain open). In the present
context, it is known that there are infinitely many Steiner triple systems which do not contain an
even cover of four triples (one is constructed in [27]). Using Theorem 1.2, we can at least guarantee
small even covers in Steiner triple systems. For if S is a Steiner triple system on n points, then S
has $\frac{1}{3}\binom{n}{2}$ edges [15], which is larger than the bound in Theorem 1.2 for r = 3 and k = 16, so every
large enough Steiner triple system contains an even cover of size at most sixteen. We leave open
the problem of determining if smaller even covers exist in every Steiner triple system.
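
The arithmetic behind this choice of parameters can be spelled out as follows; this is a worked instance of the exponent in Theorem 1.2, with the implied constant left unspecified.

```latex
\[
r = 3,\ k = 16:\qquad
\frac{r}{2} + \frac{\lceil r/3 \rceil}{2\lfloor k/8 \rfloor}
= \frac{3}{2} + \frac{1}{2 \cdot 2} = \frac{7}{4} < 2,
\qquad\text{while}\qquad
|S| = \frac{1}{3}\binom{n}{2} = \frac{n(n-1)}{6} \gg n^{2}.
\]
```

For 8 ≤ k ≤ 15 the same formula gives the exponent 3/2 + 1/2 = 2, which does not beat the quadratic edge count, so k = 16 is the smallest value for which this comparison succeeds.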

1.3 Product representations of squares


Denote by $\mathrm{Rep}_k(n)$ the largest N such that there exists $A \subseteq \{1, \ldots, n\}$ with |A| = N such that
for every $1 \le \ell \le k$ there are no distinct $a_1, \ldots, a_\ell \in A$ and $x \in \mathbb{Z}$ satisfying $a_1 \cdot a_2 \cdots a_\ell = x^2$.
The behavior of the sequence $\mathrm{Rep}_k(n)$ has been studied by several authors [21, 34, 23]. One of
the motivations for studying this sequence is that the problem of finding product representations
of squares is a key step in certain sub-exponential factoring algorithms (see the surveys [38, 32]
for an account of this fascinating field, and [17] for the first rigorous analysis of a sub-exponential
randomized factoring algorithm). In these algorithms the goal is to find efficiently a subset of a
certain set of integers whose product is a square – the sets that are analyzed are carefully constructed
so that such a product representation is guaranteed to exist, but it is of interest to ask how large such
a set should be in order to ensure the existence of the required product representation. Moreover,
once the cardinality of a set is above this threshold, one would like to efficiently find such a product
representation. The best known results on this question follow from the work of Erdős [18] and
Erdős, Sárközy and Sós [21], who showed that for every k ≥ 6,

$$\left(\frac{n}{\log n}\right)^{\frac{1}{2} + \frac{1}{6k}} \ll \mathrm{Rep}_k(n) - \pi(n) \ll \left(\frac{n}{(\log n)^2}\right)^{\frac{2}{3}},$$

where the implied constants are absolute and π(n) is the number of primes less than or equal to n.
Here we show that for large k, the order of magnitude of $\mathrm{Rep}_k(n) - \pi(n)$ is roughly $n^{\frac{1}{2}}$, namely in
Section 4 we prove the following theorem:

Theorem 1.3. For every k ≥ 1 and every integer n,

$$\left(\frac{n}{\log n}\right)^{\frac{1}{2} + \frac{1}{48k}} \ll \mathrm{Rep}_{8k}(n) - \pi(n) \ll k\, n^{\frac{1}{2} + \frac{1}{2k}} (\log n)^2.$$
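
One standard way to see why product representations of squares are a linear algebra problem is to record, for each integer, the parities of the exponents in its prime factorization: a product is a square exactly when the corresponding vectors sum to zero over $\mathbb{F}_2$. The sketch below finds such a subset by Gaussian elimination on bitmasks. It is an illustration under stated assumptions (hypothetical function names, a user-supplied list of primes, distinct inputs) and is not the cycle-based method developed in this paper; in particular it gives no control over the number of factors.

```python
def parity_vector(a, primes):
    """Bitmask of the primes dividing a to an odd power (a must factor over `primes`)."""
    mask = 0
    for i, p in enumerate(primes):
        while a % p == 0:
            a //= p
            mask ^= 1 << i
    if a != 1:
        raise ValueError("number has a prime factor outside the given list")
    return mask

def square_subset(numbers, primes):
    """Return a non-empty sub-list whose product is a perfect square, or None."""
    basis = {}  # pivot bit -> (reduced mask, bitmask of the input indices combined)
    for j, a in enumerate(numbers):
        mask, used = parity_vector(a, primes), 1 << j
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (mask, used)
                break
            bmask, bused = basis[pivot]
            mask ^= bmask
            used ^= bused
        else:
            # The vector reduced to zero: the indices in `used` multiply to a square.
            return [numbers[i] for i in range(len(numbers)) if used >> i & 1]
    return None
```

For example, `square_subset([6, 10, 15], [2, 3, 5])` returns all three numbers, whose product is 900 = 30².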

1.4 An algorithm for linear dependences


Our proof of Theorem 1.1 yields an algorithm which, given a set $X \subseteq \mathbb{F}^n$ of vectors of weight at
most r with

$$|X| \gg n^{\frac{r}{2} + \frac{\lceil r/3 \rceil}{2\lfloor k/8 \rfloor}},$$

finds k linearly dependent vectors in X in time quadratic in |X| – that is in time $O(|X|^2) = O(n^{2r})$.
Observe that an exhaustive check of all possible $\binom{|X|}{k}$ linear dependencies requires time $\Omega(n^{\frac{kr}{2}})$. In
the case that r is even, we will give a short proof of Theorem 1.1 which itself yields an algorithm
whose running time is linear in |X|. These algorithms will follow directly from our proof of Theorem
1.1 and the Alon-Yuster-Zwick algorithm [5, 6] for finding cycles in graphs, or alternatively the
proof of the main theorem in [37], which gives better constants. Additionally, our proof yields an
algorithm which, given a set $A \subseteq \{1, \ldots, n\}$ with $|A| \gg k\, n^{\frac{1}{2} + \frac{1}{2k}} (\log n)^2$, finds in quadratic time
distinct $a_1, \ldots, a_j \in A$ with 1 ≤ j ≤ 8k, such that $a_1 a_2 \cdots a_j$ is a square.

2 Forbidden cycles and sparse parity check matrices


Throughout this section, $\mathbb{F}$ is a finite field and $\mathbb{F}^* = \mathbb{F} \setminus \{0\}$ is the set of non-zero elements of $\mathbb{F}$. A
set $X \subseteq \mathbb{F}^n$ is k-wise independent if no k vectors in X are linearly dependent. A vector $v \in \mathbb{F}^n$ is
said to have weight r if it has exactly r non-zero coordinates. Then $N_{\mathbb{F}}(n, k, r)$ is the maximum size
of a set of k-wise independent vectors of weight at most r in $\mathbb{F}^n$. The following result on forbidden
cycles will be used in the proof of Theorem 1.1:

Theorem 2.1. Let k be a positive integer and let G = (V, E) be an N-vertex graph. If G contains
no cycle of length exactly 2k then

$$|E| < 2k N^{1 + \frac{1}{k}}. \tag{1}$$

If G is an M by N bipartite graph containing no cycle of length 2k, then

$$|E| < 2k \cdot \left[M^{\frac{1}{2}} N^{\frac{1}{2} + \frac{1}{k}} + M + N\right]. \tag{2}$$

If G has girth at least 2k + 1, then the factor 2k may be removed in each of the upper bounds.

Theorem 2.1 is proved in [31]; the last statement is proved in [3], and its bipartite case
is proved in [24]. The first assertion (1), with the same dependence on N but a worse constant, is
Erdős’ Even Cycle Theorem – see [13].

2.1 Vectors of even weight


We will reduce the problem of estimating NF (n, k, r) to the bounds in Theorem 2.1. The reduction
we give is simple in the case that r is even, but substantially more involved when r is odd. The
first theorem we prove is as follows:
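Theorem 2.2. Let n, k, r be positive integers, and define

$$M = \sum_{\ell=0}^{\lfloor r/2 \rfloor} \binom{n}{\ell} (|\mathbb{F}| - 1)^{\ell}, \qquad N = \sum_{\ell=0}^{\lceil r/2 \rceil} \binom{n}{\ell} (|\mathbb{F}| - 1)^{\ell}.$$

Let $X \subseteq \mathbb{F}^n$ be a set of vectors of weight at most r such that

$$|X| > 2k \cdot \left[M^{\frac{1}{2}} N^{\frac{1}{2} + \frac{1}{k}} + M + N\right].$$

Then there are disjoint sets A, B ⊆ X, each of size k, such that $\sum_{a \in A} a = \sum_{b \in B} b$. In particular,
there is a set of 2k linearly dependent vectors in X.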

The bound in Theorem 2.2 implies in particular that $N_{\mathbb{F}}(n, 2k, r) \ll n^{\frac{r}{2} + \frac{1}{2}}$ for all k. This
generalizes the same bound which was proved for k a power of two in [27] to all values of k. The
proof of Theorem 2.2 is short, and we present it here:

Proof of Theorem 2.2. For each vector $v \in X$ of weight ω ≤ r, fix a pair e(v) = {x, y} of
vectors in $\mathbb{F}^n$ of weights at most $\lfloor \omega/2 \rfloor$ and $\lceil \omega/2 \rceil$, respectively, satisfying x + y = v. If r is
even, let G be the graph whose vertex set consists of all vectors in $\mathbb{F}^n$ of weight at most r/2
and whose edge set is E = {e(v) : v ∈ X}. Then |E| = |X| and G has M vertices. By the
first assertion (1) in Theorem 2.1, G contains a cycle of length 2k. By the definition of G, there
exist distinct vectors $v_1, v_2, \ldots, v_{2k} \in X$ such that the edge set of this cycle consists of the pairs
$e(v_i) = \{x_i, x_{i+1}\}$ for i = 1, 2, . . . , 2k, where $x_{2k+1} = x_1$. Then

$$\sum_{i=1}^{2k} (-1)^{i+1} v_i = (x_1 + x_2) - (x_2 + x_3) + \cdots - (x_{2k} + x_1) = 0,$$

and the disjoint sets $A = \{v_1, v_3, \ldots, v_{2k-1}\}$ and $B = \{v_2, v_4, \ldots, v_{2k}\}$ have the same sum. If r is
odd, then G is a bipartite graph whose parts have sizes M and N. By the second assertion (2) of
Theorem 2.1, $C_{2k} \subseteq G$, and we conclude by the same argument as that presented above.
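
The reduction in this proof can be sketched in code for the special case $\mathbb{F} = \mathbb{F}_2$. The sketch makes several simplifying assumptions that are not part of the proof: vectors are non-zero and represented by their supports, each support is split at the median of its sorted elements, and the search stops at the first cycle of any length rather than a cycle of length exactly 2k (over $\mathbb{F}_2$ every cycle yields a dependence, since each vertex of the cycle appears in exactly two edges). All function names are hypothetical, and the path search is a plain BFS rather than the linear-time cycle algorithm of [5, 6].

```python
from collections import deque

def split_support(v):
    """Split a support set into halves x, y with x ^ y == v (addition over F_2)."""
    s = sorted(v)
    h = len(s) // 2
    return frozenset(s[:h]), frozenset(s[h:])

def _tree_path(adj, src, dst):
    """BFS in the current forest; return the vectors labelling the src-dst path."""
    if src not in adj or dst not in adj:
        return None
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            labels = []
            while prev[u] is not None:
                u, label = prev[u]
                labels.append(label)
            return labels
        for w, label in adj[u]:
            if w not in prev:
                prev[w] = (u, label)
                queue.append(w)
    return None

def find_dependency(vectors):
    """Return distinct vectors summing to zero over F_2, or None if no cycle is found."""
    adj = {}  # vertex (half-support) -> list of (neighbour, vector labelling the edge)
    for v in {frozenset(v) for v in vectors}:
        x, y = split_support(v)
        path = _tree_path(adj, x, y)
        if path is not None:
            return path + [v]  # the new edge closes a cycle
        adj.setdefault(x, []).append((y, v))
        adj.setdefault(y, []).append((x, v))
    return None
```

A return value of `None` only means that the graph built from this particular choice of splits is a forest; it does not certify that the input is linearly independent.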

Theorem 2.2 implies Theorem 1.1 in the case of even r: in fact in this case we obtain the
stronger bound $N_{\mathbb{F}}(n, k, r) \ll n^{\frac{r}{2} + \frac{r}{k}}$ for k even. We also remark that the proof above gives a linear
time algorithm – that is time O(|X|) – for finding 2k linearly dependent vectors in a set $X \subseteq \mathbb{F}^n$
satisfying the requirements of the theorem – this follows from the linear time algorithm in [5, 6] for
finding a cycle of length 2k in a graph with appropriately many edges.

2.2 Vectors of odd weight


In the case of odd r, Theorem 2.2 is insufficient to deduce Theorem 1.1, since the rounding up of
$\frac{r}{2}$ only yields an upper bound of order $n^{\frac{r}{2} + \frac{1}{2}}$. This difference between even and odd values of r
leads to a more involved argument in the case of odd r, which nevertheless yields the bounds of
Theorem 1.1 in this case as well. We conclude the proof of Theorem 1.1 by giving the following
refinement of Theorem 2.2 when k > r and r is odd. For convenience we define $[\frac{r}{3}] = r - \lceil \frac{r}{3} \rceil - \lfloor \frac{r}{3} \rfloor$
and $q = |\mathbb{F}^*|$:
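Theorem 2.3. Let n, k and r be positive integers, and define

$$L = \sum_{\ell=0}^{\lfloor r/3 \rfloor} q^{\ell} \binom{n}{\ell}, \qquad M = \sum_{\ell=0}^{[r/3]} q^{\ell} \binom{n}{\ell}, \qquad N = \sum_{\ell=0}^{\lceil r/3 \rceil} q^{\ell} \binom{n}{\ell}.$$

Let $X \subseteq \mathbb{F}^n$ be an 8k-wise independent set of vectors. Then

$$|X| < 12kr \cdot \left[(LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}} + N^{1 + \frac{1}{2k}}\right].$$

In particular, it follows that $N_{\mathbb{F}}(n, 8k, r) \ll n^{\frac{r}{2} + \frac{\lceil r/3 \rceil}{2k}}$, which gives Theorem 1.1 when we replace
k with $\lfloor k/8 \rfloor$. The remainder of the paper is devoted to the proof of Theorem 2.3.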

3 Proof of Theorem 1.1
We already proved Theorem 1.1 in the case that r is even. In this section we prove Theorem 2.3
which implies Theorem 1.1 when r is odd. Since the proof is quite involved, we begin with an
outline of the proof. For simplicity we omit all the multiplicative constants. Starting with a set
$X \subseteq \mathbb{F}^n$ where $|X| \gg (LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}} + N^{1 + \frac{1}{2k}}$, we wish to find a linearly dependent set of 8k vectors
in X. Suppose that there is no such subset of X. Via an averaging argument (see §3.6), we will
show that it is sufficient to find such a linear dependence in a set $Z \subseteq \mathbb{F}^L \times \mathbb{F}^M \times \mathbb{F}^N$, where
$|Z| \gg |X|$ and where the projection of Z onto any of the subspaces $\mathbb{F}^L$, $\mathbb{F}^M$ and $\mathbb{F}^N$ consists of
vectors of weight one. We call such sets balanced.
To find linear dependences in balanced sets Z where $|Z| \gg |X|$, we first prove in §3.1 that there
is a set Y ⊆ Z where $|Y| > |Z| - L^{1 + \frac{1}{2k}} - M^{1 + \frac{1}{2k}} - N^{1 + \frac{1}{2k}}$ and the projection of Y onto any one
of the subspaces $\mathbb{F}^L \times \mathbb{F}^M$, $\mathbb{F}^L \times \mathbb{F}^N$ and $\mathbb{F}^M \times \mathbb{F}^N$ uniquely determines Y. In this case we say
that Y is determined by projection. This allows us in §3.2 to count special four-element subsets of
Y called partially dependent quadruples: these are sets of four vectors in Y whose projection onto
$\mathbb{F}^L \times \mathbb{F}^N$ is a linearly dependent set of four vectors. The key point in the proof is to prove that
there are enough of these quadruples in Y to ensure that the projection of 2k of these quadruples
onto $\mathbb{F}^M$ results in a linearly dependent set. Then this gives 8k linearly dependent vectors in Y,
and the required contradiction.
To obtain a feeling for where the bound in Theorem 2.3 comes from, we give some of the bounds
involved in the proof. Let $\mathcal{Q}$ denote the set of partially dependent quadruples in Y. We will show
in §3.2 that

$$|\mathcal{Q}| > \frac{|Y|^4}{4L^2N^2}.$$

Now $|Y| \gg (LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}}$, so this inequality gives $|\mathcal{Q}| \gg M^{2 + 2/k}$. Since the projection of Y
onto $\mathbb{F}^M$ consists of vectors of weight one, the projection of each quadruple in $\mathcal{Q}$ onto $\mathbb{F}^M$ consists
of four vectors of weight one. If we treat these projections as vectors of weight four in $\mathbb{F}^M$, then
there are $|\mathcal{Q}| \gg M^{2 + 2/k}$ of these vectors. By Theorem 2.2 with r = 4, this shows that there are
2k quadruples in $\mathcal{Q}$ whose projections onto $\mathbb{F}^M$ form a linearly dependent set of 2k vectors. In
principle, we then consider all the vectors in the corresponding 2k quadruples in $\mathcal{Q}$ to obtain a
linearly dependent set of at most 8k vectors in Y, and this contradiction completes the proof of
Theorem 2.3.
However, there is a subtlety, which is that the vectors in the 2k quadruples in Q might form a
trivial dependence in Y . One reason might be that each vector appears with coefficient zero modulo
p in the linear dependence, where p is the characteristic of F, in which case the linear dependence
is trivial. In the proof we consider only special types of linear dependences of projections of
quadruples onto FM , which guarantee that when we lift back to the quadruples themselves, the
linear dependence is not trivial. This approach is covered in §3.3 and §3.4.

3.1 Sets determined by projection


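We represent the elements of $\mathbb{F}^\alpha \times \mathbb{F}^\beta \times \mathbb{F}^\gamma$ as triples $(v_\alpha, v_\beta, v_\gamma)$. For convenience we write
$\lambda = |\mathbb{F}^*|\alpha$, $\mu = |\mathbb{F}^*|\beta$ and $\nu = |\mathbb{F}^*|\gamma$. A set $Z \subseteq \mathbb{F}^\alpha \times \mathbb{F}^\beta \times \mathbb{F}^\gamma$ is balanced if the components of each
$(v_\alpha, v_\beta, v_\gamma) \in Z$ have weight one. The projection of a set Z onto $\mathbb{F}^\alpha$ is

$$Z_\alpha = \{v \in \mathbb{F}^\alpha : \exists\, (v_\beta, v_\gamma) \in \mathbb{F}^\beta \times \mathbb{F}^\gamma \text{ such that } (v, v_\beta, v_\gamma) \in Z\}.$$

Similarly, $Z_{\alpha\beta}$ denotes the projection of Z onto $\mathbb{F}^\alpha \times \mathbb{F}^\beta$. For $v \in \mathbb{F}^\alpha \times \mathbb{F}^\beta$ let $Z^v = \{z \in Z : z_{\alpha\beta} = v\}$.
A set $Y \subseteq \mathbb{F}^\alpha \times \mathbb{F}^\beta \times \mathbb{F}^\gamma$ is determined by projection if any one of the sets $Y_{\alpha\beta}$, $Y_{\beta\gamma}$, $Y_{\alpha\gamma}$ uniquely
determines Y. The following lemma says that 8k-wise independent sets contain large subsets which
are determined by projection: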

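Lemma 3.1. Let $Z \subseteq \mathbb{F}^\alpha \times \mathbb{F}^\beta \times \mathbb{F}^\gamma$ be a balanced 8k-wise independent set. Then there exists a set
Y ⊆ Z such that Y is determined by projection and

$$|Y| > |Z| - \lambda^{1 + \frac{1}{2k}} - \mu^{1 + \frac{1}{2k}} - \nu^{1 + \frac{1}{2k}}. \tag{3}$$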

Proof. For $v \in Z_{\alpha\beta}$, let T(v) be a spanning tree of the complete graph on $Z^v$. So $|E(T(v))| = |Z^v| - 1$.
We claim that the trees $\{T(v)\}_{v \in Z_{\alpha\beta}}$ can be chosen so that the multigraph $G_{\alpha\beta}$ consisting
of all edges in all the trees $\{T(v)\}_{v \in Z_{\alpha\beta}}$ has girth greater than 4k. This is done by choosing the
trees $\{T(v)\}_{v \in Z_{\alpha\beta}}$ so that the girth of $G_{\alpha\beta}$ is a minimum. This choice implies that if C is a shortest
cycle in $G_{\alpha\beta}$, then |C ∩ T(v)| ≤ 1 for all $v \in Z_{\alpha\beta}$. We aim to show that |C| > 4k. Suppose the edges
of C are $\{\{w_1, w_2\}, \{w_2, w_3\}, \ldots, \{w_\ell, w_1\}\}$. Then there are distinct $v_j = (v^j_\alpha, v^j_\beta) \in Z_{\alpha\beta}$ such that
$\{w_j, w_{j+1}\} \in T(v_j)$ for all j ≤ ℓ. Let $x_j = (v_j, w_j) \in Z$ and $y_j = (v_j, w_{j+1}) \in Z$ for j ≤ ℓ. Then

$$\sum_{j=1}^{\ell} x_j - \sum_{j=1}^{\ell} y_j = 0.$$

Now $x_i$ and $y_j$ are distinct for all i, j ≤ ℓ. This means that $\{x_1, x_2, \ldots, x_\ell, y_1, y_2, \ldots, y_\ell\} \subseteq Z$ is
linearly dependent. Since Z is 8k-wise independent, it follows that ℓ > 4k, as required, so $G_{\alpha\beta}$ has
girth greater than 4k. The number of edges in $G_{\alpha\beta}$ is

$$|E(G_{\alpha\beta})| = \sum_{v \in Z_{\alpha\beta}} (|Z^v| - 1). \tag{4}$$

On the other hand, since $G_{\alpha\beta}$ has at most ν vertices (since Z is balanced), we conclude from
Theorem 2.1 that $|E(G_{\alpha\beta})| < \nu^{1 + \frac{1}{2k}}$. We may define $G_{\beta\gamma}$ and $G_{\alpha\gamma}$ similarly, and by symmetry
$|E(G_{\beta\gamma})| < \lambda^{1 + \frac{1}{2k}}$ and $|E(G_{\alpha\gamma})| < \mu^{1 + \frac{1}{2k}}$. Using (4), these inequalities translate to

$$\sum_{v \in Z_{\alpha\beta}} (|Z^v| - 1) < \nu^{1 + \frac{1}{2k}},$$

$$\sum_{v \in Z_{\alpha\gamma}} (|Z^v| - 1) < \mu^{1 + \frac{1}{2k}},$$

$$\sum_{v \in Z_{\beta\gamma}} (|Z^v| - 1) < \lambda^{1 + \frac{1}{2k}}.$$

Finally, the number of vectors in Z which are not determined by projection is exactly the sum of
the three terms on the left in the above inequalities. So the number of vectors in Z which are
determined by projection is greater than

$$|Z| - \lambda^{1 + \frac{1}{2k}} - \mu^{1 + \frac{1}{2k}} - \nu^{1 + \frac{1}{2k}}.$$

Let Y be the set of all these vectors; then Y satisfies the requirements of the lemma.

3.2 Partially dependent quadruples
For the remainder of the proof of Theorem 2.3, we restrict our attention to a set Y ⊆ Z which
is 8k-wise independent and determined by projection, and satisfies (3). A partially dependent
quadruple in Y is a quadruple {w, x, y, z} ⊆ Y such that $(w_\alpha, x_\gamma, y_\alpha, z_\gamma) = (x_\alpha, y_\gamma, z_\alpha, w_\gamma)$. When it
is convenient, we represent this quadruple as an ordered 4-tuple (w, x, y, z) with the understanding
that the non-zero co-ordinate of $w_\alpha$ precedes the non-zero co-ordinate of $y_\alpha$ and the non-zero
co-ordinate of $w_\gamma$ precedes the non-zero co-ordinate of $y_\gamma$. A partially dependent quadruple is
illustrated in Figure 1.

Figure 1: A partially dependent quadruple (depicted in $\mathbb{F}^\alpha \times \mathbb{F}^\gamma$; image not reproduced).

The key point is that the projection of a partially dependent quadruple onto $\mathbb{F}^\alpha \times \mathbb{F}^\gamma$ is a linearly
dependent set. Therefore our aim is to try to find a linearly dependent set of 2k projections of
quadruples in Y onto $\mathbb{F}^\beta$. Since each quadruple consists of four vectors in Y, altogether this gives
8k linearly dependent vectors in Y. Let $\mathcal{Q}$ denote the set of partially dependent quadruples in Y.
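
The linear dependence of the projection can be checked directly from the defining identities of a partially dependent quadruple. The following worked computation is a sketch using the labelling of the definition above, with the alternating signs chosen to match the combination that appears in equation (6).

```latex
\[
w_\alpha = x_\alpha,\qquad x_\gamma = y_\gamma,\qquad y_\alpha = z_\alpha,\qquad z_\gamma = w_\gamma,
\]
\[
(w - x + y - z)_\alpha = (w_\alpha - x_\alpha) + (y_\alpha - z_\alpha) = 0,
\qquad
(w - x + y - z)_\gamma = (w_\gamma - z_\gamma) + (y_\gamma - x_\gamma) = 0.
\]
```

Only the β-components of w − x + y − z can therefore be non-zero, which is why the argument concentrates on the projections onto the β-coordinates.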
Lemma 3.2. Suppose that $|Y| > \nu + 2\lambda\nu^{\frac{1}{2}}$. Then

$$|\mathcal{Q}| > \frac{|Y|^4}{4\lambda^2\nu^2}. \tag{5}$$
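Proof. For $v \in \mathbb{F}^\alpha$, recall that $Y^v = \{(v_\alpha, v_\beta, v_\gamma) \in Y : v_\alpha = v\}$. We define $Y^v$ similarly when
$v \in \mathbb{F}^\beta$ or $v \in \mathbb{F}^\gamma$. To prove (5), we use the following identity, which follows from the fact that Y
is determined by projection:

$$\sum_{\{v,x\} \subseteq \mathbb{F}^\alpha} |Y^v \cap Y^x| = \sum_{w \in \mathbb{F}^\gamma} \binom{|Y^w|}{2}. \tag{$*$}$$

It follows that

$$\begin{aligned}
|\mathcal{Q}| &= \sum_{\{v,x\} \subseteq \mathbb{F}^\alpha} \binom{|Y^v \cap Y^x|}{2} \\
&\ge \binom{\lambda}{2} \binom{\frac{1}{\binom{\lambda}{2}} \sum_{\{v,x\} \subseteq \mathbb{F}^\alpha} |Y^v \cap Y^x|}{2} \\
&\overset{(*)}{=} \binom{\lambda}{2} \binom{\frac{1}{\binom{\lambda}{2}} \sum_{w \in \mathbb{F}^\gamma} \binom{|Y^w|}{2}}{2} \\
&\ge \binom{\lambda}{2} \binom{\frac{\nu}{\binom{\lambda}{2}} \binom{\frac{1}{\nu} \sum_{w \in \mathbb{F}^\gamma} |Y^w|}{2}}{2} \\
&= \binom{\lambda}{2} \binom{\frac{\nu}{\binom{\lambda}{2}} \binom{\nu^{-1}|Y|}{2}}{2} \\
&> \frac{|Y|^4}{4\lambda^2\nu^2}.
\end{aligned}$$

This is exactly (5). In each of the inequalities we used the convexity of the function $a \mapsto \binom{a}{2}$. In
the last inequality we used the lower bound on |Y| assumed in the lemma.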

3.3 Constructing linear dependences


For each partially dependent quadruple Q = (w, x, y, z) in $\mathcal{Q}$ (the set of partially dependent quadruples in Y), define $\pi(Q) = \{(w_\beta, y_\beta), (x_\beta, z_\beta)\}$.
It is convenient to define the multigraph G consisting of all pairs π(Q) where $Q \in \mathcal{Q}$ (the vertex
set of G is a subset of $\mathbb{F}^\beta \times \mathbb{F}^\beta$). A chain of length k is a sequence $Q = (Q_1, Q_2, \ldots, Q_k)$ where
$Q_i \in \mathcal{Q}$ such that $(\pi(Q_1), \pi(Q_2), \ldots, \pi(Q_k))$ is a non-returning walk (a walk in which consecutive
edges are distinct) of length k in G. Two chains are concyclic if the walks corresponding to them
in G have the same endpoints. If $Q = (Q_1, Q_2, \ldots, Q_k)$ and $R = (R_1, R_2, \ldots, R_k)$ are concyclic
chains, where $Q_i = (s_i, t_i, u_i, v_i)$ and $R_i = (w_i, x_i, y_i, z_i)$, then $\{(s_1)_\beta, (u_1)_\beta\} = \{(w_1)_\beta, (y_1)_\beta\}$ and
$\{(t_k)_\beta, (v_k)_\beta\} = \{(x_k)_\beta, (z_k)_\beta\}$. It follows that

$$\sum_{i=1}^{k} (s_i + u_i - t_i - v_i) - \sum_{i=1}^{k} (w_i + y_i - x_i - z_i) = 0. \tag{6}$$

We want to find concyclic chains Q and R such that the equation above is a non-trivial linear
dependence of at most 8k vectors in Y (it is possible to construct many examples where this
linear dependence is trivial). To find a non-trivial dependence, it is sufficient to show that some
vector appears in the equation (6) with a coefficient which is not zero modulo p, where p is the
characteristic of F. This will hold for certain special chains which we call nondegenerate chains.
Let $Q = (Q_1, Q_2, \ldots, Q_k)$ be a chain of length k where $Q_i = (s_i, t_i, u_i, v_i)$ for i ∈ {1, 2, . . . , k}.
Then the reduction of Q is the set of vectors defined by

$$\widehat{Q} := Q_k \,\triangle_p\, Q_{k-1} \,\triangle_p \cdots \triangle_p\, Q_1. \tag{7}$$

The expression above is read from right to left, and the symmetric difference operator $\triangle_p$ is defined
as follows: we delete any vector once it appears with coefficient zero mod p in the sum

$$\sum_{i=1}^{k} (s_i + u_i - t_i - v_i).$$

We say that Q is nondegenerate if $|Q_i \cap (Q_1 \cup Q_2 \cup \cdots \cup Q_{i-1})| \le 1$ for 2 ≤ i ≤ k, and Q is
degenerate otherwise.

Lemma 3.3. Suppose Q, R are concyclic nondegenerate chains of length k in $\mathcal{Q}$. Then $\widehat{Q} = \widehat{R}$.

Proof. Suppose $Q = (Q_1, Q_2, \ldots, Q_k)$ and $R = (R_1, R_2, \ldots, R_k)$ where $Q_i = (s_i, t_i, u_i, v_i)$ and
$R_i = (w_i, x_i, y_i, z_i)$. Since Q and R are concyclic, equation (6) holds. If we restrict the sum on the
left of (6) to vectors in $\widehat{Q}$, the equality still holds, since p is the characteristic of $\mathbb{F}$. Similarly, we
may delete all terms on the right of (6) which are not in $\widehat{R}$. Since Q and R are nondegenerate, $\widehat{Q}$
and $\widehat{R}$ are each non-empty. If $\widehat{Q} \,\triangle\, \widehat{R} \ne \emptyset$, then $\widehat{Q} \,\triangle\, \widehat{R}$ is a non-empty linearly dependent set of at
most 8k vectors in Y, a contradiction. So $\widehat{Q} = \widehat{R}$, as required.

3.4 Counting nondegenerate chains


Lemma 3.4. If $Q = (Q_1, Q_2, \ldots, Q_\ell)$ is a nondegenerate chain in $\mathcal{Q}$, then at most $8\ell^2$ degenerate
chains of length ℓ + 1 contain Q.

Proof. Let $(w_\beta, y_\beta)$ be the endvertex of the walk $(\pi(Q_1), \pi(Q_2), \ldots, \pi(Q_\ell))$ in G and let
$f(Q) = \bigcup_{i=1}^{\ell} Q_i$. Then the number of degenerate chains of length ℓ + 1 containing Q is equal to the number
of partially dependent quadruples $R \in \mathcal{Q}$ such that |R ∩ f(Q)| ≥ 2. Let us assume

$$R = (w, x, y, z) = ((w_\alpha, w_\beta, w_\gamma), (y_\alpha, x_\beta, w_\gamma), (y_\alpha, y_\beta, y_\gamma), (w_\alpha, z_\beta, y_\gamma)). \tag{8}$$

We claim that exactly two elements of S = {w, x, y, z} ∩ f(Q) uniquely determine R. Once that is
proved, it follows that the number of R such that $(Q_1, Q_2, \ldots, Q_\ell, R)$ is degenerate is at most

$$\binom{|f(Q)|}{2} \le \binom{|Q_1| + |Q_2| + \cdots + |Q_\ell|}{2} = \binom{4\ell}{2} < 8\ell^2$$

as required. We now prove the claim. Since Y is determined by projection, R is uniquely determined
upon specifying the projection of R onto $\mathbb{F}^\alpha$ and onto $\mathbb{F}^\gamma$. So the claim is proved if $S_\alpha = \{w_\alpha, y_\alpha\}$
and $S_\gamma = \{w_\gamma, y_\gamma\}$ – since in that case two co-ordinates of each vector in the expression (8) defining
R are specified. If this is not the case, then one checks that S is one of the pairs {w, x}, {x, y}, {y, z}
or {z, w}. These cases are all dealt with in the same way, so we check only the case S = {w, x}.
Since $y_\gamma$ is uniquely determined by $y_\alpha$ and $y_\beta$, y is uniquely determined by S. Hence {w, x, y} are
specified, and therefore $\{w_\alpha, y_\alpha\}$ and $\{w_\gamma, y_\gamma\}$ are uniquely determined. This uniquely determines
R, and proves the claim.

Lemma 3.5. Let $P_k$ denote the set of nondegenerate chains of length k in $\mathcal{Q}$. Then

$$|P_k| > 4^{-k} m (d - 32k^2)^k, \tag{9}$$

where $d > 64k^2$ is the average degree in G and m is the number of vertices in G.

Proof. We claim that for all ℓ ≤ k and $d > 64\ell^2$,

$$|P_\ell| > 4^{-\ell} m (d - 32\ell^2)^{\ell}. \tag{10}$$

The proof is by induction on m + ℓ. If G contains a vertex of degree less than d/4, then we remove
such a vertex to obtain a graph of average degree greater than $\tilde{d} = d + d/(2m - 2)$. By induction,
the number of walks in $P_\ell$ for this new graph is at least

$$4^{-\ell} (m - 1)(\tilde{d} - 32\ell^2)^{\ell} > 4^{-\ell} m (d - 32\ell^2)^{\ell}.$$

In particular, the number of non-returning walks of G which are in $P_\ell$ is at least $4^{-\ell} m (d - 32\ell^2)^{\ell}$,
as required. Suppose every vertex of G has degree at least d/4, and

$$|P_{\ell-1}| > 4^{-\ell+1} m (d - 32(\ell - 1)^2)^{\ell-1}.$$

Since every walk in $P_{\ell-1}$ has at most $8(\ell - 1)^2$ extensions to a degenerate walk of length ℓ, by
Lemma 3.4, there are at least $d/4 - 8(\ell - 1)^2 - \ell > d/4 - 8\ell^2$ extensions of each walk in $P_{\ell-1}$ to a
walk in $P_\ell$. This proves (10). Now the lemma follows from (10) with ℓ = k.

Lemma 3.6. Let m be the number of vertices in G. Then

$$|P_k| < 4^k k^{4k} \binom{m}{2}. \tag{11}$$

Proof. By Lemma 3.3, if $Q = (Q_1, Q_2, \ldots, Q_k)$ is a chain in $P_k$, then the number of chains
$R = (R_1, R_2, \ldots, R_k)$ in $P_k$ such that Q and R are concyclic is at most the number of choices of R such
that $\widehat{R} = \widehat{Q}$. Let $f(Q) = \bigcup_{i=1}^{k} Q_i$ and $f(R) = \bigcup_{i=1}^{k} R_i$. The important point is that since Q and R
are nondegenerate, their projections onto $\mathbb{F}^\alpha$ and $\mathbb{F}^\gamma$ are invariant under reduction:

$$\widehat{Q}_\alpha = f(Q)_\alpha, \qquad \widehat{Q}_\gamma = f(Q)_\gamma, \qquad \widehat{R}_\alpha = f(R)_\alpha, \qquad \widehat{R}_\gamma = f(R)_\gamma.$$

Therefore the number of choices of $R \in P_k$ concyclic with $Q \in P_k$ is at most the number of choices
of R such that $f(R)_\alpha = f(Q)_\alpha$ and $f(R)_\gamma = f(Q)_\gamma$. Since Y is determined by projection, any
quadruple $R_i \in \mathcal{Q}$ is specified by its projection onto $\mathbb{F}^\alpha \times \mathbb{F}^\gamma$. The number of ways of choosing
quadrilaterals $R_1, R_2, \ldots, R_k$ so that $R = (R_1, R_2, \ldots, R_k)$ is at most

$$\binom{|f(Q)_\alpha|}{2}^{k} \binom{|f(Q)_\gamma|}{2}^{k} \le \binom{2k}{2}^{k} \binom{2k}{2}^{k} < 4^k k^{4k}$$

since $|f(Q)_\alpha| \le 2k$ and $|f(Q)_\gamma| \le 2k$. This gives the required upper bound on $|P_k|$.

3.5 Linear Dependences
In this section we prove Theorem 2.3, using the lemmas we have developed in the last few sections.
The following theorem combines all of these lemmas, and will also be used to prove Theorem 1.3.

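Theorem 3.7. Let $Z \subseteq \mathbb{F}^\alpha \times \mathbb{F}^\beta \times \mathbb{F}^\gamma$ be a balanced 8k-wise independent set of vectors. Then

$$|Z| < 4k(\lambda\mu\nu)^{\frac{1}{2}} \mu^{\frac{1}{2k}} + \lambda^{1 + \frac{1}{2k}} + \mu^{1 + \frac{1}{2k}} + \nu^{1 + \frac{1}{2k}} + 2\lambda\nu^{\frac{1}{2}}. \tag{12}$$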

Proof. Since Z is 8k-wise independent, we may apply Lemma 3.1: there exists a set Y ⊆ Z such that Y is determined by projection and

$$|Y| > \left(16k^2 \mu^{1+\frac{1}{k}} \lambda\nu\right)^{\frac{1}{2}} + \nu + 2\lambda\nu^{\frac{1}{2}}. \tag{13}$$

Let $P_k$ denote the set of nondegenerate chains of length k in Q, and let d and m be the average degree and number of vertices in G, respectively. Combining (11) and (9) gives $d - 32k^2 < 4k^4 m^{\frac{1}{k}}$ and therefore $d < 64k^4 m^{\frac{1}{k}}$. Since $|Q| = |E(G)| = \frac{1}{2}dm$, it follows that

$$|Q| < 64k^4 m^{1+\frac{1}{k}}. \tag{14}$$

For a contradiction, suppose that |Z| is at least the expression claimed in (12). Now from (13),

$$|Q| > \frac{|Y|^4}{4\lambda^2\nu^2}. \tag{15}$$

Using (14) and $m = \mu^2$ this gives

$$|Y|^4 < 4\lambda^2\nu^2 \cdot 64k^4 m^{1+\frac{1}{k}} < 256k^4 \lambda^2\nu^2 \mu^{2+\frac{2}{k}} = \left(16k^2 \mu^{1+\frac{1}{k}} \lambda\nu\right)^2.$$

This contradicts (13), and proves the theorem.

Remark. Theorem 3.7 can be used to derive a more precise version of Theorem 1.2: if S is an r-partite hypergraph with parts of sizes $N_1, N_2, \dots, N_r$, then the above theorem can be used to prove that if

$$|S| \gg (N_1 N_2 \cdots N_r)^{\frac{1}{2}+\frac{1}{2k}} + (N_1 + N_2 + \cdots + N_r)^{\lceil r/3 \rceil \left(1+\frac{1}{2k}\right)}$$

then S contains an even cover of size at most 8k. This result may be viewed as an extension of Theorem 2.1 from bipartite graphs to r-partite hypergraphs.

Proof of Theorem 2.3. Let $X \subseteq F^n$ be a set of vectors of weight at most r. Let $\chi : \{1, 2, \dots, n\} \to \{1, 2, 3\}$ be a random three-coloring of the co-ordinates, where distinct co-ordinates are colored independently and each color is equiprobable. For a vector $x \in X$ of weight $\omega(x) = \omega$, the probability that x has exactly $\lfloor \omega/3 \rfloor$ non-zero co-ordinates of color 1, exactly $[\omega/3]$ non-zero co-ordinates of color 2, and exactly $\lceil \omega/3 \rceil$ non-zero co-ordinates of color 3 is exactly

$$\frac{1}{3^{\omega}} \binom{\omega}{\lfloor \omega/3 \rfloor} \binom{\omega - \lfloor \omega/3 \rfloor}{\lceil \omega/3 \rceil} > \frac{1}{3\omega},$$

where we used the numerical Lemma 3.8 (see below) to obtain the inequality. In this case we say that x is equipartitioned by χ. Therefore the expected number of vectors x ∈ X which are equipartitioned is greater than

$$\sum_{x \in X} \frac{1}{3\omega(x)} \ge \frac{|X|}{3r}.$$

This implies that there is a subset Z of X of size greater than $|X|/(3r)$ and a three-coloring χ such that every vector z ∈ Z is equipartitioned by χ. Then Z may be regarded as a balanced subset of $F^L \times F^M \times F^N$ where L, M and N are defined in Theorem 2.3. Applying Theorem 3.7 to Z with λ = L, µ = M and ν = N, we obtain

$$\frac{|X|}{3r} < 4k(LN)^{\frac{1}{2}} M^{\frac{1}{2}+\frac{1}{2k}} + L^{1+\frac{1}{2k}} + M^{1+\frac{1}{2k}} + N^{1+\frac{1}{2k}} + 2LN^{\frac{1}{2}},$$

which implies that

$$|X| < 12kr \cdot \left[(LN)^{\frac{1}{2}} M^{\frac{1}{2}+\frac{1}{2k}} + N^{1+\frac{1}{2k}}\right].$$

This is precisely the statement of Theorem 2.3.
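The equipartition probability used in this proof can be checked by brute force for small weights. The Python sketch below (helper names are ours, not the paper's) lists all $3^{\omega}$ colorings of the ω non-zero co-ordinates of a single vector, counts those with the prescribed color counts, and compares the result with the product of binomial coefficients divided by $3^{\omega}$. It reads $[\omega/3]$ as the remaining count $\omega - \lfloor \omega/3 \rfloor - \lceil \omega/3 \rceil$, which is an assumption about the notation. The comparison with $1/(3\omega)$ is non-strict in the code because the two sides coincide at ω = 1.

```python
from fractions import Fraction
from itertools import product
from math import comb

def target_counts(w):
    """Prescribed numbers of non-zero co-ordinates of colors 1, 2 and 3."""
    lo = w // 3          # floor(w / 3)
    hi = -(-w // 3)      # ceil(w / 3)
    return lo, w - lo - hi, hi

def enumerated_probability(w):
    """Exact probability, obtained by listing all 3^w colorings of w positions."""
    counts = target_counts(w)
    good = sum(
        1
        for coloring in product((1, 2, 3), repeat=w)
        if tuple(coloring.count(c) for c in (1, 2, 3)) == counts
    )
    return Fraction(good, 3 ** w)

def closed_form_probability(w):
    """The same probability written with two binomial coefficients."""
    lo, _, hi = target_counts(w)
    return Fraction(comb(w, lo) * comb(w - lo, hi), 3 ** w)

for w in range(1, 8):
    p = enumerated_probability(w)
    # Columns: weight, exact probability, agreement with closed form, p >= 1/(3w)
    print(w, p, p == closed_form_probability(w), p >= Fraction(1, 3 * w))
```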

Lemma 3.8. Let ω be a positive integer. Then

$$\binom{\omega}{\lfloor \omega/3 \rfloor} \binom{\omega - \lfloor \omega/3 \rfloor}{\lceil \omega/3 \rceil} > \frac{3^{\omega-1}}{\omega}.$$

Proof. Let f(ω) denote the expression on the left in the inequality above. It is not hard to verify that the result is true for ω ∈ {1, 2, 3}. Suppose ω > 3. Using the inequalities

$$n^n e^{-n} (2\pi n)^{\frac{1}{2}} < n! < n^n e^{-n} (2\pi n)^{\frac{1}{2}} e^{\frac{1}{12n}},$$

which are valid for all positive integers n, we get that for all integers s ≥ 1,

$$f(3s) = \binom{3s}{s}\binom{2s}{s} = \frac{(3s)!}{(s!)^3} > \frac{(3s)^{3s} e^{-3s} (6\pi s)^{1/2}}{s^{3s} e^{-3s} (2\pi s)^{3/2} e^{1/(4s)}} > \frac{3^{3s+\frac{1}{2}}}{2\pi s\, e^{1/(4s)}} > \frac{3^{3s}}{4s}.$$

This implies the required inequality when ω is a multiple of 3. We pass to general ω by noting that

$$f(3s+1) = \frac{3s+1}{s+1}\, f(3s) \quad \text{and} \quad f(3s+2) = \frac{(3s+2)(3s+1)}{(s+1)^2}\, f(3s).$$

In particular, for s ≥ 1, f(3s + 1) ≥ 2f(3s) and f(3s + 2) ≥ 5f(3s), which implies the required inequality.
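Lemma 3.8 is easy to test with exact integer arithmetic. The sketch below (function names are illustrative) evaluates f(ω), compares ω·f(ω) with $3^{\omega-1}$, and verifies the two identities used to pass from multiples of 3 to general ω. It is a finite check, not a proof. It also shows that the two sides are equal at ω = 1, so in the tested range the inequality is strict only from ω = 2 onwards.

```python
from math import comb

def f(w):
    """Left-hand side of Lemma 3.8 for a positive integer w."""
    lo = w // 3          # floor(w / 3)
    hi = -(-w // 3)      # ceil(w / 3)
    return comb(w, lo) * comb(w - lo, hi)

def lemma_holds(w):
    """Non-strict form f(w) >= 3^(w-1) / w, tested without division."""
    return w * f(w) >= 3 ** (w - 1)

def recurrences_hold(s):
    """The identities expressing f(3s+1) and f(3s+2) through f(3s)."""
    first = (s + 1) * f(3 * s + 1) == (3 * s + 1) * f(3 * s)
    second = (s + 1) ** 2 * f(3 * s + 2) == (3 * s + 2) * (3 * s + 1) * f(3 * s)
    return first and second

print(all(lemma_holds(w) for w in range(1, 301)))
print(all(recurrences_hold(s) for s in range(1, 100)))
```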

4 Product representations of squares
In this section we prove Theorem 1.3. Before doing so, we require the following simple lemma:
Lemma 4.1. Let n > 1 be a positive integer. Then either n has a prime factor larger than $N^{\frac{1}{2}}$, or n = xyz where $x, y, z \le N^{\frac{1}{2}}$.
Proof. Let $n = p_1 p_2 \cdots p_r$ denote the prime factorization of n into (not necessarily distinct) primes $p_i$ where $p_1 \ge p_2 \ge \cdots \ge p_r$. Suppose $p_1 \le N^{\frac{1}{2}}$. Then we can find a set X of $p_i$'s whose product x is at most $N^{\frac{1}{2}}$ but as close to $N^{\frac{1}{2}}$ as possible. Let y be a prime factor of n which isn't in X. Then $xy \ge N^{\frac{1}{2}}$, and we may take z = n/(xy).
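The proof is constructive, and for small n it can be carried out by exhaustive search. Below is a minimal Python sketch under the assumption N = n (the lemma is stated with both symbols); the helper names `prime_factors` and `split_three` are hypothetical. It picks a sub-multiset of the prime factors with the largest product x satisfying $x^2 \le n$, takes y to be any unused prime factor, and sets z = n/(xy).

```python
from itertools import combinations
from math import prod

def prime_factors(n):
    """Prime factorization of n as a list with multiplicity (trial division)."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def split_three(n):
    """Return (x, y, z) with x*y*z == n and x, y, z <= sqrt(n),
    or None when n has a prime factor larger than sqrt(n). Assumes n > 1."""
    primes = prime_factors(n)
    if max(primes) ** 2 > n:
        return None
    indices = range(len(primes))
    best, best_x = (), 1
    # Exhaustive search over sub-multisets: keep the largest product x with x^2 <= n.
    for size in range(1, len(primes) + 1):
        for subset in combinations(indices, size):
            x = prod(primes[i] for i in subset)
            if x * x <= n and x > best_x:
                best, best_x = subset, x
    # Any prime factor outside the chosen sub-multiset serves as y.
    y = next(primes[i] for i in indices if i not in best)
    return best_x, y, n // (best_x * y)

print(split_three(30), split_three(8), split_three(6))
```

By maximality of x, the product xy exceeds $\sqrt{n}$, which is what forces z below $\sqrt{n}$. The exhaustive search is exponential in the number of prime factors and is meant only as an illustration.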

In what follows we denote by Π(n) the set of all primes in {1, . . . , n}, and let π(n) = |Π(n)| be
the usual prime counting function.

Proof of Theorem 1.3. Let $A \subseteq \{1, 2, \dots, n\}$ be a set such that no product of at most 8k distinct elements of A is a square. Denote by $B \subseteq A$ the set of integers in A which have a prime factor larger than $N^{\frac{1}{2}}$, and write $C = A \setminus B$. By Lemma 4.1 we have that $C = \{a \in A : a = xyz,\ x, y, z \le N^{\frac{1}{2}}\}$. Denote for $0 \le i \le \frac{1}{2}\log_2 n$,

$$P_i = \left\{ p \in \Pi(n) : \frac{n}{2^{i+1}} < p \le \frac{n}{2^{i}} \right\}.$$

Form a bipartite graph $G_i$ with parts $P_i$ and $\{1, \dots, 2^{i+1}\}$ such that $p \in P_i$ is joined to $q \in \{1, \dots, 2^{i+1}\}$ if $pq \in B$. Then $G_i$ does not contain a cycle of length at most 8k, since if for some $2 \le \ell \le 8k$, $p_1 q_1, q_1 p_2, p_2 q_2, \dots, q_{\ell-1} p_1$ is such a cycle then $p_j q_j, p_{j+1} q_j \in A$ are distinct and their product is a square. It was proved in [24] that an M by N bipartite graph of girth at least 2k + 2 has at most $(MN)^{\frac{1}{2}+\frac{1}{2k}} + M + N$ edges. Since $G_i$ has girth at least 8k + 2, we deduce that

$$|E(G_i)| \le \left(2^{i+1} |P_i|\right)^{\frac{1}{2}+\frac{1}{8k}} + 2^{i+1} + |P_i| \le \left(2^{i+1} \left[\pi\!\left(\frac{n}{2^{i}}\right) - \pi\!\left(\frac{n}{2^{i+1}}\right)\right]\right)^{\frac{1}{2}+\frac{1}{8k}} + 2^{i+1} + |P_i|.$$

Adding these inequalities for $i = 0, \dots, \lfloor \frac{1}{2}\log_2 n \rfloor$ gives

$$|B| = \sum_{i=0}^{\lfloor \frac{1}{2}\log_2 n \rfloor} |E(G_i)| \le \pi(n) + O\!\left(n^{\frac{1}{2}+\frac{1}{8k}}\right).$$

We now estimate |C|. For each t ∈ C fix a factorization $t = x_t y_t z_t$ with $x_t, y_t, z_t \le N^{\frac{1}{2}}$. We assume $x_t \ge y_t \ge z_t$, so in particular $z_t \le n^{\frac{1}{3}}$. Denote

$$S = \left\{ (i, j) : 0 \le i \le j,\ i + j + 2 \le \tfrac{1}{3}\log_2 n \right\}.$$

For (i, j) ∈ S let $C_{ij}$ denote the set of all t ∈ C such that

$$\frac{N^{1/2}}{2^{i+1}} < x_t \le \frac{N^{1/2}}{2^{i}}, \qquad \frac{N^{1/2}}{2^{j+1}} < y_t \le \frac{N^{1/2}}{2^{j}}, \qquad 1 \le z_t \le 2^{i+j+2}.$$

We now apply Theorem 3.7 with $F = \mathbb{F}_2$ and $\lambda = 2^{i+j+2}$, $\mu = N^{\frac{1}{2}}/2^{i+1}$ and $\nu = N^{\frac{1}{2}}/2^{j+1}$. Each $t \in C_{ij}$ may be considered as a vector of weight three in $F^{\lambda} \times F^{\mu} \times F^{\nu}$: if $t = x_t y_t z_t$ is the prescribed factorization of t then the vector associated with t is the vector of weight three with a one in positions $x_t$, $\mu + y_t$ and $\mu + \nu + z_t$, and zeros elsewhere. Clearly no 8k of these vectors are linearly dependent, otherwise the product of the corresponding elements of $C_{ij}$ is a square. By Theorem 3.7, we have that
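$$\begin{aligned}
|C_{ij}| &< 12kr(\lambda\mu\nu)^{\frac{1}{2}} \mu^{\frac{1}{2k}} + \lambda^{1+\frac{1}{2k}} + \mu^{1+\frac{1}{2k}} + \nu^{1+\frac{1}{2k}} + 2\lambda\nu^{\frac{1}{2}} \\
&\ll k \left( \frac{N^{1/2}}{2^{i+1}} \cdot \frac{N^{1/2}}{2^{j+1}} \cdot 2^{i+j+2} \right)^{\frac{1}{2}} \left( \frac{N^{1/2}}{2^{i+1}} \right)^{\frac{1}{2k}} + \left( \frac{N^{1/2}}{2^{i+1}} \right)^{1+\frac{1}{2k}} + 2^{i+j} \left( \frac{N^{1/2}}{2^{j+1}} \right)^{\frac{1}{2}} \\
&\ll k n^{\frac{1}{2}+\frac{1}{2k}} + 2^{i+\frac{j}{2}} n^{\frac{1}{4}}.
\end{aligned}$$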

Finally, we sum this inequality over all (i, j) ∈ S. We chose λ, µ, ν carefully to ensure that the sum of the last term over (i, j) ∈ S is $O(n^{\frac{1}{2}})$. Therefore we have

$$|C| \ll k n^{\frac{1}{2}+\frac{1}{2k}} \cdot |S| + O(n^{\frac{1}{2}}) \ll k n^{\frac{1}{2}+\frac{1}{2k}} (\log n)^2.$$

This concludes the proof of Theorem 1.3.
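The correspondence used in this proof, between linear dependences over $\mathbb{F}_2$ and products that are squares, can be made concrete. The Python sketch below represents each integer by the set of primes dividing it to an odd power (its exponent vector mod 2) and runs Gaussian elimination over $\mathbb{F}_2$ to find distinct elements whose product is a square. This is the textbook linear-algebra procedure, not the algorithm of this paper, and it makes no attempt to keep the dependence short (for instance of size at most 8k). All function names are ours, and the input elements are assumed to be distinct.

```python
from math import isqrt, prod

def odd_primes(t):
    """Set of primes dividing t to an odd power (exponent vector mod 2)."""
    odd, p = set(), 2
    while p * p <= t:
        exponent = 0
        while t % p == 0:
            t //= p
            exponent += 1
        if exponent % 2:
            odd.add(p)
        p += 1
    if t > 1:
        odd.add(t)
    return odd

def square_subset(A):
    """Return a sorted list of distinct elements of A whose product is a
    perfect square, or None if the parity vectors are independent over F_2."""
    basis = {}  # pivot prime -> (reduced vector, elements combined to form it)
    for a in A:
        vec, members = odd_primes(a), {a}
        while vec and max(vec) in basis:
            bvec, bmembers = basis[max(vec)]
            vec ^= bvec          # addition of vectors over F_2
            members ^= bmembers  # track which elements were combined
        if not vec:
            return sorted(members)
        basis[max(vec)] = (frozenset(vec), frozenset(members))
    return None

subset = square_subset([2, 3, 6, 10, 15])
root = isqrt(prod(subset))
print(subset, prod(subset), root * root == prod(subset))
```

A set A as in Theorem 1.3 is one for which no dependence involving at most 8k vectors exists; the sketch only detects whether some dependence exists at all.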

5 Acknowledgements
We are grateful to Noga Alon and Henry Cohn for helpful comments.

References
[1] N. Alon, L. Babai, and A. Itai. A fast and simple randomized parallel algorithm for the
maximal independent set problem. J. Algorithms, 7(4):567–583, 1986.

[2] N. Alon, O. Goldreich, J. Håstad, and R. Peralta. Simple constructions of almost k-wise
independent random variables. Random Structures Algorithms, 3(3):289–304, 1992.

[3] N. Alon, S. Hoory, and N. Linial. The Moore bound for irregular graphs. Graphs Combin.,
18(1):53–57, 2002.

[4] N. Alon and J. H. Spencer. The probabilistic method. Wiley-Interscience Series in Discrete
Mathematics and Optimization. Wiley-Interscience [John Wiley & Sons], New York, second
edition, 2000. With an appendix on the life and work of Paul Erdős.

[5] N. Alon, R. Yuster, and U. Zwick. Color-coding. J. Assoc. Comput. Mach., 42(4):844–856,
1995.

[6] N. Alon, R. Yuster, and U. Zwick. Finding and counting given length cycles. Algorithmica,
17(3):209–223, 1997.

[7] L. Bazzi, M. Mahdian, and D. A. Spielman. The minimum distance of Turbo-like codes.
Preprint, 2003.

[8] C. Berrou, A. Glavieux, and P. Thitimajshima. Near Shannon Limit Error Correcting Codes
and Decoding: Turbo Codes. In Proceedings of IEEE International Communications
Conference, pages 1064–1070. 1993.

[9] C. Bertram-Kretzberg, T. Hofmeister, and H. Lefmann. Sparse 0-1 matrices and forbidden
hypergraphs. Combin. Probab. Comput., 8(5):417–427, 1999.

[10] C. Bertram-Kretzberg and H. Lefmann. MODp-tests, almost independence and small
probability spaces. Random Structures Algorithms, 16(4):293–313, 2000.

[11] A. Beutelspacher and U. Rosenbaum. Projective geometry: from foundations to applications.
Cambridge University Press, Cambridge, 1998.

[12] B. Bollobás. Extremal graph theory. Dover Publications Inc., Mineola, NY, 2004. Reprint of
the 1978 original.

[13] J. Bondy and M. Simonovits. Cycles of even length in graphs. J. Combinatorial Theory B,
16:97–105, 1974.

[14] M. Breiling. A logarithmic upper bound on the minimum distance of Turbo codes. Preprint,
2001.

[15] A. E. Brouwer. Block designs. In Handbook of combinatorics, Vol. 1, 2, pages 693–745. Elsevier,
Amsterdam, 1995.

[16] M. C. Davey and D. J. C. MacKay. Low-density parity check codes over GF (q). IEEE
Communications Letters, 2(6):165–167, 1998.

[17] J. D. Dixon. Asymptotically fast factorization of integers. Math. Comp., 36(153):255–260,
1981.

[18] P. Erdős. On some applications of graph theory to number theoretic problems. Publ.
Ramanujan Inst., 1:131–136, 1969.

[19] P. Erdős, W. G. Brown, and V. T. Sós. Some extremal problems on r-graphs. In New directions
in the theory of graphs (Proc. 3rd Ann Arbor Conference on Graph Theory), pages 55–63.
Academic Press, New York, 1973.

[20] P. Erdős and D. J. Kleitman. On coloring graphs to maximize the proportion of multicolored
k-edges. J. Combinatorial Theory, 5:164–169, 1968.

[21] P. Erdős, A. Sárközy, and V. T. Sós. On product representations of powers. I. European J.
Combin., 16(6):567–588, 1995.

[22] R. G. Gallager. Low Density Parity Check Codes. MIT Press, Cambridge MA, 1963. Research
Monograph Series, no. 21.

[23] E. Györi. C6 -free bipartite graphs and product representation of squares. Discrete Math.,
165/166:371–375, 1997. Graphs and combinatorics (Marseille, 1995).

[24] S. Hoory. The size of bipartite graphs with a given girth. J. Combin. Theory Ser. B, 86(2):215–
220, 2002.

[25] N. Kahale and R. Urbanke. On the minimum distance of parallel and serially concatenated
codes. IEEE Trans. Inform. Theory. To appear.

[26] H. Lefmann. Sparse parity-check matrices over finite fields (extended abstract). In Computing
and combinatorics, volume 2697 of Lecture Notes in Comput. Sci., pages 112–121. Springer,
Berlin, 2003.

[27] H. Lefmann, P. Pudlák, and P. Savický. On sparse parity check matrices. Des. Codes Cryptogr.,
12(2):107–130, 1997.

[28] D. J. C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE Trans.
Inform. Theory, 45(2):399–431, 1999.

[29] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. I. North-Holland
Publishing Co., Amsterdam, 1977. North-Holland Mathematical Library, Vol. 16.

[30] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. II. North-Holland
Publishing Co., Amsterdam, 1977. North-Holland Mathematical Library, Vol. 16.

[31] A. Naor and J. Verstraëte. A note on bipartite graphs without a 2k-cycle. Preprint, 2003.

[32] C. Pomerance. A tale of two sieves. Notices Amer. Math. Soc., 43(12):1473–1485, 1996.

[33] C. Pomerance and A. Sárközy. Combinatorial number theory. In Handbook of combinatorics,
Vol. 1, 2, pages 967–1018. Elsevier, Amsterdam, 1995.

[34] G. N. Sárközy. Cycles in bipartite graphs and an application in number theory. J. Graph
Theory, 19(3):323–331, 1995.

[35] M. Sipser and D. A. Spielman. Expander codes. IEEE Trans. Inform. Theory, 42(6, part
1):1710–1722, 1996. Codes and complexity.

[36] D. A. Spielman. Linear-time encodable and decodable error-correcting codes. IEEE Trans.
Inform. Theory, 42(6, part 1):1723–1731, 1996. Codes and complexity.

[37] J. Verstraëte. On arithmetic progressions of cycle lengths in graphs. Combin. Probab. Comput.,
9(4):369–373, 2000.

[38] H. C. Williams and J. O. Shallit. Factoring integers before computers. In Mathematics of
Computation 1943–1993: a half-century of computational mathematics (Vancouver, BC, 1993),
volume 48 of Proc. Sympos. Appl. Math., pages 481–531. Amer. Math. Soc., Providence, RI,
1994.
