Parity Principle
Abstract
Let NF (n, k, r) denote the maximum number of columns in an n-row matrix with entries in
a finite field F in which each column has at most r nonzero entries and every k columns are
linearly independent over F. We obtain near-optimal upper bounds for $N_F(n, k, r)$ in the case
$k > r$. Namely, we show that $N_F(n, k, r) \ll n^{\frac{r}{2} + \frac{cr}{k}}$ where $c \approx \frac{4}{3}$ for large k. Our method is
based on a novel reduction of the problem to the extremal problem for cycles in graphs, and
yields a fast algorithm for finding short linear dependences. We present additional applications
of this method to problems in extremal hypergraph theory and combinatorial number theory.
“It is certainly odd to have an instruction in an algorithm asking you to play with some numbers
to find a subset with product a square . . . Why should we expect to find such a subsequence, and,
if it exists, how can we find it efficiently?”
1 Introduction
Since the mid-1930s it has been well established that there is a tight connection between combi-
natorial number theory and extremal combinatorics (see [33, 12] and the references therein). The
basic paradigm is that for certain number theoretical problems one can construct a combinatorial
object (e.g. a graph or a hypergraph), and prove that it cannot contain certain configurations (e.g.
cycles). Thus, in many cases one can use theorems on excluded configurations in extremal combi-
natorics to bound the size of the combinatorial structure that was constructed, and this translates
back to give number theoretical consequences. The present paper develops a novel reduction of
this type, and applies it to two problems in algebraic combinatorics, namely to coding theory and
combinatorial number theory. Additionally, our proofs are constructive, and thus yield the best
known algorithms for several natural computational problems.
Our original motivation comes from a problem in coding theory. Low density parity check
codes were introduced by Gallager in the 1960s, and have since found numerous theoretical and
practical applications in engineering and computer science (see [22, 29, 30] for an account of this
theory. We also refer to [11] for a nice introduction to the geometry of codes). Given a linear code
$C \subseteq F_2^m$ of dimension $\ell$ and minimum Hamming weight t, an $(m - \ell) \times m$ matrix H is called a
parity check matrix of the code C if $C = \{v \in F_2^m : Hv = 0\}$. We shall say that H is r-sparse
if every column of H has at most r non-zero entries. The Syndrome Decoding Algorithm for such
codes works as follows: given a corrupted signal z one computes the vector x of minimal weight
satisfying Hz = Hx, and decodes z to z − x (this algorithm corrects at most t/2 errors). As such
computations are faster if the sparseness of H is exploited, it is desirable to obtain codes with
sparse parity check matrices. Indeed, sparse parity check matrices occur in many of the known
constructions of codes, e.g. codes based on bounded degree graphs such as expander codes [35, 36],
and we also refer to [28] for theoretical and experimental coding theory applications of very sparse
matrices (we stress here that the present paper deals with a different range of parameters – our
bounds will be for codes in which the minimal weight is not proportional to the dimension. Such
codes occur in several contexts, e.g. certain BCH and Reed-Solomon codes [29], Turbo and Turbo-
like codes [8, 25, 14, 7]). Additionally, the above discussion makes sense for parity check matrices
over arbitrary finite fields, which are also used in coding theory (see [29, 30] for basic information
on this topic, and [16] for empirical results on such codes). Finally, sparse parity check matrices
are the key ingredient in the construction of small probability spaces and deterministic simulations
of k-wise independent random variables, which are a key tool in derandomization [1, 2, 4, 10].
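The syndrome decoding step described above can be illustrated on a toy example. The following is a minimal sketch, assuming the field is $F_2$ and using the parity check matrix of the [7,4] Hamming code as a stand-in (this matrix is our choice for illustration and is not taken from the paper). The brute-force search over error patterns ignores sparseness entirely; it is only meant to make the map from z to z − x concrete.

```python
from itertools import combinations

def syndrome(H, v):
    """Compute H v over F_2; H is a list of rows, v a list of bits."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def syndrome_decode(H, z):
    """Return z - x, where x is a minimum-weight vector with H x = H z.

    Brute force over supports of increasing weight (illustration only).
    """
    m = len(z)
    target = syndrome(H, z)
    for weight in range(m + 1):
        for support in combinations(range(m), weight):
            x = [1 if i in support else 0 for i in range(m)]
            if syndrome(H, x) == target:
                return [(a - b) % 2 for a, b in zip(z, x)]
    return None  # unreachable: x = z always matches its own syndrome

# Hypothetical example matrix: column j is the binary expansion of j + 1.
HAMMING_H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
```

For instance, `syndrome_decode(HAMMING_H, [1, 1, 1, 0, 0, 0, 1])` recovers the codeword `[1, 1, 1, 0, 0, 0, 0]`, since the received word differs from it in a single position.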
Somewhat surprisingly, in spite of their importance, there was a large gap between the known
upper and lower bounds for the maximal number of columns of sparse parity check matrices. For
a finite field F let $N_F(n, k, r)$ be the maximal number of vectors in $F^n$ with at most r non-zero
coordinates such that no k of them are linearly dependent (observe that the linear independence
condition corresponds to the fact that the kernel of the matrix whose rows are the given vectors
is a code with minimal distance at least k + 1). When $F = F_2$ we use the notation $N_{F_2}(n, k, r) = N(n, k, r)$. The problem dealt with in this paper, namely that of estimating $N_F(n, k, r)$, differs
from the classical Gilbert-Varshamov bounds (see e.g. [29]), since the classical bounds on sizes of
codes are geometric packing bounds which depend only on the minimum distance of the code. Here
we introduce an additional algebraic restriction on the code (the existence of a sparse parity check
matrix) which is motivated by computational issues. Thus, we are dealing with a mixture of a
geometric and algebraic problem. In this paper we are primarily interested in the case that k and
r are fixed and n → ∞, although the results are valid for arbitrary k and r.
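To make the definition of $N(n, k, r)$ concrete, here is a small brute-force sketch over $F_2$. It assumes vectors are encoded as integer bitmasks (an encoding chosen here for illustration), and checks the two conditions in the definition: every vector has weight at most r, and no nonempty set of at most k vectors sums to zero. The search is exponential in k and is only suitable for tiny examples.

```python
from itertools import combinations

def weight(v):
    """Hamming weight of a bitmask-encoded vector over F_2."""
    return bin(v).count("1")

def is_admissible(vectors, k, r):
    """True if all weights are <= r and every set of <= k vectors is independent."""
    if any(weight(v) > r for v in vectors):
        return False
    for size in range(1, k + 1):
        for subset in combinations(vectors, size):
            total = 0
            for v in subset:
                total ^= v  # addition over F_2 is XOR
            if total == 0:
                return False  # found a linear dependence of size <= k
    return True
```

The largest family passing `is_admissible` for given n, k, r has size $N(n, k, r)$ by definition.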
Throughout this paper we use the following notation: given two non-negative sequences $\{a_n\}_{n=1}^{\infty}$,
$\{b_n\}_{n=1}^{\infty}$, we write $a_n \ll b_n$ if there exists a constant C > 0 such that for all n, $a_n \le C \cdot b_n$.
This lower bound was generalized to arbitrary finite fields in [26] (in which case the constant also
depends on the size of the field).
In [27] it was shown that when k is a power of 2, $N(n, k, r) \ll n^{\frac{1}{2}\lceil r + \frac{r}{k-1} \rceil}$, which coincides with
the probabilistic lower bound (up to factors independent of n) when k − 1 divides r. This upper
bound was generalized to arbitrary finite fields in [26]. Observe that in the important case k > r,
i.e. when the number of correctable errors is greater than the weight, this upper bound becomes:
for k a power of 2, $N_F(n, k, r) \ll n^{\frac{r}{2} + \frac{1}{2}}$. Thus the gap between the exponent of n in this bound and
the probabilistic lower bounds deteriorates as k grows. Here we resolve this problem by proving
the following theorem in Section 3:
As an example of an application to extremal hypergraph theory, it is notoriously difficult to
determine which configurations of triples are guaranteed to appear in every large enough Steiner
triple system (for example, many of the questions discussed in [19] remain open). In the present
context, it is known that there are infinitely many Steiner triple systems which do not contain an
even cover of four triples (one is constructed in [27]). Using Theorem 1.2, we can at least guarantee
small even covers in Steiner triple systems. For if S is a Steiner triple system on n points, then S
has $\frac{1}{3}\binom{n}{2}$ edges [15], which is larger than the bound in Theorem 1.2 for r = 3 and k = 16, so every
large enough Steiner triple system contains an even cover of size at most sixteen. We leave open
the problem of determining if smaller even covers exist in every Steiner triple system.
the case that r is even, we will give a short proof of Theorem 1.1 which itself yields an algorithm
whose running time is linear in |X|. These algorithms will follow directly from our proof of Theorem
1.1 and the Alon-Yuster-Zwick algorithm [5, 6] for finding cycles in graphs, or alternatively the
proof of the main theorem in [37], which gives better constants. Additionally, our proof yields an
algorithm which, given a set $A \subseteq \{1, \ldots, n\}$ with $|A| \gg k n^{\frac{1}{2} + \frac{1}{2k}} (\log n)^2$, finds in quadratic time
distinct $a_1, \ldots, a_j \in A$ with $1 \le j \le 8k$ such that $a_1 a_2 \cdots a_j$ is a square.
Theorem 2.1. Let k be a positive integer and let G = (V, E) be an N-vertex graph. If G contains
no cycle of length exactly 2k then
$$|E| < 2k N^{1 + \frac{1}{k}}. \qquad (1)$$
If G is an M by N bipartite graph containing no cycle of length 2k, then
$$|E| < 2k \cdot \left[ M^{\frac{1}{2}} N^{\frac{1}{2} + \frac{1}{k}} + M + N \right]. \qquad (2)$$
If G has girth at least 2k + 1, then the factor 2k may be removed in each of the upper bounds.
Theorem 2.1 is proved in [31], the last statement is proved in [3], the bipartite case of which
is proved in [24]. The first assertion (1), with the same dependence on n but a worse constant, is
Erdős’ Even Cycle Theorem – see [13].
The bound in Theorem 2.2 implies in particular that $N_F(n, 2k, r) \ll n^{\frac{r}{2} + \frac{1}{2}}$ for all k. This
generalizes the same bound which was proved for k a power of two in [27] to all values of k. The
proof of Theorem 2.2 is short, and we present it here:
Proof of Theorem 2.2. For each vector $v \in X$ of weight $\omega \le r$, fix a pair $e(v) = \{x, y\}$ of
vectors in $F^n$ of weights at most $\lfloor \omega/2 \rfloor$ and $\lceil \omega/2 \rceil$, respectively, satisfying $x + y = v$. If r is
even, let G be the graph whose vertex set consists of all vectors in $F^n$ of weight at most r/2
and whose edge set is $E = \{e(v) : v \in X\}$. Then $|E| = |X|$ and G has M vertices. By the
first assertion (1) in Theorem 2.1, G contains a cycle of length 2k. By the definition of G, there
exist distinct vectors $v_1, v_2, \ldots, v_{2k} \in X$ such that the edge set of this cycle consists of the pairs
$e(v_i) = \{x_i, x_{i+1}\}$ for $i = 1, 2, \ldots, 2k$, where $x_{2k+1} = x_1$. Then
$$\sum_{i=1}^{2k} (-1)^{i+1} v_i = (x_1 + x_2) - (x_2 + x_3) + \cdots - (x_{2k} + x_1) = 0,$$
and the disjoint sets $A = \{v_1, v_3, \ldots, v_{2k-1}\}$ and $B = \{v_2, v_4, \ldots, v_{2k}\}$ have the same sum. If r is
odd, then G is a bipartite graph whose parts have sizes M and N. By the second assertion (2) of
Theorem 2.1, $C_{2k} \subseteq G$, and we conclude by the same argument as that presented above.
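The reduction in this proof can be sketched in code for the special case $F = F_2$. The sketch below makes several simplifying assumptions that are ours, not the paper's: vectors are encoded as frozensets of their supports, the split $e(v)$ takes the lower half of the sorted support as x and the rest as y, and the cycle is found with a union-find structure plus a breadth-first search rather than with the algorithm of [5, 6]. Over $F_2$ signs do not matter, so any cycle (not only one of length exactly 2k) yields a set of vectors summing to zero; the sketch therefore does not control the length of the dependence.

```python
from collections import deque

def split(v):
    """Split a support set into parts of sizes floor(w/2) and ceil(w/2)."""
    s = sorted(v)
    half = len(s) // 2
    return frozenset(s[:half]), frozenset(s[half:])

def tree_path(forest, src, dst):
    """Edge labels along the unique forest path from src to dst."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        a = queue.popleft()
        if a == dst:
            break
        for b, v in forest.get(a, []):
            if b not in prev:
                prev[b] = (a, v)
                queue.append(b)
    labels, a = [], dst
    while prev[a] is not None:
        a, v = prev[a]
        labels.append(v)
    return labels

def find_dependency(vectors):
    """Return vectors on a cycle of the split graph (they sum to 0 over F_2), or None."""
    parent, forest = {}, {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for v in dict.fromkeys(vectors):  # drop duplicates, keep order
        x, y = split(v)
        rx, ry = find(x), find(y)
        if rx == ry:  # edge {x, y} closes a cycle
            return tree_path(forest, x, y) + [v]
        parent[rx] = ry
        forest.setdefault(x, []).append((y, v))
        forest.setdefault(y, []).append((x, v))
    return None
```

Each returned vector labels one edge of the cycle, and every cycle vertex appears in exactly two of those edges, which is why the supports cancel in pairs.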
Theorem 2.2 implies Theorem 1.1 in the case of even r: in fact in this case we obtain the
stronger bound $N_F(n, k, r) \ll n^{\frac{r}{2} + \frac{r}{k}}$ for k even. We also remark that the proof above gives a linear
time algorithm, that is, time O(|X|), for finding 2k linearly dependent vectors in a set $X \subseteq F^n$
satisfying the requirements of the theorem. This follows from the linear time algorithm in [5, 6] for
finding a cycle of length 2k in a graph with appropriately many edges.
leads to a more involved argument in the case of odd r, which nevertheless yields the bounds of
Theorem 1.1 in this case as well. We conclude the proof of Theorem 1.1 by giving the following
refinement of Theorem 2.2 when k > r and r is odd. For convenience we define $[\frac{r}{3}] = r - \lceil \frac{r}{3} \rceil - \lfloor \frac{r}{3} \rfloor$
and $q = |F^*|$:
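Theorem 2.3. Let n, k and r be positive integers, and define
$$L = \sum_{\ell=0}^{\lfloor r/3 \rfloor} q^{\ell} \binom{n}{\ell}, \qquad M = \sum_{\ell=0}^{[r/3]} q^{\ell} \binom{n}{\ell}, \qquad N = \sum_{\ell=0}^{\lceil r/3 \rceil} q^{\ell} \binom{n}{\ell}.$$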
In particular, it follows that $N_F(n, 8k, r) \ll n^{\frac{r}{2} + \frac{\lceil r/3 \rceil}{2k}}$, which gives Theorem 1.1 when we replace
k with $\lfloor k/8 \rfloor$. The remainder of the paper is devoted to the proof of Theorem 2.3.
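The quantities L, M and N in Theorem 2.3 count the vectors of weight at most $\lfloor r/3 \rfloor$, $[r/3]$ and $\lceil r/3 \rceil$ in $F^n$, where each nonzero coordinate has $q = |F^*|$ possible values. A minimal sketch for evaluating them follows; the function names are ours and purely illustrative.

```python
from math import comb

def thirds(r):
    """Return (floor(r/3), [r/3], ceil(r/3)) with [r/3] = r - ceil(r/3) - floor(r/3)."""
    lo = r // 3
    hi = -(-r // 3)  # ceiling division
    return lo, r - lo - hi, hi

def ball_size(n, q, top):
    """Number of vectors in F^n of weight at most `top`, where q = |F*|."""
    return sum(q ** l * comb(n, l) for l in range(top + 1))

def theorem_2_3_parameters(n, r, q):
    """Return the triple (L, M, N) defined in Theorem 2.3."""
    lo, mid, hi = thirds(r)
    return ball_size(n, q, lo), ball_size(n, q, mid), ball_size(n, q, hi)
```

For F = F_2 one has q = 1, so for n = 10 and r = 5 the weights split as (1, 2, 2) and the call `theorem_2_3_parameters(10, 5, 1)` gives L = 11 and M = N = 56.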
3 Proof of Theorem 1.1
We already proved Theorem 1.1 in the case that r is even. In this section we prove Theorem 2.3
which implies Theorem 1.1 when r is odd. Since the proof is quite involved, we begin with an
outline of the proof. For simplicity we omit all the multiplicative constants. Starting with a set
$X \subseteq F^n$ where $|X| \gg (LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}} + N^{1 + \frac{1}{2k}}$, we wish to find a linearly dependent set of 8k vectors
in X. Suppose that there is no such subset of X. Via an averaging argument (see §3.6), we will
show that it is sufficient to find such a linear dependence in a set $Z \subseteq F^L \times F^M \times F^N$, where
$|Z| \gg |X|$ and where the projection of Z onto any of the subspaces $F^L$, $F^M$ and $F^N$ consists of
vectors of weight one. We call such sets balanced.
To find linear dependences in balanced sets Z where $|Z| \gg |X|$, we first prove in §3.1 that there
is a set $Y \subseteq Z$ where $|Y| > |Z| - L^{1 + \frac{1}{2k}} - M^{1 + \frac{1}{2k}} - N^{1 + \frac{1}{2k}}$ and the projection of Y onto any one
of the subspaces $F^L \times F^M$, $F^L \times F^N$ and $F^M \times F^N$ uniquely determines Y. In this case we say
that Y is determined by projection. This allows us in §3.2 to count special four-element subsets of
Y called partially dependent quadruples: these are sets of four vectors in Y whose projection onto
$F^L \times F^N$ is a linearly dependent set of four vectors. The key point in the proof is to prove that
there are enough of these quadruples in Y to ensure that the projection of 2k of these quadruples
onto $F^M$ results in a linearly dependent set. Then this gives 8k linearly dependent vectors in Y,
and the required contradiction.
To obtain a feeling for where the bound in Theorem 2.3 comes from, we give some of the bounds
involved in the proof. Let Q denote the set of partially dependent quadruples in Y. We will show
in §3.2 that
$$|Q| > \frac{|Y|^4}{4L^2N^2}.$$
Now $|Y| \gg (LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}}$, so this inequality gives $|Q| \gg M^{2 + 2/k}$. Since the projection of Y
onto $F^M$ consists of vectors of weight one, the projection of each quadruple in Q onto $F^M$ consists
of four vectors of weight one. If we treat these projections as vectors of weight four in $F^M$, then
there are $|Q| \gg M^{2 + 2/k}$ of these vectors. By Theorem 2.2 with r = 4, this shows that there are
2k quadruples in Q whose projections onto $F^M$ form a linearly dependent set of 2k vectors. In
principle, we then consider all the vectors in the corresponding 2k quadruples in Q to obtain a
linearly dependent set of at most 8k vectors in Y, and this contradiction completes the proof of
Theorem 2.3.
However, there is a subtlety, which is that the vectors in the 2k quadruples in Q might form a
trivial dependence in Y . One reason might be that each vector appears with coefficient zero modulo
p in the linear dependence, where p is the characteristic of F, in which case the linear dependence
is trivial. In the proof we consider only special types of linear dependences of projections of
quadruples onto FM , which guarantee that when we lift back to the quadruples themselves, the
linear dependence is not trivial. This approach is covered in §3.3 and §3.4.
Similarly, $Z_{\alpha\beta}$ denotes the projection of Z onto $F^\alpha \times F^\beta$. For $v \in F^\alpha \times F^\beta$ let $Z^v = \{z \in Z : z_{\alpha\beta} = v\}$.
A set $Y \subseteq F^\alpha \times F^\beta \times F^\gamma$ is determined by projection if any one of the sets $Y_{\alpha\beta}$, $Y_{\beta\gamma}$, $Y_{\alpha\gamma}$ uniquely
determines Y. The following lemma says that 8k-wise independent sets contain large subsets which
are determined by projection:
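Lemma 3.1. Let $Z \subseteq F^\alpha \times F^\beta \times F^\gamma$ be a balanced 8k-wise independent set. Then there exists a set
$Y \subseteq Z$ such that Y is determined by projection and
$$|Y| > |Z| - \lambda^{1 + \frac{1}{2k}} - \mu^{1 + \frac{1}{2k}} - \nu^{1 + \frac{1}{2k}}. \qquad (3)$$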
Proof. For $v \in Z_{\alpha\beta}$, let T(v) be a spanning tree of the complete graph on $Z^v$. So $|E(T(v))| = |Z^v| - 1$. We claim that the trees $\{T(v)\}_{v \in Z_{\alpha\beta}}$ can be chosen so that the multigraph $G_{\alpha\beta}$ consisting
of all edges in all the trees $\{T(v)\}_{v \in Z_{\alpha\beta}}$ has girth greater than 4k. This is done by choosing the
trees $\{T(v)\}_{v \in Z_{\alpha\beta}}$ so that the girth of $G_{\alpha\beta}$ is a minimum. This choice implies that if C is a shortest
cycle in $G_{\alpha\beta}$, then $|C \cap T(v)| \le 1$ for all $v \in Z_{\alpha\beta}$. We aim to show that $|C| > 4k$. Suppose the edges
of C are $\{\{w_1, w_2\}, \{w_2, w_3\}, \ldots, \{w_\ell, w_1\}\}$. Then there are distinct $v_j = (v_\alpha^j, v_\beta^j) \in Z_{\alpha\beta}$ such that
$\{w_j, w_{j+1}\} \in T(v_j)$ for all $j \le \ell$. Let $x_j = (v_j, w_j) \in Z$ and $y_j = (v_j, w_{j+1}) \in Z$ for $j \le \ell$. Then
$$\sum_{j=1}^{\ell} x_j - \sum_{j=1}^{\ell} y_j = 0.$$
Now $x_i$ and $y_j$ are distinct for all $i, j \le \ell$. This means that $\{x_1, x_2, \ldots, x_\ell, y_1, y_2, \ldots, y_\ell\} \subseteq Z$ is
linearly dependent. Since Z is 8k-wise independent, it follows that $\ell > 4k$, as required, so $G_{\alpha\beta}$ has
girth greater than 4k. The number of edges in $G_{\alpha\beta}$ is
$$|E(G_{\alpha\beta})| = \sum_{v \in Z_{\alpha\beta}} (|Z^v| - 1). \qquad (4)$$
On the other hand, since $G_{\alpha\beta}$ has at most $\nu$ vertices (since Z is balanced), we conclude from
Theorem 2.1 that $|E(G_{\alpha\beta})| < \nu^{1 + \frac{1}{2k}}$. We may define $G_{\beta\gamma}$ and $G_{\alpha\gamma}$ similarly, and by symmetry
$|E(G_{\beta\gamma})| < \lambda^{1 + \frac{1}{2k}}$ and $|E(G_{\alpha\gamma})| < \mu^{1 + \frac{1}{2k}}$. Using (4), these inequalities translate to
$$\sum_{v \in Z_{\alpha\beta}} (|Z^v| - 1) < \nu^{1 + \frac{1}{2k}},$$
$$\sum_{v \in Z_{\alpha\gamma}} (|Z^v| - 1) < \mu^{1 + \frac{1}{2k}},$$
$$\sum_{v \in Z_{\beta\gamma}} (|Z^v| - 1) < \lambda^{1 + \frac{1}{2k}}.$$
Finally, the number of vectors in Z which are not determined by projection is exactly the sum of
the three terms on the left in the above inequalities. So the number of vectors in Z which are
determined by projection is greater than
$$|Z| - \lambda^{1 + \frac{1}{2k}} - \mu^{1 + \frac{1}{2k}} - \nu^{1 + \frac{1}{2k}}.$$
Let Y be the set of all these vectors; then Y satisfies the requirements of the lemma.
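The property of being determined by projection is easy to test on small examples. The sketch below assumes $F = F_2$, so that a balanced vector can be identified with a triple (a, b, c) recording the position of its single nonzero coordinate in each of the three blocks. This encoding is an assumption made here for illustration. Under it, Y is determined by projection exactly when each of the three coordinate-pair projections is injective on Y.

```python
def is_determined_by_projection(Y):
    """True if every projection of Y onto two of the three blocks is injective.

    Y is an iterable of triples (a, b, c) encoding balanced vectors over F_2.
    """
    Y = set(Y)
    for dropped in range(3):
        projection = {
            tuple(c for i, c in enumerate(y) if i != dropped) for y in Y
        }
        if len(projection) != len(Y):
            return False  # two vectors share this projection
    return True
```

A Latin square, such as the set of triples (a, (a + c) mod m, c), is a natural example of a set with this property.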
3.2 Partially dependent quadruples
For the remainder of the proof of Theorem 2.3, we restrict our attention to a set Y ⊆ Z which
is 8k-wise independent and determined by projection, and satisfies (3). A partially dependent
quadruple in Y is a quadruple {w, x, y, z} ⊆ Y such that (wα , xγ , yα , zγ ) = (xα , yγ , zα , wγ ). When it
is convenient, we represent this quadruple as an ordered 4-tuple (w, x, y, z) with the understanding
that the non-zero co-ordinate of wα precedes the non-zero co-ordinate of yα and the non-zero
co-ordinate of wγ precedes the non-zero co-ordinate of yγ . A partially dependent quadruple is
illustrated in Figure 1.
[Figure 1. A partially dependent quadruple; labels: $F^\beta$, $F^\alpha \times F^\gamma$.]
The key point is that the projection of a partially dependent quadruple onto Fα ×Fγ is a linearly
dependent set. Therefore our aim is to try to find a linearly dependent set of 2k projections of
quadruples in Y onto Fβ . Since each quadruple consists of four vectors in Y , altogether this gives
8k linearly dependent vectors in Y . Let Q denote the set of partially dependent quadruples in Y .
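Partially dependent quadruples can be counted directly on small examples. The sketch below again assumes $F = F_2$ and the encoding of balanced vectors as triples (a, b, c) of block positions (our assumption), and it assumes Y is determined by projection. A quadruple then corresponds to two distinct α-positions sharing two distinct γ-positions, that is, to a 4-cycle in the bipartite incidence graph between α-positions and γ-positions.

```python
from collections import defaultdict
from itertools import combinations
from math import comb

def count_partially_dependent_quadruples(Y):
    """Count quadruples {w, x, y, z} with w_a = x_a, x_c = y_c, y_a = z_a, z_c = w_c.

    Y is an iterable of triples (a, b, c); Y is assumed to be determined by
    projection, so a pair (a, c) identifies at most one element of Y.
    """
    gamma_by_alpha = defaultdict(set)
    for a, _b, c in Y:
        gamma_by_alpha[a].add(c)
    total = 0
    for a1, a2 in combinations(gamma_by_alpha, 2):
        common = len(gamma_by_alpha[a1] & gamma_by_alpha[a2])
        total += comb(common, 2)  # choose the two shared gamma-positions
    return total
```

For the Latin square example with m positions in each block, every pair of α-positions shares all m γ-positions, so the count is $\binom{m}{2}^2$.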
Lemma 3.2. Suppose that $|Y| > \nu + 2\lambda\nu^{\frac{1}{2}}$. Then
$$|Q| > \frac{|Y|^4}{4\lambda^2\nu^2}. \qquad (5)$$
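Proof. For $v \in F^\alpha$, recall that $Y^v = \{(v_\alpha, v_\beta, v_\gamma) \in Y : v_\alpha = v\}$. We define $Y^v$ similarly when
$v \in F^\beta$ or $v \in F^\gamma$. To prove (5), we use the following identity, which follows from the fact that Y
is determined by projection:
$$\sum_{\{v,x\} \subseteq F^\alpha} |Y^v \cap Y^x| = \sum_{w \in F^\gamma} \binom{|Y^w|}{2}. \qquad (*)$$
It follows that
$$\begin{aligned}
|Q| &= \sum_{\{v,x\} \subseteq F^\alpha} \binom{|Y^v \cap Y^x|}{2} \\
&\ge \binom{\lambda}{2} \binom{\frac{1}{\binom{\lambda}{2}} \sum_{\{v,x\} \subseteq F^\alpha} |Y^v \cap Y^x|}{2} \\
&\overset{(*)}{=} \binom{\lambda}{2} \binom{\frac{1}{\binom{\lambda}{2}} \sum_{w \in F^\gamma} \binom{|Y^w|}{2}}{2} \\
&\ge \binom{\lambda}{2} \binom{\frac{\nu}{\binom{\lambda}{2}} \binom{\frac{1}{\nu} \sum_{w \in F^\gamma} |Y^w|}{2}}{2} \\
&= \binom{\lambda}{2} \binom{\frac{\nu}{\binom{\lambda}{2}} \binom{\nu^{-1}|Y|}{2}}{2} \\
&> \frac{|Y|^4}{4\lambda^2\nu^2}.
\end{aligned}$$
This is exactly (5). In each of the inequalities we used the convexity of the function $a \mapsto \binom{a}{2}$. In
the last inequality we used the lower bound on |Y| assumed in the lemma.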
$$\sum_{i=1}^{k} (s_i + u_i - t_i - v_i) - \sum_{i=1}^{k} (w_i + y_i - x_i - z_i) = 0. \qquad (6)$$
i=1 i=1
We want to find concyclic chains Q and R such that the equation above is a non-trivial linear
dependence of at most 8k vectors in Y (it is possible to construct many examples where this
linear dependence is trivial). To find a non-trivial dependence, it is sufficient to show that some
vector appears in the equation (6) with a coefficient which is not zero modulo p, where p is the
characteristic of F. This will hold for certain special chains which we call nondegenerate chains.
Let $Q = (Q_1, Q_2, \ldots, Q_k)$ be a chain of length k where $Q_i = (s_i, t_i, u_i, v_i)$ for $i \in \{1, 2, \ldots, k\}$.
Then the reduction of Q is the set of vectors defined by
$$\hat{Q} := Q_k \triangle_p Q_{k-1} \triangle_p \cdots \triangle_p Q_1. \qquad (7)$$
The expression above is read from right to left, and the symmetric difference operator $\triangle_p$ is defined
as follows: we delete any vector once it appears with coefficient zero mod p in the sum:
$$\sum_{i=1}^{k} (s_i + u_i - t_i - v_i)$$
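One way to read this definition is sketched below, under an interpretation that is ours: the reduction consists of those vectors whose total coefficient in the signed sum above is nonzero modulo p. Vectors are represented by arbitrary hashable labels, and each quadruple is a 4-tuple (s, t, u, v) contributing s + u − t − v.

```python
from collections import Counter

def reduction(chain, p):
    """Vectors whose coefficient in sum(s + u - t - v) is nonzero mod p.

    `chain` is a sequence of quadruples (s, t, u, v) of hashable labels and
    p is the characteristic of the field (interpretation assumed here).
    """
    coefficient = Counter()
    for s, t, u, v in chain:
        for vector, sign in ((s, 1), (t, -1), (u, 1), (v, -1)):
            coefficient[vector] += sign
    return {vector for vector, c in coefficient.items() if c % p != 0}
```

For p = 2 this is the ordinary symmetric difference of the quadruples viewed as sets, which matches the name of the operator.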
Lemma 3.3. Suppose Q, R are concyclic nondegenerate chains of length k in Q. Then $\hat{Q} = \hat{R}$.
Proof. Let $(w_\beta, y_\beta)$ be the endvertex of the walk $(\pi(Q_1), \pi(Q_2), \ldots, \pi(Q_\ell))$ in G and let $f(Q) = \bigcup_{i=1}^{\ell} Q_i$. Then the number of degenerate chains of length $\ell + 1$ containing Q is equal to the number
of partially dependent quadruples R ∈ Q such that |R ∩ f (Q)| ≥ 2. Let us assume
We claim that exactly two elements of S = {w, x, y, z} ∩ f (Q) uniquely determine R. Once that is
proved, it follows that the number of R such that $(Q_1, Q_2, \ldots, Q_\ell, R)$ is degenerate is at most
$$\binom{|f(Q)|}{2} \le \binom{|Q_1| + |Q_2| + \cdots + |Q_\ell|}{2} = \binom{4\ell}{2} < 8\ell^2$$
as required. We now prove the claim. Since Y is determined by projection, R is uniquely determined
upon specifying the projection of R onto Fα and onto Fγ . So the claim is proved if Sα = {wα , yα }
and Sγ = {wγ , yγ } – since in that case two co-ordinates of each vector in the expression (8) defining
R are specified. If this is not the case, then one checks that S is one of the pairs {w, x}, {x, y}, {y, z}
or {z, w}. These cases are all dealt with in the same way, so we check only the case S = {w, x}.
Since yγ is uniquely determined by yα and yβ , y is uniquely determined by S. Hence {w, x, y} are
specified, and therefore {wα , yα } and {wγ , yγ } are uniquely determined. This uniquely determines
R, and proves the claim.
Lemma 3.5. Let Pk denote the set of nondegenerate chains of length k in Q. Then
where $d > 64k^2$ is the average degree in G and m is the number of vertices in G.
The proof is by induction on $m + \ell$. If G contains a vertex of degree less than d/4, then we remove
such a vertex to obtain a graph of average degree greater than $\tilde{d} = d + d/(2m - 2)$. By induction,
the number of walks in $P_\ell$ for this new graph is at least
In particular, the number of non-returning walks of G which are in $P_\ell$ is at least $4^{-\ell} m (d - 32\ell^2)^\ell$,
as required. Suppose every vertex of G has degree at least d/4, and
Since every walk in $P_{\ell-1}$ has at most $8(\ell - 1)^2$ extensions to a degenerate walk of length $\ell$, by
Lemma 3.4, there are at least $d/4 - 8(\ell - 1)^2 - \ell > d/4 - 8\ell^2$ extensions of each walk in $P_{\ell-1}$ to a
walk in $P_\ell$. This proves (10). Now the lemma follows from Lemma 3.4 with $\ell = k$.
Therefore the number of choices of $R \in P_k$ concyclic with $Q \in P_k$ is at most the number of choices
of R such that $f(R)_\alpha = f(Q)_\alpha$ and $f(R)_\gamma = f(Q)_\gamma$. Since Y is determined by projection, any
quadruple $R_i \in Q$ is specified by its projection onto $F^\alpha \times F^\gamma$. The number of ways of choosing
quadruples $R_1, R_2, \ldots, R_k$ so that $R = (R_1, R_2, \ldots, R_k)$ is at most
$$\binom{|f(Q)_\alpha|}{2}^k \binom{|f(Q)_\gamma|}{2}^k \le \binom{2k}{2}^k \binom{2k}{2}^k < 4^k k^{4k}$$
since $|f(Q)_\alpha| \le 2k$ and $|f(Q)_\gamma| \le 2k$. This gives the required upper bound on $|P_k|$.
3.5 Linear Dependences
In this section we prove Theorem 2.3, using the lemmas we have developed in the last few sections.
The following theorem combines all of these lemmas, and will also be used to prove Theorem 1.3.
Proof. Since Z is 8k-wise independent, we may apply Lemma 3.1: there exists a set Y ⊆ Z such
that Y is determined by projection and
$$|Y| > \left( 16k^2 \mu^{1 + \frac{1}{k}} \lambda\nu \right)^{\frac{1}{2}} + \nu + 2\lambda\nu^{\frac{1}{2}}. \qquad (13)$$
Let $P_k$ denote the set of nondegenerate chains of length k in Q, let d and m be the average degree
and number of vertices in G, respectively. Combining (11) and (9) gives $d - 32k^2 < 4k^4 m^{\frac{1}{k}}$ and
therefore $d < 64k^4 m^{\frac{1}{k}}$. It follows that since $|Q| = |E(G)| = \frac{1}{2} dm$,
$$|Q| < 64k^4 m^{1 + \frac{1}{k}}. \qquad (14)$$
For a contradiction, suppose that |Z| is at least the expression claimed in (12). Now from (13),
$$|Q| > \frac{|Y|^4}{4\lambda^2\nu^2}. \qquad (15)$$
Using (14) and $m = \mu^2$ this gives
$$|Y|^4 < 4\lambda^2\nu^2 \cdot 64k^4 m^{1 + \frac{1}{k}} \le 256k^4 \lambda^2\nu^2 \mu^{2 + \frac{2}{k}} = \left( 16k^2 \mu^{1 + \frac{1}{k}} \lambda\nu \right)^2$$
Remark. Theorem 3.7 can be used to derive a more precise version of Theorem 1.2: if S is an
r-partite hypergraph with parts of sizes $N_1, N_2, \ldots, N_r$, then the above theorem can be used to
prove that if $|S| \gg (N_1 N_2 \cdots N_r)^{\frac{1}{2} + \frac{1}{2k}} + (N_1 + N_2 + \cdots + N_r)^{\lceil \frac{r}{3} \rceil (1 + \frac{1}{2k})}$ then S contains an even
cover of size at most 8k. This result may be viewed as an extension of Theorem 2.1 from bipartite
graphs to r-partite hypergraphs.
Proof of Theorem 1.1. Let $X \subseteq F^n$ be a set of vectors of weight at most r. Let $\chi : \{1, 2, \ldots, n\} \to \{1, 2, 3\}$ be a random three-coloring of the co-ordinates, where distinct co-ordinates are colored
independently and each color is equiprobable. For a vector $x \in X$ of weight $\omega(x) = \omega$, the
probability that x has exactly $\lfloor \frac{\omega}{3} \rfloor$ non-zero co-ordinates of color 1, exactly $[\frac{\omega}{3}]$ non-zero co-ordinates
of color 2, and exactly $\lceil \frac{\omega}{3} \rceil$ non-zero co-ordinates of color 3 is exactly
$$\frac{1}{3^\omega} \binom{\omega}{\lfloor \frac{\omega}{3} \rfloor} \binom{\omega - \lfloor \frac{\omega}{3} \rfloor}{\lceil \frac{\omega}{3} \rceil} > \frac{1}{3\omega},$$
where we used the numerical Lemma 3.8 (see below) to obtain the inequality. In this case we
say that x is equipartitioned by $\chi$. Therefore the expected number of vectors $x \in X$ which are
equipartitioned is greater than
$$\sum_{x \in X} \frac{1}{3\omega(x)} \ge \frac{|X|}{3r}.$$
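The random coloring step can be sketched as follows. The code assumes vectors are given by their supports (sets of co-ordinates in {1, ..., n}) and uses Python's `random.Random` for the coloring; the function names are ours and only illustrate the selection of equipartitioned vectors described above.

```python
import random

def target_counts(w):
    """Required numbers of nonzero co-ordinates of colors 1, 2, 3 for weight w."""
    lo = w // 3
    hi = -(-w // 3)  # ceiling of w / 3
    return lo, w - lo - hi, hi

def random_coloring(n, rng):
    """Color co-ordinates 1..n independently and uniformly with colors 1, 2, 3."""
    return {i: rng.randint(1, 3) for i in range(1, n + 1)}

def is_equipartitioned(support, coloring):
    """True if the support has exactly the target number of each color."""
    counts = [0, 0, 0]
    for coordinate in support:
        counts[coloring[coordinate] - 1] += 1
    return tuple(counts) == target_counts(len(support))

def equipartitioned_subset(X, coloring):
    """Vectors of X (given as supports) that are equipartitioned by the coloring."""
    return [x for x in X if is_equipartitioned(x, coloring)]
```

A typical use is `equipartitioned_subset(X, random_coloring(n, random.Random(0)))`, repeated over several colorings to find one that keeps at least the expected number of vectors.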
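$$\frac{|X|}{3r} < 4k (LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}} + L^{1 + \frac{1}{2k}} + M^{1 + \frac{1}{2k}} + N^{1 + \frac{1}{2k}} + 2LN^{\frac{1}{2}},$$
which implies that
$$|X| < 12kr \cdot \left[ (LN)^{\frac{1}{2}} M^{\frac{1}{2} + \frac{1}{2k}} + N^{1 + \frac{1}{2k}} \right].$$
This is precisely the statement of Theorem 2.3.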
Proof. Let $f(\omega)$ denote the expression on the left in the inequality above. It is not hard to verify
that the result is true for $\omega \in \{1, 2, 3\}$. Suppose $\omega > 3$. Using the inequalities
$$n^n e^{-n} (2\pi n)^{\frac{1}{2}} < n! < n^n e^{-n} (2\pi n)^{\frac{1}{2}} e^{\frac{1}{12n}},$$
which are valid for all positive integers n, we get that for all integers $s \ge 1$,
$$f(3s) = \binom{3s}{s} \binom{2s}{s} = \frac{(3s)!}{(s!)^3} > \frac{(3s)^{3s} e^{-3s} (6\pi s)^{1/2}}{s^{3s} e^{-3s} (2\pi s)^{3/2} e^{1/(4s)}} = \frac{3^{3s + \frac{1}{2}}}{2\pi s \, e^{1/(4s)}} > \frac{3^{3s}}{4s}.$$
This implies the required inequality when $\omega$ is a multiple of 3. We pass to general $\omega$ by noting that
$$f(3s + 1) = \frac{3s + 1}{s + 1} f(3s) \quad \text{and} \quad f(3s + 2) = \frac{(3s + 2)(3s + 1)}{(s + 1)^2} f(3s).$$
In particular, for $s \ge 1$, $f(3s + 1) \ge 2f(3s)$ and $f(3s + 2) \ge 5f(3s)$, which implies the required
inequality.
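The numerical facts used here can be checked exactly for small weights. The sketch below takes $f(\omega)$ to be the product of binomial coefficients $\binom{\omega}{\lfloor \omega/3 \rfloor}\binom{\omega - \lfloor \omega/3 \rfloor}{\lceil \omega/3 \rceil}$, which is our reading consistent with the formula for f(3s) above, and compares the equipartition probability $f(\omega)/3^\omega$ with $1/(3\omega)$ using exact rational arithmetic. Under this reading the two sides coincide at $\omega = 1$, so the check uses a non-strict comparison.

```python
from fractions import Fraction
from math import comb

def f(w):
    """Number of ways to split w positions into parts of sizes floor, [ ], ceil of w/3."""
    lo = w // 3
    hi = -(-w // 3)  # ceiling of w / 3
    return comb(w, lo) * comb(w - lo, hi)

def equipartition_probability(w):
    """Probability that a fixed weight-w vector is equipartitioned by a random coloring."""
    return Fraction(f(w), 3 ** w)

def check_lower_bound(max_weight):
    """True if f(w) / 3^w >= 1 / (3w) for every 1 <= w <= max_weight."""
    return all(
        equipartition_probability(w) >= Fraction(1, 3 * w)
        for w in range(1, max_weight + 1)
    )
```

The two recurrences for f(3s + 1) and f(3s + 2) can be verified the same way by cross-multiplying, which avoids any floating-point rounding.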
4 Product representations of squares
In this section we prove Theorem 1.3. Before doing so, we require the following simple lemma:
Lemma 4.1. Let n > 1 be a positive integer. Then either n has a prime factor larger than $N^{\frac{1}{2}}$, or
$n = xyz$ where $x, y, z \le N^{\frac{1}{2}}$.
Proof. Let $n = p_1 p_2 \cdots p_r$ denote the prime factorization of n into (not necessarily distinct) primes
$p_i$ where $p_1 \ge p_2 \ge \cdots \ge p_r$. Suppose $p_1 \le N^{\frac{1}{2}}$. Then we can find a set X of $p_i$'s whose product x
is at most $N^{\frac{1}{2}}$ but as close to $N^{\frac{1}{2}}$ as possible. Let y be a prime factor of n which isn't in X. Then
$xy \ge N^{\frac{1}{2}}$, and we may take $z = n/(xy)$.
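The factorization in Lemma 4.1 can be computed greedily. The sketch below assumes $1 < n \le N$ and replaces the "as close as possible" choice of X by a greedy pass over the primes in decreasing order, which is our simplification: every prime that is skipped would have pushed the product above $N^{1/2}$, so any leftover prime y still satisfies $xy > N^{1/2}$. Trial division is used for factoring, which is adequate only for small inputs.

```python
from math import isqrt

def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def three_factor_split(n, N):
    """Return (x, y, z) with n = x*y*z and x, y, z <= sqrt(N), or None.

    None means n has a prime factor larger than sqrt(N). Assumes 1 < n <= N.
    """
    root = isqrt(N)
    primes = sorted(prime_factors(n), reverse=True)
    if primes[0] > root:
        return None
    x, leftover = 1, []
    for p in primes:
        if x * p <= root:
            x *= p  # keep the product at most sqrt(N)
        else:
            leftover.append(p)
    y = leftover[0] if leftover else 1
    return x, y, n // (x * y)
```

For example, with N = 10000 the integer 2310 = 2 · 3 · 5 · 7 · 11 splits as 77 · 5 · 6.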
In what follows we denote by Π(n) the set of all primes in {1, . . . , n}, and let π(n) = |Π(n)| be
the usual prime counting function.
Proof of Theorem 1.3. Let A ⊆ {1, 2, . . . , n} be a set such that no product of at most 8k distinct
elements of A is a square. Denote by $B \subseteq A$ the set of integers in A which have a prime factor larger
than $N^{\frac{1}{2}}$, and write $C = A \setminus B$. By Lemma 4.1 we have that $C = \{a \in A : a = xyz,\ x, y, z \le N^{\frac{1}{2}}\}$.
Denote for $0 \le i \le \frac{1}{2} \log_2 n$,
$$P_i = \left\{ p \in \Pi(n) : \frac{n}{2^{i+1}} < p \le \frac{n}{2^i} \right\}.$$
Form a bipartite graph $G_i$ with parts $P_i$ and $\{1, \ldots, 2^{i+1}\}$ such that $p \in P_i$ is joined to $q \in \{1, \ldots, 2^{i+1}\}$ if $pq \in B$. Then $G_i$ does not contain a cycle of length at most 8k, since if for some
$2 \le \ell \le 8k$, $p_1 q_1, q_1 p_2, p_2 q_2, \ldots, q_{\ell-1} p_1$ is such a cycle then $p_j q_j, p_{j+1} q_j \in A$ are distinct and their
product is a square. It was proved in [24] that an M by N bipartite graph of girth at least 2k + 2
has at most $(MN)^{\frac{1}{2} + \frac{1}{2k}} + M + N$ edges. Since $G_i$ has girth at least 8k + 2, we deduce that
$$\begin{aligned}
|E(G_i)| &\le \left( 2^{i+1} |P_i| \right)^{\frac{1}{2} + \frac{1}{8k}} + 2^{i+1} + |P_i| \\
&\le \left( 2^{i+1} \left[ \pi\!\left( \frac{n}{2^i} \right) - \pi\!\left( \frac{n}{2^{i+1}} \right) \right] \right)^{\frac{1}{2} + \frac{1}{8k}} + 2^{i+1} + |P_i|.
\end{aligned}$$
Adding these inequalities for $i = 0, \ldots, \lfloor \frac{1}{2} \log_2 n \rfloor$ gives
$$|B| = \sum_{i=0}^{\lfloor \frac{1}{2} \log_2 n \rfloor} |E(G_i)| \le \pi(n) + O(n^{\frac{1}{2} + \frac{1}{8k}}).$$
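We now estimate |C|. For each $t \in C$ fix a factorization $t = x_t y_t z_t$ with $x_t, y_t, z_t \le N^{\frac{1}{2}}$. We
assume $x_t \ge y_t \ge z_t$, so in particular $z_t \le n^{\frac{1}{3}}$. Denote
$$S = \left\{ (i, j) : 0 \le i \le j,\ i + j + 2 \le \tfrac{1}{3} \log_2 n \right\}.$$
For $(i, j) \in S$ let $C_{ij}$ denote the set of all $t \in C$ such that
$$\frac{N^{1/2}}{2^{i+1}} < x_t \le \frac{N^{1/2}}{2^i}, \qquad \frac{N^{1/2}}{2^{j+1}} < y_t \le \frac{N^{1/2}}{2^j}, \qquad 1 \le z_t \le 2^{i+j+2}.$$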
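We now apply Theorem 3.7 with $F = F_2$ and $\lambda = 2^{i+j+2}$, $\mu = N^{\frac{1}{2}}/2^{i+1}$ and $\nu = N^{\frac{1}{2}}/2^{j+1}$.
Each $t \in C_{ij}$ may be considered as a vector of weight three in $F^\lambda \times F^\mu \times F^\nu$: if $t = x_t y_t z_t$ is the
prescribed factorization of t then the vector associated with t is the vector of weight three with
a one in positions $x_t$, $\mu + y_t$ and $\mu + \nu + z_t$, and zeros elsewhere. Clearly no 8k of these vectors
are linearly dependent, otherwise the product of the corresponding elements of $C_{ij}$ is a square. By
Theorem 3.7, we have that
$$\begin{aligned}
|C_{ij}| &< 12kr (\lambda\mu\nu)^{\frac{1}{2}} \mu^{\frac{1}{2k}} + \lambda^{1 + \frac{1}{2k}} + \mu^{1 + \frac{1}{2k}} + \nu^{1 + \frac{1}{2k}} + 2\lambda\nu^{\frac{1}{2}} \\
&\ll k \left( \frac{N^{1/2}}{2^{i+1}} \cdot \frac{N^{1/2}}{2^{j+1}} \cdot 2^{i+j+2} \right)^{\frac{1}{2}} \left( \frac{N^{1/2}}{2^{i+1}} \right)^{\frac{1}{2k}} + \left( \frac{N^{1/2}}{2^{j+1}} \right)^{1 + \frac{1}{2k}} + 2^{i+j} \left( \frac{N^{1/2}}{2^{j+1}} \right)^{\frac{1}{2}} \\
&\ll k n^{\frac{1}{2} + \frac{1}{2k}} + 2^{i + \frac{j}{2}} n^{\frac{1}{4}}.
\end{aligned}$$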
Finally, we sum this inequality over all $(i, j) \in S$. We chose $\lambda, \mu, \nu$ carefully to ensure that the sum
of the last term over $(i, j) \in S$ is $O(n^{\frac{1}{2}})$. Therefore we have
$$|C| \ll k n^{\frac{1}{2} + \frac{1}{2k}} \cdot |S| + O(n^{\frac{1}{2}}) \ll k n^{\frac{1}{2} + \frac{1}{2k}} (\log n)^2.$$
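For comparison with the quote in the introduction, here is a sketch of the classical linear-algebra way of finding a subset with square product. This is plain Gaussian elimination over $F_2$ on the parity vectors of prime exponents, a standard technique that we substitute for illustration; it is not the algorithm of this paper and does not bound the subset size by 8k. Integers are factored by trial division and parity vectors are stored as bitmasks.

```python
from math import isqrt

def parity_mask(a, prime_index):
    """Bitmask of primes dividing a to an odd power; prime_index assigns bit positions."""
    mask, p = 0, 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            mask ^= 1 << prime_index.setdefault(p, len(prime_index))
        p += 1
    if a > 1:
        mask ^= 1 << prime_index.setdefault(a, len(prime_index))
    return mask

def square_subset(A):
    """Return a nonempty sublist of A whose product is a perfect square, or None."""
    prime_index, basis = {}, {}  # basis: pivot bit -> (mask, members bitmask)
    for i, a in enumerate(A):
        mask, members = parity_mask(a, prime_index), 1 << i
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (mask, members)
                break
            basis_mask, basis_members = basis[pivot]
            mask ^= basis_mask
            members ^= basis_members
        else:
            # mask reduced to zero: the tracked members multiply to a square
            return [A[j] for j in range(len(A)) if members >> j & 1]
    return None

def is_square(x):
    """True if the positive integer x is a perfect square."""
    return isqrt(x) ** 2 == x
```

For A = [2, 3, 6] the elimination returns all three elements, whose product is 36.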
5 Acknowledgements
We are grateful to Noga Alon and Henry Cohn for helpful comments.
References
[1] N. Alon, L. Babai, and A. Itai. A fast and simple randomized parallel algorithm for the
maximal independent set problem. J. Algorithms, 7(4):567–583, 1986.
[2] N. Alon, O. Goldreich, J. Håstad, and R. Peralta. Simple constructions of almost k-wise
independent random variables. Random Structures Algorithms, 3(3):289–304, 1992.
[3] N. Alon, S. Hoory, and N. Linial. The Moore bound for irregular graphs. Graphs Combin.,
18(1):53–57, 2002.
[4] N. Alon and J. H. Spencer. The probabilistic method. Wiley-Interscience Series in Discrete
Mathematics and Optimization. Wiley-Interscience [John Wiley & Sons], New York, second
edition, 2000. With an appendix on the life and work of Paul Erdős.
[5] N. Alon, R. Yuster, and U. Zwick. Color-coding. J. Assoc. Comput. Mach., 42(4):844–856,
1995.
[6] N. Alon, R. Yuster, and U. Zwick. Finding and counting given length cycles. Algorithmica,
17(3):209–223, 1997.
[7] L. Bazzi, M. Mahdian, and D. A. Spielman. The minimum distance of Turbo-like codes.
Preprint, 2003.
[8] C. Berrou, A. Glavieux, and P. Thitimajshima. Near Shannon Limit Error Correcting Codes
and Decoding: Turbo Codes. In Proceedings of IEEE International Communications Confer-
ence, pages 1064–1070. 1993.
[9] C. Bertram-Kretzberg, T. Hofmeister, and H. Lefmann. Sparse 0-1 matrices and forbidden
hypergraphs. Combin. Probab. Comput., 8(5):417–427, 1999.
[10] C. Bertram-Kretzberg and H. Lefmann. MODp -tests, almost independence and small proba-
bility spaces. Random Structures Algorithms, 16(4):293–313, 2000.
[12] B. Bollobás. Extremal graph theory. Dover Publications Inc., Mineola, NY, 2004. Reprint of
the 1978 original.
[13] J. Bondy and M. Simonovits. Cycles of even length in graphs. J. Combinatorial Theory B,
16:97–105, 1974.
[14] M. Breiling. A logarithmic upper bound on the minimum distance of Turbo codes. Preprint,
2001.
[15] A. E. Brouwer. Block designs. In Handbook of combinatorics, Vol. 1, 2, pages 693–745. Elsevier,
Amsterdam, 1995.
[16] M. C. Davey and D. J. C. MacKay. Low-density parity check codes over GF (q). IEEE
Communications Letters, 2(6):165–167, 1998.
[18] P. Erdős. On some applications of graph theory to number theoretic problems. Publ. Ramanujan Inst., 1:131–136, 1969.
[19] P. Erdős, W. G. Brown, and V. T. Sós. Some extremal problems on r-graphs. In New directions in
the theory of graphs, Proc. 3rd Ann Arbor Conference on Graph Theory, pages 55–63. Academic Press, New
York, 1973.
[20] P. Erdős and D. J. Kleitman. On coloring graphs to maximize the proportion of multicolored
k-edges. J. Combinatorial Theory, 5:164–169, 1968.
[22] R. G. Gallager. Low Density Parity Check Codes. MIT Press, Cambridge MA, 1963. Research
Monograph Series, no. 21.
[23] E. Győri. $C_6$-free bipartite graphs and product representation of squares. Discrete Math.,
165/166:371–375, 1997. Graphs and combinatorics (Marseille, 1995).
[24] S. Hoory. The size of bipartite graphs with a given girth. J. Combin. Theory Ser. B, 86(2):215–
220, 2002.
[25] N. Kahale and R. Urbanke. On the minimum distance of parallel and serially concatenated
codes. IEEE Trans. Inform. Theory. To appear.
[26] H. Lefmann. Sparse parity-check matrices over finite fields (extended abstract). In Computing
and combinatorics, volume 2697 of Lecture Notes in Comput. Sci., pages 112–121. Springer,
Berlin, 2003.
[27] H. Lefmann, P. Pudlák, and P. Savický. On sparse parity check matrices. Des. Codes Cryptogr.,
12(2):107–130, 1997.
[28] D. J. C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE Trans.
Inform. Theory, 45(2):399–431, 1999.
[30] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. II. North-Holland
Publishing Co., Amsterdam, 1977. North-Holland Mathematical Library, Vol. 16.
[31] A. Naor and J. Verstraëte. A note on bipartite graphs without a 2k-cycle. Preprint, 2003.
[32] C. Pomerance. A tale of two sieves. Notices Amer. Math. Soc., 43(12):1473–1485, 1996.
[34] G. N. Sárközy. Cycles in bipartite graphs and an application in number theory. J. Graph
Theory, 19(3):323–331, 1995.
[35] M. Sipser and D. A. Spielman. Expander codes. IEEE Trans. Inform. Theory, 42(6, part
1):1710–1722, 1996. Codes and complexity.
[36] D. A. Spielman. Linear-time encodable and decodable error-correcting codes. IEEE Trans.
Inform. Theory, 42(6, part 1):1723–1731, 1996. Codes and complexity.
[37] J. Verstraëte. On arithmetic progressions of cycle lengths in graphs. Combin. Probab. Comput.,
9(4):369–373, 2000.