1.7. Linear equations in primes 111

The former step can be

accomplished22

in a number of ways. For pro-

gressions of length three (and more generally, for controlling linear patterns

of complexity at most one), transference can be accomplished by Fourier-

analytic methods. For more complicated patterns, one can use techniques

inspired by ergodic theory; more recently, simplified and more eﬃcient meth-

ods based on duality (the Hahn-Banach theorem) have also been used. No

number theory is used in this step.

The second step is accomplished by fairly standard sieve theory methods

(e.g. the Selberg sieve, or the slight variants of this sieve used by Goldston,

Pintz, and Yıldırım [GoPiYi2008]). Remarkably, very little of the formida-

ble apparatus of modern analytic number theory is needed for this step; for

instance, the only fact about the Riemann zeta function that is truly needed

is that it has a simple pole at s = 1, and no knowledge of L-functions is

needed.

The third step does draw more significantly on analytic number theory

techniques and results (most notably, the method of Vinogradov to compute

oscillatory sums over the primes, and also the Siegel-Walfisz theorem that

gives a good error term on the prime number theorem in arithemtic pro-

gressions). As these techniques are somewhat orthogonal to the main topic

of this text, we shall only touch briefly on this aspect of the transference

strategy.

1.7.1. Transference. The transference principle is not a single theorem,

but is instead a family of related results with a common purpose, namely to

show that a suﬃciently pseudorandom set, measure, or probability distribu-

tion will be “indistinguishable” from the whole set (or the uniform measure

or probability distribution) in certain statistical senses. A key tool in this

regard is a dense model theorem that allows one to approximate or model any

set or function that is dense with respect to a pseudorandom measure, by a

set or function which is dense with respect to the uniform measure. It turns

out that one can do this as long as the approximation is made with respect

to a suﬃciently weak topology; for the applications to counting arithmetic

patterns, it turns out that the topology given by the Gowers norms is the

right one to use. The somewhat complicated nature of these norms, though,

does make the verification of the required pseudorandomness properties to

be slightly tricky.

We illustrate these themes with Roth’s theorem, though the general

strategy applies to several other results in additive combinatorics. We begin

with Roth’s theorem in a cyclic group Z/NZ, which we phrase as follows:

22In the case of transference to genuinely random sets, rather than pseudorandom sets,

similar ideas appeared earlier in the graph theory setting; see [KoLuRo1996].