Annals of Mathematics
Hausdorff dimension of the set
of nonergodic directions
By Yitwah Cheung
Annals of Mathematics, 158 (2003), 661–678
(with an Appendix by M. Boshernitzan)
Abstract
It is known that nonergodic directions in a rational billiard form a subset
of the unit circle with Hausdorff dimension at most 1/2. Explicit examples
realizing the dimension 1/2 are constructed using Diophantine numbers and
continued fractions. A lower estimate on the number of primitive lattice points
in certain subsets of the plane is used in the construction.
1. Introduction
Consider the billiard in a polygon Q. A fundamental result [KMS] implies
that a typical trajectory with typical initial direction will be equidistributed
provided the angles of Q are rational multiples of π. More precisely, there is
a flat surface X associated to the polygon such that each direction $\theta \in S^1$
determines an area-preserving flow on X; the assertion is that the set NE(Q)
of parameters θ for which the associated flow is not ergodic has measure zero.
The statement holds more generally for the class of rational billiards in which
the (abstract) polygon is assumed to have the property that the subgroup of
O(2) generated by the linear parts of the reflections in the sides is finite. For
a recent survey of rational billiards, see [MT].
Let $Q_\lambda$, $\lambda \in (0, 1)$, be the polygon described informally as a 2-by-1
rectangle with an interior wall extending orthogonally from the midpoint of a longer
side so that its distance from the opposite side is exactly λ (see Figure 1). We
are interested in the Hausdorff dimension of the set $\mathrm{NE}(Q_\lambda)$. Recall that λ is
Diophantine if the inequality
$$\left|\lambda - \frac{p}{q}\right| \le \frac{1}{|q|^{e}}$$
has (at most) finitely many integer solutions for some exponent $e > 0$.
Figure 1. The billiard in $Q_\lambda$.
Theorem 1. If λ is Diophantine, then $\mathrm{H.dim}\,\mathrm{NE}(Q_\lambda) = 1/2$.
In fact, Masur has shown that for any rational billiard the set of nonergodic
directions has Hausdorff dimension at most 1/2 [Ma]. This upper bound is
sharp, as Theorem 1 shows. It should be pointed out that the theorems in
[KMS] and [Ma] are stated for holomorphic quadratic differentials on compact
Riemann surfaces. The flat structure on the surface associated to a rational
billiard is a special case, namely the square of a holomorphic 1-form.
The ergodic theory of the billiards $Q_\lambda$ was first studied by Veech [V1]
in the context of $\mathbb{Z}_2$ skew products of irrational rotations. Veech proved the
slope of the initial direction θ has bounded partial quotients if and only if the
corresponding flow is (uniquely) ergodic for all λ. On the other hand, if θ has
unbounded partial quotients, then there exists an uncountable set K(θ) of λ for
which the flow is not ergodic. In this way, Veech showed that minimality does
not imply (unique) ergodicity for these $\mathbb{Z}_2$ skew products. (The first examples
of minimal but not uniquely ergodic systems had been constructed by Furstenberg
in [Fu].) Our approach is dual to that of Veech in the sense that we fix λ and
study the set of parameters $\theta \in \mathrm{NE}(Q_\lambda)$.
The billiards $Q_\lambda$ were first introduced by Masur and Smillie to give a
geometric representation of the $\mathbb{Z}_2$ skew products studied by Veech. It follows
from [V1] that $\mathrm{NE}(Q_\lambda)$ is countable if λ is rational. A proof of the converse can
be found in the survey article [MT, Thm. 3.2]. Boshernitzan has given a short
argument showing $\mathrm{H.dim}\,\mathrm{NE}(Q_\lambda) = 0$ for a residual (hence, uncountable) set
of λ. (His argument is presented in the appendix to this paper.) Theorem 1
implies any such λ is a Liouville number. As is well-known, the set of Liouville
numbers has measure zero (in fact, Hausdorff dimension zero). We remark that
by Roth’s theorem every algebraic integer satisfies the hypothesis of Theorem 1.
Some generalizations of Theorem 1 are mentioned in Section 2. For the
class of Veech billiards (see [V2]) the set of nonergodic directions is countable.
It would be interesting to know if there are (number-theoretic) conditions on
a general rational billiard Q which imply that the Hausdorff dimension of
NE(Q) is 1/2.
Theorem 1 can be reduced to a purely number-theoretic statement.
Lemma 1.1 (Summable cross products condition). Suppose $(w_j)$ is a
sequence of vectors of the form $(\lambda + m_j, n_j)$, where $m_j, n_j \in 2\mathbb{Z}$ and $n_j \ne 0$,
and assume that the Euclidean lengths $|w_j|$ are increasing. The condition
$$\sum_j |w_j \times w_{j+1}| < \infty \tag{1}$$
implies that $\theta_j = w_j/|w_j|$ converges to some $\theta \in \mathrm{NE}(Q^t_\lambda)$ as $j \to \infty$. (Here,
$Q^t_\lambda$ is the billiard table obtained by reflecting $Q_\lambda$ in a line of slope −1.)
Theorem 2. Let K(λ) be the set of nonergodic directions that can be
obtained using Lemma 1.1. If λ is Diophantine, then H.dim K(λ) = 1/2.
Proof of Theorem 1. Theorem 2 implies
$\mathrm{H.dim}\,\mathrm{NE}(Q_\lambda) = \mathrm{H.dim}\,\mathrm{NE}(Q^t_\lambda) \ge 1/2$.
Together with Masur’s upper bound, this gives Theorem 1.
Density of primitive lattice points. The main obstacle in our approach to
finding lower bounds on Hausdorff dimension is the absence of primitive lattice
points in certain regions of the plane. More precisely, let Σ = Σ(α, R, Q)
denote the parallelogram (Figure 2)
$$\Sigma := \{(x, y) \in \mathbb{R}^2 : |y\alpha - x| \le 1/Q,\ R \le y \le 2R\}$$
and define
$$\mathrm{dens}(\Sigma) := \frac{\#\{(p, q) \in \Sigma : \gcd(p, q) = 1\}}{\mathrm{area}(\Sigma)}.$$
Figure 2. The parallelogram Σ(α, R, Q).
The proof of Theorem 2 relies on the following fact:
Theorem 3. Let Spec(α) be the sequence of heights formed by the
convergents of α. There exist constants $A_0$ and $\rho_0 > 0$ such that whenever
$\mathrm{area}(\Sigma) \ge A_0$,
$$\mathrm{Spec}(\alpha) \cap [Q, R] \ne \emptyset \;\Rightarrow\; \mathrm{dens}(\Sigma) \ge \rho_0.$$
Remark. It can be shown dens(Σ) = 0 if α does not have any convergent
whose height is between Q/4 and 8R. Thus, $\mathrm{area}(\Sigma) \gg 1$ alone cannot imply
the existence of a primitive lattice point in Σ. For example, the implication
$$|\alpha| < \frac{1}{2R}\left(1 - \frac{1}{Q}\right) \;\Rightarrow\; \mathrm{dens}(\Sigma) = 0$$
is easy to verify and remains valid even if |·| is replaced by the distance to the
nearest integer (because arithmetic density is preserved under $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$).
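The arithmetic density of Σ(α, R, Q) can be explored numerically. The following is a minimal sketch, not part of the original argument: the function names `primitive_points` and `dens` are hypothetical, and α is taken to be a rational number (a `Fraction`) with integer R and Q so that all comparisons are exact.

```python
from fractions import Fraction
from math import ceil, floor, gcd

def primitive_points(alpha, R, Q):
    """Coprime integer pairs (p, q) with |q*alpha - p| <= 1/Q and R <= q <= 2R."""
    points = []
    for q in range(ceil(R), floor(2 * R) + 1):
        lo = ceil(q * alpha - Fraction(1, Q))   # smallest admissible p
        hi = floor(q * alpha + Fraction(1, Q))  # largest admissible p
        for p in range(lo, hi + 1):
            if gcd(p, q) == 1:
                points.append((p, q))
    return points

def dens(alpha, R, Q):
    """Number of primitive lattice points divided by area(Sigma) = 2R/Q."""
    area = Fraction(2 * R, Q)
    return Fraction(len(primitive_points(alpha, R, Q))) / area

# Illustration of the implication in the Remark: with R = 10 and Q = 4 the
# bound (1/2R)(1 - 1/Q) equals 3/80, and alpha = 1/30 lies below it.
empty_example = primitive_points(Fraction(1, 30), 10, 4)
```

With these illustrative inputs the parallelogram has area 5 but contains no primitive lattice point, in line with the Remark that large area alone is not sufficient.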
Outline. Theorem 2 is proved by showing that K(λ) contains a Cantor set
whose Hausdorff dimension may be chosen close to 1/2 when λ is Diophantine.
The construction of this Cantor set is based on Lemma 1.1 and is presented in
Section 2. The proof of Theorem 2 is completed in Section 3 if we assume the
statement of Theorem 3, whose proof is deferred to Section 4.
Acknowledgments. This research was partially supported by the National
Science Foundation and the Clay Mathematics Institute. The author would
also like to thank his thesis advisor Howard Masur for his excellent guidance.
2. Cantor set of nonergodic directions
We begin with the proof of Lemma 1.1, which is the recipe for the construction
of a Cantor set $E(\lambda) \subset K(\lambda)$. We then show that the Hausdorff
dimension of E(λ) can be chosen arbitrarily close to 1/2 if the arithmetic density
of the parallelograms Σ(α, R, Q) can be bounded uniformly away from
zero.
2.1. Partition determined by a slit. The flat surface associated to $Q_\lambda$ is
shown in Figure 3. It will be slightly more convenient to work with the reflected
table $Q^t_\lambda$. Let $X_\lambda$ be the flat surface associated to $Q^t_\lambda$. The proof of Lemma 1.1
is based on the following observation:
$X_\lambda$ is a branched double cover of the square torus $T = \mathbb{R}^2/\mathbb{Z}^2$.
More specifically, let $w_0 \subset T$ denote the projection of the interval [0, λ] contained
in the x-axis. $X_\lambda$ may be realized (up to a scale factor of 2) by gluing
two copies of the slit torus $T \setminus w_0$ along their boundaries so that the upper
edge of the slit in one copy is attached to the lower edge of the slit in the other,
and similarly for the remaining edges. The induced map $\pi : X_\lambda \to T$ is the
branched double cover obtained by making a cut along the slit $w_0$.
Lemma 2.1 (Slit directions are nonergodic). A vector of the form
$(\lambda + m, n)$ with $m, n \in 2\mathbb{Z}$ and $n \ne 0$ determines a nonergodic direction in $Q^t_\lambda$.
Figure 3. Unfolded billiard trajectory.
Proof. A vector of the given form determines a slit w in T that is homologous
to $w_0$ (mod 2). (We assume λ is irrational, for the statement of the lemma
is easily seen to hold otherwise.) If $\pi' : X' \to T$ is the branched double cover
obtained by making a cut along w, then there is a biholomorphic isomorphism
$h : X_\lambda \to X'$ such that $\pi = \pi' \circ h$. It follows that $\pi^{-1}(w)$ partitions $X_\lambda$ into
a pair of slit tori with equal area, and that this partition is invariant under
the flow in the direction of the slit. Hence, the vector (after normalization)
determines a nonergodic direction in $Q^t_\lambda$.
Proof of Lemma 1.1. It is easy to see from (1) that the directions $\theta_j$ form
a Cauchy sequence. The corresponding partitions of $X_\lambda$ also converge in a
measure-theoretic sense: the symmetric difference of consecutive partitions is
a union of parallelograms whose total area is bounded by the corresponding
term in (1); summability implies the existence of a limit partition. Invariance
of the limit partition under the flow in the direction of θ will follow by showing
that $h_j$, the component of $w_j$ perpendicular to θ, tends to zero as $j \to \infty$ ([MS,
Th. 2.1]). To see this, observe that the area of the right triangle formed by
$w_j$ and θ is roughly $h_j$ times the Euclidean length of $w_j$; it is bounded by the
tail in (1) and therefore tends to zero. (We have implicitly assumed that λ is
irrational. For rational λ the lemma still holds because a nonzero term in (1)
must be at least the reciprocal of the height.)
Remark. A vector of the form $(\lambda + m, n)$ with $m, n \in g\mathbb{Z}$ and $n \ne 0$
determines a partition of the branched g-cyclic cover of T into g slit tori of equal
area. From this, it is not hard to show that the conclusion of Theorem 1 holds
in genus $g \ge 2$. Gutkin has pointed out other higher genus examples obtained
by considering branched double covers along multiple parallel slits. Further
examples are possible by observing that the proof of Theorem 2 depends only
on a Diophantine condition on the vector $w_0 = (\lambda, 0)$. (See §3.)
2.2. Definition of E(λ). Our goal is to find sequences that satisfy condi-
tion (1) and intuitively, the more we find, the larger the dimension. However,
in order to facilitate the computation of Hausdorff dimension, we shall restrict
our attention to sequences whose Euclidean lengths grow at some fixed rate.
We shall realize E(λ) as a decreasing intersection of compact sets $E_j$,
each of which is a disjoint union of closed intervals. Let V denote the set
of vectors that satisfy the hypothesis of Lemma 2.1. Henceforth, by a slit
we mean a vector $w \in V$ whose length is given by L := |n| and slope by
α := (λ + m)/n. Note that the following version of the cross product formula
holds: $|w \times w'| = LL'\Delta$, where ∆ is the distance between the slopes. Fix a
parameter $\delta > 0$.
Definition 2.2 (Children of a slit). Let w be a slit of length L and slope α.
A slit $w'$ is said to be a child of w if
(i) $w' = w + 2(p, q)$ for some relatively prime integers p and q;
(ii) $|q\alpha - p| \le 1/(L \log L)$ and $q \in [L^{1+\delta}, 2L^{1+\delta}]$.
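The two conditions in Definition 2.2 can be checked mechanically. The sketch below is illustrative only: the function name `children` is hypothetical, λ is replaced by a rational stand-in (a `Fraction`), which is an assumption made purely so the arithmetic is exact, and the search assumes L ≥ 4 so that 1/(L log L) < 1/2 and at most one p can qualify for each q.

```python
from fractions import Fraction
from math import ceil, floor, gcd, log

def children(lam, m, n, delta):
    """Pairs (p, q) such that w + 2(p, q) is a child of the slit w = (lam + m, n).

    Assumes |n| >= 4, so that 1/(L log L) < 1/2 and only the nearest
    integer p to q*alpha needs to be tested.
    """
    L = abs(n)                       # length of the slit
    alpha = (lam + m) / n            # slope of the slit
    bound = 1 / (L * log(L))         # right-hand side of condition (ii)
    R = L ** (1 + delta)
    found = []
    for q in range(ceil(R), floor(2 * R) + 1):
        p = round(q * alpha)         # nearest integer to q*alpha
        if gcd(p, q) == 1 and abs(q * alpha - p) <= bound:
            found.append((p, q))
    return found

# Rational stand-in for an irrational lambda (an assumption for illustration).
example = children(Fraction(618034, 1000000), 0, 10, 0.5)
```

For this stand-in slit of length 10 and δ = 1/2, the search runs over q in [32, 63] and keeps only coprime pairs, so candidates such as (2, 32) and (3, 48) are discarded even though they satisfy the inequality in (ii).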
Lemma 2.3 (Chains have nonergodic limit). The direction of $w_j$ converges
to a point in K(λ) as $j \to \infty$ provided $w_{j+1}$ is a child of $w_j$ for every j.
Proof. The inequality in (ii) (equivalent to $|w \times w'| \le 1/\log L$) implies
that the directions of the slits are close to one another. Hence, their Euclidean
lengths are increasing since the length of a child is approximately $L^{1+\delta}$. The
sum in (1) is dominated by a geometric series of ratio 1/(1 + δ).
Choose a slit $w_0$ and call it the slit of level 0. The slits of level j + 1 are
defined to be children of slits of level j. Let $V' := \bigcup V_j$ where $V_j$ denotes the
collection of slits that belong to level j. Associate to each $w \in V'$ the smallest
closed interval containing all the limits obtainable by applying Lemma 2.3 to
a sequence beginning with w. Define $E(\lambda) := \bigcap E_j$ where $E_j$ is the union of
the intervals associated to slits in $V_j$. It is easily seen that the diameters of
intervals in $E_j$ tend to zero as $j \to \infty$. Hence, every point of E(λ) arises as the
limit obtained by an application of Lemma 2.3. Therefore, $E(\lambda) \subset K(\lambda)$.
2.3. Computation of Hausdorff dimension. We first give a heuristic calculation
which shows that the Hausdorff dimension of K(λ) is at most 1/2. (This
fact is not used in the proof of Theorem 1.) We then show rigorously that the
Hausdorff dimension of E(λ) is at least 1/2 under a critical assumption: each
slit in $V'$ has enough children.
Recall the construction of the Cantor middle-third set. At each stage of
the induction, intervals of length ∆ are replaced with m = 2 equally spaced
subintervals of common length $\Delta'$. In this case, the Hausdorff dimension is
exactly log 2/ log 3, or log m/ log(1/ε) where $\varepsilon := \Delta'/\Delta = 1/3$.
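The formula log m / log(1/ε) is easy to evaluate directly. The snippet below is a small illustration; the function name `cantor_dimension` is hypothetical, and it applies only to the self-similar situation just described (m equally spaced subintervals, each scaled by ε).

```python
from math import isclose, log

def cantor_dimension(m, eps):
    """Dimension log m / log(1/eps) of a Cantor set built from m pieces of ratio eps."""
    return log(m) / log(1 / eps)

# Middle-third Cantor set: m = 2 subintervals, each of relative length 1/3.
middle_third = cantor_dimension(2, 1 / 3)
```

Here `middle_third` agrees with log 2 / log 3 up to floating-point error; a set built from m = 4 pieces of ratio 1/16 would have dimension 1/2.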
For K(λ) it is enough to consider sequences for which every term in (1)
is bounded above. Associated to each slit of length L is an interval of length
$\Delta = 1/L^2$. The number of slits of length approximately $L'$ is at most $m = L'/L$.
Their intervals have approximate length $\Delta' = 1/(L')^2$. Therefore,
$$\mathrm{H.dim}\,K(\lambda) \le \frac{\log m}{\log(\Delta/\Delta')} = \frac{1}{2}.$$
To get a lower bound on the Hausdorff dimension of E(λ) we need to
show there are lots of children and wide gaps between them. The number of
children is exactly $2L^{\delta}/\log L$ times the arithmetic density of the parallelogram
Σ(α, R, Q) where $R = L^{1+\delta}$ and $Q = L \log L$.
Lemma 2.4 (Slopes of children are far apart). The slopes of any two children
of a slit with length L are separated by a distance of at least $O(1/L^{2+2\delta})$.
Proof. Let w be a slit of length L. A child $w'$ has the form $w' = w + 2v$
for some v = (p, q). If $w'' = w + 2v'$ is another child with $v' = (p', q')$, then
$v' \ne v$. Since both pairs are relatively prime, $|p/q - p'/q'| \ge 1/qq' \ge 1/4L^{2+2\delta}$.
The lemma follows by observing that the slope $\alpha'$ of $w'$ satisfies
$$\left|\alpha' - \frac{p}{q}\right| = \frac{|w' \times v|}{L'q} = \frac{|w \times v|}{(L + 2q)q} \le \frac{L|q\alpha - p|}{2q^2} \le \frac{1}{2L^{2+2\delta}\log L}.$$
Proposition 2.5 (Enough children implies dimension 1/2). Suppose
there exists $c_1 > 0$ such that every slit in $V'$ has at least $c_1 L^{\delta}/\log L$ children
in $V'$, where L denotes the length of the slit. Then H.dim K(λ) = 1/2.
Proof. The length of a slit in $V_j$ is roughly $L_j = L_0^{(1+\delta)^j}$, where $L_0$ denotes
the length of the initial slit $w_0$. The number of children is at least
$m_j = c_1 L_j^{\delta}/\log L_j$ and their slopes are at least $\varepsilon_j = 1/(4L_j^{2+2\delta})$ apart. It follows by
well-known estimates for computing Hausdorff dimension (we use [Fa, Ex. 4.6])
that
$$\mathrm{H.dim}\,E(\lambda) \ge \liminf_{j\to\infty} \frac{\log(m_0 \cdots m_{j-1})}{-\log(m_j \varepsilon_j)} = \liminf_{j\to\infty} \frac{\sum_{i=0}^{j-1} \delta \log L_i}{(2+\delta)\log L_j} = \frac{1}{2+\delta}.$$
Together with the upper bound on K(λ), this proves the proposition.
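The limit 1/(2 + δ) can be checked numerically from the quantities $L_j$, $m_j$ and $\varepsilon_j$ used in the proof. The sketch below works with logarithms to avoid overflow; the function name `dimension_ratio` and the default values log L_0 = 10 and c_1 = 1 are assumptions chosen only for illustration.

```python
from math import log

def dimension_ratio(delta, j, log_L0=10.0, c1=1.0):
    """Evaluate log(m_0 ... m_{j-1}) / (-log(m_j * eps_j)) for the sequences in the proof."""
    # log L_i = (1 + delta)^i * log L_0
    log_L = [log_L0 * (1 + delta) ** i for i in range(j + 1)]
    # log m_i = log c_1 + delta * log L_i - log(log L_i)
    log_m = [log(c1) + delta * x - log(x) for x in log_L]
    # log eps_j = -log 4 - (2 + 2*delta) * log L_j
    log_eps_j = -log(4) - (2 + 2 * delta) * log_L[j]
    return sum(log_m[:j]) / -(log_m[j] + log_eps_j)

approx = dimension_ratio(0.5, 40)
```

With δ = 1/2 the value of `approx` is already very close to 1/(2 + δ) = 0.4; smaller δ pushes the limit toward 1/2 but the convergence in j is slower.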
Remark. Theorem 3 allows us to determine when a slit has enough children.
It should be pointed out that Diophantine λ does not imply every slit
will have enough children. We shall show that Proposition 2.5 holds if $V'$ is
replaced by a suitable subset. (By the remark following Theorem 3 one can
easily show there are slits that do not have any children and whose directions
form a dense set.)
3. Diophantine condition
Let $w_0$ be the initial slit in the definition of E(λ). The hypothesis that λ
is Diophantine implies there are constants $e_0 > 0$ and $c_0 > 0$ such that
$$\|w_0 \times v\| = \min_{n \in \mathbb{Z}} |w_0 \times v - n| \ge \frac{c_0}{|v|^{e_0}} \quad \text{for all } v \in \mathbb{Z}^2,\ v \ne 0.$$
Fix a real number N so that $e_0 < N\delta$. We assume the length of $w_0$ is at least
some predetermined value $L_0 = L_0(\lambda, \delta, N, e_0, c_0)$.
Definition 3.1 (Normal slits). A slit of length L and slope α is said to be
normal if for every real number n, $1 \le n \le N + 1$,
$$\mathrm{Spec}(\alpha) \cap [e^{n\delta} L \log L,\ L^{1+n\delta}] \ne \emptyset.$$
Let $V''$ be the subset of $V'$ formed by normal slits of length $\ge L_0$.
Proposition 3.2 (Normal slits have enough children). There exists $c_1 > 0$
such that every slit in $V''$ has at least $c_1 L^{\delta}/\log L$ children in $V''$.
To complete the proof of Theorem 2 we also need
Lemma 3.3 (Normal slits exist). Arbitrarily long normal slits exist.
Proof of Theorem 2 assuming Lemma 3.3 and Proposition 3.2. We may
choose the initial slit $w_0$ to lie in $V''$, which is nonempty by the lemma. The
calculation in the proof of Proposition 2.5 applies to a subset of E(λ) to give the
same conclusion; in other words, the proposition implies H.dim K(λ) = 1/2.
We recall two classical results from the theory of continued fractions. The
$k$th convergent $p_k/q_k$ of a real number α is a (reduced) fraction such that
$$\frac{1}{q_k(q_{k+1} + q_k)} \le \left|\alpha - \frac{p_k}{q_k}\right| \le \frac{1}{q_k q_{k+1}} \tag{2}$$
and satisfies the recurrence relation $q_{k+1} = a_{k+1} q_k + q_{k-1}$ (similarly for $p_k$),
where $a_k$ is the $k$th partial quotient. A partial converse is that if p and $q > 0$
are integers satisfying
$$\left|\alpha - \frac{p}{q}\right| \le \frac{1}{2q^2} \tag{3}$$
then p/q is a convergent of α, although it need not be reduced.
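The recurrence and the two-sided estimate (2) can be verified on examples. The following sketch is illustrative: the helper names `convergents` and `satisfies_inequality_2` are hypothetical, and α is taken to be a nonnegative rational (a `Fraction`) so that the continued fraction expansion terminates and all comparisons are exact.

```python
from fractions import Fraction

def convergents(alpha, count):
    """First `count` convergents (p_k, q_k) of a nonnegative rational alpha."""
    p_prev, q_prev = 1, 0                      # p_{-1}, q_{-1}
    a = alpha.numerator // alpha.denominator   # a_0 = floor(alpha)
    p, q = a, 1                                # p_0, q_0
    out = [(p, q)]
    x = alpha - a
    while x != 0 and len(out) < count:
        x = 1 / x
        a = x.numerator // x.denominator       # next partial quotient
        p, p_prev = a * p + p_prev, p          # p_{k+1} = a_{k+1} p_k + p_{k-1}
        q, q_prev = a * q + q_prev, q          # q_{k+1} = a_{k+1} q_k + q_{k-1}
        out.append((p, q))
        x -= a
    return out

def satisfies_inequality_2(alpha, convs):
    """Check (2) for every convergent that has a successor in `convs`."""
    for (p, q), (_, q_next) in zip(convs, convs[1:]):
        err = abs(alpha - Fraction(p, q))
        if not (Fraction(1, q * (q_next + q)) <= err <= Fraction(1, q * q_next)):
            return False
    return True

example = convergents(Fraction(355, 113), 10)
```

For α = 355/113 the expansion is [3; 7, 16], so `example` lists 3/1, 22/7 and 355/113, and both bounds in (2) hold for the first two of these.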
3.1. Existence of normal slits.
Definition 3.4. A slit of length L and slope α is said to be n-good if
$$\mathrm{Spec}(\alpha) \cap [e^{n\delta} L \log L,\ L^{1+\delta}] \ne \emptyset.$$
Lemma 3.5. A sufficiently long N-good slit is normal.
Proof. An (N + 1)-good slit is normal by definition, so it suffices to consider
the case of an N-good slit that is not (N + 1)-good. Suppose w is such a slit,
with length L and slope α. Let $q_k$ be the largest height in $\mathrm{Spec}(\alpha) \cap [1, L^{1+\delta}]$
so that $q_k = e^{n_1\delta} L \log L$ for some $n_1$ between N and N + 1. Set $v := (p_k, q_k)$.
By the RHS of (2), the Diophantine condition, $|v| \in O(L \log L)$ and $e_0 < N\delta$
we get
$$q_{k+1} \le |q_k\alpha - p_k|^{-1} \le (1/c_0)\,L\,|v|^{e_0} \le L^{1+N\delta}$$
provided $L \ge L_0$. Since $N \le n_1$, this shows w is normal.
Proof of Lemma 3.3. By the previous lemma, it is enough to prove the
existence of arbitrarily long N-good slits. We show that a sufficiently long slit
that is not N-good has a nearby slit that is N-good.
Hence, let w be a slit of length L and slope α and assume it is not N-good.
Let $q_k$ be the largest height in $\mathrm{Spec}(\alpha) \cap [1, L^{1+\delta}]$. Since $q_{k+1} > L^{1+\delta}$ (here
we use the irrationality of λ to guarantee the existence of the next convergent)
the RHS of (2) implies $\Delta := (L|q_k\alpha - p_k|)^{-1} > L^{\delta}$. With $L' := L + 2mq_k$, it is
not hard to see that there exists a positive integer m satisfying
$$e^{N\delta}\log(L') + 1/2 \le \Delta \le (L')^{\delta}.$$
Indeed, if m is smallest for the RHS, then the LHS holds when $L \ge L_0$.
Let $w' = w + 2mv$ where $v = (p_k, q_k)$. We show $w'$ is N-good. Let $\alpha'$
be its slope. Using $|w' \times v| = |w \times v|$ and the cross product formula, we find
$|\alpha' - p_k/q_k| = 1/(L' q_k \Delta) \le 1/(2q_k^2)$ which by (3) implies $q_k \in \mathrm{Spec}(\alpha')$. Using the
above inequalities on ∆ in parallel with those in (2) we obtain
$$q'_{k+1} \le L'\Delta \le (L')^{1+\delta}$$
and
$$q'_{k+1} \ge L'(\Delta - q_k/L') \ge e^{N\delta} L' \log L'$$
which show that $w'$ is N-good.
3.2. Normal slits have enough normal children. Assume w is a normal slit
of length $L \ge L_0$ and slope α. Let $q_k$ be the largest in $\mathrm{Spec}(\alpha) \cap [1, L^{1+\delta}]$ and
define $n_1 \ge 1$ uniquely by $q_k = e^{n_1\delta} L \log L$.
Lemma 3.6 (Enough children). The slit w has at least $O(L^{\delta}/\log L)$ (n − 1)-good
children, where $n := \min(n_1, N + 1)$. Moreover, if $w'$ is such a child with length $L'$ and
slope $\alpha'$, then $w' = w + 2(p'_{k'}, q'_{k'})$ and $q'_{k'+1} \in [L' \log L', (L')^{1+\delta}]$ for some
$q'_{k'} \in \mathrm{Spec}(\alpha')$.
Lemma 3.7 (Most children are normal). The number of children constructed
in the previous lemma that are not normal is at most $O(L^{\delta-\delta^2} \log L)$.
Proof of Proposition 3.2 assuming the above lemmas. A slit $w \in V''$ has
enough normal children. These are all longer than $L_0$ and therefore lie in $V''$.
Proof of Lemma 3.6. Applying Theorem 3 with $Q = e^{n\delta} L \log L$ and
$R = L^{1+\delta}$ gives $O(L^{\delta}/\log L)$ children of the form $w' = w + 2(p, q)$ where
$\gcd(p, q) = 1$, $q \in [L^{1+\delta}, 2L^{1+\delta}]$ and $|q\alpha - p|^{-1} \ge e^{n\delta} L \log L$.
Observe that $L' = L + 2q$ so that $|\alpha' - p/q| = L|q\alpha - p|/(L'q) \le 1/(2q^2)$.
Since $\gcd(p, q) = 1$, (3) implies $q = q'_{k'} \in \mathrm{Spec}(\alpha')$ for some index $k'$.
It remains to bound the next height $q'_{k'+1} \in \mathrm{Spec}(\alpha')$. Using the LHS of
(2) together with the lower bound on $|q\alpha - p|^{-1}$, we have
$$q'_{k'+1} \ge \frac{L'}{L|q\alpha - p|} - q'_{k'} \ge L'(e^{n\delta}\log L - 1/2) \ge e^{(n-1)\delta} L' \log L'$$
since $L \ge L_0$. Using the RHS of (2) and $L \ge L_0$ again
$$q'_{k'+1} \le \frac{L'}{Lq|\alpha - p/q|} \le L' L^{\delta} \le (L')^{1+\delta}.$$
(In the second step we used the fact that $|\alpha - p/q| \ge 1/L^{2+2\delta}$ which, by
Lemma 2.4, holds for all the children with a finite number of exceptions.)
Proof of Lemma 3.7. Let $\tilde{V}$ be the collection of slits formed by the children
constructed in Lemma 3.6 that are not normal. We show that $\tilde{V}$ has at most
$O(L^{\delta-\delta^2} \log L)$ elements. Observe that if $n_1 \ge N + 1$, all children constructed
are N-good, hence normal, by Lemma 3.5. In this case, $\tilde{V}$ is empty and we
have nothing to prove. Therefore, we may assume $n_1 < N + 1$.
The next lemma will allow us to count the number of elements in $\tilde{V}$.
Lemma 3.7. Let $w'$ be a slit in $\tilde{V}$ of length $L'$ and slope $\alpha'$. Then (i)
the largest $q'_{l'} \in \mathrm{Spec}(\alpha') \cap [1, (L')^{1+\delta}]$ lies in $[L' \log L', e^{N\delta} L' \log L']$ and (ii) it
satisfies the inequality below for at most finitely many possible values of a.
$$|L(q'_{l'}\alpha - p'_{l'}) \pm 2a| \le \frac{1}{L^{n_2\delta + n_2\delta^2}} \tag{4}$$
where $n_2 := \max(1, n_1 - 1)$.
Proof. By definition $q'_{l'+1} = (L')^{1+n'\delta}$ for some $n' \ge 1$. Since $w'$ is $(n_1 - 1)$-good,
$q'_{l'} = e^{n''\delta} L' \log L'$ for some $n'' \ge n_1 - 1 \ge 0$. By Lemma 3.5, $w'$ is not
N-good so that $n'' < N$; this proves (i).
Since $n_1 < N + 1$, the fact that $w'$ is not normal implies $n' \ge n'' \ge n_2$.
Let $q'_{k'} \in \mathrm{Spec}(\alpha')$ be as in Lemma 3.6 and recall $q'_{k'+1} \in [L' \log L', (L')^{1+\delta}]$.
By definition of $q'_{l'}$, $q'_{l'} \ge q'_{k'+1}$. The recurrence relations satisfied by convergents
imply that $q'_{l'} = a q'_{k'+1} + b q'_{k'}$ for some integers $a > 0$ and $b \ge 0$. By (i),
$a \le e^{N\delta}$.
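Write $w' = w + 2v$ and note that $|w' \times v| = |w \times v|$. Since the cross product
of consecutive convergents is ±1 (thought of as vectors),
$$L'|q'_{l'}\alpha' - p'_{l'}| = |L(q'_{l'}\alpha - p'_{l'}) \pm 2a|.$$
The RHS of (2), $L' = L + 2q \ge L^{1+\delta}$, and $n' \ge n_2$ imply
$$L'|q'_{l'}\alpha' - p'_{l'}| \le \frac{L'}{q'_{l'+1}} \le \frac{1}{(L^{1+\delta})^{n'\delta}} \le \frac{1}{L^{n_2\delta + n_2\delta^2}}$$
and (ii) follows.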
Lemma 3.7 allows us to write $\tilde{V}$ as a finite union of subsets $\tilde{V}_{\pm a}$. Let $Q_{\pm a}$
denote the corresponding set of heights $q'_{l'}$ associated to the slits in $\tilde{V}_{\pm a}$. The
next two lemmas complete the proof of Lemma 3.7.
Lemma 3.8. $\tilde{V}_{\pm a}$ and $Q_{\pm a}$ have the same number of elements.
Proof. We need to show that the map $\tilde{V}_{\pm a} \to Q_{\pm a}$ sending $w'$ to $q'_{l'}$ is
injective. Let $w''$ be different from $w'$ with corresponding image $q''_{l''}$. Note that
since $|\alpha' - p'_{l'}/q'_{l'}| \le 1/((L')^{1+\delta} q'_{l'})$ is small compared to the distance between the
slopes of $w'$ and $w''$ (Lemma 2.4), the rationals $p'_{l'}/q'_{l'}$ and $p''_{l''}/q''_{l''}$ are distinct.
Their heights differ because the interval containing them is smaller than $1/q'_{l'}$.
Lemma 3.9. Each $Q_{\pm a}$ is a union of at most O(log L) subsets, each
having at most $O(L^{\delta-\delta^2})$ elements.
Proof. Let $q'_{l'}, q''_{l''} \in Q_{\pm a}$ and set $\bar{q} := |q'_{l'} - q''_{l''}|$. We claim that if $\bar{q} \le L^{1+\delta}$
then $\bar{q} = dq_k$ for some positive integer $d \in O(L^{\delta-\delta^2})$. This implies the lemma
since we either have $\bar{q} > L^{1+\delta}$ or $\bar{q} \le L^{1+\delta}$ so that the elements of $Q_{\pm a}$ fall
into O(log L) clusters, each having $O(L^{\delta-\delta^2})$ elements.
Hence, assume $\bar{q} \le L^{1+\delta}$ and set $\bar{p} = |p'_{l'} - p''_{l''}|$. Let p/q be the reduced
form of $\bar{p}/\bar{q}$ so that $\bar{q} = dq$ where $d = \gcd(\bar{p}, \bar{q})$. From (4) and the triangle
inequality
$$|\bar{q}\alpha - \bar{p}| \le \frac{2}{L^{1+n_2\delta+n_2\delta^2}}.$$
Since $q \le \bar{q} \le L^{1+\delta}$, $|\alpha - p/q| \le 1/2q^2$. By (3) p/q is a convergent of α; since
$\gcd(p, q) = 1$, $q = q_{k'} \in \mathrm{Spec}(\alpha)$ for some index $k'$. By hypothesis, $q \le L^{1+\delta}$,
so we must have $k' \le k$. In fact, we must have $k' = k$ because $k' < k$ implies
$|\bar{q}\alpha - \bar{p}| = d|q_{k'}\alpha - p_{k'}| \ge 1/(q_k + q_{k-1}) \ge 1/2q_k$ which contradicts the previous
inequality. Using the LHS of (2), we now have
$$d = \frac{|\bar{q}\alpha - \bar{p}|}{|q_k\alpha - p_k|} \le \frac{2(q_{k+1} + q_k)}{L^{1+n_2\delta+n_2\delta^2}}$$
which is $O(L^{(n_3-n_2)\delta - n_2\delta^2})$ where $n_3$ is defined by $q_{k+1} = L^{1+n_3\delta}$. Since w is
normal, $n_3 \le n_1$. This together with (4) implies $n_3 - n_2 \le n_1 - n_2 \le 1$ and
$n_2 \ge 1$; thus proving the claim.
4. Counting rationals in intervals
Primitive lattice points in the parallelogram Σ correspond to rationals in
the interval $I := \left[\alpha - \frac{1}{RQ},\ \alpha + \frac{1}{RQ}\right]$. Set
$$\Lambda_I := \{(x, y) \in \mathbb{R}^2 : x/y \in I,\ R \le y \le 2R\}.$$
Theorem 4. If $\mathrm{Spec}(\alpha) \cap [Q, R] \ne \emptyset$ and $R/Q \ge 16$, then $\mathrm{dens}(\Lambda_I) \ge 1/24$.
Proof of Theorem 3. We use $A_0 = 16$ and $\rho_0 = 1/32$. Observe that Σ
contains $\Lambda_{I'}$ where $I'$ is concentric with I and half as wide. Moreover, $\Lambda_{I'}$
occupies three quarters of its area. Replacing Q with 2Q in Theorem 4 and
assuming $\mathrm{Spec}(\alpha) \cap [2Q, R] \ne \emptyset$, we conclude
$\mathrm{dens}(\Sigma) \ge (3/4)\,\mathrm{dens}(\Lambda_{I'}) \ge \rho_0$.
If $\mathrm{Spec}(\alpha) \cap [2Q, R] = \emptyset$, let $q_k$ be the largest height in $\mathrm{Spec}(\alpha) \cap [Q, R]$.
By the RHS of (2) it is easy to show the reduced fraction
$$\frac{p}{q} = \frac{ap_k + p_{k-1}}{aq_k + q_{k-1}} \quad \text{where } a = 1, 2, 3, \dots$$
(known as an intermediate fraction of α when $a \le a_{k+1}$) satisfies
$$\left|\alpha - \frac{p}{q}\right| \le \frac{q_k + |q_{k+1} - q|}{q} \cdot \frac{1}{q_k q_{k+1}}. \tag{5}$$
These correspond to $R/q_k$ primitive lattice points in Σ, and since $q_k \le 2Q$, it
follows easily that $\mathrm{dens}(\Sigma) \ge 1/4 \ge \rho_0$.
Lemma 4.1. If J has rational endpoints of height at most R, $\mathrm{dens}(\Lambda_J) \ge 1/6$.
Proof. We shall first prove the lemma under the additional hypotheses:
(i) the height of any rational in int(J) is greater than R, and
(ii) $|pq' - p'q| = 1$, where p/q and $p'/q'$ are the endpoints of J.
By (ii), arithmetic density is preserved by the linear map γ which sends the
standard basis to lattice points corresponding to the endpoints of J. Note that
$$\gamma^{-1}(\Lambda_J) \subset \Delta := \{(x, y) \in \mathbb{R}^2 : x/a + y/a' \le 2,\ x, y > 0\}$$
where $a = R/q$ and $a' = R/q'$. Let n(∆) denote the number of primitive lattice
points in int(∆). By (i), it is enough to show $n(\Delta) \ge aa'/4$.
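Without loss of generality, assume $a' \ge a \ge 1$. There are two cases. If
$a' \le 2$ then $aa' \le 4$, and since $(1, 1) \in \Delta$ we have $n(\Delta) \ge 1 \ge aa'/4$. On the
other hand, if $a' > 2$, then since $\gamma(1, 1) \in \Lambda_J$, we have $1/a + 1/a' \ge 1$ so that
$a \in [1, 2]$. Considering pairs of the form (1, n), we find
$$n(\Delta) \ge \left\lfloor a'\left(2 - \frac{1}{a}\right) \right\rfloor > aa'\left(\frac{2}{a} - \frac{1}{a^2} - \frac{1}{2}\right) \ge \frac{aa'}{4}.$$
This completes the proof assuming the additional hypotheses.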
Note that every interval is a disjoint union of intervals satisfying (i).
Hence, the lemma follows if we show (i) ⇒ (ii). Indeed, let $d = |pq' - p'q|$.
There is a linear map in $\mathrm{GL}_2(\mathbb{Z})$ that takes (p, q) to (0, 1) and $(p', q')$ to $(d, d')$
for some integer $d'$, $0 \le d' < d$. If $d' > 0$ then (1, 1) is contained in the
triangle determined by the origin, (1, 0) and $(d, d')$ and corresponds to a rational
of height at most R in int(J). Therefore, (i) implies $d' = 0$, and since
$\gcd(d, d') = 1$, this in turn implies that d = 1, giving (ii).
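The height $q''$ of a rational $p''/q''$ strictly between p/q and $p'/q'$ is at least $q + q'$:
$$\frac{1}{qq'} = \frac{p}{q} - \frac{p''}{q''} + \frac{p''}{q''} - \frac{p'}{q'} \ge \frac{1}{qq''} + \frac{1}{q''q'}.$$
This will be used several times in the next proof.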
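The height bound for rationals between two neighbors can be confirmed by brute force. The sketch below is an illustration only: the function name `min_height_between` is hypothetical, the endpoints are assumed to be nonnegative fractions with lo < hi, and the search is capped at an arbitrary `max_den`.

```python
from fractions import Fraction

def min_height_between(lo, hi, max_den):
    """Smallest denominator d <= max_den of a fraction strictly between lo and hi.

    Assumes 0 <= lo < hi. Returns None if no such fraction is found.
    """
    for d in range(1, max_den + 1):
        # Smallest numerator n with n/d strictly greater than lo.
        n = (lo.numerator * d) // lo.denominator + 1
        if Fraction(n, d) < hi:
            return d
    return None

# 1/3 and 1/2 satisfy |p q' - p' q| = 1, so the bound predicts height >= 3 + 2.
smallest = min_height_between(Fraction(1, 3), Fraction(1, 2), 50)
```

In this example the first fraction found is the mediant 2/5, whose height equals q + q′ = 5, so the bound is attained.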
Proof of Theorem 4. Let $\alpha^-$ and $\alpha^+$ denote the left and right endpoints
of I, respectively. Let $q_k$ be the largest height in $\mathrm{Spec}(\alpha) \cap [Q, R]$. By (2),
$p_k/q_k \in I$ and without loss of generality, we assume $p_k/q_k \le \alpha$. Let $q_l$ be the
largest height in $\mathrm{Spec}(\alpha^+) \cap [1, R]$. Since $p_k/q_k$ cannot lie strictly between $p_l/q_l$
and $p_{l+1}/q_{l+1}$, it must be the case that $p_k/q_k \le p_l/q_l$. In fact, strict inequality
must hold because $q_l = q_k \ge Q$ and $|\alpha^+ - p_k/q_k| \le 1/q_l q_{l+1} < 1/RQ$ give a
contradiction.
Let $J := [p_k/q_k, p_l/q_l]$. We claim its length is at least 1/2RQ. In fact,
$p_l/q_l$ lies within 1/2RQ of $\alpha^+$ if $q_l \ge 2Q$. On the other hand, $q_l \le 2Q$ implies
$|J| \ge 1/q_k q_l \ge 1/2RQ$. In either case, $|J| \ge 1/2RQ$.
We may assume $\alpha^+ \le p_l/q_l$, for otherwise $J \subset I$ and $\mathrm{dens}(\Lambda_I) \ge 1/24$.
There are three cases. First, if $q_{l+1} + 3q_l > 2R$ there can be at most two
rationals with height at most 2R that lie strictly between $p_l/q_l$ and $p_{l+1}/q_{l+1}$:
$$\frac{p_{l+1}}{q_{l+1}} < \frac{p_{l+1} + p_l}{q_{l+1} + q_l} < \frac{p_{l+1} + 2p_l}{q_{l+1} + 2q_l} < \frac{p_l}{q_l}.$$
Let $n(\Lambda_I)$ denote the number of primitive lattice points in $\Lambda_I$. By Lemma 4.1,
$$n(\Lambda_I) \ge \frac{\mathrm{area}(\Lambda_J)}{6} - 2 \ge \left(\frac{1}{12} - \frac{2Q}{3R}\right)\mathrm{area}(\Lambda_I)$$
and since $R/Q \ge 16$, $\mathrm{dens}(\Lambda_I) \ge 1/24$.
Next, suppose $q_{l+1} + 3q_l \le 2R$ and $q_l \ge 2Q$. Note that an intermediate
fraction of $\alpha^+$ with height between $q_l$ and $q_{l+1}$ lies to the left of $\alpha^+$. Let
$J' := [p_k/q_k, p/q]$ where p/q is the intermediate fraction with the largest height
not exceeding R. By definition, $q > R - q_l$. From (5) we have
$$\left|\alpha^+ - \frac{p}{q}\right| \le \frac{q_l + q_{l+1} - q}{q} \cdot \frac{1}{q_l q_{l+1}} \le \frac{2R - 2q_l - q}{R - q_l} \cdot \frac{1}{2RQ} \le \frac{1}{2RQ}$$
so that $|J'| \ge 1/2RQ$ and $\mathrm{dens}(\Lambda_I) \ge 1/24$.
Finally, assume that $q_{l+1} + 3q_l \le 2R$ and $q_l < 2Q$. Again, we consider
the intermediate fractions of $\alpha^+$. Observe that their heights are at most 2R,
since $q_{l+1} \le 2R$. They form a sequence that increases towards $\alpha^+$ from the left.
Given a consecutive pair with heights less than R we can always find a rational
strictly in between them with height in [R, 2R]. It follows that the number of
rationals in I with height in [R, 2R] is at least $(q_{l+1} - q)/q_l$, where q is the
height of the first intermediate fraction that falls into I. According to (5), an
intermediate fraction lies in I as soon as its height is greater than
$$R' := \frac{(q_l + q_{l+1})RQ}{2q_l q_{l+1} + RQ}.$$
Hence, $q - q_l \le R'$ and
$$n(\Lambda_I) \ge \frac{q_{l+1} - q}{q_l} \ge \frac{q_{l+1} - R'}{q_l} - 1 = \frac{2q_{l+1}^2 - RQ}{2q_l q_{l+1} + RQ} - 1 \ge \frac{2R^2 - RQ}{9RQ} - 1 = \frac{2}{27}\left(1 - \frac{5Q}{R}\right)\mathrm{area}(\Lambda_I).$$
Since $R/Q \ge 16$, we get $\mathrm{dens}(\Lambda_I) \ge 1/24$.
Northwestern University, Evanston, IL
Appendix
By Michael Boshernitzan
With notation as in the introduction, for λ ∈ [0, 1), set
$$h(\lambda) \stackrel{\mathrm{def}}{=} \mathrm{H.dim}\,\mathrm{NE}(Q_\lambda). \tag{6}$$
Recall (see Introduction) that h(λ) ≤ 1/2 for all λ [Ma], and, by Theorem 1,
that h(λ) = 1/2 for all Diophantine λ. We also have h(λ) = 0 for rational λ
(then the set $\mathrm{NE}(Q_\lambda)$ is in fact countable [V1]). The main result in this section is
given by the following theorem.
Theorem 5. The set of λ ∈ [0, 1) for which $\mathrm{H.dim}\,\mathrm{NE}(Q_\lambda) = 0$ forms a
residual subset of [0, 1). In particular, there are irrational λ ∈ [0, 1) such that
h(λ) = 0.
Recall that a subset A ⊂ X of a topological space X is called residual (or
topologically large) if it contains a dense $G_\delta$-subset of X. A subset Y ⊂ X is
called a $G_\delta$-set (in X) if Y is a countable intersection of open subsets of X.
Its complement X \ Y is called an $F_\sigma$-set.
We remark that no irrational number λ satisfying h(λ) = 0 is known even
though the set is topologically large (in particular, uncountable). Any such λ
must be Liouville. (Note that the set of Liouville numbers forms a residual set
of Lebesgue measure 0.)
A.1. $\mathbb{Z}_2$ skew products of irrational rotations
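Let $X = S^1_0 \cup S^1_1 = S^1 \times \{0, 1\}$ be the union of two unit circles
$S^1_n = S^1 \times \{n\}$, $n \in \{0, 1\}$, and consider the two-parameter family of transformations
$$\rho_{\alpha,\lambda} : X \to X, \quad \alpha \in \mathbb{R},\ \lambda \in K = [0, 1),$$
defined as follows. For x = (s, n) ∈ X,
$$\rho_{\alpha,\lambda}(x) = \rho_{\alpha,\lambda}(s, n) = (s \oplus \alpha, n'), \qquad n' = \begin{cases} n, & \text{if } 0 \le s < \lambda, \\ 1 - n, & \text{if } \lambda \le s < 1, \end{cases} \tag{7}$$
where x = (s, n) ∈ X, $s, s \oplus \alpha \in S^1 = \mathbb{R}/\mathbb{Z} = [0, 1)$, and ⊕ stands for the
group operation in $S^1$.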
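The transformation in (7) is straightforward to iterate. The following sketch is for illustration only: the names `rho` and `orbit` are hypothetical, and the parameters are taken to be rational (`Fraction`) purely so that the comparison with λ and the reduction modulo 1 are exact, whereas the systems of interest have irrational α.

```python
from fractions import Fraction

def rho(s, n, alpha, lam):
    """One step of the map (7): rotate s by alpha on the circle [0, 1) and
    keep the sheet n if 0 <= s < lam, otherwise switch to the sheet 1 - n."""
    s_new = (s + alpha) % 1
    n_new = n if s < lam else 1 - n
    return s_new, n_new

def orbit(s, n, alpha, lam, steps):
    """The points x, rho(x), rho(rho(x)), ... for the given number of steps."""
    points = [(s, n)]
    for _ in range(steps):
        s, n = rho(s, n, alpha, lam)
        points.append((s, n))
    return points

# Illustrative rational parameters (an assumption made for exact arithmetic).
example = orbit(Fraction(1, 4), 0, Fraction(1, 2), Fraction(1, 3), 2)
```

Starting from s = 1/4 on sheet 0 with λ = 1/3, the first step stays on sheet 0 because s < λ, and the second step switches to sheet 1 because 3/4 ≥ λ.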
The dynamical systems $(X, \rho_{\alpha,\lambda})$ have been studied by Veech [V1] as particular
$\mathbb{Z}_2$ skew product extensions of irrational α-rotations. Indeed, $\rho_{\alpha,\lambda}$ may
be interpreted as the first return map to a disjoint union of two circles embedded
in the surface associated to $Q_\lambda$. In particular, properties of billiards on
$Q_\lambda$ reduce to the study of dynamical systems $(X, \rho_{\alpha,\lambda})$. One verifies that the
ergodicity of the billiard system $Q_\lambda$ in direction θ is equivalent to the ergodicity
of the map $\rho_{\alpha,\lambda}$ with α = tan(θ) (the slope in direction θ). Denote
$$\mathrm{NE}(X) = \{(\alpha, \lambda) \in \mathbb{R} \times [0, 1) \mid (X, \rho_{\alpha,\lambda}) \text{ is not ergodic}\},$$
and, for λ ∈ [0, 1),
$$\mathrm{NE}(X_\lambda) = \{\alpha \in \mathbb{R} \mid (\alpha, \lambda) \in \mathrm{NE}(X)\} = \{\alpha \in \mathbb{R} \mid (X, \rho_{\alpha,\lambda}) \text{ is not ergodic}\}. \tag{8}$$
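The sets $\mathrm{NE}(X_\lambda)$, $\mathrm{NE}(Q_\lambda)$ have the same Hausdorff dimensions because
$$\mathrm{NE}(X_\lambda) = \mathrm{NE}(X_\lambda) + 1 = \tan(\mathrm{NE}(Q_\lambda)) = \{\tan(\theta) \mid \theta \in \mathrm{NE}(Q_\lambda)\}.$$
Thus we have (see (6))
$$h(\lambda) = \mathrm{H.dim}\,\mathrm{NE}(Q_\lambda) = \mathrm{H.dim}\,\mathrm{NE}(X_\lambda). \tag{9}$$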
A.2. The topological lemma
The following lemma is central in the proof of Theorem 5.
Lemma A.1. Let L be a $G_\delta$-subset of a σ-compact metric space K. Let
P be a Polish space (a separable space with a complete metric topology). Let H be an
$F_\sigma$-subset of the cartesian product W = K × P. For every p ∈ P, denote
$$K(p) = \{k \in K \mid (k, p) \in H\} \subset K \tag{10}$$
and
$$P_o = \{p \in P \mid K(p) \subset L\}. \tag{11}$$
If $P_o$ is dense in P, then $P_o$ is a residual subset (i.e., contains a dense
$G_\delta$-subset) of P. In particular, $P_o$ is uncountable if P is.
Proof. Since the family of residual subsets of P is closed under countable
intersections, we assume (as we may without loss of generality) that K
is compact and H is closed in W = K × P.
Since P is separable and $P_o$ is dense in P, there is a countable subset
$P_c \subset P_o$ which is dense in P. We assume that the points of $P_c$ are arranged in
one sequence $\{p_i\}$ so that every point $p \in P_c$ is repeated infinitely many times,
i.e. $p_i = p$, for infinitely many i ≥ 1.
Let $\pi_P : W \to P$, $\pi_K : W \to K$ be canonical projections. Denote
$$K_i = K(p_i) = \pi_K(\pi_P^{-1}(p_i) \cap H), \tag{12}$$
$$K'_i = K_i \times \{p_i\} = \pi_P^{-1}(p_i) \cap H \subset L \times \{p_i\}. \tag{13}$$
Since $L$ is a $G_\delta$-subset of $K$, $L$ has a representation
$$L = \bigcap_{i \ge 1} U_i, \qquad K \supset U_1 \supset U_2 \supset U_3 \supset \cdots,$$
where $\{U_i\}$ forms (without loss of generality) a nonincreasing sequence of open
subsets of $K$.
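Fix any integer $i \ge 1$. The set $H_i = H \setminus \pi_K^{-1}(U_i) \subset W$ is closed, and so
is the set $\pi_P(H_i) \subset P$ (the projection $\pi_P : W \to P$ is a closed map since $K$ is
compact). One observes that $p_i \notin \pi_P(H_i)$, since $K_i \subset L \subset U_i$ and hence
$$\pi_P^{-1}(p_i) \cap H = K_i' = K_i \times \{p_i\} \subset (U_i \times \{p_i\}) \cap H \subset \pi_K^{-1}(U_i) \cap H.$$
Set $V_i = P \setminus \pi_P(H_i)$; this is an open subset of $P$ containing $p_i$.
Therefore the set $V = \bigcap_{k \ge 1} \bigl( \bigcup_{i \ge k} V_i \bigr)$ is a dense $G_\delta$-subset of $P$ (it contains
the dense subset $P_c$).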
To complete the proof, we have to verify that $V \subset P_o$. If $v \in V$, then
$v \in V_i$ for an infinite set of $i$. For all those $i$,
$$v \notin \pi_P(H_i) = \pi_P(H_i \cap \pi_K^{-1}(K \setminus U_i)),$$
which implies $K(v) \subset U_i$. It follows that $K(v) \subset L = \bigcap_{i \ge 1} U_i$, and therefore
$v \in P_o$. This completes the proof of Lemma A.1.
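The last step of the proof uses the monotonicity of the sequence $\{U_i\}$. Spelled out as a short sketch (the index set $I(v)$ is notation introduced here for illustration and does not appear in the source):

```latex
% I(v): hypothetical name for the set of indices i with v in V_i; it is infinite for v in V.
\[
  I(v) = \{\, i \ge 1 \mid v \in V_i \,\}, \qquad |I(v)| = \infty .
\]
% For any j >= 1 choose i in I(v) with i >= j. Since the U_i are nonincreasing,
\[
  K(v) \subset U_i \subset U_j \ \text{ for every } j \ge 1
  \quad\Longrightarrow\quad
  K(v) \subset \bigcap_{j \ge 1} U_j = L .
\]
```

This is the only place where the assumption that the $U_i$ are nested is needed; without it, one would obtain $K(v) \subset U_i$ only along the subsequence $I(v)$.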
A.3. The completion of the proof
Let $\mathcal{H}$ be a separable Hilbert space. Denote by $\mathcal{L}(\mathcal{H})$ the Banach algebra
of bounded linear operators on $\mathcal{H}$, and by $\mathcal{U}(\mathcal{H})$ the subset of unitary operators
on $\mathcal{H}$. We recall that convergence $T_i \to T$ in the strong (or weak) operator
topology means convergence $T_i f \to Tf$, for all $f \in \mathcal{H}$, in the strong (or weak,
respectively) topology of the Hilbert space $\mathcal{H}$. The strong and weak operator
topologies on $\mathcal{L}(\mathcal{H})$ are different, but they coincide when restricted to the set
$\mathcal{U}(\mathcal{H}) \subset \mathcal{L}(\mathcal{H})$ (see e.g. Halmos [Ha, pp. 61–80]).
Now we view $X = S^1 \times \{0, 1\}$ as a probability measure space $(X, \mathcal{B}, \mu)$ with
$\mu$ a multiple of the Lebesgue measure, $d\mu = ds/2$. Take $\mathcal{H} = L^2(X, \mathcal{B}, \mu)$, and
denote by $\mathcal{G}(X)$ the family of invertible measure-preserving transformations
$T : X \to X$. Then $\mathcal{G}(X)$ is naturally imbedded into $\mathcal{U}(\mathcal{H})$: for $T \in \mathcal{G}(X)$ and
$f \in \mathcal{H} = L^2(X, \mathcal{B}, \mu)$ define $T(f(x)) = f(T(x)) \in \mathcal{H}$. The subspace topology
on $\mathcal{G}(X)$ induced by the strong (equivalently, weak) operator topology on $\mathcal{U}(\mathcal{H})$
is called the weak topology on $\mathcal{G}(X)$. The following result is well known (see [Ha,
p. 80]).
Lemma A.2. The set
$$\mathcal{E}(X) = \{T \in \mathcal{G}(X) \mid T \text{ is ergodic}\} \subset \mathcal{G}(X)$$
is a dense $G_\delta$-subset of $\mathcal{G}(X)$ (in the weak topology of $\mathcal{G}(X)$).
Now let $K = S^1$ and $P = [0, 1)$, and define $W = K \times P = S^1 \times [0, 1)$, just
as in Lemma A.1. The map $\varphi : W \to \mathcal{G}(X)$ defined by the formula (see (7))
$$\varphi(w) = \rho_w = \rho_{k,p}, \quad \text{for } w = (k, p) \in W = K \times P,$$
is easily verified to be continuous. In view of Lemma A.2, the set $\varphi^{-1}(\mathcal{E}(X))$
is a $G_\delta$-subset of $W$, and thus the complement
$$H \overset{\mathrm{def}}{=} W \setminus \varphi^{-1}(\mathcal{E}(X)) = \{w = (k, p) \in W \mid \varphi(w) = \rho_{k,p} \text{ is not ergodic}\}$$
is an $F_\sigma$-subset of $W$.
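Denote by $\mathbb{Q}(P)$, $\mathbb{Q}(K)$ the sets of rationals in $P$ and $K = \mathbb{R}/\mathbb{Z} = [0, 1)$,
respectively. Fix an arbitrary $G_\delta$-subset $L$ in $K$ such that
$$\mathbb{Q}(K) \subset L \subset K, \qquad \operatorname{H.dim} L = 0. \tag{14}$$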
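Sets $L$ satisfying (14) do exist. One possible construction, given here as a sketch and not taken from the source, shrinks open arcs around an enumeration of the rationals in $K$; the enumeration $\{q_j\}$, the radii $2^{-nj}$ and the sets $G_n$ are choices made purely for illustration.

```latex
% Illustrative construction (an assumption of this sketch, not the author's choice of L).
% {q_j}_{j>=1} enumerates the rationals Q(K) in K = R/Z;
% B(q, r) denotes the open arc of radius r about q in K.
\[
  G_n = \bigcup_{j \ge 1} B\bigl(q_j,\, 2^{-nj}\bigr), \qquad
  L = \bigcap_{n \ge 1} G_n \supset \mathbb{Q}(K).
\]
% Each G_n is open, so L is a G_delta-set. For every s > 0, the arcs forming G_n
% cover L, have diameters at most 2 * 2^{-nj} <= 2^{1-n}, and satisfy
\[
  \sum_{j \ge 1} \bigl(2 \cdot 2^{-nj}\bigr)^{s}
  = \frac{2^{s}\, 2^{-ns}}{1 - 2^{-ns}} \xrightarrow[n \to \infty]{} 0 .
\]
% Hence the s-dimensional Hausdorff measure of L is zero for every s > 0,
% that is, H.dim L = 0.
```

Any other $G_\delta$-set with the two properties in (14) serves equally well in the argument; only the inclusion of the rationals and the vanishing dimension are used.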
To satisfy the conditions of Lemma A.1, it remains to verify that $P_o$ is
dense in $P$. For every $\lambda \in P$, we have $K(\lambda) = \mathrm{NE}(X_\lambda)$ (see (8), (10)), and
therefore
$$P_o = \{\lambda \in P \mid \mathrm{NE}(X_\lambda) \subset L\}.$$
For $\lambda \in \mathbb{Q}(P)$, Veech [V1] proved that $\mathrm{NE}(X_\lambda) = \mathbb{Q}(K)$. Since $L \supset \mathbb{Q}(K)$
by the choice of $L$, the inclusion $\mathbb{Q}(P) \subset P_o$ holds. Thus $P_o$ is indeed dense
in $P$, and, by Lemma A.1, $P_o$ is a residual subset of $P = [0, 1)$.
For every $\lambda \in P_o$, we have $\mathrm{NE}(X_\lambda) \subset L$; thus $\operatorname{H.dim} \mathrm{NE}(X_\lambda) = 0$ in view
of (14). This completes the proof of Theorem 5.
Rice University, Houston, TX
References
[Fa] K. Falconer, Fractal Geometry. Mathematical Foundations and Applications, John Wiley & Sons Ltd., Chichester, 1990.
[Fu] H. Furstenberg, Strict ergodicity and transformation of the torus, Amer. J. Math. 83 (1961), 573–601.
[Ha] P. Halmos, Lectures on ergodic theory, Math. Soc. of Japan, no. 3 (1956), 1–99.
[Kh] A. Ya. Khintchine, Continued Fractions, translated by Peter Wynn, P. Noordhoff Ltd., Groningen, 1963.
[KMS] S. Kerckhoff, H. Masur, and J. Smillie, Ergodicity of billiard flows and quadratic differentials, Ann. of Math. 124 (1986), 293–311.
[Ma] H. Masur, Hausdorff dimension of the set of nonergodic foliations of a quadratic differential, Duke Math. J. 66 (1992), 387–442.
[MS] H. Masur and J. Smillie, Hausdorff dimension of sets of nonergodic measured foliations, Ann. of Math. 134 (1991), 455–543.
[MT] H. Masur and S. Tabachnikov, Rational billiards and flat structures, in Handbook of Dynamical Systems 1A (2002), 1015–1089.
[V1] W. Veech, Strict ergodicity in zero dimensional dynamical systems and the Kronecker-Weyl theorem mod 2, Trans. Amer. Math. Soc. 140 (1969), 1–34.
[V2] W. Veech, The billiard in a regular polygon, Geom. Funct. Anal. 2 (1992), 341–379.
(Received July 23, 2001)
(Revised October 21, 2002)