ENTROPY IN NONCOMPACT RANK ONE SITUATIONS, AND HAUSDORFF DIMENSION

S. KADYROV AND A. POHL

Abstract. Recently, Einsiedler and the authors provided a bound in terms of escape of mass for the amount by which upper-semicontinuity for metric entropy fails for diagonal flows on homogeneous spaces Γ\G, where G is any connected semisimple Lie group of real rank 1 with finite center, and Γ is any nonuniform lattice in G. We show that this bound is sharp, and apply the methods used to establish bounds for the Hausdorff dimension of the set of points which diverge on average.

1. Introduction

Let G be a connected semisimple Lie group of R-rank 1 with finite center and Γ a nonuniform lattice in G. Further let a ∈ G \ {1} be chosen such that its adjoint action Ad_{a} on the Lie algebra g of G is R-diagonalizable. The element a acts on the homogeneous space X := Γ\G by right multiplication, defining the (generator of the) discrete geodesic flow

(1) T: X → X, x ↦ xa.

The normalized Haar measure m on X uniquely realizes the maximal metric entropy h_{m}(T) of T (see below for more details). The following relation between metric entropies of T and escape of mass along T-invariant probability measures on X has been proven in [EKP]. We note that the limit measure ν does not need to be a probability measure.

Theorem. Let (µ_{j})_{j∈N} be a sequence of T-invariant probability measures on X which converges to the measure ν in the weak* topology. Then

(2) ν(X) h_{ν/ν(X)}(T) + \frac{1}{2} h_{m}(T)·(1 − ν(X)) ≥ \limsup_{j→∞} h_{µ_{j}}(T),

where it does not matter how we interpret h_{ν/ν(X)}(T) if ν(X) = 0.

Since Γ is not cocompact, upper semi-continuity of metric entropy cannot be expected on X. The theorem above shows that the amount by which it may fail is controlled by the escaping mass. In this formula, the factor 1/2 is significant: it shows that the amount of failure is only half as bad as it could be a priori (which would be the factor 1).

2010 Mathematics Subject Classification. Primary: 37A35, 37D40, Secondary: 28D20, 22D40.

Key words and phrases. Hausdorff dimension, divergent on average, escape of mass, entropy, diagonal flows.

S.K. acknowledges the support by the EPSRC. A.P. acknowledges the support by the SNF (Grant 200021-127145) and by the ERC Starting Grant ANTHOS.

The first aim of this article is to show that the factor 1/2 is best possible. More precisely, we will establish the following theorem.

Theorem 1.1. For any c ∈ [\frac{1}{2} h_{m}(T), h_{m}(T)], there exists a convergent sequence of T-invariant probability measures (µ_{j})_{j∈N} on X with lim_{j→∞} h_{µ_{j}}(T) = c such that its weak* limit ν satisfies

ν(X) = \frac{2c}{h_{m}(T)} − 1.

For any such sequence (µ_{j}), equality holds in (2) as well as

h_{ν/ν(X)}(T) = h_{m}(T) for ν(X) ≠ 0

(and hence ν/ν(X) is the normalized Haar measure on X).
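The interplay between the bound (2) and the mass formula of Theorem 1.1 can be checked numerically. The following Python sketch is only an illustration: the function names and the sample value of h_{m}(T) are assumptions made here and do not appear in the paper.

```python
def entropy_upper_bound(mass, h_limit, h_max):
    """Left-hand side of (2): mass * h_{nu/nu(X)}(T) + (1/2) * h_m(T) * (1 - mass).

    mass    -- nu(X), the mass that does not escape, in [0, 1]
    h_limit -- entropy of the normalized limit nu/nu(X) (ignored if mass == 0)
    h_max   -- h_m(T), the maximal metric entropy
    """
    if not 0.0 <= mass <= 1.0:
        raise ValueError("mass must lie in [0, 1]")
    limit_term = mass * h_limit if mass > 0 else 0.0
    return limit_term + 0.5 * h_max * (1.0 - mass)


def limit_mass(c, h_max):
    """Mass nu(X) = 2c / h_m(T) - 1 from Theorem 1.1, for c in [h_max/2, h_max]."""
    if not 0.5 * h_max <= c <= h_max:
        raise ValueError("c must lie in [h_max / 2, h_max]")
    return 2.0 * c / h_max - 1.0


if __name__ == "__main__":
    h_max = 1.0  # hypothetical value of h_m(T)
    for c in (0.5, 0.75, 1.0):
        mass = limit_mass(c, h_max)
        # With h_{nu/nu(X)}(T) = h_m(T), the left-hand side of (2) equals c.
        print(c, mass, entropy_upper_bound(mass, h_max, h_max))
```

Substituting ν(X) = 2c/h_{m}(T) − 1 and h_{ν/ν(X)}(T) = h_{m}(T) into the left-hand side of (2) gives h_{m}(T)(1 + ν(X))/2 = c, which is what the loop prints in its last column.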

The second aim of this article is to relate the factor 1/2 to the Hausdorff dimension of the set of points which diverge on average. We recall that a point x ∈ X is said to diverge on average (with respect to T) if for any compact subset K of X we have

\lim_{n→∞} \frac{1}{n} |{ i ∈ {0, 1, . . . , n−1} | T^{i}(x) ∈ K }| = 0.

It is said to be divergent (with respect to T) if its forward trajectory under T eventually leaves any compact subset, that is, if for any compact subset K of X we find N ∈ N such that for n > N we have T^{n}x ∉ K.
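To see the difference between the two notions, it can help to look at visit frequencies of toy itineraries. The Python sketch below is purely illustrative: the two itineraries are hypothetical 0/1 sequences chosen here, not orbits computed in X.

```python
def visit_frequency(is_in_K, n):
    """Fraction of times i in {0, ..., n-1} at which the itinerary is in K."""
    return sum(1 for i in range(n) if is_in_K(i)) / n


def visits_at_powers_of_two(i):
    """Toy itinerary: in K exactly at the times 1, 2, 4, 8, ... (infinitely often)."""
    return i > 0 and (i & (i - 1)) == 0


def visits_finitely_often(i):
    """Toy itinerary of divergent type: in K only for i < 5."""
    return i < 5


if __name__ == "__main__":
    for n in (2 ** 10, 2 ** 15, 2 ** 20):
        print(n, visit_frequency(visits_at_powers_of_two, n),
              visit_frequency(visits_finitely_often, n))
```

The first itinerary returns to K infinitely often, so it is not of divergent type, yet its visit frequency tends to 0. This is the pattern that distinguishes divergence on average from divergence.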

Obviously, each divergent point diverges on average. Let
U := {u ∈ G | a^{n}ua^{−n} → 1 as n → ∞}

denote the unstable subgroup with respect to a. From [Dan85] and also from [EKP] it follows that the Hausdorff dimension of the set of divergent points is dim G − dim U. However, for the set of points diverging on average we prove that its Hausdorff dimension is strictly larger than dim G − dim U. Moreover, we also obtain an upper estimate showing that its dimension is strictly less than the full dimension. To state these results more precisely, let

D := {x ∈ X | x diverges on average}.

The Lie group G has at most two positive roots, namely a short one, denoted α, and a long one 2α. Let

p_{1} := dim g_{α} and p_{2} := dim g_{2α}.

The group G has a single positive root if and only if it consists of isometries of a real hyperbolic space. In this case, we set p_{1} = 0 or p_{2} = 0 (both cases are possible and relevant, see Section 2).

Theorem 1.2. For the Hausdorff dimension of D we have the estimates

dim G − \frac{1}{2} dim U − \frac{p_{2}}{2} ≤ dim D ≤ dim G − \frac{1}{2} dim U + \frac{p_{1}}{4}.

The proof of Theorem 1.2 shows that the factor 1/2 of dim U arises for the same reason as the factor 1/2 in (2). If G consists of isometries of a real hyperbolic space, we obtain the following improvement. It is caused by the fact that in this case, the adjoint action of a has a single eigenvalue of modulus greater than 1.

Theorem 1.3. Suppose that G consists of isometries of a real hyperbolic space. Then

dim D = dim G − \frac{1}{2} dim U.

Therefore, it seems natural to expect the following precise value for the Hausdorff dimension of D.

Conjecture 1.4. If G is any R-rank 1 connected semisimple Lie group with finite center, then dim_{H} D = dim G − \frac{1}{2} dim U.

For the homogeneous spaces SL_{d+1}(Z)\SL_{d+1}(R), d ≥ 1, and the action of a certain singular diagonal element of SL_{d+1}(R), the analog of Theorem 1.1 has been proven in [Kad12]. For d = 2, the Hausdorff dimension of the set of points which diverge on average is shown in [EK12] to be 6 + 4/3.
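The estimates of Theorem 1.2 are easy to tabulate. The Python sketch below uses dim U = p_{1} + p_{2} (the unstable subgroup is modelled on the two root spaces). The function name and the sample triples (dim G, p_{1}, p_{2}) are assumptions chosen here for illustration: (3, 1, 0) and (3, 0, 1) stand for the two parametrizations of a three-dimensional group acting on the hyperbolic plane, and (8, 2, 1) for an eight-dimensional group with both roots present.

```python
from fractions import Fraction


def dimension_bounds(dim_G, p1, p2):
    """Lower and upper estimate for dim D from Theorem 1.2, as exact fractions."""
    dim_U = p1 + p2
    half_dim_U = Fraction(dim_U, 2)
    lower = dim_G - half_dim_U - Fraction(p2, 2)
    upper = dim_G - half_dim_U + Fraction(p1, 4)
    return lower, upper


if __name__ == "__main__":
    # Real hyperbolic case: the choice p2 = 0 gives the sharp lower bound,
    # the choice p1 = 0 gives the sharp upper bound.
    print(dimension_bounds(3, 1, 0)[0], dimension_bounds(3, 0, 1)[1])
    # A case with both roots: the two estimates no longer coincide.
    print(dimension_bounds(8, 2, 1))
```

In the real hyperbolic case the two admissible choices p_{2} = 0 and p_{1} = 0 make the lower and the upper estimate coincide with dim G − (1/2) dim U, in line with Theorem 1.3.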

2. Preliminaries

The Lie algebra g of the Lie group G is the direct sum of a simple Lie algebra of rank 1 and a compact one. The compact component does not have any influence on the dynamics considered here (cf. [EKP]). For this reason, we assume throughout that g is a simple Lie algebra of rank 1 and, correspondingly, that G is a connected simple Lie group of R-rank 1 with finite center. This allows us to work with a coordinate system for G which is adapted to the dynamics, and G can be realized as the isometry group of a Riemannian symmetric space of rank 1 and noncompact type. For more background information on this coordinate system we refer to [CDKR91, CDKR98].

Coordinate system. Let A be the maximal one-parameter subgroup of G of diagonalizable elements which contains a, the chosen generator for the discrete geodesic flow T. Then there exists a group homomorphism α: A → (R_{>0}, ·) such that α(a) > 1 and g decomposes into the direct sum

(3) g = g_{−2} ⊕ g_{−1} ⊕ c ⊕ g_{1} ⊕ g_{2},

where

g_{j} := { X ∈ g | ∀ ã ∈ A: Ad_{ã}X = α(ã)^{j/2}X }, j ∈ {±1, ±2},

and c is the Lie algebra of the centralizer C = C_{A}(G) of A in G. The homomorphism α is the square root of the “group analog” of the root α in the Introduction. If g is not isomorphic to so(1, n), n ∈ N, the decomposition (3) is the restricted root space decomposition of g. If g is isomorphic to so(1, n) for some n ∈ N (which is equivalent to saying that G consists of isometries of a real hyperbolic space), either g_{1} or g_{2} is trivial. In this case, both

g = g_{−1} ⊕ c ⊕ g_{1} and g = g_{−2} ⊕ c ⊕ g_{2}

are restricted root space decompositions of g. The first one corresponds to the Cayley-Klein models of real hyperbolic spaces, the second one to the Poincaré models (see [CDKR91, CDKR98]). In any case, let n := g_{2} ⊕ g_{1} and let N be the connected, simply connected Lie subgroup of G with Lie algebra n. Further pick a maximal compact subgroup K of G such that

N × A × K → G, (n, ã, k) ↦ nãk (Iwasawa decomposition)

is a diffeomorphism, and let

M := K ∩ C.

The semidirect product NA is parametrized by

R_{>0} × g_{2} × g_{1} → NA, (s, Z, X) ↦ exp(Z + X)·a_{s}

with α(a_{s}) = s, a_{s} ∈ A. Let θ be a Cartan involution of g such that the Lie algebra k of K is its 1-eigenspace, and let B denote the Killing form. Further let

p_{1} := dim g_{1} and p_{2} := dim g_{2}.

On n we define an inner product via

⟨X, Y⟩ := −\frac{1}{p_{1} + 4p_{2}} B(X, θY) for X, Y ∈ n.

This specific normalization yields that the Lie bracket [·,·] of g, even though it is indefinite, satisfies the Cauchy-Schwarz inequality

|[X, Y]| ≤ |X||Y|

for X, Y ∈ n (see [Poh10]). We may identify G/K ≅ NA ≅ R_{>0} × g_{2} × g_{1} with the space

D := { (t, Z, X) ∈ R × g_{2} × g_{1} | t > \frac{1}{4}|X|^{2} }

via

R_{>0} × g_{2} × g_{1} → D, (t, Z, X) ↦ (t + \frac{1}{4}|X|^{2}, Z, X).

With the linear map J: g_{2} → End(g_{1}), Z ↦ J_{Z},

⟨J_{Z}X, Y⟩ := ⟨Z, [X, Y]⟩ for all X, Y ∈ g_{1},

the geodesic inversion σ of D at the origin (1, 0, 0) is given by (see [CDKR98])

(4) σ(t, Z, X) = \frac{1}{t^{2} + |Z|^{2}} (t, −Z, (−t + J_{Z})X).

We shall identify σ with the element in K which acts as in (4). Then G has the Bruhat decomposition

(5) G = NAM ∪ NAMσN.

To modify this Bruhat decomposition into one which is tailored to the dynamics on X, we recall the following result on fundamental domains of Siegel type. For s > 0 let

A_{s} := {a_{t} ∈ A | t > s},

and for any compact subset η of N define the Siegel set

Ω(s, η) := ηA_{s}K.

Proposition 2.1 (Theorem 0.6 and 0.7 in [GR70]). There exists s_{0} > 0, a compact subset η_{0} of N and a finite subset Ξ of G such that

(i) G = ΓΞΩ(s_{0}, η_{0}),

(ii) for all ξ ∈ Ξ, the group Γ ∩ ξNξ^{−1} is a cocompact lattice in ξNξ^{−1},

(iii) for all compact subsets η of N the set

{γ ∈ Γ | γΞΩ(s_{0}, η) ∩ Ω(s_{0}, η) ≠ ∅}

is finite,

(iv) for each compact subset η of N containing η_{0}, there exists s_{1} > s_{0} such that for all ξ_{1}, ξ_{2} ∈ Ξ and all γ ∈ Γ with γξ_{1}Ω(s_{0}, η) ∩ ξ_{2}Ω(s_{1}, η) ≠ ∅ we have ξ_{1} = ξ_{2} and γ ∈ ξ_{1}NMξ_{1}^{−1}.

Throughout we fix a choice for η_{0}, s_{1} (with η = η_{0}) and Ξ. The elements of Ξ are representatives for the cusps of X (and will also be called cusps). Note that σNσ = U is the unstable subgroup with respect to a, and the conjugation of σ(1, Z, X)σ ∈ U by a is given by

a^{−k}σ(1, Z, X)σa^{k} = σ(1, α(a)^{−k}Z, α(a)^{−k/2}X)σ (k ∈ Z).

Multiplying (5) with ξ ∈ Ξ from the left and σ from the right yields

G = ξNAMσ ∪ ξNAMU.

Maximal entropy. Let M_{1}(X)^{T} denote the set of T-invariant probability measures on X. For each µ ∈ M_{1}(X)^{T} let h_{µ}(T) denote the metric entropy of T with respect to µ. In [EL10, Section 7.8] it is shown that the maximal metric entropy

max{h_{µ}(T) | µ ∈ M_{1}(X)^{T}}

of T is uniquely realized by the normalized Haar measure m on X, and it is given by

h_{m}(T) = (\frac{p_{1}}{2} + p_{2}) log α(a).

Normalization. If the element a in (1) changes (within A) then all metric entropies scale by the same factor. Thus, for proving Theorems 1.1-1.3 we may and will assume throughout that a is chosen such that

α(a) = e (e = exp(1)),

so that T is the time-one geodesic flow.
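The entropy formula and the effect of the normalization can be made concrete with a few lines of Python. This is a minimal sketch; the function name and the sample values of p_{1}, p_{2} are assumptions made here.

```python
import math


def maximal_entropy(p1, p2, alpha_a=math.e):
    """h_m(T) = (p1/2 + p2) * log(alpha(a)); alpha_a defaults to the normalization e."""
    if alpha_a <= 1:
        raise ValueError("alpha(a) must be greater than 1")
    return (p1 / 2 + p2) * math.log(alpha_a)


if __name__ == "__main__":
    # Hypothetical root multiplicities p1 = 2, p2 = 1.
    print(maximal_entropy(2, 1))               # normalized generator, alpha(a) = e
    print(maximal_entropy(2, 1, math.e ** 2))  # replacing a by a^2 doubles the entropy
```

Under the normalization α(a) = e the maximal entropy is simply p_{1}/2 + p_{2}, and rescaling a within A multiplies it by log α(a), as stated in the normalization paragraph.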

The height function and an improved choice of s_{1}. In the following we recall the definition of the height function on X from [EKP] and those of its properties that are significant for us. For any ξ ∈ Ξ consider the ξ-Iwasawa decomposition G = ξNAK. For g ∈ G let s = s_{ξ}(g) > 0 be given by g ∈ ξNa_{s}K. Then s is indeed well-defined. For x ∈ X, its ξ-height is

ht_{ξ}(x) = sup{s_{ξ}(g) | Γg = x}.

Its height is

ht(x) = max{ht_{ξ}(x) | ξ ∈ Ξ}.

For s > 0 we set

X_{<s} = {x ∈ X : ht(x) < s} and X_{≥s} = {x ∈ X : ht(x) ≥ s}.

The constant s_{1} in Proposition 2.1 can be chosen such that

(i) if for x ∈ X and ξ ∈ Ξ, we have ht_{ξ}(x) > s_{1}, then ht(x) = ht_{ξ}(x),

(ii) if for x ∈ X, we have ht(x) > s_{1} and ht(x) > ht(xa), then the T-orbit of x strictly descends below height s_{1} before it can rise again. This means that there exists n ∈ N such that for j = 0, . . . , n−1, we have ht(xa^{j}) > ht(xa^{j+1}) and ht(xa^{n}) ≤ s_{1}, and

(iii) if x ∈ X and ht_{ξ}(x) > s_{1} for some ξ ∈ Ξ, then there is (at least one) element g = ξna_{r}mu ∈ ξNAMU or g = ξna_{r}mσ ∈ ξNAMσ which realizes ht_{ξ}(x). That is, x = Γg and ht_{ξ}(x) = s_{ξ}(g). The components a_{r} and u do not depend on the choice of g.

We suppose from now on that s_{1} satisfies these properties.

For any point x ∈ X which is high in some cusp, we have the following explicit formulas for the calculation of the height of the initial part of its orbit.

Proposition 2.2 ([EKP]). Let x ∈ X, ξ ∈ Ξ and suppose that ht_{ξ}(xa^{k}) > s_{1} for all k ∈ {0, . . . , n}.

(i) If ht_{ξ}(x) is realized by g = ξna_{r}mσ ∈ ξNAMσ, then

ht_{ξ}(xa^{k}) = re^{−k}.

(ii) If ht_{ξ}(x) is realized by g = ξna_{r}mu ∈ ξNAMU with u = σ(1, Z, X)σ, then

ht_{ξ}(xa^{k}) = r \frac{e^{−k}}{(e^{−k} + \frac{1}{4}|X|^{2})^{2} + |Z|^{2}}.
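The two height formulas of Proposition 2.2 can be evaluated directly. The Python sketch below is an illustration only: the function names and the sample values of r, |Z| and |X| are assumptions made here, and the formulas are only meaningful as long as the heights stay above s_{1}.

```python
import math


def height_sigma_case(r, k):
    """Case (i): the height decays like r * e^{-k}."""
    return r * math.exp(-k)


def height_unstable_case(r, k, norm_Z, norm_X):
    """Case (ii): r * e^{-k} / ((e^{-k} + |X|^2 / 4)^2 + |Z|^2)."""
    e = math.exp(-k)
    return r * e / ((e + 0.25 * norm_X ** 2) ** 2 + norm_Z ** 2)


if __name__ == "__main__":
    r = 1.0
    norm_Z, norm_X = math.exp(-5), 0.0  # hypothetical small unstable component
    for k in range(11):
        print(k, height_unstable_case(r, k, norm_Z, norm_X))
```

With X = 0 and |Z| = e^{−5} the height equals r/(e^{−k} + e^{k−10}), so the orbit first climbs into the cusp, peaks at k = 5 and then descends again. For Z = X = 0 the formula reduces to re^{k}.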

Riemannian metric on G and metric on X. The isomorphism

n = g_{2} × g_{1} → N, (Z, X) ↦ exp(Z + X),

induces an inner product on N from the inner product on n. Then the isomorphism N → U, n ↦ σnσ, induces an inner product on U, and using the inverse of the exponential map, also on n̄ := g_{−2} × g_{−1}.

We pick a left G-invariant Riemannian metric on G, which on the tangent space T_{1}G ≅ g reproduces the inner products on n and n̄. Let d_{G} denote the induced left-G-invariant metric on G. For r > 0 let B_{r}^{G}, B_{r}^{U}, resp. B_{r}^{NAM} denote the r-balls in G, U, resp. NAM around 1 ∈ G. We define

λ_{0} := min{|λ| | λ is an eigenvalue of Ad_{a} with |λ| > 1}.

Thus,

λ_{0} = \begin{cases} e & \text{if } g_{1} = \{0\} \text{ (and hence G/K is a real hyperbolic space)}, \\ e^{1/2} & \text{otherwise.} \end{cases}

Then for any L ≥ 0 we have

a^{L}B_{r}^{U}a^{−L} ⊆ B^{U}_{λ_{0}^{−L}r}

or, in other words,

d_{G}(ua^{−L}, va^{−L}) ≤ λ_{0}^{−L} d_{G}(u, v) ≤ d_{G}(u, v)

for u, v ∈ U. Further

c max{|Z|, |X|} ≤ d_{G}(1, σ(1, Z, X)σ)

for some constant c > 0 and all u = σ(1, Z, X)σ ∈ U. In order to avoid carrying too many constants through the calculation, we may assume that c = 1. The induced metric d_{X} on X is given by

d_{X}(x, y) := min{d_{G}(g, h) | x = Γg, y = Γh}.

We usually omit the subscripts of d_{G} and d_{X}.

Finally, to shorten notation, we use

[0, n] := {0, . . . , n}

for n ∈ N. The context will always clarify whether [0, n] refers to this discrete interval or a standard interval in R.

3. Upper bound on Hausdorff dimension

Recall that

D = {x ∈ X | x diverges on average}.

Theorem 3.1. The Hausdorff dimension of D is bounded from above by

(i) dim D ≤ dim G − \frac{1}{2} dim U + \frac{p_{1}}{4}.

If p_{2} = 0, then

(ii) dim D ≤ dim G − \frac{1}{2} dim U.

The proof of this theorem builds on Lemma 3.2 below, which easily follows from the contraction rate of the unstable direction under the action of a.

Lemma 3.2. Let µ be a probability measure on X of dimension at most β. Then, for any r > 0, any x ∈ X and any L ∈ N we have

µ(xa^{L}B_{r}^{U}a^{−L}B_{r}^{NAM}) ≤ c r^{β} e^{(dim NAM + \frac{p_{1}}{2} − β)L}.

If p_{2} = 0, this bound can be improved to

µ(xa^{L}B_{r}^{U}a^{−L}B_{r}^{NAM}) ≤ c r^{β} e^{(dim NAM − β)\frac{L}{2}}.

Here, c is a constant only depending on µ.

Proof of Theorem 3.1. The claimed bound on the Hausdorff dimension of D follows using the method used to prove Theorem 1.4 and Corollary 1.5 in [EK12], using Lemmas 8.4 and 8.5 in [EKP] as well as Lemma 3.2.

4. Lower bound on Hausdorff dimension

In this section we prove the following lower bound on Hausdorff dimension:

Theorem 4.1. The Hausdorff dimension of the set of points in X which diverge on average is at least

dim G − \frac{1}{2} dim U − \frac{p_{2}}{2}.

As a tool we use a lower estimate on the Hausdorff dimension of the limit set of strongly tree-like collections provided by [KM96, §4.1] (which goes back to [Fal86], [McM87], [Urb91], and [PW94]).

Let U_{0} be a compact subset of U and let λ be the Lebesgue measure on U (using the identification U ≅ R^{p_{2}} × R^{p_{1}}). A countable collection 𝒰 of compact subsets of U_{0} (a subset of the power set of U_{0}) is said to be strongly tree-like if there exists a sequence (𝒰_{j})_{j∈N_{0}} of finite nonempty collections on U_{0} with 𝒰_{0} = {U_{0}} such that

𝒰 = ⋃_{j∈N_{0}} 𝒰_{j}

and

(6) ∀ j ∈ N_{0} ∀ A, B ∈ 𝒰_{j}: either A = B or λ(A ∩ B) = 0,

(7) ∀ j ∈ N ∀ B ∈ 𝒰_{j} ∃ A ∈ 𝒰_{j−1} such that B ⊆ A,

(8) d_{j}(𝒰) := sup_{A∈𝒰_{j}} diam(A) → 0 as j → ∞.

Note that (6) implies λ(A) > 0 for all A ∈ 𝒰. For a strongly tree-like collection 𝒰 with fixed sequence (𝒰_{j})_{j∈N_{0}} we let

(9) 𝐔_{j} := ⋃_{A∈𝒰_{j}} A for any j ∈ N_{0}.

Clearly, 𝐔_{j} ⊆ 𝐔_{j−1} for any j ∈ N. Further we call the nonempty set

(10) 𝐔_{∞} := ⋂_{j∈N_{0}} 𝐔_{j}

the limit set of 𝒰. For any subset B of U_{0} and any j ∈ N we define the j-th stage density of B in 𝒰 to be

δ_{j}(B, 𝒰) := \begin{cases} 0 & \text{if } λ(B) = 0, \\ \frac{λ(𝐔_{j} ∩ B)}{λ(B)} & \text{if } λ(B) > 0. \end{cases}

Note that δ_{j}(B, 𝒰) ≤ 1. Finally, for any j ∈ N_{0} we define the j-th stage density of 𝒰 to be

∆_{j}(𝒰) := inf_{B∈𝒰_{j}} δ_{j+1}(B, 𝒰).

Lemma 4.2 ([KM96]). For any strongly tree-like collection 𝒰 of subsets of U_{0} we have

dim_{H}(𝐔_{∞}) ≥ dim U − \limsup_{j→∞} \frac{|\sum_{i=0}^{j−1} log(∆_{i}(𝒰))|}{|log(d_{j}(𝒰))|}.
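For finitely many stages the quotient in Lemma 4.2 can be computed explicitly. The Python sketch below reads the quotient as |Σ_{i<j} log ∆_{i}| / |log d_{j}|, which is an interpretation assumed here (the stage densities are at most 1, so their logarithms are non-positive). The function name and the Cantor-type sample data are likewise assumptions for illustration and do not come from the paper.

```python
import math


def dimension_lower_bound(dim_U, stage_densities, diameter_j):
    """Finite-stage version of the bound in Lemma 4.2.

    dim_U           -- dimension of the ambient group U
    stage_densities -- [Delta_0, ..., Delta_{j-1}], each in (0, 1]
    diameter_j      -- d_j, the supremal diameter at stage j, in (0, 1)
    """
    if not 0 < diameter_j < 1:
        raise ValueError("diameter_j must lie in (0, 1)")
    if any(not 0 < d <= 1 for d in stage_densities):
        raise ValueError("stage densities must lie in (0, 1]")
    log_sum = abs(sum(math.log(d) for d in stage_densities))
    return dim_U - log_sum / abs(math.log(diameter_j))


if __name__ == "__main__":
    # Middle-thirds Cantor construction on an interval (dim_U = 1):
    # every stage keeps 2/3 of the measure and the diameters are 3^{-j}.
    j = 20
    print(dimension_lower_bound(1, [2 / 3] * j, 3.0 ** (-j)))
    print(math.log(2) / math.log(3))
```

For the middle-thirds construction the bound equals log 2 / log 3 at every stage, the known Hausdorff dimension of the Cantor set, which serves as a sanity check for this reading of the formula.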

4.1. Construction of a strongly tree-like collection. We construct a strongly tree-like collection such that its limit set consists only of points which diverge on average. This construction proceeds in several steps.

Proposition 4.3. Let s > 39s_{1} and R ∈ N. Then there exists x ∈ X_{≤s} such that for any η in the interval (0, 1/2) there exists a subset E of B^{U}_{ηe^{−R/4}} with S = ⌊e^{R/2}⌋^{p_{2}} ⌊e^{R/4}⌋^{p_{1}} elements such that

(i) for all u ∈ E, the points xu and T^{R}(xu) are contained in X_{≤s},

(ii) for any two distinct elements u, v ∈ E we have d(T^{R}(u), T^{R}(v)) ≥ η,

(iii) for all u ∈ E and all k ∈ [0, R] we have T^{k}(xu) ∈ X_{>s/39}.

We may choose for x any element Γg with

g ∈ {ξna_{r}mσ(1, Z_{0}, X_{0})σ | n ∈ N, r ∈ I, m ∈ M},

where ξ ∈ Ξ is any cusp, I is a specific interval in R of positive length and (1, Z_{0}, X_{0}) is a specific point in N, both being specified in the proof. Thus, the dimension of the set of possible x is at least dim(NAM).

Proof. Fix a cusp ξ ∈ Ξ and pick an element (Z_{0}, X_{0}) ∈ g_{2} × g_{1} with |Z_{0}| = \frac{3}{2}e^{−R/2} and |X_{0}| = \frac{3}{2}e^{−R/4}. Define

g := ξna_{r}mσ(1, Z_{0}, X_{0})σ and x := Γg

with n ∈ N, m ∈ M. Set

B := { (Z, X) ∈ g_{2} × g_{1} | |Z| ≤ ηe^{−R/2}, |X| ≤ ηe^{−R/4} }.

In the following we will estimate the height of xa^{k}, k ∈ [0, R], and deduce an allowed range for r such that x satisfies (iii) and (i) for all elements in σBσ.

Since the height does not depend on n and m, we omit these two elements. Let (Z, X) ∈ B. Recall that

gσ(1, Z, X)σ = ξa_{r}σ(1, Z_{0} + Z + \frac{1}{2}[X_{0}, X], X_{0} + X)σ.

Then

(11) e^{−R/4} < |X_{0} + X| < 2e^{−R/4}

and, using |[X_{0}, X]| ≤ |X_{0}||X|,

(12) \frac{5}{8}e^{−R/2} < |Z_{0} + Z + \frac{1}{2}[X_{0}, X]| < 3e^{−R/2}.

Let k ∈ [0, R]. Recall that

(13) ht_{ξ}(xσ(1, Z, X)σa^{k}) = r · \frac{e^{−k}}{(e^{−k} + \frac{1}{4}|X_{0} + X|^{2})^{2} + |Z_{0} + Z + \frac{1}{2}[X_{0}, X]|^{2}}

for sufficiently large r (calculated below). Using the upper bounds in (11) and (12) it follows that

ht_{ξ}(xσ(1, Z, X)σa^{k}) > \frac{r}{13}.

Hence, (iii) is satisfied for r > \frac{s}{3} (note that then \frac{r}{13} > \frac{s}{39} > s_{1}). Moreover, for these r, [EKP, Proposition 5.5] shows that

ht(xσ(1, Z, X)σa^{n}) = ht_{ξ}(xσ(1, Z, X)σa^{n}).

Using the lower bounds in (11) and (12) we find

ht(xσ(1, Z, X)σa^{k}) ≤ \frac{r}{e^{−k} + \frac{1}{2}e^{−R/2} + \frac{25}{64}e^{k−R}}.

For r ≤ \frac{25}{64}s, this implies ht(xσ(1, Z, X)σa^{k}) ≤ s for k ∈ {0, R} and hence (i).

To define the set E, we may pick pairwise distinct elements

(Z_{i}, X_{j}) ∈ B, i = 1, . . . , ⌊e^{R/2}⌋^{p_{2}}, j = 1, . . . , ⌊e^{R/4}⌋^{p_{1}}

such that

|Z_{k} − Z_{ℓ}| ≥ ηe^{−R}, |X_{k} − X_{ℓ}| ≥ ηe^{−R/2}

whenever k ≠ ℓ. Define

E := { σ(1, Z_{i}, X_{j})σ | i = 1, . . . , ⌊e^{R/2}⌋^{p_{2}}, j = 1, . . . , ⌊e^{R/4}⌋^{p_{1}} }.

For any two distinct elements σ(1, Z, X)σ, σ(1, Z′, X′)σ ∈ E we have

d(σ(1, Z, X)σa^{R}, σ(1, Z′, X′)σa^{R}) ≥ max{ |Z − Z′ + \frac{1}{2}[X, X′]| e^{R}, |X − X′| e^{R/2} }.

If X ≠ X′, then

d(σ(1, Z, X)σa^{R}, σ(1, Z′, X′)σa^{R}) ≥ |X − X′|e^{R/2} ≥ η.

If X = X′, then

d(σ(1, Z, X)σa^{R}, σ(1, Z′, X′)σa^{R}) ≥ |Z − Z′|e^{R} ≥ η.

This completes the proof.
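The numerical constants in this proof can be double-checked by machine. Inserting the upper bounds of (11) and (12) into (13) gives the lower height estimate r/(e^{−k} + 2e^{−R/2} + 10e^{k−R}); this intermediate expression is a computation carried out here and is not displayed in the proof. The Python sketch below, whose function names are assumptions, evaluates both height estimates and the cardinality S.

```python
import math


def height_bounds(r, k, R):
    """Lower and upper estimate for the height at time k in [0, R]."""
    lower = r / (math.exp(-k) + 2 * math.exp(-R / 2) + 10 * math.exp(k - R))
    upper = r / (math.exp(-k) + 0.5 * math.exp(-R / 2) + (25 / 64) * math.exp(k - R))
    return lower, upper


def cardinality_of_E(R, p1, p2):
    """S = floor(e^{R/2})^{p2} * floor(e^{R/4})^{p1}."""
    return math.floor(math.exp(R / 2)) ** p2 * math.floor(math.exp(R / 4)) ** p1


if __name__ == "__main__":
    s = 1.0            # hypothetical height threshold
    r = 25 / 64 * s    # largest admissible r
    R = 8
    for k in range(R + 1):
        lower, upper = height_bounds(r, k, R)
        print(k, lower > r / 13, upper)
    print(cardinality_of_E(4, 1, 1))
```

For k ∈ [0, R] the denominator of the lower estimate stays below 13, which gives ht > r/13. At the endpoints k = 0 and k = R the denominator of the upper estimate is at least 25/64, so that r ≤ (25/64)s forces the height to be at most s. The admissible range (s/3, (25/64)s] for r is nonempty since 1/3 < 25/64.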

To simplify notation we use the following convention: Given a sequence (S_{k})_{k∈N} of positive natural numbers, for any n ∈ N we let

𝐒_{n} := {(i_{1}, . . . , i_{n}) | 1 ≤ i_{j} ≤ S_{j}, j = 1, . . . , n} = [1, S_{1}] × · · · × [1, S_{n}]

be the set of n-multi-indices with entries 1, . . . , S_{j} in the j-th component. If 𝐢 = (i_{1}, . . . , i_{n}) ∈ 𝐒_{n} and j ∈ [1, S_{n+1}], then we set

(𝐢, j) := (i_{1}, . . . , i_{n}, j) ∈ 𝐒_{n+1}.

Finally we let

𝐒 := ⋃_{n∈N} 𝐒_{n}.

We let

B_{ε}(K) := {x ∈ X | d(K, x) < ε}

denote the ε-thickening of the set K ⊆ X.
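The multi-index convention corresponds to a Cartesian product of integer ranges. A minimal Python sketch follows; the function names and the sample sequence are assumptions made here.

```python
from itertools import product


def multi_indices(S, n):
    """All n-multi-indices (i_1, ..., i_n) with 1 <= i_j <= S[j-1]."""
    return list(product(*(range(1, S[j] + 1) for j in range(n))))


def extend(i, j):
    """The multi-index (i, j) obtained by appending j to the multi-index i."""
    return i + (j,)


if __name__ == "__main__":
    S = [2, 3, 4]  # hypothetical values S_1, S_2, S_3
    print(multi_indices(S, 2))
    print(extend((1, 2), 3) in multi_indices(S, 3))
```

The number of n-multi-indices is the product S_{1} · · · S_{n}, and appending an index j ∈ [1, S_{n+1}] to an n-multi-index produces an (n+1)-multi-index. This is the tree structure that indexes the elements constructed in Theorem 4.4.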

Theorem 4.4. Let K be a compact subset of X. For any k ∈ N choose natural numbers R_{k}, S_{k} ∈ N such that there exist a subset E^{(k)} ⊆ U of cardinality S_{k} and a point x_{k} ∈ K such that for any u ∈ E^{(k)} we have

(14) x_{k}u, T^{R_{k}}(x_{k}u) ∈ K.

Then for any 𝐢 ∈ 𝐒 there exists g_{𝐢} ∈ U such that, if we define

E′_{n} := {g_{𝐢} | 𝐢 ∈ 𝐒_{n}} for n ∈ N,

the following properties are satisfied:

(i) E′_{1} = E^{(1)},

(ii) for any m ∈ N there exists an enumeration of E^{(m)} by [1, S_{m}], say

E^{(m)} = { u^{(m)}_{1}, . . . , u^{(m)}_{S_{m}} },

and for any η > 0 there exists R′ = R′(η, K) ∈ N (independent of the choice of the g_{𝐢}’s) such that with

F(k) := \sum_{i=1}^{k−1} R_{i} + (k−1)R′, k ∈ N,

we have

(15) d(T^{F(n)+R_{n}}g_{𝐢}, T^{F(n)+R_{n}}g_{(𝐢,j)}) < η

for any n ∈ N, 𝐢 ∈ 𝐒_{n}, and j ∈ [1, S_{n+1}], and

(16) T^{F(k)}(x_{1}g_{𝐢}) ∈ x_{k}u^{(k)}_{i_{k}} B^{NAM}_{η/2} a^{R_{k}} B^{U}_{η/2} a^{−R_{k}}

for any n ∈ N, any 𝐢 = (i_{1}, . . . , i_{n}) ∈ 𝐒_{n} and any k ∈ [1, n].

If, in addition, η_{0} > 0 is an injectivity radius of B_{ε}(K) for some (fixed) ε > 0, and

E^{(k)} ⊆ B^{U}_{η_{0}/4} for all k ∈ N, and

d(T^{R_{k}}u, T^{R_{k}}v) ≥ η_{0}

for any distinct u, v ∈ E^{(k)}, any k ∈ N, and in (ii) we have

η < min{ \frac{η_{0}(λ_{0} − 1)}{4λ_{0}}, \frac{ε}{2} },

then

(iii) for any n ∈ N, the set E′_{n} has the cardinality of 𝐒_{n}, and

(iv) for any n ∈ N, any distinct 𝐢, 𝐣 ∈ 𝐒_{n} we have

η_{0} > d(g_{𝐢}, g_{𝐣}) and d(T^{F(n)+R_{n}}g_{𝐢}, T^{F(n)+R_{n}}g_{𝐣}) > \frac{η_{0}}{2}.

The proof of Theorem 4.4 is based on Lemmas 4.5-4.7 below. Throughout these lemmas we let K be a fixed compact subset of X.

Recall that the set UNAM is a neighborhood of 1 ∈ G. We fix ε_{1} > 0 such that B^{G}_{ε_{1}} ⊆ UNAM. The Shadowing Lemma 4.5 below uses the fact that the subgroups NAM and U intersect in the neutral element 1 only.

Lemma 4.5 (Shadowing Lemma). There exists c > 0 such that for any ε ∈ (0, ε_{1}) and x_{−}, x_{+} ∈ X with d(x_{−}, x_{+}) < ε there exist u^{+} ∈ B^{U}_{cε} and u ∈ B^{NAM}_{cε} such that

(17) x_{−}u^{+} = x_{+}u.

Proof. There exists g ∈ G with d(g, 1) < ε such that x_{−}g = x_{+}. Write g = u^{+}u^{−1} with u ∈ NAM and u^{+} ∈ U. Then d(u^{+}, 1) < cε and d(u, 1) < cε and x_{−}u^{+} = x_{+}u. Now continuity of the decomposition, continuous dependence of c on u^{+} and u, and the bounded range for ε imply a uniform constant c.

The compactness of K and the topological mixing of T imply the following lemma.

Lemma 4.6. For any η > 0 and any δ > 0 there exists R′ = R′(δ, K, η) ∈ N such that for any z_{−}, z_{+} ∈ B_{η}(K) and ℓ ≥ R′ there exists z′ ∈ X such that d(z′, z_{−}) < δ and d(z_{+}, T^{ℓ}(z′)) < δ.

The proof of the following lemma is a combination of Lemmas 4.5 and 4.6.

Lemma 4.7. Let η > 0 and let z_{−} and z_{+} be in B_{η}(K). Let c be as in the Shadowing Lemma 4.5. For any δ > 0 let R′ = R′(δ, K, η) be as in Lemma 4.6. Then there exist u^{+} ∈ B^{U}_{c(c+2)δ} and u ∈ B^{NAM}_{c(c+2)δ} such that

T^{R′}(z_{−}u^{+}) = z_{+}u.

Proof. Throughout we will assume that δ < \frac{ε_{1}}{c+1} to be able to apply the Shadowing Lemma 4.5. If the statement is proven for these small δ, it holds a fortiori for larger δ. We first use Lemma 4.6 to obtain z′ ∈ X such that

(18) d(z′, z_{−}) < δ and d(z_{+}, T^{R′}(z′)) < δ.

Then we apply Lemma 4.5 with x_{−} = z_{−}, x_{+} = z′ and ε = δ to obtain u^{+}_{1} ∈ B^{U}_{cδ} and u_{1} ∈ B^{NAM}_{cδ} such that

(19) z_{−}u^{+}_{1} = z′u_{1}.

The distance between T^{R′}(z_{−}u^{+}_{1}) and z_{+} is bounded as follows:

d(T^{R′}(z_{−}u^{+}_{1}), z_{+}) = d(T^{R′}(z′u_{1}), z_{+})
≤ d(T^{R′}(z′u_{1}), T^{R′}z′) + d(T^{R′}z′, z_{+})
< (c + 1)δ.

We apply again Lemma 4.5, this time for x_{−} = T^{R′}(z_{−}u^{+}_{1}), x_{+} = z_{+} and ε = (c + 1)δ to obtain u^{+}_{2} ∈ B^{U}_{c(c+1)δ} and u ∈ B^{NAM}_{c(c+1)δ} such that

T^{R′}(z_{−}u^{+}_{1})u^{+}_{2} = z_{+}u.

Now T^{R′}(z_{−}u^{+}_{1})u^{+}_{2} = T^{R′}(z_{−}(u^{+}_{1}a^{R′}u^{+}_{2}a^{−R′})). We set u^{+} := u^{+}_{1}(a^{R′}u^{+}_{2}a^{−R′}). Since conjugation by a^{R′} does not increase distances in U, we have d(1, u^{+}) < cδ + c(c + 1)δ = c(c + 2)δ, which concludes the proof.

Proof of Theorem 4.4. We start by proving (i) and (ii). To that end let η > 0 be arbitrary and pick c > 0 as in the Shadowing Lemma 4.5. Set D_{η} := B_{η}(K),

δ := \frac{η}{2} · \frac{λ_{0} − 1}{c(c + 2)λ_{0}}

and fix R′ with the properties as in Lemma 4.6 applied for this δ. Instead of proving (16) we will prove the stronger statement

(20) T^{F(k)}(x_{1}g_{𝐢}) ∈ x_{k}u^{(k)}_{i_{k}} B^{NAM}_{c(c+2)δ} a^{R_{k}} B^{U}_{r(n,k)} a^{−R_{k}}

for any n ∈ N, any 𝐢 = (i_{1}, . . . , i_{n}) ∈ 𝐒_{n} and any k ∈ [1, n] where

r(n, k) := c(c + 2)δ \sum_{i=0}^{n−k−1} λ_{0}^{−i}

and r(n, n) = 0 by convention. Since c(c + 2)δ < η/2 and r(n, k) < η/2, this is indeed stronger than (16). For the proof of (20) we proceed by induction on n. As a by-product, we will prove (i) and (15).
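The choice of δ is made precisely so that the geometric series in r(n, k) stays below η/2. The Python sketch below checks this numerically, together with the recursion r(n, k) + c(c + 2)δλ_{0}^{−(n−k)} = r(n + 1, k) used in the induction step. The function names and the sample values of η and c are assumptions made here; the constant c of the Shadowing Lemma is not explicit in the text.

```python
import math


def delta(eta, c, lam0):
    """delta = (eta / 2) * (lam0 - 1) / (c * (c + 2) * lam0)."""
    return (eta / 2) * (lam0 - 1) / (c * (c + 2) * lam0)


def r(n, k, eta, c, lam0):
    """r(n, k) = c (c + 2) delta * sum_{i=0}^{n-k-1} lam0^{-i}; the empty sum gives r(n, n) = 0."""
    return c * (c + 2) * delta(eta, c, lam0) * sum(lam0 ** (-i) for i in range(n - k))


if __name__ == "__main__":
    eta, c, lam0 = 0.1, 3.0, math.exp(0.5)  # hypothetical eta and c; lam0 = e^{1/2}
    print(max(r(n, 1, eta, c, lam0) for n in range(1, 60)), eta / 2)
```

Since c(c + 2)δ = (η/2)(λ_{0} − 1)/λ_{0} and the geometric series is bounded by λ_{0}/(λ_{0} − 1), every r(n, k) is strictly smaller than η/2, independently of the value of c.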

For n = 1 and i ∈ [1, S_{1}] we set g_{i} = u^{(1)}_{i}. Then (i) and (20) for n = 1 are trivially satisfied. Suppose that for some n ∈ N we constructed the set E′_{n} fulfilling (20). We show how to construct E′_{n+1} from E′_{n} such that (20) is satisfied for n + 1 and (15) for n.

Let 𝐢 ∈ 𝐒_{n} and j ∈ [1, S_{n+1}]. By the inductive hypothesis

T^{F(n)}(x_{1}g_{𝐢}) ∈ x_{n}u^{(n)}_{i_{n}} B^{NAM}_{η/2} a^{R_{n}} B^{U}_{η/2} a^{−R_{n}}.

Thus,

T^{F(n)+R_{n}}(x_{1}g_{𝐢}) ∈ T^{R_{n}}(x_{n}u^{(n)}_{i_{n}}) a^{−R_{n}} B^{NAM}_{η/2} a^{R_{n}} B^{U}_{η/2}.

From

a^{−R_{n}} B^{NAM}_{η/2} a^{R_{n}} B^{U}_{η/2} ⊆ B^{G}_{η}

and T^{R_{n}}(x_{n}u^{(n)}_{i_{n}}) ∈ K, it follows that T^{F(n)+R_{n}}(x_{1}g_{𝐢}) ∈ D_{η}. Further,

x_{n+1}u^{(n+1)}_{j} ∈ K ⊆ D_{η}.

We apply Lemma 4.7 with

z_{−} := T^{F(n)+R_{n}}(x_{1}g_{𝐢}) and z_{+} := x_{n+1}u^{(n+1)}_{j}

to obtain u^{+}_{j} ∈ B^{U}_{c(c+2)δ} and u_{j} ∈ B^{NAM}_{c(c+2)δ} satisfying

(21) x_{1}g_{𝐢}a^{F(n)+R_{n}}u^{+}_{j}a^{R′} = T^{R′}(z_{−}u^{+}_{j}) = z_{+}u_{j} = x_{n+1}u^{(n+1)}_{j}u_{j}.

We define

g_{(𝐢,j)} := g_{𝐢}a^{F(n)+R_{n}}u^{+}_{j}a^{−F(n)−R_{n}} ∈ U

and

E′_{n+1} := {g_{(𝐢,j)} | 𝐢 ∈ 𝐒_{n}, j ∈ [1, S_{n+1}]}.

Clearly,

d(T^{F(n)+R_{n}}(g_{𝐢}), T^{F(n)+R_{n}}(g_{(𝐢,j)})) = d(1, u^{+}_{j}) < \frac{η}{2},

which proves (15) for n.

We will now show (20) for n + 1. Suppose first that k = n + 1. From the definition of F(n + 1) and (21) it immediately follows that

T^{F(n+1)}(x_{1}g_{(𝐢,j)}) ∈ x_{n+1}u^{(n+1)}_{j} B^{NAM}_{c(c+2)δ}.

Suppose now that k ∈ [1, n]. Then

T^{F(k)}(x_{1}g_{(𝐢,j)}) = x_{1}g_{𝐢}a^{F(n)+R_{n}}u^{+}_{j}a^{F(k)−F(n)−R_{n}}
= T^{F(k)}(x_{1}g_{𝐢}) a^{−F(k)+F(n)+R_{n}} u^{+}_{j} a^{F(k)−F(n)−R_{n}}
∈ T^{F(k)}(x_{1}g_{𝐢}) a^{−F(k)+F(n)+R_{n}} B^{U}_{c(c+2)δ} a^{F(k)−F(n)−R_{n}}.

From the inductive hypothesis we have

T^{F(k)}(x_{1}g_{𝐢}) ∈ x_{k}u^{(k)}_{i_{k}} B^{NAM}_{c(c+2)δ} a^{R_{k}} B^{U}_{r(n,k)} a^{−R_{k}}.

Therefore

(22) T^{F(k)}(x_{1}g_{(𝐢,j)}) ∈ x_{k}u^{(k)}_{i_{k}} B^{NAM}_{c(c+2)δ} a^{R_{k}} B^{U}_{r(n,k)} a^{−F(k)−R_{k}+F(n)+R_{n}} B^{U}_{c(c+2)δ} a^{F(k)−F(n)−R_{n}}.

If k = n, then r(n, k) = 0. Hence (22) simplifies to

T^{F(n)}(x_{1}g_{(𝐢,j)}) ∈ x_{n}u^{(n)}_{i_{n}} B^{NAM}_{c(c+2)δ} a^{R_{n}} B^{U}_{c(c+2)δ} a^{−R_{n}}.

If k ∈ [1, n−1], then

−F(k) − R_{k} + F(n) + R_{n} = \sum_{i=k+1}^{n} R_{i} + (n − k)R′ =: p(k, n).

Hence

a^{−F(k)−R_{k}+F(n)+R_{n}} B^{U}_{c(c+2)δ} a^{F(k)+R_{k}−F(n)−R_{n}} ⊆ B^{U}_{c(c+2)δλ_{0}^{−p(k,n)}} ⊆ B^{U}_{c(c+2)δλ_{0}^{−(n−k)}}.

With r(n, k) + c(c + 2)δλ_{0}^{−(n−k)} = r(n + 1, k) it now follows that

T^{F(k)}(x_{1}g_{(𝐢,j)}) ∈ x_{k}u^{(k)}_{i_{k}} B^{NAM}_{c(c+2)δ} a^{R_{k}} B^{U}_{r(n+1,k)} a^{−R_{k}}.

This completes the proof of (ii).

Since (iii) is an immediate consequence of (iv), it remains to prove the two statements in (iv). We start with the first one. Let 𝐢 = (i_{1}, . . . , i_{n}), 𝐣 = (j_{1}, . . . , j_{n}) ∈ 𝐒_{n}. Then

d(g_{𝐢}, g_{𝐣}) ≤ d(g_{𝐢}, g_{i_{1}}) + d(g_{i_{1}}, g_{j_{1}}) + d(g_{j_{1}}, g_{𝐣}).

Since g_{i_{1}}, g_{j_{1}} ∈ E^{(1)} ⊆ B^{U}_{η_{0}/4}, we have d(g_{i_{1}}, g_{j_{1}}) < η_{0}/2. To bound the other two terms, let k ∈ [1, S_{n+1}]. Then by (15) we have

d(T^{F(n)+R_{n}}g_{𝐢}, T^{F(n)+R_{n}}g_{(𝐢,k)}) < η.

Therefore,

d(g_{𝐢}, g_{(𝐢,k)}) < ηλ_{0}^{−F(n)−R_{n}}.

Applying this observation iteratively, we obtain

d(g_{i_{1}}, g_{𝐢}) < η \sum_{j=1}^{n−1} λ_{0}^{−F(j)−R_{j}} < η · \frac{1}{λ_{0} − 1} < \frac{η_{0}}{4}.

Thus,

d(g_{𝐢}, g_{𝐣}) < η_{0}

as claimed.

Finally, let 𝐢, 𝐣 ∈ 𝐒_{n}, 𝐢 ≠ 𝐣. It remains to show that

(23) d(T^{F(n)+R_{n}}g_{𝐢}, T^{F(n)+R_{n}}g_{𝐣}) > \frac{η_{0}}{2}.

Suppose first that we find k ∈ [1, n] such that

d(g_{𝐢}a^{F(k)}, g_{𝐣}a^{F(k)}) ≥ η_{0}.

Since F(k) − F(n) − R_{n} < 0, the assumption

d(g_{𝐢}a^{F(n)+R_{n}}, g_{𝐣}a^{F(n)+R_{n}}) ≤ \frac{η_{0}}{2}

would result in

d(g_{𝐢}a^{F(k)}, g_{𝐣}a^{F(k)}) ≤ \frac{η_{0}}{2}.

Therefore, in this case, (23) is obviously satisfied.

To complete the proof pick k ∈ [1, n] such that i_{k} ≠ j_{k} and suppose

d(g_{𝐢}a^{F(k)}, g_{𝐣}a^{F(k)}) < η_{0}.

Actually, we may suppose ≤ η_{0}/2, but < η_{0} turns out to be sufficient. By (16) we find u^{−}_{𝐢}, u^{−}_{𝐣} ∈ B^{NAM}_{η/2} and u^{+}_{𝐢}, u^{+}_{𝐣} ∈ B^{U}_{η/2} such that

T^{F(k)}(x_{1}g_{𝐢}) = x_{k}u^{(k)}_{i_{k}}u^{−}_{𝐢}a^{R_{k}}u^{+}_{𝐢}a^{−R_{k}}

and

T^{F(k)}(x_{1}g_{𝐣}) = x_{k}u^{(k)}_{j_{k}}u^{−}_{𝐣}a^{R_{k}}u^{+}_{𝐣}a^{−R_{k}}.

Pick h_{0}, h_{k} ∈ G such that Γh_{0} = x_{1} and x_{k} = x_{1}h_{k}. Further let γ ∈ Γ be such that

γh_{0}g_{𝐢}a^{F(k)} = h_{0}h_{k}u^{(k)}_{i_{k}}u^{−}_{𝐢}a^{R_{k}}u^{+}_{𝐢}a^{−R_{k}}.

We will show that

(24) γh_{0}g_{𝐣}a^{F(k)} = h_{0}h_{k}u^{(k)}_{j_{k}}u^{−}_{𝐣}a^{R_{k}}u^{+}_{𝐣}a^{−R_{k}}

(same γ!). To that end we note that

d(h_{0}h_{k}u^{(k)}_{i_{k}}u^{−}_{𝐢}a^{R_{k}}u^{+}_{𝐢}a^{−R_{k}}, h_{0}h_{k}u^{(k)}_{j_{k}}u^{−}_{𝐣}a^{R_{k}}u^{+}_{𝐣}a^{−R_{k}})
≤ d(u^{(k)}_{i_{k}}u^{−}_{𝐢}a^{R_{k}}u^{+}_{𝐢}a^{−R_{k}}, u^{(k)}_{i_{k}}) + d(u^{(k)}_{i_{k}}, u^{(k)}_{j_{k}}) + d(u^{(k)}_{j_{k}}, u^{(k)}_{j_{k}}u^{−}_{𝐣}a^{R_{k}}u^{+}_{𝐣}a^{−R_{k}})
< η + \frac{η_{0}}{2} + η < η_{0}

and

d(γh_{0}g_{𝐢}a^{F(k)}, γh_{0}g_{𝐣}a^{F(k)}) < η_{0}.

Since η_{0} is an injectivity radius of B_{ε}(K), equality (24) now follows. Finally,

d(g_{𝐢}a^{F(n)+R_{n}}, g_{𝐣}a^{F(n)+R_{n}}) ≥ d(g_{𝐢}a^{F(k)+R_{k}}, g_{𝐣}a^{F(k)+R_{k}})
= d(u^{(k)}_{i_{k}}u^{−}_{𝐢}a^{R_{k}}u^{+}_{𝐢}, u^{(k)}_{j_{k}}u^{−}_{𝐣}a^{R_{k}}u^{+}_{𝐣})
≥ d(u^{(k)}_{i_{k}}a^{R_{k}}, u^{(k)}_{j_{k}}a^{R_{k}}) − d(u^{(k)}_{i_{k}}a^{R_{k}}, u^{(k)}_{i_{k}}u^{−}_{𝐢}a^{R_{k}}u^{+}_{𝐢}) − d(u^{(k)}_{j_{k}}a^{R_{k}}, u^{(k)}_{j_{k}}u^{−}_{𝐣}a^{R_{k}}u^{+}_{𝐣})
≥ η_{0} − 2η > \frac{η_{0}}{2}.

This completes the proof.

Definition of the strongly tree-like collection. Fix s_{0} > 39s_{1} and set K := X_{≤s_{0}}. Further fix an injectivity radius η_{0} of some neighborhood of K such that 1/2 > η_{0} > 0 and choose

η < \frac{η_{0}(λ_{0} − 1)}{4λ_{0}}

so small that we may apply Theorem 4.4. For k ∈ N we set R̃_{k} := k and

S̃_{k} := ⌊e^{k/2}⌋^{p_{2}} · ⌊e^{k/4}⌋^{p_{1}}.

For any k ∈ N we apply Proposition 4.3 with R̃_{k}, S̃_{k}, s_{0} and η_{0} to get a point x_{k} ∈ K and a subset Ẽ^{(k)} ⊆ B^{U}_{η_{0}e^{−k/4}} with the properties of this proposition. For k ≥ k_{0} := ⌈4 log 4⌉ we have Ẽ^{(k)} ⊆ B^{U}_{η_{0}/4}. We set E^{(k)} := Ẽ^{(k+k_{0}−1)}, R_{k} := R̃_{k+k_{0}−1}, S_{k} := S̃_{k+k_{0}−1} for k ∈ N and apply Theorem 4.4 to these sequences to construct a sequence (E′_{n})_{n∈N} of sets with the properties as in Theorem 4.4. For any n ∈ N we set

𝒰_{n} := { ua^{F(n)+R_{n}}B^{U}_{η_{0}/4}a^{−F(n)−R_{n}} | u ∈ E′_{n} }.

Let

U_{0} := ⋃ 𝒰_{1} = ⋃_{u∈E′_{1}} ua^{k_{0}}B^{U}_{η_{0}/4}a^{−k_{0}},

which is a compact non-null subset of U, and let 𝒰_{0} := {U_{0}}. We claim that

𝒰 := ⋃_{n∈N_{0}} 𝒰_{n}

is a strongly tree-like collection on U_{0}. To that end let n ∈ N. Suppose that g, h ∈ E′_{n}, g ≠ h. By Theorem 4.4 we have

d(ga^{F(n)+R_{n}}, ha^{F(n)+R_{n}}) > \frac{η_{0}}{2}.

Therefore

ga^{F(n)+R_{n}}B^{U}_{η_{0}/4} ∩ ha^{F(n)+R_{n}}B^{U}_{η_{0}/4} = ∅,

and hence

ga^{F(n)+R_{n}}B^{U}_{η_{0}/4}a^{−F(n)−R_{n}} ∩ ha^{F(n)+R_{n}}B^{U}_{η_{0}/4}a^{−F(n)−R_{n}} = ∅.

This shows (6) (and even a stronger disjointness). Now let 𝐢 ∈ 𝐒_{n} and j ∈ [1, S_{n+1}]. We claim that

g_{(𝐢,j)}a^{F(n+1)+R_{n+1}}B^{U}_{η_{0}/4}a^{−F(n+1)−R_{n+1}} ⊆ g_{𝐢}a^{F(n)+R_{n}}B^{U}_{η_{0}/4}a^{−F(n)−R_{n}},

which is equivalent to

(25) g_{(𝐢,j)}a^{F(n)+R_{n}} a^{F(n+1)+R_{n+1}−F(n)−R_{n}} B^{U}_{η_{0}/4} a^{−F(n+1)−R_{n+1}+F(n)+R_{n}} ⊆ g_{𝐢}a^{F(n)+R_{n}}B^{U}_{η_{0}/4}.

Since

F(n + 1) + R_{n+1} − F(n) − R_{n} = R_{n+1} + R′ > 0,

we have

a^{F(n+1)+R_{n+1}−F(n)−R_{n}} B^{U}_{η_{0}/4} a^{−F(n+1)−R_{n+1}+F(n)+R_{n}} ⊆ B^{U}_{λ_{0}^{−1}η_{0}/4}.

Then (25) follows from

λ_{0}^{−1}\frac{η_{0}}{4} + d(g_{(𝐢,j)}a^{F(n)+R_{n}}, g_{𝐢}a^{F(n)+R_{n}}) < \frac{η_{0}}{4} · \frac{1}{λ_{0}} + \frac{η_{0}}{4} · \frac{λ_{0} − 1}{λ_{0}} = \frac{η_{0}}{4}.

Thus, the sets of the collection are nested in the required way. Finally,

ga^{F(n)+R_{n}}B^{U}_{η_{0}/4}a^{−F(n)−R_{n}} ⊆ gB^{U}_{λ_{0}^{−F(n)−R_{n}}η_{0}/4},

and hence

diam(ga^{F(n)+R_{n}}B^{U}_{η_{0}/4}a^{−F(n)−R_{n}}) ≪ λ_{0}^{−F(n)−R_{n}}.

Therefore, the sequence of supremal diameters converges to 0 as n → ∞. This completes the proof that 𝒰 = ⋃ 𝒰_{n} is a strongly tree-like collection.
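The parameters entering this construction are explicit and can be tabulated. The Python sketch below computes k_{0}, the shifted sequences R_{k} and S_{k}, and the times F(n). The function names and the sample value of R′ are assumptions made here (the constant R′ comes from Lemma 4.6 and is not explicit).

```python
import math

K0 = math.ceil(4 * math.log(4))  # k_0 = ceil(4 log 4)


def R(k):
    """R_k = k + k_0 - 1 (shifted version of the sequence R-tilde_k = k)."""
    return k + K0 - 1


def S(k, p1, p2):
    """S_k = floor(e^{m/2})^{p2} * floor(e^{m/4})^{p1} with m = k + k_0 - 1."""
    m = k + K0 - 1
    return math.floor(math.exp(m / 2)) ** p2 * math.floor(math.exp(m / 4)) ** p1


def F(n, R_prime):
    """F(n) = sum_{i=1}^{n-1} R_i + (n - 1) * R'."""
    return sum(R(i) for i in range(1, n)) + (n - 1) * R_prime


if __name__ == "__main__":
    R_prime = 10  # hypothetical value of R'
    print(K0, math.exp(-K0 / 4) <= 0.25)
    for n in range(1, 5):
        print(n, R(n), S(n, 1, 1), F(n, R_prime) + R(n))
```

The shift by k_{0} − 1 guarantees e^{−k/4} ≤ 1/4 for all indices that are used, which is the inclusion of the sets E^{(k)} in the ball of radius η_{0}/4. For n = 1 one gets F(1) + R_{1} = k_{0}, the exponent appearing in the definition of U_{0}.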