Upper and Lower Bounds on the Capacity of Amplitude-Constrained MIMO Channels

In this work, novel upper and lower bounds for the capacity of channels with arbitrary constraints on the support of the channel input symbols are derived. As an immediate practical application, the case of multiple-input multiple-output channels with amplitude constraints is considered. The bounds are shown to be within a constant gap if the channel matrix is invertible and are tight in the high amplitude regime for arbitrary channel matrices. Moreover, in the high amplitude regime, it is shown that the capacity scales linearly with the minimum between the number of transmit and receive antennas, similarly to the case of average power-constrained inputs.


I. INTRODUCTION
While the capacity of a multiple-input multiple-output (MIMO) channel with an average power constraint is well understood [1], surprisingly little is known about the capacity of the more practically relevant case in which the channel inputs are subject to amplitude constraints. The first major contribution to this problem was the seminal work of Smith [2], in which it was shown that, for the scalar Gaussian noise channel with an amplitude constraint, the capacity-achieving input is discrete with finite support. In [3], this result was extended to peak-power-limited quadrature Gaussian channels. Using the approach of [3], in [4] the optimal input distribution was shown to be discrete for MIMO channels with an identity channel matrix and a Euclidean norm constraint on the input vector. Even though the optimal input distribution is known to be discrete, very little is known about the number or the optimal positions of the corresponding constellation points. To the best of our knowledge, the only exception is the work [5], in which it was shown for a scalar Gaussian noise channel that two point masses are optimal for amplitude values smaller than 1.671 and three point masses for amplitude values of up to 2.786.
Using a dual capacity expression, McKellips [6] derived an upper bound on the capacity of a scalar amplitude-constrained channel that is asymptotically tight in the high amplitude regime. By using a clever choice of an auxiliary channel output distribution in the dual capacity expression, the authors of [7] sharpened McKellips' bound and extended it to parallel MIMO channels with a Euclidean norm constraint on the input. The scalar version of the upper bound in [7] was further sharpened in [8] by yet another choice of auxiliary output distribution. In [9], asymptotic lower and upper bounds for a 2 × 2 MIMO system were presented and the gap between the bounds was specified.
In this work, we make progress on this open problem by deriving several new upper and lower bounds that hold for channels with arbitrary constraints on the support of the input distribution. We then apply them to the special case of MIMO channels with amplitude-constrained inputs.

A. Contributions and Paper Outline
Our contributions and paper outline are as follows. The problem is stated in Section II. In Section III, we derive upper and lower bounds on the capacity of a MIMO channel with an arbitrary constraint on the support of the input. In Section IV, we evaluate the performance of our bounds by studying MIMO channels with invertible channel matrices. In particular, in Theorem 8 it is shown that our upper and lower bounds are within n log(ρ) bits, where ρ is the packing efficiency and n is the number of antennas. For diagonal channel matrices, it is shown in Theorem 9 that the Cartesian product of pulse-amplitude modulation (PAM) constellations achieves the capacity to within 1.64n bits. Section V is devoted to MIMO channels with arbitrary channel matrices. It is shown that in the high amplitude regime, similarly to the average power-constrained channel, the capacity scales linearly with the minimum of the number of transmit and receive antennas. Section VI concludes the paper.

B. Notation
Vectors are denoted by bold lowercase letters, random vectors by bold uppercase letters, and matrices by bold uppercase sans serif letters (e.g., x, X, H). For any deterministic vector x ∈ R^n, n ∈ N, we denote the Euclidean norm of x by ‖x‖. For some X ∈ supp(X) ⊆ R^n and any p > 0 we define

‖X‖_p := (E[‖X‖^p])^{1/p},    (1)

where supp(X) denotes the support of X. Note that for p ≥ 1, the quantity in (1) defines a norm. The norm of a matrix H ∈ R^{n×n} is defined as ‖H‖ := max_{‖x‖=1} ‖Hx‖. Let S be a subset of R^n. Then, Vol(S) := ∫_S dx denotes its volume. Let R_+ := {x ∈ R : x ≥ 0}. We define an n-dimensional ball of radius r ∈ R_+ centered at x ∈ R^n as the set B_x(r) := {y : ‖x − y‖ ≤ r}.
Recall that for any x ∈ R^n and r ∈ R_+,

Vol(B_x(r)) = (π^{n/2} / Γ(n/2 + 1)) r^n.

For any matrix H ∈ R^{k×n} and some S ⊂ R^n we define HS := {Hx : x ∈ S}. Note that for an invertible H ∈ R^{n×n} we have Vol(HS) = |det(H)| Vol(S).
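As a sanity check, the ball-volume formula recalled above can be verified numerically; the following sketch (function names are ours, not from the paper) compares the closed form against a Monte Carlo estimate:

```python
import math
import random

def ball_volume(n: int, r: float) -> float:
    """Closed form: Vol(B_x(r)) = pi^(n/2) / Gamma(n/2 + 1) * r^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

def mc_ball_volume(n: int, r: float, samples: int = 200_000) -> float:
    """Monte Carlo estimate: fraction of uniform samples from the cube
    [-r, r]^n that land inside the ball, scaled by the cube volume (2r)^n."""
    rng = random.Random(0)
    hits = sum(
        1 for _ in range(samples)
        if sum(rng.uniform(-r, r) ** 2 for _ in range(n)) <= r * r
    )
    return hits / samples * (2 * r) ** n

print(ball_volume(2, 1.0))     # area of the unit disk, i.e. pi
print(mc_ball_volume(2, 1.0))  # should be close to pi
```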
We define the maximum and minimum radius of a set S ⊂ R^n that contains the origin as r_max(S) := min{r ∈ R_+ : S ⊆ B_0(r)} and r_min(S) := max{r ∈ R_+ : B_0(r) ⊆ S}, respectively.
For a given vector a = (a_1, . . . , a_n) ∈ R^n_+ we define Box(a) := {x ∈ R^n : |x_i| ≤ a_i, i = 1, . . . , n} and the smallest box containing a given set S ⊂ R^n as Box(S) := inf{Box(a) : S ⊆ Box(a)}. Finally, all logarithms are taken to base 2, log^+(x) := max{log(x), 0}, Q(x), x ∈ R, denotes the Q-function, and δ_x(y) denotes the Kronecker delta, which is one for x = y and zero otherwise.
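For the box sets just defined, the extremal radii admit simple closed forms: the farthest point of Box(a) from the origin is the corner a itself, and the largest inscribed ball is limited by the nearest face. A minimal sketch (names hypothetical):

```python
import math

def r_max_box(a):
    """r_max(Box(a)): smallest r with Box(a) contained in B_0(r),
    attained at the corner (a_1, ..., a_n), hence the Euclidean norm of a."""
    return math.sqrt(sum(ai * ai for ai in a))

def r_min_box(a):
    """r_min(Box(a)): largest r with B_0(r) contained in Box(a),
    limited by the closest face, hence min_i a_i."""
    return min(a)

print(r_max_box((3.0, 4.0)))  # 5.0
print(r_min_box((3.0, 4.0)))  # 3.0
```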
II. PROBLEM STATEMENT

Consider a MIMO system with n_t ∈ N transmit and n_r ∈ N receive antennas. The corresponding n_r-dimensional channel output for a single channel use is of the form

Y = HX + Z

for some fixed channel matrix H ∈ R^{n_r×n_t}. Here and hereafter, we assume Z ∼ N(0, I_{n_r}) is independent of the channel input X ∈ R^{n_t} and that H is known to both the transmitter and the receiver, where I_{n_r} denotes the n_r × n_r identity matrix. Now, in all that follows, let X ⊂ R^{n_t} be a convex and compact channel input space that contains the origin (i.e., the length-n_t zero vector) and let F_X denote the cumulative distribution function of X. As of the writing of this paper, the capacity of this channel is unknown and we are interested in finding novel lower and upper bounds. Even though most of the results in this paper hold for arbitrary X, we are mainly interested in the two most important special cases: (i) per-antenna amplitude constraints, that is, X = Box(a) for some given a = (A_1, . . . , A_{n_t}) ∈ R^{n_t}_+; and (ii) an n_t-dimensional amplitude constraint, that is, X = B_0(A) for some given A ∈ R_+.
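The channel model Y = HX + Z with unit-variance Gaussian noise and a per-antenna amplitude constraint is easy to simulate; the sketch below (the numerical values of H are illustrative, not from the paper) draws one channel use:

```python
import random

def sample_box_input(a, rng):
    """Draw X uniformly from Box(a), i.e. |X_i| <= A_i per antenna."""
    return [rng.uniform(-ai, ai) for ai in a]

def channel_use(H, x, rng):
    """One use of Y = Hx + Z with Z ~ N(0, I_{n_r}); H is n_r x n_t."""
    return [sum(hij * xj for hij, xj in zip(row, x)) + rng.gauss(0.0, 1.0)
            for row in H]

rng = random.Random(1)
H = [[1.0, 0.5], [0.0, 2.0]]   # example 2 x 2 channel matrix
a = (1.0, 1.0)                 # per-antenna amplitude constraints
x = sample_box_input(a, rng)
y = channel_use(H, x, rng)
print(len(y))  # 2
```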
where x̄ ∈ HX is chosen such that ‖x̄‖ = r_max(HX).
Proof: Expressing (2) in terms of differential entropies results in an upper bound in terms of n_r, k_{n_r,p}, and the moment ‖HX + Z‖_p, where a) follows from Lemma 1 together with the fact that h(Z) = (n_r/2) log(2πe) and b) from the monotonicity of the logarithm. Now, notice that ‖HX + Z‖_p is linear and bounded (and therefore continuous) in F_X, so that it attains its maximum at an extreme point of the set F_X := {F_X : X ∈ X} (i.e., the set of all cumulative distribution functions supported on X). As a matter of fact [15], the extreme points of F_X are given by the set of degenerate distributions on X; that is, {F_X(y) = δ_x(y), y ∈ X}_{x∈X}. This allows us to reduce the maximization over F_X to a maximization over deterministic x ∈ X. Observe that the Euclidean norm is a convex function, which is therefore maximized at the boundary of the set HX. Combining this with (3) and taking the infimum over p > 0 completes the proof.
The following theorem provides two alternative upper bounds that are based on duality arguments.

Theorem 2. (Duality Upper Bounds)
For any channel input space X and any fixed channel matrix H, the bounds (4) and (5) hold, where a = (A_1, . . . , A_{n_r}) is such that Box(a) = Box(HX).
Proof: Using duality bounds, it has been shown in [7] that for any centered n-dimensional ball of radius r ∈ R_+, the maximum of I(X; Y) over all F_X with X ∈ B_0(r) can be upper bounded in terms of a quantity c_n(r). Here, a) follows from enlarging the optimization domain and b) from using the upper bound in (6). This proves (4).
In order to show the upper bound in (5), we proceed with an alternative upper bound to (7), where the (in)equalities follow from: a) enlarging the optimization domain; b) single-letterizing the mutual information; c) choosing individual amplitude constraints (A_1, . . . , A_{n_r}) =: a ∈ R^{n_r}_+ such that Box(a) = Box(HX); and d) using the upper bound in (6) for n = 1. This concludes the proof.
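To illustrate the single-letterization in steps b)-d), the per-antenna bound can be evaluated numerically once a scalar amplitude-constrained bound is fixed. The sketch below uses the familiar McKellips-type form log(1 + 2A/sqrt(2πe)); the sharpened constants of [7], [8] differ slightly, so this is illustrative only:

```python
import math

def scalar_amplitude_bound(A: float) -> float:
    """McKellips-type upper bound (bits per channel use) for a scalar
    Gaussian channel with amplitude constraint A and unit noise variance.
    NOTE: an assumed illustrative form, not the exact bound of [7]/[8]."""
    return math.log2(1.0 + 2.0 * A / math.sqrt(2.0 * math.pi * math.e))

def dual_bound_per_antenna(a) -> float:
    """Single-letterized duality bound: sum of scalar bounds over the
    side lengths A_1, ..., A_{n_r} of Box(HX)."""
    return sum(scalar_amplitude_bound(Ai) for Ai in a)

print(dual_bound_per_antenna((2.0, 2.0)))
```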

B. Lower Bounds
A classical approach to bound a mutual information from below is to use the entropy power inequality (EPI).

Theorem 3. (EPI Lower Bounds) For any fixed channel matrix H and any channel input space X,
Moreover, if n_t = n_r = n, H ∈ R^{n×n} is invertible, and X is uniformly distributed over X, then the bound in (9) holds.

Proof: By means of the EPI, we obtain the lower bound in (8).
To show the lower bound in (9), all we need is to recall that h(HX) is maximized for X uniformly distributed over X. But if X is uniformly drawn from X, we have h(HX) = log(|det(H)| Vol(X)), which completes the proof.

The results in [2]-[4] suggest that the channel input distribution that maximizes (2) might be discrete. Therefore, there is a need for lower bounds that, unlike the bounds in Theorem 3, rely on discrete inputs.

Remark 2. We note that the problem of finding the optimal input distribution of a general MIMO channel with an amplitude constraint is still open. The technical difficulty lies in the fact that the identity theorem from complex analysis, a key tool in the method developed by Smith [2] for the scalar case and later used in [16] for the MIMO channel, does not extend to R^n and C^n. The interested reader is referred to [17] for a detailed discussion of this issue with examples of why the identity theorem fails in the MIMO setting.
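For an invertible H and X uniform over X, the EPI bound in (9) can be evaluated in closed form from h(HX) = log(|det(H)| Vol(X)). The sketch below implements the standard EPI evaluation (the exact statement of Theorem 3 may organize the constants differently, so treat this as a sketch):

```python
import math

def epi_lower_bound_uniform(det_H: float, vol_X: float, n: int) -> float:
    """EPI-based capacity lower bound in bits: with X uniform over X and
    h(HX) = log(|det H| Vol(X)), the EPI yields
    I(X; HX + Z) >= (n/2) * log2(1 + (|det H| Vol(X))^(2/n) / (2 pi e))."""
    vol_image = abs(det_H) * vol_X
    return 0.5 * n * math.log2(1.0 + vol_image ** (2.0 / n) / (2.0 * math.pi * math.e))

# Example: X = Box((A, A)) with volume (2A)^2 and an identity 2x2 channel.
A = 10.0
print(epi_lower_bound_uniform(1.0, (2 * A) ** 2, 2))
```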
Theorem 4. (Ozarow-Wyner Type Lower Bound) Let X_D ∈ supp(X_D) ⊂ R^{n_t} be a discrete random vector of finite entropy, g : R^{n_r} → R^{n_t} a measurable function, and p > 0. Furthermore, let K_p be a set of continuous random vectors, independent of X_D, such that for every U ∈ K_p we have h(U) < ∞ and the condition in (11) holds, with G_{1,p}, G_{2,p}, and k_{n_t,p} as defined in (12), (13), and Lemma 1, respectively.
Proof: The proof is identical to that of [14, Th. 2]. In order to make the manuscript more self-contained, we repeat it here.
Let U and X_D be statistically independent. Then, the mutual information I(X_D; Y) can be lower bounded as in (14). Here, a) follows from the data processing inequality, as X_D + U → X_D → Y forms a Markov chain in that order, and b) from the assumption in (11). By using Lemma 1, the last term in (14) can be bounded from above. Combining this with (14) results in the claimed bound with G_{1,p} and G_{2,p} as defined in (12) and (13), respectively. Maximizing the right-hand side over all U ∈ K_p, measurable functions g : R^{n_r} → R^{n_t}, and p > 0 provides the bound.

Interestingly, the bound of Theorem 4 holds for arbitrary channels; the interested reader is referred to [14] for details.
We conclude the section by providing a lower bound that is based on Jensen's inequality and holds for arbitrary inputs.
Theorem 5. (Jensen's Inequality Lower Bound) For any channel input space X and any fixed channel matrix H, where X′ is an independent copy of X.
Proof: In order to show the lower bound, we follow an approach of [18]. Note that by Jensen's inequality, h(Y) can be bounded from below as in (15). Now, evaluating the integral in (15) results in (16), where a) follows from the independence of X and X′ together with Tonelli's theorem, b) from completing a square, and c) from the fact that ∫_{R^{n_r}} e^{−‖y − H(X−X′)/2‖²} dy = ∫_{R^{n_r}} e^{−‖y‖²} dy = π^{n_r/2}. Finally, combining (15) and (16), subtracting h(Z) = (n_r/2) log(2πe), and maximizing over F_X proves the result.
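The completing-the-square step (steps b) and c)) can be spelled out as follows; this is a sketch consistent with the constants above, after substituting y → y + HX′ in the inner integral:

```latex
\begin{align}
\mathbb{E}\big[f_{\mathbf{Y}}(\mathbf{Y})\big]
 &= \mathbb{E}_{\mathbf{X},\mathbf{X}'}\!\left[\int_{\mathbb{R}^{n_r}}
    \frac{e^{-\frac{1}{2}\|\mathbf{y}-\mathsf{H}\mathbf{X}\|^2}}{(2\pi)^{n_r/2}}
    \cdot
    \frac{e^{-\frac{1}{2}\|\mathbf{y}-\mathsf{H}\mathbf{X}'\|^2}}{(2\pi)^{n_r/2}}
    \,\mathrm{d}\mathbf{y}\right] \\
 &= \frac{1}{(2\pi)^{n_r}}
    \,\mathbb{E}\!\left[e^{-\frac{1}{4}\|\mathsf{H}(\mathbf{X}-\mathbf{X}')\|^2}
    \int_{\mathbb{R}^{n_r}}
    e^{-\big\|\mathbf{y}-\frac{1}{2}\mathsf{H}(\mathbf{X}-\mathbf{X}')\big\|^2}
    \,\mathrm{d}\mathbf{y}\right] \\
 &= \frac{\pi^{n_r/2}}{(2\pi)^{n_r}}
    \,\mathbb{E}\!\left[e^{-\frac{1}{4}\|\mathsf{H}(\mathbf{X}-\mathbf{X}')\|^2}\right].
\end{align}
```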

IV. INVERTIBLE CHANNEL MATRICES
Consider the case of n_t = n_r = n antennas with H ∈ R^{n×n} invertible. In this section, we evaluate some of the lower and upper bounds from the previous section for the special case in which H is also diagonal, and then characterize the gap to the capacity for arbitrary invertible channel matrices.

A. Diagonal Channel Matrices
Suppose the channel inputs are subject to per-antenna or an n-dimensional amplitude constraint. Then, the duality upper bound C̄_Dual,2(X, H) of Theorem 2 takes the following form.
Moreover, if X = B_0(A) for some A ∈ R_+, then the bound in (18) holds.

Proof: The bound in (17) immediately follows from Theorem 2 by observing that Box(HBox(a)) = Box(Ha). The bound in (18) follows from Theorem 2 by the fact that Box(HB_0(A)) ⊆ Box(h̄), where h̄ := A√n (|h_11|, . . . , |h_nn|). This concludes the proof.

For an arbitrary channel input space X, the EPI lower bound of Theorem 3 and Jensen's inequality lower bound of Theorem 5 evaluate to the following, with φ as defined in (20).

Proof: For some given values B_i ∈ R_+, i = 1, . . . , n, let the i-th component of X = (X_1, . . . , X_n) be independent and uniformly distributed over the interval [−B_i, B_i]. Thus, the expected value appearing in the bound of Theorem 5 can be written as in (22). Now, if X′ is an independent copy of X, it can be shown that the expected value on the right-hand side of (22) has the explicit form (23) with φ as defined in (20). Finally, optimizing over all b = (B_1, . . . , B_n) ∈ X results in the bound (19); the EPI bound follows from (9), which concludes the proof.

In Fig. 1, the upper bounds of Theorems 1 and 6 and the lower bounds of Theorem 7 are depicted for a diagonal 2 × 2 MIMO channel with per-antenna amplitude constraints. It turns out that the moment upper bound and the EPI lower bound perform well in the small amplitude regime, while the duality upper bound and Jensen's inequality lower bound perform well in the high amplitude regime (note that Jensen's inequality lower bound becomes strictly positive around 9 dB).

B. Gap to the Capacity
Our first result bounds the gap between the capacity (2) and the lower bound in (9).
where a) follows since k_{n,2} = √(2πe/n) and b) since E[‖Z‖²] = n. Therefore, the gap between (9) and the moment upper bound of Theorem 1 can be upper bounded as follows. Here, a) is due to the fact that ‖x̄‖ is the radius of an n-dimensional ball, b) follows from the inequality (1 + cx)/(1 + x) ≤ c for c ≥ 1 and x ∈ R_+, and c) follows from using Stirling's approximation. The term ρ(X, H) is referred to as the packing efficiency of the set HX. In the following proposition, we present the packing efficiencies for important special cases.
The proof of (25) is concluded by observing that Vol(I_n Box(a)) = ∏_{i=1}^n 2A_i. Finally, observe that Box(a) ⊂ B_0(‖a‖) implies r_max(HBox(a)) ≤ r_max(HB_0(‖a‖)), so that

Fig. 2. Example of a pulse-amplitude modulation constellation with N = 4 points and amplitude constraint A (i.e., PAM(4, A)), where ∆ := A/(N − 1) denotes half the Euclidean distance between two adjacent constellation points. In case N is odd, 0 is a constellation point.
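The PAM(N, A) constellation of Fig. 2 is straightforward to generate; a minimal sketch (function name ours):

```python
def pam(N: int, A: float):
    """PAM(N, A): N equidistant points in [-A, A]. Adjacent points are
    2*Delta apart with Delta = A / (N - 1), matching the Fig. 2 convention."""
    delta = A / (N - 1)
    return [-A + 2 * delta * k for k in range(N)]

print(pam(4, 3.0))         # [-3.0, -1.0, 1.0, 3.0]
print(0.0 in pam(5, 2.0))  # True: 0 is a constellation point for odd N
```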
which is the bound in (26).

We conclude this section by characterizing the gap to the capacity when H is diagonal and the channel input space is the Cartesian product of n PAM constellations. In this context, PAM(N, A) refers to the set of N ∈ N equidistant PAM constellation points with amplitude constraint A ∈ R_+ (see Fig. 2 for an illustration), whereas X ∼ PAM(N, A) means that X is uniformly distributed over PAM(N, A) [14].

Theorem 9. Let H = diag(h_11, . . . , h_nn) ∈ R^{n×n} be fixed and X = (X_1, . . . , X_n). Then, if X_i ∼ PAM(N_i, A_i), i = 1, . . . , n, for some given a = (A_1, . . . , A_n) ∈ R^n_+, the gap in (27) holds with c := log(2) + (1/2) log(πe/6). Moreover, if X_i ∼ PAM(N, A), i = 1, . . . , n, for some given A ∈ R_+, the bound in (28) holds.

Proof: Since the channel matrix is diagonal, letting the channel input X be such that its elements X_i, i = 1, . . . , n, are independent, the mutual information decomposes into a sum of scalar terms. Observe that half the Euclidean distance between any pair of adjacent points in PAM(N_i, A_i) is ∆_i (see Fig. 2), i = 1, . . . , n. In order to lower bound the mutual information I(X_i; h_ii X_i + Z_i), we use the bound of Theorem 4 for p = 2 and n_t = 1. Thus, for some continuous random variable U that is uniformly distributed over the interval [−∆_i, ∆_i) and independent of X_i we obtain (29). Now, note that the entropy term in (29) can be lower bounded using ⌊x⌋ ≥ x/2 for every x ≥ 1. On the other hand, the last term in (29) can be upper bounded by upper bounding its argument, using that X_i and U are independent.

V. ARBITRARY CHANNEL MATRICES

To provide lower bounds for channels with amplitude constraints and SVD precoding, we need the following lemma.
Lemma 2. For any given orthogonal matrix V ∈ R^{n_t×n_t} and constraint vector a = (A_1, . . . , A_{n_t}) ∈ R^{n_t}_+ there exists a distribution F_X of X such that X̃ = V^T X is uniformly distributed over Box(a). Moreover, the components X̃_1, . . . , X̃_{n_t} of X̃ are mutually independent, with X̃_i uniformly distributed over [−A_i, A_i], i = 1, . . . , n_t.
Proof: Suppose that X̃ is uniformly distributed over Box(a); that is, the density of X̃ is constant on Box(a) and zero elsewhere. Since V is orthogonal, we have VX̃ = X and, by the change of variables theorem, for x ∈ VBox(a) the density of X is the same constant. Therefore, such a distribution of X exists.

Theorem 10. For any fixed channel matrix H, the lower bounds in (32) and (33) hold, where b := (B_1, . . . , B_{n_t}) and φ is as defined in (20).
Proof: Performing the SVD H = UΛV^T, the expected value in Theorem 5 can be written in terms of Λ and X̃ = V^T X. By Lemma 2 there exists a distribution F_X such that the components of X̃ are independent and uniformly distributed. Since Λ is a diagonal matrix, we can use Theorem 7 to arrive at (32). Note that by Lemma 2 there exists a distribution on X such that X̃ is uniform over Box(a) ⊂ R^{n_t} and ΛX̃ is uniform over ΛBox(a) ⊂ R^{n_min}, respectively. Therefore, the EPI lower bound given in (8) yields exactly the expression in (33). This concludes the proof.
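Lemma 2's construction can be checked numerically: draw X-tilde uniformly over Box(a) with independent components, set X = V X-tilde, and verify that rotating back with V^T recovers a point of Box(a). The rotation matrix below is an arbitrary example, not from the paper:

```python
import math
import random

def matvec(M, x):
    """Matrix-vector product Mx for a matrix given as a list of rows."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def sample_lemma2_input(V, a, rng):
    """Lemma 2: draw X-tilde uniform over Box(a) with independent
    components, then return X = V X-tilde."""
    x_tilde = [rng.uniform(-ai, ai) for ai in a]
    return matvec(V, x_tilde)

rng = random.Random(0)
theta = 0.5
V = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]      # orthogonal 2x2 rotation
Vt = [[V[0][0], V[1][0]], [V[0][1], V[1][1]]]  # V^T
a = (1.0, 2.0)
x = sample_lemma2_input(V, a, rng)
x_back = matvec(Vt, x)                         # = X-tilde, lies in Box(a)
print(all(abs(xi) <= ai for xi, ai in zip(x_back, a)))  # True
```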
Remark 3. Notice that choosing the optimal b for the lower bound (32) is an amplitude allocation problem, which is reminiscent of waterfilling in the average power-constrained case. It would be interesting to study whether the bound in (32) is connected to what is called mercury/waterfilling in [19], [20].
In Fig. 3, the lower bounds of Theorem 10 are compared to the moment upper bound of Theorem 1 for the special case of a 3 × 1 MIMO channel. Similarly to the example presented in Fig. 1, the EPI lower bound performs well in the low amplitude regime while Jensen's inequality lower bound performs well in the high amplitude regime.
We conclude this section by showing that for an arbitrary channel input space X, in the large amplitude regime the capacity pre-log is given by min(n_r, n_t).

Proof: Notice that there always exist a ∈ R^{n_t}_+ and c ∈ R_+ such that Box(a) ⊆ X ⊂ cBox(a). Thus, without loss of generality we can consider X = Box(a), a = (A, . . . , A), for sufficiently large A ∈ R_+. To prove the result, we therefore start by enlarging the constraint set of the bound in (5):

Box(HBox(a)) ⊆ B_0(r_max(HBox(a))) ⊆ B_0(r_max(HB_0(√n_t A))),

where r := √n_t A (∑_{i=1}^{n_min} σ_i²)^{1/2} and a′ := (r/√n_min, . . . , r/√n_min) ∈ R^{n_min}_+. Therefore, by using the upper bound in (5), the resulting pre-log evaluates to n_min.
Next, using the EPI lower bound in (33), we obtain the matching pre-log from below. This concludes the proof.
VI. CONCLUSION

In this work, we have focused on studying the capacity of MIMO systems with bounded channel input spaces. Several new upper and lower bounds have been proposed, and it has been shown that the lower and upper bounds are tight in the high amplitude regime. An interesting direction for future work is to determine the exact scaling in the massive MIMO regime (i.e., n_min → ∞).
Another interesting future direction is to study generalizations of our techniques to MIMO wireless optical channels [21].