Cost of local cooperation in hierarchical virtual MIMO transmission schemes

Hierarchical cooperation schemes in wireless networks rely on local cooperation among neighboring nodes to create virtual multiple-input multiple-output (MIMO) connections between clusters of nodes. It was shown that, by applying the virtual MIMO technique recursively in a hierarchical manner, the sum rate of all source-destination pairs can scale linearly with the number of nodes in the network. In this paper we focus on the impact of local cooperation and establish new capacity scaling bounds for the virtual MIMO transmission taking into account the constraints of local communication both at the transmitters and the receivers. We show that the cost of local communication, which is inevitable to establish the virtual MIMO transmission, grows exponentially with the number of layers in the cooperation hierarchy and plays a vital role in determining the overall performance of the hierarchical virtual MIMO cooperation.


I. INTRODUCTION
Consider a large wireless network of $N$ nodes randomly deployed within an area $A$, in which each node wants to transmit to a random destination node within the network at a common rate. For dense networks, where the area $A$ is fixed but the density of nodes scales up, Gupta and Kumar [1] showed that, with multi-hop transmission, the maximum achievable sum rate $C_N$ scales at most as $O(\sqrt{N})$, as dense networks are essentially interference limited. For extended networks, where the transmission distance scales up with the number of nodes but the node density $N/A_N$ is fixed, Xie and Kumar [2] provided an upper bound on the capacity scaling and showed that the multi-hop transmission strategy is essentially optimal if the path loss attenuation factor is high. A hierarchical cooperation scheme proposed by Özgür et al. [3] employs local communication among neighboring nodes to create virtual multiple-input multiple-output (MIMO) connections between source-destination pairs; this virtual MIMO technique is then applied recursively to solve the local communication problems by formulating new virtual MIMO connections at a smaller scale. It was shown in [3] that, with perfect channel state information at the receivers, the maximal sum rate $C_N$ can scale as $O(N^{1-\epsilon})$ for any $\epsilon > 0$ in dense networks. This hierarchical cooperation scheme was later refined in [4], [5] to maximize the achievable rate. However, the cost of local communications, which grows exponentially with the number of layers in the cooperation hierarchy, was not fully explored in the previous analyses [3]-[5].

This material is based upon work supported by the Air Force Office of Scientific Research (AFOSR) under award No. FA9550-13-1-0023, and by the MIT Wireless Center. The work of S. Shamai has been supported by the European Union's Horizon 2020 Research And Innovation Programme, grant agreement no. 694630.
For dense networks, the number of nodes $N$ cannot be too large if the channels are to remain independent [6]-[8]. For extended networks, the received signal-to-noise ratio (SNR) inevitably becomes low and the channel knowledge can no longer be assumed to be available for free, which hinders the feasibility of quantization-based joint detection at the distributed receivers [9]. Therefore the number of nodes $N$ cannot be too large in this case either.
In this paper we first quantify the cost of local communication for the hierarchical virtual MIMO transmission without imposing any limit on the total number of nodes $N$ in the network, and then discuss the consequence of imposing such a constraint. We generalize the virtual MIMO model [3] to accommodate flexible local collaboration among neighboring nodes. As shown in Fig. 1, for each source-destination pair (take $s_1$ and $d_1$ for example), the transmission is split into three stages. In Stage I the source node $s_1$ distributes its message to all the neighboring nodes $(s_2, \cdots, s_M)$, $Q_s$ bits each, via local communication, to form a virtual multiple-antenna transmitter. In Stage II all the $M$ nodes $(s_1, \cdots, s_M)$ transmit to all the $M$ nodes in the destination cluster $(d_1, \cdots, d_M)$ using the same channel (i.e., in the same time and frequency resource block), and the transmission lasts for one time slot (i.e., one channel use). In Stage III all nodes in the destination cluster first independently quantize their observations into $Q_d$ bits and then forward them to the destination node via local communication. The destination node $d_1$, after receiving all the quantized bits from its neighboring nodes $(d_2, \cdots, d_M)$ in the cluster, performs joint detection to retrieve the $M Q_s$ bits sent by the source node $s_1$. To focus on the impact of local communication, we assume that the virtual MIMO transmission in Stage II is capacity achieving, in the sense that the transmitted message can be recovered successfully if decoding is based on the original observations (without quantization) of all the receiving nodes in the cluster. The virtual MIMO model in Fig. 1 captures the essence of multiple-cell cooperation [10], [11] in cellular networks, where base stations are connected through finite-rate backhaul and the user terminals are capable of communicating directly with other users in their close vicinity.

II. SYSTEM MODEL
We inherit the grid topology from [3], in which $N$ nodes are evenly distributed in a square and every node wants to transmit the same amount of data to a destination node randomly chosen from the remaining $N-1$ nodes. All the transmissions are carried out in the same frequency band, and the aggregate throughput of all the $N$ source-destination pairs, $C_N$, is referred to as the sum rate. For a pair of transmit and receive clusters $d_{st}$ meters apart, the SNR for the transmission between node $i$ in the transmit cluster and node $j$ in the receive cluster is written as in (1), in terms of the transmit power of node $i$, the actual distance $d_{ij}$ between node $i$ in the transmit cluster and node $j$ in the receive cluster, and the attenuation factor $\alpha \ge 2$. We assume that all channels $h_{i,j}$ are i.i.d. with $E[|h_{i,j}|^2] = 1$. All other factors, such as the noise power, antenna gains, and other losses, are modeled through $\gamma_{i,j}$, which has marginal variation. The approximation in (1) ignores the minor variation of the SNR across different transmit-receive pairs, as we are aiming at the capacity scaling behavior with a very large number of nodes rather than the exact capacity.
Given two clusters, each of $M$ nodes, at a distance of $d_{st}$ meters apart, the capacity of a $K_1 \times K_2$ MIMO channel between $K_1 \in [1:M]$ nodes in the transmit cluster and $K_2 \in [1:M]$ nodes in the receive cluster, assuming full cooperation among transmit/receive nodes, can be denoted as $C(K_1, K_2, \gamma)$ [bits per channel use]; the SNR $\gamma$ depends on $(M, d_{st})$, which is not highlighted in this notation. Without specifying the expression of the channel capacity $C(K_1, K_2, \gamma)$, which depends on the channel assumptions, the following fact will be used in our analysis: given any fixed $K_1$ (resp. $K_2$) $\in [1:M]$, the capacity $C(K_1, K_2, \gamma)$ is monotonically increasing and concave w.r.t. $K_2$ (resp. $K_1$) if $K_1$ and $K_2$ are treated as continuous variables.
The hierarchical cooperation protocol proposed in [3] rests on a layered structure, where the local cooperation tasks of Stage I and Stage III of an upper layer in the hierarchy are treated as the communication problem to be solved by the lower layer, which itself also consists of three stages partitioned in the same fashion. Such recursion is applied repeatedly until the communication task for the lower layer becomes trivial or can be solved by local transmission. Our analysis is bottom-up, from the lowest layer (which incurs only local transmission) to the highest layer (in which all nodes in the network are included). Note that at each layer, the local communication in Stage I (and Stage III) is carried out concurrently in all clusters. In this paper we artificially ignore the inter-cluster interference despite the fact that we do not resort to spatial reuse. Such "ignorance" of interference rests on two key observations: the power of inter-cluster interference can be limited to a low level via power control, as stated in [3]; and the penalty of spatial reuse can be greatly reduced by an intelligent scheduling mechanism [4], [5]. Therefore we do not expect our results to change drastically should the residual interference (after power control) be taken into account in the analysis.

[Fig. 2 caption, beginning truncated: ... to the virtual MIMO setup puts $K_1$ transmit nodes (including the source node $s_1$) on the left side of the cut and $K_2$ receive nodes (including the destination node $d_1$) on the right side.]

III. RATE UPPER BOUNDS OF A VIRTUAL MIMO SESSION
For each source-destination pair, say $s_1 \to d_1$, the data transmission takes three steps: $s_1$ first transmits $Q_s$ bits to each of its $M-1$ neighbors via orthogonal channels; an $M \times M$ virtual MIMO transmission is then performed for a duration of one time slot (i.e., one channel use); finally, each node in the receiving cluster forwards $Q_d$ bits to the destination node $d_1$ via orthogonal channels. Therefore, the amount of data transmitted from $s_1$ to $d_1$ can be upper bounded by the cut-set bounds illustrated in Fig. 2. A cut for the virtual MIMO setup may put $K_1$ transmit nodes (including $s_1$) on the left side of the cut and $K_2$ receive nodes (including $d_1$) on the right side. Since we do not distinguish a node from its neighbors, it is the number of nodes rather than their specific identity that matters. All cuts that have $K_1$ transmit nodes on their left side and $K_2$ receive nodes on their right side will be denoted as cut $S_{K_1,K_2}$, as they incur the same upper bound. For $K_1, K_2 = 1, \ldots, M$, the cut $S_{K_1,K_2}$ incurs one constraint from the data transmission via the $K_1 \times K_2$ MIMO channel, $(M-K_1)$ constraints of $Q_s$ bits each from the orthogonal channels originating from $s_1$, and $(M-K_2)$ constraints of $Q_d$ bits each from the orthogonal channels ending at $d_1$. The corresponding upper bound on the number of bits $D$ transmitted from $s_1$ to $d_1$ can be written as

$$D \le C(K_1, K_2, \gamma) + (M-K_1)\,Q_s + (M-K_2)\,Q_d.$$

Since $C(K_1, K_2, \gamma)$ is monotonically increasing and concave, the minimum value of the above combined function is attained at its boundaries, which leads to the cut-set bound (2) (after applying all the cuts), where the accompanying inequality is due to the fact that, when the channel is known at the receiver, a larger capacity gain can be obtained by increasing the degrees of freedom than by increasing the diversity order.
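The boundary argument for the cuts $S_{K_1,K_2}$ can be checked numerically. The Python sketch below enumerates every cut, adds the MIMO term to the $(M-K_1)$ links of $Q_s$ bits and the $(M-K_2)$ links of $Q_d$ bits that cross the cut, and compares the tightest bound with the tightest bound over the four corner cuts only. The function `capacity` is a hypothetical surrogate (monotone and concave in each argument), not the capacity expression of the paper, and the values of `M`, `gamma`, `Qs`, and `Qd` are purely illustrative.

```python
import math


def capacity(k1, k2, gamma):
    """Hypothetical surrogate for C(K1, K2, gamma) in bits per channel use.

    It is increasing and concave in each argument, which is the only
    property the boundary argument relies on.
    """
    return min(k1, k2) * math.log2(1.0 + gamma * max(k1, k2))


def cut_bound(k1, k2, M, Qs, Qd, gamma):
    """Bound from cut S_{K1,K2}: MIMO term plus the orthogonal links crossing the cut."""
    return capacity(k1, k2, gamma) + (M - k1) * Qs + (M - k2) * Qd


def cut_set_bound(M, Qs, Qd, gamma):
    """Tightest bound over all cuts, by exhaustive enumeration."""
    return min(cut_bound(k1, k2, M, Qs, Qd, gamma)
               for k1 in range(1, M + 1)
               for k2 in range(1, M + 1))


def corner_bound(M, Qs, Qd, gamma):
    """Tightest bound over the four corner cuts only."""
    return min(cut_bound(k1, k2, M, Qs, Qd, gamma)
               for k1 in (1, M)
               for k2 in (1, M))


if __name__ == "__main__":
    M, gamma = 16, 0.5
    for Qs, Qd in [(0.5, 0.5), (2.0, 2.0), (5.0, 1.0)]:
        print(Qs, Qd, cut_set_bound(M, Qs, Qd, gamma), corner_bound(M, Qs, Qd, gamma))
```

With this surrogate, generous values of `Qs` and `Qd` leave the full $M \times M$ term as the binding cut, whereas small values make one of the cuts through the local links binding.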
To maximize the cut-set bound in (2), we choose $Q_s$ and $Q_d$ such that [equation missing], which, together with the condition (3), implies that [equation missing]. Therefore, the cut-set bound (2) can now be written as [equation missing].

When the channels satisfy $h_{i,j} \sim \mathcal{CN}(0, 1)$ and the receive nodes have perfect channel state information, the capacity $C(M, M, \gamma)$ can be written as in (6), where the first equality is the MIMO capacity under a per-antenna power constraint, the second step rewrites the capacity in terms of the singular values of the random matrix $\frac{1}{\sqrt{M}} H$, and the third step holds because, for large $M$, the empirical distribution of $\lambda$ converges to a limiting density function $f(x)$ [12, (8.23)], as prescribed by random matrix theory. For both $M\gamma \gg 1$ (e.g., dense networks) and $M\gamma \ll 1$ (e.g., extended networks), we have the following approximation: [equation missing]. When $M$ is large, we can write $C(1, M, \gamma)$ and $C(M, 1, \gamma)$ as [equation missing], where the approximation follows from the law of large numbers for large $M$.

Note that all the capacities $C(1, M, \gamma)$, $C(M, 1, \gamma)$, and $C(M, M, \gamma)$ grow with $M$ if the SNR $\gamma$ does not change with $M$, in which case the requirement on the local communication $Q_s$ and $Q_d$ stated in (4) will not be satisfied as $M$ scales up. If we instead allocate power such that $M\gamma(d_{st}) = \gamma_0$ is a constant, the condition (4) always holds and we can rewrite the condition (for both $\gamma_0 \gg 1$ and $\gamma_0 \ll 1$) as in (9).

IV. AGGREGATE THROUGHPUT UNDER HIERARCHICAL COOPERATION PROTOCOL
By choosing the power allocation such that $M\gamma(d_{st}) = \gamma_0 > 0$, as discussed in Sec. III, the aggregate throughput $C(M, M, \gamma)$ in (6) can be written as $M \int \log(1+\gamma_0 x) f(x)\,dx = M R^*$, where $R^* = \int \log(1+\gamma_0 x) f(x)\,dx$. In the high SNR regime we have $R^* \simeq \log(1+\gamma_0) - \log(e)$ and the requirement in (9) becomes (10). In the low SNR regime we have $R^* \simeq \log(1+\gamma_0)$ and the requirement in (9) becomes (11).
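The two approximations of $R^*$ can be compared against a direct numerical integration. The Python sketch below is an illustration under two assumptions that are not taken from the text: the limiting density $f(x)$ is assumed to be the Marchenko-Pastur density for a square matrix, $f(x) = \frac{1}{2\pi}\sqrt{(4-x)/x}$ on $(0, 4]$, and logarithms are taken to base 2. The function name `r_star` and the step count are illustrative choices.

```python
import math


def r_star(gamma0, steps=20000):
    """Approximate R* = integral of log2(1 + gamma0 * x) * f(x) dx (midpoint rule).

    Assumption: f(x) = sqrt((4 - x) / x) / (2 * pi) on (0, 4].
    The substitution x = 4 * sin(t)**2 turns f(x) dx into
    (4 / pi) * cos(t)**2 dt on [0, pi / 2], which removes the
    singularity of f at x = 0.
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x = 4.0 * math.sin(t) ** 2
        total += math.log2(1.0 + gamma0 * x) * math.cos(t) ** 2
    return (4.0 / math.pi) * total * h


if __name__ == "__main__":
    for gamma0 in (0.01, 1.0, 1000.0):
        low = math.log2(1.0 + gamma0)                 # low-SNR approximation
        high = low - math.log2(math.e)                # high-SNR approximation
        print(gamma0, r_star(gamma0), low, high)
```

Under these assumptions the low-SNR expression should track the integral closely for small $\gamma_0$, and the high-SNR expression should do so for large $\gamma_0$, with a residual gap that shrinks as $\gamma_0$ grows.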

A. How Much Time Is Required in the Hierarchical Protocol
Assume there exists a scheme that can support $N_0$ source-destination pairs at an aggregate throughput $C_0 = \frac{1}{S_0} N_0^{b_0}$ within a cluster of $N_0$ nodes, where $S_0 > 0$ and $b_0 \ge 0$ are two parameters depending on the communication scheme.
1) Layer ℓ=1: Set the cluster size $M_1 = N_0$. In Stage I, each source node within a cluster needs to transmit $Q_s$ bits to each of the other $(M_1-1)$ nodes within the cluster, resulting in $M_1(M_1-1)$ transmission sessions of $Q_s$ bits each. Using the transmission scheme from layer ℓ=0, it takes [equation missing] time slots. In Stage III, a destination node receives $Q_d$ bits from each of its $M_1-1$ neighbors, and on average every node inside the cluster will be a destination node, which results in $M_1(M_1-1)$ sessions of $Q_d$ bits each. Using the transmission scheme from layer ℓ=0, it takes [equation missing] time slots. The aggregate throughput at layer ℓ=1 is therefore given by (12), where the second step is obtained by choosing $N_1 = M_1^{2-b_0}$ and the last step is obtained by setting [equation missing].
2) Layer ℓ=2: Set $M_2 = N_1$. In Stage I each source node transmits $Q_s$ bits to all the $M_2-1$ neighboring nodes, and in Stage III every destination node receives $Q_d$ bits from each of its $M_2-1$ neighbors. The transmission scheme developed at layer ℓ=1, whose rate is $C_1$ as in (12), completes these local exchanges in [equation missing] time slots, whereas Stage II takes $N_2$ time slots, one for each source node. The aggregate throughput at layer ℓ=2 is [equation missing], where the second step is obtained by choosing $N_2 = M_2^{2-b_1}$, and [equation missing].
3) Layer ℓ>2: Assume that at layer $(\ell-1)$ we have [equation missing]. Setting $M_\ell = N_{\ell-1}$, applying the layer $(\ell-1)$ scheme in Stage I and Stage III of layer $\ell$, and resorting to the $M_\ell \times M_\ell$ virtual MIMO transmission in Stage II, we have [equation missing], where the second step is obtained by choosing $N_\ell$ as a function of $M_\ell$ in the same manner as at layers 1 and 2.
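The time accounting of a single layer can be sketched in a few lines of Python. This is a reconstruction under stated assumptions rather than the paper's own expressions: the local exchange time is taken as the number of bits divided by the lower-layer throughput $M^{b}/S$, $M(M-1)$ is approximated by $M^2$, Stage II takes one time slot per source node, and every source delivers $M R^*$ bits per virtual MIMO session. The function names and the numeric values are hypothetical.

```python
def layer_throughput(M, b_prev, S_prev, Qs, Qd, R_star):
    """Return (N, C): network size and aggregate throughput of the next layer.

    Assumptions (not quoted from the paper): the lower-layer scheme runs at
    C_prev = M**b_prev / S_prev inside a cluster of M nodes, M*(M-1) ~ M**2,
    and each source delivers M * R_star bits.
    """
    C_prev = M ** b_prev / S_prev              # lower-layer aggregate throughput
    N = M ** (2.0 - b_prev)                    # network size chosen as in the text
    t_local = M * M * (Qs + Qd) / C_prev       # Stage I + Stage III time slots
    t_mimo = N                                 # Stage II: one slot per source node
    total_bits = N * M * R_star                # bits delivered by all N sources
    return N, total_bits / (t_local + t_mimo)


def next_parameters(b_prev, S_prev, Qs, Qd, R_star):
    """Parameters (b, S) such that C = N**b / S under the same assumptions."""
    return 1.0 / (2.0 - b_prev), (S_prev * (Qs + Qd) + 1.0) / R_star


if __name__ == "__main__":
    N, C = layer_throughput(M=100, b_prev=0.0, S_prev=1.0, Qs=1.0, Qd=1.0, R_star=1.0)
    b, S = next_parameters(0.0, 1.0, 1.0, 1.0, 1.0)
    print(N, C, N ** b / S)
```

Under these assumptions the choice $N = M^{2-b}$ balances the local exchange time against the Stage II time, so the throughput takes the form $N^{b'}/S'$ with $b' = 1/(2-b)$.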

B. Aggregate Throughput (Sum Capacity)
Given $b_0 \in [0, 1)$ and $S_0 > 0$, $b_\ell$ and $S_\ell$ can be determined explicitly for all $\ell = 1, 2, 3, \ldots$, as follows: [equation missing], where $b_\ell$ is determined by observation and induction, whereas $S_\ell$ is determined by the fact that [equation missing]. Therefore we have [equation missing], and the aggregate throughput (i.e., sum rate) $C_\ell$ is given by (22), where the last step is obtained by applying (18), (19) and (20).
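How the exponent $b_\ell$ and the penalty $S_\ell$ evolve across layers can be illustrated with a short Python iteration. The recursion used here, $b_\ell = 1/(2-b_{\ell-1})$ and $S_\ell = (S_{\ell-1}(Q_s+Q_d)+1)/R^*$, is an assumption reconstructed from the three-stage time accounting (bits divided by lower-layer rate, $N_\ell = M_\ell^{2-b_{\ell-1}}$, $M_\ell = N_{\ell-1}$); it is not quoted from the paper, and the function name and inputs are hypothetical.

```python
def hierarchy(b0, S0, N0, Qs, Qd, R_star, layers):
    """Iterate the assumed layer recursion; return a list of (b, S, N) per layer.

    Assumed recursion (a reconstruction, not the paper's stated formulas):
        b_l = 1 / (2 - b_{l-1})
        S_l = (S_{l-1} * (Qs + Qd) + 1) / R_star
        N_l = N_{l-1} ** (2 - b_{l-1})
    """
    b, S, N = float(b0), float(S0), float(N0)
    out = [(b, S, N)]
    for _ in range(layers):
        # The right-hand side uses the previous-layer b for both b and N.
        b, S, N = 1.0 / (2.0 - b), (S * (Qs + Qd) + 1.0) / R_star, N ** (2.0 - b)
        out.append((b, S, N))
    return out


if __name__ == "__main__":
    for layer, (b, S, N) in enumerate(hierarchy(0.0, 1.0, 10, 1.0, 1.0, 1.0, 4)):
        print(layer, round(b, 4), S, N)
```

Under this assumed recursion the exponent approaches 1 only slowly, while $S_\ell$ grows roughly geometrically with ratio $(Q_s+Q_d)/R^*$, which is in line with the statement that the cost of local communication grows exponentially with the number of layers.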

A. When the Total Number of Nodes N is Unbounded
When the total number of nodes $N$ is unbounded, we can choose $N_0$ and $\ell$ independently to scale up $N$.
1) If we fix $N_0$ and let $\ell \to \infty$: We can rewrite (22) as in (23), where $C_0 = N_0^{b_0}/S_0$ is the rate at which local message exchange can be performed at layer ℓ=0, as defined at the beginning of Sec. IV-A. The scaling factor $\beta_\ell$ in (23) is given by [equation missing]. As $\ell \to \infty$, the scaling factor $\beta_\ell$ increases monotonically to [equation missing]. Since we have $Q_s \ge R^*$ and $Q_d \ge R^*$, as stated in (10) and (11), the penalty term on the scaling factor $\beta_\infty$ is not negligible for $b_0 \in [0, 1)$ with a fixed $N_0$.
2) If we fix $\ell$ and let $N_0$ grow: From [1] we know that [equation missing] if the optimal multi-hop strategy is used for message exchange in the initial cluster. We can therefore rewrite (22) (setting $b_0 = 1/2$ and $S_0 = 1$) as [equation missing]. Here the cost of local communication is not significant as long as $\ell$ is not too large.
... grow with $N_0$ and $\ell$, we take the partial derivatives of $\beta$ w.r.t. $N_0$ and $\ell$, relaxing the integer constraints, and set them to zero to determine the relationship between $N_0$ and $\ell$ as in (27). The maximized $\beta$, possible only if (27) is met, is given by (28), and the required total number of nodes $N$ to sustain (28) is given by (29).

B. When N is Large but Fixed
Substituting (20) into (22), we have (30), which prescribes the optimal $\ell$ (to maximize $C$) as in (31) and the corresponding optimal initial cluster size $N_0^*$ as in (32). The optimal "scaling" exponent can now be derived as in (33).

VI. NUMERICAL ILLUSTRATION
To give a quantitative sense of how the local communication cost may degrade the overall performance, in Fig. 3 we plot the maximized capacity scaling exponent $\beta$ stated in (28) as a function of the required total number of nodes $N$ specified in (29). Since the local communication cost must satisfy the constraints $Q_s \ge R^*$ and $Q_d \ge R^*$ in order not to hurt the virtual MIMO capacity, we plot two cases, $(Q_s+Q_d)/R^* = 2$ and $(Q_s+Q_d)/R^* = 3$. The former represents an idealized power control mechanism such that those constraints are always satisfied with equality, and the latter leaves some margin to tolerate rate mismatch. For the initial stage, $b_0 = 0$ corresponds to TDMA-type message exchange and $b_0 = 0.5$ corresponds to the case where ideal multi-hop routing is utilized. From Fig. 3 we can see that targeting a capacity scaling exponent $\beta > 3/4$ would require a total number of nodes $N > 10^{14}$ even for the case with the ideal local communication cost $(Q_s+Q_d)/R^* = 2$. In Fig. 4 we plot the optimal capacity scaling exponent $\beta^*$ specified in (33) as a function of the total number of nodes $N$, where we maximize the aggregate throughput $C_N$ for each fixed $N$ and ignore the integer constraint on $\ell$ and $N_0$. From Fig. 4 we can see that better message exchange methods at the initial stage (i.e., larger $b_0$) help to improve the sum rate $C_N$ and hence also the scaling exponent $\beta^*$ (blue vs. red), and that more than the "necessary" local communication cost $Q_s + Q_d$ severely degrades the overall performance. Since the idealized case $(Q_s+Q_d)/R^* = 2$ may require sophisticated (if not impossible) power control at every node at each stage, an appropriate tradeoff between the local communication margin and the power control complexity is to be investigated. Note that for the range of network sizes we have investigated, $10^3 \le N \le 10^9$ as in Fig. 4, the number of hierarchical layers $\ell^*$ at which the sum rate is maximized is very small (less than 5 even for $N = 10^9$).

Fig. 4. The optimal capacity scaling exponent $\beta^*$ of (33) with the aggregate throughput $C_N$ maximized at each fixed $N$, ignoring the integer constraint on $\ell$ and $N_0$. More than the necessary local communication cost, $Q_s + Q_d > 2R^*$, severely degrades the sum rate $C_N$ and hence also the scaling exponent $\beta^*$. Note that the number of hierarchical layers $\ell^*$ of (31), at which the sum rate is maximized, is less than 5 even for networks consisting of $N = 10^9$ nodes. (Legend: $(Q_s+Q_d)/R^* = 2$ and $(Q_s+Q_d)/R^* = 3$; one axis is labeled "No. of Layers".)