Latency Limits for Content Delivery in a Fog-RAN with D2D Communication

A Fog-Radio Access Network (F-RAN) with an arbitrary number of edge nodes and users is studied in which the users are able to cooperate by communicating over out-of-band broadcast Device-to-Device (D2D) links. Placement and delivery strategies are proposed with the aim of minimizing the Normalized Delivery Time (NDT), a metric that captures the high signal-to-noise ratio worst-case latency for delivering any subset of requested contents to the users. The proposed strategies, based on compress-and-forward, are shown to be optimal to within a constant multiplicative factor of two for all values of the problem parameters. The analysis provides insights on the role of D2D cooperation in improving the delivery latency.


I. INTRODUCTION
The proactive caching of popular content at the Edge Nodes (ENs) is an effective way of reducing delivery time [1]. Apart from alleviating the need to access centralized network resources to fetch requested contents, edge caching also offers opportunities for cooperative transmission and interference management if there are common contents across the caches of multiple ENs. Even without common cached contents, cooperative transmission is possible if the ENs are connected over fronthaul links to a Cloud Processor (CP) with full access to the content library, as in a Cloud-Radio Access Network (C-RAN). However, cooperative transmission in C-RAN comes at the cost of additional latency due to fronthaul transfer [2], [3]. When fronthaul capacity is limited or not available, an alternative approach to mitigating the inter-user interference on the shared wireless channel is to allow the receivers to cooperate over out-of-band Device-to-Device (D2D) links [4]. In such a scenario, the latency overhead caused by D2D communication must be taken into account in order to assess the benefits of D2D communication.
In this paper, we consider a D2D-aided Fog-RAN (F-RAN), illustrated in Fig. 1, in which edge caching, fronthaul connectivity to a CP, and receiver cooperation are leveraged for reducing content delivery time. Following [2], the term F-RAN is used to indicate the use of both cloud and edge caching resources. We aim at characterizing the potential latency reduction that may be achieved by utilizing out-of-band D2D links, while properly accounting for the latency overhead associated with D2D communications.
Related Work: Bounds on the high-Signal-to-Noise-Ratio (SNR) delay metric, the Normalized Delivery Time (NDT), were presented in [5] for a general interference channel with caches equipped at all transmitters and receivers, and the achievable NDT was shown to be optimal in certain cache size regimes. Content delivery in a multi-hop D2D caching network was studied in [6], where the per-node capacity scaling law was derived. The trade-off between cache storage and transmission rate was characterized in [7] for a cache-aided network where the users can demand multiple files. Optimal worst-case delay was derived in [8] for a multi-sender coded caching network with shared caches. The NDT of a general F-RAN system without D2D links was investigated in [9], where the proposed schemes were shown to achieve the minimum NDT to within a factor of 2, and the minimum NDT was completely characterized for two ENs and two users, as well as for other special cases. An F-RAN with heterogeneous contents was studied in [10], and the NDT region was characterized for the case with two ENs and two users. A caching and delivery scheme was presented for a partially-connected F-RAN in [11] and in [12]. An F-RAN with imperfect Channel State Information (CSI) at the CP was studied in [13], and a non-orthogonal transmission scheme was shown to improve the latency performance. The only prior works on D2D-aided F-RAN are [14], [15], which derive the minimum NDT for the special case of an F-RAN with two ENs and two users.
Main Contributions: In this paper, we study the general D2D-aided F-RAN system with M ENs and K users illustrated in Fig. 1. We first present a lower bound on the minimum NDT. Then, we propose an achievable strategy that uses a D2D cooperation scheme based on Compress-and-Forward (CF). Although this strategy is known to be generally suboptimal [14], we show that it achieves the minimum NDT to within a multiplicative factor of 2. This implies that the optimality gap does not scale with the size of the system.

II. SYSTEM MODEL
We consider the F-RAN system with Device-to-Device (D2D) links depicted in Fig. 1, where $K \geq 2$ single-antenna users are served by $M \geq 2$ single-antenna Edge Nodes (ENs) over a downlink wireless channel. Each user is connected to all other users by an orthogonal out-of-band broadcast D2D link of capacity $C_D$ bits per symbol. The model generalizes the set-up studied in [9] by including D2D communications. Each EN is connected to a Cloud Processor (CP) by a fronthaul link of capacity $C_F$ bits per symbol. A symbol refers to a channel use of the downlink wireless channel.
Let $\mathcal{F} = \{f_1, \ldots, f_N\}$ denote a library of $N \geq K$ files, each of size $L$ bits. The library is fixed for the considered time period. The entire library is available at the CP, while the ENs can only store up to $\mu N L$ bits each, where $0 \leq \mu \leq 1$ is the fractional cache size. During the placement phase, contents are proactively cached at the ENs, subject to the cache capacity constraints.
After the placement phase, the system enters the delivery phase, which is organized in Transmission Intervals (TIs). In every TI, each user arbitrarily requests one of the $N$ files from the library. The users' requests in a given TI are denoted by the demand vector $\mathbf{d} \triangleq (d_1, d_2, \ldots, d_K) \in [N]^K$, where, for any positive integer $A$, we define the set $[A] \triangleq \{1, 2, \ldots, A\}$. This vector is known at the beginning of a TI at the CP and ENs. The goal is to deliver the requested files to the users within the lowest possible delivery latency by leveraging fronthaul links, downlink channel, and D2D links.
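For a given TI, let $T_E$ denote the duration of the transmission on the wireless downlink channel. At time $t \in [T_E]$, each user $k \in [K]$ receives a channel output given by
$$y_k[t] = \sum_{m=1}^{M} h_{k,m} x_m[t] + z_k[t], \quad (1)$$
where $h_{k,m}$ is the channel coefficient between EN $m$ and user $k$, $x_m[t]$ is the symbol transmitted by EN $m$ at time $t$, and $z_k[t]$ is additive noise. The channel coefficients are collected in the global CSI matrix $\mathbf{H}$.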

A. Caching, Delivery, and D2D Transmission
The operation of the system is defined by policies that perform caching, as well as delivery via fronthaul, edge, and D2D communication resources.
1) Caching Policy: During the placement phase, for EN $m$, $m \in [M]$, the caching policy is defined by functions $\pi^m_{c,n}(\cdot)$ that map each file $f_n$ to its cached content $s_{m,n}$ as
$$s_{m,n} = \pi^m_{c,n}(f_n), \quad \forall n \in [N]. \quad (2)$$
Note that, as per (2), we consider policies where only coding within each file is allowed, i.e., no inter-file coding (e.g., [16]) is permitted. In order to satisfy the cache capacity constraints, we restrict the mappings to satisfy $H(s_{m,n}) \leq \mu L$, where $H(s_{m,n})$ denotes the entropy of $s_{m,n}$. The overall cache content at EN $m$ is given by $s_m \triangleq (s_{m,1}, s_{m,2}, \ldots, s_{m,N})$.
2) Fronthaul Policy: In each TI of the delivery phase, for EN $m$, $m \in [M]$, the CP maps the library $\mathcal{F}$, the demand vector $\mathbf{d}$, and the CSI $\mathbf{H}$ to the fronthaul message
$$u_m = (u_m[1], \ldots, u_m[T_F]) = \pi^m_f(\mathcal{F}, \mathbf{d}, \mathbf{H}), \quad (3)$$
where $T_F$ is the duration of the fronthaul message. Note that the fronthaul message cannot exceed $T_F C_F$ bits.
3) Edge Transmission Policies: After fronthaul transmission, in each TI, the ENs transmit using a function $\pi^m_e(\cdot)$ that maps the local cache content $s_m$, the received fronthaul message $u_m$, the demand vector $\mathbf{d}$, and the global CSI $\mathbf{H}$ to the output codeword
$$x_m = (x_m[1], \ldots, x_m[T_E]) = \pi^m_e(s_m, u_m, \mathbf{d}, \mathbf{H}). \quad (4)$$
4) Interactive Communication Policies: After receiving the signals (1) over $T_E$ symbols, in any TI, the users apply a D2D conferencing policy. For each user $k \in [K]$, this is defined by the interactive functions $\pi^k_{\mathrm{D2D},t}(\cdot)$ that map the received signal $y_k \triangleq (y_k[1], \ldots, y_k[T_E])$, the global CSI, and the previously received D2D messages from the other users to the D2D message
$$v_k[t] = \pi^k_{\mathrm{D2D},t}(y_k, \mathbf{H}, v^{t-1}), \quad (5)$$
where $t \in [T_D]$, with $T_D$ being the duration of the D2D communication, and $v^{t-1}$ denotes the D2D messages received from the other users up to time $t-1$. All users broadcast the D2D messages (5) to all other users over orthogonal broadcast channels of capacity $C_D$. Hence, the total size of each D2D message $v_k \triangleq (v_k[1], \ldots, v_k[T_D])$ cannot exceed $T_D C_D$ bits, i.e.,
$$H(v_k) \leq T_D C_D. \quad (6)$$
5) Decoding Policy: After D2D communication, each user $k \in [K]$ implements a decoding policy $\pi^k_d(\cdot)$ that maps the channel outputs, the D2D messages from users $[K] \setminus \{k\}$, the user demand, and the global CSI to an estimate of the requested file $f_{d_k}$ given as
$$\hat{f}_{d_k} = \pi^k_d(y_k, V_k, d_k, \mathbf{H}), \quad (7)$$
where $V_k \triangleq \{v_1, \ldots, v_{k-1}, v_{k+1}, \ldots, v_K\}$ is the set of D2D messages sent by users $k' \in [K] \setminus \{k\}$ and received by user $k$.
The probability of error is defined as
$$P_e = \max_{\mathbf{d} \in [N]^K} \max_{k \in [K]} \Pr\left(\hat{f}_{d_k} \neq f_{d_k}\right), \quad (8)$$
which is the worst-case probability of decoding error measured over all possible demand vectors $\mathbf{d}$ and over all users $k \in [K]$.
A sequence of policies, indexed by the file size $L$, is said to be feasible if, for almost all channel realizations $\mathbf{H}$, we have $P_e \to 0$ when $L \to \infty$.

B. Performance Metric
We adopt the Normalized Delivery Time (NDT), introduced in [9], as the performance metric of interest. The NDT is the high-SNR ratio between the worst-case delivery time per bit required to satisfy any possible demand vector $\mathbf{d}$ and the delivery time per bit for an ideal reference system in which each user can receive the desired file at the maximum high-SNR rate of $\log(P)$ [bits/symbol]. To formalize the NDT, we parametrize fronthaul and D2D capacities as $C_F = r_F \log(P)$ and $C_D = r_D \log(P)$. With this parametrization, the fronthaul rate $r_F \geq 0$ represents the ratio between the fronthaul capacity and the high-SNR capacity of each EN-to-user wireless link in the absence of interference; a similar interpretation holds for the D2D rate $r_D \geq 0$.
As discussed, in each TI, the CP first sends the fronthaul messages to the ENs for a total time of $T_F$ symbols; then, the ENs transmit on the wireless shared channel for a total time of $T_E$ symbols; and, finally, the users use the out-of-band D2D links for a total time of $T_D$ symbols. The corresponding NDT contributions are obtained by normalizing the above terms by the delivery time needed on the mentioned reference system:
$$\delta_F \triangleq \lim_{P \to \infty} \lim_{L \to \infty} \frac{T_F}{L/\log(P)}, \quad \delta_E \triangleq \lim_{P \to \infty} \lim_{L \to \infty} \frac{T_E}{L/\log(P)}, \quad (9)$$
and
$$\delta_D \triangleq \lim_{P \to \infty} \lim_{L \to \infty} \frac{T_D}{L/\log(P)}. \quad (10)$$
The factor $L/\log(P)$, used for normalizing the delivery times in (9)-(10), represents the minimal time to deliver a file in the reference system. The total NDT is hence defined as
$$\delta(\mu, r_F, r_D) \triangleq \delta_F + \delta_E + \delta_D, \quad (11)$$
where the notation emphasizes the dependence of the NDT on the fractional cache size $\mu$, and the fronthaul and D2D rates $r_F$ and $r_D$, respectively. The minimum NDT is finally defined as the minimum over all NDTs achievable by some feasible policy:
$$\delta^*(\mu, r_F, r_D) \triangleq \inf\{\delta(\mu, r_F, r_D) : \delta(\mu, r_F, r_D) \text{ achievable}\}. \quad (12)$$
By construction, we have the lower bound $\delta^*(\mu, r_F, r_D) \geq 1$. Furthermore, the minimum NDT can be proved by means of file-splitting and cache-sharing arguments to be convex in $\mu$ for any fixed values of $r_F$ and $r_D$ [9, Lemma 1].
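The normalization behind the NDT can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the ratio of each phase duration to the reference time L/log(P) at finite L and P instead of taking the limits, and it assumes base-2 logarithms since rates are measured in bits per symbol. The function names and all numerical values are hypothetical.

```python
import math


def ndt_contribution(duration_symbols: float, file_bits: float, power: float) -> float:
    """Normalize a phase duration by the reference time L / log2(P).

    Assumption: base-2 logarithm; finite L and P stand in for the limits.
    """
    reference_time = file_bits / math.log2(power)
    return duration_symbols / reference_time


def total_ndt(t_f: float, t_e: float, t_d: float, file_bits: float, power: float) -> float:
    """Sum of fronthaul, edge, and D2D contributions (serial delivery)."""
    return sum(ndt_contribution(t, file_bits, power) for t in (t_f, t_e, t_d))


# Hypothetical example: no fronthaul use, ideal edge phase, D2D rate r_D = 1.25.
L_BITS = 1_000_000          # file size L in bits
P = 2 ** 10                 # transmit power, so log2(P) = 10 bits/symbol
T_E = 100_000               # edge phase lasts exactly the reference time
R_D = 1.25
# Forwarding T_E * log2(P) bits over a link of capacity r_D * log2(P):
T_D = T_E / R_D
delta = total_ndt(0, T_E, T_D, L_BITS, P)
print(delta)
```

With these numbers the edge contribution equals one and the D2D contribution equals 1/r_D, so the three phases simply add up because they are carried out one after the other.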

III. BOUNDS ON THE MINIMUM NDT
In this section, we provide lower and upper bounds on the minimum NDT for the M × K D2D-aided F-RAN described in the previous section.

A. Lower Bound
A general lower bound on the minimum NDT is given in Prop. 1. Following [9], the bound is derived by identifying subsets of information resources from which, for high-SNR, all requested files must be reliably decoded when a feasible policy is implemented. Specifically, for l = 0, 1, . . . , min{M, K }, we consider a subset that consists of the signals {y 1 , . . . , y l , V 1 , . . . , V l } received by l users on the downlink and D2D channels, along with the cache contents and fronthaul messages {s 1 , . . . , s (M−l) , u 1 , . . . , u (M−l) } of (M − l) ENs.
Proposition 1: For a D2D-aided F-RAN with M ENs, each with a fractional cache size µ ∈ [0, 1], K users, a library of N ≥ K files, a fronthaul rate r F ≥ 0, and a D2D rate r D ≥ 0, the minimum NDT is lower bounded as δ * (µ, r F , r D ) ≥ δ lb (µ, r F , r D ), with δ lb (µ, r F , r D ) being the minimum value of the following linear program where (13b) is a set of constraints with l = 0, 1, . . . , min{M, K }, and
Proof: Omitted for brevity. See [17, Appendix B].
Note that, without D2D communication, i.e., r D = 0, the linear program (13) is identical to that of [9, Proposition 1]. For r D > 0, the additional term g(l)r D δ D in (13b) reflects the novel trade-off between the D2D NDT δ D and the edge and fronthaul NDTs δ E and δ F , respectively. This will be further discussed below.

B. Upper Bounds and Achievable Strategy
We first consider the special case in which the fractional cache size is µ = 1/M, and present a D2D-based delivery scheme that makes use of Compress-and-Forward (CF) and does not require the use of the cloud infrastructure. Note that this is possible since a cache capacity of µ = 1/M guarantees that the entire library F is available across the caches of all ENs.
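To see why µ = 1/M is enough for the ENs to hold the library collectively, consider the following minimal sketch. It assumes one particular placement, uncoded and disjoint, in which every file is cut into M equal fragments and EN m stores the m-th fragment of each file; this is an illustrative choice and not necessarily the placement used in the paper. The function names and the toy library are hypothetical.

```python
def place_fragments(library: list[bytes], num_ens: int) -> list[list[bytes]]:
    """Return caches[m][n], the fragment of file n stored at EN m.

    Assumption: every file length is divisible by the number of ENs.
    """
    caches: list[list[bytes]] = [[] for _ in range(num_ens)]
    for content in library:
        if len(content) % num_ens != 0:
            raise ValueError("file length must be divisible by the number of ENs")
        frag_len = len(content) // num_ens
        for m in range(num_ens):
            caches[m].append(content[m * frag_len:(m + 1) * frag_len])
    return caches


def recover_file(caches: list[list[bytes]], n: int) -> bytes:
    """Reassemble file n from the fragments held by all ENs."""
    return b"".join(cache[n] for cache in caches)


# Toy library: N = 4 files of L = 6 bytes each, M = 3 ENs, so mu = 1/3.
library = [bytes(range(6 * n, 6 * n + 6)) for n in range(4)]
caches = place_fragments(library, num_ens=3)
cache_load = sum(len(fragment) for fragment in caches[0])  # equals mu * N * L = 8 bytes
```

Each EN stores exactly µNL symbols, no fragment is duplicated, and every file can be reassembled from the M caches, so no fronthaul transfer is needed to make the requested content available at the edge.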
Lemma 1: For a D2D-aided F-RAN with M ENs, each with a fractional cache size µ = 1/M, K users, a library of N ≥ K files, a fronthaul rate r F ≥ 0, and a D2D rate r D ≥ 0, the minimum NDT is upper bounded as
$$\delta^*(1/M, r_F, r_D) \leq \delta_{\text{D2D-CF}} \triangleq \frac{K}{\min\{M, K\}} \left(1 + \frac{1}{r_D}\right). \quad (15)$$
Proof: The NDT (15) is achieved by the following scheme. Consider first the case M ≥ K. Under this assumption, K out of the M ENs are active at any time to transmit, so that each active EN transmits part of the requested file to a given user. K users are hence served simultaneously, each by a different EN. Each user compresses and forwards its received signal to all other users over the D2D links. Then, each user collects the K received signals and carries out ZF equalization in order to recover the desired signal with no interference from other signals. By quantizing with a rate equal to log(P) bits per downlink symbol, ZF equalization achieves an ideal edge NDT of δ E = 1 given that the SNR after compression scales linearly with P (see [9, Prop. 3]). Due to the use of the D2D links, a latency overhead of δ D = δ E /r D is added to the delivery time, and hence the total NDT is (15). For the complementary case in which M < K, only M users can be served simultaneously, and hence the edge NDT is multiplied by K/M, which gives δ E = K/M and δ D = K/(M r D ).
Remark 1: For a D2D-aided F-RAN with M = 2 ENs and K = 2 users, a different D2D-based scheme was presented in [14], which achieves an NDT equal to 1 + 1/(2r D ) < δ D2D-CF . This scheme is based on real interference alignment [18] and is hence strongly dependent on the assumption of perfect CSI at the transmitters. In contrast, the CF-based scheme discussed above requires only CSI at the receivers in order to perform the ZF equalization.
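The NDT of the CF-based scheme can be computed directly from the two quantities given in the proof, namely an edge NDT of 1 (scaled by K/M when M < K) and a D2D overhead of δ E /r D. The sketch below is an illustration under that reading of the proof; the function names are hypothetical, and the second function simply evaluates the value quoted in Remark 1 for the scheme of [14].

```python
def ndt_d2d_cf(num_ens: int, num_users: int, r_d: float) -> float:
    """NDT of the compress-and-forward scheme at cache size mu = 1/M."""
    if r_d <= 0:
        raise ValueError("the CF scheme needs a strictly positive D2D rate")
    # min(M, K) users are served at a time, so K / min(M, K) rounds are needed.
    delta_e = num_users / min(num_ens, num_users)
    # Forwarding the quantized received signals costs delta_E / r_D.
    delta_d = delta_e / r_d
    return delta_e + delta_d


def ndt_remark_2x2(r_d: float) -> float:
    """Value 1 + 1/(2 r_D) quoted in Remark 1 for M = K = 2."""
    return 1 + 1 / (2 * r_d)


cf_value = ndt_d2d_cf(2, 2, 1.25)        # CF scheme, M = K = 2
ia_value = ndt_remark_2x2(1.25)          # interference-alignment-based scheme
high_rate_value = ndt_d2d_cf(3, 3, 10)   # large D2D rate, overhead of 1/10
```

For M = K = 2 the CF value exceeds the value from Remark 1 by 1/(2r D ), which is the price paid for requiring CSI only at the receivers; the overhead vanishes in both cases as r D grows.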
The CF strategy can be combined with previously proposed delivery techniques studied in [9] by means of file-splitting and cache-sharing [9, Lemma 1]. That is, all files are split in the same way into a number of fragments, and each fragment is delivered through a different policy. In order to obtain a policy that applies for any value of fractional cache size µ, we combine the D2D-based CF scheme (Lemma 1) with the best-known general strategies for an F-RAN model with no D2D cooperation. These are: (i) cache-aided ZF [9, Lemma 2], whereby fragments cached by all ENs are delivered via cooperative ZF precoding; (ii) cache-aided EN coordination [9, Lemma 3], in which fragments cached by only one EN are delivered via interference alignment [18]; and (iii) cloud-aided soft-transfer [9, Proposition 3], whereby ZF precoding is carried out at the cloud, and the fronthaul links are used to convey quantized ZF-precoded signals to the ENs, such that no cache resources are required. To formulate the main result, we define the threshold values r th F and r th D .
Proposition 2: For a D2D-aided F-RAN with M ENs, each with a fractional cache size µ ∈ [0, 1], K users, a library of N ≥ K files, a fronthaul rate r F ≥ 0, and a D2D rate r D ≥ 0, the minimum NDT is upper bounded as δ * (µ, r F , r D ) ≤ δ ach (µ, r F , r D ), where the achievable NDT δ ach (µ, r F , r D ) is obtained by combining the mentioned schemes as follows:
• Low cache, low fronthaul, and low D2D regime (µ ≤ 1/M, r F ≤ r th F , and r D ≤ r th D ): Combining EN coordination and soft-transfer policies yields the NDT δ ach (µ, r F , r D )
• High cache, low fronthaul, and low D2D regime (µ > 1/M, r F ≤ r th F , and r D ≤ r th D ): Combining EN coordination and ZF precoding policies yields the NDT
• High fronthaul and low D2D regime (µ ∈ [0, 1], r F > r th F , and r D ≤ r th D ): Combining ZF precoding and soft-transfer policies yields the NDT
• Low cache and high D2D regime (µ ≤ 1/M, r F ≥ 0, and r D > r th D ): Combining soft-transfer and CF policies yields the NDT
• High cache and high D2D regime (µ > 1/M, r F ≥ 0, and r D > r th D ): Combining CF and ZF precoding policies yields the NDT
Proof: See [17, Appendix A].

IV. CHARACTERIZATION OF THE MINIMUM NDT
In this section, based on the lower and upper bounds of Section III, we discuss the optimality properties of the CF-based strategy. We start with the main result in the following proposition, which shows that the achievable strategy of Prop. 2 is optimal to within a multiplicative factor of 2.
Proposition 3: For a D2D-aided F-RAN with M ENs, each with a fractional cache size µ ∈ [0, 1], K users, a library of N ≥ K files, a fronthaul rate r F ≥ 0, and a D2D rate r D ≥ 0, the strategy of Prop. 2 achieves the minimum NDT to within a factor of 2, i.e.,
$$\frac{\delta_{\text{ach}}(\mu, r_F, r_D)}{\delta^*(\mu, r_F, r_D)} \leq 2. \quad (23)$$
Proof: See [17, Appendix D].
The key result in Prop. 3 is that the multiplicative suboptimality factor of the CF-based D2D approach defined in the previous section does not scale with the size of the system. This is illustrated in Fig. 2, where we plot the achievable NDT δ ach (µ, r F , r D ) and the lower bound δ lb (µ, r F , r D ) as a function of the number of ENs and users, with M = K, fractional cache size µ = 1/M, fronthaul rate r F = 1, and D2D rate r D = 1.25. As seen, the suboptimality gap can be, in practice, significantly smaller than two.
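A simple special case shows why a size-independent gap is plausible. The following worked check is only a sketch: it concerns the CF scheme of Lemma 1 on its own (not the full combined strategy of Prop. 2), assumes µ = 1/M and M ≥ K, and uses only the trivial bound δ* ≥ 1 rather than the linear program of Prop. 1. Under these assumptions the proof of Lemma 1 gives δ E = 1 and δ D = 1/r D, so that:

```latex
\frac{\delta_{\text{D2D-CF}}}{\delta^*(1/M, r_F, r_D)}
  \;\le\; \frac{1 + 1/r_D}{1}
  \;=\; 1 + \frac{1}{r_D}
  \;\le\; 2 \quad \text{whenever } r_D \ge 1,
\qquad \text{e.g. } r_D = 1.25 \;\Rightarrow\; 1 + \tfrac{1}{1.25} = 1.8 .
```

The right-hand side depends on neither M nor K, which matches the qualitative message of Prop. 3 for this particular operating point.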
While the CF-based scheme is approximately optimal as proved by Prop. 3, the gap identified in (23) is generally not zero. As a notable example, for a D2D-aided F-RAN with M = 2 ENs and K = 2 users, the lower bound (13), illustrated in Fig. 2, is tight, and the CF-based strategy is suboptimal [14, Sec. III] (see Remark 1). This said, the next corollary states that CF is close to optimal for sufficiently high D2D rate r D .
Corollary 1: For a D2D-aided F-RAN with M ENs, each with fractional cache size µ ∈ [0, 1], K users, a library of N ≥ K files, a fronthaul rate r F ≥ 0, and a D2D rate r D ≥ max{r th D , 1/ε} with r th D in (17) and ε > 0, the achievable strategy of Prop. 2 is close to optimal in the sense that we have
$$\frac{\delta_{\text{ach}}(\mu, r_F, r_D)}{\delta^*(\mu, r_F, r_D)} \leq 1 + \epsilon.$$
Proof: Follows from the proof of Prop. 3. See [17].
Corollary 1 is illustrated in Fig. 3, where we plot the achievable NDT δ ach (µ, r F , r D ) and the lower bound δ lb (µ, r F , r D ) as a function of the D2D rate r D , for M = 3 ENs, K = 3 users, fractional cache size µ = 1/3, and fronthaul rate r F = 1. As r D increases, δ ach (µ, r F , r D ) is seen to approach the lower bound δ lb (µ, r F , r D ). For example, for r D ≥ 1/ε = 10, the gap to optimality is smaller than ε = 0.1. This is because, for arbitrarily large D2D rate, the latency overhead caused by D2D communications is negligible, and an ideal NDT of one can be achieved by means of ZF equalization at the users. The figure also highlights the gains that can be achieved with sufficiently high D2D rate.
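The ZF equalization step at the users can be sketched as follows. This is a simplified illustration, not the scheme as analyzed in the paper: it is noise-free, ignores the quantization applied before D2D forwarding, and assumes that after the D2D phase each user holds all K received signals along with the channel matrix. The solver, the 3 × 3 channel matrix, and the transmitted symbols are all hypothetical.

```python
def solve_linear(a: list[list[complex]], b: list[complex]) -> list[complex]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("channel matrix is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x


# Hypothetical channel for M = K = 3; entry H[k][m] links EN m to user k.
H = [[1 + 1j, 0.5, -0.3j],
     [0.2, 1 - 0.5j, 0.4],
     [0.1j, -0.6, 0.8 + 0.2j]]
x = [1 + 0j, -1j, 0.5 + 0.5j]  # symbol sent by each EN, one per user
# Signals received on the downlink (noise omitted in this sketch).
y = [sum(H[k][m] * x[m] for m in range(3)) for k in range(3)]
# After D2D exchange, every user knows y and inverts the channel.
x_hat = solve_linear(H, y)
```

User k keeps only the k-th entry of the equalized vector, which carries its own symbol free of interference. In the actual scheme the exchanged signals are quantized, and the cost of exchanging them is what appears as the D2D overhead that becomes negligible for large r D.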

V. CONCLUSIONS
In this work, we have studied the benefits of out-of-band broadcast Device-to-Device (D2D) communication for content delivery in a general Fog-Radio Access Network (F-RAN) with an arbitrary number of Edge Nodes (ENs) and users. Focusing on the Normalized Delivery Time (NDT) metric, a strategy based on compress-and-forward D2D communication was shown to be approximately optimal to within a constant factor of 2 for all values of the problem parameters. For sufficiently high D2D capacity, the proposed strategy was proved to achieve a significantly lower delivery latency than the minimum NDT for F-RAN without D2D communication. Similar results for a D2D-aided F-RAN under pipelined delivery policies, whereby simultaneous transmissions on fronthaul, edge, and D2D channels are enabled, can be found in [17].