Energy-Efficient Transceiver Design for Cache-Enabled Millimeter-Wave Systems

In recent years, network densification and edge caching have become effective approaches to reducing the burden on fronthaul links and the content delivery latency of wireless communication systems. However, maximizing the system spectral efficiency does not directly provide insight into the energy requirements or energy efficiency of cache-enabled millimeter-wave (mmWave) radio access networks (RANs). In this paper, we study the design of an energy-efficient transceiver, consisting of analog and digital precoders/combiners, for the delivery phase of the downlink of cache-enabled mmWave RANs. Due to the non-convexity of the delivery rate and the objective, the coupling between the digital and analog precoders/combiners, and the constant modulus constraint on the elements of the analog precoders/combiners, the problem of interest is non-convex, and it is hard to obtain a globally optimal or even a locally optimal solution. To this end, we first address these challenges one by one and then transform the original problem into a tractable one. Finally, an algorithmic framework with provable convergence to a Karush-Kuhn-Tucker (KKT) solution is developed for the design of the energy-efficient transceiver. Numerical results are provided to evaluate the performance of the proposed algorithm, where fully digital precoding is used as the benchmark.


I. INTRODUCTION
Accompanying the development of electronic and communication technologies, various new applications have emerged, such as augmented reality, virtual reality, and autonomous vehicles. Consequently, the fifth generation (5G) and beyond-5G networks are anticipated to support massive wireless connectivity for both human-centric and machine-type services with high spectral/energy efficiency and ultra-reliable low-latency delivery [1]. In general, augmented and virtual reality require gigabit-per-second data rates, and typical emerging Internet of Things (IoT) applications require a latency from 0.25 ms to 10 ms and an outage probability (or packet loss rate) on the order of $10^{-3}$ to $10^{-9}$ [2]. To cater for these new demands on wireless communication networks, millimeter-wave (mmWave) communication and caching popular contents at the network edge stand out as two powerful and promising ways to realize multi-gigabit data rates and millisecond-level end-to-end delivery latency [3]-[5].
In the past few years, due to the abundant spectrum resources in the range of 30-300 gigahertz (GHz) [6], [7], mmWave communication has been regarded as a promising candidate for providing an order-of-magnitude improvement in the achievable data rate, and it has attracted extensive attention from both industry and academia. To balance the tradeoff between hardware cost/complexity and system performance in mmWave systems, the digital-analog hybrid antenna architecture, in which the number of radio frequency (RF) chains is smaller than the number of antennas, has been extensively considered as a cost-efficient architecture [8]-[16]. The works in [8]-[11] consider the design of hybrid precoders for point-to-point mmWave communication systems. To decouple the digital and analog precoding matrices, the orthogonal matching pursuit (OMP) [8]-[10] and matrix decomposition [11] methods are used to obtain the analog and digital precoders from fully digital (sub)optimal solutions. A codebook-based hybrid precoding design was investigated for mmWave downlink multiuser communication systems [12]. Recently, the penalty dual decomposition (PDD) method was utilized to optimize the hybrid precoding for point-to-point mmWave communication systems [13], [14]. The effect of the physical propagation delay of electromagnetic waves on transmitted wideband signals was analyzed for mmWave large-scale multiple-input multiple-output (MIMO) systems [15]. In addition, joint channel estimation and hybrid precoding was investigated by designing a novel hierarchical multiresolution codebook to construct training analog beamforming vectors for mmWave communication systems [16]. Note that the aforementioned literature only focuses on conventional non-cache-enabled point-to-point or multiuser mmWave communication systems.
0090-6778 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
In a conventional cellular network or cloud radio access network (C-RAN), since the data center is far away from the end devices, increasing the bandwidth alone is not sufficient to achieve millisecond-level end-to-end delivery latency. Recently, to alleviate the fronthaul capacity requirement and reduce the user-perceived latency in C-RANs, an evolved cache-enabled RAN was proposed in which the remote radio heads (RRHs), referred to as enhanced RRHs (eRRHs), are able to store popular contents and perform baseband signal processing [3], [4]. During off-peak traffic periods, some of the most frequently requested files can be pre-fetched to the local cache at the network edge to reduce delivery latency and increase spectral efficiency [17]-[19]. A cache placement strategy with limited fronthaul and cache capacity was studied to minimize the average download delay of user requests [17]. The authors of [18] investigated how to cache files at different caching units with the objective of maximizing the average requested data rate subject to a finite service latency. Successful transmission probabilities were analyzed for a two-tier large-scale cache-enabled wireless multicasting network [19].
Joint optimization of cloud and edge precoders was also investigated with the aim of maximizing or minimizing a system performance criterion under certain constraints for cache-enabled RANs [20]-[22]. A novel partial caching pipelined transmission scheme was proposed to reduce both the burden on fronthaul links and the delivery latency for cache-enabled RANs [23], [24]. To further reduce signalling overhead, distributed fog computing was used to investigate the batched sparse code for industrial control [25]. These studies have shown that caching popular contents at the network edge reduces the delivery latency and the burden on the fronthaul links of wireless communication systems. However, the aforementioned works focused on cache-enabled transmission optimization for sub-6 GHz communication systems. More recently, cache-enabled mmWave precoding design was investigated to maximize the minimum user rate and thereby guarantee user fairness [26]. The digital-analog hybrid antenna architecture makes the design of the transmission scheme more challenging in cache-enabled mmWave communication systems. The algorithms developed in [26], which only optimize the hybrid precoding and adopt a Taylor series expansion to optimize the analog precoder, are guaranteed to converge, but cannot be proved to converge to the global optimum or to a Karush-Kuhn-Tucker (KKT) solution.
On the other hand, due to environmental concerns and sustainable growth considerations, green communication is rapidly gaining attention [27], [28]. Indeed, information and communication technologies (ICT) account for about 2% of the world's energy consumption, and the situation is likely to reach the point where ICT equipment in large cities will require more energy than is actually available [29]. To save transmit power, one simple approach is to minimize the transmit power subject to quality-of-service requirements. However, minimizing transmit power does not mean maximizing spectral efficiency. To achieve a high data rate with limited energy consumption, energy efficiency, measured in bit/Joule, has recently become a new performance optimization criterion for wireless communication, especially for mmWave communication systems in which energy efficiency is a priority [30]. Note that the aforementioned references [27]-[30] focus on the design of energy-efficient transmission for non-cache-enabled wireless communication systems.
Compared to the conventional delivery rate maximization or cost minimization problems for cache-enabled RANs [20]-[22], optimizing an energy-efficient hybrid transceiver for cache-enabled mmWave RANs is more challenging, and its optimal solution is more difficult to obtain, due to the coupling between the digital and analog precoders and the additional constant modulus constraints. Meanwhile, the constraints on the delivery rate and the fronthaul links make the energy-efficient design of transceivers different from that in conventional mmWave communication systems [12], [13]. Furthermore, to the best of the authors' knowledge, the energy efficiency problem of cache-enabled mmWave RANs is still an open problem. Motivated by these observations, in this paper, we investigate the design of an energy-efficient transceiver for the delivery phase of the downlink of cache-enabled mmWave RANs. The main contributions of this paper can be summarized as follows:
• We jointly optimize the analog combiners/precoders and the digital combiners/precoders to maximize the system energy efficiency of cache-enabled mmWave RANs employing soft fronthaul information transfer;
• To solve the non-convex energy efficiency optimization problem, we first transform the original problem into a tractable form via fractional programming theory and the penalty dual decomposition (PDD) method [13], [14]. In particular, fractional programming theory is used to transform the fractional form into a parameterized subtractive one. Then, the PDD method is exploited to move the coupling constraints into the objective;
• An effective optimization algorithm is developed to solve the formulated optimization problem via the alternating direction method of multipliers (ADMM) [31] and block successive upper-bound minimization (BSUM) [32]. Furthermore, we prove that the presented algorithm converges to a KKT solution of the considered problem;
• Numerical experiments are carried out to evaluate the performance of the presented algorithm. They reveal that the hybrid antenna architecture is not necessarily the best scheme for cache-enabled mmWave RANs from the perspective of system energy efficiency and delivery rate. However, compared to non-caching mmWave RANs, caching popular files at the network edge helps to improve the system performance.
The remainder of this paper is organized as follows. The system model is described in Section II. In Section III, the optimization method is designed for the considered optimization problem. In Section IV, the performance of the developed algorithms is evaluated via simulation. Conclusions are drawn in Section V.
Notations: Bold lowercase and uppercase letters represent column vectors and matrices, respectively. The superscript $(\cdot)^H$ represents the conjugate transpose operator. $\mathrm{tr}(\cdot)$ and $\|\cdot\|_F$ denote the trace and the Frobenius norm, respectively. $\mathrm{diag}(\mathbf{a})$ is a diagonal matrix whose diagonal elements are the elements of vector $\mathbf{a}$. $\mathbf{A} \succeq \mathbf{0}$ means that $\mathbf{A}$ is a positive semidefinite matrix. $\mathbf{0}_{N \times N}$ and $\mathbf{I}_{N \times N}$ denote the $N \times N$ zero and identity matrices, respectively. The function $\lfloor x \rfloor$ rounds $x$ to the nearest integer not larger than $x$. $\bar{a}$ denotes the complement $1 - a$ of a binary variable $a \in \{0, 1\}$. $\mathbb{E}[\cdot]$ denotes the expectation operator. $\ln(\cdot)$ is the logarithm with base $e$. $j$ is the imaginary unit, i.e., $j^2 = -1$. $\Re(\cdot)$ and $\Im(\cdot)$ are the real and imaginary parts, respectively. The circularly symmetric complex Gaussian distribution with mean $\mathbf{u}$ and covariance matrix $\mathbf{R}$ is denoted by $\mathcal{CN}(\mathbf{u}, \mathbf{R})$. The symbols used frequently in this paper are summarized in Table I.

II. SYSTEM MODEL
In this work, we investigate the downlink transmission of cache-enabled mmWave RANs. There are $K_U$ multi-antenna users establishing wireless communication with $K_R$ eRRHs, as illustrated in Fig. 1, where LPF is the abbreviation of low-pass filter. Each eRRH is equipped with an $N_t$-antenna uniform linear array (ULA). Let $\mathcal{K}_R = \{1, \cdots, K_R\}$ and $\mathcal{K}_U = \{1, \cdots, K_U\}$ be the sets of eRRHs and users, respectively. For simplicity, we assume that all eRRHs are equipped with the same number $N_{tRF}$ of RF chains, each connected to the $N_t$ transmit antennas via $N_t$ phase shifters. eRRH $i$ is connected to the baseband unit (BBU) through an error-free fronthaul link of capacity $C_i$ nats/s/Hz, $i \in \mathcal{K}_R$. Each user is equipped with $N_{uRF}$ RF chains and $N_r$ receive antennas. Each RF chain is connected to the $N_r$ receive antennas via $N_r$ phase shifters.

A. Cache Model
Without loss of generality, we assume that all files in library $\mathcal{F}$ at the BBU have a size of $nS$ nats, where $S$ is the normalized file size and $n$ is the number of symbols in each downlink coded transmission interval. The assumption of equal file sizes is standard and can be justified, since in practice what is cached and requested by users are chunks of content, e.g., fragments of a given duration, which may be safely assumed to be of the same length [33], [34]. Each eRRH $i \in \mathcal{K}_R$ is equipped with a cache of size $nB_i > 0$ nats, where $B_i$ is the normalized cache size. To provide spatial file diversity (which can improve the performance of dense wireless networks) [19], each eRRH $i \in \mathcal{K}_R$ randomly prestores $nB_i$ nats out of the $nFS$ nats in the library in its local cache according to certain preferences [20]-[22]. In other words, each eRRH $i \in \mathcal{K}_R$ randomly selects $\lfloor B_i/S \rfloor$ files from library $\mathcal{F}$ at the BBU to store in its local cache. The cache status of file $f$ at eRRH $i$ can be modeled by a binary variable $c_{f,i}$, where $c_{f,i} = 1$ if file $f$ is cached at eRRH $i$ and $c_{f,i} = 0$ otherwise. Let $\bar{c}_{f,i}$ denote the complement $1 - c_{f,i}$ of binary variable $c_{f,i}$. In this work, we do not focus on content popularity or specific caching strategies; for details, please refer to the literature, e.g., [17]-[21]. Therefore, for simplicity, in the following, we assume that the cache state information $c_{f,i}$, $f \in \mathcal{F}$, $i \in \mathcal{K}_R$, is predetermined.
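As a rough illustration of the cache model, the sketch below draws a cache state matrix in which each eRRH stores the integer part of $B_i/S$ files. The function name `random_cache_state`, the uniform sampling, and the example sizes are assumptions made here for illustration only; the text merely states that files are selected "according to certain preferences".

```python
import numpy as np

def random_cache_state(num_files, cache_sizes, file_size, rng):
    """Return a binary matrix c with c[f, i] = 1 if file f is cached at eRRH i.

    Assumption: files are drawn uniformly at random without replacement;
    the paper leaves the actual placement preference unspecified.
    """
    c = np.zeros((num_files, len(cache_sizes)), dtype=int)
    for i, b in enumerate(cache_sizes):
        # eRRH i can hold floor(B_i / S) whole files (at most the whole library).
        n_cached = min(num_files, int(np.floor(b / file_size)))
        chosen = rng.choice(num_files, size=n_cached, replace=False)
        c[chosen, i] = 1
    return c

rng = np.random.default_rng(0)
c = random_cache_state(num_files=6, cache_sizes=[2.0, 3.5], file_size=1.0, rng=rng)
c_bar = 1 - c  # complement of the cache state, as defined in the text
print(c)
```

With normalized cache sizes 2.0 and 3.5 and unit file size, the two eRRHs hold two and three files, respectively.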

B. Channel Model
A narrowband clustered channel model with $N_c$ non-line-of-sight (NLoS) scatterers and one line-of-sight (LoS) path, based on the extended Saleh-Valenzuela model, is adopted in this work [6]. Each scatterer is further assumed to contribute a single propagation path to the channel between a user and an eRRH [7]. Thus, channel matrix $\mathbf{H}_{k,i}$ can be modeled as in [35], [36], where $\theta_{l,k,i} \in [0, 2\pi)$ and $\phi_{l,k,i} \in [0, 2\pi)$ are the angles of arrival and departure (AoAs/AoDs) of the $l$-th NLoS path between eRRH $i$ and user $k$, respectively. $\theta_{0,k,i} \in [0, 2\pi)$ and $\phi_{0,k,i} \in [0, 2\pi)$ are the AoA and AoD of the LoS link between eRRH $i$ and user $k$, respectively. $\alpha_{l,k,i} \sim \mathcal{CN}(0, \sigma_{k,i}^2)$ and $\rho_{l,k,i}$ denote the complex gain and the attenuation associated with the $l$-th path between eRRH $i$ and user $k$, respectively, $l \in \mathcal{K}_C = \{1, \cdots, N_c\}$. $\rho_{0,k,i}$ denotes the attenuation of the LoS link between eRRH $i$ and user $k$. $I(\rho_{0,k,i})$ is a random variable indicating whether a LoS link exists between eRRH $i$ and user $k$. Let $p$ denote the probability that $I(\rho_{0,k,i}) = 1$, i.e., that a LoS link exists. The value of $p$ can be determined according to the formulas given in [37, Eq. (2) and Eq. (3)]. In particular, for an $N_t$-element ULA, the array response vector is given by [38] in terms of the carrier wavelength $\lambda_s$ and the antenna spacing $d_a$. The array response vector $\mathbf{a}_u(\theta_{l,k,i})$ at user $k$ can be written in a similar fashion. For simplicity, we assume that perfect channel state information (CSI) $\mathbf{H}_{k,i}$ can be obtained at the BBU. The CSI can be acquired by the eRRHs and reported back to the BBU via the fronthaul links, e.g., [20]-[22]. The analysis of the effect of imperfect CSI is an interesting topic for future work [39], [40].
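To make the channel description concrete, the following sketch generates one channel realization with a commonly used ULA array response, $[1, e^{j 2\pi (d_a/\lambda_s)\sin\phi}, \ldots]^T/\sqrt{N}$. This is an illustration under stated assumptions, not the paper's exact expression: the path attenuations $\rho_{l,k,i}$ are set to one, the LoS gain is set to one, the scaling $\sqrt{N_t N_r/(N_c+1)}$ is a common convention, and the helper names `ula_response` and `sv_channel` are hypothetical.

```python
import numpy as np

def ula_response(n, angle, spacing_over_wavelength=0.5):
    """Unit-norm ULA array response (assumed standard form, d_a / lambda_s = 0.5)."""
    k = np.arange(n)
    return np.exp(1j * 2 * np.pi * spacing_over_wavelength * k * np.sin(angle)) / np.sqrt(n)

def sv_channel(n_r, n_t, n_c, sigma2, p_los, rng):
    """N_r x N_t clustered channel: optional LoS path plus n_c single-path scatterers.

    Assumptions: attenuations rho = 1, unit LoS gain, angles uniform on [0, 2*pi).
    """
    H = np.zeros((n_r, n_t), dtype=complex)
    if rng.random() < p_los:  # LoS indicator equals 1 with probability p
        theta, phi = rng.uniform(0, 2 * np.pi, size=2)
        H += np.outer(ula_response(n_r, theta), ula_response(n_t, phi).conj())
    for _ in range(n_c):
        # Complex path gain alpha ~ CN(0, sigma2)
        alpha = np.sqrt(sigma2 / 2) * (rng.standard_normal() + 1j * rng.standard_normal())
        theta, phi = rng.uniform(0, 2 * np.pi, size=2)
        H += alpha * np.outer(ula_response(n_r, theta), ula_response(n_t, phi).conj())
    return np.sqrt(n_t * n_r / (n_c + 1)) * H

rng = np.random.default_rng(0)
H = sv_channel(n_r=4, n_t=16, n_c=2, sigma2=1.0, p_los=0.5, rng=rng)
print(H.shape)
```

Because each path contributes a rank-one term, the resulting matrix has rank at most $N_c + 1$, which is the sparsity that hybrid architectures exploit.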

C. Signal Model
According to the caching state of the files, the contents of the files requested by the users can be retrieved from the local cache or from the library located at the BBU. In general, there are two kinds of transfer methods for fetching the uncached requested files from the BBU, i.e., hard and soft fronthaul information transfer. For hard fronthaul information transfer, the hard information of the uncached files is transferred to the eRRHs via the fronthaul links. For soft fronthaul information transfer, in contrast, the fronthaul links are used to transfer a quantized version of the precoded signals of the missing files [20], [26]. In this paper, we adopt soft fronthaul information transfer to deliver the uncached requested files from the BBU to the eRRHs that have not cached them. Consequently, the signal $\mathbf{x}_i \in \mathbb{C}^{N_t \times 1}$ transmitted by eRRH $i$ includes the information of both the cached and the uncached requested files. Two digital precoding matrices are applied at eRRH $i$ to baseband signal $\mathbf{s}_f$: one for the case in which file $f$ is cached locally and the other, $\mathbf{G}_{f,i}$, for the case in which it is not available at the local cache.
The RF-chain index of the analog precoder satisfies $n_i \in \mathcal{N}_{tC} = \{1, \cdots, N_{tRF}\}$. The quantization noise $\mathbf{z}_i \in \mathbb{C}^{N_{tRF} \times 1}$ is assumed to be independent of the information of the uncached requested files and distributed as $\mathbf{z}_i \sim \mathcal{CN}(\mathbf{0}, \Omega_i)$. Note that if all requested files in $\mathcal{F}_{req}$ have been stored in the local cache, then the quantization noise $\mathbf{z}_i$ is a zero vector. We further assume that the quantization noise $\mathbf{z}_i$ is independent across the eRRHs, i.e., the signals intended for different eRRHs are quantized independently. Consequently, the signal received at user $k$ can be expressed as in (5), in terms of a precoding super-matrix and permutation matrices $\mathbf{P}_i$. In (5), $\mathbf{n}_k$ is the additive white Gaussian noise (AWGN), distributed as $\mathcal{CN}(\mathbf{0}_{N_r \times 1}, \sigma_k^2 \mathbf{I}_{N_r \times N_r})$. For simplicity, we assume that all users adopt a linear receiver to recover the desired signal from the received signal. Let $\mathbf{U}_{RF,k} \in \mathbb{C}^{N_r \times N_{uRF}}$ be the RF combining matrix and $\mathbf{U}_{BB,k} \in \mathbb{C}^{N_{uRF} \times d_f}$ be the baseband combining matrix. Similar to the RF precoding matrix $\mathbf{F}_{RF,i}$, $\mathbf{U}_{RF,k}$ is also implemented using analog phase shifters. Consequently, the baseband signal at user $k$ is recovered by applying $\mathbf{U}_{RF,k}$ and $\mathbf{U}_{BB,k}$ to the received signal. Different from a conventional MIMO communication system, in which each antenna has a dedicated RF chain, in a mmWave communication system the received signal at each RF chain is a combination of the signals received at all antennas via a phase shifter network. Thus, when Gaussian symbols are transmitted over the mmWave channel, the achievable data rate $R_k$ at user $k$, in units of nats/s/Hz, can be defined as a log-determinant expression [14] in which $\Pi_k$ is given by (10),
with $\Omega = \mathrm{diag}(\Omega_1, \cdots, \Omega_{K_R})$. In the following subsection, the optimization objective will be formulated to obtain the corresponding digital and analog transceiver strategies.

D. Optimization Objective
In this paper, the objective is to maximize the system energy efficiency under constraints on the fronthaul transmission rate, the eRRH transmit power, and the analog precoders and combiners. In general, energy efficiency is defined as the number of communicated nats per unit of power consumption [27]-[30]. Accordingly, the resulting optimization problem is formulated as problem (11), where $r_f$ denotes the delivery rate for requested file $f$, a constant accounts for the inefficiency of the power amplifier [13], $p_i(\mathbf{F}_{RF,i}, G, \Omega_i)$ is the dynamic power consumption, and $p_c$ is the constant circuit power consumption, calculated as in (13). In (13), $P_{tRFC}$ and $P_{rRFC}$ represent the power consumed by all RF chains at the eRRHs and the users, respectively. $P_{PSC}$ represents the power consumed by all phase shifters at the eRRHs and the users. $P_{LNAC}$ denotes the power consumption of all low-noise amplifiers. $P_{LO}$ and $P_{BB}$ denote the power consumed by the local oscillator and the basic power consumption, respectively. $P_{DAC}$ and $P_{ADC}$ denote the power consumption of the digital-to-analog converter and the analog-to-digital converter, respectively. $P_{mixer}$ and $P_{LPF}$ denote the power consumed by the mixer and the LPF, respectively. $P_{PS}$ stands for the power consumption of a phase shifter. $P_{LNA}$ represents the power consumption of a low-noise amplifier [9], [13]. In constraint (11d), $g_i(G, O)$ denotes the fronthaul rate function. According to [41, Ch. 3], to reliably recover signal $\mathbf{x}_i$ at eRRH $i$, the constraint $g_i(G, O) \le C_i$ has to be satisfied. In problem (11), constraint (11b) implies that the data delivery rate for user $k$ cannot exceed the achievable rate of user $k$. Constraint (11c) ensures that the data delivery rate of file $f$ does not exceed the normalized file size $S$. The rate on each fronthaul link is constrained by the limited fronthaul link capacity $C_i$ in constraint (11d). Constraint (11e) limits the maximum allowable transmit power of eRRH $i$.
Constraints (11f) and (11g) are the constant modulus constraints on each element of the analog precoding and combining matrices, which are due to the implementation with analog phase shifters. Note that there are six main obstacles to solving problem (11), i.e., the non-convex objective (11a), the non-convex achievable rate $R_k$ on the right-hand side of constraint (11b), the non-convex fronthaul capacity constraint (11d) and power constraint (11e), and the non-convex constant modulus constraints (11f) and (11g). Therefore, problem (11) is a non-convex fractional programming problem, and its globally optimal solution is difficult to obtain. In addition, if eRRH $i$ has cached all requested files, i.e., every file $f \in \mathcal{F}_{req}$ is pre-stored in its local cache at the network edge, the corresponding fronthaul link capacity constraint in (11d) is redundant and can be removed. Furthermore, in this case the expression for the dynamic power consumption of eRRH $i$ is replaced accordingly. To provide a performance comparison, we also formulate the maximization of the delivery rate as problem (15), whose constraints (15b) are (11b), (11c), (11d), (11e), (11f), and (11g). Compared to problem (11), the objective function of problem (15) is simpler, but problem (15) is still non-convex, and its globally optimal solution is hard to obtain. In what follows, we focus on designing an efficient and effective heuristic optimization algorithm to solve problems (11) and (15), respectively.

III. DESIGN OF OPTIMIZATION METHOD
For the design of the hybrid transceiver of interest, one of the main difficulties is the cascaded relation between the analog precoders/combiners and the digital precoders/combiners. In the existing literature, many optimization methods are used to overcome this difficulty, such as the OMP method [8]-[10], the Lagrange convex approximation method [11], and the codebook-based analog precoding design method [12]. On the other hand, by treating the product of the analog precoder/combiner and the digital precoder/combiner as a whole, problem (11) can be transformed into another form that is easier to solve than the original one [13], [14]. In what follows, we explore the solution of problem (11) in detail.

A. Algorithmic Architecture
Thus, problem (11) can be equivalently rewritten as problem (16), whose constraints (16b) include (11b), (11c), (11d), (11f), and (11g). The non-convexity of objective (16a) is mainly caused by its fractional form. To overcome this difficulty, a classic technique was proposed in [42] to transform a single-ratio fractional form into a parameterized linear form by introducing an additional auxiliary variable. Thus, the fractional objective function can be transformed into a linear one. This implies that problem (16) can be rewritten as (18), where $\lambda$ is an auxiliary tunable parameter. As a result, an iterative algorithm (known as the Dinkelbach method [43]) can be used to solve problem (16). This algorithm is summarized as Algorithm 1, which uses a given stopping threshold; when the threshold is not met, the algorithm updates $\lambda$ and goes back to Step 2. The convergence to the optimal energy efficiency is guaranteed [27].
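The parameterized transformation can be illustrated on a scalar toy problem. The sketch below applies the Dinkelbach iteration to maximizing $\ln(1 + a x)/(\xi x + p_c)$ over $0 \le x \le P$; this single-link ratio, the symbols `a`, `xi`, `p_c`, `p_max`, and the closed-form inner solution are assumptions chosen for illustration and do not reproduce problem (16).

```python
import math

def dinkelbach(a, xi, p_c, p_max, eps=1e-9, max_iter=100):
    """Maximize ln(1 + a*x) / (xi*x + p_c) over 0 <= x <= p_max (toy example)."""
    lam = 0.0
    x = p_max
    for _ in range(max_iter):
        # Inner problem: maximize ln(1 + a*x) - lam * (xi*x + p_c), which is concave in x.
        if lam <= 0.0:
            x = p_max
        else:
            x = min(max(1.0 / (lam * xi) - 1.0 / a, 0.0), p_max)
        f = math.log1p(a * x)      # numerator (rate in nats)
        g = xi * x + p_c           # denominator (consumed power)
        if f - lam * g < eps:      # stopping threshold on the parameterized objective
            break
        lam = f / g                # update the auxiliary parameter with the current ratio
    return x, lam

x_opt, ee_opt = dinkelbach(a=10.0, xi=2.5, p_c=1.0, p_max=5.0)
print(x_opt, ee_opt)
```

Each iteration solves a subtractive problem for a fixed $\lambda$ and then resets $\lambda$ to the achieved ratio, so $\lambda$ is non-decreasing and the loop stops once the parameterized objective reaches zero.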

B. Inner Optimization Designs
In this subsection, we focus on investigating the solution of problem (18) by using the PDD method [13], [14], ADMM method [31], BSUM method [32] and the relation between the user rate and minimum mean square error (MMSE) [44].
1) Problem Transformation: In problem (18), equality constraints (16d) and (16e) are the two main obstacles. Fortunately, the research on equality-constrained optimization in [13], [14] has shown that problem (17) can be further reformulated as (19). In (19a), matrix $\Xi_s$ is calculated as in (20), where $\Phi_{f,i}$ and $\Psi_i$ are the Lagrange multiplier matrices associated with equality constraints (16d) and (16e), respectively. Two penalty parameters are used, one of which is denoted by $\rho$. Although objective function (19a) is cast in a linear form by using fractional programming theory and the PDD method, problem (19) is still non-convex, and its globally optimal solution is therefore difficult to obtain. The main difficulties in solving problem (19) are three-fold. The first is the non-convexity of the achievable rate $R_k$ of user $k$. The second is the coupling of the analog and digital precoding matrices. The third is the constant modulus constraint on the elements of the analog precoding and combining matrices.
In the sequel, we focus on overcoming the non-convexity of the achievable rate $R_k$ of user $k$ by using the weighted MMSE transformation. First, we define the mean square error (MSE) matrix $\mathbf{E}_k$ as in (21), where $\Pi_k$ is given by (22). The authors of [44] have shown that the achievable data rate $R_k$ of user $k$ can be expressed in terms of $\mathbf{E}_k$. Based on this observation, instead of directly solving problem (18), we resort to addressing a lower-bound problem given by (23). The optimal objective value of problem (18) is an upper bound on that of problem (23), because the feasible region of problem (23) is a subset of that of problem (18). Similarly, problem (15) can be approximated by the lower-bound problem (24), whose constraints (24b) are (11c), (11d), (11f), (11g), (16c), and (23b).
Although problems (23) and (24) have more tractable forms than their original counterparts, their globally optimal solutions still cannot be obtained via conventional optimization methods. In what follows, we develop an iterative procedure to obtain a solution of problem (23) based on ADMM and BSUM, which are known for their good behavior in several non-convex optimization problems [31], [32]. In other words, we maximize objective (23a) by sequentially fixing a subset of the variables and updating the others.
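The rate-MSE relation from [44] that underlies the weighted MMSE transformation can be checked numerically. The sketch below uses a simplified single-user link with white noise, so that the rate is $\ln\det(\mathbf{I} + \mathbf{V}^H\mathbf{H}^H\mathbf{H}\mathbf{V}/\sigma^2)$ and equals the maximum over $(\mathbf{U}, \Sigma)$ of $\ln\det\Sigma - \mathrm{tr}(\Sigma\mathbf{E}) + d$. The absence of interference and quantization noise, as well as all function names, are assumptions made for illustration; the paper's $\mathbf{E}_k$ in (21) also involves $\Pi_k$.

```python
import numpy as np

def mse_matrix(H, V, U, sigma2):
    """MSE matrix of the linear estimate U^H y for y = H V s + n, s ~ CN(0, I)."""
    d = V.shape[1]
    T = np.eye(d) - U.conj().T @ H @ V
    return T @ T.conj().T + sigma2 * (U.conj().T @ U)

def wmmse_value(H, V, U, Sigma, sigma2):
    """Lower bound ln det(Sigma) - tr(Sigma E) + d on the rate."""
    d = V.shape[1]
    E = mse_matrix(H, V, U, sigma2)
    return np.linalg.slogdet(Sigma)[1] - np.trace(Sigma @ E).real + d

def rate(H, V, sigma2):
    """Achievable rate in nats for the white-noise toy link."""
    G = H @ V
    return np.linalg.slogdet(np.eye(V.shape[1]) + G.conj().T @ G / sigma2)[1]

def mmse_receiver(H, V, sigma2):
    """MMSE combiner U = (G G^H + sigma2 I)^{-1} G with G = H V."""
    G = H @ V
    return np.linalg.solve(G @ G.conj().T + sigma2 * np.eye(H.shape[0]), G)

rng = np.random.default_rng(0)
H = (rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))) / np.sqrt(2)
V = (rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))) / np.sqrt(2)
sigma2 = 0.1
U = mmse_receiver(H, V, sigma2)
Sigma = np.linalg.inv(mse_matrix(H, V, U, sigma2))  # optimal weight is the inverse MSE
print(rate(H, V, sigma2), wmmse_value(H, V, U, Sigma, sigma2))
```

Any other combiner or weight matrix gives a smaller value, which is why replacing the rate by this expression yields a lower-bound problem that becomes tight at the MMSE combiner and the inverse-MSE weight.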

2) Optimization of Baseband and RF Combiners:
In this subsection, we focus on optimizing variables $\mathbf{U}_{BB,k}$, $\Sigma_k$, and $\mathbf{U}_{RF,k}$, $\forall k \in \mathcal{K}_U$, with the other variables fixed. Note that these variables appear only on the right-hand side of constraint (23b). Therefore, maximizing objective (23a) is equivalent to making the right-hand side of constraint (23b) as large as possible, i.e., enlarging the feasible region of problem (23). However, variables $\mathbf{U}_{BB,k}$, $\Sigma_k$, and $\mathbf{U}_{RF,k}$, $\forall k \in \mathcal{K}_U$, are mutually coupled, so it is intractable to jointly maximize the right-hand side of constraint (23b) with respect to them. To overcome this difficulty, we optimize the three variables in two steps. In particular, we first optimize the analog combiner $\mathbf{U}_{RF,k}$ and then optimize the baseband combiner $\mathbf{U}_{BB,k}$ and weight matrix $\Sigma_k$, $\forall k \in \mathcal{K}_U$, with the other variables fixed.
When the other variables are given, the solutions of $\mathbf{U}_{RF,k}$, $\forall k \in \mathcal{K}_U$, can be obtained by solving problem (25). Substituting the analytical expression of $\mathbf{E}_k$ given by (21) into problem (25), we obtain the equivalent problem (26), in which a matrix $\Upsilon_k$ appears. It is not difficult to see that the objective of problem (26) has the form $\phi(\mathbf{X}) \triangleq \mathrm{Tr}(\mathbf{X}^H \mathbf{A} \mathbf{X} \mathbf{C}) - 2\Re\{\mathrm{Tr}(\mathbf{X}^H \mathbf{B})\}$, where $\mathbf{A}$ and $\mathbf{C}$ are positive semidefinite. First, we can express $\phi(\mathbf{X})$ as a quadratic function of $\mathbf{X}(i,j)$ in the form $\phi(\mathbf{X}(i,j)) \triangleq a|\mathbf{X}(i,j)|^2 - 2\Re(b^* \mathbf{X}(i,j))$. Then, by using the constant modulus constraint, the derivative of $\phi(\mathbf{X}(i,j))$ with respect to $\mathbf{X}(i,j)$, and the derivative of $\phi(\mathbf{X})$ with respect to $\mathbf{X}$, a block coordinate optimization procedure can be designed to solve problem (26) with complexity $O(N_r^2 N_{uRF}^2)$. The details of the specific optimization algorithm can be found in [14, Algorithm 4 listed in Table V, pp. 466]. In addition, when the other variables are given, the solutions of $\mathbf{U}_{BB,k}$ and $\Sigma_k$, $\forall k \in \mathcal{K}_U$, can be updated by solving problem (28). Problem (28) is an unconstrained optimization problem and is easy to solve via its stationary points. Since each matrix $\Sigma_k$ is positive definite, the optimal solutions of $\mathbf{U}_{BB,k}$ and $\Sigma_k$, $\forall k \in \mathcal{K}_U$, can be given analytically. For the complexity of computing these closed-form solutions, see [45].
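The element-wise treatment of the constant modulus constraint can be sketched as follows. Assuming that the quadratic form $\phi(\mathbf{X}) = \mathrm{Tr}(\mathbf{X}^H\mathbf{A}\mathbf{X}\mathbf{C}) - 2\Re\{\mathrm{Tr}(\mathbf{X}^H\mathbf{B})\}$ is to be minimized with Hermitian positive semidefinite $\mathbf{A}$ and $\mathbf{C}$, fixing all entries but $\mathbf{X}(i,j)$ leaves $a|x|^2 - 2\Re(b^* x)$ plus a constant, whose minimizer on the unit circle is $x = b/|b|$. The function names and the random test data are hypothetical, and this is only one sweep in the spirit of the referenced Algorithm 4 of [14], not a reproduction of it.

```python
import numpy as np

def phi(X, A, B, C):
    """phi(X) = Tr(X^H A X C) - 2 Re Tr(X^H B)."""
    return (np.trace(X.conj().T @ A @ X @ C) - 2 * np.trace(X.conj().T @ B)).real

def cm_sweep(X, A, B, C):
    """One pass of element-wise updates under |X(i, j)| = 1 (minimizes phi per entry)."""
    X = X.copy()
    Q = A @ X @ C                      # maintained so that each update costs O(N * M)
    n, m = X.shape
    for i in range(n):
        for j in range(m):
            a = (A[i, i] * C[j, j]).real
            # b collects every term that multiplies conj(X[i, j]) linearly.
            b = B[i, j] - (Q[i, j] - a * X[i, j])
            x_new = b / abs(b) if abs(b) > 0 else X[i, j]
            # Rank-one correction of Q = A X C after changing a single entry of X.
            Q += (x_new - X[i, j]) * np.outer(A[:, i], C[j, :])
            X[i, j] = x_new
    return X

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
N = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
A, C = M @ M.conj().T, N @ N.conj().T
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
X0 = np.exp(1j * rng.uniform(0, 2 * np.pi, (4, 2)))
X1 = cm_sweep(X0, A, B, C)
print(phi(X0, A, B, C), phi(X1, A, B, C))
```

Each entry update costs $O(NM)$ for an $N \times M$ matrix, so one sweep costs $O(N^2 M^2)$, in line with the complexity order quoted in the text.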

3) Optimization of Auxiliary Variable, Baseband Precoder and Noise Covariance
where $a_k$ and $b_k$ are respectively defined as in (35), with auxiliary variables $L = \{L_i \succeq 0\}_{i \in \mathcal{K}_R}$. Constraint (32e) and $g_i(G, O, L) \le C_i$ are equivalent when $L_i$ is given by (36). Therefore, in the sequel, we replace constraint (32e) with $g_i(G, O, L) \le C_i$, which yields problem (37).

Algorithm 2 Solution of Problem
1: Let $t = 0$, initialize $G^{(t)}$ and $O^{(t)}$.
2: Compute auxiliary variables $L_i$, $\forall i \in \mathcal{K}_R$, via (36).
3: Let $t = t + 1$. Solve problem (37) for a given $L_i$, $\forall i \in \mathcal{K}_R$, then obtain $R^{(t)}$, $M^{(t)}$, $N^{(t)}$, $G^{(t)}$, and $O^{(t)}$.
4: Repeat Steps 2 and 3 until convergence.
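The role of the auxiliary variables $L_i$ can be illustrated with a standard inequality for positive definite matrices, $\ln\det(\mathbf{X}) \le \ln\det(\mathbf{L}) + \mathrm{tr}(\mathbf{L}^{-1}\mathbf{X}) - N$, which holds with equality when $\mathbf{L} = \mathbf{X}$. Whether $g_i(G, O, L)$ is built exactly from this bound is an assumption here, since the corresponding expressions are not reproduced in the text; the sketch below only checks the inequality numerically, with hypothetical helper names.

```python
import numpy as np

def logdet(X):
    """Natural log-determinant of a Hermitian positive definite matrix."""
    return np.linalg.slogdet(X)[1]

def logdet_upper_bound(X, L):
    """Upper bound ln det(L) + tr(L^{-1} X) - N on ln det(X); it is convex in X for fixed L."""
    n = X.shape[0]
    return logdet(L) + np.trace(np.linalg.solve(L, X)).real - n

def random_pd(n, rng):
    """Random Hermitian positive definite test matrix."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return M @ M.conj().T + np.eye(n)

rng = np.random.default_rng(0)
X = random_pd(4, rng)
L = random_pd(4, rng)
print(logdet(X), logdet_upper_bound(X, L), logdet_upper_bound(X, X))
```

For a fixed $\mathbf{L}$ the bound is affine in $\mathbf{X}$, which is consistent with the statement that the surrogate constraint is convex in the remaining variables, while setting $\mathbf{L} = \mathbf{X}$ makes the bound tight, mirroring the closed-form update of Step 2.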
Problem (37) is still non-convex, and its globally optimal solution is difficult to obtain directly. Note that $g_i(G, O, L)$ is not jointly convex w.r.t. variables $G$, $O$, and $L$; however, it is jointly convex w.r.t. $G$ and $O$ for a given $L$. When $L$ is fixed, problem (37) can be solved via classical convex optimization methods, such as primal-dual interior-point methods [46], [47]. Meanwhile, the solution of variable $L_i$ has an analytical expression, i.e., (36). This implies that problem (37) can be solved via alternating optimization. In particular, variables $R$, $M$, $N$, $G$, and $O$ are updated by solving problem (37) for a given $L$. Then, variable $L$ is updated via (36) for given $R$, $M$, $N$, $G$, and $O$. The detailed steps are summarized in Algorithm 2. Each update at Step 3 of Algorithm 2 does not decrease objective (37a), so the algorithm generates a non-decreasing sequence of objective values. Furthermore, objective (37a) is upper bounded due to the limited transmit power in practical communication systems. Therefore, the convergence of Algorithm 2 is guaranteed by the monotone convergence theorem for bounded sequences [48]. The computational complexity of Algorithm 2

4) Optimization of RF Precoders:
In this subsection, we turn to the solution of $\mathbf{F}_{RF,i}$, $\forall i \in \mathcal{K}_R$. Similarly, we note that variables $\mathbf{F}_{RF,i}$, $\forall i \in \mathcal{K}_R$, only appear in the third term of objective function (23a). As a result, the solution of $\mathbf{F}_{RF,i}$, $\forall i \in \mathcal{K}_R$, can be obtained by solving problem (38). After appropriate rearrangement, problem (38) can be equivalently formulated as problem (39). Similar to problem (26), problem (39) can also be solved with complexity $O(N_t^2 N_{tRF}^2)$ [14, Algorithm 4 listed in Table V, pp. 466].

5) Summarization of Optimization Algorithm:
In this subsection, we summarize the details of the proposed optimization algorithm used to solve problem (18). The step-by-step description is given in Algorithm 3, where $l$ and $t$ denote iteration indices, $\varepsilon$ denotes an approximation stopping threshold that is used together with a separate stopping threshold, $\omega$ is a control parameter, and $\zeta^{(t)}$ denotes the objective value of problem (23) at the $t$-th iteration. In particular, we first solve problem (19) for given Lagrange multiplier matrices $\Phi_{f,i}$ and $\Psi_i$ and given penalty parameters using the block coordinate descent method, i.e., Steps 3 to 8. Then, we update either the Lagrange multiplier matrices $\Phi_{f,i}$ and $\Psi_i$ or the penalty parameters according to a constraint violation condition. That is, if the constraint violation condition is satisfied, we update the Lagrange multiplier matrices $\Phi_{f,i}$ and $\Psi_i$; otherwise, we update the penalty parameters. Note that problem (24) can also be solved with a procedure similar to Algorithm 3. The only difference is that the second term in objective (32a) needs to be removed when updating variables $R$, $M$, $N$, $G$, and $O$.
In Steps 4 to 8 of Algorithm 3, each update maximizes the objective of problem (23) over one block of variables. This implies that the iterations over Steps 4 to 8 generate a non-decreasing sequence of objective values, i.e., $\zeta^{(1)} \le \zeta^{(2)} \le \cdots$. Meanwhile, owing to the limited allowable transmit power in practical communication systems, the objective of problem (23) is bounded above. Therefore, the inner loop converges to a KKT solution of the inner problem according to [49, Theorem 1]. Further, following arguments similar to those in [14, Corollary 3.1], Algorithm 3 converges to a KKT solution of problem (18). Therefore, Algorithm 3 is guaranteed to converge to a KKT solution of problem (16) [42], [43]. In addition, in the total computational complexity of Algorithm 3, $\kappa_1$ denotes the average number of iterations of Steps 4 to 8 in each execution of Step 3, and $\kappa_2$ represents the number of executions of Step 3.
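The outer-loop logic described above (update the multipliers when the constraint violation is small enough, otherwise tighten the penalty) can be sketched on a toy problem. The Python example below is an assumption-laden illustration, not Algorithm 3: it minimizes the hypothetical objective $(x-2)^2 + (y+1)^2$ subject to $x - y = 0$, uses the augmented Lagrangian $f + \lambda h + h^2/(2\rho)$, and adopts a multiplier step `lam += h / rho` and a geometric shrinkage of `rho` and `eta` that are common choices in penalty dual decomposition. The update rules of the paper are not reproduced in this section and may differ.

```python
def inner_bcd(x, y, lam, rho, tol=1e-12, max_sweeps=10_000):
    """Minimize the toy augmented Lagrangian by exact coordinate updates."""
    a = 1.0 / rho
    for _ in range(max_sweeps):
        # Stationarity in x: 2(x - 2) + lam + (x - y)/rho = 0.
        x_new = (4.0 - lam + a * y) / (2.0 + a)
        # Stationarity in y: 2(y + 1) - lam - (x - y)/rho = 0.
        y_new = (lam - 2.0 + a * x_new) / (2.0 + a)
        done = abs(x_new - x) + abs(y_new - y) < tol
        x, y = x_new, y_new
        if done:
            break
    return x, y


def pdd_toy(tol=1e-8, shrink=0.5, eta_decay=0.8, max_outer=100):
    """PDD-style outer loop; all parameter values are illustrative assumptions."""
    lam, rho, eta = 0.0, 1.0, 1.0
    x = y = 0.0
    for _ in range(max_outer):
        x, y = inner_bcd(x, y, lam, rho)   # inner block coordinate descent
        h = x - y                          # constraint violation
        if abs(h) < tol:
            break
        if abs(h) <= eta:
            lam += h / rho                 # violation small: update multiplier
        else:
            rho *= shrink                  # violation large: tighten penalty
        eta *= eta_decay                   # tighten the violation tolerance
    return x, y, lam, rho


if __name__ == "__main__":
    print(pdd_toy())
```

For this toy problem the loop approaches x = y = 0.5 with multiplier 3, which is the KKT point of the equality-constrained problem. The penalty is tightened only in the first outer iterations; afterwards the multiplier updates alone drive the violation to zero.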

C. Discussion
In the aforementioned two subsections, an alternating optimization method is designed to solve problem (11). The sub-algorithms proposed for the individual subproblems are iterative, except for the optimization of the baseband combiners and weight matrices, and each of them requires a number of iterations to converge. Therefore, to some extent, the proposed algorithm is time-consuming. It incurs a certain computation delay and a heavy computational burden, especially for massive mmWave MIMO multiuser systems. On the other hand, in this paper, we assume that the BBU can obtain perfect CSI for each optimization. Whenever the channel condition or the scheduled user group changes, the optimization must be rerun to obtain the corresponding transmission mechanism. This means that the proposed algorithm cannot adapt to fast time-varying communication environments. Therefore, how to make full use of the cache at the network edge to truly reduce the burden on the fronthaul links and the delivery latency remains a challenging problem for cache-enabled wireless communication systems.
Algorithm 3 (partially recoverable steps): Step 5: ... (28) and (29). Step 6: Solve problem (31). If the stopping criterion is satisfied, go to Step 9; otherwise, go to Step 4. Step 10: Otherwise, update $\Phi_{f,i}$, $\Psi_i$, and the penalty parameters, and go to Step 3.

IV. NUMERICAL RESULTS
In this section, we present numerical results to evaluate the performance of the proposed transceiver optimization algorithm for cache-enabled energy-efficient mmWave RANs, in which the positions of the eRRHs and the users are uniformly distributed within a circular cell of radius 500 m. For ease of illustration, we assume that all eRRHs have the same maximum transmit power and fronthaul capacity, i.e., $P_i = P$ and $C_i = C$, $\forall i \in \mathcal{K}_R$. The average path-loss power is $\rho_{p,k,i} = \frac{1}{1 + (d_{p,k,i}/d_0)^{\alpha}}$, where $d_0$, $d_{p,k,i}$, and $\alpha$ are the reference distance, the propagation distance in meters of the p-th path between eRRH i and user k ($p = 0, \cdots, N_c$), and the path-loss exponent, respectively. For simplicity, we assume that a LoS link always exists between the eRRHs and the users. The AoDs/AoAs are assumed to take continuous values and are uniformly distributed in $[0, 2\pi]$. All users have the same noise variance, i.e., $\sigma_k^2 = \sigma^2$, $k \in \mathcal{K}_U$. The eRRHs are equipped with caches of equal size, i.e., $B_i = B = \xi S F$, $i \in \mathcal{K}_R$, where $\xi$ denotes the fractional caching capacity. The cache states $c_{f,i}$, $f \in \mathcal{F}$, $i \in \mathcal{K}_R$, are randomly set subject to the cache size constraint. Unless otherwise stated, the values of the system parameters are as in Table II. The set $\mathcal{F}_{req}$ of requested files is randomly generated for each channel realization.
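The node placement and the path-loss model described above can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions: the function names, the numbers of eRRHs and users, and the values chosen for `D0` and `ALPHA` are hypothetical placeholders (the actual settings are those of Table II), and only one propagation distance per eRRH-user pair is considered.

```python
import math
import random


def path_loss_power(d, d0, alpha):
    """Average path-loss power 1 / (1 + (d / d0)**alpha)."""
    return 1.0 / (1.0 + (d / d0) ** alpha)


def drop_uniform_in_disc(n, radius, rng):
    """Draw n points uniformly over a disc of the given radius."""
    points = []
    for _ in range(n):
        # The square root makes the density uniform over the area.
        r = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


rng = random.Random(0)                        # fixed seed, for repeatability
errhs = drop_uniform_in_disc(3, 500.0, rng)   # hypothetical number of eRRHs
users = drop_uniform_in_disc(4, 500.0, rng)   # hypothetical number of users
D0, ALPHA = 50.0, 3.0                         # placeholder values, not Table II

# gains[k][i]: path-loss power between user k and eRRH i.
gains = [[path_loss_power(math.dist(u, e), D0, ALPHA) for e in errhs]
         for u in users]
```

At the reference distance the model gives a path-loss power of exactly 1/2, and the value decreases monotonically with distance, which offers a quick sanity check for any chosen `D0` and `ALPHA`.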
For comparison, we also simulate the minimum user rate for fully digital precoding for downlink cache-enabled RANs as a performance benchmark. The power consumption of a complete antenna array with one RF chain per antenna is calculated as [9]
\begin{align}
p_{dc} &= P_{tRFC} + P_{rRFC} + P_{LNAC} + 2P_{LO} + 2P_{BB}, \tag{42a} \\
P_{tRFC} &= K_R N_t \left(P_{DAC} + P_{mixer} + P_{LPF}\right), \tag{42b} \\
P_{rRFC} &= K_U N_r \left(P_{ADC} + P_{mixer} + P_{LPF}\right), \tag{42c}
\end{align}
In our simulation, the RF precoding matrix $F_{\mathrm{RF},i}$ is randomly initialized. The quantization covariance matrix $\Omega_i$ is set to $\upsilon I_{N_{eRF}}$ with $0 < \upsilon < 0.02$. The digital precoding matrix $G_{f,i}$ is initialized using mutually orthogonal unit-norm column vectors. The coefficient β is defined to satisfy the power constraint and the fronthaul capacity constraint and is given by (43), where $0 < \kappa < 0.02$.
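As a worked illustration of (42a) to (42c), the following Python sketch evaluates the power consumption of the fully digital array. The function name and the numerical component powers in the usage line are hypothetical and chosen only to make the arithmetic easy to follow. The term $P_{LNAC}$ is passed in directly because its expression is not given in the equations above.

```python
def digital_array_power(k_r, n_t, k_u, n_r, p_dac, p_adc,
                        p_mixer, p_lpf, p_lnac, p_lo, p_bb):
    """Power of a fully digital array following (42a)-(42c)."""
    # (42b): transmit RF chains, one per antenna at each of the K_R eRRHs.
    p_trfc = k_r * n_t * (p_dac + p_mixer + p_lpf)
    # (42c): receive RF chains, one per antenna at each of the K_U users.
    p_rrfc = k_u * n_r * (p_adc + p_mixer + p_lpf)
    # (42a): total, with the LNA term supplied by the caller.
    return p_trfc + p_rrfc + p_lnac + 2.0 * p_lo + 2.0 * p_bb


# Illustrative numbers only (arbitrary units, not the paper's settings).
total = digital_array_power(k_r=2, n_t=4, k_u=3, n_r=2, p_dac=1.0, p_adc=4.0,
                            p_mixer=2.0, p_lpf=3.0, p_lnac=5.0, p_lo=6.0,
                            p_bb=7.0)
```

With these placeholder inputs, the transmit chains contribute 2·4·(1+2+3) = 48, the receive chains 3·2·(4+2+3) = 54, and the remaining terms 5 + 12 + 14 = 31, for a total of 133.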
In the legends of the figures, "Hybrid-Precoding-EEMax" and "Digital-Precoding-EEMax" denote the proposed hybrid precoding solution and the fully digital precoding solution, respectively, for maximizing the energy efficiency with partial caching at the network edge. Similarly, "Hybrid-Precoding-DRMax" and "Digital-Precoding-DRMax" denote the proposed hybrid precoding solution and the fully digital precoding solution, respectively, for maximizing the delivery rate with partial caching at the network edge. For the cases of partial, full, and no caching at the network edge, the curve labels carry the terms "with PC", "with FC", and "with NC", respectively.
Fig. 2 and Fig. 3 compare the energy efficiency and the delivery rate versus the fronthaul capacity C for different algorithms with different caching proportions, with S = 1.2 nats and P = 20 dB. The numerical results reveal that when the system performance is limited by the fronthaul capacity C rather than by other factors, the system energy efficiency and delivery rate obtained by the various algorithms increase with C. When the fronthaul capacity C is sufficiently large, the system energy efficiency and delivery rate obtained by the various algorithms remain unchanged. Although the hybrid antenna architecture reduces the cost of RF chains, it is not the optimal scheme for mmWave communication from the perspective of system energy efficiency and delivery rate. This is because the directional transmission of mmWave communication may reduce the spatial/multipath diversity gain, and the phase shifter networks cause additional power consumption and hardware cost. In addition, the numerical results show that caching popular files at the network edge helps to improve the system energy efficiency and delivery rate of cache-enabled mmWave RANs.
Fig. 4 compares the energy efficiency and the delivery rate versus the file size S for different algorithms with C = 1.2 nats/Hz/s and P = 20 dB. The numerical results show that when the system performance is limited by the file size S, Digital-Precoding-EEMax achieves the best energy efficiency, while Hybrid-Precoding-DRMax achieves the worst energy efficiency, with $p_{ac} = 5.813$ W and $p_{dc} = 11.062$ W, respectively. Digital-Precoding-DRMax and Digital-Precoding-EEMax obtain the best and the worst delivery rate, respectively. The results in Fig. 4 further show that the solution that maximizes the system energy efficiency does not necessarily maximize the system delivery rate for cache-enabled mmWave RANs with limited file size S. The hybrid antenna architecture achieves the worst energy efficiency and an intermediate delivery rate. This is because the hybrid antenna architecture reduces the hardware cost while increasing the circuit power consumption with an increasing number of phase shifters and reducing the diversity gain with directional transmission. In addition, the fully digital antenna architecture has more degrees of freedom for optimizing the transceivers, so that a better performance may be realized.
To further evaluate the system performance with different numbers of receive and transmit antennas, Table III lists the energy efficiency and delivery rate of the different algorithms with C = 2 nats/Hz/s, S = 1.2 nats, and P = 20 dB. The numerical results show that increasing the number of RF chains improves both the delivery rate and the energy efficiency of hybrid-architecture mmWave communication systems. However, for a fixed number of RF chains, increasing the number of antennas improves only the delivery rate, while the energy efficiency of hybrid-architecture mmWave communication systems deteriorates.

V. CONCLUSION
In this paper, we have studied the design of energy-efficient transceivers for cache-enabled mmWave RANs in which each edge node is equipped not only with the functionalities of standard RRHs but also with local cache and baseband processing capabilities. First, we used fractional programming theory, PDD, and the relation between the user rate and the minimum mean square error to transform the original problem of interest into a tractable form. Then, an effective optimization algorithm based on ADMM with provable convergence was developed to solve the transformed problem. Numerical results were provided to validate the effectiveness of the proposed algorithm and revealed that the hybrid antenna architecture is not necessarily the best scheme for cache-enabled mmWave RANs from the system performance perspective. Rather, it offers a tradeoff between the system performance and the hardware cost and power consumption. In addition, caching frequently requested files at the network edge helps to improve the system performance of wireless communications.