Optimizing Over-the-Air Computation in IRS-Aided C-RAN Systems

Over-the-air computation (AirComp) is an efficient solution to enable federated learning on wireless channels. AirComp assumes that the wireless channels from different devices can be controlled, e.g., via transmitter-side phase compensation, in order to ensure coherent on-air combining. Intelligent reflecting surfaces (IRSs) can provide an alternative, or additional, means of controlling channel propagation conditions. This work studies the advantages of deploying IRSs for AirComp systems in a large-scale cloud radio access network (C-RAN). In this system, worker devices upload locally updated models to a parameter server (PS) through distributed access points (APs) that communicate with the PS on finite-capacity fronthaul links. The problem of jointly optimizing the IRSs' reflecting phases and a linear detector at the PS is tackled with the goal of minimizing the mean squared error (MSE) of a parameter estimated at the PS. Numerical results validate the advantages of deploying IRSs with optimized phases for AirComp in C-RAN systems.


I. INTRODUCTION
Federated learning is an emerging distributed learning paradigm in which mobile devices collaboratively train a machine learning model while preserving the privacy of local data sets [1]. Under latency and bandwidth constraints, implementing federated learning on wireless systems is challenging when many workers, or devices, are involved. A potential solution to this problem is over-the-air computation (AirComp), which leverages the superposition property of the multiple access channel (MAC) from worker devices to a parameter server (PS) to allow for simultaneous transmissions from multiple devices [2]-[4]. It was reported in [3] that AirComp outperforms a conventional multiple access technique in terms of test accuracy, and that the gain is particularly significant at low transmit power and with a large number of workers.
AirComp assumes that the wireless channels from different devices can be controlled, e.g., via transmitter-side phase compensation, in order to ensure coherent on-air combining [5]. To relax this requirement, the work [6] considered the deployment of intelligent reflecting surfaces (IRSs). IRSs, also referred to as reconfigurable intelligent surfaces, can be controlled through integrated electronics in order to shape their response to impinging electromagnetic waves [7]. This enables the modification of the propagation channel between nearby transceivers. As a result, IRSs are considered a cost-effective solution to improve the spectral and energy efficiency of wireless systems [8]-[11]. As examples of recent works on IRSs, references [9] and [10] addressed the joint design of downlink beamforming and IRSs' phases for interference management in multi-user [9] and multi-cell systems [10]. Reference [8] analyzed the number of reflecting elements of IRSs needed to beat conventional wireless relaying techniques (see also [12]). Finally, an information-theoretic study was provided in [13].
In this work, we study the advantages of deploying IRSs for AirComp systems. Unlike [6], which focused on a MAC in which workers communicate directly with a PS, we consider the large-scale cloud radio access network (C-RAN) illustrated in Fig. 1, in which the workers upload local models to the PS through distributed access points (APs). The APs, or remote radio heads (RRHs), in C-RAN send the received signals to the PS on fronthaul links. The fronthaul links have finite capacity, requiring fronthaul quantization and compression [14]. We tackle the problem of jointly optimizing the IRSs' reflecting phases and a linear detector at the PS with the goal of minimizing the mean squared error (MSE) of a parameter estimated at the PS. Due to the non-convexity of the problem, we propose an iterative algorithm that alternately updates the IRSs' phases and the linear detector. Via numerical results, we validate the advantages of deploying IRSs with optimized phases for AirComp in C-RAN systems.

II. SYSTEM MODEL
As illustrated in Fig. 1, we consider an over-the-air computation task performed on a C-RAN system. In the system, $N_W$ single-antenna worker devices send locally updated models to a PS through $N_A$ single-antenna APs. Each AP is connected to the PS via a fronthaul link, which we model as a digital link of capacity $C$ bit/sample [14]. We define the sets $\mathcal{N}_W = \{1, 2, \ldots, N_W\}$ and $\mathcal{N}_A = \{1, 2, \ldots, N_A\}$ for the workers' and APs' indices, respectively.

A. Over-the-Air Computation Model
We focus on the transmission at a specific time slot where each worker $k \in \mathcal{N}_W$ sends a scalar parameter $\theta_k$, and the PS estimates a function $f(\theta)$ of the transmitted parameters $\theta = \{\theta_k\}_{k \in \mathcal{N}_W}$. The parameter $\theta_k$ can be an element of the gradient vector [3] or the local model [4] updated at worker $k$ using its local dataset. The PS typically estimates the weighted sum $f(\theta) = \sum_{k \in \mathcal{N}_W} w_k \theta_k$, with $w_k = S_k / (\sum_{l \in \mathcal{N}_W} S_l)$, where $S_k$ denotes the number of training samples at device $k$ [4]. To simplify the discussion, we assume $S_k = S$ for all $k \in \mathcal{N}_W$, and that the target parameter, denoted by $\bar{\theta}$, is given by the sum

$$\bar{\theta} = \sum_{k \in \mathcal{N}_W} \theta_k. \tag{1}$$

We also assume that the parameters $\theta_k$ are independent, and we define the power of parameter $\theta_k$ as $E[|\theta_k|^2] = \sigma_{\theta,k}^2$. Thus, the target parameter $\bar{\theta}$ has power $E[|\bar{\theta}|^2] = \sum_{k \in \mathcal{N}_W} \sigma_{\theta,k}^2$.

B. Channel Model
To assist edge communication from the workers to the APs, we assume the presence of $N_I$ IRSs [6] in the network. Each IRS has $n_I$ reflecting elements, whose reflecting phases are dynamically adjusted to adapt to the instantaneous channel state information (CSI). We define the set $\mathcal{N}_I = \{1, 2, \ldots, N_I\}$ for the IRSs' indices.
Under a flat-fading channel model, the received signal $y_i$ of AP $i$ can be written as

$$y_i = \sum_{k \in \mathcal{N}_W} h_{i,k} x_k + z_i,$$

where $x_k$ is the signal transmitted by worker $k$; $h_{i,k}$ denotes the channel coefficient from worker $k$ to AP $i$; and $z_i \sim \mathcal{CN}(0, \sigma_z^2)$ represents the additive noise. The signal $x_k$ satisfies the transmit power constraint $E[|x_k|^2] \le P$.
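Due to the presence of IRSs, the channel coefficient $h_{i,k}$ is modelled as [8]-[10]

$$h_{i,k} = \sqrt{\rho_{d,i,k}}\, h_{d,i,k} + \sum_{j \in \mathcal{N}_I} \sqrt{\rho_{r,i,j,k}}\, \mathbf{g}_{i,j}^H \mathbf{\Theta}_j \mathbf{h}_{r,j,k}, \tag{3}$$

where $h_{d,i,k}$ denotes the small-scale fading channel from worker $k$ to AP $i$; $\mathbf{g}_{i,j} \in \mathbb{C}^{n_I \times 1}$ represents the small-scale fading channel vector from IRS $j$ to AP $i$; $\mathbf{h}_{r,j,k} \in \mathbb{C}^{n_I \times 1}$ is the small-scale fading channel vector from worker $k$ to IRS $j$; $\rho_{d,i,k}$ denotes the path-loss of the direct link from worker $k$ to AP $i$; $\rho_{r,i,j,k}$ is the path-loss of the composite link from worker $k$ to AP $i$ through IRS $j$; and $\mathbf{\Theta}_j$ is a diagonal matrix that represents the reflecting operation of IRS $j$, which is defined as

$$\mathbf{\Theta}_j = \mathrm{diag}\left(e^{j\phi_{j,1}}, \ldots, e^{j\phi_{j,n_I}}\right),$$

where $\phi_{j,m} \in [0, 2\pi)$ denotes the reflecting phase of the $m$th element of IRS $j$.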
We model the path-loss $\rho_{d,i,k}$ between worker $k$ and AP $i$ as a function of the Euclidean distance in meters between the position vectors $\mathbf{p}_{W,k}$ and $\mathbf{p}_{A,i}$ of worker $k$ and AP $i$, respectively, where $\eta$ is the path-loss exponent and $c_0$ denotes the path-loss at the reference distance of 1 m. For the path-loss $\rho_{r,i,j,k}$ of the composite channel from worker $k$ to AP $i$ through IRS $j$, we adopt the sum-distance model [7], which models $\rho_{r,i,j,k}$ as a function of the sum of the distances from worker $k$ to IRS $j$ and from IRS $j$ to AP $i$, where $\mathbf{p}_{I,j}$ denotes the position vector of IRS $j$.
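As a rough illustration of the channel model in this subsection, the following sketch computes the effective coefficient $h_{i,k}$ for a single worker/AP pair. Two points are assumptions made only for this example and are not taken from the text: the path-loss is given a generic power-law form $c_0 d^{-\eta}$, and the reflected contribution of IRS $j$ is taken to be $\sqrt{\rho_{r,i,j,k}}\, \mathbf{g}_{i,j}^H \mathrm{diag}(e^{j\phi_{j,1}}, \ldots, e^{j\phi_{j,n_I}}) \mathbf{h}_{r,j,k}$. The function names are hypothetical.

```python
import numpy as np

def power_law_path_loss(c0, distance_m, eta):
    # ASSUMPTION: generic power-law form c0 * d^(-eta); the exact expression
    # used in the paper may differ.
    return c0 * distance_m ** (-eta)

def sum_distance_path_loss(c0, p_worker, p_irs, p_ap, eta):
    # Sum-distance model: the path-loss depends on the worker->IRS distance
    # plus the IRS->AP distance.
    d = np.linalg.norm(p_worker - p_irs) + np.linalg.norm(p_irs - p_ap)
    return power_law_path_loss(c0, d, eta)

def effective_channel(rho_d, h_d, rho_r, g, h_r, phi):
    """Scalar channel h_{i,k} for one worker/AP pair.

    rho_r has shape (N_I,); g, h_r and phi have shape (N_I, n_I).
    ASSUMPTION: IRS j contributes sqrt(rho_r[j]) * g_j^H diag(e^{j phi_j}) h_r_j.
    """
    # One reflected value per IRS: sum over the n_I elements of each surface.
    reflected = np.sum(np.conj(g) * np.exp(1j * phi) * h_r, axis=1)
    return np.sqrt(rho_d) * h_d + np.sum(np.sqrt(rho_r) * reflected)
```

Choosing `phi = -np.angle(np.conj(g) * h_r)` aligns all reflected paths in phase, which is the kind of coherent combining that phase optimization aims for when the direct link is weak.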

III. OVER-THE-AIR COMPUTATION IN IRS-AIDED C-RAN
In this section, we illustrate the operations at the worker devices, the APs, and the PS in the IRS-aided C-RAN system described in Sec. II.

A. Transmission at Worker Devices
Without claim of optimality (see [15]), we assume that each worker $k$ uses the maximum transmit power $P$, so that the transmit signal $x_k$ is given as $x_k = \alpha_k \theta_k$ with the coefficient $\alpha_k = (P/\sigma_{\theta,k}^2)^{1/2}$. We note that this does not require CSI at worker devices.

B. Quantization at APs
AP $i$ sends a quantized version of the received signal $y_i$ to the PS through a fronthaul link of capacity $C$ bit/sample. Under the assumption that the updated model vectors have a sufficiently large dimension, the quantized signal, denoted by $\hat{y}_i$, can be modelled as [14], [16]

$$\hat{y}_i = y_i + q_i,$$

where $q_i$ models the quantization distortion as being independent of $y_i$ and distributed as $q_i \sim \mathcal{CN}(0, \omega_i)$. According to standard rate-distortion theoretic results [17], the quantization noise power $\omega_i$ satisfies the condition

$$\log_2\left(1 + \frac{\sigma_{y,i}^2}{\omega_i}\right) \le C, \tag{8}$$

where $\sigma_{y,i}^2$ denotes the variance of the received signal $y_i$, given as

$$\sigma_{y,i}^2 = P \sum_{k \in \mathcal{N}_W} |h_{i,k}|^2 + \sigma_z^2.$$

The minimum distortion power $\omega_i$ that satisfies the condition (8) is given as

$$\omega_i = \frac{\sigma_{y,i}^2}{2^C - 1}. \tag{10}$$

Note that the optimal distortion level (10) is a function of the reflecting phases $\phi = \{\phi_{j,m}\}_{j \in \mathcal{N}_I, m \in \{1, \ldots, n_I\}}$, since $\phi$ affects the channel coefficients $h_{i,k}$ as seen in (3).
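The fronthaul quantization step can be sketched numerically as follows. The sketch assumes the standard Gaussian test-channel relation $\log_2(1 + \sigma_{y,i}^2/\omega_i) \le C$ met with equality, and full-power transmission so that $E[|x_k|^2] = P$. The function name and the matrix layout (`H[i, k]` holding $h_{i,k}$) are choices made for this example only.

```python
import numpy as np

def quantization_noise_power(H, P, sigma_z2, C):
    """Minimum quantization noise power omega_i at each AP.

    H: (N_A, N_W) complex array with H[i, k] = h_{i,k}.
    P: per-worker transmit power, sigma_z2: noise variance,
    C: fronthaul capacity in bit/sample.
    """
    # Received signal variance at each AP: P * sum_k |h_{i,k}|^2 + sigma_z^2.
    sigma_y2 = P * np.sum(np.abs(H) ** 2, axis=1) + sigma_z2
    # Smallest omega_i for which log2(1 + sigma_y2 / omega_i) <= C holds.
    return sigma_y2 / (2.0 ** C - 1.0)
```

Because $\omega_i$ scales with the received power, any change of the IRS phases that strengthens $h_{i,k}$ also raises the quantization noise at AP $i$, which is why the phases enter the MSE through both the signal and the distortion terms.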

C. Estimation at PS
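Based on the received quantized signals $\{\hat{y}_i\}_{i \in \mathcal{N}_A}$, the PS estimates the target parameter $\bar{\theta}$ in (1). To elaborate, let us define a vector $\hat{\mathbf{y}} = [\hat{y}_1 \, \hat{y}_2 \cdots \hat{y}_{N_A}]^T$ which stacks the quantized signals. Then, the vector $\hat{\mathbf{y}}$ can be expressed as

$$\hat{\mathbf{y}} = \sum_{k \in \mathcal{N}_W} \mathbf{h}_k x_k + \mathbf{z} + \mathbf{q},$$

where we have defined the vectors $\mathbf{h}_k = [h_{1,k} \cdots h_{N_A,k}]^T$, $\mathbf{z} = [z_1 \cdots z_{N_A}]^T$ and $\mathbf{q} = [q_1 \cdots q_{N_A}]^T$. The channel vector $\mathbf{h}_k \in \mathbb{C}^{N_A \times 1}$ from worker $k$ to all the APs can be written as a function of the IRSs' phases $\phi$ as

$$\mathbf{h}_k = \mathbf{h}_{d,k} + \sum_{j \in \mathcal{N}_I} \mathbf{R}_{r,j,k} \mathbf{G}_j \, \mathrm{diag}(\mathbf{h}_{r,j,k}) \, \mathbf{v}_j,$$

where the matrices $\mathbf{R}_{r,j,k} \in \mathbb{C}^{N_A \times N_A}$, $\mathbf{G}_j \in \mathbb{C}^{N_A \times n_I}$, and the vectors $\mathbf{h}_{d,k} \in \mathbb{C}^{N_A \times 1}$, $\mathbf{v}_j \in \mathbb{C}^{n_I \times 1}$ are defined as $\mathbf{R}_{r,j,k} = \mathrm{diag}(\{\rho_{r,i,j,k}^{1/2}\}_{i \in \mathcal{N}_A})$, $\mathbf{G}_j = [\mathbf{g}_{1,j} \cdots \mathbf{g}_{N_A,j}]^H$, $\mathbf{h}_{d,k} = [\rho_{d,1,k}^{1/2} h_{d,1,k} \cdots \rho_{d,N_A,k}^{1/2} h_{d,N_A,k}]^T$, and $\mathbf{v}_j = [e^{j\phi_{j,1}} \cdots e^{j\phi_{j,n_I}}]^T$, respectively. Note that the optimization of the phases $\{\phi_{j,m}\}_{m=1}^{n_I}$ of IRS $j$ is equivalent to that of the vector $\mathbf{v}_j$ as long as the conditions

$$|\mathbf{v}_j(m)|^2 = 1 \tag{13}$$

are satisfied for all $m \in \{1, \ldots, n_I\}$, where $\mathbf{v}_j(m)$ denotes the $m$th element of $\mathbf{v}_j$. From the vector $\mathbf{v}_j$, each phase $\phi_{j,m}$ can be obtained as $\angle \mathbf{v}_j(m)$.

We assume that the PS performs a linear estimation of the target parameter $\bar{\theta}$ from $\hat{\mathbf{y}}$. Accordingly, an estimate $\hat{\theta}$ of $\bar{\theta}$ is given as

$$\hat{\theta} = \mathbf{f}^H \hat{\mathbf{y}},$$

with a linear detection vector $\mathbf{f} \in \mathbb{C}^{N_A \times 1}$. For given phases $\phi$, i.e., $\mathbf{v} = \{\mathbf{v}_j\}_{j \in \mathcal{N}_I}$, and linear detection vector $\mathbf{f}$, the MSE between the estimate $\hat{\theta}$ and the target parameter $\bar{\theta}$ is evaluated as

$$e(\mathbf{v}, \mathbf{f}) = E\left[|\hat{\theta} - \bar{\theta}|^2\right] = \sum_{k \in \mathcal{N}_W} \sigma_{\theta,k}^2 \left|\alpha_k \mathbf{f}^H \mathbf{h}_k - 1\right|^2 + \sigma_z^2 \|\mathbf{f}\|^2 + \sum_{i \in \mathcal{N}_A} \omega_i |f_i|^2, \tag{15}$$

where $f_i$ denotes the $i$th element of $\mathbf{f}$, and both $\mathbf{h}_k$ and $\omega_i$ depend on $\mathbf{v}$.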
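The MSE of a linear estimate can be evaluated numerically as in the sketch below. It assumes the estimate has the form $\hat{\theta} = \mathbf{f}^H \hat{\mathbf{y}}$ with $x_k = \alpha_k \theta_k$, zero-mean independent parameters, and quantization noise of power $\omega_i$ at AP $i$; the function name and argument layout are hypothetical.

```python
import numpy as np

def mse(f, H, P, sigma_theta2, sigma_z2, omega):
    """MSE between f^H y_hat and the sum of the workers' parameters.

    f: (N_A,) detector, H: (N_A, N_W) with column k equal to h_k,
    sigma_theta2: (N_W,) parameter powers, omega: (N_A,) quantization powers.
    """
    alpha = np.sqrt(P / sigma_theta2)          # full-power scaling per worker
    gains = alpha * (f.conj() @ H)             # alpha_k * f^H h_k for each k
    # Misalignment of each worker's effective gain from the ideal value 1.
    signal_error = np.sum(sigma_theta2 * np.abs(gains - 1.0) ** 2)
    # Channel noise and quantization noise after linear combining.
    noise = np.sum((sigma_z2 + omega) * np.abs(f) ** 2)
    return float(signal_error + noise)
```

With `f` set to zero the function returns $\sum_k \sigma_{\theta,k}^2$, the power of the target parameter, which is the reference value used when normalizing the MSE.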

IV. OPTIMIZATION
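We tackle the problem of jointly optimizing the IRSs' reflecting phases $\mathbf{v}$ and the linear detection vector $\mathbf{f}$ of the PS with the goal of minimizing the MSE $e(\mathbf{v}, \mathbf{f})$ in (15) while satisfying the unit modulus constraints (13). The problem can be stated as

$$\underset{\mathbf{v}, \mathbf{f}}{\text{minimize}} \;\; e(\mathbf{v}, \mathbf{f}) \quad \text{s.t.} \;\; |\mathbf{v}_j(m)|^2 = 1, \; j \in \mathcal{N}_I, \; m \in \{1, \ldots, n_I\}. \tag{16}$$

Since it is difficult to jointly optimize the variables $\mathbf{v}$ and $\mathbf{f}$, we propose an iterative algorithm that alternately optimizes one variable while fixing the other. If we fix the IRSs' phases $\mathbf{v}$ in problem (16), finding the optimal detector $\mathbf{f}$ becomes an unconstrained quadratic optimization problem, whose closed-form solution is given as

$$\mathbf{f} = \left(P \sum_{k \in \mathcal{N}_W} \mathbf{h}_k \mathbf{h}_k^H + \sigma_z^2 \mathbf{I} + \mathrm{diag}(\omega_1, \ldots, \omega_{N_A})\right)^{-1} \sum_{k \in \mathcal{N}_W} \alpha_k \sigma_{\theta,k}^2 \mathbf{h}_k. \tag{17}$$

To tackle the problem of optimizing the IRSs' phases $\mathbf{v}$ for fixed $\mathbf{f}$, we remove the terms that are not dependent on the IRSs' phases from the cost function. Stating the obtained problem with respect to a stacked vector $\tilde{\mathbf{v}}$ that collects the vectors $\{\mathbf{v}_j\}_{j \in \mathcal{N}_I}$ yields problem (18), whose unit modulus constraints on the elements of $\tilde{\mathbf{v}}$ are referred to as (18b). In the statement of (18), $f_i$ and $\mathbf{e}_i$ denote the $i$th element of $\mathbf{f}$ and the $i$th column of an identity matrix of size $N_A$, respectively.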
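For fixed phases, the detector update is a regularized least-squares problem. The sketch below solves the corresponding normal equations under the assumption that the MSE has the quadratic form $\sum_k \sigma_{\theta,k}^2 |\alpha_k \mathbf{f}^H \mathbf{h}_k - 1|^2 + \sum_i (\sigma_z^2 + \omega_i) |f_i|^2$. Function names are hypothetical, and the compact `mse` helper is included only so that the optimality of the result can be checked numerically.

```python
import numpy as np

def mse(f, H, P, sigma_theta2, sigma_z2, omega):
    # Assumed quadratic MSE of the linear estimate f^H y_hat.
    gains = np.sqrt(P / sigma_theta2) * (f.conj() @ H)
    return float(np.sum(sigma_theta2 * np.abs(gains - 1.0) ** 2)
                 + np.sum((sigma_z2 + omega) * np.abs(f) ** 2))

def lmmse_detector(H, P, sigma_theta2, sigma_z2, omega):
    """Minimizer of mse(f, ...) over f for fixed channels.

    H: (N_A, N_W) with column k equal to h_k; omega: (N_A,).
    """
    alpha = np.sqrt(P / sigma_theta2)
    # Covariance of y_hat: P * sum_k h_k h_k^H + sigma_z^2 I + diag(omega).
    A = P * (H @ H.conj().T) + np.diag(sigma_z2 + omega)
    # Cross-correlation with the target: sum_k alpha_k sigma_theta_k^2 h_k.
    b = H @ (alpha * sigma_theta2)
    return np.linalg.solve(A, b)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical-stability choice; the matrix `A` is Hermitian positive definite whenever $\sigma_z^2 > 0$.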
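Problem (18) is non-convex due to the unit modulus constraints (18b). To handle this issue, we adopt the matrix lifting approach proposed in [6]. Accordingly, we tackle problem (18) with respect to a matrix $\mathbf{V} \in \mathbb{C}^{(\bar{n}_I + 1) \times (\bar{n}_I + 1)}$, with $\bar{n}_I$ denoting the length of $\tilde{\mathbf{v}}$, defined as

$$\mathbf{V} = \begin{bmatrix} \tilde{\mathbf{v}} \\ 1 \end{bmatrix} \begin{bmatrix} \tilde{\mathbf{v}} \\ 1 \end{bmatrix}^H. \tag{19}$$

The matrix $\mathbf{V}$ is subject to the constraints $\mathbf{V} \succeq 0$, $\mathrm{rank}(\mathbf{V}) \le 1$, and $\mathbf{V}(m, m) = 1$ for all $m \in \{1, 2, \ldots, \bar{n}_I + 1\}$. From $\mathbf{V}$, the IRSs' phase vector $\tilde{\mathbf{v}}$ can be recovered as the first $\bar{n}_I$ elements of the last column of $\mathbf{V}$. We tackle (18) with respect to $\mathbf{V}$ by using the equalities (20) and (21). Specifically, by substituting (20) and (21) into problem (18), we obtain the problem

$$\begin{aligned} \underset{\mathbf{V}}{\text{minimize}} \;\; & \mathrm{tr}(\mathbf{M}\mathbf{V}) & \text{(22a)} \\ \text{s.t.} \;\; & \mathbf{V} \succeq 0, & \text{(22b)} \\ & \mathrm{rank}(\mathbf{V}) \le 1, & \text{(22c)} \\ & \mathbf{V}(m, m) = 1, \; m \in \{1, \ldots, \bar{n}_I + 1\}, & \text{(22d)} \end{aligned}$$

with the matrix $\mathbf{M}$ defined in (23). To address the non-convexity of constraint (22c), we note that (22c) is equivalent to the constraint [6]

$$\mathrm{tr}(\mathbf{V}) - \sigma_1(\mathbf{V}) = 0, \tag{24}$$

where $\sigma_1(\cdot)$ denotes the largest singular value of the input matrix. Function $\sigma_1(\mathbf{V})$ is convex in $\mathbf{V}$ [18]. Furthermore, for $\mathbf{V} \succeq 0$, the left-hand side (LHS) of (24) is 0 when $\mathrm{rank}(\mathbf{V}) \le 1$ and it becomes larger than 0 otherwise. Based on this observation, as in [6], we tackle the problem

$$\begin{aligned} \underset{\mathbf{V}}{\text{minimize}} \;\; & \mathrm{tr}(\mathbf{M}\mathbf{V}) + \gamma \left(\mathrm{tr}(\mathbf{V}) - \sigma_1(\mathbf{V})\right) & \text{(25a)} \\ \text{s.t.} \;\; & \mathbf{V} \succeq 0, \;\; \mathbf{V}(m, m) = 1, \; m \in \{1, \ldots, \bar{n}_I + 1\}, \end{aligned}$$

with a fixed weight $\gamma \ge 0$. In problem (25), we have removed the rank constraint (22c) and instead added a penalty term $\gamma(\mathrm{tr}(\mathbf{V}) - \sigma_1(\mathbf{V}))$ to the cost function that increases if (22c) is not satisfied.

The problem (25) is a difference-of-convex (DC) problem whose locally optimal solution can be efficiently found via the concave convex procedure (CCP) approach [19]. CCP solves a sequence of convex problems obtained by linearizing the terms that induce non-convexity. In the DC problem (25), the only term that induces non-convexity is $-\gamma \cdot \sigma_1(\mathbf{V})$ in the penalty term. Linearizing $-\gamma \cdot \sigma_1(\mathbf{V})$ at a reference point $\mathbf{V} = \mathbf{V}'$ yields the upper bound [6]

$$-\gamma \cdot \sigma_1(\mathbf{V}) \le -\gamma \cdot \mathbf{u}_1(\mathbf{V}')^H \, \mathbf{V} \, \mathbf{u}_1(\mathbf{V}'), \tag{26}$$

where $\mathbf{u}_1(\cdot)$ returns the eigenvector of the input matrix corresponding to the largest eigenvalue. The condition (26) is satisfied with equality when $\mathbf{V} = \mathbf{V}'$. The CCP based algorithm for optimizing $\mathbf{V}$ is summarized in Algorithm 1.

Algorithm 1 CCP based algorithm for optimizing $\mathbf{V}$
1. Initialize $\mathbf{V}^{(1)}$ as (19) with arbitrary $\tilde{\mathbf{v}}$ that satisfies (18b), and set $t \leftarrow 1$.
2. Update $\mathbf{V}^{(t+1)}$ as a solution of the convex problem: minimize $\mathrm{tr}(\mathbf{M}\mathbf{V}) + \gamma \left(\mathrm{tr}(\mathbf{V}) - \mathbf{u}_1(\mathbf{V}^{(t)})^H \mathbf{V} \mathbf{u}_1(\mathbf{V}^{(t)})\right)$ over $\mathbf{V}$ subject to $\mathbf{V} \succeq 0$ and $\mathbf{V}(m, m) = 1$ for all $m$.
3. Stop if a convergence criterion is satisfied. Otherwise, go back to Step 2 with $t \leftarrow t + 1$.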
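The building blocks of the lifting and CCP steps can be checked numerically without a semidefinite programming solver. The sketch below is only an illustration under the assumption that the lifted matrix is the outer product of the stacked phase vector extended by a trailing 1; it does not solve the convex subproblem of Step 2, and all function names are hypothetical.

```python
import numpy as np

def lift(v_tilde):
    # Lifted matrix [v; 1][v; 1]^H (assumed form of the lifting step).
    u = np.append(v_tilde, 1.0)
    return np.outer(u, u.conj())

def recover_v(V):
    # First n elements of the last column of V.
    return V[:-1, -1]

def rank_penalty(V):
    # tr(V) - sigma_1(V); for Hermitian PSD V, the largest singular value
    # equals the largest eigenvalue. Zero iff rank(V) <= 1.
    return float(np.trace(V).real - np.linalg.eigvalsh(V)[-1])

def linearized_sigma1(V, V_ref):
    # Linear lower bound on sigma_1(V) built at V_ref: u1^H V u1, with u1
    # the principal eigenvector of V_ref (eigh sorts eigenvalues ascending).
    _, U = np.linalg.eigh(V_ref)
    u1 = U[:, -1]
    return float((u1.conj() @ V @ u1).real)
```

Since `linearized_sigma1(V, V_ref)` never exceeds the largest eigenvalue of a PSD `V` and matches it at `V = V_ref`, replacing $\sigma_1(\mathbf{V})$ by this linear term gives a convex surrogate that upper-bounds the penalized cost, which is the property CCP relies on.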
Overall, the proposed algorithm that alternately optimizes the IRSs' phases $\mathbf{v}$ and the linear detector $\mathbf{f}$ is detailed in Algorithm 2. In the algorithm, we initialize $\mathbf{v}$ and $\mathbf{f}$ in Steps 1-2, and update $\mathbf{v}$ for fixed $\mathbf{f}$ in Steps 3-4. In Step 4, $\mathbf{v}$ is modified only when it does not satisfy the modulus constraints (18b). In Step 5, $\mathbf{f}$ is updated for fixed $\mathbf{v}$, and we check the convergence in Step 6.
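The alternating structure can be illustrated with the following self-contained sketch. It is not the CCP-based phase update described above: to avoid a dependency on a semidefinite programming solver, the phase step is replaced by a simple coordinate search over a uniform grid of candidate phases, which is a stand-in chosen for this example. The names `build_H`, `alternate`, `h_dir` and `B` are hypothetical, with `B[i, k, m]` denoting the cascaded worker-IRS-AP coefficient (path-loss included) through reflecting element `m`, and the quantization and MSE expressions are the assumed forms stated in the comments.

```python
import numpy as np

def build_H(h_dir, B, phi):
    # h_{i,k} = direct part + sum_m B[i, k, m] * exp(j * phi_m).
    return h_dir + B @ np.exp(1j * phi)

def omega_of(H, P, sigma_z2, C):
    # Assumed quantization noise power: sigma_y^2 / (2^C - 1).
    return (P * np.sum(np.abs(H) ** 2, axis=1) + sigma_z2) / (2.0 ** C - 1.0)

def mse(f, H, P, s2, sigma_z2, omega):
    gains = np.sqrt(P / s2) * (f.conj() @ H)
    return float(np.sum(s2 * np.abs(gains - 1.0) ** 2)
                 + np.sum((sigma_z2 + omega) * np.abs(f) ** 2))

def lmmse(H, P, s2, sigma_z2, omega):
    A = P * (H @ H.conj().T) + np.diag(sigma_z2 + omega)
    return np.linalg.solve(A, H @ np.sqrt(P * s2))

def alternate(h_dir, B, P, s2, sigma_z2, C, n_iter=5, grid=16, seed=0):
    rng = np.random.default_rng(seed)
    phi = rng.uniform(0.0, 2.0 * np.pi, B.shape[2])   # random initial phases

    def cost(f, phases):
        H = build_H(h_dir, B, phases)
        return mse(f, H, P, s2, sigma_z2, omega_of(H, P, sigma_z2, C))

    H = build_H(h_dir, B, phi)
    f = lmmse(H, P, s2, sigma_z2, omega_of(H, P, sigma_z2, C))
    history = [cost(f, phi)]
    candidates = 2.0 * np.pi * np.arange(grid) / grid
    for _ in range(n_iter):
        # Phase step (stand-in for CCP): accept a grid value only if it
        # lowers the MSE for the current detector.
        for m in range(phi.size):
            best = cost(f, phi)
            for c in candidates:
                trial = phi.copy()
                trial[m] = c
                val = cost(f, trial)
                if val < best:
                    best, phi = val, trial
        # Detector step: closed-form update for the new phases.
        H = build_H(h_dir, B, phi)
        f = lmmse(H, P, s2, sigma_z2, omega_of(H, P, sigma_z2, C))
        history.append(cost(f, phi))
    return phi, f, history
```

Each of the two steps can only keep or reduce the MSE, so `history` is non-increasing; the same monotonicity argument is what makes alternating schemes of this type converge in objective value.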

V. NUMERICAL RESULTS
In simulation, we assume that the positions of $N_W$ workers, $N_A$ APs and $N_I$ IRSs are uniformly distributed in a circular area of radius 100 m. We set the variance of local parameters to $\sigma_{\theta,k}^2 = 1$ for $k \in \mathcal{N}_W$ and assume $c_0 = 20$ dB, $\eta = 3$ in the path-loss models and $\gamma = 1$ for the penalty coefficient in (25a). For all links, we consider independent Rayleigh fading channels that are distributed as $h_{d,i,k} \sim \mathcal{CN}(0, 1)$, $\mathbf{g}_{i,j} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ and $\mathbf{h}_{r,j,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$. We compare the performance of the proposed optimized scheme with two baseline schemes, one without IRSs and one with IRSs whose reflecting phases are randomly chosen. In all figures, we plot the normalized MSE, which is defined as the MSE $e(\mathbf{v}, \mathbf{f})$ normalized by $E[|\bar{\theta}|^2]$ so that it lies in the range $[0, 1]$.
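A simulation setup of this kind can be generated as in the sketch below, which draws node positions uniformly over a disc and unit-variance Rayleigh fading coefficients. The function names are hypothetical, and placing the disc at the origin is an assumption of this example.

```python
import numpy as np

def sample_positions(n, radius, rng):
    """Draw n points uniformly over a disc of the given radius (origin-centered)."""
    # Taking the square root of a uniform variable makes the density uniform
    # in area rather than concentrated near the center.
    r = radius * np.sqrt(rng.uniform(size=n))
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.stack([r * np.cos(angle), r * np.sin(angle)], axis=1)

def rayleigh_fading(shape, rng):
    """Draw i.i.d. circularly symmetric complex Gaussian CN(0, 1) coefficients."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

def normalized_mse(mse_value, sigma_theta2):
    # Normalize by the power of the target parameter, sum_k sigma_theta_k^2.
    return mse_value / np.sum(sigma_theta2)
```

For instance, `sample_positions(10, 100.0, np.random.default_rng(0))` returns a `(10, 2)` array of worker coordinates in meters, and `rayleigh_fading((5, 10), rng)` returns one direct-link coefficient per AP/worker pair.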
In Fig. 2, we plot the average normalized MSE versus the fronthaul capacity $C$ for an IRS-aided C-RAN system with $N_W = 10$, $N_A = 5$, $N_I = 2$, $n_I = 10$ and $P/\sigma_z^2 \in \{5, 20\}$ dB. The figure shows that the proposed optimized scheme outperforms both baseline schemes without IRS and with random phases, and that the gain increases with the fronthaul capacity $C$. This is because, when $C$ is small, the impact of carefully designing the IRSs' phases becomes minor due to the impact of the quantization noise signals $\{q_i\}_{i \in \mathcal{N}_A}$. Also, the gain increases with the signal-to-noise ratio (SNR) $P/\sigma_z^2$ of the uplink channel, and this trend coincides with the observation reported in [10, Sec. IV].
Fig. 3 plots the average normalized MSE versus the number $N_A$ of APs for an IRS-aided C-RAN system with $N_W = 5$, $N_I = 2$, $n_I \in \{20, 50\}$, $C = 5$ and $P/\sigma_z^2 = 10$ dB. When there are only a few APs, deploying IRSs provides relevant gains only when the reflecting phases are optimized according to Algorithm 2. However, the impact of optimizing the reflecting phases becomes minor for sufficiently large $N_A$.

VI. CONCLUDING REMARKS
We have studied the impact of deploying IRSs on AirComp in a C-RAN system. To this end, we have tackled the joint optimization of the IRSs' reflecting phases and the linear detector at the PS with the goal of minimizing the MSE of the parameter estimated at the PS. Numerical results were provided that investigate the effects of various parameters on the performance gain of the proposed optimization scheme compared to baseline schemes. Among open problems, we mention the design of the channel estimation process, the investigation of the effect of imperfect CSI, and the design of AirComp jointly with information transfer.