Beyond Max-SNR: Joint Encoding for Reconfigurable Intelligent Surfaces

A communication link aided by a Reconfigurable Intelligent Surface (RIS) is studied, in which the transmitter can control the state of the RIS via a finite-rate control link. Prior work mostly assumed a fixed RIS configuration irrespective of the transmitted information. In contrast, this work derives information-theoretic limits, and demonstrates that the capacity is achieved by a scheme that jointly encodes information in the transmitted signal as well as in the RIS configuration. In addition, a novel signaling strategy based on layered encoding is proposed that enables practical successive cancellation-type decoding at the receiver. Numerical experiments demonstrate that the standard max-SNR scheme, which fixes the configuration of the RIS so as to maximize the Signal-to-Noise Ratio (SNR) at the receiver, is strictly suboptimal, and is outperformed by the proposed strategies at all practical SNR levels.

This paper takes a more fundamental information-theoretic perspective on the design of RIS-aided communication links in which the transmitter can control the state of the RIS via a finite-rate control link (see Fig. 1). The analysis points to a novel approach of signal engineering via RISs that goes beyond the maximization of the SNR through an information-driven control of the RIS: rather than being fixed to enhance the SNR, the RIS configuration is jointly encoded with the transmitted signals as a function of the information message.
Related Work: Based on the electromagnetic and physical properties of RISs, free-space path loss models for RIS-aided systems were developed in [7] and [8]. The optimization of a fixed RIS configuration has been studied in various scenarios [9]-[22]. We mention here some representative examples. Algorithms for jointly optimizing precoding at the transmitter and beamforming at the RIS were proposed for point-to-point Multiple-Input Single-Output (MISO) systems in [9], and for Multiple-Input Multiple-Output (MIMO) systems in [10], [11]. RIS-based passive beamforming was compared to conventional relaying methods such as amplify-and-forward and decode-and-forward in [12], [13], and to multi-antenna systems in [14]. Algorithms for maximizing weighted sum-rate and energy efficiency in RIS-aided multi-user MISO systems were proposed in [15] and [16], [17], respectively. A multi-group multi-cast RIS-aided system was studied in [18], and efficient algorithms were proposed to maximize the sum-rate achieved by all groups.
To the best of our knowledge, the only paper that considers joint encoding of the transmitted signal and RIS state is [23], in which information bits are encoded, using index modulation [24], in the index of the receiver antenna for which the SNR is maximized. However, reference [23] does not address the optimality of the proposed index modulation scheme. Moreover, the proposed scheme fixes the configuration of the RIS for the entire duration of the transmission, and hence it provides only minor rate increments for large coding blocks.
Main Contributions: In this paper, we study the RIS-aided system with a single-antenna transmitter and a receiver with N antennas illustrated in Fig. 1. We first derive the capacity of the system for any RIS control rate, and prove that joint encoding of transmitted signals and RIS configuration is generally necessary to achieve the maximum information rate. In addition, we explicitly characterize the performance gain of joint encoding in the high-SNR regime. Then, we propose an achievable scheme based on layered encoding and Successive Cancellation Decoding (SCD).
[Fig. 1: system illustration; labels: block (m symbols), codeword (n symbols).]
The message w is encoded not only in the codeword of n symbols sent on the wireless link to the receiver, but also jointly in the configuration of the RIS. The latter is defined by the phase shifts that each of the K RIS elements applies to the impinging wireless signal.
Following [3], we assume that the phase shift applied by each element is chosen from a finite set 𝒜 of A = |𝒜| > 0 distinct hardware-determined values. Moreover, there is a limit on the rate at which the RIS can be controlled, such that the phase shifts θ(t) are fixed for blocks of m > 0 consecutive transmitted symbols. That is, as illustrated in Fig. 1, the state θ(t) of the RIS can be changed only at the beginning of each block t ∈ [n/m] of m transmitted symbols. Note that if m = n, the configuration of the RIS is fixed for the entire duration of the transmission, as assumed in the prior work reviewed above. We take n to be a multiple of m.
We assume that the direct link between transmitter and receiver is blocked, as in, e.g., [14], so that propagation from transmitter to receiver occurs through reflection from the RIS. Let the received signal matrix Y(t) = (y_1(t), ..., y_m(t)) ∈ ℂ^{N×m} collect the received samples in the tth block of a codeword, so that column y_i(t), i ∈ [m], denotes the signal received at the N antennas for the ith transmitted symbol in the block. The signal received on the wireless link within each tth block, with t ∈ [n/m], can then be written as
Y(t) = H S(t) g x(t)^⊺ + Z(t),    (1)
where the transmitted signal x(t) = (x_1(t), ..., x_m(t))^⊺ consists of the m symbols transmitted in the tth block; the channel vector g ∈ ℂ^{K×1} denotes the quasi-static flat-fading channel from the transmitter to the RIS; the RIS configuration matrix S(t) = diag(e^{jθ_1(t)}, ..., e^{jθ_K(t)}) denotes the phase shifts applied by the RIS during the transmission of the tth block, with θ_k(t) ∈ 𝒜 denoting the phase shift for the kth RIS element, k ∈ [K]; the channel matrix H ∈ ℂ^{N×K} denotes the quasi-static flat-fading channel from the RIS to the N receiver antennas; and the white Gaussian noise matrix Z(t) ∈ ℂ^{N×m}, whose elements are independent and identically distributed (i.i.d.) as CN(0, 1), denotes the additive noise at the receiving antennas during the transmission of the tth block. The transmitted signal is subject to the average power constraint E[|x_i(t)|^2] ≤ P for all i ∈ [m] and some P > 0. We assume full Channel State Information (CSI) in the sense that the transmitter and receiver know the quasi-static channel vector g and channel matrix H, which remain fixed throughout the n symbols corresponding to the transmission of a message w.
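As a concrete illustration of the block model (1), the following minimal sketch generates one received block. It assumes, purely for illustration, Rayleigh-distributed channels (entries drawn i.i.d. CN(0, 1)), the phase set {0, π}, and QPSK inputs; the function name ris_block and all numerical values are hypothetical and are not taken from the text.

```python
import numpy as np

def complex_normal(rng, *shape):
    """i.i.d. CN(0, 1) samples: real and imaginary parts have variance 1/2."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def ris_block(H, g, theta, x, rng):
    """One block of model (1): Y = H S g x^T + Z, with S = diag(exp(j*theta))."""
    N, m = H.shape[0], x.shape[0]
    S = np.diag(np.exp(1j * theta))            # K x K RIS configuration matrix
    Z = complex_normal(rng, N, m)              # additive noise, CN(0, 1) entries
    return H @ S @ g[:, None] @ x[None, :] + Z

rng = np.random.default_rng(0)
N, K, m, P = 2, 3, 2, 1.0                      # illustrative sizes only
H = complex_normal(rng, N, K)                  # assumption: Rayleigh fading RIS-to-receiver
g = complex_normal(rng, K)                     # assumption: Rayleigh fading transmitter-to-RIS
theta = rng.choice([0.0, np.pi], size=K)       # assumption: A = 2 phases {0, pi}
x = np.sqrt(P) * rng.choice(np.array([1, -1, 1j, -1j]), size=m)  # QPSK symbols of power P
Y = ris_block(H, g, theta, x, rng)             # N x m received block
```

The RIS state theta is drawn once per block and shared by all m columns of Y, which is what distinguishes this model from a channel whose state could change with every symbol.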
Based on the message w and CSI given by the pair (g, H), the encoder jointly selects a codeword x(t), as well as a sequence θ(t) = (θ_1(t), ..., θ_K(t)) of RIS configurations, for t ∈ [n/m]. Having received the signals Y(t) in (1) for t ∈ [n/m], the decoder produces the estimate ŵ = ŵ(Y(1), ..., Y(n/m), g, H) using knowledge of the CSI. As in the conventional definition in information theory (see, e.g., [25, Ch. 7]), a rate R(g, H) is said to be achievable if the probability of error satisfies the limit Pr(ŵ ≠ w) → 0 when the codeword length grows large, i.e., n → ∞. The corresponding capacity C(g, H) is defined as the supremum over all achievable rates, i.e., C(g, H) ≜ sup{R(g, H) : R(g, H) is achievable}, where the supremum is taken over all joint encoding and decoding schemes. Finally, we define the average rate and capacity as R ≜ E[R(g, H)] and C ≜ E[C(g, H)], where the average is taken over the distribution of the CSI (g, H).

III. JOINT ENCODING: CHANNEL CAPACITY
In this section, we derive the capacity C(g, H) and we argue that the standard max-SNR method that does not encode information in the RIS configuration (see, e.g., [9]-[22]) is strictly suboptimal.

Proposition 1:
The capacity of the channel (1) is given by (5), where the function f_c(x, θ, Z) is defined in (6), and the expectation in (5) is taken with respect to a matrix Z whose elements are i.i.d. as CN(0, 1).
At a computational level, problem (5) is convex (see Appendix A), and hence it can be solved using standard tools. In terms of the operational significance of Proposition 1, achieving capacity (5) generally requires joint encoding over codeword symbols x and RIS configuration variables θ, as well as joint decoding of message w based on information encoded over both x and θ at the receiver. This is reflected in (5) in the optimization over the joint distribution p(x, θ). In the next section, we will consider a suboptimal approach that uses separate encoding and decoding over x and θ through layering.
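To make the quantity being optimized more tangible, the sketch below estimates the per-symbol mutual information I(x, θ; Y)/m of model (1) by Monte Carlo for a given joint distribution p(x, θ). It is an illustration under stated assumptions rather than the optimization in (5): the distribution defaults to uniform instead of being optimized, the function name mutual_info_per_symbol is hypothetical, and the channels, phase set, and constellation in the usage lines are illustrative choices.

```python
import itertools
import numpy as np

def mutual_info_per_symbol(H, g, const, phases, m, pmf=None, n_mc=2000, seed=0):
    """Monte Carlo estimate of I(x, theta; Y)/m in bits for Y = H S g x^T + Z."""
    rng = np.random.default_rng(seed)
    N, K = H.shape
    # Noise-free received blocks for every input pair (x, theta), flattened to vectors.
    means = []
    for x in itertools.product(const, repeat=m):
        for th in itertools.product(phases, repeat=K):
            S = np.diag(np.exp(1j * np.array(th)))
            means.append((H @ S @ g[:, None] @ np.array(x)[None, :]).ravel())
    M = np.array(means)                                   # (n_inputs, N*m)
    n_in = M.shape[0]
    p = np.full(n_in, 1.0 / n_in) if pmf is None else np.asarray(pmf, dtype=float)
    idx = rng.choice(n_in, size=n_mc, p=p)                # sampled inputs
    Z = (rng.standard_normal((n_mc, N * m))
         + 1j * rng.standard_normal((n_mc, N * m))) / np.sqrt(2)
    Y = M[idx] + Z
    # log p(Y | input) - log p(Y), evaluated with a max-shifted log-sum-exp.
    d2 = np.sum(np.abs(Y[:, None, :] - M[None, :, :]) ** 2, axis=2)
    expo = np.sum(np.abs(Z) ** 2, axis=1)[:, None] - d2 + np.log(p)[None, :]
    mx = expo.max(axis=1, keepdims=True)
    log_ratio = -(mx[:, 0] + np.log(np.exp(expo - mx).sum(axis=1)))
    return log_ratio.mean() / np.log(2) / m

# Illustrative usage: N = 2, K = 3, A = 2, m = 2, 4-ASK at P = 100 (all assumptions).
rng = np.random.default_rng(1)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
H, g, P = cn(2, 3), cn(3), 100.0
beta = np.sqrt(P / 21)
rate = mutual_info_per_symbol(H, g, beta * np.array([1, 3, 5, 7]), (0.0, np.pi), m=2)
```

With a uniform distribution the estimate cannot exceed log_2 of the number of input pairs divided by m, here log_2(128)/2 = 3.5 bits per symbol. Approaching the capacity would additionally require optimizing the vector pmf over the probability simplex.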
The following proposition derives achievable rates for the standard max-SNR approach [9]-[22], whereby the state of the RIS θ is fixed for the entire transmission, irrespective of message w, so as to maximize the SNR at the receiver.

Proposition 2:
The rate R_max-SNR(g, H) in (7) is achievable by selecting the phase shift vector θ so as to maximize the SNR at the receiver, where the function f_max-SNR(x, θ, z) is defined in (8) and the expectation in (7) is taken with respect to z ∼ CN(0, I_N).
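The selection of the SNR-maximizing configuration can be sketched as an exhaustive search over the A^K candidate phase vectors, using the SNR expression ‖H diag(e^{jθ_1}, ..., e^{jθ_K}) g‖² P that appears in Sec. IV. This is only a brute-force illustration for small K; the function name max_snr_config is hypothetical, and the works cited for the max-SNR approach use their own optimization algorithms, which are not reproduced here.

```python
import itertools
import numpy as np

def max_snr_config(H, g, phases, P):
    """Brute-force search for the phase vector maximizing ||H S g||^2 * P."""
    best_theta, best_snr = None, -np.inf
    for th in itertools.product(phases, repeat=g.shape[0]):
        v = H @ (np.exp(1j * np.array(th)) * g)    # effective channel H S g
        snr = P * np.linalg.norm(v) ** 2
        if snr > best_snr:
            best_theta, best_snr = np.array(th), snr
    return best_theta, best_snr

# Toy usage (hypothetical values): one receive antenna, two RIS elements
# whose reflected paths are in anti-phase unless the RIS flips one of them.
H = np.ones((1, 2), dtype=complex)
g = np.array([1.0, -1.0], dtype=complex)
theta_star, snr_star = max_snr_config(H, g, (0.0, np.pi), P=1.0)
```

In this toy case the two paths cancel for θ = (0, 0) and add coherently once one element applies a phase shift of π, so the best SNR is 4P.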
The max-SNR rate (7) can again be computed using convex optimization tools. It is generally smaller than the counterpart capacity (5), and we will evaluate the corresponding performance loss in Sec. V via numerical experiments. The following proposition settles the comparison in the high-SNR regime.
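Proposition 3:
For any finite input constellation ℬ of B = |ℬ| points, the high-SNR limit of the average capacity is given by (9), in terms of the set defined in (10). The limit is maximized by the Amplitude Shift Keying (ASK) constellation
ℬ = {β, 3β, ..., (2B − 1)β},    (11)
where the factor β ≜ √(3P/[3 + 4(B² − 1)]) ensures average power P, yielding the limit
lim_{P→∞} C = log_2(B) + K log_2(A)/m.    (12)
Proof: See Appendix C.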
In the high-SNR regime, the rate of the max-SNR scheme is limited to log_2(B). Therefore, Proposition 3 demonstrates that, for any RIS configuration set 𝒜 of A distinct phases, in the high-SNR regime, modulating the RIS state can be used to increase the achievable rate by K log_2(A)/m bits per symbol as compared to the max-SNR scheme. Furthermore, the proposition shows that, in this regime, it is optimal to use the ASK modulation (11), and it implies that choosing independent codebooks for input x and RIS configuration θ, i.e., setting p(x, θ) = p(x)p(θ) in (5), does not cause any performance loss at high SNR. As an additional comparison, we note that the high-SNR performance of the index modulation scheme proposed in [23] is upper-bounded by log_2(B) + log_2(N)/n. Therefore, its suboptimality as compared to (12) increases with the ratio n/m.
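The high-SNR comparison can be checked with a few lines of arithmetic. The sketch below evaluates the max-SNR ceiling log_2(B) and the joint-encoding limit log_2(B) + K log_2(A)/m for the parameters used in Sec. V (B = 4, K = 3, A = 2, m = 2), and verifies that the ASK scaling factor β yields average power P. The function names are hypothetical.

```python
import math

def ask_scale(P, B):
    """Scaling factor beta of the ASK set {beta, 3*beta, ..., (2B-1)*beta}."""
    return math.sqrt(3 * P / (3 + 4 * (B ** 2 - 1)))

def high_snr_rates(B, K, A, m):
    """High-SNR limits in bits per symbol: (max-SNR scheme, joint encoding with ASK)."""
    max_snr = math.log2(B)
    joint = max_snr + K * math.log2(A) / m
    return max_snr, joint

B, K, A, m, P = 4, 3, 2, 2, 1.0
beta = ask_scale(P, B)                              # equals sqrt(P / 21) for B = 4
points = [(2 * k - 1) * beta for k in range(1, B + 1)]
avg_power = sum(p * p for p in points) / B          # should equal P
print(high_snr_rates(B, K, A, m))                   # (2.0, 3.5)
```

For these parameters, modulating the RIS state adds K log_2(A)/m = 1.5 bits per symbol on top of the 2 bits per symbol carried by the constellation.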

IV. LAYERED ENCODING
As discussed, achieving the capacity (5) requires jointly encoding the message over the phase shift vector θ and the transmitted signal x, while performing optimal, i.e., maximum-likelihood, joint decoding at the receiver. In this section, we propose a strategy based on layered encoding and Successive Cancellation Decoding (SCD) that uses only standard separate encoding and decoding strategies, while still benefiting from the modulation of information over the state of the RIS to improve over the max-SNR approach.
To this end, the message w is split into two sub-messages, or layers, w_1 and w_2, such that w_1, of rate R_1, is encoded by the phase shift vector θ, whereas w_2, of rate R_2, is encoded by the transmitted signal x = (x_1, ..., x_m). In order to enable decoding using standard SCD, the first τ symbols x_1, ..., x_τ, with τ ≥ 1, in vector x are fixed and used as pilots. The receiver starts by decoding w_1 using the first τ vectors y_1, ..., y_τ in every received block Y = (y_1, ..., y_m). This allows the decoder to obtain vector θ, which is then used to decode w_2. This strategy achieves the rate detailed in Proposition 4.
Proposition 4:
A strategy based on layered encoding and SCD achieves the rate
R_layered(g, H, τ) = R_1(g, H, τ)/m̄ + (m̄ − τ) R_2(g, H)/m̄,    (13)
where m̄ ≜ max{τ + 1, m}; rate R_1(g, H, τ) is defined in (14) through the function in (15) and the matrices S = diag(exp(jθ_1), ..., exp(jθ_K)) and S′ = diag(exp(jθ′_1), ..., exp(jθ′_K)); and rate R_2(g, H) is defined in (16) through the function in (17) and the power allocation parameter α(θ) in (18), which depends on the SNR γ(θ) ≜ ‖H diag(e^{jθ_1}, ..., e^{jθ_K}) g‖² P and on a cutoff parameter γ_0 satisfying the equality (20). The expectations in (14) and (16) are taken with respect to the random vector z ∼ CN(0, I_N).
Proof: Rate R_1(g, H, τ) is obtained by modulating the RIS phases, and follows in a manner similar to [26, Eq. (4)]. Rate R_2(g, H) is instead obtained by applying "water-filling" power allocation [27]. Details can be found in Appendix D.
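The way the two layer rates combine can be sketched as follows, assuming the combination R_1/m̄ + (m̄ − τ)R_2/m̄ with m̄ = max{τ + 1, m} described in Appendix D. The per-layer rates passed in below are placeholder numbers chosen only to exercise the formula, not values computed from (14) or (16), and the function name layered_rate is hypothetical.

```python
def layered_rate(R1, R2, m, tau):
    """Combine layer rates: R1 is conveyed once per block of m_bar symbols,
    R2 on each of the m_bar - tau non-pilot symbols."""
    m_bar = max(tau + 1, m)          # at least one non-pilot symbol per block
    return R1 / m_bar + (m_bar - tau) * R2 / m_bar

# Placeholder layer rates (assumptions for illustration only).
R1, R2, tau = 1.2, 1.6, 1
rates = {m: layered_rate(R1, R2, m, tau) for m in (1, 2, 4)}
```

With τ = 1, the effective block length m̄ equals 2 for both m = 1 and m = 2, so the two cases give the same rate, while for larger m the first-layer contribution R_1/m̄ shrinks.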

V. NUMERICAL RESULTS
In this section, we provide numerical examples with the main aim of comparing the capacity (5) with the rate (7) achieved by the max-SNR approach and the rate (13) achieved by the layered-encoding scheme. In Fig. 2, we plot the average rate as a function of the average power P, with N = 2 receiver antennas, K = 3 RIS elements, A = 2 available phase shifts, a symbol-to-RIS control rate m = 2, and input constellation given by QPSK ℬ = {±√P, ±j√P} or 4-ASK ℬ = {β, 3β, 5β, 7β} with β = √(P/21). For layered encoding, we set τ = 1 pilot, which was seen to maximize the rate in this experiment. For very low SNR, i.e., less than −20 dB, it is observed that the max-SNR approach is close to being optimal, and hence, in this regime, encoding information in the RIS configuration does not increase the rate. For larger SNR levels of practical interest, however, joint encoding provides significant gain over the max-SNR scheme and the layered approach proposed in Sec. IV, with the latter in turn strictly improving over the max-SNR scheme. In this regard, we note that we have numerically verified that, in this regime, the optimal joint distribution p(x, θ) in (5) is not a product distribution except for very large SNR levels (see discussion after Proposition 3). Finally, while PSK outperforms ASK when used with the max-SNR and layered-encoding schemes, the opposite is true with joint encoding in the high-SNR regime. In fact, as discussed in Proposition 3, in the high-SNR regime, out of all finite input sets ℬ with the same size, ASK achieves the maximum capacity.
The gain of using the state of the RIS as a medium for conveying information is expected to decrease as the rate of the control link from transmitter to RIS decreases, i.e., as m increases. This is illustrated by varying the block length m. Note that the performance of the layered-encoding scheme is identical for m = 1 and m = 2 due to the use of the pilot symbol, as described in Sec. IV: with τ = 1, the effective block length max{τ + 1, m} equals 2 in both cases. It is observed that, while, for m = 1, joint encoding achieves three times the rate of max-SNR, the gain reduces to a factor of 1.3 for m = 7.

VI. CONCLUSIONS
In this work, we have studied the capacity of a Reconfigurable Intelligent Surface (RIS)-aided channel. Focusing on a fundamental model with one transmitter and one receiver, the common approach of using the RIS as a passive beamformer to maximize the SNR at the receiver was shown to be generally suboptimal in terms of the achievable rate for finite input constellations. Instead, the capacity-achieving scheme was proved to jointly encode information in the RIS configuration as well as in the transmitted signal. In addition, a suboptimal, yet practical, strategy based on layered encoding and successive cancellation decoding was demonstrated to outperform passive beamforming for sufficiently high SNR levels.
Among related open problems, we mention the design of low-complexity joint encoding and decoding strategies that approach capacity, the derivation of the capacity for channels with imperfect a priori CSI [28]-[30] or noisy RIS [31], and extensions to RIS systems with multiple users/surfaces [1], [15], [16], [18] or with security constraints [32]-[34].

A. Proof of Proposition 1
The model (1) can be viewed as a standard channel with input (x, θ) and output Y. This is because the transmitter directly controls the states of the RIS S(t) for t ∈ [n/m]. Therefore, it follows from the channel coding theorem [25, Ch. 7] that the capacity can be expressed as
C(g, H) = (1/m) max_{p(x, θ)} I(x, θ; Y) = (1/m) max_{p(x, θ)} [h(Y) − h(Y|x, θ)],    (21)
where the maximum is taken over input distributions that satisfy the power constraint. Since the conditional probability density function of the output Y given the input (x, θ) is
p(Y|x, θ) = π^{−Nm} exp(−‖Y − H S g x^⊺‖_F²),
we have h(Y|x, θ) = mN log_2(πe), and the differential entropy h(Y) can be written in terms of the function f_c(x, θ, Z) defined in (6) (see, e.g., [...]). Note that the differential entropy h(Y) is a concave function of p(x, θ) for fixed p(Y|x, θ) [25, Theorem 2.7.4]. Therefore, problem (21) can be solved using standard tools.
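For readers who want the intermediate steps, one way to carry out the entropy computation is sketched below. It uses only the model (1) with i.i.d. CN(0, 1) noise; the matrix S′ denotes the configuration matrix associated with θ′. Identifying the sum inside the logarithm with the function f_c in (6) is an assumption made here for illustration, since the exact notation of (5) and (6) is not reproduced above.

```latex
\begin{align}
p(\mathbf{Y}) &= \sum_{\mathbf{x}',\boldsymbol{\theta}'} p(\mathbf{x}',\boldsymbol{\theta}')\,
   \pi^{-Nm}\exp\!\left(-\left\|\mathbf{Y}-\mathbf{H}\mathbf{S}'\mathbf{g}\mathbf{x}'^{\intercal}\right\|_F^2\right),\\
h(\mathbf{Y}) &= -\mathbb{E}\!\left[\log_2 p(\mathbf{Y})\right]
   = Nm\log_2(\pi) - \mathbb{E}\!\left[\log_2 \sum_{\mathbf{x}',\boldsymbol{\theta}'} p(\mathbf{x}',\boldsymbol{\theta}')
   \exp\!\left(-\left\|\mathbf{Z}+\mathbf{H}\mathbf{S}\mathbf{g}\mathbf{x}^{\intercal}
   -\mathbf{H}\mathbf{S}'\mathbf{g}\mathbf{x}'^{\intercal}\right\|_F^2\right)\right],\\
I(\mathbf{x},\boldsymbol{\theta};\mathbf{Y}) &= h(\mathbf{Y}) - Nm\log_2(\pi e)
   = -Nm\log_2(e) - \mathbb{E}\!\left[\log_2 \sum_{\mathbf{x}',\boldsymbol{\theta}'} p(\mathbf{x}',\boldsymbol{\theta}')
   \exp\!\left(-\left\|\mathbf{Z}+\mathbf{H}\mathbf{S}\mathbf{g}\mathbf{x}^{\intercal}
   -\mathbf{H}\mathbf{S}'\mathbf{g}\mathbf{x}'^{\intercal}\right\|_F^2\right)\right].
\end{align}
```

The second line substitutes Y = H S g x^⊺ + Z, so the expectation is over the input pair (x, θ) drawn from p(x, θ) and over the noise matrix Z. Dividing the last line by m and maximizing over p(x, θ) gives a per-symbol rate expression of the kind optimized in (21).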

B. Proof of Proposition 2
Fix a phase shift vector θ and an input distribution p(x). Since the state of the RIS is constant for the entire transmission, the channel (1) can be restated as
y(t) = H S g x(t) + z(t), t ∈ [n],
where z(t) ∼ CN(0, I_N). For this channel, by [25, Ch. 7], the rate I(x; y) = h(y) − h(y|x) is achievable, where, similar to the proof of Proposition 1, the differential entropy h(y) can be expressed in terms of the function f_max-SNR(x, θ, z) defined in (8).

C. Proof of Proposition 3
In the high-SNR regime, we can obtain the limit in (9), where equality is achieved for a uniform input distribution p(x, θ).

D. Proof of Proposition 4
The transmitted message w is divided into two layers w_1 and w_2 that are encoded in the phase shift vector θ(t) and the transmitted signal x(t), respectively, for t ∈ [n/m]. Let y_i(t) denote the ith column of the received signal matrix Y(t) in (1), i ∈ [m], which can be expressed as
y_i(t) = H diag(e^{jθ_1(t)}, ..., e^{jθ_K(t)}) g x_i(t) + z_i(t),
where the white Gaussian noise z_i(t) ∼ CN(0, I_N) denotes the additive noise at the receiving antennas during the transmission of the ith symbol in the tth block. By fixing pilots x_i(t) = √P for all i ∈ [τ] and t ∈ [n/m], and averaging the τ pilot columns, we obtain
ȳ(t) ≜ (1/τ) Σ_{i=1}^{τ} y_i(t) = H diag(g) x̄(t) + z̄(t),    (29)
where we have defined vectors z̄(t) ≜ Σ_{i=1}^{τ} z_i(t)/τ ∼ CN(0, I_N/τ) and x̄(t) ≜ (√P e^{jθ_1(t)}, ..., √P e^{jθ_K(t)})^⊺.
The channel (29) is equivalent to a point-to-point Gaussian MIMO channel with PSK input and an average power constraint of P in which a precoder given by diag(g) is applied, and the noise at each receiving antenna has variance 1/τ. Therefore, it follows from [26, Eq. (4)] that a rate of R_1(g, H, τ)/m̄ is achievable for layer w_1. The factor m̄ = max{τ + 1, m} is due to the fact that this scheme requires a transmitted block of size greater than τ in order to encode w_2 in the symbols of x(t) excluding the first τ symbols.
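The observation model (29) can be illustrated with a per-block hard detector that averages the τ pilot columns and picks the phase vector whose noise-free response is closest. This is a simplification for illustration only: the scheme in the text decodes the layer w_1 with a channel code across blocks rather than detecting θ(t) block by block. The function name detect_ris_state and all numerical values are hypothetical.

```python
import itertools
import numpy as np

def detect_ris_state(Y_pilot, H, g, phases, P):
    """Minimum-distance detection of theta from the tau pilot columns of one block."""
    y_bar = Y_pilot.mean(axis=1)                 # pilot average: noise variance drops to 1/tau
    best_theta, best_dist = None, np.inf
    for th in itertools.product(phases, repeat=g.shape[0]):
        x_bar = np.sqrt(P) * np.exp(1j * np.array(th))   # PSK "input" of channel (29)
        dist = np.linalg.norm(y_bar - H @ (g * x_bar)) ** 2   # H diag(g) x_bar
        if dist < best_dist:
            best_theta, best_dist = np.array(th), dist
    return best_theta

# Noise-free toy check with K = N = 2 and tau = 3 pilots (values are assumptions).
H = np.eye(2, dtype=complex)
g = np.array([1.0, 1.0], dtype=complex)
phases, P = (0.0, np.pi), 1.0
theta_true = np.array([np.pi, 0.0])
mean = H @ (g * np.sqrt(P) * np.exp(1j * theta_true))
Y_pilot = np.tile(mean[:, None], (1, 3))         # three identical noise-free pilot columns
theta_hat = detect_ris_state(Y_pilot, H, g, phases, P)
```

Because the noise in the averaged observation has variance 1/τ per antenna, using more pilots makes the RIS state easier to recover, at the cost of fewer symbols left for the second layer.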
Once the receiver decodes the first layer w_1, it knows the state of the RIS S(t) for all t ∈ [n/m]. Therefore, the ith column of the received signal matrix, for i ≥ τ + 1, can be written as
y_i(t) = H S(t) g x_i(t) + z_i(t) = g̃(t) x_i(t) + z_i(t), i = τ + 1, ..., m̄,    (30)
with time-varying channel vector g̃(t) ≜ H S(t) g, which is known to both transmitter and receiver. The channel (30) is equivalent to a fast-fading model with CSI at the transmitter and receiver [27]. Therefore, the following scheme is applied. The transmitter uses a uniform input distribution, i.e., p(x) = 1/|ℬ| for all x ∈ ℬ. For an RIS configuration θ, the transmitter amplifies the transmitted symbol by α(θ) in (18). Note that the average power constraint is satisfied due to equality (20). Hence, a rate of I(x; y|θ)/m can be achieved, where, similar to the proofs of Proposition 1 and Proposition 2, the mutual information I(x; y|θ) can be expressed as R_2(g, H) in (16). Since m̄ − τ of the m̄ symbols in block x(t) are used for conveying information, the total rate achieved in the second layer is (m̄ − τ)R_2(g, H)/m̄. Thus, using both layers, the rate R_layered(g, H, τ) in (13) is achievable.