Information-Energy Capacity Region for SWIPT Systems with Power Amplifier Nonlinearity

We study the information-energy capacity region (IE-CR) of an additive white Gaussian noise (AWGN) channel in the presence of high-power amplifier (HPA) nonlinearity. Specifically, we consider a three-node network consisting of one transmitter, one information receiver and one energy receiver, and we study the capacity-achieving input distribution under a) average-power and peak-power constraints at the transmitter, b) HPA nonlinearity at the transmitter, and c) nonlinearity of the energy harvesting circuit at the energy receiver. We prove that the optimal input distribution is discrete with a finite number of mass points, and we derive closed-form expressions for the special cases of maximizing the harvested energy and maximizing the information capacity. We show that HPA nonlinearity significantly reduces the achievable IE-CR; to compensate for it, a predistortion technique is also discussed and evaluated in terms of IE-CR performance.


I. INTRODUCTION
Simultaneous wireless information and power transfer (SWIPT) is a new technology, where a dedicated radio-frequency (RF) transmitter conveys information and energy to wireless devices by using the same radio waveform [1]. It is a promising technology for future communication networks, which are characterized by a massive number of low-power devices (e.g., Internet of Things). The key idea of SWIPT was proposed by Varshney in [2], where the fundamental trade-off between information and energy transfer was introduced for a simple point-to-point channel; this work was extended in [3] to a point-to-point channel with parallel links. More recent works study the integration of SWIPT in more complex network configurations, e.g., multiple-antenna systems [4], multiple-access networks [5], and multiple-antenna cellular networks [6].
One of the main particularities of a SWIPT network is that the wireless power transfer channel is highly nonlinear (in contrast to the linear information transfer channel). Recent studies take into account the nonlinearity of the rectification circuit, and study the impact of waveform design and/or input distribution on the achieved information-energy capacity region (IE-CR). For instance, the work in [7] models the rectifier's behaviour and introduces a mathematical framework to design waveforms that exploit the nonlinearity. On the other hand, the authors in [8] study the input distribution that maximizes the IE-CR for an additive white Gaussian noise (AWGN) channel under statistical constraints (first/second moment statistics) on the input distribution. By relaxing these constraints, the authors in [9] study the input distribution under general average-power (AP) and peak-power (PP) constraints at the transmitter. By using the mathematical framework in [10], they prove that the input distribution is unique and discrete with a finite number of mass points.
According to experimental studies, signals with high peak-to-average-power ratio (PAPR), e.g., multi-sine signals, increase the direct-current (DC) output power of the rectifier and enhance the IE-CR performance [1], [7], [11], in comparison to constant-envelope signals. However, signals with high PAPR are more sensitive to high-power amplifier (HPA) nonlinearities, which can significantly degrade the quality of the communication [12]. Despite this fundamental, experimentally validated observation, previous works do not take into account the effects of the HPA on the achieved IE-CR and assume that the RF power amplifier always operates in the linear regime.
In this paper, we study the fundamental limits of a SWIPT system which is characterized by HPA nonlinearities at the transmitter. By taking into account a memoryless HPA model, i.e., the solid state power amplifier (SSPA) model [12], we characterize the IE-CR for a three-node real-valued AWGN channel, under both AP and PP constraints at the transmitter as well as rectification nonlinearities at the energy receiver. We study the input distribution that maximizes the IE-CR by formulating appropriate convex optimization problems over the input distribution, and we provide simplified closed-form expressions for the designs that maximize information or energy transfer. We show that the HPA significantly reduces the IE-CR, while no trade-off between information and energy is observed for low PP constraints. Finally, a deterministic digital predistortion (PD) that inverts the HPA nonlinearities and linearizes the below-saturation regime is discussed; we show that PD enhances the IE-CR performance when the HPA nonlinearity is more severe.
Notation: Lower-case bold symbols denote vectors, $\mathbb{E}[\cdot]$ represents the expectation operator, $\succeq$ denotes componentwise inequality, the superscript $(\cdot)^\top$ denotes transpose, and $P(X)$ is the probability of the event $X$; $f(x) \nearrow (x_1, x_2)$ and $f(x) \searrow (x_1, x_2)$ denote that the function $f(x)$ is monotonically increasing/decreasing in the interval $(x_1, x_2)$, respectively.

II. SYSTEM MODEL
We assume a simple SWIPT topology consisting of one transmitter, one information receiver (IR), and one energy receiver (ER) [9]. All the terminals are equipped with a single antenna; the IR converts the received signal to baseband to decode the transmitted information, while the ER harvests energy from the received RF signal. The transmitter transmits a pulse-amplitude modulated signal with an average power $\sigma_x^2$,
$$x(t) = \sum_{k} x[k]\, p(t - kT),$$
where $p(t)$ is the rectangular pulse shaping filter (i.e., $p(t) = 1$ for $0 < t \le T$), $T$ is the symbol interval, and $x[k]$ is the information symbol at time index $k$, modeled as the realization of an independent and identically distributed (i.i.d.) real random variable $X$ with cumulative distribution function $F_X(x)$. We assume a normalized symbol interval $T = 1$, so that energy and power become identical and the two terms are used interchangeably. Fig. 1 schematically presents the system model.
The modulated signal is amplified by the HPA in the RF chain, which introduces nonlinear amplitude distortion on the transmitted amplitude-modulated signal $x(t)$. Specifically, the output of the nonlinear HPA can be written as $\tilde{x}[k] = d(x[k])$ (i.e., random variable $\tilde{X} = d(X)$), where $d(\cdot)$ denotes the AM-to-AM conversion, which is given by the considered SSPA HPA model [12], i.e.,
$$d(r) = \frac{r}{\left[1 + \left(\frac{r}{A_s}\right)^{2\beta}\right]^{\frac{1}{2\beta}}},$$
where $A_s$ is the output saturation voltage, and $\beta$ represents the smoothness of the transition from the linear regime to saturation. Let $A_0$ denote the minimum input voltage that drives the HPA output to saturation, i.e., $A_0 = \min_{d(r) = A_s} r$, with $A_0 = A_s$ for $\beta \gg 1$. Fig. 2 presents the input-output voltage characteristics for the considered HPA model; as $\beta$ increases, the nonlinear transition regime (below saturation) of the HPA is linearized.
We consider transmission over an AWGN channel with fixed and known channel fading [8], [9]. The equivalent baseband received signal at the IR is given by
$$y(t) = h_I \tilde{x}(t) + n(t),$$
where $h_I \in \mathbb{R}$ is the (constant) channel fading gain and $n(t)$ is real-valued Gaussian noise with unit variance. We define the conditional probability
$$p(y \mid x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(y - h_I d(x))^2}{2}\right).$$
The ER converts the received RF signal to DC power through a nonlinear rectification circuit. If $h_E \in \mathbb{R}$ is the (constant) fading channel gain for the transmitter-ER link, the average harvested energy is captured by the following quantity (monotonically increasing with the average harvested energy) [9], [13]
$$\mathbb{E}\left[I_0\left(B h_E d(X)\right)\right],$$
where $I_0(\cdot)$ denotes the modified Bessel function of the first kind and order zero, and $B$ is a constant that depends on the characteristics of the rectification circuit.
In addition to the transmit AP constraint $\sigma_x^2$, the transmitter is subject to a PP constraint, which limits the negative effects of saturation at both the transmitter (due to the HPA) and the ER (due to diode breakdown) [9]; the PP constraint can be expressed as $|X| \le A$, where $A$ is the peak amplitude.
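The two nonlinearities of the system model can be sketched in a few lines of Python. This is a minimal illustration and not the authors' simulation code: it assumes the standard Rapp form of the SSPA AM-to-AM curve, extends it to negative inputs by odd symmetry (an assumption made here because $X$ is real-valued), and evaluates $I_0$ through its power series. The function names `sspa`, `bessel_i0` and `avg_harvested`, and the default $B = 0.5$, $h_E = 1$, are illustrative choices.

```python
import math


def sspa(x, a_s, beta):
    """SSPA (Rapp-type) AM-to-AM curve, odd-extended to negative inputs (assumption)."""
    r = abs(x)
    out = r / (1.0 + (r / a_s) ** (2.0 * beta)) ** (1.0 / (2.0 * beta))
    return math.copysign(out, x)


def bessel_i0(z, terms=60):
    """Modified Bessel function I_0(z) from its series sum_k (z^2/4)^k / (k!)^2."""
    q = 0.25 * z * z
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total


def avg_harvested(points, probs, a_s, beta, b=0.5, h_e=1.0):
    """E[I_0(B h_E d(X))] for a discrete input with the given mass points."""
    return sum(p * bessel_i0(b * h_e * sspa(x, a_s, beta))
               for x, p in zip(points, probs))


if __name__ == "__main__":
    # With A_s = 5 and beta = 1, an input amplitude of 1.75 is compressed to about 1.6518.
    print(round(sspa(1.75, a_s=5.0, beta=1.0), 4))
    # Energy metric of an equiprobable binary input at +/-1.75.
    print(avg_harvested([-1.75, 1.75], [0.5, 0.5], a_s=5.0, beta=1.0))
```

Since $I_0$ is even, the energy metric depends only on $|d(X)|$, so the odd extension affects the information link but not the harvested energy.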

III. INFORMATION-ENERGY CAPACITY REGION
We first consider the case where the IR is not present/active. In this case, we design the input distribution $F_X$ under both AP and PP constraints to maximize the power delivered to the ER. The considered design problem can be formulated as
$$(\text{P1}) \quad \max_{F_X} \; \mathbb{E}\left[I_0(B h_E d(X))\right] \quad \text{s.t.} \quad \mathbb{E}[X^2] \le \sigma_x^2, \;\; |X| \le A.$$
Although (P1) is a linear optimization problem over the input distribution and can be solved with standard convex optimization tools (e.g., CVX), we can provide a closed-form solution. The following proposition gives the solution to (P1) and the associated input distribution for the different cases.
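Proposition 1. The maximum average harvested energy and the associated mass point distribution are given by
$$E_{\max} = p\, I_0(B h_E d(\lambda)) + (1 - p),$$
where
• If $A^2 \le \sigma_x^2$, we have $p = 1$, $\lambda = A$, and mass point distribution $\Pi_A = \frac{1}{2}(\delta_{-A} + \delta_{A})$.
• If $\sigma_x^2 \le A^2$ and $g(x) \searrow (\sigma_x, A)$, we have $p = 1$, $\lambda = \sigma_x$, and mass point distribution $\Pi_A = \frac{1}{2}(\delta_{-\sigma_x} + \delta_{\sigma_x})$.
• If $\sigma_x^2 \le A^2$ and $g(x) \nearrow (\sigma_x, A)$, we have $p = \frac{\sigma_x^2}{A^2}$, $\lambda = A$, and mass point distribution $\Pi_A = \frac{\sigma_x^2}{2A^2}(\delta_{-A} + \delta_{A}) + \left(1 - \frac{\sigma_x^2}{A^2}\right)\delta_0$.
• If $\sigma_x^2 \le A^2$, $g(x) \nearrow (\sigma_x, A')$ and $g(x) \searrow (A', A)$, we have $p = \frac{\sigma_x^2}{A'^2}$, $\lambda = A'$, and mass point distribution $\Pi_A = \frac{\sigma_x^2}{2A'^2}(\delta_{-A'} + \delta_{A'}) + \left(1 - \frac{\sigma_x^2}{A'^2}\right)\delta_0$.
Here, $\delta_x$ is the Dirac measure (point mass) concentrated at $x$, and $g(x) = \left(I_0(B h_E d(x)) - 1\right)/x^2$.
The proof is given in the Appendix. When the IR is active, the information rate achieved with $\Pi_A$ is
$$I_{\min} = \sum_{j} p_j \int p(y \mid x_j) \log_2 \frac{p(y \mid x_j)}{\sum_{l} p(y \mid x_l)\, p_l}\, dy,$$
where $p_j = P[X = x_j]$; since $\Pi_A = \sum_j p_j \delta_{x_j}$ is a binary/ternary distribution, the complexity of the numerical computation is low.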
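The case analysis of Proposition 1 can be evaluated numerically as follows. This is a hedged sketch rather than the authors' procedure: it assumes the Rapp form of the SSPA model, and it locates $\lambda$ as the grid maximizer of $g(x)$ on $[\sigma_x, A]$, which reproduces the listed cases only when $g$ is monotone or single-peaked on that interval. The name `max_energy_distribution`, the grid size, and the defaults $B = 0.5$, $h_E = 1$ are illustrative.

```python
import math


def sspa(r, a_s, beta):
    """SSPA (Rapp-type) AM-to-AM curve for a non-negative amplitude r."""
    return r / (1.0 + (r / a_s) ** (2.0 * beta)) ** (1.0 / (2.0 * beta))


def bessel_i0(z, terms=60):
    """Modified Bessel function I_0(z) from its power series."""
    q = 0.25 * z * z
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total


def g(x, a_s, beta, b, h_e):
    """g(x) = (I_0(B h_E d(x)) - 1) / x^2, whose monotonicity selects the case."""
    return (bessel_i0(b * h_e * sspa(x, a_s, beta)) - 1.0) / (x * x)


def max_energy_distribution(peak, sigma_x, a_s, beta, b=0.5, h_e=1.0, grid=2000):
    """Return (E_max, {mass point: probability}) following Proposition 1."""
    if peak <= sigma_x:
        lam = peak  # PP constraint dominates: mass points at +/-A
    else:
        xs = [sigma_x + (peak - sigma_x) * k / grid for k in range(grid + 1)]
        lam = max(xs, key=lambda x: g(x, a_s, beta, b, h_e))
    p = min(1.0, (sigma_x / lam) ** 2)  # total probability of the points +/-lambda
    e_max = p * bessel_i0(b * h_e * sspa(lam, a_s, beta)) + (1.0 - p)
    masses = {-lam: p / 2.0, lam: p / 2.0}
    if p < 1.0:
        masses[0.0] = 1.0 - p  # remaining mass sits at zero
    return e_max, masses


if __name__ == "__main__":
    # Nearly linear amplifier: g is increasing, so the support is {-A, 0, +A}.
    print(max_energy_distribution(peak=16.0, sigma_x=7.0, a_s=100.0, beta=10.0))
    # Strongly saturated amplifier: g is decreasing, so the support is {-sigma_x, +sigma_x}.
    print(max_energy_distribution(peak=16.0, sigma_x=7.0, a_s=1.0, beta=1.0))
```

The grid search is only a convenient way to detect which branch applies; when the monotonicity of $g$ is known in advance, $\lambda$ can be read directly from the proposition.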
Next, we consider the case where the target of the system is to maximize the Shannon information capacity under both AP and PP constraints. The problem can be formulated as follows
$$(\text{P2}) \quad \max_{F_X} \; I(X;Y) \quad \text{s.t.} \quad \mathbb{E}[X^2] \le \sigma_x^2, \;\; |X| \le A,$$
where $I(X;Y)$ is the average mutual information between the channel input $X$ and the channel output $Y = h_I \tilde{X} + N$ with $\tilde{X} = d(X)$. Given that the input probability distribution is constrained to $(-A, A)$, the mutual information is given by
$$I(X;Y) = \int_{-A}^{A} \int_{-\infty}^{\infty} p(y \mid x) \log_2 \frac{p(y \mid x)}{p(y; F_X)}\, dy\, dF_X(x),$$
where $p(y; F_X)$ is the marginal output probability density function given an input distribution $F_X$. Due to the considered PP constraint and the nonlinearity of the HPA model, we can show that the optimal input distribution $F_X$ is unique and discrete with a finite number of mass points. The proof requires the application of a systematic methodology and is similar to the analysis in [9], [10]. Given the finiteness/discreteness of the input distribution, (P2) can be discretized and reformulated as the following convex optimization problem (corresponding to the capacity of a discrete memoryless channel (DMC) with AP/PP constraints), i.e.,
$$(\text{P3}) \quad \max_{\mathbf{p}} \; \sum_{j=1}^{n} \sum_{i=1}^{m} p_j\, p_{ij} \log_2 \frac{p_{ij}}{\sum_{l=1}^{n} p_l\, p_{il}} \quad \text{s.t.} \quad \mathbf{1}^\top \mathbf{p} = 1, \;\; \mathbf{p} \succeq \mathbf{0}, \;\; \sum_{j=1}^{n} p_j x_j^2 \le \sigma_x^2,$$
where $\mathbf{1}$ denotes a vector of ones, $p_{ij} = P(Y = y_i \mid X = x_j)$, and $\mathbf{p} = [p_1, p_2, \ldots, p_n]^\top$. The above formulation discretizes the intervals $x \in (-A, A)$ and $y \in (-\Gamma, \Gamma)$ (where $\Gamma \gg A$) with sufficiently small step sizes $\Delta x \to 0$, $\Delta y \to 0$ to form the input ($n = 2A/\Delta x$ mass points) and the output ($m = 2\Gamma/\Delta y$ mass points) symbol sets, respectively. Formulation (P3) is a convex optimization problem where the objective function is concave in $\mathbf{p}$; it can therefore be solved by using standard convex optimization tools (e.g., CVX). It is worth noting that (P3) can also be solved by using the Blahut-Arimoto algorithm for constrained discrete channels, which numerically computes the capacity of a DMC with constraints on the input distribution [14]. If $\mathbf{p}^*$ is the solution to (P3), the maximum mutual information becomes equal to
$$I_{\max} = \sum_{j=1}^{n} \sum_{i=1}^{m} p_j^*\, p_{ij} \log_2 \frac{p_{ij}}{\sum_{l=1}^{n} p_l^*\, p_{il}}.$$
When the ER is active, the average harvested energy is written as
$$E_{\min} = \sum_{j=1}^{n} p_j^*\, I_0(B h_E d(x_j)).$$
Remark 1. For the case where $d(A) \le \bar{A} \approx 1.665$ (peak output amplitude [15]) and $A^2 \le \sigma_x^2$, the input distribution is equiprobable binary, i.e., $\Pi_A = \frac{1}{2}(\delta_{-A} + \delta_{+A})$; in this case, we have $I_{\max} = 1 - H_2(P_e)$ and $E_{\min} = I_0(B h_E d(A))$, where $H_2(x)$ is the binary entropy with probability $x$, and $P_e = \int_{0}^{\infty} p(y \mid -A)\, dy$. In this case, and according to Proposition 1, there is no trade-off between information and energy: the same input distribution (i.e., equiprobable binary with mass points at $\pm A$) maximizes both information and energy transfer simultaneously.
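The Blahut-Arimoto iteration mentioned above can be sketched for the discretized channel as follows. This is an illustrative sketch under simplifying assumptions, not the constrained algorithm of [14]: it omits the AP constraint (which is inactive when $A^2 \le \sigma_x^2$), assumes the Rapp form of the SSPA model with an odd extension to negative inputs, and samples the Gaussian density on the output grid instead of integrating over bins. The function name `blahut_arimoto` and the grid sizes are illustrative.

```python
import math


def sspa(x, a_s, beta):
    """SSPA (Rapp-type) AM-to-AM curve, odd-extended to negative inputs (assumption)."""
    r = abs(x)
    out = r / (1.0 + (r / a_s) ** (2.0 * beta)) ** (1.0 / (2.0 * beta))
    return math.copysign(out, x)


def blahut_arimoto(xs, ys, a_s, beta, h_i=1.0, iters=300):
    """Unconstrained Blahut-Arimoto on the grid; returns (input pmf, rate in bits)."""
    # Transition probabilities P(Y = y_i | X = x_j): unit-variance Gaussian centred
    # at h_I d(x_j), sampled on the output grid and normalised per input symbol.
    rows = []
    for x in xs:
        mean = h_i * sspa(x, a_s, beta)
        w = [math.exp(-0.5 * (y - mean) ** 2) for y in ys]
        total = sum(w)
        rows.append([v / total for v in w])

    def output_pmf(p):
        return [sum(pj * row[i] for pj, row in zip(p, rows)) for i in range(len(ys))]

    def divergences(q):
        # Kullback-Leibler divergence (in nats) of each channel row from q.
        return [sum(w * math.log(w / qi) for w, qi in zip(row, q) if w > 0.0)
                for row in rows]

    p = [1.0 / len(xs)] * len(xs)  # start from the uniform input distribution
    for _ in range(iters):
        d_kl = divergences(output_pmf(p))
        weights = [pj * math.exp(dj) for pj, dj in zip(p, d_kl)]
        norm = sum(weights)
        p = [w / norm for w in weights]

    rate_nats = sum(pj * dj for pj, dj in zip(p, divergences(output_pmf(p))))
    return p, rate_nats / math.log(2.0)


if __name__ == "__main__":
    peak = 1.75
    xs = [-peak + 2.0 * peak * k / 14 for k in range(15)]  # input grid on [-A, A]
    ys = [-8.0 + 0.1 * i for i in range(161)]              # output grid on [-Gamma, Gamma]
    pmf, rate = blahut_arimoto(xs, ys, a_s=5.0, beta=1.0)
    print([round(v, 3) for v in pmf], round(rate, 3))
```

Handling an active AP constraint would require an additional Lagrange multiplier on $\sum_j p_j x_j^2$ inside the exponent of the update, which this sketch leaves out.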
Next, we consider the case where both the ER and the IR are active/present. The IE-CR, defined in (10), is the set of information-energy pairs $(I, E)$ that can be achieved simultaneously under the AP and PP constraints. To characterize the boundary of the IE-CR, we observe that when $I \le I_{\min}$, the maximum average harvested energy is given by the input distribution that achieves the tuple $(I_{\min}, E_{\max})$, given by the solution to (P1). On the other hand, when $E \le E_{\min}$, the maximum information rate is given by the input distribution that achieves the tuple $(I_{\max}, E_{\min})$, given by the solution to (P3). The other points of the boundary, with $I_{\min} \le I \le I_{\max}$ and $E_{\min} \le E \le E_{\max}$, can be found by solving a new optimization problem, which is similar to (P3) with the extra constraint
$$\sum_{j=1}^{n} p_j\, I_0(B h_E d(x_j)) \ge E.$$
Since the extra constraint is linear over the input distribution $F_X$, the optimization problem is still convex and can be solved by using standard convex optimization tools, e.g., CVX.

A. Digital predistortion
In this section, we study the IE-CR for the case where a PD is applied to the input signal before the HPA. The purpose of PD is to compensate the nonlinear HPA effects and linearize the non-saturation regime of the HPA. When the HPA function $d(r)$ is deterministic and known at the transmitter, an ideal PD corresponds to a function $q(r)$ that inverts $d(r)$ in the below-saturation regime. By following similar analytical steps to the HPA case (i.e., solving optimization problems (P1), (P2) and (P3)), the information-energy capacity region is expressed by (10) with two basic modifications: i) the AP constraint is replaced by $\mathbb{E}[q(X)^2] \le \sigma_x^2$, and ii) the HPA output is equal to $\tilde{X} = d(q(X))$. These two modifications do not affect the nature and the characteristics of the problem (discreteness of the input distribution, convexity over $\mathbf{p}$, etc.) and therefore the proposed mathematical framework can be applied accordingly. It is worth noting that $r \ge d(r)$ and therefore PD penalizes the AP constraint (it increases the transmit power), while it benefits the objective functions in (P1)-(P3).
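For intuition, an ideal predistorter can be written in closed form when the HPA follows the Rapp form of the SSPA model. The sketch below is an assumption-laden illustration: the name `ideal_pd` is hypothetical, the formula is simply the algebraic inverse of the Rapp curve for $0 \le r < A_s$, and the behaviour at or above saturation (rejected here with an error) is a choice made for this example only.

```python
import math


def sspa(r, a_s, beta):
    """SSPA (Rapp-type) AM-to-AM curve for a non-negative amplitude r."""
    return r / (1.0 + (r / a_s) ** (2.0 * beta)) ** (1.0 / (2.0 * beta))


def ideal_pd(r, a_s, beta):
    """Inverse of the Rapp curve below saturation, so that sspa(ideal_pd(r)) = r."""
    if not 0.0 <= r < a_s:
        raise ValueError("the inverse exists only for 0 <= r < A_s")
    return r / (1.0 - (r / a_s) ** (2.0 * beta)) ** (1.0 / (2.0 * beta))


if __name__ == "__main__":
    a_s, beta = 5.0, 1.0
    for r in (1.0, 3.0, 4.5):
        q = ideal_pd(r, a_s, beta)
        # q >= r >= d(r): the predistorted amplitude costs extra transmit power.
        print(r, round(q, 4), round(sspa(q, a_s, beta), 4), round(sspa(r, a_s, beta), 4))
```

The printed rows show the two effects discussed above: the cascade `sspa(ideal_pd(r))` returns `r`, while `ideal_pd(r)` grows quickly as `r` approaches `a_s`, which is what tightens the AP constraint.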

IV. NUMERICAL RESULTS
Computer simulations have been carried out to evaluate the impact of the HPA in terms of IE-CR; for the sake of simplicity, we assume $h_I = h_E = 1$ without loss of generality.
Fig. 3 deals with the input mass distribution for different system configurations when the IR is not active and the target is to maximize the average harvested energy. We assume $A = 16$, $\sigma_x^2 = 49$ and thus $\sigma_x^2 \le A^2$; Fig. 3(a) plots the function $g(x)$ for the configurations considered. For the case where $A_s = 10$, $\beta = 1$ (see Fig. 3(b)), we have $g(x) \searrow (\sigma_x, A)$ and the optimal input mass distribution consists of two mass points at $\pm\sigma_x$. On the other hand, for the scenario where $A_s = 10 < A$ and $\beta = 80$ (see Fig. 3(c)), the nonlinearity in the HPA transition region disappears and thus $g(x) \nearrow (\sigma_x, A_s)$, $g(x) \searrow (A_s, A)$; in this case the input distribution consists of three mass points at $\{\pm A_s, 0\}$. Finally, for the scenario where $A < A_s = 100$ and $\beta = 10$, we have $g(x) \nearrow (\sigma_x, A)$ and the optimal input distribution has mass points at $\{\pm A, 0\}$. The main observations of Fig. 3 are in line with Proposition 1.
In Fig. 4, we show the input distribution for the case where the ER is not present and the goal of the system is to maximize the Shannon information capacity; the setting is $A_s = 5$, $\beta = 1$, $\sigma_x^2 = 30$ dB, and $B = 0.5$. For the case where $A = 18$ (Fig. 4(a)), we can see that the optimal input distribution is discrete with a finite number of mass points. In Fig. 4(b), we examine the special case of small $A$, i.e., $A = 1.75$ (with $d(1.75) = 1.6518 < \bar{A} \approx 1.665$ [15]); as can be seen, the optimal input distribution is binary with two mass points at $\pm A$. This observation is in line with Remark 1.
Fig. 5 shows the fundamental information-energy capacity region for the considered SWIPT system with HPA nonlinearities at the transmitter. The simulation setup assumes $A_s = 5$, $\beta = 1$, $\sigma_x^2 = 30$ dB and $B = 0.5$; the case without HPA degradation is used as a benchmark (no-HPA). The first remark is that HPA nonlinearities significantly reduce the achieved IE-CR in comparison to the no-HPA case; the negative effects of the HPA become more critical as the PP constraint increases. Another important observation is that for low $A$ (i.e., $A = 1.75$ with $d(A) < \bar{A}$), there is no trade-off between information and energy and thus the same input distribution maximizes both information and energy transfer (Remark 1). Finally, in the curve corresponding to $A = 10$, we can see the key points of the boundary of the information-energy capacity region, which are defined in (10).
Finally, Fig. 6 deals with the impact of PD on the IE-CR; we study configurations with different values of the parameter $\beta$. As we can see, the application of a PD to the input signal limits the negative effects of the HPA and enlarges the IE-CR. However, the observed gain decreases as the smoothness parameter $\beta$ increases, since the associated power for the inversion $d^{-1}(r)$ increases; for $\beta = 10$, the transition region is almost linear and the application of PD slightly decreases the IE-CR performance.

APPENDIX
We consider the functions $g_1(x) = I_0(\theta x)$ and $g_2(x) = g_1(d(x))$, where $\theta$ is a constant. The function $d(x)$ is monotonically increasing, i.e., its first derivative equals
$$d'(x) = \left[1 + \left(\frac{x}{A_s}\right)^{2\beta}\right]^{-\left(1 + \frac{1}{2\beta}\right)} > 0.$$
Given that $g_1(x)$ is a monotonically increasing function for $x > 0$, the composite function $g_2(x)$ is an increasing function for $x > 0$ (composition of two increasing functions).
For the case where $A^2 \le \sigma_x^2$, the PP constraint dominates and, due to the monotonicity and even symmetry of $g_2(x)$, the optimal distribution consists of two mass points at $-A$ and $A$ with probabilities $p_1$ and $p_2 = 1 - p_1$, respectively. Therefore, the average harvested energy becomes equal to $E_{\max} = g_2(A)$.
Although $p_1$ can take any value in $(0, 1)$ without affecting the maximum average harvested energy, we assume $p_1 = 1/2$ to maximize the information transfer (in case the IR is active).
When $\sigma_x^2 < A^2$, we also examine the case where the mass points are located in the region $(\sigma_x, A)$. Similarly to the previous case (i.e., $A^2 \le \sigma_x^2$), let $x_0 = \sigma_x$ be the point of increase of a distribution $F_X$ with probability 1; we construct a new distribution $F'_X$ by removing $x_0$ and adding two mass points at the locations $0$ and $y \in (\sigma_x, A)$ with probabilities $1 - \sigma_x^2/y^2$ and $\sigma_x^2/y^2$, respectively. We can show that this transformation decreases/increases the harvested energy depending on the monotonicity of the function $g(x) = (g_2(x) - 1)/x^2$. More specifically, if $g(x)$ is a decreasing function in $(\sigma_x, A)$, i.e.,
$$g(\sigma_x) \ge g(y) \;\Rightarrow\; g_2(\sigma_x) \ge \left(1 - \frac{\sigma_x^2}{y^2}\right) g_2(0) + \frac{\sigma_x^2}{y^2}\, g_2(y), \tag{12}$$
with $g_2(0) = 1$, $F'_X$ decreases the average harvested energy and thus $F_X$ is the optimal input distribution; by following similar arguments as before, the optimal distribution consists of two points at $-\sigma_x$ and $\sigma_x$ with probabilities $1/2$, and the maximum harvested energy becomes equal to $E_{\max} = g_2(\sigma_x)$. On the other hand, if $g(x)$ is an increasing function in $(\sigma_x, A)$, the inequality in (12) holds in the reverse direction and $y = A$ maximizes the average harvested energy. In this case, the optimal mass function consists of three points at the locations $-A$, $A$ and $0$ with probabilities $\frac{\sigma_x^2}{2A^2}$, $\frac{\sigma_x^2}{2A^2}$ and $1 - \frac{\sigma_x^2}{A^2}$, respectively. Finally, in case that $A_0 \in (\sigma_x, A)$, the function $g(x)$ is increasing in the interval $(\sigma_x, A')$ and decreasing in the interval $(A', A)$, and therefore we have $y = A'$; we note that $A' \approx A_s$ for $\beta \gg 1$. Equivalently, the optimal input distribution consists of three mass points at the locations $-A'$, $A'$ and $0$ with probabilities $p_1 = p_2 = \frac{\sigma_x^2}{2A'^2}$ and $p_0 = 1 - \frac{\sigma_x^2}{A'^2}$. For these two subcases (with three mass points), the maximum average energy is equal to $E_{\max} \approx 2 p_1 g_2(\mu) + p_0$ with $\mu = A$ and $\mu = A'$, respectively.
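The mass-splitting step of the proof can be checked numerically. The sketch below is illustrative only: it assumes the Rapp form of the SSPA model, $\theta = B h_E$ with $B = 0.5$ and $h_E = 1$, and a hypothetical helper `split_gain` that returns the change in harvested energy when the mass at $\sigma_x$ is moved to $\{0, y\}$. By construction this change equals $\sigma_x^2\,(g(y) - g(\sigma_x))$, so its sign follows the monotonicity of $g$.

```python
def sspa(r, a_s, beta):
    """SSPA (Rapp-type) AM-to-AM curve for a non-negative amplitude r."""
    return r / (1.0 + (r / a_s) ** (2.0 * beta)) ** (1.0 / (2.0 * beta))


def bessel_i0(z, terms=60):
    """Modified Bessel function I_0(z) from its power series."""
    q = 0.25 * z * z
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total


def g2(x, a_s, beta, theta=0.5):
    """g_2(x) = I_0(theta * d(x)); theta = B * h_E is an assumed value."""
    return bessel_i0(theta * sspa(x, a_s, beta))


def g(x, a_s, beta, theta=0.5):
    """g(x) = (g_2(x) - 1) / x^2."""
    return (g2(x, a_s, beta, theta) - 1.0) / (x * x)


def split_gain(sigma_x, y, a_s, beta, theta=0.5):
    """Energy of F'_X (mass at 0 and y) minus energy of F_X (mass at sigma_x)."""
    w = (sigma_x / y) ** 2  # probability placed at y, which keeps E[X^2] = sigma_x^2
    e_split = (1.0 - w) * 1.0 + w * g2(y, a_s, beta, theta)  # uses g_2(0) = 1
    return e_split - g2(sigma_x, a_s, beta, theta)


if __name__ == "__main__":
    # Nearly linear HPA: g is increasing, so splitting the mass helps (positive gain).
    print(split_gain(7.0, 16.0, a_s=100.0, beta=10.0))
    # Strongly saturated HPA: g is decreasing, so splitting hurts (negative gain).
    print(split_gain(7.0, 16.0, a_s=1.0, beta=1.0))
```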

Fig. 1. System model consisting of one transmitter, an information receiver and an energy receiver.