On the Outage Performance of Non-Orthogonal Multiple Access with One-Bit Feedback

In this paper, the outage performance of downlink non-orthogonal multiple access (NOMA) is investigated for the case where each user feeds back only one bit of its channel state information (CSI) to the base station. Conventionally, opportunistic one-bit feedback has been used in fading broadcast channels to select only one user for transmission. In contrast, the considered NOMA scheme adopts superposition coding to serve all users simultaneously in order to improve user fairness. A closed-form expression for the common outage probability (COP) is derived, along with the optimal diversity gains under two types of power constraints. Particularly, it is demonstrated that the diversity gain under a long-term power constraint is twice as large as that under a short-term power constraint. Furthermore, we study dynamic power allocation optimization for minimizing the COP, based on one-bit CSI feedback. This problem is challenging since the objective function is non-convex; however, under the short-term power constraint, we demonstrate that the original problem can be transformed into a set of convex problems. Under the long-term power constraint, an asymptotically optimal solution is obtained for high signal-to-noise ratio.


I. INTRODUCTION
Non-orthogonal multiple access (NOMA) has been recognized as an important multiple access (MA) technique in future fifth generation (5G) networks since a balanced tradeoff between spectral efficiency and user fairness can be realized [1]-[8]. Unlike conventional MA, such as time-division multiple access (TDMA), NOMA simultaneously transmits messages to multiple users. The power domain is utilized by NOMA such that different users are served at different power levels. The basic idea of NOMA is motivated by the optimal coding scheme for the broadcast channel (BC) [9], which combines superposition coding at the transmitter with successive interference cancellation (SIC) decoding at the receivers. However, compared to the conventional transmission schemes for the BC, NOMA imposes an additional fairness constraint on transmission, i.e., more power is always allocated to the users with poorer channel conditions, which is different from the conventional waterfilling power allocation scheme. In this sense, NOMA can be viewed as a special case of the superposition coding developed for the BC [10].
Peng Xu and Xuchu Dai are with the Dept. of Electronic Engineering and Information Science, University of Science and Technology of China, P.O. Box No. 4, 230027, Hefei, Anhui, China. Yi Yuan and Zhiguo Ding are with the School of Computing and Communications, Lancaster University, LA1 4WA, UK. Robert Schober is with the Institute for Digital Communications, Friedrich-Alexander University (FAU), Erlangen 91058, Germany.
The capacity region of the degraded discrete memoryless BC was first found by Cover based on superposition coding [9]. The work in [11] then established the capacity region of the Gaussian BC with single-antenna terminals. For the multiple-input multiple-output (MIMO) Gaussian BC, the capacity region can be achieved by applying dirty paper coding (DPC) [12]. Moreover, the ergodic capacity and the outage capacity/probability of the fading BC with perfect channel state information (CSI) at both the transmitter and receivers were studied in [13] and [14], respectively. Compared to ergodic capacity, the concept of outage assumes transmission with a predefined rate, which is more appropriate for applications with strict delay constraints. Two types of outage probabilities were defined in [14], namely the common outage probability (COP) and the individual outage probability (IOP). For the COP, an outage event occurs if any of the users is in outage. For the IOP, the outage events of individual users are considered. For the case where CSI is not available at the transmitter, the outage performance was analyzed in [15].
For the downlink MA scenario with $K$ users, another key performance evaluation criterion is multiuser diversity, where serving the user with the best instantaneous channel gain yields the optimal ergodic sum rate [16], [17]. However, user selection requires a large amount of CSI feedback, which is difficult to implement in practice. Motivated by this, a significant amount of existing work is dedicated to harvesting the multiuser diversity with only quantized CSI at the transmitter [18], [19]. One can refer to the survey in [20] for more details. One of the most spectrally efficient approaches is to employ one-bit feedback for opportunistic user selection, which was proposed for the fading BC in [21]-[26]. The outage performance with one-bit feedback was investigated in [23], [25], and one-bit feedback has also been applied to the MIMO case in [27], [28].
This paper investigates the block fading BC with one-bit feedback from the new perspective of NOMA. The traditional one-bit feedback schemes in [21]-[26] opportunistically select a single user for transmission within each fading block, and hence do not achieve short-term fairness¹ in general. Compared to these works, NOMA emphasizes short-term fairness, which is achieved by having the base station transmit messages to all $K$ users simultaneously using superposition coding. In comparison with the existing works on NOMA assuming availability of perfect CSI at the transmitter (e.g., [3]-[6]), the proposed NOMA scheme with one-bit feedback enjoys a lower overhead, especially when the number of users is large. It is worth pointing out that this one-bit feedback scheme is aligned with how NOMA has been implemented in practice. For example, multiuser superposition transmission (MUST), a downlink two-user version of NOMA, has been included in 3rd generation partnership project long-term evolution advanced (3GPP-LTE-A) networks [29]. For MUST, the base station needs to obtain partial CSI to determine the ordering of the users, and in [29], CSI feedback has been particularly highlighted as a potential enhancement to assist the base station in performing user ordering. Most recently, in [7], [8], the authors have investigated the outage performance of NOMA with statistical CSI knowledge. However, the works in [7], [8] did not consider quantized CSI feedback, and the proposed schemes are fundamentally different from our work.
In this paper, a downlink NOMA system with one-bit feedback is investigated for delay-sensitive applications. Therefore, the outage probability is used as the relevant performance metric. Specifically, the COP is adopted as the performance criterion, which is motivated by the fact that the COP captures the event that outage occurs at any of the users and hence emphasizes short-term fairness compared to the IOP. We derive a closed-form expression for the COP by first defining $(K+1)$ feedback events with respect to the number of channel gains exceeding a predefined threshold, and then analyzing the conditional COP for each event. The optimal diversity gains achieved by the considered NOMA scheme are derived under short-term and long-term power constraints, respectively. Our analysis shows that the diversity gain under the long-term power constraint is twice as large as that under the short-term power constraint. Furthermore, in order to minimize the COP, we study a dynamic power allocation policy based on CSI feedback, i.e., different power allocation schemes are developed for different feedback states. The formulated power allocation problem is challenging since the objective function for minimizing the COP is non-convex. To make this problem tractable, under the short-term power constraint, we first characterize the properties of the optimal power allocation solution, which can be used to transform the problem into a series of convex problems. Under the long-term power constraint, we apply a high signal-to-noise ratio (SNR) approximation and show that the approximated problem is convex. Our analysis shows that, for each feedback event, the optimal solution is in the form of two increasing geometric progressions. An efficient iterative search algorithm is proposed to determine the length of each geometric progression. Numerical results reveal that one-bit feedback significantly improves the outage performance of NOMA compared to the case without CSI feedback.
¹ In this paper, short-term fairness means that user fairness is guaranteed within any fading block, whereas long-term fairness means that user fairness is guaranteed within a large number of fading blocks.
Throughout this paper, we use $P(\cdot)$ to denote the probability of an event, and $E(\cdot)$ to denote the expectation of a random variable. In addition, $\{x_i\}$ denotes the sequence formed by all the possible $x_i$'s, and $[1:K]$ denotes the set $\{1, \cdots, K\}$. Furthermore, $\log(\cdot)$ denotes the logarithm taken to base 2, and $\ln(\cdot)$ denotes the natural logarithm.
... at time instant $t$ within fading block $b$. Here, the noise samples $n_k(t, b)$ at user $k$ are independent and identically distributed complex Gaussian random variables with zero mean and unit variance.
$h_k(b)$ denotes the channel gain from the base station to user $k$ in block $b$, which is assumed to be a zero-mean circularly symmetric complex Gaussian random variable with unit variance.
Moreover, the users have mutually independent channel gains. This paper exclusively considers the case where all codewords span only a single fading block, and the base station transmits one message to each user in each block with the same fixed rate of $r_0$ bits per channel use (BPCU), in order to guarantee fairness [4].
For the sake of brevity, the fading block index $b$ will be omitted in the rest of this paper whenever this does not cause any confusion. Assume that all users have perfect CSI and compare their fading gains to a predefined threshold, denoted by $\alpha$. In particular, given $h_k$, user $k$ feeds back in each fading block a single bit "$Q(h_k)$" to the base station via a zero-delay reliable link, where
$$Q(h_k) = \begin{cases} 0, & |h_k|^2 < \alpha, \\ 1, & |h_k|^2 \ge \alpha. \end{cases} \tag{1}$$

A. User Ordering for NOMA
Denote the channel feedback sequence as $\{Q(h_k)\} \triangleq \{Q(h_1), \cdots, Q(h_K)\}$, which has $2^K$ possible realizations, in each of which the elements are 0 or 1. Based on this feedback, the base station performs power allocation for the $K$ users. To this end, the base station distinguishes only $(K+1)$ categories of realizations of $\{Q(h_k)\}$, and a corresponding random variable is defined in the following.
Definition 1: Define a random variable $N$ with respect to the $K$-dimensional random binary sequence $\{Q(h_k)\}$ as the number of its zero elements, i.e., $N \triangleq \sum_{k=1}^{K} \big(1 - Q(h_k)\big)$. Obviously, $N$ has $(K+1)$ possible realizations, and event $N = n$ represents the case where $n$ users send "0" and the other $K - n$ users send "1", $n \in [0:K]$.
For event $N = n$, the base station uses three steps to determine the user ordering: (i) divide the users into two groups corresponding to feedbacks "0" and "1", denoted as $G_{0|n}$ and $G_{1|n}$, respectively; (ii) allocate the ordering indices $\{1, \cdots, n\}$ to the users in $G_{0|n}$, and the ordering indices $\{n+1, \cdots, K\}$ to the users in $G_{1|n}$; (iii) randomly index (order) the users in the same group since the base station cannot distinguish their fading gains.
Denote the channel gains for the ordered users by $\{h_{\pi_1}, \cdots, h_{\pi_K}\}$. Then, the base station broadcasts the superimposed message $\sum_{k=1}^{K} s_{\pi_k}(t)$ based on the power allocation policy discussed in the next subsection, where $s_{\pi_k}(t)$ is the signal for user $\pi_k$ in the $t$-th channel use of a fading block.
Remark 1: According to the applied user ordering principle, all channels $h_{\pi_k}$ are mutually independent if conditioned on event $N = n$. This is because the two groups $G_{0|n}$ and $G_{1|n}$ are determined by event $N = n$, and all users in the same group are randomly ordered.
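The feedback and ordering steps above can be illustrated with a minimal Python sketch. It assumes that $|h_k|^2$ is exponentially distributed with unit mean (which follows from the unit-variance complex Gaussian channel model) and that a user reports "0" when its gain is below the threshold; the function names `one_bit_feedback` and `order_users` are hypothetical and are introduced only for illustration.

```python
import math
import random


def one_bit_feedback(gains, alpha):
    """Quantize each channel gain |h_k|^2 to one bit.

    Assumed convention: "0" if the gain is below alpha, "1" otherwise.
    """
    return [0 if g < alpha else 1 for g in gains]


def order_users(bits, rng):
    """Return (n, order) for one fading block.

    n is the number of users that fed back "0" (the realization of N).
    order lists the user indices pi_1, ..., pi_K: the "0" group first,
    then the "1" group. Each group is shuffled because the base station
    cannot rank users that sent the same bit.
    """
    group0 = [k for k, b in enumerate(bits) if b == 0]
    group1 = [k for k, b in enumerate(bits) if b == 1]
    rng.shuffle(group0)
    rng.shuffle(group1)
    return len(group0), group0 + group1


if __name__ == "__main__":
    rng = random.Random(0)
    K, alpha = 4, math.log(2)
    gains = [rng.expovariate(1.0) for _ in range(K)]  # |h_k|^2 ~ Exp(1)
    bits = one_bit_feedback(gains, alpha)
    n, order = order_users(bits, rng)
    print("feedback bits:", bits, "N =", n, "ordering:", order)
```

Only the group membership is deterministic in this sketch; the order inside each group changes from call to call, which mirrors step (iii).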

B. Successive Interference Cancellation (SIC)
The users employ SIC to decode their messages, based on the user ordering determined by the base station. As explained in the previous subsection, the ordering of the channels is denoted as $\{h_{\pi_1}, \cdots, h_{\pi_K}\}$. In the SIC process, user $\pi_k$ sequentially decodes the messages of users $\pi_l$, $l \in [1:k]$. Specifically, user $\pi_k$ successively detects the messages of users $\pi_l$, $l < k$, and removes each of them from its observation, such that the interference terms generated by users $\pi_1$ to $\pi_l$ have been canceled when the message of user $\pi_{l+1}$ is detected.

C. Power Constraint
For any block, the power allocated to user $\pi_k$, whose ordering index in the SIC process is $k$, is denoted as $P_k(\{Q(h_k)\})$. While there are $2^K$ possible feedback sequences, the power allocation policy used at the base station depends only on which of the $K+1$ events $N = n$ happens, i.e., the power allocation policies for all sequences corresponding to the same event are identical.
Therefore, the power allocated to user $\pi_k$ is denoted by $P_{k,n}$, i.e., $P_k(\{Q(h_k)\}) = P_{k,n}$ for event $N = n$. We consider two different types of power constraints. In particular, the short-term power constraint requires that the total power allocated to all users within any block cannot exceed $P$, i.e.,
$$\sum_{k=1}^{K} P_{k,n} \le P, \quad \forall n \in [0:K]. \tag{2}$$
In contrast, the considered long-term power constraint ensures that the average total transmission power is constrained, i.e.,
$$\sum_{k=1}^{K} E\big[P_k(\{Q(h_k)\})\big] \le P, \tag{3}$$
where the expectation of $P_k(\{Q(h_k)\})$ can be calculated as
$$E\big[P_k(\{Q(h_k)\})\big] \overset{(a)}{=} \sum_{q} p(q) P_k(q) \overset{(b)}{=} \sum_{n=0}^{K} \sum_{q \in Q_n} p(q) P_k(q) \overset{(c)}{=} \sum_{n=0}^{K} \sum_{q \in Q_n} p(q) P_{k,n} \overset{(d)}{=} \sum_{n=0}^{K} P(N = n) P_{k,n}, \tag{4}$$
where (a) follows from the definitions $q \triangleq \{Q(h_k)\}$ and $p(q) \triangleq P(\{Q(h_k)\} = q)$; (b) follows by partitioning all feedback sequences into the sets $Q_n$, $n \in [0:K]$, where $Q_n$ contains the sequences with exactly $n$ zeros; (c) holds since $P_k(q) = P_{k,n}$ if $q \in Q_n$, as shown at the beginning of this subsection; (d) holds since $P(N = n) = \sum_{q \in Q_n} p(q)$ according to Definition 1. Thus, the long-term power constraint in (3) can be rewritten as
$$\sum_{n=0}^{K} P(N = n) \sum_{k=1}^{K} P_{k,n} \le P. \tag{5}$$
Remark 2: Both types of power constraints are widely used in the related literature, e.g., [22], [24], [25], [30]. The short-term power constraint is appropriate for applications with strict peak power constraints, whereas the long-term power constraint is appropriate for applications with average power constraints.
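As a numerical illustration of the two constraints, the following Python sketch checks a given allocation table against a per-event (short-term) budget and an average (long-term) budget. It assumes that each user independently feeds back "0" with probability $1 - e^{-\alpha}$ (which follows from unit-mean exponential channel gains), so that $N$ is binomially distributed; the function names and the example allocation are hypothetical.

```python
import math


def event_probs(K, alpha):
    """P(N = n) for n = 0..K, assuming i.i.d. gains |h_k|^2 ~ Exp(1).

    Each user sends "0" with probability p0 = 1 - exp(-alpha), so N is
    binomial with parameters (K, p0).
    """
    p0 = 1.0 - math.exp(-alpha)
    return [math.comb(K, n) * p0**n * (1.0 - p0) ** (K - n) for n in range(K + 1)]


def satisfies_short_term(alloc, P, tol=1e-9):
    """alloc[n][k] is the power of the (k+1)-th ordered user for event N = n.

    The short-term constraint bounds the sum power of every event by P.
    """
    return all(sum(row) <= P + tol for row in alloc)


def satisfies_long_term(alloc, probs, P, tol=1e-9):
    """The long-term constraint bounds the sum power averaged over the events."""
    average = sum(p * sum(row) for p, row in zip(probs, alloc))
    return average <= P + tol


if __name__ == "__main__":
    probs = event_probs(2, math.log(2))  # [0.25, 0.5, 0.25]
    alloc = [[1.0, 1.0], [3.0, 1.0], [4.0, 2.0]]  # hypothetical allocation
    print(satisfies_short_term(alloc, 4.0), satisfies_long_term(alloc, probs, 4.0))
```

In this example the allocation spends more than the budget when both users report "0" and less when both report "1", which the long-term constraint permits but the short-term constraint does not.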

III. OUTAGE PROBABILITY
In this section, the outage probability of the NOMA system considered in Section II will be analyzed. However, first, some useful preliminary results are provided in the next subsection.

A. Preliminary Results
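We first analyze the conditional probability $P(|h_{\pi_k}|^2 < x \mid N = n)$, where random variable $N$ is defined in Definition 1. Based on the user ordering in Section II, we know that, for event $N = n$, $|h_{\pi_k}|^2 < \alpha$ for $k \in [1:n]$ and $|h_{\pi_k}|^2 \ge \alpha$ for $k \in [n+1:K]$. In addition, all channels $h_{\pi_k}$ are mutually independent if conditioned on event $N = n$, as explained in Remark 1. Thus, for $k \in [1:n]$, we have
$$P(|h_{\pi_k}|^2 < x \mid N = n) = \begin{cases} \dfrac{1 - e^{-x}}{1 - e^{-\alpha}}, & 0 \le x < \alpha, \\ 1, & x \ge \alpha. \end{cases} \tag{6}$$
Similarly, for $k \in [n+1:K]$, we have
$$P(|h_{\pi_k}|^2 < x \mid N = n) = \begin{cases} 0, & x < \alpha, \\ 1 - e^{-(x - \alpha)}, & x \ge \alpha. \end{cases} \tag{7}$$
Next, the expressions for the signal-to-interference-plus-noise ratios (SINRs) at the receivers will be developed. As explained in Section II-B, SIC is adopted in the decoding process and the ordering of the channels is denoted as $\{h_{\pi_1}, \cdots, h_{\pi_K}\}$. Thus, the SINR for user $\pi_k$ to decode the message of user $\pi_l$, $l \in [1:k]$, is given by [9]
$$\mathrm{SINR}_{l \to k} = \frac{|h_{\pi_k}|^2 P_{l,n}}{|h_{\pi_k}|^2 \sum_{m=l+1}^{K} P_{m,n} + 1}. \tag{8}$$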
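The SIC decoding rule can be made concrete with a short Python sketch. It assumes unit noise variance, that the messages of users ordered after user $\pi_l$ act as interference when message $l$ is decoded, and that a message is decodable when the SINR reaches $2^{r_0} - 1$; the function names are hypothetical.

```python
def sinr(gain, powers, l):
    """SINR at a user with channel gain |h|^2 when decoding message l (0-based).

    Messages l+1, ..., K-1 are treated as interference; the noise has unit
    variance, as in the system model.
    """
    interference = gain * sum(powers[l + 1:])
    return gain * powers[l] / (interference + 1.0)


def user_decodes(gain, powers, k, r0):
    """True if the k-th ordered user (0-based) decodes messages 0..k via SIC."""
    threshold = 2.0**r0 - 1.0
    return all(sinr(gain, powers, l) >= threshold for l in range(k + 1))


def common_outage(ordered_gains, powers, r0):
    """True if at least one user fails, i.e., a common outage event occurs."""
    return not all(
        user_decodes(gain, powers, k, r0)
        for k, gain in enumerate(ordered_gains)
    )


if __name__ == "__main__":
    powers = [3.0, 1.0]  # more power for the user ordered first
    print(common_outage([1.0, 1.0], powers, r0=1.0))  # both users decode
    print(common_outage([0.4, 1.0], powers, r0=1.0))  # first user fails
```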

B. Outage Probability
This paper adopts the COP [14] as the performance criterion for the considered NOMA system since short-term fairness can be guaranteed with this criterion. The COP is provided in the following theorem.
Theorem 1: The COP of the considered one-bit NOMA scheme can be expressed as in (9), where $\mathbf{P}_n \triangleq \{P_{1,n}, \cdots, P_{K,n}\}$ is the power allocation sequence for event $N = n$; $P_n(\alpha)$ and $P^{\mathrm{Indiv}}_{k,n}(\alpha, \mathbf{P}_n)$ are defined in (10) and (11), respectively, with the definition $\hat{\zeta}_{k,n} \triangleq \max\{\zeta_{1,n}, \cdots, \zeta_{k,n}\}$, and $\zeta_{k,n}$ is given in (12). Proof: Please refer to Appendix A.
Note that in (12), we have implicitly assumed that $\zeta_{k,n} \ge 0$, i.e., that the power allocation satisfies the constraint in (13). Such a constraint on power allocation is typical for NOMA systems [3], [4], [6], where a user with poorer channel conditions has to be allocated more power in order to guarantee fairness. In addition, in order to facilitate the use of different power constraints in the following discussions, we express $\{P_{k,n}\}$ as a function of $\{\zeta_{k,n}\}$ as in (14), which is obtained from (12) by applying mathematical induction. Thus, the sum power for event $N = n$ can be expressed as in (15).
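The change of variables between powers and decoding thresholds can be sketched in Python as follows. The sketch assumes that $\zeta_{k,n}$ is the smallest channel gain for which message $k$ is decodable, i.e., $\zeta_k = \hat{r}/(P_k - \hat{r}\sum_{m>k} P_m)$ with $\hat{r} = 2^{r_0} - 1$, which is what the SINR requirement implies under unit noise variance; this parametrization and all function names are assumptions made for illustration and may differ in form from (12), (14), and (15).

```python
def zetas_from_powers(powers, r0):
    """Map powers P_1..P_K of one event to thresholds zeta_1..zeta_K.

    Assumed relation: zeta_k = r / (P_k - r * sum_{m>k} P_m), r = 2^r0 - 1.
    A non-positive denominator means message k can never be decoded.
    """
    r = 2.0**r0 - 1.0
    zetas = []
    for k, p in enumerate(powers):
        denom = p - r * sum(powers[k + 1:])
        if denom <= 0.0:
            raise ValueError("power of user %d is too small" % (k + 1))
        zetas.append(r / denom)
    return zetas


def powers_from_zetas(zetas, r0):
    """Invert the map by backward recursion, starting from the last user."""
    r = 2.0**r0 - 1.0
    powers = [0.0] * len(zetas)
    tail = 0.0  # sum of the powers of users ordered after the current one
    for k in reversed(range(len(zetas))):
        powers[k] = r / zetas[k] + r * tail
        tail += powers[k]
    return powers


def sum_power(zetas, r0):
    """Closed-form sum power: r * sum_k 2^{r0 (k-1)} / zeta_k."""
    r = 2.0**r0 - 1.0
    return r * sum(2.0 ** (r0 * k) / z for k, z in enumerate(zetas))


if __name__ == "__main__":
    zetas = zetas_from_powers([3.0, 1.0], r0=1.0)
    print(zetas, powers_from_zetas(zetas, 1.0), sum_power(zetas, 1.0))
```

The backward recursion makes the induction step explicit: the power of user $k$ depends only on its own threshold and on the total power of the users ordered after it.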

C. Diversity Gain
In order to provide some insight into the outage performance, in this subsection, we analyze the diversity gains of the COP in (9) under the short-term and long-term power constraints.The diversity gain is defined as follows.
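Definition 2: The diversity gain based on the COP is defined as
$$d \triangleq -\lim_{P \to \infty} \frac{\log P^{\mathrm{Common}}}{\log P}. \tag{16}$$
In addition, the diversity gain in (16) can also be expressed as $P^{\mathrm{Common}} \doteq P^{-d}$.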
Then, the following two lemmas provide the diversity gains of the COP under the short-term and long-term power constraints.
Lemma 1: Under the short-term power constraint in (2), the maximum achievable diversity gain of the considered NOMA scheme is 1.
Proof: We consider a specific power allocation scheme such that the values of the $\zeta_{k,n}$'s in (12) are identical. Based on this power allocation scheme, we will show that a diversity gain of 1 can be achieved. The feedback threshold is set as $\alpha = \ln(2)$ for simplicity. Note that one can also choose any other value of $\alpha$ to achieve a diversity gain of 1, which means that the maximum diversity gain can be achieved for any $\alpha$. Then, a lower bound on the COP is derived to prove that a diversity gain of 1 is optimal for all possible power allocation schemes and all possible choices of threshold $\alpha$. Details of the proof are provided in Appendix B.
Lemma 2: Under the long-term power constraint in (5), the maximum achievable diversity gain of the considered NOMA scheme is 2, which is achieved only if $\alpha$ satisfies $\alpha \doteq P^{-1}$.
Proof: We consider a specific power allocation scheme such that the $\zeta_{k,n}$'s in (12) have the same value for a given $n$. We also choose a threshold $\alpha$ such that no outage occurs for event $N = 0$ (i.e., all the users feed back "1"). Then, a lower bound on the COP is derived to prove that a diversity gain of 2 is optimal for all possible power allocation schemes and all possible choices of threshold $\alpha$, under the long-term power constraint. Details of the proof are provided in Appendix C.
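To make the outage analysis of this section tangible, the following Python sketch evaluates a COP expression derived directly from the system model and cross-checks it by Monte Carlo simulation. It is a sketch under stated assumptions, not a transcription of Theorem 1: it assumes unit-mean exponential channel gains, the ordering rule of Section II-A, and that the $k$-th ordered user succeeds if and only if its gain is at least $\hat{\zeta}_{k,n} = \max\{\zeta_{1,n}, \cdots, \zeta_{k,n}\}$. All function names are hypothetical.

```python
import math
import random


def event_prob(K, n, alpha):
    """P(N = n): n of K users have a gain below alpha (gains ~ Exp(1))."""
    p0 = 1.0 - math.exp(-alpha)
    return math.comb(K, n) * p0**n * (1.0 - p0) ** (K - n)


def success_prob(zeta_hat, alpha, weak):
    """P(|h|^2 >= zeta_hat) given the user's feedback bit.

    weak=True: gain known to be below alpha (truncated exponential).
    weak=False: gain known to be at least alpha (shifted exponential).
    """
    if weak:
        if zeta_hat >= alpha:
            return 0.0
        return (math.exp(-zeta_hat) - math.exp(-alpha)) / (1.0 - math.exp(-alpha))
    return 1.0 if zeta_hat <= alpha else math.exp(-(zeta_hat - alpha))


def cop_closed_form(zetas, alpha):
    """zetas[n][k] is the threshold of the (k+1)-th ordered user for N = n."""
    K = len(zetas) - 1
    total_success = 0.0
    for n in range(K + 1):
        prod, zeta_hat = 1.0, 0.0
        for k in range(K):
            zeta_hat = max(zeta_hat, zetas[n][k])
            prod *= success_prob(zeta_hat, alpha, weak=(k < n))
        total_success += event_prob(K, n, alpha) * prod
    return 1.0 - total_success


def cop_monte_carlo(zetas, alpha, trials, rng):
    """Simulate feedback, ordering, and decoding; return the outage frequency."""
    K = len(zetas) - 1
    outages = 0
    for _ in range(trials):
        gains = [rng.expovariate(1.0) for _ in range(K)]
        weak = [g for g in gains if g < alpha]
        strong = [g for g in gains if g >= alpha]
        rng.shuffle(weak)
        rng.shuffle(strong)
        n, zeta_hat, ok = len(weak), 0.0, True
        for k, g in enumerate(weak + strong):
            zeta_hat = max(zeta_hat, zetas[n][k])
            if g < zeta_hat:
                ok = False
                break
        outages += 0 if ok else 1
    return outages / trials


if __name__ == "__main__":
    alpha = math.log(2)
    zetas = [[0.2, 0.3] for _ in range(3)]  # K = 2, same thresholds for all n
    print(cop_closed_form(zetas, alpha))
    print(cop_monte_carlo(zetas, alpha, 20000, random.Random(1)))
```

Evaluating `cop_closed_form` for thresholds that shrink with the power budget and plotting the result on a log-log scale is one way to inspect the slopes stated in Lemmas 1 and 2 numerically.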

IV. POWER ALLOCATION
Existing works have demonstrated that power allocation has a significant impact on the outage performance in conventional multiple access scenarios [14], [31], [32]. Motivated by this, in this section, we formulate a power allocation problem to minimize the COP $P^{\mathrm{Common}}$ in (9), under short-term and long-term power constraints.

A. Problem Formulation
The optimization problem for the short-term power constraint can be formulated as follows:
$$\min_{\alpha, \{P_{k,n}\}} \; P^{\mathrm{Common}} \quad \text{s.t. (2) and (13)}. \tag{17}$$
Similarly, the optimization problem for the long-term power constraint can be formulated as follows:
$$\min_{\alpha, \{P_{k,n}\}} \; P^{\mathrm{Common}} \quad \text{s.t. (5) and (13)}. \tag{18}$$
To simplify the above two problems, variable transformation according to (12) is applied, and the problem in (17) is transformed into the equivalent problem (P1) in (19), where the power constraint is obtained from (2) and (15). Note that, according to (14), the optimal power allocation scheme can be found once the optimal values of $\{\zeta_{k,n}\}$ are obtained. Similarly, the problem in (18) can be transformed into the equivalent problem (P2) in (20). The benefit of using the transformed problems in (19) and (20) is that the number of constraints is reduced. However, problems (P1) and (P2) still involve a non-convex objective function and are difficult to solve. There are $(K(K+1)+1)$ optimization variables in total, including $K(K+1)$ power variables $\zeta_{k,n}$ and one threshold variable $\alpha$. In the subsequent subsections, we first address the power allocation problem for a fixed threshold $\alpha$, and then utilize a one-dimensional search to find the optimal $\alpha$.

B. Short-Term Power Constraint
For a fixed $\alpha$, $P_n(\alpha)$ is also fixed, and therefore, the objective in (19a) is additive, where the $n$-th subfunction depends only on the variable vector $\zeta_n \triangleq \{\zeta_{1,n}, \cdots, \zeta_{K,n}\}$, $0 \le n \le K$. Moreover, the constraints in (19b) and (19c) are uncoupled with respect to the $(K+1)$ variable vectors $\zeta_n$, $0 \le n \le K$. Hence, the joint optimization problem (P1) can be decomposed into $(K+1)$ decoupled subproblems without loss of optimality, where the $n$-th subproblem has the form in (21), with objective function $f_{1,n}$ in (21a). As shown in (11), $P^{\mathrm{Indiv}}_{k,n}$ is a non-convex function. The following proposition shows how to simplify $P^{\mathrm{Indiv}}_{k,n}$ for $k \in [n+1:K]$.
Proposition 1: The optimal solution of problem (21) satisfies $\hat{\zeta}_{k,n} \ge \alpha$, $\forall k \in [n+1:K]$.
Proof: From (11), we have $P^{\mathrm{Indiv}}_{k,n} = 0$ when $\hat{\zeta}_{k,n} \le \alpha$, $\forall k \in [n+1:K]$, which means that, once $\hat{\zeta}_{k,n} \le \alpha$, further decreasing $\hat{\zeta}_{k,n}$ can neither decrease $P^{\mathrm{Indiv}}_{k,n}$ nor increase $f_{1,n}$ in (21a). Thus, once $\hat{\zeta}_{k,n} \le \alpha$, we only need to consider the case of $\hat{\zeta}_{k,n} = \alpha$, since this leads to a lower power consumption for user $k$ (i.e., $P_{k,n}$) than the case of $\hat{\zeta}_{k,n} < \alpha$, as is obvious from (12).
In summary, the case of $\hat{\zeta}_{k,n} < \alpha$ can be ignored and the optimal solution of the considered optimization problem satisfies $\hat{\zeta}_{k,n} \ge \alpha$.
We can also simplify the functions $P^{\mathrm{Indiv}}_{k,n}$ for $k \in [1:n]$ by considering $\hat{\zeta}_{k,n} \le \alpha$ only, as explained in the following. As shown in (11), if $\hat{\zeta}_{k,n} > \alpha$, $k \in [1:n]$, we have $P^{\mathrm{Indiv}}_{k,n} = 1$, and the objective function in (21a) takes its worst value (i.e., $f_{1,n} = 0$) among the possible values between 0 and 1. Exploiting the above considerations, the problem in (21) can be simplified as in (22), which requires a minimum transmit power $P$, as is obvious from (14). Note that if this requirement on the transmit power is not satisfied, the condition $\hat{\zeta}_{k,n} \le \alpha$, $k \in [1:n]$, cannot be met for any power allocation, i.e., the COP for event $N = n$ must be 1 in this case.
To further simplify this problem, we introduce another proposition which allows the elimination of $\hat{\zeta}_{k,n}$.
Proposition 2: The optimal solution of problem (22) satisfies $\zeta_{1,n} \le \zeta_{2,n} \le \cdots \le \zeta_{K,n}$. Proof: Consider first $\zeta_{1,n}$ and $\zeta_{2,n}$. If $\zeta_{2,n} < \zeta_{1,n}$, then $\hat{\zeta}_{2,n} = \zeta_{1,n}$, so that increasing $\zeta_{2,n}$ to $\zeta_{1,n}$ leaves the objective function unchanged while reducing the required power, as can be seen from (12). Therefore, we can ignore the case $\zeta_{2,n} < \zeta_{1,n}$ and only consider the case $\zeta_{2,n} \ge \zeta_{1,n}$ without loss of optimality. Carrying out the above steps iteratively, the proposition is proved. Using Proposition 2, the problem in (22) can be transformed into the problem in (23) with objective function $f_{2,n}$. The objective function $f_{2,n}$ is still non-convex. However, by taking the natural logarithm of $f_{2,n}$, the problem in (22) (i.e., the $n$-th subproblem of problem (P1) in (19) for a fixed $\alpha$) can be transformed into the equivalent convex problem (P1.n) in (24). One can calculate the Hessian matrices of the objective function and of the constraint in (24b) to verify that this problem is convex. This convex optimization problem will be solved later in Section V using corresponding numerical solvers, since a closed-form expression for the optimal solution of problem (P1.n) is difficult to obtain.
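The convexity claim can be checked by hand under a plausible form of the transformed subproblem. The following derivation is a sketch under assumptions: it takes the per-event success probability to be a product of truncated-exponential terms for the $n$ users with feedback "0" and shifted-exponential terms for the remaining users, and it takes the sum power to be $\hat{r}\sum_k 2^{(k-1) r_0}/\zeta_{k,n}$ with $\hat{r} = 2^{r_0} - 1$. These forms follow from the system model but are not copied from (23) and (24).

```latex
% Assumed objective for event N = n, with \zeta_{k,n} \le \alpha for k \le n
% and \zeta_{k,n} \ge \alpha for k > n:
\begin{align}
f_{2,n} &= \prod_{k=1}^{n} \frac{e^{-\zeta_{k,n}} - e^{-\alpha}}{1 - e^{-\alpha}}
           \prod_{k=n+1}^{K} e^{-(\zeta_{k,n} - \alpha)}, \\
\ln f_{2,n} &= \sum_{k=1}^{n} \ln\!\left(e^{-\zeta_{k,n}} - e^{-\alpha}\right)
           - n \ln\!\left(1 - e^{-\alpha}\right)
           - \sum_{k=n+1}^{K} \left(\zeta_{k,n} - \alpha\right).
\end{align}
% The Hessian of \ln f_{2,n} is diagonal. For k \le n and \zeta_{k,n} < \alpha,
\begin{equation}
\frac{\partial^2 \ln f_{2,n}}{\partial \zeta_{k,n}^2}
  = -\frac{e^{-\alpha}\, e^{-\zeta_{k,n}}}{\left(e^{-\zeta_{k,n}} - e^{-\alpha}\right)^2} < 0,
\end{equation}
% and the entries for k > n are zero, so \ln f_{2,n} is concave.
% Assumed sum-power constraint:
\begin{equation}
g_n(\zeta_n) = \hat{r} \sum_{k=1}^{K} \frac{2^{(k-1) r_0}}{\zeta_{k,n}} \le P,
\qquad
\frac{\partial^2 g_n}{\partial \zeta_{k,n}^2}
  = \frac{2 \hat{r}\, 2^{(k-1) r_0}}{\zeta_{k,n}^3} > 0 \quad (\zeta_{k,n} > 0),
\end{equation}
% so g_n is convex and maximizing \ln f_{2,n} over this set is a convex problem.
```

Under these assumptions, the ordering constraints $\zeta_{1,n} \le \cdots \le \zeta_{K,n}$ and the bounds relative to $\alpha$ are linear, so they do not affect convexity.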
Furthermore, the optimal value of $\alpha$ in problem (P1) in (19) can be found by applying a one-dimensional search. It is worth pointing out that the optimal $\alpha$ has a finite value. This is because the probability that all users feed back the message "0" goes to 1 (i.e., $P_K(\alpha) \to 1$) if $\alpha$ is sufficiently large, which is equivalent to the case without CSI feedback.

C. Long-Term Power Constraint
1) Approximation for High SNR: Compared to problem (P1), problem (P2) in (20) is more challenging, since the decoupling approach used to solve problem (P1) is not applicable. In this subsection, we focus on the high SNR approximation of the objective function (i.e., $P^{\mathrm{Common}}$) in order to simplify the problem. Specifically, the objective function is first simplified for high SNR, the optimal solution of this approximated problem is then obtained for a fixed $\alpha$, and finally a one-dimensional search is used to find the optimal value of $\alpha$.
Based on Propositions 1 and 2, problem (P2) can be simplified as problem (P3) in (25). The following proposition shows that problem (P3) can be approximately transformed into a convex problem at high SNR.
Proposition 3: At high SNR, problem (P3) in (25) can be approximately transformed into the convex problem (P4) defined in (26). Proof: Please refer to Appendix D. Let (P5) denote problem (P4) with constraint (26d) removed. We will show in Proposition 4 that problems (P4) and (P5) are exactly equivalent, i.e., the optimal solution of problem (P5) automatically satisfies constraint (26d). The Lagrangian function for problem (P5) is given by (27), where $\lambda_{k,n}, \omega \ge 0$ are Lagrange multipliers. The Karush-Kuhn-Tucker (KKT) conditions are given by (28), and the complementary slackness conditions can be expressed as in (29a)-(29c). From (28) and (29a)-(29c), we have $\omega > 0$, $\lambda_{k,n} = 0$ for $k \in [1:n]$, $n \in [1:K]$, and the optimal $\zeta_{k,n}$ can be expressed as in (30). The Lagrange multipliers are difficult to obtain directly. Hence, we first study the properties of the optimal power allocation. The following proposition demonstrates that the constraint in (26d) is always satisfied.
Once all $i_n$'s are given, the optimal solution for the $\zeta_{k,n}$'s can be easily obtained as follows.
Theorem 2: If all integers $i_n \in [0:K-n]$ defined in Definition 4 are known, the optimal solution of problems (P4) and (P5) can be expressed as in (31) for each $n \in [0:K]$, where $\omega$ is given in (32).
Proof: For given $i_n$'s, the optimal $\zeta_{k,n}$ in (31) can be obtained. Moreover, since $\omega > 0$ in (29a), the long-term power constraint holds with equality. Substituting the $\zeta_{k,n}$ in (31) into this equality, we obtain $\omega$ as shown in (32).
Algorithm I: Proposed search for $\{i_n\}$ defined in Definition 4.
Remark 5: Theorem 2 shows that the optimal solution $\{\zeta_{1,n}, \cdots, \zeta_{K,n}\}$ is in the form of two increasing geometric progressions with some constant values $\alpha$ between them. Interestingly, parameter $n$, which represents the feedback event $N = n$, only affects the lengths of the two geometric progressions, but does not affect the values of their elements.
3) Search Algorithm for $\{i_n^*\}$: What remains is to determine the unique integer sequence, denoted by $\{i_n^*\}$, such that all complementary slackness conditions are satisfied. We know that $\lambda_{k,n} = 0$ for $k \in [1:n]$, so we only need to choose $\{i_n^*\}$ such that the remaining conditions for $k \in [n+1:K]$ are satisfied. Note that, given $\{i_n\}$, the multipliers $\lambda_{k,n}$ can be obtained from (30). Unfortunately, a closed-form solution for the $i_n^*$ does not exist. Hence, we design an efficient iterative algorithm to find $\{i_n^*\}$, as summarized in Algorithm I. Specifically, the search starts from $i_n^{(1)} = 0$, $\forall n \in [0:K]$, and the main idea is to narrow down the search range of a certain number of $i_n^*$'s in each iteration, by enlarging the lower bounds on these $i_n^*$'s. The following theorem ensures that the unique sequence $\{i_n^*\}$ can be found by the proposed algorithm, i.e., Algorithm I converges.
Theorem 3: The update strategy for each $i_n$ proposed in Algorithm I guarantees that $\{i_n^*\}$ must be found. Proof: Please refer to Appendix F.
According to (31), at most $K+1$ iterations are required to find $\{i_n^*\}$, which means that the proposed algorithm enjoys low complexity compared to an exhaustive search, which would have complexity $O((K+1)!)$.

V. NUMERICAL RESULTS
In this section, computer simulation results are provided to evaluate the outage performance of the considered NOMA scheme with one-bit feedback.

A. Benchmark Schemes
Some benchmark transmission and power allocation schemes are considered as explained in the following.

1) TDMA Scheme:
The first benchmark scheme is TDMA transmission with one-bit feedback since it is equivalent to any orthogonal multiple access scheme [33, Sec. 6.1.3]. For TDMA transmission, assume that each fading block is equally divided into $K$ time slots, and user $k$ is served during the $k$-th time slot. The power allocated to user $k$ is denoted by $P^{T}_{k,n}$ for each event $N = n$, where $N = n$ is defined in Definition 1 based on the feedback sequence. The short-term and long-term power constraints in TDMA are defined analogously to those in (2) and (5), respectively. Now, similar to problems (P1) and (P2) in (19) and (20), one can formulate two power allocation problems for TDMA transmission under short-term and long-term power constraints as shown in (38) and (39), respectively. We can solve the two new problems using similar approaches as in Section IV. The details are omitted here due to space limitations.
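The outage condition of the TDMA benchmark can be sketched in a few lines of Python. The sketch assumes that a user served in one of $K$ equal slots must support the rate $K r_0$ during its slot in order to achieve $r_0$ BPCU over the whole block, and that the noise has unit variance; how the slot powers are normalized against the budget $P$ is left to the caller, since that normalization is an assumption not fixed here. The function names are hypothetical.

```python
def tdma_threshold(power, K, r0):
    """Smallest channel gain |h|^2 that supports rate r0 over the block.

    The user transmits only during 1/K of the block, so the slot rate must
    be K * r0: log2(1 + power * gain) >= K * r0.
    """
    return (2.0 ** (K * r0) - 1.0) / power


def tdma_common_outage(gains, powers, r0):
    """True if at least one user cannot decode its message in its slot."""
    K = len(gains)
    return any(g < tdma_threshold(p, K, r0) for g, p in zip(gains, powers))


if __name__ == "__main__":
    print(tdma_threshold(3.0, K=2, r0=1.0))                    # required gain
    print(tdma_common_outage([1.0, 2.0], [3.0, 3.0], r0=1.0))  # no outage
    print(tdma_common_outage([0.9, 2.0], [3.0, 3.0], r0=1.0))  # outage
```

The exponent $K r_0$ grows linearly with the number of users, which gives one intuition for why the orthogonal scheme becomes costly for large $K$.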
2) Fixed NOMA: In order to show the benefits of the proposed power allocation schemes, NOMA with fixed power allocation using one-bit feedback is used as the second benchmark scheme. Due to its simplicity, fixed NOMA has been adopted in many relevant works (e.g., [3], [5]). Specifically, we also utilize the NOMA transmission scheme in Section II, but fix the power allocation as follows: under the short-term power constraint, we let all $\zeta_{k,n}$'s be identical; under the long-term power constraint, we let the $\zeta_{k,n}$'s have the same value for a given $n$. Note that such power allocation schemes have been utilized in Appendices B and C to prove Lemmas 1 and 2, respectively. The optimal $\alpha$ is also obtained via a one-dimensional search, for fairness of comparison.
3) NOMA without Feedback: In order to show the benefits of using one-bit feedback, the third benchmark scheme is NOMA without CSI feedback, i.e., the base station only has the average CSI, but has neither the instantaneous CSI nor the ordering information [8]. In this case, the base station randomly orders the users and utilizes only one power allocation for all fading blocks, so that the long-term power constraint reduces to the short-term power constraint.
Note that NOMA without CSI is a special case of the considered NOMA with one-bit feedback when we set $\alpha = 0$ or $\alpha = \infty$.
4) NOMA with Perfect CSI: Finally, NOMA with perfect CSI is considered as a lower bound on the COP. With perfect CSI, the base station informs the users of the optimal ordering of all channel gains, and knows the power required by the users within any block for correct decoding. In this case, we only consider the short-term power constraint, where an outage event occurs if the required power is larger than $P$ [34]. For the long-term power constraint, an outage probability of zero can be achieved when $P$ is sufficiently large, as shown in [14]; this case will not be considered in this section.
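For the perfect-CSI benchmark, the required power of a block can be sketched as follows. The sketch assumes that the users are ordered from the weakest to the strongest channel, that each decoding threshold is set equal to the corresponding channel gain, and that the resulting sum power is $\hat{r}\sum_k 2^{(k-1) r_0}/|h_{(k)}|^2$ with $\hat{r} = 2^{r_0} - 1$; this is a hedged illustration of the outage rule described above, and the function names are hypothetical.

```python
def required_sum_power(gains, r0):
    """Sum power needed so that every user decodes, given all gains |h_k|^2.

    Assumption: the weakest user is ordered first and receives the most
    power; the threshold of the k-th ordered user equals its own gain.
    """
    r = 2.0**r0 - 1.0
    ordered = sorted(gains)  # weakest channel first
    return r * sum(2.0 ** (r0 * k) / g for k, g in enumerate(ordered))


def perfect_csi_outage(gains, r0, P):
    """Under the short-term constraint, outage occurs if the need exceeds P."""
    return required_sum_power(gains, r0) > P


if __name__ == "__main__":
    gains = [1.0, 0.5]
    print(required_sum_power(gains, r0=1.0))          # power needed in this block
    print(perfect_csi_outage(gains, r0=1.0, P=4.0))   # budget is just sufficient
    print(perfect_csi_outage(gains, r0=1.0, P=3.9))   # budget is insufficient
```

Averaging `perfect_csi_outage` over many independently drawn gain vectors gives a Monte Carlo estimate of the benchmark COP for a given $P$.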

B. Short-Term Power Constraint
This subsection focuses on the outage performance of NOMA with one-bit feedback under the short-term power constraint in (17). Figs. 1, 2, and 3 compare the outage performance of NOMA employing the optimal power allocation scheme proposed in Section IV-B with the benchmark schemes defined in the previous subsection as a function of the SNR, the transmission rate $r_0$, and the number of users $K$, respectively. These figures demonstrate that NOMA with optimal power allocation outperforms the TDMA scheme, fixed NOMA, and NOMA without feedback.
Fig. 1. COP versus SNR (dB) under the short-term power constraint, where $K = 3$, the target rate is $r_0 = 1$ BPCU for each user, and "PA" stands for "power allocation". Curves: fixed NOMA, NOMA without feedback, TDMA scheme, NOMA with proposed optimal PA, and NOMA with perfect CSI.
As can be observed in Fig. 1, all the curves have almost the same slope at high SNR, but a constant gap exists between the proposed scheme and each benchmark scheme. This is because all the schemes achieve the same diversity gain of 1 (Lemma 1) under the short-term power constraint. In addition, the performance of the proposed NOMA scheme with one-bit feedback approaches that of NOMA with perfect CSI at high SNR, which means that the one-bit feedback is effectively used by the proposed scheme to improve the outage performance. Fig. 2 reveals that NOMA with the proposed optimal power allocation has almost the same COP as the TDMA scheme when $r_0 = 0.1$, but outperforms the latter as $r_0$ increases. For example, when $r_0 = 1.3$, these two schemes have COPs of approximately 0.15 and 0.23, respectively. Finally, as shown in Fig. 3, the COPs of all schemes increase with the number of users. In particular, the gap between the proposed NOMA scheme and the TDMA scheme grows as $K$ increases. This is because, compared to the orthogonal TDMA scheme, NOMA is more spectrally efficient in the sense that all users are served simultaneously.

C. Long-Term Power Constraint
This subsection focuses on the outage performance of NOMA with one-bit feedback under the long-term power constraint in (18). Figs. 4, 5, and 6 compare the outage performance of NOMA with the asymptotically optimal power allocation scheme proposed in Section IV-C with the benchmark schemes in Section V-A and NOMA under the short-term power constraint as a function of the SNR, the transmission rate $r_0$, and the number of users $K$, respectively. As can be seen in Fig. 4, under the long-term power constraint, the COPs of NOMA with the proposed power allocation, the TDMA scheme, and fixed NOMA have the same slope at high SNR, which is due to the fact that all these schemes achieve a diversity gain of 2 (Lemma 2). However, fixed NOMA performs poorly, especially at high SNR. This implies that the power allocation scheme proposed in Section IV-C plays an important role in improving the outage performance. Note that, although the power allocation scheme proposed in Section IV-C is based on the high-SNR approximation, it also performs well at low SNR compared to NOMA under the short-term power constraint. As can be observed in Fig. 5, the fixed NOMA scheme also performs poorly, especially for large transmission rates $r_0$. NOMA with the proposed asymptotically optimal long-term power allocation scheme has the best outage performance among the considered schemes. When $r_0 = 1.3$, NOMA with the proposed power allocation scheme achieves a COP of approximately 0.07, whereas the TDMA scheme achieves a COP of approximately 0.15. Finally, as shown in Fig. 6, the gap between the proposed NOMA scheme and the TDMA scheme increases as $K$ increases. The TDMA scheme with the long-term power constraint has a COP even higher than that of the NOMA scheme with the short-term power constraint, which means that the TDMA scheme is not suitable for scenarios with large numbers of users due to its poor spectral efficiency.

Fig. 3. COP versus the number of users under the short-term power constraint, where the target transmission rate is $r_0 = 1$ BPCU for each user, and the SNR is 30 dB.

Fig. 5. COP versus transmission rate under the long-term power constraint, where the number of users is $K = 3$, and the SNR is 20 dB.

Fig. 6. COP versus the number of users under the long-term power constraint, where the target transmission rate is $r_0 = 1$ BPCU for each user, and the SNR is 30 dB.

Fig. 7 illustrates the optimal threshold $\alpha^*$ versus the number of users, $K$, where the target transmission rate is $r_0 = 1$ BPCU for each user, and the SNR is either 20 dB or 22 dB. As can be observed in this figure, the optimal threshold increases significantly with the number of users and decreases with the SNR. The optimal threshold decreases with the SNR for the following reason. Recall that, compared to the case of perfect CSI, the disadvantage of using one-bit feedback is that a user with a poor channel may be categorized as a user with a strong channel and hence given less transmit power. A good choice of $\alpha$ should avoid this problem as much as possible. For example, consider a scenario with two users, where the users' channels are ordered as $|h_1|^2 \le |h_2|^2$. When the transmit power approaches infinity, one type of outage event is due to the situation where users have very poor channel conditions, i.e., $|h_i|^2 \to 0$, $i \in \{1, 2\}$. In this case, a good choice of $\alpha$ is $|h_1|^2 \le \alpha \le |h_2|^2$, which means $\alpha \to 0$.
This intuition can also be confirmed by the analytical results developed for the case with the long-term power constraint. In particular, Lemma 2 demonstrates that the maximum diversity gain can be achieved only when the threshold $\alpha$ satisfies $\alpha \doteq P^{-1}$, i.e., the optimal threshold (denoted as $\alpha^*$) decreases with $P$ when $P$ is large. Similarly, we can intuitively explain why the optimal threshold increases with the number of users $K$. Specifically, a small threshold $\alpha$ may result in a user $k$ with feedback "1" having a poor channel, and thus, user $k$ with a poor channel may be mistakenly assigned a large order index since the base station cannot distinguish the channel gains with feedback "1" as discussed in Section II-A. Note that, when $K$ becomes large, the power allocated to a user with a large order index will become particularly small, according to the NOMA principle as discussed in (13). In this case, user $k$ with a poor channel will be given a very small amount of power, and thus an outage event is prone to happen. Therefore, $\alpha$ has to increase as $K$ increases, in order to avoid this problem.

VI. CONCLUSIONS
This paper has investigated the outage performance of downlink NOMA with one-bit CSI feedback. We have derived a closed-form expression for the COP, as well as the optimal diversity gains under short-term and long-term power constraints. The diversity gain under the long-term power constraint was shown to be two, whereas that under the short-term power constraint is only one. In order to minimize the COP, a dynamic power allocation policy based on the feedback state has also been proposed. For the short-term power constraint, we demonstrated that the original non-convex problem can be transformed into a series of convex problems. For the long-term power constraint, we have applied high-SNR approximations to obtain an asymptotically optimal solution. Simulation results have been provided to demonstrate that the proposed NOMA schemes with one-bit feedback can outperform various existing multiple access schemes and achieve an outage performance close to the optimal one in many cases. An interesting topic for future research is to extend the one-bit feedback scheme for NOMA to multi-bit feedback. Moreover, the extension of the analysis of the one-bit feedback scheme to asymmetric scenarios with different distances and different rates for different users is also of interest.

APPENDIX A PROOF OF THEOREM 1
We first analyze the probability of the event $N = n$ defined in Definition 1, denoted by $P_n(\alpha)$, which is a function of the threshold $\alpha$. Specifically, since all unordered channel gains are independent and identically distributed, the random variable $N$ defined in Definition 1 is binomially distributed, i.e., $N \sim B(K, 1 - e^{-\alpha})$. Hence, $P_n(\alpha) = \binom{K}{n} (1 - e^{-\alpha})^n e^{-\alpha(K-n)}$, as shown in (10). We then calculate the outage probability of the individual users for the event $N = n$, which is denoted by $P^{\mathrm{Indiv}}_{k,n}$ for user $\pi_k$. Note that an outage event at user $\pi_k$ occurs if it fails to decode the message for any user $\pi_l$, $l \in [1:k]$. Therefore, $P^{\mathrm{Indiv}}_{k,n}$ is the probability that at least one of these $k$ decoding steps fails, and, based on (6) and (7), it can be calculated as shown in (11). Moreover, the COP conditioned on the event $N = n$, denoted as $P^{\mathrm{Common}}_n$, can be obtained as follows:
$$P^{\mathrm{Common}}_n \overset{(a)}{=} 1 - \prod_{k=1}^{K} \left(1 - P^{\mathrm{Indiv}}_{k,n}\right),$$
where (a) is due to the fact that, conditioned on the event $N = n$, the $h_{\pi_k}$'s are mutually independent as explained in Remark 1, and $\mathrm{SINR}_{l \to k}$ is a function of $h_{\pi_k}$ as shown in (8). Now, the overall COP averaged over all $(K + 1)$ events can be expressed as
$$P^{\mathrm{Common}} = \sum_{n=0}^{K} P_n(\alpha) P^{\mathrm{Common}}_n.$$
This completes the proof.
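The binomial law for $N$ used in this proof can be checked with a short simulation. The sketch below is illustrative only: it assumes unit-mean exponential channel gains (consistent with the success probability $1 - e^{-\alpha}$) and that $N$ counts the users whose gain falls below $\alpha$; the helper names `prob_n` and `simulate_n` are hypothetical.

```python
import math
import random

def prob_n(K, n, alpha):
    """Binomial probability C(K, n) (1 - e^{-alpha})^n e^{-alpha (K - n)}."""
    q = 1.0 - math.exp(-alpha)
    return math.comb(K, n) * q ** n * math.exp(-alpha * (K - n))

def simulate_n(K, alpha, trials, seed=0):
    """Empirical distribution of the number of users with gain below alpha.

    Assumption: gains are i.i.d. Exp(1), i.e. Rayleigh fading with unit mean.
    """
    rng = random.Random(seed)
    counts = [0] * (K + 1)
    for _ in range(trials):
        n = sum(1 for _ in range(K) if rng.expovariate(1.0) < alpha)
        counts[n] += 1
    return [c / trials for c in counts]

K, alpha = 3, math.log(2)  # alpha = ln 2 gives a success probability of 1/2
exact = [prob_n(K, n, alpha) for n in range(K + 1)]
empirical = simulate_n(K, alpha, trials=100_000)
for n in range(K + 1):
    print(n, round(exact[n], 4), round(empirical[n], 4))
```

With $\alpha = \ln 2$ the exact probabilities reduce to $\binom{K}{n} / 2^K$, which makes the comparison against the empirical frequencies easy to read.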

A. Proof of Achievability
We will verify that a diversity gain of 1 can be achieved with a simple power allocation scheme. In particular, we use a power allocation with $\mu_1 = (r_0 + 1)^K - 1$. Therefore, for any $n$, $P_{k,n} = \frac{r_0 (r_0 + 1)^{K-k} P}{(r_0 + 1)^K - 1}$ as shown in (14), and $\sum_{k=1}^{K} P_{k,n} = P$, i.e., the short-term power constraint is satisfied. Using this power allocation, the outage probability in (11) can be evaluated for a given $\alpha$. Now, let $\alpha = \ln 2$, i.e., $e^{-\alpha} = \frac{1}{2}$, for simplicity. Then, from (10),
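The claim that the allocation in (14) meets the short-term power constraint with equality follows from a geometric sum, and it can be confirmed numerically. The sketch below is a hedged illustration: the function name `achievable_powers` and the numerical values are assumptions, and `r0` stands for whatever quantity appears as $r_0$ in the allocation formula.

```python
def achievable_powers(K, r0, P):
    """Powers P_k = r0 (r0 + 1)^(K - k) P / ((r0 + 1)^K - 1) for k = 1..K."""
    mu1 = (r0 + 1.0) ** K - 1.0
    return [r0 * (r0 + 1.0) ** (K - k) * P / mu1 for k in range(1, K + 1)]

# Illustrative values only; the identity below holds for any r0 > 0.
K, r0, P = 3, 1.0, 100.0
powers = achievable_powers(K, r0, P)
total = sum(powers)
print(powers, total)
```

Since $\sum_{k=1}^{K} r_0 (r_0 + 1)^{K-k} = (r_0 + 1)^K - 1$, the powers always add up to $P$. They also decrease with the index $k$, and each $P_k$ equals $r_0$ times the sum of the powers of all higher-indexed users plus $P / \mu_1$, which is a purely algebraic property of this allocation.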

B. Proof of Optimality
We now derive a lower bound on the COP to verify that the diversity gain of 1 is optimal for all possible power allocations and all possible choices of the threshold $\alpha$. From (12) and the short-term power constraint, we have $\zeta_{k,n} \ge \frac{r_0}{P}$, so $P^{\mathrm{Indiv}}_{k,n}$ can be lower bounded accordingly. Together with the relationship observed from (41), this bound is used in the following to verify that $P^{\mathrm{Common}} \dot{\ge} P^{-1}$ for any $\alpha$. Specifically, let α .

B. Proof of Optimality
We now derive a lower bound on the COP to verify that a diversity gain of 2 is optimal under the long-term power constraint. From (12) and the long-term power constraint, we have $\zeta_{k,n} \ge \frac{r_0 P_n}{P}$, so $P^{\mathrm{Indiv}}_{k,n}$ can be lower bounded accordingly. Based on this bound and (45), we can prove that $P^{\mathrm{Common}} \dot{\ge} P^{-2}$ for any $\alpha$. Specifically, let $\alpha \doteq P^{\beta}$.
First, if $\beta > 0$, from (10), we have $P_K \approx 1$. From (44) and (45), $P^{\mathrm{Common}}$
Summarizing these three regions, the necessary condition to achieve the optimal diversity gain of 2 is to set $\beta = -1$.

APPENDIX D PROOF OF PROPOSITION 3
For optimization problem (P3) in (25), an asymptotically optimal solution $\{\zeta_{k,n}\}$ at high SNR has the following properties:
Proof: From Lemma 2, we know that the optimal threshold satisfies $\alpha \doteq P^{-1}$, and the optimal COP satisfies $P^{\mathrm{Common}} \doteq P^{-2}$. These properties can be verified as follows. is negligible compared to the optimal COP) can be achieved at negligible power cost for the term , only when $\{\zeta_{k,2}\}$ satisfies the constraints in property (b). Let $\zeta_{k,2} \doteq P^{\gamma_{k,2}}$; then, similar to the proof of property (a), $\min_{k \in [1:K]} \{\gamma_{k,2}\} > -2$ should be satisfied such that → 0 as $P \to \infty$. Moreover, to achieve $P_2 P^{\mathrm{Common}}_2 \dot{<} P^{-2}$, $P^{\mathrm{Indiv}}_{k,2} \dot{<} P^{0}$ needs to be satisfied according to (10) $\dot{\le} P^{-2}$ has to be satisfied. Thus, $P^{\mathrm{Indiv}}_{k,1} \dot{\le} P^{-1}$, $\forall k \in [1:K]$, needs to be satisfied according to (10) and (41). Thus, with the choice $\alpha \doteq P^{-1}$, property (c) can be verified based on (11). Accordingly, using a Taylor series expansion, the approximation of function $f_{3,n}$ can be expressed as follows: With this, problem (P3) in (25) has been approximately transformed to (P4) in (26).

APPENDIX E PROOF OF PROPOSITION 4
This proposition can be proved by using (29c) and (30). This completes the proof.
Finally, "$\doteq$" denotes exponential equality, i.e., $f(P) \doteq P^x$ implies $\lim_{P \to \infty} \frac{\log f(P)}{\log P} = x$, and "$\dot{\le}$" and "$\dot{\ge}$" are defined similarly.

II. SYSTEM MODEL
Consider a downlink NOMA scenario with one single-antenna base station and $K$ single-antenna users. Quasi-static block fading is assumed, where the channel gains from the base station to all users are constant during one fading block consisting of $T$ channel uses, but change independently from one fading block to the next fading block. The base station sends $K$ messages to the users using the NOMA scheme, i.e., it sends $x(t, b) = \sum_{k=1}^{K} s_k(t, b)$ at time instant $t$ within fading block $b$, where $s_k(t, b)$ is the transmitted signal (containing the information-bearing message and the power allocation coefficient) for user $k$ and the signals for different users are mutually independent. Accordingly, user $i$ receives the following

Remark 4: Although the approximation in Proposition 3 is obtained for high SNR, even in the moderate SNR regime, the resulting suboptimal solution can still provide a significant performance gain compared to benchmark schemes, as shown later in Section V.
2) Optimal Solution of Problem (P4): Problem (P4) is a convex optimization problem for a given $\alpha$. To further simplify this problem, we define a new problem as follows.
Definition 3: A new convex optimization problem, denoted by (P5), is obtained by removing the last constraint in (26d) of problem (P4).

Fig. 2. COP versus transmission rate under the short-term power constraint, where $K = 3$, and the SNR is 20 dB.

Fig. 4. COP versus SNR under the long-term power constraint, where the number of users is $K = 3$, and the target transmission rate is $r_0 = 1$ BPCU for each user.

Fig. 7. Optimal threshold versus the number of users, where the target transmission rate is $r_0 = 1$ BPCU for each user, and the SNR is 20 dB or 22 dB.

Remark 6: Using Rule 2 in Step 2-c of Algorithm I, we can easily verify that the constraint $\lambda^{(t)}_{k,n} \ge 0$ always holds for $k \in [n + 1 : n + i^{(t)}_n]$, according to (31), (37), and Proposition 5. Thus, it is not necessary to include this constraint in Step 2-b of Algorithm I.