A sequential constraint relaxation algorithm for rank-one constrained problems

Many optimization problems in communications and signal processing can be formulated as rank-one constrained optimization problems. This has motivated the development of methods to solve such problems in specific scenarios. However, due to the non-convex nature of the rank-one constraint, limited progress has been made in solving generic rank-one constrained optimization problems. In particular, the problem of efficiently finding a locally optimal solution to a generic rank-one constrained problem remains open. This paper focuses on solving general rank-one constrained problems via relaxation techniques. However, instead of dropping the rank-one constraint completely, as is done in traditional rank-one relaxation methods, a novel algorithm that gradually relaxes the rank-one constraint, termed the sequential rank-one constraint relaxation (SROCR) algorithm, is proposed. Compared with previous algorithms, the SROCR algorithm can solve general rank-one constrained problems, and can find feasible solutions with favorable complexity.


I. INTRODUCTION
Recently there has been a growing interest in optimization problems that involve rank-one constraints. This is because many optimization problems in communications and signal processing applications can be cast as such problems [1].
Consider the following rank-one constrained optimization problem:

min_{X ⪰ 0} g_0(X)  (1a)
s.t. g_k(X) ⊴_k b_k, k = 1, …, K,  (1b)
     rank(X) = 1,  (1c)

where g_0, g_1, …, g_K : C^{N×N} → R are continuous and differentiable convex or affine functions of an N × N complex-valued positive semidefinite matrix variable X ⪰ 0, b_1, …, b_K ∈ R are scalar constants, and "⊴_k" in (1b) may stand for "<" or "≤" in the k-th constraint, or for "=" when g_k(X) is linear. We assume that the constraints (1b) and X ⪰ 0 form a bounded convex set, so that the objective function g_0(X) always achieves a finite value that is meaningful in engineering applications.
Without the rank-one constraint rank(X) = 1, Problem (1) becomes the following relaxed problem:

min_{X ⪰ 0} g_0(X)  s.t. (1b),  (2)

which is a convex optimization problem and can thus be efficiently solved by standard convex optimization methods, e.g., CVX. However, since the rank function is quasi-concave and subadditive [2], the non-convex constraint rank(X) = 1 makes the original problem (1) NP-hard in general.
In some special cases, Problem (1) may be convex or otherwise have lower complexity [3], [4]. However, in this work we focus on the generic case in which Problem (1) is NP-hard. With this viewpoint, several heuristic approaches have been proposed to obtain sub-optimal or feasible rank-one solutions for Problem (1):

1) Rank-one constraint relaxation: first solve the relaxed problem without the rank-one constraint, and then, if the relaxed solution is not rank-one, construct a rank-one solution from it; otherwise, the solution of the relaxed problem is exactly the solution to the original rank-one constrained problem [1], [5]-[8]. A drawback of this approach is that solving the relaxed problem usually yields a high-rank solution, and the constructed rank-one solution is usually sub-optimal or even infeasible for the rank-one constrained problem.

2) Rank minimization: first approximate the rank function by some tractable function, and then add the approximate rank function to the original objective as a penalty term [2], [9], [10]. A difficulty with this approach is that its performance depends on both the quality of the rank-function approximation and the penalty parameter setting, since different penalty parameters influence both the performance and the complexity of penalty-based algorithms.

3) Quadratic optimization: first rewrite a rank-one positive semidefinite matrix variable as a quadratic term of a vector variable so that the rank-one constraint can be avoided, and then solve a quadratic optimization problem for the vector variable [11], [12]. Unfortunately, there is no "free lunch": in general it is as hard, or even harder, to solve the resulting quadratic optimization problem. Indeed, it is common to go the other way and use semidefinite programming (SDP) to reformulate a quadratic optimization problem as a rank-one constrained optimization problem.
These existing heuristic approaches typically work for some specific cases but not for generic problems. This motivates us to propose a novel rank-one constrained optimization technique, sequential rank-one constraint relaxation (SROCR). Instead of dropping the rank-one constraint completely, the basic idea of this algorithm is to relax the rank-one constraint gradually, making it easier to find a feasible solution. In addition, the SROCR algorithm can generate a locally optimal solution to a generic rank-one constrained optimization problem whenever the relaxed optimization problem without the rank-one constraint is a convex optimization problem. The proposed algorithm is evaluated via numerical results, which indicate that it usually achieves a better objective function value with lower or comparable complexity compared with baseline algorithms.

II. THE SEQUENTIAL RANK-ONE CONSTRAINT RELAXATION ALGORITHM
Inspired by existing rank-one optimization methods, in this section we propose a novel sequential rank-one constraint relaxation algorithm to generate a guaranteed rank-one solution, which can be shown to be a Karush-Kuhn-Tucker (KKT) stationary solution to the original optimization problem (1).

A. Rank-One Constraint Reformulation and Analysis
Considering that rank(X) is a discontinuous non-convex function and does not have a closed-form expression [2], we reformulate rank(X) = 1 into the following equivalent form:

λ_max(X) = Tr(X),  (3)

where λ_max(X) and Tr(X) denote the largest eigenvalue and the trace of X, respectively. Now, (3) involves continuous functions of X, but it is still a non-convex constraint. Since λ_max(X) can be equivalently represented as

λ_max(X) = max_{‖v‖_2 = 1} v^H X v,  (4)

(3) can be further formulated as

max_{‖v‖_2 = 1} v^H X v = Tr(X).  (5)
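As a quick numerical sanity check of the reformulation (3) (illustrative code, not part of the paper; a simple power iteration stands in for a proper eigensolver), λ_max(X) equals Tr(X) for a rank-one positive semidefinite matrix and is strictly smaller otherwise:

```python
def mat_vec(A, v):
    # y = A v for a dense matrix stored as a list of rows.
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def power_iteration(A, iters=200):
    # Largest eigenvalue and unit eigenvector of a symmetric PSD matrix.
    v = [1.0] * len(A)
    for _ in range(iters):
        w = mat_vec(A, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Av = mat_vec(A, v)
    lam = sum(v[i] * Av[i] for i in range(len(A)))  # Rayleigh quotient
    return lam, v

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Rank-one PSD matrix X1 = x x^T with x = (1, 2): lambda_max = ||x||^2 = 5 = Tr(X1).
X1 = [[1.0, 2.0], [2.0, 4.0]]
print(abs(power_iteration(X1)[0] - trace(X1)) < 1e-9)  # True

# Rank-two PSD matrix: lambda_max = 2 < 3 = Tr(X2), so (3) is violated.
X2 = [[2.0, 0.0], [0.0, 1.0]]
print(power_iteration(X2)[0] < trace(X2))  # True
```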

B. Alternating Optimization Algorithm
Based on the equivalent rank-one constraint (5), we can treat the vector v as a slack variable for the original problem (1). Then, Problem (1) is equivalent to the following joint optimization problem with respect to {v, X}:

min_{X ⪰ 0, ‖v‖_2 = 1} g_0(X)  (6a)
s.t. g_k(X) ⊴_k b_k, k = 1, …, K,  (6b)
     v^H X v = Tr(X).  (6c)

Suppose that we find a feasible point X^(i) for Problem (6). The constraint (6c) implies that X^(i) must be rank-one to ensure λ_max(X^(i)) = Tr(X^(i)). Thus, the optimization of Problem (6) with respect to v yields the optimal solution

v^(i) = u_max(X^(i)),  (7)

where u_max(X) denotes the principal (unit-norm) eigenvector of X.
On the other hand, for any given v^(i) in (7), the non-convex constraint (6c) becomes

(v^(i))^H X v^(i) = Tr(X),  (8)

which is a linear constraint on X. Then, Problem (6) becomes a convex optimization problem with respect to the variable X, and the optimal solution X^(i+1) can be obtained using standard convex optimization methods. However, for (8) to hold, the solution X^(i+1) has to be rank-one and its principal eigenvector must equal v^(i), i.e., u_max(X^(i+1)) must be parallel to u_max(X^(i)). Hence this alternating optimization of {v, X} cannot improve the solution quality beyond the initial feasible point. Furthermore, it remains unknown how to obtain a feasible solution for the generic problem (1), as finding a feasible solution for a generic NP-hard optimization problem with multiple constraints can be as difficult as obtaining its optimal solution [13].
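As a numerical illustration of (7) (illustrative code, not from the paper), the inner maximization in (4) over unit vectors is attained at the principal eigenvector; a brute-force search over unit vectors confirms this for a small example:

```python
import math

# For fixed X, max_{||v||=1} v^T X v in (4) is attained at u_max(X).
# Check this on X = diag(4, 1) by brute force over a fine grid of unit vectors.
X = [[4.0, 0.0], [0.0, 1.0]]

def quad(v):
    # v^T X v for a 2x2 symmetric X.
    return X[0][0] * v[0] ** 2 + 2 * X[0][1] * v[0] * v[1] + X[1][1] * v[1] ** 2

grid = ([math.cos(t), math.sin(t)] for t in (k * math.pi / 1800 for k in range(3600)))
best = max(grid, key=quad)
u_max = [1.0, 0.0]  # principal eigenvector of diag(4, 1)
print(abs(quad(best) - quad(u_max)) < 1e-6)  # True: the grid maximum matches u_max
```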

C. SROCR Optimization Idea and Algorithm
To break the bottleneck caused by the rank-one constraint, we propose a novel algorithm that sequentially relaxes the rank-one constraint rather than completely dropping it, as is done in traditional algorithms.
1) Idea of SROCR: We propose to partially relax the rank-one constraint λ_max(X) = Tr(X) to

(v^(i))^H X v^(i) ≥ w^(i) Tr(X),  (9)

where w^(i) ∈ [0, 1] denotes a relaxation parameter that controls the largest-eigenvalue-to-trace ratio of X: since λ_max(X) ≥ (v^(i))^H X v^(i) for any unit-norm v^(i), any solution X satisfying (9) is guaranteed to satisfy the following condition:

λ_max(X) / Tr(X) ≥ w^(i).  (10)

As w^(i) increases, the constraint set defined by (9) gradually shrinks and λ_max(X)/Tr(X) can approach 1. One significant benefit of this partial rank-one constraint relaxation is that it is easier to find a feasible solution X^(0) when w^(i) is small. For example, when w^(0) = 0, (9) is equivalent to dropping the rank-one constraint, and we can easily find a feasible point X^(0) by solving the convex relaxed problem (2). This motivates us to increase w^(i) sequentially from 0 over the iterations, so that the constraint set (9) gradually approaches the true rank-one constraint set, reached when w^(i) = 1.
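The implication from (9) to (10) follows because v^H X v ≤ λ_max(X) for any unit-norm v. A tiny numerical illustration (the matrix and vector here are arbitrary examples, not from the paper):

```python
# X = diag(3, 1): lambda_max = 3, Tr = 4. Take an arbitrary unit vector v.
X = [[3.0, 0.0], [0.0, 1.0]]
v = [0.6, 0.8]                              # unit norm: 0.36 + 0.64 = 1
vXv = X[0][0] * v[0] ** 2 + X[1][1] * v[1] ** 2   # v^T X v = 3*0.36 + 1*0.64 = 1.72
lam_max, tr = 3.0, 4.0

w = vXv / tr   # the largest w for which (9) holds with this v (= 0.43)
# Since lam_max >= vXv, constraint (9) implies lam_max / tr >= w, i.e. condition (10):
print(lam_max >= vXv and lam_max / tr >= w)  # True (0.75 >= 0.43)
```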
2) SROCR Algorithm: The proposed SROCR algorithm can be described as follows.
At the i-th iteration, given {w^(i), v^(i)}, the SROCR algorithm solves the following convex problem:

min_{X ⪰ 0} g_0(X)  (12a)
s.t. g_k(X) ⊴_k b_k, k = 1, …, K,  (12b)
     (v^(i))^H X v^(i) ≥ w^(i) Tr(X),  (12c)

where the constraint (12c) is jointly derived from (7), (8) and (9). In (15), δ^(i) denotes the step size for updating the weight parameter. To make X^(i) more likely to remain feasible for Problem (12) in the next iteration, the step size δ^(i) should in general be small. If the pre-defined step size δ^(i) renders Problem (12) infeasible, we reduce the step size via (14) until Problem (12) becomes solvable.
Algorithm 1 (SROCR):
  Initialize a feasible point X^(0), a weight w^(0), a step size δ^(0), and set i = 0;
  repeat
    Given {w^(i), X^(i)}, solve the convex problem (12);
    if Problem (12) is solvable then
      obtain the optimal solution X^(i+1) of Problem (12), and set δ^(i+1) ← δ^(i);
    else
      set X^(i+1) ← X^(i) and reduce the step size δ^(i+1) ← δ^(i)/2;  (14)
    end if
    update the weight w^(i+1) ← min(1, λ_max(X^(i+1))/Tr(X^(i+1)) + δ^(i+1));  (15)
    i ← i + 1;
  until convergence.

Remark 1: The proposed SROCR Algorithm 1 makes the procedure of rank-one optimization easier to analyze. In principle, Algorithm 1 gradually projects the largest eigenvector of the matrix variable X onto an updated direction v^(i) until all of the power lies in a one-dimensional subspace. In addition, unlike penalty-based methods, X^(i) approaches a rank-one solution by monotonically increasing {w^(i)} in an iterative but controllable manner.
The step-size reduction in (14) can be generalized to δ^(i+1) ← δ^(i)/L with L ≥ 2 in order to more quickly make Problem (12) solvable when it was infeasible in the previous iteration. In the first iteration, the relaxation weight w^(1) computed from (15) satisfies 1/N < w^(1) ≤ 1, since the ratio of the largest eigenvalue to the trace of an arbitrary nonzero positive semidefinite matrix is not less than 1/N.

Theorem 1: Given a feasible initial point X^(0), Algorithm 1 converges to a KKT stationary solution of Problem (6), which is equivalent to Problem (1).
Proof: Solving Problem (12) can be considered as an alternating optimization of v = u max (X) and X. The proof is detailed in [14].
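For concreteness, the outer loop of the SROCR procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: `solve_relaxed` (a convex solver for Problem (12) that returns None when (12) is infeasible), `eig_max` and `trace` are hypothetical callbacks, not an interface defined in the paper.

```python
def srocr(solve_relaxed, eig_max, trace, delta0, tol=1e-3, max_iter=500):
    """Sketch of the SROCR outer loop (Algorithm 1)."""
    delta = delta0
    X = solve_relaxed(0.0, None)       # w = 0: plain rank-one relaxation, cf. (2)
    lam, v = eig_max(X)
    w = min(1.0, lam / trace(X) + delta)
    for _ in range(max_iter):
        X_next = solve_relaxed(w, v)   # convex Problem (12) with constraint (12c)
        if X_next is not None:
            X = X_next                 # (12) solvable: accept the new iterate
        else:
            delta /= 2.0               # (14): halve the step size and retry
        lam, v = eig_max(X)
        w = min(1.0, lam / trace(X) + delta)   # (15): weight update
        if lam / trace(X) >= 1.0 - tol:
            break                      # X is (numerically) rank-one
    return X
```

Driving this loop with a toy synthetic "solver" whose solution has eigenvalue ratio max(w, 0.5) converges to a rank-one iterate within a handful of iterations, mirroring the behavior described for Algorithm 1.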

Remark 2: The performance of Algorithm 1 depends on the initialization. A large initial step size δ^(0) in (11) may cause some infeasible iterations, whereas a small initial step size may require more iterations to increase w^(i) to 1. In addition, since Algorithm 1 is a projection procedure from an infeasible solution to a feasible one, it may skip over good solutions along the way if the step size is too large. In contrast, a small step size usually leads to a smooth projection procedure and thus yields a better solution with high probability. Therefore, the initial step size δ^(0) can be chosen according to the desired trade-off between complexity and accuracy.

III. NUMERICAL EXAMPLE
To illustrate the algorithm, we consider an example: a cooperative interference channel with two transmitter-receiver pairs, in which each transmitter causes interference to the other receiver in a spectrum-sharing scenario. We assume that each node is equipped with N antennas and a single radio frequency (RF) chain. Each transmitter employs beamforming to minimize its interference leakage, subject to a constraint on its own desired signal strength. Let x ∈ C^{N×1}, H_0 ∈ C^{N×N} and H_1 ∈ C^{N×N} denote the analog beamforming vector of a transmitter, the interfering channel matrix to the undesired receiver, and the desired channel matrix to its own receiver, respectively. Then, we can formulate the following optimization problem:

min_x ‖H_0 x‖^2  (16a)
s.t. ‖H_1 x‖^2 ≥ γ_0,  (16b)
     |x_n|^2 = 1/N, n = 1, …, N,  (16c)

where γ_0 in (16b) denotes the minimal required signal strength to guarantee the quality of service (QoS), and (16c) is the constant-modulus constraint imposed by the single RF chain. Introducing X = x x^H, Problem (16) can be cast in the form of Problem (1) as

min_{X ⪰ 0} Tr(H_0^H H_0 X)  (17a)
s.t. Tr(H_1^H H_1 X) ≥ γ_0,  (17b)
     Tr(E_n X) = 1/N, n = 1, …, N,  (17c)
     rank(X) = 1,  (17d)

where E_n is an all-zero N×N matrix except for the n-th diagonal element, which equals one. We assume Rayleigh fading channels whose elements are independent and identically distributed with zero mean and unit variance. For the Monte Carlo simulations, 100 random channel realizations {H_0, H_1} of Problem (17) are evaluated. Let X^(0) denote the solution to the relaxed problem (17a)-(17c). When N = 16, Fig. 1 shows that none of the initial solutions {X^(0)} for the 100 random realizations of {H_0, H_1} is rank-one. To extract a rank-one solution, the proposed SROCR algorithm is evaluated first. Fig. 2 and Fig. 3 show its convergence for one channel realization with different initial step sizes. The algorithm iterates until λ_max(X^(i))/Tr(X^(i)) ≥ 0.999 and |Tr(H_0^H H_0 X^(i)) − Tr(H_0^H H_0 X^(i−1))| ≤ 10^{−4} are simultaneously satisfied, with the maximal number of iterations set to 500; the convergence criterion for the baseline algorithms is defined similarly. In Fig. 2, we observe that the objective values converge gradually to different values at different speeds, and Fig. 3 shows the convergence of the largest-eigenvalue-to-trace ratio towards one. This example implies that the proposed algorithm has a reasonably low complexity (within only a few tens of iterations) in general, and that a small initial step size may lead to a better solution at the cost of more iterations.
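The per-antenna constraint involving E_n in (17c) is linear in X, since Tr(E_n X) simply picks out the n-th diagonal entry of X = x x^H, i.e., the per-antenna power. A quick sanity check (illustrative code, real-valued x for simplicity, assuming the per-antenna normalization |x_n|^2 = 1/N):

```python
N = 3
x = [1 / 3 ** 0.5, -1 / 3 ** 0.5, 1 / 3 ** 0.5]      # constant-modulus entries, |x_n|^2 = 1/N
X = [[x[i] * x[j] for j in range(N)] for i in range(N)]  # X = x x^T, rank one

def tr_EnX(X, n):
    # E_n is zero except for a one at entry (n, n), so Tr(E_n X) = X[n][n].
    return X[n][n]

# Each diagonal entry equals the per-antenna power 1/N:
print(all(abs(tr_EnX(X, n) - 1.0 / N) < 1e-12 for n in range(N)))  # True
```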
To illustrate the relationship between convergence and the initial step size, Figs. 4 and 5 show average convergence results over 100 random channel realizations versus the initial step size. Interestingly, as the initial step size decreases, Fig. 4 shows that the average converged objective value of the minimization problem (17) decreases as well, while Fig. 5 shows that the corresponding average complexity increases. This is consistent with our assertion in Remark 2 that a smaller step size enables smoother projections, which gives a higher probability of reaching a better solution. There is therefore a trade-off between performance and complexity. The converged objective values change only slightly for δ^(0) ≤ 0.1 in Fig. 4, and the convergence rate increases only slightly for δ^(0) ≥ 0.1 in Fig. 5. Thus, δ^(0) = 0.1 appears to be a good choice for this problem, achieving reasonable performance with low complexity; if the system instead places a high requirement on objective performance, a smaller initial step size can be chosen at the cost of more iterations.

Based on the same channel realizations, two recently proposed methods are evaluated as baselines for the performance comparison, since both have been shown to outperform most other related methods: 1) PSCA, the penalty-based successive convex approximation proposed in [15] with an additional update of the penalty parameter μ^(i); and 2) FPP, the feasible point pursuit algorithm in [11] with a fixed penalty parameter μ^(0). Table I summarizes the performance comparison of the proposed algorithm with these two baselines for different numbers of antennas. In particular, we run the proposed SROCR algorithm with δ^(0) = 0.01 and δ^(0) = 0.1, and the PSCA and FPP algorithms with μ^(0) = 10 and μ^(0) = 1000, respectively. Table I shows that the proposed algorithm with δ^(0) = 0.01 always converges to feasible solutions with smaller objective values than the baseline algorithms.
Note also that PSCA usually outperforms FPP in terms of both objective value and convergence rate. Unlike the proposed algorithm, PSCA and FPP with certain initial penalty parameters may yield infeasible solutions, where infeasibility is defined as either convergence to an infeasible solution or non-convergence within 500 iterations. For example, μ^(0) = 10 usually results in an infeasible FPP solution for large N. In addition, the same μ^(0) for FPP and PSCA can yield feasible solutions for small N but not for large N. Therefore, it is in general unclear how to select good penalty parameters for the baseline algorithms. In contrast, we observe that the same δ^(0) works robustly for the proposed algorithm across different values of N.

IV. CONCLUSIONS
In this paper, we have considered rank-one constrained optimization problems. We have proposed the SROCR algorithm, which achieves a locally optimal solution for a generic rank-one constrained problem by sequentially relaxing the rank-one constraint. We have seen that the proposed algorithm usually achieves a better final objective value with comparable or even lower complexity than baseline algorithms. Moreover, we have seen that the proposed algorithm works for generic rank-one constrained optimization problems, whereas, for example, the feasible point pursuit algorithm is only valid for quadratic programs.