Computing the Eigenvalues and Eigenvectors of Symmetric Arrowhead Matrices

INTRODUCTION
In this paper we will be concerned with the eigenvalue problem for a symmetric matrix which is zero except for its main diagonal and one row and column. Such problems arise in the description of radiationless transitions in isolated molecules [1] and of oscillators vibrationally coupled with a Fermi liquid [6]. In these applications the order n of the matrix A can be in the thousands. The purpose of this paper is to present formulas and efficient algorithms for computing eigenvalues and eigenvectors of such matrices.
Since eigenvalues are invariant under similarity transformations, we can symmetrically rearrange the rows and columns of the matrix at our convenience, and we therefore assume without loss of generality that A has been ordered so that the nonzero row and column are last. Thus, we consider matrices of the form

$$
A = \begin{pmatrix}
d_1 & & & & e_1 \\
& d_2 & & & e_2 \\
& & \ddots & & \vdots \\
& & & d_{n-1} & e_{n-1} \\
e_1 & e_2 & \cdots & e_{n-1} & p
\end{pmatrix},
$$

which we will call (symmetric) arrowhead matrices. By further interchanges, we can arrange for the diagonal elements to be ordered so that $d_1 \le d_2 \le \cdots \le d_{n-1}$. Hence, we will consider only ordered arrowhead matrices.

Since A is symmetric, its eigenvalues may in principle be computed by invoking any of a number of standard programs (e.g., the EISPACK programs [8]). However, these programs usually begin with an initial reduction of the matrix to tridiagonal form, which entails $O(n^3)$ operations and $O(n^2)$ storage. In this paper we propose an alternative which takes advantage of the structure of A: namely, we propose to solve Eq. (2.1) below, which is closely related to the secular equation, for the eigenvalues of A. This results in an algorithm which requires only $O(n^2)$ computations and $O(n)$ storage. Although the idea is conceptually simple and in fact has been used to solve other eigenvalue problems of special structure [2, 3, 5, 7], some care must be taken to show that the computation is stable.
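Concretely, such a matrix is determined by the $2n-1$ numbers $d_1, \ldots, d_{n-1}$, $e_1, \ldots, e_{n-1}$, and $p$, which is the source of the $O(n)$ storage bound. A minimal sketch in Python (the function name and the NumPy representation are our own illustration, not part of the algorithm):

    import numpy as np

    def arrowhead_dense(d, e, p):
        """Assemble the dense arrowhead matrix from its O(n) data:
        diagonal d[0..n-2], border row/column e[0..n-2], corner p."""
        n = len(d) + 1
        A = np.zeros((n, n))
        A[np.arange(n - 1), np.arange(n - 1)] = d
        A[-1, :-1] = e          # last row
        A[:-1, -1] = e          # last column
        A[-1, -1] = p
        return A

    # Small example: an ordered arrowhead matrix and its eigenvalues.
    d = np.array([1.0, 2.0, 4.0])
    e = np.array([0.5, 0.3, 0.7])
    print(np.linalg.eigvalsh(arrowhead_dense(d, e, 3.0)))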
In the next section we will give the basic properties of arrowhead matrices, and in the section after that we will describe how the equation $\varphi_A(\lambda) = 0$ may be solved. In the last section we will discuss the computation of eigenvectors. We point out here that our algorithms are completely parallel; the computations for one simple eigenvalue are completely independent of those for another.

PROPERTIES OF EIGENVALUES OF ARROWHEAD MATRICES
We begin by disposing of a special case. If there is a zero in the last column, say $e_i = 0$, then the diagonal element $d_i$ is an eigenvalue whose eigenvector is the $i$th unit vector. To compute the remaining eigenvalues, we can reduce the size of the problem by deleting the $i$th row and column of the matrix, eventually obtaining a matrix for which all the elements $e_j$ are nonzero. We will call such an arrowhead matrix irreducible.
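In code, this reduction is a simple filtering of the $O(n)$ data; a sketch under the NumPy representation used above:

    import numpy as np

    def deflate(d, e):
        """Split off the eigenvalues d[i] with e[i] == 0 (their
        eigenvectors are unit vectors) and return the irreducible
        remainder of the problem."""
        zero = (e == 0.0)
        return d[~zero], e[~zero], d[zero]   # remaining d, e; found eigenvalues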
The basic results on the eigenvalues of arrowhead matrices are contained in the following theorem, due to Wilkinson [10, pp. 94 ff.]: the eigenvalues $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ of an ordered, irreducible arrowhead matrix A interlace its diagonal elements; a diagonal value repeated $k+1$ times is an eigenvalue of multiplicity $k$; and a number $\lambda$ distinct from the $d_i$ is an eigenvalue of A if and only if $\varphi_A(\lambda) = 0$, where

$$\varphi_A(\lambda) = p - \lambda - \sum_{j=1}^{n-1} \frac{e_j^2}{d_j - \lambda}. \qquad (2.1)$$

Here we give a slightly different proof that contains results needed to derive our algorithm. By the interlacing theorem [9, p. 302],

$$\lambda_1 \le d_1 \le \lambda_2 \le d_2 \le \cdots \le d_{n-1} \le \lambda_n. \qquad (2.2)$$

These inequalities immediately imply the statements about multiple eigenvalues.
We will next show that if $\lambda$ is distinct from the numbers $d_i$, then $\lambda$ is an eigenvalue of A if and only if $\varphi_A(\lambda) = 0$. If $\lambda$ is an eigenvalue of A, then $\det(A - \lambda I) = 0$. If we multiply the first row of this determinant by $e_1/(d_1 - \lambda)$ and subtract it from the last row, multiply the second row by $e_2/(d_2 - \lambda)$ and subtract it from the last, and so on, we get the equation

$$\det(A - \lambda I) = \varphi_A(\lambda) \prod_{j=1}^{n-1} (d_j - \lambda) = 0.$$

Since this determinant is the product of the diagonals and $d_j - \lambda \ne 0$, we must have $\varphi_A(\lambda) = 0$. Conversely, if $\varphi_A(\lambda) = 0$, then $\det(A - \lambda I) = 0$.

Now assume that $d_{i-1} < d_i$, where $1 < i \le n-1$. Then for $\lambda$ near to but greater than $d_{i-1}$, we have (remember $e_{i-1} \ne 0$)

$$\varphi_A(\lambda) > 0, \qquad (2.3)$$

since the positive term $-e_{i-1}^2/(d_{i-1} - \lambda)$ dominates. For $\lambda$ near to but less than $d_i$, we have

$$\varphi_A(\lambda) < 0. \qquad (2.4)$$

Therefore, $\varphi_A(\lambda)$ must change sign in the interval $(d_{i-1}, d_i)$, and the point at which it does is an eigenvalue. The inequalities (2.2) ensure that the sign changes only once in the interval.

It remains only to show that there is an eigenvalue strictly less than $d_1$ and an eigenvalue strictly greater than $d_{n-1}$. If $\lambda$ is near to but less than $d_1$, then $\varphi_A(\lambda) < 0$. On the other hand, as $\lambda \to -\infty$ we have $\varphi_A(\lambda) \cong -\lambda > 0$. Hence $\varphi_A(\lambda)$ changes sign in $(-\infty, d_1)$. A similar argument shows that $\varphi_A(\lambda)$ has to change sign in $(d_{n-1}, \infty)$. ∎

The theorem shows we can find eigenvalues corresponding to repeated values of the diagonal by inspection. The remaining eigenvalues are located strictly between the $d_i$'s and satisfy the equation $\varphi_A(\lambda) = 0$. Since $\varphi_A$ is very easy to compute, this suggests that we attempt to use a root-finding method to compute the eigenvalues of A. We will describe and analyze such a method in the next section.
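Indeed, evaluating $\varphi_A$ costs only $O(n)$ operations. A sketch of the evaluation (the names and calling convention are ours):

    import numpy as np

    def phi(lam, d, e, p):
        """phi_A(lam) = p - lam - sum_j e[j]**2 / (d[j] - lam); Eq. (2.1).
        lam must be distinct from every d[j]."""
        return p - lam - np.sum(e**2 / (d - lam))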

COMPUTING EIGENVALUES
In this section we turn to the problem of computing the eigenvalue $\lambda_i$ lying between $d_{i-1}$ and $d_i$, assuming these diagonals are distinct. According to (2.3) and (2.4), if $\lambda \in (d_{i-1}, d_i)$ and $\varphi_A(\lambda) > 0$, then $\lambda < \lambda_i$; if $\varphi_A(\lambda) < 0$, then $\lambda > \lambda_i$. This suggests that we can compute $\lambda_i$ by interval bisection. The following code gives such an algorithm, computing an interval $[a, b]$ of length at most eps which contains $\lambda_i$, and returning the midpoint of the interval as the estimate of the eigenvalue. The variable eps is a convergence tolerance, which must be greater than zero, and phi is a subprogram that returns the value of $\varphi_A$. At each step of this algorithm, the size of the interval $(a, b)$ containing the eigenvalue is reduced by a factor of two. In other words, the method adds about one bit of accuracy per iteration.
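A minimal Python sketch of this bisection, using the tolerance eps and the subprogram phi described above (the calling convention is our own):

    def bisect(lo, hi, eps, phi_fn):
        """Shrink (lo, hi) = (d[i-1], d[i]) around the eigenvalue.
        phi is evaluated only at interior points, never at the
        endpoints, where it is singular."""
        while hi - lo > eps:
            mid = 0.5 * (lo + hi)
            if phi_fn(mid) > 0.0:
                lo = mid        # phi > 0: the eigenvalue lies to the right
            else:
                hi = mid        # phi <= 0: the eigenvalue lies to the left
        return 0.5 * (lo + hi), (lo, hi)   # estimate and final bracket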
However, we recommend this procedure be used only to find brackets $a > d_{i-1}$ and $b < d_i$ for the eigenvalue. Once these brackets have been found, one should switch to a combination of the secant method and interval bisection. Although this method is complicated, it is available in library subroutine packages (e.g., zeroin in the IMSL library). The increased rate of convergence more than justifies the trouble.
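Such zeroin-style hybrids survive in modern libraries; for instance, one could finish off a bracket with SciPy's brentq (our choice of routine, standing in for zeroin; d, e, p are NumPy arrays and a scalar as in the sketches above):

    from scipy.optimize import brentq

    def refine(a, b, d, e, p):
        """Polish a bracket a < lambda_i < b (with phi(a) > 0 > phi(b))
        by a hybrid of bisection and inverse interpolation."""
        return brentq(lambda x: p - x - (e**2 / (d - x)).sum(), a, b,
                      xtol=1e-14)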
There is one simplification which results when there are multiple eigenvalues. If $d_{i-1} < d_i = \cdots = d_{i+k} < d_{i+k+1}$, the terms

$$\sum_{j=i}^{i+k} \frac{e_j^2}{d_j - \lambda}$$

in (2.1) collapse into

$$\frac{e_i^2 + \cdots + e_{i+k}^2}{d_i - \lambda}.$$

Thus if we strike the rows and columns corresponding to $d_{i+1}, \ldots, d_{i+k}$ and replace $e_i$ by $\sqrt{e_i^2 + \cdots + e_{i+k}^2}$, the value of $\varphi_A(\lambda)$ is unchanged. When there are many repeated diagonal values, this can result in large savings in the evaluation of $\varphi_A$.
At the end of the iteration, we have two numbers a and b within eps of each other such that phi(a) is nonnegative and phi(b) is nonpositive. However, phi and $\varphi_A$ are not the same function, since phi is evaluated with rounding error. We now show that the effect of this rounding error is negligible.
We will assume that all computation is done in floating-point arithmetic with rounding unit $\epsilon_M$. Specifically, we will assume that the result of any floating-point operation is returned with a relative error of at most $\epsilon_M$. For example, in computations to 10 decimal digits, $\epsilon_M$ is approximately $10^{-10}$. Under these assumptions, we can show (see the Appendix to this paper) that if $n > 2$, $\epsilon_M < 0.001$, and $n\epsilon_M$ is sufficiently small, then whatever we compute for $\varphi_A(\lambda)$, it is the value we would have obtained by performing exact calculations with a slightly perturbed matrix $\tilde{A} = A + H$. Now it can be shown [9, p. 315] that such a perturbation can move the eigenvalues of A by no more than $\|H\|$; we write $\eta(\lambda)$ for a computable bound on this movement, so that

$$|\tilde{\lambda}_i - \lambda_i| \le \eta(\lambda). \qquad (3.1)$$

Let us now apply these results to the output a and b of our algorithm. From our rounding-error analysis, we have $\varphi_{A+H}(a) \ge 0$. Since $A + H$ is an ordered, irreducible arrowhead matrix with diagonals $d_i$, its $i$th eigenvalue must be greater than or equal to a. It follows from (3.1) that $\lambda_i \ge a - \eta(a)$. A similar argument shows that $\lambda_i \le b + \eta(b)$. In other words, our algorithm, either simple bisection or the composite method, always returns numbers a and b such that $\lambda_i \in [a - \eta(a), b + \eta(b)] \cap (d_{i-1}, d_i)$.
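In code the safeguard is a one-liner; here eta stands for the Appendix's perturbation bound, whose exact formula we do not reproduce (any valid upper bound on the eigenvalue movement serves):

    def certified_interval(a, b, eta):
        """Widen the computed bracket [a, b] so that it is guaranteed
        to contain the true eigenvalue despite rounding error."""
        return a - eta(a), b + eta(b)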

COMPUTING EIGENVECTORS
Next we consider the computation of eigenvectors of ordered, irreducible arrowhead matrices (we have already mentioned in Section 2 that if $e_i = 0$, then $d_i$ is an eigenvalue whose eigenvector is the $i$th unit vector). We will first consider eigenvectors corresponding to simple eigenvalues. Let $z_i$ be the eigenvector corresponding to $\lambda_i$, and assume that it has been normalized so that its $n$th component $z_i^{(n)}$ is one. Then it is easily verified from the equation $Az_i = \lambda_i z_i$ that the remaining components of $z_i$ are given by the formulas

$$z_i^{(j)} = \frac{e_j}{\lambda_i - d_j}, \qquad j = 1, \ldots, n-1.$$

Since $d_{i-1} < \lambda_i < d_i$, the denominators in these formulas are nonzero.
The chief difficulty with these formulas is that we must use an approximate eigenvalue $\tilde{\lambda}_i$ in place of the true value, which gives an approximate eigenvector $\tilde{z}_i$. If we set $\delta_i = \min_{j \ne i} |\lambda_j - \tilde{\lambda}_i|$, then it follows [4] that the sine of the angle between $z_i$ and $\tilde{z}_i$ satisfies

$$\sin \angle(z_i, \tilde{z}_i) \le \frac{\|(A - \tilde{\lambda}_i I)\tilde{z}_i\|}{\delta_i \|\tilde{z}_i\|},$$

where $\|\cdot\|$ is the usual Euclidean norm. It is easily verified that $(A - \tilde{\lambda}_i I)\tilde{z}_i = (0, \ldots, 0, \varphi_A(\tilde{\lambda}_i))^T$. Hence

$$\sin \angle(z_i, \tilde{z}_i) \le \frac{|\varphi_A(\tilde{\lambda}_i)|}{\delta_i \|\tilde{z}_i\|}.$$

Since our algorithm returns intervals containing the eigenvalues, we may obtain a lower bound on $\delta_i$ from their endpoints.
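A sketch of the eigenvector formula together with the residual bound (the names and calling convention are ours; delta is the gap lower bound obtained from the computed intervals):

    import numpy as np

    def eigvec_with_bound(lam, d, e, p, delta):
        """Approximate eigenvector for the approximate eigenvalue lam,
        with last component 1, and the bound on the sine of its angle
        to the true eigenvector."""
        z = np.append(e / (lam - d), 1.0)
        resid = abs(p - lam - np.sum(e**2 / (d - lam)))  # = |phi(lam)|
        return z, resid / (delta * np.linalg.norm(z))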
For a multiple eigenvalue $d_i = \cdots = d_{i+k}$, we may take as eigenvectors any orthonormal set of $k$ vectors whose components vanish outside positions $i, \ldots, i+k$ and which are orthogonal to $(e_i, \ldots, e_{i+k})^T$. With this definition, the vectors $z_j$ are mutually orthogonal.
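One way to realize such a basis numerically is through the null space of the row vector $(e_i, \ldots, e_{i+k})$; a sketch, assuming NumPy (the embedding convention is our own):

    import numpy as np

    def repeated_eigvecs(e_run, n, start):
        """Orthonormal eigenvectors for a diagonal value repeated
        len(e_run) times beginning at 0-based index `start`: the rows
        span the null space of the 1 x (k+1) row e_run."""
        _, _, vt = np.linalg.svd(e_run[None, :])
        z = np.zeros((len(e_run) - 1, n))
        z[:, start:start + len(e_run)] = vt[1:, :]   # k null-space rows
        return z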

APPENDIX: ROUNDING ERROR ANALYSIS
The assumptions about our computer arithmetic amount to saying that the computed value of a floating-point operation, say $a + b$, is $(a + b)(1 + \epsilon)$, where $|\epsilon| \le \epsilon_M$. In deriving our bounds we will be faced with expressions of the form

$$(1 + \epsilon_1)(1 + \epsilon_2) \cdots (1 + \epsilon_k), \qquad |\epsilon_i| \le \epsilon_M.$$

Eventually, we will replace such expressions by a bound independent of the individual $\epsilon_i$, and in anticipation of this we let $\langle k \rangle$ be a generic symbol for such a product of $k$ factors. Clearly $\langle k \rangle \langle 1 \rangle = \langle k+1 \rangle$. In evaluating $\varphi_A(\lambda)$ we must first compute $p - \lambda$, which gives a value $(p - \lambda)\langle 1 \rangle$. Next we must compute $e_1^2/(d_1 - \lambda)$, which gives $e_1^2 \langle 3 \rangle/(d_1 - \lambda)$. When the latter is subtracted from the former, we get

$$(p - \lambda)\langle 2 \rangle - \frac{e_1^2 \langle 4 \rangle}{d_1 - \lambda}.$$

Computing and subtracting the next term gives

$$(p - \lambda)\langle 3 \rangle - \frac{e_1^2 \langle 5 \rangle}{d_1 - \lambda} - \frac{e_2^2 \langle 4 \rangle}{d_2 - \lambda},$$

and continuing in this fashion through all $n - 1$ terms yields the computed value

$$(p - \lambda)\langle n \rangle - \sum_{j=1}^{n-1} \frac{e_j^2 \langle n + 3 - j \rangle}{d_j - \lambda},$$

which may be interpreted as the exact value of $\varphi_{\tilde{A}}(\lambda)$ for a slightly perturbed arrowhead matrix $\tilde{A} = A + H$; bounding the factors $\langle k \rangle$ then bounds $\|H\|$.
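The generic factor $\langle k \rangle$ can be bounded by the standard quantity $k\epsilon_M/(1 - k\epsilon_M)$; a small helper for experimenting with the bound (the default unit roundoff is IEEE double precision, an assumption of ours):

    def gamma(k, u=2.0**-53):
        """Bound |<k> - 1| <= k*u / (1 - k*u), valid while k*u < 1."""
        assert k * u < 1.0
        return k * u / (1.0 - k * u)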