Prognostic Factors for Patients with Congestive Heart Failure – An Extended Evaluation of the SPICE-Study

This work was carried out in collaboration between all authors. Author WG initiated the extended evaluation of the SPICE study. Authors BM and WG wrote the manuscript. Author BM performed the statistical analyses. Authors RM and WG reviewed the results and the methodology used. Author CH performed the respective literature search and reviewed the manuscript from a clinical perspective. All authors read and approved the final manuscript.

ABSTRACT
Aims: Controlled clinical trials collect huge amounts of high-quality data. It is a waste of information to evaluate these data only for the efficacy and safety of the investigational medication. We propose extended evaluations of large trials for scientific purposes, especially to find the most important risk factors of the disease or variables which are associated with the risk of having the disease.
Methodology: The SPICE study is a controlled, randomised, completely masked trial that investigated the efficacy of the Crataegus product WS® 1442 in 2681 randomised patients with congestive heart failure (CHF). It was initiated and sponsored by Dr. Willmar Schwabe [...] failure.
Results: The most important risk factors are higher New York Heart Association (NYHA) functional class, older age and lower left ventricular ejection fraction. Patients had more cardiac events when taking glycosides, antiarrhythmics, nitrates, diuretics, beta blockers and calcium antagonists, so patients with a high number of cardiovascular medications have a poorer prognosis. Three scenarios for the interpretation of cardioactive medications as “risk” are presented. We assume that the symptoms leading to the indication of a specific cardioactive medication are the risk. This risk is only partly balanced by medication intake. In general, the intake of cardioactive medication is associated with the risk of having the disease.
Conclusion: An extended evaluation of large clinical studies reveals what is important for the outcome besides the specific efficacy of the investigational drug. This is usually not within the scope of pharmaceutical companies, but it is useful for science, doctors and patients.


INTRODUCTION
Controlled randomised trials are designed to answer one single question. In most studies, it is the specific efficacy and safety of the investigational medication. Large randomised studies collect a huge amount of data under controlled conditions [1]. It is a waste of information if such huge high-quality data sets are evaluated for efficacy and safety only. We propose a more extended evaluation of large studies for general scientific purposes.
Therapy is a complex procedure with many sources and components. The effects on outcome are often summarised in the following three components:
(1st) Self-healing properties of the body (the natural course of disease), reduction of noxious substances, or improvement because the disease had already passed its maximum when the patient consulted the doctor, etc.
(2nd) Non-specific effects induced by the patient status, including the behaviour of the therapist and the setting in which therapy takes place. Examples: patients receive sympathy and compassion for their sickness, are relieved from their daily workload and stress, look at their personal problems from a greater distance, and are encouraged by the physician.
(3rd) Specific efficacy of physical or pharmaceutical intervention(s).
For physicians and patients, it is necessary but not sufficient to know that the applied medication is efficacious. They also need an overview of the most important factors influencing the therapeutic outcome. Therefore, clinical trials should be designed and evaluated to find the most important factors for therapeutic success or failure.
The SPICE study (Survival and Prognosis: Investigation of Crataegus Extract WS® 1442 in congestive heart failure) investigated the efficacy and safety of the Crataegus extract WS® 1442 in patients with congestive heart failure (CHF). It was a randomised, double-blind, placebo-controlled, multicentre study. Eligible were adults with CHF of NYHA class II or III and reduced left ventricular ejection fraction (LVEF ≤ 35%). Study patients received 900 mg/day of WS® 1442 or placebo for 24 months as an add-on to already administered cardioactive medications. The study was performed at 145 clinical centres in 13 European countries from 1998 (first patient in) to 2005 (last patient out). A total of 2681 patients were randomised and evaluated. Details of the study protocol were published in [2]. The SPICE study has already been evaluated for efficacy and safety, and the results for these two topics have been published [3]. We now present an extended evaluation of the prognostic factors for scientific purposes, i.e. to provide researchers with additional knowledge from large, high-quality data beyond the usual aspects of efficacy and safety.
The primary outcome of the SPICE study was the number of days between baseline visit and first "cardiac event", which was a composite of cardiac death (sudden cardiac death, death due to progressive heart failure, fatal myocardial infarction), non-fatal myocardial infarction, or hospitalization due to progressive heart failure. An independent and "blinded" outcome committee decided for each patient if an observed event fulfilled the definition of cardiac event of the trial. Results on the primary outcome, secondary outcomes and safety are presented in [3].

Extended Evaluation
Typically, the sponsor terminates work on a study after publication of the results on efficacy and safety. Two of the present authors (CH and WG) were members of the steering committee of the SPICE study, but are not employees of the sponsor. As already mentioned in the introduction, we think that such a large trial, which generated high-quality data, should be evaluated from a general scientific perspective as well. Therefore, we carried out an extended re-evaluation of the SPICE study's data set to find the most important prognostic factors for the cardiac events defined as the study's outcome variable.
Hypothesis testing is widely used for evaluating clinical trials. A significant result delivers a "statistical proof" in a confirmative sense only if (i) the hypothesis was established independently of the data on which it is tested and (ii) the type I error is adjusted for multiple testing when more than one test is computed. However, an extended evaluation of a study often yields interesting results from statistical tests for which these two conditions are not satisfied. Such significant tests cannot be interpreted as "statistical proof", but only as the generation of new hypotheses. An explorative significance has to be tested again with independent data before it becomes a "statistical proof". This important distinction between confirmative and explorative hypothesis testing has to be kept in mind during an extended evaluation of a clinical trial.
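The adjustment for multiple testing named in condition (ii) can be illustrated with a minimal sketch. The Bonferroni and Holm procedures shown here are standard choices and are an assumption of this sketch; the present evaluation interprets all p-values exploratively and applies no such adjustment. The p-values in the usage lines are hypothetical and are not SPICE results.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values (never larger than Bonferroni)."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value may not fall below
        # the adjusted value of any smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values from four explorative tests (illustration only).
example_p = [0.001, 0.02, 0.03, 0.2]
bonferroni_adjusted = bonferroni(example_p)
holm_adjusted = holm(example_p)
```

With these hypothetical inputs, only the smallest p-value stays below 0.05 after either adjustment, which shows why unadjusted explorative results should be read as hypotheses rather than proof.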

Statistical Analysis
A multivariate Cox regression model was employed for the extended re-evaluation of the SPICE study [4,5]. The dependent variable was the time from baseline visit until the first cardiac event occurred. Sixteen potentially relevant explanatory variables, according to the SPICE protocol, were offered for the model, including demographic variables, reasons for CHF, severity of CHF and cardioactive medications. In the SPICE study, 2236 patients were treated according to protocol. All variables necessary for the regression model were available for 2170 patients.
All 16 explanatory variables were examined for multicollinearity. The two reasons for CHF "dilated cardiomyopathy" and "ischemic heart disease" were highly correlated (r = 0.84), so we excluded the "dilated cardiomyopathy" variable from the regression model to avoid computational problems. In order to assess the sensitivity of results when excluding "ischemic heart disease" instead, all analyses were conducted again with "dilated cardiomyopathy".
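The multicollinearity screen can be sketched as a pairwise correlation check on the yes/no indicators. This is an illustrative Python sketch, not the SAS code used for the study; the threshold of 0.8, the variable names and the ten-patient toy data are assumptions made here for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; for two 0/1 variables this is the phi coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.8):
    """Return (name_1, name_2, r) for every pair with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(columns[first], columns[second])
            if abs(r) >= threshold:
                flagged.append((first, second, r))
    return flagged


# Hypothetical indicator data for ten patients (1 = reason present).
toy_data = {
    "ischemic_heart_disease": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    "dilated_cardiomyopathy": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "hypertension":           [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
}
flagged = highly_correlated_pairs(toy_data)
```

In the toy data only the first two indicators are flagged, so one of them would be dropped before fitting, mirroring the decision described above.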
To capture the effect of combinations of cardioactive medications, we created combined binary explanatory variables from the four most common combinations of cardioactive substances (different combinations of glycosides, nitrates, beta-blockers, ACE-inhibitors and diuretics, Table 3). Each of them indicated whether or not a patient received a specific combination. We included these variables in the multivariate Cox regression model to assess their effects on the dependent variable.
We chose a forward selection approach with an entrance level of p=0.10 for selecting the most relevant explanatory variables. Hence, only explanatory variables with a p-value less than 0.10 were included in the final model. To evaluate the impact of the selected entrance level on the final analysis model, a sensitivity analysis with entrance levels of 0.05, 0.15 and 0.2 was also conducted. To get some information on the variables not selected, we computed the same regression model a second time, now forcing the model to keep all 15 potentially relevant explanatory variables (either "dilated cardiomyopathy" or "ischemic heart disease" was excluded to avoid multicollinearity).
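The control flow of forward selection with an entrance level can be sketched as follows. This is a generic illustration, not the SAS procedure used for the study: the callback `entry_p_value` is a hypothetical stand-in for refitting the Cox model with one additional candidate, and the variable names and p-values in the usage lines are invented.

```python
def forward_select(candidates, entry_p_value, entry_level=0.10):
    """Forward selection driven by p-values.

    entry_p_value(selected, candidate) must return the p-value of
    `candidate` when added to a model already containing `selected`.
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        # Score every remaining candidate against the current model.
        scored = [(entry_p_value(selected, name), name) for name in remaining]
        best_p, best_name = min(scored)
        if best_p >= entry_level:
            break  # no remaining candidate meets the entrance level
        selected.append(best_name)
        remaining.remove(best_name)
    return selected


# Hypothetical, fixed p-values; a real analysis would refit the model
# at every step, so the values would depend on `selected`.
HYPOTHETICAL_P = {"var_a": 0.0001, "var_b": 0.005, "var_c": 0.86,
                  "var_d": 0.012, "var_e": 0.12}


def stub_p_value(selected, candidate):
    return HYPOTHETICAL_P[candidate]


chosen_at_010 = forward_select(HYPOTHETICAL_P, stub_p_value, entry_level=0.10)
chosen_at_015 = forward_select(HYPOTHETICAL_P, stub_p_value, entry_level=0.15)
```

Rerunning the same function with different values of `entry_level` corresponds to the sensitivity analysis over entrance levels described above.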
The analyses were conducted with version 9.2 of the SAS® statistical software package, using the PHREG procedure for Cox regression models. For examining multicollinearity we used the CORR procedure.
For all binary explanatory variables, "no" was used as reference. Hence, the hazard ratio gives the risk factor if the explanatory variable is "yes". The references for NYHA, treatment group and gender were "class I/II", "placebo" and "female", respectively. Although the inclusion criteria of the SPICE study specified that only patients with NYHA class II and III were eligible for study participation [6], 8 patients with NYHA class I (Table 1) were in the data set. Those NYHA I patients were therefore included in the NYHA II group for the analyses. All p-values were interpreted in an explorative manner. Table 1 describes the patient characteristics of the 2170 patients included in the regression model. None of these characteristics differs notably from those of the total of 2681 patients randomised in the SPICE study.

General Results
Three reasons for CHF were recorded: ischemic heart disease, dilated cardiomyopathy and hypertension. About two-thirds of patients had only one reason, whereas all three reasons were mentioned for 2.4% of patients. Patients had an average of 1.3 reasons for CHF (Table 2). The most frequent reason was ischemic heart disease. This was the only reason for 38.2% of patients. The dual reasons ischemic heart disease and hypertension applied to 19.4% of the patients, and 4.0% of the patients had ischemic heart disease and dilated cardiomyopathy. Therefore, 64.0% of patients had ischemic heart disease as a reason for CHF. For more details, see Table 2.
Study patients received outpatient treatment by cardiologists in hospitals. They received intensive treatment according to modern standards [7]. According to Table 3, most patients (1856 = 86%) received three, four or five cardiac medications. On average, they received 3.7 cardioactive medications. The most frequently used were diuretics (in 85.7% of patients), ACE inhibitors (83.7%), beta blockers (64.3%), glycosides (56.6%) and nitrates (56.4%). The most frequent combination was diuretics + ACE-inhibitors + beta blockers + glycosides. This combination of four cardioactive medications was received by 11.7% of all patients. The primary outcome was a composite endpoint. Table 4 shows which cardiac events occurred in the patients included in the multivariate regression analysis.

Results of the Multivariate Regression
Nine explanatory variables were selected step by step for the model. They are presented in Table 5 under the headline "significant explanatory variables" in consecutive order with their hazard ratio, corresponding 95% confidence interval and p-value of the final model. All numbers mentioned in the text are rounded. Most important for the occurrence of cardiac events in the investigated patients is the application of a glycoside followed by the administration of an antiarrhythmic medication. The third most important factor is the NYHA class of the patient.
The Cox regression model assumes proportional hazards, so the hazard ratio (HR) can be interpreted as a relative risk. As we used "no" as reference for yes/no variables, the hazard ratio expresses the risk of a cardiac event for a patient with the respective characteristic ("yes") compared to a similar patient without it ("no"). The highest hazard ratios were computed for calcium (Ca) antagonists (2.2), glycosides (2.0) and antiarrhythmics (2.0).
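For reference, the standard form of the proportional hazards model behind this interpretation can be written as follows. The symbols $h_0$, $\beta_j$ and $x_j$ are notation introduced here for illustration and are not taken from the SPICE publications.

```latex
h(t \mid x) = h_0(t)\,\exp\!\Big(\sum_{j} \beta_j x_j\Big),
\qquad
\mathrm{HR}_j
  = \frac{h(t \mid x_j = 1,\ \text{other covariates fixed})}
         {h(t \mid x_j = 0,\ \text{other covariates fixed})}
  = \exp(\beta_j).
```

Because the baseline hazard $h_0(t)$ cancels in the ratio, $\mathrm{HR}_j$ does not depend on time, which is the proportional hazards assumption.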
The computed hazard ratios are estimates. The true value of a hazard ratio may be smaller or larger. The true hazard ratio lies with 95% probability somewhere within the respective confidence interval (CI). The confidence intervals are rather wide despite the sample size of 2170 patients.
Our model delivered statistical significance for 9 variables with p-values ranging from <0.0001 to 0.0177. Of course, these significant results have to be interpreted in an explorative way, i.e. as generation of new hypotheses. However, some of the p-values are rather small, so they can be considered as strong evidence for an effect of the corresponding explanatory variable.

* Results of regression: Explanatory variables selected by the regression model (significant variables) and non-significant explanatory variables. The hazard ratio (HR) describes the extent of the risk. The next column gives the confidence interval (CI) for the hazard ratio. Finally, the p-value for significance is given. Reading example: The intake of a glycoside increases the risk for a cardiac event by a factor of 2.03. The true risk factor is with 95% probability somewhere between 1.68 and 2.45. This is significant.
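The reading example can be checked numerically with a minimal sketch. It assumes that the reported interval is a Wald interval, i.e. symmetric around log(HR) on the logarithmic scale, which is the usual construction for Cox models; the function names are introduced here for illustration.

```python
from math import exp, log

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_se(ci_lower, ci_upper, z=Z_95):
    """Standard error of log(HR) implied by a CI symmetric on the log scale."""
    return (log(ci_upper) - log(ci_lower)) / (2 * z)


def hazard_ratio_ci(hr, se_log_hr, z=Z_95):
    """Wald confidence interval for a hazard ratio."""
    centre = log(hr)
    return exp(centre - z * se_log_hr), exp(centre + z * se_log_hr)


# Values from the glycoside reading example: HR 2.03, 95% CI 1.68 to 2.45.
se = log_scale_se(1.68, 2.45)
lower, upper = hazard_ratio_ci(2.03, se)
```

Under this assumption the recomputed bounds agree with the reported 1.68 and 2.45 to two decimals, and the interval excludes 1.0, which corresponds to the significance stated in the reading example.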

Interpretation of the Estimated Parameters
Patients taking a glycoside have an estimated 2.0-fold risk (95% CI 1.7-2.5) for a cardiac event compared to patients receiving no glycoside (p < 0.0001). The surprising finding that patients with a glycoside have a higher risk for a cardiac event than patients without a glycoside will be discussed in detail. The same holds true for the other investigational cardioactive medications. A patient with an antiarrhythmic medication has a 2.0-fold risk (95% CI 1.6 [...]
The risk for a cardiac event for patients with NYHA class III is 1.6-fold (95% CI 1.3-1.9) compared to similar patients with NYHA II or I (p < 0.0001). There is also an increased risk for a cardiac event by a factor of 1.012 (95% CI 1.004-1.021) for each year a patient gets older (p = 0.0048). Over a period of 10 years the risk increases by a factor of 1.012^10 = 1.13, and over a period of 20 years by a factor of 1.012^20 = 1.27 (95% CI 1.08-1.52). In this evaluation, 95% of the patients were between 38.8 and 80.8 years old, so these calculations are only valid within this age range.
Patients with a higher left ventricular ejection fraction (LVEF) have a lower risk for a cardiac event than patients with a lower LVEF (p = 0.0124), which is not surprising. The quantification of this relationship may be new here. For each increase of the LVEF by one percentage point, the risk for a cardiac event is multiplied on average by a factor of 0.984 (95% CI 0.972 to 0.997). If there are two similar patients, one with an LVEF of 25% and another with an LVEF of 30%, then the patient with the higher LVEF will have a relative risk for a cardiac event of 0.984^5 = 0.92 (95% CI 0.87-0.99) compared to the patient with the worse LVEF. Again, this result is only valid for this LVEF range.
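The rescaling used for age and LVEF follows from the multiplicative form of the Cox model: a change of k units in a continuous variable multiplies the hazard by the per-unit hazard ratio raised to the power k, and the confidence limits are raised to the same power. A minimal sketch, with a function name introduced here for illustration:

```python
def scale_hazard_ratio(hr, ci_lower, ci_upper, units):
    """Hazard ratio and CI for a change of `units` in a continuous covariate."""
    return hr ** units, ci_lower ** units, ci_upper ** units


# Age: per-year HR 1.012 (95% CI 1.004-1.021).
age_10_years = scale_hazard_ratio(1.012, 1.004, 1.021, 10)
age_20_years = scale_hazard_ratio(1.012, 1.004, 1.021, 20)

# LVEF: HR 0.984 per percentage point (95% CI 0.972-0.997), 5-point difference.
lvef_5_points = scale_hazard_ratio(0.984, 0.972, 0.997, 5)
```

Rounded to two decimals, the results reproduce the figures quoted in the text: 1.13 for 10 years, 1.27 (1.08-1.52) for 20 years and 0.92 (0.87-0.99) for a five-point LVEF difference.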
Several explanatory variables were not selected by the Cox regression model with the defined entrance level of p=0.1. Therefore, all information given on these non-significant variables is unreliable. Nevertheless, we give at least a brief interpretation of these variables.
There was a slight tendency for patients with ischemic heart disease to have an increased risk for a cardiac event compared to patients with other reasons (HR=1.1, 95% CI 0.92-1.3, p=0.26). Accordingly, the confidence interval includes the hazard ratio 1.0, which indicates neither a risk nor a benefit. Patients with hypertension showed a somewhat larger risk (HR=1.1, 95% CI 0.92-1.3, p=0.29) for a cardiac event. An alternative model using dilated cardiomyopathy instead of ischemic heart disease revealed no changes in the results; specifically, there was also no significant effect of dilated cardiomyopathy (HR=1.1, 95% CI 0.9-1.3, p=0.55).
We found a slight tendency for patients with AT II antagonists to be at greater risk (HR=1.4, 95% CI 0.92-2.0, p=0.12) of a cardiac event compared to patients who are not treated with AT II antagonists. Again, there is a slight tendency (HR=1.2, 95% CI 0.92-1.5, p=0.20) towards a greater cardiac event risk in patients who take ACE inhibitors compared to patients who do not take ACE inhibitors.
The analysis showed that patients who receive Crataegus treatment have no higher risk of a cardiac event (HR = 0.98, 95% CI 0.83-1.14) than patients who take placebo. However, the respective p-value of 0.75 does not indicate even a trend that Crataegus treatment is beneficial for patients.
Gender was the last selected variable within the regression model. Men have a slightly higher risk for a cardiac event than women (HR= 1.02, 95% CI 0.82-1.26, p= 0.86), but again there was no trend of a substantial difference between male and female patients.

Overall Discussion
The specific medication investigated in a typical controlled randomised study is only one component of therapeutic success or failure. Therefore, it is necessary, but not sufficient, to know its efficacy. A physician and the patient need to know, in principle, all factors relevant for the outcome. We approached this ideal with this extended evaluation of the SPICE study. The major findings are based on explorative statistical tests, which deliver only a yes/no decision. Such a test does not provide information on the importance of the investigated factors. Therefore, it should be followed by an estimate of the size of the effect. Confidence intervals are a good way to see how accurately the effects can be estimated.
The analysis is based on a multivariate Cox regression model. The variable selection was conducted by forward selection. The chosen entrance level of 0.10 is a commonly used level of significance for the selection of variables. As we mentioned in the "Methods" section, a sensitivity analysis with more tolerant limits of 0.15 and 0.2 and with a stricter limit of 0.05 yielded the same results. So, our model seems to be an appropriate choice. The explorative character of the analyses is a limitation of the current study with respect to the interpretation of results. None of the findings can be regarded as a definite conclusion; rather, they are evidence that calls for further investigation. Furthermore, the explanatory variables offered to the Cox regression were restricted to those which were captured in the course of the SPICE study, so other potentially relevant variables may have been disregarded.
NYHA class, age and LVEF influence the prognosis of CHF. This is confirmed by our analysis: The higher the NYHA class, the older the patient and the lower the LVEF, the higher will be the risk for a cardiac event. The plausibility of results indicates that the employed regression model is reasonable.
For all investigational cardioactive medications, except for the Crataegus extract WS® 1442, we found at least a tendency that the application of a cardioactive medication increases the risk for a cardiac event. At first glance, this is surprising. In principle, however, statistics indicate only a relationship between medication and cardiac events and do not say which variable is the cause and which variable is the effect. Therefore, we are by no means sure that the applied medication causes additional cardiac events. This is why cardioactive medication in general should not be declared a "risk factor" in the usual sense, but rather a factor that is associated with the risk of having cardiac events. We see three scenarios for interpreting our results obtained with cardioactive medications (Table 6). All are of pure type, but any situation between these scenarios is possible as well.
Scenario 1 is the simplest statistical interpretation. It says that the increased risk for cardiac events is caused by the medication. That would mean that all these medications are not effective and, even worse, that they increase the risk for cardiac events as an adverse drug reaction.
In scenario 2, the risk is caused by the disease; its symptoms represent an indication for prescribing the medication. This means that patients whose symptoms lead the doctor to prescribe a glycoside, for example, are sicker and have a higher risk for a cardiac event than patients without such symptoms. The medication is not effective in reducing the risk for cardiac events.
In scenario 3, the risk is likewise caused by the disease, whose symptoms are the indication for the prescribed medication. Again, patients whose symptoms lead the doctor to prescribe a glycoside, for example, are sicker and have a higher risk for a cardiac event than patients without such symptoms. In addition, the medication is effective and reduces the risk, but this reduction does not completely compensate for the risk. Therefore, some observed risk remains. A rough formulation of scenario 3 is: "It's better to be healthy than to be sick and well treated."
We think that scenario 3 has the best chance of being realistic. This means that patients with specific symptoms leading to an indication, e.g. for glycosides, have a poorer prognosis per se, because they are sicker and therefore have an increased risk of a cardiac event.
Medication reduces the risk only partly, not completely. The observed result is the risk of the disease with its symptoms and indication minus the effect of the medication. This assumption reflects the results of the large study of the Digitalis Investigation Group [8], which showed that the glycoside digitalis could only reduce hospitalisations but not mortality in patients with chronic heart failure. Scenario 3 is possible not only for glycosides, but also for all investigational cardioactive drugs except Crataegus.
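Scenario 3 can be made concrete with a small numerical sketch. All numbers below are hypothetical and chosen only for illustration; they are not estimates from the SPICE data. For simplicity the sketch works with event risks instead of hazards, and the split into patients "with indication" and "without indication" is an assumption of the example.

```python
def observed_risks(p_indication, p_treat_with, p_treat_without,
                   risk_with, risk_without, drug_effect):
    """Expected event risk among treated and untreated patients.

    drug_effect multiplies the event risk of every patient who takes the
    drug (values below 1 mean the drug is protective).
    """
    treated_with = p_indication * p_treat_with
    treated_without = (1 - p_indication) * p_treat_without
    untreated_with = p_indication * (1 - p_treat_with)
    untreated_without = (1 - p_indication) * (1 - p_treat_without)

    risk_treated = drug_effect * (
        treated_with * risk_with + treated_without * risk_without
    ) / (treated_with + treated_without)
    risk_untreated = (
        untreated_with * risk_with + untreated_without * risk_without
    ) / (untreated_with + untreated_without)
    return risk_treated, risk_untreated


# Hypothetical inputs: half of the patients have the indication, they are
# treated far more often, and the drug lowers each patient's risk by 25%.
risk_treated, risk_untreated = observed_risks(
    p_indication=0.5, p_treat_with=0.9, p_treat_without=0.1,
    risk_with=0.4, risk_without=0.1, drug_effect=0.75,
)
crude_risk_ratio = risk_treated / risk_untreated
```

With these assumed inputs the treated group shows roughly twice the event risk of the untreated group, although the drug is protective for every individual patient. The observed ratio reflects the risk carried by the indication minus the effect of the medication.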
A sensitivity analysis in which we used combined explanatory variables for the four most common combinations of cardioactive medications according to Table 3 revealed no considerable differences in comparison to the results presented in Table 5.
The observed tendency towards an increased risk of cardiac events in hypertensive patients confirms findings already reported in other studies [9,10].
The interpretation of Crataegus is completely different from that of the other cardioactive medications and much easier. In the SPICE study, Crataegus was not administered according to symptoms, but by randomisation. Hence, the scenarios mentioned above do not apply to Crataegus. Due to the design of the study, the observed therapeutic effect of Crataegus can only be caused by its efficacy. A non-significant trend in efficacy of Crataegus was reported in the primary evaluation [3], and the current extended analysis confirmed this non-significant tendency for CHF patients with respect to time to first cardiac event.
The SPICE study and its specific results on efficacy and safety have already been published [3], but with this extended evaluation that used a multivariate regression model we obtained more insights on risk factors for cardiac events or variables which are associated with the risk for having cardiac events, respectively, far beyond the efficacy of Crataegus.

CONCLUSION
High-quality data sets of controlled clinical trials provide a large amount of information that should not be used only to investigate the efficacy and safety of the investigational treatment. By means of an extended evaluation of a large data set from cardiology, we showed the gain of knowledge with respect to prognostic factors for congestive heart failure. This is not the usual scope of companies sponsoring efficacy studies, but it is important for science and for successful CHF therapy. The extended evaluation made it possible to quantify the risk associated with the specific explanatory variables considered, and it showed that patients with a high number of cardioactive medications have a poorer prognosis.