The repetition of behavioral assessments in diagnosis of disorders of consciousness

To determine whether repeated examinations using the Coma Recovery Scale‐Revised (CRS‐R) have an impact on diagnostic accuracy of patients with disorders of consciousness and to provide guidelines regarding the number of assessments required for obtaining a reliable diagnosis.

the only available tools in clinical centers to assess patients' level of consciousness. To date, the most sensitive and validated scale is the Coma Recovery Scale-Revised (CRS-R 11).
Recent guidelines emphasize the importance of repeated or extended assessments to minimize misdiagnosis attributed to fluctuating levels of consciousness. 9 However, to our knowledge, no study has investigated the number of examinations needed to increase diagnostic accuracy in patients with DOC.
The aims of our study were twofold: (1) to determine whether the diagnosis is influenced by the number of CRS-R assessments and (2) to evaluate the number of CRS-R examinations required to obtain a reliable and accurate diagnosis.

Patients and Methods
We assessed patients with chronic DOC (ie, time since onset longer than 12 months after a traumatic brain injury [TBI] and longer than 3 months after a non-TBI, as determined by the guidelines 12) admitted to the University Hospital of Liège (Belgium) for multimodal assessment of consciousness. Patients underwent at least six standardized behavioral assessments using the CRS-R over a maximum period of 10 days. Note that patients did not have a standardized prescreening assessment before inclusion. All assessments were performed during their stay in our hospital.
Inclusion criteria were a severe acquired brain injury leading to a chronic DOC, an age of at least 18 years, and medical stability. Any modification of pharmacological or rehabilitation treatment during the study period was an exclusion criterion. Patients who were diagnosed as EMCS after the first two assessments were excluded because, by definition, this state is not a DOC. However, patients detected in EMCS later on were kept in the sample because they represent misdiagnosis (ie, they were initially considered as MCS while they were, in fact, EMCS). The EMCS diagnosis was given as soon as the patient showed functional communication or functional object use in two consecutive evaluations, as stated in the guidelines. 7 Patients were, most of the time, assessed in their bed, with the chest raised to increase arousal and avoid sleepiness. Assessments in a wheelchair were rare. At the beginning of each examination, spontaneous movements were observed for at least 1 minute and the arousal protocol was applied if the patient was drowsy, as recommended by the CRS-R manual.
The CRS-R is composed of 23 items distributed in six subscales assessing different functions (ie, auditory, visual, motor and oromotor/verbal functions, communication, and arousal). Each subscale contains multiple items arranged hierarchically, the highest item representing the most complex behavior. While some combinations of these items are impossible according to the scale guidelines, some improbable combinations might indicate specific impairments. 13 The clinical diagnosis is thus based on the presence or absence of operationally defined behavioral responses to specific sensory stimuli (eg, if a response to a command like "move your feet" is observed at least three times out of four trials, the patient is considered to be in MCS+).
We did not use the total score obtained by adding the different subscales, because even though a recent study proposed a cut-off score of 8 to distinguish between patients in UWS/VS and MCS, it still misclassified 7% of patients. 14 A modified score was subsequently proposed that distinguishes UWS/VS from MCS patients based on the presence of signs of consciousness during the assessment. 15 However, it does not allow identification of MCS+ and MCS− patients, nor of EMCS. In this study, we thus diagnosed the patients based on the behavioral responses they showed. The complete CRS-R examination lasted between 25 and 50 minutes, depending on the patient's responsiveness. Patients were assessed at different moments of the day (morning and afternoon), and some CRS-R assessments were performed on the same day. All assessors were well trained and experienced in the use of the CRS-R.
As a dependent variable, we used the clinical diagnosis (UWS/VS, MCS−, MCS+, or EMCS) based on one, two, three, four, five, and six CRS-R assessments taken together (respecting the chronological order of administration). For each time point, the diagnosis was the highest out of the past and current CRS-R evaluations. In other words, if the patient was diagnosed as UWS/VS at the first CRS-R assessment, MCS+ at the second, and UWS/VS at the third, the concluding diagnosis after three CRS-R examinations was MCS+. The highest diagnosis obtained using six CRS-R evaluations was here considered the reference diagnosis.
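The rule described above (keep the highest diagnosis observed so far at each time point) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function name `cumulative_best_diagnosis`, the ASCII labels, and the numeric ranking are assumptions made for illustration, and each input is assumed to already be the per-assessment diagnosis (the two-consecutive-evaluation rule for EMCS is not modeled).

```python
from itertools import accumulate

# Assumed ordinal ranking of the diagnostic categories, lowest to highest.
# "MCS-" and "MCS+" are ASCII stand-ins for the labels used in the text.
DIAGNOSIS_RANK = {"UWS/VS": 0, "MCS-": 1, "MCS+": 2, "EMCS": 3}


def cumulative_best_diagnosis(diagnoses):
    """Return, for each time point, the highest diagnosis observed so far.

    `diagnoses` is a chronologically ordered list of per-assessment labels.
    """
    def higher(best, current):
        # Keep whichever of the two labels ranks higher on the ordinal scale.
        return max(best, current, key=DIAGNOSIS_RANK.__getitem__)

    return list(accumulate(diagnoses, higher))


# Example from the text: UWS/VS, then MCS+, then UWS/VS.
running = cumulative_best_diagnosis(["UWS/VS", "MCS+", "UWS/VS"])
print(running)      # ['UWS/VS', 'MCS+', 'MCS+']
print(running[-1])  # diagnosis after three assessments: MCS+
```

With six assessments as input, the last element of the returned list corresponds to what the text calls the reference diagnosis.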
For the first aim of the study, which was to evaluate the effect of the number of assessments on clinical diagnosis accuracy, we used Friedman's analysis of variance (ANOVA) as a nonparametric test for repeated measurements because our data were not normally distributed. To test for any influence of etiology, we assessed TBI and non-TBI patients separately. We also tested a subgroup of patients whose best diagnosis had been observed at least twice, to eliminate the possibility of false positives biasing the results. To verify that the changes in CRS-R diagnosis were not attributable to spontaneous recovery or to habituation of the patient to the CRS-R, we performed the Friedman ANOVA with the CRS-R assessments arranged in a nonchronological order. We first tested the reverse of the chronological order (6-5-4-3-2-1) and then two random orders (2-5-6-4-3-1 and 3-6-4-2-1-5).
To test for an effect of time since injury or age, we used another analysis, because these variables are continuous. We first created a new variable representing the number of assessments that indicated the final diagnosis. For example, a patient who was diagnosed MCS+, UWS/VS, MCS+, MCS+, UWS/VS, UWS/VS was given a value of 3, because three assessments indicated the final diagnosis. This variable ranged from 1 to 6 and indicated whether the patient fluctuated a lot (low value) or was stable (high value). We then correlated this variable with age and time since onset, to assess whether the variability, and thus the risk of misdiagnosis, was linked to age or time since injury.
For the second aim of the study, which was to define the number of assessments required to accurately diagnose a patient, we compared the diagnosis obtained after each evaluation with the reference diagnosis (based on six CRS-R assessments) using a paired-sample Wilcoxon signed-rank test. To confirm those results, we ran the same analyses on a subgroup of patients who had seven CRS-R examinations within the 10-day period.
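The stability variable and the reordering of assessments described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code (the analyses were run in Statistica). The helper names `reference_diagnosis`, `stability_count`, and `reorder` are hypothetical, as are the ASCII labels and the numeric ranking. The statistical tests themselves are not reproduced here.

```python
# Assumed ordinal ranking of the diagnostic categories, lowest to highest.
DIAGNOSIS_RANK = {"UWS/VS": 0, "MCS-": 1, "MCS+": 2, "EMCS": 3}


def reference_diagnosis(diagnoses):
    """Highest diagnosis across all assessments (the final diagnosis)."""
    return max(diagnoses, key=DIAGNOSIS_RANK.__getitem__)


def stability_count(diagnoses):
    """Number of assessments that indicated the final diagnosis.

    Low values suggest a fluctuating patient, high values a stable one.
    """
    final = reference_diagnosis(diagnoses)
    return sum(1 for d in diagnoses if d == final)


def reorder(diagnoses, order):
    """Rearrange assessments using a 1-based order such as 6-5-4-3-2-1."""
    return [diagnoses[i - 1] for i in order]


# Example from the text: three of the six assessments indicate MCS+.
patient = ["MCS+", "UWS/VS", "MCS+", "MCS+", "UWS/VS", "UWS/VS"]
print(stability_count(patient))  # 3

# Reverse chronological order, one of the control orderings in the text.
reversed_patient = reorder(patient, [6, 5, 4, 3, 2, 1])
print(reversed_patient)
```

Reordering changes which diagnosis is reached after the first few assessments but leaves the final diagnosis and the stability count unchanged, which is why the shuffled analyses can separate fluctuation from a trend over time.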
Finally, to provide a clinical meaning to our data, we characterized the diagnostic error according to the first observed diagnosis (namely, UWS/VS, MCS−, or MCS+). Data were analyzed using Statistica software (version 12; StatSoft, Inc., Tulsa, OK). The study was approved by the ethics committee of the Medical School of the University of Liège, and informed consent was obtained from the patients' legal surrogates.

Results
In the whole group, the diagnoses after one, two, three, and four CRS-R assessments were significantly different from the reference diagnosis (based on six assessments). Table 1 reports misdiagnosis rates and Wilcoxon signed-rank test results. In the subgroup of patients who had seven CRS-R evaluations (58 of the 123 patients), a significant difference was also observed until the fourth diagnosis, as compared to the reference diagnosis of seven CRS-R (χ² (58,6) = 104.11; p < 0.001; see Table 1). The diagnosis observed during the first assessment was used to determine the rate of misdiagnoses after a single CRS-R, as compared to repeated evaluations. Of the 62 patients initially diagnosed as UWS/VS, 22 (35.5%) were finally diagnosed as MCS. Six of these patients (9.5%) were diagnosed as MCS− and 16 (26%) as MCS+. Whereas the missed patients in MCS+ showed a response to command afterward, the missed patients in MCS− subsequently showed one or more of the following behaviors indicative of consciousness: visual pursuit (n = 2); visual fixation (n = 1); automatic motor reactions (n = 2); pain localization (n = 1); and/or object localization (n = 1

Discussion
It has been consistently reported that fluctuations in responsiveness are inherent to patients with DOC and could lead to misdiagnosis. 1,7,9,10,16 The first aim of the study was to provide empirical evidence that those fluctuations have an impact on the clinical diagnosis, and that repeating behavioral assessments can decrease the rate of misdiagnosis. Here, we found that the diagnosis was significantly influenced by the number of evaluations. Hence, a lack of repeated examinations in patients with DOC can lead to an underestimation of patients' level of consciousness. We did not observe any effect of age, etiology, or time since onset. Moreover, when the order of the CRS-R assessments was shuffled, the changes in CRS-R diagnosis were still observed. Altogether, these results indicate that the observed fluctuations do not reflect spontaneous recovery. The second aim of the present study was to determine how many CRS-R assessments are needed for a reliable diagnostic workup. We observed significant differences between the diagnosis based on six CRS-R and those based on one, two, three, and four CRS-R evaluations. These results imply that up to the fourth evaluation, fluctuations in responsiveness still impact diagnostic accuracy. No significant difference was observed between the reference diagnosis (based on six CRS-R) and the diagnosis based on five CRS-R, suggesting that a minimum of five CRS-R assessments is required for a reliable clinical diagnosis in DOC. Moreover, we found a similar outcome in a subgroup of patients who underwent seven CRS-R within the 10-day period, supporting the need for five CRS-R to reach a reliable diagnosis.
Reducing the risk of erroneous clinical diagnosis is of medico-ethical importance, given that prognostic and therapeutic decisions might be influenced by the diagnosis of the patient. 17 Patients' prognoses differ according to the diagnosis made a few weeks or months postinjury, as shown by different studies. 17-19 Rehabilitation decisions might also depend on the diagnosis. It is therefore essential to correctly identify patients evolving to MCS. Furthermore, indication of treatment also depends on the diagnosis. For example, it is known that half of patients in MCS are responsive to transcranial direct current stimulation, whereas patients in UWS do not seem to be. 22 Using a scale that has been standardized and validated is crucial when assessing patients with DOC 1 (the CRS-R is considered the most sensitive 9). To our knowledge, there are, however, no clear recommendations about the repetition of examinations, except for another scale, the Sensory Modality Assessment and Rehabilitation Technique (SMART), which recommends 10 examinations within 3 weeks. 23,24 A preliminary study on a small sample of patients indicated that extended assessment (ie, 10 × 60 minutes with the SMART) might avoid 40% of misdiagnosis as compared to two CRS-R evaluations (ie, 2 × 25-30 minutes according to the authors of this preliminary study). 25 We here showed, in a large sample of patients, that misdiagnosis is equally reduced with repeated CRS-R assessments (ie, five evaluations; 5 × 25-30 minutes).
Our findings show that a "UWS/VS" diagnosis made after the first assessment might be erroneous in 35% of the cases (as compared to the reference diagnosis after six examinations). The diagnosis of any single evaluation, irrespective of the chronology, differs from the reference diagnosis (ranging here from 31% to 37%; mean, 35%). Those results are similar to those of previous studies reporting that 35% to 41% of patients were misdiagnosed as UWS/VS by clinical consensus (compared to the CRS-R). 1-3,5 Moreover, in our study, 26% of the patients initially diagnosed UWS/VS were actually able to answer simple commands (ie, MCS+). Detection of patients in MCS+ is important because command following is the first step toward communication. According to our data, when a clinician did not detect a response to command in a patient at the first testing (UWS/VS or MCS−), the diagnosis was erroneous in 36% of the cases (16/62 UWS/VS and 16/28 MCS−). Previous studies reported that patients in MCS show visual and motor responses more often than auditory responses related to consciousness (ie, response to command), 26,27 whereas a more recent study highlighted a large prevalence of response to command, visual fixation, and visual pursuit among the patients in MCS. 28 However, we here showed that the response to command seems to be more easily detected after several evaluations, which could explain why single assessments in those studies identified a response to command less often. Conversely, another study showed that the auditory subscale, along with the visual one, was responsible for the variability observed in patients with DOC. 29 This is in line with our results, because MCS+ patients (ie, showing a response to command) were frequently missed during the first assessments. At the other end of the spectrum, when patients were directly able to answer simple commands at the first assessment (MCS+), we showed that 18% were finally diagnosed as EMCS.
Finally, we grouped all patients with MCS (MCS+ and MCS−) and observed 10% of misdiagnosis (ie, they should have been diagnosed EMCS), as previously reported. 1 Another 10% of patients with MCS (both MCS+ and MCS−) only showed EMCS signs on one evaluation, but they were not diagnosed as EMCS because they failed to score the same item during the following testing (ie, functional communication or use of objects). This emphasizes the need to confirm the EMCS diagnosis before concluding that those patients are not suffering from DOC. The criteria of EMCS are subject to some controversies, considering that functional communication could be too difficult for patients with posttraumatic confusion. 30 As a result, misdiagnosis between MCS and EMCS might be even higher than what we observed.
Finally, in this study, the diagnosis based on six CRS-R examinations was considered as the reference diagnosis. This clinical reference diagnosis might, however, not necessarily represent the real diagnosis of the patient, because behavioral assessments are not an absolute measure of consciousness and can be influenced by many confounding elements such as examiner-related, patient-related, or environmental factors. 10 One should also keep in mind that despite the statistically significant results indicating that five CRS-R assessments are reliable, a small percentage of patients are still misdiagnosed after five assessments (5%). Ideally, in order to decrease the rate of false negatives, behavioral evaluations should be combined with neuroimaging evaluations. For example, a recent study showed the ability of 18-fluorodeoxyglucose positron emission tomography to detect covert consciousness. 5 Indeed, almost 30% of patients clinically considered as UWS/VS showed brain metabolism more comparable to patients in MCS (nonbehavioral MCS, MCS* 5,31). Some studies also pointed out the usefulness of functional magnetic resonance imaging 32,33 and/or psychophysiological techniques 34,35 to detect nonbehavioral command following (cognitive motor dissociation 36) or brain resting activity compatible with MCS. This implies that even with the most sensitive scale, we might still underestimate the level of consciousness of some patients.
Several limitations of the study should be taken into account. First, we have a bias toward positive evolution, because we kept the best diagnosis reached by each patient. Clinical regression could also occur and go undetected because of the way data were analyzed. Indeed, the highest diagnosis was considered as the reference diagnosis, even if subsequent examinations indicated a lower diagnosis. However, given the shortness of the study period, clinical regression is unlikely. Notably, no patient was initially considered MCS without being diagnosed once again as MCS later. Moreover, additional analyses shuffling the order of the CRS-R allow us to exclude any effect of spontaneous recovery, given that the changes in CRS-R diagnosis were still observed when the assessments were considered backward or in a random order. Second, we did not study the effect of time of assessment because data were not always available, but according to previous studies, morning evaluations might be preferable if one wants to increase the probability of observing signs of consciousness. 29,37 However, this might also depend on individual differences. Finally, one could argue that variability can be attributed to the clinician's subjectivity. In our study, a single testing could modify the reference diagnosis, and bias the results if it was only attributable to the rater. 38 However, besides the known high inter-rater reliability of the CRS-R, 9,11 patients were assessed by skilled and experienced neuropsychologists trained in and accustomed to administering the scale. Moreover, we confirmed the observed variability in a subgroup of patients whose best diagnosis was observed at least twice, reducing the probability of false positives.
In conclusion, the present study confirms that patients with DOC suffer from fluctuations in responsiveness and shows that these fluctuations significantly impact the clinical diagnosis. For both clinical and research purposes, we suggest that patients with chronic DOC be repeatedly assessed (at least five times) in a short time span (eg, 10 days) in order to reduce the influence of behavioral fluctuations.