Validation of 18F–FDG-PET Single-Subject Optimized SPM Procedure with Different PET Scanners

18F–fluoro-deoxy-glucose Positron Emission Tomography (FDG-PET) allows early identification of neurodegeneration in dementia. The use of an optimized method based on the SPM software package highly improves diagnostic accuracy. However, the impact of different scanners for data acquisition on the SPM results and the effects of different pools of healthy subjects on the statistical comparison have not been investigated yet. Images from 144 AD patients acquired using six different PET scanners were analysed with an optimized single-subject SPM procedure to identify the typical AD hypometabolism pattern at single subject level. We compared between-scanners differences on the SPM outcomes in a factorial design. Single-subject SPM comparison analyses were also performed against a different group of healthy controls from the ADNI initiative. The concordance between the two analyses (112 vs. 157 control subjects) was tested using Dice scores. In addition, we applied the optimized single-subject SPM procedure to the FDG-PET data acquired with 3 different scanners in 57 MCI subjects, in order to assess for tomograph influence in early disease phase. All the patients showed comparable AD-like hypometabolic patterns, also in the prodromal phase, in spite of being acquired with different PET scanners. SPM statistical comparisons performed with the two different healthy control databases showed a high degree of concordance (76% average pattern volume overlap and 90% voxel-wise agreement in AD-related brain structures). The validated optimized SPM-based single-subject procedure is influenced neither by the scanners used for image acquisition, nor by differences in healthy control groups, thus implying a great reliability of this method for longitudinal and multicentre studies.


Introduction
In recent decades, increasing evidence has shown that the pathophysiological processes leading to neurodegeneration begin many years before the clinical diagnosis of dementia (Bateman et al. 2012; Jack et al. 2013). It is now clear that when the clinical manifestations of dementia are overt, the neuropathological events in the brain are already at an advanced stage. Thus, one of the most compelling challenges in dementia research is to identify individuals at the earliest (i.e. preclinical or prodromal) stages of degeneration (Villemagne and Chételat 2016). For this reason, in recent years, a large portion of clinical guidelines has centred the diagnosis of neurodegenerative dementias on the supportive use of biomarkers, including 18F-fluoro-deoxy-glucose Positron Emission Tomography (FDG-PET) (McKeith et al. 2005; McKhann et al. 2011a; Albert et al. 2011; Sperling et al. 2011; Rascovsky et al. 2011; Gorno-Tempini et al. 2011). Clinical diagnosis per se has limited accuracy, in particular considering the great overlap in clinical presentation among neurodegenerative disorders, while biomarkers are indicative of the underlying pathology, providing a more accurate differential diagnosis of dementia, even in the earliest stage of the disease (Perani 2014).
Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf
Electronic supplementary material: The online version of this article (doi:10.1007/s12021-016-9322-9) contains supplementary material, which is available to authorized users.
FDG-PET is considered a very accurate and powerful biomarker for the early diagnosis of dementia (Bohnen et al. 2012;Perani 2014), providing in vivo information about the distribution of synaptic functioning (Mosconi et al. 2009). Reductions of cerebral glucose metabolism detected by FDG-PET are associated with early neuronal dysfunctions, preceding tissue loss and atrophy (Bateman et al. 2012;Chetelat et al. 2007;Perani 2014). Metabolic activity reductions were observed not only in several groups of dementia patients, but also in subjects in prodromal disease phases (Anchisi et al. 2005;Cerami et al. 2015;Chételat et al. 2003;de Leon et al. 2001;Landau et al. 2010) and in at-risk individuals, such as in cognitively intact subjects with Alzheimer's disease (AD) family history (Mosconi et al. 2009) or carrying AD-associated autosomal dominant mutations (Bateman et al. 2012).
Although the aforementioned evidence supports the importance of using FDG-PET as an early biomarker of dementia, its usefulness in early identification and in differential diagnosis is still a matter of debate. Recently, a Cochrane review by Smailagic and colleagues questioned the diagnostic and prognostic accuracy of FDG-PET in early prodromal phases, claiming that the existing evidence does not support its use in the clinical setting (Smailagic et al. 2015). However, we believe, in line with the authors themselves and with the European Association of Nuclear Medicine (EANM) (Morbelli et al. 2015a, 2015b), that this conclusion is biased by methodological faults in the reviewed literature. Above all, the lack of a proper objective method for an accurate quantitative assessment of FDG-PET images represents the major constraint. Of note, the evaluation of FDG-PET images is mostly limited to the visual inspection of radiotracer distribution, thus neglecting quantitative and objective measures. Many works have shown the importance of objective measurements of FDG-PET data based on either absolute or relative quantification, with consequent improvement in diagnostic accuracy (Foster et al. 2007; Frisoni et al. 2013; Herholz 2014; Perani et al. 2014b). When FDG-PET images are processed with quantitative or semi-quantitative approaches (e.g. Statistical Parametric Mapping (SPM), Neurostat and AD t-sum), the obtained specificity and sensitivity values for both early and differential diagnosis of dementia increase significantly (see Perani et al. 2014b for a recent overview).
Following this line of research, Perani and Della Rosa et al. (2014) have recently validated an optimized SPM-based single-subject procedure that, through a dedicated pre-processing pipeline and a voxel-by-voxel statistical comparison with a large dataset of healthy controls (HC), allows the identification of brain hypometabolic SPM t-maps in dementia cases at single-subject level with high statistical power (see Methods for a complete description of the procedure). This procedure applies a rigorous statistical analysis without being completely automated and unsupervised, as are the "Probability of ALZheimer" (PALZ) algorithm (Herholz et al. 2002) (implemented in the PMOD software, http://www.pmod.com) and the three-dimensional stereotactic surface projections (3D-SSP) method (Minoshima et al. 1995). Despite the promises of automatic methods, recent studies have demonstrated that these metrics still do not provide a significant diagnostic advantage in the clinical context (Ishii et al. 2006; Morbelli et al. 2015b).
On the contrary, the single-subject SPM optimized procedure proved to be a powerful diagnostic tool, outperforming both visual qualitative assessment of FDG-PET images and the clinical characterization of patients per se. Moreover, it showed a high accuracy both in differential diagnosis and in the longitudinal assessment of mild cognitive impairment (MCI) patients (Cerami et al. 2016; Iaccarino et al. 2015; Perani et al. 2015; Perani, Della Rosa et al. 2014). Taken together, these research studies strongly suggest that the SPM-based semi-quantification of FDG-PET images allows the identification of dementia-specific hypometabolic patterns even in the prodromal stages of the disease and that it can be a crucial tool in supporting early and differential diagnosis of dementia.
With the aim of expanding the use of the optimized single-subject SPM procedure to the wide clinical and research community, we measured its performance on images acquired with different PET scanners representative of the most common technological features introduced in the last two decades. In order to accomplish this comparison, we focused our analysis on a large series of AD patients (N = 144) characterized by the hypometabolic patterns suggestive of AD. This disease-specific pattern of glucose hypometabolism was consistently reported in the well-established literature on independent cohorts and by using different methods for FDG-PET quantification. The typical AD hypometabolic pattern encompasses the temporo-parietal cortices, posterior cingulum, and precuneus (Herholz et al. 2002; Minoshima et al. 2001; Teune et al. 2010). If the optimized single-subject SPM routine is robust and not affected by the type of scanner used, we expect no differences in the hypometabolic AD patterns obtained with different PET devices. We thus tested the possible effects deriving from those technical differences on the resulting SPM t-maps. This is beyond doubt a compelling issue, since in the last two decades PET tomographs have undergone important changes both in the hardware and in the software. Currently, almost all the scanners available on the market, with the only exception of the High-Resolution Research Tomograph (HRRT) scanner (Eriksson et al. 2002), have crystals with side lengths of 4-6 mm (Slomka et al. 2015). No other attempts towards increased resolution were performed, due to the increased noise and complexity of such a system (Slomka et al. 2015). A technical innovation regards the introduction of faster scintillating crystals (lutetium orthosilicate (LSO) and lutetium-yttrium orthosilicate (LYSO)), which allow Time of Flight measurements and high count-rate capabilities.
However, their impact on brain imaging is limited, because of the relatively small size of the brain compared to the Time Of Flight resolution (Bettinardi et al. 2011). Regarding the software, many improvements were introduced in the reconstruction process. For example, statistical reconstruction algorithms improved the modelling of noise and attenuation, increasing image quality (Iatrou et al. 2004;Xuan Liu et al. 2001). Scatter correction techniques were also improved, increasing the final image quantitative accuracy (Iatrou et al. 2006;Sibomana et al. 2012), and allowing the routine use of 3 dimensional imaging (Zaidi 2000), which in turn markedly increases sensitivity (Townsend et al. 1991). In addition, a more accurate geometric modelling of the tomograph has also improved image resolution (Manjeshwar et al. 2007). All these changes produced very important technical advancements, but they also made images less comparable. This would be problematic for longitudinal or retrospective studies, especially if multicentric, where it is common to deal with images obtained from different scanners, often from different generations.
We hypothesize that the validated optimized single-subject SPM method is robust with respect to all these differences. We applied our procedure with images coming from different PET scanners and with different healthy control datasets. This would pave the way to the application of this powerful method for semi-quantification of FDG-PET images across multiple clinical and research settings.

Materials and Methods
Participants
Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. 144 patients with AD from different cohorts were included in the study (95 from the ADNI database, 49 from the Nuclear Medicine Database at San Raffaele Hospital (HSR)). All these participants were classified as having probable Alzheimer's dementia based on an extensive clinical and neuropsychological assessment as well as on positivity for AD-like brain hypometabolism as measured with FDG-PET images. These were acquired on different PET devices (see the section Comparison of Scanner Models for details).
In addition, we included FDG-PET images from 57 amnestic MCI subjects (35 men, 22 women; mean age = 74.05 ± 5.24 years; MMSE =26.6 ± 1.9) acquired with three different tomographs (Siemens HR+, General Electric Discovery LS, General Electric Discovery STE) from the ADNI and the HSR datasets. (See Fig. 3 for representative cases and Supplementary material for a full overview of the SPM t-maps and patient characteristics).
In two previous works, we validated our optimized SPM method in MCI patients (Perani et al. 2015). These studies provided evidence of distinct patterns of hypometabolism underlying the MCI condition before the patients clinically manifested dementia. The different patterns accurately predicted the progression from MCI to different dementia conditions at the clinical follow-up, supporting the crucial role of our single-subject SPM approach in the early recognition of the clinical heterogeneity that underlies the MCI definition and of the risk of progression (Perani et al. 2015).
We downloaded unprocessed FDG-PET images from the ADNI database (see the protocol for more details: http://adni.loni.usc.edu/methods/documents/) in order to have full control over the pre-processing steps. From all the patients available, we selected those acquired with the same scanner, forming groups of at least 10 patients per scanner. We finally obtained a total of 144 patients, acquired on six different PET devices. Patients were grouped according to the scanner used for the acquisition, and their characteristics are reported in Table 1. Differences between groups in age at the time of acquisition, disease duration, Mini-Mental State Examination (MMSE) score, and gender were not significant at ANOVA (used for testing age, disease duration and MMSE differences) and at the Chi-squared test (used for testing gender differences).
In this study, in addition to the database of normal controls implemented in the optimized SPM procedure (HSR-HC) for the SPM single-subject analysis (see Della Perani et al. 2014a), we included a further dataset of healthy elderly subjects from the ADNI database (ADNI-HC). A summary of the characteristics of the two HC databases is reported in Table 2. Age was included in the optimized SPM procedure as a nuisance covariate in order to exclude its effect.
HC and AD patient studies performed in Milan were approved by the HSR Medical Ethics Committee. Both groups provided written informed consent, following detailed explanation of each experimental procedure. ADNI subjects gave written informed consent at the time of enrolment for data collection and completed questionnaires approved by each participating site's Institutional Review Board.
The protocols conformed to the ethical standards of the Declaration of Helsinki for protection of human subjects.

Image Pre-Processing
Images were processed using SPM5 (http://www.fil.ion.ucl.ac.uk/spm). In the first step, images were converted to the Analyze format; multi-frame images then had their individual frames realigned (to correct for possible patient motion) and averaged. The origin of the images was manually set in the proximity of the anterior commissure, in order to bring all the images into the same space. In addition, we performed a careful quality check of the images, an essential procedure allowing the identification of potential artefacts.

Single-Subject SPM Optimized Procedure
The optimized single-subject SPM routine was run to obtain hypometabolic t-maps for each patient. First, each FDG-PET image was spatially normalized by means of a dementia-specific FDG-PET template in the MNI stereotaxic space (Della ). This template was built with 100 FDG images (50 from healthy subjects and 50 from patients with dementia) and showed a high performance for spatial normalization compared to the commonly used H2O template (Della ) (freely available for download at http://www.fil.ion.ucl.ac.uk/spm/ext/). Then, images were smoothed with an isotropic Gaussian kernel (FWHM: 8 × 8 × 8 mm). This is an integral step of the SPM model, and it is performed in order to limit statistical noise, to avoid local effects due to inter-subject anatomical differences and therefore to increase statistical power (Friston 2002). Image intensities were scaled to each subject's global mean (Buchert et al. 2005), in order to account for between-subject uptake variability. The global mean was computed on normalized images after masking out all the non-brain tissue (skull and CSF). We used a standardized mask as previously described and validated (see Della ). Global mean scaling results in a higher signal-to-noise ratio compared to other available scaling methods (e.g. cerebellar reference area) (Dukart et al. 2010). Finally, the warped and smoothed image entered a whole-brain voxel-wise statistical comparison (independent two-sample t-test) with a large database of normal controls (N = 112 HSR-HC or N = 157 ADNI-HC), also controlling for age variability. The output of the comparison was an SPM t-map showing clusters of statistically significant hypometabolic voxels.
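The statistical core of the steps above can be illustrated with a minimal sketch. It is not the SPM5 implementation: it works on plain Python lists rather than image volumes, omits the age covariate, and all function names are hypothetical. It shows the FWHM-to-sigma conversion for a Gaussian kernel, scaling to the in-mask global mean, and the pooled-variance two-sample t statistic for the special case in which one "group" is a single patient.

```python
import math
from statistics import mean, stdev


def fwhm_to_sigma(fwhm_mm):
    """Convert the FWHM of a Gaussian kernel to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def scale_to_global_mean(voxels, brain_mask):
    """Divide every voxel by the mean uptake computed inside the brain mask."""
    in_brain = [v for v, inside in zip(voxels, brain_mask) if inside]
    global_mean = mean(in_brain)
    return [v / global_mean for v in voxels]


def single_subject_t(patient_value, control_values):
    """Pooled-variance two-sample t for the contrast controls > patient.

    With a single patient the pooled variance reduces to the control
    variance, and the degrees of freedom to n_controls - 1.
    Assumption: no covariates (the paper additionally models age).
    """
    n = len(control_values)
    standard_error = stdev(control_values) * math.sqrt(1.0 + 1.0 / n)
    t_value = (mean(control_values) - patient_value) / standard_error
    return t_value, n - 1


# Illustrative (made-up) scaled uptake values at one voxel.
controls = [1.00, 1.05, 0.95, 1.02, 0.98]
t_value, dof = single_subject_t(0.80, controls)
print(f"sigma for 8 mm FWHM: {fwhm_to_sigma(8.0):.3f} mm, t = {t_value:.2f}, df = {dof}")
```

In a real analysis the same statistic would be evaluated at every voxel of the normalized, smoothed and scaled image, followed by thresholding with a multiple-comparison correction.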

Comparison of Scanner Models
Six PET scanners were compared for this work. The most relevant characteristics are reported in Table 3. They are representative of a wide range of available solutions. Reconstruction parameters were standardized across different centres (http://adni.loni.usc.edu/methods/documents/). The reconstruction algorithm used is also reported in Table 3. Contrast images, representing the differences between the individual patient image and the HC group, generated from each single-subject analysis, were used for the subsequent second level analyses. In particular, two analyses were performed, a voxel-wise analysis and a Volume Of Interest (VOI) one.
1. The voxel-wise analysis was performed to evaluate whether the measured patterns were on average the same, independently of the scanner used. In particular, a factorial one-way ANOVA was conducted using SPM5, selecting the "scanner model" as main effect. A threshold of p < 0.05, with an FWE correction for multiple comparisons, was applied.
2. The VOI-based analysis was performed in order to evaluate whether the signal extracted from the precuneus and the posterior cingulate gyrus was different among the AD patients. These regions represent the major hypometabolic signatures associated with AD. The volume of interest (VOI) of the precuneus and the posterior cingulate gyrus was obtained from the Automated Anatomical Labelling (AAL) atlas (Tzourio-Mazoyer et al. 2002). For each patient, we extracted the mean signal in the selected VOI from the contrast images obtained from the SPM single-subject analysis. Then, a one-way ANOVA was performed off-line comparing the extracted mean contrast signals and selecting "scanner model" as the variable of interest.
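The VOI-based analysis in item 2 can be sketched as follows. This is an illustrative stdlib-only example, not the authors' off-line code: the function names are hypothetical and the numbers are invented. It extracts the mean contrast value inside a VOI mask and computes the one-way ANOVA F statistic across scanner groups.

```python
from statistics import mean


def voi_mean(contrast_values, voi_mask):
    """Mean contrast signal over the voxels flagged by the VOI mask."""
    return mean(v for v, inside in zip(contrast_values, voi_mask) if inside)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values.

    Each group holds the per-patient VOI means for one scanner model.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Three hypothetical scanner groups with three patients each.
f_value, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b},{df_w}) = {f_value:.2f}")  # F(2,6) = 3.00
```

The p-value would then be read from the F distribution with the returned degrees of freedom, for instance with a statistics package.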

Comparison between Different Healthy Control Databases
To study the stability of the proposed method when the normal database pool is changed, all the patients were re-analysed at the single-subject level with the identical SPM routine, but using a different set of HC, namely the ADNI-HC cohort.
In accordance with the procedures adopted for building the HSR-HC dataset in Della , FDG-PET images of each ADNI-HC were spatially normalized to the FDG-PET template, and tested in a jackknife approach in order to exclude subjects presenting even minimal hypometabolism (Della ).
Specifically, every normalized FDG-PET scan was evaluated with respect to the remaining sample in SPM5 via a two-sample t-test, so that an SPM t-map was obtained for each HC. Then, all the HC subjects that showed a minimum extent of 10 voxels of significant hypometabolism surviving a p < 0.05 FWE-corrected threshold at the voxel level were excluded.
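The leave-one-out logic of this screening can be sketched as below. This is a simplified illustration under stated assumptions: the t threshold is passed in as a hypothetical parameter (the paper derives it from an FWE correction in SPM5), the extent is counted as the total number of suprathreshold voxels rather than as a contiguous cluster, no covariates are modelled, and every reference sample is assumed to have non-zero variance.

```python
import math
from statistics import mean, stdev


def t_reference_minus_subject(value, reference):
    """Pooled-variance two-sample t (reference > subject) with one subject."""
    n = len(reference)
    standard_error = stdev(reference) * math.sqrt(1.0 + 1.0 / n)
    return (mean(reference) - value) / standard_error


def jackknife_exclusions(controls, t_threshold, min_extent=10):
    """Return indices of controls showing hypometabolism versus the rest.

    controls: list of per-subject voxel lists of equal length.
    t_threshold: hypothetical stand-in for the FWE-corrected threshold.
    """
    excluded = []
    n_voxels = len(controls[0])
    for i, subject in enumerate(controls):
        rest = controls[:i] + controls[i + 1:]
        n_suprathreshold = 0
        for v in range(n_voxels):
            reference = [other[v] for other in rest]
            if t_reference_minus_subject(subject[v], reference) > t_threshold:
                n_suprathreshold += 1
        if n_suprathreshold >= min_extent:
            excluded.append(i)
    return excluded


# Six made-up controls with near-normal uptake and one with globally low uptake.
offsets = [-0.02, -0.01, 0.0, 0.01, 0.02, 0.0]
controls = [[1.0 + o] * 12 for o in offsets] + [[0.7] * 12]
print(jackknife_exclusions(controls, t_threshold=5.0))
```

Only a single pass is shown; whether the original procedure repeats the screening after each exclusion is not stated in the text.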
After the single-subject SPM procedure was run for each AD patient against the two HC datasets, we compared the resulting t-maps using Dice scores as a measure of concordance. A Dice score for binary variables A and B is defined as: $D = \frac{|A \cap B|}{|A \cup B|}$. It takes the value of 1 if A and B assume the same logical value in every pixel, and a value of 0 if they always disagree.
We first applied the Dice method at the volumetric level, as the ratio between the volumes found hypometabolic by the two analyses using the different HC databases in each AD subject. In essence, Dice scores represent the amount of spatial overlap of the identified brain hypometabolic regions. Then, a voxel-wise concordance map was computed as the percentage of times both analyses agreed.
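The two concordance measures can be illustrated with a short sketch on binarized t-maps (1 = hypometabolic voxel). The function names are hypothetical. The first function follows the ratio as written in the text (intersection over union); for comparison, the second gives the form in which the Dice coefficient is conventionally defined, 2|A ∩ B| / (|A| + |B|). Which of the two was actually computed is not specified beyond the formula above, so both are shown.

```python
def overlap_ratio(a, b):
    """|A ∩ B| / |A ∪ B|, the ratio as written in the text."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return intersection / union if union else 1.0


def dice_coefficient(a, b):
    """Conventional Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2.0 * intersection / total if total else 1.0


def concordance_map(maps_hc1, maps_hc2):
    """Per-voxel percentage of patients for whom both analyses agree.

    maps_hc1, maps_hc2: one binary map per patient, obtained with the
    first and second HC database respectively (same patient order).
    """
    n_patients = len(maps_hc1)
    n_voxels = len(maps_hc1[0])
    return [
        100.0 * sum(1 for p in range(n_patients)
                    if bool(maps_hc1[p][v]) == bool(maps_hc2[p][v])) / n_patients
        for v in range(n_voxels)
    ]


a = [1, 1, 1, 0, 0]
b = [1, 1, 0, 1, 0]
print(overlap_ratio(a, b), round(dice_coefficient(a, b), 3))
print(concordance_map([[1, 0], [1, 1]], [[1, 1], [1, 1]]))
```

Note that the concordance map counts agreement on both hypometabolic and non-hypometabolic voxels, which matches the description of agreement "in every pixel".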

Influence of the Scanner Model
Four patients were excluded from the analysis because they showed artefacts at the visual quality inspection. In the remaining ones, each patient showed the typical AD pattern, involving the temporo-parietal cortex, posterior cingulum and the precuneus that together are considered the dysfunctional hallmark of AD (McKhann et al. 2011a, b). This was also clearly seen in the commonality analysis at the second level (Fig. 1).
The ANOVA of the pattern-specific (VOI) analysis revealed no differences between images acquired with different scanners (F(5,138) = 1.7, p = 0.14).
The voxel-wise ANOVA showed no statistically significant differences among the compared scanners, except in the cerebellar cortex. A post-hoc analysis revealed that this difference was due to the HRRT scanner. The HRRT PET device had the most different technical characteristics. Thus, a second post-hoc analysis was performed comparing the HRRT scanner against all the others; the results are shown in Fig. 2.

Application to Early Detection
In order to validate, even in the prodromal dementia phase, the stability of our method when images acquired with different scanners are used, we included FDG-PET images from amnestic MCI subjects acquired with three different tomographs (Siemens HR+, General Electric Discovery LS, General Electric Discovery STE) from the ADNI and the HSR datasets. At clinical follow-up, 18 out of 57 subjects converted to AD and 31 remained stable. All the MCI converters to AD showed the typical AD hypometabolic pattern, even when the FDG-PET images were acquired with different tomographs. Twenty-eight stable MCI subjects showed normal brain metabolism, and 3 stable MCI subjects had AD-like patterns, perhaps requiring a longer follow-up. (See Fig. 3 for some representative cases and Supplementary Materials for a complete overview of all the MCI AD-like patterns).

Influence of Different Healthy Controls Databases
From the HC cases downloaded from the ADNI database, 6 images were excluded for technical reasons (i.e. the image files were not readable). Finally, a total of 157 subjects were kept after the jack-knife testing procedure.
The mean Dice score, obtained comparing the volume of the hypometabolic patterns from the two analyses, was 76%, indicating a good agreement between the two analyses. In particular, this indicates that, on average, the hypometabolic blobs estimated by the two analyses have a 76% overlap.
In Fig. 4, we show the voxel-wise map of Dice scores, representing the agreement in deeming a single voxel hypometabolic in the two analyses with different HC pools. In the core areas of AD-related metabolic impairment, the agreement was higher than 90%, while in the majority of other areas the agreement was generally higher than 80%. This indicates, at the voxel level, that the SPM statistical method using different control databases produced hypometabolic t-Maps with very high levels of spatial concordance.

Discussion
The reported results suggest a significant stability of the single-subject SPM method in the identification of the AD-related pattern of brain hypometabolism in a large series of AD cases. In the first test, the images of brain hypometabolism obtained through the optimized SPM procedure (Perani 2014; Perani, Della Rosa et al. 2014) showed no influence of the PET scanners used for the acquisition. The AD-like hypometabolic pattern was consistently found in each subject, also in AD-converter MCI subjects, and across all the included PET tomographs, which are representative of the majority of scanners currently in use. Our semi-quantitative procedure, without being completely automated and unsupervised, allows the clinician to evaluate directly the cerebral metabolic dysfunctional pattern in single cases. This is a very important aspect for physicians, particularly in clinical settings.
In this paper, we report that the PET scanner used for the subject acquisition does not influence this optimized SPM procedure. The reasons that make this possible are probably multiple. An SPM t-map is obtained by performing t-tests on every voxel throughout the brain. On top of the physiological inter-subject variance, other sources of variance include statistical noise, differences in contrast recovery and anatomical mismatch. The mandatory smoothing step of the SPM procedure greatly reduces most of these factors, in particular the effects of anatomical mismatch (Friston 2002). This procedure also eliminates almost all the statistical noise due to the counting statistics, even though static FDG brain imaging, performed using a long acquisition time and resulting in high organ uptake, already produces very low noise levels. The only remaining confounder is the level of contrast recovery, due to different intrinsic resolution or to the reconstruction procedures. However, as previously shown, most scanners currently available have similar intrinsic resolution. Therefore, as the differences in contrast recovery are already supposed to be limited, the intrinsic resolution is not expected to be influential when images are convolved with a smoothing kernel that is significantly larger.
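The last point can be made concrete with a small worked example. Assuming, as an approximation, that both the scanner point-spread function and the smoothing kernel are Gaussian, their widths combine in quadrature. The intrinsic resolutions of 4 mm and 6 mm used below are purely illustrative values, not measurements of the scanners in this study, and the function name is hypothetical.

```python
import math


def effective_fwhm(intrinsic_fwhm_mm, kernel_fwhm_mm=8.0):
    """FWHM after smoothing, for Gaussian PSF and Gaussian kernel.

    Convolving two Gaussians adds their variances, so the FWHMs
    combine as the square root of the sum of squares.
    """
    return math.sqrt(intrinsic_fwhm_mm ** 2 + kernel_fwhm_mm ** 2)


# Two hypothetical scanners with 4 mm and 6 mm intrinsic resolution.
sharp, blurry = effective_fwhm(4.0), effective_fwhm(6.0)
print(f"{sharp:.2f} mm vs {blurry:.2f} mm")  # 8.94 mm vs 10.00 mm
```

Under these assumptions a 2 mm difference in intrinsic resolution shrinks to roughly 1 mm after the 8 mm kernel, on images whose effective resolution is then about 9 to 10 mm.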
More importantly, to make sure that collecting data in different centres did not compromise data quality, the ADNI collaboration investigated the best way to make PET data as comparable as possible (Joshi et al. 2009), by using an approach based on standardized acquisition procedures, followed by post-processing of the acquired image data. A set of standardized rules was defined to obtain the best possible reconstruction for all the scanners (Alzheimer's Disease Neuroimaging Initiative PET Technical Procedures Manual Version 9.5 2006). The next step in their proposed harmonizing procedure involved correcting for different spatial resolution and for low-frequency effects that presumably result from different scatter and attenuation correction procedures.
The authors reported that the spatial resolution differences could be reduced using smoothing kernels of 6 mm or less (Joshi et al. 2009). This is consistent with our finding that, after the 8 mm smoothing, no differences exist among different scanners. Regarding the low-frequency corrections, the authors state that these are rather small, as shown in a phantom model. Crucially, they state that such corrections are applicable only to phantoms, as scatter and attenuation results may be heavily influenced by each patient's anatomy (Joshi et al. 2009). Therefore, it is expected that inter-patient differences in such phenomena are larger than systematic inter-scanner ones. Systematic differences in scatter and attenuation corrections could be expected to result in localized effects. Indeed, we found small localized differences for the HRRT scanner only, in the cerebellum. Specifically, the cerebellar cortex was found to be slightly more hypometabolic in the scanner comparisons. The HRRT tomograph is the most different in its physical parameters, as its crystals are very small and non-standard methods for reconstruction and corrections are implemented (Eriksson et al. 2002). All the other scanners have very similar intrinsic resolution due to similar crystal dimensions, thus favouring homogeneity in the assessment of hypometabolism.
Another factor that might have contributed to the reported stability of our SPM method is the use of a large HC dataset made with subjects acquired in different centres and with different tomographs, which are representative again of all the most common PET architectures. We have shown that there is  In the second test, we ran the optimized SPM routine implementing healthy controls from a different HC database for the statistical comparison (one European and the other from the US, with slightly different acquisition protocols and acquired with different PET scanners). We found that the patterns estimated by the single-subject optimized procedure had a very high degree of overlap (76%), and the concordance at the voxel level was higher than 90% in the most compromised regions, suggesting a good stability of the method across these two conditions.
The present evidence provides a validation of our optimized single-subject SPM procedure for its use with FDG-PET images acquired with different PET scanners, also in the prodromal AD phase. In addition, the inclusion of different HC databases acquired with various PET scanners is a further demonstration of its reliability, paving the way for using this SPM method also with different HC datasets. This is consistent with a previous result from our group showing that HC images obtained from different PET scanners can be implemented in the SPM single-subject procedure when large datasets of HC (N > 50) are included.
We believe that this single-subject SPM approach could have a positive impact in both research and clinical settings. Indeed, only proper voxel-wise semi-quantifications, such as the one provided by the SPM-based procedure, are able to identify the brain hypometabolic changes with high statistical accuracy (Frisoni et al. 2013; Perani et al. 2014b). FDG-PET as a biomarker of neuronal injury and neurodegeneration not only supports differential diagnosis among dementia conditions according to the research and clinical criteria (Armstrong et al. 2013; Bonanni et al. 2006; Dubois et al. 2014; McKeith et al. 2005; McKhann et al. 2011a, b; Rascovsky et al. 2011), but can also predict the risk of progression to dementia in the prodromal or preclinical phases (Perani et al. 2015). The use of the optimized single-subject SPM procedure increases this accuracy. A crucial requirement for multicentric studies is to compare the single subject with a large number of HC, and in this respect the possibility of using images coming from different scanners and centres is critical. The proven robustness of the method with respect to changes in the scanner hardware and reconstruction parameters is also important when performing large retrospective or longitudinal studies. The need to combine images acquired with different scanners is indeed very common in clinical research, and in retrospective studies where many large databases have been collected and shared across centres (e.g. ADNI). In these situations, the ability to compare data acquired in different centres and over more than a decade is of utmost importance.
Our optimized SPM method is based on the normalization of FDG-PET images to a specific FDG-PET template (Della ). This might be advantageous in clinical settings and in retrospective applications for large databases, where MRI images may not be available. Notably, this optimized SPM routine is able to provide consistent and validated patterns of brain hypometabolism useful in the clinical routine for differential diagnosis (Cerami et al. 2016; Perani et al. 2015; Perani, Della Rosa et al. 2014). A previous study, however, reported increased sensitivity when MRI is used for spatial normalization. Specifically, when MRI-DARTEL normalization was applied, a slight increase in the extent of regional hypometabolism was reported in the comparison between MCI and HC subjects at group level (Martino et al. 2013). Further research is needed to establish both the impact of MRI-based normalization on diagnostic sensitivity in general and whether differences among scanners could arise from its application.

Conclusion
The proposed routine for the SPM analysis of FDG-PET images is robust with respect to the use of different tomographs and to the use of different HC databases. Our data confirm the high value of this approach for diagnosis and prognosis, also in the early disease phase. Notably, its sensitivity, which is independent of the tomograph and of the normal database used for comparison, paves the way for its use in large multicentre research and clinical trials. We thus suggest the application and diffusion of this SPM procedure to other clinical and research centres, with the general aim of fostering the application of quantitative and reproducible FDG-PET assessments.

Information Sharing Statement
Part of the FDG-PET images used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI, RRID:SCR_003007) database (http://adni.loni.usc.edu). The Dementia Specific FDG-PET template can be downloaded from http://www.fil.ion.ucl.ac.uk/spm/ext/#Dementia_PET. The SPM software package (RRID:SCR_007037) can be downloaded from http://www.fil.ion.ucl.ac.uk/spm/software/.

Compliance with Ethical Standards
Conflict of Interest
The authors declare that they have no conflict of interest.