Improved reproducibility in subcortical gray matter atrophy measurement using FIRST and FreeSurfer

Houshang Amiri¹, Antoine Meijerman¹, Martijn D. Steenwijk¹,², Ronald A. van Schijndel³, Frederik Barkhof²,³, Keith S. Cover¹, and Hugo Vrenken¹,²

¹Department of Physics and Medical Technology, VU University Medical Center, Amsterdam, Netherlands, ²Department of Radiology and Nuclear Medicine, VU University Medical Center, Amsterdam, Netherlands, ³Image Analysis Center, VU University Medical Center, Amsterdam, Netherlands

Synopsis

Interest in measuring brain atrophy in neurodegenerative diseases such as multiple sclerosis (MS) and Alzheimer’s disease (AD) is growing. To this end, FreeSurfer and FSL-FIRST are widely used as fully automated algorithms for quantifying brain volume and volume change in both cross-sectional and longitudinal MRI studies. We tested the reproducibility of these methods in measuring deep gray matter atrophy rates in a group of subjects comprising healthy controls, patients with mild cognitive impairment and patients with AD. We showed that the longitudinal mode for FreeSurfer and the highest number of modes for FIRST provide the best reproducibility, which is similar for the two methods.

PURPOSE

Reproducibility of brain atrophy measurement using MRI could play an important role in monitoring the state and progression of some diseases¹. To this end, fully automated algorithms are widely used. We aimed (i) to quantify the reproducibility of two common automated methods, i.e. FreeSurfer and FSL-FIRST, for measuring deep gray matter atrophy rates in a group consisting of healthy controls, mild cognitive impairment (MCI) patients and Alzheimer’s disease (AD) patients; and (ii) to determine whether adjusting the default parameters in FIRST can improve reproducibility.

METHODS

From the multi-center Alzheimer's Disease Neuroimaging Initiative (ADNI)-1 study, 562 subjects (57.3% male) were selected: 114 patients with AD, 277 patients with MCI and 171 healthy controls. All subjects were scanned at 1.5T at baseline and year-1, with two back-to-back (BTB) MPRAGE volumes acquired at each time point. The volume change and percentage volume change (relative to baseline) of the left and right hippocampus, caudate nucleus and putamen were assessed. To this end, two automated software packages were used: FreeSurfer (version 5.3) and FIRST (part of FSL, version 5.0.8). To assess reproducibility of the measured volume change, both BTB MPRAGE scans available for each visit were analyzed to compose two scan pairs: scan 1 at baseline (BTB1) paired with scan 1 at year-1 (BTB1), and scan 2 at baseline (BTB2) paired with scan 2 at year-1 (BTB2, see Figure 1). The volume changes between baseline and year-1 were expressed in μL and as a percentage of baseline volume. Reproducibility of the baseline volume, of the 1-year volume change and of the 1-year percentage volume change was expressed as the median absolute difference between the measurements derived from the two scan pairs. For FreeSurfer, both the cross-sectional and longitudinal processing streams were used. For FIRST, in addition to the default version with the recommended number of “modes” (base functions), alternatives with progressively higher numbers of modes were used: 100, 200, and 336, the last being the maximum number of modes available.
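The reproducibility measure described above can be illustrated with a minimal sketch. It assumes that volume change is computed as year-1 volume minus baseline volume (so atrophy is negative), which the text does not state explicitly. The function names and the three-subject volumes are hypothetical and serve only to show the arithmetic; they are not study data.

```python
import numpy as np


def volume_change(baseline_ul, year1_ul):
    """Return (change in µL, change as % of baseline) per subject.

    Assumption: change = year-1 minus baseline, so atrophy is negative.
    """
    baseline_ul = np.asarray(baseline_ul, dtype=float)
    year1_ul = np.asarray(year1_ul, dtype=float)
    change = year1_ul - baseline_ul
    pct_change = 100.0 * change / baseline_ul
    return change, pct_change


def median_abs_difference(a, b):
    """Median over subjects of |a - b| for two paired measurement sets."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.median(np.abs(a - b)))


# Hypothetical volumes (µL) of one structure for three subjects.
btb1_base = [3500.0, 3200.0, 4000.0]   # scan 1 at baseline
btb1_year1 = [3430.0, 3136.0, 3960.0]  # scan 1 at year-1
btb2_base = [3520.0, 3180.0, 4010.0]   # scan 2 at baseline
btb2_year1 = [3440.0, 3150.0, 3950.0]  # scan 2 at year-1

# One volume change per scan pair (BTB1-BTB1 and BTB2-BTB2).
change1, pct1 = volume_change(btb1_base, btb1_year1)
change2, pct2 = volume_change(btb2_base, btb2_year1)

# Reproducibility: median absolute difference between the two scan pairs.
repro_baseline = median_abs_difference(btb1_base, btb2_base)
repro_change = median_abs_difference(change1, change2)
repro_pct = median_abs_difference(pct1, pct2)

print(f"baseline volume: {repro_baseline:.1f} µL")
print(f"volume change:   {repro_change:.1f} µL")
print(f"% volume change: {repro_pct:.2f} %")
```

A smaller median absolute difference indicates better agreement between the two scan pairs, and hence better reproducibility.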

RESULTS

For all subjects, the mean baseline volume was obtained using cross-sectional FreeSurfer and FIRST with different numbers of modes (Fig. 2). Table 1 lists the median absolute differences between measurements obtained on the two scan pairs for baseline volume, volume change, and percentage volume change. For all three structures, in both the left and right hemispheres, reproducibility of the baseline volume measurement improved for FIRST when the number of modes was increased, and improved for FreeSurfer when longitudinal processing was used. Reproducibility of baseline volume was overall similar between FreeSurfer and FIRST, with FIRST exhibiting slightly better performance. Median absolute BTB differences of % volume change were in the range of expected annual atrophy rates. Moreover, reproducibility of volume change and percentage volume change for FreeSurfer was better for longitudinal processing than for cross-sectional processing. As with the baseline volume, reproducibility of volume change and percentage volume change for FIRST was better for higher numbers of modes. Reproducibility of volume change measurement was overall similar between longitudinal FreeSurfer and FIRST with the maximum number of modes, with FreeSurfer exhibiting slightly better performance. Interestingly, the putamen exhibited generally poorer reproducibility, as expressed by a higher median absolute difference, than the caudate nucleus and hippocampus.

DISCUSSION

Based on median absolute difference, FreeSurfer is slightly more reproducible than FIRST for volume change and % volume change. Further work is needed to investigate the underlying sources of variance using, e.g., linear mixed model analysis as reported by Mulder et al.². For such analysis, visual inspection of the automated segmentations would help to detect software failures and therefore possible outliers.

CONCLUSION

Reproducibility of baseline volume assessment with FreeSurfer can be improved by using FreeSurfer’s longitudinal processing stream. Reproducibility of both baseline volume and volume change can be improved by increasing the number of base functions (“modes”) in FIRST. With the abovementioned settings, reproducibility of volume and volume change measurement of the hippocampus, caudate nucleus, and putamen was comparable between FreeSurfer and FIRST, although FreeSurfer exhibited slightly lower median absolute differences for volume change and FIRST gave lower values for baseline volume.

References

[1] Cover KS, et al. Assessing the reproducibility of the SienaX and Siena brain atrophy measures using the ADNI back-to-back MP-RAGE MRI scans, Psychiatry Res. 193 (2011) 182-190.

[2] Mulder ER, et al. Hippocampal volume change measurement: quantitative assessment of the reproducibility of expert manual outlining and the automated methods FreeSurfer and FIRST, NeuroImage 92 (2014) 169-181.

Figures

Fig. 1 Scheme showing both back-to-back (BTB) scans at each time point. The reproducibility of the volume change is assessed by the difference between volume change 1 and 2. The reproducibility of the baseline volume is assessed by the difference between the volumes obtained from BTB1 and BTB2 at the baseline time point.

Fig. 2 The average of the BTB1 and BTB2 volumes at baseline obtained by cross-sectional FreeSurfer and FIRST.

Table 1 Median absolute difference between both measurements of baseline volume, 1-year volume change, and 1-year percent volume change. As additional information and for completeness, the cross-sectional FreeSurfer results are shown next to the longitudinal FreeSurfer results.

Proc. Intl. Soc. Mag. Reson. Med. 24 (2016) 4103