Pre-Processing of fMRI Data
Stephen Strother1

1Rotman Research/Medical Biophysics, Baycrest/University of Toronto, Toronto, ON, Canada

Synopsis

The target audience is researchers and clinicians with limited to no experience with fMRI. As a result of this presentation the audience will know (i) what fMRI pre-processing is, and why it is important, (ii) the basic pre-processing steps and the software packages available for implementing them, (iii) how to choose pre-processing steps for different data sets and experimental paradigms, and (iv) about recent developments in automated optimization of pre-processing of fMRI data.

Introduction.

For many experiments the BOLD fMRI signal reflecting the underlying neural signals of interest is a small fraction of the total BOLD signal variability, which includes large components due to head motion, respiratory and cardiac cycles, scanner effects, etc. The goal of the pre-processing steps in fMRI (i.e., the pre-processing pipeline) is to remove as much of the unwanted non-neural BOLD signal, or “noise”, as possible in order to increase the signal-to-noise ratio (SNR) of the neural signal component. This is typically done separately from the data analysis stage, but recent work uses the results of the analysis stage to automatically adapt the choice of pre-processing steps. Pre-processing of fMRI data is a large research area with hundreds of papers addressing the many complex issues and outcomes. Therefore, in this overview it will only be possible to touch briefly on a selected subset of approaches and results.

Methods and Results.

Many sources of unwanted BOLD signal variability have been identified over the last 20 years, and are described in detail in [1]. A core set of pre-processing correction and denoising steps that is widely used by many researchers is briefly summarised below.

Within-Subject Corrections.

1. Rigid Body Motion Correction (MC): The effects of motion correction vary by dataset: it reduces motion artifact, particularly in children, older groups, and clinical datasets [2, 3]. This is probably the most ubiquitous and potentially important step in fMRI pre-processing, but it may produce biased results with task-coupled motion, and in cases of large BOLD response and relatively small head movements [4].
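To make the rigid-body model concrete, the following sketch builds the 4x4 homogeneous transform from the six parameters (three translations, three rotations) that a motion-correction algorithm estimates for each volume. The function name, the parameter order, and the rotation convention (rotation about x, then y, then z, in radians) are assumptions made for illustration; individual software packages use their own conventions.

```python
import numpy as np

def rigid_body_matrix(tx, ty, tz, rx, ry, rz):
    """Return a 4x4 homogeneous rigid-body transform.

    Translations are in mm, rotations in radians. The rotation order
    (x, then y, then z) is an assumption of this sketch.
    """
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    rot_x = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rot_z = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    transform = np.eye(4)
    transform[:3, :3] = rot_z @ rot_y @ rot_x
    transform[:3, 3] = [tx, ty, tz]
    return transform

# Example: a 1 mm shift along x plus a 2 degree rotation about z.
M = rigid_body_matrix(1.0, 0.0, 0.0, 0.0, 0.0, np.deg2rad(2.0))
point = np.array([10.0, 0.0, 0.0, 1.0])  # homogeneous coordinates (mm)
moved = M @ point
```

Motion correction then amounts to estimating these six numbers per volume and resampling the volume with the inverse transform; the six estimated time courses are also the inputs to motion parameter regression.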

2. Censoring/Scrubbing of outlier brain volumes: Removing outlier timepoints that are caused by abrupt head motion (i.e., scrubbing [5, 6]), or instead replacing them by interpolating from adjacent volumes (i.e., censoring [7]). Scrubbing of scans creates temporal discontinuities that preclude some types of analyses, e.g., spectral power. There have been no major studies of censoring in fMRI task data, and thus its impact and importance as a pre-processing step are largely unknown.
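The difference between the two options can be sketched as follows, assuming the outlier volumes have already been flagged by some motion or intensity criterion (the boolean mask and both function names are hypothetical). Linear interpolation is used purely for brevity.

```python
import numpy as np

def scrub(data, outlier_mask):
    """Drop flagged volumes; the time axis becomes discontinuous."""
    return np.asarray(data)[~np.asarray(outlier_mask, dtype=bool)]

def censor_by_interpolation(data, outlier_mask):
    """Replace flagged volumes by linear interpolation from retained ones.

    data: (n_timepoints, n_voxels) array; outlier_mask: boolean (n_timepoints,).
    """
    data = np.asarray(data, dtype=float)
    outlier_mask = np.asarray(outlier_mask, dtype=bool)
    t = np.arange(data.shape[0])
    good = ~outlier_mask
    repaired = data.copy()
    for v in range(data.shape[1]):
        repaired[outlier_mask, v] = np.interp(t[outlier_mask], t[good], data[good, v])
    return repaired

# Synthetic example: two voxels with a linear ramp and a spike at volume 3.
ramp = np.arange(6, dtype=float)[:, None] * np.array([1.0, 2.0])
spiked = ramp.copy()
spiked[3] += 50.0
mask = np.zeros(6, dtype=bool)
mask[3] = True

scrubbed = scrub(spiked, mask)                    # 5 volumes remain
repaired = censor_by_interpolation(spiked, mask)  # still 6 volumes
```

Scrubbing shortens the series and breaks uniform sampling, which is why spectral analyses become problematic, whereas interpolation preserves the sampling grid at the cost of inserting synthetic values.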

3. Physiological Correction using: This is an important pre-processing step that may often be second only to motion correction, and yet it has not been widely used in much of the existing fMRI literature because of the added experimental complexity of obtaining external physiological measures. Recently this has been ameliorated by the availability of software for multivariate, data-driven physiological estimates that may replace and outperform the use of external measures.

3a. external physiological measures with RETROICOR: A parametric model using external measures of respiration and heartbeat, in which a 2nd-order Fourier series is used to fit voxel time-courses relative to the phase of the cardiac and respiratory cycles [8].
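A minimal sketch of the Fourier-series idea is given below. It assumes that the cardiac (or respiratory) phase at each volume has already been derived from the external recordings; the phase values, function names, and synthetic voxel are illustrative only and are not taken from [8].

```python
import numpy as np

def fourier_phase_regressors(phase, order=2):
    """Cosine/sine regressors up to the given order for a phase in radians."""
    phase = np.asarray(phase, dtype=float)
    cols = []
    for m in range(1, order + 1):
        cols.append(np.cos(m * phase))
        cols.append(np.sin(m * phase))
    return np.column_stack(cols)

def regress_out(data, regressors):
    """Remove the fitted regressors from each voxel, keeping the mean."""
    X = np.column_stack([np.ones(len(regressors)), regressors])
    beta, *_ = np.linalg.lstsq(X, data, rcond=None)
    return data - X[:, 1:] @ beta[1:]

# Hypothetical cardiac phase at each of 200 volumes (assumed pre-computed).
n_t = 200
cardiac_phase = (2 * np.pi * 0.37 * np.arange(n_t)) % (2 * np.pi)

# Synthetic voxel: baseline 100 plus purely cardiac-locked fluctuations.
voxel = 100 + 3 * np.cos(cardiac_phase) + np.sin(2 * cardiac_phase)
data = voxel[:, None]

regressors = fourier_phase_regressors(cardiac_phase, order=2)
cleaned = regress_out(data, regressors)
```

In practice a second set of regressors would be built from the respiratory phase and both sets fitted together.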

3b. multivariate data-driven models: These may be used in place of external physiological measures to estimate physiological noise components, which are then regressed out of the data. A number of such data-driven approaches have been introduced, and shown to significantly improve on using external measures with RETROICOR, e.g., PHYCAA+ [9], PESTICA [10], COMPCOR [11].
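The general data-driven pattern (estimate noise components from the data, then regress them out) can be sketched as below, loosely in the spirit of component-based methods such as CompCor. This is not the published algorithm of any of the cited packages: the choice of noise region, the number of components, and all names are assumptions.

```python
import numpy as np

def noise_components(noise_roi, n_components=5):
    """Leading principal-component time courses of a noise region.

    noise_roi: (n_timepoints, n_noise_voxels), e.g. voxels assumed to
    contain little neural signal.
    """
    centered = noise_roi - noise_roi.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components]

def remove_components(data, components):
    """Project the component time courses out of every voxel, keeping means."""
    mean = data.mean(axis=0)
    centered = data - mean
    beta, *_ = np.linalg.lstsq(components, centered, rcond=None)
    return centered - components @ beta + mean

# Synthetic example: the first noise-region time course leaks into all voxels.
rng = np.random.default_rng(0)
noise_roi = rng.standard_normal((100, 20))
data = rng.standard_normal((100, 30)) + 2.0 * noise_roi[:, :1]

components = noise_components(noise_roi, n_components=3)
cleaned = remove_components(data, components)
```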

4. Slice-timing correction: Correction for timing offsets between axial slices due to the slice ordering of the EPI acquisition. This step is important for single-event and resting state experimental designs, but it remains unclear if it provides a significant benefit to signal detection in block designs [12].
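As a rough illustration, slice-timing correction resamples each slice's time series from the time at which the slice was actually acquired to a common reference time. The sketch below uses linear interpolation for brevity, which is a simplifying assumption; the function name and the timing values are hypothetical.

```python
import numpy as np

def slice_time_correct(ts, slice_offset, tr):
    """Resample one time series to the start of each TR.

    ts: 1D samples acquired at times k * tr + slice_offset (seconds).
    Values before the first acquired sample are held constant.
    """
    n = ts.shape[0]
    acquired = np.arange(n) * tr + slice_offset
    target = np.arange(n) * tr
    return np.interp(target, acquired, ts)

# Example: a slice acquired 1 s into each 2 s TR, observing a linear signal.
tr, offset, n = 2.0, 1.0, 10
acquired_times = np.arange(n) * tr + offset
samples = 2.0 * acquired_times + 1.0
corrected = slice_time_correct(samples, offset, tr)
```

For a slowly varying block response a shift of a fraction of a TR changes the samples very little, which is consistent with the limited benefit reported for block designs.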

5. Spatial Smoothing: Typically performed with a 3D Gaussian kernel, which tends to improve the SNR of features larger than the kernel size and to reduce it for those that are smaller. Typical Gaussian full-width-at-half-maximum (FWHM) sizes range from 4 to 8 mm. Adaptive smoothing approaches exist with available software [13].
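Because smoothing kernels are specified by their full width at half maximum, a conversion to the Gaussian standard deviation is needed: sigma = FWHM / (2 * sqrt(2 * ln 2)), roughly FWHM / 2.355. The following sketch applies a separable Gaussian along each axis using NumPy only; the function names, the truncation at three standard deviations, and the zero-padded edges are assumptions of this example.

```python
import numpy as np

def fwhm_to_sigma(fwhm_mm, voxel_mm):
    """Convert a FWHM in mm to a Gaussian sigma in voxel units."""
    return fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / voxel_mm

def gaussian_kernel_1d(sigma_vox, truncate=3.0):
    """Normalised 1D Gaussian kernel truncated at `truncate` sigmas."""
    radius = int(np.ceil(truncate * sigma_vox))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma_vox) ** 2)
    return kernel / kernel.sum()

def smooth_3d(volume, fwhm_mm, voxel_mm):
    """Separable Gaussian smoothing of a 3D volume (zero-padded edges)."""
    out = np.asarray(volume, dtype=float)
    for axis, vox in enumerate(voxel_mm):
        kernel = gaussian_kernel_1d(fwhm_to_sigma(fwhm_mm, vox))
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out

# Example: smooth a single bright voxel with a 6 mm kernel on 3 mm voxels.
volume = np.zeros((21, 21, 21))
volume[10, 10, 10] = 1.0
smoothed = smooth_3d(volume, fwhm_mm=6.0, voxel_mm=(3.0, 3.0, 3.0))
```

The single-voxel example illustrates the SNR trade-off: a feature smaller than the kernel is spread over its neighbours and its peak amplitude drops.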

6. Temporal Detrending and Filtering: Removes low frequency noise with either a high-pass filter (e.g., with a frequency cut-off of 0.005-0.01 Hz), or by regressing out low-frequency temporal trend components (e.g., Legendre polynomials of order N, with N from 0 to 5). This provides non-specific noise correction, including head motion, scanner drift, and physiological noise [14]. The optimal detrending order has been shown to vary as a function of subject and task design [15-17]. A low-pass filter with a frequency cut-off of 0.1 Hz is also widely used for resting state data analysis [18].
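Polynomial detrending can be sketched as a least-squares fit of a Legendre basis followed by subtraction of the fit. The sketch below uses NumPy's `legvander` to build the basis; the function name and the synthetic drift are illustrative, and the order is left as a free parameter because, as noted above, its optimal value is data dependent.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_detrend(data, order):
    """Remove Legendre polynomial trends of degree 0..order from each voxel.

    data: (n_timepoints, n_voxels). Order 0 removes only the mean.
    """
    x = np.linspace(-1.0, 1.0, data.shape[0])
    basis = legendre.legvander(x, order)  # shape (n_timepoints, order + 1)
    beta, *_ = np.linalg.lstsq(basis, data, rcond=None)
    return data - basis @ beta

# Example: a quadratic scanner-like drift is removed entirely at order 2.
t = np.linspace(-1.0, 1.0, 100)
drift = (5.0 + 3.0 * t + 2.0 * t ** 2)[:, None]
residual = legendre_detrend(drift, order=2)
```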

7. Motion Parameter Regression: The effects of this step vary by dataset: it is used to control residual motion artifact [3, 4, 14], but it may also reduce experimental power, particularly in cases of large BOLD response and low head motion [15, 19, 20]. Particularly for resting state analysis, some researchers have advocated the use of more extensive regression models of residual motion effects, including quadratic terms and 1st-order derivatives of the estimated motion parameters [21]. The use of this step remains controversial in the literature.
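One common way to build such an extended model is sketched below: the six estimated parameters, their first temporal differences, and the squares of both, giving 24 regressors. This particular construction is an assumption made for illustration and is not necessarily the exact model used in [21]; the function name and the random input are hypothetical.

```python
import numpy as np

def expand_motion_parameters(mp):
    """Expand (n_timepoints, 6) motion estimates to 24 nuisance regressors.

    Columns: parameters, first differences, and the squares of both.
    The first difference of the first volume is set to zero.
    """
    mp = np.asarray(mp, dtype=float)
    deriv = np.vstack([np.zeros((1, mp.shape[1])), np.diff(mp, axis=0)])
    return np.column_stack([mp, deriv, mp ** 2, deriv ** 2])

# Example with random stand-ins for the six estimated parameters.
rng = np.random.default_rng(0)
motion = rng.standard_normal((50, 6))
expanded = expand_motion_parameters(motion)
```

Each added regressor costs a temporal degree of freedom, which is one reason the larger models can reduce experimental power in low-motion data.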

8. Additional within-subject pre-processing steps that may be beneficial:

8a. Regression of white matter and CSF time courses: Some portion of physiological and global BOLD variation may be removed by regressing out estimates of the white matter and CSF temporal variability [22, 23]. Some authors suggest that this is superseded by data-driven, physiological noise correction, but a comprehensive comparison has yet to be performed.

8b. Non-neuronal tissue mask of vasculature, sinuses and ventricles: The PHYCAA+ algorithm ([9], www.nitrc.org/projects/phycaa_plus) may be used to estimate subject-specific, vascular masks, to account for inter-subject differences in vasculature. If these voxels are not excluded/down-weighted prior to analysis they can produce false-positive activations, particularly for multivariate analysis models.

8c. Global signal regression: There are large sources of global signal variation in some subjects for which the underlying cause remains unclear, but it may constitute physiological noise [24], neuronal response [25], or a mixture of both. The magnitude of global signal expression appears to be subject-dependent [26, 27], indicating the importance of adaptively estimating it across subjects. The various approaches listed below have yet to be comprehensively compared: (i) Using the spatial mean of scans’ BOLD signals: Can be quite effective but distorts measurement of the spatial connectivity values [28, 29]; (ii) Using PCA: Following a PCA of the fMRI data the PC#1 time-series tends to be highly correlated with global signal effects [30], and residual motion artefacts [31]. Removing it minimizes the distortion of signal independent of global effects, unlike simple regression of mean BOLD signal [28, 29]; (iii) Median angle correction: described in [26], and available in CPAC (http://fcon_1000.projects.nitrc.org/indi/cpac).
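Options (i) and (ii) can be contrasted in a few lines. The sketch below is illustrative only (function names and synthetic data are assumptions); it also shows the property behind the distortion noted for option (i): after regression of the spatial mean, the residual spatial mean is zero at every timepoint.

```python
import numpy as np

def regress_global_mean(data):
    """Option (i): regress the spatial-mean time course out of each voxel.

    data: (n_timepoints, n_voxels).
    """
    centered = data - data.mean(axis=0)
    g = centered.mean(axis=1, keepdims=True)  # (n_timepoints, 1) global signal
    beta = (g.T @ centered) / (g.T @ g)       # (1, n_voxels) fitted weights
    return centered - g @ beta

def remove_first_pc(data):
    """Option (ii): subtract the first principal component."""
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered - s[0] * np.outer(u[:, 0], vt[0])

# Synthetic data: independent voxel noise plus a shared global fluctuation.
rng = np.random.default_rng(0)
shared = rng.standard_normal(60)
data = rng.standard_normal((60, 40)) + 2.0 * shared[:, None]

gsr = regress_global_mean(data)
pc1_removed = remove_first_pc(data)
```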

Between-Subject Registration

The problem of registering data sets across multiple subjects’ brains for group analysis is a large research area in its own right, which cannot be adequately addressed in this short tutorial. The basic issues and tradeoffs are described in [1]. Traditionally registration has been based on aligning fMRI data sets through high resolution structural MRIs, i.e., multiple subjects’ MRIs are registered to a target brain (e.g., http://www.bic.mni.mcgill.ca/ServicesAtlases/ICBM152NLin2009) using some form of non-linear warping algorithm, and the fMRI data sets are then registered via their individual MRIs to the target. For a comparison of the performance of available non-linear registration software see [32]. Recent research trends have focused on generating an implicit, group-specific target volume for minimizing registration errors [33], surface based registration [34], and using more than one modality in the registration process [35, 36].

Selected List of fMRI Pre-processing Software: http://afni.nimh.nih.gov/afni/; http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/; http://www.fil.ion.ucl.ac.uk/spm/; http://fcon_1000.projects.nitrc.org/indi/cpac; http://cibsr.stanford.edu/tools/human-brain-project/artrepair-software.html; http://www.humanconnectome.org/documentation/HCP-pipelines/index.html; http://kendrickkay.net/GLMdenoise/; Large clearing house of fMRI software algorithms: http://www.nitrc.org/

Discussion and Conclusions.

Recently Carp [37] explicitly outlined the variability of fMRI results driven by a range of choices in the preprocessing pipeline and analysis steps. As others have demonstrated during the last 15 years (e.g., [15, 16, 38-41]), Carp establishes that small changes within a preprocessing pipeline and analysis steps may lead to large effects on the output, and suggests that this leads to a much higher risk of inflated false positives than has been generally appreciated in the field. He suggests that problems stemming from the influence of the wide range of pipeline choices being used in relatively low powered experiments “may be mitigated by constraining the flexibility of analytic choices or by abstaining from selective analysis reporting.” I strongly agree with the need to eliminate the selective pipeline and analysis reporting enabled by flexible, manual selection of pre-processing pipelines, and similar highly-biased experimental approaches such as double dipping [42, 43]. To start to address this problem I strongly endorse the associated need for comprehensive and systematic reporting of neuroimaging pipeline steps called for by Poldrack et al. [44] and Carp [45], coupled with software and data sharing efforts, which allow other investigators to directly test results published in the literature [46]. However, constraining the flexibility of pipeline choices is only one possible approach. Another approach, which has been taken recently by multiple groups, is to eliminate the biases inherent in manual selection of pre-processing steps and components, and to automatically optimize choices using predictive, cross-validation frameworks [16, 47-49]. These recent frameworks all attempt to automatically estimate generalizable noise components using adaptively chosen PCA/ICA noise components/subspaces to increase prediction levels across multiple, within-subject and session scanning runs.
As with previous studies using adaptive PCA- or ICA-based component denoising (e.g., [38, 48, 50]), they all report large improvements in SNR and effect size compared to conservative, fixed preprocessing pipelines. In addition, they all demonstrate that optimal preprocessing requires adaptive modeling of the noise variability (i.e., adaptive selection of pre-processing steps) on at least a subject-by-subject basis. These results disagree with Carp’s proposal that for a given sample size it is necessary to “constrain the flexibility of analytic choices”, including pre-processing pipelines, to manage the challenge of overfitting fMRI data sets. Instead, this recent work strongly supports the idea of flexibly adapting preprocessing pipeline choices in order to optimize signal-to-noise extraction in fMRI, provided that the flexibility is managed in an automated, analytic framework using cross-validated performance metrics.
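The idea of choosing a pipeline by a resampled performance metric, rather than by hand, can be illustrated with a deliberately small sketch. It scores each candidate pipeline by the correlation between task-contrast maps computed from two independent runs and keeps the best one. Everything here is an assumption for illustration (the candidate pipelines, the simple contrast map, the single split, the synthetic data); the cited frameworks use richer models, repeated resampling, and prediction as well as reproducibility metrics.

```python
import numpy as np
from numpy.polynomial import legendre

def detrend(data, order):
    """Remove Legendre trends up to `order` from (n_timepoints, n_voxels)."""
    x = np.linspace(-1.0, 1.0, data.shape[0])
    basis = legendre.legvander(x, order)
    beta, *_ = np.linalg.lstsq(basis, data, rcond=None)
    return data - basis @ beta

def contrast_map(data, labels):
    """Mean task volume minus mean baseline volume, per voxel."""
    return data[labels == 1].mean(axis=0) - data[labels == 0].mean(axis=0)

def split_half_reproducibility(run_a, run_b, labels, pipeline):
    """Correlation between contrast maps from two independent runs."""
    map_a = contrast_map(pipeline(run_a), labels)
    map_b = contrast_map(pipeline(run_b), labels)
    return np.corrcoef(map_a, map_b)[0, 1]

def select_pipeline(run_a, run_b, labels, pipelines):
    """Score every candidate pipeline and return the best name and all scores."""
    scores = {
        name: split_half_reproducibility(run_a, run_b, labels, fn)
        for name, fn in pipelines.items()
    }
    return max(scores, key=scores.get), scores

# Synthetic block-design runs: shared activation pattern, drift, and noise.
rng = np.random.default_rng(0)
n_t, n_vox = 80, 50
labels = (np.arange(n_t) // 10) % 2
pattern = rng.standard_normal(n_vox)

def make_run():
    drift = np.outer(np.linspace(0.0, 3.0, n_t), rng.standard_normal(n_vox))
    noise = rng.standard_normal((n_t, n_vox))
    return np.outer(labels, pattern) + drift + noise

pipelines = {
    "no detrending": lambda d: d,
    "detrend order 1": lambda d: detrend(d, 1),
    "detrend order 3": lambda d: detrend(d, 3),
}
best, scores = select_pipeline(make_run(), make_run(), labels, pipelines)
```

Because the score is computed across independent runs, a pipeline cannot improve it simply by fitting noise within a single run, which is the sense in which flexibility is "managed" rather than removed.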

We have learnt a great deal about choosing pre-processing steps and algorithms during the last 20 years, but we are still a long way from understanding what constitutes the best choices in any particular experimental fMRI data set. This is particularly true as a function of age and disease, and in a clinical setting, where much research remains to be done. I predict that the problem of optimizing pre-processing pipelines for a particular data set will be best solved using pipeline management systems that automate pre-processing choices within a resampling framework using resampled performance metrics, such as cross-validated prediction.

Acknowledgements

This work was supported by: CIHR (MOP 84483), NSERC, the Canadian Partnership for Stroke Recovery, and the Ontario Brain Institute.

References

[1] S. C. Strother, "Evaluating fMRI preprocessing pipelines," IEEE Eng Med Biol Mag, vol. 25, pp. 27-41, Mar-Apr 2006.

[2] E. T. Bullmore, M. J. Brammer, S. Rabe-Hesketh, V. A. Curtis, R. G. Morris, S. C. Williams, et al., "Methods for diagnosis and treatment of stimulus-correlated motion in generic brain activation studies using fMRI," Hum Brain Mapp, vol. 7, pp. 38-48, 1999.

[3] J. W. Evans, R. M. Todd, M. J. Taylor, and S. C. Strother, "Group specific optimisation of fMRI processing steps for child and adult data," Neuroimage, vol. 50, pp. 479-90, Apr 1 2010.

[4] L. Freire and J. F. Mangin, "Motion correction algorithms may create spurious brain activations in the absence of subject motion," Neuroimage, vol. 14, pp. 709-722, Sep 2001.

[5] J. D. Power, K. A. Barnes, A. Z. Snyder, B. L. Schlaggar, and S. E. Petersen, "Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion," Neuroimage, vol. 59, pp. 2142-54, Feb 1 2012.

[6] K. R. Van Dijk, M. R. Sabuncu, and R. L. Buckner, "The influence of head motion on intrinsic functional connectivity MRI," Neuroimage, vol. 59, pp. 431-8, Jan 2 2012.

[7] K. L. Campbell, O. Grigg, C. Saverino, N. Churchill, and C. L. Grady, "Age differences in the intrinsic functional connectivity of default network subsystems," Frontiers in aging neuroscience, vol. 5, p. 73, 2013.

[8] G. H. Glover, T. Q. Li, and D. Ress, "Image-based method for retrospective correction of physiological motion effects in fMRI: RETROICOR," Magn Reson Med, vol. 44, pp. 162-167, Jul 2000.

[9] N. W. Churchill and S. C. Strother, "PHYCAA+: An optimized, adaptive procedure for measuring and controlling physiological noise in BOLD fMRI," Neuroimage, vol. 82C, pp. 306-325, May 31 2013.

[10] E. B. Beall, "Adaptive cyclic physiologic noise modeling and correction in functional MRI," J Neurosci Methods, vol. 187, pp. 216-28, Mar 30 2010.

[11] Y. Behzadi, K. Restom, J. Liau, and T. T. Liu, "A component based noise correction method (CompCor) for BOLD and perfusion based fMRI," Neuroimage, vol. 37, pp. 90-101, Aug 1 2007.

[12] R. Sladky, K. J. Friston, J. Trostl, R. Cunnington, E. Moser, and C. Windischberger, "Slice-timing effects and their correction in functional MRI," Neuroimage, vol. 58, pp. 588-94, Sep 15 2011.

[13] W. D. Weeda, L. J. Waldorp, I. Christoffels, and H. M. Huizenga, "Activated Region Fitting: A Robust High-Power Method for fMRI Analysis Using Parameterized Regions of Activation," Human Brain Mapping, vol. 30, pp. 2595-2605, Aug 2009.

[14] T. E. Lund, K. H. Madsen, K. Sidaros, W. L. Luo, and T. E. Nichols, "Non-white noise in fMRI: does modelling have an impact?," Neuroimage, vol. 29, pp. 54-66, Jan 1 2006.

[15] N. W. Churchill, A. Oder, H. Abdi, F. Tam, W. Lee, C. Thomas, et al., "Optimizing preprocessing and analysis pipelines for single-subject fMRI. I. Standard temporal motion and physiological noise correction methods," Human Brain Mapping, vol. 33, pp. 609-27, Mar 2012.

[16] N. W. Churchill, G. Yourganov, A. Oder, F. Tam, S. J. Graham, and S. C. Strother, "Optimizing Preprocessing and Analysis Pipelines for Single-Subject fMRI: 2. Interactions with ICA, PCA, Task Contrast and Inter-Subject Heterogeneity," PLoS One, vol. 7, p. e31147, 2012.

[17] J. Tanabe, D. Miller, J. Tregellas, R. Freedman, and F. G. Meyer, "Comparison of detrending methods for optimal fMRI preprocessing," Neuroimage, vol. 15, pp. 902-7, Apr 2002.

[18] K. R. Van Dijk, T. Hedden, A. Venkataraman, K. C. Evans, S. W. Lazar, and R. L. Buckner, "Intrinsic functional connectivity as a tool for human connectomics: theory, properties, and optimization," Journal of neurophysiology, vol. 103, pp. 297-321, Jan 2010.

[19] T. Johnstone, K. S. Ores Walsh, L. L. Greischar, A. L. Alexander, A. S. Fox, R. J. Davidson, et al., "Motion correction and the use of motion covariates in multiple-subject fMRI analysis," Hum Brain Mapp, vol. 27, pp. 779-88, Oct 2006.

[20] J. M. Ollinger, T. R. Oakes, A. A.L., F. Haeberli, K. M. Dalton, and R. J. Davidson, "The Secret Life of Motion Covariates," in NeuroImage, 2009, p. S122.

[21] T. D. Satterthwaite, D. H. Wolf, K. Ruparel, G. Erus, M. A. Elliott, S. B. Eickhoff, et al., "Heterogeneous impact of motion on fundamental patterns of developmental changes in functional connectivity during youth," Neuroimage, vol. 83, pp. 45-57, Dec 2013.

[22] M. S. Dagli, J. E. Ingeholm, and J. V. Haxby, "Localization of cardiac-induced signal change in fMRI," Neuroimage, vol. 9, pp. 407-15, Apr 1999.

[23] C. Windischberger, H. Langenberger, T. Sycha, E. A. Tschernko, G. Fuchsjager-Mayerl, L. Schmetterer, et al., "On the origin of respiratory artifacts in BOLD-EPI of the human brain," Magnetic Resonance Imaging, vol. 20, pp. 575-582, Oct 2002.

[24] R. M. Birn, J. B. Diamond, M. A. Smith, and P. A. Bandettini, "Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI," Neuroimage, vol. 31, pp. 1536-48, Jul 15 2006.

[25] M. L. Scholvinck, A. Maier, F. Q. Ye, J. H. Duyn, and D. A. Leopold, "Neural basis of global resting-state fMRI activity," Proceedings of the National Academy of Sciences of the United States of America, vol. 107, pp. 10238-43, Jun 1 2010.

[26] H. He and T. T. Liu, "A geometric view of global signal confounds in resting-state functional MRI," Neuroimage, vol. 59, pp. 2339-2348, Feb 1 2012.

[27] C. W. Wong, V. Olafsson, O. Tal, and T. T. Liu, "The amplitude of the resting-state fMRI global signal is related to EEG vigilance measures," Neuroimage, vol. 83, pp. 983-90, Dec 2013.

[28] J. R. Moeller and S. C. Strother, "A regional covariance approach to the analysis of functional patterns in positron emission tomographic data," J Cereb Blood Flow Metab, vol. 11, pp. A121-35, Mar 1991.

[29] K. Murphy, R. M. Birn, D. A. Handwerker, T. B. Jones, and P. A. Bandettini, "The impact of global signal regression on resting state correlations: are anti-correlated networks introduced?," Neuroimage, vol. 44, pp. 893-905, Feb 1 2009.

[30] F. Carbonell, P. Bellec, and A. Shmuel, "Global and system-specific resting-state fMRI fluctuations are uncorrelated: principal component analysis reveals anti-correlated networks," Brain connectivity, vol. 1, pp. 496-510, 2011.

[31] R. P. Woods, M. Dapretto, N. L. Sicotte, A. W. Toga, and J. C. Mazziotta, "Creation and use of a Talairach-compatible atlas for accurate, automated, nonlinear intersubject registration, and analysis of functional imaging data," Human Brain Mapping, vol. 8, pp. 73-79, 1999.

[32] A. Klein, J. Andersson, B. A. Ardekani, J. Ashburner, B. Avants, M. C. Chiang, et al., "Evaluation of 14 nonlinear deformation algorithms applied to human brain MRI registration," Neuroimage, vol. 46, pp. 786-802, Jul 1 2009.

[33] X. Geng, G. E. Christensen, H. Gu, T. J. Ross, and Y. Yang, "Implicit reference-based group-wise image registration and its application to structural and functional MRI," Neuroimage, vol. 47, pp. 1341-51, Oct 1 2009.

[34] B. D. Argall, Z. S. Saad, and M. S. Beauchamp, "Simplified intersubject averaging on the cortical surface using SUMA," Hum Brain Mapp, vol. 27, pp. 14-27, Jan 2006.

[35] B. R. Conroy, B. D. Singer, J. S. Guntupalli, P. J. Ramadge, and J. V. Haxby, "Inter-subject alignment of human cortical anatomy using functional connectivity," Neuroimage, vol. 81, pp. 400-11, Nov 1 2013.

[36] M. P. Heinrich, M. Jenkinson, M. Bhushan, T. Matin, F. V. Gleeson, S. M. Brady, et al., "MIND: modality independent neighbourhood descriptor for multi-modal deformable registration," Med Image Anal, vol. 16, pp. 1423-35, Oct 2012.

[37] J. Carp, "On the plurality of (methodological) worlds: estimating the analytic flexibility of FMRI experiments," Frontiers in neuroscience, vol. 6, p. 149, 2012.

[38] S. LaConte, J. Anderson, S. Muley, J. Ashe, S. Frutiger, K. Rehm, et al., "The evaluation of preprocessing choices in single-subject BOLD fMRI using NPAIRS performance metrics," Neuroimage, vol. 18, pp. 10-27, Jan 2003.

[39] S. LaConte, S. Strother, V. Cherkassky, J. Anderson, and X. Hu, "Support vector machines for temporal classification of block design fMRI data," Neuroimage, vol. 26, pp. 317-29, Jun 2005.

[40] J. B. Hopfinger, C. Buchel, A. P. Holmes, and K. J. Friston, "A study of analysis parameters that influence the sensitivity of event-related fMRI analyses," Neuroimage, vol. 11, pp. 326-33, Apr 2000.

[41] J. B. Poline, S. C. Strother, G. Dehaene-Lambertz, G. F. Egan, and J. L. Lancaster, "Motivation and synthesis of the FIAC experiment: Reproducibility of fMRI results across expert analyses," Hum Brain Mapp, vol. 27, pp. 351-9, May 2006.

[42] N. Kriegeskorte, W. K. Simmons, P. S. Bellgowan, and C. I. Baker, "Circular analysis in systems neuroscience: the dangers of double dipping," Nature neuroscience, vol. 12, pp. 535-40, May 2009.

[43] N. Kriegeskorte, M. A. Lindquist, T. E. Nichols, R. A. Poldrack, and E. Vul, "Everything you never wanted to know about circular analysis, but were afraid to ask," Journal of cerebral blood flow and metabolism : official journal of the International Society of Cerebral Blood Flow and Metabolism, vol. 30, pp. 1551-7, Sep 2010.

[44] R. A. Poldrack, P. C. Fletcher, R. N. Henson, K. J. Worsley, M. Brett, and T. E. Nichols, "Guidelines for reporting an fMRI study," Neuroimage, vol. 40, pp. 409-14, Apr 1 2008.

[45] J. Carp, "The secret lives of experiments: methods reporting in the fMRI literature," Neuroimage, vol. 63, pp. 289-300, Oct 15 2012.

[46] J. B. Poline, J. L. Breeze, S. Ghosh, K. Gorgolewski, Y. O. Halchenko, M. Hanke, et al., "Data sharing in neuroimaging research," Frontiers in neuroinformatics, vol. 6, p. 9, 2012.

[47] K. N. Kay, A. Rokem, J. Winawer, R. F. Dougherty, and B. A. Wandell, "GLMdenoise: a fast, automated technique for denoising task-based fMRI data," Frontiers in neuroscience, vol. 7, p. 247, 2013.

[48] S. C. Strother, S. La Conte, L. Kai Hansen, J. Anderson, J. Zhang, S. Pulapura, et al., "Optimizing the fMRI data-processing pipeline using prediction and reproducibility performance metrics: I. A preliminary group analysis," Neuroimage, vol. 23 Suppl 1, pp. S196-207, 2004.

[49] G. Salimi-Khorshidi, G. Douaud, C. F. Beckmann, M. F. Glasser, L. Griffanti, and S. M. Smith, "Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers," Neuroimage, vol. 90, pp. 449-68, Apr 15 2014.

[50] M. J. McKeown, Z. J. Wang, R. Abugharbieh, and T. C. Handy, "Increasing the effect size in event-related fMRI studies - Getting more in less time with ICA denoising," IEEE Engineering in Medicine and Biology Magazine, vol. 25, pp. 91-101, 2006.



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)