Analyzing Data Using the General Linear Model
Lars Kasper1,2

1Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland, 2Translational Neuromodeling Unit, IBT, University of Zurich and ETH Zurich, Zurich, Switzerland

Synopsis

The general linear model (GLM) is the most common framework for analyzing task-based fMRI data. In this talk, we motivate its use from the precarious contrast-to-noise situation of fMRI, which requires not only modeling (or fitting) of experimental factors and confounds, but also statistical assessment of their significance in the presence of an irreducible noise floor. The presentation will feature analyses of simulated and measured fMRI data to highlight GLM parameter estimation as well as statistical inference (t-, F-tests) and its representation in statistical parametric maps. Finally, limitations and intricacies of the GLM, e.g. correlated regressors and multiple-comparison correction, are discussed to enable its proper use in practice.

Highlights

· Introduction of the most common framework for analyzing fMRI data: the General Linear Model (GLM)

· Coherent motivation of modeling (experimental factors and confounds) as well as statistical testing from the low contrast-to-noise nature of fMRI

· Geometrical intuition to understand parameter estimation in the GLM

· Understanding the representations of modeling results as statistical parametric maps (SPMs)

· fMRI-specific pitfalls and corrections of the GLM: multiple-comparison correction, non-independent errors, correlated regressors

Target Audience

· Students and researchers with little to no background in fMRI analysis, as well as

· Students and researchers with previous experience in analysis and curiosity about the mechanics behind the ubiquitous "activation maps" used for reporting fMRI results

· No prerequisites required, other than attendance of the previous lectures of this course, i.e. the physiological and physical principles of BOLD-based functional imaging

Outcome and Objectives

After this session, attendees should be able to

· Understand and explain the rationale for using the GLM for hypothesis-driven fMRI analysis

· Design a sensitive and specific GLM for their own research question

· Perform diagnostics on chosen models, in particular when using correlated regressors

· Understand the link between GLM, classical statistical testing (t- and F-tests), and statistical parametric maps (SPMs)

· Know about limitations of the GLM for fMRI and suitable alternatives

Purpose

· In task-based fMRI, we induce an experimental manipulation in the subject which should evoke neuronal processing and downstream localized metabolic and vascular consequences, i.e. changes in BOLD contrast. Thus, the single-voxel time series of image intensities forms the fundamental information unit of our measurement.

· In hypothesis-driven analysis of fMRI data, we formulate putative time courses of BOLD fluctuation that reflect our belief about how the experimental manipulation should manifest in the brain. Then, by comparing these modeled time courses with the measured voxel time courses, we assess where and to what extent such processes happen in the brain, and arrive at a quantification using similarity measures, e.g. correlation.

· However, the contrast-to-noise situation of BOLD is rather precarious, meaning that the signal change we would like to detect is buried under various other fluctuation patterns of similar or much greater magnitude that go on at the same time in the same voxel. During the talk, we will look at a couple of these confounds and their typical sites of occurrence, e.g. subject motion, pulsatile tissue displacement, drifts and thermal noise (a small simulation sketch at the end of this section illustrates the problem).

· In short, it is (nearly) impossible to detect a meaningful BOLD activation in a voxel time course by eye, let alone assess its significance amidst all the other fluctuations – we might merely believe we see a pattern that matches our hypothesis, given the rich dynamics we observe.

· Thus, the detection of BOLD activation in an fMRI analysis is a 2-stage procedure: (1) estimating the BOLD effect size from our data and hypothesized fluctuation patterns and (2) statistically assessing the significance of the estimated effect (inference).

· The general linear model (GLM) for fMRI was introduced in the early 90s as an answer to this challenging task, in that it provided a flexible, computationally feasible (note the ~100'000 voxels with hundreds of time points in an fMRI experiment) modeling framework that comprehensively linked matrix calculus and statistical testing (Friston et al., 1994).
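· As a minimal numerical sketch of this precarious situation (Python, all magnitudes hypothetical), the following simulation buries a boxcar task response under drift and noise of larger amplitude:

```python
import numpy as np

rng = np.random.default_rng(0)
n, tr = 200, 2.0                      # 200 volumes, TR = 2 s (assumed values)
t = np.arange(n) * tr                 # acquisition times in seconds

task = ((t % 40) < 20).astype(float)  # boxcar paradigm: 20 s on, 20 s off
signal = 1.0 * task                   # small task-related change (hypothetical)
drift = 3.0 * t / t.max()             # slow scanner drift, larger than the effect
noise = rng.normal(0.0, 2.0, n)       # noise floor exceeding the effect size

y = 100.0 + signal + drift + noise    # simulated voxel time series (a.u.)
# The effect is invisible by eye in y, yet correlates with the paradigm:
print(np.corrcoef(y, task)[0, 1])
```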

Methods

What is the general linear model?

· For a thorough overview, encompassing intuition, mathematical details and some historical background, (Poline and Brett, 2012) is highly recommended reading.

· General means that we model all effects together and in the same way for all voxels.

o In particular, we model both effects of interest (hypothesized BOLD activation) and confounds in a single step, i.e. each effect becomes a "regressor" in the model.

o Statistically, it is important to note that the term general implies a "multiple linear regression with more than one dependent variable", i.e. more than one time series. Thus, it subsumes ANOVA, ANCOVA, MANOVA, ordinary and multiple linear regression as well as t- and F-tests into one common framework.

· Linear in this context implies two assumptions

o The total BOLD signal is a superposition (sum) of all modeled effects without interactions, i.e. effect size changes in one regressor do not alter the magnitude of others. Note that correlation between the regressors themselves is still possible.

o The BOLD signal scales proportionally with the modeled effect.

· Model means that we capture only an approximation of the real-world data, i.e. that

o the limited complexity of our model allows us to explain the data only partially, i.e. up to a residual.

o on the other hand, the finite amount and wealth of our data only justifies a limited complexity of our model.

How are the parameters of the GLM estimated?

· The parameter estimation process is detailed e.g. in (Christensen, 2011; Kiebel and Holmes, 2007).

· Mathematically, we can write the assumptions of the GLM as a matrix equation $$y = Xβ + ε$$ with

o $$$y$$$ , the voxel time series, a column vector (of $$$N$$$ time points)

o $$$X$$$, the design matrix, where regressors modeling effects and confounds (both of length $$$N$$$) are stacked into $$$M$$$ different columns

o $$$β$$$, the parameter estimates, a column vector ($$$M$$$ elements), one element (effect size) for each modeled regressor

o $$$ε$$$, the residual time series, i.e. any unmodeled fluctuation, a column vector (of $$$N$$$ time points)

· The parameter estimation process finds the least-squares solution to this matrix equation, i.e. the betas that minimize the squared residual $$$|ε|^2$$$ (a numerical sketch follows at the end of this list).

· As a graphical intuition (2 factors, 3 time points, see (Kiebel and Holmes, 2007)), our model spans a plane of possible parameter combinations, but the measured time series is "out of reach" in a 3rd dimension. The best fit or parameter estimate, i.e. the closest point to the data we can reach within the plane, is the point where the residual is an orthogonal vector pointing from the plane towards our measured data point.

· We always find a unique solution for the betas (parameters), which is typically non-zero, but this does not equal a significant effect, because random fluctuations in the signal can correlate with our effect of interest by chance. Thus, we have to estimate how likely such a "Type I" error is, i.e. observing an apparent effect of the fitted magnitude when fitting pure random noise. This is the purpose of statistical testing.
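· As a minimal sketch of the estimation step (Python with simulated data; the regressor choices and magnitudes are assumptions for illustration), the following code stacks regressors into a design matrix and computes the least-squares betas and residuals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.linspace(0.0, 1.0, n)                                # normalized time
task = ((np.arange(n) % 20) < 10).astype(float)             # effect of interest
y = 100.0 + 1.0 * task + 3.0 * t + rng.normal(0.0, 2.0, n)  # simulated voxel

# Design matrix X (N x M): one column per modeled effect, confounds included.
X = np.column_stack([task, t, np.ones(n)])                  # task, drift, baseline

# Least-squares estimate, equivalent to (X'X)^(-1) X'y for a full-rank design.
beta_hat = np.linalg.pinv(X) @ y
residual = y - X @ beta_hat        # epsilon: fluctuation the model cannot explain

print(beta_hat)                    # one effect size per regressor
print(residual @ residual)         # squared residual norm, minimized by the fit
```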

How do we perform statistical testing?

· A detailed overview of statistical inference, including t- and F-tests, is given in (Poline et al., 2007).

· The residual has an important role for assessing how meaningful our modeling is. It collapses all our unexplained data fluctuation into a statistical distribution, characterized by a few (co-)variance parameters.

· Our fitting as outlined above is only correct if our residuals behave like independent, identically distributed Gaussian random noise.

o That’s why it is important to model the confounds as well as the signal of interest, since e.g. pulsatile changes are correlated over time (by the cardiac cycle).

· Then, the residuals provide a good estimate of the variance of the Gaussian noise, which is needed for statistical inference.

· Statistical testing finally amounts to relating the contrast of interest (the beta estimates) to the fluctuations of no interest ("noise").

o Within the talk, we will see how t-tests can be seen as a special contrast-to-noise ratio of our effect of interest relative to the remaining random noise. Similarly, we will see how F-tests compare the explained variance with and without inclusion of the tested effects of interest (a sketch follows at the end of this list).

· Note that confounds are not considered "noise" in this statistical context, since they are not random. If we don't include them in the model, we overestimate the random noise in the data and thus reduce sensitivity.
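· The following sketch (same simulated single-voxel setting as above, all numbers hypothetical) makes both tests concrete: a t-statistic as a contrast-to-noise ratio, and an F-statistic from the extra sum of squares of a reduced model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
t_ax = np.linspace(0.0, 1.0, n)
task = ((np.arange(n) % 20) < 10).astype(float)
y = 100.0 + 1.0 * task + 3.0 * t_ax + rng.normal(0.0, 2.0, n)

X = np.column_stack([task, t_ax, np.ones(n)])    # task, drift, baseline
beta_hat = np.linalg.pinv(X) @ y
residual = y - X @ beta_hat

# t-test: contrast of interest relative to its noise-derived uncertainty,
# t = c'beta / sqrt(sigma2 * c'(X'X)^(-1)c), with sigma2 from the residuals.
c = np.array([1.0, 0.0, 0.0])                    # test the task regressor
dof = n - np.linalg.matrix_rank(X)               # degrees of freedom
sigma2 = residual @ residual / dof               # noise variance estimate
t_stat = (c @ beta_hat) / np.sqrt(sigma2 * (c @ np.linalg.inv(X.T @ X) @ c))
print(t_stat, stats.t.sf(t_stat, dof))           # one-sided p-value

# F-test via extra sum of squares: variance explained with vs. without the effect.
X0 = X[:, 1:]                                    # reduced model, no task regressor
res0 = y - X0 @ (np.linalg.pinv(X0) @ y)
df1 = np.linalg.matrix_rank(X) - np.linalg.matrix_rank(X0)
f_stat = ((res0 @ res0 - residual @ residual) / df1) / sigma2
print(f_stat, stats.f.sf(f_stat, df1, dof))
```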

Results

· During the talk, we will use various examples of simulated and measured fMRI data to understand the mechanics of the GLM.

· We will look at parameter estimates (“fits”) for individual voxels and how they change for different models.

· Furthermore, we will browse parameter maps summarizing fits over the entire brain, and observe their transition into statistical parametric maps (SPMs) by manual estimation of noise variances and extra sums of squares for t- and F-tests, respectively.

· Finally, we will investigate in more detail the impact of including confound regressors on the statistical significance of contrasts of interest.

Discussion and Conclusion

· The GLM is a powerful tool for analyzing task-based fMRI data and typically the first method of choice for this purpose.

o More sophisticated designs are possible than introduced here, e.g. model-based fMRI linking individual learning processes to brain activity (Stephan et al., 2015).

· However, certain limitations of the GLM have to be considered and addressed when using it in practice:

o Variability of the hemodynamic response: The transfer from neural activity to BOLD response is typically modeled by convolving the hypothesized effect time series with the canonical hemodynamic response function (HRF). This response may vary between individuals and even between brain regions, which can be accounted for by an extended basis set, e.g. a Taylor expansion of the HRF, which models small delays and width variations of the response, resulting in multiple regressors conjointly modeling one effect of interest (a sketch follows at the end of this list).

o Multi-collinearity: In more complex designs involving several modeled effects, it often occurs that a set of regressors can form a linear combination highly correlated with another regressor. While this multi-collinearity is no problem for parameter estimation, it impacts the sensitivity of statistical tests performed on such a regressor. Note that both t- and F-tests only account for unique explained variance, i.e. effects in the data that can only be explained by the tested regressor, and not equally by others in the model! This shared variance is only revealed by a joint test for all multi-collinear regressors, or by orthogonalizing regressors in the model.

o Multiple comparison correction: The GLM is a mass-univariate approach, i.e. every voxel time series is fitted independently to the same model. This poses a multiple testing problem given hundreds of thousands of voxels in the brain, in that typical thresholds for refuting the null hypothesis of no effect (p=0.05) would still detect thousands of active voxels in a t- or F-test, even if random noise were the only fluctuation in the data. This can be corrected by a very conservative threshold (e.g. Bonferroni: divide p by the number of voxels), by taking the spatial structure of the residuals over voxels into consideration (random field theory and family-wise error correction, (Worsley, 2007)), or by giving up distributional assumptions and estimating the noise floor from the data (non-parametric statistics, (Winkler et al., 2014)).

o Validity of the distributional assumptions: The outlined assumptions about the noise in the GLM estimation have to be validated by diagnostics on the residuals, i.e. by routinely checking whether these are indeed independently, identically distributed over time points (Poline and Brett, 2012). This becomes even more important for fast imaging sequences with TRs under one second.
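o To make the extended HRF basis set concrete, here is a hedged sketch (Python; the double-gamma parameters and stimulus timing are assumptions, loosely following the SPM convention) that builds a canonical HRF regressor together with its temporal derivative:

```python
import numpy as np
from scipy.stats import gamma

tr, n = 2.0, 200
t_hrf = np.arange(0.0, 32.0, tr)          # 32 s of HRF support
# Double-gamma HRF (assumed shape): peak ~5 s, undershoot ~15 s after onset.
hrf = gamma.pdf(t_hrf, 6) - gamma.pdf(t_hrf, 16) / 6.0
hrf /= hrf.sum()
dhrf = np.gradient(hrf)                   # temporal derivative: small delays

onsets = np.zeros(n)
onsets[::20] = 1.0                        # one stimulus every 40 s (assumed)

# Two regressors that conjointly model a single effect of interest:
reg_canonical = np.convolve(onsets, hrf)[:n]
reg_derivative = np.convolve(onsets, dhrf)[:n]
X_effect = np.column_stack([reg_canonical, reg_derivative])
```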

· Alternatives for analyzing fMRI data should be considered, if the intrinsic assumptions of the GLM do not apply to your research question. Examples include

o Estimation of effect sizes: Classical statistical testing assesses significance of parameter estimates compared to a null hypothesis. To directly estimate the probability of effect sizes above a certain magnitude, one could perform a Bayesian parameter estimation of the GLM (e.g. offered by the SPM software package). While computationally more burdensome, Bayesian methods provide posterior probability maps (PPMs) instead of SPMs, which contain information about the beta parameter distributions instead of point estimates (Friston et al., 2002).

o Multivariate approaches: The assumption that single voxels respond independently to an experimental manipulation is somewhat arbitrary, since voxel boundaries are a consequence of acquisition choices. Instead, multivariate approaches fit the response pattern of several voxels (either adjacent or distal) together and can thus be more sensitive to extended activations.

· However, poor data quality or an experimental design that does not specifically elicit the hypothesized response cannot be compensated for by even the most sophisticated fMRI analysis choice.


References

Christensen, R., 2011. Plane Answers to Complex Questions: The Theory of Linear Models. Springer.

Friston, K.J., Holmes, A.P., Worsley, K.J., Poline, J.-P., Frith, C.D., Frackowiak, R.S.J., 1994. Statistical parametric maps in functional imaging: A general linear approach. Hum. Brain Mapp. 2, 189–210. doi:10.1002/hbm.460020402

Friston, K.J., Penny, W., Phillips, C., Kiebel, S., Hinton, G., Ashburner, J., 2002. Classical and Bayesian Inference in Neuroimaging: Theory. NeuroImage 16, 465–483. doi:10.1006/nimg.2002.1090

Kiebel, S.J., Holmes, A.P., 2007. Chapter 8 - The General Linear Model, in: Friston, K.J., Ashburner, J.T., Kiebel, S.J., Nichols, T.E., Penny, W.D. (Eds.), Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press, London, pp. 101–125.

Poline, J.-B., Brett, M., 2012. The general linear model and fMRI: Does love last forever? NeuroImage 62, 871–880. doi:10.1016/j.neuroimage.2012.01.133

Poline, J., Kherif, F., Pallier, C., Penny, W., 2007. Chapter 9 - Contrasts and Classical Inference, in: Friston, K.J., Ashburner, J.T., Kiebel, S.J., Nichols, T.E., Penny, W.D. (Eds.), Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press, London, pp. 126–139.

Stephan, K.E., Iglesias, S., Heinzle, J., Diaconescu, A.O., 2015. Translational Perspectives for Computational Neuroimaging. Neuron 87, 716–732. doi:10.1016/j.neuron.2015.07.008

Winkler, A.M., Ridgway, G.R., Webster, M.A., Smith, S.M., Nichols, T.E., 2014. Permutation inference for the general linear model. NeuroImage 92, 381–397. doi:10.1016/j.neuroimage.2014.01.060

Worsley, K., 2007. Chapter 18 - Random Field Theory, in: Friston, K.J., Ashburner, J.T., Kiebel, S.J., Nichols, T.E., Penny, W.D. (Eds.), Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press, London, pp. 232–236.


