Huanjie Li1, Lisa D. Nickerson2, Yang Fan3, Thomas E. Nichols4, and Jia-Hong Gao5
1Department of Biomedical Engineering, Dalian University of Technology, Dalian, People's Republic of China, 2McLean Imaging Center, McLean Hospital/Harvard Medical School, Belmont, MA, United States, 3GE Healthcare, MR Research China, Beijing, People's Republic of China, 4Department of Statistics and Warwick Manufacturing Group, University of Warwick, Coventry, United Kingdom, 5Center for MRI Research, Peking University, Beijing, People's Republic of China
Synopsis
Threshold-free cluster enhancement (TFCE) outperforms the cluster-size test (CST) based on random field theory, and our recent papers introduced two voxelation-corrected CSTs (v-CST and vn-CST) that also show a clear advantage over other CSTs. However, it is not clear which of these methods performs better for MRI data analysis. This work provides a careful, fair and thorough evaluation of these statistical methods, which may be particularly appealing for group-level MRI data analysis.
Purpose
Threshold-free cluster enhancement (TFCE) based on permutation testing outperforms the original cluster-size test (CST) based on random field theory 1, 2. Our recent papers introduced two voxelation-corrected CSTs (v-CST and vn-CST) that also show a clear advantage over other CSTs 3, 4. However, it is not clear which of these methods performs better for MRI data analysis. This work provides a careful, fair and thorough evaluation of these statistical methods. The aim is to investigate the effectiveness of v-CST, vn-CST and TFCE under different degrees of freedom (dfs), smoothness levels and signal-to-noise ratios (SNRs), for both stationary and non-stationary images, in group-level analysis.
Methods
Simulated null data: Monte Carlo simulations were used to generate both stationary and non-stationary null data, using a strategy similar to that implemented by Li et al. (2015)4. For each realization, three sets of two-group data consisting of 64 x 64 x 32 Gaussian images were generated, and a two-sample t-test was used to calculate the statistic images with df = 18, 38 and 58, respectively. For the stationary null data simulation, the full widths at half maximum (FWHM) of the applied Gaussian kernels were 0, 3, 6 and 9 voxels. For the non-stationary null data, each white noise image was smoothed with three different 3D Gaussian kernels, producing three images with low, medium and high smoothness. Six different smoothness settings were used to simulate different levels of non-stationarity. For each sample size, 2000 realizations were generated. Each test's rejection rate was calculated as the number of realizations that contained detected clusters divided by the total number of realizations.
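The stationary null simulation and the rejection-rate calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes equal group sizes (so df = 2n - 2 gives n = 10, 20 and 30 per group), uses SciPy's Gaussian filter for the smoothing, and all function names are hypothetical. The Monte Carlo standard error is the usual binomial approximation and is not part of the original abstract.

```python
import numpy as np
from scipy import ndimage, stats

# Factor converting a Gaussian kernel's FWHM to its standard deviation.
FWHM_TO_SIGMA = 1.0 / np.sqrt(8.0 * np.log(2.0))


def simulate_null_t_image(n_per_group, fwhm_vox, shape=(64, 64, 32), seed=None):
    """Return one two-sample t image from smoothed white noise (no signal)."""
    rng = np.random.default_rng(seed)
    # One white-noise image per subject; both groups share the same distribution.
    images = rng.standard_normal((2 * n_per_group,) + tuple(shape))
    sigma = fwhm_vox * FWHM_TO_SIGMA
    if sigma > 0:  # FWHM = 0 means unsmoothed data
        images = np.stack([ndimage.gaussian_filter(img, sigma) for img in images])
    # Voxel-wise two-sample t-test with df = 2 * n_per_group - 2.
    t_map, _ = stats.ttest_ind(images[:n_per_group], images[n_per_group:], axis=0)
    return t_map


def rejection_rate(detected):
    """Fraction of realizations in which at least one cluster was detected."""
    return float(np.mean(np.asarray(detected, dtype=bool)))


def monte_carlo_se(rate, n_realizations):
    """Binomial standard error of an estimated rejection rate."""
    return float(np.sqrt(rate * (1.0 - rate) / n_realizations))
```

With 2000 realizations and a nominal rate of 0.05, `monte_carlo_se(0.05, 2000)` is about 0.0049, so observed rejection rates between roughly 0.040 and 0.060 are compatible with nominal control at the 95% level.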
Simulated activation data: A template of the medial visual resting state network5 was used as the ground truth spatial pattern of activation. Ground truth signals were assigned a value of 0 in background voxels and a peak value of 1 in network voxels. The signal was scaled by 1, 3 or 5 and added to the unsmoothed simulated non-stationary data to give peak SNR values of 1, 3 and 5, respectively. Twenty realizations were generated for each sample size (df = 18, 38 and 58). The smoothness levels were the same as in the null data simulation. Receiver operating characteristic (ROC) curves were used to compare each method's performance with non-stationary activation data.
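The signal injection and a voxel-wise ROC comparison can be illustrated with the sketch below. It is a hedged example rather than the procedure actually used: it assumes unit-variance noise (so the scaling factor equals the peak SNR), treats any voxel-wise score map (for example a TFCE-enhanced image) as the detector output, and uses hypothetical function names. Tied scores are not handled specially.

```python
import numpy as np


def add_signal(noise, template, peak_snr):
    """Add a ground-truth template (0 = background, peak 1) scaled by peak_snr."""
    return np.asarray(noise, dtype=float) + peak_snr * np.asarray(template, dtype=float)


def roc_auc(score_map, truth_mask):
    """Return (fpr, tpr, auc) obtained by sweeping a threshold over the scores."""
    scores = np.ravel(score_map)
    truth = np.ravel(truth_mask).astype(bool)
    # Visit voxels from highest to lowest score; each step lowers the threshold.
    order = np.argsort(-scores, kind="stable")
    truth_sorted = truth[order]
    true_pos = np.cumsum(truth_sorted)
    false_pos = np.cumsum(~truth_sorted)
    tpr = np.concatenate(([0.0], true_pos / truth.sum()))
    fpr = np.concatenate(([0.0], false_pos / (~truth).sum()))
    # Area under the curve by the trapezoidal rule.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc
```

An AUC of 1 means every network voxel scores above every background voxel, while 0.5 corresponds to chance-level separation.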
VBM data: Structural images for two group sizes were obtained from the ADNI database for the VBM analysis: 65 images (small group; 34 normal control (NC) subjects and 31 patients with Alzheimer’s Disease (AD)) and 82 images (larger group; 42 NC subjects and 40 AD patients). An optimized VBM protocol was implemented using FSL-VBM. Two different smoothing kernels with δ = 3 and 4 mm were applied.
For CST inference, two commonly used cluster-defining thresholds (t = 2.5 and 3.5) were applied. For the TFCE test, the number of permutations was set to 5000 with the default connectivity. The significance level of the tests was set to 0.05.
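For readers unfamiliar with TFCE, the sketch below illustrates the enhancement step and a max-statistic permutation null for a two-group design. It is an assumption-laden illustration, not the software used in this work: the exponents E = 0.5 and H = 2 are the values recommended by Smith and Nichols1, the step size `dh` and all function names are hypothetical, only positive statistic values are enhanced, and connectivity follows SciPy's default (face-connected neighbours), which may differ from the default of the software actually used.

```python
import numpy as np
from scipy import ndimage, stats


def tfce(stat_map, dh=0.1, E=0.5, H=2.0, structure=None):
    """Approximate TFCE(p) = sum over h of extent(h)^E * h^H * dh (positive side)."""
    stat = np.asarray(stat_map, dtype=float)
    enhanced = np.zeros_like(stat)
    n_steps = int(np.floor(stat.max() / dh)) if stat.max() > 0 else 0
    for k in range(1, n_steps + 1):
        h = k * dh
        # Label the connected clusters that survive threshold h.
        labels, n_clusters = ndimage.label(stat >= h, structure=structure)
        if n_clusters == 0:
            break
        sizes = np.bincount(labels.ravel())
        sizes[0] = 0  # background voxels get no support
        enhanced += (sizes[labels] ** E) * (h ** H) * dh
    return enhanced


def permutation_max_tfce(group_a, group_b, n_perm=5000, seed=None):
    """Null distribution of the maximum TFCE score under group relabelling."""
    rng = np.random.default_rng(seed)
    data = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(len(data))
        t_map, _ = stats.ttest_ind(data[perm[:n_a]], data[perm[n_a:]], axis=0)
        max_null[i] = tfce(t_map).max()
    return max_null
```

Under this scheme a voxel would be declared significant at the 0.05 level when its observed TFCE score exceeds the 95th percentile of the returned maxima, which controls the family-wise error rate provided the subjects are exchangeable between groups under the null hypothesis.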
Results and Discussion
Figs. 1 and 2 show the FWE-corrected rejection rates on simulated stationary and non-stationary null data, respectively. The performance of the CST methods depends on the intensity threshold. Compared with the CST methods, TFCE is more stable in controlling the false positive rate. With a suitable intensity threshold, the performance of CST inference and TFCE is comparable.
Fig. 3 shows the area under the ROC curve (AUC) results on simulated stationary activation data. For high SNR (SNR ≥ 3), TFCE shows slightly better sensitivity under all dfs and smoothness levels. For low SNR (SNR = 1), the CST methods perform better than TFCE at a low smoothness level (FWHM = 0 voxels) and low df (df = 18); with increasing smoothness, the performance of TFCE improves and it shows better sensitivity. The results for non-stationary data are similar to those for stationary data and are therefore not displayed.
Figs. 4 and 5 show the VBM results using the vn-CST and TFCE methods for the small and larger groups, respectively. In both groups, at t = 2.5 the vn-CST results were similar to or better than those of TFCE, while at t = 3.5 the vn-CST results were similar to or worse than those of TFCE.
Conclusion
In summary, both vn-CST and TFCE are robust inference approaches for group-level analysis that do not require high or uniform spatial smoothness. TFCE is more reliable because it does not require a cluster-forming intensity threshold, but it is not available for individual subject-level analysis because of the exchangeability assumption. Thus the most suitable approach for inference may ultimately depend on whether the interest is in single-subject or group-level analysis.
Acknowledgements
This work was supported by the Fundamental Research Funds for the Central Universities.
References
1. Smith, S.M., Nichols, T.E. Threshold-free cluster enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage. 2009; 44: 83-98.
2. Salimi-Khorshidi, G., Smith, S.M., Nichols, T.E. Adjusting the effect of nonstationarity in cluster-based and TFCE inference. NeuroImage. 2011; 54: 2006-2019.
3. Li, H., Nickerson, L.D., Xiong, J., et al. A high performance 3D cluster-based test of unsmoothed fMRI data. NeuroImage. 2014; 98: 537-546.
4. Li, H., Nickerson, L.D., Zhao, X., et al. A voxelation-corrected non-stationary 3D cluster-size test based on random field theory. NeuroImage. 2015; 118: 676-682.
5. Beckmann, C.F., DeLuca, M., Devlin, J.T., et al. Investigations into resting-state connectivity using independent component analysis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2005; 360: 1001-1013.