Identifying true positives from diffusion-weighted imaging (DWI)-based tractography is not trivial, and no universal gold standard has yet been developed. In this study, we introduce a method that uses the group-level streamline pathway distribution to determine the inter-subject reproducibility of streamline bundles. At the participant level, streamlines that do not correspond to reproducible bundles are removed, resulting in improved reproducibility and increased fidelity when identifying differences between pathways. Additionally, we used electrical stimulation-based cortico-cortical evoked potentials to assess how well reliable bundles reflect underlying connectivity. Cleaned structural connectivity data were found to correlate better with electrophysiological connectivity.
Three-tesla DWI scans were performed on 65 neurosurgical patients (33 males; age: 11.8 ± 3.7 years, mean ± SD). Scans were acquired on a GE 3T Signa scanner using 55 encoding directions at b = 1000 s/mm² and a voxel size of 2 × 2 × 3 mm. Regions of interest (ROIs) were defined using electrical stimulation-based probability maps of language-related cortical regions (Fig. 1) generated from 100 participants4. Each of the eight resulting ROIs was used as both a seed and a target for probabilistic tractography in the MRtrix framework5. Streamlines were generated from the fiber orientation distribution image using 2000 seeds per voxel optimized to the grey matter-white matter interface, an angle threshold of 90 degrees, and a maximum length of 250 mm, and were cropped at the grey matter-white matter boundary as determined from a T1-weighted image.
To identify reproducible bundles, each participant’s streamlines were warped to template space and merged into a single image. Bundle centroids were calculated via the QuickBundles algorithm,6 with a clustering threshold of 12 mm, from streamlines downsampled to 12 points. Each streamline was then assigned to the centroid with the smallest minimum average direct-flip distance. Thus, each group-level centroid has an associated distribution of streamlines, which together form a bundle. A bundle can be described by the number of subjects that contribute streamlines to it and by the number of contributing streamlines relative to the overall number of streamlines generated for that ROI-ROI pair. Bundle reliability (BR) can thus be quantified as the percentage of participants contributing to a bundle, weighted by the proportion of streamlines that contribute to the bundle. Streamline bundles with a BR larger than the mean plus one standard deviation of the BR distribution were retained.
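The assignment and thresholding steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's implementation: the function names are hypothetical, the weighting in BR is assumed to be a simple product of the two fractions, and the standard deviation is assumed to be the population form (NumPy's default). Centroids are taken as given, for example from a QuickBundles run.

```python
import numpy as np

def resample(streamline, n_points=12):
    """Resample a (k, 3) polyline to n_points equally spaced along its arc length."""
    seg = np.linalg.norm(np.diff(streamline, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, arc[-1], n_points)
    return np.column_stack(
        [np.interp(targets, arc, streamline[:, d]) for d in range(3)]
    )

def mdf(a, b):
    """Minimum average direct-flip distance between two equal-length polylines."""
    direct = np.linalg.norm(a - b, axis=1).mean()
    flipped = np.linalg.norm(a - b[::-1], axis=1).mean()
    return min(direct, flipped)

def assign_to_centroids(streamlines, centroids):
    """Index of the nearest centroid (by MDF) for each resampled streamline."""
    return np.array(
        [int(np.argmin([mdf(s, c) for c in centroids])) for s in streamlines]
    )

def bundle_reliability(labels, subject_ids, n_subjects, n_bundles):
    """BR per bundle for one ROI-ROI pair.

    Assumption: BR = (fraction of subjects contributing)
                   * (fraction of the pair's streamlines in the bundle).
    """
    labels = np.asarray(labels)
    subject_ids = np.asarray(subject_ids)
    br = np.zeros(n_bundles)
    for b in range(n_bundles):
        in_bundle = labels == b
        subj_frac = np.unique(subject_ids[in_bundle]).size / n_subjects
        stream_frac = in_bundle.sum() / labels.size
        br[b] = subj_frac * stream_frac
    return br

def reliable_bundles(br):
    """Indices of bundles whose BR exceeds mean + 1 SD of the BR distribution."""
    return np.flatnonzero(br > br.mean() + br.std())

# Toy example: 10 streamlines from 4 subjects, already assigned to 5 bundles.
labels = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4]
subject_ids = [0, 1, 2, 3, 0, 1, 0, 1, 2, 3]
br = bundle_reliability(labels, subject_ids, n_subjects=4, n_bundles=5)
keep = reliable_bundles(br)
```

In this toy case only bundle 0, which draws streamlines from every subject, passes the threshold; streamlines assigned to the other bundles would be removed at the participant level.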
To test whether observations of tract morphological differences are more reliable in a refined data set, we compared streamline counts between the superior temporal gyrus (STG) and inferior frontal gyrus (IFG) and between the STG and inferior precentral gyrus (iPCG), three ROIs critical to human language function but whose relative connectivity weighting is still debated7,8. Intracranial EEG data from our previous study indicated greater information flow between STG-iPCG relative to STG-IFG9. To assess whether reliable bundles better reflect underlying connectivity, we also measured electrical stimulation-based cortico-cortical evoked potentials (CCEPs)10 in a subset of patients. We tested the hypothesis that DWI-based tractography should reflect the functional connectivity based on CCEP results.
Results
Our quality control algorithm resulted in a significant reduction in the number of streamlines for all 28 ROI pairs (all p < 4.54 × 10⁻⁶, significant after Bonferroni correction for multiple comparisons). This corresponded to a 6-70% drop in streamline count variance across individuals (Figs. 2 and 3). The change in mean streamline length was more variable: some pairs increased, some decreased, and some showed no change. This is expected, given that there is no a priori definition of a non-reproducible bundle other than that it is not represented across individuals.
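As a quick check of the arithmetic behind these figures, the sketch below counts the unordered ROI pairs, derives a Bonferroni-corrected threshold, and shows one way a percentage drop in across-subject variance could be computed. The family-wise alpha of 0.05 and the use of sample variance are assumptions not stated in the text, and the streamline counts are made up for illustration.

```python
from itertools import combinations
from statistics import variance

N_ROIS = 8
roi_pairs = list(combinations(range(N_ROIS), 2))  # unordered ROI-ROI pairs
n_tests = len(roi_pairs)                          # 8 choose 2

ALPHA = 0.05                                      # assumed family-wise alpha
bonferroni_alpha = ALPHA / n_tests                # per-test threshold

REPORTED_P_BOUND = 4.54e-6                        # upper bound from the text
survives_correction = REPORTED_P_BOUND < bonferroni_alpha

def percent_variance_drop(counts_before, counts_after):
    """Percentage reduction in across-subject sample variance after cleaning."""
    return 100.0 * (1.0 - variance(counts_after) / variance(counts_before))

# Made-up streamline counts for three subjects, before and after cleaning.
example_drop = percent_variance_drop([10, 20, 30], [12, 18, 24])
```

With eight ROIs there are 28 pairs, so the corrected per-test threshold under these assumptions is about 0.0018, well above the reported bound on the p-values.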
In the original dataset, we found that the STG-iPCG streamline count was larger than the STG-IFG count (t(54)=3.7866, p=0.0038). After data cleaning, we found the same pattern (t(54)=5.5805, p=7.95 × 10⁻⁷) but with a notably increased effect size: Cohen’s d increased from 0.53 to 0.80, denoting a shift from a medium to a large effect (Fig. 3). The ratio of CCEP-based connectivity strength between STG-iPCG and STG-IFG was 3.557. After cleaning, the ratio of streamline counts between these pathways increased from 1.684 to 2.679, moving closer to the CCEP-based ratio.
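The comparison between the streamline ratios and the CCEP ratio can be made explicit with a short calculation. The values are taken from the text; expressing agreement as a fraction of the CCEP ratio and as an absolute gap is an illustrative choice here, not a metric reported in the study.

```python
CCEP_RATIO = 3.557     # STG-iPCG vs STG-IFG, CCEP-based connectivity strength
RAW_RATIO = 1.684      # streamline count ratio before cleaning
CLEANED_RATIO = 2.679  # streamline count ratio after cleaning

def fraction_of_ccep(streamline_ratio, ccep_ratio=CCEP_RATIO):
    """Streamline ratio expressed as a fraction of the CCEP-based ratio."""
    return streamline_ratio / ccep_ratio

def gap_to_ccep(streamline_ratio, ccep_ratio=CCEP_RATIO):
    """Absolute difference between a streamline ratio and the CCEP-based ratio."""
    return abs(ccep_ratio - streamline_ratio)

raw_fraction = fraction_of_ccep(RAW_RATIO)
cleaned_fraction = fraction_of_ccep(CLEANED_RATIO)
raw_gap = gap_to_ccep(RAW_RATIO)
cleaned_gap = gap_to_ccep(CLEANED_RATIO)
```

On these numbers the streamline ratio rises from roughly 47% to roughly 75% of the CCEP-based ratio, and the absolute gap falls from about 1.87 to about 0.88.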