Similarity-based fusion of multiple Regions of Interest for MR sequence evaluation
Michael Goetz1, Christian Weber1, and Klaus H. Maier-Hein1

1Medical Image Computing, German Cancer Research Center (DKFZ) Heidelberg, Heidelberg, Germany

Synopsis

To compare the information content of different MR sequences, regions of interest (ROIs) are often used. One way to avoid observer dependency is to use ROIs from different observers, but this raises the question of how to fuse them. We propose a new method that combines the information obtained from multiple ROIs according to their similarity, which makes it less sensitive to outliers. We evaluate our method by comparing its results with those of the traditional merging method. The results indicate that our method can be a valuable extension to ROI-based, multi-observer studies.

Motivation

The diagnostic value of different MR sequences and other imaging modalities is usually assessed by region-of-interest (ROI) based comparisons. To avoid the dependency on a single observer and to improve the reliability of the results, ROIs from multiple observers are often used. This allows reporting a mean result that is not based on a single observer, together with an inter-rater reliability, as done, for example, by Davenport et al. [1, 2].

If multiple ROIs within a single image return different results, the question remains which ROI is the best one. A well-established solution is to report the mean value of all ROIs within a single image. However, this weights all ROIs equally, even though some might have been placed by more experienced observers than others and some may be considered outliers. The mean might therefore not reflect the best solution, as has been shown, for example, for manual segmentations [3]. There are some modality-specific solutions that avoid the observer dependency, such as TBSS and Partial Volume Analysis [4, 5].

We propose a new method of combining the ROIs of multiple observers to simulate an optimal common ROI. Our method weights the result of each observer based on its agreement with all data, thereby reducing the effect of outliers.

Method

Our method takes $$$n$$$ different ROIs $$$R_i$$$ and estimates a single target ROI $$$R_t=\sum_{i=1}^{n} w_i\cdot R_i$$$. The influence of each ROI is controlled by weighting each of its voxels with an ROI-specific weight $$$w_i$$$. The target ROI is initialized as the mean ROI by setting all weights to $$$\frac{1}{n}$$$. Starting from this condition, the weights are updated iteratively. For this, a density function $$$D_i$$$ is estimated from each ROI $$$R_i$$$, a density function $$$D_t$$$ is estimated from $$$R_t$$$, and the distance $$$d_i$$$ of each $$$R_i$$$ from $$$R_t$$$ is calculated as

$$d_i = \sum_x \left| D_i(x) - D_t(x) \right|\enspace .$$

The density functions are evaluated for a fixed range of points, which are equally distributed over the complete observation range. The sum is calculated over the complete range of all those points. After calculating all $$$d_i$$$, the weights are updated according to

$$ w_i = \frac{\frac{1}{d_i}}{\sum_{j=1}^{n} \frac{1}{d_j}}\enspace .$$

This assigns more weight to ROIs that are more similar to the target distribution and increases the overall similarity, because the influence of outliers is reduced. The reweighting is repeated until the change of all weights is below a given threshold. Figure 1 visualizes the process of the algorithm.
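As a worked illustration of a single update step, assume three ROIs with hypothetical distances $$$d_1 = d_2 = 1$$$ and $$$d_3 = 4$$$ (values chosen for illustration only, not taken from the study). Then

$$ \frac{1}{d_1} = \frac{1}{d_2} = 1, \quad \frac{1}{d_3} = 0.25, \quad \sum_{j=1}^{3} \frac{1}{d_j} = 2.25 \enspace ,$$

$$ w_1 = w_2 = \frac{1}{2.25} \approx 0.44, \quad w_3 = \frac{0.25}{2.25} \approx 0.11 \enspace .$$

Compared with the initial weights of $$$\frac{1}{3}$$$, the ROI that is farthest from the target loses influence, while the weights still sum to one.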

We evaluated our method on a group of 18 patients with high-grade glioma. Seven ROIs were placed within the edema by three different observers based on T2-weighted MR images, and the mean diffusivity was calculated from a co-registered diffusion-weighted MR image.

We implemented our method using the R programming language. The density estimation is done using the parameter-free function sm.density().
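The following Python sketch mirrors the iteration described above. It is not the authors' R implementation and rests on several assumptions: a plain Gaussian kernel with Silverman's rule-of-thumb bandwidth stands in for sm.density(), the target density is taken to be the weighted mixture $$$D_t=\sum_i w_i D_i$$$, and the function names (`gaussian_kde`, `fuse_rois`), the grid size, and the simulated data (standard deviation 0.5, 200 samples per ROI) are chosen purely for illustration.

```python
import math
import random
import statistics


def gaussian_kde(samples, grid):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`."""
    n = len(samples)
    # Assumption: Silverman's rule of thumb replaces the bandwidth choice of sm.density().
    h = 1.06 * statistics.stdev(samples) * n ** (-0.2)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
            for x in grid]


def fuse_rois(rois, n_points=200, threshold=0.001, max_iter=100):
    """Return one weight per ROI (hypothetical helper, not the authors' code)."""
    # Equally spaced evaluation points over the complete observation range.
    lo = min(min(r) for r in rois)
    hi = max(max(r) for r in rois)
    grid = [lo + k * (hi - lo) / (n_points - 1) for k in range(n_points)]
    densities = [gaussian_kde(r, grid) for r in rois]

    n = len(rois)
    weights = [1.0 / n] * n  # start from the mean ROI
    for _ in range(max_iter):
        # Assumption: the target density is the weighted mixture of the ROI densities.
        target = [sum(w * d[k] for w, d in zip(weights, densities))
                  for k in range(n_points)]
        # d_i = sum_x |D_i(x) - D_t(x)|; the small constant guards against division by zero.
        dist = [sum(abs(a - b) for a, b in zip(d, target)) + 1e-12
                for d in densities]
        inverse = [1.0 / d for d in dist]
        total = sum(inverse)
        new_weights = [v / total for v in inverse]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < threshold:
            break
    return weights


# Simulated example: two ROIs drawn around 1 and one differing ROI drawn around 3.
rng = random.Random(0)
rois = [[rng.gauss(mu, 0.5) for _ in range(200)] for mu in (1.0, 1.0, 3.0)]
weights = fuse_rois(rois)
roi_means = [statistics.mean(r) for r in rois]
plain_mean = statistics.mean(roi_means)
fused_mean = sum(w * m for w, m in zip(weights, roi_means))
print("weights:", [round(w, 3) for w in weights])
print("plain mean:", round(plain_mean, 3), "fused mean:", round(fused_mean, 3))
```

In this toy setting the differing ROI should end up with a clearly smaller weight than the other two, so the fused mean stays closer to the two agreeing ROIs than the plain mean does. Under the mixture assumption, the fused mean is simply the weighted sum of the per-ROI means.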

Results

Figure 2 illustrates the benefit of our method using simulated data.

Figure 3 shows a qualitative result of our approach: the distribution of the grey values within the ROIs of three observers, the resulting mean distribution, and the result of the proposed approach. Running our algorithm on the images of all 18 patients took less than half a second per patient (mean duration $$$0.05\,s\pm0.003\,s$$$). The algorithm converged after $$$11.7 \pm 4.9$$$ iterations on average (the convergence threshold was set to 0.001). The distribution of the calculated weights is shown in Figure 4. The difference in mean value between an observer-generated ROI and the proposed method was smaller than the difference between that ROI and the mean ROI in 84 cases, and larger in 42 cases. Figure 5 depicts the distribution of the mean grey values for all observers and the two fusion approaches.

The differences between the mean of all observer mean values and the mean values of the mean ROI and of the proposed approach are $$$2.03 \cdot 10^{-5}$$$ and $$$-9.23 \cdot 10^{-6}$$$, respectively. The corresponding differences for the median are $$$4.32 \cdot 10^{-5}$$$ and $$$8.30 \cdot 10^{-6}$$$, respectively.

Discussion

We proposed a new method that fuses ROIs by weighting the observations of different raters. Our initial results indicate that the proposed method reflects the real data by reducing the influence of unusual observations, which makes the result less sensitive to a single rater. We therefore expect it to enable more representative results that are less influenced by outliers. While we do not think that reporting only the results of our method is sufficient, we think that it can reveal additional information when used in multi-rater studies.

Acknowledgements

This work was carried out with the support of the German Research Foundation (DFG) within projects I04 and R01, SFB/TRR 125 "Cognition-Guided Surgery".

References

[1] Hallgren KA. Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial. Tutorials in quantitative methods for psychology. 2012;8(1):23-34.

[2] Davenport MS, Heye T, Dale BM, Horvath JJ, Breault SR, Feuerlein S, Bashir MR, Boll DT, Merkle EM. Inter- and intra-rater reproducibility of quantitative dynamic contrast enhanced MRI using TWIST perfusion data in a uterine fibroid model. J. Magn. Reson. Imaging. 2013;38:329-335. doi: 10.1002/jmri.23974

[3] Warfield SK, Zou KH, Wells WM. Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation. IEEE Transactions on Medical Imaging. 2004;23(7):903-921. doi: 10.1109/TMI.2004.828354

[4] Smith SM, Jenkinson M, Johansen-Berg H, Rueckert D, Nichols TE, Mackay CE, Watkins KE, Ciccarelli O, Cader MZ, Matthews PM, Behrens TEJ. Tract-based spatial statistics: Voxelwise analysis of multi-subject diffusion data. NeuroImage. 2006;31(4):1487-1505.

[5] Stieltjes B, Schlüter M, Didinger B, Weber MA, Hahn HK, Parzer P, Rexilius J, Konrad-Verse O, Peitgen HO, Essig M. Diffusion tensor imaging in primary brain tumors: reproducible quantitative analysis of corpus callosum infiltration and contralateral involvement using a probabilistic mixture model. NeuroImage. 2006;31(2):531-542. PMID: 16478665

Figures

Figure 1: Simple flow chart of the proposed algorithm. The weights are estimated iteratively. The threshold can be user-defined; we used 0.001 for our experiments.

Figure 2: An example illustrating the benefit of our method. Three ROIs are simulated by randomly drawing from normal distributions, two with an expected mean of 1 and one with an expected mean of 3 to simulate a differing ROI. Unlike the mean ROI, our method ignores the outlier.

Figure 3: Exemplary qualitative results based on the mean diffusivity. The intensity distributions of the ROIs of the three raters are compared to the mean and to the proposed approach. Observer 1 appears to differ strongly from the other two and therefore receives less weight in our method.

Figure 4: Distribution of the weights used by our method. The weights are used to fuse seven different ROIs of three observers for 18 patients.

Figure 5: Distribution of mean values from 18 images grouped by ROI type (observer / placement run) and combination method, based on the mean diffusivity. Compared to the mean method, the median of our approach is closer to the medians of the individual ROIs.



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)
3304