Thomas Küstner1,2, Martin Schwartz1,2, Annika Kaupp2, Petros Martirosian1, Sergios Gatidis1, Nina F. Schwenzer1, Fritz Schick1, Holger Schmidt1, and Bin Yang2
1University Hospital Tübingen, Tübingen, Germany, 2Institute of Signal Processing and System Theory, University of Stuttgart, Stuttgart, Germany
Synopsis
Acquired images are usually analyzed by a human observer (HO) according to a certain diagnostic question. Flexible algorithm parametrization and the enormous amount of data created per patient make this task time-demanding and expensive. Furthermore, the definition of an objective quality criterion can be very challenging, especially in the context of a missing reference image. In order to support the HO in assessing image quality, we propose a non-reference MR image quality assessment system based on a machine-learning approach with an Active Learning loop to reduce the amount of necessary labeled training data. Labeling is performed via an easily accessible website.
Purpose
In medical imaging, recorded images are usually visualized and analyzed by a human observer (HO) to answer specific diagnostic questions. A good quality of these images is essential to substantiate the diagnostic reading. Flexible MR sequence and reconstruction parametrization makes MRI highly tunable to specific applications but also demands profound knowledge: if parameters are not chosen carefully, image quality can degrade. Furthermore, MRI is prone to artifacts, which can be classified as hardware-related, patient-related or signal-processing-related. Together with the enormous amount of data created per patient, image quality assessment by a HO can be time-demanding and expensive. Hence, an automatization or support of this process is desired [1]. However, quality criteria need to be defined first, which can be challenging and may be an obstacle to objective evaluation, especially in the case of missing reference/gold-standard images [2]. We therefore proposed a non-reference MR image quality assessment (IQA) system based on a machine-learning approach to predict HO labeling scores of arbitrary input images corrupted with unknown artifacts [3]. In order to reduce the amount of labeled training data needed, in this work we propose the integration of an Active Learning (AL) loop that provides feedback to an HTML-based scoring website and queries the HO to label only the next most significant images. In contrast to other approaches that focus on content-based classification with AL [4] or combine AL with a relevance-vector machine in a low-dimensional feature space [5], we apply AL to a support-vector machine (SVM) classification in a high-dimensional feature space, which is able to reflect more complex image distortions.
Material and Methods
The proposed system layout including AL is shown in Fig.1. The system accepts 2D and 3D MR images as input, which are classified into 5 different classes according to a Likert scale. The database currently contains 150 labeled images from 38 patients (out of 1747 images from 344 patients), which were scored blindly by five HOs. For the blinded labeling we developed an HTML website accessible via browser from every computer within the hospital. HOs register on this website and provide some background information (e.g. field and years of experience). Depending on the study prerequisites, which are determined by the study coordinator, the website allows HOs to participate in different studies. An exemplary screenshot of the website is shown in Fig.2. The website offers standard display adjustments (zoom, rotate, ...). If available, a reference image can be provided for rough guidance. Once all images on one page are labeled, the user proceeds to the next dataset chosen by the AL. Progress is saved in a MySQL database, allowing the user to stop at any time and resume later.
Based on the already labeled images, the soft-margin multi-class SVM classifier [6] is trained to learn its separating decision hyperplanes in the high-dimensional feature space (77 dimensions after feature reduction of 2871 features [3]). Afterwards, the AL searches the pool of unlabeled images for those located closest to these decision hyperplanes; outliers and slack-violating images (due to the soft-margin formulation) are excluded from the selection process. The closest images, i.e. the ones the classifier is most uncertain about, are then presented to the HO for labeling.
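A minimal sketch of this margin-based query selection is given below, assuming a scikit-learn SVC as a stand-in for the multi-class soft-margin SVM [6]; the feature extraction of [3] and the exact outlier/slack-violation criterion are not reproduced and are represented here by a generic exclusion mask:

```python
import numpy as np
from sklearn.svm import SVC

def query_most_uncertain(clf, X_unlabeled, n_query, exclude_mask=None):
    """Select the unlabeled samples closest to the SVM decision hyperplanes."""
    # Signed distances to the one-vs-one decision hyperplanes of the multi-class SVM
    dist = clf.decision_function(X_unlabeled)          # shape: (n_samples, n_hyperplanes)
    margin = np.min(np.abs(dist), axis=1)              # distance to the nearest hyperplane
    if exclude_mask is not None:
        margin = margin.copy()
        margin[exclude_mask] = np.inf                  # drop outliers / slack-violating samples
    return np.argsort(margin)[:n_query]                # indices of the most uncertain samples

# Example usage on the (reduced, 77-dimensional) feature vectors:
# clf = SVC(kernel="rbf", C=1.0, decision_function_shape="ovo")
# clf.fit(X_labeled, y_labeled)
# query_idx = query_most_uncertain(clf, X_unlabeled, n_query=40)
```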
Since we are interested in keeping the number of required images/samples (i.e. HO queries) low while achieving fast convergence towards the maximal achievable classification accuracy, a trade-off has to be found between the initial training size $$$N_I$$$, the number of images per query $$$N_L$$$ and the computational complexity.
We extracted 2038 samples from 100 labeled images for training and 873 samples from 50 labeled images for testing and examined the classification accuracy. A maximal classification accuracy of 91.2% can be achieved without AL using all 2038 training samples; hence, a test accuracy >90% serves as target.
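A schematic sketch of the resulting AL loop is shown below, reusing the query_most_uncertain helper from the previous sketch; function and variable names are illustrative, and in practice the labels of the queried samples are collected from the HOs via the website rather than taken from a precomputed array:

```python
def active_learning_loop(X_pool, y_pool, X_test, y_test,
                         n_initial=200, n_per_query=40, target_acc=0.90):
    """Grow the labeled set by N_L uncertain samples per AL iteration,
    starting from N_I initial samples, until the target accuracy is reached."""
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_pool), size=n_initial, replace=False))
    unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]
    while unlabeled:
        clf = SVC(kernel="rbf", C=1.0, decision_function_shape="ovo")
        clf.fit(X_pool[labeled], y_pool[labeled])
        if clf.score(X_test, y_test) >= target_acc:
            break                                      # target accuracy reached
        picked = query_most_uncertain(clf, X_pool[unlabeled], n_per_query)
        new_idx = [unlabeled[i] for i in picked]       # map back to pool indices
        labeled += new_idx                             # in practice: labels queried from the HO
        unlabeled = [i for i in unlabeled if i not in set(new_idx)]
    return labeled
```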
Results and Discussion
Fig.3 shows the classification accuracy of the proposed system including AL for varying initial training sizes $$$N_I$$$ with a constant number of 40 samples per query. For an initial training size of $$$N_I$$$=200 samples, an accuracy >90% is first reached at 1040 samples, corresponding to a 49% reduction of needed images. As can be seen in Fig.4, choosing $$$N_L$$$=40 samples per query for $$$N_I$$$=200 gives a good trade-off, because just 20 AL iterations are needed.
Conclusion
We propose a strategy to reduce the number of required labeled images by around 50% by means of AL for a non-reference MR IQA system. An easily accessible website was created to allow a smooth and streamlined HO labeling procedure.
Acknowledgements
We thank Carsten Gröper and Gerd Zeger for assistance in data acquisition. Thanks to all participating radiologists for labeling the training data. In particular, we would like to thank Christina Schraml, Ferdinand Seith and Cornelia Brendle.
References
[1] Barrett et al., Proc Natl Acad Sci USA 1993;90. [2] Rohlfing et al., TMI 2012;31(2). [3] Küstner et al., Proc ISMRM 2015. [4] Hoi et al., Proc Int Conf Mach Learn 2006. [5] Lorente et al., IEEE Proc ISBI 2014. [6] Chang et al., T Intell Syst Tech 2011;2.