1372

Differentiation of Vertebral Fracture Types using Five Different Convolutional Neural Network Approaches

Lee-Ren Yeh¹, Yang Zhang², Jeon-Hor Chen¹,², Peter Chang², Daniel Chow², and Min-Ying Lydia Su²

¹Department of Radiology, E-Da Hospital and I-Shou University, Kaohsiung, Taiwan, ²Department of Radiological Sciences, University of California, Irvine, CA, United States

Synopsis

Differentiating benign from malignant vertebral fractures is challenging yet very important for therapeutic planning. In this study, deep learning was used to automatically differentiate the fracture types with five convolutional neural networks: ResNet50, DenseNet, Xception, XceptionResNetV2, and InceptionV3. The final classification model was developed using 10-fold cross-validation with two input methods: a single slice, or each slice combined with its two neighboring slices. Overall, prediction accuracy improved when each slice combined with its two neighboring slices was used as the input. Among the five deep learning approaches, XceptionResNetV2 showed the highest prediction accuracy.

Background and Purpose

Benign and malignant vertebral fractures may be difficult to differentiate because of their similar presentation. Only when the fat-containing yellow bone marrow of the vertebra is replaced by a sufficient number of cancer cells does the bone show a signal intensity change. Accurate clinical diagnosis is especially difficult in elderly patients with no or minor trauma history. Differentiating benign osteoporotic or traumatic fractures from malignant fractures is necessary to establish appropriate staging and therapeutic planning, especially in the acute and subacute stages. Imaging plays an important role in disease evaluation. Plain radiography has limitations in distinguishing osteoporotic fracture, metastasis-induced fracture, and fracture caused by other primary bone neoplasms. Among the imaging modalities, MR imaging is the most helpful for distinguishing malignant from benign fractures. Although conventional MRI has been used extensively for assessing spinal lesions [1], accurate differential diagnosis of benign and malignant fractures remains challenging. Recently, quantitative imaging methods such as diffusion-weighted imaging (DWI) have been investigated and have shown value [2, 3]. Nevertheless, DWI has not gained wide popularity because of technical issues and overlapping apparent diffusion coefficient (ADC) values. In this study, benign fractures resulting from osteomyelitis, Paget's disease, hyperparathyroidism, and other metabolic processes were excluded from the analysis. For malignant fracture, only cases of metastatic lesions were included; patients with other primary neoplasms of the vertebrae were excluded. We went a step further and investigated the potential value of deep learning approaches in differentiating benign from malignant vertebral fractures.

Materials and Methods

The patients included in this study were randomly selected from the radiological reporting system over a period of 4 years using the keywords osteoporotic fracture, traumatic fracture, and tumor fracture. All subjects underwent MR imaging of the spine on a 1.5T scanner. The MR images were then reviewed by an experienced bone radiologist to confirm the lesion(s). All patients with benign fracture had no known history of malignancy and had stable disease on follow-up. The patients with malignant fracture had either biopsy proof of cancer or a known history of primary tumor with progressive disease. The most common primary cancer was lung, followed by colon/rectum, breast, and prostate. The primary origin was not known in three patients. In total, 140 patients with benign fracture and 62 patients with malignant fracture were studied. An experienced radiologist identified the abnormal region on sagittal T2W images. For each case, a rectangular region of interest (ROI) was manually placed on the abnormal region, and the smallest bounding box containing the entire abnormal region was used to crop the original images. Two types of input were tested. The first used each single slice as an independent input. The second used each slice combined with its two neighboring slices as one input. Deep learning was performed to automatically differentiate the fracture types using five convolutional neural networks: ResNet50, DenseNet, Xception, XceptionResNetV2, and InceptionV3 [4-6]. The final classification model was developed using 10-fold cross-validation. To reduce overfitting, the dataset was augmented with random affine transformations. The loss function was cross entropy, and the optimizer was Adam with a learning rate of 0.001 [7]. Weights pre-trained on ImageNet were used as the initial parameter values of these models [8].
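The two input types described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `neighbor_indices` and `build_inputs` are hypothetical, and the handling of the first and last slice (repeating the edge slice in place of the missing neighbor) is an assumption, since the abstract does not state how boundary slices were treated.

```python
def neighbor_indices(i, num_slices):
    """Return (previous, current, next) slice indices for slice i.

    Assumption: at the volume boundary the missing neighbor is replaced
    by the edge slice itself (index clamping).
    """
    prev_i = max(i - 1, 0)
    next_i = min(i + 1, num_slices - 1)
    return prev_i, i, next_i


def build_inputs(slices, use_neighbors):
    """Build one network input per slice.

    slices: sequence of cropped 2D slices (any objects, e.g. arrays).
    use_neighbors: False -> each slice alone (one channel);
                   True  -> each slice with its two neighbors (three channels).
    """
    inputs = []
    for i in range(len(slices)):
        if use_neighbors:
            idx = neighbor_indices(i, len(slices))
        else:
            idx = (i,)
        inputs.append(tuple(slices[j] for j in idx))
    return inputs


if __name__ == "__main__":
    demo = ["slice0", "slice1", "slice2", "slice3"]
    for channels in build_inputs(demo, use_neighbors=True):
        print(channels)
```

In both modes the number of inputs equals the number of slices; only the number of channels per input changes.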

Results

Table 1 shows the prediction accuracy of the five deep learning approaches in 10-fold cross-validation when a single slice, or each slice combined with its two neighboring slices, was used as the input. When a single slice was used as the input, XceptionResNetV2 achieved the highest prediction accuracy among the five methods, 0.93 (0.85 – 0.95). When each slice combined with its two neighboring slices was used as the input, the prediction accuracy improved overall. With XceptionResNetV2, the prediction accuracy in 10-fold cross-validation was 0.95 (0.85 – 0.99), the highest among the five approaches. Figure 1 shows two examples of benign fracture assessed by deep learning, one with a high and one with a low benign probability. Figure 2 shows two examples of malignant fracture, one with a high and one with a low malignant probability. Figure 3 shows two misdiagnosed cases: one benign fracture misdiagnosed as malignant, and one malignant fracture misdiagnosed as benign.
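As a rough illustration of how accuracy figures of this form could be produced, the sketch below assigns patients to 10 folds and summarizes per-fold accuracy as a mean with its range. Several points are assumptions rather than facts from the abstract: that folds were split by patient (so that slices from one patient never appear in both training and test sets), that the reported value is a mean, and that the bracketed values are the minimum and maximum over folds. All function names are hypothetical.

```python
import random
from statistics import mean


def patient_folds(patient_ids, k=10, seed=0):
    """Map each unique patient ID to a fold number in 0..k-1.

    Assumption: splitting is done per patient, not per slice.
    """
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    return {pid: n % k for n, pid in enumerate(unique_ids)}


def fold_accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def summarize(fold_accuracies):
    """Return (mean, minimum, maximum) accuracy over the folds."""
    return mean(fold_accuracies), min(fold_accuracies), max(fold_accuracies)


if __name__ == "__main__":
    # 140 benign + 62 malignant patients, as in the study cohort.
    folds = patient_folds(range(202), k=10)
    sizes = [list(folds.values()).count(f) for f in range(10)]
    print(sizes)  # number of patients held out in each fold
```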

Discussion

This study compared the performance of five deep learning approaches in differentiating benign from malignant fractures, using either a single slice or three consecutive slices as the input. The results showed that, overall, prediction accuracy improved when each slice combined with its two neighboring slices was used as the input, compared with a single slice. The radiological diagnosis of benign or malignant fracture is based on the combined information derived from T1W and T2W images in all planes. For example, an adjacent soft tissue mass, pedicle and posterior element involvement, and other skipped bone marrow lesions are more specific for malignancy. However, diffuse marrow replacement, posterior protrusion of the vertebral body, compression of the entire body, and end plate disruption are shared by both benign and malignant fractures. For equivocal cases in which a definite diagnosis is difficult, it is hoped that the deep learning approach can increase diagnostic specificity.

Acknowledgements

This study was supported in part by NIH R01 CA127927.

References

[1] Jung HS, et al. Discrimination of metastatic from acute osteoporotic compression spinal fractures with MR imaging. RadioGraphics. 2003;23:179–87.
[2] Herneth AM, et al. Vertebral metastases: assessment with apparent diffusion coefficient. Radiology. 2002;225:889–94.
[3] Chan JH, et al. Acute vertebral body compression fractures: discrimination between benign and malignant causes using apparent diffusion coefficients. Br J Radiol. 2002;75:207–14.
[4] He K, et al. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[5] Chollet F. Xception: Deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357. 2017.
[6] Szegedy C, et al. Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[7] Kingma D, et al. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
[8] Deng J, et al. ImageNet: a large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2009.

Figures

Table 1. Comparison of the performance of five different deep learning approaches using single slice and three consecutive slices as the input

Figure 1. Examples of true negative prediction of benign fracture cases using a threshold of 0.5. The left panel has a high benign probability (malignant/benign probability = 0.06/0.94). The right panel has a low benign probability (malignant/benign probability = 0.43/0.57). The left panel appears to show an old compression fracture after vertebroplasty. The anterior wedge deformity of the vertebral body and the absence of a mass or nodular soft tissue component indicate its benign nature. The low signal intensity in the vertebral body was probably cement from the prior vertebroplasty. The right panel, which shows a diffuse collapse pattern and posterior protrusion of the vertebral body, is harder to interpret. However, the absence of an epidural/paraspinal soft tissue mass, pedicle and posterior element involvement, and other skipped bone marrow lesions still suggests a benign fracture. The low signal intensity in the vertebral body could be a vacuum cleft or bone sclerosis.

Figure 2. Examples of true positive prediction of malignant fracture cases based on a threshold of 0.5. The left panel has a high malignant probability (malignant/benign probability = 0.92/0.08). The right panel has a low malignant probability (malignant/benign probability = 0.52/0.48). The left panel shows total involvement of the vertebral body and posterior element, which strongly suggests malignancy. The right panel shows total marrow replacement of the C6 vertebral body, compression of the entire vertebral body, and posterior protrusion, suggesting (but not indicating) a malignant fracture. A similar pattern may be observed in benign osteoporotic fracture (see Figure 1, right panel). Furthermore, the homogeneous pattern and the absence of a dark fracture line favor tumor involvement.

Figure 3. Examples of misdiagnosed cases based on a threshold of 0.5. The left panel is a benign fracture case misdiagnosed as malignant fracture, with malignant/benign probability = 0.54/0.46. The right panel is a malignant fracture case misdiagnosed as benign fracture, with malignant/benign probability = 0.39/0.61. In the left panel, there is focal low signal intensity in the collapsed T12 vertebra, most likely a benign fracture with a vacuum cleft or bone sclerosis. The coexistence of multiple old fractures in the T8, T10, and T11 vertebrae also suggests the osteoporotic nature of the T12 vertebral fracture. In the right panel, the T4 vertebra shows total marrow replacement, compression of the entire vertebral body, and posterior protrusion. A similar finding is also noted in the T1 vertebra, but with more homogeneous marrow replacement and evidence of anterior paraspinal extension. The combined information leads to the diagnosis of metastatic tumor with pathologic fractures of both the T1 and T4 vertebrae.
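The threshold rule used in the figure captions can be written out as a short Python sketch, with malignant treated as the positive class as the captions imply. The function names are hypothetical, and assigning a probability of exactly 0.5 to the malignant class is an assumption, since the captions do not say how ties were handled.

```python
def classify(p_malignant, threshold=0.5):
    """Label a case from its predicted malignant probability.

    Assumption: a probability exactly equal to the threshold counts as malignant.
    """
    return "malignant" if p_malignant >= threshold else "benign"


def outcome(p_malignant, truth, threshold=0.5):
    """Compare the prediction with the reference diagnosis."""
    predicted = classify(p_malignant, threshold)
    if predicted == "malignant":
        return "true positive" if truth == "malignant" else "false positive"
    return "true negative" if truth == "benign" else "false negative"


if __name__ == "__main__":
    # Malignant probabilities and reference diagnoses quoted in Figures 1 to 3.
    cases = [
        (0.06, "benign"), (0.43, "benign"),        # Figure 1
        (0.92, "malignant"), (0.52, "malignant"),  # Figure 2
        (0.54, "benign"), (0.39, "malignant"),     # Figure 3
    ]
    for p, truth in cases:
        print(p, truth, outcome(p, truth))
```

The low-confidence correct cases (0.43 and 0.52) and the two misdiagnosed cases (0.54 and 0.39) all lie close to the 0.5 cut-off.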

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
