Lee-Ren Yeh1, Yang Zhang2, Jeon-Hor Chen1,2, Peter Chang2, Daniel Chow2, and Min-Ying Lydia Su2
1Department of Radiology, E-Da Hospital and I-Shou University, Kaohsiung, Taiwan, 2Department of Radiological Sciences, University of California, Irvine, CA, United States
Synopsis
Differentiating benign from malignant vertebral fractures is challenging yet
very important for therapeutic planning. In this study, deep learning was used
to automatically differentiate the fracture types using five convolutional
neural networks: ResNet50, DenseNet, Xception, XceptionResNetV2, and
InceptionV3. The final classification model was developed using 10-fold
cross-validation with two input methods: a single slice, or each slice combined
with its two neighboring slices. Overall, prediction accuracy improved when
each slice combined with its two neighboring slices was used as the input.
Among the five networks, XceptionResNetV2 showed the highest prediction
accuracy.
Background and Purpose
Benign and malignant vertebral fractures may be difficult to differentiate
because of their similar presentation. The bone shows a change in signal
intensity only when the fat-containing yellow marrow of the vertebra has been
replaced by a sufficient number of cancer cells. Accurate clinical diagnosis is
especially difficult in elderly patients with no or only minor trauma history.
Differentiating benign osteoporotic or traumatic fractures from malignant
fractures is necessary to establish appropriate staging and therapeutic
planning, especially in the acute and subacute stages. Imaging plays an
important role in disease evaluation. Plain radiography has limitations in
distinguishing among osteoporotic fractures, metastasis-induced fractures, and
other primary bone neoplasms. Of all radiological investigations, MR imaging is
the most helpful for distinguishing malignant from benign fractures.
Although conventional MRI has been used extensively for assessing spinal
lesions [1], the accurate differential diagnosis of benign and malignant
fractures remains challenging. Recently, quantitative imaging methods such as
diffusion-weighted imaging (DWI) have been investigated and have shown value
[2, 3]. Nevertheless, DWI has not gained wide popularity because of technical
issues and overlapping ADC values. In this study, we went a step further and
investigated the potential value of deep learning approaches in
differentiating benign from malignant vertebral fractures. Benign fractures
resulting from osteomyelitis, Paget's disease, hyperparathyroidism, and other
metabolic processes were excluded from the analysis. For malignant fractures,
only cases of metastatic lesions were included; patients with other primary
neoplasms of the vertebrae were excluded.
Materials and Methods
The patients included in this study were randomly selected from the
radiological reporting system over a period of 4 years using the keywords
osteoporotic fracture, traumatic fracture, and tumor fracture. All subjects
received MR imaging of the spine on a 1.5T scanner. The MR images were then
reviewed by an experienced bone radiologist to confirm the lesion(s). All
patients with benign fracture had no known history of malignancy and had been
followed up with stable disease. The patients with malignant fracture had
either biopsy proof of cancer or a known history of primary tumor with
progressive disease. The most common primary cancer was lung, followed by
colon/rectum, breast, and prostate. The primary origin was not known in three
patients. In total, 140 patients with benign fracture and 62 patients with
malignant fracture were studied. An experienced radiologist identified the
abnormal region on sagittal T2W images. For each case, a rectangular region of
interest (ROI) was manually placed over the abnormal region, and the smallest
bounding box containing the entire lesion was used to crop the original images.
We tested two types of input. The first used each single slice as an
independent input to the network. The second used each slice combined with its
two neighboring slices as one input.
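The two input types can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names, the toy volume, the bounding-box format, and the treatment of the first and last slices (edge replication) are all assumptions, since the abstract does not describe these details.

```python
import numpy as np

def crop_to_box(volume, box):
    """Crop every slice of a (slices, rows, cols) volume to one bounding box.

    box = (row_min, row_max, col_min, col_max), max indices exclusive.
    The box format is hypothetical; the abstract only says a smallest
    bounding box was used.
    """
    r0, r1, c0, c1 = box
    return volume[:, r0:r1, c0:c1]

def single_slice_inputs(volume):
    """Input type 1: each slice on its own, shape (slices, rows, cols, 1)."""
    return volume[..., np.newaxis]

def three_slice_inputs(volume):
    """Input type 2: each slice stacked with its two neighbors as channels.

    Assumption: the first and last slices reuse themselves as the missing
    neighbor (edge replication); the source does not say how edges were
    handled.
    """
    padded = np.concatenate([volume[:1], volume, volume[-1:]], axis=0)
    # Channel order: previous slice, current slice, next slice.
    return np.stack([padded[:-2], padded[1:-1], padded[2:]], axis=-1)

# Toy volume: 5 sagittal slices of 8x8 pixels, each filled with its index.
volume = np.stack([np.full((8, 8), i, dtype=np.float32) for i in range(5)])
cropped = crop_to_box(volume, (2, 6, 1, 7))
single = single_slice_inputs(cropped)   # shape (5, 4, 6, 1)
triple = three_slice_inputs(cropped)    # shape (5, 4, 6, 3)
```

One practical consequence of the three-slice form is that each sample has three channels, which is the channel count that networks pretrained on ImageNet RGB images expect.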
Deep learning was performed to automatically differentiate the fracture types
using five convolutional neural networks: ResNet50, DenseNet, Xception,
XceptionResNetV2, and InceptionV3 [4-6]. The final classification model was
developed using 10-fold cross-validation. To reduce overfitting, the dataset
was augmented with random affine transformations. The loss function was cross
entropy, and the optimizer was Adam with a learning rate of 0.001 [7]. The
model parameters were initialized with weights pretrained on ImageNet [8].
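As an illustration of how a 10-fold cross-validation split might be set up for slice-level inputs, the sketch below assigns folds by patient so that slices from one patient never appear in both the training and validation sets. This grouping is an assumption: the abstract does not state how folds were assigned, and the helper names and patient IDs here are hypothetical. For simplicity the sketch also does not stratify by fracture type.

```python
import random

def patient_level_folds(patient_ids, n_folds=10, seed=0):
    """Assign each unique patient to one of n_folds validation folds."""
    unique = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Round-robin assignment keeps fold sizes within one patient of each other.
    return {pid: i % n_folds for i, pid in enumerate(unique)}

def split_indices(patient_ids, fold_of, fold):
    """Return (train, validation) sample indices for one fold."""
    train = [i for i, pid in enumerate(patient_ids) if fold_of[pid] != fold]
    val = [i for i, pid in enumerate(patient_ids) if fold_of[pid] == fold]
    return train, val

# Toy data: 202 hypothetical patients (140 + 62, as in the study),
# with 3 slices per patient chosen arbitrarily for the example.
patient_ids = [f"P{p:03d}" for p in range(202) for _ in range(3)]
fold_of = patient_level_folds(patient_ids)
folds = [split_indices(patient_ids, fold_of, k) for k in range(10)]
```

Each `(train, val)` pair would then drive one round of training and evaluation, and the per-fold accuracies can be summarized across the ten rounds.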
Results
Table 1 shows the prediction accuracy of the five deep learning approaches when
a single slice, or each slice combined with its two neighboring slices, was
used as the input in 10-fold cross-validation. When a single slice was used as
the input, XceptionResNetV2 achieved the highest prediction accuracy of the
five methods, 0.93 (0.85 – 0.95). When each slice combined with its two
neighboring slices was used as the input, the prediction accuracy improved
overall. With XceptionResNetV2, the prediction accuracy in 10-fold
cross-validation was 0.95 (0.85 – 0.99), the highest among the five approaches.
Figure 1 shows two case examples of benign fracture assessed by deep learning
as high probability and low probability, respectively. Figure 2 shows two case
examples of malignant fracture assessed as high probability and low
probability, respectively. Figure 3 shows two cases: one benign fracture
misdiagnosed as malignant, and one malignant fracture misdiagnosed as benign.
Discussion
This study compared the performance of five deep learning approaches in differentiating benign from malignant fractures, using either a single slice or three consecutive slices as the input. Overall, the prediction accuracy was higher when each slice combined with its two neighboring slices was used as the input than when a single slice was used. The radiological diagnosis of benign or malignant fracture is based on the combined information derived from T1W and T2W images in all planes. For example, an adjacent soft tissue mass, involvement of the pedicle and posterior elements, and other skip lesions in the bone marrow are more specific for malignancy. However, diffuse marrow replacement, posterior protrusion of the vertebral body, compression of the entire body, and end plate disruption are seen in both benign and malignant fractures. For equivocal cases in which a definite diagnosis is difficult, it is hoped that a deep learning approach can increase diagnostic specificity.
Acknowledgements
This study was supported in part by NIH R01 CA127927.
References
[1] Jung HS, et al. Discrimination of metastatic from acute osteoporotic compression spinal fractures with MR imaging. RadioGraphics. 2003;23:179–87.
[2] Herneth AM, et al. Vertebral metastases: assessment with apparent diffusion coefficient. Radiology. 2002;225:889–94.
[3] Chan JH, et al. Acute vertebral body compression fractures: discrimination between benign and malignant causes using apparent diffusion coefficients. Br J Radiol. 2002;75:207–14.
[4] He K, et al. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[5] Chollet F. Xception: Deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357. 2017.
[6] Szegedy C, et al. Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[7] Kingma D, et al. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
[8] Deng J, et al. ImageNet: a large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2009.