Hailong Li1,2, Lili He1,2,3, Jonathan Dudley2,4, Thomas Maloney2,4, Elanchezhian Somasundaram4, Samuel L. Brady4,5, Nehal A. Parikh1,3, and Jonathan R. Dillman2,4,5
1The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States, 2Imaging Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States, 3Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States, 4Department of Radiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States, 5Department of Radiology, University of Cincinnati College of Medicine, Cincinnati, OH, United States
Synopsis
Chronic liver diseases are typically detected and monitored using a combination
of clinical history, physical examination, laboratory testing, biopsy with
histopathologic assessment, and imaging. The aim of this study is to develop a deep transfer
learning model (DeepLiverNet) to categorically classify the severity of liver
stiffening (no/mild vs. moderate/severe) using both anatomic T2-weighted MR
images and clinical data. The DeepLiverNet model achieved accuracies of 88.0% and
80.0% on the risk stratification of liver stiffness in internal and external
validation datasets, respectively. This demonstrates that a deep learning model
may provide a means for stratifying liver stiffness without elastography.
INTRODUCTION
Chronic liver diseases are a common source of morbidity and
mortality in both children and adults in the United States and around the world.
1,2 The presence and progression of
such liver diseases are typically assessed using a combination of clinical
history, physical examination, laboratory testing, biopsy with histopathologic
assessment, and imaging. 3 MR elastography (MRE), a
non-invasive method of assessing liver stiffness, uses an active-passive driver
system (with the passive paddle placed over the right upper quadrant of the
abdomen at the level of the costal margin) to create transverse (shear) waves
in the liver. 4 Although MRE may obviate the
need for liver biopsy in some patients and allows more frequent longitudinal
assessment of liver health, it has associated drawbacks related to additional
patient time in the scanner, patient discomfort, and added costs (e.g., infrastructure
and patient charge-related). 5 In the current project, we aim
to develop a deep learning approach to categorically classify liver
stiffness (as determined by MRE) using both T2-weighted imaging and clinical data
from pediatric and young adult patients.
METHODS
In this retrospective study, we
collected 178 MRE examinations from a GE scanner for model development and internal
validation and 95 MRE examinations from a Philips scanner for external
validation. For each subject, the mean liver stiffness value in kPa (shear
modulus) and 27 clinical features were retrieved from the electronic health
record (Epic Systems Corporation; Verona, WI). Based on a pre-defined liver
stiffness cutoff 6, patients were divided into
two groups (<3 kPa = no/mild vs. ≥3 kPa = moderate/severe liver stiffening).
Axial two-dimensional T2-weighted fast spin-echo fat-suppressed images were extracted
from our clinical Picture Archiving and Communication System.
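The grouping rule described above can be written as a small function. The following is a minimal Python sketch: the function name `stiffness_group` and the example stiffness values are hypothetical and are not taken from the study data; only the 3 kPa cutoff and the two group names come from the text.

```python
# Cutoff from the text: <3 kPa = no/mild, >=3 kPa = moderate/severe.
CUTOFF_KPA = 3.0


def stiffness_group(mean_stiffness_kpa, cutoff=CUTOFF_KPA):
    """Map a mean MRE liver stiffness (kPa) to one of the two groups."""
    if mean_stiffness_kpa < cutoff:
        return "no/mild"
    return "moderate/severe"


# Hypothetical stiffness values, chosen only to exercise both branches
# and the boundary case at exactly 3 kPa.
examples = [2.1, 2.99, 3.0, 4.7]
labels = [stiffness_group(v) for v in examples]
print(labels)
```

Note that a value of exactly 3 kPa falls in the moderate/severe group, consistent with the "≥3 kPa" definition.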
Our deep model contains two separate input channels
for imaging and clinical data, respectively (Figure 1). The imaging channel comprises an image input
layer, a transfer learning block, and an adaptive learning block. First, the
image input layer contains S parallel
input sub-channels, taking S
individual slices of fixed-size axial T2-weighted MR images. Next, to extract
liver image features, we designed a transfer learning block by reusing the
weights of a VGG-19 model 7 (from 1st to 21st layers)
that was pre-trained on ~1.2 million color images from the ImageNet database. 8 Then, we designed an adaptive
learning block that contains S
parallel sub-channels (two convolutional layers with [8, 16] filters of size 3×3,
and a fully-connected layer with 8 neurons) corresponding to the input
sub-channels for learning the individual latent features of the S liver slices. Finally,
the sub-channels in the adaptive learning block are integrated by a
fully-connected layer with 8 neurons. In the current study, we used four axial T2-weighted
images of the liver from the same anatomic levels as MRE images (i.e., S=4). For the
clinical channel, a fully-connected layer with 8 neurons is directly applied.
After feature extraction, a fusion block (8 neurons) is applied to
integrate the latent features from both imaging and clinical data. A two-way
softmax classifier is then used to classify the severity of liver stiffness.
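The way the two channels are merged can be illustrated with a short forward pass. The sketch below is in plain Python and covers only the fully-connected stages described above (per-slice latent features merged into 8 neurons, 27 clinical features mapped to 8 neurons, an 8-neuron fusion block, and a two-way softmax). It is not the authors' implementation: the VGG-19 and convolutional layers are omitted, the weights are random placeholders, and the ReLU activation and all function names are assumptions, since the text does not specify them.

```python
import math
import random

S = 4            # number of T2-weighted slices (from the text)
N_CLINICAL = 27  # clinical features per subject (from the text)
N_LATENT = 8     # neurons per fully-connected layer (from the text)


def init_layer(rng, n_in, n_out):
    """Random placeholder weights and zero biases; a real model learns these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, bias, relu=True):
    """Fully-connected layer. ReLU is an assumption, not stated in the text."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
layers = {
    "image_merge": init_layer(rng, S * N_LATENT, N_LATENT),
    "clinical": init_layer(rng, N_CLINICAL, N_LATENT),
    "fusion": init_layer(rng, 2 * N_LATENT, N_LATENT),
    "classifier": init_layer(rng, N_LATENT, 2),
}


def forward(slice_features, clinical_features):
    """slice_features: S lists of 8 values; clinical_features: 27 values."""
    merged = [v for feats in slice_features for v in feats]
    image_latent = dense(merged, *layers["image_merge"])
    clinical_latent = dense(clinical_features, *layers["clinical"])
    # Fusion block: concatenate both 8-value latent vectors, then 8 neurons.
    fused = dense(image_latent + clinical_latent, *layers["fusion"])
    logits = dense(fused, *layers["classifier"], relu=False)
    return softmax(logits)  # [P(no/mild), P(moderate/severe)]


# Random stand-ins for the sub-channel outputs and clinical features.
slice_features = [[rng.random() for _ in range(N_LATENT)] for _ in range(S)]
clinical_features = [rng.random() for _ in range(N_CLINICAL)]
probs = forward(slice_features, clinical_features)
print(probs)
```

With random weights the two output probabilities carry no meaning; the point is only the data flow, in which imaging and clinical information are each reduced to 8 latent features before being combined.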
Given the imbalanced subject
ratio (<3 vs. ≥3 kPa = ~2:1 in the current study), a rotation- and
shift-based data augmentation scheme 9 was used to balance the number of
subjects between the two groups. The diagnostic performance of the model was assessed
using accuracy, sensitivity, specificity, and area under the
receiver operating characteristic curve (AuROC). We used 10-fold
cross-validation on the internal cohort for internal validation. In the
external validation, we trained the model on the internal cohort and tested it
on the external cohort.
RESULTS
Demographics of the two cohorts are
listed in Table 1.
Internal Validation. We first set out to determine the
performance of DeepLiverNet using only non-stiffness T2-weighted imaging data,
including liver volume and chemical shift-encoded fat fraction. As shown in Table 2, DeepLiverNet
classified patients with regard to categorical MRE liver stiffness with
an AuROC of 0.80. Using only clinical data, the model classified patients with
an AuROC of 0.83, which was significantly greater (p=0.003) than that of
the imaging-only model. DeepLiverNet combining both T2-weighted MR
imaging and clinical data classified patients with an AuROC
of 0.86, which was significantly greater than with imaging data alone (p<0.0001)
or clinical data alone (p<0.0001).
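For readers less familiar with the reported metrics, the sketch below shows one common way to compute accuracy, sensitivity, specificity, and AuROC for a two-class problem in plain Python. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not study data, and the pairwise-ranking formulation of AuROC is an assumption that may differ from the software the authors used.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) with 1 = moderate/severe, 0 = no/mild."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half (the Mann-Whitney formulation of AuROC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities of the moderate/severe class.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
sensitivity = tp / (tp + fn)   # fraction of moderate/severe cases detected
specificity = tn / (tn + fp)   # fraction of no/mild cases correctly ruled out
auc = auroc(y_true, scores)
print(accuracy, sensitivity, specificity, auc)
```

Unlike accuracy, sensitivity, and specificity, AuROC does not depend on the chosen decision threshold, which is why it is useful for comparing the imaging-only, clinical-only, and combined models.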
External Validation. The internally validated DeepLiverNet using both clinical and imaging
features achieved an accuracy of 80.0%, a sensitivity of 61.1%, a
specificity of 91.5%, and an AuROC of 0.77 on independent external subjects.
DISCUSSIONS and CONCLUSIONS
In this work, we applied both transfer learning
and data augmentation strategies to reduce overfitting of the deep learning model. Our
model demonstrated good generalizability when externally validated (accuracy 80.0%, AuROC 0.77). Our
proposed deep learning model that incorporated clinical features and
T2-weighted MR images demonstrated a means of classifying patients into
normal/minimally elevated versus moderately/severely elevated liver stiffness
with an accuracy up to 88%. Further studies are needed to continue to refine
the model as well as validate it in other patient groups, including older
adults and cohorts with very specific liver diseases (e.g., sclerosing
cholangitis, viral hepatitis, non-alcoholic fatty liver disease).
Acknowledgements
We gratefully acknowledge the internal research grant from the Department of Radiology, Cincinnati Children's Hospital Medical Center.
References
1. Chalasani N, Younossi Z, Lavine JE, et al. The diagnosis and management of nonalcoholic fatty liver disease: practice guidance from the American Association for the Study of Liver Diseases. Hepatology. 2018;67(1):328-357.
2. Lavanchy D. The global burden of hepatitis C. Liver International. 2009;29:74-81.
3. Tapper EB, Lok AS-F. Use of liver imaging and biopsy in clinical practice. New England Journal of Medicine. 2017;377(8):756-768.
4. Trout AT, Sheridan RM, Serai SD, et al. Diagnostic performance of MR elastography for liver fibrosis in children and young adults with a spectrum of liver diseases. Radiology. 2018;287(3):824-832.
5. Wang M, Byram B, Palmeri M, Rouze N, Nightingale K. Imaging transverse isotropic properties of muscle by monitoring acoustic radiation force induced shear waves using a 2-D matrix ultrasound array. IEEE Transactions on Medical Imaging. 2013;32(9):1671-1684.
6. He L, Li H, Dudley JA, et al. Machine learning prediction of liver stiffness using clinical and T2-weighted MRI radiomic data. American Journal of Roentgenology. 2019:1-10.
7. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014.
8. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition. 2009.
9. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. 2012.