Peyman Shokrollahi1, Juan M Zambrano1, Allison Li2, Surbhi Raichandani1, Akshay S. Chaudhari1, and Andreas M. Loening1
1Stanford University, Stanford, CA, United States, 2GE Healthcare, Sunnyvale, CA, United States
Synopsis
Keywords: Other AI/ML, Machine Learning/Artificial Intelligence, Radiology Protocols, Decision Support System, Modeling, All-Body MR Protocols
Motivation: We developed a system that performs radiology protocol selection for incoming MRI orders.
Goal(s): To enhance MRI protocol selection accuracy and efficiency. We evaluated new models and expanded anatomic/subspecialty coverage compared to a prior body MRI protocol selection system.
Approach: A machine learning-driven decision-support system was developed using data from 22,524 patients, integrating kernel-based, tree-based, boosting, and deep-learning algorithms with an ensemble classifier. This system utilizes electronic medical records to predict the top-three likely MRI protocols and their probabilities.
Results: A cumulative F1-score of 97.1% for the top-three predicted MRI protocols was obtained in a test set of 3,379 patients.
Impact: The proposed system has the potential to improve radiologists’
protocol selection accuracy by notifying them of protocol-case discrepancies due
to the individual patient’s conditions, and to enable a decision-support system
for greater efficiency in selecting commonly utilized MR protocols.
Introduction
Nearly 40
million MRI scans are conducted annually in the USA1. Most research
on machine learning (ML) applications has centered on image processing tasks,
such as detection2, segmentation3, and reconstruction4,
rather than pre-image acquisition tasks5. When selecting a protocol
in response to a physician order, a radiologist specifies a protocol (e.g., MR
pelvis prostate carcinoma without and with contrast) that most commonly
encompasses a broad anatomical region (e.g., pelvis), a specific organ target (e.g.,
prostate), a particular purpose (e.g., cancer screening), and whether contrast
will be utilized (e.g., without and with contrast)6. In addition to
its susceptibility to human error, this tedious
process7 takes radiologist
time away from clinical image interpretation. Consequently, protocol selection is
sometimes delegated to technologists8. Importantly, use of an improper
protocol may yield insufficient diagnostic data, putting patient health at
risk, delaying treatment, and increasing healthcare costs8,9.
ML
could increase efficiency in radiology workflows by facilitating appropriate
protocol selection. Unlike prior ML systems that utilize free-text inputs9,10,
our system uses structured data from the electronic medical record (EMR). Large
language models (LLMs) have recently been used in protocol selection5,11.
However, LLMs present challenges, including inaccuracy, uncertainty, and data-privacy
issues12,13. Prior work described EMR-trained modeling systems designed to
predict rank-ordered body MR protocols while avoiding these challenges14.
Herein, we extend this modeling approach to cover a greater diversity and
anatomical scope of applications in a streamlined ensemble system.
Methods
All work was performed under an institutional review board-approved consent
waiver, using retrospective, anonymized data. EMRs were obtained from patients undergoing
cardiovascular, cardiac, body, breast, and neuroradiology MRI scans at our
institution between May 2017 and December 2022. A tabular dataset was generated,
including radiology-specific data (e.g., protocol forms, worklist, history,
allergies) and general data (e.g., demographics, laboratory, and orders). Initial
attribute selection incorporated factors such as ordered procedure,
order priority, allergy information, and previously applied protocols.
Data corresponding to the 10 most frequently used
protocols (Table 1) were extracted from EMRs for 22,524 patients, encompassing 31,380
radiology examinations and forming a tabular input with a feature space
of 156 attributes per record. We excluded records that were duplicates, had altered
orders, were for other imaging modalities, were musculoskeletal (MSK) examinations, or were for
patients outside the 25‒85-year age range. The dataset was split into training (75%) and testing
(25%) subsets at the patient level. We used support vector machine (SVM), random forest
(RF), light gradient boosting machine (LGBM), extreme gradient boosting
(XGBoost), and neural network (NN) algorithms for performing protocol
classification9,15-17. The NN model consisted of four layers with
batch normalization and ReLU activation functions. An ensemble classifier integrated
all models except SVM (excluded because of its high computational time) by averaging
the classifiers’ probabilistic predictions. Hyperparameters (e.g., learning rate, leaf
count, and bandwidth) were tuned with a Bayesian optimization approach18.
Five-fold cross-validation was employed to assess model performance, measured by
F1-score (a metric suitable for evaluating imbalanced data). Shapley additive explanations
(SHAP) values were plotted to reveal the relative impacts of each feature on
predictions.
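As an illustration only, the patient-level split and the probability-averaging ensemble described above could be sketched as follows. This is a minimal, standard-library sketch and not the authors' implementation: the function names, the `patient_id` field, the protocol labels, and the probability values are all hypothetical, and unweighted averaging is an assumption (the text states only that probabilistic predictions were averaged).

```python
import random


def split_by_patient(records, test_fraction=0.25, seed=0):
    """Assign whole patients to train or test so no patient appears in both."""
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = round(len(patient_ids) * test_fraction)
    test_ids = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test


def average_probabilities(per_model_probs):
    """Soft voting: unweighted mean of each model's class-probability vector."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models
            for c in range(n_classes)]


def top_k(probs, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical data: 20 patients with two examinations each.
records = [{"patient_id": i // 2, "exam": i} for i in range(40)]
train, test = split_by_patient(records)

# Hypothetical class probabilities for one order from four models
# (standing in for RF, LGBM, XGBoost, and NN; SVM is left out of the ensemble).
labels = ["protocol_A", "protocol_B", "protocol_C", "protocol_D"]
per_model = [
    [0.60, 0.25, 0.10, 0.05],
    [0.50, 0.30, 0.15, 0.05],
    [0.40, 0.20, 0.30, 0.10],
    [0.50, 0.25, 0.05, 0.20],
]
ensemble = average_probabilities(per_model)
suggestions = top_k(ensemble, labels)
for label, prob in suggestions:
    print(f"{label}: {prob:.2f}")
```

Splitting on patient identifiers rather than on individual records keeps repeat examinations of the same patient from appearing in both subsets, which would otherwise inflate test performance.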
The tuned models were used to predict
protocols and their probabilities for an unseen test dataset of 3,379 patients
with 4,771 records. The pipeline lists the top three protocol suggestions and
their probabilities. The accuracy of these selections was evaluated based on F1-scores.
Results and Discussion
In cross-validation, we obtained an average F1-score of 89.7%
across all models (Fig. 1). SHAP plots revealed that Ordered Procedure and
Ordered Anatomical Region were important features in most models (Fig. 2). The
three most probable protocols predicted by the ensemble classifier can
be flagged for radiologist review (Fig. 3). A cumulative F1-score of 97.1% was
obtained for the top-three predicted protocols (Fig. 4). The highest-ranked
protocols, and their associated probabilities, aid in selecting the most
appropriate protocol as part of a clinical decision-support system.
Conclusions
The presently tested protocol selection system,
which provides expanded modeling and anatomical scan target coverage compared
to a prior system14, has been validated
with real clinical data. Incorporating a variety of classifiers improved its
predictive accuracy and stability. The proposed pipeline represents a
new approach to optimizing radiology protocol selection by providing automated suggestion
of common protocols. We are in the process of incorporating this
decision-support system into our radiologists’ workflow and hope to assess whether
it can contribute to improved patient outcomes while improving radiology
efficiency.
Acknowledgements
This work was supported and funded by General Electric (GE) Healthcare.
References
1. The Organization for Economic Cooperation and
Development. MRI units per million: by country, 2022. USA,
https://data.oecd.org/healtheqt/magnetic-resonance-imaging-mri-units.htm, Accessed November 1, 2023.
2. Sheth D, Giger M. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310-1324.
3. Goldenberg S, Nir G, Salcudean S. A new era: artificial intelligence and machine learning in prostate cancer. Nat Rev Urol. 2019;16(7):391-403.
4. Wang G, Ye J, De Man B. Deep learning for tomographic image reconstruction. Nat Mach Intell. 2020;2(12):737-748.
5. Gertz R, Bunck A, Lennartz S, et al. GPT-4 for automated determination of radiological study and protocol based on radiology request forms: a feasibility study. J Radiol. 2023;307(5):230877.
6. Boland G, Duszak R. Protocol management and design: current and future best practices. J Am Coll Radiol. 2015;12(8):833-835.
7. Richardson M, Garwood E, Lee Y, et al. Noninterpretive uses of artificial intelligence in radiology. Acad Radiol. 2021;28(9):1225-1235.
8. Kalra A, Chakraborty A, Fine B, et al. Machine learning for automation of radiology protocols for quality and efficiency improvement. J Am Coll Radiol. 2020;17(9):1149-1158.
9. Brown A, Marotta T. A natural language processing-based model to automate MRI brain protocol selection and prioritization. Acad Radiol. 2017;24(2):160-166.
10. Trivedi H, Mesterhazy J, Laguna B, et al. Automatic determination of the need for intravenous contrast in musculoskeletal MRI examinations using IBM Watson’s natural language processing algorithm. J Digit Imaging. 2018;31(2):245-251.
11. Mese I, Taslicay C, Sivrioglu A. Improving radiology workflow using ChatGPT and artificial intelligence. Clin Imaging. 2023;109993.
12. Thirunavukarasu A, Ting D, Elangovan K, et al. Large language models in medicine. Nat Med. 2023;29(8):1930-1940.
13. Clusmann J, Kolbinger F, Muti H, et al. The future landscape of large language models in medicine. Commun Med. 2023;3(1):141.
14. Shokrollahi P, Zambrano J, et al. Predicting Abdominal MRI Protocols using Electronic Health Records. In Proc. Int. Soc. Magn. Reson. Med., Toronto, Canada, Jun. 2023.
15. Retson T, Besser A, Sall S, et al. Machine learning and deep neural networks in thoracic and cardiovascular imaging. J Thorac Imaging. 2019;34(3):192.
16. Charbuty B, Abdulazeez A. Classification based on decision tree algorithm for machine learning. J Appl Sci Technol Trends. 2021;2(01):20-28.
17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.
18. Snoek J, Larochelle H, Adams R. Practical Bayesian optimization of machine learning algorithms. Adv Neural Inf Process Syst. 2012;25.