Synopsis
The development, validation, and translation of advanced new imaging methods form an exciting and important area of scientific development and clinical medicine. Standardized approaches and objective measures for new imaging technologies, such as SNR and CNR, together with subjective ordinal metrics, are extremely helpful, particularly in the early stages of technical development and translation. Subsequent studies comparing new imaging techniques with accepted reference standards are the next step, establishing the diagnostic performance of a technique for the detection and staging of disease.
Ultimately, clinical effectiveness and patient outcomes are the most important metrics of the impact of new technologies.
Finally, there are many practical barriers, including workflow and post-processing, that must be addressed to garner acceptance by technologists, radiologists, and referring physicians.
Highlights
· Acceptance of a new technique generally requires that the acquisition have one or more of the following qualities: it is faster, is more robust or reliable, provides improved image quality, or offers new information not previously available from other acquisition techniques.
· Evaluation of
new imaging techniques should focus on the specific anatomy of interest or on a
specific clinical question, not the overall appearance of an image.
· Commonly used numerical metrics for the initial clinical evaluation of new imaging methods include SNR, CNR, and subjective grading metrics such as overall image quality, artifact severity, and specific quality metrics relevant to the technical development (e.g., quality of fat suppression).
· Ultimately, determination
of clinical impact requires evaluation of diagnostic accuracy, clinical
effectiveness and the impact on clinical decision-making.
· Dissemination of
new imaging techniques requires acceptance from MRI technologists,
radiologists, and referring physicians.
· Effective
development and translation of new imaging techniques into clinical care
requires partnership with clinical imaging experts.
Introduction
Imaging scientists continue to be
remarkably productive, producing more and more innovative MRI acquisition
techniques with potential for improved clinical care. These methods reduce
acquisition time, improve image quality, improve the robustness of the
acquisition, and/or provide new information to the interpreting physician.
We live in an exciting time in which the combination of improved scanner hardware and improved acquisition algorithms has led to an explosion of new methods for acquiring MR images. We are also living in an era of financial constraint, with increasing pressure to reduce overall protocol times and to answer the clinical question quickly and reliably. Objective measures for evaluating new imaging technologies are therefore increasingly important during clinical translation, as a way to triage the methods with the greatest promise.
When developing and
validating new imaging methods, it is critical to understand the anatomy of
interest and the relevant clinical context or question. Even though a
new technique may produce beautiful images, if the anatomy of interest is not
well visualized or the clinical question not answered, the value of that method
for that specific application is low. For example, a new MRA method to evaluate
for renal artery stenosis must have excellent image quality not only in normal
renal arteries, but also in those with disease. Good visualization of the aorta
or other vessels may be irrelevant. Another example is the ability to visualize the thin capsule of a hypertrophic nodule in benign prostatic hypertrophy, to differentiate it from cancer.
Typically, testing in healthy volunteers is performed for the initial technical evaluation. Ultimately, testing in patients is necessary, not just for diagnostic accuracy (sensitivity, specificity) but also for technical performance. While a new imaging technique may work beautifully in a healthy graduate student, its performance may suffer in a child, an obese patient, or an elderly patient.
We are also constrained by the well-known trade-off between signal to noise ratio (SNR), spatial resolution, and scan time. For most body and MSK applications, higher spatial resolution is always better, so long as the SNR is adequate for the diagnostic task. Although there is no such thing as “too much SNR”, there is a point above which increasing SNR adds little new information, and increases in spatial resolution or shorter scan times are preferred. In general, radiologists are content with sufficient SNR in order to obtain the highest spatial resolution within an acceptable scan time. Trade-offs between spatial resolution and scan time are made when short scan times (e.g., breath-holds) are needed. A good rule of thumb is that most patients can hold their breath for no more than 20 seconds. Navigator-based or motion-compensated methods can help break the constraints on spatial resolution and SNR imposed by scan time.
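The trade-off can be made concrete with the standard proportionality below. It is offered as a sketch rather than a formula taken from this syllabus, and it assumes that field strength, coil, sequence type, and receiver bandwidth are held fixed. The symbols \Delta x, \Delta y, \Delta z (voxel dimensions) and T_acq (total data sampling time) are introduced here for illustration only.

```latex
\mathrm{SNR} \;\propto\; \Delta x \,\Delta y \,\Delta z \,\sqrt{T_{\mathrm{acq}}}
```

Under these assumptions, halving the voxel size in all three dimensions reduces the voxel volume, and therefore the SNR, by a factor of 8. Recovering the original SNR through longer sampling alone would require 8^2 = 64 times the sampling time, which is why modest gains in resolution can quickly exceed a breath-hold.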
Early Evaluation of New Techniques
Early evaluation generally focuses on technical performance, such as signal to noise ratio (SNR) and contrast to noise ratio (CNR). These provide a simple and objective means of determining the improvement in performance of a new method relative to conventional imaging. SNR and CNR are of great interest when comparing different techniques, optimizing a technique, or comparing different contrast agents with regard to their performance with a particular sequence. Absolute SNR should be measured whenever possible; however, it is the relative SNR or CNR performance between two techniques that matters. A fair comparison of SNR or CNR between two methods requires that scan time and/or spatial resolution be held constant.
Evaluation of the CNR
between two different tissues of interest is typically needed. For example, the
conspicuity of a liver lesion depends on the CNR between the lesion and the
adjacent liver parenchyma. Similarly, the CNR between cartilage and joint fluid is needed when assessing sequences designed to evaluate cartilage.
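As a minimal sketch of the liver lesion example, SNR and CNR can be computed from region of interest (ROI) pixel values as shown below. The function names, the pixel values, and the noise standard deviation are all hypothetical, and the noise level is simply assumed to be known; estimating it reliably is the difficult part in practice.

```python
import statistics


def snr(signal_roi, noise_sd):
    """Mean ROI signal divided by the noise standard deviation."""
    return statistics.fmean(signal_roi) / noise_sd


def cnr(roi_a, roi_b, noise_sd):
    """Absolute difference in mean signal between two tissues, per unit noise."""
    return abs(statistics.fmean(roi_a) - statistics.fmean(roi_b)) / noise_sd


# Hypothetical pixel values (arbitrary units) from ROIs drawn by a reader.
lesion = [182.0, 175.0, 190.0, 178.0, 185.0]
liver = [120.0, 118.0, 125.0, 122.0, 115.0]
noise_sd = 10.0  # assumed known for this illustration

print(f"SNR lesion = {snr(lesion, noise_sd):.1f}")      # 18.2
print(f"SNR liver  = {snr(liver, noise_sd):.1f}")       # 12.0
print(f"CNR        = {cnr(lesion, liver, noise_sd):.1f}")  # 6.2
```

With a common noise level, the CNR is simply the difference between the two SNR values, which is why a sequence can have high SNR in both tissues and still show a lesion poorly.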
The following objective measures are
commonly used for both body and MSK applications:
1. Signal to noise ratio (SNR) - this is the metric most widely accepted by both physicists and radiologists for evaluating image quality. Whenever possible, it is preferable to measure absolute SNR, in order to gauge the performance across patients. Most radiologists (and some physicists!) are not aware of the pitfalls of measuring SNR in the presence of parallel imaging or compressed sensing. Further, they often neglect to use the necessary correction factors when measuring SNR from magnitude images to account for the Rician noise characteristics of magnitude images (1,2).
2. Contrast to noise ratio (CNR) - CNR is often more important than SNR, because the value of a sequence depends heavily on its ability to visualize pathology, which typically requires contrast between two adjacent tissues. Examples include the ability to identify and visualize lesions, and the contrast between two relevant areas of anatomy, such as joint fluid and cartilage. Muscle is a useful surrogate for metastatic lesions in the liver when optimizing CNR for a method aimed at detecting liver lesions.
3. SNR and CNR pitfalls - when parallel imaging, compressed sensing, constrained reconstruction, or non-Cartesian techniques are used, spatially non-uniform and/or colored noise may result. This may render traditional measurement of SNR and CNR invalid. There are techniques available for measuring absolute SNR and CNR using multiple acquisition or subtraction techniques (3). In my experience, however, these methods are neither feasible nor practical for most body and many MSK applications, due to intra-scan motion. Calibrating systems to provide images in units of SNR is a more elegant and attractive approach, but it is not widely used (4).
4. Relative SNR and CNR - the use of relative SNR and CNR is an alternative to absolute SNR and CNR in some circumstances, but can be limited. For example, for MRA applications, we may wish to optimize the flip angle or compare the performance of two contrast agents. Normalizing the signal within a vessel by the signal in the pre-contrast images (e.g., muscle) is a valid approach that allows for intra-individual comparison. Relative SNR is limited for comparison across a group of subjects. Further, the use of two acquisitions (e.g., pre-contrast and post-contrast) requires that the gain settings and all other image acquisition parameters be identical.
5. Quantitative metrics of spatial resolution - objective metrics of spatial resolution are not commonly used in practice. Nonetheless, signal profiles across relevant anatomy such as vessels can be very useful for evaluating sharpness, particularly at early stages of development. Ultimately, qualitative assessment of spatial resolution and sharpness by the radiologist is needed.
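Two of the noise measurements mentioned in the list above can be sketched as follows. The first applies the Rayleigh correction factor to background noise measured in a magnitude image, and assumes a single-channel magnitude reconstruction with spatially uniform noise, which does not hold with parallel imaging or multi-channel combination. The second is a common form of the two-acquisition difference method and is not necessarily the exact procedure of the cited reference. All function names and simulated values are hypothetical.

```python
import math
import random
import statistics

# Ratio of the SD of Rayleigh-distributed background magnitude noise to the
# SD of the underlying Gaussian noise: sqrt(2 - pi/2), approximately 0.655.
RAYLEIGH_SD_FACTOR = math.sqrt(2 - math.pi / 2)


def noise_sd_from_background(background_roi):
    """Estimate the true noise SD from a signal-free magnitude-image ROI.

    Assumes single-channel magnitude data with spatially uniform noise.
    """
    return statistics.stdev(background_roi) / RAYLEIGH_SD_FACTOR


def snr_two_acquisitions(roi_1, roi_2):
    """Difference method using the same ROI in two identical acquisitions."""
    mean_signal = statistics.fmean([(a + b) / 2 for a, b in zip(roi_1, roi_2)])
    diff_sd = statistics.stdev([a - b for a, b in zip(roi_1, roi_2)])
    # The difference image has sqrt(2) times the noise SD of a single image.
    return math.sqrt(2) * mean_signal / diff_sd


# Synthetic check with hypothetical values: noise SD 5, signal 100 (SNR 20).
random.seed(0)
TRUE_SD, TRUE_SIGNAL, N = 5.0, 100.0, 20000
background = [
    math.hypot(random.gauss(0, TRUE_SD), random.gauss(0, TRUE_SD))
    for _ in range(N)
]
# At this SNR the magnitude noise in tissue is close to Gaussian.
scan_1 = [TRUE_SIGNAL + random.gauss(0, TRUE_SD) for _ in range(N)]
scan_2 = [TRUE_SIGNAL + random.gauss(0, TRUE_SD) for _ in range(N)]

print(f"uncorrected background SD: {statistics.stdev(background):.2f}")
print(f"corrected noise SD:        {noise_sd_from_background(background):.2f}")
print(f"difference-method SNR:     {snr_two_acquisitions(scan_1, scan_2):.1f}")
```

In this simulation the uncorrected background SD underestimates the true noise by roughly a third, which would inflate any SNR computed from it. The difference method avoids the background region altogether, but it depends on the two acquisitions being perfectly registered, which is the practical limitation noted above.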
Subjective Evaluation
Early in the evaluation of new imaging techniques, subjective evaluations are commonly made. Through experience, we can tell very quickly which image is better. It is important to attempt to quantify these improvements as objectively as possible and, when possible, in a blinded manner. This is typically done with ordinal scales. Although there are no standardized ordinal scales, the following are some useful examples.
An ordinal scale (e.g., 0-3) allows for comparison between two different acquisitions using statistical tests for ordinal data (e.g., the Wilcoxon rank-sum test). It is important to have a reference (usually current standard of care imaging) for direct comparison.
1. Overall image quality - how do you like the images? This is typically a summary of the overall performance. E.g., 3= excellent image quality, 2= good image quality, 1= barely acceptable image quality, 0= non-diagnostic.
2. Presence and severity of artifacts - artifacts are an inherent part of MRI. Are artifacts present, and if so, do they interfere with potential diagnosis or the anatomy of interest? E.g., 3= minimal or no artifacts present, 2= some artifacts present, does not interfere with relevant clinical question, 1= moderate to severe artifacts, interferes somewhat with relevant clinical question, 0= non-diagnostic.
3. Visualization of the specific anatomy or clinical question - one example is the conspicuity or sharpness of renal arteries: 3= sharp vessels, well visualized, 2= good visualization, minimal blurring, 1= arteries visualized, considerable blurring, 0= very blurry, non-diagnostic. A second example is the quality of fat suppression with a new fat suppression technique: 3= excellent fat suppression, 2= very good fat suppression, some areas of inhomogeneity, 1= poor fat suppression but still diagnostic, 0= failed fat suppression.
4. Subjective SNR - are the images noisy, and if so, does this interfere with the clinical question? E.g., 3= excellent subjective SNR, minimal image noise, 2= some image noise, acceptable, 1= very noisy, but contains diagnostic information, 0= non-diagnostic.
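Scores on ordinal scales like those above can be compared with the rank-sum test mentioned earlier. The sketch below is a plain-Python version that uses mid-ranks for ties and a normal approximation without continuity correction, so it is only a rough guide for small samples. The function name and the reader scores are hypothetical. When the same subjects are scored with both techniques, the paired counterpart (the Wilcoxon signed-rank test) is usually the more appropriate choice.

```python
import math
from collections import Counter


def rank_sum_test(a, b):
    """Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (rank sum of group a, z statistic, two-sided p-value).
    """
    combined = sorted(a + b)
    n1, n2, n = len(a), len(b), len(a) + len(b)

    # Assign each distinct score the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j < n and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w = sum(ranks[x] for x in a)
    mean_w = n1 * (n + 1) / 2
    tie_term = sum(t**3 - t for t in Counter(combined).values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:  # every score identical: no evidence of a difference
        return w, 0.0, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return w, z, p


# Hypothetical overall image quality scores (0-3) from one blinded reader.
new_method = [3, 3, 2, 3, 2, 3, 3, 2]
standard = [2, 1, 2, 1, 2, 2, 1, 3]

w, z, p = rank_sum_test(new_method, standard)
print(f"rank sum = {w}, z = {z:.2f}, p = {p:.3f}")
```

For these made-up scores the rank sum of the new method is 88.5 against an expected 68 under the null hypothesis, giving a two-sided p-value of about 0.02.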
Diagnostic Performance
Perhaps the most important metric, and the one most accepted by clinicians, is the diagnostic performance of a new technique in detecting the presence of disease. This requires a reference standard, such as biopsy, another clinical diagnostic test (e.g., blood pressure), clinical diagnosis (e.g., hypertension), standard of care imaging (e.g., CT), or future clinical outcome. For example, surgical/pathological diagnosis of appendicitis and/or CT of the abdomen and pelvis may be acceptable reference standards when testing new methods for contrast enhanced MRI in patients with right lower quadrant pain. Ideally, evaluation of a new technique will take place as part of a prospective study, although it is sometimes possible to do retrospective studies to evaluate diagnostic performance.
Diagnostic performance is assessed using the sensitivity, specificity, and their composite metric, accuracy. Sensitivity and specificity are independent of disease prevalence in the population and are the most widely accepted metrics of test performance; accuracy, as a prevalence-weighted combination of the two, is not. Further, in certain clinical settings, the ability to screen for the presence or absence of disease can also be measured by the negative predictive value (NPV) or positive predictive value (PPV). These are particularly helpful in patient populations where the prevalence of disease is particularly low or high. Their usefulness may also depend on the clinical nature of the disease of interest. For example, the NPV of coronary CTA for the diagnosis of clinically relevant coronary artery disease is very high, approaching 100%. This makes coronary CTA an ideal test in the emergency setting, because patients with a negative CTA can be safely discharged without concern for a subsequent major adverse cardiac event (MACE). A detailed description of metrics of diagnostic performance is standard in the early chapters of clinical research textbooks and will not be discussed further.
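For readers who want the definitions at hand, the sketch below computes these metrics from a 2x2 table of test result versus reference standard, and then shows how PPV and NPV shift with prevalence for a fixed sensitivity and specificity. The function names and the counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Metrics from a 2x2 table of test result versus reference standard."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by Bayes' rule at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical study: 100 patients with disease and 200 without.
m = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
print({name: round(value, 3) for name, value in m.items()})

for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(m["sensitivity"], m["specificity"], prevalence)
    print(f"prevalence {prevalence:.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With sensitivity and specificity both at 0.90, the NPV exceeds 0.99 at 1% prevalence while the PPV falls below 0.10, which illustrates why a test can be excellent for ruling out disease in a low-prevalence population and still generate mostly false positives.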
Effectiveness and clinical decision-making
Beyond the diagnostic performance of a test, that is, its ability to diagnose disease, what ultimately matters is its ability to change clinical outcomes. It is one thing to have a test that accurately diagnoses disease, but does this ultimately change clinical management? We will not discuss this in detail here, but a few examples are warranted. First, evaluating the effectiveness of a test is an important alternative to evaluating its diagnostic performance when a reference standard is not available or practical. Test effectiveness can be established by studying patient outcomes such as disease recurrence, morbidity, and mortality (5). For example, Schiebler et al. recently evaluated the safety and test effectiveness of pulmonary MRA performed for the evaluation of pulmonary embolus. The rate of subsequent venous thromboembolic events (VTE) in the year following a negative pulmonary MRA was exceedingly low (6).
Another important
metric of the utility of an imaging sequence is its ability to impact clinical
decision making. This is a very tangible and realistic approach to evaluating
the effectiveness of a test, and does not require a significant amount of follow-up.
Simply, does the new information provided by the imaging study change the
decision to treat the patient? For example, it is well-known that in
patients with right lower quadrant pain with suspected appendicitis,
cross-sectional imaging with ultrasound, CT and increasingly MRI alters the
surgeon’s decision to operate on a patient. These imaging techniques have greatly reduced both the negative laparotomy rate and the complications in those patients with appendicitis who did not proceed with surgery.
Ideally, the impact of a new imaging technology should be measured by its long-term outcomes in patients, including morbidity, mortality, and quality of life. Studies of such outcomes constitute the highest level of evidence, but they are exceedingly difficult to perform, often impractical, and sometimes impossible. There are many steps in the diagnostic and therapeutic pathway that are beyond the control of the imaging specialist. Historically, excellent diagnostic performance (sensitivity, specificity) has been sufficient to justify the use of a new imaging study. In the future, however, the value added and the impact of imaging on outcomes will be increasingly important.
Acceptance of new imaging technologies and practical matters
There are other important factors that must be considered in the acceptance of a new imaging technology, particularly workflow and the complexity of the technique. Three groups of stakeholders are affected: 1. MRI technologists, 2. interpreting radiologists, and 3. referring physicians.
Technologists are critical in the acquisition chain, and a new imaging technique should ideally be simple to use and within reach of technologists with different degrees of skill and experience. Techniques that are difficult to use or easy to “break” are doomed to fail in the real world. The workflow should be relatively “bullet proof”, even for the inexperienced user.
Image interpretation must also be straightforward and rapid. If significant post-processing is required, particularly on an independent workstation, it is unlikely the method will be accepted by a busy radiologist, except in the most compelling cases. For example, MR spectroscopy (MRS) is widely regarded as the reference standard for quantification of proton density fat fraction (PDFF) as a metric of liver fat content. Even though the acquisition is simple and only requires a 20 second breath hold (7), considerable post-processing is required. In contrast, quantitative chemical shift encoded MRI (CSE-MRI) techniques provide quantitative PDFF maps that are easily visualized and analyzed using a few regions of interest (ROI) on the PACS. This is a major reason why CSE-MRI techniques for quantifying liver fat have gained widespread acceptance (8).
Finally, there must be acceptance by referring physicians, who increasingly have access to PACS in their offices. There must be some form of work product available on PACS for the referring physician to review. Presenting images in an easy to comprehend display is critical, and is one of the major reasons why referring clinicians, particularly surgeons, continue to order CT scans when they are fully aware that MR offers superior diagnostic performance with no ionizing radiation.
An important example is the emergence of advanced 4D flow MRI techniques in clinical care. 4D flow MRI generates an enormous amount of data. It is critical, but challenging, to provide a succinct means of conveying the main clinical finding. Complex difference MR angiograms, while helpful and easy to display on PACS, do not contain the rich flow information of a 4D flow MRI acquisition. For this reason, loading summary DICOM images depicting streamlines and particle tracers directly onto the PACS is a strategy that has proven highly effective for summarizing the relevant anatomy and pathology. This practice provides a summary of the 4D flow images that is helpful for surgeons, and also for showing to their patients.
Workflow
considerations are extremely important and without solutions to practical
barriers, new imaging techniques will have difficulty gaining traction and
ultimately acceptance in clinical practice.
Clinical Partner
Effective development and translation
of new imaging techniques into clinical care requires partnership with clinical
imaging experts. By developing partnerships with motivated imaging physicians, you will gain clinical insight and motivation, and find a partner who will assist you in the validation and translation of new innovations into clinical care. Ultimately, it takes a team of individuals with both technical and clinical expertise to bring new imaging innovations to bear on the lives of our patients.
Summary
The development, validation, and translation of advanced new imaging methods form an exciting and important area of scientific development and clinical medicine. Standardized approaches and objective measures for new imaging technologies, such as SNR and CNR, together with subjective ordinal metrics, are extremely helpful, particularly in the early stages of technical development and translation. Subsequent studies comparing new imaging techniques with accepted reference standards are the next step, establishing the diagnostic performance of a technique for the detection and staging of disease. Ultimately, clinical effectiveness and patient outcomes are the most important metrics of the impact of new technologies, although these are generally beyond the scope of most imaging scientists. Finally, there are many practical barriers, including workflow and post-processing, that must be addressed to garner acceptance by technologists, radiologists, and referring physicians, in order for an advanced imaging technique to gain traction.
References
1. RM Henkelman. “Measurement of Signal Intensities in the Presence of Noise in MR Images”. Medical Physics, 1985; 12(2): 232-233.
2. O Dietrich et al. “Influence of Multi-channel Combination, Parallel Imaging, and Other Reconstruction Techniques on MRI Noise Characteristics”. MRI, 2008; 26: 754-762.
3. SB Reeder et al. “Practical Approaches to the Evaluation of Signal to Noise Ratio Performance with Parallel Imaging: Application with Cardiac Imaging and a 32-channel Cardiac Coil”. MRM, 2005; 54: 748-754.
4. P Kellman, ER McVeigh. “Image reconstruction in SNR units: a general method for SNR measurement”. MRM, 2005; 54(6): 1439-1447.
5. PS Douglas et al. “Outcomes Research and Cardiovascular Imaging: Report of a Workshop Sponsored by the National Heart, Lung, and Blood Institute”. Circ Cardiovasc Imaging, 2009; 2: 339-348.
6. Schiebler et al. “Effectiveness of MR angiography for the primary diagnosis of acute pulmonary embolism: clinical outcomes at three months and one year”. JMRI, 2013; 38(4): 914-925.
7. G Hamilton et al. “In Vivo Characterization of the Liver Fat 1H MR Spectrum”. NMR Biomed, 2011; 24(7): 784-790.
8. SB Reeder et al. “Quantitative Assessment of Liver Fat with Magnetic Resonance Imaging and Spectroscopy”. JMRI, 2011; 34(4): 729-749.