3811

Automated Detection of Suboptimal Fat Suppression - A Simulation Study

Sauram Shreyas Vasanawala¹, Ali Bin Syed², and John Mark Pauly³
¹Electrical Engineering, Stanford University, Stanford, CA, United States, ²Radiology, Stanford University, Stanford, CA, United States, ³Stanford University, Stanford, CA, United States

Synopsis

Keywords: Other AI/ML, Artifacts, fat suppression, automated detection, cnn

Motivation: Fat suppression on MR images is not always completely successful. Identifying inadequate fat suppression can be difficult for technologists while they are multitasking. Unidentified low-quality images inhibit radiologists’ ability to diagnose.

Goal(s): This work develops an automatic method to detect inadequate fat suppression in extremity MRI exams.

Approach: Two-point Dixon fat-water image pairs were combined to simulate varying degrees of fat suppression failure and serve as training data for a CNN.

Results: Greater than 85% accuracy was obtained on simulated data, which motivates a future prospective validation study.

Impact: This work may lead to on-scanner software that auto-identifies inadequate fat suppression, prompting repeat scans with more refined shimming or alternative fat suppression methods. This may improve image quality.

Background

Chemically-selective fat suppression is widely used [1]. Real-time identification of inadequate fat suppression can prompt immediate corrective measures, such as shimming or leveraging other methods (two-point Dixon, IDEAL, or STIR). However, technologists often multitask and may miss inadequate fat suppression. Despite extensive work on automated image analysis for segmentation and pathology detection, there are no reports of automated assessment of suboptimal fat suppression. Here we develop an automated fat suppression failure detection method and evaluate its performance on simulated images.

Methods

Overview: Fat-water image pairs from two-point Dixon acquisitions were scaled and combined to simulate varying degrees of fat suppression failure. The resulting images were used to train a CNN to detect fat suppression failure. Accuracy of the network was assessed.

Source Data: With IRB approval and waived consent, lower extremity MRI exams with two-point Dixon fat-water separation [2] were retrospectively identified. Fat and water series from 2D and 3D gradient-echo and spin-echo sequences were reviewed to ensure high-quality separation with no local swaps. Because the edge slices in an image series often lack tissue, the first and last 10% of slices in each series were excluded.
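The edge-slice exclusion can be illustrated with a short sketch. The function name `trim_edge_slices` and the rounding rule are assumptions made for illustration; the source states only the excluded fractions (10% here, 25% in one training variant).

```python
def trim_edge_slices(series, frac=0.10):
    """Drop the first and last `frac` of slices from an ordered image series.

    `series` is any sequence of slices; rounding to the nearest whole
    slice is an assumption, not something specified in the abstract.
    """
    n_drop = int(round(len(series) * frac))
    return series[n_drop:len(series) - n_drop]


# Example: a 20-slice series keeps its 16 central slices at frac=0.10.
central = trim_edge_slices(list(range(20)), frac=0.10)
```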

Data Pre-processing and Label Generation: All images were resized to 128x128 and normalized. A random, compact, smooth scaling function was constructed as a 3D matrix with elements between zero and one: starting from a random entry, a region was grown to a fractional volume of support drawn randomly between 0.1 and 0.4, and a Gaussian filter was then applied. A random cross-sectional slice of this scaling function was used to scale a fat image, which was then added to the corresponding water image. This produced spatially varying fat signal intensity, giving a realistic simulation of fat suppression failure (Fig. 1). A continuous training label quantifying the degree of fat suppression failure was calculated as the integral of the scaled fat slice divided by the integral of the sum of the fat and water images; slices with a label greater than 0.05 were labeled as fat suppression failure.
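The simulation and label described above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation. The function names, the ellipsoidal region growing (taking the voxels nearest a random seed under random axis weights), the smoothing width `sigma`, and the reading of the label denominator as unscaled fat plus water are all assumptions.

```python
import numpy as np


def gaussian_smooth(vol, sigma):
    """Separable Gaussian filter with zero-padded edges.

    Each axis of `vol` must be longer than the kernel (2 * int(3 * sigma) + 1).
    """
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    for axis in range(vol.ndim):
        vol = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, vol
        )
    return vol


def random_scaling_volume(shape, frac_range=(0.1, 0.4), sigma=2.0, rng=None):
    """Random compact, smooth 3D scaling function with values in [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    seed = [rng.integers(0, n) for n in shape]        # random starting entry
    weights = rng.uniform(0.5, 2.0, size=len(shape))  # assumed anisotropy
    grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    dist = np.sqrt(sum(w * (g - s) ** 2 for w, g, s in zip(weights, grids, seed)))
    frac = rng.uniform(*frac_range)                   # fractional volume of support
    support = (dist <= np.quantile(dist, frac)).astype(float)
    return np.clip(gaussian_smooth(support, sigma), 0.0, 1.0)


def simulate_failure(fat, water, scale_slice, threshold=0.05):
    """Return the corrupted image, the continuous label, and the binary label."""
    scaled_fat = scale_slice * fat
    image = water + scaled_fat
    # Assumed reading: scaled fat signal over total (unscaled) fat + water signal.
    label = scaled_fat.sum() / max(fat.sum() + water.sum(), 1e-12)
    return image, label, label > threshold
```

In use, one would draw a random slice index `k` and pass `volume[:, :, k]` as `scale_slice` together with a co-registered fat/water pair of matching in-plane size.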

Neural Network Architecture: The model (Fig. 2) consists of convolutional layers with batch normalization, rectified linear units, dropout regularization, and max pooling, followed by two fully connected linear layers. An optional fully connected terminal layer took the contrast type as an additional input.
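A network of this general shape might look like the following PyTorch sketch. It is not the authors' model. The class name `FatSuppressionNet`, the channel width, the dropout probability, the hidden layer size, and the use of one pooling step per convolutional block are assumptions. The five 3x3 convolutional layers follow the configuration reported in Results, and the optional contrast-type input is omitted. The network returns a logit, so `BCEWithLogitsLoss` stands in for a sigmoid followed by binary cross entropy.

```python
import torch
from torch import nn


class FatSuppressionNet(nn.Module):
    """Hypothetical CNN: conv + batch norm + ReLU + dropout + max pool blocks,
    followed by two fully connected layers that output one logit per image."""

    def __init__(self, depth=5, kernel_size=3, width=16, p_drop=0.2, image_size=128):
        super().__init__()
        blocks, in_ch = [], 1
        for _ in range(depth):
            blocks += [
                nn.Conv2d(in_ch, width, kernel_size, padding=kernel_size // 2),
                nn.BatchNorm2d(width),
                nn.ReLU(),
                nn.Dropout2d(p_drop),
                nn.MaxPool2d(2),  # halves each spatial dimension
            ]
            in_ch = width
        self.features = nn.Sequential(*blocks)
        side = image_size // 2 ** depth  # 128 -> 4 after five poolings
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(width * side * side, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        # x: (batch, 1, H, W) -> (batch,) logits
        return self.classifier(self.features(x)).squeeze(1)
```

Such a model could be trained with `torch.optim.Adam(model.parameters())` and `nn.BCEWithLogitsLoss()`, matching the loss and optimizer named in the abstract.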

Hyperparameter and Architecture Optimization: Binary cross entropy loss and Adam optimization were used. The first convolution layer kernel size varied from 3x3 to 7x7; the network convolutional layer depth varied from 3 to 6. The learning rate, batch size, number of epochs, and range of volume of support for the matrices used to scale the fat slices were also all varied.

Training: Computation was performed in PyTorch. A fat/water series and an image number within it were chosen at random, and each selected pair was augmented tenfold by nonuniformly scaling the fat slice with different matrices to maximize the training data. Rotation transformations were then optionally applied. Models trained with the first/last 25% of images from each series excluded were also assessed.

Testing: Image preparation for the testing dataset was similar to the training dataset. The model was evaluated based on accuracy. To determine whether improved sensitivity/specificity could be obtained, two models trained on different distributions of fat corruption were combined by choosing whichever one was more confident.
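The model-combination rule can be sketched as below, taking "more confident" to mean the predicted probability farther from 0.5. That definition, the 0.5 decision cutoff, and the function names are assumptions; the abstract does not specify how confidence was measured.

```python
import numpy as np


def combine_by_confidence(p_a, p_b):
    """Per image, keep the probability from whichever model is farther from 0.5."""
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    use_a = np.abs(p_a - 0.5) >= np.abs(p_b - 0.5)
    return np.where(use_a, p_a, p_b)


def binary_metrics(prob, truth, cutoff=0.5):
    """Accuracy, sensitivity, and specificity for failure (True) vs. adequate (False).

    Assumes `truth` contains at least one positive and one negative case.
    """
    pred = np.asarray(prob) >= cutoff
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)
    tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    return {
        "accuracy": (tp + tn) / truth.size,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy probabilities from a hypothetical sensitive model (a) and specific model (b).
combined = combine_by_confidence([0.9, 0.6, 0.2], [0.4, 0.1, 0.45])
metrics = binary_metrics(combined, [True, False, False])
```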

Results

Hyperparameter optimization resulted in five convolutional layers with 3x3 kernels, 600 epochs, and a batch size of 50. Larger convolution kernels and deeper networks resulted in overfitting. Adding a decrementing learning rate, intended to help the training curve converge, yielded no significant improvement. Testing accuracy improved significantly when the first/last 25% of images from each series were removed from the training (but not testing) data. Leveraging contrast type did not help the network classify images (Fig. 3). Rotations applied to the slices did not improve testing accuracy. Most errors in the better-performing models came from images near the label cutoff, i.e., with labels near 0.05. Interestingly, most of the models struggled to identify extreme cases of fat suppression failure. The best models reached an accuracy of 86% and were both sensitive and specific. Generating the testing data with a scaling-function volume of support from 0.1 to 0.4 also improved results compared with 0.1 to 0.3 or 0.2 to 0.4. Combining sensitive and specific models improved accuracy by 3% (Fig. 4).

Implications

We demonstrate automated detection of inadequate fat suppression on simulated images with a clinically useful level of accuracy. If implemented, this could prompt technologists in real time to take corrective actions, improving overall exam quality.

Acknowledgements

Cagan Alkan and Rachel Kaci provided helpful advice.

References

[1] Delfaut, E M et al. “Fat suppression in MR imaging: techniques and pitfalls.” Radiographics vol. 19,2 (1999): 373-82. doi:10.1148/radiographics.19.2.g99mr03373

[2] Ma, Jingfei. “Dixon techniques for water and fat imaging.” J Magn Reson Imaging vol. 28,3 (2008): 543-58. doi:10.1002/jmri.21492

Figures

Fig. 1. Top left: method of generating data. Top right: method of generating label. Middle two rows: two examples of image and label generation; yellow arrows highlight undesired fat. Bottom row: three representative examples of fat suppression failure. Numbers represent the label that quantifies the degree of fat suppression: greater than 0.05 is inadequate fat suppression.

Fig. 2. Model architecture diagram

Fig. 3. Left: model with contrast type as input. Right: model without contrast type as input.

Fig. 4. Top: sensitive model. Middle: specific model. Bottom: combined model.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)


DOI: https://doi.org/10.58530/2024/3811