4341

Torch-EPG-X: a GPU-powered differentiable framework for the simulation of magnetization exchanging systems
Matteo Cencini1, Alessandra Retico1, and Michela Tosetti2
1INFN, Pisa Division, Pisa, Italy, 2IRCCS Stella Maris, Pisa, Italy

Synopsis

Keywords: Software Tools, Extended Phase Graphs

Motivation: Most existing MR simulators focus either on implementing multiple physical phenomena or on massive parallelization; the two aspects are rarely tackled together.

Goal(s): To provide a feature-rich, massively parallelized and differentiable MR simulator.

Approach: We built on the Extended Phase Graphs formalism to efficiently simulate all the main MR physical phenomena. We used PyTorch as a backend to enable massive parallelization and efficient differentiation.

Results: Our toolbox, demonstrated on a numerical Fast Spin Echo experiment on an exchanging two-pool system, achieved an order-of-magnitude speed-up compared to existing implementations and efficient differentiation with minimal boilerplate.

Impact: Torch-EPG-X will represent a useful tool for synthetic signal generation for deep learning, parameter fitting, model-based reconstruction and sequence optimization.

Introduction

MRI simulation tools are used in a wide range of applications, including but not limited to quantification of MR tissue properties, pulse sequence optimization, synthetic data generation for deep learning, and teaching and education. For these reasons, several open-source MR simulators have been developed in recent years, e.g., EPG-X1, Sycomore2, and mri-sim-py3. However, existing frameworks either simulate several physical effects such as Magnetization Transfer and diffusion (EPG-X, Sycomore) but lack massive parallelization, or achieve high computational efficiency but are limited to basic relaxation effects (mri-sim-py). Here, we introduce Torch-EPG-X, a feature-rich MR simulator (including diffusion, flow and magnetization exchange effects) supporting massive parallelization on GPU and automatic differentiation. The framework will be made publicly available to interested researchers.

Methods

Software design and features: Torch-EPG-X is written in Python 3 to enable portability across different platforms and uses PyTorch 2.1 as a backend to leverage its GPU support and efficient automatic differentiation capability, and to facilitate integration in existing model-based reconstruction4 and deep learning5 frameworks. As a simulation model, we chose the Extended Phase Graphs (EPG) formalism1,6 due to its efficiency and flexibility in representing several physical phenomena. The package is organized in three main subpackages (Figure 1):

  • torchepgx.ops: low-level operators describing the different effects acting on the magnetization (e.g., RF pulse rotation and saturation, relaxation, chemical exchange, diffusion, flow, dephasing).
  • torchepgx.blocks: composite operators representing common sequence events (e.g., adiabatic inversion, T2 preparation, rapid GRE or FSE readouts).
  • torchepgx.model: predefined simulators for several common sequences, each defined as a subclass of an abstract base class (model.base). This provides efficient parallelization and forward automatic differentiation of the generated signals while keeping the codebase concise and easy to read (Figure 2). Predefined sequences include MPRAGE, Multi-Echo MPRAGE (MEMPRAGE), Fast Spin Echo (FSE), T1-T2 Shuffling7, SSFP-MR Fingerprinting8 and bSSFP-MR Fingerprinting9.
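
To make the role of the low-level operators concrete, the sketch below chains three of them (RF rotation, relaxation, dephasing) into a single-pool FSE echo train. It is plain Python written for illustration only: the function names (apply_rf, relax, dephase, fse_echoes) are hypothetical and are not the torchepgx API, exchange, diffusion and flow are omitted, and the operator definitions follow the standard EPG formalism6 with a 90° excitation about y and refocusing about x.

```python
import cmath
import math


def apply_rf(state, alpha, phi):
    """Mix F+, F- and Z of every order with the standard EPG rotation matrix."""
    c2, s2 = math.cos(alpha / 2) ** 2, math.sin(alpha / 2) ** 2
    s, c = math.sin(alpha), math.cos(alpha)
    ep, em = cmath.exp(1j * phi), cmath.exp(-1j * phi)
    rot = [
        [c2, ep * ep * s2, -1j * ep * s],
        [em * em * s2, c2, 1j * em * s],
        [-0.5j * em * s, 0.5j * ep * s, c],
    ]
    fp, fm, z = state
    return [
        [row[0] * a + row[1] * b + row[2] * d for a, b, d in zip(fp, fm, z)]
        for row in rot
    ]


def relax(state, dt, t1, t2):
    """T2 decay of transverse states, T1 decay of Z states, recovery on Z0."""
    e1, e2 = math.exp(-dt / t1), math.exp(-dt / t2)
    fp = [e2 * v for v in state[0]]
    fm = [e2 * v for v in state[1]]
    z = [e1 * v for v in state[2]]
    z[0] += 1.0 - e1
    return [fp, fm, z]


def dephase(state):
    """Unit gradient moment: F+ moves up one order, F- moves down one order."""
    fp, fm, z = state
    fm = fm[1:] + [0j]
    fp = [fm[0].conjugate()] + fp[:-1]
    return [fp, fm, z]


def fse_echoes(refocus_deg, esp, t1, t2):
    """Echo amplitudes (F+ of order 0) for a train of refocusing angles."""
    n_states = len(refocus_deg) + 1  # higher orders cannot return to k = 0
    state = [[0j] * n_states, [0j] * n_states, [0j] * n_states]
    state[2][0] = 1.0 + 0j  # thermal equilibrium
    state = apply_rf(state, math.pi / 2, math.pi / 2)  # 90 degrees about y
    echoes = []
    for angle in refocus_deg:
        state = dephase(relax(state, esp / 2, t1, t2))
        state = apply_rf(state, math.radians(angle), 0.0)  # refocus about x
        state = dephase(relax(state, esp / 2, t1, t2))
        echoes.append(state[0][0])
    return echoes


# Illustrative values (ms): ideal 180-degree train on a single pool.
echoes = fse_echoes([180.0] * 8, esp=10.0, t1=1000.0, t2=100.0)
print([round(abs(e), 4) for e in echoes])
```

With ideal 180° pulses the echoes reduce to pure T2 decay, which gives a simple sanity check; lower refocusing angles populate higher-order and Z states and produce the familiar stimulated-echo contributions.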

The main package features are listed in Figure 3, in comparison with other existing open-source MR simulation toolboxes.

Validation and benchmark: To validate the proposed toolbox, we simulated an FSE sequence acting on a two-pool system and compared the results to a reference MATLAB implementation1 both with and without chemical exchange. To assess the computational efficiency, runtimes were measured when simulating 21 batches of spins of linearly increasing sizes between 1 and 100k and compared to the runtimes obtained with Sycomore (C++) both with and without exchange and with mri-sim-py (PyTorch) for the non-exchanging case only. To enable a fair comparison, Sycomore was used both in a serial fashion (more efficient for small batch sizes) and parallelized using the Python multiprocess module (more efficient for large batch sizes), and mri-sim-py was used both with the CPU and the GPU backends. In addition, automatic differentiation capability was demonstrated by computing the gradient with respect to the refocusing angles of an objective function defined as the Cramér-Rao Lower Bound (CRLB) on T2, as is common in the context of sequence optimization10. As a reference, we used finite-difference differentiation of both signals and objective function. All the experiments were performed on an HPC server equipped with a 96-core CPU, 256 GB of RAM and an NVIDIA A40 GPU with 48 GB VRAM.
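
As a rough illustration of the finite-difference reference and of a CRLB-type objective, the sketch below evaluates the CRLB on T2 from central finite differences of an echo train. Everything in it is an assumption made for the example: a mono-exponential decay stands in for the EPG simulator, the unknowns are taken to be (M0, T2), the noise is i.i.d. Gaussian, and the names (signal, fd_jacobian, crlb_t2) are hypothetical. In the experiment described above the design variables are the refocusing angles, which this stand-in model does not include.

```python
import math


def signal(te_list, m0, t2):
    """Stand-in echo-train model: mono-exponential T2 decay."""
    return [m0 * math.exp(-te / t2) for te in te_list]


def fd_jacobian(te_list, m0, t2, rel_step=1e-5):
    """Central finite-difference derivatives of each echo w.r.t. M0 and T2."""
    h0, h2 = rel_step * m0, rel_step * t2
    up0, dn0 = signal(te_list, m0 + h0, t2), signal(te_list, m0 - h0, t2)
    up2, dn2 = signal(te_list, m0, t2 + h2), signal(te_list, m0, t2 - h2)
    d_m0 = [(u - d) / (2 * h0) for u, d in zip(up0, dn0)]
    d_t2 = [(u - d) / (2 * h2) for u, d in zip(up2, dn2)]
    return d_m0, d_t2


def crlb_t2(d_m0, d_t2, sigma=1.0):
    """Variance bound on T2: (2,2) entry of the inverse 2x2 Fisher matrix."""
    f11 = sum(a * a for a in d_m0)
    f12 = sum(a * b for a, b in zip(d_m0, d_t2))
    f22 = sum(b * b for b in d_t2)
    det = f11 * f22 - f12 * f12
    return sigma ** 2 * f11 / det


# Illustrative values (ms): 16 echoes, 10 ms spacing.
te_list = [10.0 * (n + 1) for n in range(16)]
m0, t2 = 1.0, 100.0

crlb_fd = crlb_t2(*fd_jacobian(te_list, m0, t2))

# Analytical derivatives of the stand-in model, for comparison.
exact_m0 = [math.exp(-te / t2) for te in te_list]
exact_t2 = [m0 * te / t2 ** 2 * math.exp(-te / t2) for te in te_list]
crlb_exact = crlb_t2(exact_m0, exact_t2)

print(crlb_fd, crlb_exact)
```

Each finite-difference derivative costs two extra simulations per parameter and depends on the step size, which is why it serves here as a reference rather than as the working method.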

Results and Discussion

Figure 4 demonstrates that Torch-EPG-X can correctly compute FSE signals for two-pool systems both in the presence and absence of exchange (panel b), providing virtually identical results to the reference MATLAB implementation. Computational efficiency is showcased in Figure 4c,d: thanks to the massive parallelization, our implementation achieves similar runtimes both with and without exchange, despite the increased computational complexity of the exchanging case. These runtimes are comparable to the reference GPU implementation and one order of magnitude faster than the state-of-the-art C++ implementation. Interestingly, our CPU implementation also achieves a massive speed-up (similar to the GPU case), facilitating the use of our toolbox on lower-cost hardware. Finally, Figure 5 demonstrates the automatic differentiation capability of our toolbox: thanks to forward-mode differentiation, computing the gradient of the signals with respect to tissue parameters retains a complexity similar to that of the forward pass, while the gradient of the objective with respect to sequence parameters is efficiently calculated by backpropagation10. Importantly, this efficiency is achieved with minimal boilerplate code, which would enable applications in sequence optimization.
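
The idea behind forward-mode differentiation can be sketched with dual numbers: each quantity carries its value together with its derivative with respect to one chosen input, so the derivative is obtained in the same pass as the signal. The toolbox relies on PyTorch for this; the hand-written Dual class and the mono-exponential echo_train below are hypothetical stand-ins used only to show the mechanism.

```python
import math


class Dual:
    """Number carrying a value and its derivative w.r.t. one chosen input."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __mul__(self, other):
        # Product rule; plain numbers are treated as constants.
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __rtruediv__(self, number):
        # number / self, with `number` a plain float (quotient rule).
        return Dual(number / self.val, -number * self.der / self.val ** 2)


def dual_exp(x):
    """Exponential with chain rule applied to the derivative part."""
    e = math.exp(x.val)
    return Dual(e, e * x.der)


def echo_train(te_list, m0, t2):
    """Mono-exponential stand-in for a simulator, with T2 given as a Dual."""
    return [m0 * dual_exp(-te / t2) for te in te_list]


# Seed dT2/dT2 = 1 so that every output carries d(signal)/dT2.
te_list = [10.0, 20.0, 40.0, 80.0]
out = echo_train(te_list, 1.0, Dual(100.0, 1.0))
signals = [o.val for o in out]
derivs = [o.der for o in out]
print(signals)
print(derivs)
```

One forward pass yields all echoes and their T2 derivatives, which is the regime (few inputs, many outputs) where forward mode is attractive; for a scalar objective depending on many sequence parameters, reverse mode (backpropagation) is the usual choice.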

Conclusion

We believe that Torch-EPG-X will represent a useful tool for synthetic signal generation for deep learning11, parameter fitting12, model-based reconstruction13 and sequence optimization10. Future work will be focused on these applications.

Acknowledgements

This work was partially funded by the INFN-CSN5 PREDATOR project (“Grant Giovani”). Support was also provided by the Italian Ministry of Health under the grant RC 2022 and the “5 per mille” to IRCCS Fondazione Stella Maris.

References

1. Malik SJ, Teixeira RPAG, Hajnal JV. Extended phase graph formalism for systems with magnetization transfer and exchange. Magn Reson Med. 2018;80(2):767-779. doi:10.1002/mrm.27040

2. Lamy J, Loureiro de Sousa P. Sycomore: an MRI simulation toolkit. In: Proc. Intl. Soc. Mag. Reson. Med.; 2020:1038. https://archive.ismrm.org/2020/1038.html

3. Rakshit S, Wang K, Tamir JI. A GPU-accelerated Extended Phase Graph Algorithm for differentiable optimization and learning. In: Proc. Intl. Soc. Mag. Reson. Med.; 2021:3754. https://cds.ismrm.org/protected/21MPresentations/abstracts/3754.html

4. Wang G, Shah N, Zhu K, Noll DC, Fessler JA. MIRTorch: A PyTorch-powered Differentiable Toolbox for Fast Image Reconstruction and Scan Protocol Optimization. In: Proc. Intl. Soc. Mag. Reson. Med.; 2022:4982. Accessed November 7, 2023. https://archive.ismrm.org/2022/4982.html

5. Tamir JI, Yu SX, Lustig M. DeepInPy: Deep Inverse Problems in Python. In: ; 2020.

6. Weigel M. Extended phase graphs: Dephasing, RF pulses, and echoes - pure and simple. J Magn Reson Imaging. 2015;41(2):266-295. doi:10.1002/jmri.24619

7. Tamir JI, Uecker M, Chen W, et al. T2 shuffling: Sharp, multicontrast, volumetric fast spin-echo imaging. Magn Reson Med. 2017;77(1):180-195. doi:10.1002/mrm.26102

8. Jiang Y, Ma D, Seiberlich N, Gulani V, Griswold MA. MR fingerprinting using fast imaging with steady state precession (FISP) with spiral readout. Magn Reson Med. 2015;74(6):1621-1631. doi:10.1002/mrm.25559

9. Ma D, Gulani V, Seiberlich N, et al. Magnetic resonance fingerprinting. Nature. 2013;495(7440):187-192. doi:10.1038/nature11971

10. Lee PK, Watkins LE, Anderson TI, Buonincontri G, Hargreaves BA. Flexible and efficient optimization of quantitative sequences using automatic differentiation of Bloch simulations. Magn Reson Med. 2019;82(4):1438-1451. doi:10.1002/mrm.27832

11. Li S, Kang T, Wu J, et al. Sub-second whole brain T2 mapping via multiband SENSE multiple overlapping-echo detachment imaging and deep learning. Phys Med Biol. 2023;68(19):195027. doi:10.1088/1361-6560/acfb71

12. Nataraj G, Nielsen JF, Scott C, Fessler JA. Dictionary-Free MRI PERK: Parameter Estimation via Regression with Kernels. IEEE Trans Med Imaging. 2018;37(9):2103-2114. doi:10.1109/TMI.2018.2817547

13. Heo J, Song P, Liu W, Weller A. Physics-Based Decoding Improves Magnetic Resonance Fingerprinting. In: Greenspan H, Madabhushi A, Mousavi P, et al., eds. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. Lecture Notes in Computer Science. Springer Nature Switzerland; 2023:446-456. doi:10.1007/978-3-031-43895-0_42

Figures

Figure 1 Main sub-packages of Torch-EPG-X. The sub-packages have different levels of abstraction to enable creation of simulators for different sequences with minimal boilerplate while maintaining high customizability for specific applications. All the packages are based on PyTorch 2.1 to enable GPU offload and efficient forward automatic differentiation of generated signals.


Figure 2 Code snippet for the generation of an SSFP-MR Fingerprinting simulator. Panel a) shows the core simulation engine. By defining the simulator as a subclass of epgtorchx.base.BaseSimulator, parallelization and differentiation with respect to tissue parameters are automatically enabled with minimal boilerplate. Panel b) shows a minimal wrapper to instantiate the simulator, run the simulation and return the signals and corresponding derivatives.


Figure 3 Main features of the Torch-EPG-X package. Our toolbox both includes all the main MR physical phenomena and supports massive parallelization and automatic differentiation. By contrast, most available packages focus on only one of these aspects.


Figure 4 Validation and benchmark of Torch-EPG-X with an FSE simulation assuming a constant refocusing pulse train (a) and acting on a two-pool spin system (T1: 1000/500 ms, T2: 100/20 ms, relative fraction: 0.2). Results (b) were identical to the reference implementation both without and with exchange (non-directional exchange rate: 10 Hz). Our package also demonstrates high computational efficiency on GPU and CPU both without (c) and with exchange (d).


Figure 5 Validation and benchmark of the Torch-EPG-X automatic differentiation capability with an FSE simulation assuming a variable refocusing pulse train (a). Both the signal derivative with respect to T2 (b) and the CRLB objective gradient with respect to refocusing angles (c) showed good consistency with the finite-difference implementation, with some differences due to the numerical instability of the latter. Good computational efficiency was achieved, especially for the gradient of the CRLB objective (d).


Proc. Intl. Soc. Mag. Reson. Med. 32 (2024) 4341
DOI: https://doi.org/10.58530/2024/4341