Despite the substantial increase in research activity in machine learning for MR image reconstruction [1-6], no large-scale raw k-space dataset is publicly available. This makes it difficult to reproduce results and to validate comparisons between approaches, and it restricts work on this problem to researchers associated with large academic medical centers. This abstract introduces the first large-scale database of MRI data for reconstruction. The database currently includes about 7500 raw MRI k-space datasets from a range of MRI systems and clinical patient populations, with corresponding images derived from the rawdata using reference image reconstruction algorithms. Approximately 30,000 additional clinical image datasets not directly associated with the rawdata are also included, and we plan to add to the database over time.
Raw k-space dataset: Fully sampled rawdata of consecutive patients undergoing regular clinical exams of the knee (≈1600), the brain (≈5300) and the liver (≈800) were collected. The study was approved by the IRB. Patients were screened for metallic implants or other safety concerns, following routine safety procedures at our institution. Otherwise, there were no specific exclusion criteria. Scans were performed on five clinical 3T systems (Siemens Magnetom Skyra, Prisma, Trio, Vida and Biograph-mMR) and one clinical 1.5T system (Siemens Magnetom Aera) with clinically used receive coils. We used Cartesian 2D-TSE and GRE protocols that are employed clinically at our institution. Sequence parameters were matched as closely as possible between the different systems. Since our goal was to provide fully sampled k-space data, we disabled all subsampling-based acceleration methods, such as parallel imaging and partial Fourier, for this study. Rawdata were exported from the scanners, anonymized, and converted into the vendor-neutral ISMRMRD format [7]. The dataset includes acquisitions from five protocols:
Example images from reference reconstructions are shown in Figure 1, and detailed lists of the data acquisition parameters are contained in Tables 1-3.
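Because the rawdata are fully sampled, accelerated acquisitions can be simulated retrospectively by discarding phase-encoding lines. The following is a minimal sketch of that idea, not the processing used for this dataset: the array layout (coils, ky, kx), the regular sampling pattern with a fully sampled center block, the root-sum-of-squares coil combination, and all function and parameter names are assumptions made for illustration, and random numbers stand in for measured k-space.

```python
import numpy as np

def undersample_cartesian(kspace, acceleration=4, num_center_lines=8):
    """Zero out phase-encoding lines of fully sampled Cartesian k-space.

    kspace: complex array of shape (coils, ky, kx), DC at the array center
            (assumed layout, not taken from the dataset description).
    Returns the masked k-space and the boolean sampling mask over ky.
    """
    num_ky = kspace.shape[1]
    mask = np.zeros(num_ky, dtype=bool)
    mask[::acceleration] = True                    # keep every n-th line
    start = (num_ky - num_center_lines) // 2
    mask[start:start + num_center_lines] = True    # keep a central block
    return kspace * mask[None, :, None], mask

def rss_reconstruction(kspace):
    """Centered inverse 2D FFT per coil, then root-sum-of-squares combination."""
    shifted = np.fft.ifftshift(kspace, axes=(-2, -1))
    coil_images = np.fft.fftshift(np.fft.ifft2(shifted, axes=(-2, -1)), axes=(-2, -1))
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=0))

# Synthetic stand-in for one fully sampled multi-coil slice.
rng = np.random.default_rng(0)
full_kspace = rng.standard_normal((4, 64, 64)) + 1j * rng.standard_normal((4, 64, 64))

masked_kspace, mask = undersample_cartesian(full_kspace, acceleration=4, num_center_lines=8)
reference = rss_reconstruction(full_kspace)      # fully sampled target image
zero_filled = rss_reconstruction(masked_kspace)  # aliased input for a reconstruction method
```

In a training setup of this kind, `zero_filled` (or `masked_kspace` itself) would serve as the input and `reference` as the target, which is only possible when no acceleration was applied during the acquisition.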
DICOM dataset: In addition to the scanner rawdata, our dataset currently includes image sets from 10,000 knee, 10,000 brain and 10,000 liver scans of consecutive patients undergoing regular clinical exams. These data come from a variety of scanners within our institution and include images from sequences beyond those included in the raw dataset. Reconstructed DICOM images were anonymized using the RSNA Clinical Trial Processor. In addition, we manually inspected each DICOM image (and rawdata file) for the presence of unexpected protected health information (PHI).
We will provide links to the dataset by the time of presentation at the annual meeting.
To our knowledge, this is the first large-scale public dataset of raw k-space data from a clinical patient population. While public datasets do exist for reconstructed images, for example the Human Connectome Project (HCP), the Alzheimer’s Disease Neuroimaging Initiative (ADNI) or the Osteoarthritis Initiative (OAI), they generally target a specific research question, and imaging serves as a tool to answer that question. Our dataset is broader, with the goal of providing a resource to improve image acquisition and reconstruction itself.
The number of cases included as DICOM images is substantially larger than the core k-space data, and this part of the dataset is more heterogeneous, with data coming from a wider range of MR systems and protocols. It is worth noting that a Fourier transform of these images does not directly correspond to the originally measured rawdata. Some of the images were also acquired with acceleration and reconstructed with parallel imaging, which further limits their validity as a fully sampled ground truth. In the context of machine learning for image reconstruction, our motivation for including the DICOM data is to answer the question of whether training on a larger amount of less perfect data outperforms training on a smaller amount of high-quality data in terms of performance and generalization.
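The mismatch between a Fourier-transformed magnitude image and the measured rawdata can be illustrated with a small synthetic experiment. This is a sketch under stated assumptions: the object, its phase, the Gaussian coil sensitivities and the root-sum-of-squares combination are all hypothetical choices for illustration and do not describe how the DICOM images in this dataset were reconstructed.

```python
import numpy as np

def fft2c(image):
    """Centered 2D FFT over the last two axes."""
    shifted = np.fft.ifftshift(image, axes=(-2, -1))
    return np.fft.fftshift(np.fft.fft2(shifted, axes=(-2, -1)), axes=(-2, -1))

def ifft2c(kspace):
    """Centered inverse 2D FFT over the last two axes."""
    shifted = np.fft.ifftshift(kspace, axes=(-2, -1))
    return np.fft.fftshift(np.fft.ifft2(shifted, axes=(-2, -1)), axes=(-2, -1))

rng = np.random.default_rng(0)
ny, nx = 32, 32

# Hypothetical complex-valued object: magnitude plus spatially varying phase.
obj = rng.random((ny, nx)) * np.exp(1j * rng.uniform(-np.pi, np.pi, (ny, nx)))

# Hypothetical smooth coil sensitivities centered on the four image corners.
y, x = np.mgrid[0:ny, 0:nx]
corners = [(0, 0), (0, nx - 1), (ny - 1, 0), (ny - 1, nx - 1)]
sens = np.stack([np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * (ny / 2) ** 2))
                 for cy, cx in corners])

# "Measured" multi-coil k-space, shape (coils, ky, kx).
measured_kspace = fft2c(sens * obj)

# Stored image: real-valued magnitude after coil combination; phase and
# per-coil information are discarded at this step.
stored_image = np.sqrt(np.sum(np.abs(ifft2c(measured_kspace)) ** 2, axis=0))

# k-space synthesized from the stored image, compared with one measured coil.
synthetic_kspace = fft2c(stored_image)
relative_error = (np.linalg.norm(synthetic_kspace - measured_kspace[0])
                  / np.linalg.norm(measured_kspace[0]))
print(f"relative k-space error vs. coil 0: {relative_error:.2f}")
```

The synthesized k-space is single-channel and derived from a real-valued image, so it cannot reproduce the multi-coil, complex-valued measurement, even before vendor-specific filtering or interpolation is taken into account.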
We hope that the availability of this dataset can further accelerate research in MR image reconstruction, much as computer vision was supercharged by well-curated large-scale datasets like ImageNet [8]. In particular, we hope that this dataset can serve as a benchmark during training and validation of developments in image reconstruction.
[1] K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll, “Learning a Variational Network for Reconstruction of Accelerated MRI Data,” Magn. Reson. Med., 79:3055–3071 (2018).
[2] S. Wang, Z. Su, L. Ying, X. Peng, S. Zhu, F. Liang, D. Feng, and D. Liang, “Accelerating Magnetic Resonance Imaging Via Deep Learning,” in IEEE International Symposium on Biomedical Imaging (ISBI), 514–517 (2016).
[3] J. Schlemper, J. Caballero, J. V. Hajnal, A. Price, and D. Rueckert, “A Deep Cascade of Convolutional Neural Networks for MR Image Reconstruction,” in Information Processing in Medical Imaging, 647–658 (2017).
[4] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, “Image reconstruction by domain-transform manifold learning,” Nature, 555: 487–492 (2018).
[5] M. Mardani, E. Gong, J. Y. Cheng, S. S. Vasanawala, G. Zaharchuk, L. Xing, and J. M. Pauly, “Deep Generative Adversarial Neural Networks for Compressive Sensing (GANCS) MRI,” IEEE Transactions on Medical Imaging, in press (2018), doi: 10.1109/TMI.2018.285875.
[6] F. Chen, V. Taviani, I. Malkiel, J. Y. Cheng, J. I. Tamir, J. Shaikh, S. T. Chang, C. J. Hardy, J. M. Pauly, and S. S. Vasanawala, “Variable-Density Single-Shot Fast Spin-Echo MRI with Deep Learning Reconstruction by Using Variational Networks,” Radiology, 289: 366–373 (2018).
[7] S. J. Inati, J. D. Naegele, N. R. Zwart, V. Roopchansingh, M. J. Lizak, D. C. Hansen, C. Y. Liu, D. Atkinson, P. Kellman, S. Kozerke, H. Xue, A. E. Campbell-Washburn, T. S. Sørensen, and M. S. Hansen, “ISMRM Raw data format: A proposed standard for MRI raw datasets,” Magnetic Resonance in Medicine, 77: 411–421 (2016).
[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A Large-Scale Hierarchical Image Database,” in IEEE Computer Vision and Pattern Recognition (CVPR) (2009).