0239

NeuroLibre: Living MRI preprints with built-in support for code review
Agah Karakuzu1, Elizabeth DuPre2, Patrick Bermudez3, Mathieu Boudreau1, Rachel Harding4, Jean-Baptiste Poline3, Samir Das3, Pierre Bellec5, and Nikola Stikov1
1NeuroPoly Lab, Polytechnique Montreal, Montreal, QC, Canada, 2Department of Psychology, Stanford University, San Francisco, CA, United States, 3The Neuro, McGill Centre for Integrative Neurosciences MCIN, Montreal, QC, Canada, 4Structural Genomics Consortium, University of Toronto, Toronto, ON, Canada, 5CRIUGM, University of Montreal, Montreal, QC, Canada

Synopsis

Keywords: Software Tools

Motivation: The ISMRM community is swiftly adopting data sharing and code review. While the advantages are clear, challenges persist in ensuring the quality and functionality of these shared resources.

Goal(s): To establish a platform for simplifying technical (or code) reviews and generating open-source living preprints with interactive data apps (e.g., dashboards).

Approach: We created NeuroLibre.org, offering dedicated cloud resources for hosting living preprints that combine narrative and executable content.

Results: NeuroLibre has published 8 living preprints, covering a variety of MRI applications. Each preprint is registered as citable and online-executable content with DOI links to archived reproducibility objects (code, runtime, data).

Impact: Our living preprints showcase how NeuroLibre helps reviewers interactively assess the quality and functionality of reproducibility objects effortlessly, bolstering the reproducibility of MRI publications. The ISMRM 2020 reproducibility challenge is our flagship example: https://doi.org/10.55458/neurolibre.00014

Introduction

While the inclusion of a code/data availability section in static PDFs is encouraged, it can be challenging for readers to effectively use these resources in practice. Difficulties include incomplete documentation, dependency conflicts, missing reproducibility objects, or a version mismatch between the code that has been shared and the code used in the article.

To overcome these challenges, we present NeuroLibre (https://neurolibre.org), a living preprint platform to boost the reproducibility of MRI publications. It streamlines the process of creating reproducibility objects by providing authors with easy-to-adapt repository templates and concise user documentation. These objects include i) a collection of Jupyter Notebooks1, ii) a data requirement file, iii) reproducible runtime environment descriptions, iv) a high-level summary, and v) content layout configuration.
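As a rough illustration of how the five reproducibility objects listed above could be checked before submission, the following minimal sketch looks for one file pattern per object. The paths in `REQUIRED_OBJECTS` and the function name `missing_objects` are assumptions made for this example; the actual layout is the one described in the NeuroLibre repository templates and documentation.

```python
from pathlib import Path

# Hypothetical file patterns, one per reproducibility object (i to v).
# These are illustrative assumptions, not a specification of the
# layout that NeuroLibre requires.
REQUIRED_OBJECTS = {
    "notebooks": "content/*.ipynb",
    "data requirement": "binder/data_requirement.json",
    "runtime description": "binder/requirements.txt",
    "summary": "paper.md",
    "layout configuration": "content/_toc.yml",
}


def missing_objects(repo_root: Path) -> list[str]:
    """Return the names of the objects that are not found under repo_root."""
    missing = []
    for name, pattern in REQUIRED_OBJECTS.items():
        # Path.glob yields nothing when no file matches the pattern.
        if not any(repo_root.glob(pattern)):
            missing.append(name)
    return missing
```

Calling `missing_objects(Path("."))` from the root of a cloned repository would return an empty list when every pattern is matched, and the names of the absent objects otherwise.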

When provided with the URL of a compatible repository, NeuroLibre creates a living preprint2 by downloading the necessary data and re-executing each notebook in the specified runtime environment. Our technical screening process first verifies that the living preprint and respective reproducibility objects can be generated, then proceeds to content registration for publishing a discoverable and citable preprint. In this abstract, we present recent technical advancements made in NeuroLibre and discuss how the concept of living preprints can promote code reviews, drawing on examples from our current repository of living preprints3,4.
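The order of operations described above (generate first, register only on success) can be sketched as a small gate. This is a simplified model and not NeuroLibre's implementation: `ScreeningReport`, `screen`, and the step names are hypothetical, and each step is a placeholder callable standing in for the real data download, notebook execution, or build.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScreeningReport:
    """Outcome of a technical screening run (hypothetical structure)."""
    repository_url: str
    passed: list[str] = field(default_factory=list)
    failed_step: str = ""

    @property
    def ready_for_registration(self) -> bool:
        # Content registration is only attempted when no step failed.
        return self.failed_step == ""


def screen(repository_url: str,
           steps: list[tuple[str, Callable[[str], bool]]]) -> ScreeningReport:
    """Run the named steps in order and stop at the first failure."""
    report = ScreeningReport(repository_url)
    for name, step in steps:
        if not step(repository_url):
            report.failed_step = name
            break
        report.passed.append(name)
    return report
```

A caller would pass steps such as `("download data", fetch)`, `("execute notebooks", run)`, and `("build preprint", build)`, where `fetch`, `run`, and `build` are functions it supplies, and would proceed to registration only if `ready_for_registration` is true.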

Methods

The NeuroLibre cloud infrastructure (Fig.1) is powered by the Digital Research Alliance of Canada (bit.ly/nlprev) and bare-metal servers donated by Cancer Computer (bit.ly/nlprep). BinderHub5 is deployed on both clouds to create and serve reproducible runtime environments, as well as to automatically build the Jupyter Book6 and store the required data on our servers.

The Flask framework was used to create the publication workflow API endpoints that adhere to OpenAPI specifications and include a Swagger UI for documentation (Fig.1). Gunicorn was employed as the web server gateway interface, while Nginx served as a reverse proxy to route incoming traffic to the appropriate server or endpoint. Nginx was also configured to handle SSL termination, load balancing, and serving of static files. For high throughput, static web pages are served through Cloudflare's content delivery network.
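To make the role of the web server gateway interface concrete, the sketch below defines a single JSON endpoint as a plain WSGI callable. It deliberately uses the standard library instead of Flask so that it has no dependencies; a Flask application object exposes the same callable interface to Gunicorn. The `/api/status` route and its payload are hypothetical and are not part of the NeuroLibre API.

```python
import json


def app(environ, start_response):
    """Minimal WSGI application with one hypothetical JSON endpoint."""
    is_status_request = (
        environ.get("REQUEST_METHOD") == "GET"
        and environ.get("PATH_INFO") == "/api/status"
    )
    if is_status_request:
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}

    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    # The WSGI server (e.g., Gunicorn) supplies start_response and
    # streams the returned iterable of bytes back to the client.
    start_response(status, headers)
    return [body]
```

Assuming the code is saved as `sketch.py`, Gunicorn could serve it with `gunicorn sketch:app`, with a reverse proxy such as Nginx placed in front to handle SSL termination and static files.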

The API endpoints of the full-stack preprint server were enhanced with functionality to archive the reproducibility objects on Zenodo and to associate their DOIs with the submission metadata of the living preprint. Web applications were then hosted on Heroku using the OpenJournals editorial bot and journal management repositories, while NewRelic7 was used to provide centralized monitoring and performance analysis of all components of the NeuroLibre ecosystem.
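The association between archived reproducibility objects and the submission metadata can be pictured as a small record that maps each object kind to its archive DOI. This is a sketch under stated assumptions: the `Submission` class and its methods are hypothetical, the object kinds follow the "code, runtime, data" grouping used in this abstract, and no Zenodo request is made.

```python
from dataclasses import dataclass, field

# Kinds of archived reproducibility objects, as grouped in this abstract.
OBJECT_KINDS = ("code", "runtime", "data")


@dataclass
class Submission:
    """Hypothetical submission metadata for one living preprint."""
    preprint_doi: str
    archive_dois: dict[str, str] = field(default_factory=dict)

    def attach(self, kind: str, doi: str) -> None:
        """Record the archive DOI of one reproducibility object."""
        if kind not in OBJECT_KINDS:
            raise ValueError(f"unknown reproducibility object: {kind}")
        self.archive_dois[kind] = doi

    def unarchived(self) -> list[str]:
        """Return the object kinds that have no archive DOI yet."""
        return [k for k in OBJECT_KINDS if k not in self.archive_dois]
```

With such a record, publication could be held back until `unarchived()` returns an empty list, so that every object served online has a persistent counterpart.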

Results

We have successfully implemented a reproducibility-focused publication workflow that ensures all necessary assets are properly indexed, archived on long-term preservation platforms, and tested. This guarantees that the published preprint reflects the same assets and can be reliably reproduced by any reader with a few clicks through the dedicated server hosting. We have also demonstrated that tech-savvy authors can use the proposed workflow to publish preprints in a multi-page8 (Fig.2) or a single-page9 layout (Fig.3).

Discussion

The challenge in code review arises when asking reviewers to run the code on their local computers. This is because configuring the same runtime environment on various operating systems or even on the same operating system at different time points can be impractical. NeuroLibre addresses this problem with a standard repository structure (see https://docs.neurolibre.org) that is translated into containerized environments where all the data and software dependencies are met. The final component of the code review process is a platform where reviewers can verify, debug, and propose enhancements to the code. NeuroLibre's adaptation of the OpenJournals editorial manager is well-suited to fulfill these needs. Originally designed for conducting public technical screenings, it can also be modified to support private and confidential code reviews.

Current limitations include the lack of support for proprietary runtimes such as MATLAB and for GPU hardware. Moreover, the prerequisite for organizing revision materials in Jupyter Book or MyST markdown formats may not align with the preferences of all researchers. Ongoing initiatives are focused on bridging these gaps by offering the missing resources and enabling non-Jupyter interfaces for code reviews. Future endeavors may involve deploying additional data applications to NeuroLibre to cater to the broader needs of MRI-related articles, such as the MRD viewer10.

Conclusion

NeuroLibre is poised to redefine article reproducibility by enabling real-time exploration of research results through a user-friendly web interface. Our expanding list of peer-reviewed journal collaborations (bit.ly/nlcolab) demonstrates how this approach can support existing publishers. Powered by open-source tools on its dedicated cloud platform, NeuroLibre's approach ensures researchers can easily create and publish living preprints, while receiving the credit they deserve.

Acknowledgements

This work is sponsored by the Canadian Open Neuroscience Platform (CONP), Quebec Bio-imaging Network (QBIN), Brain Canada, TransMedTech Institute, and Unifying Neuroscience and Artificial Intelligence - Québec (UNIQUE).

References

1. Perez, F. & Granger, B. E. Project Jupyter: Computational narratives as the engine of collaborative data science. http://archive.ipython.org/JupyterGrantNarrative-2015.pdf.

2. DuPre, E. et al. Beyond advertising: New infrastructures for publishing integrated research objects. PLoS Comput. Biol. 18, e1009651 (2022).

3. Salo, T. et al. NiMARE: Neuroimaging meta-analysis research environment. NeuroLibre Reproducible Preprints 1, 7 (2022).

4. Mancini, M. et al. An interactive meta-analysis of MRI biomarkers of myelin. NeuroLibre Reproducible Preprints 1, 4 (2022).

5. Project Jupyter et al. Binder 2.0 - Reproducible, interactive, sharable environments for science at scale. Proceedings of the Python in Science Conference. Preprint at https://doi.org/10.25080/majora-4af1f417-011 (2018).

6. Perez, F. & Holdgraf, C. Scientific Communication and Reproducible Publishing in the Jupyter Ecosystem and Beyond. in vol. 2021 U33A–03 (2021).

7. Ahmed, T. M., Bezemer, C.-P., Chen, T.-H., Hassan, A. E. & Shang, W. Studying the Effectiveness of Application Performance Management (APM) Tools for Detecting Performance Regressions for Web Applications: An Experience Report. in 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) 1–12 (2016).

8. Boudreau, M. et al. Quantitative T1 MRI. NeuroLibre Reproducible Preprints 14, (2023).

9. Boudreau, M., Karakuzu, A. et al. Results of the ISMRM 2020 joint Reproducible Research & Quantitative MR study groups reproducibility challenge on phantom and human brain T1 mapping. NeuroLibre Reproducible Preprints 19, (2023).

10. Dietrich, B.E., ISMRM Raw Data Viewer, Annual Meeting of the International Society for Magnetic Resonance in Medicine, 2855, (2018).

Figures

Figure-1 The NeuroLibre publication workflow is divided into two stages: preview (yellow) and preprint (purple). The technical infrastructure and architecture of NeuroLibre services are designed to align with this two-stage process. Before submitting their work, authors can use robo.neurolibre.org hosted on the preview cloud, which also supports the technical screening process (steps 1-3). Upon successful screening, the reproducibility assets are moved to the preprint cloud (4), integrated (5), archived (6), and published by depositing with Crossref (7-8).

Figure-2 The structure of two living preprints published in the multi-page format. All living preprints are registered with a DOI, making them citable and discoverable online. The preprints also feature publicly available technical screening that is archived, providing proof of correspondence between the archived reproducibility assets and those hosted on the preprint server for online execution. This allows for real-time exploration of the published research and guarantees a persistent archival of the assets as an integral part of the preprint.

Figure-3 The format of a living preprint, when published using a single-page layout that emulates the traditional single-column article templates. Despite the traditional layout, the living preprint incorporates its own dashboard hosted by NeuroLibre’s Platform as a Service (PaaS) instance: rrsg2020.db.neurolibre.org. This allows a panoramic view of the entire dataset for a variety of presentation types. Additionally, interactive figures enable users to explore images based on specific parameters (e.g., inversion times) and delve into individual data points.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)
DOI: https://doi.org/10.58530/2024/0239