Making Quality Software
William Overall¹

¹HeartVista, Inc., United States

Synopsis

Open sharing of software tools, machine learning models, and data can provide significant benefits to the MRI community. This session introduces the core concepts of open software development, including software life cycles, collaborative development tools, and standard practices for ensuring software quality.

Software in Medical Research: Typical Life Cycle

Typical research software projects follow a familiar life cycle (Figure 1). A researcher or team identifies a need for software tools to explore a particular research interest. A tailored solution is then developed internally. Later, the developers post their code online for others to use. In many cases, however, a critical mass of community support doesn't materialize. If key developers move on from the originating institution, the project may stagnate.

This unfortunate reality leads to substantial duplicated effort, with various sites curating redundant tools to generate the same standard pulse sequences, reconstructions, and visualizations.

In the ideal case, an open software project builds an engaged user community that can create a self-sustaining cycle of regular improvements. To facilitate this community-building, software should be created with standard tools, quality standards, and documentation.

Relatedly, the Reproducible Research initiative [1] aims to incentivize the publication of software code and raw data as part of the scientific publication process. Many of the principles of open-source software distribution are applicable to data sharing and the sharing of machine-learning models.

Existing Open-Source Initiatives in MRI Software

By far, the easiest way to contribute software that is of value to the greater MRI community is through adding to an existing project. There are a number of open-source projects that are under active development and would welcome external contributors. The following links contain information about active MRI-related open-source projects:

  • https://www.ismrm.org/MR-Hub/
  • https://www.ismrm.org/mri_unbound/
  • http://www.opensourceimaging.org/

Interoperability between software efforts is of value as well. For example, the ISMRMRD raw-data format [2] seeks to foster sharing of raw MRI data in a common, vendor-agnostic format.

Open-Source Software Principles

A number of models exist for open software development. Projects may choose varying amounts of openness of the software codebase and different organizational structures. Successful projects typically share a few critical characteristics [3]:

  • A decentralized structure encouraging collaboration between developers
  • A clearly defined vision shared between developers and users
  • One or more project leaders who reward community participation

An administrative structure is required to ensure that the project retains its vision and maintains quality standards and coding guidelines. Communities often identify "project leaders" who can be responsible for the maintenance of the project over the long term.

Open software projects should select a licensing model. Permissive licenses (e.g., BSD, MIT) allow the software to be used in external projects with few restrictions. Share-alike (copyleft) licenses (e.g., GPL, LGPL) provide source code on the condition, to varying degrees, that users also share the source code of work that uses it. Proprietary licensing restricts sharing of source code, or withholds it entirely, but may be appropriate in cases where core libraries can be shared along with extensible APIs for open development.

If a project makes use of external libraries (e.g., FFTW, GSL), the licenses of those included libraries may restrict the available licensing choices.
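As a rough illustration of auditing dependency licenses, the sketch below lists the license string declared by each Python package installed in the current environment, using the standard-library importlib.metadata module. It is only an analogy for the C libraries named above: the function name declared_licenses is hypothetical, the "License" metadata field is optional and free-form, and the output is a starting point for manual review, not a compatibility verdict.

```python
from importlib import metadata


def declared_licenses():
    """Map each installed distribution name to its declared license string.

    Hypothetical helper for illustration. Packages that do not fill in the
    optional "License" metadata field are reported as "UNKNOWN".
    """
    licenses = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name is None:
            continue  # skip distributions with unreadable metadata
        text = (dist.metadata.get("License") or "").strip()
        # Some packages paste the full license text; keep only the first line.
        licenses[name] = text.splitlines()[0] if text else "UNKNOWN"
    return licenses


def report(licenses):
    """Return one 'name: license' line per package, sorted by name."""
    return [f"{name}: {lic}" for name, lic in sorted(licenses.items())]
```

Calling report(declared_licenses()) yields lines that can be scanned for share-alike licenses before a license is chosen for the project itself.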

Collaboration Tools

The most important tool for collaborative software development is the version-control system. By far the most popular of these today is Git. Git allows developers in diverse locations to make concurrent edits to the same code base. These changes can then be merged into the shared main development code base by sending a pull request to the project leader(s), who review the added code for conformance with community standards, suggest improvements, and publish the changes. A number of web-based tools exist for coordinating pull requests (e.g., GitHub, Bitbucket).

While community members can largely decide which features to contribute, there is usually a need for coordination of major new features. Creating this roadmap should be controlled by the project leader(s) with community involvement. To follow progress of desired features, an issue tracking system is useful (e.g., GitHub, JIRA, Redmine).

Developer documentation and example code should be created from an early stage. Integrated API documentation tools are available for most languages (e.g., Doxygen for C and C++; Python docstrings).
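A minimal sketch of what in-code API documentation can look like in Python is given here. The function root_sum_of_squares and its NumPy-style docstring layout are illustrative choices, not part of any particular MRI toolkit; the point is only that the description, parameters, return value, and a usage example live next to the code.

```python
import math


def root_sum_of_squares(coil_values):
    """Combine per-coil signal values into a single magnitude.

    Parameters
    ----------
    coil_values : iterable of complex or float
        Signal at one pixel, with one entry per receive coil.

    Returns
    -------
    float
        Square root of the sum of the squared magnitudes.

    Examples
    --------
    >>> root_sum_of_squares([3, 4])
    5.0
    """
    return math.sqrt(sum(abs(v) ** 2 for v in coil_values))
```

The docstring is available at run time through help(root_sum_of_squares), and the embedded example can be checked automatically with the standard doctest module, which helps keep documentation and behavior from drifting apart.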

Software Quality and Testing

Quality control is important for any software project, and even more so when medical data are involved. If MRI-related software is to be used in routine clinical practice, it must first be cleared or certified under the applicable regulatory framework (e.g., FDA clearance in the United States, CE marking in Europe), with requirements that depend on its risk. Regulators and standards organizations publish standards for ensuring software quality [4,5] through validation and life cycle management.

Some of the strict regulatory requirements may seem in conflict with the open-source collaboration model. However, many of these concerns may be addressed by strict usage of controls like pull requests and binary packaging. Software validation and testing must be thorough for the intended use of the product. Unit testing and automated integration tests can be a significant part of this story.
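As a small sketch of unit testing, the example below uses Python's standard unittest module to check a hypothetical helper, zero_pad, that pads a list of samples symmetrically with zeros. The helper and its test cases are invented for illustration; a real project would test its own reconstruction or sequence code in the same style.

```python
import unittest


def zero_pad(samples, target_length):
    """Pad a list of samples symmetrically with zeros up to target_length."""
    if target_length < len(samples):
        raise ValueError("target_length is shorter than the input")
    extra = target_length - len(samples)
    before = extra // 2       # any odd leftover zero goes at the end
    after = extra - before
    return [0] * before + list(samples) + [0] * after


class ZeroPadTest(unittest.TestCase):
    def test_output_length(self):
        self.assertEqual(len(zero_pad([1, 2, 3], 8)), 8)

    def test_samples_are_preserved_in_order(self):
        self.assertEqual(zero_pad([1, 2], 4), [0, 1, 2, 0])

    def test_rejects_shorter_target(self):
        with self.assertRaises(ValueError):
            zero_pad([1, 2, 3], 2)
```

Saved in a file, the tests can be run with python -m unittest. Running them automatically on every pull request gives reviewers objective evidence that a contribution has not broken existing behavior.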

Packaging and Distribution

Several options exist for distributing open toolkits to external users. Providing the URL of a public Git repository is often sufficient, but on its own it can lead to a poor initial experience: users may need to download external library dependencies and additional tools manually and compile the code themselves before they can run even a basic test.

To simplify this initial experience, many projects provide pre-built packages to allow one-step installation. Packaging and distribution tools may be operating-system specific or language-specific, and include package managers such as Anaconda, dpkg, rpm, Homebrew, and Chocolatey, as well as container platforms such as Docker.

Conclusion

Open sharing of software tools, machine learning models, and data can provide significant benefits to the MRI community. When starting a new software project, first consider whether the work could instead extend an existing open project. When starting new projects intended to be openly extensible, integrate the appropriate collaboration tools, quality and testing standards, documentation tools, and distribution methods to maximize the chance for community engagement.

References

[1] Fomel S & Claerbout JF. Guest Editors' Introduction: Reproducible Research. Computing in Science & Engineering 2009; 11(1): 5-7. doi: 10.1109/mcse.2009.14

[2] Inati SJ, Naegele JD, et al. ISMRM Raw data format: A proposed standard for MRI raw datasets. Magn Reson Med 2017; 77(1): 411-421. doi: 10.1002/mrm.26089

[3] Guliani G & Woods D. Open Source for the Enterprise. O'Reilly Media, 2005. ISBN: 0596101198.

[4] U.S. Food and Drug Administration, Center for Devices and Radiological Health. General Principles of Software Validation. January 11, 2002.

[5] International Electrotechnical Commission. Medical device software – Software life cycle processes. IEC 62304:2006.

Figures

Figure 1: The typical open-source software life cycle begins with a perceived need, progresses through internal development and usage, and is ultimately released externally. Development of a community is critical to building a project that can sustain itself over the long term (dotted lines).

Proc. Intl. Soc. Mag. Reson. Med. 26 (2018)