Large Language Models (LLMs) in Research

Efrat Shimron¹
¹Technion - Israel Institute of Technology, Israel

Synopsis

Keywords: Transferable skills: Responsible research, Transferable skills: Reproducible research, Image acquisition: Reconstruction

This presentation explores the integration of Large Language Models (LLMs) like ChatGPT into academic research, highlighting their utility in enhancing research efficiency through tasks such as proofreading, brainstorming, and code writing. It further addresses the ethical use of open web-based datasets, especially in MRI research, emphasizing the importance of responsible data application to avoid bias and "data crimes." Attendees will learn to effectively leverage these digital tools for improved research productivity and impact, balancing innovation with ethical considerations.

Overview

This lecture focuses on strategies for harnessing web-based resources, particularly large language models (LLMs) like ChatGPT and open-access datasets, for academic research.
The first part will explore the growing role of LLMs within the academic community. With their capacity to assist in a variety of tasks, ranging from drafting communications to writing code, LLMs like ChatGPT have emerged as invaluable allies in the research process [1,2]. This section will offer practical insights into leveraging ChatGPT to enhance research efficiency, including proofreading drafts, fostering creative brainstorming sessions, extracting complex equations from dense PDFs, and ensuring reference integrity.
The second part will address methodological considerations for the use of open web-based datasets, with a particular focus on MRI databases. Attendees will gain an understanding of the diverse dataset types at their disposal, strategies for their responsible use, and the potential pitfalls of negligent data application, which can culminate in "data crimes" and skewed research outcomes [3].
By the conclusion of this lecture, participants will be equipped with strategic approaches to harness web-based LLMs and online datasets effectively and ethically, thereby unlocking new dimensions of productivity and impact in their research endeavors.

Acknowledgements

E.S. is a Horev Fellow and acknowledges support from the Technion's Leaders in Science and Technology program.

References

1. Brown, Tom, et al. "Language models are few-shot learners." Advances in Neural Information Processing Systems 33 (2020): 1877-1901.

2. Kasneci, Enkelejda, et al. "ChatGPT for good? On opportunities and challenges of large language models for education." Learning and Individual Differences 103 (2023): 102274.

3. Shimron, Efrat, et al. "Implicit data crimes: Machine learning bias arising from misuse of public data." Proceedings of the National Academy of Sciences 119.13 (2022): e2117203119.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)