Dear Editor,
I read with great interest the recent article titled “Evaluating the Performance of ChatGPT, Gemini, and Bing Compared with Resident Surgeons in the Otorhinolaryngology In-service Training Examination” published in your journal (1). The study offers valuable insights into the evolving role of large language models (LLMs) in healthcare.
The study’s comparative evaluation of artificial intelligence (AI)-driven language models against resident surgeons is both timely and significant. It highlights that while LLMs, such as ChatGPT and Gemini, exhibit impressive capabilities in answering factual and guideline-based questions, they remain far from replacing human expertise (2), especially in highly specialized fields like otorhinolaryngology. The complexity of medical decision-making requires not only the recall of information but also the ability to apply it in context, an area where general-purpose LLMs like ChatGPT remain limited because their output depends heavily on the input they receive (3).
While these tools excel at providing broad, evidence-based responses, they often struggle with the subtleties of case-specific clinical reasoning (4). A summary of potential differences between LLMs and humans across various aspects of healthcare is shown in Table 1. LLMs should therefore be used as adjunct tools rather than replacements in healthcare education and clinical practice (5). By supporting residents in understanding core concepts, reviewing evidence-based guidelines, or simulating basic diagnostic scenarios, LLMs can serve as a valuable supplementary resource in training environments. However, the integration of AI into medical education and diagnostics must be approached with caution. AI is still some way from being able to reliably make critical healthcare decisions independently. Hence, the use of AI tools should be geared toward enhancing human decision-making rather than substituting for it.