Dr Bot: Why Doctors Can Fail Us and How AI Could Save Lives (4) Evaluation of AI chatbots for patient education and information on chronic obstructive pulmonary disease

15 September, 2025

Thanks to HIFA member Thomas Krichel and his regular update on Biomedical Librarianship (bims-librar: https://biomed.news/bims-librar/2025-09-14) we have witnessed a proliferation of papers on AI providing information for the general public, patients, health workers and others. Overwhelmingly, these papers find chatbots to be remarkably accurate and reliable, and they are likely to become more so.

Below is an example and a comment from me.

CITATION: Evaluation of AI chatbots for patient education and information on chronic obstructive pulmonary disease.

Pınar Merç, Cansu Şahbaz Pirinççi, Emine Cihan.

DOI: https://doi.org/10.1016/j.hrtlng.2025.09.002

BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a chronic and progressive disease that affects patients' quality of life and functional capacity. With its widespread use and ease of access, AI chatbots stand out as an alternative source of patient-centered information and education.

OBJECTIVES: To evaluate the readability and accuracy of information provided by ChatGPT, Gemini, and DeepSeek in COPD.

METHODS: Ten most frequently asked questions and answers regarding COPD in English were provided using three AI chatbots (ChatGPT-4 Turbo, Gemini 2.0 Flash, DeepSeek R1). Readability was assessed using the Flesch-Kincaid Grade Level (FKGL), while information quality was analyzed by five physiotherapists based on the guidelines. Responses were graded using a 4-point system from "excellent response requiring no explanation" to "unsatisfactory requiring significant explanation." Statistical analyses were performed on SPSS.

RESULTS: Overall, all three AI chatbots responded to questions with similar quality, with Gemini 2.0 providing a statistically higher quality response to question 4 (p < 0.05). In terms of readability of the answers, DeepSeek was found to have better readability on Q5 (12.01), Q8 (9.24), Q9 (13.1) and Q10 (8.73) compared to ChatGPT (Q5:13.9, Q8:11.92, Q9:17.15, Q10:9.88) and Gemini (Q5:18.22, Q8:15.47, Q9:17.42, Q10:9.38). Gemini was observed to produce more complex and academic level answers on more questions (Q4, Q5, Q8).

CONCLUSIONS: ChatGPT, Gemini, and DeepSeek provided evidence-based answers to frequently asked patient questions about COPD. DeepSeek showed better readability performance for many questions. AI chatbots may serve as a valuable clinical tool for COPD patient education and disease management in the future.
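
NOTE (FKGL): For readers unfamiliar with the readability measure used in the abstract above, the Flesch-Kincaid Grade Level is calculated as 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59, and the result approximates a US school grade, so a lower score means easier reading. The Python sketch below shows the calculation. It is illustrative only and is not the tool the authors used: the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic, so its scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count groups of consecutive vowels and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level of a piece of English text."""
    # Naive sentence split on ., ! and ? (abbreviations will be miscounted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    technical = ("Chronic obstructive pulmonary disease progressively "
                 "limits functional capacity.")
    print(round(fkgl(simple), 2), round(fkgl(technical), 2))
```

Short sentences of short words score low, while long sentences packed with polysyllabic clinical terms score high, which is the pattern behind the per-question differences reported in the RESULTS.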

COMMENT (NPW): I would be interested to see an overview of the emerging literature on AI, perhaps stratified according to a taxonomy that differentiates its application for different end-user groups and for different settings (for example, use by patients to explore a new health issue and what to do about it; use by patients and/or health workers to develop knowledge and expertise around a chronic condition; and use of AI as a 'substitute' for consulting with health workers).

HIFA profile: Neil Pakenham-Walsh is coordinator of HIFA (Healthcare Information For All), a global health community that brings all stakeholders together around the shared goal of universal access to reliable healthcare information. HIFA has 20,000 members in 180 countries, interacting in four languages and representing all parts of the global evidence ecosystem. HIFA is administered by Global Healthcare Information Network, a UK-based nonprofit in official relations with the World Health Organization. Email: neil@hifa.org