Powered by: Claudia (ADFI Alsace)
This tool relies on PubMind
Direct access to the scientific literature through the PubMed database, making it easier to keep track of the complex issues surrounding mental health and religion: from the neuroscience of belief to the study of spiritual abuse, by way of trauma care and deconversion processes.
Last synchronized on 7 June 2026
Appl Clin Inform. 2026;17(2):172-176
This study aimed to explore the performance of the large language models (LLMs) ChatGPT version 4.0 (GPT-4) and Gemini Advanced (Gemini) in addressing common patient questions after gynecologic surgery with regard to accuracy, relevance, helpfulness to the average patient, and readability.
In this cross-sectional study, the two LLMs were prompted to generate answers to postoperative patient questions after gynecologic surgery. The questions were developed to simulate common patient questions after gynecologic surgery, based on expert opinion and compiled from anonymous posters on Reddit (r/endometriosis). Questions focused on six topics: endometriosis, vaginal bleeding, bowel/bladder function, incision care, resumption of activities, and sexual function. Questions were asked in a systematic three-step submission process with the memory reset after each query. Responses were then blinded and independently assessed for accuracy and relevance on a 5-point Likert scale by four board-certified gynecologic surgeons with fellowship training in gynecologic surgery. Readability of the answers was calculated with the Flesch-Kincaid grade level calculator. Responses were also evaluated by three clinic nurses.
A total of 41 questions were posed to GPT-4 and Gemini three times. The responses were independently evaluated by four surgeons and three nurses, leading to a total of 1,968 evaluations for accuracy, relevance, helpfulness to the average patient, and readability. Surgeons and nurses graded Gemini responses as more accurate (4.23 vs. 4.03, p = 0.015) and more helpful (4.37 vs. 4.21, p = 0.025) than GPT-4 responses. Responses from both models were similarly found to be relevant or very relevant (4.45 vs. 4.36, p = 0.2). Most responses by GPT-4 (85%) and Gemini (87%) were consistent across all questions. The average reading levels for GPT-4 and Gemini responses were 11th and 10th grade, respectively, above the recommended 6th grade reading level for patient information.
GPT-4 and Gemini provided overall accurate, relevant, and helpful responses to common postoperative patient questions for gynecologic surgery. Gemini outperformed GPT-4 in both accuracy and helpfulness and had objectively more readable responses.
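The abstract does not say which Flesch-Kincaid calculator was used. As an illustration only, the following minimal Python sketch applies the standard Flesch-Kincaid grade level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The function names and the vowel-group syllable heuristic are assumptions made for this example; they are not the study's tooling, and published calculators may count syllables differently and so return slightly different grades.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllable count (assumed heuristic, not the study's method).

    Counts groups of consecutive vowels and drops a trailing silent 'e'.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent unless the word ends in 'le' or 'ee'.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid grade level of an English text."""
    # Sentences are split on runs of '.', '!' or '?'; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
```

Calling `flesch_kincaid_grade` on a model response returns an approximate U.S. school grade, which can be compared with the 6th grade level recommended for patient information.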