Powered by: Claudia (ADFI Alsace)
This tool is built on PubMind
Direct access to the scientific literature through the PubMed database, to support monitoring of the complex issues surrounding mental health and religion: from the neuroscience of belief to the study of spiritual abuse, including trauma care and deconversion processes.
Last synchronized on 07/06/2026
J Glaucoma. 2026;35(3):173-178
PRÉCIS: This study investigates the accuracy, readability, utility, and educational value of glaucoma treatment content on social media platforms and explores how large language models assess the quality of social media posts compared with glaucoma experts.
PURPOSE: To assess the quality of information on glaucoma treatment available on social media platforms.
METHODS: A 30-question survey consisting of the "top posts" from three social media platforms (X, Instagram, and Reddit) was assessed by 5 board-certified glaucoma experts across four domains (readability, utility, educational value, and accuracy) using a 5-point Likert scale. The overall quality of each post was calculated as the average of the median scores assigned to each of the four domains, which served as the reference standard. Expert agreement was assessed using Kendall's coefficient of concordance (W). A large language model (LLM), GPT-4 (OpenAI), was then prompted to evaluate the same posts with identical instructions. Agreement with expert consensus was measured using Cohen weighted kappa (κ), and the difference in favorability of each post was assessed using the McNemar exact test.
RESULTS: Fewer than half of social media posts on glaucoma treatment were judged favorably by glaucoma experts (40%). GPT-4 was less critical of social media content and provided a favorable rating nearly twice as often (77%, P=0.017). Despite this difference, there was moderate agreement between the LLM and the glaucoma experts (κ=0.421, P=0.005). The lack of agreement predominantly stemmed from cases where the experts rated the content unfavorably, with disagreement occurring in 56% of cases, compared with 0% when the content was deemed favorable (P=0.005).
CONCLUSIONS: Although glaucoma experts and artificial intelligence (AI)-based systems were in moderate agreement when evaluating the quality of posts, the LLM was less able to discriminate posts of low quality.
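The scoring rule and the paired test named in the methods can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the function names `overall_quality` and `mcnemar_exact`, the domain keys, and all ratings below are hypothetical, and the abstract does not state the threshold the study used to call a post "favorable".

```python
from math import comb
from statistics import mean, median


def overall_quality(ratings):
    """Average of the per-domain median scores for one post.

    `ratings` maps a domain name to the Likert scores (1-5) given by
    each expert for that domain.
    """
    return mean(median(scores) for scores in ratings.values())


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: posts rated favorable by rater A only; c: by rater B only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical ratings from 5 experts for a single post (not study data).
example_post = {
    "readability": [4, 4, 5, 3, 4],
    "utility": [2, 3, 3, 2, 4],
    "educational_value": [3, 3, 2, 3, 4],
    "accuracy": [2, 2, 3, 1, 2],
}

print(overall_quality(example_post))
print(mcnemar_exact(0, 5))
```

The exact form of the McNemar test uses only the discordant pairs, which suits a small sample such as 30 posts, where the chi-square approximation can be unreliable.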