Research Review

Comparing Three Chatbots Responding to Myopia Questions

September 3, 2024

By Dwight Akerman, OD, MBA, FAAO, FBCLA, FIACLE


The study by Wang, Y., Liang, L., Li, R., Wang, Y., & Hao, C. (2024), titled “Comparison of the Performance of ChatGPT, Claude, and Bard in Support of Myopia Prevention and Control,” evaluates how well three well-known chatbots (ChatGPT, Claude, and Bard) respond to public health questions about myopia.

The authors begin by noting the increasing use of chatbots, particularly those based on large language models, in public health. They also acknowledge that the effectiveness of chatbot responses has been debated and that chatbot performance on myopia-related questions has not been fully explored. The study aims to fill this gap by assessing how the three chatbots respond to a set of 19 public health questions about myopia, covering three topics: policy, basics, and measures.

Each chatbot answered the 19 questions individually. After the order of the responses was shuffled, four raters independently scored each response for comprehensiveness, accuracy, and relevance. Response length differed significantly among the three chatbots: ChatGPT produced the highest word count, followed by Bard and then Claude. All three chatbots received composite scores above 4 out of 5, indicating generally positive performance. ChatGPT scored the highest in all aspects of the assessment, although the study also found that all three chatbots had shortcomings, such as occasionally giving fabricated responses.

In conclusion, the study emphasizes the great potential of chatbots, particularly ChatGPT, in public health. It also underscores the need to rapidly develop standards for the use and monitoring of chatbots, alongside continued research, evaluation, and improvement of these tools. The findings bear on the future use of chatbots as a public health tool and suggest that their ability to address specific health-related queries and concerns needs to be evaluated and refined.

Overall, the study adds to the growing body of literature on chatbots in public health, with evidence specific to myopia prevention and control. It also highlights the importance of further research and development so that chatbots can become effective tools for addressing public health issues.

Abstract

Comparison of the Performance of ChatGPT, Claude, and Bard in Support of Myopia Prevention and Control

Yan Wang, Lihua Liang, Ran Li, Yihua Wang, Changfu Hao 

Purpose: Chatbots, which are based on large language models, are increasingly being used in public health. However, the effectiveness of chatbot responses has been debated, and their performance in myopia prevention and control has not been fully explored. This study aimed to evaluate the effectiveness of three well-known chatbots ― ChatGPT, Claude, and Bard ― in responding to public health questions about myopia.

Methods: Nineteen public health questions about myopia (including three topics of policy, basics and measures) were responded individually by three chatbots. After shuffling the order, each chatbot response was independently rated by four raters for comprehensiveness, accuracy, and relevance.

Results: The study’s questions have undergone reliable testing. There was a significant difference among the word count responses of all three chatbots. From most to least, the order was ChatGPT, Bard, and Claude. All three chatbots had a composite score above 4 out of 5. ChatGPT scored the highest in all aspects of the assessment. However, all chatbots exhibit shortcomings, such as giving fabricated responses.

Conclusion: Chatbots have shown great potential in public health, with ChatGPT being the best. The future use of chatbots as a public health tool will require rapid development of standards for their use and monitoring, as well as continued research, evaluation, and improvement of chatbots.

Wang, Y., Liang, L., Li, R., Wang, Y., & Hao, C. (2024). Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control. Journal of Multidisciplinary Healthcare, 3917-3929.

DOI: https://doi.org/10.2147/JMDH.S473680

 
