OpenAI's wildly popular ChatGPT text-generation program is capable of propagating numerous errors about scientific studies, prompting the need for open-source alternatives whose functioning can be scrutinized, according to a study published this week in the prestigious journal Nature.
"Currently, nearly all state-of-the-art conversational AI technologies are proprietary products of a small number of big technology companies that have the resources for AI technology," write lead author Eva A. M. van Dis, a postdoctoral researcher and psychologist at Amsterdam UMC, Department of Psychiatry, University of Amsterdam, the Netherlands, and several collaborating authors.
Because these programs can produce falsehoods, they continue, "one of the most immediate issues for the research community is the lack of transparency."
"To counter this opacity, the development of open source AI should be prioritized now."
OpenAI, the San Francisco startup that developed ChatGPT and is financed by Microsoft, has not released source code for ChatGPT. Large language models, the class of generative AI that preceded ChatGPT, have often been closed as well: OpenAI's GPT-3, introduced in 2020, also does not come with source code.
Numerous large language models released by various corporations do not offer their source code for download.
In the Nature article, titled, "ChatGPT: five priorities for research," the authors write that there is a very broad danger that "using conversational AI for specialized research is likely to introduce inaccuracies, bias and plagiarism," adding that "Researchers who use ChatGPT risk being misled by false or biased information, and incorporating it into their thinking and papers."
The authors cite their own experience using ChatGPT with "a series of questions and assignments that required an in-depth understanding of the literature" of psychiatry.
They found that ChatGPT "often generated false and misleading text."
"For example, when we asked 'how many patients with depression experience relapse after treatment?', it generated an overly general text arguing that treatment effects are typically long-lasting. However, numerous high-quality studies show that treatment effects wane and that the risk of relapse ranges from 29% to 51% in the first year after treatment completion."
The authors are not arguing for doing away with large language models. Rather, they suggest "the focus should be on embracing the opportunity and managing the risks."
They suggest a number of measures to manage those risks, including many ways of keeping "humans in the loop," in the language of AI research. That includes publishers making sure to "adopt explicit policies that raise awareness of, and demand transparency about, the use of conversational AI in the preparation of all materials that might become part of the published record."
But humans in the loop are not enough, van Dis and colleagues suggest. The proliferation of closed-source large language models is a danger, they write. "The underlying training sets and LLMs for ChatGPT and its predecessors are not publicly available, and tech companies might conceal the inner workings of their conversational AIs."
A major effort by entities outside the private sector is needed to push open-source code as an alternative, they write:
"To counter this opacity, the development and implementation of open-source AI technology should be prioritized. Non-commercial organizations such as universities typically lack the computational and financial resources needed to keep up with the rapid pace of LLM development. We therefore advocate that scientific-funding organizations, universities, non-governmental organizations (NGOs), government research facilities and organizations such as the United Nations — as well as tech giants — make considerable investments in independent non-profit projects. This will help to develop advanced open-source, transparent and democratically controlled AI technologies."
A question unasked in the article is whether an open-source model will be able to address the notorious "black box" problem of artificial intelligence. The exact way in which deep neural networks (those with numerous layers of tunable parameters, or weights) function remains a mystery even to practitioners of deep learning. Therefore, any transparency goal will have to specify what can actually be learned by open-sourcing a model and its data sources.