The role of AI in science was shoved into the spotlight again last week after some very ill-advised uses of AI-generated content in research papers came to light. In this blog post, I'll take a look at the current state of AI in science, focusing on LLMs in academic publishing, and argue that AI is not going to destroy science, nor is it going to solve it and replace all human researchers.
Last week, users of the site formerly known as Twitter had some fun pointing out academic articles with AI-generated images and text. Gary Marcus wrote two posts on what he termed the "exponential enshittification" of science due to the use of generative AI in scientific research. One of the more amusing examples is a generated diagram of a rat in a now-retracted article in Frontiers in Cell and Developmental Biology. After finding some articles with clearly generated text in the published version, users also pointed out that Google Scholar searches for phrases like "Certainly, here is a list," "my last knowledge update," or "I am a language model" turned up many seemingly legitimate articles.
Using AI-generated text and images in scientific articles is clearly problematic. The fact that either type of content could, in its raw form, get through a review process is pretty disheartening. AI-generated text often contains fabricated, misleading, or biased information, and usually sounds like it was written by a pontificating idiot. AI image generators can't render real text (see the "Testtomcels" label above) and their output rarely represents reality, so it is hard to imagine how they could be of any use for a scientific purpose like making diagrams. Thankfully, the use of AI-generated images in scientific articles seems to be less common than the use of AI-generated text, so I'll focus on the latter.
There is clearly a growing number of articles online that used an LLM to write a portion of the text, with neither the authors nor the reviewers changing it. I decided to take a look at 100 articles that came up when searching Google Scholar for some telltale AI-generated phrases; "my last knowledge update" was by far the most common. I collected the articles in a Zotero group, which you can access here. This is intended as a representative sample, not an exhaustive list: I'm sure there are other articles that contain clearly generated text but weren't caught by the above search terms, and I also went through enough articles to get temporarily blocked by Google Scholar. So, sample size of 100.
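For the curious, here is a rough sketch of how the searches above could in principle be scripted. It assumes the third-party scholarly Python package, which is not something this post actually relied on, and be warned that Google Scholar rate-limits this kind of automated querying aggressively.

```python
# Sketch: search Google Scholar for telltale AI-generated phrases using the
# third-party `scholarly` package (pip install scholarly). Illustrative only;
# Google Scholar rate-limits scrapers heavily.
from itertools import islice

from scholarly import scholarly

TELLTALE_PHRASES = [
    '"my last knowledge update"',
    '"Certainly, here is a list"',
    '"I am a language model"',
]

for phrase in TELLTALE_PHRASES:
    results = scholarly.search_pubs(phrase)
    # Only look at the first few hits per phrase to stay (somewhat) polite.
    for hit in islice(results, 10):
        bib = hit.get("bib", {})
        print(f"{phrase} | {bib.get('title')} | {bib.get('venue', 'unknown venue')}")
```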
Of the 100 articles, 68 had some form of journal or publication venue listed correctly. Some of the other articles, like this one, didn't have proper Google Scholar listings, while most of the others were preprints or were only uploaded to ResearchGate (for those lucky enough not to be familiar, ResearchGate is like the Facebook of academia, but somehow even worse than that sounds).
Only two of the articles were published in journals indexed in the Directory of Open Access Journals: the first in IEEE Access, and the second in a journal titled the "Journal of Arts & Social Sciences". I couldn't find any reliable information about the impact factor of JASS, and the article in question can't be downloaded from the journal's website, only from the Internet Archive, so it's safe to say that this journal isn't the height of academic publishing.
Five of the 100 articles were conference papers, specifically from the IEEE conferences ICSCAN, ICDSAAI, RMKMATE, ICACCTech, and ICECA. None of these conferences are even indexed in the CORE database, which ranks academic conferences. Hilariously, one of the articles is about AI writing tools; it includes, in its own running text (not as a quoted example of AI output), the clearly generated phrase "by considering the evolution of these tools beyond my last knowledge update in September 2021, this study equips individuals with insights," and yet it does not acknowledge the use of an AI writing tool, as required by the IEEE.
So, to summarize: out of 100 articles, two were in somewhat reputable journals, five were in unranked conference proceedings, and the rest were either preprints or published in journals that aren't indexed anywhere reputable. In other words, I'm not panicking about the future of science yet.
I don't want to minimize the problems that come with language models and generative AI, but I do think that claims of an AI-driven "exponential enshittification" of science are a bit overblown. Science has been plagued by predatory journals and conferences for years. Google Scholar is known to be "filled with junk science," and bibliometric analysis for academics relies on other sources of information like Scopus or Web of Science. Generative AI leading to an increase in publications in predatory or low-quality journals is a problem, but the existence of these journals, conferences, and articles is not a new problem, and means to address it, while imperfect, are already in place.
Rather, there is an argument that writing assistance from large language models will benefit science. Many papers with interesting results have been rejected due to their presentation, English level, or writing style. AI tools can help researchers with these issues, allowing them to focus on the science itself. Copying and pasting generated text into a paper should definitely be forbidden, but the use of language models to translate, correct, or rephrase text could, if done properly, improve scientific communication. This editor statement makes some common-sense recommendations:
LLMs or other generative AI tools should not be listed as authors on papers
Authors should be transparent about their use of generative AI, and editors should have access to tools and strategies for ensuring authors’ transparency
Editors and reviewers should not rely solely on generative AI to review submitted papers
Editors retain final responsibility in selecting reviewers and should exercise active oversight of that task
Final responsibility for the editing of a paper lies with human authors and editors
The use of LLMs in academic publishing is also only a small piece of the puzzle of AI in science. There are many, many ways in which AI can play a role in science, and it's a subject I hope to return to in future posts. There's a series of workshops on AI for science, a number of research groups, and government initiatives from the EU and the US that mention or focus on AI in science. The recent-ish Nature article "Scientific discovery in the age of artificial intelligence" does a great job of summarizing some of the key advancements in AI for science.
An argument that has been put forward is that AI is best used as a tool to aid scientific understanding, rather than as a replacement for humans in various parts of the scientific process. AI can help scientists analyze large datasets, design experiments, generate new hypotheses, and sift through the scientific literature. This article argues that AI can be useful as a "computational microscope," allowing for complex simulations and data analysis; as a source of inspiration through analysis of scientific literature, data, or models; and as an "agent of understanding" that explains concepts to scientists.
Similar points were recently argued by Stephen Wolfram in his blog post and 2.5-hour-long webinar entitled "Can AI Solve Science?"1. Wolfram points out that AI can help find interesting or surprising subjects of study, assist in the creation of human-accessible narratives for complex scientific phenomena, and identify patterns or relationships within vast amounts of data that might not be immediately apparent to human researchers. However, he also cautions against over-reliance on AI, emphasizing that the ultimate understanding and advancement of science will still depend on human creativity, intuition, and the ability to conceptualize new frameworks and theories.
So, AI isn't going to destroy science, nor is it going to solve it and replace all human researchers. As a tool for scientific understanding, it can certainly accelerate science, and whether accelerating science is desirable is a question worth discussing. To benefit from AI, however, the temptation to prioritize quantity over quality, to publish first and question later, must be resisted: the credibility of scientific work has to be maintained in the face of AI's double-edged promise. Using Midjourney to generate fake rat testes is definitely not the way forward.
1. The point about human understanding only comes at the end of this massive post/webinar. The central argument of Wolfram's post is rather that the effectiveness of AI in science depends on computational reducibility. Some problems, which are reducible, can be broken down into simpler steps that can be approximated or solved by AI. For problems that are not reducible, such as the three-body problem in physics, the approximations offered by machine learning won't be effective. By leveraging AI as a tool within the broader context of human-led research, scientists can navigate computational irreducibility, uncovering pockets of reducibility that offer new insights and breakthroughs.
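To make irreducibility a bit more concrete, here is a minimal sketch using Rule 30, a one-dimensional cellular automaton that Wolfram often uses as his go-to example (it isn't mentioned in the passage above, which cites the three-body problem instead). The point is that, for an irreducible system like this, the only known way to get the state after n steps is to actually compute all n steps; there is no shortcut formula for a machine learning model to approximate.

```python
# A minimal illustration of computational irreducibility using Rule 30,
# a one-dimensional cellular automaton. To know the pattern after n steps,
# you have to run all n steps -- there is no known closed-form shortcut.
def rule30_step(cells: list[int]) -> list[int]:
    """Apply one step of Rule 30 to a row of 0/1 cells (fixed zero boundaries)."""
    padded = [0] + cells + [0]
    return [
        # Rule 30: new cell = left XOR (center OR right)
        padded[i - 1] ^ (padded[i] | padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]

row = [0] * 30 + [1] + [0] * 30  # single "on" cell in the middle
for _ in range(20):
    print("".join("█" if c else " " for c in row))
    row = rule30_step(row)
```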