With the release of ChatGPT and Bing (that much-maligned monster), a great deal of ink has been expended on the problem of AI systems ‘hallucinating.’ But what does this word mean in relation to a bunch of code?
For most of us non-computer-scientists, the term ‘hallucination’ conjures up a vision of a sentient being in a state of conscious delusion. Hallucinations that humans experience are different from, say, dreams, because in dreaming we aren’t conscious. Hallucinators of the human variety are fully conscious but are usually said to be in a state of ‘altered consciousness,’ often brought about by extreme physical stress such as dehydration or fasting, or by ingesting drugs. I can attest to the overwhelming sense of reality that may accompany pharmacologically induced hallucinations and the powerful present-ness of such experiences.
So, to declare that an AI system is ‘hallucinating’ is to tacitly imply the system has already attained some kind of consciousness. This has been on my mind as I read the endless column-inches being devoted to ChatGPT and Bing’s various idiotic, hilarious, astounding, and sometimes dangerous hallucinations.
Far more important than the content of any of these cognitive misalignments is the insinuation of an overall context in which ‘hallucinating’ might even be possible – and a lack of commentary challenging this root idea. Even Gary Marcus, the famously skeptical critic of AI, who has been all over the media warning about its risks, has seemingly accepted the term at face value. It’s the use of the term itself that I want to challenge here.
According to dictionary.com, the primary definition of a ‘hallucination’ is “a sensory experience of something that does not exist outside the mind, caused by various physical and mental disorders, or by reaction to certain toxic substances, and usually manifested as visual or auditory images.” Note here the central phrase “something that does not exist outside the mind,” which immediately puts us in the domain of something that has a mind.
Does Bing have a mind?
I think the consensus among AI researchers and philosophers alike is still no. Some AI enthusiasts are now arguing that it has the capacity for what is called ‘theory of mind,’ which is to say it can register when it is encountering an entity with a mind – i.e., a human. However, I don’t see the research community claiming (yet) that AIs have minds of their own.
In applying the term ‘hallucination’ to their bots, computer scientists and AI researchers are using the word in a much looser, less complex way – the one spelled out in dictionary.com’s third and final definition: “a false notion, belief, or impression.”
Let’s leave aside the word “belief” for now[1] and just go with “a false notion or impression.” This is the sense in which AI programmers use the term hallucination and it’s a whole lot less sexy than the primary definition, for it implies nothing about a ‘mind,’ conscious or otherwise. No consciousness is being altered because none is present. When Bing or ChatGPT ‘hallucinate’ they are not having LSD-like experiences. Technically, when a bot ‘hallucinates’ it is simply stating untruths by giving factually incorrect information, or “making things up” by spinning yarns based on the vast strings of words it’s been trained on.
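To make that mechanical point concrete, here is a toy sketch in Python of what a language model is doing when it generates text. The vocabulary and probabilities are invented for illustration – they come from no actual model – but the logic is the relevant part: the system samples the next word according to how plausibly it follows what came before, and nothing in the loop ever checks whether the result is true.

```python
import random

def sample_next_token(context, distribution):
    """Pick the next word in proportion to how plausibly it follows
    the context in the training data -- truth never enters into it."""
    tokens, weights = zip(*distribution[context].items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Hypothetical word statistics for one prompt, made up for this sketch.
toy_distribution = {
    "The capital of Australia is": {
        "Canberra": 0.55,   # correct, and the most likely continuation
        "Sydney": 0.40,     # false, but statistically very plausible
        "Melbourne": 0.05,  # also false
    }
}

prompt = "The capital of Australia is"
print(prompt, sample_next_token(prompt, toy_distribution))
# Roughly 45% of the time this prints a fluent, confident falsehood --
# which is all a 'hallucination' amounts to at the technical level.
```

A real LLM does this over tens of thousands of possible tokens and billions of learned weights, but the absence of any truth-check is the same.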
In an elite discussion group I participate in hosted by Google,[2] a lively exchange has taken place in the past few weeks over the appropriateness of the word hallucinating in the AI context. Several researchers have defended this as just a “technical” term and reproached the public for misunderstanding scientists’ strict definitions. Yet ‘hallucination’ is a word with deep resonances in our culture dating back long before the invention of computers, so it’s rather disingenuous to then claim it should be interpreted by the public in a purely “technical” way – whatever that might mean.
The fact that AI researchers have chosen a word so laden with human connotations is not a coincidence, I believe. It’s an index of a wider trend in which words with complex cultural, psychological and social meanings are now often being appropriated by computer scientists: ‘Intelligence,’ ‘consciousness,’ and ‘understanding’ are other examples. My original intention for this essay was to write a piece on the ways in which the appropriation of real-life words into scientific rhetoric is being leveraged to sometimes quite misleading effect. I will leave that for a future installment. For now, I want to respond to a development in AI ‘hallucinating’ that – rhetoric aside – strikes me as genuinely interesting.
As noted above, ‘hallucination’ in relation to AI simply means such a system has stated something which isn’t true in the real world. At worst, this can be spouting toxic falsehoods such as racist or sexist lies; quoting fake facts about elections, climate change, vaccines and so on; or, insidiously, offering up citations to fake articles about fake research with fake findings, frequently invoking the names of real researchers, thereby making it hard to assess where the falsehood begins.[3] But in this narrower technical sense, hallucination may also mean “making things up” – which isn’t necessarily bad.
Bedtime stories are made up, and all children hunger for good ones. In my own interactions with ChatGPT, I have asked it repeatedly to “tell me a story,” hoping to be delighted or at least amused. So far, it’s failed miserably, churning out formulaic plotlines that, on questioning, it admits to having lifted from classic tales, including Alice in Wonderland.[4] At least it’s been honest about its sources. Worse still, its prose skills have been atrocious. I have one piece of advice for its maker, OpenAI – send it to a creative writing class. However much it may be able to (sort-of) emulate a specified writer whose prose has been included in its training set, as far as I can tell its own imaginative literary talents are sorely lacking. As a professional writer I feel no threat whatever.
Ironically, there is a scientific domain in which ChatGPT’s fabulation skills may prove genuinely innovative – the realm of synthetic biology.
In a recent paper published in Nature Biotechnology, a pair of researchers set a large language model (LLM) AI the task of ‘hallucinating’ new proteins.[5] (The formal title of the paper is “Hallucinating Functional Protein Sequences.”) That is to say, they wanted the AI to come up with proteins that don’t exist in real-world biology. In this case, making things up was the goal.
Proteins are the molecules almost all living things are built from: our bones, our blood, our tissues are all composed of proteins, and this is what DNA codes for. For decades biochemists, medical researchers and pharmaceutical companies have been interested in finding new proteins for use as medications and in other areas of industrial chemical production. This has been challenging, to say the least, because proteins are often super-complex molecules with very intricate shapes that are essential to their functioning. It’s hard to tell if a slight modification to any given protein might make the whole thing fall apart and become unviable, or transmute it into a poison.
It turns out the new generation of LLM AIs are very good and very fast at making up new protein structures, culled from what the researchers refer to as a “protein fitness landscape.” Might this be a boon to medicine? Again, on the Google group a vigorous debate has taken place, with researchers weighing in on both sides. Detractors point out the many steps that must take place between a theoretical description of a new protein and successful clinical deployment, including testing whether it has toxic side-effects or unexpected interactions with other proteins in our bodies. Also open is the question of whether a new protein can be manufactured efficiently, or manufactured at all: “Your hallucination may fall apart in the real world,” one participant noted. Any particular made-up protein may not be useful in the end, yet proponents see AIs as a potentially powerful tool whose fabulating skills may assist with more rapid development of drugs.
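For readers who want a feel for what ‘making up proteins and filtering them against a fitness landscape’ might look like in code, here is a deliberately crude sketch. It is not the method of the Nature Biotechnology paper: where a real protein language model learns residue statistics from millions of natural sequences, my stand-in generator picks amino acids at random and my ‘fitness’ score is a placeholder. The generate-then-filter shape of the loop is the only faithful part.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def generate_sequence(length=120):
    """Emit a candidate protein one residue at a time.
    (A real model would condition each choice on what came before.)"""
    return "".join(random.choice(AMINO_ACIDS) for _ in range(length))

def fitness_score(sequence):
    """Placeholder for a learned 'fitness landscape' score; here it just
    rewards a high fraction of hydrophobic residues, which is NOT a real
    measure of whether a protein folds or functions."""
    return sum(sequence.count(aa) for aa in "LIV") / len(sequence)

# 'Hallucinate' many candidates, then keep the top-scoring few for the
# long, expensive real-world testing where most of them will fall apart.
candidates = [generate_sequence() for _ in range(1000)]
best = sorted(candidates, key=fitness_score, reverse=True)[:5]
for seq in best:
    print(f"{fitness_score(seq):.2f}  {seq[:40]}...")
```

Even in this cartoon form the skeptics’ point is visible: the cheap step is the generation; everything that makes a protein a medicine happens after the print statement.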
All of which has led me to contemplate a more science-fictional scenario: Could AIs fantasize a new biochemistry? If they can make up new proteins, could they one day make up a new kind of molecular life-system not based on DNA or RNA but on something quite alien? In other words, could an AI dream up synthetic sheep?
How about a life-chemistry based on silicon rather than carbon? Silicon is an obvious candidate because it sits below carbon in the periodic table and has the same chemical valency. This is the stuff of science fiction, and of actual scientific speculation. NASA has a lovely video[6] discussing the possibilities and challenges for a silicon-based biology, noting that while it’s highly unlikely to happen on Earth, where water reigns supreme, it is perhaps more viable in the methane-saturated environment of Saturn’s moon Titan, with its pools of liquid hydrocarbons. Even on Earth, organisms such as radiolarians and glass sponges make their structural supports out of silica (silicon dioxide), employing carbon-based chemistry for their organs in a wondrous hybrid carbon-silicon physiology.
Surveying the richness of the periodic table, who knows what other bio-chemistries may be waiting in the space of molecular possibility, outlandish confabulations based on novel arrangements of atoms. Could AIs open a window onto this fantastical chemical ‘landscape’? Perhaps such proposals would not be constructible on Earth – they might require the internal pressure of a star, for instance, or a combination of elements not found together in the real world. Perhaps these alternative chemistries would remain science fictions – but we would have been told a truly original story.
[1] I will write a later post about AI and belief.
[2] The group is called SciFoo, and I’m indebted to the many thought-provoking comments from its participants, particularly David Sacks, Gary Bradski, James Gleick, and Pascale Fung.
[3] My favorite so far is a growing body of research ‘literature’ cited by ChatGPT about bears in space.
[4] The first time I asked it to “tell me a story” it produced a third-rate version of Alice in Wonderland. When I asked why it had chosen this tale it explained that AIW is a much-loved story that many people like. I then asked it to tell me a more original story, which resulted in a C-grade fairytale about a boy who saved his mountainous village from a dragon. Several iterations of this prompting, with the aim of getting it to write something genuinely original, produced even worse plotlines and prose, including a couple of dire attempts to emulate the magical ‘nonsense’ of Jabberwocky. At one point I asked it to try its hand at science fiction, which produced a pastiche about “a warlike and brute-force-oriented” alien race, the Zorgs, encountering a peace-minded and “more technologically advanced” race, the Zorons, with the respective planets eventually bonding. None of this rose remotely to the level of captivating.
[5] “Hallucinating Functional Protein Sequences” was published in the January 26, 2023 issue of Nature Biotechnology. https://www.nature.com/articles/s41587-022-01634-2
In 2019 an earlier paper addressing similar issues, “How to Hallucinate Functional Proteins,” was posted to the arXiv. https://arxiv.org/abs/1903.00458
You write, "As a professional writer I feel no threat whatever."
This calculation would seem to depend upon one's age. Those well into their career may have limited reason for concern. Those entering a writing career may want to look more carefully.
An academic philosopher who teaches for a living recently claimed that these chatbots (I believe he was referring to Jasper) can currently write credible philosophy articles at the undergraduate level. He provided examples, and his claim looked about right to me.
So naturally the question arises: how long will it be until such systems can write credibly at the graduate and postgraduate level? I don't claim to know, but the trend seems clear.