Don’t Fall in Love with ChatGPT! A response

In their article “Falling in Love with ChatGPT? Warum wir soziale Phänomene der Mensch-Maschine-Interaktion und die Gestaltung von Hochschulen als innovative Lernorte und Blended Universities mit KI neu denken sollten” (“Falling in Love with ChatGPT? Why we should rethink social phenomena of human-machine interaction and the design of universities as innovative places of learning and blended universities with AI”, Google-translated version), our colleagues Michael Siegel and Oliver Janoschka correctly diagnose that the discourse on ChatGPT and generative AI at universities has so far been too limited and too strongly focused on purely application-oriented issues. They draw attention to the fact that the use of AI chatbots in particular also has a social dimension and, on the basis of various findings, put forward the thesis that the “dialogical design of the chatbot between machine and human […] sometimes seems to be significantly more attractive than the interaction between human and human […] also from a social point of view”. We are very grateful for this thought experiment, but we would also like to assess the thesis critically and contrast it with another one: If human-machine interactions are indeed perceived as more attractive than human-human interactions, this must be recognised as a serious wake-up call to society and universities. After all, a machine is no substitute for genuine human social interaction and will not become one in the foreseeable future.

The film “her”: A criticism of human-machine relationships

But first, let’s take a step back. The article begins with a reference to the film “her”¹: In the film, the protagonist Theodore falls in love with Samantha, the AI of his operating system. Although the reference is thematically appropriate, the article refrains from establishing the overall cinematic context. At the beginning, the film shows that Theodore works for a company called Beautiful Handwritten Letters, which uses sophisticated technology to produce deceptively real-looking, supposedly handwritten letters in the handwriting of its clients. The clients are thus people who do not want to waste their precious time on something as “mundane” as writing letters: Here, the film gives a first hint that people can use technology to deceive others as well as be deceived by it.

The film does not tell us whether the AI in “her” has an actual consciousness and is therefore a counterpart. However, it does address the deficiency of a human-machine relationship in many instances, such as when Theodore wishes for Samantha’s physical presence while lying in his bedroom or when sexual contact with a surrogate partner leaves him distraught. The film ends with Theodore saying goodbye to the AI. In the final scene, he watches the sunrise over the city head to shoulder with a human partner, his neighbour Amy, who is also disappointed by her relationship with an AI. This ending can be read as a clear indication of the superiority of human-human over human-machine relationships. The article, however, does not give us this contextualisation, and thus its reflection stops in the middle of the film.

The findings

Siegel and Janoschka cite two findings as indications for their thesis. The first is a study by the University of California San Diego, which according to the article proves that in 79 percent of cases patients are more satisfied with the medical answers of ChatGPT than with the answers of real doctors from a qualitative and empathetic point of view. The second is a statement by Ludwig Lorenz, a student Digital ChangeMaker at Hochschulforum Digitalisierung (the German Higher Education Forum on Digitalisation), in a talk with Prof. Dr. Christian Spannagel.

UC San Diego study makes no statement on patient satisfaction

Let’s first look at the study mentioned: It conducted a kind of medical Turing test. The sample was provided by the board r/AskDocs on the social media platform Reddit, where doctors whose medical credentials have been checked (according to the Reddit moderators) answer medical questions from Reddit users in their spare time. From this board, 195 medical questions were randomly selected and fed to ChatGPT for answering. A panel of three “licensed healthcare professionals”² then compared ChatGPT’s answers with those the doctors had previously given on Reddit in a blinded evaluation and preferred ChatGPT’s answer to the doctor’s answer in 79 percent of the cases from a qualitative and empathetic point of view.

A statement on patient satisfaction, as Siegel and Janoschka falsely claim, cannot therefore be derived from the study. In addition to the fact that no patients were interviewed in the study, the sample is not comprehensive enough: presumably the severity of one or more symptoms has an influence on whether one approaches Reddit, one’s general practitioner or specialist’s practice or the emergency room with a medical problem. Moreover, the quality of answers that doctors give voluntarily on Reddit in their spare time does not say anything about the quality of medical answers in everyday or clinical practice.

Side note: Artificial intelligence in patient care

The article’s reference to the use of AI in patient care, which is severely affected by staff shortages, also falls short at this point: here, a distinction must be made between developments that are to be evaluated positively, such as household or similar robots that help people to maintain their autonomy for as long as possible, and “emotion robots” that mimic social interaction in order to supposedly prevent loneliness. The latter are rightly viewed highly critically in both ethics and the nursing sciences. In the context of this article, the question arises whether we really want to give chatbots in higher education the role of such “emotion robots”.

A statement taken out of context

Ludwig Lorenz is quoted, formally correctly, with the following statement:

“I have noticed that sometimes I have my spurts of enthusiasm. But then I can’t really take them out on anyone in my circle of friends, because they are often things that deal specifically with topics that my friends and family might not know much about. But then I write to ChatGPT and say: Hey, I think this and this are really great. And then ChatGPT replies: Wow, it’s really good that you think that’s great. I’ll give you a few reasons why that’s really great.”
Hangout “KI in der Hochschulbildung – Hype oder Innovation?” (original quote at 18:55 min., translation MW)

Without the context, the quote sounds as if Lorenz is seeking confirmation from ChatGPT in matters that his social environment does not understand much about. If you watch the video in its entirety, it becomes clear that Lorenz is talking about ChatGPT in the context of sources of inspiration. Shortly before, he goes into detail about the mistakes ChatGPT makes; immediately after the quoted section, he affirms that ChatGPT is only a statistical model and not a counterpart and that the “conversation” is therefore an illusion. However, the fact that good and meaningful inspirations can also arise from illusions is not a new insight, nor is it proof of the main thesis of the article: for example, dreams have also been considered a source of inspiration since antiquity at least.

Equal opportunities through chatbots?

Apart from inspiration, however, there is still a need for human learning companions who can hold real conversations and ideally have not only knowledge in their respective fields, but real Bildung. Chatbots that merely reproduce sources of knowledge cannot do that. On the contrary, the misconception that chatbots are competent in the field of learning guidance is counterproductive wherever the availability of chatbots could be used as an excuse to cut back on expensive human staff. We already had a very similar debate in the mid-2010s, when some political and social actors wrongly saw digitalisation as a money-saving model for university teaching; fortunately, that argument has since been refuted.

Siegel and Janoschka express the hope that the general availability of chatbots in the role of individual learning companions can contribute to equal opportunities at universities. In our view, this hope will not be fulfilled, simply because a meaningful use of chatbots, whether for inspiration or research, requires a substantial amount of prior education. Ludwig Lorenz also explains this vividly in his talk when he discusses the mistakes in the output of ChatGPT: If you want to learn from such mistakes, you first have to recognise them as such. To build up this prior education, reliable human learning companions continue to be needed, especially in primary and secondary education; only then can the use of chatbots be meaningfully discussed. Of course, this requires effort and money. But those who ignore this and try to use chatbots with adolescents who do not have the appropriate prior education are more likely to exacerbate educational inequalities than to reduce them.

ELIZA or the computer as counterpart?

A debate about chatbots as social interaction partners cannot be held without a reference to ELIZA, the early chatbot developed and presented by IT and AI pioneer Joseph Weizenbaum in 1966, one of the first language-analytical programmes for interaction between humans and computers. The reactions he observed to this programme made Weizenbaum an early sceptic of information technology, which was reflected in his major work Computer Power and Human Reason. From Judgment to Calculation.

The programme, completed in 1966, is based on simple speech analysis and simulates, very effectively for its time, a conversation situation with a psychotherapist. Weizenbaum gives an example of such a “conversation” between a human user and ELIZA (ELIZA’s replies are the lines without closing punctuation):

“Men are all alike.
In what way
They’re always bugging us about something or other.
Can you think of a specific example
Well, my boyfriend made me come here.
Your boyfriend made you come here
He says I’m depressed much of the time.
I am sorry to hear you are depressed
It’s true. I am unhappy.
Do you think coming here will help you not to be unhappy
I need some help, that much seems certain.
What would it mean to you if you got some help
Perhaps I could learn to get along with my mother.
Tell me more about your family
[…]”³

The example shows that the programme is based on a rather simple algorithm. It tries to isolate individual terms from the user’s text line and convert them into a counter question. Where this fails, the programme asks a general question to keep the conversation alive. Weizenbaum’s motivation for the programme was to demonstrate the general functioning of a computer to a broad public.⁴
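The principle described above can be illustrated with a minimal Python sketch. It is not Weizenbaum’s original programme or script: the keyword rules, the pronoun table and the fallback question are hypothetical and chosen only to show how little “understanding” is needed to produce a counter question.

```python
import re

# Hypothetical pronoun swaps so the user's own words can be echoed back.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Hypothetical keyword rules (pattern, reply template); not Weizenbaum's script.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE),
     "Why do you say you are {0}?"),
    (re.compile(r"\bmy (mother|father|family|boyfriend)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
    (re.compile(r"\balways\b", re.IGNORECASE),
     "Can you think of a specific example?"),
]

# General question used when no keyword is found.
FALLBACK = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(line: str) -> str:
    """Turn one line of user input into a counter question."""
    text = line.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Insert the (reflected) matched terms into the reply template.
            groups = [reflect(group) for group in match.groups()]
            return template.format(*groups)
    # No keyword isolated: keep the conversation alive with a general question.
    return FALLBACK


if __name__ == "__main__":
    for user_line in ["Men are all alike.",
                      "They're always bugging us about something or other.",
                      "I am unhappy."]:
        print(user_line, "->", respond(user_line))
```

Even this handful of rules reproduces the pattern visible in the dialogue: the reply is assembled from the user’s own words, and nothing in the programme represents what those words mean.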

What is shocking to Weizenbaum after the publication of ELIZA, however, are the reactions to his programme. The psychiatrist Kenneth Colby, for example, suggests using the programme for therapeutic purposes:

“The human therapist, involved in the design and operations of this system, would not be replaced, but would become a much more efficient man since his efforts would no longer be limited to the one-to-one patient therapist ratio. […] A human therapist can be viewed as an information processor and decision maker with a set of decision rules […]”⁵

The fact that a psychiatrist no longer sees himself as a person who mediates therapy, but as a mechanical “information processor”, and could thus come up with the idea of being able to delegate his work to a computer programme, represents a mechanistic reduction of the human being that is simply incomprehensible to Weizenbaum.⁶

What he finds particularly frightening is how quickly human users are prepared to perceive the computer as an actual conversation partner when talking to ELIZA. His secretary, who had followed the development of the programme for months and is therefore well informed about how it works, asks Weizenbaum to leave the room during a “conversation” with ELIZA – as if it were an actual conversation partner with whom one is discussing intimate details. Weizenbaum is concerned that people seem to be willingly deceived by the illusion of the computer after only a short period of use.⁷

Illusion of a counterpart vs. human counterpart

There are 57 years between ELIZA and ChatGPT, and due to technical progress, the deception described by Weizenbaum as early as 1976 seems much more convincing today than it did then. Nevertheless, there is no reason to assume that computers and software could have developed a consciousness in the meantime and thus advanced to a real counterpart. Computers still function according to the principle of input-process-output. The fact that the “process” sub-step is now much more complex and elaborate than it was in 1966 does not change this.

However, the current debate shows that Weizenbaum’s criticism is more relevant than ever: the danger of human deception through the illusion of a social counterpart, for example in the form of chatbots, is real. At the same time, anyone who has ever tried to hold a longer, serious conversation with an AI chatbot will know the unsatisfactory feeling that the simulated “counterpart” leaves after a short time. Our conviction is that this is due to the ontological difference between a simulation and a real human counterpart: A simulation remains a simulation, no matter how deceptively “real” it may seem. Anyone who tries to understand the chatbot as a social counterpart falls victim to a deception or actively deceives themselves.

Universities: Empathy-free places of knowledge gain?

Finally, we would like to address what the authors themselves call the polemic at the end of the article: that universities in the classical understanding are primarily places of knowledge gain, where empathy is of little importance. In individual cases, this impression of higher education may well be accurate. Here, however, a distinction must be made between empathy as part of social interaction and empathy in the true sense of the word (“compassion”). The latter is indeed not an institutional task of universities.

However, universities have always understood themselves as institutions of presence, as places of discourse that also offer space for human empathy, starting with the disputatio in the medieval university up to today’s understanding of personality development as an essential part of higher education.

Where universities do not live up to this understanding, it is often due to practical constraints such as a lack of resources and staff. Chatbots as a solution to this problem, however, lead in a completely wrong direction: As described, for the meaningful use of chatbots we need more, not fewer, human learning companions who know not only about the opportunities, but also about the inherent risks of AI. Only in this way can AI be used profitably as a creative tool, a source of inspiration or a scientific aid.

What needs to be done

To see AI in the role of a social interaction partner would be a fatally wrong use of AI, which universities should and must counteract. Social interaction is one of those genuinely human domains that absolutely must remain in human hands. Machines are and will remain machines, and not a social counterpart, which, by the way, is confirmed again and again by serious AI researchers outside the often marketing- and investor-driven Silicon Valley bubble.

That is why we urgently need a debate on AI competencies at universities: What competencies do university members, regardless of their discipline, definitely need in order to be prepared and able to participate in an increasingly AI-driven world? How can the debate on AI be raised to a new level, away from the practical application of specific tools such as ChatGPT? Which tasks are genuinely human and should not be negligently delegated to AI? And where and in what ways can AI support us in the future, for example as a source of inspiration, creative tool and research tool?

Therefore, we are currently designing a working group within the Hochschulforum Digitalisierung, which is to start its work in autumn 2023. Among other things, it is to discuss the questions outlined above, develop on this basis a competence grid for AI at universities that applies regardless of subject area, and look for effective ways of transferring the results to all levels of universities. Or, to answer Siegel and Janoschka’s question at the end of the article: Challenge accepted!

1
The fact that the pronoun is written in lower case here rather than capitalised, contrary to what is customary in titles, can be seen as the director’s first hint at the ontological quality of the relationship to her, the artificial intelligence in the film.
2
This term includes doctors as well as nurses, chiropractors or physiotherapists, cf. Licensed healthcare professional definition, https://www.lawinsider.com/dictionary/licensed-healthcare-professional, accessed on 14 July 2023.
3
Cf. Weizenbaum, Joseph, Computer Power and Human Reason. From Judgment to Calculation, San Francisco 1976, 3f.
4
Cf. ibid., 4f.
5
Colby, quoted from: Weizenbaum, Computer Power, 5f.
6
Cf. ibid., 5f., and Id., Wo sind sie, die Inseln der Vernunft im Cyberstrom? Auswege aus der programmierten Gesellschaft (mit Gunna Wendt), Freiburg i. Br. 2006, 97: “Today you can find many variants of ‘Eliza’ on the net, all doing roughly the same thing. Only the purposes are different. There is even a variant in which the programme no longer plays the role of the psychiatrist but that of a priest and, so to speak, receives confessions via computer. Although I am not a Catholic, this idea appalls me. If one really believes that a machine can forgive one’s sins and give absolution, then I really wonder what meaning faith or priestly ordination still have.” (Translation: MW)
7
Cf. Id., Computer Power, 6f.