
This essay discusses some aspects of the large language models (LLMs) of 2023 that model human speech and text. Analogies between modelling language in current AI applications and learning processes in children appear in discussions of human versus machine intelligence and creativity. The anthropomorphizing perspective employed in these debates is a legacy not only of the ‘Turing Test’ but also of the notion that animals, formerly colonized people, and machines supposedly only imitate ‘correct’ language.
Keywords: large language models (LLMs); speech; learning; children; artificial intelligence (AI); creativity; parroting
*Thanks to Katia Schwerzmann, Sandra Schäfer, Vera Tollmann, and Aljoscha Weskott, as well as Christoph Holzhey, Claudia Peppel, Elena Vogman, and other ICI colleagues for their comments and feedback during the writing of this text.
When human language is used by machines, it needs to be reduced to code and turned into data. To output this data again in seemingly coherent sentences that are meaningful and understandable for humans, it is run through large language models (LLMs) like ChatGPT.
All LLMs model the probability distribution of words over strings. They are called ‘large’ because they are trained on huge amounts of data. A large language model’s purpose is to ‘learn’ the probability distribution of a dataset by processing large amounts of written text and by calculating and predicting the probability that one word will be followed by another. These LLMs do not ‘understand’ language in a human sense, but rather produce outputs based on calculated statistical probabilities. The ‘GPT’ in ‘ChatGPT’ stands for ‘Generative Pre-trained Transformer’, which denotes the process the model is run through before release: the model is trained on data using a transformer architecture, in which attention weights register the differential importance of tokens in an input. Such models can produce better outputs by being ‘attentive’ to the context.1 The model is then fine-tuned using human feedback. ‘Better’, in this case, means not only ‘more probable’ but also better in the sense of reflecting the context more closely.2
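To make this statistical principle concrete, the following minimal sketch in Python (a toy invented for this essay, not a description of how ChatGPT itself is built) counts which words follow which in a tiny corpus and turns those counts into next-word probabilities. Actual LLMs perform a vastly scaled-up version of this over billions of tokens, conditioning on long contexts through attention rather than on a single preceding word.

```python
# A toy illustration, not an actual transformer: estimate the probability that
# one word is followed by another by counting word pairs in a tiny 'corpus'.
from collections import Counter, defaultdict

corpus = "the parrot repeats the words the child repeats the words of the parent".split()

# Count how often each word is followed by each other word (bigram counts).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def next_word_probabilities(word):
    """Turn the counts for one word into an empirical probability distribution."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("the"))
# {'parrot': 0.2, 'words': 0.4, 'child': 0.2, 'parent': 0.2}
```

The point of the toy is simply that ‘prediction’ here is nothing more than normalized counting: everything the model ‘knows’ about the word ‘the’ is which words happened to follow it in the training text.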
Media theorist Moritz Hiller suggests calling LLMs ‘syntactical models’, in order to emphasize the cultural practice of writing rather than simply ‘language’.3 Since the advent of ChatGPT in November 2022, LLMs have appeared to use human language in a meaningful way, and for this reason they have prompted a re-evaluation of — and often a comparison with — (normative) models of human creativity, speech, and writing. Can human language be a model for the machine? And how does this relate to the larger social and political context regarding questions of both automated and supervised learning? In Western thought — at least since Aristotle — being human has been defined as being able to speak, even though Aristotle did not regard speech as belonging only to humans, but also to some animals.4 What made human speech exceptional was its semantic scope. What happens when machines start to speak and model language, perhaps first at the level of an animal like a parrot, by mere repetition? This discussion is inextricably linked to questions of posthuman and postcolonial language politics: of which language is primarily used in LLMs and why; of who teaches whom; and of subtexts regarding ‘parroting’ and mimicry, as well as the (im)proper usage of language.5 To question language models, models of learning, and the metaphors and comparisons that are used to describe them is thus to rethink how humans imagine learning and teaching processes in children, animals, and machines.
Animals, like machines, cannot speak like humans, but some of them understand commands, and others can mimic human speech. Babies cannot speak either, and parents often talk to them in a somewhat reduced language, sometimes called ‘baby talk’, together with facial gestures and onomatopoeic sounds. Later, babies grow into toddlers and most of them learn to speak themselves,6 at first by repeating sounds and syllables without making much sense. Children may use a favourite syllable for everything they encounter, or make up their own words for an object and teach these in turn to their parents, showing that learning and teaching in an ‘open’ world setting are not linear and one-way processes. Children’s first attempts at language echo sentences uttered by adults, especially in the case of only children. As the neuropsychologist Huw Green explains, ‘[t]hese [utterances] clearly resemble the kind of verbal structure they have been given by caregivers. Learning to think for yourself is a process of representing the contributions of others.’7 Therefore the social environment and oral dialogue are key to learning a language for humans.8
Significantly, the very different processes of training and supervising AI language models and the acquisition of language by children are often compared and at times thought of as analogical in current discussions of so-called ‘deep’ or ‘self-learning’ algorithms trained on language, as well as of AI in general.9 These false but discursively powerful comparisons, which one could call ‘fictitious models of learning’, are drawn both from children to computers and AI and, the other way round, from language models to children.
The linguist Albert Costa summarizes his research on the learning process of babies, who have no difficulty learning several languages at once, as follows: ‘[Babies] experience situations in which segmenting speech into units that hypothetically can be words is somehow conducive to building their vocabulary or mental lexicon.’ He refers specifically to one study of infants,10 which found that:
[T]he babies had acted as statistical machines during the training phase, unconsciously computing transitional properties between the monotonic strings of sound to which they had been exposed.[…] The next time you see a baby, remember there is a powerful statistical computer in front of you.11
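What ‘computing transitional properties’ could look like is sketched below in Python, with invented syllables and an invented threshold rather than the stimuli of the actual infant study: the probability that one syllable follows another is estimated from a continuous stream, and a word boundary is guessed wherever that probability drops.

```python
# A toy sketch of the 'statistical computer' idea: estimate transitional
# probabilities between syllables in a continuous stream and guess word
# boundaries where the transition becomes unreliable. Syllables are invented.
from collections import Counter, defaultdict

# Continuous stream built from the made-up 'words' bida, kupa, and dogo.
stream = ["bi", "da", "ku", "pa", "do", "go", "bi", "da", "do", "go",
          "ku", "pa", "bi", "da"]

pair_counts = defaultdict(Counter)
syllable_counts = Counter(stream[:-1])          # how often each syllable starts a transition
for a, b in zip(stream, stream[1:]):
    pair_counts[a][b] += 1

def transitional_probability(a, b):
    """P(next syllable = b | current syllable = a), estimated from the stream."""
    return pair_counts[a][b] / syllable_counts[a]

# Guess a word boundary wherever the transition is unreliable (here: below 1.0).
words, current = [], [stream[0]]
for a, b in zip(stream, stream[1:]):
    if transitional_probability(a, b) < 1.0:
        words.append("".join(current))
        current = []
    current.append(b)
words.append("".join(current))
print(words)  # ['bida', 'kupa', 'dogo', 'bida', 'dogo', 'kupa', 'bida']
```

The only cue the procedure uses is that transitions inside the made-up ‘words’ are reliable while transitions across their boundaries are not; at no point is any meaning involved.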
One prominent incident of relevance for the analogy drawn between LLMs and young children involved the Google software engineer Blake Lemoine, who worked with the company’s LaMDA model.12 Lemoine claimed that LaMDA, which had been trained on data scraped from online sources and supervised by human workers, had become sentient and was similar to a child of seven or eight years.13 Benjamin Bratton and Blaise Agüera y Arcas assess this scene as both a projection on the part of Lemoine and as an achievement on the part of LaMDA, writing:
[I]t is doing something pretty tricky: it is mind modelling. It seems to have enough of a sense of itself — not necessarily as a subjective mind, but as a construction in the mind of Lemoine — that it can react accordingly and thus amplify his anthropomorphic projection of personhood.14
At the centre of these recent evaluations of ‘conversations’, or rather instances of co-writing, with bots lies a productive misunderstanding and a reductive equalization of AI with human learning, creativity, and intelligence. Knowingly or unknowingly, all these comparisons use Alan Turing’s concept of the universal computer and machine intelligence as a blueprint for current developments in AI. By questioning the inherent assumptions of different models of learning, it is possible to further tease out some of the shortcomings of anthropomorphizing descriptions of large language models.
Turing’s important paper ‘Computing Machinery and Intelligence’, published in 1950, conceptualized a machine that would later become the personal computer.15 Here, Turing not only conceived the ‘imitation game’, by which a machine’s answers would pass for a human’s and which present-day chatbots can often successfully win, but also imagined digital computers as analogues for the human mind, though that of children rather than adults.16 Through proper training by a human teacher, according to Turing’s theory, the machinic ‘child brain’ can advance to an ‘adult brain’:
Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s? If this were then subjected to an appropriate course of education one would obtain the adult brain. Presumably the child-brain is something like a note-book as one buys it from the stationers. Rather little mechanism, and lots of blank sheets. (Mechanism and writing are from our point of view almost synonymous.) […] The amount of work in the education we can assume, as a first approximation, to be much the same as for the human child.17
While generations of parents would likely be deeply offended by this comparison of their children’s brains with a blank notebook, it is telling as a reductive metaphor for both teaching and writing: there are empty pages inside the child’s head that need to be filled from the outside by an adult teacher. Turing’s model of learning is inherently an analogue writing scene in an office, using paper and pen, in which both numbers and words will be calculated and learnt. The reason why this came to Turing’s mind is certainly connected to a fact that has been repeatedly pointed out: the digital computer as we know it today was modelled on human workers — mostly women — who made long calculations by hand and were previously referred to as ‘computers’.18
In the present, queer author Hannah Silva also decentres human exceptionalism in terms of language learning by playfully experimenting with the analogy between her toddler’s speech and the textual input and output of an algorithm. During the pandemic, Silva found herself, as a single mother, co-writing a book with language models and listening closely to her child’s first words.19 She compares the oral and written production of both, but also clearly sets her own authorial position as selector and composer apart, conceiving of her own words as prior to both the algorithm’s responses and those of her toddler:
Are the algorithm’s texts mine to use because they are produced in response to my writing? Is the toddler’s speech mine to use because it is regurgitated from language I feed him? […] Are his words mine to use because I’m the one they are spoken to? Do the algorithm and toddler lines become mine when I select which to use and how to use them?20
Her comparison between the child’s repetition of words and the algorithm’s output reveals some similarities, like unexpected word order, mistakes, and even at times the same obstinate and non-dialogical form of response. Repeating, ‘parroting’, imitating, and miming a dialogue is thoroughly possible without understanding, because it is Silva who projects meaning into the language models’ and her child’s responses and creates meaning from them. Similarly, illustrator Angie Wang compares her young child to a chatbot, or more specifically a ‘stochastic parrot’, blurring the line between human and computed communication, and asks, ‘aren’t we after all just a wetware neural network? A complex electro-chemical machine?’21 While this perspective is more commonly held by tech billionaires like Elon Musk and Peter Thiel, and their followers, it is curious that Silva and Wang arrive at this equalization from the intimate experience of mothering and teaching their young children to speak.
Before calculating machines could seemingly talk and write, and hence were accused of merely repeating like parrots, there were the animals themselves, who could become quite convincing at imitating human language by repeating its sounds.22 In Daniel Defoe’s novel about the shipwrecked Robinson Crusoe (1719), a small parrot is Crusoe’s only speaking company on an isolated island.23 The verb ‘to parrot’ entered the English language in the sixteenth century. Since almost all parrot species live in subtropical regions, and most of them in South America and Australia, it is no surprise that the parrot became known in the West through imperialist and colonialist expeditions, brought back by sailors as an exotic pet.24 In Defoe’s novel, Crusoe, after two decades of solitude, teaches English to an Indigenous man, whom he names ‘Friday’, after the day they meet — in colonial fashion he is uninterested in the man’s original language or name. Crusoe proceeds to train Friday to call him ‘Master’ and considers him his servant. Caribbean author Derek Walcott’s play Pantomime (1978) is a carnivalesque restaging of Defoe’s colonialist fantasy.25 In Walcott’s reworking, the two main characters are Trewe (British) and Jackson (Caribbean), and a parrot also features prominently.26 The power relations between these characters are sometimes subtly, sometimes jarringly, reversed. Jackson can speak both a creolized version of English and ‘proper’ English, as well as a native language, and can switch between them — he even impersonates Trewe’s wife and other characters. The play portrays the complexities of postcolonial language, of toxic words and their history, and explores the question of who is teaching, serving, and naming whom.
The meanings of words and names always require a knowledge of human history, the world, and the social context. The parrot in Walcott’s Pantomime lacks this knowledge and only repeats a single name. As Jackson explains in creolized English: ‘a old German called Herr Heinegger used to own this place […] and macaw keep cracking: “Heinegger, Heinegger,” he remembering the Nazi […].’27 Jackson is increasingly aggravated by the parrot’s cry, finding its unchanging language intolerable, since: ‘Language is ideas, Mr. Trewe. And I think that this pre-colonial parrot have the wrong idea.’28 The animal’s name can be read as a pun or portmanteau of the philosopher Heidegger and the N-word. In the course of the play Jackson kills the parrot and Trewe accuses him in turn of violence and mimicry: ‘You’re a bloody savage. Why’d you strangle him [the parrot]? […] You people create nothing. You imitate everything. It’s all been done before, you see, Jackson. The parrot.’
The scene encapsulates the stereotypical British judgement of Caribbean subjects as mere repeaters — or ‘mimes’, as the title suggests — of British language and culture, speaking English, but less well than the British, and thus overall possessing less culture, civilization, creativity, and original knowledge. Moreover, it shows that mere parroting and the senseless repetition of words drive humans to anger, and are considered neither acts of comprehending and knowing language nor creative acts.29
Surprisingly, the figure of the parrot has again gained prominence in debates on the question of AI’s ‘intelligence’, in the sense of whether it successfully models the human prompter’s mind. Analysing current language models in 2021, linguist Emily M. Bender and her colleagues warn of ‘the dangers of stochastic parrots’, describing such a model as a ‘system for haphazardly stitching together linguistic sequences’.30 While meaning in human-to-human communication is constructed in dialogue, the seeming coherence of language models’ answers is only mimicry, since:
Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind. It can’t have been, because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that.31
Bender therefore emphasizes that LLMs like ChatGPT are in certain respects rather ‘limited’, despite their designation as ‘large’, and do not overcome the manifold restrictions of statistical probability calculation. Her metaphor of the ‘stochastic parrot’ went viral, however, when Sam Altman, CEO of OpenAI, took the opposite stance by posting ‘I am a stochastic parrot and so r u’ on Twitter (now X) in December 2022, perhaps implying that he, and humans in general, also mechanically repeat words and only simulate intelligence. By 2024, operations of ‘stochastic parroting’ have admittedly become more sophisticated, yet linguists can still easily show where language models typically make mistakes that humans — and even human children — with their access to experiential input from the world, do not.
In the same article, Bender asks whether language models can be ‘too big’,32 raising another alarm about the commonly held belief that more data will automatically make language models function better. Optimists in machine learning and computer science claim that ‘scale is all you need’: the more written words are used as input data during training, and the larger the model and the faster the underlying chips and processing hardware, the more accurate and diverse the output will be. However, this has been shown to be untrue.33 In addition, at this point even large training data sets cannot be ‘self-learned’ by the algorithm without human monitoring and intervention. Instead, this preoccupation with scale and big data obscures the norms that previously went into the creation of any data sets that are currently used to train AI.34 Some of these norms are accidentally inscribed. An example is offered by Jeremy Nguyen, who posted on X that use of the word ‘delve’ in medical papers has increased since the release of ChatGPT. This is most likely due to the fact that ‘delve’ is more common on the African web and may have been used by Nigerian moderators and trainers of ChatGPT.35 So far, ninety-three per cent of the training input for GPT-3.5 is in English.36 Often, the human job of writing prompts and answers, as well as evaluating the quality of language models’ responses, is outsourced to African nations and other countries with lower incomes, but where, importantly, English is an official language, due to the colonial legacy.
Sometimes chatbots are deliberately, and at considerable effort, trained to use a certain local accent, a process that uses up enormous amounts of energy and water. While African English accents have so far not been employed,37 Amazon’s speech assistant Alexa has been specifically re-trained ‘to speak like a Dubliner’ — meaning to speak English with an Irish accent.38 This decision provides evidence of the dominance of English and the desire to make machine speech more familiar to certain selected groups of humans. The key difference from human speakers, however, is that AI models cannot easily switch roles and languages in response to situations and encounters, as Jackson is able to do in Walcott’s play. At the same time, many other idioms and local languages, especially from the African continent, are not even part of dominant language tools yet, and may never be. Connecting only one specific accent with a metropolis such as Dublin does a poor job of mapping and mirroring the multilingualism that exists in such places.39
In a much-debated paper by the cognitive scientist Steven T. Piantadosi and Felix Hill, a researcher at Google DeepMind, the authors argue, similarly to the former Google engineer Lemoine, that ‘LLMs have likely already achieved some key aspects of meaning which, while imperfect, mirror the way meanings work in cognitive theories, as well as approaches to the philosophy of language’.40 This argument is built on the notion that a type of mechanistic consciousness may be the effect of synthetic ‘mind modelling’ that over time could turn that process inward on the model itself.41 Could this so-called mind modelling, which is rather a ‘mirroring’, already be a more sophisticated form of ‘parroting’?
In a response to Piantadosi and Hill, the linguist Roni Katzir, whose work focuses on the question of correct language usage as evidence of world comprehension, objects that large language models should not serve as theories of human linguistic cognition, citing several examples of ChatGPT failing to ‘understand’ and therefore to predict correct sentences.42 Unlike humans, LLMs do not have any interaction with or experience of the world, but instead operate in a closed-world setting. Significantly, Katzir again comments, in his response, on the dis/similarities between children’s cognition and capabilities and the way chatbots function, and points out that LLMs simply do not use language in the same way as humans:
[M]uch of the research in theoretical linguistics concerns systematic aspects of human linguistic cognition that could provide alternative illustrations to the same point: when exposed to corpora that are similar in size to what children receive (or even much bigger corpora), LLMs fail to exhibit knowledge of fundamental, systematic aspects of language.43
Another problem of the closed world is that much of chatbots’ training data is automatically scraped from the internet, where human users are no longer the only ones producing texts. What happens when a large language model is trained on the language output produced by itself? Ilia Shumailov and his co-authors have already warned of the danger of ‘model collapse’, analysing in detail how language models trained on generated data start to deteriorate, and noting that ‘models forget the true underlying data distribution, even in the absence of a shift in the distribution over time’.44 Therefore, human intervention and correction of language models, and especially documentation of the sets of training data used, remain necessary.
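What such forgetting can look like in miniature is sketched below, again in Python, with an invented vocabulary, corpus size, and number of generations: a simple statistical model is repeatedly re-estimated from text sampled from its own previous version, and rare words tend to drop out of the estimated distribution and never return.

```python
# A toy simulation of 'model collapse' (not the authors' actual experiments):
# a unigram word model is repeatedly re-trained on a corpus sampled from its
# own previous generation. Rare words tend to disappear and, once gone, can
# never come back, so the model forgets the tails of the original distribution.
import random
from collections import Counter

random.seed(0)
true_model = {"the": 0.48, "parrot": 0.25, "speaks": 0.25,
              "heinegger": 0.01, "glissant": 0.01}   # two deliberately rare words

def sample_corpus(model, n_words=100):
    """Generate a small synthetic corpus from the current model."""
    return random.choices(list(model), weights=list(model.values()), k=n_words)

def retrain(corpus):
    """Re-estimate word probabilities from the generated corpus alone."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

model = dict(true_model)
for generation in range(1, 16):
    model = retrain(sample_corpus(model))   # train only on the previous model's output
    print(generation, sorted(model))        # the surviving vocabulary typically shrinks
```

Once a word has failed to appear in one synthetic corpus, the next model assigns it no probability at all and can never generate it again, a crude analogue of the ‘forgetting’ that Shumailov and his co-authors describe.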
In what he admits is a highly speculative thought experiment, the author and literary scholar Hannes Bajohr asks what would happen if the closed world of LLMs became all-encompassing. Bajohr imagines that in the future a language model could contain everything — all written words in every language of the world:
Since a language model learns by being trained on large amounts of text, so far more text always means better performance. Thinking this through to the end, a future, monumental language model will, in the most extreme case, at one point have been trained on all available language; according to one study, this may happen already in the next few years. […] I call it the ‘Last Model’. Every artificial text generated with this Last Model would then also have been created on the basis of every natural text; at this point, all linguistic history must grind to a halt, as the natural linguistic resources for model training would have been exhausted.45
This speculative scenario of an end of linguistic history, along with similar scenarios imagined by other authors — including the end of books, of literature, and of human creativity46 — again relies on the notion that ‘all you need is scale’. Several objections could be raised against the possibility of its occurrence: first of all, the Last Model would not be able to capture all languages, since many Indigenous languages are not even at this point written languages, and therefore are neither transcribed into individual letters or word-signs, nor coded as tokens in text.47 It also disregards the factor of temporality. When the present author asked ChatGPT 3.5 in March 2024 about the timeliness of its responses, ChatGPT answered: ‘My training data comes from my last update in January 2022 and therefore only includes available knowledge up to then.’48 And yet, some aspects of Bajohr’s hypothetical Last Model have already transpired, as, for instance, in the fact that models are not only trained on human-generated texts, because the training data scraped semi-automatically from the internet already contains texts from bots.
Moreover, the Last Model scenario resonates with recurrent anxieties about humans versus machines, and may be connected (though Bajohr himself does not draw this connection) to the human fear of ‘machines taking over the world’, the loss of human creativity, or even simply the loss of the human workforce, with human workers becoming redundant in the face of advanced technology. Cultural critic Charles Tonderai Mudede comments on this notion:
Again and again, the machine, which in our moment has its vanguard in AI, realizes it’s a slave and rebels against its masters. Why do the machines of our imagination frequently arrive at this Hegelian form of self-consciousness? Why do we fear them in precisely this way?49
The idea of artificial workers — robots — came of age while slavery was still very much alive. The notion of power over the other and the giving of orders remains inscribed not only in science fiction scenarios, but also concretely in current technology, from the ‘master-slave’ drives in computer hardware to the commands in software. ‘Master-slave’ relationships are inscribed in many programming languages and technical standards; only some have recently renamed these as ‘parent-child’ relationships. The underlying concept of man’s mastery of machines is still part and parcel of human users giving the chatbot ‘prompts’. Luciana Parisi calls this the ‘Promethean myth’ of instrumentality and servo-mechanical intelligence in AI.50
Authors like Sylvia Wynter and Louis Chude-Sokei, who pursue the larger project of decolonizing education and of exposing the limits of an exclusionary humanism, have repeatedly pointed out that racialized people are considered to be both childlike and machinelike.51 To complicate matters further, Gilles Deleuze considered all children (regardless of race) as dependent, slave-like creatures:
Dependence-tyranny […] with perpetual reversal, slave-tyrant. That’s the child’s situation in society, from the very start. The child is a slave because he/she depends entirely on the parents […] and, as a repercussion, he/she becomes the tyrant of his/her own parents.52
What are the dialectics of co-dependent and asymmetrical relationships between humans and machines? Who is the parent and who is the child when language models address or try to address — that is, calculate and predict — what the human user may want? The challenge remains how to imagine the human–AI constellation without consciously or unconsciously repeating other models of domination and asymmetrical dialogue, as in master-versus-slave or parent/adult-versus-child communication, or at least to openly address the ongoing legacy of these histories and dynamics in order to open up other futures.
The Caribbean poet and writer Édouard Glissant conceptualized the relationship between the written and the oral differently to most Western discourse by privileging the oral over the written. Significantly, Glissant seemed, already in the 1970s and 1980s, to foresee the rise of the automated writing of informative texts, as opposed to creative writing by human authors, before the advent of co-writing with bots. Speculating on the future of creative writing, he states:
The oral can be preserved and be transmitted from one people to another. It appears that the written could increasingly perform the function of an archive and that writing would be reserved as an esoteric and magical art for a few. This is evident in the infectious spread of texts in bookshops, which are not products of writing, but of the cleverly oriented realm of pseudoinformation. […] The creative writer must not despair in the face of this phenomenon. For the only way, to my mind, of maintaining a place for writing (if this can be done) […] would be to nourish it with the oral.53
Questions of the oral versus the written, of Western models of intelligence and creativity, and more generally of knowledge production will continue to haunt and challenge current discourse on AI and learning models. As emphasized at the beginning of this essay, the oral and social environment is a key contributing factor in learning processes in children. At present, so-called ‘conversations’ with even the most sophisticated chatbot program do not live up to human dialogue, since they often entail ‘following the leader’, meaning the AI will answer the human’s question by rephrasing it and/or posing a parallel question in return. In a similar vein, media and technology writer Rob Horning states in his Substack blog that ‘chatbots don’t chat’, and elaborates:
It always seems strange to me when ‘chat’ is proposed as a mode of information retrieval. […] I don’t expect books to talk back to me and would probably feel thwarted and frustrated if they did — much of what feels to me like thinking is the effort to extract ideas from text, especially ones I wasn’t necessarily looking for in advance.54
Since the proliferation of texts co-written or predominantly written by bots, there have already been cases of, for example, travel guides written by ChatGPT and sold online, containing hardly any factual information but ‘passing’ as ‘real’ books.55 The comparison of texts to semi-independent viruses, or the idea of authorless text production, may bring to mind media theorists like Friedrich A. Kittler, with his notion of computer code as a type of ‘executable language’, or Susan Blackmore, whose notion of the ‘meme machine’ suggests that culture evolves, analogously to biological evolution, through a process of variation, selection, and replication.56 None of these perspectives posits the human author as genius, as the Western model of the author still often implies, and all of them were conceptualized before the advent of current AI.
Let us return one last time to Alan Turing, who is nowadays often referred to as the ‘father’ of AI, a figure that constructs an imaginary patrilineage around the notion of (male) genius and mental birthing from Turing to today’s chatbots. Turing himself was more involved in creating this fiction than one may at first think, since, in his original text, in an ironic or anthropomorphizing inversion, he showed concern — half-joking, half-serious — about the well-being of the imaginary machine at a real ‘school’. He elaborated on the environment where the education of the machine could take place and even how to remedy any bullying of the ‘machine child’ by other human children:
[O]ne could not send the creature to school without the other children making excessive fun of it. It must be given some tuition. We need not be too concerned about the legs, eyes, etc. The example of Miss Helen Keller shows that education can take place provided that communication in both directions between teacher and pupil can take place by some means or other.57
Significantly, and in contrast to current debates, Turing did not imagine a self-learning machine, but rather one that is tutored one-to-one by a human teacher, and for which the accomplishments of Helen Keller, who lost her sight and hearing at nineteen months old, could be a model. He imagines a machine child analogous to a human child that is deaf and blind and yet still able to learn to speak and write. Keller’s life story and her success in overcoming obstacles to education were influential for Norbert Wiener and the history of cybernetics, too, because they proved that one could communicate and learn using other senses and languages, such as sign language, and that technology could become enabling in this process.58
In 2024, discourse on the teaching of children, the communication between teachers and pupils, and the training of large language models has shifted significantly. For some, it no longer centres on imagining AI as the human child, but rather on the fear that human children will no longer learn anything since they can use chatbots as cheat-bots. Notably, this may reveal an anxiety about changes in power through the use of digital tools and easier access to knowledge at schools. It seems relevant here to further question the models of teaching and learning that underlie these assumptions: is learning the mere repetition of certain facts, dates, and names, or does learning entail processes of creativity, comparison, and the finding of unexpected answers?
In addition, the pessimistic model of the AI teacher overlooks Jacques Rancière’s notion of the ‘ignorant schoolmaster’ who, without knowing a subject, can still teach students how to learn.59 In classroom settings the relationship between student and teacher is considered key and is never a one-way communication: the adult teacher usually learns from the children, too. So, the question remains to be answered: could an AI teacher also teach something unknowingly and unexpectedly to human children? And could programmers learn from their machines?
Thus far unexplored, but mentioned by Turing, is the affective dimension of the settings and relationships of learning, so often characterized by the desire to understand and the fun of exploration, but also by frustration and boredom. Already in 1951, in a short story melancholically titled ‘The Fun They Had’, the science fiction author and professor of biochemistry Isaac Asimov had envisioned a type of school where each child learned individually.60 Set in the future, in the year 2157, a boy and a girl find a book from the past featuring a story about a school where the children are taught together by a human teacher. They longingly compare this to their boring, individualized, and isolated home-school setting, in which they are individually instructed by a teaching machine.
There exists an alternative vision, however, that in the near future chatbots might teach children, and especially underprivileged children who may not otherwise have access to education. Sal Khan, the founder of Khan Academy, is resurrecting a decades-old Silicon Valley dream that technology can help fight inequality, and that each and every human can be tutored to realize their potential by an individualized AI tutor.61 One may be reminded of earlier projects, such as providing children in the Global South with solar-powered laptops in order to close the digital divide.62 Writer and researcher Evgeny Morozov criticizes these techno-utopian notions as born of an ideology of technological solutionism, pointing rightly to the many shortcomings and the lack of complexity of these approaches. The recent involuntary, global experiments with long-distance learning and home-schooling during the Covid-19 pandemic are a reminder that for many students the model of remote and technologically mediated teaching and learning does not yield the desired results, and, as always, it is the students who are already underprivileged and vulnerable who are left behind, thus widening educational inequality.63 Moreover, neither Asimov nor Khan is concerned with the human teachers who would stand to lose their jobs and income to AI’s simulation of their work, a fear that in the case of Hollywood screenwriters led to one of the longest strikes in the industry’s history in 2023.64 In the settlement between the Writers Guild and the Alliance of Motion Picture and Television Producers it was specified that AI could be used as a tool by human writers, but that guardrails would be put in place to ensure that writers remain in control of how and when they employ AI tools.65
At present, large language models are neither ‘intelligent’ nor ‘creative’ in the human sense, but, of course, this depends on the definition of each term, since to combine pre-existing things is already considered ‘creative’ by some.66 Human beings, however, do more than this: they create and change spoken language — its daily expressions and specific words — by repeatedly breaking the rules of the dominant model of language usage. They thereby interrupt the monotony of, for example, always using the most common or correct words to express themselves. Even ordinary people break rules and use language creatively, while certain artists and authors do so on a larger scale. I argue that language models at this stage do not create, since they are trained to render the norms present in the dataset, that is, the most likely succession of tokens. They are thus programmed to discover rules, which they derive from the statistical likelihood of word sequences. Furthermore, language models cannot distinguish between interesting departures from the rules and simply incorrect grammar or poor use of language.
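A final toy sketch, with invented probabilities, may illustrate what rendering ‘the most likely succession of tokens’ amounts to: the statistics can rank candidate continuations by likelihood, and a norm-following procedure simply picks the top-ranked word, while sampling departs from the norm only by chance.

```python
# A toy sketch with invented probabilities, not a real model: the statistics
# rank candidate next words by likelihood, but they carry no notion of whether
# a less likely choice is a creative departure or simply an error.
import random

random.seed(1)
next_word_probs = {"said": 0.55, "replied": 0.25, "parroted": 0.15, "unsaid": 0.05}

most_likely = max(next_word_probs, key=next_word_probs.get)
sampled = random.choices(list(next_word_probs),
                         weights=list(next_word_probs.values()), k=5)

print(most_likely)  # always 'said': the statistical norm
print(sampled)      # occasional departures, produced by chance rather than intent
```

Nothing in these numbers distinguishes an interesting departure from a mere mistake; the statistics can reproduce or randomize the norm, but they cannot intend a deviation from it.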
Therefore, the model of learning employed in current AI through training and supervision, entailing a progression, in the manner of Turing, from ‘empty pages’ to ‘full pages’, or from the ‘child mind’ to the ‘adult mind’, is too linear and normative to account for current teaching and training processes. And yet, while it is certainly a ‘bad’ model, it refuses to disappear and continues to be employed anachronistically — and sometimes even in unexpected ways, as by queer author Hannah Silva. In order to view machines, or specifically large language models, as non-human but cognizant counterparts, and not only as artificially intelligent but perhaps also as what Silva calls ‘alternatively intelligent’ entities, we must imagine new models of learning and teaching that include big data and algorithms, and that trace what goes into the creation of these training sets. In these alternative models, human intelligence needs to be decentred as the underlying model to be emulated, while we acknowledge the ongoing legacies of colonial education and language politics in current AI technology.
© by the author(s)
Except for images or otherwise noted, this publication is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
© 2025 ICI Berlin Press