AI & “The Great Language Flattening”

As an educator, I found the premise of this article both intriguing and concerning. I’ve never considered the possibility that AI might flatten, or homogenize, language. It strikes me as boring and a threat to variety, which is said to be the spice of life.

You’ll need a subscription to Apple News+ or The Atlantic to read the article.

2 Likes

I understand and appreciate the overall concern of the article…but their examples are all places where I disagree that using an LLM is a downside. Take, for example, the whole thing about using an LLM to add prose to an email and a colleague using another LLM to distill it into bullet points. I don’t see this as a problem. Especially in a business context, all that matters is that the text communicates the intent and task in a way that’s clear to everyone involved. Anything that removes ambiguity is a net positive in my book.

Where I would find concern is in text that actually requires authenticity. If people start writing their Mother’s Day cards with ChatGPT, that’s a problem.

Maybe I just can’t foresee the long term implications.

1 Like

If someone uses AI to write to me, then I will judge them negatively, whether they’re in a business or not. After all, if they are incapable of performing such a simple, basic skill as writing a coherent email without training wheels, why should I trust them to do anything else?

4 Likes

I am slightly more lenient: if someone uses AI to write to me, they must be ready to stand behind what that AI said. If it gives a factually incorrect answer or a plainly stupid answer, well, it will be them, not the AI, at fault.

3 Likes

(No subscriptions so I did not read the article, but I’m guessing the point being made is one that has been obvious for some time.)

Up until recently, the material used to train artificial intelligences has been real content – writing, photos, and drawings created by human beings. I suspect the amount of AI-generated content on the web is growing by leaps and bounds. AI entities have begun, or will begin, to feed more and more on such generated content, resulting in a feedback loop of idiocy.

2 Likes

I used to be a lawyer and did a lot of work in real estate. Sometimes when I did title searches on old properties, I would romanticize the handwritten deeds. I know it would not be efficient for me to write that way in a business capacity, but I would love to do it at least once. That is where AI comes in: now that AI can create prose on the spot, we will be expected to “write” 20x more. At the same time, the best thing for our brains is “old-school” methods. I do not know how we will move forward.

It may have nothing to do with capability. It may be that we have enough to do without having to spend unnecessary time simply conveying business information to another person. We use AI (GPTs) in nearly every department of our company and we have a rich culture and people who genuinely care about one another. The two are not incompatible.

Mere clickbait.

Of course language changes. It changes in thousands of ways every day, with every interaction between people, and between persons and their environment. Been that way since the first mumble was understood by another hominid.

This is another article in the category of

[such-and-such thing I just noticed] will change the future in (pick one) [horrible ways | marvelous ways], and we need to do [something drastic].

I love the claim that “AI tools” “caused Indian participants to write more like Americans”. Probably totally unprovable, and even if true, so what? Let folks make their own choices.

Katie

1 Like

And how do you expect your clients to know the difference between a highly trained professional seeking to cut time spent, and those who are covering up ignorance and incompetence, if the resulting text is the same?

If the recipient gets obvious AI text, then how are you going to convince them that the material is accurate, and that the author cared enough to check? When you encounter a company whose front line response is AI, the assumption is that they prioritise cost-cutting over effective support and they have to work a lot harder to regain respect. That seems to me to be a good base assumption for assessing AI generated emails / texts.

My point is that the use of Gen AI, with its dubious provenance, and its being shoved into everything regardless of benefits, has a cost in lost trust, and I’m not convinced the convenience to the author outweighs that cost.

To be clear: machine learning clearly has massive benefits in the right place. I remain unconvinced that it’s a useful communication or creative tool once you take loss of reputation into account.

3 Likes

+1

1 Like

It’s all just tools.

“Photoshop means you aren’t a real photographer.”

“Anything but hand tools means you aren’t a real woodworker.”

“GPTs mean you’re ignorant and incompetent.”

Use what you like. :slight_smile:

A tool is a tool, but the use of a tool can be good or bad, and your reputation will suffer accordingly with careless or ill-judged use. Personally, I’m not sure that ‘we don’t care whether our customers trust our communications’ is a viable long term strategy.

Of course, given the amount of money to be made by the Gen AI gangs, this may be a battle that has already been lost.

1 Like

This is a very important issue. A couple of years ago, I read an article (of course I can’t find it now) that said that, if the models kept growing in size and parameters, within a couple of years (which should be about now, give or take 12 months) there would not be enough human-generated text to train them. Resorting to image and video data bought a couple of decades more, but in a way that is not viable in terms of storage and processing power.

But I think that the parameter race will stop soon enough. Now you can download a ~20GB model, run it locally, and the thing can answer, with a good level of authority, questions like “What is Article 3 of the Spanish Constitution?”, “Can I do Domain-Driven Design on a monolith?”, or “How do I fine-tune an LLM on a Mac?”. It’s like they compressed Google or Wikipedia into ~20GB! The race now is about delivering agentic capabilities: making the models reason about the tools needed to perform real-time queries against search indexes, databases, APIs, whatever. More training data will not be that necessary.
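To make that concrete, here is a toy sketch of the agentic pattern I mean: the model picks a tool for a query, the runtime executes it, and the result flows back into the answer. Everything here is made up for illustration; in a real system the keyword stub would be an actual LLM and the tools would hit real search indexes or databases.

```python
def fake_model_choose_tool(query: str) -> str:
    """Stand-in for an LLM deciding which tool a query needs.
    A real agent would prompt the model for this decision."""
    q = query.lower()
    if "constitution" in q or "article" in q:
        return "search_index"
    if "how do i" in q:
        return "docs_lookup"
    return "answer_directly"

# Hypothetical tools; real ones would query a search index, docs, etc.
TOOLS = {
    "search_index": lambda q: f"[search results for: {q}]",
    "docs_lookup": lambda q: f"[documentation matching: {q}]",
    "answer_directly": lambda q: f"[model answer to: {q}]",
}

def agentic_answer(query: str) -> str:
    """One step of the loop: choose a tool, run it, return the result."""
    tool = fake_model_choose_tool(query)
    return TOOLS[tool](query)
```

The point of the pattern is that the model doesn’t need to have memorized everything; it only needs to know which external source to consult, which is why more raw training data matters less.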

Pandora’s box cannot be closed again. But what do we lose from our humanity in the process, since our brains are very much hardwired to be “analog”?

1 Like

I do not use commercial AI/LLMs.

Despite popular opinion, the prose strikes me as bland. I also have ethical concerns regarding plagiarism via software.

That said, language change is going to happen, but I also expect LLMs to become better trained with regard to dialect, usage, and style. It’s an issue closely tied to the basic principle of GIGO (garbage in, garbage out).

Three quotations:

Ye knowe ek, that in forme of speche is chaunge
Withinne a thousand yere, and wordes tho
That hadden pris, now wonder nyce and straunge
Us thinketh hem, and yit they spake hem so.
– Chaucer, Troilus and Criseyde, Book II, ll. 22–25

William Caxton, England’s first printer, comments on language change in the preface to his 1490 edition of the Aeneid:

And certaynly our langage now vsed varyeth ferre from that whiche was vsed and spoken whan I was borne. For we englysshe men ben borne vnder the domynacyon of the mone, whiche is neuer stedfaste but euer wauerynge wexynge one season and waneth & dyscreaseth another season. And that comyn englysshe that is spoken in one shyre varyeth from a nother. In so moche that in my dayes happened that certayne marchauntes were in a shippe in tamyes, for to haue sayled ouer the see into zelande and for lacke of wynde, thei taryed atte forlond, and wente to lande for to refreshe them; And one of theym named sheffelde, a mercer, cam in-to an hows and exed for mete; and specyally he axyd after eggys; And the goode wyf answerde, that she coude speke no frenshe. And the marchaunt was angry, for he also coude speke no frenshe, but wolde haue egges and she vndestode hym not. And thenne at laste a nother sayd that he wolde have eyren then the good wyf sayd that she vndestod hym wel. Loo, what sholde a man in thyse dayes now wryte, egges or eyren certaynly it is harde to playse euery man by cause of dyuersite & chaunge of langage.

In the North of England, an older word for eggs, eyren, persisted, even ninety years after Chaucer’s death.

Thirdly, from Plato, purporting to quote Socrates. Plato puts these words in the mouth of Socrates; I have no idea whether he is accurately depicting Socrates’s views regarding writing, his own views, or something in between.

If men learn this, it will implant forgetfulness in their souls; they will cease to exercise memory because they rely on that which is written, calling things to remembrance no longer from within themselves, but by means of external marks. What you have discovered is a recipe not for memory, but for reminder. And it is no true wisdom that you offer your disciples, but only its semblance, for by telling them of many things without teaching them you will make them seem to know much, while for the most part they know nothing, and as men filled, not with wisdom, but with the conceit of wisdom, they will be a burden to their fellows.

Socrates was talking about writing destroying memory, not AIs destroying writing. Yet the logic regarding the potential changes is similar. What may happen is that “human-crafted prose” acquires greater cultural value than LLM prose. Something similar has happened with the cultural distinction made between an email and a handwritten letter.

If LLMs continue to be trained on stolen text, on freely available text, and on the output of previously trained LLMs, I think we will see an increasing leveling of prose styles in general and of individual styles in particular. LLMs are trained with respect to register (the formal-to-informal spectrum), which is not the same as voice or style.

This is one of many reasons LLM designers are attempting to train with much larger language corpora.

4 Likes

I think that is right. One of the best books I have read on the subject is The Revenge of Analog by David Sax, about how analog methods will continue to trudge along in a more niche manner as we develop a greater understanding of digital downsides.

1 Like

The Atlantic does love to worry about modern life… Consider the first “fact”: Jeremy Nguyen and the 320 people who had to write something twice, before and then after they’d seen an AI-produced version. “We didn’t say, ‘Hey, try to make it better, or more like GPT,’” Nguyen told me. Yet “more like GPT” is essentially what happened. But showing them the GPT version and asking them to write it again is exactly saying “make it more like GPT”. It’s called “demand bias” in research on human subjects, but as Dr. Nguyen received his PhD in “computable general equilibrium modelling”, he may never have been exposed to the basics that any undergrad should learn in an experimental psych course. To be fair, this is an unpublished study, and we’re only given what the author is telling us.

The rest of the story is full of “you could be more susceptible”, “But AI tools could”, “Kirby offered me a hypothetical”, “the linguists I spoke with speculated”, “It’s pretty easy to imagine”, “the proliferation of AI-written or -mediated text may”, “Bender imagines people”, “We might find”.

The only other evidence is a conference paper that found that non-native English speakers who use AI models trained largely on Western inputs sound more like native English speakers than when they don’t use AI models. Why should this be any more of a cause for concern than television, radio, books, or, for that matter, what McDonald’s has done to the world in terms of promoting cultural homogenization?

Alarmist, with very little evidence: typical of The Atlantic these days.

As a lover of language, this doesn’t worry me; it fascinates me. Language is ever-changing; it takes unexpected darts left and right. In a species numbering billions, we will never all speak the same way. All AI will do is mix the pot in a brand-new way.

3 Likes

Personally,

The relationship between me and this hypothetical person using AI to write is key. I don’t particularly care if my colleague in another time zone, whose first language is not English, is using it to try to make sure the point is understood as effectively as possible.

But do I want a hallucinating “agent” to “help” me with my broken something-or-other? Definitely not. Similarly, I’d probably be quite concerned if my lawyer’s communication were primarily via AI.