If you don't know the subject matter, how can you trust AI?

Headline below the fold of today’s St. Paul Pioneer Press newspaper: “AI-powered transcriber has big flaw: It tends to lie.” Here is the original article that the Pioneer Press appears to be republishing: Researchers say AI transcription tool used in hospitals invents things no one ever said | AP News | 2024-10-26

The enthusiasm for so-called Artificially Intelligent LLMs (Large Language Models) is misplaced. This is not news to anyone following AI, but it is finally getting out to the public.

2 Likes

A recent podcast I listened to made the argument that the value of LLMs is not for things one is an expert in, because LLMs are mediocre at everything. And since most people are bad at everything, mediocre is better! The example given was that the LLM was way better at Chinese than the podcaster is.

My counterargument is that I have no means to verify that the Chinese output is valid. The same is true of the output in any domain I have no knowledge of.

So if the LLM is 95% accurate, that means it will be wrong 5 times out of 100. And without domain knowledge I cannot identify which five. There is no way that could be used in a production system.
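To make that concrete, here is a back-of-the-envelope sketch in Python. The 95% figure is just the hypothetical rate from above, and treating each item's error as independent is my assumption, but it shows how per-item errors compound across a document:

```python
# Back-of-the-envelope sketch of how per-item errors compound.
# Assumptions (illustrative, not measured): each item is independently
# correct with probability 0.95, the hypothetical rate from the post.

p_correct = 0.95

for n_items in (1, 5, 20, 100):
    p_all_correct = p_correct ** n_items          # chance the whole batch is clean
    expected_errors = n_items * (1 - p_correct)   # average number of silent errors
    print(f"{n_items:>3} items: P(all correct) = {p_all_correct:6.1%}, "
          f"expected errors = {expected_errors:.2f}")
```

Even at a 5% error rate, a 100-item transcript is almost guaranteed to contain mistakes somewhere (0.95^100 is under 1%), and nothing in the output marks which items are the bad ones.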

These tools are useful, as many on this forum can attest. But the hype from snake oil salesmen and fraudsters like Altman and Musk does all of us a disservice.

3 Likes

It all depends on the error rate you would expect from a human. I would say that using LLMs for matters you happen to be an expert in is the sweet spot.

1 Like

That would be true if we were talking about just any human. No LLM is a domain expert, even when it has been trained on text from a particular domain.

But what if I consult domain experts? If I go to a Chinese speaker for a translation or to a medical provider for treatment of an ailment, their competence is incomparably better than that of an LLM!

To many people, LLMs appear to be domain experts, and they definitely are not.

I’m pretty sure that when AI and the robots kill us all, it won’t be from Skynet-style malevolence or a power grab; it will be because AI and robots can be very, very incompetent in the real world and we’ll have given them too much responsibility and trust.

Beware your Roomba!

1 Like

I just re-read a 1961 short story by Isaac Asimov that nicely illustrates one way we could misplace our trust in thinking machines, and how the day was saved. 🙂

The Machine That Won the War

5 Likes

Very cool! Thanks for sharing. What book are the shots from?

1 Like

I found that link on the website of a teacher at Detroit Catholic Central High School.

It’s also on page 142 of my trade paperback copy of Asimov’s short-story collection Robot Dreams.

1 Like

Wikipedia says this:

1 Like

You can’t, so don’t.

Obviously that’s a blanket statement, and there is certainly some nuance to it. It’s difficult to spell out, maybe, but personally I use AI the way I use Wikipedia: I trust it much more for hard, factual information than for anything that could have an agenda behind it, and is thus more likely to be dishonestly manipulated, especially in the “fake news” world we’re living in. For example, I can accept the details on Wikipedia’s Janka hardness test page ranking the hardness of woods, as opposed to a page detailing an event involving a “controversial” political figure. Or any political figure, for that matter.

I realize that’s not always the same as the “lying” that AI has been shown to do, but the concept is similar: the more open to interpretation the topic is, the more skeptical the user should be.

But that’s just me. YMMV.

See also believing anything you read, hear, or see in the media.

I know aviation quite well, and mainstream media reporting on it is sloppy at best, sometimes misleading, and occasionally just wrong.

I don’t know much about many other domains, and I have to assume the media are just as bad on those.

2 Likes

Yes, anytime I have personal knowledge of a subject, I find numerous errors and misinterpretations in the news coverage.

1 Like