I have no idea why, but I really enjoy watching the Outdoor Boys YouTube channel. It’s mostly a guy named Luke digging holes in the snow—usually in Alaska—and sleeping in them, sometimes with his kids.
Last night, YouTube recommended a long video about a supposed tragedy that had befallen his family.
I watched the first three minutes, and something felt off. So, I asked Perplexity AI about the “tragedy.”
It confidently told me that something terrible had happened and that Luke had stopped making videos a year ago.
Which was odd, because I’ve been watching his new videos every two weeks, on Sunday evenings after my wife goes to bed.
So, the tragedy never happened. But Perplexity was convinced it had. And it wrote about it in a way that sounded completely believable.
When I checked the sources, I saw what had happened: Perplexity must have searched for “Outdoor Boys tragedy” and built its response from whatever it found.
It wasn’t “hallucinating”; it was just good old “garbage in, garbage out”.
This isn’t the first time I’ve seen AI confidently spit out garbage. I’m sure the others do it too. The problem isn’t just that they’re wrong—it’s that they’re wrong in a way that looks convincing.
As an English teacher in a school system that micromanages our assessments so much that they basically want us to be AI, the most useful thing I’ve found is to input the text we’re reading and then say “create questions based on X standard” for whichever standard I’m required to teach. It’s at its best when you give it self-contained prompts. I fed it the text and it formulated standard-appropriate questions.
It’s getting worse. These are some rough notes from my Obsidian Vault:
Searching for “Does Lane Changing Get you there Faster”, I found:
A quick skim led me to believe it was interesting. However, I wanted to know where it got its facts from. It claims a difference of 2%, which seemed very specific for a highly variable activity.
It claims its first source is from U of T. In fact, the lead author is cross-appointed to MIT and Université Laval. Further, the study is about lane-changing skill and choice under cognitive load as we age. So not really the claimed subject. Just in case, I read through the paper. I may not recover those brain cells.
The second source isn’t a paper, despite being cited as one. Instead, it is a web page on temporary traffic control measures. It doesn’t prove anything.
This presents a few problems:
It doesn’t advertise that it is AI-generated, yet it almost certainly is.
Given the prevalence of AI-generated content, we need to verify any claim we want to rely on.
Next-generation AIs will be trained on this hallucinated content and believe it to be true.
I’m still in the camp of not using AI summaries or looking at Google’s AI Overviews. I get that people can find them useful, but the amount of work I would have to do to verify anything makes them a waste of time.
There’s enough misinformation out there as is. I don’t need to add another source of it coming in.
Though, as a Golden State Warriors fan, I will say Google’s AI sure gave a boost to Shaun Livingston’s career stats.
I tried Perplexity some months ago after hearing MacSparky praise it. I liked that it would show me sources, but I asked it some simple questions about things I was researching at the time and its responses were simply terrible and could have been very costly if I had believed them. All delivered with slick and powerful confidence.
GIGO is as true today as it always has been, but with LLMs you have no idea how much of the input to the model was garbage.
I don’t use AI much for “general knowledge,” but I’m finding it very useful for giving it my own data and asking questions about it. I can take a long backlog of notes from teaching or consulting and pull out summaries or examples I’ve long forgotten. I can give it transcripts of meetings I was just in and get key points and action items within seconds. I think the most valuable insight on the recent MPU episode was how important the prompts themselves are. I suspect that may be as true for general-knowledge queries as it is for what I’m doing, especially when it comes to avoiding the garbage. The linked Google Docs in the show notes are well worth reading.
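In case it helps anyone, here is a rough sketch of what that transcript workflow can look like in code. It assumes the OpenAI Python client; the model name, file path, and prompt wording are placeholders of my own, not anything from the episode or the show notes.

```python
# Rough sketch: pull key points and action items out of a meeting transcript.
# Assumes the OpenAI Python client (pip install openai) and an OPENAI_API_KEY
# in the environment; the model name and file path are placeholders.
from openai import OpenAI

client = OpenAI()

# Load the transcript you want summarized.
with open("meeting-transcript.txt", encoding="utf-8") as f:
    transcript = f.read()

prompt = (
    "From the meeting transcript below, list:\n"
    "1. Key points discussed\n"
    "2. Decisions made\n"
    "3. Action items, each with an owner if one was named\n\n"
    f"Transcript:\n{transcript}"
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whatever model you have access to
    messages=[
        {
            "role": "system",
            "content": "You summarize meetings. Only use what is in the transcript; do not invent details.",
        },
        {"role": "user", "content": prompt},
    ],
)

# Print the summary and action items the model produced.
print(response.choices[0].message.content)
```

The system line telling it to stick to the transcript is the kind of prompt detail that seems to matter most; when the model only has your own data to work from, that is what keeps the garbage out.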
As for the non-disclosure of AI generated articles: yes, that’s going to happen more and more. It sucks just as badly as did the outsourcing of articles to unknowledgeable, cheap offshore labor several years ago that was such a big contributor to the overall en%*$:ification of the internet.
We went badly wrong when we started calling this sort of thing Artificial Intelligence and stopped calling it Machine Learning. So-called AI is nothing more than advanced algorithms (programming code) and massive amounts of data. No more intelligence in any qualitative sense than a spell checker.
Just bad timing. When resources are constrained, 4o will tend to return shorter answers from its pre-existing knowledge instead of searching. If you’d scolded it and asked again, it would’ve given the right answer.
That’s valid! Even people who think LLMs are worth using find it annoying to ask multiple times. Reminds me of pre-Google search engines that struggled with relevancy and up-to-date crawls.
AI is bad at a lot of things, but it’s astonishingly good at some things. Part of the adventure is figuring out what’s what. So far it seems to be getting better at the bad things.
Artificial Intelligence is to Intelligence as Artificial Flowers are to Flowers. In both cases, the artificial versions capture some, but not all, of the aspects of the real versions.
In other words, Artificial Intelligence does not equal Intelligence.
And just as flowering plants are a subset of botany which in turn is a subset of biology, so too are large language models a subset of artificial intelligence, which in turn is a subset of computer science.
At a high level, LLMs are extremely sophisticated pattern matchers. But there is no “intelligence” in the vernacular sense of the word. That is why glue is recommended to hold cheese on pizza. A better term than “hallucination” for this type of output is “bullshit” in the sense used by Frankfurt in his work “On Bullshit”, where the output is indifferent to the truth (and given in a very confident manner).
LLMs are an impressive technological achievement, and, as noted on this very forum, are useful for a variety of workflows. It is the inflated claims about their capabilities by the snake-oil salesmen that are the problem, pushing their usage in inappropriate and even dangerous ways.
Quick update: I asked Perplexity the same question (“Did something happen to the Outdoor Boys?”) using its deep search mode, and it told me it was very likely that any rumours were clickbait.
It was interesting to follow its thinking - it did what I did manually, but more thoroughly and a whole lot quicker.
I use AIs frequently, Perplexity mostly, but Grok 3 more recently. One thing is very clear: if the subject is critical, you have to check the sources. Also, it never hurts to repeat the prompt, possibly with some variation, or to check an AI answer with another AI. Hopefully, the AIs will improve with time, but the source is the Internet (and X with Grok), so “garbage in.”
Not sure if it makes any difference what we call it. What’s in a name?