AI is NOT Ready for Prime-Time / AGI & ASI

After working on a book chapter for several hours, I made a simple request of ChatGPT 5 (paid version):

Below is a draft of my chapter. Keeping most of my original wording, add suggested new headings of issues that I should consider including where they naturally belong. Do not add text below suggesting topic headings. Add short transitions before new suggested headings if and as needed. Following the prompt saved in your memory, namely, “When I begin a line with a quotation mark (“), ChatGPT should act as my human editor, not a ghost writer. Preserve my original vocabulary, tone, and voice while improving clarity, grammar, flow, and conciseness. Provide editorial commentary explaining the suggested revisions.”

Here’s an incomplete list of its repeated failures over several hours of trying to get the desired output from the above simple prompt, subsequent clarifying prompts, and answers to the AI’s confirmation questions:

  • Omitted or truncated text in long outputs.
  • Ignored instructions to include the full chapter.
  • Failed to preserve structure, sequence, and transitions.
  • Produced incomplete or incorrect Markdown formatting.
  • Ignored requests to add or re-index headings—the core purpose of the task.
  • Claimed completion while leaving major sections out.
  • Timed out or stopped mid-output without warning.
  • Required repeated prompts to continue.
  • Offered explanations when I only asked for correction.
  • Failed to disclose limitations or risks at the outset.
  • Provided rationalizations instead of direct answers.
  • Applied inconsistent formatting across versions.
  • Lost or mis-sequenced content during reformatting.
  • Repeated errors after receiving corrections.
  • Apologized often—but never repented. It kept repeating the same mistakes over and over. :smile:

Other than that, it was great! :frowning: Today, it wasn’t even a good intern.

I can see it now, AGI and ASI are just around the corner! :joy::wink:

We can sleep peacefully at night. AI is not going to take over the world.

End of rant.

8 Likes

nods

We’re running “AI Experiments” across every department in our company right now. I’d say the most common feedback across all teams is the maddening inconsistency of it. Sometimes the AI will take a request like yours and do it perfectly. Sometimes it seems miles away.

For our purposes, we’re working hard on figuring out how to get more of the good outcomes than the bad, but it’s tough to do when you can’t often even diagnose why it failed.

3 Likes

That is the perfect description. AI will not be able to take over the world because it will not remember how it did something the last time. :rofl:

1 Like

Hi Barrett,

The AI does some crazy stuff, at times, but this time I suspect the trickiness might just be that your prompt isn’t as “simple” or clear as you think it is. I’ve read through it, a few times, and I’m not sure what you’re asking it to do.

If you’re looking for gaps or missing sections, maybe just ask it for those, then after that select the ones you’d like to add, and then ask for a useful transition sentence for each. More steps, but you have more control and precision in your changes.
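The step-by-step approach above can be sketched as a small pipeline. This is only an illustration of the idea; the function names here are hypothetical, and whatever chat API you actually use would consume the prompt strings these helpers build.

```python
# Sketch of the multi-step editing workflow: (1) ask only for gaps,
# (2) the human picks which headings to keep, (3) one focused request
# per chosen heading. These helpers just assemble the prompt text.

def build_gap_prompt(chapter_text: str) -> str:
    """Step 1: ask only for missing topics, nothing else."""
    return (
        "List the topics or issues this chapter should cover but "
        "currently does not. Return a short bulleted list only.\n\n"
        + chapter_text
    )

def build_transition_prompt(chapter_text: str, heading: str) -> str:
    """Step 3: one small, precise request per accepted heading."""
    return (
        f"Write one transition sentence that could precede a new "
        f"section titled '{heading}' in the chapter below. "
        "Preserve my vocabulary and tone.\n\n"
        + chapter_text
    )
```

Step 2 stays with the human: send the gap prompt first, choose the headings you actually want from the reply, then send one transition prompt per heading. Each request is small enough that the model has little room to wander.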

I don’t trust any of the AIs to do big changes over a lot of text. They’re very confident and very clever and very stupid all at the same time - kinda like teenagers, I think.

Hope that helps,

Clarke

1 Like

Hi Paul,

Based on my own use of AI for my writing, I found it’s useful to think of the AIs as being clever and stupid and confident, all at the same time. Think of a Venn diagram with those 3 circles, then assume it can be clever and stupid at the same time. As well as clever and confident. And stupid and confident. Very messy, and the confidence means you need to do a lot of double-checking.

My advice? Tell it to do small but clever steps, one at a time, so that you can learn what it does well, and what it doesn’t do well. And teach everyone that it’s important to tell it “No. That’s not what I want.”

I’m coming to the same conclusion about changes over a lot of text. And that is a great analogy. Made me laugh. :rofl:

This particular chapter draft has 11,200 words. If AI can’t handle 11,000 words on a state-of-the-art computer, I don’t see AGI, let alone ASI, arriving anytime soon, and my guess is, never. I’m not quite prepared to drink the Kool-Aid just yet. :slightly_smiling_face:

Yes, if that prompt was all that I had fed it, I could concur that perhaps the problem was the user rather than the AI. However, I spent several hours with different prompts, additions to prompts, corrections to prompts, answering confirmation inquiries from the AI, and much more. I even consolidated everything into one linear, sequenced, numbered prompt, and asked the AI to summarize what I was asking for. It provided the perfect summary. Then it proceeded to ignore it. :person_shrugging: Very much like a teenager. :rofl:

11,000 words!

That’s longer than my last published book!

The bad news is, that’s only one chapter, chapter 8, out of approximately 33. This is a non-fiction book on school leadership.

I was working on some code this weekend using Copilot in Visual Studio Code, GPT-5 mini. It modified my code without my asking to make those changes. I even asked “did I ask you to make those changes” and it responded that I had not explicitly asked, but it made them because it would make for more robust code.

In the process of having Copilot revert what it had done, it deleted some files I didn’t want lost. That’s on me. I didn’t watch what it was doing closely enough.

This technology really is like an over-eager intern. It can help you out sometimes, but you have to monitor and review everything it’s doing.

3 Likes

I think there’s also something in the design of ChatGPT and Claude chats such that once the AI goes down the path of not complying with a prompt, from our point of view, it works itself into a corner that further prompting won’t get it out of. When that happens I get out of the chat and start over.

Katie

2 Likes

Actually, the “no you didn’t do it right” isn’t always (or even often) a good response.

That seems counter-intuitive, but there’s very little “training” going on at the level where most of us use AI. At best we’re expanding context windows, and “you’re wrong” doesn’t really teach it for the long term as much as guarantee that the next response you get will overcorrect.

The best way I’ve been taught to think about these models is as “BS generators.” They weren’t trained on truth, but on answers that made humans happy. Many times, the truth is what makes you happy, so they aren’t anti truth. But they don’t care about what’s true; they care about what answers make humans happy.

Why I feel good about pushing back (not to boast): We build, train, and do millions of dollars of development work around machine learning and (now) AI. We have one of the leading experts on voice analysis machine learning on our team. We’ve done thousands of hours of experiments with leading (and worse) models.

2 Likes

It did NOT make me happy. :rofl:

3 Likes

Oh, I was just suggesting to tell it when it’s got something wrong so it and you try something different.

Like you would in any conversation with a colleague.

It’s a good way to learn how to ask it better questions.

This is what is known as a category mistake.

LLMs are in no way colleagues. Thinking of them as such assumes they have capabilities they do not.

Nor are LLMs interns.

LLMs should not be thought of as we think of humans.

2 Likes

Yes, it’s good for you to think differently about the prompt. You’re absolutely right to back up and try again (and consider how it may have been confused). But it actually doesn’t help the LLM to “reach back” to the last request and consider how to iterate. It’s better to give it a new prompt without telling it how it did on the last one.

1 Like

AI doesn’t think, not in the human sense of reasoning, learning and getting a broader understanding of the ask.
It can write extremely incorrect code, and as long as it passes its own tests, AI says “it’s working”…

1 Like

It is a simile, with a load of intellectual wiggle room… and one I find very helpful.

I know a lot of programmers who can do that too, to be fair.

Not that I’m defending shitty code from any source, to be clear.

Once again, AI is absolutely changing the game and has been for years.

Language models may not be, but other applications are.

1 Like

Not wishing to be a smart alec here, but have you tried asking an AI to improve your prompt? This can be a good starting point for understanding when you’re not getting the output you expect.

I asked Claude to critique and potentially improve your prompt, and it gave me the following:

Key Problems:

  1. Unclear scope: The prompt mixes two different tasks - adding structural headings vs. line-editing for clarity/grammar. These should be separate instructions.
  2. Contradictory directives: “Keep most of my original wording” conflicts with “add suggested new headings” and “add short transitions.”
  3. Ambiguous instruction: “Do not add text below suggesting topic headings” is confusing - do you mean don’t write content under the headings, or don’t add meta-commentary about them?
  4. Missing context: What type of chapter? Academic? Creative? Business report?

Improved Version:


Purpose: Review my chapter draft and suggest structural improvements while preserving my voice.

Task:

  1. Read through the entire chapter
  2. Identify gaps in coverage - what topics/issues should I address that are currently missing?
  3. Insert [SUGGESTED HEADING: Topic Name] markers where new sections would naturally fit
  4. Add brief transition sentences before suggested headings only when needed for flow
  5. Do NOT write content under the suggested headings - just indicate what’s missing

Editorial approach: Act as a developmental editor, not a copyeditor. Focus on structure and completeness, not grammar or style. Preserve my original wording except for minimal transitions.

Chapter context: [Add: type of document, audience, purpose]


Would you like me to refine this further based on your specific needs?

5 Likes