743: The Current AI Moment

2 Likes

Great episode!

Three things:

If you are averse to paying for any AI services, brush up on your web searching skills and you can find beta or free versions of just about any AI service right now.

These may not stay free and may have more usage limits added, but in these early days all these companies want exposure and eyeballs and are not (yet) focused on monetization and long-term profitability.

Related - the hosts appear not to have tried Microsoft Copilot, which is part of Bing. I started using it almost a year ago because of three benefits:

It is totally free (unlike paying OpenAI for an account)
It uses the latest version of ChatGPT from OpenAI
Microsoft annotates every result with footnotes that link to the original source material.

Maybe it's not as polished as the service mentioned in the podcast, but having an AI-generated narrative search result that includes all the source links has been working great for me.

Also, it may not be obvious, but you can use it as a general purpose generative AI, not just for search.

Any prompt you can submit to ChatGPT you can type into Copilot - not just search requests.

8 Likes

Thanks for yet another enjoyable show!

Some valuable insights from Cal Newport on “why can’t AI be an all-out personal assistant yet?” can be found as a podcast here or in written form here.

Furthermore, if you’re a Setapp subscriber: check out superwhisper as an alternative to MacWhisper.

Fun episode. I really want to hear about Sparky’s use of an LLM to summarize Readwise highlights.

On the Copilot coding front - it’s far from rosy. I’m already helping clients sort out this mess. Linked is a research paper that documents how Copilot is helping to make code quality and maintainability worse: Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality (incl. 2024 projections) - GitClear

Yep, the “easy code” that has low quality and is replicated all over the place is not really the code I want generated.
Instead I focus on Copilot Chat for help with “what does this function do” after I’ve written it, for tests (of course), and for documentation.
For “line completion” it also does a great job, but I mostly keep it at the line level and don’t let it create entire functions.
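To make the “easy code” concern concrete, here is a contrived Python sketch (my own illustration, not from the GitClear report) of the duplication pattern being described: autocomplete happily emits a near-identical validation block in every handler instead of a shared helper, which is exactly the kind of copy-paste growth that hurts maintainability.

```python
# The "easy code" pattern: the same validation block pasted into each handler.
def create_user(data):
    if "email" not in data or "@" not in data["email"]:
        raise ValueError("invalid email")
    # ... create the user ...

def update_user(data):
    if "email" not in data or "@" not in data["email"]:
        raise ValueError("invalid email")
    # ... update the user ...

# The maintainable alternative: extract one helper and reuse it,
# so a fix to the validation rule happens in exactly one place.
def require_valid_email(data):
    if "email" not in data or "@" not in data["email"]:
        raise ValueError("invalid email")

def create_user_refactored(data):
    require_valid_email(data)
    # ... create the user ...
```

Both versions behave the same today; the difference only shows up later, when the validation rule changes and the duplicated copies drift apart.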

1 Like

Could we please clone you and spread this knowledge around? I see many devs who feel pressured to produce more. They use Copilot to help them write more ‘code’.

Agreed. I downloaded the Microsoft Edge browser and use it for the sole purpose of running Copilot within the browser. It’s pretty good, and I like that it gives references.

2 Likes

I tried Perplexity (free tier) straight after listening. It’s very “user-friendly” and well presented, and it often gave surprisingly useful answers, as the hosts said.

For example, I asked it to compare two specific smart telescopes - something I have been researching (the old-fashioned way) recently. I was very impressed with the first summary it gave. It picked up the key points perfectly, succinctly and accurately, with the footnote style references, which cited a lot of the better quality and more in-depth sites I had found. I genuinely think it did a better job than I would have done, and it produced it instantly.

I then asked for some more specifics (e.g. pros and cons of each) and began to notice out-of-date and even plain wrong material creeping in. The errors weren’t just coming from out-of-date material being cited, but also from confusion, e.g. mixing up the specifics of particular models with previous models in the same range. Perplexity uses existing LLMs in the background, so this is not a surprise. As the hosts said, “hallucinations” happen quite a lot.

But Perplexity presents the material so well, and so accurately in context, that it is jarring to realise it is being “a perfect con artist”: hiding significant errors and misinformation amid impressive true results, and presenting everything with a persuasive confidence (and even humility) that is likely to lend it credibility.

I liked the analogy of the intern, but there is a difference: a human intern might not even know what they don’t know yet, so their work and understanding would need checking, yet they would be unlikely to be so impressive and persuasive while being wrong. And if they were, they would quickly be seen as dangerous, and a mentor would try to get them to change.

I came to a different conclusion than David and Stephen: I thought Perplexity was LESS good than many other AI front-ends. I like that it gives references and citations, and it “writes” well, but it’s so credible that it really does have to be better at not presenting false information, or at least at indicating how MUCH good evidence (or not) there is behind each point. The number and quality of citations making the same point matters at least as much as there being a citation at all.

2 Likes

Do you check the references? Beware: sometimes the “facts” it draws from the references are made up.

1 Like