Gemini Leapfrogs Rivals – Wall Street Journal

A recent Wall Street Journal article claims that the newest version of Gemini leapfrogs its AI rivals. Because many on this forum do not have access to Apple News+ or the WSJ, I’ve provided the link and some key excerpts below.

I have not spent much time with Gemini yet, but I’m going to give it a try.

The launch of Gemini 3 has handed Google an elusive victory: The company, for the first time in years, has pulled well ahead in the race to develop artificial intelligence.

The release of its latest AI model this week dazzled users who praised its intelligence, accuracy and creative capabilities. On Thursday, the company said Gemini 3 would power a new version of Nano Banana, a popular image-generation tool that has already driven rapid growth in Gemini usage this year.

The success of the new model poses a significant challenge to OpenAI, Anthropic and other startups vying for AI dominance. Gemini 3 outperformed competing models on more than a dozen benchmark tests scoring a range of intelligence categories …

Google aimed to develop Gemini 3 to succeed in some of the most challenging areas of artificial intelligence. The company’s engineers and researchers wanted to improve this model’s ability to “see,” analyze and generate all means of content—text, images, audio, video and code. And they wanted to improve its capacity for thought and reasoning in the interest of building a better personal assistant in coding and other tasks.

After the launch, a table showing Gemini 3’s score on 20 benchmark tests circulated widely online. The model significantly outscored the latest ones from ChatGPT and Anthropic on tests involving expert-level knowledge, logic puzzles, math problems and image recognition. It took second place to Anthropic’s Claude Sonnet 4.5 on a single benchmark involving coding.

2 Likes

Still a ways to go though:

Key takeaways:

  • It’s a great model, as far as LLMs go, topping most benchmarks, but it’s certainly not AGI. It’s haunted by the same kind of problems that all earlier models have had. Hallucinations and unreliability persist. Visual and physical reasoning are still a mess.
  • In short, scaling isn’t getting us to AGI.
  • OpenAI has basically squandered the technical lead it once had; Google has caught up. What happens to OpenAI if Google undercuts OpenAI on price?
  • But the biggest news was buried in the methods: Google got better results than its competitors without using Nvidia GPUs, relying solely on its own TPUs.
  • If Google were to make those TPUs commercially available at scale and at a reasonable price, Nvidia’s dominance would end, price wars would begin, and compute would become a commodity. That would be huge.
4 Likes

Google sells a limited number of hardware items, but it is mainly a service company.

They will be providing Anthropic with access to as many as one million TPUs through Google Cloud services.

1 Like

I’ve been test-driving Gemini 3 both in the Gemini 3 app and in NotebookLM.

The latter has been materially updated, I’m guessing in coordination with the rollout of Gemini 3:

  • You can now use Deep Research to find sources from either the web or Google Drive to add to your notebook.
  • You can now upload more document types as sources, including images, Google Sheets, and Word documents.
  • Note that “images” includes images of hand-written notes. (Big, if NotebookLM can successfully parse them.)
  • NotebookLM will now automatically save your chat history.
  • You can now customize the output from NotebookLM Studio. (Studio can generate audio overviews, video overviews (essentially PowerPoint presentations with an audio track), mind maps, reports, flashcards, quizzes, infographics, and slide decks.)

I don’t use LLMs for coding, so I can’t comment on Gemini 3’s capabilities on that front. Here are some thoughts on the things I’ve poked at thus far:

Guided Learning tool: It’s what I’ve spent the most time playing with to date, and it’s something that promises to be useful and enjoyable. (This tool was rolled out in August, but it’s now powered by Gemini 3, and my guided learning chats feel richer, more current, and more focussed.) I don’t think there’s really an equivalent in either ChatGPT or Claude. The former seems to want to just give you the answers; the latter gives you a very thorough 60,000-foot learning plan that you can work on together, but doesn’t engage in the kind of Socratic probing that Gemini does from the get-go.

Style: Gemini 3’s and Claude’s responses differ in style. Claude likes to answer you in essay-style paragraphs unless you prompt it to do otherwise. Gemini seems to favor discrete blocks of information with snappy heads and sub-heads. For instance, I asked for a timeline of the painter Jasper Johns’ evolution as an artist, and got self-contained sections with titles like “Proto-Pop Breakthrough: Things the Mind Already Knows” and “The Device Era: Objects, Gray, and Anxiety.” Claude is more sober; Gemini is a little bit of a dandy.

Generative UI

Just for fun, I asked Gemini to turn the Jasper Johns timeline into single-page infographics. It described what it created for me as follows:

I’ve created a new React application that reimagines the timeline data as four distinct, highly stylized “one-page” digital posters. Each infographic uses a unique color palette and layout theme to reflect the artistic style of that specific era (e.g., Stencil fonts and primary colors for the Flag era, grayscale and industrial vibes for the Device era).

Note that I didn’t provide it with any instructions as to the infographics’ style–this is what it decided to deliver on its own.

Nano Banana Pro is insane. I fed it the following prompt:

I’d like an infographic about the life and work of photographer Diane Arbus–a one-page explainer that would help someone new to photography understand the evolution of her style and why she was important.

And this is what it gave me:

[Image: Diane Arbus Infographic - Gemini 3 Nano Banana]

I’d quibble a bit here and there–someone new to photography would likely know nothing about Lisette Model or a “Medium Format TLR”, for instance, and I’d definitely send it back with a request for better illustrations for the “Key Subjects & Themes” section–but overall, it’s a decent one-shot effort.

(Note that I am a Gemini Pro subscriber; the free versions of Gemini 3 and NotebookLM may not deliver as much.)

It’s not AGI by any definition, but there are a lot of good tools in its toolbox.

3 Likes

It is not AGI (as others have said).

But in specific areas it has impressive skills.

For general AI queries or brainstorming it is overkill.

I have tried it for summarizing and indexing detailed medical records and it outperforms Claude 4.5 there.