GPT-5, out this week, is a strange model. It is supposed to be a deeper research model than its predecessors, yet I'm finding it generally gives very shallow answers. The way to get a better answer is to insult it and tell it that its predecessor models were much better. That prompts a much better result, "deeper" thinking, and better citations.
I tried this trick once with students in a class I was teaching, politely framed as a challenge: "… the class X years ago could do this in a heartbeat …". All it got me was a lower student evaluation rating from that cohort at the end of the semester.
So, while this may work now, be wary of trying it on GPT-6!