Entirely agree with this! Bunch of ad-hoc theories on why there's such a tech divide:
The tech is hard: you have to be a bit technical to know what it can do. When you start, you just get a blank page. If you're not technical you're not going to know that ChatGPT can't go sort your image library for you, or send an email. The difference between research and brainstorming and actually agentic stuff isn't that clear for the general public. If you're unlucky and your first idea doesn't work, that's strong evidence that the tech sucks.
Too fast to keep up with: it's more like sport than tech. Opinions from 6 months ago are as irrelevant as me talking about the Premier League as it was in March. This is bloody annoying if you think of yourself as informed, but have other things to do. The tech works fantastically in some ways, poorly in others, and there's no decent way of educating yourself on that other than investing time and effort - which few people have.
Nobody ever lost money saying 'this is shit': most tech doesn't change the world, but somebody somewhere will try to sell you it on the claim it will. Plus a lot of tech ideas will just fail, just like anything else. Being aggressively macho about tech's shitness is a perfectly viable personality/career for a lot of commentators, and they'll drag a lot of less-confident people along for the ride.
Jarvis: people compare AI to perfection, not the status quo. I blame Jarvis and Spielberg. Every single review of anything AI says "it isn't perfect", as if it would be reasonable to expect it to be.
Mistakes: people in tech are used to having to bend computers to their will. They don't always do what you want, and it takes some effort. But if you're not used to that, you see a mistake and assume it's a fatal flaw rather than something to work around.
It must be deliberate: if you don't understand the tech, it is objectively weird to hear 'no human being programmed ChatGPT to tell this kid they should run away from home' or whatever unfortunate thing has happened. It feels like a human must have made a decision at some point down the line, when in fact that's not how it works.
The tech is weird: as you say, modern AI can win math olympiads but fail to count the number of 'r's in 'strawberry'. Those things are both true. If you're technical you can understand that it's a jagged frontier, but if you're not then it's not unreasonable to model it as a human and to think 'can't count letters in words = it's stupid and won't work for harder stuff'.
The usual commentators aren't useful: the de facto commentators in the current zeitgeist are political commentators. But in the nicest possible way, they're often just not well versed in tech stuff. It's not their wheelhouse. But they're the smart go-to people for a lot of political personal-dynamics stuff, and have been for years, so their probably not-so-great opinions carry a lot of weight. Plus their lens will always be techbros and rivalries and political dynamics and stocks and shares etc, all of which is irrelevant to how/whether the tech works.
Daily Mail effect: in the same way that there's always going to be a migrant with violent tendencies, or some awful person claiming child benefit for 12 kids and going on cruises, there's always going to be a schoolkid using it to cheat or some founder using AI to help you have an affair or whatever. These things loom large in the public consciousness and are assumed to be the tip of an iceberg.
AI Council: I think some people have a model of an AI council who direct all use of AI. There was a robotics company recently who were showing off their AI robot that could load the dishwasher, but it was also being livestreamed to a real person who could jump in at any point. Obviously nuts. People model this as 'look what the AI companies are doing now' as opposed to 'one company is using AI to do something mad, but there are probably lots of sensible ones going a bit more slowly'.
Normal tech doomerism: it's obviously built into humans to think that any new tech is going to ruin the kids. AI will stop kids thinking in the same way phones stop them reading in the same way TV stops them going outside in the same way D&D stops them being Christian in the same way Walkmans stop them hearing the world in the same way books stop them engaging with people around them in the same way writing stops them using their memory. Maybe the fears are correct this time, who knows.
Some people get (excessively?) angry about hype: like, everyone finds it annoying when a techbro makes grandiose claims. But some people absolutely spectacularly hate this with every fibre of their being, in the same way that nobody likes paying taxes but there are some who inexplicably find it cosmically enraging. Again, these people get outriders.
Copyright: genuinely a hard problem, but again it gets a lot of macho fuck-AI responses which are off-putting.
Age: the old Douglas Adams quote about anything invented when you're over 35 being an affront to nature is just obviously a thing that happens.
Lefty antagonism: if what you actually want to talk about is unions and fascism, AI gives you plenty of low-hanging fruit to shoehorn your way into new discussions.
Can't think of anything to add but you've nailed it!
Have you ever asked ChatGPT how your use compares to the average user? I did that the other day and it was very illuminating in showing that most people are still using it as a toy, rather than for serious work.
This was a fantastic prompt! It flattered my ego and told me that "you're much heavier and more sophisticated a user than average."
Yes, I got: "You're definitely on the more interesting end of the spectrum" which I choose to see as a compliment.
Is that response, from ChatGPT, one you can check or do you have to take it on trust? I see no reason for its reply to be reliable - just like the coding assistance described in James O's article.
Really interesting. I only occasionally use ChatGPT - I've found it a bit hit and miss - but I like the sound of these use cases.
I think a lot of people who are very anti-AI are that way primarily due to negative polarisation. They don't like the people who are really in favour of it for vibes reasons.
100%. Another example of the phenomenon of people pretending Elon Musk's rockets are bad because he's a bad person.
https://takes.jamesomalley.co.uk/p/stop-letting-elon-musk-break-your?utm_source=publication-search
I mostly use LLMs for coding. But one thing I learnt is that when planning a big thing, you can ask it to ask you questions.
"Here is my vague idea. I need to flesh it out so that I can (do something). Ask me questions to learn what you need to know and then write it up as a proposal/ plan for writing code / ..."
It then asks me questions. Sometimes I know the answer and just spell it out. Sometimes it really makes me think. And at the end I get a well structured plan/summary/whatever which contains a lot of my thinking. I've got a lot of value from that.
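If you drive a model through an API rather than the chat window, the same trick is just an upfront instruction plus a loop. Here's a minimal sketch in Python - the model name, prompts, and 'done' convention are all placeholder assumptions, not a recommendation:

```python
from openai import OpenAI  # assumes the openai SDK and an API key in the environment

client = OpenAI()

# The whole trick: tell the model to interview you BEFORE writing the plan.
SYSTEM = (
    "I have a vague idea that I need to flesh out into a plan. "
    "Ask me questions, a couple at a time, until you know enough, "
    "then write it up as a structured proposal."
)

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Vague idea: a tool that tidies my photo library."},
]

while True:
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)  # placeholder model name
    reply = resp.choices[0].message.content
    print(reply)
    messages.append({"role": "assistant", "content": reply})
    answer = input("your answer (or 'done'): ")
    if answer.strip().lower() == "done":
        break
    messages.append({"role": "user", "content": answer})
```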
I originally came across the idea in this post, about coding, but the broad idea is much more widely applicable.
https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/
This is brilliant. I'm going to try this!
> There are many paths for doing dev, but my case is typically one of two:
> - Greenfield code
> - Legacy modern code
*cries in twenty-year-old C89 codebase*
Fascinating stuff. After initial scepticism, my experience is much the same: I use it all the time (not least because I do prompt design at work).
Are all your prompts at the level you talked about above, or do you find yourself getting better results using a template? (e.g. persona, task, good examples, constraints, etc.)
Also: if I never have to hear the words stochastic parrot again, it will be too soon.
They're pretty much always written just in "conversational" style. Sometimes I'll list things if it is a multi-step thing I want it to do. And I will tell it what NOT to do, especially when precision is required (eg, writing code), but generally pretty casual. I'm actually pretty astonished just how good it is at interpreting what I'm really trying to figure out.
Fantastic piece. Your experience of LLMs is very similar to mine.
Also, this is based on a pretty small sample size, but within my field (statistics / data science) I have found that my colleagues in industry are generally far heavier and more enthusiastic users than those in academia.
There's definitely a huge enthusiasm variance in different industries / crowds. Suspect journalism and academia are particularly hesitant - I'd probably be an AI sceptic by the standards of Silicon Valley. (I think AI is _most_ likely just important on the scale of the internet, not the wheel).
As long as it's not important on the scale of the Great Oxygenation Event...
This was very interesting, thank you for it. I have to confess, I have been struggling to make ChatGPT useful to me, and I will try some of the things you suggested here. But I think one thing you might be missing, in your confusion as to why Molly Jong-Fast (and I) are not having the same positive experience you are having, is that for some of us there is a sensation of cognitive and emotional pain associated with AI use that you do not seem to experience. What I mean is that I find the tool actively annoying, and the experience of having to review and assess its answers unpleasant.

This is no doubt partly just a "change is hard" phenomenon, but I actively enjoy the process of iterating ideas in my own brain and on the page. Equally, I find discussing ideas with colleagues or friends to be a pleasant and meaningful experience. I do NOT find talking to a chatbot fun or meaningful on any level, and this colours my experience probably more than I would like to admit.

As with things like exercise, I suspect that there is a correlation between early positive experience and subsequent expertise. Kids who love doing sports want to play more often, and they get better at it. Those of us who get grumpy at the AIs are not motivated to use them very often and therefore never get to the level of fluency that would make it pay off for us. I do also still have HUGE ethical objections on copyright theft and sustainability grounds, but this is just me responding to the utility question.
That's all fair! And to be clear - I do still enjoy talking to humans!
This is really useful. I barely use LLMs and had been vaguely wondering what all those people were using it for, since if they were all getting something out of it I was probably missing out. Will try some of this!
One caveat on travel though: I found it not-great for restaurant recommendations - on a recent trip I decided to give it a go and most of the ones it suggested had closed down recently. I wondered if news and posts about them shutting down meant they appeared in search results more, which boosted them in the algo. No idea! But it did put me off a bit... Will try again!
I think there's so much variability depending on the prompt - the more recent model updates seem to be better at this, but I include the phrase "look it up" to tell the model to do a web search rather than just rely on its internal memory, which will be wildly out of date from when it was trained. (eg, without web search being triggered, GPT-5 would probably think Biden was still President).
"Spitballing" is a great way to use LLMs. I've been doing a lot about France's governmental problems and it's been useful as an adversary and an ally in figuring out what comes next. Able to find obscure texts about the French constitution and law, and other articles in a variety of languages
We've predicted a few likely outcomes that i've not been seriously discussed in more formal periodicals, which at least has managed to make me a regular contributor in a few obscure cable news shows
I use it a lot for summarising long, complex documents. I have to read a lot of very corporate, jargony pieces, and I find them easier to read once I know the key points. I regularly have arguments with it about grammar (early iterations were surprisingly bad at knowing if it should be 'James and me' or 'James and I'). And if I'm struggling to start a project, I use it for doing a bad job just to give me a starting point - or even an insight into what not to do. It will give me a pretty average example that will kickstart far better ideas.
NB this is for day job work, not my creative writing. I have never found it useful for creative work beyond copy editing and fixing transcripts. Apart from the time I asked it for ideas on what could make a 'whirring sound' on a robot. (I realised I was talking about its disks spinning, which made me sound about 100 years old.)
Interesting. I'm a professional coder, and I've found coding models of very limited use so far. I sometimes use them as a first pass "why isn't this working?" query before bothering my colleagues, and I've generally found that for the extremely specialised work I do they're very unlikely to give me the correct answer, though they sometimes knock my thinking in the right direction ("That can't possibly be correct, because - oh, *right*, that's where I've been going wrong..."). I've also used them a couple of times for knocking up quick scripts in languages where I'm rusty, in situations where I can easily scan the code and verify it's not going to do anything dangerous so it's safe to run it and check whether it produced the output I wanted. For instance, recently I couldn't open a deeply-nested directory, and wanted to know at what point I was blocked, so asked Copilot to write a Bash script to list out the permissions on every prefix of the directory - /foo, then /foo/bar, then /foo/bar/quux, then .... Quicker than looking up Bash array syntax, but that sort of thing is only a small part of my job :-)
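For anyone who wants the same idea without remembering Bash array syntax, here's a rough Python equivalent - a sketch only, with an invented example path:

```python
import stat
from pathlib import Path

def prefix_permissions(path: str) -> None:
    """Print mode bits and ownership for every prefix of `path`,
    to spot which directory along the way is blocking access."""
    p = Path(path)
    for prefix in list(reversed(p.parents)) + [p]:  # root first, target last
        try:
            st = prefix.stat()
            print(f"{stat.filemode(st.st_mode)} uid={st.st_uid} gid={st.st_gid}  {prefix}")
        except PermissionError:
            print(f"(stat denied)                {prefix}")

prefix_permissions("/foo/bar/quux")  # invented example path
```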
One coding task they're very useful for, as you correctly note, is writing code that uses well-documented but gnarly APIs. I too have used them for writing ffmpeg command lines, and they're good for data-munging with NumPy/Pandas or plotting with matplotlib. I've had less success getting them to query OpenStreetMap geodata with the Overpass API, though at least I got a starter query I could manually bash into shape.
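For anyone curious what a "starter query" looks like there, the general shape is something like this - coordinates and tags invented purely for illustration, and very much a sketch to bash into shape rather than a finished query:

```python
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Overpass QL: all cafe nodes within 500 m of a point (central London, picked arbitrarily).
QUERY = """
[out:json][timeout:25];
node["amenity"="cafe"](around:500, 51.5074, -0.1278);
out body;
"""

resp = requests.post(OVERPASS_URL, data={"data": QUERY})
resp.raise_for_status()
for el in resp.json()["elements"]:
    print(el.get("tags", {}).get("name", "(unnamed)"), el["lat"], el["lon"])
```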
Which model were you using to write code, out of interest? Think this is where a reasoning model and longer context window definitely help, as I remember earlier GPTs forgetting crucial stuff as the code iterated.
For non-coding use: the other day I was confused by Nancy Mitford's description of the court of Louis XIV as a "noblesse d'emir" (in an essay linked from The Bluestocking, natch), so I asked Claude. It gave me a good explanation that I'm pretty sure was correct ("the French nobility had become like decorative courtiers in an Eastern potentate's palace, stripped of real power and practical function"), which led into an interesting discussion of how autocratic courts have worked across continents and centuries, then veered into the life and works of Laclos (apparently after writing Dangerous Liaisons, he became a revolutionary).
I also asked it to recommend me Breeders songs, at which it was a total failure. But as far as I can tell there's only one really good Breeders song and I'd already heard it.
Well, you should listen to Last Splash, as there's loads of them on there.
Already done that - there are some decent tracks there (I'm listening to Roi right now!), but IMHO nothing else on the same level as Cannonball. I can see why people would love it, though. Maybe if I'd encountered it at a more receptive time of my life.
*checks*
Claude Sonnet 4, apparently, but I just tried my last query with GPT-5 and it gave essentially the same wrong answer. There doesn't seem to be a "reasoning" option built in to GitHub Copilot (which has the nice feature of being built in to my IDE, so my current open file gets automatically included in the context window and it can suggest edits to my code which I can then accept or reject with a click). I'm fairly restricted in which AI tools I can use on my work machine (for the usual enterprise IP paranoia reasons), but I'll have a look to see what else we have available.
I think you're right that the choice of model is a large factor. GPT-5's auto-routing will hopefully help here. I think a lot of people don't understand - and haven't tried to - basic information on how to use LLMs (and so don't know that you can switch to a reasoning model, what that means, and when you'd want to).
I think that increasingly, Microsoft Copilot will be people's main LLM, at least in work settings. I understand it's based on GPT-5, but it's very reluctant to switch to reasoning mode and the replies are so fast (incl. for long document summary) that I presume it's a smaller model. The answers are certainly worse than I'd expect from standard GPT-5. That'll probably make people think LLMs are less capable than they are.
Fascinating to hear your experience. Like many writers, I have a love/hate relationship with LLMs. No point pretending it doesn't exist, which is the attitude of some. It mapped out a pitching strategy for me for 2026 this week in about half an hour. That would have taken at least a week pre-November 2022. I haven't incorporated it as fully into my workflow as you seem to have, though I feel that time is not far away.
Really enjoyed this!
Cheers!
Interesting! I'm curious about the inaccuracy aspect (I accept that this is related to my being Gen X, and generally crotchety). I feel like combing through an unfamiliar response and trying to spot hidden inaccuracies must be as time-consuming as doing research from scratch? At least when I do my own research, I *know* when I'm being sketchy.
Plus, when you do research from scratch you find things out serendipitously (things that are not directly related to your query, but are interesting to know/enrich your life/make you reassess your query). It's a bit like how the disappearance of paper-based news sources has stopped me reading absolutely random coverage of international business and volleyball, and I kind of miss that random factor.
But yeah, to be fair, I also just don't like new stuff.
I mean, many of the best minds in AI think it's going to genuinely kill everyone. But yes, it makes your holidays better.