I’ve been taking a lot of photography courses over the past five years or so, and have a huge pile of handwritten lecture notes that I’ve been longing to compile into an organized set of digital documents I could use for easy reference. I’ve been putting this project off since forever, because it seemed more daunting and time-consuming than I had any appetite for, and I didn’t think a good digital solution was likely in the offing. My handwriting looks tidy enough, but is hard to read (a colleague once suggested that I might as well write backwards). I’ve never had much success with the handwriting recognition built into note-taking apps, even though it’s gotten pretty good.
Then I heard an interview with History Prof. Mark Humphries on the Hard Fork podcast, during which he described Gemini 3’s astonishing facility at deciphering hard-to-read historical documents, so I thought, “Hmmm … Why not give it a try?” I scanned a couple of pages of notes, fed them to Gemini, and presto! I had a clean and very nearly perfect transcript in a few minutes. Gemini has offered to do all the organizing and compiling for me too, but that’s the part of the project I’m most looking forward to, so thanks, but no.
Now I have a new thing I can do while I’m sitting at my desk on perma-hold with customer service: feeding my notes into my scanner and getting them ready to upload to Gemini.
A note: I’m not sold on Humphries’ suggestion that Gemini’s skill at edge-case handwriting recognition is a sign of “spontaneous, abstract, symbolic reasoning.” But that’s a discussion for another day.
I recently gave Gemini a few pages of handwritten to-dos and it entered the info correctly into my Tasks list. The capitalization could have been better, but so could my writing.
Apparently Gemini and NotebookLM can also transcribe and organize tasks, etc. from audio recordings.
Wow, this is really cool. Thanks for sharing. Have you tried it with any of the others, such as ChatGPT or Claude? Curious to see how they compare.
Do share any additional tips about how you’re going about this. I’ve got a lot of disorganized handwritten notes all over the place. Do you think Gemini could take on the organizing and sorting too?
Loosely related, but recently I tried to have ChatGPT and Claude transcribe audio I had dictated from an M4A file and they couldn’t, in spite of both saying they could! I then thought to try Gemini, and it did a perfect, shockingly fast job!
I used to pay a subscription fee for an app that transcribed podcasts and publicly available panel discussions and could reliably handle what’s known as “speaker diarization” (i.e., correctly identifying who’s speaking). No more. I fed a few audio files into Gemini, asked for diarized transcripts, and cancelled my subscription to the other app the next day.
I should note that I’m using the paid version of Gemini, so it was able to transcribe hour-plus long files.
I’d done very little with any AI model until recently and have just been experimenting with Gemini using photos of handwritten notes, or screenshots of upcoming events posted on a website. I’ve not tried NotebookLM.
Gemini will add the events to my Google Calendar or create to-dos in Google Tasks directly when I ask it to “. . . add these items to my Tasks list.”
I’m using Gemini in my Google Workspace Business Standard account.
I agree. I seldom use ChatGPT at this point (for reasons I posted about earlier). My go-to now is Claude for editing and Gemini for nearly everything else.
Speaking of which, you should take a look at the recently released Claude Cowork (beta I think and only for Pro users for now?). I had it go through my messy downloads folder with a couple thousand files and it worked with me to clean it up and organize it. Pretty impressive POC!
I opened Claude Desktop this morning and lo and behold, there was the Claude Cowork tab. I might give it a test run against a big folder of disorganized research materials. I could have done this task in either Claude Chat (with the filesystem extension installed) or Claude Code directly, so I don’t know if there’s any particular advantage to using Cowork (which is just Claude Code with a friendly user interface).
I had already had Claude give me the steps to index and search my sermon files, but I had not got around to acting on them. With Cowork, Claude stated that it can do it directly without the additional steps. Looking forward to trying it out.
This is where I was a few months ago but I don’t think it’s the case any longer. Claude’s app is just as good as ChatGPT, now better in some ways, and a little more feature rich. The model is, IMO, significantly better for the types of things I do with it.
I’m also on the cusp of leaving ChatGPT for Claude. I did an export of my data from ChatGPT yesterday and I’m going to upload it into Claude today to see how much it can quickly learn about me. It’s a tough one because I’ve sunk a lot of info into ChatGPT, uploaded a lot of files into projects, and set up custom GPTs, but as a coder I can’t ignore Claude Code any longer either, and the whole package now with Cowork in the Pro plan just seems irresistible. I’ve also seen some pretty impressive demos of Claude Code interfacing with Obsidian.
Same boat for me. I’ve got so much material in ChatGPT that I’m loath to abandon it, but if the other models get better then I guess I will have to abandon what I still feel is the better UX.
Claude MCP was the thing that got me using it in parallel for a while now, and I guess its attraction is increasing with Cowork.
I’ve tried Claude Code in the CLI, Claude Code in the desktop app, and Claude Cowork in the desktop app. I’m not sure I perceive much of a difference between Cowork and Claude Code in the desktop when it comes to file organization: they both work. I suspect that the difference might be that Cowork is set up to help me do things while Claude Code is set up to help me make things to do things, if that makes sense.
Based on my experiments so far, both Claude Code and Cowork will organize your files, but they will organize them much better the more explicit you are about what you want. My first prompt was more or less “See this giant folder of files? Organize them and don’t expect any input from me about what goes where.” The categories Claude developed for sorting were actually quite good. The actual sorting itself was pretty scattershot.
I started a new session, pointed Cowork at the newly organized folder, and suggested that there were documents that had been misfiled. Claude agreed and set about trying to sort things where they belonged. The results were better, about the quality I’d expect from a high-school summer intern.
I went back a third time and asked Cowork to re-examine the files and extract those related to a specific organization, and put them in a new folder. This it did quite well, and even sorted the documents it had extracted into logical subfolders.
TLDR: It’s a good tool, but it’s not magic. The clearer you are about what’s there, what you want done, and how you want it done, the better the result. But I’ll never try to organize files entirely by hand again.
I started by asking Claude to analyze the folder and suggest a plan of action, which I refined before it went ahead.
It’s pretty amazing considering how new this is. Makes me pretty optimistic about these tools adding major value pretty soon for a lot of people’s mundane work, and saving human interns from torture.
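For anyone curious what that plan-first idea looks like outside a chat window, here is a minimal Python sketch. To be clear, this is not how Claude or Cowork work internally; it only illustrates the workflow of proposing a plan, reviewing it, and then applying it. The `CATEGORIES` table and the `build_plan` / `apply_plan` names are made up for illustration, and sorting by file extension is a crude stand-in for the content-based sorting an AI tool would attempt.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical rules: file extension -> destination subfolder.
CATEGORIES = {
    ".pdf": "Documents",
    ".docx": "Documents",
    ".jpg": "Images",
    ".png": "Images",
    ".m4a": "Audio",
}


def build_plan(folder: Path) -> dict[Path, Path]:
    """Propose a destination for each top-level file without moving anything."""
    plan = {}
    for item in sorted(folder.iterdir()):
        if item.is_file():
            category = CATEGORIES.get(item.suffix.lower(), "Unsorted")
            plan[item] = folder / category / item.name
    return plan


def apply_plan(plan: dict[Path, Path]) -> None:
    """Carry out a plan that has already been reviewed and edited."""
    for source, destination in plan.items():
        destination.parent.mkdir(parents=True, exist_ok=True)
        if destination.exists():
            continue  # never overwrite; leave the source in place for manual review
        shutil.move(str(source), str(destination))


if __name__ == "__main__":
    # Demo on a throwaway folder so nothing real gets moved.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("notes.pdf", "photo.JPG", "memo.m4a", "mystery.xyz"):
            (root / name).write_text("placeholder")

        plan = build_plan(root)
        for source, destination in plan.items():
            print(f"{source.name} -> {destination.relative_to(root)}")

        # In real use, this is where you would review and edit the plan.
        apply_plan(plan)
```

The point of splitting it into two functions is the same as the point of asking for a plan first: nothing moves until you have looked at the proposal and corrected it.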
I delivered a strategic planning workshop recently where we ended up with a bunch of flipcharts, Post-it notes on walls and handwritten pieces of paper from the exercises we did. I took pictures of the lot and got Claude to transcribe it all to text and, even better, to then summarise everything that people had written and categorise it into the most common themes. This was great and actually gave me a really interesting insight into what people had written, since trying to make sense of reams and reams of random bits of text is quite hard.