My AI Robots Are Sparking Joy

My project for the Holiday Interregnum is restoring order to my digital archives and repositories, which have not survived contact with the black hole of endless, overlapping family emergencies and the gazillion documents related thereto. I’m taking a deep breath, granting past me some grace, and setting about getting things ship-shape again for future me.

I wouldn’t have thought this six months ago, but AI has transformed what promised to be a sinkhole of grinding drudgery into a shallow depression of much less grinding drudgery.

I’m using the paid versions of Claude, Gemini, and NotebookLM to tame my files and archives, often in conjunction with DEVONthink.

(A note re privacy: Anthropic and Google will not train their models on your data if you use a paid API. If you use a free API that might not be the case, especially with Gemini.)

BIG WIN 1: Using Claude to turn my DEVONthink email archives into a useful repository for future reference. I routinely load all of the emails I have relating to a particular matter into their own DEVONthink database. While it’s useful to have them all in one searchable place, it’s still something of a challenge to mine the archive for the exact information you need, not to mention deleting what you don’t. Claude + DEVONthink 4’s AI toolset to the rescue! I have one prompt that creates a clean, easy-to-read “transcript” of a selected email chain. Another creates a summary of the chain that lists its date range, the participants, people mentioned, a summary of the main issue at hand, key points, action items, documents referenced, a list of attachments, any contact information, and keywords. Both prompts save the results as markdown files in the same group as the email chain. (Each chain is in its own group.)

BIG WIN 2: I have one DEVONthink email archive that contains 5,000 email messages recovered from an accidentally deleted mailbox. Most of the messages are 5+ years old and unimportant, but at least some of them contain information that should be retained. Claude does a decent job of reviewing them against whatever criteria I provide (e.g., “find messages with account numbers in them”), flagging the matches, suggesting next steps, and telling me how to execute those next steps. Nothing sparks joy like sending old email messages to the shredder.
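For anyone curious what a criterion like “find messages with account numbers in them” might look like as a plain script, here is a minimal Python sketch. It is not what Claude or DEVONthink do internally. The `ACCOUNT_RE` pattern and the `flag_messages` helper are hypothetical, and the rule (any run of 8 to 17 digits, optionally grouped with spaces or hyphens) is an assumption that will over-match things like dates and phone numbers. That is tolerable for a first-pass flag that a human then reviews.

```python
import re

# Hypothetical rule: 8 to 17 digits, optionally separated by single spaces
# or hyphens, counts as a *possible* account number. Expect false positives.
ACCOUNT_RE = re.compile(r"(?<!\d)\d(?:[ -]?\d){7,16}(?!\d)")


def flag_messages(messages: dict[str, str]) -> dict[str, list[str]]:
    """Map each message name to the candidate account numbers found in it.

    `messages` maps a label (e.g. a filename) to the plain-text body.
    Messages with no matches are left out of the result.
    """
    flagged = {}
    for name, body in messages.items():
        hits = ACCOUNT_RE.findall(body)
        if hits:
            flagged[name] = hits
    return flagged
```

Calling `flag_messages({"a.eml": "Your account 1234-5678-9012 is ready"})` would flag `a.eml`. Anything flagged still needs a look before it is kept or shredded.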

BIG WIN 3: Creating a comprehensive summary of a bolus of related legal, financial, tax, and official documents that I can file with the documents themselves. NotebookLM is fantastic for this, but privacy is a concern, even though Google claims your data is not used for model training, shared without permission, or opened for human review unless you provide feedback to troubleshoot or improve the product. If I’m working with something really sensitive, I default to DEVONthink and use my Claude or Gemini API to query the documents and generate summaries. NotebookLM’s big advantage is the array of useful artifacts it can generate, such as reports, data tables, mind maps, and infographics.

BIG WIN 4: Converting old or badly-scanned PDFs into legible text documents that are more amenable to highlighting, annotating, and cutting-and-pasting.

BIG WIN 5: Generating file renaming and management tools I will never build myself.

  • I use Claude to help me craft regular expressions to batch rename files so they conform to specific naming conventions.

  • I use Claude to help me write AppleScripts to do all kinds of helpful little things. Example: I now have a script that changes the dates in filenames to YYYY-MM-DD format to facilitate document filtering and sorting.
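To give a flavor of the regex-plus-script approach described in the two bullets above, here is a minimal Python sketch (Python rather than AppleScript, purely for illustration). It is not my actual script. The `DATE_RE` pattern is a hypothetical one that assumes US-style month-day-year dates such as `3-7-2024` or `03.07.2024`, and `normalize_date` and `plan_renames` are made-up helper names. It only builds a list of proposed renames and never touches the disk, so the plan can be reviewed first.

```python
import re
from pathlib import Path

# Hypothetical pattern: month, day, four-digit year, separated by "-", "."
# or "_". The lookarounds stop it from matching inside longer digit runs.
DATE_RE = re.compile(r"(?<!\d)(\d{1,2})[-._](\d{1,2})[-._](\d{4})(?!\d)")


def normalize_date(name: str) -> str:
    """Rewrite the first M-D-YYYY date in a filename as YYYY-MM-DD."""
    def to_iso(match: re.Match) -> str:
        month, day, year = match.groups()
        return f"{year}-{int(month):02d}-{int(day):02d}"

    return DATE_RE.sub(to_iso, name, count=1)


def plan_renames(folder: Path) -> list[tuple[Path, Path]]:
    """Return (old, new) path pairs for files whose names would change.

    Nothing is renamed here; the caller decides whether to apply the plan.
    """
    plan = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        new_name = normalize_date(path.name)
        if new_name != path.name:
            plan.append((path, path.with_name(new_name)))
    return plan
```

Printing the pairs from `plan_renames(Path("some/folder"))` works as a dry run. Only once the list looks right would you loop over it and call `old.rename(new)`.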

Yes, I do know that I could spin things like these up in [insert the automation app of your choice here], but I don’t have the mental shelf space for that now. (Claude will walk me through what each step of the script or regex does if I’m curious and want to learn more. Gemini wants me to stop with the AppleScript already and learn to use Python on my Mac in what it insists is THE RIGHT WAY. :roll_eyes:)

The obvious caveat is that LLMs make mistakes and you need to double check their work before you do anything important with it. Any artifacts they generate are useful to have as a kind of aide-mémoire, but won’t replace your own careful review or the guidance of a professional. You need to test any scripts before you turn them loose on your documents. Etc Etc Etc.

That being said, I’m delighted with the drudgery I can offload to my Clankers. (A term I use with affection, not scorn.)


Very cool and thanks for sharing! :heart: :smiley:
If you don’t feel like cross-site posting, I will post a link on our forums. Otherwise, you’re welcome to post the same in our neighborhood.

Glad you found the post interesting. Please feel free to post a link on the DT forums. I don’t visit them very often, so if someone has comments or questions, I’m much more likely to see them here.

@krocnyc thanks for posting this! It inspired me to use Claude Code to clean up some Google Takeout exports, split them into smaller .mbox files, and import them into Apple Mail without breaking or slowing down the app. This was something I’d tried in all sorts of ways and was never able to manage (including EagleFiler, which strained under the load). Thanks for the inspo!
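For readers wondering what the splitting step might involve, here is a minimal sketch using Python’s standard `mailbox` module. It is not the script Claude Code wrote. The `split_mbox` name, the output naming scheme, and the default of 1,000 messages per file are all assumptions for illustration. It has not been tried on a multi-gigabyte Takeout export, where the initial scan of the source file could be slow.

```python
import mailbox
from pathlib import Path


def split_mbox(source: Path, out_dir: Path, max_messages: int = 1000) -> list[Path]:
    """Copy messages from one .mbox file into numbered smaller .mbox files.

    The source file is only read. Output files are named like
    "<source stem>-001.mbox"; if one already exists it is appended to,
    so start with an empty output folder.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    src = mailbox.mbox(str(source), create=False)
    parts: list[Path] = []
    dest = None
    try:
        for index, message in enumerate(src):
            if index % max_messages == 0:
                # Current part is full (or this is the first message):
                # close it and start the next numbered file.
                if dest is not None:
                    dest.close()
                part_path = out_dir / f"{source.stem}-{len(parts) + 1:03d}.mbox"
                dest = mailbox.mbox(str(part_path))
                parts.append(part_path)
            dest.add(message)
    finally:
        if dest is not None:
            dest.close()
        src.close()
    return parts
```

A call such as `split_mbox(Path("All mail.mbox"), Path("parts"), max_messages=2000)` would return the list of files it wrote, which can then be imported one at a time.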


I just used Claude Code to find all the (exact) duplicate PDFs scattered across my Mac’s hard drive and move them to the trash bin. Worked like a charm. (Some folders were exempted from the search and destroy mission—e.g., Library folders.)
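For the curious, here is a minimal Python sketch of one common way to find exact duplicates: group files by size, then compare SHA-256 hashes within each group. This is an illustration, not the script Claude Code produced. The `find_duplicate_pdfs` name and the `skip_dirs` parameter are hypothetical. The sketch only reports duplicate groups; it does not move or delete anything. Note that matching on `*.pdf` may miss files with an uppercase `.PDF` extension, depending on the filesystem.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_duplicate_pdfs(
    root: Path, skip_dirs: frozenset[str] = frozenset({"Library"})
) -> list[list[Path]]:
    """Return groups of byte-identical PDFs found under `root`.

    Any file with a folder named in `skip_dirs` somewhere on its path
    (relative to `root`) is ignored.
    """
    by_size = defaultdict(list)
    for path in root.rglob("*.pdf"):
        if skip_dirs.intersection(path.relative_to(root).parts):
            continue
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have an exact duplicate
        for path in same_size:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Each returned group lists the copies of one file. Deciding which copy to keep, and moving the rest to the trash, is deliberately left as a separate, reviewed step.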

An addendum about archiving email in DEVONthink: with the pro version of DT4, it’s now possible to import email attachments as separate documents, which makes locating and working with them a lot more straightforward, even if you’re not using AI. Details here.

This new feature has made tidying up much, much easier. It makes it a snap to find important documents that came in as email attachments, rename them as needed, tuck copies away where they belong, and task the robot with summarizing them, extracting useful information, or whatever.


@DEVONtech_Jim suggested that I share my email prompts over at the DT Forum, which I did.

For those of you who may be interested, I’ll post the same information here.

A note: I had Claude help me refine the prompts so that I can get exactly what I need. I’ve found that working with the bot itself to refine and enhance a prompt really pays off. If you’re interested in using these prompts yourself, you might want to ask your bot of choice how to make them better suit your particular needs. And, if you’re working with long, complex email chains with a number of participants, you should ask the bot how best to “chunk” the task so that you don’t run up against token limits or context window issues; there are a number of ways to do that, and one might be better suited to your needs than another.
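As one illustration of what “chunking” can mean, here is a minimal Python sketch that groups the messages of a chain, in order, into batches that stay under a rough size budget. It is only one of the possible approaches, not a recommendation from any vendor. The `chunk_messages` name, the 8,000-token default, and the four-characters-per-token estimate are all assumptions; real token counts depend on the model.

```python
def chunk_messages(
    messages: list[str], max_tokens: int = 8000, chars_per_token: int = 4
) -> list[list[str]]:
    """Group messages, in order, into chunks under an approximate token budget.

    Token counts are estimated from character counts, which is only a
    rough guess. A single message larger than the budget is not split;
    it ends up in a chunk of its own.
    """
    budget = max_tokens * chars_per_token
    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for text in messages:
        size = len(text)
        # Start a new chunk when adding this message would exceed the budget.
        if current and used + size > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(text)
        used += size
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be summarized separately, with a final pass that merges the partial summaries. The trade-off is that context spanning two chunks can get lost, which is why it is worth asking the bot which approach suits a particular chain.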

Here’s the transcript prompt:

Convert this email thread into a clean chronological transcript. For each message, write:

“On [date], [Name] wrote:”

Then include the message content with all email headers, reply markers (>), signatures, disclaimers, and quoted previous messages removed. Only include the new content each person contributed.

Preserve the natural flow of conversation while removing technical email artifacts.

Here’s the summary prompt:

You are analyzing an email thread. Provide a summary with the following sections:

Date Range: Note the dates of the first and last emails in this chain.

Participants: List each person who sent an email in this thread with their email address and role or organizational affiliation. If role is unknown, note as “role unclear”. Format as bullets.

People Mentioned: List any individuals referenced in the emails who weren’t direct participants, along with their roles or context for why they were mentioned. If role is unknown, note as “role unclear”. Format as bullets.

Context: In 1-2 sentences, state what this email chain is about and the main purpose of the exchange.

Key Points: List the main topics discussed, decisions made, and any disagreements or alternatives considered. Format as bullets.

Action Items: Extract any tasks, commitments, or next steps mentioned, noting who is responsible if specified. Format as bullets.

Outcome: State the current status - is this resolved, ongoing, or awaiting response?

Documents Referenced: List any documents, agreements, contracts, or other materials mentioned in the emails (e.g., “Q3 Budget Report,” “NDA with Acme Corp”). Format as bullets.

Attachments: List the filenames of any documents attached to emails in this chain. Format as bullets.

Contact Information: Extract any phone numbers, addresses, or other contact details shared in the emails. Format as bullets.

Keywords: Provide 5-10 keywords or short phrases that would be useful for searching this thread in an archive.

Ignore email signatures, disclaimers, and quoted text from earlier in the chain unless it’s essential context.

If you have any questions, just ask.


Thank you. This is all very interesting. I am also trying to do some deep organization and I appreciate your thoughts. I have two immediate questions. First, do you separate every project into its own database? Second, do you import or index all of your documents in DEVONthink? Trying to read between the lines, it seems like you import everything, and then also import all emails, presumably to the relevant database and not one (or a few) giant databases. It seems like you have given this a great deal of thought and I would appreciate your input. Thank you in advance.

I have multiple DEVONthink databases, each of which targets a particular area of focus. For instance, I’m on the Board of two small non-profits; each has its own DT database. I handle the administration of some family trusts and estates; each of those has its own database. I have one for my personal taxes; one for my personal legal, financial, and administrative documents; one for account statements and receipts, etc. These are my administrative databases; I import documents into them.

I have two kinds of email archives in DT: the kind where I’ve siphoned off all the emails related to a particular matter and stored them with the relevant documents and the kind where I actually archive entire mailboxes every six months or so. I’ll use DT’s search tools (and now, AI) to weed out the messages I can safely delete and flag the ones I need to save. Once that’s done, I can empty out my Gmail inbox and start fresh. Be sure to read the DT manual on the difference between importing email and archiving it before you try this at home!

[A note on how I manage email: I use Mimestream as the client for my Gmail accounts. I make liberal use of filters to move specific categories of emails out of my inbox and into their own “folders” (scare quotes because that’s what they look like in Mimestream, but that’s not what they are in the Kingdom of Gmail). For instance, I have filters set up for all of the newsletters I’ve subscribed to. The filters categorize them as newsletters, forward them to Readwise Reader, and then move them out of the inbox. I have similar filters to manage emails from vendors, cultural organizations, etc. Before I archive my mailboxes in DT, I make sure I’ve deleted everything caught by these filters, if I haven’t done so already, so that I’m not archiving ephemera.]

I also have a DT database dedicated to my research and learning materials. With a few exceptions, I index my Obsidian vaults and the folders that contain my research and learning materials in this database.

I find DT to be fabulous as a repository, but not a congenial space for writing and note taking: I do most of that elsewhere.
