Next in the series: Thinking about archiving emails.
Some time ago, there was a thread talking about how to save and potentially export emails to files. That got me thinking about exporting my own emails for safekeeping.
Sometime thereafter, I was looking through Stephen Wolfram’s blog and found a posting where he noted that he saved basically every email he has ever received, all of which are readily searchable via his custom search engine (which I would assume was written in the Wolfram Language, but I digress).
In any case, since that point I have been mulling over the idea of saving emails. I do have a huge archive of emails from work, because our email is in Gmail so pretty much everything gets saved, but for my personal email I do not have any real archive: once a message is deleted and erased from Trash, it is pretty much gone.
I find that I am always searching back through email, either because I cannot find where I made a note of something or because I never remembered to record the content of an important email. As a result, it seems to me that it would be beneficial to set up a system to save essentially all my emails in an accessible and searchable format. The question is, how to do this?
I have been toying with a couple of ideas, and would appreciate thoughts.
- Import all emails into DevonThink using the DT plug-in to Mail.app.
Advantages:
– It’s easy
– DT provides an easy enough way to organize things
– I could probably find some way to automatically sort the emails into personal and work-related
– DT displays the email in an easily readable format
– Attachments can be double-clicked to view, and/or could be dragged out of the email but still within DT to make them separate entries
Disadvantages:
– Readily searchable only inside DT
– I do not know if DT’s search will encompass only the email text or if it will magically OCR attachments (Word documents, PDFs) so they are searchable as well. (I know DT can OCR and search a PDF, but that’s a PDF document in DT, not an attachment in an email document. Anyone know the answer to this?)
– Syncing the data across computers requires setting up a DT sync store (not hard).
- Export emails to regular files. I threw together a quick script that takes a file containing a single email message (saved from Mail in .eml format) and parses it. The text component of the email is saved to a text file with all the email headers intact. The HTML component, if present, is saved to an HTML file. All attachments are saved to separate files. Right now the text and HTML files end in .txt or .html, and each attachment gets the same filename with the attachment's own filename appended.
Advantages:
– Email files are readily copyable and shareable, sorted on disk by date and message ID, and (if I have Hazel, for example, automatically OCR the PDFs) indexed and searchable via Spotlight or at the command line with grep.
– Individual attachment files can be easily accessed transparently.
– Sync happens automatically if I put the messages in a folder in my Dropbox-like cloud folder
Disadvantages:
– Yet another script I need to maintain
– Tons of disk files may be harder to readily search vs searching in DT
– Lots of cruft. For example, email messages that have an embedded image (a company logo, say) get those images saved as separate files, which is just messy
– The email message can be easily viewed via QuickLook, or in BBEdit, but isn’t pretty.
– The filenames are basically horrible. The date prefix is OK, but an email's subject often contains too many characters and, while convenient for searching, is just a mess (I do convert spaces and periods to underscores). Including the message ID makes sure that every file has a unique name, BUT makes the filenames incredibly long, impossible to reasonably type, and just basically ugly.
- Save the email as a single .eml file exported from Mail.
Advantages:
– same advantage as in (2) for files on disk, but only one file per message
Disadvantages:
– Since attachments remain in BASE64 encoded form in the email, they are not viewable or searchable at all unless I then decode the email with the script from (2), which makes this pretty much useless.
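For anyone curious, here is a minimal sketch of what a script like the one in (2) could look like, using only Python's standard `email` package. It is an illustration, not the actual script: the function names `base_name` and `explode`, the 40-character cap on the subject, the blanket replacement of every non-alphanumeric run with an underscore, and the 8-character hash standing in for the full message ID are all assumptions, chosen mainly to show one way of shortening the filenames.

```python
import hashlib
import re
from email import policy
from email.parser import BytesParser
from email.utils import parsedate_to_datetime
from pathlib import Path


def base_name(msg):
    """Build 'YYYYMMDD_subject_hash' from the message headers."""
    try:
        stamp = parsedate_to_datetime(msg["Date"]).strftime("%Y%m%d")
    except (TypeError, ValueError):
        stamp = "00000000"  # missing or unparseable Date header
    # Assumption: collapse every run of non-alphanumerics to one underscore.
    subject = re.sub(r"[^A-Za-z0-9]+", "_", str(msg["Subject"] or ""))
    subject = subject.strip("_")[:40] or "no_subject"
    # A short hash of the Message-ID keeps names unique without the full ID.
    msg_id = str(msg["Message-ID"] or "")
    digest = hashlib.sha1(msg_id.encode("utf-8")).hexdigest()[:8]
    return f"{stamp}_{subject}_{digest}"


def explode(eml_path, out_dir):
    """Write the text, HTML and attachment parts of one .eml file to out_dir."""
    eml_path, out_dir = Path(eml_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with eml_path.open("rb") as fh:
        msg = BytesParser(policy=policy.default).parse(fh)
    base = base_name(msg)
    written = []

    # Plain-text body, preceded by all of the original headers.
    text_part = msg.get_body(preferencelist=("plain",))
    if text_part is not None:
        headers = "".join(f"{k}: {v}\n" for k, v in msg.items())
        path = out_dir / f"{base}.txt"
        path.write_text(headers + "\n" + text_part.get_content(), encoding="utf-8")
        written.append(path)

    # HTML body, if the message has one.
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is not None:
        path = out_dir / f"{base}.html"
        path.write_text(html_part.get_content(), encoding="utf-8")
        written.append(path)

    # Attachments: same base name with the attachment's filename appended.
    for part in msg.iter_attachments():
        name = Path(part.get_filename() or "attachment.bin").name
        path = out_dir / f"{base}_{name}"
        path.write_bytes(part.get_payload(decode=True) or b"")
        written.append(path)
    return written
```

Calling `explode("message.eml", "archive")` would return the list of files it wrote. The trade-off with the hash is that the filename no longer contains the real message ID, but the ID is still in the headers at the top of the .txt file, so grep can find it.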
So, what do you do to save your emails? Something like what I am considering in (1) or (2), or something else entirely?