An Embarrassingly Simple Way to Future Proof without Plain Text

Does DT do this automatically or are you running the conversion manually? Seems a bit tedious to convert back and forth, but makes sense for finished documents.

This got me thinking though: Maybe a better backup solution would be to periodically convert all your text notes to other formats, like PDF or HTML with inline styling. First thing that comes to my mind is running pandoc on a schedule using launchd. Might have to look at this myself later :thinking:
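For anyone curious what that could look like, here is a minimal sketch, not a tested backup tool. It assumes pandoc is installed and on the PATH, that the notes are `.md` files, and that HTML and RTF are the backup formats you want. The function names and folder paths are made up for illustration. PDF output would additionally need a PDF engine such as a LaTeX install, so it is left out.

```python
import subprocess
from pathlib import Path


def build_pandoc_commands(notes_dir, backup_dir, formats=("html", "rtf")):
    """Return one pandoc command (as an argument list) per note and target format."""
    notes_dir, backup_dir = Path(notes_dir), Path(backup_dir)
    commands = []
    for note in sorted(notes_dir.rglob("*.md")):
        relative = note.relative_to(notes_dir)
        for fmt in formats:
            # Mirror the notes folder layout under backup_dir/<format>/
            target = (backup_dir / fmt / relative).with_suffix("." + fmt)
            # --standalone asks pandoc for a complete document, not a fragment
            commands.append(["pandoc", str(note), "--standalone", "-o", str(target)])
    return commands


def run_backup(notes_dir, backup_dir, formats=("html", "rtf")):
    """Create the output folders and run every conversion; stop on the first failure."""
    for cmd in build_pandoc_commands(notes_dir, backup_dir, formats):
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)


# Hypothetical folders; uncomment and adjust before scheduling the script.
# run_backup(Path.home() / "Notes", Path.home() / "NotesBackup")
```

A launchd job (or cron) would then only need to run this script at whatever interval suits you. Building the command list separately from running it makes it easy to print the commands and check them before letting the job loose on real notes.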

It is automatic; here is how I have the preference set. Note: an incoming “scan” is any document I’ve scanned or a PDF that I send to DT from other applications.

My primary conversion is from a PDF that I send to DT, which is OCRed automatically as indicated above. For example, I will write a speech or paper in Pages, “save as” a PDF for use when giving the speech and subsequently save the PDF to the inbox in DT, which automatically OCRs the document. I’m only converting my work, not other research articles. Consequently, I only convert a few documents per month.

If I can figure out how, I will find a script for DT that automatically converts documents sent to a particular folder to rich text. From what I’ve read in the comments to this post, rich text may be more future-proof than PDF.

Just keep your RTF file @Bmosbacker , and you’re golden.
RTF has been around 30 years and isn’t going anywhere.
It’s also easily converted to any other format.
But hang on to the RTF, as that’s your source document.

Thanks for all of the encouragement. I think I’m in very good shape and with less unnecessary friction. :blush:

Thus - there is no good way to convert from PDF back to RTF

Why not just keep it as RTF?

Just on the ‘easily create TOC’ thing: MultiMarkdown (and other flavours, no doubt) allows you to do this easily enough. For example, in DEVONthink (DT3), add {{TOC}} to the text and the table of contents will be generated in the preview.

Obviously, what you see in the preview is dependent on the editor/viewer you’re using (and the CSS style sheet): it works not only within DT3, but also with BBEdit’s viewer, IA Writer and Marked, for example.

DT3 also shows you a ‘live’ navigable outline for the source markdown, which is very useful in long documents, and which some specialist apps don’t have (IA Writer, for example). This uses exactly the same Inspector Panel (Table of Contents Ctl-4) which the RTF outline uses.

I’m mentioning this not to dissuade you from using RTF / PDF at all – I understand your logic completely – but just to bring the features to your attention in case you weren’t aware of them as I know you use DT3. A lot has changed with how the program deals with markdown over the last couple of years…

I do sympathise with your desire to simplify as well! I’ve spent so much time on markdown related workflows (and money on the software…) and I’m absolutely sure that if I’d put as much effort into using the built-in DT3 / RTF stuff over the last few years I’d have been a lot more productive. But part of the fun of being retired is that I don’t have to be productive if I don’t want to be, and I’ve had a lot of enjoyment playing with it all…

The main thing holding me back from swapping to RTF is that it isn’t very good on iPadOS (it’s improved in DTTG, but I’m not sure by how much). Plus, I am fairly certain that I’ll want to change back in another day/week/month when the new Shiny.app appears.

Thanks, I learned something new! But the output is not what one would want. Below are screenshots of a markdown document using {{TOC}} in both DT and Word.

As you can see, the TOC is not in a particularly usable form, nor is it set up to update automatically when changes are made to the document, which is a feature in Word and Pages. One would have to go back to the original MD file, make the changes and then export again. So, while it works to create a static table of contents at the beginning of a document, it is not a dynamic one, nor is it formatted well. Of course, there is a high probability that I am unaware of additional steps to make this work, but even if so, it illustrates my point. Whereas in Word and Pages this is a simple process, making this work properly in markdown requires many steps, creating friction in the writing workflow.

I think the issue here is that markdown was created primarily as a web-tool. That is, it is a simple way to write for the web. It was not intended for uses like mine–creating well formatted, long documents with citations, tables, pictures, TOC and more. Can md be made to work–yes, to a degree, but only with a lot of fiddling. If I wrote only for a blog or say technical manuals for the web, md would be ideal. But, much of my writing is more formal, e.g., board reports, or presentation notes, which are often shared later with those attending the workshop, conference and what have you, and often includes citations, graphics, etc.

After months of working with markdown, I’ve concluded that for my needs, a rich text editor/word processor is better.

[Screenshots: the {{TOC}} document in iA Writer, the output in Word, and the same document in DT]

If you want the Word / Pages document to be the canonical source, then of course, you’re right, there’s no point in using markdown anywhere in the stream. I was thinking of your wanting to use it to produce final archival copies easily. The actual format of the TOC is purely a CSS issue, which of course is a source of further complication if you want it to be.

As I said, I’m not trying to change your mind!

Personally, as someone who had to use Word to write complicated commercial documents of hundreds of pages, I’m glad that I don’t have to anymore… But I wouldn’t have used Markdown for that either.

Exactly! Until everyone agrees to use the same editing and viewing software, which will never happen, we will continue to have these discussions and people will continue to have to select the software best suited to the type of work they do and their publication and collaboration needs. :slightly_smiling_face:

If you prefer to use, for example, Microsoft Word, keep in mind that the .doc/.docx formats are considered “acceptable formats” for archival purposes by the US National Archives, and that Word can also use the OpenDocument format, which is a “preferred format”. We still have several options when we want to future proof our documents.

I don’t think you need to fret about what pdf type you’re using. As you note, you’re not running a national archive :joy:

I’ve got PDFs I created on Windows XP many computers, moons and a whole different internet ago and they’re fine on my shiny Mac toys.

And PDF/A is only “future proof” until someone comes along and invents PDF/A+….

My personal view is that PDF, plain text and I suppose RTF will outlive us all, because their myth will mean someone will always keep them running if the tech gods turn their backs on us.

Thanks for the encouragement and I love the cartoon–to DT it goes! :slightly_smiling_face:

Thanks @WayneG. If it is good enough for the National Archives, surely it is good enough for me. And, though embarrassing to admit, I’ve never noticed the OpenDocument Format before. It is a shame that Pages doesn’t offer that option.

Thanks again, you are a constant source of encouragement and a wealth of information!

Though I will keep the Word or Pages version indefinitely, I consider the PDF+text or the RTF version to be canonical–the “fall back” if needed.

For anyone considering this topic, two things to consider:

  1. How many documents do you really need to edit more than, say, a year from now? I’d bet few. So any hardship in converting back to an editable form is not a huge loss.
  2. A Word document is just as long lasting if you treat it as something you want to keep forever, which means keeping an up-to-date copy of Word and periodically re-saving the documents in the current format. That’s what enterprises do (or should do) with backups that need to be kept for 10 years.

You may want to look into what makes PDF/A a “future proof” file type.
That will not change with future developments.

And by the way, they are already at “PDF/A-4f”, or PDF/A-Next, based on PDF 2.0!

funny, this reminds me of

I lean towards markdown and text formats for any of my personal notes and DT pdf for webpages

…as will HTML.

And probably Word as well, given how long it has been used and how widely it still is used.

If I needed to preserve a document with paged layout (without having to edit/cite from the text later), PDF would be my preferred choice.

If I anticipate needing to “process” any parts of it later, I would probably use HTML, since it can cope with multimedia content (videos, even), and I don’t have to worry about compressing images or being unable to otherwise extract them at full quality.

Thankfully, storage for text-based documents is cheap, so one can automate conversion and create/keep multiple formats at once.

I have my doubts about all the formats. If I were the betting sort, I’d bet that one of those formats is gone in 20 years.

Which one? No idea. And that’s the problem.

I met a lawyer in the early 2000s who had floppy disks of adoption records he could no longer access – they were the 5-1/4" floppies, and he couldn’t even find a drive to put them in. (He did eventually, but again, at great effort.) I’m sure if he had thought about it, he would have been comfortable that he could always export the files when he needed to – only he hadn’t, and didn’t think of it until too late.

Maybe 5 years later, I had to jump through incredible hoops to get MS Word to open files I had made in MS Word a few years before (less than 10, maybe about 5). It was Mac-to-PC (or vice versa) as well… but I’m sure there will be other operating systems down the road.

The fewer hurdles there are to reading a file when I need it, the better. I’ll stick with the most cross-platform, most compatible, least proprietary format short of a printout – and that’s plain text.