Automate Read-it-later app to archive in DEVONthink

fuzzygel · May 15, 2022, 5:37am

I am trying to build a simple (I hope) automation, either with Shortcuts or Keyboard Maestro (preferred), that triggers either GoodLinks or Instapaper to add articles to DEVONthink as PDFs. However, I cannot find anything suitable in either option.

Has anyone done or come across something similar on the Mac or iPad? I hope this does not involve AppleScript or anything even more complicated.

shandy · May 15, 2022, 9:49am

No need to use Keyboard Maestro or AppleScript. You can automate it entirely from inside DT on Mac.

Add your Instapaper’s RSS feed to DT, and then set a smart rule to automatically move each PDF to the desired location.

I’ve got mine set up with Raindrop.io’s RSS, but it should work the same with Instapaper.

Two screenshots below.

The first shows the DT RSS settings.
The second shows the DT smart rule.

rms · May 15, 2022, 4:36pm

I do this too. The only difference is that I set the RSS feeds to save as Markdown to keep the size down, since text only is good enough for me. If I need the images later, I go back to the website. 99% of the time the site and page are still there, though of course I know the risk of the source disappearing; that’s life.

Also, the smart rules I use to move items are more selective. I do not move everything.

I find that, for my most-read feeds that I often save into DEVONthink, this is a very quick way to do so.

fuzzygel · May 15, 2022, 9:51pm

Thanks both, I’ll give this a try. DEVONthink is so powerful that it is sometimes hard to know about or find all its features.

sunil · May 16, 2022, 9:13am

This is a very timely thread and really helpful post by @shandy. I have a large Instapaper library of articles going back 10 years that I’d quite like to save to DevonThink - I found Instapaper’s ‘clean’ reading mode to be excellent, but want to take advantage of all the features for linking/surfacing knowledge in DevonThink

However, short of exporting them all individually as PDFs from the Web version of Instapaper I’d not found a way of doing this. You can create an Instapaper RSS feed of your articles, but when I played with this I couldn’t get them to show up in DevonThink (I think the RSS feed from Instapaper only has the first few characters, IIRC).

I will need to give this another shot - but I wonder whether, even if this does work for Instapaper as @shandy describes - it will only capture new things that are added to Instapaper from that point on. If so, is there a way of retrieving a decade’s worth of nicely formatted Instapaper articles…?

Edit: I’ve just checked, and the Instapaper RSS feed only provides the 10 most recent articles (it is updated as new items are added) and, per this Tweet from @InstapaperHelp in 2020, it can’t retrieve anything older than this. More importantly, it only provides a short preview of the content - the image below is an example from a CNN article I saved yesterday, viewed within DevonThink, but the same applies to all of the 10 items in my Instapaper RSS feed.

So unfortunately it looks like Instapaper is a bit of a black-hole when it comes to the ‘read it later’ formatted articles. The options to get the formatted full text into DevonThink would seem to be:

Manually click through each article in your Instapaper account, then export it to PDF and import that into DevonThink. You’d need to repeat this process at regular intervals - which makes it fairly impractical.
Retrieve all the Instapaper URLs from your account (you can download these as a CSV or HTML from your account - see below for the CSV format) and then add these to DevonThink some other way. I don’t know DevonThink well enough to know whether it can crawl URLs and retrieve them as PDFs in a read-it-later format.

URL,Title,Selection,Folder,Timestamp
https://www.cnn.com/travel/article/refugee-airplane-mystery-reunion-cec/index.html,Finding Tracy: How a CNN story led to a long-awaited reunion | CNN Travel,,Unread,1652648874
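As a rough illustration of how that export could be read programmatically, here is a minimal Python sketch using only the standard library. It assumes the Timestamp column holds Unix epoch seconds (an inference from the sample row, not something Instapaper documents here), and the function name `read_instapaper_csv` is made up for the example.

```python
import csv
import io
from datetime import datetime, timezone

# The header and sample row shown above, embedded so the sketch is self-contained.
SAMPLE = """URL,Title,Selection,Folder,Timestamp
https://www.cnn.com/travel/article/refugee-airplane-mystery-reunion-cec/index.html,Finding Tracy: How a CNN story led to a long-awaited reunion | CNN Travel,,Unread,1652648874
"""


def read_instapaper_csv(text):
    """Return one dict per saved article (hypothetical helper).

    Assumes the Timestamp column is Unix epoch seconds.
    """
    articles = []
    for row in csv.DictReader(io.StringIO(text)):
        saved = datetime.fromtimestamp(int(row["Timestamp"]), tz=timezone.utc)
        articles.append(
            {
                "url": row["URL"],
                "title": row["Title"],
                "folder": row["Folder"],
                "saved": saved,
            }
        )
    return articles


articles = read_instapaper_csv(SAMPLE)
for article in articles:
    print(article["saved"].isoformat(), article["folder"], article["url"])
```

Under that assumption the sample row decodes to 15 May 2022 (UTC), which fits an article saved the day before this post. The resulting list of URLs would still need some other tool to turn each one into a PDF.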

rms · May 16, 2022, 10:14am

I suspect that asking Instapaper how long they keep their older feeds would help you figure this out. It also depends on the source that Instapaper points to, I would think. If you try pointing the DEVONthink feed to the source RSS and not the Instapaper feed, do you get a different result?

sunil · May 16, 2022, 10:17am

Sorry @rms - I was just editing my earlier post.

The issue is that Instapaper’s feed only has 10 items, and only provides a preview of content (rather than full text). The articles are all ones that I’ve saved to Instapaper over the years using their bookmarklet or extension - so there is no other source RSS I could use to get them into another client.

I’ve made a couple of suggestions of how to work around this, albeit manual/clunky, in the post above. Doesn’t seem to be a good, automated way of doing it, sadly. I’m definitely going to have to rethink using Instapaper as the ‘prettification’ service - which is a shame, I’d hoped I could simply use that as an intermediary to DevonThink.

fuzzygel · May 16, 2022, 10:19am

So instead of Instapaper, is there any other read-later app that works better with DEVONthink - Goodlinks, Matter, UpNext, Pocket, etc.?

rms · May 16, 2022, 10:31am

I suspect anything that has a valid RSS feed, be it read-later or read-now, will work with DEVONthink, but you have to try. I have used Feedly for years and it’s connected to my “Reeder” app. I can’t find a way for Feedly to expose what it has as an RSS feed for DEVONthink to use. What I’ve done, for my “very frequent and important” feeds, is use the RSS feed:// URL noted in Reeder to add that feed to DEVONthink. Feedly is not involved and, as I mentioned, I can’t find a way to involve it.

For the other read later apps, look at their web site and read their documentation to see if they expose RSS feeds for you.

shandy · May 16, 2022, 11:29am

For your archive of older Instapaper articles, there’s an AppleScript by Annard Brouwer which can batch save/import them as PDFs. I’ve pasted the script below, but it’s also linked to here:

-- Use Instapaper to export a CSV file of your articles.
-- Must have Numbers to open it in.
-- Will create PDF documents in /Instapaper/<your folder in Instapaper> groups.
-- Be sure to select an open database in DT Pro before you run this.
--
-- Created by Annard Brouwer, 24/08/2014
-- Share and enjoy!

property kCSVFileType : "csv"
property kRootGroupName : "Instapaper"

tell application "Numbers"
	local csvFile, csvDocument, cellRange, aRow, loadURL, theTitle, theGroup
	
	set downloadsFolder to path to downloads folder from user domain
	tell application id "DNtp"
		set csvFile to choose file with prompt "Select a csv file exported from Instapaper" of type kCSVFileType default location (downloadsFolder as alias)
	end tell
	
	set csvDocument to open csvFile
	
	tell csvDocument
		tell first sheet
			tell first table
				try
					set cellRange to cell range
					my windUp(count of rows)
					repeat with aRow in rows
						if address of aRow > 1 then
							tell aRow
								set loadURL to the value of first cell
								set theTitle to value of second cell
								set theGroup to value of fourth cell
								my importArticle(loadURL, theTitle, theGroup)
							end tell
						end if
					end repeat
					my windDown()
				on error errorMessage number errorNumber
					my windDown()
					if the errorNumber is not -128 then display alert "DEVONthink" message errorMessage as warning
				end try
			end tell
		end tell
	end tell
end tell

on windUp(numberOfArticles)
	tell application id "DNtp"
		activate
		show progress indicator "Archiving Instapaper articles..." steps numberOfArticles
	end tell
end windUp

on windDown()
	tell application id "DNtp"
		activate
		hide progress indicator
	end tell
end windDown

on importArticle(aURL, aTitle, aGroup)
	local groupPath, theGroup, theRecord
	
	tell application id "DNtp"
		try
			step progress indicator aTitle
			set groupPath to kRootGroupName & "/" & aGroup
			if not (exists record at groupPath) then
				set theGroup to create location groupPath
			else
				set theGroup to get record at groupPath
			end if
			set theRecord to create PDF document from aURL name aTitle in theGroup pagination no
		on error errorMessage
			log message aURL info errorMessage
		end try
	end tell
end importArticle

shandy · May 16, 2022, 11:33am

For me, Raindrop.io works well with DT, including automatically saving the whole article (not just preview) as PDF.

For an archive of older saved links (which probably won’t appear in the RSS feed) the AppleScript I posted above should do the trick.

Also worth noting that Raindrop.io works with Zapier and Make.com (previously known as Integromat). So you can use Raindrop.io alongside Instapaper (or whatever) and automatically get new saved articles pushed from one service to the other.

Pupsino · May 16, 2022, 5:24pm

I haven’t tried this in DT so I don’t know how it works, but Pocket does have an RSS feed, it does sync your entire collection (and its tags), and the feed is the entire article, not just a snippet. I have my Pocket feed pointing to Reeder and it has all 5000 of my Pocket items in there. Pocket has the added bonus of being owned by Mozilla and believing in data privacy.

I moved off Instapaper during their data privacy shenanigans years ago (they cut off support to all EU citizens because they couldn’t comply with the then-new data regulations - which makes you wonder what data they were harvesting…). Anyway, at the time my export was a giant csv file. I’ve kept it, but gradually “rebuilt” my saved pages as PDFs which I store locally.

fuzzygel · May 17, 2022, 12:24am

Thanks a lot @shandy, the script works. I am very happy. Interesting that the script has to work on a CSV file exported from the Instapaper account.

I guess my next question is how to update the archive: the Instapaper export option does not seem to have a means to limit the export to articles added since the last export. Perhaps I manually open the CSV file in Numbers and remove those already imported into DEVONthink?
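One possible alternative to pruning by hand would be to filter each fresh export by its Timestamp column before running the import script. Below is a minimal Python sketch of that idea, standard library only. It assumes the Timestamp column is Unix epoch seconds and that you note the newest timestamp after each import; the function name `newer_than`, the example.com rows and the cutoff value are all hypothetical.

```python
import csv
import io


def newer_than(csv_text, last_import_ts):
    """Keep only rows saved after last_import_ts (hypothetical helper).

    Returns the filtered CSV text and the newest timestamp seen,
    which can be stored as the cutoff for the next run.
    Assumes Timestamp is Unix epoch seconds.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    newest = last_import_ts
    for row in reader:
        ts = int(row["Timestamp"])
        if ts > last_import_ts:
            writer.writerow(row)
            newest = max(newest, ts)
    return out.getvalue(), newest


# Made-up export with one old and one new article.
export = (
    "URL,Title,Selection,Folder,Timestamp\n"
    "https://example.com/old,Old article,,Unread,1600000000\n"
    "https://example.com/new,New article,,Unread,1652648874\n"
)

filtered, newest = newer_than(export, 1650000000)
print(filtered)
print("next cutoff:", newest)
```

Saving the filtered text as a new .csv file and choosing that file in the import script should then bring in only the articles added since the previous run, provided the export format stays the same.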

MitchWagner · July 6, 2022, 1:53pm

@shandy This is very useful, and thank you. Why do you save the articles as PDFs? Why not use Web archive?

shandy · July 6, 2022, 7:38pm

PDFs are compatible with, and readable and indexable by, pretty much everything, whereas WebArchives have been deprecated by Apple.

MitchWagner · July 8, 2022, 2:57am

One other question: Why not just save directly to DEVONthink? Or just use Raindrop.io and skip DT? Why both?

shandy · July 8, 2022, 9:32am

A few reasons:

I use indexed folders in DEVONthink. Indexed folders don’t have bidirectional sync with DTTG, which means I need to use something that’s not DTTG to capture URLs on my phone.
I want the PDFs captured in a desktop format (i.e. not mobile) and I don’t want to wait around for them to be rendered and saved before I can continue using the phone. This rules out creating the PDFs on the phone. In other words, my priority is to capture the URL, knowing that the PDF will then be automatically generated in the right format, in the background on my Mac, and auto-filed in the right folder.
I prefer Raindrop’s workflow for quickly capturing a URL and allocating it to a folder. No other tool I’ve tried (including DT & DTTG) makes this as quick, painless and reliable across all browsers, apps and devices.
If I only use Raindrop, then I won’t get the archived PDF.

Doty · March 3, 2024, 2:20pm

This is just what I was looking for. Does it bring in the tags from Raindrop?

shandy · March 3, 2024, 3:07pm

Unfortunately no.

Twenty characters.

Doty · March 3, 2024, 11:54pm

Thanks for the reply. What does “twenty characters” refer to?