Bookmarking and saving web page data

TallTrees · August 11, 2020, 6:38pm

All,

I’m curious as to what you all are using to capture, store, or keep a web article of interest, so that you can read it again later. Also, are people still bookmarking websites of interest and storing them in their browser of choice, or is there a better option?

I use DEVONthink both to bookmark sites of interest and to capture the text (or the whole page) of an article, often converting it to PDF or Rich Text. This generally works very well for me, but it can be a bit tedious, so I thought I’d see what others are doing.

Cheers!

Ethan9482 · August 11, 2020, 6:50pm

If it’s a keep for a while, then I tend to use Evernote and forget about it.

If it’s something that I want to refer back to or read in detail in like a week, I used to use Pocket, but I recently started with Reading List and find it grand.

jec0047 · August 11, 2020, 6:59pm

I use Safari’s Reading List to save articles I don’t want to read right away. Later, when I’ve read the article, I decide whether to bookmark it. I’m pretty choosy about what I bookmark, though.

Ricardo_M · August 11, 2020, 7:26pm

I’ve been using Reeder, which implemented an awesome “read later” feature with the most recent version. With share sheet support, I can send web articles even from web sites outside my RSS feed.

ismh86 · August 11, 2020, 7:32pm

I recently switched from Instapaper to GoodLinks.

bowline · August 11, 2020, 7:33pm

I usually save articles to Pocket. I have a Premium subscription (grandfathered in at a discount) which opens up a permanent library of everything saved, robust and intelligent suggested tags, full text search, highlighting, font choice, an iOS app, and a Mac app (from which I print-to-PDF and save a nice copy without ads). Pocket (Premium only?) saves the entire article, so that if an article goes behind a paywall or gets pulled or the website disappears, I still can read it.

If I weren’t using Pocket I’d probably be using Pinboard, a bookmarking service that’s grown to include full-text saves, at half the standard price of Pocket Premium.

I sometimes save web pages to pdf but if it’s a particularly long article I’ll sometimes use an extension to turn the webpage into an epub.

I’ve been doing web research lately and need to save web pages, including for some sites which put their text almost entirely into jpegs (presumably so the text can’t be copied and/or scraped by search engines). For that I’ll save Safari webarchives to my hard drive.

anon41602260 · August 11, 2020, 7:36pm

I formerly used Pocket, but for a while now I’ve consciously limited my “read later” pile. “Later” never comes. So, I occasionally will clip articles to DEVONthink, using markdown format and the “Clutter-Free” option, which is very good.

TallTrees · August 11, 2020, 7:37pm

@anon41602260 - Why Markdown, as opposed to the other formats?

anon41602260 · August 11, 2020, 7:39pm

For the articles I save, I’m only interested in the text – so saving in markdown (plain text) strips away all the extraneous matter. It also makes the article more portable in my case since I can move it into an Obsidian vault if I want.

In the rare case where a page has some graphics or interactivity that I want to revisit, then I’ll bookmark it to DEVONthink.

jsamlarose · August 11, 2020, 10:42pm

Pretty much every web article I want to read gets piped through Reeder. This, after flipping between Pocket and Instapaper over the years.

Reading items from Reeder, I’ll either

skim and delete (why the heck did I save that one?)
read and archive (cool item, but nothing to refer back to)
read and capture important highlights to Drafts (along with my own thoughts/responses)
or…
push to MarginNote via InstaWeb PDF (mid-length and longer reads with many highlights to extract that I might want to map and manipulate for further thought/understanding before extracting to Drafts)

I’m in the habit of clearing the RSS side of Reeder down in one shot last thing in the evening, and catching up on reading as and when I can during the day.

I very much appreciate being able to pull random items from Reeder via Shortcuts. I run a shortcut in the morning that prepares a daily dashboard note with a list of the day’s reminders, events, focus items and a random Reeder item.

tjluoma · August 12, 2020, 5:27am

If I want to keep something locally, I will use one of two options:

Bear - It will save images as well as text, and it does it in Markdown, which means that it’s pretty easy to extract if I need to later.
Print Friendly - I keep the “bookmarklet” from PrintFriendly.com on my Bookmarks bar, because the bookmarklet does a great job of getting rid of cruft, and it will also let you remove leftover bits by clicking on them, including images (or just all images). Then I will use ‘Print to PDF’.

simonsmark · August 12, 2020, 12:42pm

Stopped using Evernote and am migrating out of there at the moment.

Mainly use DEVONthink and Instapaper for reference. I save “read later” articles into Instapaper (to discard once processed); anything “keep” goes into DEVONthink.

I do use Reeder for RSS feeds (both sites as well as dynamic Google searches), so I catch more there than through normal browser use. Again, I use Instapaper for “read once later” and DEVONthink for “keep”. That way I can empty the Reeder feeds when processed. The good thing is that Reeder now incorporates Instapaper, so I have easy “one app” access, which optimises the data/info processing workflow.

simonsmark · August 12, 2020, 12:50pm

Great tip re. Print Friendly! Implemented right away …

bowline · August 12, 2020, 2:14pm

Does this offer anything special over printing from Reader Mode in Safari (or an equivalent extension in other browsers)?

tjluoma · August 12, 2020, 7:53pm

The advantage that Print Friendly has is that it allows you to click on things to remove them from what it automagically “thinks” you wanted to keep.

Because of this, I think it allows itself to be somewhat less aggressive when parsing a page, whereas sometimes Safari Reader is perfect, and sometimes it skips things that I want to include.

One important instance is that I find that Safari Reader mode often eliminates all or too many images in a web page, and Print Friendly does a better job, but if there are some I still want to get rid of, it’s just a click away (or there is an option to remove all images at once).

Lastly, I find that “Print to PDF” from Safari’s Reader Mode leaves me with a PDF with large font sizes, whereas Print Friendly’s font size seems much more reasonable.

n.b. When using Print Friendly, I do not use their option to “Save as PDF” because I find their font choice to be ugly. (I am not generally all that picky about fonts, so this is unusual for me.) Instead I use their option to Print, and then when the Print dialogue appears, I use macOS’ “Save as PDF” feature. YMMV, but I thought it was worth mentioning.

So the advantage of Print Friendly over Safari Reader is, for me, more flexibility, although I find Safari Reader to be sufficient… probably at least 85% of the time.

onepointzero · August 13, 2020, 9:40am

I use Instapaper for read-later functionality. I tried Pocket for a while when the whole GDPR thing locked us Europeans out, but went back when they fixed that issue, as I found its simplicity more attractive.

My highlights end up in Readwise, which I (try to) review regularly. I then move the indispensable ones into Roam (previously it was DEVONthink, but I’m starting to get the whole #roamcult thing). It sounds more complex than it is.

If I find a technical article I want to keep for reference, I usually save it as a PDF to DEVONthink, as this lets me annotate it, which is not the case with WebArchives, for example (technical articles often have diagrams I also want to hold on to). If I want to reference it elsewhere, like Roam, I copy a link via Hook.

zkarj · August 13, 2020, 9:48am

I’m a serial offender at constantly switching read later apps and also at never getting back to read them later.

For stuff I want to keep for reference, such as when I’ve searched for a programming topic and found a great article on it, I was using KeepIt but recently switched to DEVONthink, where I store Web Archives.

zkarj · August 13, 2020, 8:04pm

I had no idea it was a macOS thing! I assumed the few apps where I see that term were doing their own thing.

bowline · August 13, 2020, 9:26pm

Oh crap. I use this constantly to archive web pages. Just did a search and saw that I have saved 7,388 webarchives, the oldest being from June 2010.

TallTrees · August 14, 2020, 5:53pm

I’d read somewhere some time ago that webarchives actually check the source URL for updates, and that if the site changes or deletes the article, the webarchive will break. I don’t know if that’s factual or not, but it did prevent me from using them.