So over the last few months, I’ve been trying to determine how best to capture web content for archiving and general reading offline.
Given my workflow, it would be best for me to store this all in DevonThink rather than having to use a third-party app and store web content elsewhere. I’ve been getting really frustrated with it though, mainly because I can’t figure out which format I should be downloading the content in.
I guess the main thing for me was to have a copy of the page rather than just a reference or a link to it. That way, I could highlight sections of text or make notes. This led me to trying to decide between Web Archive and PDF.
- Saving as a Web Archive doesn’t allow me to highlight any parts of the document on my iPad (which is where I tend to read most of the docs). As an aside, the highlighting options on OS X are just horrible and pretty much make the original text impossible to read.
- I therefore switched to PDF. The problem I’m seeing there though is that most of the time, images embedded in a web page aren’t captured properly. So if the images are being used to illustrate an idea, I have to return to the web page and scroll through to see what I’m missing out on. This kind of negates the whole idea of having an offline archive of the material.
How are you guys clipping web content? Or getting around the issues above?
And thanks for putting up with the mini rant.