Extract Photos from Webpage

In DEVONthink, I have a specific workflow: I use Safari Reader view, save as PDF, and send to DEVONthink. This works well most of the time, but on some websites the photos don’t make it into Reader view when I open the article there. Is there anything that can be done about this? Here is an example problem URL.

Can anyone recommend a good way to strip the ads (as Reader view, Pocket, and Instapaper do) while also keeping these embedded photos? Thanks!

Would the “Create PDF” command be of use here?
website -> share sheet -> Create PDF -> share sheet -> DEVONthink

The images in your example URL are placed into a slideshow. The Safari Reader view has a tough time with those sorts of things, probably by design.

The Evernote web clipper grabs them, though. This is one use case where Evernote works better than the Safari Reader view. There are many situations where the opposite is true.

Sometimes, when I want the text preserved from the Safari Reader view, I save that and then also save the images individually. That’s not a good solution if you want the images inside the PDF, but if you just want the page (and images) for research purposes, it is adequate. You also usually get better image quality that way.
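
If saving the images one by one gets tedious, something like the following might help. It is only a rough sketch in Python (standard library only) that lists the image addresses found in a page's HTML so they can be downloaded separately. The names `ImageCollector` and `find_image_urls` and the example.com addresses are made up for illustration. It also assumes the images appear as plain `<img src="...">` tags, so slideshows that load their photos with JavaScript may not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # Only <img> tags are of interest here.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Turn relative paths into full addresses.
            self.image_urls.append(urljoin(self.base_url, src))


def find_image_urls(html, base_url):
    """Return the image URLs found in html, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.image_urls


if __name__ == "__main__":
    # Hypothetical page fragment standing in for a saved article.
    sample = (
        '<div class="slideshow">'
        '<img src="/photos/one.jpg">'
        '<img src="https://cdn.example.com/two.jpg">'
        "</div>"
    )
    for url in find_image_urls(sample, "https://news.example.com/story"):
        print(url)
```

The resulting list could then be handed to whatever download tool you already use, and the saved images filed next to the Reader view PDF.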

Print to PDF doesn’t pull in any of the images either.

How about using Safari to Save As a .webarchive (which saves the images as part of the archive) … then using a quality ad blocker in Safari when opening/reading .webarchive files from Devonthink?

FYI my experience with Pocket Premium is that it only occasionally retains images in the saved archives (though it depends on the site). A bit of a bummer, but I love the service anyway.

Also FYI, for today only, Pocket Premium is 60% off, or $17.99 for a year:

https://getpocket.com/premium?forceshowmonthly=1&prt=CYBER2018

You might also find some use for the Printliminator bookmarklet.

https://css-tricks.github.io/The-Printliminator/

It does pull in the images for me when I use share sheet -> Create PDF.

Thanks. It’s all the “local news” on the right that I don’t want. I want a clean pdf with just the one article and both pictures, the one you’re showing and the one underneath it.

That’s super interesting. How do I add an ad blocker to DEVONthink?

This looks awesome! Do you know if this works on iOS also? I’m definitely going to kick the tires with this one! Thanks.

I meant: use DEVONthink as your repository, then view or print the files in Safari (with a good ad blocker installed). If DEVONthink is referencing the files (as opposed to importing them), it’s even easier, because you can also Quick Look the files in the Finder if necessary.

I use it only on the Mac. I have no idea if there is an iOS version, but it might be worth looking for.