Solution to local newspaper's broken RSS feeds + paywall

Hi, all. So I live in the Hudson Valley, and we subscribe to the online edition of the Poughkeepsie Journal (poughkeepsiejournal.com). They are part of the USA Today network, which also runs some other nearby papers such as the Westchester Journal News (lohud.com).

I use an RSS reader to browse the Poughkeepsie Journal’s headlines. All of the links in the actual feed are to poughkeepsiejournal.com URLs - these then in turn redirect to actual article URLs on the newspaper site.

However, on the server side, a sizable fraction of those links redirect to one of the other local papers, even though the same article is available - at the same pathname - on the poughkeepsiejournal.com site. When this happens, and a subscribers-only story is involved, I hit a paywall that I wouldn’t hit on my local site.

(I have written to them to alert their tech folks to the issue, but it’s clearly not a priority for them to fix.)

An example from yesterday:

RSS Link: http://rssfeeds.poughkeepsiejournal.com/~/648028804/0/poughkeepsie/news~Marijuana-legalization-in-NY-to-lead-to-automatic-expungement-of-drug-convictions/

Redirected link which went to paywall:

https://www.lohud.com/restricted/?return=https%3A%2F%2Fwww.lohud.com%2Fstory%2Fnews%2F2021%2F03%2F31%2Fnew-york-marijuana-legalization-expungement-overturn-drug-convictions%2F4810902001%2F

Legit Poughkeepsie Journal Link:
https://www.poughkeepsiejournal.com/story/news/2021/03/31/new-york-marijuana-legalization-expungement-overturn-drug-convictions/4810902001/

So I have two solutions to this, one on iOS and one on the Mac.

On iOS, I have a shortcut that I have creatively named “Poughkeepsie Journalize”:
https://www.icloud.com/shortcuts/1aef2882738c49ce88f8894b494db288

It splits the paywall URL at return=, URL-decodes the last part, replaces its hostname with poughkeepsiejournal.com, and opens the result in Safari. I invoke it from the Safari share sheet while viewing a bogus page.
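
For anyone who would rather read the logic than open the shortcut, here is a minimal Python sketch of the same three steps (split at return=, decode, swap the hostname). It is not the shortcut itself; the function name `journalize` and the constant `LOCAL_HOST` are made up for illustration, and it assumes the paywall URL always carries the article in a `return=` parameter, as in the example above.

```python
from urllib.parse import unquote, urlsplit, urlunsplit

# Assumption: the subscriber's home site, including the www. prefix.
LOCAL_HOST = "www.poughkeepsiejournal.com"


def journalize(paywall_url: str) -> str:
    """Turn a sister-paper paywall URL into the matching local article URL."""
    # Step 1: keep only what follows "return=".
    encoded_target = paywall_url.split("return=", 1)[1]
    # Step 2: percent-decode it (%3A -> ":", %2F -> "/", and so on).
    target = unquote(encoded_target)
    # Step 3: swap the hostname and leave the rest of the URL untouched.
    parts = urlsplit(target)
    return urlunsplit(parts._replace(netloc=LOCAL_HOST))


if __name__ == "__main__":
    example = (
        "https://www.lohud.com/restricted/?return=https%3A%2F%2Fwww.lohud.com"
        "%2Fstory%2Fnews%2F2021%2F03%2F31%2Fnew-york-marijuana-legalization-"
        "expungement-overturn-drug-convictions%2F4810902001%2F"
    )
    print(journalize(example))
```

Run against yesterday's example, this should print the legit Poughkeepsie Journal link shown above.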

On Mac, I just did something similar with Keyboard Maestro. This is something of a hack, using KM, Automator, and a BBEdit Automator action. I have a macro set up that’s bound to ⌘J when Safari is frontmost. It does the following:

  • Type ⌘L, then copy the URL to the clipboard.
  • Run a “Poughkeepsie Journalize” Automator script, which:
  1. Gets the clipboard contents
  2. Replaces %2F with / (only works because this site doesn’t use any other characters that require URL decoding in their pathnames)
  3. Runs a BBEdit Search and Replace action that (using grep) searches for .*com\/story(.*) and replaces it with https://www.poughkeepsiejournal.com/story\1. This matches everything after com/story (including the subsequent slash), and appends that matched text to the correct path beginning.
  4. Copies the result back to the clipboard.
  • Keyboard Maestro sets a variable SAFARI_URL to %SystemClipboard%
  • Finally, it sets the Safari URL to SAFARI_URL
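
The text transformation in Automator steps 2 and 3 can be sketched in Python as below. This only mirrors the replace-then-grep logic and is not the Automator workflow; the function name `automator_steps` is hypothetical. Python's `re` uses the same `\1` backreference syntax as the BBEdit replacement.

```python
import re


def automator_steps(clipboard_url: str) -> str:
    """Mirror Automator steps 2 and 3 on a URL copied from Safari."""
    # Step 2: replace only %2F with "/". The %3A after "return=https" is
    # left encoded, but it sits in the part the regex throws away.
    slashed = clipboard_url.replace("%2F", "/")
    # Step 3: the greedy .* swallows everything up to the last "com/story";
    # the capture group keeps the rest, starting with the slash.
    return re.sub(
        r".*com/story(.*)",
        r"https://www.poughkeepsiejournal.com/story\1",
        slashed,
        count=1,
    )


if __name__ == "__main__":
    example = (
        "https://www.lohud.com/restricted/?return=https%3A%2F%2Fwww.lohud.com"
        "%2Fstory%2Fnews%2F2021%2F03%2F31%2Fnew-york-marijuana-legalization-"
        "expungement-overturn-drug-convictions%2F4810902001%2F"
    )
    print(automator_steps(example))
```

Because the pattern is greedy, it anchors on the second `com/story` in the paywall URL (the one inside the `return=` value), which is why the half-decoded prefix never reaches the output.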

If anyone wants the KM macro or Automator script, I can append them here, but bear in mind that you need BBEdit for this to work (or some other Automator action that does grep-style search and replace).

This is a real niche need (and the KM solution is not elegant), but it’s cut down substantially on tooth-grinding frustration, at least as far as bypassing dumb technical goofs. It hasn’t improved the quality of the newspaper (or the news itself!), sadly.

Let me start with an apology, because you didn’t ask for suggestions, but there are some things you can do to make your macro easier to use:

  1. Use the %SafariURL% token instead of ⌘L and ⌘C to get the current URL.
  2. Put the URL into a Named Clipboard instead of the System Clipboard. This keeps the System Clipboard the way you had it.
  3. Use the Filter action (it’s in the Clipboard set of actions) to Percent Decode the URL. This will handle everything, not just %2F, in case the Journal decides to put other characters in its URLs.
  4. Use Keyboard Maestro’s own Search using Regular Expression action (it’s also in the Clipboard set of actions) to avoid calling out to BBEdit. You can save the capture group to a Keyboard Maestro variable.
  5. Assemble the new URL using the Keyboard Maestro variable you just created.
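
To illustrate why suggestions 3 through 5 hang together, here is a rough Python analogue: decode everything, capture the story path with a regular expression, then assemble the new URL from the captured piece. It is not Keyboard Maestro syntax; `rebuild_url`, `STORY_PATTERN`, and `LOCAL_PREFIX` are names invented for this sketch, and the pattern is a slightly stricter variant of the original grep.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Hypothetical names for this sketch; the pattern assumes the article path
# always follows ".com/story".
STORY_PATTERN = re.compile(r".*\.com/story(/.*)")
LOCAL_PREFIX = "https://www.poughkeepsiejournal.com/story"


def rebuild_url(current_url: str) -> Optional[str]:
    """Return the local article URL, or None if no story path is found."""
    # Suggestion 3: percent-decode everything, not just %2F.
    decoded = unquote(current_url)
    # Suggestion 4: search with a regex and keep the capture group.
    match = STORY_PATTERN.search(decoded)
    if match is None:
        return None  # not an article page, so leave the browser alone
    # Suggestion 5: assemble the new URL from the captured path.
    return LOCAL_PREFIX + match.group(1)
```

Returning `None` when nothing matches is a design choice of the sketch: it gives the caller a way to do nothing on pages that are not articles. The same function also handles a story that redirected to the sister site without hitting the paywall, since the pattern does not depend on `return=`.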

That may look like a long list, but they’re small changes compared to the hard work you did in figuring out the URL structure and the regular expression needed to exploit it. It will cut your three keystrokes down to just one and will still work if you ever decide to abandon BBEdit.

This is great, thanks. When I was throwing my KM macro together, #2 occurred to me and I’d wondered if #3 and #4 were available, but I’d already procrastinated enough and moved on after cobbling together a solution that was functional.

Here’s a version incorporating your suggestions, in case anyone wants it:

This is great.

For anyone unable or unwilling to invest in KM (which I pay for and love), you can also do this with a JavaScript bookmarklet. I recently cobbled one together for a similar use case, and it would be pretty easy to adapt. Let me know and I’ll give it a shot.