Thank you so much, @drdrang and @JohnAtl, not just for solving my problem but for explaining it in clear terms. I do appreciate that it takes time. I have run the first regex successfully now and will move on to the others and to building it all into my script, which is already fetching all of the files. And I will need the m option, as some of them do use ^.
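For anyone following along, here is a minimal sketch of what the m option changes for ^. It is written in Python purely for illustration, and the sample text and pattern are made up rather than taken from the actual pages or from my script:

```python
import re

# Hypothetical sample text; the real source pages are not shown in this thread.
text = "Name: Alpha\nName: Beta\nName: Gamma"

# Hypothetical pattern that relies on ^ to anchor at the start of a line.
pattern = r"^Name: (\w+)"

# Without the multiline flag, ^ anchors only at the very start of the string.
without_m = re.findall(pattern, text)

# With re.MULTILINE (the "m" option), ^ also anchors right after each newline.
with_m = re.findall(pattern, text, flags=re.MULTILINE)

print(without_m)
print(with_m)
```

Without the flag only the first line matches; with it, every line that starts with the pattern does.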
Thanks also to @RosemaryOrchard for the Text Factory pointer. While still trying to figure out the command-line aspect, I have been able to fine-tune my regex game to cater for some oddities in the source pages, and the ability to quickly process the lot (albeit with a bunch of clicks) has been a boon when it has taken many runs to get everything perfect.
Last night I successfully got the Text Factory to produce a clean set of CSV files based on the web site data, which is the first step to… well, I’m not sure of the end game yet. I’m hopeful I can come up with something which will allow the curator of this information to modernise from this…
<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">