I want to export all of my Squarespace (SP) blog posts as separate, preferably plain text files, for archiving. Currently, SP seems to only export all of them as a single HTML file.
I have sent a support request to SP regarding this but I thought I’d also ask here.
Those of you who have Squarespace knowledge (or related knowledge and experience):
- Is there a way to export my SP blog articles as separate files?
- If not, is there a way to automate extracting individual articles from a single HTML file? This is WAY above my tech pay grade but I thought I’d ask.
Any solutions to my problem?
When I did it a year or so ago, what Squarespace exported for me was an .xml file in the format that WordPress uses for site import/export.
I ended up writing a Python script that turned the massive .xml file into individual Markdown files and downloaded and renamed all the images linked in the Squarespace posts. This was kind of a messy process. As I recall, there were some things in the Squarespace export that choked the Python XML parser I was using (I think they were related to Unicode encoding issues). I had to run the script, see where it broke, edit the .xml file, run it again, see where it broke, and so on.
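For anyone who wants to try this route, here is a rough sketch of the splitting step only. It is not the script described above, and it makes several assumptions: that the export is a WordPress-style file with one `<item>` per post, that the post body sits in `<content:encoded>`, and that the parser failures come from control characters that XML 1.0 does not allow. The names `split_export` and `slugify` are made up for this example. It uses only the Python standard library, saves each body as the HTML it finds, and does not download images or convert anything to Markdown.

```python
import html
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace WordPress-style exports use for the post body.
# Assumption: each post body is stored in <content:encoded>.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

# Control characters that XML 1.0 forbids; stripping them is one guess
# at what makes a parser stop partway through an export.
INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def slugify(title):
    """Turn a post title into a safe file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def split_export(xml_text, out_dir):
    """Write one .html file per <item>; return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    root = ET.fromstring(INVALID_XML_CHARS.sub("", xml_text))
    written = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip() or "Untitled"
        body = item.findtext("content:encoded", default="", namespaces=NS)
        path = out / f"{slugify(title)}.html"
        path.write_text(f"<h1>{html.escape(title)}</h1>\n{body}\n", encoding="utf-8")
        written.append(path)
    return written
```

Usage would be something like `split_export(Path("export.xml").read_text(encoding="utf-8"), "posts")`, where `export.xml` is a placeholder file name. Two posts with the same title would overwrite each other here, and the export may also contain pages or attachments as items, so treat the output as a starting point.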
I was afraid of that and what you described is, as I note above, way above my tech pay grade. This is what I dislike about proprietary systems of this nature. It should be so much easier to export individual articles. I am seriously considering using Obsidian Publish for my blog.
Unless someone offers a better solution, or I hear back from Squarespace with one, I may have to do the tedious work of copying and pasting individual articles. In retrospect, I should have saved all of those articles in an archive. But alas, I did not, thinking I could export them when needed.
I opened SiteSucker, pointed it at my brother’s blog, which is on Squarespace, and got individual .html files out. They’d need some cleanup, but the content is there:
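If the goal is plain text, the cleanup could probably be scripted too. Below is a minimal sketch using only Python's standard library. It assumes the downloaded pages are ordinary HTML and that dropping `script`, `style`, `nav`, `header`, and `footer` content removes most of the site chrome; that list is a guess, not something checked against real Squarespace pages. `TextExtractor` and `html_to_text` are names invented for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping tags that usually hold site chrome."""

    SKIP = {"script", "style", "nav", "header", "footer"}
    BLOCK = {"p", "div", "br", "li", "blockquote",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(page):
    """Return the page's text with one blank line between blocks."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    kept = []
    for line in lines:
        # Keep text lines, and at most one blank line in a row.
        if line or (kept and kept[-1]):
            kept.append(line)
    return "\n".join(kept).strip()
```

Looping over the downloaded folder with `pathlib.Path.glob("*.html")` and writing each result to a `.txt` file would finish the job. Links and images are lost in this version, so it only suits a text-only archive.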
Also, how many non-profits have a covid19.html file on their website? I bet ALL OF THEM.
I transferred to Obsidian Publish for my personal blog. It’s great being able to add pages with a couple of clicks and manage my website through Obsidian.
That is good to know. I’m going to experiment with it. If it works, I may cancel my SP subscription and my domain registration. More money saved and less aggravation.
Well, I figured out a “solution”. And, while requiring some manual effort, this is much easier than copy/paste. In brief, here is what works:
- Open the blog in a browser (Reader View) and DEVONthink side-by-side
- Copy article title
- Select DT in the share sheet and save the article through DT in the appropriate Obsidian Vault. Paste the title in the Name field.
DEVONthink Sharing Clipper
This worked great because I currently have my Obsidian folder INDEXED in DT. Once added to DT, DT automatically places them in the Obsidian vault selected as markdown files. No additional work required.
The result is a markdown file in my Obsidian Vault that renders perfectly in Obsidian with almost no cleanup needed. Below is a screenshot of an article.
It only took two hours to transfer hundreds of files to Obsidian. Tedious but manageable.
I now have a complete archive of my articles stored in an Obsidian folder and backed up.