How to Bulk Convert XML to Markdown

I’ve posted this on the DEVONthink forum, but I’m cross-posting here hoping someone can help me with this project.

I exported my blog articles from Squarespace as an XML file. I want to convert all of the articles in the XML file to Markdown in DEVONthink. My understanding is that I’ll need a script to accomplish this. If so, can someone explain to me how to find such a script and how to install it in DEVONthink? I know nothing about scripts. Or is there an alternative way to convert an XML file that contains hundreds of articles to Markdown?

Any assistance will be greatly appreciated!


You would need someone to write the script. Have you tried one of the online XML to Markdown conversion sites? You might need to split the file into chunks first, since a file with hundreds of articles may be too large to convert in one go.
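Splitting the file into chunks could look roughly like the sketch below. It assumes the export is an RSS-style file with the articles stored as `<item>` elements inside a single `<channel>`; that layout is an assumption and should be checked against the actual file. The function name `split_items` and the default chunk size are made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET


def split_items(xml_text, chunk_size=50):
    """Return a list of XML strings, each holding at most chunk_size <item> elements.

    Assumption: the articles are <item> children of one <channel> element.
    """
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("no <channel> element found; the file has a different layout")

    total = len(channel.findall("item"))
    chunks = []
    for start in range(0, total, chunk_size):
        # Copy the whole document, then drop every item outside this chunk,
        # so the channel metadata is kept in each output file.
        new_root = copy.deepcopy(root)
        new_channel = new_root.find("channel")
        for idx, item in enumerate(new_channel.findall("item")):
            if not (start <= idx < start + chunk_size):
                new_channel.remove(item)
        chunks.append(ET.tostring(new_root, encoding="unicode"))
    return chunks
```

Each returned string can be written to its own file and uploaded separately. Copying the whole document once per chunk is wasteful, but it keeps the sketch short. ElementTree may also rename namespace prefixes (for example to `ns0`) when writing the chunks out; the result is still valid XML.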


(deleted my first thoughts)

EDIT TO ADD: Try selecting the text displayed on a page of your Squarespace website, copying it, pasting it into a Markdown editor like Typora, and then saving it. This is my recommended method.

Another idea: Can you export as HTML? Or just grab the pages from Squarespace directly, since they are already HTML in order to be displayed on the web? Or use a web clipper like the one in Evernote or EagleFiler?

HTML aligns better with Markdown than XML does. I get lots of hits when I Google for “convert html to markdown.”

I would suggest you use ChatGPT or a similar tool to get yourself a script. Writing such a script is something these tools excel at, and doing it yourself this way would probably feel quite empowering. You should send your AI tool of choice an example of the XML you want to convert, because XML does not have much of a predefined structure the way HTML or Markdown does.

I would do it via Python. But that requires installing the programming language.

If it does not work for you, send me a message and I can help you out.
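For a rough idea of what such a Python script could look like, here is a minimal sketch using only the standard library. It rests on assumptions that should be checked against the actual export: that each article is an `<item>` element with a `<title>`, and that the article body is HTML stored in a `content:encoded` element (the WordPress-style layout). The names `SimpleMarkdown`, `html_to_markdown`, `slugify`, and `convert_export` are made up for illustration, and the HTML conversion only covers a handful of common tags.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from pathlib import Path

# Assumption: post bodies live in <content:encoded> with this namespace.
CONTENT_TAG = "{http://purl.org/rss/1.0/modules/content/}encoded"


class SimpleMarkdown(HTMLParser):
    """Tiny HTML-to-Markdown converter for headings, paragraphs, lists, links, emphasis."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "br":
            self.parts.append("\n")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            self.parts.append("](%s)" % (self.href or ""))
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html):
    parser = SimpleMarkdown()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    return re.sub(r"\n{3,}", "\n\n", text).strip()


def slugify(title):
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def convert_export(xml_path, out_dir):
    """Write one Markdown file per <item> and return the list of paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    items = ET.parse(xml_path).getroot().iter("item")
    for n, item in enumerate(items, start=1):
        title = (item.findtext("title") or "Untitled").strip()
        body = html_to_markdown(item.findtext(CONTENT_TAG) or "")
        target = out / ("%03d-%s.md" % (n, slugify(title)))
        target.write_text("# %s\n\n%s\n" % (title, body), encoding="utf-8")
        written.append(target)
    return written
```

Calling `convert_export("export.xml", "markdown_out")` (both names are placeholders) would fill the folder with numbered `.md` files that can then be imported into DEVONthink. Images, tables, and nested lists are not handled, and the export may contain items that are not blog posts, so a dedicated HTML-to-Markdown library or an existing Squarespace conversion script is likely to give better results.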


No guarantees on whether it works, but I did find zcaceres/squarespace-export-to-markdown on GitHub, a Python script to convert Squarespace exports into Markdown files and download images. From a quick look over, it does not seem to be doing anything unexpected.

It would require you to install Python to run it, though.

Afterwards, it would just be Markdown files and image files that you could import into DEVONthink as you normally would.

Looks like googling works as well :slight_smile:.


I’d second this. I’ve had a lot of success recently with Claude.ai writing me some scripts for splitting XML files up into separate files (mine was aimed at “breaking” apart GPX files into different tracks, so the idea is the same, but the implementation would differ from what you want).

Possibly not the most optimised code, but in my case, it’s worked.

It sounds like it would be a “problem” that others would have had in the past as well, so I would have thought that there is a script available online for it somewhere - @dustinknopoff seems to have found one.
