Extract PDF text with Mac Automator

It’s a shame that I only just learned about Automator, the automation app built into macOS. After a quick search, I found that it can automate a lot of tasks.

One automation that helps me a lot is extracting text from PDF files. I often do textual analysis on plain text, so PDFs are inconvenient to work with. All I need is a single action in Automator: I drag my PDF files onto the resulting app and get a TXT file for each one. The quality is pretty good, better than some paid software.

Steps:

  1. Open Automator
  2. Select Application when prompted to choose a document type
  3. In the Library, select PDFs, then find “Extract PDF Text” and drag it into the workflow area
  4. Save the workflow and you’ll see the application on your desktop or wherever you saved it
  5. Drag PDF file(s) onto that app icon and voila!
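
For anyone who would rather script this than use Automator, here is a rough sketch of the same "many PDFs in, one TXT per PDF out" idea in Python. It assumes the third-party pypdf package is installed, and the helper names (txt_path_for, extract_text, convert_all) are made up for illustration. It is not what Automator does internally, and the extracted text may differ from Automator's output.

```python
import sys
from pathlib import Path


def txt_path_for(pdf_path: Path) -> Path:
    """Return the .txt path that sits next to the given PDF."""
    return pdf_path.with_suffix(".txt")


def extract_text(pdf_path: Path) -> str:
    """Join the text of every page (assumption: pypdf is installed)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(str(pdf_path))
    # extract_text() can come back empty for image-only (scanned) pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def convert_all(paths):
    """Write one .txt per PDF and return the list of files written."""
    written = []
    for p in map(Path, paths):
        if p.suffix.lower() != ".pdf":
            continue  # skip anything that is not a PDF
        out = txt_path_for(p)
        out.write_text(extract_text(p), encoding="utf-8")
        written.append(out)
    return written


if __name__ == "__main__":
    # Usage: python pdf2txt.py file1.pdf file2.pdf ...
    for out in convert_all(sys.argv[1:]):
        print(out)
```

Each TXT is written next to its source PDF, so the script can be pointed at a whole folder with a shell glob such as *.pdf.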


That’s a good one.

What I could really use is something that would parse a PDF into a tree - such that I could modify the tree and then write it back out again.


Nice! Thanks for posting!
