I used to have an excellent Hazel workflow using PDFpen Pro and an AppleScript (I think it originally came from Katie Floyd?) that would automatically OCR PDFs added to a folder. PDFpen Pro has not been working the same since it was taken over by Nitro, so I’m looking for an alternative. I’ve settled on ABBYY FineReader PDF, but I can’t for the life of me work out how to create an AppleScript that Hazel can run to make ABBYY FineReader PDF OCR PDFs automatically. I have tried scripts I’ve found on the Internet, but Hazel keeps giving errors, and as I have only very basic coding skills I’m at a loss.
Here’s my current iteration, which just isn’t working. Please could someone tell me where I’m going wrong? Thank you!
tell application "System Events"
    tell disk item (theFile as text)
        set {theName, theExtension} to {name, name extension}
        if theExtension is not "" then set theName to text 1 thru -((count theExtension) + 2) of theName -- the name part
    end tell
    tell application "Finder"
        set hazelPath to (container of alias (theFile as string)) as text
        set pdfPath to hazelPath & "(OCR) " & theName & ".pdf"
    end tell
    tell application "ABBYY FineReader PDF"
        repeat while is busy
            delay 1
        end repeat
        export to pdf from file
        repeat while is busy
            delay 1
        end repeat
    end tell
end tell
Have you run this through Script Editor (or Script Debugger)? I have, and numerous errors are reported. It would probably be more efficient, and a better learning experience, for you to run it through Script Editor yourself and fix the errors it reports.
Thanks for your replies. I went back to basics and found a version of the original Katie Floyd AppleScript posted by Rosemary Orchard, and I’ve got it working so that AppleScript will now open the PDF in ABBYY FineReader and run OCR on it. However, I can’t get it to then automatically save the OCR version of the PDF and close ABBYY FineReader; I have to do that part manually. That isn’t the end of the world, but ideally it would just open, run, save the OCR PDF and close, like it used to with PDFpen Pro.
tell application "ABBYY FineReader PDF"
    open theFile
    tell document 1
        ocr
        delay 1
        export to pdf from file theFile
        WaitUntilDone()
        close with saving
    end tell
    quit
end tell
Actually - this script doesn’t work - if there are multiple PDF files it adds them all at once to ABBYY FineReader to make one enormous PDF.
I’ve seen there is a ‘hot folder’ for ABBYY FineReader but I can’t find that option - I’ve got the ABBYY FineReader Premium subscription through the App Store.
This script is working for me. It’s slightly longer (more complex) than the other examples, which may explain the difference. Some settings (e.g. langList) may need changing for your particular use case.
on hazelProcessFile(theFile, inputAttributes)
    using terms from application "FineReader"
        set langList to {Dutch, English}
        set saveType to single file
        set keepPageNumberHeadersAndFootersBoolean to yes
        set pageSizePageSizeEnum to automatic
        set keepPicturesBoolean to yes
        set imageOptionsImageQualityEnum to balanced quality
        set keepTextAndBackgroundColorsBoolean to yes
        set makePDFABoolean to yes
    end using terms from
    WaitWhileBusy()
    tell application "FineReader"
        export to pdf theFile ¬
            from file theFile ¬
            ocr languages enum langList ¬
            page size pageSizePageSizeEnum ¬
            saving type saveType ¬
            keep page numbers headers and footers keepPageNumberHeadersAndFootersBoolean ¬
            keep pictures keepPicturesBoolean ¬
            image quality imageOptionsImageQualityEnum ¬
            keep text and background colors keepTextAndBackgroundColorsBoolean ¬
            make pdfa makePDFABoolean
    end tell
    WaitWhileBusy()
    tell application "FineReader"
        quit
    end tell
end hazelProcessFile

on IsMainApplicationBusy()
    tell application "FineReader"
        set resultBoolean to is busy
    end tell
    return resultBoolean
end IsMainApplicationBusy

on WaitWhileBusy()
    repeat while IsMainApplicationBusy()
    end repeat
end WaitWhileBusy
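One thing worth noting about the script above: WaitWhileBusy() spins in a tight loop with no pause, and it never gives up if FineReader hangs. Below is a minimal sketch of a gentler variant that polls once a second and errors out after a timeout. The handler name WaitWhileBusyWithTimeout and the maxSeconds parameter are made up for illustration; the only FineReader term used is the "is busy" property already used above, so this should only compile against a FineReader version that still ships an AppleScript dictionary.

```applescript
-- Hypothetical replacement for WaitWhileBusy(): poll once per second and
-- raise an error if FineReader is still busy after maxSeconds seconds.
on WaitWhileBusyWithTimeout(maxSeconds)
	set elapsedSeconds to 0
	repeat
		-- "is busy" is the same property the original script reads
		tell application "FineReader" to set stillBusy to is busy
		if not stillBusy then exit repeat
		if elapsedSeconds >= maxSeconds then
			error ("FineReader was still busy after " & maxSeconds & " seconds")
		end if
		delay 1
		set elapsedSeconds to elapsedSeconds + 1
	end repeat
end WaitWhileBusyWithTimeout
```

To try it, call WaitWhileBusyWithTimeout(600) in place of each WaitWhileBusy() call. Raising an error should make Hazel treat the rule as failed instead of waiting forever, though I have not tested that with every Hazel version.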
Thank you so much! This is exactly what I’m looking for. I just have a couple of problems - Hazel doesn’t like the script starting with ‘on’ (it reports an error saying “Expected “end” but found “on””).
And both ScriptEditor and Hazel don’t like the word ‘file’ in
set SaveType to single file
They both say “Expected end of line, etc. but found class name”.
I’m using Hazel v.5.1.1 on Ventura 13.5.1 if that’s any help?
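On the “Expected “end” but found “on”” error: as far as I understand Hazel’s behaviour, a script pasted into the embedded editor is wrapped in a handler by Hazel itself, so it cannot contain its own "on ... end" handlers. A script that defines hazelProcessFile plus helper handlers would need to be saved as a separate script file and selected in the rule instead. Treat that as an assumption to check against the Hazel documentation. The skeleton below is only a sketch of that external-file layout; the logPath helper is hypothetical and just writes a line to the system log so you can see the rule fire.

```applescript
-- Save as a script file (any name) and choose it in Hazel's
-- "Run AppleScript" action instead of pasting it into the embedded editor.
on hazelProcessFile(theFile, inputAttributes)
	-- theFile is supplied by Hazel for each matched file
	set pdfPath to POSIX path of theFile
	logPath(pdfPath)
end hazelProcessFile

-- Hypothetical helper: record the matched path in the system log
on logPath(pdfPath)
	do shell script "logger " & quoted form of ("Hazel OCR rule matched: " & pdfPath)
end logPath
```

The “found class name” error on "single file" is a separate issue: that term only compiles if the installed FineReader provides a scripting dictionary that defines it.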
Ah maybe that’s the problem - the version of ABBYY. Which version of ABBYY is the one I should be using? I’ve only just downloaded it from the App Store so I’ve probably got a newer version…
AFAIK the last version to support AppleScript was 12.x.x
I am (or rather was) using version 12.1.14.
ABBYY said for a long time that they would add AppleScript support again. Unfortunately, FineReader for Mac is basically abandonware (apart from some very minor updates, none of them related to automation).
Even though their OCR engine is the best one out there, I moved over to OCRmyPDF (based on Tesseract).
Interesting - I might do the same, just go to OCRmyPDF. I’m by no means very experienced in scripting and coding so the whole Brew aspect is a bit daunting but I’ll make it a project to figure it out!
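For anyone taking the OCRmyPDF route, here is a minimal sketch of an embedded Hazel AppleScript that hands the matched PDF to the ocrmypdf command-line tool. It assumes OCRmyPDF was installed with Homebrew, so the tool lives in /opt/homebrew/bin (Apple Silicon) or /usr/local/bin (Intel); adjust toolDirs if yours is elsewhere. The PATH is set explicitly because "do shell script" runs with a very short default PATH and would otherwise not find ocrmypdf or its helpers. The language code eng is a placeholder for whatever Tesseract language packs you have installed.

```applescript
-- Embedded Hazel script: Hazel supplies theFile (the matched PDF)
set pdfPath to POSIX path of theFile

-- Assumption: Homebrew install locations for ocrmypdf and its helpers
set toolDirs to "/opt/homebrew/bin:/usr/local/bin"

-- --skip-text leaves pages that already contain text untouched;
-- -l eng selects the English language pack (change as needed).
-- Input and output are the same path, so the PDF is OCRed in place.
set shellCommand to "export PATH=" & toolDirs & ":$PATH; ocrmypdf --skip-text -l eng " & quoted form of pdfPath & " " & quoted form of pdfPath

-- A non-zero exit status from ocrmypdf raises an AppleScript error here
do shell script shellCommand
```

Because the file is rewritten in place, the Hazel rule may match it again; --skip-text keeps a second pass from redoing the OCR, but adding a condition to the rule (a tag or a comment set after processing, for example) is probably the cleaner way to avoid a repeat run.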
Just look at the facts. There have been a few updates, but all are minor. They don’t listen to user requests. They removed basically all automation options, which are quite essential for an OCR engine.
And there is still no Apple Silicon support. The reply on their website is bonkers:
Starting from [Release 1 Update 2] FineReader PDF for Mac can work properly on computers powered by the Apple M1 chips. It runs using Rosetta 2 technology.
That comment is from about 2 years ago. And as far as I know there’s still no ARM version of FineReader available.
I’m going to throw out a recommendation for OwlOCR on the Mac App Store. I’ve found it to be one of the fastest and most reliable ways to OCR stuff using Hazel.
I wish companies were more straightforward about abandonware. I tend to support subscription apps in that they are less likely to stop communicating with users and disappear into the fog without any real explanation.