I was looking up how to do this and wanted to share. I came up with a way to identify PDFs that were not scanned by my ScanSnap.
I’ve got a folder with over 134 PDF files that I kept postponing the creation of Hazel rules for. I wanted to identify which ones were not scanned (and therefore not OCR’d) by my ScanSnap. In Hazel, I used the condition “Content Creator - does not contain - ScanSnap” along with a second condition that the file doesn’t have the tag OCR’d.
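If you want to see which files would match before building the rule, a rough check from Terminal could look like the sketch below. It assumes Hazel's Content Creator corresponds to the Spotlight attribute kMDItemCreator, which I have not confirmed; SCAN_FOLDER, the default folder path, and not_from_scansnap are made-up names for the example.

```shell
#!/bin/sh
# Succeeds when a creator string does NOT mention ScanSnap.
# (not_from_scansnap is a hypothetical helper name.)
not_from_scansnap() {
  case "$1" in
    *ScanSnap*) return 1 ;;
    *)          return 0 ;;
  esac
}

# Hypothetical folder; set SCAN_FOLDER to point at your own.
folder="${SCAN_FOLDER:-$HOME/Scans}"

# mdls ships with macOS; the loop is skipped on systems without it.
if command -v mdls >/dev/null 2>&1; then
  for pdf in "$folder"/*.pdf; do
    [ -e "$pdf" ] || continue
    # Assumption: kMDItemCreator is what Hazel shows as Content Creator.
    creator=$(mdls -raw -name kMDItemCreator "$pdf")
    if not_from_scansnap "$creator"; then
      printf '%s\n' "$pdf"
    fi
  done
fi
```

This only lists the files whose creator does not mention ScanSnap; it does not check tags, so the OCR’d tag condition still has to live in the Hazel rule.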
Then I had Hazel run a shell script to pause for 30 seconds.
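The script itself seems to have dropped out of the post. A minimal sketch, assuming it was nothing more than the standard sleep command with the 30 seconds mentioned above (pause_seconds is just a name chosen for the example):

```shell
# Assumed reconstruction: wait before the next Hazel action runs.
pause_seconds=30
sleep "$pause_seconds"
```

In Hazel this would go in a "Run shell script" action with an embedded script.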
Then I have Hazel run this AppleScript (sorry, I forget where I found it after Katie Floyd’s version didn’t work for me):
    tell application "PDFpenPro"
        open theFile as alias
        tell document 1
            ocr
            repeat while performing ocr
                delay 1
            end repeat
            delay 1
            close with saving
        end tell
    end tell
Finally, I have Hazel add the OCR’d tag.
Hope this helps.