I was looking up how to do this and wanted to share. I came up with a way to identify PDFs that were not scanned by my ScanSnap.
I’ve got a folder with over 134 PDF files that I kept putting off creating Hazel rules for. I wanted to identify which ones were not scanned, and therefore not OCR’d, by my ScanSnap. In Hazel, I used the condition “Content Creator - does not contain - ScanSnap”, along with a second condition that the file does not have the tag OCR’d.
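If you want to preview which files that first condition would catch before letting Hazel loose on them, here is a rough sketch you can run in Terminal. It assumes Hazel's “Content Creator” corresponds to the Spotlight attribute kMDItemCreator, which I have not confirmed, and the function names and folder path are just placeholders I made up. It does not check the OCR’d tag.

```shell
#!/bin/sh
# Succeeds (exit 0) when the creator string does NOT mention ScanSnap.
# Note: this match is case-sensitive.
not_from_scansnap() {
    case "$1" in
        *ScanSnap*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print every PDF in the given folder whose creator lacks "ScanSnap".
# Assumption: kMDItemCreator is the field Hazel calls "Content Creator".
# mdls is only available on macOS.
list_unscanned() {
    for f in "$1"/*.pdf; do
        [ -e "$f" ] || continue
        creator=$(mdls -raw -name kMDItemCreator "$f" 2>/dev/null)
        if not_from_scansnap "$creator"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example (placeholder path):
# list_unscanned "$HOME/Documents/Inbox"
```

Files with no creator recorded at all will also be listed, since their creator string does not contain ScanSnap.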
Then I had Hazel run a shell script of
sleep 30
to have it pause for 30 seconds.
Then I have Hazel run this AppleScript (sorry, I forget where I found it; Katie Floyd’s version didn’t work for me):
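tell application "PDFpenPro"
    -- theFile is supplied by Hazel for the matched file
    open theFile as alias
    tell document 1
        ocr
        -- wait until PDFpenPro reports that OCR has finished
        repeat while performing ocr
            delay 1
        end repeat
        delay 1
        close with saving
    end tell
end tell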
Finally, I have Hazel add the tag OCR’d.
Hope this helps.