ChatGPT to develop automations. OCR revisited

SteveU75 · April 7, 2024, 11:30pm

Has anyone had good luck using ChatGPT to develop shortcuts or automations?

One thing I have done is upload a manual to ChatGPT and then ask it questions, which beats digging through the manual.

Here is just a first attempt at revisiting the OCR automation. Now to actually try building the action.

I was hoping to use Shortcuts but got Automator instead.

Prompt

Create a Mac shortcut that performs optical character recognition on a group of selected files

Creating a shortcut on a Mac that performs Optical Character Recognition (OCR) on a group of selected files involves using the Automator app, which allows you to automate tasks. The specific steps outlined below make use of macOS’s built-in capabilities and may require you to have an app that can perform OCR if Preview doesn’t meet your needs. One commonly used app for OCR tasks is PDFpen, but there are many others available. Here’s a basic approach to creating such a shortcut:

Open Automator

• Go to your Applications folder and open Automator. Select to create a new “Quick Action.”

Configure the Quick Action

• At the top of the right pane, set “Workflow receives current” to “image files” in “Finder.”
• Optionally, you can set the image type to PDFs only or any image file depending on your preference.
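
If the OCR step ends up being driven by a script rather than by an Automator action, the same file-type restriction can be applied in code. This is only a minimal Python sketch, assuming the script receives the selected paths as a list; the function name and the extension set are illustrative and are not anything Automator defines.

```python
from pathlib import Path

# Illustrative set of extensions to treat as OCR candidates (an assumption;
# adjust it to whatever your OCR app actually accepts).
OCR_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def filter_ocr_candidates(paths):
    """Return only the paths whose extension looks like an image or a PDF."""
    return [p for p in paths if Path(p).suffix.lower() in OCR_EXTENSIONS]


if __name__ == "__main__":
    # Hypothetical Finder selection, used only to show the filtering.
    selected = ["scan1.PDF", "notes.txt", "receipt.jpg"]
    print(filter_ocr_candidates(selected))
```

The comparison is case-insensitive, so a file named `scan1.PDF` is kept while `notes.txt` is dropped.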

Add Actions for OCR

Since Automator doesn’t include a built-in OCR action, you’ll need to use a third-party application for this. Assuming you have an application installed that supports AppleScript or Automator actions for OCR, you can proceed as follows:

For Apps with Automator Support

• Look for an action related to your OCR application. This could be something like “Perform OCR with [Your App]” and drag it into the workflow area.

For Scriptable Apps

• If your OCR app doesn’t have direct Automator support but can be controlled via AppleScript, add a “Run AppleScript” action.
• In the script editor that appears, you’ll need to write or paste an AppleScript that tells your OCR application to open the selected files, perform OCR, and save them. This script will vary depending on the application you’re using. For example:

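on run {input, parameters}
    tell application "PDFpen"
        -- input is the list of files selected in Finder
        repeat with thisFile in input
            open thisFile
            tell document 1
                ocr
                -- OCR runs in the background, so wait before saving.
                -- "performing ocr" comes from PDFpen's scripting dictionary;
                -- check it in Script Editor for your version.
                repeat while performing ocr
                    delay 1
                end repeat
                save
                close
            end tell
        end repeat
    end tell
    return input
end run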

Save Your Quick Action

• Give your Quick Action a name, such as “Perform OCR on Files.”
• Save your Quick Action.

Using Your Quick Action

• Select the files you want to perform OCR on in Finder.
• Right-click to open the context menu.
• Navigate to “Quick Actions” and select “Perform OCR on Files.”

This process will depend on the capabilities of your OCR software and its compatibility with Automator or AppleScript. If your software doesn’t support direct integration, you may need to explore alternative methods, such as using the software’s built-in batch processing features manually.
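
One such alternative, sketched here under assumptions: if a command-line OCR tool is installed, Automator's “Run Shell Script” action (with input passed as arguments) can hand the selected files to a small script. The Python below assumes the open-source `ocrmypdf` tool is on the PATH and handles PDFs only; `build_ocr_commands` and the `_ocr` suffix are made-up names for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_ocr_commands(paths, suffix="_ocr"):
    """Build one `ocrmypdf INPUT OUTPUT` command per PDF, skipping other files.

    The output is written next to the input with `suffix` added to the name,
    so the original file is left untouched.
    """
    commands = []
    for p in map(Path, paths):
        if p.suffix.lower() != ".pdf":
            continue
        out = p.with_name(p.stem + suffix + p.suffix)
        commands.append(["ocrmypdf", str(p), str(out)])
    return commands


def main(argv):
    # Assumes ocrmypdf is installed; check=True raises if a run fails.
    for cmd in build_ocr_commands(argv):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

In the “Run Shell Script” action this might be called as `python3 /path/to/ocr_batch.py "$@"`, where the script path is a placeholder. Keeping the command construction separate from the `subprocess` call makes it easy to print the commands first and check them before anything is run.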

Steve_Sawczyn · April 9, 2024, 9:56pm

This is very interesting, especially as I am constantly looking for more efficient ways to handle OCR on the Mac. As a blind guy, I rely on OCR to read documents that either have no text layer at all or have a text layer but no real structure. On Windows, there are solutions that not only perform OCR but also try to replicate the document’s structure, most notably ABBYY’s FineReader Pro, which even tries to recreate a table if something in the document looks like one. If I could find something similar for Mac, combining it with this sort of Automator Quick Action could make a world of difference: I could bind it to a keystroke and run it easily, giving me quick access to a more accessible version of the file.