But somehow the rename always uses the current date, the day I downloaded the file, not the date found in the document. I tried the preview, and all it says is that the rule matches; I can’t see where it’s picking up today’s date under the latest Hazel version.
Any suggestions as to where to look or how to fix this?
Some time ago my bank somehow managed to send me monthly PDF statements with the text layer all messed up: if I tried to select text, some weird region of the page got highlighted, usually a white one with no text at all, and copy/pasting produced random, partial text…
Long story short, my Hazel rules, similar to yours, stopped working (they selected the third or fourth date because they couldn’t read the first one) and never started working again.
Is the date that it is matching and renaming the file with somewhere in the document?
One thing I’ve found is that the “1st” date in a PDF document isn’t always what a person reading the document would think is first. My assumption is that Hazel is looking at the first date in the order of the text in the file on disk, not what appears closest to the top as the PDF is laid out on screen in a PDF reader.
I have had the same problem with some PDFs, especially ones that have been scanned vs downloaded (although I am not sure why that part makes a difference).
If you double click the rule to edit it, you can see a button on the top right called “preview” which lets you see how the rule will match a selected file. You will get a dialog to select the file that you want the rule to parse. Next to each rule criterion there will appear a red X or green checkmark indicating if that rule was matched or not. If you click the X or check you get a pop-up showing you details of the match, and there is an icon with three dots which you can click to see the entire match data.
Sometimes scanning through that will show you what the actual data that Hazel is seeing will look like, and you might be able to figure out how to modify your rule accordingly.
I have found this helps quite a bit, but there are still a few files that I just haven’t been able to make Hazel figure out. Sometimes, as others have noted, having an OCR’d text layer helps. Sometimes it doesn’t.
I have about 20 rules in my downloads folder that automagically rename scans and downloads, but I have another couple that I am still working on and haven’t been able to solve (yet).
What I find helps when I’m having issues matching PDF content is to use hazelimporter from the Terminal. This is the tool that Hazel uses to get the PDF’s text. It’s part of the Hazel application bundle.
This command will dump the PDF text out to the terminal: ~/Library/PreferencePanes/Hazel.prefPane/Contents/MacOS/hazelimporter [PDF Filename]
Alternatively, this command will save the PDF text to a file: ~/Library/PreferencePanes/Hazel.prefPane/Contents/MacOS/hazelimporter [PDF Filename] > [Text Filename]
I find the 2nd command more useful, as I can then open the text file in a text editor and search for the pieces of text I want to use in my Hazel rules.
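Once the text is saved to a file, a quick way to see which date-like strings come first in the file's own order is to search it with grep. This is only a rough sketch: list_dates is a made-up helper name, statement.txt is a placeholder file name, and the pattern is my own assumption that covers only numeric dates such as 01/31/2024 or 2024-01-31. It is not the pattern Hazel itself uses for date detection.

```shell
# Hypothetical helper: print every date-like string in a text dump,
# prefixed with the line number it was found on.
# Assumption: only numeric dates separated by "/" or "-" are of interest.
list_dates() {
  grep -nEo '[0-9]{1,4}[/-][0-9]{1,2}[/-][0-9]{1,4}' "$1"
}

# Example use, after saving the PDF text with hazelimporter as described above:
#   list_dates statement.txt
```

The first line of output is the first numeric date in the order of the extracted text, which may not be the date that appears at the top of the page in a PDF reader.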
So far I keep getting “command not found” no matter which version of that idea I try. I have verified that I am in the top-level Library PreferencePanes folder and have given the full pathnames for both the input and output files.
It’s probably something I’m doing wrong in Terminal.
Dealing with urgent farm work, I will revisit this as soon as I handle the stuff that has to be done before it starts snowing later today.
@OogieM: It would be helpful if you posted the Terminal command you are using and the output.
However, if you have cd’d to ~/Library/PreferencePanes… but the Hazel command is not found there, it could be because you installed Hazel for all users of your Mac and not just for yourself. In that case the needed program, hazelimporter, and the Hazel preference pane itself will be located in the system Library folder at /Library (no initial ~).
I have Hazel 4.4.5 installed. If you have a different version, find Hazel.prefPane in Finder, right-click it, and choose Show Package Contents. See if the hazelimporter program is in the Contents/MacOS folder.
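The same check can be done from the Terminal. The sketch below looks in the per-user Library first and then in the system Library, using the paths mentioned in this thread. find_hazelimporter is a made-up function name, and it assumes a Hazel version that ships as a preference pane (as 4.4.5 does); other versions may keep the tool somewhere else.

```shell
# Hypothetical helper: print the path of the first hazelimporter found.
# Checks the per-user install location first, then the all-users location.
find_hazelimporter() {
  for dir in "$HOME/Library/PreferencePanes" "/Library/PreferencePanes"; do
    candidate="$dir/Hazel.prefPane/Contents/MacOS/hazelimporter"
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "hazelimporter not found in either Library folder" >&2
  return 1
}

# Example use:
#   find_hazelimporter
```

If it prints a path, that full path (in quotes if it contains spaces) is what to type in front of the PDF filename.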
OK, got the commands to work and found the date in the file. It is the first date, and I did have automatic date detection set up in my rule. In preview mode the underlined yellow section (which I presume is what Hazel thinks is the date) is actually the hours the office is open. It’s not a date at all. I’m playing with making it match a later occurrence to see if that helps.