Software Suggestion for organizing "A Ton" of files!

Hello All,

I’m looking for software or ideas on how to best organize a ton of files…

I have client files going back years and between a ‘hard drive loss and multiple moves’ I’m finding that the older files are very disorganized. I would like to start by scanning my Mac for all files belonging to a client and then go from there, i.e. break them down by year, etc. I was thinking about Hazel but am not sure if it will be able to do what I’m after.

I’m on an iMac 5K / macOS Mojave

Any suggestions would be most appreciated!

Are they all PDFs? Are they arranged in any shape or form (i.e. by client?)

I find the following approach works best (this works for me, may not work for all):

  1. Create a file naming convention that you will utilize from now on (this changed my digital life when I applied it to my photos, movies, music, etc)

  2. Build a skeleton folder structure around how you use or search for your files. For example, I have items broken down by category or by year (depending on what it is). When I used to be a marriage and family therapist this was my structure, which I also applied to other areas of my life.

Documents > Clients > Individual Client Folder > files

When it comes to photos, I care most about the year (or, if using Lightroom, I go by keyword):

Photos > YYYY > YYYY-MM-DD Event Name > jpegs

  3. Once you have the structure in place, then you can utilize Hazel to either rename, scan, move, etc to your heart’s content.
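To make the two folder skeletons above concrete, here is a minimal Python sketch that only builds the destination paths. The function names and the example root folders are my own placeholders, not anything from a particular tool, and nothing is created on disk.

```python
from datetime import date
from pathlib import Path


def client_folder(documents_root: Path, client: str) -> Path:
    """Documents > Clients > Individual Client Folder."""
    return documents_root / "Clients" / client.strip()


def photo_event_folder(photos_root: Path, day: date, event: str) -> Path:
    """Photos > YYYY > YYYY-MM-DD Event Name."""
    year = f"{day.year:04d}"
    return photos_root / year / f"{day.isoformat()} {event.strip()}"


if __name__ == "__main__":
    # Hypothetical roots and names, purely for illustration.
    print(client_folder(Path("Documents"), "Jane Doe"))
    print(photo_event_folder(Path("Photos"), date(2019, 7, 4), "Family Picnic"))
```

Keeping the date first in the event folder name means a plain alphabetical sort in Finder is also a chronological sort.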

Based on solving similar problems, if these were my files what I would do is break the organizing down into manageable chunks. First, create a folder for “Clients A-M” and a folder for “Clients N-Z”. I would put each file (or folder) inside the “A-M” or “N-Z” folder, or the Trash. It is a good idea to use preview mode in Finder (or even Cover Flow) so you can see enough of the contents.

Next would be to create break-down folders inside “Clients A-M” – e.g. a folder for each client, etc.

Continue until you’ve sorted everything into its proper bucket. At this point, you can apply Hazel rules to create hierarchies for each client folder (e.g. folders for client-provided documents, for your work products, for billing, etc.).
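If you would rather script the per-client hierarchy than build Hazel rules for it, something like the following Python sketch would do the same job. The three subfolder names are only my guess at labels for the examples above, so change them to suit; it assumes every folder directly inside the given root is a client folder.

```python
from pathlib import Path
from typing import List, Sequence

# Assumed names, loosely based on the examples in the post.
DEFAULT_SUBFOLDERS = ("Client Documents", "Work Products", "Billing")


def create_client_skeleton(clients_root: Path,
                           subfolders: Sequence[str] = DEFAULT_SUBFOLDERS) -> List[Path]:
    """Create the same subfolders inside every client folder.

    Returns the folders that were newly created; existing ones are left alone.
    """
    created = []
    client_dirs = sorted(p for p in clients_root.iterdir() if p.is_dir())
    for client_dir in client_dirs:
        for name in subfolders:
            target = client_dir / name
            if not target.exists():
                target.mkdir()
                created.append(target)
    return created
```

Running it a second time is harmless, since it skips anything that already exists.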

@FrMichaelFanous’s naming suggestions can enter into the picture too.

(I was going to suggest DEVONthink, whose “See Also & Classify” feature can be useful for suggesting destination folders for documents. The problem is, See Also & Classify only works well when there is enough of an existing hierarchy containing documents for the DEVONthink AI to make usable suggestions. So you might end up doing as much work getting DEVONthink ready to help you, as you would if you just jumped in and did the cleanup manually.)

A few thoughts:

Firstly, I totally agree with the idea of developing a naming scheme for files that will hopefully work for you long term. I have started a few threads (in the Workflows topic category) that detail my thoughts and my final (for now!) solution, which may or may not prove a useful starting point for you.

However, the first order of business is certainly to get some degree of control over your files. My thinking is:

  1. You should establish a system for naming and filing (e.g. folder structure) now, and use that for all NEW files created going forward, so that any work you do now does not add to the backlog of files you still have to sort. You might wind up changing that around a bit as you work through your backlog of files, but at least it will impose a structure to be used as you work forward.

I have adopted a new scheme of a relatively shallow folder tree, opting for more folders at one level rather than for few folders with many levels. So far it is working out well for me.

  1. Once that structure is established, take a look at what it is that you need to file, and see how you can characterize those files. What I mean by that is, what filing structure would be ideal for them. If, for example, it is just old client files related to work, and not any personal files that also need sorting, then you might want to create a top level folder called “ClientFiles.” If you had multiple different jobs (say you were an attorney and a photographer, so you have files from clients from both jobs) you could create subfolders based on job (AttorneyAtLawEsquire, ReallyGreatPhotosInc). Now I would assume that it makes the most sense to characterize by client name, since you are more likely to want to find all files related to a given client than all files from a given year across all clients, so your next level is by client name.

  2. Within the client name folders, if there are likely to be thousands of files, you might want to create a structure that reflects how these kinds of files are best organized. For example, maybe you would want to look at files grouped by year, so under each client name you would have folders named by year. On the other hand, you might find that your files are more naturally grouped by category (using the Attorney model: Contracts, MeetingNotes, Subpoenas, Depositions [I am not a lawyer, so if these categories appear stupid to the lawyers here, oh well]).

  3. If it is necessary to search / group in multiple ways, you can either create smart folders, tags, or a combination. For example, suppose you choose to organize by category, but might also need to pull together all the documents from a given year for a given client. Then, attaching a tag based on the year and creating a smart folder to pull all files under a given client name folder tagged with a given year would solve that problem for you. [Note that once you have grouped your files by category, you can create a Finder tag (“2010”) and then select all files based on that year and drag them onto the tag in Finder to apply the tag to the files all at once.]

  4. Now comes the hard part, which is actually getting the files into that folder structure. I think @anon41602260 is right on here. You need to break things down into manageable chunks. For example, get all the files for ClientA identified and dragged into the ClientA/Unsorted folder. From there, you go through the Unsorted folder and sort further into category, year, whatever. The idea is to find all the files for a particular client and get them into that client’s folder. You can then either find all the files for another client and similarly move them to ClientB/Unsorted, OR work on the files for ClientA/Unsorted, depending on your mood at any given time.

  5. Don’t try to do this all at once! It is a daunting task and better taken in chunks. For example, I am in the process of applying my new renaming scheme for files and folders to my huge backlog of documents, and I basically pick a folder or two or three and rename and reorganize at any given time. I have built my own renaming tools to make this process efficient, but I don’t try to do it all at once. Eventually I will have everything done, but in the meantime new files created wind up properly named and sorted as well.

  6. In terms of software tools, I don’t think Hazel is the right tool for this job. Hazel is great for ongoing monitoring and processing of files, but isn’t really ideal for one-time processing which is your problem right now. It may have some uses. For example, if you had a scenario where you wanted a large group of files renamed with a prefix of the creation date and a specific tag added, you could of course create such a Hazel rule for a “Temp” folder, drag all relevant files to that folder, let Hazel do its magic, then drag those files to their appropriate final location. I do actually use Hazel in this manner, although I often use a KM macro or custom python script for this purpose as well. Sometimes it is faster for me to create a Hazel rule than a KM macro or a python script, so I do whatever seems fastest / easiest for a given task.
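For the "find everything for ClientA and get it into ClientA/Unsorted" step, here is a rough Python sketch of how a one-time script could do it. It assumes the client's name (or some other search term) appears somewhere in the file name or in one of its parent folder names, which will not catch everything; the function names are mine. It defaults to a dry run that only reports what it would move.

```python
import shutil
from pathlib import Path
from typing import List, Optional, Tuple


def find_client_files(search_root: Path, term: str,
                      skip: Optional[Path] = None) -> List[Path]:
    """Find files whose name or parent folder names contain `term`."""
    needle = term.lower()
    matches = []
    for path in search_root.rglob("*"):
        if not path.is_file():
            continue
        # Ignore anything already filed under the destination tree.
        if skip is not None and skip in path.parents:
            continue
        relative_parts = path.relative_to(search_root).parts
        if any(needle in part.lower() for part in relative_parts):
            matches.append(path)
    return sorted(matches)


def move_to_unsorted(files: List[Path], client_root: Path,
                     dry_run: bool = True) -> List[Tuple[Path, Path]]:
    """Plan (and optionally perform) moves into client_root/Unsorted."""
    unsorted = client_root / "Unsorted"
    taken = set()
    plan = []
    for src in files:
        dest = unsorted / src.name
        counter = 1
        # Avoid overwriting files that share a name.
        while dest in taken or dest.exists():
            dest = unsorted / f"{src.stem}_{counter}{src.suffix}"
            counter += 1
        taken.add(dest)
        plan.append((src, dest))
    if not dry_run:
        unsorted.mkdir(parents=True, exist_ok=True)
        for src, dest in plan:
            shutil.move(str(src), str(dest))
    return plan
```

I would review the returned plan before calling it again with `dry_run=False`, and make sure there is a current backup first.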

If you were to provide some specifics regarding the files (how they are currently labelled and named, what metadata they carry, etc.) and an idea of where you are trying to get to (assuming you can provide this info on work product), we could try to give some more specific suggestions.


I second all of what @nlippman suggested – great ideas.

Except maybe this: “Don’t try to do this all at once”. Actually, depending on your tolerance, setting aside a day and just crunching through the exercise can turn out to be more productive than doing the work in dribs and drabs. Organizing files is the kind of work that can get you into a sort of zone of productivity – the day starts off looking really dreadful, then gets better after progress starts, then better and better, and by evening it’s all done and looks great.

@anon41602260: True enough. I often wish I could take two days off and just crank through this myself, but since the reality is that I can’t, I think that not trying to get too much done at once makes it possible for me to make progress, because I accept just spending 15 minutes when I have the time to move the project forward. If @CharleneB can actually take off the time, it would make sense to just get the job done.

Thank you!

Sounds like I will have to do a lot of ‘combing through and grunt’ work till I can utilize Hazel. Most of the files are PDF or AI, there are photos etc. that also belong within. I typically set up by client name and then year and individual project but as I mentioned over time some of these have gotten messed up.

I need to get at it … looking for a miracle!

Great ideas … I think I was hoping for a mind reading miracle app to sort these out! I will get at it.

Excellent information! No shortcuts for me!

Yes, I think this needs to be attacked as a 1 hour per day project as there are a lot of files etc.

@CharleneB:

I think that if you get your organization scheme and your tools set up in advance, you can really expedite this process and make it a lot less painful.

As an example, in about 2 hours on Saturday night I sorted out over 1000 unfiled files I had accumulated. While my approach and circumstances are highly specific and unlikely to be directly applicable to your situation, the concepts may help you to think through your process.

I have been paperless for years, and everything is scanned or downloaded, including all bank statements and bills. However, I went through a period where I was just scanning but not sorting or collating, and wound up with >1500 unfiled files in a folder (of course called “to file”). With my new schema for file naming and sorting I decided to process the backlog. Here’s what I did.

My files eventually wind up in a folder tree under “ScannedDocuments.” (They are not all scans, but that’s just what I started with and never changed the top level folder name.) Under that I have subfolders, two of which are called Bills and BankStatements.

I have a folder called Dispatch, synced between laptop and desktop, where I now put things that need to be filed/processed.

I created a folder under ScannedDocuments called Process as a staging area.

I have also recently (documented elsewhere) created some custom python scripts that reformat filenames into the schema I am now using (date and filename embedded, words separated by underscore, no spaces) and a script that uses tags to move files to major sorting areas and into subfolders. To move to the ScannedDocuments folder a “#Doc” will do it, and a “:Process” tag will move into the Process subfolder. I have talked about this scheme elsewhere.
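To give a flavor of the renaming side, here is a minimal Python sketch of a filename normalizer. It is not the actual script; the exact layout (ISO date first, then the words joined by underscores, lowercase extension) is an assumption chosen for illustration, as is the function name.

```python
import re
from datetime import date
from pathlib import PurePath


def normalize_filename(original: str, day: date) -> str:
    """Return 'YYYY-MM-DD_Words_Joined.ext' with no spaces.

    The date-first layout is an assumed convention, not a fixed standard.
    """
    path = PurePath(original)
    # Keep only runs of letters and digits; spaces and punctuation are dropped.
    words = re.findall(r"[A-Za-z0-9]+", path.stem)
    base = "_".join(words) or "untitled"
    return f"{day.isoformat()}_{base}{path.suffix.lower()}"


if __name__ == "__main__":
    print(normalize_filename("checking account .pdf", date(2019, 3, 1)))
```

The date is passed in rather than read from the file, so the same function works whether you take it from the creation date, the document contents, or type it in by hand.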

Hazel of course does the heavy lifting, with rules I have been creating on the Process folder to handle all different kinds of files.

To sort my files, I went to the “to file” folder and just selected a particular group of files, let’s say “checking account .pdf”. After selecting all the files, I launched a Keyboard Maestro macro that pops up a palette of my renaming tools, and selected the one to rename the files to the proper format. Once renamed in Finder they remain selected, so I can pop up another KM palette that lets me assign both of the tags with one click. I then drag them to the Dispatch folder, where a Hazel rule detects that there are sorting tags on the files and launches my filing by tag script which figures out they belong in the ScannedDocuments/Process folder and moves them there.
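The tag-to-folder decision is simple enough to sketch in Python. This is only an illustration of the idea, not the real filing script: it takes the tags as plain strings (reading Finder tags from a file is left out), and the only area mapping it knows is the “#Doc” one mentioned above.

```python
from pathlib import Path
from typing import Iterable, Optional

# "#Doc" -> ScannedDocuments comes from the description above;
# any further entries here would be hypothetical.
AREA_TAGS = {"#Doc": "ScannedDocuments"}


def destination_for(tags: Iterable[str], base: Path) -> Optional[Path]:
    """Resolve sorting tags to a destination folder, or None if no area tag."""
    area = None
    subfolder = None
    for tag in tags:
        if tag in AREA_TAGS:
            area = AREA_TAGS[tag]
        elif tag.startswith(":") and len(tag) > 1:
            # A ":Name" tag selects a subfolder inside the area.
            subfolder = tag[1:]
    if area is None:
        return None
    destination = base / area
    return destination / subfolder if subfolder else destination
```

With tags `#Doc` and `:Process` this resolves to `ScannedDocuments/Process` under the base folder; a file with no area tag is left where it is.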

I then go over to Hazel again and create the proper rule for a file name of that format (e.g. filename matches xxxx-xxxx_text, where x is a single digit). The rule then moves the file over to the correct filing folder (in the tree I have been using for some time).
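For anyone who prefers a script to a Hazel rule, the same kind of match can be written as a regular expression. A small Python sketch, reading "xxxx-xxxx_text" as four digits, a hyphen, four digits, an underscore, then any text; this is a Python analogue and not Hazel's own pattern syntax.

```python
import re
from pathlib import PurePath

# Four digits, hyphen, four digits, underscore, then at least one character.
STATEMENT_PATTERN = re.compile(r"^\d{4}-\d{4}_.+$")


def matches_rule(filename: str) -> bool:
    """True if the name (without extension) looks like xxxx-xxxx_text."""
    return STATEMENT_PATTERN.match(PurePath(filename).stem) is not None


if __name__ == "__main__":
    for name in ("1234-5678_statement.pdf", "statement.pdf"):
        print(name, matches_rule(name))
```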

This process of creating the Hazel rules in the Process folder is very quick, and I can select large groups of files and with a few clicks and a drag have them renamed, tagged, and sorted.

As a result, I was able to process over 1000 of the files in a short period of time.

There are a few other files that will be amenable to this process, leaving behind the “one off” types of files that do not easily follow rules and will have to be hand-reviewed and sorted or trashed. (I don’t create Hazel rules for things that are one-offs or very rare, as I don’t want Hazel to be chewing up processing time for things that won’t happen. For instance, if I have a credit card and then cancel it, so that there will be no further statements to file, I disable the associated Hazel rule so that it won’t enter into the processing loop.)

This is just one scheme that I used over this weekend that really helped me plow through a backlog of filing that I should not have accumulated in the first place. And, by getting all these rules and support automation in place, going forward it will be easy to look at the Dispatch folder weekly or more often as needed, rename and tag files and have them automatically go where they are supposed to go, so I won’t wind up with a mess again in the future (I hope).

Your information has been extremely helpful, very much appreciated! I apologize I didn’t reply sooner (we’ve had a water line break and are working in chaos). I’m going to refer back to this and get started on this in manageable ‘bite size’ pieces of time!

Thanks again!

Glad you found the info helpful.

I hope you didn’t suffer too much water damage and get things up and running soon!

A complete whole house reno … nothing like a water line break to get things moving!