Advance warning: this post is probably going to be too long.
I have been thinking about how I name and organize my files. I wanted to throw out a collection of my thoughts to see what others think and might suggest. Like probably everyone else here, I have a huge number of files which I have tried to organize in some fashion that allows me to find things, but I often find myself searching (sometimes unsuccessfully) for what I need. So here’s what I have been trying to think through.
My current scheme is basically to name files with something relatively descriptive, and then to put them into a rather complex folder structure that in theory lets me chase down what I need by drilling down through the folders. I often append a date (in yyyy.mmdd format) to files when “relevant.” For example, my checking account statement for April 30th would be called “checking statement 2019.0430.pdf” and would eventually be filed somewhere like Scans/Bank Statements/Checking Account/checking statement 2019.0430.pdf.
I made the decision to append the date rather than prefix it in the filename so that if I had multiple different types of files in a given folder they would sort by the type of file, eg all the “checking account” files would sort together as would all of the “savings account” files if they were in the same folder, while if I prepended the date, then they would be intermixed. This is, in retrospect, unnecessary if these files are in different folders, of course.
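To make the sorting difference concrete, here is a small Python sketch. The filenames are made up for illustration, the helper `move_date_to_front` is hypothetical, and it uses plain string sorting, which may differ in detail from how Finder orders names.

```python
# Hypothetical filenames, following the naming example in this post.
suffix_dated = [
    "savings statement 2019.0331.pdf",
    "checking statement 2019.0430.pdf",
    "savings statement 2019.0430.pdf",
    "checking statement 2019.0331.pdf",
]


def move_date_to_front(name):
    """Turn 'checking statement 2019.0430.pdf' into '2019.0430 checking statement.pdf'."""
    stem, ext = name.rsplit(".", 1)           # split off the extension
    description, date = stem.rsplit(" ", 1)   # the date is the last word of the stem
    return f"{date} {description}.{ext}"


prefix_dated = [move_date_to_front(name) for name in suffix_dated]

# Date at the end: names group by type first, then by date within each type.
by_type = sorted(suffix_dated)

# Date at the front: names run chronologically, with the types intermixed.
by_date = sorted(prefix_dated)

for name in by_type:
    print(name)
print()
for name in by_date:
    print(name)
```

With the date at the end, both checking statements come before both savings statements. With the date at the front, the March files of both types come before the April files.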
I use tagging in a fairly limited fashion so far, primarily as a trigger for Hazel to be able to sort files. For example, I have a single folder called Dispatch where all scans wind up, and once I add a tag “personal” then Hazel can move the file into the folder tree for my personal files, where it can then be further sorted based on filename or content to the proper folder, while files tagged “business” can be similarly sorted by Hazel into business folders. I haven’t adopted a more extensive tagging system beyond a few special cases to assist in sorting and finding.
Part of the problem here is that a) my rules for Hazel are incomplete, so the folder of things that need sorting is generally packed with files, b) many things that are scanned or otherwise received or created are one-offs that don’t necessarily fall into the preset Hazel rules, and c) a mistake in naming the file causes it to remain unfiled.
Recently there was a posting here referring to an old blog post by “Dr. Bunsen” where he talked about how he handled files, and I found that post very interesting. He names all of his files with a prefix that starts with a date and time (yyyymmdd.hhmmss), followed by his own sort of two-character coding scheme to indicate the type of file (he has a series of categories and urgency levels), followed by a filename that is descriptive, and then an extension which sets the file type (eg .pdf).
I was struck by a few things here. Firstly, he is essentially encoding what would otherwise be considered metadata into the filename (eg the creation date/time). Secondly, by doing it this way, the files in a folder are by default presented in chronological order (with alphabetic sorting, not depending on setting the sort order to “by date”), which has some appeal to me given that I may be more likely to remember roughly when I created a file than what I named it. Arguably, his encoding of the file designator is basically putting a single tag into the filename.
He basically uses only three folders: an Inbox where everything created is initially swept, an Active folder for files he is working on, and an Archive folder for things he is no longer actively working with.
I would not necessarily adopt the three folder approach, as I still think there is value in having to some extent a folder structure to group files related to a specific activity or task, but I am less convinced that the extensive folder tree that I have put together is useful either given the need to search and drill through so many levels to find things. For example, it may be easier to search for all files in a folder named “All Bank Statements” for any files with “checking account” in the name, especially with advanced search tools including Spotlight indexing and HoudahSpot.
Further, the three folder paradigm could be imposed as a meta-folder structure. For example, using tags “active” and “archive” I could have all files swept daily into an Inbox folder, then go through them and add active and archive tags. I would then move the files either into a shallower folder structure for filing away (bank statements) or into a working folder tree for things I am working on, and have smart folders based on files with these respective tags, creating a meta-folder structure. It seems to me this might be a good working paradigm.
In contrast, Brett Terpstra has posted on his blog about a much more complex tagging scheme that he uses with his “tagfiler” script, which auto-files things into what sounds like a fairly complex folder structure based on a hierarchical tagging scheme, which he then depends upon for complex searching. While intriguing, I suspect this will be too complex for me.
I have also debated the idea of incorporating file metadata into the filename (eg the file creation date and a category designator as Dr. Bunsen does) vs using filesystem metadata for this purpose. Using filesystem metadata fits better with the Spotlight-based indexing system of macOS, but makes the files less portable to non-macOS systems. That is not a major impediment, though, as it would be easy to create a script to rename files by inserting metadata into the filename, or to go the other way by extracting metadata from the filename and creating tags with it, etc.
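As a rough idea of what such a script might look like, here is a minimal Python sketch of the first direction only (filesystem creation date into the filename). The stamp layout, the underscore separator, and the function names `creation_time` and `plan_renames` are my own assumptions, and it only builds a list of proposed renames rather than touching any files. Writing macOS tags is not covered, since that needs tools outside the Python standard library.

```python
import os
from datetime import datetime
from pathlib import Path

# Assumed stamp layout (yyyy.mmdd-hhmmss); change to whatever scheme you settle on.
STAMP_FORMAT = "%Y.%m%d-%H%M%S"


def creation_time(path):
    """Best-effort creation time: st_birthtime where the OS provides it
    (macOS does), otherwise fall back to the modification time."""
    info = os.stat(path)
    seconds = getattr(info, "st_birthtime", info.st_mtime)
    return datetime.fromtimestamp(seconds)


def plan_renames(folder):
    """Yield (old_path, new_path) pairs for the files directly in `folder`.

    Nothing is renamed here; the caller decides whether to apply the plan.
    Hidden files (names starting with a dot) and subfolders are skipped.
    """
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and not path.name.startswith("."):
            stamp = creation_time(path).strftime(STAMP_FORMAT)
            yield path, path.with_name(f"{stamp}_{path.name}")


if __name__ == "__main__":
    # Dry run over the current folder: print what would change.
    for old, new in plan_renames("."):
        print(f"{old.name}  ->  {new.name}")
```

Once the printed plan looks right, applying it would be a matter of calling `old.rename(new)` for each pair. Running it twice would stack a second stamp onto the names, so a real version would want to skip files that already carry one.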
Not every file I create would necessarily need such a complex name. For example, my photography library would probably not benefit from adding the timestamp into the filename, because those files are not really organized and utilized the same way as other files in my file system.
I am not sure I would necessarily put a category designation into the filename because I am not sure I could easily create a comprehensive listing of categories I would use a priori, vs simply adding tags as need arises.
So, what I am leaning towards is the idea of naming all files with a leading date-time stamp (yyyy.mmdd-hhmmss), an underscore, then descriptive words for the file content, and the file extension. I would add tags as I presently do to aid Hazel in filing, recognizing that a lot might wind up unfiled and simply be stored in an unfiled folder unless/until a purpose arose for creating a folder to group some files together. Additional tags would be added as appropriate, but the number of tags would be kept relatively small to make it easy to find and search using built-in tools and/or HoudahSpot. My current archive of files would be rearranged (over time, as I had free time for this) to reduce the huge number of folders presently there, largely based on need. For example, the frequency with which I need to pull out the past 2 years of statements for my checking account is very, very low, so searching is a better solution than putting in the effort to sort those files into folders, and searching still makes them easy to find as needed.
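To pin the proposed format down, here is a small Python sketch that builds and checks names of the form yyyy.mmdd-hhmmss_description.ext. The function names `build_name` and `parse_name` and the exact regular expression are my own illustration of the scheme, not anything Hazel or HoudahSpot requires.

```python
import re
from datetime import datetime

STAMP_FORMAT = "%Y.%m%d-%H%M%S"  # yyyy.mmdd-hhmmss

# Stamp, underscore, descriptive words, dot, extension.
NAME_PATTERN = re.compile(
    r"^(?P<stamp>\d{4}\.\d{4}-\d{6})_(?P<description>.+)\.(?P<ext>[^.]+)$"
)


def build_name(when, description, ext):
    """Assemble a filename in the proposed format."""
    stamp = when.strftime(STAMP_FORMAT)
    return f"{stamp}_{description}.{ext.lstrip('.')}"


def parse_name(name):
    """Return (datetime, description, ext), or None if the name does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    try:
        when = datetime.strptime(match["stamp"], STAMP_FORMAT)
    except ValueError:
        # Digits in the right places, but not a real date or time.
        return None
    return when, match["description"], match["ext"]


name = build_name(datetime(2019, 4, 30, 9, 30, 15), "checking statement", "pdf")
print(name)              # 2019.0430-093015_checking statement.pdf
print(parse_name(name))  # the datetime, description, and extension back again
print(parse_name("checking statement 2019.0430.pdf"))  # None: old-style name
```

A check like `parse_name` could also be used to spot files that were named by hand with a typo in the stamp, which is the kind of mistake that currently leaves things unfiled.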
I hope that some of you have made it this far and can provide me with some thoughts and ideas as I start to think about how I will improve my filing system and access to files.