Managing File Retention & Purging: how do you handle it?

How do people typically manage file retention and purging of old files?

My workflow currently consists of saving a file with the format yyyy-mm-dd - tag - subtag or description.extension.
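For anyone who wants that convention spelled out, here is a minimal Python sketch of splitting such a name into its parts. The `parse_name` helper and its field names are made up for illustration, and it assumes the parts are separated by " - " exactly as written above and that the tag itself contains no hyphen.

```python
import re
from typing import NamedTuple, Optional


class ParsedName(NamedTuple):
    date: str       # yyyy-mm-dd prefix
    tag: str        # the tag used for filing
    rest: str       # subtag or free-text description
    extension: str  # extension without the dot


# Assumed layout: "yyyy-mm-dd - tag - subtag or description.extension"
_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) - (?P<tag>[^-]+?) - (?P<rest>.+)\.(?P<ext>[^.]+)$"
)


def parse_name(filename: str) -> Optional[ParsedName]:
    """Split a filename into date, tag, rest and extension, or return None."""
    m = _PATTERN.match(filename)
    if m is None:
        return None
    return ParsedName(m["date"], m["tag"].strip(), m["rest"], m["ext"])
```

Names that do not follow the pattern return `None`, which is a convenient signal to leave a file alone rather than guess.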

The file sits in a ‘24 hour’ folder (which makes it easier to find when I want to email it, or work on it after a short break).

After 24 hours Hazel moves it to the ‘AutoFile’ folder from where it gets filed using the - tag - and, if appropriate, sub filed by date or a subtag (if it exists) - there are different rules for different filename tags.

All of this is great for the creation, distribution and filing side of things, but there is zero differentiation between stuff I need to keep for a few weeks and stuff that needs to be retained for a few years or in perpetuity.

It is not worth differentiating between a few days and 6 months (disk space is cheap) so I would probably have 6 months as a default for ‘review’ if no other time is set.

I am toying with the idea of adding a code to the filename that Hazel could then recognise, replace with a ‘proper’ macOS tag and strip from the name. Something along the lines of R1Y (Retain 1 Year), R6M (Retain 6 Months), RPY (Retain forever) etc.

Hazel would translate a file saved today (November 25) with R1Y into the tag ‘DEC26’, R6M into ‘JUN26’ etc.
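The date arithmetic behind that translation can be sketched in Python. This is only an illustration of the idea, not how Hazel does it: the `review_tag` function is hypothetical, and it assumes the review month is the month after the retention period ends (which is what makes R1Y in November 2025 come out as DEC26) and that RPY gets no review date at all.

```python
import re
from datetime import date
from typing import Optional

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def review_tag(code: str, saved: date) -> Optional[str]:
    """Turn a retention code such as 'R6M' or 'R1Y' into a review tag such as 'JUN26'."""
    if code == "RPY":
        return None  # keep forever: no review month
    m = re.fullmatch(r"R(\d+)([MY])", code)
    if m is None:
        raise ValueError(f"not a retention code: {code!r}")
    count = int(m.group(1))
    months = count * 12 if m.group(2) == "Y" else count
    # Count months since year 0, add the retention period plus one extra month.
    total = saved.year * 12 + (saved.month - 1) + months + 1
    year, month_index = divmod(total, 12)
    return f"{MONTHS[month_index]}{year % 100:02d}"
```

Because the number in the code is parsed rather than hard-coded, the same logic covers R2Y, R8Y and so on without a separate case for each.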

Come December 2026, a Smart Folder could collect all the files tagged ‘DEC26’ for review. The vast majority would be sent to the bin; if, on reflection, a file needs to be kept, its tag is changed manually to the new review date.

Before I go to the effort of setting this up (and seeing if I can get Hazel to work out the tag needed for R6M, RxY (where x is a number) without needing to update the rules each month) - is there a better way?

My answer might be quite boring because my retention and purging policy is a manual procedure with no automation at all (although I have Hazel installed):

  1. As long as I have to work with files, they very often sit in my desktop folder. Yes, I know, not very fancy. I try to keep my desktop folder clean, basically using it like an inbox that has to be emptied in the long run. If I cannot deal with files sitting in the desktop folder within days, I will file them into a folder structure that I set up almost two decades ago (personal, work and some hobby folders with a lot of subfolders) and I might set a reminder to deal with stuff later.
  2. When I am done working with files, I decide whether I have to or want to keep them. If no, I will delete them right on the spot. If I want to keep them, I will file them in my folder-based system.

And that’s it. I do not purge anything because, as you have said, disk space is cheap. And I have to say that it has happened more than once that I was quite happy to be able to go back to files that were created almost 30 years ago.

So, how does it even work, to find stuff?

Well, first of all, my file names start with the date yyyy-mm-dd, too, followed by a meaningful description. Then they sit in the mentioned folder structure. And last but not least: Find Any File always has been able to find whatever I needed to look for.

Your approach does sound very interesting, if you want to automate a purge routine. It will be no real help, but if I chose to purge stuff automatically, I would consider implementing a similar routine. It really sounds thought through. I myself am not comfortable having an app do the purging. It is something I prefer doing myself. :slight_smile:

Again, your approach sounds very interesting and I am looking forward to others chiming in.

My workflows are more project-based. I have main folders to contain these active projects. When I am done with the project, it goes into an archive folder on my NAS.

If you are dealing with lots of temporary files like iteratively generated PDFs, maybe delete them (manually) as you go (clean up at the end of the week, and definitely before you archive). You could also choose to store those temporary files somewhere like a Temp folder or your Downloads folder, then have Hazel automatically purge files older than a month.
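Hazel handles this natively, but for anyone without it, the "older than a month" check is easy to sketch in Python. The `stale_files` helper below is hypothetical; it assumes "old" means last modified more than 30 days ago, and it only lists candidates rather than deleting anything, so the result can be reviewed first.

```python
import time
from pathlib import Path
from typing import List, Optional


def stale_files(folder: Path, max_age_days: int = 30,
                now: Optional[float] = None) -> List[Path]:
    """Return files directly inside `folder` last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

A call such as `stale_files(Path.home() / "Downloads")` gives a list to look over; moving the results to the bin is deliberately left as a separate, manual step.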

I use Hazel to help rename and tag files, but not to move them, as I don’t want to lose track of things. I do have it purge my Downloads folder of old files based on date.

I generally place all new and temp files in Downloads, and place all files that I am currently working with on the Desktop. I regularly clean up Downloads, deleting most of what’s there, and storing most of the rest for long-term use in DEVONthink (90%) or in relevant folders in my Documents hierarchy. If I’m working with files on the Desktop (in folders for the relevant app I’m using), then when I’m finished I’ll decide if the file is a keeper or not.

Of course, a lot of apps are designed to have their own folders, especially apps that I use on mobile and desktop platforms. I try to regularly cull those app-specific folders, but that’s more of a challenge.

This is all not scientific in the least. I just try to leave files where I’m likely to stumble on them later and realize I need to get rid of them.

Katie

For work files, unless they include personal data, there is no retention period.

For personal data they are added to the relevant system and then fall inside the system’s retention.

I have a rudimentary system that:

  • Moves files from the desktop, into Downloads, after 2 days
  • Moves specific file types from Downloads to type-specific folders after a week (images and PDFs, I used to have one for DMGs and ZIPs)

This is all Hazel. I wonder if it can do your bidding simply by adding a tag for the retention period and having a rule for each one.

I retain almost everything and just buy a bigger Mac every 5 years. I currently have a 4TB SSD. Dabbling in software development again is eating up space. In addition I do some video work.

After Black Friday, I will likely own a 2-4TB external SSD for older video assets.

When reading this, I wonder: doesn’t it take more time to organise and maintain the whole system than to be productive and work?
Please do not misunderstand me, I just wonder why you need such a “complex” system.
I also “produce” files (design, drafts, texts, …) and I organise these in project folders. These files have similar titles that are easy to connect with their content. And if it is necessary to add the date, I place “2025 11 26” (“yyyy mm dd”) at the beginning of the file title. I just type “fdt” and Typinator changes it into the current year, month and day.

I bought a 4TB SSD with my current (M4 Pro) MacBook Pro. The reason being to carry ALL of my photos with me when travelling. Yes, a Samsung T7 was plenty fast and large enough, but it’s just enough friction that I hate it.

So, woohoo, I splashed out. Treated myself to a 4TB laptop. Since then I have been learning the major downsides!

  1. Selling it is very difficult. Few people want to pay a premium for 4TB and those who do don’t want to pay as much as you think they should. (I had an M3 Pro for about 6 months, but it was 16” and just too big for travel. It took me months to sell and I took a bath on the price.)
  2. Getting a quiet, reliable backup device is a nightmare. I set up Time Machine to my Synology. It’s… workable. My Mac filesystem locks up at the start of backups, not all command line Time Machine tricks work with it, and it’s not exactly fast, nor quiet. Desktop drives that are large enough (>4TB) are noisy. SSDs that are large enough are both rare and hideously expensive.

So yes, I regularly look for what’s taking up space and banish it either to the NAS or in some cases a dedicated drive. I’ve solved the backup problem for now with a workaround — I’ve excluded my photos from Time Machine because I already archive them to the NAS and they get included in Backblaze. With those out of the picture, my system is really only about 1TB, so I have a fairly quiet 2.5” WD MyPassport 4TB drive that serves for Time Machine. Oh, and there seems to be no such thing as a quiet, 2.5” HDD >5TB.


PS. Before I moved up to a 4TB laptop, I had a 1TB laptop. Now, when I see I have only a little over 1TB free, I start frantically searching for the space hog! :rofl:

I see no problem. Says man justifying his own behaviour.

I have 4TB M3Max for similar reasons. I gave up on Time Machine backups. They were unreliable.

At my desk I have an extra 4TB SSD for backups. I use Carbon Copy Cloner to do the backup. By the time the extra SSD becomes too small, I’m assuming that 8TB SSDs will be practical.

I also have an array of disks attached to an old M1 MacBook. It’s the ‘NAS’ for the house.

I get nervous when I see only 2GB left. This is supposed to be a 5-year computer and this is only year 2.

Neither do I. But I prefer to keep as much data as possible on external drives, a habit left over from my days of managing servers. It makes troubleshooting much easier; avoiding Apple’s insane storage prices is just a bonus :grinning:

I back up to B2, and a local SSD, using Arq. But I also back up to Time Machine because, if it works, I can “nuke and pave” my MacBook in less than 45 minutes. Something I had to do multiple times last year when trying to upgrade to Sequoia.

Thanks for the feedback so far.

I have done some more thinking and a bit of testing. TL;DR: the idea will work. Hazel and smart folders can do what I want.

I’m still interested in other solutions/experiences if anyone has them - at the moment it appears there are two camps: a) Get a bigger disk and keep everything or b) clean it up as you go along (ah, if only I were that disciplined…)

A bit more background, this is my own stuff and files for a small business that I run - not a corporate with big IT and defined retention rules or a regulated business with mandated retention/deletion.

My workflow has me occasionally create ‘iterations’ of documents, and I am not good at deleting them once the final version exists. There are also files that might be needed for a month or so, but not long term.

The result - I have lots of files that I will never need again - they clutter up search results (even with HoudaSpot doing the searching) - so getting rid of unwanted stuff is a good habit to get into.

My plan is for a manual review/purge once a month (or 2). This would involve manually updating the criteria for a smart folder to search for the appropriate tag (2025-11 if it was this month), then reviewing the file names and deleting the files or changing the tag if I decide to keep them.

I think 4 rules would be sufficient:

  • Probably not needed but hang onto for a while ‘just in case’ = keep for 6 months. I’d make this the default and rely on the review in 6 months time to identify the files that should be retained.
  • Needed until year end plus a bit = keep for a year and a bit
  • Keep for 7 full tax years because the Tax man might need it (they never have but they might) = keep for 8 years.
  • Keep forever (or until the project is purged - if they have been sorted into a project) = set a tag of RPY

Hazel will do what I want with one rule per retention ‘code’.

  • Make the retention code the end of the filename, e.g. it has " R6M."
  • Use a match rule in Hazel to get the filename (fname) ahead of the code and everything after the code (extn).
  • If a filename matches fname code extn, then set the macOS tag appropriately.

Hazel has a great way to do this using the Dynamic Tag in actions: it can set a tag based on the opening date of the file, adjust the date by 6+1 months (for R6M) and strip out the days part, giving a macOS tag in the format yyyy-mm.

Now save the file as fname.extn (which strips out the retention code from the original file name) and the rest of the move and file rules deal with putting the file in the correct place in the file system.
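To make the match-and-rename steps concrete, here is a rough Python equivalent of what one such rule does. It is a sketch of the logic only and says nothing about Hazel's own pattern syntax: `split_retention` is a hypothetical name, the caller supplies whichever date the rule is based on, and RPY files are assumed to keep a literal RPY tag as described above.

```python
import re
from datetime import date
from typing import Optional, Tuple

# fname, then a space, the retention code, a dot and the extension (extn)
_CODED = re.compile(r"^(?P<fname>.+) (?P<code>R(?:\d+[MY]|PY))\.(?P<extn>[^.]+)$")


def split_retention(filename: str, reference: date) -> Optional[Tuple[str, str]]:
    """Return (clean filename, tag) for a coded filename, or None if no code is present."""
    m = _CODED.match(filename)
    if m is None:
        return None
    code = m["code"]
    if code == "RPY":
        tag = "RPY"  # keep forever
    else:
        count = int(code[1:-1])
        months = count * 12 if code.endswith("Y") else count
        # Retention period plus one month, days dropped, formatted as yyyy-mm.
        total = reference.year * 12 + (reference.month - 1) + months + 1
        year, month_index = divmod(total, 12)
        tag = f"{year:04d}-{month_index + 1:02d}"
    return f"{m['fname']}.{m['extn']}", tag
```

For example, `split_retention("2025-11-25 - invoice - acme R6M.pdf", date(2025, 11, 25))` gives the cleaned name `2025-11-25 - invoice - acme.pdf` and the tag `2026-06`.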

To find the files when you need them, create a smart folder that includes macOS tags as a criterion (this isn’t in the main list by default but can be found buried in ‘Other’ when you first create the smart folder) and type the appropriate tag, e.g. 2025-11. Finder will pull together a list of all files on the machine with that tag (irrespective of whether they were originally R6M, R2Y or R8Y). The criteria could be tweaked to separate the ‘recent’ files for review/purging (created in the last 7 months) from older files (created more than 7 months ago).
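The same search can probably be run outside Finder through Spotlight's `mdfind` command. The Python sketch below assumes that Finder tags are exposed through the `kMDItemUserTags` attribute and that `mdfind` accepts the query in this form; it is macOS only and worth checking on your own machine before relying on it. Both function names are made up for illustration.

```python
import subprocess
from typing import List


def tag_query(tag: str) -> str:
    """Build a Spotlight query for files carrying the given Finder tag (assumed attribute name)."""
    return f'kMDItemUserTags == "{tag}"'


def files_with_tag(tag: str) -> List[str]:
    """Run mdfind (macOS only) and return the matching paths, one per line of output."""
    result = subprocess.run(
        ["mdfind", tag_query(tag)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Calling `files_with_tag("2025-11")` would then return the paths due for review that month, which could be handy for a scripted monthly reminder.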

A quick test shows that it works in theory.

Not sure what I am going to do with the 43,000 items that are currently in my ‘filing’ folder :slight_smile: . I suspect I will wait until I have very little else to do and start with a search for files last opened more than x years ago that do not have an RPY tag.

This is a good observation. It’s probably why some of my documents are clearly named and organized by Hazel tiles, and others (probably the majority) are in a “file later” folder.

I organize them as I need to, and it doesn’t take long for Hazel to go through the pile after I create a new rule.

Spotlight does a good job of indexing and finding files, so I almost never use my folder structure to locate things.

It is a good idea to rename files from their default. I have several sources that I download files from where the file name is always something like “document.pdf”, so the downloads would overwrite each other.

“There can be only one!” - Connor MacLeod on document.pdf files.

Not a solution, just a couple of things that work for me.

Have you considered not filing temporary documents in the same folders as your permanent files? Given your current naming format would using a “6 month” and a “current year” folder be simpler than purging files later from your main file structure?

I dumped a ton of “just in case” files during the great purge of 2020. Some were decades old. Do you have a reliable backup system that you test regularly? That’s where my just in case files reside today.

Do they contain sensitive data? How much disk space do they consume? If your answer is 1. No, and 2. Less than 2TB, would it be worth $10 or less per month to upload them to iCloud or Google? I’m guessing (hoping) that’s not 6 months of data or I would hire someone to deal with it.

I get 2TB of storage with my Google Workspace account and my email, files, and photos, take up less than half of that. So my downloaded YouTube videos and everything else that I accumulate that might be “fun to have” gets uploaded to a separate part of Google Drive away from everything else.

And from time to time gets noticed and deleted. :grinning:

I certainly wouldn’t recommend that as a strategy (of course, YMMV). Backups should be there as a fallback position, not as a retention method.

If I was hell bent on retention, each project folder would be dated and then either (in the example above) go in the 6 Months folder or be tagged for retention for 6 months. I’d then run searches once a month and delete those which fall outside the retention.

I run a similar process for emails. Emails I get regularly (e.g. notifications from particular systems) are tagged for retention for 1 month, 3 months, 1 year, 3 years or 7 years. I also manually tag certain emails similarly. Then, monthly, I search for tagged emails which fall outside the retention and permanently delete them.

Storage is so cheap these days (not Apple Storage) that it’s not worth the time for file retention for me.

A few things I do that haven’t been mentioned:

  • I use _ rather than - to separate year,month,day so it doesn’t look like a math equation.
  • I’ve got a keyboard substitution (using Alfred) that inserts the date stamp YYYY_MM_DD into file names that aren’t stamped already.
  • I have a tag called “Sticky” that I put on desktop files/folders that I want to stay around. After three days, Hazel moves anything unmarked into one of several “Clean Desktop” folders that are on the desktop.
  • Scanned receipts are purged by Hazel after 6 months. Nothing else gets deleted by age. You never know what you might need, and storage is cheap.
  • Anything in Downloads that isn’t moved by Hazel to a final destination is tagged in RED after three days so I know I need to manually move it somewhere else.
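The "stamp it only if it isn't stamped already" part of the date-stamp habit can be sketched in a few lines of Python. This is not how Alfred does it; the `stamp` helper is hypothetical, and putting a single space between the stamp and the original name is an assumption.

```python
import re
from datetime import date

# A name counts as stamped if it starts with YYYY_MM_DD.
_STAMPED = re.compile(r"^\d{4}_\d{2}_\d{2}")


def stamp(name: str, today: date) -> str:
    """Prefix `name` with today's date as YYYY_MM_DD unless it already starts with a stamp."""
    if _STAMPED.match(name):
        return name
    return f"{today:%Y_%m_%d} {name}"
```

Running it twice on the same name is harmless, since an already stamped name is returned unchanged.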

I didn’t. It was something “that worked for me”. :wink:

In my case, my “just in case” files were extra on-disk copies of important, but not critical, files that I’ve never needed. I was keeping extra copies of files in case all three of my backups failed.


I use server side rules to sort my incoming mail. Anything that I might not want to keep is flagged for review and most is deleted immediately.

For very important files, in addition to backing them up in lots of places, you should also consider an archive. The terms are often used interchangeably, but in my lexicon:

  • Backup — A copy of “live” data. Over time it ages out.
  • Archive — A permanent copy of data, live or legacy. It never ages out.

I archive my photos (live data) but nothing else. I use rsync to my NAS and rclone to B2 to copy-but-never-delete the files. This is to protect against something that happened to me many years ago. I managed to delete 3 months’ worth of photos but did not discover this until years later. By that time, they had also aged out of all backups. Fortunately, I had archived an entire Mac before upgrading and they happened to be in there and I happened not to have cleared that drive. Lucky!
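The copy-but-never-delete idea can be illustrated with a small Python sketch. It is not a substitute for rsync or rclone, and it is stricter than either: the hypothetical `archive` function below copies only files that are missing from the destination and never overwrites or removes anything already there.

```python
import shutil
from pathlib import Path
from typing import List


def archive(source: Path, dest: Path) -> List[Path]:
    """Copy files missing from `dest`; never delete or overwrite what is already archived."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        target = dest / src.relative_to(source)
        if target.exists():
            continue  # already archived: leave it untouched
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

Deleting a photo from the source and running the function again leaves the archived copy in place, which is exactly the protection against an unnoticed deletion.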

I’m so used to having a 250 GB or less internal drive, that I habitually “glean,” archiving to external drives or cloud servers. Now I have a terabyte drive and it’s just under half full.
