High capacity, infrequent-access storage - best options?

I have a fair bit of data that I don’t need to access with any real frequency, but it must be available, and available reasonably quickly (within an hour or two) when I need it. Lots of backup archives for client files, etc.

My default MO back in the day was to burn a CD/DVD for each client and put them in a binder. Lately I’ve switched to just buying either internal or external hard drives, and disconnecting them as necessary.

I almost have to believe that a hard drive sitting in a box, only connected for a few hours a month, will probably last until the technology moves on and the drive interface is obsolete.

But is there a better way? Or am I already about as efficient as it gets, price-for-value wise?

You don’t mention how much data you’re talking about.
Here’s a link to Backblaze’s B2 cost calculator; it works out to around $100/yr for 1TB.

I have two concerns with your setup:

  • vulnerability - is this the only copy of these customers’ data? What if you have a fire/theft/power anomaly/etc.?
  • bit rot - data on hard drives can degrade over time. There’s good info in the video attached about ZFS, an error correcting file system designed to mitigate this vulnerability.
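On the bit rot point, one low-tech way to at least detect silent corruption on a drive that sits in a box is to keep a checksum manifest and re-check it whenever the drive is plugged in. This is only a minimal sketch, not something taken from the video: the function names (`sha256_of`, `build_manifest`, `verify`) are made up for illustration, and unlike ZFS it can only tell you a file changed, not repair it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose hash has changed."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The idea would be to run `build_manifest` once after writing an archive, store the result somewhere other than the drive itself (for example as JSON), and call `verify` on each monthly reconnect. Anything it reports would need to be restored from a second copy.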

Given those two concerns, and the pricing of B2, it might be the better choice. Of course that makes you vulnerable to internet outages.

Hm. Another option would be regular Backblaze, and attaching each drive for a while once a month to ensure the data is retained.

Maybe 4 or 5 terabytes, periodically rotating as new data gets added and old snapshots get thinned.
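For scale, at the roughly $100/yr per TB ballpark quoted above for B2, that range works out as follows. This is just back-of-the-envelope arithmetic: the rate is the thread’s estimate rather than a current price quote, and `annual_cloud_cost` is a hypothetical helper name.

```python
def annual_cloud_cost(terabytes: float, usd_per_tb_year: float = 100.0) -> float:
    """Estimate yearly storage cost from a flat per-terabyte rate (assumed)."""
    return terabytes * usd_per_tb_year


for tb in (4, 5):
    print(f"{tb} TB: about ${annual_cloud_cost(tb):,.0f} per year")
```

That ignores download (egress) charges, which only matter when data is actually pulled back.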

There’s also an equivalent amount of personal data (old video files, etc.) that’s not nearly as critical. In that case I really do just want something that’s more of a “treat this like a burned DVD and put the disk in when necessary” sort of thing.

There’s a secondary copy on Backblaze (I connect the drives often enough for them to stay “verified”), with the upgraded 1-year retention. And this is data the customers are also responsible for keeping their own copies of, so the consequence of loss would be less than if I had some sort of agreement with them to warehouse their data.

I’ve heard ZFS is sketchy over USB. Do you have any experience with it?

No, I don’t. It would be something to use on a NAS or server with several drives so the probability of rot would be distributed among the drives, increasing the odds of recovery.

I’ve thought about trying it on my Synology, but don’t have a lot of time to devote to it.

I have a DS918+, but not sure four drives would be enough to implement it properly.

Using RAID goes counter to the design philosophy of ZFS, which knows about drives and failures and how to decrease risks. See RAID-Z.

I just don’t trust Time Machine anymore due to the periodic failures I’ve experienced.

Everything on my NAS is in at least five other places, so if I tried ZFS it would just be for fun. Or “fun” as these things sometimes turn out to be 🙂

Yeah, sorry too, I misinterpreted.

As far as I’m aware, ZFS isn’t available on Synology; they use Btrfs to guard against bit rot?

ZFS may be accessible if you ran a virtual machine on the Synology and used ZFS within the VM?

For 4-5TB, I’d stick with the hard drive route as well. For smaller amounts I might have considered SSD or even Blu-ray, but those probably aren’t the greatest options either.

Or possibly Amazon Glacier, as then they protect it and it’s pretty cheap as a “what if”.

I wouldn’t trust a single drive for anything. The infrequent data needs the same sort of backup strategy of any other data. I use multiple rotating backups, physical and cloud, with at least one of the physical copies always offsite. I’ve got files going back nearly 40 years.

This situation sounds like our requirements for some files at work. Hard drives in a good, heavy fire safe plus a copy on Glacier serve us well. Glacier’s egress costs are significant but would be paid by insurance in the event of loss of the safe. Glacier’s durability claims are also much stronger than B2’s. Glacier retrievals tend to take about as long as someone going to the safe. Since these are archives of wrapped-up projects, we don’t require any continuous sync solution.

What I switched to recently was using an old Mac mini as a server holding backups for all our computers as well as our archive and media files. That Mini is backed up locally and to Backblaze. About 3TB of data. Changing my drive enclosure to RAID would provide additional fault tolerance, but that’s not something I need right now. Total cost was negligible: a few hundred for the used Mini, a hundred for the enclosure, and I already had 4TB drives, monitor, keyboard, etc.

Not that helpful perhaps, due to the amount of data you need to back up, but I discovered M-Discs today.

Even with the BDXL discs (100GB), probably unsuitable for your use case.

Actually, that might work well for some stuff. Do you use them? What’s your experience?

I haven’t used them. I only saw them yesterday in a passing comment on Reddit. I currently don’t have anything that would burn them either, as I have no computers with a disc drive! However, I’m investigating, as it could prove a way of getting rid of a number of items, and could be a good way of archiving some family photos.

I have my original media so I quit backing up my ripped movies and music. That leaves less than 1TB of critical data, which gets backed up hourly to Backblaze B2 and nightly to an external USB drive. An additional copy of everything older than 12 months is kept on AWS Glacier. My yearly cost is less than I spend on 2 months of entertainment and I don’t have to worry about ‘bit rot’, drive failures, etc.

I figure if JPL and NASA trust AWS, my stuff is safer with them than with me.
