Some further musings on BackBlaze

There has recently been another thread discussing different routes to cloud backup, with some discussion of BackBlaze vs Arq.

I have posted a number of times about the three big reasons that I use Arq and not BackBlaze (see below) but wanted to throw out some thoughts that occurred to me today for comments.

I don’t use BackBlaze even though I acknowledge it to be an excellent product, for three big reasons:

  1. Lack of pooled storage. I have a desktop with about 1.7TB of active data, and two laptops each with relatively small amounts of data on them. If I used BackBlaze I would need three subscriptions, for $180/year (plus tax), while two of the subscriptions would use very small amounts of storage. With Arq into B2, my total storage is still about 2TB, keeping my costs lower, and I can add another computer in with a very small increment in cost.
  2. BackBlaze does not keep files for more than 30 days after deletion, which means that if something is accidentally deleted or otherwise lost, after 30 days it becomes irretrievable. Arq allows me to keep files indefinitely (at the cost of paying for the additional storage). I understand why BackBlaze’s business model makes this a necessary approach.
  3. The same issue applies if an external drive is disconnected; the backup will eventually be deleted. In BackBlaze’s defense, however, they do email you warnings on this as well.
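
The pooled-storage argument in item 1 comes down to simple arithmetic, so here is a rough sketch of the comparison. The $60 per computer per year is just the $180/year for three subscriptions quoted above divided by three. The pooled storage rate is NOT taken from this thread: it is a placeholder parameter (the function and parameter names are made up for illustration), so plug in whatever your storage provider currently charges. The cost of the Arq license itself is left out.

```python
def per_computer_plan_cost(num_computers, price_per_computer_per_year=60.0):
    """Yearly cost when every computer needs its own subscription.

    The default of 60.0 is derived from the post ($180/year for three
    subscriptions); it is not an official price list.
    """
    return num_computers * price_per_computer_per_year


def pooled_storage_cost(total_tb, price_per_tb_per_month):
    """Yearly cost when all computers share one pay-per-use storage pool.

    price_per_tb_per_month is an assumption supplied by the caller.
    """
    return total_tb * price_per_tb_per_month * 12


if __name__ == "__main__":
    # Three computers, about 2 TB in total, as described in the post.
    subscriptions = per_computer_plan_cost(3)

    # ASSUMPTION: 5.0 dollars per TB per month is a placeholder rate only.
    pooled = pooled_storage_cost(2.0, price_per_tb_per_month=5.0)

    print(f"Per-computer subscriptions: ${subscriptions:.2f}/year")
    print(f"Pooled storage (placeholder rate): ${pooled:.2f}/year")
```

The point of writing it this way is that the subscription cost grows with the number of computers while the pooled cost grows with the amount of data, which is why adding a nearly empty laptop changes one figure a lot and the other hardly at all.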

Today a colleague asked me to help him plan out a backup strategy. He has a desktop and several laptops, and no backup plan in place. He is not a tech guy or a power user, and needs a solution that is simple, reliable, and easy to install, understand, and use. The obvious answer is BackBlaze, but again he is looking at multiple subscriptions and a significant cost, while each laptop has very little data to actually store. My answer to him is going to be Arq: a bit more upfront configuration work, but a much lower overall cost to pool data in B2.

That got me thinking along a few different lines:

  1. As a (maybe) more sophisticated user, I could in fact use BackBlaze in a somewhat roundabout way, by having each of my laptops do incremental backups to my iMac Pro desktop (which has plenty of storage) and running BackBlaze on the iMac Pro only. I would have to figure out how to have the laptops do that backup, but I can find such a solution (or just use rsync).

  2. Maybe my thinking has been off base. It occurred to me today that the purpose of the cloud backup may well NOT be, as I have envisioned, to provide the ability to restore any historical file or version. Rather, the cloud backup might be better thought of as a last-resort way of recreating the state of your computer system at the moment it dies / is stolen / is destroyed in a fire, and it is only that state that matters in the cloud. Historical versions can be maintained on a home clone or whatever as needed. There is a small window where you could accidentally delete some files, not realize it, have your house burn down, and then those files are irretrievably lost with this approach, however.

  3. If I have my laptops make a daily clone to the iMac Pro, using CCC’s technique for storing changed and deleted files in a rescue folder, I could have BackBlaze handle all the files in those folders as well. BB would NOT back up the entire clone as that is redundant and unnecessary, but could back up the CCC saved files, which provides that extra safety net.
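
For the "just use rsync" route in item 1, combined with the rescue-folder idea in item 3, a minimal sketch might look like the following. This is plain rsync, not CCC, and the host name, paths, and function name are all hypothetical. It only builds and prints the command so you can inspect it before running anything. The `--backup` and `--backup-dir` options tell rsync to move files that would be overwritten or deleted into a separate folder on the receiving machine, which is roughly the role the CCC rescue folder plays above.

```python
import shlex


def build_rsync_command(source_dir, dest_host, dest_dir, rescue_dir):
    """Build an rsync command that mirrors source_dir to dest_host:dest_dir.

    Files that the mirror would overwrite or delete are moved into
    rescue_dir on the destination machine instead of being lost.
    rescue_dir should be an absolute path OUTSIDE dest_dir, so that
    --delete does not treat it as part of the mirror.
    """
    # A trailing slash makes rsync copy the directory's contents
    # rather than creating an extra nested directory.
    src = source_dir.rstrip("/") + "/"
    return [
        "rsync",
        "-a",                          # archive mode: recurse, keep times/permissions
        "--delete",                    # remove files on the mirror that are gone locally
        "--backup",                    # keep replaced/deleted files instead of discarding
        f"--backup-dir={rescue_dir}",  # where those kept files are moved to
        src,
        f"{dest_host}:{dest_dir}",
    ]


if __name__ == "__main__":
    # All names below are hypothetical examples.
    cmd = build_rsync_command(
        source_dir="/Users/me/Documents",
        dest_host="imac-pro.local",
        dest_dir="/Volumes/Backups/laptop1/current",
        rescue_dir="/Volumes/Backups/laptop1/rescue/2020-01-01",
    )
    # Print for review; run it yourself (or via subprocess.run) once it looks right.
    print(shlex.join(cmd))
```

With a layout like this, BackBlaze on the desktop could be pointed at the rescue folders rather than the whole mirror. Using a dated rescue folder per run keeps each day's changed and deleted files separate.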

Given this thinking, I might decide to shift over to BB. Still thinking this through.

Good analysis. I’m looking at something similar as my CrashPlan family subscription runs out in early 2020 and I need something in place before then. My situation is different but has similar issues. I too am considering Arq and Backblaze. Arq has the advantage of multiple storage options including Backblaze B2.

I hesitate with the language “recreating the state of your computer system”. I don’t think offsite cloud-based backups are great for state recreation. They are good for data recovery. Most cloud-based backups don’t (or can’t) back up files like application files and system files. Without these, you can’t recover “state”. All you can do with most cloud backups is recover user-generated files. As such, cloud-based backups are best as a last-resort method for restoring your essential data when your local options are also toast or are inaccessible.

Local clones are how you recover state.

Time Machine for each system. Backblaze for each system. Anything more complicated than that involving cloning laptops to iMacs and then backing the clones up to Backblaze via the iMac is going to be:

  1. More prone to error than a simpler system
  2. Harder to manage than a simpler system

So your colleague is more likely to run into problems that are harder to solve. It also means having to deal with either connecting laptops to iMac on a regular basis, or trying to clone over the local network. One is inconvenient and the other is unreliable. The best backup system in the world isn’t worth a nickel if it isn’t being used.

So I suggest a local backup (e.g., Time Machine) for all of their machines, and ideally BackBlaze for all of their machines. Local backups (Time Machine or a clone) can recover state. If that isn’t available, then data can be recovered from BackBlaze.

However, depending on how their data are distributed across machines, they might be able to get away with having only one system back up to BackBlaze. This has been my approach. I have an iMac and a MacBook Pro. Almost all of my data are in Dropbox, iCloud (Drive and Photos), or Google Drive. This means any work or files I create on my MacBook Pro are usually also present on my iMac (after a short delay for syncing). That being the case, there is very little I do on my MacBook that isn’t also on my iMac, so as long as the iMac is backed up to BackBlaze, it’ll catch data from my MacBook as well. Anything exclusive to the MacBook is caught by Time Machine.

I assume you back up Arq to a cloud storage service. Which one?


Yes, I agree. Restoring state requires a clone. The goal of the offsite backup is to recover data. Since I do send my Library folder to Arq, I can recover plist files and the like, but in the event of catastrophic loss I would replace the computer and reconfigure from scratch, restoring apps from the App Store and/or distributions and then reload data. Harder to recreate are things like Hazel rules and the like. Some things like my Keyboard Maestro macros and Alfred configuration are in Dropbox and so easy enough to recover.

I agree that my colleague needs something simple and set-it-and-forget-it, but he does not want to pay for 4 subscriptions to BackBlaze. That is why I think Arq may be a better choice for him. I agree that backing up laptops to a desktop which backs up to BackBlaze is a poor choice. Time Machine won’t work for them as they will not hook up the external hard drives reliably. If he turns out to have an AirPort Extreme I will suggest an external USB drive for networked TM.

I have a similar setup in which pretty much anything on my laptop of importance is in my ResilioSync folder or iCloud folder and hence mirrored to my desktop, so from that standpoint I could go BackBlaze on the desktop only, and it’s a thought. As I originally posted, I could send my daily CCC archive folders over to BB as well and that would give me a more permanent history. However there is a third laptop also going into Arq which is NOT synced to the other machines, so I think at this point I am going to have to stick with Arq. Of course, I could argue that two BB accounts are going to be, at this point, slightly cheaper than my total cost for Arq into B2 at present, so maybe it is time for a switch. There is essentially nothing on my own laptop that does not mirror via ResilioSync, iCloud, or Dropbox.

@MitchWager: I use BackBlaze B2 right now. I have used Glacier in the past as well.

The external drive expiration has been a big hassle for me over the last year - there’s no way of telling BB to prioritise the external drive when it’s plugged in & I’ve often had a few hours on fast wifi only to discover BB didn’t even start on the external drive.

@dfay: Interesting. It does seem to me that BB would either prioritize the external drive or at least note that it was reconnected so that the 30 day clock would restart even if the drive itself was not backed up yet. They may well do that; I don’t think they are trying to be unreasonable with their policy (if they are going to give you unlimited storage at a fixed price, it is only fair that they take some precaution so that you don’t back up 10TB of external drive, then disconnect it and leave it hanging in their cloud indefinitely) and from what I can tell, they are very smart people so they would likely have thought of this.

I am sure they would respond to a tech support question about this.

Yeah I emailed them about it back in March - here’s the (abbreviated) reply to my question, “How can I force it to prioritise my external drive to avoid you deleting my backup?”

No, there is no way to do so. However, all drives should at least update so long as all drives selected for backup are on, connected & unlocked, all at the same time, for 4+ hours.

This part was particularly surprising:

Looking over the account there are at least 2 drives that are selected for backup that are extremely out of date, and if ANY drives are not available, can cause ANY other drive to fail to update, as you’re seeing now.

I’ve still been very happy with the service overall for the last four years, and they didn’t actually delete my disk the one time I exceeded the 30 day limit.