Yet another DEVONthink bug/oddness

I don’t know for sure. I do know that I am NOT the only person who has lost data out of DEVONthink. Initially most of the reports involved only imported items, but I and others have verified that data can be lost from indexed items as well. In my worst case the losses never showed up despite performing the regular database checkups that DT recommended at the time. The symptom was that items still appeared in the database, but when you went to access them their contents were gone; they had become zero-length files. I keep a deep, multiple backup system, and I lost what I consider archival data, data I only reference every few years. The data were gone as far back as my backups extended, which was over a year, so the problem had been there for a lot longer.
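For anyone who wants to scan for that symptom, something like the following should list any zero-length files inside a database package (the path is just an example; adjust it to wherever your databases actually live):

# List zero-length files inside a DEVONthink database package (example path)
find ~/Databases/MyDB.dtBase2 -type f -size 0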

It could be because I had been a heavy user of DT for many years, with my first installation multiple computers and operating systems ago; I had well over 10 years of regular and extensive use. However, that only makes me far more wary now of trusting it. At my peak I had over 770,000 individual files and items either imported into or indexed in multiple DT databases. I am down now to under 90,000 items in 3 databases. The largest is the roughly 85K-message email archive, created after Apple Mail got bogged down with my archive of messages I refer back to. I haven’t yet tackled that one.

I liked DT for many reasons. It is a huge wrench to my system to decide to move out of it, and there are all sorts of functions I am still trying to figure out how to replace. But I cannot trust it, and trust is the primary purpose of a data archiving and searching tool: protect and archive your data. DEVONthink failed in that critical function for me, and for others as well.

I’ve been reading a few threads here and on the DT Discourse forum. I’ve also been using it for 10 years and have my own email DB at 90K.

One thing that was never clear in the other posts was how you synchronised your data. Do you use DEVONthink’s sync feature, or do you sync the DBs using other software or scripts? I only need to sync to iOS, and I sync over my own wifi. This has worked well with no issues so far. I never store my DBs on any form of cloud service; I tried it in the early days and it gave me problems.

I used the DEVONthink sync service running to a sync store on our Synology NAS. I also synced only locally over wifi and NEVER used any outside cloud service.


@OogieM certainly was not the only person having issues with data loss in DT. I also had issues with zero-length files, though not to the same extent as @OogieM. In my case these were indexed files, and it could also have been my own fault. I haven’t had issues for some time now, and since DT now also uses CloudKit it syncs fast and reliably for me.

Tools like DT, and services like OneDrive, Dropbox, iCloud, or even your local NAS, can all potentially corrupt files while indexing, syncing, or copying. The annoying thing is that by the time you discover a file is corrupted, especially an archived file, it is already too late. I think applications that work with your data would benefit from actively checking the integrity of your files and warning you as soon as a file is corrupt, so you can take action rather than finding out years later. I do not think the DT verify tools check the integrity of each file’s contents.
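As a rough sketch of the kind of check I mean (paths are examples, and this assumes the archived files are not supposed to change between runs):

# Record a SHA-256 hash for every file under the archive root (example path)
find ~/Archive -type f -print0 | xargs -0 shasum -a 256 > ~/archive-manifest.txt

# Later, re-verify every file against the manifest and print only failures
shasum -c ~/archive-manifest.txt | grep -v ': OK$'

Anything reported as FAILED has changed or been truncated since the manifest was made; for files that legitimately changed, you would regenerate the manifest.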

Sounds like this might be a good use for macOS’s system immutable flag for files. You must be superuser to add or remove the schg flag (system immutable). There is also a uchg flag that normal users can set or remove, but that doesn’t seem like enough protection. Here’s a little trial session I did:

(base) Johns-iMac-Pro :: ~ 1 » /bin/ls -lhdO test
-rw-r--r--  1 john  staff  -    0B Feb 22 10:57 test
(base) Johns-iMac-Pro :: ~ » chflags uchg test
(base) Johns-iMac-Pro :: ~ » /bin/ls -lhdO test
-rw-r--r--  1 john  staff  uchg    0B Feb 22 10:57 test
(base) Johns-iMac-Pro :: ~ » chflags nouchg test
(base) Johns-iMac-Pro :: ~ » /bin/ls -lhdO test
-rw-r--r--  1 john  staff  -    0B Feb 22 10:57 test
(base) Johns-iMac-Pro :: ~ » chflags schg test
chflags: test: Operation not permitted
(base) Johns-iMac-Pro :: ~ 1 » sudo chflags schg test
Password: lolmadeyoulook
(base) Johns-iMac-Pro :: ~ » /bin/ls -lhdO test
-rw-r--r--  1 john  staff  schg    0B Feb 22 10:57 test
(base) Johns-iMac-Pro :: ~ » touch test
touch: test: Operation not permitted
(base) Johns-iMac-Pro :: ~ 1 » echo "blah" >test
zsh: operation not permitted: test
(base) Johns-iMac-Pro :: ~ 1 » sudo chflags noschg test
(base) Johns-iMac-Pro :: ~ » /bin/ls -lhdO test
-rw-r--r--  1 john  staff  -    0B Feb 22 10:57 test
(base) Johns-iMac-Pro :: ~ » touch test
(base) Johns-iMac-Pro :: ~ »
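If you wanted to protect a whole archive folder this way, chflags also takes -R to recurse (the path is an example; note that schg blocks your own edits too until you clear it):

sudo chflags -R schg ~/Archive      # lock every file under ~/Archive
sudo chflags -R noschg ~/Archive    # unlock again when something must change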

Caveat emptor, your mileage may vary, void where prohibited by law, valid in 49 states, sorry Tennessee.

It’s very hard to prove a negative (that no data is lost), unfortunately, without some sort of immutable ledger, and even that would need its own behavior verified.

It seems very few users are affected by data loss (not that Oogie is the only one affected), but storage is so cheap that it’s prudent to store a yearly snapshot of all of your databases in case anything ever happens, or in case you want to check whether you really did have something you thought you had.
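A minimal way to do that, assuming your databases live under ~/Databases and DEVONthink is closed so nothing is mid-write (paths are examples):

# Dated, compressed snapshot of all databases (example paths)
mkdir -p ~/Snapshots
tar -czf ~/Snapshots/devonthink-$(date +%Y-%m-%d).tar.gz -C ~ Databases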