DEVONthink Databases. Size? Performance?

Just how big can the database be?

I have 357 GB of data in iCloud. Can I just create a CloudKit database and move all those files in there?

If I do have to break that up among my 6 areas of focus, can each be about 60 GB?

Does AI work across databases? I assume they would have to be open for it to work.

If I split them up, could I have them all open at the same time?

Concerns about initial database creation, and fear of going down the wrong path and then having to redo it all, have really kept me from moving forward.

That’s a lot of data.

Can I ask what kind of data it is? Are these all videos? If so, it changes things a little in my experience.

Also note that this kind of thing has been discussed extensively on DEVONthink’s own forum:

Agreed; this is the post on the DT forum I’ve referenced a few times: Too Large for DTTG? - #4 by anon6914418 - DEVONthink To Go - DEVONtechnologies Community

Size in gigabytes isn’t the critical number. If you check out File > Database Properties > … for a given database, the number of words / unique words are more critical. On a modern machine with 8GB RAM, a comfortable top limit is 200,000,000 words and 4,000,000 unique words in a database. (Note: This does not scale in a linear way, so a machine with 16GB wouldn’t necessarily have a comfortable limit of 400,000,000 words / 8,000,000 unique words.) So text content in a database is far more important.

  • If you have a database of images, it will have very few words but be large in gigabytes.
  • If you have a database of emails, it will have many words, but may be smaller in gigabytes.
    The second one may perform more poorly as the number of words increases beyond the comfortable limit.

Also, for best performance we suggest a maximum of 250,000 items per database.

Those limits roughly jibe with my experience.
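For anyone who wants to sanity-check a database against those figures, here is a minimal Python sketch. It assumes you read the word, unique word, and item counts by hand from File > Database Properties and type them in; the `DatabaseStats` class and `over_comfortable_limits` function are made-up names for illustration, not anything DEVONthink provides. The thresholds are the comfortable averages quoted above for an 8GB machine, not hard caps.

```python
from dataclasses import dataclass

# Comfortable limits quoted in the DT forum post (8GB RAM machine, not hard caps).
MAX_WORDS = 200_000_000
MAX_UNIQUE_WORDS = 4_000_000
MAX_ITEMS = 250_000


@dataclass
class DatabaseStats:
    """Figures copied by hand from File > Database Properties (hypothetical container)."""
    name: str
    words: int
    unique_words: int
    items: int


def over_comfortable_limits(stats):
    """Return the names of the figures that exceed the comfortable limits."""
    checks = {
        "words": (stats.words, MAX_WORDS),
        "unique_words": (stats.unique_words, MAX_UNIQUE_WORDS),
        "items": (stats.items, MAX_ITEMS),
    }
    return [key for key, (value, limit) in checks.items() if value > limit]


if __name__ == "__main__":
    # Example numbers are invented: a text-heavy mail archive vs. an image library.
    mail = DatabaseStats("Mail", words=250_000_000, unique_words=3_000_000, items=100_000)
    photos = DatabaseStats("Photos", words=50_000, unique_words=5_000, items=40_000)
    for db in (mail, photos):
        print(db.name, over_comfortable_limits(db) or "within comfortable limits")
```

Note that size in gigabytes does not appear anywhere in the check, which matches the point of the quoted post: the text content and item count matter, not the disk footprint.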

I highly second Ryan’s suggestion of going to the DT Forums. They can be quite helpful.

Thanks for linking that info.
et al: These aren’t hard numbers; they are a comfortable average based on experience and reports, etc. We surely have people with larger databases, though often much of it is media content. And we have people with very small databases but with a ton of text content. The larger one could potentially outperform the smaller if the index of the smaller one is very large.

I was once told that to be successful you have to optimize approximately 11 areas of your life.

They broke it down as:

  1. Self Management / Growth
  2. Health
  3. Relationships
  4. Finance
  5. Career
  6. Focused / Organized
  7. Projects
  8. Personal
  9. Entertainment
  10. System Management
  11. Home Projects / Honey Do Lists

I have folders matching the above list. Sometimes things in 7. Projects can overlap with information in other folders. For example, I could have a Refinance my loan project that would be in Projects but interact with data in the 4. Finance folder.

Here are the sizes

Personal Growth is large because that is where I store the downloads of the MacSparky Field Guides as well as all of the books that I have scanned in. I could break those out into MacSparky FG and Books.

**As an aside, if you upload your books to the Books app, does that take up room in iCloud? If you have room, does it keep a version on your Mac and in iCloud?**

0 0 Happy Inbox is where I tend to put new information. I want to get either Hazel or DEVONthink to help me clean that out.

0 1 SS Home is the Home folder for ScanSnap scanner scans. I want to get either Hazel or DEVONthink to help me clean that out as well.

00 2 Ready Reference is for Manuals and Mac Tricks and Tips

As I start making videos I probably should have folders for just Video work.

Obviously, as time passes information becomes less relevant. One option is to copy this folder structure and its files to a NAS archive folder for reference/history, and then keep only the last 5 years of files readily available in iCloud. Doing this might reduce some of the value of the AI relationships that DEVONthink might find.
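As a rough way to see what such a 5-year cut would actually move, here is a minimal Python sketch that only lists candidate files and does not copy or delete anything. It assumes that a file's modification time is a fair stand-in for "relevance", which may not hold for scanned books or reference manuals; the function name and the example folder path are made up for illustration.

```python
import time
from pathlib import Path

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def files_older_than(root, years=5, now=None):
    """Yield files under root whose modification time is more than `years` years ago."""
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    for path in Path(root).rglob("*"):
        # Only regular files are archive candidates; folders are skipped.
        if path.is_file() and path.stat().st_mtime < cutoff:
            yield path


if __name__ == "__main__":
    # Hypothetical folder; point this at one of the real top-level folders.
    root = Path.home() / "Documents"
    candidates = list(files_older_than(root, years=5))
    total_gb = sum(p.stat().st_size for p in candidates) / 1e9
    print(f"{len(candidates)} files, about {total_gb:.1f} GB, are older than 5 years")
```

Running it per folder gives a feel for how much smaller each database would be after archiving, before committing to any structure.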

So would I create 12 or so CloudKit databases with these folders, open them every time I start DEVONthink, and work from inside DEVONthink?

So for most work, Finder and Pathfinder would be a thing of the past?

When I work with files on my iPhone and iPad, would it all be done in DEVONthink?

I originally started playing with DEVONthink and had it just index the Finder data. I once had a bad experience where I lost a lot of work when the database holding the metadata for all of my information crashed, and I vowed never to let that happen again.

But… everyone says that DEVONthink is safe: if you back up your databases you should be OK, and if you regularly export the data back to a Finder-based archive location you should always have access to your data, if not the metadata that you entered into DEVONthink.

I get nervous because whenever I open a database, DEVONthink complains that the database was not closed properly and says to open it only if I am sure it is not open elsewhere. I'm not sure what I need to do to avoid this message.

If I capture some data from a web page and store it as a PDF in DEVONthink on my iPad, how long before it would be available on my Mac?

If I try to get rid of duplicates, will it search across all 12 databases?

I imagine that for rules to run, I would need to have open all of the databases they refer to.

Until I got the MacSparky Field Guide on DEVONthink, I was not aware that I could have a DEVONthink database for Mail. Right now my Mac Mail is out of control. Would I use DEVONthink rules against the DEVONthink Mail database to clean it up, or would I use Mail Rules in Mail and just use DEVONthink to search for info and to use its AI capabilities to find connections to projects in the other databases?

When my previous system was working I could find anything in under 5 minutes. It was wonderful. I am hoping to get back to that.

Templates in HoudahSpot have really helped. Can they search DEVONthink?

Setting up the databases correctly has really been holding me back.

Any suggestions on how to start small and get 1 or 2 databases set up and shared across all devices, so that I can use them for a while and not be crushed if I shoot myself in the foot?
