Using a NAS for a document archive

A few years back, I went down the home server route and bought a Synology NAS, which has been great for me.

While it wasn’t the only reason, one of the use cases for the NAS was document storage and I’m now pretty much paperless with all important documents being stored there.

However, I’ve come to realise that my paperless storage can be improved, as I’m currently just scanning files using an iPhone app and storing the resulting PDF in a relevant folder. So I wanted to post this to see if anyone else on here is storing documents in a similar way and to ask if there are any pointers that could be shared.

One such area, for example, is OCR. Currently my docs are static PDFs and, when searching for material, I’m really only searching file and folder names rather than the content of the documents themselves. I have to admit that I’m not totally sure how OCR works.

If, for example, I were to use an app that supported OCR when scanning the document, would I be able to benefit from that once the doc is copied across to my NAS? Or do I need to set up the OCR side of things on the server itself to be able to search through the content of my files?

Thanks a lot for any help in advance!

Good Morning Mortimer Jazz,
I’m working on a similar project and am considering the ScanSnap iX1600, which has OCR. I’m also using an iPhone app, Evernote, which can search the text inside scanned documents. You can set up an Evernote account for free, and the free scanning app is Scannable. It’s worth checking out.

I have been doing this for years. I OCR everything (the ScanSnap software and PDFpenPro do this for me). I have a folder hierarchy that I set up each year. It has worked incredibly well. Some aspects are automated, but not as much as I would like; that is mostly just a time issue.
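
For anyone who wants to automate the yearly folder setup, here is a minimal Python sketch. It is only an illustration: the category names and the function name are hypothetical placeholders, not the hierarchy described above.

```python
from pathlib import Path

# Hypothetical category names; replace with whatever suits your own archive.
CATEGORIES = ["Bank", "Insurance", "Medical", "Receipts", "Tax"]


def create_year_folders(archive_root, year, categories=CATEGORIES):
    """Create <archive_root>/<year>/<category> folders and return their paths."""
    year_dir = Path(archive_root) / str(year)
    created = []
    for name in categories:
        folder = year_dir / name
        # exist_ok=True makes the script safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Pointing `archive_root` at the shared folder that holds your documents and running it once each January would be one way to use it. Existing folders are left untouched.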

You do want to make sure that you back these folders up, though. I have Microsoft 365, so I get 1 TB of OneDrive storage, and Synology Cloud Sync backs up files to OneDrive for me. I create encrypted ZIP files for this, since some of the files are sensitive and I don’t like having unencrypted versions in the cloud.

Thanks both for your replies - much appreciated!

One thing that I’ve realised I’m not clear on is how OCR works. By that, I mean: if I use an OCR scanner app on my phone, is the text that it recognises added to the PDF, meaning that I can search through the text in the PDF anywhere? Or do I need to do the OCR work wherever I’m going to be storing the document?

So, @lsamberg, when you talk about OCR-ing everything that goes onto your NAS - do you create the OCR’d PDF on your phone and then copy it across? Or do you copy the PDF on to your Synology and then need to have some OCR software running on your server to be able to search the content of your docs?

And also, how do you enable OCR for PDFs/docs that you’re not scanning and are just receiving over email?

Thanks again for the help.

Once a document has been OCR’d, the text is stored inside the PDF itself, so it’s fine to transfer it across devices. Digital documents that were created in a word processor already have the text data in them because they aren’t pictures. So if you get a bank statement, the PDF already contains the text, provided it came from a fully digital workflow.
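
If you want to check which PDFs on the NAS already carry a text layer, a rough Python sketch could look like the one below. It assumes the third-party `pypdf` package is installed; the function names and the 20-character threshold are arbitrary choices for illustration. It only reads text that is already embedded in the file and does not perform OCR itself.

```python
def extract_page_texts(pdf_path):
    """Return the embedded text of each page, using the third-party pypdf package."""
    from pypdf import PdfReader  # assumption: installed with `pip install pypdf`

    reader = PdfReader(pdf_path)
    # extract_text() reads text already stored in the PDF; it does not run OCR.
    return [page.extract_text() or "" for page in reader.pages]


def pages_missing_text(page_texts, min_chars=20):
    """Return 1-based numbers of pages whose embedded text is shorter than min_chars."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


def is_searchable(page_texts, min_chars=20):
    """Treat a PDF as searchable only if it has pages and every page carries some text."""
    texts = list(page_texts)
    return bool(texts) and not pages_missing_text(texts, min_chars)
```

A scan saved as plain images would come back with empty page texts, so `is_searchable(extract_page_texts(path))` would be false and the file would be a candidate for an OCR pass. Pages that are legitimately blank would also be flagged, so treat the result as a hint rather than a verdict.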

@macsparky has a course on this that will teach you everything and go through different options.