Paperless-ngx is an open-source document management system that transforms your physical documents into a searchable online archive.

A webpage showing a tiled layout with tiles arranged in horizontal rows, each one representing a document that has been scanned in. Each tile shows a thumbnail image of the document contents with titles below, tags, a creation date, etc. Down the left side is a menu of options such as Dashboard, Documents, Recently Added, Inbox, Correspondents, Tags, Document Types, Storage Paths, Custom Fields, Templates, Mail, Settings etc. At the top is a search bar.

You can either scan documents or upload them in various formats into Paperless-ngx.

It will organise and index your scanned documents with tags, correspondents, types, and more. Your data is stored locally on your server and is never transmitted or shared in any way. It performs OCR on your documents, adding searchable and selectable text, even to documents scanned with only images.

Documents are saved in PDF/A format, which is designed for long-term storage, alongside the unaltered originals. It uses machine learning (see, no AI) to automatically add tags, correspondents, and document types to your documents. It supports PDF documents, images, plain text files, Office documents (Word, Excel, PowerPoint, and LibreOffice equivalents) and more.

I installed this using the Docker Compose file. I did notice, though, that to support Word, Excel, PowerPoint, and LibreOffice equivalents I also needed to install Tika and Gotenberg (I added them to the Docker Compose file).
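
For anyone wanting to do the same, here is a rough sketch of what adding Tika and Gotenberg to the Docker Compose file can look like. It is not a copy of my own file. The service name webserver, the image tags, and the ports are assumptions based on the defaults these projects normally use, so check them against your own setup and the Paperless-ngx documentation before relying on them.

```yaml
# Fragment only: merge these entries into your existing docker-compose.yml.
# Service names, image tags, and ports below are assumptions, not taken from a real install.
services:
  webserver:                 # assumed name of the Paperless-ngx service in your file
    depends_on:
      - gotenberg
      - tika
    environment:
      # Switch on Office document parsing and point Paperless-ngx at the two helpers
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000

  gotenberg:                 # converts Office documents to PDF
    image: docker.io/gotenberg/gotenberg:8
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"

  tika:                      # extracts text and metadata from Office documents
    image: docker.io/apache/tika:latest
    restart: unless-stopped
```

After editing the file, running docker compose up -d again should pull the two extra images and restart Paperless-ngx with Office document support enabled.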

It is not just limited to documents, though, as it will also connect via IMAP to an e-mail server and organise and archive your e-mails.

I’m testing it out a bit now and finding it useful for scanning in my numerous receipts, as the OCR will help find what I’m looking for later. I’m thinking of doing a video about it in a few weeks to show what it does, and does not, do.

See https://docs.paperless-ngx.com/