Paperless-ngx and seeking suggestions for getting into a decent workflow
from kiol@discuss.online to selfhosted@lemmy.world on 22 May 10:51
https://discuss.online/post/40125238
cross-posted from: discuss.online/post/40125235
Picked up a ScanSnap iX500 and am wondering about suggested workflows for going paperless. My intention is to scan a bunch of documents, but I haven’t delved deeply into how this will actually flow at the software level. I know I’ll need to OCR the scanned documents, and my base setup is:
- Pi with SSD storage running the Docker Compose version of Paperless-ngx, with its folders on filesystem mounts.
- Folders can also be accessed over Samba
- iX500 with a statically assigned address over Wi-Fi, acting as a network scanner.
- A literal filing cabinet, for things I should keep physically.
- Ubuntu computer for browsing
I feel a bit overwhelmed, but am excited to get started. I’ll be scanning personal documents, work docs, and whatever else I need to digitize and recycle. All suggestions appreciated!
Do not overthink it. Set up paperless and create a watched folder. Paperless does the OCR for you. Scan your stuff and check that it was scanned the way you want. If yes, drop it in the folder. Tag as you go; paperless will learn, and its tag suggestions will get more accurate. Once a type of document reaches the point where you can trust paperless to always tag it correctly, let it tag that type fully automatically.
And file away your scanned papers separately, because scanning old things takes a lot of time and will most likely not be done in a day or two. Even with a scanner which can pull through stacks of pages, you still have to check if every page really was scanned (scanners can pull in two pages at the same time, only one page will be scanned then) and you have to merge multi-page docs (or scan them that way immediately).
This^. No matter how many layers of backups I have for paperless, I’m still keeping the most important physical documents in a file cabinet.
I’d recommend using ASNs (archive serial numbers) for documents you store a physical copy of, following the recommended flow.
I printed ASN QR code stickers, using the smallest Avery labels I could find (Avery 5267 in the USA, L4731REV-25 in Europe) along with their free online design app.
For documents I want to keep, I stick a QR code sticker on them before scanning. Paperless-ngx automatically detects the QR code and sets the ASN. I then file it away in a folder that’s sorted by ASN. When I need to find the physical copy again, I first look in Paperless to find the ASN, then find the document in the folder (pretty quick since all documents are sorted).
You’ll need to set the following settings:
<img alt="" src="https://i.imgur.com/gBRc0IR.jpeg">
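The sticker values for a sheet of labels can be generated with a few lines of Python before pasting them into a label design tool. This is only a sketch: it assumes the barcode text is the prefix `ASN` followed by a number (Paperless-ngx's default ASN barcode prefix, which is configurable), and the function name `asn_labels` and the zero-padding width are made up for illustration.

```python
def asn_labels(start, count, prefix="ASN", width=5):
    """Return `count` consecutive ASN label strings beginning at `start`.

    Assumption: the barcode prefix matches the one configured in
    Paperless-ngx (default "ASN"). The padding width is cosmetic.
    """
    if start < 1:
        raise ValueError("ASNs start at 1")
    if count < 0:
        raise ValueError("count must not be negative")
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]


if __name__ == "__main__":
    # Print one value per line, ready to import into a label template.
    for label in asn_labels(1, 10):
        print(label)
```

Keeping track of the last number printed lets the next sheet continue from there, so no two physical documents share an ASN.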
I’m not sure exactly what you are looking for but here is my workflow:
Laptop - This is where I do most of the uploading to paperless. When I get an important document over email, buy something online that’s expensive enough for me to want to save the receipt, or buy something that comes with a digital manual, I download the PDF and upload the document to paperless in the browser.
Phone - I have an iPhone and use Swift Paperless to upload physical mail, physical receipts, or physical manuals I can’t find online.
I know I can set up Paperless to pull documents from my email automatically, but in my experience it’s not very good at guessing the tags and correspondents. Since I have to edit those documents manually anyway, I might as well upload them myself. I’ve just got into the habit of getting a document, knowing I might want to view it later, and uploading it right then or later that day. The built-in OCR works great.
Edit: Oh, my behavior has changed a little because of paperless too. I now ask everyone for a receipt, email confirmations when talking with customer service, or if I’m dealing with a business that only hands me paper documents, I ask them to email them to me too. I’m pretty annoying about it. Basically, if the transaction is important enough to me, it doesn’t end until I get proof that I can upload to paperless.
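For anyone who would rather script the laptop upload than use the browser, Paperless-ngx also accepts documents over its REST API. The sketch below builds a multipart POST to `/api/documents/post_document/` with token authentication using only the standard library. The helper name `build_upload_request`, the host name, and the token are placeholders, not anything from this thread, and the request is only sent if you pass it to `urllib.request.urlopen`.

```python
import uuid
import urllib.request
from pathlib import Path


def build_upload_request(base_url, token, pdf_path, title=None):
    """Build (but do not send) an upload request for one PDF.

    Assumptions: `base_url` points at your Paperless-ngx instance and
    `token` is an API token created for your user. The file name is
    assumed not to contain double quotes.
    """
    path = Path(pdf_path)
    boundary = uuid.uuid4().hex
    parts = []
    if title is not None:
        # Optional form field: the title shown in Paperless.
        parts.append(
            (f"--{boundary}\r\n"
             'Content-Disposition: form-data; name="title"\r\n\r\n'
             f"{title}\r\n").encode("utf-8")
        )
    # Required form field: the document itself.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="document"; '
         f'filename="{path.name}"\r\n'
         "Content-Type: application/pdf\r\n\r\n").encode("utf-8")
        + path.read_bytes()
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/documents/post_document/",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Usage (placeholder host and token):
# req = build_upload_request("http://paperless.local:8000", "YOUR_TOKEN", "receipt.pdf")
# urllib.request.urlopen(req)
```

Tags and correspondents can still be fixed up in the web UI afterwards, which matches the manual review habit described above.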
I organized paperless in the beginning but not so much anymore.
I have given it access to my email inbox so it adds attachments automatically. I can then search for receipts and more without having done the work of importing them manually.
It works relatively well for sites and services that actually include PDF receipts.
If your scanner supports scanning to a network share, install Samba on your Pi and share the paperless-ngx incoming directory. My ScanSnap iX1600 supports this, but I’m not familiar with other models. I had to configure the scanner using the Windows app to add the SMB details, but once it’s configured, it works without a computer attached.
Paperless-ngx also supports email. You can set up a separate email account for it, then forward any documents you want to keep to that account.
For documents you need to keep a physical copy of, use ASNs (archive serial numbers) to correlate the physical and virtual copies. You can use QR code stickers to automatically set the ASN in paperless-ngx. I posted a nested comment with more details about this.
Consider using paperless-ai to use an LLM to tag and title your scanned documents automatically. It needs a webhook to be configured. Consider a local model if possible, and if you want to use a hosted model, review the provider’s privacy policy to ensure they do NOT train the AI on user content.
I am relatively new to the whole paperless thing, but I have a couple of email addresses that it watches: one that saves the email itself as the document and one that saves the attachments. My Pi 5 has no problem scanning documents that I can then send to one of my paperless emails. I can also forward emails I want saved to the other paperless email. I am still going into the web UI to make sure things are tagged right, but paperless is awesome: it does OCR and it fixes files that were scanned at a bit of an angle.
I am trying to work through a system that prints documents according to events on my calendar, with HA.