Goal: get rid of incoming paper as soon as possible and continue with electronic documents
Here’s what I learned when setting up my Paperless Home. For this project I ordered a Fujitsu scanner.
Software: Scansnap Manager 5.0L24 (Windows XP)
Goal (high WAF)
- Open scanner (which then auto-connects to the scanning SW)
- Put in a piece of paper and hit the scan button.
- Close scanner and put the paper in a big box.
Ideally, the software would also take care of:
- Tagging and renaming PDF files based on content (some of this could be done by A-pdf renamer)
- KPN phone bill: kpn, 2011, march, bill, phone
- American Express letter: americanexpress, 2011, january, other
- Bank statement DJ: ingbank, 2011, march, statement, DJ
- Bank statement Wife: ingbank, 2011, march, statement, Wife
- Store files based on tags (not mandatory! See examples above)
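The tagging idea above can be sketched in a few lines of Python. This is only an illustration, not something any of the tools below ship with: the keyword tables, the `extract_tags` and `target_path` names and the `archive` root folder are all my own assumptions, and the input is assumed to be the OCR text of a searchable PDF that has already been extracted. Extra tags such as "phone" or the account holder would need additional rules.

```python
import re
from pathlib import PurePosixPath

# Hypothetical keyword rules: text found in the OCR output -> tag to apply.
SENDER_RULES = {
    "kpn": "kpn",
    "american express": "americanexpress",
    "ing bank": "ingbank",
}
TYPE_RULES = {"statement": "statement", "bill": "bill"}
MONTHS = [
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
]
MONTH_PATTERN = re.compile(r"\b(" + "|".join(MONTHS) + r")\b")
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_tags(text):
    """Return [sender, year, month, type] guessed from the document text."""
    lowered = text.lower()
    sender = next(
        (tag for key, tag in SENDER_RULES.items() if key in lowered), "unknown"
    )
    year = YEAR_PATTERN.search(lowered)
    month = MONTH_PATTERN.search(lowered)
    doc_type = next(
        (tag for key, tag in TYPE_RULES.items() if key in lowered), "other"
    )
    return [
        sender,
        year.group(0) if year else "unknown-year",
        month.group(0) if month else "unknown-month",
        doc_type,
    ]


def target_path(tags, root="archive"):
    """Build a storage path such as archive/kpn/2011/2011-march-kpn-bill.pdf."""
    sender, year, month, doc_type = tags
    filename = f"{year}-{month}-{sender}-{doc_type}.pdf"
    return PurePosixPath(root) / sender / year / filename


if __name__ == "__main__":
    tags = extract_tags("KPN phone bill, March 2011")
    print(tags, target_path(tags))
```

Plain keyword matching like this is fragile (a letter that merely mentions another company gets the wrong sender), so treat it as a starting point.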
This is the software I ran into when researching this project.
- Irislink ReadIris Corporate ($471, Windows)
does not store files in location based on content and is quite expensive
- A-PDF Rename ($27 Windows)
very interesting, renames PDFs based on keywords or text location. Not automated
That means it can use keywords to rename a file but you have to select them, every time.
- Neatworks scanners & software
Note: software only works with own scanner (Win) or limited number of scanners (Mac)
- OpenKM document management system (Open Source)
- Benubird PDF (free, Windows) PDF file manager
Has potential. This might be the first application that I’m going to test-drive.
- DevonThink ($80, Pro:$150, Mac only) Interesting, automatically recognize/organize documents
Cannot rename automatically or recognize documents based on the contents
TIP: how to create a “watch folder“
- EagleFiler (€ 32, OSX) manage files, assign tags and provide smart filters (interesting!)
- iDocument ($ 49, OSX) manage documents, assign tags, automate, batch and more
- Yojimbo ($ 35, OSX) manage, tag and find documents
Using the Belkin LAN USB hub. Check out the Scanner post for more information. The result is that I can place the scanner anywhere in the house because it connects to its software over WiFi.
- Hardware: Get a good scanner, as small as possible, low power (or good stand-by power consumption)
- Software: Once you have scanned a piece of paper, it needs to be organized in a way that you can find it.
- Security: With all of your mail stored as files, make sure only you have access to these files
- Backup: When your collection grows it becomes more and more important to have a good backup. Check out online backup services like Backblaze and others.
- Accessibility: With all of this information stored, your wife should be able to find a document. Browser? Folder structure? Search? iPad/iPhone?
The scanner comes with the Scansnap Manager. This software is needed to control your scanner as it isn’t TWAIN compliant. This software is not too bad.
More info on the Scansnap software is available in the Community. There is also a great article on how to install Scansnap Manager for OSX (S1500M) if you bought the $100 cheaper (but identical) Windows version (S1500).
I also tested the Scansnap Organizer software. It’s a HUGE install (1.1 GB) and looking at the functionality I have NO idea why it’s that big. I would say 40 MB at most. Looks like a typical case of bad programming relying on nothing but .NET crap. Don’t bother. There are even free applications that are better.
After testing a lot of applications it looks like I will stick with DevonThink Pro. What does it do compared to what I want?
- Automatic importing (with script)
- Manual moving to folders (categories) but the system will suggest a destination folder based on the contents. This really works great.
- NOT: renaming. You have to do that manually
You can only modify the names if you import all the scans into the DevonThink database. When indexing existing files no changes will be made, nor will they be placed in some kind of organized folder structure.
SOLUTION: Use the database and do a regular EXPORT of all folders. This will place the files in a folder structure just like the categories in DT.
- NOT: auto-tagging based on content is not possible.
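The "automatic importing (with script)" step is essentially a watch folder. The sketch below is not the script I use with DevonThink; it is a generic Python illustration of the idea, where the folder names and the `import_scans` function are assumptions made up for the example. It moves every PDF from the folder the scanner writes to into an inbox folder that the document manager picks up.

```python
import shutil
from pathlib import Path


def import_scans(watch_dir, inbox_dir):
    """Move all PDFs from watch_dir into inbox_dir and return the new paths."""
    inbox = Path(inbox_dir)
    inbox.mkdir(parents=True, exist_ok=True)
    imported = []
    for pdf in sorted(Path(watch_dir).glob("*.pdf")):
        destination = inbox / pdf.name
        if destination.exists():
            # Leave the scan where it is rather than overwrite an earlier one.
            continue
        shutil.move(str(pdf), str(destination))
        imported.append(destination)
    return imported


if __name__ == "__main__":
    # Hypothetical folder names; run this from a scheduler or a simple loop.
    for path in import_scans("scans", "inbox"):
        print("imported", path)
```

One caveat: a file that the scanner software is still writing could be moved half-finished, so a real setup would want to wait until the file size stops changing.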
Paperless office myths
Your Paperless Office (good online book about document management. Very practical)
How to manage your collection of PDF files
Scansnap and Hazel a paperless match in heaven
Overview of paperless office applications
How to make your office paperless?
Multi-page / duplex document separation
The main challenges I ran into are linked to the Scansnap Manager software. In my case, the software is running in a Windows XP Virtual Machine (VMWare Server 2.0). There are a couple of different scanning scenarios that are required for a high Wife Acceptance Factor (WAF).
Example: put in 2 single-page bank statements and 1 four-page letter. Now you have a choice:
- 1. Scan the entire batch into 1 PDF (searchable, duplex) that you have to manually split afterwards
Disadvantage: takes more time to split PDF documents afterwards
- 2. Scan each instance separately. (that means: 1 bank statement – scan, 1 bank statement – scan, 4-page letter – scan).
Disadvantage: takes a LOT more time
Currently I have many profiles
– duplex 1 page (2 pages: front & back)
– duplex 2 page (4 pages: 2x front & 2x back)
– duplex MULTI page (scan until it ends) <– CHOICE!
– simplex 1 page (1 page: front only)
– simplex 2 page (2 pages: 2x front)
– simplex MULTI page (scan until it ends)
The problem is that to select a matching profile you need to go into the Scansnap Manager. Guess what, this is running in a virtual machine and I don’t want to access computers when I’m scanning documents.
Theoretically you don’t have to use different profiles for simplex and duplex scans as the software detects blank pages and removes them. But some documents have stuff on the back you don’t care about so you have to remove these manually.
CHOICE: Nr.1 Scan everything Duplex into 1 big PDF. Let the splitting happen afterwards
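Splitting the big PDF afterwards comes down to knowing which page range belongs to which document. A minimal sketch of that bookkeeping, assuming you know (or count) the number of scanned pages per document; `split_ranges` is a hypothetical helper, and the actual cutting would still be done in a PDF tool of your choice.

```python
def split_ranges(page_counts):
    """Turn pages-per-document into 1-based, inclusive (first, last) ranges."""
    ranges = []
    first = 1
    for count in page_counts:
        if count < 1:
            raise ValueError("each document needs at least one page")
        last = first + count - 1
        ranges.append((first, last))
        first = last + 1
    return ranges


if __name__ == "__main__":
    # Two single-page bank statements and one four-page letter, assuming the
    # blank back sides were already removed by the scanner software.
    print(split_ranges([1, 1, 4]))  # [(1, 1), (2, 2), (3, 6)]
```

If the back sides are kept, the page counts simply double, for example `[2, 2, 8]` for the same batch with every sheet scanned on both sides.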
I took a bank statement and scanned it with different settings. I compared the results.
Settings: Scan-to-PDF (searchable), simplex
– 150 dpi / auto-color-detection / compression 3
295 KB, OCR: great (my preference!)
– 150 dpi / auto-color-detection / compression 4
217 KB, OCR: some problems (small print)
– 150 dpi / auto-color-detection / compression 4 / text-only
230 KB, OCR: more problems (small print + no colors)
– 150 dpi / auto-color-detection / compression 5
135 KB, OCR: more problems (small print)
Changing auto-color-detection to ‘Color’ created a 297 KB (instead of 295 KB) file where the colors were not nearly as vibrant as with the auto-color setting. OCR is also terrible.
Changing the auto-color-detection to ‘Color high compression’ created an 83 KB (instead of 295 KB) file which had major OCR issues, looked horrible and had the worst colors.
PREFERENCE: Using “Auto color detection”, “Compression: 3” and resolution “Normal: 150 dpi”
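To put the file sizes above in perspective, here is a small Python calculation of each result relative to the preferred setting. The sizes are the ones measured above; the dictionary labels and the `relative_size` helper are just names chosen for this example.

```python
BASELINE_KB = 295  # auto color detection, compression 3 (the preferred setting)

# File sizes in KB as measured in the test scans above.
RESULTS_KB = {
    "compression 4": 217,
    "compression 4, text-only": 230,
    "compression 5": 135,
    "color": 297,
    "color high compression": 83,
}


def relative_size(size_kb, baseline_kb=BASELINE_KB):
    """Return size_kb as a percentage of the baseline, rounded to 0.1."""
    return round(100 * size_kb / baseline_kb, 1)


if __name__ == "__main__":
    for setting, size_kb in RESULTS_KB.items():
        print(f"{setting}: {size_kb} KB = {relative_size(size_kb)}% of baseline")
```

Compression 5 roughly halves the file, but at these sizes the savings hardly matter compared to losing reliable OCR.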
Try to lower the amount of incoming paper by
- Requesting electronic bills
- Requesting electronic bank statements
- Setting up ‘machtigingen’ (direct debit authorizations) for automatic payment
(less chance of forgetting a payment and getting reminders)