However, if a PNG input file was used and it contains a gAMA or cHRM chunk (gamma and chromaticity information), either of these forces "convert" to write a BMP4. To get a BMP3 you need to get rid of that information. One way may be to pipeline the image through a minimal 'image data only' image file format like PPM and then re-save as BMP3. Messy, but it should work.
Backing up data to paper
Sort of odd-ball topic, but ended up looking today at paper storage for binary data. Came out of wanting to think about how best to back up things like gnupg keys or password databases.
Anyway, one tool that’s referred to quite often is PaperBack. From my point of view, it’s not a great option as it’s a Windows tool, albeit with full source code. I took a bit of a look through the code to see if I could easily enough port it across to Linux, but didn’t make rapid headway.
So I took a bit more time looking for alternatives. There’s a good summary here of various ways to encode binary data into paper-friendly representations. Most of those (e.g. QR codes) aren’t much good if you’re trying to store tens of kB or more. However, in the comments there was a good link!
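To put a rough number on the QR code point: the largest QR code (version 40, lowest error-correction level) holds at most 2,953 bytes of binary data, so tens of kB means a stack of codes even in the best case. A minimal sketch of that arithmetic follows; the function name `qr_codes_needed` is made up for illustration, and codes meant to survive printing and scanning would likely use a smaller version and a higher error-correction level, so the real count would be higher.

```shell
# Hypothetical helper: how many QR codes would a payload of N bytes need?
# Assumes the theoretical best case of 2953 bytes per code (version 40,
# error-correction level L). Pass a smaller capacity as the second argument
# to model more conservative settings.
qr_codes_needed() {
    # Integer ceiling division: a partially filled code still counts as one
    echo $(( ($1 + ${2:-2953} - 1) / ${2:-2953} ))
}
```

For example, `qr_codes_needed 30000` prints 11, and `qr_codes_needed 30000 1000` (an assumed, more cautious 1000 bytes per code) prints 30.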
Turns out that Twibright have a small program, Optar, that’s squarely in the Linux ecosystem and does much the same (the web page references PaperBack too, so they’re working from the same idea). Nice to see this from Twibright Labs, as I remember looking at their pages years back with a view to building a Ronja setup.
Anyway, turning back to Optar: that programme worked rather nicely. Basic workflow…
../optar_dist/optar README README
../optar_dist/pgm2ps *pgm
# Then print out the postscript file
# Then scan it back in, png format, 600dpi
cp scan.png scanB_0001.png
# Note that I changed default settings, by editing optar.h,
# so as to reduce resolution and increase robustness...
../optar_dist/unoptar 0-32-46-24-3-1-2-24 scanB > output
The very first time I tried this out, the file had some corruption. However, after reducing the resolution a bit, it worked just fine. Would take some guts to use this for a binary format (e.g. encrypted content) though!
Further reading turned up some more options.
Most interestingly, someone has already ported PaperBack in the way I considered, resulting in paperback-cli.
paperbackup is interesting for relying on QR codes, only for ASCII storage though (which is good for things like gpg keys etc.). The readme for that project also has a good list of other data→paper options, which I think is where I came to paperback-cli.
colorsafe should have worked for me but I hit a couple of Python library snags, and instead put energy into paperback-cli:
git clone --recursive -j8 https://git.teknik.io/scuti/paperback-cli
# Note, need recursive as there are sub-modules
cd paperback-cli
make
Building went fine. Encoding was straight-forward also:
# Note went for lower dpi (100) and full size dots, for more robust first go
./paperback-cli --encode -d 100 -s 100 -i LICENSE -o out.bmp
img2pdf --pagesize A4 -o out.pdf out.bmp
So, armed with that info:
convert scan0002.png ppm:- | convert - BMP3:scan0002.bmp3
./paperback-cli --decode -i scan0002.bmp3 -o scan.txt
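A quick way to check which flavour of BMP a conversion actually produced is to read the size of the DIB header, stored as a little-endian 32-bit integer at byte offset 14 of the file: 40 bytes is the classic BITMAPINFOHEADER (what "convert" calls BMP3), 108 is the V4 header (BMP4) and 124 is V5. The sketch below is my own helper built on that assumption, not something shipped with paperback-cli or ImageMagick, and both function names are hypothetical.

```shell
# Hypothetical helper: print the DIB header size of a BMP file.
# The size is a little-endian 32-bit integer at byte offset 14.
bmp_dib_header_size() {
    # od prints the four bytes as unsigned decimals; word splitting
    # puts them into $1..$4, lowest byte first
    set -- $(od -An -t u1 -j 14 -N 4 "$1" 2>/dev/null)
    [ $# -eq 4 ] || return 1   # missing file, or too short to be a BMP
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Hypothetical helper: map the header size to a BMP version name.
bmp_version() {
    case $(bmp_dib_header_size "$1") in
        40)  echo BMP3 ;;
        108) echo BMP4 ;;
        124) echo BMP5 ;;
        *)   echo unknown ;;
    esac
}
```

Running `bmp_version scan0002.bmp3` before decoding should print BMP3 if the PPM detour did its job.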
md5sum scan.txt LICENSE
d32239bcb673463ab874e80d47fae504  scan.txt
d32239bcb673463ab874e80d47fae504  LICENSE
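Comparing two checksums by eye gets tedious over repeated trials. A small sketch that does the comparison and reports the result via exit status, using the same md5sum tool as above; the function name `same_content` is hypothetical.

```shell
# Hypothetical helper: succeed only if both files exist and have the
# same MD5 checksum.
same_content() {
    # Refuse to compare if either file is unreadable; otherwise two
    # failed reads would both give empty output and look like a match
    [ -r "$1" ] && [ -r "$2" ] || return 1
    # Reading from stdin keeps the file name out of md5sum's output,
    # so the two strings differ only if the contents differ
    [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ]
}
```

Usage would be something like `same_content scan.txt LICENSE && echo "round trip OK"`.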
Update 20th August 2018
Not nearly as happy with that as I thought I was! On the positive side, did get a 200dpi version to work, and work first time. However, when I then added a handwritten note to the page, that no longer worked. Then tried trimming the page, folding the note away, etc., but couldn’t get it to work. Re-printed the page, and the reprint wouldn’t work. So the time it worked now looks like a fluke.
Gets to the point where I think to get a working route, I would need to systematically work through tools/variables/tests and establish some performance. Also need to validate exactly the steps outside of the tools themselves (in particular getting the png/bmp file onto paper, where I’m using img2pdf at the moment; but also any dirt/contamination/artifacts from my relatively old and well used scanner), to ensure that problems aren’t being introduced there.
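One possible starting point for establishing some performance numbers is to record how badly each trial failed, rather than just pass/fail. A rough sketch that counts differing bytes with `cmp -l`; the function name is made up, and it assumes the decoder still produces an output file of roughly the right length when things go wrong, which may not hold for every tool.

```shell
# Hypothetical helper: count the byte positions at which two files differ.
# cmp -l prints one line per differing byte; bytes beyond the end of the
# shorter file are not counted (cmp reports that on stderr instead).
count_byte_errors() {
    # The arithmetic expansion strips the padding some wc versions add
    echo $(( $(cmp -l "$1" "$2" 2>/dev/null | wc -l) ))
}
```

Each trial could then append a line to a log, e.g. `echo "200dpi $(count_byte_errors LICENSE scan.txt)" >> results.txt` (the log file name is just an example).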
One random observation: I think I’d prefer now to have a less information-dense format (so more resilient to contamination, distortions, etc.) if that meant it could reliably be scanned via the automatic document feeder. The "waste" in the first case would be offset by the convenience of the latter.