Friday, May 29, 2015

Simple Data Backup with Paper Based QR Codes

Let's say you have some important data you want to protect.  How do you do it?  The obvious answer is encryption, which leaves you with the smaller and more manageable problem of protecting the key.  This is really important though: if you lose the key, the data becomes useless, so it's not uncommon to back it up.  How you safeguard the key and where you store it aren't the subject of this post.  What I want to talk about is a method to ensure the longevity of the data and the medium you store it on that's also dead easy to recover.  (It looks hard, but it really isn't.)


The first thing to consider when thinking about backups is the medium.  If you archived data on a 5.25 inch floppy 20 years ago, you might have a hard time recovering it today.  First you have to find a disk drive to read the information, then you have to hope that the information stored on the magnetic media hasn't degraded, then you need to be able to read the format of the recovered file.  This doesn't just apply to magnetic media.  To quote the National Archives of Australia on the preservation of physical media:

Recordable CDs and DVDs, USB keys and various forms of flash memory have doubtful long-term reliability and are subject to format and software obsolescence.

So what do you do?  The least worst solution is to store the data on paper.  Print it out and put it somewhere safe.  If you want a bit more safety, print out multiple copies and put them in different locations; it's up to you.  If you want to get all tin foil hat, you could use Shamir's Secret Sharing algorithm to split the data into n pieces, any k of which are enough to reassemble it.  For example, split the key into 6 parts such that any 4 are enough to reassemble it, then store each part in a different location.  I'll leave that for another time.  The point is, paper has proven that it can stand the test of time if stored with even the slightest bit of care.  You also don't need specialised equipment to read it (although it helps).

Once you decide to store the data on paper, the question is how to do it.  If the file is binary data you can't just print it, as there'll be non-printable characters, and unless you choose the right font it can be hard to tell the difference between characters, e.g. | l 1.  You could print a hex dump of the file, but if you need to recover the file, re-entering that data could be a very long process.  The easiest way is to use bar codes, QR codes to be exact.  The ubiquity of QR codes leads me to believe that a major catastrophe will have to befall humanity before we forget how to read them.  Even if it has to be done by hand, I think they're a stable format.

The process to go from file to printed QR codes and back again is surprisingly simple when you use the right tools.  There are solutions like PaperBack that accomplish a similar goal, but it seems to use its own barcode format rather than a standard like QR codes.  That brings long-term reliability into question.  The method I propose is listed below and uses software with functions that can be performed manually or easily reproduced with other software.

I decided to test this out using a live USB of the Tails operating system.  The file I've backed up is an example KeePass database I created.  Start by installing the required tools.


    sudo apt-get update
    sudo apt-get install zbar-tools imagemagick qrencode



QRencode is used to create QR codes from terminal input, and that's what we'll be using it for.
Zbar-tools is a flexible, easy-to-use barcode reader that can decode bar codes from an image or a webcam.  We're going to use it to scan the data back into the computer.
ImageMagick is like the Swiss army knife of Linux image editing.  This will be used to combine 6 barcodes onto one page ready for printing.

 

Create the Barcodes


Next the input file is encoded in base64 format.  This probably isn't needed, as QR codes are capable of encoding 8-bit binary data.  I just do it to be safe.  It does waste space, as base64 makes the data about a third larger, so do what works best for you.


    base64 keyfile.kdb > keyfile.64


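The file is then split into a series of smaller files that can be converted to QR codes.  You can only fit so much data into a QR code: a couple of thousand bytes at most, depending on your encoding and level of error correction.  (The largest code holds 2,953 bytes of 8-bit data at the lowest error correction level and 1,273 bytes at the highest.)  Once again, use your judgement.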


    split -n 6 keyfile.64 Passwords_kdb_64
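
If you would rather derive the number of pieces than pick it by eye, the arithmetic can be sketched as below.  This is only an illustration under stated assumptions: it swaps `split -n` for `split -b` so that no piece can exceed the budget, the 1273 byte budget assumes the largest QR code at error correction level H, and `demo.64` is a made-up stand-in for keyfile.64.

```shell
# Assumed per-code budget: 1273 bytes is the 8-bit capacity of the largest
# QR code (version 40) at error correction level H.
capacity=1273

# Hypothetical stand-in for keyfile.64: base64 of 5000 random bytes.
workdir=$(mktemp -d)
cd "$workdir"
head -c 5000 /dev/urandom | base64 > demo.64

# Integer ceiling division gives the smallest number of codes that can hold the file.
size=$(wc -c < demo.64)
pieces=$(( (size + capacity - 1) / capacity ))
echo "$size bytes need at least $pieces QR codes"

# split -b caps every piece at the budget (note: -b, not the -n used in the post).
split -b "$capacity" demo.64 demo_64_
count=$(ls demo_64_* | wc -l)
echo "split produced $count pieces"
```

A smaller budget gives more codes that are each less dense, so treat the number as a starting point rather than a rule.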


Encode each portion of the split file as a QR code.  The -l H option gives the maximum amount of error correction in case the bar code is damaged.  I've processed all files using a command line loop.  This is something to generally avoid.


    for file in ./Passwords_kdb_64*; do qrencode -l H -o "$file.png" < "$file"; done
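
Before printing, it may be worth confirming that each image decodes back to the piece it came from.  A minimal sketch, assuming `qrencode` and `zbarimg` (part of zbar-tools) are installed; `piece_aa` and its contents are made-up stand-ins for one of the split files, and the check is skipped when the tools are missing.

```shell
# Skip quietly when the tools from the install step are not available.
if command -v qrencode >/dev/null && command -v zbarimg >/dev/null; then
    workdir=$(mktemp -d)
    cd "$workdir"

    # Hypothetical piece: a short line of base64 text.
    printf 'c2FtcGxlIGRhdGE=\n' > piece_aa
    qrencode -l H -o piece_aa.png < piece_aa

    # --raw drops the "QR-Code:" prefix, -q silences the scan summary.
    # Command substitution trims trailing newlines on both sides.
    if [ "$(zbarimg --raw -q piece_aa.png)" = "$(cat piece_aa)" ]; then
        status="ok"
    else
        status="mismatch"
    fi
else
    status="skipped"
fi
echo "QR check: $status"
```

This only checks the image file, not the printed page, so a test scan of the printout is still a good idea.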


We'll then combine 6 QR codes into one image containing 3 rows of 2 codes with the filenames under each code.  If you have more than 6 bar codes don't worry about it; ImageMagick will create as many output images as you need.


    montage -label '%f' *.png -geometry '1x1<' -tile 2x3 Passwords_kdb_64.png


[Image: QR code backup.  Resulting QR codes storing a password database]

Recover the Original Data


Scan each of the QR codes in order using zbarcam and redirect the output to a file.  Each code is on a new line with a header identifying the type of code scanned.  The new lines and headers need to be removed.  This was done manually.


    zbarcam > keyfile.64
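
I removed the headers by hand, but the clean-up can also be scripted.  The sketch below is an assumption-laden illustration rather than what I actually ran: it fakes the scanner output (a `QR-Code:` prefix in front of each code's data) instead of using a webcam, and the file names and sample text are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical payload standing in for the original file.
original='paper backups survive format rot'

# Encode and cut into 3 pieces, the same way the post does with 6.
printf '%s' "$original" | base64 > demo.64
split -n 3 demo.64 piece_

# Simulate the scanner output: "QR-Code:" prefix, the data, then a newline.
for p in piece_*; do printf 'QR-Code:%s\n' "$(cat "$p")"; done > scanned.txt

# Strip the prefix from each line, drop the newlines, and decode.
sed 's/^QR-Code://' scanned.txt | tr -d '\n' | base64 -d > restored.txt
echo "restored: $(cat restored.txt)"
```

zbarcam also has a `--raw` option that omits the prefix, which leaves only the newlines to deal with.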


The last step is to convert the base64 encoded file back to the original binary file.


    base64 -d keyfile.64 > keyfile.kdb
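
One way to be sure the recovered file is identical to the original is to record a checksum before printing and compare it after recovery.  A minimal sketch, assuming `sha256sum` from GNU coreutils; the file names are hypothetical and the base64 round trip stands in for the print and scan steps.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for keyfile.kdb.
head -c 2048 /dev/urandom > demo.kdb

# Before printing: record the digest (short enough to write on the page).
sha256sum demo.kdb | awk '{print $1}' > demo.kdb.sha256

# Stand-in for printing and scanning: encode, then decode again.
base64 demo.kdb > demo.64
base64 -d demo.64 > recovered.kdb

# After recovery: compare the digest of the recovered file to the record.
expected=$(cat demo.kdb.sha256)
actual=$(sha256sum recovered.kdb | awk '{print $1}')
if [ "$expected" = "$actual" ]; then verdict="intact"; else verdict="corrupted"; fi
echo "recovered file is $verdict"
```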


There you have it, file to QR code and back again.  What I like about this method is that even if all the software used to create the final output image disappears, the encoded data can still be recovered as long as you can decode a QR code and convert a file from base64 back to binary.  Both of these processes are widely known.

You can find all the associated files below.
https://gist.github.com/GrantTrebbin/0c6aadc7ecebe3107d08
https://drive.google.com/folderview?id=0B5Hb04O3hlQSfmwzVFdCTS1YZm8xSVVLZm95by0zLVpaTHR2WE1XcTVicWE5NUFJZjg4cGs&usp=sharing

