
Saturday, June 3, 2017

Generating Stippled Images with Stiptacular

A few weeks ago I posted about how to generate stippled images from regular input images. The code was garbage at the time, so I've improved it and posted it for people to use or learn from.  I only just remembered why I started this project. I found the StippleGen program from EvilMadScientist (EMS), but it was written in the Processing development environment, which I didn't have much luck with. I thought it'd be great to have a Python version out there as well, and along the way I added my own tweaks:

  • Better initial distribution of seed points via a PseudoHilbert Curve
  • Dithering to make points distribute more evenly
  • A term that sets how much points are attracted to darker areas

Since I'm trying to replicate the work of EMS, I thought I'd use their test image of Grace Kelly. Their interface is beautiful and has a few bells and whistles that mine doesn't, because I figured any pre-processing could be done in something like GIMP.

Grace Kelly
Grace Kelly test image 943x834
For an initial test I'll use 2000 points with 5 rounds of dithering and 5 rounds without dithering.  The simplest way to think of the adjustment parameter is as a contrast setting.  In this case it is 3, which cubes the value of each pixel and re-scales all the data to a 0-255 range.

number_of_points = 2000
dot_radius = 2.5
non_dithering_iterations = 5
dithering_iterations = 5
adjustment_parameter = 3
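
If you're wondering what the adjustment step actually does, it only takes a couple of lines.  Something like this (a NumPy sketch, not the exact code in Stiptacular):

    import numpy as np

    def adjust(image, adjustment_parameter=3):
        # Raise every pixel to the given power, then rescale back to 0-255.
        data = image.astype(np.float64) ** adjustment_parameter
        data -= data.min()
        if data.max() > 0:
            data *= 255.0 / data.max()
        return data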

Processing the image with the above settings took about 90 seconds and produced the following SVG file.

Grace Kelly
Large Grace Kelly test image - 2000 points

To demonstrate another quirk I noticed, take a look at the smaller image below.  It contains the same number of dots as the image above.  As a matter of fact, it's the exact same image, just scaled down.  However, the dots start to look more like a face, so scale is important.  If I were to do an 8 foot print of this for a wall in my house it wouldn't look that good, because I couldn't get far enough away from it for the image to emerge. I would need to use more points.
Grace Kelly
Small Grace Kelly test image - 2000 points
The image below uses 30000 points and took about 25 minutes to generate. Writing this in something like C would help a lot.  Calculating the centroids of the Voronoi regions can be done in parallel to speed things up as well.
Grace Kelly
Grace Kelly test image - 30000 points
You may notice that some of the darker regions look a little strange. This is caused by points aligning, as can be seen in the close up below. This can be solved by reducing the number of points, dropping the adjustment parameter so the area isn't so crowded, performing more dithering steps, or just enlarging the source image before processing.

Stipple Error
Alignment artefacts
Get the Code!
You'll also need to install the PseudoHilbert Curve Module for Stiptacular to work.  It's a dependency that I'd like to eventually remove, but for now it's needed.

The posts below are my train of thought while developing this script.  They may help if you get a bit lost.

Voronoi Stippling
Entry and Exit Points for Space Filling Paths on a Grid
Hilbert Curve Generation With Lookup Tables
Converting Binary to Gray Code with XOR
Calculating Hilbert Curve Coordinates
Pseudo Hilbert Curve for Arbitrary Rectangular Regions - Part 1
Pseudo Hilbert Curve for Arbitrary Rectangular Regions - Part 2
Efficient Centroid Calculation for Discrete Areas
Generating Seed Points For Voronoi Stippling
Generating Stippled Images with Stiptacular

Here's a picture of Atlantis during the STS-132 shuttle launch, made of 30000 points. To infinity and beyond!
STS-132
STS-132
I wish I had the time to do an even deeper dive on this type of problem.

Friday, May 5, 2017

Generating Seed Points For Voronoi Stippling

I've been working on a way to generate an initial distribution of seed points for a Voronoi stippling program that I played with a while ago.  The problem is that for the final image to look good you really need a lot of points, and when you have a lot of points they are very slow to converge to a solution.  The process I used in the past was to generate a few points, process them, split each one in two, process them again, and so on.  Each time the number of points doubles and a solution is eventually found, but it's still not great.  An improvement I thought I could try is to linearize the image via a Pseudo-Hilbert (PH) curve, apply a rolling threshold to the data, and place points at the locations where the counter rolls over.  You may see a major optimisation that I've known about all along but haven't implemented.  I'll give you a clue: is a PH curve really necessary?  Find out at the end of the post.😉

The implementation of the threshold is a little abstract so I'll just explain the general concept. First of all, we always work on an inverse grey scale version of the image, as we are distributing black dots and the background is white. Imagine that the sum of all the pixels in the image is equal to 1,000,000 and you want 2000 points.  This means that each point represents 500.  So as you travel along the PH curve you add up the pixel values, and every time you accumulate another 500 you place a point.
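
In rough Python it looks something like this (just a sketch, assuming curve is a list of (x, y) pixel coordinates in Pseudo-Hilbert order and image is the inverted grey scale data):

    import numpy as np

    def seed_points(image, curve, number_of_points):
        # Each point represents an equal share of the total "ink" in the image.
        threshold = image.sum() / number_of_points
        accumulator = 0.0
        points = []
        for x, y in curve:                  # walk the pixels in Pseudo-Hilbert order
            accumulator += image[y, x]
            if accumulator >= threshold:    # the counter rolled over, place a point
                points.append((x, y))
                accumulator -= threshold
        return np.array(points)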

The points generated from this process are then iterated on via the Voronoi stippling process to produce a result. Another small feature I've added is a dithering step. For a certain number of iterations a random displacement is added to each point.  This can be seen in the GIF below as a jiggling of the points. What is the purpose of this? Imagine a jar of jelly beans that won't quite close. What do you do?  You pick the jar up and shake it to settle the contents.  That's what we're doing here.  What you end up with is a set of points that, when randomly perturbed, return to their original positions, i.e. a stable arrangement.
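
The dithering step itself is nothing special, just something along these lines (the size of the displacement here is made up):

    import numpy as np

    def dither(points, magnitude=1.0):
        # Nudge every point by a small random amount before the next iteration.
        return points + np.random.uniform(-magnitude, magnitude, points.shape)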

The results in the GIF aren't that great.  The main reason for this is that I've kept the features large so you can see what's happening.  If you really wanted to process this image, the best way to do it is to increase the size of the image first.  When the calculated Voronoi regions are small and only contain a few pixels, the points won't move, because quantization means the centroid calculations always yield the same result.  So if you notice herding of points in the final image, try enlarging the image and reprocessing it.
Distribution of Seed Stippling Points on a Pseudo-Hilbert Curve
To test the algorithm I've used an image of a plant from the paper I'm working from: Weighted Voronoi Stippling, by Adrian Secord.  If this link becomes dead, contact me; I have a copy of the paper.
Test Image Used in Adrian Secord's Paper
In the image below I've started modestly with 4000 points to test that everything is working properly, and as you can see, everything is working.  On closer inspection though, you can see that the dynamic range, for want of a better term, of the image has been reduced.  In other words, the areas that are meant to be dark aren't as dark as they're meant to be, and in comparison, the areas that are meant to be light aren't as light.
4000 points - power 1
The dynamic range problem can be solved by pre-compensating the image, specifically by raising each pixel value to a power or exponent.  In the image below the value of each pixel has been squared before processing.  I've referred to this as a power parameter.  I've re-scaled the resulting data back to a range of 0-255 just for consistency, but it's really not required.
4000 points - power 2
In the next image the number of points has been doubled and the power parameter has been pulled back slightly to 1.9 and things are starting to look even better.
8000 points - power 1.9
Let's step things up once again and go for 20000 points to match the example in the paper. I've also pulled back the power parameter to 1.6. Why? It's subjective. I think it looks better.
20000 points - power 1.6

I think that looks pretty spectacular.  Of course I've had to resize the image and scale it down a bit so downloading this page isn't a 100 MB exercise.  There's a little bit of quality degradation, but the original vector files are even better.

The point of this post however is to discuss the PH curve approximation for the seed points and how close it gets us to the final image.  What does the initial distribution of seed points look like before any processing?
20000 points - power 1.6 - no processing
The structure is immediately visible, but there is a fuzziness to the image. This is more apparent when looking at an enlarged section. The points are distributed on a grid so you will see more lines and square corners in the arrangement.

Close Up Section of Seed Point Distribution
So what happens after processing?  The points are allowed to find their own space and curved features start to appear.  I can't explain the exact reasoning, but in sections the points form curved lines running in multiple directions.  If you can't see what I'm talking about, go and have a look at something called a Fermat spiral (also a way to distribute points evenly over an area) and you'll see what I mean.  All in all, things just look more pleasing to the eye.
Close Up Section of Points After Processing
I'm sorry but I'm not releasing the code on this one just yet. Why? Because it's terrible. There are very few comments and a lot of sections where code has been commented out. I know the theory is to develop in public and let the code be scrutinised, but I try to make a program so that it's not only functional but it also explains the concept I'm trying to describe to the user.  At this point I think it would just confuse people.

It's also hard to develop things cleanly when you don't exactly know where they are going to go. Unlike a lot of programming, where you have a clearly defined goal and a general way to get there, I usually start out knowing neither.  Think of this type of coding as a science experiment.  I'll try different things and see what the results are, change things until I get what I want, and then go back, clean up a section, and lock it down.

I will release this soon though.  There are a couple of things to fix, but it's mostly there.  If for some reason I don't post the code, let me know and I'll help you out.

Here is the optimization I didn't implement.  You don't need a PH curve.  Just add a white border to the source image to make it a square so that the length of a side is a power of 2 and then use an actual Hilbert curve and crop the result.  They are way more efficient to generate than a PH curve.  I did it this way because I was interested in the problem of covering arbitrary rectangular regions with a curve.  As this blog is a labour of love I reserve the right to go off on the occasional tangent to satisfy my curiosity.

Friday, April 21, 2017

Efficient Centroid Calculation for Discrete Areas

A project I'm working on requires the repeated calculation of weighted centroids for arbitrary regions in an image.  Calculation of centroids is fairly straightforward but computationally expensive.  To calculate the x and y coordinates you first need to calculate three other values: the denominator, the x-numerator, and the y-numerator.  The denominator is simply the sum of all the values in the region, the x-numerator is the sum of all the values in the region multiplied by their x coordinates, and the y-numerator is the sum of all the values in the region multiplied by their y coordinates.  From these values, the x coordinate of the centroid is equal to the x-numerator divided by the denominator, and likewise for the y coordinate.

You can see from this description that there are a lot of time consuming multiplication operations to perform for each area that you want the centroid of.  As it turns out though, you can perform two cumulative summations along the rows of the input data as a first step, store the results, and then perform the centroid calculation by only accessing values at the boundary of the region. For reasons I'll explain later, a column of zeros with x coordinate -1 is added to the start of the data.  A cumulative summation is performed along each row to obtain $$$P$$$, and a cumulative summation along each row of $$$P$$$ is performed to generate $$$Q$$$.  The following equations describe the process.  Before we get started, the derivation of these equations can be found in this small article I wrote.
Equations
Centroid Calculations
It may not be immediately obvious what these mean so I'll give a quick explanation.  First of all, each of the above calculations is done row by row and the results are added together. This is what the summation symbol over $$$y$$$ represents.  For the denominator, the value to be summed is the difference between two values on a row of $$$P$$$. For the y-numerator the calculation is the same as the denominator, but each row's contribution is multiplied by its y coordinate.  The calculation of the x-numerator is a little different. It's similar in that values of $$$P$$$ at the boundary are subtracted and multiplied by $$$x+1$$$, but there is the additional step of subtracting the difference of the values of $$$Q$$$ at the boundaries.  Maybe an example will help.

Spreadsheet
Example Calculation

In the first data set on the top left an area is marked green.  The centroid of this region is to be found. A column of zeros is added to the start of the data.  This is seen in the data set on the top right.  These zeros are added because without them, when performing the operation in software, you end up accessing the arrays outside of their bounds.

Typically when calculating centroids, each green cell would have to be accessed for each calculation. However, when using the new method described above, only the cells in blue are accessed.  You may also notice that the width of the region doesn't change the number of operations.  The complexity of the calculation depends only on the height of the region.  I've included this spreadsheet for you to play around with and get a better understanding of the process.

One last thing to note is that adding the column of zeros while maintaining the original $$$x$$$ indices is silly, as it creates a column with an index of negative one.  This is where the adjusted x index comes in.  Using it with the following adjusted equations allows the calculation to be performed easily on computers.
Equations

As an example, let's calculate the centroid of the green section on only row 5.  Its $$$x$$$ bounds are $$$x_1=4$$$ and $$$x_2=7$$$, but in the above equations $$$P$$$ and $$$Q$$$ are accessed via the adjusted $$$x$$$ index at 4 and 8 (8 because of the +1 on the upper index).  This means the denominator is equal to $$$(356-170) = 186$$$, the y-numerator is equal to $$$5(356-170)=930$$$, and the x-numerator is equal to $$$(8 \times 356-4 \times 170)-(1560-420)=1028$$$.  This leads to centroid coordinates of (5.527, 5), which is what you would expect: as it's a single row the y value is obviously 5, and the x value is a little to the right of centre due to the increasing trend in the numbers. The centroid coordinates are given in the original x index to allow accessing the original data.
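
To make that concrete, here's a small sketch of the whole scheme in NumPy (not my actual code; the region is passed in as a list of per-row bounds just for illustration):

    import numpy as np

    def prepare(image):
        # Prepend the column of zeros, then take two cumulative sums along the rows.
        padded = np.hstack((np.zeros((image.shape[0], 1)), image))
        P = np.cumsum(padded, axis=1)
        Q = np.cumsum(P, axis=1)
        return P, Q

    def region_centroid(P, Q, row_bounds):
        # row_bounds is a list of (y, x1, x2) tuples in the original x indices.
        denominator = x_numerator = y_numerator = 0.0
        for y, x1, x2 in row_bounds:
            a1, a2 = x1, x2 + 1                      # adjusted x indices into P and Q
            d = P[y, a2] - P[y, a1]
            denominator += d
            y_numerator += y * d
            x_numerator += a2 * P[y, a2] - a1 * P[y, a1] - (Q[y, a2] - Q[y, a1])
        return x_numerator / denominator, y_numerator / denominator

Feeding in the single row from the example gives the same 186, 930 and 1028 as the hand calculation.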

Wednesday, February 1, 2017

Voronoi Stippling

I'm trying to write my own software to convert normal raster images to a corresponding stippled version.  If you're unfamiliar with the term, stippling is a style of artwork that uses a large number of small dots to represent a scene.  Areas with a high density of dots appear darker, while areas without dots are white.  Normally I like to have a reasonably good understanding of a subject before I do a blog post about it, but in this case I'll give a quick outline of what I'm trying to achieve and then fill in the details in a later post.

So, how do you get an image like this,
Plant
Stippled Image

From something like this?
Plant
Original Image

To start with I'm trying to replicate the process in a paper by Adrian Secord.

Weighted Voronoi Stippling
Adrian Secord
2nd International Symposium on Non-Photorealistic Animation and Rendering (NPAR 2002)

The first step is to generate a number of points equal to the desired number of stippling points in the final image.  They can be randomly distributed over the image, or ideally placed, via a rough approximation, close to the locations of the stippling dots in the output image.  As the process I'm about to describe is iterative, the closer the initial approximation is to the final output, the faster the algorithm will converge.

After the points have been created, their Voronoi diagram is generated.  A Voronoi diagram produces a region for each input point that identifies all the locations that are closer to that point than to any other.  There are some tricks to making sure that the regions around the boundary are properly created, but in general it's a simple call to scipy.spatial.Voronoi.

That's great, but the process so far hasn't taken the input image into account (unless you use it to generate the initial point locations).  Now that a number of points and their corresponding regions have been created, the weighted centroid of each region is calculated, using the image as the weighting.  The centroid then becomes the location of the new point.  This process of creating Voronoi diagrams and calculating centroids is repeated until some convergence criterion is met, and is better known as Lloyd's algorithm.
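
As a rough sketch of one iteration (ignoring the boundary tricks above, and finding the regions by assigning each pixel to its nearest point, which amounts to the same thing on a discrete image):

    import numpy as np
    from scipy.spatial import cKDTree

    def lloyd_step(points, weights):
        # points is an (n, 2) float array of (x, y) locations and weights is the
        # inverted grey scale image, so dark areas carry more weight.
        h, w = weights.shape
        ys, xs = np.mgrid[0:h, 0:w]
        pixels = np.column_stack((xs.ravel(), ys.ravel()))

        # Assign every pixel to its nearest point; this is a discrete Voronoi diagram.
        nearest = cKDTree(points).query(pixels)[1]

        wts = weights.ravel()
        new_points = points.copy()
        for i in range(len(points)):
            mask = nearest == i
            total = wts[mask].sum()
            if total > 0:
                # The weighted centroid of the region becomes the new point.
                new_points[i] = (pixels[mask] * wts[mask][:, None]).sum(axis=0) / total
        return new_points

Repeating lloyd_step until the points stop moving (or for a fixed number of iterations) is all there is to it.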

In the image below, the black dots are the initial points and the white dots are the centroids of the Voronoi regions after an iteration.
Voronoi Diagram
Voronoi Diagram with centroids
Overlaying this diagram on the original image helps to illustrate the process a little better.  Obviously the image below is sparsely stippled and the object isn't recognisable with only 64 points.

Voronoi Diagram
Voronoi Iteration

I've generated the next diagram with I think 256 points to explain a subtle point.  The Voronoi region for each point is a convex polygon, so its centroid will always be located inside the region.  As the number of points increases, the size of the regions decreases.  This means that the process will take a long time to converge, as the points can only move a small amount during each iteration.  That's why you need to place the initial points as close to their final locations as possible.  Or do you?

I only discovered this problem after programming my solution and needed a quick way to overcome it.  If a large number of points take a long time to converge, then a small number should converge quickly.  So I started with only two points.  After a few iterations, each point splits, like a cell undergoing mitosis.  Each point splits into two new points diametrically opposed on a unit circle centred on the old point, with the direction of the split chosen randomly.  (I have a hunch that continually splitting in the same direction would cause artifacts that delay convergence.)  These new points are then iterated on for a while and then split again.  The key is to get the points close to where they need to be early, then converge and add the detail later.
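
In code the split is only a few lines; roughly:

    import numpy as np

    def split_points(points):
        # Replace each point with two points diametrically opposed on a unit
        # circle centred on the old point, in a randomly chosen direction.
        angles = np.random.uniform(0, 2 * np.pi, len(points))
        offsets = np.column_stack((np.cos(angles), np.sin(angles)))
        return np.vstack((points + offsets, points - offsets))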
Voronoi Diagram
Dense Voronoi diagram
I have some ideas I want to try to make the image better.  There are obviously some linearity problems with the output.  White parts aren't as light as they should be.  My code is a mess too.  It needs a clean up.  I'm having fun though.

Wednesday, May 4, 2016

Visually Appealing Image Layout Algorithm

Let me pose a problem.  If I were to give you a number of randomly sized images that needed to be  displayed in order, in a row with a set width, how would you do it?  To make things slightly more complicated, you're not allowed to crop the images either.  What's the most visually appealing way to arrange and size the images?

This question came about as I was trying to display the results of another program I wrote.  My goal was to take a number of images, scale and place them in rows in a specified order to fill an image of a certain size.  My first attempt was based on a blog post theorizing about how the image layout algorithm for Google Plus works.  The described algorithm is quite simple.  Start with a single image and scale it to fill the required width.  After scaling, it will have a certain height.  If the resulting row is too high, add another image and scale them so that they are the same height and their total width is what you require.  Keep repeating this process until the row is the right height.  The equation below describes how to find the height of the row.
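
If image $$$i$$$ has an original width $$$w_i$$$ and height $$$h_i$$$, and every image in the row is scaled to a common height $$$H$$$ so that the scaled widths add up to the row width $$$W$$$, then

$$$H = \frac{W}{\sum_i \frac{w_i}{h_i}}$$$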

You can see from this that as you add another image, you add another w/h term to the denominator of the fraction.  As this will always be a positive number, the row height will decrease each time you add an image.  Let's have a look at my first attempt using this method.

Image Gallery Test
Pretty snazzy huh?  Wait a minute, what's going on in the first couple of rows?  All of those dots are actually images, and do you know why they look like that?  Remember how adding a new w/h term decreased the row height?  Adding an image with a large aspect ratio adds a very large w/h term.  In this case, the long thin image added last causes the row size to drop to around a single pixel high.  That's far from ideal.  On the plus side, for images with relatively normal aspect ratios, this algorithm does a good job.  It seems to be the basis for the jQuery plugin, flexImages from Pixabay.  I should add that the images above need to be padded better to make sure that they line up on the right hand side, but that's a trivial step for later.

Where do we go from here?  The dimensions still satisfy our requirements that all images have the same height, and that the total width be a specific number of pixels.  After looking at a lot of image galleries online and thinking about it intensely for a week, I came up with a formal definition of the problem.

Formal definition of the problem
It doesn't look like much, but it gives us a framework to solve the problem.  One subtle point to notice is that alpha is between 0 and 1.  This means that images should never be enlarged.  I believe that enlarging images to fill empty space is the wrong way to go.  If a 1x1 pixel image was added, enlarging it to fill a 100 pixel high row doesn't add any information; all it does is take up 100 pixels of horizontal space that could be used for images that do have information.  Besides, enlarged images never look good, they always come out blurry, regardless of the interpolation method used.

Looking at the definition above, it becomes clear that we are trying to solve for the values of alpha.  We know alpha is bound by the values 0 and 1.  We also have to obey the constraint that the total width of the scaled images is equal to some value W.  Conveniently, this is easily fed into the minimize function of the scipy.optimize Python library.  Using the SLSQP method, nonlinear constrained optimization problems can be solved.

Some of you may have rightly noticed that my definition of the problem doesn't solve anything.  Optimization algorithms require an objective function.  This is the function that is minimised in the optimisation process, and it's the part I've been struggling with.  The best that I've come up with so far is to minimise the sum of the standard deviation of the heights of the scaled images and the standard deviation of the widths of the scaled images, i.e. std(heights) + std(widths).  To test this method I've written a program that takes image sizes and tries to make a row out of the dimensions.  The output is displayed as black images on a white background.  The results aren't too bad.
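
Stripped down, the setup looks something like this (a sketch rather than my actual script; the names are made up):

    import numpy as np
    from scipy.optimize import minimize

    def layout_row(widths, heights, row_width):
        # One scale factor (alpha) per image, each bounded between 0 and 1.
        widths = np.asarray(widths, dtype=float)
        heights = np.asarray(heights, dtype=float)
        n = len(widths)

        def objective(alpha):
            # std(heights) + std(widths) of the scaled images
            return np.std(alpha * heights) + np.std(alpha * widths)

        # The scaled widths have to fill the row exactly.
        constraint = {'type': 'eq',
                      'fun': lambda alpha: np.sum(alpha * widths) - row_width}

        result = minimize(objective, np.full(n, 0.5), method='SLSQP',
                          bounds=[(0.0, 1.0)] * n, constraints=[constraint])
        return result.x                      # the scale factor for each image

Multiplying the returned alphas by the original widths and heights gives the final dimensions for the row.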



Image row optimization first pass

You can see in the image above that none of the images dominate the row and all get a reasonably fair proportion of the display.  The tiny image at the end remains that size because it hasn't been scaled up, which is one of the bounds.  Ideally I'd like all the images to be the same height.  So I took the results of the first pass and fed them back through the optimisation process.  This time however, the bounds are set to be within 85 to 115 percent of each of the values of alpha from the first round, still bound by 1 on the high side.  The objective function has also been changed to only minimise the standard deviation of the heights, not the widths.

Image row optimization second pass

That's as far as I got.  I think the second layout looks better.  I may need to walk away from this one for a while and think about what I'm trying to optimise.  That's the trick, coming up with the right optimisation parameters.

I've really enjoyed playing around with this.  This is one of those experiments that I can indulge in because I'm not doing this as a job.  I can take the occasional detour and explore interesting topics just because I want to.

https://github.com/GrantTrebbin/sortamajig/blob/master/optimiser.py
Get the Code


In case anyone else is interested in this problem I've collected some links to sites that have relevant information.

A good description of the different types of layouts used
http://blog.vjeux.com/2012/image/image-layout-algorithm-facebook.html

http://blog.vjeux.com/2012/image/image-layout-algorithm-lightbox.html

http://blog.vjeux.com/2012/image/image-layout-algorithm-lightbox-android.html

http://blog.vjeux.com/2012/image/image-layout-algorithm-500px.html

http://blog.vjeux.com/2013/image/image-gallery-left-and-right-areas.html

http://blog.vjeux.com/2014/image/google-plus-layout-find-best-breaks.html

A different layout that can use snapping
http://www.greentreelabs.net/isotope-masonry-tiles-gallery-and-final-tiles-gallery/

A layout algorithm that uses cropping
http://www.techbits.de/2011/10/25/building-a-google-plus-inspired-image-gallery/

Monday, August 3, 2015

8MP Aliexpress ELP Webcam

I've recently been thinking about using the Raspberry Pi for image processing and wanted to use a Logitech C525 web cam that I'd bought earlier in the year.  It quickly became apparent that its resolution was less than expected and although it'd probably be fine, I didn't want the image quality to be the limiting factor of my project.  The next step was to start looking around for cheap web cams on my favourite Chinese sites.

I wanted a reasonable image quality and resolution for a decent price, and luckily enough it turns out that this isn't too hard to find.  A store on AliExpress called Ailipu Technology was selling an ELPCCTV brand 8 MP USB web cam for approximately 85 AUD.  I'll get into the specs later on, but first I want to look at the construction as it differs from a standard retail web cam.

WebCam
ELP-USB8MP02G-L75
The camera is a cubic shape made of an extruded 40 mm aluminium profile with metallic front and back plates.  The camera is mounted to the front plate with a manually adjustable lens protruding out approximately 10mm.

WebCam
Back of Web Cam
The USB cable is removable and is connected with a 4 pin locking connector on the back plate.  I assume this allows easier installation if it's used as a security camera.  Most of the products the company sell are marketed to the security industry.

Connector
Web Cam Connector
At the bottom of the connector in the image above you can just see a small metal ball that's used as part of the locking mechanism.

Connector
Web Cam Connector
The outer part of the connector in the image above is a retractable shell that locks the plug in place when it's inserted.

So now for the specs.  The camera has a claimed resolution of 3264 x 2448 and uses a Sony IMX179 sensor.  It turns out that this seems to be the same sensor used in the Nexus 5 phone from 2013.  The business model the company appears to use is to take old excess phone image sensors (possibly lower grade parts) and repurpose them as web cams.  The sales page claims 15 frames per second at maximum resolution, but that seems a little far-fetched.  I've only been able to get 2 fps, which seems more realistic for a USB-limited connection.

Before using the camera I wanted to be certain that the stated resolution was real and not some made up interpolated number.  In the past I've seen webcams that market themselves at the interpolated resolution, which in my opinion is a blatant lie.  To check, I found an ISO 12233 test chart from Cornell University and used it to verify the camera's specifications.

The process is simple: take some photos of a high quality printed version of the chart, and then analyse the images to observe the resolution limits.  I went to the local office supply store and had an A1 version of the test chart printed for a couple of dollars, then glued it to a 600 x 900 mm Masonite panel that had been painted black.

Test Pattern
ISO 12233 Test Chart
As you can see from my test apparatus below, this wasn't going to be a very scientific test.  The test pattern isn't completely perpendicular to the camera, but by doing the test outside I got as much light as possible on the board to take a good image.  For my analysis I'm only testing the vertical resolution just to double check the claims the manufacturer made.

Calibration Setup
Test Set-up
First up is the 8 MP ELP camera with a 3264 x 2448 sized image.

Test Pattern
ELP Camera Test Image
The lines on the vertical resolution test marker start to merge at about the 1400 lines per picture height mark.  As the chart in the image is 1750 pixels high, this means that the image is a total of 1400*(2448/1750) = 1958 lines high.  This may seem small, but it's 80 percent of the stated vertical resolution, and when you consider the non-ideal test arrangement and my eyeballing of the resolution chart, I'm satisfied that the resolution claims are correct.
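
For anyone wanting to repeat the sums, the conversion is just a ratio (the function name here is mine):

    def total_lines(merge_point_lph, chart_height_px, image_height_px):
        # Scale the lines-per-picture-height reading on the chart up to the
        # full height of the captured image.
        return merge_point_lph * image_height_px / chart_height_px

    print(total_lines(1400, 1750, 2448))   # roughly 1958 lines for the ELP camera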

Test Pattern
ELP Camera Resolution Limit
While I was at it I thought I'd test some other web cams.  First up is the Logitech C525 Auto focus camera.
WebCam
Logitech C525 Web Cam
The test shot is at its maximum resolution of 1392 x 768.

Test Pattern
Logitech C525 Camera Test Image
In this case the markers merge at the 650 lines per picture height mark.  As the chart is 600 pixels high, this means that there should be 650 * (768/600) = 832 lines in the image.  This is about 8 percent higher than the stated resolution, and when you take into account the errors in the measurement method, I'm happy that the stated resolution is correct.

Test Pattern
Logitech C525 Camera Resolution Limit
Finally, let's test my old C120 camera.

WebCam
Logitech C120 Web Cam
It produces an image with a resolution of 640 x 480.

Test Pattern
Logitech C120 Test Image
You're going to have to take my word for it, but the test markers merge at about the 350 vertical lines per image mark.  As the picture is 347 pixels high, the total image has a resolution of 350*(480/347) = 484 lines, which agrees with the specified resolution.

Test Pattern
Logitech C120 Camera Resolution Limit
In the end I'm very happy with the camera.  It has good resolution and image quality for the price.

Sunday, July 12, 2015

Raspberry Pi USB Webcam Testing

This post will unfortunately be very short.  I was playing around with a webcam connected to a Raspberry Pi, and just as I was starting to make headway I got sick.  I was using a Logitech C525 to take still images.

webcam
Logitech C525


I got far enough into the project to realize that the camera didn't have enough resolution for what I wanted to do.  So I need to look at other options.

Just to document what I've done, I'll include the commands I was using here.  fswebcam was used to take the still images and uvcdynctrl was used to configure the camera, but the latter is poorly documented and I'm still trying to figure out how to use it.  You may need to install packages to use these commands.


fswebcam  -r 1920x1080 -s brightness=70% -s gain=50% -S 10 test.jpg
uvcdynctrl -L a.txt
uvcdynctrl -s "Focus (absolute)" 220


I promise I'll do better next week.

Friday, May 29, 2015

Simple Data Backup with Paper Based QR Codes

Let's say you have some important data you want to protect; how do you do it?  The obvious answer is encryption, which then leaves you with the smaller but more manageable problem of protecting the key.  This is really important though: if you lose the key, the data becomes useless, so it's not uncommon to back it up.  How you want to safeguard the key and where you want to store it aren't the subject of this post.  What I want to talk about is a method of storage that ensures the longevity of the data and the medium it's stored on, and that's also dead easy to recover.  (It looks hard, but it really isn't.)


The first thing to consider when thinking about backups is the medium.  If you archived data on a 5.25 inch floppy 20 years ago, you might have a hard time recovering that today.  First you have to find a disk drive to read the information, then you have to hope that the information stored on the magnetic media hasn't degraded, then you need to be able to read the format of the recovered file.  This doesn't just apply to magnetic media.  To quote the National Archive of Australia about the preservation of physical media:

Recordable CDs and DVDs, USB keys and various forms of flash memory have doubtful long-term reliability and are subject to format and software obsolescence.

So what do you do?  The least worst solution is to store the data on paper.  Print it out and put it somewhere safe.  If you want a bit more safety, print out multiple copies and put them in different locations; it's up to you.  If you want to get all tin foil hat, you could split the data into n pieces that only require k parts to reassemble using Shamir's Secret Sharing algorithm.  For example, split the key into 6 parts that only require any 4 pieces to reassemble, then store each portion in a different location.  I'll leave that for another time.  The point is, paper has proven that it can stand the test of time if stored with even the slightest bit of care.  You also don't need specialised equipment to read it (although it helps).

Once you decide to store the data on paper then comes the question of how you plan to do this.  If the file is binary data you can't just print it, as there'll be non-printable characters, and unless you choose the right font it can be hard to tell the difference between similar characters, e.g. | l 1.  You could print a hex dump of the file, but if you need to recover the file, re-entering that data could be a very long process.  The easiest way is to use bar codes, QR codes to be exact.  The ubiquity of QR codes leads me to believe that a major catastrophe will have to befall humanity before we forget how to read them.  Even if it has to be done by hand, I think they're a stable format.

The process to go from file to printed QR codes and back again is surprisingly simple when you use the right tools.  There are solutions like PaperBack that accomplish a similar goal, but it seems to use its own barcode format rather than a standard like a QR code.  That brings long term reliability into question.  The method I propose is listed below and uses software with functions that can be performed manually or easily reproduced with other software.

I decided to test this out using a live USB of the TAILS operating system.  The file I've backed up is an example Keepass database I created.  Start by installing the required tools.


    sudo apt-get update
    sudo apt-get install zbar-tools imagemagick qrencode



QRencode is used to create QR codes from terminal input, and that's what we'll be using it for.
Zbar-tools is a flexible, easy to use barcode reader that can decode bar codes from an image or webcam.  We're going to use it to scan the data back into the computer.
ImageMagick is like the Swiss army knife of Linux image editing.  This will be used to combine 6 barcodes onto one page ready for printing.

 

Create the Barcodes


Next the input file will be encoded in base 64 format.  This probably isn't needed as QR codes are capable of encoding 8-bit binary data.  I just do it to be safe.  What I did actually wastes space, so do what works best for you.


    base64 keyfile.kdb > keyfile.64


The file is then split into a series of smaller files that can be converted to QR codes.  You can only fit so much data into a QR code: a couple of thousand bytes, depending on your encoding and level of error correction.  Once again, use your judgement.


    split -n 6 keyfile.64 Passwords_kdb_64


Encode each portion of the split file as a QR code.  The -l H option gives the maximum amount of error correction in case the bar code is damaged.  I've processed all files using a command line loop.  This is something to generally avoid.


    for file in ./Passwords_kdb_64*; do qrencode -l H -o $file.png < $file; done


We'll then combine 6 QR codes into one image containing 3 rows of 2 codes with the filenames under each code.  If you have more than 6 bar codes don't worry about it; ImageMagick will create as many output images as you need.


    montage -label '%f' *.png -geometry '1x1<' -tile 2x3 Passwords_kdb_64.png


QR code Backup
Resulting QR codes storing a password database

Recover the Original Data


Scan each of the QR codes in order using zbarcam and redirect the output to a file.  Each code is on a new line with a header identifying the type of code scanned.  The new lines and headers need to be removed.  This was done manually.


    zbarcam > keyfile.64
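
If you'd rather not clean the file up by hand, a few lines of Python will do it.  This is just a sketch, assuming the scan was redirected to scanned.txt instead, and that each decoded line starts with a QR-Code: header (check what your version of zbar actually prints):

    # strip_headers.py - remove the zbarcam headers and newlines
    with open('scanned.txt') as scanned, open('keyfile.64', 'w') as out:
        for line in scanned:
            line = line.strip()
            if line.startswith('QR-Code:'):   # drop the symbology header
                line = line[len('QR-Code:'):]
            out.write(line)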


The last step is to convert the base 64 encoded file back to the original binary file.


    base64 -d keyfile.64 > keyfile.kdb


There you have it, file to QR code and back again.  What I like about this method is that even if all the software used to create the final output image disappears, the encoded data can still be recovered as long as you can decode a QR code and convert a file from base64 back to binary.  Both of these processes are widely known.

You can find all the associated files below.
https://gist.github.com/GrantTrebbin/0c6aadc7ecebe3107d08
https://drive.google.com/folderview?id=0B5Hb04O3hlQSfmwzVFdCTS1YZm8xSVVLZm95by0zLVpaTHR2WE1XcTVicWE5NUFJZjg4cGs&usp=sharing


Thursday, May 21, 2015

Losslessly Compressing Similar Images

In a recent post, I created an animation from a series of PNG images that were programmatically generated.  Usually after a project like this I back up all the related data for archiving, but this time I wondered if I could somehow reduce the size of the images by taking advantage of temporal redundancy.  What do I mean by that?  When you look at the frames they're basically all the same and only contain slight changes between adjacent images.  Is it possible to take two images, record the difference in a way that can losslessly recreate the original image, and save space?

The method I came up with is a crude, simplistic implementation of how video encoding works.  Take two frames, subtract one from the other and encode the difference.  This works because the difference image usually contains very little information.  Real video encoding is much more complicated: other frames are used to predict a particular frame, taking into account how objects in the video are moving. The prediction is subtracted from the actual frame to produce a residual image.  This is what's encoded, usually in a lossy manner.

I wanted to recreate this, but as this was a quick project, I used common file formats and software that already existed.  I didn't want to write a compression algorithm (yuck), I just needed a result.  With that in mind I decided to exclusively use the PNG file format, as it's great for compressing images with large areas of the same colour.  (I know I could use something like WEBM, MNG, or APNG, but they're poorly supported.)

The plan is quite simple: take two images of the same size, subtract them from each other channel by channel (including transparency) and save the differences as greyscale PNG files.  The one complication is that subtracting an 8 bit number from another 8 bit number gives you a 9 bit solution space, meaning you can't just encode the result in one image.  To get around this, each difference is encoded as two images: the first contains the absolute value of the difference, and the second contains the sign of the difference (negative is encoded as white).  As the image containing the sign only has two colours, one for positive/zero and one for negative, it can be efficiently encoded.
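
The core of the idea only takes a few lines of Python with Pillow and NumPy.  This is just a sketch of the scheme, not the script linked at the end of the post (which also has to deal with some PNG oddities):

    import numpy as np
    from PIL import Image

    def encode_difference(base_path, target_path, prefix):
        # Subtract the images channel by channel and store each channel as a
        # pair of greyscale PNGs: the absolute difference and the sign.
        base = np.array(Image.open(base_path).convert('RGBA'), dtype=np.int16)
        target = np.array(Image.open(target_path).convert('RGBA'), dtype=np.int16)
        for c, name in enumerate('RGBA'):
            diff = target[:, :, c] - base[:, :, c]
            Image.fromarray(np.abs(diff).astype(np.uint8)).save(
                '%s_%s_diff.png' % (prefix, name))
            Image.fromarray(((diff < 0) * 255).astype(np.uint8)).save(
                '%s_%s_sign.png' % (prefix, name))

    def decode_difference(base_path, prefix):
        # Rebuild the second image from the base image and the stored pairs.
        base = np.array(Image.open(base_path).convert('RGBA'), dtype=np.int16)
        channels = []
        for c, name in enumerate('RGBA'):
            diff = np.array(Image.open('%s_%s_diff.png' % (prefix, name)),
                            dtype=np.int16)
            sign = np.array(Image.open('%s_%s_sign.png' % (prefix, name)))
            diff[sign > 0] *= -1
            channels.append(Image.fromarray((base[:, :, c] + diff).astype(np.uint8)))
        return Image.merge('RGBA', channels)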

Writing a compression and decompression script in Python was easy enough.  There are some peculiarities and edge cases in the PNG spec that will cause it to fail.  Everything I tested works, but I feel like there may be problems with some palette based images.

As a demonstration I processed two images that aren't similar in any way to illustrate how the system works.

PNG Images
Difference Compression Example
The two source images above are split into their RGBA channels and subtracted from each other.  You may notice that the RGB difference images have a background that isn't black (which would imply the backgrounds are equal) but dark grey, even though both images have transparent sections as their background.  This is because the background colours of the two images are in fact different, but with the transparency set to full they look the same.  The difference image for the alpha channel also looks different; it's just black and white.  This is because the images don't use graduated transparency on the alpha channel.  It's either on or off, so the difference will be either full or zero.

The 8 images on the right, along with the first source image, were able to reproduce the second source image.  When compared they are found to be pixel for pixel replicas, apart from the fully transparent pixels; the encoder I used discards the colour information of pixels that are transparent.  I'm not going to go into the file sizes and results, as this was just to illustrate what happens during the process.  You'd never compress these images this way.

An actual example

Below is a frame from my animation that is 1080 x 1920 pixels.  I'm going to use it as a base to compress the next frame in the animation.  I'm not going to show that frame; apart from the timestamp, the two look exactly the same to the naked eye.

Test Image
Test Frame

So, what are the results?  The image below contains the file sizes.  The original file 1.png was 347 kB.  To be fair, I ran it through PNGOUT to see how small it could be made as a standalone image.  This resulted in a 248 kB file.  In comparison to the 8 difference and sign images, there's no contest.  In total they take up around 30 kB of space, around an eighth the size of the standalone image.  But wait, it gets better.

List of files
File sizes


If you run the sign and difference images through PNGOUT you get a massive size reduction.

List of files
File sizes
That's right, they now come to a total of 13.3 kB, around 5.3 percent of the original file size.  The original file 1.png is able to be encoded with only 13.3 kB when using 0.png as a base image.  As a side note, you probably want to throw them into a tar or zip file so the actual size on disk is reduced.
File sizes
File sizes
Although this is an interesting compression method, I wouldn't actually use it in its current form; it's a bit fragile.  I'd love to know if anyone has a similar method they use for losslessly archiving similar images. Python scripts and test images can be found on Github and Google Drive.

https://gist.github.com/GrantTrebbin/4aba7898840d96fc6b86
https://drive.google.com/file/d/0B5Hb04O3hlQSYUVzekpieEswc1U/view?usp=sharing