Saturday, May 20, 2017

Linux Text To Speech with Saved Audio

In my last blog post I described a procedure to find a forgotten PIN for 10 digit mechanical lock boxes, where you enter a specific sequence of button presses to efficiently test all the combinations. The generated sequence was supplied in the form of a text file, and although this works, it's a little cumbersome moving your eyes between the buttons and the paper all the time. It occurred to me that this would be a lot easier if the numbers were read to me. I then imagined how easy it would be if I had a pair of headphones and the instructions in an audio file on my phone. This seemed like the perfect application for a text to speech application and Linux.

After a little bit of research I decided to use the eSpeak speech synthesizer. It has many voices for different languages and countries and allows quite a bit of customisation of the way the text is read.

The command below, which converts the text in "lockbox.txt" to audio in "lockbox.wav", uses the English voice (-ven), indicates capital letters in a certain way (-k20), leaves a certain gap between words (-g4), reads back at a certain number of words per minute (-s90), and uses a certain pitch (-p20). It's that easy!

espeak -ven -k20 -g4 -s90 -p20 -f lockbox.txt -w lockbox.wav

Before processing the file I made some slight alterations to it by replacing some of the commands in the lock box opening sequence. Originally the commands were zero through nine, open, and clear. I replaced open with test as it's only one syllable and easier to hear. It's also important to leave spaces between numbers, otherwise it will read 11 as "eleven" instead of "one one".
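
If you'd rather not do those substitutions by hand, a few lines of Python will do it. This is just a sketch of the idea; the file names and the exact words in the raw sequence file are assumptions based on the description above.

import re

# Sketch: prepare a lock box sequence file for eSpeak.
# Assumes the raw file uses the words "open" and "clear" plus runs of digits.
with open("lockbox_raw.txt") as f:
    text = f.read()

text = text.replace("open", "test")  # one syllable, easier to hear
# Space out the digits so "11" is read as "one one", not "eleven".
text = "".join(c + " " if c.isdigit() else c for c in text)

with open("lockbox.txt", "w") as f:
    f.write(text)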

Here is the instructional WAV file converted to an MP3. It goes for 30 minutes or so, and with a little bit of practice you should be able to follow along at that speed. If you can't, that's fine, just slow the speed down in your music player. If you screw up, just go back 15 or 20 seconds to catch up.

For the YouTube fans out there, here is another version. It might possibly be the most boring and monotonous video on YouTube. That's my speciality though :-)

To be serious though I'd like to try eSpeak on a Raspberry Pi.  I think it'd be great to read out status updates and events.  Compared to some of the other synthesized voices I've heard it's actually pretty good.

Friday, May 19, 2017

Testing All The PINs On A Lock Box With A Forgotten Code

A long time ago I did a blog post on PIN coded lock boxes that you attach to your house to hold a spare set of keys. Their main function is to give easy access to keys for emergency services, for example if an elderly relative has a fall and can't get up. The point of my post was to demonstrate that although they have 10 buttons, look secure, and let you choose how many digits are in the PIN, they only really have 1024 PIN combinations. This is because each number can only be used once in a PIN and the order of the presses doesn't matter, so a PIN is just a subset of the 10 digits, and there are 2^10 = 1024 of those.

PIN coded key lock box
The math is explained in my initial blog post but the results are below. I opined it would take about 4 hours to try all the combinations, and in this case it would require pressing the open button 1024 times, the clear button 1024 times, and pressing digits (0*1 + 1*10 + 2*45 + 3*120 + 4*210 + 5*252 + 6*210 + 7*120 + 8*45 + 9*10 + 10*1) = 5120 times.  In total there are 7168 button presses. That works out at about one button press every 2 seconds.
Number of PINs vs PIN length
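If you want to check the arithmetic yourself, here's a quick snippet. It's not from the original post, just a sanity check of the figures above.

from math import comb

# Every PIN is a subset of the 10 digits, so there are 2**10 = 1024 of them.
pins   = sum(comb(10, k) for k in range(11))       # 1024 PINs
digits = sum(k * comb(10, k) for k in range(11))   # 5120 digit presses
total  = pins + pins + digits                      # open + clear + digits = 7168
print(pins, digits, total, 4 * 3600 / total)       # ~2 seconds per press over 4 hours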
At some point in the 5 years since I first wrote about these locks (time flies, doesn't it) it occurred to me that I was being inefficient. Testing whether the box opens doesn't clear the current code, so I can test multiple PINs at once by chaining them together. It's similar to a De Bruijn sequence, a mathematical tool that can be used to brute force another type of PIN entry; the difference here is that the lock has to be cleared every so often. To illustrate the process, imagine that the clear button has been pressed and I then enter the numbers 0 through 9, testing whether the lock opens between digits. I've actually just tested the codes (Null) (0) (01) (012) (0123) (01234) (012345) (0123456) (01234567) (012345678) (0123456789). That's 11 codes with one press of the clear button, 11 presses of the open button, and just 10 digit presses. The whole process won't be this efficient, but it'll be better than nothing.

But where do we start? In a perfect world, as you can see from the chart below and the table above, the best we can do is 1 sequence testing PINs of lengths 0-10, followed by 9 sequences testing lengths 1-9, 35 sequences testing lengths 2-8, 75 sequences testing lengths 3-7, 90 sequences testing lengths 4-6, and 42 sequences testing only length 5.

This still results in 1024 presses of the test button, only 252 presses of the clear button, and (1*10)+(9*9)+(35*8)+(75*7)+(90*6)+(42*5) = 1646 presses of the number buttons, for a new total of 2922 button presses, or about 41% of the original estimate.
Distribution of PIN lengths
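Again, a quick sanity check of those numbers (my own snippet, using the sequence counts from the perfect-world breakdown above):

# Ideal sequences: (shortest PIN length, longest PIN length): number of sequences
seqs = {(0, 10): 1, (1, 9): 9, (2, 8): 35, (3, 7): 75, (4, 6): 90, (5, 5): 42}

tests  = sum(n * (hi - lo + 1) for (lo, hi), n in seqs.items())  # one test per PIN covered
clears = sum(seqs.values())                                      # one clear per sequence
digits = sum(n * hi for (lo, hi), n in seqs.items())             # each sequence ends at its longest PIN
print(tests, clears, digits, tests + clears + digits)            # 1024 252 1646 2922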
There's no guarantee that these sequences exist though and I had no deep understanding of how to create them, so I tried the first thing that came to mind and got lucky.

First, generate all possible combinations for each PIN length and sort the numbers within each PIN. Then sort the PINs of each length, comparing them element by element. Start in the middle with the PINs of length 5, as there are more of these than any other length. Then take the PINs of length 4, go through them one by one from the start, and place each one to the left of the first 5 digit PIN that could follow it. For instance (1, 5, 6, 7) might go to the left of (1, 2, 5, 6, 7), as only a 2 would have to be pressed to get to the 5 digit PIN. Do the same for the 6 digit PINs and place them to the right of the 5 digit PINs. In this case (1, 2, 3, 5, 6, 7) might go to the right of (1, 2, 5, 6, 7), as only a 3 needs to be pressed. Repeat this left/right procedure until all PINs are processed. It will look like a mess, but if you sort the sequences by length you get a spreadsheet like the one below (zoomed and rotated to fit). Look familiar? It reflects the distribution graph above, and it shows that the sequences can be generated.

Rotated spreadsheet of the sequences
I've placed my code in a Github Gist for you to generate your own sequences in case your lock has a different number of buttons. If you have a lock with 10 buttons I've already generated the file for you. It looks a little different from the output of the Python file because I've done some find and replace operations in Notepad++. I've used the word test instead of unlock as well.
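
If you just want the general idea without reading the Gist, here's a minimal sketch of the placement procedure described above. It's not the code from the Gist, and it doesn't guarantee the perfect-world distribution; PINs that can't be attached to an existing sequence simply start a new one.

from itertools import combinations

def build_chains(n_buttons=10):
    digits, mid = range(n_buttons), n_buttons // 2
    # Seed one chain per middle-length PIN (PINs are stored as sorted tuples).
    chains = [[pin] for pin in combinations(digits, mid)]
    for step in range(1, mid + 1):
        # Shorter PINs go to the left of the first chain whose head extends them by one digit.
        for pin in combinations(digits, mid - step):
            for chain in chains:
                if len(chain[0]) == len(pin) + 1 and set(pin) <= set(chain[0]):
                    chain.insert(0, pin)
                    break
            else:
                chains.append([pin])
        # Longer PINs go to the right of the first chain whose tail they extend by one digit.
        for pin in combinations(digits, mid + step):
            for chain in chains:
                if len(chain[-1]) == len(pin) - 1 and set(chain[-1]) <= set(pin):
                    chain.append(pin)
                    break
            else:
                chains.append([pin])
    return chains

def chain_to_presses(chain):
    """Clear, then for each PIN in the chain press only the new digits and test."""
    presses, entered = ["clear"], set()
    for pin in chain:
        presses += [str(d) for d in sorted(set(pin) - entered)] + ["test"]
        entered = set(pin)
    return presses

chains = build_chains()
print(len(chains), sum(len(c) for c in chains))  # number of sequences, total PINs covered (1024)

Since the order of presses doesn't matter on these locks, the digits within each step can be pressed in any order; the sketch just uses ascending order for readability.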

Update: There is an MP3 file of the instructions in this blog post: Linux Text To Speech With Saved Audio.

So what about those locks you see in banks and other offices that have 14 buttons? I've done the maths so you don't have to: they can be exhausted with 50316 button presses. So 4 extra buttons buys you an increase in security by a factor of about 17. Not that it matters much; whenever I've seen these used, people aren't too discreet about entering the PIN.
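
Here is the same perfect-world calculation generalised to an arbitrary (even) number of buttons. It's my own snippet rather than the original working, but it reproduces both the 2922 and 50316 figures.

from math import comb

def chained_presses(n):
    """Presses to test every PIN on an n-button lock (n even), assuming the
    ideal chained sequences exist."""
    mid = n // 2
    tests = 2 ** n                                 # one test per possible PIN
    # Sequences covering lengths (mid - s) .. (mid + s), working out from the middle.
    counts = [comb(n, mid - s) - (comb(n, mid - s - 1) if s < mid else 0)
              for s in range(mid + 1)]
    clears = sum(counts)                           # one clear per sequence
    digits = sum(c * (mid + s) for s, c in enumerate(counts))
    return tests + clears + digits

print(chained_presses(10), chained_presses(14))    # 2922 50316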

Door Lock
PIN Door Lock

Friday, May 5, 2017

Generating Seed Points For Voronoi Stippling

I've been working on a way to generate an initial distribution of seed points for the Voronoi stippling program that I played with a while ago.  The problem is that for the final image to look good you really need a lot of points, and when you have a lot of points they are very slow to converge to a solution.  The process I used in the past was to generate a few points, let them be processed, split each one in two, process them again, and so on.  Each time the number of points is doubled and a solution is eventually found, but it's still not great.  An improvement I thought I could try is to linearize the image via a Pseudo-Hilbert (PH) curve, apply a rolling threshold to the data, and place points at the locations where the counter rolls over.  You may see a major optimisation that I've known about all along but haven't implemented.  I'll give you a clue: is a PH curve really necessary?  Find out at the end of the post.😉

The implementation of the threshold is a little abstract so I'll just explain the general concept. First of all, we always work on an inverted grey scale version of the image, as we are distributing black dots and the background is white. Imagine that the sum of all the pixels in the image is equal to 1,000,000 and you want 2000 points.  This means that each point represents 500.  So as you travel along the PH curve you add up the pixel values, and every time you accumulate another 500 you place a point.
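
As a rough sketch of that idea (mine, not the project's code), assuming something that supplies the curve's pixel coordinates in order:

import numpy as np

def seed_points(weights, curve_coords, n_points):
    """Walk the inverted greyscale image along a space-filling curve and drop a
    seed point every time the running total passes another multiple of
    total / n_points.  curve_coords is the curve as an ordered list of (x, y)."""
    threshold = weights.sum() / n_points
    points, acc = [], 0.0
    for x, y in curve_coords:
        acc += weights[y, x]
        while acc >= threshold:      # very dark pixels may contribute more than one point
            acc -= threshold
            points.append((x, y))
    return np.array(points)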

The points generated from this process are then iterated on via the Voronoi stippling process to produce a result. Another small feature I've added is a dithering step. For a certain number of iterations a random displacement is added to each point.  This can be seen in the GIF below as a jiggling of the points. What is the purpose of this? Imagine a jar of jelly beans that won't quite close. What do you do?  You pick the jar up and shake it to settle the contents.  That's what we're doing here.  What you end up with is a set of points that, when randomly perturbed, return to their original positions, i.e. a stable arrangement.
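
The dithering step itself is tiny; something along these lines (a sketch, with the amplitude left as a tuning knob):

import numpy as np

rng = np.random.default_rng()

def jiggle(points, amplitude):
    """Add a small random displacement to every point.  Applied for the first
    few stippling iterations only, so the arrangement can shake itself into a
    stable configuration before settling."""
    return points + rng.uniform(-amplitude, amplitude, size=points.shape)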

The results in the GIF aren't that great.  The main reason for this is that I've kept the features large so you can see what's happening.  If you really wanted to process this image, the best way to do it is to increase the size of the image first.  When the calculated Voronoi regions are small and only contain a few pixels, quantization means the centroid calculations always yield the same result, so the points stop moving.  If you notice herding of points in the final image, try enlarging the image and reprocessing it.
Distribution of Seed Stippling Points on a Pseudo-Hilbert Curve
To test the algorithm I've used an image of a plant from the paper I'm working from: Weighted Voronoi Stippling, by Adrian Secord.  If this link goes dead, contact me; I have a copy of the paper.
Test Image Used in Adrian Secord's Paper
In the image below I've started modestly with 4000 points to test that everything is working properly, and as you can see, everything is working.  On closer inspection though, you can see that the dynamic range, for want of a better term, of the image has been reduced.  In other words, the areas that are meant to be dark aren't as dark as they're meant to be, and in comparison, the areas that are meant to be light aren't as light.
4000 points - power 1
The dynamic range problem can be solved by pre-compensating the image, specifically by raising each pixel to a power or exponent.  In the image below the value of each pixel has been squared before processing.  I've referred to this as a power parameter.  I've re-scaled the resulting data back to a range of 0-255 just for consistency, but it's not really required.
4000 points - power 2
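The pre-compensation step itself is just a per-pixel power followed by a rescale; here's a minimal sketch (mine, not the project's code), assuming the inverted greyscale image is held as an 8-bit array:

import numpy as np

def precompensate(inverted_grey, power):
    """Raise each pixel of the inverted greyscale image (0-255) to a power,
    then rescale back to 0-255.  power=1 leaves the image unchanged."""
    w = (inverted_grey.astype(float) / 255.0) ** power
    return np.uint8(np.round(255.0 * w / w.max()))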
In the next image the number of points has been doubled and the power parameter has been pulled back slightly to 1.9 and things are starting to look even better.
8000 points - power 1.9
Let's step things up once again and go for 20,000 points to match the example in the paper. I've also pulled back the power parameter to 1.6. Why? It's subjective. I think it looks better.
20000 points - power 1.6

I think that looks pretty spectacular.  Of course I've had to resize the image and scale it down a bit so downloading this page isn't a 100 MB exercise, so there's a little bit of quality degradation, but the original vector files are even better.

The point of this post however is to discuss the PH curve approximation for the seed points and how close it gets us to the final image.  What does the initial distribution of seed points look like before any processing?
20000 points - power 1.6 - no processing
The structure is immediately visible, but there is a fuzziness to the image. This is more apparent when looking at an enlarged section. Because the points are placed on the pixel grid, you see lines and square corners in the arrangement.

Close Up Section of Seed Point Distribution
So what happens after processing?  The points are allowed to find their own space and curved features start to appear.  I can't explain the exact reasoning, but in sections the points form curved lines running in multiple directions.  If you can't see what I'm talking about, go and have a look at something called a Fermat spiral (also a way to distribute points to evenly cover an area) and you'll see what I mean.  All in all, it just looks more pleasing to the eye.
Close Up Section of Points After Processing
I'm sorry but I'm not releasing the code on this one just yet. Why? Because it's terrible. There are very few comments and a lot of sections where code has been commented out. I know the theory is to develop in public and let the code be scrutinised, but I try to make a program so that it's not only functional but it also explains the concept I'm trying to describe to the user.  At this point I think it would just confuse people.

It's also hard to develop things cleanly when you don't know exactly where they are going to go. Unlike a lot of programming, where you have a clearly defined goal and a general way to get there, I usually start out knowing neither.  Think of this type of coding as a science experiment.  I'll try different things and see what the results are, change things until I get what I want, and then go back, clean up a section, and lock it down.

I will release this soon though.  There are a couple of things to fix but it's mostly there.  If for some reason I don't post the code, let me know and I'll help you out.

Here is the optimization I didn't implement.  You don't need a PH curve.  Just add a white border to the source image to make it a square whose side length is a power of 2, then use an actual Hilbert curve and crop the result.  They are far more efficient to generate than a PH curve.  I did it this way because I was interested in the problem of covering arbitrary rectangular regions with a curve.  As this blog is a labour of love, I reserve the right to go off on the occasional tangent to satisfy my curiosity.
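
For completeness, here's a rough sketch of how that shortcut could look (not something I've implemented in the project): generate a standard Hilbert curve over the padded power-of-two square using the classic bit-twiddling construction, and keep only the coordinates that fall inside the original image.

def d2xy(order, d):
    """Coordinates of step d along a Hilbert curve filling a 2**order square
    (the classic iterative construction)."""
    x = y = 0
    s = 1
    while s < (1 << order):
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:                      # rotate/flip the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        d //= 4
        s *= 2
    return x, y

def cropped_hilbert(width, height):
    """Hilbert curve over the smallest power-of-two square covering the image,
    cropped back to width x height."""
    order = (max(width, height) - 1).bit_length()
    side = 1 << order
    coords = []
    for d in range(side * side):
        x, y = d2xy(order, d)
        if x < width and y < height:
            coords.append((x, y))
    return coords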

Friday, April 21, 2017

Efficient Centroid Calculation for Discrete Areas

A project I'm working on requires the repeated calculation of weighted centroids for arbitrary regions in an image.  Calculation of centroids is fairly straightforward but computationally expensive.  To calculate the x and y coordinates you first need to calculate three other values: the denominator, the x-numerator, and the y-numerator.  The denominator is simply the sum of all the values in the region, the x-numerator is the sum of all the values in the region multiplied by their x coordinates, and the y-numerator is the sum of all the values in the region multiplied by their y coordinates.  From these, the x coordinate of the centroid is the x-numerator divided by the denominator, and likewise for the y coordinate.
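
In code, the direct approach looks something like this (a sketch, with the region given as a boolean mask over the image):

import numpy as np

def centroid_naive(values, mask):
    """Weighted centroid of the masked region: every pixel in the region is
    visited and multiplied by its coordinates."""
    ys, xs = np.nonzero(mask)
    w = values[ys, xs].astype(float)
    denom = w.sum()
    return (xs * w).sum() / denom, (ys * w).sum() / denom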

You can see from this description that there are a lot of time consuming multiplication operations to perform for each area whose centroid you want.  As it turns out though, you can perform two cumulative summations along the rows of the input data as a first step, store the results, and then perform the centroid calculation by only accessing values at the boundary of the region.  For reasons I'll explain later, a column of zeros with x coordinate -1 is added to the start of the data.  A cumulative summation is performed along each row to obtain $$$P$$$, and a cumulative summation along each row of $$$P$$$ generates $$$Q$$$.  The following equations describe the process.  Before we get started, the derivation of these equations can be found in this small article I wrote.
Centroid Calculations
It may not be immediately obvious what these mean so I'll give a quick explanation.  Each of the above calculations is done row by row and the results added together; this is what the summation symbol over $$$y$$$ represents.  For the denominator, the value to be summed is the difference between two values on a row of $$$P$$$.  For the y-numerator the calculation is the same as the denominator, but each row's contribution is multiplied by its y coordinate.  The calculation of the x-numerator is a little different.  It's similar in that values of $$$P$$$ at the boundary are subtracted and multiplied by $$$x+1$$$, but with the additional step of subtracting the difference of the values of $$$Q$$$ at the boundaries.  Maybe an example will help.

Example Calculation

In the first data set, on the top left, an area is marked green.  The centroid of this region is to be found.  A column of zeros is added to the start of the data, as seen in the data set on the top right.  These are added because, when performing the operation in software, you would otherwise end up accessing arrays outside their bounds.

Typically when calculating centroids, each green cell would have to be accessed for each calculation.  With the new method described above, only the cells in blue are accessed.  You may also notice that the width of the region doesn't change the number of operations; the complexity of the calculation depends only on the height of the region.  I've included this spreadsheet for you to play around with and get a better understanding of the process.

One last thing to note is that adding the column of zeros while maintaining the original $$$x$$$ indices is silly, as it creates a column with an index of negative one.  This is where the adjusted x index comes in.  Using this and the following adjusted equations allows the calculation to be performed easily on a computer.

As an example, let's calculate the centroid of the green section on row 5 only.  Its $$$x$$$ bounds are $$$x_1=4$$$ and $$$x_2=7$$$, but in the above equations $$$P$$$ and $$$Q$$$ are accessed via the adjusted $$$x$$$ index at 4 and 8 (8 because of the +1 on the upper index).  This means the denominator is equal to $$$(356-170) = 186$$$, the y-numerator is equal to $$$5(356-170)=930$$$, and the x-numerator is equal to $$$(8 \times 356-4 \times 170)-(1560-420)=1028$$$.  This leads to centroid coordinates of (5.527, 5), which is what you would expect: as it's a single row, the y value is obviously 5, and the x value is a little to the right of centre due to the increasing trend in the numbers.  The centroid coordinates are given in the original x index to allow accessing the original data.
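
Putting the whole thing together in code, here's a sketch of the method (my own transcription of the adjusted equations, with each row of the region given as an inclusive x interval in the original indexing):

import numpy as np

def precompute(values):
    """Prepend the column of zeros and build the two row-wise cumulative sums."""
    padded = np.hstack([np.zeros((values.shape[0], 1)), values])
    P = np.cumsum(padded, axis=1)   # P[y, j] = sum of row y over original x < j
    Q = np.cumsum(P, axis=1)        # second cumulative sum of the same rows
    return P, Q

def centroid(P, Q, rows):
    """Weighted centroid of a region described as {y: (x1, x2)}.  Only the
    boundary values of P and Q are touched, so the cost depends on the
    region's height, not its area."""
    denom = num_x = num_y = 0.0
    for y, (x1, x2) in rows.items():
        a, b = x1, x2 + 1                                  # adjusted x indices
        d = P[y, b] - P[y, a]
        denom += d
        num_y += y * d
        num_x += (b * P[y, b] - a * P[y, a]) - (Q[y, b] - Q[y, a])
    return num_x / denom, num_y / denom

For the single-row case above, the loop body reduces to exactly the three boundary expressions in the worked example.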

Monday, April 10, 2017

Pseudo Hilbert Curve for Arbitrary Rectangular Regions - Part 2

I fixed the problems mentioned in my last blog post about Pseudo-Hilbert curves.  When both sides of a block have lengths divisible by 4, the block is divided into 2x2 sub-blocks.  Each of these sub-blocks is scanned in a manner that maintains the flow of the parent block.  I won't show all the examples from the paper I'm working from, but the Python module I wrote does generate all the example curves.  I can see room for improvement in places, but I'm happy with its operation and how fast it works.  After all, perfect is the enemy of good.

The module allows you to specify a rectangular region.  It will then create an in-order list of the (x, y) coordinates of the curve, and a "2D" list of the length along the curve at each location.  I have some plans for this that I hope work out.

Get the Code!

Thursday, March 30, 2017

Pseudo Hilbert Curve for Arbitrary Rectangular Regions - Part 1

It's been a bit longer than usual so I thought that an update was in order.  I've been working on implementing an algorithm that generates Hilbert curves for arbitrary rectangular regions and I've been having issues.  The paper that describes the algorithm is 

Zhang, Kamata, Ueshige "A Pseudo-Hilbert Scan for Arbitrarily-Sized Arrays"

The problems I've been having aren't so much related to my understanding of the problem; it's coming up with an understandable, fast piece of code to generate the curve (while finding time for sleep and work). I'm using Python, so I could increase the speed by moving some parts to C, but I've worked on my implementation until it generates a curve for a 3000 by 2000 area in about 20 seconds. I think that's a reasonable speed. The plan was to hold off on this post until I'd solved it; it forces me to work and sets a deadline to meet.  The code reached a point last night that I was happy with and I was ready to post, and then I compared my results with the curves in the paper.  Guess what, they don't match.  I know why, and I'll get to that later, but first an explanation of the algorithm.

The first step is to recursively divide the region vertically and horizontally.  In the small example below the region is divided into 4 blocks vertically and 4 blocks horizontally.  There will always be an equal number of blocks vertically and horizontally, and that number will be a power of two.  These blocks determine the general direction of the curve, as they can be easily traversed by a Hilbert curve.  The division algorithm also ensures that the dimensions of each block in all but the bottom and left rows will be a power of two.

Pseudo Hilbert Curve
Partition a 17 x 15 Region into Blocks
The next step is to determine a scanning method for each block.  The way the sides are divided makes this step easier.  Depending on the oddness or evenness of each of the dimensions, some block scanning methods are determined by a simple lookup process.  Others are a bit more complicated.

Hilbert scanning methods
Scanning methods

Once you know the size of the blocks, their location, and their scanning method, the curve can be easily generated.

Here's where things went wrong.  My curves match the curves in the paper for some sections but not others.  Below are a couple of examples of my curve vs the one from the paper, with an overlay to show where they match and where they differ.

Pseudo Hilbert Curve
My version of a 123 x 97 curve
Pseudo Hilbert Curve
The 123 x 97 curve in the paper
Pseudo Hilbert Curve
Overlay of the two curves to see where they differ
Pseudo Hilbert Curve
My version of the 125 x 88 curve
Pseudo Hilbert Curve
The version of the 125 x 88 curve from the paper
Pseudo Hilbert Curve
Overlay of the two curves to see where they differ

I was scratching my head trying to figure out why, and after a bit of research I discovered that the author later made some optimisations to the scan methods for blocks larger than or equal to 4x4. The images in the paper were generated with that method.  Basically, larger blocks are scanned with a Hilbert-like method and not the bidirectional raster scans shown above.  So I'm not losing my mind. Well, not for that reason anyway.  I need to think about how to alter my code to make this work. Hmmm.

My other blog posts on the Hilbert curve can be found here.
Calculating Hilbert Curve Coordinates