Friday, April 21, 2017

Efficient Centroid Calculation for Discrete Areas

A project I'm working on requires the repeated calculation of weighted centroids for arbitrary regions in an image.  Calculation of centroids is fairly straight forward but computationally expensive.  To calculate the x and y coordinates you first need to calculate three other values, the denominator, x-numerator and y-numerator.  The denominator is simply equal to the sum of all the values in the region, the x-numerator is the sum of all the values in the region multiplied by their x coordinate, and the y-numerator is the sum of all the values in the region multiplied by their y coordinate.  From these values the x coordinate of the centroid is equal to the the x-numerator divided by the denominator likewise for the y coordinate of the centroid.  

You can see from this description that there are a lot of time consuming multiplication operations to perform for each area that you want to calculate the centroid for.  As it turns out though, you can perform two cumulative summations along the rows of the input data as a first step, store this and then perform the centroid calculation by only accessing values at the boundary of the region. For reasons I'll explain later a column of zeros with x coordinate -1 is added to the start of the data.  A cumulative summation is performed along each row to obtain $$$P$$$. A cumulative summation along each row of $$$P$$$ is performed to generate $$$Q$$$ The following equations describe the process.  Before we get started, the derivation of these  equations can be found in this small article I wrote.
Centroid Calculations
It may not be immediately obvious what these mean so I'll give a quick explanation.  First of all each of the above calculations are done row by row and added together. This is what the summation symbol over $$$y$$$ represents.  For the denominator the value to be summed is the difference between two values on a row of $$$P$$$. For the y-numerator the calculation is the same as the denominator but each row is multiplied by the y coordinate.  The calculation of the x-numerator is a little different. It's similar in that values of $$$P$$$ at the boundary are subtracted and multiplied by $$$x+1$$$ but the additional step of subtracting the difference of the values of $$$Q$$$ at the boundaries is now added.  Maybe an example will help.

Example Calculation

In the first data set on the top left an area is marked green.  The centroid of this region is to be found. A column of zeros is added to the start of the data.  This is seen in the data set on the top right.  These are added because when performing the operation in software you end up accessing array outside of their bounds.

Typically when calculating centroids each green cell would have to be accessed for each calculation. However when using the new method described above only the cells in blue are accessed.  You may also notice that the width of these doesn't change the number of operations.  The complexity of the calculation is only dependant on the height of the region.  I've included this spreadsheet for you to play around with and get a better understanding of the process.

One last this to note is that the addition of the column of zeros while maintaining the original $$$x$$$ indices is silly as it creates a column with an index of negative one.  This is where the adjusted x index comes in.  Using this and the following adjusted equations allow the calculation to be performed easily on computers.

Let's for example calculate the centroid of the green section on only row 5.  Its $$$x$$$ bounds are $$$x_1=4$$$ and $$$x_2=7$$$, but in the above equations $$$P$$$ and $$$Q$$$ are accessed via the adjusted $$$x$$$ index at 4 and 8 (8 because of +1 on the upper index).  This means the denominator is equal to $$$(356-170) = 186$$$, the y-numerator is equal to $$$5(356-170)=930$$$, and the x-numerator is equal to $$$(8 \times 356-4 \times 170)-(1560-420)=1028$$$.  This leads to centroid coordinates of (5.526, 5). This is what you would expect as it's a single row the y value is obviously 5 and the x value is a little to the right of centre due to the increasing trend in the numbers. The centroid coordinates calculated are given in the original x index to allow accessing the original data.

Monday, April 10, 2017

Pseudo Hilbert Curve for Arbitrary Rectangular Regions - Part 2

I fixed the problems mentioned in my last blog post about Pseudo Hilbert curves.  When both side of a block are of length divisible by 4 the block is divided into 2x2 sub-blocks.  Each of these blocks are scanned in an appropriate manner to maintain the flow of the parent block.  I won't show all the examples from the paper I'm working from, but the python module I wrote does generate all the example curves.  I can see room for improvement in places but I'm happy with its operation and how fast it works.  After all, perfect is the enemy of good.

The module allows you to specify an rectangular region.  It will then create an in order list of the (x,y) coordinates of the curve and it will create a "2D" list of the length along the curve.  I have some plans for this that I hope workout.

Get the Code!

Thursday, March 30, 2017

Pseudo Hilbert Curve for Arbitrary Rectangular Regions - Part 1

It's been a bit longer than usual so I thought that an update was in order.  I've been working on implementing an algorithm that generates Hilbert curves for arbitrary rectangular regions and I've been having issues.  The paper that describes the algorithm is 

Zhang, Kamata, Ueshige "A Pseudo-Hilbert Scan for Arbitrarily-Sized Arrays"

The problems I've been having aren't so much related to my understanding of the problem, it's coming up with a understandable, fast piece of code to generate the curve (while finding time for sleep and work). I'm using python so I could increase the speed by going to C for some parts, but I've worked on my implementation until it generates a curve for a 3000 by 2000 area in about 20 seconds. I think that's a reasonable speed. The plan was to hold off on this post until I'd solved it.  It forces me to work and sets a deadline to meet.  The code reached a point last night that I was happy with and I was ready to post and then I compared my results with the curves in the paper.  Guess what, they don't match.  I know why and I'll get to that later but first an explanation of the algorithm.

The first step is to recursively divide the region vertically and horizontally.  In the small example below the region is divided into 4 blocks vertically and 4 blocks horizontally.  There will always be an equal number of blocks vertically and horizontally and it will be a  power of two.  These blocks determine the general direction of the curve as they can be easily traversed by a Hilbert curve.   The division algorithm also ensures that the dimensions of each block in all but the bottom and left rows will be a power of two.

Pseudo Hilbert Curve
Partition a 17 x 15 Region into Blocks
The next step is to determine a scanning method for each block.  The way the sides are divided makes this step easier.  Depending on the oddness or evenness of the each of the dimensions some block scanning methods are determined by a simple lookup process.  Others are a bit more complicated.

Hilbert scanning methods
Scanning methods

Once you know the size of the blocks, their location, and their scanning method, the curve can be easily generated.

Here's where things went wrong.  My curves match the curves in the paper for some sections but not others.  Bellow are a couple example of my curve vs the one from the paper and an overlay to show where they match and where they differ.

Pseudo Hilbert Curve
My version of a 123 x 97 curve
Pseudo Hilbert Curve
The 123 x 97 curve in the paper
Pseudo Hilbert Curve
Overlay of the two curves to see where they differ
Pseudo Hilbert Curve
My version of the 125 x 88 curve
Pseudo Hilbert Curve
The version of the 125 x 88 curve from the paper
Pseudo Hilbert Curve
Overlay of the two curves to see where they differ

I was scratching my head trying to figure out why, and after a bit of research I discovered that the author later made some optimisations to the scan methods for blocks larger than or equal to 4x4. The images in the paper were generated with that method.  Basically larger blocks are scanned with a Hilbert like method and not the bidirectional raster scans shown above.  So I'm not losing my mind. Well, not for that reason anyway.  I need to think about how to alter my code to make this work. Hmmm.

My other blog posts on the hilbert curve can be found here.
Calculating Hilbert Curve Coordinates

Thursday, March 16, 2017

Calculating Hilbert Curve Coordinates

You may have noticed from some of my previous blog posts that I've fallen down a rabbit hole learning about Gray codes and Hilbert curves.  They weren't just random topics.  I wrote those articles in order to understand the topic of this one, how to easily convert a Hilbert Index to Hilbert Coordinates via a method described in a paper by John Skilling.

Skilling, J. (2004), "Programming the Hilbert Curve", AIP Conference Proceedings.

Regrettably I don't yet understand why it works to a level that I'm satisfied with.  The paper describes the process and gives a summary of previous work by Butz and Lawder but doesn't really give a clear explanation of why the transform works.  It may be perfectly clear to someone who works in the field, but I don't get it... yet.

What I am capable of doing though is summarising the information here with some graphics to explain the concept to the next person that gets nerd sniped by this.  I'll start by recommending my last few blog posts as they will give you an idea as to my train of thought.

A basic concept to understand is how binary numbers are treated as coordinates.  In the animation below, the index is 6 bits long and counts up from 000000 to 111111.  This can be used to cover every coordinate on an 8 x 8 two dimensional grid if every second bit belongs to the coordinates of one dimension while the others belong to the other dimension.  In this case the red bits belong to the x axis while the blue bits belong to the y axis.  Everything I'll describe in this post is for two dimensions, but it works for any number of dimensions.  For example an 8x8x8 grid could be covered with a 9 bit index where bits 0, 3, 6 belong to one axis, 1, 4, 7 to another, and 2, 5 ,8 to another.

Binary Indexing Animation
Binary indexing in 2 dimensions
In the binary indexing animation you can see that there are diagonal lines.  These occur because more than one bit of the index can change at once affecting both dimensions.

The first step in the Skilling Transform is to take the binary reflected Gray code of the index.  This means that only one bit at a time can change.  Therefore the value of only one axis can change at a time leaving only vertical and horizontal movements.

Gray Code Animation
Gray code indexing in 2 dimensions

The addressing scheme detailed above is formally described in this section of the paper.  I put this here just so we're all on the same page about the variable names.

$$$p$$$ is the number of bits in each axis
$$$n$$$ is the number of dimensions

Addressing Scheme
Addressing description

Now onto the actual transform.  It's minimal and easy to compute.  It's not a recursive algorithm and takes a bit string that specifies the Hilbert index as an input and performs a series of swaps and inversions to return a bit string of the same length that describes the Hilbert coordinates.

Skilling transform

To see if I could understand the process better I wanted to visualise the transform.  The animation below show the transformation from Gray code indexing to the Hilbert curve.  You can see how the algorithm starts the transformation on the smaller features first and works up to larger features at each step.

Hilbert Curve Animation
Gray code to Hilbert curve via the Skilling transform in 2 dimensions.

Hopefully the worked example below will help people understand the process a little better.  The index is converted to its Gray code and the algorithm processes bits in a reverse order.  The grey boxes refer to the bit $$$r$$$ shown in the algorithm above.  The bits that are classified as low bits at each stage are underlined.
Hilbert Index to Coordinates
Worked example

I can't really offer much of a conclusion here.  The understanding I've gleaned so far is minimal and superficial at best.  I 'll leave this one for a while and come back to it in the future when I have more time.

I have a love hate relationship with problems like this.  I find them fascinating, but at the same time they remind me how much I miss formal study, and I get bummed out.  Kinda sucks having no one to bounce ideas off either.  Having said that though, I've finally generated some animations that I'm proud of.  You can find the undocumented scratch code here.

If you come across this page and have a better understanding of why this works I'd be glad to hear from you.

Saturday, March 4, 2017

Converting Binary to Gray Code with XOR

Over the last week I've been researching Gray Codes and how to generate them. I understand the basic concept but I wanted a deeper understanding.  Most of the explanations out there are a little dense so I'm going to try to make the available information a little bit more accessible to everyone. So let's start with a basic explanation of what a Gray code is.

A Gray Code is a special way of counting, where adjacent entries differ by only one digit. In practice this almost always refers to a binary counting sequence and more specifically a special code called the Binary Reflected Gray Code (BRGC). From here on, this is the Gray code that I'll be talking about.

These code have practical uses in many areas.  The one that I'm most familiar with is in rotary encoders where it's advantageous that only one digit changes at a time.  In a normal binary counting sequence you may have multiple bits change at once.  For example from $$$0111$$$ to $$$1000$$$. The problem with this is that if all the bits don't change at exactly the same moment, the value $$$1010$$$ may be read from the sensor.  If Gray codes are used, as you progress through the series only one bit changes at a time.  This leaves no ambiguity as to what the value is.

Let's start with a formal definition of a Gray code and try to explain things in easier to understand terms.

$$G_0=\epsilon \\ G_{k+1}=0G_k,1G_k^R$$

The series for zero bits is shown as $$$G_0=\epsilon$$$ and indicates an empty series.  The next equation, $$$G_{k+1}=0G_k,1G_k^R$$$ indicates that the series for a Gray code series of $$$k+1$$$ bits is created by taking the the Gray code of $$$k$$$ bits and joining a reversed copy to its end, with zeros added before entries in the first half of the series, and ones added before entries in the second half of the series.  This is how the series ensures that consecutive entries only differ by one digit.  This may be clearer in the diagram below.

Constructing $$$G_{k+1}$$$ from $$$G_{k}$$$

If $$$G_k$$$ is a Gray code where consecutive entries differ by one digit, then appending a mirror of itself to its end won't change that, except for where the two series join, as those two entries will be the same. Adding zero to the first half of the new series and one to the second half of the new series won't change it either but it does mean that the entries where the series join now differ by one digit. This means that the new series $$$G_{k+1}$$$ is now also a Gray code.  It also proves that the entries in the output are unique.  Starting with $$$G_0$$$, we know that all its entries are unique.  If we then assume that all the elements of $$$G_k$$$ are unique, then the elements of $$$G_{k+1}$$$ must also be unique as $$$G_k$$$ with zeros added before it and a reverse copy of $$$G_k$$$ with ones added before it create a new unique series. It's then provable by induction that the each entry in a Gray code is unique.

This gives us the instructions needed to construct a Gray code of any length.  Start with the minimal Gray Code that contains one empty element and keep doubling its length when adding a new digit.  

Gray Code
Constructing $$$G_3$$$ from $$$G_2$$$
Getting a picture in your mind of what a Gray code looks like will be helpful for the next step. For some situations the above representation can help, but in others, the vertical table representation with the index $$$n$$$ of each element is better.
Gray Code
Gray code indices in decimal

Now that I've explained the basics of what a Gray code is, I want to detail a proof of a well known formula to generate them with an xor operation.

$$H[n] = n \oplus (n>>1)$$

Where $$$H[n]$$$ is the $$$n$$$th element of the Gray code. $$$H$$$ is used at the moment as it's a hypothesis that this formula is equivalent to $$$G[n]$$$ described above.

I found a well thought out explanation of this proof in a Quora post but it could do with some further explanation.  So I'll repeat the proof here with some commentary to help those new to the subject. To be clear all of these operations are performed on the binary representation of the numbers involved.

First, let's define a helper function.

$$F[n,k]=(2^k-1) \oplus n$$

$$$F[n,k]$$$ is a function that inverts all the bits of a binary string $$$n$$$ of length $$$k$$$. $$$2^k$$$ is a binary string of length $$$k+1$$$ that is all zeros except for the most significant bit. Subtracting one from this will create a binary string of length $k$ that is all ones.  When the string $n$ is xored with this all of the bits are inverted. This is a function that essentially reverses the direction of  a binary count. For example, when $$$k=3$$$, and for a value of $$$n=001$$$, $$$F[001,3] = 110$$$.  Where $$$001$$$ is the second entry when counting in binary, $$$110$$$ is the second last.  It's a mathematical representation of reflecting/mirroring a binary count.

Now we have that out of the way we will prove the following

$$H[x] \oplus H[F[x,k]]=2^{k-1}$$

What does that mean?  It's saying that the output of the hypothesis function xored with the hypothesis function with the input ordering reversed will give a string of length $$$k$$$ that is all zeros except for the most significant bit.  This is a fancy way of saying that the output of $$$H$$$ is symmetrically except for the most significant bit.  This is a property of the Gray code that we are trying to prove. From the beginning of the series each element of a Gray code is equal to the element the same distance from the end except for the most significant bit.

As $$$F[x,k]$$$ is an inverted version of $$$x$$$ the most significant bit of one or the other, but not both, will be equal to one. Let's assume that it's $$$x$$$. It doesn't matter which one is chosen, the proof works either way.  We can now write

$$x=2^{k-1} \oplus y \quad where \quad y<2^{k-1}$$

It then follows that $$$F[x,k] = F[y,k-1]$$$.  This says that inverting $$$x$$$ which we have defined to start with one, will now start with zero and is equal to the inverted value of the new shorter bit string $$$y$$$.  For example if $$$x=1100101$$$ then $$$y = 100101$$$.  So $$$F[1100101,7]=0011010$$$ and $$$F[100101,6]=011010$$$.

Now onto the proof.  It's all maths at this point, you can ignore conceptual ideas here and just work through the process.

\begin{align}H[x] \oplus H[F[x,k]]&=(2^{k-1} \oplus y) \oplus (2^{k-2} \oplus (y>>1)) \oplus H[F[y,k-1]] \\ &=2^{k-1} \oplus 2^{k-2} \oplus H[y] \oplus H[F[y,k-1]] \\ &= 2^{k-1} \oplus 2^{k-2} \oplus 2^{k-2} \\ &= 2^{k-1}\end{align}

Now we've proven this, for strings of length $$$k$$$ using strings of length $$$k-1$$$ it just needs to be proved for strings of length 1 ie $$$k=1$$$ for the inductive proof to be complete.

\begin{align} H[0] \oplus H[1] &= 0 \oplus 0 \oplus 1 \oplus 0 \\&=1\end{align}

All of that above just proves that $$$H[n] = n \oplus (n>>1)$$$ results in an output series that is mirrored with the most significant bit inverted.  Now we need to prove the following

$$H[2^k \oplus x] = 2^k \oplus H[F[x,k]] \quad where \quad x<2^k$$

This states that if we add a one to the input to the hypothesis function, it's equivalent to inverting the input and then adding a one to the output.  It's hard to see, but this describes the process of going from a Gray code of $$$k-1$$$ bits to a Gray code of $$$k$$$ bits.

\begin{align} H[2^k \oplus x] &= 2^k \oplus x \oplus 2^{k-1} \oplus (x>>1) \\ &= 2^k \oplus 2^{k-1} \oplus H[x] \\ &=2^k \oplus H[F[x,k]]\end{align}

As we did before, this needs to be proved for $$$k=1$$$ for the induction proof to work.

\begin{align}H[10 \oplus 1] &= 10 \oplus H[0] \\H[11] &= 10 \oplus 0 \oplus 0  \\ 11 \oplus 01 &= 10 \\ 10=10 \end{align}

OK, that was a confusing rabbit hole we just fell down, but it proves that $$$H[n]=G[n]$$$ for all $$$n$$$.  I'm not entirely sure we need both parts of that proof, but it can't hurt to prove it twice. 😅 

We can now be sure that $$$G[n] = n \oplus (n>>1)$$$  How simple is that!  A Gray code can be generated by xoring its index with itself but right shifted by one.  This means that each digit of the Gray code is the xor of two consecutive digits in the input.

Generating a Gray code from an index - input in green output in purple
In the image above it's kind of like differentiating the bit stream.  If two consecutive bits are different it generates a one in the output.  So if we go from index to Gray code with something like a differentiation you might think we can go the other way with integration.  You sure can.  I won't prove it but you can look it up.

Now all this seems a little simple and useless apart from the use case I gave at the start, but this goes into some deep concepts.  We're looking at mathematics over a finite field, in this case $$$GF(2)$$$. Now here's the really cool part.  If you use each digit of a Gray code as a coordinate and plot it, you visit each vertex of a square for a two bit Gray code.  For a three bit Gray code you visit each vertex of a cube travelling along the edges. More generally for a k bit code you travel along the edges of a k-dimensional hypercube visiting all vertices.  This is called a Hamilton cycle.

Sorry if I'm ranting but I find all this fascinating, but it hasn't solidified in my mind yet.

If you've made it this far you deserve a joke.

Q. What's the best way to visit all the vertices of a hypercube?
A. On a Hamilton bicycle.

Hey give me a break.  I didn't say it'd be funny.

In case the equations don't render properly, here is a PDF of this page.
A PDF of the original solution on Quora.

Wednesday, February 22, 2017

Hilbert Curve Generation With Lookup Tables

I've been researching something that requires a Hilbert curve and I thought I'd share how to generate the path and also move between the index and coordinates of points.

For those of you unfamiliar with Hilbert curves let me quickly explain what one is and why you would want to use one.  The curve covers every point in a square with side length 2^n by moving up, down, left and right, starting at one corner and ending at an adjacent corner.  You could accomplish the same requirements by just scanning back and forth across the rows, but the advantage the Hilbert curve has is that nearby points on the 2D grid are generally also near each other on the curve.  If you were to scan back and forth you get a lot of points that are directly beside each other in 2D space but very far apart on the curve.

So why would you want to use one?  They're great for turning 2D areas into 1D streams of data while maintaining locality.  This comes in handy in image processing.  Depending on what you want to do, this may make the processing a lot easier.

Generation of the curve can be done recursively by first selecting an initial shape type and then using tables of sub-types and locations to generate the next stage.  In the animation below the initial shape is type 1.  When you go to the sub-type table you see that type 1 is replaced with types 2, 1, 1, and 4 in that order.  The position of each sub-type is defined by the parent type.  As the parent is type 1 the locations of the sub-types from the location table are:

2 at (0, 0)
1 at (0, 1)
1 at (1, 1)
4 at (1, 0)

The location table can then be used to find the coordinates of the sub-types.  As binary coordinates are used, the x and y coordinates of the sub-type can be appended to its coordinates to find the new sub-coordinates.   For example, the coordinates of the type 4 sub-type at position (1, 0) above are (1, 1) (0, 1) (0, 0) (1, 0).  Appending these to the coordinates of (1, 0) you find the new sub coordinates (11, 01) (10, 01) (10, 00) (11, 00).  Trust me it's confusing at first but after a while it all makes complete sense.  I'm still trying to come up with appropriate terminology for the process.  These tables are from a paper

It's a little confusing at first but I recommend reading that to help understand the process.

Recursive Hilbert Curve Generation

What if you don't want to generate the full curve and just want to know where along the curve a point lays or where a point on the curve is in 2D space?  Let's use the below example to demonstrate this.

44th Point at location (111, 101)

The above image shows the 44th point of the third stage at (111, 101).  It's actually the 45th point but we start counting at  0.  We add an index row to the tables provided in the above paper to make the process easier.

Generation Tables

We start by converting the index to a binary number with 2 bits for every stage.  In this case it's the third stage, so we get 6 bits.  These bits are placed two at a time on rows under the index column.  We also define the initial shape as type 1.  From this, the sub-type and location for the first row can be found.  The sub-type becomes the type for the next row.  This process is repeated two more times until we have three locations.  To get the final location coordinates, just append the x and y coordinates downwards to get (111, 101).

Converting an Index to a location

To go from coordinate to index, split the location up by removing the highest bit from the x and y coordinate for each line.  As the start type is known, the index can be found.  This index is then used to find the sub-type for the next row.  This is repeated two more times.  The indices are then concatenated to get 101100 in binary which is 44.

Converting a location to an index

I hope this helped to explain the process.  Trust me, even if it's still confusing, having diagrams and a couple of hopefully easy to follow examples will help.