Tuesday, August 6, 2013

Detecting the Rotation Angle of Printed Text

In my last post I described the basics of using the Radon transform to detect the rotation angle of text.  In this post I'll automate the process a little further and try to make it more robust.  As before, we'll start with the simulated test report.  This time, however, a thresholding operation is applied to the image to convert it to black and white before the Radon transform is applied.  This removes any variation due to the whiteness of the paper or the darkness of the text.
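As a rough illustration of the thresholding step, here is a minimal plain-Python sketch. The function name `to_black_and_white` and the cutoff of 128 are assumptions made for the example; the right threshold depends on the scan.

```python
def to_black_and_white(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to 1 = ink, 0 = paper.

    The default cutoff of 128 is an illustrative assumption.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# Off-white paper (around 230) with dark-grey text (around 40).
page = [
    [230, 228, 231, 229],
    [ 40,  45, 232,  38],
    [229, 230, 227, 231],
]
binary = to_black_and_white(page)
```

After this step the exact shade of the paper or ink no longer matters; only the positions of the ink pixels remain.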

The file used in testing the below process can be found here.
Figure: Simulated Report
Previously, the Radon transform was applied to the report over the range 0 to 180 degrees, which was more than was needed.  I figure that the text will never be more than 20 degrees out of alignment in either direction.  For this reason the Radon transform is only calculated over the range 70 to 110 degrees, which is 20 degrees either side of the 90 degree horizontal direction.
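To make the restricted angle range concrete, the sketch below computes a crude Radon-style projection in plain Python by binning each ink pixel's offset along the projection direction. It is not the implementation used for the figures in this post; the function name `radon_projections`, the nearest-bin rounding, and the 1 degree step are all assumptions made for the example.

```python
import math


def radon_projections(binary, angles_deg):
    """Project a binary image (1 = ink) along each angle.

    Returns sinogram[offset_bin][angle_index], so angles run horizontally
    and offsets vertically. At 90 degrees each bin collects one image row.
    """
    height, width = len(binary), len(binary[0])
    cx, cy = (width - 1) / 2, (height - 1) / 2
    half = math.ceil(math.hypot(width, height) / 2)  # bound on any offset
    sinogram = [[0] * len(angles_deg) for _ in range(2 * half + 1)]
    for k, angle in enumerate(angles_deg):
        c = math.cos(math.radians(angle))
        s = math.sin(math.radians(angle))
        for y, row in enumerate(binary):
            for x, ink in enumerate(row):
                if ink:
                    offset = (x - cx) * c + (y - cy) * s
                    sinogram[round(offset) + half][k] += 1
    return sinogram


# Three horizontal "text lines" on a 21 x 21 page.
page = [[1 if y in (5, 10, 15) else 0 for x in range(21)] for y in range(21)]
angles = list(range(70, 111))  # 70, 71, ..., 110 degrees (assumed 1 degree step)
sinogram = radon_projections(page, angles)

# Tallest bin for each angle: large when the projection lines up with the text.
peak_per_angle = [max(row[k] for row in sinogram) for k in range(len(angles))]
```

For this unrotated test page the tallest bins occur at and near 90 degrees, where whole text lines fall into single bins, and they flatten out towards 70 and 110 degrees.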

Figure: Radon Transform from 70 to 110 degrees
After applying the Radon transform, the features identifying the alignment of the text become evident in the image.  To highlight these, a gradient filter was applied: the vertical gradient kernel of the Prewitt operator was convolved with the image.

[ 1  1  1]
[ 0  0  0]
[-1 -1 -1]

Convolving this matrix with the image has the effect of calculating a vertical gradient and then horizontally applying an averaging filter across three values.  This makes the features in the middle of the image more prominent.
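The convolution can be sketched in plain Python as follows. The helper name `convolve2d_valid` and the choice to keep only fully covered pixels (no padding at the borders) are assumptions for the example; the post does not say how the edges were handled.

```python
PREWITT_VERTICAL = [
    [ 1,  1,  1],
    [ 0,  0,  0],
    [-1, -1, -1],
]


def convolve2d_valid(image, kernel):
    """True 2-D convolution (kernel flipped), keeping only fully covered pixels."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[a][b] * image[i + kh - 1 - a][j + kw - 1 - b]
                for a in range(kh)
                for b in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A dark-to-bright step between rows 1 and 2.
step = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [5, 5, 5, 5],
    [5, 5, 5, 5],
]
gradient = convolve2d_valid(step, PREWITT_VERTICAL)
```

Each output value is the difference between the row below and the row above, added up over three neighbouring columns, so flat regions give zero and the step gives a strong response.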

Figure: Vertical Image Gradient

The image is then summed vertically, down each column, to produce an array with one value per angle.  The group of features in the middle of the image will cause a peak in this array.  This is evident in the plot of the array below.
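A small sketch of the column sum is given below. One caveat: a signed vertical difference largely cancels when added down a column, so this example assumes the magnitude of the gradient is what gets summed. That is an assumption on my part for the sketch, and the name `column_sums` is made up for it.

```python
def column_sums(image, use_magnitude=True):
    """Collapse a 2-D list (rows x columns) to one total per column.

    Summing magnitudes rather than signed values is an assumption here.
    """
    measure = abs if use_magnitude else (lambda v: v)
    return [sum(measure(row[k]) for row in image) for k in range(len(image[0]))]


gradient = [
    [0,  3,  0],
    [0, -3,  0],
    [1,  0, -1],
]
profile = column_sums(gradient)                      # sums of magnitudes
signed = column_sums(gradient, use_magnitude=False)  # sums of raw values
```

In the middle column the rising and falling edges cancel in the signed sum, while the magnitude sum keeps them and produces the peak.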

Figure: Gradient Intensity vs Text Rotation Angle
The location of the peak in this graph indicates how far the image is rotated.  Searching the array shows that the peak is 0.6 degrees from the 90 degree mark.  This is how much the test image was rotated.
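Converting the peak position into a rotation angle can be sketched like this. The function name `rotation_from_profile`, the 0.1 degree angle step, and the sign convention are assumptions for the example, and the profile is synthetic rather than the real data.

```python
def rotation_from_profile(profile, start_deg=70.0, step_deg=0.1, reference_deg=90.0):
    """Return the angle of the largest profile value, relative to horizontal.

    start_deg and step_deg describe the angles the profile was sampled at;
    both defaults are illustrative assumptions.
    """
    peak_index = max(range(len(profile)), key=profile.__getitem__)
    return start_deg + peak_index * step_deg - reference_deg


# Synthetic profile covering 70 to 110 degrees in 0.1 degree steps,
# with a single peak placed at 90.6 degrees.
profile = [1.0] * 401
profile[206] = 9.0
rotation = rotation_from_profile(profile)  # approximately 0.6
```

Rotating the scanned image back by this amount would then straighten the text.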

The above process is the basis of an automated system for removing rotation distortion from scanned documents.  I haven't tested it on real documents, so there could be problems with how robust it is.  If that turns out to be the case, I've got a couple of ideas to make the detection process stronger and less prone to error.  We're looking for two main things in the Radon transform: bright spots that represent lines, and sharp transitions that represent the border between text and the white space under it.  Applying a threshold to the Radon transform will remove some of the background details that don't represent lines.  Thresholding the gradient image will also remove features that don't have sharp transitions.  It's just a matter of experimentation to see at what levels the thresholds need to be applied.
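One way such a threshold could look is sketched below: every value whose magnitude falls under a chosen fraction of the largest magnitude is set to zero. The helper name `suppress_below` and the fraction of 0.5 are purely illustrative, and the same helper could be applied to either the Radon transform or the gradient image.

```python
def suppress_below(image, fraction):
    """Zero every value whose magnitude is below fraction * largest magnitude."""
    peak = max(abs(value) for row in image for value in row)
    cutoff = fraction * peak
    return [[value if abs(value) >= cutoff else 0 for value in row] for row in image]


# Small stand-in for a gradient image; 0.5 is an arbitrary starting point.
gradient = [
    [ 1, 8,  2],
    [10, 3, -9],
]
cleaned = suppress_below(gradient, 0.5)
```

Only the strong responses survive, so weak background detail contributes nothing when the image is later summed.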
