Wednesday, November 3, 2010

Reading #14. Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams (Bhat)

Comments

Jhonathan

Summary

This paper also addresses the problem of distinguishing shape from text. Unlike the paper in the previous post, this recognizer does not attempt to use many features; instead it uses a single feature, entropy, to split shape from text. Entropy proved to be a very distinctive feature between the two. Entropy is a measure of the uncertainty associated with a random variable; in other words, it is the randomness of an object or system. The intuition is that text is far more random than simple shapes. Several steps were followed to measure this randomness in a sketch. First, the strokes were grouped on a time basis. Then, the sketch was resampled so that consecutive points in each stroke are a fixed distance apart. Next, the angle formed at each resampled point was measured. Based on this angle, each joint was assigned one of 7 possible labels, and from this labeling the overall entropy of the shape could be calculated with the paper's entropy formula.
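The steps above can be sketched roughly as follows. This is only my own illustration, not the authors' code: it assumes strokes come as (start_time, end_time, points) tuples, that the 7 labels are six equal-width angle bins (A to F) plus an X label for stroke endpoints, and that the entropy is the standard Shannon entropy over the label frequencies. The function names, the max_gap parameter, and the spacing parameter are all made up for the example.

```python
import math
from collections import Counter

ANGLE_LABELS = "ABCDEF"   # assumed: six equal-width bins over [0, pi]
ENDPOINT_LABEL = "X"      # assumed: seventh label for stroke endpoints


def group_by_time(strokes, max_gap):
    """Group strokes (start_time, end_time, points), sorted by start time,
    whenever the pause since the previous stroke is at most max_gap."""
    groups = []
    for stroke in strokes:
        if groups and stroke[0] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(stroke)
        else:
            groups.append([stroke])
    return groups


def resample(points, spacing):
    """Resample a polyline so consecutive points are `spacing` apart
    when measured along the path."""
    if not points:
        return []
    pts = list(points)
    out = [pts[0]]
    travelled = 0.0
    i = 1
    while i < len(pts):
        (x0, y0), (x1, y1) = pts[i - 1], pts[i]
        d = math.hypot(x1 - x0, y1 - y0)
        if d > 0 and travelled + d >= spacing:
            t = (spacing - travelled) / d
            q = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            out.append(q)
            pts.insert(i, q)  # keep walking from the new point
            travelled = 0.0
        else:
            travelled += d
        i += 1
    return out


def label_points(points):
    """Label each point: X for endpoints, A to F by the angle at the joint."""
    if len(points) < 2:
        return [ENDPOINT_LABEL] * len(points)
    labels = [ENDPOINT_LABEL]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        ax, ay = prev[0] - cur[0], prev[1] - cur[1]
        bx, by = nxt[0] - cur[0], nxt[1] - cur[1]
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        if na == 0 or nb == 0:
            angle = math.pi  # duplicate point: treat as a straight joint
        else:
            cos = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
            angle = math.acos(cos)
        idx = min(int(angle / math.pi * len(ANGLE_LABELS)),
                  len(ANGLE_LABELS) - 1)
        labels.append(ANGLE_LABELS[idx])
    labels.append(ENDPOINT_LABEL)
    return labels


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the label frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

With this sketch, a straight line gets the same label at almost every joint and so a low entropy, while handwriting changes direction often and spreads its joints over many labels, which is the intuition the paper relies on.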

Results show that this single feature differentiates shape from text even better than the combination of features shown by Plimmer. It achieved an accuracy of 95.56% with 77.51% of the shapes classified (some were left as unclassified).
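To make the "unclassified" part concrete, here is a small sketch of how a single entropy value could be turned into three outcomes, and how accuracy and the fraction classified can be reported separately. The two thresholds (shape_max and text_min) and the function names are hypothetical; the post does not give the values or the exact decision rule the paper used.

```python
def classify_by_entropy(entropy, shape_max, text_min):
    """Return 'shape', 'text', or 'unclassified' for one entropy value.

    shape_max and text_min are hypothetical thresholds; anything that
    falls strictly between them is left unclassified.
    """
    if shape_max > text_min:
        raise ValueError("shape_max must not exceed text_min")
    if entropy <= shape_max:
        return "shape"
    if entropy >= text_min:
        return "text"
    return "unclassified"


def evaluate(samples, shape_max, text_min):
    """samples: list of (entropy, true_label) pairs.

    Returns (accuracy over classified samples, fraction classified).
    """
    classified = 0
    correct = 0
    for entropy, truth in samples:
        guess = classify_by_entropy(entropy, shape_max, text_min)
        if guess == "unclassified":
            continue
        classified += 1
        if guess == truth:
            correct += 1
    accuracy = correct / classified if classified else 0.0
    coverage = classified / len(samples) if samples else 0.0
    return accuracy, coverage
```

Widening the gap between the two thresholds leaves more samples unclassified but should make the remaining decisions more reliable, which is one way to read a high accuracy reported alongside a lower fraction classified.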

Discussion

This paper found a single feature that is very important for separating shape from text. I think it is interesting that the paper analyzed entropy by itself in order to prove the power of this feature. However, in a real classifier I would rely on more than this one feature, so as to handle some of the cases not analyzed in this paper. Musical notes are one example: entropy alone fails on them, but combined with other features such as density it can discern them accurately. Other techniques may also help with the more general process. For instance, a wrong grouping that relies on time alone can affect the whole classification. If other techniques, such as growing boxes, can detect and recover from a wrong grouping, this could become a more robust classifier.

1 comment:

  1. The entropy is a nice trick, but if I've learned anything in sketch recognition, it's that a one-trick pony never performs perfectly (in sketch recognition, at least, as well as numerous other areas). The author should have included a few other weighted features, with entropy being one of the most heavily weighted, of course.
