Wednesday, December 1, 2010

Reading #17: Distinguishing Text from Graphics in On-line Handwritten Ink (Bishop)

Summary

As in readings 13 & 14, this paper addresses the problem of distinguishing shapes from text. The approach differs somewhat from those in the previous posts: it uses not only features of the individual stroke but also characteristics of the sketch, such as the gaps between strokes. It starts from an independent stroke model in which features are extracted, as in the other works, so that strokes can be classified using cross-entropy. In a later step, machine learning techniques take other important properties of the stroke's context into account to improve classification. In particular, a Hidden Markov Model is used to represent the sketch, and algorithms are run on it to find the optimal labeling for each stroke. The results are reported as confusion matrices; they are not easily comparable to those of other recognizers, but they do show the internal differences between the ways of using context (independent vs. uni-partite or bi-partite HMM).
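To make the labeling step concrete, here is a minimal sketch of Viterbi decoding over a two-state chain (text vs. graphics). It is not the paper's model: the `stay_prob` parameter, the uniform prior, and the use of per-stroke classifier probabilities as stand-ins for emission scores are all my own assumptions for illustration, and the gap features are left out entirely.

```python
import math

LABELS = ("text", "graphics")

def viterbi(stroke_probs, stay_prob=0.8):
    """Most likely label sequence for a list of strokes.

    stroke_probs: one dict per stroke with a probability for each label,
                  e.g. the output of an independent per-stroke classifier
                  (used here as an approximate emission score).
    stay_prob:    hypothetical probability that two consecutive strokes
                  share the same label (an assumption, not from the paper).
    """
    if not stroke_probs:
        return []

    def log_p(p):
        # Floor tiny values so log() never sees zero.
        return math.log(max(p, 1e-12))

    def trans(prev, cur):
        return log_p(stay_prob if prev == cur else 1.0 - stay_prob)

    # Uniform prior over the label of the first stroke.
    score = {l: log_p(0.5) + log_p(stroke_probs[0][l]) for l in LABELS}
    back = []  # back[t][cur] = best previous label for stroke t + 1

    for probs in stroke_probs[1:]:
        new_score, pointers = {}, {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: score[p] + trans(p, cur))
            new_score[cur] = score[prev] + trans(prev, cur) + log_p(probs[cur])
            pointers[cur] = prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda l: score[l])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

strokes = [
    {"text": 0.9, "graphics": 0.1},
    {"text": 0.45, "graphics": 0.55},  # ambiguous on its own
    {"text": 0.9, "graphics": 0.1},
]
independent = [max(p, key=p.get) for p in strokes]
with_context = viterbi(strokes)
```

In this toy input, the middle stroke is labeled graphics when classified on its own, but the chain pulls it back to text because both of its neighbors are confidently text. That is the kind of correction context is meant to provide.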

Discussion

This paper presents another technique for classifying text vs. shape. Although the results do not make clear how accurate the recognizer is in different domains, the concept of using context is very interesting. Even if the results cannot be trivially compared to others, they show improvements from using context. As a matter of fact, the apparently natural way in which humans tell shape from text relies heavily on context. For instance, the shape O in this paragraph would be classified by any normal person as the letter O. But in another context that same O would clearly be classified as a wheel, depending only on what surrounds it. (See fig below).

2 comments:

  1. This is definitely not a one-trick pony like the entropy algorithm from a while back. I would like to see a combination of this paper and the one-trick entropy algorithm. I know this paper uses entropy, but I don't think it uses the other method of measuring change in angles.

  2. I agree with the idea about the importance of context. Context deserves our attention, because people always rely on it when they recognize things. But using context will increase the time cost.

    I always wonder why we want to tell text apart from graphics. Is it OK to treat them equally? When recognizing, just let the recognizer classify them according to their structure or local features. Separating text from graphics first may introduce a mistake at the very beginning.
