Comments
Summary
Once again, this paper focuses on separating text from shapes. In this case a general approach is taken, based mainly on spatial features of the ink. A graph is built that relates each stroke in the sketch to its neighboring strokes. Strokes that are grouped closely together can then be passed to different recognizers. The novelty here is that grouping and recognition are done in parallel, so the recognizer can judge whether a grouping is good and, if it is not, another grouping can be tried. Once a sketch is represented as a graph, many of the usual graph-theory algorithms can be applied; in this case an A* search is used to optimize the grouping of the strokes. The results show a grouping accuracy of 90% and an accuracy of 85% for recognition with grouping.
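To make the graph idea concrete, here is a minimal sketch of how a stroke neighborhood graph and its candidate groups might be built. It is not the paper's implementation: the neighbor test (closest points within a fixed `threshold`), the size limit `max_size`, and all function names are assumptions made for illustration. The search step (A* in the paper) and the recognizers that score each group are left out.

```python
from itertools import combinations
from math import hypot


def min_distance(a, b):
    """Smallest point-to-point distance between two strokes."""
    return min(hypot(px - qx, py - qy) for px, py in a for qx, qy in b)


def build_neighborhood_graph(strokes, threshold):
    """Connect two strokes when their closest points lie within threshold.

    The threshold rule is an assumed stand-in for the paper's criterion.
    """
    graph = {i: set() for i in range(len(strokes))}
    for i, j in combinations(range(len(strokes)), 2):
        if min_distance(strokes[i], strokes[j]) <= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def candidate_groups(graph, max_size):
    """Enumerate connected sets of at most max_size strokes."""
    found = set()
    frontier = [frozenset([v]) for v in graph]
    while frontier:
        group = frontier.pop()
        if group in found:
            continue
        found.add(group)
        if len(group) < max_size:
            # Grow the group by one neighboring stroke at a time,
            # so every candidate stays connected in the graph.
            for v in group:
                for n in graph[v]:
                    if n not in group:
                        frontier.append(group | {n})
    return found


# Each stroke is a list of (x, y) points.
strokes = [
    [(0, 0), (1, 0)],      # stroke 0
    [(1, 1), (2, 1)],      # stroke 1, close to stroke 0
    [(10, 10), (11, 10)],  # stroke 2, far from the others
]
graph = build_neighborhood_graph(strokes, threshold=1.5)
groups = candidate_groups(graph, max_size=2)
```

Because strokes are linked only by proximity, the drawing order does not matter. A search would then pick the set of candidate groups that the recognizers score best.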
Discussion
A very nice feature of this recognizer is that it does not require hand-coded heuristics, which is very useful for a general recognizer that can be applied to many domains. However, as is usually the case, the generalization comes at the price of lower accuracy: other recognizers that are fine-tuned for particular domains show better accuracy. Still, this is a very good starting point, and the grouping idea can be exploited in similar recognizers.
So it was entropy, then Markov models, and now spatial features. It is true that text strokes are denser than shape strokes, and the author achieved high recognition, but I cannot help but feel like something is not quite right with the results. I feel as if the data was not complex enough, or something along those lines. I suppose I'm saying that if we were to implement this algorithm ourselves, we would achieve a lower recognition rate.
Just as Jonathan says, researchers try their best to apply all kinds of features and methods to distinguish text from graphics. I think it is not easy to achieve the same recognition rate as the author states. Just treat the paper as a candidate idea; when you encounter the same problem, it may be an option.
The idea of grouping is very good in my opinion, and it is now very common. Grouping does not require users to draw in a single stroke or in a specific order. That is very cool. However, reducing the burden on users always means increasing the burden on computers.