Sunday, September 12, 2010

Reading #5: Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (Wobbrock)

Comments on others

liwenzhe

Summary

This paper proposes a very simple-to-implement algorithm that recognizes gestures with high accuracy and low processing time. The authors call it the $1 recognizer because of how cheap it is to implement. Despite some known limitations, the algorithm gives very accurate results while asking for very little processing power and memory. This makes it ideal for mobile devices with low system specifications or for web applications. Its simplicity moreover makes it easy to implement in any prototyping-oriented design environment like Flash. The authors present the algorithm and compare quantitative results with other known recognition algorithms. The results show that this recognizer is more accurate than other relatively simple algorithms like Rubine's, and that it is almost as good as sophisticated ones like DTW, but at a much lower price in implementation effort and processing time. The algorithm itself is divided into four steps: resampling the points, rotating based on the indicative angle, scaling and translating, and searching for the rotation angle that gives the best score.
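The four steps above can be sketched in code. This is a rough Python sketch following the paper's pseudocode, not the authors' implementation; N = 64 resampled points and a 250-unit reference square are the paper's defaults, but the helper names are my own:

```python
import math

N = 64        # resampled point count used in the paper
SIZE = 250.0  # side of the reference square

def path_length(pts):
    return sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts)))

def resample(pts, n=N):
    """Step 1: resample the stroke into n equidistantly spaced points."""
    interval = path_length(pts) / (n - 1)
    pts, out, D, i = list(pts), [pts[0]], 0.0, 1
    while i < len(pts) and len(out) < n:
        d = math.dist(pts[i - 1], pts[i])
        if D + d >= interval:
            t = (interval - D) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q becomes the next measuring origin
            D = 0.0
        else:
            D += d
        i += 1
    if len(out) < n:          # floating-point slack: keep the endpoint
        out.append(pts[-1])
    return out

def centroid(pts):
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

def rotate_by(pts, angle):
    cx, cy = centroid(pts)
    c, s = math.cos(angle), math.sin(angle)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def rotate_to_zero(pts):
    """Step 2: rotate so the indicative angle (centroid -> first point) is 0."""
    cx, cy = centroid(pts)
    return rotate_by(pts, -math.atan2(pts[0][1] - cy, pts[0][0] - cx))

def scale_and_translate(pts, size=SIZE):
    """Step 3: scale non-uniformly to a reference square, centroid to origin."""
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w = (max(xs) - min(xs)) or 1.0  # guard degenerate strokes
    h = (max(ys) - min(ys)) or 1.0
    pts = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

def path_distance(a, b):
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_best_angle(pts, tmpl,
                           lo=-math.radians(45), hi=math.radians(45),
                           tol=math.radians(2)):
    """Step 4: golden section search for the rotation minimizing path distance."""
    phi = 0.5 * (math.sqrt(5) - 1)
    x1, x2 = phi * lo + (1 - phi) * hi, (1 - phi) * lo + phi * hi
    f1 = path_distance(rotate_by(pts, x1), tmpl)
    f2 = path_distance(rotate_by(pts, x2), tmpl)
    while abs(hi - lo) > tol:
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = phi * lo + (1 - phi) * hi
            f1 = path_distance(rotate_by(pts, x1), tmpl)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = (1 - phi) * lo + phi * hi
            f2 = path_distance(rotate_by(pts, x2), tmpl)
    return min(f1, f2)

def normalize(pts):
    return scale_and_translate(rotate_to_zero(resample(pts)))

def recognize(pts, templates):
    """Match a candidate against pre-normalized (name, points) templates."""
    pts = normalize(pts)
    half_diag = 0.5 * math.sqrt(2 * SIZE ** 2)
    name, d = min(((n, distance_at_best_angle(pts, t)) for n, t in templates),
                  key=lambda nt: nt[1])
    return name, 1 - d / half_diag  # score in [0, 1], 1 = perfect match
```

Templates are simply example strokes run through the same `normalize` pipeline, which is exactly why training is essentially free here: one example per gesture is enough to start recognizing.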

Discussion

This is a very different approach from Rubine's, which suggests that there is not yet a universal recipe for gesture recognition. Both algorithms come from very bright ideas and solve the same problem quite successfully in fairly different ways. After playing for a little while with a JavaScript implementation of the $1 recognizer, I was very pleased with its recognition of predefined shapes when drawn correctly, and the response time is almost immediate. However, I find that the rotation invariance can be a double-edged sword: it is very good in that it successfully recognizes basic shapes at different rotations, as the triangle showed, but it also leads to false positives, as the arrow case shows. The recognizer states with relatively high certainty that what a user may perceive as a “left arrow” is really a “v”, leaving very little margin compared to a gesture that indeed resembles a “v” (scores of 0.73 vs. 0.78). (Note that the left arrow is not exactly a 180° rotation of the right arrow, but it will most certainly be the way an average user draws it.) So you really want to carefully determine whether a particular application is suited for a rotation-insensitive algorithm.
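To see why a 0.73 is "relatively high certainty", it helps to look at how $1 turns a match distance into a score: the average per-point path distance is divided by half the diagonal of the reference square (the paper uses a 250-unit square, assumed here) and subtracted from 1, so scores compress toward the top of the [0, 1] range.

```python
import math

SIZE = 250.0  # side of the reference square (paper's default)

def score(avg_path_distance):
    """$1's confidence: 1 is a perfect match; 0 means the average
    per-point error equals half the reference square's diagonal."""
    half_diagonal = 0.5 * math.sqrt(SIZE ** 2 + SIZE ** 2)  # about 176.8
    return 1 - avg_path_distance / half_diagonal
```

So a score of 0.73 still corresponds to an average per-point error of roughly 48 units in the 250-unit square, which is why a reflected arrow can land uncomfortably close to a genuine "v".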

2 comments:

  1. You've rotated the arrow, but you've also reflected it over the horizontal axis. The reflection is not supported by the $1 algorithm. When drawn as you have, I get 100% 'v'. However, if I draw it "correctly" (left, then down, then back up), I get 100% 'arrow'.

  2. You are completely right; that is what I meant in the parenthesis, but maybe I wasn't clear. My point is that the "confidence" of the recognition is pretty high (0.73) for this mistakenly drawn shape (kind of emphasizing how much the way the user draws matters in this algorithm).
