This is a blog mainly about Sketch Recognition and Computer Science related topics. It was created by Francisco Vides (Paco) originally as a posting site for the Sketch Recognition class at Texas A&M.
Thursday, December 9, 2010
Reading #23: InkSeine: In Situ Search for Active Note Taking (Hinckley)
Comments
Summary
One of the most popular applications of sketch recognition today is note taking, thanks to popular devices and applications such as the Palm Pilot and OneNote. In this paper the authors present a very nice application where the user can dynamically take notes and embed dynamic content within them without having to change the workspace context or the tool being used, enabling a continuous workflow without interruption. The key part of the application is the ink search: the user may draw an ink note and, using gestures, indicate that the selected strokes are text to be searched in a particular context, which is also selected via gestures. Thus, without having to leave the canvas, the user can create rich context over the ink notes. The paper explains the use case scenarios of the system in detail and gives a good impression of how it is used. However, the authors do not go deep into the recognition techniques employed for the text recognition or the gestures. After two iterations of their work the authors found that a plausible and usable system can be implemented using the InkSeine technology.
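The paper does not give implementation details for the gesture-driven search, but the basic interaction can be sketched. The snippet below is a minimal, hypothetical Python illustration (the stroke format, `recognize_handwriting`, and `launch_search` are my assumptions, not InkSeine's actual API): it finds the ink strokes enclosed by a lasso gesture and hands them to a recognizer and a search call.

```python
# Minimal sketch of a lasso-select-and-search interaction (assumed data model,
# not InkSeine's actual implementation). A stroke is a list of (x, y) points.

def point_in_polygon(pt, polygon):
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def strokes_inside_lasso(strokes, lasso, min_fraction=0.8):
    """Return strokes whose points fall mostly inside the lasso stroke."""
    selected = []
    for stroke in strokes:
        inside = sum(point_in_polygon(p, lasso) for p in stroke)
        if inside >= min_fraction * len(stroke):
            selected.append(stroke)
    return selected

def search_from_ink(strokes, lasso, recognize_handwriting, launch_search):
    """Hypothetical glue: recognize the lassoed ink and fire a search."""
    selected = strokes_inside_lasso(strokes, lasso)
    query = recognize_handwriting(selected)  # assumed handwriting recognizer
    return launch_search(query)              # assumed search hook (web, desktop, ...)
```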
Discussion
This tool provides an important improvement in note taking applications for tablet PCs. Most of the ink note taking applications found today either do not provide many of the features and advantages achievable through recognition, or have a complex user interface that makes note taking unnatural. The learning curve of this application seems small enough to allow a novice user to pick it up, while still providing the ability to use embedded and rich content. I really like the idea and the UI; however, I would have liked to see some numbers in the results in terms of accuracy, since the complete experience can become very frustrating if the ink is not recognized correctly.
Reading #22: Plushie: An Interactive Design System for Plush Toys (Mori)
Comments
Summary
Plushie is a tool very similar to Teddy. However, Plushie is designed to use the 3D model as an intermediate step: the real output is a 2D pattern that allows the end user to sew a stuffed animal that resembles the sketched model. The resulting technique has many advantages over the traditional framework for creating 2D sewing patterns. The continuous feedback between the sketched 2D form and the 3D model allows the user to adjust the 2D pattern in simulation before spending time and money on physical prototypes. The algorithms and techniques behind Plushie are very similar to those of the previous paper: a 3D mesh is created based on the user's input strokes. However, this enhanced interface is designed for the particular application of sewing and displays the resulting cloth pattern in real time.
Discussion
This is a very interesting idea and application. It can be really useful and fun to design stuffed animals using this program. Moreover, the simulation gives several advantages, as the user can adapt the model while seeing the resulting 3D model and at the same time make sure the appropriate constraints on the cloth pattern are followed (area of the cloth, number of parts, and so on). I think the system successfully opened a door into a very different domain.
Reading #21: Teddy: A Sketching Interface for 3D Freeform Design (Igarashi)
Comments
Summary
In this paper the authors introduce the concept of free sketching in 2D to create 3D shapes in an easy way. Teddy is an application that uses pen-input devices to let users sketch and interact in 2D space to create a 3D polygonal surface in a more creative manner. Unlike most 3D modeling tools, Teddy allows easy creation of freeform, sketchy 3D models, which makes it ideal for fast prototyping and for new users. The project uses recognition both for sketching and for gesture commands. A novel user interface converts basic strokes into 3D shapes that can be rotated and edited; the edit commands include extruding, smoothing, and cutting. The final result is a 3D mesh that can be fed into the many tools available for 3D rendering and processing. The implementation was made in Java and exposed to the public for user studies.
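Teddy's actual algorithm builds the surface from a constrained Delaunay triangulation and the chordal axis of the silhouette; as a rough, hypothetical approximation of the same idea, the sketch below inflates a closed 2D silhouette by raising each interior grid sample according to its distance from the contour. The grid-based approach and all names here are my own simplification, not the paper's method.

```python
import numpy as np

def dist_to_segment(p, a, b):
    """Distance from point p to segment ab (all 2D numpy arrays)."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / max(np.dot(ab, ab), 1e-12), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def point_in_polygon(p, poly):
    """Ray-casting point-in-polygon test; poly is an (n, 2) array."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside

def inflate_silhouette(poly, resolution=40):
    """Crude 'inflation': the height of each interior sample grows with the
    square root of its distance to the silhouette, giving a rounded blob."""
    poly = np.asarray(poly, dtype=float)
    xs = np.linspace(poly[:, 0].min(), poly[:, 0].max(), resolution)
    ys = np.linspace(poly[:, 1].min(), poly[:, 1].max(), resolution)
    samples = []
    for x in xs:
        for y in ys:
            p = np.array([x, y])
            if point_in_polygon(p, poly):
                d = min(dist_to_segment(p, poly[i], poly[(i + 1) % len(poly)])
                        for i in range(len(poly)))
                samples.append((x, y, np.sqrt(d)))  # top half of the blob
    return samples  # (x, y, z) samples of the inflated surface

# Example: inflate a rough circle drawn as a closed stroke.
theta = np.linspace(0, 2 * np.pi, 30, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
surface = inflate_silhouette(circle)
```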
Discussion
The application is very interesting and novel and allows creative users to approach 3D computer modeling more comfortably. The paper gives good insight into what the application is capable of doing and has a detailed explanation of the user interface and the resulting output. However, I was expecting more implementation details in terms of gesture and sketch recognition, and the authors do not draw many conclusions from their achievements. I think the ideas were good enough to expand more on sections 6-8 of the paper.
Wednesday, December 1, 2010
Reading #20: MathPad2: A System for the Creation and Exploration of Mathematical Sketches (LaViola)
Summary
MathPad2 is a very nice sketch application that attempts to enrich the experience of doing math on a tablet by animating the components drawn in the sketch. The paper focuses on the prototype application, which involves shape and gesture recognition that lets the user interact with the equations and the pictures they model. One of the claimed contributions of MathPad2 is the novel gestural recognition used in the interface, which is said to be more general and to work fluently across several domains such as math writing and diagram drawing.
Discussion
This is one of those applications that encourage continued work on sketch recognition. All of us who have dealt with an equation at some point in our lives can appreciate the power of receiving live feedback. Furthermore, if that can be done in a completely natural interface that feels like the pen and paper we used in school, it is even better. The interface they show has simple yet very powerful ideas for managing gestures accurately and easily. I like the tap after the command, as it is natural and easy to use but avoids the annoying false positives usually found in gesture recognizers.
Reading #19: Diagram Structure Recognition by Bayesian Conditional Random Fields (Qi)
Comments
Summary
This is a top-down recognizer that relies heavily on context to determine the correct classification of each stroke. In this case a model of Bayesian Conditional Random Fields (BCRF) is used to classify the strokes, and the classification of each stroke affects the classification of its neighbors. The paper provides a deep mathematical background for the model compared to others. The first step in recognition is to fragment the strokes in order to build the Bayesian CRF. Note that a fragment here is defined differently than in other papers: it is not the line segment formed by every two consecutive points in the stroke, but the set of points in the stroke that can be recognized as a straight line, which implies corner detection as seen in previous posts. The BCRF can then be constructed and trained to make inference on the network. The results show different classifications on variations of the CRF, showing that the BCRF performs better in the recognition, and an improvement based on Automatic Relevance Determination makes the recognition even better.
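The paper does not spell out its fragmentation code, but the idea of splitting strokes into straight-line fragments at corners can be sketched simply. The snippet below is a rough, assumed illustration using a turning-angle corner detector, not the authors' actual fragmenter:

```python
import math

def turning_angle(a, b, c):
    """Angle (radians) between segments a->b and b->c."""
    v1 = (b[0] - a[0], b[1] - a[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0
    cos_t = max(-1.0, min(1.0, (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)))
    return math.acos(cos_t)

def fragment_stroke(points, corner_threshold=math.radians(40)):
    """Split a stroke (list of (x, y) points) into near-straight fragments
    by cutting wherever the local turning angle exceeds a threshold.
    A crude stand-in for the corner detection the paper relies on."""
    if len(points) < 3:
        return [points]
    fragments, start = [], 0
    for i in range(1, len(points) - 1):
        if turning_angle(points[i - 1], points[i], points[i + 1]) > corner_threshold:
            fragments.append(points[start:i + 1])
            start = i
    fragments.append(points[start:])
    return fragments

# Example: an L-shaped stroke splits into two fragments at the corner.
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(fragment_stroke(stroke))
```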
Discussion
A nice thing about this work is that it takes a concept from another field (computer vision) and applies it successfully to the domain of sketch recognition. It is not the first time we have seen this phenomenon: since sketch recognition is such an open field at the moment, many works attempt, successfully or not, to convert the sketch recognition problem into a more familiar one (fuzzy logic, graph searches, HMMs, and so on). In this case, the Bayesian Conditional Random Fields show interesting results in this domain.
Reading #18: Spatial Recognition and Grouping of Text and Graphics (Shilman)
Comments
Summary
Once again, this paper focuses on separating text and shapes. In this case a general approach is taken, based mainly on spatial features of the ink. A graph is built that relates each stroke in the sketch to its neighboring strokes; strokes that are grouped close together can then be identified by different recognizers. The novel approach here is that grouping and recognition are done in parallel, so the recognizer can judge whether a grouping was good, and if not another grouping can be tried. Once a sketch is represented as a graph, many of the usual algorithms from graph theory can be used; in this case an A* search is used to optimize the grouping of the strokes. The results show grouping accuracies of 90% and recognition with grouping of 85%.
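As a hedged illustration of the first step only (the neighborhood graph, not the paper's A* optimization or its recognizers), the sketch below links strokes whose bounding boxes lie within a distance threshold and returns connected components as candidate groups. The stroke representation and the threshold are my assumptions, not the authors' implementation.

```python
from collections import defaultdict

def bbox(stroke):
    """Axis-aligned bounding box of a stroke given as (x, y) points."""
    xs, ys = zip(*stroke)
    return min(xs), min(ys), max(xs), max(ys)

def bbox_gap(b1, b2):
    """Gap between two boxes (0 if they overlap)."""
    dx = max(b1[0] - b2[2], b2[0] - b1[2], 0)
    dy = max(b1[1] - b2[3], b2[1] - b1[3], 0)
    return (dx ** 2 + dy ** 2) ** 0.5

def neighborhood_graph(strokes, max_gap=20.0):
    """Connect strokes whose bounding boxes are within max_gap of each other."""
    boxes = [bbox(s) for s in strokes]
    graph = defaultdict(set)
    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            if bbox_gap(boxes[i], boxes[j]) <= max_gap:
                graph[i].add(j)
                graph[j].add(i)
    return graph

def candidate_groups(strokes, max_gap=20.0):
    """Connected components of the neighborhood graph as candidate groups."""
    graph = neighborhood_graph(strokes, max_gap)
    seen, groups = set(), []
    for start in range(len(strokes)):
        if start in seen:
            continue
        stack, component = [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.append(node)
            stack.extend(graph[node])
        groups.append(component)
    return groups
```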
Discussion
A very nice feature of this recognizer is that it does not require hand-coded heuristics, which is very useful for a general recognizer that can be applied to many domains. However, as is usually the case, the generalization comes at the price of lower accuracy: other recognizers that are fine-tuned for particular domains show better accuracy. Still, this is a very good starting point, and the grouping idea can be exploited in similar recognizers.