SenseShapes

.: SenseShapes: Using Statistical Geometry for Object Selection in a Multimodal Augmented Reality System :.

People:

Project Description:

Selecting an object at a distance can be difficult in augmented reality (AR). SenseShapes are volumetric regions of interest that can be attached to parts of the user’s body to provide valuable information about the user’s interaction with objects. To assist in object selection, we generate a rich set of statistical data (both temporal and spatial in nature) and dynamically choose which data to consider based on the current situation.

Currently the regions can be any of the four geometric primitives: cone, sphere, cuboid, and cylinder. As the user interacts with the environment, the SenseShapes keep a history of all objects that intersected them, so that querying that history becomes possible. For example, the user might want to know "all objects that she was pointing at 2 seconds ago, that were visible." In addition to simple intersection history, SenseShapes keep a visibility history through the use of an object buffer.
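
The history query described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: the names Sample, ShapeHistory, record, and query are hypothetical, and the history is assumed to be a list of per-frame samples holding the set of intersecting objects and the set of visible objects.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One frame of history for a shape (hypothetical record layout)."""
    time: float              # timestamp in seconds
    intersecting: frozenset  # ids of objects inside the shape at that time
    visible: frozenset       # ids of objects visible at that time


@dataclass
class ShapeHistory:
    """Assumed per-shape history: an append-only list of samples."""
    samples: list = field(default_factory=list)

    def record(self, time, intersecting, visible):
        self.samples.append(
            Sample(time, frozenset(intersecting), frozenset(visible)))

    def query(self, at_time, require_visible=False):
        """Objects in the shape at the latest sample taken at or before at_time."""
        past = [s for s in self.samples if s.time <= at_time]
        if not past:
            return frozenset()
        latest = max(past, key=lambda s: s.time)
        if require_visible:
            return latest.intersecting & latest.visible
        return latest.intersecting


history = ShapeHistory()
history.record(0.0, {"lamp", "chair"}, {"lamp"})
history.record(1.0, {"chair"}, {"chair"})
history.record(3.0, {"table"}, {"table"})

# "All objects that she was pointing at 2 seconds ago, that were visible."
now = 3.0
pointed_and_visible = history.query(now - 2.0, require_visible=True)
```

Storing full sets per frame keeps the query simple; a real system would likely bound the history length.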

The statistics that get computed by SenseShapes are:

time ranking (how long did the object spend in the shape)
distance ranking (average distance from the origin of the shape to the object)
stability ranking (number of times that the object entered and exited the volume)
visibility ranking (how much of the object was visible)
center proximity (how close the visible portion of an object was to the center of the shape)
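
As a rough sketch of how the raw values behind these rankings might be accumulated for one object, consider the following. The Frame record, its fields, and the function object_statistics are hypothetical names introduced for illustration; the sketch assumes one sample per frame and simple averaging over the frames in which the object was inside the shape, which may differ from the actual system.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Per-frame observation of one object (hypothetical layout)."""
    dt: float             # frame duration in seconds
    inside: bool          # was the object inside the shape?
    distance: float       # distance from the shape's origin to the object
    visible: float        # visible fraction of the object, 0..1
    center_offset: float  # offset of the visible portion from the shape's center


def object_statistics(frames):
    """Accumulate the raw values that the five rankings would be based on."""
    inside = [f for f in frames if f.inside]
    count = len(inside)

    def mean(values):
        return sum(values) / count if count else None

    # Each change of the inside flag is one entry or one exit.
    transitions = sum(
        1 for a, b in zip(frames, frames[1:]) if a.inside != b.inside)

    return {
        "time": sum(f.dt for f in inside),
        "distance": mean([f.distance for f in inside]),
        "stability": transitions,
        "visibility": mean([f.visible for f in inside]),
        "center_proximity": mean([f.center_offset for f in inside]),
    }


frames = [
    Frame(0.5, True, 2.0, 1.0, 0.25),
    Frame(0.5, True, 4.0, 0.5, 0.5),
    Frame(0.5, False, 6.0, 0.0, 1.0),
    Frame(0.5, True, 3.0, 0.75, 0.75),
]
stats = object_statistics(frames)
```

Sorting all candidate objects by any one of these values would then yield the corresponding ranking.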

To demonstrate the use of SenseShapes we have designed a multimodal interface that uses glove-based gesture recognition (Essential Reality P5 glove) and speech recognition (IBM ViaVoice 10) for a simple task of redecorating our lab (coloring, moving, rotating and scaling objects). Our glove-based gesture recognizer currently supports only 3 gestures (point, grab, and thumbs-up), but we have found those to be sufficient for our purposes.

When determining which object to select, it is useful to have the system perform dynamic integration, in which an integration strategy for determining which statistics to use is chosen based on the current gesture, speech, and SenseShape. For example, we have used spatial cues in speech (e.g., “this/these” vs. “that/those”) to select among a set of alternative rankings. In the simplest example, closer objects might be weighted higher when “this” is used and lower when “that” is used.
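
The simplest example above can be sketched as a weighted score whose weights depend on the spoken word. This is a minimal illustration under stated assumptions, not the system's actual integration strategy: the functions choose_weights and select, the weight values, and the normalized "time" and "closeness" scores (higher closeness meaning nearer) are all hypothetical.

```python
def choose_weights(spoken_word):
    """Pick per-statistic weights from the spatial cue in speech (assumed values)."""
    if spoken_word in ("this", "these"):
        return {"time": 1.0, "closeness": 1.0}   # favor closer objects
    if spoken_word in ("that", "those"):
        return {"time": 1.0, "closeness": -1.0}  # favor farther objects
    return {"time": 1.0, "closeness": 0.0}       # no spatial cue


def select(candidates, spoken_word):
    """Return the candidate with the highest weighted score."""
    weights = choose_weights(spoken_word)

    def score(name):
        stats = candidates[name]
        return sum(weights[key] * stats[key] for key in weights)

    return max(candidates, key=score)


# Hypothetical normalized scores in 0..1 for two objects in the pointing cone.
candidates = {
    "near_lamp": {"time": 0.5, "closeness": 0.75},
    "far_lamp": {"time": 0.5, "closeness": 0.25},
}

chosen_for_this = select(candidates, "this")
chosen_for_that = select(candidates, "that")
```

With equal time scores, "this" selects the nearer lamp and "that" selects the farther one, mirroring the behavior described in the text.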

Images:

	A user sitting on the couch with four visible SenseShapes: one for each "eye" cone attached to the head, a "pointing" cone attached to the hand, and a "grabbing" sphere attached to the hand. SenseShapes would normally be hidden from the user to facilitate more natural interaction, but are shown here for clarity.
	The P5 glove that we use to detect gestures. An InterSense IS900 tracker is mounted on the glove for precise 6DOF tracking throughout the entire area of our lab.
	Grabbing gesture.
	Pointing gesture.
	Thumbs-up gesture.

Publications:

Olwal, A., Benko, H., Feiner, S. SenseShapes: Using Statistical Geometry for Object Selection in a Multimodal Augmented Reality System. In Proceedings of The Second IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR 2003). Tokyo, Japan. October 7–10, 2003. p. 300–301. [copyright]

Downloads:

Video 320x240 (DivX) - avi (31 MB)
Poster - jpg (1 MB)

Acknowledgments:

This project is funded in part by NSF ITR grants IIS-0121239 and IIS-00-82961, and Office of Naval Research Contracts N00014-99-1-0394, N00014-99-1-0683, N00014-04-1-0005, and N00014-99-1-0249.
Sajid Sadi and Avinanindra Utukuri for support with the P5 glove
Microsoft Research
Any opinions, findings, and conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or any other organization supporting this work.