.: SenseShapes: Using Statistical Geometry for Object Selection in a Multimodal Augmented Reality System :.


Project Description:

Selecting an object at a distance can be difficult in augmented reality (AR). SenseShapes are volumetric regions of interest that can be attached to parts of the user’s body to provide valuable information about the user’s interaction with objects. To assist in object selection, we generate a rich set of statistical data (both temporal and spatial in nature) and dynamically choose which data to consider based on the current situation.

Currently the regions can be any of four geometric primitives: cone, sphere, cuboid, or cylinder. As the user interacts with the environment, the SenseShapes keep a history of all objects that have intersected them, allowing that history to be queried. For example, the user might want to know "all objects that she was pointing at 2 seconds ago that were visible." In addition to simple intersection history, SenseShapes keep a visibility history through the use of an object buffer.
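The time-stamped history described above could be sketched as follows. This is a minimal illustration, not the system's actual implementation; the `SenseShape` class, its `record`/`query` methods, and the sample object names are all hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HistoryEntry:
    timestamp: float   # when this snapshot was taken
    intersecting: set  # object ids inside the region at this time
    visible: set       # object ids visible in the object buffer at this time


@dataclass
class SenseShape:
    entries: list = field(default_factory=list)

    def record(self, intersecting, visible, timestamp=None):
        """Append one snapshot of intersection and visibility state."""
        t = time.time() if timestamp is None else timestamp
        self.entries.append(HistoryEntry(t, set(intersecting), set(visible)))

    def query(self, at_time):
        """Objects that were inside the region AND visible at the most
        recent snapshot taken at or before `at_time` (e.g., 'what was I
        pointing at 2 seconds ago, that was visible?')."""
        past = [e for e in self.entries if e.timestamp <= at_time]
        if not past:
            return set()
        latest = max(past, key=lambda e: e.timestamp)
        return latest.intersecting & latest.visible
```

A real system would record these snapshots each frame and bound the history length; the visibility set here stands in for a per-object pixel count from the object buffer.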

From these histories, SenseShapes compute per-object statistics, both temporal (e.g., how long and how recently an object intersected a region) and spatial (e.g., how visible the object was within it), which are used to rank candidate objects for selection.
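Two illustrative statistics of this kind can be computed from a per-object boolean trace of "inside the region" samples. These particular measures and function names are examples of my own, not the system's documented statistics.

```python
def time_in_region(trace):
    """Fraction of samples in which the object intersected the region.
    `trace` is a chronological list of booleans (inside / not inside)."""
    return sum(trace) / len(trace) if trace else 0.0


def entry_count(trace):
    """Number of times the object entered the region. An object that
    flickers in and out (many entries) is a less stable candidate than
    one that stayed inside for one long interval."""
    entries = 0
    prev = False
    for inside in trace:
        if inside and not prev:
            entries += 1
        prev = inside
    return entries
```

For example, the trace `[False, True, True, False, True]` yields a time fraction of 0.6 but two separate entries, suggesting a less stable selection than a single continuous intersection of the same duration.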

To demonstrate the use of SenseShapes, we have designed a multimodal interface that uses glove-based gesture recognition (Essential Reality P5 glove) and speech recognition (IBM ViaVoice 10) for the simple task of redecorating our lab (coloring, moving, rotating, and scaling objects). Our glove-based gesture recognizer currently supports only three gestures (point, grab, and thumbs-up), but we have found these sufficient for our purposes.

When determining which object to select, it is useful to have the system perform dynamic integration, in which an integration strategy for determining which statistics to use is chosen based on the current gesture, speech, and SenseShape. For example, we have used spatial cues in speech (e.g., "this/these" vs. "that/those") to select among a set of alternative rankings. In the simplest example, closer objects might be weighted higher when "this" is used and lower when "that" is used.
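The "this" vs. "that" example above could be sketched as a ranking that breaks ties on proximity according to the spoken demonstrative. This is a simplified illustration under my own assumptions; the function name and the candidate data layout are hypothetical, and the real system combines more statistics than shown here.

```python
def rank_candidates(candidates, demonstrative):
    """Rank candidate objects by selection score, using the spoken
    demonstrative to decide how distance breaks ties: 'this'/'these'
    favors nearer objects, 'that'/'those' favors farther ones.
    `candidates` maps object id -> (distance, score)."""
    near = demonstrative in ("this", "these")

    def key(item):
        obj, (distance, score) = item
        # With reverse=True: higher score first; among equal scores,
        # smaller distance first for near demonstratives, larger
        # distance first for far ones.
        return (score, -distance if near else distance)

    return [obj for obj, _ in sorted(candidates.items(), key=key, reverse=True)]
```

With two equally well-ranked objects at 1 m and 4 m, saying "this" would put the nearer one first and "that" the farther one.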


A user sitting on the couch with four visible SenseShapes:

  • two "eye" cones attached to the head (one per eye)
  • a "pointing" cone attached to the hand
  • a "grabbing" sphere attached to the hand

SenseShapes would normally be hidden from the user to facilitate more natural interaction, but are shown here for clarity.

The P5 glove that we use to detect gestures. An InterSense IS900 tracker is mounted on the glove for precise 6DOF tracking throughout the entire area of our lab.
Grabbing gesture.
Pointing gesture.
Thumbs-up gesture.


Olwal, A., Benko, H., and Feiner, S. SenseShapes: Using Statistical Geometry for Object Selection in a Multimodal Augmented Reality System. In Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR 2003), Tokyo, Japan, October 7–10, 2003, pp. 300–301.