Gestures and motion controllers in Unity

There are two key ways to take action on your gaze in Unity: gestures and motion controllers. Both are handled through the same set of Unity spatial input APIs.

Unity provides both low-level access (raw position, orientation, and velocity information) and a high-level gesture recognizer that exposes more complex gesture events (for example: tap, double tap, hold, manipulation, and navigation).

Gestures: High-level spatial input

Namespace: UnityEngine.VR.WSA.Input

Types: GestureRecognizer, GestureSettings, InteractionSourceKind

These high-level gestures are generated by spatial input sources, such as hands and motion controllers. Each gesture event provides the SourceKind for the input as well as the targeting head ray at the time of the event. Some events provide additional context-specific information.

There are only a few steps required to capture gestures using a Gesture Recognizer:

  1. Create a new Gesture Recognizer
  2. Specify which gestures to watch for
  3. Subscribe to events for those gestures
  4. Start capturing gestures

Create a new Gesture Recognizer

Before you can capture gestures, create an instance of GestureRecognizer:

GestureRecognizer recognizer = new GestureRecognizer();

Specify which gestures to watch for

Specify which gestures you are interested in via SetRecognizableGestures():

recognizer.SetRecognizableGestures(GestureSettings.Tap | GestureSettings.Hold);

Subscribe to events for those gestures

Subscribe to events for the gestures you are interested in.

recognizer.TappedEvent += MyTapEventHandler;
recognizer.HoldStartedEvent += MyHoldEventHandler;

Note: Navigation and Manipulation gestures are mutually exclusive on an instance of a GestureRecognizer.

Start capturing gestures

By default, a GestureRecognizer does not monitor input until StartCapturingGestures() is called. A gesture event may still be generated after StopCapturingGestures() is called if the input was performed before the frame in which StopCapturingGestures() was processed. The recognizer evaluates each gesture against whether it was capturing in the frame when the gesture actually occurred, so it is reliable to start and stop gesture monitoring depending on which object the player is currently gazing at.

recognizer.StartCapturingGestures();

Stop capturing gestures

To stop gesture recognition:

recognizer.StopCapturingGestures();
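
Starting and stopping capture based on gaze, as described under "Start capturing gestures", might look like the following minimal sketch. The class name GazeGatedGestures and the handler name OnTapped are hypothetical, and the gaze ray is approximated here with the main camera's forward direction; the object needs a Collider for the raycast to hit it.

```csharp
using UnityEngine;
using UnityEngine.VR.WSA.Input;

// Hypothetical component: captures gestures only while the user gazes at this object.
// Assumes a Collider is attached to the same GameObject.
public class GazeGatedGestures : MonoBehaviour
{
    private GestureRecognizer recognizer;
    private bool capturing; // tracked locally so Start/Stop are called only on changes

    void Start()
    {
        recognizer = new GestureRecognizer();
        recognizer.SetRecognizableGestures(GestureSettings.Tap);
        recognizer.TappedEvent += OnTapped;
    }

    void Update()
    {
        if (Camera.main == null)
        {
            return;
        }

        // Approximate the gaze ray with the main camera's position and forward vector.
        Transform head = Camera.main.transform;
        RaycastHit hit;
        bool gazedAt = Physics.Raycast(head.position, head.forward, out hit)
                       && hit.collider.gameObject == gameObject;

        if (gazedAt && !capturing)
        {
            recognizer.StartCapturingGestures();
            capturing = true;
        }
        else if (!gazedAt && capturing)
        {
            recognizer.StopCapturingGestures();
            capturing = false;
        }
    }

    void OnTapped(InteractionSourceKind source, int tapCount, Ray headRay)
    {
        Debug.Log("Tapped while gazing at " + name);
    }

    void OnDestroy()
    {
        if (capturing)
        {
            recognizer.StopCapturingGestures();
        }
        recognizer.TappedEvent -= OnTapped;
    }
}
```

Keeping a local flag avoids calling StartCapturingGestures() or StopCapturingGestures() every frame; the calls are made only when the gaze state changes.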

Removing a gesture recognizer

Remember to unsubscribe from subscribed events before destroying a GestureRecognizer object.

void OnDestroy()
{
    // Unsubscribe from every event that was subscribed to earlier
    recognizer.TappedEvent -= MyTapEventHandler;
    recognizer.HoldStartedEvent -= MyHoldEventHandler;
}
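
Putting the steps above together, a complete component could be sketched as follows. The class name TapAndHoldListener is hypothetical. The handler signatures follow the TappedEvent and HoldStartedEvent delegates of UnityEngine.VR.WSA.Input as I understand them; check them against your Unity version. The head ray passed to the tap handler is used to find what the user was looking at when the gesture occurred.

```csharp
using UnityEngine;
using UnityEngine.VR.WSA.Input;

// Hypothetical component that listens for tap and hold gestures.
public class TapAndHoldListener : MonoBehaviour
{
    private GestureRecognizer recognizer;

    void Start()
    {
        // 1. Create a new Gesture Recognizer
        recognizer = new GestureRecognizer();

        // 2. Specify which gestures to watch for
        recognizer.SetRecognizableGestures(GestureSettings.Tap | GestureSettings.Hold);

        // 3. Subscribe to events for those gestures
        recognizer.TappedEvent += MyTapEventHandler;
        recognizer.HoldStartedEvent += MyHoldEventHandler;

        // 4. Start capturing gestures
        recognizer.StartCapturingGestures();
    }

    void MyTapEventHandler(InteractionSourceKind source, int tapCount, Ray headRay)
    {
        // Use the head ray captured at the time of the gesture to find the target.
        RaycastHit hit;
        if (Physics.Raycast(headRay, out hit))
        {
            Debug.Log(source + " tapped " + hit.collider.name + " (tap count: " + tapCount + ")");
        }
    }

    void MyHoldEventHandler(InteractionSourceKind source, Ray headRay)
    {
        Debug.Log("Hold started by " + source);
    }

    void OnDestroy()
    {
        recognizer.StopCapturingGestures();
        recognizer.TappedEvent -= MyTapEventHandler;
        recognizer.HoldStartedEvent -= MyHoldEventHandler;
    }
}
```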

Interactions: Low-level spatial input

Namespace: UnityEngine.VR.WSA.Input

Types: InteractionManager, InteractionSourceState, InteractionSource, InteractionSourceProperties, InteractionSourceKind, InteractionSourceLocation

Low-level spatial input allows you to get much more detailed information on interaction sources, such as the source's hand pose or pointing pose, or the state of a motion controller's touchpad or thumbstick.

How to poll for the state of hands and motion controllers

You can poll for the latest state of each interaction source (hand or motion controller) using the GetCurrentReading method.

var interactionSourceStates = InteractionManager.GetCurrentReading();

Each InteractionSourceState you get back represents an interaction source at the current moment in time. The InteractionSourceState exposes info such as:

  • Which kinds of presses are occurring (Select/Menu/Grasp/Touchpad/Thumbstick)
  • Other data specific to motion controllers, such as the touchpad and/or thumbstick's XY coordinates and touched state
  • The head pose at the moment in time when this gesture data was captured, which can be used to determine what the user was gazing at. This is especially useful for targeting a user's hand gestures, since there is some latency before hand poses are processed by the system and provided to the app.
  • The hand pose and pointing pose of the interaction source at that point in time
  • The InteractionSourceKind to know if the source is a hand or a motion controller
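
As a rough illustration of polling, the sketch below reads the current states once per frame and logs the sources that are pressed. The class name SourcePoller is hypothetical, and the members used (pressed, headRay, source.id, source.kind) are the ones I believe InteractionSourceState exposes in UnityEngine.VR.WSA.Input; names may differ between Unity versions.

```csharp
using UnityEngine;
using UnityEngine.VR.WSA.Input;

// Hypothetical component that polls interaction sources every frame.
public class SourcePoller : MonoBehaviour
{
    void Update()
    {
        InteractionSourceState[] states = InteractionManager.GetCurrentReading();

        foreach (InteractionSourceState state in states)
        {
            // Skip sources that are not currently pressed.
            if (!state.pressed)
            {
                continue;
            }

            // headRay is the head pose captured together with this reading,
            // so it reflects where the user was gazing when the input occurred.
            Debug.Log(string.Format(
                "Source {0} ({1}) pressed; gaze direction {2}",
                state.source.id,
                state.source.kind,
                state.headRay.direction));
        }
    }
}
```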

Hand pose vs. pointing pose

Windows Mixed Reality supports motion controllers in a variety of form factors, with each controller's design differing in its relationship between the user's hand position and the natural "forward" direction that apps should use for pointing when rendering the controller.

To better represent these controllers, there are two kinds of poses you can investigate for each interaction source:

  • The hand pose, representing the location of either the palm of a hand detected by a HoloLens, or the palm holding a motion controller.
    • On immersive headsets, this pose is best used to render the user's hand or an object held in the user's hand, such as a sword or gun.
    • You can access the hand pose through either Unity's cross-vendor input API (VR.InputTracking.GetLocalPosition/Rotation) or through the Windows-specific API (sourceState.sourcePose.TryGetPosition/Rotation).
  • The pointer pose, representing the tip of the controller pointing forward.
    • This pose is best used to raycast when pointing at UI when you are rendering the controller model itself.
    • Currently, the pointer pose is available only through the Windows-specific API (sourceState.sourcePose.TryGetPointerPosition/Rotation).

These pose coordinates are all expressed in Unity world coordinates.
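
The difference between the two poses can be sketched as below. This assumes that the Windows-specific methods named above (sourceState.sourcePose.TryGetPosition/Rotation and TryGetPointerPosition/Rotation) each take a single out parameter and return a bool indicating success; those exact signatures are an assumption and should be verified against your Unity build. The class name PoseVisualizer and the heldObject field are hypothetical.

```csharp
using UnityEngine;
using UnityEngine.VR.WSA.Input;

// Hypothetical component: places an object at the hand pose and
// draws a debug ray along the pointer pose of each controller.
public class PoseVisualizer : MonoBehaviour
{
    public Transform heldObject; // hypothetical: e.g. a sword model assigned in the Inspector

    void Update()
    {
        foreach (InteractionSourceState sourceState in InteractionManager.GetCurrentReading())
        {
            if (sourceState.source.kind != InteractionSourceKind.Controller)
            {
                continue;
            }

            // Hand pose: best for rendering something held in the user's hand.
            // Assumed signatures: bool TryGetPosition(out Vector3), bool TryGetRotation(out Quaternion).
            Vector3 handPosition;
            Quaternion handRotation;
            if (heldObject != null
                && sourceState.sourcePose.TryGetPosition(out handPosition)
                && sourceState.sourcePose.TryGetRotation(out handRotation))
            {
                heldObject.position = handPosition;
                heldObject.rotation = handRotation;
            }

            // Pointer pose: best for raycasting at UI from the controller tip.
            // Assumed signatures mirror the hand pose methods.
            Vector3 pointerPosition;
            Quaternion pointerRotation;
            if (sourceState.sourcePose.TryGetPointerPosition(out pointerPosition)
                && sourceState.sourcePose.TryGetPointerRotation(out pointerRotation))
            {
                Vector3 pointerForward = pointerRotation * Vector3.forward;
                Debug.DrawRay(pointerPosition, pointerForward * 10f);
            }
        }
    }
}
```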

How to start handling an interaction event

Using interaction input is easy:

  • Register for an InteractionManager input event. Subscribe to each type of interaction event that you are interested in.
InteractionManager.SourcePressed += InteractionManager_SourcePressed;
  • Handle the event. Once you have subscribed to an interaction event, you will get the callback when appropriate. In the SourcePressed example, this will be after the source was detected and before it is released or lost.
void InteractionManager_SourcePressed(InteractionSourceState state)
{
    // state has information about:
       // targeting head ray at the time when the event was triggered
       // whether the source is pressed or not
       // properties like position, velocity, source loss risk
       // source id (which hand id for example) and source kind like hand, voice, controller or other
}

How to stop handling an event

You need to stop handling an event when you are no longer interested in the event or you are destroying the object that has subscribed to the event. To stop handling the event, you unsubscribe from the event.

InteractionManager.SourcePressed -= InteractionManager_SourcePressed;

Input Source Change Events

These events fire when an input source is:

  • detected (becomes active)
  • lost (becomes inactive)
  • updated (moves or otherwise changes some state)
  • pressed (tap, button press, or select uttered)
  • released (end of a tap, button released, or end of select uttered)

Example

using UnityEngine;
using UnityEngine.VR.WSA.Input;

public class InteractionEventsExample : MonoBehaviour
{
    void Start()
    {
        InteractionManager.SourceDetected += InteractionManager_SourceDetected;
        InteractionManager.SourceUpdated += InteractionManager_SourceUpdated;
        InteractionManager.SourceLost += InteractionManager_SourceLost;
        InteractionManager.SourcePressed += InteractionManager_SourcePressed;
        InteractionManager.SourceReleased += InteractionManager_SourceReleased;
    }

    void OnDestroy()
    {
        InteractionManager.SourceDetected -= InteractionManager_SourceDetected;
        InteractionManager.SourceUpdated -= InteractionManager_SourceUpdated;
        InteractionManager.SourceLost -= InteractionManager_SourceLost;
        InteractionManager.SourcePressed -= InteractionManager_SourcePressed;
        InteractionManager.SourceReleased -= InteractionManager_SourceReleased;
    }

    void InteractionManager_SourceDetected(InteractionSourceState state)
    {
        // Source was detected
        // state has the current state of the source including id, position, kind, etc.
    }

    void InteractionManager_SourceLost(InteractionSourceState state)
    {
        // Source was lost. This will be after a SourceDetected event and no other events for this source id will occur until it is Detected again
        // state has the current state of the source including id, position, kind, etc.
    }

    void InteractionManager_SourceUpdated(InteractionSourceState state)
    {
        // Source was updated. The source would have been detected before this point
        // state has the current state of the source including id, position, kind, etc.
    }

    void InteractionManager_SourcePressed(InteractionSourceState state)
    {
        // Source was pressed. This will be after the source was detected and before it is released or lost
        // state has the current state of the source including id, position, kind, etc.
    }

    void InteractionManager_SourceReleased(InteractionSourceState state)
    {
        // Source was released. The source would have been detected and pressed before this point. This event will not fire if the source is lost
        // state has the current state of the source including id, position, kind, etc.
    }
}
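
One common use of the detected and lost events is to keep track of which sources are currently active. The following is a minimal sketch of that idea, not a prescribed pattern; the class name ActiveSourceTracker and its members are hypothetical, and it relies only on state.source.id and state.source.kind.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.VR.WSA.Input;

// Hypothetical component that remembers which interaction sources are active.
public class ActiveSourceTracker : MonoBehaviour
{
    // Maps source id to source kind for every source detected and not yet lost.
    private readonly Dictionary<uint, InteractionSourceKind> activeSources =
        new Dictionary<uint, InteractionSourceKind>();

    public int ActiveSourceCount
    {
        get { return activeSources.Count; }
    }

    void OnEnable()
    {
        InteractionManager.SourceDetected += OnSourceDetected;
        InteractionManager.SourceLost += OnSourceLost;
    }

    void OnDisable()
    {
        InteractionManager.SourceDetected -= OnSourceDetected;
        InteractionManager.SourceLost -= OnSourceLost;
        activeSources.Clear();
    }

    void OnSourceDetected(InteractionSourceState state)
    {
        activeSources[state.source.id] = state.source.kind;
        Debug.Log("Detected " + state.source.kind + " with id " + state.source.id);
    }

    void OnSourceLost(InteractionSourceState state)
    {
        // No further events arrive for this id until it is detected again.
        activeSources.Remove(state.source.id);
        Debug.Log("Lost source with id " + state.source.id);
    }
}
```

Subscribing in OnEnable and unsubscribing in OnDisable keeps the handlers from running while the component is disabled, which is one reasonable alternative to the Start/OnDestroy pairing used in the example above.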

See also