Gaze and gestures in DirectX

If you're going to build directly on top of the platform, you will have to handle input coming from the user, such as where the user is looking (gaze) and what the user has selected (gestures). By combining these two forms of input, you can let a user place a hologram in your app. The holographic app template has an easy-to-use example.

Gaze input

To access the user's gaze, you use the SpatialPointerPose type. The holographic app template includes basic code for understanding gaze. This code provides a vector pointing forward from between the user's eyes, taking into account the device's position and orientation in a given coordinate system.

SpinningCubeRenderer
void SpinningCubeRenderer::PositionHologram(SpatialPointerPose^ pointerPose)
{
    if (pointerPose != nullptr)
    {
        // Get the gaze direction relative to the given coordinate system.
        const float3 headPosition    = pointerPose->Head->Position;
        const float3 headDirection   = pointerPose->Head->ForwardDirection;
    
        // The hologram is positioned two meters along the user's gaze direction.
        static const float distanceFromUser = 2.0f; // meters
        const float3 gazeAtTwoMeters        = headPosition + (distanceFromUser * headDirection);
    
        // This will be used as the translation component of the hologram's
        // model transform.
        SetPosition(gazeAtTwoMeters);
    }
}
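To look at the arithmetic in isolation, here is a minimal portable C++ sketch of the same placement computation. Float3 and GazePointAt are hypothetical stand-ins introduced only for illustration; the template itself works with float3 and the platform's vector operators. The sketch assumes the forward direction is a unit vector.

```cpp
#include <cassert>

// Hypothetical stand-in for the float3 type used by the template.
struct Float3 { float x, y, z; };

// Returns the point `distance` meters from `headPosition` along `headDirection`.
// Assumption: headDirection has unit length, so `distance` is in meters.
Float3 GazePointAt(const Float3& headPosition, const Float3& headDirection, float distance)
{
    assert(distance >= 0.0f);
    return { headPosition.x + distance * headDirection.x,
             headPosition.y + distance * headDirection.y,
             headPosition.z + distance * headDirection.z };
}
```

For example, with the head at (0, 1.6, 0) looking along (0, 0, -1), a distance of 2 places the hologram at (0, 1.6, -2).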

You may find yourself asking: "But where does the coordinate system come from?"

Let's answer that question. In our AppMain's Update function, we process a spatial input event by acquiring its pointer pose relative to the coordinate system of our StationaryReferenceFrame. Recall that the StationaryReferenceFrame was created when we set up the HolographicSpace, and that the coordinate system was acquired at the start of Update.

AppMain::Update
// Check for new input state since the last frame.
SpatialInteractionSourceState^ pointerState = m_spatialInputHandler->CheckForInput();
if (pointerState != nullptr)
{
    // When a Pressed gesture is detected, the sample hologram will be repositioned
    // two meters in front of the user.
    m_spinningCubeRenderer->PositionHologram(
        pointerState->TryGetPointerPose(currentCoordinateSystem)
        );
}

Note that the pointer pose is tied to a pointer state, which we get from a spatial input event. TryGetPointerPose takes a coordinate system as its argument, so you can always relate the gaze direction at the time of the event to whatever spatial coordinate system you need. In fact, you must supply one in order to get the pointer pose.

Gesture input

There are two levels of gestures that you can access on HoloLens:

Interactions: SpatialInteractionManager

To detect low-level presses, releases and updates across hands and input devices on Windows Holographic, you start from a SpatialInteractionManager. The SpatialInteractionManager has an event that informs the app when hand or (for example) clicker input is detected. Note that the "Select" voice command also injects press and release input events.

The template registers for the manager's SourcePressed event in SpatialInputHandler.cpp. This event is sent to your app asynchronously. Your app or game engine may want to perform some processing right away, or you may want to queue up the event data in your input processing routine.

The template includes a helper class to get you started; for simplicity of design, it forgoes any processing of its own. The helper class, implemented in SpatialInputHandler.cpp, keeps track of whether one or more Pressed events occurred since the last Update call.

If so, its CheckForInput method returns the SpatialInteractionSourceState for the most recent input event during the next Update. If not, it returns nullptr.
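The "most recent press since the last Update" behavior can be sketched in portable C++ as follows. This is not the template's code: SourceState and PressLatch are hypothetical stand-ins, and the mutex is an assumption added because the event arrives asynchronously.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for SpatialInteractionSourceState.
struct SourceState { int id; };

// Remembers only the most recent press until the next Update consumes it.
class PressLatch
{
public:
    // Called from the (asynchronous) pressed-event handler.
    void OnPressed(std::shared_ptr<const SourceState> state)
    {
        assert(state != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_latest = std::move(state);   // a later press overwrites an earlier one
    }

    // Called once per Update. Returns the most recent press, or nullptr if no
    // press occurred since the previous call, and clears the latch.
    std::shared_ptr<const SourceState> CheckForInput()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::shared_ptr<const SourceState> result = std::move(m_latest);
        m_latest.reset();              // make the cleared state explicit
        return result;
    }

private:
    std::mutex m_mutex;
    std::shared_ptr<const SourceState> m_latest;
};
```

Because only the latest state is kept, several presses within one frame collapse into one. An app that needs every press would queue the states instead.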

You can also use the other events on SpatialInteractionManager, such as SourceDetected and SourceLost, to react when hands enter or leave the device's view, when they move in or out of the ready position (index finger raised with palm forward), or when spatial input devices are attached to or detached from the system.

Gestures: SpatialGestureRecognizer

A SpatialGestureRecognizer interprets user interactions from hands, clickers, and the "Select" voice command to surface spatial gesture events, which users target using their gaze.

Spatial gestures are a key form of input for Windows Mixed Reality apps. By routing interactions from the SpatialInteractionManager to a hologram's SpatialGestureRecognizer, apps can detect Tap, Hold, Manipulation, and Navigation events uniformly across hands, voice, and spatial input devices.

SpatialGestureRecognizer performs only the minimal disambiguation between the set of gestures that you request. For example, if you request just Tap, the user may hold their finger down as long as they like and a Tap will still occur. If you request both Tap and Hold, after about a second of holding down their finger, the gesture will promote to a Hold and a Tap will no longer occur.
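The Tap and Hold example above can be modeled with a small decision function. This is only an illustration of the described behavior, not the recognizer's implementation: the flag names, Outcome, ResolveRelease, and the exact one-second threshold are all assumptions (the text says only "about a second").

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flags standing in for two of the requestable gesture settings.
enum GestureFlags : std::uint32_t { kNone = 0, kTap = 1u << 0, kHold = 1u << 1 };

enum class Outcome { None, Tap, Hold };

// Assumed promotion time; the real value is not stated precisely.
constexpr double kHoldThresholdSeconds = 1.0;

// Decides which gesture results from a press that lasted `heldSeconds`.
Outcome ResolveRelease(std::uint32_t requested, double heldSeconds)
{
    assert(heldSeconds >= 0.0);
    const bool wantsTap  = (requested & kTap) != 0;
    const bool wantsHold = (requested & kHold) != 0;

    if (wantsTap && wantsHold)
    {
        // Both requested: a long press promotes to Hold and no Tap occurs.
        return heldSeconds >= kHoldThresholdSeconds ? Outcome::Hold : Outcome::Tap;
    }
    if (wantsTap)
    {
        // Only Tap requested: the press duration does not matter.
        return Outcome::Tap;
    }
    // Placeholder: other combinations are not described in the text.
    return Outcome::None;
}
```

With only kTap requested, a five-second press still yields Outcome::Tap. With kTap | kHold requested, a 0.2-second press yields Outcome::Tap and a 1.5-second press yields Outcome::Hold.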

To use SpatialGestureRecognizer, handle the SpatialInteractionManager's InteractionDetected event and grab the SpatialPointerPose exposed there. Use the user's gaze ray from this pose to intersect with the holograms and surface meshes in the user's surroundings, in order to determine what the user is intending to interact with. Then, route the SpatialInteraction in the event arguments to the target hologram's SpatialGestureRecognizer, using its CaptureInteraction method. This starts interpreting that interaction according to the SpatialGestureSettings set on that recognizer at creation time - or by TrySetGestureSettings.
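The targeting step (intersecting the gaze ray with holograms) can be sketched in portable C++. The sketch assumes each hologram is approximated by a bounding sphere and ignores surface meshes; Vec3, HologramBounds, and PickGazeTarget are hypothetical names, and the gaze direction is assumed to be unit length. In a real app, the index returned here would identify the hologram whose recognizer receives the interaction.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

static float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

// Hypothetical proxy for a hologram: a bounding sphere.
struct HologramBounds { Vec3 center; float radius; };

// Returns the index of the nearest hologram hit by the gaze ray, or -1 if none.
int PickGazeTarget(const Vec3& origin, const Vec3& direction,
                   const std::vector<HologramBounds>& holograms)
{
    // Assumption: direction is (approximately) unit length.
    assert(std::fabs(Dot(direction, direction) - 1.0f) < 1e-3f);

    int best = -1;
    float bestDistance = std::numeric_limits<float>::infinity();
    for (std::size_t i = 0; i < holograms.size(); ++i)
    {
        const Vec3 toCenter = Sub(holograms[i].center, origin);
        const float along = Dot(toCenter, direction);   // ray distance to closest approach
        if (along < 0.0f) continue;                     // sphere center is behind the user

        const float missSq = Dot(toCenter, toCenter) - along * along;
        const float radiusSq = holograms[i].radius * holograms[i].radius;
        if (missSq > radiusSq) continue;                // ray passes outside the sphere

        // Distance along the ray to the point where it enters the sphere.
        const float hitDistance = along - std::sqrt(radiusSq - missSq);
        if (hitDistance < bestDistance)
        {
            bestDistance = hitDistance;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Choosing the nearest hit means a hologram in front occludes one behind it along the same gaze ray.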

On HoloLens, interactions and gestures should generally derive their targeting from the user's gaze, rather than trying to render or interact at the hand's location directly. Once an interaction has started, relative motions of the hand may be used to control the gesture, as with the Manipulation or Navigation gesture.

Motion controller input

Windows Mixed Reality also supports motion controllers, which are accessed using a superset of the same SpatialInteractionSource APIs that are used for hand gestures on HoloLens. A SpatialInteractionSource representing a motion controller will have a SpatialInteractionSourceKind of Controller. Motion controllers can offer a variety of capabilities, for example: analog triggers, touch pads, and thumbsticks.

Motion controllers use all spatial input events. In the following example, we will use the SpatialInteractionManager::SourceUpdated event to detect grasp gestures (where available) and use them to reposition the cube.

Add the following private member declarations to SpatialInputHandler.h:

void OnSourceUpdated(
    Windows::UI::Input::Spatial::SpatialInteractionManager^ sender,
    Windows::UI::Input::Spatial::SpatialInteractionSourceEventArgs^ args);
Windows::Foundation::EventRegistrationToken m_sourceUpdatedEventToken;

Open SpatialInputHandler.cpp. Add the following event registration to the constructor:

m_sourceUpdatedEventToken =
    m_interactionManager->SourceUpdated +=
        ref new TypedEventHandler<SpatialInteractionManager^, SpatialInteractionSourceEventArgs^>(
            bind(&SpatialInputHandler::OnSourceUpdated, this, _1, _2)
            );

This is the event handler code. If the input source is a controller that supports grasp, and the grasp button is pressed, the source state is stored for the next update loop. If the controller does not support grasp, the handler checks whether the source is pressed instead.

void SpatialInputHandler::OnSourceUpdated(SpatialInteractionManager^ sender, SpatialInteractionSourceEventArgs^ args)
{
    if (args->State->Source->Kind == SpatialInteractionSourceKind::Controller)
    {
        if (args->State->Source->IsGraspSupported)
        {
            if (args->State->IsGrasped)
            {
                m_sourceState = args->State;
            }
        }
        else
        {
            if (args->State->IsPressed)
            {
                m_sourceState = args->State;
            }
        }
    }
}

Make sure to unregister the event handler in the destructor:

m_interactionManager->SourceUpdated -= m_sourceUpdatedEventToken;

Recompile, and then redeploy. Your template project should now be able to recognize grasp gestures to reposition the spinning cube.

The SpatialInteractionSource API supports controllers with a wide range of capabilities. In the example shown above, we check to see if grasp is supported before trying to use it. The SpatialInteractionSource supports the following optional features:

Grasp button: Check support by inspecting the IsGraspSupported property. When supported, check the SpatialInteractionSourceState::IsGrasped property to find out if grasp is activated.

Menu button: Check support by inspecting the IsMenuSupported property. When supported, check the SpatialInteractionSourceState::IsMenuPressed property to find out if the menu button is pressed.

Pointer pose: When IsPointerPoseSupported is true, the motion controller can provide an indication of the direction it is pointing. When that is the case, you can attempt to get the current pointer pose for the motion controller by using SpatialInteractionSourceState::TryGetPointerPose. The SpatialPointerPose it returns will contain pointing information for all spatial input devices attached to the system, including the motion controller and the user's head. For more info see SpatialPointerPose.

For controllers, the SpatialInteractionSource has a Controller property with additional capabilities.

HasThumbstick: If true, the controller has a thumbstick. Inspect the ControllerProperties property of the SpatialInteractionSourceState to acquire the thumbstick x and y values, as well as the button state.

HasTouchpad: If true, the controller has a touchpad. Inspect the ControllerProperties property of the SpatialInteractionSourceState to acquire the touchpad x and y values, and to know if the user is touching the pad and if they are pressing the touchpad down. If the user is not touching the touchpad, the x and y values will always be 0.

SimpleHapticsController: The SimpleHapticsController API for the controller allows you to inspect the haptics capabilities of the controller, and it also allows you to control haptic feedback.

Note that the touchpad and thumbstick values range from -1 to 1 on both axes. The analog trigger, which is accessed using the SpatialInteractionSourceState::SelectPressedValue property, ranges from 0 to 1. A value of 1 correlates with IsSelectPressed being equal to true; any other value correlates with IsSelectPressed being equal to false.
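The value ranges above can be captured in a short portable C++ sketch. AxisReading, IsValidAxisReading, and SelectPressedFromValue are hypothetical helpers for illustration; they model the stated relationship and are not part of the SpatialInteractionSourceState API.

```cpp
#include <cassert>

// Hypothetical snapshot of a thumbstick or touchpad reading.
struct AxisReading { double x, y; };

// True when both axes fall inside the stated [-1, 1] range.
bool IsValidAxisReading(const AxisReading& reading)
{
    return reading.x >= -1.0 && reading.x <= 1.0 &&
           reading.y >= -1.0 && reading.y <= 1.0;
}

// Models the stated relationship between the analog trigger value and the
// pressed flag: only a fully pulled trigger (value 1) counts as pressed.
bool SelectPressedFromValue(double selectPressedValue)
{
    assert(selectPressedValue >= 0.0 && selectPressedValue <= 1.0);
    return selectPressedValue == 1.0;
}
```

An app that wants to respond to a partial trigger pull would read the analog value directly rather than relying on the pressed flag.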

As with any other spatial input, additional controller properties are available in the SpatialInteractionSourceState::Properties property. Note that for a controller, using Properties::TryGetLocation will provide the hand position - which is distinct from the pointer pose. If you want to draw something at the hand location, use TryGetLocation. If you want to understand the pointer direction, use SpatialPointerInteractionSourcePose.