Gestures

Gestures allow users to take action in mixed reality with their hands. For HoloLens, gesture input lets you interact with your holograms naturally, or you can optionally use the included HoloLens Clicker. While hand gestures do not provide a precise location in space, the simplicity of putting on a HoloLens and immediately interacting with content allows users to get to work without any accessories.

Device support

Feature     HoloLens    Immersive headsets
Gestures    X

Gaze-and-commit

To take actions, gestures use gaze as the targeting mechanism. The combination of gaze and a select gesture results in a gaze-and-commit interaction. An alternative to gaze-and-commit is point-and-commit, which is enabled by motion controllers. Apps that run only on HoloLens must support gaze-and-commit, since HoloLens does not support motion controllers. Apps that run on both HoloLens and immersive headsets should support both gaze-driven and pointing-driven interactions, to give users a choice of input device.
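
The gaze-and-commit flow can be sketched in a few lines of Python. This is an illustration only, not a HoloLens API: the `Hologram` class, the bounding-sphere hit test, and the `gaze_target` and `commit` functions are all assumptions made for the example. The point is that gaze picks the target, and the select gesture acts on whatever gaze has picked.

```python
import math
from dataclasses import dataclass


@dataclass
class Hologram:
    name: str
    center: tuple   # (x, y, z) in meters, hypothetical world coordinates
    radius: float   # bounding-sphere radius, an assumed simplification


def gaze_target(origin, direction, holograms):
    """Return the nearest hologram hit by the gaze ray, or None."""
    length = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / length for c in direction)  # normalize the gaze direction
    best, best_t = None, math.inf
    for h in holograms:
        oc = tuple(c - o for c, o in zip(h.center, origin))
        t = sum(a * b for a, b in zip(oc, d))  # distance along the ray to closest approach
        if t < 0:
            continue  # hologram is behind the user
        dist_sq = sum(a * a for a in oc) - t * t  # squared distance from ray to center
        if dist_sq <= h.radius ** 2 and t < best_t:
            best, best_t = h, t
    return best


def commit(origin, direction, holograms, select_pressed):
    """Gaze chooses the target; the select gesture commits the action."""
    target = gaze_target(origin, direction, holograms)
    if target is not None and select_pressed:
        return f"activated {target.name}"
    return None
```

A point-and-commit variant would reuse the same hit test, with the ray coming from a motion controller instead of the head.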

Hand recognition

HoloLens recognizes gesture input by tracking the position of either or both hands that are visible to the device. HoloLens sees hands when they are in either the ready state (back of the hand facing you with index finger up) or the pressed state (back of the hand facing you with the index finger down). When hands are in other poses, the HoloLens will ignore them.

HoloLens looks for hand input within a cone in front of the device, known as the gesture frame, which extends above, below, left and right of the display frame where holograms appear. This lets you keep your elbow comfortably at your side while providing hand input. When using the HoloLens Clicker, your hands do not need to be within the gesture frame.

For each hand that HoloLens detects, you can access its position (without orientation) and its pressed state. As the hand nears the edge of the gesture frame, you're also provided with a direction vector, which you can show to the user so they know how to move their hand to get it back where HoloLens can see it.
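
As a rough illustration of how an app might surface that direction vector, the Python sketch below turns it into a short text hint. The function name `guidance_hint`, the two-component vector, and the axis convention (+x to the user's right, +y up) are assumptions for this example and are not taken from the HoloLens API.

```python
def guidance_hint(direction):
    """Turn a guidance direction into a short on-screen hint.

    `direction` is a hypothetical (x, y) vector pointing where the hand
    should move, or None when the hand is comfortably inside the frame.
    Assumed convention: +x is the user's right, +y is up.
    """
    if direction is None:
        return None
    x, y = direction
    if x == 0 and y == 0:
        return None  # no usable direction, so show nothing
    if abs(x) >= abs(y):
        return "Move your hand right" if x > 0 else "Move your hand left"
    return "Move your hand up" if y > 0 else "Move your hand down"
```

An arrow rendered near the cursor would usually serve users better than text; the text hint simply keeps the example small.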

Interactions: Low-level spatial input

The core interactions for gestures are select and home.

  • Ready state for hand gestures on HoloLens
    Select is the primary interaction to activate a hologram, consisting of a press followed by a release. With gestures, you can do a select press by making a fist in front of you, with the back of your hand facing you. Your elbow should be bent at your side in a comfortable position. Now, raise your index finger to the sky and then tap, by flexing your index finger down (the press) and then back up (the release). This is also known as an air-tap.

    Other ways to perform a select are by pressing the single button on a HoloLens Clicker or by speaking the voice command "select." The same select interaction can be used within any app. Think of select as the equivalent of a mouse click, a universal action that you learn once and then apply across all your apps.
  • Bloom gesture on HoloLens
    Home is a special system action that is used to go back to the Start Menu. It is similar to pressing the Windows key on a keyboard or the Xbox button on an Xbox controller. On HoloLens, the hand gesture you perform to go Home is called Bloom. To do the bloom gesture on HoloLens, hold out your hand, palm up, with your fingertips together. Then open your hand. Note, you can also always return to Start by saying "Hey Cortana, Go Home". Apps cannot react specifically to home actions, as these are handled by the system.
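
Because a select is a press followed by a release, an app working at this low level can detect it by watching a hand's pressed state change over time. The Python sketch below assumes the pressed state is sampled once per frame as a boolean; the function name and the sampling model are hypothetical.

```python
def detect_selects(pressed_samples):
    """Count completed selects (press followed by release) in a sample stream."""
    selects = 0
    was_pressed = False
    for pressed in pressed_samples:
        if was_pressed and not pressed:
            selects += 1  # a release after a press completes one select
        was_pressed = pressed
    return selects
```

A press that has not yet been released does not count, which matches the press-then-release definition of select.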

Gestures: High-level spatial input

Select and hold gesture allows navigation like a virtual joystick
Your app can recognize more than just individual presses and releases. By combining presses and releases with movement of your hand, you can perform more complex gestures as well:

  • Tap: A Select press and release.
  • Hold: Holding a Select press beyond the system's Hold threshold.
  • Manipulation: A Select press, followed by absolute movement of your hand through the 3-dimensional world.
  • Navigation: A Select press, followed by relative movement of your hand or the controller within a 3-dimensional unit cube, potentially on axis-aligned rails. More on this below.

One benefit of using gesture recognition is that you can configure a gesture recognizer just for the gestures the currently targeted hologram can accept. That way, a hologram that supports only tap can accept any length of time between press and release, while a hologram that supports both tap and hold can promote the tap to a hold once the hold time threshold has passed.
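
The effect of configuring a recognizer per hologram can be sketched as follows. This is a simplified Python model, not the platform's recognizer: the `Gesture` flags, the `classify_press` function, and the 0.5-second threshold are assumptions (the real hold threshold is defined by the system).

```python
from enum import Flag, auto


class Gesture(Flag):
    TAP = auto()
    HOLD = auto()


HOLD_THRESHOLD_S = 0.5  # assumed value; the real threshold is set by the system


def classify_press(duration_s, accepted, hold_threshold_s=HOLD_THRESHOLD_S):
    """Classify a completed press given the gestures the target accepts."""
    if Gesture.HOLD in accepted and duration_s >= hold_threshold_s:
        return Gesture.HOLD  # long press on a target that accepts hold
    if Gesture.TAP in accepted:
        return Gesture.TAP   # tap-only targets accept a press of any length
    return None              # short press on a hold-only target
```

For example, a two-second press classifies as a tap when the target accepts only `Gesture.TAP`, and as a hold when it accepts `Gesture.TAP | Gesture.HOLD`.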

Gesture: How to apply in app

Tap

Tap gestures (as well as the other gestures below) react only to select presses. To detect other presses, such as Menu or Grasp, your app must directly use the lower-level interactions described above.

Hold

Hold gestures are similar to a touch tap-and-hold and can be used to take a secondary action, such as picking up an object instead of activating it, or showing a context menu.

Manipulation

Manipulation gestures can be used to move, resize or rotate a hologram when you want the hologram to react 1:1 to the user's hand movements. One use for such 1:1 movements is to let the user draw or paint in the world.

The initial targeting for a manipulation gesture should be done by gaze or pointing. Once the press starts, any manipulation of the object is then handled by hand movements, freeing the user to look around while they manipulate.
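
A 1:1 manipulation amounts to applying the hand's displacement since the press directly to the hologram. The Python sketch below shows that arithmetic; the function name and the use of plain (x, y, z) tuples are assumptions for illustration.

```python
def manipulated_position(hologram_start, hand_start, hand_now):
    """Apply the hand's absolute displacement 1:1 to the hologram.

    All arguments are (x, y, z) tuples in the same world coordinate system.
    """
    return tuple(
        h + (now - start)
        for h, start, now in zip(hologram_start, hand_start, hand_now)
    )
```

Note that gaze does not appear in the calculation: once the press has started, only the hand positions matter, which is what frees the user to look around.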

Navigation

Navigation gestures operate like a virtual joystick, and can be used to navigate UI widgets, such as radial menus. You press to start the gesture and then move your hand within a normalized 3D cube, centered around the initial press. You can move your hand along the X, Y or Z axis from a value of -1 to 1, with 0 being the starting point.
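
The normalized cube can be modeled as a clamp on the hand's offset from the initial press. In this Python sketch, the function name and the `cube_half_size_m` parameter (the physical distance that maps to a value of 1) are assumptions; the actual cube size is determined by the system.

```python
def navigation_offset(hand_start, hand_now, cube_half_size_m=0.1):
    """Map hand movement since the press into the normalized cube [-1, 1]^3.

    cube_half_size_m is an assumed physical half-width of the cube in meters.
    """
    return tuple(
        max(-1.0, min(1.0, (now - start) / cube_half_size_m))
        for start, now in zip(hand_start, hand_now)
    )
```

Movement beyond the edge of the cube saturates at -1 or 1 rather than growing further, much like a joystick pushed to its limit.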

Navigation can be used to build velocity-based continuous scrolling or zooming gestures, similar to scrolling a 2D UI by clicking the middle mouse button and then moving the mouse up and down.
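
Velocity-based scrolling treats the normalized navigation value as a speed rather than a position. The Python sketch below shows one way to do that, assuming a Y navigation value in the range -1 to 1; the function name and the `max_speed` parameter (in pixels per second) are hypothetical.

```python
def scroll_step(scroll_pos, nav_y, dt_s, max_speed=600.0):
    """Advance a scroll position using navigation as a velocity input.

    nav_y is the normalized Y navigation value in [-1, 1]; max_speed is an
    assumed top scroll speed in pixels per second; dt_s is the frame time.
    """
    nav_y = max(-1.0, min(1.0, nav_y))  # guard against out-of-range input
    return scroll_pos + nav_y * max_speed * dt_s
```

Calling this once per frame means that holding the hand still away from the starting point keeps the content scrolling at a constant rate, and returning the hand to the starting point stops it.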

Navigation with rails refers to the ability to recognize movement along a certain axis until a certain threshold is reached on that axis. This is useful only when the developer has enabled movement along more than one axis in an application, for example, when an application is configured to recognize navigation gestures along the X and Y axes but also specifies the X axis with rails. In this case, the system recognizes hand movements along the X axis as long as they remain within imaginary rails (a guide) on the X axis, even if some hand movement also occurs along the Y axis.
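
One possible reading of rails on the X axis is sketched below in Python: while the hand's drift on Y stays inside the rail, the movement is treated as pure X movement. The function name, the `rail_half_width` value, and the choice to fall back to free two-axis movement outside the rail are all assumptions; the system's actual thresholds and behavior are not specified here.

```python
def apply_x_rail(x, y, rail_half_width=0.3):
    """Snap normalized navigation to the X axis while Y drift stays in the rail.

    rail_half_width is an assumed tolerance, in normalized units, for how far
    the hand may drift on Y before it is considered to have left the rail.
    """
    if abs(y) <= rail_half_width:
        return (x, 0.0)  # inside the rail: treat as pure X movement
    return (x, y)        # outside the rail: report both axes unchanged
```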

Within 2D apps, users can use vertical navigation gestures to scroll, zoom, or drag inside the app. This injects virtual finger touches into the app to simulate touch gestures of the same type. Users can select which of these actions takes place by toggling between the tools on the bar above the app, either by selecting the button or by saying '<Scroll/Drag/Zoom> Tool'.

See also