At their core, mixed reality apps place holograms in your world that look and sound like real objects. This involves precisely positioning and orienting those holograms at places in the world that are meaningful to the user, whether the world is their physical room or a virtual realm you've created. When reasoning about the position and orientation of your holograms, or any other geometry such as the gaze ray or hand positions, Windows provides various real-world coordinate systems in which that geometry can be expressed.
|Feature||HoloLens||Immersive headsets|
|Stationary frame of reference||✔️||✔️|
|Stage frame of reference||Not supported yet||✔️|
|Attached frame of reference||✔️||✔️|
Mixed reality apps can design for a broad range of user experiences, from 360-degree video viewers that just need the headset's orientation, to full world-scale apps and games, which need spatial mapping and spatial anchors:
|Experience scale||Requirements||Example experience|
|Orientation-only||Headset orientation (gravity-aligned)||360° video viewer|
|Seated-scale||Above, plus headset position relative to zero position||Elite: Dangerous|
|Standing-scale||Above, plus stage floor origin||Obduction|
|Room-scale||Above, plus stage bounds polygon||Fantastic Contraption|
|World-scale||Spatial anchors (and typically spatial mapping)||RoboRaid|
These experience scales follow a "nesting dolls" model. The key design principle here for Windows Mixed Reality is that a given headset supports apps built for a target experience scale, as well as all lesser scales:
|6DOF tracking||Floor defined||360° tracking||Bounds defined||Spatial anchors||Max experience|
|Yes||Yes||No||-||-||Standing - Forward|
|Yes||Yes||Yes||No||-||Standing - 360°|
Note that the Stage frame of reference is not yet supported on HoloLens. A room-scale app on HoloLens currently needs to use spatial mapping to find the user's floor and walls.
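The "nesting dolls" model can be sketched as an ordered enumeration. This is only an illustration of the ordering described above, not a Windows API: the names `ExperienceScale`, `headset_can_run`, and `supported_scales` are hypothetical.

```python
from enum import IntEnum
from typing import List


class ExperienceScale(IntEnum):
    """Experience scales from the table above, ordered from smallest to largest."""
    ORIENTATION_ONLY = 0
    SEATED = 1
    STANDING = 2
    ROOM = 3
    WORLD = 4


def headset_can_run(headset_max: ExperienceScale, app_target: ExperienceScale) -> bool:
    """A headset supports its maximum scale and every lesser scale."""
    return app_target <= headset_max


def supported_scales(headset_max: ExperienceScale) -> List[ExperienceScale]:
    """List every scale nested inside the headset's maximum scale."""
    return [scale for scale in ExperienceScale if scale <= headset_max]
```

For example, a headset whose maximum is standing-scale would run a seated-scale app, but not a room-scale one.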
All 3D graphics applications use Cartesian coordinate systems to reason about the positions and orientations of objects in the virtual worlds they render. Such coordinate systems establish 3 perpendicular axes along which to position objects: an X, Y, and Z axis.
In mixed reality, your apps will reason about both virtual and physical coordinate systems. Windows calls a coordinate system that has real meaning in the physical world a spatial coordinate system.
Spatial coordinate systems express their coordinate values in meters. This means that objects placed 2 units apart along the X, Y, or Z axis will appear 2 meters apart from one another when rendered in mixed reality. This lets you easily render objects and environments at real-world scale.
In general, Cartesian coordinate systems can be either right-handed or left-handed. Spatial coordinate systems on Windows are always right-handed, which means that the positive X-axis points right, the positive Y-axis points up (aligned to gravity) and the positive Z-axis points towards you.
In both kinds of coordinate systems, the positive X-axis points to the right and the positive Y-axis points up. The difference is whether the positive Z-axis points towards or away from you. You can remember which direction the positive Z-axis points by pointing the fingers of either your left or right hand in the positive X direction and curling them to the positive Y direction. The direction your thumb points, either toward or away from you, is the direction that the positive Z-axis points for that coordinate system.
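The hand rule above corresponds to a cross product: in a right-handed system, X cross Y gives positive Z. The following is a minimal sketch with plain tuples; the helper names are hypothetical. It also shows the usual conversion between a right-handed system and a left-handed one (such as Unity's) that shares the same X and Y axes, which is to negate Z.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as (x, y, z) tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def is_right_handed(x_axis, y_axis, z_axis):
    """True if the axes, expressed in a right-handed reference frame,
    themselves form a right-handed basis (X cross Y points along +Z)."""
    return dot(cross(x_axis, y_axis), z_axis) > 0


def flip_handedness(point):
    """Convert a point between left- and right-handed systems that share
    X (right) and Y (up), by negating Z."""
    x, y, z = point
    return (x, y, -z)


# In a right-handed spatial coordinate system +Z points toward the user,
# so a hologram 2 meters in front of the user has a negative Z value.
hologram_in_front = (0.0, 0.0, -2.0)
```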
Describing the position of a hologram in the real world requires a reference point that remains stationary as the device moves through the environment. The system provides a simple affordance for this purpose, called a "stationary frame of reference". The coordinate system provided by this frame of reference works to keep the positions of objects near the user as stable as possible, relative to the world.
In a game engine such as Unity, a stationary frame of reference is what defines the engine's "world origin". Objects that are placed at a specific world coordinate use the stationary frame of reference to define their position in the real world using those same coordinates. An app will typically create one stationary frame of reference on startup and use its coordinate system throughout the app's lifetime.
Over time, as the system learns more about the user's environment, it may determine that distances between various points in the real world are shorter or longer than it previously believed. The system therefore adjusts its understanding as the user walks around a larger area, and as a result holograms that you've placed in a stationary coordinate system may be seen to drift away from their original positions.
The stationary frame of reference on its own can be used to build seated-scale experiences. In Unity, you just start placing content relative to the origin, which will be at the user's initial head position and orientation, recentering the origin as needed.
To go beyond seated-scale for immersive headsets, you can use the "stage frame of reference".
When first setting up an immersive headset, the user defines a stage, which represents the room in which they will experience mixed reality. The stage minimally defines a stage origin, a spatial coordinate system centered at the user's chosen floor position and forward orientation where they intend to use the device. By placing content in this stage coordinate system at the Y=0 floor plane, you can ensure your holograms appear comfortably on the floor when the user is standing, providing users a standing-scale experience.
The user may also optionally define stage bounds, an area within the room that they've cleared of furniture where they intend to move around in mixed reality. If so, the app can build a room-scale experience, using these bounds to ensure that holograms are always placed where the user can reach them.
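As a rough sketch of how an app might use these two pieces of stage data, the code below treats the stage bounds as a polygon on the floor plane and only places a hologram at Y=0 when the requested spot falls inside it. The function names and the rectangular example bounds are assumptions for illustration; they are not part of any Windows API.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test on the floor plane.

    point is (x, z); polygon is a list of (x, z) vertices in stage coordinates.
    """
    x, z = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, z1 = polygon[i]
        x2, z2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (z1 > z) != (z2 > z):
            # X coordinate where the edge crosses that line.
            x_cross = x1 + (z - z1) * (x2 - x1) / (z2 - z1)
            if x < x_cross:
                inside = not inside
    return inside


def place_on_floor(x, z, bounds):
    """Return a stage-space (x, y, z) position on the Y=0 floor plane,
    or None if the spot lies outside the stage bounds."""
    if point_in_polygon((x, z), bounds):
        return (x, 0.0, z)
    return None


# Hypothetical 3 m x 2 m cleared area centered on the stage origin.
example_bounds = [(-1.5, -1.0), (1.5, -1.0), (1.5, 1.0), (-1.5, 1.0)]
```

If the user has not defined bounds, an app would fall back to a standing-scale layout around the stage origin instead.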
Because the stage frame of reference provides a single fixed coordinate system within which to place floor-relative content, it is the easiest path for porting standing-scale and room-scale applications developed for virtual reality headsets. However, as with those VR platforms, a single coordinate system can only stabilize content in about a 5 meter (16 foot) diameter, before lever-arm effects cause content far from the center to shift noticeably as the system adjusts. To go beyond 5 meters, spatial anchors are needed.
To avoid drift and ensure that a hologram remains exactly at a specific spot in the world, even as the system discovers more about the world, you can place that hologram using a spatial anchor. A spatial anchor represents an important point in the world that the system should keep track of over time. Each anchor has a coordinate system that adjusts as needed, relative to other spatial anchors or frames of reference, in order to ensure that anchored holograms stay precisely in place.
Rendering a hologram in a spatial anchor's coordinate system gives you the most accurate positioning for that hologram at any given time. This comes at the cost of small adjustments over time to the hologram's position as the system continually moves it back into place relative to the real world.
Today, when writing games, data visualization apps, or virtual reality apps, the typical approach is to establish one absolute world coordinate system that all other coordinates can reliably map back to. In that environment, you can always find a stable transform that defines a relationship between any two objects in that world. If you didn't move those objects their relative transform would always remain the same. This kind of global coordinate system works well when rendering a purely virtual world where you know all of the geometry in advance.
In contrast, an untethered mixed reality device such as HoloLens has a dynamic sensor-driven understanding of the world, continuously adjusting its knowledge over time of the spatial anchors you create as the user walks many meters across an entire floor of a building. As a result, your app must be prepared for the spatial anchors you create to change their relationships to each other over time.
For example, the device may currently believe two locations in the world to be 4 meters apart, and then later refine that understanding, learning that the locations are in fact 3.9 meters apart. If holograms had initially been placed at those two locations, 4 meters apart in a single rigid coordinate system, one of them would then always appear 0.1 meters off from the real world.
Windows Mixed Reality solves this issue by letting you create spatial anchors to mark important points in the world where the user has placed holograms. As the device learns about the world, these spatial anchors can adjust their position relative to one another as needed to ensure that each anchor stays precisely where it was placed relative to the real world. By placing a hologram in the coordinate system of a nearby spatial anchor, you can ensure that this hologram maintains optimal stability.
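The 4 meter versus 3.9 meter example can be worked through in one dimension. In this sketch, `Anchor` is a hypothetical stand-in for a spatial anchor, and the assignment to `world_x` stands in for an adjustment that the system, not the app, would make.

```python
from dataclasses import dataclass


@dataclass
class Anchor:
    """Hypothetical 1D stand-in for a spatial anchor; the system may refine world_x."""
    world_x: float


def rendered_x(anchor: Anchor, local_offset: float) -> float:
    """World position of a hologram placed at local_offset from its anchor."""
    return anchor.world_x + local_offset


# The device first believes spots A and B are 4.0 m apart, then learns it is 3.9 m.
believed_b, refined_b = 4.0, 3.9

# Single rigid coordinate system: both holograms hang off one origin at spot A,
# so hologram B keeps its original 4.0 m offset after the refinement.
origin = Anchor(world_x=0.0)
rigid_error = abs(rendered_x(origin, believed_b) - refined_b)

# One anchor per hologram: hologram B sits at its own anchor's origin, and the
# system moves that anchor to the refined location.
anchor_b = Anchor(world_x=believed_b)
anchor_b.world_x = refined_b  # adjustment made by the system, not the app
anchored_error = abs(rendered_x(anchor_b, 0.0) - refined_b)
```

Here `rigid_error` comes out at about 0.1 meters, while `anchored_error` is zero because the hologram moves with its anchor.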
This continuous adjustment of spatial anchors relative to one another is the key difference between coordinate systems from spatial anchors and those from stationary frames of reference.
In contrast to a stationary frame of reference, which always optimizes for stability near the user, the stage frame of reference and spatial anchors ensure stability near their origins. This helps those holograms stay precisely in place over time, but it also means that holograms rendered too far away from their coordinate system's origin will experience increasingly severe lever-arm effects. This is because small adjustments to the position and orientation of the stage or anchor are magnified in proportion to the distance from that origin. A good rule of thumb is to ensure that anything you render based on a distant spatial anchor's coordinate system is within about 3 meters of its origin. For a nearby stage origin, rendering distant content is OK, as any increased positional error will affect only small holograms that will not shift much in the user's view.
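The lever-arm effect follows from basic geometry: if a coordinate system rotates by a small angle about its origin, a point at distance d from that origin moves along a chord of length 2·d·sin(θ/2), which grows linearly with d. The sketch below assumes an illustrative 1 degree adjustment; that figure is chosen only for the example and is not a documented property of the tracking system.

```python
import math


def lever_arm_shift(distance_m: float, rotation_deg: float) -> float:
    """Distance (in meters) that a point moves when its coordinate system
    rotates by rotation_deg about the origin, with the point distance_m away."""
    theta = math.radians(rotation_deg)
    return 2.0 * distance_m * math.sin(theta / 2.0)


# Same hypothetical 1 degree adjustment, two different distances from the anchor.
shift_near = lever_arm_shift(3.0, 1.0)   # within the 3 meter rule of thumb
shift_far = lever_arm_shift(10.0, 1.0)   # well beyond it
```

With these assumed numbers, the hologram 3 meters from the anchor moves by roughly 5 centimeters, while the one 10 meters away moves by roughly 17 centimeters.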
Spatial anchors can also allow your app to remember an important location even after your app suspends or the device is shut down.
You can save to disk the spatial anchors your app creates, and then load them back again later, by persisting them to your app's spatial anchor store. When saving or loading an anchor, you provide a string key that is meaningful to your app, in order to identify the anchor later. Think of this key as the filename for your anchor. If you want to associate other data with that anchor, such as a 3D model that the user placed at that location, save that to your app's local storage and associate it with the key you chose.
By persisting anchors to the store, your users can place individual holograms or place a workspace around which an app will place its various holograms, and then find those holograms later where they expect them, over many uses of your app.
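The key-as-filename idea can be sketched as follows. `FakeAnchorStore` and the `local_data` dictionary are hypothetical in-memory stand-ins for the app's spatial anchor store and its local storage; the real store's interface may differ, so treat this purely as an illustration of keeping the anchor and its associated data under the same key.

```python
class FakeAnchorStore:
    """Hypothetical in-memory stand-in for an app's spatial anchor store."""

    def __init__(self):
        self._anchors = {}

    def save(self, key, anchor):
        self._anchors[key] = anchor

    def load(self, key):
        return self._anchors.get(key)


def save_placement(store, local_data, key, anchor, model_name):
    """Persist the anchor under `key`, and record the associated model
    under the same key in the app's own storage."""
    store.save(key, anchor)
    local_data[key] = {"model": model_name}


def load_placement(store, local_data, key):
    """Return (anchor, model_name) for `key`, or None if either part is missing."""
    anchor = store.load(key)
    record = local_data.get(key)
    if anchor is None or record is None:
        return None
    return anchor, record["model"]
```

A call such as `save_placement(store, local_data, "desk-globe", anchor, "globe.glb")` followed later by `load_placement(store, local_data, "desk-globe")` would return the same anchor and model name; both the key and the model name here are made up for the example.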
Your app can also share spatial anchors with other devices. By transferring a spatial anchor along with its supporting understanding of the environment and sensor data around it from one HoloLens to another, both devices can then reason about the same location. By having each device render a hologram using that shared spatial anchor, both users will see the hologram appear at the same place in the real world.
Some holograms are designed to follow the user, floating at a chosen heading and distance from the user at all times. These holograms can be placed in an "attached frame of reference", which moves with the user as they walk around. One key note is that an attached frame of reference has a fixed orientation, defined when it's first created. The reference frame does not rotate as the user turns their head or body. This lets the user comfortably look around at various holograms placed within that frame of reference, while still bringing those holograms along as the user walks around. Content rendered with this behavior relative to the user is called 'body-locked' content.
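The behavior described above can be modeled in a few lines: the frame's origin follows the user's position, while its heading stays fixed at the value chosen on creation. `AttachedFrame` here is a hypothetical model for illustration, not the Windows type, and it assumes a right-handed system in which forward is negative Z.

```python
import math


class AttachedFrame:
    """Hypothetical model of an attached frame of reference: it follows the
    user's position but keeps the heading it was created with."""

    def __init__(self, heading_deg: float):
        self.heading = math.radians(heading_deg)  # fixed at creation

    def to_world(self, user_position, offset):
        """Map an (x, y, z) offset in this frame to world coordinates.

        The user's current head rotation is deliberately not an input:
        only their position moves the frame.
        """
        ox, oy, oz = offset
        c, s = math.cos(self.heading), math.sin(self.heading)
        # Rotate the offset about the Y (up) axis by the fixed heading.
        rx = c * ox + s * oz
        rz = -s * ox + c * oz
        ux, uy, uz = user_position
        return (ux + rx, uy + oy, uz + rz)


frame = AttachedFrame(heading_deg=0.0)
offset = (0.0, 0.0, -2.0)  # 2 meters along the frame's forward (-Z) direction

before = frame.to_world((0.0, 1.6, 0.0), offset)
after_walking = frame.to_world((3.0, 1.6, 1.0), offset)
```

The hologram keeps the same offset from the user after they walk, and turning the head changes nothing because head rotation never enters the calculation.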
When the headset can't figure out where it is in the world, an attached frame of reference provides the only coordinate system which can be used to render holograms. This makes it ideal for displaying fallback UI to tell the user that their device can't find them in the world. All apps should include such a fallback to help the user get things working again with UI similar to that shown in the Mixed Reality home.
We strongly discourage rendering head-locked content, which stays at a fixed spot in the display (such as a HUD). In general, head-locked content is uncomfortable for users and does not feel like a natural part of their world.
Head-locked content should usually be replaced with holograms that are attached to the user or placed in the world itself. For example, cursors should be pushed out into the world, scaling naturally to reflect the position and distance of the object under the user's gaze.
In some environments, such as dark hallways, it may not be possible for a headset using inside-out tracking to locate itself correctly in the world. If your app does not handle this case, holograms may either fail to show up or appear in incorrect places. We now discuss the conditions in which this can happen, its impact on user experience, and tips to best handle this situation.
Sometimes, the headset's sensors are not able to figure out where the headset is. This can happen if the room is dark, or if the sensors are covered by hair or hands, or if the surroundings do not have enough texture.
When this happens, the headset will be unable to track its position with enough accuracy to render world-locked holograms. You won't be able to figure out where a spatial anchor, stationary frame or stage frame is relative to the device, but you can still render body-locked content in the attached frame of reference.
Your app should tell the user how to get positional tracking back, rendering some fallback body-locked content that describes some tips, such as uncovering the sensors and turning on more lights.
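One way to structure this fallback is to decide, each frame, which kind of content to render based on whether positional tracking is available. The sketch below uses a hypothetical `Tracking` enum and `choose_content` function; it does not reflect the actual Windows API for querying tracking state.

```python
from enum import Enum


class Tracking(Enum):
    """Hypothetical tracking states, for illustration only."""
    POSITIONAL = "positional"
    ORIENTATION_ONLY = "orientation_only"


FALLBACK_TIPS = "Can't find your position. Try uncovering the sensors and turning on more lights."


def choose_content(tracking: Tracking) -> dict:
    """Pick what to render for the current tracking state."""
    if tracking is Tracking.POSITIONAL:
        # Anchors, stationary frames and stage frames can be located:
        # render the normal world-locked scene.
        return {"frame": "world-locked", "message": None}
    # Only the attached frame of reference is usable:
    # render body-locked guidance instead.
    return {"frame": "attached", "message": FALLBACK_TIPS}
```

Keeping this decision in one place makes it harder to accidentally render world-locked holograms while the device cannot locate them.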
Sometimes, the device cannot track properly if there are lots of dynamic changes in the environment, such as many people walking around in the room. In this case, the holograms may seem to jump or drift as the device tries to track itself in this dynamic environment. We recommend using the device in a less dynamic environment if you hit this scenario.
Sometimes, when you start using a headset in an environment which has undergone a lot of changes (e.g. significant movement of furniture or wall hangings), some holograms may appear shifted from their original locations. The earlier holograms may also jump around as the user moves around in this new space. This is because the system's understanding of your space no longer holds, and it tries to remap the environment while reconciling the features of the room. In this scenario, we advise encouraging users to re-place holograms they pinned in the world if they are not appearing where expected.
Sometimes, a home or other space may have two identical areas, such as two identical conference rooms, two identical corner areas, or two large identical posters that cover the device's field of view. In such scenarios, the device may at times confuse the identical parts and mark them as the same in its internal representation. This may cause holograms from some areas to appear in other locations, and the device may start to lose tracking often because its internal representation of the environment has been corrupted. In this case, we advise resetting the system's environmental understanding, which will let the headset track well in the unique areas of the environment. Note that resetting the map leads to the loss of all spatial anchor placements, and that the problem may recur if the device confuses the identical areas again.