The Depth API uses a Depth API-supported device’s RGB camera to create depth maps (also called depth images). You can then use the information in a depth map to make virtual objects appear accurately in front of or behind real-world objects, enabling immersive and realistic user experiences.
For example, the following images show a virtual Android figure in a real space containing a trunk beside a door. In the first image, the virtual Android unrealistically overlaps with the edge of the trunk. In the second image, the Android is properly occluded, appearing much more realistic in its surroundings.
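The occlusion decision described above comes down to a per-pixel depth comparison. The following is a minimal sketch of that idea, not ARCore's rendering code: the function names and the `tolerance_mm` parameter are hypothetical, and both depths are assumed to be in millimeters along the camera's optical axis.

```python
def is_occluded(virtual_depth_mm: int, real_depth_mm: int, tolerance_mm: int = 0) -> bool:
    """Return True when the real surface is closer to the camera than the virtual fragment.

    tolerance_mm is a hypothetical slack value that keeps depth noise from
    hiding fragments that sit almost exactly on a real surface.
    """
    return real_depth_mm + tolerance_mm < virtual_depth_mm


def occlusion_mask(virtual_depths_mm, real_depths_mm, tolerance_mm=0):
    """Apply the comparison to two equally sized sequences of per-pixel depths."""
    return [
        is_occluded(v, r, tolerance_mm)
        for v, r in zip(virtual_depths_mm, real_depths_mm)
    ]


# A virtual object at 2 m behind a real surface at 1.5 m is hidden;
# the same object at 1 m stays visible.
print(occlusion_mask([2000, 1000], [1500, 1500]))
```

In a real renderer this comparison would typically run in a fragment shader, often with a soft blend near the boundary instead of a hard cutoff.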
Depth API-supported devices
Refer to the ARCore supported devices page for an up-to-date list of devices that support the Depth API. This page also lists devices that have a supported hardware depth sensor, such as a time-of-flight sensor (or ToF sensor).
The Depth API uses a depth-from-motion algorithm to create depth maps.
This algorithm takes multiple device images from different angles and compares them to estimate the distance to every pixel as a user moves their phone. If the device has a supported hardware depth sensor, such as a time-of-flight sensor (or ToF sensor), that data is automatically included in the processed depth. This enhances the existing depth map and enables depth even when the camera is not moving. It also provides better depth on surfaces with few or no features, such as white walls, or in dynamic scenes with moving people or objects.
The following images show a camera image of a hallway with a bicycle on the wall, and a visualization of the depth map that is created from the camera images.
Format of a depth map
Each depth map is represented as an image. ARCore uses a 16-bit depth format for depth maps, with the three most-significant bits always set to zero.
Each pixel in the depth map is represented by an unsigned 16-bit integer. The least significant 13 bits contain the distance in millimeters to the estimated surface from the camera's image plane, along the camera's optical axis.
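As an illustration of the pixel layout described above, here is a minimal sketch that extracts the distance from a raw 16-bit sample. The little-endian byte order, the `row_stride_bytes` parameter, and the function names are assumptions made for this example. On a device you would read the byte order and row stride from the image plane.

```python
import struct

DEPTH_MASK = 0x1FFF  # lowest 13 bits hold the distance in millimeters
BYTES_PER_PIXEL = 2  # each pixel is an unsigned 16-bit integer


def depth_mm_from_sample(sample: int) -> int:
    """Keep only the 13 least significant bits of a 16-bit depth sample."""
    return sample & DEPTH_MASK


def read_depth_mm(buffer: bytes, x: int, y: int, row_stride_bytes: int) -> int:
    """Read the depth at pixel (x, y) from a raw buffer.

    Assumes little-endian samples (an assumption of this sketch).
    """
    offset = y * row_stride_bytes + x * BYTES_PER_PIXEL
    (sample,) = struct.unpack_from("<H", buffer, offset)
    return depth_mm_from_sample(sample)


# A hypothetical 2x2 depth image with a 4-byte row stride.
image = struct.pack("<4H", 500, 1000, 1500, 8000)
print(read_depth_mm(image, x=1, y=1, row_stride_bytes=4))
```

Masking with `0x1FFF` is harmless when the upper three bits are already zero, and it keeps the decoding correct if a buffer ever carries data in those bits.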
Requirements for motion
The Depth API uses motion tracking, and the hardware depth sensor if present and supported. The depth-from-motion algorithm effectively treats pairs of camera images as two observations of an assumed static scene. If parts of the environment move — for example, if a person moves in front of the camera — the static components of the scene will have an accurate estimation of depth, but the parts that have moved will not.
For depth to work well on devices without a supported hardware depth sensor, the user will need to move their device at least a few centimeters.
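To see why a few centimeters of motion matter, consider classic two-view triangulation. This is a simplified stand-in for illustration, not ARCore's actual depth-from-motion algorithm. It assumes two rectified views separated by a sideways baseline and a focal length expressed in pixels. All names and numbers are hypothetical.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Pixel shift of a static point between two rectified views."""
    return focal_px * baseline_m / depth_m


def depth_from_disparity(focal_px: float, baseline_m: float, disparity: float) -> float:
    """Invert the relation above to recover depth from a measured shift."""
    return focal_px * baseline_m / disparity


# Hypothetical camera with a 500-pixel focal length looking at a point 2 m away.
for baseline_m in (0.005, 0.05):
    shift = disparity_px(500.0, baseline_m, 2.0)
    print(f"baseline {baseline_m * 100:.1f} cm -> shift {shift:.2f} px")
```

Under these assumptions, a 5 mm movement shifts the point by about one pixel, which is hard to measure reliably. A 5 cm movement produces a shift ten times larger.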
Ensuring accurate depth reading
Depth is provided for scenes up to 8 meters from the camera. Optimal depth accuracy is achieved between 0.5 meters and 5 meters from the camera. Error increases quadratically as distance from the camera increases.
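The quadratic growth in error can be made concrete with a small worked example. The sketch below assumes a purely quadratic model and a hypothetical reference error of 1 cm at 1 m. Neither is a measured ARCore figure. Only the 0.5 m and 5 m bounds come from the text above.

```python
OPTIMAL_MIN_M = 0.5  # lower bound of the optimal range stated above
OPTIMAL_MAX_M = 5.0  # upper bound of the optimal range stated above


def scaled_error_m(ref_error_m: float, ref_distance_m: float, distance_m: float) -> float:
    """Scale a reference error quadratically with distance (assumed model)."""
    return ref_error_m * (distance_m / ref_distance_m) ** 2


def in_optimal_range(distance_m: float) -> bool:
    """Return True when the distance falls inside the optimal accuracy range."""
    return OPTIMAL_MIN_M <= distance_m <= OPTIMAL_MAX_M


# Hypothetical reference: 1 cm of error at 1 m.
for distance_m in (1.0, 2.0, 5.0, 8.0):
    error_m = scaled_error_m(0.01, 1.0, distance_m)
    print(f"{distance_m:.0f} m: about {error_m * 100:.0f} cm, optimal={in_optimal_range(distance_m)}")
```

Under this model, doubling the distance quadruples the expected error, so it helps to keep the content that must be occluded correctly within the optimal range.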
In the absence of a hardware depth sensor, only RGB color information is used to perform the depth estimation task. Surfaces with few or no features, such as white walls, will be associated with imprecise depth.
Understanding depth values
Given a point A on the observed real-world geometry and a 2D point a representing the same point in the depth image, the value given by the Depth API at a is equal to the length of CA projected onto the principal axis. This can also be referred to as the z-coordinate of A relative to the camera origin C. When working with the Depth API, it is important to understand that the depth values are not the length of the ray CA itself, but its projection onto the principal axis.
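The difference between the depth value and the ray length can be sketched with a pinhole camera model. The intrinsics below (`fx`, `fy`, `cx`, `cy`) and the function names are assumptions chosen for illustration. In an app, the intrinsics would come from the camera and be scaled to the depth image resolution.

```python
import math


def camera_space_point(u, v, depth_m, fx, fy, cx, cy):
    """Unproject depth-image pixel (u, v) into camera space.

    depth_m is the z-coordinate (projection onto the principal axis),
    so it is used directly as z and scales the x and y offsets.
    """
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return (x, y, depth_m)


def ray_length_m(u, v, depth_m, fx, fy, cx, cy):
    """Length of the ray CA from the camera origin to the unprojected point."""
    x, y, z = camera_space_point(u, v, depth_m, fx, fy, cx, cy)
    return math.sqrt(x * x + y * y + z * z)


# Hypothetical intrinsics for a 640x480 depth image.
fx = fy = 500.0
cx, cy = 320.0, 240.0

print(ray_length_m(320, 240, 2.0, fx, fy, cx, cy))  # at the principal point
print(ray_length_m(820, 240, 2.0, fx, fy, cx, cy))  # far off-axis, same depth value
```

Both pixels report the same depth of 2 m, but the off-axis point is farther from the camera along its ray. If you need Euclidean distances, for example for distance-based effects, convert the depth value this way first.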
Start using the Depth API in your own apps. To learn more, see: