Google's Project Tango is a platform for Android phones and tablets designed to track the full 3-dimensional motion of the device as you hold it, while simultaneously creating a map of the environment around it. The devices track themselves with an Inertial Measurement Unit (IMU) and collect 3D points with a built-in depth-sensing camera. Project Tango is progressing at a fast pace thanks to many open source tools that facilitate the use of the 3D data. Only 200 of these devices have been made available to early testers and developers, and we were lucky enough to get two of them at Kitware. As a first step, we pulled the 3D data out of the device and plotted it with the open source visualization platform ParaView.
The depth sensor is a Myriad 1, manufactured by Movidius. It generates data as points in 3D space, along with the color values seen by the camera at each point. This type of data is very similar to what the Kinect produces and is known as a point cloud. In the case of Project Tango, the point data is enriched by sensors that report the orientation and position of the device about a quarter million times per second. Point cloud data tends to be noisy and must therefore be processed by correlating points based on their 3D positions and their color information. Just as I pointed out about Google Glass, open source tools are allowing Project Tango to evolve with great speed and agility.
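The denoising step described above can be sketched as a simple statistical outlier filter: points whose mean distance to their nearest neighbors is far above the global average are discarded. The sketch below uses plain NumPy for clarity; PCL provides the same idea, tuned and KD-tree accelerated, in its statistical outlier removal filter.

```python
import numpy as np

def remove_outliers(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbors
    exceeds mean + std_ratio * std of all such distances.
    points: (N, 3) array of XYZ coordinates."""
    # Brute-force pairwise distances (fine for small clouds;
    # use a KD-tree at scale).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    dist.sort(axis=1)
    # Skip column 0, which is each point's zero distance to itself.
    mean_knn = dist[:, 1:k + 1].mean(axis=1)
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    return points[mean_knn <= threshold]

# A tight synthetic cluster plus one far-away noise point.
rng = np.random.default_rng(0)
cloud = rng.normal(0.0, 0.01, size=(100, 3))
cloud = np.vstack([cloud, [[5.0, 5.0, 5.0]]])
filtered = remove_outliers(cloud)
```

Color information can be folded into the same scheme by correlating points in a joint position-color space instead of XYZ alone.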
Point cloud data can be manipulated and processed with the open source Point Cloud Library (PCL).
The combination of point clouds and color pixel data can be processed and visualized with ParaView.
Point data can be exported from the device using the Android SDK. See example datasets.
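Once the point data has been exported, it can be loaded with a few lines of scripting. The parser below assumes a hypothetical plain-text dump with one "x y z r g b" record per line; the actual column layout of your export may differ, so adapt accordingly.

```python
import io

def parse_cloud(text):
    """Parse an ASCII point-cloud dump with one 'x y z r g b'
    record per line. This format is an illustrative assumption,
    not the device's official export format."""
    points, colors = [], []
    for line in io.StringIO(text):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        x, y, z, r, g, b = line.split()
        points.append((float(x), float(y), float(z)))
        colors.append((int(r), int(g), int(b)))
    return points, colors

sample = """# x y z r g b
0.10 0.20 1.50 200 180 160
-0.05 0.33 1.48 210 185 162
"""
pts, cols = parse_cloud(sample)
```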
Thanks to these tools being open source, it was possible in a matter of days to create a PCL plugin for ParaView, download data from the device using the adb tool of the Android SDK, and load the cloud data into ParaView for analysis and visualization.
Since the data is acquired by the device continuously, the output is really 3D + time in nature: a sequence of point clouds, each one time-stamped and associated with a camera position and orientation. This aggregate data is managed in ParaView as a time series, and it can be consolidated to reconstruct the 3D scene around the device.
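Consolidating the time series amounts to transforming each frame's points from camera coordinates into a common world frame using that frame's pose, p_world = R p + t, and concatenating the results. A minimal sketch, assuming each frame's rotation matrix and translation come from the device's pose stream:

```python
import numpy as np

def consolidate(frames):
    """Merge per-frame clouds into one world-frame cloud.
    frames: list of (R, t, points) where R is a 3x3 rotation matrix,
    t a 3-vector, and points an (N, 3) array in camera coordinates.
    In a real pipeline, R and t come from the device's pose stream."""
    world = [pts @ R.T + t for R, t, pts in frames]
    return np.vstack(world)

# Two toy frames: identity pose, then a 90-degree yaw plus a translation.
I = np.eye(3)
yaw90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
f0 = (I, np.zeros(3), np.array([[1.0, 0.0, 0.0]]))
f1 = (yaw90, np.array([0.0, 0.0, 2.0]), np.array([[1.0, 0.0, 0.0]]))
scene = consolidate([f0, f1])
```

Because every frame carries its own pose, the merged cloud grows incrementally as the device moves, which is what makes the full-scene reconstruction possible.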
We have created a tutorial with detailed instructions on how to replicate this process. The capabilities of these new devices open the door to a great number of possibilities. For example, by combining 3D scanning with accelerometer data, GPS, video images, and compass direction data, we can build:
guidance devices for people with vision disabilities