Project COMRADE - The IRIS vision system

IRIS-3D

What is IRIS-3D?

IRIS-3D is the component of the IRIS vision system responsible for endowing COMRADE with a sense of the third dimension - depth, distance, whatever you call it - thus enabling us to program robots to perform even more complex tasks. Sonar is a good method for measuring depth, but it is limited in range. For determination of 3D information using cost-effective means, vision is a sure-fire winner, though the algorithms and maths involved aren't all that easy.

IRIS-3D is a newer component of the IRIS system and is therefore extremely prone to change (almost daily, on average). Most of the architecture is in place, however; what remains is to tune the algorithms or replace them with better or faster ones. IRIS-3D already shows promising results, and will only get better with time.

A rough structural description

IRIS-3D resides in the namespace Comrade::Iris3D, so to use it, you'll either have to refer to its data types and functions in a fully qualified fashion, or bring in the whole thing with the using directive.
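The two options can be sketched as follows. This is only an illustration: the page does not name the library headers, so the sketch declares a stand-in Coordinate with a guessed x/y/z layout inside Comrade::Iris3D, and both helper functions are hypothetical.

```cpp
// Stand-in declaration so the sketch compiles on its own. The real
// library headers (not named on this page) would provide the type,
// and its actual layout may differ.
namespace Comrade {
namespace Iris3D {

struct Coordinate {   // hypothetical layout, for illustration only
    double x, y, z;
};

} // namespace Iris3D
} // namespace Comrade

// Option 1: refer to the type by its fully qualified name.
// (constexpr only so the result can be checked at compile time.)
constexpr double squared_length_qualified(const Comrade::Iris3D::Coordinate& c)
{
    return c.x * c.x + c.y * c.y + c.z * c.z;
}

// Option 2: bring in the whole namespace with a using directive,
// here limited to the scope of one function.
constexpr double squared_length_with_using(double x, double y, double z)
{
    using namespace Comrade::Iris3D;
    return squared_length_qualified(Coordinate{x, y, z});
}
```

Keeping the using directive inside a function or a source file, rather than in a header, limits the chance of name clashes with other libraries.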

The classes currently implemented in IRIS-3D are as follows:

Matrix4x4: This structure is used almost exclusively for homogeneous 3D operations on a point, so its dimensions are fixed at 4-by-4 for efficiency.
Coordinate: This is IRIS' basic 3D point structure. It can be translated, rotated or multiplied by any arbitrary Matrix4x4 object. Note that the axis of rotation can also be arbitrary, thus making this a very flexible structure at the lowest hierarchical level.
Voxel: This represents the basic volume element in 3D space and is used in the 3D space carver engine. It is important to note that since the voxel is a finite cube, its footprint will be more than one pixel if it is 'projected' onto a screen during a projective transformation.
Sensor: This class represents the mathematical model of the imaging sensor, i.e., the camera. Strictly speaking, it should have functions in addition to the ones present to optimise the calibration matrix of the camera in a least-mean-squares sense. The camera model should also account for (at a minimum) a first-order projective distortion. Both are currently being worked on, though the model as it stands is adequate for approximate results.
VoxelWorker: This class is responsible for calculating various mathematical interactions of a Voxel object with its environment; for example, determining the footprint of a voxel on an arbitrary plane.
WorldSpace: This represents a cuboidal arrangement of voxels (a 'box') and is used by the space carver engine to reconstruct the model from its N projective views within this space.

Besides the above, there are a few other data structures like Point and Parametric, but they are not for direct use by the programmer.
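To make the idea behind Matrix4x4 and Coordinate concrete, here is a minimal sketch of homogeneous 3D operations on a point: translation, rotation about an arbitrary axis through the origin (using Rodrigues' formula), and multiplication by an arbitrary 4-by-4 matrix. The names Mat4, Point3, identity4, translation, rotation, multiply and transform_point are all hypothetical stand-ins; they are not the actual IRIS-3D interfaces, which may look quite different.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-ins, not the actual Matrix4x4 / Coordinate classes.
using Mat4 = std::array<std::array<double, 4>, 4>;
struct Point3 { double x, y, z; };

Mat4 identity4()
{
    Mat4 m{};                               // value-initialised to zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

// Homogeneous translation: the offset sits in the fourth column.
Mat4 translation(double tx, double ty, double tz)
{
    Mat4 m = identity4();
    m[0][3] = tx; m[1][3] = ty; m[2][3] = tz;
    return m;
}

// Rotation by `angle` radians about an arbitrary axis through the origin.
Mat4 rotation(Point3 axis, double angle)
{
    const double len = std::sqrt(axis.x * axis.x + axis.y * axis.y + axis.z * axis.z);
    assert(len > 0.0 && "rotation axis must be non-zero");
    const double x = axis.x / len, y = axis.y / len, z = axis.z / len;
    const double c = std::cos(angle), s = std::sin(angle), t = 1.0 - c;
    Mat4 m = identity4();
    m[0][0] = t * x * x + c;     m[0][1] = t * x * y - s * z; m[0][2] = t * x * z + s * y;
    m[1][0] = t * x * y + s * z; m[1][1] = t * y * y + c;     m[1][2] = t * y * z - s * x;
    m[2][0] = t * x * z - s * y; m[2][1] = t * y * z + s * x; m[2][2] = t * z * z + c;
    return m;
}

// Matrix product a * b; applying the result performs b first, then a.
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Lift the point to (x, y, z, 1), multiply, then divide by w.
// For rotations and translations w stays 1; a projective matrix changes it.
Point3 transform_point(const Mat4& m, Point3 p)
{
    const double in[4] = {p.x, p.y, p.z, 1.0};
    double out[4] = {0.0, 0.0, 0.0, 0.0};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            out[i] += m[i][k] * in[k];
    return Point3{out[0] / out[3], out[1] / out[3], out[2] / out[3]};
}
```

For example, composing multiply(translation(0, 0, 5), rotation(Point3{0, 0, 1}, pi/2)) and applying it to the point (1, 0, 0) first rotates the point onto the y axis and then lifts it by 5 along z. Because every operation is a 4-by-4 matrix, a chain of them collapses into a single matrix before any point is touched, which is the usual reason for fixing the size.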

Stereovision and space carving

The two most important algorithms in IRIS-3D are currently not encapsulated inside classes because they are not the final versions. Nevertheless, they still provide useful results. They are as follows:

Stereovision algorithm: This enables binocular stereoscopic vision by analysing pairs of images. The method currently implemented is a fixed-window correlation method, which already gives useful results. However, it is slow and causes the so-called corona effect at depth discontinuities. For this reason, a new fast, multiresolution, variable-window stereovision algorithm has been designed. Initial results are already interesting. You can also read the associated paper.
Space carving engine: This allows the robot to reconstruct the 3D model of an object (up to an approximation) from N calibrated views of that object. It uses the basic space carving algorithm given by Kyros Kutulakos. Currently, it performs reconstruction with a very good degree of accuracy. Photorealism will involve mapping the image texture onto the model, and is nice to look at, but not very useful right now. You can see some results with uncalibrated images here.
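The fixed-window idea mentioned above can be sketched roughly as follows. This is not the IRIS-3D code: the page does not say which similarity measure is used, so the sketch swaps in the sum of absolute differences (SAD) as a simple stand-in, assumes a rectified greyscale pair (matching points lie on the same row), and uses a hypothetical GrayImage type and disparity_map function.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical greyscale image, stored row by row.
struct GrayImage {
    int width, height;
    std::vector<unsigned char> pixels;
    int at(int x, int y) const { return pixels[y * width + x]; }
};

// For every left-image pixel, slide a (2*half+1) x (2*half+1) window along
// the same row of the right image and keep the shift (disparity) with the
// lowest SAD cost. Border pixels, where the window does not fit, stay 0.
std::vector<int> disparity_map(const GrayImage& left, const GrayImage& right,
                               int half, int max_disparity)
{
    assert(left.width == right.width && left.height == right.height);
    std::vector<int> disparity(left.width * left.height, 0);

    for (int y = half; y < left.height - half; ++y) {
        for (int x = half; x < left.width - half; ++x) {
            long best_cost = std::numeric_limits<long>::max();
            int best_d = 0;
            // A point at x in the left image is searched for at x - d in the right.
            for (int d = 0; d <= max_disparity && x - d - half >= 0; ++d) {
                long cost = 0;
                for (int dy = -half; dy <= half; ++dy)
                    for (int dx = -half; dx <= half; ++dx)
                        cost += std::abs(left.at(x + dx, y + dy) -
                                         right.at(x + dx - d, y + dy));
                if (cost < best_cost) { best_cost = cost; best_d = d; }
            }
            disparity[y * left.width + x] = best_d;
        }
    }
    return disparity;
}
```

The four nested loops show why this approach is slow: the work grows with image size times search range times window area. The fixed window is also what smears disparities across object boundaries, since a window straddling a depth discontinuity mixes pixels from two different depths.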

Possible goals for IRIS-3D

Integrated automatic/semiautomatic camera calibration routine
Map building using multiple stereo images in a 3D evidence grid (will possibly be moved to the Osiris engine)
Fast optical flow estimation and consequent tracking of (multiple?) moving targets

You can check out the latest developments in IRIS-3D in our project journal.