IRIS-3D
What is IRIS-3D?
IRIS-3D is the component of the IRIS vision system responsible for endowing COMRADE with a sense
of the third dimension - depth, distance, whatever you call it - thus enabling us to program robots
to perform even more complex tasks. Sonar is a good method for measuring depth, but it is limited
in range. For determination of 3D information using cost-effective means, vision is a sure-fire
winner, though the algorithms and maths involved aren't all that easy.
IRIS-3D is a newer component of the IRIS system and is therefore highly prone to change
(almost daily, on average). Most of the architecture is in place, though; what remains is to
tune the algorithms or replace them with better/faster ones. IRIS-3D already shows promising
results, and will only get better with time.
A rough structural description
IRIS-3D resides in the namespace Comrade::Iris3D,
so to use it, you'll either have to refer to its data types and functions in a fully qualified
fashion, or bring in the whole thing with the using directive.
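As a minimal sketch of the two options, the snippet below declares a stand-in Coordinate inside Comrade::Iris3D so that it compiles on its own. The members x, y, z and both function names are assumptions made for illustration; the real declarations come from the IRIS-3D headers and may look different.

```cpp
// Stand-in declaration so this sketch compiles on its own. The real
// Coordinate comes from the IRIS-3D headers; its members are assumed here.
namespace Comrade {
namespace Iris3D {
struct Coordinate {
    double x, y, z;
};
}  // namespace Iris3D
}  // namespace Comrade

// Option 1: refer to the type by its fully qualified name.
double sum_qualified() {
    Comrade::Iris3D::Coordinate p = {1.0, 2.0, 3.0};
    return p.x + p.y + p.z;
}

// Option 2: a using directive, kept inside the function so that the
// names do not leak into the rest of the translation unit.
double sum_with_using() {
    using namespace Comrade::Iris3D;
    Coordinate p = {1.0, 2.0, 3.0};
    return p.x + p.y + p.z;
}
```

Placing the using directive at function scope rather than at file scope is a design choice that limits name clashes with other libraries.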
The classes currently implemented in IRIS-3D are as follows:
Matrix4x4: This structure is almost exclusively used for homogeneous 3D operations
on a point. For efficiency purposes, therefore, its dimensions are fixed to 4-by-4.
Coordinate: This is IRIS' basic 3D point structure. It can be translated,
rotated or multiplied by any arbitrary Matrix4x4 object. Note
that the axis of rotation can also be arbitrary, thus making this a very flexible structure
at the lowest hierarchical level.
Voxel: This represents the basic volume element in 3D space and is used in the
3D space carver engine. It is important to note that since the voxel is a finite cube, its
footprint will be more than one pixel if it is 'projected' onto a screen during a projective
transformation.
Sensor: This class represents the mathematical model of the imaging sensor, i.e.,
the camera. Strictly speaking, there should be functions in addition to the ones present
to optimise the calibration matrix of the camera in a least-mean-squares sense, and the
camera model should also account for (at a minimum) a first-order projective distortion. All of
this is currently being worked on, though the model as it stands is adequate for approximate results.
VoxelWorker: This class is responsible for calculating various mathematical
interactions of a Voxel object with its environment; for example,
determining the footprint of a voxel on an arbitrary plane.
WorldSpace: This represents a cuboidal arrangement of voxels (a 'box') and is
used by the space carver engine to reconstruct the model from its N projective views within
this space.
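To make the homogeneous operations mentioned for Matrix4x4 and Coordinate concrete, here is a rough sketch of translating a point and rotating it about an arbitrary axis with 4-by-4 matrices. The types Vec3 and Mat4 and all function names are hypothetical stand-ins, not the IRIS-3D API, and the rotation uses the standard Rodrigues formula, which may or may not be what IRIS-3D does internally.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins; the real Coordinate and Matrix4x4 may differ.
struct Vec3 { double x, y, z; };
struct Mat4 { double m[4][4]; };

Mat4 identity() {
    Mat4 r = {};  // all entries zero
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0;
    return r;
}

// Homogeneous translation: the offset sits in the last column.
Mat4 translation(double tx, double ty, double tz) {
    Mat4 r = identity();
    r.m[0][3] = tx; r.m[1][3] = ty; r.m[2][3] = tz;
    return r;
}

// Rotation by `angle` radians about an arbitrary axis through the origin
// (Rodrigues' rotation formula, right-handed convention).
Mat4 rotation(Vec3 axis, double angle) {
    double len = std::sqrt(axis.x * axis.x + axis.y * axis.y + axis.z * axis.z);
    assert(len > 0.0);  // a zero-length axis has no direction
    double x = axis.x / len, y = axis.y / len, z = axis.z / len;
    double c = std::cos(angle), s = std::sin(angle), t = 1.0 - c;
    Mat4 r = identity();
    r.m[0][0] = t * x * x + c;     r.m[0][1] = t * x * y - s * z; r.m[0][2] = t * x * z + s * y;
    r.m[1][0] = t * x * y + s * z; r.m[1][1] = t * y * y + c;     r.m[1][2] = t * y * z - s * x;
    r.m[2][0] = t * x * z - s * y; r.m[2][1] = t * y * z + s * x; r.m[2][2] = t * z * z + c;
    return r;
}

// Matrix product a * b; applied to a point, b acts first.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r = {};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Treat p as the homogeneous point (x, y, z, 1), multiply, then divide by w.
Vec3 apply(const Mat4& a, Vec3 p) {
    double in[4] = {p.x, p.y, p.z, 1.0};
    double out[4] = {0.0, 0.0, 0.0, 0.0};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            out[i] += a.m[i][j] * in[j];
    assert(out[3] != 0.0);  // w is 1 for the affine matrices built above
    Vec3 r = {out[0] / out[3], out[1] / out[3], out[2] / out[3]};
    return r;
}
```

For instance, multiply(translation(0, 0, 5), rotation(axis, angle)) first rotates a point and then shifts it along z. Fixing the size at 4-by-4 keeps every loop bound a compile-time constant, which is presumably part of the efficiency argument made above.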
Besides the above, there are a few other data structures, such as Point and Parametric,
but they are not meant for direct use by the programmer.
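The remark that a voxel's footprint covers more than one pixel can be illustrated with a small sketch. It assumes an ideal pinhole camera at the origin looking along +z, with focal length f in pixels and principal point (cx, cy). The types Pinhole, CubeVoxel and PixelBox and the function footprint are hypothetical and are not the Sensor or VoxelWorker interfaces. The footprint is approximated by the bounding box of the eight projected corners, which slightly overestimates the true outline.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical ideal pinhole camera: focal length and principal point in pixels.
struct Pinhole { double f, cx, cy; };

// Axis-aligned cubic voxel given by its centre and edge length (camera frame).
struct CubeVoxel { double x, y, z, size; };

// Bounding box of the projected voxel, in pixel coordinates.
struct PixelBox { double u_min, v_min, u_max, v_max; };

PixelBox footprint(const Pinhole& cam, const CubeVoxel& vox) {
    const double h = 0.5 * vox.size;
    assert(vox.z - h > 0.0);  // the whole voxel must lie in front of the camera

    const double big = std::numeric_limits<double>::max();
    PixelBox box = {big, big, -big, -big};

    // Enumerate the 8 corners: bits 0, 1, 2 of i select -h or +h per axis.
    for (int i = 0; i < 8; ++i) {
        const double X = vox.x + ((i & 1) ? h : -h);
        const double Y = vox.y + ((i & 2) ? h : -h);
        const double Z = vox.z + ((i & 4) ? h : -h);

        // Perspective projection of the corner onto the image plane.
        const double u = cam.f * X / Z + cam.cx;
        const double v = cam.f * Y / Z + cam.cy;

        box.u_min = std::min(box.u_min, u);
        box.v_min = std::min(box.v_min, v);
        box.u_max = std::max(box.u_max, u);
        box.v_max = std::max(box.v_max, v);
    }
    return box;
}
```

With f = 500, a voxel of edge 0.1 centred 10 units in front of the camera spans roughly five pixels in each direction, so a carving step that tests only the projected voxel centre would ignore most of the pixels the voxel actually covers.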
Stereovision and space carving
The two most important algorithms in IRIS-3D are currently not encapsulated inside classes
because they are not the final versions. Nevertheless, they still provide useful results. They
are as follows:
Stereovision algorithm: It enables binocular stereoscopic vision by analysing
pairs of images. The method currently implemented is a fixed-window correlation method, which
already gives useful results. However, it is slow and causes the so-called corona effect at
discontinuities. For this reason, a new fast, multiresolution, variable-window
stereovision algorithm has been designed. Initial results are already interesting.
You can also read the associated paper.
Space carving engine: This allows the robot to reconstruct the 3D model of an
object (up to an approximation) from N calibrated views of it. This uses the basic
space carving algorithm given by Kyros Kutulakos. Currently, it performs reconstruction
with a very good degree of accuracy. Photorealism will involve mapping the image texture
onto the model; that is nice to look at, but not very useful right now.
You can see some results with uncalibrated images here.
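For readers unfamiliar with the fixed-window correlation approach mentioned for the stereovision algorithm, the sketch below shows the general idea on a rectified grayscale pair. It is not the IRIS-3D implementation: the GrayImage container, the function block_match and its parameters are hypothetical, and the window score used here is the sum of absolute differences, which is only one of several possible matching measures.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical row-major 8-bit grayscale image.
struct GrayImage {
    int width, height;
    std::vector<unsigned char> pixels;
    unsigned char at(int x, int y) const { return pixels[y * width + x]; }
};

// Fixed-window matching on a rectified pair: for each left pixel, slide a
// (2*half_window+1)^2 window along the same row of the right image and keep
// the shift (disparity) with the lowest sum of absolute differences.
// Pixels whose window does not fit inside the image keep disparity 0.
std::vector<int> block_match(const GrayImage& left, const GrayImage& right,
                             int half_window, int max_disparity) {
    assert(left.width == right.width && left.height == right.height);
    const int w = left.width, h = left.height;
    std::vector<int> disparity(w * h, 0);

    for (int y = half_window; y < h - half_window; ++y) {
        for (int x = half_window; x < w - half_window; ++x) {
            long best_cost = std::numeric_limits<long>::max();
            int best_d = 0;
            // Only try shifts that keep the right-image window in bounds.
            for (int d = 0; d <= max_disparity && x - d - half_window >= 0; ++d) {
                long cost = 0;
                for (int dy = -half_window; dy <= half_window; ++dy)
                    for (int dx = -half_window; dx <= half_window; ++dx)
                        cost += std::abs(int(left.at(x + dx, y + dy)) -
                                         int(right.at(x + dx - d, y + dy)));
                if (cost < best_cost) { best_cost = cost; best_d = d; }
            }
            disparity[y * w + x] = best_d;
        }
    }
    return disparity;
}
```

The four nested loops around the window sum explain why this kind of method is slow. Because the window has a fixed size, it can straddle an object boundary and mix pixels from two different depths, which is where artefacts at discontinuities come from.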
Possible goals for IRIS-3D
- Integrated automatic/semiautomatic camera calibration routine
- Map building using multiple stereo images in a 3D evidence grid (will possibly
be moved to the Osiris engine)
- Fast optical flow estimation and consequent tracking of (multiple?) moving targets
You can check out the latest developments in IRIS-3D in our
project journal.
Copyright (c) 2004 Avishek Sen Gupta