Paper: Altunbasak 95

Author: Altunbasak Y, Tekalp AM, Bozdagi G

Title: Simultaneous Stereo-Motion Fusion and 3D Motion Tracking

Date: 1995

Publisher: IEEE International Conference on Acoustics, Speech and Signal Processing, 1995, Vol. 4, pp. 2277-2280

Comments: Feature based motion-stereo. A feature based algorithm that simultaneously combines the two problems of stereo correspondence and motion estimation. 3-D points are recovered using maximum likelihood estimation to identify the most probable stereo correspondences by minimising a cost function that takes into account local image similarity (using current estimates of disparity) in both the temporal and spatial domains, as well as the estimated motion vectors of the features. All of the feature locations, velocities and accelerations (rotational and translational) are then estimated and predicted by an extended Kalman filter using the estimated 3-D locations of the points from two successive stereo results. The Kalman filter is supposedly robust to occlusions because if a mistake is made in identifying a stereo match, then the cost function should be high; this in turn gives a high noise coefficient in the Kalman filter, so the implications of the mismatch should be minimised. The algorithm is iterative, using the 3-D motion parameters from the Kalman filter to re-estimate and re-evaluate the stereo correspondences. The iterations stop when a global cost function is minimised.

Paper: Arakawa 95

Author: Arakawa H, Etoh M

Title: Integration Algorithm for Stereo, Motion and Color in Real Time Applications

Date: December 1995

Publisher: IEICE Transactions on Information and Systems, December 1995, Vol. E78-D, No. 12, pp. 1615-1620

Comments: Optical flow based motion-stereo. The aim is to provide a basic framework for integrating motion, depth and colour for real-time applications. The system identifies fragments in the input images within which the pixels share common colour, motion and disparity distributions (modelled by multivariate normal distributions), so it is an area based algorithm. A competitive learning technique is used to find the best set of fragment vectors that describe a good fragment match. The system is calibrated using a reference plane. The warp applied to the right image, which makes the disparity zero at the reference plane, allows foreground objects to be separated from background objects by the sign of their disparity and gives depth values relative to the reference plane. The system is suitable for surveillance or human-computer interaction, but the assumption of single motion and disparity values for the elliptical fragments of the input images would limit the accuracy of any individual feature or dense disparity measurements that are recovered. It might make a reasonable method for disparity seeding a more accurate feature matching stereo algorithm; however, it is quite slow due to the competitive learning process.

Paper: Arun 87

Author: Arun KS, Huang TS, Blostein SD

Title: Least-Squares Fitting of Two 3D Point Sets

Date: 1987

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, Vol. 9, No. 5, pp. 698-700

Paper: Baker 81

Author: Baker HH, Binford TO

Title: Depth from Edge and Intensity Based Stereo

Date: August 1981

Publisher: Proc. of the Seventh International Joint Conference on Artificial Intelligence, August 1981, pp. 631-636

Comments: Dynamic programming, image rectification, and the ordering constraint.

Book: Ballard 82

Author: Ballard DH, Brown CM

Title: Computer Vision

Date: 1982

Publisher: Prentice-Hall

ISBN: 0131653164

Paper: Barnard 80

Author: Barnard ST, Thompson WB

Title: Disparity Analysis of Images

Date: July 1980

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1980, Vol. 2, No. 4, pp. 333-340

Comments: Stereo vision. Uses a relaxation labelling scheme to derive the most probable point feature correspondences. Point features detected using the Moravec operator [Moravec 77] are compared using a local correlation measure. Match probability scores are then calculated for the potential matches based on the similarity of local neighbourhood candidate matches. To select winning matches, the match probabilities are iteratively updated using a relaxation labelling scheme based on the match probabilities of neighbouring points with similar disparities. Both stereo and temporal matching examples are given in the paper: matching features in the left and right images, and then matching features through a monocular image sequence.
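
A minimal Python sketch of the neighbour-support update at the heart of such a relaxation labelling scheme (the data layout, constants and helper names are hypothetical; the initialisation from the correlation scores is omitted):

    import numpy as np

    def relaxation_labelling(candidates, neighbours, n_iter=10, tol=1.0, damping=0.3):
        # candidates[i]: {(dx, dy) disparity: probability} for feature point i.
        # neighbours[i]: indices of nearby feature points.
        # Only the neighbour-support update is shown; the initial probabilities
        # would come from the local correlation scores.
        for _ in range(n_iter):
            updated = {}
            for i, cands in candidates.items():
                new = {}
                for d, p in cands.items():
                    support = 0.0
                    for j in neighbours[i]:
                        # Best-supported candidate of neighbour j with a similar disparity.
                        close = [q for dj, q in candidates[j].items()
                                 if np.hypot(d[0] - dj[0], d[1] - dj[1]) <= tol]
                        if close:
                            support += max(close)
                    new[d] = p * (damping + support)      # reinforce locally consistent matches
                total = sum(new.values()) or 1.0
                updated[i] = {d: q / total for d, q in new.items()}   # renormalise per point
            candidates = updated
        return candidates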

Paper: Burt 80

Author: Burt P, Julesz B

Title: Modifications of the Classical Notion of Panum's Fusional Area

Date: 1980

Publisher: Perception, 1980, Vol. 9, pp. 671-682

Comments: Stereo vision. Describes how the disparity gradient constraint was derived from psychophysical experiments using dot stereograms. The disparity gradient limit is calculated, and the idea of 'forbidden cones' in 3-D space, inside which other objects will not fuse, is introduced. It is also shown how the disparity gradient limit enforces uniqueness and prevents order reversal. Experiments found that a change in the viewing distance scales both dot separation and disparity, and, therefore, the disparity gradient remains constant over a wide range of scales.
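
For reference, the usual definition of the disparity gradient between two matched point pairs (a standard formulation consistent with the description above, not quoted from the paper) is

    \Gamma = \frac{|d_1 - d_2|}{\lVert \mathbf{x}^c_1 - \mathbf{x}^c_2 \rVert}

where d_1 and d_2 are the two disparities and \mathbf{x}^c_1, \mathbf{x}^c_2 are the cyclopean image positions (the average of each match's left and right image positions); the fusion limit reported is a disparity gradient of roughly 1.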

Paper: Canny 85

Author: Canny JF

Title: A Computational Approach to Edge Detection

Date: January 1985

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, January 1985, Vol. 8, No. 6, pp. 679-698

Paper: Chang 97

Author: Chang YL, Aggarwal JK

Title: Line Correspondences from Cooperating Spatial and Temporal Grouping Processes for a Sequence of Images

Date: August 1997

Publisher: Computer Vision and Image Understanding, August 1997, Vol. 67, No. 2, pp. 186-201

Comments: Feature based motion-stereo. Work based upon relaxation labelling as used by [Barnard 80], only this time the relaxation is applied both temporally and spatially. The work is also very similar to [Ho 96], except that where Ho used point features and tracked their motions, Chang uses lines, so this is a feature based algorithm. The comments about [Ho 96] apply equally here.

Paper: Cheng 93

Author: Cheng TK, Kitchen L

Title: Preliminary Results on Real Time 3D Feature Based Tracker

Date: December 1993

Publisher: DICTA-93 Conference, Australian Pattern Recognition Society, Sydney, December 1993

Paper: Chevrel 81

Author: Chevrel M, Courtis M, Weill G

Title: The SPOT Satellite Remote Sensing Mission

Date: 1981

Publisher: Photogrammetric Eng. Remote Sensing, 1981, Vol. 47, No. 8, pp. 1163-1171

Paper: Crossley 97

Author: Crossley S, Lacey AJ, Thacker NA, Seed NL

Title: Robust Stereo via Temporal Consistency

Date: 1997

Publisher: Proc. of the British Machine Vision Conference, 1997, pp. 659-668

Comments: See previous link page.

Paper: Crossley 98

Author: Crossley S, Thacker NA, Seed NL

Title: Benchmarking of Bootstrap Temporal Stereo using Statistical and Physical Scene Modelling

Date: 1998

Publisher: Proc. of the British Machine Vision Conference, 1998, pp. 346-355

Comments: Temporal stereo vision. Robust bootstrapping of a temporal stereo algorithm. Includes benchmarking results obtained using statistical modelling of the algorithm and physical scene modelling to get non-subjective outlier counts.

Paper: Dalmia 96

Author: Dalmia AK, Trivedi M

Title: High Speed Extraction of 3D Structure of Selectable Quality using a Translating Camera

Date: July 1996

Publisher: Computer Vision and Image Understanding, July 1996, Vol. 64, No. 1, pp. 97-110

Comments: Optical flow based stereo from motion using a translating camera. A spatial and temporal gradient approach to finding depth without the need to solve the correspondence problem. Depth is deduced from temporal and spatial flow fields extracted from stereo pairs taken at different baseline lengths. The camera displacement vector is registered at the point where the temporal gradient equals the spatial gradient, which corresponds to a specific disparity. Using the camera displacement vector, the focal length, and the selected disparity value, the 3-D depth is easily calculated. The algorithm assumes that the scene is stationary (rigid body constraint). There is, however, a trade-off between several factors, such as accuracy of depth, depth of field, and the amount the camera must translate. In order to perceive larger depths, the camera has to translate over larger distances.
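
In terms of the quantities mentioned above, the final step is the usual triangulation relation (standard form, with b the magnitude of the camera displacement, f the focal length and d the selected disparity):

    Z = \frac{f\,b}{d}

so the accuracy of Z is limited by how precisely d can be selected and how large a displacement b can be used.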

Paper: Dhond 89

Author: Dhond UR, Aggarwal JK

Title: Structure from Stereo - A Review

Date: November / December 1989

Publisher: IEEE Transactions on Systems, Man and Cybernetics, November / December 1989, Vol. 19, No. 6, pp. 1489-1510

Comments: Background to stereo and trinocular stereo. Compares area and feature based algorithms. Feature based algorithms: Marr and Poggio, Grimson, and Mayhew and Frisby. Area based algorithms: Marr and Poggio's cooperative algorithm, hierarchical approaches, trinocular stereo. Conclusion: algorithms need to be improved to give a lower percentage of false matches as well as better accuracy of depth estimates. Many references.

Paper: Dinkar 98

Author: Dinkar BN, Shree NK

Title: Ordinal Measures for Image Correspondence

Date: 1998

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, April 1998, Vol. 20, No. 4, pp. 415-423

Comments: Correspondence matching. A new method is suggested as an alternative to correlation for correspondence matching. The new technique is called an ordinal measure and uses the relative ordering of intensity values in windows (rank permutations) to obtain robust matching. They are independent of absolute intensity and the types of monotone transformations that can occur between stereo images.
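
A minimal Python sketch of an ordinal window comparison (a simple Spearman-style measure over the windows' rank permutations, not the paper's exact statistic):

    import numpy as np

    def ordinal_similarity(win_a, win_b):
        # Compare two equally sized image windows by the rank order of their
        # intensities rather than their raw values, so the score is unaffected
        # by any monotone intensity change (gain, offset, gamma).
        a = np.asarray(win_a, dtype=float).ravel()
        b = np.asarray(win_b, dtype=float).ravel()
        rank_a = np.argsort(np.argsort(a))        # rank permutation of window A
        rank_b = np.argsort(np.argsort(b))        # rank permutation of window B
        n = a.size
        # Spearman rank correlation: 1 for identical orderings, -1 for reversed.
        return 1.0 - 6.0 * np.sum((rank_a - rank_b) ** 2) / (n * (n * n - 1))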

Book: Duda 73

Author: Duda RO, Hart PE

Title: Pattern Recognition and Scene Analysis

Date: 1973

Publisher: Wiley

ISBN: 0471223611

Comments: Description of the correspondence problem.

Paper: Faugeras 92

Author: Faugeras OD

Title: What can be seen in Three Dimensions with an Uncalibrated Stereo Rig?

Date: 1992

Publisher: ECCV2, 1992, pp. 563-578

Book: Faugeras 93

Author: Faugeras OD

Title: Three-Dimensional Computer Vision

Date: 1993

Publisher: MIT Press

ISBN: 0262061589

Comments: Structure from motion and structure from stereo. Covers 3D computer vision, including stereo and structure from motion. Highlights problems with recovering structure from motion using optical flow.

Paper: Fischler 81

Author: Fischler MA, Bolles RC

Title: Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography

Date: June 1981

Publisher: Communications of the ACM, June 1981, pp. 381-395

Paper: Foerstner 87

Author: Foerstner W, Gulch E

Title: A Fast Operator for Detection and Precise Location of Distinct Points, Corners, and Centres of Circular Features

Date: 1987

Publisher: Proc. ISPRS, 1987, Vol. 25, No. 3, pp. 281-305

Book: Foley 94

Author: Foley JD, van Dam A, Feiner SK, Hughes JF, Phillips RL

Title: Introduction to Computer Graphics

Date: 1994

Publisher: Addison Wesley

ISBN: 0201609215

Paper: Förstner 94

Author: Förstner W

Title: Diagnostics and Performance Evaluation in Computer Vision

Date: 1994

Publisher: Proc. Performance versus Methodology in Computer Vision, 1994, pp. 11-25

Comments: Algorithmic evaluation. To increase algorithmic performance, algorithms must be more robust to errors and include self-diagnosis to achieve autonomous evaluation of results. Urgent need for tools to analyse the results of computer vision algorithms by exploiting the redundancy in the data and by controlled tests.

Paper: Förstner 96

Author: Förstner W

Title: 10 Pros and Cons Against Performance Characterisation of Vision Algorithms

Date: 1996

Publisher: 1996 EPSRC Summer School on Computer Vision, The University of Surrey, Guildford

Comments: Algorithmic evaluation. There is growing awareness that experimental proofs of algorithmic performance are insufficient. To allow a clear comparison of algorithms and to allow appropriate algorithms to be chosen for a particular task, algorithmic performance characterisation is necessary. A set of standard quality measures and a representative set of data for the possible input data classes allows two things: the user can specify appropriate application-specific quality variables and probabilities, and can then invert the results of the various algorithms' simulations to select the appropriate algorithm for the task, even if the task was not one anticipated by the algorithm's author. Traffic light programs: the need for vision modules to contain self diagnosis of performance. Simulations can prove the correctness of implementations and can help develop performance measures.

Paper: Gibson 50

Author: Gibson JJ

Title: The Perception of the Visual World

Date: 1950

Publisher: Houghton Mufflin

Comments: Structure from motion. Introduced the concept of optical flow.

Book: Gonzalez 87

Author: Gonzalez RC, Wintz G

Title: Digital Image Processing 2nd Edition

Date: 1987

Publisher: Addison Wesley

ISBN: 0201110261

Comments: Image transforms, enhancement and segmentation.

Paper: Grimson 85

Author: Grimson WEL

Title: Computational Experiments with a Feature Based Stereo Algorithm

Date: 1985

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1985, Vol. 7, No. 1, pp. 17-34

Comments: Stereo vision. An outline of the Marr-Poggio model of the human visual system, followed by the Marr-Poggio-Grimson stereo algorithm and its performance. A coarse to fine approach is used to limit the search space of possible matches. Laplacian of Gaussian masks of variable size, tuned to different spatial frequencies, are used for image filtering. Once matching is performed at the coarsest scale, the disparities are used to guide matching at the next finer scale. As the density of zero-crossings increases when the size of the filter is decreased, this coarse-to-fine control strategy allows the matching of very dense zero-crossing descriptions with a greatly reduced false target problem, by using the coarser resolution matches to drive the alignment process.
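
For reference, the standard form of the Laplacian of Gaussian filter mentioned above (with \sigma the Gaussian width, and up to overall normalisation) is

    \nabla^2 G(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{2\pi\sigma^6}\, e^{-(x^2 + y^2)/(2\sigma^2)}

and the zero-crossings of the filtered image provide the matching primitives at each scale.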

Paper: Grosso 89

Author: Grosso E, Sandini G, Tistarelli M

Title: 3D Object Reconstruction using Stereo and Motion

Date: November / December 1989

Publisher: IEEE Transactions on Systems, Man, and Cybernetics, 1989, Vol. 19, No. 6, pp. 1465-1476

Comments: Optical flow based motion-stereo. Integration of a correlation based stereo algorithm (fixed mode of operation) and a depth from motion parallax algorithm. Optical flow used, estimation of camera motion necessary (controlled and constrained camera motion used to simplify this process). Depth obtained from optical flow and known egomotion parameters. Integration of results based on depth maps from stereo and motion and also using uncertainty maps which encode the errors peculiar to each algorithm. Evidence of depths is then accumulated and the most probable world scene arrived at.

Paper: Grosso 95

Author: Grosso E, Tistarelli M

Title: Active / Dynamic Stereo Vision

Date: 1995

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995, Vol. 17, No. 11, pp. 1117-1128

Comments: Optical flow based motion-stereo. The task is to detect corridors of free space along which a robot could navigate. Tries to avoid the need for detailed calibration procedures by using 'active' techniques for controlling the cameras' positioning, which allows self-calibration. Algorithmic robustness can certainly be improved by providing independent estimates of the same quantity. Derives an equation for depth based on angular disparity. The cameras are independently movable, with encoders to record their rotation. This helps solve the optical flow equations because the rotation is known, leaving just the translation to be calculated.

Paper: Gruen 85

Author: Gruen AW

Title: Adaptive Least Squares Correlation - A Powerful Image Matching Technique

Date: 1985

Publisher: South African Journal of Photogrammetry, Remote Sensing, and Cartography, 1985, Vol. 14, No. 3

Comments: Area correlation based stereo. Uses window shaping in the form of affine transformations to accommodate local perspective distortions.

Paper: Hannah 80

Author: Hannah MJ

Title: Bootstrap Stereo

Date: April 1980

Publisher: Proc. ARPA Image Understanding Workshop, April 1980, pp. 201-208

Paper: Hannah 85

Author: Hannah MJ

Title: SRI's Baseline Stereo System

Date: 1985

Publisher: Proc. DARPA Image Understanding Workshop, 1985, pp. 149-155

Comments: Area correlation based stereo. Includes hierarchical multi-scale processing.

Paper: Hannah 89

Author: Hannah MJ

Title: A System for Digital Stereo Image Matching

Date: 1989

Publisher: Photogrammetric Engineering and Remote Sensing, 1989, pp. 1765-1770

Comments: Area correlation based stereo. Includes hierarchical multi-scale processing.

Paper: Haralick 94

Author: Haralick RM

Title: Performance Characterisation Protocol in Computer Vision

Date: 1994

Publisher: CVGIP Image Understanding, 1994, Vol. 60, No. 2, pp. 245-249

Comments: Algorithmic evaluation. Discusses the meaning of performance characterisation and various protocols with which an algorithm's performance can be characterised. There is no point in studying the performance of an algorithm on perfect data, because no input noise or random variation should lead to perfect output. Performance characterisation is really about seeing how noise and imperfect input data affect the quality of the output data. Models: input data, random perturbations (noise), and output data. The data unit may change, and therefore the nature of the errors propagated may change. When capturing a representative set of sample images, they should be chosen across the full range of lighting, object position, object orientation, permissible object shape variation, occlusion, clutter, distortion, and noise. The protocol uses two random perturbations: a small, usually Gaussian, perturbation applied to all input data units, and a large perturbation applied to a fraction of the data units, which can be modelled by simply replacing values with other totally unrelated values. New data units can be introduced (false alarms) and others removed (misdetections). Most algorithms can handle the small variations but fail with the large fractional perturbations. Characterisation is then specified by how much of this large random perturbation the algorithm can tolerate and still give good results. Algorithms which give good results for large perturbations on a small fraction of data units can be said to be robust. Methods are derived for determining robustness and reliability measures. A good protocol summary is given.
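
A minimal Python sketch of the two-level perturbation model described above (the function name and parameter values are illustrative, not taken from the paper):

    import numpy as np

    def perturb(data, sigma=0.01, gross_fraction=0.05, gross_range=(0.0, 1.0), rng=None):
        # Small Gaussian noise on every data unit, plus gross corruption of a
        # random fraction of units (values replaced by unrelated random values).
        rng = rng or np.random.default_rng()
        data = np.asarray(data, dtype=float)
        noisy = data + rng.normal(0.0, sigma, size=data.shape)   # small perturbation
        gross = rng.random(data.shape) < gross_fraction          # units to corrupt
        noisy[gross] = rng.uniform(*gross_range, size=gross.sum())
        return noisy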

Paper: Harris 88

Author: Harris C, Stephens M

Title: A Combined Corner and Edge Detector

Date: 1988

Publisher: Proc. of the Fourth Alvey Vision Conference, 1988, pp. 147-151

Paper: Harris 98

Author: Harris AJ, Thacker NA, Lacey AJ

Title: Modelling Feature Based Stereo Vision for Range Sensor Simulation

Date: June 1998

Publisher: Proc. of the European Simulation Multiconference, June 1998, pp. 417-421

Comments: Stereo algorithmic evaluation. Derivation of the errors contained in feature based stereo vision. The paper describes the construction of a model for a general feature based stereo vision algorithm. The operational characteristics, data types and linear and non-linear errors based on error propagation techniques are all modelled, and the model is simulated to assess the robustness of the stereo vision system in the context of an automatic collision avoidance system on a semi-autonomous wheelchair. The vision system is assumed to be either corner or edge based stereo, where the stereo is calibrated, although the exact method of calibration is unimportant. The corner or edge based detection routines are assumed to have fixed accuracies associated with them, and from those values the expected disparity and real world errors can be calculated.
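
As an illustration of the kind of propagation involved (a standard first-order result for parallel-axis stereo, not quoted from the paper): with focal length f, baseline b and disparity d, depth is Z = f b / d, so a feature localisation error \sigma_d propagates to a depth error

    \sigma_Z \approx \left|\frac{\partial Z}{\partial d}\right| \sigma_d = \frac{f b}{d^2}\,\sigma_d = \frac{Z^2}{f b}\,\sigma_d

i.e. the expected depth error grows quadratically with distance.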

Paper: Ho 96

Author: Ho AYK, Pong TC

Title: Cooperative Fusion of Stereo and Motion

Date: January 1996

Publisher: Pattern Recognition, January 1996, Vol. 29, No. 1, pp. 121-130

Comments: Feature based motion-stereo. Recovery of stereo disparity and image flow values. Two successive pairs of stereo images give four sub-processes: two stereo and two motion correspondences. Each sub-process can use information from the others to try to resolve any ambiguities that arise. Correspondences are established using [Barnard 80]'s method, however the consistency constraint becomes a three-rule system: uniqueness, consistency (the same as [Barnard 80]), and image flow continuity. The 3-D continuity constraint is used to disambiguate the matching process (but not to guide it). This is an iterative algorithm, with approximately 10 iterations per matcher.

Paper: Ho 97

Author: Ho PK, Chung R

Title: Stereo-Motion that Complements Stereo and Motion Analysis

Date: 1997

Publisher: Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997, pp. 213-218

Comments: Feature based motion-stereo. This system solves the two multi-ocular correspondence problems of motion and stereo using singular value decomposition. The algorithm assumes affine projection and two stereo cameras that move along a known path through a scene, taking a sequence of images during which features are tracked (i.e. the motion correspondence problem is already solved for the left and right image streams). Using some known initially correct matches (which are determined using one of the established stereo matching techniques, such as exploiting epipolar geometry and the disparity gradient), the algorithm then uses SVD to arrive at a set of stereo correspondences that give 3-D world points that are consistent across the entire image sequence. However, this algorithm does rely on several assumptions: that both the temporal and initial stereo correspondences can be solved robustly, that the scene comprises a single rigid body, and that the camera projection can be approximated by an affine projection (only valid where the scene is not too close to the cameras). The technique is also really only effective for short sequences of images where the likelihood of points becoming occluded or new points appearing is small.

Paper: Hollinghurst 94

Author: Hollinghurst N, Cipolla R

Title: Uncalibrated Stereo Hand-Eye Coordination

Date: April 1994

Publisher: Image and Vision Computing, April 1994, Vol. 12, No. 3, pp. 187-192

Comments: Stereo vision. A system that combines stereo vision with a robot manipulator arm to enable it to locate and reach for objects in an unstructured environment. The system is self calibrating, with the system moving the manipulator arm to four (arbitrary) reference points. Errors are minimised by using visual feedback on the gripper's position and orientation. The stereo system relies on weak perspective. Using weak perspective simplifies the calibration problem, as the entire projection mathematics becomes a simple linear mapping and the stereo correspondences between the two images become affine projections.

Paper: Horaud 89

Author: Horaud R, Skordas T

Title: Stereo Correspondence through Feature Grouping and Maximal Cliques

Date: 1989

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, Vol. 11, pp. 1168-1180

Comments: Feature based stereo. Line matching using feature relations (collinear-with, same-junction-as, left-of, etc) between lines using relational graphs.

Paper: Horn 81

Author: Horn BKP, Schunck BG

Title: Constraints on Optical Flow Computations

Date: 1981

Publisher: Proc. IEEE Conference on Pattern Recognition and Image Processing, 1981, pp. 205-210

Comments: Structure from motion using optical flow. Method proposed to recover the full motion field. A relaxation algorithm is used to apply a local smoothness of motion constraint, aiming to reach a more global consensus on scene motion than using individual measurements alone.

Paper: Hung 95

Author: Hung YP, Tang CY, Shih SW, Chen Z, Lin WS

Title: A 3D Predictive Visual Tracker for Tracking Multiple Moving Objects with a Stereo Vision System

Date: 1995

Publisher: Lecture Notes in Computer Science, 1995, Vol. 1024, pp. 25-32

Comments: Feature based motion-stereo. A 3-D predictive tracker of point features. Uses a linear Kalman filter for motion tracking estimation. The stereo cameras are calibrated so the system can exploit epipolar search geometry and 'mutually supported' consistency. The algorithm operates as follows: 1) Feature extraction. 2) A 2D temporal matcher is used on the left and right image streams separately, with features matched temporally using the motion prediction parameters. 3) Stereo correspondences are established and the 3-D positions calculated. 4) RANSAC based clustering of features according to their motions. 5) The Kalman filter predicts future positions for the next iteration. RANSAC clustering: point features exhibiting similar motion parameters (rotation and translation matrices) are grouped into a single cluster. Points can be removed and new clusters formed to allow for changing / new objects. Motion prediction using Kalman filters: each cluster is handled by an individual Kalman filter. The state vector holds values for angular velocity and acceleration, centre of rotation, and translational velocity and acceleration. Constant acceleration is assumed. Initially, when the algorithm has no knowledge of the scene content, a conventional stereo algorithm is used to extract 3D data from feature data in the scene. Once the rigid body motions have been recovered, the future motion predictions can then be used to help constrain the temporal correspondence matching for the next scene.
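
A minimal Python sketch of the prediction step of a constant-acceleration Kalman filter of the kind described above (single coordinate only; the full tracker also carries rotational terms and a centre of rotation, and the noise value is illustrative):

    import numpy as np

    def predict(x, P, dt, q=1e-3):
        # One prediction step for state x = [position, velocity, acceleration]
        # with covariance P, assuming constant acceleration over the interval dt.
        F = np.array([[1.0, dt, 0.5 * dt * dt],
                      [0.0, 1.0, dt],
                      [0.0, 0.0, 1.0]])
        Q = q * np.eye(3)                # process noise (illustrative value)
        x_pred = F @ x                   # predicted state
        P_pred = F @ P @ F.T + Q         # predicted covariance
        return x_pred, P_pred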

Paper: Hung 95b

Author: Hung YP, Tang CY, Shih SW, Chen Z, Lin WS

Title: A 3D Feature Based Tracker for Tracking Multiple Moving Objects with a Controlled Binocular Head

Date: 1995

Publisher: Technique Report TR-IIS-95-004, Institute of Information Science, Academia Sinica, Taiwan, 1995

Comments: Description of the mutually supported consistency constraint which is used in [Hung 95] stereo matcher.

Paper: Illingworth 98

Author: Illingworth J, Hilton A

Title: Looking to Build a Model World: Automated Construction of Static Object Models using Computer Vision

Date: June 1998

Publisher: Electronics and Communication Engineering Journal, June 1998, Vol. 10, No. 3, pp. 103-113

Comments: Active vision and uncalibrated stereo vision. A good introduction, including descriptions of several fields that require the use of accurate 3-D models and hence the use of 3-D acquisition schemes such as stereo vision. These fields are: industrial manufacture, where hand crafted clay models such as those in the motor industry are input into CAD/CAM systems, or a master object is reverse engineered for subsequent production; building and architecture, where 3-D models are transferred into CAD systems or captured for use in VRML modelling for advertising on the web or virtual reality demonstrations to customers; retailing, where accurate 3-D capture would allow cheap customised tailoring for clothing; medicine and biometrics, where again the acquisition of human shapes would permit better diagnosis, design of prostheses or personal identification in security systems; communication and broadcast systems, where model based coding schemes allow object models to be transmitted just once, after which only high level descriptions of changes need be transmitted, giving large savings in bandwidth; entertainment industries, where computer games and films increasingly use computer models to achieve special effects; robotics and automation, where autonomous robots have to navigate and interact with the 3-D world and therefore need the capacity to virtually model their surrounding environment; and education and information provision industries, where virtual reality models can enhance understanding of meaning, inner workings or 3-D structure. The 3-D acquisition techniques covered here are touch probe co-ordinate measuring machines (CMMs), models from silhouettes, active range sensors, and models from video sequences using uncalibrated stereo. The problem with the touch probe system is that it requires an expensive CMM and is very slow, which could mean only a sparse set of datum points being recovered. Silhouettes fail when concavities are present; however, quite complex models can be constructed provided enough silhouettes are gathered from various views around the object in question. The active range sensor studied here is a laser striping system using a single camera to view the shape of the laser line and hence deduce an object's shape. This system is also expensive and time consuming, as there has to be a physical scanning motion, either moving the laser/camera system around the object or moving the object under the laser stripe. The models from video sequences system uses uncalibrated stereo to simultaneously calculate camera and motion calibration (the fundamental matrix) and reconstruct scene geometry from a single video sequence. The estimation of the fundamental matrix uses a RANSAC (RANdom SAmple Consensus) methodology to ensure statistical robustness, although the data collected by the stereo algorithms will not be as dense as that produced by the active range sensor methods. However, the biggest advantage of the video based methods is the ease of data collection combined with cheap commercially available equipment, which makes such methods ideal for a wide range of real world environments. Concludes that most of the problems mentioned can now be solved quite well with existing techniques, to the accuracy and ease of use that general use will require.

Paper: Inria 91

Author: Inria Research Laboratories France

Title: A Parallel Stereo Algorithm that Produces Dense Depth Maps and Preserves Image Features

Date: 1991

Publisher: Research Report, 1991, No. 1369

Comments: Area correlation based stereo.

Book: Iyengar 91

Author: Iyengar SS, Elfes A

Title: Autonomous Mobile Robots

Date: 1991

Publisher: IEEE Computer Society Press

ISBN: 0818690186

Paper: Jenkin 86

Author: Jenkin M, Tsotsos JK

Title: Applying Temporal Constraints to the Dynamic Stereo Problem

Date: 1986

Publisher: Computer Vision, Graphics, and Image Processing, 1986, Vol. 33, pp. 16-32

Comments: Feature based motion-stereo. The algorithm uses a general smoothness assumption for both the temporal and spatial domains. A simple 3-D feature motion model is used to guide the matching process. Stereopsis constraints: off-epipolar error and a maximum disparity cut. Temporal constraints: a maximum distance cut on how far an object can move between frames and a maximum velocity cut on how much the velocity of an object can change between frames; these define a region in space within which valid matches must be found. Claims there is no evidence for the 'global' spatial relaxation techniques used to assign final disparities. No such techniques are used here; instead, points are assigned labels such as 'create', 'split', 'track' and 'merge'. These hypotheses are then tested for frames t0 and t1 by looking at frames t1 and t2, and the most 'continuous' is chosen. It is expected that for 'noise' points there will be no temporal continuity, and so those matches will be discarded. Of course, with no motion the hypothesis generation fails and the result degrades to however the initial matches were chosen.

Paper: Jones 95

Author: Jones AG, Taylor CJ

Title: Scale Space Surface Recovery using Binocular Shading and Stereo Information

Date: 1995

Publisher: Proc. of the British Machine Vision Conference, 1995, Vol. 1, pp. 77-86

Paper: Julesz 61

Author: Julesz B

Title: Binocular Depth Perception and Pattern Recognition

Date: 1961

Publisher: Proc. of the Fourth London Symposium on Information Theory, 1961, pp. 212-224

Comments: Shows that depth perception by humans is principally a function of processes operating on the fused binocular field. The computer model he suggests is based on the binocular parallax shifts of the left image with respect to the right.

Paper: Kalman 60

Author: Kalman RE

Title: A New Approach to Linear Filtering and Prediction Problems

Date: March 1960

Publisher: Transactions of the ASME - Journal of Basic Engineering, March 1960, pp. 35-45

Paper: Krol 80

Author: Krol JD, Grind WA van de

Title: The Double Nail Illusion: Experiments on Binocular Vision with Nails, Needles and Pins

Date: 1980

Publisher: Perception, 1980, Vol. 9, pp. 651-669

Comments: Shows that the human visual system cannot cope with order reversed stereo fusion.

Paper: Lacey 96

Author: Lacey AJ, Thacker NA, Crossley S, Yates RB

Title: Surface Approximation from Industrial SEM Images

Date: 1996

Publisher: Proc. of the British Machine Vision Conference, 1996, pp. 725-734

Comments: Practical use for stereo depth estimation. Need for fast processing for real time visual feedback.

Paper: Lacey 98

Author: Lacey AJ

Title: Automatic Extraction and Tracking of Moving Image Features

Date: 1998

Publisher: PhD Thesis, The University of Sheffield, 1998

Comments: Optical flow gives ambiguous results; it is only good for extracting the velocity component parallel to the direction of the intensity gradient (the aperture problem). It is much better to use features that are well locatable and unique, such as corners.

Paper: Lacey 98b

Author: Lacey AJ, Thacker NA, Crossley S, Yates RB

Title: A Multi-Stage Approach to the Dense Estimation of Disparity from Stereo SEM Images

Date: 1998

Publisher: Image and Vision Computing, 1998, Vol. 16, pp. 373-383

Comments: Attempts to reconstruct dense depth maps from SEM images. A multi-stage approach was taken, starting with a calibration process that exploited the constraints of an SEM stereo image pair and epipolar error minimisation. Stretch correlation was then used to reconstruct sparse but accurate edge depth information. Finally, B-fitting was used to construct the surface of materials viewed under an SEM, filling in the gaps between the known disparity data. A new soft rank-order filtering process was employed to counter the effects of illumination changes due to the change in viewpoint, so that the function fitting process would be more reliable.

Paper: Lane 94

Author: Lane RA, Thacker NA, Seed NL

Title: Stretch Correlation as a Real Time Alternative to Feature Based Stereo Matching Algorithms

Date: May 1994

Publisher: Image and Vision Computing, 1994, Vol. 12, No. 4, pp. 203-212

Comments: Stereo vision. A feature based algorithm using correlation of edge-enhanced, information-rich areas of the image to drive the matching process. Uses warping to tackle difficult non-fronto-parallel stereo problems. Suited to fast hardware implementation and temporal acceleration.

Paper: Lane 95

Author: Lane RA

Title: Edge Based Stereo Vision with a VLSI Implementation

Date: June 1995

Publisher: PhD Thesis, The University of Sheffield, June 1995

Comments: The stretch correlation algorithm [Lane 94] and design of VCP chip [Lane 96].

Paper: Lane 96

Author: Lane RA, Thacker NA, Seed NL, Ivey PA

Title: A Generalised Computer Vision Chip

Date: April 1996

Publisher: Real Time Imaging, April 1996, Vol. 2, pp. 203-213

Comments: Design and construction of a VLSI device to accelerate convolution in hardware. Acceleration of the stretch correlation algorithm.

Paper: Levine 73

Author: Levine MD, O'Handley DA, Yagi GM

Title: Computer Determination of Depth Maps

Date: 1973

Publisher: Computer Graphics and Image Processing, 1973, Vol. 2, No. 2, pp. 131-150

Comments: Stereo vision. The paper is concerned with a Mars roving vehicle equipped with a stereo camera system. The basic problem is to isolate and identify objects and the relationships between objects; specifically, to calculate the range to objects. Computes the range image as a series of contour lines. The epipolar constraint is exploited and correlation matching is used. If too small a correlation window is used, noise dominates; if too large a window is used, poorly defined edges result and the depths are inaccurate. An adaptive windowing system is employed in which the window's significant dimension is dependent on the local grey level variance. In regions with small variance, larger windows are required to capture a significant amount of information to ensure proper correspondence; for larger variances, there must be more local texture and a smaller window can be employed. The algorithm establishes gross correspondences first and then moves onto finer scales to refine the result.
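
A minimal Python sketch of the variance-driven window selection described above (the mapping and constants are illustrative, not the paper's):

    import numpy as np

    def adaptive_window_size(patch_variance, w_min=5, w_max=21, v_ref=200.0):
        # Low variance (little texture) -> larger window; high variance -> smaller.
        scale = np.clip(patch_variance / v_ref, 0.0, 1.0)
        size = int(round(w_max - scale * (w_max - w_min)))
        return size | 1                  # force an odd window size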

Paper: Lew 94

Author: Lew MS, Huang TS, Wong K

Title: Learning and Feature Selection in Stereo Matching

Date: September 1994

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, September 1994, Vol. 16, No. 9, pp. 869-881

Paper: Liu 93

Author: Liu J, Skerjanc R

Title: Stereo and Motion Correspondence in a Sequence of Stereo Images

Date: October 1993

Publisher: Signal Processing: Image Communication, 1993, Vol. 5, Iss. 4, pp. 305-318

Comments: Feature based motion-stereo. The algorithm processes each new stereo pair by initially tracking the edge points from the previous pair, starting at the coarsest scale of the image pyramids. Once the motion field has been recovered, the algorithm performs disparity estimation using an epipolar matching technique called dynamic programming which was applied hierarchically to the stereo image pyramids. Uses three terms in the dynamic programming cost function to combine the motion and multi-scale match data. The first is an inter-line cost, added because most edge features are connected across epipolar lines and are, therefore, likely to have common correspondences. The second is a multi-scale cost that enforces consistency with correspondences already established for the previous pyramid level. Finally, there is a motion cost derived from the binocular disparity difference constraint that enforces consistency between successive stereo image pairs. The final set of stereo correspondences from the finest scale are then used to reconstruct the 3D data for the scene.

Paper: Lloyd 87

Author: Lloyd SA

Title: A Parallel Binocular Stereo Algorithm Utilizing Dynamic Programming and Relaxation Labelling

Date: 1987

Publisher: Computer Vision, Graphics, and Image Processing, 1987, Vol. 39, pp. 202-225

Paper: Marik 96

Author: Marik R, Kittler J, Petrou M

Title: Error Sensitivity Assessment of Vision Algorithms Based on Direct Error Propagation

Date: 1996

Publisher: 1996 EPSRC Summer School on Computer Vision, The University of Surrey, Guildford

Comments: Algorithmic evaluation. Propagating through a series of mathematical steps to see how noise is propagated and the effects of hardware implementations on precision limitation. Two methods used: variance propagation and min/max value propagation.

Paper: Marr 76

Author: Marr D, Poggio T

Title: Cooperative Computation of Stereo Disparity

Date: October 1976

Publisher: Science, October 1976, Vol. 194, pp. 283-287

Comments: Co-operative algorithms - a class of parallel algorithms which operate on many input elements to reach a global organisation by way of local interactive constraints (a relaxation algorithm). Human vision - 3 steps involved in measuring disparity: S1 - a location on a surface in one scene must be selected, S2 - the same location in the other image must be found, S3 - the disparity must be measured. Using unambiguous cues, e.g. structured light, makes S1 and S2 easier, but in reality the correspondence problem is the one to solve. From 2 constraints on the physical world (C1 - a given point on a physical surface has a unique position in space at any one time, C2 - matter is cohesive, separated into objects, generally smooth compared with the distance from the viewer) the uniqueness and continuity rules are derived. The algorithm uses a mesh of connected cells which communicate with each other to iterate towards a solution.
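
A simplified Python sketch of the cooperative update for a single epipolar line (the neighbourhood sizes and thresholds are illustrative, not taken from the paper):

    import numpy as np

    def cooperative_stereo(C0, n_iter=15, epsilon=2.0, theta=3.0):
        # C0[x, d] = 1 where left pixel x could match at disparity d (one epipolar line).
        # Excitation: neighbouring cells at the same disparity (continuity rule).
        # Inhibition: cells sharing a line of sight (uniqueness rule).
        C = C0.astype(float).copy()
        X, D = C.shape
        right_pixel = np.arange(X)[:, None] - np.arange(D)[None, :]   # right-image column of each cell
        for _ in range(n_iter):
            excite = sum(np.roll(C, s, axis=0) for s in (-2, -1, 1, 2))
            inhibit = C.sum(axis=1, keepdims=True) - C                 # same left pixel
            for k in np.unique(right_pixel):                           # same right pixel
                m = right_pixel == k
                inhibit[m] += C[m].sum() - C[m]
            C = ((excite - epsilon * inhibit + C0) >= theta).astype(float)
        return C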

Paper: Marr 79

Author: Marr D, Poggio T

Title: A Computational Theory of Human Stereo Vision

Date: 1979

Publisher: Proc. of the Royal Society of London, 1979, B, Vol. 204, pp. 301-338

Comments: Based on the psychophysical studies of the human visual system, a simple set of spatial filters are used to locate features for correlation at a number of spatial resolutions. Stereo images are filtered by Laplacian of Gaussian operators at different scales. Matches at coarse scales are used to constrain the searching at finer scales.

Paper: Marr 80

Author: Marr D, Hildreth E

Title: Theory of Edge Detection

Date: 1980

Publisher: Proc. of the Royal Society of London, Series B, 1980, Vol. 207, pp. 187-217

Comments: The Marr-Hildreth edge detector.

Paper: Matthies 89

Author: Matthies L, Okutomi M

Title: Bootstrap Algorithms for Dynamic Stereo Vision

Date: 1989

Publisher: Proc. of the 6th Multidimensional Signal Processing Workshop, 1989, p. 12

Comments: Stereo from motion of translating camera. Highlights narrow baseline stereo using moving/translating cameras to help solve the stereo correspondence problem.

Paper: Matthies 89b

Author: Matthies L, Kanade T, Szeliski R

Title: Kalman Filter Based Algorithms for Estimating Depth from Image Sequences

Date: 1989

Publisher: International Journal of Computer Vision, 1989, Vol. 3, No. 3, pp. 209-238

Comments: Structure from motion. Two methods are studied, the first uses edge features which are tracked using Kalman filtering. The second method produces dense depth maps and depth uncertainty maps again using Kalman filtering based methods. Both techniques require known camera motion and the size of the motion in between frames has to be small for the correlation based optical flow estimation to work. Small motion minimises the correspondence problem between successive images, but sacrifices depth resolution because of the small baseline between consecutive pairs.

Paper: Maybank 92

Author: Maybank SJ, Faugeras OD

Title: A Theory of Self-Calibration of a Moving Camera

Date: 1992

Publisher: International Journal of Computer Vision, 1992, Vol. 8, No. 2, pp. 123-151

Paper: Mayhew 81

Author: Mayhew JEW, Frisby JP

Title: Psychophysical and Computational Studies towards a Theory of Human Stereopsis

Date: 1981

Publisher: Artificial Intelligence, 1981, Vol. 17, pp. 349-385

Comments: Stereopsis has two distinctive characteristics: 1) Disparity can only be calculated from low level monocular 'point' descriptions which are binocularly matched. Higher level surface or object matching is unnecessary, as random dot stereograms work very well without them and the high level descriptions appear only after stereopsis has been achieved. 2) Solving the correspondence problem requires 'global stereopsis' when the ambiguity of false or ghost matches occurs. So the two choices available to the algorithm designer are what monocular cues to use and what global mechanism to use to resolve ambiguity. The paper talks about using zero crossing points and peak and trough points as possible monocular cues, both of which can be picked out using spatially tuned frequency filters. The conclusions are that figural continuity is important; matter is cohesive, and edges and surface markings will be spatially continuous. Human binocular vision combines primitives such as zero crossings and peaks derived from several spatially tuned frequency channels. Matches are chosen according to cross-channel combination rules. Dense complex textures are difficult because blurring can occur across two or more edges.

Book: Mayhew 91

Author: Mayhew JEW, Frisby JP

Title: 3D Model Recognition from Stereoscopic Cues

Date: 1991

Publisher: MIT Press

ISBN: 0262132435

Comments: Combined works on stereo vision and visual reconstruction all carried out at AIVRU on the PMF stereo algorithm project, the 2.5D sketch project and the 3-D model-based vision project. The PMF project was to create an algorithm to solve the stereo correspondence problem. The 2.5D sketch project was to create a representation of the 3-D structure of the visible surfaces in the scene. The 3-D model-based vision project was to develop a scheme for the recognition and manipulation of 3-D objects using information about the 3-D structure delivered by the 2.5D sketch project.

Paper: McLauchlan 91

Author: McLauchlan PF, Mayhew JEW, Frisby JP

Title: Location and Description of Textured Surfaces using Stereo Vision

Date: 1991

Publisher: Published in [Mayhew 91]

Comments: Stereo vision. Stereo from textured smooth continuous surfaces. Unlike PMF which propagates local constraints, the Needles algorithm uses histogramming and Hough Transform techniques to give local constraints and region growing to get global constraints. All edge based like PMF.

Paper: Medioni 85

Author: Medioni G, Nevatia R

Title: Segment-based Stereo Matching

Date: 1985

Publisher: Computer Vision, Graphics and Image Processing, 1985, 31, pp. 2-18

Comments: Stereo vision via high level line features.

Paper: Moravec 77

Author: Moravec HP

Title: Towards Automatic Visual Obstacle Avoidance

Date: 1977

Publisher: Proc. of the Fifth International Joint Conference on Artificial Intelligence, 1977, MIT, Cambridge MA

Comments: Detection of discrete point features.

Paper: Mori 73

Author: Mori K, Kidode M, Asada H

Title: An Iterative Prediction and Correction Method for Automatic Stereocomparison

Date: 1973

Publisher: Computer Graphics and Image Processing, 1973, Vol. 2, pp. 393-401

Comments: Stereo vision. Area based correlation matching using a variable sized window.

Paper: Negahdaripour 95

Author: Negahdaripour S, Hayashi BY, Aloimonos Y

Title: Direct Motion Stereo for Passive Navigation

Date: December 1995

Publisher: IEEE Transactions on Robotics and Automation, December 1995, Vol. 11, No. 6, pp. 829-843

Comments: Optical flow based motion-stereo. An area based algorithm which recovers dense depth maps. The algorithm recovers the cameras' motions (rotation and translation) from the temporal and spatial derivatives in the left and right images. This gives two depth (disparity) maps, one for the left and one for the right; however, being depth from motion, these are only correct up to a scale factor. Negahdaripour then finds the correct 'disparity scale' by performing a search over a range of scales and seeing which scale maximises a correlation measure between the left and right disparity images. This method will only be as good as the underlying depth from motion extraction and is therefore subject to the same failures as standard structure from motion. It also assumes the rigid body constraint, as there is no segmentation of the motion.
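
A minimal Python sketch of the scale search described above (here using a sum-of-squared-differences fit rather than the paper's correlation measure; registration of the right map to the left view is assumed done elsewhere):

    import numpy as np

    def best_disparity_scale(disp_left, disp_right_reg, scales):
        # Keep the scale factor that best brings the two up-to-scale disparity
        # maps into agreement over their commonly valid pixels.
        valid = np.isfinite(disp_left) & np.isfinite(disp_right_reg)
        a, b = disp_left[valid], disp_right_reg[valid]
        errs = [np.sum((s * a - b) ** 2) for s in scales]
        return scales[int(np.argmin(errs))]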

Paper: Nevatia 76

Author: Nevatia R

Title: Depth Measurement by Motion Stereo

Date: 1976

Publisher: Computer Graphics and Image Processing, 1976, Vol. 5, pp. 203-214

Comments: Stereo from motion using a translating camera. Uses a series of progressive views to constrain disparity to small values. This algorithm is very akin to [Dalmia 96] and their translating camera. This algorithm uses a rotating camera setup in exactly the same manner. Using a whole series of small rotations proves to make the correspondence problem much easier and the eventual result more robust.

Paper: Ohta 85

Author: Ohta Y, Kanade T

Title: Stereo by Intra- and Inter-Scanlines Search using Dynamic Programming

Date: 1985

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1985, Vol. 7, pp. 139-154

Comments: Stereo vision and dynamic programming.

Paper: Okutomi 92

Author: Okutomi M, Kanade T

Title: A Locally Adaptive Window for Signal Matching

Date: 1992

Publisher: International Journal of Computer Vision, 1992, Vol. 7, No. 2, pp. 143-162

Comments: Varying the size of window for maximum robustness when signal matching.

Paper: O'Neill 96

Author: O'Neill M, Denos M

Title: Automated System for Coarse to Fine Pyramidal Area Correlation Stereo Matching

Date: 1996

Publisher: Image and Vision Computing, 1996, Vol. 14, pp. 225-236

Comments: Stereo vision. Produces dense disparity models using area based correlation and multi-scale image pyramids. Uses an area correlation based algorithm rather than a feature based method because it is better at handling the continuous texture found in topographic imagery, as opposed to the intensity anomaly features at which the latter methods excel. Two main correlation algorithms are quoted: the original adaptive least squares correlation algorithm by Gruen, followed by an iterative sheet growing version of the same algorithm by Otto. These algorithms used affine transformations when correlating to accommodate the distortions found in non-rectified stereo images, and work at a fixed image scale; the original single pixel scale of the image. The correlation algorithm used here is coarse to fine, with a variable window size which allows an exhaustive search with the ability to find the optimal matching point. Coarse: large window; fine: small window (where more marginal differences between left and right begin to dominate the similarity metric). The trouble with the earlier algorithms was that they fail at discontinuities where there is a change in the disparity function, they need seeding with accurate initial matches, and they fail when faced with insufficient texture. The new system does not need seed points because the new coarse to fine stereo matching component allows the existing area correlation algorithms to be reapplied at various scales. It is claimed that the new system increases the amount of stereo data eventually returned, and with a higher degree of confidence in that data, because it is less susceptible to noise or bad seed points.

Paper: Otto 89

Author: Otto GP, Chau TKW

Title: Region Growing Algorithm for the Matching of Terrain Images

Date: 1989

Publisher: Image and Vision Computing, 1989, Vol. 7, No. 2, pp. 83-93

Comments: Stereo vision. Area based region growing. Grows smooth surfaces from a small number (5-10) of accurately measured seed points. Iterates predicting neighbouring matches and then refining using adaptive least squares algorithm.

Paper: Ozeki 86

Author: Ozeki O, Nakano T, Yamamoto S

Title: Real Time Range Measurement Device for Three Dimensional Object Recognition

Date: 1986

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, Vol. 8, No. 4, pp. 550-554

Comments: Active vision. Using a laser striping system with known relationships between the camera and laser projecting system to detect objects in a 60cm square area at 100cm in 490ms. Accurate to +/-2cm. Used to discriminate between three types of machine parts on a conveyor belt in 4s.

Paper: Pollard 85

Author: Pollard SB, Mayhew JEW, Frisby JP

Title: PMF: A Stereo Correspondence Algorithm using a Disparity Gradient Limit

Date: 1985

Publisher: Perception, 1985, Vol. 14, pp. 449-470

Comments: Stereo vision. Edge based stereo, essentially derived from psychological findings about human vision. Disparity gradient: the disparity gradients between correct matches will be small almost everywhere. Epipolar constraint and uniqueness constraint also exploited.

Paper: Pollard 85b

Author: Pollard SB

Title: Identifying Correspondences in Binocular Stereo

Date: 1985

Publisher: PhD Thesis, The University of Sheffield, 1985

Comments: Stereo vision. The original PMF algorithm.

Paper: Pollard 86

Author: Pollard SB, Porrill J, Mayhew JEW, Frisby JP

Title: Disparity Gradient, Lipschitz Continuity, and Computing Binocular Correspondences

Date: 1986

Publisher: Proc. of the Third International Symposium of Robotics Research, 1986, pp. 19-26

Comments: Stereo vision.

Paper: Pollard 91

Author: Pollard SB, Mayhew JEW, Frisby JP

Title: Implementation Details of the PMF Stereo Algorithm

Date: 1991

Publisher: Published in [Mayhew 91]

Comments: Stereo vision. Implementation details focusing on PMF, with emphasis on robustness and efficiency. Also contains more global constraints than PMF originally did. Epipolar/edge rectification is used. Figural continuity looks at strings of edges; the edge with the most matching points is chosen. The figural continuity constraint is also used to fill in the gaps between seed points and is even allowed to break the edge orientation rule when trying to jump over breaks or gaps. The ordering constraint enforces that edge strings must be in order along epipolars; if not, the weakest strings breaking the ordering are removed and considered in subsequent stages. Existing matches can be used with the ordering constraint to set disparity limits within which an unmatched point must be matched so as not to violate the ordering constraint. A final iteration allows contrast reversal for matching edges.

Paper: Pollard 91b

Author: Pollard SB, Porrill J, Mayhew JEW

Title: Recovering Partial 3D Wire Frames Descriptions from Stereo Data

Date: 1991

Publisher: Image and Vision Computing, 1991, Vol. 9, No. 1, pp. 58-65

Paper: Porrill 88

Author: Porrill J, Pollard SB, Pridmore TP, Bowen JB, Mayhew JEW, Frisby JP

Title: TINA: A 3D Vision System for Pick and Place

Date: 1988

Publisher: Image and Vision Computing, 1988, Vol. 6, pp. 91-99

Comments: A description of the TINA vision system and how it integrated and implemented all of the parts of the AIVRU stereo project, from PMF, through grouping and segmentation of edge data into geometrical lines and arcs and through object model creation and 3-D wireframe model matching.

Book: Russ 94

Author: Russ JC

Title: The Image Processing Handbook 2nd Edition

Date: 1994

Publisher: CRC Press

ISBN: 0849325161

Paper: Shah 89

Author: Shah YC, Chapman R, Mahani RB

Title: A New Technique to Extract Range Information from Stereo Images

Date: 1989

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, Vol. 11, No. 7, pp. 768-773

Comments: Multi-scale stereo vision. Highlights the computational and functional advantages that the use of multiple resolution techniques can bring. Low resolution correlation matching ensures gross image features are matched rather than details. Once refined to high resolution correlations, finer details may be matched over a limited area.

Book: Sonka 93

Author: Sonka M, Hlavac V, Boyle R

Title: Image Processing, Analysis and Machine Vision

Date: 1993

Publisher: Chapman and Hall Computing

ISBN: 0412455706

Comments: Stereopsis, disparity gradient, PMF and shape from shading.

Book: Stucki 79

Author: Stucki P

Title: Advances in Digital Image Processing

Date: 1979

Publisher: Plenum Press

ISBN: 0306403145

Paper: Sung 93

Author: Sung E, Myint T

Title: Incorporating Color and Spatiotemporal Stereovision Techniques for Road Following

Date: 1993

Publisher: Proc. of SPIE, 1993, Vol. 1825, pp. 356-365

Comments: Feature based motion-stereo. Colour can greatly help in the processing of visual cues; however, colour segmentation can have problems with shadows and other additional matter on the roads. Stereo is not affected by shadows, etc, with passive stereo being particularly preferred. The new method relies on temporal feature tracking with an accurate odometry system on the vehicle. Using the odometry information, the location of any scene feature can be predicted from one frame to the next. This information, combined with colour, intensity and geometrical dimensions, is used to solve the motion correspondence of features between the frames of a sequence. Because the algorithm already has a set of stereo matches from the previous frame, the motion correspondences allow the stereo correspondences in the current frame to be established simply. The major assumptions made by the motion-stereo algorithm are: accurate odometry on the motion of the mobile robot is available; the world is stationary, with the only motion in the scene being due to the motion of the robot itself; and there is an accurate initial set of stereo correspondences for the scene available at start-up.

Paper: Thacker 91

Author: Thacker NA, Mayhew JEW

Title: Optimal Combination of Stereo Camera Calibration from Arbitrary Stereo Images

Date: 1991

Publisher: Image and Vision Computing, February 1991, Vol. 9, No. 1, pp. 27-32

Comments: Stereo camera calibration techniques. A good calibration algorithm should determine the calibration scale based on many different (uncorrelated) 3D measurements combined with ground truth data points, to arrive at a consensus calibration using error minimisation.

Paper: Thacker 92

Author: Thacker NA, Courtney P

Title: Statistical Analysis of a Stereo Matching Algorithm

Date: 1992

Publisher: Proc. of the British Machine Vision Conference, 1992, pp. 316-326

Comments: Derivation of the probability of matching errors for a corner matching algorithm using cross-correlation, resulting in the conclusion that the probability of an error is proportional to the mean number of candidate matches and is therefore proportional to the search area.
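
In symbols (a paraphrase of the stated result, with the notation introduced here): if candidate corners occur with density \rho, a search region of area A contains on average \bar{n} = \rho A candidates, so

    P_{\mathrm{error}} \propto \bar{n} = \rho A

and shrinking the search area (for example by better prediction or tighter epipolar bounds) directly reduces the mismatch probability.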

Paper: Thacker 99

Author: Thacker NA, Jackson A, Moriarty D, Vokurka E

Title: Improved Quality of Re-sliced MR Images using Re-normalised Sinc Interpolation

Date: 1999

Publisher: Journal of Magnetic Resonance Imaging, 1999, Vol. 10, pp. 582-588

Comments: Improved sinc interpolation technique.

Paper: Torr 97

Author: Torr PHS, Fitzgibbon AW, Zisserman A

Title: Maintaining Multiple Motion Model Hypotheses Over Many Views to Recover Matching and Structure

Date: January 1998

Publisher: Proc. of the 6th International Conference on Computer Vision, January 1998, Bombay, pp. 485-491

Comments: Uncalibrated stereo vision. The paper outlines a methodology for calculating calibration information and scene structure simultaneously while processing a monocular video sequence. The scheme uses uncalibrated stereo and trifocal tensors to match features across long sequences of monocular images. Robustness is maintained throughout the sequence by identifying situations where the recovery of the epipolar geometry through motion becomes unreliable: if the camera motion is a pure rotation, or if the only visible features are all co-planar, then the structure from motion results cannot be used. Instead the matches are 'saved' until sufficient information (such as an out-of-plane match) can be found so that they can be properly integrated into the final 3-D result.

Paper: Trivedi 85

Author: Trivedi HP, Lloyd SA

Title: The Role of Disparity Gradient in Stereo Vision

Date: 1985

Publisher: Perception, 1985, Vol 14, pp. 685-690

Comments: Stereo vision. Interprets the disparity gradient constraint as a method of enforcing topological equivalence between surfaces in the left and right images, and argues that the constraint is essential to solving the correspondence problem before interpolating surfaces from the computed 3-D points.
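
For reference, the disparity gradient between two candidate matches is conventionally defined as the difference of their disparities divided by their cyclopean separation (this is the standard definition, as used for example by PMF, rather than a quotation from this paper):

    \Gamma = \frac{|d_1 - d_2|}{\| \mathbf{x}^{cyc}_1 - \mathbf{x}^{cyc}_2 \|} \le \Gamma_{\max}

where \mathbf{x}^{cyc}_i is the average of the corresponding left- and right-image positions of match i, and \Gamma_{\max} is set below the theoretical limit of 2 (PMF commonly uses values around 0.5 to 1).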

Paper: Trivedi 87

Author: Trivedi HP

Title: Estimation of Stereo and Motion Parameters using a Variational Principle

Date: 1987

Publisher: Image and Vision Computing, May 1987, Vol. 5, No. 2, pp. 181-183

Paper: Tsai 87

Author: Tsai RY

Title: An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision

Date: 1987

Publisher: Proc. IEEE Computer Vision and Pattern Recognition, 1987

Paper: Venkateswar 92

Author: Venkateswar V, Chellappa R

Title: Extraction of Straight Lines in Aerial Images

Date: 1992

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, No. 14, pp. 1111-1114

Paper: Venkateswar 95

Author: Venkateswar V, Chellappa R

Title: Hierarchical Stereo and Motion Correspondence using Feature Groupings

Date: July 1995

Publisher: International Journal of Computer Vision, July 1995, Vol. 15, No. 3, pp. 245-269

Comments: Stereo vision. A feature based stereo algorithm, later extended to motion correspondence, which matches surfaces, lines and vertices. Structural relationships along with attributes such as parallel to, co-linear with, left of and right of are used to resolve ambiguities and match edge rings and eventually lines for stereo. Same algorithm applied to motion correspondence.

Paper: Verri 87

Author: Verri A, Poggio T

Title: Against Quantitative Optical Flow

Date: 1987

Publisher: Proc. of the 1st International Conference on Computer Vision, 1987

Comments: Motion can be recovered more reliably by focussing the motion analysis on information-rich areas of the image, such as features.

Paper: Wade 97

Author: Wade P, Moran D, Graham J, Brook Jackson C

Title: Robust and Accurate 3D Measurement of Formed Tube using Trinocular Stereo Vision

Date: 1997

Publisher: Proc. of the British Machine Vision Conference, 1997, Vol. 1, pp. 232-241

Comments: A vision system that is used to accurately measure the geometry of metal pipes which have been bent into complex 3D shapes. However, the vision system requires a very controlled working environment and can only recover the structure of pipes with a constant radius.

Paper: Wang 96

Author: Wang W, Duncan JH

Title: Recovering the Three Dimensional Motion and Structure of Multiple Moving Objects from Binocular Image Flows

Date: May 1996

Publisher: Computer Vision and Image Understanding, May 1996, Vol. 63, No. 3, pp. 430-446

Comments: Optical flow motion-stereo. A continuation of the work done in [Waxman 86]. This is an iterative algorithm that illustrates how structure from motion using optical flow and stereopsis can be combined to produce a co-operative result. Initially, rigid body motion is assumed for the whole scene. Using conventional SfM techniques, two motion vectors are recovered for the scene and two independent sets of 3D data from the left and right motion fields are calculated. Using the epipolar constraint, the algorithm then checks that disparities recovered from the 3D motion results comply with the observed stereo data. The 3D result is then refined by relying on the fact that the initially estimated motion will be close to the dominant rigid body motion in the scene (referred to as object 1). Therefore, the majority of the correct stereo matches will belong to object 1. This allows the algorithm to segment out the object 1 points and repeat the motion estimation process for those points alone, improving the accuracy of the recovered object 1 3D motion. By reapplying the recovered motion to unmatched points, new points belonging to object 1 can then be found. By iteratively estimating motions and using the stereo correspondence check, rigid objects with different motions are gradually matched and segmented (the overall loop is sketched below). After start-up, processing further stereo frames becomes easier as the feature motion groupings from the previous frames can be carried into the current stage without having to resort to initial motion estimation. There must be significant variations between the motions of independently moving objects for a robust segmentation of the scene. Unfortunately, the algorithm is let down by the fact that the motion-stereo matching constraint is considerably weakened for scenes containing small feature velocities or no motion at all, a symptom of most algorithms relying on depth from motion as their primary depth cue.
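
The overall loop can be paraphrased in a short sketch (the three helper callables are passed in as parameters because they stand for the paper's structure-from-motion and stereo-consistency steps, which are not reproduced here; names and tolerances are illustrative):

def segment_by_motion(features, estimate_motion, predicted_disparity,
                      observed_disparity, max_objects=5, tol=1.0):
    # features            : candidate feature matches with their left/right image flows
    # estimate_motion     : fits one rigid 3-D motion to a set of features (SfM step)
    # predicted_disparity : disparity implied for a feature by a given rigid motion
    # observed_disparity  : disparity actually measured by the stereo correspondence check
    unassigned = set(features)
    objects = []
    while unassigned and len(objects) < max_objects:
        # 1. assume the remaining features share a single rigid motion
        motion = estimate_motion(unassigned)
        # 2. keep the features whose motion-predicted disparity agrees with stereo;
        #    these mostly belong to the dominant object
        inliers = {f for f in unassigned
                   if abs(predicted_disparity(f, motion) - observed_disparity(f)) < tol}
        if not inliers:
            break
        # 3. re-estimate the motion from the inliers alone and re-apply it, which
        #    pulls in further points belonging to the same object
        motion = estimate_motion(inliers)
        inliers = {f for f in unassigned
                   if abs(predicted_disparity(f, motion) - observed_disparity(f)) < tol}
        objects.append((motion, inliers))
        unassigned -= inliers            # repeat for the next independently moving object
    return objects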

Paper: Waxman 86

Author: Waxman AM, Duncan JH

Title: Binocular Image Flows: Steps Towards Stereo-Motion Fusion

Date: November 1986

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1986, Vol. 8, No. 6, pp. 715-729

Comments: Optical flow based motion-stereo. The binocular difference flow is defined as the difference between the left and right optical flow fields, where the right flow field is shifted by the current disparity field. Shows that the difference flow and the ratio of rate of change of disparity to disparity are equivalent for image regions containing planar surfaces. Whilst this does not provide a direct solution to the correspondence problem (disparity must be known in order to calculate the difference flow), the paper suggests two ways in which binocular difference flows can be used in stereo correspondence. Firstly, a vertical motion constraint can be derived: the vertical flow components of two corresponding points are equal, so the y component of the difference flow vanishes. This constraint can therefore be used to supplement conventional stereo correspondence techniques. Alternatively, the paper suggests using the ratio of rate of change of disparity to disparity as a match score metric. Potential matches can be identified by taking the x component of the difference flow as the rate of change of disparity and then searching over a range of disparities, recording the ratio scores. A local support metric can then choose matches whose ratios are in agreement (cf. the disparity gradient constraint). The method still depends on recovering the full motion field from the optical flow field, but it does relax the rigid body constraint by segmenting the scene. Three problems still remain. Firstly, the segmentation process is problematic, being susceptible to noise and variations in the density of reliable optical flow points. Secondly, the technique will only work for scenes containing significant motions. Thirdly, independently moving objects must have significant variations between their motions for robust segmentation.
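
In the notation used here (which may differ from the paper's), with left and right flow fields v^L and v^R and disparity field d, the difference flow and the two constraints it yields are

    \Delta\mathbf{v}(x, y) = \mathbf{v}^{L}(x, y) - \mathbf{v}^{R}(x - d(x, y),\, y), \qquad \Delta v_y = 0, \qquad \Delta v_x = \dot{d}

since d = x_L - x_R for corresponding points, so the horizontal component of the difference flow is the rate of change of disparity \dot{d}, and \dot{d}/d is the match-score metric referred to above.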

Book: Wolberg 90

Author: Wolberg G

Title: Digital Image Warping

Date: 1990

Publisher: IEEE Computer Society Press

ISBN: 0818689447

Comments: Good digital image processing book including chapters on affine and perspective warping.

Paper: Xu 87

Author: Xu G, Tsuji S, Asada M

Title: A Motion Stereo Method Based on Coarse to Fine Control Strategy

Date: 1987

Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, Vol. 9, No. 2, pp. 332-336

Comments: Stereo from the motion of a translating camera. Initial matches are established using a search range proportional to the width of the chosen LoG edge filter. However, successive searches become increasingly constrained as the baseline length increases. In these algorithms, Δd is the expected accuracy of the disparity estimates due to factors such as the edge location error and the mechanical positioning error of the camera. Whilst Δd remains constant, the ratio Δd/d (disparity error / measured disparity) decreases with increasing baseline length, which means that accuracy improves as the baseline becomes longer.
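
The improvement with baseline follows from the standard disparity-depth relation (a textbook result, not quoted from the paper): for a point at depth Z viewed with baseline b and focal length f, d = f b / Z, so a fixed disparity uncertainty Δd gives a relative depth error

    \frac{\Delta Z}{Z} \approx \frac{\Delta d}{d} = \frac{\Delta d \, Z}{f\,b}

which shrinks as the accumulated baseline b grows.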

Paper: Xu 96

Author: Xu G

Title: Unification of Stereo, Motion and Object Recognition via Epipolar Geometry

Date: 1996

Publisher: Lecture Notes in Computer Science, 1996, Vol. 1035, pp. 265-274

Comments: Structure from motion. This is a feature based algorithm. Uses an affine perspective model to define epipolar geometries, which allows the motion correspondence problem to be treated as a 1-D search, just as in calibrated stereo. Defines the fundamental matrix and derives from it the epipolar equation, from which disparity is defined. The technique avoids the aperture problem by defining multiple epipolar geometries for two motion images. Each object, or set of objects undergoing a common motion, combined with the motion of the camera forms a unique epipolar geometry. Xu uses a clustering algorithm to identify points undergoing the same motion and sharing the same epipolar equation. In this way objects are identified through the implicit segmentation of the scene, a set of the most probable epipolar geometries is derived, and 3-D structure can be recovered.
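
The epipolar equation referred to is the standard bilinear constraint between homogeneous image points x and x' in the two views (written here in its general projective form; the paper works with an affine specialisation):

    \mathbf{x}'^{\top} F \, \mathbf{x} = 0

Each independently moving object, combined with the camera's own motion, satisfies this equation for its own fundamental matrix F, which is why clustering points by the epipolar equation they satisfy simultaneously segments the scene and recovers the geometries.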

Paper: Yi 97

Author: Yi JW, Oh JH

Title: Recursive Resolving Algorithm for Multiple Stereo and Motion Matches

Date: March 1997

Publisher: Image and Vision Computing, March 1997, Vol. 15, No. 3, pp. 181-196

Comments: Feature based motion-stereo. A feature based algorithm that tracks objects (features) using Kalman filters. Given a stereo camera system and a set of possible stereo matches, the algorithm generates a set of virtual objects, one for each possible match. This recursive algorithm then uses Kalman filtering to predict the motion of the virtual objects through the sequence. Ambiguous matches reveal themselves by failing to follow their predicted paths and can therefore be rejected. As a sequence of stereo images progresses, the variances used in the Kalman filter get smaller as more stereo correspondences are resolved, which in turn aids the matching of new or unmatched tokens.
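
A minimal sketch of the per-hypothesis tracking idea, using a generic constant-velocity Kalman filter with a Mahalanobis gate (the state model, gate value and class name are illustrative assumptions rather than the paper's exact filter):

import numpy as np

class VirtualObject:
    # One hypothesised stereo match tracked with a constant-velocity Kalman filter.
    def __init__(self, x0, dt=1.0, q=1e-2, r=1e-1):
        n = len(x0)                                        # 3-D position
        self.x = np.concatenate([np.asarray(x0, float), np.zeros(n)])  # [position, velocity]
        self.P = np.eye(2 * n)                             # state covariance
        self.F = np.eye(2 * n)
        self.F[:n, n:] = dt * np.eye(n)                    # constant-velocity transition
        self.H = np.hstack([np.eye(n), np.zeros((n, n))])  # only position is observed
        self.Q = q * np.eye(2 * n)                         # process noise
        self.R = r * np.eye(n)                             # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.H @ self.x                             # predicted 3-D position

    def update(self, z, gate=9.0):
        # Fuse a newly triangulated position; reject the hypothesis if it leaves the gate.
        y = np.asarray(z, float) - self.H @ self.x         # innovation
        S = self.H @ self.P @ self.H.T + self.R
        if y @ np.linalg.solve(S, y) > gate:               # Mahalanobis test
            return False                                   # match does not follow its predicted path
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(len(self.x)) - K @ self.H) @ self.P
        return True

Each candidate stereo match would receive its own VirtualObject; hypotheses whose triangulated positions repeatedly fail the gate are the ambiguous matches that get rejected.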

Paper: Zhang 92

Author: Zhang Z, Faugeras O

Title: 3D Dynamic Scene Analysis: A Stereo Based Approach

Date: 1992

Publisher: Springer-Verlag, 1992