Paper: Altunbasak 95
Author: Altunbasak Y, Tekalp AM, Bozdagi G
Title: Simultaneous Stereo-Motion Fusion and 3D Motion Tracking
Date: 1995
Publisher: IEEE International Conference on Acoustics, Speech and Signal Processing, 1995, Vol. 4, pp. 2277-2280
Comments: Feature based motion-stereo. A feature based algorithm that simultaneously combines the two problems of stereo correspondence and motion estimation. 3-D points are recovered using maximum likelihood estimation to identify the most probable stereo correspondences by minimising a cost function that takes into account local image similarity (using current estimates of disparity) in both the temporal and spatial domains, as well as the estimated motion vectors of the features. All of the feature locations, velocities and accelerations (rotational and translational) are then estimated and predicted by an extended Kalman filter using the estimated 3-D locations of the points from two successive stereo results. The Kalman filter is supposedly robust to occlusions: if a mistake is made in identifying a stereo match, the cost function should be high, which in turn gives a high noise coefficient in the Kalman filter, so the implications of the mismatch should be minimised. The algorithm is iterative, using the 3-D motion parameters from the Kalman filter to re-estimate and re-evaluate stereo correspondences. The iterations stop when a global cost function is minimised.
Paper: Arakawa 95
Author: Arakawa H, Etoh M
Title: Integration Algorithm for Stereo, Motion and Color in Real Time Applications
Date: December 1995
Publisher: IEICE Transactions on Information and Systems, December 1995, Vol. E78-D, No. 12, pp. 1615-1620
Comments: Optical flow based motion-stereo. The aim is to create a basic framework for integrating motion, depth and colour for real-time applications. The system identifies fragments in the input images within which the pixels share common colour, motion and disparity distributions (modelled by multivariate normal distributions), so it is an area based algorithm. A competitive learning technique is used to find the best set of fragment vectors that describe a good fragment match. The system is calibrated using a reference plane. The warp applied to the right image to make the disparity zero at the reference plane allows foreground objects to be separated from background objects using the sign of disparity, and can give depth values relative to the reference plane. The system is suitable for surveillance or human-computer interaction, but the assumptions made about single motion and disparity values for the elliptical fragments of the input images would limit the accuracy of any individual feature or dense disparity measurements that are recovered. It might make a reasonable method for disparity seeding a more accurate feature matching stereo algorithm; however, it is quite slow due to the competitive learning process.
Paper: Arun 87
Author: Arun KS, Huang TS, Blostein SD
Title: Least-Squares Fitting of Two 3D Point Sets
Date: 1987
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, Vol. 9, No. 5, pp. 698-700
Paper: Baker 81
Author: Baker HH, Binford TO
Title: Depth from Edge and Intensity Based Stereo
Date: August 1981
Publisher: Proc. of the Seventh International Joint Conference on Artificial Intelligence, August 1981, pp. 631-636
Comments: Dynamic programming, image rectification, and the ordering constraint.
Book: Ballard 82
Author: Ballard DH, Brown CM
Title: Computer Vision
Date: 1982
Publisher: Prentice-Hall
ISBN: 0131653164
Paper: Barnard 80
Author: Barnard ST, Thompson WB
Title: Disparity Analysis of Images
Date: July 1980
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1980, Vol. 2, No. 4, pp. 333-340
Comments: Stereo vision. Uses a relaxation labelling scheme to derive the most probable point feature correspondences. Point features detected using the Moravec operator [Moravec 77] are compared using a local correlation measure. Match probability scores are then calculated for the potential matches based on the similarity of the local neighbourhoods of the candidate matches. To select winning matches, the match probabilities are iteratively updated using a relaxation labelling scheme based on the match probabilities of neighbouring points with similar disparities. Both stereo and temporal matching examples are given in the paper: matching features between the left and right images, and matching features through a monocular image sequence.
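The style of relaxation update described above can be pictured with a small sketch along the following lines (a minimal illustration rather than Barnard and Thompson's exact formulation; the neighbourhood radius, disparity tolerance, gain and iteration count are assumed values):

```python
import numpy as np

def relax(features, candidates, probs, radius=30.0, tol=2.0, n_iter=10, alpha=0.3):
    """features: (N, 2) positions of point features in the left image.
    candidates: list of (M_i, 2) arrays of candidate disparity vectors per feature.
    probs: list of (M_i,) arrays of initial match probabilities (each sums to 1).
    Candidates whose disparity is shared by nearby features are reinforced."""
    for _ in range(n_iter):
        new_probs = []
        for i, (cands_i, p_i) in enumerate(zip(candidates, probs)):
            support = np.zeros(len(cands_i))
            for j, (cands_j, p_j) in enumerate(zip(candidates, probs)):
                if j == i or np.linalg.norm(features[i] - features[j]) > radius:
                    continue
                for k, d in enumerate(cands_i):
                    # support = probability mass of the neighbour's candidates
                    # whose disparity lies within 'tol' of this candidate
                    close = np.linalg.norm(cands_j - d, axis=1) < tol
                    support[k] += p_j[close].sum()
            updated = p_i * (1.0 + alpha * support)
            new_probs.append(updated / updated.sum())
        probs = new_probs
    return probs

# Two nearby features, each with two candidate disparities; the disparity
# (5, 0) that they share gains probability over the iterations.
feats = np.array([[10.0, 10.0], [20.0, 10.0]])
cands = [np.array([[5.0, 0.0], [9.0, 0.0]]), np.array([[5.0, 0.0], [2.0, 0.0]])]
p = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]
print(relax(feats, cands, p))
```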
Paper: Burt 80
Author: Burt P, Julesz B
Title: Modifications of the Classical Notion of Panum's Fusional Area
Date: 1980
Publisher: Perception, 1980, Vol. 9, pp. 671-682
Comments: Stereo vision. Describes how the disparity gradient constraint was derived from psychophysical experiments using dot stereograms. The disparity gradient limit is calculated, and the idea of 'forbidden cones' in 3-D space, inside which other objects will not fuse, is introduced. Also states how the disparity gradient limit enforces uniqueness and prevents order reversal. Found from experimentation that a change in the viewing distance scales both dot separation and disparity, and therefore the disparity gradient remains constant over a wide variety of viewing scales.
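For reference, the disparity gradient between two matched point pairs is the difference of their disparity vectors divided by the separation of their cyclopean (average) image positions; a minimal sketch (the example coordinates are illustrative, and the commonly quoted limit of about 1 is mentioned here only as context):

```python
import numpy as np

def disparity_gradient(xl1, xr1, xl2, xr2):
    """Disparity gradient between two matched point pairs, where xl*/xr* are
    the 2-D image coordinates of the left and right members of each match."""
    d1 = xl1 - xr1                    # disparity vector of match 1
    d2 = xl2 - xr2                    # disparity vector of match 2
    c1 = 0.5 * (xl1 + xr1)            # cyclopean position of match 1
    c2 = 0.5 * (xl2 + xr2)            # cyclopean position of match 2
    return np.linalg.norm(d1 - d2) / np.linalg.norm(c1 - c2)

# Disparities of 4 and 6 pixels, 9 pixels apart in cyclopean coordinates,
# give a disparity gradient of about 0.22, well inside a limit of ~1.
print(disparity_gradient(np.array([100.0, 50.0]), np.array([96.0, 50.0]),
                         np.array([110.0, 50.0]), np.array([104.0, 50.0])))
```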
Paper: Canny 85
Author: Canny JF
Title: A Computational Approach to Edge Detection
Date: January 1985
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, January 1985, Vol. 8, No. 6, pp. 679-698
Paper: Chang 97
Author: Chang YL, Aggarwal JK
Title: Line Correspondences from Cooperating Spatial and Temporal Grouping Processes for a Sequence of Images
Date: August 1997
Publisher: Computer Vision and Image Understanding, August 1997, Vol. 67, No. 2, pp. 186-201
Comments: Feature based motion-stereo. Work based upon relaxation labelling as used by [Barnard 80], only this time the relaxation is applied both temporally and spatially. The work is also very similar to [Ho 96], except that where Ho used point features and tracked their motions, Chang uses lines, making this a feature based algorithm. The comments about [Ho 96] apply equally here.
Paper: Cheng 93
Author: Cheng TK, Kitchen L
Title: Preliminary Results on Real Time 3D Feature Based Tracker
Date: December 1993
Publisher: DICTA-93 Conference, Australian Pattern Recognition Society, Sydney, December 1993
Paper: Chevrel 81
Author: Chevrel M, Courtis M, Weill G
Title: The SPOT Satellite Remote Sensing Mission
Date: 1981
Publisher: Photogrammetric Eng. Remote Sensing, 1981, Vol. 47, No. 8, pp. 1163-1171
Paper: Crossley 97
Author: Crossley S, Lacey AJ, Thacker NA, Seed NL
Title: Robust Stereo via Temporal Consistency
Date: 1997
Publisher: Proc. of the British Machine Vision Conference, 1997, pp. 659-668
Paper: Crossley 98
Author: Crossley S, Thacker NA, Seed NL
Title: Benchmarking of Bootstrap Temporal Stereo using Statistical and Physical Scene Modelling
Date: 1998
Publisher: Proc. of the British Machine Vision Conference, 1998, pp. 346-355
Comments: Temporal stereo vision. Robust bootstrapping of a temporal stereo algorithm. Includes benchmarking results obtained using statistical modelling of the algorithm and physical scene modelling to get non-subjective outlier counts.
Paper: Dalmia 96
Author: Dalmia AK, Trivedi M
Title: High Speed Extraction of 3D Structure of Selectable Quality using a Translating Camera
Date: July 1996
Publisher: Computer Vision and Image Understanding, July 1996, Vol. 64, No. 1, pp. 97-110
Comments: Optical flow based stereo from motion using a translating camera. A spatial and temporal gradient approach to finding depth without the need to solve the correspondence problem. Depth is deduced from temporal and spatial flow fields extracted from stereo pairs taken at different baseline lengths. The camera displacement vector is registered at the point where the temporal gradient equals the spatial gradient, which corresponds to a specific disparity. Using the camera displacement vector, the focal length, and the selected disparity value, the 3-D depth is easily calculated. The algorithm assumes that the scene is stationary (rigid body constraint). There is a trade-off, however, between several factors such as accuracy of depth, depth of field, and the amount the camera must translate: in order to perceive larger depths, the camera has to translate over larger distances.
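The final depth computation presumably reduces to the standard parallel-camera triangulation relation Z = fB/d; a minimal sketch under that assumption (the numbers are illustrative):

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Standard parallel-axis triangulation: Z = f * B / d.
    focal_length_px : focal length in pixels
    baseline_m      : camera translation (baseline) in metres
    disparity_px    : selected disparity in pixels (non-zero)."""
    return focal_length_px * baseline_m / disparity_px

# With f = 800 px, a 0.1 m translation and a 4 px disparity give Z = 20 m;
# the same disparity from a 0.4 m translation corresponds to Z = 80 m, which
# is why perceiving larger depths requires larger camera translations.
print(depth_from_disparity(800.0, 0.1, 4.0))
print(depth_from_disparity(800.0, 0.4, 4.0))
```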
Paper: Dhond 89
Author: Dhond UR, Aggarwal JK
Title: Structure from Stereo - A Review
Date: November / December 1989
Publisher: IEEE Transactions on Systems, Man and Cybernetics, November / December 1989, Vol. 19, No. 6, pp. 1489-1510
Comments: Background to stereo and trinocular stereo. Compares area and feature based algorithms. Feature based algorithms: Marr and Poggio, Grimson, and Mayhew and Frisby. Area based algorithms: Marr and Poggio's cooperative algorithm, hierarchical approaches, trinocular stereo. Conclusion: algorithms need to be improved to give a lower percentage of false matches as well as better accuracy of depth estimates. Many references.
Paper: Dinkar 98
Author: Bhat DN, Nayar SK
Title: Ordinal Measures for Image Correspondence
Date: 1998
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, April 1998, Vol. 20, No. 4, pp. 415-423
Comments: Correspondence matching. A new method is suggested as an alternative to correlation for correspondence matching. The new technique is called an ordinal measure and uses the relative ordering of intensity values in windows (rank permutations) to obtain robust matching. They are independent of absolute intensity and the types of monotone transformations that can occur between stereo images.
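A rank-based window comparison in this spirit can be sketched as follows (the normalised L1 distance between rank permutations used here is an assumption, not necessarily the exact ordinal measure proposed in the paper):

```python
import numpy as np

def rank_permutation(window):
    """Rank of each pixel's intensity within the window (0 = darkest)."""
    flat = np.asarray(window, dtype=float).ravel()
    ranks = np.empty(flat.size, dtype=int)
    ranks[np.argsort(flat, kind="stable")] = np.arange(flat.size)
    return ranks

def ordinal_distance(win_a, win_b):
    """Compare two windows through their rank permutations only, so the measure
    is invariant to any monotone (order-preserving) intensity transformation."""
    ra, rb = rank_permutation(win_a), rank_permutation(win_b)
    return np.abs(ra - rb).sum() / ra.size

# A window and a gamma-transformed copy have identical rank orderings, so
# their ordinal distance is zero even though their intensities differ.
a = np.array([[10, 40], [80, 200]], dtype=float)
b = (a / 255.0) ** 0.5 * 255.0        # monotone transformation
print(ordinal_distance(a, b))         # 0.0
```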
Book: Duda 73
Author: Duda RO, Hart PE
Title: Pattern Recognition and Scene Analysis
Date: 1973
Publisher: Wiley
ISBN: 0471223611
Comments: Description of the correspondence problem.
Paper: Faugeras 92
Author: Faugeras OD
Title: What can be seen in Three Dimensions with an Uncalibrated Stereo Rig?
Date: 1992
Publisher: ECCV2, 1992, pp. 563-578
Book: Faugeras 93
Author: Faugeras OD
Title: Three-Dimensional Computer Vision
Date: 1993
Publisher: MIT Press
ISBN: 0262061589
Comments: Structure from motion and structure from stereo. 3D computer vision including stereo and structure from motion. Highlights problems with structure from motion with optical flow.
Paper: Fischler 81
Author: Fischler MA, Bolles RC
Title: Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography
Date: June 1981
Publisher: Communications of the ACM, June 1981, pp. 381-395
Paper: Foerstner 87
Author: Foerstner W, Gulch E
Title: A Fast Operator for Detection and Precise Location of Distinct Points, Corners, and Centres of Circular Features
Date: 1987
Publisher: Proc. ISPRS, 1987, Vol. 25, No. 3, pp. 281-305
Book: Foley 94
Author: Foley JD, van Dam A, Feiner SK, Hughes JF, Phillips RL
Title: Introduction to Computer Graphics
Date: 1994
Publisher: Addison Wesley
ISBN: 0201609215
Paper: Förstner 94
Author: Förstner W
Title: Diagnostics and Performance Evaluation in Computer Vision
Date: 1994
Publisher: Proc. Performance versus Methodology in Computer Vision, 1994, pp. 11-25
Comments: Algorithmic evaluation. To increase algorithmic performance, algorithms must be more robust to errors and include self-diagnosis to achieve autonomous evaluation of results. Urgent need for tools to analyse the results of computer vision algorithms by exploiting the redundancy in the data and by controlled tests.
Paper: Förstner 96
Author: Förstner W
Title: 10 Pros and Cons Against Performance Characterisation of Vision Algorithms
Date: 1996
Publisher: 1996 EPSRC Summer School on Computer Vision, The University of Surrey, Guildford
Comments: Algorithmic evaluation. Growing awareness that experimental proofs of algorithmic performance are insufficient. To allow a clear comparison of algorithms and to allow appropriate algorithms to be chosen for a particular task, algorithmic performance characterisation is necessary. A set of standard quality measures and a representative set of data for the possible input data classes allows two things: the user can specify appropriate application specific quality variables and probabilities, and can then invert the results of the various algorithms' simulations to select the appropriate algorithm for the task, even if the task was not one anticipated by the algorithm's author. Traffic light programs: the need for vision modules to contain self-diagnosis of performance. Simulations can prove the correctness of implementations and can help develop performance measures.
Paper: Gibson 50
Author: Gibson JJ
Title: The Perception of the Visual World
Date: 1950
Publisher: Houghton Mifflin
Comments: Structure from motion. Introduced the concept of optical flow.
Book: Gonzalez 87
Author: Gonzalez RC, Wintz G
Title: Digital Image Processing 2nd Edition
Date: 1987
Publisher: Addison Wesley
ISBN: 0201110261
Comments: Image transforms, enhancement and segmentation.
Paper: Grimson 85
Author: Grimson WEL
Title: Computational Experiments with a Feature Based Stereo Algorithm
Date: 1985
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1985, Vol. 7, No. 1, pp. 17-34
Comments: Stereo vision. An outline of the Marr-Poggio model of the human visual system, followed by the Marr-Poggio-Grimson stereo algorithm and its performance. A coarse to fine approach is used to limit the search space of possible matches. Laplacian of Gaussian masks of variable size, tuned to different spatial frequencies, are used for image filtering. Once matching is performed at the coarsest scale, the disparities are used to guide matching at the next finer scale. As the density of zero-crossings increases when the size of the filter is decreased, this coarse-to-fine control strategy allows very dense zero-crossing descriptions to be matched with greatly reduced false target problems, the coarser resolution matches driving the alignment process.
Paper: Grosso 89
Author: Grosso E, Sandini G, Tistarelli M
Title: 3D Object Reconstruction using Stereo and Motion
Date: November / December 1989
Publisher: IEEE Transactions on Systems, Man, and Cybernetics, 1989, Vol. 19, No. 6, pp. 1465-1476
Comments: Optical flow based motion-stereo. Integration of a correlation based stereo algorithm (fixed mode of operation) and a depth from motion parallax algorithm. Optical flow is used, so estimation of the camera motion is necessary (controlled and constrained camera motion is used to simplify this process). Depth is obtained from the optical flow and the known egomotion parameters. Integration of the results is based on the depth maps from stereo and motion together with uncertainty maps which encode the errors peculiar to each algorithm. Evidence of depth is then accumulated and the most probable world scene arrived at.
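The evidence accumulation step can be thought of as an uncertainty-weighted combination of the two depth maps; a minimal sketch, assuming each algorithm supplies a per-pixel variance (the inverse-variance weighting is an illustrative choice rather than the exact rule used in the paper):

```python
import numpy as np

def fuse_depth_maps(z_stereo, var_stereo, z_motion, var_motion):
    """Per-pixel inverse-variance weighted combination of two depth maps,
    giving more weight to whichever source claims the smaller uncertainty."""
    w_s = 1.0 / var_stereo
    w_m = 1.0 / var_motion
    fused = (w_s * z_stereo + w_m * z_motion) / (w_s + w_m)
    fused_var = 1.0 / (w_s + w_m)
    return fused, fused_var

# A pixel where stereo is confident (variance 0.1) and motion is not (0.4)
# ends up close to the stereo value, with a reduced combined variance.
z, v = fuse_depth_maps(np.array([2.0]), np.array([0.1]),
                       np.array([2.5]), np.array([0.4]))
print(z, v)   # ~[2.1] [0.08]
```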
Paper: Grosso 95
Author: Grosso E, Tistarelli M
Title: Active / Dynamic Stereo Vision
Date: 1995
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995, Vol. 17, No. 11, pp. 1117-1128
Comments: Optical flow based motion-stereo. The task is to detect corridors of free space along which a robot could navigate. Tries to avoid the need for detailed calibration procedures by using 'active' techniques for controlling the cameras' positioning, which allows self-calibration. Algorithmic robustness can certainly be improved by providing independent estimates of the same quantity. Derives an equation for depth based on angular disparity. The cameras are independently movable, with encoders to record their rotation. This helps solve the optical flow equations because the rotation is known, leaving just the translation to be calculated.
Paper: Gruen 85
Author: Gruen AW
Title: Adaptive Least Squares Correlation - A Powerful Image Matching Technique
Date: 1985
Publisher: South African Journal of Photogrammetry, Remote Sensing, and Cartography, 1985, Vol. 14, No. 3
Comments: Area correlation based stereo. Uses window shaping in the form of affine transformations to accommodate local perspective distortions.
Paper: Hannah 80
Author: Hannah MJ
Title: Bootstrap Stereo
Date: April 1980
Publisher: Proc. ARPA Image Understanding Workshop, April 1980, pp. 201-208
Paper: Hannah 85
Author: Hannah MJ
Title: SRI's Baseline Stereo System
Date: 1985
Publisher: Proc. DARPA Image Understanding Workshop, 1985, pp. 149-155
Comments: Area correlation based stereo. Includes hierarchical multi-scale processing.
Paper: Hannah 89
Author: Hannah MJ
Title: A System for Digital Stereo Image Matching
Date: 1989
Publisher: Photogrammetric Engineering and Remote Sensing, 1989, pp. 1765-1770
Comments: Area correlation based stereo. Includes hierarchical multi-scale processing.
Paper: Haralick 94
Author: Haralick RM
Title: Performance Characterisation Protocol in Computer Vision
Date: 1994
Publisher: CVGIP Image Understanding, 1994, Vol. 60, No. 2, pp. 245-249
Comments: Algorithmic evaluation. Discusses the meaning of performance characterisation and various protocols with which an algorithm's performance can be characterised. There is no point in studying the performance of an algorithm on perfect data, because with no input noise or random variation the output should also be perfect. Performance characterisation is really about seeing how noise and imperfect input data affect the quality of the output data. Models the input data, the random perturbations (noise), and the output data. The data unit may change from one processing step to the next, and therefore the nature of the errors propagated may change. When capturing a representative set of sample images, they should be chosen across the full range of lighting, object position, object orientation, permissible object shape variation, occlusion, clutter, distortion, and noise. The protocol uses two random perturbations: a small, usually Gaussian, perturbation applied to all input data units, and a large perturbation which is applied to a fraction of the data units and can be modelled by simply replacing values with other totally unrelated values. New data units can be introduced (false alarms) and others removed (misdetections). Most algorithms can handle the small variations but fail with the large fractional perturbations. Characterisation is then specified by how much of this large random perturbation the algorithm can tolerate and still give good results. Algorithms which give good results when large perturbations are applied to a small fraction of data units can be said to be robust. Methods are derived for determining robustness and reliability measures. A good protocol summary is given.
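A sketch of the two-level perturbation model described above (the noise level, gross-error fraction and replacement range are illustrative assumptions):

```python
import numpy as np

def perturb(data, sigma=0.02, gross_fraction=0.05, gross_range=(0.0, 1.0), rng=None):
    """Two-level perturbation of an array of input data units: every value gets
    small Gaussian noise, and a small fraction of values is replaced by totally
    unrelated values drawn uniformly from gross_range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = data + rng.normal(0.0, sigma, size=data.shape)
    gross = rng.random(data.shape) < gross_fraction
    noisy[gross] = rng.uniform(gross_range[0], gross_range[1], size=gross.sum())
    return noisy

# Characterisation then amounts to sweeping gross_fraction upwards and recording
# the level at which the algorithm under test stops giving good results.
rng = np.random.default_rng(0)
clean = np.linspace(0.0, 1.0, 1000)
print(np.abs(perturb(clean, rng=rng) - clean).mean())
```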
Paper: Harris 88
Author: Harris C, Stephens M
Title: A Combined Corner and Edge Detector
Date: 1988
Publisher: Proc. of the Fourth Alvey Vision Conference, 1988, pp. 147-151
Paper: Harris 98
Author: Harris AJ, Thacker NA, Lacey AJ
Title: Modelling Feature Based Stereo Vision for Range Sensor Simulation
Date: June 1998
Publisher: Proc. of the European Simulation Multiconference, June 1998, pp. 417-421
Comments: Stereo algorithmic evaluation. Derivation of the errors contained in feature based stereo vision. The paper describes the construction of a model for a general feature based stereo vision algorithm. The operational characteristics, data types and linear and non-linear errors based on error propagation techniques are all modelled, and the model is simulated to assess the robustness of the stereo vision system in the context of an automatic collision avoidance system on a semi-autonomous wheelchair. The vision system is assumed to be either corner or edge based stereo, where the stereo is calibrated, although the exact method of calibration is unimportant. The corner or edge detection routines are assumed to have fixed accuracies associated with them, and from those values the expected disparity and real world errors can be calculated.
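The disparity-to-depth part of such an error model typically follows from first-order propagation through Z = fB/d; a sketch under that assumption (the propagation rule is standard, the numbers are illustrative):

```python
def depth_and_error(focal_px, baseline_m, disparity_px, sigma_disparity_px):
    """First-order propagation of a feature localisation error into depth:
    Z = f*B/d, so |dZ/dd| = f*B/d^2 and sigma_Z ~= (f*B/d^2) * sigma_d."""
    z = focal_px * baseline_m / disparity_px
    sigma_z = focal_px * baseline_m / disparity_px ** 2 * sigma_disparity_px
    return z, sigma_z

# A 0.5 px disparity error at 10 px disparity (f = 800 px, B = 0.2 m) gives
# Z = 16 m with sigma_Z = 0.8 m; the same error at 40 px disparity gives
# Z = 4 m with sigma_Z = 0.05 m, i.e. range error grows rapidly with depth.
print(depth_and_error(800.0, 0.2, 10.0, 0.5))
print(depth_and_error(800.0, 0.2, 40.0, 0.5))
```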
Paper: Ho 96
Author: Ho AYK, Pong TC
Title: Cooperative Fusion of Stereo and Motion
Date: January 1996
Publisher: Pattern Recognition, January 1996, Vol. 29, No. 1, pp. 121-130
Comments: Feature based motion-stereo. Recovery of stereo disparity and image flow values. Two successive pairs of stereo images give four correspondence sub-processes: two stereo and two motion. Each sub-process can use information from the others to try to resolve any ambiguities that arise. Correspondences are established using [Barnard 80]'s method; however, the consistency constraint becomes a three-rule system: uniqueness, consistency (as in [Barnard 80]), and image flow continuity. The 3-D continuity constraint is used to disambiguate the matching process (but not to guide it). This is an iterative algorithm, with approximately 10 iterations per matcher.
Paper: Ho 97
Author: Ho PK, Chung R
Title: Stereo-Motion that Complements Stereo and Motion Analysis
Date: 1997
Publisher: Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997, pp. 213-218
Comments: Feature based motion-stereo. This system solves the two multi-ocular correspondence problems of motion and stereo using singular value decomposition. It starts with the assumption of affine projection and two stereo cameras that move along a known path through a scene, taking a sequence of images during which features are tracked (i.e. the motion correspondence problem is solved for the left and right images). Using some initially known correct matches (determined using one of the established stereo matching techniques, such as exploiting epipolar geometry and the disparity gradient), the algorithm then uses SVD to arrive at a set of stereo correspondences that give 3-D world points consistent across the entire image sequence. However, this algorithm does rely on several assumptions: that both the temporal and initial stereo correspondences can be solved robustly, that the scene comprises a single rigid body, and that the camera projection can be approximated by an affine projection (only valid where the scene is not too close to the cameras). The technique is also really only effective for short sequences of images where the likelihood of points becoming occluded or new points appearing is small.
Paper: Hollinghurst 94
Author: Hollinghurst N, Cipolla R
Title: Uncalibrated Stereo Hand-Eye Coordination
Date: April 1994
Publisher: Image and Vision Computing, April 1994, Vol. 12, No. 3, pp. 187-192
Comments: Stereo vision. A system that combines stereo vision with a robot manipulator arm to enable it to locate and reach for objects in an unstructured environment. The system is self-calibrating, moving the manipulator arm to four (arbitrary) reference points. Errors are minimised by using visual feedback on the gripper's position and orientation. The stereo system relies on weak perspective, which simplifies the calibration problem, as the entire projection mathematics becomes a simple linear mapping and the correspondences between the two images are related by an affine transformation.
Paper: Horaud 89
Author: Horaud R, Skordas T
Title: Stereo Correspondence through Feature Grouping and Maximal Cliques
Date: 1989
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, Vol. 11, pp. 1168-1180
Comments: Feature based stereo. Line matching using feature relations (collinear-with, same-junction-as, left-of, etc) between lines using relational graphs.
Paper: Horn 81
Author: Horn BKP, Schunck BG
Title: Constraints on Optical Flow Computations
Date: 1981
Publisher: Proc. IEEE Conference on Pattern Recognition and Image Processing, 1981, pp. 205-210
Comments: Structure from motion using optical flow. Method proposed to recover the full motion field. A relaxation algorithm is used to apply a local smoothness of motion constraint, aiming to reach a more global consensus on scene motion than using individual measurements alone.
Paper: Hung 95
Author: Hung YP, Tang CY, Shih SW, Chen Z, Lin WS
Title: A 3D Predictive Visual Tracker for Tracking Multiple Moving Objects with a Stereo Vision System
Date: 1995
Publisher: Lecture Notes in Computer Science, 1995, Vol. 1024, pp. 25-32
Comments: Feature based motion-stereo. A 3-D predictive tracker of point features. Uses a linear Kalman filter for motion tracking estimation. The stereo cameras are calibrated so the system can exploit the epipolar search geometry and 'mutually supported' consistency. The algorithm operates as follows: 1) Feature extraction. 2) A 2D temporal matcher is used on the left and right image streams separately; features are matched temporally using the motion prediction parameters. 3) Stereo correspondences are established and the 3-D positions calculated. 4) RANSAC based clustering of the feature motions. 5) The Kalman filter predicts future positions for the next iteration. RANSAC clustering: point features exhibiting similar motion parameters (rotation and translation matrices) are grouped into a single cluster. Points can be removed and new clusters formed to allow for changing / new objects. Motion prediction using Kalman filters: each cluster is handled by an individual Kalman filter. The state vector holds values for angular velocity and acceleration, centre of rotation, and translational velocity and acceleration. Constant acceleration is assumed. Initially, when the algorithm has no knowledge of scene content, a conventional stereo algorithm is used to extract 3D data from feature data in the scene. Once the rigid body motions have been recovered, the future motion predictions can then be used to help constrain the temporal correspondence matching for the next scene.
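A minimal sketch of the constant-acceleration prediction step for a single translational axis (illustrative only; the paper's state vector also carries rotational terms and a centre of rotation, and the process noise value here is an assumption):

```python
import numpy as np

def predict(x, P, dt, q=1e-3):
    """Kalman prediction for a [position, velocity, acceleration] state under a
    constant-acceleration model.  x is the state vector, P its covariance."""
    F = np.array([[1.0, dt, 0.5 * dt * dt],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])
    Q = q * np.eye(3)                 # simple process noise (assumed form)
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

# A cluster at 1.0 m moving at 0.2 m/s and accelerating at 0.05 m/s^2 is
# predicted roughly 0.008 m further on 0.04 s later; it is this prediction
# that narrows the temporal correspondence search in the next frame.
x0 = np.array([1.0, 0.2, 0.05])
P0 = np.eye(3) * 0.01
print(predict(x0, P0, dt=0.04)[0])
```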
Paper: Hung 95b
Author: Hung YP, Tang CY, Shih SW, Chen Z, Lin WS
Title: A 3D Feature Based Tracker for Tracking Multiple Moving Objects with a Controlled Binocular Head
Date: 1995
Publisher: Technical Report TR-IIS-95-004, Institute of Information Science, Academia Sinica, Taiwan, 1995
Comments: Description of the mutually supported consistency constraint which is used in the [Hung 95] stereo matcher.
Paper: Illingworth 98
Author: Illingworth J, Hilton A
Title: Looking to Build a Model World: Automated Construction of Static Object Models using Computer Vision
Date: June 1998
Publisher: Electronics and Communication Engineering Journal, June 1998, Vol. 10, No. 3, pp. 103-113
Comments: Active vision and uncalibrated stereo vision. A good introduction including descriptions of several fields that require the use of accurate 3-D models and hence the use of a 3-D acquisition scheme such as stereo vision. These fields are: industrial manufacture, where hand crafted clay models such as those in the motor industry are input into CAD/CAM systems, or a master object is reverse engineered for subsequent production. Building and architecture, where 3-D models are transferred into CAD systems or captured for use in VRML modelling for advertising on the web or virtual reality demonstrations to customers. Retailing, where accurate 3-D capture would allow cheap customised tailoring for clothing. Medicine and biometrics, where again the acquisition of human shapes would permit better diagnosis, design of prostheses or personal identification in security systems. Communication and broadcast systems, where model based coding schemes allow object models to be transmitted just once, after which only high level descriptions of changes need be transmitted, giving large savings in bandwidth. Entertainment industries, where computer games and films increasingly use computer models to achieve special effects. Robotics and automation, where autonomous robots have to navigate and interact with the 3-D world and therefore need the capacity to virtually model their surrounding environment. Education and information provision industries, where virtual reality models can enhance understanding of meaning, inner workings or 3-D structure. The 3-D acquisition techniques covered here are touch probe co-ordinate measuring machines (CMMs), models from silhouettes, active range sensors, and models from video sequences using uncalibrated stereo. The problem with the touch probe system is that it requires an expensive CMM and is very slow, which could mean only a sparse set of datum points being recovered. Silhouettes fail when concavities are present, although quite complex models can be constructed providing enough silhouettes are gathered from various views around the object in question. The active range sensor studied here is a laser striping system using a single camera to view the shape of the laser line and hence deduce an object's shape. This system is also expensive and time consuming, as there has to be a physical scanning motion, either moving the laser/camera system around the object or moving the object under the laser stripe. The models from video sequences system uses uncalibrated stereo to simultaneously calculate the camera and motion calibration (fundamental matrix) and reconstruct scene geometry from a single video sequence. The estimation of the fundamental matrix uses a RANSAC (RANdom SAmple Consensus) methodology to ensure statistical robustness, although the data collected by the stereo algorithms will not be as dense as that produced by the active range sensor methods. However, the biggest advantage of the video based methods is the ease of data collection combined with cheap commercially available equipment, which makes such methods ideal for a wide range of real world environments. Concludes that most of the problems mentioned can now be solved quite well with existing techniques, to the accuracy and ease of use that general use will require.
Paper: Inria 91
Author: Inria Research Laboratories France
Title: A Parallel Stereo Algorithm that Produces Dense Depth Maps and Preserves Image Features
Date: 1991
Publisher: Research Report, 1991, No. 1369
Comments: Area correlation based stereo.
Book: Iyengar 91
Author: Iyengar SS, Elfes A
Title: Autonomous Mobile Robots
Date: 1991
Publisher: IEEE Computer Society Press
ISBN: 0818690186
Paper: Jenkin 86
Author: Jenkin M, Tsotsos JK
Title: Applying Temporal Constraints to the Dynamic Stereo Problem
Date: 1986
Publisher: Computer Vision, Graphics, and Image Processing, 1986, Vol. 33, pp. 16-32
Comments: Feature based motion-stereo. The algorithm uses a general smoothness assumption in both the temporal and spatial domains. A simple 3-D feature motion model is used to guide the matching process. Stereopsis constraints: off-epipolar error and a maximum disparity cut. Temporal constraints: a maximum distance cut on how far an object can move between frames and a maximum velocity cut on how much the velocity of an object can change between frames; these define a region in space within which valid matches must be found. Claims there is no evidence for the 'global' spatial relaxation techniques used to assign final disparities, and no such techniques are used here; instead points are assigned labels such as 'create', 'split', 'track' and 'merge'. These hypotheses, formed for frames t0 and t1, are then tested by looking at frames t1 and t2, and the most 'continuous' is chosen. It is expected that for 'noise' points there will be no temporal continuity and so those matches will be discarded. Of course, with no motion the hypothesis generation fails and the result degrades to however the initial matches were chosen.
Paper: Jones 95
Author: Jones AG, Taylor CJ
Title: Scale Space Surface Recovery using Binocular Shading and Stereo Information
Date: 1995
Publisher: Proc. of the British Machine Vision Conference, 1995, Vol. 1, pp. 77-86
Paper: Julesz 61
Author: Julesz B
Title: Binocular Depth Perception and Pattern Recognition
Date: 1961
Publisher: Proc. of the Fourth London Symposium on Information Theory, 1961, pp. 212-224
Comments: Shows that depth perception by humans is principally a function of processes operating on the fused binocular field, with the computer model he suggests being based on the binocular parallax shifts of the left image with respect to the right.
Paper: Kalman 60
Author: Kalman RE
Title: A New Approach to Linear Filtering and Prediction Problems
Date: March 1960
Publisher: Transactions of the ASME, Journal of Basic Engineering, March 1960, pp. 35-45
Paper: Krol 80
Author: Krol JD, Grind WA van de
Title: The Double Nail Illusion: Experiments on Binocular Vision with Nails, Needles and Pins
Date: 1980
Publisher: Perception, 1980, Vol. 9, pp. 651-669
Comments: Shows that the human visual system cannot cope with order reversed stereo fusion.
Paper: Lacey 96
Author: Lacey AJ, Thacker NA, Crossley S, Yates RB
Title: Surface Approximation from Industrial SEM Images
Date: 1996
Publisher: Proc. of the British Machine Vision Conference, 1996, pp. 725-734
Comments: Practical use for stereo depth estimation. Need for fast processing for real time visual feedback.
Paper: Lacey 98
Author: Lacey AJ
Title: Automatic Extraction and Tracking of Moving Image Features
Date: 1998
Publisher: PhD Thesis, The University of Sheffield, 1998
Comments: Optical flow gives ambiguous results, only being good for extracting the component of velocity parallel to the direction of the intensity gradient. Much better to use features that are well locatable and unique, such as corners.
Paper: Lacey 98b
Author: Lacey AJ, Thacker NA, Crossley S, Yates RB
Title: A Multi-Stage Approach to the Dense Estimation of Disparity from Stereo SEM Images
Date: 1998
Publisher: Image and Vision Computing, 1998, Vol. 16, pp. 373-383
Comments: Attempts to reconstruct dense depth maps from SEM images. A multi-stage approach was taken, starting with a calibration process that exploited the constraints of an SEM stereo image pair and epipolar error minimisation. Stretch correlation was then used to reconstruct sparse but accurate edge depth information. Finally, B-fitting was used to construct the surface of materials viewed under an SEM, filling in the gaps between the known disparity data. A new soft rank-order filtering process was employed to counter the effects of illumination changes due to the change in viewpoint, so the function fitting process would be more reliable.
Paper: Lane 94
Author: Lane RA, Thacker NA, Seed NL
Title: Stretch Correlation as a Real Time Alternative to Feature Based Stereo Matching Algorithms
Date: May 1994
Publisher: Image and Vision Computing, 1994, Vol. 12, No. 4, pp. 203-212
Comments: Stereo vision. A feature based algorithm using correlation of edge enhanced, information rich areas of the image to drive the correlation process. Uses warping to tackle difficult non-fronto-parallel stereo problems. Suited to fast hardware implementation and temporal acceleration.
Paper: Lane 95
Author: Lane RA
Title: Edge Based Stereo Vision with a VLSI Implementation
Date: June 1995
Publisher: PhD Thesis, The University of Sheffield, June 1995
Comments: The stretch correlation algorithm [Lane 94] and design of VCP chip [Lane 96].
Paper: Lane 96
Author: Lane RA, Thacker NA, Seed NL, Ivey PA
Title: A Generalised Computer Vision Chip
Date: April 1996
Publisher: Real Time Imaging, April 1996, Vol. 2, pp. 203-213
Comments: Design and construction of a VLSI device to accelerate convolution in hardware. Acceleration of the stretch correlation algorithm.
Paper: Levine 73
Author: Levine MD, O'Handley DA, Yagi GM
Title: Computer Determination of Depth Maps
Date: 1973
Publisher: Computer Graphics and Image Processing, 1973, Vol. 2, No. 2, pp. 131-150
Comments: Stereo vision. Paper concerned with a Mars roving vehicle equipped with a stereo camera system. The basic problem is to isolate and identify objects and the relationships between objects; specifically, the range to objects must be calculated. Computes the range image as a series of contour lines. The epipolar constraint is exploited and correlation matching is used. If too small a correlation window is used, noise dominates; if too large a window is used, poorly defined and therefore inaccurate edges result. An adaptive windowing system is employed in which the window's significant dimension is dependent on the local grey level variance. For regions with small variance, larger windows are required to capture a significant amount of information and ensure proper correspondence; for larger variances, there must be more local texture and a smaller window can be employed. The algorithm establishes gross correspondences first and then moves onto finer scales to refine the result.
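The adaptive windowing idea can be sketched as below (the mapping from local grey-level variance to window size is an illustrative choice, not the paper's exact rule):

```python
import numpy as np

def adaptive_window_size(patch, min_size=5, max_size=21, var_threshold=400.0):
    """Choose a correlation window size from local grey-level variance: flat,
    low-variance regions get large windows so enough information is captured,
    highly textured regions get small windows for better localisation."""
    variance = float(np.var(patch))
    scale = min(variance / var_threshold, 1.0)   # 0 = flat, 1 = very textured
    size = int(round(max_size - scale * (max_size - min_size)))
    return size | 1                              # force an odd window size

flat = np.full((15, 15), 128.0)
textured = np.random.default_rng(0).integers(0, 256, (15, 15)).astype(float)
print(adaptive_window_size(flat), adaptive_window_size(textured))   # 21 5
```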
Paper: Lew 94
Author: Lew MS, Huang TS, Wong K
Title: Learning and Feature Selection in Stereo Matching
Date: September 1994
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, September 1994, Vol. 16, No. 9, pp. 869-881
Paper: Liu 93
Author: Liu J, Skerjane R
Title: Stereo and Motion Correspondence in a Sequence of Stereo Images
Date: October 1993
Publisher: Signal Processing: Image Communication, October 1993, Vol. 5, No. 4, pp. 305-318
Comments: Feature based motion-stereo. The algorithm processes each new stereo pair by initially tracking the edge points from the previous pair, starting at the coarsest scale of the image pyramids. Once the motion field has been recovered, the algorithm performs disparity estimation using dynamic programming along epipolar lines, applied hierarchically to the stereo image pyramids. Three terms are used in the dynamic programming cost function to combine the motion and multi-scale match data. The first is an inter-line cost, added because most edge features are connected across epipolar lines and are, therefore, likely to have common correspondences. The second is a multi-scale cost that enforces consistency with correspondences already established for the previous pyramid level. Finally, there is a motion cost derived from the binocular disparity difference constraint that enforces consistency between successive stereo image pairs. The final set of stereo correspondences from the finest scale is then used to reconstruct the 3D data for the scene.
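A minimal dynamic-programming scanline matcher in the general style referred to above (a single-cost version with an assumed occlusion penalty; the inter-line, multi-scale and motion cost terms used in the paper are not reproduced here):

```python
import numpy as np

def dp_scanline(left_row, right_row, occlusion_cost=20.0):
    """Match one rectified scanline pair by dynamic programming and return the
    total cost.  States are positions (i, j) along the two rows, with moves:
    match (diagonal) or skip a pixel in either row (occlusion).  The monotone
    path implicitly enforces the ordering constraint."""
    n, m = len(left_row), len(right_row)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = np.arange(m + 1) * occlusion_cost
    D[:, 0] = np.arange(n + 1) * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = D[i - 1, j - 1] + abs(float(left_row[i - 1]) - float(right_row[j - 1]))
            D[i, j] = min(match,
                          D[i - 1, j] + occlusion_cost,   # pixel only visible on the left
                          D[i, j - 1] + occlusion_cost)   # pixel only visible on the right
    return D[n, m]

left = np.array([10, 10, 200, 10, 10], dtype=float)
right = np.array([10, 200, 10, 10, 10], dtype=float)   # bright pixel shifted by one
print(dp_scanline(left, right))   # aligns the bright pixels via two occlusions
```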
Paper: Lloyd 87
Author: Lloyd SA
Title: A Parallel Binocular Stereo Algorithm Utilizing Dynamic Programming and Relaxation Labelling
Date: 1987
Publisher: Computer Vision, Graphics, and Image Processing, 1987, Vol. 39, pp. 202-225
Paper: Marik 96
Author: Marik R, Kittler J, Petrou M
Title: Error Sensitivity Assessment of Vision Algorithms Based on Direct Error Propagation
Date: 1996
Publisher: 1996 EPSRC Summer School on Computer Vision, The University of Surrey, Guildford
Comments: Algorithmic evaluation. Errors are propagated through a series of mathematical steps to see how noise propagates and how the precision limitations of hardware implementations affect the results. Two methods are used: variance propagation and min/max value propagation.
Paper: Marr 76
Author: Marr D, Poggio T
Title: Cooperative Computation of Stereo Disparity
Date: October 1976
Publisher: Science, October 1976, Vol. 194, pp. 283-287
Comments: Co-operative algorithms: a class of parallel algorithms that operate on many input elements to reach a global organisation by way of local interactive constraints (relaxation algorithms). Human vision: three steps are involved in measuring disparity; S1, a location on a surface in one image must be selected; S2, the same location must be found in the other image; S3, the disparity must be measured. Using unambiguous cues, e.g. structured light, makes S1 and S2 easier, but in reality the correspondence problem is the one to solve. From two constraints on the physical world (C1, a given point on a physical surface has a unique position in space at any one time; C2, matter is cohesive, separated into objects, and generally smooth compared with the distance from the viewer) the uniqueness and continuity rules are derived. The algorithm uses a mesh of connected cells which communicate with each other to iterate towards a solution.
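A sketch of a cooperative update of this kind on a small match array (a 1-D image is used for brevity, and the neighbourhood size, inhibition weight and threshold are illustrative values rather than Marr and Poggio's parameters):

```python
import numpy as np

def cooperative_step(C, C0, excit_radius=2, epsilon=2.0, threshold=3.0):
    """One iteration of a cooperative update on a match array C[x, d].
    Excitation comes from cells at the same disparity and nearby x (continuity);
    inhibition comes from cells at the same x but other disparities (uniqueness)."""
    nx, nd = C.shape
    new = np.zeros_like(C)
    for x in range(nx):
        for d in range(nd):
            lo, hi = max(0, x - excit_radius), min(nx, x + excit_radius + 1)
            excite = C[lo:hi, d].sum() - C[x, d]
            inhibit = C[x, :].sum() - C[x, d]
            s = excite - epsilon * inhibit + C0[x, d]
            new[x, d] = 1.0 if s >= threshold else 0.0
    return new

rng = np.random.default_rng(1)
C0 = (rng.random((20, 5)) < 0.3).astype(float)   # initial candidate matches
C = C0.copy()
for _ in range(5):
    C = cooperative_step(C, C0)
print(C.sum(axis=1))   # ideally at most one surviving disparity per position
```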
Paper: Marr 79
Author: Marr D, Poggio T
Title: A Computational Theory of Human Stereo Vision
Date: 1979
Publisher: Proc. of the Royal Society of London, 1979, B, Vol. 204, pp. 301-338
Comments: Based on the psychophysical studies of the human visual system, a simple set of spatial filters are used to locate features for correlation at a number of spatial resolutions. Stereo images are filtered by Laplacian of Gaussian operators at different scales. Matches at coarse scales are used to constrain the searching at finer scales.
Paper: Marr 80
Author: Marr D, Hildreth E
Title: Theory of Edge Detection
Date: 1980
Publisher: Proc. of the Royal Society of London, Series B, 1980, Vol. 207, pp. 187-217
Comments: The Marr-Hildreth edge detector.
Paper: Matthies 89
Author: Matthies L, Okutomi M
Title: Bootstrap Algorithms for Dynamic Stereo Vision
Date: 1989
Publisher: Proc. of the 6th Multidimensional Signal Processing Workshop, 1989, p. 12
Comments: Stereo from motion of translating camera. Highlights narrow baseline stereo using moving/translating cameras to help solve the stereo correspondence problem.
Paper: Matthies 89b
Author: Matthies L, Kanade T, Szeliski R
Title: Kalman Filter Based Algorithms for Estimating Depth from Image Sequences
Date: 1989
Publisher: International Journal of Computer Vision, 1989, Vol. 3, No. 3, pp. 209-238
Comments: Structure from motion. Two methods are studied, the first uses edge features which are tracked using Kalman filtering. The second method produces dense depth maps and depth uncertainty maps again using Kalman filtering based methods. Both techniques require known camera motion and the size of the motion in between frames has to be small for the correlation based optical flow estimation to work. Small motion minimises the correspondence problem between successive images, but sacrifices depth resolution because of the small baseline between consecutive pairs.
Paper: Maybank 92
Author: Maybank SJ, Faugeras OD
Title: A Theory of Self-Calibration of a Moving Camera
Date: 1992
Publisher: International Journal of Computer Vision, 1992, Vol. 8, No. 2, pp. 123-151
Paper: Mayhew 81
Author: Mayhew JEW, Frisby JP
Title: Psychophysical and Computational Studies towards a Theory of Human Stereopsis
Date: 1981
Publisher: Artificial Intelligence, 1981, Vol. 17, pp. 349-385
Comments: Stereopsis has two distinctive characteristics: 1) Disparity can only be calculated from low level monocular 'point' descriptions which are binocularly matched. Higher level surface or object matching is unnecessary, as random dot stereograms work very well without them and the high level descriptions appear only after stereopsis has been achieved. 2) Solving the correspondence problem requires 'global stereopsis' when the ambiguity of false or ghost matches occurs. So the two choices available to the algorithm designer are what monocular cues to use and what global mechanism to use to resolve ambiguity. The paper talks about using zero crossing points and peak and trough points as possible monocular cues, both of which can be picked out using spatially tuned frequency filters. The conclusions are that figural continuity is important; matter is cohesive and edges and surface markings will be spatially continuous. Human binocular vision combines primitives such as zero crossings and peaks derived from several spatially tuned frequency channels, with matches chosen according to cross-channel combination rules. Dense complex textures are difficult because blurring can occur across two or more edges.
Book: Mayhew 91
Author: Mayhew JEW, Frisby JP
Title: 3D Model Recognition from Stereoscopic Cues
Date: 1991
Publisher: MIT Press
ISBN: 0262132435
Comments: Combined works on stereo vision and visual reconstruction all carried out at AIVRU on the PMF stereo algorithm project, the 2.5D sketch project and the 3-D model-based vision project. The PMF project was to create an algorithm to solve the stereo correspondence problem. The 2.5D sketch project was to create a representation of the 3-D structure of the visible surfaces in the scene. The 3-D model-based vision project was to develop a scheme for the recognition and manipulation of 3-D objects using information about the 3-D structure delivered by the 2.5D sketch project.
Paper: McLauchlan 91
Author: McLauchlan PF, Mayhew JEW, Frisby JP
Title: Location and Description of Textured Surfaces using Stereo Vision
Date: 1991
Publisher: Published in [Mayhew 91]
Comments: Stereo vision. Stereo from textured smooth continuous surfaces. Unlike PMF which propagates local constraints, the Needles algorithm uses histogramming and Hough Transform techniques to give local constraints and region growing to get global constraints. All edge based like PMF.
Paper: Medioni 85
Author: Medioni G, Nevatia R
Title: Segment-based Stereo Matching
Date: 1985
Publisher: Computer Vision, Graphics, and Image Processing, 1985, Vol. 31, pp. 2-18
Comments: Stereo vision via high level line features.
Paper: Moravec 77
Author: Moravec HP
Title: Towards Automatic Visual Obstacle Avoidance
Date: 1977
Publisher: Proc. of the Fifth International Joint Conference on Artificial Intelligence, 1977, MIT, Cambridge, MA
Comments: Detection of discrete point features.
Paper: Mori 73
Author: Mori K, Kidode M, Asada H
Title: An Iterative Prediction and Correction Method for Automatic Stereocomparison
Date: 1973
Publisher: Computer Graphics and Image Processing, 1973, Vol. 2, pp. 393-401
Comments: Stereo vision. Area based correlation matching using a variable sized window.
Paper: Negahdaripour 95
Author: Negahdaripour S, Hayashi BY, Aloimonos Y
Title: Direct Motion Stereo for Passive Navigation
Date: December 1995
Publisher: IEEE Transactions on Robotics and Automation, December 1995, Vol. 11, No. 6, pp. 829-843
Comments: Optical flow based motion-stereo. An area based algorithm which recovers dense depth maps. The algorithm recovers the cameras' motions (rotation and translation) from the temporal and spatial derivatives in the left and right images. This gives two depth (disparity) maps, one for the left and one for the right; however, being depth from motion, these are only correct up to a scale. Negahdaripour then finds the correct 'disparity scale' by performing a search over a range of scales and seeing which scale maximises a correlation measure between the left and right disparity images. This method will only be as good as the underlying depth from motion extraction and is therefore subject to the same failures as standard structure from motion. It also assumes the rigid body constraint, as there is no segmentation of the motion.
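The scale-recovery step can be sketched as a one-dimensional search (minimising a mean squared difference is used here as a stand-in for the paper's correlation measure, and the search range is an assumption):

```python
import numpy as np

def recover_scale(depth_a, depth_b, scales=np.linspace(0.2, 5.0, 481)):
    """Search for the scale that makes two up-to-scale depth maps consistent,
    by minimising the mean squared difference between depth_a and scale * depth_b."""
    a, b = depth_a.ravel(), depth_b.ravel()
    errors = [np.mean((a - s * b) ** 2) for s in scales]
    return scales[int(np.argmin(errors))]

depth_left = np.array([[2.0, 3.0], [4.0, 5.0]])
depth_right = depth_left / 2.5        # same structure, unknown relative scale
print(recover_scale(depth_left, depth_right))   # ~2.5
```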
Paper: Nevatia 76
Author: Nevatia R
Title: Depth Measurement by Motion Stereo
Date: 1976
Publisher: Computer Graphics and Image Processing, 1976, Vol. 5, pp. 203-214
Comments: Stereo from motion using a translating camera. Uses a series of progressive views to constrain disparity to small values. This algorithm is very akin to [Dalmia 96] and their translating camera. This algorithm uses a rotating camera setup in exactly the same manner. Using a whole series of small rotations proves to make the correspondence problem much easier and the eventual result more robust.
Paper: Ohta 85
Author: Ohta Y, Kanade T
Title: Stereo by Intra- and Inter-Scanlines Search using Dynamic Programming
Date: 1985
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1985, Vol. 7, pp. 139-154
Comments: Stereo vision and dynamic programming.
Paper: Okutomi 92
Author: Okutomi M, Kanade T
Title: A Locally Adaptive Window for Signal Matching
Date: 1992
Publisher: International Journal of Computer Vision, 1992, Vol. 7, No. 2, pp. 143-162
Comments: Varying the size of window for maximum robustness when signal matching.
Paper: O'Neill 96
Author: O'Neill M, Denos M
Title: Automated System for Coarse to Fine Pyramidal Area Correlation Stereo Matching
Date: 1996
Publisher: Image and Vision Computing, 1996, Vol. 14, pp. 225-236
Comments: Stereo vision. Produces dense disparity models using area based correlation and multi-scale image pyramids. An area correlation based algorithm is used rather than a feature based method because it is better at handling the continuous texture found in topographic maps, as opposed to the intensity anomaly features which the latter methods are better at. Two main correlation algorithms are quoted: the original adaptive least squares correlation algorithm by Gruen, followed by an iterative sheet growing version of the same algorithm by Otto. These algorithms use affine transformations when correlating to accommodate the distortions found in non-rectified stereo images, and work at a fixed image scale: the original single pixel scale. The correlation algorithm used here is coarse to fine, with a variable window size which allows an exhaustive search with the ability to find the optimal matching point. Coarse: big window; fine: small window (more marginal differences between left and right begin to dominate the similarity metric). The trouble with the earlier algorithms was that they fail at discontinuities where there is a change in the disparity function in that area, they need seeding with accurate matches initially, and they fail when faced with insufficient texture. The new system does not need seed points because the new coarse to fine stereo matching component allows the existing area correlation algorithms to be reapplied at various scales. It is claimed that the new system increases the amount of stereo data eventually returned, with a higher degree of confidence in that data because it is less susceptible to noise or bad seed points.
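A generic coarse-to-fine disparity search over an image pyramid can be sketched as follows (sum-of-squared-differences area correlation stands in for the adaptive least squares correlation the paper builds on; the window sizes, search bands and synthetic test image are assumptions):

```python
import numpy as np

def downsample(img):
    """Halve an image by 2x2 block averaging (one pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def ssd_disparity(left, right, x, y, prior_d, search, half=3):
    """Best horizontal disparity at (x, y), searched in a band around prior_d
    using sum-of-squared-differences over a (2*half+1)^2 window."""
    best_d, best_cost = prior_d, np.inf
    patch_l = left[y - half:y + half + 1, x - half:x + half + 1]
    for d in range(prior_d - search, prior_d + search + 1):
        xr = x - d
        if xr - half < 0 or xr + half + 1 > right.shape[1]:
            continue
        patch_r = right[y - half:y + half + 1, xr - half:xr + half + 1]
        cost = float(((patch_l - patch_r) ** 2).sum())
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def coarse_to_fine(left, right, x, y, levels=3):
    """Estimate disparity with a wide search at the coarsest pyramid level,
    then refine over a narrow band at each finer level."""
    pyramid = [(left, right)]
    for _ in range(levels - 1):
        pyramid.append((downsample(pyramid[-1][0]), downsample(pyramid[-1][1])))
    d = 0
    for lvl in reversed(range(levels)):
        l_img, r_img = pyramid[lvl]
        d = ssd_disparity(l_img, r_img, x >> lvl, y >> lvl, d,
                          search=4 if lvl == levels - 1 else 1)
        if lvl > 0:
            d *= 2               # disparity doubles at the next finer level
    return d

# Synthetic test: a smooth blob plus a ramp, with the right image shifted by
# 6 pixels, should give a disparity of about 6 at the centre of the image.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
left = 100.0 * np.exp(-((xx - 30.0) ** 2 + (yy - 32.0) ** 2) / 200.0) + xx
right = np.roll(left, -6, axis=1)
print(coarse_to_fine(left, right, x=32, y=32))   # expect 6
```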
Paper: Otto 89
Author: Otto GP, Chau TKW
Title: Region Growing Algorithm for the Matching of Terrain Images
Date: 1989
Publisher: Image and Vision Computing, 1989, Vol. 7, No. 2, pp. 83-93
Comments: Stereo vision. Area based region growing. Grows smooth surfaces from a small number (5-10) of accurately measured seed points. Iterates, predicting neighbouring matches and then refining them using the adaptive least squares algorithm.
Paper: Ozeki 86
Author: Ozeki O, Nakano T, Yamamoto S
Title: Real Time Range Measurement Device for Three Dimensional Object Recognition
Date: 1986
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, Vol. 8, No. 4, pp. 550-554
Comments: Active vision. Using a laser striping system with known relationships between the camera and laser projecting system to detect objects in a 60cm square area at 100cm in 490ms. Accurate to +/-2cm. Used to discriminate between three types of machine parts on a conveyor belt in 4s.
Paper: Pollard 85
Author: Pollard SB, Mayhew JEW, Frisby JP
Title: PMF: A Stereo Correspondence Algorithm using a Disparity Gradient Limit
Date: 1985
Publisher: Perception, 1985, Vol. 14, pp. 449-470
Comments: Stereo vision. Edge based stereo, essentially derived from psychological findings about human vision. Disparity gradient: the disparity gradients between correct matches will be small almost everywhere. Epipolar constraint and uniqueness constraint also exploited.
Paper: Pollard 85b
Author: Pollard SB
Title: Identifying Correspondences in Binocular Stereo
Date: 1985
Publisher: PhD Thesis, The University of Sheffield, 1985
Comments: Stereo vision. The original PMF algorithm.
Paper: Pollard 86
Author: Pollard SB, Porrill J, Mayhew JEW, Frisby JP
Title: Disparity Gradient, Lipschitz Continuity, and Computing Binocular Correspondences
Date: 1986
Publisher: Proc. of the Third International Symposium of Robotics Research, 1986, pp. 19-26
Comments: Stereo vision.
Paper: Pollard 91
Author: Pollard SB, Mayhew JEW, Frisby JP
Title: Implementation Details of the PMF Stereo Algorithm
Date: 1991
Publisher: Published in [Mayhew 91]
Comments: Stereo vision. Implementation details focusing on PMF, with emphasis on robustness and efficiency. Also contains more global constraints than PMF originally did. Epipolar/edge rectification is used. Figural continuity looks at strings of edges; the edge with the most matching points is chosen. The figural continuity constraint is also used to fill in the gaps between seed points and is even allowed to break the edge orientation rule when trying to jump over the breaks or gaps. The ordering constraint enforces that edge strings must be in order along epipolars; if not, the weakest strings breaking the ordering are removed and considered in subsequent stages. Existing matches can be used with the ordering constraint to set disparity limits within which unmatched points must be matched so as not to violate the ordering constraint. A final iteration allows contrast reversal for matching edges.
Paper: Pollard 91b
Author: Pollard SB, Porrill J, Mayhew JEW
Title: Recovering Partial 3D Wire Frames Descriptions from Stereo Data
Date: 1991
Publisher: Image and Vision Computing, 1991, Vol. 9, No. 1, pp. 58-65
Paper: Porrill 88
Author: Porrill J, Pollard SB, Pridmore TP, Bowen JB, Mayhew JEW, Frisby JP
Title: TINA: A 3D Vision System for Pick and Place
Date: 1988
Publisher: Image and Vision Computing, 1988, Vol. 6, pp. 91-99
Comments: A description of the TINA vision system and how it integrated and implemented all of the parts of the AIVRU stereo project, from PMF, through grouping and segmentation of edge data into geometrical lines and arcs and through object model creation and 3-D wireframe model matching.
Book: Russ 94
Author: Russ JC
Title: The Image Processing Handbook 2nd Edition
Date: 1994
Publisher: CRC Press
ISBN: 0849325161
Paper: Shah 89
Author: Shah YC, Chapman R, Mahani RB
Title: A New Technique to Extract Range Information from Stereo Images
Date: 1989
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, Vol. 11, No. 7, pp. 768-773
Comments: Multi-scale stereo vision. Highlights the computational and functional advantages that the use of multiple resolution techniques can bring. Low resolution correlation matching ensures gross image features are matched rather than details. Once refined to high resolution correlations, finer details may be matched over a limited area.
Book: Sonka 93
Author: Sonka M, Hlavac V, Boyle R
Title: Image Processing, Analysis and Machine Vision
Date: 1993
Publisher: Chapman and Hill Computing
ISBN: 0412455706
Comments: Stereopsis, disparity gradient, PMF and shape from shading.
Book: Stucki 79
Author: Stucki P
Title: Advances in Digital Image Processing
Date: 1979
Publisher: Plenum Press
ISBN: 0306403145
Paper: Sung 93
Author: Sung E, Myint T
Title: Incorporating Color and Spatiotemporal Stereovision Techniques for Road Following
Date: 1993
Publisher: Proc. of SPIE, 1993, Vol. 1825, pp. 356-365
Comments: Feature based motion-stereo. Colour can greatly help in the processing of visual cues; however, colour segmentation can have problems with shadows and other additional matter on the roads. Stereo is not affected by shadows, etc., with passive stereo being particularly preferred. The new method relies on temporal feature tracking with an accurate odometry system on the vehicle. Using the odometry information, the location of any scene feature can be predicted from one frame to the next. This information, combined with colour, intensity and geometrical dimensions, is used to solve the motion correspondence of features between the frames of a sequence. Because the algorithm already has a set of stereo matches from the previous frame, the motion correspondences allow the stereo correspondences in the current frame to be established simply. The major assumptions made by the motion-stereo algorithm are: accurate odometry on the motion of the mobile robot is available; the world is stationary, with the only motion in the scene being due to the motion of the robot itself; and there is an accurate initial set of stereo correspondences for the scene available at start-up.
Paper: Thacker 91
Author: Thacker NA, Mayhew JEW
Title: Optimal Combination of Stereo Camera Calibration from Arbitrary Stereo Images
Date: 1991
Publisher: Image and Vision Computing, February 1991, Vol. 9, No. 1, pp. 27-32
Comments: Stereo camera calibration techniques. A good calibration algorithm should determine the calibration scale based on many different (uncorrelated) 3D measurements combined with ground truth data points, to arrive at a consensus calibration using error minimisation.
Paper: Thacker 92
Author: Thacker NA, Courtney P
Title: Statistical Analysis of a Stereo Matching Algorithm
Date: 1992
Publisher: Proc. of the British Machine Vision Conference, 1992, pp. 316-326
Comments: Derivation of the probability of matching errors for a corner matching algorithm using cross-correlation resulting in the conclusion that the probability of an error is proportional to the mean number of candidate matches and therefore is proportional to search area.
Paper: Thacker 99
Author: Thacker NA, Jackson A, Moriarty D, Vokurka E
Title: Improved Quality of Re-sliced MR Images using Re-normalised Sinc Interpolation
Date: 1999
Publisher: Journal of Magnetic Resonance Imaging, 1999, Vol. 10, pp. 582-588
Comments: Improved sinc interpolation technique.
Paper: Torr 97
Author: Torr PT, Fitzgibbon AW, Zisserman A
Title: Maintaining Multiple Motion Model Hypotheses Over Many Views to Recover Matching and Structure
Date: January 1998
Publisher: Proc. of the 6th International Conference on Computer Vision, January 1998, Bombay, pp. 485-491
Comments: Uncalibrated stereo vision. The paper outlines a methodology for calculating calibration information and scene structure simultaneously when processing a monocular video sequence. The scheme uses uncalibrated stereo and trifocal tensors to match features across long sequences of monocular images. Robustness is maintained throughout the sequence by identifying situations where the recovery of the epipolar geometry through motion becomes unreliable: under degenerate circumstances, such as pure camera rotation or when all of the visible features are coplanar, the structure from motion results cannot be used. Instead the matches are 'saved' until sufficient information (such as an out-of-plane match) becomes available, so that they can be properly integrated into the final 3-D result.
Paper: Trivedi 85
Author: Trivedi HP, Lloyd SA
Title: The Role of Disparity Gradient in Stereo Vision
Date: 1985
Publisher: Perception, 1985, Vol. 14, pp. 685-690
Comments: Stereo vision. Interprets the disparity gradient constraint as a method of enforcing topological equivalence between surfaces in the left and right images, and argues that the DG constraint is essential for solving the correspondence problem before surfaces are interpolated from the computed 3D points.
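For reference, the usual definition of the disparity gradient between two candidate matches can be sketched as follows (the vector form and the way the limit is applied are conventions of the illustration, not notation from the paper):

    import numpy as np

    def disparity_gradient(xl_a, xr_a, xl_b, xr_b):
        """Disparity gradient between candidate matches a and b.

        Each match is given by its left and right image positions (2-vectors).
        DG = |difference in disparity| / |cyclopean separation|, where the
        cyclopean position of a match is the mean of its two image positions.
        """
        d_a, d_b = xl_a - xr_a, xl_b - xr_b                  # disparity vectors
        cyc_a, cyc_b = 0.5 * (xl_a + xr_a), 0.5 * (xl_b + xr_b)
        sep = np.linalg.norm(cyc_a - cyc_b)
        return np.linalg.norm(d_a - d_b) / sep if sep > 0 else np.inf

    # A pair of matches supports one another only if their disparity gradient is
    # below a chosen limit; this is the topological ordering between the two views
    # that the paper argues is essential for solving the correspondence problem.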
Paper: Trivedi 87
Author: Trivedi HP
Title: Estimation of Stereo and Motion Parameters using a Variational Principle
Date: 1987
Publisher: Image and Vision Computing, May 1987, Vol. 5, No. 2, pp. 181-183
Paper: Tsai 87
Author: Tsai RY
Title: An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision
Date: 1987
Publisher: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 1987
Paper: Venkateswar 92
Author: Venkateswar V, Chellappa R
Title: Extraction of Straight Lines in Aerial Images
Date: 1992
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, Vol. 14, No. 11, pp. 1111-1114
Paper: Venkateswar 95
Author: Venkateswar V, Chellappa R
Title: Hierarchical Stereo and Motion Correspondence using Feature Groupings
Date: July 1995
Publisher: International Journal of Computer Vision, July 1995, Vol. 15, No. 3, pp. 245-269
Comments: Stereo vision. A feature based stereo algorithm, later extended to motion correspondence, which matches surfaces, lines and vertices. Structural relationships, along with attributes such as 'parallel to', 'co-linear with', 'left of' and 'right of', are used to resolve ambiguities and to match edge rings and eventually lines for stereo. The same algorithm is then applied to motion correspondence.
Paper: Verri 87
Author: Verri A, Poggio T
Title: Against Quantitative Optical Flow
Date: 1987
Publisher: Proc. of the 1st International Conference on Computer Vision, 1987
Comments: Motion can be recovered more reliably by focussing the motion analysis on areas of the image that carry significant information, such as features.
Paper: Wade 97
Author: Wade P, Moran D, Graham J, Brook Jackson C
Title: Robust and Accurate 3D Measurement of Formed Tube using Trinocular Stereo Vision
Date: 1997
Publisher: Proc. of the British Machine Vision Conference, 1997, Vol. 1, pp. 232-241
Comments: A vision system that is used to accurately measure the geometry of metal pipes which have been bent into complex 3D shapes. However, the vision system requires a very controlled working environment and can only recover the structure of pipes with a constant radius.
Paper: Wang 96
Author: Wang W, Duncan JH
Title: Recovering the Three Dimensional Motion and Structure of Multiple Moving Objects from Binocular Image Flows
Date: May 1996
Publisher: Computer Vision and Image Understanding, May 1996, Vol. 63, No. 3, pp. 430-446
Comments: Optical flow motion-stereo. A continuation of the work done in [Waxman 86]. This is an iterative algorithm that illustrates how structure from motion using optical flow and stereopsis can be combined to produce a co-operative result. Initially, rigid body motion is assumed for the whole scene. Using conventional SfM techniques, two motion vectors are recovered for the scene and two independent sets of 3D data from the left and right motion fields are calculated. Using the epipolar constraint, the algorithm then checks that disparities recovered from the 3D motion results comply with the observed stereo data. The 3D result is then refined, relying on the fact that the initially estimated motion will be close to the dominant rigid body motion in the scene (referred to as object 1). Therefore, the majority of the correct stereo matches will belong to object 1. This allows the algorithm to segment out the object 1 points and repeat the motion estimation process for those points alone, improving the accuracy of the recovered object 1 3D motion. By reapplying the recovered motion to unmatched points, new points belonging to object 1 can then be found. By iteratively estimating motions and using the stereo correspondence check, rigid objects with different motions are gradually matched and segmented. After start-up, processing further stereo frames becomes easier as the feature motion groupings from the previous frames can be carried into the current stage without having to resort to initial motion estimation. There must be significant variations between the motions of independently moving objects for a robust segmentation of the scene. Unfortunately, the algorithm is let down by the fact that the motion-stereo matching constraint is considerably weakened for scenes containing small feature velocities or no motion at all, a symptom of most algorithms relying on depth from motion as their primary depth cue.
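A minimal sketch of the fit-then-segment loop described above, assuming 3-D point correspondences between two time instants are already available; the rigid fit uses the standard SVD least-squares solution, and the threshold and iteration count are illustrative choices rather than values from the paper.

    import numpy as np

    def fit_rigid_motion(P, Q):
        """Least-squares rigid motion (R, t) mapping point set P onto Q (SVD method)."""
        cp, cq = P.mean(axis=0), Q.mean(axis=0)
        H = (P - cp).T @ (Q - cq)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                 # guard against a reflection solution
            Vt[-1] *= -1
            R = Vt.T @ U.T
        return R, cq - R @ cp

    def segment_dominant_motion(P, Q, iters=5, thresh=0.05):
        """Iteratively fit the dominant rigid motion and keep the points that obey it.

        P, Q : (N, 3) arrays of corresponding 3-D points at times t and t+1.
        Returns the motion (R, t) of 'object 1' and a boolean inlier mask.
        """
        inliers = np.ones(len(P), dtype=bool)
        for _ in range(iters):
            R, t = fit_rigid_motion(P[inliers], Q[inliers])
            resid = np.linalg.norm(P @ R.T + t - Q, axis=1)
            inliers = resid < thresh             # re-segment: points consistent with the motion
        return R, t, inliers

Points rejected by the mask can then be fed back into the same loop to recover the motions of the remaining independently moving objects, mirroring the iterative segmentation in the paper.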
Paper: Waxman 86
Author: Waxman AM, Duncan JH
Title: Binocular Image Flows: Steps Towards Stereo-Motion Fusion
Date: November 1986
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1986, Vol. 8, No. 6, pp. 715-729
Comments: Optical flow based motion-stereo. The binocular difference flow is defined as the difference between the left and right optical flow fields, where the right flow field is shifted by the current disparity field. Shows that the difference flow and the ratio of the rate of change of disparity to disparity are equivalent for image regions containing planar surfaces. Whilst this does not provide a direct solution to the correspondence problem (disparity must be known in order to calculate the difference flow), suggests two ways in which binocular difference flows can be used in stereo correspondence. Firstly, a vertical motion constraint can be derived, since the y components of the left and right flows at two corresponding points should be equal (i.e. the y component of the difference flow should be zero); this constraint can be used to supplement conventional stereo correspondence techniques. Alternatively, suggests using the ratio of the rate of change of disparity to disparity as a match score metric. Potential matches can be identified by using the x component values of the difference flow for the rate of change of disparity and then searching over a range of disparities, recording the ratio scores. A local support metric can then choose matches whose ratios are in agreement (cf. the disparity gradient constraint). Does still depend on recovering the full motion field from the optical flow field. However, does relax the rigid body constraint by segmenting the scene. Three problems still remain. Firstly, the segmentation process is problematic, being susceptible to noise and variations in the density of reliable optical flow points. Secondly, the technique will only work for scenes containing significant motions. Thirdly, independently moving objects must have significant variations between their motions for robust segmentation.
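A minimal sketch of the difference flow and the vertical constraint, assuming dense per-pixel flow fields and a current disparity estimate are already available (the array layout and tolerance are choices of the illustration, not the paper's):

    import numpy as np

    def difference_flow(flow_left, flow_right, x, y, disparity):
        """Binocular difference flow at left-image pixel (x, y).

        flow_left, flow_right : (H, W, 2) optical flow fields, (u, v) per pixel.
        disparity             : current disparity estimate at (x, y), so the
                                corresponding right-image pixel is (x - d, y).
        """
        d = int(round(disparity))
        return flow_left[y, x] - flow_right[y, x - d]

    def satisfies_vertical_constraint(df, tol=0.5):
        """Corresponding points should have (nearly) equal vertical flow, so the
        y component of the difference flow should be close to zero."""
        return abs(df[1]) < tol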
Book: Wolberg 90
Author: Wolberg G
Title: Digital Image Warping
Date: 1990
Publisher: IEEE Computer Society Press
ISBN: 0818689447
Comments: Good digital image processing book including chapters on affine and perspective warping.
Paper: Xu 87
Author: Xu G, Tsuji S, Asada M
Title: A Motion Stereo Method Based on Coarse to Fine Control Strategy
Date: 1987
Publisher: IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, Vol. 9, No. 2, pp. 332-336
Comments: Stereo from motion of a translating camera. Initial matches are established using a search range proportional to the width of the chosen LoG edge filter. However, successive searches become increasingly constrained as the baseline length increases. In these algorithms, Δd is the expected accuracy of the disparity estimates due to factors such as the edge location error and the mechanical positioning error of the camera. Whilst Δd remains constant, the ratio Δd/d (disparity error / measured disparity) decreases with increasing baseline length, which means that accuracy improves as the baseline becomes longer.
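A worked illustration of that ratio with invented numbers (the focal length, depth and disparity error are assumptions, not values from the paper):

    # Relative disparity error for a fixed depth as the effective baseline grows.
    f_px, Z_mm, delta_d = 800.0, 4000.0, 0.5       # focal length (px), depth (mm), disparity error (px)
    for B_mm in (10.0, 40.0, 160.0):               # effective baseline (mm)
        d = f_px * B_mm / Z_mm                     # disparity grows linearly with baseline
        print(B_mm, d, delta_d / d)                # delta_d/d: 0.25, 0.0625, 0.015625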
Paper: Xu 96
Author: Xu G
Title: Unification of Stereo, Motion and Object Recognition via Epipolar Geometry
Date: 1996
Publisher: Lecture Notes in Computer Science, 1996, Vol. 1035, pp. 265-274
Comments: Structure from motion. This is a feature based algorithm. Uses an affine perspective model to define epipolar geometries, which allows the motion correspondence problem to be treated as a 1-D search problem, the same as calibrated stereo. Defines the fundamental matrix and derives from it the epipolar equation, from which disparity is defined. The technique avoids the aperture problem by defining multiple epipolar geometries for two motion images. Each object, or set of objects undergoing a common motion, when combined with the motion of the camera, forms a unique epipolar geometry. Xu uses a clustering algorithm to identify points undergoing the same motion and sharing the same epipolar equation. In this way objects are identified through the implicit segmentation of the scene, a set of the most probable epipolar geometries is derived, and the 3-D structure can be recovered.
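A minimal sketch of how the epipolar equation reduces matching to a 1-D search along a line; the fundamental matrix F is assumed to be already estimated, and the residual below is an ordinary point-to-line distance rather than anything specific to the paper's affine model or clustering step.

    import numpy as np

    def epipolar_line(F, x):
        """Epipolar line in the second image for pixel x = (u, v) in the first.

        A matched point x' must satisfy x'^T F x = 0, so the search for the
        correspondence reduces to a 1-D search along the returned line
        (a, b, c), i.e. a*u' + b*v' + c = 0.
        """
        return F @ np.array([x[0], x[1], 1.0])

    def epipolar_residual(F, x, x_prime):
        """Distance of a candidate match x' from the epipolar line of x; points
        can be grouped by which estimated F gives them the smallest residual."""
        l = epipolar_line(F, x)
        xp = np.array([x_prime[0], x_prime[1], 1.0])
        return abs(xp @ l) / np.hypot(l[0], l[1])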
Paper: Yi 97
Author: Yi JW, Oh JH
Title: Recursive Resolving Algorithm for Multiple Stereo and Motion Matches
Date: March 1997
Publisher: Image and Vision Computing, March 1997, Vol. 15, No. 3, pp. 181-196
Comments: Feature based motion-stereo. A feature based algorithm that tracks objects (features) using Kalman filters. Given a stereo camera system and a set of possible stereo matches, the algorithm generates a set of virtual objects, one for each possible match. This recursive algorithm then uses Kalman filtering to predict the motion of the virtual objects through the sequence. Ambiguous matches show up by failing to follow their predicted paths and as such can be rejected. As the stereo sequence progresses, the variances used in the Kalman filter shrink as more stereo correspondences are resolved, which in turn aids the matching of new or unmatched tokens.
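A minimal constant-velocity Kalman filter sketch for one such virtual object, assuming the state is just the 3-D position and velocity and the measurement is the triangulated 3-D position; the paper's actual state vector and noise models are richer, so this shows only the skeleton of the predict/update cycle used to test whether a virtual object follows its predicted path.

    import numpy as np

    def kalman_predict(x, P, F, Q):
        """Predict step of a linear Kalman filter (state x, covariance P)."""
        return F @ x, F @ P @ F.T + Q

    def kalman_update(x, P, z, H, R):
        """Update step: fold the measurement z into the predicted state."""
        S = H @ P @ H.T + R                        # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        x = x + K @ (z - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        return x, P

    # Constant-velocity model: state = (X, Y, Z, Vx, Vy, Vz), measurement = (X, Y, Z).
    dt = 1.0
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                     # position advances by velocity * dt
    H = np.hstack([np.eye(3), np.zeros((3, 3))])   # only the 3-D position is observed

A virtual object whose measurements repeatedly fall far outside the innovation covariance S is precisely the ambiguous match that fails to follow its predicted path and can be rejected.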
Book: Zhang 92
Author: Zhang Z, Faugeras O
Title: 3D Dynamic Scene Analysis: A Stereo Based Approach
Date: 1992
Publisher: Springer-Verlag