One approach to obtaining a 3D description of a scene is to use 2 cameras in a binocular system, somewhat similar to that used by the human visual system. These lectures look at the geometry and features used for stereo matching. We look at both edge features, which introduces the Canny edge detector, and point features, which introduces the SIFT features. We use the RANSAC algorithm to find straight 2D lines, match them using a set of stereo correspondence constraints, and then use epipolar geometry to compute the 3D position of the lines. Another set of least-squares algorithms estimates the pose. Finally, we introduce one of the early approaches to computing a dense depth map by stereo matching of intensity values.
We introduce the core idea of recovering 3D information from a pair of slightly displaced images. Given a matched pair of points or other structure, computing the 3D positions is simple geometry, so this lecture set focusses on what features to match. This video introduces the 4 main types of feature (patch, point, edge, structure) and summarises the order of topics needed to build the stereo-based object recognition system.
Using points for matching requires points that lie on the same image structure in the 2 images. SIFT points are commonly used because they are invariant to translation, rotation, and scale. The lecture gives the theory behind SIFT points: how they are defined, how they are located, and the 128-dimensional vector that describes the neighbourhood around the point.
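As a concrete illustration (not part of the lecture itself), here is a minimal sketch of detecting SIFT keypoints and their 128-dimensional descriptors, assuming OpenCV's SIFT implementation and a hypothetical image filename:

```python
# Minimal sketch: SIFT keypoints and descriptors via OpenCV (opencv-python).
# The lecture covers the underlying theory, not this particular library call.
import cv2

img = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)   # hypothetical filename
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each keypoint stores position, scale and orientation; each descriptor row
# is the 128-dimensional vector describing the neighbourhood of the point.
print(len(keypoints), descriptors.shape)              # e.g. (N, 128)
```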
This short video presents 2 examples of SIFT feature detections, one on a stereo pair, and the other on a translated, rotated, and scaled image.
The matching of points from 2 images depends on the relation between scene and image points, and on the relation between the corresponding points in the 2 images. This section introduces the basics of the pinhole camera, projection and epipolar geometry, including the Fundamental matrix and its estimation.
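For illustration, a rough sketch of a linear (8-point style) estimate of the Fundamental matrix from matched points follows; it omits the coordinate normalisation a practical estimator would use, so treat it as a sketch of the idea rather than the lecture's algorithm:

```python
# Unnormalised 8-point sketch: each match contributes one row of the
# epipolar constraint x'^T F x = 0, and F is the null vector of the system.
import numpy as np

def eight_point(pts_l, pts_r):
    """pts_l, pts_r: (N, 2) arrays of matched pixel coordinates, N >= 8."""
    A = []
    for (x, y), (xp, yp) in zip(pts_l, pts_r):
        A.append([xp*x, xp*y, xp, yp*x, yp*y, yp, x, y, 1.0])
    A = np.asarray(A)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # null vector of A, reshaped to 3x3
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```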
Many historical stereo algorithms are based on edge fragment correspondence, and many other image description and matching algorithms also use edges. This lecture gives some ideas of what edges are and how one of the best traditional edge detectors (Canny) works. It also introduces 2D convolution and spot noise removal methods.
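A minimal sketch of the smoothing-plus-Canny pipeline, assuming OpenCV and illustrative threshold values (not ones from the lecture):

```python
# Gaussian smoothing (a 2D convolution) suppresses noise before cv2.Canny
# performs gradient computation, non-maximum suppression and hysteresis
# thresholding. Thresholds 50/150 are placeholder values.
import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # hypothetical filename
smoothed = cv2.GaussianBlur(img, (5, 5), 1.0)
edges = cv2.Canny(smoothed, 50, 150)
cv2.imwrite("edges.png", edges)
```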
RANSAC is a general model-based shape matcher, which we use here to find the long straight lines used in the stereo matching. RANSAC can be used for other shapes, such as circles and even arbitrary parameterised shapes. It is particularly useful for finding shapes in a lot of clutter, and has a tunable failure rate.
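The following is a minimal RANSAC sketch for fitting a single 2D line to cluttered edge points; the tolerance and iteration count are illustrative assumptions, not values from the lecture:

```python
# RANSAC line fitting: repeatedly sample 2 points, form the line through
# them, and keep the line with the most points within a distance tolerance.
import numpy as np

def ransac_line(points, n_iters=500, tol=2.0, seed=0):
    """points: (N, 2) array. Returns ((a, b, c), inlier_count) with a*x+b*y+c=0."""
    rng = np.random.default_rng(seed)
    best_count, best_line = 0, None
    for _ in range(n_iters):
        p1, p2 = points[rng.choice(len(points), 2, replace=False)]
        d = p2 - p1
        n = np.array([-d[1], d[0]], dtype=float)      # normal to the line
        norm = np.linalg.norm(n)
        if norm == 0:
            continue
        n /= norm
        c = -n @ p1
        dist = np.abs(points @ n + c)                 # point-to-line distances
        count = np.count_nonzero(dist < tol)
        if count > best_count:
            best_count, best_line = count, (n[0], n[1], c)
    return best_line, best_count
```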
To find 3D line segments by triangulation, we need to find matching segments in the left and right images. This talk describes how to use the Fundamental matrix to find overlapping segments between all possible pairs of lines. Pairs that don't have sufficient overlap are ignored, as are pairs that do not have similar contrasts across the edges, and pairs whose disparities are too large or small.
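One plausible way to implement such an overlap test (a sketch under assumptions, not necessarily the lecture's exact formulation) is to map the left segment's endpoints to their epipolar lines in the right image and measure how much of the right segment lies between them:

```python
# Crude epipolar overlap measure for a candidate left/right segment pair.
import numpy as np

def epipolar_band_overlap(F, seg_l, seg_r):
    """F: 3x3 fundamental matrix; seg_l, seg_r: 2x2 arrays of segment endpoints."""
    def signed_dist(line, p):
        line = line / np.linalg.norm(line[:2])
        return line @ np.array([p[0], p[1], 1.0])

    # Epipolar lines in the right image for the two left endpoints (l' = F x).
    l1 = F @ np.array([*seg_l[0], 1.0])
    l2 = F @ np.array([*seg_l[1], 1.0])

    # Fraction of right endpoints lying between the two epipolar lines
    # (opposite signed distances) gives a rough overlap score in [0, 1].
    between = [signed_dist(l1, p) * signed_dist(l2, p) <= 0 for p in seg_r]
    return sum(between) / len(between)
```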
Traditional stereo matching algorithms have used a variety of constraints to limit the possible matches between features in the left and right images. Here we look at several, including the orientation, contrast, shape and epipolar constraints.
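As a sketch of how such constraints might be combined into a single filter on candidate pairs (with placeholder thresholds and a hypothetical feature representation, not the lecture's):

```python
def plausible_match(feat_l, feat_r, max_angle_deg=15, max_contrast_ratio=1.5):
    """feat_l, feat_r: dicts with 'theta' (degrees) and signed 'contrast'."""
    # Orientation constraint: edge directions should be similar (mod 180).
    dtheta = abs(feat_l["theta"] - feat_r["theta"]) % 180
    if min(dtheta, 180 - dtheta) > max_angle_deg:
        return False
    # Contrast constraint: the sign of the contrast across the edge must agree...
    if feat_l["contrast"] * feat_r["contrast"] < 0:
        return False
    # ...and its magnitude should be roughly similar.
    ratio = abs(feat_l["contrast"]) / max(abs(feat_r["contrast"]), 1e-6)
    return 1.0 / max_contrast_ratio <= ratio <= max_contrast_ratio
```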
Once lines from the left and right images are paired together, we use their geometric relationship to compute the 3D line that was projected onto the two 2D image lines, by intersecting the back-projections of the 2D lines. We test how well this has worked by examining the angles between the reconstructed lines.
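In outline (a sketch assuming known 3x4 projection matrices from calibration): the back-projection of a homogeneous 2D line l through a camera with projection matrix P is the plane pi = P^T l, and the 3D line is the intersection of the two planes:

```python
# Triangulating a 3D line from a matched pair of 2D image lines.
import numpy as np

def backproject_line(P, line_2d):
    """P: 3x4 projection matrix; line_2d: homogeneous line (a, b, c)."""
    return P.T @ line_2d                      # plane as 4-vector [n | d]

def intersect_planes(pi1, pi2):
    """Return a point on the 3D intersection line and its unit direction."""
    n1, d1 = pi1[:3], pi1[3]
    n2, d2 = pi2[:3], pi2[3]
    direction = np.cross(n1, n2)              # line direction
    # One point satisfying both plane equations (minimum-norm solution).
    A = np.vstack([n1, n2])
    b = -np.array([d1, d2])
    point, *_ = np.linalg.lstsq(A, b, rcond=None)
    return point, direction / np.linalg.norm(direction)
```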
Given a set of 3D model lines and 3D scene lines, the Interpretation Tree is used to pair them, hypothesising matches of the model in the image. A 3D version of the 2D pose estimation and verification algorithms is also given. We show the results of the matching algorithm overlaid on the initial image.
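A bare-bones sketch of the Interpretation Tree search follows, with a stand-in consistent() predicate for whatever unary/binary tests are used (e.g. comparing the angle between two scene lines with the angle between the corresponding model lines); this simplified version allows wildcard assignments and does not forbid reusing a model line:

```python
def interpretation_tree(scene, model, consistent, pairing=None):
    """Yield complete pairings of scene lines to model lines (or None = wildcard)."""
    pairing = [] if pairing is None else pairing
    if len(pairing) == len(scene):
        yield list(pairing)                    # one complete hypothesis
        return
    s = scene[len(pairing)]                    # next scene line to assign
    for m in model + [None]:
        # Prune: the new pair must be pairwise consistent with all earlier pairs.
        if m is None or all(consistent(s, m, s2, m2)
                            for s2, m2 in pairing if m2 is not None):
            pairing.append((s, m))
            yield from interpretation_tree(scene, model, consistent, pairing)
            pairing.pop()
```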
The rest of this lecture set computes 3D positions only for the matched features, whether SIFT points, edge fragments, or whole lines. But this does not give depth data at every location in the image. Here we look at a pioneering method for computing depth at every point along each rectified scan line independently.
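For flavour, a simple block-matching sketch over rectified scan lines is shown below; the window size and disparity range are arbitrary choices, and this is a generic sum-of-absolute-differences matcher rather than the specific pioneering method the lecture discusses:

```python
# Dense disparity by block matching on rectified images: for each pixel,
# slide a small window along the same scan line in the other image and keep
# the disparity with the lowest SAD cost.
import numpy as np

def block_match(left, right, max_disp=32, half=3):
    """left, right: rectified greyscale images as 2D arrays of equal shape."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y-half:y+half+1, x-half:x+half+1].astype(np.float32)
            best_cost, best_d = np.inf, 0
            for d in range(max_disp):
                cand = right[y-half:y+half+1, x-d-half:x-d+half+1].astype(np.float32)
                cost = np.abs(patch - cand).sum()       # SAD cost
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```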