Three dimensional scene analysis has reached a turning point. Though researchers have been investigating object recognition and scene understanding since the late 1960s, a new excitement can be felt in the field. I attribute this to four converging activities: (1) the increasing popularity and successes of stereo and active range sensing systems, (2) the set of emerging tools and competences concerned with three dimensional surface shape and its meaningful segmentation and description, (3) Marr's attempt to place three dimensional vision into a more integrated and scientific context, and (4) Brooks' demonstration of what a more intelligent three dimensional scene understander might entail. It is the convergence of these that has led to the problem considered in this book: "Assuming that we have easily accessible surface data and can segment it into useful patches, what can we then do with it?" The work presented here takes an integrated view of the problems and demonstrates that recognition can actually be done.
The central message of this book is that surface information can greatly simplify the image understanding process. This is because surfaces are the features that directly link perception to the objects perceived (for normal "camera-like" sensing) and because they make explicit information needed to understand and cope with some visual problems (e.g. obscured features).
Hence, this book is as much about a style of model-based three dimensional scene analysis as about a particular example of that style. That style combines surface patches segmented from the three dimensional scene description, surface patch based object models, a hierarchy of representations, models and recognitions, a distributed network-based model invocation process, and a knowledge-based model matcher. Part of what I have tried to do was show that these elements really do fit together well - they make it easy to extend the competence of current vision systems without extraordinary complications, and don't we all know how fragile complicated computer-based processes are?
This book is an organic entity - the research described started in 1982 and earlier results were reported in a PhD thesis in 1985. Since then, research has continued under the United Kingdom Alvey Programme, replacing weak results and extending into new areas, and most of the new results are included here. Some of the processes are still fairly simple and need further development, particularly when working with automatically segmented data. Thus, this book is really just a "progress report" and its content will continue to evolve. In a way, I hope that I will be able to rewrite the book in five to ten years, reporting that all problems have been solved by the computer vision community and showing that generic three dimensional object recognition can now be done. Who knows?
The book divides naturally into three parts. Chapters three to six describe the model independent scene analysis. The middle chapters, seven and eight, describe how objects are represented and selected, and thus how one can pass from an iconic to a symbolic scene representation. The final chapters then describe our approach to geometric model-based vision - how to locate, verify and understand a known object given its geometric model.
There are many people I would like to thank for their help with the work and the book. Each year there are a few more people I have had the pleasure of working and sharing ideas with. I feel awkward about ranking people by the amount or the quality of the help given, so I will not do that. I would rather bring them all together for a party (and if enough people buy this book I'll be able to afford it). The people who would be invited to the party are: (for academic advice) Damal Arvind, Jon Aylett, Bob Beattie, Li Dong Cai, Mike Cameron-Jones, Wayne Caplinger, John Hallam, David Hogg, Jim Howe, Howard Hughes, Zi Qing Li, Mark Orr, Ben Paechter, Fritz Seytter, Manuel Trucco, (and for help with the materials) Paul Brna, Douglas Howie, Doug Ramm, David Robertson, Julian Smart, Lincoln Wallen and David Wyse and, of course, Gaynor Redvers-Mutton and the staff at John Wiley & Sons, Ltd. for their confidence and support.
Well, I take it back. I would particularly like to thank Mies for her love, support and assistance. I'd also like to thank the University of Edinburgh, the Alvey Programme, my mother and Jeff Mallory for their financial support, without which this work could not have been completed.
With the advances in technology, I am pleased to now add the book to the web. Much of the content is a bit dated; however, I think that the sections on Making Complete Surface Hypotheses, Surface Clusters, Model Invocation, Feature Visibility Analysis and Binding Subcomponents with Degrees of Freedom still have something to offer researchers.
In addition, it may be interesting for beginning researchers as a more-or-less complete story, rather than the compressed view that you get in conference and journal papers.
I offer many thanks to Anne-Laure Cromphout who helped with the preparation of this web version.