Vision as Inverse Graphics
We have a PhD scholarship funded by Microsoft Research
on the topic of Vision as Inverse Graphics. This studentship
was awarded to Lukasz Romaszko, who started work in September 2015.
The Microsoft co-supervisor was Dr Pushmeet Kohli (until summer 2017)
and is now Dr John Winn.
The project
A long-standing view of computer vision is that it is the inverse of a
computer graphics problem. That is, the goal of computer vision is to
infer the objects present in a scene, their positions and poses, the
illuminant, etc. In the language of machine learning, the object
identities, poses, illuminant, etc. are latent variables which must be
inferred in order to understand the scene.
In this project we will develop a stochastic scene generator, and
render these scenes to produce images; we will then train recognition
models to infer the relevant latent variables. These can be dense
fields (intrinsic images) such as a depth map or segmentation map, or
sparse information, e.g. concerning the presence of a certain object
class. This generalizes the work of Shotton et al. (2011) on the
Microsoft Kinect, where the scene consists of a single human plus
background. The great advantage of using synthetic data is that there
is ready access to the relevant latent variables, and that large
quantities of data can be easily generated for training the
recognition models. We will also study the "structured noise" process
that relates graphics to real images, so as to enhance transfer of
the learned models to real images.
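The generate-and-render loop described above can be illustrated with a toy sketch. This is not the project's code: the names `SceneLatents`, `sample_scene`, `render` and `make_dataset` are hypothetical, the "renderer" only draws one flat 2D shape on a small grid, and the latent variables are reduced to an object class, a position, a size and a single brightness value standing in for the illuminant. It is meant only to show why synthetic data gives ready access to both dense latents (a segmentation map) and sparse latents (the object class).

```python
import random
from dataclasses import dataclass

# Toy stand-ins for object classes (an assumption, not from the project).
CLASSES = ("square", "disc")


@dataclass
class SceneLatents:
    object_class: str   # sparse latent: which object is present
    x: int              # object centre, column index
    y: int              # object centre, row index
    size: int           # half-width (square) or radius (disc), in pixels
    illuminant: float   # global brightness, a crude proxy for lighting


def sample_scene(rng, width=16, height=16):
    """Draw one set of latent variables from a simple prior."""
    size = rng.randint(2, 4)
    return SceneLatents(
        object_class=rng.choice(CLASSES),
        # Keep the whole object inside the image bounds.
        x=rng.randint(size, width - 1 - size),
        y=rng.randint(size, height - 1 - size),
        size=size,
        illuminant=rng.uniform(0.5, 1.0),
    )


def render(latents, width=16, height=16):
    """Render latents to an image and a segmentation map (nested lists)."""
    image = [[0.0] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            dx, dy = col - latents.x, row - latents.y
            if latents.object_class == "square":
                inside = max(abs(dx), abs(dy)) <= latents.size
            else:
                inside = dx * dx + dy * dy <= latents.size ** 2
            if inside:
                image[row][col] = latents.illuminant
                mask[row][col] = 1
    return image, mask


def make_dataset(n, seed=0):
    """Generate n (image, segmentation, latents) training triples."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        latents = sample_scene(rng)
        image, mask = render(latents)
        data.append((image, mask, latents))
    return data
```

A recognition model would then be trained to map each `image` back to its `mask` and `latents`; because the generator produced them, the training targets come for free and any number of examples can be drawn.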
Related publications
- Learning Direct Optimization for Scene Understanding
pdf
- Lukasz Romaszko, Christopher K. I. Williams, John Winn. Final m/s
version of paper published in Pattern Recognition vol 105, 107369,
https://doi.org/10.1016/j.patcog.2020.107369. Initial version posted
on arXiv 18 Dec 2018.
- Vision-as-Inverse-Graphics:
Obtaining a Rich 3D Explanation of a Scene from a Single Image
pdf
- Lukasz Romaszko, Christopher K.I. Williams, Pol Moreno, Pushmeet Kohli.
ICCV 2017 Geometry Meets Deep Learning workshop,
October 2017 (oral presentation).
supplementary material.
- Overcoming Occlusion with Inverse Graphics
pdf
- Pol Moreno, Christopher K.I. Williams, Charlie Nash and
Pushmeet Kohli. Presented at:
Geometry Meets Deep Learning workshop, ECCV 2016 (oral presentation).
Final m/s version of paper appearing in Computer Vision-ECCV 2016
Workshops Proceedings Part III, eds. G. Hua and H. Jégou,
Springer LNCS 9915, pp 170-185.
Code is available.
Chris Williams