Most problems that we encounter in computer vision can naturally be understood as structured prediction problems. To describe the world around us we need rich representations, and these representations have interesting interdependencies. As an example consider scene understanding. The 3D geometry of a scene, the composition of objects and global properties such as lighting effects all have a non-trivial interplay and affect the visual appearance of the scene. Structured prediction aims to recover these components jointly. Complex and high-dimensional output spaces pose challenging problems to the inference procedure. In this research project we
are cameras This includes recognizing objects in images, together with teir interplay and 3D extend as well as labelling every pixel in an image with its semantic content.
Our book [ ] provides a comprehensive overview of diverse applications for structured prediction systems.
underWhat is inference. What is the problem? Why should we learn it?
- Learning with truncated inference. [ ]
- Learning with generalized CNNs [ ]
- Learning to pass messages [ ]
- Learning to predict bounding boxes. [ ]
The problem of predicting bounding boxes can naturally be understood as a structured prediction problem. The goal is to find the best matching bounding box around an object of interest. Since the number of boxes scales in the order of $O(n^4)$
In we develop a system that performs bounding box prediction for localizing objects in images.