Discrete Optimization in Computer Vision: Theory and Practice

Project details

Type: ERC Consolidator Grant
Duration: From 1 June, 2014 to 31 May, 2019 (extension granted until November 2020)
Granted money: EUR 1 641 585
Programme acronym: DOiCV

Principal Investigator

Prof. Vladimir Kolmogorov
Tel: +43 (0)2243 9000 4801
Fax: +43 (0)2243 9000 2000

Objectives

This proposal aims to develop new inference algorithms for graphical models with discrete variables, with a focus on the MAP estimation task. MAP estimation algorithms such as graph cuts have transformed computer vision in the last decade; they are now used routinely, including in commercial systems.
The topics of this project fall into three categories.
Theoretically-oriented: Graph cut techniques come from combinatorial optimization. They can minimize a certain class of functions, namely submodular functions with unary and pairwise terms. Larger classes of functions can be minimized in polynomial time. A complete characterization of such classes has been established. They include k-submodular functions for an integer k.
I investigate whether such tools from discrete optimization can lead to more efficient inference algorithms for practical problems. I have already found an important application of k-submodular functions to minimizing Potts energy functions, which are frequently used in computer vision. The concept of submodularity has also recently appeared in the context of computing marginals in graphical models; here, too, discrete optimization tools could be used.
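To make the terms above concrete, the following is a minimal sketch, not the project's method. It checks the submodularity condition for a pairwise term over binary labels and evaluates a Potts energy (unary costs plus a fixed penalty for each pair of neighbours with different labels). The function names and the toy data are hypothetical, and the brute-force minimizer is for illustration only; it is exponential in the number of variables, whereas graph cuts solve the submodular binary case in polynomial time.

```python
from itertools import product


def is_submodular_pairwise(theta):
    """Check theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0].

    theta[a][b] is the cost of giving labels a, b in {0, 1}
    to two neighbouring variables.
    """
    return theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0]


def potts_energy(labels, unary, edges, weight):
    """Sum of unary costs plus `weight` for every edge whose endpoints disagree."""
    cost = sum(unary[i][label] for i, label in enumerate(labels))
    cost += sum(weight for i, j in edges if labels[i] != labels[j])
    return cost


def brute_force_map(unary, edges, weight, num_labels):
    """Enumerate all labellings and return one with minimum Potts energy."""
    n = len(unary)
    return min(
        product(range(num_labels), repeat=n),
        key=lambda labels: potts_energy(labels, unary, edges, weight),
    )


# Toy chain of three variables with two labels (illustrative data only).
unary = [[0, 2], [0, 1], [2, 0]]
edges = [(0, 1), (1, 2)]
best = brute_force_map(unary, edges, weight=1, num_labels=2)
print(best, potts_energy(best, unary, edges, weight=1))
```

With two labels and a non-negative weight, the Potts pairwise term satisfies the inequality checked by `is_submodular_pairwise`, which is why that special case falls within the class that graph cuts can minimize.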
Practically-oriented: Modern techniques such as graph cuts and tree-reweighted message passing give excellent results for some graphical models, such as those with Potts energies. However, they fail for more complicated models. I aim to develop new tools for tackling such hard energies. This will include exploring tighter convex relaxations of the problem.
Applications, sequence tagging problems: Recently, we developed new algorithms for inference in pattern-based Conditional Random Fields (CRFs) on a chain. This model can naturally be applied to sequence tagging problems; it generalizes the popular CRF model by giving it more flexibility. I will investigate (i) applications to specific tasks, such as protein secondary structure prediction, and (ii) ways to extend the model.
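For orientation, the sketch below shows MAP inference in the standard chain model with unary and pairwise costs, solved by Viterbi-style dynamic programming. This is the baseline that the pattern-based model generalizes, not the pattern-based algorithm itself; the function name, the cost-minimization convention, and the toy data are assumptions made for illustration.

```python
def viterbi(unary, pairwise):
    """Minimize sum_t unary[t][x_t] + sum_t pairwise[x_{t-1}][x_t] over a chain.

    unary[t][b] is the cost of label b at position t;
    pairwise[a][b] is the cost of label a followed by label b.
    Returns (labels, minimum cost).
    """
    n, k = len(unary), len(unary[0])
    cost = list(unary[0])  # best cost of a prefix ending in each label
    back = []              # back[t-1][b]: best predecessor of label b at position t
    for t in range(1, n):
        new_cost, ptr = [], []
        for b in range(k):
            a_best = min(range(k), key=lambda a: cost[a] + pairwise[a][b])
            new_cost.append(cost[a_best] + pairwise[a_best][b] + unary[t][b])
            ptr.append(a_best)
        cost = new_cost
        back.append(ptr)
    # Trace the best labelling backwards from the cheapest final label.
    last = min(range(k), key=lambda b: cost[b])
    best = cost[last]
    labels = [last]
    for ptr in reversed(back):
        labels.append(ptr[labels[-1]])
    labels.reverse()
    return labels, best


# Toy sequence of length 3 with two tags (illustrative data only).
unary = [[0, 2], [0, 1], [2, 0]]
pairwise = [[0, 1], [1, 0]]  # cost 1 whenever adjacent tags differ
print(viterbi(unary, pairwise))
```

The running time is O(n k^2) for a sequence of length n with k tags, which is what makes chain-structured models attractive for sequence tagging.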

Team

Current members:

Nasim Sameh (postdoc)
Yekini Shehu (postdoc)

Former members:

Rustem Takhanov (postdoc)
Alexandr Kazda (postdoc)
K S Sesh Kumar (postdoc)
Paul Swoboda (postdoc)
Sharareh Alipour (postdoc)
Neel Shah (PhD student)
Michal Rolínek (PhD student)