In International Conference on Machine Learning (ICML), July 2021.

Abstract

We study a class of convex-concave saddle-point problems of the form min
_{x} max_{y} < Kx,y > +f_{P}(x)-h^{*}(y)
where K is a linear operator, f_{P} is the sum of a convex function f with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope P, and h^{*} is a convex (possibly nonsmooth) function.
Such problem arises, for example, as a Lagrangian relaxation of various discrete optimization problems.
Our main assumptions are the existence of an efficient linear minimization oracle (lmo) for P and an efficient proximal map for h^{*} which motivate the solution via a blend of proximal primal-dual algorithms and Frank-Wolfe algorithms.
In case h^{*} is the indicator function of a linear constraint and function f is quadratic, we show a O(1/n^{2}) convergence rate on the dual objective, requiring O(nlogn) calls of lmo. If the problem comes from the constrained optimization problem
min_{x}{f_{P}(x)|Ax-b=0}
then we additionally get bound O(1/n^{2}) both on the primal gap and on the infeasibility gap. In the most general case, we show a O(1/n) convergence rate of the primal-dual gap again requiring O(nlogn) calls of lmo. To the best of our knowledge, this improves on the known convergence rates for the considered class of saddle-point problems. We show applications to labeling problems frequently appearing in machine learning and computer vision.