We extended the applicability of the Heavy-ball method from smooth unconstrained optimization to a class of non-smooth non-convex optimization problems: problems given as the sum of a continuously differentiable function with Lipschitz continuous gradient and a non-smooth non-convex extended-valued function with a simple proximal mapping. The algorithm was termed iPiano. We proved global subsequential convergence of the algorithm to a stationary point. Moreover, this is the first work that proves global convergence of an inertial algorithm for non-smooth non-convex optimization problems. This global convergence result holds for objective functions that satisfy the so-called Kurdyka--Lojasiewicz inequality, a mild assumption that is satisfied by most functions arising in practical applications. iPiano has shown state-of-the-art performance on several problems from image processing.
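The iPiano update combines a gradient step on the smooth part, an inertial term built from the previous iterate, and a proximal step on the non-smooth part. A minimal sketch on a toy l1-regularized least-squares instance (the test problem and step-size choices are illustrative, not taken from the paper; here g = lam*||.||_1 happens to be convex, whereas the theory also covers non-convex g with a simple prox):

```python
import numpy as np

def soft_threshold(v, t):
    # proximal mapping of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ipiano(grad_f, prox_g, x0, alpha, beta, iters=500):
    # iPiano iteration:
    #   x^{k+1} = prox_{alpha g}( x^k - alpha * grad_f(x^k) + beta * (x^k - x^{k-1}) )
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        y = x - alpha * grad_f(x) + beta * (x - x_prev)
        x_prev, x = x, prox_g(y, alpha)
    return x

# illustrative instance: f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
b = A @ x_true
lam = 0.1
L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of grad f (spectral norm squared)
beta = 0.5                           # inertial parameter
alpha = 1.8 * (1 - beta) / L         # step size within the admissible range alpha < 2(1-beta)/L

x = ipiano(lambda z: A.T @ (A @ z - b),
           lambda v, a: soft_threshold(v, a * lam),
           np.zeros(5), alpha, beta)
```

After a few hundred iterations the iterate recovers the sparse generating vector up to the small bias introduced by the l1 term.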
The contribution of this work is twofold: first, a unification of abstract convergence theorems for Kurdyka--Lojasiewicz functions; second, an application of these abstract convergence results to derive a block coordinate and variable metric variant of iPiano.
We studied the local convergence of iPiano for a class of so-called prox-regular functions. The local convergence result shows an equivalence of iPiano to inertial alternating/averaged proximal minimization/projection methods and therefore allowed us to transfer the convergence results to these algorithms. Key is a formula for prox-regular functions that expresses the gradient of the Moreau envelope via the proximal mapping, a formula that is well known in the convex setting.
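The formula in question, classical for convex functions and extended in this work to prox-regular functions (locally, for suitably small $\lambda > 0$), reads:

```latex
e_{\lambda} f(x) = \min_{y} \Big( f(y) + \tfrac{1}{2\lambda}\,\|x - y\|^2 \Big),
\qquad
\nabla e_{\lambda} f(x) = \frac{1}{\lambda}\big( x - \operatorname{prox}_{\lambda f}(x) \big),
```

so differentiating the (possibly non-smooth) envelope reduces to evaluating the proximal mapping.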
The inertial algorithm with Nesterov-like extrapolation and Bregman distances proposed in this work is significant for several reasons: (i) We introduce a novel Convex-Concave Inertial (CoCaIn) backtracking strategy. Key is the observation that the use of inertia is closely connected to certain lower bound estimates of the objective function---non-inertial methods rely only on certain simple upper bounds. Intuitively, the algorithm adapts to the "local convexity" of the objective via an efficient backtracking strategy. (ii) The algorithm shows state-of-the-art performance on several problems from Machine Learning and Computer Vision. (iii) We present the first global convergence result to a stationary point for an inertial algorithm with Bregman distances on Kurdyka--Lojasiewicz functions.
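The two-sided estimates alluded to above can be written schematically, for a Bregman distance $D_h$ generated by a reference function $h$ (the paper's precise constants and assumptions may differ):

```latex
-\underline{L}\, D_h(x, \bar x)
\;\le\;
f(x) - f(\bar x) - \langle \nabla f(\bar x),\, x - \bar x \rangle
\;\le\;
\overline{L}\, D_h(x, \bar x),
```

where the upper constant $\overline{L}$ governs the admissible step size, while the lower constant $\underline{L}$ captures the "local convexity" ($\underline{L} = 0$ means $f$ is convex relative to $h$) and thereby how much inertia can be used; CoCaIn backtracks both estimates.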
D. Hafner, P. Ochs, J. Weickert, M. Reißel, S. Grewenig: FSI Schemes: Fast Semi-Iterative Solvers for PDEs and Optimisation Methods.
In B. Andres, B. Rosenhahn (Eds.):
German Conference on Pattern Recognition (GCPR). Lecture Notes in Computer Science, Vol. 9796, 91-102, Springer, 2016.
(Best Paper Award)