Welcome to the homepage of the

Mathematical Optimization for Data Science Group

Department of Mathematics and Computer Science, Saarland University, Germany

Inertial Algorithms for Non-smooth Non-convex Optimization

The inertial algorithms that we study are motivated from physics, namely from a dynamical system that describes the behaviour of a ball that is rolling down a surface subject to friction (also known as heavy-ball dynamical system). The motion of the ball accelerates due to inertia. In analogy to that, the considered algorithms generate a sequence of points that accelerates when it moves in the same direction as before. A classic method for smooth unconstrained optimization problems is the Heavy-ball method. It was introduced by Polyak in 1964 in a paper with the title "Some methods of speeding up the convergence of iteration methods" [Polyak64] B.T. Polyak: Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics 4(5):1--17, 1964. , which suggests an improved performance of inertial algorithms. The global convergence of the Heavy-ball method for non-convex problems was studied by Zavriev and Kostyuk [ZK93] S.K. Zavriev and F.V. Kostyuk: Heavy-ball method in nonconvex optimization problems. Computational Mathematics and Modeling 4(4):336--341, 1993. . For the mechanical interpretation, we refer to [AGR00] H. Attouch, X. Goudou, and P. Redont: The heavy ball with friction method, i. the continuous dynamical system: global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative dynamical system. Communications in Contemporary Mathematics 2(1):1--34, 2000. .

In convex optimization the Heavy-ball method and its variations have led to the study of so-called accelerated schemes and optimal methods [Nesterov04] Y. Nesterov: Introductory lectures on convex optimization: A basic course. Applied Optimization, vol. 87, Kluwer Academic Publishers, 2004. , which refers to algorithms with a theoretical worst-case complexity that is of the same order as the lower complexity bound for a certain class of problems. This topic is still actively researched.

Our Contribution:

P. Ochs, Y. Chen, T. Brox, T. Pock:
iPiano: Inertial Proximal Algorithm for Non-convex Optimization. [pdf]
SIAM Journal on Imaging Sciences, 7(2):1388-1419, 2014.

We extended the applicability of the Heavy-ball method from smooth unconstrained optimization to a class of non-smooth non-convex optimization problems: problems that are the sum of a continuously differentiable function with Lipschitz continuous gradient and a non-smooth non-convex extended-valued function with a simple proximal mapping. The algorithm was termed iPiano. We proved global subsequential convergence of the algorithm to a stationary point. Moreover, this is the first work that proves global convergence of an inertial algorithm for non-smooth non-convex optimization problems. This global convergence result is holds for objective functions that satisfy the so-called Kurdyka--Lojasiewicz inequality. This is a mild assumption that is satisfied by most of the functions in practical applications. iPiano has shown state-of-the-art performance for several problems from image processing.

P. Ochs:
Unifying abstract inexact convergence theorems and block coordinate variable metric iPiano. [pdf]
SIAM Journal on Optimization, 29(1):541-570, 2019.

The contribution of this work is twofold: The first one is a unification of abstract convergence theorems for Kurdyka--Lojasiewicz functions. The second one uses the abstract convergence results to derive a block coordinate and variable metric variant of iPiano.

P. Ochs:
Local Convergence of the Heavy-ball Method and iPiano for Non-convex Optimization. [pdf]
Journal of Optimization Theory and Applications, 177(1):153-180, 2018.

We studied the local convergence of iPiano for a class of so-called prox-regular functions. The local convergence result shows an equivalence of iPiano to inertial alternating/averaged proximal minimization/projection methods and therefore allowed us to tranfer the convergence results to these algorithms. Key is a formula for prox-regular functions that expresses the gradient of the Moreau envelope using the proximal mapping, which is well known in the convex setting.

M.C. Mukkamala, P. Ochs, T. Pock, S. Sabach:
Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization. [pdf]
SIAM Journal on Mathematics of Data Science, 2(3):658-682, 2020.

The inertial algorithm with Nesterov-like extrapolation and Bregman distances proposed in this work is significant due to several reasons: (i) We introduce a novel convex concave inertial (CoCaIn) backtracking strategy. Key is the observation that the use of inertia is closely connected to certain lower bound estimates of the objective function---non-inertial methods only rely on certain simple upper bounds. Intuitively, the algorithm adapts to the "local convexity" of the objective via an efficient backtracking strategy. (ii) The algorithm shows state-of-the-art performance on several problems from Machine Learning and Computer Vision. (iii) We present the first global convergence result for an inertial algorithm with Bregman distances to a stationary point for Kurdyka--Lojasiewicz functions.

P. Ochs, T. Pock:
Adaptive Fista for Non-convex Optimization. [pdf]
SIAM Journal on Optimization, 29(4):2482-2503, 2019.

This work establishes a close connection between inertial algorithms and quasi-Newton methods. In particular, using a certain optimized extrapolation step, the SR1 quasi-Newton method can be recovered. Thanks to recent results on proximal calculus with respect to a rank-1 modified metric [BF12] S. Becker and J. Fadili: A Quasi-Newton Proximal Splitting Method. In Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS), 2618--2626, 2012. and the extension to proximal calculus for a rank-r modified metric, the resulting update step in Adaptive FISTA can be computed efficiently also for non-smooth non-convex additive composite problems.

The following works contribute to inertial algorithms in the convex setting.

P. Ochs, T. Brox, T. Pock:
iPiasco: Inertial Proximal Algorithm for strongly convex Optimization. [pdf]
Journal of Mathematical Imaging and Vision, 53(2):171-181, 2015.
D. Hafner, P. Ochs, J. Weickert, M. Reißel, S. Grewenig:
FSI schemes: Fast semi-iterative solvers for PDEs and Optimisation Methods. [pdf]
In B. Andres, B. Rosenhahn (Eds.): German Conference on Pattern Recognition (GCPR). Lecture Notes in Computer Science, Vol. 9796, 91-102, Springer, 2016. (Best Paper Award)
J.A. Tomasson, P. Ochs, J. Weickert:
AFSI: Adaptive restart for fast semi-iterative schemes for convex optimisation. [pdf]
In T. Brox, A. Bruhn, M. Fritz (Eds.): German Conference on Pattern Recognition (GCPR). Lecture Notes in Computer Science, Vol. 11269, 669-681, Springer, 2019.

MOP Group
The author is not
responsible for
the content of
external pages.