Projection on the epigraph of the squared Euclidean norm

As a follow-up on the previous post titled Projection on an epigraph, we here discuss how we can project on the epigraph of the squared norm function. Continue reading →


Projection on an epigraph

Here we study the problem of projecting onto the epigraph of a convex continuous function. Unlike the computation of the proximal operator of a function or the projection on its sublevel sets, the projection onto epigraphs is more complex and there exist only a few functions for which semi-explicit formulas are available.


Projection of a point onto the epigraph of a convex function.

Continue reading →

Variable substitution in optimisation: a paradox

Given an optimisation problem of the form \min_{x,y} \{f(x,y) = g(x) + h(y), g(x) = \alpha y\}, where \alpha \neq 0, can we equivalently solve the problem \min_{x,y} \{\alpha y + h(y)\}? Continue reading →

Metric subregularity for monotone inclusions

Metric sub-regularity is a local property of set-valued operators which turns out to be a key enabler for linear convergence in several operator-based algorithms. However, conditions are often stated for the fixed-point residual operator and become rather difficult to verify in practice. In this post we state sufficient metric sub-regularity conditions for a monotone inclusion which are easier to verify and we focus on the preconditioned proximal point method (P3M). Continue reading →

Do no generic termination criteria exist for steepest descent?

Where here we wonder why there are no generic termination criteria for the gradient method which guarantee a desired sub-optimality when f is strictly (not strongly) convex and has L-Lipschitz gradient. Does it make sense to terminate when \|\nabla f(x_k)\| is small? And when should we terminate? Continue reading →

Convergence of the iterates of the gradient method with constant stepsize

The gradient method with constant step length is the simplest method for solving unconstrained optimisation problems involving a continuously differentiable function with Lipschitz-continuous gradient. The motivation for this post came after reading this Wikipedia article where it is stated that under certain assumptions the sequence \{x_k\} converges to a local optimum, but it is no further discussion is provided. Continue reading →

Third and higher order Taylor expansions in several variables

In this post we show that it is possible to derive third and higer-order Taylor expansions for functions of several variables. Given that the gradient of a function f:\mathbb{R}^n \to\mathbb{R} is vector-valued and its Hessian is matrix-valued, it is natural to guess that its third-order gradient will be tensor-valued. However, not only is the use of tensors not very convenient, but in this context it is also unnecessary. Continue reading →

Error bounds for second order approximation

Where here we prove an approximation bound for twice continuously differentiable functions f:\mathbb{R}^n \to \mathbb{R} with M-Lipschitzian Hessian, that is \|\nabla^2 f(x) - \nabla^2 f(y)\| \leq M \|x-y\| for all x,y. In particular, we show that for all x,y

\begin{aligned} |f(y) - f(x) - \langle \nabla f(x), y-x\rangle - \frac{1}{2}\langle \nabla f^2(x)(y-x), y-x\rangle| \leq \frac{M}{6}\|x-y\|^3. \end{aligned}

This is stated as Lemma 1.2.4 in: Y. Nesterov, Introductory Lectures on Convex Optimization – A basic course, Kluwer Ac. Publishers, 2004. Continue reading →

Weak closure points not attainable as limits of sequences

Where in this post we discover an uncanny property of the weak topology: the points of the weak closure of a set cannot always be attained as limits of elements of the set. Naturally, the w-closure of a set is weakly closed. The so-called weak sequential closure of a set, on the other hand, is the set of cluster points of sequences made with elements from that set. The new set is not, however, weakly sequentially closed, which means that there may arise new cluster points; we may, in fact, have to take the weak sequential closure trans-finitely many time to obtain a set which is weakly sequentially closed – still, this may not be weakly closed. Continue reading →

Graphical convergence

Where in this post we wonder whether pointwise convergence makes sense for sequences of set-valued mappings. We present some essential limitations of the pointwise limit and we motivate the notion of graphical convergence. What is more, we use animated GIFs to better illustrate and understand these new notions. Continue reading →


Exploring and venting about quantitative issues

Look at the corners!

The math blog of Dmitry Ostrovsky

The Unapologetic Mathematician

Mathematics for the interested outsider

Almost Sure

A random mathematical blog


Mathematix is the mathematician of the village of Asterix