Variable substitution in optimisation: a paradox

Given an optimisation problem of the form \min_{x,y} \{ f(x,y) : g(x) = \alpha y \}, where f(x,y) = g(x) + h(y) and \alpha \neq 0, can we equivalently solve the problem \min_{x,y} \{\alpha y + h(y)\}? Continue reading →
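As a quick illustration of why the answer is not an unconditional yes (an example of ours, not necessarily the post's): take g(x) = x^2, h(y) = y^2 and \alpha = 1. The original problem

\begin{aligned} \min_{x,y} \{ x^2 + y^2 : x^2 = y \} \end{aligned}

has optimal value 0, attained at (x,y) = (0,0), while \min_{y} \{ y + y^2 \} = -\tfrac{1}{4}, attained at y = -\tfrac{1}{2}: the naive substitution silently drops the implicit constraint y \in \{ g(x)/\alpha : x \in \mathbb{R}^n \}, which here reads y \geq 0.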

Metric subregularity for monotone inclusions

Metric subregularity is a local property of set-valued operators which turns out to be a key enabler for linear convergence in several operator-based algorithms. However, such subregularity conditions are often stated for the fixed-point residual operator and become rather difficult to verify in practice. In this post we state sufficient conditions for metric subregularity of a monotone inclusion which are easier to verify, and we focus on the preconditioned proximal point method (P3M). Continue reading →
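For reference, a reminder of the standard definition (our summary, following Dontchev and Rockafellar; the post states its own variants): an operator T: \mathbb{R}^n \rightrightarrows \mathbb{R}^n is metrically subregular at \bar{x} for \bar{y} \in T(\bar{x}) if there exist \kappa \geq 0 and a neighbourhood U of \bar{x} such that

\begin{aligned} \mathrm{dist}\big(x, T^{-1}(\bar{y})\big) \leq \kappa\, \mathrm{dist}\big(\bar{y}, T(x)\big) \quad \text{for all } x \in U. \end{aligned}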

Do no generic termination criteria exist for steepest descent?

Where here we wonder why there are no generic termination criteria for the gradient method which guarantee a desired sub-optimality when f is strictly (not strongly) convex and has L-Lipschitz gradient. Does it make sense to terminate when \|\nabla f(x_k)\| is small? And when should we terminate? Continue reading →
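A minimal sketch in Python of the rule under scrutiny (entirely ours; the function below is a contrived example, not from the post): f(x) = \delta\sqrt{1+x^2} is strictly convex with \delta-Lipschitz gradient, yet |f'(x)| < \delta everywhere, so any tolerance at least \delta stops the method immediately, however suboptimal the starting point.

import numpy as np

# f is strictly convex with delta-Lipschitz gradient, but |f'(x)| < delta
# everywhere, so tol >= delta fires the test at once, no matter how far
# from the minimiser (and how suboptimal) the current iterate is.
delta = 1e-3
f = lambda x: delta * np.sqrt(1.0 + x ** 2)
grad = lambda x: delta * x / np.sqrt(1.0 + x ** 2)

def gradient_descent(x0, step, tol, max_iters=10_000):
    x = x0
    for _ in range(max_iters):
        g = grad(x)
        if abs(g) <= tol:      # the termination rule in question
            return x
        x = x - step * g
    return x

x = gradient_descent(x0=1e6, step=1.0 / delta, tol=delta)
print(f"terminated at x = {x:.2e} with f(x) - f* = {f(x) - f(0):.2e}")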

Convergence of the iterates of the gradient method with constant stepsize

The gradient method with constant step length is the simplest method for solving unconstrained optimisation problems involving a continuously differentiable function with Lipschitz-continuous gradient. The motivation for this post came after reading this Wikipedia article, where it is stated that under certain assumptions the sequence \{x_k\} converges to a local optimum, but no further discussion is provided. Continue reading →
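For context, the standard first step towards such results (a textbook computation, not the post's full argument): if \nabla f is L-Lipschitz, the descent lemma applied to the iteration x_{k+1} = x_k - \gamma \nabla f(x_k) with \gamma \in (0, 2/L) gives

\begin{aligned} f(x_{k+1}) \leq f(x_k) - \gamma\left(1 - \frac{\gamma L}{2}\right)\|\nabla f(x_k)\|^2, \end{aligned}

so the function values decrease and \nabla f(x_k) \to 0 whenever f is bounded below; convergence of the iterates \{x_k\} themselves is the subtler question the post takes up.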

Third and higher order Taylor expansions in several variables

In this post we show that it is possible to derive third and higher-order Taylor expansions for functions of several variables. Given that the gradient of a function f:\mathbb{R}^n \to\mathbb{R} is vector-valued and its Hessian is matrix-valued, it is natural to guess that its third-order derivative will be tensor-valued. However, not only is the use of tensors not very convenient, but in this context it is also unnecessary. Continue reading →
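One standard device that sidesteps tensors (our recollection of the usual trick; the post may organise things differently) is to expand along the segment: with d = y - x, set \varphi(t) = f(x + t d), so that the third-order term of the expansion of \varphi at t = 0 is the scalar

\begin{aligned} \frac{\varphi'''(0)}{6} = \frac{1}{6}\sum_{i,j,k=1}^{n} \frac{\partial^3 f(x)}{\partial x_i \partial x_j \partial x_k}\, d_i\, d_j\, d_k, \end{aligned}

so no tensor-valued object needs to be formed explicitly.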

Error bounds for second-order approximation

Where here we prove an approximation bound for twice continuously differentiable functions f:\mathbb{R}^n \to \mathbb{R} with M-Lipschitzian Hessian, that is \|\nabla^2 f(x) - \nabla^2 f(y)\| \leq M \|x-y\| for all x,y. In particular, we show that for all x,y

\begin{aligned} |f(y) - f(x) - \langle \nabla f(x), y-x\rangle - \frac{1}{2}\langle \nabla^2 f(x)(y-x), y-x\rangle| \leq \frac{M}{6}\|x-y\|^3. \end{aligned}

This is stated as Lemma 1.2.4 in: Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, 2004. Continue reading →
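A quick numerical spot check of the bound (a sketch of ours; the one-dimensional f below is a hypothetical test case, not from the book or the post): for f(x) = x^3/6 we have \nabla^2 f(x) = x, so M = 1, and the remainder equals |x-y|^3/6 exactly, making the bound tight.

import numpy as np

# f(x) = x**3 / 6 in one dimension: f'' is 1-Lipschitz, so M = 1 and the
# remainder of the second-order Taylor model is at most |x - y|**3 / 6.
f   = lambda x: x ** 3 / 6.0
df  = lambda x: x ** 2 / 2.0
d2f = lambda x: x

M = 1.0
rng = np.random.default_rng(0)
for x, y in rng.uniform(-5.0, 5.0, size=(10, 2)):
    remainder = abs(f(y) - f(x) - df(x) * (y - x) - 0.5 * d2f(x) * (y - x) ** 2)
    bound = (M / 6.0) * abs(x - y) ** 3
    assert remainder <= bound + 1e-9   # for this f the bound holds with equality
print("cubic error bound verified on random pairs")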

Weak closure points not attainable as limits of sequences

Where in this post we discover an uncanny property of the weak topology: the points of the weak closure of a set cannot always be attained as limits of sequences of elements of the set. Naturally, the w-closure of a set is weakly closed. The so-called weak sequential closure of a set, on the other hand, is the set of weak limits of sequences of elements of that set. This new set need not, however, be weakly sequentially closed, which means that new limit points may arise; we may, in fact, have to take the weak sequential closure transfinitely many times to obtain a set which is weakly sequentially closed – and still, this may not be weakly closed. Continue reading →
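The classical example, usually attributed to von Neumann (our illustration; the post may use another): in \ell^2 with orthonormal basis (e_k), consider the set

\begin{aligned} S = \{ e_m + m\, e_n : n > m \geq 1 \}. \end{aligned}

The origin lies in the weak closure of S, yet no sequence from S converges weakly to 0: a weakly convergent sequence is norm-bounded, and \|e_m + m e_n\| = \sqrt{1 + m^2}, so m takes finitely many values along the sequence; passing to a subsequence with m fixed, the weak limit is either e_m \neq 0 (if n \to \infty) or an element of S itself, contradicting weak convergence to 0.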

Graphical convergence

Where in this post we wonder whether pointwise convergence makes sense for sequences of set-valued mappings. We present some essential limitations of the pointwise limit and we motivate the notion of graphical convergence. What is more, we use animated GIFs to better illustrate and understand these new notions. Continue reading →
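For orientation, the definition the animations illustrate (standard, as in Rockafellar and Wets, Variational Analysis): a sequence of set-valued mappings S^\nu converges graphically to S when the graphs converge as sets in the Painlevé–Kuratowski sense, that is,

\begin{aligned} \limsup_{\nu \to \infty} \operatorname{gph} S^\nu \subseteq \operatorname{gph} S \subseteq \liminf_{\nu \to \infty} \operatorname{gph} S^\nu, \end{aligned}

so the limit is taken on graphs rather than image-by-image.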

Pretty convexity result

Where here we discover some interesting facts about continuous convex functions.

We know that a function f:\mathbb{R}^n\to\mathbb{R} is convex if

f(\tau x + (1-\tau)y) \leq \tau f(x) + (1-\tau)f(y)

for all \tau \in [0,1] and x,y\in \mathbb{R}^n.

We see that if f is a continuous function, then an equivalent condition for convexity is that either of the following inequalities holds

\begin{aligned} f\left(\frac{x+y}{2}\right) \leq \int_0^1 f(\theta x + (1-\theta)y)\mathrm{d}\theta \leq \frac{f(x)+f(y)}{2}\end{aligned} Continue reading →
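The chain above is the classical Hermite–Hadamard inequality. Here is a quick numerical sanity check of both inequalities (a sketch of ours; the sample convex function f(t) = t^2 is an assumption, not from the post):

import numpy as np

# Check f((x+y)/2) <= \int_0^1 f(theta*x + (1-theta)*y) d(theta) <= (f(x)+f(y))/2
# numerically for the sample convex function f(t) = t**2.
f = lambda t: t ** 2
theta = np.linspace(0.0, 1.0, 10_001)
rng = np.random.default_rng(1)

for x, y in rng.uniform(-3.0, 3.0, size=(5, 2)):
    midpoint = f(0.5 * (x + y))
    mean_value = np.trapz(f(theta * x + (1.0 - theta) * y), theta)
    endpoints = 0.5 * (f(x) + f(y))
    assert midpoint <= mean_value + 1e-7 and mean_value <= endpoints + 1e-7
print("midpoint <= segment average <= endpoint average on all trials")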

The Mittag-Leffler function

Mostly known as the man because of whom mathematicians are not entitled to the Nobel prize, the Swedish mathematician Gösta Mittag-Leffler, the man who – the legend has it – had an affair with Nobel's wife, is little known for his namesake function (for the record, Nobel never had a wife). The Mittag-Leffler function arises in several applications in physics and mathematics, chiefly in the solution of fractional-order differential equations – a well-studied special type of integrodifferential equations [1]. The function is given in the form of an infinite convergent series. A straightforward way to compute it would be to sum up the terms of this series until, eventually, the sum no longer changes. This, however, would be neither fast nor accurate. In this article we discuss numerical methods for the computation of the Mittag-Leffler function. Continue reading →
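For the definition, a minimal sketch (ours) of the naive summation described above: the function is E_{\alpha,\beta}(z) = \sum_{k \geq 0} z^k / \Gamma(\alpha k + \beta), and the code below simply adds terms until they stop changing the partial sum.

import math

# Naive series summation, illustrating the definition only. We use lgamma
# to avoid overflow in Gamma itself, but the approach remains neither fast
# nor accurate for large |z|, which is exactly the point of the post.
def mittag_leffler_naive(z, alpha, beta=1.0, max_terms=500):
    total = 0.0
    for k in range(max_terms):
        term = z ** k * math.exp(-math.lgamma(alpha * k + beta))
        if total + term == total:   # adding the term no longer changes the sum
            break
        total += term
    return total

# Consistency check: E_{1,1}(z) = exp(z)
print(mittag_leffler_naive(1.0, alpha=1.0), math.exp(1.0))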
