# Error bounds for second order approximation

Where here we prove an approximation bound for twice continuously differentiable functions $f:\mathbb{R}^n \to \mathbb{R}$ with M-Lipschitzian Hessian, that is $\|\nabla^2 f(x) - \nabla^2 f(y)\| \leq M \|x-y\|$ for all $x,y$. In particular, we show that for all $x,y$

\begin{aligned} |f(y) - f(x) - \langle \nabla f(x), y-x\rangle - \frac{1}{2}\langle \nabla f^2(x)(y-x), y-x\rangle| \leq \frac{M}{6}\|x-y\|^3. \end{aligned}

This is stated as Lemma 1.2.4 in: Y. Nesterov, Introductory Lectures on Convex Optimization – A basic course, Kluwer Ac. Publishers, 2004.

Since $f$ is twice continuously differentiable, we may use the fact that

\begin{aligned} f(y) = f(x) + \langle \nabla f(x), y-x\rangle + \frac{1}{2}\nabla f^2(x+\alpha (y-x))(y-x), \end{aligned}\hfill (1)

for some $\alpha\in [0,1]$.

We may write this equivalently as (and this is maybe more convenient)

\begin{aligned} f(y) = f(x) + \langle \nabla f(x), y-x\rangle + \int_0^1 \int_0^\tau \left\langle \nabla f^2(x+\alpha (y-x))(y-x), y-x\right\rangle \mathrm{d} \alpha \mathrm{d} \tau. \end{aligned}\hfill (2)

Since $\nabla^2 f$ is $M$-Lipschitz,

\begin{aligned} &\|\nabla^2 f(x+\alpha(y-x)) - \nabla^2 f(x)\| \leq M\alpha \|x-y\|, \\ \Leftrightarrow& \nabla^2 f(x+\alpha(y-x)) = \nabla^2 f(x) + H, \end{aligned}\hfill (3)

where $\|H\|\leq M\alpha \|x-y\|$. We may now easily plug (3) into (2) and prove the original inequality. In particular, because of (1):

\begin{aligned} f(y) &= f(x) + \langle \nabla f(x), y-x\rangle + \int_0^1 \int_0^\tau \left\langle \nabla f^2(x)(y-x), y-x\right\rangle \mathrm{d} \alpha \mathrm{d} \tau\\ &+ \int_0^1 \int_0^\tau \left\langle H(y-x), y-x\right\rangle \mathrm{d} \alpha \mathrm{d} \tau, \end{aligned}\hfill (4)

where the first integral in (4) is

\begin{aligned} \int_0^1 \int_0^\tau \left\langle \nabla f^2(x)(y-x), y-x\right\rangle \mathrm{d} \alpha \mathrm{d} \tau = \tfrac{1}{2}\left\langle \nabla f^2(x)(y-x), y-x\right\rangle \end{aligned}

and the second integral’s absolute value is

\begin{aligned} \left|\int_0^1 \int_0^\tau \left\langle H(y-x), y-x\right\rangle \mathrm{d} \alpha \mathrm{d} \tau \right| &\leq \int_0^1 \int_0^\tau \|H\|\ \|y-x\|^2 \mathrm{d} \alpha \mathrm{d} \tau\\ &\leq \int_0^1 \int_0^\tau M \alpha \|y-x\|^3 \mathrm{d} \alpha \mathrm{d} \tau\\ &=\frac{M}{6}\|y-x\|^3. \end{aligned}

### One response

1. […] derived – for real-valued functions – when second order information is available. Read this post for details. In that case, the right-hand side involves a cubic term […]

Like

mathbabe

Exploring and venting about quantitative issues

Look at the corners!

The math blog of Dmitry Ostrovsky

The Unapologetic Mathematician

Mathematics for the interested outsider

Almost Sure

A random mathematical blog

Mathematix

Mathematix is the mathematician of the village of Asterix