# Lagrange vs Fenchel Duality

In this post we discuss the correspondence between the Lagrangian and the Fenchelian duality frameworks and we trace their common origin to the concept of convex conjugate functions and perturbed optimization problems.

### Duality in mathematics

In mathematics, the concept of duality allows to study an object by means of some other object called its dual. A linear operator $A:X\to Y$ can be studied via its adjoint operator $A^*: Y\to X$. Certain properties of a Banach space $X$ can be studied via its topological dual $X^*$. A convex set $C$ in $\mathbb{R}^n$ can be seen as the intersection of a set of hyperplanes, that is $C = \bigcap_{i\in I} \{F_i x \leq g_i \}$ and the latter is often a more convenient representation. These are examples of dual objects in mathematics. Likewise, in convex optimization, the dual object which corresponds to a convex function is its convex conjugate. When it comes to optimization problems, however, there are several ways in which we may derive dual optimization problems leading to different formulations. This is because we first need to specify what exactly we dualize and how…

### Convex conjugate functions

Notation: In what follows, $\mathcal{H},\mathcal{K}$ denote two Hilbert spaces. Their inner product will be denoted by $\langle {}\cdot{}, {}\cdot{} \rangle$ With minor adaptations the results presented here hold for Banach spaces as well. We denote the extended-real line by $\bar{\mathbb{R}} := \mathbb{R} \cup \{+\infty\}$.

In convex optimization theory, most duality concepts (including the Lagrange and Fenchel duality frameworks) source from the realization that convex sets can be represented as the intersection of a set of hyperplanes. This extends elegantly to proper, convex, lower semicontinuous functions $f:\mathcal{H}\to\bar{\mathbb{R}}$ which can be identified by their epigraphs. Recall that the epigraph of a function is the set $\mathrm{epi} f := \{(x,t): f(x) \leq t\}$.

Let us describe the supporting hyperplanes of the epigraph of a function. Let $l(x) = \langle x*, x \rangle - b$ be an affine function with slope $x^*$ which is majorized by $f$, that is $l(x) \leq f(x)$ for all $x \in \mathcal{H}$.

A function f with the collection of functions with slope x* which are majorized by f and the supporting hyperplane of its epigraph with slope x*.

Provided that $f$ is proper, convex and lower semicontinuous, for every slope $x^*$ there is a supporting hyperplane for the form $l^*(x) = \langle x^*, x\rangle - b^*$ for some $b^*\in \mathbb{R}$. This is

\begin{aligned} &l(x) \leq f(x)\\ \Leftrightarrow\ & \langle x^*, x \rangle - b \leq f(x)\\ \Leftrightarrow\ & b \geq \langle x^*, x \rangle - f(x)\\ \Leftrightarrow\ & b \geq \sup_{x}\langle x^*, x \rangle - f(x). \end{aligned}

The convex conjugate of $f$ at $x^*$ is the RHS of this equation which returns a value $b^*$ so that $\langle x^*, x\rangle - b^*$ is a supporting hyperplane of the epigraph of $f$.

Provided that $f$ is proper, convex and lower semicontinuous, knowing $f^*$ can tell us everything about $f$. In fact, according to the Fenchel-Moreau theorem, $f=f^{**}$, where $f^{**}=(f^*)^*$.

The convex conjugates are the dual objects of convex functions and, as you can see, they

### Perturbation Functions

The framework of perturbation functions is perhaps the most elegant in optimization theory the reveals the essence of dual optimization problems [1], [2]. In fact, every dualization (there is no unique way in which we define a dual optimization problem) is associated with a perturbation function.

Consider the optimization problem

\begin{aligned} \mathrm{Minimize}_{x\in\mathcal{H}}\ f(x) \end{aligned}

We introduce a convex function $\Phi: \mathcal{H}\times\mathcal{K}\to\bar{\mathbb{R}}$ which we shall call a perturbation function, so that $\Phi(x,0)=f(x)$. Then, the optimization problem above can be equivalently written as

\begin{aligned} \mathrm{Minimize}_{x\in\mathcal{H}}\ \Phi(x,0). \end{aligned}

For $y\neq 0$, the following problem, which depends parametrically on $y$, is called a perturbed optimization problem

\begin{aligned} \mathrm{Minimize}_{x\in\mathcal{H}}\ \Phi(x,y) \end{aligned}

Let $h(y) := \inf_{x\in\mathcal{H}} \Phi(x,y)$ be the optimal value of this problem. We are interested in determining $h(0)$. If $h$ is sufficiently regular, that is, proper, convex, lower semicontinuous (we shall discuss later when this is the case), then

\begin{aligned} h(0) = h^{**}(0), \end{aligned}

otherwise $h(0)\geq h^{**}(0)$.

The dual problem consists in finding $h^{**}(0)$.

By definition

\begin{aligned} h^{**}(y) = \sup_{y^*\in\mathcal{K}} \langle y, y^* \rangle - h^*(y^*) \end{aligned}

therefore,

\begin{aligned} h^{**}(0) = \sup_{y^*\in\mathcal{K}} - h^*(y^*) \end{aligned}

Therefore, the determination of $h^{**}(0)$ requires the solution of the optimization problem

\begin{aligned} \mathrm{Maximize}_{y^*\in\mathcal{K}}\ -h^*(y^*). \end{aligned}

This is precisely the Fenchel dual optimization problem [3].

### Duality and Perturbation Functions

Let us see how exactly perturbation functions lead naturally to dual formulations. The convex conjugate of $\Phi$ is

\begin{aligned} \Phi^*(x^*,y^*) = \sup_{\substack{x\in \mathcal{H}\\y\in\mathcal{K}}} \langle x^*,x \rangle + \langle y^*, y\rangle - \Phi(x,y), \end{aligned}

therefore,

\begin{aligned} \Phi^*(0,y^*) &= \sup_{\substack{x\in \mathcal{H}\\y\in\mathcal{K}}} \langle y^*, y\rangle - \Phi(x,y)\\ &= \sup_{y\in \mathcal{K}} \{ \langle y^*, y\rangle - \inf_{x\in \mathcal{H}} \Phi(x,y)\}\\ &= \sup_{y\in \mathcal{K}} \{\langle y^*, y\rangle - h(y)\}\\ &= h^*(y^*). \end{aligned}

As a result, the dual optimization problem problem can be written as

\begin{aligned} \mathrm{Maximize}_{y^*\in\mathcal{K}}\ -\Phi^*(0,y^*), \end{aligned}

or, equivalently

\begin{aligned} \mathrm{Minimize}_{y^*\in\mathcal{K}}\ \Phi^*(0,y^*). \end{aligned}

Juxtapose this with the primal problem which is

\begin{aligned} \mathrm{Minimize}_{x\in\mathcal{H}}\ \Phi(x,0). \end{aligned}

### Lagrangian Duality and Perturbation Functions

Lagrangian duality is a particular type of duality for optimization problems of the form

\begin{aligned} &\mathrm{Minimize}_{x\in\mathcal{H}}\ f(x)\\ &\text{subject to: }\ g(x) \leq 0, \end{aligned}

where $f:\mathbb{R}^n\to\bar{\mathbb{R}}$, $g:\mathbb{R}^n\to\mathbb{R}^m$ and $\leq$ is meant in the component-wise sense.

We perturb the problem as follows

\begin{aligned} &\mathrm{Minimize}_{x\in\mathcal{H}}\ f(x)\\ &\text{subject to: }\ g(x) - y \leq 0, \end{aligned}

where $y\in\mathbb{R}^m$. Define the set $C = \{(x,y)\in\mathbb{R}^{n+m}: g(x) \leq y\}$. The perturbed problem is written as

\begin{aligned} &\mathrm{Minimize}_{x\in\mathcal{H}}\ f(x) + \delta_C(x,y), \end{aligned}

where $\delta_C(x,y)$ is the indicator function of $C$, that is

\begin{aligned} \delta_C(x,y) = \begin{cases} 0,&\text{ if } g(x) \leq y,\\ +\infty,&\text{ otherwise} \end{cases} \end{aligned}

In other words, the perturbation function is defined as

\begin{aligned} \Phi(x,y) = f(x) + \delta_C(x, y). \end{aligned}

Its convex conjugate is

\begin{aligned} \Phi^*(0, y^*) &= \sup_{x,y} \langle y^*, y\rangle - \Phi(x,y)\\ &= \sup_{x,y: g(x) \leq y} \langle y^*, y\rangle - f(x)\\ &= \sup_{x, z\geq 0} \langle y^*, -g(x) - z\rangle - f(x)\\ &= - \inf_{x,z\geq 0} f(x) + \langle y^*, g(x)\rangle + \langle y^*, z\rangle\\ &= -\inf_{x} \{ f(x) + \langle y^*, g(x)\rangle + \inf_{z\geq 0} \langle y^*, z\rangle \}\\ &= -\inf_{x}\{ f(x) + \langle y^*, g(x)\rangle\} + \delta(y^*| S_+), \end{aligned}

where $S_+ := \{y\in\mathbb{R}^m{}:{} y\geq 0\}$.

This is exactly the Lagrangian dual optimization problem which is the Fenchel dual on a properly chosen perturbation function.

### References

[1] R.I. Bot, S.M. Grad and G. Wanka, Fenchel-Lagrange Duality Versus Geometric Duality in Convex Optimization, JOTA 129(1):35-54, 2006.

[2] R.T. Rockafellar, R. J.-B. Wets, Variational Analysis, Springer, 2009.

[3] H. H. Bauschke, P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2011.

### 2 responses

1. In the paragraph on Lagrangian Duality and Perturbation Functions shouldn’t the perturbed problem be g(x) – y <=0 for the definition of the set C(x,y) to make sense?

Liked by 1 person

1. Yes, this was a typo. Thanks for pointing out to it.

Like

mathbabe

Exploring and venting about quantitative issues

Look at the corners!

The math blog of Dmitry Ostrovsky

The Unapologetic Mathematician

Mathematics for the interested outsider

Almost Sure

A random mathematical blog

Mathematix

Mathematix is the mathematician of the village of Asterix