# Cone programs and self-dual embeddings

This post aims to provide some intuition about cone programs from different perspectives; in particular:

1. Equivalence of different formulations of cone programs
2. Fenchel duality
3. Primal-dual optimality conditions (OC)
4. OCs as variational inequalities
5. Homogeneous self-dual embeddings (HSDEs)
6. OCs for HSDEs

## Cones and their properties

A set $\mathcal{K}$ in a vector space $X$ is called a cone if $\lambda x \in \mathcal{K}$ for every $x\in\mathcal{K}$ and every $\lambda \geq 0$; it is a convex cone if, in addition, it is closed
under addition. $\mathcal{K}$ is said to be a pointed cone if $x\in \mathcal{K}$ and $-x \in \mathcal{K}$ imply $x=0$. Pointed convex cones define the partial ordering:

\begin{aligned} x \geq_\mathcal{K} y \Leftrightarrow x-y \in \mathcal{K}. \end{aligned}

The polar of a cone $\mathcal{K}$ is a cone $\mathcal{K}^\circ$ defined as

\begin{aligned}\mathcal{K}^\circ = \{x^* {}\mid{} \langle x^*, x\rangle \leq 0, \forall x\in\mathcal{K}\}. \end{aligned}

If $\mathcal{K}$ is a closed convex cone, the indicator functions of $\mathcal{K}$ and $\mathcal{K}^{\circ}$ are conjugate to each other, that is

\begin{aligned} (\delta_\mathcal{K})^* = \delta_{\mathcal{K}^\circ}. \end{aligned}

The dual cone of $\mathcal{K}$ is defined as

\begin{aligned} \mathcal{K}^* = \{x^* {}\mid{} \langle x^*, x\rangle \geq 0, \forall x\in\mathcal{K}\} = -\mathcal{K}^\circ. \end{aligned}

If $\mathcal{K}$ is a closed, convex cone, then $\mathcal{K}^{**}=\mathcal{K}$ and $\mathcal{K}^{\circ\circ}=\mathcal{K}$.
This result is known as the extended Farkas’ lemma.

Interesting examples of cones are

1. the set of positive semidefinite matrices $\mathcal{K}_{\mathrm{psd}} =\{A \in \mathbb{S}^{n}{}\mid{}A\succeq 0\}$ (with $\mathbb{S}^n$ the space of symmetric $n\times n$ matrices), which is a closed, convex, pointed cone,
2. the ice cream (second-order) cone ${\mathcal{C}}_n=\{z=(x,t)\in\mathbb{R}^n\times \mathbb{R}{}\mid{} \|x\|_2\leq t\}$, and
3. polyhedral cones $\mathcal{K}_{\mathrm{poly}}=\{x\in\mathbb{R}^n{}\mid{}Ax\geq 0\}$, for some matrix $A\in\mathbb{R}^{m\times n}$.
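As a quick sanity check, the ice cream cone admits a simple closed-form projection. The sketch below (plain NumPy; the function name `proj_soc` is mine) implements the standard three-case projection formula and verifies that the result lies in the cone.

```python
import numpy as np

def proj_soc(x, t):
    """Project the point (x, t) onto the ice cream cone ||x||_2 <= t."""
    nx = np.linalg.norm(x)
    if nx <= t:            # already inside the cone
        return x, t
    if nx <= -t:           # inside the polar cone: projection is the origin
        return np.zeros_like(x), 0.0
    alpha = (nx + t) / 2   # otherwise, project onto the cone's boundary
    return alpha * x / nx, alpha

x, t = proj_soc(np.array([3.0, 4.0]), 1.0)
print(np.linalg.norm(x) <= t + 1e-12)  # membership after projection
```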

The conjugate of a function $f$ is defined as

\begin{aligned} f^*(y) = \sup_x \langle x,y\rangle - f(x). \end{aligned}

If $f(x) = g(x-b)$, then $f^*(y) = \langle b,y\rangle + g^*(y)$. For all $\alpha>0$, if $f(x) = \alpha g(x)$, $f^*(y) = \alpha g^*(y/\alpha)$.
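Both rules are easy to test numerically. Taking $g(x)=\tfrac{1}{2}x^2$ (whose conjugate is $g^*(y)=\tfrac{1}{2}y^2$), the brute-force sketch below approximates the conjugates by taking the sup over a grid and compares against the shift and scaling rules; the grid bounds and tolerances are assumptions for illustration.

```python
import numpy as np

xs = np.linspace(-50, 50, 200001)      # grid over which the sup is taken
g  = lambda x: 0.5 * x**2              # conjugate: g*(y) = 0.5 * y**2

def conjugate(f, y):
    """Brute-force f*(y) = sup_x <x, y> - f(x), approximated on the grid."""
    return np.max(xs * y - f(xs))

b, alpha, y = 2.0, 3.0, 1.5
f_shift = lambda x: g(x - b)           # rule: f*(y) = <b, y> + g*(y)
f_scale = lambda x: alpha * g(x)       # rule: f*(y) = alpha * g*(y / alpha)

print(np.isclose(conjugate(f_shift, y), b * y + 0.5 * y**2, atol=1e-3))
print(np.isclose(conjugate(f_scale, y), alpha * 0.5 * (y / alpha)**2, atol=1e-3))
```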

The subdifferential of a convex function $f$ at a point $x \in \mathrm{dom} f$ is

\begin{aligned} \partial f(x) = \{g {}\mid{} f(y) \geq f(x) + \langle g, y-x\rangle , \forall y \in \mathrm{dom} f\}. \end{aligned}

The subdifferential of the indicator function $\delta_C$ of a convex set $C$ is

\begin{aligned} \partial \delta_C(x) = N_C(x) = \{g {}\mid{} \langle g, x\rangle \geq \langle g, y\rangle , \forall y\in C\}. \end{aligned}

This set is called the normal cone of $C$ at $x$.

Moreau’s Theorem establishes an important duality correspondence between a closed convex cone $\mathcal{K}$ and its polar cone $\mathcal{K}^\circ$:

For all $z,x,y\in X$, the following are equivalent:

1. $z=x+y$, $x\in \mathcal{K}$, $y\in\mathcal{K}^{\circ}$, $\langle x,y\rangle =0$,
2. $x = \mathrm{proj}_{\mathcal{K}}z$ and $y = \mathrm{proj}_{\mathcal{K}^\circ}z$.
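For the nonnegative orthant $\mathcal{K}=\mathbb{R}^n_+$ (whose polar is $\mathbb{R}^n_-$), both projections are componentwise clippings, so Moreau's decomposition can be checked directly; a minimal sketch:

```python
import numpy as np

z = np.array([3.0, -1.5, 0.0, -2.0, 4.0])

x = np.maximum(z, 0.0)   # projection onto K  = R^n_+
y = np.minimum(z, 0.0)   # projection onto K° = R^n_-

print(np.allclose(x + y, z))   # z = x + y
print(np.dot(x, y) == 0.0)     # <x, y> = 0
print(np.all(x >= 0) and np.all(y <= 0))
```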

## Cone programs

A cone program is an optimization problem of the form

\begin{aligned} \mathcal{P}:&\mathrm{minimize}_{x\in\mathbb{R}^n} \langle c, x\rangle \\ &\text{s.t. } b- Ax\in\mathcal{K}, \end{aligned}

with $A\in\mathbb{R}^{m\times n}$.

The constraint $b- Ax\in\mathcal{K}$ can be equivalently written as (i) $b-Ax\geq_{\mathcal{K}} 0$, that is $Ax \leq_{\mathcal{K}} b$, or (ii) $Ax + s = b$ for $s\in\mathcal{K}$. Problem $\mathcal{P}$ is often written as

\begin{aligned} \mathcal{P}:&\mathrm{minimize}_{x\in\mathbb{R}^n} \langle c, x\rangle \\ &\text{s.t. } Ax + s = b\\ &s\in \mathcal{K} \end{aligned}

In the literature, we often encounter the following standard formulation for a cone program

\begin{aligned} \mathcal{P}':&\mathrm{minimize}_{x\in\mathbb{R}^n} \langle c, x\rangle \\ &\text{s.t. } Ax = b,\\ & x\in \mathcal{K}. \end{aligned}

These two formulations — problems $\mathcal{P}$ and $\mathcal{P}'$ —  are equivalent in the sense that we may transform one into the other.

For instance, starting from the second one, we clearly see that the constraints can be interpreted as (i) $x$ belongs to an affine space defined as $E = \{x {}\mid{} Ax = b\}$ and (ii) $x\in\mathcal{K}$, that is, $x$ is contained in a cone. Essentially, $x$ must be in the intersection of a cone and an affine space.

Take $x_0\in\mathbb{R}^n$ so that $Ax_0 = b$ and let $N_A\in\mathbb{R}^{n\times p}$ be a matrix whose columns span the kernel of $A$ (which has dimension $p\leq n$), that is, $AN_Av=0$ for all $v\in\mathbb{R}^p$.

Then $E = \{z = N_A v + x_0 {}\mid{} v\in\mathbb{R}^p\}$ and the requirement that $Ax=b$ is written as $x = N_A v + x_0$, so the problem becomes (dropping the constant $\langle c, x_0\rangle$ from the objective)

\begin{aligned} &\mathrm{minimize}_{v\in\mathbb{R}^p} \langle N_A^*c, v\rangle \\ &\text{s.t. } N_A v + x_0 \in \mathcal{K}, \end{aligned}

which is in the form of $\mathcal{P}$.
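This reduction can be carried out numerically: a particular solution $x_0$ comes from least squares, and a kernel basis $N_A$ from the right singular vectors of $A$. A sketch on random data (the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))        # m = 3 equality constraints, n = 5
b = rng.standard_normal(3)

x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # particular solution, A x0 = b

# Columns of N_A span ker(A): right singular vectors beyond rank(A).
_, _, Vt = np.linalg.svd(A)
N_A = Vt[np.linalg.matrix_rank(A):].T       # n x p, here p = n - m = 2

print(np.allclose(A @ x0, b))               # x0 lies in the affine space E
print(np.allclose(A @ N_A, 0))              # A N_A v = 0 for every v
v = rng.standard_normal(N_A.shape[1])
print(np.allclose(A @ (N_A @ v + x0), b))   # any N_A v + x0 satisfies Ax = b
```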

Conversely, $\mathcal{P}$ can be written in the form of problem $\mathcal{P}'$, e.g., by introducing a slack variable $s\in\mathcal{K}$ so that $Ax+s=b$ and splitting the free variable as $x = x_+ - x_-$ with $x_+, x_- \geq 0$.

## Fenchel duality

The dual of problem $\mathcal{P}$ can be derived using the Fenchel duality framework.

Define $f(x) = \langle c,x\rangle$ and $g(z) = \delta_{\mathcal{K}}(b-z)$. Then, problem $\mathcal{P}$ can be written as

\begin{aligned} \mathcal{P}:\mathrm{minimize}_{x\in\mathbb{R}^n} f(x) + g(Ax), \end{aligned}

whose Fenchel dual is

\begin{aligned} \mathcal{D}:\mathrm{minimize}_{y\in\mathbb{R}^m} f^*(-A^*y) + g^*(y). \end{aligned}

The conjugates of $f$ and $g$ are [note: The conjugate of $g$ is derived as follows: $g(z) = \delta_{\mathcal{K}}(b-z) = \delta_{-\mathcal{K}}(z-b)$, so $g^*(y) = \langle b,y\rangle + (\delta_{-\mathcal{K}})^*(y) = \langle b,y\rangle + \delta_{(-\mathcal{K})^{\circ}}(y) = \langle b,y\rangle + \delta_{\mathcal{K}^*}(y)$.]

\begin{aligned} f^*(y) &= \delta_{\{c\}}(y),\\ g^*(y) &= \langle b, y\rangle + \delta_{\mathcal{K}^*}(y), \end{aligned}

Substituting these conjugates into $\mathcal{D}$ yields the explicit form

\begin{aligned} \mathcal{D}:&\mathrm{minimize}_{y\in\mathbb{R}^m} \langle b,y\rangle \\ &\text{s.t. } y\in\mathcal{K}^*,\\ & A^*y + c = 0 \end{aligned}

Let $p^\star$ be the optimal value of $\mathcal{P}$ and $d^\star$
the optimal value of $\mathcal{D}$. Strong duality holds if the primal or the dual problem is strictly feasible, in which case $p^\star = -d^\star$.

Overall, we may derive the following optimality conditions for a pair $(x^\star, y^\star)$ to be a primal-dual optimal pair

\begin{aligned} b-Ax^\star \in \mathcal{K},\\ y^\star \in \mathcal{K}^*,\\ A^* y^\star + c = 0,\\ \langle c,x^\star\rangle + \langle b, y^\star\rangle = 0. \end{aligned}

These optimality conditions can be seen as the following conditions on $(x^\star, s^\star, y^\star)$

\begin{aligned} \begin{bmatrix} 0\\s^\star \end{bmatrix} = \begin{bmatrix} 0 & A^*\\ -A & 0 \end{bmatrix} \begin{bmatrix} x^\star\\y^\star \end{bmatrix} + \begin{bmatrix} c\\b \end{bmatrix},\\ s^\star\in\mathcal{K},\ y^\star\in\mathcal{K}^*,\ \langle s^\star,y^\star\rangle =0. \end{aligned}

Note that the equality $\langle s^\star,y^\star\rangle =0$ is equivalent to a zero duality gap, that is, $\langle c,x^\star\rangle + \langle b, y^\star\rangle = 0$.
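These conditions can be checked on a tiny example worked out by hand: with $\mathcal{K}=\mathbb{R}_+$, $A=[-1]$, $b=-1$ and $c=1$, problem $\mathcal{P}$ is $\min x$ subject to $x\geq 1$, with primal solution $x^\star=1$ and dual solution $y^\star=1$. A sketch verifying all the optimality conditions:

```python
import numpy as np

A = np.array([[-1.0]]); b = np.array([-1.0]); c = np.array([1.0])
x = np.array([1.0])     # primal solution of min x s.t. x >= 1
y = np.array([1.0])     # dual solution
s = b - A @ x           # slack, so that Ax + s = b

print(np.all(s >= 0))                       # b - Ax in K
print(np.all(y >= 0))                       # y in K*
print(np.allclose(A.T @ y + c, 0))          # A*y + c = 0
print(np.isclose(c @ x + b @ y, 0))         # zero duality gap
print(np.isclose(s @ y, 0))                 # complementarity <s, y> = 0
```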

## Optimality conditions from a different perspective

The optimality conditions of $\mathcal{P}$ (using the splitting we introduced in the previous section) are simply

\begin{aligned} 0 \in \partial(f + g\circ A)(x^\star), \end{aligned}

and, provided a constraint qualification holds (e.g., there exists $x$ with $b-Ax\in\mathrm{ri}\,\mathcal{K}$), this is equivalent to

\begin{aligned} &0 \in \nabla f(x^\star) + A^* (\partial g)(A x^\star)\\ \Leftrightarrow\ & - \nabla f(x^\star) \in A^* (\partial g)(A x^\star), \end{aligned}

or equivalently

\begin{aligned} &- \nabla f(x^\star) = A^* y^\star,\\ &y^\star \in (\partial g)(A x^\star). \end{aligned}

We have $\partial g(z) = -(\partial \delta_{\mathcal{K}})(b-z) = -N_{\mathcal{K}}(b-z)$, so the conditions become

\begin{aligned} &-c = A^* y^\star,\\ &y^\star \in -N_{\mathcal{K}}(b-Ax^\star). \end{aligned}

Since $\mathcal{K}$ is a nonempty convex cone, we have (book of Hiriart-Urruty and Lemaréchal, Example 5.2.6 (a))

\begin{aligned} N_{\mathcal{K}}(b-Ax^\star) = \mathcal{K}^\circ \cap \{b-Ax^\star\}^{\bot}, \end{aligned}

so that $-N_{\mathcal{K}}(b-Ax^\star) = \mathcal{K}^* \cap \{b-Ax^\star\}^{\bot}$. The conditions above then read $A^*y^\star + c = 0$, $y^\star\in\mathcal{K}^*$ and $\langle y^\star, b-Ax^\star\rangle = 0$, which are exactly the primal-dual optimality conditions we stated above.

## Unboundedness and infeasibility

Infeasibility and unboundedness (dual infeasibility) conditions are provided by the so-called theorems of the alternative. The weak theorems of the alternative state that

1. (Primal feasibility). At most one of the following two systems is solvable:
   - $Ax\preceq_{\mathcal{K}} b$ (i.e., $b-Ax\in\mathcal{K}$, or $Ax+s=b$ for some $s \in \mathcal{K}$),
   - $A^* y = 0$, $y \succeq_{\mathcal{K}^*} 0$ and $\langle b,y\rangle <0$.
2. (Dual feasibility). At most one of the following two systems is solvable:
   - $A^*y+c = 0$ and $y\succeq_{\mathcal{K}^*} 0$,
   - $Ax\preceq_{\mathcal{K}} 0$ and $\langle c,x\rangle < 0$.
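A small illustration of the primal alternative with $\mathcal{K}=\mathbb{R}^2_+$: the constraints $x\leq 0$ and $x\geq 1$ are clearly inconsistent, and $y=(1,1)$ serves as an infeasibility certificate. A sketch:

```python
import numpy as np

# b - Ax >= 0 encodes x <= 0 and x >= 1: an infeasible system.
A = np.array([[1.0], [-1.0]])
b = np.array([0.0, -1.0])

y = np.array([1.0, 1.0])            # candidate certificate
print(np.allclose(A.T @ y, 0))      # A*y = 0
print(np.all(y >= 0))               # y in K* = R^2_+
print(b @ y < 0)                    # <b, y> < 0  =>  primal infeasible
```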

The primal-dual optimality conditions we stated previously, together with these
feasibility conditions, complete the picture.

## Homogeneous Self-Dual Embeddings

Consider the following feasibility problem in $(x^\star, s^\star, y^\star)$ and $(\tau^\star, \kappa^\star)$

\begin{aligned} \begin{bmatrix} 0\\s^\star\\\kappa^\star \end{bmatrix} = \underbrace{ \begin{bmatrix} 0 & A^* & c\\ -A & 0 & b\\ -c^* & -b^* & 0 \end{bmatrix}}_{Q} \begin{bmatrix} x^\star\\y^\star\\\tau^\star \end{bmatrix},\\ s^\star\in\mathcal{K},\ y^\star\in\mathcal{K}^*,\ \tau^\star\geq 0,\ \kappa^\star \geq 0. \end{aligned}

Note that for $\tau^\star=1$ and $\kappa^\star=0$, the above equations collapse to the
primal-dual optimality conditions. Second, due to the skew-symmetry of $Q$, any solution $(0,s^\star, \kappa^\star)$ and $(x^\star, y^\star, \tau^\star)$ satisfies

\begin{aligned} \left\langle \begin{bmatrix} 0\\s^\star\\\kappa^\star \end{bmatrix}, \begin{bmatrix} x^\star\\y^\star\\\tau^\star \end{bmatrix}\right\rangle = 0, \end{aligned}

which leads to $\langle s^\star, y^\star \rangle + \kappa^\star \tau^\star = 0$, but since we already know that $\langle s^\star, y^\star \rangle = 0$, it is $\kappa^\star\tau^\star=0$, i.e., at least one of the two must be zero.

If $\kappa^\star=0$ and $\tau^\star>0$, then $(x^\star/\tau^\star, y^\star/\tau^\star, s^\star/\tau^\star)$ is a primal-dual solution. If $\tau^\star=0$ and $\kappa^\star>0$, then the problem is either primal- or dual-infeasible. If $\tau^\star=\kappa^\star=0$, no definitive conclusion can be drawn.
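Continuing with the tiny LP $\min x$ s.t. $x\geq 1$ (so $A=[-1]$, $b=-1$, $c=1$, with solution $x^\star=y^\star=1$ and $s^\star=0$), embedding the optimal pair with $\tau^\star=1$, $\kappa^\star=0$ should satisfy the homogeneous system; a sketch:

```python
import numpy as np

A = np.array([[-1.0]]); b = np.array([-1.0]); c = np.array([1.0])

# Assemble the HSDE matrix Q block by block.
Q = np.block([
    [np.zeros((1, 1)), A.T,              c[:, None]],
    [-A,               np.zeros((1, 1)), b[:, None]],
    [-c[None, :],      -b[None, :],      np.zeros((1, 1))],
])

u = np.array([1.0, 1.0, 1.0])   # (x*, y*, tau*) with tau* = 1
v = Q @ u                       # should equal (0, s*, kappa*) = (0, 0, 0)

print(np.allclose(Q, -Q.T))             # Q is skew-symmetric
print(np.allclose(v, [0.0, 0.0, 0.0]))  # the embedding holds at the optimum
```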

Let us define $u=(x,y,\tau)$ and $v=(0,s,\kappa)$. Then, the self-dual embedding becomes

\begin{aligned} \text{Find: } u,\,v,\text{ s.t. } Qu=v,\,(u,v)\in\mathcal{C}\times\mathcal{C}^*, \end{aligned}

where $\mathcal{C} = \mathbb{R}^n\times\mathcal{K}^*\times\mathbb{R}_+$, so that $\mathcal{C}^* = \{0\}\times\mathcal{K}\times\mathbb{R}_+$. The problem can now be cast as an optimization problem

\begin{aligned} \mathrm{minimize}_{u,v} \delta_{\mathcal{C}\times\mathcal{C}^*}(u,v) + \delta_{\{0\}}(Qu - v). \end{aligned}

Furthermore, this is equivalent to the variational inequality

\begin{aligned} 0 \in Qu + N_{\mathcal{C}}(u) \end{aligned}

where, for $u\in\mathcal{C}$, $N_{\mathcal{C}}(u) = \mathcal{C}^{\circ}\cap\{u\}^{\bot}$.

This, then, becomes

\begin{aligned} u&\in\mathcal{C}\\ Qu&\in\mathcal{C}^*\\ \langle u, Qu \rangle &= 0 \end{aligned}
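Note that, because $Q$ is skew-symmetric, $\langle u, Qu\rangle = 0$ holds automatically for every $u$, so the last condition comes for free; a quick numerical check on a random skew-symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
Q = M - M.T                           # any skew-symmetric matrix

u = rng.standard_normal(5)
print(np.isclose(u @ (Q @ u), 0.0))   # <u, Qu> = 0 for all u
```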

## Up next

Operator splitting algorithms to solve cone programs.
