2 Convex functions

2.1 Basic Definitions and Examples

2.1.1 The definitions of a Convex Function

Definition 2.1.1

Let $C$ be a nonempty convex set in $\mathbb {R}^n$. A function $f : C \to \mathbb {R}$ is said to be convex on $C$ when, for all pairs $(x,x')\in C\times C$ and all $\alpha \in [0,1]$, there holds

\[ f(\alpha x + (1-\alpha )x') \le \alpha f(x) + (1-\alpha ) f(x'). \tag {1.1.1} \]

Proposition 2.1.1

The function $f$ is strongly convex on $C$ with modulus $c$ if and only if the function $f-\tfrac {1}{2}c\| \cdot \| ^{2}$ is convex on $C$.

Proof ▶

Use direct calculations in the definition (1.1.1) of convexity applied to the function $f-\tfrac {1}{2}c\| \cdot \| ^{2}$, namely:

\[ f(\alpha x+(1-\alpha )x')-\tfrac {1}{2}c\| \alpha x+(1-\alpha )x'\| ^{2}\le \]

\[ \le \alpha f(x)+(1-\alpha )f(x')-\tfrac {1}{2}c\big[\alpha \| x\| ^{2}+(1-\alpha )\| x'\| ^{2}\big]. \]

Definition 2.1.2

(The Set $\operatorname {Conv}\mathbb {R}^n$) A function $f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $, not identically $+\infty $, is said to be convex when, for all $(x,x')\in \mathbb {R}^n\times \mathbb {R}^n$ and all $\alpha \in [0,1]$, there holds

\[ f\big(\alpha x+(1-\alpha )x'\big)\le \alpha f(x)+(1-\alpha )f(x'), \]

considered as an inequality in $\mathbb {R}\cup \{ +\infty \} $.

The class of such functions is denoted by $\operatorname {Conv}\mathbb {R}^n$.

Definition 2.1.3

(Domain of a Function) The domain (or also effective domain) of $f\in \text{Conv }\mathbb {R}^n$ is the nonempty set

\[ \text{dom }f := \{ x\in \mathbb {R}^n : f(x) {\lt} +\infty \} . \]

Definition 2.1.4

(Epigraph of a Function) Given $f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $, not identically equal to $+\infty $, the epigraph of $f$ is the nonempty set

\[ \operatorname {epi} f := \{ (x,r)\in \mathbb {R}^n\times \mathbb {R} : r \ge f(x)\} . \]

Its strict epigraph $\operatorname {epi}_s f$ is defined likewise, with “$\ge $” replaced by “${\gt}$” (beware that the word “strict” here has nothing to do with strict convexity).

Proposition 2.1.2

Let $f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $ be not identically equal to $+\infty $. The three properties below are equivalent:

$f$ is convex in the sense of Definition 1.1.3;
its epigraph is a convex set in $\mathbb {R}^n\times \mathbb {R}$;
its strict epigraph is a convex set in $\mathbb {R}^n\times \mathbb {R}$.

Proof ▶

Left as an exercise.

Theorem 2.1.1

(Inequality of Jensen) Let $f\in \operatorname {Conv}\mathbb {R}^n$. Then, for all collections $\{ x_1,\dots ,x_k\} $ of points in $\text{dom }f$ and all $\alpha =(\alpha _1,\dots ,\alpha _k)$ in the unit simplex of $\mathbb {R}^k$, there holds (inequality of Jensen in summation form)

\[ f\! \bigg(\sum _{i=1}^k \alpha _i x_i\bigg)\le \sum _{i=1}^k \alpha _i f(x_i). \]

Proof ▶

The $k$ points $(x_i,f(x_i))\in \mathbb {R}^n\times \mathbb {R}$ are clearly in $\text{epi }f$, a convex set. Their convex combination

\[ \sum _i \alpha _i (x_i,f(x_i)) = \Big(\sum _i \alpha _i x_i,\sum _i \alpha _i f(x_i)\Big) \]

is also in $\text{epi }f$ (Proposition A.1.3.3). This is just the claimed inequality.

Proposition 2.1.3

Let $f\in \text{Conv }\mathbb {R}^n$. The relative interior of $\operatorname {epi}f$ is the union over $x\in \operatorname {ri}\, \operatorname {dom}f$ of the open half-lines with bottom endpoints at $f(x)$:

\[ \operatorname {ri}\operatorname {epi}f=\{ (x,r)\in \mathbb {R}^n\times \mathbb {R}: x\in \operatorname {ri}\operatorname {dom}f,\ r{\gt}f(x)\} . \]

Proof ▶

Since $\operatorname {dom}f$ is the image of $\operatorname {epi}f$ under the linear mapping “projection onto $\mathbb {R}^n$”, Proposition A.2.1.12 tells us that

\[ \operatorname {ri}\operatorname {dom}f\ \text{is the projection onto }\mathbb {R}^n\ \text{of }\operatorname {ri}\operatorname {epi}f. \tag {1.1.5} \]

Now take $x$ arbitrary in $\operatorname {ri}\operatorname {dom}f$. The subset of $\operatorname {ri}\operatorname {epi}f$ that is projected onto $x$ is just $((\{ x\} \times \mathbb {R})\cap \operatorname {ri}\operatorname {epi}f)$, which in turn is $\operatorname {ri}((\{ x\} \times \mathbb {R})\cap \operatorname {epi}f)$ (use Proposition A.2.1.10). This latter set is clearly $\{ x\} \times (f(x),+\infty )$.

In summary, we have proved that, for $x\in \operatorname {ri}\operatorname {dom}f$, $(x,r)\in \operatorname {ri}\operatorname {epi}f$ if and only if $r{\gt}f(x)$. Together with (1.1.5), this proves our claim.

2.1.2 Special convex functions: Affinity and Closedness

Proposition 2.1.4

Any $f\in \operatorname {Conv}\mathbb {R}^n$ is minorized by some affine function. More precisely: for any $x_0\in \operatorname {ri}\operatorname {dom}f$, there is $s$ in the subspace parallel to $\operatorname {aff}\operatorname {dom}f$ such that

\[ f(x)\ge f(x_0)+\langle s,\, x-x_0\rangle \quad \text{for all }x\in \mathbb {R}^n. \]

In other words, the affine function can be forced to coincide with $f$ at $x_0$.

Proof ▶

We know that $\operatorname {dom}f$ is the image of $\operatorname {epi}f$ under the linear mapping "projection onto $\mathbb {R}^n$". Look again at the definition of an affine hull (§A.1.3) to realize that

\[ \operatorname {aff}\operatorname {epi}f=(\operatorname {aff}\operatorname {dom}f)\times \mathbb {R}\, . \]

Denote by $V$ the linear subspace parallel to $\operatorname {aff}\operatorname {dom}f$, so that $\operatorname {aff}\operatorname {dom}f=\{ x_0\} +V$ with $x_0$ arbitrary in $\operatorname {dom}f$; then we have

\begin{equation} \label{eq:1.2.1} \operatorname {aff}\operatorname {epi}f=\{ x_0+V\} \times \mathbb {R}. \end{equation}

We equip $V\times \mathbb {R}$ and $\mathbb {R}^n\times \mathbb {R}$ with the scalar product of product-spaces. With $x_0\in \operatorname {ri}\operatorname {dom}f$, Proposition 1.1.9 tells us that $(x_0,f(x_0))\in \operatorname {rbd}\operatorname {epi}f$ and we can take a nontrivial hyperplane supporting $\operatorname {epi}f$ at $(x_0,f(x_0))$: using Remark A.4.2.2 and (1), there are $s=s_V\in V$ and $\alpha \in \mathbb {R}$, not both zero, such that

\begin{equation} \label{eq:1.2.2} \langle s,x\rangle +\alpha r\le \langle s,x_0\rangle +\alpha f(x_0) \end{equation}

for all $(x,r)$ with $f(x)\le r$. Note: this implies $\alpha \le 0$ (let $r\to +\infty $!).

Because of our choice of $s$ (in $V$) and $x_0$ (in $\operatorname {ri}\operatorname {dom}f$), we can take $\delta {\gt}0$ so small that $x_0+\delta s\in \operatorname {dom}f$, for which (2) gives

\[ \delta \| s\| ^2\le \alpha \bigl[f(x_0)-f(x_0+\delta s)\bigr]{\lt}+\infty ; \]

this shows $\alpha \neq 0$ (otherwise, both $s$ and $\alpha $ would be zero). Without loss of generality, we can assume $\alpha =-1$; then (2) gives our affine function.

Proposition 2.1.5

For $f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $, the following three properties are equivalent:

$f$ is lower semi-continuous on $\mathbb {R}^n$;
$\operatorname {epi} f$ is a closed set in $\mathbb {R}^n\times \mathbb {R}$;
the sublevel-sets $S_r(f)$ are closed (possibly empty) for all $r\in \mathbb {R}$.

Proof ▶

[(i) $\Rightarrow $ (ii)] Let $(y_k,r_k)_k$ be a sequence of $\operatorname {epi} f$ converging to $(x,r)$ for $k\to +\infty $. Since $f(y_k)\le r_k$ for all $k$, the l.s.c. relation (1.2.3) readily gives

\[ r=\lim r_k \ge \liminf f(y_k)\ge \liminf _{y\to x} f(y)\ge f(x), \]

i.e. $(x,r)\in \operatorname {epi} f$.

[(ii) $\Rightarrow $ (iii)] Construct the sublevel-sets $S_r(f)$ as in Remark 1.1.7: the closed sets $\operatorname {epi} f$ and $\mathbb {R}^n\times \{ r\} $ have a closed intersection.

[(iii) $\Rightarrow $ (i)] Suppose $f$ is not lower semi-continuous at some $x$: there is a (sub)sequence $(y_k)$ converging to $x$ such that $f(y_k)$ converges to $\rho {\lt}f(x)\le +\infty $. Pick $r\in [\rho ,f(x))$: for $k$ large enough, $f(y_k)\le r{\lt} f(x)$; hence $S_r(f)$ contains the tail of $(y_k)$ but not its limit $x$. Consequently, this $S_r(f)$ is not closed.

Definition 2.1.5

(Closed Functions) The function $f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $ is said to be closed if it is lower semi-continuous everywhere, or if its epigraph is closed, or if its sublevel-sets are closed.

Definition 2.1.6 Closure of a Function

The closure (or lower semi-continuous hull) of a function $f$ is the function $\operatorname {cl}f:\mathbb {R}^n\to \mathbb {R}\cup \{ \pm \infty \} $ defined by:

\[ \operatorname {cl}f(x):=\liminf _{y\to x} f(y)\qquad \text{for all }x\in \mathbb {R}^n, \tag {1.2.4} \]

or equivalently by

\[ \operatorname {epi}(\operatorname {cl}f):=\operatorname {cl}\big(\operatorname {epi} f\big). \tag {1.2.5} \]

Proposition 2.1.6

Let $f\in \operatorname {Conv}\mathbb {R}^n$ and $x'\in \operatorname {ri}\operatorname {dom}f$. There holds (in $\mathbb {R}\cup \{ +\infty \} $)

\[ \operatorname {cl}f(x)=\lim _{t\downarrow 0}f\big(x+t(x'-x)\big)\quad \text{for all }x\in \mathbb {R}^n. \]

Proof ▶

Since $x_t:=x+t(x'-x)\to x$ when $t\downarrow 0$, we certainly have

\[ (\operatorname {cl}f)(x)\le \liminf _{t\downarrow 0}f\big(x+t(x'-x)\big). \]

We will prove the converse inequality by showing that

\[ \limsup _{t\downarrow 0}f\big(x+t(x'-x)\big)\le r\quad \text{for all }r\ge (\operatorname {cl}f)(x). \]

(Non-existence of such an $r$ means that $\operatorname {cl}f(x)=+\infty $, the proof is finished.)

Thus let $(x,r)\in \operatorname {epi}(\operatorname {cl}f)=\operatorname {cl}(\operatorname {epi}f)$. Pick $r'{\gt}f(x')$, hence $(x',r')\in \operatorname {epi}f$ (Proposition 1.1.9). Applying Lemma A.2.1.6 to the convex set $\operatorname {epi}f$, we see that

\[ t(x',r')+(1-t)(x,r)\in \operatorname {epi}f\subset \operatorname {epi}f\quad \text{for all }t\in ]0,1]. \]

This just means

\[ f\big(x+t(x'-x)\big)\le tr'+(1-t)r\quad \text{for all }t\in ]0,1] \]

and our required inequality follows by letting $t\downarrow 0$.

Proposition 2.1.7

For $f\in \text{Conv }\mathbb {R}^n$, there holds

\[ \operatorname {cl} f\in \text{Conv }\mathbb {R}^n; \tag {1.2.7} \]

\[ \operatorname {cl} f\text{ and }f\text{ coincide on the relative interior of }\text{dom }f. \tag {1.2.8} \]

Proof ▶

We already know from Proposition A.1.2.6 that $\text{epi }\operatorname {cl} f=\operatorname {cl}\text{epi }f$ is a convex set; also $\operatorname {cl} f\le f\not\equiv +\infty $; finally, Proposition 1.2.1 guarantees in the relation of definition (1.2.4) that $\operatorname {cl} f(x){\gt}-\infty $ for all $x$: (1.2.7) does hold.

On the other hand, suppose $x\in \text{ri }\text{dom }f$. Then the one-dimensional function $\varphi (t)=f(x+td)$ is continuous at $t=0$ (Theorem 0.6.2); it follows from Proposition 1.2.5 that $\operatorname {cl} f$ coincides with $f$ on $\text{ri }\text{dom }f$; besides, $\operatorname {cl} f(x)$ is obviously equal to $f(x)=+\infty $ for all $x\notin \text{cl }\text{dom }f$. Altogether, (1.2.8) is true.

Notation 1.2.7 (The Set $\overline{\operatorname {Conv}}\mathbb {R}^n$) The set of closed convex functions on $\mathbb {R}^n$ is denoted by $\overline{\operatorname {Conv}}\mathbb {R}^n$. □

Proposition 2.1.8

The closure of $f\in \text{Conv }\mathbb {R}^n$ is the supremum of all affine functions minorizing $f$:

\[ \text{cl }f(x)=\sup _{(s,b)\in \mathbb {R}^n\times \mathbb {R}}\{ \langle s,x\rangle -b:\langle s,y\rangle -b\le f(y)\text{ for all }y\in \mathbb {R}^n\} . \tag {1.2.9} \]

Proof ▶

A closed half-space containing $\text{epi }f$ is characterized by a nonzero vector $(s,\alpha )\in \mathbb {R}^n\times \mathbb {R}$ and a real number $b$ such that

\[ \langle s,x\rangle +\alpha r\le b\qquad \text{for all }(x,r)\in \text{epi }f \tag {1.2.10} \]

(we equip the graph-space $\mathbb {R}^n\times \mathbb {R}$ with the scalar product of a product-space). Let us denote by $\Sigma \subset \mathbb {R}^n\times \mathbb {R}\times \mathbb {R}$ the index-set of such triples $\sigma =(s,\alpha ,b)$, with corresponding half-space

\[ H^-_{\sigma }:=\{ (x,r):\langle s,x\rangle +\alpha r\le b\} . \tag {1.2.11} \]

In other words, $\operatorname {epi}(\overline{cl}\, f)=\overline{\operatorname {epi}f}=\bigcap _{\sigma \in \Sigma }H^-_{\sigma }$.

Because of the particular nature of an epigraph, (1.2.10) implies $\alpha \le 0$ (let $r\to +\infty $) and, by positive homogeneity, the values $\alpha =0$ and $\alpha =-1$ suffice: $\Sigma $ can be partitioned in

\[ \Sigma _1 := \{ (s,-1,b):(1.2.10)\ \text{holds with }\alpha =-1\} \]

and

\[ \Sigma _0 := \{ (s,0,b):(1.2.10)\ \text{holds with }\alpha =0\} . \]

Indeed, $\Sigma _1$ corresponds to affine functions minorizing $f$ (Proposition 1.2.1 tells us that $\Sigma _1\neq \varnothing $) and $\Sigma _0$ to closed half-spaces of $\mathbb {R}^{n}$ containing $\operatorname {dom}f$ (note that $\Sigma _0=\varnothing $ if $\operatorname {dom}f=\mathbb {R}^n$).

We have to prove that, even when $\Sigma _0\neq \varnothing $, intersecting the half-spaces $H^-_{\sigma }$ over $\Sigma $ or over $\Sigma _1$ produces the same set, namely $\operatorname {epi}f$. For this we take arbitrary $\sigma _0=(s_0,0,b_0)\in \Sigma _0$ and $\sigma _1=(s_1,-1,b_1)\in \Sigma _1$, we set

\[ \sigma (t):=(s_1+ts_0,-1,b_1+t b_0)\in \Sigma _1\quad \text{for all }t\ge 0, \]

and we prove (see Fig. 1.2.1)

\[ H^-_{\sigma _0}\cap H^-_{\sigma _1}=\bigcap _{t\ge 0}H^-_{\sigma (t)}=:\, H^-. \]

Fig. 1.2.1. Closing a convex epigraph

It results directly from the definition (1.2.11) that an $(x,r)$ in $H^-_{\sigma _0}\cap H^-_{\sigma _1}$ satisfies

\[ \langle s_1+ts_0,x\rangle -(b_1+tb_0)\le r\quad \text{for all }t\ge 0,\tag {1.2.12} \]

i.e. $(x,r)\in H^-$. Conversely, take $(x,r)\in H^-$. Set $t=0$ in (1.2.12) to see that $(x,r)\in H^-_{\sigma _1}$. Also, divide by $t{\gt}0$ and let $t\to +\infty $ to see that $(x,r)\in H^-_{\sigma _0}$. The proof is complete.

2.1.3 First examples

Theorem 2.1.2

Let $C$ be a nonempty subset of $\mathbb {R}^n\times \mathbb {R}$ satisfying ??, and let its lower-bound function $\ell _C$ be defined by ??.

If $C$ is convex, then $\ell _C\in \operatorname {Conv}\mathbb {R}^n$;
If $C$ is closed convex, then $\ell _C\in \overline{\operatorname {Conv}}\mathbb {R}^n$.

Proof ▶

We use the analytical definition (1.1.1). Take arbitrary $\varepsilon {\gt}0$, $\alpha \in ]0,1[$ and $(x_i,r_i)\in C$ such that $r_i\le \ell _C(x_i)+\varepsilon $ for $i=1,2$.

When $C$ is convex, $(\alpha x_1+(1-\alpha )x_2,\alpha r_1+(1-\alpha )r_2)\in C$, hence

\[ \ell _C(\alpha x_1+(1-\alpha )x_2)\le \alpha r_1+(1-\alpha )r_2 \le \alpha \ell _C(x_1)+(1-\alpha )\ell _C(x_2)+\varepsilon . \]

The convexity of $\ell _C$ follows, since $\varepsilon {\gt}0$ was arbitrary; (i) is proved.

Now take a sequence $(x_k,\rho _k)_k\subset \operatorname {epi}\ell _C$ converging to $(x,\rho )$; we have to prove $\ell _C(x)\le \rho $ (cf. Proposition 1.2.2). By definition of $\ell _C(x_k)$, we can select, for each positive integer $k$, a real number $r_k$ such that $(x_k,r_k)\in C$ and

\begin{equation} \label{eq:1.3.7} \ell _C(x_k)\le r_k\le \ell _C(x_k)+\tfrac {1}{k}\le \rho _k+\tfrac {1}{k}. \end{equation}

We deduce first that $(r_k)$ is bounded from above. Also, when $\ell _C$ is convex, Proposition 1.2.1 implies the existence of an affine function minorizing $\ell _C$: $(r_k)$ is bounded from below.

Extracting a subsequence if necessary, we may assume $r_k\to r$. When $C$ is closed, $(x,r)\in C$, hence $\ell _C(x)\le r$; but pass to the limit in 3 to see that $r\le \rho $; the proof is complete.

2.2 Functional Operations Preserving Convexity

2.2.1 Operations preserving closedness

Proposition 2.2.1

Let $f_1,\dots ,f_m$ be in $\text{Conv }\mathbb {R}^n$ [resp. in $\overline{\text{Conv }}\mathbb {R}^n$], let $t_1,\dots ,t_m$ be positive numbers, and assume that there is a point where all the $f_j$’s are finite. Then the function $f:=\sum _{j=1}^m t_j f_j$ is in $\text{Conv }\mathbb {R}^n$ [resp. in $\overline{\text{Conv }}\mathbb {R}^n$].

Proof ▶

The convexity of $f$ is readily proved from the relation of definition (1.1.1). As for its closedness, start from

\[ \liminf _{y\to x} t_j f_j(y)=t_j\liminf _{y\to x} f_j(y)\ge t_j f_j(x) \]

(valid for $t_j{\gt}0$ and $f_j$ closed); then note that the lim inf of a sum is not smaller than the sum of lim inf’s.

Proposition 2.2.2

Let $\{ f_j\} _{j\in J}$ be an arbitrary family of convex [resp. closed convex] functions. If there exists $x_0$ such that $\sup _{j} f_j(x_0) {\lt} +\infty $, then their pointwise supremum $f := \sup \{ f_j : j\in J\} $ is in $\mathrm{Conv}\, \mathbb {R}^n$ [resp. in $\mathrm{Conv}_\mathrm {cl}\, \mathbb {R}^n$].

Proof ▶

The key property is that a supremum of functions corresponds to an intersection of epigraphs: $\operatorname {epi} f = \bigcap _{j\in J}\operatorname {epi} f_j$, which conserves convexity and closedness. The only needed restriction is nonemptiness of this intersection.

Proposition 2.2.3

Let $f\in \operatorname {Conv}\mathbb {R}^n$ [resp. $\overline{\operatorname {Conv}}\mathbb {R}^n$] and let $A$ be an affine mapping from $\mathbb {R}^m$ to $\mathbb {R}^n$ such that $\operatorname {Im}A\cap \operatorname {dom}f\neq \varnothing $. Then the function

\[ f\circ A : \mathbb {R}^m \supseteq x\mapsto (f\circ A)(x)=f(A(x)) \]

is in $\operatorname {Conv}\mathbb {R}^m$ [resp. $\overline{\operatorname {Conv}}\mathbb {R}^m$].

Proof ▶

Clearly $(f\circ A)(x){\gt}-\infty $ for all $x$; besides, there exists by assumption $y=A(x)\in \mathbb {R}^n$ such that $f(y){\lt}+\infty $. To check convexity, it suffices to plug the relation

\[ A(\alpha x+(1-\alpha )x')=\alpha A(x)+(1-\alpha )A(x') \]

into the analytical definition (1.1.1) of convexity. As for closedness, it comes readily from the continuity of $A$ when $f$ is itself closed.

Proposition 2.2.4

Let $f\in \operatorname {Conv}\mathbb {R}^n$ [resp. $\overline{\operatorname {Conv}}\mathbb {R}^n$] and let $g\in \operatorname {Conv}\mathbb {R}$ [resp. $\overline{\operatorname {Conv}}\mathbb {R}$] be increasing. Assume that there is $x_0\in \mathbb {R}^n$ such that $f(x_0)\in \operatorname {dom}g$, and set $g(+\infty ):=+\infty $. Then the composite function $g\circ f:\; x\mapsto g(f(x))$ is in $\operatorname {Conv}\mathbb {R}^n$ [resp. in $\overline{\operatorname {Conv}}\mathbb {R}^n$].

Proof ▶

It suffices to check the inequalities of definition: (1.1.1) for convexity, (1.2.3) for closedness.

2.2.2 Dilations and perspectives of a functions

Proposition 2.2.5

If $f\in \operatorname {Conv}\mathbb {R}^n$, its perspective $\tilde f$ is in $\operatorname {Conv}\mathbb {R}^{n+1}$.

Proof ▶

Here also, it is better to look at $\tilde f$ with “geometric glasses”:

\[ \operatorname {epi}\tilde f=\{ (u,x,r)\in \mathbb {R}_+^\times \times \mathbb {R}^n\times \mathbb {R}:\; f(x/u)\le r/u\} =\{ u(1,x',r'):\; u{\gt}0,(x',r')\in \operatorname {epi}f\} \]

\[ =\bigcup _{u{\gt}0}u(\{ 1\} \times \operatorname {epi}f)=\mathbb {R}_+^\times (\{ 1\} \times \operatorname {epi}f) \]

and $\operatorname {epi}\tilde f$ is therefore a convex cone.

Proposition 2.2.6

Let $f\in \operatorname {Conv}\mathbb {R}^n$ and let $x'\in \operatorname {ri}\text{dom }f$. Then the closure $\overline{f}$ of its perspective is given as follows:

\[ (\text{cl }\tilde f)(u,x)= \begin{cases} u f(x/u) & \text{if } u{\gt}0,\\[4pt] \lim _{\alpha \downarrow 0}\alpha f(x’-x+\tfrac {x}{\alpha }) & \text{if } u=0,\\[4pt] +\infty & \text{if } u{\lt}0. \end{cases} \]

Proof ▶

Suppose first $u{\lt}0$. For any $x$, it is clear that $(u,x)$ is outside $\text{cl }\text{dom }\tilde f$ and, in view of (1.2.8), $\text{cl }\tilde f(u,x)=+\infty $.

Now let $u\ge 0$. Using (2.2.1), the assumption on $x'$ and the results of §A.2.1, we see that $(1,x')\in \text{ri }\text{dom }\tilde f$, so Proposition 1.2.5 allows us to write

\[ (\text{cl }\tilde f)(u,x)=\lim _{\alpha \downarrow 0}\tilde f\bigl((u,x)+\alpha [(1,x')-(u,x)]\bigr) \]

\[ \qquad =\lim _{\alpha \downarrow 0}[u+\alpha (1-u)]\, f\! \Bigl(\frac{x+\alpha (x'-x)}{u+\alpha (1-u)}\Bigr). \]

If $u=1$, this reads $\text{cl }\tilde f(1,x)=\text{cl }f(x)=f(x)$ (because $f$ is closed); if $u=0$, we just obtain our claimed relation.

2.2.3 Infimal convolution

Definition 2.2.1

Let $f_1$ and $f_2$ be two functions from $\mathbb {R}^n$ to $\mathbb {R}\cup \{ +\infty \} $. Their infimal convolution is the function from $\mathbb {R}^n$ to $\mathbb {R}\cup \{ \pm \infty \} $ defined by

\begin{align} (f_1\mathbin {\square }f_2)(x)& :=\inf \{ f_1(x_1)+f_2(x_2):\; x_1+x_2=x\} \label{eq:2.3.1}\\ & =\inf _{y\in \mathbb {R}^n}[\, f_1(y)+f_2(x-y)\, ].\notag \end{align}

We will also call “infimal convolution” the operation expressed by 7. It is called exact at $x=\bar x_1+\bar x_2$ when the infimum is attained at $(\bar x_1,\bar x_2)$, not necessarily unique. □

Proposition 2.2.7

Let the functions $f_1$ and $f_2$ be in $\operatorname {Conv}\mathbb {R}^n$. Suppose that they have a common affine minorant: for some $(s,b)\in \mathbb {R}^n\times \mathbb {R}$,

\[ f_j(x)\ge \langle s,x\rangle - b\qquad \text{for }j=1,2\text{ and all }x\in \mathbb {R}^n. \]

Then their infimal convolution is also in $\operatorname {Conv}\mathbb {R}^n$.

Proof ▶

For arbitrary $x\in \mathbb {R}^n$ and $x_1,x_2$ such that $x_1+x_2=x$, we have by assumption

\[ f_1(x_1)+f_2(x_2)\ge \langle s,x\rangle -2b{\gt}- \infty , \]

and this inequality extends to the infimal value $(f_1\mathbin {\square }f_2)(x)$.

On the other hand, it suffices to choose particular values $x_j\in \text{dom }f_j$, $j=1,2$, to obtain the point $x_1+x_2\in \text{dom }(f_1\mathbin {\square }f_2)$. Finally, the convexity of $f_1\mathbin {\square }f_2$ results from the convexity of a lower-bound function, as seen in §1.3(g).

2.2.4 Image of a function under a linear mapping

Definition 2.2.2

(Image Function) Let $A:\mathbb {R}^m\to \mathbb {R}^n$ be a linear operator and let $g:\mathbb {R}^m\to \mathbb {R}\cup \{ +\infty \} $. The image of $g$ under $A$ is the function $Ag:\mathbb {R}^n\to \mathbb {R}\cup \{ \pm \infty \} $ defined by

\[ (Ag)(x):=\inf \{ \, g(y):\; Ay=x\, \} \tag {2.4.2} \]

(here as always, $\inf \varnothing =+\infty $).

Theorem 2.2.1

Let $g$ of Definition 2.4.1 be in $\mathrm{Conv}\, \mathbb {R}^m$. Assume also that, for all $x\in \mathbb {R}^n$, $g$ is bounded below on the inverse image $A^{-1}(x)=\{ y\in \mathbb {R}^m:Ay=x\} $. Then $Ag\in \mathrm{Conv}\, \mathbb {R}^n$.

Proof ▶

By assumption, $Ag$ is nowhere $-\infty $; also, $(Ag)(x){\lt}+\infty $ whenever $x=Ay$, with $y\in \text{dom }g$. Now consider the extended operator

\[ A':\mathbb {R}^m\times \mathbb {R}\rightrightarrows (y,r)\mapsto A'(y,r):=(Ay,r)\in \mathbb {R}^n\times \mathbb {R}. \]

The set $A'(\text{epi }g)=:C$ is convex in $\mathbb {R}^n\times \mathbb {R}$; let us compute its lower-bound function (1.3.5): for given $x\in \mathbb {R}^n$,

\[ \inf _{r}\{ r : (x,r)\in C\} = \inf _{y,r}\{ r : Ay = x \ \text{and}\ g(y)\le r\} = \inf _{y}\{ g(y) : Ay = x\} = (Ag)(x), \]

and this proves the convexity of $Ag=\ell _C$.

Corollary 2.2.1

Let (2.4.1) have the following form: $U=\mathbb {R}^p$; $\varphi \in \text{Conv }\mathbb {R}^p$; $X=\mathbb {R}^n$ is equipped with the canonical basis; the mapping $c$ has its components $c_j\in \text{Conv }\mathbb {R}^p$ for $j=1,\dots ,n$. Suppose also that the optimal value is ${\gt}-\infty $ for all $x\in \mathbb {R}^n$, and that

\[ \text{dom }\varphi \cap \text{dom }c_1\cap \cdots \cap \text{dom }c_n\neq \varnothing . \tag {2.4.4} \]

Then the value function

\[ v_{\varphi ,c}(x):=\inf \{ \varphi (u):c_j(u)\le x_j,\ \text{for }j=1,\dots ,n\} \]

lies in $\text{Conv }\mathbb {R}^n$.

Proof ▶

Note first that we have assumed $v_{\varphi ,c}(x){\gt}-\infty $ for all $x$. Take $u_0$ in the set (2.4.4) and set $M:=\max _j c_j(u_0)$; then take $x_0:=(M,\dots ,M)\in \mathbb {R}^n$, so that $v_{\varphi ,c}(x_0)\le \varphi (u_0){\lt}+\infty $. Knowing that $v_{\varphi ,c}$ is an image-function, we just have to prove the convexity of the set (2.4.3); but this in turn comes immediately from the convexity of each $c_j$.

Definition 2.2.3

(Marginal Function) Let $g\in \operatorname {Conv}(\mathbb {R}^n\times \mathbb {R}^m)$. Then

\[ \mathbb {R}^n\ni x\mapsto \gamma (x):=\inf \{ \, g(x,y):\; y\in \mathbb {R}^m\, \} \]

is the marginal function of $g$.

Corollary 2.2.2

With the above notation, suppose that $g$ is bounded below on the set $\{ x\} \times \mathbb {R}^m$, for all $x\in \mathbb {R}^n$. Then the marginal function $\gamma $ lies in $\operatorname {Conv}\mathbb {R}^n$.

Proof ▶

The marginal function $\gamma $ is the image of $g$ under the the linear operator $A$ projecting each $(x,y)\in \mathbb {R}^n\times \mathbb {R}^m$ onto $x\in \mathbb {R}^n$: $A(x,y)=x$. □

2.2.5 Convex hull and closed convex hull of a function

Proposition 2.2.8

Let $g:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $, not identically $+\infty $, be minorized by some affine function: for some $(s,b)\in \mathbb {R}^n\times \mathbb {R}$,

\[ g(x)\ge \langle s,x\rangle - b\quad \text{for all }x\in \mathbb {R}^n . \tag {2.5.1} \]

Then, the following three functions $f_1, f_2$ and $f_3$ are convex and coincide on $\mathbb {R}^n$:

\[ \begin{aligned} f_1(x) & := \inf \{ r : (x,r)\in \operatorname {co}\, \operatorname {epi} g\} ,\\ f_2(x) & := \sup \{ h(x): h\in \operatorname {Conv}\mathbb {R}^n,\ h\le g\} ,\\ f_3(x) & := \inf \Big\{ \sum _{j=1}^k \alpha _j g(x_j): k=1,2,\dots \\ & \qquad \qquad \qquad \alpha \in \Delta _k,\ x_j\in \operatorname {dom} g,\ \sum _{j=1}^k \alpha _j x_j = x\Big\} . \end{aligned} \tag {2.5.2} \]

Proof ▶

We denote by $\Gamma $ the family of convex functions minorizing $g$. By assumption, $\Gamma \neq \varnothing $; then the convexity of $f_1$ results from §1.3(g).

[$f_2\le f_1$] Consider the epigraph of any $h\in \Gamma $: its lower-bound function $\ell _{\mathrm{epi}\, h}$ is $h$ itself; besides, it contains $\mathrm{epi}\, g$, and $\operatorname {co}(\mathrm{epi}\, g)$ as well (see Proposition A.1.3.4). In a word, there holds $h=\ell _{\mathrm{epi}\, h}\le \ell _{\operatorname {co}\, \mathrm{epi}\, g}=f_1$ and we conclude $f_2\le f_1$ since $h$ was arbitrary in $\Gamma $.

[$f_3\le f_2$] We have to prove $f_3\in \Gamma $, and the result will follow by definition of $f_2$; clearly $f_3\le g$ (take $\alpha \in \Delta _1$!), so it suffices to establish $f_3\le \operatorname {Conv}\mathbb {R}^n$. First, with $(s,b)$ of (2.5.1) and all $x,\ \{ x_j\} $ and $\{ \alpha _j\} $ as described by (2.5.2),

\[ \sum _{j=1}^k \alpha _j g(x_j) \ge \sum _{j=1}^k \alpha _j(\langle s,x_j\rangle - b) = \langle s,x\rangle - b; \]

hence $f_3$ is minorized by the affine function $\langle s,\cdot \rangle - b$. Now, take two points $(x,r)$ and $(x',r')$ in the strict epigraph of $f_3$. By definition of $f_3$, there are $k,\ \{ \alpha _j\} ,\ \{ x_j\} $ as described in (2.5.2), and likewise $k',\ \{ \alpha '_j\} ,\ \{ x'_j\} $, such that $\sum _{j=1}^k \alpha _j g(x_j){\lt}r$ and likewise $\sum _{j=1}^{k'} \alpha '_j g(x'_j){\lt}r'$.

For arbitrary $t\in ]0,1[$, we obtain by convex combination

\[ \sum _{j=1}^k t\alpha _j g(x_j) + \sum _{j=1}^{k'} (1-t)\alpha '_j g(x'_j) {\lt} tr + (1-t)r'. \]

Observe that

\[ \sum _{j=1}^k t\alpha _j x_j + \sum _{j=1}^{k'} (1-t)\alpha '_j x'_j = tx + (1-t)x', \]

i.e. we have in the lefthand side a convex decomposition of $tx+(1-t)x'$ in $k+k'$ elements; therefore, by definition of $f_3$:

\[ f_3(tx+(1-t)x') \le \sum _{j=1}^k t\alpha _j g(x_j) + \sum _{j=1}^{k'} (1-t)\alpha '_j g(x'_j) \]

and we have proved that $\mathrm{epi}_s f_3$ is a convex set: $f_3$ is convex.

Let $x\in \mathbb {R}^n$ and take an arbitrary convex decomposition $x=\sum _{j=1}^k \alpha _j x_j$, with $\alpha _j$ and $x_j$ as described in (2.5.2). Since $(x_j,g(x_j))\in \operatorname {epi} g$ for $j=1,\ldots ,k$,

\[ \biggl(x,\sum _{j=1}^k \alpha _j g(x_j)\biggr)\in \operatorname {co}\operatorname {epi} g \]

and this implies $f_1(x)\le \sum _{j=1}^k \alpha _j g(x_j)$ by definition of $f_1$. Because the decomposition of $x$ was arbitrary within (2.5.2), this implies $f_1(x)\le f_3(x)$.

Proposition 2.2.9

Let $g$ satisfy the hypotheses of Proposition 2.5.1. Then the three functions below

\[ \bar f_1(x):=\inf \{ r:(x,r)\in \overline{\operatorname {epi} g}\} , \qquad \bar f_2(x):=\sup \{ h(x):\ h\in \operatorname {Conv}\mathbb {R}^n,\ h\le g\} , \]

\[ \bar f_3(x):=\sup \{ \langle s,x\rangle -b:\ \langle s,y\rangle -b\le g(y)\text{ for all }y\in \mathbb {R}^n\} \]

are closed, convex, and coincide on $\mathbb {R}^n$ with the closure of the function constructed in Proposition 2.5.1.

Definition 2.2.4 Convex Hulls of a Function

Let $g:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} $, not identically $+\infty $, be minorized by an affine function. The common function $f_1=f_2=f_3$ of Proposition 2.5.1 is called the convex hull of $g$, denoted by $\operatorname {co}g$. The closed convex hull of $g$ is any of the functions described by Proposition 2.5.2; it is denoted by $\overline{\operatorname {co}}\, g$ or $\operatorname {cl}\, \operatorname {co}g$.

Proposition 2.2.10

Let $g_1,\dots ,g_m$ be in $\operatorname {Conv}\mathbb {R}^n$, all minorized by the same affine function. Then the convex hull of their infimum is the function

\[ \mathbb {R}^n \ni x \mapsto [\operatorname {co}(\min _j g_j)](x)= \inf \Big\{ \sum _{j=1}^m \alpha _j g_j(x_j)\; :\; \alpha \in \Delta _m,\; x_j\in \operatorname {dom} g_j,\; \sum _{j=1}^m\alpha _j x_j=x\Big\} . \tag {2.5.3} \]

Proof ▶

Apply Example A.1.3.5 to the convex sets $C_j=\operatorname {epi} g_j$.

2.3 Local and Global Behavior of a Convex Function

2.3.1 Continuity properties

Lemma 2.3.1

Let $f\in \operatorname {Conv}\mathbb {R}^n$ and suppose there are $x_0,\ \delta ,\ m$ and $M$ such that

\[ m\le f(x)\le M\quad \text{for all }x\in B(x_0,2\delta ). \]

Then $f$ is Lipschitzian on $B(x_0,\delta )$; more precisely: for all $y$ and $y'$ in $B(x_0,\delta )$,

\begin{equation} \label{eq:3.1.1} |f(y)-f(y')|\le \frac{M-m}{\delta }\| y-y'\| . \end{equation}

Proof ▶

Look at Fig. 3.1.1: with two different $y$ and $y'$ in $B(x_0,\delta )$, take

\[ y'':=y'+\delta \, \frac{y'-y}{\| y'-y\| }\in B(x_0,2\delta ); \]

by construction, $y'$ lies on the segment $[y,y'']$, namely

\[ y'=\frac{\| y'-y\| }{\delta +\| y'-y\| }\, y''+\frac{\delta }{\delta +\| y'-y\| }\, y. \]

Applying the convexity of $f$ and using the postulated bounds, we obtain

\[ f(y')-f(y)\le \frac{\| y'-y\| }{\delta +\| y'-y\| }\bigl[f(y'')-f(y)\bigr]\le \frac{1}{\delta }\| y'-y\| (M-m). \]

Then, it suffices to exchange $y$ and $y'$ to prove (3.1.1).

Theorem 2.3.1

With $f\in \operatorname {Conv}\mathbb {R}^n$, let $S$ be a convex compact subset of $\operatorname {ri}\text{dom }f$. Then there exists $L=L(S)\ge 0$ such that

\[ |f(x)-f(x')|\le L\| x-x'\| \quad \text{for all }x\text{ and }x'\text{ in }S. \tag {3.1.2} \]

Preliminaries ▶

First of all, our statement ignores $x$-values outside the affine hull of the convex set $\text{dom }f$. Instead of $\mathbb {R}^n$, it can be formulated in $\mathbb {R}^d$, where $d$ is the dimension of $\text{dom }f$; alternatively, we may assume $\text{ri }\text{dom }f=\int \text{dom }f$, which will simplify the writing.

Make this assumption and let $x_0\in S$. We will prove that there are $\delta =\delta (x_0){\gt}0$ and $L=L(x_0,\delta )$ such that the ball $B(x_0,\delta )$ is included in $\int \text{dom }f$ and

\[ |f(y)-f(y')|\le L\| y-y'\| \quad \text{for all }y\text{ and }y'\text{ in }B(x_0,\delta ). \tag {3.1.3} \]

If this holds for all $x_0\in S$, the corresponding balls $B(x_0,\delta )$ will provide a covering of the compact set $S$, from which we will extract a finite covering $(x_1,\delta _1,L_1),\ldots ,(x_k,\delta _k,L_k)$. With these balls, we will divide an arbitrary segment $[x,x']$ of the convex set $S$ into finitely many subsegments, of endpoints $y_0:=x,\ldots ,y_i,\ldots ,y_\ell :=x'$. Ordering properly the $y_i$’s, we will have $\| x-x'\| =\sum _{i=1}^\ell \| y_i-y_{i-1}\| $; furthermore, $f$ will be Lipschitzian on each $[y_{i-1},y_i]$ with the common constant $L:=\max \{ L_1,\ldots ,L_k\} $. The required Lipschitz property (3.1.2) will follow.

[Main Step] To establish (3.1.3), we use Lemma 3.1.1, which requires boundedness of $f$ in the neighborhood of $x_0$. For this, we construct as in the proof of Theorem A.2.1.3 (see Fig. A.2.1.1) a simplex $\Delta =\operatorname {co}\{ v_0,\dots ,v_n\} \cap \operatorname {dom}f$ having $x_0$ in its interior: we can take $\delta {\gt}0$ such that $B(x_0,2\delta )\subset \Delta $.

Then any $y\in B(x_0,2\delta )$ can be written: $y=\sum _{i=0}^n\alpha _i v_i$ with $\alpha \in \Delta _{n+1}$, so that the convexity of $f$ gives

\[ f(y)\le \sum _{i=0}^n\alpha _i f(v_i)\le \max \{ f(v_0),\dots ,f(v_n)\} =: M . \]

On the other hand, Proposition 1.2.1 tells us that $f$ is bounded from below, say by $m$, on this very same $B(x_0,2\delta )$. Our claim is proved: we have singled out $\delta {\gt}0$ such that $m\le f(y)\le M$ for all $y\in B(x_0,2\delta )$.

Theorem 2.3.2

Let the convex functions $f_k:\mathbb {R}^n\to \mathbb {R}$ converge pointwise for $k\to +\infty $ to $f:\mathbb {R}^n\to \mathbb {R}$. Then $f$ is convex and, for each compact set $S$, the convergence of $f_k$ to $f$ is uniform on $S$.

Proof ▶

Convexity of $f$ is trivial: pass to the limit in the definition (1.1.1) itself. For uniformity, we want to use Lemma 3.1.1, so we need to bound $f_k$ on $S$ independently of $k$; thus, let $r{\gt}0$ be such that $S\subset B(0,r)$.

[Step 1] First the function $g:=\sup _k f_k$ is convex, and $g(x){\lt}+\infty $ for all $x$ because the convergent sequence $(f_k(x))_k$ is certainly bounded. Hence, $g$ is continuous and therefore bounded, say by $M$, on the compact set $B(0,2r)$:

\[ f_k(x)\le g(x)\le M\qquad \text{for all $k$ and all $x\in B(0,2r)$.} \]

Second, the convergent sequence $(f_k(0))_k$ is bounded from below:

\[ \mu \le f_k(0)\qquad \text{for all $k$.} \]

Then, for $x\in B(0,2r)$ and all $k$, use convexity on $[-x,x]\subset B(0,2r)$:

\[ 2\mu \le 2f_k(0)\le f_k(x)+f_k(-x)\le f_k(x)+M, \]

i.e. the $f_k$’s are bounded from below, independently of $k$. Thus, we are within the conditions of Lemma 3.1.1: there is some $L$ (independent of $k$) such that

\[ |f_k(y)-f_k(y')| \le L\| y-y'\| \quad \text{for all }k\text{ and all }y,y'\in B(0,r). \tag {3.1.5} \]

Naturally, the same Lipschitz property is transmitted to the limiting function $f$.

[Step 2] Now fix $\varepsilon {\gt}0$. Cover $S$ by the balls $B(x,\varepsilon )$ for $x$ describing $S$, and extract a finite covering $S\subset B(x_1,\varepsilon )\cup \cdots \cup B(x_m,\varepsilon )$. With $x$ arbitrary in $S$, take an $x_i$ such that $x\in B(x_i,\varepsilon )$. There is $k_{i,\varepsilon }$ such that, for all $k\ge k_{i,\varepsilon }$,

\[ |f_k(x)-f(x)| \le |f_k(x)-f_k(x_i)|+|f_k(x_i)-f(x_i)|+|f(x_i)-f(x)| \le (2L+1)\varepsilon \]

where we have also used (3.1.5), knowing that $x$ and $x_i$ are in $S\subset B(0,r)$. The above inequality is then valid uniformly in $x$, providing that

\[ k\ge \max \{ k_{1,\varepsilon },\ldots ,k_{m,\varepsilon }\} =: k_\varepsilon . \]

2.3.2 Behavior at infinity

Proposition 2.3.1

For $f\in \overline{\operatorname {Conv}}\mathbb {R}^n$, the asymptotic cone of $\operatorname {epi} f$ is the epigraph of the function $f'_{\infty }\in \overline{\operatorname {Conv}}\mathbb {R}^n$ defined by

\[ d\mapsto f'_{\infty }(d):=\sup _{t{\gt}0}\frac{f(x_0+td)-f(x_0)}{t} \; =\; \lim _{t\to +\infty }\frac{f(x_0+td)-f(x_0)}{t}, \tag {3.2.2} \]

where $x_0$ is arbitrary in $\operatorname {dom} f$.

Proof ▶

Since $(x_0,f(x_0))\in \operatorname {epi} f$, (3.2.1) tells us that $(d,\rho )\in (\operatorname {epi} f)_{\infty }$ if and only if $f(x_0+td)\le f(x_0)+t\rho $ for all $t{\gt}0$, which means

\[ \sup _{t{\gt}0}\frac{f(x_0+td)-f(x_0)}{t}\le \rho . \tag {3.2.3} \]

In other words, $(\operatorname {epi} f)_\infty $ is the epigraph of the function whose value at $d$ is the left hand side of (3.2.3); and this is true no matter how $x_0$ has been chosen in $\operatorname {dom} f$. The rest follows from the fact that the difference quotient in (3.2.3) is closed convex in $d$, and increasing in $t$ (the function $t\mapsto f(x_0+td)$ is convex and enjoys the property of increasing slopes, namely Proposition 0.6.1). $\square $

Definition 2.3.1

(Asymptotic function) The function $f'_{\infty }$ of Proposition 3.2.1 is called the asymptotic function, or recession function, of $f$.

Proposition 2.3.2

Let $f\in \operatorname {Conv}\mathbb {R}^n$. All the nonempty sublevel-sets of $f$ have the same asymptotic cone, which is the sublevel-set of $f^\infty $ at the level $0$:

\[ \forall r\in \mathbb {R}\ \text{with } S_r(f)\neq \varnothing ,\qquad [S_r(f)]_\infty =\{ d\in \mathbb {R}^n:\ f^\infty (d)\le 0\} . \]

In particular, the following statements are equivalent:

There is $r$ for which $S_r(f)$ is nonempty and compact;
all the sublevel-sets of $f$ are compact;
$f^\infty _\circ (d){\gt}0$ for all nonzero $d\in \mathbb {R}^n$.

Proof ▶

By definition (A.2.2.1), a direction $d$ is in the asymptotic cone of the nonempty sublevel-set $S_r(f)$ if and only if

\[ x\in S_r(f)\qquad \Longrightarrow \qquad [x+td\in S_r(f)\ \text{for all }t{\gt}0], \]

which can also be written — see (1.1.4):

\[ (x,r)\in \operatorname {epi} f\qquad \Longrightarrow \qquad (x+td,r+t\times 0)\in \operatorname {epi} f\ \text{ for all }t{\gt}0; \]

and this in turn just means that $(d,0)\in (\operatorname {epi} f)_\infty =\operatorname {epi} f^\infty _\circ $. We have proved the first part of the theorem.

A particular case is when the sublevel-set $S_0(f^\infty _\circ )$ is reduced to the singleton $\{ 0\} $, which exactly means (iii). This is therefore equivalent to $[S_r(f)]_\infty =\{ 0\} $ for all $r\in \mathbb {R}$ with $S_r(f)\neq \emptyset $, which means that $S_r(f)$ is compact (Proposition A.2.2.3). The equivalence between (i), (ii) and (iii) is proved.

Definition 2.3.2

(Coercivity) The functions $f\in \operatorname {Conv}\mathbb {R}^n$ satisfying (i), (ii) or (iii) are called $0$-coercive. Equivalently, the $0$-coercive functions are those that “increase at infinity”:

\[ f(x)\to +\infty \qquad \text{whenever}\qquad \| x\| \to +\infty , \]

and closed convex $0$-coercive functions achieve their minimum over $\mathbb {R}^n$.

An important particular case is when $f'_\infty (d)=+\infty $ for all $d\neq 0$, i.e. when $f'_\infty = \iota _{\{ 0\} }$. It can be seen that this means precisely

\[ \frac{f(x)}{\| x\| }\to +\infty \qquad \text{whenever}\qquad \| x\| \to +\infty . \]

(to establish this equivalence, extract a cluster point of $(x_k/\| x_k\| )$, and use the lower semi-continuity of $f'_\infty $). In words: at infinity, $f$ increases to infinity faster than any affine function; such functions are called $1$-coercive, or sometimes just coercive.

Proposition 2.3.3

A function $f\in \operatorname {Conv}\mathbb {R}^n$ is Lipschitzian on the whole of $\mathbb {R}^n$ if and only if $f'_\infty $ is finite on the whole of $\mathbb {R}^n$. The best Lipschitz constant for $f$ is then

\[ \sup \{ f'_\infty (d):\| d\| =1\} . \tag {3.2.4} \]

Proof ▶

When the (convex) function $f'_\infty $ is finite-valued, it is continuous (§3.1) and therefore bounded on the compact unit sphere:

\[ \sup \{ f'_\infty (d):\| d\| =1\} =: L {\lt} +\infty , \]

which implies by positive homogeneity

\[ f'_\infty (d) \le L\| d\| \quad \text{for all }d\in \mathbb {R}^n. \]

Now use the definition (3.2.2) of $f'_\infty $:

\[ f(x+d)-f(x)\le L\| d\| \quad \text{for all }x\in \operatorname {dom}f\text{ and }d\in \mathbb {R}^n; \]

thus, $\operatorname {dom}f$ is the whole space ($f(x+d){\lt}+\infty $ for all $d$) and we do obtain that $L$ is a global Lipschitz constant for $f$.

Conversely, let $f$ have a global Lipschitz constant $L$. Pick $x_0\in \operatorname {dom}f$ and plug the inequality

\[ f(x_0+td)-f(x_0)\le Lt\| d\| \quad \text{for all }t{\gt}0\text{ and }d\in \mathbb {R}^n \]

into the definition (3.2.2) of $f'_\infty $ to obtain $f'_\infty (d)\le L\| d\| $ for all $d\in \mathbb {R}^n$.

It follows that $f'_\infty $ is finite everywhere, and the value (3.2.4) does not exceed $L$.

Proposition 2.3.4 3.2.8

Let $f_1,\dots ,f_m$ be $m$ functions of $\text{Conv }\mathbb {R}^n$, and $t_1,\dots ,t_m$ be positive numbers. Assume that there is $x_0$ at which each $f_j$ is finite. Then,

\[ \text{for } f:=\sum _{j=1}^m t_j f_j,\qquad \text{we have } f'_\infty =\sum _{j=1}^m t_j(f_j)'_\infty . \]
Let $\{ f_j\} _{j\in J}$ be a family of functions in $\text{Conv }\mathbb {R}^n$. Assume that there is $x_0$ at which $\sup _{j\in J} f_j(x_0){\lt}+\infty $. Then,

\[ \text{for } f:=\sup _{j\in J} f_j,\qquad \text{we have } f'_\infty =\sup _{j\in J}(f_j)'_\infty . \]
Let $A:\mathbb {R}^n\to \mathbb {R}^m$ be affine with linear part $A_0$, and let $f\in \text{Conv }\mathbb {R}^m$. Assume that $A(\mathbb {R}^n)\cap \text{dom }f\neq \emptyset $. Then $(f\circ A)'_\infty =f'_\infty \circ A_0$.

2.4 First- and Second-order Differentiation

2.4.1 Differentiable convex functions

Theorem 2.4.1

Let $f$ be a function differentiable on an open set $\Omega \subset \mathbb {R}^n$, and let $C$ be a convex subset of $\Omega $. Then

$f$ is convex on $C$ if and only if

\[ f(x)\ge f(x_0)+\langle \nabla f(x_0),\, x-x_0\rangle \quad \text{for all }(x_0,x)\in C\times C; \tag {4.1.1} \]
$f$ is strictly convex on $C$ if and only if strict inequality holds in (4.1.1) whenever $x\neq x_0$;
$f$ is strongly convex with modulus $c$ on $C$ if and only if, for all $(x_0,x)\in C\times C$,

\[ f(x)\ge f(x_0)+\langle \nabla f(x_0),x-x_0\rangle +\tfrac {1}{2}c\| x-x_0\| ^2. \tag {4.1.2} \]

(i) ▶

Let $f$ be convex on $C$: for arbitrary $(x_0,x)\in C\times C$ and $\alpha \in ]0,1[$, we have from the definition (1.1.1) of convexity

\[ f(\alpha x+(1-\alpha )x_0)-f(x_0)\le \alpha [f(x)-f(x_0)]. \]

Divide by $\alpha $ and let $\alpha \downarrow 0$: observing that $\alpha x+(1-\alpha )x_0=x_0+\alpha (x-x_0)$, the lefthand side tends to $\langle \nabla f(x_0),x-x_0\rangle $ and (4.1.1) is established.

Conversely, take $x_1$ and $x_2$ in $C$, $\alpha \in ]0,1[$ and set $x_0:=\alpha x_1+(1-\alpha )x_2\in C$. By assumption,

\[ f(x_i)\ge f(x_0)+\langle \nabla f(x_0),x_i-x_0\rangle \quad \text{for }i=1,2 \tag {4.1.3} \]

and we obtain by convex combination

\[ \alpha f(x_1)+(1-\alpha )f(x_2)\ge f(x_0)+\langle \nabla f(x_0),\alpha x_1+(1-\alpha )x_2-x_0\rangle \]

which, after simplification, is just the relation of definition (1.1.1).

[(ii)] If $f$ is strictly convex, we have for $x_0\ne x$ in $C$ and $\alpha \in ]0,1[$,

\[ f(x_0+\alpha (x-x_0))-f(x_0){\lt}\alpha [f(x)-f(x_0)]; \]

but $f$ is in particular convex and we can use (i):

\[ \langle \nabla f(x_0),\alpha (x-x_0)\rangle \le f(x_0+\alpha (x-x_0))-f(x_0), \]

so the required strict inequality follows.

For the converse, proceed as for (i), starting from strict inequalities in (4.1.3).

[(iii)] Using Proposition 1.1.2, just apply (i) to the function $f-\tfrac {1}{2}c\| \cdot \| ^2$, which is of course differentiable.

Definition 2.4.1

Let $C\subset \mathbb {R}^n$ be convex. The mapping $F: C\to \mathbb {R}^n$ is said to be monotone [resp. strictly monotone, resp. strongly monotone with modulus $c{\gt}0$] on $C$ when, for all $x$ and $x'$ in $C$,

\[ \langle F(x)-F(x'),\, x-x'\rangle \ge 0 \]

[resp. $\langle F(x)-F(x'),\, x-x'\rangle {\gt} 0$ whenever $x\neq x'$, resp. $\langle F(x)-F(x'),\, x-x'\rangle \ge c\| x-x'\| ^2$].

Theorem 2.4.2

Let $f$ be a function differentiable on an open set $\Omega \subset \mathbb {R}^n$, and let $C$ be a convex subset of $\Omega $. Then, $f$ is convex [resp. strictly convex, resp. strongly convex with modulus $c$] on $C$ if and only if its gradient $\nabla f$ is monotone [resp. strictly monotone, resp. strongly monotone with modulus $c$] on $C$.

Proof ▶

We combine the “convex $\Leftrightarrow $ monotone” and “strongly convex $\Leftrightarrow $ strongly monotone” cases by accepting the value $c=0$ in the relevant relations such as (4.1.2).

Thus, let $f$ be [strongly] convex on $C$: in view of Theorem 4.1.1, we can write for arbitrary $x_0$ and $x$ in $C$:

\[ f(x)\ge f(x_0)+\langle \nabla f(x_0),x-x_0\rangle +\tfrac {1}{2}c\| x-x_0\| ^2 \]

\[ f(x_0)\ge f(x)+\langle \nabla f(x),x_0-x\rangle +\tfrac {1}{2}c\| x_0-x\| ^2, \]

and mere addition shows that $\nabla f$ is [strongly] monotone.

Conversely, let $(x_0,x_1)$ be a pair of elements in $C$. Consider the univariate function $t\mapsto \varphi (t):=f(x_t)$, where $x_t:=x_0+t(x_1-x_0)$; for $t$ in an open interval containing $[0,1]$, $x_t\in \Omega $ and $\varphi $ is well-defined and differentiable; its derivative at $t$ is $\varphi '(t)=\langle \nabla f(x_t),x_1-x_0\rangle $. Thus, we have for all $0\le t'{\lt}t\le 1$

\[ \varphi '(t)-\varphi '(t')=\langle \nabla f(x_t)-\nabla f(x_{t'}),\, x_1-x_0\rangle =\frac{1}{t-t'}\langle \nabla f(x_t)-\nabla f(x_{t'}),\, x_t-x_{t'}\rangle \tag {4.1.4} \]

and the monotonicity relation for $\nabla f$ shows that $\varphi '$ is increasing, $\varphi $ is therefore convex (Corollary 0.6.5).

For strong convexity, set $t'=0$ in (4.1.4) and use the strong monotonicity relation for $\nabla f$:

\[ \varphi '(t)-\varphi '(0)\ge \frac{1}{t}c\| x_t-x_0\| ^2 = tc\| x_1-x_0\| ^2. \tag {4.1.5} \]

Because the differentiable convex function $\varphi $ is the integral of its derivative, we can write

\[ \varphi (1)-\varphi (0)-\varphi '(0)=\int _0^1[\varphi '(t)-\varphi '(0)]\, dt \ge \tfrac {1}{2}c\| x_1-x_0\| ^2 \]

which, by definition of $\varphi $, is just (4.1.2) (the coefficient $1/2$ is $\int _0^1 t\, dt!$).

The same technique proves the “strictly monotone $\Leftrightarrow $ strictly convex” case; then, (4.1.5) becomes a strict inequality — with $c=0$ — and remains so after integration.

2.4.2 Nondifferentiable convex functions

Proposition 2.4.1

For $f\in \operatorname {Conv}\mathbb {R}^n$ and $x\in \operatorname {int}\text{dom }f$, the three statements below are equivalent:

The function

\[ \mathbb {R}^n\ni d\mapsto \lim _{t\downarrow 0}\frac{f(x+td)-f(x)}{t} \]

is linear in $d$;
for some basis of $\mathbb {R}^n$ in which $x=(\xi ^1,\dots ,\xi ^n)$, the partial derivatives $\dfrac {\partial f}{\partial \xi ^i}(x)$ exist at $x$, for $i=1,\dots ,n$;
$f$ is differentiable at $x$.

Proof ▶

First of all, remember from Theorem 0.6.3 that the one-dimensional function $t\mapsto f(x+td)$ has half-derivatives at $0$: the limits considered in (i) exist for all $d$. We will denote by $\{ b_1,\dots ,b_n\} $ the basis postulated in (ii), so that $x=\sum _{i=1}^n \xi ^i b_i$.

Denote by $d\mapsto \ell (d)$ the function defined in (i); taking $d=\pm b_i$, realize that, when (i) holds,

\[ \lim _{\tau \downarrow 0}\frac{f(x+\tau b_i)-f(x)}{-\tau } =\ell (-b_i)=-\ell (b_i)=-\lim _{t\downarrow 0}\frac{f(x+t b_i)-f(x)}{t}. \]

This means that the two half-derivatives at $t=0$ of the function $t\mapsto f(x+t b_i)$ coincide: the partial derivative of $f$ at $x$ along $b_i$ exists, (ii) holds. That (iii) implies (i) is clear: when $f$ is differentiable at $x$,

\[ \lim _{t\downarrow 0}\frac{f(x+td)-f(x)}{t}=\langle \nabla f(x),d\rangle . \]

We do not really complete the proof here, because everything follows in a straightforward way from subsequent chapters. More precisely, [(ii) $\Rightarrow $ (i)] is Proposition C.1.1.6, which states that the function $\ell $ is linear on the space generated by the $b_i$’s, whenever it is linear along each $b_i$. Finally [(i) $\Rightarrow $ (iii)] results from Lemma D.2.1.1 and the proof goes as follows. One of the possible definitions of (iii) is:

\[ \lim _{t\downarrow 0,\; d'\to d}\frac{f(x+t d')-f(x)}{t}\quad \text{is linear in }d. \]

Because $f$ is locally Lipschitzian, the above limit exists whenever it exists for fixed $d'=d$—i.e. the expression in (i).

Theorem 2.4.3

Let $f\in \operatorname {Conv}\mathbb {R}^n$. The subset of $\operatorname {int}\operatorname {dom}f$ where $f$ fails to be differentiable is of zero (Lebesgue) measure.

2.4.3 Second-order differentiation

Theorem 2.4.4

Let $f$ be twice differentiable on an open convex set $\Omega \subset \mathbb {R}^n$. Then

$f$ is convex on $\Omega $ if and only if $\nabla ^2 f(x_0)$ is positive semi-definite for all $x_0\in \Omega $;
if $\nabla ^2 f(x_0)$ is positive definite for all $x_0\in \Omega $, then $f$ is strictly convex on $\Omega $;
$f$ is strongly convex with modulus $c$ on $\Omega $ if and only if the smallest eigenvalue of $\nabla ^2 f(\cdot )$ is minorized by $c$ on $\Omega $: for all $x_0\in \Omega $ and all $d\in \mathbb {R}^n$,

\[ \langle \nabla ^2 f(x_0)d,d\rangle \ge c\| d\| ^2. \]

Proof ▶

For given $x_0\in \Omega $, $d\in \mathbb {R}^n$ and $t\in \mathbb {R}$ such that $x_0+td\in \Omega $, we will set

\[ x_t:=x_0+td\quad \text{and}\quad \varphi (t):=f(x_t)=f(x+td), \]

so that $\varphi ''(t)=\langle \nabla ^2 f(x_t)d,d\rangle $.

[(i)] Assume $f$ is convex on $\Omega $; let $(x_0,d)$ be arbitrary in $\Omega \times \mathbb {R}^n$, with $d\neq 0$: $\varphi $ is then convex on the open interval $I:=\{ t\in \mathbb {R}:x_0+td\in \Omega \} $. It follows

\[ 0 \le \varphi ''(t) = \langle \nabla ^2 f(x_t)d,d\rangle \quad \text{for all }t\in I\ni 0 \]

and $\nabla ^2 f(x_0)$ is positive semi-definite.

Conversely, take an arbitrary $[x_0,x_1]\subset \Omega $, set $d:=x_1-x_0$ and assume $\nabla ^2 f(x_t)$ positive semi-definite, i.e. $\varphi ''(t)\ge 0$, for $t\in [0,1]$. Then Theorem 0.6.6 tells us that $\varphi $ is convex on $[0,1]$, i.e. $f$ is convex on $[x_0,x_1]$. The result follows since $x_0$ and $x_1$ were arbitrary in $\Omega $.

[(ii)] To establish the strict convexity of $f$ on $\Omega $, we prove that $\nabla f$ is strictly monotone on $\Omega $: Theorem 4.1.4 will apply. As above, take an arbitrary $[x_0,x_1]\subset \Omega $, $x_1\neq x_0$, $d:=x_1-x_0$, and apply the mean-value theorem to the function $\varphi '$, differentiable on $[0,1]$, for some $\tau \in ]0,1[$,

\[ \varphi '(1)-\varphi '(0)=\varphi ''(\tau )=\langle \nabla ^2 f(x_\tau )d,d\rangle {\gt}0 \]

and the result follows since

\[ \varphi '(1)-\varphi '(0)=\langle \nabla f(x_1)-\nabla f(x_0),x_1-x_0\rangle . \]

[(iii)] Using Proposition 1.1.2, apply (i) to the function $f-{\tfrac 12}c\| \, \cdot \, \| ^2$, whose Hessian operator is $\nabla ^2 f-cI_n$ and has the eigenvalues $\lambda -c$, with $\lambda $ describing the eigenvalues of $\nabla ^2 f$.