Convex Eval

2 Convex functions

2.1 Basic Definitions and Examples

2.1.1 The definitions of a Convex Function

Definition 2.1.1
#

Let \(C\) be a nonempty convex set in \(\mathbb {R}^n\). A function \(f : C \to \mathbb {R}\) is said to be convex on \(C\) when, for all pairs \((x,x')\in C\times C\) and all \(\alpha \in [0,1]\), there holds

\[ f(\alpha x + (1-\alpha )x') \le \alpha f(x) + (1-\alpha ) f(x'). \tag {1.1.1} \]
Proposition 2.1.1
#

The function \(f\) is strongly convex on \(C\) with modulus \(c\) if and only if the function \(f-\tfrac {1}{2}c\| \cdot \| ^{2}\) is convex on \(C\).

Proof

Use direct calculations in the definition (1.1.1) of convexity applied to the function \(f-\tfrac {1}{2}c\| \cdot \| ^{2}\), namely:

\[ f(\alpha x+(1-\alpha )x')-\tfrac {1}{2}c\| \alpha x+(1-\alpha )x'\| ^{2}\le \]
\[ \le \alpha f(x)+(1-\alpha )f(x')-\tfrac {1}{2}c\big[\alpha \| x\| ^{2}+(1-\alpha )\| x'\| ^{2}\big]. \]
Definition 2.1.2
#

(The Set \(\operatorname {Conv}\mathbb {R}^n\)) A function \(f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \), not identically \(+\infty \), is said to be convex when, for all \((x,x')\in \mathbb {R}^n\times \mathbb {R}^n\) and all \(\alpha \in [0,1]\), there holds

\[ f\big(\alpha x+(1-\alpha )x'\big)\le \alpha f(x)+(1-\alpha )f(x'), \]

considered as an inequality in \(\mathbb {R}\cup \{ +\infty \} \).

The class of such functions is denoted by \(\operatorname {Conv}\mathbb {R}^n\).

Definition 2.1.3
#

(Domain of a Function) The domain (or also effective domain) of \(f\in \text{Conv }\mathbb {R}^n\) is the nonempty set

\[ \text{dom }f := \{ x\in \mathbb {R}^n : f(x) {\lt} +\infty \} . \]
Definition 2.1.4
#

(Epigraph of a Function) Given \(f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \), not identically equal to \(+\infty \), the epigraph of \(f\) is the nonempty set

\[ \operatorname {epi} f := \{ (x,r)\in \mathbb {R}^n\times \mathbb {R} : r \ge f(x)\} . \]

Its strict epigraph \(\operatorname {epi}_s f\) is defined likewise, with “\(\ge \)” replaced by “\({\gt}\)” (beware that the word “strict” here has nothing to do with strict convexity).

Proposition 2.1.2
#

Let \(f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \) be not identically equal to \(+\infty \). The three properties below are equivalent:

  1. \(f\) is convex in the sense of Definition 1.1.3;

  2. its epigraph is a convex set in \(\mathbb {R}^n\times \mathbb {R}\);

  3. its strict epigraph is a convex set in \(\mathbb {R}^n\times \mathbb {R}\).

Proof

Left as an exercise.

Theorem 2.1.1
#

(Inequality of Jensen) Let \(f\in \operatorname {Conv}\mathbb {R}^n\). Then, for all collections \(\{ x_1,\dots ,x_k\} \) of points in \(\text{dom }f\) and all \(\alpha =(\alpha _1,\dots ,\alpha _k)\) in the unit simplex of \(\mathbb {R}^k\), there holds (inequality of Jensen in summation form)

\[ f\! \bigg(\sum _{i=1}^k \alpha _i x_i\bigg)\le \sum _{i=1}^k \alpha _i f(x_i). \]
Proof

The \(k\) points \((x_i,f(x_i))\in \mathbb {R}^n\times \mathbb {R}\) are clearly in \(\text{epi }f\), a convex set. Their convex combination

\[ \sum _i \alpha _i (x_i,f(x_i)) = \Big(\sum _i \alpha _i x_i,\sum _i \alpha _i f(x_i)\Big) \]

is also in \(\text{epi }f\) (Proposition A.1.3.3). This is just the claimed inequality.

Proposition 2.1.3
#

Let \(f\in \text{Conv }\mathbb {R}^n\). The relative interior of \(\operatorname {epi}f\) is the union over \(x\in \operatorname {ri}\, \operatorname {dom}f\) of the open half-lines with bottom endpoints at \(f(x)\):

\[ \operatorname {ri}\operatorname {epi}f=\{ (x,r)\in \mathbb {R}^n\times \mathbb {R}: x\in \operatorname {ri}\operatorname {dom}f,\ r{\gt}f(x)\} . \]
Proof

Since \(\operatorname {dom}f\) is the image of \(\operatorname {epi}f\) under the linear mapping “projection onto \(\mathbb {R}^n\)”, Proposition A.2.1.12 tells us that

\[ \operatorname {ri}\operatorname {dom}f\ \text{is the projection onto }\mathbb {R}^n\ \text{of }\operatorname {ri}\operatorname {epi}f. \tag {1.1.5} \]

Now take \(x\) arbitrary in \(\operatorname {ri}\operatorname {dom}f\). The subset of \(\operatorname {ri}\operatorname {epi}f\) that is projected onto \(x\) is just \(((\{ x\} \times \mathbb {R})\cap \operatorname {ri}\operatorname {epi}f)\), which in turn is \(\operatorname {ri}((\{ x\} \times \mathbb {R})\cap \operatorname {epi}f)\) (use Proposition A.2.1.10). This latter set is clearly \(\{ x\} \times (f(x),+\infty )\).

In summary, we have proved that, for \(x\in \operatorname {ri}\operatorname {dom}f\), \((x,r)\in \operatorname {ri}\operatorname {epi}f\) if and only if \(r{\gt}f(x)\). Together with (1.1.5), this proves our claim.

2.1.2 Special convex functions: Affinity and Closedness

Proposition 2.1.4
#

Any \(f\in \operatorname {Conv}\mathbb {R}^n\) is minorized by some affine function. More precisely: for any \(x_0\in \operatorname {ri}\operatorname {dom}f\), there is \(s\) in the subspace parallel to \(\operatorname {aff}\operatorname {dom}f\) such that

\[ f(x)\ge f(x_0)+\langle s,\, x-x_0\rangle \quad \text{for all }x\in \mathbb {R}^n. \]

In other words, the affine function can be forced to coincide with \(f\) at \(x_0\).

Proof

We know that \(\operatorname {dom}f\) is the image of \(\operatorname {epi}f\) under the linear mapping "projection onto \(\mathbb {R}^n\)". Look again at the definition of an affine hull (§A.1.3) to realize that

\[ \operatorname {aff}\operatorname {epi}f=(\operatorname {aff}\operatorname {dom}f)\times \mathbb {R}\, . \]

Denote by \(V\) the linear subspace parallel to \(\operatorname {aff}\operatorname {dom}f\), so that \(\operatorname {aff}\operatorname {dom}f=\{ x_0\} +V\) with \(x_0\) arbitrary in \(\operatorname {dom}f\); then we have

\begin{equation} \label{eq:1.2.1} \operatorname {aff}\operatorname {epi}f=\{ x_0+V\} \times \mathbb {R}. \end{equation}
1

We equip \(V\times \mathbb {R}\) and \(\mathbb {R}^n\times \mathbb {R}\) with the scalar product of product-spaces. With \(x_0\in \operatorname {ri}\operatorname {dom}f\), Proposition 1.1.9 tells us that \((x_0,f(x_0))\in \operatorname {rbd}\operatorname {epi}f\) and we can take a nontrivial hyperplane supporting \(\operatorname {epi}f\) at \((x_0,f(x_0))\): using Remark A.4.2.2 and (1), there are \(s=s_V\in V\) and \(\alpha \in \mathbb {R}\), not both zero, such that

\begin{equation} \label{eq:1.2.2} \langle s,x\rangle +\alpha r\le \langle s,x_0\rangle +\alpha f(x_0) \end{equation}
2

for all \((x,r)\) with \(f(x)\le r\). Note: this implies \(\alpha \le 0\) (let \(r\to +\infty \)!).

Because of our choice of \(s\) (in \(V\)) and \(x_0\) (in \(\operatorname {ri}\operatorname {dom}f\)), we can take \(\delta {\gt}0\) so small that \(x_0+\delta s\in \operatorname {dom}f\), for which (2) gives

\[ \delta \| s\| ^2\le \alpha \bigl[f(x_0)-f(x_0+\delta s)\bigr]{\lt}+\infty ; \]

this shows \(\alpha \neq 0\) (otherwise, both \(s\) and \(\alpha \) would be zero). Without loss of generality, we can assume \(\alpha =-1\); then (2) gives our affine function.

Proposition 2.1.5
#

For \(f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \), the following three properties are equivalent:

  1. \(f\) is lower semi-continuous on \(\mathbb {R}^n\);

  2. \(\operatorname {epi} f\) is a closed set in \(\mathbb {R}^n\times \mathbb {R}\);

  3. the sublevel-sets \(S_r(f)\) are closed (possibly empty) for all \(r\in \mathbb {R}\).

Proof

[(i) \(\Rightarrow \) (ii)] Let \((y_k,r_k)_k\) be a sequence of \(\operatorname {epi} f\) converging to \((x,r)\) for \(k\to +\infty \). Since \(f(y_k)\le r_k\) for all \(k\), the l.s.c. relation (1.2.3) readily gives

\[ r=\lim r_k \ge \liminf f(y_k)\ge \liminf _{y\to x} f(y)\ge f(x), \]

i.e. \((x,r)\in \operatorname {epi} f\).

[(ii) \(\Rightarrow \) (iii)] Construct the sublevel-sets \(S_r(f)\) as in Remark 1.1.7: the closed sets \(\operatorname {epi} f\) and \(\mathbb {R}^n\times \{ r\} \) have a closed intersection.

[(iii) \(\Rightarrow \) (i)] Suppose \(f\) is not lower semi-continuous at some \(x\): there is a (sub)sequence \((y_k)\) converging to \(x\) such that \(f(y_k)\) converges to \(\rho {\lt}f(x)\le +\infty \). Pick \(r\in [\rho ,f(x))\): for \(k\) large enough, \(f(y_k)\le r{\lt} f(x)\); hence \(S_r(f)\) contains the tail of \((y_k)\) but not its limit \(x\). Consequently, this \(S_r(f)\) is not closed.

Definition 2.1.5
#

(Closed Functions) The function \(f:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \) is said to be closed if it is lower semi-continuous everywhere, or if its epigraph is closed, or if its sublevel-sets are closed.

Definition 2.1.6 Closure of a Function
#

The closure (or lower semi-continuous hull) of a function \(f\) is the function \(\operatorname {cl}f:\mathbb {R}^n\to \mathbb {R}\cup \{ \pm \infty \} \) defined by:

\[ \operatorname {cl}f(x):=\liminf _{y\to x} f(y)\qquad \text{for all }x\in \mathbb {R}^n, \tag {1.2.4} \]

or equivalently by

\[ \operatorname {epi}(\operatorname {cl}f):=\operatorname {cl}\big(\operatorname {epi} f\big). \tag {1.2.5} \]
Proposition 2.1.6
#

Let \(f\in \operatorname {Conv}\mathbb {R}^n\) and \(x'\in \operatorname {ri}\operatorname {dom}f\). There holds (in \(\mathbb {R}\cup \{ +\infty \} \))

\[ \operatorname {cl}f(x)=\lim _{t\downarrow 0}f\big(x+t(x'-x)\big)\quad \text{for all }x\in \mathbb {R}^n. \]
Proof

Since \(x_t:=x+t(x'-x)\to x\) when \(t\downarrow 0\), we certainly have

\[ (\operatorname {cl}f)(x)\le \liminf _{t\downarrow 0}f\big(x+t(x'-x)\big). \]

We will prove the converse inequality by showing that

\[ \limsup _{t\downarrow 0}f\big(x+t(x'-x)\big)\le r\quad \text{for all }r\ge (\operatorname {cl}f)(x). \]

(Non-existence of such an \(r\) means that \(\operatorname {cl}f(x)=+\infty \), the proof is finished.)

Thus let \((x,r)\in \operatorname {epi}(\operatorname {cl}f)=\operatorname {cl}(\operatorname {epi}f)\). Pick \(r'{\gt}f(x')\), hence \((x',r')\in \operatorname {epi}f\) (Proposition 1.1.9). Applying Lemma A.2.1.6 to the convex set \(\operatorname {epi}f\), we see that

\[ t(x',r')+(1-t)(x,r)\in \operatorname {epi}f\subset \operatorname {epi}f\quad \text{for all }t\in ]0,1]. \]

This just means

\[ f\big(x+t(x'-x)\big)\le tr'+(1-t)r\quad \text{for all }t\in ]0,1] \]

and our required inequality follows by letting \(t\downarrow 0\).

Proposition 2.1.7
#

For \(f\in \text{Conv }\mathbb {R}^n\), there holds

\[ \operatorname {cl} f\in \text{Conv }\mathbb {R}^n; \tag {1.2.7} \]
\[ \operatorname {cl} f\text{ and }f\text{ coincide on the relative interior of }\text{dom }f. \tag {1.2.8} \]
Proof

We already know from Proposition A.1.2.6 that \(\text{epi }\operatorname {cl} f=\operatorname {cl}\text{epi }f\) is a convex set; also \(\operatorname {cl} f\le f\not\equiv +\infty \); finally, Proposition 1.2.1 guarantees in the relation of definition (1.2.4) that \(\operatorname {cl} f(x){\gt}-\infty \) for all \(x\): (1.2.7) does hold.

On the other hand, suppose \(x\in \text{ri }\text{dom }f\). Then the one-dimensional function \(\varphi (t)=f(x+td)\) is continuous at \(t=0\) (Theorem 0.6.2); it follows from Proposition 1.2.5 that \(\operatorname {cl} f\) coincides with \(f\) on \(\text{ri }\text{dom }f\); besides, \(\operatorname {cl} f(x)\) is obviously equal to \(f(x)=+\infty \) for all \(x\notin \text{cl }\text{dom }f\). Altogether, (1.2.8) is true.

Notation 1.2.7 (The Set \(\overline{\operatorname {Conv}}\mathbb {R}^n\)) The set of closed convex functions on \(\mathbb {R}^n\) is denoted by \(\overline{\operatorname {Conv}}\mathbb {R}^n\).

Proposition 2.1.8
#

The closure of \(f\in \text{Conv }\mathbb {R}^n\) is the supremum of all affine functions minorizing \(f\):

\[ \text{cl }f(x)=\sup _{(s,b)\in \mathbb {R}^n\times \mathbb {R}}\{ \langle s,x\rangle -b:\langle s,y\rangle -b\le f(y)\text{ for all }y\in \mathbb {R}^n\} . \tag {1.2.9} \]
Proof

A closed half-space containing \(\text{epi }f\) is characterized by a nonzero vector \((s,\alpha )\in \mathbb {R}^n\times \mathbb {R}\) and a real number \(b\) such that

\[ \langle s,x\rangle +\alpha r\le b\qquad \text{for all }(x,r)\in \text{epi }f \tag {1.2.10} \]

(we equip the graph-space \(\mathbb {R}^n\times \mathbb {R}\) with the scalar product of a product-space). Let us denote by \(\Sigma \subset \mathbb {R}^n\times \mathbb {R}\times \mathbb {R}\) the index-set of such triples \(\sigma =(s,\alpha ,b)\), with corresponding half-space

\[ H^-_{\sigma }:=\{ (x,r):\langle s,x\rangle +\alpha r\le b\} . \tag {1.2.11} \]

In other words, \(\operatorname {epi}(\overline{cl}\, f)=\overline{\operatorname {epi}f}=\bigcap _{\sigma \in \Sigma }H^-_{\sigma }\).

Because of the particular nature of an epigraph, (1.2.10) implies \(\alpha \le 0\) (let \(r\to +\infty \)) and, by positive homogeneity, the values \(\alpha =0\) and \(\alpha =-1\) suffice: \(\Sigma \) can be partitioned in

\[ \Sigma _1 := \{ (s,-1,b):(1.2.10)\ \text{holds with }\alpha =-1\} \]

and

\[ \Sigma _0 := \{ (s,0,b):(1.2.10)\ \text{holds with }\alpha =0\} . \]

Indeed, \(\Sigma _1\) corresponds to affine functions minorizing \(f\) (Proposition 1.2.1 tells us that \(\Sigma _1\neq \varnothing \)) and \(\Sigma _0\) to closed half-spaces of \(\mathbb {R}^{n}\) containing \(\operatorname {dom}f\) (note that \(\Sigma _0=\varnothing \) if \(\operatorname {dom}f=\mathbb {R}^n\)).

We have to prove that, even when \(\Sigma _0\neq \varnothing \), intersecting the half-spaces \(H^-_{\sigma }\) over \(\Sigma \) or over \(\Sigma _1\) produces the same set, namely \(\operatorname {epi}f\). For this we take arbitrary \(\sigma _0=(s_0,0,b_0)\in \Sigma _0\) and \(\sigma _1=(s_1,-1,b_1)\in \Sigma _1\), we set

\[ \sigma (t):=(s_1+ts_0,-1,b_1+t b_0)\in \Sigma _1\quad \text{for all }t\ge 0, \]

and we prove (see Fig. 1.2.1)

\[ H^-_{\sigma _0}\cap H^-_{\sigma _1}=\bigcap _{t\ge 0}H^-_{\sigma (t)}=:\, H^-. \]

Fig. 1.2.1. Closing a convex epigraph

It results directly from the definition (1.2.11) that an \((x,r)\) in \(H^-_{\sigma _0}\cap H^-_{\sigma _1}\) satisfies

\[ \langle s_1+ts_0,x\rangle -(b_1+tb_0)\le r\quad \text{for all }t\ge 0,\tag {1.2.12} \]

i.e. \((x,r)\in H^-\). Conversely, take \((x,r)\in H^-\). Set \(t=0\) in (1.2.12) to see that \((x,r)\in H^-_{\sigma _1}\). Also, divide by \(t{\gt}0\) and let \(t\to +\infty \) to see that \((x,r)\in H^-_{\sigma _0}\). The proof is complete.

2.1.3 First examples

Theorem 2.1.2
#

Let \(C\) be a nonempty subset of \(\mathbb {R}^n\times \mathbb {R}\) satisfying ??, and let its lower-bound function \(\ell _C\) be defined by ??.

  1. If \(C\) is convex, then \(\ell _C\in \operatorname {Conv}\mathbb {R}^n\);

  2. If \(C\) is closed convex, then \(\ell _C\in \overline{\operatorname {Conv}}\mathbb {R}^n\).

Proof

We use the analytical definition (1.1.1). Take arbitrary \(\varepsilon {\gt}0\), \(\alpha \in ]0,1[\) and \((x_i,r_i)\in C\) such that \(r_i\le \ell _C(x_i)+\varepsilon \) for \(i=1,2\).

When \(C\) is convex, \((\alpha x_1+(1-\alpha )x_2,\alpha r_1+(1-\alpha )r_2)\in C\), hence

\[ \ell _C(\alpha x_1+(1-\alpha )x_2)\le \alpha r_1+(1-\alpha )r_2 \le \alpha \ell _C(x_1)+(1-\alpha )\ell _C(x_2)+\varepsilon . \]

The convexity of \(\ell _C\) follows, since \(\varepsilon {\gt}0\) was arbitrary; (i) is proved.

Now take a sequence \((x_k,\rho _k)_k\subset \operatorname {epi}\ell _C\) converging to \((x,\rho )\); we have to prove \(\ell _C(x)\le \rho \) (cf. Proposition 1.2.2). By definition of \(\ell _C(x_k)\), we can select, for each positive integer \(k\), a real number \(r_k\) such that \((x_k,r_k)\in C\) and

\begin{equation} \label{eq:1.3.7} \ell _C(x_k)\le r_k\le \ell _C(x_k)+\tfrac {1}{k}\le \rho _k+\tfrac {1}{k}. \end{equation}
3

We deduce first that \((r_k)\) is bounded from above. Also, when \(\ell _C\) is convex, Proposition 1.2.1 implies the existence of an affine function minorizing \(\ell _C\): \((r_k)\) is bounded from below.

Extracting a subsequence if necessary, we may assume \(r_k\to r\). When \(C\) is closed, \((x,r)\in C\), hence \(\ell _C(x)\le r\); but pass to the limit in 3 to see that \(r\le \rho \); the proof is complete.

2.2 Functional Operations Preserving Convexity

2.2.1 Operations preserving closedness

Proposition 2.2.1
#

Let \(f_1,\dots ,f_m\) be in \(\text{Conv }\mathbb {R}^n\) [resp. in \(\overline{\text{Conv }}\mathbb {R}^n\)], let \(t_1,\dots ,t_m\) be positive numbers, and assume that there is a point where all the \(f_j\)’s are finite. Then the function \(f:=\sum _{j=1}^m t_j f_j\) is in \(\text{Conv }\mathbb {R}^n\) [resp. in \(\overline{\text{Conv }}\mathbb {R}^n\)].

Proof

The convexity of \(f\) is readily proved from the relation of definition (1.1.1). As for its closedness, start from

\[ \liminf _{y\to x} t_j f_j(y)=t_j\liminf _{y\to x} f_j(y)\ge t_j f_j(x) \]

(valid for \(t_j{\gt}0\) and \(f_j\) closed); then note that the lim inf of a sum is not smaller than the sum of lim inf’s.

Proposition 2.2.2
#

Let \(\{ f_j\} _{j\in J}\) be an arbitrary family of convex [resp. closed convex] functions. If there exists \(x_0\) such that \(\sup _{j} f_j(x_0) {\lt} +\infty \), then their pointwise supremum \(f := \sup \{ f_j : j\in J\} \) is in \(\mathrm{Conv}\, \mathbb {R}^n\) [resp. in \(\mathrm{Conv}_\mathrm {cl}\, \mathbb {R}^n\)].

Proof

The key property is that a supremum of functions corresponds to an intersection of epigraphs: \(\operatorname {epi} f = \bigcap _{j\in J}\operatorname {epi} f_j\), which conserves convexity and closedness. The only needed restriction is nonemptiness of this intersection.

Proposition 2.2.3
#

Let \(f\in \operatorname {Conv}\mathbb {R}^n\) [resp. \(\overline{\operatorname {Conv}}\mathbb {R}^n\)] and let \(A\) be an affine mapping from \(\mathbb {R}^m\) to \(\mathbb {R}^n\) such that \(\operatorname {Im}A\cap \operatorname {dom}f\neq \varnothing \). Then the function

\[ f\circ A : \mathbb {R}^m \supseteq x\mapsto (f\circ A)(x)=f(A(x)) \]

is in \(\operatorname {Conv}\mathbb {R}^m\) [resp. \(\overline{\operatorname {Conv}}\mathbb {R}^m\)].

Proof

Clearly \((f\circ A)(x){\gt}-\infty \) for all \(x\); besides, there exists by assumption \(y=A(x)\in \mathbb {R}^n\) such that \(f(y){\lt}+\infty \). To check convexity, it suffices to plug the relation

\[ A(\alpha x+(1-\alpha )x')=\alpha A(x)+(1-\alpha )A(x') \]

into the analytical definition (1.1.1) of convexity. As for closedness, it comes readily from the continuity of \(A\) when \(f\) is itself closed.

Proposition 2.2.4

Let \(f\in \operatorname {Conv}\mathbb {R}^n\) [resp. \(\overline{\operatorname {Conv}}\mathbb {R}^n\)] and let \(g\in \operatorname {Conv}\mathbb {R}\) [resp. \(\overline{\operatorname {Conv}}\mathbb {R}\)] be increasing. Assume that there is \(x_0\in \mathbb {R}^n\) such that \(f(x_0)\in \operatorname {dom}g\), and set \(g(+\infty ):=+\infty \). Then the composite function \(g\circ f:\; x\mapsto g(f(x))\) is in \(\operatorname {Conv}\mathbb {R}^n\) [resp. in \(\overline{\operatorname {Conv}}\mathbb {R}^n\)].

Proof

It suffices to check the inequalities of definition: (1.1.1) for convexity, (1.2.3) for closedness.

2.2.2 Dilations and perspectives of a functions

Proposition 2.2.5
#

If \(f\in \operatorname {Conv}\mathbb {R}^n\), its perspective \(\tilde f\) is in \(\operatorname {Conv}\mathbb {R}^{n+1}\).

Proof

Here also, it is better to look at \(\tilde f\) with “geometric glasses”:

\[ \operatorname {epi}\tilde f=\{ (u,x,r)\in \mathbb {R}_+^\times \times \mathbb {R}^n\times \mathbb {R}:\; f(x/u)\le r/u\} =\{ u(1,x',r'):\; u{\gt}0,(x',r')\in \operatorname {epi}f\} \]
\[ =\bigcup _{u{\gt}0}u(\{ 1\} \times \operatorname {epi}f)=\mathbb {R}_+^\times (\{ 1\} \times \operatorname {epi}f) \]

and \(\operatorname {epi}\tilde f\) is therefore a convex cone.

Proposition 2.2.6
#

Let \(f\in \operatorname {Conv}\mathbb {R}^n\) and let \(x'\in \operatorname {ri}\text{dom }f\). Then the closure \(\overline{f}\) of its perspective is given as follows:

\[ (\text{cl }\tilde f)(u,x)= \begin{cases} u f(x/u) & \text{if } u{\gt}0,\\[4pt] \lim _{\alpha \downarrow 0}\alpha f(x’-x+\tfrac {x}{\alpha }) & \text{if } u=0,\\[4pt] +\infty & \text{if } u{\lt}0. \end{cases} \]
Proof

Suppose first \(u{\lt}0\). For any \(x\), it is clear that \((u,x)\) is outside \(\text{cl }\text{dom }\tilde f\) and, in view of (1.2.8), \(\text{cl }\tilde f(u,x)=+\infty \).

Now let \(u\ge 0\). Using (2.2.1), the assumption on \(x'\) and the results of §A.2.1, we see that \((1,x')\in \text{ri }\text{dom }\tilde f\), so Proposition 1.2.5 allows us to write

\[ (\text{cl }\tilde f)(u,x)=\lim _{\alpha \downarrow 0}\tilde f\bigl((u,x)+\alpha [(1,x')-(u,x)]\bigr) \]
\[ \qquad =\lim _{\alpha \downarrow 0}[u+\alpha (1-u)]\, f\! \Bigl(\frac{x+\alpha (x'-x)}{u+\alpha (1-u)}\Bigr). \]

If \(u=1\), this reads \(\text{cl }\tilde f(1,x)=\text{cl }f(x)=f(x)\) (because \(f\) is closed); if \(u=0\), we just obtain our claimed relation.

2.2.3 Infimal convolution

Definition 2.2.1
#

Let \(f_1\) and \(f_2\) be two functions from \(\mathbb {R}^n\) to \(\mathbb {R}\cup \{ +\infty \} \). Their infimal convolution is the function from \(\mathbb {R}^n\) to \(\mathbb {R}\cup \{ \pm \infty \} \) defined by

\begin{align} (f_1\mathbin {\square }f_2)(x)& :=\inf \{ f_1(x_1)+f_2(x_2):\; x_1+x_2=x\} \label{eq:2.3.1}\\ & =\inf _{y\in \mathbb {R}^n}[\, f_1(y)+f_2(x-y)\, ].\notag \end{align}

We will also call “infimal convolution” the operation expressed by 7. It is called exact at \(x=\bar x_1+\bar x_2\) when the infimum is attained at \((\bar x_1,\bar x_2)\), not necessarily unique.

Proposition 2.2.7
#

Let the functions \(f_1\) and \(f_2\) be in \(\operatorname {Conv}\mathbb {R}^n\). Suppose that they have a common affine minorant: for some \((s,b)\in \mathbb {R}^n\times \mathbb {R}\),

\[ f_j(x)\ge \langle s,x\rangle - b\qquad \text{for }j=1,2\text{ and all }x\in \mathbb {R}^n. \]

Then their infimal convolution is also in \(\operatorname {Conv}\mathbb {R}^n\).

Proof

For arbitrary \(x\in \mathbb {R}^n\) and \(x_1,x_2\) such that \(x_1+x_2=x\), we have by assumption

\[ f_1(x_1)+f_2(x_2)\ge \langle s,x\rangle -2b{\gt}- \infty , \]

and this inequality extends to the infimal value \((f_1\mathbin {\square }f_2)(x)\).

On the other hand, it suffices to choose particular values \(x_j\in \text{dom }f_j\), \(j=1,2\), to obtain the point \(x_1+x_2\in \text{dom }(f_1\mathbin {\square }f_2)\). Finally, the convexity of \(f_1\mathbin {\square }f_2\) results from the convexity of a lower-bound function, as seen in §1.3(g).

2.2.4 Image of a function under a linear mapping

Definition 2.2.2
#

(Image Function) Let \(A:\mathbb {R}^m\to \mathbb {R}^n\) be a linear operator and let \(g:\mathbb {R}^m\to \mathbb {R}\cup \{ +\infty \} \). The image of \(g\) under \(A\) is the function \(Ag:\mathbb {R}^n\to \mathbb {R}\cup \{ \pm \infty \} \) defined by

\[ (Ag)(x):=\inf \{ \, g(y):\; Ay=x\, \} \tag {2.4.2} \]

(here as always, \(\inf \varnothing =+\infty \)).

Theorem 2.2.1
#

Let \(g\) of Definition 2.4.1 be in \(\mathrm{Conv}\, \mathbb {R}^m\). Assume also that, for all \(x\in \mathbb {R}^n\), \(g\) is bounded below on the inverse image \(A^{-1}(x)=\{ y\in \mathbb {R}^m:Ay=x\} \). Then \(Ag\in \mathrm{Conv}\, \mathbb {R}^n\).

Proof

By assumption, \(Ag\) is nowhere \(-\infty \); also, \((Ag)(x){\lt}+\infty \) whenever \(x=Ay\), with \(y\in \text{dom }g\). Now consider the extended operator

\[ A':\mathbb {R}^m\times \mathbb {R}\rightrightarrows (y,r)\mapsto A'(y,r):=(Ay,r)\in \mathbb {R}^n\times \mathbb {R}. \]

The set \(A'(\text{epi }g)=:C\) is convex in \(\mathbb {R}^n\times \mathbb {R}\); let us compute its lower-bound function (1.3.5): for given \(x\in \mathbb {R}^n\),

\[ \inf _{r}\{ r : (x,r)\in C\} = \inf _{y,r}\{ r : Ay = x \ \text{and}\ g(y)\le r\} = \inf _{y}\{ g(y) : Ay = x\} = (Ag)(x), \]

and this proves the convexity of \(Ag=\ell _C\).

Corollary 2.2.1
#

Let (2.4.1) have the following form: \(U=\mathbb {R}^p\); \(\varphi \in \text{Conv }\mathbb {R}^p\); \(X=\mathbb {R}^n\) is equipped with the canonical basis; the mapping \(c\) has its components \(c_j\in \text{Conv }\mathbb {R}^p\) for \(j=1,\dots ,n\). Suppose also that the optimal value is \({\gt}-\infty \) for all \(x\in \mathbb {R}^n\), and that

\[ \text{dom }\varphi \cap \text{dom }c_1\cap \cdots \cap \text{dom }c_n\neq \varnothing . \tag {2.4.4} \]

Then the value function

\[ v_{\varphi ,c}(x):=\inf \{ \varphi (u):c_j(u)\le x_j,\ \text{for }j=1,\dots ,n\} \]

lies in \(\text{Conv }\mathbb {R}^n\).

Proof

Note first that we have assumed \(v_{\varphi ,c}(x){\gt}-\infty \) for all \(x\). Take \(u_0\) in the set (2.4.4) and set \(M:=\max _j c_j(u_0)\); then take \(x_0:=(M,\dots ,M)\in \mathbb {R}^n\), so that \(v_{\varphi ,c}(x_0)\le \varphi (u_0){\lt}+\infty \). Knowing that \(v_{\varphi ,c}\) is an image-function, we just have to prove the convexity of the set (2.4.3); but this in turn comes immediately from the convexity of each \(c_j\).

Definition 2.2.3
#

(Marginal Function) Let \(g\in \operatorname {Conv}(\mathbb {R}^n\times \mathbb {R}^m)\). Then

\[ \mathbb {R}^n\ni x\mapsto \gamma (x):=\inf \{ \, g(x,y):\; y\in \mathbb {R}^m\, \} \]

is the marginal function of \(g\).

Corollary 2.2.2
#

With the above notation, suppose that \(g\) is bounded below on the set \(\{ x\} \times \mathbb {R}^m\), for all \(x\in \mathbb {R}^n\). Then the marginal function \(\gamma \) lies in \(\operatorname {Conv}\mathbb {R}^n\).

Proof

The marginal function \(\gamma \) is the image of \(g\) under the the linear operator \(A\) projecting each \((x,y)\in \mathbb {R}^n\times \mathbb {R}^m\) onto \(x\in \mathbb {R}^n\): \(A(x,y)=x\).   

2.2.5 Convex hull and closed convex hull of a function

Proposition 2.2.8
#

Let \(g:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \), not identically \(+\infty \), be minorized by some affine function: for some \((s,b)\in \mathbb {R}^n\times \mathbb {R}\),

\[ g(x)\ge \langle s,x\rangle - b\quad \text{for all }x\in \mathbb {R}^n . \tag {2.5.1} \]

Then, the following three functions \(f_1, f_2\) and \(f_3\) are convex and coincide on \(\mathbb {R}^n\):

\[ \begin{aligned} f_1(x) & := \inf \{ r : (x,r)\in \operatorname {co}\, \operatorname {epi} g\} ,\\ f_2(x) & := \sup \{ h(x): h\in \operatorname {Conv}\mathbb {R}^n,\ h\le g\} ,\\ f_3(x) & := \inf \Big\{ \sum _{j=1}^k \alpha _j g(x_j): k=1,2,\dots \\ & \qquad \qquad \qquad \alpha \in \Delta _k,\ x_j\in \operatorname {dom} g,\ \sum _{j=1}^k \alpha _j x_j = x\Big\} . \end{aligned} \tag {2.5.2} \]
Proof

We denote by \(\Gamma \) the family of convex functions minorizing \(g\). By assumption, \(\Gamma \neq \varnothing \); then the convexity of \(f_1\) results from §1.3(g).

[\(f_2\le f_1\)] Consider the epigraph of any \(h\in \Gamma \): its lower-bound function \(\ell _{\mathrm{epi}\, h}\) is \(h\) itself; besides, it contains \(\mathrm{epi}\, g\), and \(\operatorname {co}(\mathrm{epi}\, g)\) as well (see Proposition A.1.3.4). In a word, there holds \(h=\ell _{\mathrm{epi}\, h}\le \ell _{\operatorname {co}\, \mathrm{epi}\, g}=f_1\) and we conclude \(f_2\le f_1\) since \(h\) was arbitrary in \(\Gamma \).

[\(f_3\le f_2\)] We have to prove \(f_3\in \Gamma \), and the result will follow by definition of \(f_2\); clearly \(f_3\le g\) (take \(\alpha \in \Delta _1\)!), so it suffices to establish \(f_3\le \operatorname {Conv}\mathbb {R}^n\). First, with \((s,b)\) of (2.5.1) and all \(x,\ \{ x_j\} \) and \(\{ \alpha _j\} \) as described by (2.5.2),

\[ \sum _{j=1}^k \alpha _j g(x_j) \ge \sum _{j=1}^k \alpha _j(\langle s,x_j\rangle - b) = \langle s,x\rangle - b; \]

hence \(f_3\) is minorized by the affine function \(\langle s,\cdot \rangle - b\). Now, take two points \((x,r)\) and \((x',r')\) in the strict epigraph of \(f_3\). By definition of \(f_3\), there are \(k,\ \{ \alpha _j\} ,\ \{ x_j\} \) as described in (2.5.2), and likewise \(k',\ \{ \alpha '_j\} ,\ \{ x'_j\} \), such that \(\sum _{j=1}^k \alpha _j g(x_j){\lt}r\) and likewise \(\sum _{j=1}^{k'} \alpha '_j g(x'_j){\lt}r'\).

For arbitrary \(t\in ]0,1[\), we obtain by convex combination

\[ \sum _{j=1}^k t\alpha _j g(x_j) + \sum _{j=1}^{k'} (1-t)\alpha '_j g(x'_j) {\lt} tr + (1-t)r'. \]

Observe that

\[ \sum _{j=1}^k t\alpha _j x_j + \sum _{j=1}^{k'} (1-t)\alpha '_j x'_j = tx + (1-t)x', \]

i.e. we have in the lefthand side a convex decomposition of \(tx+(1-t)x'\) in \(k+k'\) elements; therefore, by definition of \(f_3\):

\[ f_3(tx+(1-t)x') \le \sum _{j=1}^k t\alpha _j g(x_j) + \sum _{j=1}^{k'} (1-t)\alpha '_j g(x'_j) \]

and we have proved that \(\mathrm{epi}_s f_3\) is a convex set: \(f_3\) is convex.

Let \(x\in \mathbb {R}^n\) and take an arbitrary convex decomposition \(x=\sum _{j=1}^k \alpha _j x_j\), with \(\alpha _j\) and \(x_j\) as described in (2.5.2). Since \((x_j,g(x_j))\in \operatorname {epi} g\) for \(j=1,\ldots ,k\),

\[ \biggl(x,\sum _{j=1}^k \alpha _j g(x_j)\biggr)\in \operatorname {co}\operatorname {epi} g \]

and this implies \(f_1(x)\le \sum _{j=1}^k \alpha _j g(x_j)\) by definition of \(f_1\). Because the decomposition of \(x\) was arbitrary within (2.5.2), this implies \(f_1(x)\le f_3(x)\).

Proposition 2.2.9
#

Let \(g\) satisfy the hypotheses of Proposition 2.5.1. Then the three functions below

\[ \bar f_1(x):=\inf \{ r:(x,r)\in \overline{\operatorname {epi} g}\} , \qquad \bar f_2(x):=\sup \{ h(x):\ h\in \operatorname {Conv}\mathbb {R}^n,\ h\le g\} , \]
\[ \bar f_3(x):=\sup \{ \langle s,x\rangle -b:\ \langle s,y\rangle -b\le g(y)\text{ for all }y\in \mathbb {R}^n\} \]

are closed, convex, and coincide on \(\mathbb {R}^n\) with the closure of the function constructed in Proposition 2.5.1.

Definition 2.2.4 Convex Hulls of a Function
#

Let \(g:\mathbb {R}^n\to \mathbb {R}\cup \{ +\infty \} \), not identically \(+\infty \), be minorized by an affine function. The common function \(f_1=f_2=f_3\) of Proposition 2.5.1 is called the convex hull of \(g\), denoted by \(\operatorname {co}g\). The closed convex hull of \(g\) is any of the functions described by Proposition 2.5.2; it is denoted by \(\overline{\operatorname {co}}\, g\) or \(\operatorname {cl}\, \operatorname {co}g\).

Proposition 2.2.10

Let \(g_1,\dots ,g_m\) be in \(\operatorname {Conv}\mathbb {R}^n\), all minorized by the same affine function. Then the convex hull of their infimum is the function

\[ \mathbb {R}^n \ni x \mapsto [\operatorname {co}(\min _j g_j)](x)= \inf \Big\{ \sum _{j=1}^m \alpha _j g_j(x_j)\; :\; \alpha \in \Delta _m,\; x_j\in \operatorname {dom} g_j,\; \sum _{j=1}^m\alpha _j x_j=x\Big\} . \tag {2.5.3} \]
Proof

Apply Example A.1.3.5 to the convex sets \(C_j=\operatorname {epi} g_j\).

2.3 Local and Global Behavior of a Convex Function

2.3.1 Continuity properties

Lemma 2.3.1

Let \(f\in \operatorname {Conv}\mathbb {R}^n\) and suppose there are \(x_0,\ \delta ,\ m\) and \(M\) such that

\[ m\le f(x)\le M\quad \text{for all }x\in B(x_0,2\delta ). \]

Then \(f\) is Lipschitzian on \(B(x_0,\delta )\); more precisely: for all \(y\) and \(y'\) in \(B(x_0,\delta )\),

\begin{equation} \label{eq:3.1.1} |f(y)-f(y')|\le \frac{M-m}{\delta }\| y-y'\| . \end{equation}
8

Proof

Look at Fig. 3.1.1: with two different \(y\) and \(y'\) in \(B(x_0,\delta )\), take

\[ y'':=y'+\delta \, \frac{y'-y}{\| y'-y\| }\in B(x_0,2\delta ); \]

by construction, \(y'\) lies on the segment \([y,y'']\), namely

\[ y'=\frac{\| y'-y\| }{\delta +\| y'-y\| }\, y''+\frac{\delta }{\delta +\| y'-y\| }\, y. \]

Applying the convexity of \(f\) and using the postulated bounds, we obtain

\[ f(y')-f(y)\le \frac{\| y'-y\| }{\delta +\| y'-y\| }\bigl[f(y'')-f(y)\bigr]\le \frac{1}{\delta }\| y'-y\| (M-m). \]

Then, it suffices to exchange \(y\) and \(y'\) to prove (3.1.1).

Theorem 2.3.1

With \(f\in \operatorname {Conv}\mathbb {R}^n\), let \(S\) be a convex compact subset of \(\operatorname {ri}\text{dom }f\). Then there exists \(L=L(S)\ge 0\) such that

\[ |f(x)-f(x')|\le L\| x-x'\| \quad \text{for all }x\text{ and }x'\text{ in }S. \tag {3.1.2} \]
Preliminaries

First of all, our statement ignores \(x\)-values outside the affine hull of the convex set \(\text{dom }f\). Instead of \(\mathbb {R}^n\), it can be formulated in \(\mathbb {R}^d\), where \(d\) is the dimension of \(\text{dom }f\); alternatively, we may assume \(\text{ri }\text{dom }f=\int \text{dom }f\), which will simplify the writing.

Make this assumption and let \(x_0\in S\). We will prove that there are \(\delta =\delta (x_0){\gt}0\) and \(L=L(x_0,\delta )\) such that the ball \(B(x_0,\delta )\) is included in \(\int \text{dom }f\) and

\[ |f(y)-f(y')|\le L\| y-y'\| \quad \text{for all }y\text{ and }y'\text{ in }B(x_0,\delta ). \tag {3.1.3} \]

If this holds for all \(x_0\in S\), the corresponding balls \(B(x_0,\delta )\) will provide a covering of the compact set \(S\), from which we will extract a finite covering \((x_1,\delta _1,L_1),\ldots ,(x_k,\delta _k,L_k)\). With these balls, we will divide an arbitrary segment \([x,x']\) of the convex set \(S\) into finitely many subsegments, of endpoints \(y_0:=x,\ldots ,y_i,\ldots ,y_\ell :=x'\). Ordering properly the \(y_i\)’s, we will have \(\| x-x'\| =\sum _{i=1}^\ell \| y_i-y_{i-1}\| \); furthermore, \(f\) will be Lipschitzian on each \([y_{i-1},y_i]\) with the common constant \(L:=\max \{ L_1,\ldots ,L_k\} \). The required Lipschitz property (3.1.2) will follow.

[Main Step] To establish (3.1.3), we use Lemma 3.1.1, which requires boundedness of \(f\) in the neighborhood of \(x_0\). For this, we construct as in the proof of Theorem A.2.1.3 (see Fig. A.2.1.1) a simplex \(\Delta =\operatorname {co}\{ v_0,\dots ,v_n\} \cap \operatorname {dom}f\) having \(x_0\) in its interior: we can take \(\delta {\gt}0\) such that \(B(x_0,2\delta )\subset \Delta \).

Then any \(y\in B(x_0,2\delta )\) can be written: \(y=\sum _{i=0}^n\alpha _i v_i\) with \(\alpha \in \Delta _{n+1}\), so that the convexity of \(f\) gives

\[ f(y)\le \sum _{i=0}^n\alpha _i f(v_i)\le \max \{ f(v_0),\dots ,f(v_n)\} =: M . \]

On the other hand, Proposition 1.2.1 tells us that \(f\) is bounded from below, say by \(m\), on this very same \(B(x_0,2\delta )\). Our claim is proved: we have singled out \(\delta {\gt}0\) such that \(m\le f(y)\le M\) for all \(y\in B(x_0,2\delta )\).

Theorem 2.3.2

Let the convex functions \(f_k:\mathbb {R}^n\to \mathbb {R}\) converge pointwise for \(k\to +\infty \) to \(f:\mathbb {R}^n\to \mathbb {R}\). Then \(f\) is convex and, for each compact set \(S\), the convergence of \(f_k\) to \(f\) is uniform on \(S\).

Proof

Convexity of \(f\) is trivial: pass to the limit in the definition (1.1.1) itself. For uniformity, we want to use Lemma 3.1.1, so we need to bound \(f_k\) on \(S\) independently of \(k\); thus, let \(r{\gt}0\) be such that \(S\subset B(0,r)\).

[Step 1] First the function \(g:=\sup _k f_k\) is convex, and \(g(x){\lt}+\infty \) for all \(x\) because the convergent sequence \((f_k(x))_k\) is certainly bounded. Hence, \(g\) is continuous and therefore bounded, say by \(M\), on the compact set \(B(0,2r)\):

\[ f_k(x)\le g(x)\le M\qquad \text{for all $k$ and all $x\in B(0,2r)$.} \]

Second, the convergent sequence \((f_k(0))_k\) is bounded from below:

\[ \mu \le f_k(0)\qquad \text{for all $k$.} \]

Then, for \(x\in B(0,2r)\) and all \(k\), use convexity on \([-x,x]\subset B(0,2r)\):

\[ 2\mu \le 2f_k(0)\le f_k(x)+f_k(-x)\le f_k(x)+M, \]

i.e. the \(f_k\)’s are bounded from below, independently of \(k\). Thus, we are within the conditions of Lemma 3.1.1: there is some \(L\) (independent of \(k\)) such that

\[ |f_k(y)-f_k(y')| \le L\| y-y'\| \quad \text{for all }k\text{ and all }y,y'\in B(0,r). \tag {3.1.5} \]

Naturally, the same Lipschitz property is transmitted to the limiting function \(f\).

[Step 2] Now fix \(\varepsilon {\gt}0\). Cover \(S\) by the balls \(B(x,\varepsilon )\) for \(x\) describing \(S\), and extract a finite covering \(S\subset B(x_1,\varepsilon )\cup \cdots \cup B(x_m,\varepsilon )\). With \(x\) arbitrary in \(S\), take an \(x_i\) such that \(x\in B(x_i,\varepsilon )\). There is \(k_{i,\varepsilon }\) such that, for all \(k\ge k_{i,\varepsilon }\),

\[ |f_k(x)-f(x)| \le |f_k(x)-f_k(x_i)|+|f_k(x_i)-f(x_i)|+|f(x_i)-f(x)| \le (2L+1)\varepsilon \]

where we have also used (3.1.5), knowing that \(x\) and \(x_i\) are in \(S\subset B(0,r)\). The above inequality is then valid uniformly in \(x\), providing that

\[ k\ge \max \{ k_{1,\varepsilon },\ldots ,k_{m,\varepsilon }\} =: k_\varepsilon . \]

2.3.2 Behavior at infinity

Proposition 2.3.1

For \(f\in \overline{\operatorname {Conv}}\mathbb {R}^n\), the asymptotic cone of \(\operatorname {epi} f\) is the epigraph of the function \(f'_{\infty }\in \overline{\operatorname {Conv}}\mathbb {R}^n\) defined by

\[ d\mapsto f'_{\infty }(d):=\sup _{t{\gt}0}\frac{f(x_0+td)-f(x_0)}{t} \; =\; \lim _{t\to +\infty }\frac{f(x_0+td)-f(x_0)}{t}, \tag {3.2.2} \]

where \(x_0\) is arbitrary in \(\operatorname {dom} f\).

Proof

Since \((x_0,f(x_0))\in \operatorname {epi} f\), (3.2.1) tells us that \((d,\rho )\in (\operatorname {epi} f)_{\infty }\) if and only if \(f(x_0+td)\le f(x_0)+t\rho \) for all \(t{\gt}0\), which means

\[ \sup _{t{\gt}0}\frac{f(x_0+td)-f(x_0)}{t}\le \rho . \tag {3.2.3} \]

In other words, \((\operatorname {epi} f)_\infty \) is the epigraph of the function whose value at \(d\) is the left hand side of (3.2.3); and this is true no matter how \(x_0\) has been chosen in \(\operatorname {dom} f\). The rest follows from the fact that the difference quotient in (3.2.3) is closed convex in \(d\), and increasing in \(t\) (the function \(t\mapsto f(x_0+td)\) is convex and enjoys the property of increasing slopes, namely Proposition 0.6.1). \(\square \)

Definition 2.3.1
#

(Asymptotic function) The function \(f'_{\infty }\) of Proposition 3.2.1 is called the asymptotic function, or recession function, of \(f\).

Proposition 2.3.2

Let \(f\in \operatorname {Conv}\mathbb {R}^n\). All the nonempty sublevel-sets of \(f\) have the same asymptotic cone, which is the sublevel-set of \(f^\infty \) at the level \(0\):

\[ \forall r\in \mathbb {R}\ \text{with } S_r(f)\neq \varnothing ,\qquad [S_r(f)]_\infty =\{ d\in \mathbb {R}^n:\ f^\infty (d)\le 0\} . \]

In particular, the following statements are equivalent:

  1. There is \(r\) for which \(S_r(f)\) is nonempty and compact;

  2. all the sublevel-sets of \(f\) are compact;

  3. \(f^\infty _\circ (d){\gt}0\) for all nonzero \(d\in \mathbb {R}^n\).

Proof

By definition (A.2.2.1), a direction \(d\) is in the asymptotic cone of the nonempty sublevel-set \(S_r(f)\) if and only if

\[ x\in S_r(f)\qquad \Longrightarrow \qquad [x+td\in S_r(f)\ \text{for all }t{\gt}0], \]

which can also be written — see (1.1.4):

\[ (x,r)\in \operatorname {epi} f\qquad \Longrightarrow \qquad (x+td,r+t\times 0)\in \operatorname {epi} f\ \text{ for all }t{\gt}0; \]

and this in turn just means that \((d,0)\in (\operatorname {epi} f)_\infty =\operatorname {epi} f^\infty _\circ \). We have proved the first part of the theorem.

A particular case is when the sublevel-set \(S_0(f^\infty _\circ )\) is reduced to the singleton \(\{ 0\} \), which exactly means (iii). This is therefore equivalent to \([S_r(f)]_\infty =\{ 0\} \) for all \(r\in \mathbb {R}\) with \(S_r(f)\neq \emptyset \), which means that \(S_r(f)\) is compact (Proposition A.2.2.3). The equivalence between (i), (ii) and (iii) is proved.

Definition 2.3.2
#

(Coercivity) The functions \(f\in \operatorname {Conv}\mathbb {R}^n\) satisfying (i), (ii) or (iii) are called \(0\)-coercive. Equivalently, the \(0\)-coercive functions are those that “increase at infinity”:

\[ f(x)\to +\infty \qquad \text{whenever}\qquad \| x\| \to +\infty , \]

and closed convex \(0\)-coercive functions achieve their minimum over \(\mathbb {R}^n\).

An important particular case is when \(f'_\infty (d)=+\infty \) for all \(d\neq 0\), i.e. when \(f'_\infty = \iota _{\{ 0\} }\). It can be seen that this means precisely

\[ \frac{f(x)}{\| x\| }\to +\infty \qquad \text{whenever}\qquad \| x\| \to +\infty . \]

(to establish this equivalence, extract a cluster point of \((x_k/\| x_k\| )\), and use the lower semi-continuity of \(f'_\infty \)). In words: at infinity, \(f\) increases to infinity faster than any affine function; such functions are called \(1\)-coercive, or sometimes just coercive.

Proposition 2.3.3

A function \(f\in \operatorname {Conv}\mathbb {R}^n\) is Lipschitzian on the whole of \(\mathbb {R}^n\) if and only if \(f'_\infty \) is finite on the whole of \(\mathbb {R}^n\). The best Lipschitz constant for \(f\) is then

\[ \sup \{ f'_\infty (d):\| d\| =1\} . \tag {3.2.4} \]
Proof

When the (convex) function \(f'_\infty \) is finite-valued, it is continuous (§3.1) and therefore bounded on the compact unit sphere:

\[ \sup \{ f'_\infty (d):\| d\| =1\} =: L {\lt} +\infty , \]

which implies by positive homogeneity

\[ f'_\infty (d) \le L\| d\| \quad \text{for all }d\in \mathbb {R}^n. \]

Now use the definition (3.2.2) of \(f'_\infty \):

\[ f(x+d)-f(x)\le L\| d\| \quad \text{for all }x\in \operatorname {dom}f\text{ and }d\in \mathbb {R}^n; \]

thus, \(\operatorname {dom}f\) is the whole space (\(f(x+d){\lt}+\infty \) for all \(d\)) and we do obtain that \(L\) is a global Lipschitz constant for \(f\).

Conversely, let \(f\) have a global Lipschitz constant \(L\). Pick \(x_0\in \operatorname {dom}f\) and plug the inequality

\[ f(x_0+td)-f(x_0)\le Lt\| d\| \quad \text{for all }t{\gt}0\text{ and }d\in \mathbb {R}^n \]

into the definition (3.2.2) of \(f'_\infty \) to obtain \(f'_\infty (d)\le L\| d\| \) for all \(d\in \mathbb {R}^n\).

It follows that \(f'_\infty \) is finite everywhere, and the value (3.2.4) does not exceed \(L\).

Proposition 2.3.4 3.2.8
#
  1. Let \(f_1,\dots ,f_m\) be \(m\) functions of \(\text{Conv }\mathbb {R}^n\), and \(t_1,\dots ,t_m\) be positive numbers. Assume that there is \(x_0\) at which each \(f_j\) is finite. Then,

    \[ \text{for } f:=\sum _{j=1}^m t_j f_j,\qquad \text{we have } f'_\infty =\sum _{j=1}^m t_j(f_j)'_\infty . \]
  2. Let \(\{ f_j\} _{j\in J}\) be a family of functions in \(\text{Conv }\mathbb {R}^n\). Assume that there is \(x_0\) at which \(\sup _{j\in J} f_j(x_0){\lt}+\infty \). Then,

    \[ \text{for } f:=\sup _{j\in J} f_j,\qquad \text{we have } f'_\infty =\sup _{j\in J}(f_j)'_\infty . \]
  3. Let \(A:\mathbb {R}^n\to \mathbb {R}^m\) be affine with linear part \(A_0\), and let \(f\in \text{Conv }\mathbb {R}^m\). Assume that \(A(\mathbb {R}^n)\cap \text{dom }f\neq \emptyset \). Then \((f\circ A)'_\infty =f'_\infty \circ A_0\).

2.4 First- and Second-order Differentiation

2.4.1 Differentiable convex functions

Theorem 2.4.1

Let \(f\) be a function differentiable on an open set \(\Omega \subset \mathbb {R}^n\), and let \(C\) be a convex subset of \(\Omega \). Then

  1. \(f\) is convex on \(C\) if and only if

    \[ f(x)\ge f(x_0)+\langle \nabla f(x_0),\, x-x_0\rangle \quad \text{for all }(x_0,x)\in C\times C; \tag {4.1.1} \]
  2. \(f\) is strictly convex on \(C\) if and only if strict inequality holds in (4.1.1) whenever \(x\neq x_0\);

  3. \(f\) is strongly convex with modulus \(c\) on \(C\) if and only if, for all \((x_0,x)\in C\times C\),

    \[ f(x)\ge f(x_0)+\langle \nabla f(x_0),x-x_0\rangle +\tfrac {1}{2}c\| x-x_0\| ^2. \tag {4.1.2} \]
(i)

Let \(f\) be convex on \(C\): for arbitrary \((x_0,x)\in C\times C\) and \(\alpha \in ]0,1[\), we have from the definition (1.1.1) of convexity

\[ f(\alpha x+(1-\alpha )x_0)-f(x_0)\le \alpha [f(x)-f(x_0)]. \]

Divide by \(\alpha \) and let \(\alpha \downarrow 0\): observing that \(\alpha x+(1-\alpha )x_0=x_0+\alpha (x-x_0)\), the lefthand side tends to \(\langle \nabla f(x_0),x-x_0\rangle \) and (4.1.1) is established.

Conversely, take \(x_1\) and \(x_2\) in \(C\), \(\alpha \in ]0,1[\) and set \(x_0:=\alpha x_1+(1-\alpha )x_2\in C\). By assumption,

\[ f(x_i)\ge f(x_0)+\langle \nabla f(x_0),x_i-x_0\rangle \quad \text{for }i=1,2 \tag {4.1.3} \]

and we obtain by convex combination

\[ \alpha f(x_1)+(1-\alpha )f(x_2)\ge f(x_0)+\langle \nabla f(x_0),\alpha x_1+(1-\alpha )x_2-x_0\rangle \]

which, after simplification, is just the relation of definition (1.1.1).

[(ii)] If \(f\) is strictly convex, we have for \(x_0\ne x\) in \(C\) and \(\alpha \in ]0,1[\),

\[ f(x_0+\alpha (x-x_0))-f(x_0){\lt}\alpha [f(x)-f(x_0)]; \]

but \(f\) is in particular convex and we can use (i):

\[ \langle \nabla f(x_0),\alpha (x-x_0)\rangle \le f(x_0+\alpha (x-x_0))-f(x_0), \]

so the required strict inequality follows.

For the converse, proceed as for (i), starting from strict inequalities in (4.1.3).

[(iii)] Using Proposition 1.1.2, just apply (i) to the function \(f-\tfrac {1}{2}c\| \cdot \| ^2\), which is of course differentiable.

Definition 2.4.1
#

Let \(C\subset \mathbb {R}^n\) be convex. The mapping \(F: C\to \mathbb {R}^n\) is said to be monotone [resp. strictly monotone, resp. strongly monotone with modulus \(c{\gt}0\)] on \(C\) when, for all \(x\) and \(x'\) in \(C\),

\[ \langle F(x)-F(x'),\, x-x'\rangle \ge 0 \]

[resp. \(\langle F(x)-F(x'),\, x-x'\rangle {\gt} 0\) whenever \(x\neq x'\), resp. \(\langle F(x)-F(x'),\, x-x'\rangle \ge c\| x-x'\| ^2\)].

Theorem 2.4.2

Let \(f\) be a function differentiable on an open set \(\Omega \subset \mathbb {R}^n\), and let \(C\) be a convex subset of \(\Omega \). Then, \(f\) is convex [resp. strictly convex, resp. strongly convex with modulus \(c\)] on \(C\) if and only if its gradient \(\nabla f\) is monotone [resp. strictly monotone, resp. strongly monotone with modulus \(c\)] on \(C\).

Proof

We combine the “convex \(\Leftrightarrow \) monotone” and “strongly convex \(\Leftrightarrow \) strongly monotone” cases by accepting the value \(c=0\) in the relevant relations such as (4.1.2).

Thus, let \(f\) be [strongly] convex on \(C\): in view of Theorem 4.1.1, we can write for arbitrary \(x_0\) and \(x\) in \(C\):

\[ f(x)\ge f(x_0)+\langle \nabla f(x_0),x-x_0\rangle +\tfrac {1}{2}c\| x-x_0\| ^2 \]
\[ f(x_0)\ge f(x)+\langle \nabla f(x),x_0-x\rangle +\tfrac {1}{2}c\| x_0-x\| ^2, \]

and mere addition shows that \(\nabla f\) is [strongly] monotone.

Conversely, let \((x_0,x_1)\) be a pair of elements in \(C\). Consider the univariate function \(t\mapsto \varphi (t):=f(x_t)\), where \(x_t:=x_0+t(x_1-x_0)\); for \(t\) in an open interval containing \([0,1]\), \(x_t\in \Omega \) and \(\varphi \) is well-defined and differentiable; its derivative at \(t\) is \(\varphi '(t)=\langle \nabla f(x_t),x_1-x_0\rangle \). Thus, we have for all \(0\le t'{\lt}t\le 1\)

\[ \varphi '(t)-\varphi '(t')=\langle \nabla f(x_t)-\nabla f(x_{t'}),\, x_1-x_0\rangle =\frac{1}{t-t'}\langle \nabla f(x_t)-\nabla f(x_{t'}),\, x_t-x_{t'}\rangle \tag {4.1.4} \]

and the monotonicity relation for \(\nabla f\) shows that \(\varphi '\) is increasing, \(\varphi \) is therefore convex (Corollary 0.6.5).

For strong convexity, set \(t'=0\) in (4.1.4) and use the strong monotonicity relation for \(\nabla f\):

\[ \varphi '(t)-\varphi '(0)\ge \frac{1}{t}c\| x_t-x_0\| ^2 = tc\| x_1-x_0\| ^2. \tag {4.1.5} \]

Because the differentiable convex function \(\varphi \) is the integral of its derivative, we can write

\[ \varphi (1)-\varphi (0)-\varphi '(0)=\int _0^1[\varphi '(t)-\varphi '(0)]\, dt \ge \tfrac {1}{2}c\| x_1-x_0\| ^2 \]

which, by definition of \(\varphi \), is just (4.1.2) (the coefficient \(1/2\) is \(\int _0^1 t\, dt!\)).

The same technique proves the “strictly monotone \(\Leftrightarrow \) strictly convex” case; then, (4.1.5) becomes a strict inequality — with \(c=0\) — and remains so after integration.

2.4.2 Nondifferentiable convex functions

Proposition 2.4.1

For \(f\in \operatorname {Conv}\mathbb {R}^n\) and \(x\in \operatorname {int}\text{dom }f\), the three statements below are equivalent:

  1. The function

    \[ \mathbb {R}^n\ni d\mapsto \lim _{t\downarrow 0}\frac{f(x+td)-f(x)}{t} \]

    is linear in \(d\);

  2. for some basis of \(\mathbb {R}^n\) in which \(x=(\xi ^1,\dots ,\xi ^n)\), the partial derivatives \(\dfrac {\partial f}{\partial \xi ^i}(x)\) exist at \(x\), for \(i=1,\dots ,n\);

  3. \(f\) is differentiable at \(x\).

Proof

First of all, remember from Theorem 0.6.3 that the one-dimensional function \(t\mapsto f(x+td)\) has half-derivatives at \(0\): the limits considered in (i) exist for all \(d\). We will denote by \(\{ b_1,\dots ,b_n\} \) the basis postulated in (ii), so that \(x=\sum _{i=1}^n \xi ^i b_i\).

Denote by \(d\mapsto \ell (d)\) the function defined in (i); taking \(d=\pm b_i\), realize that, when (i) holds,

\[ \lim _{\tau \downarrow 0}\frac{f(x+\tau b_i)-f(x)}{-\tau } =\ell (-b_i)=-\ell (b_i)=-\lim _{t\downarrow 0}\frac{f(x+t b_i)-f(x)}{t}. \]

This means that the two half-derivatives at \(t=0\) of the function \(t\mapsto f(x+t b_i)\) coincide: the partial derivative of \(f\) at \(x\) along \(b_i\) exists, (ii) holds. That (iii) implies (i) is clear: when \(f\) is differentiable at \(x\),

\[ \lim _{t\downarrow 0}\frac{f(x+td)-f(x)}{t}=\langle \nabla f(x),d\rangle . \]

We do not really complete the proof here, because everything follows in a straightforward way from subsequent chapters. More precisely, [(ii) \(\Rightarrow \) (i)] is Proposition C.1.1.6, which states that the function \(\ell \) is linear on the space generated by the \(b_i\)’s, whenever it is linear along each \(b_i\). Finally [(i) \(\Rightarrow \) (iii)] results from Lemma D.2.1.1 and the proof goes as follows. One of the possible definitions of (iii) is:

\[ \lim _{t\downarrow 0,\; d'\to d}\frac{f(x+t d')-f(x)}{t}\quad \text{is linear in }d. \]

Because \(f\) is locally Lipschitzian, the above limit exists whenever it exists for fixed \(d'=d\)—i.e. the expression in (i).

Theorem 2.4.3
#

Let \(f\in \operatorname {Conv}\mathbb {R}^n\). The subset of \(\operatorname {int}\operatorname {dom}f\) where \(f\) fails to be differentiable is of zero (Lebesgue) measure.

2.4.3 Second-order differentiation

Theorem 2.4.4

Let \(f\) be twice differentiable on an open convex set \(\Omega \subset \mathbb {R}^n\). Then

  1. \(f\) is convex on \(\Omega \) if and only if \(\nabla ^2 f(x_0)\) is positive semi-definite for all \(x_0\in \Omega \);

  2. if \(\nabla ^2 f(x_0)\) is positive definite for all \(x_0\in \Omega \), then \(f\) is strictly convex on \(\Omega \);

  3. \(f\) is strongly convex with modulus \(c\) on \(\Omega \) if and only if the smallest eigenvalue of \(\nabla ^2 f(\cdot )\) is minorized by \(c\) on \(\Omega \): for all \(x_0\in \Omega \) and all \(d\in \mathbb {R}^n\),

    \[ \langle \nabla ^2 f(x_0)d,d\rangle \ge c\| d\| ^2. \]
Proof

For given \(x_0\in \Omega \), \(d\in \mathbb {R}^n\) and \(t\in \mathbb {R}\) such that \(x_0+td\in \Omega \), we will set

\[ x_t:=x_0+td\quad \text{and}\quad \varphi (t):=f(x_t)=f(x+td), \]

so that \(\varphi ''(t)=\langle \nabla ^2 f(x_t)d,d\rangle \).

[(i)] Assume \(f\) is convex on \(\Omega \); let \((x_0,d)\) be arbitrary in \(\Omega \times \mathbb {R}^n\), with \(d\neq 0\): \(\varphi \) is then convex on the open interval \(I:=\{ t\in \mathbb {R}:x_0+td\in \Omega \} \). It follows

\[ 0 \le \varphi ''(t) = \langle \nabla ^2 f(x_t)d,d\rangle \quad \text{for all }t\in I\ni 0 \]

and \(\nabla ^2 f(x_0)\) is positive semi-definite.

Conversely, take an arbitrary \([x_0,x_1]\subset \Omega \), set \(d:=x_1-x_0\) and assume \(\nabla ^2 f(x_t)\) positive semi-definite, i.e. \(\varphi ''(t)\ge 0\), for \(t\in [0,1]\). Then Theorem 0.6.6 tells us that \(\varphi \) is convex on \([0,1]\), i.e. \(f\) is convex on \([x_0,x_1]\). The result follows since \(x_0\) and \(x_1\) were arbitrary in \(\Omega \).

[(ii)] To establish the strict convexity of \(f\) on \(\Omega \), we prove that \(\nabla f\) is strictly monotone on \(\Omega \): Theorem 4.1.4 will apply. As above, take an arbitrary \([x_0,x_1]\subset \Omega \), \(x_1\neq x_0\), \(d:=x_1-x_0\), and apply the mean-value theorem to the function \(\varphi '\), differentiable on \([0,1]\), for some \(\tau \in ]0,1[\),

\[ \varphi '(1)-\varphi '(0)=\varphi ''(\tau )=\langle \nabla ^2 f(x_\tau )d,d\rangle {\gt}0 \]

and the result follows since

\[ \varphi '(1)-\varphi '(0)=\langle \nabla f(x_1)-\nabla f(x_0),x_1-x_0\rangle . \]

[(iii)] Using Proposition 1.1.2, apply (i) to the function \(f-{\tfrac 12}c\| \, \cdot \, \| ^2\), whose Hessian operator is \(\nabla ^2 f-cI_n\) and has the eigenvalues \(\lambda -c\), with \(\lambda \) describing the eigenvalues of \(\nabla ^2 f\).