1 Convex Sets
1.1 Generalities
1.1.1 Definitions and first examples
The set \(C \subset \mathbb {R}^n\) is said to be convex if \( \alpha x + (1-\alpha )x'\) is in \(C\) whenever \(x\) and \(x'\) are in \(C\), and \(\alpha \in ]0,1[\) (or equivalently \(\alpha \in [0,1]\)).
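As a quick numerical illustration (not part of the text), the definition can be checked directly on a concrete set; the choice of the closed unit disk of \(\mathbb{R}^2\) and the sampling of the segment are arbitrary:

```python
import math

def in_unit_disk(p):
    # Membership test for C = closed unit disk in R^2
    return math.hypot(p[0], p[1]) <= 1.0

def segment_in_set(member, x, xp, samples=101):
    # Check that every sampled point alpha*x + (1-alpha)*xp lies in the set
    for i in range(samples):
        a = i / (samples - 1)
        point = (a * x[0] + (1 - a) * xp[0], a * x[1] + (1 - a) * xp[1])
        if not member(point):
            return False
    return True

# Two points of the disk: the whole segment stays in the disk (convexity)
print(segment_in_set(in_unit_disk, (0.6, 0.0), (0.0, -0.8)))
```

By contrast, the unit circle (boundary only) fails the same test on the segment joining \((1,0)\) to \((-1,0)\), since the midpoint is the origin.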
1.1.2 Convexity preserving operations on sets
Let \(\{ C_j\} _{j\in J}\) be an arbitrary family of convex sets. Then
\[ \bigcap_{j\in J} C_j \]
is convex.
Immediate from the very Definition 1.1.1.
For \(i=1,\dots ,k\), let \(C_i\subset \mathbb {R}^{n_i}\) be convex sets. Then \(C_1\times \cdots \times C_k\) is a convex set of \(\mathbb {R}^{n_1}\times \cdots \times \mathbb {R}^{n_k}\).
Straightforward.
Let \(A:\mathbb {R}^n\to \mathbb {R}^m\) be an affine mapping and \(C\) a convex set of \(\mathbb {R}^n\). The image \(A(C)\) of \(C\) under \(A\) is convex in \(\mathbb {R}^m\).
If \(D\) is a convex set of \(\mathbb {R}^m\), the inverse image
\[ A^{-1}(D):=\{x\in\mathbb{R}^n : A(x)\in D\} \]
is convex in \(\mathbb {R}^n\).
For \(x\) and \(x'\) in \(\mathbb {R}^n\), the image under \(A\) of the segment \([x,x']\) is clearly the segment \([A(x),A(x')]\subset \mathbb {R}^m\). This proves the first claim, but also the second: indeed, if \(x\) and \(x'\) are such that \(A(x)\) and \(A(x')\) are both in the convex set \(D\), then every point of the segment \([x,x']\) has its image in \([A(x),A(x')]\subset D\).
If \(C\) is convex, so are its interior \(\operatorname {int}C\) and its closure \(\overline{C}\).
For given different \(x\) and \(x'\), and \(\alpha \in ]0,1[\), we set \(x''=\alpha x+(1-\alpha )x'\in ]x,x'[\).
Take first \(x\) and \(x'\) in \(\operatorname {int}C\). Choosing \(\delta {\gt}0\) such that \(B(x',\delta )\subset C\), we show that \(B(x'',(1-\alpha )\delta )\subset C\). As often in convex analysis, it is probably best to draw a picture. The ratio \(\| x''-x\| /\| x'-x\| \) being precisely \(1-\alpha \), Fig. 1.2.3 clearly shows that \(B(x'',(1-\alpha )\delta )\) is just the set \(\alpha x+(1-\alpha )B(x',\delta )\), obtained from segments with endpoints in \(\operatorname {int}C\): \(x''\in \operatorname {int}C\).
Now, take \(x\) and \(x'\) in \(\operatorname {cl}C\): we select in \(C\) two sequences \(\{ x_k\} \) and \(\{ x_k'\} \) converging to \(x\) and \(x'\) respectively. Then, \(\alpha x_k + (1-\alpha )x_k'\) is in \(C\) and converges to \(\alpha x + (1-\alpha )x'\), which is therefore in \(\operatorname {cl}C\).
1.1.3 Convex combinations and convex hulls
A convex combination of elements \(x_1,\ldots ,x_k\) in \(\mathbb {R}^n\) is an element of the form
\[ \sum_{i=1}^k \alpha_i x_i, \]
where
\[ \alpha_i\ge 0\ \ \text{for } i=1,\dots,k \quad\text{and}\quad \sum_{i=1}^k \alpha_i=1; \]
in other words, \(\alpha=(\alpha_1,\dots,\alpha_k)\) lies in the unit simplex \(\Delta_k\).
A set \(C\subset \mathbb {R}^n\) is convex if and only if it contains every convex combination of its elements.
The condition is sufficient: convex combinations of two elements just make up the segment joining them. To prove necessity, take \(x_1,\dots ,x_k\) in \(C\) and \(\alpha =(\alpha _1,\dots ,\alpha _k)\in \Delta _k\). At least one of the \(\alpha _i\)’s is positive, say \(\alpha _1>0\). Then form
\[ \bar x_2:=\frac{\alpha_1}{\alpha_1+\alpha_2}\,x_1+\frac{\alpha_2}{\alpha_1+\alpha_2}\,x_2, \]
which is in \(C\) by Definition 1.1.1 itself. Therefore,
\[ \bar x_3:=\frac{\alpha_1+\alpha_2}{\alpha_1+\alpha_2+\alpha_3}\,\bar x_2+\frac{\alpha_3}{\alpha_1+\alpha_2+\alpha_3}\,x_3 \]
is in \(C\) for the same reason; and so on until
\[ \bar x_k=\sum_{i=1}^{k}\alpha_i x_i. \]
The convex hull can also be described as the set of all convex combinations:
\[ \operatorname{co}S=\Bigl\{\,\sum_{i=1}^k \alpha_i x_i \;:\; k\in\mathbb{N}^*,\ \alpha\in\Delta_k,\ x_i\in S \ \text{for } i=1,\dots,k \,\Bigr\}. \qquad (1.3.2) \]
Call \(T\) the set described in the rightmost side of (1.3.2). Clearly, \(T\supset S\). Also, if \(C\) is convex and contains \(S\), then it contains all convex combinations of elements of \(S\) (by the proposition above), i.e. \(C\supset T\); hence \(T\subset\operatorname{co}S\). It remains to show that \(T\) itself is convex.
For this, take two points \(x\) and \(y\) in \(T\), characterized respectively by \((x_{1},\alpha _{1}),\dots ,\) \((x_{k},\alpha _{k})\) and by \((y_{1},\beta _{1}),\dots ,(y_{\ell },\beta _{\ell })\); take also \(\lambda \in [0,1]\). Then \(\lambda x+(1-\lambda )y\) is a certain combination of \(k+\ell \) elements of \(S\); this combination is convex because its coefficients \(\lambda \alpha _{i}\) and \((1-\lambda )\beta _{j}\) are nonnegative, and their sum is
\[ \lambda\sum_{i=1}^k\alpha_i+(1-\lambda)\sum_{j=1}^{\ell}\beta_j=\lambda+(1-\lambda)=1. \]
(C. Carathéodory) Any \(x\in \operatorname {co}S\subset \mathbb {R}^n\) can be represented as a convex combination of \(n+1\) elements of \(S\).
Take an arbitrary convex combination \(x=\sum _{i=1}^k \alpha _i x_i\), with \(k{\gt}n+1\). We will show that one of the \(x_i\)’s can be assigned a \(0\)-coefficient without changing \(x\). For this, assume that all coefficients \(\alpha _i\) are positive (otherwise we are done).
The \(k>n+1\) elements \(x_i\) are certainly affinely dependent: (1.3.1) tells us that we can find \(\delta _1,\dots ,\delta _k\), not all zero, such that
\[ \sum_{i=1}^k \delta_i x_i=0 \quad\text{and}\quad \sum_{i=1}^k \delta_i=0. \]
There is at least one positive \(\delta _i\); and we can set \(\alpha _i':=\alpha _i-t^*\delta _i\) for \(i=1,\dots ,k\), where
\[ t^*:=\min\Bigl\{\frac{\alpha_i}{\delta_i} \;:\; \delta_i>0,\ i=1,\dots,k\Bigr\}>0. \]
Clearly enough,
\[ \alpha_i'\ge 0\ \ \text{for all } i,\qquad \sum_{i=1}^k\alpha_i'=1,\qquad \sum_{i=1}^k\alpha_i'x_i=x, \]
and, by construction of \(t^*\), there exists \(i_0\) such that \(\alpha _{i_0}'=0\).
In other words, we have expressed \(x\) as a convex combination of \(k-1\) among the \(x_i\)’s; our claim is proved.
Now, if \(k-1=n+1\), the proof is finished. If not, we can apply the above construction to the convex combination \(x=\sum _{i=1}^{k-1}\alpha _i' x_i\), and so on until only \(n+1\) elements remain (these may well be affinely independent, in which case no further reduction is possible).
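The reduction step of the proof can be traced on a tiny instance. This is only a sketch: the data (four corners of the unit square and their centroid) and the affine-dependence coefficients \(\delta\) are chosen by hand for this particular example, not computed by the code.

```python
# x is the centroid of the four corners of the unit square: 4 > n+1 = 3 points in R^2.
pts   = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
alpha = [0.25, 0.25, 0.25, 0.25]
# Affine dependence, found by hand for these points:
# x1 - x2 - x3 + x4 = 0 and sum(delta) = 0.
delta = [1.0, -1.0, -1.0, 1.0]

# t* = min over positive delta_i of alpha_i / delta_i
t_star = min(a / d for a, d in zip(alpha, delta) if d > 0)
alpha_new = [a - t_star * d for a, d in zip(alpha, delta)]

def combine(coeffs, points):
    # Evaluate sum_i coeffs[i] * points[i] in R^2
    return tuple(sum(c * p[k] for c, p in zip(coeffs, points)) for k in range(2))

print(alpha_new)       # at least one coefficient is driven to 0 (here, two of them)
print(combine(alpha, pts) == combine(alpha_new, pts))   # x itself is unchanged
```

The new coefficients are still nonnegative with unit sum, and \(x\) is now written with fewer of the \(x_i\)'s, exactly as in the proof.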
(W. Fenchel and L. Bunt) If \(S\subset \mathbb {R}^n\) has no more than \(n\) connected components (in particular, if \(S\) is connected), then any \(x\in \operatorname {co}S\) can be expressed as a convex combination of \(n\) elements of \(S\).
1.1.4 Closed convex sets and hulls
The closed convex hull of a nonempty set \(S\subset \mathbb {R}^n\) is the intersection of all closed convex sets containing \(S\). It will be denoted by \(\overline{\operatorname {co}}\, S\).
The closed convex hull \(\overline{\operatorname {co}}S\) of Definition 1.4.1 is the closure \(\operatorname {cl}(\operatorname {co}S)\) of the convex hull of \(S\).
Because \(\operatorname {cl}(\operatorname {co}S)\) is a closed convex set containing \(S\), it contains \(\overline{\operatorname {co}}S\) as well. On the other hand, take a closed convex set \(C\) containing \(S\); being convex, \(C\) contains \(\operatorname {co}S\); being closed, it contains also the closure of \(\operatorname {co}S\). Since \(C\) was arbitrary, we conclude \(\overline{\operatorname{co}}\,S \supset \operatorname {cl}(\operatorname {co}S)\).
If \(S\) is bounded [resp. compact], then \(\operatorname {co}S\) is bounded [resp. compact].
Let \(x=\sum _{i=1}^{n+1}\alpha _i x_i\in \operatorname {co}S\) (Carathéodory’s Theorem 1.3.6). If \(S\) is bounded, say by \(M\), we can write
\[ \|x\|\le\sum_{i=1}^{n+1}\alpha_i\|x_i\|\le M\sum_{i=1}^{n+1}\alpha_i=M: \]
\(\operatorname{co}S\) is bounded by \(M\) as well.
Now take a sequence \(\{ x^k\} \subset \operatorname {co}S\). For each \(k\) we can choose
\[ \alpha^k=(\alpha^k_1,\dots,\alpha^k_{n+1})\in\Delta_{n+1} \quad\text{and}\quad x^k_1,\dots,x^k_{n+1}\in S \]
such that \(x^k=\sum _{i=1}^{n+1}\alpha ^k_i x^k_i\). Note that \(\Delta _{n+1}\) is compact. If \(S\) is compact, we can extract a subsequence as many times as necessary (not more than \(n+2\) times) so that \(\{ \alpha ^k\} \) and each \(\{ x^k_i\} \) converge: we end up with an index set \(K\subset \mathbb {N}\) such that, when \(k\to +\infty \), \(k\in K\),
\[ \alpha^k\to\alpha\in\Delta_{n+1} \quad\text{and}\quad x^k_i\to x_i\in S\ \ \text{for } i=1,\dots,n+1. \]
Passing to the limit for \(k\in K\), we see that \(\{ x^k\} _{k\in K}\) converges to a point \(x\), which can be expressed as a convex combination of points of \(S\): \(x\in \operatorname {co}S\), whose compactness is thus established.
A conical combination of elements \(x_1,\dots ,x_k\) is an element of the form \(\sum _{i=1}^k \alpha _i x_i\), where the coefficients \(\alpha _i\) are nonnegative.
The set of all conical combinations from a given nonempty \(S\subset \mathbb {R}^n\) is the conical hull of \(S\). It is denoted by \(\operatorname {cone} S\).
The closed conical hull (or rather closed convex conical hull) of a nonempty set \(S\subset \mathbb {R}^n\) is
\[ \overline{\operatorname{cone}}\,S:=\operatorname{cl}(\operatorname{cone}S). \]
Let \(S\) be a nonempty compact set such that \(0\notin \operatorname {co}S\). Then \(\operatorname{cone}S=\mathbb{R}^+\operatorname{co}S\) is closed: \(\overline{\operatorname{cone}}\,S=\operatorname{cone}S\).
The set \(C:=\operatorname {co}S\) is compact and does not contain the origin; we prove that \(\mathbb {R}^{+}C\) is closed. Let \(\{ t_kx_k\} \subset \mathbb {R}^{+}C\) converge to \(y\); extracting a subsequence if necessary, we may suppose \(x_k\to x\in C\); note: \(x\neq 0\). We write
\[ t_k=\frac{\|t_kx_k\|}{\|x_k\|},\qquad \|t_kx_k\|\to\|y\|,\qquad \|x_k\|\to\|x\|>0, \]
which implies \(t_k\to \| y\| /\| x\| =:t\ge 0\). Then, \(t_kx_k\to tx=y\), which is thus in \(\mathbb {R}^{+}C\).
1.2 Convex Sets attached to a Convex Set
1.2.1 Relative interior
The relative interior \(\operatorname {ri} C\) (or \(\operatorname {relint} C\)) of a convex set \(C \subset \mathbb {R}^n\) is the interior of \(C\) for the topology relative to the affine hull of \(C\). In other words: \(x \in \operatorname {ri} C\) if and only if
\[ \exists\,\delta>0 \ \text{ such that }\ B(x,\delta)\cap\operatorname{aff}C\subset C. \]
The dimension of a convex set \(C\) is the dimension of its affine hull, that is to say the dimension of the subspace parallel to \(\operatorname {aff} C\).
If \(C\neq \varnothing \), then \(\operatorname {ri}C\neq \varnothing \). In fact, \(\dim (\operatorname {ri}C)=\dim C\).
Let \(k:=1+\dim C\). Since \(\operatorname {aff}C\) has dimension \(k-1\), \(C\) contains \(k\) affinely independent elements \(x_1,\dots ,x_k\). Call \(\Delta :=\operatorname {co}\{ x_1,\dots ,x_k\} \) the simplex that they generate; see Fig. 2.1.1; \(\operatorname {aff}\Delta =\operatorname {aff}C\) because \(\Delta \subset C\) and \(\dim \Delta =k-1\). The proof will be finished if we show that \(\Delta \) has nonempty relative interior.
Take \(\bar{x}:=\frac1k\sum _{i=1}^k x_i\) (the “center” of \(\Delta \)) and describe \(\operatorname {aff}\Delta \) by points of the form \(\bar x+y\), where \(\alpha (y)=(\alpha _1(y),\dots ,\alpha _k(y))\in \mathbb R^k\) solves
\[ \sum_{i=1}^k\alpha_i(y)\,x_i=\bar x+y,\qquad \sum_{i=1}^k\alpha_i(y)=1. \]
Because this system has a unique solution (the \(x_i\)’s are affinely independent), the mapping \(y\mapsto \alpha (y)\) is (affine and) continuous; since \(\alpha_i(0)=\frac1k>0\) for all \(i\), we can find \(\delta >0\) such that \(\| y\| \le \delta \) implies
\[ \alpha_i(y)\ge 0\quad\text{for } i=1,\dots,k, \]
hence \(\bar x+y\in \Delta \).
In other words, \(\bar x\in \operatorname {ri}\Delta \subset \operatorname {ri}C\).
It follows in particular \(\dim \operatorname {ri}C=\dim \Delta =\dim C\).
Let \(x\in \operatorname {cl}C\) and \(x'\in \operatorname {ri}C\). Then the half-open segment
\[ [x',x[\;:=\{\alpha x+(1-\alpha)x' \;:\; 0\le\alpha<1\} \]
is contained in \(\operatorname {ri}C\).
Take \(x''=\alpha x+(1-\alpha )x'\), with \(1>\alpha \ge 0\). To avoid writing “\(\operatorname {ri}C\)” every time, we assume without loss of generality that \(\operatorname {aff}C=\mathbb {R}^n\).
Since \(x\in \operatorname {cl}C\), for all \(\varepsilon >0\) we have \(x\in C+B(0,\varepsilon )\) and we can write
\[ x''+B(0,\varepsilon)\subset \alpha C+(1-\alpha)x'+B\bigl(0,(1+\alpha)\varepsilon\bigr). \]
Since \(x'\in \operatorname {int}C\), we can choose \(\varepsilon \) so small that \(x'+B\bigl(0,\tfrac {1+\alpha }{1-\alpha }\varepsilon \bigr)\subset C\). Then we have
\[ x''+B(0,\varepsilon)\subset \alpha C+(1-\alpha)\Bigl[x'+B\Bigl(0,\tfrac{1+\alpha}{1-\alpha}\varepsilon\Bigr)\Bigr]\subset \alpha C+(1-\alpha)C=C \]
(where the last equality is just the definition of a convex set); hence \(x''\in\operatorname{int}C\).
The three convex sets \(\operatorname {ri}C\), \(C\) and \(\operatorname {cl}C\) have the same affine hull (and hence the same dimension), the same relative interior and the same closure (and hence the same relative boundary).
The case of the affine hull was already seen in Theorem 2.1.3. For the others, the key result is Lemma 2.1.6 (as well as for most other properties involving closures and relative interiors). We illustrate it by restricting our proof to one of the properties, say: \(\operatorname {ri}C\) and \(C\) have the same closure.
Thus, we have to prove that \(\operatorname {cl}C\subset \operatorname {cl}(\operatorname {ri}C)\). Let \(x\in \operatorname {cl}C\) and take \(x'\in \operatorname {ri}C\) (it is possible by virtue of Theorem 2.1.3). Because \([x',x[\;\subset \operatorname {ri}C\) (Lemma 2.1.6), we do have that \(x\) is a limit of points in \(\operatorname {ri}C\) (and even a "radial" limit); hence \(x\) is in the closure of \(\operatorname {ri}C\).
Let the two convex sets \(C_1\) and \(C_2\) satisfy \(\operatorname {ri}C_1\cap \operatorname {ri}C_2\neq \varnothing \). Then
\[ \operatorname{ri}(C_1\cap C_2)=\operatorname{ri}C_1\cap\operatorname{ri}C_2 \quad\text{and}\quad \operatorname{cl}(C_1\cap C_2)=\operatorname{cl}C_1\cap\operatorname{cl}C_2. \]
First we show that \(\operatorname {cl}C_1\cap \operatorname {cl}C_2 \subset \operatorname {cl}(C_1\cap C_2)\) (the converse inclusion is always true). Given \(x\in \operatorname {cl}C_1\cap \operatorname {cl}C_2\), we pick \(x'\) in the nonempty \(\operatorname {ri}C_1\cap \operatorname {ri}C_2\). From Lemma 2.1.6 applied to \(C_1\) and to \(C_2\),
\[ [x',x[\;\subset\operatorname{ri}C_1\cap\operatorname{ri}C_2\subset C_1\cap C_2. \]
Taking the closure of both sides, we conclude
\[ x\in\operatorname{cl}\,[x',x[\;\subset\operatorname{cl}(C_1\cap C_2), \]
which proves the closure formula because \(x\) was arbitrary; the above inclusion is actually an equality.
Now, we have just seen that the two convex sets \(\operatorname {ri}C_1\cap \operatorname {ri}C_2\) and \(C_1\cap C_2\) have the same closure. According to Remark 2.1.9, they have the same relative interior:
\[ \operatorname{ri}(C_1\cap C_2)=\operatorname{ri}(\operatorname{ri}C_1\cap\operatorname{ri}C_2)\subset\operatorname{ri}C_1\cap\operatorname{ri}C_2. \]
It remains to prove the converse inclusion, so let \(y\in \operatorname {ri}C_1\cap \operatorname {ri}C_2\). If we take \(x'\in C_1\) [resp. \(C_2\)], the segment \([x',y]\) is in \(\operatorname{aff}C_1\) [resp. \(\operatorname{aff}C_2\)] and, by definition of the relative interior, this segment can be stretched beyond \(y\) and yet stay in \(C_1\) [resp. \(C_2\)] (see Fig. 2.1.3). Take in particular \(x'\in \operatorname {ri}(C_1\cap C_2)\), \(x'\neq y\) (if such an \(x'\) does not exist, we are done). The above stretching singles out an \(x\in C_1\cap C_2\) such that \(y\in\,]x,x'[\).
Then Lemma 2.1.6 applied to \(C_1\cap C_2\) tells us that \(y\in \operatorname {ri}(C_1\cap C_2)\).
For \(i=1,\ldots ,k\), let \(C_i\subset \mathbb {R}^{n_i}\) be convex sets. Then
\[ \operatorname{ri}(C_1\times\cdots\times C_k)=\operatorname{ri}C_1\times\cdots\times\operatorname{ri}C_k. \]
It suffices to apply Definition 2.1.1 alone, observing that
\[ \operatorname{aff}(C_1\times\cdots\times C_k)=\operatorname{aff}C_1\times\cdots\times\operatorname{aff}C_k. \]
Let \(A:\mathbb {R}^n\to \mathbb {R}^m\) be an affine mapping and \(C\) a convex set of \(\mathbb {R}^n\). Then
\[ \operatorname{ri}A(C)=A(\operatorname{ri}C). \qquad (2.1.3) \]
If \(D\) is a convex set of \(\mathbb {R}^m\) satisfying \(A^{-1}(\operatorname {ri}D)\neq \varnothing \), then
\[ \operatorname{ri}A^{-1}(D)=A^{-1}(\operatorname{ri}D). \qquad (2.1.4) \]
First, note that the continuity of \(A\) implies \(A(\operatorname {cl}S)\subset \operatorname {cl}[A(S)]\) for any \(S\subset \mathbb {R}^n\). Apply this result to \(\operatorname {ri}C\), whose closure is \(\operatorname {cl} C\) (Proposition 2.1.8), and use the monotonicity of the closure operation:
\[ A(C)\subset A(\operatorname{cl}C)=A\bigl(\operatorname{cl}(\operatorname{ri}C)\bigr)\subset\operatorname{cl}[A(\operatorname{ri}C)]\subset\operatorname{cl}[A(C)]; \]
the closed set \(\operatorname {cl}[A(\operatorname {ri}C)]\) is therefore \(\operatorname {cl}[A(C)]\). Because \(A(\operatorname {ri}C)\) and \(A(C)\) have the same closure, they have the same relative interior (Remark 2.1.9):
\[ \operatorname{ri}A(C)=\operatorname{ri}[A(\operatorname{ri}C)]\subset A(\operatorname{ri}C). \]
To prove the converse inclusion, let \(w=A(y)\in A(\operatorname {ri}C)\), with \(y\in \operatorname {ri}C\). We choose \(z'=A(x')\in \operatorname {ri}A(C)\), with \(x'\in C\) (we assume \(z'\neq w\), hence \(x'\neq y\)).
Using in \(C\) the same stretching mechanism as in Fig. 2.1.3, we single out \(x\in C\) such that \(y\in ]x,x'[\), to which corresponds \(z=A(x)\in A(C)\). By affinity, \(A(y)\in ]A(x),A(x')[=]z,z'[\). Thus, \(z\) and \(z'\) fulfill the conditions of Lemma 2.1.6 applied to the convex set \(A(C)\): \(w\in \operatorname {ri}A(C)\), and (2.1.3) is proved.
The proof of (2.1.4) uses the same technique.
1.2.2 The asymptotic cone
For \(x\in C\), let \(C_{\infty }(x):=\{d\in\mathbb{R}^n \;:\; x+td\in C\ \text{for all } t>0\}\) be the cone of directions along which \(C\) recedes from \(x\). The closed convex cone \(C_{\infty }(x)\) does not depend on \(x\in C\).
See Theorem I.2.3.1 and the pantographic Figure I.2.3.1. Take two different points \(x_{1}\) and \(x_{2}\) in \(C\); it suffices to prove one inclusion, say \(C_{\infty }(x_{1})\subset C_{\infty }(x_{2})\). Let \(d\in C_{\infty }(x_{1})\) and \(t>0\), we have to prove \(x_{2}+td\in C\). With \(\varepsilon \in ]0,1[\), consider the point
\[ \bar x_\varepsilon:=\varepsilon x_1+(1-\varepsilon)x_2+td. \]
Writing it as
\[ \bar x_\varepsilon=\varepsilon\Bigl[x_1+\frac{t}{\varepsilon}\,d\Bigr]+(1-\varepsilon)x_2, \]
we see that \(\bar{x}_\varepsilon \in C\) (use the definitions of \(C_\infty (x_1)\) and of a convex set). On the other hand,
\[ \bar x_\varepsilon\to x_2+td \quad\text{when }\varepsilon\downarrow 0, \]
so \(x_2+td\) lies in the closed set \(C\).
The asymptotic cone, or recession cone of the closed convex set \(C\) is the closed convex cone \(C_{\infty }\) defined by (2.2.1) or (2.2.2), in which Proposition 2.2.1 is exploited.
A closed convex set \(C\) is compact if and only if \(C_\infty =\{ 0\} \).
If \(C\) is bounded, it is clear that \(C_\infty \) cannot contain any nonzero direction.
Conversely, assume \(C\) is unbounded: let \(\{ x_k\} \subset C\) be such that \(\| x_k\| \to +\infty \) (we assume \(x_k\neq 0\)). The sequence \(\{ d_k:=x_k/\| x_k\| \} \) is bounded, so we can extract a convergent subsequence: \(d=\lim _{k\in K}d_k\) with \(K\subset \mathbb N\) (note \(\| d\| =1\), so \(d\neq 0\)). Now, given \(x\in C\) and \(t>0\), take \(k\in K\) so large that \(\| x_k\| \ge t\). Then, we see that
\[ \Bigl(1-\frac{t}{\|x_k\|}\Bigr)x+\frac{t}{\|x_k\|}\,x_k \]
is in the convex set \(C\); letting \(k\to+\infty\) in \(K\), its limit \(x+td\) is in the closed set \(C\), hence \(d\in C_\infty \).
If \(\{ C_j\} _{j\in J}\) is a family of closed convex sets having a point in common, then
\[ \left(\bigcap _{j\in J} C_j\right)_\infty = \bigcap _{j\in J}(C_j)_\infty . \]
If, for \(j=1,\dots ,m\), \(C_j\) are closed convex sets in \(\mathbb {R}^{n_j}\), then
\[ (C_1\times \cdots \times C_m)_\infty = (C_1)_\infty \times \cdots \times (C_m)_\infty . \]
Let \(A:\mathbb {R}^n\to \mathbb {R}^m\) be an affine mapping. If \(C\) is closed convex in \(\mathbb {R}^n\) and \(A(C)\) is closed, then
\[ A(C_\infty )\subset [A(C)]_\infty . \]
If \(D\) is closed convex in \(\mathbb {R}^m\) with nonempty inverse image, then
\[ \bigl[ A^{-1}(D) \bigr]_\infty = A^{-1}(D_\infty ). \]
1.2.3 Extreme points
We say that \(x\in C\) is an extreme point of \(C\) if there are no two different points \(x_1\) and \(x_2\) in \(C\) such that \(x=\tfrac {1}{2}(x_1+x_2)\).
If \(C\) is compact, then \(\operatorname {ext}C\neq \varnothing \).
Because \(C\) is compact, there is \(\bar x\in C\) maximizing the continuous function \(x\mapsto \| x\| ^2\). We claim that \(\bar x\) is extremal. In fact, suppose that there are \(x_1\neq x_2\) in \(C\) with \(\bar x=\tfrac 12(x_1+x_2)\). Then, using (2.3.1), we obtain the contradiction
\[ \|\bar x\|^2=\tfrac12\|x_1\|^2+\tfrac12\|x_2\|^2-\tfrac14\|x_1-x_2\|^2<\tfrac12\|x_1\|^2+\tfrac12\|x_2\|^2\le\|\bar x\|^2. \]
(H. Minkowski) Let \(C\) be compact, convex in \(\mathbb {R}^n\). Then \(C\) is the convex hull of its extreme points: \(C=\operatorname {co}(\operatorname {ext}C)\).
A nonempty convex subset \(F\subset C\) is a face of \(C\) if it satisfies the following property: every segment of \(C\), having in its relative interior an element of \(F\), is entirely contained in \(F\). In other words,
\[ x=\alpha x_1+(1-\alpha)x_2\in F,\ \ x_1,x_2\in C,\ \ \alpha\in\,]0,1[\ \ \Longrightarrow\ \ x_1\in F\ \text{and}\ x_2\in F. \qquad (2.3.2) \]
Let \(F\) be a face of \(C\). Then any extreme point of \(F\) is an extreme point of \(C\).
Take \(x\in F\subset C\) and assume that \(x\) is not an extreme point of \(C\): there are different \(x_1,x_2\) in \(C\) and \(\alpha \in \, ]0,1[\) such that \(x=\alpha x_1+(1-\alpha )x_2\in F\). From the very definition (2.3.2) of a face, this implies that \(x_1\) and \(x_2\) are in \(F\): \(x\) cannot be an extreme point of \(F\). \(\square \)
1.2.4 Exposed faces
An affine hyperplane \(H_{s,r}:=\{y\in\mathbb{R}^n : \langle s,y\rangle=r\}\) (with \(s\neq 0\)) is said to support the set \(C\) when \(C\) is entirely contained in one of the two closed half-spaces delimited by \(H_{s,r}\): say
\[ \langle s,y\rangle\le r\quad\text{for all } y\in C. \qquad (2.4.1) \]
It is said to support \(C\) at \(x\in C\) when, in addition, \(x\in H_{s,r}\): (2.4.1) holds, as well as
\[ \langle s,x\rangle=r. \]
The set \(F\subset C\) is an exposed face of \(C\) if there is a supporting hyperplane \(H_{s,r}\) of \(C\) such that \(F=C\cap H_{s,r}\).
An exposed point, or vertex, is a \(0\)-dimensional exposed face, i.e. a point \(x\in C\) at which there is a supporting hyperplane \(H_{s,r}\) of \(C\) such that \(H_{s,r}\cap C\) reduces to \(\{ x\} \).
An exposed face is a face.
Let \(F\) be an exposed face, with its associated support \(H_{s,r}\). Take \(x_1\) and \(x_2\) in \(C\):
\[ \langle s,x_1\rangle\le r \quad\text{and}\quad \langle s,x_2\rangle\le r; \]
take also \(\alpha \in ]0,1[\) such that \(\alpha x_1+(1-\alpha )x_2\in F\subset H_{s,r}\):
\[ \alpha\langle s,x_1\rangle+(1-\alpha)\langle s,x_2\rangle=r. \]
Suppose that one of the two relations above holds as a strict inequality. By convex combination, we obtain (\(0<\alpha<1\)!)
\[ \alpha\langle s,x_1\rangle+(1-\alpha)\langle s,x_2\rangle<r, \]
a contradiction. Thus \(\langle s,x_1\rangle=\langle s,x_2\rangle=r\), i.e. \(x_1\) and \(x_2\) are in \(F\).
Let \(C\) be convex and compact. For \(s\in \mathbb {R}^n\), there holds
\[ \max_{x\in C}\,\langle s,x\rangle=\max\,\{\langle s,x\rangle \;:\; x\in\operatorname{ext}C\}. \]
Furthermore, the solution-set of the first problem is the convex hull of the solution-set of the second:
\[ F_C(s)=\operatorname{co}\,\Bigl\{x\in\operatorname{ext}C \;:\; \langle s,x\rangle=\max_{y\in C}\langle s,y\rangle\Bigr\}, \]
where \(F_C(s)\) denotes the exposed face associated with \(s\).
Because \(C\) is compact, \(\langle s,\cdot \rangle \) attains its maximum on \(C\), and the solution set is the exposed face \(F_C(s)\). The latter set is convex and compact, and as such is the convex hull of its extreme points (Minkowski’s Theorem 2.3.4); these extreme points are also extreme in \(C\) (Proposition 2.3.7 and Remark 2.4.4).
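For a polytope given by its vertices, the result says that a linear form can be maximized over the extreme points alone. A small numerical sketch (the vertex list and the vector \(s\) are arbitrary illustrative choices):

```python
import random

# Unit square: its extreme points are the four corners.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
s = (1.0, 2.0)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Maximum of <s, .> over the extreme points only
best = max(dot(s, v) for v in vertices)

# Sample the square itself: no point of C beats the best vertex value.
random.seed(0)
samples = [(random.random(), random.random()) for _ in range(1000)]
print(best)
print(all(dot(s, p) <= best for p in samples))
```

Here the maximum \(3\) is attained at the corner \((1,1)\), and every sampled point of the square scores no higher.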
1.3 Projection Onto Closed Convex Set
1.3.1 The projection operator
A point \(y_x\in C\) is the projection \(p_C(x)\) if and only if
\[ \langle x-y_x,\,y-y_x\rangle\le 0 \quad\text{for all } y\in C. \qquad (3.1.3) \]
Call \(y_x\) the solution of (3.1.1); take \(y\) arbitrary in \(C\), so that \(y_x+\alpha (y-y_x)\in C\) for any \(\alpha \in ]0,1[\). Then we can write with the notation (3.1.2), \(f_x(\cdot)=\tfrac12\|x-\cdot\|^2\),
\[ f_x(y_x)\le f_x\bigl(y_x+\alpha(y-y_x)\bigr)=\tfrac12\,\|x-y_x-\alpha(y-y_x)\|^2. \]
Developing the square, we obtain after simplification
\[ 0\le-\alpha\,\langle x-y_x,\,y-y_x\rangle+\tfrac{\alpha^2}{2}\,\|y-y_x\|^2. \]
Divide by \(\alpha\ (>0)\) and let \(\alpha \downarrow 0\) to obtain (3.1.3).
Conversely, suppose that \(y_x\in C\) satisfies (3.1.3). If \(y_x=x\), then \(y_x\) certainly solves (3.1.1). If not, write for arbitrary \(y\in C\):
\[ \|x-y_x\|^2=\langle x-y_x,\,x-y\rangle+\langle x-y_x,\,y-y_x\rangle\le\langle x-y_x,\,x-y\rangle\le\|x-y_x\|\,\|x-y\|, \]
where the Cauchy-Schwarz inequality is used. Divide by \(\| x-y_x\| >0\) to see that \(y_x\) solves (3.1.1).
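Characterization (3.1.3) can be verified numerically for a set whose projection is known in closed form. The choice of the closed unit ball of \(\mathbb{R}^2\), the test point, and the random sampling are illustrative assumptions, not the book's data:

```python
import math
import random

def proj_unit_ball(x):
    # Projection onto the closed Euclidean unit ball of R^2:
    # x itself if ||x|| <= 1, otherwise x / ||x||.
    n = math.hypot(x[0], x[1])
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

def obtuse_angle_holds(x, trials=1000):
    # Check <x - p, y - p> <= 0 for random y in the ball, as in (3.1.3)
    p = proj_unit_ball(x)
    random.seed(1)
    for _ in range(trials):
        y = (random.uniform(-1, 1), random.uniform(-1, 1))
        if math.hypot(*y) > 1.0:
            continue  # rejection sampling: keep only points of the ball
        inner = (x[0] - p[0]) * (y[0] - p[0]) + (x[1] - p[1]) * (y[1] - p[1])
        if inner > 1e-12:
            return False
    return True

print(obtuse_angle_holds((3.0, 4.0)))
```

For \(x=(3,4)\) the projection is \((0.6,0.8)\), and the inner product \(\langle x-p,\,y-p\rangle\) is indeed nonpositive for every sampled \(y\) of the ball.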
For all \((x_1,x_2)\in \mathbb {R}^n\times \mathbb {R}^n\), there holds
\[ \langle p_C(x_1)-p_C(x_2),\,x_1-x_2\rangle\;\ge\;\|p_C(x_1)-p_C(x_2)\|^2. \]
Write (3.1.3) with \(x=x_1\), \(y=p_C(x_2)\in C\):
\[ \langle x_1-p_C(x_1),\,p_C(x_2)-p_C(x_1)\rangle\le 0; \]
likewise,
\[ \langle x_2-p_C(x_2),\,p_C(x_1)-p_C(x_2)\rangle\le 0; \]
and conclude by addition
\[ \langle x_1-x_2-[p_C(x_1)-p_C(x_2)],\,p_C(x_2)-p_C(x_1)\rangle\le 0, \]
i.e. \(\langle p_C(x_1)-p_C(x_2),\,x_1-x_2\rangle\ge\|p_C(x_1)-p_C(x_2)\|^2\).
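The monotonicity inequality can also be checked on random pairs of points, again with the closed-form projection onto the unit ball of \(\mathbb{R}^2\) (an illustrative test under these specific choices):

```python
import math
import random

def proj_ball(x):
    # Projection onto the closed Euclidean unit ball of R^2
    n = math.hypot(x[0], x[1])
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

def monotone_gap(x1, x2):
    # Returns <p1 - p2, x1 - x2> - ||p1 - p2||^2, which the theorem says is >= 0
    p1, p2 = proj_ball(x1), proj_ball(x2)
    du = (p1[0] - p2[0], p1[1] - p2[1])
    dx = (x1[0] - x2[0], x1[1] - x2[1])
    return du[0] * dx[0] + du[1] * dx[1] - (du[0] ** 2 + du[1] ** 2)

random.seed(2)
gaps = [monotone_gap((random.uniform(-3, 3), random.uniform(-3, 3)),
                     (random.uniform(-3, 3), random.uniform(-3, 3)))
        for _ in range(1000)]
print(min(gaps) >= -1e-9)   # inequality holds on all sampled pairs
```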
1.3.2 Projection onto a closed convex cone
Let \(K\) be a convex cone, as defined in Example 1.1.4. The polar cone of \(K\) (called negative polar cone by some authors) is:
\[ K^\circ:=\{s\in\mathbb{R}^n \;:\; \langle s,x\rangle\le 0\ \text{for all } x\in K\}. \]
Let \(K\) be a closed convex cone. Then \(y_x = p_K(x)\) if and only if
\[ y_x\in K,\qquad x-y_x\in K^\circ,\qquad \langle x-y_x,\,y_x\rangle=0. \qquad (3.2.1) \]
We know from Theorem 3.1.1 that \(y_x = p_K(x)\) satisfies
\[ \langle x-y_x,\,y-y_x\rangle\le 0\quad\text{for all } y\in K. \qquad (3.2.2) \]
Taking \(y=\alpha y_x\in K\), with arbitrary \(\alpha \ge 0\), this inequality implies
\[ (\alpha-1)\,\langle x-y_x,\,y_x\rangle\le 0\quad\text{for all }\alpha\ge 0. \]
Since \(\alpha -1\) can have either sign, this implies \(\langle x-y_x,y_x\rangle =0\) and (3.2.2) becomes
\[ \langle x-y_x,\,y\rangle\le 0\quad\text{for all } y\in K, \]
i.e. \(x-y_x\in K^\circ\).
Conversely, let \(y_x\) satisfy (3.2.1). For arbitrary \(y\in K\), use the notation (3.1.2), \(f_x(\cdot)=\tfrac12\|x-\cdot\|^2\):
\[ f_x(y)=f_x(y_x)+\langle x-y_x,\,y_x-y\rangle+\tfrac12\|y_x-y\|^2; \]
but (3.2.1) shows that
\[ \langle x-y_x,\,y_x-y\rangle=-\langle x-y_x,\,y\rangle\ge 0, \]
hence \(f_x(y)\ge f_x(y_x)\): \(y_x\) solves (3.1.1).
Let \(K\) be a closed convex cone. For the three elements \(x,x_1\) and \(x_2\) in \(\mathbb {R}^n\), the properties below are equivalent:
\(x=x_1+x_2\) with \(x_1\in K\), \(x_2\in K^\circ \) and \(\langle x_1,x_2\rangle =0\);
\(x_1=\mathrm{p}_K(x)\) and \(x_2=\mathrm{p}_{K^\circ }(x)\).
Straightforward, from (3.2.3) and the characterization (3.2.1) of \(x_1=\mathrm{p}_K(x)\).
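For \(K\) the nonnegative orthant, whose polar is the nonpositive orthant, both projections are coordinatewise clippings, and the decomposition can be checked directly. This is a sketch under these specific choices of \(K\) and \(x\):

```python
def proj_orthant(x):
    # p_K for K = nonnegative orthant of R^2: clip negative coordinates to 0
    return tuple(max(c, 0.0) for c in x)

def proj_polar(x):
    # p_{K°} for K° = nonpositive orthant: clip positive coordinates to 0
    return tuple(min(c, 0.0) for c in x)

x = (3.0, -2.0)
x1, x2 = proj_orthant(x), proj_polar(x)

print(x1, x2)                                       # the two orthogonal pieces
print(tuple(a + b for a, b in zip(x1, x2)) == x)    # x = x1 + x2
print(sum(a * b for a, b in zip(x1, x2)) == 0.0)    # <x1, x2> = 0
```

The decomposition \(x=(3,-2)=(3,0)+(0,-2)\) satisfies all three conditions of the theorem.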
1.4 Separation and Applications
1.4.1 Separation between convex sets
Let \(C\subset \mathbb {R}^n\) be nonempty closed convex, and let \(x\notin C\). Then there exists \(s\in \mathbb {R}^n\) such that
\[ \sup_{y\in C}\,\langle s,y\rangle<\langle s,x\rangle. \qquad (4.1.1) \]
Set \(s := x - p_C(x) \neq 0\). We write (3.1.3) as
\[ \langle s,\,y-x+s\rangle\le 0,\quad\text{i.e.}\quad \langle s,y\rangle\le\langle s,x\rangle-\|s\|^2\quad\text{for all } y\in C. \]
Thus we have
\[ \sup_{y\in C}\,\langle s,y\rangle\le\langle s,x\rangle-\|s\|^2<\langle s,x\rangle, \]
and our \(s\) is a convenient answer for (4.1.1).
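The construction \(s=x-p_C(x)\) is effective. A hedged sketch: \(C\) is taken to be the unit square (so that the projection is a coordinatewise clamp, and the supremum of a linear form over \(C\) is attained at a vertex), and the outside point \(x\) is an arbitrary choice:

```python
def proj_box(y):
    # Projection onto C = [0,1]^2: clamp each coordinate
    return tuple(min(max(c, 0.0), 1.0) for c in y)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = (2.0, 1.5)                        # a point outside C
p = proj_box(x)
s = (x[0] - p[0], x[1] - p[1])        # s = x - p_C(x)

# sup over C of <s, y> is attained at a vertex of the box
sup_C = max(dot(s, v) for v in [(0, 0), (1, 0), (0, 1), (1, 1)])
print(sup_C < dot(s, x))              # strict separation, as in (4.1.1)
```

Here \(p=(1,1)\), \(s=(1,0.5)\), \(\sup_C\langle s,\cdot\rangle=1.5\) and \(\langle s,x\rangle=2.75\).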
Let \(C_1,\, C_2\) be two nonempty closed convex sets with \(C_1\cap C_2=\varnothing \). If \(C_2\) is bounded, there exists \(s\in \mathbb {R}^n\) such that
\[ \sup_{y_1\in C_1}\,\langle s,y_1\rangle<\inf_{y_2\in C_2}\,\langle s,y_2\rangle. \]
The set \(C_1-C_2\) is convex (Proposition 1.2.4) and closed (because \(C_2\) is compact). To say that \(C_1\) and \(C_2\) are disjoint is to say that \(0\notin C_1-C_2\), so we have by Theorem 4.1.1 an \(s\in \mathbb {R}^n\) separating \(\{ 0\} \) from \(C_1-C_2\):
\[ \sup\,\{\langle s,y_1-y_2\rangle \;:\; y_1\in C_1,\ y_2\in C_2\}<\langle s,0\rangle=0. \]
This means:
\[ \sup_{y_1\in C_1}\,\langle s,y_1\rangle-\inf_{y_2\in C_2}\,\langle s,y_2\rangle<0. \]
Because \(C_2\) is bounded, the last infimum (is a min and) is finite and can be moved to the left-hand side.
If the two nonempty convex sets \(C_1\) and \(C_2\) satisfy \(\operatorname {ri}C_1\cap \operatorname {ri}C_2=\varnothing \), they can be properly separated.
1.4.2 First consequences of the separation properties
Let \(x\in \partial C\), where \(C\neq \varnothing \) is convex in \(\mathbb {R}^n\) (naturally \(C\neq \mathbb {R}^n\)). There exists a hyperplane supporting \(C\) at \(x\).
Because \(C\), \(\operatorname {cl}C\) and their complements have the same boundary (remember Remark 2.1.9), a sequence \(\{ x_k\} \) can be found such that
\[ x_k\notin\operatorname{cl}C\ \ \text{for all } k,\qquad x_k\to x. \]
For each \(k\) we have by Theorem 4.1.1 some \(s_k\) with \(\| s_k\| =1\) such that
\[ \langle s_k,y\rangle<\langle s_k,x_k\rangle\quad\text{for all } y\in\operatorname{cl}C. \]
Extract a subsequence if necessary so that \(s_k\to s\) (note: \(\|s\|=1\), so \(s\neq 0\)) and pass to the limit to obtain
\[ \langle s,y\rangle\le\langle s,x\rangle\quad\text{for all } y\in C, \]
which is the required result \(\langle s,x\rangle =:r\ge \langle s,y\rangle \) for all \(y\in C\).
Let \(S\subset \mathbb {R}^n\) and \(C:=\operatorname {co}S\). Any \(x\in C\cap \operatorname {bd}C\) can be represented as a convex combination of \(n\) elements of \(S\).
Because \(x\in \operatorname {bd}C\), there is a hyperplane \(H_{s,r}\) supporting \(C\) at \(x\): for some \(s\neq 0\) and \(r\in \mathbb {R}\),
\[ \langle s,y\rangle\le r=\langle s,x\rangle\quad\text{for all } y\in C. \qquad (4.2.1) \]
On the other hand, Carathéodory’s Theorem 1.3.6 implies the existence of points \(x_1,\dots ,x_{n+1}\) in \(S\) and convex multipliers \(\alpha _1,\dots ,\alpha _{n+1}\) such that \(x=\sum _{i=1}^{n+1}\alpha _i x_i\); and each \(\alpha _i\) can be assumed positive (otherwise the proof is finished).
Setting successively \(y=x_i\) in (4.2.1), we obtain by convex combination
\[ 0=\langle s,x\rangle-r=\sum_{i=1}^{n+1}\alpha_i\bigl(\langle s,x_i\rangle-r\bigr), \]
where each term \(\langle s,x_i\rangle-r\) is nonpositive and each \(\alpha_i\) is positive; so each \(\langle s,x_i\rangle - r\) is actually \(0\). Each \(x_i\) is therefore not only in \(S\), but also in \(H_{s,r}\), a set whose dimension is \(n-1\). It follows that our starting \(x\), which is in \(\operatorname {co}\{ x_1,\dots ,x_{n+1}\} \), can be written as a convex combination of only \(n\) among these \(x_i\)’s.
Let \(\varnothing \neq C\subseteq \mathbb {R}^n\) be convex, and set
\[ \Sigma_C:=\{(s,r)\in\mathbb{R}^n\times\mathbb{R} \;:\; \langle s,y\rangle\le r\ \text{for all } y\in C\}. \qquad (4.2.2) \]
The set \(C^*\) defined above, the intersection of the half-spaces associated with \(\Sigma_C\), is the closure of \(C\).
By construction, \(C^*\supseteq \operatorname {cl}C\). Conversely, take \(x\notin \operatorname {cl}C\); we can separate \(x\) and \(\operatorname {cl}C\): there exist \(s_0\neq 0\) and \(r_0\in\mathbb{R}\) such that
\[ \langle s_0,y\rangle\le r_0<\langle s_0,x\rangle\quad\text{for all } y\in\operatorname{cl}C. \]
Then \((s_0,r_0)\in \Sigma _C\); but \(\langle s_0,x\rangle>r_0\), hence \(x\notin C^*\).
The data \((s_j,r_j)\in \mathbb {R}^n\times \mathbb {R}\) for \(j\) in an arbitrary index set \(J\) is equivalent to the data of a closed convex set \(C\) via the relation
\[ C=\bigcap_{j\in J}\{y\in\mathbb{R}^n \;:\; \langle s_j,y\rangle\le r_j\}. \]
If \(C\) is given, define \(\{ (s_j,r_j)\} _{J}:=\Sigma _C\) as in (4.2.2). If \(\{ (s_j,r_j)\} _J\) is given, the intersection of the corresponding half-spaces is a closed convex set. Note that the whole of \(\mathbb {R}^n\) (empty index set) and the empty set are obtained as two extreme cases.
A closed convex polyhedron is an intersection of finitely many half-spaces. Take \((s_1,r_1),\ldots ,(s_m,r_m)\) in \(\mathbb {R}^n\times \mathbb {R}\), with \(s_i\neq 0\) for \(i=1,\ldots ,m\); then define
\[ C:=\{x\in\mathbb{R}^n \;:\; \langle s_i,x\rangle\le r_i\ \text{for } i=1,\dots,m\}, \]
or in matrix notation (assuming the dot-product for \(\langle \cdot ,\cdot \rangle \)),
\[ C=\{x\in\mathbb{R}^n \;:\; Ax\le b\}, \]
if \(A\) is the matrix whose rows are the \(s_i\) and \(b\in \mathbb {R}^m\) has coordinates \(r_i\).
A closed convex polyhedral cone is the special case where \(b=0\).
Let \(K\) be a convex cone with polar \(K^{\circ }\); then, the polar \(K^{\circ \circ }\) of \(K^{\circ }\) is the closure of \(K\).
We exploit Remark 4.1.2: due to its conical character (\(\alpha x\in K\) if \(x \in K\) and \(\alpha >0\)), \(\operatorname {cl}K\) has a very special support function:
\[ \sigma_{\operatorname{cl}K}(s)=\sup_{x\in K}\,\langle s,x\rangle= \begin{cases} 0 & \text{if } s\in K^\circ,\\ +\infty & \text{otherwise}. \end{cases} \]
In other words, \(\sigma _{\operatorname {cl}K}\) is \(0\) on \(K^{\circ }\), \(+\infty \) elsewhere. Thus, the characterization
\[ x\in\operatorname{cl}K \iff \langle s,x\rangle\le\sigma_{\operatorname{cl}K}(s)\ \ \text{for all } s\in\mathbb{R}^n \]
becomes
\[ x\in\operatorname{cl}K \iff \langle s,x\rangle\le 0\ \ \text{for all } s\in K^\circ, \]
in which we recognize the definition of \(K^{\circ \circ }\).
1.4.3 The lemma of Minkowski and Farkas
Let \(b,s_1,\dots ,s_m\) be given in \(\mathbb {R}^n\). The set
\[ \{x\in\mathbb{R}^n \;:\; \langle s_j,x\rangle\le 0\ \text{for } j=1,\dots,m\} \]
is contained in the set
\[ \{x\in\mathbb{R}^n \;:\; \langle b,x\rangle\le 0\} \]
if and only if (see Definition 1.4.5 of a conical hull)
\[ b\in\operatorname{cone}\{s_1,\dots,s_m\}. \]
Let \(b,s_1,\dots ,s_m\) be given in \(\mathbb {R}^n\). Then exactly one of the following statements is true.
- \[ b=\sum_{j=1}^m\alpha_js_j,\qquad \alpha_j\ge 0\ \ \text{for } j=1,\dots,m \]
has a solution \(\alpha \in \mathbb {R}^m\).
- \[ \left\{ \begin{aligned} & \langle b,x\rangle {\gt}0,\\ & \langle s_j,x\rangle \le 0\quad \text{for }j=1,\dots ,m \end{aligned} \right. \]
has a solution \(x\in \mathbb {R}^n\).
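The two alternatives can be checked on tiny examples in \(\mathbb{R}^2\). In this sketch the data are chosen by hand, and the certificates (\(\alpha\) for the first alternative, \(x\) for the second) are exhibited rather than computed:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

s1, s2 = (1.0, 0.0), (0.0, 1.0)

# Case 1: b = s1 + 2*s2 lies in cone{s1, s2} -- the first alternative holds
b = (1.0, 2.0)
alpha = (1.0, 2.0)        # nonnegative certificate, found by inspection
ok1 = all(a >= 0 for a in alpha) and \
      tuple(alpha[0] * s1[k] + alpha[1] * s2[k] for k in range(2)) == b

# Case 2: b2 = (-1, 0) is not in the cone -- the second alternative holds,
# witnessed by an x with <b2, x> > 0 and <s_j, x> <= 0
b2, x = (-1.0, 0.0), (-1.0, 0.0)
ok2 = dot(b2, x) > 0 and dot(s1, x) <= 0 and dot(s2, x) <= 0

print(ok1, ok2)
```

In each case exactly one of the two systems is solvable, as the lemma asserts.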
Let \(s_1,\dots ,s_m\) be given in \(\mathbb {R}^n\). Then the convex cone
\[ K:=\operatorname{cone}\{s_1,\dots,s_m\}=\Bigl\{\sum_{j=1}^m\alpha_js_j \;:\; \alpha_j\ge 0\ \text{for } j=1,\dots,m\Bigr\} \]
is closed.
It is quite similar to that of Carathéodory’s Theorem 1.3.6. First, the proof is easy if the \(s_j\)’s are linearly independent: then, the convergence of a sequence
\[ \Bigl\{\sum_{j=1}^m\alpha_j^ks_j\Bigr\}_k\subset K \]
is equivalent to the convergence of each \(\{ \alpha _j^k\} _k\) to some \(\alpha _j\), which must be nonnegative if each \(\alpha _j^k\) is nonnegative.
Suppose, on the contrary, that the system \(\sum _{j=1}^m\beta _j s_j=0\) has a nonzero solution \(\beta \in \mathbb {R}^m\) and assume \(\beta _j<0\) for some \(j\) (change \(\beta \) to \(-\beta \) if necessary). As in the proof of Theorem 1.3.6, write each \(x\in K\) as
\[ x=\sum_{j=1}^m\alpha_js_j=\sum_{j=1}^m\bigl(\alpha_j+t^*(x)\beta_j\bigr)s_j, \]
where
\[ t^*(x):=\min\Bigl\{\frac{\alpha_j}{-\beta_j} \;:\; \beta_j<0\Bigr\}\ge 0, \]
so that each \(\alpha '_j=\alpha _j+t^*(x)\beta _j\) is nonnegative, one of them vanishing. Letting \(x\) vary in \(K\), we thus construct a decomposition
\[ K=\bigcup_{i=1}^m K_i, \]
where \(K_i\) is the conical hull of the \(m-1\) generators \(s_j,\; j\neq i\).
Now, if there is some \(i\) such that the generators of \(K_i\) are linearly dependent, we repeat the argument for a further decomposition of this \(K_i\). After finitely many such operations, we end up with a decomposition of \(K\) as a finite union of polyhedral convex cones, each having linearly independent generators. All these cones are therefore closed (first part of the proof), so \(K\) is closed as well.
Let \((b,r)\) and \((s_j,\rho _j)\), for \(j\) in an (arbitrary) index set \(J\), be given in \(\mathbb {R}^n\times \mathbb {R}\). Suppose that the system of inequalities
\[ \langle s_j,x\rangle\le\rho_j\quad\text{for all } j\in J \qquad (4.3.6) \]
has a solution \(x\in \mathbb {R}^n\) (the system is consistent). Then the following two properties are equivalent:
(i) \(\langle b,x\rangle \le r\) for all \(x\) satisfying (4.3.6);
(ii) \((b,r)\) is in the closed convex conical hull of \(S:=\{ (0,1)\} \cup \{ (s_j,\rho _j)\} _{j\in J}\).
Let first \((b,r)\) be in \(K:=\operatorname{cone}S\). In other words, there exists a finite set \(\{ 1,\dots ,m\} \subset J\) and nonnegative \(\alpha _0,\alpha _1,\dots ,\alpha _m\) such that (we adopt the convention \(\sum \emptyset =0\))
\[ (b,r)=\alpha_0(0,1)+\sum_{j=1}^m\alpha_j(s_j,\rho_j). \qquad (4.3.7) \]
For each \(x\) satisfying (4.3.6) we can write
\[ \langle b,x\rangle=\sum_{j=1}^m\alpha_j\langle s_j,x\rangle\le\sum_{j=1}^m\alpha_j\rho_j=r-\alpha_0\le r. \]
If, now, \((b,r)\) is in the closure of \(K\), pass to the limit in (4.3.7) to establish the required conclusion (i) for all \((b,r)\) described by (ii).
[(i) \(\Rightarrow \) (ii)] If \((b,r)\notin \operatorname {cl}K\), separate \((b,r)\) from \(\operatorname {cl}K\): equipping \(\mathbb {R}^n\times \mathbb {R}\) with the scalar product
\[ \langle\!\langle(s,\rho),(d,t)\rangle\!\rangle:=\langle s,d\rangle+\rho t, \]
there exists \((d,-t)\in \mathbb {R}^n\times \mathbb {R}\) such that
\[ \sup_{(s,\rho)\in K}\bigl[\langle s,d\rangle-\rho t\bigr]<\langle b,d\rangle-rt. \]
It follows first that the left-hand supremum is a finite number \(\kappa \). Then the conical character of \(K\) implies \(\kappa \le 0\), because \(\alpha \kappa \le \kappa \) for all \(\alpha >0\); actually \(\kappa =0\) because \((0,0)\in K\). In summary, we have singled out \((d,t)\in \mathbb {R}^n\times \mathbb {R}\) such that
\[ (\ast)\ \ \langle s_j,d\rangle\le\rho_j t\ \ \text{for all } j\in J,\quad t\ge 0; \qquad (\ast\ast)\ \ \langle b,d\rangle>rt. \]
Now consider two cases:
If \(t{\gt}0\), divide \((\ast )\) and \((\ast \ast )\) by \(t\) to exhibit the point \(x=d/t\) violating (i).
If \(t=0\), take \(x_0\) satisfying (4.3.6). Observe from \((\ast )\) that, for all \(\alpha >0\), the point \(x(\alpha )=x_0+\alpha d\) satisfies (4.3.6) as well. Yet, let \(\alpha \to +\infty \) in
\[ \langle b,x(\alpha )\rangle = \langle b,x_0\rangle + \alpha \langle b,d\rangle \]
to realize from \((\ast \ast )\) that \(x(\alpha )\) violates (i) if \(\alpha \) is large enough.
Thus we have proved in both cases that "not (ii) \(\Rightarrow \) not (i)".
1.5 Conical approximations of convex sets
1.5.1 Convenient definitions of tangent cones
Let \(S\subset \mathbb {R}^n\) be nonempty. We say that \(d\in \mathbb {R}^n\) is a direction tangent to \(S\) at \(x\in S\) when there exist a sequence \(\{ x_k\} \subset S\) and a sequence \(\{ t_k\} \) such that, when \(k\to +\infty \),
\[ x_k\to x,\qquad t_k\downarrow 0,\qquad \frac{x_k-x}{t_k}\to d. \qquad (5.1.1) \]
The set of all such directions is called the tangent cone (also called the contingent cone, or Bouligand’s cone) to \(S\) at \(x\in S\), denoted by \(T_S(x)\).
A direction \(d\) is tangent to \(S\) at \(x\in S\) if and only if
\[ \text{there exist } \{d_k\}\to d \ \text{ and }\ \{t_k\}\downarrow 0 \ \text{ such that }\ x+t_kd_k\in S\ \text{for all } k. \]
The tangent cone is closed.
Let \(\{ d_\ell \} \subset T_S(x)\) be converging to \(d\); for each \(\ell \) take sequences \(\{ x_{\ell ,k}\} _k\) and \(\{ t_{\ell ,k}\} _k\) associated with \(d_\ell \) in the sense of Definition 5.1.1. Fix \(\ell >0\): we can find \(k_\ell \) such that
\[ t_{\ell,k_\ell}\le\frac1\ell \quad\text{and}\quad \Bigl\|\frac{x_{\ell,k_\ell}-x}{t_{\ell,k_\ell}}-d_\ell\Bigr\|\le\frac1\ell. \]
Letting \(\ell \to \infty \), we then obtain the sequences \(\{ x_{\ell ,k_\ell }\} _\ell \) and \(\{ t_{\ell ,k_\ell }\} _\ell \) which define \(d\) as an element of \(T_S(x)\).
1.5.2 The tangent cones and normal cones to a convex set
The tangent cone to \(C\) at \(x\) is the closure of the cone generated by \(C-\{ x\} \):
\[ T_C(x)=\operatorname{cl}\bigl[\mathbb{R}^+(C-x)\bigr]. \]
We have just said that \(C-\{ x\} \subset T_C(x)\). Because \(T_C(x)\) is a closed cone (Proposition 5.1.3), it immediately follows that \(\operatorname{cl}\bigl[\mathbb {R}^+(C-x)\bigr]\subset T_C(x)\). Conversely, for \(d\in T_C(x)\), take \(\{ x_k\} \) and \(\{ t_k\} \) as in the definition (5.1.1): the point \((x_k-x)/t_k\) is in \(\mathbb {R}^+(C-x)\), hence its limit \(d\) is in the closure of this latter set. □
The direction \(s\in \mathbb {R}^n\) is said to be normal to \(C\) at \(x\in C\) when
\[ \langle s,\,y-x\rangle\le 0\quad\text{for all } y\in C. \qquad (5.2.2) \]
The set of all such directions is called normal cone to \(C\) at \(x\), denoted by \(N_C(x)\).
The normal cone is the polar of the tangent cone.
If \(\langle s,d\rangle \le 0\) for all \(d\in C-x\), the same holds for all \(d\in \mathbb R^+(C-x)\), as well as for all \(d\) in the closure \(T_C(x)\) of the latter. Thus, \(N_C(x)\subset [T_C(x)]^\circ \).
Conversely, take \(s\) arbitrary in \([T_C(x)]^\circ \). The relation \(\langle s,d\rangle \le 0\), which holds for all \(d\in T_C(x)\), a fortiori holds for all \(d\in C-x\subset T_C(x)\); this is just (5.2.2). □
The tangent cone is the polar of the normal cone:
\[ T_C(x)=\bigl[N_C(x)\bigr]^\circ. \]
1.5.3 Some properties of tangent and normal cones
Here, the \(C\)’s are nonempty closed convex sets.
For \(x\in C_1\cap C_2\), there holds
\[ T_{C_1\cap C_2}(x)\subset T_{C_1}(x)\cap T_{C_2}(x) \qquad \text{and}\qquad N_{C_1\cap C_2}(x)\supset N_{C_1}(x)+N_{C_2}(x). \]
With \(C_i\subset \mathbb R^{n_i}\), \(i=1,2\) and \((x_1,x_2)\in C_1\times C_2\),
\[ T_{C_1\times C_2}(x_1,x_2)=T_{C_1}(x_1)\times T_{C_2}(x_2), \qquad N_{C_1\times C_2}(x_1,x_2)=N_{C_1}(x_1)\times N_{C_2}(x_2). \]
With an affine mapping \(A(x)=y_0+A_0x\) (\(A_0\) linear) and \(x\in C\),
\[ T_{A(C)}[A(x)]=\operatorname {cl}[A_0T_C(x)] \qquad \text{and}\qquad N_{A(C)}[A(x)]=A_0^{-*}[N_C(x)]. \]
In particular (start from (ii), (iii) and proceed as when proving (1.2.2)):
\[ T_{C_1+C_2}(x_1+x_2)=\operatorname {cl}[T_{C_1}(x_1)+T_{C_2}(x_2)], \qquad N_{C_1+C_2}(x_1+x_2)=N_{C_1}(x_1)\cap N_{C_2}(x_2). \]
For \(x\in C\) and \(s\in \mathbb {R}^n\), the following properties are equivalent:
\(s\in N_C(x)\);
\(x\) is in the exposed face \(F_C(s)\): \(\langle s,x\rangle =\max _{y\in C}\langle s,y\rangle \);
\(x=p_C(x+s)\).
Nothing really new: everything comes from the definitions of normal cones, supporting hyperplanes, exposed faces, and the characteristic property (3.1.3) of the projection operator.
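The equivalence between \(s\in N_C(x)\) and \(x=p_C(x+s)\) is easy to test when \(C\) is a box, whose projection is a coordinatewise clamp. The data below (box, corner, directions) are hand-picked for illustration:

```python
def proj_box(y):
    # Projection onto C = [0,1]^2: clamp each coordinate
    return tuple(min(max(c, 0.0), 1.0) for c in y)

x = (1.0, 1.0)            # a corner of the box
s = (2.0, 0.5)            # points "outward" at this corner: s in N_C(x)
moved = tuple(a + b for a, b in zip(x, s))
print(proj_box(moved) == x)    # x = p_C(x + s), consistent with s in N_C(x)

# A direction that is not normal at x: the projection moves away from x
s_bad = (-1.0, 0.5)
moved_bad = tuple(a + b for a, b in zip(x, s_bad))
print(proj_box(moved_bad) == x)
```

The first test succeeds and the second fails, matching the characterization (iii) above.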
For given \(x\in C\) and \(d\in \mathbb {R}^n\), there holds
\[ \lim_{t\downarrow 0}\frac{p_C(x+td)-x}{t}=p_{T_C(x)}(d). \]
HINT. Start from the characterization (3.1.3) of a projection, to observe that the difference quotient \([p_C(x+td)-x]/t\) is the projection of \(d\) onto \((C-x)/t\). Then let \(t\downarrow 0\); the result comes as well with the help of (5.1.4) and Remark 5.2.2.
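The limiting behaviour of the difference quotient can be observed numerically. In this sketch \(C\) is the box \([0,1]^2\) and \(x=(0,0)\) one of its corners, where the tangent cone is the nonnegative orthant; the direction \(d\) and the step sizes are arbitrary choices:

```python
def proj_box(y):
    # p_C for C = [0,1]^2
    return tuple(min(max(c, 0.0), 1.0) for c in y)

def proj_orthant(d):
    # p_T for T = T_C((0,0)), the nonnegative orthant of R^2
    return tuple(max(c, 0.0) for c in d)

x, d = (0.0, 0.0), (0.5, -2.0)
target = proj_orthant(d)                 # projection of d onto the tangent cone

quotients = []
for t in (1.0, 0.1, 0.01):
    moved = tuple(a + t * b for a, b in zip(x, d))
    q = tuple((p - a) / t for p, a in zip(proj_box(moved), x))
    quotients.append(q)
    print(q)      # the difference quotient [p_C(x + t d) - x] / t
```

For this particular data the quotient already coincides with \(p_{T_C(x)}(d)=(0.5,0)\) for each of these step sizes.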