Convex Analysis
Convex Analysis An Introductory Text Jan van Tiel Royal Netherlands Meteorological Institute
JOHN WILEY AND SONS Chichester • New York • Brisbane • Toronto • Singapore
Copyright © 1984 by John Wiley & Sons Ltd.
All rights reserved. No part of this book may be reproduced by any means, nor transmitted, nor translated into a machine language without the written permission of the publisher.
Library of Congress Cataloging in Publication Data:
Tiel, Jan van. Convex analysis. Includes bibliographical references and indexes. 1. Convex functions. 2. Convex sets. 3. Convex programming. I. Title. QA331.5.T49 1984 515.8'3 8310176 ISBN 0 471 90263 2 ISBN 0 471 90265 9 (pbk.) British Library Cataloguing in Publication Data:
Tiel, Jan van Convex analysis. 1. Convex functions I. Title 515.8'8 QA331.5 ISBN 0 471 90263 2 ISBN 0 471 90265 9 (pbk.) Filmset and printed in Northern Ireland at the Universities Press (Belfast) Ltd. Bound at the Pitman Press Ltd., Bath, Avon.
Preface

This little book has evolved from my experience in teaching convex analysis at the University of Utrecht, Holland. In theory and applications, convex analysis is of increasing interest at the present time. This book is primarily an introductory text; therefore I have tried to emphasize the basic concepts and the characteristic methods of this part of mathematics (such as separation, subgradient, conjugate function, convex optimization). A large number of elementary exercises at the ends of the various chapters (with answers and hints at the end of the book) are intended to aid in understanding the concepts employed.

The book is intended for the young student who is interested in convexity and whose mathematical background includes the basic facts of calculus, linear algebra, and general topology; it is also supposed that he is acquainted with the basic concepts of functional analysis (such as normed linear space, Hilbert space, dual). In order to convey the flavour of the subject and to arouse the student's interest, I have not restricted myself to the finite-dimensional case one usually deals with in practice. But to keep things as simple as possible, of the class of locally convex spaces, the 'natural' domain of convex analysis, only normed spaces appear in this book. Some historical remarks and additional material are collected in bibliographical notes; of course these are by no means exhaustive.

Chapter 1 summarizes the essentials of the theory of real convex functions on the real line. We also consider some generalizations to functions which can have infinite values.

Chapter 2 studies algebraic properties of convex sets in a linear space. In the case of a linear topological space, we find some topological properties of convex sets.

Chapter 3 develops the theory of separation in a linear space. Applying this theory in the case of a linear topological space yields the Hahn-Banach theorem.

Chapter 4 considers some classical theorems concerning convex subsets of $\mathbb{R}^n$ and some applications to polyhedral cones. Using the notion of relative interior, we study separation in $\mathbb{R}^n$.

Chapter 5 studies convex functions on a linear space which can have infinite values. In a certain sense, local boundedness turns out to be equivalent to continuity. We study the important concepts of lower semicontinuity and subdifferentiability.

Chapter 6 develops the theory of duality. We find characterizations of the bipolar function and of support functions.

Chapter 7 gives an impression of the meaning of convexity in optimization. It deals mainly with convex programming (Kuhn-Tucker conditions, saddle points and Fenchel's duality theorem).

I am indebted to Professor John Horváth who suggested the writing of an English version of my lecture notes. I wish to thank my colleagues Tineke de Bunje and Leen Roozemond who have read all or part of the manuscript and made many improvements. Finally, my thanks go to Mrs. M. M. Meijer who spent many hours typing the manuscript.

Jan van Tiel
Contents

Preface

Chapter 1 Convex Functions on $\mathbb{R}$
    Real convex functions
    Midpoint convexity
    Differentiable convex functions
    Theorems concerning integrals
    The conjugate function
    Convex functions with values in $\overline{\mathbb{R}}$
    Generalizations
    Exercises
    Notes

Chapter 2 Convex Subsets of a Linear Space
    Convex hull and affine hull
    Convex polytopes
    Algebraic interior and algebraic closure
    Convex algebraic bodies
    Convex subsets of a linear topological space
    Exercises
    Notes

Chapter 3 Separation Theorems
    Separation in a linear space
    Separation in a linear topological space
    The Hahn-Banach theorem
    Theorems in a normed linear space
    Exercises

Chapter 4 Convex Subsets of $\mathbb{R}^n$
    Some classical theorems
    The relative interior
    Separation in $\mathbb{R}^n$
    Polyhedral cones
    Exercises
    Notes

Chapter 5 Convex Functions on a Linear Space
    The epigraph
    Lower semicontinuity
    Convexity
    Continuity
    Continuity and lower semicontinuity in $\mathbb{R}^n$
    Differentiable convex functions
    Subdifferentiability
    Exercises
    Notes

Chapter 6 Duality
    The conjugate function
    The bipolar function
    The set $\Gamma(E)$
    Support functions
    Exercises
    Notes

Chapter 7 Optimization
    Convex programming in $\mathbb{R}^n$
    Saddle points
    Fenchel's duality theorem
    Proximity mappings
    Monotone operators
    Notes

Answers and Hints

Glossary

Subject Index
CHAPTER 1

Convex Functions on $\mathbb{R}$

In this chapter we shall designate by $I$ a (closed, open or half-open, finite or infinite) interval in $\mathbb{R}$.
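REAL CONVEX FUNCTIONS

1.1 Definitions

Let $f$ be a function $I \to \mathbb{R}$.

(a) $f$ is said to be convex if

$$f(\lambda a + (1 - \lambda) b) \le \lambda f(a) + (1 - \lambda) f(b) \tag{1}$$

for all $a, b \in I$ and all $\lambda \in \mathbb{R}$ with $0 < \lambda < 1$.

(b) [...]

[...] $\mathbb{R}$:

(a)

$$f(x) \le \frac{b - x}{b - a}\, f(a) + \frac{x - a}{b - a}\, f(b)$$

for all $a, b, x \in I$ with $a < x < b$. Note that the right-hand side of this inequality can be written as

$$f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a).$$

Figure 1

(b)

$$f(\lambda a + \mu b) \le \lambda f(a) + \mu f(b)$$

for all $a, b \in I$ and all $\lambda, \mu \in \mathbb{R}$ such that $\lambda > 0$, $\mu > 0$, $\lambda + \mu = 1$.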
1.3 The proof of the following simple properties is left to the reader.

(a) If $f$ and $g$ are convex functions and $\alpha \ge 0$, $\beta \ge 0$, then $\alpha f + \beta g$ is convex.

(b) The sum of finitely many convex functions is convex.

(c) The (pointwise) limit of a convergent sequence of convex functions is convex.

(d) Let $f: I \to \mathbb{R}$ be convex. Then

$$\sum_{i=1}^{n} \lambda_i x_i \in I \quad\text{and}\quad f\Big(\sum_{i=1}^{n} \lambda_i x_i\Big) \le \sum_{i=1}^{n} \lambda_i f(x_i)$$

whenever $x_i \in I$, $\lambda_i \ge 0$ $(1 \le i \le n)$, $\sum_{i=1}^{n} \lambda_i = 1$.

(e) Let $f$ be the pointwise supremum of an arbitrary collection of convex functions $I \to \mathbb{R}$. If $f$ is finite everywhere on $I$, then $f$ is convex. Does an analogous proposition hold for the infimum?
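Properties (d) and (e) can be probed numerically. The following is only a sketch under stated assumptions: the test functions `square`, `f1`, `f2`, the sample points and the helper names are illustrative choices, not part of the text, and a finite check on sample points illustrates the statements without proving them.

```python
import random

def jensen_gap(f, xs, weights):
    """Return sum(w_i * f(x_i)) - f(sum(w_i * x_i)); nonnegative for convex f."""
    mean = sum(w * x for w, x in zip(weights, xs))
    return sum(w * f(x) for w, x in zip(weights, xs)) - f(mean)

def is_midpoint_convex_on(f, points):
    """Check f((a+b)/2) <= (f(a)+f(b))/2 for all pairs of sample points."""
    return all(f((a + b) / 2) <= (f(a) + f(b)) / 2
               for a in points for b in points)

# Property (d): a convex combination with random weights (assumed example).
random.seed(0)
xs = [random.uniform(-3, 3) for _ in range(5)]
raw = [random.random() for _ in range(5)]
weights = [r / sum(raw) for r in raw]      # nonnegative, summing to 1
square = lambda x: x * x
gap = jensen_gap(square, xs, weights)

# Property (e): supremum versus infimum of two convex parabolas.
f1 = lambda x: (x - 1) ** 2
f2 = lambda x: (x + 1) ** 2
upper = lambda x: max(f1(x), f2(x))        # pointwise supremum
lower = lambda x: min(f1(x), f2(x))        # pointwise infimum
grid = [i / 4 for i in range(-12, 13)]     # dyadic points, exact in floats

sup_ok = is_midpoint_convex_on(upper, grid)
inf_ok = is_midpoint_convex_on(lower, grid)
```

In this particular example the infimum fails the midpoint test at $a = -1$, $b = 1$, which suggests an answer to the question posed in (e).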
1.4 Theorem

Let $f: I \to \mathbb{R}$ be convex. Then

$$\frac{f(x) - f(a)}{x - a} \le \frac{f(b) - f(a)}{b - a} \le \frac{f(b) - f(x)}{b - x} \tag{2}$$
Figure 2
whenever a, b, XE I, a <x R be convex, and let C e int(/). Let [a, b] cE I such that a
$$\frac{f(x) - f(c)}{x - c}$$

is nondecreasing on $(c, b]$. Hence the right derivative

$$f'_+(c) := \lim_{x \downarrow c} \frac{f(x) - f(c)}{x - c}$$

exists. In a similar way it can be proved that the left derivative $f'_-(c)$ exists. If $a <$ [...] $\mathbb{R}$ be convex. Then $f$ has a right derivative and a left derivative at every point of $\operatorname{int}(I)$, and $f'_-$ and $f'_+$ are nondecreasing on $\operatorname{int}(I)$. If $c \in \operatorname{int}(I)$, we have

$$f'_-(c) \le f'_+(c)$$

and

$$f(x) \ge f(c) + f'_-(c)(x - c), \qquad f(x) \ge f(c) + f'_+(c)(x - c)$$

for all $x \in I$ (cf. Figure 3).
Remark. Let $f: [a, b] \to \mathbb{R}$ be convex. The above proof shows that in this case $f'_+(a)$ and $f'_-(b)$ exist if $+\infty$ and $-\infty$ are allowed as limits.
Figure 3
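The monotonicity of the difference quotients and the two support inequalities can be observed on a concrete function. The sketch below assumes the example $f(x) = |x| + x^2$ at $c = 0$ (not taken from the text), for which the one-sided derivatives are $-1$ and $1$; the helper names are hypothetical.

```python
def right_quotients(f, c, steps):
    """Difference quotients (f(c+h) - f(c)) / h for the given h > 0."""
    return [(f(c + h) - f(c)) / h for h in steps]

def left_quotients(f, c, steps):
    """Difference quotients (f(c) - f(c-h)) / h for the given h > 0."""
    return [(f(c) - f(c - h)) / h for h in steps]

def f(x):
    # Assumed example: convex, but not differentiable at 0.
    return abs(x) + x * x

steps = [2.0 ** -k for k in range(1, 20)]   # h decreasing towards 0
right = right_quotients(f, 0.0, steps)      # equals 1 + h, decreasing to 1
left = left_quotients(f, 0.0, steps)        # equals -(1 + h), increasing to -1

# Both one-sided slopes give lines through (0, f(0)) lying nowhere above f.
grid = [i / 10 for i in range(-30, 31)]
supports = all(f(x) >= f(0.0) + m * (x - 0.0)
               for x in grid for m in (-1.0, 1.0))
```

As $h$ decreases, the right quotients decrease and the left quotients increase, and every left quotient stays below every right quotient, in line with $f'_-(c) \le f'_+(c)$.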
1.7

$f: I \to \mathbb{R}$ is called Lipschitzian relative to $I_0 \subset I$ if there exists $K > 0$ such that $|f(x) - f(y)| \le K\,|x - y|$ for all $x, y \in I_0$. This condition implies that $f$ is continuous and even uniformly continuous relative to $I_0$, and of bounded variation on every closed bounded subinterval of $I_0$.

Theorem. Let $f: I \to \mathbb{R}$ be convex and $[a, b] \subset \operatorname{int}(I)$. Then

(a) $f$ is Lipschitzian relative to $[a, b]$.
(b) $f$ is continuous on $\operatorname{int}(I)$.
Proof. There exist $c, d \in I$ such that $c$ [...] $0$, $y > 0$, $\lambda > 0$, $\mu > 0$ and $\lambda + \mu = 1$. This inequality can be derived by using the (strict) convexity of the function $x \mapsto e^x$ in the following form:

$$\exp(\lambda \log x + \mu \log y) \le \lambda \exp(\log x) + \mu \exp(\log y).$$

Other well-known ways of presenting (7) are [...] $0$, $y > 0$, $p > 1$, $q > 1$ and $1/p + 1/q = 1$. For $p = q = 2$, (8) is the well-known inequality $\sqrt{xy} \le \tfrac{1}{2}(x + y)$.
THEOREMS CONCERNING INTEGRALS

1.13 Theorem

Let $f$ be a function $(a, b) \to \mathbb{R}$. Then $f$ is convex if and only if $f$ can be represented in the form

$$f(x) = f(c) + \int_c^x g(t)\,dt \qquad (c, x \in (a, b)) \tag{10}$$

where $g$ is a nondecreasing right-continuous function $(a, b) \to \mathbb{R}$.

Proof. 'Only if': let $f$ be convex and $c, x \in (a, b)$. By Theorem 1.6 and § 1.8, $f'_+$ exists and is nondecreasing and right-continuous. Set

$$h(\varepsilon) := \int_c^x \frac{f(t + \varepsilon) - f(t)}{\varepsilon}\,dt.$$

We have

$$\lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\,[f(t + \varepsilon) - f(t)] = f'_+(t)$$

By
(a R be convex and ai E [a, b] (1 i —
Then we have

$$f\Big(\frac{1}{n}\sum_{i=1}^{n} a_i\Big) \le \frac{1}{n}\sum_{i=1}^{n} f(a_i). \tag{12}$$

(12) is a theorem on the arithmetic mean (a.m.) of $n$ numbers: $f(\text{a.m. of } a_1, a_2, \ldots, a_n) \le \text{a.m. of } f(a_1), f(a_2), \ldots, f(a_n)$.
This theorem has an analogue for the mean value of a function:

Theorem (Jensen's inequality). Let $f: (a, b) \to \mathbb{R}$ be convex, and let $g: [c, d] \to (a, b)$ be continuous. Then

$$f\Big(\frac{1}{d - c}\int_c^d g(x)\,dx\Big) \le \frac{1}{d - c}\int_c^d f(g(x))\,dx.$$

Proof. Setting

$$p := \frac{1}{d - c}\int_c^d g(x)\,dx$$

we have $p \in (a, b)$. By Theorem 1.6, we have

$$f(y) \ge f(p) + f'_+(p)(y - p)$$

whenever $y \in (a, b)$, and therefore

$$f(g(x)) \ge f(p) + f'_+(p)\,[g(x) - p]$$

whenever $x \in [c, d]$. Integrating the last inequality over $[c, d]$ yields the stated result.
Remarks

(a) In this theorem we may replace $g$ by a function which is only integrable in the Lebesgue sense over $[c, d]$.

(b) Jensen's inequality has the next analogue in probability theory, which can be proved in a similar way. Let $X$ be a probability space, with probability measure $\mu$ (so that $\mu(X) = 1$). Let $f: (a, b) \to \mathbb{R}$ be convex, and let $g: X \to (a, b)$ be $\mu$-integrable. Then

$$f\Big(\int_X g\,d\mu\Big) \le \int_X f \circ g\,d\mu.$$

In probabilistic terms: if $x$ is a random variable on $X$, then we have $f(Ex) \le E[f(x)]$ where $Ex$ is the expectation of $x$.
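Jensen's inequality for the mean value of a function can be checked numerically. The sketch below makes several assumptions that are not in the text: $f = \exp$, $g = \sin$ on $[c, d] = [0, 2]$, and a simple midpoint rule (the hypothetical helper `midpoint_integral`) in place of the exact integrals.

```python
import math

def midpoint_integral(func, c, d, n=10_000):
    """Approximate the integral of func over [c, d] with the midpoint rule."""
    width = (d - c) / n
    return width * sum(func(c + (k + 0.5) * width) for k in range(n))

f = math.exp                    # convex on the whole real line
g = math.sin                    # continuous on [c, d]
c, d = 0.0, 2.0

mean_g = midpoint_integral(g, c, d) / (d - c)
lhs = f(mean_g)                                              # f(mean of g)
rhs = midpoint_integral(lambda x: f(g(x)), c, d) / (d - c)   # mean of f(g)
```

Because the midpoint rule averages with equal weights, the computed quantities are themselves an instance of the discrete inequality (12), so `lhs <= rhs` is expected here up to rounding.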
THE CONJUGATE FUNCTION

1.15 Theorem

A function $f: \mathbb{R} \to \mathbb{R}$ is convex if and only if there exists a function $g: \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ such that

$$f(x) = \sup_{y \in \mathbb{R}}\,[xy - g(y)]$$

for all $x \in \mathbb{R}$.

Proof. 'If': we have

$$f(x) = \sup_{y \in \mathbb{R}}\,[xy - g(y)].$$

We see that $f$ is the pointwise supremum of a collection of affine (and hence convex) functions, so by § 1.3(e), $f$ is convex.

'Only if': we define a function $g: \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ by

$$g(y) = \sup_{x \in \mathbb{R}}\,[xy - f(x)].$$

Let $x_0 \in \mathbb{R}$. For any $y \in \mathbb{R}$,

$$x_0 y - f(x_0) \le g(y)$$

hence

$$x_0 y - g(y) \le f(x_0).$$

It follows that

$$\sup_{y \in \mathbb{R}}\,[x_0 y - g(y)] \le f(x_0). \tag{13}$$

Set $y_0 := f'_+(x_0)$. By Theorem 1.6, for any $x \in \mathbb{R}$

$$f(x) \ge f(x_0) + f'_+(x_0)(x - x_0) = f(x_0) + y_0 (x - x_0)$$

hence

$$x y_0 - f(x) \le x_0 y_0 - f(x_0).$$

It follows that

$$g(y_0) = x_0 y_0 - f(x_0)$$

and so

$$x_0 y_0 - g(y_0) = f(x_0). \tag{14}$$

Combining (13) and (14) we obtain the stated result.
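The two suprema in the proof can be approximated on a grid. This is a minimal sketch, assuming the example $f(x) = x^2/2$ (whose conjugate is $g(y) = y^2/2$) and replacing each supremum over $\mathbb{R}$ by a maximum over a finite grid; the helper name `conjugate_on_grid` is hypothetical.

```python
def conjugate_on_grid(f, xs, ys):
    """Grid approximation of g(y) = sup_x [x*y - f(x)], as a dict keyed by y."""
    return {y: max(x * y - f(x) for x in xs) for y in ys}

f = lambda x: 0.5 * x * x                    # assumed example function
grid = [i / 100 for i in range(-500, 501)]   # grid on [-5, 5]
targets = [i / 2 for i in range(-4, 5)]      # points where f is recovered

g = conjugate_on_grid(f, grid, grid)         # approximates y -> y*y/2

# Second conjugation: sup_y [x*y - g(y)] should give back f(x).
recovered = {x: max(x * y - g[y] for y in grid) for x in targets}
errors = [abs(recovered[x] - f(x)) for x in targets]
```

For this $f$ the maximizing grid point is $x = y$ (respectively $y = x$), so the grid values agree with the exact conjugate up to rounding; for other functions a finer or wider grid may be needed.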
1.16 The above function $g$ is called the conjugate of $f$. $f$ and $g$ form a pair of functions satisfying the inequality

$$f(x) + g(y) \ge xy \tag{15}$$

for all $x, y \in \mathbb{R}$. We give the following geometrical interpretation of Theorem 1.15 (cf. Figure 6). A line $m$ with slope $y$ and intercept $-a$ lies nowhere above the graph of $f$ if and only if, for any $x \in \mathbb{R}$,

$$yx - a \le f(x)$$

hence

$$a \ge xy - f(x).$$

Figure 6

The smallest number $a$ which satisfies this inequality is

$$\sup_{x \in \mathbb{R}}\,[xy - f(x)] = g(y).$$

Therefore, translating $m$ upwards as far as possible, we obtain a line $n(y)$ that intersects the graph of $f$ and whose intercept equals $-g(y)$. Theorem 1.15 tells us that the graph of $f$ is the envelope of the lines $n(y)$ $(y \in \mathbb{R})$ if and only if $f$ is convex. The reader is urged to give a geometrical interpretation of the choice of $y_0$ in the proof of the theorem, and also of the statement '$g(y) =$ [...]'
1.17 Examples

(a) Let $p > 1$, $f(x) = |x|^p / p$ $(x \in \mathbb{R})$. Then

$$g(y) = \frac{1}{q}\,|y|^q$$

where $1/p + 1/q = 1$. Hence, by (15),

$$xy \le \frac{1}{p}\,|x|^p + \frac{1}{q}\,|y|^q \tag{16}$$

for all real $x$ and $y$. (Cf. § 1.12.)

(b) Let $f: [0, \infty) \to \mathbb{R}$ be strictly increasing and continuous and $f(0) = 0$. Let $g$ be the inverse function of $f$.

Figure 7

We define $F$ and $G$ by
ix f(t) dt if x _ 0
F(x) {
0
if x0, but also the following less obvious ones: $0 \cdot (+\infty) = (+\infty) \cdot 0 = 0 \cdot (-\infty) = (-\infty) \cdot 0 = 0$.

The expression $+\infty - \infty$ is undefined. In the sequel we generalize the concept of convex function.
1.19 Definition A function f: R >R is said to be convex if for all x, y, A, IL, y E R such that f(x) R. (a)
$f$ is said to be quasiconvex if

$$f(\lambda a + (1 - \lambda) b) \le f(b)$$

for all $a, b \in I$ with $f(a) \le f(b)$ and all $\lambda \in (0, 1)$.

(b) $f$ is said to be strictly quasiconvex if

$$f(\lambda a + (1 - \lambda) b) < f(b)$$

for all $a, b \in I$ with $f(a) < f(b)$ and all $\lambda \in (0, 1)$.

A strictly quasiconvex function is not necessarily quasiconvex (cf. Exercise 10).
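The closing remark can be illustrated on a small example. The sketch below assumes the function `spike` (equal to 1 at 0 and to 0 elsewhere), which is an illustrative choice and not taken from the text; exact rational arithmetic is used so that the point 0 is hit exactly, and only finitely many sample points are searched.

```python
from fractions import Fraction
from itertools import product

def spike(x):
    """Assumed example: 1 at the origin, 0 everywhere else."""
    return 1 if x == 0 else 0

def quasiconvex_violation(f, points, lambdas):
    """Return (a, b, lam) violating definition (a), or None if none is found."""
    for a, b, lam in product(points, points, lambdas):
        if f(a) <= f(b) and f(lam * a + (1 - lam) * b) > f(b):
            return (a, b, lam)
    return None

def strict_quasiconvex_violation(f, points, lambdas):
    """Return (a, b, lam) violating definition (b), or None if none is found."""
    for a, b, lam in product(points, points, lambdas):
        if f(a) < f(b) and not f(lam * a + (1 - lam) * b) < f(b):
            return (a, b, lam)
    return None

points = [Fraction(i, 2) for i in range(-4, 5)]    # -2, -3/2, ..., 2
lambdas = [Fraction(k, 8) for k in range(1, 8)]    # 1/8, ..., 7/8

plain = quasiconvex_violation(spike, points, lambdas)
strict = strict_quasiconvex_violation(spike, points, lambdas)
```

On these samples `strict` is `None` while `plain` is a triple such as $a = -2$, $b = 2$, $\lambda = 1/2$: the midpoint 0 has a larger value than both endpoints.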
EXERCISES

1 Let $f$ be a function $(a, b) \to \mathbb{R}$. Prove the following statements:
(a) If $f$ is convex, then $f$ is monotone (nondecreasing or nonincreasing) or there exists $c \in (a, b)$ such that $f$ is nonincreasing to the left of $c$ and nondecreasing to the right.
(b) If $f$ is convex, then every local (relative) minimum point of $f$ is a global (absolute) minimum point.
(c) If $f$ is strictly convex, then $f$ has at most one global minimum point.

2 Let $f: (a, b) \to \mathbb{R}$ be convex and $c \in (a, b)$. Prove that $f$ is differentiable at $c$ if and only if

$$\lim_{h \downarrow 0} \frac{f(c + h) + f(c - h) - 2 f(c)}{h} = 0.$$

3 Let $f$ be a function $(a, b) \to \mathbb{R}$. Prove that $f$ is convex if and only if for every point $P$ of the graph of $f$ there exists at least one line through $P$ that lies nowhere above the graph of $f$.

4 (a) Let $x_1, \ldots, x_n$, $r_1, \ldots, r_n$ be positive real numbers such that $\sum_{i=1}^{n} r_i = 1$. Prove that

$$\prod_{i=1}^{n} x_i^{r_i} \le \sum_{i=1}^{n} r_i x_i.$$

(b) Show that the geometric mean of $n$ positive real numbers is not greater than their arithmetic mean.
(c) Prove Hölder's inequality:

$$\sum_{i=1}^{n} a_i b_i \le \Big(\sum_{i=1}^{n} a_i^p\Big)^{1/p} \Big(\sum_{i=1}^{n} b_i^q\Big)^{1/q}$$

where $a_1, \ldots, a_n$, $b_1, \ldots, b_n$ are nonnegative real numbers and $p > 1$, $q > 1$, $1/p + 1/q = 1$. Putting $p = q = 2$, we obtain Cauchy's inequality.

5 Prove that a bounded convex function $f: \mathbb{R} \to \mathbb{R}$ is a constant.

6 A positive function $f$ is said to be logarithmically convex (l.c.) if $\log f$ is convex. Let $f$ and $g$ be twice differentiable functions $\mathbb{R} \to \mathbb{R}$. Prove the following statements:
(a) $f$ is l.c. if and only if $f > 0$ and $f'' f \ge (f')^2$.
(b) If $f$ is l.c., then $f$ is convex.
(c) If $f$ and $g$ are l.c. and $\alpha > 0$, $\beta > 0$, then $\alpha f + \beta g$ is l.c.
(d) Let $a_1, \ldots, a_n$ be positive real numbers. Then the function

$$x \mapsto \log(a_1^x + \ldots + a_n^x)$$

is convex on $\mathbb{R}$.

7 Let $f$ and $g$ be convex functions $\mathbb{R} \to \mathbb{R}$. If $f$ is nondecreasing, show that the function $x \mapsto f(g(x))$ is convex.

8 Show that in § 1.17, example (b) the function $G$ is the restriction of the conjugate of $F$.

9 Let $f$ be a function $\mathbb{R} \to \mathbb{R} \cup \{+\infty\}$. Prove that $f$ is convex if and only if

$$f(\lambda a + (1 - \lambda) b) \le \lambda f(a) + (1 - \lambda) f(b)$$

for all $a, b \in \mathbb{R}$ and all $\lambda \in (0, 1)$.

10 Let $f$ be a function $(a, b) \to \mathbb{R}$.
(a) Prove that $f$ is quasiconvex if and only if for each $\alpha \in \mathbb{R}$ the set $\{x \in (a, b) \mid f(x) \le \alpha\}$ is convex.
(b) Show that strict quasiconvexity does not imply quasiconvexity.
(c) If $f$ is continuous and strictly quasiconvex, show that $f$ is quasiconvex.
NOTES

1 It follows from Theorem 1.10 that for continuous functions convexity and midpoint convexity are equivalent; this result is in J. L. W. V. Jensen, Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta Math. 30 (1906) 175-93. Much sharper results are in: H. Blumberg, On convex functions, Trans. Am. Math. Soc. 20 (1919) 40-4, W. Sierpinski, Sur les fonctions convexes mesurables, Fund. Math. 1 (1920) 125-8, A. Ostrowski, Zur Theorie der konvexen Funktionen, Comm. Math. Helvetici 1 (1929) 157-9, and in the literature cited in these articles. For example, a measurable midpoint convex function is convex (and hence continuous).

2 For more inequalities (cf. § 1.12), the reader is referred to the book by G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge, Cambridge University Press, 1934.

3 The first treatment of pairs of functions $f$, $g$ satisfying the inequality

$$f(x) + g(y) \ge xy$$

for all $x, y \in \mathbb{R}$ (cf. § 1.16) is given in Z. W. Birnbaum and W. Orlicz, Über die Verallgemeinerung des Begriffes der zueinander konjugierten Potenzen, Studia Math. 3 (1931) 1-67. This is also the first paper where the concept of conjugate function is used (under the name of 'komplementäre Funktion'; in this paper, 'konjugiert' has a different meaning). The basic results in the article of Birnbaum and Orlicz are also in the first chapter of M. A. Krasnosel'skii and Ya. B. Rutickii, Convex functions and Orlicz spaces, Groningen, Noordhoff, 1961. Young uses the inequality (17) (§ 1.17) in W. H. Young, On classes of summable functions and their Fourier series, Proc. R. Soc. A 87 (1912) 225-9.

4 Theorem 1.15 is due to S. Mandelbrojt, Sur les fonctions convexes, C. R. Acad. Sci. 209 (1939) 977-8. The convexity of a function $f: \mathbb{R} \to \mathbb{R}$ can be expressed as follows:

$$\begin{vmatrix} 1 & 1 & 1 \\ x & y & z \\ f(x) & f(y) & f(z) \end{vmatrix} \ge 0$$
whenever x 1 such that Ax c Ca, hence Ap(x) = p(Ax)1, and thus p(x) < 1. Utilizing the above, we conclude that (Ca) i = e V I p(x) al. The following conditions are equivalent: (a) H is closed. (b) int(A) 0. (c) f is continuous.
Proof. (a) $\Rightarrow$ (b): Let $x_0 \in A$. Since $H$ is closed, there exists a neighborhood $U$ of $x_0$ such that $U \cap H = \emptyset$. By § 3.6, there exists a neighborhood $S$ of $x_0$ which is star-shaped relative to $x_0$, and such that $S \subset U$. It follows that $f(S) > \alpha$, hence $S \subset A$. We conclude that $A$ is open.
36 (c): let xo e int(A), and let x E A, x xo . Since f(x 0)> a, f(x)> a there exists z e A such that x e (z, x0). By Theorem 2.23 ( 3), utilizing the convexity of A, we find that x e int(A). It follows that A is open. Let (o, T) OE R. We will show that f 1 (a, T) is open (which implies the continuity of f). Let b E E such that f(b) = 1. We have
(13)
f(x) (a, T) o < f(x)< T f(x +(a  a) b) = f (x) + a  o >a and f ((a + T)b  x) = a +
f(x) >a
0. If A and B are two subsets of such a space, we define: ,
$$d(A, B) := \inf\{\|a - b\| \mid a \in A,\ b \in B\}.$$

(b) Finite-dimensional spaces. These are topologically isomorphic (linearly homeomorphic) to some $\mathbb{R}^n$, endowed with some norm topology (all norms for $\mathbb{R}^n$ are equivalent). Convexity in $\mathbb{R}^n$ will be studied in the next chapter.
THEOREMS IN A NORMED LINEAR SPACE
3.12 Theorem

Let $E$ be a normed linear space. Let $C \subset E$ be closed, convex and nonempty, and let $a \notin C$. Then there exists a closed hyperplane in $E$ separating $C$ and $a$ strictly.

Proof. Since $C$ is closed, we have
o := d(a, C) = inf Ilx — all > 0
xEc
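[...]

Definitions. Let $E$ be a normed linear space. Let $E'$ be the dual of $E$, that is the linear space of all continuous linear functions $E \to \mathbb{R}$. If $x \in E$, $u \in E'$, we write $(x \mid u)$ instead of $u(x)$. Let $K \subset E$ be a cone.

(a) The polar of $K$, denoted by $K^\circ$, is given by

$$K^\circ := \{u \in E' \mid (K \mid u) \le 0\}$$

where we have written $(K \mid u) \le 0$ instead of $(\forall x \in K)\,(x \mid u) \le 0$.

(b) The bipolar of $K$, denoted by $K^{\circ\circ}$, is given by

$$K^{\circ\circ} := \{x \in E \mid (x \mid K^\circ) \le 0\}.$$

The reader can easily verify that $K^\circ$ and $K^{\circ\circ}$ are convex cones containing 0. We have $K \subset K^{\circ\circ}$. We will next show that $K^{\circ\circ}$ is closed. In fact, let $(x_n)$ be a sequence in $K^{\circ\circ}$ such that $x_n \to x$. If $u \in K^\circ$, then $(x_n \mid u) \le 0$ for all $n \in \mathbb{N}$, hence $(x \mid u) \le 0$ (in virtue of the continuity of $u$) and so $x \in K^{\circ\circ}$. It follows that $\overline{K} \subset K^{\circ\circ}$.

Theorem. If $K \subset E$ is a nonempty convex cone, then $K^{\circ\circ} = \overline{K}$.

Proof. We have already shown that $\overline{K} \subset K^{\circ\circ}$. The reader can easily verify that $\overline{K}$ is a convex cone containing 0. Let $a \notin \overline{K}$. By Theorem 3.12, there exist $u_0 \in E'$ and $\alpha \in \mathbb{R}$ such that

$$(\overline{K} \mid u_0) < \alpha \tag{22}$$

and

$$(a \mid u_0) > \alpha.$$

We have $0 \in \overline{K}$, hence $\alpha > 0$, so that

$$(a \mid u_0) > 0. \tag{23}$$

Since $\lambda \overline{K} = \overline{K}$ for all $\lambda > 0$, (22) implies $(\overline{K} \mid u_0) \le 0$, hence $u_0 \in K^\circ$. Therefore, it follows from (23) that $a \notin K^{\circ\circ}$. We conclude that $K^{\circ\circ} \subset \overline{K}$, hence $K^{\circ\circ} = \overline{K}$.

EXERCISES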
1 Let $E$ be a linear topological space. Let $A, B \subset E$ be open, convex and nonempty such that $A \cap B = \emptyset$. Show that there exists a closed hyperplane in $E$ separating $A$ and $B$ strictly.

2 Let $E$ be a normed linear space. Let $A \subset E$ be closed, convex and nonempty. Let $B \subset E$ be compact, convex and nonempty such that $A \cap B = \emptyset$. Show that there exists a closed hyperplane in $E$ separating $A$ and $B$ strictly (cf. Exercise 1 and Theorem 3.12).

3 Let $E$ be a linear topological space. Let $A \subset E$ be convex such that $\operatorname{int}(A) \ne \emptyset$. Prove that $x \in \operatorname{int}(A)$ if and only if for every hyperplane $H$ containing $x$ there exist at least two points of $A$ which are separated strictly by $H$.

4 Give an example of a convex algebraic body that is not a convex body (cf. § 2.26). [Hint: study Exercises 8 and 10 of Chapter 2.]

5 Let $K$ be an open convex cone in a normed linear space. Show that $K + \overline{K} \subset K$.

6 Let $K_1$, $K_2$ be nonempty cones in a normed linear space. Prove the

following statements: (a) (K 1 + K2)() = K? n 10. There exists N EN such that for all j, in > N we have
h(Di, D) hence
e(D,,,, D1 ) ••
46 and so e(A m , D1 ) E. It follows that e(A, e whenever j > N. Next, there exists MEN such that e(A, n, A) £ whenever in M. In fact, suppose this were not so. Then there would exist a sequence (x„) such that, for all i EN, x„ c A, and d(x, A)> E, where d is the Euclidean metric. Since A 1 is compact, this sequence would have a convergent subsequence (yi ) Let E, but also y E A, a contradiction. it yi > y. This would imply d(y, follows that there exists M EN such that e(Dr„, A) e for all in >M. We conclude that h(Dr,„ A) e whenever rn >rnax(N, M). THE RELATIVE INTERIOR 4.7
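Theorem

Let $C \subset \mathbb{R}^n$ be convex. The following conditions are equivalent:
(a) $C$ is a convex algebraic body.
(b) $C$ is a convex body.
(c) $\dim(C) = n$.

Proof. (c) $\Rightarrow$ (b): by Theorem 4.2, $C$ contains an $n$-simplex $S = \operatorname{co}(x_0, x_1, \ldots, x_n)$. It is left to the reader to prove that

$$\operatorname{int}(S) = \Big\{\sum_{i=0}^{n} \lambda_i x_i \;\Big|\; \lambda_i > 0 \ (0 \le i \le n),\ \sum_{i=0}^{n} \lambda_i = 1\Big\}$$

hence $\operatorname{int}(S) \ne \emptyset$. It follows that $\operatorname{int}(C) \ne \emptyset$.

(b) $\Rightarrow$ (a): this is true in virtue of Theorem 2.23(c).

(a) $\Rightarrow$ (c): suppose $C$ were a convex algebraic body such that $\dim(C) < n$. This would imply $\operatorname{aff}(C) \ne \mathbb{R}^n$. Let $c \in C$ and $x \in \mathbb{R}^n \setminus \operatorname{aff}(C)$. Let $m$ be the line through $x$ and $c$. We thus would have $m \cap \operatorname{aff}(C) = \{c\}$, hence $m \cap C \subset \{c\}$, a contradiction.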
4.8 Definition

Let $E$ be a linear topological space and $A \subset E$. The relative interior $\operatorname{ri}(A)$ of $A$ is defined as the interior of $A$ regarded as a subset of $\operatorname{aff}(A)$ (with the relative topology).

Remark. In $\mathbb{R}^n$, it is of no use to introduce the notion of relative closure. In fact, since every affine set in $\mathbb{R}^n$ is closed, the closure $\overline{A}$ of $A$ is the same as the closure of $A$ regarded as a subset of $\operatorname{aff}(A)$.

Theorem. Let $C \subset \mathbb{R}^n$ be convex.
(a) If $C \ne \emptyset$, then $\operatorname{ri}(C) \ne \emptyset$ and $\dim(\operatorname{ri}(C)) = \dim(C)$.
(b) $\overline{\operatorname{ri}(C)} = \overline{C}$ and $\operatorname{ri}(\overline{C}) = \operatorname{ri}(C)$.
(c) If $x \in \operatorname{ri}(C)$ and $y \in \overline{C}$, then $[x, y) \subset \operatorname{ri}(C)$.

Proof. (a) follows from § 4.7. Applying (a) and Theorem 2.27, we obtain (b). (c) follows from Theorem 2.23.
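4.9 In the following theorems, we study some properties of the operation 'ri'.

Theorem. Let $C \subset \mathbb{R}^n$ be convex, and let $T$ be a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$. Then

$$\operatorname{ri}(TC) = T(\operatorname{ri}(C))$$

and

$$\overline{TC} \supset T(\overline{C}).$$

Proof. Any linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous. The second formula follows from the continuity of $T$.

Applying the second formula to $\operatorname{ri}(C)$, by § 4.8 we have

$$\overline{T(\operatorname{ri}(C))} \supset T(\overline{\operatorname{ri}(C)}) = T(\overline{C}) \supset TC \supset T(\operatorname{ri}(C)).$$

It follows that $\overline{TC} = \overline{T(\operatorname{ri}(C))}$. Once again, by § 4.8 we have

$$\operatorname{ri}(TC) = \operatorname{ri}(\overline{TC}) = \operatorname{ri}(\overline{T(\operatorname{ri}(C))}) = \operatorname{ri}(T(\operatorname{ri}(C))),$$

hence

$$\operatorname{ri}(TC) \subset T(\operatorname{ri}(C)). \tag{25}$$

Let $x \in T(\operatorname{ri}(C))$. There is $x_1 \in \operatorname{ri}(C)$ such that $x = Tx_1$. By § 4.8, $\operatorname{ri}(TC) \ne \emptyset$, hence there exists $y \in \operatorname{ri}(TC)$, and there exists $y_1 \in C$ such that $y = Ty_1$. Since $x_1 \in \operatorname{ri}(C)$, there exists $z_1 \in C$ such that $x_1 \in (z_1, y_1)$. Let $z = Tz_1$. We have $z \in TC$ and $x \in (z, y)$; it follows from § 4.8 that $x \in \operatorname{ri}(TC)$. We conclude that

$$T(\operatorname{ri}(C)) \subset \operatorname{ri}(TC). \tag{26}$$

Combining (25) and (26) we obtain the stated result.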
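4.10 Theorem

Let $C, D \subset \mathbb{R}^n$ be convex, and let $\lambda \in \mathbb{R}$. Then:
(a) $\operatorname{ri}(\lambda C) = \lambda \operatorname{ri}(C)$
(b) $\operatorname{ri}(C + D) = \operatorname{ri}(C) + \operatorname{ri}(D)$

If, moreover, $\operatorname{ri}(C) \cap \operatorname{ri}(D) \ne \emptyset$, then:
(c) $\overline{C \cap D} = \overline{C} \cap \overline{D}$
(d) $\operatorname{ri}(C \cap D) = \operatorname{ri}(C) \cap \operatorname{ri}(D)$

Proof. (a) Apply § 4.9 to $T: \mathbb{R}^n \to \mathbb{R}^n$, defined by $Tx = \lambda x$.

(b) Apply § 4.9 to $T: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$, defined by $T(x, y) = x + y$. It follows that $\operatorname{ri}(C + D) = \operatorname{ri}(T(C \times D)) = T(\operatorname{ri}(C \times D))$. It is left to the reader to prove that $\operatorname{aff}(C \times D) = \operatorname{aff}(C) \times \operatorname{aff}(D)$ and $\operatorname{ri}(C \times D) = \operatorname{ri}(C) \times \operatorname{ri}(D)$. We conclude that $\operatorname{ri}(C + D) = T(\operatorname{ri}(C) \times \operatorname{ri}(D)) = \operatorname{ri}(C) + \operatorname{ri}(D)$.

(c), (d). We have

$$\overline{C \cap D} \subset \overline{C} \cap \overline{D}.$$

Let $x \in \operatorname{ri}(C) \cap \operatorname{ri}(D)$, and let $y \in \overline{C} \cap \overline{D}$. By § 4.8, $[x, y) \subset \operatorname{ri}(C) \cap \operatorname{ri}(D)$, hence $y \in \overline{\operatorname{ri}(C) \cap \operatorname{ri}(D)}$. It follows that

$$\overline{C} \cap \overline{D} \subset \overline{\operatorname{ri}(C) \cap \operatorname{ri}(D)} \subset \overline{C \cap D} \subset \overline{C} \cap \overline{D}$$

hence

$$\overline{C \cap D} = \overline{C} \cap \overline{D},$$

which proves (c). Moreover, by § 4.8

$$\operatorname{ri}(\operatorname{ri}(C) \cap \operatorname{ri}(D)) = \operatorname{ri}(C \cap D)$$

hence

$$\operatorname{ri}(C \cap D) \subset \operatorname{ri}(C) \cap \operatorname{ri}(D).$$

Let now $z \in \operatorname{ri}(C) \cap \operatorname{ri}(D)$ and $w \in \operatorname{ri}(C \cap D)$. There exist $u_1 \in C$, $u_2 \in D$ such that $z \in (u_1, w) \cap (u_2, w)$, hence there exists $u \in C \cap D$ such that $z \in (u, w)$. By § 4.8, $z \in \operatorname{ri}(C \cap D)$. We conclude that

$$\operatorname{ri}(C) \cap \operatorname{ri}(D) \subset \operatorname{ri}(C \cap D)$$

which proves (d).

Remark. It is left to the reader to show (by means of a counterexample) that in (c) and (d) the condition $\operatorname{ri}(C) \cap \operatorname{ri}(D) \ne \emptyset$ cannot be omitted.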
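On the real line the convex sets are intervals, so one possible pair of counterexamples can be checked mechanically. The sketch below is an illustration under assumptions of its own: the `Interval` type and the helpers `make`, `intersect`, `closure` and `rel_interior` are hypothetical names, only bounded intervals are modelled, and `None` stands for the empty set.

```python
from typing import NamedTuple, Optional

class Interval(NamedTuple):
    lo: float
    hi: float
    lo_closed: bool
    hi_closed: bool

Set = Optional[Interval]   # None stands for the empty set

def make(lo, hi, lo_closed, hi_closed) -> Set:
    """Build an interval, returning None when it would be empty."""
    if lo > hi or (lo == hi and not (lo_closed and hi_closed)):
        return None
    return Interval(lo, hi, lo_closed, hi_closed)

def intersect(a: Set, b: Set) -> Set:
    if a is None or b is None:
        return None
    lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
    # An endpoint belongs to the intersection only if every set attaining it contains it.
    lo_closed = all(s.lo_closed for s in (a, b) if s.lo == lo)
    hi_closed = all(s.hi_closed for s in (a, b) if s.hi == hi)
    return make(lo, hi, lo_closed, hi_closed)

def closure(a: Set) -> Set:
    return None if a is None else Interval(a.lo, a.hi, True, True)

def rel_interior(a: Set) -> Set:
    """ri of a single point is the point itself; otherwise the open interval."""
    if a is None or a.lo == a.hi:
        return a
    return Interval(a.lo, a.hi, False, False)

# (d) fails for C = [0, 1], D = [1, 2]: ri(C) and ri(D) are disjoint.
C, D = make(0, 1, True, True), make(1, 2, True, True)
d_left = rel_interior(intersect(C, D))                 # ri({1}) = {1}
d_right = intersect(rel_interior(C), rel_interior(D))  # empty

# (c) fails for C = (0, 1), D = (1, 2).
C2, D2 = make(0, 1, False, False), make(1, 2, False, False)
c_left = closure(intersect(C2, D2))                    # empty
c_right = intersect(closure(C2), closure(D2))          # {1}

# With overlapping relative interiors, e.g. [0, 2) and (1, 3], both hold.
C3, D3 = make(0, 2, True, False), make(1, 3, False, True)
```

In both failing cases the relative interiors are disjoint, which is exactly the situation excluded by the hypothesis of (c) and (d).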
SEPARATION IN $\mathbb{R}^n$

4.11 The following theorem is an improvement of the separation theorem (Theorem 3.8) for the case where the linear topological space is finite-dimensional. It gives a necessary and sufficient condition for proper separation in $\mathbb{R}^n$.

Theorem (separation theorem). Let $C, D \subset \mathbb{R}^n$ be convex and nonempty. Then there exists a hyperplane in $\mathbb{R}^n$ separating $C$ and $D$ properly if and only if $\operatorname{ri}(C) \cap \operatorname{ri}(D) = \emptyset$.
Proof. We leave it to the reader to verify that the stated result is true if $n = 1$. Assume now that $n \ge 2$. Let $A = C - D$. $A$ is convex (cf. § 2.4, property (a)) and nonempty. By Theorem 4.10, $\operatorname{ri}(A) = \operatorname{ri}(C) - \operatorname{ri}(D)$, hence we have

$$\operatorname{ri}(C) \cap \operatorname{ri}(D) = \emptyset \iff 0 \notin \operatorname{ri}(A).$$

'If': Let $B = \operatorname{ri}(A)$. $B$ is relatively open (that is, open in $\operatorname{aff}(B)$), convex (cf. Theorem 2.23) and nonempty (cf. § 4.8), and we have $0 \notin B$. We shall prove in the next lemma that there exists a hyperplane $H = f^{-1}(0)$ such that $0 \in H$, $H \cap B = \emptyset$. Since $B$ is convex, there is no loss of generality if we assume that $f(\operatorname{ri}(A)) = f(B) > 0$. It follows that $f(A) \ge 0$. We conclude that

$$(\forall c \in C)(\forall d \in D)\ f(c) \ge f(d)$$

and

$$(\exists c \in C)(\exists d \in D)\ f(c) > f(d).$$

Let $\gamma = \inf\{f(c) \mid c \in C\}$, then $C$ and $D$ are separated properly by the hyperplane $f^{-1}(\gamma)$.

'Only if': Let $H = f^{-1}(\alpha)$ be a hyperplane separating $C$ and $D$ properly. There is no loss of generality if we assume that $f(C) \ge \alpha$, $f(D) \le \alpha$ and $f(c) > \alpha$ for some $c \in C$. It follows that $f(A) \ge 0$ and $f(a) > 0$ for some $a \in A$. Let $x \in \operatorname{ri}(A)$. There is $\delta > 0$ such that $[x, x + \delta(x - a)] \subset A$, hence $f(x + \delta(x - a)) \ge 0$, so that $(1 + \delta) f(x) \ge \delta f(a)$ and so $f(x) > 0$. We conclude that $f(\operatorname{ri}(A)) > 0$, hence $0 \notin \operatorname{ri}(A)$.
Lemma. Let B OE R" be convex and relatively open such that 0 0 B. Then there exists a hyperplane H in Rn such that 0 e H,
rin B = .
Proof. We will prove the following statement: if F is a linear subspace of R" such that 0 dim(F) n —2 and F n B = 0, then there exists a hyperplane H in R" containing F such that H n B = 0 (putting F = 101, this yields the lemma). First we consider the case where n = 2. Then we have F = { 0} and 0 0 B. The hyperplane we are looking for is a line in R 2 through 0 not meeting B. The existence of this line is trivial in the cases where dim(B) = —1, 0, 1, and in the case where dim(B) =2 it follows from Theorem 3.8 (since in this case B is an open subset of R 2) • We next consider the case where n >2. Let S be a twodimensional linear subspace of R" such that S F, and let B i = S fl (B +F) ; B i is convex. By Theorem 4.10, ri(B + F) = ri(B) + ri(F) = B + F. Since ri(S) = S we have, again by Theorem 4.10, B 1 = 0 or ri(B I ) = S n(B +F) = B i . In both cases, B 1 is relatively open. Since 0 0 B 1 , by what has been proved for the case where n =2 there exists a line nt in S through 0 not meeting B i . Let S i = F+ in; S i a linear subspace of R" containing F, and dim(S 1 ) = dim(F) 4 1. Supposeis we had S i n B 0. Then there would exist f E F, a c nt, b E B such that f + a = b, hence a = b f, a contradiction (since in n B, = 0). We conclude that S i n B = 0. Continuing in this way, after a finite number of steps we obtain a hyperplane H such that H D F, H n B = 0. —
4.12 Application: supporting hyperplanes Let E be a linear topological space and A E. The relative boundary rb(A) of A is defined as the boundary of A regarded as a subset of aff(A) (with the relative topology). If A R", then rb(A) = The following theorem and its proof are practically identical with those of § 3.10.
Theorem. Let C[fr
convex and nonempty, and let x E rb(C). Then there exists a nontrivial supporting hyperplane for C at x. be
50 4.13
Application: extreme points
In § 2.10 we introduced the notion of extreme point. Below we shall demonstrate that compact convex sets in R" are entirely determined by their extreme points. We start with a lemma.
Lemma. Let A be a bounded nonempty subset of Rn containing at least two points. Then ri(A) OE co(rb(A)).
Proof. Let x E ri(A). There exists y e A, y(t x. Let m be the line through x and y. The set m n A is bounded and contains a line segment having x as a relative interior point. It follows that there exist p, qcmn rb(A) such that x E [p, q].
Theorem (of Minkowski). Let C c R" be compact, convex and nonempty. Let E be the set of the extreme points of C. Then E 0 and C = co(E). Proof. Let d = dim(C). We give a proof by induction on d. If d = 0, C consists of one point, and the statement of the theorem is trivially true. Now suppose that dim(C) = k and that the statement is true if d 0 such that
oil
pi
(27)
whenever i E /(x). If i 1(x), we have (x I pi ) = (y I pi ) = 13f , hence (x + E(x — y) I pi ) = pi . It follows that (27) holds for all j. We conclude that x + E(x — y) E A. We have
x
1+e
y+
1 1
r Lx + e(x — y)].
Since x, y E E, it follows that x = y. We conclude that x y implies 1(x) gy). This proves the stated result, since there are only finitely many subsets of {1, 2, ... , . (b) A is closed and convex. From the hypotheses of the theorem it follows that A is compact and nonempty. Applying Minkowski's theorem (cf. § 4.13) yields the stated result.
4.17 Theorem Let K be a finitely generated convex cone in W. Then: (a) K is the union of finitely many finitely generated convex cones each having a linearly independent set of generators. (b) K is closed. Proof. (a)
a2,
(Cf. the proof of Carathéodory's theorem). There exist , ap ER" such that
K = {A l a i + A2a,+ . . .+ Avap I A l , Let
x e K,
A2,.
x = A 1 a 1 + A2 a2 + ...+ Apap where
,
A 1 , A2,.
,
Ap
O.
If
53
al , a2 ,.
, at, are linearly dependent, there exist Il i , 0, such that + tL2 a2 +
fL p
ER, not all
+ ptp a, = O.
For all p ER, we have
x=E
(Ai 
The set la ER1 crtLi Ai p)1 is an interval of the form (—co, a], [a, 0] (where a = p = O is allowed) or [a, +00). Putting p = a, we get an expression of x as a nonnegative linear combination of at most p —1 generators of K. Repeating this process, after a finite number of steps we get an expression of x as a nonnegative linear combination of linearly independent generators of K. This proves the stated result, since the set of generators of K has only finitely many subsets. (b) If a l , a 2 , . . , ap are linearly independent, the cone {A l a i 4A2a2 + ...+ Apap lik i , 
A2, . . ,
Àp
is a closed subset of R", since it is homeomorphic to the closed set 01. Using (a), we conclude that K is ri. )ER In15 2) • closed.
4.18
Theorem
Let K U;Rrt. Then K is a polyhedral cone if and only if K is a finitely generated convex cone.
Proof. 'Only if': let $K = \{x \in \mathbb{R}^n \mid Tx \le 0\}$ (where $T$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}^k$) be a polyhedral cone. We have
$$K = T^{-1}\bigl(T\mathbb{R}^n \cap (-P^k)\bigr).$$
Let $A := T\mathbb{R}^n \cap (-P^k)$ and
$$B := \Bigl\{(\eta_1, \eta_2, \dots, \eta_k) \in A \Bigm| \sum_{i=1}^{k} \eta_i \ge -1\Bigr\}. \tag{28}$$
$B$ is nonempty (since $0 \in B$) and
$$A = \mathbb{R}_+ B \tag{29}$$
where $\mathbb{R}_+$ is the set of all nonnegative real numbers. $B$ is the set of solutions of the system of inequalities
$$(y \mid p_i) \le 0 \ (1 \le i \le s), \qquad (y \mid -p_i) \le 0 \ (1 \le i \le s),$$
$$(y \mid e_i) \le 0 \ (1 \le i \le k), \qquad (y \mid -e) \le 1,$$
where $p_1, p_2, \dots, p_s$ is a basis for the orthogonal complement of $T\mathbb{R}^n$ in $\mathbb{R}^k$, $e_1, e_2, \dots, e_k$ are the unit vectors in $\mathbb{R}^k$ ($e_1 = (1, 0, 0, \dots, 0)$, etc.) and $e = (1, 1, \dots, 1)$. The reader can easily verify that $B$ is bounded. By § 4.16, there exist $a_1, a_2, \dots, a_p \in \mathbb{R}^k$ such that $B = \operatorname{co}(a_1, a_2, \dots, a_p)$.
Using (29) we conclude that $A$ is the finitely generated convex cone generated by $a_1, a_2, \dots, a_p$. There exist $b_1, b_2, \dots, b_p \in \mathbb{R}^n$ such that $Tb_i = a_i$ $(1 \le i \le p)$. If $x \in K$, there exist $\lambda_1, \lambda_2, \dots, \lambda_p \ge 0$ such that
$$Tx = \sum_{i=1}^{p} \lambda_i a_i = \sum_{i=1}^{p} \lambda_i T b_i, \quad\text{hence}\quad x - \sum_{i=1}^{p} \lambda_i b_i \in T^{-1}(0).$$
It follows that $K$ is the finitely generated convex cone generated by $b_1, b_2, \dots, b_p, c_1, c_2, \dots, c_q, -c_1, -c_2, \dots, -c_q$ where $c_1, c_2, \dots, c_q$ is a basis for the kernel $T^{-1}(0)$ of $T$.
'If': Let $K$ be a finitely generated convex cone. By Theorem 4.17, $K$ is closed. From § 3.13 we deduce that
$$K = K^{\circ\circ} = (K^{\circ})^{\circ}.$$
By § 4.15, $K^{\circ}$ is a polyhedral cone and so, by what has been proved before, $K^{\circ}$ is a finitely generated convex cone. Finally, once more by § 4.15, we conclude that $K^{\circ\circ}$ is a polyhedral cone.
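Remark. From what has been proved above and § 4.15 we deduce that each of the sets $\{Tx \mid x \ge 0\}$ and $\{y \mid T^t y \le 0\}$ is the polar of the other.

4.19 Application: Farkas' lemma
Let $A$ be a linear map from $\mathbb{R}^n$ to $\mathbb{R}^k$, and let $K := \{x \mid A^t x \le 0\}$. We have
$$b \in K^{\circ} \iff (\forall x \in \mathbb{R}^k)\bigl(A^t x \le 0 \Rightarrow (b \mid x) \le 0\bigr).$$
From the remark in § 4.18 it follows that $K^{\circ} = \{Ay \mid y \ge 0\}$, hence
$$(\exists y \ge 0)\ b = Ay \iff (\forall x \in \mathbb{R}^k)\bigl(A^t x \le 0 \Rightarrow (b \mid x) \le 0\bigr). \tag{30}$$
The statement (30) is called Farkas' lemma. It can also be formulated as a theorem of the alternative:
For each linear map $A: \mathbb{R}^n \to \mathbb{R}^k$ and each $b \in \mathbb{R}^k$, either
(I) $A^t x \le 0$, $(b \mid x) > 0$ has a solution $x \in \mathbb{R}^k$, or
(II) $b = Ay$, $y \ge 0$ has a solution $y \in \mathbb{R}^n$,
but never both.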
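The two alternatives can be illustrated by checking explicit certificates. The following is a small sketch in Python, assuming $A$ is given as a list of rows with integer entries; the function names `certifies_I` and `certifies_II` and the sample data are illustrative only. It does not search for solutions, it only verifies a proposed $x$ or $y$.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def certifies_I(A, b, x):
    """True if x solves (I): A^t x <= 0 and (b | x) > 0."""
    return all(t <= 0 for t in mat_vec(transpose(A), x)) and dot(b, x) > 0


def certifies_II(A, b, y):
    """True if y solves (II): b = A y and y >= 0."""
    return all(t >= 0 for t in y) and mat_vec(A, y) == list(b)


# A maps R^2 to R^2; {Ay | y >= 0} is the cone generated by (1, 0) and (1, 1).
A = [[1, 1],
     [0, 1]]
b_in = [3, 1]    # equals 2*(1, 0) + 1*(1, 1), so (II) is solvable
b_out = [0, 1]   # lies outside the cone, so (I) is solvable

ok_II = certifies_II(A, b_in, [2, 1])
ok_I = certifies_I(A, b_out, [-1, 1])
```

That both systems cannot be solvable at once is visible from $(b \mid x) = (Ay \mid x) = (y \mid A^t x) \le 0$ whenever $y \ge 0$ and $A^t x \le 0$.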
EXERCISES
$V$ is a linear space over $\mathbb{R}$.
1. Let $A \subset V$, $\dim(A) = k$, $a \in A$. Show that each $x \in \operatorname{co}(A)$ is contained in a $k$-simplex having its vertices in $A$ and having $a$ as one of its vertices.
2. Let $A \subset V$. Show that $\operatorname{co}(A)$ is the union of all finite-dimensional simplices whose vertices belong to $A$.
3. Let $C_1, C_2, \dots$ be disjoint convex subsets of $V$, and let $x \in \operatorname{co}(\bigcup C_i)$. Show that $x$ is contained in a simplex having at most one vertex in each $C_i$.
4. Let $\Gamma$ be the collection of all nonempty compact subsets of $\mathbb{R}^n$ with the Hausdorff distance. Let $(A_i)$ be a sequence in $\Gamma$, and let $A \in \Gamma$ be such that $A_i \to A$ $(i \to \infty)$. Let $(x_i)$ be a sequence in $\mathbb{R}^n$ such that $x_i \in A_i$ $(i \in \mathbb{N})$, and let $x \in \mathbb{R}^n$ be such that $x_i \to x$ $(i \to \infty)$. Show that $x \in A$.
5. Let $\Gamma$ be defined as in Exercise 4. Let $(A_i)$ be a sequence in $\Gamma$, and let $A \in \Gamma$ be such that $A_i \to A$ $(i \to \infty)$. If each $A_i$ is convex, show that $A$ is convex.
6. (a) Let $C \subset \mathbb{R}^n$ be convex. Show that $C^a = \overline{C}$. (b) Find an example of a convex subset $C$ of a linear topological space such that $C^a \ne \overline{C}$.
7. Let $C \subset \mathbb{R}^n$ be convex. Show that $C$ is closed if and only if $C \cap m$ is closed for each line $m$ in $\mathbb{R}^n$.
8. Let $C \subset \mathbb{R}^n$ be convex and let $A \subset \mathbb{R}^n$ be open. Show that $A \cap C \ne \emptyset$ if and only if $A \cap \operatorname{ri}(C) \ne \emptyset$.
9. Let $C, D \subset \mathbb{R}^n$ be convex such that $C \subset \overline{D}$, $C \cap \operatorname{ri}(D) \ne \emptyset$. Prove that $\operatorname{ri}(C) \subset \operatorname{ri}(D)$.
10. (a) Prove the generalization of Theorem 4.10(c) to the case of an arbitrary collection of convex sets. (b) Show that the generalization of Theorem 4.10(d) to the case of a collection of more than two convex sets holds if the collection is finite, but not if it is infinite.
11. Prove Radon's theorem: Let $A \subset \mathbb{R}^n$ contain at least $n + 2$ points. Then there exist $A_1, A_2 \subset \mathbb{R}^n$ such that $A_1 \cap A_2 = \emptyset$, $A_1 \cup A_2 = A$, $\operatorname{co}(A_1) \cap \operatorname{co}(A_2) \ne \emptyset$.
12. Let $K \subset \mathbb{R}^n$ be a nonempty convex cone. Let $C$ be a nonempty convex subset of $\mathbb{R}^n$ such that $C \cap K = \emptyset$. Show that there exists $y \in K^{\circ}$, $y \ne 0$ such that $(c \mid y) \ge 0$ whenever $c \in C$.
13. Let $K_1, K_2 \subset \mathbb{R}^n$ be polyhedral cones. Show that $(K_1 \cap K_2)^{\circ} = K_1^{\circ} + K_2^{\circ}$ (cf. Exercise 6 of Chapter 3).
14. Prove Farkas' lemma, applying the separation theorem to $\{b\}$ and $AP^n$ (Hint: $b \notin AP^n \Rightarrow$ there exists a hyperplane separating $\{b\}$ and $AP^n$ strictly).
15. Prove Gordan's lemma: For each linear map $A: \mathbb{R}^n \to \mathbb{R}^k$, either
(I) $A^t x > 0$ has a solution $x \in \mathbb{R}^k$, or
(II) $Ay = 0$, $y \ge 0$, $y \ne 0$ has a solution $y \in \mathbb{R}^n$,
but never both.
NOTES

The geometry of convex sets is much older than the analysis of convex functions. The older results are in T. Bonnesen and W. Fenchel, Theorie der konvexen Körper, reissued by Chelsea, New York, 1971. Other useful books in this field are H. G. Eggleston, Convexity, Cambridge, Cambridge University Press, 1969, F. A. Valentine, Convex Sets, New York, McGraw-Hill, 1964, K. Leichtweisz, Konvexe Mengen, Berlin, Springer-Verlag, 1980. Generally speaking, we have only given those results concerning convex sets that are of use in convex analysis.

The theorem in § 4.3 has an analogue for infinite-dimensional spaces. It is the theorem of Mazur: if $E$ is a Banach space and $A \subset E$ is compact, then $\overline{\operatorname{co}}(A)$ is compact.

The volume and the surface area of compact convex subsets of $\mathbb{R}^n$ are continuous functions on the collection of compact convex sets considered in § 4.6. It follows from Blaschke's convergence theorem that any real continuous function on this collection has a minimum. This fact can be used in studying the isoperimetric problem: find the set whose surface has a given area and which contains the largest volume.

There exists an infinite-dimensional generalization of the theorem of Minkowski (cf. § 4.13). It is the theorem of Krein-Milman: A compact, convex and nonempty set in a locally convex space is the closed convex hull of its extreme points.

The concept of convexity in a linear space has been generalized in various ways. In V. W. Bryant and R. J. Webster, Generalizations of the theorems of Radon, Helly, and Carathéodory, Monatshefte für Mathematik 73 (1969) 309-15, a convexity space is defined as a pair $(X, \cdot)$, where $X$ is a nonempty set and $\cdot$ is a map from $X \times X$ to the set of all subsets of $X$ (obviously, $a \cdot b$ is a generalization of the interior $(a, b)$ of the line segment $[a, b]$) satisfying:
(a) $a \cdot b \ne \emptyset$
(b) $a \cdot b = b \cdot a$
(c) $a \cdot (b \cdot c) = (a \cdot b) \cdot c$
(d) Let $a/b := \{x \in X \mid a \in b \cdot x\}$. Then we have: $(a/b) \cap (c/d) \ne \emptyset \Rightarrow (a \cdot d) \cap (b \cdot c) \ne \emptyset$
(e) $a \cdot a = \{a\} = a/a$
(f) $(a \cdot b) \cap (a \cdot c) \ne \emptyset \Rightarrow b = c$ or $b \in a \cdot c$ or $c \in a \cdot b$.
In such a convexity space, the notions of independent set and of dimension can be defined, and generalizations of the theorems of Carathéodory (cf. Theorem 4.2), Helly (cf. Theorem 4.4), and Radon (cf. Exercise 11) hold. More geometry based on these ideas can be found in W. Prenowitz and J. Jantosciak, Join Geometries: a Theory of Convex Sets and Linear Geometry, Berlin, Springer, 1979.

In D. C. Kay and E. W. Womble, Axiomatic convexity theory and relationships between the Carathéodory, Helly, and Radon numbers, Pacific J. Math. 38 (1971) 471-85, a convexity space is defined as a pair $(X, \mathcal{C})$ where $X$ is a set and $\mathcal{C}$ is a family of subsets of $X$ satisfying the following conditions:
(a) $\emptyset \in \mathcal{C}$ and $X \in \mathcal{C}$.
(b) The intersection of each subfamily of $\mathcal{C}$ belongs to $\mathcal{C}$.
The convex hull of a set $A \subset X$, denoted by $C(A)$, is defined by
$$C(A) := \bigcap \{B \in \mathcal{C} \mid B \supset A\}.$$
In such a convexity space three numbers can be defined. $(X, \mathcal{C})$ is said to have Carathéodory number $c$ if $c$ is the smallest nonnegative integer such that, for all $A \subset X$:
$$C(A) = \bigcup \{C(B) \mid B \subset A,\ |B| \le c\}$$
where $|B|$ is the cardinality of $B$.
$(X, \mathcal{C})$ is said to have Helly number $h$ if $h$ is the smallest nonnegative integer for which it is true that a finite subfamily of sets in $\mathcal{C}$ has nonempty intersection provided each $h$ members of the subfamily have nonempty intersection.
$(X, \mathcal{C})$ is said to have Radon number $r$ if $r$ is the smallest positive integer for which it is true that any $A \subset X$ with $|A| \ge r$ may be partitioned into two nonempty subsets $A_1$, $A_2$ such that $C(A_1) \cap C(A_2) \ne \emptyset$.
If $X = \mathbb{R}^n$ and $\mathcal{C}$ is the family of all convex subsets of $\mathbb{R}^n$, then $c = h = n + 1$ and $r = n + 2$. It can be proved that in a convexity space where $c$, $h$, and $r$ exist, these three numbers satisfy certain relations.
More about an axiomatic setting for the theory of convexity and related topics can be found in R. E. Jamison, A general theory of convexity, PhD Thesis, University of Washington, Seattle, 1974, G. Sierksma, Relationships between Carathéodory, Helly, Radon and exchange numbers of convexity spaces, Nieuw Archief voor Wiskunde (3), XXV (1977) 115-132, H. van Maaren and H. J. P. De Smet, Extremal points, separation and Carathéodory, Helly and Radon numbers in nonreal linear spaces, Proc. Kon. Ned. Akad. Wet. 84 (= Indag. Math. 43) (1981) 207-18.

We define a multifunction from a set $X$ to a set $Y$ as a relation from $X$ to $Y$ or, what comes nearly to the same thing, as a function from $X$ into the power set $\mathcal{P}(Y)$ of $Y$. Properties of a multifunction from $X$ to $Y$ can sometimes be described by endowing $\mathcal{P}(Y)$ (or part of it) with a suitable structure, compatible (in some way) with the structure given to $Y$. An example of such a structure is the topology induced by the Hausdorff distance (cf. § 4.6). Very readable introductions to multifunctions are R. E. Smithson, Multifunctions, Nieuw Archief voor Wiskunde (3), XX (1972) 31-53, B. L. McAllister, Hyperspaces and multifunctions, the first half century (1900-1950), Nieuw Archief voor Wiskunde (3), XXVI (1978) 309-29. An example of a theorem closely related to Blaschke's convergence theorem is the following: Let $X$ be a complete metric space, and let $P(X)$ be the collection of all nonempty bounded closed subsets of $X$. Then $P(X)$ with the Hausdorff distance is a complete metric space. If $Y$ is a linear space, a multifunction $f$ from $X$ to $Y$ is called convex if, for each $x \in X$, $f(x)$ is a convex subset of $Y$. An exposition of convex multifunctions can be found in C. Castaing and M. Valadier, Convex Analysis and Measurable Multifunctions (Lecture Notes in Mathematics no. 580), Berlin, Springer, 1977.
CHAPTER 5
Convex Functions on a Linear Space
In this chapter we designate by V a linear space over R and by E a linear topological space over R having a Hausdorff topology, both containing more than one point.
THE EPIGRAPH 5.1 Definition
Let $X$ be a set and $f$ a function $X \to \overline{\mathbb{R}}$. The epigraph $\operatorname{epi}(f)$ of $f$ is the set
$$\{(x, \lambda) \in X \times \mathbb{R} \mid f(x) \le \lambda\}.$$
Cf. Figure 10.

Figure 10
In the following, properties of $f$ will sometimes be described in terms of properties of $\operatorname{epi}(f)$. If $X$ is a topological space, we endow $X \times \mathbb{R}$ with the product topology. Closedness of $\operatorname{epi}(f)$ turns out to correspond with lower semicontinuity of $f$ (cf. Theorem 5.3). In a well-known way, $V \times \mathbb{R}$ can be made a linear space which we denote by $V \oplus \mathbb{R}$. Convexity of $\operatorname{epi}(f)$ turns out to correspond with convexity of $f$ (cf. Theorem 5.10). $E \oplus \mathbb{R}$ endowed with the product topology is a linear topological space. In particular, if $E$ is a normed linear space (with norm $x \mapsto \|x\|$), then the topology on $E \oplus \mathbb{R}$ is the norm topology of one of the (equivalent) norms $(x, \lambda) \mapsto \|x\| + |\lambda|$ and $(x, \lambda) \mapsto \max(\|x\|, |\lambda|)$.
LOWER SEMICONTINUITY
Let $X$ be a topological space.

5.2 Definition
Let $f$ be a function $X \to \overline{\mathbb{R}}$ and $a \in X$. $f$ is said to be lower semicontinuous at $a$ if for each $\kappa \in \mathbb{R}$, $\kappa < f(a)$, there exists a neighborhood $U$ of $a$ such that $f(U) > \kappa$. $f$ is said to be lower semicontinuous if $f$ is lower semicontinuous at each point of $X$.

Remarks
(a) A continuous function is lower semicontinuous.
(b) If $a \in X$ is an accumulation point of $X$ and $f(a) = +\infty$ and if $f$ is lower semicontinuous at $a$, then
$$\lim_{x \to a} f(x) = +\infty.$$
(c) If $f(a) = -\infty$, then $f$ is lower semicontinuous at $a$.

5.3 Theorem
Let $f$ be a function $X \to \overline{\mathbb{R}}$. The following conditions are equivalent:
(a) $f$ is lower semicontinuous.
(b) $\{x \in X \mid f(x) > \lambda\}$ is open for each $\lambda \in \mathbb{R}$.
(c) $\{x \in X \mid f(x) \le \lambda\}$ is closed for each $\lambda \in \mathbb{R}$.
(d) $\operatorname{epi}(f)$ is closed (as a subset of $X \times \mathbb{R}$).
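Proof. (a) $\Leftrightarrow$ (b) is a direct consequence of Definition 5.2; (b) $\Leftrightarrow$ (c) is trivial. (a) $\Leftrightarrow$ (d): Define $F: X \times \mathbb{R} \to \overline{\mathbb{R}}$ by $F(x, \lambda) := f(x) - \lambda$. The reader can easily verify that $F$ is lower semicontinuous if and only if $f$ is lower semicontinuous. By (c), the lower semicontinuity of $F$ can be re-expressed as the closedness of $\{(x, \lambda) \mid F(x, \lambda) \le \mu\}$ for every $\mu \in \mathbb{R}$. This proves the stated result, since
$$\{(x, \lambda) \mid F(x, \lambda) \le \mu\} = \{(x, \lambda) \mid (x, \lambda + \mu) \in \operatorname{epi}(f)\} = \operatorname{epi}(f) - (0, \mu).$$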
5.4 The proof of the following simple properties is left to the reader.
(a) The pointwise supremum of an arbitrary collection of lower semicontinuous functions is lower semicontinuous.
(b) If $X$ is compact and if $f: X \to \mathbb{R} \cup \{+\infty\}$ is lower semicontinuous, then $f$ assumes a minimum value (which may be $+\infty$).
(c) If $f$ and $g$ are lower semicontinuous functions $X \to \mathbb{R} \cup \{+\infty\}$ and if $\lambda > 0$, then $\lambda f$ and $f + g$ are lower semicontinuous.
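5.5 The closure $\overline{\operatorname{epi}(f)}$ of the epigraph of a function $f: X \to \overline{\mathbb{R}}$ turns out to be also an epigraph. In fact, let $(x, \lambda) \in \overline{\operatorname{epi}(f)}$, $\mu > \lambda$, and let $U \times W$ be a neighborhood of $(x, \mu)$ in $X \times \mathbb{R}$. There exists an open interval $I \subset \mathbb{R}$ such that $\lambda \in I$, $\mu \notin I$. Since $(x, \lambda) \in \overline{\operatorname{epi}(f)}$, there exists a point $(y, \sigma) \in \operatorname{epi}(f)$ such that $(y, \sigma) \in U \times I$. We have $f(y) \le \sigma < \mu$, hence $(y, \mu) \in \operatorname{epi}(f) \cap (U \times W)$. It follows that $(x, \mu) \in \overline{\operatorname{epi}(f)}$. We conclude that the intersection of $\overline{\operatorname{epi}(f)}$ and the line $\{x\} \times \mathbb{R}$ is $\emptyset$, a half-line $[\alpha, +\infty)$, or $\mathbb{R}$. Putting $g(x) = +\infty$, $g(x) = \alpha$ and $g(x) = -\infty$, respectively, we define a function $g$ whose epigraph is $\overline{\operatorname{epi}(f)}$.

Definition. Let $f$ be a function $X \to \overline{\mathbb{R}}$. The lower semicontinuous hull $\bar{f}$ of $f$ is the function $X \to \overline{\mathbb{R}}$ whose epigraph is $\overline{\operatorname{epi}(f)}$:
$$\operatorname{epi}(\bar{f}) = \overline{\operatorname{epi}(f)}.$$

5.6 $g: X \to \overline{\mathbb{R}}$ is said to be a minorant of $f: X \to \overline{\mathbb{R}}$ if $g(x) \le f(x)$ whenever $x \in X$, which is equivalent to saying that $\operatorname{epi}(g) \supset \operatorname{epi}(f)$. By Theorem 5.3, the epigraph of every lower semicontinuous minorant of $f$ is closed (and contains $\operatorname{epi}(f)$). We conclude that $\bar{f}$ is the largest lower semicontinuous minorant of $f$. Remark that $\bar{f}$ can also be defined as the pointwise supremum of all lower semicontinuous minorants of $f$ (the constant function $-\infty$ is one of these minorants).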
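5.7 Lower semicontinuity can also be introduced using the notion of lim inf, defined as
$$\liminf_{x \to a} f(x) := \sup_{U} \inf\{f(x) \mid x \in U \setminus \{a\}\} \tag{31}$$
where $U$ ranges over the neighborhood system of $a$. If $X$ is a normed linear space, then we have
$$\liminf_{x \to a} f(x) = \lim_{\varepsilon \downarrow 0} \inf\{f(x) \mid 0 < \|x - a\| < \varepsilon\}.$$

5.8 Theorem
Let $f$ be a function $X \to \overline{\mathbb{R}}$ and $a \in X$. Then
(a) $f$ is lower semicontinuous $\Leftrightarrow$ $f = \bar{f}$;
(b) $f$ is lower semicontinuous at $a$ $\Leftrightarrow$ $f(a) \le \liminf_{x \to a} f(x)$;
(c) $\bar{f}(a) = \min\{f(a), \liminf_{x \to a} f(x)\}$;
(d) $f$ is lower semicontinuous at $a$ $\Leftrightarrow$ $f(a) = \bar{f}(a)$.

Proof. (a) By Theorem 5.3, we have: $f$ is lower semicontinuous $\Leftrightarrow$ $\operatorname{epi}(f) = \overline{\operatorname{epi}(f)}$ $\Leftrightarrow$ $\operatorname{epi}(f) = \operatorname{epi}(\bar{f})$ $\Leftrightarrow$ $f = \bar{f}$.
(b) is a direct consequence of Definition 5.2.
(c) In virtue of (b) and the lower semicontinuity of $\bar{f}$, we have
$$\bar{f}(a) \le \liminf_{x \to a} \bar{f}(x) \le \liminf_{x \to a} f(x).$$
We also have $\bar{f}(a) \le f(a)$, hence $\bar{f}(a) \le \mu$ where
$$\mu := \min\{f(a), \liminf_{x \to a} f(x)\}.$$
Suppose now we would have $\bar{f}(a) < \mu$. This would imply the existence of $\lambda \in \mathbb{R}$ such that $\mu > \lambda$, $\bar{f}(a) \in I := [-\infty, \lambda)$ and a neighborhood $U$ of $a$ such that $\inf\{f(x) \mid x \in U \setminus \{a\}\} > \lambda$, hence $f(x) > \lambda$ whenever $x \in U$. Thus $U \times I$ would be a neighborhood of $(a, \bar{f}(a))$ (in $X \times \mathbb{R}$) not meeting $\operatorname{epi}(f)$, violating the fact that $(a, \bar{f}(a)) \in \operatorname{epi}(\bar{f}) = \overline{\operatorname{epi}(f)}$. It follows that $\bar{f}(a) = \mu$.
(d) is a direct consequence of (b) and (c).

Remark. An alternative way to define lim inf (cf. (31)) is the following:
$$\liminf_{x \to a} f(x) := \sup_{U} \inf\{f(x) \mid x \in U\}.$$
Using this definition, we have $\bar{f}(a) = \liminf_{x \to a} f(x)$.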
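CONVEXITY
5.9 Definition
Let $f$ be a function $V \to \overline{\mathbb{R}}$. $f$ is said to be convex if for all $x, y \in V$ and all $\lambda, \mu, \nu \in \mathbb{R}$ such that $f(x) < \mu$, $f(y) < \nu$, $0 < \lambda < 1$, we have $f(\lambda x + (1 - \lambda) y) < \lambda\mu + (1 - \lambda)\nu$.

5.10 Theorem
Let $f$ be a function $V \to \overline{\mathbb{R}}$. The following conditions are equivalent:
(a) $f$ is convex.
(b) $\operatorname{epi}(f)$ is convex.
(c) $\{(x, \lambda) \in V \oplus \mathbb{R} \mid f(x) < \lambda\}$ is convex.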
Proof. (a)(c): let A
1(x, A) E VOR f(x)
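$f'(x_0; x)$ on $V$ is positively homogeneous and convex.
Proof. (a) Let $x \in V$. Define $g: \mathbb{R} \to \overline{\mathbb{R}}$ by $g(\varepsilon) := f(x_0 + \varepsilon x)$ $(\varepsilon \in \mathbb{R})$. $g$ is a proper convex function, hence $g'_+(0)$ exists (cf. § 1.24). This proves the statement, since $g'_+(0) = f'(x_0; x)$.
(b) It follows directly from the definition of directional derivative that $f'(x_0; \lambda x) = \lambda f'(x_0; x)$ whenever $x \in V$ and $\lambda > 0$, hence the function $x \mapsto f'(x_0; x)$ is positively homogeneous. The convexity of this function follows from
$$\frac{1}{\varepsilon}\bigl[f(x_0 + \varepsilon(\lambda y + (1 - \lambda) z)) - f(x_0)\bigr] = \frac{1}{\varepsilon}\bigl[f(\lambda(x_0 + \varepsilon y) + (1 - \lambda)(x_0 + \varepsilon z)) - f(x_0)\bigr]$$
$$\le \frac{1}{\varepsilon}\bigl[\lambda f(x_0 + \varepsilon y) + (1 - \lambda) f(x_0 + \varepsilon z) - f(x_0)\bigr] = \lambda\,\frac{f(x_0 + \varepsilon y) - f(x_0)}{\varepsilon} + (1 - \lambda)\,\frac{f(x_0 + \varepsilon z) - f(x_0)}{\varepsilon}$$
where $y, z \in V$, $\varepsilon > 0$, $\lambda \in (0, 1)$.

CONTINUITY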
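5.20 $f: E \to \overline{\mathbb{R}}$ is said to be locally bounded above (below) at a point $a \in E$ if there exists a neighborhood of $a$ on which $f$ is bounded above (below).

Theorem. Let $f: E \to \overline{\mathbb{R}}$ be convex. Let $a \in E$ be such that $f(a) > -\infty$ and $f$ is locally bounded above at $a$. Then:
(a) $f$ is a proper convex function, and $\operatorname{int}(\operatorname{dom}(f)) \ne \emptyset$.
(b) $f$ is locally bounded above at each point of $\operatorname{int}(\operatorname{dom}(f))$.
(c) $f$ is continuous on $\operatorname{int}(\operatorname{dom}(f))$.

Proof. The study of $f$ is equivalent to the study of the function $x \mapsto f(x + a)$. Hence there is no loss of generality if we assume that $a = 0$. Let $U$ be a neighborhood of $0$ on which $f$ is bounded above: $f(x) \le M < +\infty$ whenever $x \in U$.
(a) We have $U \subset \operatorname{int}(\operatorname{dom}(f))$. Since $f(0) > -\infty$, it follows from § 5.12 that $f$ is a proper convex function.
(b) Let $x_0 \in \operatorname{int}(\operatorname{dom}(f))$. There exists $\lambda > 1$ such that $\lambda x_0 \in \operatorname{int}(\operatorname{dom}(f))$. The set
$$W := x_0 + \Bigl(1 - \frac{1}{\lambda}\Bigr) U$$
is a neighborhood of $x_0$. If $y \in W$, we have
$$y = x_0 + \Bigl(1 - \frac{1}{\lambda}\Bigr) u$$
for some $u \in U$, hence
$$f(y) = f\Bigl(\frac{1}{\lambda}(\lambda x_0) + \Bigl(1 - \frac{1}{\lambda}\Bigr) u\Bigr) \le \frac{1}{\lambda} f(\lambda x_0) + \Bigl(1 - \frac{1}{\lambda}\Bigr) f(u) \le \frac{1}{\lambda} f(\lambda x_0) + \Bigl(1 - \frac{1}{\lambda}\Bigr) M$$
and it follows that $f$ is locally bounded above at $x_0$.
(c) Let $0 < \varepsilon < 1$. The set $X := \varepsilon(U \cap (-U))$ is a neighborhood of $0$. For all $x \in X$ we have $x/\varepsilon \in U$, hence $f(x/\varepsilon) \le M$ and so
$$f(x) = f\Bigl((1 - \varepsilon)\,0 + \varepsilon \cdot \frac{x}{\varepsilon}\Bigr) \le (1 - \varepsilon) f(0) + \varepsilon f\Bigl(\frac{x}{\varepsilon}\Bigr) \le f(0) + \varepsilon[M - f(0)].$$
We also have $-x/\varepsilon \in U$, hence
$$f(0) = f\Bigl(\frac{1}{1 + \varepsilon}\, x + \frac{\varepsilon}{1 + \varepsilon}\Bigl(-\frac{x}{\varepsilon}\Bigr)\Bigr) \le \frac{1}{1 + \varepsilon} f(x) + \frac{\varepsilon}{1 + \varepsilon} f\Bigl(-\frac{x}{\varepsilon}\Bigr) \le \frac{1}{1 + \varepsilon} f(x) + \frac{\varepsilon}{1 + \varepsilon} M.$$
It follows that $|f(x) - f(0)| \le \varepsilon(M - f(0))$. We conclude that $f$ is continuous at $0$. Combining this result with (b), the stated result follows.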
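The estimate $|f(x) - f(0)| \le \varepsilon(M - f(0))$ for $x \in \varepsilon(U \cap (-U))$ can be checked numerically in a simple case. The sketch below assumes $E = \mathbb{R}$, $f = \exp$, $U = [-1, 1]$ and hence $M = e$; these choices, the name `bound_holds`, and the sampling grid are illustrative and not part of the text.

```python
import math

f = math.exp     # a convex function on R, used here as an example
M = math.e       # upper bound of f on U = [-1, 1]


def bound_holds(eps, n=200):
    """Check |f(x) - f(0)| <= eps * (M - f(0)) on a grid of [-eps, eps]."""
    xs = [-eps + 2 * eps * i / n for i in range(n + 1)]
    rhs = eps * (M - f(0.0))
    # A small tolerance absorbs floating-point rounding.
    return all(abs(f(x) - f(0.0)) <= rhs + 1e-12 for x in xs)


results = [bound_holds(eps) for eps in (0.5, 0.1, 0.01)]
```

The check only samples finitely many points, so it illustrates the inequality rather than proving it.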
5.21 Now let $E$ be a normed linear space. In this case we can prove a little bit more. Let $A \subset E$. A function $f: E \to \overline{\mathbb{R}}$ is called Lipschitzian relative to $A$ if $f$ is real-valued on $A$ and if there exists $K > 0$ such that
$$|f(x) - f(y)| \le K\|x - y\|$$
for all $x, y \in A$. $f: E \to \overline{\mathbb{R}}$ is called locally Lipschitzian relative to $A \subset E$ if $f$ is real-valued on $A$ and if for every $a \in A$ there exists a neighborhood $U$ of $a$ such that $f$ is Lipschitzian relative to $U \cap A$.

Theorem. Let $E$ be a normed linear space and $f: E \to \overline{\mathbb{R}}$ a proper convex function. If $f$ is locally bounded above at some point of $E$, then $f$ is locally Lipschitzian relative to $\operatorname{int}(\operatorname{dom}(f))$ (cf. § 1.7).

Proof. Let $a \in \operatorname{int}(\operatorname{dom}(f))$. From § 5.20 it follows that $f$ is continuous at $a$. Hence there are $r_0 > 0$ and $m, M \in \mathbb{R}$ such that $m \le f(x) \le M$ for all points $x$ in the closed ball $B(a; r_0) := \{x \in E \mid \|x - a\| \le r_0\}$. Let $0 < r < r_0$ and $x, y \in B(a; r)$. Writing $\|x - y\| = \sigma$ and $z = y + [(r_0 - r)/\sigma](y - x)$, we have $y = \lambda z + (1 - \lambda) x$ where $\lambda = \sigma/(\sigma + r_0 - r)$. Since $\|z - a\| \le \|y - a\| + r_0 - r \le r_0$, we have $z \in B(a; r_0)$. It follows that $f(y) \le \lambda f(z) + (1 - \lambda) f(x)$, hence
$$f(y) - f(x) \le \lambda(f(z) - f(x)) \le \lambda(M - m) \le \frac{M - m}{r_0 - r}\,\|x - y\|.$$
Interchanging $x$ and $y$, we get
$$|f(x) - f(y)| \le \frac{M - m}{r_0 - r}\,\|x - y\|$$
for all $x, y \in B(a; r)$, which completes the proof.
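The Lipschitz constant $(M - m)/(r_0 - r)$ obtained in the proof can be tested on a concrete function. The following is a rough sketch, assuming $E = \mathbb{R}$, $f(t) = t^2$, $a = 0$, $r_0 = 2$ and $r = 1$; the bounds $M$ and $m$ are estimated on a grid, and all names are illustrative.

```python
def lipschitz_constant(M, m, r0, r):
    """Constant from the proof: (M - m) / (r0 - r)."""
    return (M - m) / (r0 - r)


def f(t):
    return t * t   # example of a convex function on R


a, r0, r = 0.0, 2.0, 1.0

# Estimate bounds m <= f <= M on the larger ball B(a; r0) from a grid.
outer = [a - r0 + i * (2 * r0) / 400 for i in range(401)]
M = max(f(t) for t in outer)
m = min(f(t) for t in outer)
K = lipschitz_constant(M, m, r0, r)

# Largest violation of |f(x) - f(y)| <= K |x - y| on a grid of B(a; r).
inner = [a - r + i * (2 * r) / 200 for i in range(201)]
worst = max(abs(f(x) - f(y)) - K * abs(x - y) for x in inner for y in inner)
```

Here the constant is far from sharp (the best constant on $[-1, 1]$ is $2$), which is consistent with the proof giving only an upper estimate.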
CONTINUITY AND LOWER SEMICONTINUITY IN $\mathbb{R}^n$
5.22 Lemma
Let $A \subset \mathbb{R}^n$ be open and $x_0 \in A$. Then there exists an $n$-simplex $S$ such that $S \subset A$ and $x_0 \in \operatorname{int}(S)$.

Proof. There exists $\varepsilon > 0$ such that the closed ball $B(x_0; \varepsilon)$ is contained in $A$. Let $e_1, e_2, \dots, e_n$ be an orthonormal basis for $\mathbb{R}^n$, and let
P:= The nsimplex T:=coaee 1 ,lee2, point 0 can be represented as
0=
1
/
E
2_, ei . n 2
, iteen , p) is contained in B(0;
The
E
(=eei )+ = tti acei )+ (lop 2n i:1
—
where 0< pi g is continuous at each point of ri(dom(f)) (the reader is urged to construct a counterexample). But it is true (and the simple proof is left to the reader) that a convex function R" —> k is lower semicontinuous at each point of ri(dorn(f)).
5.24 Theorem
Let $f$ be a proper convex function on $\mathbb{R}^n$. Then $\bar{f}$ is a proper convex function.

Proof. Let $x_0 \in \operatorname{ri}(\operatorname{dom}(f))$. We have $f(x_0) \in \mathbb{R}$. By remark (b) of § 5.23, $f$ is lower semicontinuous at $x_0$, hence $\bar{f}(x_0) = f(x_0)$ (cf. Theorem 5.8). Applying § 5.12, we find that the function $\bar{f}$ (which is convex and lower semicontinuous) cannot be improper.

5.25 Combining § 5.21 and § 5.23 yields the following result:

Theorem. Let $f$ be a proper convex function on $\mathbb{R}^n$. Then $f$ is locally Lipschitzian relative to $\operatorname{ri}(\operatorname{dom}(f))$.
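5.26 Let $E$ be a normed linear space and $A \subset E$. Let $T := \{f_\beta \mid \beta \in B\}$ be a collection of functions $E \to \overline{\mathbb{R}}$. $T$ is called locally equi-Lipschitzian relative to $A$ if each $f_\beta \in T$ is real-valued on $A$ and if for every $a \in A$ there exists a neighborhood $U$ of $a$ and a $K > 0$ such that
$$|f_\beta(x) - f_\beta(y)| \le K\|x - y\|$$
for all $x, y \in U \cap A$ and all $\beta \in B$.

Theorem. Let $U \subset \mathbb{R}^n$ be open, and let $T := \{f_\beta \mid \beta \in B\}$ be a collection of convex functions $\mathbb{R}^n \to \overline{\mathbb{R}}$. If the set $\{f_\beta(x) \mid \beta \in B\}$ is bounded for each $x \in U$ (in other words, if $T$ is pointwise bounded on $U$), then $T$ is locally equi-Lipschitzian relative to $U$.

Proof (cf. the proof of Theorem 5.23). Let $a \in U$. Define
$$M(x) := \sup\{f_\beta(x) \mid \beta \in B\} \ (x \in U), \qquad m(x) := \inf\{f_\beta(x) \mid \beta \in B\} \ (x \in U).$$
By Lemma 5.22, there exists an $n$-simplex $S = \operatorname{co}(a_0, a_1, \dots, a_n)$ such that $S \subset U$ and $a \in \operatorname{int}(S)$. We have
$$(\forall \beta \in B)(\forall x \in S)\ f_\beta(x) \le M$$
where $M := \max\{M(a_i) \mid 0 \le i \le n\}$. Let $r_0 > 0$ be such that $B(a; r_0) \subset S$, and let $x \in B(a; r_0)$, $x \ne a$. Setting $\|x - a\| = \sigma$ and $y = a + (r_0/\sigma)(a - x)$, we have $y \in B(a; r_0)$ and $a = \lambda x + (1 - \lambda) y$ where $\lambda = r_0/(r_0 + \sigma)$. It follows that for all $\beta \in B$
$$f_\beta(a) \le \lambda f_\beta(x) + (1 - \lambda) f_\beta(y) \le \lambda f_\beta(x) + (1 - \lambda) M$$
hence, since $1 \le 1/\lambda \le 2$,
$$f_\beta(x) \ge \frac{1}{\lambda} f_\beta(a) + \frac{\lambda - 1}{\lambda} M \ge m$$
where $m := 2m(a) - M$. We conclude that
$$(\forall \beta \in B)(\forall x \in B(a; r_0))\ m \le f_\beta(x) \le M.$$
Let $0 < r < r_0$. Following the proof in § 5.21, we get
$$|f_\beta(x) - f_\beta(y)| \le \frac{M - m}{r_0 - r}\,\|x - y\|$$
whenever $x, y \in B(a; r)$, $\beta \in B$, which proves the stated result.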
Remark. It is left to the reader to verify that the openness assumption concerning U may be weakened.
5.27 The reader is invited to note that some properties of linear functions have an analogue for convex functions. For instance, we can relate Theorem 5.23 to the fact that each real linear function on $\mathbb{R}^n$ is continuous. § 5.26 can be related to the theorem that says that a collection of real continuous linear functions on a Banach space that is pointwise bounded is uniformly bounded (i.e. there exists $K > 0$ such that $\|f\| \le K$ for each function $f$ in the collection).
DIFFERENTIABLE CONVEX FUNCTIONS
5.28 Definitions
Let $E$ be a normed linear space. Let $E'$ be the dual of $E$ (cf. § 3.13). Let $f$ be a function $E \to \overline{\mathbb{R}}$ and let $x_0$ be a point of $E$ where $f$ is finite.
(a) $f$ is said to be Fréchet differentiable (or, shortly, differentiable) at $x_0$ if there exists $x' \in E'$ such that
$$\lim_{x \to x_0} \frac{f(x) - f(x_0) - (x - x_0 \mid x')}{\|x - x_0\|} = 0 \tag{35}$$
or in other words
$$f(x) = f(x_0) + (x - x_0 \mid x') + o(\|x - x_0\|) \quad (x \to x_0).$$
$x'$ is uniquely determined by (35). It is called the Fréchet derivative (or, shortly, the derivative) of $f$ at $x_0$ and is denoted by $f'(x_0)$ or $df(x_0)$.
(b) $f$ is said to be Gâteaux differentiable at $x_0$ if there exists $x' \in E'$ such that for all $x \in E$
$$\lim_{\varepsilon \to 0} \frac{f(x_0 + \varepsilon x) - f(x_0)}{\varepsilon} = (x \mid x'). \tag{36}$$
$x'$ is uniquely determined by (36). It is called the Gâteaux differential of $f$ at $x_0$. We shall denote it by $\nabla f(x_0)$.

Fréchet differentiability implies Gâteaux differentiability, but the converse is not true. It is true, however, for proper convex functions on $\mathbb{R}^n$. If the Gâteaux differential $\nabla f(x_0)$ exists, then $f'(x_0; x)$ exists for each $x \in E$, and $f'(x_0; x) = (x \mid \nabla f(x_0))$. But the existence of all directional derivatives of $f$ at $x_0$ does not imply the Gâteaux differentiability of $f$ at $x_0$ (since the last property means that $f'(x_0; x)$, in addition to existing for each $x \in E$, is linear and continuous in $x$).
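5.29 Theorem
Let $f: \mathbb{R}^n \to \mathbb{R}$ possess continuous partial derivatives of order 1 and 2 (in other words, let $f$ be twice continuously differentiable). Define the Hessian matrix $H(x)$ of $f$ at $x$ by
$$H(x) := \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(x) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}(x) \end{pmatrix}$$
where $x \in \mathbb{R}^n$. Then $f$ is convex $\Leftrightarrow$ $H(x)$ is positive semidefinite for each $x \in \mathbb{R}^n$.

Proof. By § 5.9, we have: $f$ is convex $\Leftrightarrow$ for all $x, y \in \mathbb{R}^n$ the function $g: t \mapsto f(x + ty)$ from $\mathbb{R}$ to $\mathbb{R}$ is convex. By Theorem 1.11, $g$ is convex if and only if $g''(t) \ge 0$ for all $t \in \mathbb{R}$. We have
$$g'(t) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + ty)\, y_i,$$
$$g''(t) = \sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(x + ty)\, y_i y_j = (H(x + ty) y \mid y).$$
Hence
$$f \text{ is convex} \iff (H(x + ty) y \mid y) \ge 0 \text{ whenever } x, y \in \mathbb{R}^n,\ t \in \mathbb{R},$$
which proves the stated result.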
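The Hessian criterion is easy to apply to quadratic functions, whose Hessian is constant. The sketch below assumes $n = 2$ and uses the elementary test that a symmetric $2 \times 2$ matrix is positive semidefinite exactly when both diagonal entries and the determinant are nonnegative; the functions named in the comments and the helper names are examples chosen for illustration.

```python
def is_psd_2x2(H):
    """Positive semidefiniteness test for a symmetric 2 x 2 matrix."""
    (a, b), (c, d) = H
    if b != c:
        raise ValueError("matrix must be symmetric")
    return a >= 0 and d >= 0 and a * d - b * b >= 0


def quad_form(H, y):
    """Value of (H y | y)."""
    return sum(H[i][j] * y[i] * y[j] for i in range(2) for j in range(2))


# Hessian of f(x1, x2) = x1^2 + x1*x2 + x2^2 (constant in x).
H_f = [[2, 1],
       [1, 2]]
# Hessian of g(x1, x2) = x1^2 - x2^2 (constant in x).
H_g = [[2, 0],
       [0, -2]]

f_is_convex = is_psd_2x2(H_f)
g_is_convex = is_psd_2x2(H_g)
witness = quad_form(H_g, (0, 1))   # a direction y with (H_g y | y) < 0
```

For non-quadratic functions the Hessian depends on $x$, and the test would have to hold at every point, which a finite computation can only sample.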
SUBDIFFERENTIABILITY
Let $E$ be a normed linear space, and let $E'$ be the dual of $E$.

5.30 Convex functions are not necessarily differentiable. In the following we introduce the notion of subdifferentiability. It will turn out that in convex analysis subgradients of convex functions are often useful where ordinary derivatives do not exist.
Definitions. Let $f$ be a function $E \to \overline{\mathbb{R}}$, and let $x_0$ be a point of $E$ where $f$ is finite.
(a) Let $x_0' \in E'$. $x_0'$ is said to be a subgradient of $f$ at $x_0$ if
$$f(x) \ge f(x_0) + (x - x_0 \mid x_0')$$
whenever $x \in E$.
(b) The set of all subgradients of $f$ at $x_0$ is called the subdifferential of $f$ at $x_0$. It is denoted by $\partial f(x_0)$. $\partial f(x_0)$ is a convex subset of $E'$. $f$ is said to be subdifferentiable at $x_0$ if $\partial f(x_0) \ne \emptyset$. If $f$ is not finite at $x$, we define $\partial f(x) = \emptyset$.
(c) The subdifferential of $f$ is the multifunction $\partial f: x \mapsto \partial f(x)$ from $E$ to $E'$.
(d) The domain $\operatorname{dom}(\partial f)$ of $\partial f$ is the set $\{x \in E \mid \partial f(x) \ne \emptyset\}$.
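5.31 Examples
(a) Let $f: (a, b) \to \mathbb{R}$ be convex. By Theorem 1.6, $f$ is subdifferentiable at every $c \in (a, b)$, and we have $\partial f(c) = [f'_-(c), f'_+(c)]$.
(b) Define $f: \mathbb{R} \to \overline{\mathbb{R}}$ by
$$f(x) = \begin{cases} -\sqrt{1 - x^2} & \text{if } |x| \le 1 \\ +\infty & \text{if } |x| > 1 \end{cases}$$
$f$ is a proper convex function which is subdifferentiable (and even differentiable) on $(-1, +1)$. We have $-1, +1 \in \operatorname{dom}(f)$, but $f$ is not subdifferentiable at $-1$ or $+1$.
(c) Let $E = \mathbb{R}^n$, and let $f$ be the Euclidean norm on $\mathbb{R}^n$: $f(x) = \|x\|$. $f$ is not differentiable at $0$, but it is subdifferentiable at $0$: $\partial f(0)$ consists of all $x' \in \mathbb{R}^n$ such that
$$(\forall y \in \mathbb{R}^n)\ \|y\| \ge (y \mid x').$$
It follows that $\partial f(0)$ is the closed unit ball $\{x \in \mathbb{R}^n \mid \|x\| \le 1\}$.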
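Example (c) can be probed numerically. The following sketch, assuming $n = 2$, tests the subgradient inequality $\|y\| \ge (y \mid x')$ on a finite sample of directions $y$; the name `is_subgradient_at_zero`, the sample, and the tolerance are illustrative, and a finite sample can refute membership in $\partial f(0)$ but cannot prove it.

```python
import math


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def is_subgradient_at_zero(xp, samples, tol=1e-12):
    """Test ||y|| >= (y | xp) for every sampled y (Euclidean norm at 0)."""
    return all(norm(y) + tol >= dot(y, xp) for y in samples)


# 36 unit vectors spaced 10 degrees apart.
samples = [(math.cos(k * math.pi / 18), math.sin(k * math.pi / 18))
           for k in range(36)]

inside = (0.6, 0.8)    # norm 1: belongs to the closed unit ball
outside = (1.2, 0.0)   # norm 1.2: outside the closed unit ball

inside_ok = is_subgradient_at_zero(inside, samples)
outside_ok = is_subgradient_at_zero(outside, samples)
```

The candidate outside the unit ball fails at $y = (1, 0)$, where $(y \mid x') = 1.2 > 1 = \|y\|$.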
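5.32 In the following we give a geometric description of the notion of subdifferentiability. First we recall that every $F \in (E \oplus \mathbb{R})'$ can be represented as
$$F(x, \lambda) = (x \mid x_0') + \alpha_0 \lambda \quad ((x, \lambda) \in E \oplus \mathbb{R})$$
for some $x_0' \in E'$ and some $\alpha_0 \in \mathbb{R}$. We shall write this as $F = (x_0', \alpha_0)$. It follows that a closed hyperplane $H$ in $E \oplus \mathbb{R}$ is a set described by an equation of the form
$$(x \mid x_0') + \alpha_0 \lambda = \beta_0 \tag{37}$$
where $(x_0', \alpha_0) \ne (0, 0)$ and $\beta_0 \in \mathbb{R}$. $H$ is said to be vertical if $\alpha_0 = 0$ (hence $x_0' \ne 0$). If $H$ is nonvertical (which means that $\alpha_0 \ne 0$), then (37) can be written as
$$\lambda = \Bigl(x \Bigm| -\frac{1}{\alpha_0} x_0'\Bigr) + \frac{\beta_0}{\alpha_0}$$
and it follows that $H$ is the graph of the continuous affine function
$$x \mapsto \Bigl(x \Bigm| -\frac{1}{\alpha_0} x_0'\Bigr) + \frac{\beta_0}{\alpha_0}$$
from $E$ to $\mathbb{R}$.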
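5.33 Lemma
Let $f$ be a proper convex function on $E$ and $x_0 \in \operatorname{dom}(f)$. Let $F := (x_0', \alpha_0) \in (E \oplus \mathbb{R})'$, and let $H := F^{-1}(\beta_0)$ be a supporting hyperplane for $\operatorname{epi}(f)$ at $(x_0, f(x_0))$. Then:
(a) If $F(\operatorname{epi}(f)) \ge \beta_0$, then $\alpha_0 \ge 0$.
(b) If $x_0 \in \operatorname{int}(\operatorname{dom}(f))$, then $\alpha_0 \ne 0$ (in other words, $H$ is nonvertical).

Proof. (a) Since $F(\operatorname{epi}(f)) \ge \beta_0$, we have
$$\beta_0 \le (x \mid x_0') + \alpha_0 \lambda$$
whenever $(x, \lambda) \in \operatorname{epi}(f)$. Letting $\lambda \to +\infty$, we conclude that $\alpha_0 \ge 0$.
(b) Let $x_0 \in \operatorname{int}(\operatorname{dom}(f))$. Suppose we would have $\alpha_0 = 0$ and $F(\operatorname{epi}(f)) \ge \beta_0$, then
$$(x \mid x_0') \ge \beta_0 = (x_0 \mid x_0')$$
and so
$$(x - x_0 \mid x_0') \ge 0 \tag{38}$$
whenever $x \in \operatorname{dom}(f)$. Since $x_0 \in \operatorname{int}(\operatorname{dom}(f))$, for each $y \in E$ there exists $\varepsilon > 0$ such that $x_0 + \varepsilon y \in \operatorname{dom}(f)$, $x_0 - \varepsilon y \in \operatorname{dom}(f)$. Putting $x = x_0 + \varepsilon y$ and $x = x_0 - \varepsilon y$, respectively, in (38) we get $(y \mid x_0') \ge 0$ and $(y \mid x_0') \le 0$, hence $(y \mid x_0') = 0$. We thus would have $(y \mid x_0') = 0$ whenever $y \in E$, hence $x_0' = 0$, a contradiction.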
5.34 Theorem
Let $f$ be a proper convex function on $E$, and let $x_0 \in \operatorname{dom}(f)$. Then $f$ is subdifferentiable at $x_0$ if and only if there exists a nonvertical closed supporting hyperplane for $\operatorname{epi}(f)$ at $(x_0, f(x_0))$.

Proof. 'Only if': Let $x_0' \in \partial f(x_0)$. Set
$$F := (x_0', -1), \quad \beta_0 := (x_0 \mid x_0') - f(x_0), \quad H := F^{-1}(\beta_0).$$
$H$ is a nonvertical closed hyperplane in $E \oplus \mathbb{R}$. We have $F(x_0, f(x_0)) = \beta_0$, hence $(x_0, f(x_0)) \in H$. Let $(x, \lambda) \in \operatorname{epi}(f)$. We have $f(x) \le \lambda$ and $f(x) \ge f(x_0) + (x - x_0 \mid x_0')$, hence
$$F(x, \lambda) = (x \mid x_0') - \lambda \le (x \mid x_0') - f(x) \le (x_0 \mid x_0') - f(x_0) = \beta_0$$
and it follows that $H$ is a supporting hyperplane for $\operatorname{epi}(f)$ at $(x_0, f(x_0))$.
'If': Let $H = F^{-1}(\beta_0)$ be a nonvertical closed supporting hyperplane for $\operatorname{epi}(f)$ at $(x_0, f(x_0))$. We have $F = (x_0', \alpha_0)$ for some $x_0' \in E'$ and some $\alpha_0 \in \mathbb{R}$. There is no loss of generality if we assume that
$$F(\operatorname{epi}(f)) \ge \beta_0 \tag{39}$$
where $\beta_0 = F(x_0, f(x_0))$. Combining (39) and Lemma 5.33 yields $\alpha_0 \ge 0$. Since $\alpha_0 \ne 0$, it follows that $\alpha_0 > 0$. By (39), for each $x \in \operatorname{dom}(f)$
$$(x \mid x_0') + \alpha_0 f(x) \ge \beta_0 = (x_0 \mid x_0') + \alpha_0 f(x_0)$$
hence
$$f(x) \ge f(x_0) + \Bigl(x - x_0 \Bigm| -\frac{1}{\alpha_0} x_0'\Bigr). \tag{40}$$
Since (40) holds trivially when $x \notin \operatorname{dom}(f)$, it follows that $-x_0'/\alpha_0 \in \partial f(x_0)$, hence $\partial f(x_0) \ne \emptyset$.
5.35 Theorem
Let $f$ be a proper convex function on $E$.
(a) If $f$ is continuous at some $x_0 \in \operatorname{dom}(f)$, then $f$ is subdifferentiable at each point of $\operatorname{int}(\operatorname{dom}(f))$.
(b) If $E = \mathbb{R}^n$, then $f$ is subdifferentiable at each point of $\operatorname{ri}(\operatorname{dom}(f))$.

Proof. (a) Since $f$ is continuous at $x_0$, there exists a neighborhood $U$ of $x_0$ on which $f$ is bounded above: there exists $K > 0$ such that $f(U) \le K$. It follows that $U \times (K, +\infty) \subset \operatorname{epi}(f)$, hence $\operatorname{epi}(f)$ is a convex body in $E \oplus \mathbb{R}$. By § 3.10, there exists a nontrivial closed supporting hyperplane $H$ for $\operatorname{epi}(f)$ at $(x_0, f(x_0))$. Since $x_0 \in \operatorname{int}(\operatorname{dom}(f))$, it follows from Lemma 5.33 that $H$ is nonvertical. Applying Theorem 5.34, we conclude that $\partial f(x_0) \ne \emptyset$. By § 5.20, $f$ is continuous on $\operatorname{int}(\operatorname{dom}(f))$. Applying the above result yields $\partial f(x) \ne \emptyset$ whenever $x \in \operatorname{int}(\operatorname{dom}(f))$.
(b) Let $x_0 \in \operatorname{ri}(\operatorname{dom}(f))$. It is left to the reader to prove that $(x_0, f(x_0)) \in \operatorname{rb}(\operatorname{epi}(f))$. By § 5.12, there exists a nontrivial supporting hyperplane $H$ for $\operatorname{epi}(f)$ at $(x_0, f(x_0))$. Suppose $H$ were vertical: $H = F^{-1}(\beta_0)$ where $F = (x_0', 0)$. Following the proof of Lemma 5.33, we would have $(y \mid x_0') = \beta_0$ whenever $y \in \operatorname{dom}(f)$, hence $\operatorname{epi}(f) \subset H$, violating the fact that $H$ is a nontrivial supporting hyperplane for $\operatorname{epi}(f)$. We conclude that $H$ is nonvertical, hence $\partial f(x_0) \ne \emptyset$.

Corollary. A convex function $f: \mathbb{R}^n \to \mathbb{R}$ is subdifferentiable.

5.36 Theorem
Let $f$ be a convex function $E \to \overline{\mathbb{R}}$, and let $x_0$ be a point of $E$ where $f$ is finite. Then $x_0' \in \partial f(x_0)$ if and only if $f'(x_0; x) \ge (x \mid x_0')$ whenever $x \in E$.
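Proof. We have
$$x_0' \in \partial f(x_0) \iff (\forall x \in E)\ f(x) \ge f(x_0) + (x - x_0 \mid x_0')$$
$$\iff (\forall x \in E)(\forall \varepsilon > 0)\ f(x_0 + \varepsilon x) \ge f(x_0) + \varepsilon (x \mid x_0')$$
$$\iff (\forall x \in E)(\forall \varepsilon > 0)\ \frac{1}{\varepsilon}\{f(x_0 + \varepsilon x) - f(x_0)\} \ge (x \mid x_0').$$
Since
$$\frac{1}{\varepsilon}\{f(x_0 + \varepsilon x) - f(x_0)\} \downarrow f'(x_0; x) \quad (\varepsilon \downarrow 0)$$
(cf. § 1.5) the stated result follows.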
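The criterion of Theorem 5.36 can be tried out on the absolute value function on $\mathbb{R}$ at $x_0 = 0$, where $f'(0; x) = |x|$ and the subgradients are exactly the numbers $g$ with $|g| \le 1$. The sketch below approximates the directional derivative by a one-sided difference quotient with a small fixed step; the step size, the sample of directions, and the function names are assumptions made for illustration.

```python
def directional_derivative(f, x0, x, eps=1e-6):
    """One-sided difference quotient approximating f'(x0; x)."""
    return (f(x0 + eps * x) - f(x0)) / eps


def satisfies_criterion(g, directions, tol=1e-9):
    """Check f'(0; x) >= x * g for f = abs on the sampled directions."""
    return all(directional_derivative(abs, 0.0, x) + tol >= x * g
               for x in directions)


directions = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
checks = {g: satisfies_criterion(g, directions)
          for g in (-1.5, -1.0, 0.5, 1.0, 1.5)}
```

Because the difference quotient decreases to $f'(x_0; x)$ as $\varepsilon \downarrow 0$, a fixed positive step can only overestimate the directional derivative, so such a test is indicative rather than conclusive in general.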
5.37 Theorem
Let $f$ be a convex function $E \to \overline{\mathbb{R}}$, and let $x_0$ be a point of $E$ where $f$ is finite. If $f$ is Gâteaux differentiable at $x_0$, then $\partial f(x_0) = \{\nabla f(x_0)\}$ where $\nabla f(x_0)$ is the Gâteaux differential of $f$ at $x_0$.

Proof. From § 5.28 it follows that $f'(x_0; x) = (x \mid \nabla f(x_0))$ whenever $x \in E$. Applying Theorem 5.36 yields $\nabla f(x_0) \in \partial f(x_0)$. Conversely, if $x_0' \in \partial f(x_0)$, then by Theorem 5.36 $(x \mid \nabla f(x_0)) \ge (x \mid x_0')$, hence $(x \mid \nabla f(x_0) - x_0') \ge 0$ whenever $x \in E$. It follows that $\nabla f(x_0) - x_0' = 0$, hence $x_0' = \nabla f(x_0)$.

Remark. If $f$ is continuous at $x_0$ and has a unique subgradient at $x_0$, it can be proved that $f$ is Gâteaux differentiable at $x_0$.

5.38
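Theorem
Let $f_1, f_2$ be proper convex functions on $E$. Then:
(a) $\partial(f_1 + f_2)(x) \supset \partial f_1(x) + \partial f_2(x)$ whenever $x \in E$.
(b) If there is a point in $\operatorname{dom}(f_1) \cap \operatorname{dom}(f_2)$ where $f_1$ is continuous, then
$$\partial(f_1 + f_2)(x) = \partial f_1(x) + \partial f_2(x)$$
whenever $x \in E$.

Proof. (a) If $x_1' \in \partial f_1(x)$, $x_2' \in \partial f_2(x)$, then for all $y \in E$
$$f_1(y) \ge f_1(x) + (y - x \mid x_1'), \qquad f_2(y) \ge f_2(x) + (y - x \mid x_2')$$
hence
$$(f_1 + f_2)(y) \ge (f_1 + f_2)(x) + (y - x \mid x_1' + x_2').$$
It follows that $x_1' + x_2' \in \partial(f_1 + f_2)(x)$, which proves the stated result.
(b) Let $x_0 \in \operatorname{dom}(f_1 + f_2)$ $(= \operatorname{dom}(f_1) \cap \operatorname{dom}(f_2))$, $x_0' \in \partial(f_1 + f_2)(x_0)$ (the case where $x_0 \in E$, $x_0 \notin \operatorname{dom}(f_1 + f_2)$ is left to the reader). Define $g_1$ and $g_2$ by
$$g_1(x) := f_1(x + x_0) - f_1(x_0) - (x \mid x_0'), \qquad g_2(x) := f_2(x + x_0) - f_2(x_0). \tag{41}$$
$g_1$ and $g_2$ are proper convex functions on $E$, and $\operatorname{dom}(g_1) = \operatorname{dom}(f_1) - x_0$, $\operatorname{dom}(g_2) = \operatorname{dom}(f_2) - x_0$, $g_1(0) = g_2(0) = 0$, $0 \in \partial(g_1 + g_2)(0)$. Furthermore, $g_1$ is continuous at a point of $\operatorname{dom}(g_1) \cap \operatorname{dom}(g_2)$. Let
$$C_1 := \{(x, \lambda) \in E \oplus \mathbb{R} \mid g_1(x) \le \lambda\}, \qquad C_2 := \{(x, \lambda) \in E \oplus \mathbb{R} \mid \lambda \le -g_2(x)\}.$$
$C_1$ and $C_2$ are convex and nonempty (note that $C_1 = \operatorname{epi}(g_1)$). (Cf. Figure 12.)
In virtue of the continuity of $g_1$ at a point of $\operatorname{dom}(g_1)$, we have $\operatorname{int}(C_1) \ne \emptyset$ (cf. the proof of Theorem 5.35). Furthermore, $\operatorname{int}(C_1) \cap C_2 = \emptyset$. In fact, suppose this were not true. Then there would exist $(x, \lambda) \in \operatorname{int}(C_1) \cap C_2$, and there would exist $\varepsilon > 0$ such that $\{x\} \times (\lambda - \varepsilon, \lambda + \varepsilon) \subset C_1$, hence $g_1(x) \le \lambda - \varepsilon < \lambda \le -g_2(x)$, a contradiction, since $0 \in \partial(g_1 + g_2)(0)$ implies $g_1(x) \ge -g_2(x)$ whenever $x \in E$.

Figure 12

By Theorem 3.8, there exists a closed hyperplane $H$ in $E \oplus \mathbb{R}$ separating $C_1$ and $C_2$ properly. Let $H = F^{-1}(\beta)$, $F(C_1) \le \beta$, $F(C_2) \ge \beta$, $F = (x', \alpha)$ where $(x', \alpha) \ne (0, 0)$. Since $(0, 0) \in H$, we have $\beta = 0$. Suppose $\alpha$ were zero (hence $x' \ne 0$). Then we would have
$$(x \mid x') \le 0 \quad \text{for all } x \in \operatorname{dom}(g_1) \tag{42}$$
and
$$(x \mid x') \ge 0 \quad \text{for all } x \in \operatorname{dom}(g_2). \tag{43}$$
Let $y$ be a point of $\operatorname{dom}(g_1) \cap \operatorname{dom}(g_2)$ where $g_1$ is continuous, then $y \in \operatorname{int}(\operatorname{dom}(g_1))$. (42) implies $(y \mid x') < 0$, violating (43). We conclude that $\alpha \ne 0$. Since $(0, \lambda) \in C_1$ for each $\lambda > 0$, we have $\alpha < 0$.

convex. Define $f: V \to \overline{\mathbb{R}}$ by $f(x) = \inf\{\lambda \mid (x, \lambda) \in A\}$.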
Show that $f$ is convex.

7 Let $E$ be a linear topological space. Let $f$ be a continuous proper convex function on $E$ and $\lambda \in \mathbb{R}$. Let $A := \{x \in E \mid f(x) \le \lambda\}$, $B := \{x \in E \mid f(x) < \lambda\}$. Prove the following statements: (a) It is not necessarily true that $B = \operatorname{int}(A)$. (b) If there exists $x \in E$ such that $f(x) < \lambda$, then $B = \operatorname{int}(A)$.

8 Let $V$ be a linear space, and let $f$ be a real convex function on $V$. Prove the following statements: (a) Every local minimum point of $f$ is a global minimum point. (b) If $f$ is strictly convex, then $f$ has at most one global minimum point. (Cf. Exercise 1 of Chapter 1.)

9 Let $E$ be an inner product space. Let $C \subset E$ be convex and nonempty, and let $x_0 \in E$. Show that there exists at most one best approximation to $x_0$ in $C$, that is an element $c \in C$ such that $\|x_0 - c\| = d(x_0, C)$ (cf. § 5.18, example (b)).

10 Let $V$ be a linear space, and let $f$ and $g$ be convex functions $V \to \overline{\mathbb{R}}$. Determine $\operatorname{dom}(f \mathbin{\square} g)$.

11 Let $E$ be a linear topological space. Let $f\colon E \to \overline{\mathbb{R}}$ be convex and $a \in \operatorname{dom}(f)$. Prove that $f$ is continuous at $a$ if and only if $f$ is locally bounded above at $a$.

12 The function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ is defined by

(16, • • • 6, ) lin f • • • , otherwise.

Prove that $f$ is convex.

13 Let $f$ be a continuous function from $\mathbb{R}^n \times \mathbb{R}$ to $\mathbb{R}$ such that, for each $t \in [a, b]$, the function $x \mapsto f(x, t)$ is convex (shortly: $f$ is convex in $x$). Show that the function $g\colon \mathbb{R}^n \to \mathbb{R}$ defined by

$$g(x) = \int_a^b f(x, t)\, dt$$

is convex.

14 Let $E$ be a normed linear space. Show that a subdifferentiable function from $E$ to $\mathbb{R}$ is convex. (Cf. Exercise 3 of Chapter 1.)

15 Let $f$ be the Tchebycheff norm on $\mathbb{R}^n$, defined by

$$f(\xi_1, \ldots, \xi_n) = \max_{1 \le i \le n} |\xi_i|.$$

Show that $\partial f(0) = \operatorname{co}(e_1, \ldots, e_n, -e_1, \ldots, -e_n)$ where $e_1, e_2, \ldots, e_n$ are the unit vectors in $\mathbb{R}^n$ ($e_1 = (1, 0, 0, \ldots, 0)$, etc.).

16 Let $E$ be a normed linear space, and let $f(x) = \tfrac12\|x\|^2$ $(x \in E)$. (a) Show that $f$ is convex. (b) Show that for all $x \in E$

$$\partial f(x) = \{x' \in E' \mid (x \mid x') = \|x\|\,\|x'\| \text{ and } \|x'\| = \|x\|\}$$

where

$$\|x'\| := \sup_{\|x\| \le 1} |(x \mid x')|.$$

Prove: if $E$ is a Hilbert space, then $\partial f(x) = \{x\}$ for each $x \in E$.

17 Let $E$ be a normed linear space. Let $f$ be a real Gâteaux-differentiable function on $E$, with Gâteaux-differential $\nabla f$. Prove the following statements: (a) $f$ is convex if and only if

$$f(y) \ge f(x) + (y - x \mid \nabla f(x))$$

whenever $x, y \in E$. (b) $f$ is strictly convex if and only if

$$f(y) > f(x) + (y - x \mid \nabla f(x))$$

whenever $x, y \in E$, $x \ne y$.
18 Show by means of a counterexample that it is generally not true that $\partial(f_1 + f_2)(x) = \partial f_1(x) + \partial f_2(x)$ (cf. the remark following Theorem 5.38).

19 Show that the theorem we get by putting $E = \mathbb{R}^n$ in Theorem 5.38 is a special case of the theorem in § 5.39.

The following exercises, although useful and elementary, are a little bit tedious.

20 Let $X$ be a topological space and $A \subset X$. Let $f$ be a function $A \to \mathbb{R}$. Show that the following conditions are equivalent. (a) For each $\lambda \in \mathbb{R}$, the set $\{x \in A \mid f(x) \le \lambda\}$ is a closed subset of $X$. (b) $\{(x, \lambda) \in A \times \mathbb{R} \mid f(x) \le \lambda\}$ is a closed subset of $X \times \mathbb{R}$. (c) $f$ is lower semicontinuous and for each $a \in \overline{A} \setminus A$, $\lim_{x \to a} f(x) = +\infty$.

21 (a) Let $E$ be a linear topological space. Let $C \subset E$ be convex, and let $f$ be a real convex function on $C$. Show that either (I) $f$ is locally bounded below at each point of $\overline{C}$, or (II) $f$ is locally bounded below at no point of $\overline{C}$.
Remark. $f$ is said to be locally bounded below at $a \in \overline{C}$ if there exists a neighborhood $U$ of $a$ such that $f$ is bounded below on $U \cap C$.
(b) Let $C \subset \mathbb{R}^n$ be convex, and let $f$ be a real convex function on $C$. Show that $f$ is locally bounded below at each point of $\overline{C}$. (c) Find $E$, $C$, and $f$ for which statement (II) of (a) is true.

22 Let $U \subset \mathbb{R}^n$ be open, convex and nonempty, and let $f\colon U \to \mathbb{R}$ be convex. Prove that $f$ is Lipschitzian relative to each compact subset of $U$. (Cf. § 1.7.)

23 Let $f$ be a function $\mathbb{R}^p \times \mathbb{R}^q \to \mathbb{R}$ such that $f(x, y)$ is continuous in $x$ and convex in $y$ (which means that $x \mapsto f(x, y)$ is a continuous function for each $y \in \mathbb{R}^q$ and $y \mapsto f(x, y)$ is a convex function for each $x \in \mathbb{R}^p$). Show that $f$ is continuous.

24 Let $E$ be a normed linear space, and let $f$ be a convex function $E \to \overline{\mathbb{R}}$ which is continuous at a point $x_0$ where $f$ is finite. Show that

$$f'(x_0; x) = \max\{(x \mid x_0') \mid x_0' \in \partial f(x_0)\}$$

whenever $x \in E$.

25 Prove the 'remark' following Theorem 5.37.
NOTES

1 It can be proved that a lower semicontinuous convex function $f$ on a Banach space is continuous on $\operatorname{int}(\operatorname{dom}(f))$. (Cf. § 5.20.)

2 In W. W. Breckner and G. Orbán, 'Continuity properties of rationally s-convex mappings with values in an ordered topological linear space', 'Babeş-Bolyai' University of Cluj-Napoca, 1978, results on the continuity of convex functions (cf. § 5.20) are extended to mappings $f$ whose values lie in an ordered topological linear space and satisfying the inequality $f(\lambda x + (1 - \lambda)y) \le \lambda^s f(x) + (1 - \lambda)^s f(y)$ for all $x, y$ and all $\lambda \in (0, 1)$, where $s$ is a real number belonging to the interval $(0, 1]$.

3 Convex analysis in $\mathbb{R}^n$ is the oldest and most developed part of our subject. A fundamental reference is R. T. Rockafellar, Convex Analysis, Princeton, Princeton University Press, 1970.

4 A more detailed account of differentiability properties can be found in A. D. Ioffe and V. M. Tihomirov, Theory of Extremal Problems, Amsterdam, North-Holland, 1979.

5 It can be proved that if $f$ is a lower semicontinuous proper convex function on a Banach space, then $\operatorname{dom}(\partial f) \ne \emptyset$ (cf. § 6.35) and $\operatorname{dom}(\partial f)$ is even dense in $\operatorname{dom}(f)$. See, for instance I. Ekeland and R. Temam, Convex Analysis and Variational Problems, Amsterdam, North-Holland, 1976.

6 Theorem 5.38 is in R. T. Rockafellar, Convex functions and dual extremum problems, PhD Thesis, Harvard University, 1963.
CHAPTER 6
Duality
In this chapter we designate by $E$ a normed linear space (containing more than one point) over $\mathbb{R}$, with norm $x \mapsto \|x\|$, and by $E'$ the dual of $E$. The separation theorems imply that for each $x \in E$, $x \ne 0$ there exists $x' \in E'$ such that $(x \mid x') \ne 0$.

THE CONJUGATE FUNCTION
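6.1 Definitions

(a) The conjugate (or dual or polar) of a function $f\colon E \to \overline{\mathbb{R}}$ is the function $f^*\colon E' \to \overline{\mathbb{R}}$ defined by

$$f^*(x') = \sup_{x \in E} \{(x \mid x') - f(x)\} \quad (x' \in E').$$

(b) The conjugate of a function $g\colon E' \to \overline{\mathbb{R}}$ is the function $g^*\colon E \to \overline{\mathbb{R}}$ defined by

$$g^*(x) = \sup_{x' \in E'} \{(x \mid x') - g(x')\} \quad (x \in E).$$

(c) The bipolar $f^{**}$ of a function $f$ from $E$ to $\overline{\mathbb{R}}$ or from $E'$ to $\overline{\mathbb{R}}$ is the conjugate $(f^*)^*$ of the conjugate of $f$.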
6.2 Remarks

(a) If $f^*(x')$ is finite, then it equals the smallest real number $\alpha$ satisfying

$$f(x) \ge (x \mid x') - \alpha$$

whenever $x \in E$. (Cf. § 1.16.)

(b) Every function of the form $x \mapsto (x \mid x') + \alpha$, where $x' \in E'$, $\alpha \in \mathbb{R}$, is a continuous affine function on $E$ (hence so is the function $x \mapsto (x \mid x') - g(x')$), and every continuous affine function on $E$ is of this form. If we supply $E'$ with the norm topology, defined by the norm $x' \mapsto \|x'\|$ where

$$\|x'\| := \sup_{\|x\| \le 1} |(x \mid x')| = \sup_{\|x\| = 1} |(x \mid x')|,$$

then every function of the form $x' \mapsto (x \mid x') + \alpha$, where $x \in E$, $\alpha \in \mathbb{R}$, is a continuous affine function on $E'$ (hence so is the function $x' \mapsto (x \mid x') - f(x)$). But then it is generally not true that every continuous affine function on $E'$ is of this form. This is the case, for instance, if we supply $E'$ with the so-called weak topology $w(E', E)$ (with the property that convergence relative to this topology of a sequence (or, more generally, a net) $(x_i')$ to a point $x'$ means that $(x \mid x_i') \to (x \mid x')$ for each $x \in E$).

6.3 The proof of the following simple properties is left to the reader.

(a) If $f$, $h$ are functions from $E$ to $\overline{\mathbb{R}}$ such that $f \le h$, then $f^* \ge h^*$.

(b) $(+\infty)^* = -\infty$.

(c) If there is a point where $f\colon E \to \overline{\mathbb{R}}$ has the value $-\infty$, then $f^* = +\infty$. In particular, $(-\infty)^* = +\infty$.

Note that (b) and (c) imply that the formula $f^{**} = f$ is generally not true. The reader can easily show that for all $f\colon E \to \overline{\mathbb{R}}$ we have $f^{**} \le f$.

(d) If $\{f_\alpha \mid \alpha \in A\}$ is an arbitrary collection of functions $E \to \overline{\mathbb{R}}$, then

$$\Bigl(\inf_\alpha f_\alpha\Bigr)^* = \sup_\alpha f_\alpha^*, \qquad \Bigl(\sup_\alpha f_\alpha\Bigr)^* \le \inf_\alpha f_\alpha^*.$$

In the last inequality, equality does not hold in general.

(e) If $f$ is a function $E \to \overline{\mathbb{R}}$ and $\lambda > 0$, then

$$(\lambda f)^*(x') = \lambda f^*\Bigl(\frac{x'}{\lambda}\Bigr) \quad (x' \in E').$$

(f) If $f$ is a function $E \to \overline{\mathbb{R}}$ and $\alpha \in \mathbb{R}$, then

$$(f + \alpha)^* = f^* - \alpha.$$

(g) If $f$ is a function $E \to \overline{\mathbb{R}}$ and $x \in E$, $x' \in E'$, then

$$f_x^*(x') = f^*(x') + (x \mid x'),$$

where the function $f_x$ is defined by $f_x(y) = f(y - x)$ $(y \in E)$.

(h) If $f$ is a function $E \to \overline{\mathbb{R}}$, then

$$\inf\{f(x) \mid x \in E\} = -f^*(0).$$
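6.4 Examples

(a) Let $x_0' \in E'$, $\alpha \in \mathbb{R}$. Define $f\colon E \to \mathbb{R}$ by

$$f(x) = (x \mid x_0') - \alpha \quad (x \in E).$$

Then

$$f^*(x') = \sup_{x \in E} \{(x \mid x' - x_0') + \alpha\} = \begin{cases} +\infty & \text{if } x' \ne x_0' \\ \alpha & \text{if } x' = x_0' \end{cases}$$

hence $f^* = \delta_{\{x_0'\}} + \alpha$ (cf. § 5.15).

(b) Let $f(x) = \tfrac12\|x\|^2$. Then

$$f^*(x') = \sup_{x \in E} \{(x \mid x') - \tfrac12\|x\|^2\} = \sup_{t \ge 0}\, \sup_{\|x\| = t} \{(x \mid x') - \tfrac12 t^2\} = \sup_{t \ge 0} \Bigl\{ t \sup_{\|x\| = 1} (x \mid x') - \tfrac12 t^2 \Bigr\} = \sup_{t \ge 0} \{ t\|x'\| - \tfrac12 t^2 \} = \tfrac12\|x'\|^2,$$

hence $f^*(x') = \tfrac12\|x'\|^2$.

6.5 Definition

Let $A \subset E$. The support function of $A$ is the conjugate $\delta_A^*$ of the indicator function $\delta_A$ of $A$:

$$\delta_A^*(x') = \sup_{x \in E} \{(x \mid x') - \delta_A(x)\} = \sup_{x \in A} (x \mid x') \quad (x' \in E').$$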
The reader is urged to give a geometrical interpretation of $\delta_A^*$ in the case where $E = \mathbb{R}^n$.
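As a small numerical companion to Definition 6.5, the following sketch evaluates the support function of a finite set in $\mathbb{R}^2$, where the supremum is a maximum over finitely many points. The helper names `dot` and `support` and the sample set `square` are illustrative assumptions, not taken from the text; $\mathbb{R}^2$ is identified with its dual through the Euclidean inner product.

```python
def dot(u, v):
    """Euclidean inner product (x | x') on R^n."""
    return sum(a * b for a, b in zip(u, v))


def support(points, direction):
    """Support function of a finite set A: max of (x | x') over x in A."""
    return max(dot(p, direction) for p in points)


# Hypothetical example set: the four corners of the square [-1, 1]^2.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]

h = support(square, (2.0, -3.0))          # attained at the corner (1, -1)
h_scaled = support(square, (4.0, -6.0))   # direction doubled
h_more = support(square + [(0.0, 0.0)], (2.0, -3.0))  # one extra point of the hull
```

In this example doubling the direction doubles the value, and adding a point of the convex hull of the corners leaves the value unchanged.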
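6.6 Theorem

Let $f$ and $g$ be functions from $E$ to $\mathbb{R} \cup \{+\infty\}$. Then

$$(f \mathbin{\square} g)^* = f^* + g^*.$$

Proof. For each $x' \in E'$, we have

$$\begin{aligned}
(f \mathbin{\square} g)^*(x') &= \sup_{x} \Bigl\{(x \mid x') - \inf_{x_1 + x_2 = x} [f(x_1) + g(x_2)]\Bigr\} \\
&= \sup_{x}\, \sup_{x_1 + x_2 = x} [(x \mid x') - f(x_1) - g(x_2)] \\
&= \sup_{x_1, x_2} \{[(x_1 \mid x') - f(x_1)] + [(x_2 \mid x') - g(x_2)]\} \\
&= f^*(x') + g^*(x').
\end{aligned}$$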
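6.7 Example

Let $C \subset E$ be convex and nonempty. Define $f\colon E \to \mathbb{R}$ by

$$f(x) = \inf_{y \in C} \|x - y\| \quad (x \in E).$$

From § 5.18, example (b) it follows that $f = g \mathbin{\square} \delta_C$ where $g(x) = \|x\|$ $(x \in E)$. For each $x' \in E'$, we have

$$g^*(x') = \sup_{x \in E} \{(x \mid x') - \|x\|\} = \sup_{t \ge 0}\, \sup_{\|x\| = t} \{(x \mid x') - t\} = \sup_{t \ge 0} t(\|x'\| - 1) = \begin{cases} +\infty & \text{if } \|x'\| > 1 \\ 0 & \text{if } \|x'\| \le 1 \end{cases}$$

hence $g^* = \delta_S$ where $S = \{x' \in E' \mid \|x'\| \le 1\}$. It follows that

$$f^* = g^* + \delta_C^* = \delta_S + \delta_C^*,$$

hence

$$f^*(x') = \begin{cases} \delta_C^*(x') & \text{if } \|x'\| \le 1 \\ +\infty & \text{if } \|x'\| > 1. \end{cases}$$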
6.8 Theorem

Let $f$ be a function $E \to \overline{\mathbb{R}}$. Then $f^*$ is a lower semicontinuous convex function on $E'$ (with the norm topology; cf. § 6.2, remark (b)). The simple proof is left to the reader.

6.9 Let $f$ be a function $E \to \overline{\mathbb{R}}$. For each $x \in E$, $x' \in E'$,

$$f^*(x') \ge (x \mid x') - f(x),$$

hence

$$f(x) + f^*(x') \ge (x \mid x') \tag{44}$$

whenever the left-hand side is defined. (44) is called Fenchel's inequality. (Cf. § 1.16.)
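A rough numerical illustration of the conjugate and of Fenchel's inequality, in the simplest setting $E = E' = \mathbb{R}$ with $(x \mid x') = x x'$: the supremum defining $f^*$ is replaced by a maximum over a finite grid, which is only an approximation in general. The function `conjugate_on_grid`, the grid `xs` and the sample slopes `ys` are assumptions made for this sketch; the test function is $f(x) = \tfrac12 x^2$, whose conjugate is $\tfrac12 x'^2$ by § 6.4, example (b).

```python
def conjugate_on_grid(f, xs, y):
    """Approximate f*(y) = sup_x { x*y - f(x) } by a maximum over the sample points xs."""
    return max(x * y - f(x) for x in xs)


def f(x):
    return 0.5 * x * x


xs = [i / 100.0 for i in range(-500, 501)]   # grid on [-5, 5] with step 0.01
ys = [-2.0, -0.5, 0.0, 1.0, 3.0]             # sample slopes; each lies on the grid

# Grid approximation of f*(y) next to the closed form 0.5*y**2.
approx = {y: conjugate_on_grid(f, xs, y) for y in ys}
exact = {y: 0.5 * y * y for y in ys}

# Smallest value of f(x) + f*(y) - x*y over the samples; Fenchel's inequality says it is >= 0.
smallest_gap = min(f(x) + exact[y] - x * y for x in xs for y in ys)
```

Here the gap equals $\tfrac12 (x - y)^2$, so it vanishes exactly when $x = y$.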
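6.10 Theorem

Let $f$ be a function $E \to \overline{\mathbb{R}}$ and let $x$ be a point of $E$ where $f$ is finite. Then

$$x' \in \partial f(x) \iff f^*(x') = (x \mid x') - f(x). \tag{45}$$

Proof. We have

$$\begin{aligned}
x' \in \partial f(x) &\iff (\forall y \in E)\ f(y) \ge f(x) + (y - x \mid x') \\
&\iff (\forall y \in E)\ (x \mid x') - f(x) \ge (y \mid x') - f(y) \\
&\iff \sup_{y \in E} \{(y \mid x') - f(y)\} = (x \mid x') - f(x) \\
&\iff f^*(x') = (x \mid x') - f(x).
\end{aligned}$$

Remark. The last equality can be written as

$$f(x) + f^*(x') = (x \mid x').$$

We thus see that the subgradients of $f$ are the elements of $E'$ for which Fenchel's inequality turns out to be an equality.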
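The equality case can be checked by hand on $E = \mathbb{R}$ with $f(x) = |x|$. By Example 6.7 (the computation of $g^*$ for $g = \|\cdot\|$), $f^*$ is the indicator function of $[-1, 1]$. The sketch below uses the criterion of Theorem 6.10 as a membership test for $\partial f(x)$; the function names are hypothetical, and exact floating-point comparison is used only because the sample values are exactly representable.

```python
import math


def f(x):
    return abs(x)


def f_star(y):
    """Conjugate of |.| on R: the indicator function of the interval [-1, 1]."""
    return 0.0 if abs(y) <= 1.0 else math.inf


def is_subgradient(x, y):
    """Theorem 6.10: y is a subgradient of f at x iff f(x) + f*(y) equals x*y."""
    return f(x) + f_star(y) == x * y


checks = {
    (2.0, 1.0): is_subgradient(2.0, 1.0),      # slope of |.| at a positive point
    (2.0, 0.5): is_subgradient(2.0, 0.5),      # too flat: strict Fenchel inequality
    (-3.0, -1.0): is_subgradient(-3.0, -1.0),  # slope at a negative point
    (0.0, -0.25): is_subgradient(0.0, -0.25),  # any slope in [-1, 1] works at the kink
    (0.0, 1.5): is_subgradient(0.0, 1.5),      # outside [-1, 1], f* is +infinity
}
```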
THE BIPOLAR FUNCTION

6.11 Theorem

Let $f$ be a function $E \to \overline{\mathbb{R}}$. Then: (a) $f^{**}$ is the pointwise supremum of the collection of all continuous affine minorants of $f$. (b) $f^{**}$ is a lower semicontinuous convex function on $E$. (c) $f^{***} = f^*$.

Proof. (a) Let $A$ be the collection of all continuous affine minorants of $f$, and let $F = \sup\{g \mid g \in A\}$. Suppose first that $f^*(x') = -\infty$ for some $x' \in E'$. The reader can easily prove that in this case we have $f^{**} = f = F = +\infty$, hence $f^{**} = F$. Suppose next that $f^*(x') > -\infty$ for each $x' \in E'$. Then for each $x' \in E'$ such that $f^*(x') < +\infty$, the function $g\colon x \mapsto (x \mid x') - f^*(x')$ is a continuous affine function on $E$. From § 6.9 it follows that $g$ is a minorant of $f$, hence $g \in A$. It follows that

$$f^{**}(x) = \sup\{(x \mid x') - f^*(x') \mid f^*(x') < +\infty\} \le F(x)$$

whenever $x \in E$. If $h \in A$, $h$ is of the form $h(x) = (x \mid x') - \alpha$ $(x \in E)$ where $x' \in E'$, $\alpha \in \mathbb{R}$, and we have $(\forall x \in E)\ (x \mid x') - \alpha \le f(x)$, hence

$$\alpha \ge \sup_{x \in E} \{(x \mid x') - f(x)\} = f^*(x')$$

and so

$$h(x) = (x \mid x') - \alpha \le (x \mid x') - f^*(x').$$

It follows that for each $x \in E$

$$F(x) \le \sup_{x' \in E'} \{(x \mid x') - f^*(x')\} = f^{**}(x).$$

We conclude that $f^{**} = F$. (b) Cf. Theorem 6.8. (c) We have $f^{**} \le f$ (cf. § 6.3(c)), hence $f^{***} \ge f^*$ (cf. § 6.3(a)). Applying the inequality $f^{**} \le f$ to $f^*$ instead of $f$ gives $f^{***} \le f^*$, which proves the stated result.
6.12 Lemma

Let $g$ be a lower semicontinuous proper convex function from $E$ to $\overline{\mathbb{R}}$. Then $g$ has a continuous affine minorant. More precisely, for each $x_0 \in E$ and each $\alpha_0 \in (-\infty, g(x_0))$ there exists a continuous affine function $h$ such that $h(x_0) = \alpha_0$, $h$ [...] $> (x_0 \mid x') + \alpha\alpha_0$, hence $\alpha(g(x_0) - \alpha_0) > 0$. It follows that $\alpha > 0$. Define the continuous affine function $h$ on $E$ by

$$h(x) = \Bigl(x - x_0 \Bigm| -\frac{1}{\alpha}\, x'\Bigr) + \alpha_0 \quad (x \in E).$$
By (46) and (47), we have for each x a
(1
E
dom(g)
a
= h(x)+ (x 0 1 1 x')+ ao > h(x) a a hence h < g. Also, h(x 0)= ao . (b) If g(x0)= +00 and a >0, we can give a proof similar to the second half of the proof of (a). (c) Assume now that g(x0)= +00 and a = 0. Define the continuous affine function k on E by
k(x)= (x I —x')+ p (x E E).
We have $k(x_0) > 0$ and $k(x) < 0$ for each $x \in \operatorname{dom}(g)$. From (a) it follows that $g$ has a continuous affine minorant $m$ such that $m < g$. If $m(x_0) \ge \alpha_0$, then the function $h := m + \alpha_0 - m(x_0)$ satisfies $h(x_0) = \alpha_0$, $h < g$. If $m(x_0) < \alpha_0$, for each $\lambda > 0$ the function $m + \lambda k$ is a continuous affine function satisfying $m + \lambda k < g$. Defining $h := m + \lambda_0 k$ where

$$\lambda_0 = \frac{\alpha_0 - m(x_0)}{k(x_0)}$$

we have again $h(x_0) = \alpha_0$, $h < g$.
Corollary. A lower semicontinuous proper convex function is the pointwise supremum of the collection of its continuous affine minorants. If g is such a function, then g** = g (cf. Theorem 6.11(a)).
6.13 In order to give another (simple) characterization of the bipolar function, we introduce the notion of closure of a function which is closely related to the notion of lower semicontinuous hull.
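Definitions. Let $f$ be a function $E \to \overline{\mathbb{R}}$. (a) The closure $\operatorname{cl}(f)$ of $f$ is defined to be the lower semicontinuous hull of $f$ if $f$ nowhere has the value $-\infty$, and in the other case it is defined to be the constant $-\infty$. (b) $f$ is said to be closed if $\operatorname{cl}(f) = f$.

6.14 Examples

(a) Let $f\colon E \to \overline{\mathbb{R}}$ be convex. If $f$ has the value $-\infty$ somewhere, by § 5.12 the lower semicontinuous hull of $f$ has no finite values: it equals $-\infty$ on its effective domain and $+\infty$ elsewhere. In this case $\operatorname{cl}(f)$ differs from the lower semicontinuous hull only outside that effective domain, where $\operatorname{cl}(f)$ is $-\infty$ whereas the hull is $+\infty$ there.

(b) If $f$ is a proper convex function on $\mathbb{R}^n$, then by Theorem 5.24, the lower semicontinuous hull of $f$ is proper convex too. In this case $\operatorname{cl}(f)$ is the lower semicontinuous hull of $f$. Hence for proper convex functions on $\mathbb{R}^n$, closedness is the same as lower semicontinuity.

6.15 Theorem

Let $f$ be a function $E \to \overline{\mathbb{R}}$. Then $f^{**} = \operatorname{cl}(\operatorname{co}(f))$.

Proof. Set $g := \operatorname{cl}(\operatorname{co}(f))$. $g$ is a lower semicontinuous convex function. (a) If $g = +\infty$, then $f = +\infty$ hence $f^{**} = +\infty = g$. (b) Assume $g = -\infty$. Suppose $f$ would have a continuous affine minorant $h$. Then we would have $h \le f$ hence $h \le \operatorname{co}(f)$. This would imply $\operatorname{co}(f)(x) > -\infty$ for each $x \in E$, hence the lower semicontinuous hull of $\operatorname{co}(f)$, which is not less than $h$, would equal $\operatorname{cl}(\operatorname{co}(f)) = g$, a contradiction. We conclude that $f$ has no continuous affine minorants, hence $f^{**} = -\infty = g$. (c) In the remaining case $g$ is a lower semicontinuous proper convex function and $g$ is the lower semicontinuous hull of $\operatorname{co}(f)$. We have $f^{**} \le g$. Suppose there were $x_0 \in E$ such that $f^{**}(x_0) < g(x_0)$. This would imply the existence of $\alpha \in \mathbb{R}$ such that $f^{**}(x_0) < \alpha < g(x_0)$. By Lemma 6.12, there would exist a continuous affine function $h$ such that $h(x_0) = \alpha$ and $h < g$. This would imply $f^{**}(x_0) \ge \alpha$, a contradiction. We conclude that $f^{**} = g$.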
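Theorem 6.15 can be watched at work on a nonconvex function of one real variable. The sketch below takes the double-well function $f(x) = \min\{(x-1)^2, (x+1)^2\}$, whose convex hull vanishes on $[-1, 1]$ and agrees with $f$ outside that interval, and approximates $f^*$ and $f^{**}$ by maxima over finite grids. The grids, the function names and the chosen test function are assumptions made for illustration, and the grid maxima only approximate the suprema.

```python
def f(x):
    """Double-well function: nonconvex, with minima at x = -1 and x = 1."""
    return min((x - 1.0) ** 2, (x + 1.0) ** 2)


xs = [i / 100.0 for i in range(-300, 301)]   # primal grid on [-3, 3]
ys = [j / 100.0 for j in range(-400, 401)]   # dual grid on [-4, 4]

# Grid approximation of the conjugate f*(y) = sup_x { x*y - f(x) }.
f_star = {y: max(x * y - f(x) for x in xs) for y in ys}


def biconjugate(x):
    """Grid approximation of f**(x) = sup_y { x*y - f*(y) }."""
    return max(x * y - f_star[y] for y in ys)


inside = biconjugate(0.0)    # f(0) = 1, but the convex hull is 0 on [-1, 1]
outside = biconjugate(2.0)   # outside [-1, 1] the hull agrees with f, and f(2) = 1
```

In this sketch the bipolar fills in the well between the two minima and leaves the outer branches unchanged, up to the grid error.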
6.16 Theorem

If $f$ is a proper convex function on $\mathbb{R}^n$, then $f^{**} = \operatorname{cl}(f)$.

Proof. Combine Theorem 6.15 and § 6.14, example (b).

Remark. Cf. Theorem 1.15.

THE SET $\Gamma(E)$

6.17 Definition

$\Gamma(E)$ is the set of all functions $E \to \overline{\mathbb{R}}$ which are pointwise supremum of a family of functions on $E$ of the form $x \mapsto (x \mid x') + \alpha$, where $x' \in E'$, $\alpha \in \mathbb{R}$. Analogously, we define $\Gamma(E')$ as the set of all functions $E' \to \overline{\mathbb{R}}$ which are pointwise supremum of a family of functions on $E'$ of the form $x' \mapsto (x \mid x') + \alpha$, where $x \in E$, $\alpha \in \mathbb{R}$. Note that $\Gamma(E)$ can also be defined as the set of all functions $E \to \overline{\mathbb{R}}$ which are pointwise supremum of a family of continuous affine functions on $E$. $\Gamma(E')$ can be defined analogously, provided we supply $E'$ with a suitable topology (for instance, the weak topology; cf. § 6.2, remark (b)).
6.18 Theorem

Let $f$ be a function $E \to \overline{\mathbb{R}}$. The following conditions are equivalent: (a) $f \in \Gamma(E)$. (b) $f = f^{**}$. (c) $f$ is a lower semicontinuous proper convex function or $f$ is one of the constant functions $-\infty$ and $+\infty$.

Proof. (a) $\Rightarrow$ (c): let $f \in \Gamma(E)$. Since $f$ is the pointwise supremum of a family $A$ of continuous affine (hence convex) functions, it is a lower semicontinuous and convex function. If $A = \emptyset$, then $f = -\infty$. If $A \ne \emptyset$, then $f(x) > -\infty$ for each $x \in E$, hence $f = +\infty$ or $f$ is a proper convex function. (c) $\Rightarrow$ (b): by § 6.3, properties (b) and (c) we have $(-\infty)^{**} = -\infty$ and $(+\infty)^{**} = +\infty$. If $f$ is a lower semicontinuous proper convex function, by Theorem 6.15 we have $f^{**} = \operatorname{cl}(f) = f$ (cf. Lemma 6.12, corollary). (b) $\Rightarrow$ (a): Apply Theorem 6.11(a).

6.19 Theorem

The map $f \mapsto f^*$ is a bijection between $\Gamma(E)$ and $\Gamma(E')$.

Proof. Let $f \in \Gamma(E)$. From the definitions of $f^*$ and $\Gamma(E')$ it follows that $f^* \in \Gamma(E')$. Also $f^{**} = f$ (cf. Theorem 6.18), hence the map $f \mapsto f^*$ from $\Gamma(E)$ to $\Gamma(E')$ is injective. Let $g \in \Gamma(E')$, then $g^* \in \Gamma(E)$. Following the proof of Theorem 6.11(a) we can show that $g^{**}$ is the pointwise supremum of the family of all minorants of $g$ of the form $x' \mapsto (x \mid x') + \alpha$, where $x \in E$, $\alpha \in \mathbb{R}$. It follows that $g^{**} \ge g$, hence $g = g^{**} = (g^*)^*$. We conclude that the map $f \mapsto f^*$ from $\Gamma(E)$ to $\Gamma(E')$ is surjective, which proves the stated result.

We denote by $\Gamma_0(E)$ the set of all functions in $\Gamma(E)$ which are not the constant functions $+\infty$ and $-\infty$. Analogously, we define $\Gamma_0(E')$. Note that $\Gamma_0(E)$ is the set of all lower semicontinuous proper convex functions on $E$. By the above theorem, the map $f \mapsto f^*$ is a bijection between $\Gamma_0(E)$ and $\Gamma_0(E')$.
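SUPPORT FUNCTIONS

6.20 Theorem

Let $C \subset E$ be convex. Then:
(a) $C = \emptyset \iff \delta_C = +\infty$.
(b) $C \ne \emptyset \iff \delta_C$ is proper convex.
(c) $C$ is closed and nonempty $\iff \delta_C \in \Gamma_0(E)$.
(d) $\operatorname{cl}(\delta_C) = \delta_{\overline{C}}$.
(e) $\delta_C^{**} = \delta_{\overline{C}}$.

Proof. (a), (b) and (c) are direct consequences of § 5.15. (d) The equality $\operatorname{cl}(\delta_C) = \delta_{\overline{C}}$ is trivially true if $C = \emptyset$. If $C \ne \emptyset$, we have

$$\operatorname{epi}(\operatorname{cl}(\delta_C)) = \overline{\operatorname{epi}(\delta_C)} = \overline{C \times [0, +\infty)} = \overline{C} \times [0, +\infty) = \operatorname{epi}(\delta_{\overline{C}})$$

hence $\operatorname{cl}(\delta_C) = \delta_{\overline{C}}$. (e) The equality is trivially true if $C = \emptyset$. If $C \ne \emptyset$, combining (d) and Theorem 6.15 we have $\delta_C^{**} = \operatorname{cl}(\delta_C) = \delta_{\overline{C}}$.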
6.21 Theorem

Let $C \subset E$ be closed, convex and nonempty. Then $\delta_C^* \in \Gamma_0(E')$.

Proof. Combine Theorem 6.20(c) and § 6.19.

6.22 It follows from Definition 6.5 that a support function is positively homogeneous. Conversely, we have the following theorem.

Theorem. Let $g \in \Gamma_0(E')$ be positively homogeneous. Then there exists a unique closed convex nonempty subset $C$ of $E$ such that $\delta_C^* = g$.

Proof. First we show that $g^*$ is an indicator function. For each $\lambda > 0$ and each $x' \in E'$, we have

$$(\lambda g^*)^*(x') = \lambda g^{**}\Bigl(\frac{x'}{\lambda}\Bigr) = \lambda g\Bigl(\frac{x'}{\lambda}\Bigr) = g(x')$$

hence $(\lambda g^*)^* = g$ and so

$$\lambda g^* = (\lambda g^*)^{**} = g^*.$$

We conclude that the only values of $g^*$ are $0$ and $+\infty$. It follows that $g^* = \delta_C$, where

$$C = \{x \in E \mid g^*(x) = 0\} \tag{48}$$

hence $g = g^{**} = \delta_C^*$. The reader can easily verify that $C$ is closed, convex and nonempty. Assume now that $C_1$ and $C_2$ are closed, convex and nonempty subsets of $E$ such that $\delta_{C_1}^* = \delta_{C_2}^*$. Then

$$\delta_{C_1} = \delta_{C_1}^{**} = \delta_{C_2}^{**} = \delta_{C_2}$$

hence $C_1 = C_2$.
Remarks

(a) The set $C$ defined by (48) can also be written as

$$C = \{x \in E \mid (\forall x' \in E')\ (x \mid x') \le g(x')\}.$$

(b) The above theorem implies that in $\mathbb{R}^n$ there is a one-to-one correspondence between the closed convex nonempty sets and the positively homogeneous lower semicontinuous proper convex functions.

(c) If $C \subset E$, then $C$ and $\overline{C}$ have the same support function, and this function is also the support function of each subset $A$ of $E$ such that $C \subset A \subset \overline{C}$. Hence it is generally impossible to find a set if its support function is known. The above theorem shows that this is possible if we know that the set is convex and closed.
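EXERCISES

$E$ is a normed linear space.

1 Let $\varphi\colon \mathbb{R} \to \overline{\mathbb{R}}$ be an even function. Define $f\colon E \to \overline{\mathbb{R}}$ and $g\colon E' \to \overline{\mathbb{R}}$ by $f(x) = \varphi(\|x\|)$, $g(x') = \varphi^*(\|x'\|)$ $(x \in E,\ x' \in E')$. Show that $f^* = g$.

2 Show that the only function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ satisfying the equality $f^* = f$ is the function $f(x) = \tfrac12\|x\|^2$. (Cf. § 6.4, example (b).)

3 Prove the following statement: if $f\colon E \to \overline{\mathbb{R}}$ has a continuous affine minorant, then $f^{**}$ is the lower semicontinuous hull of $\operatorname{co}(f)$.

4 Let $f$ be a function $E \to \overline{\mathbb{R}}$, and let $x_0$ be a point of $E$ where $f$ is finite. Let $\partial f(x_0) \ne \emptyset$. Show that (a) $f^{**}(x_0) = f(x_0)$; (b) $\partial f^{**}(x_0) = \partial f(x_0)$.

5 Let $f$ be a function $E \to \overline{\mathbb{R}}$. Show that $f^{**}$ is the largest function $g$ in $\Gamma(E)$ such that $g \le f$.

6 Let $f$ be a lower semicontinuous proper convex function on $\mathbb{R}^n$. Show that for all $x, x' \in \mathbb{R}^n$

$$x' \in \partial f(x) \iff x \in \partial f^*(x').$$

7 Let $f$ be a convex function $\mathbb{R}^n \to \overline{\mathbb{R}}$. Show that

$$f \text{ is proper convex} \iff f^* \text{ is proper convex}.$$

8 Let $A \subset E$. Show that $\delta_A^{**} = \delta_B$ where $B = \overline{\operatorname{co}}(A)$.

9 Let $C, D \subset E$ be convex. Prove the following statements: (Cf. Theorem 6.6.) (a) $\delta_{C+D}^* = \delta_C^* + \delta_D^*$ (b)

10 Prove that in $\mathbb{R}^n$ the support functions of the bounded convex nonempty sets are the real positively homogeneous convex functions.

11 Let $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ be positively homogeneous and convex such that $f \ne +\infty$. Show that $\operatorname{cl}(f)$ is a support function.

12 Let $f$ be a convex function $E \to \overline{\mathbb{R}}$, and let $x_0$ be a point of $E$ where $f$ is finite. Let the function $g\colon E \to \overline{\mathbb{R}}$ be defined by $g(x) = f'(x_0; x)$ $(x \in E)$. Show that

$$\partial f(x_0) = \partial g(0).$$

13 Let $K \subset E$ be a nonempty cone. Show that $\delta_K^* = \delta_{K^\circ}$.

14 Let $A \subset \mathbb{R}^n$ be such that $\delta_A^*$ is a linear function. Show that $A$ consists of one point.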
NOTES

1 The first general treatment of the conjugate of a convex function is in W. Fenchel, On conjugate convex functions, Can. J. Math. 1 (1949) 73-77. Cf. Note 3 of Chapter 1.

2 The notion of conjugate of a convex function is connected with the classical Legendre transformation, used in the theory of differential equations and for functions $\mathbb{R} \to \mathbb{R}$ defined by $X = y'$, $Y = xy' - y$.

3 The definition of the conjugate function is based on the use of a collection of continuous affine minorizing functions (cf. § 1.16 and Definition 6.1). Using other classes of minorizing functions, one obtains various generalizations of conjugation. An axiomatic approach of duality which provides a framework for these generalizations is given in J. J. M. Evers and H. van Maaren, 'Duality principles in mathematics and their relations to conjugate functions', Department of Applied Mathematics, Twente University of Technology, 1981.
CHAPTER 7
Optimization
For many years, optimization problems were connected with differentiability, in particular in the classical calculus of variations. Although convex functions have been studied for a long time, the article of Jensen mentioned in Note 1 of Chapter 1 being one of the first treating convex real functions, only rather recently have they found wide applications in optimization. It has turned out to be possible to give new optimality criteria for convex differentiable functions. Moreover, some of these criteria remain valid if we omit differentiability and consider convex, not necessarily differentiable, functions. In this chapter we give an impression of the meaning of convexity in optimization. We designate by $E$ a normed linear space (containing more than one point) over $\mathbb{R}$, by $x \mapsto \|x\|$ its norm, by $E'$ its dual.
7.1 Let $f$ be a proper convex function on $E$. $f$ has a (global) minimum at $x_0 \in E$ if and only if

$$f(x) \ge f(x_0) = f(x_0) + (x - x_0 \mid 0)$$

for each $x \in E$. It follows that

$$f \text{ has a (global) minimum at } x_0 \iff 0 \in \partial f(x_0). \tag{49}$$
The condition 0 c af(x 0) has the following geometrical interpretation: there exists a horizontal (closed) supporting hyperplane for epi(f) at (x0 , f(x0)) (cf. Theorem 5.34). This condition may be regarded as an analogue for convex functions of a familiar condition for a minimum of a differentiable function f, viz, the existence of a horizontal tangent plane to the graph of f, which in the case of a global minimum (cf. Exercise 8 of Chapter 5) is a horizontal supporting hyperplane for epi(f).
7.2 Let $C \subset E$ be convex and nonempty. Let $f$ be a proper convex function on $E$ such that $\operatorname{dom}(f) \cap C \ne \emptyset$. We denote by $f_C$ the restriction to $C$ of $f$. Set $g := f + \delta_C$; $g$ is a proper convex function. Minimizing $f$ over $C$ (i.e. minimizing $f_C$) is equivalent to minimizing $g$ over $E$, hence (49) implies that $f_C$ has a minimum at $x_0 \in C$ if and only if $0 \in \partial g(x_0)$. By Theorem 5.38, the last condition can be written as

$$0 \in \partial f(x_0) + \partial\delta_C(x_0) \tag{50}$$

in each of the following cases: (a) there is a point in $C \cap \operatorname{dom}(f)$ where $f$ is continuous; (b) $\operatorname{int}(C) \cap \operatorname{dom}(f) \ne \emptyset$.
7.3 In the following we will study the set $\partial\delta_C(x_0)$ that occurs in (50). We have

$$x' \in \partial\delta_C(x_0) \iff (\forall x \in E)\ \delta_C(x) \ge \delta_C(x_0) + (x - x_0 \mid x') \iff (\forall x \in C)\ (x - x_0 \mid x') \le 0. \tag{51}$$

It follows that $\partial\delta_C(x_0)$ is a convex cone containing $0$. The simple proof of the following theorem is left to the reader.

Theorem. If $x_0 \in \operatorname{int}(C)$, then $\partial\delta_C(x_0) = \{0\}$.

The reader is urged to give a geometrical interpretation of the last inequality in (51) in the case where $E = \mathbb{R}^n$ (cf. Figure 13). A vector $x' \in \partial\delta_C(x_0)$ is said to be normal to $C$ at $x_0$, and $\partial\delta_C(x_0)$ is called the normal cone (or cone of supporting functionals) to $C$ at $x_0$.

Figure 13
CONVEX PROGRAMMING IN $\mathbb{R}^n$

7.4 In the applications of convex analysis we meet the case where

$$C = \{x \in \mathbb{R}^n \mid g(x) \le 0\} \tag{52}$$

$g$ being a real convex function on $\mathbb{R}^n$. We recall that $g$ is continuous and subdifferentiable, and that (cf. Exercise 7 of Chapter 5)
int(C) = fx E [Fr I g(x)0 g'(x0 ; (x —x0)) =
lim
= À liM
g (_xo v e (x — x 0)) — g(x0)
—1 g((1— e)x 0 + F,
99 Conversely, if x ER" satisfies g'(x 0 ; X) 0 such that g(x0 + Ex) < 0, hence x0 + ex E int(C) and so x efl(int(C) — x 0) where II= (0, +00). It follows that
K := {x E R" g'(x0 ; x) x (n > 00). Since A is compact,
the sequence $(a_n)$ contains a subsequence $(b_n)$ that converges to a certain $b \in A$. Let $(\mu_n)$ be the corresponding subsequence of $(\lambda_n)$. Since there is $\varepsilon > 0$ such that $\|x\| \ge \varepsilon$ for all $x \in A$ (where $\|\cdot\|$ is, for instance, the Euclidean norm on $\mathbb{R}^n$), we have for all $n$

$$|\mu_n| = \frac{\|\mu_n b_n\|}{\|b_n\|} \le \frac{1}{\varepsilon}\,\|\mu_n b_n\|.$$

It follows that the boundedness of the sequence $(\mu_n b_n)$ implies the boundedness of $(\mu_n)$, hence $(\mu_n)$ contains a subsequence $(\nu_n)$ that converges to a certain $\nu \ge 0$. Let $(c_n)$ be the corresponding subsequence of $(b_n)$, then

$$\nu_n c_n \to x, \quad \nu_n \to \nu, \quad c_n \to b \quad (n \to \infty)$$

hence $x = \nu b$, which proves the stated result.
7.8 Theorem

Let $g$ be a real convex function on $\mathbb{R}^n$ satisfying Slater's condition. Let $C = \{x \in \mathbb{R}^n \mid g(x) \le 0\}$ and let $x_0 \in \mathbb{R}^n$ such that $g(x_0) = 0$. Then $\partial\delta_C(x_0)$ is the cone generated by $\partial g(x_0)$.

Proof. Let $h$ be the function defined in Theorem 7.6 and let $K$ be the cone defined in Theorem 7.5. By Theorem 7.5, we have

$$\partial\delta_C(x_0) = K^\circ = \overline{K}^\circ = \{x \in \mathbb{R}^n \mid g'(x_0; x) \le 0\}^\circ \tag{55}$$

(in virtue of the continuity of $h$). By Theorem 7.6, $h = \delta_D^*$ where $D = \partial g(x_0)$, hence

$$\{x \in \mathbb{R}^n \mid g'(x_0; x) \le 0\} = \{x \in \mathbb{R}^n \mid (\forall y \in D)\ (x \mid y) \le 0\} = \{x \in \mathbb{R}^n \mid (\forall y \in K_D)\ (x \mid y) \le 0\} = K_D^\circ$$

where $K_D$ is the cone generated by $D$:

$$K_D := \{\lambda x \mid \lambda \ge 0,\ x \in D\}.$$

Since $D$ is convex, $K_D$ is a convex cone. By § 3.13, $K_D^{\circ\circ} = \overline{K_D}$, hence (55) implies

$$\partial\delta_C(x_0) = K_D^{\circ\circ} = \overline{K_D}. \tag{56}$$

The reader can easily verify that $\partial g(x_0)$ is closed. Since $h$ has only finite values, $\partial g(x_0)$ is bounded (cf. Exercise 10 of Chapter 6). It follows that $\partial g(x_0)$ is compact. In virtue of Slater's condition, $0 \notin \partial g(x_0)$. Applying Lemma 7.7 yields the closedness of $K_D$, hence by (56), $\partial\delta_C(x_0) = K_D$.

7.9 A convex programming problem in $\mathbb{R}^n$ is a problem in which we seek to minimize a real convex function $f$ subject to the $p$ constraints

$$g_i(x) \le 0 \quad (1 \le i \le p) \tag{57}$$

where $g_1, g_2, \ldots, g_p$ are real convex functions on $\mathbb{R}^n$. Shortly:

$$\min f(x) \tag{CP}$$
A point X0 E R satisfying gi (x0) 0 (1 i p) is called a feasible solution to (CP). A point xo c R" satisfying gi (x0) sup Ig*(x')f*(x')}. xeE,
(67)
x'EE'
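If $m = -\infty$, the stated result is trivially true. Assume now that $m > -\infty$. The hypotheses imply that $m \in \mathbb{R}$. Define

$$C_1 := \{(x, \lambda) \in E \oplus \mathbb{R} \mid f(x) \le \lambda\}, \qquad C_2 := \{(x, \lambda) \in E \oplus \mathbb{R} \mid \lambda \le g(x) + m\}.$$

$C_1$ and $C_2$ are convex subsets of $E \oplus \mathbb{R}$. The reader can easily verify (cf. § 5.38 and § 5.39) that there exists a nonvertical closed hyperplane $H = F^{-1}(\beta)$ in $E \oplus \mathbb{R}$ separating $C_1$ and $C_2$ properly. Assume that $F(C_1) \ge \beta \ge F(C_2)$ and $F = (x', \alpha)$ where $x' \in E'$, $\alpha \in \mathbb{R}$. We have $\alpha > 0$, and for each $x \in \operatorname{dom}(f) \cap \operatorname{dom}(g)$ we have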
(x Ix') + alg(x)+ ml
(x x')+ af(x) hence
(xly)y g(x)+ m where y' = —x7a e E' and 'y = —(3/a. Moreover, it is easy to see that these inequalities hold for all x E E. It follows that
y
and
g * (y r) y + m 
hence, in virtue of (67): in
g * (0f * (y'). sup Ig * (x 1)f * (x')}
in
x'EE'
which proves the stated result. 7.16
Example
Let $C \subset E$ be convex and nonempty and let $x_0 \in E$. Define

$$f(x) := \|x - x_0\|, \qquad g(x) := -\delta_C(x) \quad (x \in E).$$

Applying Fenchel's duality theorem yields

$$\inf_{x \in C} \|x - x_0\| = \inf_{x \in E} \{f(x) - g(x)\} = \max_{x' \in E'} \{g^*(x') - f^*(x')\}.$$

From Example 6.7 and § 6.3, property (g) it follows that

$$f^*(x') = \delta_S(x') + (x_0 \mid x') \quad (x' \in E')$$

where $S = \{x' \in E' \mid \|x'\| \le 1\}$. We have $g^*(x') = -\delta_C^*(-x')$, hence

$$\begin{aligned}
\inf_{x \in C} \|x - x_0\| &= \max_{x' \in E'} \{-\delta_C^*(-x') - \delta_S(x') - (x_0 \mid x')\} \\
&= \max_{x' \in S} \{(x_0 \mid -x') - \delta_C^*(-x')\} \\
&= \max_{\|x'\| \le 1} \{(x_0 \mid x') - \delta_C^*(x')\}.
\end{aligned}$$

It is left to the reader to give a geometrical interpretation of this formula in the case where $E = \mathbb{R}^n$.

PROXIMITY MAPPINGS
7.17 Let $f$ be a lower semicontinuous proper convex function on $\mathbb{R}^n$. Let $x_0 \in \mathbb{R}^n$. Define the function $F\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ by

$$F(x) = f(x) + \tfrac12\|x - x_0\|^2 \quad (x \in \mathbb{R}^n)$$

where $x \mapsto \|x\|$ is the Euclidean norm on $\mathbb{R}^n$. It is left to the reader to verify that $F$ is lower semicontinuous and convex. We will show that $F$ has a global minimum. By Lemma 6.12, $f$ has a continuous affine minorant, hence there exist $a \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ satisfying

$$f(x) \ge (x \mid a) - \alpha$$

whenever $x \in \mathbb{R}^n$. It follows that for each $x \in \mathbb{R}^n$

$$F(x) \ge (x \mid a) - \alpha + \tfrac12\|x - x_0\|^2 = \tfrac12\|x - (x_0 - a)\|^2 + (x_0 \mid a) - \tfrac12\|a\|^2 - \alpha.$$

Let $b \in \operatorname{dom}(f)$. The last inequality implies the existence of a number $R > 0$ such that $F(x) > F(b)$ whenever $\|x - (x_0 - a)\| > R$. The restriction of $F$ to the compact ball $\{x \mid \|x - (x_0 - a)\| \le R\}$ assumes a minimum value $m$ (cf. § 5.4, property (b)). Let this minimum value be attained at $c$. Since $m \le F(b)$, it follows that $F$ has a global minimum at $c$. Suppose that $d \in \mathbb{R}^n$ satisfies $F(d) = m$. Since

$$\|\tfrac12(c + d) - x_0\|^2 = \tfrac12\|c - x_0\|^2 + \tfrac12\|d - x_0\|^2 - \tfrac14\|c - d\|^2$$

we have

$$m \le F(\tfrac12(c + d)) = f(\tfrac12(c + d)) + \tfrac12\|\tfrac12(c + d) - x_0\|^2 \le \tfrac12 f(c) + \tfrac12 f(d) + \tfrac14\|c - x_0\|^2 + \tfrac14\|d - x_0\|^2 - \tfrac18\|c - d\|^2 = m - \tfrac18\|c - d\|^2$$

hence $c = d$. It follows that the minimum of $F$ is uniquely attained. The point $c$ is denoted by

$$\operatorname{prox}_f(x_0)$$

and the mapping $\operatorname{prox}_f$ of $\mathbb{R}^n$ into itself is called a proximity mapping (with respect to $f$).
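For a concrete feel of the definition, take $n = 1$ and $f(x) = |x|$. The minimizer of $F(x) = |x| + \tfrac12 (x - x_0)^2$ then has a closed form (often called soft thresholding), which the sketch below compares with a brute-force minimization of $F$ over a fine grid. The function names and the grid are assumptions made for this illustration; the grid search only approximates the true minimizer.

```python
def prox_abs(x0):
    """Closed-form minimizer of |x| + 0.5*(x - x0)**2 on R (soft thresholding)."""
    if x0 > 1.0:
        return x0 - 1.0
    if x0 < -1.0:
        return x0 + 1.0
    return 0.0


def prox_grid(f, x0, xs):
    """Brute-force stand-in: the sample point of xs minimizing f(x) + 0.5*(x - x0)**2."""
    return min(xs, key=lambda x: f(x) + 0.5 * (x - x0) ** 2)


xs = [i / 1000.0 for i in range(-5000, 5001)]   # grid on [-5, 5] with step 0.001

# Pairs (closed form, grid search) for a few sample points x0.
results = {x0: (prox_abs(x0), prox_grid(abs, x0, xs)) for x0 in (2.5, 0.3, -1.7)}
```

Points with $|x_0| \le 1$ are sent to $0$, and all other points are moved one unit towards the origin.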
Remarks

(a) The uniqueness of the minimum point can also be proved using the strict convexity of $F$. (Cf. § 5.18, example (a) and Exercise 8 of Chapter 5.)

(b) Let $C \subset \mathbb{R}^n$ be closed, convex and nonempty, and let $f = \delta_C$ (cf. Theorem 6.20). Then $\operatorname{prox}_f(x_0)$ is the best approximation to $x_0$ in $C$, i.e. the point of $C$ nearest to $x_0$ (also called the projection of $x_0$ on $C$).

7.18 Theorem

Let $f$ be a lower semicontinuous proper convex function on $\mathbb{R}^n$, and let $x, y, z \in \mathbb{R}^n$. The following conditions are equivalent: (a) $z = x + y$ and $f(x) + f^*(y) = (x \mid y)$. (b) $x = \operatorname{prox}_f(z)$ and $y = \operatorname{prox}_{f^*}(z)$.

Proof. Define the functions $g$ and $F$ from $\mathbb{R}^n$ to $\overline{\mathbb{R}}$ by $g(u) = \tfrac12\|u - z\|^2$ and $F = f + g$. By Theorem 5.38, we have $\partial F = \partial f + \partial g$. From § 7.1 it follows that

$$x = \operatorname{prox}_f(z) \iff 0 \in \partial F(x) \iff 0 \in \partial f(x) + \partial g(x).$$

The function $g$ is Fréchet-differentiable, and $\nabla g(u) = u - z$ for each $u \in \mathbb{R}^n$. Theorem 5.37 implies

$$x = \operatorname{prox}_f(z) \iff 0 \in \partial f(x) + x - z \iff (\exists y \in \partial f(x))\ z = x + y \tag{68}$$

and

$$y = \operatorname{prox}_{f^*}(z) \iff (\exists x \in \partial f^*(y))\ z = x + y. \tag{69}$$

Theorems 6.10 and 6.18 imply

$$y \in \partial f(x) \iff f(x) + f^*(y) = (x \mid y) \iff x \in \partial f^*(y). \tag{70}$$

(a) $\Rightarrow$ (b): Combining (a) and (70) yields $y \in \partial f(x)$ and $x \in \partial f^*(y)$. Combining these results with (a), (68) and (69) yields $x = \operatorname{prox}_f(z)$ and $y = \operatorname{prox}_{f^*}(z)$. (b) $\Rightarrow$ (a): Combining (b) and (68) yields the existence of a $y_0 \in \partial f(x)$ such that $z = x + y_0$. (70) implies $x \in \partial f^*(y_0)$ and $f(x) + f^*(y_0) = (x \mid y_0)$. Combining these results with (69) yields $y_0 = \operatorname{prox}_{f^*}(z) = y$. It follows that $z = x + y$ and $f(x) + f^*(y) = (x \mid y)$.
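The equivalence just proved can be tested on $\mathbb{R}$ with $f(x) = |x|$, whose conjugate is the indicator function of $[-1, 1]$ (cf. Example 6.7). Under that assumption $\operatorname{prox}_f$ is soft thresholding and $\operatorname{prox}_{f^*}$ is the projection onto $[-1, 1]$; both closed forms, and the function names below, are supplied for this sketch rather than taken from the text.

```python
def prox_abs(z):
    """prox of f = |.| on R: minimizer of |x| + 0.5*(x - z)**2."""
    if z > 1.0:
        return z - 1.0
    if z < -1.0:
        return z + 1.0
    return 0.0


def prox_abs_conj(z):
    """prox of f*, the indicator of [-1, 1]: the projection of z onto [-1, 1]."""
    return max(-1.0, min(1.0, z))


rows = []
for z in (-3.5, -0.4, 0.0, 0.9, 2.25):
    x, y = prox_abs(z), prox_abs_conj(z)
    decomposition_error = abs((x + y) - z)   # condition z = x + y
    fenchel_gap = abs(x) + 0.0 - x * y       # f(x) + f*(y) - (x | y), with f*(y) = 0 on [-1, 1]
    rows.append((z, x, y, decomposition_error, fenchel_gap))
```

Every sample point splits as $z = x + y$ with equality in Fenchel's inequality, as conditions (a) and (b) of the theorem require.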
7.19 Example

Let $K \subset \mathbb{R}^n$ be a closed convex nonempty cone. Set $f = \delta_K$. The reader can easily verify that $f^* = \delta_{K^\circ}$. The condition

$$\delta_K(x) + \delta_{K^\circ}(y) = (x \mid y)$$

is equivalent to

$$x \in K, \quad y \in K^\circ, \quad (x \mid y) = 0.$$

In this case the formula

$$z = \operatorname{prox}_f(z) + \operatorname{prox}_{f^*}(z)$$

yields the unique orthogonal decomposition of $z$ as a sum of elements of $K$ and $K^\circ$, respectively (viz. the projections of $z$ on $K$ and $K^\circ$).
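A simple instance, under the assumption that $K^\circ$ denotes the polar cone $\{y \mid (x \mid y) \le 0 \text{ for all } x \in K\}$: for $K$ the nonnegative orthant of $\mathbb{R}^n$, the polar is the nonpositive orthant, and both projections act componentwise. The function names and the sample vector are hypothetical.

```python
def project_nonneg(z):
    """Projection onto K, the nonnegative orthant: clip each component at 0 from below."""
    return [max(t, 0.0) for t in z]


def project_nonpos(z):
    """Projection onto the polar cone of K, the nonpositive orthant."""
    return [min(t, 0.0) for t in z]


z = [3.0, -1.5, 0.0, 2.0, -4.0]
p = project_nonneg(z)                        # component of z in K
q = project_nonpos(z)                        # component of z in the polar cone
recombined = [a + b for a, b in zip(p, q)]   # should reproduce z
inner = sum(a * b for a, b in zip(p, q))     # should vanish: the two parts are orthogonal
```

Each component of `z` is carried by exactly one of the two parts, which is why the inner product is zero.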
MONOTONE OPERATORS

7.20 Let $E$ be a normed linear space over $\mathbb{R}$. We define a multifunction from $E$ to $E'$ as a function from $E$ into the power set $\mathcal{P}(E')$ of $E'$. A multifunction $T$ from $E$ to $E'$ is called a monotone operator if

$$(x - y \mid x' - y') \ge 0$$

whenever $x, y \in E$, $x', y' \in E'$, $x' \in Tx$, $y' \in Ty$. Note that a nondecreasing function from $\mathbb{R}$ to $\mathbb{R}$ is a monotone operator. A multifunction $T$ from $E$ to $E'$ is called a cyclically monotone operator if

$$(x_1 - x_2 \mid x_1') + (x_2 - x_3 \mid x_2') + \cdots + (x_p - x_1 \mid x_p') \ge 0 \tag{71}$$

for any finite set of pairs $(x_1, x_1'), (x_2, x_2'), \ldots, (x_p, x_p')$ such that $x_i \in E$, $x_i' \in Tx_i$ $(1 \le i \le p)$. Putting $p = 2$ in (71), we see that a cyclically monotone operator is in particular a monotone operator, but the converse is not true. Monotone operators play an important role in optimization theory. The following theorems show some relations between monotone operators and convex functions.
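One way to see that a monotone operator need not be cyclically monotone is the rotation by 90 degrees in $\mathbb{R}^2$, identified with its dual through the Euclidean inner product. The sketch writes the cyclic condition as $\sum_i (x_i - x_{i+1} \mid x_i') \ge 0$ with $x_{p+1} = x_1$, which for $p = 2$ is the monotonicity inequality; this way of writing the sum, the operator `rotate` and the sample points are assumptions chosen for the illustration.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def rotate(x):
    """Rotation by 90 degrees in R^2: (x1, x2) -> (-x2, x1)."""
    return (-x[1], x[0])


def cyclic_sum(points, T):
    """Sum of (x_i - x_{i+1} | T x_i) around the cycle, with x_{p+1} = x_1."""
    total = 0.0
    p = len(points)
    for i in range(p):
        xi, xnext = points[i], points[(i + 1) % p]
        diff = (xi[0] - xnext[0], xi[1] - xnext[1])
        total += dot(diff, T(xi))
    return total


two_points = cyclic_sum([(1.0, 0.0), (0.0, 1.0)], rotate)
four_points = cyclic_sum([(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)], rotate)
```

For any pair of points the sum is $(x - y \mid T(x - y)) = 0$, so the rotation is monotone, while the four points on the unit circle give a strictly negative sum.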
7.21 Theorem Let f be a proper convex function on E. Then if is a cyclically monotone operator.
110 Proof. Let xi E E, x
p). Then
i
E af(X.i)
f(x2)=f(x1) +(X2 — xi I xi) f (x3)
f (x2) + (x3— x21 x
)
f(xp_i) + (xp — xp_i f(x 1 )
f(x)+(x 1 —x 2 x p').
Adding these inequalities yields
which proves the stated result. 7.22 A (cyclically) monotone operator T from E to E' is said to be maximal (cyclically) monotone if its graph
{(x, x') E E x E' x'
E
Tx}
is not properly contained in the graph of any other (cyclically) monotone operator from $E$ to $E'$.
Theorem. Let $f$ be a lower semicontinuous proper convex function on $\mathbb{R}^n$. Then $\partial f$ is maximal monotone and maximal cyclically monotone.
Proof. Assume that $x_0, y_0 \in \mathbb{R}^n$ satisfy
$$(x - x_0 \mid y - y_0) \ge 0 \tag{72}$$
for all $x, y \in \mathbb{R}^n$ satisfying $y \in \partial f(x)$. Set $x_1 = \mathrm{prox}_f(x_0 + y_0)$, $y_1 = \mathrm{prox}_{f^*}(x_0 + y_0)$ (cf. § 7.17). Theorem 7.18 implies $x_0 + y_0 = x_1 + y_1$ and $y_1 \in \partial f(x_1)$. Putting $x = x_1$, $y = y_1$ in (72) yields
$$-\|x_1 - x_0\|^2 = (x_1 - x_0 \mid x_0 - x_1) \ge 0$$
(where $x \mapsto \|x\|$ is the Euclidean norm on $\mathbb{R}^n$), hence $x_0 = x_1$ and so $y_0 = y_1 \in \partial f(x_1) = \partial f(x_0)$. We conclude that $\partial f$ is maximal monotone. It follows now from Theorem 7.21 that $\partial f$ is also maximal cyclically monotone.
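The two facts borrowed from Theorem 7.18 in this proof, namely that the two proximity mappings add up to the identity and that the second one lands in the subdifferential at the first, can be checked on a one-dimensional example. This is a sketch under assumptions: prox_f(z) is taken to be the minimizer of f(x) + (1/2)(x - z)^2, and f(x) = |x| on R is an illustrative choice whose conjugate is the indicator function of [-1, 1]. The function names are hypothetical.

```python
# Sketch: x1 = prox_f(z), y1 = prox_{f*}(z) for f(x) = |x| on R.
# Assumption: prox_f(z) = argmin_x { f(x) + (1/2)(x - z)^2 }.

def prox_abs(z):
    """prox of f(x) = |x|: shrink z toward 0 by 1 (soft-thresholding)."""
    if z > 1.0:
        return z - 1.0
    if z < -1.0:
        return z + 1.0
    return 0.0

def prox_conj_abs(z):
    """prox of f*, the indicator of [-1, 1]: projection of z on [-1, 1]."""
    return max(-1.0, min(1.0, z))

def in_subdifferential_abs(y, x, tol=1e-12):
    """Test y in the subdifferential of |.| at x: {1}, {-1}, or [-1, 1] at x = 0."""
    if x > 0:
        return abs(y - 1.0) <= tol
    if x < 0:
        return abs(y + 1.0) <= tol
    return -1.0 - tol <= y <= 1.0 + tol

samples = [-3.0, -0.4, 0.0, 0.7, 2.5]
results = []
for z in samples:
    x1, y1 = prox_abs(z), prox_conj_abs(z)
    results.append((z, x1, y1, abs(x1 + y1 - z) <= 1e-12, in_subdifferential_abs(y1, x1)))

for row in results:
    print(row)
```

In the proof, z plays the role of x_0 + y_0, and the pair (x_1, y_1) is then a point of the graph of the subdifferential to which (72) can be applied.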
NOTES
1 Many convex programming problems originate from mathematical economics. See, for instance, H. Nikaido, Convex Structures and Economic Theory, New York, Academic Press, 1968.
2 One of the first papers where the role of convexity in optimization problems involving inequality constraints is emphasized is H. W. Kuhn and A. W. Tucker, 'Nonlinear programming', in Proc. 2nd Berkeley Symp. on Mathematical Statistics and Probability, Berkeley (1951) 481-92. In § 7.13 we used the existence of a strictly feasible solution to prove that in (65) the coefficient $\beta$ of $f(x) - f(x_0)$ is not zero. Conditions which guarantee that $\beta > 0$ are called constraint qualifications. Other constraint qualifications can be found in O. L. Mangasarian, Nonlinear Programming, New York, McGraw-Hill, 1969, and M. S. Bazaraa and C. M. Shetty, Foundations of Optimization, Berlin, Springer (Lecture Notes in Economics and Mathematical Systems no. 122, 1976).
3 The original (finite-dimensional) version of Fenchel's duality theorem (cf. § 7.15) is in W. Fenchel, Convex Cones, Sets and Functions, Lecture notes, Princeton, 1953. The extension to the infinite-dimensional case is in R. T. Rockafellar, Extension of Fenchel's duality theorem for convex functions, Duke Math. J. 33 (1966) 81-9.
4 The theory of proximity mappings has been developed in J.-J. Moreau, Proximité et dualité dans un espace hilbertien, Bull. Soc. Math. France 93 (1965) 273-99.
5 One of the first papers where monotone operators are studied in connection with convex analysis is G. J. Minty, Monotone (nonlinear) operators in Hilbert space, Duke Math. J. 29 (1962) 341-6.
6 An exposition of the theory of maximal monotone operators can be found in H. Brézis, Opérateurs Maximaux Monotones et Semigroupes de Contractions dans les Espaces de Hilbert, Amsterdam, North-Holland, 1973. The theory of monotone operators has proved to be a powerful tool for studying nonlinear partial differential equations of elliptic type. Also, many physical systems are described by monotone operators. See, for instance, V. Dolezal, Monotone Operators and Applications in Control and Network Theory, Amsterdam, Elsevier, 1979.
We mention some books in which a more extensive treatment of optimization theory can be found:
V. Barbu and Th. Precupanu, Convexity and Optimization in Banach Spaces, Alphen aan den Rijn, The Netherlands, Sijthoff & Noordhoff, 1978.
I. Ekeland and R. Temam, Convex Analysis and Variational Problems, Amsterdam, North-Holland, 1976.
I. V. Girsanov, Lectures on Mathematical Theory of Extremum Problems, Berlin, Springer (Lecture Notes in Economics and Mathematical Systems no. 67, 1972).
A. D. Ioffe and V. M. Tihomirov, Theory of Extremal Problems, Amsterdam, North-Holland, 1979.
P.-J. Laurent, Approximation et Optimisation, Paris, Hermann, 1972.
R. Wets, Grundlagen konvexer Optimierung, Berlin, Springer (Lecture Notes in Economics and Mathematical Systems no. 137, 1976).
Answers and Hints
CHAPTER 1
1 (a) By Theorem 1.6, $f'_+$ exists and is a nondecreasing function. Use the inequality
$$f(x) \ge f(y) + f'_+(y)(x - y) \quad (x, y \in (a, b)).$$
(b) Let $x$ be a local minimum point of $f$ and $y \in (a, b)$. Consider the point $z = \lambda x + (1 - \lambda)y$ where $\lambda \in (0, 1)$. If $\lambda$ is sufficiently close to 1, $f(z) \ge f(x)$. It follows that $f(y) \ge f(x)$.
(c) Suppose that $x$ and $y$ were global minimum points of $f$. Consider the point $\tfrac12(x + y)$.
2 $f$ is differentiable at $c$ if and only if $f'_-(c) = f'_+(c)$.
3 There exists at least one line through $(x, f(x))$ that lies nowhere above the graph of $f$ if and only if there exists $m \in \mathbb{R}$ such that
$$f(y) \ge f(x) + m(y - x)$$
whenever $y \in (a, b)$.
4 'If': let $a < x$ [...] then $H'(x) = y_0 - f(x)$. It follows that $H'(x_0) = 0$ if $x_0$ is such that $f(x_0) = y_0$ (hence $x_0 = g(y_0)$). 'Only if': if $f(a) = +\infty$ or $f(b) = +\infty$, then $\lambda f(a) + (1 - \lambda)f(b) = +\infty$ for each $\lambda \in (0, 1)$. 'If': use Definition 1.19. (a) Follows directly from Definition (a) in § 1.25. (b) Consider $f(x) = 1$ if $x = 0$, [...] (c)
Suppose that $a < x$ [...] there exist $c, d \in C$ satisfying $d(a, c) < \varepsilon + \delta$, $d(b, d) < \varepsilon + \delta$. It follows that $d(\lambda a + (1 - \lambda)b, \lambda c + (1 - \lambda)d) < \varepsilon + \delta$, hence $d(\lambda a + (1 - \lambda)b, C) \le \varepsilon$.
4 Suppose that $C = \mathrm{co}(a_1, a_2, \dots, a_n) = \mathrm{co}(b_1, b_2, \dots, b_m)$ and no $a_i$ ($b_i$) is a convex combination of the remaining $a_k$ ($b_k$). Let $1 \le i \le m$. Express $b_i$ as a convex combination
$$b_i = \sum_{k=1}^{n} \lambda_k a_k$$
and express each $a_k$ as a convex combination of $b_1, b_2, \dots, b_m$. Deduce that $b_i = a_k$ for some $k$.
5 Note that $C = \mathrm{co}(e_1, e_2, \dots, e_n)$ where $e_1, e_2, \dots, e_n$ are the unit vectors in $\mathbb{R}^n$ ($e_1 = (1, 0, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, etc.). These vectors are the vertices of $C$.
6 Let $C$ be a convex polytope. Every vertex $a$ of $C$ is an extreme point of $C$: let the vertices of $C$ be $a, x_1, x_2, \dots, x_k$. Suppose there exist $x, y \in C$ such that $a \in (x, y)$. Express $x$ and $y$ as convex combinations of $a, x_1, x_2, \dots, x_k$ and deduce that $a = x = y$. Every extreme point $b$ of $C$ is a vertex of $C$: let the vertices of $C$ be $x_1, x_2, \dots, x_k$. Express $b$ as a convex combination
$$b = \sum_{i=1}^{k} \lambda_i x_i$$
and show that $\lambda_i = 1$ for some $i$.
7 $D$ is the set of all points of $V$ that can be represented in the form $\sum_{i=1}^{k}$ [...] where $k \in \mathbb{N}$ [...] Then [...] $\subset D$.
8 (a) The convexity of $A$ and $B$ follows directly from Definition 2.1(b). Since $p(0) = p(2 \cdot 0) = 2p(0)$ we have $p(0) = 0$, hence $0 \in A \cap B$. If $x \in V$, then [...] (b) (c) [...] then $p(x) = 0$. If $p$ is continuous, $\{x \mid p(x) < 1\}$ is an open subset of $C$, hence $C$ is a convex body. Conversely, suppose that $C$ is a convex body. Prove that $0 \in \mathrm{int}(C)$ (cf. Theorem 2.23). It follows that there exists a neighborhood $U$ of 0 such that $U \subset C$, hence $p(U) \le 1$. Deduce that $p$ is continuous at 0. Now show that $p$ is continuous at each point of $E$. To prove that for a convex body $C$ we have $\mathrm{int}(C) = C^i$ and $\overline{C} = C^a$, use the openness of $\{x \mid p(x) < 1\}$ ($= C^i$) and the closedness of $\{x \mid p(x) \le 1\}$ ($= C^a$).
11 Follows directly from Theorem 2.27.
12 Let $B$ be the complement of $A$ with respect to $E$. There exist $x, y \in A$, $z \in B$ such that $z \in (x, y)$. Define $\alpha = \sup\{\lambda \in \mathbb{R} \mid [z, z + \lambda(y - x)] \subset B\}$ and $\beta = \sup\{\lambda \in \mathbb{R} \mid [z, z + \lambda(x - y)] \subset B\}$. Set $u = z + \alpha(y - x)$, $v = z + \beta(x - y)$ and show that $u \in A$, $v \in A$, $(u, v) \subset B$.
13 First show that [...] $\bigcap \mathrm{int}(C_i)$ [...] $\bigcap \mathrm{int}(C_i)$. Let $x_0$ be a point in $\bigcap \mathrm{int}(C_i)$. If $x \in$ [...], show that $(x,$ [...] $\bigcap \mathrm{int}(C_i)$. Complete the proof.
CHAPTER 3
1 By Theorem 3.8, there exists a closed hyperplane $H = f^{-1}(\alpha)$ in $E$ separating $A$ and $B$ properly. Assume that $f(A) \le \alpha$, $f(B) \ge \alpha$. Utilizing the openness of $A$ and $B$, show that $f(A) < \alpha$, $f(B) > \alpha$.
2 Show that there exist open convex subsets $C$, $D$ of $E$ satisfying $A \subset C$, $B \subset D$, $C \cap D = \emptyset$. Use the result of Exercise 1.
3 'Only if': assume $H = f^{-1}(\alpha)$ is a hyperplane such that $f(x) = \alpha$. There exists $y \in E$ such that $f(y) > 0$. Consider the points $x + \delta y$ and $x - \delta y$ where $\delta > 0$. 'If': suppose that $x \notin \mathrm{int}(A)$. By § 3.9, there exists a hyperplane $H$ in $E$, containing $x$, and separating $A$ and $\{x\}$ properly. There do not exist two points of $A$ which are separated strictly by $H$.
4 Let $E$ be a normed linear space and $f$ a linear function from $E$ to $\mathbb{R}$ such that $f$ is not continuous (cf. remark (c) following Theorem 3.7). Consider the set $\{x \in E \mid |f(x)|$ [...]
5 Assume that $K \ne \emptyset$. First show that $0 \in \overline{K}$. It follows that $\overline{K} + \overline{K} \supset \overline{K}$. To show that $\overline{K} + \overline{K} \subset \overline{K}$, assume that $x \in \overline{K}$, $y \in \overline{K}$ and prove that $\tfrac12(x + y) \in \overline{K}$.
6 The statements follow directly from the definition of 'polar'. We give one example. If $u \in (K_1 + K_2)^\circ$, then
$$(x \mid u) + \lambda (y \mid u) = (x + \lambda y \mid u) \le 0$$
whenever $x \in K_1$, $y \in K_2$, $\lambda > 0$. Letting $\lambda \to +\infty$ and $\lambda \downarrow 0$, respectively, we conclude that $(y \mid u) \le 0$ and $(x \mid u) \le 0$. It follows that
$$(K_1 + K_2)^\circ \subset K_1^\circ \cap K_2^\circ.$$
CHAPTER 4
1 By Theorem 4.2, each $x \in \mathrm{co}(A)$ can be written as a convex combination of $k + 1$ points $a_1, a_2, \dots, a_{k+1}$ of $A$. Write
$$x = \sum_{i=1}^{k+1} \lambda_i a_i + 0 \cdot a.$$
Following the proof of Theorem 4.2, try to write $x$ as a convex combination of $a$ and $k$ other points of $A$.
2 Apply Theorem 4.2.
3 Use property (d) of § 2.4 and Theorem 4.2.
4 Suppose that $x \notin A$. Since $d(x, A) > 0$ and $A_i \to A$ $(i \to \infty)$, there exists $N \in \mathbb{N}$ such that $h(A_i, A) < \tfrac12 d(x, A)$ whenever $i > N$. This violates the fact that $x_i \to x$ $(i \to \infty)$.
5 If $A$ is not convex, then there exist $x, y, z \in \mathbb{R}^n$ such that $x, y \in A$, $z \notin A$, $z \in (x, y)$. Define $\rho = d(z, A)$ (where $d$ is the Euclidean metric on $\mathbb{R}^n$). We have $\rho > 0$. If $A_i \to A$ $(i \to \infty)$, then there exists $N \in \mathbb{N}$ such that $h(A, A_N)$ [...] $0\}$ and $A_2 = A \setminus A_1$. Cf. the proof of Helly's theorem.
7 By § 4.11, there exist $y \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$ such that $(K \mid y) \le \alpha$ and $(C \mid y) \ge \alpha$. Prove that $y \in K^\circ$ and $(C \mid y) \ge 0$.
8 $K_1$ and $K_2$ are closed convex cones, hence $K_1^{\circ\circ} = K_1$, $K_2^{\circ\circ} = K_2$. Use the results of Exercise 6 of Chapter 3.
9 Since $AP^n$ is a cone, $(AP^n \mid a) < \alpha$ implies $(AP^n \mid a) \le 0$. The last inequality implies $A^t a \le 0$.
10 Let $B$ be the set of all $y = (\eta_1, \eta_2, \dots, \eta_n) \in \mathbb{R}^n$ such that $y \ge 0$, $\sum \eta_i = 1$. $B$ is convex and compact, hence $AB$ is convex and compact. We have: $Ay = 0$, $y \ge 0$, $y \ne 0$ has no solution $y$ in $\mathbb{R}^n$ $\Leftrightarrow$ $0 \notin AB$ $\Leftrightarrow$ there exists a hyperplane separating $\{0\}$ and $AB$ strictly.
CHAPTER 5
1, 2, 5, 6 Follow directly from Definition 5.9.
3 Use § 5.12.
4 (a) Follows directly from Definition 5.9. (b) Use the sets $\{x \in E \mid \delta_A(x) \le \lambda\}$ (cf. Theorem 5.3).
7 (a) Let $f$ be a constant. (b) In virtue of the continuity of $f$, we have $B \subset \mathrm{int}(A)$. Let $x \in E$ be such that $f(x) < \lambda$. Let $x_0 \in \mathrm{int}(A)$. There exists $y \in A$ such that $x_0 \in (x, y)$. Conclude that $f(x_0) < \lambda$. It follows that $\mathrm{int}(A) \subset B$.
8 (a) Let $x$ be a local minimum point of $f$. For each $y \in V$, consider the restriction of $f$ to the line through $x$ and $y$. Prove that $f(y) \ge f(x)$ (cf. Exercise 1 of Chapter 1). (b) Suppose that $x$ and $y$ were global minimum points of $f$. Consider the point $\tfrac12(x + y)$.
9 Use the strict convexity of the function $x \mapsto \|x - x_0\|^2$. Cf. example (a) of § 5.18.
10 $\mathrm{dom}(f \,\square\, g) = \mathrm{dom}(f) + \mathrm{dom}(g)$.
11 Cf. § 5.20. If $f(a) = -\infty$ and $f$ is locally bounded above at $a$, there exists a neighborhood $U$ of $a$ such that $U \subset \mathrm{dom}(f)$. By § 5.12, $f(x) = -\infty$ whenever $x \in U$.
12 Apply an analogue of Theorem 5.29.
13 Follows directly from Definition 5.9.
14 Let $x, y \in E$ and $\lambda \in (0, 1)$. Set $z = \lambda x + (1 - \lambda)y$. Let $f$ be a subdifferentiable function from $E$ to $\mathbb{R}$ and $z' \in \partial f(z)$. We have
$$f(t) \ge f(z) + (t - z \mid z')$$
whenever $t \in E$. Take $t = x$ and $t = y$, respectively, and conclude that $\lambda f(x) + (1 - \lambda)f(y) \ge f(z)$.
15 Set $C = \mathrm{co}(e_1, \dots, e_n, -e_1, \dots, -e_n)$. Show that
$$C = \Big\{(\xi_1, \dots, \xi_n) \in \mathbb{R}^n \ \Big|\ \sum_{i=1}^{n} |\xi_i| \le 1\Big\}$$
(cf. Exercise 2 of Chapter 2). Note that $y \in \partial f(0)$ is equivalent to
$$\max_i |\xi_i| \ge (x \mid y) \ \text{ whenever } x = (\xi_1, \dots, \xi_n) \in \mathbb{R}^n. \tag{*}$$
Prove that (*) is equivalent to $y \in C$. Conclude that $\partial f(0) = C$.
16 (a) Follows directly from Definition 5.9. (b) Let $x \in E$, $x' \in E'$ such that $(x \mid x') = \|x\|^2$ and $\|x'\| = \|x\|$. For all $y \in E$, we have
$$f(y) - f(x) - (y - x \mid x') = \tfrac12\|y\|^2 - \tfrac12\|x\|^2 - (y \mid x') + \|x\|^2 \ge \tfrac12\|y\|^2 + \tfrac12\|x\|^2 - \|y\|\,\|x\| = \tfrac12(\|y\| - \|x\|)^2 \ge 0.$$
Conversely, let $x \in E$, $x' \in \partial f(x)$. We have $\tfrac12\|y\|^2 \ge \tfrac12\|x\|^2 + (y - x \mid x')$ whenever $y \in E$. Take $y = \lambda x$ and let $\lambda \uparrow 1$ and $\lambda \downarrow 1$, respectively. Conclude that $(x \mid x') = \|x\|^2$ and hence $\|x'\| \ge \|x\|$. Take $y = x + \varepsilon z$ and deduce that
$$\Big|\Big(\frac{z}{\|z\|} \ \Big|\ x'\Big)\Big| \le \|x\|$$
whenever $z \in E$, $z \ne 0$. Conclude that $\|x'\| \le \|x\|$. (c) Use the result of (b).
17 Apply Theorem 5.37 and cf. Exercise 14.
18 Define the functions $f_1$ and $f_2$ from $\mathbb{R}$ to $\overline{\mathbb{R}}$ by $f_1(x) = 1 - \sqrt{x}$ if $x \ge 0$, $f_1(x) = +\infty$ if $x < 0$ [...] $\in \mathbb{R}$. Show [...] (*) Show that (*) is equivalent to (c).
21 (a) Extend $f$ to all of $E$ by defining $f(x) = +\infty$ if $x \notin C$. Distinguish two cases: $f(x) > -\infty$ whenever $x \in C$, and there exists $x \in C$ such that $f(x) = -\infty$. Show that in the last case we have $|f(x)| = +\infty$ whenever $x \in E$ (cf. § 5.12), hence $f(x) = -\infty$ whenever $x \in C$. Apply Theorem 5.8(c). (b) Use Theorem 5.23 and the result of (a). (c) Let the function $f: E \to \mathbb{R}$ be linear and not continuous. Take $C = E$.
22 Let $A \subset U$ be compact. There exists $\varepsilon > 0$ such that $A_\varepsilon \subset U$, where $A_\varepsilon = \{x \in E \mid d(x, A) \le \varepsilon\}$. By Theorem 5.23, $f$ is continuous on $U$. Since $A_\varepsilon$ is compact, there exist $m, M \in \mathbb{R}$ such that $m \le f(x) \le M$ whenever $x \in A_\varepsilon$. Imitate the proof of the theorem in § 5.21.
23 Let $(a, b) \in \mathbb{R}^p \times \mathbb{R}^q$. Apply § 5.26 to the collection of functions $\{f_x \mid x \in B\}$ where $f_x$ is defined by $f_x(y) = f(x, y)$ and $B$ is a ball in $\mathbb{R}^p$ with centre $a$.
24 Let $x \in E$. Since $f$ is continuous at $x_0$, we have $f'(x_0; x) \in \mathbb{R}$. Let $m$ be the line
$$\{(x_0 + \lambda x,\ f(x_0) + \lambda f'(x_0; x)) \mid \lambda \in \mathbb{R}\}$$
in $E \times \mathbb{R}$. Prove that there exists a closed hyperplane in $E \times \mathbb{R}$, containing $m$ and separating $\mathrm{epi}(f)$ and $m$ properly (cf. § 3.9). Conclude that there exists $x_0' \in \partial f(x_0)$ such that $f'(x_0; x) = (x \mid x_0')$. To complete the proof, use Theorem 5.36.
25 Use the result of Exercise 24.
CHAPTER 6
1 Write
$$f^*(x') = \sup_{x \in E} \{(x \mid x') - \varphi(\|x\|)\} = \sup_{t \ge 0} \Big\{ \sup_{\|x\| = t} (x \mid x') - \varphi(t) \Big\}$$
(cf. § 6.4, example (b)).
2 Take $x = x'$ in Fenchel's inequality (cf. § 6.9).
3 Apply Theorem 6.15.
4 (a) We know that $f^{**}(x_0) \le f(x_0)$ (cf. Theorem 6.11). Let $x_0' \in \partial f(x_0)$. For each $x \in E$
$$f(x) \ge f(x_0) + (x - x_0 \mid x_0'),$$
hence
$$f^{**}(x) \ge f(x_0) + (x - x_0 \mid x_0')$$
(cf. Theorem 6.11). It follows that $f^{**}(x_0) \ge f(x_0)$. (b) Use Theorem 6.10 and the result of (a).
5 First show that $f^{**} \in \Gamma(E)$ (combine Theorem 6.11(a) and Definition 6.17).
6 Apply Theorems 6.10 and 6.16.
7 If $f$ is improper convex, then $f = +\infty$ or $f(x) = -\infty$ for at least one $x$. In both cases, $f^*$ is improper convex (cf. § 6.3). If $f$ is proper convex, $\bar f$ is proper convex (cf. Theorem 5.24) and $f^{**} = \bar f$ (cf. Theorem 6.16). Conclude that $f^*$ is proper convex.
8 $\delta_A^{**} = \mathrm{cl}(\mathrm{co}(\delta_A)) = \mathrm{cl}(\delta_{\mathrm{co}(A)}) = \delta_{\overline{\mathrm{co}}(A)} = \delta_B$ where $B = \overline{\mathrm{co}}(A)$ (cf. Theorem 6.20).
9 (a) Follows directly from Definition 6.1(a). Note that $\delta_C \,\square\, \delta_D = \delta_{C+D}$. (b) Use Theorem 6.20(e).
10 If $f$ is a real positively homogeneous convex function on $\mathbb{R}^n$, $f$ is continuous, hence $f \in \Gamma_0(\mathbb{R}^n)$. By § 6.22, there exists a closed convex nonempty subset $C$ of $\mathbb{R}^n$ such that $f = \delta_C^*$. For each $x' \in \mathbb{R}^n$, $\delta_C^*(x') \in \mathbb{R}$, hence
$$\sup_{x \in C} (x \mid x') < +\infty.$$
Deduce that $C$ is bounded. Conversely, let $C$ be a bounded convex nonempty subset of $\mathbb{R}^n$. $\delta_C^*$ is positively homogeneous and convex. For each $x' \in \mathbb{R}^n$ and each $x \in C$
$$(x \mid x') \le M \|x'\|$$
where $M = \sup_{x \in C} \|x\|$. It follows that $\delta_C^*(x') \in \mathbb{R}$.
11 If $f$ is improper convex, then $\mathrm{cl}(f) = -\infty = \delta_\emptyset^*$. If $f$ is proper convex, then $\mathrm{cl}(f) = \bar f$. Apply § 6.22 (the positive homogeneity of $\bar f$ follows, for instance, from Theorem 5.8(c)).
12 Use the definition of $\partial g(0)$ and apply Theorem 5.36.
13 Follows directly from the definitions of $\delta_K^*$ and $K^\circ$.
14 There exists $a \in \mathbb{R}^n$ such that $\delta_A^*(x) = (x \mid a)$ $(x \in \mathbb{R}^n)$. It follows that $\delta_A^{**} = \delta_{\{a\}}$ (cf. § 6.4). Now use the result of Exercise 8.
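Fenchel's inequality, which the hint to Exercise 2 of Chapter 6 relies on, can be explored numerically in one dimension. The sketch below rests on assumptions: the conjugate is taken to be f*(x') = sup over x of {(x | x') - f(x)}, Fenchel's inequality is taken to read f(x) + f*(x') >= (x | x'), and the supremum is replaced by a maximum over a finite grid, so the result is only an approximation. The choice f(x) = (1/2)x^2 and all names are illustrative.

```python
# Sketch: grid approximation of the conjugate of f(x) = (1/2) x^2 on R,
# with a check of Fenchel's inequality and of f* = f for this particular f.

def f(x):
    return 0.5 * x * x

# Grid on [-10, 10] with step 0.01; the supremum is approximated by a maximum.
GRID = [i / 100.0 for i in range(-1000, 1001)]

def conjugate(s):
    """Approximate f*(s) = sup_x (x*s - f(x)) by maximizing over GRID."""
    return max(x * s - f(x) for x in GRID)

slopes = [k / 2.0 for k in range(-6, 7)]  # s = -3.0, -2.5, ..., 3.0

# Fenchel's inequality on a coarse subgrid (small tolerance for rounding).
fenchel_holds = all(
    f(x) + conjugate(s) >= x * s - 1e-12
    for x in GRID[::50]
    for s in slopes
)

# For this f the conjugate coincides with f itself.
self_conjugate_error = max(abs(conjugate(s) - f(s)) for s in slopes)

print(fenchel_holds, self_conjugate_error)
```

The grid is wide enough that the maximizing point x = s lies inside it for every slope tested; for slopes outside [-10, 10] the approximation would underestimate the conjugate.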
Glossary
$[a, b]$    as a subset of $\mathbb{R}$: the set of all $x \in \mathbb{R}$ satisfying $a \le x \le b$; in a linear space: the line segment with endpoints $a$ and $b$ (see § 2.1)
$(a, b)$    the interior of $[a, b]$. Analogously $(a, b]$ and $[a, b)$
$(x \mid y)$    the inner product of $x$ and $y$ (see § 4.5)
$(x \mid u)$    the value of $u \in E'$ at $x \in E$ (see § 3.13)
[...]    the set of all $x \in \mathbb{R}$ such that $x$ [...]
$\overline{\mathbb{R}}$    $\mathbb{R} \cup \{+\infty\} \cup \{-\infty\}$ (see § 1.18)
$\mathrm{int}(A)$    the interior of $A$
$\overline{A}$    the closure of $A$
$\mathrm{bd}(A)$    the boundary of $A$
$\mathrm{ri}(A)$    the relative interior of $A$ (see § 4.8)
$\mathrm{rb}(A)$    the relative boundary of $A$ (see § 4.12)
$A^i$    the algebraic interior of $A$ (see § 2.14)
$A^a$    the algebraic closure of $A$ (see § 2.14)
$\mathrm{co}(A)$    the convex hull of $A$ (see § 2.2)
$\overline{\mathrm{co}}(A)$    the closed convex hull of $A$ (see § 2.25)
$\mathrm{aff}(A)$    the affine hull of $A$ (see § 2.6)
$\dim(A)$    the dimension of $A$ (see § 2.6)
$E'$    the dual of $E$ (see § 3.13)
$K^\circ$    the polar of $K$ (see § 3.13)
$K^{\circ\circ}$    the bipolar of $K$ (see § 3.13)
$P^n$    the nonnegative orthant in $\mathbb{R}^n$ (see § 4.14)
$T^t$    the adjoint of the linear map $T$
$V \oplus \mathbb{R}$    the linear space of all $(x, \lambda) \in V \times \mathbb{R}$ (see § 5.1)
$f(A) < \alpha$    $f(x) < \alpha$ whenever $x \in A$
$\mathrm{dom}(f)$    the effective domain of $f$ (see § 1.21, § 5.11, and § 7.14)
$\mathrm{epi}(f)$    the epigraph of $f$ (see § 5.1)
$\bar f$    the lower semicontinuous hull of $f$ (see § 5.5)
$f'_+$    the right derivative of $f$
$f'_-$    the left derivative of $f$
$f|_m$    the restriction to $m$ of $f$
$\mathrm{co}(f)$    the convex hull of $f$ (see § 5.16)
$f \,\square\, g$    the infimal convolution of $f$ and $g$ (see § 5.17)
$f'(x_0; x)$    the directional derivative of $f$ at $x_0$ in the direction $x$ (see § 5.19)
$B(a; r)$    the closed ball with centre $a$ and radius $r$
$\nabla f$    the Gâteaux-differential of $f$ (see § 5.28)
$\partial f$    the subdifferential of $f$ (see § 5.30)
$\mathrm{dom}(\partial f)$    the domain of $\partial f$ (see § 5.30)
$f^*$    the conjugate of $f$ (see § 6.1)
$f^{**}$    the bipolar of $f$ (see § 6.1)
$\mathrm{cl}(f)$    the closure of $f$ (see § 6.13)
$\delta_A$    the indicator function of $A$ (see § 5.15)
$\delta_A^*$    the support function of $A$ (see § 6.5)
$\Gamma(E)$    see Definition 6.17
$\Gamma_0(E)$    see § 6.19
$\mathrm{prox}_f$    the proximity mapping with respect to $f$ (see § 7.17)
Subject Index
absolutely continuous, 5, 10
affine combination, 22
affine function, 66
affine hull, 22
affine subset, 21
affinely dependent, 23
affinely independent, 23
algebraic closure, 24
algebraic interior, 24
barycentric coordinates, 24
best approximation, 108
bipolar of a cone, 39
bipolar of a function, 84
Blaschke's convergence theorem, 45
Carathéodory number, 57
Carathéodory's theorem, 41
closed convex hull, 29
closed function, 90
closed halfspace, 50
closed hyperplane, 35, 36
closure of a function, 90
concave, 105
  proper, 105
cone, 38
  convex, 38
  finitely generated convex, 51
cone of supporting functionals, 98
conjugate, 12, 84, 105
constraint qualification, 111
convex algebraic body, 26
convex body, 29
convex combination, 20
convex cone, 38
convex function, 1, 15, 61
  improper, 15, 63
  proper, 15, 63
  strictly, 1, 63
convex hull, 20, 64
convex polytope, 22
convex programming, 100
convex set, 20
convexity space, 56
cyclically monotone, 109
  maximal, 110
derivative, 72
differentiable, 72
  Fréchet, 72
  Gâteaux, 72
dimension, 21, 22
directional derivative, 66
domain, 74
  effective, 15, 63
dual of a function, 84
dual of a normed linear space, 38
dual problem, 105
duality theorem of Fenchel, 105
effective domain, 15, 63
epigraph, 58
equality constraints, 102
extreme point, 23
Farkas' lemma, 54
feasible solution, 101
  strictly, 101
Fenchel's duality theorem, 105
Fenchel's inequality, 87
finitely generated convex cone, 51
Fréchet derivative, 72
Fréchet-differentiable, 72
Gâteaux-differentiable, 72
Gâteaux-differential, 72
gauge, 26
generator, 51
Gordan's lemma, 56
Hahn-Banach theorem, 37
Hausdorff distance, 44
Helly number, 57
Helly's theorem, 43
Hölder's inequality, 18
hyperplane, 32
  closed, 35, 36
  nonvertical, 74
  supporting, 37
  vertical, 74
improper convex, 15, 63
indicator function, 64
inequality constraints, 102
inequality of Fenchel, 87
inequality of Hölder, 18
inequality of Jensen, 11
inequality of Young, 14
infimal convolution, 65
interior of a line segment, 20
Jensen's inequality, 11
k-simplex, 24
Kirchberger's theorem, 44
Kuhn-Tucker conditions, 102
Lagrange function, 103
Lagrange multipliers, 102
Lagrangian, 103
line segment, 20
linear programming, 101
linear topological space, 28
Lipschitzian, 5, 68
  locally, 68
  locally equi, 71
locally bounded, 67
locally convex space, 37
logarithmically convex, 18
lower semicontinuity, 59
lower semicontinuous hull, 60
maximal (cyclically) monotone, 110
midpoint convex, 6
Minkowski, theorem of, 50
Minkowski distance functional, 26
minorant, 60
monotone operator, 109
  cyclically, 109
  maximal (cyclically), 110
multifunction, 57, 109
  convex, 57
multipliers, 102
nontrivial supporting hyperplane, 37
nonvertical hyperplane, 74
norm topology, 85
normal, 98
normal cone, 98
normed linear space, 38
optimal solution, 101
polar of a cone, 39
polar of a function, 84
polyhedral cone, 51
positively homogeneous, 26, 67
projection, 108
proper concave, 105
proper convex, 15, 63
proper separation, 34
proximity mapping, 108
quasiconvex, 17
  strictly, 17
Radon number, 57
Radon's theorem, 55
relative boundary, 49
relative interior, 46
saddle point, 103
separation, 34
  proper, 34
  strict, 34
separation theorem, 34, 36, 48
simplex, 24
Slater's condition, 98
starshaped, 35
strict separation, 34
strictly convex, 1, 63
strictly feasible solution, 101
strictly quasiconvex, 17
subadditive, 26
subdifferentiable, 74
subdifferential, 73, 74
subgradient, 73
support function, 86
supporting hyperplane, 37
  nontrivial, 37
theorem of the alternative, 54
vertex, 23
vertical hyperplane, 74
weak topology, 85
Young's inequality, 14
Convex Analysis
An Introductory Text
Jan van Tiel, Royal Netherlands Meteorological Institute
This book provides an introduction to convex sets, convex functions and convex optimization. It emphasizes the basic concepts and the characteristic methods of this area of mathematics. The proofs of the theorems have been constructed in such a way that trying to understand them means learning the methods of convex analysis, and a large number of elementary exercises (with answers and hints at the end of the book) aid in understanding the concepts employed. A book for students of mathematics in particular, but also for physicists, engineers, control theorists and economists.
Contents
Preface
Chapter 1 Convex Functions on $\mathbb{R}$
Chapter 2 Convex Subsets of a Linear Space
Chapter 3 Separation Theorems
Chapter 4 Convex Subsets of $\mathbb{R}^n$
Chapter 5 Convex Functions on a Linear Space
Chapter 6 Duality
Chapter 7 Optimization
Answers and Hints
Glossary
Subject Index
JOHN WILEY & SONS Chichester New York Brisbane Toronto Singapore