Metric Spaces

This is an exposition of some of the material in Kolmogorov’s book.

definition: a metric on a set X is a non-negative real-valued function d : X \times X \rightarrow \mathbb{R} such that d(x,y) = 0 iff x = y, d(x,y) = d(y,x) and d(x,y) + d(y,z) \ge d(x,z). A metric space is a set X equipped with a metric.

examples:

  • the delta function \delta(x,y) = \mathbb{I}(x=y).
  • the set of real numbers with \delta(x,y) = |x - y|.
  • the set of n-dimensional real vectors with
    • \delta(x,y) = \left( \sum_{i=1..n} (x_i - y_i)^2 \right)^{1/2}
    • \delta(x,y) = \max_{i = 1..n} |x_i  - y_i|
  • the set of continuous functions defined on [a,b] with distance d(f,g) = \max \right( |f(t) - g(t)|, t \in [a,b] \right)
  • the set of infinite real-valued sequences which are square-summable.
  • the set of functions continuous on [a,b] with d(x,y) = \left(\int_a^b (x(t) - y(t))^2 \right)^{1/2}.
  • the set of bounded sequences with d(x,y) = \sup_n |x_n - y_n|
  • the set of n-dimensional real vectors with d(x,y) = \left( \sum_{i=1..n} (x_i - y_i)^p \right)^{1/p}

Most of the examples define candidate metrics that are relatively straightforward to verify as metrics. To verify the last candidate is a metric, an inequality called Hölder’s inequality needs to be established:

\sum_i |a_i b_i| \le \left( \sum_i |a_i|^p \right)^{1/p} \left( \sum_i |b_i|^q \right)^{1/q}

where p,q > 1 are a conjugate pair real numbers such that p^{-1} + q^{-1} = 1. Some properties:

  • The sum of conjugate number is identical to their product.
  • (p-1)q = p and (q-1)p = q.
  • (p-1)(q-1) = 1
  • p > 2 iff q < 2.

The inequality is homogenous (if it holds for a,b it holds for \lambda a, \mu b where \lambda, \mu are real numbers. So it is sufficient to establish the inequality when

\sum_i |a_i|^p , \sum_i |b_i|^q  = 1.

Consider the jointly non-negative quadrant of the real plane with coordinates (\xi,\nu) and the function \eta = \xi^{p-1} or equivalently \xi = \eta^{q-1}. Note that \phi(x) = x^\alpha is increasing when \alpha > 0.

Suppose a^{p-1} \le b. Since \eta \ge a^{p-1} implies \eta^{q-1} \ge a^{(p-1)(q-1)} = a and so

\int_{a^{p-1}}^b \eta^{q-1} d\eta \ge \int_{a^{p-1}}^b a d\eta

\int_0^{a^{p-1}} \eta^{q-1} d\eta + \int_0^a \xi^{p-1} d\xi = \int_0^{a^{p-1}} a d\eta

adding prior two equations yields

\int_0^b \eta^{q-1} d\eta + \int_0^a \xi^{p-1} d\xi  \ge ab

\frac{1}{p} a^p + \frac{1}{q} b^q \ge ab

Analogously, b \le a^{p-1} will imply the same conclusion. Hence

\sum_i \frac{1}{p} a_i^p + \frac{1}{q} b_i^q \ge \sum_i a_i b_i

1 \ge \sum_i a_i b_i

A second inequality called Minkowski’s inequality is given by

\left( \sum_i (|a_i|| + |b_i|)^p \right)^{1/p} \le \left( \sum_i |a_i|^p \right)^{1/p} + \left( \sum_i |b_i|^p \right)^{1/p}

Since

(|a|+|b|)^p = (|a| + |b|)^{p-1} |a| +  (|a| + |b|)^{p-1} |b|

\sum_i (|a_i|+|b_i|)^p = \sum_i (|a_i| + |b_i|)^{p-1} |a_i| +  \sum_i (|a_i| + |b_i|)^{p-1} |b_i| \le \\ \left( \sum_i (|a_i| + |b_i|)^{p} \right)^{1/q} \left( \left[ \sum_i |a_i|^p \right]^{1/p} + \left[ \sum_i |b_i|^p \right]^{1/p} \right)

Continuous Mappings and Homeomorphisms

A mapping f: X \rightarrow Y between metric spaces is continuous at point x_0 \in X if for any \epsilon > 0, there is a \delta > 0 such that d(x_0,x) < \delta implies d(fx_0,fx) < \epsilon. The mapping is continuous if it is continuous at x_0 for all x_0 \in X.

A homeomorphism is a set isomorphism f between metric spaces such that f,f^{-1} are both continuous. A homeomorphism is an isometry or distance-preserving if d(fx,fy) = d(x,y) for all x,y \in X. Two metric spaces are isometric if there is an isometry between them.

Convergence. Open and Closed Sets.

The open sphere with center x \in X and radius r > 0 is the set S(x,r) = \{y : d(x,y) < r\} consisting of all points of X within distance less than r of center x. The closed sphere is the set S[x,r] = \{y : d(x,y) \le r\}. An r-neighbourhood of x is the open sphere S(x,r).

A contact point of a set E is a point x \in X such that every neighbourhood of x contains a point of E. The closure of a set E is the set [E] consisting of E and its contact points. The closure operator is a unary operator on sets mapping a set E to its closure [E].

theorem: properties of the closure operator:

  • if M \subseteq N then [M] \subseteq [N].
  • [[M]] = [M]
  • [M \cup N] = [M] \cup [N]
  • [\emptyset] = \emptyset

Since every point of closure of M is a point of closure of N when M is a subset of N, the first claim is true.

Let p be a point of closure of [M]. Then for every neighbourhood B(p,r) there is a point q \in [M] such that q \in B(p,r). But then B(q,r-d(p,q)) is a neighbouhood of q contained in B(p,r) that contains point of M, and so B(p.r) also contains a point of M. Since this is true for every r > 0, it follows that p is a contact point of M. Hence [[M]] \subseteq [M]. But since [M] \subseteq [[M]], it follows that [M] = [[M]].

Since M,N \subseteq M\cup N, it follows that [M],[N] \subseteq [M \cup N] and so [M] \cup [N] \subseteq [M \cup N]. Let p be a point of closure of M \cup N. Either p is a point of closure of M or it is not a point of closure of M. In the latter case, there is a neighbourhood B(p,r) that does not contain any points of M. Since every neighbourhood of p contains a point of M \cup N, it follows that every neighbourhood of p contains a point of N and so p is a point of closure of N. Hence [M \cup N] \subseteq [M] \cup [N].

Let p be a point of closure of [\emptyset]. Then for every neighbourhood B(p,r) there is a point q \in \emptyset \cap B(p,r). But this is false, since the empty set contains no points. Therefore [\emptyset] cannot contain any points, that is it is the empty set.

definition: a limit point of a set E is a point x \in X such that every neighbourhood of x contains an infinite number of points of E. An isolated point of a set E is a point x \in E for which there is a neighbourhood of x that contains no other points of E other than x.

definition: a sequence of points \{x_n\} in a metric space X converge to a point x \in X if every neighborhood of x contains all but finitely many points of the sequence. This statement is equivalent to the requirement for any \epsilon > 0 there is an N such that for all n\ge N x_n \in B(x,\epsilon). Here x is called the limit of the sequence \{x_n\} and is denoted by x_n \rightarrow x (as n \rightarrow \infty.

There are some consequences of this definition:

  • if a limit exists, it is unique
  • if a sequence has a limit x, so does every subsequence.

Suppose x,y are distinct limits of the sequence \{x_n\}. Then B(x,2^{-1}d(x,y)),B(y,2^{-1}d(x,y)) are disjoint neighbourhoods of x,y that both contain all but finitely many points of the sequence \{x_n\}. But this is false. Hence a limit if it exists must be unique.

A subsequence is a sequence of the form x \circ m where m is a strictly increasing operator on the natural numbers. Let x be the limit of the sequence \{x_n\}. Let \epsilon > 0 be given. Then there is an N such that for all n \ge N it is true that x_n \in B(x,\epsilon). Since m(N) \ge N, it follows that (x \circ m)(n) \in B(x,\epsilon) for all n \ge m(N). Hence x is the limit of (x \circ m). But since this is true for any subsequence, the second claim is also true.

theorem: a necessary and sufficient condition for x to be a contact point of a given set E is that there is a sequence of points of E that has limit x.

Suppose x is a limit point of E. For each positive integer i, choose x_i B(x,2^{-i}) \cap E. Then necessarily \{x_i\} is a sequence of points of E with limit x. Conversely, suppose there is a sequence of points of E that has limit x. Then for any \epsilon > 0, there is a positive integer i such that x_i \in B(x,\epsilon) \cap E. Hence x is a contact point of E.

theorem: a necessary and sufficient condition for x to be a limit point of a given set E is that there is a sequence of distinct points of E that limit x.

The proof is similar to the prior result.

definition: Let A,B be subsets of a metric space X. A is dense in B if B \subseteq [A]. That is, every point of B is a point of closure of A. A is everywhere dense if [A] = X. A is nowhere dense if there is no open set O for which A is dense in O. A metric space is separable if it has a countable everywhere dense set.

A set A is dense in B if a neighbourhood of any point of B contains a point of A. That is, A is “always in the neighbourhood” of B. A set is A is nowhere dense then every open set has a point that has a neighbourhood that contains no points of A. For any neighbourhood of a point, there are always points that are isolated from A.

examples:

  • the rationals are dense in the real line. since the rationals are countable, the real line is separable
  • the set of rational vectors in n-dimensional real space \mathbb{R}^n is dense in \mathbb{R}^n.
    • to see this note that every neighbourhood of a point x \in \mathbb{R}^n contains a “rectangular set” \prod_{i=1..n} B(x_i,r_i), and each such component contains a rational. So the product contains a rational vector.

definition: a subset E of a metric space X is closed if [E] = E, that is every contact point of E is a point of E or even more succinctly every limit point of E is a point of E.

theorem: the intersection of any collection of closed sets is closed. The union of any finite collection of closed sets is closed.

Let p be a point of limit point of \cap \mathcal{C} where \mathcal{C} is a collection of closed sets. Then it is a limit point of every set C \in \mathcal{C}. But since C is closed, p is a point of C. Since this is true for all C \in \mathcal{C}, it follows that p \in \cap \mathcal{C}. Hence \cap \mathcal{C} is closed.

It is sufficient to establish the union of every pair of closed sets is closed, since the more general statement will naturally follow by induction. Let x be a limit point p of the union C \cup D of closed sets C,D. Then there is a sequence of distinct points of C \cup D with limit p. Delete all points of this sequence belonging to D. If this sequence is infinite then it is a subsequence of the original sequence that is a sequence of distinct points of C with limit x. Since C is closed, x is a point of C and hence C \cup D. If the sequence is finite, then deleting all points of this sequence belonging to C will yield an infinite sequence and the same argument can be applied.

definition: an interior point of a set E is a point that has a neighbourhood contained in E. A set is open if every point of E is an interior point of E.

theorem: a set E is open iff its complement is closed.

Let E be an open set. If E^C is not closed, there is a limit point p \in E^C that is contained in E. But since p is an interior point of E, there is a neighbourhood O contained in $E$. But every neighbourhood of p contains an infinite number of points of E^C, which is false. So E^C is closed.

Let E^C be a closed set. Suppose E is not open. Then there is a point p \in E that is not an interior point of E. But then every neighbourhood of p contains a point of E^C. Hence p is a contact point of E^C, and since E^C is closed, p \in E^C. This is false. And so E is open.

corollary: \emptyset is closed and open.

Since the space X is open and closed, so is its complement.

theorem: the union of a collection of open sets is an open set and the intersection of a finite collection of open sets is an open set.

Since the complement of sets in a collection of open sets is a collection of closed sets, their intersection is closed. But the complement is open and equal to the union of the collection.

Since the complement of a sets in a finite collection of open sets is a collection of closed sets, their union is closed. But the complement is open and equal to the intersection of the collection.

theorem: every open set of the real line is the union of an at most countable collection of pairwise disjoint open intervals.

Let O be an open set. Then each point of x \in O is an interior point of O and so there is an open interval I_x contained O. Let G_x be the collection of all open intervals containing x and contained in O. Then J_x = \cup G_x is the largest open interval containing x and contained in O. Then \{J_x : x \in O\} is a collection of open intervals contained in O and containing every point in O. The collection is disjoint for if J_x \cap \cup J_y is nonempty, then \cup J_x \cup \cup J_y is an open interval containing both x and y, and so J_x \cup J_y \subseteq J_x,J_y, and so J_x = J_y.

corollary: every closed set can be obtained by deleting an at most countable collection of open intervals from the real line.

Complete Metric Spaces

definition: a cauchy sequence of a metric space X is a sequence \{x_n\} of points of X such that for any \epsilon > 0, there is an N such that for all m,n \ge N, d(x_n,x_m) < \epsilon.

theorem: every convergent sequence is a Cauchy sequence.

since d(x_n,x_m) \le d(x_n,x) + d(x,x_m), taking the limits as n,m \rightarrow \infty, d(x_n,x_m) \rightarrow 0.

definition: a metric space is complete if every cauchy sequence is convergent.

theorem: a metric space X is complete iff every nested sequence of closed spheres whose radii form a sequence that converges to zero has a nonempty intersection.

Let X be complete. Let \{S[x_n,r_n]\} be a nested sequence of closed spheres such that \lim_n r_n = 0. Then the sequence of centers is a Cauchy sequence, since d(x_n,x_m) < r_N for all n,m \ge N, and for any \epsilon > 0, there is an N such that r_N < \epsilon. Since X is complete, the sequence of centers is convergent to x. But this implies that x is a contact point of S[x_n,r_n] for any n. Since S[x_n,r_n] is closed, it contains its contact points, and so x \in S[x_n,r_n] for all n. That is, the intersection is nonempty.

Conversely, suppose every nested sequence of closed spheres whose radii converge to zero has a nonempty intersection. Let \{x_n\} be a Cauchy sequence. Choose a subsequence such that d(x_n,x_m) \le 2^{-N_k} for all n,m \ge N_k and N_k > N_{k-1}. Then \{ S[x(N_k),2^{-N_k}] \} is a nested sequence of closed spheres whose radius converges to zero, and so the intersection is nonempty. Hence there is a point x contained in the intersection. So d(x(N_l),x) < 2^{-N_k} for all l \ge k, and so this subsequence is convergent. But a cauchy sequence that has a convergent subsequence is convergent.

Baire’s theorem: A complete metric space cannot be represented as a union of a countable number of nowhere dense sets.

Suppose the complete metric space X can be represented as a countable union of nowhere dense sets \{A_n\}. Since A_1 is nowhere dense in a closed sphere S_1 of radius 2^{-1} there is a closed sphere S_2 \subseteq S_1 of radius less than 2^{-2} such that S_2 \cap A_1 = \emptyset. This process can be iterated to obtain a sequence of nested closed balls whose radius converges to zero. Since the space is complete, the intersection of the closed balls is nonempty. But Since S_{i+1} \subseteq A_i^C it follows that \cap_i S_{i+1} \subseteq \cap_i A_i^C = \left(\cup_i A_i \right)^C = X^C = \emptyset. This is false. hence the supposition is false.

corollary: a complete metric space with no isolated points is uncountable.

If it is countable, then there is an enumeration \{x_i\} such that X = \cup_i \{x_i\}. But \{x_i\} is nowhere dense. For if there is an open set O where \{x_i \} is dense, then every point x \in O and every neighbourhood of x contains x_i. But this means for any x \in O, and for any \epsilon > 0, d(x,x_i) < \epsilon and so x = x_i. Hence x_i is an isolated point of X. But the space has no isolated points.

Completion of Metric Spaces

Not every metric space is complete. For example, the metric space X = (0,1) with distance d(x,y) = |x-y| is such that the sequences \{x_n = n^{-1}\} and \{y_n = 1 - x_n\} are both Cauchy but do not converge to a point in X.

definition: A completion of a metric space (R,d) is a complete metric space (X,\rho) such that R \subseteq X and d = \rho | R \times R.

theorem: Every metric space has a completion unique to homeomorphism.

Let (X,\rho) and (Y,\beta) be completions of the metric space (R,d). Then R \subseteq X,Y and d = \rho | R \times R = \beta | R \times R. Let p be a contact point of R \subseteq X. Then there is a sequence \{p_n\} of points of R such that \lim_n \rho(p_n,p) = 0. But since \{p_n\} is convergent in X, it is a cauchy in X, cauchy in R, and cauchy in Y. Since Y is complete, it is convergent in Y and so there is a contact point q \in Y such that \lim_n \beta(p_n,q) = 0.

Let p R q be a relation on [R]_X \times [R]_Y such that p R q iff there is a sequence in R convergent in X to p and convergent in Y to q.

To establish injectivity, suppose p R q,q'. Then there are two sequences \{r_n\},\{r'_n\} of R convergent to p in X and convergent to q,q' in Y. Then

\beta(q,q') \le \beta(q,r_n) + d(r_n,r_n') + \beta(r_n',q')

but since d(r_n,r'_n) \le \rho(r_n,p) + d(p,p) + d(p,r'_n) implies \lim_n d(r_n,r'_n) = 0. it follows that \beta(q,q') = 0 and so q = q'.

Hence for any p \in [R]_X, there is a unique \phi(p) \in [R]_Y such that p R \phi(p). But by symmetry, for any q \in [R]_Y there is a unique \psi(q) \in [R]_X such that \psi(q) R q. Hence \psi(\phi(p)) = p. So [R]_X \sim [R]_Y.

Furthermore, \rho(p,p') = \beta(\phi(p),\phi(p')). For

\rho(p,p') \le \lim_n \rho(p,p_n) + d(p_n,p'_n) + \rho(p'_n,p') = \lim_n d(p_n,p'_n) \le \lim_n [\beta(p_n,\phi(p)) + \beta(\phi(p),\phi(p')) + \beta(\phi(p'),p'_n)] = \beta(\phi(p),\phi(p'))

and dually \beta(\phi(p),\phi(p')) \le \rho(p,p')

and so \phi is an isometry between complete metric spaces. So a completion of R is unique up to isometry of complete metric spaces.

To establish existence, a complete metric space extending the metric space R must be produced. Define the binary relation on Cauchy sequences of R by p S q if \lim_n d(p_n,q_n) = 0. Then S is an equivalence relation (it is clearly reflexive, symmetric and transitive). Let [S] denote the equivalence classes of S.

Define \rho([s],[t]) = \lim_n d(s_n,t_n). Assuming the limit exists, this is a well-defined function since if s S s', t S t', then

\lim_n d(s'_n,t'_n) \le \lim_n d(s'_n,s_n) + d(s_n,t_n) + d(t_n,t_n') \le \lim_n  d(s_n,t_n) and dually \lim_n d(s_n,t_n) \le \lim_n d(s'_n,t'_n).

Furthermore, \rho is a metric: it is non-negative, real-valued, symmetric. If \rho([s],[t]) = 0 then \lim_n d(s_n,t_n) = 0 and so s S t. Hence [s] = [t]. Finally, for any [s],[t],[u],

\rho([s],[u]) = \lim_n d(s_n,u_n) \le \lim_n d(s_n,t_n) + d(t_n,u_n) = \rho([s],[t]) + \rho([t],[u])

establishing the triangle inequality.

To verify the existence of the limit, for any \epsilon > 0, choose M such that m,n\ge M implies d(s_n,s_m),d(t_n,t_m) < \epsilon/2. Then d(s_n,t_n) - d(s_m,t_m) \le d(s_n,s_m) + d(t_m,t_n) < \epsilon, and dually d(s_m,t_m) - d(s_n,t_n) < \epsilon, and so |d(s_m,t_m) - d(s_n,t_n)| < \epsilon when n,m \ge N. Hence the sequence \left< d(s_n,t_n) \right> is a real-valued cauchy sequence and hence convergent.

It remains to show that Cauchy sequences in [S] converge and that there is an isometric embedding of R into [S].

For the latter, for any p \in R, define \phi(p) = [p^*] where p^* is the constant sequence (p^*)_n = p. Then d(p,q) = \lim_n d(p^*_n,q^*_n) = \rho(\phi(p),\phi(q)). Clearly \phi : R \rightarrow [S] is an injective isometry.

THIS SECTION BELOW IS NOT COMPLETED. A BIT OF CONFUSION RELATING TO THE DEFINITION, WILL REVISIT LATER.

Finally, the completeness needs to be demonstrated. Let w be a Cauchy sequence in [S]. Then \lim_{n,m \rightarrow \infty} \rho(w_n,w_m) = 0. Must show there is a v \in [S] (a cauchy sequence in R) such that \lim_n \rho(w_n,v) = 0.

Let p be a Cauchy Sequence in R. Then w : w_k = \phi(p_k) is a Cauchy sequence in [S] since \lim_{n,m\rightarrow \infty} \rho(w_m,w_n) = \lim_{n,m\rightarrow \infty} d(p_m,p_n) = 0. But [p] \in [S] and \lim_n \rho([p],w_n) = \lim_n d(p_n,p_n) = 0.

Contraction Mapping

Contraction mapping theorem was examined in another page. The proof will not be repeated here.

definition: A mapping on a metric space (R,d) is a map f : R \rightarrow R such that there is an \alpha < 1 such that for all x,y \in R, d(fx,fy) \le \alpha d(x,y).

contraction mapping are continuous.

theorem: Every contraction mapping on a complete metric space has a unique fixpoint.

Since the set of real numbers is a complete normed space, the following result is immediate.

corollary: If f is a mapping on [a,b] such that there is a K < 1 such that for all x, y \in [a,b], |fx - fy| \le K |x-y| then f has a unique fixpoint.

Assuming the function is differentiable, the MVT gives fx - fy = f'(z) (x - y). If the derivative is uniformly bounded by a constant less than unity, then f is a contraction.

examples:

For a linear mapping f on \mathbb{R}^n defined by y_i = \sum_{i} a_{ij} x_j + b_i for i = 1..n, the method of successive approximations can be used. The conditions under which the mapping is a contraction depends on the metric attached to the space.

Using the L_\infty-norm ||x|| = \max_i |x_i|, and using the metric induced by the norm,

\rho(fx,fy) = ||fx - fy|| = \max_i |\sum_j a_{ij} (x_j - y_j) | \le \max_i \sum_j |a_{ij}| ||x - y|| \le \alpha \rho(x,y) when \max_i \sum_j |a_{ij}|  \le \alpha. So in this case if the maximum row sum of the absolute value matrix is less than unity the map is a contraction.

Using the L_1-norm ||x|| = \sum_i |x_i|,

\rho(fx,fy) = \sum_i |(fx)_i - (fy)_i| = \sum_i |\sum_j a_{ij} (x_j - y_j)| \le \sum_i \sum_j |a_{ij} (x_j - y_j)| \le \sum_i \max_j |a_{ij}| \rho(x,y). So if the sum of column maximums is less than unity the map is a contraction.

Using the L_2-norm,

\rho(fx,fy)^2 = \sum_i |\sum_j a_{ij} (x_j - y_j)|^2 \le \sum_i \sum_j a_{ij}^2 \rho(x,y)^2 and so if the sum of the square of all components is less than unity, the map is a contraction.

Contraction mapping and ODES

NEEDS WORK

theorem: Given a function that is globally 2-Lipschitz in a planar domain G containing the point (x_0,y_0) then there is a a neighbourhood of x_0 such that the ode \frac{dy}{dx} = f(x,y) has a unique solution\phi such that \phi(x_0) = y_0.

Let X_\theta = C^1[x_0,x_0 + \theta] be the set of all continuous and differentiable real functions defined on [x_0,x_0 + \theta], and equipped with the norm ||f|| = \max \{fx : x \in [x_0,x_0 + \theta]\}.

Define the operator F on X_\theta by F(\phi)(x) = y_0 + \int_{x_0}^x f(u,\phi(u)) du for all x \in [x_0,x_0+\theta]. Then provided the solutions are contained in G,

||F(\phi) - F(\psi)|| = \max \{ |\int_{x_0}^{x} [f(u,\phi(u)) - f(u,\psi(u))] du| : x \in [x_0,x_0 + \theta]\}

\le \int_{x_0}^{x_0 + \theta} |f(u,\phi(u)) - f(u,\psi(u))| du \le \int_{x_0}^{x_0 + \theta} K |\phi(u) - \psi(u)| du \le K\theta ||\phi - \psi||

Choose \theta > 0 such that K\theta < 1. Then F is a contraction on X_\theta and by the CMT there is a unique fixpoint, corresponding to a solution of the IVP of the ODE.

Navigation

About

Raedwulf ….