Metric Spaces — Reading Notes
Purpose & Context (from Dr. Oussa)
These notes introduce the formal notion of a metric space. My aim is that you build a precise definition, develop a geometric picture, and learn to recognize metrics that arise in familiar and less familiar settings (finite-dimensional vectors, sequences, and functions). Carry the following guiding image: in a metric space, detours are never shorter than the direct route—this is the heart of the triangle inequality.
Definition: Metric and Metric Space
Let \(X\) be a nonempty set. A metric on \(X\) is a function \(d:X\times X\to\mathbb{R}\) such that for all \(x,y,z\in X\):
- Non-negativity: \(d(x,y)\ge 0\).
- Identity of indiscernibles: \(d(x,y)=0\) iff \(x=y\).
- Symmetry: \(d(x,y)=d(y,x)\).
- Triangle inequality: \(d(x,z)\le d(x,y)+d(y,z)\).
The pair \((X,d)\) is called a metric space, and \(d(x,y)\) is the distance between \(x\) and \(y\).
(a) Because of symmetry, the map \((x,y)\mapsto d(x,y)\) is not injective as soon as \(X\) has at least two points: for \(x\ne y\), the distinct ordered pairs \((x,y)\) and \((y,x)\) have the same image.
(b) The triangle inequality encodes the geometric intuition that any detour \(x\to y\to z\) is at least as long as the direct comparison \(x\to z\).
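To make the four axioms concrete, here is a minimal Python sketch that tests them by brute force on a finite sample of points. The helper name `is_metric_on` and the tolerance `tol` are illustrative choices, not part of the notes. Passing the test on a sample is only a necessary condition, while failing it does disprove the metric property.

```python
from itertools import product

def is_metric_on(points, d, tol=1e-12):
    """Brute-force check of the metric axioms on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:                        # non-negativity
            return False
        if (abs(d(x, y)) <= tol) != (x == y):     # d(x,y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:          # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:     # triangle inequality
            return False
    return True

points = [0.0, 1.0, 2.5, -3.0]
print(is_metric_on(points, lambda x, y: abs(x - y)))     # absolute value
print(is_metric_on(points, lambda x, y: (x - y) ** 2))   # squared difference
```

The squared difference fails on this sample because \((2.5-0)^2=6.25\) exceeds \((1-0)^2+(2.5-1)^2=3.25\): squaring breaks the triangle inequality.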
Subspace (Inherited) Metrics
If \(Y\subseteq X\) and \(d\) is a metric on \(X\), then the restriction of \(d\) to \(Y\times Y\) is a metric on \(Y\). We say that \(Y\) is endowed with the metric inherited from \((X,d)\).
Example (a line sitting inside the plane). With the Euclidean metric on \(\mathbb{R}^2\), a straight line \(L\subset\mathbb{R}^2\) becomes a metric space when we restrict the same distance formula to points of \(L\). The metric comes from the ambient space rather than being defined intrinsically on \(L\).
Canonical Metrics on Finite-Dimensional Spaces
On \(\mathbb{R}\), \(d(x,y)=|x-y|\) is the prototype. On \(\mathbb{R}^n\) (or \(\mathbb{C}^n\)), we set, for \(1\le p\lt\infty\) and for \(p=\infty\) respectively,
\[
d_p(x,y)=\left(\sum_{k=1}^n |x_k-y_k|^p\right)^{1/p},\qquad
d_\infty(x,y)=\max_{1\le k\le n}|x_k-y_k|.
\]
Discrete metric. On any nonempty set \(X\), define
\[
d_{\text{disc}}(x,y)=
\begin{cases}
0,& x=y,\\
1,& x\ne y.
\end{cases}
\]
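The formulas above translate directly into code. The following is a small sketch with points represented as tuples of numbers; the function names `d_p` and `d_disc` are chosen here for illustration, and `math.inf` stands for \(p=\infty\).

```python
import math

def d_p(x, y, p):
    """d_p for 1 <= p < infinity; pass p = math.inf for the max metric."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(t ** p for t in diffs) ** (1.0 / p)

def d_disc(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

x, y = (0, 0), (3, 4)
print(d_p(x, y, 1))          # 7.0
print(d_p(x, y, 2))          # 5.0
print(d_p(x, y, math.inf))   # 4
print(d_disc(x, y))          # 1
```

Because `abs` also handles Python complex numbers, the same sketch covers \(\mathbb{C}^n\).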
Geometric Intuition for the Triangle Inequality in \(d_p\)
Taxi vs. Air distance. In \(\mathbb{R}^2\), \(d_1\) counts horizontal + vertical blocks (detours allowed), \(d_2\) is the straight-line distance, \(d_\infty\) is the longest coordinate difference. For any three intersections (points) \(x,y,z\), the taxi route \(x\to y\to z\) cannot be shorter than going directly in the same taxi metric; the same holds for \(d_2\) and \(d_\infty\).
Theorem: \(d_p\) is a Metric on \(\mathbb{K}^n\) — Proof Without Norms
Statement. Let \(1\le p<\infty\) and \(x=(x_i)_{i=1}^n,\ y=(y_i)_{i=1}^n\in\mathbb{K}^n\) with \(\mathbb{K}\in\{\mathbb{R},\mathbb{C}\}\). Define \[ d_p(x,y)=\Big(\sum_{i=1}^n |x_i-y_i|^{\,p}\Big)^{1/p}. \] Then \(d_p\) is a metric. For \(p=\infty\), \(d_\infty(x,y)=\max_{1\le i\le n}|x_i-y_i|\) is also a metric.
Nonnegativity. Each \(|x_i-y_i|^p\ge 0\), hence \(d_p(x,y)\ge 0\).
Identity of indiscernibles. If \(d_p(x,y)=0\), then \(\sum |x_i-y_i|^p=0\). A sum of nonnegative terms is zero iff every term is zero; thus \(|x_i-y_i|=0\) for all \(i\), i.e. \(x=y\). Conversely, \(x=y\Rightarrow d_p(x,y)=0\).
Symmetry. \(|x_i-y_i|=|y_i-x_i|\) termwise; hence \(d_p(x,y)=d_p(y,x)\).
Triangle inequality. Fix \(z\in\mathbb{K}^n\) and set \(u_i=|x_i-z_i|\), \(v_i=|z_i-y_i|\). Then \(|x_i-y_i|\le u_i+v_i\), so
\[
d_p(x,y)=\Big(\sum |x_i-y_i|^p\Big)^{1/p}\le \Big(\sum (u_i+v_i)^p\Big)^{1/p}.
\]
By Minkowski’s inequality for finite sums (proved below),
\[
\Big(\sum (u_i+v_i)^p\Big)^{1/p}\le \Big(\sum u_i^p\Big)^{1/p}+\Big(\sum v_i^p\Big)^{1/p}
= d_p(x,z)+d_p(z,y).
\]
This yields \(d_p(x,y)\le d_p(x,z)+d_p(z,y)\).
- Young’s inequality. For \(a,b\ge 0\) and conjugate exponents \(1\lt p,q\lt\infty\) with \(\tfrac1p+\tfrac1q=1\),
\[
ab\le \frac{a^p}{p}+\frac{b^q}{q}.
\]
Proof sketch: maximize \(ab-\tfrac{b^q}{q}\) in \(b\ge 0\); the optimum is at \(b=a^{p-1}\), where the value is \(\tfrac{a^p}{p}\).
- Hölder’s inequality (finite sums). For nonnegative sequences \((\alpha_i),(\beta_i)\),
\[
\sum_{i=1}^n \alpha_i\beta_i\ \le\ \Big(\sum \alpha_i^{p}\Big)^{1/p}\Big(\sum \beta_i^{q}\Big)^{1/q}.
\]
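Proof: Let \(A=(\sum \alpha_i^p)^{1/p}\), \(B=(\sum \beta_i^q)^{1/q}\). If \(A=0\) or \(B=0\), both sides vanish. Otherwise normalize: apply Young termwise to \(\tilde\alpha_i=\alpha_i/A\), \(\tilde\beta_i=\beta_i/B\) and sum, which gives \(\sum \tilde\alpha_i\tilde\beta_i\le \tfrac1p+\tfrac1q=1\).
- Minkowski’s inequality (finite sums). For nonnegative \((u_i),(v_i)\),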
\[
\Big(\sum (u_i+v_i)^p\Big)^{1/p}\ \le\ \Big(\sum u_i^{p}\Big)^{1/p}+\Big(\sum v_i^{p}\Big)^{1/p}.
\]
Proof: Let \(S=\big(\sum (u_i+v_i)^p\big)^{1/p}\). If \(S=0\) we are done. Otherwise,
\[
\sum (u_i+v_i)^p=\sum u_i(u_i+v_i)^{p-1}+\sum v_i(u_i+v_i)^{p-1}.
\]
Apply Hölder to each sum with conjugate exponents \(p,q\), noting \((p-1)q=p\), to get
\[
S^p \le \Big(\sum u_i^p\Big)^{1/p}S^{p/q}+\Big(\sum v_i^p\Big)^{1/p}S^{p/q}.
\]
Divide by \(S^{p/q}>0\) and use \(p-\tfrac{p}{q}=1\) to conclude \(S\le (\sum u_i^p)^{1/p}+(\sum v_i^p)^{1/p}\).
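As a sanity check on the Hölder and Minkowski inequalities for finite sums, one can test them on random nonnegative vectors. This is a sketch only: the names `p_sum` and `check_inequalities`, the seed, and the tolerance are assumptions made here, and random testing gives numerical evidence rather than a proof.

```python
import random

def p_sum(v, p):
    """(sum v_i^p)^(1/p) for a list v of nonnegative numbers."""
    return sum(t ** p for t in v) ** (1.0 / p)

def check_inequalities(p, n=5, trials=1000, seed=0, tol=1e-9):
    """Test Hoelder and Minkowski on random vectors; requires p > 1."""
    rng = random.Random(seed)
    q = p / (p - 1.0)                      # conjugate exponent: 1/p + 1/q = 1
    for _ in range(trials):
        u = [rng.random() for _ in range(n)]
        v = [rng.random() for _ in range(n)]
        holder_lhs = sum(a * b for a, b in zip(u, v))
        if holder_lhs > p_sum(u, p) * p_sum(v, q) + tol:
            return False
        minkowski_lhs = p_sum([a + b for a, b in zip(u, v)], p)
        if minkowski_lhs > p_sum(u, p) + p_sum(v, p) + tol:
            return False
    return True

for p in (1.5, 2.0, 3.0, 10.0):
    print(p, check_inequalities(p))
```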
Endpoint \(p=1\). Minkowski reduces to the ordinary triangle inequality termwise:
\[
\sum |u_i+v_i|\le \sum |u_i|+\sum |v_i|.
\]
Case \(p=\infty\). Define \(d_\infty(x,y)=\max_i |x_i-y_i|\). Then for every \(i\),
\(|x_i-y_i|\le |x_i-z_i|+|z_i-y_i|\le d_\infty(x,z)+d_\infty(z,y)\).
Taking the maximum over \(i\) gives the triangle inequality for \(d_\infty\).
(a) The same argument works for countable sums (\(\ell^p\)) and integrals (\(L^p\)) once one identifies functions equal almost everywhere.
(b) For \(0\lt p\lt 1\), the same formula for \(d_p\) does not satisfy the triangle inequality (only a quasi-inequality), so it is not a metric.
Worked Examples in \(\mathbb{R}^n\): Seeing the Triangle Inequality
Example A (numerical check in \(\mathbb{R}^2\)). Take
\(x=(0,0),\ y=(1,2),\ z=(3,3)\). Compute distances:
- \(d_1(x,y)=|1|+|2|=3\), \(d_1(y,z)=|2|+|1|=3\), \(d_1(x,z)=|3|+|3|=6\). Hence \(d_1(x,z)=6\le 3+3=6\) (equality).
- \(d_2(x,y)=\sqrt{1^2+2^2}=\sqrt{5}\), \(d_2(y,z)=\sqrt{2^2+1^2}=\sqrt{5}\), \(d_2(x,z)=\sqrt{3^2+3^2}=3\sqrt{2}\). Then \(3\sqrt{2}\le 2\sqrt{5}\) (true since \(18\le 20\)).
- \(d_\infty(x,y)=\max\{1,2\}=2\), \(d_\infty(y,z)=\max\{2,1\}=2\), \(d_\infty(x,z)=\max\{3,3\}=3\). Thus \(3\le 2+2=4\).
Observation: Equality held for \(p=1\) because each coordinate difference from \(x\to y\) and \(y\to z\) had the same sign (no cancellation).
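The arithmetic of Example A can be reproduced with a few lines of Python. The helper `slack` is a name introduced here; it measures how much longer the detour through \(y\) is than the direct distance, so a slack of zero means equality.

```python
import math

def d_p(x, y, p):
    """d_p for 1 <= p < infinity; pass p = math.inf for the max metric."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(t ** p for t in diffs) ** (1.0 / p)

def slack(x, y, z, p):
    """d_p(x,y) + d_p(y,z) - d_p(x,z): zero exactly when equality holds."""
    return d_p(x, y, p) + d_p(y, z, p) - d_p(x, z, p)

x, y, z = (0, 0), (1, 2), (3, 3)
for p in (1, 2, math.inf):
    print(p, d_p(x, z, p), d_p(x, y, p) + d_p(y, z, p), slack(x, y, z, p))
```

The slack is zero for \(p=1\), positive but small for \(p=2\), and equal to 1 for \(p=\infty\), matching the hand computation.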
Example B (strict inequality for \(p=2\)). Let \(x=(0,0),\ y=(1,0),\ z=(1,1)\). Then
\(d_2(x,z)=\sqrt{2}\), while \(d_2(x,y)+d_2(y,z)=1+1=2\). Hence \(\sqrt{2}<2\). The detour along the legs of a right triangle is longer than the hypotenuse.
Example C (equality in all \(p\)). Take three colinear points with \(y\) between \(x\) and \(z\): \(x=(0,0)\), \(y=(1,0)\), \(z=(3,0)\). Then for any \(1\le p\le \infty\),
\(d_p(x,z)=3\) and \(d_p(x,y)+d_p(y,z)=1+2=3\). Equality holds because there is no directional disagreement across coordinates: every step is along the same ray.
Example D (\(d_\infty\) intuition). Let \(x=(0,0),\ y=(1,4),\ z=(2,5)\). We get
\(d_\infty(x,y)=4\), \(d_\infty(y,z)=1\), \(d_\infty(x,z)=5\). Thus \(5\le 4+1\) (equality, because the second coordinate attains the maximum on both legs). The max-coordinate view says one coordinate controls the distance; the controlling coordinate can change along the path.
Example E (weighted \(\mathbb{R}^n\) is still metric). Fix weights \(w_k>0\) and define
\[d_{p,w}(x,y)=\Big(\sum_{k=1}^n w_k\,|x_k-y_k|^p\Big)^{1/p}.\]
Let \(w=(2,\tfrac12)\), \(x=(0,0), y=(1,2), z=(1,3)\). Then
\(d_{2,w}(x,y)=\sqrt{2\cdot 1^2+\tfrac12\cdot 2^2}=\sqrt{2+2}=2\),
\(d_{2,w}(y,z)=\sqrt{2\cdot 0^2+\tfrac12\cdot 1^2}=\sqrt{\tfrac12}\),
\(d_{2,w}(x,z)=\sqrt{2\cdot 1^2+\tfrac12\cdot 3^2}=\sqrt{2+\tfrac{9}{2}}=\sqrt{\tfrac{13}{2}}\approx2.55\). Indeed \(2.55\lt 2+0.707\). Weights correspond to scaling axes; Minkowski persists under such scalings.
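The remark that weights correspond to scaling axes can be checked numerically: scaling coordinate \(k\) by \(w_k^{1/p}\) and then using the unweighted \(d_p\) should reproduce \(d_{p,w}\). The sketch below uses the data of Example E; the helper names `d_weighted` and `rescale` are illustrative.

```python
import math

def d_p(x, y, p):
    """Unweighted d_p for finite p."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def d_weighted(x, y, w, p):
    """Weighted distance (sum_k w_k |x_k - y_k|^p)^(1/p) with w_k > 0."""
    return sum(wk * abs(a - b) ** p for wk, a, b in zip(w, x, y)) ** (1.0 / p)

def rescale(x, w, p):
    """Scale coordinate k by w_k^(1/p)."""
    return tuple(wk ** (1.0 / p) * a for wk, a in zip(w, x))

w, p = (2.0, 0.5), 2
x, y, z = (0, 0), (1, 2), (1, 3)
for a, b in ((x, y), (y, z), (x, z)):
    # The two printed values should agree up to rounding.
    print(d_weighted(a, b, w, p), d_p(rescale(a, w, p), rescale(b, w, p), p))
```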
When Does Equality Hold? When Is It Strict?
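Notation: \(\|x\|_p=d_p(x,0)\), so that \(d_p(x,y)=\|x-y\|_p\); the triangle inequality for \(d_p\) is equivalent to \(\|x+y\|_p\le\|x\|_p+\|y\|_p\).
For \(p=1\) (real vectors). Since \(|a+b|=|a|+|b|\) iff \(ab\ge 0\) for real \(a,b\), we get
\(\|x+y\|_1=\|x\|_1+\|y\|_1\) precisely when, in each coordinate, the two entries have the same sign or one of them is zero: \(x_k y_k\ge 0\) for all \(k\). Otherwise the inequality is strict.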
For \(p=2\). By the parallelogram picture (or Cauchy–Schwarz), \(\|x+y\|_2=\|x\|_2+\|y\|_2\) holds iff one of \(x,y\) is a nonnegative multiple of the other (they are positively colinear, or one of them is zero). Any nonzero angle between nonzero vectors produces a strict inequality.
For \(p=\infty\). Equality \(\|x+y\|_\infty=\|x\|_\infty+\|y\|_\infty\) holds exactly when some coordinate \(k\) achieves the maximum in both \(x\) and \(y\), that is \(|x_k|=\|x\|_\infty\) and \(|y_k|=\|y\|_\infty\), with matching signs (\(x_k y_k\ge 0\) for real vectors). If no coordinate maximizes both, the inequality is strict.
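These equality conditions can be illustrated with one aligned and one misaligned pair for each of \(p=1,2,\infty\). The vectors below are sample choices made here, and `gap` is a hypothetical helper returning \(\|x\|_p+\|y\|_p-\|x+y\|_p\), which is zero exactly when equality holds.

```python
import math

def norm(x, p):
    """p-norm of a real tuple; p = math.inf gives the max norm."""
    vals = [abs(t) for t in x]
    return max(vals) if p == math.inf else sum(t ** p for t in vals) ** (1.0 / p)

def gap(x, y, p):
    """||x||_p + ||y||_p - ||x + y||_p (zero means equality)."""
    s = [a + b for a, b in zip(x, y)]
    return norm(x, p) + norm(y, p) - norm(s, p)

cases = [
    (1, (1, -2), (3, -1)),        # same sign in every coordinate
    (1, (1, -2), (3, 1)),         # opposite signs in the second coordinate
    (2, (1, 2), (2, 4)),          # y = 2x
    (2, (1, 0), (0, 1)),          # orthogonal vectors
    (math.inf, (1, 4), (0, 3)),   # second coordinate maximizes both, same sign
    (math.inf, (1, 4), (3, 0)),   # different maximizing coordinates
]
for p, x, y in cases:
    print(p, x, y, round(gap(x, y, p), 12))
```

In each pair of cases the first gap is zero (up to rounding) and the second is strictly positive.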
Infinite-Dimensional Example: Sequence Spaces \(\ell^p\)
Let \(\mathbb{F}\in\{\mathbb{R},\mathbb{C}\}\). The space \(\ell^p\) (\(1\le p\lt\infty\)) consists of all sequences
\(x=(x_k)_{k\in\mathbb{N}}\) with
\[
\|x\|_p=\left(\sum_{k=1}^\infty |x_k|^p\right)^{1/p}\lt\infty,\qquad
d_p(x,y)=\|x-y\|_p.
\]
The space \(\ell^\infty\) contains all bounded sequences, with
\[
\|x\|_\infty=\sup_{k\in\mathbb{N}}|x_k|\lt\infty,\qquad
d_\infty(x,y)=\|x-y\|_\infty.
\]
Membership in \(\ell^1\). If \(x_k=1\) or \(x_k=\frac{1}{k}\), then \(\sum |x_k|=\infty\), so \(x\notin\ell^1\).
If \(x_k=\frac{1}{k^2}\), then \(\sum \frac{1}{k^2}=\frac{\pi^2}{6}\), so \(x\in\ell^1\).
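Partial sums give a feel for these membership claims, although numerical evidence is of course not a proof of divergence or convergence. In the sketch below, `partial_sum` is a name chosen here and the cut-offs are arbitrary.

```python
import math

def partial_sum(term, n):
    """Sum of |term(k)| for k = 1, ..., n."""
    return sum(abs(term(k)) for k in range(1, n + 1))

for n in (10, 1000, 100000):
    harmonic = partial_sum(lambda k: 1.0 / k, n)       # grows roughly like log n
    basel = partial_sum(lambda k: 1.0 / k ** 2, n)     # approaches pi^2 / 6
    print(n, harmonic, basel)

print(math.pi ** 2 / 6)
```

The first column of sums keeps growing as \(n\) increases, while the second stabilizes just below \(\pi^2/6\).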
Worked Examples in \(\ell^p\)
Example F (finite support sequences). Let
\(x=(1,\tfrac12,0,0,\dots),\ y=(0,\tfrac12,\tfrac12,0,\dots),\ z=0\). For \(p=1\):
\(\|x\|_1=\tfrac32\), \(\|y\|_1=1\), \(\|x+y\|_1=\|(1,1,\tfrac12,0,\dots)\|_1=2.5\). Then
\(\|x+y\|_1=2.5\le 1.5+1=2.5\) (equality: in every coordinate the two entries have the same sign or one of them is zero, so \(x_k y_k\ge 0\) for all \(k\)).
Example G (\(p=2\) with cancellation). Take \(x=(1,1,0,\dots)\), \(y=(-1,1,0,\dots)\). Then
\(\|x\|_2=\sqrt{2}\), \(\|y\|_2=\sqrt{2}\), but \(x+y=(0,2,0,\dots)\) so \(\|x+y\|_2=2\). Thus \(2\lt \sqrt{2}+\sqrt{2}=2\sqrt{2}\) (strict inequality). The vectors are not aligned; in fact they are orthogonal.
Example H (\(\ell^\infty\)). Let \(x=(1,4,0,\dots)\), \(y=(2,1,5,0,\dots)\).
Then \(\|x\|_\infty=4\), \(\|y\|_\infty=5\), \(x+y=(3,5,5,0,\dots)\) so \(\|x+y\|_\infty=5\). We have
\(5\le 4+5=9\). Equality does not hold since different coordinates maximize \(x\) and \(y\).
Function Spaces: Uniform and \(L^1\) Metrics
Let \(K\in\{\mathbb{R},\mathbb{C}\}\). On the space of bounded functions \(K_b([0,1])=\{f:[0,1]\to K\mid \sup_{t\in[0,1]}|f(t)|\lt\infty\}\), define the uniform metric
\[
d_\infty(f,g)=\sup_{t\in[0,1]}|f(t)-g(t)|.
\]
On the space \(C([0,1])\) of continuous functions, define the \(L^1\) metric
\[
d_1(f,g)=\int_0^1 |f(t)-g(t)|\,dt.
\]
In both cases, the metric is induced by a norm \(\|\,\cdot\,\|_\infty\) or \(\|\,\cdot\,\|_1\), just as in the finite-dimensional setting—sums become integrals.
Worked Examples with Functions
Example I (uniform metric). Let \(f(t)=t\), \(g(t)=t^2\), \(h(t)=0\) on \([0,1]\). Then
\(d_\infty(f,h)=\sup_{t\in[0,1]}|t-0|=1\),
\(d_\infty(h,g)=\sup_{t\in[0,1]}|t^2|=1\),
\(d_\infty(f,g)=\sup_{t\in[0,1]}|t-t^2|=\max_{t\in[0,1]} t(1-t)=1/4\) (attained at \(t=1/2\)).
Triangle inequality: \(\tfrac14\le 1+1\) (very loose here).
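The suprema in Example I can be approximated by sampling on a grid. This is only a sketch: `sup_dist` is a name introduced here, and sampling in general yields a lower bound for the supremum. For these three functions the maximizing points \(t=1\) and \(t=\tfrac12\) happen to lie on the grid.

```python
def sup_dist(f, g, n=1000):
    """Approximate sup over [0,1] of |f - g| using n + 1 equally spaced points."""
    return max(abs(f(k / n) - g(k / n)) for k in range(n + 1))

f = lambda t: t
g = lambda t: t * t
h = lambda t: 0.0

print(sup_dist(f, h), sup_dist(h, g), sup_dist(f, g))
```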
Example J (\(L^1\) metric with step functions). On \([0,1]\), let
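\(f=\mathbf{1}_{[0,1/2]}\), \(g=\mathbf{1}_{[1/2,1]}\), \(h=0\). These step functions are not continuous, so here \(d_1\) is taken on integrable functions, with functions equal almost everywhere identified.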
Then
\(d_1(f,h)=\int_0^{1/2}1\,dt=\tfrac12\),
\(d_1(g,h)=\tfrac12\),
\(d_1(f,g)=\int_0^1 |f-g|=1\).
Thus \(1\le \tfrac12+\tfrac12=1\) (equality). Geometrically: \(f\) and \(g\) are nonnegative and their supports overlap only at the single point \(t=\tfrac12\), so \(|f-g|=f+g\) almost everywhere.
Example K (strict inequality in \(L^1\)). Let \(f=\mathbf{1}_{[0,1]}\), \(g=\mathbf{1}_{[0,1/2]}\), \(h=0\).
Then \(d_1(f,h)=1\), \(d_1(h,g)=\tfrac12\), and
\(d_1(f,g)=\int_0^1 |1-\mathbf{1}_{[0,1/2]}|\,dt=\tfrac12\). Hence \(\tfrac12\lt 1+\tfrac12\) (strict). The overlap causes cancellation in the integrand \(|f-g|\).
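Examples J and K can be checked with a midpoint-rule approximation of the integral. The helpers `l1_dist` and `indicator` are illustrative names, and the midpoint rule is one possible quadrature choice. With \(n=1000\) no midpoint falls on the jump at \(t=\tfrac12\), so the approximation is exact for these step functions.

```python
def l1_dist(f, g, n=1000):
    """Midpoint-rule approximation of the integral over [0,1] of |f - g|."""
    return sum(abs(f((k + 0.5) / n) - g((k + 0.5) / n)) for k in range(n)) / n

def indicator(a, b):
    """Indicator function of the closed interval [a, b]."""
    return lambda t: 1.0 if a <= t <= b else 0.0

zero = lambda t: 0.0

# Example J: supports overlap only at t = 1/2, equality in the triangle inequality.
f, g = indicator(0.0, 0.5), indicator(0.5, 1.0)
print(l1_dist(f, g), l1_dist(f, zero) + l1_dist(zero, g))

# Example K: overlapping supports, strict inequality.
f, g = indicator(0.0, 1.0), indicator(0.0, 0.5)
print(l1_dist(f, g), l1_dist(f, zero) + l1_dist(zero, g))
```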
Triangle Inequality in \(\ell^1\): A Micro-Proof
For \(x,y,z\in\ell^1\),
\[
d_1(x,z)=\sum_{k=1}^\infty |x_k-z_k|
=\sum_{k=1}^\infty |(x_k-y_k)+(y_k-z_k)|
\le \sum_{k=1}^\infty\big(|x_k-y_k|+|y_k-z_k|\big)
= d_1(x,y)+d_1(y,z).
\]
We used \(|u+v|\le |u|+|v|\) term-by-term and the linearity of infinite sums of nonnegative terms.
Why \(0<p<1\) Fails the Triangle Inequality
Concrete counterexample (\(p=\tfrac12\)). In \(\mathbb{R}^2\) let \(x=(1,0)\), \(z=(0,1)\), \(y=(0,0)\). Then
\(d_p(x,y)=1\), \(d_p(y,z)=1\), but
\(d_p(x,z)=\big(|1|^{p}+|1|^{p}\big)^{1/p}=(2)^{1/p}\).
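If \(p=\tfrac12\), \((2)^{1/p}=2^2=4\). Thus \(d_p(x,z)=4\not\le 1+1=2\). More generally, \(2^{1/p}\gt 2\) for every \(0\lt p\lt 1\), so the same three points violate the triangle inequality and \(d_p\) is not a metric for any \(p\lt 1\).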
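The counterexample can be run for several exponents at once. In this sketch the formula for \(d_p\) is applied verbatim, including for \(p\lt 1\) where it is not a metric; the list of exponents is an arbitrary choice.

```python
def d_p(x, y, p):
    """The d_p formula, applied for any p > 0 (a metric only when p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

x, y, z = (1, 0), (0, 0), (0, 1)
for p in (0.25, 0.5, 0.75, 1.0, 2.0):
    direct = d_p(x, z, p)
    detour = d_p(x, y, p) + d_p(y, z, p)
    print(p, direct, detour, direct <= detour)
```

The detour always has length 2, while the direct value is \(2^{1/p}\); the comparison fails for the three exponents below 1 and holds for \(p=1\) and \(p=2\).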
Practice: Triangle Inequality Across Settings
- Numerical check. In \(\mathbb{R}^2\), use \(x=(2,-1)\), \(y=(-1,3)\), \(z=(0,0)\). Compute all pairwise distances in \(d_1, d_2, d_\infty\) and test the inequality. Where is it tightest?
- Sign patterns in \(\ell^1\). Give \(x,y\in\mathbb{R}^3\) with mixed signs so that \(\|x+y\|_1<\|x\|_1+\|y\|_1\), and explain the role of cancellation.
- Uniform vs. \(L^1\). On \([0,1]\), take \(f(t)=t\), \(g(t)=\tfrac12\), \(h(t)=0\). Compare the triangle inequality numerically in \(d_\infty\) and in \(d_1\).
- Weights. Show that choosing positive weights \(w_k\) still yields a metric \(d_{p,w}\) by reducing to the unweighted case via the substitution \(u_k=w_k^{1/p}(x_k-y_k)\).