Metric spaces

Purpose & Context (from Dr. Oussa)

These notes introduce the formal notion of a metric space. My aim is that you build a precise definition, develop a geometric picture, and learn to recognize metrics that arise in familiar and less familiar settings (finite-dimensional vectors, sequences, and functions). Carry the following guiding image: in a metric space, detours are never shorter than the direct route—this is the heart of the triangle inequality.

Learning outcomes. After reading, you should be able to (i) state the metric axioms; (ii) test whether a proposed formula is a metric; (iii) form the subspace (inherited) metric; (iv) compute with \(d_p\) and \(d_\infty\) on \(\mathbb{R}^n\), with \(\ell^p\) on sequences, and with uniform and \(L^1\) metrics on function spaces.

Definition: Metric and Metric Space

Let \(X\) be a nonempty set. A metric on \(X\) is a function \(d:X\times X\to\mathbb{R}\) such that for all \(x,y,z\in X\):

Non-negativity: \(d(x,y)\ge 0\).
Identity of indiscernibles: \(d(x,y)=0\) iff \(x=y\).
Symmetry: \(d(x,y)=d(y,x)\).
Triangle inequality: \(d(x,z)\le d(x,y)+d(y,z)\).

The pair \((X,d)\) is called a metric space, and \(d(x,y)\) is the distance between \(x\) and \(y\).

Remarks.
(a) Because of symmetry, the map \((x,y)\mapsto d(x,y)\) is not injective as soon as \(X\) has at least two points: for \(x\ne y\), the distinct ordered pairs \((x,y)\) and \((y,x)\) produce the same value.
(b) The triangle inequality encodes the geometric intuition that any detour \(x\to y\to z\) is at least as long as the direct comparison \(x\to z\).
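To make the axioms concrete, here is a minimal Python sketch that tests them by brute force on a finite sample of points. The helper name `check_metric_axioms` and the sample values are illustrative choices, not part of the notes. A clean pass on a sample does not prove that a formula is a metric, but a single failure does disprove it.

```python
from itertools import product

def check_metric_axioms(d, points, tol=1e-12):
    """Return the names of the axioms that fail for d on a finite sample of points."""
    failures = set()
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:
            failures.add("non-negativity")
        if (abs(d(x, y)) <= tol) != (x == y):
            failures.add("identity of indiscernibles")
        if abs(d(x, y) - d(y, x)) > tol:
            failures.add("symmetry")
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            failures.add("triangle inequality")
    return sorted(failures)

sample = [-2.0, -0.5, 0.0, 1.0, 3.0]
print(check_metric_axioms(lambda x, y: abs(x - y), sample))    # usual distance on R
print(check_metric_axioms(lambda x, y: (x - y) ** 2, sample))  # squared distance
```

The usual distance passes every test on this sample, while the squared distance fails the triangle inequality: from \(-2\) to \(3\) it gives \(25\), but the detour through \(0\) costs only \(4+9=13\).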

Subspace (Inherited) Metrics

If \(Y\subseteq X\) and \(d\) is a metric on \(X\), then the restriction of \(d\) to \(Y\times Y\) is a metric on \(Y\). We say that \(Y\) is endowed with the metric inherited from \((X,d)\).

Example (a line sitting inside the plane). With the Euclidean metric on \(\mathbb{R}^2\), a straight line \(L\subset\mathbb{R}^2\) becomes a metric space when we restrict the same distance formula to points of \(L\). The metric comes from the ambient space rather than being defined intrinsically on \(L\).

Quick check. Let \(X=\mathbb{R}^2\) with \(d_2(x,y)=\sqrt{(x_1-y_1)^2+(x_2-y_2)^2}\). Show the restriction of \(d_2\) to the unit circle \(S^1\) is a metric on \(S^1\). (It is not the arc-length distance, just the ambient Euclidean distance between points on the circle.)
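The difference between the inherited distance and the arc-length distance can be seen numerically. The sketch below assumes points of \(S^1\) are described by their angles; the function names `chord_distance` and `arc_distance` are hypothetical helpers introduced only for this illustration.

```python
import math

def chord_distance(s, t):
    """Inherited Euclidean distance between the points at angles s and t on S^1."""
    x = (math.cos(s), math.sin(s))
    y = (math.cos(t), math.sin(t))
    return math.hypot(x[0] - y[0], x[1] - y[1])

def arc_distance(s, t):
    """Length of the shorter arc between the same two points (for comparison only)."""
    gap = abs(s - t) % (2 * math.pi)
    return min(gap, 2 * math.pi - gap)

s, t = 0.0, math.pi / 2
print(chord_distance(s, t))  # sqrt(2), about 1.414
print(arc_distance(s, t))    # pi/2, about 1.571
```

For a quarter turn the chord is shorter than the arc, as expected: the inherited metric measures straight through the plane, not along the circle.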

Canonical Metrics on Finite-Dimensional Spaces

On \(\mathbb{R}\), \(d(x,y)=|x-y|\) is the prototype. On \(\mathbb{R}^n\) (or \(\mathbb{C}^n\)), for \(1\le p\le \infty\) we set

\[
d_p(x,y)=\left(\sum_{k=1}^n |x_k-y_k|^p\right)^{1/p},\qquad
d_\infty(x,y)=\max_{1\le k\le n}|x_k-y_k|.
\]

Comment. Though these metrics can measure distance differently, on any fixed finite dimension they generate the same notion of convergence. (We will come back to this later.)
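The formulas above translate directly into code. This is a small sketch, with the function name `d_p` and the use of `math.inf` for \(p=\infty\) chosen here for convenience; it assumes the two inputs have the same length.

```python
import math

def d_p(x, y, p):
    """d_p distance between equal-length coordinate sequences; p may be math.inf."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(t ** p for t in diffs) ** (1.0 / p)

x, y = (0.0, 0.0), (3.0, 4.0)
print(d_p(x, y, 1))         # 7.0
print(d_p(x, y, 2))         # 5.0
print(d_p(x, y, math.inf))  # 4.0
```

The three values differ, but they are tied together by the standard comparison \(d_\infty\le d_p\le n^{1/p}\,d_\infty\) on \(\mathbb{R}^n\) (here \(4\le 5\le \sqrt2\cdot 4\)), which is the reason the metrics share the same convergent sequences.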

Discrete metric. On any nonempty set \(X\), define
\[
d_{\text{disc}}(x,y)=
\begin{cases}
0,& x=y,\\
1,& x\ne y.
\end{cases}
\]
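A short check, not spelled out above, that \(d_{\text{disc}}\) satisfies the triangle inequality (the other three axioms can be read off the definition). If \(x=z\), then \(d_{\text{disc}}(x,z)=0\) and there is nothing to prove. If \(x\ne z\), then \(y\) cannot equal both of them, so
\[
d_{\text{disc}}(x,y)+d_{\text{disc}}(y,z)\ \ge\ 1\ =\ d_{\text{disc}}(x,z).
\]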

Geometric Intuition for the Triangle Inequality in \(d_p\)

Unit balls picture (no figure needed). Write \(\|x\|_p=d_p(x,0)\) for the distance from \(x\) to the origin. The set \(\{x\in\mathbb{R}^2: \|x\|_p\le 1\}\) is: a diamond (\(p=1\)), a round disk (\(p=2\)), and a square aligned with the axes (\(p=\infty\)). The inequality \(\|x+ y\|_p\le \|x\|_p+\|y\|_p\) says: \(x\) lies in the unit ball scaled by \(\|x\|_p\), \(y\) lies in the unit ball scaled by \(\|y\|_p\), and when the two vectors are placed tip-to-tail, the sum \(x+y\) still lands within the ball scaled by \(\|x\|_p+\|y\|_p\).

Taxi vs. Air distance. In \(\mathbb{R}^2\), \(d_1\) counts horizontal + vertical blocks (detours allowed), \(d_2\) is the straight-line distance, \(d_\infty\) is the longest coordinate difference. For any three intersections (points) \(x,y,z\), the taxi route \(x\to y\to z\) cannot be shorter than going directly in the same taxi metric; the same holds for \(d_2\) and \(d_\infty\).

Theorem: \(d_p\) is a Metric on \(\mathbb{K}^n\) — Proof Without Norms

Statement. Let \(1\le p<\infty\) and \(x=(x_i)_{i=1}^n,\ y=(y_i)_{i=1}^n\in\mathbb{K}^n\) with \(\mathbb{K}\in\{\mathbb{R},\mathbb{C}\}\). Define \[ d_p(x,y)=\Big(\sum_{i=1}^n |x_i-y_i|^{\,p}\Big)^{1/p}. \] Then \(d_p\) is a metric. For \(p=\infty\), \(d_\infty(x,y)=\max_{1\le i\le n}|x_i-y_i|\) is also a metric.

Nonnegativity. Each \(|x_i-y_i|^p\ge 0\), hence \(d_p(x,y)\ge 0\).

Identity of indiscernibles. If \(d_p(x,y)=0\), then \(\sum |x_i-y_i|^p=0\). A sum of nonnegative terms is zero iff every term is zero; thus \(|x_i-y_i|=0\) for all \(i\), i.e. \(x=y\). Conversely, \(x=y\Rightarrow d_p(x,y)=0\).

Symmetry. \(|x_i-y_i|=|y_i-x_i|\) termwise; hence \(d_p(x,y)=d_p(y,x)\).

Triangle inequality. Fix \(z\in\mathbb{K}^n\) and set \(u_i=|x_i-z_i|\), \(v_i=|z_i-y_i|\). Then \(|x_i-y_i|\le u_i+v_i\), so
\[
d_p(x,y)=\Big(\sum |x_i-y_i|^p\Big)^{1/p}\le \Big(\sum (u_i+v_i)^p\Big)^{1/p}.
\]
By Minkowski’s inequality for finite sums (proved below),
\[
\Big(\sum (u_i+v_i)^p\Big)^{1/p}\le \Big(\sum u_i^p\Big)^{1/p}+\Big(\sum v_i^p\Big)^{1/p}
= d_p(x,z)+d_p(z,y).
\]
This yields \(d_p(x,y)\le d_p(x,z)+d_p(z,y)\).

Key inequalities used (no “norms”). Let \(p\in(1,\infty)\) and \(q=\frac{p}{p-1}\).

Young’s inequality. For \(a,b\ge 0\),
\[
ab\le \frac{a^p}{p}+\frac{b^q}{q}.
\]
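Proof sketch: for fixed \(a\), maximize \(\varphi(b)=ab-\tfrac{b^q}{q}\) over \(b\ge 0\). The optimum is at \(b=a^{p-1}\), where \(\varphi(b)=a^p-\tfrac{a^p}{q}=\tfrac{a^p}{p}\) (using \((p-1)q=p\)); hence \(ab-\tfrac{b^q}{q}\le\tfrac{a^p}{p}\) for all \(b\ge 0\).
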
Hölder’s inequality (finite sums). For nonnegative sequences \((\alpha_i),(\beta_i)\),
\[
\sum_{i=1}^n \alpha_i\beta_i\ \le\ \Big(\sum \alpha_i^{p}\Big)^{1/p}\Big(\sum \beta_i^{q}\Big)^{1/q}.
\]
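Proof: Let \(A=(\sum \alpha_i^p)^{1/p}\), \(B=(\sum \beta_i^q)^{1/q}\). If \(A=0\) or \(B=0\), then all \(\alpha_i\) or all \(\beta_i\) vanish and both sides are zero. Otherwise apply Young termwise to \(\tilde\alpha_i=\alpha_i/A\), \(\tilde\beta_i=\beta_i/B\) and sum to get \(\sum \tilde\alpha_i\tilde\beta_i\le \tfrac1p+\tfrac1q=1\); multiplying by \(AB\) gives the claim.
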
Minkowski’s inequality (finite sums). For nonnegative \((u_i),(v_i)\),
\[
\Big(\sum (u_i+v_i)^p\Big)^{1/p}\ \le\ \Big(\sum u_i^{p}\Big)^{1/p}+\Big(\sum v_i^{p}\Big)^{1/p}.
\]
Proof: Let \(S=\big(\sum (u_i+v_i)^p\big)^{1/p}\). If \(S=0\) we are done. Otherwise,
\[
\sum (u_i+v_i)^p=\sum u_i(u_i+v_i)^{p-1}+\sum v_i(u_i+v_i)^{p-1}.
\]
Apply Hölder to each sum with conjugate exponents \(p,q\), noting \((p-1)q=p\), to get
\[
S^p \le \Big(\sum u_i^p\Big)^{1/p}S^{p/q}+\Big(\sum v_i^p\Big)^{1/p}S^{p/q}.
\]
Divide by \(S^{p/q}>0\) to conclude \(S\le (\sum u_i^p)^{1/p}+(\sum v_i^p)^{1/p}\).

Endpoint \(p=1\). Minkowski reduces to the ordinary triangle inequality termwise:
\[
\sum |u_i+v_i|\le \sum |u_i|+\sum |v_i|.
\]
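As a sanity check on the three inequalities, the following sketch evaluates both sides of Young, Hölder, and Minkowski on random nonnegative data. The exponent \(p=3\), the seed, the vector length, and the helper name `p_sum` are arbitrary choices made for this illustration; a numerical check of this kind supports the statements but does not replace the proofs.

```python
import random

random.seed(0)
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so 1/p + 1/q = 1

def p_sum(values, r):
    """(sum of values**r) ** (1/r) for nonnegative values."""
    return sum(v ** r for v in values) ** (1.0 / r)

u = [random.uniform(0.0, 5.0) for _ in range(6)]
v = [random.uniform(0.0, 5.0) for _ in range(6)]

# Small tolerance guards against floating-point rounding.
young_ok = all(a * b <= a ** p / p + b ** q / q + 1e-12 for a, b in zip(u, v))
holder_ok = sum(a * b for a, b in zip(u, v)) <= p_sum(u, p) * p_sum(v, q) + 1e-12
minkowski_ok = p_sum([a + b for a, b in zip(u, v)], p) <= p_sum(u, p) + p_sum(v, p) + 1e-12

print(young_ok, holder_ok, minkowski_ok)
```

Young's inequality becomes an equality at \(b=a^{p-1}\): with \(p=3\), \(a=2\), \(b=4\) both sides equal \(8\).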

Remarks.
(a) The same argument works for countable sums (\(\ell^p\)) and integrals (\(L^p\)) once one identifies functions equal almost everywhere.
(b) For \(0\lt p\lt 1\), the formula for \(d_p\) does not satisfy the triangle inequality (only a quasi-triangle inequality, with an extra constant factor), so it is not a metric.

Worked Examples in \(\mathbb{R}^n\): Seeing the Triangle Inequality

Example A (numerical check in \(\mathbb{R}^2\)). Take
\(x=(0,0),\ y=(1,2),\ z=(3,3)\). Compute distances:

\(d_1(x,y)=|1|+|2|=3\), \(d_1(y,z)=|2|+|1|=3\), \(d_1(x,z)=|3|+|3|=6\). Hence \(d_1(x,z)=6\le 3+3=6\) (equality).
\(d_2(x,y)=\sqrt{1^2+2^2}=\sqrt{5}\), \(d_2(y,z)=\sqrt{2^2+1^2}=\sqrt{5}\), \(d_2(x,z)=\sqrt{3^2+3^2}=3\sqrt{2}\). Then
\(3\sqrt{2}\le 2\sqrt{5}\) (true since \(18\le 20\)).
\(d_\infty(x,y)=\max\{1,2\}=2\), \(d_\infty(y,z)=\max\{2,1\}=2\), \(d_\infty(x,z)=\max\{3,3\}=3\). Thus \(3\le 2+2=4\).

Observation: Equality held for \(p=1\) because each coordinate difference from \(x\to y\) and \(y\to z\) had the same sign (no cancellation).

Example B (strict inequality for \(p=2\)). Let \(x=(0,0),\ y=(1,0),\ z=(1,1)\). Then
\(d_2(x,z)=\sqrt{2}\), while \(d_2(x,y)+d_2(y,z)=1+1=2\). Hence \(\sqrt{2}<2\). The detour along the legs of a right triangle is longer than the hypotenuse.

Example C (equality in all \(p\)). Take three colinear points with \(y\) between \(x\) and \(z\): \(x=(0,0)\), \(y=(1,0)\), \(z=(3,0)\). Then for any \(1\le p\le \infty\),
\(d_p(x,z)=3\) and \(d_p(x,y)+d_p(y,z)=1+2=3\). Equality holds because there is no directional disagreement across coordinates: every step is along the same ray.

Example D (\(d_\infty\) intuition). Let \(x=(0,0),\ y=(1,4),\ z=(2,5)\). We get
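\(d_\infty(x,y)=4\), \(d_\infty(y,z)=1\), \(d_\infty(x,z)=5\). Thus \(5\le 4+1=5\) (equality): the second coordinate attains the maximum in all three distances and increases at each step. The max–coordinate view says one coordinate controls each distance; in general the controlling coordinate can change along the path.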

Example E (weighted \(\mathbb{R}^n\) is still metric). Fix weights \(w_k>0\) and define
\[d_{p,w}(x,y)=\Big(\sum_{k=1}^n w_k\,|x_k-y_k|^p\Big)^{1/p}.\]
Let \(w=(2,\tfrac12)\), \(x=(0,0), y=(1,2), z=(1,3)\). Then
\(d_{2,w}(x,y)=\sqrt{2\cdot 1^2+\tfrac12\cdot 2^2}=\sqrt{2+2}=2\),
\(d_{2,w}(y,z)=\sqrt{2\cdot 0^2+\tfrac12\cdot 1^2}=\sqrt{\tfrac12}\),
\(d_{2,w}(x,z)=\sqrt{2\cdot 1^2+\tfrac12\cdot 3^2}=\sqrt{2+\tfrac{9}{2}}=\sqrt{\tfrac{13}{2}}\approx2.55\). Indeed \(2.55\lt 2+0.707\). Weights correspond to scaling axes; Minkowski persists under such scalings.
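The numbers in Example E can be reproduced with a few lines of Python. The function name `d_pw` is a hypothetical helper for this sketch; it assumes the weights are positive and that all inputs have the same length.

```python
def d_pw(x, y, w, p):
    """Weighted distance (sum_k w_k |x_k - y_k|^p)^(1/p) for positive weights w_k."""
    return sum(wk * abs(a - b) ** p for wk, a, b in zip(w, x, y)) ** (1.0 / p)

w = (2.0, 0.5)
x, y, z = (0.0, 0.0), (1.0, 2.0), (1.0, 3.0)

print(d_pw(x, y, w, 2))  # 2.0
print(d_pw(y, z, w, 2))  # sqrt(1/2), about 0.707
print(d_pw(x, z, w, 2))  # sqrt(13/2), about 2.550
```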

Try it. Pick your own triple of points in \(\mathbb{R}^2\). Compute \(d_1, d_2, d_\infty\) for each pair and verify \(d_p(x,z)\le d_p(x,y)+d_p(y,z)\).
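If you would like to check your hand computations, a sketch along the following lines may help. It reports the slack \(d_p(x,y)+d_p(y,z)-d_p(x,z)\) for a triple of points; the names `dist` and `triangle_slack` are illustrative, and the triple shown is the one from Example A, so swap in your own.

```python
import math

def dist(x, y, p):
    """d_p distance; p may be math.inf."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    return max(diffs) if p == math.inf else sum(t ** p for t in diffs) ** (1.0 / p)

def triangle_slack(x, y, z, p):
    """d_p(x,y) + d_p(y,z) - d_p(x,z); nonnegative exactly when the inequality holds."""
    return dist(x, y, p) + dist(y, z, p) - dist(x, z, p)

x, y, z = (0, 0), (1, 2), (3, 3)
for p in (1, 2, math.inf):
    print(p, round(triangle_slack(x, y, z, p), 4))
```

For the Example A triple the slack is \(0\) for \(p=1\), about \(0.23\) for \(p=2\), and \(1\) for \(p=\infty\), matching the computations above.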

When Does Equality Hold? When Is It Strict?

For \(p=1\). Since \(|a+b|=|a|+|b|\) iff \(ab\ge 0\), we get
\(\|x+y\|_1=\|x\|_1+\|y\|_1\) precisely when each coordinate has the same sign (or one is zero): \(x_k y_k\ge 0\) for all \(k\). Otherwise the inequality is strict.

For \(p=2\). By the parallelogram picture (or Cauchy–Schwarz), \(\|x+y\|_2=\|x\|_2+\|y\|_2\) happens iff one of \(x,y\) is a nonnegative multiple of the other, i.e. they point in the same direction (positively colinear) or one of them is zero. For nonzero \(x,y\), any nonzero angle between them produces a strict inequality.

For \(p=\infty\). Equality \(\|x+y\|_\infty=\|x\|_\infty+\|y\|_\infty\) holds exactly when some coordinate \(k\) achieves the maximum in both \(x\) and \(y\) with matching signs: \(|x_k|=\|x\|_\infty\), \(|y_k|=\|y\|_\infty\), and \(x_k y_k\ge 0\). If no such coordinate exists, the inequality is strict.
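The three criteria can disagree on the same pair of vectors. The sketch below (helper names `norm_p` and `equality_gap` are hypothetical) computes \(\|x\|_p+\|y\|_p-\|x+y\|_p\) for one pair chosen here to illustrate the \(p=\infty\) case.

```python
import math

def norm_p(x, p):
    """||x||_p for a finite real vector; p may be math.inf."""
    mags = [abs(t) for t in x]
    return max(mags) if p == math.inf else sum(m ** p for m in mags) ** (1.0 / p)

def equality_gap(x, y, p):
    """||x||_p + ||y||_p - ||x + y||_p; zero means equality in the triangle inequality."""
    s = [a + b for a, b in zip(x, y)]
    return norm_p(x, p) + norm_p(y, p) - norm_p(s, p)

x, y = (1.0, -1.0), (1.0, 1.0)
for p in (1, 2, math.inf):
    print(p, equality_gap(x, y, p))
```

For this pair the gap is \(2\) when \(p=1\) (the second coordinates have opposite signs), about \(0.83\) when \(p=2\) (the vectors are perpendicular), and \(0\) when \(p=\infty\) (the first coordinate is maximal in both vectors, with the same sign).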

Check. Find nonzero \(x,y\in\mathbb{R}^2\) so that equality holds for \(p=1\) but is strict for \(p=2\) and \(p=\infty\).

Infinite-Dimensional Example: Sequence Spaces \(\ell^p\)

Let \(\mathbb{F}\in\{\mathbb{R},\mathbb{C}\}\). The space \(\ell^p\) (\(1\le p\lt\infty\)) consists of all sequences
\(x=(x_k)_{k\in\mathbb{N}}\) with

\[
\|x\|_p=\left(\sum_{k=1}^\infty |x_k|^p\right)^{1/p}\lt\infty,\qquad
d_p(x,y)=\|x-y\|_p.
\]

The space \(\ell^\infty\) contains all bounded sequences, with

\[
\|x\|_\infty=\sup_{k\in\mathbb{N}}|x_k|\lt\infty,\qquad
d_\infty(x,y)=\|x-y\|_\infty.
\]

Membership in \(\ell^1\). If \(x_k=1\) or \(x_k=\frac{1}{k}\), then \(\sum |x_k|=\infty\), so \(x\notin\ell^1\).
If \(x_k=\frac{1}{k^2}\), then \(\sum \frac{1}{k^2}=\frac{\pi^2}{6}\), so \(x\in\ell^1\).
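Partial sums give a feel for the difference between these sequences. In the sketch below (the helper `partial_sum` and the cut-offs are choices made for this illustration), the harmonic partial sums keep growing, roughly like \(\ln n\), while the partial sums of \(1/k^2\) settle near \(\pi^2/6\). Numerics of this kind only suggest convergence or divergence; they do not prove it.

```python
import math

def partial_sum(term, n):
    """Sum of term(k) for k = 1, ..., n."""
    return sum(term(k) for k in range(1, n + 1))

for n in (10, 1000, 100000):
    harmonic = partial_sum(lambda k: 1.0 / k, n)
    basel = partial_sum(lambda k: 1.0 / k ** 2, n)
    print(n, round(harmonic, 3), round(basel, 6))

print(round(math.pi ** 2 / 6, 6))  # limit of the second column
```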

Standard basis. The sequences \(\delta_n=(0,0,\dots,1,0,\dots)\) (a \(1\) in the \(n\)-th slot and \(0\) elsewhere) play the role of the canonical basis vectors in these sequence spaces.

Worked Examples in \(\ell^p\)

Example F (finite support sequences). Let
\(x=(1,\tfrac12,0,0,\dots),\ y=(0,\tfrac12,\tfrac12,0,\dots),\ z=0\). For \(p=1\):
\(\|x\|_1=\tfrac32\), \(\|y\|_1=1\), \(\|x+y\|_1=\|(1,1,\tfrac12,0,\dots)\|_1=2.5\). Then
\(\|x+y\|_1=2.5\le 1.5+1=2.5\) (equality: in every coordinate \(x_k y_k\ge 0\), so \(|x_k+y_k|=|x_k|+|y_k|\) termwise, including the first and third coordinates where one of the two entries is zero).

Example G (\(p=2\) with cancellation). Take \(x=(1,1,0,\dots)\), \(y=(-1,1,0,\dots)\). Then
\(\|x\|_2=\sqrt{2}\), \(\|y\|_2=\sqrt{2}\), but \(x+y=(0,2,0,\dots)\) so \(\|x+y\|_2=2\). Thus \(2\lt \sqrt{2}+\sqrt{2}=2\sqrt{2}\) (strict inequality). The vectors are not aligned: the first coordinates cancel.

Example H (\(\ell^\infty\)). Let \(x=(1,4,0,\dots)\), \(y=(2,1,5,0,\dots)\).
Then \(\|x\|_\infty=4\), \(\|y\|_\infty=5\), \(x+y=(3,5,5,0,\dots)\) so \(\|x+y\|_\infty=5\). We have
\(5\le 4+5=9\). Equality does not hold since different coordinates maximize \(x\) and \(y\).
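Examples F, G, and H can be checked by machine if finitely supported sequences are stored as lists of their leading terms, with all later terms understood to be zero. That representation and the helper names `norm_seq` and `add_seq` are assumptions of this sketch.

```python
import math

def norm_seq(x, p):
    """l^p norm of a finitely supported sequence given by its leading terms."""
    mags = [abs(t) for t in x]
    if p == math.inf:
        return max(mags, default=0.0)
    return sum(m ** p for m in mags) ** (1.0 / p)

def add_seq(x, y):
    """Termwise sum, padding the shorter list with zeros."""
    n = max(len(x), len(y))
    x = list(x) + [0.0] * (n - len(x))
    y = list(y) + [0.0] * (n - len(y))
    return [a + b for a, b in zip(x, y)]

print(norm_seq(add_seq([1.0, 0.5], [0.0, 0.5, 0.5]), 1))          # Example F: 2.5
print(norm_seq(add_seq([1.0, 1.0], [-1.0, 1.0]), 2))              # Example G: 2.0
print(norm_seq(add_seq([1.0, 4.0], [2.0, 1.0, 5.0]), math.inf))   # Example H: 5.0
```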

Function Spaces: Uniform and \(L^1\) Metrics

Let \(K\in\{\mathbb{R},\mathbb{C}\}\). On the space of bounded functions \(K_b([0,1])=\{f:[0,1]\to K\mid \sup_{t\in[0,1]}|f(t)|\lt\infty\}\), define the uniform metric
\[
d_\infty(f,g)=\sup_{t\in[0,1]}|f(t)-g(t)|.
\]
On the space \(C([0,1])\) of continuous functions, define the \(L^1\) metric
\[
d_1(f,g)=\int_0^1 |f(t)-g(t)|\,dt.
\]
In both cases, the metric is induced by a norm \(\|\,\cdot\,\|_\infty\) or \(\|\,\cdot\,\|_1\), just as in the finite-dimensional setting—sums become integrals.

Show that \((K_b([0,1]),d_\infty)\) is a metric space. (Where do we use the supremum?)

Worked Examples with Functions

Example I (uniform metric). Let \(f(t)=t\), \(g(t)=t^2\), \(h(t)=0\) on \([0,1]\). Then
\(d_\infty(f,h)=\sup_{t\in[0,1]}|t-0|=1\),
\(d_\infty(h,g)=\sup_{t\in[0,1]}|t^2|=1\),
\(d_\infty(f,g)=\sup_{t\in[0,1]}|t-t^2|=\max_{t\in[0,1]} t(1-t)=1/4\) (attained at \(t=1/2\)).
Triangle inequality: \(\tfrac14\le 1+1\) (very loose here).
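The suprema in Example I can be approximated by sampling on a grid. This is only a sketch: a grid maximum is a lower bound for the true supremum and is reliable here because the functions are continuous and the grid happens to contain the maximizers. The helper name `sup_distance` and the grid size are choices made for this illustration.

```python
def sup_distance(f, g, n=10001):
    """Grid approximation of sup |f - g| over [0, 1]; a lower bound for the true supremum."""
    return max(abs(f(k / (n - 1)) - g(k / (n - 1))) for k in range(n))

f = lambda t: t
g = lambda t: t * t
h = lambda t: 0.0

print(sup_distance(f, h))  # 1.0
print(sup_distance(h, g))  # 1.0
print(sup_distance(f, g))  # 0.25, attained at the grid point t = 1/2
```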

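Example J (\(L^1\) metric with step functions). Step functions are not continuous, so here \(d_1\) is taken on integrable functions on \([0,1]\), with functions that agree almost everywhere identified. Let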
\(f=\mathbf{1}_{[0,1/2]}\), \(g=\mathbf{1}_{[1/2,1]}\), \(h=0\).
Then
\(d_1(f,h)=\int_0^{1/2}1\,dt=\tfrac12\),
\(d_1(g,h)=\tfrac12\),
\(d_1(f,g)=\int_0^1 |f-g|=1\).
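Thus \(1\le \tfrac12+\tfrac12=1\) (equality). Geometrically: \(f\) and \(g\) are nonnegative and their supports overlap only at the single point \(t=\tfrac12\), which does not affect the integral, so nothing cancels in \(f-g\).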

Example K (strict inequality in \(L^1\)). Let \(f=\mathbf{1}_{[0,1]}\), \(g=\mathbf{1}_{[0,1/2]}\), \(h=0\).
Then \(d_1(f,h)=1\), \(d_1(h,g)=\tfrac12\), and
\(d_1(f,g)=\int_0^1 |1-\mathbf{1}_{[0,1/2]}|=\tfrac12\). Hence \(\tfrac12\lt 1+\tfrac12\) (strict). The supports overlap on \([0,\tfrac12]\), where \(f-g\) cancels to zero.
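The integrals in Examples J and K are simple enough to approximate with a midpoint rule. The sketch below uses hypothetical helpers `l1_distance` and `indicator`; the number of subintervals is an arbitrary choice, and the rule is exact here only up to floating-point rounding because the jump at \(t=\tfrac12\) falls on a cell boundary.

```python
def l1_distance(f, g, n=1000):
    """Midpoint-rule approximation of the integral of |f - g| over [0, 1]."""
    width = 1.0 / n
    return sum(abs(f((k + 0.5) * width) - g((k + 0.5) * width)) for k in range(n)) * width

def indicator(a, b):
    """Indicator function of the closed interval [a, b]."""
    return lambda t: 1.0 if a <= t <= b else 0.0

zero = lambda t: 0.0

# Example J: approximately 0.5, 0.5, 1.0 (equality)
f, g = indicator(0.0, 0.5), indicator(0.5, 1.0)
print(l1_distance(f, zero), l1_distance(g, zero), l1_distance(f, g))

# Example K: approximately 1.0, 0.5, 0.5 (strict inequality)
f, g = indicator(0.0, 1.0), indicator(0.0, 0.5)
print(l1_distance(f, zero), l1_distance(zero, g), l1_distance(f, g))
```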

Triangle Inequality in \(\ell^1\): A Micro-Proof

For \(x,y,z\in\ell^1\),
\[
d_1(x,z)=\sum_{k=1}^\infty |x_k-z_k|
=\sum_{k=1}^\infty |(x_k-y_k)+(y_k-z_k)|
\le \sum_{k=1}^\infty\big(|x_k-y_k|+|y_k-z_k|\big)
= d_1(x,y)+d_1(y,z).
\]
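We used \(|u+v|\le |u|+|v|\) term-by-term and the fact that convergent series of nonnegative terms can be added term by term. The right-hand side is finite because \(x-y\) and \(y-z\) belong to \(\ell^1\).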

Why \(0<p<1\) Fails the Triangle Inequality

Concrete counterexample (\(p=\tfrac12\)). In \(\mathbb{R}^2\) let \(x=(1,0)\), \(z=(0,1)\), \(y=(0,0)\). Then
\(d_p(x,y)=1\), \(d_p(y,z)=1\), but
\(d_p(x,z)=\big(|1|^{p}+|1|^{p}\big)^{1/p}=(2)^{1/p}\).
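If \(p=\tfrac12\), \((2)^{1/p}=2^2=4\). Thus \(d_p(x,z)=4\not\le 1+1=2\). More generally, \(2^{1/p}>2\) whenever \(0\lt p\lt 1\), so the same three points show that \(d_p\) is not a metric for any \(p\lt 1\).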

Moral. For \(p<1\), the mapping \(\|\cdot\|_p\) is a quasi-norm; the triangle inequality holds only up to a constant factor.
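The counterexample can be run for several exponents at once. In this sketch the function `d_p` simply evaluates the formula for any \(p>0\); the list of exponents is an arbitrary choice.

```python
def d_p(x, y, p):
    """The d_p formula, evaluated for any p > 0 (a metric only when p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

x, y, z = (1.0, 0.0), (0.0, 0.0), (0.0, 1.0)
for p in (0.25, 0.5, 0.75, 1.0, 2.0):
    direct = d_p(x, z, p)
    detour = d_p(x, y, p) + d_p(y, z, p)
    print(p, direct, detour, direct <= detour)
```

The last column is `False` for each \(p\lt 1\) and `True` for \(p=1\) and \(p=2\): the detour through the origin always costs \(2\), while the direct distance is \(2^{1/p}\).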

Practice: Triangle Inequality Across Settings

Numerical check. In \(\mathbb{R}^2\), use \(x=(2,-1)\), \(y=(-1,3)\), \(z=(0,0)\). Compute all pairwise distances in \(d_1, d_2, d_\infty\) and test the inequality. Where is it tightest?
Sign patterns in \(\ell^1\). Give \(x,y\in\mathbb{R}^3\) with mixed signs so that \(\|x+y\|_1<\|x\|_1+\|y\|_1\), and explain the role of cancellation.
Uniform vs. \(L^1\). On \([0,1]\), take \(f(t)=t\), \(g(t)=\tfrac12\), \(h(t)=0\). Compare the triangle inequality numerically in \(d_\infty\) and in \(d_1\).
Weights. Show that choosing positive weights \(w_k\) still yields a metric \(d_{p,w}\) by reducing to the unweighted case via the substitution \(u_k=w_k^{1/p}(x_k-y_k)\).
