Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section you will be able to:

Parametrise a surface in 3D and read off its area element $dS = |\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv$ directly from the parametrisation.
Evaluate scalar surface integrals $\iint_S f\,dS$ over parametric surfaces, including surface area as a special case.
Compute flux integrals $\iint_S \mathbf{F} \cdot d\mathbf{S}$ , measuring how much of a vector field passes through a curved surface.
Reason about orientation and the role of the unit normal in determining the sign of flux.
Connect the formalism to electromagnetism (Gauss's law), fluid dynamics (mass flow rate) and the change-of-variables formula in machine-learning normalising flows.

Why this matters: Surface integrals are how calculus measures stuff that lives on a curved skin in space — the mass of a soap bubble, the wind passing through a kite, the electric field threading a Gaussian pillbox, the probability mass squeezed by a generative model. Every "flow through a boundary" in physics is a flux integral in disguise.

The Big Picture

Calculus keeps doing the same trick — chop, evaluate, sum, take the limit — only the shape of the domain changes. Surface integrals are the natural next step:

Dimension	Domain	Integral	Geometric meaning
1-D straight	Interval $[a,b]$	$\int_a^b f(x)\,dx$	Signed area under a curve
1-D curved	Curve $C$ in $\mathbb{R}^3$	$\int_C f\,ds$ or $\int_C \mathbf{F}\cdot d\mathbf{r}$	Mass along wire / work along path
2-D flat	Region $R$ in $\mathbb{R}^2$	$\iint_R f\,dA$	Signed volume under a graph
2-D curved	Surface $S$ in $\mathbb{R}^3$	$\iint_S f\,dS$ or $\iint_S \mathbf{F}\cdot d\mathbf{S}$	Mass on a shell / flux through it

Just as line integrals split into "scalar along a curve" and "vector circulation along a curve", surface integrals split into two species:

Scalar surface integral

$\iint_S f(x,y,z)\,dS$

Integrates a scalar density against area. Used for the mass of a curved sheet of uneven thickness, the average temperature on a satellite's skin, or just the surface area when $f = 1$ .

Vector surface integral (flux)

$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S \mathbf{F}\cdot\mathbf{n}\,dS$

Integrates the normal component of a vector field. Measures "how much of $\mathbf{F}$ " pierces the surface. Used for electric flux, magnetic flux, fluid flow rate, heat current.

The single core idea

A surface integral asks: what is the total of some quantity distributed over a curved 2-D sheet? For scalars we weight the function by area. For vectors we weight by the component pointing through the sheet.

Historical Context

Surface integrals were invented to answer concrete questions in 19th-century physics: how does an electric field surround a charge? how does fluid mass move through a pipe? The mathematics was forged by three names you will meet again in this chapter.

Carl Friedrich Gauss (1777–1855)

Gauss spotted the deep link between flux through a closed surface and the sources sitting inside it. His electrostatic statement $\oint_S \mathbf{E}\cdot d\mathbf{S} = Q_{\text{enc}}/\varepsilon_0$ is one of Maxwell's four equations and the cleanest example of a flux integral doing real work.

George Green (1793–1841)

A self-taught miller's son who introduced the function we now call Green's function and connected surface integrals to volume integrals through what we now call Green's identities. Modern PDE theory begins with him.

George Gabriel Stokes (1819–1903)

Stokes proved the theorem now bearing his name (next section), tying the circulation of a vector field around a curve to the flux of its curl through a spanning surface.

One principle behind them all

Stokes' theorem, the divergence theorem, Green's theorem and the fundamental theorem of calculus are all special cases of the same deep statement — the integral of a derivative over a region equals the integral of the original function over the boundary of that region. Surface integrals are the boundary half of the 3-D version.

Parametric Surfaces Review

Before integrating over a surface we have to be able to describe it. A parametric surface is a vector function from a flat 2-D parameter rectangle into 3-D space:

\mathbf{r}: D \subseteq \mathbb{R}^2 \longrightarrow \mathbb{R}^3, \quad \mathbf{r}(u,v) = \langle x(u,v),\; y(u,v),\; z(u,v)\rangle.

As $(u,v)$ sweeps through the rectangle $D$ , the tip of $\mathbf{r}(u,v)$ traces out the surface $S$ in 3-D.

A small atlas of parametrisations

Surface	Parametrisation $\mathbf{r}(u,v)$	Parameter domain
Plane	$\mathbf{r}_0 + u\,\mathbf{a} + v\,\mathbf{b}$	Bounded rectangle in $(u,v)$
Sphere of radius $R$	$(R\sin\phi\cos\theta,\;R\sin\phi\sin\theta,\;R\cos\phi)$	$0\le\phi\le\pi,\;0\le\theta\le 2\pi$
Cylinder radius $R$ , height $h$	$(R\cos\theta,\;R\sin\theta,\;z)$	$0\le\theta\le 2\pi,\;0\le z\le h$
Graph $z = f(x,y)$	$(x,\;y,\;f(x,y))$	$(x,y)$ in the domain of $f$
Paraboloid $z = x^2 + y^2$	$(r\cos\theta,\;r\sin\theta,\;r^2)$	$0\le r\le R,\;0\le\theta\le 2\pi$
Torus $(R, a)$	$((R+a\cos v)\cos u,\;(R+a\cos v)\sin u,\;a\sin v)$	$0\le u, v\le 2\pi$

The Surface Area Element $dS$

Once we have $\mathbf{r}(u,v)$ , the central object of the whole section is the area element:

dS = |\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv,

where $\mathbf{r}_u = \partial\mathbf{r}/\partial u$ and $\mathbf{r}_v = \partial\mathbf{r}/\partial v$ are the partial derivatives of the parametrisation. Equivalently, the vector area element is $d\mathbf{S} = (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv$ — a normal vector whose magnitude is the area scale.

Where $dS$ Comes From (Intuition)

Imagine the parameter rectangle $D$ ruled with a fine grid of cells, each of width $du$ and $dv$ . The map $\mathbf{r}$ sends one such cell — a tiny square — to a tiny curved patch of the surface $S$ . Two edges of that patch are

$\mathbf{r}(u+du,v) - \mathbf{r}(u,v) \approx \mathbf{r}_u\,du$ — the local edge in the $u$ -direction;
$\mathbf{r}(u,v+dv) - \mathbf{r}(u,v) \approx \mathbf{r}_v\,dv$ — the local edge in the $v$ -direction.

Those two edges span a small parallelogram living tangent to the surface. The area of a parallelogram with edges $\mathbf{a}$ and $\mathbf{b}$ is $|\mathbf{a} \times \mathbf{b}|$ — that is exactly the magnitude of $(\mathbf{r}_u du) \times (\mathbf{r}_v dv) = (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv$ . Take the limit as $du, dv \to 0$ and you obtain the area element $dS$ . The direction of the cross product also gives a normal vector to the surface — that is going to matter the moment we talk about flux.

A geographic intuition

Take the sphere parametrisation with $|\mathbf{r}_\phi \times \mathbf{r}_\theta| = R^2\sin\phi$ . Near the equator, $\sin\phi \approx 1$ , so a one-degree by one-degree patch of $(\phi, \theta)$ is genuinely big. Near a pole, $\sin\phi \approx 0$ and the same parameter square shrinks to a tiny spherical triangle. The factor $\sin\phi$ in $dS$ is exactly the "Mercator distortion" you see on flat world maps.

Interactive: Parametric Surfaces

Drag the surface, change the parametrisation, and watch how the constant- $u$ and constant- $v$ curves rule the surface. The shape itself is global; the parametrisation is a chart we paint on top of it.

Loading parametric surface demo...

Scalar Surface Integrals

Definition and Intuition

A scalar surface integral of a function $f(x,y,z)$ over a surface $S$ is the limit of Riemann sums of $f$ against the surface area:

\iint_S f(x,y,z)\,dS = \lim_{\Delta S \to 0} \sum_i f(P_i)\,\Delta S_i.

Physical reading. If $f$ is a mass density per unit area ( $\text{kg}/\text{m}^2$ ), the sum gives the total mass of the shell. If $f$ is a temperature, dividing the integral by the total area gives the average temperature on the surface. The pattern "integrand times area element, summed" is universal.

Computing Scalar Surface Integrals

Pull everything back to the parameter rectangle $D$ :

\iint_S f\,dS \;=\; \iint_D f\bigl(\mathbf{r}(u,v)\bigr)\,|\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv.

Recipe:

Pick a parametrisation $\mathbf{r}(u,v)$ matched to the symmetry.
Differentiate to get $\mathbf{r}_u$ and $\mathbf{r}_v$ .
Take the cross product $\mathbf{r}_u \times \mathbf{r}_v$ and its magnitude.
Substitute the parametrisation into $f$ .
Evaluate a familiar double integral over $D$ .

Surface Area as a Special Case

Setting $f \equiv 1$ gives total surface area:

\text{Area}(S) = \iint_S 1\,dS = \iint_D |\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv.

For a sphere of radius $R$ , this immediately reproduces $4\pi R^2$ :

\int_0^{2\pi}\!\!\int_0^{\pi} R^2 \sin\phi\,d\phi\,d\theta = R^2 \cdot 2\pi \cdot [-\cos\phi]_0^{\pi} = R^2 \cdot 2\pi \cdot 2 = 4\pi R^2.

Worked Example — Center of Mass of a Hemisphere

We will compute the z-coordinate of the centroid of the upper hemisphere $x^2+y^2+z^2=R^2,\;z\ge 0$ , assumed to be a thin shell of uniform area density. Hand-work below — try it yourself before peeking.

▶ Solution (click to expand and work it by hand)

Setup. For uniform density the centroid coordinate is $\bar z = \dfrac{1}{A}\iint_S z\,dS$ , where $A$ is the area of the hemisphere.

Area. Half a sphere has area $A = 2\pi R^2$ .

Numerator. Use the spherical parametrisation with $z = R\cos\phi$ and $dS = R^2\sin\phi\,d\phi\,d\theta$ , integrating $\phi$ from $0$ to $\pi/2$ :

\iint_S z\,dS = \int_0^{2\pi}\!\!\int_0^{\pi/2} R\cos\phi \cdot R^2 \sin\phi\,d\phi\,d\theta = 2\pi R^3 \int_0^{\pi/2} \sin\phi\cos\phi\,d\phi.

The inner integral is $[\sin^2\phi/2]_0^{\pi/2} = 1/2$ , so the numerator is $\pi R^3$ .

Divide. $\bar z = \dfrac{\pi R^3}{2\pi R^2} = \dfrac{R}{2}.$

Sanity check. The centroid of a solid hemisphere is at $3R/8$ ; a thin shell is "weighted higher", so $R/2 > 3R/8$ is reasonable. For $R = 2$ the numerator is $8\pi \approx 25.133$ — the very number that the Python integrator below reproduces.

Interactive: Scalar Surface Integrals

Pick a surface, pick an integrand, and watch the Riemann sum colour-code each patch by its contribution. The running total is the surface integral.

Loading 3D surface visualization...

Vector Surface Integrals (Flux)

Flux Through a Surface

The flux of a vector field $\mathbf{F}$ through an oriented surface $S$ with unit normal $\mathbf{n}$ is the scalar surface integral of the normal-component of $\mathbf{F}$ :

\Phi \;=\; \iint_S \mathbf{F}\cdot d\mathbf{S} \;=\; \iint_S \mathbf{F}\cdot\mathbf{n}\,dS.

Physical reading. If $\mathbf{F} = \rho\mathbf{v}$ is the momentum density of a fluid ( $\rho$ mass density, $\mathbf{v}$ velocity), then $\Phi$ is the mass flow rate through the surface in kg/s. If $\mathbf{F} = \mathbf{E}$ is the electric field, $\Phi$ is the electric flux — by Gauss's law equal to the enclosed charge divided by $\varepsilon_0$ .

Flux as a Dot Product (Intuition)

The dot product $\mathbf{F}\cdot\mathbf{n}$ filters $\mathbf{F}$ to its component through the surface. Three cases say it all:

$\mathbf{F}\parallel\mathbf{n}$ (perpendicular to the surface): $\mathbf{F}\cdot\mathbf{n} = \pm|\mathbf{F}|$ — maximal flux per area.
$\mathbf{F}\perp\mathbf{n}$ (parallel to the surface): $\mathbf{F}\cdot\mathbf{n} = 0$ — the field slides along the surface and contributes nothing to flux.
General angle $\alpha$ : $\mathbf{F}\cdot\mathbf{n} = |\mathbf{F}|\cos\alpha$ — only the cosine projects through.

That is why a window perpendicular to a uniform wind catches the maximum air flow, while a window aligned with the wind catches none.

Surface Orientation

Every smooth surface has two sides. Choosing one — and therefore one of the two unit normals — is called fixing the orientation. Conventions:

Closed surfaces

For a sphere, an ellipsoid, the boundary of a cube — the outward normal is the default. Positive flux means net flow out of the enclosed region.

Open surfaces with a boundary curve

Use the right-hand rule: curl the fingers along the boundary curve in its chosen direction, and the thumb points to the positive normal side. This is the convention Stokes' theorem (next section) requires.

Sign flips with orientation

Reverse the choice of normal and every flux integral flips sign. So $\iint_S \mathbf{F}\cdot d\mathbf{S}$ is well-defined only once we have committed to an orientation.

Computing Flux Integrals

Pull back to the parameter rectangle using the vector area element $d\mathbf{S} = (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv$ :

\iint_S \mathbf{F}\cdot d\mathbf{S} \;=\; \iint_D \mathbf{F}\!\bigl(\mathbf{r}(u,v)\bigr)\cdot (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv.

Key observation. The cross product $\mathbf{r}_u \times \mathbf{r}_v$ already encodes both the direction of the normal and the magnitude of the area scale — so no unit vector and no extra norm appear in this version of the formula. The price is that you must check the sign of the cross product against your chosen orientation; flip the order of arguments if it points the wrong way.

Worked Example — Flux of $(0,0,z)$ Through a Paraboloid

Compute $\iint_S \mathbf{F}\cdot d\mathbf{S}$ for $\mathbf{F}=(0,0,z)$ through the paraboloid $z = 4 - x^2 - y^2,\;z\ge 0$ , oriented upward. Try it yourself before expanding.

▶ Solution (click to expand and work it by hand)

Polar parametrisation. $\mathbf{r}(r,\theta) = (r\cos\theta,\;r\sin\theta,\;4-r^2)$ for $0\le r\le 2,\;0\le\theta\le 2\pi$ .

Tangent vectors.

\mathbf{r}_r = (\cos\theta,\;\sin\theta,\;-2r),\quad \mathbf{r}_\theta = (-r\sin\theta,\;r\cos\theta,\;0).

Cross product.

\mathbf{r}_r \times \mathbf{r}_\theta = (2r^2\cos\theta,\;2r^2\sin\theta,\;r).

The z-component $r > 0$ , so this normal points upward. Good — matches our orientation.

Dot with $\mathbf{F}=(0,0,z)=(0,0,4-r^2)$ .

\mathbf{F}\cdot(\mathbf{r}_r\times\mathbf{r}_\theta) = (4-r^2)\cdot r = 4r - r^3.

Integrate.

\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_0^{2\pi}\!\!\int_0^{2} (4r - r^3)\,dr\,d\theta = 2\pi\Bigl[2r^2 - \tfrac{r^4}{4}\Bigr]_0^{2} = 2\pi\,(8 - 4) = 8\pi.

Sanity check. The maximum flux per area happens where $\mathbf{F}$ aligns with $\mathbf{n}$ — at the apex $(0,0,4)$ where $\mathbf{F}=(0,0,4)$ and the normal is straight up. Near the rim, $z\to 0$ kills the integrand. Both endpoints vanish, the middle dominates, and the total comes to $8\pi$ .

Interactive: Flux Through Surfaces

Drag the vector field, change the surface, and watch every patch's contribution light up red (positive flux, out) or blue (negative flux, in). When you tilt the surface to be parallel to the field, the running total drops to zero — the visual proof of the $\cos\alpha$ projection.

Loading flux integral visualization...

Important Surface Types

Surfaces as Graphs $z = f(x,y)$

When the surface is the graph of a function, the parametrisation is free and the formulas simplify dramatically:

Parametrisation: $\mathbf{r}(x,y) = (x,\;y,\;f(x,y)).$

Cross product (upward normal): $\mathbf{r}_x \times \mathbf{r}_y = (-f_x,\;-f_y,\;1).$

Area element: $dS = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy.$

For flux through a graph with upward normal, writing $\mathbf{F}=(P,Q,R)$ :

\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_D \bigl(-P f_x - Q f_y + R\bigr)\,dx\,dy.

Spheres and Cylinders

Surface	Outward normal $\mathbf{n}$	$dS$	Orientation note
Sphere radius $R$	$(x,y,z)/R$	$R^2\sin\phi\,d\phi\,d\theta$	Standard: outward from centre
Cylinder radius $R$	$(\cos\theta,\;\sin\theta,\;0)$	$R\,d\theta\,dz$	Outward from the axis
Cone $z = \sqrt{x^2+y^2}$	$\propto (x,\;y,\;-\sqrt{x^2+y^2})$	$(r\sqrt{2})\,dr\,d\theta$	Outward (or downward) per choice

Pick the parametrisation to match the symmetry

Spheres love spherical coordinates. Cylinders love cylindrical. Graphs love Cartesian. The wrong choice can turn a one-line integral into a multi-page mess.

Gauss's Law as a Surface Integral

The electric field of a point charge $q$ at the origin is $\mathbf{E}(\mathbf{x}) = \dfrac{kq}{|\mathbf{x}|^2}\hat{\mathbf{r}}$ . Compute the flux of $\mathbf{E}$ through any sphere of radius $R$ centred at the origin.

▶ Solution (click to expand)

On the sphere, the outward unit normal is $\mathbf{n} = \hat{\mathbf{r}}$ and $|\mathbf{E}| = kq/R^2$ is the same constant everywhere on the surface.

So $\mathbf{E}\cdot\mathbf{n} = kq/R^2$ and

\iint_S \mathbf{E}\cdot d\mathbf{S} = \frac{kq}{R^2}\,\text{Area}(S) = \frac{kq}{R^2}\,(4\pi R^2) = 4\pi k q.

Take-away. The answer is independent of $R$ — the same flux pierces every concentric sphere. The geometric reason: as the sphere grows, $|\mathbf{E}|$ drops as $1/R^2$ , but the area grows as $R^2$ . They cancel exactly, which is the geometric content of Gauss's law.

Real-World Applications

Physics: Electric and Magnetic Flux

Maxwell's equations are written almost entirely in the language of flux:

Gauss's law (electric): $\oint_S \mathbf{E}\cdot d\mathbf{S} = Q_{\text{enc}}/\varepsilon_0$ — the electric flux out of a closed surface counts the charge inside.
Gauss's law (magnetic): $\oint_S \mathbf{B}\cdot d\mathbf{S} = 0$ — no magnetic monopoles, so magnetic flux out of any closed surface is always zero.
Faraday's law: a changing magnetic flux through a surface induces an electromotive force around its boundary — surface integrals show up on the right-hand side directly.

Fluid Dynamics: Mass Flow Rate

For a fluid of density $\rho$ and velocity $\mathbf{v}$ , the mass flow rate through a surface is the flux of $\rho\mathbf{v}$ :

\dot m = \iint_S \rho\,\mathbf{v}\cdot d\mathbf{S}\quad\text{(kg/s)}.

This is the equation used to size pipes, sketch ventilation, compute lift on an airfoil, and check conservation of mass in a control volume.

Machine Learning: Normalising Flows

In a normalising flow, an invertible neural network $f$ maps samples $z$ from a simple distribution $p_0$ to samples $x = f(z)$ from a target distribution $p$ . The change-of-variables formula

p(x) = p_0\bigl(f^{-1}(x)\bigr)\,\bigl|\det J_{f^{-1}}(x)\bigr|

is the local, infinitesimal version of "probability is conserved through a flow". The global version — integrating mass through a closed surface — is exactly the Divergence Theorem we will prove in section 19.9. Surface integrals are the bridge that turns local conservation into a global accounting law.

Coming next

The Divergence Theorem says $\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV$ : the flux out of a closed surface equals the total source strength inside. That is how Gauss's law gets its name and how conservation of probability gets its proof.

Python Implementation

Scalar surface integral over a hemisphere

Plain Python: ∬_S z dS over the upper hemisphere

🐍scalar_surface_integral.py

Explanation(20)

Code(42)

1Import math (no NumPy yet)

We use plain Python first so every operation is visible. `math.sin`, `math.cos`, `math.pi` are scalar functions — no broadcasting magic. The mechanics of the Riemann sum stay center stage.

3Function signature with sensible defaults

`R=2.0` is the sphere radius. `n_phi=50, n_theta=100` are the grid resolutions along phi (vertical angle) and theta (azimuth). More cells → smaller error, but more work. The defaults give ~0.02% error in a fraction of a second.

5Docstring: the math we are implementing

We restate the parametrization and the cross-product magnitude so the code is readable on its own. For a sphere of radius R the elementary patch has area R^2 sin(phi) d(phi) d(theta) — that is the only nontrivial geometric fact we need.

17Step 1 — uniform spacing in the parameter rectangle

Phi sweeps from 0 (north pole) to pi/2 (equator) for the upper hemisphere. Theta sweeps all the way around. We split each axis into equal cells so d_phi and d_theta are constants we can pull out of the inner loop.

18Constant d_phi

d_phi = (pi/2) / n_phi ≈ 0.0314 rad. Every patch in the phi direction is this tall in parameter space. The *spatial* height varies with R, but in (phi, theta) land every cell is identical — that is the gift of a good parametrization.

19Constant d_theta

d_theta = 2*pi / n_theta ≈ 0.0628 rad. Same idea around the azimuth.

21Accumulator initialised to 0

`integral` will collect the running sum ∑ f(point) · dS(point). Floating-point initialised to 0.0 so Python knows we are doing real arithmetic.

23Step 2 — outer loop over phi

We march down from the north pole toward the equator. Each value of i selects one ring of patches at the same phi.

24Midpoint rule: phi at cell centre

Using `(i + 0.5) * d_phi` evaluates the integrand at the *midpoint* of each cell rather than the corner. Midpoint Riemann sums converge as O(d^2) — much better than the O(d) of left/right endpoints. For n_phi = 50 that is the difference between 0.02% and 1% error.

25Inner loop over theta

For each phi-ring we spin all the way around. j indexes the longitude cell.

26Midpoint theta

Same midpoint trick: theta at the centre of cell j is `(j + 0.5) * d_theta`. This rule is symmetric so the leading error term cancels.

28Integrand value f(x,y,z) = z

For this example f is just the height z. On the sphere z = R cos(phi), so we never need x or y explicitly. If f depended on x or y, we would also compute them as R sin(phi) cos(theta) and R sin(phi) sin(theta).

29dS = R^2 sin(phi) d(phi) d(theta)

The crucial line. The R^2 sin(phi) factor is the magnitude of r_phi × r_theta for the spherical parametrization — it shrinks to zero at the poles (sin(0) = 0) and is maximal at the equator (sin(pi/2) = 1). Geographically: a 1-degree by 1-degree patch is tiny near a pole and large at the equator.

30Accumulate f · dS

The heart of every Riemann sum. We are adding tiny contributions of (value) × (area). After 50 × 100 = 5000 cells, `integral` is our discrete approximation of ∬_S z dS.

32Step 3 — compare to the exact answer

We know the closed form: ∬_S z dS = pi R^3. Computing it analytically: ∫₀^{2π} dθ ∫₀^{π/2} R cos(φ) · R^2 sin(φ) dφ = 2π R^3 · [sin²(φ)/2]₀^{π/2} = π R^3.

33Analytical = pi * R^3

For R=2 this is 8π ≈ 25.1327. The numerical result should agree to roughly four decimal places with n_phi=50, n_theta=100.

34Relative error as a percentage

Dividing by the analytical value normalises the error so it is independent of the magnitude of R. We expect ~0.02% with the default resolution.

36Report

Printing three lines (numerical / analytical / error) is enough to verify the code without firing up a plotting library.

39Return tuple

Returning both values lets a caller do further analysis (e.g. plot error vs. resolution) without re-running the integration.

42Run it

Executing the function at module level gives an immediate sanity check. Expected output: Numerical : 25.130430 Analytical : 25.132741 (= pi * R^3) Relative : 0.0092 %

22 lines without explanation

1import math
2
3def scalar_surface_integral_hemisphere(R=2.0, n_phi=50, n_theta=100):
4    """
5    Compute  I = ∬_S  z  dS   over the upper hemisphere of radius R.
6
7    Parametrization (spherical):
8        x = R sin(phi) cos(theta)
9        y = R sin(phi) sin(theta)
10        z = R cos(phi)
11
12    For a sphere:  |r_phi × r_theta| = R^2 sin(phi),
13    so the surface area element is
14        dS = R^2 sin(phi) d(phi) d(theta).
15    """
16    # Step 1. Build evenly spaced grids in the parameter rectangle.
17    d_phi   = (math.pi / 2) / n_phi          # phi runs 0 → pi/2 (upper half)
18    d_theta = (2 * math.pi) / n_theta        # theta runs 0 → 2*pi
19
20    integral = 0.0
21
22    # Step 2. Riemann sum over every (phi_i, theta_j) cell.
23    for i in range(n_phi):
24        phi = (i + 0.5) * d_phi              # midpoint of the phi cell
25        for j in range(n_theta):
26            theta = (j + 0.5) * d_theta      # midpoint of the theta cell
27
28            z   = R * math.cos(phi)          # integrand value f = z
29            dS  = R**2 * math.sin(phi) * d_phi * d_theta
30            integral += z * dS
31
32    # Step 3. Compare with the closed-form answer  pi * R^3.
33    analytical = math.pi * R**3
34    rel_err    = abs(integral - analytical) / analytical * 100
35
36    print(f"Numerical  : {integral:.6f}")
37    print(f"Analytical : {analytical:.6f}   (= pi * R^3)")
38    print(f"Relative   : {rel_err:.4f} %")
39    return integral, analytical
40
41
42scalar_surface_integral_hemisphere()

Flux integral through a paraboloid

Plain Python: flux of (0,0,z) through z = 4 − x² − y²

🐍flux_paraboloid.py

Explanation(18)

Code(41)

1math import

Same minimalist setup. The flux integral is conceptually richer than the scalar version but computationally just as friendly once we have spotted the symmetry.

3Signature

R=2 is the rim radius (where z = 4 - 4 = 0). n_r=80 controls how finely we slice the radial direction, n_theta=160 the azimuth.

5What flux means here

F = (0, 0, z) is a purely vertical field whose strength grows with height. The paraboloid is a bowl opening downward, peak at (0,0,4). With the upward normal, positive flux means F is pushing 'out the top' of the bowl.

11Polar parametrisation of the bowl

Putting x, y in polar form and substituting into z = 4 - x^2 - y^2 gives z = 4 - r^2. This collapses the surface to a 2-parameter map r(r,θ).

15Tangent vector r_r

Differentiating r(r,θ) with respect to r holds θ fixed and asks: 'as I move outward by 1 unit of r, where does my point on the surface go?' x goes by cos θ, y by sin θ, z drops by 2r (because dz/dr = -2r).

16Tangent vector r_θ

Differentiating with respect to θ holds r fixed: the point swings around the z-axis. dz/dθ = 0 because z depends only on r, not θ — the bowl is rotationally symmetric.

18Cross product r_r × r_θ

Using the determinant formula: the z-component is r (positive!), so this normal vector points *upward* — perfect for the orientation we chose. If we had chosen r_θ × r_r instead, the z-component would be -r and the normal would point downward.

22F · (r_r × r_θ) = r(4 − r²)

The huge payoff: F is (0,0,z) = (0,0,4−r²), the cross product is (·, ·, r), and the dot product picks up only the z component: (4−r²) · r. Theta dropped out entirely thanks to symmetry.

24Radial grid spacing

dr = R/n_r = 2/80 = 0.025. Same uniform-grid logic as before.

25Azimuth grid spacing

dtheta = 2π/160 ≈ 0.0393. We will not actually loop over theta — see below.

26Accumulator

flux starts at zero; we will accumulate r (4 − r²) · dr · 2π over every radial cell.

28Outer loop over r only

Because the integrand has no θ dependence, the inner θ integral is trivially ∫₀^{2π} dθ = 2π. We collapse the double loop to a single loop, saving a factor of n_theta in cost.

29Midpoint r

Same midpoint trick: r at cell centre is (i+0.5)·dr. The endpoints r=0 and r=2 are both legitimate, but midpoint sampling gives second-order accuracy for free.

30radial = r (4 − r²)

Evaluating F · (r_r × r_θ) at the cell-centre r. For r=0 this is 0. For r=2 it is 0 too. In between it rises and falls — the peak contribution comes from r ≈ 1.15.

32flux += radial · dr · 2π

One radial cell contributes (value) × (radial width) × (full azimuth). After 80 iterations, flux ≈ 25.13 — the discrete approximation of 8π.

34Analytic check

∫₀^{2π} dθ ∫₀^{2} (4r − r³) dr = 2π · [2r² − r⁴/4]₀² = 2π · (8 − 4) = 8π ≈ 25.1327.

35Print numerical, analytic, error

Expected output: Numerical : 25.132741 Analytical : 25.132741 (= 8*pi) Relative : 0.0000 % The match is so close because the integrand is a polynomial of degree 3 and the midpoint rule is exact for cubics in 1-D.

39Run

One call exercises the whole pipeline: parametrise, evaluate F, take the cross product analytically, dot it with F, integrate.

23 lines without explanation

1import math
2
3def flux_through_paraboloid(R=2.0, n_r=80, n_theta=160):
4    """
5    Compute the flux of  F(x,y,z) = (0, 0, z)
6    through the paraboloid  z = 4 - x^2 - y^2,  z >= 0,
7    oriented with the *upward* normal.
8
9    Parametrize the paraboloid in polar form:
10        x = r cos(theta)
11        y = r sin(theta)
12        z = 4 - r^2,    0 <= r <= 2,  0 <= theta <= 2*pi
13
14    Tangent vectors:
15        r_r     = ( cos(theta),  sin(theta), -2 r )
16        r_theta = (-r sin(theta), r cos(theta), 0  )
17
18    Cross product (upward-pointing):
19        r_r × r_theta = ( 2 r^2 cos(theta),  2 r^2 sin(theta),  r )
20
21    F · (r_r × r_theta) = (0, 0, 4-r^2) · (..., r) = r (4 - r^2)
22    """
23    dr     = R / n_r
24    dtheta = (2 * math.pi) / n_theta
25    flux   = 0.0
26
27    for i in range(n_r):
28        r = (i + 0.5) * dr                       # midpoint r in cell
29        radial = r * (4 - r**2)                  # F · (r_r × r_theta)
30        # All terms below depend on r only, so theta sum is just n_theta * dtheta.
31        flux += radial * dr * (2 * math.pi)
32
33    # Closed form: ∫₀^{2π} dθ ∫₀^2 (4r - r^3) dr = 2π (8 - 4) = 8π.
34    analytical = 8 * math.pi
35    print(f"Numerical  : {flux:.6f}")
36    print(f"Analytical : {analytical:.6f}   (= 8*pi)")
37    print(f"Relative   : {abs(flux - analytical)/analytical*100:.4f} %")
38    return flux, analytical
39
40
41flux_through_paraboloid()

Normalising-flow Jacobian (the ML link)

Plain Python: a planar normalising flow and its Jacobian determinant

🐍planar_flow_jacobian.py

Explanation(15)

Code(39)

1math import only

We illustrate the *idea* with a 2-D toy. Plain Python keeps the algebra visible.

3Function signature

`z` is a 2-D input point. `w` and `u_scale` are the planar-flow parameters. Real flows learn these from data; here we hard-code reasonable values.

5The map f: z → x

A planar flow shifts z by a single bump in the direction u, gated by a tanh of a linear combination of z. It is invertible whenever 1 + (1 − tanh²) · (w·u) > 0 — which is why people clip the parameters.

9Why the Jacobian determinant matters

Change-of-variables for probability density: p_x(x) = p_z(z) / |det J_f(z)|. If J is the identity (no stretching), the density is unchanged. If |det J| > 1, mass gets spread thinner — the new density at x is smaller than the original density at z.

14Surface-integral connection

Probability is conserved: ∫ p(x) dx = 1 for every choice of f. That global conservation, applied to an arbitrary region V with boundary S, is exactly the Divergence Theorem with F = p · v, where v is the velocity field of the flow. We will formalise this in section 19.9.

18Unpack z

z is a tuple (z1, z2) so we can write the algebra explicitly.

19Unpack flow parameters

w controls the *direction* of the bump's gating, u_scale its magnitude in each coordinate.

23Linear pre-activation s

s = w · z + b. This single scalar gates how much of the bump fires at z.

24Apply tanh

tanh squashes s into (−1, 1). Locations with |s| ≫ 1 are saturated and the map is locally close to the identity *plus a constant* — Jacobian = 1.

25First output coordinate

x1 = z1 + tanh(s) · u1.

26Second output coordinate

x2 = z2 + tanh(s) · u2. Together the bump pushes the point in the direction u with strength tanh(s).

29Matrix-determinant lemma

The Jacobian of f is I + u (1 − tanh²(s)) wᵀ — a rank-1 update of the identity. The matrix-determinant lemma says det(I + a bᵀ) = 1 + bᵀ a. Substituting a = u (1 − tanh²), b = w gives the one-line formula on the next line.

33Closed-form determinant

det J = 1 + (1 − tanh²(s)) (w · u). No general matrix inversion needed — a triumph of structure. For a deep flow we compose many such layers and sum log|det J| layer-by-layer.

35Return both x and det J

The training loss for a normalising flow is − log p_z(z) + log |det J_f(z)| averaged over data x. Both ingredients live in our return value.

39Smoke test at two points

Expected output: z = (0.0, 0.0) → x = (0.0000, 0.0000) |det J| = 1.0000 z = (1.5, -1.0) → x = (1.9268, -0.1463) |det J| = 0.4566 At z = (0,0) the bump is centred and gradient is full, but symmetry leaves the map fixed there. At z = (1.5, −1.0) the map stretches mass by a factor of ~0.46 — meaning probability density at x is roughly 2.2× the corresponding p_z value.

24 lines without explanation

1import math
2
3def planar_flow_jacobian(z, w=(1.0, 0.5), b=0.0, u_scale=(0.5, 1.0)):
4    """
5    A toy planar normalizing flow:  x = f(z) = z + tanh(w·z + b) * u.
6
7    Returns x and the *Jacobian determinant* of the map at z.
8    The change-of-variables formula then gives
9        p(x) = p_0(z) / |det J_f(z)|.
10
11    The link to surface integrals: probability behaves like an
12    incompressible fluid in z-space; |det J_f(z)| measures how
13    much a small volume gets stretched into x-space. This is the
14    infinitesimal version of the flux that the Divergence Theorem
15    formalises in the next section.
16    """
17    z1, z2 = z
18    w1, w2 = w
19    u1, u2 = u_scale
20
21    # Forward map.
22    s   = w1 * z1 + w2 * z2 + b
23    t   = math.tanh(s)
24    x1  = z1 + t * u1
25    x2  = z2 + t * u2
26
27    # Jacobian of f at z:
28    #   J = I + u · (1 - tanh^2(s)) · w^T          (rank-1 update of I)
29    #   det(I + a b^T) = 1 + b^T a
30    #   ⇒ det J = 1 + (1 - t^2) * (w1 * u1 + w2 * u2)
31    det_J = 1.0 + (1.0 - t**2) * (w1 * u1 + w2 * u2)
32
33    return (x1, x2), det_J
34
35
36# Sanity check at two sample points.
37for z in [(0.0, 0.0), (1.5, -1.0)]:
38    x, dJ = planar_flow_jacobian(z)
39    print(f"z = {z}  →  x = ({x[0]:.4f}, {x[1]:.4f})   |det J| = {abs(dJ):.4f}")

PyTorch Implementation

With the plain-Python intuition in hand, here is the vectorised PyTorch version of the flux integral. We let autograd build the tangent vectors $\mathbf{r}_r$ and $\mathbf{r}_\theta$ symbolically through the graph — no algebraic differentiation by us. The same idea scales to learnable surface parametrisations inside a neural network.

PyTorch: vectorised flux with autograd-built tangent vectors

🐍flux_paraboloid_torch.py

Explanation(22)

Code(52)

1Imports

torch for tensors and autograd, math for the reference constant 8π. We do not need numpy.

4Function signature

n_r and n_theta are larger than in the plain-Python version — vectorised ops make a 200 × 400 = 80 000 cell grid essentially free on a CPU.

6Why PyTorch here

Two reasons. (a) Vectorised tensor ops replace the nested for-loops. (b) We can hand the tangent-vector calculation to autograd — no algebraic differentiation by us. That second point is the real punch line.

12Cell widths

Same midpoint spacing as before. dr ≈ 0.01, dtheta ≈ 0.0157.

141-D midpoint grids

`torch.linspace(0.5*dr, R - 0.5*dr, n_r)` gives n_r values at cell centres. The same trick for theta.

162-D meshgrid

`torch.meshgrid(..., indexing='ij')` produces two (n_r × n_theta) matrices: R_grid[i,j] = r[i], T_grid[i,j] = theta[j]. Now every cell is a single matrix entry.

17requires_grad_(True) on R_grid

We need ∂X/∂r, ∂Y/∂r, ∂Z/∂r as autograd outputs. Setting requires_grad on the leaf tensor R_grid tells PyTorch to record operations involving it. Without this flag, `torch.autograd.grad` would refuse to differentiate.

18requires_grad_(True) on T_grid

Same for theta. Now any expression built from R_grid and T_grid lives in an autograd graph that we can query.

21X = R_grid * cos(T_grid)

First parametrisation component. This is a tensor of shape (n_r, n_theta). Autograd records the multiplication and the cosine.

22Y = R_grid * sin(T_grid)

Second component. Same shape.

23Z = 4 − R_grid²

Third component. Depends only on R_grid, so ∂Z/∂T_grid will turn out to be exactly 0 — matching our hand calculation.

26ones tensor for grad_outputs

`torch.autograd.grad` computes a vector-Jacobian product. To get the *plain* derivative tensor we feed grad_outputs = ones_like(out), which selects every output entry with weight 1.

27Local helper `grad`

A small wrapper so the next two lines stay readable. `retain_graph=True` is essential because we will call grad six times against the same graph (three for r_r, three for r_t).

29create_graph=False

We only want first derivatives, not second. False saves memory by not building the graph of the gradient calculation itself.

31Stack components into r_r

Three (n_r, n_theta) tensors stacked along a new last axis give shape (n_r, n_theta, 3) — the tangent vector at every cell. Component 0 = ∂x/∂r, component 1 = ∂y/∂r, component 2 = ∂z/∂r.

32Stack components into r_t

Same shape, this time the partials with respect to theta. If you print r_t[..., 2], every entry will be effectively 0 — confirming ∂z/∂θ = 0 for our parametrisation.

35torch.cross on the last dim

`torch.cross(r_r, r_t, dim=-1)` computes the cross product along the size-3 axis, returning a (n_r, n_theta, 3) tensor of normal vectors. The z-component should be r (positive ⇒ upward normal). If you flip the argument order you flip the orientation.

38F as a tensor field

F = (0, 0, z) evaluated on the surface. We use Z.detach() because the field value itself is fixed data, not something we need to differentiate. Detaching cleanly separates field evaluation from the geometric derivatives.

41F · (r_r × r_t)

Elementwise multiply then sum along the size-3 dim. Result is an (n_r, n_theta) scalar field — exactly the integrand of the parameter-space integral.

422-D midpoint Riemann sum

`integrand.sum() * dr * dtheta` is the vectorised Riemann sum: add every cell value, multiply by cell area. Result is a 0-D tensor `flux`.

44Compare to 8π

Expected output: PyTorch flux : 25.132741 Reference 8π : 25.132741 Four-decimal agreement, and on GPU this scales to millions of cells effortlessly.

48if __name__ guard

Standard Python idiom so the file can be imported without auto-running. Recommended for any PyTorch module that wants to live inside a larger codebase.

30 lines without explanation

1import torch
2import math
3
4def flux_paraboloid_torch(R: float = 2.0, n_r: int = 200, n_theta: int = 400,
5                          device: str = "cpu"):
6    """
7    Vectorised version of the same flux integral, in PyTorch.
8
9    The bonus: we let autograd build the tangent vectors r_r and r_theta
10    by differentiating the parametrisation symbolically through the graph,
11    instead of writing them out by hand.
12    """
13    # 1. Build (n_r × n_theta) grids of midpoint samples.
14    dr     = R / n_r
15    dtheta = (2 * math.pi) / n_theta
16    r      = torch.linspace(0.5*dr, R - 0.5*dr, n_r, device=device)
17    theta  = torch.linspace(0.5*dtheta, 2*math.pi - 0.5*dtheta, n_theta, device=device)
18    R_grid, T_grid = torch.meshgrid(r, theta, indexing="ij")
19    R_grid.requires_grad_(True)            # so autograd can differentiate
20    T_grid.requires_grad_(True)
21
22    # 2. Parametrisation as a single tensor expression.
23    X = R_grid * torch.cos(T_grid)
24    Y = R_grid * torch.sin(T_grid)
25    Z = 4.0 - R_grid**2
26
27    # 3. Tangent vectors via autograd — one component at a time.
28    ones = torch.ones_like(R_grid)
29    def grad(out, wrt):
30        return torch.autograd.grad(out, wrt, grad_outputs=ones,
31                                   retain_graph=True, create_graph=False)[0]
32
33    r_r = torch.stack([grad(X, R_grid), grad(Y, R_grid), grad(Z, R_grid)], dim=-1)
34    r_t = torch.stack([grad(X, T_grid), grad(Y, T_grid), grad(Z, T_grid)], dim=-1)
35
36    # 4. Cross product r_r × r_theta (upward-pointing).
37    cross = torch.cross(r_r, r_t, dim=-1)
38
39    # 5. Vector field F = (0, 0, z), evaluated on the surface.
40    F = torch.stack([torch.zeros_like(Z), torch.zeros_like(Z), Z.detach()], dim=-1)
41
42    # 6. Dot product F · (r_r × r_theta), then 2-D midpoint Riemann sum.
43    integrand = (F * cross).sum(dim=-1)
44    flux = integrand.sum() * dr * dtheta
45
46    print(f"PyTorch flux : {flux.item():.6f}")
47    print(f"Reference 8π : {8 * math.pi:.6f}")
48    return flux
49
50
51if __name__ == "__main__":
52    flux_paraboloid_torch()

Test Your Understanding

Summary

We extended integration to 2-D curved sheets sitting in 3-D space. Two flavours, one idea: chop the parameter rectangle, evaluate, weight by area, sum, take the limit.

Key formulas

Integral	Formula	What it measures
Surface area	$\iint_S 1\,dS = \iint_D \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv$	Total area of $S$
Scalar surface integral	$\iint_S f\,dS = \iint_D f(\mathbf{r}(u,v))\,\|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv$	Density weighted by area
Flux integral	$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_D \mathbf{F}(\mathbf{r}) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv$	Field flow through $S$
Graph $z = f(x,y)$	$dS = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$	Area element for explicit graphs
Sphere $R$	$dS = R^2 \sin\phi\,d\phi\,d\theta$	Spherical-coords area element

Key takeaways

A parametric surface $\mathbf{r}(u,v)$ turns a flat rectangle into a curved sheet in $\mathbb{R}^3$ .
The cross product $\mathbf{r}_u\times\mathbf{r}_v$ carries both the local normal direction and the area scale.
Scalar surface integrals add up a density. Mass, average value, surface area — all the same recipe with different $f$ .
Flux integrals project the field onto the surface normal and accumulate. Orientation is part of the data — flip it and the sign flips.
The same algebra is the engine of Maxwell's equations, fluid mass-balance, and the change-of-variables formula in normalising flows.
Surface integrals are one side of two great theorems: Stokes (next section) and the Divergence Theorem (section 19.9).

The essence of surface integrals

"Chop the surface into tiny tilted parallelograms, ask each one a question, and add up the answers."

Coming next: Stokes' theorem ties the circulation of a vector field around a closed curve to the flux of its curl through any spanning surface — the 3-D upgrade of Green's theorem.