Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Compute divergence and curl of vector fields in 2D and 3D using their component formulas
Interpret divergence physically as measuring sources, sinks, and incompressibility in fluid flows and other physical systems
Interpret curl physically as measuring local rotation and the axis of that rotation in 3D
Apply fundamental vector calculus identities, including $\nabla \times (\nabla f) = \mathbf{0}$ and $\nabla \cdot (\nabla \times \mathbf{F}) = 0$
Connect curl and divergence to integral theorems: Green's Theorem, Stokes' Theorem, and the Divergence Theorem

Why This Matters: Curl and divergence are the fundamental differential operators of vector calculus. They appear in Maxwell's equations of electromagnetism, the Navier-Stokes equations of fluid dynamics, and throughout physics and engineering. In machine learning, understanding these operators helps with normalizing flows, score-based diffusion models, and physics-informed neural networks.

The Big Picture

Vector fields are everywhere in nature: velocity fields of fluids, electric and magnetic fields, gravitational fields, and gradient fields in optimization. To understand these fields deeply, we need tools to measure their fundamental properties at each point.

Two questions arise naturally:

Is there a source or sink here? Does the field expand outward or contract inward at this point? This is what divergence measures.
Is there rotation here? Does the field curl around this point, and if so, in what direction? This is what curl measures.

Operator	Symbol	Input	Output	Measures
Gradient	$\nabla f$	Scalar field	Vector field	Direction of steepest ascent
Divergence	$\nabla \cdot \mathbf{F}$	Vector field	Scalar field	Expansion/contraction
Curl	$\nabla \times \mathbf{F}$	Vector field	Vector field	Rotation

The Del Operator: The symbol

\nabla = \langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \rangle

(pronounced "del" or "nabla") is a vector of differential operators. When we write

\nabla f

\nabla \cdot \mathbf{F}

, or

\nabla \times \mathbf{F}

, we're applying this operator in different ways.

Historical Context

The concepts of divergence and curl emerged in the 19th century from the work of physicists and mathematicians trying to understand electromagnetism and fluid dynamics.

James Clerk Maxwell (1831-1879) unified electricity, magnetism, and light into a single theory expressed through four equations—now called Maxwell's equations—that use divergence and curl as their primary language. His famous equations include:

$\nabla \cdot \mathbf{E} = \rho/\varepsilon_0$ (Gauss's law: charges are sources of electric field)
$\nabla \cdot \mathbf{B} = 0$ (No magnetic monopoles)
$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$ (Faraday's law)
$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$ (Ampère-Maxwell law)

William Rowan Hamilton developed quaternion calculus, from which modern vector analysis emerged. Peter Guthrie Tait and Josiah Willard Gibbs then developed the modern vector notation we use today.

Divergence

Definition and Formulas

The divergence of a vector field measures the net outward flux per unit volume at each point. It tells us whether the field is expanding outward (like a source) or contracting inward (like a sink) at that location.

For a 3D vector field $\mathbf{F} = \langle P, Q, R \rangle$ :

\text{div}(\mathbf{F}) = \nabla \cdot \mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}

In 2D with $\mathbf{F} = \langle P, Q \rangle$ :

\text{div}(\mathbf{F}) = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}

Reading the Formula: The divergence adds up how much each component of the field is changing in its own direction. If

\partial P/\partial x > 0

, the x-component of the field is increasing as we move in the x-direction—the field is spreading out in x.

Interactive: Divergence Visualizer

Explore how divergence measures expansion and contraction in vector fields. Watch the particles spread apart (positive divergence) or converge (negative divergence):

Loading divergence visualizer...

Physical Interpretation

The Flux Interpretation: Consider a tiny box centered at a point. Divergence measures the net rate at which "stuff" (fluid, electric field lines, etc.) flows outward through the faces of the box, per unit volume.

Divergence	Sign	Physical Meaning	Example
Positive	$\text{div}(\mathbf{F}) > 0$	Source: more flows out than in	Radiating heat source
Negative	$\text{div}(\mathbf{F}) < 0$	Sink: more flows in than out	Drain in a bathtub
Zero	$\text{div}(\mathbf{F}) = 0$	Incompressible: balanced flow	Water in a closed pipe

Incompressibility: In fluid dynamics, if

\nabla \cdot \mathbf{v} = 0

everywhere, the fluid is incompressible—like water. The flow neither creates nor destroys fluid volume.

Curl

Definition and Formulas

The curl of a vector field measures the local rotation or "circulation density" at each point. It tells us how much the field rotates around that point and in which direction (the axis of rotation).

For a 3D vector field $\mathbf{F} = \langle P, Q, R \rangle$ :

\text{curl}(\mathbf{F}) = \nabla \times \mathbf{F} = \left\langle \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}, \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}, \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right\rangle

This can be remembered as a symbolic determinant:

\nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ P & Q & R \end{vmatrix}

In 2D with $\mathbf{F} = \langle P, Q \rangle$ , the curl reduces to a scalar (the z-component):

\text{curl}(\mathbf{F}) = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}

Interactive: Curl Visualizer

Explore how curl measures rotation in vector fields. Watch the paddle wheels spin—positive curl means counterclockwise rotation (by the right-hand rule):

Loading curl visualizer...

Interactive: 3D Curl Visualization

In 3D, the curl is a vector pointing along the axis of rotation. Explore how the curl vector varies with position in different 3D vector fields:

Loading 3D curl visualizer...

Physical Interpretation

The Paddle Wheel Test: Imagine placing a tiny paddle wheel at a point in a fluid flow. The curl vector points along the axis about which the paddle would spin, and its magnitude indicates how fast it would spin.

Curl	Meaning	Physical Example
$\text{curl}(\mathbf{F}) \neq \mathbf{0}$	Field has local rotation	Vortex in water, magnetic field around a wire
$\text{curl}(\mathbf{F}) = \mathbf{0}$	Field is irrotational	Gravitational field, electrostatic field
$\|\text{curl}(\mathbf{F})\|$	Strong local rotation	Tornado, tight circulation

Irrotational = Conservative: A vector field with zero curl everywhere (on a simply connected domain) is conservative—it's the gradient of some scalar function. This means path independence for line integrals!

The Laplacian

The Laplacian is another fundamental operator formed by composing divergence and gradient:

\nabla^2 f = \nabla \cdot (\nabla f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}

The Laplacian measures how much a function differs from its local average. It appears in:

Heat equation: $\frac{\partial u}{\partial t} = k \nabla^2 u$
Wave equation: $\frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u$
Laplace's equation: $\nabla^2 f = 0$ (harmonic functions)
Schrödinger equation: in quantum mechanics

Graph Laplacian in ML: In graph neural networks and spectral clustering, the discrete analogue of the Laplacian measures how node values differ from their neighbors' average—the same concept in a discrete setting!

Vector Calculus Identities

Several identities connect gradient, divergence, and curl. The two most important are:

\nabla \times (\nabla f) = \mathbf{0}

The curl of any gradient is zero

\nabla \cdot (\nabla \times \mathbf{F}) = 0

The divergence of any curl is zero

These identities follow from the equality of mixed partial derivatives and have deep physical significance:

$\nabla \times (\nabla f) = \mathbf{0}$ : Conservative (gradient) fields have no rotation. Energy is conserved.
$\nabla \cdot (\nabla \times \mathbf{F}) = 0$ : Curl fields have no sources or sinks. This is why magnetic field lines form closed loops (no magnetic monopoles).

Interactive: Operator Identities

Hand Walkthrough: Do It Yourself

Pick up a pencil. We will walk one vector field through every operator in this section by hand: compute its divergence, its curl, its Laplacian-of-a-related-potential, and verify the identity $\nabla \times (\nabla f) = \mathbf{0}$ . One example, all the ideas — nothing hidden, every partial derivative shown.

Click to expand: trace

\mathbf{F}(x,y,z) = \langle x^2 y,\; y^2 z,\; z^2 x \rangle

at the point

(1, 2, 3)

We work with the vector field $\mathbf{F}(x,y,z) = \langle P, Q, R \rangle = \langle x^2 y,\; y^2 z,\; z^2 x \rangle$ and evaluate at the test point $(x,y,z) = (1, 2, 3)$ . The components are:

$P(x,y,z) = x^2 y$
$Q(x,y,z) = y^2 z$
$R(x,y,z) = z^2 x$

Step 1 — Compute every partial derivative

The divergence and curl together need nine partial derivatives: each of $P, Q, R$ with respect to each of $x, y, z$ . Let's lay them out in a $3 \times 3$ table so nothing gets lost.

	$\partial/\partial x$	$\partial/\partial y$	$\partial/\partial z$
$P = x^2 y$	$2xy$	$x^2$	$0$
$Q = y^2 z$	$0$	$2yz$	$y^2$
$R = z^2 x$	$z^2$	$0$	$2zx$

For example, $\partial P / \partial x = \partial(x^2 y)/\partial x = 2xy$ (treat $y$ as a constant, differentiate $x^2$ ). Similarly $\partial R / \partial z = \partial(z^2 x)/\partial z = 2zx$ .

Step 2 — Divergence

The divergence reads only the diagonal entries of the table:

$\nabla \cdot \mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z} = 2xy + 2yz + 2zx$

Plug in the test point $(1, 2, 3)$ :

$\nabla \cdot \mathbf{F}\big|_{(1,2,3)} = 2(1)(2) + 2(2)(3) + 2(3)(1) = 4 + 12 + 6 = 22$

Meaning: A tiny ball of fluid centered at $(1,2,3)$ would expand at a rate of 22 units of volume per unit volume per unit time if the velocity field were $\mathbf{F}$ . The point is a strong source.

Step 3 — Curl, component by component

The curl reads the off-diagonal entries with a specific sign pattern:

$\nabla \times \mathbf{F} = \left\langle \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z},\; \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x},\; \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right\rangle$

x-component:

$(\nabla \times \mathbf{F})_x = \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} = 0 - y^2 = -y^2$

y-component:

$(\nabla \times \mathbf{F})_y = \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} = 0 - z^2 = -z^2$

z-component:

$(\nabla \times \mathbf{F})_z = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 0 - x^2 = -x^2$

So the curl field is:

$\nabla \times \mathbf{F} = \langle -y^2,\; -z^2,\; -x^2 \rangle$

At the test point $(1,2,3)$ :

$\nabla \times \mathbf{F}\big|_{(1,2,3)} = \langle -4,\; -9,\; -1 \rangle$

Meaning: A tiny paddle wheel placed at $(1,2,3)$ would spin around the axis pointing in direction $\langle -4, -9, -1 \rangle$ . The magnitude is $\sqrt{16 + 81 + 1} = \sqrt{98} \approx 9.90$ radians per unit time. The dominant axis is $-\mathbf{j}$ : the rotation lies mostly in the $xz$ -plane.

Step 4 — Sanity check: divergence of the curl

The identity $\nabla \cdot (\nabla \times \mathbf{F}) = 0$ says: the curl field has no sources or sinks anywhere. Let's verify it for our specific $\nabla \times \mathbf{F} = \langle -y^2, -z^2, -x^2 \rangle$ :

$\nabla \cdot (\nabla \times \mathbf{F}) = \frac{\partial(-y^2)}{\partial x} + \frac{\partial(-z^2)}{\partial y} + \frac{\partial(-x^2)}{\partial z} = 0 + 0 + 0 = 0 \;\checkmark$

Each term involves a derivative of a variable that doesn't appear in that component — which is precisely why the identity holds for any sufficiently smooth $\mathbf{F}$ : the mixed partials of $P,Q,R$ cancel in pairs (Clairaut's theorem).

Step 5 — Sanity check: curl of a gradient

Now take the scalar potential $f(x,y,z) = x^2 + y^2 + z^2$ . Its gradient is the radial source field:

$\nabla f = \langle 2x, 2y, 2z \rangle$

Compute the curl of this gradient:

$\nabla \times (\nabla f) = \left\langle \frac{\partial(2z)}{\partial y} - \frac{\partial(2y)}{\partial z},\; \frac{\partial(2x)}{\partial z} - \frac{\partial(2z)}{\partial x},\; \frac{\partial(2y)}{\partial x} - \frac{\partial(2x)}{\partial y} \right\rangle = \langle 0, 0, 0 \rangle \;\checkmark$

All six derivatives vanish because each component of $\nabla f$ depends on only one variable. Identity confirmed.

Step 6 — Laplacian of the potential

For the same $f = x^2 + y^2 + z^2$ :

$\nabla^2 f = \nabla \cdot (\nabla f) = \frac{\partial(2x)}{\partial x} + \frac{\partial(2y)}{\partial y} + \frac{\partial(2z)}{\partial z} = 2 + 2 + 2 = 6$

The Laplacian is constant 6. Compare with the divergence of $\mathbf{F}$ at the same point (22): notice that $\mathbf{F}$ is not a gradient field — its divergence depends on position, and as we found, its curl is non-zero.

Summary of what you computed

Quantity	Symbolic	$\text{Value at } (1,2,3)$
$\nabla \cdot \mathbf{F}$	$2xy + 2yz + 2zx$	22
$\nabla \times \mathbf{F}$	$\langle -y^2,\; -z^2,\; -x^2 \rangle$	$\langle -4,\; -9,\; -1 \rangle$
$\|\nabla \times \mathbf{F}\|$	$\sqrt{y^4 + z^4 + x^4}$	$\sqrt{98} \approx 9.90$
$\nabla \cdot (\nabla \times \mathbf{F})$	0	0
$\nabla f \text{ for } f = x^2 + y^2 + z^2$	$\langle 2x,\; 2y,\; 2z \rangle$	$\langle 2,\; 4,\; 6 \rangle$
$\nabla \times (\nabla f)$	$\langle 0,\; 0,\; 0 \rangle$	$\langle 0,\; 0,\; 0 \rangle$
$\nabla^2 f$	6	6

The big takeaway: divergence reads the diagonal of the partial-derivative table, curl reads the off-diagonal entries with a signed pattern, and the two great identities

\nabla \cdot (\nabla \times \mathbf{F}) = 0

and

\nabla \times (\nabla f) = \mathbf{0}

hold because mixed partial derivatives commute. Every vector calculus identity in this section is a disguised Clairaut's theorem.

Worked Examples

Example 1: Find the divergence and curl of $\mathbf{F} = \langle x^2, y^2, z^2 \rangle$ .

Solution:

Divergence: $\nabla \cdot \mathbf{F} = \frac{\partial}{\partial x}(x^2) + \frac{\partial}{\partial y}(y^2) + \frac{\partial}{\partial z}(z^2) = 2x + 2y + 2z$

Curl: The partial derivatives needed are $\frac{\partial R}{\partial y} = 0$ , $\frac{\partial Q}{\partial z} = 0$ , etc. All cross-derivatives vanish, so $\nabla \times \mathbf{F} = \langle 0, 0, 0 \rangle$ .

This field is the gradient of

f = \frac{1}{3}(x^3 + y^3 + z^3)

, so its curl must be zero!

Example 2: Find the divergence and curl of $\mathbf{F} = \langle -y, x, 0 \rangle$ .

Solution:

Divergence: $\nabla \cdot \mathbf{F} = \frac{\partial}{\partial x}(-y) + \frac{\partial}{\partial y}(x) + \frac{\partial}{\partial z}(0) = 0 + 0 + 0 = 0$

This is an incompressible flow (like a vortex in water).

Curl: $\nabla \times \mathbf{F} = \langle 0 - 0, 0 - 0, 1 - (-1) \rangle = \langle 0, 0, 2 \rangle$

The curl is constant, pointing in the +z direction with magnitude 2. This represents uniform counterclockwise rotation in the xy-plane.

Example 3: Verify that $\nabla \times (\nabla f) = \mathbf{0}$ for $f = xyz$ .

Solution: First compute $\nabla f = \langle yz, xz, xy \rangle$ .

Now compute the curl: $\nabla \times (\nabla f) = \langle \frac{\partial(xy)}{\partial y} - \frac{\partial(xz)}{\partial z}, \frac{\partial(yz)}{\partial z} - \frac{\partial(xy)}{\partial x}, \frac{\partial(xz)}{\partial x} - \frac{\partial(yz)}{\partial y} \rangle$

$= \langle x - x, y - y, z - z \rangle = \langle 0, 0, 0 \rangle$

This confirms the identity.

Physical Applications

Electromagnetism

Maxwell's equations are the foundation of classical electromagnetism, and they are written entirely in terms of divergence and curl:

Equation	Name	Physical Meaning
$\nabla \cdot \mathbf{E} = \rho/\varepsilon_0$	Gauss's Law (Electric)	Charges are sources of electric field
$\nabla \cdot \mathbf{B} = 0$	Gauss's Law (Magnetic)	No magnetic monopoles
$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$	Faraday's Law	Changing magnetic fields create electric fields
$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$	Ampère-Maxwell Law	Currents and changing electric fields create magnetic fields

The identity $\nabla \cdot (\nabla \times \mathbf{B}) = 0$ implies charge conservation, since taking divergence of the Ampère-Maxwell law gives the continuity equation.

Fluid Dynamics

In fluid mechanics:

Vorticity: $\boldsymbol{\omega} = \nabla \times \mathbf{v}$ measures local rotation of fluid elements. Vortices are regions of high vorticity magnitude.
Incompressibility: $\nabla \cdot \mathbf{v} = 0$ describes fluids like water that maintain constant density.
Stream functions: For 2D incompressible flow, $\mathbf{v} = \nabla \times (\psi \mathbf{k})$ automatically satisfies incompressibility.

Connection to Machine Learning

Curl and divergence appear in several modern machine learning contexts:

Normalizing Flows: Continuous normalizing flows use the change of variables formula involving the divergence of the velocity field. The divergence controls how probability density changes as samples flow through the transformation.
Score-Based Diffusion Models: The score function $\nabla \log p(x)$ is a gradient field (hence has zero curl). Score matching leverages this structure for density estimation.
Physics-Informed Neural Networks (PINNs): When solving PDEs involving curl and divergence (like Maxwell's equations or Navier-Stokes), neural networks must learn to satisfy these differential constraints.
Helmholtz Decomposition: Any sufficiently smooth vector field can be decomposed into a curl-free part (gradient) and a divergence-free part (curl of something). This decomposition is used in analyzing learned vector fields.
Graph Laplacian: In graph neural networks, the graph Laplacian is the discrete analogue of the continuous Laplacian, measuring diffusion and smoothness on graph-structured data.

The Gradient Flow Connection: Gradient descent follows the negative gradient of the loss function. Since this is a gradient field, it has zero curl—the optimization path has no "rotation." Understanding when and why flows are irrotational helps analyze convergence properties.

Python Implementation

Here's a complete Python implementation for computing divergence and curl numerically and verifying the fundamental identities:

Computing Divergence and Curl in Python

🐍python

Explanation(11)

Code(75)

1Import NumPy

We only need NumPy in this trace — for the final small array of curl components and for printing a clean vector. No plotting library needed because the curl-divergence-visualizer above already handles the graphics.

4Define the three components P, Q, R

We hard-code the same vector field we worked out by hand in the collapsible walkthrough: F = (x²y, y²z, z²x). Splitting the field into three Python functions (one per component) keeps the partial-derivative code below uniform — we will pass any of P, Q, R, plus an axis (0, 1, or 2), into the same partial() helper.

EXAMPLE

P(1, 2, 3) = 1² · 2 = 2
Q(1, 2, 3) = 2² · 3 = 12
R(1, 2, 3) = 3² · 1 = 9

9The universal partial-derivative helper

Given any scalar function f(x,y,z), an axis index (0=x, 1=y, 2=z), and a base point, this returns the central-difference approximation of ∂f/∂(axis). One helper handles all nine partials we will need — both for divergence and for curl. Notice how we build p_plus and p_minus by copying the point and bumping only the chosen coordinate by ±h: that's the geometric meaning of a *partial* derivative — vary one variable, hold the others fixed.

EXAMPLE

axis = 0, point = (1.0, 2.0, 3.0), h = 1e-5
→ p_plus  = [1.00001, 2.0, 3.0]
→ p_minus = [0.99999, 2.0, 3.0]
→ partial(P, 0, point) ≈ ( P(1.00001, 2, 3) − P(0.99999, 2, 3) ) / 2e-5
                       ≈ ( 2.00004 − 1.99996 ) / 2e-5
                       ≈ 4.0   # exact: 2xy = 2·1·2 = 4 ✓

21Divergence — sum of the three diagonal partials

By definition, div(F) = ∂P/∂x + ∂Q/∂y + ∂R/∂z. We call partial() three times — each picks the diagonal entry of the 3×3 partial-derivative table from the hand walkthrough — and add them.

EXAMPLE

At (1, 2, 3):
  ∂P/∂x = 2xy = 2·1·2 = 4
  ∂Q/∂y = 2yz = 2·2·3 = 12
  ∂R/∂z = 2zx = 2·3·1 = 6
  div   = 4 + 12 + 6  = 22  ✓

28Curl — off-diagonal partials with a signed pattern

The curl uses the *six off-diagonal* entries, paired by axis. The recipe per component: • curl_x = ∂R/∂y − ∂Q/∂z (cycle y → z → x) • curl_y = ∂P/∂z − ∂R/∂x • curl_z = ∂Q/∂x − ∂P/∂y Reading the formula: for the x-component, we ask 'how does the z-component of F change as we move in y, minus how does the y-component change as we move in z?' That difference is exactly the local rate of rotation about the x-axis.

EXAMPLE

At (1, 2, 3):
  ∂R/∂y = 0,  ∂Q/∂z = y² = 4   →  curl_x = 0 − 4  = −4
  ∂P/∂z = 0,  ∂R/∂x = z² = 9   →  curl_y = 0 − 9  = −9
  ∂Q/∂x = 0,  ∂P/∂y = x² = 1   →  curl_z = 0 − 1  = −1
  curl(F) = (−4, −9, −1)  ✓

42Pick a test point and run the operators

We feed the same point (1, 2, 3) we used in the hand walkthrough into both functions. The print statements compare against the symbolic answers so the reader can spot any drift from the finite-difference approximation (h = 1e-5 keeps the error below ~1e-7 for these polynomials).

EXAMPLE

At point (1.0, 2.0, 3.0):
  divergence(F) = 22.0000     # hand:  22
  curl(F)       = [-4. -9. -1.] # hand: [-4, -9, -1]

51Build scalar accessors for each curl component

We need to take the divergence of curl(F), which means we must compute partial derivatives *of the curl* itself. But our curl() returns a vector all-at-once, not three separate functions. This helper closes over an index i and returns a scalar function (x,y,z) → curl((x,y,z))[i] so partial() can act on it line-by-line. Three calls, three scalar functions: CX, CY, CZ.

57Verify the identity div(curl F) = 0

Add ∂(curl_x)/∂x + ∂(curl_y)/∂y + ∂(curl_z)/∂z. By Clairaut's theorem (mixed partials commute), this *must* be exactly zero for any smooth F. Numerically we expect a tiny round-off — typically smaller than 1e-7 — not a clean zero, because the finite difference samples F at four different perturbed points per component.

EXAMPLE

Expected analytic value: 0
Observed numerical value:  ≈ 1e-8 or smaller
The scientific notation (e.g. 3.21e-08) tells you the identity holds up to floating-point noise.

66Define the scalar potential f = x² + y² + z²

To verify the other identity, curl(∇f) = 0, we need a scalar function whose gradient field is easy to recognize. The radial sum-of-squares is convenient: its gradient is the 'source field' ⟨2x, 2y, 2z⟩, which we already met in the divergence visualizer above.

67Define the gradient as three scalar functions

Same trick as for curl_component: we wrap each ∂f/∂x_i in its own (x,y,z) → scalar function so we can feed it back into partial(). These three functions *together* are the vector field ∇f.

EXAMPLE

grad_f_x(1, 2, 3) ≈ 2.0  # exact: 2x = 2
grad_f_y(1, 2, 3) ≈ 4.0  # exact: 2y = 4
grad_f_z(1, 2, 3) ≈ 6.0  # exact: 2z = 6

72Take the curl of that gradient

Same curl recipe, applied to (grad_f_x, grad_f_y, grad_f_z). Every entry has the form ∂²f/∂x_i ∂x_j − ∂²f/∂x_j ∂x_i. By Clairaut, each pair cancels exactly — so the result is the zero vector up to round-off.

EXAMPLE

Expected: [0, 0, 0]
Observed: [ ≈1e-8,  ≈1e-8,  ≈1e-8 ]  ✓

64 lines without explanation

1import numpy as np
2
3# The vector field we will study by hand and by code:
4#     F(x, y, z) = (x^2 * y,  y^2 * z,  z^2 * x)
5def P(x, y, z): return x * x * y       # F_x
6def Q(x, y, z): return y * y * z       # F_y
7def R(x, y, z): return z * z * x       # F_z
8
9def partial(f, axis, point, h=1e-5):
10    """Central-difference approximation of the partial derivative of f
11    along the given axis at the given (x, y, z) point.
12
13    axis = 0 -> ∂f/∂x,  axis = 1 -> ∂f/∂y,  axis = 2 -> ∂f/∂z
14    """
15    p_plus  = list(point)
16    p_minus = list(point)
17    p_plus[axis]  += h
18    p_minus[axis] -= h
19    return (f(*p_plus) - f(*p_minus)) / (2 * h)
20
21def divergence(point):
22    """div(F) = ∂P/∂x + ∂Q/∂y + ∂R/∂z (a scalar)."""
23    dP_dx = partial(P, 0, point)
24    dQ_dy = partial(Q, 1, point)
25    dR_dz = partial(R, 2, point)
26    return dP_dx + dQ_dy + dR_dz
27
28def curl(point):
29    """curl(F) = ( ∂R/∂y - ∂Q/∂z,
30                   ∂P/∂z - ∂R/∂x,
31                   ∂Q/∂x - ∂P/∂y )   (a vector)."""
32    dR_dy = partial(R, 1, point)
33    dQ_dz = partial(Q, 2, point)
34    dP_dz = partial(P, 2, point)
35    dR_dx = partial(R, 0, point)
36    dQ_dx = partial(Q, 0, point)
37    dP_dy = partial(P, 1, point)
38    return np.array([dR_dy - dQ_dz,
39                     dP_dz - dR_dx,
40                     dQ_dx - dP_dy])
41
42# Test the field at the same point we worked out by hand
43point = (1.0, 2.0, 3.0)
44div_value  = divergence(point)
45curl_value = curl(point)
46
47print(f"At point {point}:")
48print(f"  divergence(F) = {div_value:.4f}    # hand:  22")
49print(f"  curl(F)       = {curl_value}       # hand:  [-4, -9, -1]")
50
51# --- Sanity check: div(curl F) = 0 everywhere ------------------------
52def curl_component(i):
53    """Return a scalar function (x,y,z) -> i-th component of curl(F)."""
54    def comp(x, y, z): return curl((x, y, z))[i]
55    return comp
56
57CX, CY, CZ = curl_component(0), curl_component(1), curl_component(2)
58
59div_of_curl = (partial(CX, 0, point) +
60               partial(CY, 1, point) +
61               partial(CZ, 2, point))
62print(f"  div(curl F)   = {div_of_curl:.4e}  # identity says: 0")
63
64# --- Sanity check: curl(grad f) = 0 for f = x^2 + y^2 + z^2 ---------
65def f(x, y, z): return x*x + y*y + z*z
66def grad_f_x(x, y, z): return partial(f, 0, (x, y, z))
67def grad_f_y(x, y, z): return partial(f, 1, (x, y, z))
68def grad_f_z(x, y, z): return partial(f, 2, (x, y, z))
69
70curl_of_grad = np.array([
71    partial(grad_f_z, 1, point) - partial(grad_f_y, 2, point),
72    partial(grad_f_x, 2, point) - partial(grad_f_z, 0, point),
73    partial(grad_f_y, 0, point) - partial(grad_f_x, 1, point),
74])
75print(f"  curl(grad f)  = {curl_of_grad}  # identity says: [0, 0, 0]")

PyTorch: Exact Curl & Divergence via Autograd

The Python code above uses finite differences — fast and simple but with ~ $10^{-7}$ error from the $h$ step size. PyTorch can compute these derivatives exactly (to machine precision) by building the full Jacobian with reverse-mode autodiff. Once you have the Jacobian $J_{ij} = \partial F_i / \partial x_j$ :

Divergence is its trace: $\nabla \cdot \mathbf{F} = \text{tr}(J) = J_{00} + J_{11} + J_{22}$
Curl is the antisymmetric part: $(\nabla \times \mathbf{F})_i = \epsilon_{ijk} J_{kj}$ (with $\epsilon$ the Levi-Civita symbol)

Reading this way makes both identities obvious. The Jacobian of a gradient field is symmetric ( $J_{ij} = \partial^2 f / \partial x_i \partial x_j = J_{ji}$ ), so its antisymmetric part — the curl — vanishes. And the divergence of a curl is a sum of $\partial^2 F_k / \partial x_i \partial x_j$ terms that cancel in pairs.

Computing Divergence and Curl with PyTorch Autograd

🐍python

Explanation(6)

Code(36)

1Import PyTorch

PyTorch gives us *exact* partial derivatives via reverse-mode automatic differentiation — no finite-difference noise. The same field F that we approximated with central differences above will now produce integer answers to the last decimal.

3Vectorized definition of F

Whereas the plain-Python version split F into three scalar functions, PyTorch likes batched tensor operations, so F returns a tensor of shape (3,) for each input point of shape (3,). torch.stack glues the three component results into a single output tensor — that's what autograd will differentiate.

EXAMPLE

F(tensor([1., 2., 3.])) = tensor([2., 12., 9.])
# (matches our hand values for P, Q, R)

9Build the full 3×3 Jacobian

Both divergence and curl are linear combinations of the *Jacobian* matrix J[i, j] = ∂F_i/∂x_j. `torch.autograd.functional.jacobian` computes the entire matrix in one shot. We first call .detach().clone().requires_grad_(True) to make sure the input is a leaf tensor with gradient tracking turned on — otherwise autograd has nothing to record.

EXAMPLE

Jacobian at (1, 2, 3):
  [[ 2xy   x²    0   ]   [[ 4.   1.   0. ]
   [ 0    2yz   y²  ] =  [ 0.   12.  4. ]
   [ z²   0    2zx  ]]   [ 9.   0.   6. ]]

15Divergence = trace of the Jacobian

div(F) = ∂P/∂x + ∂Q/∂y + ∂R/∂z is *exactly* the sum of the diagonal entries of J, i.e. its trace. This is the cleanest one-liner definition of divergence: `torch.trace(jacobian(F))`.

EXAMPLE

trace([[4, 1, 0], [0, 12, 4], [9, 0, 6]]) = 4 + 12 + 6 = 22  ✓

19Curl from the antisymmetric part of the Jacobian

The curl is built from the antisymmetric part of J: each component is the difference of two mirror-image entries across the diagonal. Concretely: • curl_x = J[2, 1] − J[1, 2] (row z, col y minus row y, col z) • curl_y = J[0, 2] − J[2, 0] • curl_z = J[1, 0] − J[0, 1] If J is symmetric (as it is for any gradient field, because of Clairaut), every difference is zero — that's curl(∇f) = 0 again, made visual.

EXAMPLE

J = [[ 4,  1,  0],
     [ 0, 12,  4],
     [ 9,  0,  6]]

curl_x = J[2,1] − J[1,2] = 0 − 4  = −4
curl_y = J[0,2] − J[2,0] = 0 − 9  = −9
curl_z = J[1,0] − J[0,1] = 0 − 1  = −1
curl(F) = (−4, −9, −1)  ✓

29Run on the same hand-traced point

Same (1, 2, 3) as the collapsible walkthrough. The printout shows the full Jacobian (so you can compare to the analytic table from Step 1 of the walkthrough), then the scalar divergence (22.0000) and the curl vector ([-4.0, -9.0, -1.0]). No tolerance arguments needed — autograd produces these to machine precision.

EXAMPLE

Jacobian at (1, 2, 3):
tensor([[ 4.,  1.,  0.],
        [ 0., 12.,  4.],
        [ 9.,  0.,  6.]])

divergence(F) = 22.0000
curl(F)       = [-4.0, -9.0, -1.0]

30 lines without explanation

1import torch
2
3# Same vector field as before: F(x, y, z) = (x^2 y, y^2 z, z^2 x)
4def F(point):
5    x, y, z = point[0], point[1], point[2]
6    return torch.stack([x * x * y,
7                        y * y * z,
8                        z * z * x])
9
10def jacobian(point):
11    """Return the 3x3 Jacobian matrix J[i, j] = ∂F_i / ∂x_j  at the point.
12
13    PyTorch builds this for us by recording how F depends on its input."""
14    point = point.detach().clone().requires_grad_(True)
15    return torch.autograd.functional.jacobian(F, point)
16
17def divergence_torch(point):
18    """div(F) = trace(Jacobian) = sum of the diagonal entries."""
19    J = jacobian(point)
20    return torch.trace(J)
21
22def curl_torch(point):
23    """curl(F) read off the off-diagonal entries of the Jacobian:
24        curl_x = J[2,1] - J[1,2]    (∂R/∂y − ∂Q/∂z)
25        curl_y = J[0,2] - J[2,0]    (∂P/∂z − ∂R/∂x)
26        curl_z = J[1,0] - J[0,1]    (∂Q/∂x − ∂P/∂y)"""
27    J = jacobian(point)
28    return torch.stack([J[2, 1] - J[1, 2],
29                        J[0, 2] - J[2, 0],
30                        J[1, 0] - J[0, 1]])
31
32point = torch.tensor([1.0, 2.0, 3.0])
33print("Jacobian at (1, 2, 3):")
34print(jacobian(point))
35print(f"\ndivergence(F) = {divergence_torch(point).item():.4f}")
36print(f"curl(F)       = {curl_torch(point).tolist()}")

Why this matters for ML: The Jacobian view is what physics-informed neural networks (PINNs) actually compute when you ask them to satisfy

\nabla \cdot \mathbf{v} = 0

\nabla \times \mathbf{E} = -\partial \mathbf{B} / \partial t

as a soft loss. The same torch.autograd.functional.jacobian call powers their constraint terms.

Common Mistakes to Avoid

Watch Out For These Errors:

Sign errors in curl: The formula involves subtractions in a specific pattern. Remember: $\partial R/\partial y - \partial Q/\partial z$ (not the other way around) for the x-component.
Confusing 2D and 3D curl: In 2D, the "curl" is a scalar (the z-component of the 3D curl). In 3D, curl is a vector.
Forgetting that $\nabla \cdot (\nabla \times \mathbf{F}) = 0$ : If you compute a curl and then take its divergence and get something nonzero, you've made an error.
Assuming $\nabla \times \mathbf{F} = \mathbf{0}$ implies conservative: This is only true onsimply connected domains. On domains with holes, zero curl doesn't guarantee path independence.
Mixing up operators: div takes a vector and returns a scalar; grad takes a scalar and returns a vector; curl takes a vector and returns a vector. Keep track of what you're operating on!

Memory Aid for Curl Signs: Use the determinant formula with the symbolic cross product. Expand it like a regular

3 \times 3

determinant, and the signs work out automatically.

Test Your Understanding

Summary

In this section, we explored the two fundamental differential operators for vector fields:

Divergence ( $\nabla \cdot \mathbf{F}$ ) measures sources and sinks—how much a field expands or contracts at each point. Zero divergence means incompressible flow.
Curl ( $\nabla \times \mathbf{F}$ ) measures local rotation—both the axis and rate of spinning. Zero curl means the field is irrotational (conservative in simply connected domains).
Laplacian ( $\nabla^2 f = \nabla \cdot (\nabla f)$ ) measures deviation from local average, appearing in heat, wave, and Schrödinger equations.
Fundamental identities: $\nabla \times (\nabla f) = \mathbf{0}$ (gradients don't rotate) and $\nabla \cdot (\nabla \times \mathbf{F}) = 0$ (curls have no sources).

Looking Ahead: In the following sections, we'll study parametric surfaces and surface integrals, then connect everything with the major theorems of vector calculus: Stokes' Theorem (relating surface integrals of curl to line integrals) and the Divergence Theorem (relating volume integrals of divergence to surface integrals). These theorems are generalizations of the Fundamental Theorem of Calculus to higher dimensions.