Chapter 19

Vector Fields

Vector Calculus

Learning Objectives

By the end of this section you will be able to:

  1. Read a vector field as a function that assigns one arrow $\mathbf{F}(x,y) = \langle P(x,y),\,Q(x,y) \rangle$ to every point of the plane.
  2. Recognise the four canonical 2-D patterns — constant, radial, rotational, and saddle — and predict what their arrows look like before you plot them.
  3. Build a gradient field $\mathbf{F} = \nabla f$ from any smooth scalar potential and explain why $\nabla f$ is perpendicular to the level curves of $f$.
  4. Test whether a given field is conservative using $\partial Q/\partial x = \partial P/\partial y$.
  5. Interpret streamlines as solutions of the ODE $d\mathbf{r}/dt = \mathbf{F}(\mathbf{r})$ and compute them by hand for separable fields.
  6. Connect the picture to gradient descent: training a neural network is flowing through the field $-\nabla L$.

The Big Picture: A Vector at Every Point

Stand outside on a windy day. At every spot around you, the air has a velocity — a direction and a speed. The collection of all those velocity arrows, one per location, is a vector field.

Calculus has been climbing a ladder. Single-variable calculus handled functions $f:\mathbb{R}\to\mathbb{R}$: one number in, one number out. Multivariable calculus added scalar fields $f:\mathbb{R}^2\to\mathbb{R}$, such as a temperature at every point on a map. Now we take the next rung:

$$\mathbf{F}:\mathbb{R}^2 \;\longrightarrow\; \mathbb{R}^2$$

A function whose input is a point of the plane and whose output is a vector. Once you can see this, you start seeing it everywhere.

🌊 Velocity of a fluid

At each point in a river, water has a velocity $\mathbf{v}(x,y,z)$. The map of all those arrows is the flow field.

⚡ Electric field

$\mathbf{E}(x,y,z)$ tells you the force a unit charge would feel if placed at that point. The whole field is the function.

🏔️ Gravity near a planet

$\mathbf{g}(\mathbf{r}) = -\,GM\,\mathbf{r}/|\mathbf{r}|^{3}$: a vector at every point telling a test mass which way to fall.

🤖 Loss gradient in ML

$\nabla L(\theta_1,\theta_2,\dots)$ is a vector at every point of parameter space. Gradient descent is flowing along $-\nabla L$.

Scalar field vs vector field

A scalar field assigns a number to each point: $f(x,y)\in\mathbb{R}$. A vector field assigns a 2-D (or 3-D) arrow to each point: $\mathbf{F}(x,y)\in\mathbb{R}^2$. The gradient operator $\nabla$ is the bridge that turns a scalar field into a vector field.


From One Vector to a Whole Field

Up to now you have thought of a vector as one arrow living somewhere in space. A vector field is the leap from that single arrow to a whole choreography: a rule that gives you a different arrow at every location.

Here is the mental model in three steps.

  1. Pick a point. Drop a pin at $(x_0, y_0)$ on the plane.
  2. Look up two numbers. The field gives you $P(x_0, y_0)$ (the horizontal component) and $Q(x_0, y_0)$ (the vertical component).
  3. Draw the arrow. Start at $(x_0, y_0)$, go $P$ units right and $Q$ units up. That arrow is the value $\mathbf{F}(x_0, y_0)$.

Repeat this at thousands of points and you get the pictures matplotlib calls quiver plots. The direction of an arrow says where the field is pushing; the length says how strongly.

The field is not the arrows

The arrows are a picture of the field. The field itself is the function $(x,y)\mapsto(P,Q)$. We just happen to draw a finite sample of values because we cannot fit infinitely many arrows on a page.
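The pick-a-point, look-up, draw-the-arrow recipe can be sketched in plain Python. This is only an illustration: the helper names `rotational_field` and `sample_field` are hypothetical, and the 3-by-3 grid is an arbitrary stand-in for the thousands of points a real quiver plot would sample.

```python
def rotational_field(x, y):
    """Example field F(x, y) = <-y, x>; returns the pair (P, Q)."""
    return (-y, x)


def sample_field(field, xs, ys):
    """Evaluate `field` at every grid point (x, y) with x in xs and y in ys."""
    return {(x, y): field(x, y) for x in xs for y in ys}


# The field is the function; `samples` is just a finite picture of it.
samples = sample_field(rotational_field, [-1, 0, 1], [-1, 0, 1])

for point, arrow in sorted(samples.items()):
    print(f"F{point} = {arrow}")
```

Swapping in a different function for `rotational_field` changes the field without touching the sampling code, which mirrors the distinction between the field and its drawing.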


Definition of a Vector Field

Definition (2-D vector field)

Let $D\subseteq\mathbb{R}^2$. A vector field on $D$ is a function $\mathbf{F}:D\to\mathbb{R}^2$ that assigns to each point $(x,y)\in D$ a 2-D vector

$$\mathbf{F}(x,y) \;=\; P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j} \;=\; \langle\,P(x,y),\; Q(x,y)\,\rangle.$$

The scalar functions $P$ and $Q$ are called the component functions of $\mathbf{F}$.

Three equivalent notations

| Notation | How to read it |
| --- | --- |
| $\mathbf{F}(x,y) = P\,\mathbf{i} + Q\,\mathbf{j}$ | Components along the standard unit vectors |
| $\mathbf{F}(x,y) = \langle P, Q\rangle$ | Angle brackets, emphasising 'this is a vector' |
| $\mathbf{F} = (P, Q)$ | Shorthand when $(x, y)$ is clear from context |

Continuity

The field $\mathbf{F}$ is continuous at a point exactly when both components $P$ and $Q$ are continuous there. We will usually deal with smooth fields, except at isolated singular points where the field blows up or is not defined (origin of a radial field, the wire in a magnetic field, etc.).


The Four Canonical 2D Fields

Almost every example you will meet is built from these four building blocks — possibly added, scaled, or composed. Learn them once and you can recognise pieces of them in every other picture.

1. Constant field — uniform wind

$\mathbf{F}(x,y) = \langle a, b\rangle$. Every arrow is identical: same direction, same length, everywhere. Think of a uniform wind, or the gravitational field near Earth's surface treated locally as the constant $\langle 0, -g\rangle$.

2. Radial field — explosion or shrink toward a point

$\mathbf{F}(x,y) = \langle x, y\rangle$. The arrow at point $(x,y)$ is simply the position vector itself: it points directly away from the origin and grows linearly in length. Replace $\langle x,y\rangle$ with $-\langle x,y\rangle$ and the field points inward (a sink).

3. Rotational field — pure spin

$\mathbf{F}(x,y) = \langle -y, x\rangle$. This is the position vector rotated by $90^\circ$ counter-clockwise. At every point, the arrow is perpendicular to the line from the origin, which is exactly the velocity field of a rigid disc spinning about the origin with angular speed 1.

Quick check: try four points

At $(1,0)$, $\mathbf{F}=\langle 0,1\rangle$ (points up). At $(0,1)$, $\mathbf{F}=\langle -1,0\rangle$ (points left). At $(-1,0)$, $\mathbf{F}=\langle 0,-1\rangle$ (points down). At $(0,-1)$, $\mathbf{F}=\langle 1,0\rangle$ (points right). The arrows trace a CCW circle.

4. Saddle field — push along x, pull along y

$\mathbf{F}(x,y) = \langle x, -y\rangle$. This is the gradient of the saddle surface $\tfrac12 x^2 - \tfrac12 y^2$. The field pushes outward along the x-axis and pulls inward along the y-axis. The origin is an unstable equilibrium: perturbations along x grow, perturbations along y shrink.


Interactive Playground

Time to play. Pick a field, then drag the white dot to position your probe anywhere on the plane. The amber arrow shows $\mathbf{F}$ at the probe; the readout in the top-left gives the exact value of $P$, $Q$, $|\mathbf{F}|$, and the angle in degrees.

Toggle equal-length arrows to see direction information without magnitude bias (great for the radial field, where corner arrows would otherwise be huge and central ones nearly invisible).

[Interactive figure: vector field playground]

Three experiments to try right now

(1) On the radial field, drag the probe to (3, 0): the readout should be $\mathbf{F}=\langle 3,0\rangle$, $|\mathbf{F}|=3$, angle $0^\circ$. (2) On the rotational field, drag along the unit circle: the magnitude stays at 1, the angle rotates with you. (3) On the saddle field, drag along the line $y=x$: components have equal magnitude but opposite signs, so the arrow always points along $y=-x$.


Worked Example: Tracing $\mathbf{F} = \langle -y, x\rangle$ by Hand

The rotational field $\mathbf{F}(x,y)=\langle -y,x\rangle$ is the single most useful 2-D example. Let's evaluate it at seven points, find the magnitude and angle of each, and verify the deep geometric fact that every arrow is tangent to a circle centred at the origin.

Try each row yourself first, then compare with the full hand-worked solution below (7 points + tangency proof).

Step 1 — Evaluate F at each point

Just plug each $(x,y)$ into $\mathbf{F}(x,y)=\langle -y,x\rangle$:

| $(x, y)$ | $\mathbf{F} = \langle -y, x\rangle$ | $|\mathbf{F}| = \sqrt{x^2 + y^2}$ | angle (deg) | tangent to circle? |
| --- | --- | --- | --- | --- |
| $(1, 0)$ | $\langle 0, 1\rangle$ | 1 | $90^\circ$ | ✓ tangent to $r = 1$ |
| $(0, 1)$ | $\langle -1, 0\rangle$ | 1 | $180^\circ$ | ✓ tangent to $r = 1$ |
| $(-1, 0)$ | $\langle 0, -1\rangle$ | 1 | $-90^\circ$ | ✓ tangent to $r = 1$ |
| $(0, -1)$ | $\langle 1, 0\rangle$ | 1 | $0^\circ$ | ✓ tangent to $r = 1$ |
| $(2, 0)$ | $\langle 0, 2\rangle$ | 2 | $90^\circ$ | ✓ tangent to $r = 2$ (faster) |
| $(1, 1)$ | $\langle -1, 1\rangle$ | $\sqrt{2} \approx 1.41$ | $135^\circ$ | ✓ tangent to $r = \sqrt{2}$ |
| $(-2, 1)$ | $\langle -1, -2\rangle$ | $\sqrt{5} \approx 2.24$ | $\approx -117^\circ$ | ✓ tangent to $r = \sqrt{5}$ |

Step 2 — Verify tangency at one point

The circle of radius $r$ centred at the origin is the level set $x^2+y^2 = r^2$. A tangent to that circle at $(x,y)$ is perpendicular to the radial vector $\langle x,y\rangle$. Check:

$$\langle x,y\rangle \cdot \langle -y,x\rangle \;=\; -xy + yx \;=\; 0.$$

The dot product is identically zero, for every $(x,y)$. So $\mathbf{F}$ is perpendicular to the position vector everywhere, which is the same as saying it is tangent to the circle of radius $r=\sqrt{x^2+y^2}$.

Step 3 — Magnitude grows linearly with r

From the table: at $r=1$, $|\mathbf{F}|=1$; at $r=2$, $|\mathbf{F}|=2$; at $r=\sqrt{5}$, $|\mathbf{F}|=\sqrt{5}$. In general,

$$|\mathbf{F}(x,y)| \;=\; \sqrt{(-y)^2 + x^2} \;=\; \sqrt{x^2+y^2} \;=\; r.$$

Physical reading: this is the velocity field of a rigid disc spinning about the origin with angular speed $\omega = 1$ rad/s. A point at radius $r$ moves at linear speed $\omega r = r$.

Step 4 — Sanity-check on the interactive viewer

Pop open the playground above, select Rotational, then drag the probe to each of the seven points in the table. The on-screen $P$, $Q$, $|\mathbf{F}|$, and angle should match these values to two decimals.
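If the viewer is not at hand, the same seven rows can be recomputed with a short script. This is a sketch using only the standard library; the angle convention (degrees in the range from -180 to 180, via `math.atan2`) is an assumption chosen to match the table.

```python
import math


def F(x, y):
    """Rotational field F(x, y) = <-y, x>."""
    return (-y, x)


points = [(1, 0), (0, 1), (-1, 0), (0, -1), (2, 0), (1, 1), (-2, 1)]

rows = []
for x, y in points:
    P, Q = F(x, y)
    magnitude = math.hypot(P, Q)                  # |F| = sqrt(P^2 + Q^2)
    angle_deg = math.degrees(math.atan2(Q, P))    # direction of the arrow
    radial_dot = x * P + y * Q                    # <x, y> . F, zero means tangent
    rows.append((x, y, P, Q, magnitude, angle_deg, radial_dot))
    print(f"({x:>2}, {y:>2})  F = <{P}, {Q}>  |F| = {magnitude:.2f}  "
          f"angle = {angle_deg:7.2f}  radial . F = {radial_dot}")
```

The last column is the dot product from Step 2; it comes out as zero at every sample point, which is the tangency condition.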


Stepping Up to 3D

Definition (3-D vector field)

A vector field on $E\subseteq\mathbb{R}^3$ is a function $\mathbf{F}:E\to\mathbb{R}^3$ with three component functions:

$$\mathbf{F}(x,y,z) = P\,\mathbf{i} + Q\,\mathbf{j} + R\,\mathbf{k} = \langle P,\,Q,\,R\rangle.$$

Four 3-D examples that run physics

| Field | Formula ($\mathbf{r}$ = position vector) | What it does |
| --- | --- | --- |
| Gravity (point mass M) | $\mathbf{F} = -GMm \cdot \mathbf{r} / |\mathbf{r}|^3$ | Pulls a test mass m toward M; inverse-square magnitude |
| Coulomb (point charge q) | $\mathbf{E} = kq \cdot \mathbf{r} / |\mathbf{r}|^3$ | Radial, away from $+$ charge / toward $-$ charge |
| Magnetic (straight wire) | $\mathbf{B} = \frac{\mu_0 I}{2\pi r}\,\hat{\boldsymbol{\varphi}}$ | Circles around the wire at distance $r$ |
| Uniform flow | $\mathbf{v} = \langle a, b, c\rangle$ | Same arrow everywhere: bulk transport |

The inverse-square law

Many 3-D fields share the family form $\mathbf{F}(\mathbf{r}) = \dfrac{c}{|\mathbf{r}|^{2}}\,\hat{\mathbf{r}}$. The magnitude falls off as $1/r^{2}$ because the field spreads over a sphere of surface area $4\pi r^{2}$: the further you are, the more dilute the flux. This single picture explains gravity, electrostatics, and how a flashlight grows dimmer with distance.
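The dilution argument can be checked numerically. The sketch below assumes a unit constant c = 1 (an arbitrary choice) and a hypothetical helper `inverse_square_magnitude`; it multiplies the field magnitude by the sphere's surface area at several radii.

```python
import math


def inverse_square_magnitude(r, c=1.0):
    """Magnitude c / r^2 of an inverse-square field at distance r."""
    return c / r ** 2


fluxes = []
for r in (1.0, 2.0, 4.0, 8.0):
    magnitude = inverse_square_magnitude(r)
    sphere_area = 4.0 * math.pi * r ** 2
    fluxes.append(magnitude * sphere_area)   # total flux through the sphere
    print(f"r = {r:4.1f}  |F| = {magnitude:.5f}  area = {sphere_area:9.3f}  "
          f"flux = {fluxes[-1]:.5f}")
```

The magnitude drops by a factor of four each time the radius doubles, while the product with the area stays at $4\pi c$.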


Gradient Fields: From Hills to Arrows

Take a scalar potential $f(x,y)$ and think of it as the height of a hill. The gradient packages its two partial derivatives into one vector:

$$\nabla f(x,y) \;=\; \left\langle\,\dfrac{\partial f}{\partial x},\;\dfrac{\partial f}{\partial y}\,\right\rangle.$$

Evaluated at every point, $\nabla f$ is itself a vector field: the gradient field of $f$.

💡 Why the gradient is perpendicular to level curves

Move along a level curve $f(x,y)=c$: $f$ does not change, so the directional derivative along the curve is 0. But the directional derivative along a unit tangent $\mathbf{t}$ is $\nabla f \cdot \mathbf{t}$. Wherever $\nabla f \neq \mathbf{0}$, the only way that dot product can be zero is if $\nabla f \perp \mathbf{t}$. Conclusion: $\nabla f$ is normal to the level set, and it points uphill (toward higher $f$).

Quick examples

Example 1. Let $f(x,y) = x^2 + y^2$. Then $\nabla f = \langle 2x, 2y\rangle$, the radial outward field scaled by 2. The level sets are circles, and the gradient points radially outward, perpendicular to them. ✓

Example 2. Let $f(x,y) = \tfrac12 x^2 - \tfrac12 y^2$. Then $\nabla f = \langle x, -y\rangle$, the saddle field. Level sets are hyperbolas $x^2 - y^2 = 2c$; the gradient is perpendicular to each branch.
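Both hand-computed gradients can be cross-checked with central finite differences. This is a rough sketch: `numerical_gradient` is a hypothetical helper, and the step size `h = 1e-5` and the test point (1, 0.5) are arbitrary choices.

```python
def numerical_gradient(f, x, y, h=1e-5):
    """Approximate (df/dx, df/dy) at (x, y) with central differences."""
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (df_dx, df_dy)


def bowl(x, y):
    return x ** 2 + y ** 2               # Example 1


def saddle(x, y):
    return 0.5 * x ** 2 - 0.5 * y ** 2   # Example 2


g_bowl = numerical_gradient(bowl, 1.0, 0.5)      # analytic value: (2x, 2y) = (2, 1)
g_saddle = numerical_gradient(saddle, 1.0, 0.5)  # analytic value: (x, -y) = (1, -0.5)

print("bowl   gradient at (1, 0.5):", g_bowl)
print("saddle gradient at (1, 0.5):", g_saddle)
```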

Interactive: potential $\leftrightarrow$ gradient field

The left panel shows the scalar potential as a coloured contour map. The right panel overlays the gradient arrows. Watch how the arrows are always normal to the level curves, and how they grow longer where the surface is steeper.

[Interactive figure: scalar potential and its gradient field]

When is a Field a Gradient?

Not every vector field comes from a potential. The rotational field $\langle -y, x\rangle$ emphatically does not. How can we tell?

Test (mixed-partials condition)

If $\mathbf{F} = \langle P, Q\rangle$ is $C^1$ on a simply-connected region and $\mathbf{F} = \nabla f$, then by equality of mixed partials,

$$\dfrac{\partial P}{\partial y} \;=\; \dfrac{\partial Q}{\partial x}.$$

The converse is true on simply-connected regions: if the equality holds, the field is conservative and a potential $f$ exists.

Applying the test

Radial field $\mathbf{F} = \langle x, y\rangle$: $\partial P/\partial y = 0$, $\partial Q/\partial x = 0$. Equal, so the field is conservative. (Indeed, $f = \tfrac12 (x^2+y^2)$ works.)

Rotational field $\mathbf{F} = \langle -y, x\rangle$: $\partial P/\partial y = -1$, $\partial Q/\partial x = +1$. Not equal, so the field is not a gradient of any scalar potential.

Curl appears here in disguise

The difference $\partial Q/\partial x - \partial P/\partial y$ is the 2-D curl (or scalar curl) of $\mathbf{F}$. On a simply-connected region, conservative fields are exactly those with zero curl. We will study curl as its own operator in Section 19.5.
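The mixed-partials test is easy to run numerically on the four canonical fields. The sketch below uses a hypothetical helper `scalar_curl` built on central differences and evaluates it at one arbitrary sample point; that is enough here only because these fields are linear, so their partial derivatives are the same everywhere.

```python
def scalar_curl(P, Q, x, y, h=1e-5):
    """Approximate dQ/dx - dP/dy at (x, y) with central differences."""
    dQ_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dP_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQ_dx - dP_dy


# Each entry is a pair of component functions (P, Q).
fields = {
    "constant":   (lambda x, y: 1.0, lambda x, y: 0.5),
    "radial":     (lambda x, y: x,   lambda x, y: y),
    "rotational": (lambda x, y: -y,  lambda x, y: x),
    "saddle":     (lambda x, y: x,   lambda x, y: -y),
}

curls = {name: scalar_curl(P, Q, 0.7, -1.3) for name, (P, Q) in fields.items()}

for name, curl in curls.items():
    verdict = "passes the test" if abs(curl) < 1e-6 else "fails the test"
    print(f"{name:<10}  scalar curl = {curl:+.6f}  ({verdict})")
```

Only the rotational field fails, with scalar curl $1 - (-1) = 2$.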


Flow Lines: A Field is a Recipe for Motion

Drop a tiny massless particle into a vector field and let it ride. At every instant, the particle's velocity equals the field at its current location. The path it traces is called a flow line or streamline.

Definition (flow line)

A flow line of the vector field $\mathbf{F}$ is a parametrised curve $\mathbf{r}(t)$ satisfying the system of ODEs

$$\dfrac{d\mathbf{r}}{dt} \;=\; \mathbf{F}\bigl(\mathbf{r}(t)\bigr).$$

For a 2-D field $\mathbf{F}=\langle P,Q\rangle$ this is the coupled system $dx/dt = P,\; dy/dt = Q$.

Solving for the rotational streamlines by hand

For $\mathbf{F}=\langle -y, x\rangle$, the ODE system is $dx/dt = -y,\; dy/dt = x$. Eliminate $t$ by computing the slope:

$$\dfrac{dy}{dx} \;=\; \dfrac{dy/dt}{dx/dt} \;=\; \dfrac{x}{-y}.$$

Separate variables: $y\,dy = -x\,dx$. Integrate both sides:

$$\tfrac12 y^2 \;=\; -\tfrac12 x^2 + C \;\;\Longrightarrow\;\; x^2 + y^2 = 2C.$$

Streamlines are circles centred at the origin — exactly what the tangency calculation in the worked example predicted, and exactly what the playground will show you.
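The same conclusion can be checked by integrating the ODE numerically. The sketch below uses a hand-written classical fourth-order Runge-Kutta step (the helper `rk4_step`, the start point (2, 0), and the 1000-step resolution are all illustrative choices) and tracks how far the particle drifts from the circle of radius 2.

```python
import math


def field(x, y):
    """Rotational field: dx/dt = -y, dy/dt = x."""
    return (-y, x)


def rk4_step(x, y, dt):
    """Advance (x, y) by one classical RK4 step of size dt."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = field(x + dt * k3x, y + dt * k3y)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x_new, y_new


n_steps = 1000
dt = 2 * math.pi / n_steps       # one full revolution in total
x, y = 2.0, 0.0
max_radius_error = 0.0

for _ in range(n_steps):
    x, y = rk4_step(x, y, dt)
    max_radius_error = max(max_radius_error, abs(math.hypot(x, y) - 2.0))

print(f"end point after one revolution: ({x:.8f}, {y:.8f})")
print(f"largest deviation from radius 2: {max_radius_error:.2e}")
```

The radius stays at 2 up to integration error, and the particle returns to its starting point after time $2\pi$, as the closed-form circles predict.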

Interactive: streamlines for five different fields

Below you can scrub through five fields and watch the streamline pattern change. Hit Animate Particles to spawn massless dots that flow along the streamlines in real time.

[Interactive figure: streamlines for five fields]

Streamlines never cross (except at fixed points)

At any point where $\mathbf{F}\neq\mathbf{0}$ there is exactly one streamline through it; uniqueness of ODE solutions guarantees this. Two streamlines can only meet at a singular point where $\mathbf{F}=\mathbf{0}$ (the velocity vanishes and any direction is consistent).


Real-World Applications

1. Fluid dynamics

Wind, water, and traffic-flow models all live on a velocity field $\mathbf{v}(x,y,z,t)$. The divergence $\nabla\cdot\mathbf{v}$ measures local expansion or compression; the curl $\nabla\times\mathbf{v}$ measures local rotation (vorticity). Both are introduced in §19.5.

2. Electromagnetism

Maxwell's equations are statements about how two vector fields, $\mathbf{E}$ and $\mathbf{B}$, couple in time and space. One of them, $\nabla\times \mathbf{E} = -\partial \mathbf{B}/\partial t$, says a changing magnetic field induces an electric one, which is the principle behind every electric generator.

3. Weather and climate

Forecasters work simultaneously with the wind vector field, the temperature scalar field, and the pressure scalar field. Pressure gradients drive wind: $\mathbf{v}\propto -\nabla p$.

4. Computer graphics

Fluid simulation, fur shading, and texture synthesis all rely on advecting particles through pre-computed vector fields. The math is the same as streamline integration.


Machine Learning Connection

Training a neural network is gradient flow on a giant loss surface. Make the connection explicit:

  1. The loss $L(\theta_1, \dots, \theta_n)$ is a scalar field on parameter space $\mathbb{R}^{n}$.
  2. The loss gradient $\nabla L$ is its gradient vector field, built by backprop in deep learning.
  3. Gradient descent $\theta_{t+1} = \theta_t - \eta\,\nabla L(\theta_t)$ is forward-Euler integration of the ODE $d\theta/dt = -\nabla L$. The training trajectory is a discretised streamline of the field $-\nabla L$.
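The forward-Euler reading can be tested on a loss whose gradient flow has a closed form. The sketch below assumes the toy loss $L = \tfrac12(\theta_1^2 + 4\theta_2^2)$, chosen only because its flow is $\theta_1(t) = \theta_1(0)e^{-t}$, $\theta_2(t) = \theta_2(0)e^{-4t}$; the function names are hypothetical. It runs gradient descent to total time $T = 1$ with smaller and smaller learning rates and measures the gap to the exact streamline.

```python
import math


def grad_L(theta):
    """Gradient of the assumed toy loss L = 0.5 * (theta1^2 + 4 * theta2^2)."""
    return (theta[0], 4.0 * theta[1])


def gradient_descent(theta, eta, n_steps):
    """Plain gradient descent, i.e. forward Euler on d(theta)/dt = -grad L."""
    for _ in range(n_steps):
        g = grad_L(theta)
        theta = (theta[0] - eta * g[0], theta[1] - eta * g[1])
    return theta


def exact_flow(theta0, t):
    """Closed-form gradient flow for the toy loss above."""
    return (theta0[0] * math.exp(-t), theta0[1] * math.exp(-4.0 * t))


theta0 = (1.0, 1.0)
total_time = 1.0
target = exact_flow(theta0, total_time)

errors = []
for n_steps in (10, 100, 1000):
    eta = total_time / n_steps            # learning rate = Euler step size
    gd = gradient_descent(theta0, eta, n_steps)
    errors.append(max(abs(gd[0] - target[0]), abs(gd[1] - target[1])))
    print(f"eta = {eta:.3f}  GD = ({gd[0]:.5f}, {gd[1]:.5f})  "
          f"gap to exact flow = {errors[-1]:.2e}")
```

The gap shrinks roughly in proportion to the learning rate, which is the first-order accuracy expected of forward Euler.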

Saddle points

Where $\nabla L = \mathbf{0}$ but the Hessian has eigenvalues of both signs. Vanilla SGD can stall here exactly like a marble on a horse's saddle: stable in some directions, unstable in others.
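A toy illustration, assuming the saddle potential from earlier in this section as the loss, $L = \tfrac12 x^2 - \tfrac12 y^2$ (which is unbounded below, so this is a sketch of the geometry and not a realistic training loss). The helper `descend` and its step count are illustrative choices.

```python
def descend(x, y, eta=0.1, n_steps=300):
    """Gradient descent on the assumed toy loss L = 0.5*x^2 - 0.5*y^2."""
    for _ in range(n_steps):
        grad_x, grad_y = x, -y            # grad L is the saddle field <x, -y>
        x, y = x - eta * grad_x, y - eta * grad_y
    return x, y


on_axis = descend(1.0, 0.0)    # starts exactly on the stable direction
nudged = descend(1.0, 1e-8)    # same start with a tiny push along y

print("start (1, 0)    ->", on_axis)
print("start (1, 1e-8) ->", nudged)
```

Started exactly on the x-axis, the iterate slides into the saddle point and stays there; the slightest component along y is amplified by a factor of $1 + \eta$ per step and eventually carries the iterate away.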

Learning rate

The integrator step size. Too large and the discrete trajectory diverges from the true streamline; too small and you take forever to reach a minimum.
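For a one-dimensional quadratic the trade-off can be made exact. Assuming the toy loss $L = \tfrac12 a\theta^2$ with curvature $a = 4$ (an arbitrary choice), each step multiplies $\theta$ by $1 - \eta a$, so the iteration shrinks only when $\eta < 2/a = 0.5$. The helper `final_theta` below is hypothetical.

```python
def final_theta(eta, curvature=4.0, theta0=1.0, n_steps=50):
    """Run gradient descent on L = 0.5 * curvature * theta^2."""
    theta = theta0
    for _ in range(n_steps):
        theta -= eta * curvature * theta   # dL/dtheta = curvature * theta
    return theta


results = {eta: final_theta(eta) for eta in (0.01, 0.4, 0.6)}

for eta, theta in results.items():
    print(f"eta = {eta:<4}  theta after 50 steps = {theta:.3e}")
```

The smallest rate is stable but slow, the middle one converges quickly, and the one above the threshold blows up.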

Momentum

Replaces the velocity $\dot\theta = -\nabla L$ with a damped second-order ODE. The trajectory smooths out, can escape narrow saddle valleys, and behaves like a ball rolling under gravity with friction.
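One common discrete form of this idea is the heavy-ball update, $v \leftarrow \beta v - \eta\nabla L$, $\theta \leftarrow \theta + v$. The text above does not specify an update rule, so treat that form, the ill-conditioned toy loss $L = \tfrac12(\theta_1^2 + 100\,\theta_2^2)$, and the hyperparameters below as assumptions made for illustration.

```python
def grad_L(theta):
    """Gradient of the assumed toy loss L = 0.5 * (theta1^2 + 100 * theta2^2)."""
    return (theta[0], 100.0 * theta[1])


def loss(theta):
    return 0.5 * (theta[0] ** 2 + 100.0 * theta[1] ** 2)


def plain_gd(theta, eta, n_steps):
    for _ in range(n_steps):
        g = grad_L(theta)
        theta = (theta[0] - eta * g[0], theta[1] - eta * g[1])
    return theta


def heavy_ball(theta, eta, beta, n_steps):
    """Gradient descent with momentum: the velocity remembers past gradients."""
    velocity = (0.0, 0.0)
    for _ in range(n_steps):
        g = grad_L(theta)
        velocity = (beta * velocity[0] - eta * g[0],
                    beta * velocity[1] - eta * g[1])
        theta = (theta[0] + velocity[0], theta[1] + velocity[1])
    return theta


start = (1.0, 1.0)
eta, beta, n_steps = 0.018, 0.9, 200

loss_gd = loss(plain_gd(start, eta, n_steps))
loss_momentum = loss(heavy_ball(start, eta, beta, n_steps))

print(f"plain GD   loss after {n_steps} steps: {loss_gd:.3e}")
print(f"heavy ball loss after {n_steps} steps: {loss_momentum:.3e}")
```

With the same learning rate, the momentum run ends at a much lower loss because the velocity keeps building along the shallow $\theta_1$ direction, where plain gradient descent only creeps.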

Neural ODEs

Replace a stack of layers with the ODE $dh/dt = f(h, t; \theta)$. Now the hidden state is a particle drifting through a learned vector field: vector calculus is the architecture.


Python Implementation

Four short Python programs make every idea in this section runnable. Read each snippet, then expand the cards below to walk through it line by line.

1. Plot the four canonical 2-D fields

Plot constant, radial, rotational, saddle
🐍vector_field_2d.py
Lines 1-2: Two imports do almost everything

numpy gives us vectorized math (we will evaluate the field at 225 points at once with no Python loops). matplotlib.pyplot gives us quiver — the standard 'arrow plot' for vector fields.

Line 4: What this function is really doing

It is the bridge from formula to picture. We start with F written symbolically, sample it on a grid, then ask matplotlib to draw one little arrow at each grid point.

EXAMPLE
Think of it like sampling a thermometer at every spot in a room — except instead of a number per spot, you get a 2D arrow per spot.
Lines 13-14: x and y as 1-D ladders of sample points

np.linspace(-3, 3, 15) makes 15 evenly-spaced numbers from −3 to 3. These will be the x-coordinates of one row of the grid. Same for y.

EXECUTION STATE
x = [-3.00, -2.57, -2.14, ..., 2.57, 3.00]
x.shape = (15,)
Line 15: meshgrid turns 1-D ladders into a 2-D coordinate plane

X[i, j] is the x-coordinate of grid cell (i, j); Y[i, j] is its y-coordinate. After this line we can write any field as a single NumPy expression in X and Y — no for-loops.

EXECUTION STATE
X.shape = (15, 15)
Y.shape = (15, 15)
X[7, 7] = 0.0 (center column)
Y[7, 7] = 0.0 (center row)
Line 18: A dictionary of fields keyed by their label

We will iterate over (label, (U, V)) pairs, plot each one in its own subplot. The (U, V) tuple is the field evaluated on the whole grid.

Line 19: Constant field, same arrow everywhere

U = ones_like(X) means every cell gets the value 1. Same for V = 0.5. So at every (x, y), F(x, y) = (1, 0.5) — a uniform wind blowing slightly up and to the right.

EXAMPLE
At (−3, −3): F = (1, 0.5). At (2, 1): F = (1, 0.5). The arrow never changes.
Line 20: Radial field, the (x, y) trick

We set U = X and V = Y. Now at point (x, y), F = (x, y) — the position vector itself. Every arrow points directly away from the origin, growing longer the farther out you go.

EXAMPLE
At (1, 0): F = (1, 0), short arrow pointing right. At (3, 3): F = (3, 3), long arrow pointing up-right at 45°.
Line 21: Rotational field, swap and negate

F = (−y, x) is the position vector rotated 90° counter-clockwise. At every point, the arrow is perpendicular to the line from the origin — exactly what a rigid disc spinning about the origin does.

EXAMPLE
At (1, 0): F = (0, 1), arrow points up. At (0, 1): F = (−1, 0), arrow points left. At (−1, 0): F = (0, −1), arrow points down. Tracing this around: the disc spins CCW.
Line 22: Saddle field, x outward, y inward

F = (x, −y) pushes along the +x and −x axes (away from origin) but pulls along the y-axis (toward origin). The origin is a saddle: stable in one direction, unstable in the other.

EXAMPLE
At (2, 0): F = (2, 0) outward. At (0, 2): F = (0, −2) inward. Sit a marble at the origin and nudge it: a diagonal nudge does not spiral, it slides away along a hyperbola-shaped path that flattens toward the x-axis.
Line 30: Magnitude as a NumPy expression

magnitude = sqrt(U² + V²) gives |F| at every grid cell. We will pass this to quiver as the color, so the arrow color encodes strength even after we normalize the lengths.

EXECUTION STATE
magnitude.shape = (15, 15)
magnitude[7, 7] (origin) = 0.0 for radial / rotational / saddle
magnitude[14, 14] (corner) = ≈ 4.24 for radial F = (x, y)
Line 33: Why add 1e-3 to magnitude?

Some fields are exactly zero at the origin. Dividing by zero would give NaN and matplotlib would silently drop those arrows. Adding 1e-3 (epsilon) keeps the division safe with no visible distortion.

Lines 33-34: Normalize for readability, color for honesty

After this, every arrow has roughly the same length so we can see the direction even at the corner where the radial field is huge. We did NOT lose the magnitude information — it lives in the color now.

Lines 36-37: quiver, the workhorse

quiver(X, Y, U_norm, V_norm, magnitude, cmap='viridis', scale=25) draws an arrow at each (X[i,j], Y[i,j]) with direction (U_norm[i,j], V_norm[i,j]) and color from the magnitude array.

EXAMPLE
scale=25 means 'one unit of U_norm shows up as roughly 1/25 of the axis width'. Smaller scale → longer arrows; larger scale → shorter arrows.
import numpy as np
import matplotlib.pyplot as plt

def plot_vector_field_2d():
    """
    Visualize 2D vector fields using matplotlib's quiver plot.

    A vector field F(x, y) = (P(x,y), Q(x,y)) assigns a 2D vector
    to each point in the plane. We visualize this by drawing an
    arrow at a grid of sample points.
    """
    # 1. Build a square grid of (x, y) sample points.
    x = np.linspace(-3, 3, 15)
    y = np.linspace(-3, 3, 15)
    X, Y = np.meshgrid(x, y)

    # 2. Define four canonical fields as (U, V) pairs.
    fields = {
        'Constant: F = (1, 0.5)':  (np.ones_like(X), 0.5 * np.ones_like(Y)),
        'Radial:   F = (x, y)':    (X, Y),
        'Rotational: F = (-y, x)': (-Y, X),
        'Saddle:   F = (x, -y)':   (X, -Y),
    }

    fig, axes = plt.subplots(2, 2, figsize=(12, 12))
    axes = axes.flatten()

    for ax, (name, (U, V)) in zip(axes, fields.items()):
        # 3. |F| at every grid point, used to color the arrows.
        magnitude = np.sqrt(U**2 + V**2)

        # 4. Normalize direction so very long arrows do not dominate.
        U_norm = U / (magnitude + 1e-3)
        V_norm = V / (magnitude + 1e-3)

        ax.quiver(X, Y, U_norm, V_norm, magnitude,
                  cmap='viridis', scale=25)
        ax.set_xlim(-3.5, 3.5)
        ax.set_ylim(-3.5, 3.5)
        ax.set_title(name)
        ax.set_aspect('equal')

    plt.tight_layout()
    plt.show()

plot_vector_field_2d()

2. Streamlines two ways: streamplot and odeint

Streamlines as ODE solutions
🐍streamlines.py
Line 3: Why scipy.integrate?

A streamline is the solution of an ODE. Solving an ODE numerically means stepping forward in time using the field as the velocity. odeint is SciPy's general-purpose ODE solver; internally it wraps LSODA from ODEPACK, an adaptive multistep method that switches automatically between Adams (non-stiff) and BDF (stiff) schemes.

Line 7: What 'streamline' means precisely

A streamline of F is a curve whose tangent vector equals F at every point on the curve. Equivalently, it is the path traced by a massless particle that has velocity F(r) when it sits at r.

Lines 14-15: Why a denser grid for streamplot

We jumped from 15×15 (quiver) to 100×100 (streamplot). streamplot internally interpolates the field on this grid as it integrates streamlines — a coarse grid gives jagged streamlines.

Lines 19-20: Just two lines define the rotational field

U = −Y and V = X. Even though we will draw smooth curves, the underlying field is still F(x, y) = (−y, x). Streamplot will integrate this field forwards and backwards from many seed points.

Line 21: speed = √(U² + V²), color encodes how fast

For F = (−y, x), speed = √(y² + x²) = r, the distance from the origin. Streamlines at radius 1 move at speed 1; at radius 3 they move at speed 3.

EXECUTION STATE
speed.shape = (100, 100)
speed at (1, 0) = 1.0
speed at (3, 0) = 3.0
Lines 24-25: streamplot, the smart cousin of quiver

streamplot picks its own seed points, integrates the field forward and backward, and draws the resulting curves. density=2 means roughly twice the default number of streamlines.

Line 32: Now we trace ONE streamline by hand

We pick the initial point (2, 0) and ask: where will a particle starting there be at time t? Calculus tells us exactly: a circle of radius 2 traced once in 2π seconds.

Lines 41-43: field(state, t) = [−y, x]

odeint expects a function (state, t) → derivative. state = (x, y). We return [dx/dt, dy/dt] = [−y, x] — exactly the field components.

EXAMPLE
At state = (2, 0): field returns [0, 2]. So initially the particle moves straight up — which is exactly the tangent to a CCW circle at the easternmost point.
Line 45: 200 time samples from 0 to 2π

We pick t-values where we want the solution. odeint does its own internal stepping but reports the solution at these 200 points. 2π is one full revolution.

Line 46: sol = odeint(field, [2.0, 0.0], t)

sol has shape (200, 2). sol[i, 0] is x at time t[i]; sol[i, 1] is y at time t[i]. Plotting sol[:, 0] vs sol[:, 1] traces the streamline in the plane.

EXECUTION STATE
sol[0] = [2.0, 0.0] (start)
sol[50] = [≈ 0.0, ≈ 2.0] (quarter revolution: top of circle)
sol[100] = [≈ -2.0, ≈ 0.0] (half revolution: left side)
sol[-1] = [≈ 2.0, ≈ 0.0] (full revolution: back to start)
Lines 50-51: Why start and end coincide

Because the closed-form solution is (2 cos t, 2 sin t). At t = 0 and t = 2π, cos = 1 and sin = 0, so we return exactly to (2, 0). Any tiny numerical drift you see is the integrator's local error — typically < 1e−5 with odeint.

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

def plot_streamlines():
    """
    A streamline is a curve r(t) = (x(t), y(t)) whose tangent matches
    the field at every point:

        dr/dt = F(r(t))         (a 2D system of ODEs)

    matplotlib.streamplot draws many streamlines automatically.
    """
    x = np.linspace(-3, 3, 100)
    y = np.linspace(-3, 3, 100)
    X, Y = np.meshgrid(x, y)

    # Rotational field
    U = -Y
    V =  X
    speed = np.sqrt(U**2 + V**2)

    fig, ax = plt.subplots(figsize=(7, 7))
    ax.streamplot(X, Y, U, V, color=speed, cmap='coolwarm',
                  linewidth=1.5, density=2, arrowstyle='->')
    ax.set_aspect('equal')
    ax.set_title('Rotational field: streamlines are circles')
    plt.show()

def numerical_streamline():
    """
    Compute ONE streamline by solving the IVP

        dx/dt = -y
        dy/dt =  x
        (x(0), y(0)) = (2, 0)

    The exact solution is (x(t), y(t)) = (2 cos t, 2 sin t).
    We verify numerically with SciPy's odeint (an LSODA multistep solver).
    """
    def field(state, t):
        x, y = state
        return [-y, x]

    t = np.linspace(0, 2 * np.pi, 200)
    sol = odeint(field, [2.0, 0.0], t)

    plt.figure(figsize=(7, 7))
    plt.plot(sol[:, 0], sol[:, 1], 'b-', lw=2)
    plt.plot(2, 0, 'go', label='start (2, 0)')
    plt.plot(sol[-1, 0], sol[-1, 1], 'ro', label='end after 2π')
    plt.axis('equal')
    plt.legend()
    plt.title('Streamline of F = (-y, x) from (2, 0)')
    plt.show()

plot_streamlines()
numerical_streamline()

3. Build a gradient field from a scalar potential

From f to ∇f, perpendicular to level curves
🐍gradient_field.py
Line 4: Two objects, one relationship

f is a SCALAR field (one number per point). ∇f is a VECTOR field (an arrow per point). The gradient operator turns one into the other.

Lines 12-15: Why this matters in three lines

(1) ∇f is perpendicular to the level set {f = c}. (2) ∇f points uphill. (3) Line integrals collapse to endpoint differences. These three facts are the heart of multivariable calculus.

Line 22: Pick a potential, get its gradient

We chose f = x² + y², a paraboloid bowl. The gradient is computed by hand: ∂f/∂x = 2x, ∂f/∂y = 2y. So ∇f = (2x, 2y).

EXAMPLE
At (1, 0.5): ∇f = (2, 1). Pointing right and slightly up — away from the bottom of the bowl, the direction the surface rises fastest.
Lines 23-24: grad_x and grad_y as full grids

Like before, we evaluate the gradient over the entire mesh. grad_x = 2 * X is a (100, 100) array; same for grad_y.

EXECUTION STATE
grad_x[50, 67] = 2 * X[50, 67] ≈ 2 * 1.06 = 2.12
grad_y[50, 67] = 2 * Y[50, 67] ≈ 2 * 0.03 = 0.06
Line 30: contourf paints level sets

Level curves of x² + y² are circles. contourf colors the regions between consecutive levels. Brighter colors are higher elevations.

Lines 38-42: Re-draw contours faintly, overlay ∇f

Same level curves, lighter. Then we sample (Xv, Yv) on a sparser 12×12 grid for the arrows so the picture stays readable.

Lines 43-44: U, V mirror grad_x, grad_y on the sparse grid

Same formula, just evaluated at fewer points so the arrows do not overlap.

Lines 47-48: Quiver with arrows normalized and colored by magnitude

Direction is preserved; length is roughly equal; color shows how fast the surface is rising. Notice every arrow crosses the contour lines AT RIGHT ANGLES.

Lines 56-58: Numerical sanity check: gradient ⟂ tangent

Pick the point (1, 0.5). Compute ∇f = (2, 1). A tangent to the level circle at this point is the gradient rotated 90° counter-clockwise: (a, b) becomes (−b, a), so (2, 1) becomes (−1, 2). In code this is (−2y, 2x) = (−2·0.5, 2·1) = (−1, 2).

EXAMPLE
grad · tangent = (2)(−1) + (1)(2) = 0. The two vectors are perpendicular. The gradient is ALWAYS perpendicular to the level curve.
Line 62: Dot product as a perpendicularity test

Two vectors are perpendicular iff their dot product is zero. The @ operator in NumPy is matrix multiplication, which for 1-D arrays acts as the dot product. So `grad @ tangent` is exactly ∇f · t.

EXECUTION STATE
grad = [2.0, 1.0]
tangent = [-1.0, 2.0]
grad @ tangent = (2)(-1) + (1)(2) = 0.0 ✓
import numpy as np
import matplotlib.pyplot as plt

def gradient_vector_field():
    """
    A gradient field is the special vector field

        F(x, y) = ∇f(x, y) = (∂f/∂x, ∂f/∂y)

    for some scalar 'potential' function f.

    Key facts:
      • ∇f is always perpendicular to the level curves of f.
      • ∇f points uphill (in the direction of steepest ascent).
      • Line integrals of ∇f depend only on the endpoints.
    """
    x = np.linspace(-3, 3, 100)
    y = np.linspace(-3, 3, 100)
    X, Y = np.meshgrid(x, y)

    # Scalar potential and its gradient
    f = X**2 + Y**2          # paraboloid bowl
    grad_x = 2 * X
    grad_y = 2 * Y

    fig, axes = plt.subplots(1, 2, figsize=(14, 6))

    # Left panel: contour map of f
    ax1 = axes[0]
    cs = ax1.contourf(X, Y, f, levels=20, cmap='viridis')
    ax1.contour(X, Y, f, levels=20, colors='white', linewidths=0.3)
    plt.colorbar(cs, ax=ax1, label='f(x, y) = x² + y²')
    ax1.set_aspect('equal')
    ax1.set_title('Scalar field f')

    # Right panel: ∇f arrows on top of level curves
    ax2 = axes[1]
    ax2.contour(X, Y, f, levels=15, colors='gray', alpha=0.5)

    x_vec = np.linspace(-2.5, 2.5, 12)
    y_vec = np.linspace(-2.5, 2.5, 12)
    Xv, Yv = np.meshgrid(x_vec, y_vec)
    U = 2 * Xv
    V = 2 * Yv
    mag = np.sqrt(U**2 + V**2)

    # Normalise arrow length (1e-3 avoids division by zero); colour by magnitude
    ax2.quiver(Xv, Yv, U / (mag + 1e-3), V / (mag + 1e-3), mag,
               cmap='plasma', scale=25)
    ax2.set_aspect('equal')
    ax2.set_title('Gradient field ∇f = (2x, 2y), perpendicular to level curves')

    plt.tight_layout()
    plt.show()

    # Numerical check at one point: ∇f ⟂ tangent to level curve
    px, py = 1.0, 0.5
    grad = np.array([2 * px, 2 * py])         # = (2, 1)
    tangent = np.array([-2 * py, 2 * px])     # rotate grad 90°
    print(f"point: ({px}, {py})")
    print(f"  grad      = {grad}")
    print(f"  tangent   = {tangent}")
    print(f"  grad · tangent = {grad @ tangent}   <- should be 0")

gradient_vector_field()
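
The docstring lists a third key fact that the listing never exercises: line integrals of ∇f depend only on the endpoints. A minimal sketch of that claim follows, using only the standard library. The endpoints A and B, the shape of the detour path, and every function name are illustrative assumptions; the integral is approximated with a midpoint rule along each path.

```python
import math

def f(x, y):
    """Potential from the listing: f(x, y) = x**2 + y**2."""
    return x * x + y * y

def grad_f(x, y):
    """Its gradient field (2x, 2y)."""
    return (2.0 * x, 2.0 * y)

A = (1.0, 0.5)     # illustrative start point
B = (-2.0, 1.5)    # illustrative end point

def straight(t):
    """Straight segment from A (t = 0) to B (t = 1)."""
    return (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))

def detour(t):
    """Same endpoints, but bulging away from the segment in between."""
    x, y = straight(t)
    bump = math.sin(math.pi * t)   # vanishes at t = 0 and t = 1
    return (x + 3.0 * bump, y - 2.0 * bump)

def line_integral(path, steps=2000):
    """Midpoint-rule estimate of the integral of grad_f . dr along path."""
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, steps + 1):
        x1, y1 = path(k / steps)
        gx, gy = grad_f(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += gx * (x1 - x0) + gy * (y1 - y0)
        x0, y0 = x1, y1
    return total

potential_drop = f(*B) - f(*A)
print(line_integral(straight), line_integral(detour), potential_drop)
```

All three printed numbers agree up to rounding: both paths give f(B) − f(A), regardless of the route taken between the endpoints.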

4. Gradient descent as flow on a loss vector field

Train w·x + b by flowing along −∇L
🐍loss_gradient_descent.py
The connection that motivates this section

Every neural network training run is a particle moving through a vector field. The field is −∇L over parameter space. The 'particle' is your current weights. Gradient descent IS Euler integration of an ODE in the field −∇L.

Synthetic linear data with noise

We sample x ~ N(0, 1) and set y = 2x + 1 + 0.3·N(0, 1). The 'true' answer the network must recover is w=2, b=1. The 0.3·N(0, 1) noise prevents the loss from going to zero at the true parameters.

EXECUTION STATE
x_data.shape = (50,)
y_data.mean() ≈ 1.0 (close to b = 1, because the sample mean of x is near 0)
The loss is a scalar function of two variables

L(w, b) = mean over data of (prediction − target)². For any pair (w, b), this returns one number. So L is a SCALAR field on the 2-D (w, b)-plane.

EXAMPLE
loss(2.0, 1.0) ≈ 0.082  (essentially just the noise variance; the least-squares minimum is only marginally lower). loss(0, 0) ≈ 5.4  (terrible guess).
The gradient is computed analytically

MSE has a closed-form gradient. ∂L/∂w = 2·mean(r · x). ∂L/∂b = 2·mean(r), where r = pred − y. No autograd needed for this toy problem.

grad returns a 2-vector, the field at (w, b)

This function IS the vector field. Calling grad(w, b) at any point in parameter space returns the local gradient arrow. The entire 'loss vector field' is just this function evaluated everywhere.

EXECUTION STATE
grad(2.0, 1.0) ≈ [0, 0] (small, because (2, 1) sits very close to the minimum, where the field vanishes)
grad(0.0, 0.0) ≈ [-4.4, -2.0] (points TOWARDS bigger loss; we'll negate it)
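
The closed-form gradient above can be cross-checked against finite differences before trusting it in an optimizer. The sketch below is a hedged illustration using only the standard library: the data follow the same recipe (y = 2x + 1 + 0.3·noise) but are drawn with `random.Random(0)`, so the numbers will not match the NumPy run, and `numeric_grad` and `max_gap` are hypothetical helper names.

```python
import random

# Stand-in data: same recipe as the listing, different random generator
rng = random.Random(0)
x_data = [rng.gauss(0.0, 1.0) for _ in range(50)]
y_data = [2.0 * x + 1.0 + 0.3 * rng.gauss(0.0, 1.0) for x in x_data]

def loss(w, b):
    """Mean squared error of the line w*x + b on the sample."""
    n = len(x_data)
    return sum((w * x + b - y) ** 2 for x, y in zip(x_data, y_data)) / n

def grad(w, b):
    """Closed-form gradient: (2*mean(r*x), 2*mean(r)) with r = pred - y."""
    n = len(x_data)
    residuals = [w * x + b - y for x, y in zip(x_data, y_data)]
    return (2.0 * sum(r * x for r, x in zip(residuals, x_data)) / n,
            2.0 * sum(residuals) / n)

def numeric_grad(w, b, h=1e-5):
    """Central-difference estimate of the same two partial derivatives."""
    dw = (loss(w + h, b) - loss(w - h, b)) / (2.0 * h)
    db = (loss(w, b + h) - loss(w, b - h)) / (2.0 * h)
    return (dw, db)

def max_gap(points):
    """Largest componentwise disagreement between the two gradients."""
    worst = 0.0
    for w, b in points:
        analytic = grad(w, b)
        numeric = numeric_grad(w, b)
        worst = max(worst, abs(analytic[0] - numeric[0]),
                    abs(analytic[1] - numeric[1]))
    return worst

print(max_gap([(0.0, 0.0), (2.0, 1.0), (4.5, 3.5)]))
```

Because the loss is quadratic in (w, b), central differences carry no truncation error, so the printed gap reflects rounding only.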
np.vectorize makes loss(W, B) elementwise

loss expects two scalars, not arrays. np.vectorize wraps it so we can pass the entire (100, 100) W and B grids in one go. The result L is a 100×100 grid of loss values — ready for contourf.

Pick a far-away starting point

(w, b) = (4.5, 3.5). The true answer is (2, 1), so we start 2.5 units off in each coordinate, about 3.5 units northeast of the minimum. We will watch −∇L drag us back.

The whole optimizer in 4 lines

g = grad(w, b) → w ← w − lr · g[0] → b ← b − lr · g[1]. This is the textbook gradient-descent update. lr = 0.1 is the step size; 80 steps is enough for this convex bowl.

path is a discrete streamline of -∇L

Each row of path is the (w, b) position at step t. The whole sequence is the trajectory of our 'particle' through the loss vector field. It will look just like a streamline drifting downhill on the contour plot.

EXECUTION STATE
path[0] = [4.5, 3.5] (start, on a high contour)
path[20] ≈ [2.0, 1.0] (already near the minimum: the Hessian is close to 2I, so each step shrinks the offset by roughly 1 − 2·lr = 0.8, and 0.8²⁰ ≈ 0.01)
path[80] ≈ [2.00, 1.00] (basically at the minimum)
Print the final (w, b)

Final values should match the true coefficients to within the noise. Typical output: final (w, b) = (1.998, 1.004). Recovery within 1% — exactly because the loss field is a clean convex bowl.

import numpy as np
import matplotlib.pyplot as plt

def loss_gradient_field():
    """
    In machine learning, the loss L(θ) is a scalar field over
    parameter space and ∇L is its gradient vector field.
    Gradient descent simply 'flows downhill' along -∇L:

        θ_{t+1} = θ_t  -  η · ∇L(θ_t)

    Here we fit y = w·x + b to noisy synthetic data and watch
    descent trace a streamline of -∇L through (w, b)-space.
    """
    rng = np.random.default_rng(42)
    x_data = rng.standard_normal(50)
    y_data = 2.0 * x_data + 1.0 + 0.3 * rng.standard_normal(50)

    def loss(w, b):
        pred = w * x_data + b
        return np.mean((pred - y_data) ** 2)

    def grad(w, b):
        pred = w * x_data + b
        r = pred - y_data
        return np.array([2 * np.mean(r * x_data),
                         2 * np.mean(r)])

    # Grid of (w, b) for the contour plot
    w_axis = np.linspace(-1, 5, 100)
    b_axis = np.linspace(-2, 4, 100)
    W, B = np.meshgrid(w_axis, b_axis)
    L = np.vectorize(loss)(W, B)

    fig, ax = plt.subplots(figsize=(8, 7))
    ax.contourf(W, B, L, levels=30, cmap='viridis')
    ax.contour(W, B, L, levels=30, colors='white', linewidths=0.3)

    # Gradient descent trajectory
    w, b = 4.5, 3.5
    lr = 0.1
    path = [(w, b)]
    for _ in range(80):
        g = grad(w, b)
        w -= lr * g[0]
        b -= lr * g[1]
        path.append((w, b))
    path = np.array(path)

    ax.plot(path[:, 0], path[:, 1], 'r.-', lw=2, ms=4)
    ax.plot(path[0, 0], path[0, 1], 'go', ms=10, label='start')
    ax.plot(path[-1, 0], path[-1, 1], 'r*', ms=14, label='end')
    ax.set_xlabel('w'); ax.set_ylabel('b')
    ax.set_title('Loss surface with -∇L flow (gradient descent)')
    ax.legend()
    plt.show()

    print("true   (w, b) = (2.0, 1.0)")
    print(f"final  (w, b) = ({path[-1, 0]:.3f}, {path[-1, 1]:.3f})")

loss_gradient_field()
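
The claim that gradient descent is Euler integration of dθ/dt = −∇L can be made concrete on a loss whose flow has a closed form. The sketch below is an idealisation, not the listing's data-driven loss: it assumes L(w, b) = (w − 2)² + (b − 1)², for which the exact streamline decays toward (2, 1) like e^(−2t). The names `euler_flow`, `exact_flow`, and `flow_error` are hypothetical.

```python
import math

TARGET = (2.0, 1.0)   # minimiser of the idealised loss
START = (4.5, 3.5)    # same starting point as the listing

def neg_grad(w, b):
    """-∇L for the idealised loss L(w, b) = (w - 2)**2 + (b - 1)**2."""
    return (-2.0 * (w - TARGET[0]), -2.0 * (b - TARGET[1]))

def euler_flow(eta, total_time=1.0):
    """Gradient descent with learning rate eta, run for total_time / eta steps."""
    w, b = START
    for _ in range(round(total_time / eta)):
        dw, db = neg_grad(w, b)
        w, b = w + eta * dw, b + eta * db
    return (w, b)

def exact_flow(total_time=1.0):
    """Closed-form streamline: the offset from TARGET decays like exp(-2t)."""
    decay = math.exp(-2.0 * total_time)
    return tuple(t + (s - t) * decay for s, t in zip(START, TARGET))

def flow_error(eta):
    """Distance between the Euler iterate and the exact flow at time 1."""
    w, b = euler_flow(eta)
    w_exact, b_exact = exact_flow()
    return math.hypot(w - w_exact, b - b_exact)

errors = [flow_error(eta) for eta in (0.1, 0.01, 0.001)]
print(errors)
```

The error shrinks as the learning rate shrinks, which is the expected behaviour of a forward-Euler scheme: the discrete descent path converges to the continuous streamline of −∇L.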

PyTorch: Gradient Fields via Autograd

For toy potentials like $f = x^{2}+y^{2}$ the gradient is easy by hand. For a 50-million-parameter neural network it is not. PyTorch's autograd computes $\nabla f$ mechanically by walking the computation graph backwards. The snippet below uses it to rediscover the radial field $\langle 2x, 2y\rangle$ on a $5 \times 5$ grid and compares it against the closed-form answer.

torch.autograd.grad recovers ∇(x² + y²)
🐍autograd_gradient_field.py
Why PyTorch for a calculus example?

For toy problems like f = x² + y², differentiation by hand is trivial. But in deep learning, f is a 50-million-parameter loss function — and autograd computes the gradient mechanically. Practicing the API on f = x² + y² builds the muscle memory.

The same potential we used in NumPy

f(x, y) = x² + y². We already KNOW ∇f = (2x, 2y) analytically. The test is whether torch.autograd.grad reproduces those numbers at every grid point.

torch.linspace mirrors np.linspace

5 points from −2 to 2 inclusive: [−2, −1, 0, 1, 2]. We will sweep over the 25 combinations.

EXECUTION STATE
xs = tensor([-2., -1., 0., 1., 2.])
ys = tensor([-2., -1., 0., 1., 2.])
Each (x, y) is a fresh leaf tensor

We must call requires_grad_(True) so PyTorch tracks operations involving x and y. .clone() makes an independent copy of the grid element, so flipping requires_grad on it leaves xs and ys untouched. Because xs and ys do not require gradients, each clone is a fresh leaf tensor with no history, which is what torch.autograd.grad needs as an input.

Build the computation graph by writing the formula

f = x * x + y * y. Behind the scenes PyTorch built a tiny graph: Mul → Mul → Add. f is a scalar tensor with a grad_fn — it knows how it was produced and can therefore be differentiated.

EXECUTION STATE
f = tensor(8.0, grad_fn=<AddBackward0>) at (x, y) = (2, 2)
torch.autograd.grad: the API we want here

Asks: 'starting from f, push the gradient back to the inputs x and y'. Returns a tuple (∂f/∂x, ∂f/∂y), in the same order as the `inputs=` argument.

outputs=f, inputs=(x, y)

outputs is the scalar (or list of scalars) we differentiate. inputs is the tuple of tensors we differentiate with respect to. The result has the same structure as inputs.

create_graph=False

We do NOT need the gradient itself to be differentiable (no second-order autograd here). Setting create_graph=False keeps memory low and is the default — we set it explicitly to make the intent obvious.

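closed-form comparison

By hand, ∇f(x, y) = (2x, 2y). At (x, y) = (1, −2) the closed form is (2, −4), and autograd returns (2.00, −4.00). The match is exact here because doubling a floating-point number introduces no rounding; for general functions, expect agreement to floating-point precision rather than bit for bit. That agreement is what makes autograd a reliable replacement for hand-derivation.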

EXECUTION STATE
at (-2, -2) = autograd=(-4.00, -4.00) closed=(-4.00, -4.00)
at (0, 0) = autograd=(+0.00, +0.00) closed=(+0.00, +0.00) (critical point)
at (+2, +2) = autograd=(+4.00, +4.00) closed=(+4.00, +4.00)
import torch

def grad_field_via_autograd():
    """
    The same gradient field as before, but now we let PyTorch
    differentiate the potential for us. No more pencil-and-paper
    ∂f/∂x.

    f(x, y) = x² + y²,   ∇f = (2x, 2y)

    We will sample f on a grid and use torch.autograd.grad to
    recover the field one point at a time.
    """
    # Sample 5 evenly-spaced points along x and y
    xs = torch.linspace(-2, 2, 5)
    ys = torch.linspace(-2, 2, 5)

    print(f"{'(x, y)':>10}  {'f(x,y)':>8}  {'autograd ∇f':>16}  {'closed-form':>14}")
    print('-' * 60)

    for x_val in xs:
        for y_val in ys:
            # Each call: build a leaf tensor at (x, y), compute f, differentiate.
            x = x_val.clone().requires_grad_(True)
            y = y_val.clone().requires_grad_(True)
            f = x * x + y * y                       # scalar potential

            # torch.autograd.grad returns gradients of f w.r.t. (x, y)
            (grad_x, grad_y) = torch.autograd.grad(
                outputs=f, inputs=(x, y),
                create_graph=False,
            )
            closed = (2 * x_val.item(), 2 * y_val.item())
            print(f"({x_val.item():+.1f},{y_val.item():+.1f})  "
                  f"{f.item():>7.3f}   "
                  f"({grad_x.item():+.2f}, {grad_y.item():+.2f})   "
                  f"({closed[0]:+.2f}, {closed[1]:+.2f})")

grad_field_via_autograd()
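
To make "walking the computation graph backwards" less of a black box, here is a toy reverse-mode differentiator in plain Python. It is a teaching sketch under stated assumptions, not PyTorch's implementation: the `Var` class, `backward`, and `grad_field` are hypothetical names, and only addition and multiplication of scalars are supported, which is all that f = x·x + y·y needs.

```python
class Var:
    """A scalar node in a toy computation graph (not PyTorch's Tensor)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuple of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node."""
    order, seen = [], set()

    def visit(node):
        # Depth-first post-order: inputs are listed before the nodes using them
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output back to the inputs, applying the chain rule
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad

def grad_field(x_val, y_val):
    """Value and gradient of f(x, y) = x*x + y*y at one point."""
    x, y = Var(x_val), Var(y_val)
    f = x * x + y * y            # builds two Mul nodes feeding one Add node
    backward(f)
    return f.value, (x.grad, y.grad)

print(grad_field(1.0, -2.0))
```

Each x·x node contributes x twice to `x.grad`, once per operand, which is how the factor 2 in ∂f/∂x = 2x emerges from the chain rule without being written anywhere explicitly.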

Why this is the real point of the section

Every gradient field you care about in deep learning, including the weight updates of transformer training runs, is computed by this kind of reverse-mode automatic differentiation. The calculus you learned in this chapter is what the library is doing behind the scenes.


Test Your Understanding


Summary

A vector field is a function that paints every point with an arrow. Two components $P, Q$ tell you the arrow at $(x,y)$. Four canonical patterns (constant, radial, rotational, saddle) are the alphabet from which most fields are built.

| Concept | Definition | Key fact |
|---|---|---|
| Vector field | $\mathbf{F}: D \subset \mathbb{R}^n \to \mathbb{R}^n$ | Visualised by a quiver plot |
| Gradient field | $\mathbf{F} = \nabla f$ for some scalar potential $f$ | Perpendicular to level curves of $f$; conservative |
| Conservative test | $\mathbf{F} = \langle P, Q\rangle$ on a simply-connected region | $\mathbf{F}$ is a gradient $\Leftrightarrow \partial P/\partial y = \partial Q/\partial x$ |
| Streamline | Curve $\mathbf{r}(t)$ with $d\mathbf{r}/dt = \mathbf{F}(\mathbf{r})$ | Solutions of an ODE; never cross at regular points |
| Singular point | Point where $\mathbf{F} = \mathbf{0}$ | Streamlines can meet here; equilibria of the ODE |
| ML connection | Loss $L(\theta)$ defines $-\nabla L$ on parameter space | Gradient descent = forward-Euler streamline of $-\nabla L$ |
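
The conservative test from the summary can be tried numerically on three of the canonical patterns. The sketch below is illustrative only: the specific formulas chosen for the radial, rotational, and saddle fields, the sample grid, and the helper names are assumptions, and the partial derivatives are approximated by central differences. All three fields are defined on the whole plane, which is simply connected, so the test is decisive for them.

```python
def radial(x, y):
    """Illustrative radial field <x, y>."""
    return (x, y)

def rotational(x, y):
    """Illustrative rotational field <-y, x>."""
    return (-y, x)

def saddle(x, y):
    """Illustrative saddle field <x, -y>."""
    return (x, -y)

def curl_gap(field, x, y, h=1e-5):
    """Central-difference estimate of dQ/dx - dP/dy at (x, y)."""
    dQ_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2.0 * h)
    dP_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2.0 * h)
    return dQ_dx - dP_dy

def passes_conservative_test(field, tol=1e-6):
    """True if dQ/dx matches dP/dy at every point of a small sample grid."""
    points = [(i * 0.5, j * 0.5) for i in range(-4, 5) for j in range(-4, 5)]
    return all(abs(curl_gap(field, x, y)) < tol for x, y in points)

for name, field in [("radial", radial), ("rotational", rotational),
                    ("saddle", saddle)]:
    print(name, passes_conservative_test(field))
```

The radial and saddle fields pass (they are the gradients of (x² + y²)/2 and (x² − y²)/2), while the rotational field fails with a constant gap of 2 between the two partial derivatives.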
The essence of a vector field:
“An arrow at every point — turning every problem about flow, force, or change into a piece of geometry.”
Coming next: §19.2 introduces line integrals, which integrate a function or a vector field along a curve. We will compute the work done by a force field along a path, and discover that for gradient fields the answer depends only on the endpoints. That is the fundamental theorem of line integrals, the multivariable analogue of FTC Part 2.