Chapter 1

Polynomial Functions: Shapes and Behaviors

Mathematical Functions — The Building Blocks

Learning Objectives

By the end of this section you will be able to:

  • Recognise a polynomial the moment you see one — and explain why only non-negative integer exponents are allowed.
  • Predict the end behaviour of a polynomial in two seconds, using only the degree $n$ and the leading sign $\operatorname{sgn}(a_n)$.
  • Count the maximum possible real roots ($\le n$) and turning points ($\le n - 1$) from the degree alone.
  • Read a polynomial in factored form and tell whether each root makes the curve cross, kiss, or flex through the x-axis.
  • Evaluate a polynomial in two ways in Python and PyTorch, and explain why Horner's method is what every numerical library actually uses.

The Problem Polynomials Solve

You already know two extreme kinds of functions. Constant functions are rigid — they refuse to change. Linear functions change at a fixed rate. Real systems do neither: a thrown ball speeds up as it falls, a population grows in spurts, the area of a square scales faster than its side length. We need a family of functions that

  1. is rich enough to bend — multiple times, in either direction,
  2. but is still built entirely from the four arithmetic operations you can do by hand: addition, subtraction, multiplication by a constant, and raising $x$ to a positive integer power,
  3. so that every machine, every spreadsheet, every CPU can evaluate it with nothing but additions and multiplications, without ever needing to call exp or sin.

That family is the polynomials. They are the workhorse of applied mathematics: linear regression's big brother, the carrier of Taylor approximation (Chapter 13), the bones of every spline used in graphics and CAD, the "safe" non-linearity in classical control theory. If a function is continuous on a closed, bounded interval (it can be drawn there with a single unbroken pen-stroke), a polynomial of high enough degree can match it as closely as you like on that interval (Weierstrass, 1885).

Intuition. Think of the degree $n$ as a bend budget. A line cannot bend ($n=1$). A parabola can bend once ($n=2$). A cubic can bend twice. The higher $n$ goes, the more turns the curve can make, but never more than $n-1$.

What Counts as a Polynomial

A polynomial in a single variable is any expression of the form

$$p(x) = a_n x^{n} + a_{n-1} x^{n-1} + \cdots + a_2 x^{2} + a_1 x + a_0,$$

where the coefficients $a_0, a_1, \ldots, a_n$ are real (or complex) constants, the leading coefficient $a_n \neq 0$, and the exponents are non-negative integers. The largest exponent that appears with a non-zero coefficient is the degree $n$.

The exponent rule is the only thing that separates polynomials from their close cousins. $\sqrt{x} = x^{1/2}$ is not a polynomial: the exponent $1/2$ is not an integer. $1/x = x^{-1}$ is not a polynomial: the exponent is negative. $x^{x}$ is not a polynomial: the exponent is not constant. Removing even one of these restrictions takes you out of polynomial-land and into rational, algebraic, or transcendental territory.

| Expression | Polynomial? | Why |
|---|---|---|
| 7 | Yes, degree 0 | Constant. a_0 = 7. |
| 3x − 1 | Yes, degree 1 | Linear. Coefficients: a_0 = −1, a_1 = 3. |
| x² + 5x + 6 | Yes, degree 2 | Quadratic. |
| x⁴ − 4x² + 2 | Yes, degree 4 | All exponents are non-negative integers. |
| x³ − 3x | Yes, degree 3 | Our running example. |
| √x + 2 | No | Exponent 1/2 is not an integer. |
| 1/(x² − 1) | No | Negative exponent hidden in the denominator. |
| sin(x) | No | Transcendental: no finite polynomial equals sin. |
| e^x − x³ | No | e^x is not a polynomial; sums with one bad term stay bad. |
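
The exponent rule can be turned into a mechanical check. The sketch below assumes a hypothetical representation in which an expression is a list of (coefficient, exponent) pairs with constant exponents; the helper name `is_polynomial` is illustrative and not taken from any library. Expressions such as sin(x) or x^x cannot be written in this form at all, which is itself a sign that they are not polynomials.

```python
from fractions import Fraction

def is_polynomial(terms):
    """Return True if every term has a non-negative integer exponent.

    `terms` is a hypothetical representation: a list of
    (coefficient, exponent) pairs, e.g. x^3 - 3x -> [(1, 3), (-3, 1)].
    """
    for _coefficient, exponent in terms:
        exponent = Fraction(exponent)      # accepts ints and exact fractions
        if exponent.denominator != 1:      # fractional power, e.g. sqrt(x)
            return False
        if exponent < 0:                   # negative power, e.g. 1/x
            return False
    return True

cubic = [(1, 3), (-3, 1)]                  # x^3 - 3x
with_sqrt = [(1, Fraction(1, 2)), (2, 0)]  # sqrt(x) + 2
reciprocal = [(1, -1)]                     # 1/x

print(is_polynomial(cubic))       # True
print(is_polynomial(with_sqrt))   # False
print(is_polynomial(reciprocal))  # False
```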

Anatomy: Degree, Leading Term, Constant

Every polynomial has three landmarks worth naming:

  1. The degree $n$: the highest exponent. It controls everything important: how the curve flies off at the edges, how many roots and turning points it can have, and how fast it grows.
  2. The leading coefficient $a_n$: the multiplier on the highest-degree term. Its sign decides whether the tails point up or down. Its magnitude scales how steeply the tails eventually grow.
  3. The constant term $a_0$: the y-intercept, $p(0) = a_0$, because every $x^k$ with $k \ge 1$ dies at $x = 0$.

Tail rule (do this first, always). When you look at any polynomial, find $a_n x^n$ and ignore everything else for a moment. Those two pieces, the degree and the leading sign, completely determine what happens as $x \to \pm\infty$. The other terms only matter inside the viewing window.
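
The three landmarks can be read straight off a coefficient list. This is a minimal sketch that assumes the coefficients are stored lowest power first (a0, a1, ..., an); the function name `anatomy` is made up for illustration. It does not treat the zero polynomial specially, whose degree is usually left undefined.

```python
def anatomy(coeffs):
    """Return (degree, leading coefficient, constant term).

    `coeffs` lists a0, a1, ..., an in increasing order of power.
    Trailing zeros are ignored so that [0, -3, 0, 1, 0] still has degree 3.
    """
    degree = len(coeffs) - 1
    while degree > 0 and coeffs[degree] == 0:
        degree -= 1
    return degree, coeffs[degree], coeffs[0]

print(anatomy([0, -3, 0, 1]))   # x^3 - 3x     -> (3, 1, 0)
print(anatomy([6, 5, 1]))       # x^2 + 5x + 6 -> (2, 1, 6)
print(anatomy([7]))             # constant 7   -> (0, 7, 7)
```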

Interactive — Shape Explorer

Below is a polynomial up to degree five whose every coefficient is on a slider. Try this experiment first: pick the cubic preset $y = x^3 - 3x$, then sweep $a_3$ from $+1$ to $-1$. You will see the curve flip vertically: that single sign change rewrites both tails. Then sweep $a_0$: the whole curve slides up and down without changing shape. The middle coefficients dent and ripple the curve, but the tails refuse to budge.

[Interactive widget: polynomial explorer]

What to notice. Red dots mark real roots, places where $p(x) = 0$. Amber dots mark turning points, places where the curve momentarily stops going up (or stops going down) and reverses. The widget recomputes both every time you move a slider; the counts honour the bend budget $n - 1$.

End Behavior — The Tails Tell the Truth

Why does the leading term own the tails? Because for large $|x|$, every other term is dwarfed:

$$\frac{p(x)}{a_n x^{n}} = 1 + \frac{a_{n-1}}{a_n x} + \cdots + \frac{a_0}{a_n x^{n}} \;\xrightarrow{\,x \to \pm\infty\,}\; 1.$$

Each correction $a_k / (a_n x^{n-k})$ with $k < n$ goes to zero because the denominator blows up. So for large $|x|$ the polynomial behaves like $a_n x^n$, in the sense that their ratio tends to 1. That gives the four-way rule:

| Degree parity | Leading sign | As x → −∞ | As x → +∞ | Shape memory |
|---|---|---|---|---|
| even (2, 4, 6, …) | a_n > 0 | +∞ | +∞ | bowl: both tails up |
| even (2, 4, 6, …) | a_n < 0 | −∞ | −∞ | umbrella: both tails down |
| odd (1, 3, 5, …) | a_n > 0 | −∞ | +∞ | rising S: left down, right up |
| odd (1, 3, 5, …) | a_n < 0 | +∞ | −∞ | falling S: left up, right down |

[Interactive widget: end-behavior gallery]

Why parity matters. An even power kills the sign of $x$: $(-2)^{2} = (+2)^{2} = 4$, so both tails of $x^2$ go to the same infinity. An odd power preserves the sign: $(-2)^{3} = -8$ versus $(+2)^{3} = +8$, so the tails of $x^3$ fly to opposite infinities.
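
The four-way rule is small enough to write as a function. The sketch below is illustrative only: `end_behavior` is a hypothetical helper, and the limits are returned as plain strings. The loop at the end checks, for the running example, that the ratio of the polynomial to its leading term drifts toward 1 as x grows.

```python
def end_behavior(degree, leading_coeff):
    """Return the limits of p(x) as x -> -inf and x -> +inf, as strings."""
    if degree < 1 or leading_coeff == 0:
        raise ValueError("need degree >= 1 and a non-zero leading coefficient")
    right = "+inf" if leading_coeff > 0 else "-inf"
    if degree % 2 == 0:
        left = right                                   # even power: tails agree
    else:
        left = "-inf" if right == "+inf" else "+inf"   # odd power: tails oppose
    return left, right

def p(x):
    return x**3 - 3*x

print(end_behavior(3, 1))    # ('-inf', '+inf')  rising S
print(end_behavior(4, -2))   # ('-inf', '-inf')  umbrella

# The ratio p(x) / (a_n x^n) approaches 1 as |x| grows.
for x in [10.0, 100.0, 1000.0]:
    print(x, p(x) / x**3)
```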

Roots: Where the Curve Crosses Zero

A root (or zero) of $p$ is any number $r$ for which $p(r) = 0$. Visually it is exactly where the curve meets the x-axis. Two truths govern them, and almost everything else about polynomials follows from these two.

The Fundamental Theorem of Algebra (1799, Gauss)

Every polynomial of degree $n \ge 1$ has exactly $n$ roots in the complex numbers, counted with multiplicity. Some of those roots may be real (you see them where the curve meets the x-axis); for real coefficients, the others come in complex conjugate pairs (you only see them as "something is missing" in the graph).

Consequence over the real numbers: a real polynomial of degree $n$ has at most $n$ real roots. If $n$ is odd, it has at least one real root, because the tails point to opposite infinities and the curve is continuous, so it has to cross zero on the way.

So the root-count is bounded by the degree:

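| Degree n | Possible numbers of distinct real roots | Always has ≥ 1 real root? |
|---|---|---|
| 1 | 1 | Yes |
| 2 | 0, 1, or 2 | No (e.g. x² + 1 has no real root) |
| 3 | 1, 2, or 3 | Yes |
| 4 | 0, 1, 2, 3, or 4 | No |
| 5 | 1, 2, 3, 4, or 5 | Yes |
| n (even) | 0 to n | No |
| n (odd) | 1 to n | Yes |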
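
The "odd degree always has a real root" argument is constructive: if the curve is negative at one end of an interval and positive at the other, repeated halving traps a crossing. Here is a minimal bisection sketch under the assumption that you already have a bracket with opposite signs; `evaluate` and `bisect_root` are hypothetical helper names, and coefficients are stored lowest power first.

```python
def evaluate(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n (coefficients in increasing power)."""
    return sum(a * x**i for i, a in enumerate(coeffs))

def bisect_root(coeffs, lo, hi, tol=1e-12):
    """Find a root in [lo, hi], assuming p(lo) and p(hi) have opposite signs."""
    f_lo, f_hi = evaluate(coeffs, lo), evaluate(coeffs, hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("p(lo) and p(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = evaluate(coeffs, mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid    # sign change is in the right half
    return (lo + hi) / 2

coeffs = [1, 1, 0, 1]                # x^3 + x + 1, odd degree
root = bisect_root(coeffs, -10.0, 10.0)
print(root, evaluate(coeffs, root))  # a root near -0.68, residual near 0
```

Bisection only finds one root per bracket; it says nothing about how many other real roots exist.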

Factored Form & Multiplicity

When all $n$ roots of $p$ (counted with multiplicity) are real, with distinct values $r_1, r_2, \ldots, r_k$, we can rebuild the polynomial as a product:

$$p(x) = a_n\,(x - r_1)^{m_1}\,(x - r_2)^{m_2}\,\cdots\,(x - r_k)^{m_k},$$

where $m_i$ is the multiplicity of root $r_i$: how many times that linear factor appears. The multiplicities then add up to the degree: $m_1 + m_2 + \cdots + m_k = n$. If some roots are non-real, the product picks up extra quadratic factors with no real roots (such as $x^2 + 1$), and the real multiplicities add up to less than $n$.
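
Going from factored form back to coefficients is just repeated polynomial multiplication. The sketch below assumes all roots are real and given as (root, multiplicity) pairs, with coefficients stored lowest power first; `multiply` and `from_roots` are illustrative names, not library functions.

```python
def multiply(p, q):
    """Multiply two polynomials given as increasing-power coefficient lists."""
    product = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product

def from_roots(leading_coeff, roots_with_multiplicity):
    """Expand a_n * prod (x - r)^m into coefficients a0, a1, ..., an."""
    coeffs = [leading_coeff]
    for root, multiplicity in roots_with_multiplicity:
        for _ in range(multiplicity):
            coeffs = multiply(coeffs, [-root, 1])   # the factor (x - root)
    return coeffs

# (x - 1)^2 (x + 2) = x^3 - 3x + 2
print(from_roots(1, [(1, 2), (-2, 1)]))   # [2, -3, 0, 1]
```

The resulting list has length 4, so the degree is 3, matching the sum of the multiplicities 2 + 1.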

Why multiplicity matters geometrically. Near a root $r$ of multiplicity $m$, the polynomial behaves like $C\,(x - r)^{m}$ for some constant $C \neq 0$. The function $(x - r)^{m}$ tells you exactly how the curve approaches zero:

  • m = 1 (simple root): $(x-r)$ changes sign when $x$ passes through $r$, so $p$ changes sign. The curve crosses the x-axis transversally, like a line.
  • m = 2 (double root): $(x-r)^2 \ge 0$ on both sides, so $p$ does not change sign. The curve touches the axis and bounces back, like a parabola at its vertex.
  • m = 3 (triple root): $(x-r)^3$ changes sign but with zero slope. The curve flexes through the axis with a horizontal tangent, like a cubic at the origin.
  • Even multiplicities → touch and bounce. Odd multiplicities → cross.

[Interactive widget: factored-form playground]

Drag two simple roots toward each other in the widget. You will see two crossings merge into a single "kiss": a double root. That is what people mean when they say roots coalesce; the same phenomenon is what creates repeated eigenvalues in linear algebra and bifurcations in dynamical systems.
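
The cross-versus-bounce rule can be checked numerically by comparing the sign of the polynomial just left and just right of a known root. This is a rough sketch: `classify_root` is a hypothetical helper, and the step `h` is an assumption that must be smaller than the distance to the nearest other root. A sign test alone cannot tell a simple crossing from a flex; both show up as "crosses".

```python
def classify_root(p, r, h=1e-3):
    """Describe how p meets the x-axis at a known root r.

    Compares the sign of p just left and just right of r. The step h is an
    assumption: it must be smaller than the gap to the nearest other root.
    """
    left, right = p(r - h), p(r + h)
    if left * right < 0:
        return "crosses"
    return "touches and bounces"

def p(x):
    # Roots: -2 (multiplicity 1), 1 (multiplicity 2), 3 (multiplicity 3)
    return (x + 2) * (x - 1) ** 2 * (x - 3) ** 3

for r in (-2, 1, 3):
    print(r, classify_root(p, r))
```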

Turning Points — The Bend Budget

A turning point (a local maximum or local minimum) is a place where the curve briefly stops climbing and starts falling, or vice-versa. Between every two distinct real roots of a smooth function, there has to be at least one turning point — otherwise the curve could not get back to zero. So once you know how many roots a polynomial has, you already know a lower bound on its turning points.

The upper bound comes from a fact you will prove in Chapter 4: every turning point of $p$ is a real root of its derivative $p'$. (The converse can fail: $x^3$ has $p'(0) = 0$ but no turning point there.) If $p$ has degree $n$, then $p'$ has degree $n-1$, and by the fundamental theorem it can have at most $n-1$ real roots. That is the famous bend budget:

$$\#\big\{\text{turning points of a degree-}n\text{ polynomial}\big\} \;\le\; n - 1.$$

A line has 0 turning points. A parabola has exactly 1. A cubic has 0 or 2 (never just one — they come in pairs of local-max/local-min, or the inflection swallows both). A quartic has 1 or 3. Open the explorer above, switch to degree 5, and try to make a curve with exactly 4 turning points — you can; with 5 — you cannot.
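
You can test the bend budget without the widget by differentiating the coefficient list and counting sign changes of the derivative on a grid. This is a sketch under stated assumptions: coefficients are stored lowest power first, every turning point lies inside the sampled window, and the grid is fine enough to separate neighbouring turning points. The helper names are made up for illustration.

```python
def derivative(coeffs):
    """Coefficients of p' given coefficients of p (increasing power)."""
    return [i * a for i, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(a * x**i for i, a in enumerate(coeffs))

def count_turning_points(coeffs, lo=-10.0, hi=10.0, samples=20001):
    """Count sign changes of p' on a grid over [lo, hi]."""
    d = derivative(coeffs)
    step = (hi - lo) / (samples - 1)
    signs = []
    for k in range(samples):
        value = evaluate(d, lo + k * step)
        if value != 0:                      # skip exact zeros of p'
            signs.append(value > 0)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

print(count_turning_points([0, -3, 0, 1]))      # x^3 - 3x        -> 2
print(count_turning_points([0, 0, 0, 1]))       # x^3             -> 0
print(count_turning_points([2, 0, -4, 0, 1]))   # x^4 - 4x^2 + 2  -> 3
```

Note how $x^3$ reports zero turning points even though its derivative vanishes at the origin: the derivative touches zero there without changing sign.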


Worked Example: $p(x) = x^3 - 3x$

Let us put every idea above to work on a single cubic. Read the steps first, then redo them with pen and paper before checking against the answers.

Hand-trace the cubic step by step

Step 1: Read off the anatomy.

  • Degree: $n = 3$ (the highest exponent).
  • Leading coefficient: $a_3 = +1$.
  • Constant term: $a_0 = 0$, so the curve passes through the origin: $p(0) = 0$.

Step 2: Predict the tails. Degree is odd, leading sign is positive, so the curve comes up from $-\infty$ on the left and rises to $+\infty$ on the right: a rising S.

Step 3: Find the roots. Factor out the common $x$:

$$p(x) = x^3 - 3x = x\,(x^2 - 3) = x\,\bigl(x - \sqrt{3}\bigr)\bigl(x + \sqrt{3}\bigr).$$

Three real roots, each with multiplicity one: $r = -\sqrt{3} \approx -1.732,\ 0,\ +\sqrt{3} \approx 1.732$. The curve crosses the x-axis transversally at all three.

Step 4: Count and locate the turning points. Differentiate (rule preview from Chapter 4: multiply by the exponent, drop the exponent by one):

$$p'(x) = 3x^2 - 3.$$

Set $p'(x) = 0$: $3x^2 = 3$, so $x = \pm 1$. That gives exactly two turning points (the maximum allowed for a cubic), at $x = -1$ and $x = +1$.

Step 5: Classify each turning point. Plug in:

  • $p(-1) = (-1)^3 - 3(-1) = -1 + 3 = 2$. Coming from the left the curve rises and at this point reverses, so this is a local maximum at $(-1,\, 2)$.
  • $p(+1) = 1 - 3 = -2$. The curve falls past zero, bottoms out here, then climbs, so this is a local minimum at $(1,\, -2)$.

Step 6: Sketch the silhouette. Connect the dots: come up from $-\infty$ on the far left, cross zero at $-\sqrt{3}$, peak at $(-1, 2)$, fall through the origin, bottom at $(1, -2)$, cross zero at $+\sqrt{3}$, rise to $+\infty$ on the right. Compare with the interactive widget above: it is the same picture.

Step 7: Sanity-check by plugging numbers in.

| x | p(x) by hand | Interpretation |
|---|---|---|
| −2 | (−2)³ − 3(−2) = −8 + 6 = −2 | below axis (tail diving) |
| −√3 | 0 | left root (crossing) |
| −1 | −1 + 3 = 2 | local maximum |
| 0 | 0 | middle root (crossing) |
| 1 | 1 − 3 = −2 | local minimum |
| √3 | 0 | right root (crossing) |
| 2 | 8 − 6 = 2 | above axis (tail rising) |

Every entry agrees with the silhouette. The polynomial is fully understood with no machinery beyond arithmetic and a single derivative.


Python: Evaluate by Hand vs. Horner's Method

Translating the hand-trace into code does two things. First, it cements the convention this chapter uses throughout: coefficients live in a vector indexed by exponent. Second, it reveals why production code rarely evaluates a polynomial by summing $a_i x^i$ term by term: Horner's method needs fewer operations and usually loses less accuracy, at no extra cost in code.

Pure Python — evaluating p(x) = x³ − 3x two ways
poly_eval.py

Comment: what we are about to do

Plain-text plan: take the polynomial p(x) = x^3 − 3x and evaluate it two different ways at the same x, so you can SEE that the algebraic identity p(x) = a0 + a1·x + a2·x² + a3·x³ and the nested Horner form a0 + x·(a1 + x·(a2 + x·a3)) really do produce the same number.

Convention: coefficients in increasing order of power

We list coefficients from a0 (constant) up to a_n (leading). This is the convention numpy.polynomial uses, and the one this chapter uses for plain lists and PyTorch tensors. Many textbooks, and the older numpy.poly1d and numpy.polyval, list them the other way (highest first); getting this backwards is the single most common bug when copying a formula into code.

EXECUTION STATE
📚 ordering = [a0, a1, a2, a3] ← index = exponent. coeffs[2] is always the coefficient of x².
coeffs = [0, -3, 0, 1]

The concrete polynomial for the whole worked example. Reading left to right: constant = 0, slope-of-x term = −3, no x² term, leading x³ term has coefficient +1. So algebraically p(x) = 0 + (−3)x + 0·x² + 1·x³ = x³ − 3x.

EXECUTION STATE
coeffs[0] = 0 — the constant a0
coeffs[1] = -3 — the coefficient of x
coeffs[2] = 0 — the coefficient of x^2 (absent)
coeffs[3] = 1 — the leading coefficient a_n
def eval_direct(coeffs, x): the textbook way

Evaluates the polynomial term by term, exactly as you would on paper. Slower (every power x**i is recomputed from scratch) and typically less accurate for large n, but it is the most readable possible implementation.

EXECUTION STATE
⬇ coeffs = [0, -3, 0, 1]
⬇ x = the point we evaluate at, e.g. 2.0
⬆ returns = the number p(x)
total = 0.0: the accumulator

Start with a running sum of zero, in float. We will add a_i · x^i into it as we walk through the coefficient list.

for i, a in enumerate(coeffs):

enumerate yields the (index, value) pairs of the list. i becomes the EXPONENT of x in that term; a is the coefficient. Tracing on x = 2 gives:

LOOP TRACE · 4 iterations
i = 0
a = 0
x**i = 2**0 = 1
term = 0 * 1 = 0
total after = 0
i = 1
a = -3
x**i = 2**1 = 2
term = -3 * 2 = -6
total after = -6
i = 2
a = 0
x**i = 2**2 = 4
term = 0 * 4 = 0
total after = -6
i = 3
a = 1
x**i = 2**3 = 8
term = 1 * 8 = 8
total after = 2
total += a * (x ** i)

x ** i raises x to the integer power i. We multiply by the coefficient, then add into the running total. In Python ** binds tighter than *, and the augmented assignment += is applied last, so this reads (a * (x**i)) added to total.

return total

Hand off the final sum. For x = 2 the function returns 2 — meaning p(2) = 2. Sanity check: 2³ − 3·2 = 8 − 6 = 2. Matches.

def eval_horner(coeffs, x): the smart way

Horner's method evaluates the same polynomial in NESTED form a0 + x·(a1 + x·(a2 + x·a3)). The algebra is identical; the bookkeeping is cleaner. Crucially we never compute x², x³, x⁴ as separate quantities; we only ever do (running * x) + next_coefficient.

EXECUTION STATE
📚 why it's better = n multiplications and n additions (vs. about 2n multiplications for direct evaluation). No separate powers of x are ever formed, so there is less intermediate state and usually less rounding error.
result = 0.0

Start the Horner accumulator at zero. This will be the value we keep multiplying by x and folding the next coefficient into.

for a in reversed(coeffs):

Walk the coefficient list FROM THE LEADING TERM DOWN. reversed([0, -3, 0, 1]) yields 1, 0, -3, 0. Tracing on x = 2:

LOOP TRACE · 4 iterations
iter 1 (a = 1 — the leading a3)
before result = 0
result * x = 0 * 2 = 0
+ a = 0 + 1 = 1
after = 1
iter 2 (a = 0 — the a2)
before result = 1
result * x = 1 * 2 = 2
+ a = 2 + 0 = 2
after = 2
iter 3 (a = -3 — the a1)
before result = 2
result * x = 2 * 2 = 4
+ a = 4 + (-3) = 1
after = 1
iter 4 (a = 0 — the constant a0)
before result = 1
result * x = 1 * 2 = 2
+ a = 2 + 0 = 2
after = 2
result = result * x + a

The Horner step. Each iteration multiplies the previous result by x and folds the next coefficient in. This is the same as evaluating the nested form ((a3·x + a2)·x + a1)·x + a0 one shell at a time.

return result

Hand back the value of p(x). Same answer as eval_direct, by construction.

Probe loop: five test points

Walks five integer x values across the interesting window. We will see the curve dip to a minimum on the right of zero and rise to a maximum on the left of zero. Tracing the printed output:

LOOP TRACE · 5 iterations
x = -2
p(x) = (-2)**3 - 3*(-2) = -8 + 6 = -2
printed = x = -2 direct = -2.0 horner = -2.0
x = -1
p(x) = (-1)**3 - 3*(-1) = -1 + 3 = 2
printed = x = -1 direct = 2.0 horner = 2.0 (local MAX)
x = 0
p(x) = 0 - 0 = 0
printed = x = 0 direct = 0.0 horner = 0.0 (a root)
x = 1
p(x) = 1 - 3 = -2
printed = x = 1 direct = -2.0 horner = -2.0 (local MIN)
x = 2
p(x) = 8 - 6 = 2
printed = x = 2 direct = 2.0 horner = 2.0
print(f"...")

Just formatted output: right-aligned columns of width 2 / 5 for readability. The actual mathematical work is done by the two eval functions.

```python
# Two ways to evaluate  p(x) = x^3 - 3x  at a point.
# Coefficients in INCREASING order of power:  a0, a1, a2, a3
#   a0 = 0     (constant term)
#   a1 = -3    (coefficient of x)
#   a2 = 0     (coefficient of x^2)
#   a3 = 1     (coefficient of x^3)
coeffs = [0, -3, 0, 1]

def eval_direct(coeffs, x):
    """Sum every  a_i * x**i  term naively."""
    total = 0.0
    for i, a in enumerate(coeffs):
        total += a * (x ** i)
    return total

def eval_horner(coeffs, x):
    """Nested form:  a0 + x*(a1 + x*(a2 + x*a3))  (fewer multiplies)."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

# Probe the curve at five points.
for x in [-2, -1, 0, 1, 2]:
    print(f"x = {x:>2}   direct = {eval_direct(coeffs, x):>5}   "
          f"horner = {eval_horner(coeffs, x):>5}")
```

Why Horner is faster. Direct evaluation needs the powers $x^0, x^1, \ldots, x^n$ as separate quantities: about $2n$ multiplies for a degree-$n$ polynomial even when each power is built from the previous one, and more when every x ** i is computed from scratch. Horner's method does $n$ multiplies and $n$ additions, roughly half the work, and it usually accumulates less rounding error because no large intermediate powers are formed. Numerical libraries typically use Horner or a close cousin (Estrin's scheme for vectorisation); NumPy's polynomial evaluation routines, for instance, document Horner's method.
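
The operation counts are easy to verify by instrumenting both strategies. The sketch below is an illustration, not the code from the listing above: the `_counted` functions are hypothetical variants that return the value together with the number of multiplications, and the direct version is given the benefit of reusing the running power of x.

```python
def eval_direct_counted(coeffs, x):
    """Term-by-term evaluation that reuses the running power of x."""
    total, power, multiplies = coeffs[0], 1, 0
    for a in coeffs[1:]:
        power = power * x          # one multiply to advance x^(i-1) -> x^i
        total += a * power         # one multiply for a_i * x^i
        multiplies += 2
    return total, multiplies

def eval_horner_counted(coeffs, x):
    """Horner evaluation with a multiplication counter."""
    result, multiplies = coeffs[-1], 0
    for a in reversed(coeffs[:-1]):
        result = result * x + a    # one multiply per remaining coefficient
        multiplies += 1
    return result, multiplies

coeffs = [0, -3, 0, 1]                       # x^3 - 3x
print(eval_direct_counted(coeffs, 2))        # (2, 6)
print(eval_horner_counted(coeffs, 2))        # (2, 3)
```

For this cubic the counts are 2n = 6 against n = 3, and the gap grows linearly with the degree.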

PyTorch: Polynomials as Vectorised Tensors

The same algorithm scales straight to a tensor. The reason this matters is that polynomials are everywhere in modern machine learning — feature transforms, positional encodings, Chebyshev approximations, the polynomial activation layers explored in Kolmogorov-Arnold Networks (KANs). All of them rely on evaluating one polynomial at many points, fast.

PyTorch — vectorised Horner over 11 probe points
poly_torch.py

import torch

Brings in PyTorch. The point of this block is to show that the Python loop you just walked through generalises to a TENSOR loop with the same shape of code — but instead of one x at a time, every iteration acts on the whole batch in parallel.

coeffs = torch.tensor([0.0, -3.0, 0.0, 1.0])

Same [a0, a1, a2, a3] vector you saw in the pure-Python block, but wrapped in a 1-D torch.Tensor of dtype float32. PyTorch needs floats here (not ints) because eventually you want autograd on the coefficients — gradients require floating point.

EXECUTION STATE
coeffs.shape = (4,)
coeffs.dtype = torch.float32
coeffs[3] = 1.0 ← the leading coefficient
xs = torch.linspace(-2.5, 2.5, 11)

Builds a 1-D tensor of 11 numbers evenly spaced from −2.5 to 2.5 (inclusive). Step size is (2.5 − (−2.5)) / (11 − 1) = 0.5.

EXECUTION STATE
xs.shape = (11,)
xs = [-2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
def poly_horner(coeffs, x): vectorised Horner

Identical to the pure-Python eval_horner, except every multiply and add is now a tensor op. The body never knows whether x is a scalar or a length-11 tensor — broadcasting handles both.

result = torch.zeros_like(x)

Allocates the accumulator. zeros_like(x) returns a tensor with the same shape and dtype as x, filled with zeros. If x has shape (11,) you get an (11,) accumulator; if x is a scalar you get a scalar.

EXECUTION STATE
result.shape (here) = (11,)
result = [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]
for a in torch.flip(coeffs, dims=[0]):

torch.flip reverses a tensor along the given axis. flip([a0, a1, a2, a3], dim 0) returns [a3, a2, a1, a0], exactly the leading-first ordering Horner's method needs. The Python for-loop then yields these one scalar at a time.

LOOP TRACE · 4 iterations
iter 1 a = 1.0 (the leading a3)
result * x (broadcast) = [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]
+ a = [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]
iter 2 a = 0.0 (a2)
result * x = [-2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
+ a = same (we just added 0)
iter 3 a = -3.0 (a1)
result * x = [6.25, 4.00, 2.25, 1.00, 0.25, 0.00, 0.25, 1.00, 2.25, 4.00, 6.25]
+ a = [3.25, 1.00, -0.75, -2.00, -2.75, -3.00, -2.75, -2.00, -0.75, 1.00, 3.25]
iter 4 a = 0.0 (the constant a0)
result * x = [-8.125, -2.000, 1.125, 2.000, 1.375, 0.000, -1.375, -2.000, -1.125, 2.000, 8.125]
+ a (=0) = [-8.125, -2.000, 1.125, 2.000, 1.375, 0.000, -1.375, -2.000, -1.125, 2.000, 8.125]
result = result * x + a

The vectorised Horner step. result and x are both shape-(11,) tensors; a is a 0-D tensor (scalar). PyTorch broadcasts a across the 11 entries, multiplies elementwise, and adds. One line of code does 11 evaluations of the polynomial.

EXECUTION STATE
📚 broadcasting = shape (11,) * shape (11,) → shape (11,). Then + shape () broadcasts the scalar to (11,). The result keeps shape (11,).
return result

Returns the length-11 tensor of p(x) values. No Python-level loop over x was ever needed; at each Horner step PyTorch's C++/CUDA kernels did the elementwise math for all 11 points at once.

ys = poly_horner(coeffs, xs)

Calls our function on the eleven probe points. Tracing each:

LOOP TRACE · 11 iterations
row 1
x = -2.50
p(x) = (-2.5)^3 - 3(-2.5) = -15.625 + 7.5 = -8.125
row 2
x = -2.00
p(x) = -8 + 6 = -2.000
row 3
x = -1.50
p(x) = -3.375 + 4.5 = 1.125
row 4
x = -1.00
p(x) = -1 + 3 = 2.000 (local max)
row 5
x = -0.50
p(x) = -0.125 + 1.5 = 1.375
row 6
x = 0.00
p(x) = 0
row 7
x = 0.50
p(x) = 0.125 - 1.5 = -1.375
row 8
x = 1.00
p(x) = 1 - 3 = -2.000 (local min)
row 9
x = 1.50
p(x) = 3.375 - 4.5 = -1.125
row 10
x = 2.00
p(x) = 8 - 6 = 2.000
row 11
x = 2.50
p(x) = 15.625 - 7.5 = 8.125
for x, y in zip(xs.tolist(), ys.tolist()):

.tolist() converts a tensor into a regular Python list so we can iterate in plain Python and format with f-strings. zip pairs each x with its corresponding p(x).

print(f"x = {x:+.2f} p(x) = {y:+.4f}")

Format spec :+.2f means 'always show the sign, 2 decimals, fixed point'. :+.4f is the same with 4 decimals. The full output reproduces the table you traced above.

```python
import torch

# Same polynomial:  p(x) = x^3 - 3x.
# We evaluate it at a whole batch of x values (11 here) in one shot.
coeffs = torch.tensor([0.0, -3.0, 0.0, 1.0])      # [a0, a1, a2, a3]
xs = torch.linspace(-2.5, 2.5, 11)                # 11 evenly spaced probes

def poly_horner(coeffs: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Vectorised Horner: works on a 0-D (scalar) tensor x or a 1-D batch of x's."""
    result = torch.zeros_like(x)
    for a in torch.flip(coeffs, dims=[0]):         # leading coeff first
        result = result * x + a
    return result

ys = poly_horner(coeffs, xs)
for x, y in zip(xs.tolist(), ys.tolist()):
    print(f"x = {x:+.2f}   p(x) = {y:+.4f}")
```

Where this is going. Because every line in the body of poly_horner is a differentiable PyTorch op, torch.autograd can compute $\partial p / \partial \text{coeffs}$ for you (provided coeffs is created with requires_grad=True). That is one way to fit a linear regression with polynomial features: the model learns the coefficient vector by gradient descent on $\| p(x) - y \|^2$. You will revisit this in Chapter 4.
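
For a polynomial, the gradient with respect to the coefficients has a simple closed form: since $p(x) = \sum_i a_i x^i$ is linear in each $a_i$, we get $\partial p / \partial a_i = x^i$. The pure-Python sketch below checks that formula against a finite-difference estimate; it is a sanity check on the math rather than a reproduction of what autograd does internally, and the helper names are made up for illustration.

```python
def poly(coeffs, x):
    """Horner evaluation, coefficients in increasing order of power."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def grad_wrt_coeffs(coeffs, x):
    """Analytic gradient: d p / d a_i = x**i for every coefficient."""
    return [x**i for i in range(len(coeffs))]

def finite_difference(coeffs, x, i, eps=1e-6):
    """Numerical check: nudge a_i and measure the change in p(x)."""
    bumped = list(coeffs)
    bumped[i] += eps
    return (poly(bumped, x) - poly(coeffs, x)) / eps

coeffs, x = [0.0, -3.0, 0.0, 1.0], 2.0
print(grad_wrt_coeffs(coeffs, x))    # [1.0, 2.0, 4.0, 8.0]
print([round(finite_difference(coeffs, x, i), 4) for i in range(4)])
```

Notice that the gradient does not depend on the coefficients themselves, only on x. That is why fitting polynomial coefficients is a linear problem.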

Where Polynomials Show Up in the Real World

  1. Physics: projectile motion. Height as a function of time is the quadratic $h(t) = h_0 + v_0 t - \tfrac{1}{2} g t^{2}$. The single root in $t > 0$ is the landing time; the unique turning point is the apex. A degree-2 polynomial captures the entire flight.
  2. Economics: cost & revenue curves. Total cost of producing $q$ units is often modelled as $C(q) = a_0 + a_1 q + a_2 q^{2} + a_3 q^{3}$: a fixed cost plus variable costs that rise nonlinearly. The derivative $C'(q)$ is the marginal cost; production sweet spots sit where marginal cost equals marginal revenue, which is again a polynomial root-finding problem.
  3. Computer graphics: Bézier splines. Most curves in a font, most paths in an SVG, and many CAD curves are piecewise polynomials. The smoothness you feel in fonts comes from consecutive Bézier pieces agreeing on value and, at smooth joins, on tangent direction at their seams.
  4. Numerical analysis: Taylor approximation. Any well-behaved function can be approximated, near a point $a$, by a polynomial whose coefficients are built from the function's derivatives at that point (the $k$-th coefficient is $f^{(k)}(a)/k!$). Polynomials are the universal "local" language of smooth functions; Chapter 13 turns this into a precise theorem.
  5. Machine learning: polynomial features. sklearn.preprocessing.PolynomialFeatures simply maps $x \mapsto (1, x, x^{2}, x^{3}, \ldots)$ so that a linear model can fit polynomial relationships. The bend budget you learned today is the same bend budget that controls how much such a model can overfit.
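
The projectile example makes a good first exercise in reading a quadratic. The sketch below computes the landing time (the root with t ≥ 0) and the apex (the turning point) from the quadratic formula. It assumes a launch height h0 ≥ 0, an initial upward speed v0 ≥ 0, and g = 9.81 m/s²; the function name `flight` is illustrative.

```python
import math

def flight(h0, v0, g=9.81):
    """Landing time and apex of h(t) = h0 + v0*t - 0.5*g*t^2.

    Assumes h0 >= 0, v0 >= 0 and g > 0, so exactly one root lies at t >= 0.
    """
    a, b, c = -0.5 * g, v0, h0
    discriminant = b * b - 4 * a * c
    t_land = (-b - math.sqrt(discriminant)) / (2 * a)   # the non-negative root
    t_apex = -b / (2 * a)                               # where h'(t) = 0
    h_apex = c + b * t_apex + a * t_apex**2
    return t_land, t_apex, h_apex

t_land, t_apex, h_apex = flight(h0=0.0, v0=9.81)
print(t_land, t_apex, h_apex)   # about 2 s to land, apex about 4.905 m at 1 s
```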

Common Pitfalls

1. Confusing degree with number of terms. $x^{4} - 1$ has only two terms but still has degree 4, and therefore still has a bend budget of 3. Always look for the highest exponent, not the biggest-looking expression.
2. Sneaking in non-integer or negative exponents. $x^{2} + \sqrt{x}$ is not a polynomial. The minute one term breaks the rule, the whole expression loses every polynomial guarantee, including the bend budget and the fundamental theorem of algebra.
3. Mistaking turning points for roots. A turning point is where the curve reverses direction; a root is where the curve meets zero. They coincide only when a root has even multiplicity (the "kiss" case). In our cubic $x^3 - 3x$ the turning points are at $x = \pm 1$ with values $p(\pm 1) = \mp 2$, nowhere near zero.
4. Reading coefficients in the wrong order. NumPy's numpy.poly1d takes coefficients highest power first, while numpy.polynomial.Polynomial takes them lowest first. Always double-check; a reversed list silently produces a totally different curve.
5. Trusting the curve near and beyond the edge of the data. A high-degree polynomial fitted to data often oscillates wildly near the ends of the sampled interval (Runge's phenomenon, classically seen when interpolating at equally spaced points) and shoots off just past the last sample. This is the main reason high-degree polynomial fits are dangerous. Splines (low-degree pieces, glued smoothly) are the production answer.
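
Pitfall 4 is worth seeing once with real numbers. The sketch below feeds the same list to a lowest-first evaluator in both orders; `horner` is a small hypothetical helper written for this demonstration, not a library function. Nothing crashes, which is exactly the problem.

```python
def horner(coeffs_low_to_high, x):
    """Evaluate a polynomial whose coefficients are listed lowest power first."""
    result = 0
    for a in reversed(coeffs_low_to_high):
        result = result * x + a
    return result

low_first = [0, -3, 0, 1]                 # a0, a1, a2, a3 for x^3 - 3x
high_first = list(reversed(low_first))    # same numbers, textbook order

print(horner(low_first, 2))     # 2    (correct: 8 - 6)
print(horner(high_first, 2))    # -11  (silently evaluates 1 - 3x^2 instead)
```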

Summary

  • A polynomial is a finite sum $p(x) = \sum_{i=0}^{n} a_i x^{i}$ with $a_n \neq 0$ and non-negative integer exponents.
  • The degree $n$ caps the number of real roots ($\le n$) and turning points ($\le n - 1$): your bend budget.
  • The leading term $a_n x^{n}$ owns the tails: parity of $n$ plus sign of $a_n$ fully determines end behaviour.
  • In factored form, the multiplicity of a root tells you whether the curve crosses (multiplicity 1), bounces off (even multiplicity), or flexes through (odd multiplicity ≥ 3) the x-axis.
  • Horner's method is the production algorithm for evaluating polynomials — fewer multiplications, better numerical stability, and identical structure on a CPU loop or a GPU tensor kernel.
Polynomials are the language a calculator can speak fluently. They are also the language in which limits, derivatives, integrals, and Taylor expansion all begin. Master their shape — degree, leading term, roots, multiplicity, turning points — and the next ten chapters will feel like reading prose instead of decoding hieroglyphs.