Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section you will be able to:

Recognise a polynomial the moment you see one — and explain why only non-negative integer exponents are allowed.
Predict the end behaviour of a polynomial in two seconds, using only the degree $n$ and the leading sign $\operatorname{sgn}(a_n)$ .
Count the maximum possible real roots ( $\le n$ ) and turning points ( $\le n - 1$ ) from the degree alone.
Read a polynomial in factored form and tell whether each root makes the curve cross, kiss, or flex through the x-axis.
Evaluate a polynomial in two ways in Python and PyTorch, and explain why Horner's method is what every numerical library actually uses.

The Problem Polynomials Solve

You already know two extreme kinds of functions. Constant functions are rigid — they refuse to change. Linear functions change at a fixed rate. Real systems do neither: a thrown ball speeds up as it falls, a population grows in spurts, the area of a square scales faster than its side length. We need a family of functions that

is rich enough to bend — multiple times, in either direction,
but is still built entirely from the four arithmetic operations you can do by hand: addition, subtraction, multiplication by a constant, and raising $x$ to a positive integer power,
so that every machine, every spreadsheet, every CPU can evaluate it exactly without ever needing to call exp or sin.

That family is the polynomials. They are the workhorse of applied mathematics: linear regression's big brother, the carrier of Taylor approximation (Chapter 13), the bones of every spline used in graphics and CAD, the "safe" non-linearity in classical control theory. If a function can be drawn with a single unbroken pen-stroke over a closed, bounded interval, a polynomial of high enough degree can match it on that interval as closely as you like (Weierstrass, 1885).

Intuition. Think of the degree $n$ as a bend budget. A line cannot bend ( $n=1$ ). A parabola can bend once ( $n=2$ ). A cubic can bend twice. The higher $n$ goes, the more turns the curve can make — but never more than $n-1$ .

What Counts as a Polynomial

A polynomial in a single variable is any expression of the form

$p(x) \;=\; a_n x^{n} \;+\; a_{n-1} x^{n-1} \;+\; \cdots \;+\; a_2 x^{2} \;+\; a_1 x \;+\; a_0,$

where the coefficients $a_0, a_1, \ldots, a_n$ are real (or complex) constants, the leading coefficient $a_n \neq 0$ , and the exponents are non-negative integers. The largest exponent that appears with a non-zero coefficient is the degree $n$ .

The exponent rule is the only thing that separates polynomials from their close cousins.

$\sqrt{x} = x^{1/2}$ is not a polynomial: the exponent $1/2$ is not an integer.
$1/x = x^{-1}$ is not a polynomial: the exponent is negative.
$x^{x}$ is not a polynomial: the exponent is not constant.

Removing even one of these restrictions takes you out of polynomial-land and into rational, algebraic, or transcendental territory.

Expression	Polynomial?	Why
7	Yes — degree 0	Constant. a_0 = 7.
3x − 1	Yes — degree 1	Linear. Coefficients: a_0 = −1, a_1 = 3.
x² + 5x + 6	Yes — degree 2	Quadratic.
x⁴ − 4x² + 2	Yes — degree 4	All exponents are non-negative integers.
x³ − 3x	Yes — degree 3	Our running example.
√x + 2	No	Exponent 1/2 is not an integer.
1/(x² − 1)	No	Negative exponent hidden in the denominator.
sin(x)	No	Transcendental — no finite polynomial equals sin.
e^x − x³	No	e^x is not a polynomial; sums with one bad term stay bad.

Anatomy: Degree, Leading Term, Constant

Every polynomial has three landmarks worth naming:

The degree $n$ — the highest exponent. It controls everything important: how the curve flies off at the edges, how many roots and turning points it can have, and how fast it grows.
The leading coefficient $a_n$ — the multiplier on the highest-degree term. Its sign decides whether the tails point up or down. Its magnitude sets how steeply the tails eventually climb or fall.
The constant term $a_0$ — the y-intercept: $p(0) = a_0$ , because every $x^k$ with $k \ge 1$ dies at $x = 0$ .
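The three landmarks can be read straight off a coefficient list. The sketch below assumes the list is ordered lowest power first, `[a0, a1, ..., an]`; the helper name `anatomy` is hypothetical and only for illustration.

```python
def anatomy(coeffs):
    """Return (degree, leading coefficient, constant term).

    Assumes coeffs = [a0, a1, ..., an], lowest power first.
    """
    # Trailing zeros do not count: the degree is the largest index
    # whose coefficient is non-zero.
    degree = len(coeffs) - 1
    while degree > 0 and coeffs[degree] == 0:
        degree -= 1
    return degree, coeffs[degree], coeffs[0]


print(anatomy([0, -3, 0, 1]))    # x^3 - 3x
print(anatomy([6, 5, 1, 0, 0]))  # x^2 + 5x + 6, padded with two zero coefficients
```

The second call shows why the definition insists on a non-zero leading coefficient: padding with zeros changes the length of the list but not the degree.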

Tail rule (do this first, always). When you look at any polynomial, find $a_n x^n$ and ignore everything else for a moment. Those two pieces, the degree and the leading sign, completely determine what happens as $x \to \pm\infty$. The other terms only matter inside the viewing window.

Interactive — Shape Explorer

Below is a polynomial up to degree five whose every coefficient is on a slider. Try this experiment first: pick the cubic preset $y = x^3 - 3x$ , then sweep $a_3$ from $+1$ to $-1$ . You will see the curve flip vertically — that single sign change rewrites both tails. Then sweep $a_0$ : the whole curve slides up and down without changing shape. The middle coefficients dent and ripple the curve, but the tails refuse to budge.


What to notice. Red dots mark real roots — places where $p(x) = 0$ . Amber dots mark turning points — places where the curve momentarily stops going up (or stops going down) and reverses. The widget recomputes both every time you move a slider; the counts honour the bend budget $n - 1$ .

End Behavior — The Tails Tell the Truth

Why does the leading term own the tails? Because for large $|x|$ , every other term is dwarfed:

$\dfrac{p(x)}{a_n x^{n}} \;=\; 1 + \dfrac{a_{n-1}}{a_n x} + \cdots + \dfrac{a_0}{a_n x^{n}} \;\;\xrightarrow{\,x \to \pm\infty\,}\;\; 1.$

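Each correction $a_k / (a_n x^{n-k})$ goes to zero because the denominator blows up. So for large $|x|$ the polynomial behaves like $a_n x^n$: the two need not be close in absolute terms, but their ratio tends to 1, which is enough to fix the direction of each tail. That gives the four-way rule: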

Degree parity	Leading sign	As x → −∞	As x → +∞	Shape memory
even (2, 4, 6, …)	a_n > 0	+∞	+∞	bowl — both tails up
even (2, 4, 6, …)	a_n < 0	−∞	−∞	umbrella — both tails down
odd (1, 3, 5, …)	a_n > 0	−∞	+∞	rising S — left down, right up
odd (1, 3, 5, …)	a_n < 0	+∞	−∞	falling S — left up, right down
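The four-way rule is mechanical enough to code. The sketch below is one possible encoding, assuming coefficients are listed lowest power first; `end_behavior` is a hypothetical helper name. It also prints the ratio $p(x) / (a_n x^n)$ for the cubic $x^3 - 3x$ so you can watch it approach 1.

```python
def end_behavior(coeffs):
    """Return (limit as x -> -inf, limit as x -> +inf) as strings.

    Assumes coeffs = [a0, a1, ..., an], lowest power first.
    """
    degree = len(coeffs) - 1
    while degree > 0 and coeffs[degree] == 0:
        degree -= 1
    if degree == 0:
        raise ValueError("a constant has no tails going to infinity")
    lead = coeffs[degree]
    right = "+inf" if lead > 0 else "-inf"
    if degree % 2 == 0:
        left = right                              # even: both tails agree
    else:
        left = "-inf" if lead > 0 else "+inf"     # odd: tails are opposite
    return left, right


print(end_behavior([0, -3, 0, 1]))     # x^3 - 3x
print(end_behavior([2, 0, -4, 0, 1]))  # x^4 - 4x^2 + 2

# Ratio p(x) / (a_n x^n) for p(x) = x^3 - 3x at growing x.
for x in (10.0, 100.0, 1000.0):
    print(x, (x**3 - 3 * x) / x**3)
```

Only the last non-zero coefficient and the parity of its index are inspected; every other entry of the list is ignored, which is the tail rule in code form.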


Why parity matters. An even power kills the sign of $x$: $(-2)^{2} = (+2)^{2} = 4$, so both tails of $x^2$ go to the same infinity. An odd power preserves the sign: $(-2)^{3} = -8$ versus $(+2)^{3} = +8$, so the tails of $x^3$ fly to opposite infinities.

Roots: Where the Curve Crosses Zero

A root (or zero) of $p$ is any number $r$ for which $p(r) = 0$ . Visually it is exactly where the curve meets the x-axis. Two truths govern them, and almost everything else about polynomials follows from these two.

The Fundamental Theorem of Algebra (1799, Gauss)

Every polynomial of degree $n \ge 1$ has exactly $n$ roots in the complex numbers, counted with multiplicity. Some of those roots may be real (you see them as x-axis crossings); others may be complex conjugate pairs (you only see them as "something is missing" in the graph).

Consequence over the real numbers: a real polynomial of degree $n$ has at most $n$ real roots. If $n$ is odd, it has at least one real root, because the tails point to opposite infinities and the curve is continuous — it has to cross zero on the way.

So the root-count is bounded by the degree:

Degree n	Possible numbers of distinct real roots	Always has ≥ 1 real root?
1	1	Yes
2	0, 1, or 2	No (e.g. x² + 1 has no real root)
3	1, 2, or 3	Yes
4	0, 1, 2, 3, or 4	No
5	1, 2, 3, 4, or 5	Yes
n (even)	0 to n	No
n (odd)	1 to n	Yes
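The "odd degree forces a real root" argument is constructive: find two points where the polynomial has opposite signs, then keep halving the interval. The sketch below assumes coefficients are listed lowest power first; `one_real_root` is a hypothetical helper, and the cubic $x^3 + x - 1$ is an illustrative choice, not an example from the text.

```python
def horner(coeffs, x):
    """Evaluate a polynomial given as [a0, a1, ..., an] at x."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def one_real_root(coeffs, tol=1e-12):
    """Bisection for an odd-degree polynomial with non-zero leading coefficient."""
    assert (len(coeffs) - 1) % 2 == 1 and coeffs[-1] != 0
    # Widen [-r, r] until the ends have opposite signs (or one end is a root).
    r = 1.0
    while horner(coeffs, -r) * horner(coeffs, r) > 0:
        r *= 2.0
    lo, hi = -r, r
    f_lo = horner(coeffs, lo)
    if f_lo == 0.0:
        return lo
    if horner(coeffs, hi) == 0.0:
        return hi
    for _ in range(200):  # far more halvings than double precision can use
        mid = 0.5 * (lo + hi)
        f_mid = horner(coeffs, mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the right half
        else:
            hi = mid                # root lies in the left half
    return 0.5 * (lo + hi)


coeffs = [-1, 1, 0, 1]  # x^3 + x - 1
root = one_real_root(coeffs)
print(root, horner(coeffs, root))
```

The widening loop always terminates for odd degree because the tails eventually take opposite signs. For even degree no such guarantee exists, which is why the helper refuses that case.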

Factored Form & Multiplicity

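When every root of $p$ is real, with distinct values $r_1, r_2, \ldots, r_k$ , we can rebuild the polynomial as a product:

$p(x) \;=\; a_n\,(x - r_1)^{m_1}\,(x - r_2)^{m_2}\,\cdots\,(x - r_k)^{m_k},$

where $m_i$ is the multiplicity of root $r_i$ , that is, how many times that linear factor appears. In this all-real case the multiplicities add up to the degree: $m_1 + m_2 + \cdots + m_k = n$ . If some roots are non-real, the product of real linear factors is multiplied by quadratic factors with no real roots (such as $x^2 + 1$ ), and the real multiplicities add up to less than $n$ .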

Why multiplicity matters geometrically. Near a root $r$ , the polynomial behaves like $C\,(x - r)^{m}$ for some constant $C \neq 0$ . The function $(x - r)^{m}$ tells you exactly how the curve approaches zero:

m = 1 (simple root): $(x-r)$ changes sign when $x$ passes through $r$ , so $p$ changes sign — the curve crosses the x-axis transversally, like a line.
m = 2 (double root): $(x-r)^2 \ge 0$ on both sides, so $p$ does not change sign — the curve touches the axis and bounces back, like a parabola at its vertex.
m = 3 (triple root): $(x-r)^3$ changes sign but with zero slope — the curve flexes through the axis with a horizontal tangent, like a cubic at the origin.
Even multiplicities → touch and bounce. Odd multiplicities → cross.
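The cross-or-bounce rule can be checked numerically by comparing the sign of the polynomial just left and just right of each root. The polynomial below is an illustrative choice with one root of each kind, and `crosses_axis` is a hypothetical helper name.

```python
def p(x):
    # Roots: -2 (multiplicity 1), 1 (multiplicity 2), 0 (multiplicity 3).
    return (x + 2) * (x - 1) ** 2 * x ** 3


def crosses_axis(f, r, h=1e-3):
    """True if f has opposite signs just left and just right of r."""
    return (f(r - h) > 0) != (f(r + h) > 0)


for root, mult in [(-2.0, 1), (1.0, 2), (0.0, 3)]:
    kind = "crosses" if crosses_axis(p, root) else "bounces"
    print(f"root {root:+.0f}, multiplicity {mult}: {kind}")
```

A sign test cannot tell a simple crossing from a flex (both change sign); telling those apart requires looking at the slope at the root.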


Drag two simple roots toward each other in the widget. You will see two crossings merge into a single "kiss". That is what people mean when they say roots coalesce; the same phenomenon is what creates repeated eigenvalues in linear algebra and bifurcations in dynamical systems.

Turning Points — The Bend Budget

A turning point (a local maximum or local minimum) is a place where the curve briefly stops climbing and starts falling, or vice-versa. Between every two distinct real roots of a smooth function, there has to be at least one turning point — otherwise the curve could not get back to zero. So once you know how many roots a polynomial has, you already know a lower bound on its turning points.

The upper bound comes from a fact you will prove in Chapter 4: every turning point of $p$ is a real root of its derivative $p'$ (the converse can fail: $x^3$ has $p'(0) = 0$ but no turning point there). If $p$ has degree $n$ , then $p'$ has degree $n-1$ , and by the fundamental theorem it can have at most $n-1$ real roots. That is the famous bend budget:

$\#\big\{\text{turning points of degree-}n\text{ polynomial}\big\} \;\le\; n - 1.$

A line has 0 turning points. A parabola has exactly 1. A cubic has 0 or 2 (never just one — they come in pairs of local-max/local-min, or the inflection swallows both). A quartic has 1 or 3. Open the explorer above, switch to degree 5, and try to make a curve with exactly 4 turning points — you can; with 5 — you cannot.
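The bend budget can be probed numerically: build the derivative's coefficients, then count how often the derivative changes sign across a grid. This is a rough sketch, assuming lowest-power-first coefficients and that all turning points lie inside the scanned window; the helper names and the window $[-10, 10]$ are illustrative assumptions.

```python
def derivative(coeffs):
    """Coefficients of p' given p as [a0, a1, ..., an]."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def horner(coeffs, x):
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def count_turning_points(coeffs, lo=-10.0, hi=10.0, steps=20001):
    """Count sign changes of p' on a uniform grid over [lo, hi]."""
    d = derivative(coeffs)
    count, prev = 0, None
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        s = horner(d, x)
        if s == 0.0:
            continue  # skip exact zeros; compare the signs on either side
        if prev is not None and (s > 0) != (prev > 0):
            count += 1
        prev = s
    return count


print(count_turning_points([0, -3, 0, 1]))     # x^3 - 3x
print(count_turning_points([0, 0, 0, 1]))      # x^3
print(count_turning_points([2, 0, -4, 0, 1]))  # x^4 - 4x^2 + 2
```

Counting sign changes rather than zeros of the derivative is deliberate: for $x^3$ the derivative touches zero at the origin without changing sign, so no turning point is counted there.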

Worked Example — $p(x) = x^3 - 3x$

Let us put every idea above to work on a single cubic. Read each step first, then redo it with pen and paper before looking at the answers.

Step 1 — Read off the anatomy.

Degree: $n = 3$ (the highest exponent).
Leading coefficient: $a_3 = +1$ .
Constant term: $a_0 = 0$ , so the curve passes through the origin: $p(0) = 0$ .

Step 2 — Predict the tails. Degree is odd, leading sign is positive, so the curve falls to $-\infty$ on the left and rises to $+\infty$ on the right: a rising S.

Step 3 — Find the roots. Factor out the common $x$ :

$p(x) = x^3 - 3x = x\,(x^2 - 3) = x\,\bigl(x - \sqrt{3}\bigr)\bigl(x + \sqrt{3}\bigr).$

Three real roots, each with multiplicity one: $r = -\sqrt{3} \approx -1.732,\ \ 0,\ \ +\sqrt{3} \approx 1.732$ . The curve crosses the x-axis transversally at all three.

Step 4 — Count and locate the turning points. Differentiate (rule preview from Chapter 4 — multiply by the exponent, drop the exponent by one):

$p'(x) = 3x^2 - 3.$

Set $p'(x) = 0$ : $3x^2 = 3$ , so $x = \pm 1$ . That gives exactly two turning points (the maximum allowed for a cubic), at $x = -1$ and $x = +1$ .

Step 5 — Classify each turning point. Plug in:

$p(-1) = (-1)^3 - 3(-1) = -1 + 3 = 2$ . Coming from the left the curve rises and at this point reverses — this is a local maximum at $(-1,\, 2)$ .
$p(+1) = 1 - 3 = -2$ . The curve falls past zero, bottoms out here, then climbs — this is a local minimum at $(1,\, -2)$ .

Step 6 — Sketch the silhouette. Connect the dots: come up from $-\infty$ on the far left, cross zero at $-\sqrt{3}$ , peak at $(-1, 2)$ , fall through the origin, bottom at $(1, -2)$ , cross zero at $+\sqrt{3}$ , rise to $+\infty$ on the right. Compare with the interactive widget above — it is the same picture.

Step 7 — Sanity-check by plugging numbers in.

x	p(x) by hand	Interpretation
−2	(−2)³ − 3(−2) = −8 + 6 = −2	below axis (tail diving)
−√3	0	left root (crossing)
−1	−1 + 3 = 2	local maximum
0	0	middle root (crossing)
1	1 − 3 = −2	local minimum
√3	0	right root (crossing)
2	8 − 6 = 2	above axis (tail rising)

Every entry agrees with the silhouette. The polynomial is fully understood with no machinery beyond arithmetic and a single derivative.
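The factorisation from Step 3 can be double-checked by multiplying the linear factors back together. The sketch below multiplies coefficient lists (lowest power first); `poly_mul` and `from_roots` are hypothetical helper names, and the result is only expected to match $x^3 - 3x$ up to floating-point rounding because $\sqrt{3}$ is not exactly representable.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b   # x^i * x^j contributes to x^(i+j)
    return out


def from_roots(roots, lead=1.0):
    """Expand lead * (x - r1) * (x - r2) * ... into a coefficient list."""
    coeffs = [lead]
    for r in roots:
        coeffs = poly_mul(coeffs, [-r, 1.0])  # multiply by (x - r)
    return coeffs


s = math.sqrt(3)
expanded = from_roots([0.0, s, -s])
print(expanded)  # close to [0, -3, 0, 1]
```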

Python: Evaluate by Hand vs. Horner's Method

Translating the hand-trace into code does two things. First, it cements the convention you will see in every library: coefficients live in a vector indexed by exponent. Second, it reveals why nobody actually evaluates a polynomial by summing $a_i x^i$ in production — Horner's method is faster and more numerically stable for free.

Pure Python — evaluating p(x) = x³ − 3x two ways

🐍poly_eval.py


Line 1: Comment, what we are about to do

Plain-text plan: take the polynomial p(x) = x^3 − 3x and evaluate it two different ways at the same x, so you can SEE that the algebraic identity p(x) = a0 + a1·x + a2·x² + a3·x³ and the nested Horner form a0 + x·(a1 + x·(a2 + x·a3)) really do produce the same number.

Line 2: Convention, coefficients in increasing order of power

We list coefficients from a0 (constant) up to a_n (leading). This is the convention of numpy.polynomial, and it is the one used throughout this section. Many textbooks, and NumPy's older numpy.poly1d and numpy.polyval, list them the other way (highest first); getting this backwards is the single most common bug when copying a formula into code.

EXECUTION STATE

📚 ordering = [a0, a1, a2, a3] ← index = exponent. coeffs[2] is always the coefficient of x².

Line 7: coeffs = [0, -3, 0, 1]

The concrete polynomial for the whole worked example. Reading left to right: constant = 0, slope-of-x term = −3, no x² term, leading x³ term has coefficient +1. So algebraically p(x) = 0 + (−3)x + 0·x² + 1·x³ = x³ − 3x.

EXECUTION STATE

coeffs[0] = 0 — the constant a0

coeffs[1] = -3 — the coefficient of x

coeffs[2] = 0 — the coefficient of x^2 (absent)

coeffs[3] = 1 — the leading coefficient a_n

Line 9: def eval_direct(coeffs, x), the textbook way

Evaluates the polynomial term-by-term, exactly as you would on paper. Slow (n+1 multiplies of growing powers) and numerically less accurate for big n, but it is the most readable possible implementation.

EXECUTION STATE

⬇ coeffs = [0, -3, 0, 1]

⬇ x = the point we evaluate at, e.g. 2.0

⬆ returns = the number p(x)

Line 11: total = 0.0 (accumulator)

Start with a running sum of zero, in float. We will add a_i · x^i into it as we walk through the coefficient list.

Line 12: for i, a in enumerate(coeffs):

enumerate yields the (index, value) pairs of the list. i becomes the EXPONENT of x in that term; a is the coefficient. Tracing on x = 2 gives:

LOOP TRACE · 4 iterations

i = 0

a = 0

x**i = 2**0 = 1

term = 0 * 1 = 0

total after = 0

i = 1

a = -3

x**i = 2**1 = 2

term = -3 * 2 = -6

total after = -6

i = 2

a = 0

x**i = 2**2 = 4

term = 0 * 4 = 0

total after = -6

i = 3

a = 1

x**i = 2**3 = 8

term = 1 * 8 = 8

total after = 2

Line 13: total += a * (x ** i)

x ** i raises x to the integer power i. We multiply by the coefficient, then add into the running total. Operator precedence in Python: ** is higher than * is higher than +=, so this reads (a * (x**i)) added to total.

Line 14: return total

Hand off the final sum. For x = 2 the function returns 2 — meaning p(2) = 2. Sanity check: 2³ − 3·2 = 8 − 6 = 2. Matches.

Line 16: def eval_horner(coeffs, x), the smart way

Horner's method evaluates the same polynomial in NESTED form a0 + x·(a1 + x·(a2 + x·a3)). The algebra is identical; the bookkeeping is cleaner. Crucially we never compute x², x³, x⁴ as separate quantities: we only ever do (running * x) + next_coefficient.

EXECUTION STATE

📚 why it's better = n multiplications and n additions (vs. about 2n multiplications for direct). There is no separate run of powers to build, and no huge standalone powers of x to form and then cancel against each other.

Line 18: result = 0.0

Start the Horner accumulator at zero. This will be the value we keep multiplying by x and folding the next coefficient into.

Line 19: for a in reversed(coeffs):

Walk the coefficient list FROM THE LEADING TERM DOWN. reversed([0, -3, 0, 1]) yields 1, 0, -3, 0. Tracing on x = 2:

LOOP TRACE · 4 iterations

iter 1 (a = 1 — the leading a3)

before result = 0

result * x = 0 * 2 = 0

+ a = 0 + 1 = 1

after = 1

iter 2 (a = 0 — the a2)

before result = 1

result * x = 1 * 2 = 2

+ a = 2 + 0 = 2

after = 2

iter 3 (a = -3 — the a1)

before result = 2

result * x = 2 * 2 = 4

+ a = 4 + (-3) = 1

after = 1

iter 4 (a = 0 — the constant a0)

before result = 1

result * x = 1 * 2 = 2

+ a = 2 + 0 = 2

after = 2

Line 20: result = result * x + a

The Horner step. Each iteration multiplies the previous result by x and folds the next coefficient in. Algebraically this is the same as evaluating the nested form ((a3·x + a2)·x + a1)·x + a0 one shell at a time.

Line 21: return result

Hand back the value of p(x). Same answer as eval_direct, by construction.

Line 24: Probe loop, five test points

Walks five integer x values across the interesting window. We will see the curve dip to a minimum on the right of zero and rise to a maximum on the left of zero. Tracing the printed output:

LOOP TRACE · 5 iterations

x = -2

p(x) = (-2)**3 - 3*(-2) = -8 + 6 = -2

printed = x = -2 direct = -2.0 horner = -2.0

x = -1

p(x) = (-1)**3 - 3*(-1) = -1 + 3 = 2

printed = x = -1 direct = 2.0 horner = 2.0 (local MAX)

x = 0

p(x) = 0 - 0 = 0

printed = x = 0 direct = 0.0 horner = 0.0 (a root)

x = 1

p(x) = 1 - 3 = -2

printed = x = 1 direct = -2.0 horner = -2.0 (local MIN)

x = 2

p(x) = 8 - 6 = 2

printed = x = 2 direct = 2.0 horner = 2.0

Lines 25-26: print(f"...")

Just formatted output: right-aligned columns of width 2 / 5 for readability. The actual mathematical work is done by the two eval functions.


```python
# Two ways to evaluate  p(x) = x^3 - 3x  at a point.
# Coefficients in INCREASING order of power:  a0, a1, a2, a3
#   a0 = 0     (constant term)
#   a1 = -3    (coefficient of x)
#   a2 = 0     (coefficient of x^2)
#   a3 = 1     (coefficient of x^3)
coeffs = [0, -3, 0, 1]

def eval_direct(coeffs, x):
    """Sum every  a_i * x**i  term naively."""
    total = 0.0
    for i, a in enumerate(coeffs):
        total += a * (x ** i)
    return total

def eval_horner(coeffs, x):
    """Nested form:  a0 + x*(a1 + x*(a2 + x*a3)), which needs fewer multiplies."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

# Probe the curve at five points.
for x in [-2, -1, 0, 1, 2]:
    print(f"x = {x:>2}   direct = {eval_direct(coeffs, x):>5}   "
          f"horner = {eval_horner(coeffs, x):>5}")
```
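The ordering convention is easy to get wrong, so here is a small demonstration of what a reversed coefficient list does. The two helper names are hypothetical; they differ only in the direction in which they walk the list.

```python
def horner_low_first(coeffs, x):
    """Horner evaluation for coeffs = [a0, a1, ..., an]."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def horner_high_first(coeffs, x):
    """Horner evaluation for coeffs = [an, ..., a1, a0]."""
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result


low_first = [0, -3, 0, 1]   # x^3 - 3x, lowest power first
x = 2.0

right = horner_low_first(low_first, x)
wrong = horner_high_first(low_first, x)        # list read as -3x^2 + 1
fixed = horner_high_first(low_first[::-1], x)  # reverse the list first

print(right, wrong, fixed)
```

No error is raised in the mismatched call. The same four numbers simply describe a different polynomial, $-3x^2 + 1$, which is why the bug tends to go unnoticed.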

Why Horner is faster. Direct evaluation builds the powers $x^0, x^1, \ldots, x^n$ separately and then scales each one, which costs about $2n$ multiplies for a degree-$n$ polynomial even when each power is obtained from the previous one. Horner's method does $n$ multiplies and $n$ additions, roughly half the work, and it never forms large powers of $x$ that then have to cancel against each other. NumPy's polynomial evaluators document Horner's method, and other numerical libraries typically use it or a close cousin (Estrin's scheme, which exposes more parallelism for vectorisation).
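The operation counts can be made concrete by instrumenting both methods. This is a sketch under one stated assumption: the direct method is given its best case, where each power is built from the previous one by a single multiply. The function names are hypothetical.

```python
def direct_with_count(coeffs, x):
    """Term-by-term evaluation with a running power; returns (value, multiplies)."""
    total, power, mults = 0.0, 1.0, 0
    for a in coeffs:
        total += a * power   # scale the current power by its coefficient
        mults += 1
        power *= x           # prepare the next power
        mults += 1
    return total, mults


def horner_with_count(coeffs, x):
    """Horner evaluation seeded with the leading coefficient; returns (value, multiplies)."""
    result, mults = float(coeffs[-1]), 0
    for a in reversed(coeffs[:-1]):
        result = result * x + a
        mults += 1
    return result, mults


coeffs = [0, -3, 0, 1]   # x^3 - 3x, lowest power first
print(direct_with_count(coeffs, 2.0))
print(horner_with_count(coeffs, 2.0))
```

Seeding Horner with the leading coefficient, rather than with zero as in `eval_horner`, is what brings the count down to exactly $n$ multiplies; the value returned is the same either way.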

PyTorch: Polynomials as Vectorised Tensors

The same algorithm scales straight to a tensor. The reason this matters is that polynomials are everywhere in modern machine learning — feature transforms, positional encodings, Chebyshev approximations, the polynomial activation layers explored in Kolmogorov-Arnold Networks (KANs). All of them rely on evaluating one polynomial at many points, fast.

PyTorch — vectorised Horner over 11 probe points

🐍poly_torch.py


Line 1: import torch

Brings in PyTorch. The point of this block is to show that the Python loop you just walked through generalises to a TENSOR loop with the same shape of code — but instead of one x at a time, every iteration acts on the whole batch in parallel.

Line 5: coeffs = torch.tensor([0.0, -3.0, 0.0, 1.0])

Same [a0, a1, a2, a3] vector you saw in the pure-Python block, but wrapped in a 1-D torch.Tensor of dtype float32. PyTorch needs floats here (not ints) because eventually you want autograd on the coefficients — gradients require floating point.

EXECUTION STATE

coeffs.shape = (4,)

coeffs.dtype = torch.float32

coeffs[3] = 1.0 ← the leading coefficient

Line 6: xs = torch.linspace(-2.5, 2.5, 11)

Builds a 1-D tensor of 11 numbers evenly spaced from −2.5 to 2.5 (inclusive). Step size is (2.5 − (−2.5)) / (11 − 1) = 0.5.

EXECUTION STATE

xs.shape = (11,)

xs = [-2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

Line 8: def poly_horner(coeffs, x), vectorised Horner

Identical to the pure-Python eval_horner, except every multiply and add is now a tensor op. The body never knows whether x is a scalar or a length-11 tensor — broadcasting handles both.

Line 10: result = torch.zeros_like(x)

Allocates the accumulator. zeros_like(x) returns a tensor with the same shape and dtype as x, filled with zeros. If x has shape (11,) you get an (11,) accumulator; if x is a scalar you get a scalar.

EXECUTION STATE

result.shape (here) = (11,)

result = [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]

Line 11: for a in torch.flip(coeffs, dims=[0]):

torch.flip reverses a tensor along the given axis. flip([a0, a1, a2, a3], dim 0) returns [a3, a2, a1, a0] — exactly the leading-first ordering Horner's method needs. The Python for-loop then yields these one scalar at a time.

LOOP TRACE · 4 iterations

iter 1 a = 1.0 (the leading a3)

result * x (broadcast) = [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]

+ a = [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]

iter 2 a = 0.0 (a2)

result * x = [-2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

+ a = same (we just added 0)

iter 3 a = -3.0 (a1)

result * x = [6.25, 4.00, 2.25, 1.00, 0.25, 0.00, 0.25, 1.00, 2.25, 4.00, 6.25]

+ a = [3.25, 1.00, -0.75, -2.00, -2.75, -3.00, -2.75, -2.00, -0.75, 1.00, 3.25]

iter 4 a = 0.0 (the constant a0)

result * x = [-8.125, -2.000, 1.125, 2.000, 1.375, 0.000, -1.375, -2.000, -1.125, 2.000, 8.125]

+ a (=0) = [-8.125, -2.000, 1.125, 2.000, 1.375, 0.000, -1.375, -2.000, -1.125, 2.000, 8.125]

Line 12: result = result * x + a

The vectorised Horner step. result and x are both shape-(11,) tensors; a is a 0-D tensor (scalar). PyTorch broadcasts a across the 11 entries, multiplies elementwise, and adds. One line of code does 11 evaluations of the polynomial.

EXECUTION STATE

📚 broadcasting = shape (11,) * shape (11,) → shape (11,). Then + shape () broadcasts the scalar to (11,). The result keeps shape (11,).

Line 13: return result

Returns the length-11 tensor of p(x) values. No Python-level loop over x was ever needed; PyTorch's C++/CUDA kernels did the elementwise math in one shot.

Line 15: ys = poly_horner(coeffs, xs)

Calls our function on the eleven probe points. Tracing each:

LOOP TRACE · 11 iterations

row 1

x = -2.50

p(x) = (-2.5)^3 - 3(-2.5) = -15.625 + 7.5 = -8.125

row 2

x = -2.00

p(x) = -8 + 6 = -2.000

row 3

x = -1.50

p(x) = -3.375 + 4.5 = 1.125

row 4

x = -1.00

p(x) = -1 + 3 = 2.000 (local max)

row 5

x = -0.50

p(x) = -0.125 + 1.5 = 1.375

row 6

x = 0.00

p(x) = 0

row 7

x = 0.50

p(x) = 0.125 - 1.5 = -1.375

row 8

x = 1.00

p(x) = 1 - 3 = -2.000 (local min)

row 9

x = 1.50

p(x) = 3.375 - 4.5 = -1.125

row 10

x = 2.00

p(x) = 8 - 6 = 2.000

row 11

x = 2.50

p(x) = 15.625 - 7.5 = 8.125

Line 16: for x, y in zip(xs.tolist(), ys.tolist()):

.tolist() converts a tensor into a regular Python list so we can iterate in plain Python and format with f-strings. zip pairs each x with its corresponding p(x).

Line 17: print(f"x = {x:+.2f}   p(x) = {y:+.4f}")

Format spec :+.2f means 'always show the sign, 2 decimals, fixed point'. :+.4f is the same with 4 decimals. The full output reproduces the table you traced above.


```python
import torch

# Same polynomial:  p(x) = x^3 - 3x.
# We will evaluate it at many x values in one shot (11 of them here).
coeffs = torch.tensor([0.0, -3.0, 0.0, 1.0])      # [a0, a1, a2, a3]
xs = torch.linspace(-2.5, 2.5, 11)                # 11 evenly spaced probes

def poly_horner(coeffs: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Vectorised Horner: works on a SCALAR x or a 1-D batch of x's."""
    result = torch.zeros_like(x)
    for a in torch.flip(coeffs, dims=[0]):         # leading coeff first
        result = result * x + a
    return result

ys = poly_horner(coeffs, xs)
for x, y in zip(xs.tolist(), ys.tolist()):
    print(f"x = {x:+.2f}   p(x) = {y:+.4f}")
```

Where this is going. Because every line in the body of poly_horner is a differentiable PyTorch op, you can compute $\partial p / \partial \text{coeffs}$ with torch.autograd for free, provided coeffs is created with requires_grad=True. That is one way to fit a linear regression with polynomial features: the model learns the coefficient vector by gradient descent on $\| p(x) - y \|^2$ . You will revisit this in Chapter 4.
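To see what that gradient looks like, note that $p$ is linear in its coefficients, so $\partial p / \partial a_i = x^i$. The pure-Python sketch below is a stand-in for autograd, not a use of it: it writes the gradient by hand and checks it against finite differences. The function names and the step size `eps` are illustrative assumptions.

```python
def poly(coeffs, x):
    """Horner evaluation for coeffs = [a0, a1, ..., an]."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def grad_wrt_coeffs(coeffs, x):
    """Hand-derived gradient: d p / d a_i = x**i, independent of the a_i values."""
    return [x ** i for i in range(len(coeffs))]


def finite_difference(coeffs, x, eps=1e-6):
    """Numerical check: bump one coefficient at a time and measure the change."""
    base = poly(coeffs, x)
    grads = []
    for i in range(len(coeffs)):
        bumped = list(coeffs)
        bumped[i] += eps
        grads.append((poly(bumped, x) - base) / eps)
    return grads


coeffs = [0.0, -3.0, 0.0, 1.0]   # x^3 - 3x
analytic = grad_wrt_coeffs(coeffs, 2.0)
numeric = finite_difference(coeffs, 2.0)
print(analytic)
print(numeric)
```

Because the gradient does not depend on the current coefficients, the squared-error loss is a quadratic bowl in the coefficient vector, which is what makes this fitting problem well behaved.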

Where Polynomials Show Up in the Real World

Physics — projectile motion. Height as a function of time is the quadratic $h(t) = h_0 + v_0 t - \tfrac{1}{2} g t^{2}$ . The single root in $t > 0$ is the landing time; the unique turning point is the apex. A degree-2 polynomial captures the entire flight.
Economics — cost & revenue curves. Total cost of producing $q$ units is often modelled as $C(q) = a_0 + a_1 q + a_2 q^{2} + a_3 q^{3}$ : a fixed cost plus variable costs that rise nonlinearly. The derivative $C'(q)$ is the marginal cost, a quadratic; when $a_3 > 0$ its minimum marks the output level at which one more unit is cheapest to produce.
Computer graphics — Bézier splines. Font outlines and most paths in an SVG are piecewise polynomial curves, and CAD surfaces are built from closely related pieces. The smoothness you see in a well-drawn glyph comes from consecutive Bézier pieces agreeing on value, and on tangent direction wherever a smooth join is wanted, at their seams.
Numerical analysis — Taylor approximation. Any well-behaved function can be approximated, near a point, by a polynomial whose coefficients are built from the function's derivatives at that point (the $k$ -th derivative divided by $k!$ ). Polynomials are the universal "local" language of smooth functions; Chapter 13 turns this into a precise theorem.
Machine learning — polynomial features. sklearn.preprocessing.PolynomialFeatures simply maps $x \mapsto (1, x, x^{2}, x^{3}, \ldots)$ so that a linear model can fit polynomial relationships. The bend budget you learned today is the same bend budget that controls how much such a model can overfit.
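The projectile item in the list above is a complete worked use of a degree-2 polynomial: the turning point gives the apex and the positive root gives the landing time. The sketch below assumes $h_0 \ge 0$ and uses $g = 9.81$ m/s² as a default; the function name `flight` and the sample launch values are illustrative.

```python
import math


def flight(h0, v0, g=9.81):
    """Apex time, apex height and landing time for h(t) = h0 + v0*t - g*t**2/2.

    Assumes h0 >= 0 so that a landing time t >= 0 exists.
    """
    # Turning point: h'(t) = v0 - g*t = 0.
    t_apex = v0 / g
    h_apex = h0 + v0 * t_apex - 0.5 * g * t_apex ** 2
    # Positive root of -g/2 * t^2 + v0 * t + h0 = 0 (quadratic formula).
    t_land = (v0 + math.sqrt(v0 ** 2 + 2.0 * g * h0)) / g
    return t_apex, h_apex, t_land


t_apex, h_apex, t_land = flight(h0=0.0, v0=9.81)
print(t_apex, h_apex, t_land)
```

With a launch from the ground, the landing time is twice the apex time, which reflects the symmetry of the parabola about its turning point.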

Common Pitfalls

1. Confusing degree with number of terms. $x^{4} - 1$ has only two terms but still has degree 4, and therefore can still bend up to 3 times. Always look for the highest exponent, not the biggest-looking expression.

2. Sneaking in non-integer or negative exponents. $x^{2} + \sqrt{x}$ is not a polynomial. The minute one term breaks the rule, the whole expression loses every polynomial guarantee, including the bend budget and the fundamental theorem of algebra.

3. Mistaking turning points for roots. A turning point is where the curve reverses direction; a root is where the curve touches zero. They coincide only at a root of even multiplicity (the "kiss" case). In our cubic $x^3 - 3x$ the turning points are at $x = \pm 1$ with values $p = \mp 2$ , nowhere near zero.

4. Reading coefficients in the wrong order. NumPy's numpy.poly1d takes coefficients highest power first, while numpy.polynomial.Polynomial takes them lowest first. Always double-check; a reversed list silently produces a totally different curve.

5. Trusting the curve outside the viewing window. A high-degree polynomial forced through many equally spaced data points often oscillates wildly near the ends of the sampled interval (this is Runge's phenomenon) and shoots off even faster just past the last sample. That is the main reason high-degree polynomial fits are dangerous. Splines (low-degree pieces, glued smoothly) are the production answer.
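Pitfall 5 is easy to reproduce. The sketch below interpolates the classic test function $1 / (1 + 25x^2)$ at equally spaced points on $[-1, 1]$ and measures the worst error on a fine grid. The test function, the degrees tried, and the helper names are illustrative assumptions, and plain Lagrange interpolation is used here for readability rather than because it is the preferred production method.

```python
def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange_eval(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(nodes, values)):
        weight = 1.0
        for k, xk in enumerate(nodes):
            if k != j:
                weight *= (x - xk) / (xj - xk)
        total += yj * weight
    return total


def max_error(degree, samples=401):
    """Worst error on [-1, 1] for the degree-`degree` equispaced interpolant."""
    nodes = [-1.0 + 2.0 * i / degree for i in range(degree + 1)]
    values = [runge(t) for t in nodes]
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(lagrange_eval(nodes, values, t) - runge(t)) for t in grid)


for degree in (4, 10, 16):
    print(degree, max_error(degree))
```

Adding more equally spaced samples makes the fit worse near the ends of the interval, not better, which is the opposite of what intuition suggests.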

Summary

A polynomial is a finite sum $p(x) = \sum_{i=0}^{n} a_i x^{i}$ with $a_n \neq 0$ and non-negative integer exponents.
The degree $n$ caps the number of real roots ( $\le n$ ) and turning points ( $\le n - 1$ ) — your bend budget.
The leading term $a_n x^{n}$ owns the tails: parity of $n$ + sign of $a_n$ fully determines end behaviour.
In factored form, the multiplicity of a root tells you whether the curve crosses (multiplicity 1), bounces off (even multiplicity), or flexes through (odd multiplicity of 3 or more) the x-axis.
Horner's method is the production algorithm for evaluating polynomials — fewer multiplications, better numerical stability, and identical structure on a CPU loop or a GPU tensor kernel.

Polynomials are the language a calculator can speak fluently. They are also the language in which limits, derivatives, integrals, and Taylor expansion all begin. Master their shape — degree, leading term, roots, multiplicity, turning points — and the next ten chapters will feel like reading prose instead of decoding hieroglyphs.
