Learning Objectives
By the end of this section you will be able to:
- State what makes a function linear and recognise the form $f(x) = mx + b$.
- Compute slope from any two points using $m = \frac{y_2 - y_1}{x_2 - x_1}$, and explain why the answer never depends on which two points you pick.
- Translate between three forms of a line: slope–intercept, point–slope, and two-point.
- Generalise the slope formula to average rate of change for any function: the secant slope $\frac{f(b) - f(a)}{b - a}$.
- See how the secant rotates onto the tangent as the two points collapse — the first taste of the derivative.
- Implement all of the above in plain Python and then in PyTorch, and confirm the limit numerically against autograd.
The Big Picture: Constant Rate of Change
A linear function is a function whose rate of change never changes. Pick any starting point, walk any distance to the right, and the function will always step up (or down) by the same multiple of how far you walked.
Imagine you are filling a bathtub from a tap that pours at exactly the same speed every second. After one second the water is up by some amount; after two seconds it is up by twice that amount; after three seconds, three times. The water level is a linear function of time. The story is boring in the best possible way — every second is exactly like the last.
The intuition in one sentence
A linear function is what you get when nothing about the rate ever changes. Everything else we will study in this book — curves, growth, decay, motion — is what happens when the rate does change. Linear is the baseline.
Why is this the right place to start a calculus book? Because all of differential calculus is built on a single dream: zoom in close enough to any smooth curve and it looks like a straight line. The slope of that local straight line is what we will eventually call the derivative. So before we earn the right to study curves, we must master the lines that approximate them.
What is a Linear Function?
A linear function is a function of the form
$$f(x) = mx + b$$
where $m$ and $b$ are fixed real numbers. The graph is a straight line in the $xy$-plane.
| Symbol | Name | Geometric meaning |
|---|---|---|
| x | Input | Horizontal coordinate of any point on the line. |
| f(x), or y | Output | Vertical coordinate. Always equal to m·x + b. |
| m | Slope | How much y changes when x changes by 1. Steepness. |
| b | y-intercept | The value of y when x = 0. Where the line hits the y-axis. |
What is NOT a linear function (in this book)
You will sometimes hear that $x^2$ or $\sin x$ are "non-linear". They are. In calculus, the term linear function is reserved strictly for the form $f(x) = mx + b$: first power of $x$, no squares, no logs, no sines. (Mathematicians in higher courses reserve "linear" in the algebraic sense for $f(x) = mx$, because a non-zero $b$ breaks additivity, but the calculus convention is the one above.)
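To make the additivity remark concrete, here is a small check, offered as a sketch. The helper name `is_additive` and the sample values are illustrative assumptions. It tests whether $f(u + v) = f(u) + f(v)$ on a handful of integer inputs; passing on a few samples does not prove additivity, but a single failure disproves it.

```python
def is_additive(f, samples):
    """Return True if f(u + v) == f(u) + f(v) for every pair drawn from samples."""
    return all(f(u + v) == f(u) + f(v) for u in samples for v in samples)

samples = [-2, 0, 1, 3]                # illustrative integer inputs (exact arithmetic)

through_origin = lambda x: 3 * x       # b = 0
with_intercept = lambda x: 3 * x + 5   # b = 5

print(is_additive(through_origin, samples))  # True
print(is_additive(with_intercept, samples))  # False
```

The second function fails because the intercept is counted once on the left-hand side and twice on the right.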
Slope: The Soul of a Linear Function
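Take any two points on a line, call them $(x_1, y_1)$ and $(x_2, y_2)$. The slope of the line is
$$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\Delta y}{\Delta x}.$$
The symbol $\Delta$ (Greek capital delta) is the standard mathematical notation for "the change in". $\Delta y$ means "the change in y", $\Delta x$ means "the change in x". The slope is rise over run.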
The single most important property
For a linear function the ratio $\Delta y / \Delta x$ gives the same number no matter which two points you pick. That is what "constant rate of change" means in symbols.
Why is the slope constant? A short proof you can do in your head
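Take any two points $(x_1, y_1)$ and $(x_2, y_2)$ on the line $y = mx + b$. Then by definition $y_1 = m x_1 + b$ and $y_2 = m x_2 + b$. Subtract:
$$y_2 - y_1 = (m x_2 + b) - (m x_1 + b) = m (x_2 - x_1).$$
The two $b$'s cancel. Divide both sides by $x_2 - x_1$ (non-zero, because the two points are different) and you get $\frac{y_2 - y_1}{x_2 - x_1} = m$. The slope formula always returns $m$, independent of which two points you picked. This is the algebraic shadow of the geometric fact that the line is perfectly straight.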
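The proof can also be checked numerically. The sketch below uses an arbitrarily chosen line and sample points (both are illustrative assumptions) and exact rational arithmetic from the standard `fractions` module, so no rounding can hide a discrepancy.

```python
from fractions import Fraction
from itertools import combinations

def slope(p, q):
    """Rise over run between two distinct points p = (x1, y1) and q = (x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)

m, b = Fraction(3, 2), Fraction(-4)            # illustrative line y = (3/2)x - 4
xs = [Fraction(v) for v in (-5, 0, 1, 7, 100)]
points = [(x, m * x + b) for x in xs]

# Collect the slope of every pair of points; a set keeps only distinct values.
slopes = {slope(p, q) for p, q in combinations(points, 2)}
print(slopes)   # {Fraction(3, 2)}
```

All ten pairs collapse to a single value, the slope $m$ of the line.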
Interactive Explorer: y = m·x + b
Drag the two coloured points along the line. The dashed amber and red legs are the sides of the slope triangle, $\Delta x$ and $\Delta y$. Watch the readout in the side panel: $\Delta y / \Delta x$ never moves from $m$, no matter where you drag.
- Make $\Delta x$ huge, then tiny: the ratio does not budge.
- Drag one point onto the other: the formula goes to $0/0$ (undefined). To talk about slope you must always pick two different points.
- Slide $m$ to $0$: the line is horizontal, "y never changes when x changes".
- Set $m$ negative: the line slopes down ("y decreases as x increases").
Worked Example — Pizza Delivery by Hand
Let's leave the abstract and ground everything in a realistic scenario. You deliver pizzas. Your pay model is simple: a flat daily stipend of $s$ dollars plus \$3 per delivery.
Let $x$ be the number of deliveries on a given day and $y$ the dollars you take home. Try to derive the linear law, slope, and intercept, and predict your earnings, all by hand, before reading the solution below.
Step-by-step solution
Step 1: Tabulate the first few values. With $x = 0$ deliveries you still get the stipend, so $y = s$. With $x = 1$: $y = s + 3$. With $x = 2$: $y = s + 6$. With $x = 3$: $y = s + 9$.
Step 2: Find the slope from any two points. Pick $(0, s)$ and $(1, s + 3)$:
$$m = \frac{(s + 3) - s}{1 - 0} = 3.$$
Pick $(1, s + 3)$ and $(3, s + 9)$ instead:
$$m = \frac{(s + 9) - (s + 3)}{3 - 1} = \frac{6}{2} = 3.$$
Same answer. Pick $(0, s)$ and $(3, s + 9)$:
$$m = \frac{(s + 9) - s}{3 - 0} = \frac{9}{3} = 3.$$
Still $3$. The slope is \$3 per delivery, which matches the wage rule we started with.
Step 3: Find the y-intercept. From the table at $x = 0$: $b = s$. So the linear law is
$$y = 3x + s.$$
Step 4: Predict. If you do $n$ deliveries:
$$y = 3n + s.$$
Step 5: Reverse the question. If you want to take home $T$ dollars, how many deliveries? Solve $3x + s = T$ to get $x = (T - s)/3$.
Step 6: Sanity check the units. $m$ has units of dollars per delivery; $b$ has units of dollars. Adding $mx$ and $b$ only makes sense because $mx$ first multiplies dollars/delivery by deliveries, leaving dollars. Always think in units: it catches many algebra bugs.
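The same steps can be sketched in a few lines of Python. The rate of 3 dollars per delivery comes from the example; the stipend value `STIPEND = 20` is a made-up placeholder chosen only so the code has a number to work with, and the function names are hypothetical.

```python
RATE = 3        # dollars per delivery (from the example)
STIPEND = 20    # hypothetical flat daily stipend, for illustration only

def pay(deliveries):
    """Take-home pay in dollars for a given number of deliveries."""
    return RATE * deliveries + STIPEND

def deliveries_needed(target):
    """Invert the linear law: deliveries required to take home `target` dollars."""
    return (target - STIPEND) / RATE

table = [(x, pay(x)) for x in range(4)]
print(table)                         # [(0, 20), (1, 23), (2, 26), (3, 29)]
print((pay(3) - pay(1)) / (3 - 1))   # 3.0, the slope from two table rows
print(deliveries_needed(pay(12)))    # 12.0, the inverse undoes the forward law
```

Changing `STIPEND` shifts every row of the table by the same amount and leaves the slope untouched.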
The y-Intercept b: Where the Line Starts
The constant $b$ in $y = mx + b$ answers one simple question: what is y when x is zero? Plug it in: $y = m \cdot 0 + b = b$.
Geometrically, $b$ is the height at which the line crosses the vertical axis. The interactive explorer above marks it with the green dot. Slide the $b$ slider and the entire line moves up or down without tilting: slope does not depend on intercept.
Real-world meaning of b
In the pizza example $b$ is the base stipend: what you earn for showing up before delivering anything. In physics it is often a starting position. In economics it is fixed cost. Whenever you see $y = mx + b$, ask: what does "zero input" mean here, and is the value $b$ reasonable at that moment?
Three Ways to Write the Same Line
The same straight line can be described in different algebraic outfits. Knowing all three lets you start from whatever information you happen to have.
| Form | Formula | Best used when you know… |
|---|---|---|
| Slope–intercept | y = m·x + b | the slope m and the y-intercept b |
| Point–slope | y − y₁ = m·(x − x₁) | the slope m and one point (x₁, y₁) on the line |
| Two-point | y − y₁ = ( (y₂ − y₁)/(x₂ − x₁) )·(x − x₁) | two points (x₁, y₁), (x₂, y₂) on the line |
All three are algebraically equivalent. The point–slope form is especially important in calculus because it is the form of the tangent line: once you know the slope at a point (the derivative) and the point itself, point–slope is how you write the approximation.
Compare this with the slope formula $m = \frac{y - y_1}{x - x_1}$: point–slope is the same equation with both sides multiplied by $(x - x_1)$. The two are rearrangements of one another. Never memorise both: memorise $m = \Delta y / \Delta x$ and derive the rest.
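As a sketch of how the three forms interconvert, the code below starts from two points, derives the slope–intercept coefficients, and checks that the point–slope form through the other point describes the same line. The function names and the sample points are illustrative assumptions.

```python
from fractions import Fraction

def from_two_points(p, q):
    """Two points with distinct x-values -> (m, b) of slope-intercept form."""
    (x1, y1), (x2, y2) = p, q
    m = Fraction(y2 - y1, x2 - x1)
    return m, y1 - m * x1            # b comes from solving y1 = m*x1 + b

def point_slope(m, p):
    """Return the function x -> y1 + m*(x - x1)."""
    x1, y1 = p
    return lambda x: y1 + m * (x - x1)

p, q = (1, 2), (4, 11)               # illustrative points
m, b = from_two_points(p, q)
line_a = lambda x: m * x + b         # slope-intercept form
line_b = point_slope(m, q)           # point-slope form through the other point

print(m, b)                                               # 3 -1
print(all(line_a(x) == line_b(x) for x in range(-3, 4)))  # True
```

Whichever point is used as the anchor in point–slope form, the resulting function is the same.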
The Staircase: Watching Slope Step by Step
Here is a way to feel the slope in your body. Imagine walking along the line one unit at a time on the x-axis. Each step rises by exactly $m$ on the y-axis. Press Play and watch the climber zigzag right and up, right and up, always the same horizontal foot and the same vertical rise.
If you flip the slope slider to $m = 0$ the rises disappear and the climber walks on flat ground. Flip to $m = 2$ and each step jumps up two. Flip to $m = -1$ and the climber descends one unit per step. The staircase is the slope made physical.
Average Rate of Change for Any Function
We have been very strict so far: the slope formula belongs to straight lines. But the formula is so good that we use it on curves too — we just rename it.
For any function $f$ and any two distinct points $a$ and $b$ in its domain, the average rate of change of $f$ on the interval $[a, b]$ is
$$\frac{f(b) - f(a)}{b - a}.$$
Geometrically this is the slope of the straight line, called a secant, drawn through the two graph points $(a, f(a))$ and $(b, f(b))$.
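A minimal sketch of the definition, with an illustrative line, curve, and set of intervals (all assumptions chosen for the demonstration). The same function is applied to both, and only the line returns the same value everywhere.

```python
def average_rate(f, a, b):
    """Secant slope of f between x = a and x = b (requires a != b)."""
    return (f(b) - f(a)) / (b - a)

line = lambda x: 2 * x + 1      # illustrative linear function
curve = lambda x: x ** 2        # illustrative curve

intervals = [(0, 1), (1, 3), (-2, 5)]
print([average_rate(line, a, b) for a, b in intervals])    # [2.0, 2.0, 2.0]
print([average_rate(curve, a, b) for a, b in intervals])   # [1.0, 4.0, 3.0]
```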
The deep observation
For a linear function this average is the same on every interval, and we just call it the slope $m$. For a curve, the average changes depending on where you measure. Calculus is fundamentally the study of how that average behaves as you let the two endpoints crash into each other.
An analogy you can drive home
Average rate of change is exactly the same idea as your average speed on a road trip. If you drive 240 km in 4 hours, your average speed is $240 / 4 = 60$ km/h. But your speedometer almost certainly was not pinned at 60 km/h the whole time: at any instant your instantaneous speed was probably higher or lower. The speedometer reads what calculus will eventually call the derivative; the average from trip start to trip end is the secant slope.
Secant → Tangent: A Preview of the Derivative
Below is the central animation of the entire book. A curve $y = f(x)$, an anchor point $(a, f(a))$, and a movable second point $(a + h, f(a + h))$. The orange line through them is the secant; the dashed green line is the tangent at $a$.
Press Shrink h → 0. Watch the orange secant rotate until it nearly overlaps the green tangent. Read the side panel: the average rate approaches $f'(a)$.
This is the only idea in differential calculus, stripped to its skeleton:
$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$
Read it slowly. $f'(a)$ is the instantaneous rate of change of $f$ at $a$. It equals the limit of the average rate of change, as the interval shrinks to a single point. We will spend the whole of Chapter 2 explaining what $\lim_{h \to 0}$ means rigorously, but you already have the picture: rotate the secant until both endpoints merge.
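As a worked instance of the limit, take $f(x) = x^2$ at $a = 1$ (an illustrative choice). Expanding the difference quotient shows where the limiting value comes from:

```latex
\begin{aligned}
\frac{f(1+h) - f(1)}{h}
  &= \frac{(1+h)^2 - 1}{h}
   = \frac{2h + h^2}{h}
   = 2 + h \qquad (h \neq 0), \\
f'(1) &= \lim_{h \to 0} (2 + h) = 2 .
\end{aligned}
```

The cancellation of $h$ is only valid for $h \neq 0$, which is exactly why the definition uses a limit rather than plugging in $h = 0$.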
Why does linear come first?
Because the tangent line itself is a linear function. The tangent at $a$ has slope $f'(a)$ and passes through $(a, f(a))$, so its equation is
$$y - f(a) = f'(a)\,(x - a).$$
That is point–slope form. Every derivative you ever compute will produce a number that is the slope of a particular line. Learning lines deeply now pays back forever.
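A short sketch of the tangent line in point–slope form, using $f(x) = x^2$ at $a = 1$, where the slope is 2. The helper name `tangent_line` is a hypothetical choice. The printed gap between curve and tangent shrinks quickly as $x$ approaches $a$; for this particular $f$ it equals $(x - a)^2$ up to rounding.

```python
def tangent_line(f_a, slope_a, a):
    """Point-slope form: x -> f(a) + f'(a) * (x - a)."""
    return lambda x: f_a + slope_a * (x - a)

f = lambda x: x ** 2
a = 1.0
tangent = tangent_line(f(a), 2.0, a)   # slope 2 is f'(1) for f(x) = x^2

for dx in (0.5, 0.1, 0.01):
    x = a + dx
    gap = f(x) - tangent(x)            # equals dx**2 for this f, up to rounding
    print(f"x = {x:<5} f(x) = {f(x):.6f}  tangent = {tangent(x):.6f}  gap = {gap:.6f}")
```

Halving the distance to $a$ cuts the gap by roughly a factor of four, which is the sense in which the line approximates the curve locally.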
Python: Computing Rates of Change
Let's pin down everything we just said in code. First we verify that for a linear function the slope is the same for any two points (so our formula really is well-defined). Then we look at the famously non-linear function $f(x) = x^2$ at $x = 1$ and watch the discrete difference quotient converge to $f'(1) = 2$.
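A minimal sketch of such a script follows. The wage law `wage(hours) = 25 * hours + 10` is inferred from the numbers in the expected output below, and the function names are hypothetical; exact column spacing may differ from that listing.

```python
def wage(hours):
    """Linear wage law inferred from the expected output: 25 per hour plus 10."""
    return 25.0 * hours + 10.0

def difference_quotient(f, a, b):
    """Δy/Δx between x = a and x = b (requires a != b)."""
    return (f(b) - f(a)) / (b - a)

# 1) Linear function: the slope is the same for every pair of points.
for a, b in [(0, 1), (2, 7), (3.5, 10.25)]:
    print(f"hours {a} -> {b}  Δy/Δx = ({wage(b):6.2f} - {wage(a):6.2f}) / "
          f"({b} - {a}) = {difference_quotient(wage, a, b):.4f}")

# 2) Curve: average rate of f(x) = x^2 on [1, 1 + h] as h shrinks.
square = lambda x: x * x
print("Average rate of f(x) = x^2 on [1, 1 + h]:")
print(f"{'h':>8} {'avg rate':>10} {'distance to 2':>14}")
rates = []
for h in (1.0, 0.5, 0.1, 0.01, 0.001, 1e-05):
    rate = difference_quotient(square, 1.0, 1.0 + h)
    rates.append(rate)
    print(f"{h:>8} {rate:>10.6f} {abs(rate - 2.0):>14.6f}")
```

The same `difference_quotient` function serves both parts; only the behaviour of its output differs between the line and the curve.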
Expected output
hours 0 -> 1 Δy/Δx = ( 35.00 - 10.00) / (1 - 0) = 25.0000
hours 2 -> 7 Δy/Δx = (185.00 - 60.00) / (7 - 2) = 25.0000
hours 3.5 -> 10.25 Δy/Δx = (266.25 - 97.50) / (10.25 - 3.5) = 25.0000
Average rate of f(x) = x^2 on [1, 1 + h]:
h avg rate distance to 2
1.0 3.000000 1.000000
0.5 2.500000 0.500000
0.1 2.100000 0.100000
0.01 2.010000 0.010000
0.001 2.001000 0.001000
1e-05 2.000010 0.000010
Notice the structure. The wage example has a flat slope of 25, period. The curve example has an average rate of $2 + h$, and the right-hand column shows its distance to 2 shrinking toward zero. Every time you halve $h$, the distance to 2 halves with it. That is what "the limit equals 2" means in plain Python.
PyTorch: Vectorising the Rate of Change
Python for-loops are fine for an interactive demonstration but slow once we want to apply the same formula to millions of points or differentiate a neural network. PyTorch does two things for us:
- Vectorisation: compute the difference quotient at many $h$ values in parallel on the CPU or GPU.
- Autograd: hand us the derivative exactly up to floating-point rounding, with no finite-difference truncation error, which is perfect for cross-checking that the limit really is what we claim.
Expected output
h : [1.0, 0.5, 0.1, 0.01, 0.001, 1e-05]
avg rate : [3.0, 2.5, 2.1, 2.01, 2.001, 2.00001]
f'(1) by autograd = 2.0
The two answers agree. The numerical limit and the autograd derivative land on the same number. This is the recipe we will reuse in every later chapter: derive a formula on paper, then sanity-check it with a short PyTorch experiment.
Why This Matters — Applications
🚗 Physics — uniform motion
For an object moving at constant velocity $v$, position is $x(t) = x_0 + vt$. The slope $v$ is the velocity; the intercept $x_0$ is the starting position.
💰 Economics — fixed and marginal cost
Total cost $C(x) = mx + b$ for $x$ units. The slope $m$ is the marginal cost per unit; the intercept $b$ is fixed overhead.
🌡️ Engineering — calibration curves
A thermocouple's voltage $V$ is related to temperature $T$ via $V = mT + b$. A linear fit gives the sensor's sensitivity $m$ and offset $b$.
🤖 Machine learning — linear regression
The simplest predictive model is $\hat{y} = wx + b$. Training fits the slope (weight) $w$ and intercept (bias) $b$ to data. Every neuron in a deep network is one of these followed by a non-linear squashing.
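To show what "fitting the slope and intercept to data" can look like, here is a sketch that uses the closed-form ordinary least-squares formulas for a single input variable. This is a swapped-in technique for illustration: practical training usually relies on iterative optimisation instead. The function name `fit_line` and the data are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y ≈ w*x + b for one input variable."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)   # must be non-zero: xs not all equal
    w = sxy / sxx                 # slope (weight)
    b = y_mean - w * x_mean       # intercept (bias)
    return w, b

# Made-up data lying exactly on y = 2x + 1, so the fit should recover w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)   # 2.0 1.0
```

With noisy data the points no longer sit on one line, and the fit returns the line that minimises the total squared vertical error.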
The big arc
Linear models are everywhere in applied science, and even when the real-world law is non-linear we typically linearise it by approximating the curve with its tangent. That single move, replacing a hard curve locally with an easy line, is the unifying technique of physics, engineering and modern machine learning. The derivative will hand us those tangent lines, but the structure they live in is exactly the $y = mx + b$ we are studying today.
Common Pitfalls
Sign errors from inconsistent subtraction order
The slope formula is $\frac{y_2 - y_1}{x_2 - x_1}$. If you compute the numerator as $y_2 - y_1$ but the denominator as $x_1 - x_2$, you flip the sign. Always keep the same order top and bottom.
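The pitfall is easy to reproduce. In the sketch below, `slope_mixed_order` is a deliberately buggy helper written for illustration; the sample points are arbitrary.

```python
def slope(p, q):
    """Consistent order: (y2 - y1) over (x2 - x1)."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)

def slope_mixed_order(p, q):
    """Deliberately buggy: numerator uses y2 - y1, denominator uses x1 - x2."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x1 - x2)

p, q = (1, 2), (3, 8)                 # illustrative points
print(slope(p, q), slope(q, p))       # 3.0 3.0   swapping both points is harmless
print(slope_mixed_order(p, q))        # -3.0      mixing the order flips the sign
```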
Vertical lines have no slope
A vertical line like $x = c$ is not a function (one input cannot map to many outputs), and its slope is undefined, not infinite. The denominator $x_2 - x_1 = 0$ kills the formula.
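In code, the zero denominator has to be handled explicitly. One possible convention, sketched here with a hypothetical helper `safe_slope`, is to return `None` whenever the slope is undefined rather than letting the division raise an error.

```python
def safe_slope(p, q):
    """Return the slope through p and q, or None when it is undefined."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None        # vertical line, or the same point twice: no slope
    return (y2 - y1) / (x2 - x1)

print(safe_slope((2, 1), (2, 9)))   # None   vertical line x = 2
print(safe_slope((2, 1), (2, 1)))   # None   same point twice, 0/0
print(safe_slope((0, 4), (5, 4)))   # 0.0    horizontal line
```

Returning `None` keeps "undefined" distinct from "zero", which is the slope of a horizontal line.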
Confusing slope with the function value
The slope $m$ and the output $f(x)$ are different beasts. The slope measures how fast y changes, not how big y is. A line that sits high but is flat has small slope; a line near the origin that dives steeply has big slope.
Reading the difference quotient out loud
When you see $\frac{f(a + h) - f(a)}{h}$, train yourself to read it as "the change in $f$, divided by the change in $x$, on a tiny interval starting at $a$". The Greek $\Delta x$ and the Roman $h$ are the same idea.
Summary
Linear functions are the simplest functions whose rate of change is meaningful — and the rate is constant. This single property forces the graph to be a straight line, makes the slope formula return the same answer for every pair of points, and seeds every later idea in calculus.
| Concept | Formula | One-sentence meaning |
|---|---|---|
| Linear function | f(x) = m·x + b | First-power formula whose graph is a straight line. |
| Slope | m = (y₂ − y₁)/(x₂ − x₁) | Constant rise per unit run. |
| y-intercept | b = f(0) | Where the line crosses the y-axis. |
| Point–slope form | y − y₁ = m·(x − x₁) | Line written from one point and a slope. |
| Average rate of change | (f(b) − f(a))/(b − a) | Slope of the secant through two points of any graph. |
| Derivative (preview) | f'(a) = lim_{h→0} (f(a+h) − f(a))/h | Instantaneous rate, the limit of the average rate as the interval shrinks. |
Key Takeaways
- A function is linear iff its rate of change is the same on every interval.
- The slope formula works on any two points of a line and returns the same answer.
- For a curve, that same formula gives an average rate of change — the secant slope.
- As the two endpoints collapse to one, the secant rotates onto the tangent line. Its slope is the derivative.
- The tangent itself is a linear function, written in point–slope form $y - f(a) = f'(a)\,(x - a)$.
- Both plain Python and PyTorch make this concrete: tabulate the difference quotient, watch it converge, double-check with autograd.
Coming next: Section 1.3 zooms out from straight lines to polynomial functions, where the rate-of-change idea finally has something to chew on: the slope of $x^2$, $x^3$, and their combinations will be different at every point, foreshadowing Chapter 4's derivative rules.