Learning Objectives
By the end of this section, you will be able to:
- Apply the Power Rule to differentiate any function of the form $x^n$
- Use the Constant Rule to recognize that derivatives of constants are zero
- Apply the Constant Multiple Rule to pull constants out of derivatives
- Combine the Sum and Difference Rules to differentiate polynomial functions
- Extend the Power Rule to negative and fractional exponents
- Connect these rules to gradient computation in machine learning
The Big Picture: From Definition to Efficiency
"The derivative rules are shortcuts — they replace tedious limit calculations with simple algebraic operations."
In the previous sections, we learned that the derivative is defined as a limit: $f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$. While this definition is fundamental, computing derivatives using limits every time would be incredibly tedious. Imagine having to expand $(x+h)^{10}$ just to find the derivative of $x^{10}$!
The derivative rules we learn in this section are powerful shortcuts derived from the limit definition. Once proven, they allow us to differentiate most functions instantly, without ever writing a limit.
Why These Rules Matter
These three rules — Power, Sum, and Constant — are the building blocks for differentiating all polynomial functions. Combined with rules for products, quotients, and compositions (which we'll learn later), they let us differentiate virtually any function we encounter.
The rules we'll learn:
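- Constant Rule: $\frac{d}{dx}(c) = 0$
- Power Rule: $\frac{d}{dx}(x^n) = nx^{n-1}$
- Sum Rule: $\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x)$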
Historical Context: Newton and Leibniz
Both Isaac Newton (1643–1727) and Gottfried Wilhelm Leibniz (1646–1716) independently discovered calculus in the late 17th century. They both recognized that certain patterns emerged when differentiating polynomial functions:
- The derivative of $x^2$ is $2x$
- The derivative of $x^3$ is $3x^2$
- The derivative of $x^4$ is $4x^3$
The pattern was clear: the exponent comes down as a coefficient, and the exponent decreases by one. This became the Power Rule, one of the most frequently used rules in all of calculus.
The Notation We Use
Leibniz introduced the notation $\frac{dy}{dx}$ for derivatives. This notation emphasizes that differentiation is an operation we perform with respect to a variable. Newton used a dot notation, $\dot{y}$ (still used in physics for time derivatives). Both notations survive today, each with its advantages.
The Constant Rule
The simplest derivative rule states that the derivative of any constant is zero:
The Constant Rule
$$\frac{d}{dx}(c) = 0$$
where $c$ is any constant
Why Does This Make Sense?
The derivative measures the rate of change. A constant, by definition, doesn't change. If $f(x) = 5$, then no matter what $x$ is, the output is always 5. The function is perfectly flat: its slope is zero everywhere.
Proof Using the Limit Definition
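Let $f(x) = c$ where $c$ is a constant. Then
$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h \to 0} \frac{c - c}{h} = \lim_{h \to 0} 0 = 0$$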
| Function | Derivative | Explanation |
|---|---|---|
| f(x) = 7 | f'(x) = 0 | The number 7 never changes |
| f(x) = π | f'(x) = 0 | π is a constant (≈ 3.14159...) |
| f(x) = -100 | f'(x) = 0 | Negative constants also don't change |
The Power Rule
The Power Rule is perhaps the most frequently used differentiation formula. It tells us how to differentiate any power of $x$:
The Power Rule
$$\frac{d}{dx}(x^n) = nx^{n-1}$$
for any real number $n$
In words: bring down the exponent as a coefficient, then reduce the exponent by 1.
Deriving the Power Rule
Let's prove the Power Rule for positive integers using the limit definition and the Binomial Theorem:
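Goal: Show that $\frac{d}{dx}(x^n) = nx^{n-1}$
Proof: Let $f(x) = x^n$.
By the Binomial Theorem:
$$(x+h)^n = x^n + nx^{n-1}h + \binom{n}{2}x^{n-2}h^2 + \cdots + h^n$$
Substituting:
$$f'(x) = \lim_{h \to 0} \frac{(x+h)^n - x^n}{h} = \lim_{h \to 0} \left[ nx^{n-1} + \binom{n}{2}x^{n-2}h + \cdots + h^{n-1} \right]$$
All terms with $h$ vanish as $h \to 0$, leaving only $nx^{n-1}$.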
| f(x) | f'(x) | Pattern |
|---|---|---|
| x¹ | 1 | 1·x⁰ = 1 |
| x² | 2x | 2·x¹ = 2x |
| x³ | 3x² | 3·x² = 3x² |
| x⁴ | 4x³ | 4·x³ = 4x³ |
| x¹⁰ | 10x⁹ | 10·x⁹ = 10x⁹ |
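The rows of this table can be checked numerically. The sketch below is only an illustration: the helper names `power_rule` and `central_difference`, the sample point `x0 = 2.0`, and the step size `h` are all choices made for this example. It compares the Power Rule value with a central-difference slope estimate.

```python
def power_rule(n, x):
    """Derivative of x**n at x, according to the Power Rule: n * x**(n - 1)."""
    return n * x ** (n - 1)


def central_difference(f, x, h=1e-6):
    """Numerical slope estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 2.0  # arbitrary sample point chosen for this illustration

# Exponents taken from the table above.
for n in (1, 2, 3, 4, 10):
    exact = power_rule(n, x0)
    approx = central_difference(lambda x: x ** n, x0)
    print(f"n={n:2d}  rule={exact:10.4f}  numeric={approx:10.4f}")
```

The two columns should agree to several decimal places; the small remaining gap comes from the finite step size and floating-point rounding, not from the rule.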
The Constant Multiple Rule
Constants can be "pulled out" of derivatives:
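The Constant Multiple Rule
$$\frac{d}{dx}[c\,f(x)] = c\,f'(x)$$
Proof
$$\frac{d}{dx}[c\,f(x)] = \lim_{h \to 0} \frac{c\,f(x+h) - c\,f(x)}{h} = c \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = c\,f'(x)$$
Example: Find the derivative of $5x^2$.
$$\frac{d}{dx}(5x^2) = 5 \cdot \frac{d}{dx}(x^2) = 5 \cdot 2x = 10x$$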
Sum and Difference Rules
The derivative of a sum is the sum of the derivatives:
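Sum Rule
$$\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x)$$
Difference Rule
$$\frac{d}{dx}[f(x) - g(x)] = f'(x) - g'(x)$$
Proof of Sum Rule
$$\frac{d}{dx}[f(x) + g(x)] = \lim_{h \to 0} \frac{[f(x+h) + g(x+h)] - [f(x) + g(x)]}{h} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} + \lim_{h \to 0} \frac{g(x+h) - g(x)}{h} = f'(x) + g'(x)$$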
Key Insight
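Together, the Sum Rule and the Constant Multiple Rule mean that differentiation is a linear operator: $\frac{d}{dx}[a\,f(x) + b\,g(x)] = a\,f'(x) + b\,g'(x)$ for any constants $a$ and $b$. This property is fundamental in advanced mathematics and allows us to differentiate complex expressions term by term.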
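Linearity can be illustrated numerically. In the sketch below, the functions `f` and `g`, the constants `a` and `b`, the sample point `x0`, and the helper `slope` are all arbitrary choices for this example. It differentiates a combination directly, then differentiates the pieces separately, and compares both with the value the rules predict.

```python
def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    return x ** 3


def g(x):
    return x ** 2


a, b = 4.0, -5.0  # arbitrary constants
x0 = 1.5          # arbitrary sample point

# Differentiate the combination a*f + b*g directly.
combined = slope(lambda x: a * f(x) + b * g(x), x0)

# Differentiate f and g separately, then combine the results.
separate = a * slope(f, x0) + b * slope(g, x0)

# Value predicted by the rules: a * 3x^2 + b * 2x
exact = a * 3 * x0 ** 2 + b * 2 * x0

print(combined, separate, exact)
```

All three printed numbers should be close to 12, up to the numerical error of the slope estimate.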
Combining the Rules: Polynomial Differentiation
With the Power Rule, Constant Multiple Rule, and Sum Rule, we can differentiate any polynomial:
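Example 1: Find $\frac{d}{dx}(3x^4 - 2x^2 + 7)$
$$\frac{d}{dx}(3x^4 - 2x^2 + 7) = 3 \cdot 4x^3 - 2 \cdot 2x + 0 = 12x^3 - 4x$$
Example 2: Find $\frac{d}{dx}(x^5 + 4x^3 - 6x + 9)$
$$\frac{d}{dx}(x^5 + 4x^3 - 6x + 9) = 5x^4 + 4 \cdot 3x^2 - 6 + 0 = 5x^4 + 12x^2 - 6$$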
Negative and Fractional Exponents
The Power Rule works for all real exponents, not just positive integers, at every point where both $x^n$ and $x^{n-1}$ are defined. This greatly extends its usefulness.
Negative Exponents
Recall that $x^{-n} = \frac{1}{x^n}$. The Power Rule still applies:
| Function | Rewrite | Derivative |
|---|---|---|
| 1/x | x⁻¹ | -1·x⁻² = -1/x² |
| 1/x² | x⁻² | -2·x⁻³ = -2/x³ |
| 1/x³ | x⁻³ | -3·x⁻⁴ = -3/x⁴ |
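Example: Find $\frac{d}{dx}\left(\frac{4}{x^3}\right)$
$$\frac{d}{dx}\left(\frac{4}{x^3}\right) = \frac{d}{dx}(4x^{-3}) = 4 \cdot (-3)x^{-4} = -\frac{12}{x^4}$$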
Fractional Exponents (Roots)
Roots can be written as fractional exponents: $\sqrt{x} = x^{1/2}$, $\sqrt[3]{x} = x^{1/3}$, etc.
| Function | Rewrite | Derivative |
|---|---|---|
| √x | x^(1/2) | (1/2)x^(-1/2) = 1/(2√x) |
| ∛x | x^(1/3) | (1/3)x^(-2/3) |
| x^(3/2) | x^(3/2) | (3/2)x^(1/2) = (3/2)√x |
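Example: Find $\frac{d}{dx}(6\sqrt{x})$
$$\frac{d}{dx}(6\sqrt{x}) = \frac{d}{dx}(6x^{1/2}) = 6 \cdot \tfrac{1}{2}x^{-1/2} = \frac{3}{\sqrt{x}}$$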
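The entries in the two tables above can be spot-checked numerically. This is a sketch under stated assumptions: the helper names, the sample point `x0 = 4.0`, and the step size are choices for this illustration, and `x0` is kept positive so that every fractional power is a real number.

```python
import math


def power_rule(n, x):
    """Power Rule value n * x**(n - 1); keep x positive for non-integer n."""
    return n * x ** (n - 1)


def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 4.0  # positive sample point chosen for this illustration

# Function name -> exponent after rewriting as a power of x.
cases = {
    "1/x": -1,
    "1/x^2": -2,
    "sqrt(x)": 1 / 2,
    "cbrt(x)": 1 / 3,
    "x^(3/2)": 3 / 2,
}

for name, n in cases.items():
    exact = power_rule(n, x0)
    approx = slope(lambda x: x ** n, x0)
    agree = math.isclose(exact, approx, rel_tol=1e-6)
    print(f"{name:8s} rule={exact: .6f}  numeric={approx: .6f}  agree={agree}")
```

For example, at $x = 4$ the rule gives $\frac{1}{2\sqrt{4}} = 0.25$ for $\sqrt{x}$ and $-\frac{1}{16} = -0.0625$ for $\frac{1}{x}$.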
Real-World Applications
Physics: Motion
If position is given by a polynomial function of time, velocity and acceleration are found by differentiation:
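Problem: A ball is thrown vertically. Its height (in meters) after $t$ seconds is $h(t) = -4.9t^2 + v_0 t + h_0$, where $v_0$ is the initial velocity and $h_0$ is the initial height.
Find the velocity and acceleration functions.
Solution:
- Velocity: $v(t) = h'(t) = -9.8t + v_0$ m/s
- Acceleration: $a(t) = v'(t) = -9.8$ m/s² (constant, due to gravity)
At $t = v_0 / 9.8$ seconds: $v(t) = 0$. This is when the ball reaches its maximum height.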
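To make the motion example concrete, the sketch below assumes a height function of the form $-4.9t^2 + v_0 t + h_0$ with hypothetical launch values (`V0 = 19.6` m/s and `H0 = 2.0` m). These numbers are invented for illustration and are not taken from the problem statement.

```python
# Hypothetical launch values, chosen only for illustration.
V0 = 19.6      # initial velocity in m/s (assumed)
H0 = 2.0       # initial height in m (assumed)
HALF_G = 4.9   # half of g = 9.8 m/s^2


def height(t):
    """Height in meters after t seconds."""
    return -HALF_G * t ** 2 + V0 * t + H0


def velocity(t):
    """Derivative of height: Power Rule on t^2, Constant Rule on H0."""
    return -2 * HALF_G * t + V0


def acceleration(t):
    """Derivative of velocity: a constant, independent of t."""
    return -2 * HALF_G


# The ball peaks when the velocity is zero.
t_peak = V0 / (2 * HALF_G)

print("time of peak (s):", t_peak)
print("velocity at peak (m/s):", velocity(t_peak))
print("maximum height (m):", height(t_peak))
```

With these assumed values the velocity reaches zero at 2 seconds, where the height is about 21.6 meters.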
Economics: Marginal Analysis
In economics, the derivative represents "marginal" quantities — the rate of change of one quantity with respect to another:
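Problem: A company's cost to produce $x$ units is $C(x)$ dollars.
Find the marginal cost when producing 30 units.
Solution: The marginal cost is the derivative $C'(x)$. Evaluating it at $x = 30$ gives $C'(30) = 8$ dollars per unit.
Producing the 31st unit adds approximately 8 dollars to the total cost.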
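As an illustration of why marginal cost approximates the cost of one more unit, the sketch below assumes a hypothetical quadratic cost function, $C(x) = 0.1x^2 + 2x + 500$. It is invented for this example and chosen so that the marginal cost at 30 units comes out to 8 dollars.

```python
def cost(x):
    """Hypothetical total cost in dollars for x units (assumed for illustration)."""
    return 0.1 * x ** 2 + 2 * x + 500


def marginal_cost(x):
    """Derivative of cost: Power Rule on each term, Constant Rule on 500."""
    return 0.2 * x + 2


# Instantaneous rate of change at 30 units.
print("marginal cost at 30 units:", marginal_cost(30))

# Actual extra cost of producing unit 31.
print("cost(31) - cost(30):", cost(31) - cost(30))
```

Under this assumed cost function the derivative is 8 dollars per unit while the actual increase from 30 to 31 units is about 8.10 dollars, which shows that the marginal cost is an approximation rather than an exact difference.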
Biology: Population Growth
Population models often involve derivatives to understand growth rates:
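Problem: A bacterial colony population after $t$ hours is modeled by $P(t)$.
Find the growth rate at t = 5 hours.
Solution: The growth rate is the derivative of the population, so the answer is $P'(5)$, measured in bacteria per hour.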
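The sketch below shows one way to see the growth rate as a limit of average rates. The model $P(t) = 1000 + 50t + 3t^2$ is a hypothetical polynomial chosen for illustration, not the one from the problem; its derivative by the rules above is $P'(t) = 50 + 6t$.

```python
def population(t):
    """Hypothetical population model (assumed for illustration)."""
    return 1000 + 50 * t + 3 * t ** 2


def growth_rate(t):
    """Derivative of the model: Constant, Constant Multiple, and Power Rules."""
    return 50 + 6 * t


def average_rate(t, h):
    """Average growth rate over the interval [t, t + h]."""
    return (population(t + h) - population(t)) / h


# Average rates over shrinking intervals approach the derivative at t = 5.
for h in (1.0, 0.1, 0.01):
    print(f"h={h:<5} average rate = {average_rate(5, h):.4f}")

print("derivative at t = 5:", growth_rate(5))
```

For this assumed model the average rates shrink toward 80 bacteria per hour, the value of `growth_rate(5)`.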
Machine Learning Connection
The derivative rules are the foundation of gradient-based optimization, which powers virtually all modern machine learning.
Polynomial Regression
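In polynomial regression, we fit a model of the form:
$$\hat{y} = w_0 + w_1 x + w_2 x^2 + \cdots + w_d x^d$$
To minimize the loss function, we need derivatives with respect to each weight $w_j$. Because $x$ is fixed data, each term $w_j x^j$ is a constant multiple of $w_j$, so the Constant Multiple and Sum Rules give $\frac{\partial \hat{y}}{\partial w_j} = x^j$. This is exactly how each polynomial feature contributes to the gradient.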
Gradient Descent
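The update rule for gradient descent is:
$$w \leftarrow w - \alpha \frac{\partial L}{\partial w}$$
where $\alpha$ is the learning rate. Computing $\frac{\partial L}{\partial w}$ requires the derivative rules. For polynomial models: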
- Power Rule: Differentiates each polynomial feature
- Sum Rule: Combines gradients from multiple terms
- Constant Rule: Handles the bias term and other constants; any term that does not depend on a given weight contributes zero to that weight's gradient
Automatic Differentiation
Modern deep learning frameworks like PyTorch and TensorFlow use automatic differentiation to compute gradients. Under the hood, they apply the same rules we're learning — Power, Sum, Product, Chain — to build a computational graph and compute derivatives efficiently.
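The idea of applying derivative rules mechanically can be sketched with a tiny forward-mode example based on dual numbers. This is an illustration only: the `Dual` class is hypothetical, it supports just the few operations needed here, and it is not how PyTorch or TensorFlow are implemented (those frameworks mainly use reverse-mode differentiation over a recorded graph).

```python
class Dual:
    """A value paired with its derivative (forward-mode sketch)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        # Plain numbers are constants, so their derivative is 0 (Constant Rule).
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum Rule: (f + g)' = f' + g'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __rmul__(self, c):
        # Constant Multiple Rule: (c f)' = c f', where c is a plain number.
        return Dual(c * self.value, c * self.deriv)

    def __pow__(self, n):
        # Power Rule with the Chain Rule factor: (f^n)' = n f^(n-1) f'
        return Dual(self.value ** n, n * self.value ** (n - 1) * self.deriv)


def p(x):
    """Example polynomial 3x^4 - 2x^2 + 7, written with supported operations."""
    return 3 * x ** 4 + (-2) * x ** 2 + 7


x = Dual(2.0, 1.0)  # seed the derivative with dx/dx = 1
y = p(x)
print(y.value, y.deriv)  # prints: 47.0 88.0
```

The derivative 88 matches the hand computation $12x^3 - 4x$ at $x = 2$, even though the code never forms that expression symbolically.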
Python Implementation
Computing Polynomial Derivatives
Let's implement the derivative rules in Python:
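One possible implementation is sketched below. It assumes a polynomial is stored as a list of coefficients in increasing order of power, a representation chosen for this illustration; `differentiate` and `evaluate` are hypothetical helper names.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...], where a_k multiplies x**k.

    Constant Rule: the a0 term disappears.
    Power and Constant Multiple Rules: a_k * x**k becomes k * a_k * x**(k - 1).
    Sum Rule: the terms are handled one at a time.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


p = [7, 0, -2, 0, 3]   # 7 - 2x^2 + 3x^4
dp = differentiate(p)  # coefficients of -4x + 12x^3

print("derivative coefficients:", dp)
print("p'(2) =", evaluate(dp, 2.0))
```

A constant polynomial such as `[5]` differentiates to the empty list, which `evaluate` treats as the zero polynomial.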
Application to Machine Learning
Here's how these rules appear in gradient descent for polynomial regression:
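A minimal sketch follows, assuming a quadratic model, a mean squared error loss, and a small noise-free data set generated from $y = 1 + 2x + 3x^2$. The data, learning rate, step count, and function names are all invented for illustration.

```python
def predict(w, x):
    """Polynomial model: w[0] + w[1]*x + w[2]*x**2 + ..."""
    return sum(w_j * x ** j for j, w_j in enumerate(w))


def gradient(w, xs, ys):
    """Gradient of the mean squared error with respect to each weight."""
    n = len(xs)
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        error = predict(w, x) - y
        for j in range(len(w)):
            # Power Rule on error**2 gives 2 * error; the model's
            # derivative with respect to w[j] is x**j.
            grad[j] += 2 * error * x ** j / n
    return grad


# Hypothetical noise-free data from y = 1 + 2x + 3x^2.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

w = [0.0, 0.0, 0.0]   # initial weights
learning_rate = 0.1   # assumed step size

for _ in range(2000):
    g = gradient(w, xs, ys)
    # Gradient descent update: move each weight against its gradient.
    w = [w_j - learning_rate * g_j for w_j, g_j in zip(w, g)]

print([round(w_j, 4) for w_j in w])
```

With these settings the weights should approach the generating coefficients 1, 2, and 3. On noisy or poorly scaled data, the learning rate and number of steps would need tuning.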
Common Mistakes to Avoid
Mistake 1: Forgetting to reduce the exponent
Wrong: $\frac{d}{dx}(x^3) = 3x^3$
Correct: $\frac{d}{dx}(x^3) = 3x^2$
The Power Rule requires reducing the exponent by 1.
Mistake 2: Treating x like a constant
Wrong: $\frac{d}{dx}(x) = 0$
Correct: $\frac{d}{dx}(x) = 1$
The variable $x$ is not a constant; it is $x^1$, so the Power Rule gives $1 \cdot x^0 = 1$.
Mistake 3: Confusing coefficient and exponent
Wrong: $\frac{d}{dx}(3x^2) = 9x$
Correct: $\frac{d}{dx}(3x^2) = 3 \cdot 2x = 6x$
The 3 is a coefficient (stays), the 2 comes down as a multiplier, and the exponent reduces.
Mistake 4: Forgetting the constant of a linear term
Wrong: $\frac{d}{dx}(5x) = 0$
Correct: $\frac{d}{dx}(5x) = 5$
$5x = 5x^1$, so the derivative is $5 \cdot 1 \cdot x^0 = 5$.
Test Your Understanding
What is the derivative of f(x) = x⁵?
Summary
The derivative rules provide powerful shortcuts for computing derivatives without using the limit definition every time.
The Core Rules
| Rule | Formula | Example |
|---|---|---|
| Constant Rule | d/dx(c) = 0 | d/dx(7) = 0 |
| Power Rule | d/dx(xⁿ) = nxⁿ⁻¹ | d/dx(x³) = 3x² |
| Constant Multiple | d/dx[cf(x)] = cf'(x) | d/dx(5x²) = 10x |
| Sum Rule | d/dx[f + g] = f' + g' | d/dx(x² + x) = 2x + 1 |
| Difference Rule | d/dx[f - g] = f' - g' | d/dx(x³ - x) = 3x² - 1 |
Key Takeaways
- The Power Rule is the workhorse: $\frac{d}{dx}(x^n) = nx^{n-1}$ for any real $n$
- Constants have zero derivative because they don't change
- The Sum Rule allows term-by-term differentiation of polynomials
- The Power Rule works for negative and fractional exponents too
- These rules are building blocks for the Product, Quotient, and Chain Rules
- Machine learning uses these rules constantly in gradient computation
Coming Next: In the next section, we'll learn the Product Rule — how to differentiate products of functions, which is essential when our functions can't be written as simple polynomials.