# Taylor Series

Taylor Series
Field: Algebra
Image Created By: Peng Zhao
Website: Math Images Project

Taylor series and Taylor polynomials allow us to approximate, to whatever accuracy we want, functions that are otherwise difficult to calculate. The image at the right, for example, shows how successive Taylor polynomials come to better approximate the function sin(x). On this page, we will focus on how such approximations can be obtained and how their error can be bounded.

# Basic Description

A Taylor series is a power series representation of an infinitely differentiable function. In other words, many functions, like the trigonometric functions, can be written alternatively as an infinite series of terms.

An nth-degree Taylor polynomial $P_n(x)$ for a function is the sum of the terms of its Taylor series up to and including the term of degree n. As a finite sum, a Taylor polynomial can be computed exactly (no limits needed). Although it will not exactly match the infinite Taylor series or the original function, the approximation generally becomes progressively better as n increases.

In the animation above, Taylor polynomials are compared to the actual function y = sin(x) using the following polynomial expansion:

$\sin(x) \approx P_n(x) = x - {x^3 \over 3!} + {x^5 \over 5!} - {x^7 \over 7!} + \cdots \pm {x^n \over n!}$    (for odd n)

n varies from 0 to 36. As n becomes larger and there are more terms in the Taylor polynomial, the Taylor polynomial comes to "look" more like the original function; in other words, it becomes a progressively better approximation of the function.

Taylor series are important because they allow us to compute functions that cannot be computed directly. While the above Taylor polynomial for the sine function looks complicated and is tedious to evaluate by hand, it is just a sum of terms built from powers and factorials, so the Taylor polynomial can be reduced to the basic operations of addition, subtraction, multiplication, and division. We can obtain an approximation by truncating the infinite Taylor series into a finite-degree Taylor polynomial, which we can evaluate.
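As a rough illustration of this reduction to basic operations, the following sketch evaluates the Taylor polynomial for sin(x) shown above and compares it with the library value. The function name `sin_taylor` and the sample point x = 2 are choices made for this example only.

```python
import math

def sin_taylor(x, n):
    """Evaluate the degree-n Taylor polynomial of sin(x) centered at 0."""
    total = 0.0
    # Only odd powers appear: x, -x^3/3!, x^5/5!, ...
    for k in range(1, n + 1, 2):
        sign = -1.0 if (k // 2) % 2 else 1.0
        total += sign * x**k / math.factorial(k)
    return total

# The error at x = 2 shrinks as the degree n grows.
for n in (1, 3, 5, 7, 9):
    approx = sin_taylor(2.0, n)
    print(n, approx, abs(approx - math.sin(2.0)))
```

Each term uses only multiplication, division, addition, and subtraction; `math.sin` appears solely for comparison.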

Readers may, without knowing it, already be familiar with a particular type of Taylor series. Consider, for instance, an infinite geometric series with first term 1 and common ratio x:

${1 \over {1-x}} = 1 + x + x^2 + x^3 + \cdots$ for $-1 < x < 1$

The left side of the equation is the formula for the sum of the convergent geometric series on the right. The right side is also an infinite power series, so it is the Taylor series for $f (x) = {1 \over {1-x}}$. The More Mathematical Explanation will provide examples of some other Taylor series, as well as the process for deriving them from the original functions.

Using Taylor series, we can approximate infinitely differentiable functions. For example, imagine that we want to approximate the sum of the infinite geometric series with first term 1 and common ratio $x = {1 \over 4}$. Using our knowledge of infinite geometric series, we know that the sum is ${1 \over {1 - {1 \over 4}}} = {4 \over 3} = 1.333 \cdots$. Let's see how the Taylor approximation does:

${P_2 \left({1 \over 4}\right) =} 1 + {1 \over 4} + \left({1 \over 4}\right)^2 = 1.3125$

This second-order Taylor polynomial brings us somewhat close to the value of $4 \over 3$ that we obtained above. Let's observe how adding on another term can improve our estimate:

${P_3 \left({1 \over 4}\right) =} 1 + {1 \over 4} + \left({1 \over 4}\right)^2 + \left({1 \over 4}\right)^3 = 1.328125$

As we expect, this approximation is closer still to the actual value, but not exact. Adding more terms would improve the accuracy further, but so long as the number of terms that we add is finite, the approximation will never be exact.
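The two partial sums above can be checked with a few lines of code. This is only a sketch; the helper name `geometric_partial_sum` is made up for the example.

```python
def geometric_partial_sum(x, n):
    """Return 1 + x + x^2 + ... + x^n."""
    return sum(x**k for k in range(n + 1))

exact = 1 / (1 - 0.25)  # closed-form sum of the infinite series, 4/3
for n in (2, 3, 5, 10):
    p = geometric_partial_sum(0.25, n)
    print(n, p, exact - p)  # the gap shrinks but never reaches 0
```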

Figure 1
The approximation zoomed in 2,000 times.

At this point, you may be wondering what the use of a Taylor series approximation is if, as in the previous example, we don't need an estimate; we already have the exact answer on the left-hand side. Well, we don't always know the exact answer. For instance, a more complicated Taylor series is that of cos(x):

$\cos (x) = 1 - {x^2 \over 2!} + {x^4 \over 4!} - {x^6 \over 6!} + \cdots$ where x is in radians.

In this case, it is easy to select x so that we cannot exactly evaluate the left-hand side of the equation. For such functions, making an approximation can be more valuable. For instance, consider:

$\cos 35^\circ$

First we must convert degrees to radians in order to use the Taylor series:

$\cos 35^\circ = \cos \left({35 \over 180} \pi \right) \approx \cos 0.610865$

Then, substitute into the Taylor series of cosine above:

$\cos (0.610865) \approx 1 - {0.610865^2 \over 2!} + {0.610865^4 \over 4!}$

Here we have written only the 4th-degree Taylor polynomial, but this is enough to illustrate the idea. The right side of the equation can be reduced to the four basic operations, so we can easily calculate its value:

$\cos (0.610865) \approx 0.81922$

We can compare this to the value given by the calculator. The calculator's value, actually, is also an approximation obtained by a similar method, but we can expect it to be accurate for all displayed decimal places.

$\cos 35^\circ = 0.81915 \cdots$

So our approximating value agrees with the "actual" value to three decimal places, which is good accuracy for a basic approximation. As above, better accuracy can be attained by using more terms in the Taylor series.

This result can be observed if we zoom in the graph at the intersection point, as shown in Figure 1. In a large graph, the functions look almost identical at the intersection point, but there is indeed a difference between these two functions, as the graph shows.
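The cos 35° calculation above can be reproduced numerically. The sketch below assumes nothing beyond the 4th-degree polynomial already written out; the variable names are illustrative.

```python
import math

x = math.radians(35)  # 35 degrees converted to radians, about 0.610865
p4 = 1 - x**2 / math.factorial(2) + x**4 / math.factorial(4)

print(round(x, 6))
print(round(p4, 5), round(math.cos(x), 5))  # polynomial value vs. library value
```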

# A More Mathematical Explanation

Note: understanding this explanation requires calculus.

## The general form of a Taylor series

In this subsection, we will derive the general formula for a function's Taylor series. We begin by defining Taylor polynomials as follows:

The Taylor polynomial of degree n for f at a, written $P _n (x)$, is the polynomial that has the same 0th- to nth-order derivatives as function f(x) at point a. In other words, the nth-degree Taylor polynomial must satisfy:
$P _n (a) = f (a)$ (the 0th-order derivative of a function is itself)
$P _n ' (a) = f ' (a)$
$P _n '' (a) = f '' (a)$
$\vdots$
$P _n ^{(n)} (a) = f^{(n)} (a)$
where $P _n ^{(k)} (a)$ is the kth-order derivative of $P _n (x)$.

We define Taylor series as follows:

The Taylor series $T (x)$ is the power series (an "infinite Taylor polynomial") whose derivatives of every order at a are equal to those of $f (x)$.

The following set of images shows some examples of Taylor polynomials, from 0th- to 2nd-degree:

Figure 2a: A 0th-degree Taylor polynomial. Figure 2b: A first-degree Taylor polynomial. Figure 2c: A second-degree Taylor polynomial.

In order to construct a general formula for a Taylor series, we start with what we know: a Taylor series is a power series. Using the definition of power series, we write a general Taylor series for a function f around a as

Eq. 1         $T(x) = a_0 + a_1 (x-a)+ a_2 (x-a)^2 + a_3 (x-a)^3 + \cdots$,

in which $a_0, a_1, a_2, \ldots$ are unknown coefficients. Our goal is to find a more useful expression for these coefficients. By the definition above, we know that the function $f(x)$ and its Taylor series $T(x)$ must have the same derivatives of all orders evaluated at a:

$T(a) = f(a)$,    $T'(a) = f'(a)$,     $T''(a) = f''(a)$,   $T ^{(3)} (a) = f ^{(3)} (a) \cdots$

How might we use this fact to bring us closer to finding the coefficients $a_0, a_1, a_2, \ldots$? Let's start by taking the first few derivatives of Eq. 1:

$T(x) = a_0 + a_1 (x-a) + a_2 (x-a)^2 + a_3 (x-a)^3 + \cdots$
$T'(x) = 1 a_1 + 2 a_2 (x-a) + 3 a_3 (x-a)^2 + 4 a_4 (x-a)^3 + \cdots$
$T''(x) = 2\cdot 1 a_2 + 3 \cdot 2 a_3 (x-a) + 4 \cdot 3 a_4 (x-a)^2 + 5 \cdot 4 a_5 (x-a)^3 + \cdots$
$T^{(3)}(x) = 3 \cdot 2 \cdot 1 a_3 + 4 \cdot 3 \cdot 2 a_4 (x-a) + 5 \cdot 4 \cdot 3 a_5 (x-a)^2 + \cdots$
$T^{(4)}(x) = 4 \cdot 3 \cdot 2 \cdot 1 a_4 + 5 \cdot 4 \cdot 3 \cdot 2 a_5 (x-a) + 6 \cdot 5 \cdot 4 \cdot 3 a_6 (x-a)^2 + \cdots$

The pattern should now be recognizable, and it may be apparent how to solve for $a_k$. When we evaluate any of the above derivatives at x = a, only the constant term remains because all terms with (x - a) go to 0. Note then what happens after k derivatives. We get:

$T ^{(k)} (a) = k! \cdot a_k.$

Since in addition $T ^{(k)} (a) = f^{(k)}(a)$ by definition, we conclude

$f^{(k)}(a) = k! \cdot a_k$,   so   $a_k = {f ^{(k)}(a) \over k!}$.

This formula even holds for k=0, since 0! = 1. Thus it holds for all non-negative integers k. So, using derivatives, we have obtained an expression for all unknown coefficients of the given function f. Substitute this back into Eq. 1 to get an explicit expression of Taylor series:

Eq. 2        $T(x) = f(a)+\frac {f'(a)}{1!} (x-a)+ \frac{f''(a)}{2!} (x-a)^2+\frac{f^{(3)}(a)}{3!}(x-a)^3+ \cdots$

or, in summation notation,

$T(x)=\sum_{k=0} ^ {\infty} \frac {f^{(k)}(a)}{k!} \, (x-a)^{k}$.

This is the standard formula of Taylor series that we will use throughout the rest of this page. In many cases, it is convenient to let a = 0 to get a neater expression:

Eq. 3        $T(x) = f(0)+\frac {f'(0)}{1!} x + \frac{f''(0)}{2!} x^2 + \frac{f^{(3)}(0)}{3!}x^3 + \cdots$

Eq. 3 is called the Maclaurin series after Scottish mathematician Colin Maclaurin, who made extensive use of these series in the 18th century.
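To make Eq. 2 concrete, here is a minimal sketch that sums the first few terms of the formula when the derivative values at a are already known. The function name `taylor_polynomial` and the convention that `derivs_at_a[k]` holds $f^{(k)}(a)$ are assumptions of this example.

```python
import math

def taylor_polynomial(derivs_at_a, a, x):
    """Sum f^(k)(a) / k! * (x - a)^k over the supplied derivative values.

    derivs_at_a[k] is assumed to hold the k-th derivative of f at a,
    starting with k = 0 (the function value itself).
    """
    return sum(d / math.factorial(k) * (x - a)**k
               for k, d in enumerate(derivs_at_a))

# Example: f(x) = e^x at a = 0, where every derivative equals 1 (Eq. 3).
print(taylor_polynomial([1.0] * 10, 0.0, 1.0), math.e)

# Example: f(x) = x^2 at a = 1 has derivatives 1, 2, 2 and then zeros,
# so the degree-2 polynomial reproduces f exactly.
print(taylor_polynomial([1, 2, 2], 1, 3))
```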

## Finding the Taylor series for a specific function

Many Taylor series can be derived using Eq. 2 by substituting in f and a. Here we will demonstrate this process in detail for the natural logarithm function. The process in this section can be repeated for other elementary functions, such as sin(x), cos(x), and $e^x$. Their Taylor series will be discussed in the Other Taylor series section.

The natural log function is:

$f (x) = \ln (x)$

Its derivatives are:

$f'(x)=1/x$,   $f''(x)=-1/x^2$,   $f ^{(3)}(x)=2/x^3, \cdots$ $, f ^{(k)}(x) = {{(-1)^{k-1} \cdot (k-1)!} \over x^k}$

Since this function and its derivatives are undefined at x = 0, we cannot construct a Maclaurin series (Eq. 3) for it. Note that, when choosing a, one should select a value at which the derivatives $f^{(k)}(a)$ not only exist but can also be evaluated. For instance, centering our Taylor series at a = 2 would not be helpful because $f^{(0)}(2) = \ln (2)$ is unknown and, in fact, cannot even be approximated until we have obtained our Taylor series. While it would be possible to write out the Taylor series, it would not be usable.

For the natural log, it makes sense to let a = 1 and compute the derivatives at this point:

$f(1) = \ln 1 = 0$, $f'(1) = {1 \over 1} = 1$, $f''(1) = -{ 1 \over 1^2} = -1$, $f ^{(3)} (1) = {2 \over 1^3} = 2, \cdots$ $f ^{(k)} (1) = {(-1)^{k-1} \cdot (k-1)!}$

Figure 3
Taylor series for natural log

Substitute these derivatives into Eq. 2, and we can get the Taylor series for $\ln (x)$ centered at x = 1:

$\ln (x) = (x-1) - {(x-1)^2 \over 2} + {(x-1)^3 \over 3} - \cdots$

We can avoid the cumbersome $(x - 1)^k$ notation by introducing a new function g(x) = ln(1 + x). Now we can expand our polynomial around x = 0:

$\ln (1 + x) = x - {x^2 \over 2} + {x^3 \over 3} - {x^4 \over 4} + \cdots$

The animation to the right shows this Taylor polynomial with degree n varying from 0 to 25. As we can see, on the left side this polynomial quickly comes to approximate the original function closely. However, the right side exhibits some strange behavior: the polynomial seems to diverge farther away from the function as n grows larger. This tells us that a Taylor series is not always a reliable approximation of the original function. The fact that they have the same derivatives at one point does not guarantee that the Taylor series will be a suitable approximation at all values of x, even for arbitrarily large n. Other factors need to be considered.

Alas, power series like the Taylor series for ln(1 + x) do not necessarily converge for all values of x. The Taylor series for ln(1 + x) diverges when $x > 1$, whereas a valid polynomial approximation requires convergence. Consider an arbitrary term in this series, $\pm {x^n \over n}$. When x > 1, as n increases the denominator grows linearly while the numerator grows exponentially. For arbitrarily large n, exponential growth overrides linear growth, so the terms grow without bound and the series diverges, hence the abnormal behavior on the right side of Figure 3. In this "divergent zone," although we can still write out and evaluate the polynomial for whatever n we like, we cannot expect it to approximate the original function.
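A quick numerical experiment makes the "divergent zone" visible. The sketch below compares partial sums of the series for ln(1 + x) at one point inside the interval of convergence (x = 0.5) and one outside it (x = 3); the function name `log1p_taylor` is an illustrative choice.

```python
import math

def log1p_taylor(x, n):
    """Degree-n Taylor polynomial of ln(1 + x) centered at 0."""
    return sum((-1)**(k + 1) * x**k / k for k in range(1, n + 1))

for n in (5, 10, 20):
    inside = log1p_taylor(0.5, n)   # x = 0.5: error shrinks with n
    outside = log1p_taylor(3.0, n)  # x = 3: error grows with n
    print(n, abs(inside - math.log(1.5)), abs(outside - math.log(4.0)))
```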

Does this make it impossible to approximate ln(1 +x) for x greater than 1? It would seem that this would make our Taylor series useless in many cases. For example, imagine that we want to approximate ln(4):

$\ln (4) = \ln (1 + 3) = 3 - {3^2 \over 2} + {3^3 \over 3} - {3^4 \over 4} + \cdots$

It is clear that this series diverges rapidly, so it cannot give us the value of ln(4), even though ln(4) is well defined. With some clever mathematical footwork, though, we can still find a solution. Instead, we write:

$\ln (4) = \ln (e \cdot {4 \over e}) = \ln (e) + \ln({4 \over e}) \approx 1 + \ln (1.47152) = 1 + \ln (1 + 0.47152)$
$\approx 1 + (0.47152 - {0.47152^2 \over 2} + {0.47152^3 \over 3} - {0.47152^4 \over 4} + \cdots)$

By using the identity log(a · b) = log(a) + log(b), we were able to rewrite the logarithm so that our Taylor series did not diverge. Larger powers of e may be used for larger values of x.
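The rewriting trick for ln(4) can be sketched in code as follows. The sketch assumes the value of e is already available (here taken from `math.e`), and the choice of 15 terms is arbitrary.

```python
import math

def log1p_taylor(x, n):
    """Degree-n Taylor polynomial of ln(1 + x) centered at 0."""
    return sum((-1)**(k + 1) * x**k / k for k in range(1, n + 1))

y = 4 / math.e - 1                # about 0.47152, inside the convergent zone
approx = 1 + log1p_taylor(y, 15)  # ln(4) = ln(e) + ln(1 + y)
print(approx, math.log(4))
```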

Let's review what we have done to find a Taylor series for ln(1 + x). How might this process be generalized to finding other Taylor series?

• We began by choosing a base point at which we could evaluate the derivatives of our function.
• We then figured out what those derivatives would be and found a general expression for the kth derivative of our function at a.
• With this information, we could substitute into Eq. 2 to obtain our Taylor series.
• In this example, we modified this Taylor series by recentering it around 0. This is generally not necessary; many Taylor series can be centered around x = 0 to begin with.
• In using our Taylor series, we had to be attentive to its "divergent zone." This, also, is not always necessary. Other Taylor series, to be introduced in the next section, are absolutely convergent; they converge for all x.

## Other Taylor series

Using the process described above, we can obtain Taylor series for a variety of other functions, such as the following:

$\sin (x) = x - {x^3 \over 3!} + {x^5 \over 5!} - {x^7 \over 7!} + {x^9 \over 9!} - \cdots$ , expanded around the origin. x is in radians.
$\cos (x) = 1 - {x^2 \over 2!} + {x^4 \over 4!} - {x^6 \over 6!} + {x^8 \over 8!} - \cdots$ , expanded around the origin. x is in radians.
$e^x = 1 + x + {x^2 \over 2!} + {x^3 \over 3!} + {x^4 \over 4!} + {x^5 \over 5!} + \cdots$ , expanded around the origin. e is Euler's number.

In comparison with the above example of ln(x), these Taylor series are perhaps more straightforward to derive, even though they look slightly more complicated. Because the derivatives of sine, cosine, and $e^x$ are all defined and easily evaluated at x = 0, we can center their respective Taylor series at 0 from the outset. As noted above, these series are also absolutely convergent; they converge for all x (although a given Taylor polynomial for some finite n may not be accurate, particularly for values of x that are not close to the base point).

Note that the powers of each successive term in the Taylor series for sine and cosine increase by 2, and each term alternates between positive and negative; this makes sense when we consider the nature of successive derivatives of sin(x) and cos(x) at x = 0. The Taylor series for $e^x$ follows from the fact that the derivative of $e^x$ is itself. The series for $e^x$ will be derived in Approximating e. The derivation of the Taylor series for sine and cosine using Eq. 2 is left to the reader.

These days, Taylor series are not usually used directly to approximate the trigonometric functions. They are, however, used in various indirect ways. For instance, we can compute the Taylor series of the function composition $\sin (2x^2)$ by substituting $2x^2$ for x in the Taylor series for sin(x):

$\sin (2x^2) = 2x^2 - {(2x^2)^3 \over 3!} + {(2x^2)^5 \over 5!} - {(2x^2)^7 \over 7!} + \cdots = 2x^2 - {8x^6 \over 3!} + {32x^{10} \over 5!} - {128x^{14} \over 7!} + \cdots$
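The substitution can be checked numerically. In this sketch, `sin_2x2_taylor` is a hypothetical helper that sums the first few nonzero terms of the series above, and x = 0.7 is an arbitrary test point.

```python
import math

def sin_2x2_taylor(x, terms):
    """Sum the first `terms` nonzero terms of the series for sin(2x^2)."""
    u = 2 * x**2  # substitute u = 2x^2 into the series for sin(u)
    return sum((-1)**j * u**(2 * j + 1) / math.factorial(2 * j + 1)
               for j in range(terms))

x = 0.7
print(sin_2x2_taylor(x, 4), math.sin(2 * x**2))
```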

More complicated composition is also possible; for instance, to find the Taylor series for $e^{\sin x}$ one may substitute the whole Taylor series of sin(x) for x in the Taylor series for $e^x$.

Consider another example:

$\lim_{x \rightarrow 0} {\sin(x) \over x}$

It is clear that when x = 0, the quotient in this limit expression is undefined, so one cannot evaluate the limit by evaluating the quotient at 0. One way to evaluate the limit is by using L'Hôpital's rule:

$\lim_{x \rightarrow 0} {\sin(x) \over x} = \lim_{x \rightarrow 0} {(\sin(x))' \over (x)'} = \lim_{x \rightarrow 0} {\cos(x) \over 1} = 1$

Alternatively, one can use Taylor series. Substitute the Taylor series for sin(x) into the quotient:

$\lim_{x \rightarrow 0} {\sin(x) \over x} = \lim_{x \rightarrow 0} {{x - {x^3 \over 3!} + {x^5 \over 5!} - \cdots} \over x} = \lim_{x \rightarrow 0} ({1 - {x^2 \over 3!} + {x^4 \over 5!} - \cdots}) = 1$

We have obtained the same limit.
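The series form of the quotient can also be evaluated near 0 to see the limit emerge. This sketch keeps only the first three terms of the series; the name `sinc_series` is invented for the example.

```python
import math

def sinc_series(x):
    """First three terms of the series for sin(x)/x: 1 - x^2/3! + x^4/5!."""
    return 1 - x**2 / math.factorial(3) + x**4 / math.factorial(5)

for x in (0.5, 0.1, 0.01):
    print(x, sinc_series(x), math.sin(x) / x)

# Unlike sin(x)/x itself, the series is defined at x = 0.
print(sinc_series(0.0))
```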

Taylor series also help us understand the derivatives of these functions. Above it was mentioned that each derivative of $e^x$ is itself. More generally, for any real c, the kth derivative of $e^{cx}$ is given by:

${d^k \over dx^k}e^{cx} = c^k e^{cx}$

If we substitute cx for x in our Taylor series for $e^x$, we get:

$e^{cx} = 1 + cx + {(cx)^2 \over 2!} + {(cx)^3 \over 3!} + {(cx)^4 \over 4!} + \cdots = 1 + cx + {c^2 \over 2!} x^2 + {c^3 \over 3!} x^3 + {c^4 \over 4!} x^4 + \cdots$

Differentiating this, we get:

${d \over dx} e^{cx} = c + c^2 x + {c^3 \over 2!} x^2 + {c^4 \over 3!} x^3 + {c^5 \over 4!} x^4 + \cdots$
$= c(1 + cx + {c^2 \over 2!} x^2 + {c^3 \over 3!} x^3 + {c^4 \over 4!} x^4 + \cdots)$
$=ce^{cx}$

Each differentiation of the Taylor series multiplies $e^{cx}$ by c, as expected.
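The term-by-term differentiation can be mirrored on coefficient lists. The following sketch (with made-up helper names and an arbitrary c = 3) checks that each coefficient of the differentiated series is c times the matching coefficient of the original series.

```python
import math

def exp_cx_coeffs(c, n):
    """Coefficients c^k / k! of the series for e^(cx), for k = 0..n."""
    return [c**k / math.factorial(k) for k in range(n + 1)]

def differentiate(coeffs):
    """Differentiate a power series given as a list of coefficients."""
    return [k * a for k, a in enumerate(coeffs)][1:]

c = 3.0
series = exp_cx_coeffs(c, 8)
derivative = differentiate(series)

# Compare each derivative coefficient with c times the original coefficient.
print(all(math.isclose(d, c * a) for d, a in zip(derivative, series)))
```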

## Error bound of a Taylor series

Throughout this page so far, we have often made reference to the "accuracy" of our Taylor-polynomial approximations. What does it actually mean for a Taylor polynomial to be "accurate"? It would be practical to be able to quantify the closeness of our approximations so that we can know how much we can rely on them, or so that we may add more terms if our approximation is not sufficiently accurate. In other words, we want to understand how much error there might be for a given Taylor approximation so that the approximation is usable.

We should not expect to be able to calculate the exact error. If that were possible, then we would be able to find an exact "approximation" by adding the "error" to our Taylor polynomial. What we can do is bound the error; we can find how accurate our approximation is at worst.

Consider a function $f (x)$ for which we have a Taylor polynomial $P_n (x)$ centered at a. We would like to find a formula to bound our approximation. We define the remainder $R_n (x)$ by:

$R_n (x) = f (x) - P_n (x)$, or
$f (x) = P_n (x) + R_n(x)$

A useful expression for $R_n (x)$ happens to be:

Eq. 4        $R_n (x) = {f^{(n+1)}(c) \over (n+1)!} (x-a)^{n+1}$ for some $c \in [a, x]$.

It is not obvious how Eq. 4 is derived from our definition of remainder; the proof is rather complex and unintuitive.

Recall that we constructed our Taylor polynomial $P_n(x)$ such that $f(x)$ and $P_n(x)$ have the same first n derivatives at a. We defined $R_n(x) = f(x) - P_n(x)$. It must hold that

$R_n(a) = f(a) - P_n(a) = 0$
$R'_n(a) = f'(a) - P'_n(a) = 0$
$R''_n(a) = f''(a) - P''_n(a) = 0$
$\vdots$
$R^{(n)}_n(a) = f^{(n)}(a) - P^{(n)}_n(a) = 0$

Since $P_n(x)$ is an nth-degree polynomial, its (n + 1)th derivative is 0:

$P^{(n+1)}_n(x) = 0$, so $R^{(n+1)}_n(x) = f^{(n+1)}(x)$.

We bound $f^{(n+1)}(x)$ on the interval [a, x]. In particular, we choose M so that

$|f^{(n+1)}(x)| = |R^{(n+1)}_n(x)| \leq M$.

So

$-M \leq R^{(n+1)}_n(x) \leq M$

and

$- \int_a ^x M dx \leq \int_a ^x R^{(n+1)}_n(x) dx \leq \int_a ^x M dx$
$-M(x-a) \leq R^{(n)}_n(x) - R^{(n)}_n(a) \leq M(x-a)$.

As established above, $R^{(n)}_n(a) = 0$, so

$-M(x-a) \leq R^{(n)}_n(x) \leq M(x-a)$.

We can integrate this again:

$-\int_a^x M(x-a) dx \leq \int_a^x R^{(n)}_n(x) dx \leq \int_a^x M(x-a) dx$.

Examine the integral:

$\int_a^x (Mx-Ma)dx = \left[ {Mx^2 \over 2} - Max \right]_a^x = {Mx^2 \over 2} - Max - {Ma^2 \over 2} + Ma^2 = {Mx^2 \over 2} - Max + {Ma^2 \over 2} = {M \over 2}(x^2 - 2ax + a^2) = {M \over 2}(x - a)^2$

So we now have:

$-{M \over 2}(x-a)^2 \leq R^{(n-1)}_n(x) \leq {M \over 2}(x-a)^2$

It might be intuitively evident that, integrating this inequality n - 1 more times, we will obtain the bound that Eq. 4 implies. We will now demonstrate this by induction. We have already established our base case above. We must now assume that the inequality holds for some integer k and demonstrate that it therefore holds for k + 1.

$-{M \over k!}(x-a)^k \leq R^{(n+1-k)}_n(x) \leq {M \over k!}(x-a)^k$

Again, we examine the integral:

$\int_a^x {M \over k!}(x-a)^k dx$

How might one use Eq. 4 to check the accuracy of a Taylor polynomial? Part of the confusion in finding the error of a Taylor polynomial is that Eq. 4 is not meant to be evaluated exactly, since we generally do not know what c is. What we can do is put some bound on $f^{(n+1)}(c)$ so that we know at most how large $|R_n(x)|$ is. (From here on, we will refer to the absolute value of the error, since that is often what is useful to us.)

Figure 4
A comparison between the actual error and the upper error bound computed using Eq. 4 for increasing values of n.
Figure 5
A comparison of the Taylor polynomial with the actual function sin(x) at two x values for successive approximations. Notice that the approximation becomes sufficiently close to 0 for the lower x value much more quickly, although by including enough terms, we can make our approximation as accurate as we would like at either point.

Note that, although the error eventually has a 0 in each decimal place that is displayed, the error never actually reaches 0, so long as n is finite. Also note that $R_n$ in this graphic is the actual error, the difference between the Taylor polynomial and the original function, not the error bound computed by bounding $f^{(k+1)}(c)$.

Modified from KeyCurriculum Taylor series activity on Sketchpad.

Imagine that we are trying to find an error bound for sin(x). All derivatives of sin(x) are one of:

$\pm \sin(x), \pm \cos(x)$.

In other terms,

$-1 \leq f^{(k+1)}(x) \leq 1$ $\forall$ $x,k$.

Then we can say that, for any Taylor polynomial for sin(x) evaluated at any x,

$|R_n(x)| \leq \left |{x^{n+1} \over (n+1)!} \right|$

This is straightforward to evaluate. Since the denominator grows faster than the numerator, it is evident that, as expected, the error becomes smaller for larger n; the factorial growth in the denominator outpaces the exponential growth in the numerator.

In Figure 4, the "flattened" part in the center of the graph is where our approximation is "good". To the naked eye, at least, the error appears to be very close to 0. For any specific x value and Taylor polynomial of degree n, we can calculate the upper bound for the error.

Recall, as well, that we have bounded the absolute value of the (k+1)th derivative as being no greater than 1. In Eq. 4, the c is determined by the Intermediate Value Theorem. We do not always know what c is on a given interval [0, x], but $|f^{(k+1)}(c)|$ is rarely as large as possible. In other words, our error bound is rarely equal to the actual error; it is usually greater, often much greater, than the actual error. For instance,

$P_5(1) - \sin (1) = 0.000196 \cdots$

but

$|R_5 (1)| \leq {1 \over 6!} = 0.00138 \cdots$

As we can see, even at a point where the difference between the Taylor polynomial and the original function could not be distinguished by the naked eye, the actual error is often much smaller than the error bound. This should make us especially confident in using our approximations. In Figure 4, the red curve is almost always less than or equal to the blue curve (with a small exception when n = 1). This is desirable when bounding error: we would like to be certain that the actual error is no greater than our bound.
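The comparison between the actual error and the error bound for sin(x) at x = 1 can be reproduced with a short sketch. The helper names are illustrative, and the bound assumes, as above, that every derivative of sin(x) has absolute value at most 1.

```python
import math

def sin_taylor(x, n):
    """Degree-n Taylor polynomial of sin(x) centered at 0."""
    return sum((-1)**(k // 2) * x**k / math.factorial(k)
               for k in range(1, n + 1, 2))

def sin_error_bound(x, n):
    """Bound |x|^(n+1) / (n+1)! on the error, using |f^(n+1)(c)| <= 1."""
    return abs(x)**(n + 1) / math.factorial(n + 1)

actual = abs(sin_taylor(1.0, 5) - math.sin(1.0))
bound = sin_error_bound(1.0, 5)
print(actual, bound)  # the actual error is well below the bound
```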

Suppose that we would like to make an approximation of $f(x) = e^{2x}$ at x = 0.25. Say we choose to make a 3rd-degree Taylor approximation using the Taylor polynomial centered at 0. The Taylor series is:

$T(x) = 1 + {(2x)^1 \over 1!} + {(2x)^2 \over 2!} + {(2x)^3 \over 3!} + {(2x)^4 \over 4!} + \cdots = 1 + {2x \over 1!} + {4x^2 \over 2!} + {8x^3 \over 3!} + {16x^4 \over 4!} + \cdots$

The error is:

$R_n (x) = {f^{(n+1)}(c) \over (n+1)!} x^{n+1}$

How do we bound the (n + 1)st derivative of f at c? We know that in general,

${d^k \over dx^k}e^{px} = p^k e^{px}$

Evaluating this initially seems to be problematic. In our example, p = 2, and $p^k$ can be calculated easily. But we don't know how large $e^{2x}$ can be on the interval [0, 0.25]; that is why we are making a Taylor polynomial approximation in the first place! However, we just need to recall that we are looking for an error bound, which does not need to be exact. We are certain, at least, that on the interval [0, 0.25], $e^{2c} < 2$ (since $e^{2c} \leq e^{0.5} < \sqrt{4} = 2$), so we will specify:

$R_n (x) = {2^{n+1} e^{2c} \over (n+1)!} x^{n+1} < {2 \cdot 2^{n+1} \over (n+1)!} x^{n+1}$.

We are now prepared to make our approximation. We evaluate the third-degree Taylor polynomial:

$P_3 (0.25) = 1 + {2 \cdot 0.25 \over 1!} + {(2 \cdot 0.25)^2 \over 2!} + {(2 \cdot 0.25)^3 \over 3!} = 1.645833 \cdots$

How accurate is this?

$|R_n(0.25)| < {2 \cdot 2^{n+1} \over (n+1)!} (0.25)^{n+1} = {2 \cdot (2 \cdot 0.25)^{n+1} \over (n+1)!} = {2 \cdot 0.5^{n+1} \over (n+1)!}$
$|R_3(0.25)| < {2 \cdot 0.5^4 \over 4!} \approx 0.00521$

This gives us a good idea of how accurate our approximation is. The actual value of the function is less than 0.00521 away from the third-degree Taylor approximation:

$P_3(0.25) - 0.00521 < f(0.25) < P_3(0.25) + 0.00521$
$1.645833 - 0.00521 < e^{2 \cdot 0.25} < 1.645833 + 0.00521$

Suppose that we desire greater accuracy. Say, specifically, that we would like to know what degree Taylor polynomial would be necessary to have error less than $10^{-4}$. We must find the smallest N for which our error bound satisfies:

${2 \cdot 0.5^{N+1} \over (N+1)!} < {1 \over 10^4}$
$2 \cdot 10^4 < 2^{N+1} (N+1)!$

By substituting in various values of N, we find that the lowest integer for which this inequality holds is N = 5.
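The worked example for $e^{2x}$ at x = 0.25 can be sketched in code, including the search for the smallest suitable degree. The helper names are hypothetical, and the parameter `m` encodes the assumption used above that $e^{2c} < 2$ on [0, 0.25].

```python
import math

def exp2x_taylor(x, n):
    """Degree-n Taylor polynomial of e^(2x) centered at 0."""
    return sum((2 * x)**k / math.factorial(k) for k in range(n + 1))

def exp2x_error_bound(x, n, m=2.0):
    """Error bound from Eq. 4, with m assumed to bound e^(2c) on [0, x]."""
    return m * (2 * x)**(n + 1) / math.factorial(n + 1)

x = 0.25
p3 = exp2x_taylor(x, 3)
print(p3, exp2x_error_bound(x, 3), abs(math.exp(2 * x) - p3))

# Search for the lowest degree whose error bound falls below 10^-4.
n = 0
while exp2x_error_bound(x, n) >= 1e-4:
    n += 1
print(n)
```

The search simply tries N = 0, 1, 2, ... in turn, which mirrors the substitution of various values of N described above.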

# Why It's Interesting

Figure 6
A modern TI calculator

Have you ever wondered how calculators determine square roots, sines, cosines, and exponentials? For instance, if you were to type $\sin{\pi \over 2}$ or $e^2$ into your calculator, how does it determine which value to spit out? The number must be related to our input in some way, but what exactly is the relationship? Does the calculator just read from an index of known values? Is there a more mathematical and precise way for the calculator to evaluate these functions?

The answer to this latter question is yes. There are algorithms that give an approximate value of sine, for example, using only the four basic operations (+, -, x, /)[1]. Before the age of electronic calculators, mathematicians studied these algorithms in order to approximate these functions manually. The Taylor series, named after English mathematician Brook Taylor, is one such way of making these approximations. Basically, Taylor said that there is a way to expand any infinitely differentiable function into a polynomial series about a certain point. The power of the Taylor series is to approximate certain functions that cannot otherwise be calculated.

The calculator's algorithm uses this method to efficiently find a suitable approximation in the form of a polynomial series. Expanding enough terms for several digits of accuracy is easy for a computing device, even though Taylor series may look daunting and tedious to the naked eye. This algorithm is built into the permanent memory (ROM) of electronic calculators, and is triggered when a function like sine or cosine is called[2].

As is shown in the More Mathematical Explanation, Taylor series can be used to derive many interesting and useful series. Some of these series have helped mathematicians to approximate the values of important irrational constants such as $\pi$ and $e$.

### Approximating $\pi$

$\pi$, or the ratio of a circle's circumference to its diameter, is one of the oldest, most important, and most interesting mathematical constants. The earliest documentation of $\pi$ can be traced back to ancient Egypt and Babylon, in which people used empirical values of $\pi$ such as 25/8 = 3.1250, or $(16/9)^2 \approx 3.1605$[3].

Figure 7a
Archimedes' method to approximate π

The first recorded algorithm for rigorously calculating the value of $\pi$ was a geometrical approach using polygons, devised around 250 BC by the Greek mathematician Archimedes. Archimedes computed upper and lower bounds of $\pi$ by drawing regular polygons inside and outside a circle, and calculating the perimeters of the outer and inner polygons. He proved that 223/71 < $\pi$ < 22/7 by using a 96-sided polygon, which gives us 2 accurate decimal digits: π ≈ 3.14[4].

Mathematicians continued to use this polygon method for the next 1,800 years. The more sides their polygons had, the more accurate their approximations were. This approach peaked around 1600, when the Dutch mathematician Ludolph van Ceulen used a $2^{60}$-sided polygon to obtain the first 35 digits of $\pi$[5]. He spent a major part of his life on this calculation. In memory of his contribution, $\pi$ is still sometimes called "the Ludolphine number".

However, mathematicians eventually had enough of trillion-sided polygons. Starting in the 17th century, they devised much better approaches for computing $\pi$, using calculus rather than geometry. Mathematicians discovered numerous infinite series associated with $\pi$, the most famous of which is the Leibniz series:

${\pi \over 4} = 1 - {1 \over 3} + {1 \over 5} - {1 \over 7} + {1 \over 9} \cdots$

We will explain how Leibniz got this amazing result and how it allowed him to approximate $\pi$.

This amazing series comes directly from the Taylor series of arctan(x):

Eq. 5a        $\arctan (x) = x - {x^3 \over 3} + {x^5 \over 5} - {x^7 \over 7} + {x^9 \over 9} \cdots$

We can get Eq. 5a by directly computing the derivatives of all orders for arctan(x) at x = 0, but the calculation involved is rather complicated. There is a much easier way to do this if we notice the following fact:

Eq. 5b        ${{d \arctan (x)} \over dx} = {1 \over {1 + x^2}}$

Recall that we gave the summation formula for the geometric series in the Basic Description section:

${ 1 \over {1 - r}} = 1 + r + r^2 + r^3 + r^4 \cdots$ , $-1 < r < 1$

If we substitute $r = -x^2$ into the summation formula above, we can expand the right side of Eq. 5b into an infinite series:

${ 1 \over {1 + x^2}} = 1 - x^2 + x^4 - x^6 + x^8 \cdots$

Figure 7b
Gottfried Wilhelm Leibniz
Discoverer of Leibniz series

So Eq. 5b changes into:

${{d \arctan (x)} \over dx} = 1 - x^2 + x^4 - x^6 + x^8 \cdots$

Integrating both sides gives us:

$\arctan (x) = C + x - {x^3 \over 3} + {x^5 \over 5} - {x^7 \over 7} + {x^9 \over 9} \cdots$

Setting x = 0 turns this equation into 0 = C. So the integration constant C vanishes, and we get Eq. 5a.

One may notice that, like Taylor series of many other functions, this series is not convergent for all values of x. It only converges for -1 ≤ x ≤ 1. Fortunately, this is just enough for us to proceed. Substituting x = 1 into it, we can get the Leibniz series:

${\pi \over 4} = 1 - {1 \over 3} + {1 \over 5} - {1 \over 7} + {1 \over 9} \cdots$

The Leibniz series gives us a radically improved way to approximate $\pi$: no polygons, no square roots, just the four basic operations. However, this particular series is not very efficient for computing $\pi$, since it converges rather slowly. The first 1,000 terms of the Leibniz series give us only two accurate decimal places: π ≈ 3.14. This is horribly inefficient, so most mathematicians would prefer not to use this algorithm.
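The slow convergence is easy to observe numerically. The following is a minimal sketch in Python; the function name `leibniz_pi` is only illustrative and ordinary floating-point arithmetic is assumed.

```python
import math

def leibniz_pi(num_terms):
    """Approximate pi using the first num_terms terms of the Leibniz series."""
    total = 0.0
    for k in range(num_terms):
        # k-th term of 1 - 1/3 + 1/5 - 1/7 + ...
        total += (-1) ** k / (2 * k + 1)
    return 4 * total

# Print the approximation and its absolute error for growing term counts.
for n in (10, 100, 1000, 10000):
    approx = leibniz_pi(n)
    print(n, approx, abs(math.pi - approx))
```

In this sketch the error shrinks only by about a factor of 10 each time the number of terms grows by a factor of 10.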

Fortunately, we can get series that converge much faster if we substitute smaller values of x , such as $1 \over \sqrt{3}$ , into Eq. 5a:

$\arctan {1 \over \sqrt{3}} = {\pi \over 6} = {1 \over \sqrt{3}} - {1 \over {3 \cdot 3 \sqrt{3}}} + {1 \over {5 \cdot 3^2 \sqrt{3}}} - {1 \over {7 \cdot 3^3 \sqrt{3}}} \cdots$

which gives us:

$\pi = \sqrt{12}(1 - {1 \over {3 \cdot 3}} + {1 \over {5 \cdot 3^2}} - {1 \over {7 \cdot 3^3}} + \cdots)$

This series is much more efficient than the Leibniz series, since there are powers of 3 in the denominators. Its first 10 terms give us 5 accurate digits, and its first 100 terms give us about 50. Leibniz himself used the first 22 terms to compute an approximation of $\pi$ correct to 11 decimal places as 3.14159265358.
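As a rough check of these figures, the series can be summed directly. This is a sketch under the assumption of ordinary double-precision floats, so it cannot confirm more than about 15 digits; the function name is hypothetical.

```python
import math

def sqrt12_series_pi(num_terms):
    """Approximate pi with sqrt(12) * (1 - 1/(3*3) + 1/(5*3^2) - 1/(7*3^3) + ...)."""
    total = 0.0
    for k in range(num_terms):
        total += (-1) ** k / ((2 * k + 1) * 3 ** k)
    return math.sqrt(12) * total

# Compare the error after 10 and after 22 terms.
for n in (10, 22):
    approx = sqrt12_series_pi(n)
    print(n, approx, abs(math.pi - approx))
```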

However, mathematicians were still not satisfied with this efficiency. They kept substituting smaller x values into Eq. 5a to get series that converge even faster. Among them was Leonhard Euler, one of the greatest mathematicians of the 18th century. In his attempt to approximate $\pi$, Euler discovered the following non-intuitive formula:

Eq. 5c        $\pi = 20 \arctan {1 \over 7} + 8 \arctan {3 \over 79}$

Although Eq. 5c looks strange, it is an exact equality, not an approximation. The following derivation shows in detail where it comes from:

Eq. 5c comes from the trigonometric identity for the tangent of the difference of two angles. Suppose we have three angles, $\alpha$, $\beta$, and $\gamma$, that satisfy:

$\gamma = \alpha - \beta$

Then the trigonometric identity gives us:

$\tan \gamma = \tan (\alpha - \beta) = {{\tan \alpha - \tan \beta} \over {1 + \tan \alpha \cdot \tan \beta}}$

Let $\tan \alpha = a$, $\tan \beta = b$, and substitute into the equation above:

$\tan \gamma = {{a - b} \over {1 + a \cdot b}}$ , or $\gamma = \arctan {{a - b} \over {1 + a \cdot b}}$

Recall that we have the relationship:

$\alpha - \beta = \gamma$

Change the angles into arctan functions:

$\arctan(a) - \arctan (b) = \arctan {{a - b} \over {1 + a \cdot b}}$

If we move arctan(b) to the right side, we get Euler's arctangent addition formula, which is the most important formula in this derivation:

Eq. 5d        $\arctan(a) = \arctan (b) + \arctan {{a - b} \over {1 + a \cdot b}}$

Eq. 5d takes a large angle, arctan(a), and divides it into two smaller angles, as shown in Figure 7c. From our previous discussion, we know that the series we use to estimate $\pi$ converges faster when we plug in smaller angles. So this formula helps us get more efficient algorithms.

Figure 7c
Dividing an angle

Euler himself used this formula to get his algorithm for estimating $\pi$. He started from a simple fact:

Step 1        ${\pi \over 4} = \arctan 1$

To divide this angle into smaller angles, we can plug a = 1 and b = 1/2 into Eq. 5d:

$\arctan 1 = \arctan {1 \over 2} + \arctan {1 \over 3}$

So the angle left over is arctan (1/3). Substituting this into Step 1 yields:

Step 2        ${\pi \over 4} = \arctan {1 \over 2} + \arctan {1 \over 3}$

Figure 7d
Euler's approximation of $\pi$

Next, let's focus on the angle arctan (1/2). Plug a = 1/2 and b = 1/3 into Eq. 5d:

$\arctan {1 \over 2} = \arctan {1 \over 3} + \arctan {1 \over 7}$

Substitute this into Step 2:

Step 3        ${\pi \over 4} = 2\arctan {1 \over 3} + \arctan {1 \over 7}$

We can keep doing this, using Euler's arctangent addition formula to get smaller and smaller angles:

$\arctan {1 \over 3} = \arctan {1 \over 7} + \arctan {2 \over 11}$ (a = 1/3 , b = 1/7)

Step 4        ${\pi \over 4} = 3\arctan {1 \over 7} + 2\arctan {2 \over 11}$

$\arctan {2 \over 11} = \arctan {1 \over 7} + \arctan {3 \over 79}$ (a = 2/11 , b = 1/7)

Step 5        ${\pi \over 4} = 5\arctan {1 \over 7} + 2\arctan {3 \over 79}$

Multiplying both sides by 4 gives Eq. 5c, the formula that Euler used to approximate $\pi$. Figure 7d shows a graphic representation of these 5 steps. We could certainly carry on dividing into even smaller angles, or try different values for a and b to get different series, but Euler stopped here because he thought these angles were small enough to give him an efficient algorithm.
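The angle splits in Steps 2 through 5 can be checked with exact rational arithmetic. The sketch below assumes Python's standard `fractions` module; the helper name `remaining_tangent` is made up for illustration, and the formula is applied only to positive a and b, as in the steps above.

```python
import math
from fractions import Fraction

def remaining_tangent(a, b):
    """Return c such that arctan(a) = arctan(b) + arctan(c), following Eq. 5d."""
    return (a - b) / (1 + a * b)

# The (a, b) pairs used in Steps 2 through 5.
steps = [
    (Fraction(1), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(1, 3)),
    (Fraction(1, 3), Fraction(1, 7)),
    (Fraction(2, 11), Fraction(1, 7)),
]
for a, b in steps:
    print(f"arctan({a}) = arctan({b}) + arctan({remaining_tangent(a, b)})")

# Floating-point sanity check of Step 5 multiplied by 4.
print(20 * math.atan(1 / 7) + 8 * math.atan(3 / 79), math.pi)
```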

The next step is to expand Eq. 5c using Taylor series, which allows us to do the numeric calculations:

$\pi = 20 ({1 \over 7} - {1 \over 3 \cdot 7^3} + {1 \over 5 \cdot 7^5} - {1 \over 7 \cdot 7^7} \cdots)$
$+ 8 ({3 \over 79} - {3^3 \over 3 \cdot 79^3} + {3^5 \over 5 \cdot 79^5} - {3^7 \over 7 \cdot 79^7} \cdots)$

This series converges so fast that each term gives more than 1 digit of $\pi$. Using this algorithm, it would not take more than several days to calculate with pencil and paper the first 35 digits of $\pi$, on which Ludolph spent most of his life.
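To see this convergence rate, the expansion can be evaluated with more precision than ordinary floats allow. The following is a sketch, assuming Python's standard `decimal` module with a working precision of 50 significant digits; `arctan_series` and `euler_pi` are illustrative names, and the same number of terms is used in both arctangent series for simplicity.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # working precision in significant digits (an assumption)

def arctan_series(x, num_terms):
    """Partial sum of Eq. 5a: x - x^3/3 + x^5/5 - ... using num_terms terms."""
    total = Decimal(0)
    power = x              # holds x^(2k+1)
    x_squared = x * x
    for k in range(num_terms):
        term = power / (2 * k + 1)
        total += term if k % 2 == 0 else -term
        power *= x_squared
    return total

def euler_pi(num_terms):
    """Evaluate Eq. 5c with num_terms terms in each arctangent series."""
    return (20 * arctan_series(Decimal(1) / 7, num_terms)
            + 8 * arctan_series(Decimal(3) / 79, num_terms))

for n in (1, 2, 5, 10, 25):
    print(n, euler_pi(n))
```

With 25 terms per series, this sketch reproduces the 35 decimal places that Ludolph computed.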

Although Euler himself never undertook the calculation, this idea was developed and used by many other mathematicians of his time. In 1789, the Slovene mathematician Jurij Vega calculated the first 140 decimal places of $\pi$, of which the first 126 were correct. This record was broken in 1841, when William Rutherford calculated 208 decimal places, of which 152 were correct. By the time electronic digital computers were invented, $\pi$ had been expanded to more than 500 digits. And we shouldn't forget that all of this started from the Taylor series of trigonometric functions.

### Approximating e

The mathematical constant $e$, approximately equal to 2.71828, is also called Euler's Number. This important constant appears in calculus, differential equations, complex numbers, and many other branches of mathematics. It's also widely used in other disciplines like physics and engineering. So we would really like to know its value as accurately as possible.

Figure 8a
Definition of $e$

One way to define $e$ is:

$e = \lim_{n \to \infin} (1 + {1 \over n}) ^n$

In principle, we can approximate $e$ using this definition by choosing a large value of n. However, this method is slow and inefficient. For example, let n = 100 and substitute it into the definition. We get:

$e \approx (1 + {1 \over 100}) ^{100} = 2.70481 \cdots$

This is accurate to only 2 digits, which is poor for an approximating algorithm, so we have to find an alternative. One such alternative can be found using Taylor series. Using calculus, we can derive the Taylor series for $e^x$ and use it to make our approximation.

$e^x$ has a very convenient property:

$\frac{d}{dx} e^x = e^x$

The proof of this property can be found in almost every calculus textbook. It tells us that all derivatives of the exponential function are equal:

$f(x) = f'(x) = f''(x) = f ^{(3)}(x) = \cdots = e^x$,

and:

$f(0) = f'(0) = f''(0) = f ^{(3)}(0) = \cdots = 1$

Substitute these derivatives into Eq. 2, the general formula of Taylor Series. We get:

$e^x = 1 + x + {x^2 \over 2!} + {x^3 \over 3!} + {x^4 \over 4!} + \cdots$

Let x = 1 to approximate $e$:

$e = 1 + 1 + {1 \over 2!} + {1 \over 3!} + {1 \over 4!} + \cdots$

This series converges quickly, since there are factorials in the denominators of the terms, and factorials grow really fast as n increases. Taking just the first 10 terms, we get:

$e \approx 1 + 1 + {1 \over 2!} + {1 \over 3!} + {1 \over 4!} + \cdots + {1 \over 9!} = 2.718281526 \cdots$

Figure 8b
Two approximations of $e^x$. The Taylor series converges much faster.

The real value of $e$ is 2.718281828··· , so we have obtained 7 accurate digits! Compared to the approximation by definition, which gives us only two digits at order 100, this algorithm is incredibly fast and efficient.
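The two methods can be compared side by side. This is a minimal sketch assuming ordinary floating-point arithmetic; the function names are illustrative only.

```python
import math

def e_from_limit(n):
    """Approximate e with (1 + 1/n)^n, the limit definition at a finite n."""
    return (1 + 1 / n) ** n

def e_from_taylor(num_terms):
    """Approximate e with the first num_terms terms of 1 + 1 + 1/2! + 1/3! + ..."""
    total = 0.0
    term = 1.0  # starts as 1/0!
    for k in range(num_terms):
        total += term
        term /= (k + 1)  # turn 1/k! into 1/(k+1)!
    return total

print("limit, n = 100:  ", e_from_limit(100), abs(math.e - e_from_limit(100)))
print("Taylor, 10 terms:", e_from_taylor(10), abs(math.e - e_from_taylor(10)))
```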

In fact, we reach the same conclusion if we plot the function $e^x$ and its two approximations together, and see which one converges faster. We already have the Taylor series approximation:

$e^x \approx 1 + x + {x^2 \over 2!} + {x^3 \over 3!} + \cdots + {x^n \over n!}$

The approximation by definition, extended from $e$ to $e^x$, is:

$e^x \approx (1 + {x \over n}) ^n$

In Figure 8b, these two approximations are graphed together with the original function $e^x$. As we can see in the animation, the Taylor series approaches the original function much faster than the definition does.

### Small-angle approximation

Taylor series are useful in physics for approximating the trigonometric values of small angles. Consider sin(0.1):

$\sin (0.1) = 0.1 - {0.1^3 \over 3!} + {0.1^5 \over 5!} - \cdots$

It is straightforward to evaluate both $P_1(0.1) = 0.1$ and $P_3(0.1) = 0.099833\cdots$. A calculator gives sin(0.1) = 0.0998334166. For small angles like 0.1 radians, the third-order term is substantially smaller than the first-order term, which is equal to the argument of the function in the case of sine. (The second-order term, of course, is 0.) It is often suitable, then, to take the first-order term of the Taylor series for sin(x) as its small-angle approximation:

$\sin (x) \approx x$ for small x
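The comparison for sin(0.1) can be reproduced with a short sketch. The function `sin_taylor` below is a hypothetical helper that evaluates the Taylor polynomial of sine about 0 up to a given degree.

```python
import math

def sin_taylor(x, degree):
    """Evaluate the Taylor polynomial P_degree(x) of sin(x) about 0."""
    total = 0.0
    for n in range(1, degree + 1, 2):            # only odd powers appear
        sign = 1 if (n // 2) % 2 == 0 else -1    # n = 1: +, n = 3: -, n = 5: +
        total += sign * x ** n / math.factorial(n)
    return total

x = 0.1
print("P1:      ", sin_taylor(x, 1))
print("P3:      ", sin_taylor(x, 3))
print("math.sin:", math.sin(x))
```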

By a similar token, we may obtain small-angle approximations for the other trigonometric functions. In the case of cosine, we go out to the second-order term, since that is the first term that includes x:

$\cos (x) \approx 1 - {x ^2 \over 2!}$ for small x

Then

$\tan (x) = {\sin (x) \over \cos (x)} \approx {x\over 1-{x^2 \over 2}} \approx {x\over 1} = x$ for small x,

because for small x we only need the first-order approximations, and the first-order approximation of cos(x) is 1.

Figure 9a
Comparison of sin(x) with its small-angle approximation.

Figure 9b
Comparison of cos(x) with its small-angle approximation.

Figure 9c
Comparison of tan(x) with its small-angle approximation.

#### How small is small?

Figure 10a
The absolute error of the various small angle approximations.

It is, of course, important to know when a small-angle approximation is appropriate and at what values it ceases to be accurate. There is no universal answer to this concern. Physicists will often use an approximation insofar as it can be used to model whatever they need to model. If the approximation is not useful, then they will not use it.

In any case, it is necessary to have an idea of how accurate an approximation is. Here, we will try to at least get a sense of just how accurate these small-angle approximations are.

Figure 10b
The relative error (the absolute error divided by the actual function value) of the various small angle approximations. The horizontal, black curve represents 1% accuracy.

One way to do this would be to bound the error of our approximation, as we do above in the section on the error bound of a Taylor series. However, for reasons explained in that section, this would necessarily overestimate the error, which is helpful in practical circumstances but not for the point we are trying to make here. Instead, we can simply compare our small-angle approximations to the actual values of the functions that they approximate.

Figure 10a plots the actual error of these functions; that is, the absolute value of the difference between the small-angle approximation and the original function. Figure 10b plots the relative error, or the actual error divided by the value of the actual function. The horizontal line represents 1% of relative error; where the curves intersect with this horizontal line is where the approximations begin to exceed 1% of relative error.

Cosine's small-angle approximation is the most accurate, while tangent is the least accurate. This makes sense when we consider the nature of each of the small-angle approximations. Sine and tangent are both first-order approximations, while cosine must be a second-order approximation, since its first-order Taylor polynomial is always 1. We would expect it, then, to be the most accurate.

On the other hand, in making our small-angle approximation for tangent, we assume that the cosine is essentially 1. One could of course improve the tangent approximation's accuracy by using the small-angle approximation for cosine instead of 1, but then the tangent approximation would lose its simplicity, which is the appeal and utility of a small-angle approximation in the first place!
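The 1% crossing points can also be estimated numerically. The sketch below scans upward from 0 in small steps until the relative error first exceeds a tolerance; the step size, the dictionary of approximations, and the function names are all assumptions made for illustration.

```python
import math

def relative_error(approx, exact):
    """Absolute error divided by the magnitude of the actual function value."""
    return abs(approx - exact) / abs(exact)

# Each entry pairs a small-angle approximation with the function it approximates.
approximations = {
    "sin": (lambda x: x, math.sin),
    "cos": (lambda x: 1 - x * x / 2, math.cos),
    "tan": (lambda x: x, math.tan),
}

def first_angle_exceeding(name, tolerance=0.01, step=1e-4):
    """Return the first scanned angle (in radians) whose relative error exceeds tolerance."""
    approx, exact = approximations[name]
    x = step
    while x < math.pi / 2 and relative_error(approx(x), exact(x)) <= tolerance:
        x += step
    return x

for name in ("tan", "sin", "cos"):
    print(name, round(first_angle_exceeding(name), 3), "radians")
```

In this sketch the tangent approximation crosses 1% first, then sine, then cosine, which matches the ordering described above.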

#### Pendulum

In the simplest respect, the small-angle approximation is "close enough," and it's quicker than evaluating more terms of a Taylor series. However, the small-angle approximation has an additional utility in that it can allow us to solve certain differential equations in closed form. Or rather, it allows us to find an exact closed-form solution to a simpler differential equation which is approximately correct, making the closed form an approximate solution to the exact differential equation. A particular example of this utility is the derivation of the closed-form approximate equation for a simple pendulum.

This explanation uses knowledge from physics and differential equations.

Figure 11
Free-body diagram of a simple pendulum with mass m.

Figure 11 on the right is a diagram of a simple pendulum. The force due to gravity on the object is mg. Using basic trigonometry, this downward force can be decomposed into a force parallel to the string and a force perpendicular to the string. The force parallel to the string (mg cos θ) is canceled out by the tension in the string, so the only net force acting on the bob is the force perpendicular to the string (mg sin θ). For this reason the direction of the object's instantaneous motion is perpendicular to the string.

We would like to find a general equation for θ(t), the angle formed by the string and the vertical as a function of time.

We begin by noting that

$s = L \cdot \theta$, where s is the bob's position along its arc and L is the length of the string, so
$\theta = {s \over L}$.

Furthermore, we know

$F = m a = m {d^2 s \over dt^2}$ where a is linear acceleration.

Since the only net force acting on the bob is the force due to gravity perpendicular to the string, we have

$- mg \sin {s \over L} = m {d^2 s \over dt^2}$

Moving everything to one side and dividing by m, we get:

$0 = m {d^2 s \over dt^2} + mg \sin {s \over L}$
Eq. 6         $0 = {d^2 s \over dt^2} + g \sin {s \over L}$

This is where the small-angle approximation comes in. Because of the $\sin{s \over L}$ expression, we cannot solve this differential equation in closed form using elementary functions. Instead, we substitute in our small-angle approximation for sine:

$0 = {d^2 s \over dt^2} + g {s \over L} = {d^2 s \over dt^2} + {g \over L}s$

Differential equations of this form have solutions in terms of the sine and cosine functions. Consider, for instance:

${d^2 \cos (kt + c) \over dt^2} = -k^2\cos (kt + c)$ where k and c are arbitrary constants

In this case, $k^2$ is $g \over L$, so we obtain the solution:

$s (t)= A \cos(\sqrt{g \over L}t + B)$.

By dividing each side by L, we may put the equation back in terms of θ, writing $\theta_{max} = A / L$ and $\phi = B$.

Eq. 7         $\theta (t) = \theta_{max} \cos (\sqrt{g \over L}t + \phi)$

$\theta _{max}$ is generally known beforehand; it is the largest angle formed by the bob in the pendulum's arc. The phase shift $\phi$ is found by finding a zero of $\theta (t)$. Any such solution for the differential equation could alternatively be expressed as a sine function with a different phase shift.

At this point it is important to reflect that Eq. 7, a relatively simple analytical solution, was made possible only by the assumption of a small angle. Naturally, this means that our formula does not work for larger angles. For larger t, or after several oscillations, the approximation will become gradually less accurate, since it is not an exact formula, and the error will be compounded over time.
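To get a feel for how the small-angle solution degrades at larger amplitudes, Eq. 7 can be compared with a numerical solution of the exact equation, rewritten in terms of θ as $\theta'' = -{g \over L} \sin \theta$. The following is only a sketch: the values of `G` and `LENGTH`, the time step, and the choice of a classical fourth-order Runge-Kutta integrator are all assumptions, and the bob is assumed to be released from rest at $\theta_{max}$ (so $\phi = 0$).

```python
import math

G = 9.81       # gravitational acceleration in m/s^2 (assumed value)
LENGTH = 1.0   # string length in m (assumed value)

def simulate_exact(theta_max, t_end, dt=1e-3):
    """Integrate theta'' = -(g/L) sin(theta) with RK4, starting at rest at theta_max."""
    def deriv(theta, omega):
        return omega, -(G / LENGTH) * math.sin(theta)

    theta, omega = theta_max, 0.0
    for _ in range(round(t_end / dt)):
        k1t, k1w = deriv(theta, omega)
        k2t, k2w = deriv(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t, k3w = deriv(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t, k4w = deriv(theta + dt * k3t, omega + dt * k3w)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return theta

def small_angle(theta_max, t):
    """Eq. 7 with phi = 0: theta(t) = theta_max * cos(sqrt(g/L) * t)."""
    return theta_max * math.cos(math.sqrt(G / LENGTH) * t)

# Compare both models after 2 seconds for a small and a large release angle.
for theta_max in (0.1, 1.5):
    exact = simulate_exact(theta_max, 2.0)
    approx = small_angle(theta_max, 2.0)
    print(theta_max, exact, approx, abs(exact - approx))
```

For the small release angle the two results nearly coincide, while for the large one they differ noticeably after a single swing.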

# References

1. How does the calculator find values of sine, from homeschoolmath. This is an article about calculator programs for approximating functions.
2. Calculator, from Wikipedia. This article explains the structure of an electronic calculator.
3. Pi, from Wolfram MathWorld. This article contains some history of Pi.
4. Archimedes' Approximation of Pi. This is a thorough explanation of Archimedes' method.
5. Digits of Pi, by Barry Cipra. Documentation of Ludolph's work is included here.
6. How Euler Did It, by Ed Sandifer. This article talks about Euler's algorithm for estimating π.