Catalan Numbers

From Math Images


Worm and Apple

This greedy little worm wants to eat the poor apple. He can only move east or north on this 8 by 8 grid. Because there is a stain on the grid, he cannot pass above the diagonal connecting the worm and the apple. In how many ways can he get there? The main image shows only one way of reaching the apple.
This is a famous grid problem in combinatorics, and it can be solved with Catalan numbers.


Basic Description

Catalan numbers form a sequence of natural numbers that grows rapidly. The first several Catalan numbers, starting with C_0, are:

1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862.

A more explicit and detailed description is given in the section A More Mathematical Explanation.

Why It's Interesting


In Europe, the first person to discover Catalan numbers was Leonhard Euler. In 1751, in a letter to Christian Goldbach, a German mathematician, Euler discussed the number of ways to cut a polygon into triangles with lines that do not intersect.

It was a French and Belgian mathematician, Eugène Charles Catalan, who described this number sequence in a well-defined formula, and introduced this subject to solve parentheses expressions.

Before Catalan, the Mongolian mathematician Minggatu was the first person in China to establish and apply what would later be known as Catalan numbers. In the 1730s, he introduced this sequence of numbers and kept using it while trying to express series expansions of sin(ma), where m = 2, 3, 4, 5, 10, 100, 1000, and 10000. This topic was included in his book, Ge Yuan Mi Lu Jie Fa (The Quick Method for Obtaining the Precise Ratio of Division of a Circle).


In this section we consider 10 of the most representative applications of Catalan numbers, which arise in a variety of combinatorial problems. Examples are indicated with bullet points.

Stacking Coins

  • We are going to stack coins on a bottom row that consists of n consecutive coins. Coins may not be placed to the left or right of the bottom row. How many ways are there to stack coins on the n coins?
n: The number of coins in the bottom row.
Solution: C_n.

Balanced Parentheses

  • We want to arrange n pairs of parentheses into a valid string. Each open parenthesis must have a matching closed parenthesis that comes after it. Therefore, "(( )( ))" is valid, but ")( )) ((" and "( ))( ) (" are not. How many valid arrangements of n pairs of parentheses are there?
n: The number of pairs of parentheses.
Solution: C_n.

n = 0: Do nothing! (1 solution)
n = 1: ( ) (1 solution)
n = 2: (( )), ( )( ) (2 solutions)
n = 3: ( (( )) ), ( ( )( ) ), (( )) ( ), ( ) (( )), ( )( )( ) (5 solutions)

  • Many other applications are equivalent to balanced parentheses, and here is one example. If we want to connect 2n dots lying on a horizontal line in the plane with n nonintersecting arcs, the solution is also the Catalan sequence. Each arc connecting the two dots is equivalent to a pair of parentheses, with the left dot equivalent to an open parenthesis and the right dot equivalent to a closed parenthesis.
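
The small cases listed above can be reproduced by brute force. The following Python sketch is only an illustration and is not part of the original page; the function name `balanced_parentheses` is hypothetical. It builds every valid string for n pairs by adding a closed parenthesis only when an unmatched open parenthesis is available.

```python
def balanced_parentheses(n):
    """Return all balanced strings made of n pairs of parentheses."""
    results = []

    def extend(prefix, opened, closed):
        # A string is complete once all n pairs have been placed.
        if opened == n and closed == n:
            results.append(prefix)
            return
        if opened < n:
            extend(prefix + "(", opened + 1, closed)
        # A ")" is allowed only if it matches an earlier unmatched "(".
        if closed < opened:
            extend(prefix + ")", opened, closed + 1)

    extend("", 0, 0)
    return results


counts = [len(balanced_parentheses(n)) for n in range(6)]
print(counts)                    # one count for each n = 0, ..., 5
print(balanced_parentheses(3))   # the five strings for n = 3
```

The counts for n = 0, 1, 2, 3 agree with the 1, 1, 2 and 5 solutions shown in the list above.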

Mountain Ranges

  • We want to form mountain ranges on a line with n upstrokes and n downstrokes. As with the matching rule in the parentheses problem, each upstroke must have a matching downstroke, so a range never dips below the starting line. How many mountain ranges are there for each value of n?
n: The number of pairs of upstrokes and downstrokes.
Each range below is written as its sequence of strokes, read from left to right.
n = 0: Do nothing! (1 solution)
n = 1: /\ (1 solution)
n = 2: /\/\, //\\ (2 solutions)
n = 3: /\/\/\, /\//\\, //\\/\, //\/\\, ///\\\ (5 solutions)
  • Note that a pair of strokes and a pair of parentheses are equivalent: upstrokes correspond to open parentheses, and downstrokes correspond to closed parentheses. One pair of parentheses nested inside another corresponds to one pair of strokes sitting on top of another, which is what forms the shape of a mountain range.

Polygon Triangulation

  • We want to cut convex polygons into triangles by connecting vertices with straight, non-intersecting lines. How many different ways are there for a polygon with n+2 sides? This is the application Euler was interested in.
n: The number of sides of the polygon minus 2.
Solution: C_n.
Note that a 2-sided polygon is taken to be triangulated in exactly one way (do nothing), which agrees with C_0 = 1.

Binary Trees

  • How many full binary trees are there with n internal nodes?
n: The number of internal nodes of the full binary tree.
Note that when there is only one node, there is exactly one solution, the node itself, which matches C_0 = 1.
In summary, a full binary tree with n internal nodes has 2n + 1 nodes, 2n branches and n + 1 leaves.

  • Other variations on binary trees and plane trees also follow the Catalan sequence:
1. Binary trees with n vertices [1].
2. Plane trees with n+1 vertices [1].

Binary Paths

  • In an n × n grid, we are going to join the lower left point A and the upper right point B by a path. Each step goes one unit to the right or one unit upwards, and the path cannot pass above the diagonal connecting A and B. How many such paths are there?
n: The side length of the grid.
Solution: C_n. (Thus, the answer to the main image is C_8 = 1430.)
Figure-1 An example of Dyck Path.
1. Did you notice that these paths look a lot like mountain ranges if you rotate them counterclockwise about the origin until the diagonal is horizontal?
2. Did you notice that, whatever the value of n is, the first step is always to the east and the last step is always to the north? This is because the path cannot pass above the diagonal.

  • This kind of lattice walk is also known as a Dyck path. In Cartesian coordinates, a Dyck path is a walk from (0, 0) to (n, n) in an n × n lattice, composed of unit steps in the positive x and positive y directions only, that never passes above the line y = x (see Figure-1). Other variations of Dyck paths turn out to follow the sequence of Catalan numbers as well. In the two variations below, the paths are drawn in the rotated, mountain-range form, with steps (1, 1) and (1, -1) that never go below the x-axis:
1. Dyck paths from (0, 0) to (2n + 2, 0) such that any maximal sequence of consecutive steps (1, -1) ending on the x-axis has odd length [1].
2. Dyck paths from (0, 0) to (2n + 2, 0) with no peaks at height two [1].
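
The grid problem can also be checked numerically. The sketch below is an illustration under our own naming (the function `count_paths_below_diagonal` is hypothetical, not from the original page). It counts east/north paths with a small dynamic-programming table that only fills in the points on or below the diagonal.

```python
def count_paths_below_diagonal(n):
    """Count east/north paths from (0, 0) to (n, n) that never go above y = x."""
    # ways[x][y] = number of allowed paths ending at the point (x, y).
    ways = [[0] * (n + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for x in range(n + 1):
        for y in range(x + 1):              # only points with y <= x are allowed
            if x > 0:
                ways[x][y] += ways[x - 1][y]    # last step went east
            if y > 0:
                ways[x][y] += ways[x][y - 1]    # last step went north
    return ways[n][n]


path_counts = [count_paths_below_diagonal(n) for n in range(6)]
print(path_counts)
print(count_paths_below_diagonal(8))   # the 8 by 8 grid of the main image
```

Points above the diagonal are never filled in, so they keep the value 0 and contribute nothing to their neighbours.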


Permutation

  • A permutation of {1, 2, ... , n} is a rearrangement of the n numbers. For example, {1, 2, 3} has 6 permutations: (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1). A 123-avoiding permutation is one that contains no increasing subsequence of 3 terms (the 3 terms do not have to be consecutive). Therefore, for n = 3 we must exclude (1, 2, 3). Taking n = 4 as another example, (4, 3, 1, 2) is valid, but (4, 1, 2, 3) is not valid because of the subsequence 123, and neither is (2, 3, 1, 4) because of 234. How many permutations of {1, 2, ... , n} avoid 123?
n: The number of elements being permuted.
Solution: C_n.
n = 0: the empty permutation (1 solution)
n = 1: (1) (1 solution)
n = 2: (1, 2), (2, 1) (2 solutions)
n = 3: (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1) (5 solutions)
n = 4:
(1, 4, 3, 2), (2, 1, 4, 3), (2, 4, 1, 3), (2, 4, 3, 1),
(3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1), (3, 4, 1, 2), (3, 4, 2, 1),
(4, 1, 3, 2), (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1)
(14 solutions)
Note that a 123-avoiding permutation only needs to avoid increasing subsequences of three terms, regardless of the value of n. Therefore, for n = 2 both (1, 2) and (2, 1) are valid: (1, 2) is increasing, but it has only two terms.

  • Similarly, a 321-avoiding permutation of [n] = {1, 2, ... , n} is one that contains no decreasing subsequence of 3 terms. Taking n = 3 as an example, we get (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2). A permutation of [n] is called 132-avoiding "if it does not have three entries a < b < c so that a is the leftmost of them and b is the rightmost of them" [2]. Both kinds of permutations are also counted by the Catalan numbers.
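
The pattern-avoidance counts can be verified by exhaustive search for small n. This is a minimal sketch, assuming hypothetical helper names `contains_pattern` and `count_avoiders`; it simply tests every choice of positions, so it is only practical for small n.

```python
from itertools import combinations, permutations


def contains_pattern(perm, pattern):
    """True if some subsequence of perm is ordered like pattern, e.g. (1, 2, 3)."""
    k = len(pattern)
    for positions in combinations(range(len(perm)), k):
        values = [perm[i] for i in positions]
        # Replace each chosen value by its rank (1 = smallest) among the chosen values.
        ranks = [sorted(values).index(v) + 1 for v in values]
        if tuple(ranks) == tuple(pattern):
            return True
    return False


def count_avoiders(n, pattern):
    """Number of permutations of 1..n that do not contain the pattern."""
    return sum(1 for p in permutations(range(1, n + 1))
               if not contains_pattern(p, pattern))


for pattern in [(1, 2, 3), (3, 2, 1), (1, 3, 2)]:
    print(pattern, [count_avoiders(n, pattern) for n in range(6)])
```

All three patterns give the same list of counts, which is the point of the bullets above.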

  • Catalan numbers count shuffles of the permutation 1, 2, \cdots, n with itself, i.e., permutations of the multiset \left \{ 1^2, 2^2, \cdots , n^2 \right \} which are a union of two disjoint subsequences 1, 2, \cdots, n. Equivalently, there should be no weakly decreasing subsequence of length three [3].

Explanations. A shuffle of 1, 2, \cdots, n with itself is obtained by interleaving the letters of the two strings while keeping the letters of each string in their original order. For example, a shuffle of 123 and 456 could be 124536. Having no weakly decreasing subsequence of length three means that there are no three entries, read from left to right, in which each entry is less than or equal to the one before it. For example, 121233 is valid, but 112332 is not, because it contains the weakly decreasing subsequence 3, 3, 2.
Back to the application, let's look at the example n = 3, where we want to know the number of shuffles of 1, 2, 3 with 1, 2, 3. It turns out that there are 5 distinct shuffles:
112233 112323 121233 121323 123123.
Note that these are distinct shuffles. A single shuffle can arise in several ways, because each entry may come from either copy of the string 1, 2, 3 even though the resulting sequence looks the same. For instance, the first shuffle 112233 has multiplicity 8: for each of the three values, either copy can supply the first occurrence, which gives 2 × 2 × 2 = 8 ways.

  • The Catalan sequence also counts the sequences a_1 a_2 a_3 \cdots a_{2n} formed from the integers 1, 1, 2, 2, 3, 3, ... , n, n such that
a) the integers 1, 2, 3, ... , n are in increasing order when they first occur, and
b) there is no subsequence of the form \alpha\beta\alpha\beta, where the entries of the subsequence do not have to be consecutive.
For example, 1212 and 122313 are not valid. Here are the solutions for n = 3:
112233, 112332, 122331, 122133, 123321.

Young Diagrams

Figure-2  Partition of 4.

In combinatorics, a partition of a positive integer n is a way of writing n as a sum of positive integers, where the order of the summands does not matter. (If order matters, the sum is called a composition.) Taking n = 4 as an example, there are 5 ways to partition 4: 4, 3+1, 2+2, 2+1+1, 1+1+1+1. Remember, the ordering of the integers does not matter, i.e. 1+1+2, 1+2+1 and 2+1+1 are the same partition.

Partitions can be visualized with diagrams, and the most commonly used kind is the Young diagram. Again, take n = 4 and n = 5 as two examples. The Young diagrams of the partitions of 4 and 5 are shown in Figure-2 and Figure-3.

Figure-3  Partition of 5.
  • The Young diagrams that fit in the shape (n - 1, n - 2, ... , 1) are counted by the Catalan sequence [1].
Explanations. The shape (n - 1, n - 2, ... , 1) is a Young diagram that looks like an upside-down staircase. Its top stair consists of n - 1 blocks, the next stair of n - 2 blocks, and so on until the last stair, which has only 1 block. By "fit," we mean that we look for Young diagrams that are either a part of the shape (n - 1, n - 2, ... , 1) or the entire shape.
In the image above, the last figure is the shape (2, 1), where n = 3. Each of the other four figures, including the empty diagram, is a part of the shape. Adding the original shape, we have five solutions for the shape (2, 1) in total.
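
To see the staircase count in action, the following sketch lists the diagrams explicitly. It is an illustration only: the function name `diagrams_in_staircase` and the encoding of a diagram as a tuple of row lengths (with 0 for an empty row) are our own assumptions.

```python
from itertools import product


def diagrams_in_staircase(n):
    """List Young diagrams, as row lengths, contained in (n-1, n-2, ..., 1)."""
    bounds = [n - i for i in range(1, n)]   # row i may hold at most n - i blocks
    shapes = []
    for rows in product(*(range(b + 1) for b in bounds)):
        # Row lengths of a Young diagram must be weakly decreasing.
        if all(rows[i] >= rows[i + 1] for i in range(len(rows) - 1)):
            shapes.append(rows)
    return shapes


print(diagrams_in_staircase(3))
staircase_counts = [len(diagrams_in_staircase(n)) for n in range(1, 7)]
print(staircase_counts)
```

For n = 3 the list contains the empty diagram (0, 0), the partial shapes (1, 0), (1, 1), (2, 0), and the full shape (2, 1), matching the five solutions described above.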


Posets

A partially ordered set P, or poset P for short, is a set together with a binary relation denoted \le that satisfies the following three axioms, where x, y and z are arbitrary elements of P:

  1. For all x \in P, x \le x. (reflexivity)
  2. If x \le y, and  y \le x , then  x = y. (antisymmetry)
  3. If x \le y, and  y \le z , then  x \le z. (transitivity) [1]

A Hasse diagram is used to represent a finite poset. Each element of the poset is a vertex in the Hasse diagram. The order relation is represented by lines going up from one vertex to another: x \le y exactly when y can be reached from x by following lines upwards. The lines may cross each other, but a line cannot touch any other vertex before it reaches its endpoint. It is the line segments and labeled vertices in the diagram that illustrate the partial order of the set. See Figure-4, Figure-5 and Figure-6 for several examples of Hasse diagrams. As you can see, the elements of a poset can be any objects: sets, diagrams or numbers.

How do Hasse diagrams embody the 3 axioms of posets? In Figure-4, where the order is set inclusion,

  1. Each element in the diagram is related to itself, i.e., the set {a} is less than or equal to itself. This shows reflexivity.
  2. Two different elements are never each less than or equal to the other. For example, {a} and {b} are different sets and neither contains the other, while {a} is less than or equal to {a, b} but {a, b} is not less than or equal to {a}. This is the idea of antisymmetry.
  3. Since the empty set is less than or equal to {a}, and {a} is less than or equal to {a, b}, the empty set is less than or equal to {a, b}. In the diagram, the three sets are connected by 2 line segments. This shows that if we can follow the lines from a lower element up to a higher one, then the lower element is less than or equal to the higher one by transitivity. Likewise, we can tell that {b} is less than or equal to {a, b, c} according to the diagram.

  • The linear extensions of the poset 2 × n are counted by the Catalan sequence [1] (see Figure-7).
Figure-7 This is the Hasse diagram of the poset 2 × n .
Explanations. A linear extension is obtained by rewriting the Hasse diagram on a line, read from bottom to top just like the Hasse diagram itself. How do we know where to put the elements? Starting from the bottom element 1, we write each element of the Hasse diagram only after all the elements connected to it from below have been written. Hence, a poset can have more than one linear extension.
For n = 3, repeating the steps shown in Figure-8 and Figure-9 gives 5 linear extensions. The starting and ending points never change, whereas the points in between vary.
123456, 123546, 132456, 132546, 135246
The number of linear extensions of the poset 2 × n turns out to be the nth Catalan number.
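
The five linear extensions for n = 3 can be recovered by brute force. The sketch below is an illustration, not the page's own method. It assumes a labeling inferred from the list above: one chain is 1 < 3 < 5 < ..., the other is 2 < 4 < 6 < ..., and each odd label 2j - 1 lies below the even label 2j. The function name `linear_extensions_2_by_n` is hypothetical.

```python
from itertools import permutations


def linear_extensions_2_by_n(n):
    """Linear extensions of 2 x n under the assumed labeling described above."""
    # Lines of the Hasse diagram: k below k + 2 (along a chain)
    # and 2j - 1 below 2j (across the two chains).
    relations = [(k, k + 2) for k in range(1, 2 * n - 1)]
    relations += [(2 * j - 1, 2 * j) for j in range(1, n + 1)]
    extensions = []
    for order in permutations(range(1, 2 * n + 1)):
        position = {label: i for i, label in enumerate(order)}
        # Keep the order only if every lower element is written first.
        if all(position[a] < position[b] for a, b in relations):
            extensions.append("".join(map(str, order)))
    return extensions


print(linear_extensions_2_by_n(3))
print(len(linear_extensions_2_by_n(4)))
```

Trying all (2n)! orders is slow, so this check is only meant for very small n.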

Pascal's Triangle


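If you take the numbers in the middle column of Pascal's Triangle, which appear in the odd rows (the 1st, 3rd, 5th, ... rows from the top), and subtract from each one the number in the adjacent column of the same row, you will find the Catalan sequence.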

1 - 0 = 1 = C_0
2 - 1 = 1 = C_1
6 - 4 = 2 = C_2
20 - 15 = 5 = C_3
70 - 56 = 14 = C_4
252 - 210 = 42 = C_5

This is not a coincidence. Each entry in Pascal's Triangle has the form \tbinom{k}{r}, where k is the row number, starting at 0 and counted from top to bottom, and r is the entry number, starting at 0 and counted from left to right. Since all of the middle-column numbers lie in the rows numbered 0, 2, 4, ..., we use 2n to represent the row number. Therefore, the numbers in the middle column can be expressed as \tbinom{2n}{n}, and the numbers in the adjacent column can be expressed as \tbinom{2n}{n+1}. The differences between the numbers in the middle column and those in the adjacent column are therefore exactly the Catalan numbers, by the formula proved in the section A More Mathematical Explanation:

\binom{2n}{n} - \binom{2n}{n+1} = C_{n}, \qquad n = 0, 1, 2, ...
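
The differences listed above can be reproduced with Python's standard `math.comb`, which returns 0 when the lower index exceeds the upper one (this covers the case n = 0). The variable names are our own; this is a quick check rather than a proof.

```python
from math import comb

differences = []
for n in range(6):
    middle = comb(2 * n, n)          # entry in the middle column of row 2n
    neighbour = comb(2 * n, n + 1)   # entry in the adjacent column
    differences.append(middle - neighbour)
    print(f"{middle} - {neighbour} = {middle - neighbour} = C_{n}")
```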

A More Mathematical Explanation

Note: understanding of this explanation requires: *combinatorics

Basic Description

The nth Catalan number is defined as

C_{n} = \frac{1}{n+1} \binom{2n}{n} = \frac{(2n)!}{n!(n+1)!}, \qquad n = 0, 1, 2, ...

The binomial coefficient, \tbinom{n}{r}, pronounced as n choose r, represents the number of possible combinations of r objects from a collection of  n objects:

\binom{n}{r} = \frac{n!}{r! (n-r)!} .


\binom{2n}{n} = \frac{(2n)!}{n! (2n -n)!} = \frac{2n \cdot (2n -1) \cdot (2n -2) \cdots (2n - n + 1)}{n!}

Example: \binom{11}{4} = \frac{11!}{4!~(11-4)!} = \frac{11 \times 10 \times 9 \times 8}{4 \times 3 \times 2 \times 1} = 330.
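
As a quick numerical companion to the definition, the sketch below evaluates the closed form with exact integer arithmetic. The function name `catalan` is our own choice; `math.comb` and `math.factorial` are standard library functions.

```python
from math import comb, factorial


def catalan(n):
    """nth Catalan number from the closed form (2n)! / (n! (n+1)!)."""
    return factorial(2 * n) // (factorial(n) * factorial(n + 1))


first_ten = [catalan(n) for n in range(10)]
print(first_ten)
print(comb(11, 4))   # the binomial coefficient from the example above
```

Integer division is safe here because the quotient is always a whole number.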

Catalan numbers could be described in various but equivalent ways. If you transform the first formula just a little bit, you will get another useful formula of Catalan numbers.

C_{n} = \binom{2n}{n} - \binom{2n}{n+1}, \qquad n = 0, 1, 2, ...

See the proof below for details. Note that \tbinom{2n}{n} and \tbinom{2n}{n+1} are nonnegative integers and \tbinom{2n}{n} > \tbinom{2n}{n + 1}. Therefore, C_n is the difference between two entries of Pascal's Triangle, which is the connection described in the section Pascal's Triangle.


  • Prove that C_{n} = \binom{2n}{n} - \binom{2n}{n+1}, n = 0, 1, 2, ... is a formula for the Catalan sequence.
1. Check that it is true when n = 0.
C_{0} = \binom{2(0)}{0} - \binom{2(0)}{0 + 1} = \binom{0}{0} - \binom{0}{1} = 1 - 0 = 1 .
2. Show that it is true when n \geqslant 1.
C_n = \frac{1}{n+1} {2n \choose n} = {2n \choose n} - \frac{n}{n+1} {2n \choose n} = \frac{(2n)!}{n!n!} - \frac{n}{n+1} \frac{(2n)!}{n!n!} = \frac{(2n)!}{n!n!} - \frac{(2n)!}{(n+1)!(n-1)!} = {2n \choose n} - {2n \choose n+1}. \blacksquare

  • Prove this recurrence relation: C_{n+1} = \frac{2(2n+1)}{n+2} C_n for n=0, 1, 2, ... .
Recall that
C_n = \frac{(2n)!}{n!(n+1)!}, \qquad n = 0, 1, 2, \cdots.
Therefore, we have
C_{n+1} = \frac{(2n+2)!}{(n+1)!(n+2)!}, \qquad n = 0, 1, 2, \cdots.
Take the ratio of C_{n+1} to C_n:
\frac{C_{n+1}}{C_n} = \frac{ \frac{(2n+2)!}{(n+1)!(n+2)!}}{ \frac{(2n)!}{n!(n+1)!}} = \frac{ (2n+2)! n! (n+1)!} {(n+1)! (n+2)! (2n)!} = \frac{(2n+2) (2n+1)}{(n+1) (n+2)} = \frac{ 2(2n+1)}{(n+2)}.
C_{n+1} = \frac{ 2( 2n+1)}{(n+2)} C_n
for all nonnegative values of n. \blacksquare
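
The ratio recurrence gives a convenient way to compute the sequence term by term. This is a minimal sketch with a hypothetical function name, `catalan_by_ratio`; it assumes the requested number of terms is at least 1.

```python
def catalan_by_ratio(count):
    """Return [C_0, ..., C_{count-1}] using C_{n+1} = 2(2n+1)/(n+2) * C_n."""
    values = [1]                       # C_0 = 1
    for n in range(count - 1):
        # Multiply first, then divide, so the integer division is exact.
        values.append(values[-1] * 2 * (2 * n + 1) // (n + 2))
    return values


ratio_values = catalan_by_ratio(10)
print(ratio_values)
```

Each step uses one multiplication and one division, so no factorials need to be computed.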

Recursive Definition

We have seen various kinds of applications of Catalan numbers so far: "Stacking Coins," "Balanced Parentheses," "Mountain Ranges," "Polygon Triangulation," "Binary Trees," "Binary Paths," "Permutation," "Young Diagrams" and "Posets." In fact, all the sequences are equivalent, and we will show that there is a common formula that counts them all.

C_0 =1, \quad C_{n+1} = \sum_{i=0}^n C_i C_{n-i} \text{ for  } n\ge 0 .

In the application Balanced Parentheses, we already know that every open parenthesis has a matching closed parenthesis. Now, let's try to find a pattern in these paired parentheses using the example n = 3:

( (( )) ) - ( ( )( ) ) - (( )) ( ) - ( ) (( )) - ( )( )( ) .

The "pattern" is that we can always separate a string into two collections. For example, we can separate "(( )) ( )" into "(( ))" and "( )." We name them collection A and collection B, and either of them may contain zero pairs of parentheses. Similarly, ( (( )) ) can be separated into "( (( )) )" and nothing. For ( ( )( ) ), we treat it as a whole and put it in collection A, so B is, again, empty.

What about ( )( )( ) ? At first glance, we see three pairs of parentheses, but we have only two collections. We could put the first two pairs in collection A and the last pair in B, or put one pair in A and two in B. Both describe the same string, and we do not want to count it twice, so we need a rule that removes the ambiguity. Since the recurrence is applied only when n is at least 1, there is at least one pair of parentheses, namely the first open parenthesis and its matching closed parenthesis, and we fix this pair as the boundary of collection A. Thus, the simplest form, where n = 1, is:

 ( \quad  )  \quad {\color{White} ( ) }
 {\color{Maroon}A}  \quad   {\color{Blue}B}   ,

and this is our base form. For values of n greater than 1, we add more pairs of parentheses inside the fixed black parentheses to form collection A, and place the rest in collection B. In this way, collections A and B can each contain up to n - 1 pairs of parentheses (the black parentheses in the base form do not count as one of them). If collection A contains k pairs, then collection B contains n - (k + 1) pairs.

What is the purpose of separating the parentheses into two collections? We use the two collections A and B to count the arrangements of parentheses systematically: if A has 0 pairs, B has n - 1 pairs; if A has 1 pair, B has n - 2 pairs; if A has 2 pairs, B has n - 3 pairs, and so on. If A contains k pairs, there are C_k ways to arrange A and C_{n-1-k} ways to arrange B, so this case contributes C_k C_{n-1-k} arrangements.

In each form below, the contents of the outer black parentheses are collection A and everything that follows is collection B.

Number of Pairs Contained in A | Number of Pairs Contained in B | Form | Number of Solutions for Each Situation
0 | n - 1 | \Big( \quad \Big) {\color{Blue}( \cdots ) \cdots} | C_0 C_{n-1}
1 | n - 2 | \Big( \ {\color{Maroon} ( \ )} \ \Big) {\color{Blue}( \cdots ) \cdots} | C_1 C_{n-2}
\cdots | \cdots | \cdots | \cdots
n - 1 | 0 | \Big( \ {\color{Maroon} ( ( \cdots ) )( \cdots ) \cdots} \ \Big) | C_{n - 1} C_0

Add up all of the situations, and we get the total number:

C_n = C_0 C_{n-1} + C_1 C_{n-2} + C_2 C_{n-3} + \cdots + C_{n-2} C_1 + C_{n-1} C_0.

This formula is the recursive relation that we are looking for. Plugging in actual numbers may help you understand it.

C_1 = C_0 C_0 = 1
C_2 = C_0 C_1 + C_1 C_0 = 1 + 1 = 2
C_3 = C_0 C_2 + C_1 C_1 + C_2 C_0 = 2 + 1 + 2 = 5
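
The same plugging-in can be automated. The sketch below, with a hypothetical function name `catalan_by_convolution`, builds each Catalan number from all the earlier ones exactly as in the sum C_n = C_0 C_{n-1} + C_1 C_{n-2} + ... + C_{n-1} C_0.

```python
def catalan_by_convolution(count):
    """Return [C_0, ..., C_{count-1}] from C_n = sum over k of C_k * C_{n-1-k}."""
    values = [1]                       # C_0 = 1
    for n in range(1, count):
        # k pairs go into collection A, the remaining n - 1 - k into B.
        values.append(sum(values[k] * values[n - 1 - k] for k in range(n)))
    return values


convolution_values = catalan_by_convolution(10)
print(convolution_values)
```

The index k plays the role of the number of pairs in collection A from the discussion above.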

If we slightly change the base form into

/ A \ B ,

we can derive the recursive relation for similar examples, such as Mountain Ranges and Binary Paths, in the same way.

This method is less convenient for applications that are described by figures. Hence, we need another way of thinking to approach problems like Polygon Triangulation in order to obtain the recursive definition.

Here, we use the case of a 6-sided polygon (a hexagon), where n = 4.

We start by drawing the first triangle on the horizontal side at the top of the hexagon. This horizontal side is going to be one side of the triangle. Since there are four vertices left, besides the two vertices that the horizontal side connects, we can draw four different cases. See figure below.

Each triangle divides the hexagon into two polygons, to the left and to the right of the chosen triangle. Our next step is to triangulate these two polygons. Recall that a polygon with k \ge 3 sides can be triangulated in C_{k-2} ways. In the first case (the first hexagon in the figure above), there is a pentagon on the left, which has C_3 triangulations, and nothing on the right of the selected triangle, which counts as C_0 triangulations. Thus, the first case has C_3 \cdot C_0 solutions in total. In Case 2, we have a quadrilateral and a triangle, so there are C_2 \cdot C_1 solutions. The same method applies to Cases 3 and 4.

Add up the number of solutions in each case, and we get the total number of ways to triangulate a 6-sided polygon: C_4 = C_3 C_0 + C_2 C_1 + C_1 C_2 + C_0 C_3 = 5 + 2 + 2 + 5 = 14.

In general, a polygon with n+2 sides has n different possible first triangles after the initial step of triangulation. To the left and right of each of these triangles we find, in turn, an (n+1)-sided polygon and nothing, giving C_{n-1} C_0; an n-sided polygon and a triangle, giving C_{n-2} C_1; an (n-1)-sided polygon and a quadrilateral, giving C_{n-3} C_2; an (n-2)-sided polygon and a pentagon, giving C_{n-4} C_3; and so on.

Taking the sum, the total number of ways to triangulate a polygon with n+2 sides is

C_n = C_{n-1} C_0 + C_{n-2} C_1  +C_{n-3}  C_2  + \cdots + C_1 C_{n-2}  + C_0 C_{n-1}.
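
The triangle-by-triangle argument translates directly into a recursive count. The following is a sketch under our own conventions: `triangulations(sides)` is a hypothetical name, a polygon is described only by its number of sides, and a 2-sided "polygon" stands for the empty region next to the chosen triangle.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def triangulations(sides):
    """Number of triangulations of a convex polygon with the given number of sides."""
    if sides <= 3:
        return 1          # a triangle, or the degenerate 2-sided case (nothing to cut)
    total = 0
    # The fixed side belongs to exactly one triangle. Its third vertex splits the
    # polygon into a left part with `left` sides and a right part with
    # sides + 1 - left sides.
    for left in range(2, sides):
        total += triangulations(left) * triangulations(sides + 1 - left)
    return total


triangulation_counts = [triangulations(s) for s in range(2, 9)]
print(triangulation_counts)   # polygons with 2, 3, ..., 8 sides
```

With this convention a polygon with k sides is counted by C_{k-2}, so the hexagon contributes the entry for 6 sides.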

A bijection is a one-to-one correspondence between two sets, that is, a function that is both one-to-one and onto. Put more simply, every element of one set is paired with exactly one element of the other set. Hence, there are no unpaired elements in either set, and both sets have the same number of elements.

To have an exact pairing between X and Y (where Y need not be different from X), four properties must hold:

  1. each element of X must be paired with at least one element of Y,
  2. no element of X may be paired with more than one element of Y,
  3. each element of Y must be paired with at least one element of X, and
  4. no element of Y may be paired with more than one element of X. [4]

Properties 1 and 2 guarantee that the pairing is a function with domain X. Functions satisfying property 3 are called "onto." Functions satisfying property 4 are called "one-to-one."

Here is an example of a bijection. Let X and Y be two sets, and f: X \to Y. If X = \left \{ A, B, C, D \right \} and  Y = \left \{ a, b, c, d \right \}, one possible bijection is:

f(A) = b,
f(B) = c,
f(C) = d,
f(D) = a.

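The following two pairings are not bijections. In the first (left), both A and B are paired with b and nothing is paired with c, so properties 3 and 4 fail. In the second (right), A is paired with two elements and B with none, so properties 1 and 2 fail.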

f(A) = b, \quad \quad \quad \quad f(A) = b,
f(B) = b, \quad \quad \quad \quad f(A) = c,
f(C) = d, \quad \quad \quad \quad f(C) = d,
f(D) = a.  \quad \quad \quad \quad f(D) = a.
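
The four properties can be tested mechanically. In this sketch a pairing is represented as a list of (x, y) tuples; that representation and the function name `is_bijection` are our own assumptions for illustration.

```python
def is_bijection(pairs, X, Y):
    """Check the four pairing properties for a list of (x, y) pairs."""
    for x in X:
        partners = [y for (a, y) in pairs if a == x]
        if len(partners) != 1:        # properties 1 and 2
            return False
    for y in Y:
        partners = [x for (x, b) in pairs if b == y]
        if len(partners) != 1:        # properties 3 and 4
            return False
    return True


X, Y = "ABCD", "abcd"
good = [("A", "b"), ("B", "c"), ("C", "d"), ("D", "a")]
not_one_to_one = [("A", "b"), ("B", "b"), ("C", "d"), ("D", "a")]
not_a_function = [("A", "b"), ("A", "c"), ("C", "d"), ("D", "a")]

print(is_bijection(good, X, Y))
print(is_bijection(not_one_to_one, X, Y))
print(is_bijection(not_a_function, X, Y))
```

The first pairing is the bijection given above, and the other two are the failing examples.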

The recursive definition is essential because it connects any two applications of Catalan numbers and points to a bijection between them. Try to find bijections among the other examples using the same idea and method!

A Few Hints:

* Binary Trees. The base is the single root with its two branches at the bottom. As in the first method for "Balanced Parentheses," collection A contains the "baby" tree branching out from the left child of the base, while collection B contains the "baby" tree branching out from the right child. Together, the trees in the two collections have a total of n - 1 internal nodes.
* Binary Paths. Besides rotating counterclockwise, there is another way to see the bijection between binary paths and balanced parentheses. Starting at the origin (0, 0), a unit step to the east corresponds to an open parenthesis, and a unit step to the north corresponds to a closed parenthesis. This works because a path has the same number of steps to the east as to the north, since the grid has the same length and width, and because staying on or below the diagonal means that no closed parenthesis comes before its matching open parenthesis.
* Stacking Coins. If you outline a border that is tangent to the coins for each coin stack, it will be obvious that it looks like mountain ranges. For instance, is equivalent to the second mountain range when n=3 illustrated in the section Mountain Ranges.
* 123 Avoiding Permutation.
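
The correspondences in these hints can be written as short conversions. The sketch below is an illustration with hypothetical function names, `to_path` and `to_mountain`; it maps a balanced string of parentheses to a lattice path and to a mountain range written as strokes.

```python
def to_path(parens):
    """Map "(" to an east step and ")" to a north step; return the visited points."""
    x = y = 0
    points = [(0, 0)]
    for symbol in parens:
        if symbol == "(":
            x += 1
        else:
            y += 1
        points.append((x, y))
    return points


def to_mountain(parens):
    """Map "(" to an upstroke and ")" to a downstroke."""
    return parens.replace("(", "/").replace(")", "\\")


path = to_path("(()())")
print(path)
print(all(y <= x for x, y in path))   # the path never passes above y = x
print(to_mountain("(()())"))
```

Because a balanced string never has more closed than open parentheses in any prefix, every visited point satisfies y ≤ x.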

How to Generate the Formula of Catalan Number

We start by defining a function f(x) that contains all of the Catalan numbers,

Eq. 1        
f(x) = C_0 + C_1 x + C_2 x^2 + \cdots = \sum_{i=0}^\infty C_i x^i .

Multiply f(x) by itself to get \left ( f(x) \right ) ^2,

\left ( f(x) \right ) ^2 = C_0 C_0 + (C_0 C_1 + C_1 C_0) x + ( C_0 C_2 + C_1 C_1 + C_2 C_0) x ^2+ \cdots .

Apply the recurrence relation  C_{n+1} = \sum_{i=0}^n C_i C_{n-i} \text{ for }n\ge 0 , that is, C_1 = C_0 C_0, C_2 = C_0 C_1 + C_1 C_0, \text{etc.} Therefore,

\left ( f(x) \right ) ^2 = C_1 + C_2 x + C_3 x ^2+ \cdots .

Relate this new equation to Eq. 1 by multiplying it by x and adding C_0:

Eq. 2        
 f(x) = C_0 + x \left ( f(x) \right ) ^2 .

Rewrite Eq. 2 into a common quadratic form.

 x \left ( f(x) \right ) ^2 - f(x) + C_0 = 0.

Solve it as  a m^2 + bm + c = 0 where a = x, b =-1, c = C_0 = 1, and m = f(x) , by using the quadratic formula m=\frac{-b \pm \sqrt {b^2-4ac}}{2a} to get the roots.

Eq. 3        
f(x) = \frac{1 - \sqrt {1-4x}}{2x}

If we chose the + sign, f(x) would go to  \infty as x \to 0, whereas Eq. 1 gives f(0) = C_0 = 1. Therefore, we use only the - sign.

In order to move on from Eq. 3, we need to expand \sqrt {1-4x} as a power series, which we do with the help of the binomial formula. The binomial formula expands a power of a + b into a sum of the following form (for a non-integer exponent n, the sum does not terminate):

(a + b) ^ n =  {n \choose 0}a^n b^0 + {n \choose 1}a^{n-1}b^1 + {n \choose 2}a^{n-2}b^2 + \cdots
{\color{white}(a + b) ^ n} =  a^n + \frac{n}{1} a^{n-1} b^1 + \frac{n(n-1)}{2 \cdot 1} a^{n-2} b^2 + \frac{ n (n-1) (n-2)} {3 \cdot 2 \cdot 1} a^{n-3} b^3 +\; \cdots .

So when a = 1,  b= -4x, and n = \frac{1}{2},

\sqrt{1-4x} = (1-4x)^{\frac{1}{2}}
 = (1)^{\frac{1}{2}} + \frac{(\frac{1}{2})}{1} (1)^{\frac{1}{2} -1} (-4x)^1 + \frac{ (\frac{1}{2}) (\frac{1}{2} -1)} {2 \cdot 1} (1)^{\frac{1}{2} -2} (-4x)^2 + \frac{ (\frac{1}{2}) (\frac{1}{2} -1) (\frac{1}{2} -2)} {3 \cdot 2 \cdot 1} (1)^{\frac{1}{2} -3} (-4x)^3 + \frac{ (\frac{1}{2})(\frac{1}{2} -1)(\frac{1}{2} -2)(\frac{1}{2} -3) } {4 \cdot 3 \cdot 2 \cdot 1} (1)^{\frac{1}{2} -4} (-4x)^4 + \; \cdots .

Simplify it, and we get:

 (1-4x)^{\frac{1}{2}} = 1 + \frac{(\frac{1}{2})}{1} (-4)^1 (x)^1 + \frac{ (\frac{1}{2}) (-\frac{1}{2})} {2 \cdot 1}(-4)^2 (x)^2 + \frac{ (\frac{1}{2}) (- \frac{1}{2}) (-\frac{3}{2})} {3 \cdot 2 \cdot 1}(-4)^3 (x)^3 + \frac{ (\frac{1}{2})(-\frac{1}{2})(-\frac{3}{2})(-\frac{5}{2}) } {4 \cdot 3 \cdot 2 \cdot 1} (-4)^4 (x)^4 + \; \cdots .

To make it clearer, collect the powers of 2 and write the denominators as factorials:

(1-4x)^{\frac{1}{2}} = 1 - \frac{1}{1!} (2) x - \frac{1} {2!} (2^2) x^2 - \frac{ 3 \cdot 1} {3!} (2^3) x^3 - \frac{5 \cdot 3 \cdot 1}{4!} (2^4) x^4 - \; \cdots .

Plug it back into Eq. 3.

f(x) = \frac{1 - \sqrt {1-4x} }{2x} =  \frac{1}{2x} \cdot \left ( 1 - 1 + \frac{1}{1!} (2) x + \frac{1} {2!} (2^2) x^2 + \frac{ 3 \cdot 1} {3!} (2^3) x^3 + \frac{5 \cdot 3 \cdot 1}{4!} (2^4) x^4 + \cdots \right ) .

The constant terms cancel, and dividing the remaining terms by 2x gives

 f(x) = 1 + \frac{1}{2!} (2) x + \frac{3 \cdot 1}{3!} (2^2) x^2 + \frac{5 \cdot 3 \cdot 1}{4!} (2^3) x^3 + \frac{7 \cdot 5 \cdot 3 \cdot 1}{5!} (2^4) x^4 + \; \cdots

The terms 3 \cdot 1, 5 \cdot 3 \cdot 1 look like factorials, but the even numbers are missing. However, 2^2 \cdot 2! = 2^2 \cdot (2 \cdot 1) = 4 \cdot 2 and 2^3 \cdot 3! = 2^3 \cdot (3 \cdot 2 \cdot 1) = 6 \cdot 4 \cdot 2. Thus, to express a product of odd numbers, all we need to do is divide a factorial by the product of the even numbers:

5 \cdot 3 \cdot 1 = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{6 \cdot 4 \cdot 2} = \frac{6!}{2^3 \cdot 3!}
7 \cdot 5 \cdot 3 \cdot 1 = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{8 \cdot 6 \cdot 4 \cdot 2} = \frac{8!}{2^4 \cdot 4!}
(2n-1) \cdot (2n-3) \cdot \cdots \cdot 3 \cdot 1 = \frac{(2n)!}{2^n \cdot n!}


Substituting these expressions into the series for f(x) gives

 f(x) = 1 + \frac{ \frac{(2\cdot 1)!}{2^1 \cdot 1!} \cdot 2}{2!} x + \frac{ \frac{(2 \cdot 2)!}{2^2 \cdot 2!} \cdot 2^2}{3!} x^2 + \frac{ \frac{(2 \cdot 3)!}{2^3 \cdot 3!} \cdot 2^3}{4!} x^3 +  \cdots .

After several steps of simplification,

 f(x) = 1 + \frac{1}{2} \left( \frac{2!}{1!1!} \right) x + \frac{1}{3} \left( \frac{4!}{2!2!} \right) x^2 + \frac{1}{4} \left( \frac{6!}{3!3!} \right) x^3 + \frac{1}{5} \left( \frac{8!}{4!4!} \right) x^4 + \; \cdots =  \sum_{i=0}^\infty \frac{1}{i + 1}  {2i \choose i} x^i .

Recall Eq. 1, that we defined f(x)  = \sum_{i=0}^\infty C_i x^i .

Comparing coefficients, we can now conclude that the formula for the ith Catalan number is C_i = \frac{1}{i+1} {2i \choose i}. \blacksquare
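
The series manipulation can be double-checked with exact rational arithmetic. The sketch below is our own check rather than part of the cited proof. It computes the coefficients of (1 - 4x)^(1/2) from the binomial series and then forms (1 - \sqrt{1-4x}) / (2x). The function name `sqrt_series` is hypothetical.

```python
from fractions import Fraction
from math import comb


def sqrt_series(terms):
    """Coefficients of (1 - 4x)^(1/2) up to x^(terms-1), via the binomial series."""
    coeffs = []
    binom = Fraction(1)                       # generalized binomial (1/2 choose 0)
    for k in range(terms):
        coeffs.append(binom * (-4) ** k)
        # Step from (1/2 choose k) to (1/2 choose k+1).
        binom = binom * (Fraction(1, 2) - k) / (k + 1)
    return coeffs


s = sqrt_series(8)
# f(x) = (1 - sqrt(1 - 4x)) / (2x): drop the constant term, negate, divide by 2.
f = [-c / 2 for c in s[1:]]
print(s)
print(f)
```

The list `f` holds the coefficients of x^0 through x^6, which can be compared with \frac{1}{i+1} {2i \choose i}.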

Other approaches could produce this definition formula as well, but we will not show them here. The proof shown is the most fundamental one, based on a proof from [5].


So far we have seen a certain number of applications of Catalan numbers and how they are related to each other. In fact, Catalan numbers arise in over 600 examples.

We have seen that, however much the examples vary, the applications of Catalan numbers are all equivalent to one another. One statement that captures them all is the ballot sequence: "In a sequence of 2n items with n +'s and n -'s, if there are never more -'s than +'s at any point in the sequence (in other words, every partial sum of the sequence is nonnegative), then the number of such sequences is the nth Catalan number."

Think about the parentheses. If there are more closed parentheses than open parentheses somewhere in the sequence, then it will not make sense.

Any problem that follows this rule can be solved with Catalan numbers. Here is one example. A class of 400 college students is voting for their class president. 200 students vote for A, and 200 students vote for B. If B never leads A at any point while the votes are counted, can you tell in how many different orders the votes could appear?
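
For small numbers of voters, the ballot rule can be checked by listing every possible order. The sketch below is an illustration with a hypothetical function name, `count_ballot_sequences`. It reads the condition as "B never has more votes than A at any point," and the class-president case is then evaluated from the closed form rather than by enumeration.

```python
from itertools import product
from math import comb


def count_ballot_sequences(n):
    """Count orders of n votes for A (+1) and n for B (-1) in which B never leads."""
    total = 0
    for votes in product((1, -1), repeat=2 * n):
        if sum(votes) != 0:
            continue                  # not exactly n votes each
        running, ok = 0, True
        for v in votes:
            running += v
            if running < 0:           # B has moved ahead of A
                ok = False
                break
        total += ok
    return total


small = [count_ballot_sequences(n) for n in range(6)]
print(small)

c_200 = comb(400, 200) // 201         # C_200 from the closed form
print(len(str(c_200)), "digits")
```

Enumeration is hopeless for 400 voters, which is exactly why the closed formula is useful.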



  1. Stanley, Richard P. Enumerative Combinatorics, vol. 2. Cambridge University Press. New York/Cambridge. 1999.
  2. Bona, Miklos. A self-dual poset on objects counted by the Catalan numbers. School of Mathematics, Institute for Advanced Study. November 10, 1998.
  3. Stanley, Richard P. Catalan Addendum. MIT Mathematics. Version of 22 October 2011. (No.t^6) Retrieved from Richard Stanley's home page
  4. Wikipedia. Bijection. Retrieved from
  5. Davis, Tom. Catalan Numbers. Mathematical Circles Topics. 2010. Retrieved from

[6] Stanley, Richard P. Enumerative Combinatorics, vol.1. Cambridge University Press. New York/Cambridge. 1999.

[7] Campbell, Douglas M.. The Computation of Catalan Numbers. Mathematics Magazine, Vol. 57, No.4 (Sep., 1984), pp. 195 - 208.

[8] Choo, Koo-Guan. Catalan Numbers. Retrieved from

[9] Britz, Thomas. Cameron, Peter. Partially ordered sets. 2001. Retrieved from

[10] Dowling, Thomas A. Catalan Numbers. Department of Mathematics, Ohio State University. Retrieved from
