# Transformations and Matrices

## Current revision

Transformations
Fields: Geometry and Algebra
Image Created By: Nordhr

Transformations

This picture shows an example of four basic transformations (where the original teapot is a red wire frame). On the top left is a translation, which is essentially the teapot being moved. On the top right is a scaling. The teapot has been squished or stretched in each of the three dimensions. On the bottom left is a rotation. In this case the teapot has been rotated around the x axis and the z axis (vertical). On the bottom right is a shearing, creating a skewed look.

# Basic Description

When an object undergoes a transformation, the transformation can be represented as a matrix. Different transformations such as translations, rotations, scaling and shearing are represented mathematically in different ways. One matrix can also represent multiple transformations in sequence when the matrices are multiplied together.

## Basic Transformations For Graphics

Computer graphics works by representing objects in terms of simple primitives that are manipulated with transformations that preserve some primitives’ essential properties. These properties may include angles, lengths, or basic shapes. Some of these transformations can work on primitives with vertices in standard 2D or 3D space, but some need to have vertices in homogeneous coordinates. The general graphics approach is to do everything in homogeneous coordinates, but we’ll talk about the primitives in terms of both kinds when we can.

The most fundamental kinds of transformations for graphics are rotation, scaling, and translation. There are also a few cases when you might want to use shear transformations, so we’ll talk about these as well.

# A More Mathematical Explanation

Note: understanding of this explanation requires: *stacks

## Linear Transformations Are Matrices

A linear transformation on 2D (or 3D) space is a function f from 2D (or 3D) space to itself that has the property that

$f(aA + bB) = af(A) + bf(B).$

Since points in 2D or 3D space can be written as $P = xi + yj$ or $P = xi + yj +zk$, with $i$, $j$, and $k$ the coordinate vectors, we see that $f(P) = xf(i) + yf(j)$ or $f(P) = xf(i) + yf(j) + zf(k)$.
This tells us that the linear transformation is completely determined by what it does to the coordinate vectors.

Let’s see an example of this: if the transformation has the following action on the coordinates:

\begin{align}f(1,0,0) = f(i) = (2,-2,1) \\ f(0,1,0) = f(j) = (-1,3,2) \\ f(0,0,1) = f(k) = (4,3,-2) \end{align}

then for any point we have:

\begin{align}f(x,y,z) = (2x,-2x,x)+(-y,3y,2y)+(4z,3z,-2z) \\ = (2x-y+4z, -2x+3y+3z, x+2y-2z) \\ = \begin{bmatrix} 2x - y + 4z \\ -2x + 3y + 3z \\ x + 2y - 2z \end{bmatrix} = \begin{bmatrix} 2 & -1 & 4 \\ -2 & 3 & 3 \\ 1 & 2 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \end{align}

From this example, we see that the linear transformation is exactly determined by the matrix whose first column is $f(i)$, whose second column is $f(j)$, and whose third column is $f(k)$, and that applying the function f is exactly the same as multiplying by the matrix. So the linear transformation is the matrix multiplication, and we can use the concepts of linear transformation and matrix multiplication interchangeably.
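As a quick check of this idea, the following sketch builds the matrix from the images of the coordinate vectors and compares multiplying by it with applying $f$ directly. The helper names (`matrix_from_basis_images`, `mat_vec`) are made up for this illustration and are not part of any graphics library.

```python
def matrix_from_basis_images(f_i, f_j, f_k):
    """Return the 3x3 matrix (as a list of rows) whose columns are f(i), f(j), f(k)."""
    columns = [f_i, f_j, f_k]
    return [[columns[c][r] for c in range(3)] for r in range(3)]


def mat_vec(m, v):
    """Multiply the matrix m (list of rows) by the column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


def f(x, y, z):
    """The example transformation written out coordinate by coordinate."""
    return [2 * x - y + 4 * z, -2 * x + 3 * y + 3 * z, x + 2 * y - 2 * z]


# Images of the coordinate vectors i, j, k from the example above.
M = matrix_from_basis_images((2, -2, 1), (-1, 3, 2), (4, 3, -2))

point = (1, 2, 3)
print(M)                  # [[2, -1, 4], [-2, 3, 3], [1, 2, -2]]
print(mat_vec(M, point))  # [12, 13, -1]
print(f(*point))          # [12, 13, -1]
```

Both calls give the same result for the sample point, which is what "the linear transformation is the matrix multiplication" means in practice.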

### Rotation

A 2D rotation transformation rotates everything in 2D space around the origin by a given angle. In order to see what it does, let’s take a look at what a rotation by a positive angle $\theta$ does to the coordinate axes. Now (x,y) is the result when you apply the transformation to (1,0), which means that
\begin{align} x = cos(\theta ) \\ y = sin(\theta ) \end{align}

But (x’,y’) is the result when you apply the transformation to (0,1), or

\begin{align} x' &= cos(\theta +\frac{\pi}{2}) = cos(\theta )cos(\frac{\pi}{2}) - sin(\theta )sin(\frac{\pi}{2}) = -sin(\theta ) \\ y' &= sin(\theta +\frac{\pi}{2}) = sin(\theta )cos(\frac{\pi}{2}) + cos(\theta )sin(\frac{\pi}{2}) = cos(\theta ) \end{align}

Then as we saw above, the rotation transformation must have the image of (1,0) as the first column and the image of (0,1) as the second column, or

$rotate(\theta ) = \begin{bmatrix} cos(\theta ) & -sin(\theta ) \\ sin(\theta ) & cos(\theta ) \end{bmatrix}$

XKCD (http://xkcd.com/184/) shows this matrix in use; notice that the rotation there is by -90°, and $sin(-90\,^{\circ}) = -sin(90\,^{\circ})$.
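The 2D rotation matrix can be tried out numerically. This is a minimal sketch, assuming angles are given in radians; `rotate2d` and `mat_vec` are illustrative names only.

```python
import math


def rotate2d(theta):
    """2D rotation matrix for a counterclockwise rotation by theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def mat_vec(m, v):
    """Multiply the matrix m (list of rows) by the column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


R = rotate2d(math.pi / 2)      # a quarter turn
image_of_i = mat_vec(R, (1, 0))
image_of_j = mat_vec(R, (0, 1))
print(image_of_i)  # approximately (0, 1), up to floating-point rounding
print(image_of_j)  # approximately (-1, 0), up to floating-point rounding
```

The images of (1,0) and (0,1) are the first and second columns of the matrix, as derived above.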

The situation for 3D rotations is different because a rotation in 3D space must leave a fixed line through the origin. In fact we really only handle the special cases where the fixed line is one of the coordinate axes. Let’s start with the easiest one.

A rotation around the Z-axis is a 2D rotation as above with the third dimension fixed. So the matrix for this rotation is pretty clearly

$\begin{bmatrix} cos(\theta ) & -sin(\theta ) & 0 \\ sin(\theta ) & cos(\theta ) & 0 \\ 0 & 0 & 1 \end{bmatrix}$

A rotation around the X-axis is pretty similar. If we look down the X-axis, we see the following 2D coordinates:

with $Y \times Z = X$, the axis of rotation. This looks like an exact analogue of the XY-plane, and so we can see that the rotation matrix must leave X fixed and operate only on Y and Z as

$\begin{bmatrix} 1 & 0 & 0 \\ 0 & cos(\theta ) & -sin(\theta ) \\ 0 & sin(\theta ) & cos(\theta ) \end{bmatrix}$

For rotations around the Y-axis, the view down the Y-axis looks different from the one down the Z-axis; it is

Here a positive angle is from the X-axis towards the Z-axis, but $X \times Z = -Z \times X = -Y$, so the rotation axis points in the opposite direction from the Y-axis. Thus the angle for the rotation is the negative of the angle we would see in the axes above, so we use $-\theta$ instead of $\theta$. Since cos is an even function but sin is odd, we can substitute $cos(\theta )$ for $cos(-\theta )$ and $-sin(\theta )$ for $sin(-\theta )$. Thus we have the rotation matrix

$\begin{bmatrix} cos(-\theta ) & 0 & -sin(-\theta ) \\ 0 & 1 & 0 \\ sin(-\theta ) & 0 & cos(-\theta ) \end{bmatrix} = \begin{bmatrix} cos(\theta ) & 0 & sin(\theta ) \\ 0 & 1 & 0 \\ -sin(\theta ) & 0 & cos(\theta ) \end{bmatrix}$

around the Y-axis.

When you want to get a rotation around a different line than a coordinate axis, the usual approach is to find two rotations that, when composed, take a coordinate line into the fixed line you want. You can then apply these two rotations, apply the rotation you want around the coordinate line, and apply the inverses of the two rotations (in inverse order) to construct the general rotation. The sequence goes like this:

1. Apply a rotation $R_1$ around the Z-axis to move your fixed line into the YZ-plane.
2. Apply a rotation $R_2$ around the X-axis to move that line to the Y-axis.
3. Apply the rotation by your desired angle around the Y-axis.
4. Apply the inverse $R_2^{-1}$ to move your rotation line back into the YZ-plane.
5. Apply the inverse $R_1^{-1}$ to move your rotation line back to the original line.

Whew! This can all be put into a function – or you can simply keep everything in terms of rotations around X, Y, and Z.
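One way to package that sequence as a function is sketched below. It is an illustration under stated assumptions rather than a definitive implementation: the axis is any nonzero vector through the origin, angles are in radians, and the choices `alpha = atan2(ux, uy)` and `beta = -atan2(uz, hypot(ux, uy))` are one possible way to pick $R_1$ and $R_2$. All function names are hypothetical.

```python
import math


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def mat_mul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def mat_vec(m, v):
    """Multiply the 3x3 matrix m by the column vector v."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def rotate_about_axis(axis, theta):
    """Rotation by theta around the line through the origin along `axis`."""
    ux, uy, uz = axis
    alpha = math.atan2(ux, uy)                   # R1 = rot_z(alpha): line into the YZ-plane
    beta = -math.atan2(uz, math.hypot(ux, uy))   # R2 = rot_x(beta): line onto the Y-axis
    steps = [rot_z(alpha), rot_x(beta), rot_y(theta), rot_x(-beta), rot_z(-alpha)]
    m = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for step in steps:
        m = mat_mul(step, m)  # each later step multiplies on the left
    return m


# A third of a full turn around the diagonal (1,1,1) cycles the coordinate axes.
M = rotate_about_axis((1, 1, 1), 2 * math.pi / 3)
print(mat_vec(M, (1, 0, 0)))  # approximately (0, 1, 0)
print(mat_vec(M, (1, 1, 1)))  # approximately (1, 1, 1): the axis itself stays fixed
```

Note that the steps are applied in the listed order, so the matrix for the first step ends up rightmost in the product.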

If we are working in homogeneous coordinates, we see that all of the rotation operations take place in standard 3D space and so the fourth coordinate is not changed. Thus the general pattern for all the rotations in homogeneous coordinates is

$\begin{bmatrix} * & * & * & 0 \\ [0.3em] * & * & * & 0 \\ [0.3em] * & * & * & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
where the * terms are the terms from the 3D rotations above.

### Scaling

Scaling is the action of multiplying each coordinate of a point by a constant amount. As an example, let $f(x,y,z) = (2x,3y,4z)$. Then
$f((x,y,z)+(a,b,c)) = f(x+a,y+b,z+c) = (2(x+a),3(y+b), 4(z+c)) = (2x,3y,4z)+(2a,3b,4c)$
So this is a linear transformation. If we look at what this transformation does to each of the coordinate vectors, we have

\begin{align} f(1,0,0) = (2,0,0) \\ f(0,1,0) = (0,3,0) \\ f(0,0,1) = (0,0,4) \end{align}

So the matrix for this transformation is

$\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}$

and, in general, a scaling matrix looks like

$\begin{bmatrix} \sigma_x & 0 & 0 \\ 0 & \sigma_y & 0 \\ 0 & 0 & \sigma_z \end{bmatrix}$

where the $\sigma_x , \sigma_y ,\text{ and }\sigma_z$ are the scaling factors for x, y, and z respectively.
For 2D transformations, the scaling matrix simply reduces to two dimensions:

$\begin{bmatrix} \sigma_x & 0 \\ 0 & \sigma_y \end{bmatrix}$

In case we are working with homogeneous coordinates, a scaling transformation only acts on the three primary components and leaves the homogeneous component alone, so we simply have the matrix

$\begin{bmatrix} \sigma_x & 0 & 0 & 0 \\ 0 & \sigma_y & 0 & 0 \\ 0 & 0 & \sigma_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
for the scaling transformation.
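A small sketch of the homogeneous scaling matrix, using the factors 2, 3, and 4 from the example above. The names `scaling` and `mat_vec` are illustrative only.

```python
def scaling(sx, sy, sz):
    """4x4 homogeneous scaling matrix with factors sx, sy, sz."""
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]


def mat_vec(m, v):
    """Multiply the matrix m (list of rows) by the column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


S = scaling(2, 3, 4)
scaled = mat_vec(S, (1, 1, 1, 1))  # the point (1,1,1) in homogeneous coordinates
print(scaled)  # [2, 3, 4, 1]
```

The first three components are multiplied by their scaling factors while the homogeneous component stays 1.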

### Translation

Notice that a translation cannot be a linear transformation on normal space because it does not take the origin to the origin. Translations are instead examples of affine transformations: transformations that are composed of a linear transformation, such as a rotation, scaling, or shear, and a translation. In order to write a translation as a matrix, we need to use homogeneous coordinates.

If we want to add $T_x$ to the X-coordinate and $T_y$ to the Y-coordinate of every point in 2D space, we see that

$\begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x+T_x \\ y+T_y \\ 1 \end{bmatrix}$

so that the matrix

$\begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \\ 0 & 0 & 1 \end{bmatrix}$

gives a 2D translation.

The 3D case is basically the same, and by the same argument we see that the 3D translation is given by

$\begin{bmatrix} 1 & 0 & 0 & T_x \\ 0 & 1 & 0 & T_y \\ 0 & 0 & 1 & T_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$

These are linear transformations in the space one dimension higher than the geometry you are working with. In fact, the main reason for including homogeneous coordinates in the math for graphics is to be able to handle translations (and thus all basic transformations) as linear transformations represented by matrices.
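The sketch below applies a 3D translation matrix to points written in homogeneous coordinates. The translation amounts (5, -1, 2) and the helper names are arbitrary choices for illustration.

```python
def translation(tx, ty, tz):
    """4x4 homogeneous matrix that adds (tx, ty, tz) to every point."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


def mat_vec(m, v):
    """Multiply the matrix m (list of rows) by the column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


T = translation(5, -1, 2)
moved_point = mat_vec(T, (1, 2, 3, 1))   # the point (1,2,3)
moved_origin = mat_vec(T, (0, 0, 0, 1))  # the 3D origin
print(moved_point)   # [6, 1, 5, 1]
print(moved_origin)  # [5, -1, 2, 1]
```

The 3D origin is moved, which is why a translation is not linear on 3D space itself, yet it is still a plain matrix product on the 4-component vectors.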

### Shear

The shear transformation is not widely used in computer graphics, but can be used for things like the oblique view in engineering drawings. The concept of a shear is to add a multiple of one coordinate to another coordinate of each point, or, for example,

$shear(x,y,z) = (x+3y,y,z)$

The matrix for this shear transformation looks like

$\begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

In general, the matrix for a shear transformation will look like the identity matrix with one non-zero element A off the diagonal. If A is in row $i$, column $j$, then the matrix will add A times the $j^{th}$ coordinate of the vector to the $i^{th}$ coordinate.
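This rule translates directly into a short sketch. The function `shear_matrix` is a hypothetical helper; it takes the row $i$ and column $j$ as 1-based indices to match the text.

```python
def shear_matrix(n, i, j, a):
    """n x n identity with the value a placed in row i, column j (1-based, i != j)."""
    if i == j:
        raise ValueError("a shear entry must be off the diagonal")
    m = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    m[i - 1][j - 1] = a
    return m


def mat_vec(m, v):
    """Multiply the matrix m (list of rows) by the column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


# The example shear(x, y, z) = (x + 3y, y, z): A = 3 in row 1, column 2.
H = shear_matrix(3, 1, 2, 3)
sheared = mat_vec(H, (1, 2, 3))
print(H)        # [[1, 3, 0], [0, 1, 0], [0, 0, 1]]
print(sheared)  # [7, 2, 3]
```

Here 3 times the second coordinate is added to the first, and the other coordinates are unchanged.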

For the oblique view of engineering drawings, we look at the shear matrices that add a certain amount of the z-coordinate to each of the x- and y-coordinates. The matrices are

$\begin{bmatrix} 1 & 0 & A \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & B \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & A \\ 0 & 1 & B \\ 0 & 0 & 1 \end{bmatrix}$

that take $(x,y,z) \to (x+Az,y+Bz,z)$. The values of A and B are adjusted to give precisely the view that you want, and the z-term of the result is usually dropped to give the needed 2D view of the 3D object. An example is the classical cabinet view shown below:
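As a numeric illustration (the specific values are assumptions chosen for this sketch, not taken from the figure): a cabinet view draws the receding z-axis at half scale, and one common choice for its angle is 45°. That corresponds to

$A = \tfrac{1}{2}\cos 45^\circ \approx 0.354, \quad B = \tfrac{1}{2}\sin 45^\circ \approx 0.354$

so the point $(1, 1, 2)$ is sent to

$(1 + 0.354 \cdot 2,\; 1 + 0.354 \cdot 2,\; 2) \approx (1.707,\; 1.707,\; 2)$

and dropping the z-term leaves the 2D point $(1.707, 1.707)$.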

To experiment with these transformations, we have two interactive applets. The first one, Transformation Matrix, lets you apply the 2D transformations to a 2D figure.

The second applet lets you apply the 3D transformations to a 3D figure.

(Currently Unavailable)

### Matrix Inverses

In general, getting the inverse of a matrix can be difficult, but for the basic graphics transformations the inverses are easy because we can simply undo the geometric action of the original transformation.

The inverse of the scaling matrix

$\begin{bmatrix} \sigma_x & 0 & 0 \\ 0 & \sigma_y & 0 \\ 0 & 0 & \sigma_z \end{bmatrix}$

is clearly

$\begin{bmatrix} \frac{1}{\sigma_x } & 0 & 0 \\ 0 & \frac{1}{\sigma_y } & 0 \\ 0 & 0 & \frac{1}{\sigma_z } \end{bmatrix}$

The inverse of a rotation transformation by angle $\theta$ is clearly the rotation around the same line by the angle $-\theta$. For example, the rotation matrix

$\begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}$

has an inverse of

$\begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 \\ -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Note that $\sin(-\theta) = -\sin(\theta)$ and $\cos(-\theta) = \cos(\theta)$.

The inverse of the translation matrix

$\begin{bmatrix} 1 & 0 & 0 & T_x \\ 0 & 1 & 0 & T_y \\ 0 & 0 & 1 & T_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$

is clearly

$\begin{bmatrix} 1 & 0 & 0 & -T_x \\ 0 & 1 & 0 & -T_y \\ 0 & 0 & 1 & -T_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$

The inverse of the simple shear transformation is also straightforward. Since a simple shear adds a multiple of one vector component to another component, the inverse only needs to subtract that multiple. So we have

$\begin{bmatrix} 1 & A \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -A \\ 0 & 1 \end{bmatrix}$

and the 3D case is a simple extension of this.

So we have a major observation: If any transformation is the product of basic graphics transformations, it is easy to find the inverse of its matrix (and hence its inverse transformation) as the product of the inverses of the components in reverse order. Or:

$(A \times B \times C)^{-1} = C^{-1} \times B^{-1} \times A^{-1}$
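A small worked check of this rule, using 2D homogeneous matrices with values chosen only for illustration (a translation $T$ by $(2,1)$ and a scaling $S$ by factors 2 and 4):

$T \times S = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 2 \\ 0 & 4 & 1 \\ 0 & 0 & 1 \end{bmatrix}$

$S^{-1} \times T^{-1} = \begin{bmatrix} \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & 0 & -1 \\ 0 & \frac{1}{4} & -\frac{1}{4} \\ 0 & 0 & 1 \end{bmatrix}$

Multiplying the two results gives the identity, as expected:

$\begin{bmatrix} 2 & 0 & 2 \\ 0 & 4 & 1 \\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} \frac{1}{2} & 0 & -1 \\ 0 & \frac{1}{4} & -\frac{1}{4} \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$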

## Transformation Composition Is Matrix Multiplication

Transformations are usually not used by themselves, especially in graphics, so you need a way to compose transformations, as in $g(f(P))$. If G is the matrix for the transformation g and F is the matrix for the transformation f, then the matrix product $G \times F$ is the matrix for the composed function $g \circ f$, which applies f first and then g.

For example, we have the translation represented by the matrix

$\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$

which represents a move two units in the x direction and one unit in the y direction. If we want to then rotate the same object with the matrix

$\begin{bmatrix} -0.5 & 0.866 & 0 \\ -0.866 & -0.5 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

we can represent the combination of the two actions with a single composed matrix. This matrix is found by multiplying the matrix of the second action (on the left) by the matrix of the first action (on the right).

$\begin{bmatrix} -0.5 & 0.866 & 0 \\ -0.866 & -0.5 & 0 \\ 0 & 0 & 1 \end{bmatrix} * \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} -0.5 & 0.866 & -0.134 \\ -0.866 & -0.5 & -2.232 \\ 0 & 0 & 1 \end{bmatrix}$

So this matrix represents moving, then rotating an object in sequence.
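The product can be checked with a short C sketch. The names `Mat3`, `mat3_mul`, `example_translation` and `example_rotation` are hypothetical helpers written for this illustration; the matrix entries are the ones used in the example above.

```c
typedef struct { double m[3][3]; } Mat3;   /* 3x3 matrix, stored row by row */

/* Return the matrix product a * b. Acting on column vectors,
   the result applies b first and then a. */
Mat3 mat3_mul(Mat3 a, Mat3 b)
{
    Mat3 r;
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            r.m[i][j] = 0.0;
            for (int k = 0; k < 3; k++)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    }
    return r;
}

/* Translation by two units in x and one unit in y. */
Mat3 example_translation(void)
{
    Mat3 t = {{{1.0, 0.0, 2.0},
               {0.0, 1.0, 1.0},
               {0.0, 0.0, 1.0}}};
    return t;
}

/* The rotation matrix from the example. */
Mat3 example_rotation(void)
{
    Mat3 r = {{{-0.5,    0.866, 0.0},
               {-0.866, -0.5,   0.0},
               { 0.0,    0.0,   1.0}}};
    return r;
}
```

`mat3_mul(example_rotation(), example_translation())` has a third column of approximately (-0.134, -2.232, 1), while the opposite order, `mat3_mul(example_translation(), example_rotation())`, has a third column of (2, 1, 1). The two products differ, which is why the order of the factors matters.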

In the example below, the teapot on the left has just been translated by the translation matrix above. The next image is just the rotation from the rotation images. The two images that follow are the translation then rotation and rotation then translation respectively. This demonstrates the combination of different transformations and how they must be done in the right order.

## Transformations and Graphics Environments

Attention – this concept needs a bit of programming background; it involves stacks.

When you are defining the geometry for a graphics image, you will sometimes want to model your scene as a hierarchy of simpler objects. You might have a desk, for example, that is made up of several parts (legs, drawers, shelves); the drawers may have handles or other parts; you may want to put several things on top of the desk; and so on. It’s common to define general models for each simple part, and then to put the pieces together in a common space, called the “world space.”

Each simple part will be defined in its own “model space,” and then you can apply transformations that move all the parts into the right place in the more complex part. In turn, that whole more complex part may be moved into another position, and so on – you can build up quite complex models this way. One common technique for this kind of hierarchical modeling is to build a “scene graph” that shows how everything is assembled and the transformations that are used in the assembly.

As an example, consider the simple picture of a bunny head, basically made up of several spheres. Each sphere is scaled (making it an ellipsoid of the right size), rotated into the right orientation, and then translated into the proper place. The tree next to the picture shows how this is organized.

In order to make this work, you have to apply each set of transformations to its own sphere and then “forget” those transformations so you can apply the transformations for the next piece. You could, of course, use inverses to undo the transformations, but that’s slow and invites roundoff errors from many multiplications.

Instead, it is common to maintain a “transformation stack” that holds the history of every place you want to get back to – all the transformations you have saved. This is a stack of 4x4 matrices that implement the transformations. You also have an active transformation to which you apply any new transformations by matrix multiplication.

To save a transformation to get back to later, you push a copy of the current active transformation (as a 4x4 matrix) onto the stack. Later, when you have applied whatever new transformations you need and want to get back to the last saved transformation, you pop the top matrix off the stack and make it the current active transformation. Presto – all the transformations you had used since the corresponding push operation are gone.

So let’s get back to the rabbit. We want to create the rabbit head, and we have whatever active transformation was in place when we wanted to draw the head. Then we have

push
scale
sphere for main part
pop
push
translate
scale
sphere for left eye
pop
push
translate
scale
sphere for right eye
pop
push
translate
rotate
scale
sphere for left ear
pop
push
translate
rotate
scale
sphere for right ear
pop

Notice something important: the transformations are written in the reverse of the order in which they act on the geometry, so the one written closest to the geometry is applied first. The right ear operations are really Translate(Rotate(Scale(sphere-for-right-ear))).
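In matrix terms, and assuming (as the listing suggests, though it is not stated explicitly here) that each new transformation is multiplied onto the right of the active matrix $M$, the right ear sequence builds

$M \times T \times R \times S$

so that a point $P$ of the sphere is sent to

$M \times (T \times (R \times (S \times P)))$

Here $T$, $R$ and $S$ are hypothetical names for the translate, rotate and scale matrices of the right ear. The scale, written last, is the first to act on $P$.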

If you are not familiar with stacks, this won’t make much sense, but you don't need to understand this to understand basic transformation concepts. A simple way to look at these stacks is to notice that a transformation is a 4x4 matrix or, equivalently, a 16-element array, so maintaining a stack is simply a matter of building an array

float transStack[N][16];

or

float transStack[N][4][4];

where N is the number of transformations one wants to save.
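Building on the first declaration, a minimal sketch of the push and pop operations might look like this in C. Everything beyond `transStack` and `N` is an assumption made for illustration: the names `active`, `stackTop`, `push_transform`, `pop_transform` and `mult_transform`, the value chosen for `N`, and the row-by-row storage of each 4x4 matrix.

```c
#include <assert.h>
#include <string.h>

#define N 32                      /* maximum number of saved transformations (arbitrary) */

float transStack[N][16];          /* saved matrices, 16 floats each, stored row by row */
int   stackTop = 0;               /* number of matrices currently saved */
float active[16] = {1, 0, 0, 0,   /* active transformation, initially the identity */
                    0, 1, 0, 0,
                    0, 0, 1, 0,
                    0, 0, 0, 1};

/* Save a copy of the active transformation. */
void push_transform(void)
{
    assert(stackTop < N);
    memcpy(transStack[stackTop], active, sizeof active);
    stackTop++;
}

/* Restore the most recently saved transformation. */
void pop_transform(void)
{
    assert(stackTop > 0);
    stackTop--;
    memcpy(active, transStack[stackTop], sizeof active);
}

/* Replace the active transformation by active * m. */
void mult_transform(const float m[16])
{
    float r[16];
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            r[4 * i + j] = 0.0f;
            for (int k = 0; k < 4; k++)
                r[4 * i + j] += active[4 * i + k] * m[4 * k + j];
        }
    }
    memcpy(active, r, sizeof r);
}
```

Each `push` in the rabbit listing would call `push_transform()`, each translate, rotate or scale would call `mult_transform()` with the corresponding matrix, and each `pop` would call `pop_transform()`, discarding everything multiplied in since the matching push.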

# References

Page written by Steve Cunningham.

# Messages to the Future

When there is a page for 2D, 3D, 4D real spaces; affine spaces; homogeneous coordinates, this page should link to that page (to the homogeneous coordinates section).