THE MATHEMATICS BEHIND PERSPECTIVE
by Max Pandaemonium
The illustration of perspective on a two-dimensional screen from a set
of three-dimensional images can be accomplished by mapping a three-
dimensional point onto a two-dimensional plane. Taking it one point
at a time, we find the point of intersection where the line-of-sight
from the observer to this given point intersects a plane (which
represents the screen of a computer, for instance).
I will use vector mathematics in the derivation that follows. If you
are not familiar with vector mathematics, you're going to have some
trouble following along; but I will explain what I am doing as I go
along, if not why it works mathematically.
Hereafter I will use this notation: Figures, such as points and
planes and lines, will be referenced by capital letters; values, such
as vectors or real numbers, will be referenced by lowercase letters.
If there is a point A, then the vector a (if not previously defined as
something else) indicates the location of the point A from the origin.
That is, if A is (a, b, c), then a = <a, b, c>.
Here is how I will proceed with the derivation. First we will define
the reference plane (screen) C. Then we will find the line of sight,
L, between the point in question and the observer. Then we will find
the point of intersection of the line and the plane, and will
translate that into a two-dimensional location on the plane (i.e., the
screen).
Let the point P be the location of the point we wish to map to the
plane. Therefore, p is the vector from the origin to the point P.
Similarly we will define point V to be the location of the observer.
Also, let the facing vector, r, be a unit vector (i.e., one having a
length of 1). Thus r represents the direction in which the observer is
looking, and therefore does not have to be back toward the origin; it
can be in any direction.
We'll define the plane C to be located at the head of the facing
vector r. In other words, the primary point in C, which we will
call S (which represents the center of the screen) is given by
    s = v + r                                                   (i)
Since we want this plane to be perpendicular to the facing
vector, r, we can declare that r is a normal vector for C. Thus an
equation of C (where u is a general vector) can be:
    C: r dot (u - v - r) = 0.                                   (ii)
(This equation implies that r and u - v - r are always perpendicular.)
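
As a quick illustration of the plane equation, here is a minimal Python
sketch that tests whether a point lies on C. The function names and
the tolerance parameter are assumptions made for this example; they
are not part of the derivation.

```python
def dot(a, b):
    """Dot product of two 3-component sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def on_screen_plane(u, v, r, tol=1e-9):
    """Return True if point u satisfies r dot (u - v - r) = 0 within tol.

    v is the observer location and r the unit facing vector.
    tol is a hypothetical tolerance for floating-point round-off.
    """
    offset = tuple(ui - vi - ri for ui, vi, ri in zip(u, v, r))
    return abs(dot(r, offset)) <= tol


# Observer at the origin, looking along the positive y-axis.
v = (0.0, 0.0, 0.0)
r = (0.0, 1.0, 0.0)
print(on_screen_plane((3.0, 1.0, -2.0), v, r))  # True: y = 1 is the plane
print(on_screen_plane((3.0, 2.0, -2.0), v, r))  # False: one unit past it
```
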
Now that C is defined, we will define the line of sight, L. This is
very simple, since a direction vector for this line might be p - v.
Here lies the main part of the derivation. We must now find the
point, which we shall call T, where the line L and the plane C
intersect. The vector t - v, logically, should be a scalar multiple
of p - v. Thus we shall say
    t - v = k(p - v)                                            (iii)
where k is some real number.
The pieces of the puzzle are almost ready to be fit together. One
clear objective is to find k, the scalar multiple that corresponds to
finding T. In doing this we will define a vector x, which will have
its tail at S, the center of the screen, and its head at T, the mapped
point. This is useful because, when we want to transfer the three-
dimensional point T to a two-dimensional map on the screen, it will be
useful to have the vector set up this way. (Just as we always have
vectors with their tails at the origins in Cartesian space, we would
like a vector in this new two-dimensional map to have its tail at the
"origin," or center of the screen.) Thus, x is given by
    x = t - v - r.                                              (iv)
Substituting equation (iii) into (iv), we get
    x = k(p - v) - r.                                           (v)
Now we must have an equation to find k. We know that T will lie on
the plane C if, according to our equation (ii), the vector x from S to
T is perpendicular to r. Equivalently,
    r dot x = 0.                                                (vi)
Substituting equation (v) into (vi),
    r dot [k(p - v) - r] = 0.                                   (vii)
Now we must substitute identifiers for these vectors to solve for k.
We shall make the following definitions:
    p = <px, py, pz>;
    r = <rx, ry, rz>;
    v = <vx, vy, vz>.                                           (viii)
Thus equation (vii) becomes
    <rx, ry, rz> dot
    <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz> = 0     (ix)
Multiplying through the dot product,
    rx[k(px - vx) - rx] + ry[k(py - vy) - ry] + rz[k(pz - vz) - rz] = 0
    k rx (px - vx) - rx^2 + k ry (py - vy) - ry^2 + k rz (pz - vz) - rz^2
        = 0
    k[rx(px - vx) + ry(py - vy) + rz(pz - vz)] = rx^2 + ry^2 + rz^2
                   rx^2 + ry^2 + rz^2
    k = ---------------------------------------.                (1)
        rx(px - vx) + ry(py - vy) + rz(pz - vz)
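
Equation (1) translates directly into code. The following is a minimal
Python sketch; the function name scale_factor, the eps threshold, and
the choice to return None for a zero denominator are assumptions made
for illustration. Since r is a unit vector the numerator works out to
1, but it is computed explicitly here to mirror the formula.

```python
def scale_factor(p, v, r, eps=1e-12):
    """Compute k from equation (1), or None if the denominator vanishes.

    p: the point to map, v: the observer, r: the unit facing vector.
    eps is a hypothetical threshold for treating the denominator as zero.
    """
    # Denominator: rx(px - vx) + ry(py - vy) + rz(pz - vz)
    denom = sum(ri * (pi - vi) for ri, pi, vi in zip(r, p, v))
    if abs(denom) < eps:
        return None  # line of sight is parallel to the plane C
    # Numerator: rx^2 + ry^2 + rz^2 (equal to 1 for a unit vector r)
    numer = sum(ri * ri for ri in r)
    return numer / denom


# Observer at the origin looking along +y; point four units ahead.
print(scale_factor((2.0, 4.0, 1.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))  # 0.25
```
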
It is useful to consider some things about k at the present time.
Ideally, k should be a positive number. Consider: k = 0 would mean
that t - v is of zero length, which cannot occur here because the
numerator of (1) is the squared length of r, namely 1. If k < 0, that
means that P and C lie on opposite sides of V; in simple terms, P is
"behind" V. If k is undefined (the denominator is zero), then there is
no k that will define t - v; in other words, line L is parallel to C
(or P coincides with V), and therefore there will be no mapping. In
either of the latter two cases, the calculation can stop here, since
the point will not be visible.
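
The cases above can be sketched as a small classification step. This
is only an illustration: the function name, the returned labels, and
the convention that an undefined k is passed as None are all
assumptions of this example.

```python
def classify_k(k):
    """Classify the scale factor k into the cases discussed in the text.

    k is None when the denominator of equation (1) is zero
    (a convention assumed for this sketch).
    """
    if k is None:
        return "no mapping"   # line of sight parallel to the plane C
    if k <= 0:
        return "behind"       # P is behind the observer V
    return "visible"          # k > 0: the point can be mapped


for k in (0.25, -0.5, None):
    print(k, classify_k(k))
```
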
Now we still have the matter of transferring this three-dimensional
space coordinate into a two-dimensional screen coordinate. To do
this, first we will find the value of x. From equation (v),
substituting the definitions in (viii), we get
    x = k(p - v) - r
    x = k(<px, py, pz> - <vx, vy, vz>) - <rx, ry, rz>
    x = <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>.    (x)
Now we must define two unit vectors, u and u', which represent the x'-
and y'-axes of the new two-dimensional system. Note that u and u'
have their tails at S, and both u and u' must be contained in C. In
other words,
    r dot u = r dot u' = 0.                                     (xi)
Since we want to find these x' and y' values in the new two-
dimensional system, we will find the scalar projection of vector x on
both u and u' in turn. This is a simple matter:
    x' = u dot x;
    y' = u' dot x.                                              (xii)
If we take the value definitions
    u = <ux, uy, uz>;
    u' = <u'x, u'y, u'z>                                        (xiii)
then we can combine equations (x), (xii), and (xiii) to get
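    x' = u dot x;
    y' = u' dot x
    x' = <ux, uy, uz> dot
         <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>;
    y' = <u'x, u'y, u'z> dot
         <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>
    x' = ux[k(px - vx) - rx] + uy[k(py - vy) - ry] + uz[k(pz - vz) - rz];
    y' = u'x[k(px - vx) - rx] + u'y[k(py - vy) - ry] +
         u'z[k(pz - vz) - rz]                                   (2)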
where k is calculated according to equation (1). QED.
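
Putting equations (1) and (2) together, the whole mapping can be
sketched in a few lines of Python. The function name project, the
argument name u2 (standing in for u'), the eps threshold, and the
choice to return None for points that cannot be mapped are assumptions
made for this example.

```python
def dot(a, b):
    """Dot product of two 3-component sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def project(p, v, r, u, u2, eps=1e-12):
    """Map the 3-D point p to screen coordinates (x', y').

    v: observer, r: unit facing vector, u and u2: unit vectors in the
    plane C serving as the x'- and y'-axes (u2 plays the role of u').
    Returns None when the point cannot be mapped or is behind V.
    """
    pv = tuple(pi - vi for pi, vi in zip(p, v))    # p - v
    denom = dot(r, pv)
    if abs(denom) < eps:
        return None                                # L is parallel to C
    k = dot(r, r) / denom                          # equation (1)
    if k <= 0:
        return None                                # P is behind V
    x = tuple(k * d - ri for d, ri in zip(pv, r))  # equation (x)
    return dot(u, x), dot(u2, x)                   # equation (2)


# Observer at the origin looking along +y, with x' to the right and
# y' along the real z-axis.
print(project((2.0, 4.0, 1.0), (0.0, 0.0, 0.0),
              (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
# (0.5, 0.25)
```

Doubling the distance of the point along the facing direction halves
both screen coordinates, which is the foreshortening one expects from
perspective.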
Note that the proof above is quite indifferent as to the actual values
of u and u'. Naturally they should be perpendicular to each other,
since they represent the x'- and y'-axes, but what should we set them
to be?
A natural interpretation is to have the y'-axis, that is, the one
represented by the vector u', to be "up"; that is, to correspond to
the real z-axis. A choice of u and u' for this might be
u = ;
u' = .
(3)
It should be noted quickly that this will not work if r is parallel
to the z-axis (that is, to the unit vector usually written k, not to
be confused with the scalar k above). If this should be so (that is,
rx and ry are simultaneously zero), then this alternate form can be
used:
    u = <-rz, 0, 0>;
    u' = <0, -1, 0>.                                            (3a)
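
For readers who want something concrete to experiment with, here is
one way such a pair of axes could be built in Python. The general
branch is an assumed construction, not necessarily the author's
formula: it takes u as the normalized cross product of r with the
z-axis unit vector and u' as the cross product of u with r, which
keeps u' pointing "up". Only the fallback branch is taken from the
text, namely form (3a). The function name screen_basis and the eps
threshold are likewise hypothetical.

```python
import math


def screen_basis(r, eps=1e-12):
    """Return (u, u2) spanning the plane C for a unit facing vector r.

    u2 stands in for u'. The general branch is an assumed cross-product
    construction; the fallback branch is form (3a) from the text.
    """
    rx, ry, rz = r
    h = math.hypot(rx, ry)          # length of r's horizontal part
    if h < eps:
        # r is parallel to the z-axis: use the alternate form (3a).
        return (-rz, 0.0, 0.0), (0.0, -1.0, 0.0)
    # u = (r cross z) / |r cross z|, a horizontal unit vector in C.
    u = (ry / h, -rx / h, 0.0)
    # u2 = u cross r, a unit vector in C with positive z-component.
    u2 = (-rx * rz / h, -ry * rz / h, h)
    return u, u2


u, u2 = screen_basis((0.6, 0.0, 0.8))
r = (0.6, 0.0, 0.8)
# Both axes should satisfy equation (xi): r dot u = r dot u' = 0.
print(abs(sum(a * b for a, b in zip(r, u))) < 1e-12)
print(abs(sum(a * b for a, b in zip(r, u2))) < 1e-12)
```
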