
Computer Graphics Part 2


CSC418 / CSCD18 / CSC2504 Camera Models

6 Camera Models
Goal: To model basic geometry of projection of 3D points, curves, and surfaces onto a 2D surface,
the view plane or image plane.

6.1 Thin Lens Model


Most modern cameras use a lens to focus light onto the view plane (i.e., the sensory surface). This
is done so that one can capture enough light in a sufficiently short period of time that the objects do
not move appreciably, and the image is bright enough to show significant detail over a wide range
of intensities and contrasts.
Aside:
In a conventional camera, the view plane contains photoreactive chemicals;
in a digital camera, the view plane contains a charge-coupled device (CCD) array.
(Some cameras use a CMOS-based sensor instead of a CCD.) In the human eye, the
view plane is a curved surface called the retina, and contains a dense array of
cells with photoreactive molecules.
Lens models can be quite complex, especially for the compound lenses found in most cameras. Here we
consider perhaps the simplest case, known widely as the thin lens model. In the thin lens model,
rays of light emitted from a point travel along paths through the lens, converging at a point behind
the lens. The key quantity governing this behaviour is called the focal length of the lens. The
focal length, |f |, can be defined as the distance behind the lens at which rays from an infinitely distant
source converge in focus.

[Figure: thin lens geometry — a surface point at distance z1 in front of the lens is imaged in focus on the view plane at distance z0 behind the lens, measured along the optical axis.]

More generally, for the thin lens model, if z1 is the distance from the center of the lens (i.e., the
nodal point) to a surface point on an object, then for a focal length |f |, the rays from that surface
point will be in focus at a distance z0 behind the lens center, where z1 and z0 satisfy the thin lens
equation:
    1/|f | = 1/z0 + 1/z1                                          (25)
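As a quick sanity check on Eqn (25), the sketch below (not part of the original notes; the function name and the numbers are made up for illustration) solves the thin lens equation for z0, the distance behind the lens at which a point at distance z1 comes into focus.

#include <stdio.h>

/* Hypothetical helper: solve 1/|f| = 1/z0 + 1/z1 for z0.
   f is the focal length |f| and z1 the distance to the surface point,
   both in the same units, with z1 > |f|. */
double focusDistance(double f, double z1)
{
    return 1.0 / (1.0 / f - 1.0 / z1);
}

int main(void)
{
    /* Example: a 50mm lens focused on a point 2m in front of the lens
       comes into focus about 51.3mm behind the lens. */
    printf("z0 = %.4f m\n", focusDistance(0.050, 2.0));
    return 0;
}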


6.2 Pinhole Camera Model


A pinhole camera is an idealization of the thin lens as its aperture shrinks to zero.

[Figure: pinhole camera — light from the scene passes through an infinitesimal pinhole and strikes the view plane behind it.]

Light from a point travels along a single straight path through a pinhole onto the view plane. The
object is imaged upside-down on the image plane.

Note:
We use a right-handed coordinate system for the camera, with the x-axis as the hor-
izontal direction and the y-axis as the vertical direction. This means that the optical
axis (gaze direction) is the negative z-axis.
[Figure: camera axes — y points up and the optical axis points along −z.]

Here is another way of thinking about the pinhole model. Suppose you view a scene with one eye
looking through a square window, and draw a picture of what you see through the window:

(Engraving by Albrecht Dürer, 1525).


The image you’d get corresponds to drawing a ray from the eye position and intersecting it with
the window. This is equivalent to the pinhole camera model, except that the view plane is in front
of the eye instead of behind it, and the image appears right-side up rather than upside-down. (The
eye point here replaces the pinhole). To see this, consider tracing rays from scene points through a
view plane behind the eye point and one in front of it:

For the remainder of these notes, we will consider this camera model, as it is somewhat easier to
think about, and also consistent with the model used by OpenGL.

Aside:
The earliest cameras were room-sized pinhole cameras, called camera obscuras. You
would walk in the room and see an upside-down projection of the outside world on
the far wall. The word camera is Latin for “room;” camera obscura means “dark
room.”

18th-century camera obscuras. The camera on the right uses a mirror in the roof to
project images of the world onto the table, and viewers may rotate the mirror.

6.3 Camera Projections


Consider a point p̄ in 3D space, expressed in a coordinate frame with the camera at the origin, which we want to project
onto the view plane. To project py to y, we can use similar triangles to get y = (f /pz ) py . This is
perspective projection.

Note that f < 0, and the focal length is |f |.

In perspective projection, distant objects appear smaller than near objects:


[Figure 1: Perspective projection — a point with coordinates (py , pz ) is imaged through the pinhole onto the image plane at distance f , giving image coordinate y.]

The man without the hat appears to be two different sizes, even though the two images of him have
identical sizes when measured in pixels. In 3D, the man without the hat on the left is about 18
feet behind the man with the hat. This shows how much you might expect size to change due to
perspective projection.

6.4 Orthographic Projection


For objects sufficiently far away, rays are nearly parallel, and variation in pz is insignificant.


Here, the baseball players appear to be about the same height in pixels, even though the batter
is about 60 feet away from the pitcher. Although this is an example of perspective projection, the
camera is so far from the players (relative to the camera focal length) that they appear to be roughly
the same size.

In the limit, y = αpy for some real scalar α. This is orthographic projection:

[Figure: orthographic projection onto the image plane, with rays parallel to the z-axis.]

6.5 Camera Position and Orientation


Assume camera coordinates have their origin at the “eye” (pinhole) of the camera, ē.

[Figure 2: the camera coordinate frame — eye point ē, gaze direction ~g, and basis vectors ~u, ~v, and ~w.]

Let ~g be the gaze direction, so a vector perpendicular to the view plane (parallel to the camera
z-axis) is

    ~w = −~g / ||~g||                                             (26)

We need two more orthogonal vectors ~u and ~v to specify a camera coordinate frame, with ~u and
~v parallel to the view plane. It may be unclear how to choose them directly. However, we can
instead specify an "up" direction. Of course this up direction will not, in general, be perpendicular
to the gaze direction.

Let ~t be the "up" direction (e.g., toward the sky, so ~t = (0, 1, 0)). Then we want ~v to be the closest
vector in the view plane to ~t. This is really just the projection of ~t onto the view plane. And of
course, ~u must be perpendicular to ~v and ~w. In fact, with these definitions it is easy to show that ~u
must also be perpendicular to ~t, so one way to compute ~u and ~v from ~t and ~g is as follows:

    ~u = (~t × ~w) / ||~t × ~w|| ,        ~v = ~w × ~u            (27)

Of course, we could have used many different "up" directions, so long as ~t × ~w ≠ 0.
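As an illustration of Eqns (26) and (27), here is a minimal C sketch (not from the notes; the Vec3 type and helper names are hypothetical) that builds the camera basis from a gaze direction and an "up" hint.

#include <math.h>

typedef struct { double x, y, z; } Vec3;

static Vec3 cross(Vec3 a, Vec3 b) {
    Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return c;
}
static Vec3 normalize(Vec3 a) {
    double n = sqrt(a.x*a.x + a.y*a.y + a.z*a.z);
    Vec3 b = { a.x/n, a.y/n, a.z/n };
    return b;
}
static Vec3 negate(Vec3 a) { Vec3 b = { -a.x, -a.y, -a.z }; return b; }

/* Build the camera basis (u, v, w) from a gaze direction g and an "up"
   hint t, following Eqns (26) and (27).  Assumes t x w != 0. */
void cameraBasis(Vec3 g, Vec3 t, Vec3 *u, Vec3 *v, Vec3 *w)
{
    *w = normalize(negate(g));     /* w = -g / ||g||          (26) */
    *u = normalize(cross(t, *w));  /* u = (t x w) / ||t x w|| (27) */
    *v = cross(*w, *u);            /* v = w x u, already unit length */
}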

Using these three basis vectors, we can define a camera coordinate system, in which 3D points are
represented with respect to the camera's position and orientation. The camera coordinate system
has its origin at the eye point ē and has basis vectors ~u, ~v, and ~w, corresponding to the x, y, and z
axes in the camera's local coordinate system. This explains why we chose ~w to point away from
the image plane: the right-handed coordinate system requires that z (and, hence, ~w) point away
from the image plane.
Now that we know how to represent the camera coordinate frame within the world coordinate
frame we need to explicitly formulate the rigid transformation from world to camera coordinates.
With this transformation and its inverse we can easily express points either in world coordinates or
camera coordinates (both of which are necessary).
To get an understanding of the transformation, it might be helpful to remember the mapping from
points in camera coordinates to points in world coordinates. For example, we have the following
correspondences between camera coordinates and world coordinates:

    Camera coordinates (xc , yc , zc )    World coordinates (x, y, z)
    (0, 0, 0)                             ē
    (0, 0, f )                            ē + f ~w
    (0, 1, 0)                             ē + ~v
    (0, 1, f )                            ē + ~v + f ~w

Using such correspondences, it is not hard to show that for a general point expressed in camera
coordinates as p̄c = (xc , yc , zc ), the corresponding point in world coordinates is given by

    p̄w = ē + xc ~u + yc ~v + zc ~w                                (28)
        = [ ~u ~v ~w ] p̄c + ē                                     (29)
        = Mcw p̄c + ē.                                             (30)


where

    Mcw = [ ~u ~v ~w ] = [ u1  v1  w1 ]
                         [ u2  v2  w2 ]                           (31)
                         [ u3  v3  w3 ]

Note: We can define the same transformation for points in homogeneous coordinates:

    M̂cw = [ Mcw  ē ]
          [ ~0T  1 ] .

Now, we also need to find the inverse transformation, i.e., from world to camera coordinates.
Toward this end, note that the matrix Mcw is orthonormal. To see this, note that the vectors ~u, ~v,
and ~w are all of unit length, and they are perpendicular to one another. You can also verify this
by computing Mcw^T Mcw. Because Mcw is orthonormal, we can express the inverse transformation
(from world coordinates to camera coordinates) as

    p̄c = Mcw^T (p̄w − ē)
       = Mwc p̄w − d̄ ,

where Mwc = Mcw^T = [ ~u^T ]
                    [ ~v^T ]   (why?), and d̄ = Mcw^T ē.
                    [ ~w^T ]

In homogeneous coordinates, p̂c = M̂wc p̂w , where

    M̂wc = [ Mwc  −Mwc ē ]
          [ ~0T     1   ]

        = [ Mwc  ~0 ] [ I    −ē ]
          [ ~0T   1 ] [ ~0T   1 ] .

This transformation takes a point from world to camera-centered coordinates.
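The following sketch (again hypothetical, reusing the Vec3 helpers from the previous snippet) assembles M̂wc as a 4×4 row-major matrix with rows ~u^T, ~v^T, ~w^T in its upper-left block and −Mwc ē in its last column, and applies it to a homogeneous point.

static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Build the homogeneous world-to-camera matrix (row-major). */
void buildWorldToCamera(Vec3 u, Vec3 v, Vec3 w, Vec3 e, double M[4][4])
{
    Vec3 rows[3] = { u, v, w };              /* rows of Mwc = Mcw^T   */
    for (int i = 0; i < 3; i++) {
        M[i][0] = rows[i].x;
        M[i][1] = rows[i].y;
        M[i][2] = rows[i].z;
        M[i][3] = -dot(rows[i], e);          /* translation: -Mwc * e */
    }
    M[3][0] = M[3][1] = M[3][2] = 0.0;       /* bottom row (0 0 0 1)  */
    M[3][3] = 1.0;
}

/* Apply a 4x4 matrix to a homogeneous point pw = (x, y, z, 1). */
void transformPoint(const double M[4][4], const double pw[4], double pc[4])
{
    for (int i = 0; i < 4; i++)
        pc[i] = M[i][0]*pw[0] + M[i][1]*pw[1] + M[i][2]*pw[2] + M[i][3]*pw[3];
}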

6.6 Perspective Projection


Above we found the form of the perspective projection using the idea of similar triangles. Here we
consider a complementary algebraic formulation. To begin, we are given

• a point p̄c in camera coordinates (uvw space),

• center of projection (eye or pinhole) at the origin in camera coordinates,

• image plane perpendicular to the z-axis, through the point (0, 0, f ), with f < 0, and


• line of sight is in the direction of the negative z-axis (in camera coordinates),
we can find the intersection of the ray from the pinhole to p̄c with the view plane.
The ray from the pinhole to p̄c is r̄(λ) = λ(p̄c − 0̄).
The image plane has normal (0, 0, 1) = ~n and contains the point (0, 0, f ) = f̄ . So a point x̄c is on
the plane when (x̄c − f̄ ) · ~n = 0. If x̄c = (xc , yc , zc ), then the plane satisfies zc − f = 0.
To find the intersection of the plane z c = f and ray ~r(λ) = λp̄c , substitute ~r into the plane equation.
With p̄c = (pcx , pcy , pcz ), we have λpcz = f , so λ∗ = f /pcz , and the intersection is
    ~r(λ∗) = ( f pcx /pcz , f pcy /pcz , f ) = f ( pcx /pcz , pcy /pcz , 1 ) ≡ x̄∗ .       (32)
The first two coordinates of this intersection x̄∗ determine the image coordinates.

2D points in the image plane can therefore be written as

    [ x∗ ]              [ pcx ]   [ 1 0 0 ]
    [ y∗ ]  =  (f /pcz ) [ pcy ] = [ 0 1 0 ] (f /pcz ) p̄c .

The mapping from p̄c to (x∗ , y ∗ , 1) is called perspective projection.
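A direct transcription of Eqn (32) might look like the sketch below (not part of the notes; assumes pcz ≠ 0).

/* Perspective projection of a point in camera coordinates (Eqn (32)).
   pc = (px, py, pz) with pz != 0; f is the (negative) focal length. */
void projectPerspective(double f, const double pc[3], double *xs, double *ys)
{
    *xs = f * pc[0] / pc[2];   /* x* = f px / pz */
    *ys = f * pc[1] / pc[2];   /* y* = f py / pz */
}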

Note:
Two important properties of perspective projection are:
• Perspective projection preserves linearity. In other words, the projection of a
3D line is a line in 2D. This means that we can render a 3D line segment by
projecting the endpoints to 2D, and then draw a line between these points in
2D.
• Perspective projection does not preserve parallelism: two parallel lines in 3D
do not necessarily project to parallel lines in 2D. When the projected lines inter-
sect, the intersection is called a vanishing point, since it corresponds to a point
infinitely far away. Exercise: when do parallel lines project to parallel lines and
when do they not?

Aside:
The discovery of linear perspective, including vanishing points, formed a corner-
stone of Western painting beginning at the Renaissance. On the other hand, defying
realistic perspective was a key feature of Modernist painting.

To see that linearity is preserved, consider that rays from points on a line in 3D through a pinhole
all lie on a plane, and the intersection of a plane and the image plane is a line. That means to draw
polygons, we need only to project the vertices to the image plane and draw lines between them.


6.7 Homogeneous Perspective


The mapping of p̄c = (pcx , pcy , pcz ) to x̄∗ = (f /pcz )(pcx , pcy , pcz ) is just a form of scaling transformation.
However, the magnitude of the scaling depends on the depth pcz . So it's not linear.

Fortunately, the transformation can be expressed linearly (i.e., as a matrix) in homogeneous coordinates.
To see this, remember that p̂ = (p̄, 1) = α(p̄, 1) in homogeneous coordinates. Using this
property of homogeneous coordinates we can write x̄∗ as

    x̂∗ = ( pcx , pcy , pcz , pcz /f ).

As usual with homogeneous coordinates, when you scale the homogeneous vector by the inverse
of the last element, what you get in the first three elements is precisely the perspective projection.
Accordingly, we can express x̂∗ as a linear transformation of p̂c :

          [ 1  0   0   0 ]
    x̂∗ =  [ 0  1   0   0 ] p̂c ≡ M̂p p̂c .
          [ 0  0   1   0 ]
          [ 0  0  1/f  0 ]

Try multiplying this out to convince yourself that this all works.
Finally, M̂p is called the homogeneous perspective matrix, and since p̂c = M̂wc p̂w , we have x̂∗ =
M̂p M̂wc p̂w .
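As a sketch of how the homogeneous perspective matrix is used (a hypothetical helper, not from the notes), the code below multiplies M̂p into a camera-space point and then divides by the last component; the first two components of the result are exactly the image coordinates of Eqn (32).

/* Apply M_p to (px, py, pz, 1) and divide by the last component. */
void homogeneousProject(double f, const double pc[3], double out[3])
{
    /* M_p * (px, py, pz, 1) = (px, py, pz, pz/f) */
    double xh[4] = { pc[0], pc[1], pc[2], pc[2] / f };
    out[0] = xh[0] / xh[3];   /* = f px / pz */
    out[1] = xh[1] / xh[3];   /* = f py / pz */
    out[2] = xh[2] / xh[3];   /* = f         */
}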

6.8 Pseudodepth
After dividing by its last element, x̂∗ has its first two elements as image plane coordinates, and its
third element is f . We would like to be able to alter the homogeneous perspective matrix M̂p so
that the third element of (f /pcz ) x̂∗ (i.e., of x̂∗ after dividing by its last element) encodes depth
while keeping the transformation linear.

Idea: Let

          [ 1  0   0   0 ]
    x̂∗ =  [ 0  1   0   0 ] p̂c ,    so that    z∗ = (f /pcz )(a pcz + b).
          [ 0  0   a   b ]
          [ 0  0  1/f  0 ]

What should a and b be? We would like to have the following two constraints:

    z∗ = −1  when  pcz = f ,
    z∗ = +1  when  pcz = F ,

where f gives us the position of the near plane, and F gives us the z coordinate of the far plane.


So −1 = af + b and 1 = af + b (f /F ). Then 2 = b (f /F ) − b = b (f /F − 1), and we can find

    b = 2F / (f − F ).

Substituting this value for b back in, we get −1 = af + 2F/(f − F ), and we can solve for a:

    a = − (1/f ) ( 2F/(f − F ) + 1 )
      = − (1/f ) ( 2F/(f − F ) + (f − F )/(f − F ) )
      = − (1/f ) ( (f + F )/(f − F ) ).

These values of a and b give us a function z ∗ (pcz ) that increases monotonically as pcz decreases
(since pcz is negative for objects in front of the camera). Hence, z ∗ can be used to sort points by
depth.

Why did we choose these values for a and b? Mathematically, the specific choices do not matter,
but they are convenient for implementation. These are also the values that OpenGL uses.
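To make the algebra concrete, here is a small hedged sketch (not from the notes; the near/far values are arbitrary) that computes a and b and checks that pseudodepth is −1 at the near plane and +1 at the far plane.

#include <assert.h>
#include <math.h>

/* Pseudodepth coefficients from Section 6.8, with the near plane at
   z = f and the far plane at z = F (both negative in camera coords). */
void pseudodepthCoeffs(double f, double F, double *a, double *b)
{
    *b = 2.0 * F / (f - F);
    *a = -(1.0 / f) * (f + F) / (f - F);
}

/* z* = (f / pz) (a pz + b): the third component after the homogeneous
   perspective transform and division by the last component. */
double pseudodepth(double f, double a, double b, double pz)
{
    return (f / pz) * (a * pz + b);
}

int main(void)
{
    double f = -1.0, F = -10.0, a, b;    /* example near/far planes */
    pseudodepthCoeffs(f, F, &a, &b);
    assert(fabs(pseudodepth(f, a, b, f) + 1.0) < 1e-12);  /* near -> -1 */
    assert(fabs(pseudodepth(f, a, b, F) - 1.0) < 1e-12);  /* far  -> +1 */
    return 0;
}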

What is the meaning of the near and far planes? Again, for convenience of implementation, we will
say that only objects between the near and far planes are visible. Objects in front of the near plane
are behind the camera, and objects behind the far plane are too far away to be visible. Of course,
this is only a loose approximation to the real geometry of the world, but it is very convenient
for implementation. The range of values between the near and far plane has a number of subtle
implications for rendering in practice. For example, if you set the near and far plane to be very far
apart in OpenGL, then Z-buffering (discussed later in the course) will be very inaccurate due to
numerical precision problems. On the other hand, moving them too close will make distant objects
disappear. However, these issues will generally not affect rendering simple scenes. (For homework
assignments, we will usually provide some code that avoids these problems).

6.9 Projecting a Triangle


Let’s review the steps necessary to project a triangle from object space to the image plane.

1. A triangle is given as three vertices in an object-based coordinate frame: p̄o1 , p̄o2 , p̄o3 .


[Figure: a triangle with vertices p̄1 , p̄2 , p̄3 in object coordinates.]

2. Transform to world coordinates based on the object's transformation: p̂w1 , p̂w2 , p̂w3 , where
   p̂wi = M̂ow p̂oi .

[Figure: the triangle transformed to world coordinates, with a camera at c̄.]

3. Transform from world to camera coordinates: p̂ci = M̂wc p̂wi .


[Figure: the triangle transformed from world to camera coordinates.]

4. Homogeneous perspective transformation: x̂∗i = M̂p p̂ci , where

            [ 1  0   0   0 ]                   [ pcx       ]
     M̂p  =  [ 0  1   0   0 ] ,    so   x̂∗i  =  [ pcy       ]
            [ 0  0   a   b ]                   [ a pcz + b ]
            [ 0  0  1/f  0 ]                   [ pcz /f    ] .

5. Divide by the last component:

     [ x∗ ]         [ pcx /pcz         ]
     [ y∗ ]  =  f   [ pcy /pcz         ]
     [ z∗ ]         [ (a pcz + b)/pcz  ] .

[Figure: the triangle in normalized device coordinates after perspective division, inside the canonical cube from (−1, −1, −1) to (1, 1, 1).]


Now (x∗ , y ∗ ) is an image plane coordinate, and z ∗ is pseudodepth for each vertex of the
triangle.
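Putting the steps together, the following sketch (reusing the hypothetical transformPoint and pseudodepthCoeffs helpers from the earlier snippets) projects a single object-space vertex through the whole pipeline; M̂ow and M̂wc are assumed to be given as 4×4 row-major matrices.

void projectVertex(const double Mow[4][4], const double Mwc[4][4],
                   double f, double a, double b,
                   const double po[4],    /* object-space vertex (x, y, z, 1) */
                   double ndc[3])         /* output: (x*, y*, z*)             */
{
    double pw[4], pc[4];
    transformPoint(Mow, po, pw);           /* step 2: object -> world  */
    transformPoint(Mwc, pw, pc);           /* step 3: world  -> camera */
    /* steps 4-5: homogeneous perspective transform and division */
    ndc[0] = f * pc[0] / pc[2];            /* x*                 */
    ndc[1] = f * pc[1] / pc[2];            /* y*                 */
    ndc[2] = f * (a * pc[2] + b) / pc[2];  /* z* (pseudodepth)   */
}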

6.10 Camera Projections in OpenGL


OpenGL’s modelview matrix is used to transform a point from object or world space to camera
space. In addition to this, a projection matrix is provided to perform the homogeneous perspective
transformation from camera coordinates to clip coordinates before performing perspective divi-
sion. After selecting the projection matrix, the glFrustum function is used to specify a viewing
volume, assuming the camera is at the origin:

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glFrustum(left, right, bottom, top, near, far);

For orthographic projection, glOrtho can be used instead:

glOrtho(left, right, bottom, top, near, far);

The GLU library provides a function to simplify specifying a perspective projection viewing frus-
tum:

gluPerspective(fieldOfView, aspectRatio, near, far);

The field of view is specified in degrees about the x-axis, so it gives the vertical visible angle. The
aspect ratio should usually be the viewport width over its height, to determine the horizontal field
of view.
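For reference, a symmetric perspective frustum like the one gluPerspective sets up can also be expressed with glFrustum; the sketch below (an illustration, not OpenGL's implementation) derives the frustum bounds from the vertical field of view and aspect ratio.

#include <math.h>
#include <GL/gl.h>

void perspective(double fovYDegrees, double aspect, double near, double far)
{
    /* half-height of the near plane: near * tan(fovY / 2) */
    double top   = near * tan(fovYDegrees * 0.5 * 3.14159265358979 / 180.0);
    double right = top * aspect;
    glFrustum(-right, right, -top, top, near, far);
}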


7 Visibility
We have seen so far how to determine how 3D points project to the camera’s image plane. Ad-
ditionally, we can render a triangle by projecting each vertex to 2D, and then filling in the pixels
of the 2D triangle. However, what happens if two triangles project to the same pixels, or, more
generally, if they overlap? Determining which polygon to render at each pixel is the problem of
visibility. A point on an object is visible if there exists a direct line-of-sight to that point,
unobstructed by any other objects. Moreover, some objects may be invisible because they are behind
the camera, outside of the field-of-view, or too far away.

7.1 The View Volume and Clipping


The view volume is made up of the space between the near plane, f , and far plane, F . It is bounded
by B, T , L, and R on the bottom, top, left, and right, respectively.

The angular field of view is determined by f , B, T , L, and R:

[Figure: the half-angle α of the field of view at the eye ē, subtended by the view plane at distance |f |.]

From this figure, we can find that tan(α) = (1/2) (T − B) / |f | .

Clipping is the process of removing points and parts of objects that are outside the view volume.

We would like to modify our homogeneous perspective transformation matrix to simplify clipping.
We have

           [ 1  0            0                       0         ]
    M̂p  =  [ 0  1            0                       0         ]
           [ 0  0  −(1/f )(f + F )/(f − F )     2F/(f − F )     ]
           [ 0  0           1/f                      0         ] .
Since this is a homogeneous transformation, it may be multiplied by a constant without changing


its effect. Multiplying M̂p by f gives us

    [ f  0          0                   0           ]
    [ 0  f          0                   0           ]
    [ 0  0  −(f + F )/(f − F )    2f F/(f − F )     ]
    [ 0  0          1                   0           ] .

If we alter the transform in the x and y coordinates to be

           [ 2f /(R − L)        0          (R + L)/(R − L)           0          ]
    x̂∗  =  [      0        2f /(T − B)     (T + B)/(T − B)           0          ] p̂c ,
           [      0             0         −(f + F )/(f − F )    2f F/(f − F )   ]
           [      0             0                1                   0          ]

then, after projection, the view volume becomes a cube with sides at −1 and +1. This is called
the canonical view volume and has the advantage of being easy to clip against.

Note:
The OpenGL command glFrustum(l, r, b, t, n, f) takes the distance to the near and
far planes rather than the position on the z-axis of the planes. Hence, the n used by
glFrustum is our −f and the f used by glFrustum is −F . Substituting these values
into our matrix gives exactly the perspective transformation matrix used by OpenGL.

7.2 Backface Removal


Consider a closed polyhedral object. Because it is closed, the far side of the object will always be
invisible, blocked by the near side. This observation can be used to accelerate rendering by removing
back-faces.

Example:
For this simple view of a cube, we have three backfacing polygons, the left side,
back, and bottom:

Only the near faces are visible.

We can determine if a face is back-facing as follows. Suppose we compute a normal ~n for a mesh
face, with the normal chosen so that it points outside the object. For a surface point p̄ on a planar


patch and eye point ē, if (p̄ − ē) · ~n > 0, then the angle between the view direction and normal
is less than 90◦ , so the surface normal points away from ē. The result will be the same no matter
which face point p̄ we use.

Hence, if (p̄ − ē) · ~n > 0, the patch is backfacing and should be removed. Otherwise, it might be
visible. This should be calculated in world coordinates so the patch can be removed as early as
possible.

Note:
To compute ~n, we need three vertices on the patch, in counterclockwise order as
seen from the outside of the object, p̄1 , p̄2 , and p̄3 . Then the unit normal is

    ( (p̄2 − p̄1 ) × (p̄3 − p̄1 ) ) / || (p̄2 − p̄1 ) × (p̄3 − p̄1 ) || .
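A minimal backface test might look like the following sketch (reusing the hypothetical Vec3, cross, and dot helpers from Section 6.5's snippet); note that the normal need not be normalized, since only the sign of the dot product matters.

static Vec3 sub(Vec3 a, Vec3 b) { Vec3 c = { a.x - b.x, a.y - b.y, a.z - b.z }; return c; }

/* Returns nonzero if the triangle (p1, p2, p3), with vertices in
   counterclockwise order as seen from outside, faces away from the eye e. */
int isBackfacing(Vec3 p1, Vec3 p2, Vec3 p3, Vec3 e)
{
    Vec3 n = cross(sub(p2, p1), sub(p3, p1));   /* outward-pointing normal  */
    return dot(sub(p1, e), n) > 0.0;            /* (p - e) . n > 0  => cull */
}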

Backface removal is a “quick reject” used to accelerate rendering. It must still be used together
with another visibility method. The other methods are more expensive, and removing backfaces
just reduces the number of faces that must be considered by a more expensive method.

7.3 The Depth Buffer


Normally when rendering, we compute an image buffer I(i,j) that stores the color of the object
that projects to pixel (i, j). The depth d of a pixel is the distance from the eye point to the object.
The depth buffer is an array zbuf(i, j) which stores, for each pixel (i, j), the depth of the
nearest point drawn so far. It is initialized by setting all depth buffer values to infinite depth:
zbuf(i,j)= ∞.

To draw color c at pixel (i, j) with depth d:


if d < zbuf(i, j) then
putpixel(i, j, c)
zbuf(i, j) = d
end

When drawing a pixel, if the new pixel’s depth is greater than the current value of the depth buffer
at that pixel, then there must be some object blocking the new pixel, and it is not drawn.

Advantages
• Simple and accurate

• Independent of order of polygons drawn


Disadvantages
• Memory required for depth buffer

• Wasted computation on drawing distant points that are drawn over with closer points that
occupy the same pixel

To represent the depth at each pixel, we can use pseudodepth, which is available after the
homogeneous perspective transformation.[1] Then the depth buffer should be initialized to 1, since the
pseudodepth values are between −1 and 1. Pseudodepth gives a number of numerical advantages
over true depth.

To scan convert a triangular polygon with vertices x̄1 , x̄2 , and x̄3 , pseudodepth values d1 , d2 , and
d3 , and fill color c, we calculate the x values and pseudodepths for each edge at each scanline. Then
for each scanline, interpolate pseudodepth between edges and compare the value at each pixel to
the value stored in the depth buffer.
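The inner loop of that scan conversion might look like the sketch below (hypothetical; WIDTH, Color, zbuf, and putpixel are assumed to be defined elsewhere): given the left and right edge crossings of one scanline, it interpolates pseudodepth across the span and applies the depth test.

void drawSpan(int y, int xl, double dl, int xr, double dr,
              double zbuf[][WIDTH], Color c)
{
    for (int x = xl; x <= xr; x++) {
        double t = (xr == xl) ? 0.0 : (double)(x - xl) / (double)(xr - xl);
        double d = (1.0 - t) * dl + t * dr;    /* interpolated pseudodepth  */
        if (d < zbuf[y][x]) {                  /* closer than what is there */
            putpixel(x, y, c);
            zbuf[y][x] = d;
        }
    }
}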

7.4 Painter’s Algorithm


The painter’s algorithm is an alternative to depth buffering to attempt to ensure that the closest
points to a viewer occlude points behind them. The idea is to draw the most distant patches of a
surface first, allowing nearer surfaces to be drawn over them.

In the heedless painter’s algorithm, we first sort faces according to depth of the vertex furthest from
the viewer. Then faces are rendered from furthest to nearest.

There are problems with this approach, however. In some cases, a face that occludes part of another
face can still have its furthest vertex further from the viewer than any vertex of the face it occludes.
In this situation, the faces will be rendered out of order. Also, polygons cannot intersect at all as
they can when depth buffering is used instead. One solution is to split triangles, but doing this
correctly is very complex and slow. Painter’s algorithm is rarely used directly in practice; however,
a data-structure called BSP trees can be used to make painter’s algorithm much more appealing.

7.5 BSP Trees


The idea of binary space partitioning trees (BSP trees) is to extend the painter’s algorithm to
make back-to-front ordering of polygons fast for any eye location and to divide polygons to avoid
overlaps.

Imagine two patches, T1 and T2 , with outward-facing normals ~n1 and ~n2 .
[1] The OpenGL documentation is confusing in a few places — "depth" is used to mean pseudodepth, in commands like glReadPixels and gluUnProject.


[Figure: two configurations of patches T1 and T2 with outward normals ~n1 and ~n2 , and the eye point ē on either side of T1 .]

If the eye point, ē, and T2 are on the same side of T1 , then we draw T1 before T2 . Otherwise, T2
should be drawn before T1 .

We know if two points are on the same side of a plane containing T1 by using the implicit equation
for T1 ,

    f1 (x̄) = (x̄ − p̄1 ) · ~n1 .                                   (33)

If x̄ is on the plane, f1 (x̄) = 0. Otherwise, if f1 (x̄) > 0, x̄ is on the "outside" of T1 , and if
f1 (x̄) < 0, x̄ is "inside."

Before any rendering can occur, the scene geometry must be processed to build a BSP tree to
represent the relative positions of all the facets with respect to their inside/outside half-planes. The
same BSP tree can be used for any eye position, so the tree only has to be constructed once if
everything other than the eye is static. For a single scene, there are many different BSP trees that
can be used to represent it — it’s best to try to construct balanced trees.

The tree traversal algorithm to draw a tree with root F is as follows:


if eye is in the outside half-space of F
draw faces on the inside subtree of F
draw F
draw faces on the outside subtree of F
else
draw faces on the outside subtree of F
draw F (if backfaces are drawn)
draw faces on the inside subtree of F
end
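In C, the traversal might be implemented along these lines (a sketch; the BSPNode and Face types, drawFace, and the Vec3 helpers are hypothetical). The implicit test of Eqn (33) decides which half-space the eye is in.

typedef struct { Vec3 p1, n; /* a vertex of the face and its outward normal */ } Face;

typedef struct BSPNode {
    Face face;                 /* the splitting face F              */
    struct BSPNode *inside;    /* subtree in the inside half-space  */
    struct BSPNode *outside;   /* subtree in the outside half-space */
} BSPNode;

void drawBSP(const BSPNode *node, Vec3 eye, int drawBackfaces)
{
    if (node == NULL)
        return;
    /* f1(eye) = (eye - p1) . n > 0 means the eye is "outside" the face. */
    if (dot(sub(eye, node->face.p1), node->face.n) > 0.0) {
        drawBSP(node->inside, eye, drawBackfaces);
        drawFace(&node->face);
        drawBSP(node->outside, eye, drawBackfaces);
    } else {
        drawBSP(node->outside, eye, drawBackfaces);
        if (drawBackfaces)
            drawFace(&node->face);
        drawBSP(node->inside, eye, drawBackfaces);
    }
}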

7.6 Visibility in OpenGL


OpenGL directly supports depth buffering, but it is often used in addition to other visibility
techniques in interactive applications. For example, many games use a BSP tree to prune the amount
of static map geometry that is processed, since much of it would not be visible anyway. Also, when


dealing with blended, translucent materials, these objects often must be drawn from back to front
without writing to the depth buffer to get the correct appearance. For simple scenes, however, the
depth buffer alone is sufficient.

To use depth buffering in OpenGL with GLUT, the OpenGL context must be initialized with mem-
ory allocated for a depth buffer, with a command such as

glutInitDisplayMode(GLUT_RGB | GLUT_DOUBLE | GLUT_DEPTH);

Next, depth writing and testing must be enabled in OpenGL:

glEnable(GL_DEPTH_TEST);

OpenGL will automatically write pseudodepth values to the depth buffer when a primitive is ren-
dered as long as the depth test is enabled. The glDepthMask function can be used to disable depth
writes, so depth testing will occur without writing to the depth buffer when rendering a primitive.

When clearing the display to render a new frame, the depth buffer should also be cleared:

glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);


8 Basic Lighting and Reflection


Up to this point, we have considered only the geometry of how objects are transformed and pro-
jected to images. We now discuss the shading of objects: how the appearance of objects depends,
among other things, on the lighting that illuminates the scene, and on the interaction of light with
the objects in the scene. Some of the basic qualitative properties of lighting and object reflectance
that we need to be able to model include:

Light source - There are different types of sources of light, such as point sources (e.g., a small
light at a distance), extended sources (e.g., the sky on a cloudy day), and secondary reflections
(e.g., light that bounces from one surface to another).

Reflectance - Different objects reflect light in different ways. For example, diffuse surfaces ap-
pear the same when viewed from different directions, whereas a mirror looks very different from
different points of view.

In this chapter, we will develop a simplified model of lighting that is easy to implement and fast to
compute, and is used in many real-time systems such as OpenGL. This model will be an approximation
and does not fully capture all of the effects we observe in the real world. In later chapters, we
will discuss more sophisticated and realistic models.

8.1 Simple Reflection Models


8.1.1 Diffuse Reflection
We begin with the diffuse reflectance model. A diffuse surface is one that appears similarly bright
from all viewing directions. That is, the emitted light appears independent of the viewing location.
Let p̄ be a point on a diffuse surface with normal ~n, lit by a point light source in direction ~s from
the surface. The reflected intensity of light is given by:

    Ld (p̄) = rd I max(0, ~s · ~n)                                 (34)

where I is the intensity of the light source, rd is the diffuse reflectance (or albedo) of the surface,
and ~s is the direction of the light source. This equation requires the vectors to be normalized, i.e.,
||~s|| = 1, ||~n|| = 1.

The ~s · ~n term is called the foreshortening term. When a light source projects light obliquely at
a surface, that light is spread over a large area, and less of the light hits any specific point. For
example, imagine pointing a flashlight directly at a wall versus in a direction nearly parallel: in the
latter case, the light from the flashlight will spread over a greater area, and individual points on the
wall will not be as bright.


For color rendering, we would specify the reflectance in color (as (rd,R , rd,G , rd,B )), and specify
the light source in color as well (IR , IG , IB ). The reflected color of the surface is then:
Ld,R (p̄) = rd,R IR max(0, ~s · ~n) (35)
Ld,G (p̄) = rd,G IG max(0, ~s · ~n) (36)
Ld,B (p̄) = rd,B IB max(0, ~s · ~n) (37)
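A per-channel implementation of Eqns (35)–(37) could be as simple as the following sketch (hypothetical; ~s and ~n are assumed to be unit vectors, and the Vec3/dot helpers are as sketched earlier).

void diffuseShade(const double rd[3], const double I[3], Vec3 s, Vec3 n,
                  double Ld[3])
{
    double fore = dot(s, n);            /* foreshortening term  s . n */
    if (fore < 0.0)
        fore = 0.0;                     /* max(0, s . n)              */
    for (int k = 0; k < 3; k++)         /* k = R, G, B                */
        Ld[k] = rd[k] * I[k] * fore;
}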

8.1.2 Perfect Specular Reflection


For pure specular (mirror) surfaces, the incident light from each incident direction d~i is reflected
toward a unique emittant direction d~e . The emittant direction lies in the same plane as the incident
direction d~i and the surface normal ~n, and the angle between ~n and d~e is equal to that between ~n and
d~i . One can show that the emittant direction is given by d~e = 2(~n · d~i )~n − d~i . (The derivation was
covered in class.)

[Figure: mirror reflection — incident direction d~i and emittant direction d~e make equal angles with the normal ~n.]

In perfect specular reflection, the light emitted in direction d~e can be computed
by reflecting d~e across the normal (as 2(~n · d~e )~n − d~e ), and determining the incoming light in this
direction. (Again, all vectors are required to be normalized in these equations.)

8.1.3 General Specular Reflection


Many materials exhibit a significant specular component in their reflectance. But few are perfect
mirrors. First, most specular surfaces do not reflect all light, and that is easily handled by intro-
ducing a scalar constant to attenuate intensity. Second, most specular surfaces exhibit some form
of off-axis specular reflection. That is, many polished and shiny surfaces (like plastics and metals)
emit light in the perfect mirror direction and in some nearby directions as well. These off-axis
specularities look a little blurred. Good examples are highlights on plastics and metals.

More precisely, the light from a distant point source in the direction of ~s is reflected into a range
of directions about the perfect mirror direction ~m = 2(~n · ~s)~n − ~s. One common model for this is
the following:

    Ls (d~e ) = rs I max(0, ~m · d~e )^α ,                        (38)

where rs is called the specular reflection coefficient, I is the incident power from the point source,
and α ≥ 0 is a constant that determines the width of the specular highlights. As α increases, the
effective width of the specular reflection decreases. In the limit as α increases, this becomes a
mirror.


[Figure 3: Plot of specular intensity max(0, cos φ)^α as a function of viewing angle φ, for α = 0.1, 0.5, 1, 2, and 10.]

The intensity of the specular region is proportional to max(0, cos φ)^α , where φ is the angle between
~m and d~e . One way to understand the nature of specular reflection is to plot this function; see
Figure 3.

8.1.4 Ambient Illumination


The diffuse and specular shading models are easy to compute, but often appear artificial. The
biggest issue is the point light source assumption, the most obvious consequence of which is that
any surface normal pointing away from the light source (i.e., for which ~s · ~n < 0) will have a
radiance of zero. A better approximation to the light source is a uniform ambient term plus a point
light source. This is still a remarkably crude model, but it's much better than the point source by
itself. Ambient illumination is modeled simply by:

La (p̄) = ra Ia (39)

where ra is often called the ambient reflection coefficient, and Ia denotes the integral of the uniform
illuminant.

8.1.5 Phong Reflectance Model


The Phong reflectance model is perhaps the simplest widely used shading model in computer
graphics. It comprises a diffuse term (Eqn (34)), an ambient term (Eqn (39)), and a specular term
(Eqn (38)):

    L(p̄, d~e ) = rd Id max(0, ~s · ~n) + ra Ia + rs Is max(0, ~m · d~e )^α ,        (40)

where

• Ia , Id , and Is are parameters that correspond to the power of the light sources for the ambient,
  diffuse, and specular terms;

• ra , rd , and rs are scalar constants, called reflection coefficients, that determine the relative
  magnitudes of the three reflection terms;

• α determines the spread of the specular highlights;

• ~n is the surface normal at p̄;

• ~s is the direction of the distant point source;

• ~m is the perfect mirror direction, given ~n and ~s; and

• d~e is the emittant direction of interest (usually the direction of the camera).

In effect, this is a model in which the diffuse and specular components of reflection are due to
incident light from a point source. Extended light sources and the bouncing of light from one
surface to another are not modeled except through the ambient term. Also, arguably this model
has more parameters than the physics might suggest; for example, the model does not constrain
the parameters to conserve energy. Nevertheless, it is sometimes useful to give computer graphics
practitioners more freedom in order to achieve the appearance they're after.
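For a single distant point source, Eqn (40) can be evaluated as in the sketch below (hypothetical code, not OpenGL's implementation; ~n, ~s, and d~e are assumed to be unit vectors, and dot is the helper sketched earlier).

#include <math.h>

double phong(double ra, double Ia, double rd, double Id,
             double rs, double Is, double alpha,
             Vec3 n, Vec3 s, Vec3 de)
{
    double sn = dot(s, n);
    double diffuse = rd * Id * (sn > 0.0 ? sn : 0.0);
    /* perfect mirror direction  m = 2 (n . s) n - s */
    Vec3 m = { 2.0*sn*n.x - s.x, 2.0*sn*n.y - s.y, 2.0*sn*n.z - s.z };
    double md = dot(m, de);
    double specular = rs * Is * pow(md > 0.0 ? md : 0.0, alpha);
    return ra * Ia + diffuse + specular;
}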

8.2 Lighting in OpenGL


OpenGL provides a slightly modified version of Phong lighting. Lighting and any specific lights
to use must be enabled in order to see their effects:

glEnable(GL_LIGHTING); // enable Phong lighting


glEnable(GL_LIGHT0); // enable the first light source
glEnable(GL_LIGHT1); // enable the second light source
...

Lights can be directional (infinitely far away) or positional. Positional lights can be either point
lights or spotlights. Directional lights have the w component set to 0, and positional lights have w
set to 1. Light properties are specified with the glLight functions:


GLfloat direction[] = {1.0f, 1.0f, 1.0f, 0.0f};


GLfloat position[] = {5.0f, 3.0f, 8.0f, 1.0f};
GLfloat spotDirection[] = {0.0f, 3.0f, 3.0f};
GLfloat diffuseRGBA[] = {1.0f, 1.0f, 1.0f, 1.0f};
GLfloat specularRGBA[] = {1.0f, 1.0f, 1.0f, 1.0f};

// A directional light
glLightfv(GL_LIGHT0, GL_POSITION, direction);
glLightfv(GL_LIGHT0, GL_DIFFUSE, diffuseRGBA);
glLightfv(GL_LIGHT0, GL_SPECULAR, specularRGBA);

// A spotlight
glLightfv(GL_LIGHT1, GL_POSITION, position);
glLightfv(GL_LIGHT1, GL_DIFFUSE, diffuseRGBA);
glLightfv(GL_LIGHT1, GL_SPOT_DIRECTION, spotDirection);
glLightf(GL_LIGHT1, GL_SPOT_CUTOFF, 45.0f);
glLightf(GL_LIGHT1, GL_SPOT_EXPONENT, 30.0f);

OpenGL requires you to specify both diffuse and specular components for the light source. This
has no physical interpretation (real lights do not have “diffuse” or “specular” properties), but may
be useful for some effects. The glMaterial functions are used to specify material properties, for
example:
GLfloat diffuseRGBA[] = {1.0f, 0.0f, 0.0f, 1.0f};
GLfloat specularRGBA[] = {1.0f, 1.0f, 1.0f, 1.0f};
glMaterialfv(GL_FRONT, GL_DIFFUSE, diffuseRGBA);
glMaterialfv(GL_FRONT, GL_SPECULAR, specularRGBA);
glMaterialf(GL_FRONT, GL_SHININESS, 3.0f);

Note that both lights and materials have ambient terms. Additionally, there is a global ambient
term:
glLightfv(GL_LIGHT0, GL_AMBIENT, ambientLight);
glMaterialfv(GL_FRONT, GL_AMBIENT, ambientMaterial);
glLightModelfv(GL_LIGHT_MODEL_AMBIENT, ambientGlobal);

The material has an emission term as well, that is meant to model objects that can give off their
own light. However, no light is actually cast on other objects in the scene.
glMaterialfv(GL_FRONT, GL_EMISSION, em);

The global ambient term is multiplied by the current material ambient value and added to the
material’s emission value. The contribution from each light is then added to this value.

When rendering an object, normals should be provided for each face or for each vertex so that
lighting can be computed:


glNormal3f(nx, ny, nz);


glVertex3f(x, y, z);


9 Shading
Goal: To use the lighting and reflectance model to shade facets of a polygonal mesh — that is, to
assign intensities to pixels to give the impression of opaque surfaces rather than wireframes.

Assume we’re given the following:


• ēw - center of projection in world coordinates
• l̄w - point light source location
• Ia , Id - intensities of ambient and directional light sources
• ra , rd , rs - coefficients for ambient, diffuse, and specular reflections
• α - exponent to control width of highlights

9.1 Flat Shading


With flat shading, each triangle of a mesh is filled with a single color.

For a triangle with counterclockwise vertices p̄1 , p̄2 , and p̄3 , as seen from the outside, let the
midpoint be p̄ = (1/3)(p̄1 + p̄2 + p̄3 ), with normal
~n = ( (p̄2 − p̄1 ) × (p̄3 − p̄1 ) ) / || (p̄2 − p̄1 ) × (p̄3 − p̄1 ) ||. Then we may find the
intensity at p̄ using the Phong model and fill the polygon with that:

    E = Ĩa ra + rd Ĩd max(0, ~n · ~s) + rs Ĩd max(0, ~r · ~c)^α ,                   (41)

where ~s = (l̄w − p̄)/||l̄w − p̄||, ~c = (ēw − p̄)/||ēw − p̄||, and ~r = −~s + 2(~s · ~n)~n.

Flat shading is a simple approach to filling polygons with color, but it can be inaccurate for smooth
and shiny surfaces. For smooth surfaces, which are often tessellated and represented as
polyhedra, using flat shading can lead to a very strong faceting effect. In other words, the surface
looks very much like a polyhedron, rather than the smooth surface it's supposed to be. This is
because our visual system is very sensitive to variations in shading, and so using flat shading
makes faces really look flat.

9.2 Interpolative Shading


The idea of interpolative shading is to avoid computing the full lighting equation at each pixel by
interpolating quantities at the vertices of the faces.

Given vertices p̄1 , p̄2 , and p̄3 , we need to compute the normals for each vertex, compute the radi-
ances for each vertex, project onto the window in device coordinates, and fill the polygon using
scan conversion.


There are two methods used for interpolative shading:

Gouraud Shading The radiance values are computed at the vertices and then linearly interpo-
lated within each triangle. This is the form of shading implemented in OpenGL.

Phong shading The normal values at each vertex are linearly interpolated within each triangle,
and the radiance is computed at each pixel.

Gouraud shading is more efficient, but Phong shading is more accurate. When will Gouraud shad-
ing give worse results?

9.3 Shading in OpenGL


OpenGL only directly supports Gouraud shading or flat shading. Gouraud is enabled by default,
computing vertex colors and interpolating colors across triangle faces. Flat shading can be enabled
with glShadeModel(GL_FLAT). This renders an entire face with the color of a single vertex,
giving a faceted appearance.

Left: Flat shading of a triangle mesh in OpenGL. Right: Gouraud shading. Note that the mesh
appears smooth, although the coarseness of the geometry is visible at the silhouettes of the mesh.

With pixel shaders on programmable graphics hardware, it is possible to achieve Phong shading
by using a small program to compute the illumination at each pixel with interpolated normals. It
is even possible to use a normal map to assign arbitrary normals within faces, with a pixel shader
using these normals to compute the illumination.


10 Texture Mapping
10.1 Overview
We would like to give objects a more varied and realistic appearance through complex variations
in reflectance that convey textures. There are two main sources of natural texture:

• Surface markings — variations in albedo (i.e., the total light reflected from the ambient and
  diffuse components of reflection), and

• Surface relief — variations in 3D shape which introduce local variability in shading.

We will focus only on surface markings.

[Figure: examples of surface markings and surface relief.]

These main issues will be covered:

• Where textures come from,

• How to map textures onto surfaces,

• How texture changes reflectance and shading,

• Scan conversion under perspective warping, and

• Aliasing

10.2 Texture Sources


10.2.1 Texture Procedures
Textures may be defined procedurally. As input, a procedure requires a point on the surface of
an object, and it outputs the surface albedo at that point. Examples of procedural textures include
checkerboards, fractals, and noise.
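For instance, a checkerboard can be written as a tiny procedure over 2D surface coordinates (a sketch; the parameterization (u, v) and the albedo values are assumptions):

#include <math.h>

/* Return one of two albedos depending on which square of a checkerboard
   of the given size the surface coordinates (u, v) fall in. */
double checkerboard(double u, double v, double size,
                    double albedo0, double albedo1)
{
    int iu = (int)floor(u / size);
    int iv = (int)floor(v / size);
    return ((iu + iv) % 2 == 0) ? albedo0 : albedo1;
}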
