Computer Graphics Using OpenGL
Hill - Chapter 1
Finally, computer graphics often means the whole field of study that involves these tools and the pictures they produce. (So it's also used in the singular form: "computer graphics is...".) The field is often acknowledged to have started in the early 1960s with Ivan Sutherland's pioneering doctoral thesis at MIT on Sketchpad [ref]. Interest in graphics grew quickly, both in academia and industry, and there were rapid advances in display technology and in the algorithms used to manage pictorial information. The special interest group in graphics, SIGGRAPH1, was formed in 1969, and is very active today around the world. (The must-not-miss annual SIGGRAPH meeting now attracts 30,000 participants a year.) More can be found at http://www.siggraph.org. Today there are hundreds of companies around the world having some aspect of computer graphics as their main source of revenue, and the subject of computer graphics is taught in most computer science or electrical engineering departments.
Computer graphics is a very appealing field of study. You learn to write programs that create pictures, rather than streams of text or numbers. Humans respond readily to pictorial information, and are able to absorb much more information from pictures than from a collection of numbers. Our eye-brain systems are highly attuned to recognizing visual patterns. Reading text is of course one form of pattern recognition: we instantly recognize character shapes, form them into words, and interpret their meaning. But we are even more acute when glancing at a picture. What might be an inscrutable blather of numbers when presented as text becomes an instantly recognizable shape or pattern when presented graphically. The amount of information in a picture can be enormous. We not only recognize what's in it, but also glean a world of information from its subtle details and texture.
People study computer graphics for many reasons. Some just want a better set of tools for plotting curves and presenting the data they encounter in their other studies or work. Some want to write computer-animated games, while others are looking for a new medium for artistic expression. Everyone wants to be more productive, and to communicate ideas better, and computer graphics can be a great help.
There is also the input side. A program generates output (pictures or otherwise) from a combination of the algorithms executed in the program and the data the user inputs to the program. Some programs accept input crudely through characters and numbers typed at the keyboard. Graphics programs, on the other hand, emphasize more familiar types of input: the movement of a mouse on a desktop, the strokes of a pen on a drawing tablet, or the motion of the user's head and hands in a virtual reality setting. We examine many techniques of interactive computer graphics in this book; that is, we combine the techniques of natural user input with those that produce pictorial output.
(Section 1.2 on uses of Computer Graphics deleted.)
1 SIGGRAPH is a Special Interest Group in the ACM: the Association for Computing Machinery.
1.3.1. Polylines.
A polyline is a connected sequence of straight lines. Each of the examples in Figure 1.8 contains several polylines: a) one polyline extends from the nose of the dinosaur to its tail; b) the plot of the mathematical function is a single polyline; and c) the wireframe picture of a chess pawn contains many polylines that outline its shape.
A polyline is specified by the ordered list of its vertices:

	p1 = (x1, y1), p2 = (x2, y2), ..., pn = (xn, yn)		(1.1)
For instance, the polyline shown in Figure 1.10 is given by the sequence (2, 4), (2, 11), (6, 14), (12, 11), (12, 4), ... (What are the remaining vertices in this polyline?)
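In a program, such a polyline might be stored simply as an array of its vertices. A minimal C++ sketch (the Point2 type is illustrative, not from any library; the vertices elided above are left out here too):

struct Point2 { float x, y; };
// the first vertices of the polyline of Figure 1.10
Point2 poly[] = { {2, 4}, {2, 11}, {6, 14}, {12, 11}, {12, 4} }; // ...remaining vertices omitted
const int numVerts = sizeof(poly) / sizeof(poly[0]);   // number of vertices stored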
When a line is thick its ends have shapes, and a user must decide how two adjacent edges join. Figure
1.13 shows various possibilities. Case a) shows butt-end lines that leave an unseemly crack at the
joint. Case b) shows rounded ends on the lines so they join smoothly, part c) shows a mitered joint, and
part d) shows a trimmed mitered joint. Software tools are available in some packages to allow the user to
choose the type of joining. Some methods are quite expensive computationally.
Figure 1.13. Joining thick lines: a) butt ends, b) rounded ends, c) mitered joint, d) trimmed mitered joint. (Figure not reproduced.)
1.3.2. Text.
Some graphics devices have two distinct display modes, a text mode and a graphics mode. The text mode
is used for simple input/output of characters to control the operating system or edit the code in a program.
Text displayed in this mode uses a built-in character generator. The character generator is capable of drawing alphabetic, numeric, and punctuation characters, along with a small selection of special symbols. Usually these characters can't be placed arbitrarily on the display but only in some row and column of a built-in grid.
A graphics mode offers a richer set of character shapes, and characters can be placed arbitrarily. Figure
1.14 shows some examples of text drawn graphically.
Big Text
Little Text
Shadow Text
SMALLCAPS
Figure 1.14. Some text drawn graphically.
A tool to draw a character string might look like drawString(x, y, string); it places the starting point of the string at position (x, y), and draws the sequence of characters stored in the variable string.
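One way to implement such a tool is with the bitmap fonts built into GLUT (the OpenGL Utility Toolkit introduced in Chapter 2). A minimal sketch, assuming the built-in font GLUT_BITMAP_9_BY_15 (the choice of font is ours, not the text's):

#include <gl/glut.h>

void drawString(GLint x, GLint y, const char* string)
{
	glRasterPos2i(x, y);                 // set the start position of the string
	for (const char* p = string; *p != '\0'; p++)
		glutBitmapCharacter(GLUT_BITMAP_9_BY_15, *p);  // draw one character
}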
Text Attributes.
There are many text attributes, the most important of which are typeface, color, size, spacing, and
orientation.
Font. A font is a specific set of character shapes (a typeface) in a particular style and size. Figure 1.15
shows various character styles.
Times
Times bold
Times italic
Helvetica
Helvetica bold
Helvetica italic
Courier
Courier bold
Courier italic
Figure 1.15. Some examples of fonts.
The shape of each character can be defined by a polyline (or by more complicated curves such as Bezier curves; see Chapter 11), as shown in Figure 1.16a, or by an arrangement of dots, as shown in part b.
Graphics packages come with a set of predefined fonts, and additional fonts can be purchased from
companies that specialize in designing them.
Figure 1.16. A character shape ('B') defined by a polyline and by a pattern of dots.
Orientation of characters and strings: Characters may also be drawn tilted along some direction. Tilted
strings are often used to annotate parts of a graph. The graphic presentation of high-quality text is a
complex subject. Barely perceptible differences in detail can change pleasing text into ugly text. Indeed,
we see so much printed material in our daily lives that we subliminally expect characters to be displayed
with certain shapes, spacings, and subtle balances.
1.3.3. Filled Regions
The filled region (sometimes called fill area) primitive is a shape filled with some color or pattern. The
boundary of a filled region is often a polygon (although more complex regions are considered in Chapter
4). Figure 1.17 shows several filled polygons. Polygon A is filled with its edges visible, whereas B is filled
with its border left undrawn. Polygons C and D are non-simple. Polygon D even contains polygonal holes.
Such shapes can still be filled, but one must specify exactly what is meant by a polygon's interior, since
filling algorithms differ depending on the definition. Algorithms for performing the filling action are
discussed in Chapter 10.
Figure 1.19. a). A raster image of a chess piece. b). A blow-up of the image. (Raytracing courtesy of
Andrew Slater)
A raster image is stored in a computer as an array of numerical values. This array is thought of as being
rectangular, with a certain number of rows and a certain number of columns. Each numerical value
represents the value of the pixel stored there. The array as a whole is often called a pixel map. The term
bitmap is also used, (although some people think this term should be reserved for pixel maps wherein
each pixel is represented by a single bit, having the value 0 or 1.)
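In a program, a pixel map can be represented directly as a two-dimensional array. A minimal sketch (the class name and methods here are illustrative, not from any library):

const int NUM_ROWS = 17, NUM_COLS = 19;   // the grid size used in Figure 1.20

class Pixmap {
public:
	unsigned char pixel[NUM_ROWS][NUM_COLS];   // one gray-level value per pixel
	void setPixel(int row, int col, unsigned char value) { pixel[row][col] = value; }
	unsigned char getPixel(int row, int col) const { return pixel[row][col]; }
};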
Figure 1.20 shows a simple example where a figure is represented by a 17 by 19 array (17 rows by 19
columns) of cells in three shades of gray. Suppose the three gray levels are encoded as the values 1, 2, and
7. Figure 1.20b shows the numerical values of the pixel map for the upper left 6 by 8 portion of the image.
Figure 1.20. a) A simple figure represented on a 17-by-19 grid in three shades of gray (not reproduced); b) the pixel values of the upper left 6-by-8 portion:

	2 2 2 2 2 2 2 2
	2 2 2 2 2 2 2 7
	2 2 2 2 2 7 7 1
	2 2 2 2 7 1 1 1
	2 2 2 7 1 1 1 1
	2 2 7 1 1 1 7 7
Figure 1.21. a). a collection of lines and text. b). Blow-up of part a, having jaggies.
3). Scanned images.
A photograph or television image can be digitized as described above. In effect a grid is placed over the
original image, and at each grid point the digitizer reads into memory the closest color in its repertoire.
The bitmap is then stored in a file for later use. The image of the kitten in Figure 1.22 was formed this way.
Figure 1.23. Successive blow-ups of the kitten image: a) three-times enlargement; b) six-times enlargement.
[Figure: a) a bilevel image and its bitmap of 0s and 1s; the numeric array is not reproduced.]
Figure 1.27 shows 16 gray levels ranging from black to white. Each of the sixteen possible pixel values is
associated with a binary 4-tuple such as 0110 or 1110. Here 0000 represents black, 1111 denotes white,
and the other 14 values represent gray levels in between.
Figure 1.27. Sixteen gray levels ranging from black (0000) to white (1111), in order of increasing brightness.
Figure 1.28. The image reduced to 6 bits/pixel and 5 bits/pixel.
But there is a significant loss in quality in the images of Figure 1.29. Part a shows the effect of truncating
each pixel value to 4 bits, so there are only 16 possible shades of gray. For example, pixel value 01110100
is replaced with 0111. In part b the eight possible levels of gray are clearly visible. Note that some areas of the figure that show gradations of gray in the original now show a "lake" of uniform gray. This effect is often called banding: areas that should show a gradual shift in gray level instead show a sequence of uniform gray bands.
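The truncation itself is a simple shift. A sketch (the function name is ours):

// Keep only the b most significant bits of an 8-bit gray value; e.g.
// truncateGray(0x74, 4) reduces pixel value 01110100 to 0111, as in the text.
unsigned char truncateGray(unsigned char value, int b)
{
	return value >> (8 - b);
}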
Thousands are available on the Internet, frequently as GIF, JPEG, or TIFF images.
The number of bits used to represent the color of each pixel is often called its color depth. Each value in
the (red, green, blue) 3-tuple has a certain number of bits, and the color depth is the sum of these values.
A color depth of 3 allows one bit for each component. For instance the pixel value (0, 1, 1) means that the
red component is off, but both green and blue are on. In most displays the contributions from each
component are added together (see Chapter 12 for exceptions such as in printing), so (0,1,1) would
represent the addition of green and blue light, which is perceived as cyan. Since each component can be
on or off there are eight possible colors, as tabulated in Figure 1.31. As expected, equal amounts of red,
green, and blue, (1, 1, 1), produce white.
	color value	displayed
	(0, 0, 0)	black
	(0, 0, 1)	blue
	(0, 1, 0)	green
	(0, 1, 1)	cyan
	(1, 0, 0)	red
	(1, 0, 1)	magenta
	(1, 1, 0)	yellow
	(1, 1, 1)	white
Figure 1.31. A common correspondence between color value and perceived color.
A color depth of 3 rarely offers enough precision for specifying the value of each component, so larger
color depths are used. Because a byte is such a natural quantity to manipulate on a computer, many
images have a color depth of eight. Each pixel then has one of 256 possible colors. A simple approach
allows 3 bits for each of the red and the green components, and 2 bits for the blue component. But more
commonly the association of each byte value to a particular color is more complicated, and uses a color
look-up table, as discussed in the next section.
The highest quality images, known as true color images, have a color depth of 24, and so use a byte for each component. This seems to achieve as good color reproduction as the eye can perceive: more bits don't improve an image. But such images require a great deal of memory: three bytes for every pixel. A high quality image of 1080 by 1024 pixels requires over three million bytes!
Plates 19 through 21 show some color raster images having different color depths. Plate 19 shows a full
color image with a color depth of 24 bits. Plate 20 shows the degradation this image suffers when the
color depth is reduced to 8 by simply truncating the red and green components to 3 bits each, and the blue
component to 2 bits. Plate 21 also has a color depth of 8, so its pixels contain only 256 colors, but the 256
particular colors used have been carefully chosen for best reproduction. Methods to do this are discussed
in Chapter 12.
Plate 19 (author-supplied). Image with 24 bits/pixel.
Plate 20 (author-supplied). Image with 3 bits for the red and green components, and 2 bits for blue.
Plate 21 (author-supplied). Image with 256 carefully chosen colors.
The horizontal coordinate sx increases from left to right, and the vertical coordinate sy increases from top to bottom. This upside-down coordinate system is typical of raster devices.
Figure 1.35. The built-in coordinate system for the surface of a raster display.
Raster displays are always connected one way or another to a frame buffer, a region of memory
sufficiently large to hold all of the pixel values for the display (i.e. to hold the bitmap). The frame buffer
may be physical memory on-board the display, or it may reside in the host computer. For example, a
graphics card that is installed in a personal computer actually houses the memory required for the frame
buffer.
Figure 1.36 suggests how an image is created and displayed. The graphics program is stored in system
memory and is executed instruction by instruction by the central processing unit (CPU). The program
computes appropriate values for each pixel in the desired image and loads them into the frame buffer.
(This is the part we focus on later when it comes to programming: building tools that write the correct
pixel values into the frame buffer. ) A scan controller takes care of the actual display process. It runs
autonomously (rather than under program control), and does the same thing pixel after pixel. It causes the
frame buffer to send each pixel through a converter to the appropriate physical spot on the display
surface. The converter takes a pixel value such as 01001011 and converts it to the corresponding quantity
that produces a spot of color on the display.
Figure 1.36. Block diagram of a computer with raster display.
Figure 1.37. Scanning out an image from the frame buffer to the display surface.
The scan controller sends logical address (136, 252) to the frame buffer, which emits the value
mem[136][252]. The controller also simultaneously addresses a physical (geometric) position (136, 252)
on the display surface. Position (136, 252) corresponds to a certain physical distance of 136 units
horizontally, and 252 units vertically, from the upper left hand corner of the display surface. Different
raster displays use different units.
The value mem[136][252] is converted to a corresponding intensity or color in the conversion circuit, and
the intensity or color is sent to the proper physical position (136, 252) on the display surface.
To scan out the image in the entire frame buffer, every pixel value is visited once, and its corresponding
spot on the display surface is excited with the proper intensity or color.
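Conceptually, one scan-out amounts to a double loop over the pixel map. In this sketch setSpot() and convertToColor() are hypothetical names standing in for the display hardware and the conversion circuit; they are not real functions:

// One conceptual scan-out of a 640-by-480 frame buffer: visit every pixel
// once and excite the corresponding spot on the display surface.
for (int y = 0; y < 480; y++)        // row by row
	for (int x = 0; x < 640; x++)
		setSpot(x, y, convertToColor(mem[x][y]));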
In some devices this scanning must be repeated many times per second, in order to "refresh" the picture.
The video monitor to be described next is such a device.
With these generalities laid down, we look briefly at some specific raster devices, and see the different
forms that arise.
Video Monitors.
Video monitors are based on a CRT, or cathode-ray tube, similar to the display in a television set. Figure
1.38 adds some details to the general description above for a system using a video monitor as the display
device. In particular, the conversion process from pixel value to spot of light is illustrated. The system
shown has a color depth of 6 bits; the frame buffer is shown as having six bit planes. Each pixel uses
one bit from each of the planes.
Figure 1.38. A raster system using a video monitor: pixel values from the six-plane frame buffer pass through DACs that drive the red, green, and blue electron beam guns, while the scan controller and deflection coils steer the beam spot across the screen. (Figure not reproduced.)
At the other extreme, there are monochrome video displays, which display a single color in different intensities. A single DAC converts pixel values in the frame buffer to voltage levels, which drive a single electron beam gun. The CRT has only one type of phosphor, so it can produce various intensities of only one color. Note that 6 planes of memory in the frame buffer gives 2^6 = 64 levels of gray.
The color display of Figure 1.39 has a fixed association between pixel value and displayed color. For instance, the pixel value 001101 sends 00 to the red DAC, 11 to the green DAC, and 01 to the blue DAC, producing a mix of bright green and dark blue: a bluish-green. Similarly, 110011 is displayed as a bright magenta, and 000010 as a medium-bright blue.
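The routing of bits to the DACs is simple bit manipulation. A sketch in C (the variable names are ours):

// Split a 6-bit pixel value of the form rrggbb into its three 2-bit DAC fields.
unsigned int pixelValue = 0x0D;                // binary 001101, as in the text
unsigned int red   = (pixelValue >> 4) & 0x3;  // top two bits: 00
unsigned int green = (pixelValue >> 2) & 0x3;  // middle two bits: 11
unsigned int blue  =  pixelValue       & 0x3;  // bottom two bits: 01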
1.4.3. Indexed Color and the LUT.
Some systems are built using an alternative method of associating pixel values with colors. They
use a color lookup table (or LUT), which offers a programmable association between pixel
value and final color. Figure 1.40 shows a simple example. The color depth is again six, but the
six bits stored in each pixel go through an intermediate step before they drive the CRT. They are
used as an index into a table of 64 values, say LUT[0]...LUT[63]. (Why are there exactly 64
entries in this LUT?) For instance, if a pixel value is 39, the values stored in LUT[39] are used to drive the DACs, as opposed to having the bits in the value 39 itself drive them. As shown, LUT[39] contains the 15-bit value 01010 11001 10010. Five of these bits (01010) are routed to drive the red DAC, five others drive the green DAC, and the last five drive the blue DAC.
Figure 1.40. A frame buffer of six bit planes with a 64-entry lookup table: the pixel value (here 39) acts as an index into the LUT, and the entry LUT[39] drives the DACs. (Figure not reproduced.)
Each time the frame buffer is scanned out to the display, this pixel is read as value 39, which
causes the value stored in LUT[39] to be sent to the DACs.
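In code, the indirection is a single array lookup. A sketch (the 5-bit fields and the names are modeled on the example above, not on any particular hardware):

// Indexed color: the pixel value selects a LUT entry, and the entry's three
// fields (5 significant bits each here) drive the DACs.
struct LUTEntry { unsigned char red, green, blue; };
LUTEntry LUT[64];                              // one entry per pixel value 0..63

LUTEntry colorFor(unsigned char pixelValue)    // pixelValue lies in 0..63
{
	return LUT[pixelValue];  // the entry, not the pixel value itself, drives the DACs
}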
This programmability offers a great deal of flexibility in choosing colors, but of course it comes
at a price: the program (or programmer) has to figure out which colors to use! We consider this
further in Chapter 10.
What is the potential of this system for displaying colors? In the system of Figure 1.40 each entry of the LUT consists of 15 bits, so each color can be set to one of 2^15 = 32K = 32,768 possible colors. The set of 2^15 possible colors displayable by the system is called its palette, so we say this display has a palette of 32K colors.
The problem is that each pixel value lies in the range 0..63, and only 64 different colors can be stored in the LUT at one time. Therefore this system can display a maximum of 64 different colors at one time. "At one time" here means during one scan-out of the entire frame buffer, which takes something like 1/60th of a second. The contents of the LUT are not changed in the middle of a scan-out of the image, so one whole scan-out uses a fixed set of 64 palette colors. Usually the LUT contents remain fixed for many scan-outs, although a program can change the contents of the LUT during the brief dormant period between two successive scan-outs.
In more general terms, suppose that a raster display system has a color depth of b bits (so there are b bit planes in its frame buffer), and that each LUT entry is w bits wide. Then we have that:

The system can display 2^w colors, any 2^b at one time.
Examples.
(1) A system with b = 8 bit planes and a LUT width w = 12 can display 2^12 = 4096 colors, any 256 of them at a time.
(2) A system with b = 8 bit planes and a LUT width w = 24 can display 2^24 = 16,777,216 colors, any 256 at a time.
(3) If b = 12 and w = 18, the system can display 2^18 = 256K = 262,144 colors, 4096 at a time.
There is no enforced relationship between the number of bit planes, b, and the width of the LUT, w. Normally w is a multiple of 3, so the same number of bits (w/3) drives each of the three DACs. Also, b never exceeds w, so the palette is at least as large as the number of colors that can be displayed at one time. (Why would you never design a system with w < b?)
Note that the LUT itself requires very little memory, only 2^b words of w bits each. For example, if b = 12 and w = 18 there are only 2^12 × 18 bits, or 9,216 bytes, of storage in the LUT.
So what is the motivation for having a LUT in a raster display system? It is usually a need to
reduce the cost of memory. Increasing b increases significantly the amount of memory needed
for the frame buffer, mainly because there are so many pixels. The tremendous amount of
memory needed can add significantly to the cost of the overall system.
To compare the costs of two systems, one with a LUT and one without, Figure 1.41 shows an example of two 1024-by-1280-pixel displays (so each of them supports about 1.3 million pixels). Both systems allow colors to be defined with a precision of 24 bits, often called true color.
Figure 1.41. Comparing the frame buffer memory needed by a true-color display (three 8-bit components per pixel) and by an indexed display (8 bits per pixel plus a LUT with 8 bits per component), each supporting about 1.3 million pixels. (Figure not reproduced.)
Buttons. Sometimes a separate bank of buttons is installed on a workstation. The user presses one of the
buttons to perform a choice input function.
Mouse. The mouse is perhaps the most familiar input device of all, as it is easy and comfortable to operate. As the user slides the mouse over the desktop, the mouse sends the changes in its position to the workstation. Software within the workstation keeps track of the mouse's position and moves a graphics cursor (a small dot or cross on the screen) accordingly. The mouse is most often used to perform a locate or a pick function. There are usually some buttons on the mouse that the user can press to trigger the action.
Tablet. Like a mouse, a tablet is used to generate locate or pick input primitives. A tablet provides
an area on which the user can slide a stylus. The tip of the stylus contains a microswitch. By pressing
down on the stylus the user can trigger the logical function.
The tablet is particularly handy for digitizing drawings: the user can tape a picture onto the tablet surface and then move the stylus over it, pressing down to send each new point to the workstation. A menu area is sometimes printed on the tablet surface, and the user picks a menu item by pressing down the stylus inside one of the menu item boxes. Suitable software associates each menu item box with the desired function for the application that is running.
Space Ball and Data Glove. The Space Ball and Data Glove are relatively new input devices. Both are
designed to give a user explicit control over several variables at once, by performing hand and finger
motions. Sensors inside each device pick up subtle hand motions and translate them into Valuator values
that get passed back to the application. They are particularly suited to situations where the hand
movements themselves make sense in the context of the program, such as when the user is controlling a
virtual robot hand, and watching the effects of such motions simulated on the screen.
(Section 1.6 Summary - deleted.)
Hill - Chapter 2
In part b) a more modern "window-based" system is shown. It can support a number of different rectangular
windows on the display screen at one time. Initialization involves creating and opening a new window
(which we shall call the screen window1) for graphics. Graphics commands use a coordinate system that is
attached to the window: usually x increases to the right and y increases downward2. Part c) shows a variation
where the initial coordinate system is right side up, with y increasing upward3.
Each system normally has some elementary drawing tools that help to get started. The most basic has a name
like setPixel(x, y, color): it sets the individual pixel at location (x, y) to the color specified by color.
It sometimes goes by different names, such as putPixel(), SetPixel(), or drawPoint(). Along with
setPixel() there is almost always a tool to draw a straight line, line(x1, y1, x2, y2), that draws a
line between (x1, y1) and (x2, y2). In other systems it might be called drawLine() or Line(). The
commands
line(100, 50, 150, 80);
line(150, 80, 0, 290);
would draw the pictures shown in each system in Figure 2.1. Other systems have no line() command, but
rather use moveto(x, y) and lineto(x, y). They stem from the analogy of a pen plotter, where the
pen has some current position. The notion is that moveto(x, y) moves the pen invisibly to location (x,
y), thereby setting the current position to (x, y); lineto(x, y) draws a line from the current position to (x,
y), then updates the current position to this (x, y). Each command moves the pen from its current position to a
new position. The new position then becomes the current position. The pictures in Figure 2.1 would be drawn
using the commands
moveto(100, 50);
lineto(150, 80);
lineto(0, 290);
For a particular system the energetic programmer can develop a whole toolkit of sophisticated functions that
utilize these elementary tools, thereby building up a powerful library of graphics routines. The final graphics
applications are then written making use of this personal library.
An obvious problem is that each graphics display uses different basic commands to drive it, and every
environment has a different collection of tools for producing the graphics primitives. This makes it difficult to
port a program from one environment to another (and sooner or later everyone is faced with reconstructing a
program in a new environment): the programmer must build the necessary tools on top of the new
environment's library. This may require major alterations in the overall structure of a library or application,
and significant programmer effort.
2.1.1. Device Independent Programming, and OpenGL.
It is a boon when a uniform approach to writing graphics applications is made available, such that the same
program can be compiled and run on a variety of graphics environments, with the guarantee that it will
produce nearly identical graphical output on each display. This is known as device independent graphics
programming. OpenGL offers such a tool. Porting a graphics program only requires that you install the
appropriate OpenGL libraries on the new machine; the application itself requires no change: it calls the same
functions in this library with the same parameters, and the same graphical results are produced. The OpenGL
way of creating graphics has been adopted by a large number of industrial companies, and OpenGL libraries
exist for all of the important graphics environments4.
OpenGL is often called an application programming interface (API): the interface is a collection of
routines that the programmer can call, along with a model of how the routines work together to produce
graphics. The programmer sees only the interface, and is therefore shielded from having to cope with the
specific hardware or software idiosyncrasies on the resident graphics system.
1 The word "window" is overused in graphics: we shall take care to distinguish the various instances of the term.
2 Example systems are Unix workstations using X Windows, an IBM PC running Windows 95 using the basic Windows Application Programming Interface, and an Apple Macintosh using the built-in QuickDraw library.
3 An example is any window-based system using OpenGL.
4 Appendix 1 discusses how to obtain and get started with OpenGL in different environments.
OpenGL is at its most powerful when drawing images of complex three dimensional (3D) scenes, as we shall
see. It might be viewed as overkill for simple drawings of 2D objects. But it works well for 2D drawing, too,
and affords a unified approach to producing pictures. We start by using the simpler constructs in OpenGL,
capitalizing for simplicity on the many default states it provides. Later when we write programs to produce
elaborate 3D graphics we tap into OpenGL's more powerful features.
Although we will develop most of our graphics tools using the power of OpenGL, we will also look under
the hood and examine how the classical graphics algorithms work. It is important to see how such tools
might be implemented, even if for most applications you use the ready-made OpenGL versions. In special
circumstances you may wish to use an alternative algorithm for some task, or you may encounter a new
problem that OpenGL does not solve. You also may need to develop a graphics application that does not use
OpenGL at all.
2.1.2. Windows-based programming.
As described above, many modern graphics systems are windows-based, and manage the display of multiple
overlapping windows. The user can move the windows around the screen using the mouse, and can resize
them. Using OpenGL we will do our drawing in one of these windows, as we saw in Figure 2.1c.
Event-driven programming.
Another property of most windows-based programs is that they are event-driven. This means that the program
responds to various events, such as a mouse click, the press of a keyboard key, or the resizing of a screen
window. The system automatically manages an event queue, which receives messages that certain events have
occurred, and deals with them on a first-come first-served basis. The programmer organizes a program as a
collection of callback functions that are executed when events occur. A callback function is created for each
type of event that might occur. When the system removes an event from the queue it simply executes the
callback function associated with the type of that event. For programmers used to building programs with a "do this, then do this" structure, some rethinking is required. The new structure is more like: do nothing until an event occurs, then do the specified thing.
The method of associating a callback function with an event type is often quite system dependent. But
OpenGL comes with a Utility Toolkit (see Appendix 1), which provides tools to assist with event
management. For instance
glutMouseFunc(myMouse);
registers the function myMouse() as the function to be executed when a mouse event occurs. The prefix
glut indicates it is part of the OpenGL Utility Toolkit. The programmer puts code in myMouse() to handle
all of the possible mouse actions of interest.
Figure 2.2 shows a skeleton of an example main() function for an event-driven program. We will base most of our programs in this book on this skeleton. There are four principal types of events we will work with, and a glut function is available for each:
void main()
{
	initialize things5
	create a screen window
	glutDisplayFunc(myDisplay);	// register the redraw function
	glutReshapeFunc(myReshape);	// register the reshape function
	glutMouseFunc(myMouse);		// register the mouse action function
	glutKeyboardFunc(myKeyboard);	// register the keyboard action function
	perhaps initialize other things
	glutMainLoop();			// enter the unending main loop
}
all of the callback functions are defined here
Figure 2.2. A skeleton of an event-driven program using OpenGL.
5 Notes shown in italics in code fragments are pseudocode rather than actual program code. They suggest the actions
that real code substituted there should accomplish.
If a particular program does not use mouse interaction, the corresponding callback function need not be
registered or written. Then mouse clicks have no effect in the program. The same is true for programs that have
no keyboard interaction.
The final function shown in Figure 2.2 is glutMainLoop(). When this is executed the program draws the initial picture and enters an unending loop, in which it simply waits for events to occur. (A program is normally terminated by clicking in the "go away" box that is attached to each window.)
2.1.3. Opening a Window for Drawing.
The first task is to open a screen window for drawing. This can be quite involved, and is system dependent.
Because OpenGL functions are device independent, they provide no support for window control on specific
systems. But the OpenGL Utility Toolkit introduced above does include functions to open a window on
whatever system you are using.
Figure 2.3 fleshes out the skeleton above to show the entire main() function for a program that will draw
graphics in a screen window. The first five function calls use the toolkit to open a window for drawing with
OpenGL. In your first graphics programs you can just copy these as is: later we will see what the various
arguments mean and how to substitute others for them to achieve certain effects. The first five functions
initialize and display the screen window in which our program will produce graphics. We give a brief
description of what each one does.
// appropriate #includes go here (see Appendix 1)
void main(int argc, char** argv)
{
glutInit(&argc, argv); // initialize the toolkit
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB); // set the display mode
glutInitWindowSize(640,480); // set window size
glutInitWindowPosition(100, 150); // set the window position on screen
glutCreateWindow("my first attempt"); // open the screen window
// register the callback functions
glutDisplayFunc(myDisplay);
glutReshapeFunc(myReshape);
glutMouseFunc(myMouse);
glutKeyboardFunc(myKeyboard);
myInit();
glutMainLoop();
}
Figure 2.3. Code using the OpenGL utility toolkit to open the initial window for drawing.
glutInit(&argc, argv); This function initializes the toolkit. Its arguments are the standard ones
for passing command line information; we will make no use of them here.
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB); This function specifies how the display should be initialized. The built-in constants GLUT_SINGLE and GLUT_RGB, which are OR'd together, indicate that a single display buffer should be allocated and that colors are specified using desired amounts of red, green, and blue. (Later we will alter these arguments: for example, we will use double buffering for smooth animation.)
glutInitWindowSize(640,480); This function specifies that the screen window should initially
be 640 pixels wide by 480 pixels high. When the program is running the user can resize this window as
desired.
glutInitWindowPosition(100, 150); This function specifies that the window's upper left corner should be positioned on the screen 100 pixels from the left edge and 150 pixels down from the top. When the program is running the user can move this window wherever desired.
glutCreateWindow("my first attempt"); This function actually opens and displays the
screen window, putting the title my first attempt in the title bar.
The remaining functions in main() register the callback functions as described earlier, perform any
initializations specific to the program at hand, and start the main event loop processing. The programmer
(you) must implement each of the callback functions as well as myInit().
[Figure: the anatomy of an OpenGL command name such as glVertex2i(): the prefix gl names the gl library, Vertex is the basic command, the suffix 2 gives the number of arguments, and the suffix i gives the type of the arguments.]
like GLint or GLfloat for OpenGL types. The OpenGL types are listed in Figure 2.7. Some of these types
will not be encountered until later in the book.
	suffix	data type		typical C or C++ type		OpenGL type name
	b	8-bit integer		signed char			GLbyte
	s	16-bit integer		short				GLshort
	i	32-bit integer		int or long			GLint, GLsizei
	f	32-bit floating point	float				GLfloat, GLclampf
	d	64-bit floating point	double				GLdouble, GLclampd
	ub	8-bit unsigned number	unsigned char			GLubyte, GLboolean
	us	16-bit unsigned number	unsigned short			GLushort
	ui	32-bit unsigned number	unsigned int or unsigned long	GLuint, GLenum, GLbitfield
Figure 2.7. Command suffixes and argument data types.
As an example, a function using suffix i expects a 32-bit integer, but your system might translate int as a 16-bit integer. Therefore, if you wished to encapsulate the OpenGL commands for drawing a dot in a generic function such as drawDot()6, you might be tempted to declare its parameters as plain int; it is safer to use OpenGL's own type GLint, which is guaranteed to match what glVertex2i() expects.
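A minimal sketch of such a utility, using GLint (an assumption about the shape of the listing, which is not reproduced in this copy):

// drawDot() encapsulates the OpenGL calls for drawing a single dot.
// Declaring the parameters as GLint guarantees they match glVertex2i().
void drawDot(GLint x, GLint y)
{
	glBegin(GL_POINTS);
		glVertex2i(x, y);
	glEnd();
}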
The drawing color is set with glColor3f(red, green, blue); for example:

glColor3f(1.0, 0.0, 0.0);	// set drawing color to red
glColor3f(0.0, 0.0, 0.0);	// set drawing color to black
glColor3f(1.0, 1.0, 1.0);	// set drawing color to white
glColor3f(1.0, 1.0, 0.0);	// set drawing color to yellow
6 Using this function instead of the specific OpenGL commands makes a program more readable. It is not unusual to
build up a personal collection of such utilities.
The background color is set with glClearColor(red, green, blue, alpha), where alpha
specifies a degree of transparency and is discussed later (use 0.0 for now.) To clear the entire window to the
background color, use glClear(GL_COLOR_BUFFER_BIT). The argument GL_COLOR_BUFFER_BIT is
another constant built into OpenGL.
Establishing the Coordinate System.
Our method for establishing our initial choice of coordinate system will seem obscure here, but will become
clearer in the next chapter when we discuss windows, viewports, and clipping. Here we just take the few
required commands on faith. The myInit() function in Figure 2.9 is a good place to set up the coordinate
system. As we shall see later, OpenGL routinely performs a large number of transformations. It uses matrices
to do this, and the commands in myInit() manipulate certain matrices to accomplish the desired goal. The
gluOrtho2D() routine sets the transformation we need for a screen window that is 640 pixels wide by 480
pixels high.
void myInit(void)
{
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluOrtho2D(0, 640.0, 0, 480.0);
}
Figure 2.9. Establishing a simple coordinate system.
Putting it together: A Complete OpenGL program.
Figure 2.10 shows a complete program that draws the lowly three dots of Figure 2.5. It is easily extended to
draw more interesting objects as we shall see. The initialization in myInit() sets up the coordinate system,
the point size, the background color, and the drawing color. The drawing is encapsulated in the callback
function myDisplay(). As this program is non-interactive, no other callback functions are used. glFlush() is called after the dots are drawn to ensure that all data is completely processed and sent to the display. This is important in some systems that operate over a network: data is buffered on the host machine and only sent to the remote display when the buffer becomes full or a glFlush() is executed.
#include <windows.h>	// use as needed for your system
#include <gl/Gl.h>
#include <gl/glut.h>

//<<<<<<<<<<<<<<<<<<<<<<< myInit >>>>>>>>>>>>>>>>>>>>
void myInit(void)
{
	glClearColor(1.0, 1.0, 1.0, 0.0);	// set white background color
	glColor3f(0.0f, 0.0f, 0.0f);		// set the drawing color
	glPointSize(4.0);			// a dot is 4 by 4 pixels
	glMatrixMode(GL_PROJECTION);
	glLoadIdentity();
	gluOrtho2D(0.0, 640.0, 0.0, 480.0);
}
//<<<<<<<<<<<<<<<<<<<<<<<< myDisplay >>>>>>>>>>>>>>>>>
void myDisplay(void)
{
	glClear(GL_COLOR_BUFFER_BIT);		// clear the screen
	glBegin(GL_POINTS);
		glVertex2i(100, 50);		// draw three points
		glVertex2i(100, 130);
		glVertex2i(150, 130);
	glEnd();
	glFlush();				// send all output to display
}
//<<<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>
void main(int argc, char** argv)
{
	glutInit(&argc, argv);				// initialize the toolkit
	glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);	// set the display mode
	glutInitWindowSize(640, 480);			// set window size
	glutInitWindowPosition(100, 150);		// set the window position on screen
	glutCreateWindow("my first attempt");		// open the screen window
	glutDisplayFunc(myDisplay);			// register the redraw function
	myInit();
	glutMainLoop();
}
Figure 2.10. A complete OpenGL program to draw three dots.
[Figure: two arrangements, a) and b), of the three triangle corners T0, T1, T2 and a point p, used in the Sierpinski gasket algorithm. Figure not reproduced.]
It is convenient to define a simple class GLintPoint that describes a point whose coordinates are integers8:
class GLintPoint{
public:
GLint x, y;
};
We then build and initialize an array of three such points T[0], T[1], and T[2] to hold the three corners
of the triangle using GLintPoint T[3]= {{10,10},{300,30},{200, 300}}. There is no
need to store each point pk in the sequence as it is generated, since we simply want to draw it and then move
on. So we set up a variable point to hold this changing point. At each iteration point is updated to hold
the new value.
We use i = random(3) to choose one of the points T[i] at random. random(3) returns one of the values
0, 1, or 2 with equal likelihood. It is defined as9
int random(int m) { return rand() % m; }
Figure 2.14 shows the remaining details of the algorithm, which generates 1000 points of the Sierpinski
gasket.
void Sierpinski(void)
{
	GLintPoint T[3] = {{10, 10}, {300, 30}, {200, 300}};
	int index = random(3);			// 0, 1, or 2 equally likely
	GLintPoint point = T[index];		// initial point
	drawDot(point.x, point.y);		// draw initial point
	for(int i = 0; i < 1000; i++)		// draw 1000 dots
	{
		index = random(3);
		point.x = (point.x + T[index].x) / 2;
		point.y = (point.y + T[index].y) / 2;
		drawDot(point.x, point.y);
	}
	glFlush();
}
Figure 2.14. Generating the Sierpinski Gasket.
Example 2.2.3. Simple Dot Plots.
Suppose you wish to learn the behavior of some mathematical function f(x) as x varies. For example, how
does
	f(x) = e^(-x) cos(2πx)
vary for values of x between 0 and 4? A quick plot of f(x) versus x, such as that shown in Figure 2.15, can
reveal a lot.
8 If C rather than C++ is being used, a simple struct is useful here: typedef struct{GLint x,
y;}GLintPoint;
9 Recall that the standard function rand() returns a pseudorandom value in the range 0 to 32767. The modulo
function reduces it to a value in the range 0 to 2.
A plot like this is easily drawn by mapping each value x and the corresponding function value f(x) to a point on the screen:

	sx = A x + B,   sy = C f(x) + D		(2.1)
for properly chosen values of the constants A, B, C, and D. A and C perform scaling; B and D perform
shifting. This scaling and shifting is basically a form of affine transformation. We study affine
transformations in depth in Chapter 5. They provide a more consistent approach that maps any specified range
in x and y to the screen window.
We need only set the values of A, B, C, and D appropriately, and draw the dot-plot using:
GLdouble A, B, C, D, x;
A = screenWidth / 4.0;
B = 0.0;
C = screenHeight / 2.0;
D = C;
glBegin(GL_POINTS);
for(x = 0; x < 4.0 ; x += 0.005)
glVertex2d(A * x + B, C * f(x) + D);
glEnd();
glFlush();
Figure 2.16 shows the entire program to draw the dot plot, to illustrate how the various ingredients fit together. The
initializations are very similar to those for the program that draws three dots in Figure 2.10. Notice that the width and
height of the screen window are defined as constants, and used where needed in the code.
#include <windows.h>	// use proper includes for your system
#include <math.h>
#include <gl/Gl.h>
#include <gl/glut.h>

const int screenWidth = 640;	// width of screen window in pixels
const int screenHeight = 480;	// height of screen window in pixels
GLdouble A, B, C, D;		// values used for scaling and shifting

//<<<<<<<<<<<<<<<<<<<<<<< myInit >>>>>>>>>>>>>>>>>>>>
void myInit(void)
{
	glClearColor(1.0, 1.0, 1.0, 0.0);	// background color is white
	glColor3f(0.0f, 0.0f, 0.0f);		// drawing color is black
	glPointSize(2.0);			// a 'dot' is 2 by 2 pixels
	glMatrixMode(GL_PROJECTION);		// set the "camera shape"
	glLoadIdentity();
	gluOrtho2D(0.0, (GLdouble)screenWidth, 0.0, (GLdouble)screenHeight);
	A = screenWidth / 4.0;			// set values used for scaling and shifting
	B = 0.0;
	C = D = screenHeight / 2.0;
}
//<<<<<<<<<<<<<<<<<<<<<<<< myDisplay >>>>>>>>>>>>>>>>>
void myDisplay(void)
{
	glClear(GL_COLOR_BUFFER_BIT);		// clear the screen
	glBegin(GL_POINTS);
	for(GLdouble x = 0; x < 4.0; x += 0.005)
	{
		GLdouble func = exp(-x) * cos(2 * 3.14159265 * x);
		glVertex2d(A * x + B, C * func + D);
	}
	glEnd();
	glFlush();				// send all output to display
}
//<<<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>
void main(int argc, char** argv)
{
	glutInit(&argc, argv);				// initialize the toolkit
	glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);	// set display mode
	glutInitWindowSize(screenWidth, screenHeight);	// set window size
	glutInitWindowPosition(100, 150);		// set window position on screen
	glutCreateWindow("Dot Plot of a Function");	// open the screen window
	glutDisplayFunc(myDisplay);			// register redraw function
	myInit();
	glutMainLoop();
}
Figure 2.16. The complete dot-plot program.
default thickness is 1.0. Figure 2.17c shows stippled (dotted and dashed) lines. The details of stippling are
addressed in Case Study 2.5 at the end of this chapter.
2.3.1. Drawing Polylines and Polygons.
Recall from Chapter 1 that a polyline is a collection of line segments joined end to end. It is described by an
ordered list of points, as in:
p0 = (x0, y0), p1 = (x1, y1), ... , pn = (xn, yn).
(2.3.1)
In OpenGL a polyline is called a line strip, and is drawn by specifying the vertices in turn between
glBegin(GL_LINE_STRIP) and glEnd(). For example, the code:
glBegin(GL_LINE_STRIP);
glVertex2i(20,10);
glVertex2i(50,10);
glVertex2i(20,80);
glVertex2i(50,80);
glEnd();
glFlush();
produces the polyline shown in Figure 2.18a. Attributes such as color, thickness and stippling may be applied
to polylines in the same way they are applied to single lines. If it is desired to connect the last point with the
first point to make the polyline into a polygon simply replace GL_LINE_STRIP with GL_LINE_LOOP. The
resulting polygon is shown in Figure 2.18b.
glBegin(GL_LINE_STRIP);
for(x = 0; x <= 300; x += 3)
glVertex2d(A * x + B, C * f(x) + D);
glEnd();
glFlush();
Figure 2.20. Plotting a function using a line graph.
Example 2.3.2. Drawing Polylines stored in a file.
Most interesting pictures made up of polylines contain a rather large number of line segments. It's convenient to store a description of the polylines in a file, so that the picture can be redrawn at will. (Several interesting examples may be found on the Internet; see the Preface.)
It's not hard to write a routine that draws the polylines stored in a file. Figure 2.21 shows an example of what might be drawn.
might be drawn.
void drawPolyLineFile(char* fileName)
{
	fstream inStream;
	inStream.open(fileName, ios::in);	// open the file
	if(inStream.fail())
		return;
	glClear(GL_COLOR_BUFFER_BIT);		// clear the screen
	GLint numpolys, numLines, x, y;
	inStream >> numpolys;			// read the number of polylines
	for(int j = 0; j < numpolys; j++)	// read each polyline
	{
		inStream >> numLines;
		glBegin(GL_LINE_STRIP);		// draw the next polyline
		for(int i = 0; i < numLines; i++)
		{
			inStream >> x >> y;	// read the next x, y pair
			glVertex2i(x, y);
		}
		glEnd();
	}
	glFlush();
	inStream.close();
}
Figure 2.22. Drawing polylines stored in a file.
This version of drawPolyLineFile() does very little error checking. If the file cannot be opened (perhaps the wrong name was passed to the function), the routine simply returns. If the file contains bad data, such as real values where integers are expected, the results are unpredictable. The routine as given should be considered only as a starting point for developing a more robust version.
Example 2.3.3. Parameterizing Figures.
Figure 2.23 shows a simple house consisting of a few polylines. It can be drawn using code shown partially
in Figure 2.24. (What code would be suitable for drawing the door and window?)
This is not a very flexible approach. The position of each endpoint is hard-wired into this code, so hardwiredHouse() can draw only one house, in one size and one location. More flexibility is achieved if we parameterize the figure, and pass the parameter values to the routine. In this way we can draw families of objects, which are distinguished by different parameter values. Figure 2.25 shows this approach. The parameters specify the location of the peak of the roof, the width of the house, and its height. The details of drawing the chimney, door, and window are left as an exercise.
void parameterizedHouse(GLintPoint peak, GLint width, GLint height)
// the top of the house is at the peak; the size of the house is given
// by height and width
{
	glBegin(GL_LINE_LOOP);
		glVertex2i(peak.x, peak.y);				// draw shell of house
		glVertex2i(peak.x + width / 2, peak.y - 3 * height / 8);
		glVertex2i(peak.x + width / 2, peak.y - height);
		glVertex2i(peak.x - width / 2, peak.y - height);
		glVertex2i(peak.x - width / 2, peak.y - 3 * height / 8);
	glEnd();
	draw chimney in the same fashion
	draw the door
	draw the window
}
Figure 2.25. Drawing a parameterized house.
This routine may be used to draw a village as shown in Figure 2.26, by making successive calls to
parameterizedHouse() with different parameter values. (How is a house flipped upside down? Can
all of the houses in the figure be drawn using the routine given?)
moveto(x, y): set CP to (x, y)
lineto(x, y): draw a line from CP to (x, y), and then update CP to (x, y)
A line from (x1, y1) to (x2, y2) is therefore drawn using the two calls moveto(x1, y1);lineto(x2,
y2). A polyline based on the list of points (x0, y0), (x1, y1), ... , (xn-1, yn-1) is easily drawn using:
moveto(x[0], y[0]);
for(int i = 1; i < n; i++)
lineto(x[i], y[i]);
It is straightforward to build moveto() and lineto() on top of OpenGL. To do this we must define and
maintain our own CP. For the case of integer coordinates the implementation shown in Figure 2.29 would do
the trick.
GLintPoint CP;
// global current position
//<<<<<<<<<<<<< moveto >>>>>>>>>>>>>>
void moveto(GLint x, GLint y)
{
CP.x = x; CP.y = y; // update the CP
}
//<<<<<<<<<<<< lineto >>>>>>>>>>>>>>>>>
void lineto(GLint x, GLint y)
{
glBegin(GL_LINES); // draw the line
glVertex2i(CP.x, CP.y);
glVertex2i(x, y);
glEnd();
glFlush();
CP.x = x; CP.y = y; // update the CP
}
Figure 2.29. Defining moveto() and lineto() in OpenGL.
2.3.4. Drawing Aligned Rectangles.
A special case of a polygon is the aligned rectangle, so called because its sides are aligned with the
coordinate axes. We could create our own function to draw an aligned rectangle (how?), but OpenGL
provides the ready-made function:
glRecti(GLint x1, GLint y1, GLint x2, GLint y2);
// draw a rectangle with opposite corners (x1, y1) and (x2, y2);
// fill it with the current color;
This command draws the aligned rectangle based on two given points. In addition the rectangle is filled with
the current color. Figure 2.30 shows what is drawn by the code:
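As an illustration (the specific coordinates and colors here are hypothetical, not the original listing), two overlapping filled rectangles could be drawn with:

glColor3f(0.6f, 0.6f, 0.6f);	// a light gray fill color
glRecti(20, 20, 100, 70);	// one aligned rectangle
glColor3f(0.2f, 0.2f, 0.2f);	// a darker fill color
glRecti(70, 50, 150, 130);	// a second rectangle, overlapping the first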
10 Recall that random(N) returns a randomly-chosen value between 0 and N - 1 (see Appendix 3).
The principal properties of an aligned rectangle are its size, position, color, and shape. Its shape is
embodied in its aspect ratio, and we shall be referring to the aspect ratios of rectangles throughout the book.
The aspect ratio of a rectangle is simply the ratio of its width to its height:

	aspect ratio = width / height		(2.2)
2.3.5. Scaling and positioning a figure using parameters. Write the function void
drawDiamond(GLintPoint center, int size) that draws the simple diamond shown in Figure
2.33, centered at center, and having size size.
glBegin(GL_POLYGON);
	. . .
	glVertex2f(xn, yn);
glEnd();
It will be filled in the current color. It can also be filled with a stipple pattern (see Case Study 2.5), and later we will paint images into polygons as part of applying a texture.
Figure 2.36 shows a number of filled convex polygons. In Chapter 10 we will examine an algorithm for filling
any polygon, convex or not.
Recall that when the user presses or releases a mouse button, moves the mouse, or presses a keyboard key, an event occurs. Using the OpenGL Utility Toolkit (GLUT) the programmer can register a callback function with each of these events by using the following commands:
glutMouseFunc(myMouse) which registers myMouse() with the event that occurs when the mouse
button is pressed or released;
glutMotionFunc(myMovedMouse) which registers myMovedMouse() with the event that occurs
when the mouse is moved while one of the buttons is pressed;
glutKeyboardFunc(myKeyboard) which registers myKeyBoard() with the event that occurs when a
keyboard key is pressed.
The value of key is the ASCII value12 of the key pressed. The values x and y report the position of the mouse
at the time that the event occurred. (As before y measures the number of pixels down from the top of the
window.)
The programmer can capitalize on the many keys on the keyboard to offer the user a large number of choices to invoke at any point in a program. Most implementations of myKeyboard() consist of a large switch statement, with a case for each key of interest. Figure 2.41 shows one possibility. Pressing 'p' draws a dot at the mouse position; pressing the left arrow key adds a point to some (global) list, but does no drawing13; pressing 'E' exits from the program. Note that if the user holds down the 'p' key and moves the mouse around, a rapid sequence of points is generated to make a freehand drawing.
void myKeyboard(unsigned char theKey, int mouseX, int mouseY)
{
	GLint x = mouseX;
	GLint y = screenHeight - mouseY;	// flip the y value as always
	switch(theKey)
	{
		case 'p':
			drawDot(x, y);		// draw a dot at the mouse position
			break;
		case GLUT_KEY_LEFT:		// add a point to the list
			List[++last].x = x;
			List[ last].y = y;
			break;
		case 'E':
			exit(-1);		// terminate the program
		default:
			break;			// do nothing
	}
}
Figure 2.41. An example of the keyboard callback function.
2.5. Summary
The hard part in writing graphics applications is getting started: pulling together the hardware and software
ingredients in a program to make the first few pictures. The OpenGL application programmer interface (API)
helps enormously here, as it provides a powerful yet simple set of routines to make drawings. One of its great
virtues is device independence, which makes it possible to write programs for one graphics environment, and
use the same program without changes in another environment.
Most graphics applications are written today for a windows-based environment. The program opens a window
on the screen that can be moved and resized by the user, and it responds to mouse clicks and key strokes. We
saw how to use OpenGL functions that make it easy to create such a program.
Primitive drawing routines were applied to making pictures composed of dots, lines, polylines, and polygons,
and were combined into more powerful routines that form the basis of one's personal graphics toolkit. Several
examples illustrated the use of these tools, and described methods for interacting with a program using the
keyboard and mouse. The Case Studies presented next offer additional programming examples that delve deeper into the topics discussed so far, or branch out to interesting related topics.
12 ASCII stands for American Standard Code for Information Interchange. Tables of ASCII values are readily
available on the internet. Also see ascii.html in the web site for this book.
13 Names for the various special keyboard keys, such as the function keys, arrow keys, and home, may be found
in the include file glut.h.
Some of the Case Studies are simple exercises that only require fleshing out some pseudocode given in the
text, and then running the program through its paces. Others are much more challenging, and could be the
basis of a major programming project within a course. It is always difficult to judge how much time someone
else will need to accomplish any project. The Level of Effort that accompanies each Case Study is a rough
guess at best.
Level of Effort:
I: a simple exercise. It could be assigned for the next class.
II: an intermediate exercise. It probably needs several days for completion14.
III: An advanced exercise. It would probably be assigned for two weeks or so ahead.
Random values are often produced by iterating a formula of the form

	ni = (A ni-1 + B) mod N		(2.3)
where A, B, and N are suitably chosen constants. One set of numbers that works fairly well is: A =
1103515245, B = 12345, and N = 32767. Multiplying ni-1 by A and adding B forms a large value, and the
modulo operation brings the value into the range 0 to N-1. The process begins with some seed value chosen
for n0.
Because the numbers only give an appearance of randomness they are called pseudo-random numbers. The
choices of the values for A, B, and N are very important, and slightly different values give rise to very
different characteristics in the sequence of numbers. More details can be found in [knuth, weiss98].
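To make the recipe concrete, here is a minimal sketch of such a generator in C++. The function name random() and the file-scope variable holding the current value are illustrative choices for this sketch, not part of any standard library:

static unsigned long cur = 1; // the current value; starts at the seed n0

int random(int N) // return an apparently random value in the range 0..N-1
{
	const unsigned long A = 1103515245UL; // multiplier suggested in the text
	const unsigned long B = 12345UL;      // increment suggested in the text
	const unsigned long M = 32767UL;      // modulus (the N of Equation 2.3)
	cur = (A * cur + B) % M;              // form the next value from the last
	return (int)(cur % N);                // fold it into the range 0..N-1
}

Different seed values stored in cur produce different sequences, so a program can vary its output from run to run by seeding with, say, the system clock.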
Scatter Plots.
Some experiments yield data consisting of many pairs of numbers (ai, bi), and the goal is to infer visually how
the a-values and b-values are related. For instance, a large number of people are measured, and one
wonders if there is a strong correlation between a person's height and weight.
A scatter plot can be used to give visual insight into the data. The data for each person is plotted as a dot at
position (height, weight), so only the drawDot() tool is needed. Figure 2.42 shows an example. It suggests
that a person's height and weight are roughly linearly related, although some people (such as A) are
idiosyncratic, being very tall yet quite light.
14 A day of programming means several two-hour sessions, with plenty of thinking (and resting) time
between sessions. It also assumes a reasonably skilled programmer (with at least two semesters of
programming in hand), who is familiar with the idiosyncrasies of the language and the platform being used. It
does not allow for those dreadful hours we all know too well of being stuck with some obscure bug that
presents a brick wall of frustration until it is ferreted out and squashed.
Here we use scatter plots to visually test the quality of a random number generator. Each time the function
random(N) is called it returns a value in the range 0..N - 1 that is apparently chosen at random, unrelated to
values previously returned from random(N). But are successive values truly unrelated?
One simple test builds a scatter plot based on pairs of successive values returned by random(N). It calls
random(N) twice in succession, and plots the first value against the second. This can be done using
drawDot():
for(int i = 0; i < num; i++)
drawDot(random(N), random(N));
or in "raw" OpenGL by placing the for loop between glBegin() and glEnd():
glBegin(GL_POINTS);
for(int i = 0; i < num; i++)   // do it num times
	glVertex2i(random(N), random(N));
glEnd();
It is more efficient to do it the second way, which avoids the overhead associated with making many calls to
glBegin() and glEnd().
Figure 2.43 shows a typical plot that might result. There should be a uniform density of dots throughout the
square, to reassure you that the values 0..N-1 occur with about equal likelihood, and that there is no
discernible dependence between one value and its successor.
Figure 2.45. Taking the square root repetitively.
In this example the function being iterated is f(x) = √x, or symbolically f(.) = √(.). Other
functions f(.) can be used instead, such as:
f(.) = 2(.), the doubler: it doubles its argument;
f(.) = cos(.), the cosiner;
f(.) = 4(.)(1 - (.)), the logistic function, used in chaos theory (see Chapter 3);
f(.) = (.)² + c for a constant c, used to define the Mandelbrot set (see Chapter 8).
It is sometimes helpful to give a name to each number that emerges from the IFS. We call the k-th such
number dk, and say that the process begins at k = 0 by injecting the initial value d0 into the system. Then
the sequence of values generated by the IFS is:
d0
d1 = f(d0)
d2 = f(f(d0))
d3 = f(f(f(d0)))
...
so d3 is formed by applying function f(.) three times. This is called the third iterate of f() applied to the
initial value d0. More succinctly we can denote the k-th iterate of f() by
dk = f[k](d0)
(2.4)
meaning the value produced after f(.) has been applied k times to d0. (Note: it does not mean the value f(d0)
raised to the k-th power.) We can also use the recursive form and say:
dk = f(dk-1) for k = 1, 2, 3, ..., for a given value of d0.
This sequence of values d0, d1, d2, d3, ... is called the orbit of d0 for the system.
Example: The orbit of 64 for the function f(.) = √(.) is 64, 8, 2.8284, 1.68179, ..., and the orbit of 10000 is
10000, 100, 10, 3.162278, 1.77828, .... (What is the orbit of 0? What is the orbit of 0.1?)
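A few lines of C++ make the notion of an iterate concrete; the helper name iterate() is chosen here just for illustration:

#include <cmath>
#include <cstdio>

double iterate(double d0, int k) // the k-th iterate f[k](d0) for f(.) = sqrt(.)
{
	double d = d0;
	for(int i = 0; i < k; i++)
		d = sqrt(d); // dk = f(dk-1)
	return d;
}

int main()
{
	for(int k = 0; k <= 4; k++) // print the start of the orbit of 64
		printf("d%d = %g\n", k, iterate(64.0, k)); // 64, 8, 2.8284, ...
	return 0;
}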
Example: The orbit of 7 for the doubler f(.) = 2(.) is: 7, 14, 28, 56, 112, .... The k-th iterate is 7 times the k-th power of 2.
Example: The orbit of 1 for f(.) = sin(.) can be found using a hand calculator: 1, .8414, .7456, .6784, ... ,
which very slowly approaches the value 0. (What is the orbit of 1 for cos(.)? In particular, to what value does
the orbit converge?)
Project 1: Plotting the Hailstone sequence.
Consider iterating the intriguing function f(.):

f(x) = x/2 if x is even
f(x) = 3x + 1 if x is odd        (2.5)
Even valued arguments are cut in half, whereas odd ones are enlarged. For example, the orbit of 17 is the
sequence: 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1 . . .. Once a power of 2 is reached, the sequence falls
like a hailstone to 1 and becomes trapped in a short repetitive cycle (which one?). An unanswered question
in mathematics is:
Unanswered Question: Does every orbit fall to 1?
That is, does a positive integer exist that, when used as a starting point and iterated with the hailstone
function, does not ultimately crash down to 1? No one knows, but the intricacies of the sequence have been
widely studied (see [Hayes 1984] or numerous sources on the internet, such as
www.cecm.sfu.ca/organics/papers/lagarias/).
Write a program that plots the course of the sequence yk = f[k](y0) versus k. The user gives a starting value y0
between 1 and 4,000,000,000. (unsigned longs will hold values of this size.) Each value yk is plotted as
the point (k, yk). Each plot continues until yk reaches a value of 1 (if it does...).
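A sketch of the heart of such a program follows; hailstone() applies Equation 2.5 once, and the printing loop stands in for the scaled drawDot() calls discussed next. The function names are illustrative:

#include <cstdio>

unsigned long hailstone(unsigned long x) // one application of Equation 2.5
{
	return (x % 2 == 0) ? x / 2 : 3 * x + 1; // note: 3x + 1 can overflow near
	                                         // the top of the unsigned long range
}

void traceOrbit(unsigned long y0) // emit (k, yk) until yk reaches 1
{
	unsigned long y = y0;
	for(int k = 0; y != 1; k++, y = hailstone(y))
		printf("%d %lu\n", k, y); // replace with a scaled drawDot(sx, sy)
}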
Because the hailstone sequence can be very long, and the values of yk can grow very large, it is essential to
scale the values before they are displayed. Recall from Section 2.2 that appropriate values of A, B, C, and D
are determined so that the value (k, yk) is plotted at screen coordinates:

sx = A * k + B
sy = C * yk + D        (2.6)
Figure 2.46. Iterated function sequence generator for points: each pk = (xk, yk) is a point, formed from pk-1 by a function f(.,.) of two variables.
Once again, we call the sequence of points p0, p1, p2, ... the orbit of p0.
Aside: The Sierpinski gasket seen as an IFS.
In terms of an IFS the k-th dot, pk, of the Sierpinski gasket is formed from pk-1 using:
pk = ( pk-1 + T[random(3)] ) /2
where it is understood that the x and y components must be formed separately. Thus the function that is
iterated is:
f(.) = ((.) + T[random(3)] ) / 2
Project 2: The Gingerbread Man.
The gingerbread man shown in Figure 2.47 is based on another IFS, and it can be drawn as a dot
constellation. It has become a familiar creature in chaos theory [peitgen88, gleick87, schroeder91] because it
is a form of strange attractor: the successive dots are attracted into a region resembling a gingerbread
man, with curious hexagonal holes.
xk = M(1 + 2L) - yk-1 + |xk-1 - L M|
yk = xk-1        (2.7)

where constants M and L are carefully chosen to scale and position the gingerbread man on the display. (The
values M = 40 and L = 3 might be good choices for a 640 by 480 pixel display.)
Write a program that allows the user to choose the starting point for the iterations with the mouse, and draws
the dots for the gingerbread man. (If a mouse is unavailable, one good starting point is (115, 121). ) Fix
suitable values of M and L in the routine, but experiment with other values as well.
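A sketch of the dot-drawing loop appears below. It assumes the update rule of Equation 2.7 in the form given above, and it assumes the drawDot() tool from earlier in the chapter:

#include <cmath>
#include <GL/glut.h>
void drawDot(GLint x, GLint y); // the dot tool from earlier in the chapter

void gingerbread(float x, float y, int numDots) // iterate from (x, y)
{
	const float M = 40.0f, L = 3.0f; // suggested for a 640 by 480 display
	for(int i = 0; i < numDots; i++)
	{
		drawDot((GLint)x, (GLint)y);
		float xNew = M * (1 + 2 * L) - y + fabs(x - L * M); // Equation 2.7 (assumed form)
		y = x;
		x = xNew;
	}
}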
You will notice that for a given starting point only a certain number of dots appear before the pattern repeats
(so it stops changing). Different starting points give rise to different patterns. Arrange your program so that
you can add to the picture by inputting additional starting points with the mouse.
Practice Exercise 2.6.1. A fixed point on the gingerbread man. Show that this process has a fixed point at
((1 + L)M, (1 + L)M). That is, the result of subjecting this point to the process of Equation 2.7 is the same
point. (This would be a very uninteresting starting point for generating the gingerbread man!)
2.6.3. Case Study 2.3. The Golden Ratio and Other Jewels.
(Level of Effort: I.) The aspect ratio of a rectangle is an important attribute. Over the centuries, one aspect
ratio has been particularly celebrated for its pleasing qualities in works of art: that of the golden rectangle.
The golden rectangle is considered the most pleasing of all rectangles, being neither too narrow nor too
squat. It figures in the Greek Parthenon (see Figure 2.48), Leonardo da Vinci's Mona Lisa, Salvador Dali's
The Sacrament of the Last Supper, and in much of M. C. Escher's work.
Figure 2.48. The Greek Parthenon fitting within a Golden Rectangle.
The golden rectangle is based on a fascinating quantity, the golden ratio φ = 1.618033989.... The value φ
appears in a surprising number of places in computer graphics.
Figure 2.49 shows a golden rectangle, with sides of length φ and 1. Its shape has the unique property that if a
square is removed from the rectangle, the piece that remains will again be a golden rectangle! What value
must φ have to make this work? Note in the figure that the smaller rectangle has height 1 and so to be golden
must have width 1/φ. Thus

φ = 1 + 1/φ        (2.8)

Solving this quadratic equation for its positive root gives

φ = (1 + √5) / 2 = 1.618033989...        (2.9)

This is approximately the aspect ratio of a standard 3-by-5 index card. From Equation 2.8 we see also that if 1
is subtracted from φ, the reciprocal of φ is obtained: φ - 1 = 1/φ = 0.618033989.... This is the aspect ratio of a golden
rectangle lying on its short end.
The number φ is remarkable mathematically in many ways, two favorites being

φ = √(1 + √(1 + √(1 + ...)))        (2.10)

and

φ = 1 + 1/(1 + 1/(1 + 1/(1 + ...)))        (2.11)

These both are easy to prove (how?) and display a pleasing simplicity in the use of the single digit 1.
The idea that the golden rectangle contains a smaller version of itself suggests a form of infinite regression
of figures within figures within figures, ad infinitum. Figure 2.50 demonstrates this: keep removing
squares from each remaining golden rectangle. A related nested form is

W = √(k + √(k + √(k + √(k + ...))))
2.6.3. On φ and Golden Rectangles. a). Show the validity of Equations 2.10 and 2.11.
b). Find the point at which the two dotted diagonals shown in Figure 2.50 intersect, and show that this is the point to
which the sequence of golden rectangles converges.
c). Use Equation 2.8 to derive the relationship:

φ² + 1/φ² = 3        (2.12)
2.6.4. Golden orbits. The expressions in Equations 2.10 and 2.11 show that the golden ratio φ is the limiting
value of applying certain functions again and again. The first function is f(.) = √(1 + (.)). What is the second
function? Viewing these expressions in terms of iterated function systems, φ is seen to be the value to which
orbits converge for some starting values. (The starting value is hidden in the ... of the expressions.) Explore
with a hand calculator what starting values one can use and still have the process converge to φ.
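A hand calculator works well here, but a five-line program does too. The sketch below iterates the first function, f(.) = √(1 + (.)), from an arbitrary starting value and prints the orbit as it closes in on φ:

#include <cmath>
#include <cstdio>

int main()
{
	double x = 3.0; // try various starting values here
	for(int k = 1; k <= 15; k++)
	{
		x = sqrt(1.0 + x);           // one application of f
		printf("%2d: %.9f\n", k, x); // approaches 1.618033989...
	}
	return 0;
}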
b). Extend the program in the previous part to accept some other file formats. For instance, have it accept
differentially coded x- and y-coordinates. Here the first point (x1, y1) of each polyline is encoded as above,
but each remaining one (xi, yi) is encoded after subtracting the previous point from it: the file contains
(xi - xi-1, yi - yi-1). In many cases there are fewer significant digits in the difference than in the original point values,
allowing more compact files. Experiment with this format.
c). Adapt the file format above so that a color value is associated with each polyline in the file. This color
value appears in the file on the same line as the number of points in the associated polyline. Experiment with
several polyline files.
d). Adjust the polyline drawing routine so that it draws a closed polygon when a minus sign precedes the
number of points in a polyline, as in:

-3
0 0
35 3
57 8
5
0 1
12 21
23 34
.. etc

The first polyline is drawn as a triangle: its last point is connected to its first.
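A sketch of a reader for this extended format follows. The function name and the exact file layout are assumptions consistent with the description above; GL_LINE_LOOP conveniently closes the figure when the count is negative:

#include <fstream>
#include <GL/glut.h>
using namespace std;

void drawPolylineFile2(const char* fileName) // draw polylines/polygons from a file
{
	ifstream inStream(fileName); // open the file
	if(!inStream) return;        // open failed: do nothing
	int n;
	while(inStream >> n)         // read each count; a minus sign means closed
	{
		bool closed = (n < 0);
		if(closed) n = -n;
		glBegin(closed ? GL_LINE_LOOP : GL_LINE_STRIP);
		for(int i = 0; i < n; i++)
		{
			float x, y;
			inStream >> x >> y;  // read the next vertex
			glVertex2f(x, y);
		}
		glEnd();
	}
	glFlush();
}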
Figure 2.51. Example stipple patterns.
Write a program that allows the user to type in a pattern (in hexadecimal notation) and a value for factor,
and draws stippled lines laid down with the mouse.
Polygon Stippling.
It is also not difficult to define a stipple pattern for filling a polygon, but there are more details to cope with.
After the pattern is specified, it is applied to subsequent polygon filling once it is enabled with
glEnable(GL_POLYGON_STIPPLE), until disabled with glDisable(GL_POLYGON_STIPPLE).
The function
glPolygonStipple(const GLubyte * mask);
attaches the stipple pattern to subsequently drawn polygons, based on a 128 byte array mask[]. These 128
bytes provide the bits for a bitmask that is 32 bits wide and 32 bits high. The pattern is tiled throughout the
polygon (which is drawn with the usual glBegin(GL_POLYGON); glVertex*(); ...;
glEnd();). The pattern is specified by an array definition such as:
GLubyte mask[] = {0xff, 0xfe, 0x34, ... };
The first four bytes prescribe the 32 bits across the bottom row, from left to right; the next 4 bytes give the next
row up, etc. Figure 2.52 shows the result of filling a specific polygon with a fly pattern specified in the
OpenGL red book [woo97].
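The following sketch pulls the pieces together with a simple home-made pattern; the mask values here are illustrative bit bands, not the red book's fly:

#include <GL/glut.h>

void drawStippledSquare(void)
{
	GLubyte mask[128]; // 32 rows of 4 bytes each, bottom row first
	for(int i = 0; i < 128; i++)
		mask[i] = ((i / 4) % 2) ? 0xAA : 0x55; // alternating bit bands
	glPolygonStipple(mask);        // attach the pattern
	glEnable(GL_POLYGON_STIPPLE);  // turn stippling on
	glBegin(GL_POLYGON);           // a square to be filled
		glVertex2i(100, 100); glVertex2i(300, 100);
		glVertex2i(300, 300); glVertex2i(100, 300);
	glEnd();
	glDisable(GL_POLYGON_STIPPLE); // back to solid filling
}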
Figure 2.53. Editing a polyline: a). add points; b). move a point; c). delete a point.
Figure 2.53c shows how a point is deleted from a polyline. The user clicks near some polyline vertex, and the
two line segments connected to it are erased. Then the two other endpoints of the segments just erased are
connected with a line segment.
Write and exercise a program that allows the user to enter and edit pictures made up of as many as 60
polylines. The user interacts by pressing keyboard keys and pointing/clicking with the mouse. The
functionality of the program should include the actions:
begin (b)
delete (d)
move (m)
refresh (r)
quit (q)
A list of polylines can be maintained in an array such as: GLintPointArray polys[60]. The verb
begin, activated by pressing the key b, permits the user to create a new polyline, which is stored in the
first available slot in array polys. The verb delete requires that the program identify which point of
which polyline lies closest to the current mouse point. Once identified, the previous and next vertices in
the chosen polyline are found. The two line segments connected to the chosen vertex are erased, and the
previous and next vertices are joined with a line segment. The verb move finds the vertex closest to the
current mouse point, and waits for the user to click the mouse a second time, at which point it moves the
vertex to this new point.
What other functions might you want in a polyline editor? Discuss how you might save the array of polylines
in a file, and read it in later. Also discuss what a reasonable mechanism might be for inserting a new point
inside a polyline.
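Both the delete and move verbs hinge on finding the vertex nearest the mouse point. A sketch of that search follows; it assumes GLintPointArray stores a count num and an array pt[] of points, as in Chapter 2, and it compares squared distances to avoid needless square roots:

#include <cfloat>

int findClosestVertex(GLintPointArray polys[], int numPolys,
                      int mx, int my, int& whichPoly)
{ // returns the vertex index, and the polyline index through whichPoly
	float best = FLT_MAX;
	int bestVertex = -1;
	for(int p = 0; p < numPolys; p++)
		for(int v = 0; v < polys[p].num; v++)
		{
			float dx = (float)(polys[p].pt[v].x - mx);
			float dy = (float)(polys[p].pt[v].y - my);
			float d2 = dx * dx + dy * dy; // squared distance to the mouse point
			if(d2 < best) { best = d2; whichPoly = p; bestVertex = v; }
		}
	return bestVertex;
}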
Generating a Maze. Start with all walls intact so that the maze is a simple grid of horizontal and vertical
lines. The program draws this grid. An invisible mouse, whose job is to eat through walls to connect
adjacent cells, is initially placed in some arbitrarily chosen cell. The mouse checks the four neighbor cells
(above, below, left, and right) and for each asks whether the neighbor has all four walls intact. If not, the cell
has previously been visited and so is already on some path. The mouse may detect several candidate cells that
haven't been visited: it chooses one randomly and eats through the connecting wall, saving the locations of
the other candidates on a stack. The eaten wall is erased, and the mouse repeats the process. When it becomes
trapped in a dead end (surrounded by visited cells) it pops an unvisited cell and continues. When the stack is
empty, all cells in the maze have been visited. A start and an end cell are then chosen randomly, most likely
along some edge of the maze. It is delightful to watch the maze being formed dynamically as the mouse eats
through walls. (Question: Might a queue be better than a stack to store candidates? How does this affect the
order in which later paths are created?)
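The control flow of the wall-eater can be sketched compactly. The Cell type and the two helpers declared below are assumptions for this sketch; only the stack discipline described above is shown:

#include <stack>
#include <cstdlib>

struct Cell { int row, col; };
void collectUnvisitedNeighbors(Cell c, Cell nbrs[], int& n); // assumed helper
void eatWallBetween(Cell a, Cell b);                         // assumed helper

void buildMaze(int rows, int cols)
{
	std::stack<Cell> candidates;
	Cell cur = { rand() % rows, rand() % cols }; // arbitrary starting cell
	for(;;)
	{
		Cell nbrs[4]; int n = 0;
		collectUnvisitedNeighbors(cur, nbrs, n); // cells with all 4 walls intact
		if(n > 0)
		{
			int pick = rand() % n;               // choose one at random
			for(int i = 0; i < n; i++)           // save the other candidates
				if(i != pick) candidates.push(nbrs[i]);
			eatWallBetween(cur, nbrs[pick]);     // erase the connecting wall
			cur = nbrs[pick];
		}
		else if(!candidates.empty())             // trapped in a dead end
		{
			cur = candidates.top(); candidates.pop();
		}
		else break;                              // stack empty: every cell visited
	}
}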
Running the Maze. Use a backtracking algorithm. At each step, the mouse tries to move in a random
direction. If there is no wall, it places its position on a stack and moves to the next cell. The cell that the
mouse is in can be drawn with a red dot. When it runs into a dead end, it can change the color of the cell to
blue and backtrack by popping the stack. The mouse can even put a wall up to avoid ever trying the dead-end
cell again.
Addendum: Proper mazes aren't too challenging because you can always traverse them using the
shoulder-to-the-wall rule. Here you trace the maze by rubbing your shoulder along the left-hand wall. At a dead end,
sweep around and retrace the path, always maintaining contact with the wall. Because the maze is a tree,
you will ultimately reach your destination. In fact, there can even be cycles in the graph and you still always
find the end, as long as both the start and the end cells are on outer boundaries of the maze (why?). To make
things more interesting, place the start and end cells in the interior of the maze and also let the mouse eat
some extra walls (maybe randomly 1 in 20 times). In this way, some cycles may be formed that encircle the
end cell and defeat the shoulder method.
Preview.
Section 3.1 introduces world coordinates and the world window. Section 3.2 describes the window to viewport
transformation. This transformation simplifies graphics applications by letting the programmer work in a
reasonable coordinate system, yet have all pictures mapped as desired to the display surface. The section also
discusses how the programmer (and user) choose the window and viewport to achieve the desired drawings. A
key property is that the aspect ratios of the window and viewport must agree, or distortion results. Some of the
choices can be automated. Section 3.3 develops a classical clipping algorithm that removes any parts of the
picture that lie outside the world window.
Section 3.4 builds a useful C++ class called Canvas that encapsulates the many details of initialization and
variable handling required for a drawing program. Its implementation in an OpenGL environment is developed.
A programmer can use the tools in Canvas to make complex pictures, confident that the underlying data is
protected from inadvertent mishandling.
Section 3.5 develops routines for relative drawing and turtle graphics that add handy methods to the
programmers toolkit. Section 3.6 examines how to draw interesting figures based on regular polygons, and
Section 3.7 discusses the drawing of arcs and circles. The chapter ends with several Case Studies, including the
development of the Canvas class for a non-OpenGL environment, where all the details of clipping and the
window to viewport transformation must be explicitly developed.
Section 3.8 describes different representations for curves, and develops the very useful parametric form, which
permits straightforward drawing of complex curves. Curves that reside in both 2D space and 3D space are
considered.
3.1. Introduction.
It is as interesting and as difficult to say a thing well as to paint it.
Vincent Van Gogh
In Chapter 2 our drawings used the basic coordinate system of the screen window: coordinates that are
essentially in pixels, extending from 0 to some value screenWidth - 1 in x, and from 0 to some value
screenHeight - 1 in y. This means that we can use only positive values of x and y, and the values must
extend over a large range (several hundred pixels) if we hope to get a drawing of some reasonable size.
In a given problem, however, we may not want to think in terms of pixels. It may be much more natural to think
in terms of x varying from, say, -1 to 1, and y varying from -100.0 to 20.0. (Recall how awkward it was to scale
and shift values when making the dot plots in Figure 2.16.) Clearly we want to make a separation between the
values we use in a program to describe the geometrical objects and the size and position of the pictures of them
on the display.
In this chapter we develop methods that let the programmer/user describe objects in whatever coordinate system
best fits the problem at hand, and to have the picture of the object automatically scaled and shifted so that it
comes out right in the screen window. The space in which objects are described is called world coordinates.
It is the usual Cartesian xy-coordinate system used in mathematics, based on whatever units are convenient.
We define a rectangular world window1 in these world coordinates. The world window specifies which part of
the world should be drawn. The understanding is that whatever lies inside the window should be drawn;
whatever lies outside should be clipped away and not drawn.
In addition, we define a rectangular viewport in the screen window on the screen. A mapping (consisting of
scalings and shiftings) between the world window and the viewport is established so that when all the objects in
the world are drawn, the parts that lie inside the world window are automatically mapped to the inside of the
viewport. So the programmer thinks in terms of looking through a window at the objects being drawn, and
placing a snapshot of whatever is seen in that window into the viewport on the display. This window/viewport
approach makes it much easier to do natural things like zooming in on a detail in the scene, or panning
around a scene.
We first develop the mapping part that provides the automatic change of coordinates. Then we see how clipping
is done.
sinc(x) = sin(πx) / (πx)        (3.1)

You want to know how it bends and wiggles as x varies. Suppose you know that as x varies from -∞ to ∞ the
value of sinc(x) varies over much of the range -1 to 1, and that it is particularly interesting for values of x near
0. So you want a plot that is centered at (0, 0), and that shows sinc(x) for closely spaced x-values between, say,
-4.0 and 4.0. Figure 3.1 shows an example plot of the function. It was generated using the simple OpenGL
display function (after a suitable world window and viewport were specified, of course):
void myDisplay(void) // plot the sinc function
{
glBegin(GL_LINE_STRIP);
for(GLfloat x = -4.0; x < 4.0; x += 0.1)
{
GLfloat y = sin(3.14159 * x) / (3.14159 * x);
glVertex2f(x, y);
}
glEnd();
glFlush();
}
Note that the code in these examples operates in a natural coordinate system for the problem: x is made to vary
in small increments from -4.0 to 4.0. The key issue here is how the various (x, y) values become scaled and
shifted so that the picture appears properly in the screen window.
We accomplish the proper scaling and shifting by setting up a world window and a viewport, and establishing a
suitable mapping between them. The window and viewport are both aligned rectangles specified by the
programmer. The window resides in world coordinates. The viewport is a portion of the screen window. Figure
3.2 shows an example world window and viewport. The notion is that whatever lies in the world window is
scaled and shifted so that it appears in the viewport; the rest is clipped off and not displayed.
2 For the sake of brevity we use l for left, t for top, etc. in mathematical formulas.
Figure 3.4. A picture mapped from a window to a viewport. Here some distortion is produced.
Given a description of the window and viewport, we derive a mapping or transformation, called the
window-to-viewport mapping. This mapping is based on a formula that produces a point (sx, sy) in the screen window
coordinates for any given point (x, y) in the world. We want it to be a proportional mapping, in the sense that
if x is, say, 40% of the way over from the left edge of the window, then sx is 40% of the way over from the left
edge of the viewport. Similarly if y is some fraction, f, of the window height from the bottom, sy must be the
same fraction f up from the bottom of the viewport.
Proportionality forces the mappings to have a linear form:
sx = A * x + C
sy = B * y + D
(3.2)
for some constants A, B, C and D. The constants A and B scale the x and y coordinates, and C and D shift (or
translate) them.
How can A, B, C, and D be determined? Consider first the mapping for x. As shown in Figure 3.5,
proportionality dictates that (sx - V.l) is the same fraction of the total (V.r - V.l) as (x - W.l) is of the total
(W.r - W.l), so that
(sx - V.l) / (V.r - V.l) = (x - W.l) / (W.r - W.l)

or

sx = ((V.r - V.l) / (W.r - W.l)) x + (V.l - ((V.r - V.l) / (W.r - W.l)) W.l)

Now identifying A as the part that multiplies x and C as the constant part, we obtain:

A = (V.r - V.l) / (W.r - W.l),   C = V.l - A W.l
W . r W. l
sy V .b
y W.b
V.t V.b = W.t W.b
and writing sy as B y + D yields:
B=
V.t V. b
, D = V. b B W. b
W . t W. b
(3.3)
V. r V.l
, C = V. l A W. l
W . r W. l
V.t V. b
, D = V. b B W. b
B=
W . t W. b
A=
The mapping can be used with any point (x, y) inside or outside the window. Points inside the window map to
points inside the viewport, and points outside the window map to points outside the viewport.
(Important!) Carefully check the following properties of this mapping using Equation 3.3:
a). If x is at the window's left edge (x = W.l), then sx is at the viewport's left edge (sx = V.l).
b). If x is at the window's right edge, then sx is at the viewport's right edge.
c). If x is fraction f of the way across the window, then sx is fraction f of the way across the viewport.
d). If x is outside the window to the left (x < W.l), then sx is outside the viewport to the left (sx < V.l), and
similarly if x is outside to the right.
Also check similar properties for the mapping from y to sy.
Example 3.2.1: Consider the window and viewport of Figure 3.6. The window has (W.l, W.r, W.b, W.t) = (0,
2.0, 0, 1.0) and the viewport has (V.l, V.r, V.b, V.t) = (40, 400, 60, 300).
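A small sketch shows the mapping coefficients being computed directly from Equation 3.3; the struct name Rect and its fields l, r, b, t are illustrative stand-ins for the window and viewport descriptions:

struct Rect { float l, r, b, t; }; // an aligned rectangle

void mapPoint(Rect W, Rect V, float x, float y, float& sx, float& sy)
{
	float A = (V.r - V.l) / (W.r - W.l); // Equation 3.3
	float C = V.l - A * W.l;
	float B = (V.t - V.b) / (W.t - W.b);
	float D = V.b - B * W.b;
	sx = A * x + C;                      // Equation 3.2
	sy = B * y + D;
}

For the window and viewport of Example 3.2.1 this gives A = 180, C = 40, B = 240, and D = 60, so, for instance, the world point (1.0, 0.5) maps to the screen point (220, 180).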
Because OpenGL uses matrices to set up all its transformations, gluOrtho2D()3 must be preceded by two setup
functions: glMatrixMode(GL_PROJECTION) and glLoadIdentity(). (We discuss what is going on
behind the scenes here more fully in Chapter 5.)
Thus to establish the window and viewport used in Example 3.2.1 we would use:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluOrtho2D(0.0, 2.0, 0.0, 1.0);
glViewport(40, 60, 360, 240);
Hereafter every point (x, y) sent to OpenGL using glVertex2*(x, y) undergoes the mapping of Equation 3.3, and
edges are automatically clipped at the window boundary. (In Chapter 7 we see the details of how this is done in
3D, where it also becomes clear how the 2D version is simply a special case of the 3D version.)
It will make programs more readable if we encapsulate the commands that set the window into a function
setWindow() as shown in Figure 3.7. We also show setViewport() that hides the OpenGL details of
glViewport(..). To make it easier to use, its parameters are slightly rearranged to match those of
setWindow(), so they are both in the order left, right, bottom, top.
Note that for convenience we use simply the type float for the parameters to setWindow(). The parameters left,
right, etc. are automatically cast to type GLdouble when they are passed to gluOrtho2D(), as specified by
this function's prototype. Similarly we use the type int for the parameters to setViewport(), knowing the
arguments to glViewport() will be properly cast.
//--------------- setWindow ---------------------
void setWindow(float left, float right, float bottom, float top)
{
	glMatrixMode(GL_PROJECTION);
	glLoadIdentity();
	gluOrtho2D(left, right, bottom, top);
}
//--------------- setViewport -------------------
void setViewport(int left, int right, int bottom, int top)
{
	glViewport(left, bottom, right - left, top - bottom);
}
Figure 3.7. Handy functions to set the window and viewport.
It is worthwhile to look back and see what we used for a window and viewport in the early OpenGL programs
given in Chapter 2. In Figures 2.10 and 2.17 the programs used:
1). in main():
glutInitWindowSize(640,480);
which set the size of the screen window to 640 by 480. The default viewport was used since no
glViewport() command was issued; the default viewport is the entire screen window.
2). in myInit():
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluOrtho2D(0.0, 640.0, 0.0, 480.0);
3 The root ortho appears because setting the window this way is actually setting up a so-called
orthographic projection in 3D, as we'll see in Chapter 7.
This set the world window to the aligned rectangle with corners (0, 0) and (640.0, 480.0), just matching the
viewport size. So the underlying window-to-viewport mapping didn't alter anything. This was a reasonable
first choice for getting started.
Example 3.2.2: Plotting the sinc function revisited.
Putting these ingredients together, we can see what it takes to plot the sinc() function shape of Figure 3.1. With
OpenGL it is just a matter of defining the window and viewport. Figure 3.8 shows the required code, assuming
we want to plot the function from closely spaced x-values between -4.0 and 4.0, into a viewport with width 640
and height 480. (The window is set to be a little wider than the plot range to leave some cosmetic space around
the plot.)
void myDisplay(void) // plot the sinc function, using world coordinates
{
	setWindow(-5.0, 5.0, -0.3, 1.0);          // set the window
	setViewport(0, 640, 0, 480);              // set the viewport
	glBegin(GL_LINE_STRIP);
	for(GLfloat x = -4.0; x < 4.0; x += 0.1)  // draw the plot
		glVertex2f(x, sin(3.14159 * x) / (3.14159 * x));
	glEnd();
	glFlush();
}
Figure 3.8. Plotting the sinc function.
Example 3.2.3: Drawing polylines from a file.
In Chapter 2 we drew the dinosaur shown in Figure 3.9 using the routine drawPolylineFile("dino.dat")
of Figure 2.22. The polyline data for the figure was stored in a file dino.dat. The world
window and viewport had not yet been introduced, so we just took certain things on faith or by default, and
luckily still got a picture of the dinosaur.
(It's easier to use glViewport() here than setViewport(). What would the arguments to setViewport()
be if we chose to use it instead?) Each copy is drawn in a viewport 64 by 48 pixels in size, whose aspect ratio
64/48 matches that of the world window. This draws each dinosaur without any distortion.
Figure 3.10b shows another tiling, but here alternate motifs are flipped upside down to produce an intriguing effect. This was
done by flipping the window upside down every other iteration: interchanging the top and bottom values in
setWindow()4. (Check that this flip of the window properly affects B and D in the window-to-viewport transformation of
Equation 3.3 to flip the picture in the viewport.) Then the preceding double loop was changed to:
for(int i = 0; i < 5; i++)
	for(int j = 0; j < 5; j++)
	{
		if((i + j) % 2 == 0)                   // if (i + j) is even
			setWindow(0.0, 640.0, 0.0, 480.0); // right side up window
		else
			setWindow(0.0, 640.0, 480.0, 0.0); // upside down window
		glViewport(i * 64, j * 48, 64, 48);    // set the next viewport
		drawPolylineFile("dino.dat");          // draw it again
	}
4 It might seem easier to invert the viewport instead, but OpenGL does not permit a viewport to have a negative
height.
A skeleton of the code to achieve this is shown in Figure 3.13. For each new frame the screen is cleared, the
window is made smaller (about a fixed center, and with a fixed aspect ratio), and the figure within the window is
drawn in a fixed viewport.
float cx = 0.3, cy = 0.2;        // center of the window
float H, W = 1.2, aspect = 0.7;  // window properties
set the viewport
for(int frame = 0; frame < NumFrames; frame++) // for each frame
{
	clear the screen                           // erase the previous figure
	W *= 0.7;                                  // reduce the window width
	H = W * aspect;                            // maintain the same aspect ratio
	setWindow(cx - W, cx + W, cy - H, cy + H); // set the next window
	hexSwirl();                                // draw the object
}
Figure 3.13. Making an animation.
Figure 3.13. Making an animation.
Achieving a Smooth Animation.
The previous approach isn't completely satisfying because of the time it takes to draw each new figure. What the
user sees is a repetitive cycle of:
a). Instantaneous erasure of the current figure;
b). A (possibly) slow redraw of the new figure.
The problem is that the user sees the line-by-line creation of the new frame, which can be distracting. What the
user would like to see is a repetitive cycle of:
a). A steady display of the current figure;
b). Instantaneous replacement of the current figure by the finished new figure.
The trick is to draw the new figure somewhere else while the user stares at the current figure, and then to
move the completed new figure instantaneously onto the users display. OpenGL offers double-buffering
to accomplish this. Memory is set aside for an extra screen window which is not visible on the actual display,
and all drawing is done to this buffer. (The use of such off-screen memory is discussed fully in Chapter 10.)
The command glutSwapBuffers() then causes the image in this buffer to be transferred onto the screen
window visible to the user.
To make OpenGL reserve a separate buffer for this, use GLUT_DOUBLE rather than GLUT_SINGLE in the
routine used in main() to initialize the display mode:
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB); // use double buffering
The command glutSwapBuffers() would be placed directly after hexSwirl() in the code of
Figure 3.13. Then, even if it takes a substantial period for the figure to be drawn, at least the image will
change abruptly from one figure to the next in the animation, producing a much smoother and visually
comfortable effect.
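A double-buffered display callback is then only a few lines; this sketch assumes the hexSwirl() figure of the running example:

void myDisplay(void)
{
	glClear(GL_COLOR_BUFFER_BIT); // clear the off-screen (back) buffer
	hexSwirl();                   // draw the new figure, unseen by the user
	glutSwapBuffers();            // reveal the finished figure at once
}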
Practice Exercise 3.2.2. Whirling swirls. As another example of clipping and tiling, Figure 3.14a shows the
swirl of hexagons with a particular window defined. The window is kept fixed in this example, but the viewport
varies with each drawing. Figure 3.14b shows a number of copies of this figure laid side by side to tile the
display. Try to pick out the individual swirls. (Some of the swirls have been flipped: which ones?) The result is
dazzling to the eye, in part due to the eye's yearning to synthesize many small elements into an overall pattern.
Figure 3.14. a). Whirling hexagons in a fixed window. b). A tiling formed using many viewports.
Except for the flipping, the code shown next creates this pattern. Function myDisplay() sets the window once,
then draws the clipped swirl again and again in different viewports.
void myDisplay(void)
{
	clear the screen
	setWindow(-0.6, 0.6, -0.6, 0.6); // the portion of the swirl to draw
	for(int i = 0; i < 5; i++)       // make a pattern of 5 by 4 copies
		for(int j = 0; j < 4; j++)
		{
			int L = 80;              // the amount to shift each viewport
			setViewport(i * L, L + i * L, j * L, L + j * L); // the next viewport
			hexSwirl();
		}
}
Type this code into an OpenGL environment, and experiment with the figures it draws. Taking a cue from a
previous example, determine how to flip alternating figures upside down.
Figure 3.16. Possible aspect ratios for the world and screen windows.
Case a): R > W/H. Here the world window is short and stout relative to the screen window, so the viewport with
a matching aspect ratio R will extend fully across the screen window, but will leave some unused space above
or below. At its largest, therefore, it will have width W and height W/R, so the viewport is set using (check that
this viewport does indeed have aspect ratio R):
setViewport(0, W, 0, W/R);
Case b): R < W/H. Here the world window is tall and narrow relative to the screen window, so the viewport of
matching aspect ratio R will reach from the top to the bottom of the screen window, but will leave some unused
space to the left or right. At its largest it will have height H but width HR, so the viewport is set using:
setViewport(0, H * R, 0, H);
Example 3.2.7: A tall window. Suppose the window has aspect ratio R = 1.6 and the screen window has H =
200 and W = 360, and hence W/H = 1.8. Therefore Case b) applies, and the viewport is set to have a height of
200 pixels and a width of 320 pixels.
Example 3.2.8: A short window. Suppose R = 2 and the screen window is the same as in the example above.
Then case a) applies, and the viewport is set to have a height of 180 pixels and a width of 360 pixels.
Resizing the screen window, and the resize event.
In a windows-based system the user can resize the screen window at run-time, typically by dragging one of its
corners with the mouse. This action generates a resize event that the system can respond to. There is a function
in the OpenGL utility toolkit, glutReshapeFunc(), that registers a function to be called whenever this event
occurs:
glutReshapeFunc(myReshape);
(This statement appears in main() along with the other calls that specify callback functions.) The registered
function is also called when the window is first opened. It must have the prototype:
void myReshape(GLsizei W, GLsizei H);
When it is executed the system automatically passes it the new width and height of the screen window, which it
can use in its calculations. (GLsizei is a 32-bit integer; see Figure 2.7.)
What should myReshape() do? If the user makes the screen window bigger the previous viewport could still be
used (why?), but it might be desired to increase the viewport to take advantage of the larger window size. If the
user makes the screen window smaller, crossing any of the boundaries of the viewport, you almost certainly want
to recompute a new viewport.
Making a matched viewport.
One common approach is to find a new viewport that a) fits in the new screen window, and b) has the same
aspect ratio as the world window. Matching the aspect ratios of the viewport and world window in this way
will prevent distortion in the new picture. Figure 3.17 shows a version of myReshape() that does this: it finds
the largest matching viewport (matching the aspect ratio, R, of the window), that will fit in the new screen
window. The routine obtains the (new) screen window width and height through its arguments. Its code is a
simple embodiment of the result in Figure 3.16.
void myReshape(GLsizei W, GLsizei H)
{
	if(R > (float)W / H) // use the (global) window aspect ratio R;
	                     // float division avoids integer truncation
		setViewport(0, W, 0, (int)(W / R));
	else
		setViewport(0, (int)(H * R), 0, H);
}
Figure 3.17. Using a reshape function to set the largest matching viewport upon a resize event.
Practice Exercises.
3.2.3. Find the bounding box for a polyline. Write a routine that computes the extent of the polyline stored in the
array of points pt[i], for i = 0, 1, ..., n - 1.
3.2.4. Matching the Viewport. Find the matching viewport for a window with aspect ratio .75 when the screen
window has width 640 and height 480.
3.2.5. Centering the viewport. (Don't skip this one!) Adjust the myReshape() routine above so that the
viewport, rather than lying in the lower left corner of the display, is centered both vertically and horizontally in
the screen window.
3.2.6. How to squash a house. Choose a window and a viewport so that a square is squashed to half its proper
height. What are the coefficients A, B, C, and D in this case?
3.2.7. Calculation of the mapping. Find the coefficients A, B, C, and D of the window to viewport mapping for
a window given by (-600, 235, -500, 125) and a viewport (20, 140, 30, 260). Does distortion occur for figures
drawn in the world? Change the right border of the viewport so that distortion will not occur.
segments, and each must be clipped against the window. The Cohen-Sutherland algorithm provides a rapid
divide-and-conquer attack on the problem. Other clipping methods are discussed beginning in Chapter 4.
Figure 3.20. Encoding how point P is disposed with respect to the window.
For example, if P is inside the window its code is FFFF; if P is below but neither to the left nor right its code is
FFFT. Figure 3.21 shows the nine different regions possible, each with its code.
TTFF    FTFF    FTTF
TFFF    FFFF (window)    FFTF
TFFT    FFFT    FFTT
Figure 3.21. The nine regions and their codes.
The actual formation of the code words and tests can be implemented very efficiently using the bit
manipulation capabilities of C/C++, as we describe in Case Study 3.3.
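As a taste of that case study, one way to form a code word with bit operations is sketched below. The particular bit assignments are an assumption for illustration, and RealRect is taken to have fields l, r, b, t:

const unsigned char C_LEFT = 8, C_ABOVE = 4, C_RIGHT = 2, C_BELOW = 1;

unsigned char formCode(float x, float y, RealRect W)
{ // build the 4-bit code word for point (x, y)
	unsigned char code = 0;
	if(x < W.l) code |= C_LEFT;  // a T in that position of the code word
	if(y > W.t) code |= C_ABOVE;
	if(x > W.r) code |= C_RIGHT;
	if(y < W.b) code |= C_BELOW;
	return code;
}
// trivial accept: (code1 | code2) == 0;  trivial reject: (code1 & code2) != 0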
Chopping when there is neither trivial accept nor reject.
The Cohen-Sutherland algorithm uses a divide-and-conquer strategy. If the segment can neither be trivially
accepted nor rejected it is broken into two parts at one of the window boundaries. One part lies outside the
window and is discarded. The other part is potentially visible, so the entire process is repeated for this segment
against another of the four window boundaries. This gives rise to the strategy:
do {
	form the code words for p1 and p2
	if(trivial accept) return 1;
	if(trivial reject) return 0;
	chop the line at the next window border; discard the outside part;
} while(1);
The algorithm terminates after at most four times through the loop, since at each iteration we retain only the
portion of the segment that has survived testing against previous window boundaries, and there are only four
such boundaries. After at most four iterations trivial acceptance or rejection is assured.
How is the chopping at each boundary done? Figure 3.22 shows an example involving the right edge of the
window.
Figure 3.22. Clipping a segment against the right edge of the window.
The figure shows that, by similar triangles,

d / e = dely / delx

where e is p1.x - W.right, and

delx = p2.x - p1.x,   dely = p2.y - p1.y        (3.4)

are the differences between the coordinates of the two endpoints. Thus d is easily determined, and the new
p1.y is found by adding an increment to the old as

p1.y += (W.right - p1.x) * dely / delx        (3.5)
Similar reasoning is used for clipping against the other three edges of the window.
In some of the calculations the term dely/delx occurs, and in others it is delx/dely. One must always be
concerned about dividing by zero, and in fact delx is zero for a vertical line, and dely is 0 for a horizontal
line. But as discussed in the exercises, the perilous lines of code are never executed when a denominator is zero,
so division by zero will not occur.
These ideas are collected in the routine clipSegment( ) shown in Figure 3.23. The endpoints of the segment
are passed by reference, since changes made to the endpoints by clipSegment() must be visible in the
calling routine. (The type Point2 holds a 2D point, and the type RealRect holds an aligned rectangle. Both
types are described fully in Section 3.4.)
int clipSegment(Point2& p1, Point2& p2, RealRect W)
{
	do {
		if(trivial accept) return 1; // some portion survives
		if(trivial reject) return 0; // no portion survives
		if(p1 is outside)
		{
			if(p1 is to the left) chop against the left edge
			else if(p1 is to the right) chop against the right edge
			else if(p1 is below) chop against the bottom edge
			else if(p1 is above) chop against the top edge
		}
		else // p2 is outside
		{
			if(p2 is to the left) chop against the left edge
			else if(p2 is to the right) chop against the right edge
			else if(p2 is below) chop against the bottom edge
			else if(p2 is above) chop against the top edge
		}
	} while(1);
}
Figure 3.23. The Cohen-Sutherland line clipper (pseudocode).
Each time through the do loop the code for each endpoint is recomputed and tested. When trivial acceptance
and rejection fail, the algorithm tests whether p1 is outside, and if so it clips that end of the segment to a
window boundary. If p1 is inside then p2 must be outside (why?) so p2 is clipped to a window boundary.
This version of the algorithm clips in the order left, then right, then bottom, and then top. The choice of order is
immaterial if segments are equally likely to lie anywhere in the world. A situation that requires all four clips is
shown in Figure 3.24. The first clip
Figure 3.24. A segment that requires all four clips.
changes p1 to A; the second alters p2 to B; the third finds p1 still outside and below and so changes A to C;
and the last changes p2 to D. For any choice of ordering for the chopping tests, there will always be a situation
in which all four clips are necessary.
Clipping is a fundamental operation that has received a lot of attention over the years. Several other approaches
have been developed. We examine some of them in the Case Studies at the end of this chapter, and in Chapter
4.
3.3.2. Hand Simulation of clipSegment( ).
Go through the clipping routine by hand for the case of a window given by (left, right, bottom, top) = (30, 220,
50, 240) and the following line segments:
1). p1=(40,140), p2=(100,200);
2). p1=(10,270), p2=(300,0);
3). p1=(20,10), p2=(20,200);
4). p1=(0,0), p2=(250,250);
In each case determine the endpoints of the clipped segment, and for a visual check, sketch the situation on
graph paper.
class Point2 { // a 2D point with floating-point coordinates
public:
	Point2() {x = y = 0.0f;}                     // constructor 1
	Point2(float xx, float yy) {x = xx; y = yy;} // constructor 2
	void set(float xx, float yy) {x = xx; y = yy;}
	float getX() {return x;}
	float getY() {return y;}
	void draw(void) // draw this point
	{
		glBegin(GL_POINTS);
		glVertex2f((GLfloat)x, (GLfloat)y);
		glEnd();
	}
private:
	float x, y;
};
Note that values of x and y are cast to the type GLfloat when glVertex2f() is called. This is most likely
unnecessary, since the type GLfloat is defined on most systems as float anyway.
	float getWindowAspectRatio(void);
	void clearScreen();
	void setBackgroundColor(float r, float g, float b);
	void setColor(float r, float g, float b);
	void lineTo(float x, float y);
	void lineTo(Point2 p);
	void moveTo(float x, float y);
	void moveTo(Point2 p);
	// ... others later
private:
	Point2 CP;        // current position in the world
	IntRect viewport; // the current viewport
	RealRect window;  // the current window
	// ... others later
};
Figure 3.25. The header file Canvas.h.
The Canvas constructor takes the width and height of the screen window along with the title string for the
window. As we show below it creates the screen window desired, performing all of the appropriate
initializations. Canvas also includes functions to set and return the dimensions of the window and the viewport,
and to control the drawing and background color. (There is no explicit mention of data for the window-to-viewport
mapping in this version, as this mapping is managed silently by OpenGL. In Case Study 3.4 we add
members to hold the mapping for an environment that requires it.) Other functions shown are versions of
lineTo() and moveTo() that do the actual drawing (in world coordinates, of course). We add relative drawing
tools in the next section.
Figure 3.26 shows how the Canvas class might typically be used in an application. A single global object cvs is
created, which initializes and opens the desired screen window. It is made global so that callback functions such
as display() can see it. (We cannot pass cvs as a parameter to such functions, as their prototypes are fixed
by the rules of the OpenGL utility toolkit.) The display() function here sets the window and viewport, and
then draws a line, using Canvas member functions. Then a rectangle is created and drawn using its own member
function.
Canvas cvs(640, 480, "try out Canvas"); // create a global canvas object
//<<<<<<<<<<<<<<<<<<<<<<<<<<<<< display >>>>>>>>>>>>>>>>>>>>>>
void display(void)
{
	cvs.clearScreen();                       // clear screen
	cvs.setWindow(-10.0, 10.0, -10.0, 10.0);
	cvs.setViewport(10, 460, 10, 460);
	cvs.moveTo(0, -10.0);                    // draw a line
	cvs.lineTo(0, 10.0);
	RealRect box(-2.0, 2.0, -1.0, 1.0);      // construct a box
	box.draw();                              // draw the box
	. . .
}
//<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
void main(void)
{
	// the window has already been opened in the Canvas constructor
	cvs.setBackgroundColor(1.0, 1.0, 1.0);   // background is white
	cvs.setColor(0.0, 0.0, 0.0);             // set drawing color
	glutDisplayFunc(display);
	glutMainLoop();
}
Figure 3.26. Typical usage of the Canvas class.
The main() routine doesn't do any initialization: this has all been done in the Canvas constructor. The routine
main() simply sets the drawing and background colors, registers function display(), and enters the main
event loop. (Could these OpenGL-specific functions also be buried in Canvas member functions?) Note that
this application makes almost no OpenGL-specific calls, so it could easily be ported to another environment
(which used a different implementation of Canvas, of course).
for k = 1, 2, ..., where a is a constant between 0 and 2; yk is 0 for k < 0; and y0 = 1 (see [oppenheim83]). In general, one cycle
consists of S points if we set a = 2 cos(2π/S). A good picture results with S = 40. Write a routine that draws
sequences generated in this fashion, and test it for various values of S.
void Canvas :: moveRel(float dx, float dy)
{
	CP.set(CP.getX() + dx, CP.getY() + dy);
}
void Canvas :: lineRel(float dx, float dy)
{
	float x = CP.getX() + dx, y = CP.getY() + dy;
	lineTo(x, y);
	CP.set(x, y);
}
Figure 3.29. The functions moveRel() and lineRel().
Example 3.5.1. An arrow marker. Markers of different shapes can be placed at various points in a drawing to
add emphasis. Figure 3.30 shows pentagram markers used to highlight the data points in a line graph.
Figure 3.30. Placing markers for emphasis.
Because the same figure is drawn at several different points it is convenient to be able to say simply
drawMarker() and have it be drawn at the CP. Then the line graph of Figure 3.30 can be drawn along with the
markers using code suggested by the pseudocode:
moveTo(first data point);
drawMarker();               // draw a marker there
for(each remaining data point)
{
	lineTo(the next point); // draw the next line segment
	drawMarker();           // draw a marker at the CP
}
Figure 3.31 shows an arrow-shaped marker, drawn using the routine in Figure 3.32. The arrow is positioned
with its uppermost point at the CP. For flexibility the arrow shape is parameterized through four size parameters
f, h, t, and w as shown. Function arrow() uses only lineRel(), and no reference is made to absolute
positions. Also note that although the CP is altered while drawing is going on, at the end the CP has been set
back to its initial position. Hence the routine produces no side effects (beyond the drawing itself).
Figure 3.31. Model of an arrow.
void arrow(float f, float h, float t, float w)
{ // assumes a global Canvas object: cvs
	cvs.lineRel(-w - t / 2, -f); // down the left edge of the head
	cvs.lineRel(w, 0);
	cvs.lineRel(0, -h);          // down the left side of the stem
	cvs.lineRel(t, 0);           // across
	cvs.lineRel(0, h);
	cvs.lineRel(w, 0);
	cvs.lineRel(-w - t / 2, f);  // back up to the tip
}
Figure 3.32. Drawing an arrow using relative moves and draws.
Figure 3.33 shows that in going forward in direction CD the turtle just moves in x through the amount dist *
cos(π * CD / 180) and in y through the amount dist * sin(π * CD / 180), so the implementation of forward() is
immediate:
Figure 3.33. Effect of the forward() routine.
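A direct implementation is sketched below; it assumes Canvas keeps the current direction in a float member CD (in degrees), and that lineTo() and moveTo() update the CP as described earlier:

void Canvas :: forward(float dist, int isVisible)
{
	const float RadPerDeg = 0.017453293f; // pi / 180: degrees to radians
	float x = CP.getX() + dist * cos(RadPerDeg * CD);
	float y = CP.getY() + dist * sin(RadPerDeg * CD);
	if(isVisible)
		lineTo(x, y); // draw while moving
	else
		moveTo(x, y); // just move
}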
for some choice of L. Suppose that procedure hook() encapsulates these instructions. Then the shape in Figure
3.34b is drawn using four repetitions of hook(). The figure can be positioned and oriented as desired by
choices of the initial CP and CD.
Figure 3.34. Building a figure out of several turtle motions.
Example 3.5.3. Polyspirals. A large family of pleasing figures called polyspirals can be generated easily using
turtle graphics. A polyspiral is a polyline where each successive segment is larger (or smaller) than its
predecessor by a fixed amount, and oriented at some fixed angle to the predecessor. A polyspiral is rendered by
the following pseudocode:
for(<some number of iterations>)
{
	forward(length, 1);  // draw a line in the current direction
	turn(angle);         // turn through angle degrees
	length += increment; // increment the line length
}
Each time a line is drawn both its length and direction are incremented. If increment is 0, the figure neither
grows nor shrinks. Figure 3.35 shows several polyspirals. The implementation of this routine is requested in the
exercises.
Figure 3.35. Examples of polyspirals. Angles are: a). 60, b). 89.5, c). -144, d). 170.
Practice Exercises.
3.5.1. Drawing Turtle figures. Provide routines that use turtle motions to draw the three figures shown in
Figure 3.36. Can the turtle draw the shape in part c without lifting the pen and without drawing any line
twice?
7 Based on the name Maeander (which has the modern name Menderes), a winding river in Turkey [Janson 86].
Computer Graphics
Chap 3
09/21/99
5:38 PM
page 28
Definition: A polygon is regular if it is simple, if all its sides have equal lengths, and if adjacent sides meet at
equal interior angles.
As discussed in Chapter 1, a polygon is simple if no two of its edges cross each other (more precisely: only
adjacent edges can touch, and only at their shared endpoint). We give the name n-gon to a regular polygon
having n sides. Familiar examples are the 4-gon (a square), the 5-gon (a regular pentagon), the 8-gon (a regular
octagon), and so on. A 3-gon is an equilateral triangle. Figure 3.41 shows various examples. If the number of
sides of an n-gon is large, the polygon approximates a circle in appearance. In fact this is used later as one way
to implement the drawing of a circle.
The vertices of an n-gon with radius R centered at the origin lie at angles that are multiples of a = 2π/n,
beginning at P0 = (R, 0) with P1 = (R cos(a), R sin(a)); in general

Pk = (R cos(k a), R sin(k a)), for k = 0, 1, ..., n - 1        (3.6)
It's easy to modify this n-gon. To center it at position (cx, cy) we need only add cx and cy to the x- and
y-coordinates, respectively. To scale it by factor S we need only multiply R by S. To rotate through angle A we
need only add A to the arguments of cos() and sin(). More general methods for performing geometrical
transformations are discussed in Chapter 6.
It is simple to implement a routine that draws an n-gon, as shown in Figure 3.43. The n-gon is drawn centered at
(cx, cy), with radius radius, and is rotated through rotAngle degrees.
void ngon(int n, float cx, float cy, float radius, float rotAngle)
{ // assumes global Canvas object, cvs
	if(n < 3) return; // bad number of sides
	double angle = rotAngle * 3.14159265 / 180; // initial angle
	double angleInc = 2 * 3.14159265 / n;       // angle increment
	cvs.moveTo(radius * cos(angle) + cx, radius * sin(angle) + cy);
	for(int k = 0; k < n; k++) // repeat n times
	{
		angle += angleInc;
		cvs.lineTo(radius * cos(angle) + cx, radius * sin(angle) + cy);
	}
}
Figure 3.43. Building an n-gon in memory.
Example 3.6.1: A turtle-driven n-gon. It is also simple to draw an n-gon using turtle graphics. Figure 3.44
shows how to draw a regular hexagon. The initial position and direction of the turtle are indicated by the small
triangle. The turtle simply goes forward six times, making a CCW turn of 60 degrees between each move:
Figure 3.44. Drawing a hexagon.
for(int i = 0; i < 6; i++)
{
	cvs.forward(L, 1);
	cvs.turn(60);
}
One vertex is situated at the initial CP, and both CP and CD are left unchanged by the process. Drawing the
general n-gon, and some variations of it, is discussed in the exercises.
Figure 3.45. A 7-gon and its offspring. a). the 7-gon, b). a stellation, c). a 7-rosette.
Example 3.6.2. The rosette, and the golden 5-rosette.
The rosette is an n-gon with each vertex joined to every other vertex. Figure 3.46 shows 5-, 11-, and
17-rosettes. A rosette is sometimes used as a test pattern for computer graphics devices. Its orderly shape readily
reveals any distortions, and the resolution of the device can be determined by noting the amount of crowding
and blurring exhibited by the bundle of lines that meet at each vertex.
Rosettes are easy to draw: simply connect every vertex to every other. In pseudocode this looks like:

void Rosette(int N, float radius)
{
	Point2 pt[big enough value for largest rosette];
	generate the vertices pt[0], ..., pt[N-1], as in Figure 3.43
	for(int i = 0; i < N - 1; i++)
		for(int j = i + 1; j < N; j++)
		{
			cvs.moveTo(pt[i]); // connect all the vertices
			cvs.lineTo(pt[j]);
		}
}
The 5-rosette is particularly interesting because it embodies many instances of the golden ratio φ (recall Chapter 2). Figure 3.47a shows a 5-rosette, which is made up of an outer pentagon and an inner pentagram. The Greeks saw a mystical significance in this figure. Its segments have an interesting relationship: each segment is φ times longer than the next smaller one (see the exercises). Also, because the edges of the star pentagram form an inner pentagon, an infinite regression of pentagrams is possible, as shown in Figure 3.47b.
Figure 3.47. a). A 5-rosette. b). An infinite regression of pentagrams (outer radius R, next radius fR, and so on).
3.6.3. Prime Rosettes. If a rosette has a prime number N of sides, it can be drawn without lifting the pen, that is, by using only lineTo(). Start at vertex v0 and draw to each of the others in turn: v1, v2, v3, . . ., until v0 is again reached and the polygon is drawn. Then go around again drawing lines, but skip a vertex each time; that is, increment the index by 2, thereby drawing to v2, v4, . . ., v0. This will require going around twice to arrive back at v0. (A modulo operation is performed on the indices so that their values remain between 0 and N - 1.) Then repeat this, incrementing by 3: v3, v6, v9, . . ., v0. Each pass draws exactly N lines. Because there are N(N - 1)/2 lines in all, the process repeats (N - 1)/2 times. Because the number of vertices is prime, no line is ever repeated until the drawing is complete. Develop and test a routine that draws prime rosettes in this way.
3.6.4. Rosettes with an odd number of sides. If n is prime we know the n-rosette can be drawn as a single
polyline without lifting the pen. It can also be drawn as a single polyline for any odd value of n. Devise a
method that does this.
3.6.5. The Geometry of the Star Pentagram. Show that the length of each segment in the 5-rosette stands in the golden ratio φ to that of the next smaller one. One way to tackle this is to show that the triangles of the star pentagram are golden triangles with an inner angle of π/5 radians. Show that 2 cos(π/5) = φ and 2 cos(2π/5) = 1/φ. Another approach uses only two families of similar triangles in the pentagram and the relation φ³ = φ² + φ satisfied by φ.
3.6.6. Erecting Triangles on n-gon legs. Write a routine that draws figures like the logo in part a of Figure 3.48
for any value of f, positive or negative. What is a reasonable geometric interpretation of negative f?
3.6.7. Drawing the Star with Relative Moves and Draws. Write a routine to draw a pentagram that uses only
relative moves and draws, centering the star at the CP.
3.6.8. Draw a pattern of stars. Write a routine to draw the pattern of 21 stars shown in Figure 3.49. The small
stars are positioned at the vertices of an n-gon.
3.6.10. Turtle drawings of the n-gon. Write turtleNgon(int numSides, float length) that uses
turtlegraphics to draw an n-gon with numSides sides and a side of length length.
3.6.11. Polygons sharing an edge. Write a routine that draws n-gons, for n = 3, . . ., 12, on a common edge, as in Figure 3.51.
The routine drawCircle( ) is called by specifying a center and radius, but there are other ways to describe a
circle, which have important applications in interactive graphics and computer-aided design. Two familiar ones
are:
1). The center is given, along with a point on the circle. Here drawCircle( ) can be used as soon as the
radius is known. If c is the center and p is the given point on the circle, the radius is simply the distance from c
to p, found using the usual Pythagorean Theorem.
2). Three points are given through which the circle must pass. It is known that a unique circle passes through
any three points that don't lie in a straight line. Finding the center and radius of this circle is discussed in
Chapter 4.
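For case 1), the code is nearly trivial. The sketch below assumes the Point2 type of this chapter and that drawCircle(center, radius) is available on the Canvas object cvs:

void drawCircleCenterPoint(Point2 c, Point2 p)
{
	float dx = p.x - c.x, dy = p.y - c.y;     // offset from center to the given point
	float radius = sqrt(dx * dx + dy * dy);   // the Pythagorean distance from c to p
	cvs.drawCircle(c, radius);
}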
Example 3.7.1. Blending Arcs together. More complex shapes can be obtained by using parts of two circles
that are tangent to one another. Figure 3.58 illustrates the underlying principle. The two circles are tangent at
point A, where they share tangent line L. Because of this the two arcs shown by the thick curve blend together
seamlessly at A with no visible break or corner. Similarly the arc of a circle blends smoothly with any tangent
line, as at point B.
3.7.7. A tear drop. A tear drop shape that is used in many ornamental figures is shown in Figure 3.65a. As shown in part b) it consists of a circle of given radius R snuggled down into an angle θ. What are the coordinates of the circle's center C for a given R and θ? What are the initial angle of the arc, and its sweep? Develop a routine to draw a tear drop at any position and in any orientation.
For example, the straight line through points A and B has implicit form:
F(x, y) = (y - Ay)(Bx - Ax) - (x - Ax)(By - Ay)
(3.8)
and the circle with radius R centered at the origin has implicit form:

F(x, y) = x² + y² − R²    (3.9)
A benefit of using the implicit form is that you can easily test whether a given point lies on the curve:
simply evaluate F(x, y) at the point in question. For certain classes of curves it is meaningful to speak of
an inside and an outside of the curve, in which case F(x, y) is also called the inside-outside function,
with the understanding that

F(x, y) = 0 for all points on the curve;
F(x, y) > 0 for all points outside the curve;
F(x, y) < 0 for all points inside the curve.    (3.10)
(Is F(x, y) of Equation 3.9 a legitimate inside-outside function for the circle?)
Some curves are single-valued in x, in which case there is a function g(.) such that all points on the curve satisfy y = g(x). For such curves the implicit form may be written F(x, y) = y - g(x). (What is g(.) for the line of Equation 3.8?) Other curves are single-valued in y, so there is a function h(.) such that points on the curve satisfy x = h(y). And some curves are not single-valued at all: F(x, y) = 0 cannot be rearranged into either of the forms y = g(x) or x = h(y). The circle, for instance, can be expressed as:

y = ±√(R² − x²)    (3.11)

which yields two values of y for each x inside the circle. A curve can also be described parametrically. The straight line through the points A and B, for example, is traced by

x(t) = Ax + (Bx − Ax)t,  y(t) = Ay + (By − Ay)t    (3.12)
Thus the point P(t) = (x(t), y(t)) sweeps through all of the points on the line between A and B as t varies
from 0 to 1 (check this out).
Another classic example is the ellipse, a slight generalization of the circle. It is described parametrically by

x(t) = W cos(t)
y(t) = H sin(t)    (3.13)

for 0 ≤ t ≤ 2π. Here W is the half-width, and H the half-height of the ellipse. Some of the geometric properties of the ellipse are explored in the exercises. When W and H are equal the ellipse is a circle of radius W.
Figure 3.69 shows this ellipse, along with the component functions x(.) and y(.).
Figure 3.69. An ellipse described parametrically. The figure also shows the component functions x(.) and y(.), marks the position of (x(t), y(t)) at t = π/2, π, and 3π/2, and indicates the foci at (±c, 0).
As t varies from 0 to 2π the point P(t) = (x(t), y(t)) moves once around the ellipse, starting (and finishing) at (W, 0). The figure shows where the point is located at various times t. It is useful to visualize drawing the ellipse on an Etch-a-Sketch: the knobs are turned back and forth in an undulating pattern, one mimicking W cos(t) and the other H sin(t). (This is surprisingly difficult to do manually.)
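As a sketch of how such a curve might be drawn with the tools of this chapter (assuming the global Canvas cvs), the ellipse can be approximated by a polyline with n segments:

void drawEllipse(float cx, float cy, float W, float H, int n)
{
	const double PI = 3.14159265358979;
	cvs.moveTo(cx + W, cy);                 // the point at t = 0
	for (int i = 1; i <= n; i++)            // step t from 0 to 2*pi
	{
		double t = 2 * PI * i / n;
		cvs.lineTo(cx + W * cos(t), cy + H * sin(t));
	}
}

For modest n (say 50 or so) the polyline is indistinguishable from a true ellipse on most displays.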
Finding an implicit form from a parametric form - implicitization.
Suppose we want to check that the parametric form in Equation 3.13 truly represents an ellipse. How do we find the implicit form from the parametric form? The basic step is to combine the two equations for x(t) and y(t) to somehow eliminate the variable t. This provides a relationship that must hold for all t. It isn't always easy to see how to do this; there are no simple guidelines that apply to all parametric forms. For the ellipse, however, square both x/W and y/H and use the well-known fact cos²(t) + sin²(t) = 1 to obtain the familiar equation for an ellipse:
(x/W)² + (y/H)² = 1    (3.14)
The following exercises explore properties of the ellipse and other classical curves. They develop useful facts about the conic sections, which will be used later. Read them over, even if you don't stop to solve each one.
Practice Exercises
3.8.1. On the geometry of the Ellipse. An ellipse is the set of all points for which the sum of the distances to two foci is constant. The point (c, 0) shown in Figure 3.69 forms one focus, and (−c, 0) forms the other. Show that H, W, and c are related by: W² = H² + c².
3.8.2. How eccentric. The eccentricity, e = c/W, of an ellipse is a measure of how noncircular the ellipse is, being 0 for a true circle. As interesting examples, the planets in our solar system have very nearly circular orbits, with e ranging from 1/143 (Venus) to 1/4 (Pluto). Earth's orbit exhibits e = 1/60. As the eccentricity of an ellipse approaches 1, the ellipse flattens into a straight line. But e has to get very close to 1 before this happens. What is the ratio H/W of height to width for an ellipse that has e = 0.99?
3.8.3. The other Conic Sections.
The ellipse is one of the three conic sections, which are curves formed by cutting (sectioning) a circular cone with a plane, as shown in Figure 3.70. The conic sections are:

ellipse: if the plane cuts one nappe of the cone;
hyperbola: if the plane cuts both nappes;
parabola: if the plane is parallel to the side of the cone.
3.8.3. Superellipses
An excellent variation of the ellipse is the superellipse, a family of ellipse-like shapes that can produce
good effects in many drawing situations. The implicit formula for the superellipse is
|x/W|ⁿ + |y/H|ⁿ = 1    (3.17)
where n is a parameter called the bulge. Looking at the corresponding formula for the ellipse in Equation
3.14, the superellipse is seen to become an ellipse when n = 2. The superellipse has the following
parametric representation:
x(t) = W cos(t)|cos(t)|^(2/n − 1)
y(t) = H sin(t)|sin(t)|^(2/n − 1)    (3.18)

for 0 ≤ t ≤ 2π. The exponent on the sin() and cos() is really 2/n, but the peculiar form shown is used to avoid trying to raise a negative number to a fractional power. A more precise version avoids this. Check that this form reduces nicely to the equation for the ellipse when n = 2. Also check that the parametric form for the superellipse is consistent with the implicit equation.
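One possible sketch of a superellipse drawer follows; the helper powSign() raises |u| to a power and restores the sign of u, which is the more careful treatment alluded to above. It assumes the global Canvas cvs:

double powSign(double u, double expo)       // sign-preserving power: sign(u)*|u|^expo
{
	double p = pow(fabs(u), expo);
	return (u < 0) ? -p : p;
}
void drawSuperEllipse(float cx, float cy, float W, float H, double n, int m)
{
	const double PI = 3.14159265358979;
	cvs.moveTo(cx + W, cy);                 // the point at t = 0
	for (int i = 1; i <= m; i++)            // m segments around the curve
	{
		double t = 2 * PI * i / m;
		cvs.lineTo(cx + W * powSign(cos(t), 2.0 / n),
		           cy + H * powSign(sin(t), 2.0 / n));
	}
}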
Figure 3.74a shows a family of supercircles, special cases of superellipses for which W = H. Figure
3.74b shows a scene composed entirely of superellipses, suggesting the range of shapes possible.
Figure 3.74. a). Family of supercircles. b). Scene composed of superellipses.
For n > 1 the bulge is outward, whereas for n < 1 it is inward. When n = 1 the sides are straight: the supercircle becomes a square standing on one corner. (In Chapter 6 we shall look at three-dimensional superquadrics, surfaces that are sometimes used in CAD systems to model solid objects.)
Superellipses were first studied in 1818 by the French physicist Gabriel Lamé. More recently, in 1959, the extraordinary inventor Piet Hein (best known as the originator of the Soma cube and the game Hex) was approached with the problem of designing a traffic circle in Stockholm. It had to fit inside a rectangle (with W/H = 6/5) determined by other roads, and had to permit smooth traffic flow as well as be pleasing to the eye. An ellipse proved to be too pointed at the ends for the best traffic patterns, and so Piet Hein sought a fatter curve with straighter sides and dreamed up the superellipse. He chose n = 2.5 as the most pleasing bulge. Stockholm quickly accepted the superellipse motif for its new center. The curves were "strangely satisfying, neither too rounded nor too orthogonal, a happy blend of elliptical and rectangular beauty" [Gardner75, p. 243]. Since that time, superellipse shapes have appeared in furniture, textile patterns, and even silverware. More about them can be found in the references, especially [Gardner75] and [Hill 79b].
The superhyperbola can also be defined [Barr81]. Just replace cos(t) by sec(t), and sin(t) by tan(t), in Equation 3.18. When n = 2, the familiar hyperbola is obtained. Figure 3.75 shows example superhyperbolas. As the bulge n increases beyond 2, the curve bulges out more and more, and as it decreases below 2, it bulges out less and less, becoming straight for n = 1 and pinching inward for n < 1.
Figure 3.75. The superhyperbola family.
A curve can also be described in polar coordinates, letting both the radius r and the angle θ depend on a parameter t:

r = r(t), θ = θ(t)    (3.19)
But a simplification is possible for a large number of appealing curves. In these instances the radius r is expressed directly as a function of θ, and the parameter that sweeps out the curve is θ itself. For each point (r, θ) the corresponding Cartesian point (x, y) is given by

x = f(θ) cos(θ)
y = f(θ) sin(θ)    (3.20)

Curves given in polar coordinates can be generated and drawn as easily as any others: the parameter is θ, which is made to vary over an interval appropriate to the shape. The simplest example is a circle with radius K: f(θ) = K. The form f(θ) = 2K cos(θ) is another simple curve (which one?). Figure 3.77 shows some shapes that have simple expressions in polar coordinates:
Figure 3.77. Examples of curves with simple polar forms.
Cardioid: f(θ) = K(1 + cos(θ)).
Rose curves: f(θ) = K cos(nθ), where n specifies the number of petals in the rose. Two cases are shown.
Archimedean spiral: f(θ) = Kθ.
In each case, the constant K gives the overall size of the curve. Because the cardioid is periodic, it can be drawn by varying θ from 0 to 2π. The rose curves are periodic when n is an integer, and the Archimedean spiral keeps growing forever as θ increases from 0. The shape of this spiral has found wide use as a cam to convert rotary motion to linear motion (see [Yates46] and [Seggern90]).
The conic sections (ellipse, parabola, and hyperbola) all share the following polar form:

f(θ) = 1 / (1 − e cos(θ))    (3.21)

where e is the eccentricity of the conic section. For e = 1 the shape is a parabola; for 0 ≤ e < 1 it is an ellipse; and for e > 1 it is a hyperbola.
The Logarithmic Spiral
The logarithmic spiral (or equiangular spiral) f(θ) = Ke^(aθ), shown in Figure 3.78a, is also of particular interest [Coxeter61]. This curve cuts all radial lines at a constant angle α, where a = cot(α). This is the only spiral that has the same shape for any change of scale: enlarge a photo of such a spiral any amount, and the enlarged spiral can be rotated so that it fits exactly on the original.
3.8.5. 3D Curves.
Curves that meander through 3D space may also be represented parametrically, and will be discussed
fully in later chapters. To create a parametric form for a 3D curve we invent three functions x(.), y(.), and
z(.), and say the curve is at P(t) = (x(t), y(t), z(t)) at time t.
Some examples are:
The helix: The circular helix is given parametrically by:
x(t) = cos(t)
y(t)= sin(t)
z(t) = bt
(3.22)
for some constant b. It is illustrated in Figure 3.79 as a stereo pair. See the Preface for viewing stereo pairs. If you find this unwieldy, just focus on one of the figures.
(3.23)
This
is formed by winding a string about a torus (doughnut). Figure 3.80 shows the case c = 10, so the string
makes 10 loops around the torus. We examine tubes based on this spiral in Chapter 6.
routines such as setWindow(), setViewport(), moveTo(), lineTo(), and forward(), and ensures that all proper initializations are carried out. In a Case Study we implement Canvas for a more basic non-OpenGL environment, where explicit clipping and window-to-viewport mapping routines are required. Here the value of data-hiding within the class is even more apparent.
A number of additional tools were developed for performing relative drawing and turtle graphics, and for creating drawings that include regular polygons, arcs, and circles. The parametric form for a curve was introduced, and shown to be a very natural description of a curve. It makes it simple to draw a curve, even one that is multi-valued, crosses over itself, or has regions where it moves vertically.
3.10.1. Case Study 3.1. Studying the Logistic Map and Simulation of Chaos.
(Level of Effort: II) Iterated function systems (IFS's) were discussed at the end of Chapter 2. Another IFS provides a fascinating look into the world of chaos (see [Gleick87, Hofs85]), and requires proper setting of a window and viewport. A sequence of values is generated by the repeated application of a function f(.), called the logistic map. It describes a parabola:

f(x) = 4λx(1 − x)    (3.24)

where λ is some chosen constant between 0 and 1. Beginning at a given starting point, x0, between 0 and 1, function f(.) is applied iteratively to generate the orbit (recall its definition in Chapter 2):

xk = f^[k](x0)

where f^[k](.) denotes f(.) applied k times in succession.
How does this sequence behave? A world of complexity lurks here. The action can be made most vivid by displaying it graphically in a certain fashion, as we now describe. Figure 3.82 shows the parabola y = 4λx(1 − x) for λ = 0.7 as x varies from 0 to 1.

Figure 3.82. The logistic map for λ = 0.7.
The starting point x0 = 0.1 is chosen here, and at x = 0.1 a vertical line is drawn up to the parabola, showing the value f(x0) = 0.252. Next we must apply the function to the new value x1 = 0.252. This is shown visually by moving horizontally over to the line y = x, as illustrated in the figure. Then to evaluate f(.) at this new value a line is again drawn up vertically to the parabola. This process repeats forever as in other IFS's. From the previous position (xk-1, xk) a horizontal line is drawn to (xk, xk), from which a vertical line is drawn to (xk, xk+1). The figure shows that for λ = 0.7 the values quickly converge to a stable attractor, a fixed point x for which f(x) = x. (What is its value for λ = 0.7?) This attractor does not depend on the starting point; the sequence always converges quickly to a final value.
If λ is set to small values, the action will be even simpler: there is a single attractor at x = 0. But when the λ "knob" is increased, something strange begins to happen. Figure 3.83a shows what results when λ = 0.85. The orbit that represents the sequence falls into an endless repetitive cycle, never settling down to a final value. There are several attractors here, one at each vertical line in the limit cycle shown in the figure. And when λ is increased beyond the critical value 0.892486418... the process becomes truly chaotic.
Figure 3.83. The logistic map for a). λ = 0.85 and b). λ = 0.9.
The case of λ = 0.9 is shown in Figure 3.83b. For most starting points the orbit is still periodic, but the number of values visited between the repeats is extremely large. Other starting points yield truly aperiodic motion, and very small changes in the starting point can lead to very different behavior. Before the truly remarkable character of this phenomenon was first recognized by Mitchell Feigenbaum in 1975, most researchers believed that very small adjustments to a system should produce correspondingly small changes in its behavior, and that simple systems such as this could not exhibit arbitrarily complicated behavior. Feigenbaum's work spawned a new field of inquiry into the nature of complex nonlinear systems, known as chaos theory [Gleick87]. It is intriguing to experiment with this logistic map.
Write and exercise a program that permits the user to study the behavior of repeated iterations of the logistic map, as shown in Figure 3.83. Set up a suitable window and viewport so that the entire logistic map can be clearly seen. The user gives the values of x0 and λ, and the program draws the limit cycles produced by the system.
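As a starting point, here is a minimal sketch (not a full solution) of the inner drawing loop, which traces the cobweb of vertical and horizontal segments described above; lambda and x0 come from the user, and the global Canvas cvs is assumed:

void logisticCobweb(double lambda, double x0, int numIters)
{
	double x = x0;
	cvs.moveTo(x, 0.0);                        // start on the x-axis at x0
	for (int k = 0; k < numIters; k++)
	{
		double fx = 4.0 * lambda * x * (1.0 - x);  // Equation 3.24
		cvs.lineTo(x, fx);                     // vertical line up to the parabola
		cvs.lineTo(fx, fx);                    // horizontal line over to y = x
		x = fx;
	}
}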
individual bits of code can be tested to see on which side of the window P lies, and the chopping can be
accomplished as in Equation 3.5. Figure 3.85 shows a chop routine that finds the new point (such as A in Figure
3.22) and replaces P with it. It uses the bit-wise AND of code with a mask to determine where P lies relative to
the window.
void ChopLine(Point2 &P, unsigned char code)
{
	if (code & 8)        // P lies to the left of the window
	{
		P.y += (window.l - P.x) * dely / delx;
		P.x = window.l;
	}
	else if (code & 2)   // to the right
	{
		P.y += (window.r - P.x) * dely / delx;
		P.x = window.r;
	}
	else if (code & 1)   // below
	{
		P.x += (window.b - P.y) * delx / dely;
		P.y = window.b;
	}
	else if (code & 4)   // above
	{
		P.x += (window.t - P.y) * delx / dely;
		P.y = window.t;
	}
}
Figure 3.85. Chopping the segment that lies outside the window.
Write a complete implementation of the Cohen Sutherland algorithm, putting together the pieces described here
with those in Section 3.3.2. If you do this in the context of a Canvas class implementation as discussed in the
next Case Study, consider how the routine should best access the private data members of the window and the
points involved, and develop the code accordingly.
Test the algorithm by drawing a window and a large assortment of randomly chosen lines, showing the parts that lie inside the window in red, and those that lie outside in black.
Practice Exercises.
3.10.1. Why will a divide by zero never occur? Consider a vertical line segment such that delx is zero. Why is the statement P.y += (window.l - P.x) * dely / delx, which would cause a divide by zero, never reached? Similarly explain why each of the four statements that compute delx/dely or dely/delx is never reached if its denominator happens to be zero.
3.10.2. Do two chops in the same iteration? It would seem to improve performance if we replaced lines such as else if(code & 2) with if(code & 2) and tried to do two line chops in succession. Show that this can lead to erroneous endpoints being computed, and hence to disaster.
private:
	Point2 CP;             // current position in the world
	IntRect viewport;      // the current viewport
	RealRect window;       // the current window
	float mapA, mapB, mapC, mapD;  // data for the window-to-viewport mapping
	void makeMap(void);    // builds the map
	int screenWidth, screenHeight;
	float delx, dely;      // increments for the clipper
	char code1, code2;     // outside codes for the clipper
	void ChopLine(Point2 &p, char c);
	int clipSegment(Point2 &p1, Point2 &p2);
};
Figure 3.86. Interface for the Canvas class for Turbo C++.
3.10.5. Case Study 3.5. Some Figures used in Physics and Engineering.
(Level of Effort: II) This Case Study works with a collection of interesting pictures that arise in certain topics within physics and engineering. The first illustrates a physical principle involving circles intersecting at right angles; the second creates a chart that can be used to study electromagnetic phenomena; the third develops symbols that are used in designing digital systems.
1). Electrostatic Fields. The pattern of circles shown in Figure 3.89 is studied in physics and electrical engineering, as the electrostatic field lines that surround electrically charged wires. It also appears in mathematics in connection with the analytic functions of a complex variable. In Chapter 5 these families also are found when we examine a fascinating set of transformations, inversions in a circle. Here we view them simply as an elegant array of circles and consider how to draw them.
10From J.Fleming, H. Honour, N. Pevsner: Dictionary of Architecture. Penguin Books, London 1980
11From the old French ogive meaning an S-shaped curve.
The circles form two families: the two-pointers, which pass through the two points (−a, 0) and (a, 0) where the wires pierce the plane, and the surrounders, which encircle one of the wires. A two-pointer with parameter m has center (0, ±a√(m² − 1)) and radius am.
Figure 3.91. Standard graphic symbols for the NAND and NOR gates.
Figure 3.91a shows the symbol for the NAND gate, according to a worldwide standard12. The NAND gate is basically a rounded arch placed on its side. The arc has radius 13 units relative to the other elements, so the NAND gate must be 26 units in height. Figure 3.91b shows the standard symbol for a NOR gate. It is similar to a pointed arch turned on its side. Three arcs are used, each having a radius of 26 units. (The published standard as shown has an error in it that makes it impossible for certain elements to fit together. What is the error?)
Write a program that can draw both of these gate symbols at any size and position in the world. (For the NOR gate find and implement a reasonable correction to the error in Figure 3.91b.) Also arrange matters so that your program can draw these gates rotated by 90°, 180°, or 270°.
12The Institute of Electrical and Electronics Engineers (IEEE) publishes many things, including standard definitions of terminology and graphic shapes of circuit elements. These drawings are taken from the standard document IEEE Std. 91-1984.
B). Truchet Tiles. A slight variation of the method above selects successive motifs randomly from a pool of candidate motifs. Figure 3.93a shows the well-known Truchet tiles13, which are based on two quarter circles centered at opposite corners of a square. Tile 0 and tile 1 differ only by a 90° rotation.
Figure 3.93. Truchet tiles. a). The two tiles. b). A Truchet pattern.
Write an application that draws Truchet tiles over the entire viewport. Each successive tile uses tile 0 or tile 1,
selected at random.
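A sketch of such an application appears below. It assumes a routine drawArc(x, y, radius, startAngle, sweep) (angles in degrees) like the arc-drawing tool discussed earlier in this chapter, and tiles an nRows-by-nCols grid of squares of side S:

void truchetPattern(float left, float bottom, int nRows, int nCols, float S)
{
	for (int row = 0; row < nRows; row++)
		for (int col = 0; col < nCols; col++)
		{
			float x = left + col * S, y = bottom + row * S; // lower-left corner
			if (rand() % 2 == 0)  // tile 0: arcs about the lower-left and upper-right corners
			{
				drawArc(x, y, S / 2, 0, 90);
				drawArc(x + S, y + S, S / 2, 180, 90);
			}
			else                  // tile 1: tile 0 rotated 90 degrees
			{
				drawArc(x + S, y, S / 2, 90, 90);
				drawArc(x, y + S, S / 2, 270, 90);
			}
		}
}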
Curves other than arcs can be used as well, as suggested in Figure 3.94. What conditions should be placed on
the angle with which each curve meets the edge of the tile in order to avoid sharp corners in the resulting curve?
This notion can also be extended to include more than two tiles.
One can also draw webs, as suggested in Figure 3.96. Here the index values cycle many times
through the possible values, skipping by some M each time. This is easily done by forming the next index
from the previous one using i = (i + M) mod (n+1).
Figure 3.96. Adding webs to a curve.
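In code the web drawer is equally brief. The sketch below assumes the curve's n + 1 sample points are already stored in pt[0], . . ., pt[n], and that the Point2 type and global Canvas cvs of this chapter are available:

void drawWeb(Point2 pt[], int n, int M)
{
	int i = 0;
	cvs.moveTo(pt[0]);
	do {
		i = (i + M) % (n + 1);   // skip ahead by M, wrapping around
		cvs.lineTo(pt[i]);
	} while (i != 0);            // stop when the walk returns to the start
}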
The epitrochoid:

x(t) = (a + b) cos(2πt) − k cos(2π(a + b)t / b)
y(t) = (a + b) sin(2πt) − k sin(2π(a + b)t / b)    (3.24)

The hypotrochoid:

x(t) = (a − b) cos(2πt) + k cos(2π(a − b)t / b)
y(t) = (a − b) sin(2πt) − k sin(2π(a − b)t / b)    (3.25)
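A sketch of a routine that traces the hypotrochoid of Equation 3.25 follows (the epitrochoid of Equation 3.24 differs only in the signs and the use of a + b). Because the curve need not close after one unit of t, the parameter runs over P periods; the global Canvas cvs is assumed:

void drawHypotrochoid(double a, double b, double k, int P, int numSteps)
{
	const double PI = 3.14159265358979;
	for (int i = 0; i <= numSteps; i++)
	{
		double t = (double)P * i / numSteps;
		double x = (a - b) * cos(2 * PI * t) + k * cos(2 * PI * (a - b) * t / b);
		double y = (a - b) * sin(2 * PI * t) - k * sin(2 * PI * (a - b) * t / b);
		if (i == 0) cvs.moveTo(x, y); else cvs.lineTo(x, y);
	}
}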
Preview
This chapter develops a number of useful tools for dealing with geometric objects encountered in
computer graphics. Section 4.1 motivates the use of vectors in graphics, and describes the principal
coordinate systems used. Section 4.2 reviews the basic ideas of vectors, and describes the key operations
that vectors allow. Although most results apply to any number of dimensions, vectors in 2D and 3D are
stressed. Section 4.3 reviews the powerful dot product operation, and applies it to a number of geometric
tasks, such as performing orthogonal projections, finding the distance from a point to a line, and finding
the direction of a ray reflected from a shiny surface. Section 4.4 reviews the cross product of two
vectors, and discusses its important applications in 3D graphics.
Section 4.5 introduces the notion of a coordinate frame and homogeneous coordinates, and stresses that
points and vectors are significantly different types of geometric objects. It also develops the two principal
mathematical representations of a line and a plane, and shows where each is useful. It also introduces
affine combinations of points and describes an interesting kind of animation known as tweening. A
preview of Bezier curves is described as an application of tweening.
Section 4.6 examines the central problem of finding where two line segments intersect, which is vastly
simplified by using vectors. It also discusses the problem of finding the unique circle determined by three
points. Section 4.7 discusses the problem of finding where a ray hits a line or plane, and applies the
notions to the clipping problem. Section 4.8 focuses on clipping lines against convex polygons and
polyhedra, and develops the powerful Cyrus-Beck clipping algorithm.
The chapter ends with Case Studies that extend these tools and provide opportunities to enrich your
graphics programming skills. Tasks include processing polygons, performing experiments in 2D ray
tracing, drawing rounded corners on figures, animation by tweening, and developing advanced clipping
tools.
4.1 Introduction.
In computer graphics we work, of course, with objects defined in a three dimensional world (with 2D objects
and worlds being just special cases). All objects to be drawn, and the cameras used to draw them, have shape,
position, and orientation. We must write computer programs that somehow describe these objects, and describe
how light bounces around illuminating them, so that the final pixel values on the display can be computed.
Think of an animation in which a camera flies through a hilly scene containing various buildings, trees, roads, and cars. What does the camera see? It all has to be converted ultimately to numbers. It's a tall order.
The two fundamental sets of tools that come to our aid in graphics are vector analysis and transformations. By
studying them in detail we develop methods to describe the various geometric objects we will encounter, and
we learn how to convert geometric ideas to numbers. This leads to a collection of crucial algorithms that we can
call upon in graphics programs.
In this chapter we examine the fundamental operations of vector algebra, and see how they are used in graphics; transformations are addressed in Chapter 5. We start at the beginning and develop a number of important tools and methods of attack that will appear again and again throughout the book. If you have previously studied vectors, much of this chapter will be familiar, but the numerous applications of vector analysis to geometric situations should still be scrutinized. The chapter might strike you as a mathematics text, but having it all collected in one place, and related to the real problems we encounter in graphics, should prove useful.
Why are vectors so important?
A preview of some situations where vector analysis comes to the rescue might help to motivate the study of vectors. Figure 4.1 shows three geometric problems that arise in graphics. Many other examples could be given.
Figure 4.1. Three sample geometric problems that yield readily to vector analysis: a). finding the center of the circle through the points (4, 6), (2, 2), and (7, 1); b). a camera and its viewplane in a scene; c). a shiny cone reflecting a cube.
Part a) shows a computer-aided design problem: the user has placed three points on the display with the mouse, and wants to draw the unique circle that passes through them. (Can you visualize this circle?) For the coordinates given, where is the center of the circle located? We see in Section 4.6 that this problem is thorny without the use of vectors, but almost trivial when the right vector tools are used.
Part b) shows a camera situated in a scene that contains a Christmas tree. The camera must form an image of the tree on its viewplane (similar to the film plane of a physical camera), which will be transferred to a screen window on the user's display. Where does the image of the tree appear on this plane, and what is its exact shape? To answer this we need a detailed study of perspective projections, which will be greatly aided by the use of vector tools. (If this seems too easy, imagine that you are developing an animation, and the camera is zooming in on the tree along some trajectory, and rotating as it does so. Write a routine that generates the whole sequence of images!)
Part c) shows a shiny cone in which the reflection of a cube can be seen. Given the positions of the cone, cube,
and viewing camera, where exactly does the reflected image appear, and what is its color and shape? When
studying ray tracing in Chapter 15 we will make extensive use of vectors, and we will see that this problem is
readily solved.
Some Basics.
All points and vectors we work with are defined relative to some coordinate system. Figure 4.2 shows the coordinate systems that are normally used. Each system has an origin called 𝒪 and some axes emanating from 𝒪. The axes are usually oriented at right angles to one another. Distances are marked along each axis, and a
point is given coordinates according to how far along each axis it lies. Part a) shows the usual two-dimensional
system. Part b) shows a right handed 3D coordinate system, and part c) shows a left handed coordinate system.
Figure 4.2. Common coordinate systems: a). the usual 2D system, b). a right-handed 3D system, c). a left-handed 3D system.
Mostly we will be interested in 2D or 3D vectors as in r = (3.4, -7.78) or t = (33, 142.7, 89.1). Later when it
becomes important we will explore the distinction between a vector and its representation, and in fact will
use a slightly expanded notation to represent vectors (and points). Writing a vector as a row matrix like t =
(33, 142.7, 89.1) fits nicely on the page, but when it matters we will instead write vectors as column
matrices:
r = |  3.4  |            | 33    |
    | -7.78 | ,  or  t = | 142.7 |
                         | 89.1  |
It matters when we want to multiply a point or a vector by a matrix, as we shall see in Chapter 5.
Definition: A linear combination of the m vectors v1, v2, . . . , vm is a vector of the form

w = a1v1 + a2v2 + . . . + amvm    (4.2)

where a1, a2, . . . , am are scalars.
For example, the linear combination 2(3, 4,-1) + 6(-1, 0, 2) forms the vector (0, 8, 10). In later chapters we
shall deal with rather elaborate linear combinations of vectors, especially when representing curves and
surfaces using spline functions.
Two special types of linear combinations, affine and convex combinations, are particularly important in
graphics.
A linear combination is an affine combination if its coefficients add up to one:

a1 + a2 + . . . + am = 1    (4.3)
Affine combinations of vectors appear in various contexts, as do affine combinations of points, as we see
later.
A convex combination is an affine combination whose coefficients are also nonnegative:

a1 + a2 + . . . + am = 1    (4.5)

and ai ≥ 0, for i = 1, . . ., m. As a consequence all ai must lie between 0 and 1. (Why?)
Thus .3a+.7b is a convex combination of a and b, but 1.8a -.8b is not. The set of coefficients a1, a2, . . . , am
is sometimes said to form a partition of unity, suggesting that a unit amount of material is partitioned into
pieces. Convex combinations frequently arise in applications when one is making a unit amount of some
brew and can combine only positive amounts of the various ingredients. They appear in unexpected
contexts. For instance, we shall see in Chapter 8 that spline curves are in fact convex combinations of
certain vectors, and in our discussion of color in Chapter 12 we shall find that colors can be considered as
vectors, and that any color of unit brightness may be considered to be a convex combination of three primary
colors!
We will find it useful to talk about the set of all convex combinations of a collection of vectors. Consider
the set of all convex combinations of the two vectors v1 and v2. It is the set of all vectors
v = (1 - a) v1 + a v2
(4.6)
as the parameter a is allowed to vary from 0 to 1 (why?). What is this set? Rearranging the equation, v is seen to be:
v = v1 + a (v2 - v1)
(4.7)
Figure 4.8a shows this to be the vector that is v1 plus some fraction of v2 - v1, so the tip of v lies on the line
joining v1 and v2. As a varies from 0 to 1, v takes on all the positions on the line from v1 to v2, and only
those.
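Equation 4.7 translates directly into code. The fragment below is a minimal sketch using its own small vector type (any equivalent 2D vector class would do):

struct Vec2 { float x, y; };

Vec2 tween(Vec2 v1, Vec2 v2, float a)   // v1 + a*(v2 - v1), per Equation 4.7
{
	Vec2 v;
	v.x = v1.x + a * (v2.x - v1.x);
	v.y = v1.y + a * (v2.y - v1.y);
	return v;                            // at a = 0 this is v1; at a = 1 it is v2
}

This is exactly the operation behind the tweening animations previewed for Section 4.5.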
Consider next the set of all convex combinations of three vectors v1, v2, and v3:

v = a1v1 + a2v2 + (1 − a1 − a2)v3    (4.8)
where we also insist that a1 plus a2 does not exceed one. This is a convex combination, since none of the
coefficients is ever negative and they sum to one. Figure 4.9 shows the three position vectors v1 = (2, 6), v2 =
(3, 3), and v3 = (7, 4). By the proper choices of a1 and a2, any vector lying within the shaded triangle of
vectors can be represented, and no vectors outside this triangle can be reached. The vector b = .2 v1 + .5 v2 +
.3 v3, for instance, is shown explicitly as the vector sum of the three weighted ingredients. (Note how it is
built up out of portions of the three constituent vectors.) So the set of all convex combinations of these
three vectors spans the shaded triangle. The proof of this is requested in the exercises.
If a2 = 0, any vector on the line L that joins v1 and v3 can be reached by the proper choice of a1. For example, the vector that is 20 percent of the way from v1 to v3 along L is given by .8v1 + 0v2 + .2v3.
The magnitude (or length) of a vector w = (w1, w2, . . ., wn) is

|w| = √(w1² + w2² + . . . + wn²)    (4.9)

For example, the magnitude of w = (4, −2) is √20, and that of w = (1, −3, 2) is √14. A vector of zero length is denoted as 0. Note that if w is the vector from point A to point B, then |w| will be the distance from A to B (why?).
It is often useful to scale a vector so that the result has a length equal to one. This is called normalizing a vector, and the result is known as a unit vector. For example, we form the normalized version of a, denoted â, by scaling it with the value 1/|a|:
â = a / |a|    (4.10)
Clearly this is a unit vector: |â| = 1 (why?), having the same direction as a. For example, if a = (3, −4), then |a| = 5 and the normalized version is â = (3/5, −4/5). At times we refer to a unit vector as a direction.

The dot product of two n-dimensional vectors v = (v1, . . ., vn) and w = (w1, . . ., wn) is the scalar

d = v · w = v1w1 + v2w2 + . . . + vnwn    (4.11)
Example 4.3.1:
The dot product of (2, 3, 1) and (0, 4, −1) is 11.
(2, 2, 2, 2) · (4, 1, 2, 1.1) = 16.2.
(1, 0, 1, 0, 1) · (0, 1, 0, 1, 0) = 0.
(169, 0, 43) · (0, 375.3, 0) = 0.
The dot product exhibits four major properties:

Symmetry:      a · b = b · a
Linearity:     (a + c) · b = a · b + c · b
Homogeneity:   (sa) · b = s(a · b)
               |b|² = b · b
The first states that the order in which the two vectors are combined does not matter: the dot product is
commutative. The next two proclaim that the dot product is linear; that is, the dot product of a sum of
vectors can be expressed as the sum of the individual dot products, and scaling a vector scales the value of
the dot product. The last property is also useful, as it asserts that taking the dot product of a vector with itself yields the square of the length of the vector. It appears frequently in the form |b| = √(b · b).
The following manipulations show how these properties can be used to simplify an expression involving dot
products. The result itself will be used in the next section.
Example 4.3.2: Simplification of |a − b|².
Simplify the expression for the length (squared) of the difference of two vectors, a and b, to obtain the following relation:

|a − b|² = |a|² − 2 a · b + |b|²    (4.12)

The derivation proceeds as follows: give the name C to the expression |a − b|². By the fourth property, C is the dot product:

C = |a − b|² = (a − b) · (a − b).

Using linearity: C = a · (a − b) − b · (a − b).
Using symmetry and linearity to simplify this further: C = a · a − 2 a · b + b · b.
Using the fourth property above to obtain C = |a|² − 2 a · b + |b|² gives the desired result.
By replacing the minus with a plus in this relation, the following similar and useful relation emerges:

|a + b|² = |a|² + 2 a · b + |b|²    (4.13)
b · c = |b| |c| cos(θ)    (4.14)

where θ is the angle from b to c. Thus b · c varies as the cosine of the angle from b to c. The same result holds for vectors of three, four, or any number of dimensions.
To obtain a slightly more compact form, divide through both sides by |b| |c| and use the unit vector notation b̂ = b/|b| to obtain

cos(θ) = b̂ · ĉ    (4.15)

This is the desired result: the cosine of the angle between two vectors b and c is the dot product of their normalized versions.
Example 4.3.3. Find the angle between b = (3, 4) and c = (5, 2). Solution: b · c = 23, |b| = 5, and |c| = √29, so cos(θ) = 23/(5√29) ≈ 0.854, giving θ ≈ 31.3°.
The sign of b · c tells at a glance how the two vectors are disposed: b and c are

less than 90° apart if b · c > 0;
exactly 90° apart if b · c = 0;
more than 90° apart if b · c < 0.    (4.16)
This is indicated by Figure 4.10. The sign of the dot product is used in many algorithmic tests.
Figure 4.10. The sign of b · c indicates whether b and c are less than, exactly, or more than 90° apart.
Other names for perpendicular are orthogonal and normal, and we shall use all three interchangeably.
The most familiar examples of orthogonal vectors are those aimed along the axes of 2D and 3D coordinate
systems, as shown in Figure 4.11. In part a) the 2D vectors (1, 0) and (0, 1) are mutually perpendicular unit
vectors. The 3D versions are so commonly used they are called the standard unit vectors and are given
names i, j, and k.
Figure 4.11. The standard unit vectors. a). the 2D unit vectors (1, 0) and (0, 1); b). i, j, and k in a right-handed system; c). in a left-handed system.

i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1)    (4.18)
Part b) of the figure shows them for a right-handed system, and part c) shows them for a left-handed
system. Note that k always points in the positive z direction.
Using these definitions any 3D vector such as (a, b, c) can be written in the alternative form:
(a, b, c) = a i + b j + c k
(4.19)
Example 4.3.4. Notice that v = (2, 5, -1) is clearly the same as 2 (1, 0, 0) + 5 (0, 1, 0) -1 (0, 0, 1), which is
recognized as 2 i + 5 j - k.
This form presents a vector as a sum of separate elementary component vectors, so it simplifies various
pencil-and-paper calculations. It is particularly convenient when dealing with the cross product, discussed in
Section 4.4.
Practice Exercises.
4.3.1. Alternate proof of b · c = |b| |c| cos(θ). Note that b and c form two sides of a triangle, and the third side is b − c. Use the law of cosines to obtain the square of the length of b − c in terms of the lengths of b and c and the cosine of θ. Compare this with Equation 4.12 to obtain the desired result.
4.3.2. Find the Angle. Calculate the angle between the vectors (2, 3) and (-3, 1), and check the result
visually using graph paper. Then compute the angle between the 3D vectors (1, 3, -2) and (3, 3, 1).
4.3.3. Testing for Perpendicularity. Which pairs of the following vectors are perpendicular to one another:
(3, 4, 1), (2, 1, 1), (-3, -4, 1), (0, 0, 0), (1, -2, 0), (4, 4, 4), (0, -1, 4), and (2, 2, 1)?
4.3.4. Pythagorean Theorem. Refer to Equations 4.12 and 4.13. For the case in which a and b are
perpendicular, these expressions have the same value, which seems to make no sense geometrically. Show
that it works all right, and relate the result to the Pythagorean theorem.
Suppose the 2D vector a has components (ax, ay). What vectors are perpendicular to it? One way to obtain
such a vector is to interchange the x- and y- components and negate one of them.3 Let b = (-ay, ax). Then
the dot product a b equals 0 so a and b are indeed perpendicular. For instance, if a = (4,7) then b = (-7, 4)
is a vector normal to a. There are infinitely many vectors normal to any a, since any scalar multiple of b,
such as (-21, 12) and (7, -4) is also normal to a. (Sketch several of them for a given a.)
It is convenient to have a symbol for one particular vector that is normal to a given 2D vector a. We use the symbol ⊥ (pronounced "perp") for this.

Definition: Given a = (ax, ay), the counterclockwise perpendicular to a is

a⊥ = (−ay, ax)    (4.20)
Note that a and a⊥ have the same length: |a⊥| = |a|. Figure 4.12a shows an arbitrary vector a and the resulting a⊥. Note that moving from the a direction to the direction of a⊥ requires a left turn. (Making a right turn is equivalent to turning in the direction −a⊥.)
4.3.5. Some Pleasant Properties of a⊥. It is useful in some discussions to view the perp symbol ⊥ as an operator that performs a "rotate 90° left" operation on its argument, so that a⊥ is the vector produced by applying ⊥ to the vector a, much as √x is the value produced by applying the square root operator to x. Viewing ⊥ in this way, show that it enjoys the following properties:
a). Linearity: (a + b)⊥ = a⊥ + b⊥ and (Aa)⊥ = A(a⊥) for any scalar A;
b). a⊥⊥ = (a⊥)⊥ = −a    (two perps make a reversal).
4.3.6. The perp dot product. Interesting things happen when we dot the perp of a vector with another vector, as in a⊥ · b. We call this the perp dot product [hill95]. Use the basic definition of a⊥ above to show:

a⊥ · a = 0    (a⊥ is perpendicular to a)
|a⊥|² = |a|²    (a⊥ and a have the same length)
a⊥ · b = −b⊥ · a    (antisymmetric)    (4.21)
3 This is equivalent to the familiar fact that perpendicular lines have slopes that are negative reciprocals of one another.
In Chapter 5 we see the interchange and negate operation arise naturally in connection with a rotation of 90 degrees.
The last fact shows that the perp dot product is antisymmetric: moving the ⊥ from one vector to the other reverses the sign of the dot product. Other useful properties of the perp dot product will be discussed as they are needed.
4.3.7. Calculate one. Compute a · b and a⊥ · b for a = (3, 4) and b = (2, 1).
4.3.8. It's a determinant. Show that a⊥ · b can be written as the determinant (for definitions of matrices and determinants see Appendix 2):

a⊥ · b = | ax  ay |
         | bx  by |
Figure 4.13 shows the underlying geometry: we wish to resolve the vector c into a part Kv along v and a part Mv⊥ along v⊥:

c = Kv + Mv⊥    (4.22)
Given c and v we want to solve for K and M. Once found, we say that the orthogonal projection of c onto v is Kv, and that the distance from C to the line is |Mv⊥|.
Figure 4.13c shows a situation where these questions might arise. We wish to analyze how the gravitational
force vector G acts on the block to pull it down the incline. To do this we must resolve G into the force F
acting along the incline and the force B acting perpendicular to the incline. That is, find F and B such that
G = F + B.
Equation 4.22 is really two equations: the left and right hand sides must agree for the x-components, and they also must agree for the y-components. There are two unknowns, K and M. So we have two equations in two unknowns, and Cramer's rule can be applied. But who remembers Cramer's rule? We use a trick
here that is easy to remember and immediately reveals the solution. It is equivalent to Cramer's rule, but simpler to apply.
The trick in solving two equations in two unknowns is to eliminate one of the variables. We do this by
forming the dot product of both sides with the vector v:

c · v = K v · v + M v⊥ · v    (4.23)

Since v⊥ · v = 0 (see Equation 4.21), the second term vanishes and K = (c · v)/(v · v). Similarly, dotting both sides of Equation 4.22 with v⊥ eliminates the K term and gives M = (c · v⊥)/(v⊥ · v⊥), where we have again used the properties in Equation 4.21. Putting these together (and recalling |v⊥| = |v|) we have

c = ((v · c) / |v|²) v + ((v⊥ · c) / |v|²) v⊥    (4.24)
This equality holds for any vectors c and v. The part along v is known as the orthogonal projection of c onto the vector v. The second term gives the difference term explicitly and compactly. Its size is the distance from C to the line:

distance = |((v⊥ · c) / |v|²) v⊥| = |v⊥ · c| / |v|
(Check that the second form really equals the first.) Referring to Figure 4.13b we can say: the distance from a point C to the line through A in the direction v is:

distance = |v⊥ · (C − A)| / |v|    (4.25)
Example 4.3.5. Find the orthogonal projection of the vector c = (6, 4) onto a = (1, 2). (Sketch the relevant
vectors.) Solution: Evaluate the first term in Equation 4.24, obtaining the vector (14, 28) / 5.
Example 4.3.6: How far is the point C = (6, 4) from the line that passes through (1, 1) and (4, 9)? Solution: Set A = (1, 1), use v = (4, 9) − (1, 1) = (3, 8), and evaluate the distance in Equation 4.25. The result is: d = 31/√73.
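The projection and distance formulas are easily packaged as routines. A minimal sketch with its own small vector type:

struct Vec2 { double x, y; };
double dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }
Vec2 perp(Vec2 a) { Vec2 p = { -a.y, a.x }; return p; }      // "a perp"

Vec2 project(Vec2 c, Vec2 v)               // ((v . c)/|v|^2) v, per Equation 4.24
{
	double s = dot(v, c) / dot(v, v);
	Vec2 r = { s * v.x, s * v.y };
	return r;
}
double distToLine(Vec2 C, Vec2 A, Vec2 v)  // |v-perp . (C - A)| / |v|, Equation 4.25
{
	Vec2 d = { C.x - A.x, C.y - A.y };
	return fabs(dot(perp(v), d)) / sqrt(dot(v, v));
}

Running distToLine with C = (6, 4), A = (1, 1), and v = (3, 8) reproduces the 31/√73 of Example 4.3.6.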
Practice Exercises.
4.3.10. Resolve it. Express vector g = (4, 7) as a linear combination of b = (3, 5) and b⊥. How far is (4, 2) + g from the line through (4, 2) that moves in the direction b?
4.3.11. A Block pulled down an incline. A block rests on an incline tilted 30° from the horizontal. Gravity exerts a force of one newton on the block. What is the force that is trying to move the block along the incline?
4.3.12. How far is it? How far from the line through (2, 5) and (4, −1) does the point (6, 11) lie? Check your result on graph paper.
When a ray with direction a strikes a shiny surface whose normal vector is n, it reflects in direction r. The part of a that lies along n is the orthogonal projection of a onto n (recall Equation 4.24):

m = ((a · n) / |n|²) n = (a · n̂) n̂    (4.26)

and the reflection reverses that part while leaving the rest of a unchanged:

r = a − 2(a · n̂) n̂    (4.27)
In three dimensions, physics demands that the reflected direction r must lie in the plane defined by n and a. The expression for r above indeed supports this, as we show in Chapter 5.
Example 4.3.7. Let a = (4, −2) and n = (0, 3). Then Equation 4.27 yields r = (4, 2), as expected. Both the angle of incidence and the angle of reflection are equal to tan⁻¹(2).
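Equation 4.27 is nearly a one-liner in code. A minimal 2D sketch (the same formula works verbatim in 3D):

struct Vec2 { double x, y; };

Vec2 reflectDir(Vec2 a, Vec2 n)     // r = a - 2(a . n^)n^; n need not be unit length
{
	double s = 2.0 * (a.x * n.x + a.y * n.y) / (n.x * n.x + n.y * n.y);
	Vec2 r = { a.x - s * n.x, a.y - s * n.y };
	return r;                        // a = (4,-2), n = (0,3) yields r = (4, 2)
}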
Practice Exercises.
4.3.13. Find the Reflected Direction. For a = (2, 3) and n = (-2, 1), find the direction of the reflection.
4.3.14. Lengths of the Incident and Reflected Vectors. Using Equation 4.27 and properties of the dot
product, show that |r| = |a|.
Given the 3D vectors a = (ax, ay, az) and b = (bx, by, bz), their cross product is denoted as a × b. It is defined in terms of the standard unit vectors i, j, and k (see Equation 4.18) by

a × b = (aybz − azby) i + (azbx − axbz) j + (axby − aybx) k    (4.28)

(It can actually be derived from more fundamental principles: see the exercises.) As this form is rather difficult to remember, it is often written as an easily remembered determinant (see Appendix 2 for a review of determinants).
a × b = | i   j   k  |
        | ax  ay  az |
        | bx  by  bz |    (4.29)
Remembering how to form the cross product thus requires only remembering how to form a determinant.
Example 4.4.1. For a = (3, 0, 2) and b = (4, 1, 8), direct calculation shows that a × b = −2i − 16j + 3k. What is b × a?
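The component form of Equation 4.28 is a natural candidate for a small routine. A minimal sketch:

struct Vec3 { double x, y, z; };

Vec3 cross(Vec3 a, Vec3 b)
{
	Vec3 c = { a.y * b.z - a.z * b.y,     // the i component
	           a.z * b.x - a.x * b.z,     // the j component
	           a.x * b.y - a.y * b.x };   // the k component
	return c;                             // e.g. a = (3,0,2), b = (4,1,8) gives (-2,-16,3)
}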
From this definition one can easily show the following algebraic properties of the cross product:
1. i × j = k,  j × k = i,  k × i = j
2. a × b = −b × a    (antisymmetry)
3. a × (b + c) = a × b + a × c    (linearity)
4. (sa) × b = s(a × b)    (homogeneity)    (4.30)
These equations are true in both left-handed and right-handed coordinate systems. Note the logical (alphabetical) ordering of ingredients in the equation i × j = k, which also provides a handy mnemonic device for remembering the direction of cross products.
Practice Exercises.
4.4.1. Demonstrate the Four Properties. Prove each of the preceding four properties given for the cross
product.
4.4.2. Derivation of the Cross Product. The form in Equation 4.28, presented as a definition, can actually
be derived from more fundamental ideas. We need only assume that:
a. The cross product operation is linear.
b. The cross product of a vector with itself is 0.
c. i × j = k, j × k = i, and k × i = j.
By writing a = ax i + ay j + az k and b = bx i + by j + bz k, apply these rules to derive the proper form for a × b.
4.4.3. Is a × b perpendicular to a? Show that the cross product of vectors a and b is indeed perpendicular to a.
4.4.4. Vector Products. Find the vector b = (bx, by, bz) that satisfies the cross product relation a × b = c, where a = (2, 1, 3) and c = (2, −4, 0). Is there only one such vector?
4.4.5. Nonassociativity of the Cross Product. Show that the cross product is not associative; that is, that a × (b × c) is not necessarily the same as (a × b) × c.
4.4.6. Another Useful Fact. Show by direct calculation on the components that the length of the cross product has the form:

|a × b| = √(|a|² |b|² − (a · b)²)

2. The length of a × b is related to the angle between the vectors:

|a × b| = |a| |b| sin(θ)    (4.31)

where θ is the angle between a and b, measured from a to b or b to a, whichever produces an angle less than 180 degrees. As a special case, a × b = 0 if, and only if, a and b have the same or opposite directions or if either has zero length. What is the magnitude of the cross product if a and b are perpendicular?
3. The sense of a × b is given by the right-hand rule when working in a right-handed system. For example, twist the fingers of your right hand from a to b, and then a × b will point in the direction of your thumb. (When working in a left-handed system, use your left hand instead.) Note that i × j = k supports this.
Example 4.4.2. Let a = (1, 0, 1) and b = (1, 0, 0). These vectors are easy to visualize, as they both lie in the xz-plane. (Sketch them.) The area of the parallelogram defined by a and b is easily seen to be 1. Because a × b is orthogonal to both a and b, we expect it to be parallel to the y-axis and hence be proportional to j. In either a right-handed or a left-handed system, sweeping the fingers of the appropriate hand from a to b reveals a thumb pointed along the positive y-axis. Direct calculation based on Equation 4.28 confirms all of this: a × b = j.
Practice Exercise 4.4.7. Proving the Properties. Prove the three properties given above for the cross product.
with lines and planes, which are central to graphics, and whose straightness and flatness make them easy to represent and manipulate.
What does it mean to represent a line or plane, and why is it important? The goal is to come up with a formula or equation that distinguishes points that lie on the line from those that don't. This might be an equation that is satisfied by all points on the line, and only those points. Or it might be a function that returns different points on the line as some parameter is varied. The representation allows one to test such things as: is the point P on the line? Where does the line intersect another line or some other object? Very importantly, a line lying in a plane divides the plane into two parts, and we often need to ask whether a point P lies on one side or the other of the line.
In order to deal properly with lines and planes we must, somewhat unexpectedly, go back to basics and review how points and vectors differ, and how each is represented. The need for this arises because, to represent a line or plane, we must add points together and scale points, operations that for points are nonsensical. To see what is really going on we introduce the notion of a coordinate frame, which makes clear the significant difference between a point and a vector, and reveals in what sense it is legitimate to add points. The use of coordinate frames leads ultimately to the notion of homogeneous coordinates, which is a central tool in computer graphics, and greatly simplifies many algorithms. We will make explicit use of coordinate frames in only a few places in the book, most notably when changing coordinate systems and flying cameras around a scene (see Chapters 5, 6, and 7).4 But even when not explicitly mentioned, an underlying coordinate frame will be present in every situation.
4 This is an area where graphics programmers can easily go astray: their programs produce pictures that look OK for simple
situations, and become mysteriously and glaringly wrong when things get more complex.
5 The ideas for a 2D system are essentially identical.
6 In more general contexts the vectors need not be mutually perpendicular, but rather only linearly independent (such that, roughly,
none of them is a linear combination of the other two). The coordinate frames we work with will always have perpendicular axis
vectors.
Given a coordinate frame based on the origin 𝒪 and the axis vectors a, b, and c, we represent a vector v by finding the three numbers (v1, v2, v3) such that

v = v1a + v2b + v3c    (4.32)

and say that v has the representation (v1, v2, v3) in this system.
On the other hand, to represent a point, P, we view its location as an offset from the origin by a certain amount: we represent the vector P − 𝒪 by finding three numbers (p1, p2, p3) such that:

P − 𝒪 = p1a + p2b + p3c

and then equivalently write P itself as:

P = 𝒪 + p1a + p2b + p3c    (4.33)
The representation of P is not just a 3-tuple, but a 3-tuple along with an origin. P is at a location that is
offset from the origin by p1a + p2b + p3c. The basic idea is to make the origin of the coordinate system
explicit. This becomes important only when there is more than one coordinate frame, and when transforming
one frame into another.
Note that when we earlier defined the standard unit vectors i, j, and k as (1, 0, 0), (0, 1, 0), and (0, 0, 1), respectively, we were actually defining their representations in an underlying coordinate frame. Since by Equation 4.32 i = 1a + 0b + 0c, vector i is actually just a itself! It's a matter of naming: whether you are talking about the vector or about its representation in a coordinate frame. We usually don't bother to distinguish them.
Note that you can't explicitly say where 𝒪 is, or cite the directions of a, b, and c: to do so requires having some other coordinate frame in which to represent this one. In terms of its own coordinate frame, 𝒪 has the representation (0, 0, 0), a has the representation (1, 0, 0), etc.
The homogeneous representation of a point and a vector.
It is useful to represent both points and vectors using the same set of basic underlying objects, (a, b, c, 𝒪). From Equations 4.32 and 4.33 the vector v = v1a + v2b + v3c then needs the four coefficients (v1, v2, v3, 0), whereas the point P = p1a + p2b + p3c + 𝒪 needs the four coefficients (p1, p2, p3, 1). The fourth component designates whether the object does or does not include 𝒪. We can formally write any v and P using a matrix multiplication (multiplying a row vector by a column vector - see Appendix 2):
v = (a, b, c, 𝒪) | v1 |
                 | v2 |
                 | v3 |
                 | 0  |    (4.34)

P = (a, b, c, 𝒪) | p1 |
                 | p2 |
                 | p3 |
                 | 1  |    (4.35)
Here the row matrix captures the nature of the coordinate frame, and the column vector captures the representation of the specific object of interest. Thus vectors and points have different representations: there is a fourth component of 0 for a vector and 1 for a point. This is often called the homogeneous representation.7 The use of homogeneous coordinates is one of the hallmarks of computer graphics, as it helps to keep straight the distinction between points and vectors, and provides a compact notation when working with affine transformations. It pays off in a computer program to represent the points and vectors of interest in homogeneous coordinates as 4-tuples, by appending a 1 or 0.8 This is particularly true when we must convert between one coordinate frame and another in which points and vectors are represented.
It is simple to convert between the ordinary representation of a point or vector (a 3-tuple for 3D objects or
a 2-tuple for 2D objects) and the homogeneous form:
To go from ordinary to homogeneous coordinates:
if it's a point, append a 1;
if it's a vector, append a 0.
To go from homogeneous coordinates to ordinary coordinates:
if it's a vector, its final coordinate is 0; delete the 0.
if it's a point, its final coordinate is 1; delete the 1.
OpenGL uses 4D homogeneous coordinates for all its vertices. If you send it a 3-tuple in the form (x, y, z), it
converts it immediately to (x, y, z, 1). If you send it a 2D point (x, y), it first appends a 0 for the z-component
and then a 1, to form (x, y, 0, 1). All computations are done within OpenGL in 4D homogeneous
coordinates.
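In code, these conversion rules amount to carrying a fourth component alongside the usual three. A minimal sketch of the idea (the type and function names here are illustrative, not OpenGL's own):

struct Hom4 {
    double x, y, z, w;   // w = 1 for a point, w = 0 for a vector
};

// Ordinary to homogeneous: append a 1 for a point, a 0 for a vector.
Hom4 makePoint (double x, double y, double z) { return {x, y, z, 1.0}; }
Hom4 makeVector(double x, double y, double z) { return {x, y, z, 0.0}; }

// Componentwise difference. For two points the w components cancel (1 - 1 = 0),
// so the result is automatically a vector, consistent with the rules that follow.
Hom4 diff(const Hom4& a, const Hom4& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w };
}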
Linear Combinations of Vectors.
Note how nicely some things work out in homogeneous coordinates when we combine vectors coordinate-wise: all the definitions and manipulations are consistent:
The difference of two points (x, y, z, 1) and ( u, v, w, 1) is ( x - u, y - v, z - w, 0), which is, as expected, a
vector.
The sum of a point (x, y, z, 1) and a vector (d, e, f, 0) is (x + d, y + e, z + f, 1), another point;
Two vectors can be added: (d, e, f, 0) + (m, n, r, 0) = (d + m, e + n, f + r, 0) which produces another vector;
It is meaningful to scale a vector: 3(d, e, f, 0) = (3d, 3e, 3f, 0);
7 Actually we are only going part of the way in this discussion. As we see in Chapter 7 when studying projections, homogeneous
coordinates in that context permit an additional operation, which makes them truly homogeneous. Until we examine projections this
operation need not be introduced.
8 In the 2D case, points are 3-tuples (p1, p2, 1) and vectors are 3-tuples (v1, v2, 0).
It is meaningful to form any linear combination of vectors. Let the vectors be v = (v1 , v2, v3, 0) and w =
(w1, w2, w3, 0). Then using arbitrary scalars a and b, we form av + bw = (av1 + bw1, av2 + bw2, av3 + bw3, 0),
which is a legitimate vector.
Forming a linear combination of vectors is well defined, but does it make sense for points? The answer is no,
except in one special case, as we explore next.
Consider the combination of two points
E = fP + gR    (4.36)
when f + g is different from 1. The problem arises if we shift the origin of the coordinate system
[Goldman85]. Suppose the origin is shifted by vector u, so that P is altered to P + u and R is shifted to R +
u. If E is a legitimate point, it too must be shifted to the new point E' = E + u. But instead we have
E' = fP + gR + (f + g)u
which is not E + u unless f + g = 1.
The failure of a simple sum P1 + P2 of two points to be a true point is shown in Figure 4.19. Points P1 and P2
are shown represented in two coordinate systems, one offset from the other. Viewing each point as the head
of a vector bound to its origin, we see that the sum P1 + P2 yields two different points in the two systems.
Therefore P1 + P2 depends on the choice of coordinate system. Note, by way of contrast, that the affine
combination 0.5(P1 + P2) does not depend on this choice.
Figure 4.19. The sum P1 + P2 depends on the coordinate system, but the affine combination (P1 + P2)/2 does not.
There is another way of examining affine sums of points that is interesting on its own, and also leads to a
useful tool in graphics. It doesn't require the use of homogeneous coordinates.
Consider forming a point as a point A offset by a vector v that has been scaled by scalar t: A + tv. This is the
sum of a point and a vector so it is a legitimate point. If we take as vector v the difference between some
other point B and A: v = B - A then we have the point P:
P = A + t(B - A)    (4.37)
Expanding and regrouping gives
P = (1 - t)A + tB    (4.38)
and it is seen to be an affine combination of points (why?). This further legitimizes writing affine sums of
points. In fact, any affine sum of points can be written as a point plus a vector (see the exercises). If you are
ever uncomfortable writing an affine sum of points as in Equation 4.38 (a form we will use often), simply
understand that it means the point given by Equation 4.37.
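For example, a small routine for evaluating Equation 4.37 might look like this (Point2 here is an assumed two-field struct, not a particular library type):

struct Point2 { double x, y; };

// Evaluate P = A + t(B - A) = (1 - t)A + tB, the affine combination of A and B.
Point2 affineCombo(const Point2& A, const Point2& B, double t)
{
    Point2 P;
    P.x = A.x + t * (B.x - A.x);
    P.y = A.y + t * (B.y - A.y);
    return P;
}

For t between 0 and 1 this gives points on the segment from A to B; other values of t extrapolate beyond it.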
Example 4.5.1: The centroid of a triangle. Consider the triangle T with vertices D, E, and F shown in
Figure 4.20. We use the ideas above to show that the three medians of T meet at a point that lies 2/3 of the
way along each median. This is the centroid (center of gravity9) of T.
Figure 4.20. The centroid of a triangle.
Consider the median from vertex D to the midpoint G = (E + F)/2 of the opposite side. The point lying 2/3 of the way from D to G is the affine combination D + (2/3)(G - D), which simplifies to
C = (D + E + F)/3
(Try it!) Here's the cute part [pedoe70]. Since this result is symmetrical in D, E, and F, it must also be 2/3 of
the way along the median from E, and 2/3 of the way along the median from F. Hence the 3 medians meet
there, and C is the centroid.
This result generalizes nicely for a regular polygon of N sides: the centroid is simply the average of the N
vertex locations, another affine combination. For an arbitrary polygon the formula is more complex.
Practice Exercises.
4.5.1. Any affine combination of points is legitimate. Consider three scalars a, b, and c that sum to one,
and three points A, B, and C. The affine combination a A + b B + c C is a legal point because using c = 1 - a
- b it is seen to be the same as a A + b B + (1 - a - b) C = C + a (A - C) + b (B - C), the sum of a point and
two vectors (check this out!). To generalize: Given the affine combination of points w1A1 + w2A2 + ... +
wnAn, where w1 + w2 + ... + wn = 1, show that it can be written as a point plus a vector, and is therefore a
legitimate point.
4.5.2. Shifting the coordinate system [Goldman85]. Consider the general situation of forming a linear
combination of m points:
9The reference to gravity arises because if a thin plate is cut in the shape of T, the plate hangs level if suspended by a
thread attached at the centroid. Gravity pulls equally on all sides of the centroid, so the plate is balanced.
$$E = \sum_{i=1}^{m} a_i P_i$$
We ask whether E is a point, a vector, or nothing at all. By considering the effect of a shift in each Pi by u,
show that E is shifted to E' = E + Su, where S is the sum of the coefficients:
$$S = \sum_{i=1}^{m} a_i$$
Show that:
i). E is a point if S = 1.
ii). E is a vector if S = 0.
iii). E is meaningless for other values of S.
shapes. For small values of t it looks like A, but as t increases it warps (smoothly) towards a shape close to
B. For t = 0.25, for instance, point Pi(.25) of the tween is 25% of the way from A to B.
Figure 4.22 shows a simple example, in which polyline A has the shape of a house, and polyline B has the
shape of the letter T. The point R on the house corresponds to point S on the T. The various tweens of
point R on the house and point S on the T lie on the line between R and S. The tween for t = 1/2 lies at the
midpoint of RS. The in-between polylines show the shapes of the tweens for t = 0, 0.25, 0.5, 0.75, and 1.0.
Figure 4.22. Tweening a house into the letter T; point R on the house corresponds to point S on the T.
Figure 4.25. Face Caricature: Tweening and extrapolation. (Courtesy of Susan Brennan.)
Tweening is used in the film industry to reduce the cost of producing animations such as cartoons. In earlier
days an artist had to draw 24 pictures for each second of film, because movies display 24 frames per
second. With the assistance of a computer, however, an artist need draw only the first and final pictures,
called key-frames, in certain sequences and let the others be generated automatically. For instance, if the
characters are not moving too rapidly in a certain one-half-second portion of a cartoon, the artist can draw
and digitize the first and final frames of this portion, and the computer can create 10 tweens using linear
interpolation, thereby saving a great deal of the artist's time. See the case study at the end of this chapter for
a programming project that produces these effects.
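A sketch of how the tween points themselves might be computed for two polylines with equal point counts (this illustrates the idea only; the book's drawTween() of Figure 4.23 and the Case Study give fuller versions):

struct Point2 { double x, y; };

// Fill out[i] with the tween at t of corresponding points A[i] and B[i], i = 0..n-1.
// Each tween point is the affine combination (1 - t)A[i] + t B[i].
void computeTween(const Point2 A[], const Point2 B[], Point2 out[], int n, double t)
{
    for (int i = 0; i < n; i++) {
        out[i].x = (1.0 - t) * A[i].x + t * B[i].x;
        out[i].y = (1.0 - t) * A[i].y + t * B[i].y;
    }
}

Drawing the resulting polyline for a sequence of t values between 0 and 1 produces the in-between frames described above.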
Practice Exercises.
4.5.3. A Limiting Case of Tweening. What is the effect of tweening when all of the points Ai in polyline A
are the same? How is polyline B distorted in its appearance in each tween?
4.5.4. An Extrapolation. Polyline A is a square with vertices (1, 1), (-1, 1), (-1, -1), (1, -1), and polyline B
is a wedge with vertices (4, 3), (5, -2), (4, 0), (3, -2). Sketch (by hand) the shape P(t) for t = -1, -0.5, 0.5,
and 1.5.
4.5.5. Extrapolation Versus Tweening. Suppose that five polyline pictures are displayed side by side.
From careful measurement you determine that the middle three are in-betweens of the first and the last, and
you calculate the values of t used. But someone claims that the last is actually an extrapolation of the first
and the fourth. Is there any way to tell whether this is true? If it is an extrapolation, can the value of t used
be determined? If so, what is it?
(Figure: a). the tween P(t) between polylines A and B, with P(0) = A and P(1) = B.)
Practice Exercise 4.5.6. Try it out. Draw three points A, B, and C on a piece of graph paper. For each of
the values t = 0, .1, .2, ..., .9, 1 compute the position of P(t) in Equation 4.38, and draw the polyline that
passes through these points. Is it always a parabola?
Given a starting point C and a second point B, let b = B - C. Then the line through C and B has the parametric representation
L(t) = C + bt    (4.39)
As t varies so does the position of L(t) along the line. (One often thinks of t as time, and uses language
such as "at time 0...", "as time goes on...", or "later" to describe different parts of the line.) Figure 4.28
shows vector b and the line L passing through C and B. (A 2D version is shown but the 3D version uses the
same ideas.) Note where L(t) is located for various values of t. If t = 0, L(0) evaluates to C, so at t = 0 we are
at point C. At t = 1, L(1) = C + (B - C) = B. As t varies we add a longer or shorter version of b to the
point C, resulting in a new point along the line. If t is larger than 1 this point lies somewhere beyond B, on the opposite
side of B from C, and when t is less than 0 it lies on the side of C opposite from B.
Figure 4.28. Parametric representation L(t) of a line.
For a fixed value of t, say t = 0.6, Equation 4.39 gives a formula for exactly one point along the line through
C and B: the particular point L(0.6). Thus it is a description of a point. But since one can view it as a
function of t that generates the coordinates of every point on L as t varies, it is called the parametric
representation of line L.
The line, ray, and segment of Figure 4.26 are all represented by the same L(t) of Equation 4.39. They differ
parametrically only in the values of t that are relevant:
segment: 0 ≤ t ≤ 1
ray: 0 ≤ t < ∞
line: -∞ < t < ∞    (4.40)
The ray starts at C when t = 0 and passes through B at t = 1, then continues forever as t increases. C is
often called the starting point of the ray.
A very useful fact is that L(t) lies fraction t of the way between C and B when t lies between 0 and 1. For
instance, when t = 1/2 the point L(0.5) is the midpoint between C and B, and when t = 0.3 the point L(0.3) is
30% of the way from C to B. This is clear from Equation 4.39 since |L(t) - C| = |b||t| and |B - C| = |b|, so the
value of |t| is the ratio of the distances |L(t) - C| to |B - C|, as claimed.
One can also speak of the "speed" with which the point L(t) moves along line L. Since it covers distance
|b|t in time t it is moving at constant speed |b|.
Example 4.5.2. A line in 2D. Find a parametric form for the line that passes through C = (3, 5) and B = (2,
7). Solution: Build vector b = B - C = (-1, 2) to obtain the parametric form L(t) = (3 - t, 5 + 2t).
Example 4.5.3. A line in 3D. Find a parametric form for the line that passes through C = (3, 5, 6) and B = (2,
7, 3). Solution: Build vector b = B - C = (-1, 2, -3) to obtain the parametric form L(t) = (3 - t, 5 + 2t, 6 - 3t).
Other parametrizations for a straight line are possible, although they are rarely used. For instance, the point
W(t) given by
W(t) = C + bt^3
also sweeps over every point on L. It lies at C when t = 0, and reaches B when t = 1. Unlike L(t), however,
W(t) accelerates along its path from C to B.
Point normal form for a line (the implicit form).
This is the same as the equation for a line, but we rewrite it in a way that better reveals the underlying
geometry. The familiar equation of a line in 2D has the form
f x + g y = 1    (4.41)
where f and g are some constants. The notion is that every point (x, y) that satisfies this equation lies on the
line, so it provides a condition for a point to be on the line. Note: This is true only for a line in 2D; a line in
3D requires two equations. So, unlike the parametric form that works perfectly well in 2D and 3D, the point
normal form only applies to lines in 2D.
This equation can be written using a dot product: (f, g) · (x, y) = 1, so for every point on a line a certain dot
product must have the same value. We examine the geometric interpretation of the vector (f, g), and in so
doing develop the point normal form of a line. It is very useful in such tasks as clipping, hidden line
elimination, and ray tracing. Formally the point normal form makes no mention of dimensionality: A line in
2D has a point normal form, and a plane in 3D has one.
Suppose that we know line L passes through points C and B, as in Figure 4.29. What is its point normal
form? If we can find a vector n that is perpendicular to the line, then for any point R = (x, y) on the line the
vector R - C must be perpendicular to n, so we have the condition on R:
n · (R - C) = 0    (4.42)
Figure 4.29. Finding the point normal form for a line.
This is the point normal equation for the line, expressing that a certain dot product must turn out to be zero
for every point R on the line. It employs as data any point lying on the line, and any normal vector to the
line.
We still must find a suitable n. Let b = B - C denote the vector from C to B. Then b⊥ will serve well as the
desired n. For purposes of building the point normal form, any scalar multiple of b⊥ works just as well for
n.
Example 4.5.4. Find the point normal form. Suppose line L passes through points C = (3, 4) and B = (5, -2). Then b = B - C = (2, -6) and b⊥ = (6, 2) (sketch this). Choosing C as the point on the line, the point
normal form is: (6, 2) · ((x, y) - (3, 4)) = 0, or 6x + 2y = 26. Both sides of the equation can be divided by 26
(or any other nonzero number) if desired.
It's also easy to find the normal to a line given the equation of the line, say, f x + g y = 1. Writing this once
again as (f, g) · (x, y) = 1 it is clear that the normal n is simply (f, g) (or any multiple thereof). For instance,
the line given by 5x - 2y = 7 has normal vector (5, -2), or more generally K(5, -2) for any nonzero K.
It's also straightforward to find the parametric form for a line if you are given its point normal form.
Suppose it is known that line L has point normal form n · (P - C) = 0, where n and C are given explicitly.
The parametric form is then L(t) = C + n⊥t (why?). You can also obtain the parametric form if the equation
of the line is given: a). find the normal n as in the previous paragraph, and b). find a point (Cx, Cy) on the
line: choose any value for Cx and use the equation to find the corresponding Cy.
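These conversions are one-liners in code. A minimal sketch (Vector2 is an assumed two-field struct; recall that the perp of (x, y) is (-y, x)):

struct Vector2 { double x, y; };

// The normal to the line f x + g y = 1 is simply (f, g).
Vector2 normalFromEquation(double f, double g)
{
    Vector2 n; n.x = f; n.y = g; return n;
}

// A direction along the line is the perp of its normal, so given point-normal
// data {n, C} the parametric form is L(t) = C + perp(n) t.
Vector2 perp(const Vector2& v)
{
    Vector2 p; p.x = -v.y; p.y = v.x; return p;
}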
Moving from each representation to the others.
We have described three different ways to characterize a line. Each representation uses certain data that
distinguishes one line from another. This is the data that would be stored in a suitable data structure within
a program to capture the specifics of each line being stored. For instance, the data associated with the
representation that specifies a line parametrically as in C + bt would be the point C and the direction b. We
summarize this by saying the relevant data is {C, b}.
Planes in 3D space.
Because there is such a heavy use of polygons in 3D graphics, planes seem to appear everywhere. A
polygon (a face of an object) lies in its parent plane, and we often need to clip objects against planes, or
find the plane in which a certain face lies.
Planes, like lines, have three fundamental forms: the three-point form, the parametric representation and
the point normal form. We examined the three-point form in Section 4.4.2.
The parametric representation of a plane.
The parametric form for a plane is built on three ingredients: one of its points, C, and two (nonparallel)
vectors, a and b, that lie in the plane, as shown in Figure 4.31. If we are given the three (non-collinear)
points A, B, and C in the plane, then take a = A - C and b = B - C.
Figure 4.31. Defining a plane parametrically by point C and vectors a and b.
P(s, t) = C + as + bt    (4.43)
Given any values of s and t we can identify the corresponding point on the plane. For example, the position
at s = t = 0 is C itself, and that at s = 1 and t = - 2 is P(1, - 2) = C + a - 2 b.
Note that two parameters are involved in the parametric expression for a surface, whereas only one
parameter is needed for a curve. In fact if one of the parameters is fixed, say s = 3, then P(3, t) is a function
of one variable and represents a straight line: P(3, t) = (C + 3 a) + b t.
It is sometimes handy to arrange the parametric form into its component form by collecting terms
P(s, t) = (Cx + ax s + bx t, Cy + ay s + by t , Cz + az s + bz t).
(4.44)
We can rewrite the parametric form in Equation 4.43 explicitly in terms of the given points A, B, and C:
just use the definitions of a and b:
P(s, t) = C + s(A - C) + t(B - C)
which can be rearranged into the affine combination of points:
P(s, t) = s A + t B + (1 - s - t)C
(4.45)
Example 4.5.6. Find a parametric form given three points in a plane. Consider the plane passing through A
= (3,3,3), B = (5,5,7), and C = (1, 2, 4). From Equation 4.43 it has parametric form
P(s, t) = (1, 2, 4) + (2, 1, - 1) s + (4, 3, 3) t. This can be rearranged to the component form: P(s, t) = (1 + 2 s + 4
t) i + (2 + s + 3 t) j + (4 - s + 3 t) k, or to the affine combination form P(s, t) = s(3, 3, 3) + t(5, 5, 7) + (1 - s - t)(1,
2, 4).
The point normal form for a plane.
Planes can also be represented in point normal form, and the classic equation for a plane emerges at once.
Figure 4.32 shows a portion of plane P in three dimensions. A plane is completely specified by giving a
single point, B = (bx, by, bz), that lies within it, and the normal direction, n = (nx, ny, nz), to the plane. Just
as the normal vector to a line in two dimensions orients the line, the normal to a plane orients the plane in
space.
Figure 4.32. A portion of a plane, showing a point B in the plane and the normal n.
For any other point P in the plane, the vector P - B must be perpendicular to n:
n · (P - B) = 0    (4.46)
This is the point normal equation of the plane. It is identical in form to that for the line: a dot product set
equal to 0. All points in a plane form vectors with B that have the same dot product with the normal vector.
By spelling out the dot product and using n = (nx, ny, nz), we see that the point normal form is the
traditional equation for a plane:
nx x + ny y + nz z = D    (4.47)
where D = n · (B - 𝒪). For example, if given the equation for a plane such as 5x - 2y + 8z = 2, you know
immediately that the normal to this plane is (5, -2, 8) or any multiple of this. (How do you find a point in
this plane?)
Example 4.5.7. Find a point normal form. Let plane P pass through (1, 2, 3) with normal vector (2, -1, 2).
Its point normal form is (2, -1, 2) · ((x, y, z) - (1, 2, 3)) = 0. The equation for the plane may be written out
as 2x - y + 2z = 6.
Example 4.5.8. Find a parametric form given the equation of the plane. Find a parametric form for the
plane 2 x - y + 3 z = 8. Solution: By inspection the normal is (2, - 1, 3). There are many parametrizations;
we need only find one. For C, choose any point that satisfies the equation; C = (4, 0, 0) will do. Find two
(noncollinear) vectors, each having a dot product of 0 with (2, - 1, 3); some hunting finds that a = (1, 5, 1)
and b = (0, 3, 1) will work. Thus the plane has parametric form P(s, t) = (4, 0, 0) + (1, 5, 1) s + (0, 3, 1) t.
Example 4.5.9. Finding two noncollinear vectors. Given the normal n to a plane, what is an easy way to
find two noncollinear vectors a and b that are both perpendicular to n? (In the previous exercise we just
invented two that work.) Here we use the fact that the cross product of any vector with n is normal to n. So
we take a simple choice such as (0, 0, 1), and construct a as its cross product with n:
a = (0, 0, 1) × n = (-ny, nx, 0)
(Is this indeed normal to n?). We can use the same idea to form b that is normal to both n and a:
b = n × a = (-nx nz, -ny nz, nx² + ny²)
(Check that b ⊥ a and b ⊥ n.) So b is certainly not collinear with a.
We apply this method to the plane (3, 2, 5) · (R - (2, 7, 0)) = 0. Set a = (0, 0, 1) × n = (-2, 3, 0) and b = (-15,
-10, 13). The plane therefore has parametric form:
P(s, t) = (2 - 2s - 15t, 7 + 3s - 10t, 13t).
Check: Is P(s, t) - C = (-2s - 15t, 3s - 10t, 13t) indeed normal to n for every s and t?
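This construction is easy to code. A sketch follows; note that it fails if n happens to be parallel to (0, 0, 1), in which case a different starting vector such as (1, 0, 0) should be used instead:

struct Vector3 { double x, y, z; };

Vector3 cross(const Vector3& u, const Vector3& v)
{
    Vector3 w;
    w.x = u.y * v.z - u.z * v.y;
    w.y = u.z * v.x - u.x * v.z;
    w.z = u.x * v.y - u.y * v.x;
    return w;
}

// Build two noncollinear vectors perpendicular to n, as in Example 4.5.9:
// a = (0, 0, 1) x n and b = n x a.
void makePlaneAxes(const Vector3& n, Vector3& a, Vector3& b)
{
    Vector3 zAxis = {0.0, 0.0, 1.0};
    a = cross(zAxis, n);   // equals (-ny, nx, 0)
    b = cross(n, a);       // perpendicular to both n and a
}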
Practice Exercise 4.5.7. Find the Plane. Find a parametric form for the plane coincident with the y, z-plane.
Planar Patches.
Just as we can restrict the parameter t in the representation of a line to obtain a ray or a segment, we can
restrict the parameters s and t in the representation of a plane.
In the parametric form of Equation 4.43 the values for s and t can range from -∞ to ∞, and thus the plane
can extend forever. In some situations we want to deal with only a piece of a plane, such as a
parallelogram that lies in it. Such a piece is called a planar patch, a term that invites us to imagine the
plane as a quilt of many patches joined together. Later we examine curved surfaces made up of patches
which are not necessarily planar. Much of the practice of modeling solids involves piecing together patches
of various shapes to form the skin of an object.
A planar patch is formed by restricting the range of allowable parameter values for s and t. For instance,
one often restricts s and t to lie only between 0 and 1. The patch is positioned and oriented in space by
appropriate choices of a, b, and C. Figure 4.34a shows the available range of s and t as a square in
parameter space, and Figure 4.34b shows the patch that results from this restriction in object space.
Figure 4.34. A planar patch: a). parameter space; b). the resulting patch in world coordinates.
The corners of the patch correspond to the corners of parameter space:
P(0, 0) = C
P(1, 0) = C + a
P(0, 1) = C + b
P(1, 1) = C + a + b    (4.48)
The vectors a and b determine both the size and the orientation of the patch. If a and b are perpendicular,
the grid will become rectangular, and if in addition a and b have the same length, the grid will become
square. Changing C just shifts the patch without changing its shape or orientation.
Example 4.5.10. Make a patch. Let C = (1, 3, 2), a = (1, 1, 0), and b = (1, 4, 2). Find the corners of the
planar patch. Solution: From the preceding table we obtain the four corners: P(0, 0) = (1, 3, 2), P(0, 1) =
(2, 7, 4), P(1, 0) = (2, 4, 2), and P(1, 1) = (3, 8, 4).
Example 4.5.11. Characterize a Patch. Find a, b, and C that create a square patch of length 4 on a side
centered at the origin and parallel to the x, z-plane. Solution: The corners of the patch are at (2, 0, 2), (2, 0,
- 2), ( - 2, 0, 2), and ( - 2, 0, - 2). Choose any corner, say (2, 0, - 2), for C. Then a and b each have length 4
and are parallel to either the x- or the z-axis. Choose a = ( - 4, 0, 0) and b = (0, 0, 4).
Practice Exercise 4.5.8. Find a Patch. Find point C and some vectors a and b that create a patch having
the four corners ( - 4, 2, 1), (1, 7, 4), ( - 2, - 2, 2), and (3, 3, 5).
Suppose one segment has endpoints A and B and the other segment has endpoints C and D. As shown in
Figure 4.35 the two segments can be situated in many different ways: They can miss each other (a and b),
overlap in one point (c and d), or even overlap over some region (e). They may or may not be parallel. We
need an organized approach that handles all of these possibilities.
Figure 4.35. The possible configurations of two line segments.
Call the segment with endpoints A and B by the name AB, and give it the parametric representation
AB(t) = A + bt    (4.49)
where for convenience we define b = B - A. As t varies from 0 to 1 all points on the finite line segment are
visited. If t is allowed to vary from -∞ to ∞ the entire parent line is swept out.
Similarly we call the segment from C to D by the name CD, and give it parametric representation (using a
new parameter, say, u)
CD(u) = C + d u,
where d = D - C. We use different parameters for the two lines, t for one and u for the other, in order to
describe different points on the two lines independently. (If the same parameter were used, the points on the
two lines would be locked together.)
For the parent lines to intersect, there must be specific values of t and u for which the two equations above
are equal:
A + bt = C + du
Defining c = C - A for convenience we can write this condition in terms of three known vectors and two
(unknown) parameter values:
bt = c + du
(4.50)
This provides two equations in two unknowns, similar to Equation 4.22. We solve it the same way: dot both
sides with d⊥ to eliminate the term in d, giving (d⊥ · b)t = d⊥ · c. There are two main cases: the term
d⊥ · b is zero or it is not.
If d⊥ · b is not zero we can solve for t:
t = (d⊥ · c) / (d⊥ · b)    (4.51)
Similarly dot both sides of Equation 4.50 with b⊥ to obtain (after using one additional property of perp
dot products: which one?):
u = (b⊥ · c) / (d⊥ · b)    (4.52)
Now we know that the two parent lines intersect, and we know where. But this doesn't mean that the line
segments themselves intersect. If t lies outside the interval [0, 1], segment AB doesn't reach the other
segment, with similar statements if u lies outside of [0, 1]. If both t and u lie between 0 and 1 the line
segments do intersect at some point, I. The location of I is easily found by substituting the value of t into
Equation 4.49:
I = A + b (d⊥ · c) / (d⊥ · b)    (4.53)
Example 4.6.1: Given the endpoints A = (0, 6), B = (6, 1), C = (1, 3), and D = (5, 5), find the intersection if
it exists. Solution: d⊥ · b = -32, so t = 7/16 and u = 13/32, which both lie between 0 and 1, and so the
segments do intersect. The intersection lies at (x, y) = (21/8, 61/16). This result may be confirmed visually
by drawing the segments on graph paper and measuring the observed intersection.
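A sketch of how such an intersection routine might look (one possible outline for the segIntersect() of the next exercise; the parallel and overlapping cases of Exercise 4.6.1 are omitted here):

struct Point2  { double x, y; };
struct Vector2 { double x, y; };

// Perp dot product: a-perp dot b = ax*by - ay*bx.
double perpDot(const Vector2& a, const Vector2& b) { return a.x * b.y - a.y * b.x; }

// Returns 1 and fills hit if segments AB and CD intersect; 0 otherwise.
int segIntersect(Point2 A, Point2 B, Point2 C, Point2 D, Point2& hit)
{
    Vector2 b = { B.x - A.x, B.y - A.y };   // b = B - A
    Vector2 d = { D.x - C.x, D.y - C.y };   // d = D - C
    Vector2 c = { C.x - A.x, C.y - A.y };   // c = C - A
    double denom = perpDot(d, b);           // d-perp dot b
    if (denom == 0.0) return 0;             // parallel parents: see Exercise 4.6.1
    double t = perpDot(d, c) / denom;       // Equation 4.51
    double u = perpDot(b, c) / denom;       // Equation 4.52
    if (t < 0.0 || t > 1.0 || u < 0.0 || u > 1.0) return 0;
    hit.x = A.x + b.x * t;                  // Equation 4.53
    hit.y = A.y + b.y * t;
    return 1;
}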
4.6.1. When the parent lines overlap. We explore case 2 above, where the term d⊥ · b = 0, so the parent
lines are parallel. We must determine whether the parent lines are identical, and if so whether the segments
themselves overlap.
To test whether the parent lines are the same, see whether C lies on the parent line through A and B.
a). Show that the equation for this parent line is bx(y - Ay) - by(x - Ax) = 0.
We then substitute Cx for x and Cy for y and see whether the left-hand side is sufficiently close to zero (i.e.,
its size is less than some tolerance such as 10^-8). If not, the parent lines do not coincide, and no intersection
exists. If the parent lines are the same, the final test is to see whether the segments themselves overlap.
b). To do this, show how to find the two values tc and td at which this line through A and B reaches C and
D, respectively. Because the parent lines are identical, we can use just the x-component. Segment AB
begins at 0 and ends at 1, and by examining the ordering of the four values 0, 1, tc, and td, we can readily
determine the relative positions of the two lines.
c). Show that there is an overlap unless both tc and td are less than 0 or both are larger than 1. If there is an
overlap, the endpoints of the overlap can easily be found from the values of tc and td.
d). Given the endpoints A = (0, 6), B = (6, 2), C = (3, 4), and D = (9, 0), determine the nature of any
intersection.
4.6.2. The Algorithm for determining the intersection. Write the routine segIntersect() that would
be used in the context: if(segIntersect(A, B, C, D, InterPt)) <do something>
It takes four points representing the two segments, and returns 0 if the segments do not intersect, and 1 if
they do. If they do intersect, the location of the intersection is placed in InterPt. It returns -1 if the parent
lines are identical.
4.6.3. Testing the Simplicity of a Polygon. Recall that a polygon P is simple if there are no edge
intersections except at the endpoints of adjacent edges. Fashion a routine int isSimple(Polygon P)
that takes a brute force approach and tests whether any pair of edges of the list of vertices of the polygon
intersect, returning 0 if so, and 1 if not so. (Polygon is some suitable class for describing a polygon.) This
is a simple algorithm but not the most efficient one. See [moret91] and [preparata85] for more elaborate
attacks that involve some sorting of edges in x and y.
4.6.4. Line Segment Intersections. For each of the following segment pairs, determine whether the
segments intersect, and if so where.
1. A = (1, 4), B = (7, 1/2), C = (7/2, 5/2), D = (7, 5);
2. A = (1, 4), B = (7, 1/2), C = (5, 0), D = (0, 7);
3. A = (0, 7), B = (7, 0), C = (8, -1), D = (10, -3);
Figure 4.36. Finding the excircle: the center S lies at the intersection of perpendicular bisectors #1 and #2.
Figure 4.36 shows how to find it. The center S of the desired circle must be equidistant from all three
vertices, so it must lie on the perpendicular bisector of each side of triangle ABC. (The perpendicular
bisector is the locus of all points that are equidistant from two given points.) Thus we can determine S if
we can compute where two of the perpendicular bisectors intersect.
We first show how to find a parametric representation of the perpendicular bisector of a line segment.
Figure 4.37 shows a segment S with endpoints A and B. Its perpendicular bisector L is the infinite line that
passes through the midpoint M of segment S, and is oriented perpendicular to it. But we know that midpoint
M is given by (A + B)/2, and the direction of the normal is given by (B - A)⊥, so the perpendicular
bisector has parametric form:
Figure 4.37. The perpendicular bisector of a segment.
L(t) = (A + B)/2 + (B - A)⊥ t    (4.54)
Now we are in a position to compute the excircle of three points. Returning to Figure 4.36 we seek the
intersection S of the perpendicular bisectors of AB and AC. For convenience we define the vectors:
a = B - A,  b = C - B,  c = A - C    (4.55)
To find the perpendicular bisector of AB we need the midpoint of AB and a direction perpendicular to AB.
The midpoint of AB is A + a/2 (why?). The direction perpendicular to AB is a⊥. So the parametric form
for the perpendicular bisector is A + a/2 + a⊥t. Similarly the perpendicular bisector of AC is A - c/2 +
c⊥u, using parameter u. Point S lies where these meet, at the solution of:
a⊥t = b/2 + c⊥u
(where we have used a + b + c = 0). To eliminate the term in u take the dot product of both sides with c,
and obtain t = (1/2)(b · c)/(a⊥ · c). To find S use this value for t in the representation of the perpendicular
bisector, A + a/2 + a⊥t, which yields the simple explicit form10:
$$S = A + \frac{\mathbf{a}}{2} + \frac{1}{2}\left(\frac{\mathbf{b}\cdot\mathbf{c}}{\mathbf{a}^{\perp}\cdot\mathbf{c}}\right)\mathbf{a}^{\perp} \qquad (4.56)$$
The radius of the excircle is the distance from S to any of the three vertices, so it is |S - A|. Just form the
magnitude of the last two terms in Equation 4.56. After some manipulation (check this out) we obtain:
$$\text{radius} = \frac{|\mathbf{a}|}{2}\sqrt{\left(\frac{\mathbf{b}\cdot\mathbf{c}}{\mathbf{a}^{\perp}\cdot\mathbf{c}}\right)^2 + 1} \qquad (4.57)$$
Once S and the radius are known, we can use drawCircle() from Chapter 3 to draw the desired circle.
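A direct transcription of Equations 4.56 and 4.57 might look like this (a sketch; Point2 and Vector2 are assumed simple structs, and the degenerate case of collinear A, B, C, where a⊥ · c = 0, is not guarded against):

#include <cmath>

struct Point2  { double x, y; };
struct Vector2 { double x, y; };

// Compute the center S and radius of the excircle of A, B, and C.
void excircle(Point2 A, Point2 B, Point2 C, Point2& S, double& radius)
{
    Vector2 a = { B.x - A.x, B.y - A.y };          // a = B - A
    Vector2 b = { C.x - B.x, C.y - B.y };          // b = C - B
    Vector2 c = { A.x - C.x, A.y - C.y };          // c = A - C
    Vector2 aPerp = { -a.y, a.x };
    double k = 0.5 * (b.x * c.x + b.y * c.y) /     // (1/2)(b.c)/(a-perp.c)
                     (aPerp.x * c.x + aPerp.y * c.y);
    S.x = A.x + 0.5 * a.x + k * aPerp.x;           // Equation 4.56
    S.y = A.y + 0.5 * a.y + k * aPerp.y;
    radius = 0.5 * std::sqrt(a.x * a.x + a.y * a.y)
                 * std::sqrt(4.0 * k * k + 1.0);   // Equation 4.57
    // drawCircle() from Chapter 3 would then render the circle at S.
}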
Example 4.6.2. Find the perpendicular bisector L of the segment S having endpoints A = (3, 5) and B = (9, 3).
Solution: By direct calculation, midpoint M = (6, 4), and (B - A)⊥ = (2, 6), so L has representation L(t) =
(6 + 2t, 4 + 6t). It is useful to plot both S and L to see this result.
10 Other closed-form expressions for S have appeared previously, e.g. in [goldman90] and [lopex92].
Every triangle also has an inscribed circle, which is sometimes necessary to compute in a computer-aided
design context. A case study examines how to do this, and also discusses the beguiling nine-point circle.
Practice Exercise 4.6.5. A Perpendicular Bisector. Find a parametric expression for the perpendicular
bisector of the segment with endpoints A = (0, 6) and B = (4, 0). Plot the segment and the line.
For the ray A + ct and the line or plane given in point normal form as n · (P - B) = 0, the hit time is
$$t_{hit} = \frac{\mathbf{n}\cdot(B - A)}{\mathbf{n}\cdot\mathbf{c}} \qquad (4.58)$$
As always with a ratio of terms we must examine the eventuality that the denominator of thit is zero. This
occurs when n · c = 0, that is, when the ray is aimed parallel to the plane, in which case there is no hit at all.11
When the hit time has been computed, it is simple to find the location of the hit point: substitute thit into
the representation of the ray:
hit point: Phit = A + cthit
(4.59)
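In code the whole computation is a few lines (a sketch, with an assumed Vector3 struct serving for both points and directions):

struct Vector3 { double x, y, z; };

double dot(const Vector3& u, const Vector3& v) { return u.x*v.x + u.y*v.y + u.z*v.z; }

// Find where the ray A + c t hits the plane n . (P - B) = 0.
// Returns 1 and fills tHit using Equation 4.58; returns 0 if the ray is parallel.
int rayPlaneHit(Vector3 A, Vector3 c, Vector3 n, Vector3 B, double& tHit)
{
    double denom = dot(n, c);
    if (denom == 0.0) return 0;                  // no hit: ray parallel to plane
    Vector3 BA = { B.x - A.x, B.y - A.y, B.z - A.z };
    tHit = dot(n, BA) / denom;                   // Equation 4.58
    return 1;                                    // hit point is A + c*tHit (Eq. 4.59)
}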
In the intersection problems treated below we will also need to know generally which direction the ray
strikes the line or plane: "along with" the normal n or "counter to" n. (This will be important because we
will need to know whether the ray is exiting from an object or entering it.) Figure 4.39 shows the two
possibilities for a ray hitting a line. In part a) the angle between the ray's direction, c, and n is less than 90°,
so we say the ray is aimed "along with" n. In part b) the angle is greater than 90°, so the ray is aimed
"counter to" n.
Figure 4.39. A ray aimed a). "along with", and b). "counter to" the normal n.
the ray is aimed along with n, if n · c > 0;
the ray is aimed parallel to the line (perpendicular to n), if n · c = 0;
the ray is aimed counter to n, if n · c < 0.    (4.60)
Practice Exercises.
4.7.1. Intersections of rays with lines and planes. Find when and where the ray A + ct hits the object n
· (P - B) = 0 (a line in 2D or a plane in 3D).
a). A = (2, 3), c = (4, -4), n = (6, 8), B = (7, 7).
b). A = (2, -4, 3), c = (4, 0, -4), n = (6, 9, 9), B = (-7, 2, 7).
c). A = (2, 0), c = (0, -4), n = (0, 8), B = (7, 0).
d). A = (2, 4, 3), c = (4, 4, -4), n = (6, 4, 8), B = (7, 4, 7).
4.7.2. Rays hitting Planes. Find the point where the ray (1,5,2) + (5, -2, 6)t hits the plane 2x -4y + z = 8.
4.7.3. What is the intersection of two planes? Geometrically we know that two (nonparallel) planes intersect in a
straight line. But which line? Suppose the two planes are given by n · (P - A) = 0 and m · (P - B) = 0.
Find the parametric form of the line in which they intersect. You may find it easiest to:
a). First obtain a parametric form for one of the planes: say, C + as + bt for the second plane.
b). Then substitute this form into the point normal form for the first plane, thereby obtaining a linear
equation that relates parameters s and t.
c). Solve for s in terms of t, say s = E + Ft. (Find expressions for E and F.)
d). Write the desired line as C + a(E + Ft) + bt.
We know polygons are the fundamental objects used in both 2D and 3D graphics. In 2D graphics their
straight edges make it easy to describe them and draw them. In 3D graphics, an object is often modeled as
a polygonal mesh: a collection of polygons that fit together to make up its skin. If the skin forms a
closed surface that encloses some space the mesh is called a polyhedron. We study meshes and polyhedra
in depth in Chapter 6.
Figure 4.40 shows a 2D polygon and a 3D polyhedron that we might need to analyze or render in a graphics
application. Three important questions that arise are:
(Figure: a convex polygon and its bounding lines L0, L1, L2; each line, such as y = 1, has an outward normal and an outside half-space.)
(Figure: a ray A + ct crossing a convex polygon P with vertices P0, P1, P2, each bounding line Li having an outward normal ni; the ray lies inside P for t in the candidate interval from tin to tout.)
Now how are tin and tout computed? We must consider each of the bounding lines of P in turn, and find
where the ray A + c t intersects it. We suppose each bounding line is stored in point normal form as the pair
{B, n}, where B is some point on the line and n is the outward pointing normal for the line: it points to the
outside of the polygon. Because it is outward pointing the test of Equation 4.60 translates to:
the ray is exiting P, if n · c > 0;
the ray is parallel to the bounding line, if n · c = 0;
the ray is entering P, if n · c < 0.    (4.63)
(Figure: the candidate interval along the t-axis; the ray is outside P before tin and after tout.)
For the ray intersection problem, where the ray extends infinitely far in both directions, we set tin = -∞
and tout = ∞. In practice tin is set to a large negative value, and tout to a large positive value.
(Figure: a ray from A through C clipped against a convex polygon with bounding lines L0 through L5; the hit time with each bounding line is marked along the ray.)
Figure 4.48 shows pseudocode for the Cyrus-Beck algorithm. The types LineSegment, LineList, and
Vector2 are suitable data types to hold the quantities in question (see the exercises). Variables numer
and denom hold the numerator and denominator for thit of Equation 4.58:
numer = n · (B - A)
denom = n · c    (4.64)
If the ray is parallel to the line it could lie entirely in the inside half-space of the line, or entirely out of it. It
turns out that numer = n · (B - A) is exactly the quantity needed to tell which of these cases occurs. See the
exercises.
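One plausible shape for the interval-chopping step is sketched below; the book's Figure 4.48 gives the actual routine, so treat this only as an illustration of the logic just described:

// Chop the candidate interval [tIn, tOut] against one bounding line, given
// numer = n.(B - A) and denom = n.c for that line (Equation 4.64).
// Returns 0 as soon as the candidate interval becomes empty.
int chopCI(double& tIn, double& tOut, double numer, double denom)
{
    if (denom < 0.0) {                 // ray is entering: possibly raise tIn
        double t = numer / denom;
        if (t > tIn) tIn = t;
    }
    else if (denom > 0.0) {            // ray is exiting: possibly lower tOut
        double t = numer / denom;
        if (t < tOut) tOut = t;
    }
    else if (numer <= 0.0) return 0;   // parallel and wholly outside (Exercise 4.8.3)
    return (tIn <= tOut);              // 0 when the candidate interval is empty
}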
The 3D case: Clipping a line against a Convex Polyhedron.
The Cyrus-Beck clipping algorithm works in three dimensions in exactly the same way. In 3D the edges of
the window become planes defining a convex region in three dimensions, and the line segment is a line
suspended in space. ChopCI() needs no changes at all (since it uses only the values of dot products, through
numer and denom). The data types in CyrusBeckClip() must of course be extended to 3D
types, and when the endpoints of the line are adjusted the z-component must be adjusted as well.
Practice Exercises.
4.8.2. Data types for variables in the Cyrus-Beck clipper. Provide useful definitions for data types, either as
structs or classes, for LineSegment, LineList, and Vector2 used in the Cyrus-Beck clipping
algorithm.
4.8.3. What does numer <= 0 do?
Sketch the vectors involved in the value of numer in chopCI() and show that when the ray A + ct moves
parallel to the bounding line n · (P - B) = 0, it lies wholly in the inside half-space of the line if and only if
numer > 0.
4.8.4. Find the Clipped Line. Find the portion of the segment with endpoints (2, 4) and (20, 8) that lies within
the quadrilateral window with corners at (0, 7), (9, 9), (14,4), and (2, 2).
ct = bi + eiu
Equations 4.51 and 4.52 hold the answers. When converted to the current notation we have:
t = (ei⊥ · bi) / (ei⊥ · c)
and
u = (c⊥ · bi) / (ei⊥ · c)
If ei⊥ · c is 0, the i-th edge is parallel to the ray direction c and there is no intersection. There is a true
intersection with the i-th edge only if u falls in the interval [0, 1].
We need to find all of the legitimate hits of the ray with edges of P, and place them in a list of the hit times.
Call this list hitList. Then pseudocode for the process would look like:
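(The following is a sketch of that pseudocode; hitList is an assumed growable list of hit times, P[] the vertex array, and perpDot the perp dot product used earlier.)

hitList.clear();
for (int i = 0; i < N; i++)             // for each edge of the polygon
{
    Vector2 e = { P[(i+1)%N].x - P[i].x, P[(i+1)%N].y - P[i].y };  // e_i
    Vector2 b = { P[i].x - A.x, P[i].y - A.y };                    // b_i = P_i - A
    double denom = perpDot(e, c);       // e_i-perp dot c
    if (denom == 0.0) continue;         // edge parallel to the ray: no hit
    double u = perpDot(c, b) / denom;   // where along the edge is the hit?
    if (u >= 0.0 && u <= 1.0)
        hitList.add(perpDot(e, b) / denom);  // record hit time t
}
sort(hitList);                          // sort the hit times in ascending order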
(Figure: a ray from A to B crossing a polygon whose vertices include P0 = (3, 2) and P1 = (6, -1), with other labeled points at (6, 2), (8, 2), and (4, 1).)
The hit parameters for the five edges work out to:
edge      u         t
0         0.3846    0.2308
1        -0.727    -0.2727
2         0.9048    0.7142
3         0.4       0.6
4         0.375     0.375
The hit with edge 1 occurs at a t outside of [0, 1] so it is discarded. We sort the remaining t-values and arrive
at the sorted hit list: {0.2308, 0.375, 0.6, 0.7142}. Thus the ray enters P at t = 0.2308, exits it at t = 0.375,
re-enters it at t = 0.6, and exits it for the last time at t = 0.7142.
Practice exercise 4.9.4. Clip a line. Find the portions of the line from A = (1, 3.5 ) to B = (9, 3.5) that lie
inside the polygon with vertex list: (2, 4), (3, 1), (4, 4), (3, 3).
the object, but when a point lies off of the object the sign of f() can reveal on which side of the object the
point lies. In this chapter we addressed finding representations of the two fundamental flat objects in
graphics: lines and planes. For such objects both the parametric form and implicit form are linear in their
arguments. The implicit form can be revealingly written as the dot product of a normal vector and a vector
lying within the object.
It is possible to form arbitrary linear combinations of vectors, but not of points. For points only affine
combinations are allowed, or else chaos reigns if the underlying coordinate system is ever altered, as it
frequently is in graphics. Affine combinations of points are useful in graphics, and we showed that they
form the basis of tweening for animations and for Bezier curves.
The parametric form of a line or ray is particularly useful for such tasks as finding where two lines intersect
or where a ray hits a polygon or polyhedron. These problems are important in themselves, and they also
underlie the clipping algorithms that are so prominent in graphics. The Cyrus-Beck clipper, which finds where
a line expressed parametrically shares the same point in space as a line or plane expressed implicitly,
addresses a larger class of problems than the Cohen-Sutherland clipper of Chapter 2, and will be seen in
action in several contexts later.
In the Case Studies that are presented next, the vector tools developed so far are applied to some
interesting graphics situations, and their power is seen even more clearly. Whether or not you intend to
carry out the required programming to implement these mini-projects, it is valuable to read through them
and imagine what process you would pursue to solve them.
Figure 4.52. Tweening two polylines.
a). Develop a routine similar to routine drawTween(A, B, n, t) of Figure 4.23 that draws the tween
at t of the polylines A and B.
b). Develop a routine that draws a sequence of tweens between A and B as t varies from 0 to 1, and
experiment with it. Use the double buffering offered by OpenGL to make the animation smooth.
c). Extend the routine so that after t increases gradually from 0 to 1 it decreases gradually back to 0
and then repeats, so the animation repeatedly shows A mutating into B then back into A. This should
continue until a key is pressed.
d). Arrange so that the user can enter two polylines with the mouse, following which the polylines are
tweened as just described. The user presses key A and begins to lay down points to form polyline A, then
presses key B and lays down the points for polyline B. Pressing T terminates that process and begins the
tweening, which continues until the user types Q. Allow for the case where the user inputs a different
number of points for A than for B: your program automatically creates the required number of extra points
along line segments (perhaps at their midpoints) of the polyline having fewer points.
(Figure: the incircle of triangle ABC touches the three sides at points R, S, and T, dividing them into tangent lengths La, Lb, and Lc.)
The tangent lengths satisfy
|a| = La + Lb,  |b| = Lb + Lc,  |c| = La + Lc
13Note: finding the incircle also solves the problem of finding the unique circle that is tangent to 3 noncollinear lines in
the plane.
14 Suggested by Russell Swan.
The tangent points themselves are then given by
R = A + La (a/|a|)
S = B + Lb (b/|b|)    (4.65)
T = A - La (c/|c|)    (4.66)
Figure 4.55 illustrates the test for the particular bounding line that passes through P1 and P2. For the case of
point Q, which lies inside P, the angle with n1 is greater than 90°. For the case of point Q', which lies
outside P, the angle is less than 90°.
15 This circle is the first really exciting one to appear in any course on elementary geometry. Daniel Pedoe, Circles,
Pergamon Press, New York, 1957.
Figure 4.55. Is point Q inside polygon P?
Write and test a program that allows the user to:
a). lay down the vertices of a convex polygon, P, with the mouse;
b). successively lay down test points, Q, with the mouse;
c). prints "is inside" or "is not inside" depending on whether the point Q is or is not inside P.
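The geometric test itself is tiny. A sketch, assuming the vertices are stored in counterclockwise order (so the outward normal of edge e = b - a is (ey, -ex)):

struct Point2 { double x, y; };

// Return 1 if Q is inside the convex polygon with vertices V[0..N-1] (CCW order).
// Q is inside exactly when it lies in the inside half-space of every bounding line,
// i.e. each outward normal makes an angle greater than 90 degrees with Q - V[i].
int isInside(const Point2 V[], int N, Point2 Q)
{
    for (int i = 0; i < N; i++) {
        Point2 a = V[i], b = V[(i + 1) % N];
        double nx = b.y - a.y;            // outward normal (ey, -ex)
        double ny = -(b.x - a.x);
        if (nx * (Q.x - a.x) + ny * (Q.y - a.y) > 0.0)
            return 0;                     // outside this bounding line
    }
    return 1;
}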
(Figure: a ray with direction c bouncing among convex pillars B1, B2, and B3 inside a chamber.)
there is a hit with a pillar, the hit time is taken to be the time at which the ray enters the pillar. We
encapsulate this test in the routine:
int rayHit(Ray thisRay, int which, double& tHit);
that calculates the hit time tHit of the ray thisRay against pillar number which, and returns 1 if the ray hits the
pillar, and 0 if it misses. A suitable type for Ray is struct{Point2 startPt; Vector2 dir;}
or the corresponding class; it captures the starting point S and direction c of the ray.
We want to know which pillar the ray hits first. This is done by keeping track of the earliest hit time as we
scan through the list of pillars. Only positive hit times need to be considered: negative hit times correspond
to hits at spots in the opposite direction from the ray's travel. When the earliest hit point is found, the ray is
drawn from S to it.
We must find the direction of the reflected ray as it moves away from this latest hit spot. The direction c' of
the reflected ray is given in terms of the direction c of the incident ray by Equation 4.27:
c' = c - 2(c · n̂)n̂    (4.67)
where n̂ is the unit normal to the wall of the pillar that was hit. If a pillar inside the chamber was hit we
use the outward pointing normal; if the chamber itself was hit, we use the inward pointing normal.
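A sketch of the reflection step (Vector2 as before; nHat must already have unit length):

struct Vector2 { double x, y; };

// Reflect the incident direction c off a wall with unit normal nHat:
// c' = c - 2 (c . nHat) nHat (Equation 4.67).
Vector2 reflectDir(const Vector2& c, const Vector2& nHat)
{
    double k = 2.0 * (c.x * nHat.x + c.y * nHat.y);
    Vector2 r;
    r.x = c.x - k * nHat.x;
    r.y = c.y - k * nHat.y;
    return r;
}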
Write and exercise a program that draws the path of a ray as it reflects off the inner walls of chamber W and
the walls of the convex pillars inside the chamber. Arrange to read in the list of pillars from an external file
and to have the user specify the ray's starting position and direction. (Also see Chapter 7 for the
"elliptipool" 2D ray tracing simulation.)
(Figure: a subject polygon clipped against a window in two cases, a) and b), producing the clipped polygons.)
(Figure: the four cases, a) through d), of a polygon edge against a clip boundary's inside and outside regions, showing the intersection point i and endpoint p that are output in each case.)
4.10.7. Case Study 4.7. Clipping a Polygon against Another: Weiler-Atherton Clipping.
(Level of Effort: III). This method provides the most general clipping mechanism of all we have studied. It
clips any subject polygon against any (possibly non-convex) clip polygon. The polygons may even contain
holes.
The Sutherland-Hodgman algorithm examined in Case Study 4.6 exploits the convexity of the clipping
polygon through the use of inside-outside half-spaces. In some applications, such as hidden surface
removal and rendering shadows, however, one must clip one concave polygon against another. Clipping is
more complex in such cases. The Weiler-Atherton approach clips any polygon against any other, even
when they have holes. It also allows one to form the set-theoretic union, intersection, and difference of
two polygons, as we discuss in Case Study 4.8.
We start with a simple example, shown in Figure 4.60. Here two concave polygons, SUBJ and CLIP, are
represented by the vertex lists, (a, b, c, d) and (A, B, C, D), respectively. We adopt the convention here of
listing vertices so that the interior of the polygon is to the right of each edge as we move cyclically from
vertex to vertex through the list. For instance, the interior of SUBJ lies to the right of the edge from c to d
and to the right of that from d to a. This is akin to listing vertices in clockwise order.
Figure 4.60. Weiler-Atherton clipping: polygons SUBJ = (a, b, c, d) and CLIP = (A, B, C, D), whose edges cross at the six numbered intersections 1 through 6.
All of the intersections of the two polygons are identified and stored in a list (see later). For the example
here, there are six such intersections. Now to clip SUBJ against CLIP, traverse around SUBJ in the
forward direction (i.e., so that its interior is to the right) until an entering intersection is found: one for
which SUBJ is moving from the outside to the inside of CLIP. Here we first find 1, and it goes to an output
list that records the clipped polygon(s).
The process is now simple to state in geometric terms: Traverse along SUBJ, moving segment by segment,
until an intersection is encountered (2 in the example). The idea now is to turn away from following SUBJ
and to follow CLIP instead. There are two ways to turn. Turn so that CLIP is traversed in its forward
direction. This keeps the inside of both SUBJ and CLIP to the right. Upon finding an intersection, turn and
follow along SUBJ in its forward direction, and so on. Each vertex or intersection encountered is put on the
output list. Repeat the turn and jump between polygons process, traversing each polygon in its forward
direction, until the first vertex is revisited. The output list at this point consists of (1, b, 2, B).
Now check for any other entering intersections of SUBJ. Number 3 is found and the process repeats,
generating output list (3, 4, 5, 6). Further checks for entering intersections show that they have all been
visited, so the clipping process terminates, yielding the two polygons (1, b, 2, B) and (3, 4, 5, 6). An
organized way to implement this follow in the forward direction and jump process is to build the two lists
SUBJLIST: a, 1, b, 2, c, 3, 4, d, 5, 6
CLIPLIST: A, 6, 3, 2, B, 1, C, D, 4, 5
that traverse each polygon (so that its interior is to the right) and list both vertices and intersections in the
order they are encountered. (What should be done if no intersections are detected between the two
polygons?) Therefore traversing a polygon amounts to traversing a list, and jumping between polygons is
effected by jumping between lists.
Notice that once the lists are available, there is very little geometry in the process: just a point-outside-polygon test to properly identify an entering vertex. The proper direction in which to traverse each
polygon is embedded in the ordering of its list. For the preceding example, the progress of the algorithm is
traced in Figure 4.61.
(Figure 4.61: the traversal of SUB_LIST and CLIP_LIST, marking the start and restart entries and the intersections already visited.)
A more complex example involving polygons with holes is shown in Figure 4.62.
POLYA - POLYB: (4, 5, 6, H, E, F, 7, e, 8, B, C, D, 1, a) and (2, 3, k)
POLYB - POLYA: (1, b, c, d, 8, 5, g, h, 4, A, 3, i, j, 2) and (7, f, 6, G)
Figure 4.64. Forming the union and difference of two polygons.
Notice how the holes (E, F, G, H) and (k, i, j) in the polygons are properly handled, and that the algorithm
generates holes as needed (holes are polygons listed in counterclockwise fashion).
Task: Adapt the Weiler-Atherton method so that it can form the union and difference of two polygons, and
exercise your routines on a variety of polygons. Generate A and B polygons, either in files or
algorithmically, to assist in the testing. Draw the polygons A and B in two different colors, and the result of
the operation in a third color.
Preview.
Section 5.1 motivates the use of 2D and 3D transformations in computer graphics, and sets up some basic
definitions. Section 5.2 defines 2D affine transformations and establishes terminology for them in terms of a
matrix. The notation of coordinate frames is used to keep clear what objects are being altered and how. The
section shows how elementary affine transformations can perform scaling, rotation, translation, and shearing.
Section 5.2.5 shows that you can combine as many affine transformations as you wish, and the result is another
affine transformation, also characterized by a matrix. Section 5.2.7 discusses key properties of all affine
transformations, most notably that they preserve straight lines, planes, and parallelism, and shows why they
are so prevalent in computer graphics.
Section 5.3 extends these ideas to 3D affine transformations, and shows that all of the basic properties hold here
as well. 3D transformations are more complex than 2D ones, however, and more difficult to visualize,
particularly when it comes to 3D rotations. So special attention is paid to describing and combining various
rotations.
Section 5.4 discusses the relationship between transforming points and transforming coordinate systems. Section
5.5 shows how transformations are managed within a program when OpenGL is available, and how
transformations can greatly simplify many operations commonly needed in a graphics program. Modeling
transformations and the use of the current transformation are motivated through a number of examples.
Section 5.6 discusses modeling 3D scenes and drawing them using OpenGL. A camera is defined that is
positioned and oriented so that it takes the desired snapshot of the scene. The section discusses how
transformations are used to size and position objects as desired in a scene. Some example 3D scenes are modeled
and rendered, and the code required to do it is examined. This section also introduces a Scene Description
language, SDL, and shows how to write an application that can draw any scene described in the language. This
requires the development of a number of classes to support reading and parsing SDL files, and creating lists of
objects that can be rendered. These classes are available from the book's web site.
The chapter ends with a number of Case Studies that elaborate on the main ideas and provide opportunities to
work with affine transformations in graphics programs. One case study asks you to develop routines that perform
transformations when OpenGL is not available. Also described there are ways to decompose an affine
transformation into its elementary operations, and the development of a fast routine to draw arcs of circles that
capitalizes on the equivalence between a rotation and three successive shears.
5.1. Introduction.
The main goal in this chapter is to develop techniques for working with a particularly powerful family of
transformations called affine transformations, both with pencil and paper and in a computer program, with and
without OpenGL. These transformations are a fundamental cornerstone of computer graphics, and are central to
OpenGL as well as most other graphics systems. They are also a source of difficulty for many programmers
because it is often difficult to get them right.
One particularly delicate area is the confusion of points and vectors. Points and vectors seem very similar, and are
often expressed in a program using the same data type, perhaps a list of three numbers like (3.0, 2.5, -1.145) to
express them in the current coordinate system. But this practice can lead to disaster in the form of serious bugs
that are very difficult to ferret out, principally because points and vectors do not transform the same way. We need
a way to keep them straight, which is offered by using coordinate frames and appropriate homogeneous
coordinates as introduced in Chapter 4.
Figure 5.1. Drawings of objects before and after they are transformed.
the scene alone and move the camera to different orientations and positions for each snapshot. Positioning and
reorienting a camera can be carried out through the use of 3D affine transformations.
The current transformation therefore provides a crucial tool in the manipulation of graphical objects, and it is
essential for the application programmer to know how to adjust the CT so that the desired transformations are
produced. After developing the underlying theory of affine transformations, we turn in Section 5.5 to showing
how this is done.
(Figure: the transformation T maps each point P to an image point Q = T(P).)
1 More formally, if S is a set of points, its image T(S) is the set of all points T(P) where P is some point in S.
Take the 2D case first, as it is easier to visualize. In whichever coordinate frame we are using, points P and Q have
the representations
$$\tilde{P} = \begin{pmatrix} P_x \\ P_y \\ 1 \end{pmatrix}, \qquad \tilde{Q} = \begin{pmatrix} Q_x \\ Q_y \\ 1 \end{pmatrix}$$
Recall that this means the point P is at location P = Px i + Py j + 𝒪, and similarly for Q. Px and Py are
familiarly called the coordinates of P. The transformation operates on the representation P̃ and produces the
representation Q̃:
$$\begin{pmatrix} Q_x \\ Q_y \\ 1 \end{pmatrix} = T\left(\begin{pmatrix} P_x \\ P_y \\ 1 \end{pmatrix}\right) \qquad (5.1)$$
or more succinctly,
$$\tilde{Q} = T(\tilde{P}) \qquad (5.2)$$
The function T() could in principle be any rule at all, forming Qx and Qy from intricate expressions in Px and Py (involving, say, cosines, logarithms, and exponentials),
and such transformations might have interesting geometric effects, but we restrict ourselves to much simpler
families of functions, those that are linear in Px and Py. This property characterizes the affine transformations.
$$Q_x = m_{11}P_x + m_{12}P_y + m_{13}$$
$$Q_y = m_{21}P_x + m_{22}P_y + m_{23} \qquad (5.3)$$
for some six given constants m11, m12, etc. Qx consists of portions of both Px and Py, and so does Qy. This cross-fertilization between the x- and y-components gives rise to rotations and shears.
The affine transformation of Equation 5.3 has a useful matrix representation that helps to organize your thinking:2
$$\begin{pmatrix} Q_x \\ Q_y \\ 1 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} P_x \\ P_y \\ 1 \end{pmatrix} \qquad (5.4)$$
(Just multiply this out to see that it's the same as Equation 5.3. In particular, note how the third row of the matrix
forces the third component of Q̃ to be 1.) For an affine transformation the third row of the matrix is always (0, 0, 1).
Vectors can be transformed as well as points. Recall that if vector V has coordinates Vx and Vy then its coordinate
frame representation is a column vector with a third component of 0. When transformed by the same affine
transformation as above the result is
$$\begin{pmatrix} W_x \\ W_y \\ 0 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} V_x \\ V_y \\ 0 \end{pmatrix}$$   (5.5)

Note that the third component of 0 causes the translation terms m13 and m23 to have no effect on a vector.
Practice Exercise 5.2.1. Apply the transformation. An affine transformation is specified by the matrix

$$\begin{pmatrix} 3 & 0 & 5 \\ -2 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}$$

Find the image Q of the point P = (1, 2).
Solution:

$$\begin{pmatrix} 8 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 5 \\ -2 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$$

so Q = (8, 2).
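To make this concrete in code, here is a minimal C++ sketch, not from the original text, that applies a 3-by-3 affine matrix to a homogeneous triple as in Equations 5.4 and 5.5; the names Matrix3 and transform() are invented for illustration. Using w = 1 treats the triple as a point; w = 0 treats it as a vector, so translation then has no effect.

#include <cstdio>

struct Matrix3 { double m[3][3]; };   // row-major 3-by-3 affine matrix

// Multiply the homogeneous triple in[] by M, placing the result in out[].
void transform(const Matrix3& M, const double in[3], double out[3])
{
	for(int r = 0; r < 3; r++)
		out[r] = M.m[r][0]*in[0] + M.m[r][1]*in[1] + M.m[r][2]*in[2];
}

int main()
{
	Matrix3 M = {{{3, 0, 5}, {-2, 1, 2}, {0, 0, 1}}};  // matrix of Exercise 5.2.1
	double P[3] = {1, 2, 1}, Q[3];                     // the point (1, 2)
	transform(M, P, Q);
	printf("Q = (%g, %g, %g)\n", Q[0], Q[1], Q[2]);    // prints Q = (8, 2, 1)
	return 0;
}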
Translation.
You often want to translate a picture into a different position on a graphics display. The translation part of the
affine transformation arises from the third column of the matrix
$$\begin{pmatrix} Q_x \\ Q_y \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & m_{13} \\ 0 & 1 & m_{23} \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} P_x \\ P_y \\ 1 \end{pmatrix}$$   (5.6)
or simply

Qx = Px + m13
Qy = Py + m23
Scaling.
A scaling changes the size of a picture and involves two scale factors, Sx and Sy, for the x- and y-coordinates,
respectively:
(Qx, Qy) = (SxPx, SyPy)
Thus the matrix for a scaling by itself is simply

$$\begin{pmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & 1 \end{pmatrix}$$   (5.7)
Scaling in this fashion is more accurately called scaling about the origin, because each point P is moved Sx times
farther from the origin in the x-direction, and Sy times farther from the origin in the y-direction. If a scale factor is
negative, then there is also a reflection about a coordinate axis. Figure 5.11 shows an example in which the scaling
(Sx, Sy) = (-1, 2) is applied to a collection of points. Each point is both reflected about the y-axis and scaled by 2 in
the y-direction.
There are also pure reflections, for which each of the scale factors is +1 or -1. An example is
T(Px, Py) = (-Px, Py)
(5.8)
which produces a mirror image of a picture by flipping it horizontally about the y-axis, replacing each
occurrence of x with -x. (What is the matrix of this transformation?)
If the two scale factors are the same, Sx = Sy = S, the transformation is a uniform scaling, or a magnification
about the origin, with magnification factor |S|. If S is negative, there are reflections about both axes. A point is
moved outward from the origin to a position |S| times farther away from the origin. If |S| < 1, the points will be
moved closer to the origin, producing a reduction (or demagnification). If, on the other hand, the scale factors
are not the same, the scaling is called a differential scaling.
Practice Exercise 5.2.2. Sketch the effect. A pure scaling affine transformation uses scale factors Sx = 3 and Sy = −2. Find the image of each of the three objects in Figure 5.12 under this transformation, and sketch them. (Make use of the facts, to be proved later, that an affine transformation maps straight lines to straight lines, and ellipses to ellipses.)
Rotation.
A fundamental graphics operation is the rotation of a figure about a given point through some angle. Figure 5.13 shows a set of points rotated about the origin through an angle of θ = 60°.
When T() is a rotation about the origin through angle θ, it takes each point (Px, Py) into (Qx, Qy) given by

Qx = Px cos(θ) − Py sin(θ)
Qy = Px sin(θ) + Py cos(θ)   (5.9)
As we derive next, this form causes positive values of θ to perform a counterclockwise (CCW) rotation. In terms of its matrix form, a pure rotation about the origin is given by
$$\begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix}$$   (5.10)
Example 5.2.1. Find the transformed point, Q, caused by rotating P = (3, 5) about the origin through an angle of 60°. Solution: For an angle of 60°, cos(θ) = 0.5 and sin(θ) = 0.866, and Equation 5.9 yields Qx = (3)(0.5) − (5)(0.866) = −2.83 and Qy = (3)(0.866) + (5)(0.5) = 5.098. Check this on graph paper by swinging an arc of 60° from (3, 5) and reading off the position of the mapped point. Also check numerically that Q and P are at the same distance from the origin. (What is this distance?)
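A few lines of C++ (an illustrative check, not part of the text) confirm both the mapped point and the equal-distance claim:

#include <cmath>
#include <cstdio>

int main()
{
	const double PI = 3.141592653589793;
	double theta = 60.0 * PI / 180.0;             // 60 degrees, in radians
	double Px = 3, Py = 5;
	double Qx = Px*cos(theta) - Py*sin(theta);    // Equation 5.9
	double Qy = Px*sin(theta) + Py*cos(theta);
	printf("Q = (%.3f, %.3f)\n", Qx, Qy);         // about (-2.830, 5.098)
	printf("|P| = %.4f  |Q| = %.4f\n",            // both equal sqrt(34) = 5.8310
		sqrt(Px*Px + Py*Py), sqrt(Qx*Qx + Qy*Qy));
	return 0;
}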
To derive Equation 5.9, suppose P lies at distance R from the origin, at angle φ, so that Px = R cos(φ) and Py = R sin(φ). Rotating P through angle θ produces Q at angle φ + θ:

Qx = R cos(φ + θ)
Qy = R sin(φ + θ)

Substitute into these equations the two familiar trigonometric relations:

cos(φ + θ) = cos(φ) cos(θ) − sin(φ) sin(θ)
sin(φ + θ) = sin(φ) cos(θ) + cos(φ) sin(θ)

and use Px = R cos(φ) and Py = R sin(φ) to obtain Equation 5.9.
Practice Exercise 5.2.3. Rotate a Point. Use Equation 5.9 to find the image of each of the following points after
rotation about the origin:
a). (2, 3) through an angle of −45°;
b). (1, 1) through an angle of −180°;
c). (60, 61) through an angle of 4°.
In each case check the result on graph paper, and compare numerically the distances of the original point and its
image from the origin.
Solution: a). (3.5355, .7071), b). (-1, -1), c). (55.5987, 65.0368).
Shearing.
The example of shearing illustrated in Figure 5.15 is a shear in the x-direction (or along x). In this case the y-coordinate of each point is unaffected, whereas each x-coordinate is translated by an amount that increases linearly with y. A shear in the x-direction is given by

Qx = Px + h Py
Qy = Py
where the coefficient h specifies what fraction of the y-coordinate of P is to be added to the x-coordinate. The
quantity h can be positive or negative. Shearing is sometimes used to make italic letters out of regular letters. The
matrix associated with this shear is:
$$\begin{pmatrix} 1 & h & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$   (5.11)
One can also have a shear along y, for which Qx = Px and Qy = g Px + Py for some value g, so that the matrix is
given by
$$\begin{pmatrix} 1 & 0 & 0 \\ g & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$   (5.12)
Example 5.2.2: Into which point does (3, 4) shear when h = 0.3 in Equation 5.11? Solution: Q = (3 + (0.3)(4), 4) = (4.2, 4).
Example 5.2.3: Let g = 0.2 in Equation 5.12. To what point does (6, −2) map? Solution: Q = (6, (0.2)(6) − 2) = (6, −0.8).
A more general shear along an arbitrary line is discussed in a Case Study at the end of the chapter. A notable
feature of a shear is that its matrix has a determinant of 1. As we see later this implies that the area of a figure is
unchanged when it is sheared.
Practice Exercise 5.2.4. Shearing Lines. Consider the shear for which g = .4 and h = 0. Experiment with various
sets of three collinear points to build some assurance that the sheared points are still collinear. Then, assuming that
lines do shear into lines, determine into what objects the following line segments shear:
a. the horizontal segment between ( - 3, 4) and (2, 4);
b. the horizontal segment between ( - 3, - 4) and (2, - 4);
c. the vertical segment between ( - 2, 5) and ( - 2, - 1);
d. the vertical segment between (2, 5) and (2, - 1);
e. the segment between ( - 1, - 2) and (3, 2);
Into what shapes do each of the objects in Figure 5.12 shear?
An affine transformation can be inverted only if its matrix M is nonsingular, that is, only if the determinant of M, which evaluates to

det M = m11 m22 − m12 m21   (5.13)

is nonzero. Notice that the third column of M, which represents the amount of translation, does not affect the determinant. This is a direct consequence of the two zeroes appearing in the third row of M. We shall make special note on those rare occasions that we use singular transformations.
It is reassuring to be able to undo the effect of a transformation. This is particularly easy to do with nonsingular affine transformations. If point P is mapped into point Q according to Q̃ = MP̃, simply premultiply both sides by the inverse of M, denoted M⁻¹, and write

P̃ = M⁻¹Q̃   (5.14)

For the 2-by-2 linear part of M the inverse is given by

$$M^{-1} = \frac{1}{\det M} \begin{pmatrix} m_{22} & -m_{12} \\ -m_{21} & m_{11} \end{pmatrix}$$   (5.15)
We therefore obtain the following matrices for the elementary inverse transformations:

Scaling (use M as found in Equation 5.7):

$$M^{-1} = \begin{pmatrix} \frac{1}{S_x} & 0 & 0 \\ 0 & \frac{1}{S_y} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Rotation (use M as found in Equation 5.10):

$$M^{-1} = \begin{pmatrix} \cos(\theta) & \sin(\theta) & 0 \\ -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Shearing (use M as found in Equation 5.11):

$$M^{-1} = \begin{pmatrix} 1 & -h & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Translations: The inverse transformation simply subtracts the offset rather than adds it:

$$M^{-1} = \begin{pmatrix} 1 & 0 & -m_{13} \\ 0 & 1 & -m_{23} \\ 0 & 0 & 1 \end{pmatrix}$$
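In code, the 2-by-2 formula of Equation 5.15 extends naturally to the whole affine matrix: invert the linear part, then send the translation through the inverted part with its sign reversed. The following sketch is illustrative only (the type Affine2 is invented) and assumes det M is nonzero:

// An affine matrix [a b tx; c d ty; 0 0 1], stored by named fields.
struct Affine2 { double a, b, c, d, tx, ty; };

Affine2 invert(const Affine2& M)
{
	double det = M.a * M.d - M.b * M.c;          // Equation 5.13; assumed nonzero
	Affine2 inv;
	inv.a =  M.d / det;   inv.b = -M.b / det;    // Equation 5.15: invert linear part
	inv.c = -M.c / det;   inv.d =  M.a / det;
	inv.tx = -(inv.a * M.tx + inv.b * M.ty);     // the inverse must undo the
	inv.ty = -(inv.c * M.tx + inv.d * M.ty);     //   translation as well
	return inv;
}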
Practice Exercises.
5.2.5. What Is the Inverse of a Rotation? Show that the inverse of a rotation through θ is a rotation through −θ. Is this reasonable geometrically? Why?
5.2.6. Inverting a Shear. Is the inverse of a shear also a shear? Show why or why not.
5.2.7. An Inverse Matrix. Compute the inverse of the matrix
3
M = 1
0
2 1
1 0 .
0 1
We often want to perform two transformations in succession: first apply T1(·) to point P, and then apply T2(·) to the result, producing the point

W = T2(T1(P))   (5.16)

The overall transformation T(·) that carries P directly into W is called the composition of T1 and T2, and its matrix M is the product of their matrices:

M = M2 M1.   (5.17)
When homogeneous coordinates are used, composing affine transformations is accomplished by simple matrix
multiplication. Notice that the matrices appear in reverse order to that in which the transformations are applied: if
we first apply T1 with matrix M1, and then apply T2 with matrix M2 to the result, the overall transformation has matrix M2M1, with the second matrix appearing first in the product as you read from left to right. (Just the
opposite order will be seen when we transform coordinate systems.)
By applying the same reasoning, any number of affine transformations can be composed simply by multiplying
their associated matrices. In this way, transformations based on an arbitrary succession of rotations, scalings,
shears, and translations can be formed and captured in a single matrix.
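A short sketch (illustrative; Matrix3 as in the earlier sketch) shows composition as plain matrix multiplication, and why the order of the factors matters:

// Returns the product A * B of two 3-by-3 matrices,
// where struct Matrix3 { double m[3][3]; } as before.
Matrix3 multiply(const Matrix3& A, const Matrix3& B)
{
	Matrix3 C = {};                               // all elements start at 0
	for(int r = 0; r < 3; r++)
		for(int c = 0; c < 3; c++)
			for(int k = 0; k < 3; k++)
				C.m[r][c] += A.m[r][k] * B.m[k][c];
	return C;
}
// To apply T1 (matrix M1) and then T2 (matrix M2) to points, build
// Matrix3 M = multiply(M2, M1);   // note the reversed order!
// In general multiply(M1, M2) describes a different transformation.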
For example, to rotate through 45°, then scale by (1.5, 2), and then translate through (3, −5), form the product of the three matrices (translation times scaling times rotation):

$$\begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & -5 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1.5 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} .707 & -.707 & 0 \\ .707 & .707 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1.06 & -1.06 & 3 \\ 1.414 & 1.414 & -5 \\ 0 & 0 & 1 \end{pmatrix}$$

Now to transform the point (1, 2), enlarge it to the triple (1, 2, 1), multiply it by the composite matrix to obtain (1.94, −0.758, 1), and drop the one to form the image point (1.94, −0.758). It is instructive to use graph paper, and to perform each of these transformations in turn to see how (1, 2) is mapped.
Example 5.2.5. Rotation about an arbitrary point. To rotate points through angle θ about an arbitrary point V (see Figure 5.17), compose three elementary transformations: translate through −V so that V moves to the origin, rotate through θ about the origin, and translate back through V. Creating a matrix for each elementary transformation, and multiplying them out, produces:

$$\begin{pmatrix} 1 & 0 & V_x \\ 0 & 1 & V_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & -V_x \\ 0 & 1 & -V_y \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cos(\theta) & -\sin(\theta) & d_x \\ \sin(\theta) & \cos(\theta) & d_y \\ 0 & 0 & 1 \end{pmatrix}$$

where dx = Vx(1 − cos(θ)) + Vy sin(θ) and dy = Vy(1 − cos(θ)) − Vx sin(θ).
A reflection about an axis that passes through the origin tilted at angle β is built the same way: rotate through −β so the reflection axis lies along the x-axis, reflect about the x-axis, and rotate back through β. Multiplying out the three matrices gives:

$$\begin{pmatrix} c & -s & 0 \\ s & c & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} c & s & 0 \\ -s & c & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} c^2 - s^2 & 2cs & 0 \\ 2cs & s^2 - c^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
where c stands for cos(β) and s for sin(β). Using trigonometric identities, the final matrix can be written (check this out!)

$$\begin{pmatrix} \cos(2\beta) & \sin(2\beta) & 0 \\ \sin(2\beta) & -\cos(2\beta) & 0 \\ 0 & 0 & 1 \end{pmatrix}$$   (5.18)

This has the general look of a rotation matrix, except the angle has been doubled and minus signs have crept into the second column. But in fact it is the matrix for a reflection about the axis at angle β.
Practice Exercises.
Exercise 5.2.8. The classic: the Window to Viewport Transformation.
We developed this transformation in Chapter 3. Rewriting Equation 3.2 in the current notation, we have:

$$\tilde{M} = \begin{pmatrix} A & 0 & C \\ 0 & B & D \\ 0 & 0 & 1 \end{pmatrix}$$
where the ingredients A, B, C, and D depend on the window and viewport and are given in Equation 3.3. Show
that this transformation is composed of:
a translation through (−W.l, −W.b) to place the lower left corner of the window at the origin;
a scaling by (A, B) to size things;
a translation through (V.l, V.b) to move the corner of the viewport to the desired position.
5.2.9. Alternative Form for a Rotation About a Point. Show that the transformation of Figure 5.17 can be written out as
Qx = cos(θ)(Px − Vx) − sin(θ)(Py − Vy) + Vx
Qy = sin(θ)(Px − Vx) + cos(θ)(Py − Vy) + Vy
This form clearly reveals that the point is first translated by (−Vx, −Vy), rotated, and then translated by (Vx, Vy).
5.2.10. Where does it end up? Where is the point (8, 9) after it is rotated through 50° about the point (3, 1)? Find the M matrix.
5.2.11. Seeing it two ways. On graph paper place point P = (4, 7) and the result Q of rotating P about V = (5, 4) through 45°. Now rotate P about the origin through 45° to produce Q', which is clearly different from Q. The difference between them is V − MV. Show the point V − MV in the graph, and check that Q − Q' equals V − MV.
5.2.12. What if the axis doesn't go through the origin? Find the affine transformation that produces a reflection about the line given parametrically by L(t) = A + bt. Show that it reduces to the result in Equation 5.18 when A + bt does pass through the origin.
5.2.13. Reflection in x = y. Show that a reflection about the line x = y is equivalent to a reflection in x followed by a 90° rotation.
5.2.14. Scaling About an Arbitrary Point. Fashion the affine transformation that scales points about a pivot
point, (Vx, Vy). Test the overall transformation on some sample points, to confirm that the scaling operation is
correct. Compare this with the transformation for rotation about a pivot point.
5.2.15. Shearing Along a Tilted Axis. Fashion the transformation that shears a point along the axis described by vector u tilted at angle β, as shown in Figure 5.19. Point P is shifted along u by an amount that is a fraction f of the displacement d of P from the axis.
5.2.20. Some Transformations Commute. Show that uniform scaling commutes with rotation, in that the
resulting transformation does not depend on the order in which the individual transformations are applied. Show
that two translations commute, as do two scalings. Show that differential scaling does not commute with rotation.
5.2.21. Reflection plus a rotation. Show that a reflection in x followed by a reflection in y is the same as a rotation by 180°.
5.2.22. Two Successive Rotations. Suppose that R(θ) denotes the transformation that produces a rotation through angle θ. Show that applying R(θ1) followed by R(θ2) is equivalent to applying the single rotation R(θ1 + θ2). Thus successive rotations are additive.
5.2.23. A Succession of Shears. Find the composition of a pure shear along the x-axis followed by a pure shear
along the y-axis. Is this still a shear? Sketch by hand an example of what happens to a square centered at the origin
when subjected to a simultaneous shear versus a succession of shears along the two axes.
Consider a point W formed as an affine combination of two points P1 and P2: W = a1 P1 + a2 P2, where a1 + a2 = 1.
What happens when we apply an affine transformation T() to this point W? We claim T(W) is the same affine
combination of the transformed points, that is:
Claim: T(a1 P1 + a2 P2) = a1 T(P1) + a2 T(P2),
(5.19)
For instance, T(0.7 (2, 9) + 0.3 (1, 6)) = 0.7 T((2, 9)) + 0.3 T((1, 6)).
The truth of this is simply a matter of linearity. Using homogeneous coordinates, the point T(W) is MW̃, and we can do the following steps using linearity of matrix multiplication:

MW̃ = M(a1 P̃1 + a2 P̃2) = a1 MP̃1 + a2 MP̃2
which in ordinary coordinates is just a1T(P1) + a2T(P2) as claimed. The property that affine combinations of
points are preserved under affine transformations seems fairly elementary and abstract, but it turns out to be
pivotal. It is sometimes taken as the definition of what an affine transformation is.
2). Affine transformations preserve lines. Apply T to the straight line through points A and B, which has the parametric representation P(t) = (1 − t)A + tB. Since this is an affine combination of A and B, the property just established gives

T(P(t)) = (1 − t)T(A) + tT(B)   (5.20)
This is another straight line passing through T(A) and T(B). In computer graphics this vastly simplifies drawing
transformed line segments: We need only compute the two transformed endpoints T(A) and T(B) and then draw a
straight line between them! This saves having to transform each of the points along the line, which is obviously
impossible.
The argument is the same to show that a plane is transformed into another plane. Recall from Equation 4.45 that
the parametric representation for a plane can be written as an affine combination of points:
P(s, t) = sA + tB + (1 - s - t)C
When each point is transformed this becomes:
T(P(s, t)) = sT(A) + t T(B) + (1 - s - t)T(C)
which is clearly also the parametric representation of some plane.
Preservation of collinearity and flatness guarantees that polygons will transform into polygons, and planar
polygons (those whose vertices all lie in a plane) will transform into planar polygons. In particular, triangles will
transform into triangles.
3). Parallelism of lines and planes is preserved. A line from point A in direction b has the parametric form A + bt, which transforms in homogeneous coordinates by M(Ã + b̃t) = MÃ + (Mb̃)t, a line which has direction vector Mb̃. This new direction does not depend on point A. Thus two different lines A1 + bt and A2 + bt that have the same direction will transform into two lines both having the direction Mb̃, so they are parallel. An important consequence of this property is that parallelograms map into other parallelograms.
The same argument applies to planes: a plane's direction vectors (see Equation 4.43) transform into new direction vectors whose values do not depend on the location of the plane. A consequence of this is that parallelepipeds5 map into other parallelepipeds.
Example 5.2.8. How is a grid transformed?
Because affine transformations map parallelograms into parallelograms they are rather limited in how much they
can alter the shape of geometrical objects. To illustrate this apply any 2D affine transformation T to a unit square
grid, as in Figure 5.21. Because a grid consists of two sets of parallel lines, T maps the square grid to another grid
consisting of two sets of parallel lines. Think of the grid carrying along whatever objects are defined in the
grid, to get an idea of how the objects are warped by the transformation. This is all that an affine transformation
can do: warp figures in the same way that one grid is mapped into another. The new lines can be tilted at any
angle; they can be any (fixed) distance apart; and the two new axes need not be perpendicular. And of course the
whole grid can be positioned anywhere in the plane.
5 As we see later, a parallelepiped is the 3D analog of a parallelogram: it has six faces that occur in pairs of parallel faces.
Chap 5. Transformations
9/28/99
page 19
The same result applies in 3D: all a 3D affine transformation can do is map a cubical grid into a grid of parallelepipeds.
4). The Columns of the Matrix reveal the Transformed Coordinate Frame.
It is useful to examine the columns of the matrix M of an affine transformation, for they prescribe how the
coordinate frame is transformed. Suppose the matrix M is given by
$$M = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \mathbf{m}_1 & \mathbf{m}_2 & \mathbf{m}_3 \end{pmatrix}$$   (5.21)
so its columns are m1, m2, and m3. The first two columns are vectors (their third component is 0) and the last column is a point (its third component is a 1). As always, the coordinate frame of interest is defined by the origin 𝒪 and the basis vectors i and j, which have the representations:

$$\mathcal{O} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad \mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$$
Notice that vector i transforms into the vector m1 (check this out): m1 = Mi, and similarly j maps into m2 and 𝒪 maps into the point m3. This is illustrated in Figure 5.22a. The coordinate frame (i, j, 𝒪) transforms into the coordinate frame (m1, m2, m3), and these new objects are precisely the columns of the matrix.
Figure 5.22. The transformation forms a new coordinate frame.
The axes of the new coordinate frame are not necessarily perpendicular, nor must they be unit length. (They are still perpendicular if the transformation involves only rotations and uniform scalings.) Any point P = Px i + Py j + 𝒪 transforms into Q = Px m1 + Py m2 + m3. It is sometimes very revealing to look at the matrix of an affine transformation in this way.
Example 5.2.9. Rotation about a point. The transformation explored in Example 5.2.5 is a rotation of 30° about the point (−2, 3). This yielded the matrix:
$$\begin{pmatrix} .866 & -.5 & 1.232 \\ .5 & .866 & 1.402 \\ 0 & 0 & 1 \end{pmatrix}$$
As shown in Figure 5.22b the coordinate frame therefore maps into the new coordinate frame with origin at
(1.232, 1.402, 1) and coordinate axes given by the vectors (0.866, 0.5, 0) and (-0.5, 0.866, 0). Note that these
axes are still perpendicular, since only a rotation is involved.
5). Relative ratios are preserved. Consider a point P that lies a fraction t of the way between two given points A and B, as suggested in Figure 5.23. Then T(P) lies the same fraction t of the way between the images T(A) and T(B).
Figure 5.23. Relative ratios are preserved.
As a special case, midpoints of lines map into midpoints. This leads at once to a nice geometric result: the diagonals of any parallelogram bisect each other. (Proof: any parallelogram is an affine-transformed square (why?), and the diagonals of a square bisect each other, so the diagonals of a parallelogram also bisect each other.) The same applies in 3D space: the diagonals of any parallelepiped bisect each other.
Interesting Aside. In addition to preserving lines, parallelism, and relative ratios, affine transformations also
preserve ellipses and ellipsoids, as we see in Chapter 8!
6). Effect of transformations on the areas of figures. When a 2D figure is transformed by an affine transformation with matrix M, its area is multiplied by the magnitude of the determinant of M:

area after transformation = |det M| × area before transformation   (5.22)

In 2D the determinant of M in Equation 5.4 is m11m22 − m12m21. Thus for a pure scaling as in Equation 5.7, the new area is SxSy times the original area, whereas for a shear along one axis the new area is the same as the original area! Equation 5.22 also confirms that a rotation does not alter the area of a figure, since cos²(θ) + sin²(θ) = 1.
In 3D similar arguments apply, and we can conclude that the volume of a 3D object is scaled by |det M| when the
object is transformed by the 3D transformation based on matrix M.
Example 5.2.10: The Area of an Ellipse. What is the area of the ellipse with semi-axes W and H? Solution: This ellipse can be formed by scaling the unit circle x² + y² = 1 by the scale factors Sx = W and Sy = H, a transformation for which the matrix M has determinant WH. The unit circle is known to have area π, and so the ellipse has area πWH.
Basically, a matrix M may be factored into a product of elementary matrices in various ways. One particular way of factoring the matrix M̃ associated with a 2D affine transformation, elaborated upon in Case Study 5.3, yields the result:

M̃ = (shear)(scaling)(rotation)(translation)

That is, any 3-by-3 matrix M̃ that represents a 2D affine transformation can be written as the product of (reading right to left) a translation matrix, a rotation matrix, a scaling matrix, and a shear matrix. The specific ingredients of each matrix are given in the Case Study.

In 3D things are somewhat more complicated. The 4-by-4 matrix M̃ that represents a 3D affine transformation can be written as:

M̃ = (scaling)(rotation)(shear1)(shear2)(translation),

the product of (reading right to left) a translation matrix, a shear matrix, another shear matrix, a rotation matrix, and a scaling matrix. This result is developed in Case Study 5.???.
Practice Exercises.
5.2.24. Generalizing the argument. Show that if W is an affine combination of the N points Pi, i = 1,…,N, and T() is an affine transformation, then T(W) is the same affine combination of the N points T(Pi), i = 1,…,N.
5.2.25. Show that relative ratios are preserved. Consider P given by A + bt where b = B - A. Find the distances
|P - A| and |P - B| from P to A and B respectively, showing that they lie in the ratio t to 1 - t. Is this true if t lies
outside of the range 0 to 1? Do the same for the distances |T(P) - T(A)| and |T(P) - T(B)|.
5.2.26. Effect on Area. Show that a 2D affine transformation causes the area of a figure to be multiplied by the factor given in Equation 5.22. Hint: View a geometric figure as made up of many very small squares, each of which is mapped into a parallelogram, and then find the area of this parallelogram.
The ideas extend directly to three dimensions. A 3D point P is represented in homogeneous coordinates by a 4-component column vector:

$$\tilde{P} = \begin{pmatrix} P_x \\ P_y \\ P_z \\ 1 \end{pmatrix}$$
Suppose T() is an affine transformation that transforms point P to point Q. Then just as in the 2D case T() is represented by a matrix M̃, which is now 4 by 4:

$$\tilde{M} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.23)
and we can say that the representation of point Q is found by multiplying P̃ by matrix M̃:

$$\begin{pmatrix} Q_x \\ Q_y \\ Q_z \\ 1 \end{pmatrix} = \tilde{M} \begin{pmatrix} P_x \\ P_y \\ P_z \\ 1 \end{pmatrix}$$   (5.24)
Notice that once again for an affine transformation the final row of the matrix is a string of zeroes followed by a lone one. (This will cease to be the case when we examine projective matrices in Chapter 7.)
Translation.
For a pure translation the matrix is the identity with the offset placed in its fourth column:

$$\begin{pmatrix} 1 & 0 & 0 & m_{14} \\ 0 & 1 & 0 & m_{24} \\ 0 & 0 & 1 & m_{34} \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
Scaling.
Scaling in three dimensions is a direct extension of the 2D case, having a matrix given by:
$$\begin{pmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.25)
where the three constants Sx, Sy, and Sz cause scaling of the corresponding coordinates. Scaling is about the origin,
just as in the 2D case. Figure 5.24 shows the effect of scaling in the z-direction by 0.5 and in the x-direction by a
factor of two.
Shearing.
Three-dimensional shears appear in greater variety than do their two-dimensional counterparts. The matrix for the
simplest elementary shear is the identity matrix with one zero term replaced by some value, as in
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ f & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.26)
which produces Q = (Px, f Px + Py, Pz); that is, Py is offset by some amount proportional to Px, and the other
components are unchanged. This causes an effect similar to that in 2D shown in Figure 5.15. Goldman
[goldman???] has developed a much more general form for a 3D shear, which is described in Case Study 5.???.
Rotations.
Rotations in three dimensions are common in graphics, for we often want to rotate an object or a camera in order
to obtain different views. There is a much greater variety of rotations in three than in two dimensions, since we
must specify an axis about which the rotation occurs, rather than just a single point. One helpful approach is to
decompose a rotation into a combination of simpler ones.
Elementary rotations about a coordinate axis.
The simplest rotation is a rotation about one of the coordinate axes. We call a rotation about the x-axis an x-roll,
a rotation about the y-axis a y-roll, and one about the z-axis a z-roll. We present individually the matrices that
produce an x-roll, a y-roll, and a z-roll. In each case the rotation is through an angle β about the given axis. We define positive angles using a looking inward convention:
Positive values of β cause a counterclockwise (CCW) rotation about an axis, as one looks inward from a point on the positive axis toward the origin.
1. An x-roll. Writing c for cos(β) and s for sin(β), the matrix is

$$R_x(\beta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & -s & 0 \\ 0 & s & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.27)
2. A y-roll:

$$R_y(\beta) = \begin{pmatrix} c & 0 & s & 0 \\ 0 & 1 & 0 & 0 \\ -s & 0 & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.28)
3. A z-roll:
7 In a left-handed system the sense of a rotation through a positive β would be CCW looking outward along the positive axis from the origin. This formulation is used by some authors.
$$R_z(\beta) = \begin{pmatrix} c & -s & 0 & 0 \\ s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.29)
Note that 12 of the terms in each matrix are the zeros and ones of the identity matrix. They occur in the row and column that correspond to the axis about which the rotation is being made (e.g., the first row and column for an x-roll). They guarantee that the corresponding coordinate of the point being transformed will not be altered. The c and s terms always appear in a rectangular pattern in the other rows and columns.
Aside: Why is the y-roll different? The −s term appears in the upper row for the x- and z-rolls, but in the lower row for the y-roll. Is a y-roll inherently different in some way? This question is explored in the exercises.
Example 5.3.1. Rotating the barn. Figure 5.26 shows a barn in its original orientation (part a), and after a −70° x-roll (part b), a 30° y-roll (part c), and a −90° z-roll (part d).
Figure 5.26. The barn: a). in its original orientation; b). after a −70° x-roll; c). after a 30° y-roll; d). after a −90° z-roll.
Example 5.3.2. Rotating a point. Apply a 30° y-roll to the point P = (3, 1, 4). Here c = cos(30°) = 0.866 and s = sin(30°) = 0.5, and Equation 5.28 gives

$$Q = \begin{pmatrix} .866 & 0 & .5 & 0 \\ 0 & 1 & 0 & 0 \\ -.5 & 0 & .866 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \\ 4 \\ 1 \end{pmatrix} = \begin{pmatrix} 4.6 \\ 1 \\ 1.964 \\ 1 \end{pmatrix}$$
Practice Exercises.
5.3.1. Visualizing the 90° Rotations. Draw a right-handed 3D system and convince yourself that a 90° rotation (CCW looking toward the origin) about each axis rotates the other axes into one another, as specified in the preceding list. What is the effect of rotating a point on the x-axis about the x-axis?
5.3.2. Rotating the Basic Barn. Sketch the basic barn after each vertex has experienced a 45° x-roll. Repeat for y- and z-rolls.
5.3.3. Do a Rotation. Find the image Q of the point P = (1, 2, −1) after a 45° y-roll. Sketch P and Q in a 3D coordinate system and show that your result is reasonable.
5.3.4. Testing 90° Rotations of the Axes. This exercise provides a useful trick for remembering the form of the rotation matrices. Apply each of the three rotation matrices to each of the standard unit position vectors, i, j, and k, using a 90° rotation. In each case discuss the effect of the transformation on the unit vector.
5.3.5. Is a y-roll indeed different? The minus sign in Equation 5.28 seems to be in the wrong place: on the lower s rather than the upper one. Here you show that Equations 5.27-29 are in fact consistent. It's just a matter of how things are ordered. Think of the three axes x, y, and z as occurring cyclically: x -> y -> z -> x -> y, etc. If we are discussing a rotation about some current axis (x-, y-, or z-) then we can identify the previous axis and the next axis. For instance, if x- is the current axis, then the previous one is z- and the next is y-. Show that with this naming all three types of rotations use the same equations: Qcurr = Pcurr, Qnext = c Pnext − s Pprev, and Qprev = s Pnext + c Pprev. Write these equations out for each of the three possible current axes.
Affine transformations compose in 3D just as in 2D: if M1 is applied first and M2 second, the overall transformation has matrix

M = M2 M1.   (5.30)
Any number of affine transformations can be composed in this way, and a single matrix results that represents
the overall transformation.
Figure 5.27 shows an example, where a barn is first transformed using some M1, then that transformed barn is
again transformed using M2. The result is the same as the barn transformed once using M2M1.
A common way to build a complicated 3D rotation is as a product of three rolls:

M = Rx(β1) Ry(β2) Rz(β3)   (5.31)

In this context the angles β1, β2, and β3 are often called Euler8 angles. One form of Euler's Theorem asserts that any 3D rotation can be obtained by three rolls about the x-, y-, and z-axes, so any rotation can be written as in Equation 5.31 for the appropriate choice of Euler angles. This implies that it takes three values to completely specify a rotation.
Example 5.3.3. What is the matrix associated with an x-roll of 45° followed by a y-roll of 30° followed by a z-roll of 60°? Direct multiplication of the three component matrices (in the proper reverse order) yields:

$$\begin{pmatrix} .5 & -.866 & 0 & 0 \\ .866 & .5 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} .866 & 0 & .5 & 0 \\ 0 & 1 & 0 & 0 \\ -.5 & 0 & .866 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & .707 & -.707 & 0 \\ 0 & .707 & .707 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} .433 & -.436 & .789 & 0 \\ .75 & .66 & -.047 & 0 \\ -.5 & .612 & .612 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
Some people use a different ordering of rolls to create a complicated rotation. For instance, they might express a rotation as Ry(β1) Rz(β2) Rx(β3): first a y-roll, then a z-roll, then an x-roll. Because rotations in 3D do not commute, this requires the use of different Euler angles β1, β2, and β3 to create the same rotation as in Equation 5.31. There are 12 possible orderings of the three individual rolls, and each uses different values for β1, β2, and β3.
Rotations About an Arbitrary Axis.
When using Euler angles we perform a sequence of x-, y-, and z-rolls, that is, rotations about a coordinate axis.
But it can be much easier to work with rotations if we have a way to rotate about an axis that points in an
arbitrary direction. Visualize the earth, or a toy top, spinning about a tilted axis. In fact, Eulers theorem states
that every rotation can be represented as one of this type:
Eulers Theorem: Any rotation (or sequence of rotations) about a point is equivalent to a single rotation about
some axis through that point.9
What is the matrix for such a rotation, and can we work with it conveniently?
Figure 5.28 shows an axis represented by vector u, and an arbitrary point P that is to be rotated through angle β about u to produce point Q.
8 Leonhard Euler, 1707-1783, a Swiss mathematician of extraordinary ability who made important contributions to all branches of
mathematics.
9 This is sometimes stated: Given two rectangular coordinate systems with the same origin and arbitrary directions of axes, one can always
specify a line through the origin such that one coordinate system goes into the other by a rotation about this line. [gellert75]
1). The classic way. Decompose the rotation into a product of five elementary rotations:

Ru(β) = Rz(θ) Ry(φ) Rz(β) Ry(−φ) Rz(−θ)   (5.32)

where θ and φ are angles that align u with the z-axis (this construction is spelled out in Exercise 5.3.8), each factor being a rotation about one of the coordinate axes. This is tedious to do by hand but is straightforward to
carry out in a program. However, expanding out the product gives little insight into how the ingredients go
together.
2). The constructive way. Using some vector tools we can obtain a more revealing expression for the matrix Ru(β). This approach has become popular recently, and versions of it are described by several authors in GEMS I [glass90]. We adapt the derivation of Maillot [mail90].
Figure 5.29 shows the axis of rotation u, and we wish to express the operation of rotating point P through angle β into point Q. The method, spelled out in Case Study 5.5, effectively establishes a 2D coordinate system in the plane of rotation as shown. This defines two orthogonal vectors a and b lying in the plane, and as shown in Figure 5.29b point Q is expressed as a linear combination of them. The expression for Q involves dot products and cross products of various ingredients in the problem. But because each of the terms is linear in the coordinates of P, it can be rewritten as P times a matrix.
$$R_u(\beta) = \begin{pmatrix} c + (1-c)u_x^2 & (1-c)u_yu_x - su_z & (1-c)u_zu_x + su_y & 0 \\ (1-c)u_xu_y + su_z & c + (1-c)u_y^2 & (1-c)u_zu_y - su_x & 0 \\ (1-c)u_xu_z - su_y & (1-c)u_yu_z + su_x & c + (1-c)u_z^2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   (5.33)
where c = cos(β), s = sin(β), and (ux, uy, uz) are the components of the unit vector u. This looks more complicated than it is. In fact, as we see later, there is so much structure in the terms that, given an arbitrary rotation matrix, we can find the specific axis and angle that produces the rotation (which proves Euler's theorem).
As we see later, OpenGL provides a function to create a rotation about an arbitrary axis:
glRotated(angle, ux, uy, uz);
Example 5.3.4. Rotating about an axis. Find the matrix that produces a rotation through 45° about the axis u = (1, 1, 1)/√3 = (0.577, 0.577, 0.577). Solution: For a 45° rotation, c = s = 0.707, and filling in the terms in Equation 5.33 we obtain:

$$R_u(45°) = \begin{pmatrix} .8047 & -.31 & .5058 & 0 \\ .5058 & .8047 & -.31 & 0 \\ -.31 & .5058 & .8047 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
This has a determinant of 1, as expected. Figure 5.30 shows the basic barn, shifted away from the origin, before it is rotated (dark), after a rotation through 22.5° (medium), and after a rotation of 45° (light).
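Equation 5.33 is straightforward to evaluate in a program. The sketch below (illustrative, not OpenGL's own code) fills a row-major 4-by-4 array, assuming u is already a unit vector and the angle is in radians; glRotated(), by contrast, takes its angle in degrees:

#include <cmath>

// Build the matrix of Equation 5.33: rotation through angle
// (radians) about the unit axis u = (ux, uy, uz); m[row][col].
void rotationAboutAxis(double angle, double ux, double uy, double uz,
                       double m[4][4])
{
	double c = cos(angle), s = sin(angle), t = 1.0 - c;
	m[0][0] = c + t*ux*ux;     m[0][1] = t*uy*ux - s*uz;
	m[0][2] = t*uz*ux + s*uy;  m[0][3] = 0.0;
	m[1][0] = t*ux*uy + s*uz;  m[1][1] = c + t*uy*uy;
	m[1][2] = t*uz*uy - s*ux;  m[1][3] = 0.0;
	m[2][0] = t*ux*uz - s*uy;  m[2][1] = t*uy*uz + s*ux;
	m[2][2] = c + t*uz*uz;     m[2][3] = 0.0;
	m[3][0] = m[3][1] = m[3][2] = 0.0;  m[3][3] = 1.0;
}

Passing the axis (0.577, 0.577, 0.577) and 45° converted to radians reproduces the matrix of Example 5.3.4.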
It is also possible to go the other way: to extract the axis and angle from a given rotation matrix. Suppose the rotation is known only through its matrix

$$R(\beta) = \begin{pmatrix} m_{11} & m_{12} & m_{13} & 0 \\ m_{21} & m_{22} & m_{23} & 0 \\ m_{31} & m_{32} & m_{33} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

The trace m11 + m22 + m33 of this matrix equals 1 + 2 cos(β) (sum the diagonal terms of Equation 5.33 to check this), which reveals the angle β. Differences of the off-diagonal terms then yield the components of the unit axis vector u:

$$u_x = \frac{m_{32} - m_{23}}{2\sin(\beta)}, \quad u_y = \frac{m_{13} - m_{31}}{2\sin(\beta)}, \quad u_z = \frac{m_{21} - m_{12}}{2\sin(\beta)}$$   (5.34)
Example 5.3.5. Find the axis and angle. Pretend you don't know the underlying axis and angle for the rotation matrix in Example 5.3.4, and solve for them. The trace is 3 × 0.8047 = 2.414, so cos(β) = 0.707, β must be 45°, and sin(β) = 0.707. Now calculate each of the terms in Equation 5.34: they all yield the value 0.577, so u = (1, 1, 1)/√3, just as we expected.
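The extraction is equally easy to code. This sketch (illustrative; it assumes m holds a genuine rotation and that sin(β) is not zero) recovers β from the trace and u from Equation 5.34:

#include <cmath>

// Recover the angle (radians) and unit axis u from a 4-by-4 rotation matrix m.
void findAxisAngle(const double m[4][4], double& angle, double u[3])
{
	double trace = m[0][0] + m[1][1] + m[2][2];  // equals 1 + 2 cos(angle)
	angle = acos((trace - 1.0) / 2.0);
	double denom = 2.0 * sin(angle);             // breaks down if sin(angle) = 0
	u[0] = (m[2][1] - m[1][2]) / denom;          // Equation 5.34, with indices
	u[1] = (m[0][2] - m[2][0]) / denom;          //   shifted to start at 0
	u[2] = (m[1][0] - m[0][1]) / denom;
}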
Practice Exercises.
5.3.6. Which ones commute? Consider two affine transformations T1 and T2. Is T1T2 the same as T2T1 when:
a). They are both pure translations? b). They are both scalings? c). They are both shears?
d). One is a rotation and one a translation? e). One is a rotation and one is a scaling?
f). One is a scaling and one is a shear?
5.3.7. Special cases of rotation about a general axis u. It always helps to see that a complicated result collapses to a familiar one in special cases. Check that this happens in Equation 5.33 when u is itself
a). the x-axis, i; b). the y-axis, j; c). the z-axis, k.
5.3.8. Classic Approach to Rotation about an Axis. Here we suggest how to find the rotations that cause u to become aligned with the z-axis. (See Appendix 2 for a review of spherical coordinates.) Suppose the direction of u is given by the spherical coordinate angles θ and φ as indicated in Figure 5.28. Align u with the z-axis by a z-roll through −θ: this swings u into the xz-plane to form the new axis, u' (sketch this). Use Equation 5.29 to obtain Rz(−θ). Second, a y-roll through −φ completes the alignment process. With u aligned along the z-axis, do the desired z-roll through angle β, using Equation 5.29. Finally, the alignment rotations must be undone to restore the axis to its original direction. Use the inverse matrices to Ry(−φ) and Rz(−θ), which are Ry(φ) and Rz(θ), respectively. First undo the y-roll and then the z-roll. Finally, multiply these five elementary rotations to obtain Equation 5.33. Work out the details, and apply them to find the matrix M̃ that performs a rotation through angle 35° about the axis situated at θ = 30° and φ = 45°. Show that the final result is:

$$\tilde{M} = \begin{pmatrix} .887 & -.366 & .281 & 0 \\ .445 & .842 & -.306 & 0 \\ -.124 & .396 & .910 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
5.3.9. Orthogonal Matrices. A matrix is orthogonal if its columns are mutually orthogonal unit-length vectors.
Show that each of the three rotation matrices given in Equations 5.27-29 is orthogonal. What is the determinant
of an orthogonal matrix? An orthogonal matrix has a splendid property: Its inverse is identical to its transpose
(also see Appendix 2). Show why the orthogonality of the columns guarantees this. Find the inverse of each of
the three rotation matrices above, and show that the inverse of a rotation is simply a rotation in the opposite
direction.
5.3.10. The Matrix Is Orthogonal. Show that the complicated rotation matrix in Equation 5.33 is orthogonal.
5.3.11. Structure of a rotation matrix. Show that for a 3x3 rotation matrix M the three rows are pairwise orthogonal, and the third is the cross product of the first two.
5.3.12. What if the axis of rotation does not pass through the origin? If the axis does not pass through the
origin but instead is given by S + ut for some point S, then we must first translate to the origin through -S, apply
the appropriate rotation, and then translate back through S. Derive the overall matrix that results.
The Columns of the Matrix reveal the Transformed Coordinate Frame. If the columns of M are the
vectors m1, m2, m3, and the point m4, the transformation maps the frame (i, j, k, ) to the frame (m1, m2,
m3, m4).
Relative Ratios Are Preserved. If P is fraction f of the way from point A to point B, then T(P) is also
fraction f of the way from point T(A) to T(B).
Effect of Transformations on the Volumes of Objects. If a 3D object D has volume V, then its image T(D) has volume |det M| V, where |det M| is the absolute value of the determinant of M.
Every Affine Transformation is Composed of Elementary Operations. A 3D affine transformation may
be decomposed into a composition of elementary transformations. This can be done in several ways.
To save space we often write such column vectors in transposed row form, using (Px, Py, 1)T in place of the column vector P̃. (Also see Appendix 2.) The superscript T denotes the transpose, so we are simply writing the column on its side.
Suppose we have a 2D coordinate frame #1 as shown in Figure 5.31, with origin 𝒪 and axes i and j. Further suppose we have an affine transformation T(.) represented by matrix M. So T(.) transforms coordinate frame #1 into coordinate frame #2, with new origin 𝒪' = T(𝒪), and new axes i' = T(i) and j' = T(j).
Now suppose that a point P has the representation (c, d, 1)T in system #2. What is its representation (a, b, 1)T in system #1? It is simply

$$\begin{pmatrix} a \\ b \\ 1 \end{pmatrix} = M \begin{pmatrix} c \\ d \\ 1 \end{pmatrix}$$   (5.35)
Summarizing: Suppose coordinate system #2 is formed from coordinate system #1 by the affine transformation M. Further suppose that point P has coordinates (Px, Py, 1)T when expressed in system #2. Then the coordinates of P expressed in system #1 are MP̃.
This may seem obvious to some readers, but in case it doesn't, a derivation is developed in the exercises. This
result also holds for 3D systems, of course, and we use it extensively when calculating how 3D points are
transformed as they are passed down the graphics pipeline.
Example 5.4.1. Rotating a coordinate system. Consider again the transformation of Example 5.2.5 that rotates points through 30° about the point (−2, 3). (See Figure 5.22.) This transformation maps the origin 𝒪 and axes i and j into the origin and axes of system #2 as shown in that figure. Now consider the point P with coordinates (Px, Py, 1)T in the new coordinate system. What are the coordinates of this point expressed in the original system #1? The answer is simply MP̃. For instance, (1, 2, 1)T in the new system lies at M(1, 2, 1)T = (1.098, 3.634, 1)T in the original system. (Sketch this in the figure.) Notice that the point (−2, 3, 1)T, the center of rotation of the transformation, is a fixed point of the transformation: M(−2, 3, 1)T = (−2, 3, 1)T. Thus if we take P = (−2, 3, 1)T in the new system, it maps to (−2, 3, 1)T in the original system (check this visually).
Successive Changes in a Coordinate frame.
Now consider forming a transformation by making two successive changes of the coordinate system. What is the
overall effect? As suggested in Figure 5.32, system #1 is converted to system #2 by transformation T1(.), and
system #2 is then transformed to system #3 by transformation T2(.). Note that system #3 is transformed relative
to #2.
Figure 5.32. Transforming a coordinate system twice: T1 carries system #1 into system #2, and T2 then carries system #2 into system #3. Point P has coordinates (a, b), (c, d), and (e, f) in the three systems, respectively.
Suppose point P has coordinates (e, f, 1)T in system #3. Then its coordinates in system #2 are M2(e, f, 1)T = (c, d, 1)T, and its coordinates in system #1 are

$$\begin{pmatrix} a \\ b \\ 1 \end{pmatrix} = M_1 \begin{pmatrix} c \\ d \\ 1 \end{pmatrix} = M_1 M_2 \begin{pmatrix} e \\ f \\ 1 \end{pmatrix}$$   (5.36)
The essential point is that when determining the desired coordinates (a, b,1)T from (e, f,1)T we first apply M2 and
then M1, just the opposite order as when applying transformations to points.
We summarize this fact for the case of three successive transformations. The result generalizes immediately to
any number of transformations.
Transforming points. To apply a sequence of transformations T1(), T2(), T3() (in that order) to a point P, form the
matrix:
M = M3 M2 M1
Then P is transformed to MP. To compose each successive transformation Mi you must premultiply by Mi.
Transforming the coordinate system. To apply a sequence of transformations T1(), T2(), T3() (in that order) to
the coordinate system, form the matrix:
M = M1 M2 M3
Then a point P expressed in the transformed system has coordinates MP in the original system. To compose
each additional transformation Mi you must postmultiply by Mi.
How OpenGL operates.
We shall see in the next section that OpenGL provides tools for successively applying transformations in order to
build up an overall current transformation. In fact OpenGL is organized to postmultiply each new transformation matrix onto the current transformation. Thus it will often seem more natural to the modeler to think in terms of successively transforming the coordinate system involved, as the order in which these transformations are specified is the same as the order in which OpenGL applies them.
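For instance, the following fragment (a sketch in the style of this book's examples) can be read both ways. As transforming points, the rotation is applied to vertices first and the translation second; as transforming the coordinate system, the frame is translated first and then rotated. Both readings describe the single matrix T * R that OpenGL builds:

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();                  // CT = I
glTranslated(32.0, 25.0, 0.0);     // CT = T
glRotated(-30.0, 0.0, 0.0, 1.0);   // CT = T * R
// ... draw the object; each vertex V is sent through T * R ...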
Practice Exercises.
5.4.1. How transforming a coordinate system relates to transforming a point.
We wish to show the result in Equation 5.35. To do this, show each of the following steps.
a). Show why the point P with representation (c, d, 1)T used in system #2 lies at ci + dj + 𝒪.
b). We want to find where this point lies in system #1. Show that the representation (in system #1) of i is M(1, 0, 0)T, that of j is M(0, 1, 0)T, and that of 𝒪 is M(0, 0, 1)T.
c). Show that therefore the representation of the point ci + dj + 𝒪 is cM(1, 0, 0)T + dM(0, 1, 0)T + M(0, 0, 1)T.
d). Show that this is the same as M(c, 0, 0)T + M(0, d, 0)T + M(0, 0, 1)T, and that this is M(c, d, 1)T, as claimed.
5.4.2. Using elementary examples. Figure 5.33 shows the effect of four elementary transformations of a coordinate system. In each case the original system with axes x and y is transformed into the new system with axes x' and y'.
Figure 5.33. Elementary transformations of a coordinate system: a). translate(m, n); b). rotate(a); c). scale(3, 2).
We want to see how to apply the theory of affine transformations in a program to carry out scaling, rotating, and
translating of graphical objects. We also investigate how it is done when OpenGL is used. We look at 2D
examples first as they are easier to visualize, then move on to 3D examples.
To set the stage, suppose you have a routine house() that draws house #1 in Figure 5.34. But you wish to draw the version #2 shown, which has been rotated through −30° and then translated through (32, 25). This is a frequently encountered situation: an object is defined at a convenient size and position, but we want to draw it (perhaps many times) at different sizes, orientations, and locations.
Figure 5.34. A house (#1) drawn in its original position, and (#2) after being rotated and translated.
The hard way would be to transform each vertex of the house ourselves, using a routine that applies a given matrix to a given point:

transform2D(M, P);
The routine produces Q̃ = M̃P̃. To apply the transformation to each point V[i] in house() we must adjust the
source code above as in
cvs.moveTo(transform2D(M, V[0])); // move to the transformed point
cvs.lineTo(transform2D(M, V[1]));
cvs.lineTo(transform2D(M, V[2]));
...
so that the transformed points are sent to moveTo() and lineTo(). This is workable if the source code for
house() is at hand. But it is cumbersome at best, and not possible at all if the source code for house() is not
available. It also requires tools to create the matrix M in the first place.
The easy way.
We cause the desired transformation to be applied automatically to each vertex. Just as we know the window to
viewport mapping is quietly applied to each vertex as part of moveTo() and lineTo(), we can have an
additional transformation be quietly applied as well. It is often called the current transformation, CT. We
enhance moveTo() and lineTo() in the Canvas class so that they first quietly apply this transformation to the
argument vertex, and then apply the window to viewport mapping. (Clipping is performed at the world window
boundary as well.)
Figure 5.35 provides a slight elaboration of the graphics pipeline we introduced in Figure 5.7. When
glVertex2d()is called with argument V, the vertex V is first transformed by the CT to form point Q. Q is then
passed through the window to viewport mapping to form point S in the screen window. (As we see later, clipping
is also performed, inside this last mapping process.)
Each new transformation with matrix M is combined with the current transformation by postmultiplying it onto the CT:

CT = CT * M   (5.37)
The order is important. As we saw earlier, applying CT * M to a point is equivalent to first performing the
transformation embodied in M, followed by performing the transformation dictated by the previous value of CT.
Or if we are thinking in terms of transforming the coordinate system, it is equivalent to performing one
additional transformation to the existing current coordinate system.
OpenGL routines for applying transformations in the 2D case are glScaled(), glTranslated(), and glRotated()10, used as shown in Figure 5.37.
Since these routines only compose a transformation with the CT, we need some way to get started: to initialize
the CT to the identity transformation. OpenGL provides glLoadIdentity(). And because these functions can
be set to work on any of the matrices that OpenGL supports, we must inform OpenGL which matrix we are
altering. This is accomplished using glMatrixMode(GL_MODELVIEW).
Figure 5.37 shows suitable definitions of four new methods of the Canvas class that manage the CT and allow us
to build up arbitrarily complex 2D transformations. Their pleasing simplicity is possible because OpenGL is
doing the hard work.
//<<<<<<<<<<<<<<< initCT >>>>>>>>>>>>>>>>>
void Canvas:: initCT(void)
{
glMatrixMode(GL_MODELVIEW);
	glLoadIdentity();		// set CT to the identity matrix
}
//<<<<<<<<<<<<<< scale2D >>>>>>>>>>>>>>>>>>>>
void Canvas:: scale2D(double sx, double sy)
{
glMatrixMode(GL_MODELVIEW);
glScaled(sx, sy, 1.0); // set CT to CT * (2D scaling)
}
//<<<<<<<<<<<<<<< translate2D >>>>>>>>>>>>>>>>>
void Canvas:: translate2D(double dx, double dy)
{
	glMatrixMode(GL_MODELVIEW);
	glTranslated(dx, dy, 0.0);		// set CT to CT * (2D translation); note dz = 0
}
//<<<<<<<<<<<<<<<< rotate2D >>>>>>>>>>>>>>>>>>>>
void Canvas:: rotate2D(double angle)
{
	glMatrixMode(GL_MODELVIEW);
	glRotated(angle, 0.0, 0.0, 1.0);	// set CT to CT * (2D rotation)11
}
Figure 5.37. Routines to manage the CT for 2D transformations.
10 The suffix d indicates that its arguments are doubles. There is also the version glRotatef() that takes float arguments.
11 Here, as always, positive angles produce CCW rotations.
We are now in a position to use 2D transformations. Returning to drawing version #2 of the house in Figure 5.34, we next show the code that first rotates the house through −30° and then translates it through (32, 25). Notice that, to get the ordering straight, it calls the operations in the opposite order to the way they are applied: first the translation operation, and then the rotation operation.
cvs.setWindow(...);			// set the window
cvs.setViewport(..);		// set the viewport
cvs.initCT();				// initialize the CT to the identity
house();					// draw the original house (#1)
cvs.translate2D(32, 25);	// CT now includes the translation
cvs.rotate2D(-30.0);		// CT now includes the translation and rotation
house();					// draw the transformed house (#2)
Notice that we can scale, rotate, and position the house in any manner we choose, and never need to go inside
the routine house() or alter it. (In particular, the source code for house() need not be available.)
Some people find it more natural to think in terms of transforming the coordinate system. As shown in Figure
5.38 they would think of first translating the coordinate system through (32, 25) to form system #2, and then
rotating that system through -30o to obtain coordinate system #3. Because OpenGL applies transformations in the
order that coordinate systems are altered, the code for doing it this way first calls cvs.translate2D(32,
25) and then calls cvs.rotate2D(-30.0). This is, of course, identical to the code obtained doing it the
other way, but it has been arrived at through a different thinking process.
Figure 5.38. The same transformation viewed as a sequence of coordinate system changes.
We give some further examples to show how easily the CT is manipulated to produce various effects.
Example 5.5.1. Capitalizing on rotational symmetry.
Figure 5.39a shows a star made of stripes that seem to interlock with one another. This is easy to draw using
rotate2D(). Suppose that routine starMotif() draws a part of the star, the polygon shown in Figure 5.39b.
(Determining the positions of this polygons vertices is challenging, and is addressed in Case Study 5.2.) To
draw the whole star we just draw the motif five times, each time rotating the motif through an additional 72°:

for(int count = 0; count < 5; count++)	// draw the star
{
	starMotif();
	cvs.rotate2D(72.0);		// concatenate another 72° rotation
}

Example 5.5.2. Drawing a snowflake. Suppose the routine flakeMotif() draws one half of a snowflake's arm: the part lying between the x-axis and the 30° line shown in Figure 5.40.
To draw the entire snowflake just do this six times, with an intervening rotation of 60°:

void drawFlake()
{
	for(int count = 0; count < 6; count++) // draw a snowflake
	{
		flakeMotif();			// draw the upper half of an arm
		cvs.scale2D(1.0, -1.0);	// flip the y-axis
		flakeMotif();			// draw the mirrored lower half
		cvs.scale2D(1.0, -1.0);	// flip back
		cvs.rotate2D(60.0);		// rotate to the next arm
	}
}
Example 5.5.3. A Flurry of Snowflakes.
A flurry of snowflakes like that shown in Figure 5.41 can be drawn by drawing the flake repeatedly at random positions.
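A minimal sketch of such a loop follows (illustrative; the helper randomFloat(lo, hi), returning a random value in the given range, is assumed and is not from the text):

void drawFlurry(int numFlakes)		// draw many flakes at random spots
{
	for(int i = 0; i < numFlakes; i++)
	{
		cvs.initCT();								// start afresh for each flake
		cvs.translate2D(randomFloat(0.0, 100.0),	// move to a random position
		                randomFloat(0.0, 100.0));	//   in the world window
		drawFlake();
	}
}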
Figure 5.42. Two patterns based on a motif. a). each motif is rotated separately. b). all motifs are upright.
Suppose that drawDino() draws an upright dinosaur centered at the origin. In part a) the coordinate system for each motif is first rotated about the origin through a suitable angle, and then this coordinate system is translated along its y-axis by H units, as shown in the following code. Note that the CT is reinitialized each time through the loop so that the transformations don't accumulate. (Think through the transformations you would use if instead you took the point of view of transforming the points of the motif.)
const int numMotifs = 12;
for(int i = 0; i < numMotifs; i++)
{
cvs.initCT(); // init CT at each iteration
cvs.rotate2D(i * 360 / numMotifs); // rotate
cvs.translate2D(0.0, H); // shift along y-axis
drawDino();
}
An easy way to keep the motifs upright, as in part b), is to pre-rotate each motif before translating it. If a particular motif is to appear finally at 120°, it is first rotated (while still at the origin) through −120°, then translated up by H units, and then rotated through 120°. What adjustments to the preceding code will achieve this? One possibility is sketched below.
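One possible adjustment (a sketch, one solution among several, not the book's own listing) inserts a pre-rotation through the negative angle, so the net rotation applied to the motif itself is zero:

const int numMotifs = 12;
for(int i = 0; i < numMotifs; i++)
{
	double ang = i * 360.0 / numMotifs;
	cvs.initCT();				// init CT at each iteration
	cvs.rotate2D(ang);			// rotate to the motif's final position
	cvs.translate2D(0.0, H);	// shift along the y-axis
	cvs.rotate2D(-ang);			// pre-rotation: the dinosaur stays upright
	drawDino();
}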
Figure 5.46. Drawing a hexagonal tiling.
Example 5.5.6. Using modeling transformations in a CAD program. Some programs must draw many
instances of a small collection of shapes. Figure 5.47 shows the example of a CAD program that analyzes the
behavior of an interconnection of digital logic gates. The user can construct a circuit by picking and placing
different gates at different places in the work area, possibly with different sizes and orientations. Each picture of
the object in the scene is called an instance of the object. A single definition of the object is given in a
coordinate system that is convenient for that object shape, called its master coordinate system. The
transformation that carries the object from its own master coordinate system to world coordinates to produce an
instance is often called a modeling transformation.
Figure 5.47. Creating instances in a pick-and-place application.
Figure 5.48 shows two logic gates, each defined once in its own master coordinate system. As the user creates
each instance of one of these gates, the appropriate modeling transformation is generated that orients and
positions the instance. The transformation might be stored simply as a set of parameters, say, S, A, dx, dy, with
the understanding that the modeling transformation would always consist of:
1. a scaling by factor S;
2. a rotation through angle A;
3. a translation through (dx, dy);
performed in that order. A list is kept of the gates in the circuit, along with the transformation parameters of each
gate.
Figure 5.48. Each gate type (a NAND gate and a NOT gate) is defined in its own master coordinate system.
Whenever the drawing must be refreshed, each instance is drawn in turn, with the proper modeling
transformation applied. Code to do this might look like:
glClear(GL_COLOR_BUFFER_BIT);		// clear the screen
for(i = 0; i < numberOfGates; i++)	// for each gate
{
	pushCT();						// remember the CT
	translate2D(dx[i], dy[i]);		// apply the modeling transformation:
	rotate2D(A[i]);					//   translation, rotation, and scaling
	scale2D(S[i], S[i]);
	drawGate(type[i]);				// draw one of the two types
	popCT();						// restore the CT
}
The CT is pushed before drawing each instance, so that it can be restored after the instance has been drawn. The
modeling transformation for the instance is determined by its parameters, and then one of the two gate shapes is
drawn. The necessary code has a simple organization, because the burden of sizing, orienting, and positioning
each instance has been passed to the underlying tools that maintain the CT and its stack.
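In the style of Figure 5.37, pushCT() and popCT() can simply wrap the matrix stack that OpenGL already maintains. The following is one plausible sketch, not the book's own listing:

//<<<<<<<<<<<<<<< pushCT, popCT >>>>>>>>>>>>>>>>>
void Canvas:: pushCT(void)
{
	glMatrixMode(GL_MODELVIEW);
	glPushMatrix();		// push a copy of the CT onto the stack
}
void Canvas:: popCT(void)
{
	glMatrixMode(GL_MODELVIEW);
	glPopMatrix();		// restore the most recently pushed CT
}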
Practice Exercises.
5.5.1. Developing the Transformations. Supposing OpenGL were not available, detail how you would write
the routines that perform elementary coordinate system changes:
void scale2D(double sx, double sy);
void translate2D(double dx, double dy);
5.5.2. Implementing the transformation stack. Define, in the absence of OpenGL, appropriate data types for a
stack of transformations, and write the routines pushCT() and popCT().
5.5.3. A hexagonal tiling. A hexagonal pattern provides a rich setting for tilings, since regular hexagons fit
together neatly as in a beehive. Figure 5.49 shows 9 columns of stacked 6-gons. Here the hexagons are shown
empty, but we could draw interesting figures inside them.
a). Show that the side length of a hexagon with radius R is also R.
b). Show that the centers of adjacent hexagons in a column are separated vertically by √3 R, and adjacent columns are separated horizontally by 3R/2.
c). Develop code that draws this hexagonal tiling, using pushCT() and popCT() and suitable transformations to
keep track of where each row and column start.
Figure 5.49. A simple hexagonal tiling.
In this section we examine how 3D transformations are used in an OpenGL-based program. The main emphasis
is on transforming objects in order to orient and position them as desired in a 3D scene. Not surprisingly it is all
done with matrices, and OpenGL provides the necessary functions to build the required matrices. Further, the
matrix stack maintained by OpenGL makes it easy to set up a transformation for one object, and then back up
to a previous transformation, in preparation for transforming another object.
It is very satisfying to build a program that draws different scenes using a collection of 3D transformations.
Experimenting with such a program also improves your ability to visualize what the various 3D transformations
do. OpenGL makes it easy to set up a camera that takes a snapshot of the scene from a particular point of
view. The camera is created with a matrix as well, and we study in detail in Chapter 7 how this is done. Here we
just use an OpenGL tool to set up a reasonable camera, so that attention can be focussed on transforming objects.
Granted we are using a tool before seeing exactly how it operates, but the payoff is high: you can make
impressive pictures of 3D scenes with a few simple calls to OpenGL functions.
The modelview matrix basically provides what we have been calling the CT. It combines two effects: the
sequence of modeling transformations applied to objects, and the transformation that orients and positions the
camera in space (hence its peculiar name modelview). Although it is a single matrix in the actual pipeline, it is
easier to think of it as the product of two matrices, a modeling matrix M, and a viewing matrix V. The modeling
matrix is applied first, and then the viewing matrix, so the modelview matrix is in fact the product VM (why?).
Figure 5.53 suggests what the M and V matrices do, for the situation introduced in Figure 5.51, where a camera looks down on a scene consisting of a block. Part a shows a unit cube centered at the origin. A modeling transformation based on M scales, rotates, and translates the cube into the block shown in part b. Part b also shows the relative position of the camera's view volume.
Figure 5.53. Effect of the modelview matrix in the graphics pipeline. a). Before the transformations. b). After the
modeling transformation. c). After the modelview transformation.
The V matrix is now used to rotate and translate the block into a new position. The specific transformation used
is the one that would carry the camera from its position in the scene to its generic position, with the eye at the
origin and the view volume aligned with the z-axis, as shown in part c. The vertices of the block are now
positioned (that is, their coordinates have the proper values) so that projecting them onto a plane such as the near
plane yields the proper values for displaying the projected image. So the matrix V in fact effects a change of
coordinates of the scene's vertices into the camera's coordinate system. (Camera coordinates are sometimes also
called eye coordinates.)
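In an OpenGL program this order shows up naturally: the camera (V) part of the modelview matrix is set first, and the modeling (M) part is postmultiplied onto it afterward. A minimal sketch (the particular numbers are arbitrary):

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(4.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0); // V: position and aim the camera
glTranslated(0.5, 0.0, 0.0); // M: modeling transformations,
glScaled(1.0, 2.0, 1.0);     //    postmultiplied so the product is VM
glutWireCube(1.0);           // each vertex is transformed by VM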
In the camera coordinate system the edges of the view volume are parallel to the x-, y-, and z-axes. The view volume
extends from left to right in x, from bottom to top in y, and from -near to -far in z, as shown. When the vertices of
the original cube have passed through the entire modelview matrix, they are located as shown in part c.
The projection matrix scales and shifts each vertex in a particular way, so that all those that lie inside the view
volume will lie inside a standard cube that extends from -1 to 1 in each dimension. (When perspective
projections are being used this matrix does quite a bit more, as we see in Chapter 7.) This matrix effectively
squashes the view volume into the cube centered at the origin, which is a particularly efficient boundary
against which to clip objects, as we see in Chapter 7. Scaling the block in this fashion might badly distort it, of
course, but this distortion will be compensated for in the viewport transformation. The projection matrix also
reverses the sense of the z-axis, so that increasing values of z now represent increasing depth of a point
from the eye. Figure 5.54 shows how the block is transformed into a different block by this transformation.
glScaled(sx, sy, sz): postmultiply the current matrix by a matrix that performs a scaling by sx in
x, by sy in y, and by sz in z; put the result back in the current matrix.
glTranslated(dx, dy, dz): postmultiply the current matrix by a matrix that performs a
translation by dx in x, by dy in y, and by dz in z; put the result back in the current matrix.
glRotated(angle, ux, uy, uz): postmultiply the current matrix by a matrix that performs a
rotation through angle degrees about the axis that passes through the origin and the point (ux, uy, uz); put
the result back in the current matrix. Equation 5.33 shows the matrix used to perform the rotation.
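Because each of these calls postmultiplies the current matrix, the call issued last is applied to the object's vertices first. A small sketch (values arbitrary):

glPushMatrix();                 // save the current transformation
glTranslated(1.0, 0.0, 0.0);    // applied to the cube second
glRotated(45.0, 0.0, 0.0, 1.0); // applied to the cube first
glutWireCube(1.0);              // draw a rotated, then translated, cube
glPopMatrix();                  // restore, ready for the next object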
The camera is set with the GLU function gluLookAt(eyex, eyey, eyez, lookx, looky, lookz, upx, upy, upz), which
places the eye at the point eye, aims it at the point look, and sets the upward orientation with the vector up.
We discuss how this function operates in Chapter 7, and also develop more flexible tools for establishing the
camera. For those curious about what values gluLookAt() actually places in the modelview matrix, see the
exercises.
Example 5.6.1: Set up a typical camera. Cameras are often set to look down on the scene from some nearby
position. Figure 5.56 shows the camera with its eye situated at eye = (4, 4, 4), looking at the point lookAt
= (0, 1, 0). The up direction is set to up = (0, 1, 0). Suppose we also want the view volume to have a width of
6.4 and a height of 4.8 (so its aspect ratio is 640/480), and to set near to 1 and far to 50. This camera would be
established using:
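The calls themselves fell on a page break; reconstructed from the stated values they would presumably read:

glMatrixMode(GL_PROJECTION); // set a view volume 6.4 wide and 4.8 high
glLoadIdentity();
glOrtho(-3.2, 3.2, -2.4, 2.4, 1.0, 50.0);
glMatrixMode(GL_MODELVIEW); // place and aim the camera
glLoadIdentity();
gluLookAt(4.0, 4.0, 4.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0);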
n = eye − look,    u = up × n,    v = n × u

and then normalizes all three of these to unit length. It builds the matrix

        | ux  uy  uz  dx |
    V = | vx  vy  vz  dy |        (5.38)
        | nx  ny  nz  dz |
        | 0   0   0   1  |

where (dx, dy, dz) = (−eye·u, −eye·v, −eye·n).
There are also glutSolidCube(), glutSolidSphere(), etc., that we use later. The shape of the torus is determined by its
inner radius inRad and outer radius outRad. The sphere and torus are approximated by polygonal faces, and you can adjust the
parameters nSlices and nStacks to specify how many faces to use in the approximation. nSlices is the number of
subdivisions around the z-axis, and nStacks is the number of bands along the z-axis, as if the shape were a stack of nStacks
disks.
Four of the Platonic solids (the fifth is the cube, already presented):
tetrahedron: glutWireTetrahedron()
octahedron: glutWireOctahedron()
dodecahedron: glutWireDodecahedron()
icosahedron: glutWireIcosahedron()
All of the shapes above are centered at the origin.
cone: glutWireCone(baseRad, height, nSlices, nStacks)
The axes of the cone and tapered cylinder coincide with the z-axis. Their bases rest on the z = 0 plane, and they extend to z =
height along the z-axis. The radius of the cone and tapered cylinder at z = 0 is given by baseRad. The radius of the tapered
cylinder at z = height is topRad.
The tapered cylinder is actually a family of shapes, distinguished by the value of topRad. When topRad is 1 there is no taper;
this is the classic cylinder. When topRad is 0 the tapered cylinder is identical to the cone.
Note that drawing the tapered cylinder in OpenGL requires some extra work, because it is a special case of a quadric surface,
as we shall see in Chapter 6. To draw it you must 1) define a new quadric object, 2) set the drawing style (GLU_LINE for a
wireframe, GLU_FILL for a solid rendering), and 3) draw the object:
GLUquadricObj * qobj = gluNewQuadric(); // make a quadric object
gluQuadricDrawStyle(qobj, GLU_LINE); // set style to wireframe
gluCylinder(qobj, baseRad, topRad, height, nSlices, nStacks); // draw the cylinder
Figure 5.58. Shapes available in the GLUT.
We next employ some of these shapes in two substantial examples that focus on using affine transformations to
model and view a 3D scene. The complete program to draw each scene is given. A great deal of insight can be
gained by entering these programs, producing the figures, and then seeing the effect of varying the
parameters.
Example 5.6.2: A scene composed of wireframe objects.
Figure 5.59 shows a scene with several objects disposed at the corners of a unit cube. The cube has one corner at
the origin. Seven objects appear at various corners of the cube, all drawn as wireframes.
The camera is given a view volume that extends from -2 to 2 in y, with an aspect ratio of aspect = 640/480. Its
near plane is at N = 0.1, and its far plane is at F = 100. This is accomplished using:
glOrtho(-2.0 * aspect, 2.0 * aspect, -2.0, 2.0, 0.1, 100);
The camera is positioned with eye = (2, 2, 2), lookAt = (0, 0, 0), and up = (0, 1, 0) (parallel to the y-axis), using:
gluLookAt(2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0);
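The opening of the display function also fell on a page break. A plausible reconstruction, assembled from the glOrtho and gluLookAt calls quoted above (the include line and header comment are assumptions):

#include <GL/glut.h>
//<<<<<<<<<<<<<<<<<<<<< displayWire >>>>>>>>>>>>>>>>>>>>>>>>>
void displayWire(void)
{
glMatrixMode(GL_PROJECTION); // set the view volume
glLoadIdentity();
double aspect = 640.0 / 480.0;
glOrtho(-2.0 * aspect, 2.0 * aspect, -2.0, 2.0, 0.1, 100);
glMatrixMode(GL_MODELVIEW); // place and aim the camera
glLoadIdentity();
gluLookAt(2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0);
glClear(GL_COLOR_BUFFER_BIT); // clear the screen
glColor3d(0, 0, 0); // draw in black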
glPushMatrix();
glTranslated(1.0, 1.0, 0); // sphere at (1,1,0)
glutWireSphere(0.25, 10, 8); // (size and tessellation assumed)
glPopMatrix();
glPushMatrix();
glTranslated(1.0, 0, 1.0); // cone at (1,0,1)
glutWireCone(0.2, 0.5, 10, 8);
glPopMatrix();
glPushMatrix();
glTranslated(1,1,1);
glutWireTeapot(0.2); // teapot at (1,1,1)
glPopMatrix();
glPushMatrix();
glTranslated(0, 1.0 ,0); // torus at (0,1,0)
glRotated(90.0, 1,0,0);
glutWireTorus(0.1, 0.3, 10,10);
glPopMatrix();
glPushMatrix();
glTranslated(1.0, 0 ,0); // dodecahedron at (1,0,0)
glScaled(0.15, 0.15, 0.15);
glutWireDodecahedron();
glPopMatrix();
glPushMatrix();
glTranslated(0, 1.0 ,1.0); // small cube at (0,1,1)
glutWireCube(0.25);
glPopMatrix();
glPushMatrix();
glTranslated(0, 0 ,1.0); // cylinder at (0,0,1)
GLUquadricObj * qobj;
qobj = gluNewQuadric();
gluQuadricDrawStyle(qobj,GLU_LINE);
gluCylinder(qobj, 0.2, 0.2, 0.4, 8,8);
glPopMatrix();
glFlush();
}
//<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
void main(int argc, char **argv)
{
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB );
glutInitWindowSize(640,480);
glutInitWindowPosition(100, 100);
glutCreateWindow("Transformation testbed - wireframes");
glutDisplayFunc(displayWire);
glClearColor(1.0f, 1.0f, 1.0f,0.0f); // background is white
glViewport(0, 0, 640, 480);
glutMainLoop();
}
Figure 5.60. Complete program to draw Figure 5.59 using OpenGL.
Notice that the sides of the large cube that are parallel in 3D are also displayed as parallel. This is a result of using a parallel
projection. The cube looks slightly unnatural because we are used to seeing the world with a perspective projection. As we see
in Chapter 7, if a perspective projection were used instead, these parallel edges would not be drawn parallel.
Example 5.6.3. A 3D scene rendered with shading.
We develop a somewhat more complex scene to illustrate further the use of modeling transformations. We also show how easy
OpenGL makes it to draw much more realistic renderings of solid objects by incorporating shading, along with proper hidden
surface removal.
Two views of a scene are shown in Figure 5.61. Both views use a camera set by gluLookAt(2.3, 1.3, 2,
0, 0.25, 0, 0.0,1.0,0.0). Part a uses a large view volume that encompasses the whole scene; part b
uses a small view volume that encompasses only a small portion of the scene, thereby providing a close-up view.
The scene contains three objects resting on a table in the corner of a room. Each of the three walls is made by flattening a
cube into a thin sheet, and moving it into position. (Again, they look somewhat unnatural due to the use of a parallel
projection.) The jack is composed of three stretched spheres oriented at right angles plus six small spheres at their ends.
Figure 5.61. A simple 3D scene - a). Using a large view volume. b). Using a small view volume.
The table consists of a table top and four legs. Each of the table's five pieces is a cube that has been scaled to the desired size
and shape. The layout for the table is shown in Figure 5.62. It is based on four parameters that characterize the size of its parts:
topWidth, topThick, legLen, and legThick. A routine tableLeg() draws each leg, and is called four times within
the routine table() to draw the legs in the four different locations. The different parameters used produce different modeling
transformations within tableLeg(). As always, a glPushMatrix(), glPopMatrix() pair surrounds the modeling
functions to isolate their effect.
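The listing for table() and tableLeg() fell on pages lost here; the following sketch follows the description above (the leg-placement arithmetic and exact dimensions are assumptions):

void tableLeg(double thick, double len)
{
	glPushMatrix();
	glTranslated(0, len / 2, 0); // lift the leg so its base rests on y = 0
	glScaled(thick, len, thick); // stretch a unit cube into a long, thin leg
	glutSolidCube(1.0);
	glPopMatrix();
}
void table(double topWid, double topThick, double legThick, double legLen)
{ // draw the table: a top and four legs
	glPushMatrix(); // draw the top
	glTranslated(0, legLen, 0);
	glScaled(topWid, topThick, topWid);
	glutSolidCube(1.0);
	glPopMatrix();
	double dist = 0.95 * topWid / 2.0 - legThick / 2.0; // assumed inset of the legs
	glPushMatrix();
	glTranslated(dist, 0, dist); // place a leg at each of four positions
	tableLeg(legThick, legLen);
	glTranslated(0, 0, -2 * dist);
	tableLeg(legThick, legLen);
	glTranslated(-2 * dist, 0, 2 * dist);
	tableLeg(legThick, legLen);
	glTranslated(0, 0, -2 * dist);
	tableLeg(legThick, legLen);
	glPopMatrix();
}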
//<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
void main(int argc, char **argv)
{
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB| GLUT_DEPTH);
glutInitWindowSize(640,480);
glutInitWindowPosition(100, 100);
glutCreateWindow("shaded example - 3D scene");
glutDisplayFunc(displaySolid);
glEnable(GL_LIGHTING); // enable lighting calculations
glEnable(GL_LIGHT0); // enable light source 0
glShadeModel(GL_SMOOTH);
glEnable(GL_DEPTH_TEST); // for hidden surface removal
glEnable(GL_NORMALIZE); // normalize vectors for proper shading
glClearColor(0.1f,0.1f,0.1f,0.0f); // background is dark gray
glViewport(0, 0, 640, 480);
glutMainLoop();
}
Figure 5.63. Complete program to draw the shaded scene.
Practice Exercises.
5.6.3. Inquiring about the values in a matrix in OpenGL. Test some of the assertions above that put specific
values into the modelview matrix. You can see what is stored in the modelview matrix in OpenGL by defining an
array GLfloat mat[16] and using glGetFloatv(GL_MODELVIEW_MATRIX, mat), which copies the 16
values of the modelview matrix into mat[]. M[i][j] is copied into the element mat[4*j + i], for i, j = 0,
1, ..., 3.
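For instance, a fragment along these lines (assuming <cstdio> is included and a GL context is active) prints the current modelview matrix:

GLfloat mat[16];
glGetFloatv(GL_MODELVIEW_MATRIX, mat); // copy the 16 modelview values
for (int i = 0; i < 4; i++) // print row i
	printf("%9.4f %9.4f %9.4f %9.4f\n",
		mat[i], mat[i + 4], mat[i + 8], mat[i + 12]); // mat[4j+i] holds M[i][j]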
and the read() method of the class is called to read in a scene file:
scn.read("simple.dat");
Figure 5.64 shows the data structure for the scn object, after the following simple SDL file has been read.
{
tellMaterialsGL();//pass material data to OpenGL
glPushMatrix();
glMultMatrixf(transf.m); // load this object's matrix
glutSolidCone(1.0,1.0, 10,12); // draw a cone
glPopMatrix();
}
Figure 5.65. The drawOpenGL() methods for two shapes.
Figure 5.66 shows the program that reads an SDL file and draws the scene. It is very short (but of course the
code for the classes Scene, Shape, etc. must be included as well). It reads the particular SDL file
myScene1.dat, which recreates the same scene as in Figure 5.63. Note that by simply changing the SDL file
that is read, this program can draw any scene described in SDL, without any changes in its code.
#include "basicStuff.h"
#include "Scene.h"
//############################ GLOBALS ########################
Scene scn; // construct the scene object
//<<<<<<<<<<<<<<<<<<<<<<< displaySDL >>>>>>>>>>>>>>>>>>>>>>>>>>
void displaySDL(void)
{
glMatrixMode(GL_PROJECTION); //set the camera
glLoadIdentity();
double winHt = 1.0; // half-height of the window
glOrtho(-winHt*64/48.0, winHt*64/48.0, -winHt, winHt, 0.1, 100.0);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(2.3, 1.3, 2, 0, 0.25, 0, 0.0,1.0,0.0);
glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT); // clear screen
scn.drawSceneOpenGL();
} // end of display
//<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
void main(int argc, char **argv)
{
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_RGB |GLUT_DEPTH);
glutInitWindowSize(640, 480);
glutInitWindowPosition(100, 100);
glutCreateWindow("read and draw an SDL scene");
glutDisplayFunc(displaySDL);
glShadeModel(GL_SMOOTH);
glEnable(GL_DEPTH_TEST);
glEnable(GL_NORMALIZE);
glViewport(0, 0, 640, 480);
scn.read("myScene1.dat"); //read the SDL file and build the objects
glEnable(GL_LIGHTING);
scn.makeLightsOpenGL(); // scan the light list and make OpenGL lights
glutMainLoop();
}
Figure 5.66. Drawing a scene read in from an SDL file.
The SDL file that describes the scene of Figure 5.63 is shown in Figure 5.67. It defines the jack shape of nine
spheres by first defining a jackPart and then using it three times, as explained in Appendix 4. Similarly a leg
of the table is first defined as a unit, and then used four times.
! myScene1.dat
light 20 60 30 .7 .7 .7 !put a light at (20,60,30),color:(.7, .7, .7)
ambient .7 .7 .7 ! set material properties for all of the objects
diffuse .6 .6 .6
specular 1 1 1
exponent 50
def jackPart{ push scale .2 .2 1 sphere pop
manipulate a camera, as well as to size and position different objects in a scene. Two of the matrices used by
OpenGL (the modelview and viewport transformations) define affine transformations, whereas the projection
matrix normally defines a perspective transformation, to be examined thoroughly in Chapter 7. OpenGL also
maintains a stack of transformations, which makes it easy for the scene designer to control the dependency of one
object's position on that of another, and to create objects that are composed of several related parts.
The SDL language, along with the Scene and Shape classes, makes it much simpler to separate programming
issues from scene design issues. An application is developed once that can draw any scene described by a list of
light sources and a list of geometric objects. This application is then used over and over again with different
scene files. A key task in the scene design process is applying the proper geometric transformations to each
object. Since a certain amount of trial and error is usually required, it is convenient to be able to express these
transformations in a concise and readable way.
The next section presents a number of Case Studies that elaborate on the main ideas of the chapter and suggest
ways to practice with affine transformations in a graphics program. These range from plunging deeper into the
theory of transformations to actual modeling and rendering of objects such as electronic CAD circuits and robots.
Canvas::initCT(void); // init CT to unit transformation
Canvas::scale2D(double sx, double sy);
Canvas::translate2D(double dx, double dy);
Canvas::rotate2D(double angle);
as well as others that are incorporated into moveTo() and lineTo() so that all points sent to them are silently
transformed before being used. For extra benefit add the stack mechanism for the CT as well, along with functions
pushCT() and popCT(). Exercise your new tools on some interesting 2D modeling and drawing examples.
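A minimal sketch of the silent transformation step, assuming the CT is kept as a 3 x 3 homogeneous matrix (the names Matrix3 and transformPoint are illustrative, and Point2 is the 2D point type used elsewhere in the book):

struct Matrix3 { double m[3][3]; }; // the CT in homogeneous form

static Point2 transformPoint(const Matrix3& ct, double x, double y)
{
	Point2 q;
	q.x = ct.m[0][0] * x + ct.m[0][1] * y + ct.m[0][2];
	q.y = ct.m[1][0] * x + ct.m[1][1] * y + ct.m[1][2];
	return q;
}
void Canvas::lineTo(double x, double y)
{
	Point2 q = transformPoint(CT, x, y); // silently transform the point
	// ... draw a segment from the current position to q, then update it ...
}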
Case Study 5.2. Draw the star of Figure 5.39 using multiple rotations.
(Level of Effort: I). Develop a function that draws the polygon in Figure 5.39b that is one fifth of the star. Use it
with rotation transformations to draw the whole star.
    | a  b |   | 1           0 |   | R  0          |   |  a/R  b/R |
    | c  d | = | (ac+bd)/R²  1 | · | 0  (ad−bc)/R  | · | −b/R  a/R |        (5.39)

where R = √(a² + b²). The leftmost matrix on the right-hand side is recognized as a shear, the middle one as a
scaling, and the rightmost one as a rotation (why?).
Thus any 2D affine transformation is a rotation followed by a scaling followed by a shear followed by a
translation.
An alternative decomposition, based on the so-called Gram-Schmidt process, is discussed in Case Study 5.5.
a). Decompose the matrix

    M = | 4   3 |
        | −2  7 |

into the product of shear, scaling, and rotation matrices. Here R = √(4² + 3²) = 5, ac + bd = 13, and ad − bc = 34, so

    | 4   3 |   | 1      0 |   | 5  0    |   |  4/5  3/5 |
    | −2  7 | = | 13/25  1 | · | 0  34/5 | · | −3/5  4/5 |
b). A 2D Rotation is three shears.
Matrices can be factored in different ways. In fact a rotation matrix can be written as the product of three shear
matrices! [Paeth90] This leads to a particularly fast method for performing a series of rotations, as we will see.
Equation 5.40 shows a rotation represented as three successive shears. It can be verified by direct multiplication.
It demonstrates that a rotation is a shear in y followed by a shear in x, followed by a repetition of the first shear.
    | cos(a)  −sin(a) |   | 1         0 |   | 1  −sin(a) |   | 1         0 |
    | sin(a)   cos(a) | = | tan(a/2)  1 | · | 0     1    | · | tan(a/2)  1 |        (5.40)
Calling T = tan(a/2) and S = sin(a), show that we can write the sequence of operations that rotate the point (x, y)
as15:
y' = y + T * x;  x' = x;      {first shear}
x'' = x' - S * y';  y'' = y'; {second shear}
y''' = y'' + T * x'';         {third shear}
using primes to distinguish new values from old. But operations like x' = x do nothing, so the primes are
unnecessary and this sequence reduces to:
y = y + T * x;
x = x - S * y;
y = y + T * x;
15 Note that sin(a) may be found quickly from tan(a/2) by one multiplication and one division (see Appendix 2): S = 2T/(1 + T²).
If we have only one rotation to perform there is no advantage to going about it this way. However, it becomes
very efficient if we need to do a succession of rotations through the same angle. Two places where we need to do
this are 1). calculating the vertices of an n-gon which are equispaced points around a circle, and 2). computing
the points along the arc of a circle.
Figure 5.68 shows a code fragment for calculating the positions of n points around a circle. It loads the
successive values (cos(2πi/n + b), sin(2πi/n + b)) into the array of points p[].
T = tan(PI/n); // tangent of half angle16
S = 2 * T/(1 + T * T); // sine of angle
p[0].x = sin(b); // initial angle, vertex 0
p[0].y = cos(b);
for (int i = 1; i < n; i++)
{
	p[i].y = p[i-1].x * T + p[i-1].y; // 1st shear: y' = y + T*x
	p[i].x = p[i-1].x - S * p[i].y;   // 2nd shear: x' = x - S*y', using the new y
	p[i].y = p[i].x * T + p[i].y;     // 3rd shear: y'' = y' + T*x', using the new x
}
Figure 5.68. Building the vertices of an n-gon efficiently.
Figure 5.69 shows how to use shears to build a fast arc drawer. It does the same job as drawArc() in Chapter
3, but much more efficiently, since it avoids the repetitive computation of sin() and cos().
void drawArc2(RealPoint c, double R,
	double startangle, double sweep) // angles in degrees
{
	#define n 30
	#define RadPerDeg .01745329
	double delang = RadPerDeg * sweep / n;
	double T = tan(delang/2); // tangent of half the angle increment
	double S = 2 * T/(1 + T * T); // sine of the angle increment
	double snR = R * sin(RadPerDeg * startangle);
	double csR = R * cos(RadPerDeg * startangle);
	moveTo(c.x + csR, c.y + snR);
	for(int i = 1; i < n; i++)
	{
		snR += T * csR; // build next snR, csR pair
		csR -= S * snR;
		snR += T * csR;
		lineTo(c.x + csR, c.y + snR);
	}
}
Figure 5.69. A fast arc drawer.
Develop a test program that uses this routine to draw arcs. Compare its efficiency with arc drawers that compute
each vertex using trigonometry.
    | 1        0 |       1      | 1  −a |   | a   0  |       1      |  a  1 |
    | a − 1/a  1 |  = -------- ·| a   1 | · | 0  1/a | · -------- · | −1  a |        (5.41)
                     √(1 + a²)                            √(1 + a²)

16 Note: Some C environments do not support tan() directly. Use sin()/cos(), of course first checking that the denominator is not zero.
The middle matrix is a scaling, while the outer two matrices (when combined with the scale factors shown) are
rotations. For the left-hand rotation, associate 1/√(1+a²) with cos(θ) and a/√(1+a²) with sin(θ) for some
angle θ; thus tan(θ) = a. Similarly, the right-hand matrix is a rotation through −φ, where cos(φ) = a/√(1+a²)
and sin(φ) = 1/√(1+a²), so that tan(φ) = 1/a. Note that θ and φ are related: θ = π/2 − φ (why?).
Using this decomposition, write the shear in Equation 5.41 as a rotation, then a scaling, then a rotation, to
conclude that every 2D affine transformation is the following sequence of elementary transformations:

Any affine transformation = Rotation * Scale * Rotation * Scale * Rotation * Translation        (5.42)
Practice Exercises
5.8.1. A golden decomposition. Consider the special case of a unit shear, where the term a − 1/a in Equation
5.41 is 1. What value must a have? Determine the two angles θ and φ associated with the rotations.
Solution: Since a must satisfy a = 1 + 1/a, a is the golden ratio Φ! Thus

    | 1  0 |   | cos(θ)  −sin(θ) |   | Φ   0  |   |  cos(φ)  sin(φ) |
    | 1  1 | = | sin(θ)   cos(θ) | · | 0  1/Φ | · | −sin(φ)  cos(φ) |        (5.43)

where θ = tan⁻¹(Φ) = 58.28° and φ = tan⁻¹(1/Φ) = 31.72°.
5.8.2. Unit Shears. Show that any shear in x contains within it a unit shear. Decompose the shear given by

    | 1  0 |
    | h  1 |
5.8.7. Ellipses Are Invariant. Show that ellipses are invariant under an affine transformation. That is, if E is an
ellipse and if T is an affine transformation, then the image T(E) of the points in E also makes an ellipse. Hint to
Solution: Any affine is a combination of rotations and scalings. When an ellipse is rotated it is still an ellipse, so
only the non-uniform scalings could possibly destroy the ellipse property. So it is necessary only to show that an
ellipse, when subjected to a non-uniform scaling, is still an ellipse.
5.8.8. What else is invariant? Consider what class of shapes (perhaps a broader class than ellipses) is invariant
under affine transformations. If the equation f(x, y) = 0 describes a shape, show that after transforming it with
transformation T, the new shape is described by all points that satisfy g(x, y) = f(T⁻¹(x, y)) = 0. Then show the
details of this form when T is affine. Finally, try to describe the largest class of shapes that is preserved under
affine transformations.
(Level of Effort: II) A shear can be more general than those discussed in Section 5.???. As suggested by Goldman
[goldman91], the ingredients of a shear are:
a plane through the origin having unit normal vector m;
a unit vector v lying in the plane (thus perpendicular to m);
an angle θ.
Then, as shown in Figure 5.70, a point P is sheared to the point Q by shifting it in direction v a certain amount. The
amount is proportional both to the distance at which P lies from the plane, and to tan θ. Goldman shows that this
shear has the matrix representation:
Figure 5.70. A point P is sheared to Q along the direction v; m is the unit normal of the shear plane.

                     | mxvx  mxvy  mxvz |
    M = I + tan(θ) · | myvx  myvy  myvz |        (5.44)
                     | mzvx  mzvy  mzvz |
where I is the 3 by 3 identity matrix, and the offset vector is 0. Some details of the derivation are given in the
exercises.
Example 5.8.2: Find the shear associated with the plane having unit normal vector m = (1,1,1)/√3 = (0.577,
0.577, 0.577), unit vector v = (0, 0.707, −0.707), and angle θ = 30°. Solution: Note that v lies in the plane as
required (why?). Applying Equation 5.44 we obtain:
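The resulting matrix was lost in extraction; recomputing from Equation 5.44 (tan 30° ≈ 0.577, and each row of mᵀv is 0.577 · (0, 0.707, −0.707) ≈ (0, 0.408, −0.408)) gives approximately:

        | 1   0.236  −0.236 |
    M ≈ | 0   1.236  −0.236 |
        | 0   0.236   0.764 |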
Derive the shear matrix. We want to express the point Q in Figure 5.70 in terms of P and the ingredients of the
shear. The (signed) distance of P from the plane is given by P·m (considering P as a position vector pinned to
the origin). a). Put the ingredients together to obtain:

Q = P + (P·m) tan(θ) v        (5.45)

Note that points on the other side of the plane are sheared in the opposite direction, as we would expect.
Now we need to write the second term as P times some matrix. Use Appendix 2 to show that P·m is P mᵀ, and
therefore that Q = P (I + tan(θ) mᵀv), so that the shear matrix is (I + tan(θ) mᵀv). Show that the matrix mᵀv
has the form:
           | mx |                    | mxvx  mxvy  mxvz |
    mᵀv =  | my | · (vx, vy, vz)  =  | myvx  myvy  myvz |        (5.46)
           | mz |                    | mzvx  mzvy  mzvz |
            | 1  0  0 |
    Q = P · | 0  1  0 |        (5.47)
            | t  s  1 |
which alters the 3D point P by shearing it along x with factor t owing to z, and along y with factor s owing to z. If
P lies in the z = 1 plane with coordinates P = (px, py, 1), show that P is transformed into (px + t, py + s, 1).
Hence it is simply shifted in x by the amount t and in y by the amount s. Show that for any point in the z = 1 plane this
shear is equivalent to a translation. Show that the matrix in Equation 5.47 is identical to the homogeneous
coordinate form for a pure translation in two dimensions. This correspondence helps reinforce how homogeneous
coordinates operate.
(Level of Effort: II) You are asked to fill in the details in the following derivation of the rotation matrix Ru(β) of
Equation 5.33. For simplicity we assume u is a unit vector: |u| = 1. Denote the position vector based on P by
the name p, so p = P − O, where O is the origin of the coordinate system. Now project p onto u to obtain the vector h,
as shown in part a of the accompanying figure.
a). Show it has the form h = (p·u) u. Define two perpendicular vectors a and b that lie in the plane of rotation: a
= p − h, and b = u × a.
b). Show that they are perpendicular, that they have the same length, that they both lie in the plane, and that b = u × (p −
h) simplifies to u × p. This effectively establishes a 2D coordinate system in the plane of rotation. Now look
onto the plane of rotation, as in part b of the figure. The rotation carries a to a′ = cos β a + sin β b, so the rotated point
Q is given by Q = h + cos β a + sin β b, or, using the expressions for a and b,

Q = p cos β + (1 − cos β)(p·u) u + sin β (u × p)        (5.48)
This is a rather general result, showing how the rotated point Q can be decomposed into a portion along h and
portions along two orthogonal axes lying in the plane of rotation.
This form for Q hardly looks like the multiplication of P by some matrix, but it is, because each of the three terms
is linear in p. Convert each of the terms to the proper form as follows:
c). Replace p with P to obtain immediately p cos β = P (cos β) I, where I is the 3 by 3 identity matrix.
d). Use the result (Appendix 2) that the dot product p·u can be rewritten as P times a matrix, P uᵀ, to show that
(p·u) u = P (uᵀu), where uᵀu is an outer product similar to that in Equation 5.46.
e). Use the fact developed in Appendix 2 that the cross product u × p can also be written as P times a matrix, to
show that u × p = P Cross(u), where the matrix Cross(u) is:
               |  0    uz  −uy |
    Cross(u) = | −uz   0    ux |        (5.49)
               |  uy  −ux   0  |
f). Combine the parts: the rotated point is Q = P M, where M = (cos β) I + (1 − cos β) uᵀu + (sin β) Cross(u).
M is therefore the sum of three weighted matrices. It is surely easier to build than the product of five matrices,
as in the classic route.
g). Write this out to obtain Equation 5.33.
(5.51)
Every 3D affine transformation, then, can be viewed as this sequence of elementary operations, followed by a
translation. In Case Study 5.??? we explore the mathematics behind this form, and see how an actual
decomposition might be carried out.
Useful Classes of Transformations.
It's useful to categorize affine transformations according to what they do or don't do to certain properties of
an object when it is transformed. We know they always preserve parallelism of the edges of an object, but which
transformations also preserve the length of each edge, and which ones preserve the angles between each pair of
edges?
1). Rigid Body Motions.
It is intuitively clear that translating an object, or rotating it, will not change its shape or size. In addition,
reflecting it about a plane has no effect on its shape or size. Since the shape is not affected by any one of these
transformations alone, it is also not affected by an arbitrary composition of them. We denote by
Trigid = {rotations, reflections, translations}
the collection of all affine transformations that consist of any sequence of rotations, reflections, and translations.
These are known classically as the rigid body motions, since an object is moved rigidly from one position and
orientation to another. Such transformations have orthogonal matrices in homogeneous coordinates. These are
matrices for which the inverse is the same as the transpose:

M̃⁻¹ = M̃ᵀ
2). Angle-Preserving Transformations.
A uniform scaling (having equal scale factors Sx = Sy = Sz) expands or shrinks an object, but does so uniformly,
so there is no change in the object's shape. Thus the angle between any two edges is unaffected. We can denote
such a class as

17 Goldman [gold90] reports the same form for M, and gives compact results for several other complex transformations.
(5.52)
You are asked to verify each step along the way, and to develop a routine that will produce the individual
matrices S, R, H1, and H2.
Suppose the matrix M has rows u, v, and w, each a 3D vector:

        | u |
    M = | v |
        | w |

Goldman's approach is based on the classical Gram-Schmidt orthogonalization procedure, whereby the rows of
M are combined in such a way that they become mutually orthogonal and of unit length. The matrix composed of
these rows is therefore orthogonal (see Practice Exercise 9.9.14) and so represents a rotation (or a rotation with a
reflection). Goldman shows that the orthogonalization process is in fact two shears. The rest is detail.
The steps are as follows. Carefully follow each step, and do each of the tasks given in brackets.
1. Normalize u to u* = u/S1, where S1 = |u|.
2. Subtract a piece of u* from v so that what is left is orthogonal to u*: call b = v − d u*, where d = v·u*. [Show
that b·u* = 0.]
3. Normalize b: set v* = b/S2, where S2 = |b|.
4. Set up some intermediate values: m = w·u* and n = w·v*, e = m + n, and r = (m u* + n v*)/e.
5. Subtract a piece of r from w so that what is left is orthogonal to both u* and v*: call c = w − e r. [Show that
c·u* = c·v* = 0.]
6. Normalize c: set w* = c/S3, where S3 = |c|.
7. The matrix

        | u* |
    R = | v* |
        | w* |

is therefore orthogonal, and so represents a rotation. (Compute its determinant: if it is −1 then simply replace w*
with −w*.)
8. Define the shear matrix H1 = I + (d/S2)(v*ᵀ u*) (see Practice Exercise 9.9.1), where
        | S1  0   0  |
    S = | 0   S2  0  |
        | 0   0   S3 |
Note that the decomposition is not unique, since the vectors u, v, and w could be orthogonalized in a different
order. For instance, we could first form w* as w/|w|, then subtract a piece of v from w* to make a vector
orthogonal to w*, then subtract a vector from u to make a vector orthogonal to the other two.
Write the routine:
void decompose(DBL m[3][3], DBL S[3][3], DBL R[3][3], DBL H1[3][3], DBL H2[3][3]);
where DBL is defined as double, that takes the matrix m and computes the matrices S, R, H1, and H2 as described
above. Test your routine on several matrices.
Other ways to decompose a 3D transformation have been found as well. See for instance [thomas, GEMS II, p.
320] and [Shoemake, GEMS IV p. 207].
6.1 Introduction.
In this chapter we examine ways to describe 3D objects using polygonal meshes. Polygonal meshes are simply
collections of polygons, or faces, which together form the skin of the object. They have become a standard
way of representing a broad class of solid shapes in graphics. We have seen several examples, such as the
cube and the icosahedron, as well as approximations of smooth shapes like the sphere, cylinder, and cone (see Figure
5.65). In this chapter we shall see many more examples. Their prominence in graphics stems from the simplicity of using polygons: polygons are easy to represent (by a sequence of vertices) and to transform; they have simple
properties (a single normal vector, a well-defined inside and outside, etc.); and they are easy to draw (using a polygon-fill routine, or by mapping texture onto the polygon).
Many rendering systems, including OpenGL, are based on drawing objects by drawing a sequence of polygons.
Each polygonal face is sent through the graphics pipeline (recall Figure 5.56), where its vertices undergo various transformations, until finally the portion of the face that survives clipping is colored in, or shaded, and
shown on the display device.
We want to see how to design complicated 3D shapes by defining an appropriate set of faces. Some objects
can be perfectly represented by a polygonal mesh, whereas others can only be approximated. The barn of Figure 6.1a, for example, naturally has flat faces, and in a rendering the edges between faces should be visible.
But the cylinder in Figure 6.1b is supposed to have a smoothly rounded wall. This roundness cannot be
achieved using only polygons: the individual flat faces are quite visible, as are the edges between them. There
are, however, rendering techniques that make a mesh like this appear to be smooth, as in Figure 6.1c. We examine the details of so-called Gouraud shading in Chapter 8.
The distinct normal vectors involved are listed in Figure 6.6:

normal    nx       ny      nz
0         -1       0       0
1         -0.447   0.894   0
2          0.447   0.894   0
3          1       0       0
4          0      -1       0
5          0       0       1
6          0       0      -1

Figure 6.6. The list of distinct normal vectors involved.
face            vertices     associated normal
0 (left)        0,5,9,4      0,0,0,0
1 (roof left)   3,4,9,8      1,1,1,1
2 (roof right)  2,3,8,7      2,2,2,2
3 (right)       1,2,7,6      3,3,3,3
4 (bottom)      0,1,6,5      4,4,4,4
5 (front)       5,6,7,8,9    5,5,5,5,5
6 (back)        0,4,3,2,1    6,6,6,6,6

Figure 6.7. Face list for the basic barn.
Figure 6.7 shows the barn's face list: each face has a list of vertices and the normal vector associated with each
vertex. To save space, just the indices of the proper vertices and normals are used. (Since each surface is flat,
all of the vertices in a face are associated with the same normal.) The list of vertices for each face begins with
any vertex in the face, and then proceeds around the face vertex by vertex until a complete circuit has been
made. There are two ways to traverse a polygon: clockwise and counterclockwise. For instance, face #5 above
could be listed as (5,6,7,8,9) or (9,8,7,6,5). Either direction could be used, but we follow a convention that
proves handy in practice:
Traverse the polygon counterclockwise as seen from outside the object.
Using this order, if you traverse around the face by walking on the outside surface from vertex to vertex, the
interior of the face is on your left. We later design algorithms that exploit this ordering; because of it the algorithms are able to distinguish with ease the front from the back of a face.
The barn is an example of a data-intensive model, where the position of each vertex is entered (maybe by
hand) by the designer. In contrast, we see later some models that are generated algorithmically. Here it isn't
too hard to come up with the vertices for the basic barn: the designer chooses a simple unit square for the
floor, decides to put one corner of the barn at the origin, and chooses a roof height of 1.5 units. By suitable
scaling these dimensions can be altered later (although the relative height of the wall to the barn's peak, 1 : 1.5,
is forever fixed).
6.2.2. Finding the normal vectors.
It may be possible to set vertex positions by hand, but it is not so easy to calculate the normal vectors. In general each face will have three or more vertices, and a designer would find it challenging to jot down the normal
vector. It's best to let the computer do it during the creation of the mesh model.
If the face is considered flat, as in the case of the barn, we need only find the normal vector to the face itself,
and associate it with each of the face's vertices. One direct way uses the vector cross product to find the normal, as in Figure 4.16. Take any three adjacent points on the face, V1, V2, and V3, and compute the normal as
their cross product m = (V1 − V2) × (V3 − V2). This vector can now be normalized to unit length.
There are two problems with this simple approach. a). If the two vectors V1 − V2 and V3 − V2 are nearly parallel,
the cross product will be very small (why?), and numerical inaccuracies may result. b). As we see later, it may
turn out that the polygon is not perfectly planar; that is, all of the vertices may not lie in the same plane. Thus
the surface represented by the vertices cannot be truly flat. We need to form some average value for the
normal to the polygon, one that takes into consideration all of the vertices.
A robust method that solves both of these problems was devised by Martin Newell [newm79]. It computes the
components mx, my, mz of the normal m according to the formulas:

mx = Σi=0..N−1 (yi − y_next(i)) (zi + z_next(i))
my = Σi=0..N−1 (zi − z_next(i)) (xi + x_next(i))        (6.1)
mz = Σi=0..N−1 (xi − x_next(i)) (yi + y_next(i))
where N is the number of vertices in the face, (xi, yi, zi) is the position of the i-th vertex, and where next(j) =
(j+1) mod N is the index of the next vertex around the face after vertex j, in order to take care of the wraparound from the (N-1)-st to the 0-th vertex. The computation requires only one multiplication per edge for
each of the components of the normal, and no testing for collinearity is needed. This result is developed in
Case Study 6.2, and C++ code is presented for it.
The vector m computed by the Newell method could point toward the inside or toward the outside of the polygon. We also show in the Case study that if the vertices of the polygon are traversed (as i increases) in a CCW
direction as seen from outside the polygon, then m points toward the outside of the face.
Example 6.2.2: Consider the polygon with vertices P0 = (6, 1, 4), P1 = (7, 0, 9), and P2 = (1, 1, 2). Find the
normal to this polygon using the Newell method. Solution: Direct use of the cross product gives ((7, 0, 9) − (6,
1, 4)) × ((1, 1, 2) − (6, 1, 4)) = (2, −23, −5). Application of the Newell method yields the same result: (2, −23,
−5).
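A minimal sketch of Equation 6.1 in code (the full version appears in Case Study 6.2; the Vec3 type here is a stand-in):

struct Vec3 { double x, y, z; };

// Newell's method: a robust normal for a (possibly non-planar) polygon
Vec3 newellNormal(const Vec3 v[], int N)
{
	Vec3 m = { 0.0, 0.0, 0.0 };
	for (int i = 0; i < N; i++)
	{
		int nx = (i + 1) % N; // index of the next vertex, with wraparound
		m.x += (v[i].y - v[nx].y) * (v[i].z + v[nx].z);
		m.y += (v[i].z - v[nx].z) * (v[i].x + v[nx].x);
		m.z += (v[i].x - v[nx].x) * (v[i].y + v[nx].y);
	}
	return m; // normalize to unit length if a unit normal is needed
}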
Practice Exercises.
6.2.1. Using the Newell Method.
For the three vertices (6, 1, 4), (2, 0, 5), and (7, 0, 9), compare the normal found using the Newell method
with that found using the usual cross product. Then use the Newell method to find (nx,ny,nz) for the polygon
having the vertices (1, 1, 2), (2, 0, 5), (5, 1, 4), (6, 0, 7). Is the polygon planar? If so, find its true normal using
the cross product, and compare it with the result of the Newell method.
6.2.2. What about a non-planar polygon? Consider the quadrilateral shown in Figure 6.8 that has the vertices
(0, 0, 0), (1, 0, 0), (0, 0,1), and (1, a, 1). When a is nonzero this is a non-planar polygon. Find the normal to
it using the Newell method, and discuss how good an estimate it is for different values of a.
Figure 6.8. A non-planar polygon.
6.2.3. Represent the generic cube. Make vertex, normal, and face lists for the generic cube, which is
centered at the origin, has its edges aligned with the coordinate axes, and has edges of length two. Thus its
eight vertices lie at the eight possible combinations of + and − in (±1, ±1, ±1).
6.2.4. Faces with holes. Figure 6.9 shows how a face containing a hole can be captured in a face list. A pair of
imaginary edges are added that bridge the gap between the circumference of the face and the hole, as suggested in the figure.
connected and solid. If we just want to draw the object, however, much greater freedom is available, since
many objects can still be drawn, even if they are non-physical.
Figure 6.11 shows some examples of objects we might wish to represent by meshes. PYRAMID is made up of
triangular faces, which are necessarily planar. It is not only convex; in fact it has all of the properties above.
Figure 6.11. Some surfaces describable by meshes: DONUT, PYRAMID, IMPOSSIBLE, BARN, FACE. (Part c is
courtesy of the University of Utah.)
BOX is an open box whose lid has been raised. In a graphics context we might want to color the outside of
BOX's six faces blue and their insides green. (What is obtained if we remove one face from PYRAMID
above?)
Two complex surfaces, STRUCT and FACE, are also shown. For these examples the polygonal faces are being
used to approximate a smooth underlying surface. In some situations the mesh may be all that is available for
the object, perhaps from digitizing points on a person's face. If each face of the mesh is drawn as a shaded
polygon, the picture will look artificial, as seen in FACE. Later we shall examine tools that attempt to draw
the smooth underlying surface based only on the mesh model.
Many geometric modeling software packages construct a model of some object - a solid or a surface - that
tries to capture the true shape of the object in a polygonal mesh. The problem of composing the lists can be
difficult. As an example, consider creating an algorithm that generates vertex and face lists to approximate the
shape of an engine block, a prosthetic limb, or a building. This area is in fact a subject of much ongoing research [Mant88], [Mort85]. By using a sufficient number of faces, a mesh can approximate the underlying
surface to any degree of accuracy desired. This property of completeness makes polygon meshes a versatile
tool for modeling.
6.2.5. Working with Meshes in a Program.
We want an efficient way to capture a mesh in a program that makes it easy to create and draw the object.
Since mesh data is frequently stored in a file, we also need simple ways to read and write mesh files.
It is natural to define a class Mesh, and to imbue it with the desired functionality. Figure 6.13 shows the declaration of the class Mesh, along with those of two simple helper classes, VertexID and Face1. A Mesh object has a vertex list, a normal list, and a face list, represented simply by the arrays pt, norm, and face, respectively. These arrays are allocated dynamically at runtime, when it is known how large they must be. Their
lengths are stored in numVerts, numNormals, and numFaces, respectively. Additional data fields can be
added later that describe various physical properties of the object, such as weight and type of material.
//################# VertexID ###################
class VertexID{
public:
	int vertIndex; // index of this vertex in the vertex list
	int normIndex; // index of this vertex's normal
};
//#################### Face ##################
class Face{
public:
	int nVerts; // number of vertices in this face
	VertexID * vert; // the list of vertex and normal indices
	Face(){ nVerts = 0; vert = NULL; } // constructor
	~Face(){ delete[] vert; nVerts = 0; } // destructor
};
//###################### Mesh #######################
class Mesh{
private:
	int numVerts;   // number of vertices in the mesh
	Point3* pt;     // array of 3D vertices
	int numNormals; // number of normal vectors for the mesh
	Vector3* norm;  // array of normals
	int numFaces;   // number of faces in the mesh
	Face* face;     // array of face data
	// ... others to be added later
public:
	Mesh();  // constructor
	~Mesh(); // destructor
	int readFile(char* fileName); // to read in a filed mesh
	// ... others ...
};
Figure 6.13. Proposed data type for a mesh.
1 Definitions of the basic classes Point3 and Vector3 have been given previously, and also appear in Appendix 3.
The Face data type is basically a list of vertices and the normal vector associated with each vertex in the face.
It is organized here as an array of index pairs: the v-th vertex in the f-th face has position
pt[face[f].vert[v].vertIndex] and normal vector norm[face[f].vert[v].normIndex]. This appears
cumbersome at first exposure, but the indexing scheme is quite orderly and easy to manage, and it has the advantage of efficiency, allowing rapid random-access indexing into the pt[] array.
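For example, a drawing method built on this scheme might loop over the faces as follows (a sketch only; the method name is hypothetical, and Point3 and Vector3 are assumed to expose x, y, z fields):

void Mesh::draw(void)
{
	for (int f = 0; f < numFaces; f++)
	{
		glBegin(GL_POLYGON); // send face f down the pipeline
		for (int v = 0; v < face[f].nVerts; v++)
		{
			int in = face[f].vert[v].normIndex; // this vertex's normal
			int iv = face[f].vert[v].vertIndex; // this vertex's position
			glNormal3d(norm[in].x, norm[in].y, norm[in].z);
			glVertex3d(pt[iv].x, pt[iv].y, pt[iv].z);
		}
		glEnd();
	}
}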
Example 6.2.3. Data for the tetrahedron. Figure 6.14 shows the specific data structure for the tetrahedron
shown, which has vertices at (0,0,0), (1,0,0), (0,1,0), and (0,0,1). Check the values reported in each field. (We
discuss how to find the normal vectors later.)
For proper shading these vectors must be normalized. Otherwise place glEnable(GL_NORMALIZE) in the
init() function. This requests that OpenGL automatically normalize all normal vectors.
6.3. Polyhedra.
It is frequently convenient to restrict the data in a mesh so that it represents a polyhedron. A very large number of solid objects of interest are indeed polyhedra, and algorithms for processing a mesh can be greatly simplified if they need only process meshes that represent a polyhedron.
Slightly different definitions for a polyhedron are used in different contexts [Coxeter69, Courant & Robbins??, F&VD90], but we use the following:
Definition:
A polyhedron is a connected mesh of simple planar polygons that encloses a finite amount of space.
So by definition a polyhedron represents a single solid object. This requires that:
every edge is shared by exactly two faces;
at least three edges meet at each vertex;
faces do not interpenetrate: two faces either do not touch at all, or they touch only along their common edge.
In Figure 6.11 PYRAMID is clearly a polyhedron. DONUT evidently encloses space, so it is a polyhedron if
its faces are in fact planar. It is not a simple polyhedron, since there is a hole through it. In addition, two of its
faces themselves have holes. Is IMPOSSIBLE a polyhedron? Why? If the texture faces are omitted BARN
might be modeled as two polyhedra, one for the main part and one for the silo.
Euler's formula:
Euler's formula (which is very easy to prove; see for instance [courant and robbins, 61, p. 236]) provides a
fundamental relationship between the number of faces, edges, and vertices (F, E, and V respectively) of a simple polyhedron:

V + F − E = 2        (6.1)
V + F - E = 2 + H - 2G
(6.2)
where H is the total number of holes occurring in faces, and G is the number of holes through the polyhedron.
Figure 6.16a shows a donut constructed with a hole in the shape of another parallelepiped. The two ends are
beveled off as shown. For this object V = 16, F = 16, E = 32, H = 0, and G = 1. Figure 6.16b shows a polyhedron having a hole A penetrating part way into it, and hole B passing through it.
Figure 6.18. Schlegel diagrams for PYRAMID and the basic barn.
6.3.1. Prisms and Antiprisms.
A prism is a particular type of polyhedron that embodies certain symmetries, and therefore is quite simple to
describe. As shown in Figure 6.19 a prism is defined by sweeping (or extruding) a polygon along a straight
line, turning a 2D polygon into a 3D polyhedron. In Figure 6.19a polygon P is swept along vector d to form
the polyhedron shown in part b. When d is perpendicular to the plane of P the prism is a right prism. Figure
6.19c shows some block letters of the alphabet swept to form prisms.
Figure 6.20. A regular prism and antiprism (bases: an equilateral triangle and a square).
Practice Exercises
6.3.1. Build the lists for a prism.
Give the vertex, normal, and face lists for the prism shown in Figure 6.21. Assume the base of the prism (face
#4) lies in the xy-plane, and vertex 2 lies on the z-axis at z = 4. Further assume vertex 5 lies 3 units along the x-axis, and that the base is an equilateral triangle.
Figure 6.21. A prism with an equilateral-triangle base.
If all of the faces of a polyhedron are identical and each is a regular polygon, the object is a regular polyhedron. These symmetry constraints are so severe that only five such objects can exist, the Platonic solids3
shown in Figure 6.22 [Coxe61]. The Platonic solids exhibit a sublime symmetry and a fascinating array of
properties. They make interesting objects of study in computer graphics, and often appear in solid modeling
CAD applications.
Figure 6.22. The five Platonic solids: a). tetrahedron, b). cube, c). octahedron, d). icosahedron, e). dodecahedron.

3 Named in honor of Plato (427-347 BC), who commemorated them in his Timaeus. But they were known before
this: a toy dodecahedron was found near Padua in Etruscan ruins dating from 500 BC.
4 Named for L. Schläfli (1814-1895), a Swiss mathematician.
Practice Exercises
6.3.7. The octahedron. Consider the octahedron that is the dual of the cube described in Exercise 9.4.3. Build
vertex and face lists for this octahedron.
6.3.8. Check duality. Beginning with the face and vertex lists of the octahedron in the previous exercise, find
its dual and check that this dual is (a scaled version of) the cube.
Normal Vectors for the Platonic Solids.
If we wish to build meshes for the Platonic solids we must compute the normal vector to each face. This can
be done in the usual way using Newell's method, but the high degree of symmetry of a Platonic solid offers a
much simpler approach. Assuming the solid is centered at the origin, the normal vector to each face is the
vector from the origin to the center of the face, which is formed as the average of the vertices. Figure 6.26
shows this for the octahedron. The normal to the triangular face shown is simply the average of its vertices:

m = (V1 + V2 + V3) / 3        (6.4)

(Note also: this vector is the same as that from the origin to the appropriate vertex on the dual Platonic
solid.)
The Tetrahedron
The vertex list of a tetrahedron depends of course on how the tetrahedron is positioned, oriented, and sized. It
is interesting that a tetrahedron can be inscribed in a cube (such that its four vertices lie at corners of the cube,
and its four edges lie in faces of the cube). Consider the cube having vertices (±1, ±1, ±1), and choose the
tetrahedron that has one vertex at (1,1,1). Then it has the vertex and face lists given in Figure 6.27 [blinn87].
vertex list                      face list
vertex    x    y    z            face number   vertices
0         1    1    1            0             1,2,3
1         1   -1   -1            1             0,3,2
2        -1   -1    1            2             0,1,3
3        -1    1   -1            3             0,2,1

Figure 6.27. Vertex list and face list for a tetrahedron.
The Icosahedron
The vertex list for the icosahedron presents more of a challenge, but we can exploit a remarkable fact to make
it simple. Figure 6.28 shows that three mutually perpendicular golden rectangles inscribe the icosahedron, and
so a vertex list may be read directly from this picture. We choose to align each golden rectangle with a coordinate axis. For convenience, we size the rectangles so their longer edge extends from -1 to 1 along its axis. The
shorter edge then extends from -τ to τ, where τ = (√5 − 1)/2 = 0.618... is the reciprocal of the golden
ratio φ.
Figure 6.28. Golden rectangles defining the icosahedron.
From this it is just a matter of listing vertex positions, as shown in Figure 6.29.
vertex    x    y    z
0         0    1    τ
1         0    1   -τ
2         1    τ    0
3         1   -τ    0
4         0   -1   -τ
5         0   -1    τ
6         τ    0    1
7        -τ    0    1
8         τ    0   -1
9        -τ    0   -1
10       -1    τ    0
11       -1   -τ    0

Figure 6.29. Vertex list for the icosahedron.
A model for the icosahedron is shown in Figure 6.30. The face list for the icosahedron can be read directly off
of it. (Question: what is the normal vector to face #8?)
Figure 6.30. Model for the icosahedron.
Some people prefer to adjust the model for the icosahedron slightly into the form shown in Figure 6.31. This
makes it clearer that an icosahedron is made up of an anti-prism (shown shaded) and two pentagonal pyramids
on its top and bottom.
Figure 6.31. The icosahedron: five faces form a pyramidal cap, ten faces form an antiprism, and five faces form a
pyramidal base.
Practice Exercises
6.3.9. Icosahedral Distances. What is the radial distance of each vertex of the icosahedron above from the
origin?
6.3.10. Vertex list for the dodecahedron. Build the vertex list and normal list for the dodecahedron.
6.3.3. Other interesting polyhedra.
There are endless varieties of polyhedra (see for instance [Wenn71] and [Coxe63]), but one class is
particularly interesting. Whereas each Platonic solid has the same type of n-gon for all of its faces, the Archimedean (also called semi-regular) solids have more than one kind of face, although all faces are still regular polygons. In addition it is required that every vertex be surrounded by the same collection of polygons in the same
order.
For instance, the truncated cube, shown in Figure 6.33a, has 8-gons and 3-gons for faces, and around each
vertex one finds one triangle and two 8-gons. This is summarized by associating the symbol 3.8.8 with this
solid.
Figure 6.33. The truncated cube.

V = ((1 + A)/2) C + ((1 − A)/2) D,    W = ((1 − A)/2) C + ((1 + A)/2) D        (6.5)
Based on this it is straightforward to build vertex and face lists for the truncated cube (see the exercises and
Case Study 6.3).
Given the constraints that faces must be regular polygons, and that they must occur in the same arrangement
about each vertex, there are only 13 possible Archimedean solids, discussed further in Case Study 6.10. Archimedean solids still enjoy enough symmetry that the normal vector to each face is found using the center of
the face.
One Archimedean solid of particular interest is the truncated icosahedron 5.6.6 shown in Figure 6.34, which
consists of regular hexagons and pentagons. The pattern is familiar from soccer balls used around the world.
More recently this shape has been named the Buckyball after Buckminster Fuller, because of his interest in
geodesic structures similar to this. Crystallographers have recently discovered that 60 atoms of carbon can be
arranged at the vertices of the truncated icosahedron, producing a new kind of carbon molecule that is neither
graphite nor diamond. The material has many remarkable properties, such as high-temperature stability and
superconductivity [Brow90 Sci. Amer., also see Arthur Hebard, Science Watch 1991], and has acquired the
name Fullerene.
W18 = (2/3) V1 + (1/3) V8        (6.6)

(What is W81?) Projecting W18 onto the enclosing sphere of radius R is simply a scaling:

P18 = R w18 / |w18|        (6.7)
where we write w18 for the position vector associated with the point W18. The old and new vertices are connected
by straight lines to produce nine triangular faces for each face of the icosahedron. (Why isn't this a new
Platonic solid?) What are the values for E, F, and V for this polyhedron? Much more can be found on geodesic
domes in the references, particularly [full75] and [kapp91].
Practice Exercises.
6.3.11. Lists for the truncated icosahedron. Write the vertex, normal, and face lists for the truncated icosahedron described above.
6.3.12. Lists for a Bucky ball. Create the vertex, normal, and face lists for a Bucky ball. As computing 60
vertices is tedious, it is perhaps easiest to write a small routine to form each new vertex using the vertex list of
the icosahedron of Figure 6.29.
6.3.13. Build lists for the geodesic dome. Construct vertex, normal, and face lists for a frequency-3 geodesic
dome as described above.
Figure 6.38. The prism ARROW, formed by extruding an arrow-shaped polygon P; its faces are numbered.
We want a tool to make a mesh for the prism based on an arbitrary polygon. Suppose the prism's base is a
polygon with N vertices (xi, yi). We number the vertices of the base 0, . . . , N−1 and those of the cap N, . . . ,
2N−1, so that an edge joins vertices i and i + N, as in the example. The vertex list is then easily constructed to
contain the points (xi, yi, 0) and (xi, yi, H), for i = 0, 1, ..., N−1.
The face list is also straightforward to construct, as shown in the sketch below. We first make the side faces or walls, and then add the
cap and base. For the j-th wall (j = 0, ..., N−1) we create a face with the four vertices having indices j, j + N,
next(j) + N, and next(j), where next(j) is j + 1 except if j equals N−1, whereupon it is 0. This takes care of the
wraparound from the (N−1)-st to the 0-th vertex. next(j) is given by:

next(j) = (j + 1) modulo N        (6.8)

or in terms of program code: next = (j < (N-1)) ? (j + 1) : 0. Each face is inserted in the face
list as it is created. The normal vector to each face is easily found using the Newell method described earlier.
We then create the base and cap faces and insert them in the face list. Case Study 6.3 provides more details for
building mesh models of prisms.
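A sketch of such a tool, built on the Mesh class of Figure 6.13 (the method name is hypothetical, and base/cap faces and normals are left to Case Study 6.3):

void Mesh::makePrism(Point2 p[], int N, double H)
{
	numVerts = 2 * N; // N base vertices, N cap vertices
	pt = new Point3[numVerts];
	for (int i = 0; i < N; i++)
	{
		pt[i].x = p[i].x;     pt[i].y = p[i].y;     pt[i].z = 0.0; // base
		pt[i + N].x = p[i].x; pt[i + N].y = p[i].y; pt[i + N].z = H; // cap
	}
	numFaces = N + 2; // N walls, plus base and cap
	face = new Face[numFaces];
	for (int j = 0; j < N; j++) // build the j-th wall
	{
		int next = (j < N - 1) ? (j + 1) : 0; // wraparound, as in Eq. 6.8
		face[j].nVerts = 4;
		face[j].vert = new VertexID[4];
		face[j].vert[0].vertIndex = j; // CCW as seen from outside
		face[j].vert[1].vertIndex = j + N;
		face[j].vert[2].vertIndex = next + N;
		face[j].vert[3].vertIndex = next;
	}
	// ... build the base and cap faces, and set normals via the Newell method ...
}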
6.4.2. Arrays of Extruded Prisms - brick laying.
Some rendering tools, like OpenGL, can reliably draw only convex polygons. They might fail, for instance, to
draw the arrow of Figure 6.38 correctly. If this is so, the polygon can be decomposed (tessellated) into a set of
convex polygons, and each one can be extruded. Figure 6.39 shows some examples, including a few extruded
letters of the alphabet.
A quad-strip is described by a list of 2M vertices:

quad-strip = {p0, p1, p2, ..., p2M−1}        (6.9)
The vertices are understood to be taken in pairs, with the odd ones forming one edge of the quad-strip, and
the even ones forming the other edge. Not every polygon can be represented as a quad-strip. (Which of the
polygons in Figure 6.39 are not quad-strips? What block letters of the alphabet can be drawn as quad-strips?)
When a mesh is formed as an extruded quad-strip only 2M vertices are placed in the vertex list, and only the
outside walls are included in the face list. There are 2M - 2 faces in all. (Why?) Thus no redundant walls are
drawn when the mesh is rendered. A method for creating a mesh for an extruded quad-strip would take an array of 2D points and an extrusion vector as its parameters:
void Mesh:: makeExtrudedQuadStrip(Point2 p[], int numPts, Vector3 d);
Figure 6.41 shows some examples of interesting extruded quad-strips. Case Study 6.4 considers how to make
such meshes in more detail.
More general extrusions transform the base polygon's vertices pi into the cap vertices pi' according to

pi' = M pi,  for i = 0, 1, ..., N - 1        (6.10)
where M is some 4 by 4 matrix representing an affine transformation. Figure 6.42 shows some examples. Parts
a and b show pyramids, or tapered cylinders (also truncated cones), where the cap is a smaller version of
the base. The transformation matrix for this is:
M = | 0.7  0    0  0 |
    | 0    0.7  0  0 |
    | 0    0    1  H |
    | 0    0    0  1 |
based simply on a scaling factor of 0.7 and a translation by H along z. Part c shows a cylinder where the cap
has been rotated through an angle θ about the z-axis before translation, using the matrix:
M = | cos(θ)  -sin(θ)  0  0 |
    | sin(θ)   cos(θ)  0  0 |
    | 0        0       1  H |
    | 0        0       0  1 |
And part d shows in cross section how cap P can be rotated arbitrarily before it is translated to the desired
position.
Prisms such as these are just as easy to create as those that use a simple translation for M: the face list is identical to the original; only the vertex positions (and the values for the normal vectors) are altered.
Practice exercises.
6.4.1. The tapered cylinder. Describe in detail how to make vertex, normal, and face lists for a tapered cylinder having regular pentagons for its base and cap, where the cap is one-half as large as the base.
6.4.2. The tetrahedron as a tapered cylinder. Describe how to model a tetrahedron as a tapered cylinder
with a triangular base. Is this an efficient way to obtain a mesh for a tetrahedron?
6.4.3. An anti-prism. Discuss how to model the anti-prism shown in Figure 6.20b. Can it be modeled as a
certain kind of extrusion?
6.4.4. Building Segmented Extrusions - Tubes and Snakes.
Another rich set of objects can be modeled by employing a sequence of extrusions, each with its own transformation, and laying them end-to-end to form a tube. Figure 6.43a shows a tube made by extruding a
square P three times, in different directions with different tapers and twists. The first segment has end polygons M0P and M1P, where the initial matrix M0 positions and orients the starting end of the tube. The second
segment has end polygons M1P and M2P, etc. We shall call the various transformed squares the waists of the
tube. In this example the vertex list of the mesh contains the 16 vertices M0p0, M0 p1, M0 p2, M0 p3, M1p0, M1p1,
M1p2, M1p3, ..., M3p0, M3p1, M3p2, M3p3. Figure 6.43b shows a snake, so called because the matrices Mi cause
the tube to grow and shrink to represent the body and head of a snake.
A simple example of a spine is the circular helix, given by

C(t) = (cos(t), sin(t), bt)        (6.11)
Figure 6.45. Constructing local coordinate systems along the spine curve.
It is most convenient to let the curve C(t) itself determine the local coordinate systems. A method well-known
in differential geometry creates the Frenet frame at each point along the spine [gray93]. At each value ti of
interest a vector T(ti) that is tangent to the curve is computed. Then two vectors, N(ti) and B(ti), which are perpendicular to T(ti) and to each other, are computed. These three vectors constitute the Frenet frame at ti.
Once the Frenet frame is computed it is easy to find the transformation matrix M that transforms the base
polygon of the tube to its position and orientation in this frame. It is the transformation that carries the world
coordinate system into this new coordinate system. (The reasoning is very similar to that used in Exercise 5.6.1
on transforming the camera coordinate system into the world coordinate system.) The matrix Mi must carry i, j,
and k into N(ti), B(ti), T(ti), respectively, and must carry the origin of the world into the spine point C(ti). Thus
the matrix has columns consisting directly of N(ti), B(ti), T(ti), and C(ti) expressed in homogeneous coordinates:

Mi = | N(ti)  B(ti)  T(ti)  C(ti) |
     | 0      0      0      1     |        (6.12)

(The VRML 2.0 modeling language includes an extrusion node that works in a similar fashion, allowing the
designer to define a spine along which the polygons are placed, each with its own transformation.)
The tangent is based on the derivative C'(t) of the curve, which points in the direction of increasing values
of t, that is, in the direction of the tangent to the curve. We normalize it to unit length to obtain the unit tangent vector at t. For example, the helix of Equation 6.11 has the unit tangent vector given by
T(t) = (1/√(1 + b²)) (-sin(t), cos(t), b)        (6.13)
Figure 6.46. a). Tangents to the helix. b). Frenet frames at various values of t for the helix.
If we form the cross product of this with any non-collinear vector we must obtain a vector perpendicular
to T(t) and therefore perpendicular to the spine of the curve. (Why?) A particularly good choice is the
acceleration, based on the second derivative, C''(t). So we form C'(t) × C''(t), and since it will be used
for an axis of the coordinate system, we normalize it, to obtain the unit binormal vector as:

B(t) = C'(t) × C''(t) / |C'(t) × C''(t)|        (6.14)
We then obtain a vector perpendicular to both T(t) and B(t) by using the cross product again:
N(t) = B(t) × T(t)        (6.15)
Convince yourself that these three vectors are mutually perpendicular and have unit length, and thus
constitute a local coordinate system at C(t) (known as a Frenet frame). For the helix example these
vectors are given by:
B(t) = (1/√(1 + b²)) (b sin(t), -b cos(t), 1),  N(t) = (-cos(t), -sin(t), 0)        (6.16)
Figure 6.46b shows the Frenet frame at various values of t along the helix.
Aside: Finding the Frenet frame Numerically.
If the formula for C(t) is complicated it may be awkward to form its successive derivatives in closed
form, such that formulas for T(t), B(t), and N(t) can be hard-wired into a program. As an alternative, it is
possible to approximate the derivatives numerically using:
C'(t) ≈ (C(t + ε) - C(t - ε)) / (2ε)
C''(t) ≈ (C(t - ε) - 2C(t) + C(t + ε)) / ε²        (6.17)
This computation will usually produce acceptable directions for T(t), B(t), and N(t), although the user
should beware that numerical differentiation is an inherently unstable process [burden85].
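A C++ sketch of this numerical approach follows. The small Vec3 type, the helper functions, and the choice of the helix of Equation 6.11 (with b = 0.3) as the spine are all illustrative, not part of the text's Mesh class:

#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 scale(double s, Vec3 a) { return { s * a.x, s * a.y, s * a.z }; }
static Vec3 cross(Vec3 a, Vec3 b) {
	return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
static Vec3 normalize(Vec3 a) {
	double len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
	return scale(1.0 / len, a);
}

// The spine curve: the helix of Equation 6.11, here with b = 0.3.
static Vec3 C(double t) { return { std::cos(t), std::sin(t), 0.3 * t }; }

// Build the Frenet frame at t using the central differences of Equation 6.17.
void frenetFrame(double t, double eps, Vec3& T, Vec3& B, Vec3& N)
{
	Vec3 d1 = scale(1.0 / (2 * eps), sub(C(t + eps), C(t - eps)));        // ~ C'(t)
	Vec3 d2 = scale(1.0 / (eps * eps),
	                add(sub(C(t - eps), scale(2.0, C(t))), C(t + eps)));  // ~ C''(t)
	T = normalize(d1);             // unit tangent, as in Equation 6.13
	B = normalize(cross(d1, d2));  // unit binormal, Equation 6.14
	N = cross(B, T);               // unit normal, Equation 6.15
}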
Figure 6.47 shows the result of wrapping a decagon about the helix in this way. The helix was sampled at
30 points, a Frenet frame was constructed at each point, and the decagon was erected in the new frame.
Figure 6.48. Tubes based on toroidal spirals. (file: torusKnot.bmp, file: torusKnot7.bmp)
Each tube is wrapped about a toroidal spiral, the curve given by

C(t) = ((a + b cos(qt)) cos(pt), (a + b cos(qt)) sin(pt), b sin(qt))        (6.18)
for some choice of constants a, b, p, and q. For part a the parameters p and q were chosen as 2 and 5, and
for part b they are chosen to be 1 and 7.
Figure 6.49 shows a sea shell, formed by wrapping a tube with a growing radius about a helix. To accomplish this, the matrix of Equation 6.12 was multiplied by a scaling matrix, where the scale factors also depend
on t:
M' = M | g(t)  0     0  0 |
       | 0     g(t)  0  0 |
       | 0     0     1  0 |
       | 0     0     0  1 |
Here g(t) = t. It is also possible to add a rotation to the matrix, so that the tube appears to twist more vigorously as one looks along the spine.
One of the problems with using Frenet frames for sweeping curves is that the local frame sometimes
twists in such a way as to introduce undesired knots in the surface. Recent work, such as [wang97],
finds alternatives to the Frenet frame that produce less twisting and therefore more graceful surfaces.
Practice Exercises.
6.4.4. What is N(t)? Show that N(t) is parallel to C''(t) - (C'(t) · C''(t)) C'(t) / |C'(t)|², so it points in
the direction of the acceleration when the velocity and acceleration at t are perpendicular.
6.4.5. The Frame for the helix. Consider the circular helix treated in the example above. Show that the
formulas above for the unit tangent, binormal, and normal vectors are correct. Also show that these are
unit length and mutually perpendicular. Visualize how this local coordinate system orients itself as you
move along the curve.
Figure 6.50 shows additional examples. Part (a) shows a hexagon wrapped about an elliptical spine to form a
kind of elliptical torus, and part (b) shows segments arranged into a knot along a Lissajous figure given by:
C(t) = (cos(M t), sin(N t + φ), 0)        (6.19)

with M = 2, N = 3, φ = 0.
Figure 6.50. a). A hexagon wrapped about an elliptical torus. b). A 7-gon wrapped about a Lissajous figure.
Case Study 6.5 examines more details of forming meshes that model tubes based on a parametric curve.
6.4.5. Discretely Swept Surfaces of Revolution.
The tubes above use affine transformations to fashion a new coordinate system at each spine point. If we use
pure rotations for the affine transformations, and place all spine points at the origin, a rich set of polyhedral
shapes emerges. Figure 6.51 shows an example, where a base polygon - now called the profile - is initially
positioned 3 units out along the x-axis, and then is successively rotated in steps about the y-axis to form an
approximation of a torus.
Figure 6.52. Approximating a martini glass with a discretely swept polyline. a). the profile, b). the swept surface.
M̃ = | cos(θi)  0  -sin(θi)  0 |
     | 0        1   0        0 |
     | sin(θi)  0   cos(θi)  0 |
     | 0        0   0        1 |
where θi = 2πi/K, i = 0, 1, ..., K-1. Note there is no translation involved. This transformation is simple enough
that we can write the positions of the vertices directly. The rotation sets the points of the i-th waist polyline at:

(xj cos(θi), yj, xj sin(θi))        (6.20)
Building meshes that model surfaces of revolution is treated further in Case Study 6.6.
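A minimal C++ sketch of Equation 6.20 follows; the struct and function names are illustrative, and a real version would feed the vertices into a Mesh's vertex list:

#include <cmath>

struct Pt3 { double x, y, z; };

// Sweep a profile polyline (xs[j], ys[j]) in K rotational steps about the
// y-axis, storing the waist vertices per Equation 6.20.
void sweepProfile(const double xs[], const double ys[], int nPts, int K, Pt3 out[])
{
	const double PI = 3.14159265358979;
	for (int i = 0; i < K; i++) {           // the i-th waist polyline
		double theta = 2 * PI * i / K;      // theta_i = 2 pi i / K
		for (int j = 0; j < nPts; j++)
			out[i * nPts + j] = { xs[j] * std::cos(theta), ys[j],
			                      xs[j] * std::sin(theta) };
	}
}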
6.5. Mesh Approximations to Smooth Objects.
For smooth surfaces, at each vertex of the approximating mesh we compute the normal vector to the underlying smooth surface, so that shading algorithms can interpolate shades gracefully across each face. We discuss the Gouraud shading algorithm in
Chapter 11.
The basic approach for each type of surface is to polygonalize (also called tesselate) it into a collection of flat
faces. If the faces are small enough and there is a graceful change in direction from one face to the next, the
resulting mesh will provide a good approximation to the underlying surface. The faces have vertices that are
found by evaluating the surface's parametric representation at discrete points. A mesh is created by building a
vertex list and face list in the usual way, except here the vertices are computed from formulas. The same is
true for the vertex normal vectors: they are computed by evaluating formulas for the normal to the surface at
discrete points.
6.5.1. Representations for Surfaces.
To set the stage, recall that in Section 4.5.5 we examined the planar patch, given parametrically by
P(u, v) = C + au + bv
(6.21)
where C is a point, and a and b are vectors. The range of values for the parameters u and v is usually restricted to [0, 1], in which case the patch is a parallelogram in 3D with corner vertices C, C + a, C + b,
and C + a + b, (recall Figure 4.31).
Here we enlarge our interests to nonlinear forms, to represent more general surface shapes. We introduce
three functions X(), Y(), and Z() so that the surface has parametric representation in point form
P(u, v) = (X(u, v), Y(u, v), Z(u, v))
(6.22)
with u and v restricted to suitable intervals. Different surfaces are characterized by different functions, X,
Y, and Z. The notion is that the surface is at (X(0, 0), Y(0, 0), Z(0, 0)) when both u and v are zero, at
(X(1, 0), Y(1, 0), Z(1, 0)) when u = 1 and v = 0, and so on. Keep in mind that two parameters are required when representing a surface, whereas a curve in 3D requires only one. Letting u vary while
keeping v constant generates a curve called a v-contour. Similarly, letting v vary while holding u constant produces a u-contour. (Look ahead to Figure 6.58 to see examples of u- and v-contours.)
The implicit form of a surface.
Although we are mainly concerned with parametric representations of different surfaces, it will prove
useful to keep track of an alternative way to describe a surface, through its implicit form. Recall from
Section 3.8 that a curve in 2D has an implicit form F(x, y) which must evaluate to 0 for all points (x, y)
that lie on the curve, and for only those. For surfaces in 3D a similar function F(x, y, z) exists that
evaluates to 0 if and only if the point (x, y, z) is on the surface. The surface therefore has an implicit
equation given by
F(x, y, z) = 0
(6.23)
that is satisfied for all points on the surface, and only those. The equation constrains the way that values
of x, y, and z must be related to confine the point (x, y, z) to the surface in question. For example, recall
(from Chapter 4) that the plane that passes through point B and has normal vector n is described by the
equation nx x + ny y + nz z = D (where D = n · B), so the implicit form for this plane is F(x, y, z) = nx x +
ny y + nz z - D. Sometimes it is more convenient to think of F as a function of a point P, rather than a
function of three variables x, y, and z, and we write F(P) = 0 to describe all points that lie on the surface.
For the example of the plane here, we would define F(P) = n · (P - B) and say that P lies in the plane if
and only if F(P) = n · (P - B) is zero. If we wish to work with coordinate frames (recall Section 4.5) so
that P̃ is the 4-tuple P̃ = (x, y, z, 1)^T, the implicit form for a plane is even simpler: F(P̃) = ñ · P̃,
where ñ = (nx, ny, nz, -D) captures both the normal vector and the value D.
It is not always easy to find the function F(x, y, z) or F(P) from a given parametric form (nor can you
always find a parametric form when given F(x, y, z)). But if both a parametric form and an implicit form
are available it is simple to determine whether they describe the same surface. Simply substitute X(u, v),
Y(u, v), and Z(u, v) for x, y, and z, respectively, in F(x, y, z) and check that F is 0 for all values of u and v
of interest.
For some surfaces like a sphere that enclose a portion of space it is meaningful to define an inside region and an outside region. Other surfaces like a plane clearly divide 3D space into two regions, but
one must refer to the context of the application to tell which half-space is the inside and which the outside. There are also many surfaces, such as a strip of ribbon candy, for which it makes little sense to
name an inside and an outside.
When it is meaningful to designate an inside and outside to a surface, the implicit form F(x, y, z) of a
surface is also called its inside-outside function. We then say that a point (x, y, z) is

   inside the surface if:    F(x, y, z) < 0
   on the surface if:        F(x, y, z) = 0
   outside the surface if:   F(x, y, z) > 0        (6.24)
This provides a quick and simple test for the disposition of a given point (x', y', z') relative to the surface:
Just evaluate F(x', y', z') and test whether it is positive, negative, or zero. This is seen to be useful in hidden line and hidden surface removal algorithms in Chapter 14, and in Chapter 15 it is used in ray tracing
algorithms. There has also been vigorous recent activity in rendering surfaces directly from their implicit
forms: see [bloomenthal97].
6.5.2. The Normal Vector to a Surface.
As described earlier, we need to determine the direction of the normal vector to a surface at any desired
point. Here we present one way based on the parametric expression, and one based on the implicit form,
of the surface. As each surface type is examined later, we find the suitable expressions for its normal
vector at any point.
The normal direction to a surface can be defined at a point, P(u0, v0), on the surface by considering a
very small region of the surface around P(u0, v0). If the region is small enough and the surface varies
smoothly in the vicinity, the region will be essentially flat. Thus it behaves locally like a tiny planar
patch and has a well-defined normal direction. Figure 6.53 shows a surface patch with the normal vector
drawn at various points. The direction of the normal vector is seen to be different at different points on
the surface.
Since p(u, v) is simply the difference P(u, v) - (0,0,0), the derivative of p() is the same as that of P().
n(u0, v0) = (∂p/∂u × ∂p/∂v) |_(u = u0, v = v0)        (6.25)
where the vertical bar | indicates that the derivatives are evaluated at u = u0, v = v0. Formed this way,
n(u0, v0) is not automatically a unit length vector, but it can be normalized if desired.
Example 6.5.1: Does this work for a plane? Consider the plane given parametrically by P(u, v) = C +
au + bv. The partial derivative of this with respect to u is just a, and that with respect to v is b. Thus according to Equation 6.25, n(u, v) = a × b, which we recognize as the correct result.
More generally, the partial derivatives of p(u, v) exist whenever the surface is smooth enough. Most
of the surfaces of interest to us in modeling scenes have the necessary smoothness and have simple
enough mathematical expressions so that finding the required derivatives is not difficult. Because p(u, v)
= X(u, v)i + Y(u, v)j + Z(u, v)k, the derivative of a vector is just the vector of the individual derivatives:
∂p(u, v)/∂u = (∂X(u, v)/∂u, ∂Y(u, v)/∂u, ∂Z(u, v)/∂u)        (6.26)
When the surface is given instead by its implicit form, F(x, y, z) = 0, the normal at a surface point is found
from the gradient of F:

n(x0, y0, z0) = ∇F |_(x0, y0, z0) = (∂F/∂x, ∂F/∂y, ∂F/∂z) |_(x0, y0, z0)        (6.27)
where each partial derivative is evaluated at the desired point, (x0, y0, z0). If the point (x0, y0, z0) for the
surface in question corresponds to the point P(u0,v0) of the parametric form, then n(x0, y0, z0) has the same
direction as n(u0, v0) in Equation 6.25, but it may have a different length. Again, it can be normalized if
desired.
Example 6.5.2. The plane again: Consider once again the plane with normal n that passes through point
A, given implicitly by F(x, y, z) = n · ((x, y, z) - A) = 0, or nx x + ny y + nz z - n · A = 0. This has
gradient ∇F = n, as expected.
Note that the gradient-based form gives the normal vector as a function of x, y, and z, rather than of u
and v. Sometimes for a surface we know both the inside-outside function, F(x, y, z), and the parametric
form, p(u, v) = X(u, v)i + Y(u, v)j + Z(u, v)k. In such cases it may be easiest to find the parametric form,
n(u, v), of the normal at (u, v) by a two-step method: (1) Use Equation 6.27 to get the normal at (x, y, z)
in terms of x, y, and z, and then (2) substitute the known functions X(u, v) for x, Y(u, v) for y, and Z(u, v)
for z. Some of the later examples illustrate this method.
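When closed-form derivatives are awkward, Equation 6.25 can also be applied numerically. The following C++ sketch approximates n(u, v) by central differences of the parametric form; the V3 type is illustrative, and the sample surface P chosen here is the plane of Example 6.5.1 with C = (0,0,0), a = (1,0,0), b = (0,1,0), so the true normal is a × b = (0,0,1):

#include <cmath>

struct V3 { double x, y, z; };

static V3 cross3(V3 a, V3 b) {
	return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// The surface being probed; here the plane P(u, v) = (u, v, 0).
static V3 P(double u, double v) { return { u, v, 0.0 }; }

// Approximate n(u, v) of Equation 6.25 by central differences on P(u, v).
V3 surfaceNormal(double u, double v, double eps)
{
	V3 a1 = P(u + eps, v), a2 = P(u - eps, v);   // samples for dp/du
	V3 b1 = P(u, v + eps), b2 = P(u, v - eps);   // samples for dp/dv
	V3 pu = { (a1.x - a2.x) / (2 * eps), (a1.y - a2.y) / (2 * eps),
	          (a1.z - a2.z) / (2 * eps) };
	V3 pv = { (b1.x - b2.x) / (2 * eps), (b1.y - b2.y) / (2 * eps),
	          (b1.z - b2.z) / (2 * eps) };
	return cross3(pu, pv);                       // not yet unit length
}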
6.5.3. The Effect of an Affine Transformation.
We shall need on occasion to work with the implicit and parametric forms of a surface after the surface
has been subjected to an affine transformation. We will also want to know how the normal to the surface
is affected by the transformation.
Suppose the transformation is represented by the 4 by 4 matrix M, and that the original surface has implicit
form (in terms of points in homogeneous coordinates) F(P̃) and parametric
form P̃(u, v) = (X(u, v), Y(u, v), Z(u, v), 1). Then it is clear that the transformed surface has parametric form M P̃(u, v) (why?). It is also easy to show (see the exercises) that the transformed surface has implicit form

F'(P̃) = F(M-1 P̃)

Further, if the original surface has normal vector n(u, v), then the transformed surface has normal vector M-T n(u, v).

For example, suppose we transform the plane examined above, given by F(P̃) = ñ · P̃ with
ñ = (nx, ny, nz, -D). The transformed plane has implicit form F'(P̃) = ñ · (M-1 P̃). This can be written (see the exercises) as (M-T ñ) · P̃, so the normal vector of the transformed plane involves the inverse transpose of the matrix, consistent with the claimed form for the normal to a general surface.
Practice Exercises.
6.5.1. The implicit form of a transformed surface. Suppose all points on a surface satisfy F(P̃) = 0, and that M transforms P̃ into Q̃; i.e. Q̃ = M P̃. Then argue that any point Q̃ on the transformed
surface comes from a point M-1 Q̃, and those points all satisfy F(M-1 Q̃) = 0. Show that this
proves that the implicit form for the transformed surface is F'(Q̃) = F(M-1 Q̃).
6.5.2. How are normal vectors affected? Let n = (nx, ny, nz, 0)^T be the normal at P and let v be any
vector tangent to the surface at P. Then n must be perpendicular to v and we can write n · v = 0.
a). Show that the dot product can be written as a matrix product: nTv = 0 (see Appendix 2).
b). Show that this is still 0 when the matrix product M-1M is inserted: nTM-1Mv = 0.
c). Show that this can be rewritten as (M-T n) · (M v) = 0, so M-T n is perpendicular to (M v).
Now since the tangent v transforms to Mv, which is tangent to the transformed surface, show that
this says M-Tn must be normal to the transformed surface, which we wished to show.
d). The normal to a surface is also given by the gradient of the implicit form, so the normal to the
transformed surface at point P must be the gradient of F(M-1P). Show (by the chain rule of calculus)
that the gradient of this function is M-T multiplied onto the gradient of F().
6.5.3. The tangent plane to a transformed surface. To find how normal vectors are transformed
we can also find how the tangent plane to a surface is mapped to the tangent plane on the transformed surface. Suppose the tangent plane to the original surface at point P has parametric representation P + au + bv, where a and b are two vectors lying in the plane. The normal to the surface is
therefore n = a × b.
a). Show that the parametric representation of the transformed plane is MP + Mau + Mbv, and that
this plane has normal n' = (Ma) × (Mb).
b). Referring to Appendix 2, show the following identity:

(Ma) × (Mb) = (det M) M-T (a × b)

This relates the cross product of transformed vectors to the cross product of the vectors themselves.
c). Show that n' is therefore parallel to M-T n.
6.5.4. Three generic shapes: the sphere, cylinder, and cone.
We begin with three classic objects, generic versions of the sphere, cylinder, and cone. We develop
the implicit form and parametric form for each of these, and see how one might make meshes to approximate them. We also derive formulas for the normal direction at each point on these objects. Note
that we have already used OpenGL functions in Chapter 5 to draw these shapes. Adding our own tools
has several advantages, however: a). We have much more control over the detailed nature of the shape
being created; b). We have the object as an actual mesh that can be operated upon by methods of the
Mesh class.
The Generic Sphere.
We call the sphere of unit radius centered at the origin the generic sphere (see Figure 6.54a). It forms
the basis for all other sphere-like shapes we use. It has the familiar implicit form
F(x, y, z) = x² + y² + z² - 1        (6.28)

In the alternate notation F(P) we obtain the more elegant F(P) = |P|² - 1. (What would these forms be if
the sphere has radius R?)
A parametric description of this sphere comes immediately from the basic description of a point in
spherical coordinates (see Appendix 2). We choose to let u correspond to azimuth and v correspond to
latitude. Then any point P = (x, y, z) on the sphere has the representation (cos(v) cos(u), cos(v) sin(u),
sin(v)) in spherical coordinates (see Figure 6.54b). We let u vary over (0, 2π) and v vary over (-π/2, π/2)
to cover all such points. A parametric form for the sphere is therefore
Figure 6.54. a). The generic sphere, b) a parametric form. c). parallels and meridians
P(u, v) = (cos(v) cos(u), cos(v) sin(u), sin(v))
(6.29)
It's easy to check that this is consistent with the implicit form: substitute terms of Equation 6.29 into corresponding terms of Equation 6.28 and see that zero is obtained for any value of u and v.
(Question: What is the corresponding parametric form if the sphere instead has radius R and is centered
at (a, b, c)?)
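The consistency check just described is easy to carry out mechanically. This small C++ program (an illustrative test harness, not part of the Mesh class) samples Equation 6.29 over the (u, v) domain and confirms that Equation 6.28 evaluates to zero:

#include <cmath>
#include <cstdio>

// Implicit form of the generic sphere, Equation 6.28.
static double F(double x, double y, double z) { return x * x + y * y + z * z - 1.0; }

int main()
{
	const double PI = 3.14159265358979;
	for (double u = 0.0; u < 2 * PI; u += 0.5)          // sample the (u, v) domain
		for (double v = -PI / 2; v <= PI / 2; v += 0.25) {
			double x = std::cos(v) * std::cos(u);       // Equation 6.29
			double y = std::cos(v) * std::sin(u);
			double z = std::sin(v);
			if (std::fabs(F(x, y, z)) > 1e-12)
				std::printf("mismatch at u = %g, v = %g\n", u, v);
		}
	return 0;   // silence means the two forms agree at every sample
}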
For geographical reasons certain contours along a sphere are given common names. For this parametrization u-contours are called meridians, and v-contours are known as parallels, as suggested in Figure
6.54c. (Note that this classical definition of spherical coordinates, parallels and meridians causes the
sphere to appear to lie on its side. This is simply a result of how we sketch 3D figures, with the y-axis
pointing up.)
Different parametric forms are possible for a given shape. An alternative parametric form for the sphere
is examined in the exercises.
What is the normal direction n(u, v) of the sphere's surface at the point specified by parameters (u, v)?
Intuitively the normal vector is always aimed radially outward, so it must be parallel to the vector
from the origin to the point itself. This is confirmed by Equation 6.27: the gradient is simply 2(x, y, z),
which is proportional to P. Working with the parametric form, Equation 6.25 yields n(u, v) = -cos(v)p(u,
v), so n(u, v) is parallel to p(u, v) as expected. The scale factor -cos(v) will disappear when we normalize
n. We must make sure to use p(u, v) rather than -p(u, v) for the normal, so that it does indeed point radially outward.
The Generic Cylinder.
We adopt as the generic cylinder the cylinder whose axis coincides with the z-axis, has a circular cross
section of radius 1, and extends in z from 0 to 1, as pictured in Figure 6.55a.
Figure 6.55. a). The generic cylinder. b). The generic tapered cylinder.
It is convenient to view this cylinder as one member of the large family of tapered cylinders, as we did
in Chapter 5. Figure 6.55b shows the generic tapered cylinder, having a small radius of s when z = 1.
The generic cylinder is simply a tapered cylinder with s = 1. Further, the generic cone to be examined
next is simply a tapered cylinder with s = 0. We develop formulas for the tapered cylinder with an arbitrary value of s. These provide formulas for the generic cylinder and cone by setting s to 1 or 0, respectively.
If we consider the tapered cylinder to be a thin hollow shell, its wall is given by the implicit form
F(x, y, z) = x² + y² - (1 + (s - 1)z)², for 0 < z < 1        (6.30)

and by the parametric form

P(u, v) = ((1 + (s - 1)v) cos(u), (1 + (s - 1)v) sin(u), v)        (6.31)
for appropriate ranges of u and v (which ones?). What are these expressions for the generic cylinder
with s = 1?
When it is important to model the tapered cylinder as a solid object, we add two circular discs at its
ends: a base and a cap. The cap is a circular portion of the plane z = 1, characterized by the inequality x²
+ y² < s², or given parametrically by P(u, v) = (v cos(u), v sin(u), 1) for v in [0, s]. (What is the parametric representation of the base?)
The normal vector to the wall of the tapered cylinder is found using Equation 6.27. (Be sure to check
this). It is
n(x, y, z) = (x, y, -(s - 1)(1+ (s - 1)z) )
(6.32)
or in parametric form n(u, v) = (cos(u), sin(u), 1 - s). For the generic cylinder the normal is simply
(cos(u), sin(u), 0). This agrees with intuition: the normal is directed radially away from the axis of the
cylinder. For the tapered cylinder it is also directed radially, but shifted by a constant z-component.
(What are the normals to the cap and base?)
The Generic Cone.
We take as the generic cone the cone whose axis coincides with the z-axis, has a circular cross section
of maximum radius 1, and extends in z from 0 to 1, as pictured in Figure 6.56. It is a tapered cylinder
with small radius of s = 0. Thus its wall has implicit form

F(x, y, z) = x² + y² - (1 - z)², for 0 < z < 1        (6.33)

Figure 6.56. The generic cone.
and parametric form P(u, v) = ((1 - v) cos(u), (1 - v) sin(u), v) for azimuth u in [0, 2π] and v in [0, 1]. Using the results for the tapered cylinder again, the normal vector to the wall of the cone is (x, y, 1 - z). What
is it parametrically?
For easy reference Figure 6.57 shows the normal vector to the generic surfaces we have discussed.

surface             n(u, v) at p(u, v)          n(x, y, z)
sphere              p(u, v)                     (x, y, z)
tapered cylinder    (cos(u), sin(u), 1 - s)     (x, y, -(s - 1)(1 + (s - 1)z))
cylinder            (cos(u), sin(u), 0)         (x, y, 0)
cone                (cos(u), sin(u), 1)         (x, y, 1 - z)

Figure 6.57. Normal vectors to the generic surfaces.
Practice Exercises.
6.5.4. Alternative representation for the generic sphere. We can associate different geometric quantities with the parameters u and v and obtain a different parametric form for the sphere. We again use parameter u for azimuth, but use v for the height of the point above the xy-plane. All points at height v lie
on a circle of radius √(1 - v²), so

P2(u, v) = (√(1 - v²) cos(u), √(1 - v²) sin(u), v)        (6.34)

for u in [0, 2π] and v in [-1, 1]. Show that P2 lies unit distance from the origin for all u and v.
6.5.5. What's the surface? Let A be a fixed point with position vector a, and P be an arbitrary point
with position vector p. Describe in words and sketch the surface described by: a). p · a = 0; b). p · a = |a|;
c). |p × a| = |a|; d). p · a = p · p; e). p · a = |a||p|/2.
6.5.6. Finding the normal vector to the generic cylinder and cone. Derive the normal vector for the
generic tapered cylinder and the generic cone in two ways:
a). Using the parametric representation;
b). Using the implicit form, then expressing the result parametrically.
6.5.7. Transformed Spheres. Find the implicit form for a generic sphere that has been scaled in x by 2
and in y by 3, and then rotated 30° about the z-axis.
6.5.5. Forming a Polygonal Mesh for a Curved Surface.
Now we examine how to make a mesh object that approximates a smooth surface such as the sphere,
cylinder, or cone. The process is called polygonalization or tesselation, and it involves replacing the
surface by a collection of triangles and quadrilaterals. The vertices of these polygons lie in the surface itself, and they are joined by straight edges (which usually do not lie in the surface). One proceeds by
choosing a number of values of u and v, and sampling the parametric form for the surface at these
values to obtain a collection of vertices. These vertices are placed in a vertex list. A face list is then created: each face consists of three or four indices pointing to suitable vertices in the vertex list. Associated
with each vertex in a face is the normal vector to the surface. This normal vector is the normal direction
to the true underlying surface at each vertex location. (Note how this contrasts with the normal used
when representing a flat-faced polyhedron: there the vertex of each face is associated with the normal to
the face.)
Figure 6.58 shows how this works for the generic sphere. We think of slicing up the sphere along azimuth lines and latitude lines. Using OpenGL terminology of slices and stacks (see Section 5.6.3), we
choose to slice the sphere into nSlices slices around the equator, and nStacks stacks from the
south pole to the north pole. The figure shows the example of 12 slices and 8 stacks. The larger
nSlices and nStacks are, the better the mesh approximates a true sphere.
Figure 6.58. a). A mesh approximation to the generic sphere. b). Numbering the vertices.
To make slices we need nSlices values of u between 0 and 2π. Usually these are chosen to be equispaced: ui = 2πi/nSlices, i = 0, 1, ..., nSlices - 1. As for stacks, we put half of them above the
equator and half below. The top and bottom stacks will consist of triangles; all other faces will be quadrilaterals. This requires we define (nStacks + 1) values of latitude: vj = π/2 - πj/nStacks, j = 0, 1, ...,
nStacks.
The vertex list can now be created. The figure shows how we might number the vertices (the ordering is
a matter of convenience): We put the north pole in pt[0], the bottom points of the top stack into the next
12 vertices, etc. With 12 slices and 8 stacks there will be a total of 98 points (why?)
The normal vector list is also easily created: norm[k] will hold the normal for the sphere at vertex
pt[k]. norm[k] is computed by evaluating the parametric form of n(u,v) at the same (u,v) used for the
points. For the sphere this is particularly easy since norm[k] is the same as pt[k].
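A sketch of this vertex and normal construction in C++ follows. It assumes the Mesh fields of Figure 6.78 and hypothetical Point3/Vector3 constructors taking x, y, z; the method name and this particular vertex numbering (one vertex per ring sample plus the two poles) are illustrative choices:

#include <cmath>

void Mesh::makeSphereVertices(int nSlices, int nStacks)
{
	const double PI = 3.14159265358979;
	numVerts = numNorms = nSlices * (nStacks - 1) + 2;  // rings plus the two poles
	pt   = new Point3[numVerts];
	norm = new Vector3[numNorms];
	int k = 0;
	pt[k++] = Point3(0, 0, 1);                          // north pole (v = pi/2)
	for (int j = 1; j < nStacks; j++) {                 // each ring of latitude
		double v = PI / 2 - PI * j / nStacks;           // vj
		for (int i = 0; i < nSlices; i++) {
			double u = 2 * PI * i / nSlices;            // ui
			pt[k++] = Point3(std::cos(v) * std::cos(u),
			                 std::cos(v) * std::sin(u), std::sin(v));
		}
	}
	pt[k] = Point3(0, 0, -1);                           // south pole (v = -pi/2)
	for (int m = 0; m < numVerts; m++)                  // for the sphere, the normal
		norm[m] = Vector3(pt[m].x, pt[m].y, pt[m].z);   // equals the point itself
}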
For this example the face list will have 96 faces, of which 24 are triangles. We can put the top triangles
in the first 12 faces, the 12 quadrilaterals of the next stack down in the next 12 faces, etc. The first few
faces will contain the data:
number of vertices:    3        3        3        ...
vertex indices:        0 1 2    0 2 3    0 3 4    ...
normal indices:        0 1 2    0 2 3    0 3 4    ...
Note that for all meshes that try to represent smooth shapes the normIndex is always the same as the
vertIndex, so the data structure holds redundant information. (Because of this, one could use a more
streamlined data structure for such meshes. What would it be?) Polygonalization of the sphere in this
way is straightforward, but for more complicated shapes it can be very tricky. See [refs] for further discussions.
Ultimately we need a method, such as makeSurfaceMesh(), that generates such meshes for a given
surface P(u, v). We discuss the implementation of such a function in Case Study 6.13.
Note that some graphics packages have routines that are highly optimized when they operate on triangles. To exploit these we might choose to polygonalize the sphere into a collection of triangles, subdividing each quadrilateral into two triangles.
A simple approach would use the same vertices as above, but alter the face list replacing each quadrilateral with two triangles. For instance, a face that uses vertices 2, 3, 15, 14 might be subdivided into two
triangles, once using 2, 3, 15 and the other using 2, 15, 14.
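In code, the subdivision is a simple copy of indices, sketched here with the Face and VertexId structures of Figure 6.78 (splitQuad is a hypothetical helper):

// Split one quadrilateral face f with vertices (a, b, c, d) into the two
// triangles (a, b, c) and (a, c, d), as in the 2,3,15,14 example above.
void splitQuad(const Face& f, Face& t1, Face& t2)
{
	t1.nVerts = 3; t1.vert = new VertexId[3];
	t2.nVerts = 3; t2.vert = new VertexId[3];
	t1.vert[0] = f.vert[0]; t1.vert[1] = f.vert[1]; t1.vert[2] = f.vert[2];
	t2.vert[0] = f.vert[0]; t2.vert[1] = f.vert[2]; t2.vert[2] = f.vert[3];
}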
The sphere is a special case of a surface of revolution, which we treat in Section 6.5.7. The tapered cylinder is also a surface of revolution. It is straightforward to develop a mesh model for the tapered cylinder. Figure 6.59 shows the tapered cylinder approximated with nSlices = 10 and nStacks = 1. A
decagon is used for its cap and base. (If you prefer to use only triangles in a mesh, the walls, the cap, and
the base could be dissected into triangles. (How?))
6.5.6. Ruled Surfaces.
Recall that the straight line between two points P0 and P1 has the parametric form P(v) = (1 - v)P0 + vP1, where P0 and P1
are points. But for ruled surfaces the points P0 and P1 become functions of another parameter u: P0 becomes P0(u), and P1 becomes P1(u). Thus the ruled surfaces that we examine have the parametric form
P(u, v) = (1 - v) P0(u) + v P1(u)
(6.35)
The functions P0(u) and P1(u) define curves lying in 3D space. Each is described by three component
functions, as in P0(u) = (X0(u), Y0(u), Z0(u)). Both P0(u) and P1(u) are defined on the same interval in u
(commonly from 0 to 1). The ruled surface consists of one straight line joining each pair of corresponding points, P0(u) and P1(u), for each u in (0,1), as indicated in Figure 6.60. At v = 0 the surface is at
P0(u), and at v = 1 it is at P1(u). The straight line at a fixed value of u is often called the ruling at u.
Figure 6.60. A ruled surface as a family of straight lines.
For a particular fixed value, v the v-contour is some blend of the two curves P0(u) and P1(u). It is an affine combination of them, with the first weighted by (1 - v) and the second by v. When v is close to 0,
the shape of the v-contour is determined mainly by P0(u), whereas when v is close to 1, the curve P1(u)
has the most influence.
If we restrict v to lie between 0 and 1, only the line segment between corresponding points on the curves
will be part of the surface. On the other hand, if v is not restricted, each line will continue forever in both
directions, and the surface will resemble an unbounded curved sheet. A ruled patch is formed by restricting the range of both u and v, to values between, say, 0 and 1.
A ruled surface is easily polygonalized in the usual fashion: choose a set of samples ui and vj, and compute the position P(ui, vj) and normal n(ui, vj) at each. Then build the lists as we have done before.
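For instance, the following C++ sketch evaluates Equation 6.35 at any (u, v); the two boundary curves chosen here (a circle and its translate, giving a circular cylinder) and the names are arbitrary illustrations:

#include <cmath>

struct Pnt3 { double x, y, z; };

// Boundary curves: P0 is a unit circle in the plane z = 0, and P1 is the same
// circle translated by d = (0, 0, 1).
static Pnt3 P0(double u) {
	const double PI = 3.14159265358979;
	return { std::cos(2 * PI * u), std::sin(2 * PI * u), 0.0 };
}
static Pnt3 P1(double u) { Pnt3 p = P0(u); p.z += 1.0; return p; }

// The ruled surface of Equation 6.35: an affine combination of P0(u) and P1(u).
Pnt3 ruledPoint(double u, double v)
{
	Pnt3 a = P0(u), b = P1(u);
	return { (1 - v) * a.x + v * b.x, (1 - v) * a.y + v * b.y,
	         (1 - v) * a.z + v * b.z };
}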
Some special cases of ruled surfaces will reveal their nature as well as their versatility. We discuss three
important families of ruled surfaces, the cone, the cylinder, and the bilinear patch.
Cones.
A cone is a ruled surface for which one of the curves, say, P0(u), is a single point P0(u) = P0, the apex
of the cone, as suggested in Figure 6.61a. In Equation 6.35 this restriction produces:
P(u, v) = (1 - v) P0 + v P1(u)        {a general cone}        (6.36)
For this parameterization all lines pass through P0 at v = 0, and through P1(u) at v = 1. Certain special
cases are familiar. A circular cone results when P1(u) is a circle, and a right circular cone results when
the circle lies in a plane that is perpendicular to the line joining the circle's center to P0. The specific example shown in Figure 6.61b uses P1(u) = (r(u) cos u, r(u) sin u, 1) where the radius curve r(u) varies
sinusoidally: r(u) = 0.5 + 0.2 cos(5u).
Cylinders.
A cylinder is a ruled surface for which P1(u) is simply a translated version of P0(u): P1(u) = P0(u) + d,
for some vector d, as shown in Figure 6.62a. Sometimes one speaks of sweeping the line with endpoints P0(0) and P0(0) + d (often called the generator) along the curve P0(u) (often called the directrix), without altering the direction of the line.
Substituting P1(u) = P0(u) + d in Equation 6.35 gives

P(u, v) = P0(u) + v d        {a general cylinder}        (6.37)
To be a true cylinder, the curve P0(u) is confined to lie in a plane. If P0(u) is a circle the cylinder is a circular cylinder. The direction d need not be perpendicular to this plane, but if it is, the surface is called a
right cylinder. This is the case for the generic cylinder. Figure 6.62b shows a ribbon candy cylinder
where P0(u) undulates back and forth like a piece of ribbon. The ribbon shape is explored in the exercises.
Bilinear Patches.
A bilinear patch is formed when both P0(u) and P1(u) are straight line segments defined over the same
interval in u, say, 0 to 1. Suppose the endpoints of P0(u) are P00 and P01, (so that P0(u) is given by (1 - u)P00 + u P01), and the endpoints of P1(u) are P10 and P11. Then using Equation 6.35 the patch is given parametrically by:
P(u, v) = (1 - v)(1 - u)P00 + (1 - v) u P01 + v(1 - u)P10 + uv P11
(6.38)
This surface is called bilinear because its dependence is linear in u and linear in v. Bilinear patches
need not be planar; in fact they are planar only if the lines P0(u) and P1(u) lie in the same plane (see the
exercises). Otherwise there must be a twist in the surface as we move from one of the defining lines to
the other.
An example of a nonplanar bilinear patch is shown in Figure 6.63. P0(u) is the line from (2, -2, 2) to (2,
2, -2), and P1(u) is the line from (-2,-2,-2) to (-2, 2, 2). These lines are not coplanar. Several u-contours
are shown, and the twist in the patch is clearly visible.
Figure 6.64. a). Double helix, b). Möbius strip, c). Vaulted roof.
Bilinearly blended Surfaces - Coons Patches.
An interesting and useful generalization of a ruled surface, which interpolates two boundary curves P0(u)
and P1(u), is a bilinearly blended patch that interpolates to four boundary curves. This family was first
developed by Steven Coons [coons] and is sometimes called a Coons patch.
Figure 6.65 shows four adjoining boundary curves, named pu0(u), pu1(u), p0v(v), and p1v(v). These curves
meet at the patch corners (where u and v are combinations of 0 and 1) but otherwise have arbitrary
shapes. This therefore generalizes the bilinear patch for which the boundary curves are straight lines.
P(u, v) = (1 - v) pu0(u) + v pu1(u) + (1 - u) p0v(v) + u p1v(v)
          - (1 - u)(1 - v) pu0(0) - u(1 - v) pu0(1) - (1 - u)v pu1(0) - uv pu1(1)        (6.39)
Note that at each (u, v) this is still an affine combination of points, as we insist. Check that at (u, v)=(0,
0) this evaluates to pu0(0), and similarly that it coincides with the other three corners at the other extreme
values of u and v. Figure 6.67 shows an example Coons patch bounded by curves that have a sinusoidal
oscillation.
More formally, a meridian is the intersection of the surface with a plane that contains the axis of revolution,
and a parallel is the intersection of the surface with a plane perpendicular to the axis.
6.5.7. Surfaces of Revolution.
A surface of revolution is generated by revolving a profile curve C(v) = (X(v), Z(v)), lying in the xz-plane, about the z-axis. A point (X(v), 0, Z(v)) on the profile sweeps out a circle of radius X(v) at height Z(v), so the surface has parametric form

P(u, v) = (X(v) cos(u), X(v) sin(u), Z(v))        (6.40)
The generic sphere, tapered cylinder, and cone are all familiar special cases. (What are their profiles?)
The normal vector to a surface of revolution is easily found by direct application of Equation 6.25 to
Equation 6.40 (see the exercises). This yields

n(u, v) = X(v) (Z'(v) cos(u), Z'(v) sin(u), -X'(v))        (6.41)

where the prime denotes the first derivative of the function. The scaling factor X(v) disappears upon normalization of the vector. This result specializes to the forms we found above for the simple generic
shapes (see the exercises).
For example, the torus is generated by sweeping a displaced circle about the z-axis, as shown in Figure
6.68. The circle has radius A and is displaced along the x-axis by D, so that its profile is C(v) = (D + A
cos(v), A sin(v)). Therefore the torus (Figure 6.68b) has representation

P(u, v) = ((D + A cos(v)) cos(u), (D + A cos(v)) sin(u), A sin(v))        (6.42)
Figure 6.69 shows another example in which we try to model the dome of the exquisite Taj Mahal in
Agra, India, shown in part a. Part b shows the profile curve in the xz-plane, and part c shows the resulting
surface of revolution. Here we describe the profile by a collection of data points Ci = (Xi, Zi), since no
suitable parametric formula is available. (We rectify this lack in Chapter 8 by using a B-spline curve to
form a smooth parametric curve based on a set of data points.)
Figure 6.69. a). The Taj Mahal. b). The profile curve. c). The resulting surface of revolution.

Practice Exercises.
6.5.18. Revolving about an arbitrary axis. a). Show that if the profile (X(v), 0, Z(v)) is revolved about an axis r, the resulting surface has the parametric form

(X(u, v), Y(u, v), Z(u, v), 1) = R_r(u) (X(v), 0, Z(v), 1)

where R_r(u) is the matrix that rotates through angle u about the axis r.
b). Check this for the special case of rotation about the z-axis.
c). Repeat part b for rotations about the x-axis, and about the y-axis.
6.5.19. Finding normal vectors. a). Apply Equation 6.40 to Equation 6.25 to derive the form in Equation 6.41 for the normal vector to a surface of revolution. b). Use this result to find the normal to each of
the generic sphere, cylinder, and cone, and show the results agree with those found in Section 6.5.3.
Show that the normal vector to the torus has the form
n(u, v) = (cos(v)cos(u), cos(v)sin(u), sin(v))(D + A cos(v)). Also, find the inside-outside function for the
torus, and compute the normal using its gradient.
6.5.20. An Elliptical Torus. Find the parametric representation for the following two surfaces of revolution: a). The ellipse given by (a cos(v), b sin(v)) is first displaced R units along the x-axis and then revolved about the y-axis. b). The same ellipse is revolved about the x-axis.
6.5.21. A Lissajous of Revolution. Sketch what the surface would look like if the Lissajous figure of
Equation 6.19 with M = 2, N = 3, and φ = 0 is rotated about the y-axis.
6.5.22. Revolved n-gons. Sketch the surface generated when a square having vertices (1, 0, 0), (0, 1, 0),
( - 1, 0, 0), (0, - 1, 0) is revolved about the y-axis. Repeat for a pentagon and a hexagon.
6.5.8. The Quadric Surfaces.
An important family of surfaces, the quadric surfaces, are the 3D analogs of the conic sections (the ellipse,
parabola, and hyperbola), which we examined in Chapter 3. Some of the quadric surfaces have beautiful
shapes and can be put to good use in graphics.
shapes and can be put to good use in graphics.
The six quadric surfaces are illustrated in Figure 6.70.
Figure 6.70. The six quadric surfaces: a. Ellipsoid, b. Hyperboloid of one sheet, c. Hyperboloid of two
sheets, d. Elliptic cone, e. Elliptic paraboloid, f. Hyperbolic paraboloid.
We need only characterize the generic versions of these shapes, since we can obtain all the variations
of interest by scaling, rotating, and translating the generic shapes. For example, the ellipsoid is usually
said to have the inside-outside function

F(x, y, z) = (x/a)² + (y/b)² + (z/c)² - 1        (6.43)
so that it extends in x from -a to a, in y from -b to b, and in z from -c to c. This shape may be obtained
from the form of the generic sphere by scaling it in x, y, and z by a, b, and c, respectively, as we describe
below. We can obtain rotated versions of the ellipsoid in a similar manner.
Figure 6.71 provides descriptions of the six generic quadric surfaces, giving both their implicit and
parametric forms. We discuss some interesting properties of each shape later.
name of quadric             inside-outside function   parametric form                          v-range, u-range
ellipsoid                   x² + y² + z² - 1          (cos(v)cos(u), cos(v)sin(u), sin(v))     (-π/2, π/2), (-π, π)
hyperboloid of one sheet    x² + y² - z² - 1          (sec(v)cos(u), sec(v)sin(u), tan(v))     (-π/2, π/2), (-π, π)
hyperboloid of two sheets   x² - y² - z² - 1          (sec(v)sec(u), sec(v)tan(u), tan(v))     (-π/2, π/2), (-π/2, π/2)
elliptic cone               x² + y² - z²              (v cos(u), v sin(u), v)                  any real, (-π, π)
elliptic paraboloid         x² + y² - z               (v cos(u), v sin(u), v²)                 v ≥ 0, (-π, π)
hyperbolic paraboloid       -x² + y² - z              (v tan(u), v sec(u), v²)                 v ≥ 0, (-π, π)

Figure 6.71. The six generic quadric surfaces.
For the ellipsoid, for instance, the gradient is ∇F = (2x, 2y, 2z), and so the normal in parametric form is (after deleting the 2)

n(u, v) = (cos(v)cos(u), cos(v)sin(u), sin(v))        (6.44)
Normals for the other quadric surfaces follow just as easily, and so they need not be tabulated.
Practice exercises.
6.5.23. The Hyperboloid is a Ruled Surface. Show that the implicit form of a hyperboloid of one sheet
can be written (x + z)(x - z) = (1 - y)(1 + y). Show that therefore two families of straight lines lie in the
surface: the family x - z = A(1 - y) and the family A(x + z) = 1 + y, where A is a constant. Sketch these
families for various values of A. Examine similar rulings in the hyperbolic paraboloid.
6.5.24. The hyperboloid of one sheet. Show that an alternative parametric form for the hyperboloid of
one sheet is p(u, v) = (cosh(v) cos(u), cosh(v) sin(u), sinh(v)).
6.5.25. Traces of Quadrics are Conics. Consider any three (noncollinear) points lying on a quadric surface. They determine a plane that cuts through the quadric, forming the trace curve. Show that this curve
is always a parabola, ellipse, or hyperbola.
6.5.26. Finding Normals to the Quadrics. Find the normal vector in parametric form for each of the six
quadric surfaces.
6.5.27. The Hyperboloid As a Ruled Surface. Suppose (x0, y0, 0) is a point on the hyperboloid of one
sheet. Show that the vector R(t) = (x0 + y0 t, y0 - x0 t, t)
describes a straight line that lies everywhere on the hyperboloid and passes through (x0, y0, 0). Is this sufficient to make the surface a ruled surface? Why or why not? [apostol p.329]
6.5.28. The Hyperbolic Paraboloid As a Ruled Surface. Show that any plane parallel to the line y = x cuts the hyperbolic paraboloid along a straight line.
6.5.9. The Superquadrics.
Following the work of Alan Barr [barr81], we can extend the quadric surfaces to a vastly larger family,
in much the way we extended the ellipse to the superellipse in Chapter 3. This provides additional interesting surface shapes we can use as models in applications.
Barr defines four superquadric solids: the superellipsoid, superhyperboloid of one sheet, and superhyperboloid of two sheets, which together extend the first three quadric surfaces, and the supertoroid which
extends the torus. The extensions introduce two bulge factors, m and n, to which various terms are
raised. These bulge factors affect the surfaces much as n does for the superellipse. When both factors
equal 2, the first three superquadrics revert to the quadric surfaces catalogued previously. Example
shapes for the four superquadrics are shown in Figure 6.73.
Figure 6.73. Examples of the four superquadrics. n1 and n2 are (left to right): 10, 2, 1.11, .77, and .514.
(Courtesy of Jay Greco.)
The inside-outside functions and parametric forms are given in Figure 6.74.
name of quadric                 implicit form                          parametric form                                                                        v-range, u-range
superellipsoid                  (x^n + y^n)^(m/n) + z^m - 1            (cos^(2/m)(v)cos^(2/n)(u), cos^(2/m)(v)sin^(2/n)(u), sin^(2/m)(v))                     [-π/2, π/2], [-π, π]
superhyperboloid of one sheet   (x^n + y^n)^(m/n) - z^m - 1            (sec^(2/m)(v)cos^(2/n)(u), sec^(2/m)(v)sin^(2/n)(u), tan^(2/m)(v))                     (-π/2, π/2), [-π, π]
superhyperboloid of two sheets  (x^n - y^n)^(m/n) - z^m - 1            (sec^(2/m)(v)sec^(2/n)(u), sec^(2/m)(v)tan^(2/n)(u), tan^(2/m)(v))                     (-π/2, π/2), (-π/2, π/2)
supertoroid                     ((x^n + y^n)^(1/n) - d)^m + z^m - 1    ((d + cos^(2/m)(v))cos^(2/n)(u), (d + cos^(2/m)(v))sin^(2/n)(u), sin^(2/m)(v))         [-π, π), [-π, π)

Figure 6.74. The superquadrics.
Keep in mind that it's illegal to raise a negative value to a fractional exponent. So expressions such as
cos^(2/m)(v) should be evaluated as cos(v)|cos(v)|^(2/m - 1) or the equivalent.
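In code this sign-preserving power is a one-line helper; the following C++ sketch (the function names are illustrative) applies it to evaluate a point on the superellipsoid of Figure 6.74:

#include <cmath>

// Evaluate t^e while preserving the sign of t: t|t|^(e-1), per the note above.
static double spow(double t, double e)
{
	return (t == 0.0) ? 0.0 : t * std::pow(std::fabs(t), e - 1.0);
}

// A point on the superellipsoid at parameters (u, v), with bulge factors m and n.
void superellipsoidPoint(double u, double v, double m, double n,
                         double& x, double& y, double& z)
{
	x = spow(std::cos(v), 2 / m) * spow(std::cos(u), 2 / n);
	y = spow(std::cos(v), 2 / m) * spow(std::sin(u), 2 / n);
	z = spow(std::sin(v), 2 / m);
}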
These are generic superquadrics in the sense that they are centered at the origin, aligned with the coordinate axes, and have unit dimensions. Like other shapes they may be scaled, rotated, and translated as
desired using the current transformation to prepare them properly for use in a scene.
The normal vector n(u, v) can be computed for each superquadric in the usual ways. We shall give only
the results here.
Normals to the superellipsoid and supertoroid.
The normal vectors for the superellipsoid and the supertoroid are the same:
n(u, v) = (cos^(2 - 2/m)(v) cos^(2 - 2/n)(u), cos^(2 - 2/m)(v) sin^(2 - 2/n)(u), sin^(2 - 2/m)(v))        (6.45)

(How can they be the same? The two surfaces surely don't have the same shape.)
The Superhyperboloid of one sheet.
The normal vector to the superhyperboloid of one sheet is the same as that of the superellipsoid, except
all occurrences of cos(v) are replaced with sec(v) and those of sin(v) are replaced with tan(v). Do not alter cos(u) or sin(u) or any other term.
The Superhyperboloid of two sheets.
For the superhyperboloid of two sheets, the trigonometric functions in both u and v are replaced. Replace
all occurrences of cos(v) with sec(v), those of sin(v) with tan(v), those of cos(u) with sec(u), and those of
sin(u) with tan(u).
Practice Exercises.
6.5.29. Extents of Superquadrics. What are the maximum x, y, and z values attainable for the superellipsoid and the supertoroid?
6.5.30. Surfaces of revolution. Determine the values of bulge factors m and n for which each of the superquadrics is a surface of revolution, and find the axis of revolution. Describe any other symmetries in
the surfaces.
6.5.31. Deriving the Normal Vectors. Derive the formula for the normal vector to each superquadric surface.
6.5.10. Tubes Based on 3D Curves.
In Section 6.4.4 we studied tubes that were based on a spine curve C(t) meandering through 3D space.
A polygon was stationed at each of a selection of spine points, and oriented according to the Frenet
frame computed there. Then corresponding points on adjacent polygons were connected to form a flat-faced tube along the spine.
Here we do the same thing, except we compute the normal to the surface at each vertex, so that smooth
shading can be performed. Figure 6.75 shows the example of a tube wrapped around a helical shape.
Compare this with Figure 6.47.
If we wish to wrap a circle (cos(u), sin(u), 0) about the spine C(t), the resulting surface has parametric
representation
P(u, v) = C(v) + cos(u) N(v) + sin(u) B(v)        (6.46)
where the normal vector N(t) and binormal vector B(t) are those given in Equations 6.14 and 6.15. Now
we can build a mesh for this tube in the usual way, by taking samples of P(u, v), building the vertex,
normal, and face lists, etc. (What would be altered if we wrapped a cycloid - recall Figure 3.77 - about
the spine instead of a circle?)
6.5.11. Surfaces based on Explicit Functions of Two Variables.
Many surface shapes are single-valued in one dimension, so their position can be represented as an explicit function of two of the independent variables. For instance, there may be a single value of height
of the surface above the xz-plane for each point (x, z), as suggested in Figure 6.76. We can then say that
the height of the surface at (x, z) is some f(x, z). Such a function is sometimes called a height field
[bloomenthal97]. A height field is often given by a formula such as
f(x, z) = e^(-ax² - bz²)        (6.47)
(where a and b are given constants), or the circularly symmetric sinc function
. f ( x, z ) =
sin( x 2 + z 2 )
(6.48)
x2 + z2
Contrast this with surfaces such as the sphere, for which more than one value of y is associated with each
point (x, z). Single valued functions permit a simple parametric form:
P(u, v) = (u, f(u, v), v)
(6.49)
and their normal vector is n(u, v) = (-∂f/∂u, 1, -∂f/∂v) (check this). That is, u and v can be used
directly as the independent variables of the function. Thus u-contours lie in planes of constant x, and v-contours lie in planes of constant z. Figure 6.77a shows a view of the example in Equation 6.47, and Figure 6.77b shows the function of Equation 6.48.
Figure 6.77. Two height fields. a) gaussian, b). sinc function. (files: fig6.77a.bmp, fig6.77b.bmp)
Each line is a trace of the surface cut by a plane, x = k or z = k, for some value of k. Plots such as these
can help illustrate the behavior of a mathematical function.
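A small C++ sketch of such a plot: it samples the sinc function of Equation 6.48 on a grid, producing the vertices (u, f(u, v), v) of Equation 6.49. The grid size and extent are arbitrary illustrative choices:

#include <cmath>
#include <cstdio>

// The circularly symmetric sinc of Equation 6.48, with the removable
// singularity at the origin patched.
static double sinc2(double x, double z)
{
	double r = std::sqrt(x * x + z * z);
	return (r == 0.0) ? 1.0 : std::sin(r) / r;
}

int main()
{
	const int N = 40;                          // grid resolution
	for (int i = 0; i <= N; i++)
		for (int j = 0; j <= N; j++) {
			double u = -8.0 + 16.0 * i / N;    // x runs over [-8, 8]
			double v = -8.0 + 16.0 * j / N;    // z runs over [-8, 8]
			std::printf("%g %g %g\n", u, sinc2(u, v), v);  // vertex (u, f, v)
		}
	return 0;
}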
Practice exercise 6.5.32. The Quadrics as Explicit Functions. The elliptic paraboloid can be written as
z = f(x, y), so it has an alternate parametric form (u, v, f(u, v)). What is f()? In what ways is this alternate
parametric form useful? What other quadrics can be represented this way? Can any superquadrics be represented explicitly this way?
6.6. Summary
This chapter is concerned with modeling and drawing a wide variety of surfaces of 3D objects. This involves
finding suitable mathematical descriptions for surface shapes, and creating efficient data structures that
hold sufficient detail about a surface to facilitate rendering of the surface. We developed the Mesh class,
whose data fields include three lists: the vertex, normal vector, and face lists. This data structure can efficiently hold all relevant geometric data about a flat-faced object such as a polyhedron, and it can hold sufficient data to model a polygonal skin that approximates other smoothly curved surfaces.
We showed that once a mesh data structure has been built, it is straightforward to render it in an OpenGL environment. It is also easy to store a mesh in a file, and to read it back again into a program.
Modern shading algorithms use the normal vector at each vertex of each face to determine how light or dark
the different points within a face should be drawn. If the face should be drawn flat, the same normal vector - the normal vector to the face itself - is used for every vertex normal. If the mesh is designed to represent an
underlying smoothly curved surface the normal vector at each vertex is set to the normal of the underlying
surface at that point, and rendering algorithms use a form of interpolation to produce gracefully varying shades
in the picture (as we discuss in Chapter 11). Thus the choice of what normal vectors to store in a mesh depends
on how the designer wishes the object to appear.
A wide variety of polyhedral shapes that occur in popular applications were examined, and techniques were
developed that build meshes for several polyhedral families. Here special care was taken to use the normal to
the face in each of the face's vertex normals. We also studied large families of smoothly varying objects, including the classical quadric surfaces, cylinders, and cones, and discussed how to compute the direction of the
normal vector at each point by suitable derivatives of the parametric form for the surface.
The Case Studies in the next section elaborate on some of these ideas, and should not be skipped. Some of
them probe further into theory. A derivation of the Newell method to compute a normal vector is outlined, and
you are asked to fill in various details. The family of quadric surfaces is seen to have a unifying matrix form
that reveals its underlying structure, and a congenial method for transforming quadrics is described. Other
Case Studies ask that you develop methods or applications to create and draw meshes for the more interesting
classes of shapes described.
1 0 0
1 1 0
1 1.5 0
0 1 0
1 0 1
1 1 1
1 1.5 1
0 1 1
-0.447 0.8944 0
0.447 0.8944 0
0 -1 0
0 0 1
0 0 -1
4  0 5 9 4    0 0 0 0
4  3 4 9 8    1 1 1 1
4  2 3 8 7    2 2 2 2
4  1 2 7 6    3 3 3 3
4  0 1 6 5    4 4 4 4
5  5 6 7 8 9  5 5 5 5 5
5  0 4 3 2 1  6 6 6 6 6
Here the first face is a quadrilateral based on the vertices numbered 0,5,9,4, and the last two faces are pentagons.
To read a mesh into a program from a file you might wish to use the code in Figure 6.78. Given a filename, it
opens and reads the file into an existing mesh object, and returns 0 if it can do this successfully. It returns nonzero if an error occurs, such as when the named file cannot be found. (Additional testing should be done within
the method to catch formatting errors, such as a floating point number when an integer is expected.)
#include <fstream>   // for fstream
using namespace std;

int Mesh::readmesh(char* fileName)
{
fstream infile;
infile.open(fileName, ios::in);
if(infile.fail()) return -1; // error - can't open file
if(infile.eof()) return -1; // error - empty file
infile >> numVerts >> numNorms >> numFaces;
pt = new Point3[numVerts];
norm = new Vector3[numNorms];
face = new Face[numFaces];
//check that enough memory was found:
if( !pt || !norm || !face)return -1; // out of memory
for(int p = 0; p < numVerts; p++) // read the vertices
infile >> pt[p].x >> pt[p].y >> pt[p].z;
for(int n = 0; n < numNorms; n++) // read the normals
infile >> norm[n].x >> norm[n].y >> norm[n].z;
for(int f = 0; f < numFaces; f++)// read the faces
{
infile >> face[f].nVerts;
face[f].vert = new VertexId[face[f].nVerts];
for(int i = 0; i < face[f].nVerts; i++)
infile >> face[f].vert[i].vertIndex
>> face[f].vert[i].normIndex;
}
return 0; // success
}
Figure 6.78. Reading a mesh into memory from a file.
Because no knowledge of the required mesh size is available before numVerts, numNorms, and numFaces
is read, the arrays that hold the vertices, normals, and faces are allocated dynamically at runtime with the
proper sizes.
A number of files in this format are available on the internet site for this book. (They have the suffix .3vn)
Also, it is straightforward to convert IndexedFace objects in VRML2.0 files to this format.
It is equally straightforward to fashion the method int Mesh::writeMesh(char* fileName) that
writes a mesh object to a file.
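One possible implementation simply mirrors readMesh(): it writes the three counts, then the vertex and normal lists, then each face. This is only a sketch, assuming the same Mesh fields used above:

int Mesh::writeMesh(char* fileName)
{
	fstream outfile;
	outfile.open(fileName, ios::out);
	if(outfile.fail()) return -1; // error - can't open file
	outfile << numVerts << " " << numNorms << " " << numFaces << "\n";
	for(int p = 0; p < numVerts; p++) // write the vertices
		outfile << pt[p].x << " " << pt[p].y << " " << pt[p].z << "\n";
	for(int n = 0; n < numNorms; n++) // write the normals
		outfile << norm[n].x << " " << norm[n].y << " " << norm[n].z << "\n";
	for(int f = 0; f < numFaces; f++) // write each face
	{
		outfile << face[f].nVerts << "\n";
		for(int i = 0; i < face[f].nVerts; i++) // vertex indices
			outfile << face[f].vert[i].vertIndex << " ";
		outfile << "\n";
		for(int i = 0; i < face[f].nVerts; i++) // normal indices
			outfile << face[f].vert[i].normIndex << " ";
		outfile << "\n";
	}
	return 0; // success
}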
Write an application that reads mesh objects from files and draws them, and also allows the user to write a
mesh object to a file. As examples of simple files for getting started, arrange that the application can also create
meshes for a tetrahedron and the simple barn.
6.7.2. Case Study 6.2. Derivation of the Newell Method.
(Level of Effort: II). This study develops the theory behind the Newell method for computing the normal to a
polygon based on its vertices. The necessary mathematics are introduced as the discussion unfolds;
you are asked to show several of the intermediate results.
In these discussions we work with the polygonal face P given by the N 3D vertices:
P = {P1, P2, ..., PN}    (6.50)
We want to show why the formulas in Equation 6.1 provide an exact computation of the normal vector m =
(mx, my, mz) to P when P is planar, and a good direction to use as an average normal when P is nonplanar.
Derivation:
A). Figure 6.79 shows P projected (orthographically - along the principal axes) onto each of the principal
planes: the x = 0, y = 0, and z = 0 planes. Each projection is a 2D polygon. We first show that the components
of m are proportional to the areas, Ax, Ay, and Az, respectively, of these projected polygons.
Figure 6.79. P projected onto the three principal planes; the projection onto the z = 0 plane has area Az.

We first establish this for a single triangle T with unit normal m: when T is projected onto a plane with unit normal n, its area scales according to

Area(T') = (m · n) Area(T)

Suppose triangle T has edges defined by the vectors v and w as shown in the figure.
a). Show that the area of T is Area(T) = ½|v × w|.
We now explore the projection of T onto the plane with normal vector n.
Figure 6.80. Effect of orthographic projection on area.
This projection T' is defined by the projected vectors v' and w'; its area is Area(T') = ½|w' × v'|.
b). Using ideas from Chapter 4, show that v projects to v' given by v' = v - (v · n)n, and similarly that w' = w - (w · n)n.
So we need only relate the sizes of the two cross products.
c). Use the forms of v' and w' to show that

v' × w' = v × w - (w · n)(v × n) + (v · n)(w × n) + (v · n)(w · n)(n × n)

and explain why the last term vanishes. Thus we have

2 Area(T') n = v × w - (w · n)(v × n) + (v · n)(w × n)

d). Dot both sides of this with n and show that the last two terms drop out, and that we have 2 Area(T') = (v × w) · n = 2 Area(T) m · n, as claimed.
e). Show that this result generalizes to the areas of any planar polygon P and its projected image P'.
f). Recalling that a dot product is proportional to the cosine of an angle, show that Area(T') = Area(T) cos(θ), and state what the angle θ is.
g). Show that the areas Ax, Ay, Az defined above are simply Kmx, Kmy, and Kmz, respectively, where K is some
constant. Hence the areas Ax, Ay, and Az are in the same ratios as mx, my, and mz.
B). So to find m we need only compute the vector (Ax, Ay, Az), and normalize it to unit length. We now show
how to compute the area of the projection of P of Equation 6.50 onto the xy-plane directly from its vertices.
The other two projected areas follow similarly.
Each 3D vertex Pi = (xi, yi, zi) projects onto the xy-plane as Vi = (xi, yi). Figure 6.81 shows an example projected polygon P'. Associate with the i-th edge of P' the signed area

Ai = ½ (xi - xnext(i))(yi + ynext(i))
j). Show that the sum of the Ai properly adds the positive and negative contributions of area to form a resultant
sum that is either the area of the polygon, or its negative.
Now we ask: in which of the two possible directions does m point? That is, if you curl the fingers of your right
hand around the vertices of the polygon, moving from P0 to P1 to P2, etc. (the direction of the arrow in Figure
6.82), does m point along your thumb or in the opposite direction?
Figure 6.82. The direction of the normal found by the Newell method.
k). Show that m does point as shown in Figure 6.82. Thus for a mesh that has a well-defined inside and outside, we
can say that m is the outward-pointing normal if the vertices are labeled CCW as seen from the outside.
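The Newell formulas themselves drop easily into code. Here is a minimal sketch, assuming the Point3 and Vector3 types used elsewhere in this chapter (the function name is ours):

Vector3 newellNormal(Point3 p[], int N)
{ // accumulate the three projected-area sums, then normalize
	Vector3 m(0, 0, 0);
	for(int i = 0; i < N; i++)
	{
		int next = (i + 1) % N; // wrap around to vertex 0
		m.x += (p[i].y - p[next].y) * (p[i].z + p[next].z);
		m.y += (p[i].z - p[next].z) * (p[i].x + p[next].x);
		m.z += (p[i].x - p[next].x) * (p[i].y + p[next].y);
	}
	m.normalize(); // scale to unit length
	return m;
}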
6.7.3. Case Study 6.3. The Prism.
(Level of Effort: III) Write an application that allows the user to specify the polygonal base of a prism using
the mouse. It then creates the vertex, normal, and face lists for the prism, and displays it.
Figure 6.83a shows the user's drawing area, a square presented on the screen. The user lays down a sequence
of points in this square with the mouse, terminating the process with a right click.
Figure 6.83. a). the user's drawing area; b). the resulting prism.
6.7.4. Case Study 6.4. Prism Arrays and Extruded Quad-Strips.
(Level of Effort: III) Write and test methods that create meshes for an array of prisms, and for an extruded quad-strip.
a). Arrays of prisms: Choose an appropriate data type to represent an array of prisms. Note that makePrismArray() is similar to the method that makes a mesh for a single prism. Exercise the method on at least
the two block letters with the shapes 'K' and 'W'. (Try 'D' also if you wish.)
b). Extruded quad-strips used to form tubes: The process of building the vertex, normal, and face lists of a
mesh is really a matter of keeping straight the many indices for these arrays. To assist in developing this
method, consider a quad-strip base polygon described as in Equation 6.9 by the vertices
quad-strip = {p0, p1, ..., pM-1}
where pi = (xi, yi, 0) lies in the xy-plane, as shown in Figure 6.84a. When extruded, each successive pair of
vertices forms a waist of the tube, as shown in Figure 6.84b. There are num = M/2 - 1 segments in the tube.
Figure 6.84. Building a mesh from a quad-strip base polygon: a). the quad-strip in the xy-plane; b). the four extruded segments.
The 0-th waist consists of vertices p0, p1, p1 + d, and p0 + d, where d is the extrusion vector. We add vertices to
the vertex list as follows: pt[4i] = p2i, pt[4i + 1] = p2i+1, pt[4i + 2] = p2i+1 + d, and pt[4i+3] = p2i + d,
for i = 0, ..., num as suggested in Figure 6.84b.
Now for the face list. We first add all of the outside walls of each segment of the tube, and then append the
end walls (i.e. the first end wall uses vertices of the first waist). Each of the num segments has four walls.
For each wall we list the four vertices in CCW order as seen from the outside. There are patterns in the various
indices encountered, but they are complicated. Check that the following vertex indices are correct for each of
the four walls of the k-th segment: the j-th wall of the k-th segment has vertices with indices i0, i1, i2, and i3,
where:

i0 = 4k + j
i1 = i0 + 4
i3 = 4k + (j + 3) % 4
i2 = i3 + 4

for k = 0, 1, ..., num - 1 and j = 0, 1, 2, 3. A sketch of the loop that builds these walls follows.
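These formulas drop directly into a double loop. Here is a sketch, using the Face and VertexId types from Case Study 6.1; L counts the faces built so far:

int L = 0; // face counter
for(int k = 0; k < num; k++)    // for each segment of the tube
	for(int j = 0; j < 4; j++)  // for each of its four walls
	{
		face[L].nVerts = 4;
		face[L].vert = new VertexId[4];
		int i0 = 4*k + j;             // the index formulas above
		int i1 = i0 + 4;
		int i3 = 4*k + (j + 3) % 4;
		int i2 = i3 + 4;
		face[L].vert[0].vertIndex = i0; // CCW as seen from outside
		face[L].vert[1].vertIndex = i1;
		face[L].vert[2].vertIndex = i2;
		face[L].vert[3].vertIndex = i3;
		for(int w = 0; w < 4; w++)      // all four share the face's normal
			face[L].vert[w].normIndex = L;
		L++;
	}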
What are indices of the two end faces of the tube?
Each face has a normal vector determined by the Newell method, which is straightforward to calculate at the
same time the vertex indices are placed in the face list. All vertex normals of a face use the same normal vector: normIndex = {L, L, L, L} for the L-th face.
Exercise the makeExtrudedQuadStrip() method by modeling and drawing some arches, such as the one
shown in Figure 6.39, as well as some block letters that permit the use of quad-strips for their base polygon.
6.7.5. Case Study 6.5. Tubes and Snakes based on a Parametric Curve.
(Level of Effort III) Write and test a method
void Mesh:: makeTube(Point2 P[], int numPts, float t[], int numTimes)
that builds a flat-faced mesh based on wrapping the polygon with vertices P0, P1, ..., PN-1 about the spine curve
C(t). The waists of the tube are formed on the spine at the set of instants t0, t1, ..., tM-1 , and a Frenet frame is
constructed at each C(ti). The function C(t) is hard-wired into the method as a formula, and its derivatives
are formed numerically.
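For instance, the tangent of the spine — the first ingredient of each Frenet frame — can be estimated with a central difference. In this sketch, spineC(t) stands for whatever hard-wired spine function you choose, and both names are ours:

Vector3 tangentAt(double t)
{ // estimate C'(t) numerically and normalize it
	const double eps = 0.001; // a small step (an assumed value)
	Point3 a = spineC(t - eps), b = spineC(t + eps);
	Vector3 T((b.x - a.x)/(2*eps), (b.y - a.y)/(2*eps), (b.z - a.z)/(2*eps));
	T.normalize();
	return T;
}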
Experiment with the method by wrapping polygons taken from Example 3.6.3 that involve a line jumping back
and forth between two concentric circles. Try at least the helix and a Lissajous figure as example spine curves.
6.7.6. Case Study 6.6. Building Discrete-Stepped Surfaces of Revolution.
(Level of Effort: III) Write an application that allows the user to specify the profile of an object with the
mouse, as in Figure 6.85. It then creates the mesh for the surface of revolution, and displays it. The program
also writes the mesh data to a file in the format described in Case Study 6.1.
Figure 6.85a shows the user's drawing area, a square presented on the screen. The user lays down a sequence
of points in this square with the mouse.
6.7.7. Case Study 6.7. The quadric surfaces in matrix form.
(Level of Effort: II) The generic sphere has implicit function F(x, y, z) = x² + y² + z² - 1, which can be written in matrix form as

F(x, y, z) = (x, y, z, 1) | 1  0  0   0 | (x, y, z, 1)^T    (6.51)
                          | 0  1  0   0 |
                          | 0  0  1   0 |
                          | 0  0  0  -1 |
or more compactly, using the homogeneous representation of the point (x, y, z) given by P = (x, y, z, 1):

F(x, y, z) = P^T Rsphere P    (6.52)
where Rsphere is the 4 by 4 matrix displayed in Equation 6.51. The point (x, y, z) is on the ellipsoid whenever this function evaluates to 0. The implicit forms for the other quadric surfaces all have the same
form; only the matrix R is different. For instance, the matrices for the hyperboloid of two sheets and the
elliptic paraboloid are given by:
Rhyperboloid2 = | 1  0   0  0 |     RellipticParab = | 1  0    0     0  |
                | 0  1   0  0 |                      | 0  1    0     0  |    (6.53)
                | 0  0  -1  0 |                      | 0  0    0  -1/2  |
                | 0  0   0  1 |                      | 0  0  -1/2    0  |
a). What are the matrices for the remaining three shapes?
Transforming quadric surfaces.
Recall from Section 6.5.3 that when an affine transformation with matrix M is applied to a surface with
implicit function F(P), the transformed surface has implicit function F(M-1P).
b). Show that when a quadric surface is transformed, its implicit function becomes G(P) = (M^-1 P)^T R (M^-1 P), which is easily manipulated (as in Appendix 2) into the form G(P) = P^T (M^-T R M^-1) P.
Thus the transformed surface is also a quadric surface with a different defining matrix. This matrix depends both on the original shape of the quadric and on the transformation.
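To make part b) concrete: given 4 × 4 matrices for R and M, the defining matrix of the transformed quadric is (M^-1)^T R M^-1. The Matrix4 type and its inverse(), transpose(), and operator* below are assumed helpers, not classes defined in this book:

Matrix4 transformQuadric(const Matrix4& R, const Matrix4& M)
{ // defining matrix of the quadric after transforming by M
	Matrix4 Minv = M.inverse();          // M^-1
	return Minv.transpose() * R * Minv;  // (M^-1)^T R M^-1
}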
For example, to convert the generic sphere into an ellipsoid that extends from -a to a in x, -b to b in y,
and -c to c in z, use the scaling matrix:
M = | a  0  0  0 |
    | 0  b  0  0 |
    | 0  0  c  0 |
    | 0  0  0  1 |
c). Find M^-1 and show that the matrix for the ellipsoid is:

M^-T R M^-1 = | 1/a²   0     0    0 |
              |  0    1/b²   0    0 |
              |  0     0    1/c²  0 |
              |  0     0     0   -1 |
(Level of Effort: III) Write a method, void Mesh::makeSurfaceMesh(), that samples a parametric surface P(u, v) at a grid of (u, v) values, builds the vertex list and normal list based on these sample values, and creates a face list consisting of quadrilaterals. The only difficult part is keeping straight the indices of the vertices for each face in the face list. The suggested skeleton shown in Figure 6.86 may prove helpful.
Apply this function to building an interesting surface of revolution, and a height field.
void Mesh::makeSurfaceMesh()
{
	int i, j, numValsU = 40, numValsV = 40; // set these
	double u, v, uMin = -10.0, uMax = 10.0, vMin = -10.0, vMax = 10.0;
	double delU = (uMax - uMin)/(numValsU - 1);
	double delV = (vMax - vMin)/(numValsV - 1);
	numVerts = numValsU * numValsV;             // total number of vertices
	numFaces = (numValsU - 1) * (numValsV - 1); // total number of faces
	numNorms = numVerts;                        // one normal per vertex
	pt   = new Point3[numVerts];
	face = new Face[numFaces];
	norm = new Vector3[numNorms];
	// ... (the loops that fill pt[], norm[], and face[] follow)
}
A mesh can also be deformed in interesting ways. One useful deformation tapers an object along its z-axis, scaling x and y by an amount g(z) that depends on z, using the matrix

M = | g(z)   0    0 |
    |  0   g(z)   0 |    (6.54)
    |  0     0    1 |
If the undeformed surface has parametric representation P(u, v) = (X(u, v), Y(u, v), Z(u, v)) then this deformation converts it to

P'(u, v) = (X(u, v) g(Z(u, v)), Y(u, v) g(Z(u, v)), Z(u, v))    (6.55)
For Figure 6.89, the mesh for the pawn was first created, and then each mesh vertex (x, y, z) was altered
to (xF, yF, z), where F is 1 - 0.04 * (z + 6) (note: the pawn extends from 0 to -12 in z).
Another useful deformation is twisting. To twist about the z-axis, for instance, rotate all points on the
object about the z-axis by an angle g(z) that depends on z, using the matrix

M = | cos(g(z))  -sin(g(z))  0 |
    | sin(g(z))   cos(g(z))  0 |    (6.56)
    |     0           0      1 |
Figure 6.90 shows the pawn after a linearly increasing twist is applied. The pawn is a surface of revolution about the z-axis, so it doesn't make much sense to twist it about the z-axis. Instead the twist here is
about the y-axis, with g(z) = 0.02(z + 6).
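As a small illustration, here is one way the tapering of Equation 6.54 might be applied directly to the vertices of an existing mesh. The method name taper() is ours, and the particular g(z) is the one used for the pawn above:

void Mesh::taper()
{ // taper the mesh along z, as done for the pawn
	for(int i = 0; i < numVerts; i++)
	{
		double F = 1.0 - 0.04 * (pt[i].z + 6.0); // F = g(z)
		pt[i].x *= F;
		pt[i].y *= F;
	}
	// the normal vectors should be recomputed after any deformation
}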
Chapter 7. 3D Viewing.
7.1 Introduction.
We are already in a position to create pictures of elaborate 3D objects. As we saw in Chapter 5, OpenGL
provides tools for establishing a camera in the scene, for projecting the scene onto the camera's
viewplane, and for rendering the projection in the viewport. So far our camera only produces parallel
projections. In Chapter 6 we described several classes of interesting 3D shapes that can be used to model
the objects we want in a scene, and through the Mesh class we have ways of drawing any of them with
appropriate shading.
So what's left to do? For greater realism we want to create and control a camera that produces
perspective projections. We also need ways to take more control of the camera's position and orientation,
so that the user can fly the camera through the scene in an animation. This requires developing more
controls than OpenGL provides. We also need to achieve precise control over the camera's view volume,
which is determined in the perspective case as it was when forming parallel projections: by a certain
matrix. This requires a deeper use of homogeneous coordinates than we have used so far, so we develop
the mathematics of perspective projections from the beginning, and see how they are incorporated in the
OpenGL graphics pipeline. We also describe how clipping is done against the camera's view volume,
which again requires some detailed working with homogeneous coordinates. So we finally see how it is
all done, from start to finish! This also provides the underlying theory for those programmers who must
develop 3D graphics software without the benefit of OpenGL.
ratio of 1.5. The near plane lies at z = -0.3 and the far plane lies at z = -50.0. We see later exactly what
values this function places in the projection matrix.
7.2.2. Positioning and pointing the camera.
In order to obtain the desired view of a scene, we move the camera away from its default position shown
in Figure 7.2, and aim it in a particular direction. We do this by performing a rotation and a translation,
and these transformations become part of the modelview matrix, as we discussed in Section 5.6.
We set up the camera's position and orientation in exactly the same way as we did for the parallel-projection camera. (The only difference between a parallel- and perspective-projection camera resides in
the projection matrix, which determines the shape of the view volume.) The simplest function to use is
again gluLookAt(), using the sequence
glMatrixMode(GL_MODELVIEW); // make the modelview matrix current
glLoadIdentity();
// start with a unit matrix
gluLookAt(eye.x, eye.y, eye.z, look.x, look.y, look.z, up.x, up.y,
up.z);
As before this moves the camera so its eye resides at point eye, and it looks towards the point of interest
look. The upward direction is generally suggested by the vector up, which is most often set simply to
(0, 1, 0). We took these parameters and the whole process of setting the camera pretty much for granted
in Chapter 5. In this chapter we will probe deeper, both to see how it is done and to take finer control
over setting the camera. We also develop tools to make relative changes to the camera's direction, such
as rotating it slightly to the left, tilting it up, or sliding it forward.
The General camera with arbitrary orientation and position.
A camera can have any position in the scene, and any orientation. Imagine a transformation that picks up
the camera of Figure 7.2 and moves it somewhere in space, then rotates it around to aim it as desired. We
need a way to describe this precisely, and to determine what the resulting modelview matrix will be.
It will serve us well to attach an explicit coordinate system to the camera, as suggested by Figure 7.3.
This coordinate system has its origin at the eye, and has three axes, usually called the u-, v-, and n- axes,
that define its orientation. The axes are pointed in directions given by the vectors u, v, and n as shown in
the figure. Because the camera by default looks down the negative z-axis, we say in general that the
camera looks down the negative n-axis, in the direction -n. The direction u points off to the right of
the camera, and direction v points upward. Think of the u-, v-, and n-axes as clones of the x-, y-, and
z-axes of Figure 7.2, that are moved and rotated as we move the camera into position.
Figure 7.4. A plane's orientation relative to the world: a). pitch; b). roll; c). yaw.
Figure 7.6. Various camera orientations.
What gluLookAt() does: some mathematical underpinnings.
What then are the directions u, v, and n when we execute gluLookAt() with given values for eye, look,
and up? Let's see exactly what gluLookAt() does, and why it does it.
As shown in Figure 7.7a, we are given the locations of eye and look, and the up direction. We
immediately know that n must be parallel to the vector eye - look, as shown in Figure 7.7b, so we set
n = eye - look. (We'll normalize this and the other vectors later as necessary.)
n = eye - look
u = up × n
v = n × u    (7.1)
V = | ux  uy  uz  dx |
    | vx  vy  vz  dy |    (7.2)
    | nx  ny  nz  dz |
    |  0   0   0   1 |

where dx = -eye · u, dy = -eye · v, and dz = -eye · n.
Check that V maps the eye into the origin:

V (eyex, eyey, eyez, 1)^T = (0, 0, 0, 1)^T
1 A technicality: since it's not legal to dot a point and a vector, eye should be replaced here by the vector
(eye - (0, 0, 0)).
as desired, where we have extended point eye to homogeneous coordinates. Also check that

V (ux, uy, uz, 0)^T = (1, 0, 0, 0)^T

and that V maps v into (0, 1, 0, 0)^T and maps n into (0, 0, 1, 0)^T. The matrix V is created by gluLookAt()
and is postmultiplied with the current matrix. We will have occasion to do this same operation later when
we maintain our own camera in a program.
Practice Exercises.
7.2.1. Finding roll, pitch, and heading given vectors u, v, and n. Suppose a camera is based on a
coordinate system with axes in the directions u, v, and n, all unit vectors. The heading and pitch of the
camera are found by representing -n in spherical coordinates. Using Appendix 2, show that

heading = arctan(-nz, -nx),
pitch = arcsin(-ny)
Further, the roll of the camera is the angle its u-axis makes with the horizontal. To find it, construct a
vector b that is horizontal and lies in the uv-plane. Show that b = j × n has these properties. Show that
the angle between b and u is given by

roll = arccos( (ux nz - uz nx) / √(nx² + nz²) )
7.2.2. Using up sets v to a best approximation to up. Show that using up as in Equation 7.1 to set u
and v is equivalent to making v the closest vector to up that is perpendicular to vector n. Use these steps:
a). Show that v = n × (up × n);
b). Use a property of the triple vector product: a × (b × c) = (a · c)b - (a · b)c.
c). Show that v is therefore the projection of up onto the plane with normal n (see Chapter 4), and
therefore is the closest vector in this plane to up.
class Camera {
private:
Point3 eye;
Vector3 u,v,n;
double viewAngle, aspect, nearDist, farDist; // view volume shape
void setModelViewMatrix(); // tell OpenGL where the camera is
public:
Camera(); // default constructor
void set(Point3 eye, Point3 look, Vector3 up); // like gluLookAt()
void roll(float angle); // roll it
void pitch(float angle); // increase pitch
void yaw(float angle); // yaw it
void slide(float delU, float delV, float delN); // slide it
void setShape(float vAng, float asp, float nearD, float farD);
};
Figure 7.10. The Camera class definition.
The utility routine setModelViewMatrix() communicates the modelview matrix to OpenGL. It is
used only by member functions of the class, and needs to be called after each change is made to the
camera's position or orientation. Figure 7.11 shows a possible implementation. It computes the matrix of
Equation 7.2 based on current values of eye, u, v, and n, and loads the matrix directly into the modelview
matrix using glLoadMatrixf().
The method set() acts just like gluLookAt(): it uses the values of eye, look, and up to compute u, v,
and n according to Equation 7.1. It places this information in the camera's fields and communicates it to
OpenGL. Figure 7.11 shows a possible implementation.
void Camera :: setModelViewMatrix(void)
{ // load modelview matrix with existing camera values
float m[16];
Vector3 eVec(eye.x, eye.y, eye.z); // a vector version of eye
m[0] = u.x; m[4] = u.y; m[8] = u.z; m[12] = -eVec.dot(u);
m[1] = v.x; m[5] = v.y; m[9] = v.z; m[13] = -eVec.dot(v);
m[2] = n.x; m[6] = n.y; m[10] = n.z; m[14] = -eVec.dot(n);
m[3] = 0;
m[7] = 0;
m[11] = 0;
m[15] = 1.0;
glMatrixMode(GL_MODELVIEW);
glLoadMatrixf(m); // load OpenGLs modelview matrix
}
void Camera:: set(Point3 Eye, Point3 look, Vector3 up)
{
// create a modelview matrix and send it to OpenGL
eye.set(Eye); // store the given eye position
n.set(eye.x - look.x, eye.y - look.y, eye.z - look.z); // make n
u.set(up.cross(n)); // make u = up X n
n.normalize(); u.normalize(); // make them unit length
v.set(n.cross(u)); // make v = n X u
setModelViewMatrix(); // tell OpenGL
}
Figure 7.11. The utility routines set() and setModelViewMatrix().
The routine setShape() is even simpler: it puts the four argument values into the appropriate camera
fields, and then calls gluPerspective(viewAngle, aspect, nearDist, farDist) (along with
glMatrixMode(GL_PROJECTION) and glLoadIdentity()) to set the projection matrix.
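For concreteness, here is one possible implementation of setShape() along these lines (whether to switch back to GL_MODELVIEW mode afterwards is a design choice we assume here):

void Camera::setShape(float vAng, float asp, float nearD, float farD)
{
	viewAngle = vAng; aspect = asp;  // store the given values
	nearDist = nearD; farDist = farD;
	glMatrixMode(GL_PROJECTION);     // make the projection matrix current
	glLoadIdentity();                // start with a unit matrix
	gluPerspective(viewAngle, aspect, nearDist, farDist);
	glMatrixMode(GL_MODELVIEW);      // (assumed) return to modelview mode
}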
The central camera methods are slide(), roll(), yaw(), and pitch(), which make relative changes to
the camera's position and orientation. (The whole reason for maintaining the eye, u, v, and n fields in
our Camera data structure is so that we have a record of the current camera, and can therefore alter it.)
We examine how the camera methods operate next.
7.3.1. Flying the Camera.
The user flies the camera through a scene interactively by pressing keys or clicking the mouse. For
instance, pressing u might slide the camera up some amount, pressing y might yaw it to the left, and
pressing f might slide it forward. The user can see how the scene looks from one point of view, then
change the camera to a better viewing spot and direction and produce another picture. Or the user can fly
around a scene taking different snapshots. If the snapshots are stored and then played back rapidly, an
animation is produced of the camera flying around the scene.
There are six degrees of freedom for adjusting a camera: it can be slid in three dimensions, and it can
be rotated about any of three coordinate axes. We first develop the slide() function.
Sliding the camera.
Sliding a camera means to move it along one of its own axes, that is, in the u, v, or n direction, without
rotating it. Since the camera is looking along the negative n-axis, movement along n is forward or
back. Similarly, movement along u is left or right, and movement along v is up or down.
It is simple to move the camera along one of its axes. To move it distance D along its u-axis, set eye to
eye + D u. For convenience we can combine the three possible slides in a single function. slide(delU,
delV, delN) slides the camera amount delU along u, delV along v, and delN along n:
void Camera:: slide(float delU, float delV, float delN)
{
eye.x += delU * u.x + delV * v.x + delN * n.x;
eye.y += delU * u.y + delV * v.y + delN * n.y;
eye.z += delU * u.z + delV * v.z + delN * n.z;
setModelViewMatrix();
}
Rotating the Camera.
We want to roll, pitch, or yaw the camera. This involves a rotation of the camera about one of its own
axes. We look at rolling in detail; the other two types of rotation are similar.
To roll the camera is to rotate it about its own n-axis. This means that both the directions u and v must be
rotated, as shown in Figure 7.12. We form two new axes u' and v' that lie in the same plane as u and v,
yet have been rotated through an angle of α degrees:
u' = cos(α) u - sin(α) v
v' = sin(α) u + cos(α) v    (7.3)
The new axes u' and v' then replace u and v in the camera. This is straightforward to implement.
void Camera :: roll(float angle)
{ // roll the camera through angle degrees
float cs = cos(3.14159265/180 * angle);
float sn = sin(3.14159265/180 * angle);
Vector3 t(u); // remember old u
u.set(cs*t.x - sn*v.x, cs*t.y - sn*v.y, cs*t.z - sn*v.z);
v.set(sn*t.x + cs*v.x, sn*t.y + cs*v.y, sn*t.z + cs*v.z);
setModelViewMatrix();
}
The functions pitch() and yaw() are implemented in a similar fashion. See the exercises.
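For example, pitch() can mirror roll() exactly, rotating the pair v and n about the u-axis; which sign of angle pitches the camera up is a convention you should settle in Exercise 7.3.1 (this sketch simply makes one choice):

void Camera::pitch(float angle)
{ // pitch the camera through angle degrees, rotating v and n about u
	float cs = cos(3.14159265/180 * angle);
	float sn = sin(3.14159265/180 * angle);
	Vector3 t(v); // remember old v
	v.set(cs*t.x - sn*n.x, cs*t.y - sn*n.y, cs*t.z - sn*n.z);
	n.set(sn*t.x + cs*n.x, sn*t.y + cs*n.y, sn*t.z + cs*n.z);
	setModelViewMatrix();
}

yaw() follows the same pattern, rotating n and u about the v-axis.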
Putting it all together.
We show in Figure 7.13 how the Camera class can be used with OpenGL to fly a camera through a
scene. The scene consists of the lone teapot here. The camera is a global object, and is set up in main()
with a good starting view and shape. When a key is pressed myKeyboard() is called, and the camera is
slid or rotated depending on which key was pressed. For instance, if 'P' is pressed the camera is pitched
up by 1 degree. If CTRL P is pressed2 (hold down the control key and press 'p'), the camera is pitched
down by 1 degree. After the keystroke has been processed glutPostRedisplay() causes
myDisplay() to be called again to draw the new picture.
// the usual includes
#include "camera.h"
Camera cam; // global camera object
//<<<<<<<<<<<<<<<<<<<<<<<< myKeyboard >>>>>>>>>>>>>>>>>>>>>>
void myKeyboard(unsigned char key, int x, int y)
{
switch(key)
{
// controls for camera
case 'F':
cam.slide(0,0, 0.2); break; // slide camera forward
case 'F'-64: cam.slide(0,0,-0.2); break; //slide camera back
// add up/down and left/right controls
case 'P':
cam.pitch(-1.0); break;
case 'P' - 64: cam.pitch( 1.0); break;
// add roll and yaw controls
}
glutPostRedisplay(); // draw it again
}
//<<<<<<<<<<<<<<<<<<<<<<< myDisplay >>>>>>>>>>>>>>>>>>>>>>>>>>
void myDisplay(void)
{
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glutWireTeapot(1.0); // draw the teapot
glFlush();
glutSwapBuffers(); // display the screen just made
}
//<<<<<<<<<<<<<<<<<<<<<< main >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
void main(int argc, char **argv)
{
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB); // double buffering
glutInitWindowSize(640,480);
glutInitWindowPosition(50, 50);
glutCreateWindow("fly a camera around a teapot");
glutKeyboardFunc(myKeyboard);
glutDisplayFunc(myDisplay);
glClearColor(1.0f,1.0f,1.0f,1.0f); // background is white
glColor3f(0.0f,0.0f,0.0f); // set color of stuff
glViewport(0, 0, 640, 480);
cam.set(Point3(4, 4, 4), Point3(0, 0, 0), Vector3(0, 1, 0)); // make the initial camera
cam.setShape(30.0f, 64.0f/48.0f, 0.5f, 50.0f);
glutMainLoop();
}
Figure 7.13. Application to fly a camera around the teapot.
2 On most keyboards pressing CTRL and a letter key returns an ASCII value that is 64 less than the
ASCII value returned by the letter itself.
Notice the call to glutSwapBuffers(). This application uses double-buffering to produce a rapid
and smooth transition between one picture and the next. There are two memory buffers used to store the
generated pictures. The display switches from showing one buffer to showing the other under the control
of glutSwapBuffers(). Each new picture is drawn in the invisible buffer, and when the drawing is
complete the display switches to it. Thus the viewer doesn't see the screen erased and the new picture
slowly emerge line-by-line, which is visually annoying. Instead the old picture is displayed steadily
while the new picture is being composed off-screen, and then the display switches very rapidly to the newly
completed picture.
Drawing SDL scenes using a camera.
It is just as easy to incorporate a camera in an application that reads SDL files, as described in Chapter 5.
There are then two global objects:
Camera cam;
Scene scn;
and in main() an SDL file is read and parsed using scn.read("myScene.dat"). Finally, in
myDisplay(void), simply replace glutWireTeapot(1.0) with scn.drawSceneOpenGL();
Practice Exercises.
7.3.1. Implementing pitch() and yaw(). Write the functions void Camera::pitch(float angle) and void Camera::yaw(float angle) that respectively pitch and yaw the camera.
Arrange matters so that a positive yaw yaws the camera to the left and a positive pitch pitches the
camera up.
7.3.2. Building a universal rotate() method. Write the function void Camera::rotate(Vector3 axis, float angle) that rotates the camera through angle degrees about
axis. It rotates all three axes u, v, and n about the eye.
7.4. Perspective projections of 3D objects.
At this point we must look more deeply into the process of forming perspective projections. We need
answers to a number of questions. What operations constitute forming a perspective projection, and how
does the pipeline do these operations? What's the relationship between perspective projections and
matrices? How does the projection map the view volume into a canonical view volume for clipping?
How is clipping done? How do homogeneous coordinates come into play in the process? How is the
depth of a point from the eye retained so that proper hidden surface removal can be done? And what is
that perspective division step?
We start by examining the nature of perspective projection, independent of specific processing steps in
the pipeline. Then we see how the steps in the pipeline are carefully crafted to produce the numerical
values required for a perspective projection.
7.4.1. Perspective Projection of a Point.
The fundamental operation is projecting a 3D point to a 2D point on a plane. Figure 7.16 elaborates on
Figure 7.15 to show point P = (Px, Py, Pz) projecting onto the near plane of the camera to a point (x*, y*).
We erect a local coordinate system on the near plane, with its origin on the camera's z-axis. Then it is
meaningful to talk about the point x* units over to the right of the origin, and y* units above the origin.
Using similar triangles (see Figure 7.16),

x*/N = Px/(-Pz)

or x* = NPx/(-Pz). Similarly y* = NPy/(-Pz). So we have that P projects to the point on the viewplane:

(x*, y*) = ( N Px/(-Pz), N Py/(-Pz) )    (the projection of P)    (7.4)
An alternate (analytical) method for arriving at this result is given in the exercises.
Example 7.4.1: Where on the viewplane does P = (1, 0.5, -1.5) lie for the camera having a near plane at
N = 1? Solution: Direct use of Equation 7.4 yields (x*, y*) = (0.666, 0.333).
We can make some preliminary observations about how points are projected.
1). Note the denominator term -Pz. It is larger for more remote points (those farther along the negative z-axis), which reduces the values of x* and y*. This introduces perspective foreshortening, and makes
remote parts of an object appear smaller than nearer parts.
2). Denominators have a nasty way of evaluating to zero, and Pz becomes 0 when P lies in the same
plane as the eye: the z = 0 plane. Normally we use clipping to remove such offending points before
trying to project them.
3). If P lies behind the eye there is a reversal in sign of Pz, which causes further trouble, as we see later.
These points, too, are usually removed by clipping.
4). The effect of the near plane distance N is simply to scale the picture (both x* and y* are proportional
to N). So if we choose some other plane (still parallel to the near plane) as the view plane onto which to
project pictures, the projection will differ only in size with the projection onto the near plane. Since we
ultimately map this projection to a viewport of a fixed size on the display, the size of the projected image
makes no difference. This shows that any viewplane parallel to the near plane would work just as well, so
we might as well use the near plane itself.
5). Straight lines project to straight lines. Figure 7.17 provides the simplest proof. Consider the line in 3D
space between points A and B. A projects to A' and B projects to B'. But do points between A and B
project to points directly between A' and B'? Yes: just consider the plane formed by A, B and the origin.
Since any two planes intersect in a straight line, this plane intersects the near plane in a straight line.
Thus line segment AB projects to line segment A'B'.
The foreshortening factor is two for those points on the back wall. Figure 7.19a shows the projection of
the barn for this view. Note that edges on the rear wall project at half their true length. Also note that
edges of the barn that are actually parallel in 3D need not project as parallel. (We shall see that parallel
edges that are parallel to the viewplane do project as parallel, but parallel edges that are not parallel to
the viewplane do not: they recede to a vanishing point.)
Practice Exercises.
7.4.1. Deriving Equation 7.4 analytically. Consider the ray that starts at the eye (the origin) and passes through the point P.
a). Show that if this ray is at the origin at t = 0 and at P at time t = 1, then it has parametric representation
r(t) = Pt.
b). Show that it hits the near plane at t = N/(-Pz);
c). Show that the hit point is (x*, y*) = (NPx/(-Pz), NPy/(-Pz)).
7.4.2. Perspective Projection of a Line.
We develop here some interesting properties of perspective projections by examining how straight lines
project.
1). Lines that are parallel in 3D project to lines, but these lines aren't necessarily parallel. If not parallel,
they meet at some vanishing point.
2). Lines that pass behind the eye of the camera cause a catastrophic passage through infinity. (Such
lines should be clipped off.)
3). Perspective projections usually produce geometrically realistic pictures. But realism is strained for
very long lines parallel to the viewplane.
1). Projecting Parallel Lines.
We suppose the line in 3D passes (using camera coordinates) through point A = (Ax, Ay, Az) and has
direction vector c = (cx, cy, cz). It therefore has parametric form P(t) = A + c t. Substituting this form in
Equation 7.4 yields the parametric form for the projection of this line:
p(t) = ( N (Ax + cx t)/(-(Az + cz t)), N (Ay + cy t)/(-(Az + cz t)) )    (7.5)
(This may not look like the parametric form for a straight line, but it is. See the exercises.) Thus the point
A in 3D projects to the point p(0), and as t varies the projected point p(t) moves across the screen (in a
straight line). We can discern several important properties directly from this formula.
Suppose the line A + ct is parallel to the viewplane. Then cz = 0 and the projected line is given by:
p(t) = (N/(-Az)) (Ax + cx t, Ay + cy t)
This is the parametric form for a line with slope cy/cx. This slope does not depend on the position of the
line, only its direction c. Thus all lines in 3D with direction c will project with this slope, so their
projections are parallel. We conclude:
If two lines in 3D are parallel to each other and to the viewplane, they project to two parallel lines.
Now consider the case where the direction c is not parallel to the viewplane. For convenience suppose cz
< 0, so that as t increases the line recedes further and further from the eye. At very large values of t,
Equation 7.5 becomes:
p(∞) = ( N cx/(-cz), N cy/(-cz) )    (7.6)
This is called the vanishing point for this line: it's the point towards which the projected line moves as
t gets larger and larger. Notice that it depends only on the direction c of the line and not its position
(which is embodied in A). Thus all parallel lines share the same vanishing point.
In particular, these lines project to lines that are not parallel.
Figure 7.21a makes this more vivid for the example of a cube. Several edges of the cube are parallel:
there are those that are horizontal, those that are vertical, and those that recede from the eye. This picture
was made with the camera oriented so that its near plane was parallel to the front face of the cube. Thus
in camera coordinates the z-component of c for the horizontal and vertical edges is 0. The horizontal
edges therefore project to parallel lines, and so do the vertical edges. The receding edges, however, are
not parallel to the view plane, and hence converge onto a vanishing point (VP). Artists often set up
drawings of objects this way, choosing the vanishing point and sketching parallel lines as pointing at the
VP. We shall see more on vanishing points as we proceed.
Figure 7.21. The vanishing point for parallel lines.
Figure 7.22 suggests what a vanishing point is geometrically. Looking down onto the camera's xz-plane
from above, we see the eye viewing various points on the line AB. A projects to A', B projects to B', etc.
Very remote points on the line project to VP as shown. The point VP is situated so that the line from the
eye through VP is parallel to AB (why?).
Figure 7.23. Projecting the line segment AB, with B behind the eye.
Example 7.4.3. The Classic Horizontal Plane in Perspective.
A good way to gain insight into vanishing points is to view a grid of lines in perspective, as in Figure 7.24.
Grid lines here lie in the xz-plane, and are spaced 1 unit apart. The eye is perched 1 unit above the xz-plane,
at (0, 1, 0), and looks along -n, where n = (0,0,1). As usual we take up = (0,1,0). N is chosen to be 1.
Figure 7.25. Viewing very long parallel wires.
For the perspective view, if we orient the viewplane to be parallel to the wires, we know the image will
show two straight and parallel lines (part b). But what you see is quite different: the wires appear curved
as they converge to vanishing points in both directions (part c)! In practice this anomaly is barely
visible because the window or your eye limits the field of view to a reasonable region. (To see different
parts of the wires you have to roll your eyes up and down, which of course rotates your view planes.)
Practice Exercises.
7.4.3. Straight lines project as straight lines: the parametric form. Show that the parametric form in
Equation 7.5 is that of a straight line. Hint: For the x-component divide the denominator into the
numerator to get -AxN/Az + R g(t), where R depends on the x-components of A and c but not the y-components, and g(t) is some function of t that depends on neither the x- nor y-components. Repeat for the
y-component, obtaining -AyN/Az + S g(t) with similar properties. Argue why this is the parametric
representation of a straight line (albeit one for which the point does not move with constant speed as t
varies).
7.4.4. Derive results for horizontal grids. Derive the parametric forms for the projected grid lines in
Example 7.4.3.
7.4.3. Incorporating Perspective in the Graphics Pipeline.
Only a fool tests the depth of the river with both feet.
Paul Cezanne, 1925
We want the graphics pipeline to project vertices of 3D objects onto the near plane, then map them to the
viewport. After passing through the modelview matrix, the vertices are represented in the camera's
coordinate system, and Equation 7.4 shows the values we need to compute for the proper projection. We
need to do clipping, and then map what survives to the viewport. But we need a little more as well.
Adding Pseudodepth.
Taking a projection discards depth information; that is, how far the point is from the eye. But we mustn't
discard this information completely, or it will be impossible to do hidden surface removal later.
The actual distance of a point P from the eye in camera coordinates is √(Px² + Py² + Pz²), which would
be cumbersome and slow to compute for each point of interest. All we really need is some measure of
distance that tells, when two points project to the same point on the near plane, which is the closer. Figure
7.26 shows points P1 and P2 that both lie on a line from the eye, and therefore project to the same point.
We must be able to test whether P1 obscures P2 or vice versa.
compute a value called the pseudodepth that provides an adequate measure of depth for P. We then say
that P projects to (x*, y*, z*), where (x*, y*) is the value provided in Equation 7.4 and z* is its
pseudodepth.
(x*, y*, z*) = ( N Px/(-Pz), N Py/(-Pz), (a Pz + b)/(-Pz) )    (7.7)
for some choice of the constants a and b. Although many different choices for a and b will do, we choose
them so that the pseudodepth varies between -1 and 1 (we see later why these are good choices). Since
depth increases as a point moves further down the negative z-axis, we decide that the pseudodepth is -1
when Pz = -N, and is +1 when Pz = -F. With these two conditions we can easily solve for a and b,
obtaining:
a = -(F + N)/(F - N),    b = -2FN/(F - N)    (7.8)
Figure 7.27 plots pseudodepth versus (-Pz). As we insisted, it grows from -1 for a point on the near plane
up to +1 for a point on the far plane. As Pz approaches 0 (so that the point is just in front of the eye)
pseudodepth plummets to -∞. For a point just behind the eye, the pseudodepth is huge and positive. But
we will clip off points that lie closer than the near plane, so this catastrophic behavior will never be
encountered.
For example, if N = 1 and F = 100, pseudodepth is given by

(101 Pz + 200) / (99 Pz)

This maps appropriately to -1 at Pz = -N, and to 1 at Pz = -F. But close to -F it varies quite slowly with (-Pz).
At Pz = -97, -98, and -99, for instance, it evaluates to 0.999375, 0.999588, and 0.999796.
A little algebra (see the exercises) shows that when N is much smaller than F, as it normally will be,
pseudodepth can be approximated by

pseudodepth ≈ 1 + 2N/Pz    (7.9)
Again you see that it varies more and more slowly as (-Pz) approaches F. But its variation is increased by
using large values of N. N should be set as large as possible (but of course not so large that objects
nearest to the camera are clipped off!).
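The computation is easy to make concrete. This small function is ours, not part of the OpenGL pipeline; it simply evaluates Equations 7.7 and 7.8:

double pseudodepth(double Pz, double N, double F)
{ // returns -1 at Pz = -N, and +1 at Pz = -F
	double a = -(F + N)/(F - N); // Equation 7.8
	double b = -2.0*F*N/(F - N);
	return (a*Pz + b)/(-Pz);     // the z* of Equation 7.7
}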
Using Homogeneous Coordinates.
Why was there consideration given to having the same denominator for each term in Equation 7.7? As we
now show, this makes it possible to represent all of the steps so far in the graphics pipeline as matrix
multiplications, offering both processing efficiency and uniformity. (Chips on some graphics cards can
multiply a point by a matrix in hardware, making this operation extremely fast!) Doing it this way will
also allow us to set things up for a highly efficient and reliable clipping step.
The new approach requires that we represent points in homogeneous coordinates. We have been doing
that anyway, since this makes it easier to transform a vertex by the modelview matrix. But we are going
to expand the notion of the homogeneous coordinate representation beyond what we have needed before
now, and therein find new power. In particular, a matrix will not only be able to perform an affine
transformation, it will be able to perform a perspective transformation.
Up to now we have said that a point P = (Px, Py, Pz) has the representation (Px, Py, Pz, 1) in homogeneous
coordinates, and that a vector v = (vx, vy, vz) has the representation (vx, vy, vz, 0 ). We have simply
appended a 1 or 0. This made it possible to use coordinate frames as a basis for representing the points
and vectors of interest, and it allowed us to represent an affine transformation by a matrix.
Now we extend the idea, and say that a point P = (Px, Py, Pz) has a whole family of homogeneous
representations (wPx, wPy, wPz, w) for any value of w except 0. For example, the point (1, 2, 3) has the
representations (1, 2, 3, 1), (2, 4, 6, 2), (0.003, 0.006, 0.009, 0.001), (-1, -2, -3, -1), and so forth. If
someone hands you a point in this form, say (3, 6, 2, 3), and asks what point it is, just divide through by
the last component to get (1, 2, 2/3, 1), then discard the last component: the point in ordinary
coordinates is (1, 2, 2/3). Thus:

(wPx, wPy, wPz, w) represents the point (Px, Py, Pz), for any w ≠ 0.

The additional property of being able to scale all the components of a point without changing the point is
really the basis for the name homogeneous. Up until now we have always been working with the
special case where the final component is 1.
We examine homogeneous coordinates further in the exercises, but now focus on how they operate when
transforming points. Affine transformations work fine when homogeneous coordinates are used. Recall
that the matrix for an affine transformation always has (0,0,0,1) in its fourth row. Therefore if we
multiply a point P in homogeneous representation by such a matrix M, to form MP = Q (recall Equation
5.24), as in the example
| 2   1   3   1 | | wPx |   | wQx |
| 6  0.5  1   4 | | wPy | = | wQy |
| 0   4   2   3 | | wPz |   | wQz |
| 0   0   0   1 | |  w  |   |  w  |
the final component of Q will always be unaltered: it is still w. Therefore we can convert the Q back to
ordinary coordinates in the usual fashion.
But something new happens if we deviate from a fourth row of (0,0,0,1). Consider the important example
that has a fourth row of (0, 0, -1, 0), (which is close to what we shall later call the projection matrix):
| N  0   0  0 |
| 0  N   0  0 |
| 0  0   a  b |    (7.10)
| 0  0  -1  0 |
for any choices of N, a, and b. Multiply this by a point represented in homogeneous coordinates with an
arbitrary w:
| N  0   0  0 | | wPx |   |    wNPx     |
| 0  N   0  0 | | wPy | = |    wNPy     |
| 0  0   a  b | | wPz |   | w(aPz + b)  |
| 0  0  -1  0 | |  w  |   |   -wPz      |
This corresponds to an ordinary point, but which one? Divide through by the fourth component and
discard it (or, if you wish, first multiply all four components by any nonzero value), to obtain

( N Px/(-Pz), N Py/(-Pz), (aPz + b)/(-Pz) )
which is precisely what we need according to Equation 7.7. Thus using homogeneous coordinates allows
us to capture perspective using a matrix multiplication! To make it work we must always divide through
by the fourth component, a step which is called perspective division.
A matrix that has values other than (0,0,0,1) for its fourth row does not perform an affine transformation.
It performs a more general class of transformation called a perspective transformation. It is a
transformation not a projection. A projection reduces the dimensionality of a point, to a 3-tuple or a 2tuple, whereas a perspective transformation takes a 4-tuple and produces a 4-tuple.
Consider the algebraic effect of putting nonzero values in the fourth row of the matrix, such as
(A,B,C,D). When you multiply the matrix by (Px, Py, Pz, 1) (or any multiple thereof) the fourth term in the
resulting point becomes APx + BPy + CPz + D, making it linearly dependent on each of the components of
P. After perspective division this term appears in the denominator of the point. Such a denominator is
exactly what is needed to produce the geometric effect of perspective projection onto a general plane, as
we show in the exercises.
The perspective transformation therefore carries a 3D point P into another 3D point P, according to:
(Px, Py, Pz) → ( N Px/(-Pz), N Py/(-Pz), (aPz + b)/(-Pz) )    (7.11)
Where does the projection part come into play? Further along the pipeline the first two components of
this point are used for drawing: to locate in screen coordinates the position of the point to be drawn. The
third component is peeled off to be used for depth testing. As far as locating the point on the screen is
concerned, ignoring the third component is equivalent to replacing it by 0, as in:
( N Px/(-Pz), N Py/(-Pz), (aPz + b)/(-Pz) ) → ( N Px/(-Pz), N Py/(-Pz), 0 )    (the projection)    (7.12)
This is just what we did in Chapter 5 to project a point orthographically (meaning perpendicularly to
the viewplane) when setting up a camera for our first efforts at viewing a 3D scene. We will study
orthographic projections in full detail later. For now we can conclude:
(perspective projection) = (perspective transformation) + (orthographic projection)
This decomposition of a perspective projection into a specific transformation followed by a (trivial)
projection will prove very useful, both algorithmically and for understanding better what each point
actually experiences as it passes through the graphics pipeline. OpenGL does the transformation step
separately from the projection step. In fact it inserts clipping, perspective division, and one additional
mapping between them. We next look deeper into the transformation part of the process.
The Geometric Nature of the Perspective Transformation.
The perspective transformation alters 3D point P into another 3D point according to Equation 7.11, to
prepare it for projection. It is useful to think of it as causing a warping of 3D space, and to see how it
warps one shape into another. Very importantly, it preserves straightness and flatness, so lines
transform into lines, planes into planes, and polygonal faces into other polygonal faces. It also preserves
in-between-ness, so if point a is inside an object, the transformed point will also be inside the
transformed object. (Our choice of a suitable pseudodepth function was guided by the need to preserve
these properties.) The proof of these properties is developed in the exercises.
Of particular interest is how it transforms the camera's view volume, because if we are going to do
clipping in the warped space, we will be clipping against the warped view volume. The perspective
transformation shines in this regard: the warped view volume is a perfect shape for simple and efficient
clipping! Figure 7.28 suggests how the view volume and other shapes are transformed. The near plane W
at z = -N maps into the plane W' at z = -1, and the far plane maps to the plane at z = +1. The top wall T is
tilted into the horizontal plane T' so that it is parallel to the z-axis. The bottom wall S becomes the
horizontal S', and the two side walls become parallel to the z-axis. The camera's view volume is
transformed into a parallelepiped!
We now know the transformed view volume precisely: a parallelepiped with dimensions that are related
to the camera's properties in a very simple way. This is a splendid shape to clip against as we shall see,
because its walls are parallel to the coordinate planes. But it would be an even better shape for clipping if
its dimensions didn't depend on the particular camera being used. OpenGL composes the perspective
transformation with another mapping that scales and shifts this parallelepiped into the canonical view
volume, a cube that extends from -1 to 1 in each dimension. Because this scales things differently in the
Chapter
3D viewing
page 21
x- and y- dimensions as it squashes the scene into a fixed volume it introduces some distortion, but the
distortion will be eliminated in the final viewport transformation.
The transformed view volume already extends from -1 to 1 in z, so it only needs to be scaled in the other
two dimensions. We therefore include a scaling and shift in both x and y to map the parallelepiped into
the canonical view volume. We first shift by -(right + left)/2 in x and by -(top + bott)/2 in y. Then we
scale by 2/(right - left) in x and by 2/(top - bott) in y. When the matrix multiplications are done (see the
exercises) we obtain the final matrix:
R = | 2N/(right - left)        0          (right + left)/(right - left)        0        |
    |        0          2N/(top - bott)    (top + bott)/(top - bott)           0        |    (7.13)
    |        0                 0              -(F + N)/(F - N)           -2FN/(F - N)   |
    |        0                 0                    -1                         0        |
This is known as the projection matrix, and it performs the perspective transformation plus a scaling and
shifting to transform the cameras view volume into the canonical view volume. It is precisely the matrix
that OpenGL creates (and by which it multiplies the current matrix) when glFrustum(left,
right, bott, top, N, F) is executed. Recall that gluPerspective(viewAngle,
aspect, N, F) is usually used instead, as its parameters are more intuitive. This sets up the same
matrix, after computing values for top, bott, etc. using
top = N tan((π/180) viewAngle/2)

and then sets bott = -top, right = top · aspect, and left = -right.
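As a sketch, the following function (the name is ours) fills a column-major array — the layout OpenGL itself uses — with the matrix of Equation 7.13, computing top, bott, right, and left the way gluPerspective() does:

void makeProjMatrix(float m[16], float viewAngle, float aspect, float N, float F)
{
	float top = N * (float)tan(3.14159265/180 * viewAngle/2);
	float bott = -top, right = top * aspect, left = -right;
	for(int i = 0; i < 16; i++) m[i] = 0.0f; // clear it
	m[0]  = 2*N/(right - left);              // row 0, column 0
	m[5]  = 2*N/(top - bott);                // row 1, column 1
	m[8]  = (right + left)/(right - left);   // row 0, column 2
	m[9]  = (top + bott)/(top - bott);       // row 1, column 2
	m[10] = -(F + N)/(F - N);                // row 2, column 2
	m[11] = -1.0f;                           // row 3, column 2
	m[14] = -2*F*N/(F - N);                  // row 2, column 3
}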
determine which part of the segment lies inside the CVV. If the segment intersects the boundary of the
CVV we will need to compute the intersection point I = (Ix, Iy, Iz, Iw).
As with the Cyrus-Beck algorithm we view the CVV as six infinite planes, and consider where the given
edge lies relative to each plane in turn. We can represent the edge parametrically as A + (C-A)t. It lies at
A when t is 0, and at C when t is 1. For each wall of the CVV we first test whether A and C lie on the
same side of a wall: if they do there is no need to compute the intersection of the edge with the wall. If
they lie on opposite sides we locate the intersection point and clip off the part of the edge that lies
outside.
So we must be able to test whether a point is on the outside or inside of a plane. Take the plane x = -1,
for instance, which is one of the walls of the CVV. The point A lies to the right of it (on the inside) if

ax/aw > -1,   or   ax > -aw,   or   (aw + ax) > 0.    (7.14)
(When you multiply both sides of an inequality by a negative term you must reverse the direction of the
inequality. But we are ultimately dealing with only positive values of aw here - see the exercises.)
Similarly, A is inside the plane x = 1 if

ax/aw < 1,   or   (aw - ax) > 0.
Blinn [blinn96] calls these quantities the boundary coordinates of point A, and he lists the six such
quantities that we work with as in Figure 7.31:
boundary coordinate    homogeneous value    clip plane
BC0                    w + x                x = -1
BC1                    w - x                x = 1
BC2                    w + y                y = -1
BC3                    w - y                y = 1
BC4                    w + z                z = -1
BC5                    w - z                z = 1
Figure 7.31. The boundary coordinates computed for each endpoint of an edge.
We form these six quantities for A and again for C. If all six are positive the point lies inside the CVV. If
any are negative the point lies outside. If both points lie inside we have the same kind of trivial accept
we had in the Cohen-Sutherland clipper of Section 3.3. If A and C lie outside on the same side
(the corresponding BCs are negative) the edge must lie wholly outside the CVV.
Trivial accept: both endpoints lie inside the CVV (all 12 BCs are positive)
Trivial reject: both endpoints lie outside the same plane of the CVV.
If neither condition prevails we must clip segment AC against each plane individually. Just as with the
Cyrus-Beck clipper, we keep track of a candidate interval (CI) (see Figure 4.45), an interval of time
during which the edge might still be inside the CVV. Basically we know the converse: if t is outside the
CI we know for sure the edge is not inside the CVV. The CI extends from t = tin to tout.
We test the edge against each wall in turn. If the corresponding boundary codes have opposite signs we
know the edge hits the plane at some thit, which we then compute. If the edge is entering (is moving into
the inside of the plane as t increases) we update tin = max(old tin, thit), since it could not possibly be
entering at an earlier time than thit. Similarly, if the edge is exiting, we update tout = min(old tout, thit). If at
any time the CI is reduced to the empty interval (tout becomes > tin) we know the entire edge is clipped off
and we have an early out, which saves unnecessary computation.
It is straightforward to calculate the hit time of an edge with a plane. Write the edge parametrically in
homogeneous coordinates as A + (C - A)t. For the plane x = 1, for instance, the hit point satisfies

(ax + (cx - ax)t) / (aw + (cw - aw)t) = 1

This is easily solved for t, yielding

t = (aw - ax) / ((aw - ax) - (cw - cx))    (7.15)
Note that thit depends on only two boundary coordinates. Intersections with the other planes yield similar
formulas.
This is easily put into code, as shown in Figure 7.32. This is basically the Liang-Barsky algorithm
[liang84], with some refinements suggested by Blinn [blinn96]. The routine clipEdge(Point4 A,
Point4 C) takes two points in homogeneous coordinates (having fields x, y, z, and w) and returns 0 if
no part of AC lies in the CVV, and 1 otherwise. It also alters A and C so that when the routine is finished
A and C are the endpoints of the clipped edge.
The routine finds the six boundary coordinates for each endpoint and stores them in aBC[] and cBC[].
For efficiency it also builds an outcode for each point, which holds the signs of the six boundary codes
for that point. Bit i of A's outcode holds a 0 if aBC[i] > 0 (A is inside the i-th wall) and a 1 otherwise.
Using these, a trivial accept occurs when both aOutcode and cOutcode are 0. A trivial reject occurs
when the bit-wise AND of the two outcodes is nonzero.
int clipEdge(Point4& A, Point4& C)
{
	double tIn = 0.0, tOut = 1.0, tHit;
	double aBC[6], cBC[6];
	int aOutcode = 0, cOutcode = 0;
	<.. find the six BCs for A and C, and store them in aBC[] and cBC[] ..>
	<.. form aOutcode and cOutcode from the signs of the BCs ..>
	if((aOutcode & cOutcode) != 0) // trivial reject
		return 0;
	if((aOutcode | cOutcode) == 0) // trivial accept
		return 1;
	for(int i = 0; i < 6; i++) // clip against each wall in turn
	{
		if(cBC[i] < 0) // C is outside wall i: the edge is exiting
		{
			tHit = aBC[i]/(aBC[i] - cBC[i]); // Equation 7.15
			if(tHit < tOut) tOut = tHit;
		}
		else if(aBC[i] < 0) // A is outside wall i: the edge is entering
		{
			tHit = aBC[i]/(aBC[i] - cBC[i]); // Equation 7.15
			if(tHit > tIn) tIn = tHit;
		}
		if(tIn > tOut) return 0; // CI is empty: early out
	}
	Point4 tmp(A); // holds the updated A until C has been updated
	if(aOutcode != 0) // A is out: tIn has changed
	{ // update tmp (using original values of A and C)
		tmp.x = A.x + tIn * (C.x - A.x);
		tmp.y = A.y + tIn * (C.y - A.y);
		tmp.z = A.z + tIn * (C.z - A.z);
		tmp.w = A.w + tIn * (C.w - A.w);
	}
	if(cOutcode != 0) // C is out: tOut has changed
	{ // update C (using original value of A)
		C.x = A.x + tOut * (C.x - A.x);
		C.y = A.y + tOut * (C.y - A.y);
		C.z = A.z + tOut * (C.z - A.z);
		C.w = A.w + tOut * (C.w - A.w);
	}
	A = tmp; // now update A
	return 1; // some of the edge lies inside the CVV
}
Figure 7.32. The edge clipper (as refined by Blinn).
In the loop that tests the edge against each plane, at most one of the BCs can be negative. (Why?) If A
has a negative BC the edge must be entering at the hit point; if C has a negative BC the edge must be
exiting at the hit point. (Why?) (Blinn uses a slightly faster test by incorporating a mask that tests one bit
of an outcode.) Each time tIn or tOut is updated, an early out is taken if tIn has become greater than
tOut.
When all planes have been tested, one or both of tIn and tOut have been altered (why?). A is updated
to A + (C - A)tIn if tIn has changed, and C is updated to A + (C - A)tOut if tOut has changed.
Blinn suggests pre-computing the BCs and outcode for every point to be processed. This eliminates the
need to re-compute these quantities when a vertex is an endpoint of more than one edge, as is often the
case.
Why did we clip against the canonical view volume?
Now that we have seen how easy it is to do clipping against the canonical view volume, we can see the
value of having transformed all objects of interest into it prior to clipping. There are two important
features of the CVV:
1. It is parameter-free: the algorithm needs no extra information to describe the clipping volume. It uses
only the values -1 and 1. So the code itself can be highly tuned for maximum efficiency.
2. Its planes are aligned with the coordinate axes (after the perspective transformation is performed).
This means that we can determine which side of a plane a point lies on using a single coordinate, as
in ax > -1. If the planes were not aligned, an expensive dot product would be needed.
Why did we clip in homogeneous coordinates, rather than after the perspective division step?
This isn't completely necessary, but it makes the clipping algorithm clean, fast, and simple. Doing the perspective divide step destroys information: if you have the values ax and aw explicitly you of course know the signs of both of them. But given only the ratio ax/aw you can tell only whether ax and aw have the same or opposite signs. Keeping values in homogeneous coordinates and clipping points closer to the eye than the near plane automatically removes points that lie behind the eye, such as B in Figure 7.23.
Some perverse situations that necessitate clipping in homogeneous coordinates are described in [blinn96, foley90]. They involve peculiar transformations of objects, or construction of certain surfaces, where the original point (ax, ay, az, aw) has a negative fourth term, even though the point is in front of the eye. None of the objects we discuss modeling here involve such cases. We conclude that clipping in homogeneous coordinates, although usually not critical, makes the algorithm fast and simple, and adds almost no cost.
Following the clipping operation, perspective division is finally done (as in Figure 7.14), and the 3-tuple (x, y, z) is passed through the viewport transformation. As we discuss next, this transformation sizes and shifts the x- and y-values so they are placed properly in the viewport, and makes minor adjustments to the z-component (pseudodepth) to make it more suitable for depth testing.
The Viewport Transformation.
As we have seen, the perspective transformation squashes the scene into the canonical cube, as suggested in Figure 7.33. If the aspect ratio of the camera's view volume (that is, the aspect ratio of the window on the near plane) is 1.5, there is obvious distortion introduced when the perspective transformation scales objects into a window with aspect ratio 1. But the viewport transformation can undo this distortion by mapping a square into a viewport of aspect ratio 1.5. We normally set the aspect ratio of the viewport to be the same as that of the view volume.
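In OpenGL terms this might be done as in the following sketch, where the aspect ratio passed to gluPerspective() is matched by the width-to-height ratio of the viewport (the particular numbers are illustrative):

gluPerspective(60.0, 1.5, 0.1, 100.0); // view angle, aspect ratio, near, far
glViewport(0, 0, 600, 400);            // 600/400 = 1.5, matching the view volume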
Practice Exercises.
7.4.5. P projects where? Suppose the viewplane is given in camera coordinates by the equation Ax + By
+ Cz = D. Show that any point P projects onto this plane at the point given in homogeneous coordinates
7.4.10. Show the final form of the projection matrix. The projection matrix is basically that of
Equation 7.10, followed by the shift and scaling described. If the matrix of Equation 7.10 is denoted as
M, and T represents the shifting matrix, and S the scaling matrix, show that the matrix product STM is
that given in Equation 7.13.
7.4.11. What Becomes of Points Behind the Eye? If the perspective transformation moves the eye off to
-infinity, what happens to points that lie behind the eye? Consider a line, P(t), that begins at a point in
front of the eye at t = 0 and moves to one behind the eye at t = 1.
a). Find its parametric form in homogeneous coordinates;
b). Find the parametric representation after it undergoes the perspective transformation;
c). Interpret it geometrically. Specifically state what the fourth homogeneous coordinate is geometrically.
A valuable discussion of this phenomenon is given in [blinn78].
Preview.
Section 8.1 motivates the need for enhancing the realism of pictures of 3D objects. Section 8.2
introduces various shading models used in computer graphics, and develops tools for computing the
ambient, diffuse, and specular light contributions to an object's color. It also describes how to set up
light sources in OpenGL, how to describe the material properties of surfaces, and how the OpenGL
graphics pipeline operates when rendering polygonal meshes.
Section 8.3 focuses on rendering objects modeled as polygon meshes. Flat shading, as well as
Gouraud and Phong shading, are described. Section 8.4 develops a simple hidden surface removal
technique based on a depth buffer. Proper hidden surface removal greatly improves the realism of
pictures.
Section 8.5 develops methods for painting texture onto the surface of an object, to make it appear
to be made of a real material such as brick or wood, or to wrap a label or picture of a friend around
it. Procedural textures, which create texture through a routine, are also described. The thorny issue
of proper interpolation of texture is developed in detail. Section 8.5.4 presents a complete program
that uses OpenGL to add texture to objects. The next sections discuss mapping texture onto curved
surfaces, bump mapping, and environment mapping, providing more tools for making a 3D scene
appear real.
Section 8.6 describes two techniques for adding shadows to pictures. The chapter finishes with a
number of Case Studies that delve deeper into some of these topics, and urges the reader to
experiment with them.
8.1. Introduction.
In previous chapters we have developed tools for modeling mesh objects, and for manipulating a
camera to view them and make pictures. Now we want to add tools to make these objects and others
look visually interesting, or realistic, or both. Some examples in Chapter 5 invoked a number of
OpenGL functions to produce shiny teapots and spheres apparently bathed in light, but none of the
underlying theory of how this is done was examined. Here we rectify this, and develop the lore of
rendering a picture of the objects of interest. This is the business of computing how each pixel of a
picture should look. Much of it is based on a shading model, which attempts to model how light
that emanates from light sources would interact with objects in a scene. Due to practical limitations
one usually doesn't try to simulate all of the physical principles of light scattering and reflection;
this is very complicated and would lead to very slow algorithms. But a number of approximate
models have been invented that do a good job and produce various levels of realism.
We start by describing a hierarchy of techniques that provide increasing levels of realism, in order
to show the basic issues involved. Then we examine how to incorporate each technique in an
application, and also how to use OpenGL to do much of the hard work for us.
At the bottom of the hierarchy, offering the lowest level of realism, is a wire-frame rendering.
Figure 8.1 shows a flurry of 540 cubes as wire-frames. Only the edges of each object are drawn, and
you can see right through an object. It can be difficult to see what's what. (A stereo view would
help a little.)
Figure 8.3. A mesh approximation shaded with a shading model: a). wire-frame view, b). flat shading.
The next step up is of course to use color. Plate 22 shows the same scene where the objects are
given different colors.
In Chapter 6 we discussed building a mesh approximation to a smoothly curved object. A picture of
such an object ought to reflect this smoothness, showing the smooth underlying surface rather
than the individual polygons. Figure 8.4 shows the scene rendered using smooth shading. (Plate 23
shows the colored version.) Here different points of a face are drawn with different gray levels
found through an interpolation scheme known as Gouraud shading. The variation in gray levels is
much smoother, and the edges of polygons disappear, giving the impression of a smooth, rather than
a faceted, surface. We examine Gouraud shading in Section 8.3.
observer's eye, and the normal to the surface. Others are related to the characteristics of the surface, such as its roughness and color.
A shading model dictates how light is scattered or reflected from a surface. We shall examine some
simple shading models here, focusing on achromatic light. Achromatic light has brightness but no
color; it is only a shade of gray. Hence it is described by a single value: intensity. We shall see how
to calculate the intensity of the light reaching the eye of the camera from each portion of the object.
We then extend the ideas to include colored lights and colored objects. The computations are almost
identical to those for achromatic light, except that separate intensities of red, green, and blue
components are calculated.
A shading model frequently used in graphics supposes that two types of light sources illuminate the
objects in a scene: point light sources and ambient light. These light sources shine on the various
surfaces of the objects, and the incident light interacts with the surface in three different ways:
If all incident light is absorbed, the object appears black and is known as a black body. If all is
transmitted, the object is visible only through the effects of refraction, which we shall discuss in
Chapter 14.
Here we focus on the part of the light that is reflected or scattered from the surface. Some amount
of this reflected light travels in just the right direction to reach the eye, causing the object to be
seen. The fraction that travels to the eye is highly dependent on the geometry of the situation. We
assume that there are two types of reflection of incident light: diffuse scattering and specular
reflection.
Diffuse scattering occurs when some of the incident light slightly penetrates the surface and is re-radiated uniformly in all directions. Scattered light interacts strongly with the surface, and so its color is usually affected by the nature of the surface material.
Specular reflections are more mirror-like and are highly directional: Incident light does not
penetrate the object but instead is reflected directly from its outer surface. This gives rise to
highlights and makes the surface look shiny. In the simplest model for specular light the
reflected light has the same color as the incident light. This tends to make the material look like
plastic. In a more complex model the color of the specular light varies over the highlight,
providing a better approximation to the shininess of metal surfaces. We discuss both models for
specular reflections.
Most surfaces produce some combination of the two types of reflection, depending on surface
characteristics such as roughness and type of material. We say that the total light reflected from the
surface in a certain direction is the sum of the diffuse component and the specular component. For
each surface point of interest we compute the size of each component that reaches the eye.
Algorithms are developed next that accomplish this.
Figure 8.9. Light computations are made for one side of each face.
We shall develop the shading model for a given side of a face. If that side of the face is turned
away from the eye there is normally no light contribution. In an actual application the rendering
algorithm must be told whether to compute light contributions from one side or both sides of a
given face. We shall see that OpenGL supports this.
$$I_d = I_s \rho_d \,\frac{\mathbf{s}\cdot\mathbf{m}}{|\mathbf{s}||\mathbf{m}|}$$
In this equation, Is is the intensity of the light source, and ρd is the diffuse reflection coefficient. Note that if the facet is aimed away from the light source this dot product is negative, and we want Id to evaluate to 0. So a more precise computation of the diffuse component is:
$$I_d = I_s \rho_d \,\max\!\left(\frac{\mathbf{s}\cdot\mathbf{m}}{|\mathbf{s}||\mathbf{m}|},\; 0\right) \qquad (8.1)$$
This max term might be implemented in code (using the Vector3 methods dot() and length() - see Appendix 3) by:
double tmp = s.dot(m); // form the dot product
double value = (tmp<0) ? 0 : tmp/(s.length() * m.length());
Figure 8.11 shows how a sphere appears when it reflects diffuse light, for six reflection coefficients: 0, 0.2, 0.4, 0.6, 0.8, and 1. In each case the source intensity is 1.0 and the background intensity is set to 0.4. Note that the sphere is totally black when ρd is 0.0, and the shadow in its bottom half (where the dot product above is negative) is also black.
Figure 8.11. Spheres with various reflection coefficients shaded with diffuse light.
In reality the mechanism behind diffuse reflection is much more complex than the simple model we have adopted here. The reflection coefficient ρd depends on the wavelength (color) of the incident light, the angle of incidence, and various physical properties of the surface. But for simplicity and to reduce computation time, these effects are usually suppressed when rendering images. A reasonable value for ρd is chosen for each surface, sometimes by trial and error according to the realism observed in the resulting image.
In some shading models the effect of distance is also included, although it is somewhat
controversial. The light intensity falling on facet S in Figure 8.10 from the point source is known to
fall off as the inverse square of the distance between S and the source. But experiments have shown
that using this law yields pictures with exaggerated depth effects. (What is more, it is sometimes
convenient to model light sources as if they lie at infinity. Using an inverse square law in such a
case would quench the light entirely!) The problem is thought to be in the model: We model light
sources as point sources for simplicity, but most scenes are actually illuminated by additional
reflections from the surroundings, which are difficult to model. (These effects are lumped together
into an ambient light component.) It is not surprising, therefore, that strict adherence to a physical
law based on an unrealistic model can lead to unrealistic results.
The realism of most pictures is enhanced rather little by the introduction of a distance term. Some
approaches force the intensity to be inversely proportional to the distance between the eye and the
object, but this is not based on physical principles. It is interesting to experiment with such effects,
and OpenGL provides some control over this effect, as we see in Section 8.2.9, but we don't include
a distance term in the following development.
appearance, so the Phong model is good when you intend the object to be made of shiny plastic or
glass. The Phong model is less successful with objects that are supposed to have a shiny metallic
surface, although you can roughly approximate them with OpenGL by careful choices of certain
color parameters, as we shall see. More advanced models of specular light have been developed that
do a better job of modeling shiny metals. These are not supported directly by OpenGL's rendering
process, so we defer a detailed discussion of them to Chapter 14 on ray tracing.
Figure 8.12a shows a situation where light from a source impinges on a surface and is reflected in
different directions. In the Phong model we discuss here, the amount of light reflected is greatest in
the direction of perfect mirror reflection, r, where the angle of incidence equals the angle of
reflection. This is the direction in which all light would travel if the surface were a perfect mirror.
At other nearby angles the amount of light reflected diminishes rapidly, as indicated by the relative
lengths of the reflected vectors. Part b shows this in terms of a beam pattern familiar in radar
circles. The distance from P to the beam envelope shows the relative strength of the light scattered
in that direction.
Figure 8.12. Specular reflection from a shiny surface.
Part c shows how to quantify this beam pattern effect. We know from Chapter 5 that the direction r
of perfect reflection depends on both s and the normal vector m to the surface, according to:
$$\mathbf{r} = -\mathbf{s} + 2\,\frac{\mathbf{s}\cdot\mathbf{m}}{|\mathbf{m}|^2}\,\mathbf{m} \qquad (8.2)$$
For surfaces that are shiny but not true mirrors, the amount of light reflected falls off as the angle φ between r and v increases. The actual amount of falloff is a complicated function of φ, but in the Phong model it is said to vary as some power f of the cosine of φ, that is, according to (cos(φ))^f, in which f is chosen experimentally and usually lies between 1 and 200.
Figure 8.13 shows how this intensity function varies with φ for different values of f. As f increases, the reflection becomes more mirror-like and is more highly concentrated along the direction r. A perfect mirror could be modeled using f = ∞, but pure reflections are usually handled in a different manner, as described in Chapter 14.
Figure 8.13. Falloff of specular light with angle.
Using the equivalence of cos(φ) and the dot product between r and v (after they are normalized), the contribution Isp due to specular reflection is modeled by

$$I_{sp} = I_s \rho_s \left(\frac{\mathbf{r}\cdot\mathbf{v}}{|\mathbf{r}||\mathbf{v}|}\right)^{f} \qquad (8.3)$$
where the new term ρs is the specular reflection coefficient. Like most other coefficients in the shading model, it is usually determined experimentally. (As with the diffuse term, if the dot product r · v is found to be negative, Isp is set to zero.)
A boost in efficiency using the halfway vector. It can be expensive to compute the specular
term in Equation 8.3, since it requires first finding vector r and normalizing it. In practice an
alternate term, apparently first described by Blinn [blinn77], is used to speed up computation.
Instead of using the cosine of the angle between r and v, one finds a vector halfway between s and
v, that is, h = s + v, as suggested in Figure 8.14. If the normal to the surface were oriented along h
the viewer would see the brightest specular highlight. Therefore the angle β between m and h can be used to measure the falloff of specular intensity that the viewer sees. The angle β is not the same as φ (in fact φ is twice β if the various vectors are coplanar - see the exercises), but this difference can be compensated for by using a different value of the exponent f. (The specular term is not based on physical principles anyway, so it is at least plausible that our adjustment to it yields acceptable results.) Thus it is common practice to base the specular term on cos(β), using the dot product of h and m:
$$I_{sp} = I_s \rho_s \left(\max\!\left(0,\; \frac{\mathbf{h}\cdot\mathbf{m}}{|\mathbf{h}||\mathbf{m}|}\right)\right)^{f} \qquad (8.4)$$
Note that with this adjustment the reflection vector r need not be found, saving computation time. In addition, if both the light source and viewer are very remote then s and v are constant over the different faces of an object, so h need only be computed once.
Figure 8.15 shows a sphere reflecting different amounts of specular light. The reflection coefficient ρs varies from top to bottom with values 0.25, 0.5, and 0.75, and the exponent f varies from left to right with values 3, 6, 9, 25, and 200. (The ambient and diffuse reflection coefficients are 0.1 and 0.4 for all spheres.)
$$I = I_a \rho_a + I_d \rho_d \times \text{lambert} + I_{sp} \rho_s \times \text{phong}^f \qquad (8.5)$$

where

$$\text{lambert} = \max\!\left(0,\; \frac{\mathbf{s}\cdot\mathbf{m}}{|\mathbf{s}||\mathbf{m}|}\right), \quad \text{phong} = \max\!\left(0,\; \frac{\mathbf{h}\cdot\mathbf{m}}{|\mathbf{h}||\mathbf{m}|}\right) \qquad (8.6)$$
I depends on the various source intensities and reflection coefficients, as well as on the relative positions of the point P, the eye, and the point light source. Here we have given different names, Id and Isp, to the intensities of the diffuse and specular components of the light source, because OpenGL allows you to set them individually, as we see later. In practice they usually have the same value.
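As a concrete illustration, Equation 8.5 might be computed as in the following sketch. It assumes the Vector3 class of Appendix 3 with fields x, y, and z and methods dot() and length(); the function name shade() and its parameter names are illustrative, not part of the book's code, and pow() comes from <cmath>.

double shade(Vector3 s, Vector3 m, Vector3 v,
	double Ia, double Id, double Isp,      // source intensities
	double rhoA, double rhoD, double rhoS, // reflection coefficients
	double f)                              // specular exponent
{
	Vector3 h(s.x + v.x, s.y + v.y, s.z + v.z); // halfway vector h = s + v
	double lambert = s.dot(m) / (s.length() * m.length());
	if (lambert < 0) lambert = 0; // facet turned away from the source
	double phong = h.dot(m) / (h.length() * m.length());
	if (phong < 0) phong = 0;     // no highlight is seen
	return Ia * rhoA + Id * rhoD * lambert + Isp * rhoS * pow(phong, f);
}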
To gain some insight into the variation of I with the position of P, consider again Figure 8.10. I is
computed for different points P on the facet shown. The ambient component shows no variation
over the facet; m is the same for all P on the facet, but the directions of both s and v depend on P.
(For instance, s = S - P where S is the location of the light source. How does v depend on P and the
eye?) If the light source is fairly far away (the typical case), s will change only slightly as P
changes, so that the diffuse component will change only slightly for different points P. This is
especially true when s and m are nearly aligned, as the value of cos(θ) changes slowly for small
angles. For remote light sources, the variation in the direction of the halfway vector h is also slight
as P varies. On the other hand, if the light source is close to the facet, there can be substantial
changes in s and h as P varies. Then the specular term can change significantly over the facet, and
the bright highlight can be confined to a small portion of the facet. This effect is increased when the
eye is also close to the facet -causing large changes in the direction of v - and when the exponent f
is very large.
Practice Exercise 8.2.4. Effect of the Eye Distance. Describe how much the various light
contributions change as P varies over a facet when a). the eye is far away from the facet and b).
when the eye is near the facet.
To incorporate color, Equation 8.5 is applied to the red, green, and blue components individually, each with its own set of intensities and reflection coefficients. For the red component:

$$I_r = I_{ar}\rho_{ar} + I_{dr}\rho_{dr} \times \text{lambert} + I_{spr}\rho_{sr} \times \text{phong}^f \qquad (8.7)$$

with similar expressions for the green and blue components.
The ambient and diffuse reflection coefficients are based on the color of the surface itself. By
color of a surface we mean the color that is reflected from it when the illumination is white light:
a surface is red if it appears red when bathed in white light. If bathed in some other color it can
exhibit an entirely different color. The following examples illustrate this.
Example 8.2.1. The color of an object. If we say that the color of a sphere is 30% red, 45% green,
and 25% blue, it makes sense to set its ambient and diffuse reflection coefficients to (0.3K, 0.45K,
0.25K), where K is some scaling value that determines the overall fraction of incident light that is
reflected from the sphere. Now if it is bathed in white light having equal amounts of red, green, and
blue (Isr = Isg = Isb = I) the individual diffuse components have intensities Ir = 0.3 K I, Ig = 0.45 K I, Ib
= 0.25 K I, so as expected we see a color that is 30% red, 45% green, and 25% blue.
Example 8.2.2. A reddish object bathed in greenish light. Suppose a sphere has ambient and
diffuse reflection coefficients (0.8 , 0.2, 0.1 ), so it appears mostly red when bathed in white light.
We illuminate it with a greenish light Is = (0.15, 0.7, 0.15). The reflected light is then given by
(0.12, 0.14, 0.015), which is a fairly even mix of red and green, and would appear yellowish (as we
discuss further in Chapter 12).
The color of specular light. Because specular light is mirror-like, the color of the specular
component is often the same as that of the light source. For instance, it is a matter of experience
that the specular highlight seen on a glossy red apple when illuminated by a yellow light is yellow
rather than red. This is also observed for shiny objects made of plastic-like material. To create specular highlights for a plastic surface the specular reflection coefficients, ρsr, ρsg, and ρsb, used in Equation 8.7 are set to the same value, say ρs, so that the reflection coefficients are gray in nature and do not alter the color of the incident light. The designer might choose ρs = 0.5 for a slightly shiny plastic surface, or ρs = 0.9 for a highly shiny surface.
Objects made of different materials.
A careful selection of reflection coefficients can make an object appear to be made of a specific
material such as copper, gold, or pewter, at least approximately. McReynolds and Blythe
[mcReynolds97] have suggested using the reflection coefficients given in Figure 8.17. Plate ???
shows several spheres modelled using these coefficients. The spheres do appear to be made of
different materials. Note that the specular reflection coefficients have different red, green, and blue
components, so the color of specular light is not simply that of the incident light. But McReynolds and Blythe caution users that, because OpenGL's shading algorithm incorporates a Phong specular component, the visual effects are not completely realistic. We shall revisit the issue in Chapter 14 and describe the more realistic Cook-Torrance shading approach.
Material        | ambient: ρar, ρag, ρab        | diffuse: ρdr, ρdg, ρdb        | specular: ρsr, ρsg, ρsb       | exponent: f
Black Plastic   | 0.0, 0.0, 0.0                 | 0.01, 0.01, 0.01              | 0.50, 0.50, 0.50              | 32
Brass           | 0.329412, 0.223529, 0.027451  | 0.780392, 0.568627, 0.113725  | 0.992157, 0.941176, 0.807843  | 27.8974
Bronze          | 0.2125, 0.1275, 0.054         | 0.714, 0.4284, 0.18144        | 0.393548, 0.271906, 0.166721  | 25.6
Chrome          | 0.25, 0.25, 0.25              | 0.4, 0.4, 0.4                 | 0.774597, 0.774597, 0.774597  | 76.8
Copper          | 0.19125, 0.0735, 0.0225       | 0.7038, 0.27048, 0.0828       | 0.256777, 0.137622, 0.086014  | 12.8
Gold            | 0.24725, 0.1995, 0.0745       | 0.75164, 0.60648, 0.22648     | 0.628281, 0.555802, 0.366065  | 51.2
Pewter          | 0.10588, 0.058824, 0.113725   | 0.427451, 0.470588, 0.541176  | 0.3333, 0.3333, 0.521569      | 9.84615
Silver          | 0.19225, 0.19225, 0.19225     | 0.50754, 0.50754, 0.50754     | 0.508273, 0.508273, 0.508273  | 51.2
Polished Silver | 0.23125, 0.23125, 0.23125     | 0.2775, 0.2775, 0.2775        | 0.773911, 0.773911, 0.773911  | 89.6
Figure 8.17. Parameters for common materials [mcReynolds97].
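For instance, the Gold row of Figure 8.17 might be handed to OpenGL with the glMaterial functions described later in this chapter; a sketch, with the values copied from the table:

GLfloat goldAmb[]  = {0.24725, 0.1995, 0.0745, 1.0};
GLfloat goldDiff[] = {0.75164, 0.60648, 0.22648, 1.0};
GLfloat goldSpec[] = {0.628281, 0.555802, 0.366065, 1.0};
glMaterialfv(GL_FRONT, GL_AMBIENT, goldAmb);
glMaterialfv(GL_FRONT, GL_DIFFUSE, goldDiff);
glMaterialfv(GL_FRONT, GL_SPECULAR, goldSpec);
glMaterialf(GL_FRONT, GL_SHININESS, 51.2); // the exponent f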
The color at each new vertex is usually found by interpolation. For instance, suppose that the color
at v0 is (r0, g0, b0) and the color at v1 is (r1, g1, b1). If the point a is 40% of the way from v0 to v1 the
color associated with a is a blend of 60% of (r0, g0, b0) and 40% of (r1, g1, b1). This is expressed as
color at point a = (lerp(r0, r1, 0.4), lerp(g0, g1, 0.4), lerp(b0, b1, 0.4))
(8.8)
where we use the convenient function lerp() (short for linear interpolation - recall tweening in
Section 4.5.3) defined by:
lerp(G, H, f) = G + (H - G)f
(8.9)
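A sketch of Equations 8.8 and 8.9 in code, assuming a Color3 class with fields red, green, and blue (the field names are illustrative):

float lerp(float G, float H, float f) { return G + (H - G) * f; } // Equation 8.9

Color3 lerpColor(Color3 c0, Color3 c1, float f) // Equation 8.8 at fraction f
{
	return Color3(lerp(c0.red,   c1.red,   f),
	              lerp(c0.green, c1.green, f),
	              lerp(c0.blue,  c1.blue,  f));
}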
1 In Section 8.5 we discuss replacing linear interpolation by hyperbolic interpolation as a more accurate way
to form the colors at the new vertices formed by clipping.
2 Here and elsewhere the type float would most likely serve as well as GLfloat. But using GLfloat
makes your code more portable to other OpenGL environments.
whereas for the other sources the diffuse and specular values have defaults of black.
Spotlights.
Light sources are point sources by default, meaning that they emit light uniformly in all directions. But OpenGL allows you to make them into spotlights, so they emit light in a restricted set of directions. Figure 8.21 shows a spotlight aimed in direction d, with a cutoff angle of α.
No light is seen at points lying outside the cutoff cone. For vertices such as P that lie inside the cone, the amount of light reaching P is attenuated by the factor cos^ε(β), where β is the angle between d and a line from the source to P, and ε is an exponent chosen by the user to give the desired fall-off of light with angle.
The default values for these parameters are d = (0, 0, -1), α = 180°, and ε = 0, which make a source an omni-directional point source.
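In code, a spotlight might be set up as follows (the particular values are illustrative):

GLfloat dir[] = {0.0, 0.0, -1.0};             // spotlight direction d
glLightfv(GL_LIGHT0, GL_SPOT_DIRECTION, dir);
glLightf(GL_LIGHT0, GL_SPOT_CUTOFF, 45.0);    // cutoff angle, in degrees
glLightf(GL_LIGHT0, GL_SPOT_EXPONENT, 4.0);   // fall-off exponent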
experiment with different fall-off rates, and to fine tune a picture. OpenGL attenuates the strength
of a positional3 light source by the following attenuation factor:
$$\text{atten} = \frac{1}{k_c + k_l D + k_q D^2} \qquad (8.11)$$
where kc, kl, and kq are coefficients and D is the distance between the light's position and the vertex
in question. This expression is rich enough to allow you to model any combination of constant,
linear, and quadratic (inverse square law) dependence on distance from a source. These parameters
are controlled by function calls:
glLightf(GL_LIGHT0, GL_CONSTANT_ATTENUATION, 2.0);
and similarly for GL_LINEAR_ATTENUATION, and GL_QUADRATIC_ATTENUATION. The
default values are kc = 1, kl = 0, and kq = 0, which eliminate any attenuation.
Lighting model.
OpenGL allows three parameters to be set that specify general rules for applying the shading model.
These parameters are passed to variations of the function glLightModel.
a). The color of global ambient light. You can establish a global ambient light source in a scene
that is independent of any particular source. To create this light, specify its color using:
GLfloat amb[] = {0.2, 0.3, 0.1, 1.0};
glLightModelfv(GL_LIGHT_MODEL_AMBIENT, amb);
This sets the ambient source to the color (0.2, 0.3,0.1). The default value is (0.2, 0.2, 0.2, 1.0), so
this ambient light is always present unless you purposely alter it. This makes objects in a scene
visible even if you have not invoked any of the lighting functions.
b). Is the viewpoint local or remote? OpenGL computes specular reflections using the halfway
vector h = s + v described in Section 8.2.3. The true directions s and v are normally different at
each vertex in a mesh (visualize this). If the light source is directional then s is constant, but v still
varies from vertex to vertex. Rendering speed is increased if v is made constant for all vertices. This
is the default: OpenGL uses v = (0, 0, 1), which points along the positive z-axis in camera
coordinates. You can force the pipeline to compute the true value of v for each vertex by executing:
glLightModeli(GL_LIGHT_MODEL_LOCAL_VIEWER, GL_TRUE);
c). Are both sides of a polygon shaded properly? Each polygonal face in a model has two sides.
When modeling we tend to think of them as the inside and outside surfaces. The convention is
to list the vertices of a face in counter-clockwise (CCW) order as seen from outside the object. Most
mesh objects represent solids that enclose space, so there is a well defined inside and outside. For
such objects the camera can only see the outside surface of each face (assuming the camera is not
inside the object!). With proper hidden surface removal the inside surface of each face is hidden
from the eye by some closer face.
OpenGL has no notion of inside and outside. It can only distinguish between front faces and
back faces. A face is a front face if its vertices are listed in counter-clockwise (CCW) order as
seen by the eye4. Figure 8.22a shows the eye viewing a cube, which we presume was modeled using
the CCW ordering convention. Arrows indicate the order in which the vertices of each face are
passed to OpenGL (in a glBegin(GL_POLYGON);...; glEnd() block). For a space-enclosing
object all faces that are visible to the eye are therefore front faces, and OpenGL draws them
3 This attenuation factor is disabled for directional light sources, since they are infinitely remote.
4 You can reverse this sense with glFrontFace(GL_CW), which decrees that a face is a front face only if its vertices are listed in clockwise order. The default is glFrontFace(GL_CCW).
properly with the correct shading. OpenGL also draws the back faces5, but they are ultimately
hidden by closer front faces.
Figure 8.22. OpenGL's definition of a front face.
Things are different in part b, which shows a box with a face removed. Again arrows indicate the
order in which vertices of a face are sent down the pipeline. Now three of the visible faces are back
faces. By default OpenGL does not shade these properly. To coerce OpenGL to do proper shading
of back faces, use:
glLightModeli(GL_LIGHT_MODEL_TWO_SIDE, GL_TRUE);
Then OpenGL reverses the normal vectors of any back-face so that they point toward the viewer,
and it performs shading computations properly. Replace GL_TRUE with GL_FALSE (the default) to
turn off this facility.
Note: Faces drawn by OpenGL do not cast shadows, so the back faces receive the same light from a
source even though there may be some other face between them and the source.
Moving light sources.
Recall that light sources pass through the modelview matrix just as vertices do. Therefore lights can
be repositioned by suitable uses of glRotated() and glTranslated(). The array position specified using glLightfv(GL_LIGHT0, GL_POSITION, position) is modified by the modelview matrix in effect at the time glLightfv() is called. So to modify the light position with transformations, and independently move the camera, embed the light positioning command in a push/pop pair, as in:
void display()
{
GLfloat position[] = {2, 1, 3, 1}; //initial light position
<.. clear color and depth buffers ..>
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glPushMatrix();
glRotated(...); // move the light
glTranslated(...);
glLightfv(GL_LIGHT0, GL_POSITION, position);
glPopMatrix();
gluLookAt(...); // set the camera position
<.. draw the object ..>
glutSwapBuffers();
}
On the other hand, to have the light move with the camera, use:
GLfloat pos[] = {0, 0, 0, 1};
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glLightfv(GL_LIGHT0, GL_POSITION, pos); // light at (0,0,0) in eye coordinates
gluLookAt(...); // move the light and the camera
<.. draw the object ..>
This establishes the light to be positioned at the eye (like a miner's lamp), and the light moves with the camera.
5 You can improve performance by instructing OpenGL to skip rendering of back faces, with
glCullFace(GL_BACK); glEnable(GL_CULL_FACE);
You can see the effect of a light source only when light reflects off an object's surface. OpenGL
provides ways to specify the various reflection coefficients that appear in Equation 8.7. They are set
with variations of the function glMaterial, and they can be specified individually for front faces
and back faces (see the discussion concerning Figure 8.22). For instance,
GLfloat myDiffuse[] = {0.8, 0.2, 0.0, 1.0};
glMaterialfv(GL_FRONT, GL_DIFFUSE, myDiffuse);
sets the diffuse reflection coefficient (ρdr, ρdg, ρdb) = (0.8, 0.2, 0.0) for all subsequently specified front faces. Reflection coefficients are specified as a 4-tuple in RGBA format, just like a color. The
first parameter of glMaterialfv() can take on values:
GL_FRONT : set the reflection coefficient for front faces
GL_BACK: set it for back faces
GL_FRONT_AND_BACK: set it for both front and back faces
The second parameter can take on values:
GL_AMBIENT: set the ambient reflection coefficients
GL_DIFFUSE: set the diffuse reflection coefficients
GL_SPECULAR: set the specular reflection coefficients
GL_AMBIENT_AND_DIFFUSE: set both the ambient and diffuse reflection coefficients to the same
values. This is for convenience, since the ambient and diffuse coefficients are so often chosen to be
the same.
GL_EMISSION: set the emissive color of the surface.
The last choice sets the emissive color of a face, causing it to glow in the specified color,
independent of any light source.
Putting it all together.
We now extend Equation 8.7 to include the additional contributions that OpenGL actually calculates. The total red component is given by:

$$I_r = e_r + I_{mr}\rho_{ar} + \sum_i \text{atten}_i \cdot \text{spot}_i \cdot \left(I^{i}_{ar}\rho_{ar} + I^{i}_{dr}\rho_{dr} \times \text{lambert}_i + I^{i}_{spr}\rho_{sr} \times \text{phong}_i^{\,f}\right) \qquad (8.12)$$
Expressions for the green and blue components are similar. The emissive light is er, and Imr is the global ambient light introduced in the lighting model. The summation denotes that the ambient, diffuse, and specular contributions of all light sources are summed. For the i-th source, atteni is the attenuation factor of Equation 8.11, spoti is the spotlight factor (see Figure 8.21), and lamberti and phongi are the familiar diffuse and specular dot products. All of these terms must be recalculated for each source.
Note: If Ir turns out to have a value larger than 1.0, OpenGL clamps it to 1.0: the brightest any light component can be is 1.0.
The light source is given a color of (0.8, 0.8, 0.8) for both its diffuse and specular components.
There is a global ambient term (Iar, Iag, Iab) = (0.2, 0.2, 0.2).
The current material properties are loaded into each object's mtrl field at the time it is created (see the end of Scene::getObject() in Shape.cpp of Appendix 4). When an object draws itself using its drawOpenGL() method, it first passes its material properties to OpenGL (see Shape::tellMaterialsGL()), so that at the moment it is actually drawn OpenGL has these properties in its current state.
In Chapter 14 when ray tracing we shall use each object's material field in a similar way to acquire the material properties and do proper shading.
Ernst Mach (1838-1916) was an Austrian physicist whose early work strongly influenced the theory of relativity.
is seen (as discussed further in the exercises). This exaggerates the polygonal look of mesh
objects rendered with flat shading.
Specular highlights are rendered poorly with flat shading, again because an entire face is filled with
a color that was computed at only one vertex. If there happens to be a large specular component at
the representative vertex, that brightness is drawn uniformly over the entire face. If a specular
highlight doesn't fall on the representative point, it is missed entirely. For this reason, there is little
incentive for including the specular reflection component in the shading computation.
8.3.2. Smooth Shading.
Smooth shading attempts to de-emphasize edges between faces by computing colors at more points
on each face. There are two principal types of smooth shading, called Gouraud shading and Phong
shading [gouraud71, phong75]. OpenGL does only Gouraud shading, but we describe both of them.
Gouraud shading computes a different value of c for each pixel. For the scan-line at ys (in Figure 8.23) it finds the color at the leftmost pixel, colorleft, by linear interpolation of the colors at the top
and bottom of the left edge7. For the scan-line at ys the color at the top is color4 and that at the
bottom is color1, so colorleft would be calculated as (recall Equation 8.9):
$$\text{color}_{left} = \operatorname{lerp}(\text{color}_1, \text{color}_4, f) \qquad (8.13)$$

where the fraction

$$f = \frac{y_s - y_{bott}}{y_4 - y_{bott}}$$

varies between 0 and 1 as ys varies from ybott to y4. Note that Equation 8.13 involves three calculations since each color quantity has a red, green, and blue component.
Similarly colorright is found by interpolating the colors at the top and bottom of the right edge. The
tiler then fills across the scanline, linearly interpolating between colorleft and colorright to obtain the
color at pixel x:
$$c(x) = \operatorname{lerp}\!\left(\text{color}_{left},\; \text{color}_{right},\; \frac{x - x_{left}}{x_{right} - x_{left}}\right) \qquad (8.14)$$
To increase efficiency this color is computed incrementally at each pixel. That is, there is a constant
difference between c(x+1) and c(x), so
$$c(x+1) = c(x) + \frac{\text{color}_{right} - \text{color}_{left}}{x_{right} - x_{left}} \qquad (8.15)$$
The increment is calculated only once outside of the innermost loop. In terms of code this looks like:
for (int y = ybott; y <= ytop; y++)
// for each scan-line
{
<.. find xleft and xright ..>
<.. find colorleft and colorright ..>
colorinc = (colorright - colorleft)/(xright - xleft);
for (int x = xleft, c = colorleft; x <= xright; x++, c+=colorinc)
<.. put c into the pixel at (x, y) ..>
}
7 We shall see later that, although colors are usually interpolated linearly as we do here, better results can be
obtained by using so-called hyperbolic interpolation. For Gouraud shading the distinction is minor; for texture
mapping it is crucial.
Gouraud shading is modestly more expensive computationally than flat shading. Gouraud shading is
established in OpenGL using:
glShadeModel(GL_SMOOTH);
Figure 8.25 shows a buckyball and a sphere rendered using Gouraud shading. The buckyball looks
the same as when it was flat shaded in Figure 8.24, because the same color is associated with each
vertex of a face, so interpolation changes nothing. But the sphere looks much smoother. There are
no abrupt jumps in color between neighboring faces. The edges of the faces (and the Mach bands)
are gone, replaced by a smoothly varying color across the object. Along the silhouette, however,
you can still see the bounding edges of individual faces.
Figure 8.25. Two meshes rendered using smooth shading.
Why do the edges disappear with this technique? Figure 8.26a shows two faces, F and F', that share an edge. When rendering F the colors cL and cR are used, and when rendering F' the colors c'L and c'R are used. But since cR equals c'L there is no abrupt change in color at the edge along the scanline.
Figure 8.26. Continuity of color across a polygon edge. a). two faces abutting, b). cross section: the underlying surface can be seen.
Figure 8.26b suggests how this technique reveals the underlying surface approximated by the
mesh. The polygonal surface is shown in cross section, with vertices V1, V2, etc. marked. The
imaginary smooth surface that the mesh supposedly represents is suggested as well. Properly
computed vertex normals m1, m2, etc. point perpendicularly to this imaginary surface, so the normal
for correct shading is being used at each vertex, and the color thereby found is correct. The color
is then made to vary smoothly between vertices, not following any physical law but rather a simple
mathematical one.
Because colors are formed by interpolating rather than computing colors at every pixel, Gouraud
shading does not picture highlights well. Therefore, when Gouraud shading is used, one normally
suppresses the specular component of intensity in Equation 8.12. Highlights are better reproduced
using Phong shading, discussed next.
Phong Shading.
Greater realism can be achieved - particularly with regard to highlights on shiny objects - by a
better approximation of the normal vector to the face at each pixel. This type of shading is called
Phong shading, after its inventor Phong Bui-tuong [phong75].
When computing Phong shading we find the normal vector at each point on the face and we apply
the shading model there to find the color. We compute the normal vector at each pixel by
interpolating the normal vectors at the vertices of the polygon.
Figure 8.27 shows a projected face, with the normal vectors m1, m2, m3, and m4 indicated at the four
vertices. For the scan-line ys as shown the vectors mleft and mright are found by linear interpolation.
For instance, mleft is found as
$$\mathbf{m}_{left} = \operatorname{lerp}\!\left(\mathbf{m}_4,\; \mathbf{m}_3,\; \frac{y_s - y_4}{y_3 - y_4}\right)$$
This interpolated vector must be normalized to unit length before its use in the shading formula.
Once mleft and mright are known, they are interpolated to form a normal vector at each x along the
scan-line. This vector, once normalized, is used in the shading calculation to form the color at that
pixel.
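In code, one scan-line of Phong shading might look like the following sketch; the names lerpVec(), normalize(), setPixel(), and shadePoint() are illustrative placeholders, not routines from the book:

for (int x = xleft; x <= xright; x++)  // for each pixel on the scan-line
{
	float k = (x - xleft) / (float)(xright - xleft); // fraction across the scan-line
	Vector3 m = lerpVec(mleft, mright, k);           // interpolate the edge normals
	normalize(m);                // the normal must be unit length before shading
	setPixel(x, y, shadePoint(m)); // apply the full shading model at this pixel
}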
Figure 8.28 shows an object rendered using Gouraud shading and Phong shading. Because the
direction of the normal vector varies smoothly from point to point and more closely approximates
that of an underlying smooth surface, the production of specular highlights is much more faithful
than with Gouraud shading, and more realistic renderings are produced.
Figure 8.28. Comparison of Gouraud and Phong shading (Courtesy of Bishop and Weimar 1986).
The principal drawback of Phong shading is its speed: a great deal more computation is required per
pixel, so that Phong shading can take 6 to 8 times longer than Gouraud shading to perform. A
number of approaches have been taken to speed up the process [bishop86, claussen90].
OpenGL is not set up to do Phong shading, since it applies the shading model once per vertex right
after the modelview transformation, and normal vector information is not passed to the rendering
stage following the perspective transformation and perspective divide. We will see in Section 8.5,
however, that an approximation to Phong shading can be created by mapping a highlight texture
onto an object using the environment mapping technique.
Practice Exercises.
8.3.1. Filling your face. Fill in details of how the polygon fill algorithm operates for the polygon
with vertices (x, y) = (23, 137), (120, 204), (200, 100), (100, 25), for scan lines y = 136, y = 137,
and y = 138. Specifically write the values of xleft and xright in each case.
8.3.2. Clipped convex polygons are still convex. Develop a proof that if a convex polygon is
clipped against the cameras view volume, the clipped polygon is still convex.
8.3.3. Retaining edges with Gouraud Shading. In some cases we may want to show specific
creases and edges in the model. Discuss how this can be controlled by the choice of the vertex
normal vectors. For instance, to retain the edge between faces F and F' in Figure 8.26, what should
the vertex normals be? Other tricks and issues can be found in the references [e.g. Rogers85].
8.3.4. Faster Phong shading with fence shading. To increase the speed of Phong shading Behrens
[behrens94] suggests interpolating normal vectors between vertices to get mL and mR in the usual
way at each scan line, but then computing colors only at these left and right pixels, interpolating
them along a scan line as in Gouraud shading. This so-called fence shading speeds up rendering
dramatically, but does less well in rendering highlights than true Phong shading. Describe general
directions for the vertex normals m1, m2, m3, and m4 in Figure 8.27 such that
a). Fence shading produces the same highlights as Phong shading;
b). Fence shading produces very different highlights than does Phong shading.
8.3.5. The Phong shading algorithm. Make the necessary changes to the tiling code to incorporate
Phong shading. Assume the vertex normal vectors are available for each face. Also discuss how
Phong shading can be approximated by OpenGLs smooth shading algorithm. Hint: increase the
number of faces in the model.
and that it often renders an object that is later obscured by a nearer object (so time spent rendering
the first object is wasted).
Figure 8.29 shows a depth buffer associated with the frame buffer. For every pixel p[i][j] on the
display the depth buffer stores a b-bit quantity d[i][j]. The value of b is usually in the range of 12 to
30 bits.
Recall from Chapter 7 that the point P = (Px, Py, Pz) on a face reaches the screen as

$$(x, y, z) = \left(\frac{P_x}{-P_z},\; \frac{P_y}{-P_z},\; \frac{aP_z + b}{-P_z}\right)$$

The third component is pseudodepth. Constants a and b have been chosen so that the third component equals 0 if P lies in the near plane, and 1 if P lies in the far plane. For highest efficiency we would like to compute it at each pixel incrementally, which implies using linear interpolation as we did for color in Equation 8.15.
Figure 8.30 shows a face being filled along scan-line y. The pseudodepth values at various points are marked. The pseudodepths d1, d2, d3, and d4 at the vertices are known. We want to calculate dleft at scan-line ys as lerp(d1, d4, f) for fraction f = (ys - y1)/(y4 - y1), and similarly dright as lerp(d2, d3, h) for the appropriate h. And we want to find the pseudodepth d at each pixel (x, y) along the scan-line as lerp(dleft, dright, k) for the appropriate k. (What are the values of h and k?) The question is whether this calculation produces the true pseudodepth of the corresponding point on the 3D face.
for (int x = xleft, c = colorleft, depth = dleft; x <= xright; x++, c += colorinc, depth += dinc)
	if(depth < d[x][y]) // is this point closer than the stored one?
	{
		<.. put c into the pixel at (x, y) ..>
		d[x][y] = depth; // update the closest depth
	}
}
Figure 8.31. Doing depth computations incrementally.
Depth compression at greater distances.
Recall from Example 7.4.4 that the pseudodepth of a point does not vary linearly with actual depth
from the eye, but instead approaches an asymptote. This means that small changes in true depth
map into extremely small changes in pseudodepth when the depth is large. Since only a limited
number of bits are used to represent pseudodepth, two nearby values can easily map into the same
value, which can lead to errors in the comparison d < d[x][y]. Using a larger number of bits to
represent pseudodepth helps, but this requires more memory. It helps a little to place the near plane
as far away from the eye as possible.
OpenGL supports a depth buffer, and uses the algorithm described above to do hidden surface
removal. You must instruct OpenGL to create a depth buffer when it initializes the display mode:
glutInitDisplayMode(GLUT_DEPTH | GLUT_RGB);
and enable depth testing with
glEnable(GL_DEPTH_TEST);
Then each time a new picture is to be created the depth buffer must be initialized using:
glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT); // clear screen
Practice Exercises.
8.4.1.The increments. Fill in details of how dleft, dright, and d are found from the pseudodepth values
known at the polygon vertices.
8.4.2. Coding depth values. Suppose b bits are allocated for each element in the depth buffer.
These b bits must record values of pseudodepth between 0 and 1. A value between 0 and 1 can be
expressed in binary in the form .d1d2d3...db where each di is 0 or 1. For instance, a pseudodepth of 0.75
would be coded as .1100000000 Is this a good use of the b bits? Discuss alternatives.
8.4.3. Reducing the Size of the Depth Buffer. If there is not enough memory to implement a full
depth buffer, one can generate the picture in pieces. A depth buffer is established for only a fraction
of the scan lines, and the algorithm is repeated for each fraction. For instance, in a 512-by-512
display, one can allocate memory for a depth buffer of only 64 scan lines and do the algorithm eight
times. Each time the entire face list is scanned, depths are computed for faces covering the scan
lines involved, and comparisons are made with the reigning depths so far. Having to scan the face
list eight times, of course, makes the algorithm operate more slowly. Suppose that a scene involves
F faces, and each face covers on the average L scanlines. Estimate how much more time it takes to
use the depth buffer method when memory is allocated for only nRows/N scanlines.
8.4.4. A single scanline depth buffer. The fragmentation of the frame buffer of the previous
exercise can be taken to the extreme where the depth buffer records depths for only one scan line. It
appears to require more computation, as each face is brought in fresh to the process many times,
once for each scan line. Discuss how the algorithm is modified for this case, and estimate how
much longer it takes to perform than when a full-screen depth buffer is used.
floor is tiled with decorative tiles. The picture on the wall contains an image pasted inside the
frame.
a). box b). beer can c). wood table screen shots
Figure 8.32. Examples of texture mapped onto surfaces.
The basic technique begins with some texture function in texture space such as that shown in
Figure 8.33a. Texture space is traditionally marked off by parameters named s and t. The texture
is a function texture(s, t) which produces a color or intensity value for each value of s and t
between 0 and 1.
Figure 8.33. Examples of textures. a). image texture, b). procedural texture.
There are numerous sources of textures. The most common are bitmaps and computed functions.
Bitmap textures.
Textures are often formed from bitmap representations of images (such as a digitized photo, clip
art, or an image computed previously in some program). Such a texture consists of an array, say
txtr[c][r], of color values (often called texels). If the array has C columns and R rows, the
indices c and r vary from 0 to C-1 and R-1, respectively. In the simplest case the function
texture(s, t) provides samples into this array as in
Color3 texture(float s, float t)
{
return txtr[(int)(s * C)][(int)(t * R)];
}
where Color3 holds an RGB triple. For example, if R = 400 and C = 600, then texture(0.261,
0.783) evaluates to txtr[156][313]. Note that a variation of s from 0 to 1 encompasses 600
pixels, whereas the same variation in t encompasses 400 pixels. To avoid distortion during
rendering this texture must be mapped onto a rectangle with aspect ratio 6/4.
Procedural textures.
Alternatively we can define a texture by a mathematical function or procedure. For instance, the
sphere shape that appears in Figure 8.33b could be generated by the function
float fakeSphere(float s, float t)
{
	float r = sqrt((s - 0.5)*(s - 0.5) + (t - 0.5)*(t - 0.5)); // distance from center
	if(r < 0.3) return 1 - r/0.3; // sphere intensity
	else return 0.2; // dark background
}
that varies from 1 (white) at the center to 0 (black) at the edges of the apparent sphere. Another
example that mimics a checkerboard is examined in the exercises. Anything that can be computed
can provide a texture: smooth blends and swirls of color, the Mandelbrot set, wireframe drawings
of solids, etc.
Chapter 8
page 25
We see later that the value texture(s, t) can be used in a variety of ways: it can be used as the
color of the face itself as if the face is glowing; it can be used as a reflection coefficient to
modulate the amount of light reflected from the face; it can be used to alter the normal vector
to the surface to give it a bumpy appearance.
Practice Exercise 8.5.1. The classic checkerboard texture. Figure 8.34 shows a checkerboard
consisting of 4 by 5 squares with brightness levels that alternate between 0 (for black) and 1 (for
white).
a). Write the function float texture(float s, float t) for this texture. (See also
Exercise 2.3.1.)
b). Write texture() for the case where there are M rows and N columns in the checkerboard.
c). Repeat part b for the case where the checkerboard is rotated 40° relative to the s and t axes.
glBegin(GL_QUADS); // the face, with a texture coordinate attached to each vertex
glTexCoord2f(0.0, 0.0); glVertex3f(1.0, 2.5, 1.5);
glTexCoord2f(0.0, 0.6); glVertex3f(1.0, 3.7, 1.5);
glTexCoord2f(0.8, 0.6); glVertex3f(2.0, 3.7, 1.5);
glTexCoord2f(0.8, 0.0); glVertex3f(2.0, 2.5, 1.5);
glEnd();
Attaching a Pi to each Vi is equivalent to prescribing a polygon P in texture space that has the
same number of vertices as F. Usually P has the same shape as F as well: then the portion of the
texture that lies inside P is pasted without distortion onto the whole of F. When P and F have
the same shape the mapping is clearly affine: it is a scaling, possibly accompanied by a rotation
and a translation.
Figure 8.37 shows the very common case where the four corners of the texture square are
associated with the four corners of a rectangle. (The texture coordinates (s, t) associated with
each corner are noted on the 3D face.) In this example the texture is a 640 by 480 pixel bitmap,
and it is pasted onto a rectangle with aspect ratio 640/480, so it appears without distortion. (Note
that the texture coordinates s and t still vary from 0 to 1.) Figure 8.38 shows the use of texture
coordinates that tile the texture, making it repeat. To do this some texture coordinates that lie
outside of the interval [0,1] are used. When the renderer encounters a value of s and t outside of
the unit square such as s = 2.67 it ignores the integral part and uses only the fractional part 0.67.
Thus the point on a face that requires (s, t) = (2.6, 3.77) is textured with texture(0.6, 0.77). By
default OpenGL tiles texture this way. It may be set to clamp texture values instead, if desired;
see the exercises.
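The tiling versus clamping behavior is selected with glTexParameteri(); for example:

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT); // tile in s (the default)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT); // tile in t
// use GL_CLAMP instead of GL_REPEAT to clamp texture coordinates to [0, 1]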
1. The mesh object consists of a small number of flat faces, and a different texture is to be applied to each. Here each face has only a single normal vector but its own list of texture coordinates. So the data associated with each face would be:
2. The mesh represents a smooth underlying object, and a single texture is to be wrapped around it (or a portion of it). Here each vertex has associated with it a specific normal vector and a particular texture coordinate pair. A single index into the vertex/normals/texture lists is used for each vertex. The data associated with each face would then be:
The exercises take a further look at the required data structures for these types of meshes.
to Le. The key question is this: if we move in equal steps across Ls on the screen, how should we step across texels along Lt in texture space?
The projected point a is found from ã by perspective division: a = (a1/a4, a2/a4, a3/a4). Since M maps A = (A1, A2, A3) to a we know ã = M(A, 1)^T, where (A, 1)^T is the column vector with components A1, A2, A3, and 1. Similarly, b̃ = M(B, 1)^T. (Check each of these relations carefully.) Now using lerp() notation to keep things succinct, we have defined R(g) = lerp(A, B, g), which maps to

M(lerp(A, B, g), 1)^T = lerp(ã, b̃, g) = (lerp(a1, b1, g), lerp(a2, b2, g), lerp(a3, b3, g), lerp(a4, b4, g)).

(Check these, too.) This is the homogeneous coordinate version r̃(f) of the point r(f). We recover the actual components of r(f) by perspective division. For simplicity write just the first component r1(f), which is:
$$r_1(f) = \frac{\operatorname{lerp}(a_1, b_1, g)}{\operatorname{lerp}(a_4, b_4, g)} \qquad (8.16)$$
But since by definition r(f) = lerp(a, b, f) we have another expression for the first component r1(f):
$$r_1(f) = \operatorname{lerp}\!\left(\frac{a_1}{a_4},\; \frac{b_1}{b_4},\; f\right) \qquad (8.17)$$
Expressions (what are they?) for r2(f) and r3(f) follow similarly. Equate these two versions of r1(f)
and do a little algebra to obtain the desired relationship between f and g:
$$g = \frac{f}{\operatorname{lerp}\!\left(\dfrac{b_4}{a_4},\, 1,\, f\right)} \qquad (8.18)$$
Therefore the point R(g) maps to r(f), but g and f aren't the same fraction. g matches f at f = 0 and at f = 1, but its growth with f is tempered by a denominator that depends on the ratio b4/a4. If a4 equals b4 then g is identical to f (check this). Figure 8.45 shows how g varies with f for different values of b4/a4.
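For instance, if B lies twice as deep as A (b4/a4 = 2), Equation 8.18 gives

$$g = \frac{f}{\operatorname{lerp}(2, 1, f)} = \frac{f}{2 - f}$$

so the screen midpoint f = 0.5 corresponds to g = 1/3: it lies only a third of the way from A to B, nearer the closer endpoint.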
Figure 8.45. g versus f for various values of b4/a4.
Substituting Equation 8.18 for g in R1 = lerp(A1, B1, g) and simplifying yields

$$R_1 = \frac{\operatorname{lerp}\!\left(\dfrac{A_1}{a_4},\; \dfrac{B_1}{b_4},\; f\right)}{\operatorname{lerp}\!\left(\dfrac{1}{a_4},\; \dfrac{1}{b_4},\; f\right)} \qquad (8.19)$$
with similar expressions resulting for the components R2 and R3 (which have the same denominator
as R1). This is a key result. It tells which 3D point (R1, R2, R3) corresponds (in eye coordinates) to a
given point that lies (fraction f of the way) between two given points a and b in screen coordinates.
So any quantity (such as texture) that is attached to vertices of the 3D face and varies linearly
between them will behave the same way.
The two cases of interest for the transformation with matrix M are:
The transformation is affine;
The transformation is the perspective transformation.
a). When the transformation is affine, a4 and b4 are both 1 (why?), so the formulas above simplify immediately. The fractions f and g become identical, and R1 above becomes lerp(A1, B1, f). We can summarize this as:
Fact: If M is affine, equal steps along the line ab do correspond to equal steps along the line AB.
b). When M represents the perspective transformation from eye coordinates to clip coordinates,
the fourth components a4 and b4 are no longer 1. We developed the matrix M in Chapter 7. Its basic
form, given in Equation 7.10, is:

        | N  0   0  0 |
    M = | 0  N   0  0 |
        | 0  0   c  d |
        | 0  0  -1  0 |
where c and d are constants that make pseudodepth work properly. What is M(A, 1)^T for this matrix?
It's ã = (NA1, NA2, cA3 + d, -A3), the crucial part being that a4 = -A3. This is the position of the
point along the z-axis in camera coordinates, that is, the depth of the point in front of the eye.
So the relative sizes of a4 and b4 lie at the heart of perspective foreshortening of a line segment:
they report the depths of A and B, respectively, along the camera's viewplane normal. If A and B have
the same depth (i.e. they lie in a plane parallel to the camera's viewplane), there is no perspective
distortion along the segment, and g and f are indeed the same. Figure 8.46 shows in cross section how
rays from the eye through evenly spaced spots (those with equal increments in f) on the viewplane
correspond to unevenly spaced spots on the original face in 3D. For the case shown A is closer than
B, making a4 < b4, so the g-increments grow in size moving across the face from A to B.
Figure 8.46. The values of a4 and b4 are related to the depths of points.
Rendering incrementally.
We now put these ingredients together and find the proper texture coordinates (s, t) at each point on
the face being rendered. Figure 8.47 shows a face of the barn being rendered. The left edge of the
face has endpoints a and b. The face extends from xleft to xright across scan-line y. We need to find
appropriate texture coordinates (sleft, tleft) and (sright, tright) to attach to xleft and xright, respectively, which
we can then interpolate across the scan-line. Consider finding sleft(y), the value of sleft at scan-line y.
We know that texture coordinate sA is attached to point a, and sB is attached to point b, since these
values have been passed down the pipeline along with the vertices A and B. If the scan-line at y is
fraction f of the way between ybott and ytop (so that f = (y - ybott)/(ytop - ybott)), then we know from
Equation 8.19 that the proper texture coordinate to use is:
sleft(y) = lerp(sA/a4, sB/b4, f) / lerp(1/a4, 1/b4, f)        (8.20)
and similarly for tleft. Notice that sleft and tleft have the same denominator: a linear interpolation
between the values 1/a4 and 1/b4. The numerator terms are linear interpolations of texture coordinates
that have been divided by a4 and b4. This is sometimes called rational linear rendering
[heckbert91] or hyperbolic interpolation [blinn92]. To calculate (s, t) efficiently as f advances we
need to store the values sA/a4, sB/b4, tA/a4, tB/b4, 1/a4, and 1/b4, as these don't change from pixel to
pixel. Both the numerator and denominator terms can be found incrementally for each y, just as we
did for Gouraud shading (see Equation 8.15). But to find sleft and tleft we must still perform an explicit
division at each value of y.
The pair (sright, tright) is calculated in a similar fashion, using denominators built from the fourth
homogeneous components of the projected endpoints of the right edge.
Once (sleft, tleft) and (sright, tright) have been found, the scan-line can be filled: for each x from xleft to xright
the values s and t are found, again by hyperbolic interpolation. (What is the expression for s at x?)
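As a concrete illustration, here is a minimal C++ sketch of such a scan-line filler. The names
ScanEndpoint, fillScanLine, and setPixelFromTexture are our own inventions; the pre-divided
quantities are assumed to have been computed at the edge endpoints as described above:

void setPixelFromTexture(int x, int y, double s, double t); // supplied elsewhere (hypothetical)

struct ScanEndpoint {       // values attached to one end of a scan-line
    double sOverW, tOverW;  // s/a4 and t/a4: texture coords pre-divided by a4
    double oneOverW;        // 1/a4
};

// Fill one scan-line from xLeft to xRight using hyperbolic interpolation.
// The numerators and the denominator are each interpolated linearly in x;
// one division per pixel recovers the true (s, t).
void fillScanLine(int y, int xLeft, int xRight,
                  const ScanEndpoint& L, const ScanEndpoint& R)
{
    for (int x = xLeft; x <= xRight; ++x) {
        double f = (xRight == xLeft) ? 0.0
                 : double(x - xLeft) / double(xRight - xLeft);
        double sNum = L.sOverW   + f * (R.sOverW   - L.sOverW);   // lerp of s/a4
        double tNum = L.tOverW   + f * (R.tOverW   - L.tOverW);   // lerp of t/a4
        double den  = L.oneOverW + f * (R.oneOverW - L.oneOverW); // lerp of 1/a4
        double s = sNum / den;   // the explicit divisions needed at each pixel
        double t = tNum / den;
        setPixelFromTexture(x, y, s, t);
    }
}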
Implications for the graphics pipeline.
What are the implications of having to use hyperbolic interpolation to render texture properly? And
does the clipping step need any refinement? As we shall see, we must send certain additional
information down the pipeline, and calculate slightly different quantities than supposed so far.
Figure 8.48 shows a refinement of the pipeline. Various points are labeled with the information that
is available at that point. Each vertex V is associated with a texture pair (s, t) as well as a vertex
normal. The vertex is transformed by the modelview matrix (and the normal is multiplied by the
inverse transpose of this matrix), producing vertex A = (A1, A2, A3) and a normal n in eye
coordinates. Shading calculations are done using this normal, producing the color c = (cr, cg, cb). The
texture coordinates (sA, tA) (which are the same as (s, t)) are still attached to A. Vertex A then
undergoes the perspective transformation, producing ã = (a1, a2, a3, a4). The texture coordinates
and color c are not altered.
After perspective division, the first three components (x, y, z) = (a1/a4, a2/a4, a3/a4) report the
position of the point in normalized device coordinates. The third component is pseudodepth. The
first two components are scaled and shifted by the viewport transformation. To simplify notation
we shall continue to call the screen coordinate point (x, y, z).
So finally the renderer receives the array (x, y, z, 1, sA/a4, tA/a4, c, 1/a4) for each vertex of the face to
be rendered. Now it is simple to render texture using hyperbolic interpolation as in Equation 8.20:
the required values sA/a4 and 1/a4 are available for each vertex.
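As a sketch, the record delivered to the renderer for each vertex might be declared as follows
(the struct and field names are illustrative only; the essential point is that s/a4, t/a4, and 1/a4
travel with the screen position and color):

struct RasterVertex {
    double x, y;       // screen coordinates (after the viewport transformation)
    double z;          // pseudodepth, for the depth buffer
    double sOverW;     // sA / a4: texture coordinate pre-divided by a4
    double tOverW;     // tA / a4
    double oneOverW;   // 1 / a4, interpolated to form the common denominator
    float  r, g, b;    // color c from the shading calculation
};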
Practice exercises.
8.5.1. Data structures for mesh models with textures. Discuss the specific data types needed to
represent mesh objects in the two cases:
a). a different texture is to be applied to each face;
b). a single texture is to be wrapped around the entire mesh.
Draw templates for the two data types required, and for each show example data in the various
arrays when the mesh holds a cube.
8.5.2. Pseudodepth calculations are correct. Show that it is correct, as claimed in Section 8.4, to
use linear (rather than hyperbolic) interpolation when finding pseudodepth. Assume point A projects
to a, and B projects to b. With linear interpolation we compute pseudodepth at the projected point
lerp(a, b, f) as the third component of this point. This is the correct thing to do only if the resulting
value equals the true pseudodepth of the point that lerp(A, B, g) (for the appropriate g) projects to.
Show that it is in fact correct. Hint: Apply Equations 8.16 and 8.17 to the third component of the
point being projected.
8.5.3. Wrapping and clamping textures in OpenGL. To make the pattern wrap or tile in
the s direction use glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,
GL_REPEAT); similarly use GL_TEXTURE_WRAP_T for wrapping in the t-direction. This is
actually the default, so you needn't do it explicitly. To turn off tiling replace GL_REPEAT with
GL_CLAMP. Refer to the OpenGL documentation for more details, and experiment with different
settings to see their effect.
8.5.4. Rationale for linear interpolation of texture during clipping. New vertices are often created
when a face is clipped against the view volume. We must assign texture coordinates to each vertex.
Suppose a new vertex V is formed that is fraction f of the way from vertex A to vertex B on a face.
Further suppose that A is assigned texture coordinates (sA, tA), and similarly for B. Argue why, if a
texture is considered as pasted onto a flat face, it makes sense to assign texture coordinates
(lerp(sA, sB, f), lerp(tA, tB, f)) to V.
8.5.5. Computational burden of hyperbolic interpolation. Compare the amount of computation
required to perform hyperbolic interpolation versus linear interpolation of texture coordinates.
Assume multiplication and division each require 10 times as much time as addition and subtraction.
1). Create a glowing object.
In the simplest case the intensity I at each point on the face is set equal to the texture value there:

I = texture(s, t)

(or to some constant multiple of it). So the object appears to emit light or glow: lower texture values
emit less light and higher texture values emit more light. No additional lighting calculations need be
done. (For colored light the red, green, and blue components are set separately; for instance, the red
component is Ir = texturer(s, t).)
To cause OpenGL to do this type of texturing, specify:
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
2). Paint the texture by modulating the reflection coefficient.
We noted earlier that the color of an object is the color of its diffuse light component (when bathed
in white light). Therefore we can make the texture appear to be painted onto the surface by varying
the diffuse reflection coefficient, and perhaps the ambient reflection coefficient as well. We say that
the texture function modulates the value of the reflection coefficient from point to point. Thus we
replace Equation 8.5 with:
I = texture(s, t) [ Ia ρa + Id ρd × lambert ] + Isp ρs × phong^f        (8.21)
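In OpenGL's fixed pipeline this painting effect corresponds to the GL_MODULATE texture
environment mode, which multiplies the shaded fragment color by the texture color; a brief sketch
(contrast it with GL_REPLACE above):

// modulate: the texture scales the lit color, so shading still varies
// across the surface while the texture appears painted onto it
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);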
Bump mapping.
Texture can also be used to make a surface appear bumpy without altering its geometry, a
technique due to Blinn. Think of the surface P(u, v) as displaced slightly along its normal
direction by an amount proportional to the texture value, as shown in Figure 8.50a, which adds
undulations and wrinkles to the surface. This perturbed surface has a new normal vector
m'(u*, v*) at each point. The idea is to use this perturbed normal as if it were attached to the
original unperturbed surface at each point, as shown in Figure 8.50b. Blinn shows that a good
approximation to m'(u*, v*) (before normalization) is given by:

m'(u*, v*) = m(u*, v*) + d(u*, v*),  where  d(u*, v*) = textureu (m × Pv) - texturev (m × Pu)        (8.22)
where textureu and texturev are the partial derivatives of the texture function with respect to u
and v, respectively, and Pu and Pv are the partial derivatives of P(u, v) with respect to u and v.
All functions are evaluated at (u*, v*). Derivations of this result may also be found in
[watt2, miller98]. Note that the perturbation function depends only on the partial derivatives of
texture(), not on texture() itself.
If a mathematical expression is available for texture() you can form its partial derivatives
analytically. For example, texture() might undulate in two directions by combining sinewaves,
as in: texture(u, v) = sin(au)sin(bv) for some constants a and b. If the texture comes instead
from an image array, linear interpolation can be used to evaluate it at (u*, v*), and finite
differences can be used to approximate the partial derivatives.
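For instance, a minimal sketch of the finite-difference approach, assuming a helper
textureLookup() that bilinearly interpolates the image array (both names are our own):

double textureLookup(double u, double v); // bilinear interpolation of the image (assumed)

// Estimate the partial derivatives texture_u and texture_v at (u, v)
// by central differences with a small step h.
void textureDerivatives(double u, double v, double h,
                        double& du, double& dv)
{
    du = (textureLookup(u + h, v) - textureLookup(u - h, v)) / (2.0 * h);
    dv = (textureLookup(u, v + h) - textureLookup(u, v - h)) / (2.0 * h);
}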
We show it here as having only the three methods that we need for mapping textures. Other methods and
details are discussed in Chapter 10. The method readBMPFile() reads a BMP file (a standard
device-independent image file format from Microsoft; many images are available on the internet in
BMP format, and tools to convert other image formats to BMP are readily available) and stores the
pixel values in its pixmap object; it is detailed in Appendix 3. The other two methods are discussed next.
Our example OpenGL application will use six textures. To create them we first make an RGBpixmap object
for each:
RGBpixmap pix[6];
and then load the desired texture image into each one. Finally each one is passed to OpenGL to define a
texture.
1). Making a procedural texture.
We first create a checkerboard texture using the method makeCheckerboard(). The checkerboard
pattern is familiar and easy to create, and its geometric regularity makes it a good texture for testing
correctness. The application generates a checkerboard pixmap in pix[0] using:
pix[0].makeCheckerboard().
The method itself follows:
void RGBpixmap::makeCheckerboard()
{ // make a checkerboard pattern
	nRows = nCols = 64;
	pixel = new RGB[nRows * nCols];
	if (!pixel) { cout << "out of memory!"; return; }
	long count = 0;
	for (int i = 0; i < nRows; i++)
		for (int j = 0; j < nCols; j++)
		{
			int c = (((i/8) + (j/8)) % 2) * 255;
			pixel[count].r = c;     // red
			pixel[count].g = c;     // green
			pixel[count++].b = 0;   // blue
		}
}
It creates a 64 by 64 pixel array, where each pixel is an RGB triple. OpenGL requires that texture pixel maps
have a width and height that are both some power of two. The pixel map is laid out in memory as one long
array of bytes: row by row from bottom to top, left to right across a row. Here each pixel is loaded with the
value (c, c, 0), where c jumps back and forth between 0 and 255 every 8 pixels. (We used a similar jumping
method in Exercise 2.3.1. A faster way that uses C++'s bit manipulation operators is
c = ((i ^ j) & 8) ? 255 : 0;.) The two colors of the checkerboard are black, (0, 0, 0), and yellow,
(255, 255, 0). The address of the first pixel of the pixmap is later passed to glTexImage2D() to
create the actual texture for OpenGL.
Once the pixel map has been formed, we must bind it to a unique integer name so that it can be referred to in
OpenGL without ambiguity. We arbitrarily assign the names 2001, 2002, ..., 2006 to our six textures in this
example. (To avoid overlap in integer names in an application that uses many textures, it is better to let
OpenGL supply unique names using glGenTextures(): if we need six unique names we can build an
array to hold them, GLuint name[6];, and call glGenTextures(6, name). OpenGL places six
heretofore unused integers in name[0], ..., name[5], and we subsequently refer to the i-th texture using
name[i].) The texture is created by making certain calls to OpenGL, which we encapsulate in the method:
void RGBpixmap::setTexture(GLuint textureName)
{
	glBindTexture(GL_TEXTURE_2D, textureName);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
	// pass the pixmap to OpenGL to define the texture itself
	glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, nCols, nRows, 0, GL_RGB, GL_UNSIGNED_BYTE, pixel);
}
Suppose that we want to wrap a label around a circular cylinder, as suggested in Figure 8.53a. It's
natural to think in terms of cylindrical coordinates. The label is to extend from θa to θb in
azimuth and from za to zb along the z-axis. The cylinder is modeled as a polygonal mesh, so its
walls are rectangular strips as shown in part b. For vertex Vi of each face we must find suitable
texture coordinates (si, ti), so that the correct slice of the texture is mapped onto the face.
Figure 8.53. Wrapping a label around a cylinder.
The geometry is simple enough here that a solution is straightforward. There is a direct linear
relationship between (s, t) and the azimuth and height (θ, z) of a point on the cylinder's surface:

s = (θ - θa)/(θb - θa),   t = (z - za)/(zb - za)        (8.23)

So if there are N faces around the cylinder, the i-th face has its left edge at azimuth θi = 2πi/N, and
its upper left vertex has texture coordinates (si, ti) = ((2πi/N - θa)/(θb - θa), 1). Texture coordinates
for the other three vertices follow in a similar fashion. This association between (s, t) and the
vertices of each face is easily put in a loop in the modeling routine (see the exercises).
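A sketch of such a modeling loop appears below. The names labelCoords and TexPair are our own,
and the loop simply applies Equation 8.23 at the four corners of each wall face:

#include <cmath>
#include <vector>

struct TexPair { double s, t; };

// Assign texture coordinates to the N faces around the cylinder. The label
// spans azimuths [thetaA, thetaB]; face i has edges at azimuths 2*pi*i/N
// and 2*pi*(i+1)/N, and the wall runs the full label height (t from 0 to 1).
std::vector<std::vector<TexPair>> labelCoords(int N, double thetaA, double thetaB)
{
    const double PI = 3.14159265358979;
    std::vector<std::vector<TexPair>> coords(N);
    for (int i = 0; i < N; ++i) {
        double th0 = 2 * PI * i / N, th1 = 2 * PI * (i + 1) / N;
        double s0 = (th0 - thetaA) / (thetaB - thetaA);  // Equation 8.23
        double s1 = (th1 - thetaA) / (thetaB - thetaA);
        // corners in the order lower-left, lower-right, upper-right, upper-left
        coords[i] = { {s0, 0.0}, {s1, 0.0}, {s1, 1.0}, {s0, 1.0} };
    }
    return coords;
}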
Things get more complicated when the object isn't a simple cylinder. We see next how to map
texture onto a more general surface of revolution.
Example 8.5.2. Shrink wrapping a label onto a Surface of Revolution.
Recall from Chapter 6 that a surface of revolution is defined by a profile curve (x(v), z(v)), as
shown in Figure 8.54a. (We revert to calling the parameters u and v in the parametric
representation of the shape, since we are using s and t for the texture coordinates.) The resulting
surface - here a vase - is given parametrically by P(u, v) = (x(v) cos u, x(v) sin u, z(v)). The shape
is modeled as a collection of faces with sides along contours of constant u and v (see Figure
8.54b). So a given face Fi has four vertices P(ui, vi), P(ui+1, vi), P(ui, vi+1), and P(ui+1, vi+1). We need
to find the appropriate (s, t) coordinates for each of these vertices.
Figure 8.54. Wrapping a label around a vase: a). a vase profile; b). a face on the vase, with its four corners.
One natural approach is to proceed as above and to make s and t vary linearly with u and v in the
manner of Equation 8.23. This is equivalent to wrapping the texture about an imaginary rubber
cylinder that encloses the vase (see Figure 8.55a), and then letting the cylinder collapse, so that
each texture point slides radially (and horizontally) until it hits the surface of the vase. This
method is called shrink wrapping by Bier and Sloane [bier86], who discuss several possible
ways to map texture onto different classes of shapes. They view shrink wrapping in terms of the
imaginary cylinder's normal vector (see Figure 8.55b): texture point Pi is associated with the
object point Vi that lies along the normal from Pi.
In part a) a line is drawn from the object's centroid C, through the vertex Vi, to its intersection
with the cylinder at Pi. In part b) the normal vector to the object's surface at Vi is used: Pi is at
the intersection of this normal from Vi with the cylinder. Notice that these three ways to
associate texture points with object points can lead to very different results depending on the
shape of the object (see the exercises). The designer must choose the most suitable method based
on the object's shape and the nature of the texture image being mapped. (What would be
appropriate for a chess pawn?)
Example 8.5.3. Mapping texture onto a sphere.
It was easy to wrap a texture rectangle around a cylinder: topologically a cylinder can be sliced
open and laid flat without distortion. A sphere is a different matter. As all map makers know,
there is no way to show accurate details of the entire globe on a flat piece of paper: if you slice
open a sphere and lay it flat some parts always suffer serious stretching. (Try to imagine a
checkerboard mapped over an entire sphere!)
It's not hard to paste a rectangular texture image onto a portion of a sphere, however. To map
the texture square to the portion lying between azimuths θa and θb and latitudes φa and φb, just
map linearly as in Equation 8.23: if vertex Vi lies at (θi, φi), associate it with texture coordinates
(si, ti) = ((θi - θa)/(θb - θa), (φi - φa)/(φb - φa)). Figure 8.57 shows an image pasted onto a band
around a sphere. Only a small amount of distortion is seen.
Figure 8.57. Mapping texture onto a sphere: a). texture on a portion of a sphere; b). 8 maps onto the 8 octants.
Figure 8.57b shows how one might cover an entire sphere with texture: map eight triangular
texture maps onto the eight octants of the sphere.
Example 8.5.4. Mapping texture to sphere-like objects.
We discussed adding texture to cylinder-like objects above. But some objects are more spherelike than cylinder-like. Figure 8.58a shows the buckyball, whose faces are pentagons and
hexagons. One could devise a number of pentagonal and hexagonal textures and manually paste
one of each face, but for some scenes it may be desirable to wrap the whole buckyball in a single
texture.
Figure 8.58. Sphere-like objects: a). the buckyball; b). three mapping methods.
It is natural to surround a sphere-like object with an imaginary sphere (rather than a cylinder)
that has texture pasted to it, and use one of the association methods discussed above. Figure
8.58b shows the buckyball surrounded by such a sphere in cross section. The three ways of
associating texture points Pi with object vertices Vi are sketched:
object-centroid: Pi is on a line from the centroid C through vertex Vi;
object-normal: Pi is the intersection of a ray from Vi in the direction of the face normal;
sphere-normal: Vi is the intersection of a ray from Pi in the direction of the normal to the sphere
at Pi.
(Question: Are the object-centroid and sphere-normal methods the same if the centroid of the
object coincides with the center of the sphere?) The object centroid method is most likely the
best, and it is easy to implement. As Bier and Sloane argue, the other two methods usually
produce unacceptable final renderings.
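A minimal sketch of the object-centroid association, with all names our own: the direction from
the centroid C through vertex Vi is converted to azimuth and latitude, which then index the
texture linearly over the whole sphere as in Example 8.5.3.

#include <cmath>

struct Vec3 { double x, y, z; };

// Object-centroid association: the texture point for vertex V is where the
// ray from centroid C through V pierces the surrounding sphere. Convert
// that direction to azimuth theta and latitude phi, then map linearly
// to (s, t) over the full sphere.
void centroidTexCoords(const Vec3& C, const Vec3& V, double& s, double& t)
{
    const double PI = 3.14159265358979;
    Vec3 d = { V.x - C.x, V.y - C.y, V.z - C.z };
    double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    double theta = std::atan2(d.y, d.x);     // azimuth in (-pi, pi]
    double phi   = std::asin(d.z / len);     // latitude in [-pi/2, pi/2]
    s = (theta + PI) / (2 * PI);             // 0..1 around the sphere
    t = (phi + PI / 2) / PI;                 // 0..1 from pole to pole
}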
Bier and Sloane also discuss using an imaginary box rather than a sphere to surround the object
in question. Figure 8.59a shows the six faces of a cube spread out over a texture image, and part
b) shows the texture wrapped about the cube, which in turn encloses an object. Vertices on the
object can be associated with texture points in the three ways discussed above; the
object-centroid and cube-normal methods are probably the best choices.
Figure 8.59. Using an enclosing box: a). texture on the six faces of the box; b). the texture wrapped onto the box enclosing an object.
Practice exercises.
8.5.7. How to associate Pi and Vi. The surface of revolution S shown in Figure 8.60 consists of a
sphere resting on a cylinder. The object is surrounded by an imaginary cylinder with a
checkerboard texture pasted on it. Sketch how the texture will look for each of the following
methods of associating texture points to vertices:
a). shrink wrapping;
b). object centroid;
c). object normal.
Figure 8.64 shows the use of a surrounding cube rather than a sphere. Part a) shows the map,
consisting of six images of various views of the interior walls, floor, and ceiling of a room. Part b)
shows a shiny object reflecting different parts of the room. The use of an enclosing cube was
introduced by Greene [greene86], and generally produces less distorted reflections than are seen
with an enclosing sphere. The six maps can be generated by rendering six separate images from the
point of view of the object (with the object itself removed, of course). For each image a synthetic
camera is set up and the appropriate window is set. Alternatively, the textures can be digitized from
photos taken by a real camera that looks in the six principal directions inside an actual room or
scene.
Figure 8.64. Environment mapping based on a surrounding cube: a). the six images that make up the map; b). environment mapping in action (screen shots).
Chrome and environment mapping differ most dramatically from normal texture mapping in an
animation when the shiny object is moving. The reflected image will flow over the moving
object, whereas a normal texture map will be attached to the object and move with it. And if a shiny
sphere rotates about a fixed spot a normal texture map spins with the sphere, but a reflection map
stays fixed.
How is environment mapping done? What you see at point P on the shiny object is what has arrived
at P from the environment in just the right direction to reflect into your eye. To find that direction
trace a ray from the eye to P, and determine the direction of the reflected ray. Trace this ray to find
where it hits the texture (on the enclosing cube or sphere). Figure 8.65 shows a ray emanating from
the eye to point P. If the direction of this ray is u and the unit normal at P is m, we know from
Equation 8.2 that the reflected ray has direction r = u - 2(u · m)m. The reflected ray moves in
direction r until it hits the hypothetical surface with its attached texture. It is easiest
computationally to suppose that the shiny object is centered in, and much smaller than, the
enclosing cube or sphere. Then the reflected ray emanates approximately from the object's center,
and its direction r can be used directly to index into the texture.
(s, t) = ( (rx/p + 1)/2, (ry/p + 1)/2 ),  where  p = sqrt(rx² + ry² + (rz + 1)²)        (8.24)
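The two steps translate directly into code. The sketch below (Vec3 and the function names are
our own) reflects the eye-ray direction with Equation 8.2 and then indexes the sphere map with
Equation 8.24:

#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// r = u - 2(u . m)m : reflect the eye-ray direction u about the unit normal m
Vec3 reflectDir(const Vec3& u, const Vec3& m)
{
    double k = 2.0 * dot(u, m);
    return { u.x - k*m.x, u.y - k*m.y, u.z - k*m.z };
}

// Equation 8.24: index the sphere map from the reflected direction r
void sphereMapCoords(const Vec3& r, double& s, double& t)
{
    double p = std::sqrt(r.x*r.x + r.y*r.y + (r.z + 1)*(r.z + 1));
    s = 0.5 * (r.x / p + 1.0);
    t = 0.5 * (r.y / p + 1.0);
}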
This result is developed in the exercises. We must precompute a texture that shows what you
would see of the environment in a perfectly reflecting sphere, from an eye position far removed
from the sphere [haeberli93]. This maps the part of the environment that lies in the hemisphere
behind the eye into a circle in the middle of the texture, and the part of the environment in the
hemisphere in front of the eye into an annulus around this circle (visualize this). This texture must
be recomputed if the eye changes position. The pictures in Figure 8.63 were made using this method.
Simulating Highlights using Environment mapping.
Reflection mapping can be used in OpenGL to produce specular highlights on a surface. A texture
map is created that has an intense concentrated bright spot. Reflection mapping paints this
highlight onto the surface, making it appear to be an actual light source situated in the environment.
The highlight created can be more concentrated and detailed than those created using the Phong
specular term with Gouraud shading. Recall that the Phong term is computed only at the vertices of
a face, and it is easy to miss a specular highlight that falls between two vertices. With reflection
mapping the coordinates (s, t) into the texture are formed at each vertex, and then interpolated in
between. So if the coordinates indexed by the vertices happen to surround the bright spot, the spot
will be properly rendered inside the face.
Practice Exercise 8.5.9. OpenGL's computation of texture coordinates for environment
mapping. Derive the result in Equation 8.24. Figure 8.66b shows in cross-sectional view the vectors
involved (in eye coordinates). The eye is looking from a remote location in the direction (0, 0, 1). A
sphere of radius 1 is positioned on the negative z-axis. Suppose light comes in from direction r,
hitting the sphere at the point (x, y, z). The normal to the sphere at this point is also (x, y, z), and it
must be just right so that light coming along r is reflected into the direction (0, 0, 1). This means
the normal must be half-way between r and (0, 0, 1), or proportional to their sum, so
(x, y, z) = K(rx, ry, rz + 1) for some K.
a). Show that the normal vector has unit length if K is 1/p, where p is given as in Equation 8.24.
b). Show that therefore (x, y) = (rx/p, ry/p).
c). Suppose for the moment that the texture image extends from -1 to 1 in x and from -1 to 1 in y.
Argue why what we want to see reflected at the point (x, y, z) is the value of the texture image at (x, y).
d). Show that if instead the texture uses coordinates from 0 to 1 (as is true with OpenGL) we
want to see at (x, y) the value of the texture image at (s, t) given by Equation 8.24.
projections of two of the faces: the top face projects to top and the front face to front. Sketch the
projections of the other four faces, and see that their union is the required shadow. (The union is
set-theoretic: a point is in the shadow if it lies in one or more of the projections. In fact you need
only form the union of the projections of the three faces facing toward the light source. Why?)
Figure 8.68. Computing the shape of a shadow.
This is the key to drawing the shadow. After drawing the plane using ambient, diffuse, and specular
light contributions, draw the six projections of the box's faces on the plane using only ambient light.
This will draw the shadow in the right shape and color. Finally draw the box. (If the box is near the
plane parts of it might obscure portions of the shadow.)
Building the projected face:
To make the new face F' produced by F, project each of its vertices onto the plane in question. We
need a way to calculate these vertex positions on the plane. Suppose, as in Figure 8.68a, that the
plane passes through point A and has normal vector n. Consider projecting vertex V, producing
point V'. The mathematics here are familiar: point V' is the point where the ray from the source at S
through V hits the plane. As developed in the exercises, this point is:

V' = S + (V - S) (n · (A - S)) / (n · (V - S))        (8.25)
The exercises show how this can be written in homogeneous coordinates as a matrix times V, which
is handy for rendering engines, like OpenGL, that support convenient matrix multiplication.
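Equation 8.25 itself translates into a few lines of code; projectToPlane below is a minimal
sketch using our own names:

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Equation 8.25: project vertex V onto the plane through A with normal n,
// along the ray from the light source S through V.
Vec3 projectToPlane(const Vec3& S, const Vec3& V, const Vec3& A, const Vec3& n)
{
    Vec3 SV = { V.x - S.x, V.y - S.y, V.z - S.z };
    double tStar = dot(n, { A.x - S.x, A.y - S.y, A.z - S.z }) / dot(n, SV);
    return { S.x + SV.x * tStar, S.y + SV.y * tStar, S.z + SV.z * tStar };
}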
Practice Exercises.
8.6.1. Shadow shapes. Suppose a cube is floating above a plane. What is the shape of the cube's
shadow if the point source lies a). directly above the top face? b). along a main diagonal of the cube
(as in an isometric view)? Sketch shadows for a sphere and for a cylinder floating above a plane for
various source positions.
8.6.2. Making the shadow face. a). Show that the ray from the source point S through vertex V
hits the plane n · (P - A) = 0 at t* = (n · (A - S)) / (n · (V - S)). b). Show that this defines the hit
point V' as given in Equation 8.25.
8.6.3. It's equivalent to a matrix multiplication. a). Show that the expression for V' in Equation
8.25 can be written as a matrix multiplication: V' = M(Vx, Vy, Vz, 1)^T, where M is a 4 by 4 matrix.
b). Express the terms of M in terms of A, S, and n.
shadow buffer pixel d[i][j], and that point B on the pyramid is also on this ray. If the pyramid is
present d[i][j] contains the pseudodepth to B; if it happens to be absent d[i][j] contains the
pseudodepth to P.
If d[i][j] is less than D, the point P is in shadow and p[c][r] is set using only ambient light;
otherwise P is not in shadow and p[c][r] is set using ambient, diffuse, and specular light. (Of
course, this test is made only if P is closer to the eye than the value stored in the eye camera's
normal depth buffer.)
How are these steps done? As described in the exercises, to each point on the eye camera's
viewplane there corresponds a point on the source camera's viewplane. (Keep in mind that these
are 3D points: two position coordinates on the viewplane, plus pseudodepth.) For each screen
pixel this correspondence is invoked to find the pseudodepth from the source to P as well as the
index [i][j] that yields the minimum pseudodepth stored in the shadow buffer.
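A sketch of the per-pixel test follows. All names (SourceCoords, sourceCoords, inShadow, the
buffer d) are our own, and the small epsilon is our own refinement, added to guard against
self-shadowing caused by roundoff in the two pseudodepth computations:

// d[i][j] is the shadow buffer filled from the source camera; sourceCoords
// maps a scene point, via the source camera's matrix Ms, to its index (i, j)
// and pseudodepth on the source camera's viewplane.
struct SourceCoords { int i, j; double depth; };
SourceCoords sourceCoords(double px, double py, double pz); // via Ms (assumed)

// Decide whether the visible point P = (px, py, pz) is in shadow.
bool inShadow(double px, double py, double pz,
              double** d /* shadow buffer */, double epsilon)
{
    SourceCoords sc = sourceCoords(px, py, pz);
    double D = sc.depth;     // pseudodepth of P as seen from the source
    // P is in shadow if something lies closer to the source along this ray
    return d[sc.i][sc.j] < D - epsilon;
}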
Practice Exercises.
8.6.4. Finding pseudodepth from the source. Suppose the matrices Mc and Ms map the point P in
the scene to the appropriate (3D) spots on the eye camera's viewplane and the source camera's
viewplane, respectively. a). Describe how to establish a source camera and how to find the
resulting matrix Ms. b). Find the transformation that, given position (x, y) on the eye camera's
viewplane, produces the position (i, j) and pseudodepth on the source camera's viewplane.
c). Once (i, j) are known, how are the index [i][j] and the pseudodepth of P on the source camera
determined?
8.6.5. Extended Light sources. We have considered only point light sources in this chapter.
Greater realism is provided by modeling extended light sources. As suggested in Figure 8.70a such
sources cast more complicated shadows, having an umbra within which no light from the source is
seen, and a lighter penumbra within which a part of the source is visible. In part b) a glowing
sphere of radius 2 shines light on a unit cube, thereby casting a shadow on the wall W. Make an
accurate sketch of the umbra and penumbra that is observed on the wall. As you might expect,
algorithms for rendering shadows due to extended light sources are complex. See [watt92] for a
thorough treatment.
Figure 8.70. Umbra and penumbra for extended light sources: a). umbra and penumbra; b). an example to sketch.
8.7. Summary
Since the beginning of computer graphics there has been a relentless quest for greater realism
when rendering 3D scenes. Wireframe views of objects can be drawn very rapidly but are
difficult to interpret, particularly if several objects in a scene overlap. Realism is greatly
enhanced when the faces are filled with some color and surfaces that should be hidden are
removed, but pictures rendered this way still do not give the impression of objects residing in a
scene, illuminated by light sources.
What is needed is a shading model that describes how light reflects off a surface depending on
the nature of the surface and its orientation to both light sources and the camera's eye. The
physics of light reflection is very complex, so programmers have developed a number of
approximations and tricks that do an acceptable job most of the time, and are reasonably
efficient computationally. The model for the diffuse component is the one most closely based
on reality, and becomes extremely complex as more and more ingredients are considered.
Specular reflections are not modeled on physical principles at all, but can do an adequate job of
recreating highlights on shiny objects. And ambient light is purely an abstraction, a shortcut
that avoids dealing with multiple reflections from object to object, and prevents shadows from
being too deep.
Even simple shading models involve several parameters such as reflection coefficients,
descriptions of a surface's roughness, and the color of light sources. OpenGL provides ways to
set many of these parameters. There is little guidance for the designer in choosing their values;
they are often determined by trial and error until the final rendered picture looks right.
In this chapter we focused on rendering of polygonal mesh models, so the basic task was to
render a polygon. Polygonal faces are particularly simple and are described by a modest
amount of data, such as vertex positions, vertex normals, surface colors and material. In
addition there are highly efficient algorithms for filling a polygonal face with calculated colors,
especially if it is known to be convex. And algorithms can capitalize on the flatness of a
polygon to interpolate depth in an incremental fashion, making the depth buffer hidden surface
removal algorithm simple and efficient.
When a mesh model is supposed to approximate an underlying smooth surface, the appearance
of a face's edges can be objectionable. Gouraud and Phong shading provide ways to draw a
smoothed version of the surface (except along silhouettes). Gouraud shading is very fast but
does not reproduce highlights very faithfully; Phong shading produces more realistic renderings
but is computationally quite expensive.
The realism of a rendered scene is greatly enhanced by the appearance of texturing on object
surfaces. Texturing can make an object appear to be made of some material such as brick or
wood, and labels or other figures can be pasted onto surfaces. Texture maps can be used to
modulate the amount of light that reflects from an object, or as bump maps that give a
surface a bumpy appearance. Environment mapping shows the viewer an impression of the
environment that surrounds a shiny object, and this can make scenes more realistic, particularly
in animations. Texture mapping must be done with care, however, using proper interpolation
and antialiasing (as we discuss in Chapter 10).
The chapter closed with a description of some simple methods for producing shadows of
objects. This is a complex subject, and many techniques have been developed. The two
algorithms described provide simple but partial solutions to the problem.
Greater realism can be attained with more elaborate techniques such as ray tracing and
radiosity. Chapter 14 develops the key ideas of these techniques.
first two matrices, have the shading model applied, followed by perspective division (no
clipping need be done) and by the viewport transformation. Each vertex emerges as the array
{x, y, z, b}, where x and y are screen coordinates, z is pseudodepth, and b is the grayscale
brightness of the vertex. Use a tool that draws filled polygons to do the actual rendering; if you
use OpenGL, use only its 2D drawing (and depth buffer) components. Experiment with
different mesh models, camera positions, and light sources to ensure that lighting is done
properly.
8.8.3. Case Study 8.3. Add Polygon Fill and Depth Buffer HSR.
(Level of Effort: III beyond that needed for Case Study 8.2.) Implement your own depth buffer,
and use it in the application of Case Study 8.2. This requires the development of a polygon fill
routine as well - see Chapter 10.
Extend Case Study 8.4 to include pasting texture like this onto the faces of a cube and an
icosahedron. Use a checkerboard texture, a color cube texture, and a wood grain texture (as
described in Chapter 14).
Form a sequence of images of a textured cube, where the cube moves slightly through the
material from frame to frame. The object will appear to slide through the texture in which it
is embedded. This gives a very different effect from an object moving with its texture attached.
Experiment with such animations.