Intel Extreme Graphics 2: Developer's Guide: Whitepaper
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY
ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN
INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL
DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR
WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, or life sustaining
applications.
Intel may make changes to specifications and product descriptions at any time, without notice.
Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for
future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
The Intel® Extreme Graphics 2 Developer Guide may contain design defects or errors known as errata which may cause the product to deviate
from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Intel, Pentium and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other
countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2003, Intel Corporation
Contents
1 Introduction
7 References
Figures
Figure 1: 865G Chipset Platform Block Diagram
Figure 2: Conventional Rendering versus Zone Rendering
Figure 3: A sample DirectX 9 program, creating a device using software vertex processing
Figure 4: DirectX 7 Solution to Checking Accurate Available Memory for Extreme Graphics 2
Figure 5: DirectX 8 and 9 Solution to Checking Accurate Available Memory for Extreme Graphics 2
Figure 6: Example of Z-fighting – A Wall and Poster Share a Plane
Figure 7: Code Snippet Showing Alternative to Using DirectX Z-bias Call
Figure 8: Z-fighting Solved with Projection Modification Solution
Revision History
1 Introduction
Desktop systems using Intel graphics continue to grow in volume and in penetration of the
consumer marketplace. According to a recent Mercury Research report on the PC
Graphics market, Intel was the number one supplier of graphics solutions to new PC
purchasers¹, resulting in a growing installed user base that offers application developers a
significant market opportunity. By integrating graphics capabilities into
mainstream and value desktop computing platforms, Intel lowers the cost of PC
components and gives a broader base of users access to high-quality, solid
mainstream features and performance. Each new generation of Intel’s graphics products will
continue to provide increasing levels of 2D, 3D, and video capability and performance.
The latest generation of Intel graphics, called Intel® Extreme Graphics 2, provides new
features and significant performance improvements over previous generations by offering
advanced techniques such as Zone Rendering 2 and Dynamic Video Memory
Technology (DVMT) 2.0. These features are unique to Intel products and are designed to
provide the level of graphics performance needed for mainstream computing.
This paper walks through each of these technologies at a high level and interjects
key software development tips needed to take full advantage of what
Intel Extreme Graphics 2 has to offer. By applying this information you may see
significant improvements in your application’s performance on the Intel integrated
graphics architecture and be able to reach a broader base of customers.
Intel Graphics Architecture
With conventional rendering, a scene made up of various 3D models is sent to the graphics
hardware, where each model and its associated polygons undergo a series of matrix
multiplications that transform them from model space (the local coordinate system of each
object) to world space (the coordinate system relative to the entire scene) and finally to view
space (the viewer’s coordinate system).
Next, light values are applied to the vertices of each triangle, which are then converted into
pixel or screen coordinates. Each resulting pixel is then given the proper texture color (or
texel) and depth-tested (depth testing is also known as z-testing, as z represents the depth
from the screen directly back into the monitor) to see whether it will be visible or whether
another pixel closer to the viewer is blocking it. Since the triangles are processed in the order
they are received by the graphics hardware (in the case of the 865G chipset, they are received
from the Pentium 4 processor), many pixels are written over several times as triangles closer
to the viewer are placed over those further back in the scene. This redundant rendering
consumes a great deal of memory bandwidth and does not give optimal results on the 865G
chipset.
Zone rendering aims to improve memory efficiency by reducing memory traffic. As with
conventional rendering, the scene is passed to the hardware, where its polygons are
transformed into view space and their vertices lit. Rather than going directly through
screen space conversion, however, zone rendering first sorts the polygons by the zone (a
region of the screen) in which they fall. Since each zone fits in the chipset’s on-chip cache,
the depth-testing and pixel blending operations are done quickly on-chip. This also means
that each pixel is written to frame-buffer memory only once.
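The sorting step described above can be illustrated with a small, hardware-independent sketch.
It is not Intel's implementation: the names ScreenTriangle and binTrianglesIntoZones, the
square zone shape, and the bounding-box overlap test are all assumptions made for
illustration, and the real zone size used by the 865G is not stated in this paper.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Screen-space triangle: three (x, y) vertex positions in pixels.
// Hypothetical type, introduced only for this sketch.
struct ScreenTriangle {
    float x[3];
    float y[3];
};

// Returns one list of triangle indices per zone (zones stored row by row).
// A triangle is added to every zone its bounding box overlaps, which is
// conservative (it may list a zone the triangle only nearly touches).
std::vector<std::vector<std::size_t>> binTrianglesIntoZones(
    const std::vector<ScreenTriangle>& triangles,
    int screenWidth, int screenHeight, int zoneSize)
{
    const int zonesX = (screenWidth + zoneSize - 1) / zoneSize;
    const int zonesY = (screenHeight + zoneSize - 1) / zoneSize;
    std::vector<std::vector<std::size_t>> bins(
        static_cast<std::size_t>(zonesX) * static_cast<std::size_t>(zonesY));

    for (std::size_t i = 0; i < triangles.size(); ++i) {
        const ScreenTriangle& t = triangles[i];
        const float minX = std::min({t.x[0], t.x[1], t.x[2]});
        const float maxX = std::max({t.x[0], t.x[1], t.x[2]});
        const float minY = std::min({t.y[0], t.y[1], t.y[2]});
        const float maxY = std::max({t.y[0], t.y[1], t.y[2]});

        // Skip triangles whose bounding box lies entirely off screen.
        if (maxX < 0.0f || maxY < 0.0f ||
            minX >= static_cast<float>(screenWidth) ||
            minY >= static_cast<float>(screenHeight)) {
            continue;
        }

        // Convert the bounding box to a clamped range of zone coordinates.
        const int zx0 = std::max(0, static_cast<int>(minX) / zoneSize);
        const int zx1 = std::min(zonesX - 1, static_cast<int>(maxX) / zoneSize);
        const int zy0 = std::max(0, static_cast<int>(minY) / zoneSize);
        const int zy1 = std::min(zonesY - 1, static_cast<int>(maxY) / zoneSize);

        for (int zy = zy0; zy <= zy1; ++zy) {
            for (int zx = zx0; zx <= zx1; ++zx) {
                bins[static_cast<std::size_t>(zy * zonesX + zx)].push_back(i);
            }
        }
    }
    return bins;
}
```

Once the triangles are binned this way, each zone can be rendered to completion from its own
list, which is what lets depth testing and blending stay in a small on-chip buffer.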
Rendering a scene with conventional rendering can require significantly more memory
bandwidth than rendering the same scene with Zone Rendering 2. The bandwidth (and
performance) benefit of Zone Rendering 2 scales with the depth complexity of a given scene.
While no specific coding techniques are required to enable Zone Rendering 2, there are
several things a programmer can do to take full advantage of its performance benefits.
These tips are described in detail in Section 3.3 – Enabling Zone Rendering 2.
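As a rough illustration of how the saving scales with depth complexity, the sketch below
compares frame-buffer color writes per frame under a deliberately simple model: conventional
rendering writes each pixel once per overlapping layer (worst-case draw order), while zone
rendering writes each pixel once. The function names are hypothetical, and depth-buffer and
texture traffic are ignored, so treat the numbers as an estimate rather than a measurement.

```cpp
#include <cstdint>

// Estimated color bytes written per frame by conventional rendering, assuming
// every overlapping layer of a pixel is written (back-to-front worst case).
std::uint64_t conventionalColorWriteBytes(std::uint32_t width, std::uint32_t height,
                                          std::uint32_t bytesPerPixel,
                                          double depthComplexity)
{
    const double pixels = static_cast<double>(width) * static_cast<double>(height);
    return static_cast<std::uint64_t>(pixels * bytesPerPixel * depthComplexity);
}

// Estimated color bytes written per frame when each pixel reaches the frame
// buffer exactly once, as described for zone rendering.
std::uint64_t zoneRenderingColorWriteBytes(std::uint32_t width, std::uint32_t height,
                                           std::uint32_t bytesPerPixel)
{
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}
```

Under this model a 1024x768 scene at 4 bytes per pixel with an average depth complexity of 3
writes three times as many color bytes conventionally as it does with one write per pixel.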
video memory by the BIOS. This amount can be one, four, eight, sixteen or in some rare
cases thirty-two megabytes.
DVMT 2.0 allows additional system memory to be dynamically allocated for graphics use
based on application need. Once the application is closed, the memory that was allocated is
released and becomes available for system use again. The purpose of dynamically allocating
memory for graphics use is to strike a sound balance between system performance and
graphics performance.
For example, if a user is simply editing text, there is no need for graphics to take up a large
amount of the system’s memory; in that case it is best to leave more memory to the system.
On the other hand, if the user starts a 3D game, more of the shared memory needs to be used
as graphics memory.
On boot-up, the user can choose in the system’s BIOS the amount of system memory to be
permanently used by the graphics controller. Once selected, this memory is never given
back to the system. This memory is also reported as local video memory in Microsoft
DirectX* applications. Once the operating system has started, the graphics driver
dynamically allocates graphics memory based on requests from each application run by the
user. For systems with 128 MB or less of system memory, a maximum of 32 MB will be set
aside for graphics. For systems with more than 128 MB of memory, a maximum of 64 MB
will be allocated for use by the graphics controller. These maximum values include both the
memory permanently set aside by the BIOS and the memory dynamically allocated by the
driver.
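The limits stated above can be written down as a small calculation. This is only a sketch of
the arithmetic described in the text; the function names are hypothetical and nothing here
queries the driver or the BIOS.

```cpp
// Maximum total graphics memory (BIOS pre-allocated plus dynamically
// allocated), in MB, for a given amount of system memory, per the limits
// stated in the text: 32 MB up to 128 MB of system memory, 64 MB above that.
unsigned maxGraphicsMemoryMB(unsigned systemMemoryMB)
{
    return (systemMemoryMB <= 128u) ? 32u : 64u;
}

// Headroom left for dynamic allocation once the BIOS amount is subtracted.
// Returns 0 if the BIOS setting already reaches the maximum.
unsigned maxDynamicGraphicsMemoryMB(unsigned systemMemoryMB, unsigned biosAllocatedMB)
{
    const unsigned maxTotal = maxGraphicsMemoryMB(systemMemoryMB);
    return (biosAllocatedMB >= maxTotal) ? 0u : (maxTotal - biosAllocatedMB);
}
```

For instance, a 256 MB system with 8 MB set aside in the BIOS leaves up to 56 MB that the
driver may allocate dynamically.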
Table 2: Maximum Video/Graphics Memory Allocated Based on Total System Memory for the
865G Chipset
3 Developer Tips
Vendor ID 0x8086
Device ID 0x2572
See Appendix B for a list of previous Intel Integrated Graphics Part IDs.
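A minimal check against these two IDs might look like the sketch below. The constant and
function names are hypothetical. In a DirectX 9 application the two values would normally be
read from the VendorId and DeviceId fields of D3DADAPTER_IDENTIFIER9 after a call to
IDirect3D9::GetAdapterIdentifier; here they are plain parameters so the check stands alone.

```cpp
#include <cstdint>

// IDs taken from the table above.
constexpr std::uint32_t kIntelVendorId = 0x8086;
constexpr std::uint32_t kIntel865GDeviceId = 0x2572;

// True when the adapter reports the Intel 865G integrated graphics IDs.
bool isIntel865G(std::uint32_t vendorId, std::uint32_t deviceId)
{
    return vendorId == kIntelVendorId && deviceId == kIntel865GDeviceId;
}
```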
When the CPU is used for transform and lighting operations, the DirectX transform and
lighting pipe optimized for the Pentium 4 processor is utilized. This “Processor Specific
Graphics Pipeline” (PSGP) allows Intel’s Extreme Graphics controller to offload the
transform and lighting operations to software while still providing excellent performance.
Figure 3: A sample DirectX 9 function: Detects Intel 865G Chipset and Enables Software
Vertex Processing (see Appendix A for full source)
//-----------------------------------------------------------------------------
// Name: SetVertexProcessingMode
// Desc: Checks HW TnL caps and IDs 865G to enable SW TnL
//-----------------------------------------------------------------------------
DWORD SetVertexProcessingMode( LPDIRECT3D9 pD3D )
{
DWORD vertexprocessingmode; // vertex processing mode
D3DCAPS9 caps; // structure that stores device caps...
D3DADAPTER_IDENTIFIER9 adapterID; // Used to store device info
...
return vertexprocessingmode;
}
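Because the body of the function is elided in the figure above, the decision it is described
as making (check for hardware T&L, otherwise identify the 865G and fall back to software
T&L) is sketched here without any DirectX dependency. The enum and function names are
hypothetical and this is not the code from Appendix A. In a real DirectX 9 program the first
input would typically come from testing D3DCAPS9::DevCaps against
D3DDEVCAPS_HWTRANSFORMANDLIGHT, and the result would be mapped to
D3DCREATE_HARDWARE_VERTEXPROCESSING or D3DCREATE_SOFTWARE_VERTEXPROCESSING.

```cpp
// Possible outcomes of the vertex processing decision (hypothetical names).
enum class VertexProcessingChoice {
    Hardware,     // device reports hardware transform and lighting
    Software,     // Intel 865G: use the CPU-optimized software pipeline
    BelowMinSpec  // neither hardware T&L nor a recognized 865G
};

// Chooses a vertex processing mode from two facts about the adapter.
VertexProcessingChoice chooseVertexProcessing(bool hasHardwareTnL, bool isIntel865G)
{
    if (hasHardwareTnL) {
        return VertexProcessingChoice::Hardware;
    }
    if (isIntel865G) {
        return VertexProcessingChoice::Software;
    }
    return VertexProcessingChoice::BelowMinSpec;
}
```

Treating the unrecognized case as a separate result lets the caller decide whether to refuse
to run or to try software vertex processing anyway.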
To accurately detect the amount of memory available to Intel’s integrated graphics devices,
developers need to check the total video memory available, not just local video memory.
Local video memory is the memory permanently set aside by the BIOS for use as graphics
memory. Non-local video memory is the memory beyond the local video memory that is
dynamically allocated based on requests from applications. Local and non-local video
memory together make up the total amount of video memory, and both are handled
identically for memory accesses.
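The practical consequence is that a capacity check based on local memory alone can reject a
system that has plenty of usable graphics memory. The sketch below shows the arithmetic only;
VideoMemoryInfo and both functions are hypothetical names, and the byte counts would have to
be filled in from whatever DirectX query the application uses.

```cpp
#include <cstdint>

// Video memory amounts in bytes, as reported to the application.
struct VideoMemoryInfo {
    std::uint64_t localBytes;     // permanently set aside by the BIOS
    std::uint64_t nonLocalBytes;  // dynamically allocated beyond local memory
};

// Total video memory is the sum of local and non-local memory.
std::uint64_t totalVideoMemoryBytes(const VideoMemoryInfo& info)
{
    return info.localBytes + info.nonLocalBytes;
}

// Capacity check against the total rather than against local memory only.
bool fitsInVideoMemory(const VideoMemoryInfo& info, std::uint64_t requiredBytes)
{
    return requiredBytes <= totalVideoMemoryBytes(info);
}
```

With 8 MB of local memory and 56 MB of non-local memory, a 32 MB requirement passes this
check even though it exceeds the local amount.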
The code snippet below outlines the function calls necessary to most accurately check the
memory available for use by the graphics controller within DirectX.
Figure 4: DirectX 7 Solution to Checking Accurate Available Memory for Extreme Graphics 2
...
DDSCAPS2 ddsVidMemcaps;
ZeroMemory(&ddsVidMemcaps, sizeof(DDSCAPS2));
ddsVidMemcaps.dwCaps = DDSCAPS_VIDEOMEMORY;
...
Figure 5: DirectX 8 and 9 Solution to Checking Accurate Available Memory for Extreme
Graphics 2
...
For example, if a render-to-texture technique is used to create a shadow effect, the render to
texture could be completed at a variety of points in the frame. Completing the render to
texture before rendering the scene increases performance by allowing Zone Rendering 2
mode to stay active and by avoiding unnecessary buffer evictions.
Clearing all buffers together, once per frame, is the best option. When this is not
possible, the depth and stencil buffers should at least be cleared together.
Similarly, partially clearing buffers can have a negative impact on performance. Fast clear
options that clear the entire buffer quickly cannot be used if a buffer is only partially
cleared.
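One way to follow this advice is to build a single flags value covering every buffer the
application owns and issue one full-surface clear per frame. The sketch below only assembles
the flags. The constants are local stand-ins that mirror the values of D3DCLEAR_TARGET,
D3DCLEAR_ZBUFFER and D3DCLEAR_STENCIL from the DirectX headers, and the function name is
hypothetical. In a DirectX 9 program the result would be passed as the Flags argument of one
IDirect3DDevice9::Clear call with a rectangle count of 0 and a NULL rectangle list, so that
the whole buffer is cleared rather than part of it.

```cpp
#include <cstdint>

// Local stand-ins mirroring the DirectX clear flag values (assumption: these
// are redefined here only so the sketch compiles without the DirectX SDK).
constexpr std::uint32_t kClearTarget  = 0x1;  // color buffer
constexpr std::uint32_t kClearZBuffer = 0x2;  // depth buffer
constexpr std::uint32_t kClearStencil = 0x4;  // stencil buffer

// Builds the flags for clearing every buffer in a single call.
std::uint32_t fullFrameClearFlags(bool hasDepthBuffer, bool hasStencilBuffer)
{
    std::uint32_t flags = kClearTarget;
    if (hasDepthBuffer) {
        flags |= kClearZBuffer;
    }
    if (hasStencilBuffer) {
        flags |= kClearStencil;
    }
    return flags;
}
```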
• Use level-of-detail
Figure 6 above demonstrates the type of visual artifact commonly seen when z-fighting
occurs. Using the D3DRS_DEPTHBIAS render state in DirectX 9 (D3DRS_ZBIAS in
DirectX 8 and earlier) addresses this issue; however, the same visual artifacts can still
occur on other hardware.
An alternate method of addressing this issue is to load a new projection matrix in which the
near and far clipping planes have been pushed out (away from the viewer). Loading this
projection matrix before rendering any object that should appear closer to the viewer places
that object closer in the z-buffer. Sample code that accomplishes this is shown below. In
this case it applies a z-bias to the poster so that it is displayed correctly on the wall.
// The "zbiased" projection has its near and far clipping planes pushed out...
D3DXMatrixPerspectiveFovLH( &matProj_zbias, D3DX_PI/4, 1.0f, 1.5f, 110.0f );
. . .
// Wall is rendered...
// Poster is rendered...
. . .
While some adjustments to the projection matrix may still be necessary to get the desired
results, this technique is more consistent across a variety of graphics hardware. Below we
can see the result of the alternate solution.
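The effect of pushing the clipping planes out can be checked numerically. For a left-handed
perspective projection of the kind built by D3DXMatrixPerspectiveFovLH, a point at view-space
distance z ends up with depth zf / (zf - zn) * (1 - zn / z) after the perspective divide. The
sketch below evaluates that expression. The 1.5 and 110.0 planes come from the snippet above;
the 1.0 and 100.0 planes used for the unbiased projection in the usage note are assumed
values, since the paper does not show the original matrix, and the function name is
hypothetical.

```cpp
// Depth value (0 at the near plane, 1 at the far plane) that a left-handed
// perspective projection assigns to a point at view-space distance z.
double projectedDepth(double z, double zNear, double zFar)
{
    return (zFar / (zFar - zNear)) * (1.0 - zNear / z);
}
```

At a distance of 50 units, projectedDepth(50.0, 1.0, 100.0) is roughly 0.990 while
projectedDepth(50.0, 1.5, 110.0) is roughly 0.983, so geometry drawn with the pushed-out
planes receives a smaller depth value and wins the depth test against coplanar geometry
drawn with the assumed original planes.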
4 Summary
Intel’s Integrated Graphics architecture is unique in the PC marketplace. It is designed to
make more balanced use of all the platform components, not just the graphics core, to
deliver an excellent user experience for mainstream computing at low added cost to the
complete system. The increasing use of Intel's graphics architecture in the
marketplace provides an incentive to support this solution if your application is targeted at
mainstream PC users.
As this document has described, there are some key paths you can take in your application
development to provide a better user experience on Intel's graphics architecture. By
applying these techniques in your application you will likely see improvements in its
visual quality and performance on Intel integrated graphics platforms.
For additional information, search for graphics on developer.intel.com. Questions,
comments, and concerns can be sent to kipp.owens@intel.com.
Appendix A: Creating a DirectX 9 Device, Identifying Intel 865G Chipset
The program is a modification of the DirectX 9 SDK tutorial “Vertices.” It requires the DirectX 9
SDK to compile and must be linked with d3d9.lib.
#include <d3d9.h>
#include <stdlib.h> // exit()
#include <string.h> // strcpy()
//-----------------------------------------------------------------------------
// Global variables
//-----------------------------------------------------------------------------
LPDIRECT3D9 g_pD3D = NULL; // Used to create the D3DDevice
LPDIRECT3DDEVICE9 g_pd3dDevice = NULL; // Our rendering device
LPDIRECT3DVERTEXBUFFER9 g_pVB = NULL; // Buffer to hold vertices
DWORD g_VertexProcessingMode = 0; // Used to set SW or HW vert proc.
//-----------------------------------------------------------------------------
// Name: SetVertexProcessingMode
// Desc: Checks HW TnL caps and IDs 865G to enable SW TnL
//-----------------------------------------------------------------------------
DWORD SetVertexProcessingMode( LPDIRECT3D9 pD3D )
{
DWORD vertexprocessingmode; // vertex processing mode
D3DCAPS9 caps; // structure that stores device caps...
D3DADAPTER_IDENTIFIER9 adapterID; // Used to store device info
...
{
// check vendor and device ID and enable software vertex processing for
// Intel(R) 865G...
return vertexprocessingmode;
}
//-----------------------------------------------------------------------------
// Name: InitD3D()
// Desc: Initializes Direct3D
//-----------------------------------------------------------------------------
HRESULT InitD3D( HWND hWnd )
{
// Create the D3D object.
if( NULL == ( g_pD3D = Direct3DCreate9( D3D_SDK_VERSION ) ) )
return E_FAIL;
char mode_str[255];
switch( g_VertexProcessingMode )
{
case E_FAIL: MessageBox( hWnd, "Error identifying GPU", "Error", MB_OK );
exit( E_FAIL );
case E_MINSPEC: MessageBox( hWnd, "GPU does not meet minimum specs: Intel(R) 865G or Hardware T&L chip required", "Error", MB_OK );
exit( E_MINSPEC );
case D3DCREATE_HARDWARE_VERTEXPROCESSING:
strcpy( mode_str, "Hardware T&L Enabled" );
break;
case D3DCREATE_SOFTWARE_VERTEXPROCESSING:
strcpy( mode_str, "Software T&L Enabled" );
break;
}
...
//-----------------------------------------------------------------------------
// Name: InitVB()
// Desc: Creates a vertex buffer and fills it with our vertices. The vertex
// buffer is basically just a chunk of memory that holds vertices. After
// creating it, we must Lock()/Unlock() it to fill it. For indices, D3D
// also uses index buffers. The special thing about vertex and index
// buffers is that they can be created in device memory, allowing some
// cards to process them in hardware, resulting in a dramatic
// performance gain.
//-----------------------------------------------------------------------------
HRESULT InitVB()
{
// Initialize three vertices for rendering a triangle
CUSTOMVERTEX vertices[] =
{
{ 150.0f, 50.0f, 0.5f, 1.0f, 0xffff0000, }, // x, y, z, rhw, color
{ 250.0f, 250.0f, 0.5f, 1.0f, 0xff00ff00, },
{ 50.0f, 250.0f, 0.5f, 1.0f, 0xff00ffff, },
};
return S_OK;
}
//-----------------------------------------------------------------------------
// Name: Cleanup()
// Desc: Releases all previously initialized objects
//-----------------------------------------------------------------------------
VOID Cleanup()
{
if( g_pVB != NULL )
g_pVB->Release();
...
//-----------------------------------------------------------------------------
// Name: Render()
// Desc: Draws the scene
//-----------------------------------------------------------------------------
VOID Render()
{
// Clear the backbuffer to black
g_pd3dDevice->Clear( 0, NULL, D3DCLEAR_TARGET, D3DCOLOR_XRGB(0,0,0), 1.0f, 0 );
//-----------------------------------------------------------------------------
// Name: MsgProc()
// Desc: The window's message handler
//-----------------------------------------------------------------------------
LRESULT WINAPI MsgProc( HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam )
{
switch( msg )
{
case WM_DESTROY:
Cleanup();
PostQuitMessage( 0 );
return 0;
}
//-----------------------------------------------------------------------------
// Name: WinMain()
// Desc: The application's entry point
//-----------------------------------------------------------------------------
INT WINAPI WinMain( HINSTANCE hInst, HINSTANCE, LPSTR, INT )
{
// Register the window class
WNDCLASSEX wc = { sizeof(WNDCLASSEX), CS_CLASSDC, MsgProc, 0L, 0L,
GetModuleHandle(NULL), NULL, NULL, NULL, NULL,
"Enabling Software T&L for 865G", NULL };
RegisterClassEx( &wc );
...
// Initialize Direct3D
if( SUCCEEDED( InitD3D( hWnd ) ) )
{
// Create the vertex buffer
if( SUCCEEDED( InitVB() ) )
{
// Show the window
ShowWindow( hWnd, SW_SHOWDEFAULT );
UpdateWindow( hWnd );
Appendix B – Intel Integrated Graphics Part IDs
7 References
Intel Extreme Graphics 2 Homepage: http://developer.intel.com/design/graphics2/