Equalizer Programming and User Guide
Contributors
Written by Stefan Eilemann.
Contributions by Daniel Nachbaur, Maxim Makhinya, Jonas Bösch, Christian Marten,
Sarah Amsellem, Patrick Bouchaud, Philippe Robert, Robert Hauck and Lucas
Peetz Dulley.
Copyright
©2007-2013 Eyescale Software GmbH. All rights reserved. No permission is granted
to copy, distribute, or create derivative works from the contents of this electronic
documentation in any manner, in whole or in part, without the prior written per-
mission of Eyescale Software GmbH.
Feedback
If you have comments about the content, accuracy or comprehensibility of this
Programming and User Guide, please contact eile@eyescale.ch.
Contents
I. User Guide

1. Introduction
   1.1. Parallel Rendering
   1.2. Installing Equalizer and Running eqPly
   1.3. Equalizer Processes
        1.3.1. Server
        1.3.2. Application
        1.3.3. Render Clients
        1.3.4. Administration Programs

2. Scalable Rendering
   2.1. 2D or Sort-First Compounds
   2.2. DB or Sort-Last Compounds
   2.3. Stereo Compounds
   2.4. DPlex Compounds
   2.5. Tile Compounds
   2.6. Pixel Compounds
   2.7. Subpixel Compounds
   2.8. Automatic Runtime Adjustments
B. File Format
   B.1. File Format Version
   B.2. Global Section
   B.3. Server Section
        B.3.1. Connection Description
        B.3.2. Config Section
        B.3.3. Node Section
        B.3.4. Pipe Section
        B.3.5. Window Section
        B.3.6. Channel Section
        B.3.7. Observer Section
        B.3.8. Layout Section
        B.3.9. View Section
        B.3.10. Canvas Section
        B.3.11. Segment Section
        B.3.12. Compound Section
List of Figures
1. Parallel Rendering
2. Equalizer Processes
3. 2D Compound
4. Database Compound
5. Stereo Compound
6. A DPlex Compound
7. Tile Compound
8. Pixel Compound
9. Pixel Compound Kernel
10. Example Pixel Kernels for a four-to-one Pixel Compound
11. Subpixel Compound
12. GPU discovery for auto-configuration
13. An Example Configuration
14. Wall and Projection Parameters
15. A Canvas using four Segments
16. Layout with four Views
17. Display Wall using a six-Segment Canvas with a two-View Layout
18. 2D Load-Balancing
19. Cross-Segment Load-Balancing for two Segments using eight GPUs
20. Cross-Segment Load-Balancing for a CAVE
21. Dynamic Frame Resolution
22. Monitoring a Projection Wall
23. Hello, World!
24. Namespaces
25. Equalizer client UML map
26. Simplified Execution Model
Revision History
Rev Date Changes
1.0 Oct 28, 2007 Initial Version for Equalizer 0.4
1.2 Apr 15, 2008 Revision for Equalizer 0.5
1.4 Nov 25, 2008 Revision for Equalizer 0.6
1.6 Aug 07, 2009 Revision for Equalizer 0.9
1.8 Mar 21, 2011 Revision for Equalizer 1.0
1.10 Feb 17, 2012 Revision for Equalizer 1.2
1.12 Jul 20, 2012 Revision for Equalizer 1.4
1.14 Jul 25, 2013 Revision for Equalizer 1.6
Part I.
User Guide
1. Introduction
Equalizer is the standard middleware for the development and deployment of par-
allel OpenGL applications. It enables applications to benefit from multiple graph-
ics cards, processors and computers to improve the rendering performance, visual
quality and display size. An Equalizer-based application runs unmodified on any
visualization system, from a simple workstation to large scale graphics clusters,
multi-GPU workstations and Virtual Reality installations.
This User and Programming Guide introduces parallel rendering concepts, the
configuration of Equalizer-based applications and programming using the Equalizer
parallel rendering framework.
Equalizer is the most advanced middleware for scalable 3D visualization, provid-
ing the broadest set of parallel rendering features available in an open source library
to any visualization application. Many commercial and open source applications in
a variety of different markets rely on Equalizer for flexibility and scalability.
Equalizer provides the domain-specific parallel rendering expertise and abstracts
configuration, threading, synchronization, windowing and event handling. It is a
‘GLUT on steroids’, providing parallel and distributed execution, scalable rendering
features, an advanced network library and fully customizable event handling.
If you have any questions regarding Equalizer programming, this guide, or other
specific problems you encountered, please direct them to the eq-dev mailing list.
The Windows build is similar, except that CMake will generate a Visual Studio
solution which is used to build Equalizer.
1.3. Equalizer Processes

…performance, object-oriented, versioned data distribution. Collage is designed for
low-overhead multi-threaded execution, which allows applications to easily exploit
multi-core architectures. Equalizer uses Collage as the cluster backend, e.g., by
setting up direct communication between two nodes when needed for image compositing
or software swap barriers.

Figure 2: Equalizer Processes
Figure 2 depicts the relationship between the server, application, render client
and administrative processes, which are explained below.
1.3.1. Server
The Equalizer server is responsible for managing one visualization session on a
shared memory system or graphics cluster. Based on its configuration and con-
trolling input from the application, it computes the active resources, updates the
configuration and generates tasks for all processes. Furthermore it controls and
launches the application’s rendering client processes. The Equalizer server is the
entity in charge of the configuration, and all other processes receive their configura-
tion from the server. It typically runs as a separate entity within separate threads
in the application process.
7 http://www.equalizergraphics.com/downloads.html
8 http://www.equalizergraphics.com/documents/EqualizerGuide.html
9 https://launchpad.net/~eilemann/+archive/equalizer
10 https://github.com/Eyescale/portfiles
1.3.2. Application
The application connects to an Equalizer server and receives a configuration. Fur-
thermore, the application also provides its render client, which will be controlled
by the server. The application and render client may use the same executable. The
application has a main loop, which reacts on events, updates its data and controls
the rendering.
2. Scalable Rendering
Scalable rendering is a subset of parallel rendering, where multiple resources are
used to update a single view.
Real-time visualization is an inherently parallel problem. Different applications
have different rendering algorithms, which require different scalable rendering modes
to address the bottlenecks correctly. Equalizer supports all important algorithms
as listed below, and will continue to add new ones over time to meet application
requirements.
This section gives an introduction to scalable rendering, providing some back-
ground for end users and application developers. The scalability modes offered by
Equalizer are discussed, along with their advantages and disadvantages.
Choosing the right mode for the application profile is critical for performance.
Equalizer uses the concept of compounds to describe the task decomposition and
result recomposition. It allows the different compound modes to be combined in any
possible way, so that different bottlenecks can be addressed flexibly.
This lowers the requirements on all parts of the rendering pipeline: main
memory usage, IO bandwidth, GPU memory usage, vertex processing and fill rate.
Unfortunately, the database recomposition has linearly increasing IO requirements
for the pixel transfer. Parallel compositing algorithms, such as direct-send, address
this problem by keeping the per-node IO constant (see Figure 51).
The application has to partition the database so that the rendering units render
only part of the database. Some OpenGL features do not work correctly (anti-
aliasing) or need special attention (transparency, shadows).
The best use of database compounds is to divide the data to a manageable size,
and then to use other decomposition modes to achieve further scalability. Volume
rendering is one of the applications which can profit from database compounds.
DB compounds in Equalizer are configured using the range parameter, using the
values [ begin end ] in normalized coordinates. The range defines the start and end
point of the application’s database to be rendered. The value has to be interpreted
by the application’s rendering code accordingly. Each child compound uses an
output frame, which is connected to an input frame on the destination channel. For
more than two contributing channels, it is recommended to configure streaming or
parallel direct send compositing, as described in Section 7.2.9.
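As an illustration, a minimal DB compound for two contributing channels might be written as the following sketch; the channel names, the 50/50 range split and the frame name are placeholders, and Appendix B.3.12 documents the complete compound syntax:

compound
{
    channel "destination"

    compound
    {
        range [ 0 .5 ]                        # the destination renders the first half itself
    }
    compound
    {
        channel "source1"
        range   [ .5 1 ]                      # second half, rendered by another channel
        outputframe { name "frame.source1" }  # read back and transferred
    }
    inputframe { name "frame.source1" }       # composited onto the destination
}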
3. Configuring Equalizer Clusters
3.1.2. Usage
On each node contributing to the configuration, install and start the hwsd daemon.
If multiple, disjoint configurations are used on the same network, provide the session
name as a parameter when starting the daemon. Verify that all GPUs and network
interfaces are visible on the application host using the hw_sd_list tool. When starting
the application, use the command-line parameter --eq-config:
• The default value is local which uses all local GPUs and network interfaces
queried using the cgl, glx, or wgl GPU modules and the sys network module.
• --eq-config sessionname uses the dns_sd ZeroConf module of hwsd to query all
GPUs and network interfaces in the subnet for the given session. The default
session of the hwsd daemon is default. The found network interfaces are used
to connect the nodes.
• --eq-config filename.eqc loads a static configuration from the given ASCII file.
The following sections describe how to write configuration files.
The auto-configuration creates one display window on the local machine, and one
off-screen channel for each GPU. The display window has one full-window channel
used as an output channel for a single segment. It combines all GPUs into a
scalability config with different layouts for each of the following scalability modes:
DB_2D A direct send sort-last configuration using all nodes, together with a
dynamically load-balanced sort-first configuration using local GPUs.
All suitable network interfaces are used to configure the nodes, that is, the launch
command has to be able to resolve one hostname for starting the render client
processes. Suitable interfaces have to be up and match optional given values which
can be specified by the following command-line parameters:
3.2. Preparation
Before writing a configuration, it is useful to assemble the following information:
3.3. Summary
Equalizer applications are configured at runtime by the Equalizer server. The server
loads its configuration from a text file, which is a one-to-one representation of the
configuration data structures at runtime.
For an extensive documentation of the file format please refer to Appendix B.
This section gives an introduction on how to write configuration files.
A configuration consists of the declaration of the rendering resources, the descrip-
tion of the physical layout of the projection system, logical layouts on the projection
canvases and an optional decomposition description using the aforementioned re-
sources.
The rendering resources are represented in a hierarchical tree structure which cor-
responds to the physical and logical resources found in a 3D rendering environment:
nodes (computers), pipes (graphics cards), windows, channels.
Physical layouts of display systems are configured using canvases with segments,
which represent 2D rendering areas composed of multiple displays or projectors.
Logical layouts are applied to canvases and define views on a canvas.
Scalable resource usage is configured using a compound tree, which is a hierar-
chical representation of the rendering decomposition and recomposition across the
resources.
Figure 13: An Example Configuration
3.4. Node
For each machine in your cluster, create one node. Create one appNode for your
application process. List all nodes, even if you are not planning to use them at
first. Equalizer will only instantiate and access used nodes, that is, nodes which are
referenced by an active compound.
In each node, list all connections through which this node is reachable. Typically
a node uses only one connection, but it is possible to configure multiple connections
if the machine and cluster is set up to use multiple, independent network interfaces.
Make sure the configured hostname is reachable from all nodes. An IP address may
be used as the hostname.
For cluster configurations with multiple nodes, configure at least one connection
for the server. All render clients connect back to the server, for which this connection
is needed.
The eq::Node class is the representation of a single computer in a cluster. One
operating system process of the render client will be used for each node. Each
configuration might also use an application node, in which case the application
process is also used for rendering. All node-specific task methods are executed from
the main application thread.
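A corresponding skeleton, with placeholder addresses and channel names, might look as follows; Appendix B.3 documents the full server, connection and node syntax:

server
{
    connection { hostname "192.168.0.1" port 4242 }  # render clients connect back here
    config
    {
        appNode
        {
            connection { hostname "192.168.0.1" }
            pipe { window { channel { name "destination" }}}
        }
        node
        {
            connection { hostname "192.168.0.2" }    # one node section per machine
            pipe { window { channel { name "source1" }}}
        }
    }
}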
3.5. Pipe
For each node, create one pipe for each graphics card in the machine. Set the device
number to the correct index. On operating systems using X11, e.g., Linux, also set
the port number if your X-Server is running on a nonstandard port.
The eq::Pipe class is the abstraction of a graphics card (GPU), and uses one inde-
pendent operating system thread for rendering. Non-threaded pipes are supported
for integrating with thread-unsafe libraries, but have various performance caveats.
They should only be used if using a different, synchronized rendering thread is not
an option.
All pipe, window and channel task methods are executed from the pipe thread,
or in the case of non-threaded pipes from the main application thread14 .
3.6. Window
Configure one window for each desired application window on the appNode. Con-
figure one full-screen window for each display segment. Configure one off-screen
window, typically a pbuffer, for each graphics card used as a source for scalable
rendering. Provide a useful name to each on-screen window if you want to easily
identify it at runtime.
Sometimes display segments cover only a part of the graphics card output. In this
case it is advised to configure a non-fullscreen window without window decorations,
using the correct window viewport.
The eq::Window class encapsulates a drawable and an OpenGL context. The
drawable can be an on-screen window or an off-screen pbuffer or framebuffer object
(FBO).
14 see http://www.equalizergraphics.com/documents/design/nonthreaded.html
3.7. Channel
Configure one channel for each desired rendering area in each window. Typically
one full-screen channel per window is used. Name the channel using a unique, easily
identifiable name, e.g., ’source-1’, ’control-2’ or ’segment-2 3’.
Multiple channels in application windows may be used to view the model from
different viewports. Sometimes, a single window is split across multiple projectors,
e.g., by using an external splitter such as the Matrox TripleHead2Go. In this case
configure one channel for each segment, using the channel’s viewport to configure
its position relative to the window.
The eq::Channel class is the abstraction of an OpenGL viewport within its parent
window. It is the entity executing the actual rendering. The channel’s viewport is
overwritten when it is rendering for another channel during scalable rendering.
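Putting the pipe, window and channel sections together, a render node with one GPU driving an off-screen source might be sketched as follows; the device number, window viewport and names are example values:

node
{
    connection { hostname "node1" }
    pipe
    {
        device 0                                  # first GPU; add 'port' for a non-standard X server
        window
        {
            name       "source window"
            viewport   [ 0 0 1280 800 ]
            attributes { hint_drawable pbuffer }  # off-screen drawable used for scalable rendering
            channel    { name "source-1" }
        }
    }
}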
3.8. Canvases
If you are writing a configuration for workstation usage you can skip the following
sections and restart with Section 3.11.
Configure one canvas for each display surface. For planar surfaces, e.g., a display
wall, configure a frustum. For non-planar surfaces, the frustum will be configured
on each display segment.
The frustum can be specified as a wall or projection description. Take care to
choose your reference system for describing the frustum to be the same system as
used by the head-tracking matrix calculated by the application. A wall is completely
defined by the bottom-left, bottom-right and top-left coordinates relative to the
origin. A projection is defined by the position and head-pitch-roll orientation of
the projector, as well as the horizontal and vertical field-of-view and distance of the
projection wall.
Figure 14 illustrates the wall and projection frustum parameters.

Figure 14: Wall and Projection Parameters
3.8.1. Segments
Configure one segment for each display or projector of each canvas. Configure the
viewport of the segment to match the area covered by the segment on the physical
canvas. Set the output channel to the resource driving the described projector.
For non-planar displays, configure the frustum as described in Section 3.8. For
passive stereo installations, configure one segment per eye pass, where the segment
for the left and right eye have the same viewport. Set the eyes displayed by the
segment, i.e., left or right and potentially cyclop.
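For example, a planar canvas with a wall frustum and two segments for a passive stereo projection could be sketched like this; the coordinates and channel names are placeholders, and the canvas would additionally reference a layout (Section 3.9):

canvas
{
    wall
    {
        bottom_left  [ -1.6 -1.0 -1 ]
        bottom_right [  1.6 -1.0 -1 ]
        top_left     [ -1.6  1.0 -1 ]
    }
    segment
    {
        channel  "projector-left"
        viewport [ 0 0 1 1 ]          # same viewport for both eye passes
        eye      [ LEFT CYCLOP ]
    }
    segment
    {
        channel  "projector-right"
        viewport [ 0 0 1 1 ]
        eye      [ RIGHT ]
    }
}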
To synchronize the video output, configure either a canvas swap barrier or a swap
barrier on each segment to be synchronized.
When using software swap synchronization, swap-lock all segments using a swap
barrier. All windows with a swap barrier of the same name synchronize their swap-
buffers. Software swap synchronization uses a distributed barrier, and works on all
hardware.
When using hardware swap synchronization, use swap barriers for all segments to
be synchronized, setting NV group and NV barrier appropriately. The swap barrier
name is ignored in this case. All windows of the same group on a single node
synchronize their swap buffer. All groups of the same barrier synchronize their
swap buffer across nodes. Please note that the driver typically limits the number
of groups and barriers to one, and that multiple windows per swap group are not
supported by all drivers. Hardware swap barriers require support from the OpenGL
driver, and have been tested on NVIDIA Quadro GPUs with the G-Sync option.
Please refer to your OpenGL driver documentation for details.
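The two variants might be written as sketched below; the placement at canvas or segment level follows the description above, but the exact keyword spelling should be verified against Appendix B:

canvas
{
    # software swap barrier: all windows using the same barrier name swap together
    swapbarrier { name "wall" }

    # hardware variant using the NV swap group extension (the name is ignored):
    # swapbarrier { NV_group 1 NV_barrier 1 }
}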
A segment represents one output channel of the canvas, e.g., a projector or an LCD.
A segment has an output channel, which references the channel to which the display
device is connected.

A segment covers a part of its parent canvas, which is configured using the segment
viewport. The viewport is in normalized coordinates relative to the canvas. Segments
might overlap (edge-blended projectors) or have gaps between each other (display
walls, Figure 15). The viewport is used to configure the segment's default frustum
from the canvas frustum description, and to place layout views correctly.

Figure 15: A Canvas using four Segments
within the compound tree, that is, a compound from one compound tree cannot be
synchronized with a compound from another compound tree.
A hardware swap barrier uses a hardware component to synchronize the buffer
swap of multiple windows. It guarantees that the swap happens at the same vertical
retrace of all corresponding video outputs. It is configured by setting the NV group
and NV barrier parameters. These parameters follow the NV swap group extension,
which synchronizes all windows bound to the same group on a single machine, and
all groups bound to the same barrier across systems.
Display synchronization uses different algorithms. Framelock synchronizes each
vertical retrace of multiple graphic outputs using a hardware component. This is
configured in the driver, independently of the application. Genlock synchronizes
each horizontal and vertical retrace of multiple graphic outputs, but is not com-
monly used anymore for 3D graphics. Swap lock synchronizes the front and back
buffer swap of multiple windows, either using a software solution based on network
communication or a hardware solution based on a physical cable. It is independent
of, but often used in conjunction with framelock.
Framelock is used to synchronize the vertical retrace in multi-display active stereo
installations, e.g., for edge-blended projection systems or immersive installations.
It is combined with a software or hardware swap barrier. Software barriers in this
case cannot guarantee that the buffer swap of all outputs always happens with the
same retrace. The time window for an output to miss the retrace for the buffer
swap is however very small, and the output will simply swap on the next retrace,
typically in 16 milliseconds.
Display walls made out of LCDs, monoscopic or passive stereo projection systems
often only use a software swap barrier and no framelock. The display bezels make
it very hard to notice the missing synchronization.
3.9. Layouts
Configure one layout for each configuration of logical views. Name the layout using
a unique name. Often only one layout with one view is used for all canvases.
Enable the layout on each desired canvas by adding it to the canvas. Since
canvases reference layouts by name or index, layouts have to be configured before
their respective canvases in the configuration file.
A layout is the grouping of logical views. It is used by one or more canvases. For
all given layout/canvas combinations, Equalizer creates destination channels when
the configuration file is loaded. These destination channels can be referenced by
compounds to configure scalable rendering.
Layouts can be switched at runtime by the application. Switching a layout will
activate different destination channels for rendering.
3.9.1. Views
Configure one view for each logical view in each
layout. Set the viewport to position the view. Set
the mode to stereo if appropriate.
A view is a logical view of the application data,
in the sense used by the Model-View-Controller
pattern. It can be a scene, viewing mode, view-
ing position, or any other representation of the
application’s data.
A view has a fractional viewport relative to
its layout. A layout is often fully covered by its
views, but this is not a requirement.
Figure 16: Layout with four Views
Each view can have a frustum description. The view’s frustum overrides frusta
specified at the canvas or segment level. This is typically used for non-physically
correct rendering, e.g., to compare two models side-by-side on a canvas. If the view
does not specify a frustum, it will use the sub-frustum resulting from the covered
area on the canvas.
A view might have an observer, in which case its frustum is tracked by this
observer. Figure 16 shows an example layout using four views on a single segment.
Figure 17 shows a real-world setup of a single canvas with six segments using
underlap, with a two-view layout activated. This configuration generates eight
destination channels.
Figure 17: Display Wall using a six-Segment Canvas with a two-View Layout
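A two-view layout activated on a canvas might be sketched as follows; the names and viewports are placeholders:

layout
{
    name "Side by Side"
    view { viewport [  0 0 .5 1 ] }   # left half of the canvas
    view { viewport [ .5 0 .5 1 ] }   # right half of the canvas
}
canvas
{
    layout "Side by Side"             # layouts are declared before the canvas using them
    # ... wall or projection frustum and segments as in Section 3.8 ...
}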
3.10. Observers
Unless you have multiple tracked persons, or want to disable tracking on certain
views, you can skip this section.
Configure one observer for each tracked person in the configuration. Most config-
urations have at most one observer. Assign the observer to all views which belong
to this observer. Since the observer is referenced by its name or index, it has to be
specified before the layout in the configuration file.
Views with no observer are not tracked. The config file loader will create one
default observer and assign it to all views if the configuration has no observer.
An observer represents an actor looking at multiple views. It has a head matrix,
defining its position and orientation within the world, an eye separation and focus
distance parameters. Typically, a configuration has one observer. Configurations
with multiple observers are used if multiple, head-tracked users are in the same con-
figuration session, e.g., a non-tracked control host with two tracked head-mounted
displays.
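A tracked observer assigned to a view might look like the following sketch; the observer has to appear before the layout referencing it:

observer { name "user" }
layout
{
    name "Tracked"
    view
    {
        observer "user"   # this view's frustum is updated from the observer's head matrix
    }
}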
3.11. Compounds
Compound trees are used to describe how multiple rendering resources are combined
to produce the desired output, especially how multiple GPUs are aggregated to
increase the performance.
It is advised to study and understand the basic configuration files shipped with the
Equalizer distribution, before attempting to write compound configurations. The
auto-configuration code and command line program configTool, shipped with the
Equalizer distribution, creates some standard configurations automatically. These
are typically used as templates for custom configuration files.
For configurations using canvases and layouts without scalability, the configura-
tion file loader will create the appropriate compounds. It is typically not necessary
to write compounds for this use case.
The following subsection outlines the basic approach to writing compounds. The
remaining subsections provide an in-depth explanation of the compound structure
to give the necessary background for compound configuration.
3.11.3. Frustum
Compounds have a frustum description to define the physical layout of the display
environment. The frustum specification is described in Section 3.8. The frustum
description is inherited by the children, therefore the frustum is defined on the
topmost compound, typically by the corresponding segment.
3.11.5. Tasks
Compounds execute a number of tasks: clear, draw, assemble and readback. By
default, a leaf compound executes all tasks and a non-leaf compound assemble and
readback. A non-leaf compound will never execute the draw task.
A compound can be configured to execute a specific set of tasks, for example to
configure the multiple steps used by binary-swap compositing.
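For illustration, a leaf compound restricted to rendering and readback, leaving the assembly to another compound, might be written as this sketch:

compound
{
    channel "source1"
    task    [ CLEAR DRAW READBACK ]   # no ASSEMBLE on this leaf
    outputframe {}
}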
output channel to the input channel. The assembly routine of the input channel will
block on the availability of the output frame. This composition process is extensively
described in Section 7.2.9. Frame names are only valid within the compound tree,
that is, an output frame from one compound tree cannot be used as an input frame
of another compound tree.
Load Equalizer While pixel, subpixel and stereo compounds are naturally load-
balanced, 2D and DB compounds often need load-balancing for optimal rendering
performance.
Using a load equalizer is transparent to the application, and can be used with any
application for 2D, and with most applications for DB load-balancing. Some
applications do not support dynamic updates of the database range, and therefore
cannot be used with DB load-balancing.

Using a 2D or DB load-balancer will adjust the 2D split or database range
automatically each frame. The 2D load-balancer exists in three flavors: 2D using
tiles, VERTICAL using columns and HORIZONTAL using rows.

2D load-balancing increases the framerate over a static decomposition in virtually
all cases. It works best if the application data is relatively uniformly distributed
in screen-space. A damping parameter can be used to fine-tune the algorithm.

Figure 18: 2D Load-Balancing

DB load-balancing is beneficial for applications which cannot precisely predict the
load for their scene data, e.g., when the data is nonuniform. Volume rendering is a
counterexample, where the data is uniform and a static DB decomposition typically
results in better performance.
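A 2D load-balanced compound might be configured along these lines; the channel and frame names are placeholders and the damping value is only an example:

compound
{
    channel "destination"
    load_equalizer { mode 2D damping .5 }     # or mode DB, HORIZONTAL, VERTICAL

    compound { }                              # the destination renders one tile itself
    compound
    {
        channel "source1"
        outputframe { name "frame.source1" }
    }
    inputframe { name "frame.source1" }
}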
Figure 19 illustrates this process. On the left side, a static assignment of re-
sources to display segments is used. The right-hand segment has a higher load than
the left-hand segment, causing sub-optimal performance. The configuration on the
right uses a view equalizer, which assigns two GPUs to the left segment and four
GPUs to the right segment, which leads to optimal performance for this model and
camera position.
The view equalizer can also use resources from another display resource, if this
resource has little rendering load by itself. It is therefore possible to improve the
rendering performance of a multi-display system without any additional resources.
This is particularly useful for installations with a higher number of displays where
the rendering load is typically in a few segments only, e.g., for a CAVE.

Figure 20 shows cross-usage for a five-sided CAVE driven by five GPUs. The front and
left segments show the model and have a significant rendering load. The view
equalizer assigns the GPUs from the top, bottom and right wall for rendering the left
and front wall in this configuration.

Figure 20: Cross-Segment Load-Balancing for a CAVE
Cross-segment load-balancing is configured hi-
erarchically. On the top compound level, a view equalizer assigns resources to each
of its children, so that the optimal number of resources is used for each segment.
On the next level, a load equalizer on each child computes the resource distribu-
tion within the segment, taking the resource usage given by the view equalizer into
account.
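Such a hierarchy might be sketched as follows, with placeholder names for the segments' destination channels and the source channels assigned to them:

compound
{
    view_equalizer {}                 # assigns source channels to the segments below

    compound
    {
        channel "segment-left"
        load_equalizer { mode 2D }    # balances the sources used for this segment
        compound { }
        compound { channel "source1" outputframe { name "frame.source1" }}
        inputframe { name "frame.source1" }
    }
    compound
    {
        channel "segment-right"
        load_equalizer { mode 2D }
        compound { }
        compound { channel "source2" outputframe { name "frame.source2" }}
        inputframe { name "frame.source2" }
    }
}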
Figure 21 shows DFR for volume rendering. To achieve 10 frames per second,
the model is rendered at a lower resolution, and upscaled to the native resolution for
display. The rendering quality is slightly degraded, while the rendering performance
remains interactive. When the application is idle, it renders a full-resolution view.
The dynamic frame resolution is not limited to subsampling the rendering resolu-
tion, it will also supersample the image if the source buffer is big enough. Upscaled
rendering, which will down-sample the result for display, provides dynamic anti-
aliasing at a constant framerate.
Monitor Equalizer The monitor equalizer allows the observation of another view,
potentially made of multiple segments, in a different channel at a different resolution.
This is typically used to reuse the rendering of a large-scale display on an operator
station.
A monitor equalizer adjusts the frame zoom of the output frames used to observe the
rendering, depending on the destination channel size. The output frames are
downscaled on the GPU before readback, which results in optimal performance.

Figure 22 shows a usage of the monitor equalizer. A two-segment display wall is
driven by a separate control station. The rendering happens only on the display wall,
and the control window receives the correctly downscaled version of the rendering.

Figure 22: Monitoring a Projection Wall
4. Setting up a Visualization Cluster
Render Clients Specify the client IPs as the hostname field in the connection de-
scription of each node section.
4.2. Server
The server may be started as a separate process or within the application process. If
it is started separately, simply provide the desired configuration file as a parameter.
It will listen on all network addresses, unless other connection parameters are spec-
ified in the configuration file. If the server is started within the application process,
using the --eq-config parameter, you will have to specify a connection description
for the server in the configuration file. Application-local servers do not create a
listening socket by default for security reasons.
• Set the connection hostname and port parameters of each node in the con-
figuration file.
• Start the render clients using the parameters --eq-client and --eq-listen, e.g.,
./build/Linux/bin/eqPly --eq-client --eq-listen 192.168.0.2:1234. Pay attention
to use the same connection parameters as in the configuration file.
• Start the application. If the server is running on the same machine and user
as the application, the application will connect to it automatically. Otherwise
use the --eq-server parameter to specify the server address.
The render clients will automatically exit when the config is closed. The eqPly
example application implements the option -r to keep the render client processes
resident.
it suffices to install the application in the same directory on all machines, ideally
using a shared file system.
The default launch command is set to ssh, which is the most common solution for
remote logins. To allow the server to launch the render clients without user inter-
action, password-less ssh needs to be set up. Please refer to the ssh documentation
(cf. ssh-keygen and ~/.ssh/authorized_keys) and verify the functionality by logging in
to each machine from the host running the server.
4.4. Debugging
If your configuration does not work, simplify it as much as possible first. Normally
this means that there is one server, application and render client. Failure to launch
a cluster configuration often is caused by one of the following reasons:
• A firewall is blocking network connections.
• The render client can’t access the GPUs on the remote host. Set up your X-
Server or application rights correctly. Log into the remote machine using the
launch command and try to execute a simple application, e.g., glxinfo -display
:0.1.
• The server does not find the prelaunched render client. Verify that the client
is listening on the correct IP and port, and that this IP and port are reachable
from the server host.
• The server cannot launch a render client. Check the server log for the launch
command used, and try to execute a simple application from the server host
using this launch command. It should run without user interaction. Check
that the render client is installed in the correct path. Pay attention to the
launch command quotes used to separate arguments on Windows. Check that
the same software versions, including Equalizer, are installed on all machines.
• A client can’t connect back to the application. Check the client log, this is
typically caused by a misconfigured host name resolution.
Part II.
Programming Guide
This Programming Guide introduces Equalizer using a top-down approach, starting
with a general introduction of the API in Section 5, followed by the simplified Sequel
API in Section 6 which implements common use cases to deliver a large subset of the
canonical Equalizer API introduced in Section 7. Section 8 introduces the separate
Collage network library used as the foundation for the distributed execution and
data synchronization throughout Sequel and Equalizer.
5. Programming Interface
To modify an application for Equalizer, the programmer structures the source code
so that the OpenGL rendering can be executed in parallel, potentially using multiple
processes for cluster-based execution.
Equalizer uses a C++ programming interface. The API is minimally invasive.
Equalizer imposes only a minimal, natural execution framework upon the applica-
tion. It does not provide a scene graph, nor does it interfere in any other way with the
application’s rendering code. The restructuring work enforced by Equalizer is the
minimal refactoring needed to parallelize the application for scalable, distributed
rendering.
The API documentation is available on the website or in the header files, and
provides comprehensive documentation of individual methods, types and other
elements of the API. Methods marked with a specific version are part of the official,
public API and have been introduced in that Equalizer version. Reasonable care
is taken to not break API compatibility or change the semantics of these methods
within future Equalizer versions of the same major revision. Any changes to the
public API are documented in the release notes and the file CHANGES.txt.
In addition to the official, public API, Equalizer exposes a number of unstable meth-
ods and, where unavoidable, internal APIs. These are clearly marked in the API
documentation. Unstable methods may be used by the programmer, but their in-
terface or functionality may change in any future Equalizer version. The usage of
internal methods is discouraged. Undocumented or unversioned methods should be
considered as part of the unstable API.
The eqHello example uses Sequel, a thin layer on top of the canonical Equal-
izer programming interface. Section 6 introduces Sequel in detail, and Section 7.1
introduces the full scope of the Equalizer API.
The main eqHello function instantiates an application object, initializes it, starts
the main loop and finally de-initializes the application:
int main( const int argc, char** argv )
{
    eqHello::ApplicationPtr app = new eqHello::Application;
    const bool ok = app->init( argc, argv, 0 ) && app->run( 0 ) && app->exit();
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
The application object represents one process in the cluster. The primary applica-
tion instance has the rendering loop and controls all execution. All other instances
used for render client processes are passive and driven by Equalizer. The application
is responsible for creating the renderers, of which one per GPU is used:
class Application : public seq::Application
{
public:
    virtual ~Application() {}
    virtual seq::Renderer* createRenderer() { return new Renderer( *this ); }
};

typedef lunchbox::RefPtr< Application > ApplicationPtr;
The renderer is responsible for executing the application’s rendering code. One
instance for each GPU is used. All calls to a single renderer are executed serially
and therefore thread-safe.
In eqHello, the renderer draws six colored quads. The only change from a standard
OpenGL application is the usage of the rendering context provided by Equalizer,
most notably the frustum and viewport. The rendering context is described in detail
in Section 7.1.8, and eqHello simply calls applyRenderContext which will execute the
appropriate OpenGL calls.
After setting up lighting, the model is positioned using applyModelMatrix. For
convenience, Sequel maintains one camera per view. The usage of this camera is
purely optional, an application can implement its own camera model.
The actual OpenGL code, rendering six colored quads, is omitted here for brevity:
void eqHello::Renderer::draw( co::Object* frameData )
{
    applyRenderContext(); // set up OpenGL state

    const float lightPos[] = { 0.0f, 0.0f, 1.0f, 0.0f };
    glLightfv( GL_LIGHT0, GL_POSITION, lightPos );
    const float lightAmbient[] = { 0.2f, 0.2f, 0.2f, 1.0f };
    glLightfv( GL_LIGHT0, GL_AMBIENT, lightAmbient );

    applyModelMatrix(); // global camera
    // render six axis-aligned colored quads around the origin
5.2. Namespaces
The Equalizer software stack is modularized, layering gradually more powerful, but
less generic APIs on top of each other. It furthermore relies on a number of required
and optional libraries. Application developers are exposed to the following name-
spaces:
seq The core namespace for Sequel, the simple interface to the Equalizer client
library.
eq The core namespace for the Equalizer client library. The classes and their
relationship in this namespace closely model the configuration file format.
The classes in the eq namespace are the main subject of this Programming
Guide. Figure 25 provides an overview map of the most important classes in
the Equalizer namespace, grouped by functionality.
eq::util The eq::util namespace provides common utility classes, which often sim-
plify the usage of OpenGL functions. Most classes in this namespace are used
by the Equalizer client library, but are usable independently from Equalizer
for application development.
eq::admin The eq::admin namespace implements an administrative API to change
the configurations of a running server. This admin API is not yet finalized
and will very likely change in the future.
eq::fabric The eq::fabric namespace is the common data management and transport
layer between the client library, the server and the administrative API. Most
Equalizer client classes inherit from base classes in this namespace and have
all their data access methods in these base classes.
co Collage is the network library used by Equalizer. It provides basic functionality
for network communication, such as Connection and ConnectionSet, as well
as higher-level functionality such as Node, LocalNode, Object and Serializable.
Please refer to Section 8 for an introduction into the network layer, and to
Section 7.1.3 for distributed objects.
lunchbox The lunchbox library provides C++ classes to abstract the underlying
operating system and to implement common helper functionality for multi-
threaded applications. Examples are lunchbox::Clock providing a high-resolution
timer, or lunchbox::MTQueue providing a thread-safe, blocking FIFO. Classes
in this namespace are fully documented in the API documentation on the
Equalizer website, and are not subject of this Programming Guide.
hwloc, boost, vmmlib, hwsd External libraries providing manual and automatic
thread affinity, serialization and the foundation for RSP multicast, vector and
matrix mathematics as well as local and remote hardware (GPU, network
interfaces) discovery, respectively. The hwloc and hwsd libraries are used
only internally and are not exposed through the API.
eq::server The server namespace, implementing the functionality of the Equalizer
server, which is an internal namespace not to be used by application develop-
ers. The eq::admin namespace enables run-time administration of Equalizer
servers. The server does not yet expose a stable API.
The Equalizer examples are implemented in their own namespaces, e.g., eqPly or
eVolve. They rely mostly …

Figure 24: Namespaces

Figure 25: Equalizer client UML map
6. The Sequel Simple Equalizer API
The main thread is responsible for maintaining the application logic. It reacts
on user events, updates the data model and requests new frames to be rendered. It
drives the whole application, as shown in Figure 26.
The rendering threads concurrently render the application's database. The database
should be accessed in a read-only fashion during rendering to avoid threading
problems. This is normally the case, for example all modern scene graphs use
read-only render traversals, writing the GPU-specific information into a separate
per-GPU data structure.

All rendering threads in the configuration run asynchronously to the application's
main thread. Depending on the configuration's latency, they can fall n frames behind
the last frame finished by the application thread. A latency of one frame is usually
not perceived by the

Figure 26: Simplified Execution Model
Rendering threads on a single node are synchronized when using the default
thread model draw sync. When a frame is finished, all local rendering threads are
done drawing. Therefore the application can safely modify the data between the
end of a frame and the beginning of a new frame. Furthermore, only one instance
of the scene data has to be maintained within a process, since all rendering threads
are guaranteed to draw the same frame.
This per-node frame synchronization does not inhibit latency across rendering
nodes. Furthermore, advanced rendering software which multi-buffers the dynamic
parts of the database can disable the per-node frame synchronization, as explained
in Section 7.2.3. Some scene graphs implement multi-buffered data, and can profit
from relaxing the local frame synchronization.
6.2. Application
The application object in Sequel represents one process in the cluster. The main
application instance has the rendering loop and controls all execution. All other
instances used for render client processes are passive and driven by Equalizer.
Sequel applications derive their application object from seq::Application and selec-
tively override functionality. They have to implement createRenderer, as explained
below. The seqPly application overrides init, exit, run and implements createRen-
derer.
The initialization and exit routines are overridden to parse seqPly-specific com-
mand line options and to load and unload the requested model:
bool Application::init( const int argc, char** argv )
{
    const eq::Strings& models = parseArguments( argc, argv );
    if( !seq::Application::init( argc, argv, 0 ))
        return false;

    loadModel( models );
    return true;
}

bool Application::exit()
{
    unloadModel();
    return seq::Application::exit();
}
Sequel manages distributed objects, simplifying their use. This includes regis-
tration, automatic creation and mapping as well as commit and synchronization
of objects. One special object for initialization (not used in seqPly) and one for
per-frame data are managed by Sequel, in addition to an arbitrary number of
application-specific objects.
The objects passed to seq::Application::init and seq::Application::run are automat-
ically distributed and instantiated on the render clients, and then passed to the
respective task callback methods. The application may pass a 0 pointer if it does
not need an object for initialization or per-frame data. Objects are registered with
a type, and when automatically created, the createObject method on the application
or renderer is used to create an instance based on this type.
The run method is overloaded to pass the frame data object to the Sequel appli-
cation run loop. The object will be distributed and synchronized to all renderers:
bool Application::run()
{
    return seq::Application::run( &frameData );
}
The application is responsible for creating renderers. Sequel will request one
renderer for each GPU rendering thread. Sequel also provides automatic mapping
and synchronization of distributed objects, for which the application has to provide
a creation callback:
seq::Renderer* Application::createRenderer()
{
    return new Renderer( *this );
}
co::Object* Application::createObject( const uint32_t type )
{
    switch( type )
    {
      case seq::OBJECTTYPE_FRAMEDATA:
          return new eqPly::FrameData;

      default:
          return seq::Application::createObject( type );
    }
}
6.3. Renderer
The renderer is responsible for executing the application’s rendering code. One
instance for each GPU is used. All calls to a single renderer are executed serially
and therefore thread-safe.
The seqPly rendering code uses the same data structure and algorithm as eqPly,
described in Section 7.1.8. This renderer captures the GPU-specific data in a State
object, which is created and destroyed during init and exit. The state also captures
the OpenGL function table, which is available when init is called, but not yet
during the constructor of the renderer:
bool Renderer::init( co::Object* initData )
{
    state = new State( glewGetContext( ));
    return seq::Renderer::init( initData );
}

bool Renderer::exit()
{
    state->deleteAll();
    delete state;
    state = 0;
    return seq::Renderer::exit();
}
The rendering code is similar to the typical OpenGL rendering code, except for
a few modifications to configure the rendering. First, the render context is applied
and lighting is set up. The render context, described in detail in Section 7.1.8, sets
up the stereo buffer, 3D viewport as well as the projection and view matrices using
the appropriate OpenGL calls. Applications can also retrieve the render context
and apply the settings themselves:
applyRenderContext();
After the static light setup, the model matrix is applied to the existing view
matrix, completing the modelview matrix and positioning the model. Sequel main-
tains a per-view camera, which is modified through mouse and keyboard events
and determines the model matrix. Applications can overwrite this event handling
and maintain their own camera model. Afterwards, the state is set up with the
projection-modelview matrix for view frustum culling, the DB range for sort-last
rendering and the cull-draw traversal is executed, as described in Section 7.1.8:
applyModelMatrix();

// compute cull matrix
const eq::Matrix4f& modelM = getModelMatrix();
const eq::Matrix4f& view = getViewMatrix();
const eq::Frustumf& frustum = getFrustum();
const eq::Matrix4f projection = frustum.compute_matrix();
const eq::Matrix4f pmv = projection * view * modelM;
const seq::RenderContext& context = getRenderContext();

state->setProjectionModelViewMatrix( pmv );
state->setRange( &context.range.start );
state->setColors( model->hasColors( ));
model->cullDraw( *state );
7. The Equalizer Parallel Rendering Framework
[Figure: UML overview of the eqPly entities — Node, Pipe, Window and Channel with
their configInit/configExit and frame task methods, and the distributed objects
InitData, FrameData, View::Proxy and VertexBufferDist based on co::Object and
co::Serializable]
if( !eq::init( argc, argv, &nodeFactory ))
{
    LBERROR << "Equalizer init failed" << std::endl;
    return EXIT_FAILURE;
}
The node factory is used by Equalizer to create the object instances of the config-
ured rendering entities. Each of the classes inherits from the same type provided by
Equalizer in the eq namespace. The provided eq::NodeFactory base class instantiates
’plain’ Equalizer objects, thus making it possible to selectively subclass individual
entity types, as it is done by eqHello. For each rendering resources used in the
configuration, one C++ object will be created during initialization. Config, node
and pipe objects are created and destroyed in the node thread, whereas window
and channel objects are created and destroyed in the pipe thread.
The second step is to parse the command line into the LocalInitData data struc-
ture. A part of it, the base class InitData, will be distributed to all render client
nodes. The command line parsing is done by the LocalInitData class, which is dis-
cussed in Section 7.1.3:
// 2. parse arguments
eqPly::LocalInitData initData;
initData.parseArguments( argc, argv );
The third step is to create an instance of the application and to initialize it locally.
The application is a subclass of eq::Client, which in turn is a co::LocalNode. The
underlying Collage network library, discussed in Section 8, is a peer-to-peer network
of co::LocalNodes. The client-server concept is implemented in the higher-level eq
client namespace.
The local initialization of a node creates at least one local listening socket, which
allows the eq::Client to communicate over the network with other nodes, such as the
server and the rendering clients. The listening socket(s) can be configured using
the --eq-listen command line parameter, by adding connections to the appNode in
the configuration file, or by programmatically adding connection descriptions to the
client before the local initialization:
// 3. initialization of local client node
lunchbox::RefPtr< eqPly::EqPly > client = new eqPly::EqPly( initData );
if( !client->initLocal( argc, argv ))
{
    LBERROR << "Can't init client" << std::endl;
    eq::exit();
    return EXIT_FAILURE;
}
After the application has finished, it is de-initialized and the main function re-
turns:
// 5. cleanup and exit
client->exitLocal();
LBASSERTINFO( client->getRefCount() == 1, client );

client = 0;
eq::exit();
eqPly::exitErrors();
return ret;
}
7.1.2. Application
In the case of eqPly, the application is also the render client. The eqPly executable
has three runtime behaviors:
Main Loop The application’s main loop starts by connecting the application to an
Equalizer server. If no server is specified, Client::connectServer tries first to connect
to a server on the local machine using the default port. If that fails, it will create a
server running within the application process using auto-configuration as described
in Section 3.1.2. The command line parameter –eq-config can be used to specify
a hwsd session or configuration file, and –eq-server to explicitly specify a server
address.
int EqPly::run()
{
    // 1. connect to server
    eq::ServerPtr server = new eq::Server;
    if( !connectServer( server ))
    {
        LBERROR << "Can't open server" << std::endl;
        return EXIT_FAILURE;
    }
The second step is to ask the server for a configuration. The ConfigParams are
a placeholder for later Equalizer implementations to provide additional hints and
information to the server for auto-configuration. The configuration chosen by the
server is created locally using NodeFactory::createConfig. Therefore it is of type
eqPly::Config, but the return value is eq::Config, making the static cast necessary:
// 2. choose config
eq::fabric::ConfigParams configParams;
Config* config = static_cast< Config* >( server->chooseConfig( configParams ));
if( !config )
{
    LBERROR << "No matching config on server" << std::endl;
    disconnectServer( server );
    return EXIT_FAILURE;
}
Finally it is time to initialize the configuration. For statistics, the time for this
operation is measured and printed. During initialization the server launches and
connects all render client nodes, and calls the appropriate initialization task meth-
ods, as explained in later sections. Config::init returns after all nodes, pipes, windows
and channels are initialized.
The return value of Config::init depends on the configuration robustness attribute. This attribute is set by default, allowing configurations to launch even when some entities failed to initialize. If set, Config::init always returns true. If deactivated, it returns true only if all initialization task methods were successful. In any case, Config::getError only returns ERROR_NONE if all entities have initialized successfully.
The EQLOG macro allows topic-specific logging. The numeric topic values are specified in the respective log.h header files, and logging for various topics is enabled using the environment variable EQ_LOG_TOPICS:
// 3. init config
lunchbox::Clock clock;
config->setInitData( initData );

if( !config->init( ))
{
    LBWARN << "Error during initialization: " << config->getError()
           << std::endl;
    server->releaseConfig( config );
    disconnectServer( server );
    return EXIT_FAILURE;
}
if( config->getError( ))
    LBWARN << "Error during initialization: " << config->getError()
           << std::endl;
Once the configuration has been successfully initialized, the main rendering loop is executed. It runs until the user exits the configuration, or until a maximum number of frames has been rendered, which can be specified by a command-line argument. The latter is useful for benchmarks. The Clock is reused for measuring the overall performance.
A new frame is started using Config::startFrame and a frame is finished using Con-
fig::finishFrame.
When a new frame is started, the server computes all rendering tasks and sends
them to the appropriate render client nodes. The render client nodes dispatch the
tasks to the correct node or pipe thread, where they are executed in order of arrival.
Config::finishFrame blocks on the completion of the frame current - latency. The
latency is specified in the configuration file, and allows several outstanding frames.
This allows overlapping execution in the node processes and pipe threads and min-
imizes idle times.
By default, Config::finishFrame also synchronizes the completion of all local ren-
dering tasks for the current frame. This facilitates porting of existing rendering
codes, since the database does not have to be multi-buffered. Applications such
as eqPly, which do not need this per-node frame synchronization, can disable it as
explained in Section 7.2.3:
// 4. run main loop
uint32_t maxFrames = initData.getMaxFrames();
int lastFrame = 0;
clock.reset();

while( config->isRunning() && maxFrames-- )
{
    config->startFrame();
    if( config->getError( ))
        LBWARN << "Error during frame start: " << config->getError()
               << std::endl;
    config->finishFrame();
When playing a camera animation, eqPly prints the rendering performance once
per animation loop for benchmarking purposes:
if( config->getAnimationFrame() == 1 )
{
    const float time = clock.resetTimef();
    const size_t nFrames = config->getFinishedFrame() - lastFrame;
    lastFrame = config->getFinishedFrame();
eqPly uses event-driven execution, that is, it only requests new rendering frames
if an event or animation requires an update. The eqPly::Config maintains a dirty
state, which is cleared after a frame has been started, and set when an event causes
a redraw. Furthermore, when an animation is running or head tracking is active,
the config always signals the need for a new frame.
If the application detects that it is currently idle, all pending commands are
gradually flushed, while still looking for a redraw event. Then it waits and handles
one event at a time, until a redraw is needed:
while( !config->needRedraw( )) // wait for an event requiring redraw
{
    if( hasCommands( )) // execute non-critical pending commands
    {
        processCommand();
        config->handleEvents(); // non-blocking
    }
    else // no pending commands, block on user event
    {
        const eq::EventICommand& event = config->getNextEvent();
        if( !config->handleEvent( event ))
            LBVERB << "Unhandled " << event << std::endl;
    }
}
config->handleEvents(); // process all pending events
The remainder of the application code cleans up in the reverse order of initializa-
tion. The config is exited, released and the connection to the server is closed:
// 5. exit config
clock.reset();
config->exit();
LBLOG( LOG_STATS ) << "Exit took " << clock.getTimef() << " ms" << std::endl;

// 6. cleanup and exit
server->releaseConfig( config );
if( !disconnectServer( server ))
    LBERROR << "Client::disconnectServer failed" << std::endl;
Render Clients In the second and third use case of eqPly, when the executable is used as a render client, Client::initLocal never returns. Therefore the application's main loop is never executed. To keep the client resident, the eqPly example overrides
the client loop to keep it running beyond one configuration run:
void EqPly::clientLoop()
{
    do
    {
        eq::Client::clientLoop();
        LBINFO << "Configuration run successfully executed" << std::endl;
    }
    while( initData.isResident( )); // execute at least one config run
}
InitData - a Static Distributed Object The InitData class holds a couple of pa-
rameters needed during initialization. These parameters never change during one
configuration run, and are therefore static.
On the application side, the class LocalInitData subclasses InitData to provide
the command line parsing and to set the default values. The render nodes only
instantiate the distributed part in InitData.
A static distributed object has to implement getInstanceData and applyInstance-
Data to serialize and deserialize the object’s distributed data. These methods pro-
vide an output or input stream as a parameter, which abstracts the data transmis-
sion and can be used like a std::stream.
The data streams implement efficient buffering and compression, and automati-
cally select the best connection, i.e., multicast where available, for data transport.
They perform no type checking or transformation on the data. It is the application’s
responsibility to exactly match the order and types of variables during serialization
and de-serialization.
Custom data type serializers can be implemented by providing the appropriate serialization functions. No pointers should be transmitted directly through the data streams. For pointers, the corresponding object is typically a distributed object as well, and its identifier, and potentially its version, is transmitted in place of the pointer.
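As an illustration of such a custom serializer (the Material struct below is hypothetical, not part of eqPly), a pair of matching stream operators is sufficient:

// Hypothetical example type; order and types must match exactly on both sides.
struct Material
{
    eq::Vector3f diffuse;
    float        shininess;
};

co::DataOStream& operator << ( co::DataOStream& os, const Material& m )
    { return os << m.diffuse << m.shininess; }

co::DataIStream& operator >> ( co::DataIStream& is, Material& m )
    { return is >> m.diffuse >> m.shininess; }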
For InitData, serialization in getInstanceData and de-serialization in applyInstance-
Data is performed by streaming all member variables to or from the provided data
streams:
void InitData::getInstanceData( co::DataOStream& os )
{
    os << frameDataID << windowSystem << renderMode << useGLSL << invFaces
       << logo << roi;
}

void InitData::applyInstanceData( co::DataIStream& is )
{
    is >> frameDataID >> windowSystem >> renderMode >> useGLSL >> invFaces
       >> logo >> roi;
    LBASSERT( frameDataID != 0 );
}
• The master instance of the object generates new versions for all slaves. These versions are continuous, starting at co::VERSION_FIRST. It is possible to commit on slave instances, but special care has to be taken to handle possible conflicts. Section 8.4.4 covers slave object commits in detail.
• Slave instance versions can only be advanced, that is, sync( version ) with a
version smaller than the current version will fail.
• Newly mapped slave instances are mapped to the oldest available version by
default, or to the version specified when calling mapObject.
Upon commit the delta data from the previous version is sent to all mapped
slave instances. The data is queued on the remote node, and is applied when the
application calls sync to synchronize the object to a new version. The sync method
might block if a version has not yet been committed or is still in transmission.
Not syncing a mapped, versioned object creates a memory leak. The method Object::notifyNewHeadVersion is called whenever a new version is received by the node. The notification is sent from the command thread, which is different from the node main thread. The object should not be synced from this method; instead a message may be sent to the application, which then takes the appropriate action. The default implementation asserts when too many versions have been queued, to detect memory leaks during development.
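One possible pattern, sketched below under the assumption of an application-defined NEW_VERSION event type and a config pointer available to the object (neither is part of eqPly), is to forward a notification to the application thread and sync there:

// Sketch: notifyNewHeadVersion runs in the command thread, so only notify.
void MyObject::notifyNewHeadVersion( const eq::uint128_t& version )
{
    // Do not sync() here; send an event and let the application thread
    // sync() the object during its regular event handling.
    config->sendEvent( NEW_VERSION ) << getID() << version;
}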
Besides the instance data (de-)serialization methods used to map an object,
versioned objects may implement pack and unpack to serialize or de-serialize the
changes since the last version. If these methods are not implemented, their de-
fault implementation forwards the (de-)serialization request to getInstanceData and
applyInstanceData, respectively.
The creation of distributed, versioned objects is simplified when using co::Serializable, which implements one common way of tracking data changes in versioned objects. The concept of a dirty bit mask is used to mark parts of the object for serialization, while preserving the capability to inherit objects. Other ways of implementing change tracking, e.g., using incarnation counters, can still be implemented by using co::Object, which leaves all flexibility to the developer. Figure 29 shows the relationship between co::Serializable and co::Object.

[Figure 29: co::Serializable and co::Object]
The FrameData is sub-classed from Serializable,
and consequently tracks its changes by setting the
appropriate dirty bit whenever it is changed. The serialization methods are called
by the co::Serializable with the dirty bit mask needed to serialize all data, or with
the dirty bit mask of the changes since the last commit. The FrameData only defines
its own dirty bits and serialization code:
/** The changed parts of the data since the last pack(). */
enum DirtyBits
{
    DIRTY_CAMERA  = co::Serializable::DIRTY_CUSTOM << 0,
    DIRTY_FLAGS   = co::Serializable::DIRTY_CUSTOM << 1,
    DIRTY_VIEW    = co::Serializable::DIRTY_CUSTOM << 2,
    DIRTY_MESSAGE = co::Serializable::DIRTY_CUSTOM << 3,
};
       >> compression;
    if( dirtyBits & DIRTY_VIEW )
        is >> currentViewID;
    if( dirtyBits & DIRTY_MESSAGE )
        is >> message;
}
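The dirty bits are set by the corresponding setters. A sketch of the typical pattern (the member name is simplified, not the verbatim eqPly listing):

// Sketch: change the member, then mark the camera section dirty so the next
// commit() serializes it.
void FrameData::setCameraPosition( const eq::Vector3f& position )
{
    cameraPosition = position;
    setDirty( DIRTY_CAMERA );
}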
Scene Data Some applications rely on a shared filesystem to access the data, for
example when out-of-core algorithms are used. Other applications prefer to load
the data only on the application process, and use distributed objects to synchronize
the scene data with the render clients.
eqPly chooses the second approach, using static distributed objects to distribute
the model loaded by the application. It can be easily extended to versioned objects
to support dynamic data modifications.
The kD-tree data structure and rendering code for the model are strictly separated from Equalizer and kept in the separate namespace mesh, so they can also be used in other rendering software, for example in a GLUT application. To keep this separation while implementing data distribution, an external 'mirror' hierarchy is constructed alongside the kD-tree. This hierarchy of VertexBufferDist nodes is responsible for cloning the model data on the remote render clients.
The identifier of the model’s root object of this distributed hierarchy is passed as
part of the InitData for the default model, or as part of the View for each logical view.
It is used on the render clients to map the model when it is needed for rendering.
Figure 30 shows the UML hierarchy of the model and distribution classes.
if( left && right )
{
    os << left->getID() << right->getID();
    if( isRoot )
    {
        LBASSERT( root );
Applications distributing a dynamic scene graph use the frame data instead of the init data as the entry point to their scene graph data structure. Figure 31 shows one possible implementation, where the identifier and version of the scene graph root are transported using the frame data. The scene graph root then serializes and de-serializes its immediate children by transferring their identifier and current version, similar to the static distribution done by eqPly.

The objects are still created by the application, and then registered or mapped with the session to distribute them. When mapping objects in a hierarchical data structure, their type often has to be known to create them. Equalizer does not currently provide object typing; this has to be done by the application, either implicitly in the current implementation context, or by transferring a type identifier. In eqPly, object typing is implicit since it is well-defined which object is mapped in which context.

[Figure 31: Scene Graph Distribution]
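The following sketch illustrates this pattern with a hypothetical SceneGraphNode class (not part of eqPly): each node streams the identifiers and committed versions of its children, which the render clients then use to map the children to exactly those versions:

// Hypothetical sketch of hierarchical distribution as described above.
void SceneGraphNode::getInstanceData( co::DataOStream& os )
{
    os << uint64_t( children.size( ));
    for( size_t i = 0; i < children.size(); ++i )
        os << children[ i ]->getID() << children[ i ]->getVersion();
}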
7.1.4. Config
The eq::Config class drives the application's rendering, that is, it is responsible for updating the data based on received events, requesting new frames to be rendered, and providing the render clients with the necessary data.
Initialization and Exit The config initialization happens in parallel, that is, all
config initialization tasks are transmitted by the server at once and their completion
is synchronized afterwards.
The tasks are executed by the node and pipe threads in parallel. The parent’s
initialization methods are always executed before any child initialization method.
This parallelization allows a speedy startup on large-scale graphics clusters. On
the other hand, it means that initialization functions are called even if the par-
ent’s initialization has failed. Figure 32 shows a sequence diagram of the config
initialization.
[Figure 32: Config initialization sequence - the server launches the client node processes, the node factory creates the entities (createNode, createPipe, createWindow, createChannel), and Node::configInit, Pipe::configInit, Window::configInit and Channel::configInit are executed in the node and pipe threads]
The eqPly::Config class holds the master versions of the initialization and frame
data objects. Both are registered with the eq::Config. The configuration forwards
the registration to the local client node and augments the object registration for
buffered objects.
First, it configures the objects to retain their data for latency+1 commits, which corresponds to the typical use case where objects are committed once per frame. This allows render clients, which often are behind the application process, to map objects with an old version. This does not necessarily translate into increased memory usage, since new versions are only created when the object was dirty during commit.
Second, it retains the data of buffered objects for latency frames after their deregistration. This allows mapping the object on a render client even after it has been deregistered on the application node, at the cost of delaying the deallocation of the buffered object data by latency frames.
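A sketch of how this buffering could be expressed (an assumption about the call used, not the verbatim eqPly listing):

// Sketch: keep latency+1 commits of the frame data so that render clients,
// which may lag behind the application, can still map slightly older versions.
const uint32_t latency = getLatency();
frameData.setAutoObsolete( latency + 1 );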
The identifier of the initialization data is transmitted to the render client nodes
using the initID parameter of eq::Config::init. The identifier of the frame data is
transmitted using the InitData.
Equalizer will pass this identifier to all configInit calls of the respective objects:
bool Config::init()
{
    if( !animation.isValid( ))
        animation.loadAnimation( initData.getPathFilename( ));

    // init distributed objects
    if( !initData.useColor( ))
        frameData.setColorMode( COLOR_WHITE );

    // init config
    if( !eq::Config::init( initData.getID( )))
After a successful initialization, the models are loaded and registered for data
distribution. When idle, Equalizer will predistribute object data during registration
to accelerate the mapping of slave instances. Registering the models after Config::init
ensures that the render clients are running and can cache the data:
loadModels();
registerModels();
The exit function of the configuration stops the render clients by calling eq::Con-
fig::exit, and then de-registers the initialization and frame data objects:
bool Config::exit()
{
    const bool ret = eq::Config::exit();
    deregisterData();
    closeAdminServer();
Frame Control The rendering frames are issued by the application main loop.
The eqPly::Config overrides startFrame to update its data, commit a new version of
the frame data object, and then requests the rendering of a new frame using the
current frame data version. This version is passed to the rendering callbacks and
will be used by the rendering threads to synchronize the frame data to the state
belonging to the current frame. This ensures that all frame-specific data, e.g., the
camera position, is used consistently to generate the frame:
uint32_t Config::startFrame()
{
    updateData();
    const eq::uint128_t& version = frameData.commit();
    redraw = false;
    return eq::Config::startFrame( version );
}
The update of the per-frame shared data consists of calculating the camera position
based on the current navigation mode, and determining the idle state for rendering.
When idle, eqPly performs anti-aliasing to gradually reduce aliasing effects in the
rendering. The idle state is tracked by the application and used by the rendering
callbacks to jitter the frusta, accumulate and display the results, as described in
Section 7.2.10:
void Config::updateData()
{
    // update camera
    if( animation.isValid( ))
    {
        const eq::Vector3f& modelRotation = animation.getModelRotation();
        const CameraAnimation::Step& curStep = animation.getNextStep();

        frameData.setModelRotation( modelRotation );
        frameData.setRotation( curStep.rotation );
        frameData.setCameraPosition( curStep.position );
    }
    else
    {
        if( frameData.usePilotMode( ))
            frameData.spinCamera( -0.001f * spinX, -0.001f * spinY );
        else
            frameData.spinModel( -0.001f * spinX, -0.001f * spinY, 0.f );
    }

    // idle mode
    if( isIdleAA( ))
    {
        LBASSERT( numFramesAA > 0 );
        frameData.setIdle( true );
    }
    else
        frameData.setIdle( false );

    numFramesAA = 0;
}
Event Handling Events are sent by the render clients to the application using
eq::Config::sendEvent. At the end of the frame, Config::finishFrame calls Config::handle-
Events to perform the event handling. The default implementation processes all
pending events by calling Config::handleEvent for each of them.
Since eqPly uses event-driven execution, the config maintains a dirty state to
know when a redraw is needed.
The eqPly example implements Config::handleEvent to provide the various reac-
tions to user input, most importantly camera updates based on mouse events. The
camera position has to be handled correctly regarding latency, and is therefore saved
in the frame data.
The event handling code reproduced here shows the handling of only one type of event. A detailed description of how to customize event handling can be found in Section 7.2.1:
case eq::Event::CHANNEL_POINTER_WHEEL:
{
    frameData.moveCamera( -0.05f * event->data.pointerWheel.yAxis,
                          0.f,
                          0.05f * event->data.pointerWheel.xAxis );
    redraw = true;
    return true;
}
Model Handling Models in eqPly are static, and therefore the render clients only
need to map one instance of the model per node. The mapped models are shared
by all pipe render threads, which access them read-only.
Multiple models can be loaded in eqPly. A configuration has a default model,
stored in InitData, and one model per view, stored and distributed using the View.
The loaded models are evenly distributed over the available views of the configura-
tion, as shown in Figure 16.
The channel acquires the model during rendering from the config, using the model
identifier from its current view, or from the frame data if no view is configured.
The per-process config instance maintains the mapped models, and lazily maps
new models, which are registered by the application process. Since the model
loading may be called concurrently from different pipe render threads, it is protected
by a mutex:
const Model* Config::getModel( const eq::uint128_t& modelID )
{
    if( modelID == 0 )
        return 0;

    // Protect if accessed concurrently from multiple pipe threads
    const eq::Node* node = getNodes().front();
    const bool needModelLock = ( node->getPipes().size() > 1 );
    lunchbox::ScopedWrite mutex( needModelLock ? &modelLock : 0 );

    const size_t nModels = models.size();
    LBASSERT( modelDist.size() == nModels );

    modelDist.push_back( new ModelDist );
    Model* model = modelDist.back()->loadModel( getApplicationNode(),
                                                getClient(), modelID );
    LBASSERT( model );
    models.push_back( model );
    return model;
}
Layout and View Handling For layout and model selection, eqPly maintains an
active view and canvas. The identifier of the active view is stored in the frame data,
which is used by the render client to highlight it using a different background color.
The active view can be selected by clicking into a view, or by cycling through all
views using a keyboard shortcut.
The model of the active view can be changed using a keyboard shortcut. The
model is view-specific, and therefore the model identifier for each view is stored on
the view, which is used to retrieve the model on the render clients.
View-specific data is not limited to a model. Applications can choose to make
any application-specific data view-specific, e.g., cameras, rendering modes or anno-
tations. A view is a generic concept for an application-specific view on data, eqPly
is simply using different models to illustrate the concept:
void Config::switchCanvas()
{
    const eq::Canvases& canvases = getCanvases();
    if( canvases.empty( ))
        return;

    frameData.setCurrentViewID( eq::UUID( ));

    if( !currentCanvas )
    {
        currentCanvas = canvases.front();
        return;
    }

    eq::CanvasesCIter i = stde::find( canvases, currentCanvas );
    LBASSERT( i != canvases.end( ));

    ++i;
    if( i == canvases.end( ))
        currentCanvas = canvases.front();
    else
        currentCanvas = *i;
    switchView(); // activate first view on canvas
}

void Config::switchView()
{
    const eq::Canvases& canvases = getCanvases();
    if( !currentCanvas && !canvases.empty( ))
        currentCanvas = canvases.front();

    if( !currentCanvas )
        return;

    if( !view )
    {
        frameData.setCurrentViewID( views.front()->getID( ));
        return;
    }

    eq::ViewsCIter i = std::find( views.begin(), views.end(), view );
    if( i != views.end( ))
        ++i;
    if( i == views.end( ))
        frameData.setCurrentViewID( eq::UUID( ));
    else
        frameData.setCurrentViewID( (*i)->getID( ));
}
The layout of the canvas with the active view can also be dynamically switched
using a keyboard shortcut. The first canvas using the layout is found, and then the
next layout of the configuration is set on this canvas.
Switching a layout causes the initialization and de-initialization task methods to
be called on the involved channels, and potentially windows, pipes and nodes. This
operation might fail, which may cause the config to stop running.
Layout switching is typically used to change the presentation of views at runtime.
The source code is omitted for brevity.
7.1.5. Node
For each active render client, one eq::Node instance is created on the appropriate
machine. Nodes are only instantiated on their render client processes, i.e., each
process will only have one instance of the eq::Node class. The application process
might also have a node class, which is handled in exactly the same way as the render
client nodes. The application and render clients might use different node factories, instantiating different types of eq::Config ... eq::Channel.
    if( !eq::Node::configInit( initID ))
        return false;

    Config* config = static_cast< Config* >( getConfig( ));
    if( !config->loadData( initID ))
    {
        setError( ERROR_EQPLY_MAPOBJECT_FAILED );
        return false;
    }
    return true;
}
The actual mapping of the static data is done by the config. The config retrieves the distributed InitData. The object is directly unmapped since it is static, and therefore all data has been retrieved during the mapping operation:
bool Config::loadData( const eq::uint128_t& initDataID )
{
    if( !initData.isAttached( ))
    {
        const uint32_t request = mapObjectNB( &initData, initDataID,
                                              co::VERSION_OLDEST,
                                              getApplicationNode( ));
        if( !mapObjectSync( request ))
            return false;
        unmapObject( &initData ); // data was retrieved, unmap immediately
    }
    else // appNode, initData is registered already
    {
        LBASSERT( initData.getID() == initDataID );
    }
    return true;
}
7.1.6. Pipe
All task methods for a pipe and its children are executed in a separate thread. This
approach optimizes GPU usage, since all tasks are executed serially and therefore
do not compete for resources or cause OpenGL context switches. Multiple GPU
threads run in parallel with each other.
The pipe uses an eq::SystemPipe, which abstracts and manages window-system-
specific code for the GPU, e.g., an X11 Display connection for the glX pipe system.
Initialization and Exit Pipe threads are not explicitly synchronized with each
other in eqPly due to the use of the async thread model. Pipes might be rendering
different frames at any given time. Therefore frame-specific data has to be allocated
for each pipe thread, which is only the frame data in eqPly. The frame data is a
member variable of the eqPly::Pipe, and is mapped to the identifier provided by the
initialization data:
bool Pipe::configInit( const eq::uint128_t& initID )
{
    if( !eq::Pipe::configInit( initID ))
        return false;

    Config* config = static_cast< Config* >( getConfig( ));
    const InitData& initData = config->getInitData();
    return config->mapObject( &frameData, initData.getFrameDataID( ));
}
Carbon/AGL Thread Safety Parts of the Carbon API used for window and
event handling in the AGL window system are not thread safe. The applica-
tion has to call eq::Global::enterCarbon before any thread-unsafe Carbon call, and
eq::Global::leaveCarbon afterwards. These functions should be used only during win-
dow initialization and exit, not during rendering. For implementation reasons en-
terCarbon might block up to 50 milliseconds. Carbon calls in the window event
handling routine Window::processEvent are thread-safe, since the global carbon lock
is set in this method. Please contact the Equalizer developer mailing list if you need
to use Carbon calls on a per-frame basis.
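A minimal usage sketch; the Carbon call in the middle is just a placeholder:

// Sketch: bracket thread-unsafe Carbon calls during window initialization.
eq::Global::enterCarbon();
// ... Carbon window or event-manager calls go here ...
eq::Global::leaveCarbon();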
Frame Control All task methods for a given frame of the pipe, window and channel entities belonging to the thread are executed in one block, starting with Pipe::frameStart and finished by Pipe::frameFinish. The frame start callback is therefore the natural place to update all frame-specific data to the version belonging to the frame.
In eqPly, the version of the only frame-specific object FrameData is passed as the
per-frame id from Config::startFrame to the frame task methods. The pipe uses this
version to update its instance of the frame data to the current version, and unlocks
its child entities by calling startFrame:
void Pipe::frameStart( const eq::uint128_t& frameID, const uint32_t frameNumber )
{
    eq::Pipe::frameStart( frameID, frameNumber );
    frameData.sync( frameID );
}
7.1.7. Window
The Equalizer window abstracts an OpenGL drawable and a rendering context.
When using the default window initialization functions, all windows of a pipe share
the OpenGL context. This allows reuse of OpenGL objects such as display lists and
textures between all windows of one pipe.
The window uses an eq::SystemWindow, which abstracts and manages window-
system-specific handles to the drawable and context, e.g., an X11 window XID and
GLXContext for the glX window system.
The window class is the natural place for the application to maintain all data
specific to the OpenGL context.
derived from these interfaces provides a sample implementation which honors all
configurable window attributes.
Initialization and Exit The initialization sequence uses multiple overridable task methods. The main task method configInit first calls configInitSystemWindow, which
creates and initializes the SystemWindow for this window. The SystemWindow ini-
tialization code is implementation specific. If the SystemWindow was initialized
successfully, configInit calls configInitGL, which performs the generic OpenGL state
initialization. The default implementation sets up some typical OpenGL state, e.g.,
it enables the depth test. Most nontrivial applications do override this task method.
The SystemWindow initialization takes into account various attributes set in the
configuration file. Attributes include the size of the various frame buffer planes
(color, alpha, depth, stencil) as well as other framebuffer attributes, such as quad-
buffered stereo, doublebuffering, fullscreen mode and window decorations. Some of
the attributes, such as stereo, doublebuffer and stencil can be set to eq::AUTO, in
which case the Equalizer default implementation will test for their availability and
enable them if possible.
For the window-system specific initialization, eqPly uses the default Equalizer im-
plementation. The eqPly window initialization only overrides the OpenGL-specific
initialization function configInitGL to initialize a state object and an overlay logo.
This function is only called if an OpenGL context was created and made current:
bool Window::configInitGL( const eq::uint128_t& initID )
{
    if( !eq::Window::configInitGL( initID ))
        return false;

    LBASSERT( !state );
    state = new VertexBufferState( getObjectManager( ));

    if( initData.showLogo( ))
        loadLogo();

    if( initData.useGLSL( ))
        loadShaders();

    return true;
}
The state object is used to handle the creation of OpenGL objects in a multipipe,
multithreaded execution environment. It uses the object manager of the eq::Window,
which is described in detail in Section 7.1.7.
The logo texture is loaded from the file system and bound to a texture ID used
later by the channel for rendering. A code listing is omitted, since the code consists
of standard OpenGL calls and is not Equalizer-specific.
The window exit happens in the reverse order of the initialization. First, configEx-
itGL is called to de-initialize OpenGL, followed by configExitSystemWindow which
de-initializes the drawable and context and deletes the SystemWindow allocated in
configInitSystemWindow.
The window OpenGL exit function of eqPly de-allocates all OpenGL objects. The object manager does not delete its objects in the destructor, since it does not know if an OpenGL context is still current.
bool Window::configExitGL()
{
    if( state && !state->isShared( ))
        state->deleteAll();

    delete state;
    state = 0;

    return eq::Window::configExitGL();
}
Object Manager The object manager is, strictly speaking, not a part of the win-
dow. It is mentioned here since the eqPly window uses an object manager.
The state object in eqPly gathers all rendering state, which includes an object
manager for OpenGL object allocation.
The object manager (OM) is a utility class and can be used to manage OpenGL
objects across shared contexts. Typically one OM is used for each set of shared
contexts of a single GPU.
Each eq::Window has an object manager with the key type const void*, for as
long as it is initialized. Each window can have a shared context window. The
OM is shared with this shared context window. The shared context window is set
by default to the first window of each pipe, and therefore the OM will be shared
between all windows of a pipe. The same key is used by all contexts to get the OpenGL name of an object, thus reusing the same object within the share group. The method eq::Window::setSharedContextWindow can be used to set up different context sharing.
eqPly uses the window’s object manager in the rendering code to obtain the
OpenGL objects for a given data item. The address of the data item to be rendered
is used as the key.
For the currently supported types of OpenGL objects please refer to the API
documentation on the Equalizer website. For each object, the following functions
are available:
supportsObjects() returns true if the usage for this particular type of objects is
supported. For objects available in OpenGL 1.1 or earlier, this function is not
implemented.
getObject( key ) returns the object associated with the given key, or FAILED.
newObject( key ) allocates a new object for the given key. Returns FAILED if the
object already exists or if the allocation failed.
obtainObject( key ) convenience function which gets or obtains the object associ-
ated with the given key. Returns FAILED only if the object allocation failed.
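An illustrative sketch of the display-list variant of these functions; om is assumed to be the window's object manager, and the scoping of the FAILED constant is an assumption:

// Sketch: look up the display list for a data item; compile it on first use.
// The data item's address is the key, so all contexts in the share group
// reuse the same list.
GLuint list = om.getList( &dataItem );
if( list == ObjectManager::FAILED )   // assumed scoping of FAILED (see above)
{
    list = om.newList( &dataItem );
    glNewList( list, GL_COMPILE );
    // ... record the OpenGL commands for this data item ...
    glEndList();
}
glCallList( list );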
7.1.8. Channel
The channel is the heart of the application's rendering code; it executes all task methods needed to update the configured views. It performs the various rendering
operations for the compounds. Each channel has a set of task methods to execute
the clear, draw, readback and assemble stages needed to render a frame.
Initialization and Exit During channel initialization, the near and far planes are
set to reasonable values to contain the whole model. During rendering, the near
and far planes are adjusted dynamically to the current model position:
bool Channel::configInit( const eq::uint128_t& initID )
{
    if( !eq::Channel::configInit( initID ))
        return false;
Buffer The OpenGL read and draw buffer as well as the color mask. These parameters
are influenced by the current eye pass, eye separation and anaglyphic stereo
settings.
Viewport The two-dimensional pixel viewport restricting the rendering area within
the channel. For correct operations, both glViewport and glScissor have to be
used. The pixel viewport is influenced by the destination channel’s viewport
definition and compound viewports set for sort-first/2D decompositions.
Frustum The same frustum parameters as defined by glFrustum. Typically the frus-
tum used to set up the OpenGL projection matrix. The frustum is influenced
by the destination channel’s view definition, compound viewports, head ma-
trix and the current eye pass. If the channel has a subpixel parameter, the
frustum will be jittered before it is applied. Please refer to Section 7.2.10 for
more information.
The rendering first checks a number of preconditions, such as whether the rendering was interrupted by a reset and whether the idle anti-aliasing is finished. Then the near and far planes are re-computed before the rendering context is applied:
void Channel::frameDraw( const eq::uint128_t& frameID )
{
    if( stopRendering( ))
        return;

    initJitter();
    if( isDone( ))
        return;

    if( oldModel != model )
        state.setFrustumCulling( false ); // create all display lists/VBOs

    if( model )
        updateNearFar( model->getBoundingSphere( ));
The frameDraw method in eqPly calls the frameDraw method from the parent class,
the Equalizer channel. The default frameDraw method uses the apply convenience
functions to setup the OpenGL state for all render context information, except for
the range which will be used later during rendering:
void eq::Channel::frameDraw( const uint128_t& frameID )
{
    applyBuffer();
    applyViewport();

    glMatrixMode( GL_PROJECTION );
    glLoadIdentity();
    applyFrustum();

    glMatrixMode( GL_MODELVIEW );
    glLoadIdentity();
    applyHeadTransform();
}
After the basic view setup, a directional light is configured, and the model is
positioned using the camera parameters from the frame data. The camera parame-
ters are transported using the frame data to ensure that all channels render a given
frame using the same position.
Three different ways of coloring the object are possible: using the colors of the model, using a unique per-channel color to demonstrate the decomposition as shown in Figure 34, or using solid white for anaglyphic stereo. The model colors are per-
vertex and are set during rendering, whereas the unique per-channel color is set in
frameDraw for the whole model:
glMultMatrixf( frameData.getCameraRotation().array );
glTranslatef( position.x(), position.y(), position.z( ));
glMultMatrixf( frameData.getModelRotation().array );
Finally the model is rendered. If the model was not loaded during node initial-
ization, a quad is drawn in its place:
if( model )
    drawModel( model );
else
{
    glNormal3f( 0.f, -1.f, 0.f );
    glBegin( GL_TRIANGLE_STRIP );
        glVertex3f(  .25f, 0.f,  .25f );
        glVertex3f( -.25f, 0.f,  .25f );
        glVertex3f(  .25f, 0.f, -.25f );
        glVertex3f( -.25f, 0.f, -.25f );
    glEnd();
}
// Compute cull matrix
const eq::Matrix4f& rotation = frameData.getCameraRotation();
const eq::Matrix4f& modelRotation = frameData.getModelRotation();
eq::Matrix4f position = eq::Matrix4f::IDENTITY;
position.set_translation( frameData.getCameraPosition( ));

state.setProjectionModelViewMatrix( projection * view * model );
state.setRange( &getRange().start );

const eq::Pipe* pipe = getPipe();
const GLuint program = state.getProgram( pipe );
if( program != VertexBufferState::INVALID )
    glUseProgram( program );

scene->cullDraw( state );
The actual rendering uses display lists or vertex buffer objects. These OpenGL
objects are allocated using the object manager. The rendering is done by the
leaf nodes, which are small enough to store the vertex indices in a short value for
optimal performance with VBOs. The leaf nodes reuse the objects stored in the
object manager, or create and set up new objects if they were not yet set up. Since one
object manager is used per thread (pipe), this allows a thread-safe sharing of the
compiled display lists or VBOs across all windows of a pipe.
The main rendering loop is implemented in VertexBufferRoot::cullDraw(), and not
duplicated here.
Assembly Like most applications, eqPly uses most of the default implementation of the frameReadback and frameAssemble task methods. To implement an optimization and various customizations, frameReadback is overridden. eqPly does not
need the alpha channel on the destination view. The output frames are flagged to
ignore alpha, which allows the compressor to drop 25% of the data during image
transfer. Furthermore, compression can be disabled and the compression quality
can be changed at runtime to demonstrate the impact of compression on scalable
rendering:
void Channel::frameReadback( const eq::uint128_t& frameID )
{
    if( stopRendering() || isDone( ))
        return;

    if( frameData.isIdle( ))
        frame->setQuality( eq::Frame::BUFFER_COLOR, 1.f );
    else
        frame->setQuality( eq::Frame::BUFFER_COLOR, frameData.getQuality( ));

    if( frameData.useCompression( ))
        frame->useCompressor( eq::Frame::BUFFER_COLOR, EQ_COMPRESSOR_AUTO );
    else
        frame->useCompressor( eq::Frame::BUFFER_COLOR, EQ_COMPRESSOR_NONE );
}
The frameAssemble method is overridden to support Subpixel compounds with idle software anti-aliasing, as described in Section 7.2.10.
implementation. On the other hand, each application and widget set has its own
model on how events are to be handled. Therefore, event handling in Equalizer is
customizable at any stage of the processing, to the extreme of making it possible to
disable all event handling code in Equalizer. In this aspect, Equalizer substantially
differs from GLUT, which imposes an event model and hides most of the event
handling in glutMainLoop.
The default implementation provides a convenient, easily accessible event frame-
work, while allowing all necessary customizations. It gathers all events from all node
processes in the main thread of the application, so that the developer only has to im-
plement Config::processEvent to update its data based on the preprocessed, generic
keyboard and mouse events. It is very easy to use and similar to a GLUT-based
implementation.
Threading Events are received and processed by the pipe thread a window be-
longs to. For AGL, Equalizer internally forwards the events from the main thread,
where it is received, to the appropriate pipe thread. This model allows window
and channel modifications which are naturally thread-safe, since they are executed
from the corresponding render thread and therefore cannot interfere with rendering
operations.
mouse pointer. The window or channel processEvent methods perform local updates, such as setting the pixel viewport, before forwarding the event to the application main loop. If the event was processed, processEvent has to return true. If false is returned to the event handler, the event is passed to the previously installed, window-system-specific event handling function.
After local processing, events are sent using Config::sendEvent to the application node. On reception, they are queued in the application thread. After a frame has been finished, Config::finishFrame calls Config::handleEvents. The default implementation of this method provides non-blocking event processing, that is, it calls Config::handleEvent for each queued event. By overriding handleEvents, event-driven execution can easily be implemented.
Figure 36 shows the overall data flow of an event.
Each event has a type which is used to identify it by the config processing func-
tion. On the application end, Config::handleEvent receives an eq::EventICommand,
which provides deserialization capabilities for the received data. The underlying co::DataIStream performs endian conversion if the endianness of the sending and receiving nodes does not match. The event data is decoded and printed in a nicely
formatted way:
bool Config::handleEvent( eq::EventICommand command )
{
    switch( command.getEventType( ))
    {
    case READBACK:
    case ASSEMBLE:
    case START_LATENCY:
    {
        switch( command.getEventType( ))
        {
        case READBACK:
            std::cout << " readback";
            break;
        case ASSEMBLE:
            std::cout << " assemble";
            break;
        case START_LATENCY:
        default:
            std::cout << "         ";
        }
if( !eq::init( argc, argv, &nodeFactory ))
    ...
eq::exit();
eVolve::exitErrors();
return ret;
}
When the application returns false from a configInit task method due to an application-specific error, it should set an error code using setError. Application-specific errors can have any value equal to or greater than eq::ERROR_CUSTOM:
/** Defines errors produced by eVolve. */
enum Error
{
    ERROR_EVOLVE_ARB_SHADER_OBJECTS_MISSING = eq::ERROR_CUSTOM,
    ERROR_EVOLVE_EXT_BLEND_FUNC_SEPARATE_MISSING,
    ERROR_EVOLVE_ARB_MULTITEXTURE_MISSING,
    ERROR_EVOLVE_LOADSHADERS_FAILED,
    ERROR_EVOLVE_LOADMODEL_FAILED,
    ERROR_EVOLVE_MAPOBJECT_FAILED
};
if( !renderer )
    return false;
namespace
{
struct ErrorData
{
    const uint32_t code;
    const std::string text;
};

ErrorData errors[] = {
    { ERROR_EVOLVE_ARB_SHADER_OBJECTS_MISSING,
      "GL_ARB_shader_objects extension missing" },
    { ERROR_EVOLVE_EXT_BLEND_FUNC_SEPARATE_MISSING,
      "GL_EXT_blend_func_separate extension missing" },
    { ERROR_EVOLVE_ARB_MULTITEXTURE_MISSING,
      "GL_ARB_multitexture extension missing" },
    { ERROR_EVOLVE_LOADSHADERS_FAILED, "Can't load shaders" },
    { ERROR_EVOLVE_LOADMODEL_FAILED, "Can't load model" },
    { ERROR_EVOLVE_MAPOBJECT_FAILED,
      "Mapping data from application process failed" },

    { 0, "" } // last!
};
}
void initErrors()
{
    eq::fabric::ErrorRegistry& registry = eq::fabric::Global::getErrorRegistry();

void exitErrors()
{
    eq::fabric::ErrorRegistry& registry = eq::fabric::Global::getErrorRegistry();
Threads The application or node main thread is the primary thread of each pro-
cess and executes the main function. The application and render clients initialize
the local node for communications with other nodes, including the server, using
Client::initLocal. The client derives from co::LocalNode, which provides most of the communication logic.
During this initialization, Collage creates and manages two threads for communi-
cation, the receiver thread and the command thread. Normally no application code
is executed from these two threads.
The receiver thread manages the network connections to other nodes and receives
data. It dispatches the received data either to the application threads, or to the
command thread.
The command thread processes internal requests from other nodes, for example
during co::Object mapping. In some special cases the command thread executes
application code; for example, when a remote node maps a static or unbuffered object, Object::getInstanceData is called from the command thread.
The receiver and command thread are terminated when the application stops
network communications using Client::exitLocal.
During config initialization, one pipe thread is created for each pipe. The pipe
threads execute all render task methods for this pipe, and therefore executes the
application’s rendering code. All pipe threads are terminated during Config::exit.
Since a layout switch on a canvas may involve different resources, pipe threads may
be started and stopped dynamically at runtime. Before such an update, Equalizer
will always finish all pending rendering operations to ensure that all resources are
idle.
During compositing, readback and transmission threads are created lazily on the
first image readback and transmission. Performing readback, compression and
transmission of images asynchronously pipelines these operations with subsequent
rendering commands, which increases overall performance. Equalizer creates one
thread per pipe to finalize an asynchronous readback and one thread per node to
compress and transmit the images to other nodes. Both threads are only visible to
transfer and compression plugins, not to any other application code.
[Figure 38: Threads used by Equalizer and Collage - the application main thread, the Collage receiver and command threads, the pipe render and transfer threads, and the node transmit thread]
Figure 38 provides an overview of all the threads used by Equalizer and Collage.
The rest of this section discusses the thread synchronization between the main
thread and the pipe threads.
ASYNC: No synchronization happens between the pipe render threads, except for
synchronizing the finish of frame current-latency. The eqPly and eVolve Equalizer examples activate this threading model by setting it in Node::configInit (see the sketch below, after the model descriptions).
This synchronization model provides the best performance and should be used
by all applications which multi-buffer all dynamic, frame-specific data.
DRAW_SYNC: In addition to the synchronization of the async thread model, all local render threads are synchronized, so that the draw operations happen synchronously with the node main loop. Compositing, swap barriers and
buffer swaps happen asynchronously. This model allows using the same database for rendering, and safe modifications of this database are possible from the node thread, since the pipe threads do not execute any rendering tasks between frames. This is the default threading model; it should be used by applications which keep one copy of the scene graph per node.
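For the ASYNC model above, a sketch of how it might be selected during node initialization (the attribute call is an assumption based on the description of eqPly setting the model in Node::configInit):

// Sketch: request the async thread model for this node before the base init.
bool Node::configInit( const eq::uint128_t& initID )
{
    setIAttribute( IATTR_THREAD_MODEL, eq::ASYNC );
    return eq::Node::configInit( initID );
}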
Figure 39 illustrates the synchronization and task execution for the thread syn-
chronization models. Please note that the thread synchronization synchronizes all
pipe render threads on a single node with the node’s main thread. The per-node
frame synchronization does not break the asynchronous execution across nodes.
[Figure 39: Async, draw sync and local sync Thread Synchronization Models]
An implication of the draw sync and local sync models is that the application
node cannot issue a new frame until it has completed all its draw tasks. For larger
cluster configurations it is therefore advisable to assign only assemble operations
to the application node, allowing it to run asynchronously to the rendering nodes. If
the machine running the application process should also contribute to rendering, a
second node on the same host can be used to perform off-screen draw and readback
operations for the application node process.
all render threads run asynchronously and can render different frames at the same
time.
Synchronizing the draw operations between multiple pipe render threads, and
potentially the application thread, breaks DPlex decomposition. At any given time,
only one frame can be rendered from the same process. The speedup of DPlex relies
however on the capability to render different frames concurrently.
If one process per GPU is configured, draw-synchronous applications can scale
the performance using DPlex compounds. The processes are not synchronized with
each other, since each process keeps its own version of the scene data.
Thread Synchronization in Detail The application has extended control over the
task synchronization during a frame. Upon Config::startFrame, Equalizer invokes the
frameStart task methods of the various entities. The entities unlock all their children
by calling startFrame, e.g., Node::frameStart has to call Node::startFrame to unlock
the pipe threads. Note that certain startFrame calls, e.g., Window::startFrame, are
currently empty since the synchronization is implicit due to the sequential execution
within the thread.
Figure 40 illustrates the local frame synchronization. Each entity uses waitFrameStarted to block on the parent's startFrame, e.g., Pipe::frameStart calls Node::waitFrameStarted to wait for the corresponding Node::startFrame. This explicit synchronization allows updating non-critical data before synchronizing with waitFrameStarted, or after unlocking using startFrame.

At the end of the frame, two similar sets of synchronization methods are used. The first set synchronizes the local execution, while the second set synchronizes the global execution.

[Figure 40: Per-Node Frame Synchronization]
The local synchronization consists of releaseFrameLocal to unlock the local frame,
and of waitFrameLocal to wait for the unlock. For the default synchronization model
draw sync, Equalizer uses the task method frameDrawFinish, which is called on each
resource after the last Channel::frameDraw invocation for this frame. Consequently,
Pipe::frameDrawFinish calls Pipe::releaseFrameLocal to signal that it is done drawing
the current frame, and Node::frameDrawFinish calls Pipe::waitFrameLocal for each of
its pipes to block the node thread until the current frame has been drawn.
The second, global synchronization is used for the frame completion during Con-
fig::finishFrame, which causes frameFinish to be called on all entities, passing the
oldest frame number, i.e., frame current-latency. The frameFinish task methods
have to call releaseFrame to signal that the entity is done with the frame. The re-
lease causes the parent’s frameFinish to be invoked, which is synchronized internally.
Once all Node::releaseFrame have been called, Config::finishFrame returns.
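As a sketch (class names hypothetical, signatures following the eq::Pipe and eq::Node task methods), the release calls described above would appear in custom overrides as follows:

// Local synchronization: this pipe is done drawing the current frame.
void MyPipe::frameDrawFinish( const eq::uint128_t& /*frameID*/,
                              const uint32_t frameNumber )
{
    // per-pipe cleanup after the last frameDraw of this frame may go here
    releaseFrameLocal( frameNumber ); // unblocks Node::frameDrawFinish
}

// Global synchronization: this node is done with frame 'current - latency'.
void MyNode::frameFinish( const eq::uint128_t& /*frameID*/,
                          const uint32_t frameNumber )
{
    releaseFrame( frameNumber ); // allows Config::finishFrame to return
}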
Figure 41 outlines the synchronization for the application, node and pipe classes
for an application node and one render client when using the default draw sync
thread model. Please note that Config::finishFrame blocks until the current frame has been released locally and until the frame current - latency has been released by all nodes. The window and channel synchronization are similar and omitted for simplicity.
It is absolutely vital for the execution that Node::startFrame and Node::releaseFrame are called for each frame. The default implementations of the node task methods take care of that.
23 http://glew.sourceforge.net
Functions called from outside the window need to define a macro or function glewGetContext that returns a pointer to the GLEWContext of the appropriate window, e.g., as done by the eqPly kd-tree rendering classes:
// state has GLEWContext* from window
#define glewGetContext state.glewGetContext

/* Set up rendering of the leaf nodes. */
void VertexBufferLeaf::setupRendering( VertexBufferState& state,
                                       GLuint* data ) const
{
    ...
    glBindBuffer( GL_ARRAY_BUFFER, data[ VERTEX_OBJECT ] );
    glBufferData( GL_ARRAY_BUFFER, _vertexLength * sizeof( Normal ),
                  &_globalData.normals[ _vertexStart ], GL_STATIC_DRAW );
    ...
}
The WGL and GLX pipes manage a WGLEWContext and GLXEWContext, respectively. These contexts are useful if extended wgl or glX functions are used during window initialization. The GLEW context structures are initialized using a temporary OpenGL context, created using the proper display device of the pipe.
from the render clients back to the application. The Pipe and Window classes have a setError method, which is used to set an error code. This error is passed to the Config instance on the application node, where it can be retrieved using getError. Section 7.2.2 explains error handling in more detail.
The sample implementations agl::Window, glx::Window and wgl::Window all have similar, overridable methods for all sub-tasks. This allows partial customization without the need to rewrite tedious window initialization code, e.g., the OpenGL pixel format selection. Figure 33 shows the UML class hierarchy for the system window implementations.
    return eq::Window::configInit( initID );
}
    if( !pixelFormat )
        return false;

    AGLContext context = createAGLContext( pixelFormat );
    destroyAGLPixelFormat( pixelFormat );
    setAGLContext( context );

    if( !context )
        return false;

    makeCurrent();
    initGLEW();
    return configInitAGLDrawable();
}
    GLXContext context = createGLXContext( fbConfig );
    setGLXContext( context );

    if( !context )
    {
        XFree( fbConfig );
        return false;
    }

    const bool success = configInitGLXDrawable( fbConfig );
    XFree( fbConfig );

    if( !success || !_xDrawable )
    {
        if( getError() == ERROR_NONE )
            setError( ERROR_GLXWINDOW_NO_DRAWABLE );
        return false;
    }

    makeCurrent();
    initGLEW();
    initSwapSync();

    if( getIAttribute( eq::Window::IATTR_HINT_DRAWABLE ) == FBO )
        configInitFBO();

    return success;
}
The full WGL configInit task method, including error handling and cleanup, looks as follows:
bool Window::configInit()
{
    if( !initWGLAffinityDC( ))
    {
        setError( ERROR_WGL_CREATEAFFINITYDC_FAILED );
        return false;
    }

    const int pixelFormat = chooseWGLPixelFormat();
    if( pixelFormat == 0 )
    {
        exitWGLAffinityDC();
        return false;
    }

    if( !configInitWGLDrawable( pixelFormat ))
    {
        exitWGLAffinityDC();
        return false;
    }

    if( !_wglDC )
    {
        exitWGLAffinityDC();
        setWGLDC( 0, WGL_DC_NONE );
        setError( ERROR_WGLWINDOW_NO_DRAWABLE );
        return false;
    }

    HGLRC context = createWGLContext();
    if( !context )
    {
        configExit();
        return false;
    }

    setWGLContext( context );
    makeCurrent();
    initGLEW();
    initSwapSync();

    if( getIAttribute( eq::Window::IATTR_HINT_DRAWABLE ) == FBO )
        return configInitFBO();

    return true;
}
Stereo Rendering Figure 42(a) illustrates a monoscopic view frustum. The viewer
is positioned at the origin of the global coordinate system, and the frustum is com-
pletely symmetric. This is the typical view frustum for non-stereoscopic applica-
tions.
[Figure 42: (a) monoscopic and (b) stereo view frusta between the near and far planes, with the wall from the config file]
In stereo rendering, the scene is rendered twice, with the two frustum origins
’moved’ to the position of the left and right eye, as shown in Figure 42(b). The
stereo frusta are asymmetric. The stereo convergence plane is the same as the
projection surface, unless specified otherwise using the focus distance API (see
Section 7.2.6).
Note that while stereo rendering is often implemented using the simpler toe-in
method, Equalizer implements the correct approach using asymmetric frusta.
[Figure 43: (a) frusta for a non-planar projection surface and (b) for a head-mounted display]
The head position of an observer is updated by sending a config event containing the observer identifier and a Matrix4f head matrix. Equalizer optionally implements head tracking using VRPN or OpenCV, and uses this mechanism to inject the asynchronously computed tracking matrix:
config->sendEvent( Event::OBSERVER_MOTION ) << originator << head;
This event will be dispatched to the given observer instance in the application
process during the next Config::handleEvents. The observer will update its head
matrix based on the event data:
switch( command.getEventType( ))
{
case Event::OBSERVER_MOTION:
    return setHeadMatrix( command.get< Matrix4f >( ));
}
Projection surfaces which are not X/Y-planar create frusta which are not oriented along the Z axis, as shown in Figure 43(a). These frusta are positioned using the channel's head transformation, which can be retrieved using Channel::getHeadTransform.
For head-mounted displays (HMD), the tracking information is used to move the
frusta with the observer, as shown in Figure 43(b). This results in different pro-
jections compared to normal tracking with fixed projection screens. This difference
is transparent to Equalizer applications, only the configuration file uses a different
wall type for HMDs.
Focus Distance The focus distance, also called stereo convergence, is the Z distance at which the left and right eye frusta converge. A plane parallel to the near plane positioned at the focus distance creates the same 2D image for both eyes.
The frustum calculation up to Equalizer version 1.0 places the stereo convergence
on the projection surface, as shown in Figure 44(a). The focus distance is therefore
the distance between the origin and the middle of the projection surface. This is
how almost all Virtual Reality software handles the focal plane.
In Equalizer 1.2 and later the focus distance and mode can be configured and
changed at runtime for each observer. This allows applications to expose the focal
plane in their user interface, or to automatically calculate it based on the scene and
view direction.
The default focus mode fixed implements the algorithm used by Equalizer 1.0.
This mode ignores the focus distance setting. The focus modes relative to origin
and relative to observer use the focus distance parameter to dynamically change the
stereo convergence.
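A minimal sketch of exposing this from the application, assuming the observer API names setFocusMode and setFocusDistance and the focus mode constants of eq::fabric (Equalizer 1.2 or later):

// Hedged sketch: keep the focal plane on an object at a known distance
// in front of the observer.
void focusOnObject( eq::Observer* observer, const float distanceInMeters )
{
    observer->setFocusMode( eq::fabric::FOCUSMODE_RELATIVE_TO_OBSERVER );
    observer->setFocusDistance( distanceInMeters );
}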
The focus distance calculation relative to the origin allows changing the focus independently of the observer position by separating the projection surface from the
Figure 44: Fixed (a) and dynamic focus distance relative to origin (b) and observer (c)
stereo convergence plane. The convergence plane of the first wall in the negative
Z direction is moved to be at the given focus distance, as shown in Figure 44(b).
All other walls are moved by the same relative amount. The movement is made
from the view of the central eye, thus leaving the mono frustum unchanged. Fig-
ure 44(b) shows the new logical ’walls’ used for frustum calculations in white, while
the physical projection from Figure 44(a) is still visible.
The focus distance calculation relative to the observer is similar to the origin
algorithm, but it keeps the closest wall in the observer’s view direction at the given
focus distance, as shown in Figure 44(c). When the observer moves forward, the
focal plane moves forward as well. Consequently, when the observer looks in a
different direction, a different object in the scene is focused, as indicated by the
dotted circle in Figure 44(c).
Figure 45: Fixed (a), relative to origin (b) and observer (c) focus distance examples
Figure 45 shows an example for the three focus distance modes. The configured
wall is one meter behind the origin, the model two meters behind and the observer
is half a meter in front of the origin. The focus distance was set to two meters.
Application-specific Scaling All Equalizer units in the configuration file and API
are configured in meters. This convention allows the reuse of configurations across
multiple applications. When visualizing real-world objects, e.g., for architectural
visualizations, it guarantees that they appear realistic and full immersion into the
visualization can be achieved.
Certain applications want to visualize objects in immersive environments in a
scale different from their natural size, e.g., astronomical simulations. Each model,
and therefore each view, might have a different scale. Applications can declare this
scale as part of the eq::View, which will be applied to the virtual environment by
Equalizer. Common metric scale factors are provided as constants.
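As a sketch (assuming the eq::View model unit API and the provided metric constants, of which EQ_M is referenced by the --eq-modelunit option), an astronomical view could declare a kilometer scale like this:

// Hedged sketch: one model unit corresponds to one kilometer in the
// meter-based virtual environment.
void setAstronomicalScale( eq::View* view )
{
    view->setModelUnit( EQ_KM ); // assumed constant, analogous to EQ_M
}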
Subclassing and Data Distribution Layout API entities (Canvas, Segment, Lay-
out, View) are sub-classed like all other Equalizer entities using the NodeFactory.
Equalizer registers the master instance of these entities on the server node. Mutable
parameters, e.g., the active layout of a canvas, are distributed using slave object
commits (cf. Section 8.4.4). Application-specific data can be attached to a view
using a distributable UserData object. Figure 46 shows the UML class hierarchy for
the eqPly::View.
Equalizer commits dirty layout entities at the beginning of each Config::startFrame, and synchronizes the slave instances on the render clients correctly with the current frame.
The render clients can access a slave instance of the view using Channel::getView. When called from one of the frame task methods, this method will return the view of the current destination channel for which the task method is executed. Otherwise it returns the channel's native view, if it has one. Only destination channels of an active canvas have a native view.

Figure 46: UML Hierarchy of eqPly::View
The most common entity to subclass is the View, since the application often
amends it with view-specific application data. A view might have a distributable
user data object, which has to inherit from co::Object to be committed and syn-
chronized with the associated view.
The eqPly::View sets its Proxy as the user data object in the constructor. By
default, the master instance of the view’s user data is on the application instance of
the view. This may be changed by overriding hasMasterUserData. The proxy object
registration, mapping and synchronization are fully handled by the fabric layer; no further handling has to be done by the application:
View::View( eq::Layout* parent )
    : eq::View( parent )
    , _proxy( this )
    , _idleSteps( 0 )
{
    setUserData( &_proxy );
}
    _modelID = id;
    _proxy.setDirty( Proxy::DIRTY_MODEL );
}
Run-time Layout Switch The application can switch the layout used on a canvas at runtime. This will cause the running entities to be updated on the next frame. At a minimum, this means the channels involved in the last layout on the canvas are de-initialized, and the channels used by the new layout are initialized.
Run-time Stereo Switch Similar to switching a layout, the stereo mode of each view can be switched at runtime. This causes all destination channels of the view to update either the cyclop eye or the left and right eyes.
The configuration contains stereo information in three places: the view, the seg-
ment and the compound. The view defines which stereo mode is active: mono or
stereo. This setting can be done in the configuration file or programmatically.
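A minimal sketch of the programmatic switch, assuming the eq::View mode API (getMode and changeMode):

// Hedged sketch: toggle one view between mono and stereo rendering.
void toggleStereo( eq::View* view )
{
    const bool stereo = ( view->getMode() == eq::View::MODE_STEREO );
    view->changeMode( stereo ? eq::View::MODE_MONO : eq::View::MODE_STEREO );
}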
The segment has an eye attribute which defines which eyes are displayed by this segment. This is typically all eyes for active stereo projections, and the left or right eye for passive stereo. The cyclop eye for passive stereo can be set on either a single segment or on both, depending on whether the second projector is active during monoscopic rendering. The default setting enables all eyes for a segment.
The compound has an eye attribute defining which eyes it is capable of updating. This allows specifying different compounds for the same channel, depending on the stereo mode. One use case is to use one GPU for each eye of a stereo destination channel in stereo mode, and to decompose the same channel into a left and a right half across both GPUs in mono mode.
[Figure: in stereo mode GPU 0 renders the left eye and GPU 1 the right eye; in mono mode GPU 0 renders the left half and GPU 1 the right half]
Frustum Updates Frustum parameters can be changed at runtime for views and
segments by the application. View frusta are typically changed for non-fullscreen
application windows and multi-view layouts, where the rendering is not meant to
be viewed in real-world size in an immersive environment. A typical use case is
changing the field-of-view of the rendering.
Segment frusta are changed when the display system changes at runtime, for
example by moving a tracked LCD through a virtual world.
The view and segment are derived from eq::Frustum (Figure 46), and the applica-
tion process can set the wall or projection parameters at runtime. For a description
of wall and projection parameters please refer to Section 3.11.3. The new data will
be effective for the next frame. The frustum of a view overrides the underlying
frustum of the segments.
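A sketch of such a runtime update from the application thread, assuming the eq::Wall member names corresponding to the wall section of the configuration file:

// Hedged sketch: replace a view's frustum with a 2 x 1 meter wall one meter
// in front of the origin; the change becomes effective with the next frame.
void setDefaultWall( eq::View* view )
{
    eq::Wall wall;
    wall.bottomLeft  = eq::Vector3f( -1.f, -.5f, -1.f );
    wall.bottomRight = eq::Vector3f(  1.f, -.5f, -1.f );
    wall.topLeft     = eq::Vector3f( -1.f,  .5f, -1.f );
    view->setWall( wall );
}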
The default Equalizer event handling uses the view API to maintain the aspect ratio of destination channels after a window resize. Without updating the wall or projection description, the rendering would become distorted.
When a window is resized, a CHANNEL_RESIZE event is generated. If the corresponding channel has a view, Channel::processEvent sends a VIEW_RESIZE event to the application. This event contains the identifier of the view. The config event is dispatched to View::handleEvent on the application thread.

Figure 48: Event Flow during a View Update

Using the original size and wall or projection description
of the view, a new wall or projection is computed, keeping the aspect ratio and
the height of the frustum constant. This new frustum is automatically applied by
Equalizer at the next config frame.
Figure 48 shows a sequence diagram of such a view update.
Figure 50 illustrates this load grid for a four-way 2D compound with ROI (top)
and without ROI (bottom).
ROI in eqPly The application should declare its regions of interest during ren-
dering. This declaration can be very fine-grained, e.g., on each leaf node of a
scene graph. Equalizer will track and optimize multiple regions automatically. The
current implementation merges all regions of a single channel into one region for
compositing and load-balancing. Later Equalizer versions may use different, more
optimal, heuristics based on the application-declared regions.
Each leaf node of the kd-tree used in eqPly declares its region of interest when
it is rendered. The region is calculated by projecting the bounding box into screen
space and normalizing the resulting screen space rectangle:
void VertexBufferLeaf::draw( VertexBufferState& state ) const
{
    if( state.stopRendering( ))
        return;

    state.updateRegion( _boundingBox );
    ...

void VertexBufferState::updateRegion( const BoundingBox& box )
{
    const Vertex corners[8] = { Vertex( box[0][0], box[0][1], box[0][2] ),
                                Vertex( box[1][0], box[0][1], box[0][2] ),
                                Vertex( box[0][0], box[1][1], box[0][2] ),
                                Vertex( box[1][0], box[1][1], box[0][2] ),
                                Vertex( box[0][0], box[0][1], box[1][2] ),
                                Vertex( box[1][0], box[0][1], box[1][2] ),
                                Vertex( box[0][0], box[1][1], box[1][2] ),
                                Vertex( box[1][0], box[1][1], box[1][2] ) };
    ...
    // transform region of interest from [ -1 -1 1 1 ] to normalized viewport
    const Vector4f normalized( region[0] * .5f + .5f,
                               region[1] * .5f + .5f,
                               ( region[2] - region[0] ) * .5f,
                               ( region[3] - region[1] ) * .5f );
    declareRegion( normalized );
    ...
This declareRegion call is eventually forwarded to eq::Channel::declareRegion.
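Channels which know the footprint of their content can also declare a region directly; a hedged sketch, assuming the rendered data is known to cover only the lower-left quarter of the channel:

// Hedged sketch (MyChannel and the covered area are hypothetical).
void MyChannel::frameDraw( const eq::uint128_t& frameID )
{
    eq::Channel::frameDraw( frameID ); // frustum setup, clear, ...
    // ... render the application data ...
    const eq::PixelViewport& pvp = getPixelViewport();
    declareRegion( eq::PixelViewport( pvp.x, pvp.y, pvp.w / 2, pvp.h / 2 ));
}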
back the pixel data from the source GPU and assembling it on the destination
GPU.
Channels producing one or more outputFrames use Channel::frameReadback to
read the pixel data from the frame buffer. The channels receiving one or multiple
inputFrames use Channel::frameAssemble to assemble the pixel data into the frame-
buffer. Equalizer takes care of the network transport of frame buffer data between
nodes.
Normally the programmer does not need to interfere with the image compositing.
Changes are sometimes required at a high level, for example to order the input
frames or to optimize the readback. The following sections describe the image
compositing API in Equalizer.
Parallel Direct Send Compositing To provide a motivation for the design of the
image compositing API, the direct send parallel compositing algorithm is introduced
in this section. Other parallel compositing algorithms, e.g. binary-swap, can also
be expressed through an Equalizer configuration file.
Direct send has two important properties: an algorithmic complexity of O(1)
for each node, that is, the compositing cost per node is constant as resources are
added, and the capability to perform total ordering during compositing, e.g., to sort all contributions of a 3D volume rendering correctly back to front.
The main idea behind direct send is to parallelize the costly recomposition for
database (sort-last) decomposition. With each additional source channel, the amount
of pixel data to be composited grows linearly. When using the simple approach of
compositing all frames on the destination channel, this channel quickly becomes the
bottleneck in the system. Direct send distributes this workload evenly across all
source channels, and thereby keeps the compositing work per channel constant.
In direct send compositing, each rendering channel is also responsible for the sort-last composition of one screen-space tile. It receives the framebuffer pixels for its tile from all the other channels. The size of one tile decreases linearly with the number of source channels, which keeps the compositing work per channel constant.
After performing the sort-last compositing, the color information is transferred to the destination channel, similarly to a 2D (sort-first) compound. The amount of pixel data for this part of the compositing pipeline also approaches a constant value, i.e., the full frame buffer.

Figure 51: Direct Send Compositing
Figure 51 illustrates this algorithm for three channels. The Equalizer website
contains a presentation24 explaining and comparing this algorithm to the binary-
swap algorithm.
The following operations have to be possible to perform this algorithm:
• Selection of color and/or depth frame buffer attachments
24 http://www.equalizergraphics.com/documents/EGPGV07.pdf
The Compositor The Compositor class gathers a set of static functions which im-
plement the various compositing algorithms and low-level optimizations. Figure 53
provides a top-down functional overview of the various compositor functions.
On a high level, the compositor combines multiple input frames using 2D tiling,
depth-compositing for polygonal data or sorted, alpha-blended compositing for
semi-transparent volumetric data. These operations either composite all images directly on the GPU, or use a CPU-based compositor and then transfer the preintegrated result to the GPU. The high-level entry points automatically select the best
algorithm. The CPU-based compositor uses OpenMP to accelerate its operation.
On the next lower level, the compositor provides functionality to composite a
single frame, either using 2D tiling (possibly with blending for alpha-blended com-
positing) or depth-based compositing.
The per-frame compositing in turn relies on the per-image compositing function-
ality, which automatically decides on the algorithm to be used (2D or depth-based).
The concrete per-image assembly operation uses OpenGL operations to composite
the pixel data into the framebuffer, potentially using GLSL for better performance.
25 Volume Data Set courtesy of: SFB-382 of the German Research Council (DFG)
[Figure 53: functional overview of the Compositor: assembleFrames and assembleFramesSorted fall back to assembleFramesCPU/mergeFramesCPU when CPU assembly is used, and otherwise assemble each frame via assembleFrame and assembleImage, which selects assembleImageDB_FF or assembleImageDB_GLSL depending on the available OpenGL version]
Therefore they have to be blended back-to-front in the same way as the slice planes
are blended during rendering.
Database decomposition has the advantage of scaling any part of the volume
rendering pipeline: texture and main memory (smaller bricks for each channel), fill rate (fewer samples per channel) and IO bandwidth for time-dependent data (less data to update per time step and channel). Since the amount of texture memory needed
for each node decreases linearly, sort-last rendering makes it possible to render data
sets which are not feasible to visualize with any other approach.
For recomposition, the 2D frame buffer contents are blended to form a seamless
picture. For correct blending, the frames are ordered in the same back-to-front
order as the slices used for rendering, and use the same blending parameters. Simplified, the frame buffer images are ‘thick’ slices which are ‘rendered’ by writing their content to the destination frame buffer using the correct order.

Figure 54: Final Result (a) of Figure 55(b) using Volume Rendering based on 3D Texture Slicing (b)
For orthographic rendering, determining the compositing order of the input frames
is trivial. The screen-space orientation of the volume bricks determines the order
in which they have to be composited. The bricks in eVolve are created by slic-
ing the volume along one dimension. Therefore the range of the resulting frame
buffer images, together with the sorting order, is used to arrange the frames during
compositing. Figure 55(a) shows this composition for one view.
[Figure 55: (a) back-to-front compositing order of the DB ranges 1-4; (b) perspective view and frustum setup with the angle α between the origin-to-slice vector and the near plane, in Zview/Zmodel space]
Finding the correct assembly order for perspective frusta is more complex. The
perspective distortion invalidates a simple orientation criterion like the one used for
orthographic frusta. For the view and frustum setup shown in Figure 55(b)26 the
correct compositing order is 4-3-1-2 or 1-4-3-2.
26 Volume Data Set courtesy of: AVS, USA
To compute the assembly order, eVolve uses the angle between the origin → slice
vector and the near plane, as shown in Figure 55(b). When the angle becomes
greater than 90°, the compositing order of the remaining frames has to be changed.
The result image of this composition naturally looks the same as the volume ren-
dering would when rendered on a single channel. Figure 54(a) shows the result of
the composition from Figure 55(b).
The assembly algorithm described in this section also works with parallel com-
positing algorithms such as direct-send.
// else
LBVERB << "Initialized "
       << ( _accum.buffer->usesFBO() ? "FBO accum" : "glAccum" )
       << " buffer for " << getName() << " " << getEye()
       << std::endl;
In idle mode, the results are accumulated at the end of the frame and displayed
after each iteration by frameViewFinish. Since frameViewFinish is only called on
destination channels this operation is done only on the final rendering result after
assembly. When all the steps are done the config will stop rendering new frames.
If the current pixel viewport is different from the one saved in frameViewStart,
the accumulation buffer also needs to be resized and the idle anti-aliasing is reset:
const eq::PixelViewport& pvp = getPixelViewport();
const bool isResized = _accum.buffer->resize( pvp );

if( isResized )
{
    const View* view = static_cast< const View* >( getView( ));
    _accum.buffer->clear();
    _accum.step = view->getIdleSteps();
    _accum.stepsDone = 0;
}
else if( frameData.isIdle( ))
{
    setupAssemblyState();

    if( !isDone() && _accum.transfer )
        _accum.buffer->accum();
    _accum.buffer->display();

    resetAssemblyState();
}
The subpixel area is a function of the current jitter step, the channel’s subpixel
description and the idle state. Each source channel is responsible for filling a subset
of the sampling grid. To quickly converge to a good anti-aliasing, each channel
selects its samples using a pseudo-random approach, using a precomputed prime
number table to find the subpixel for the current step:
eq::Vector2i Channel::getJitterStep() const
{
    const eq::SubPixel& subPixel = getSubPixel();
    const uint32_t channelID = subPixel.index;
    const View* view = static_cast< const View* >( getView( ));
    if( !view )
        return eq::Vector2i::ZERO;

    const uint32_t totalSteps = uint32_t( view->getIdleSteps( ));
    if( totalSteps != 256 )
        return eq::Vector2i::ZERO;
    ...
    return eq::Vector2i( dx, dy );
}
The FrameData class holds the application's idle mode. The Config updates the
idle mode information depending on the application’s state. Each Channel performs
anti-aliasing when no user event requires a redraw.
When the rendering is not in idle mode, the jitter is queried from Equalizer which
returns an optimal subpixel offset for the given subpixel decomposition. This is used
during normal rendering of subpixel compounds:
eq::Vector2f Channel::getJitter() const
{
During idle rendering of any decomposition, the jitter for the frustum is computed
using the normalized subpixel center point and the size of a pixel on the near plane.
A random position within the sub-pixel is set as a sample position, which will be used
to move the frustum. The getJitter method will return the computed jitter vector
for the current frustum. This method has a default implementation in eq::Channel
for subpixel compounds, but is overridden in eqPly to perform idle anti-aliasing:
const eq::Vector2i jitterStep = getJitterStep();
if( jitterStep == eq::Vector2i::ZERO )
    return eq::Vector2f::ZERO;
...
const float sampleSize = 16.f; // sqrt( 256 )
const float subpixel_w = pixel_w / sampleSize;
const float subpixel_h = pixel_h / sampleSize;
...
return eq::Vector2f( i, j );
7.2.11. Statistics
Statistics Gathering Statistics are measured in milliseconds since the configura-
tion was initialized. The server synchronizes the per-configuration clock on each
node automatically. Each statistic event records the originator’s (channel, window,
pipe, node, view or config) unique identifier.
Statistics are enabled per entity using an attribute hint. The hint determines
how precise the gathered statistics are. When set to fastest, the per-frame clock is
sampled directly when the event occurs. When set to nicest, all OpenGL commands
will be finished before sampling the event. This may incur a performance penalty,
but gives more correct results. The default setting is fastest in release builds, and
nicest in debug builds. The fastest setting often attributes times to the opera-
tion causing an OpenGL synchronization instead of the operation submitting the
OpenGL commands, e.g., the readback time contains operations from the preceding
draw operation.
The events are processed by the channel's and window's processEvent method. The default implementation sends these events to the config using Config::sendEvent, as explained in Section 7.2.1. When the default implementation of Config::handleEvent receives the statistics event, it sorts the event per frame and per originator. When a frame has been finished, the events are pushed to the local (app-)node for visualization. Furthermore, the server also receives these events, which are used by the equalizers to implement the various runtime adjustments.

Figure 56: Statistics for a two node 2D compound

Figure 56 shows the visualization of statistics events in an overlay27.
Statistics Overlay The eq::Channel provides the method drawStatistics which ren-
ders a statistics overlay using the gathered statistics events. Statistics rendering is
toggled on and off using the ’s’ key in the shipped Equalizer examples.
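A hedged sketch of hooking the overlay into a channel's view finish, assuming a hypothetical application flag that tracks the toggle:

// Hedged sketch: draw the statistics overlay on destination channels only.
void MyChannel::frameViewFinish( const eq::uint128_t& frameID )
{
    if( _showStatistics ) // hypothetical application state
        drawStatistics();
    eq::Channel::frameViewFinish( frameID );
}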
Figure 57 shows a detailed view of Figure 56. The statistics shown are for a two-
node 2D compound. The destination channel is on the appNode and contributes to
the rendering.
The X axis is the time, the right-most pixel is the most current time. One pixel
on the screen corresponds to a given time unit, here one millisecond per pixel. The
scale is zoomed dynamically to powers-of-ten milliseconds to fit the statistics into
the available viewport. This allows easy and accurate evaluations of bottlenecks or
misconfigurations in the rendering pipeline. The scale of the statistics is printed
directly above the legend.
On the Y axis are the entities: channels, windows, nodes and the config. The
top-most channel is the local channel since it executes frameAssemble, and the lower
channel is the remote channel, executing frameReadback.
To facilitate the understanding, older frames are gradually grayed out. The right-
most, current frame is brighter than the frame before it.
The configuration uses the default latency of one frame. Consequently, the exe-
cution of two frames overlaps. This can be observed in the early execution of the
remote channel’s frameDraw, which starts while the local channel is still drawing
and assembling the previous frame.
The asynchronous execution allows operations to be pipelined, i.e., the compres-
sion, network transfer and assembly with the actual rendering and readback. This
increases performance by minimizing idle and wait times. In this example, the re-
mote channel2 has no idle times, and executes the compression and network transfer
of its output frame in parallel with rendering and readback. Likewise, the applica-
tion node receives and decompresses the frame in parallel to its rendering thread.
In the above example, the local channel finishes drawing the frame early, and
therefore spends a couple of milliseconds waiting for the input frame from the re-
mote channel. These wait events, rendered red, are a sub-event of the yellow frame-
Assemble task. Using a load equalizer instead of a static 2D decomposition would
balance the rendering in this example better and minimize or even eliminate this
wait time.
The white Window::swapBuffers task might take a longer time, since the execution
of the swap buffer is locked to the vertical retrace of the display. Note that the
remote source window does not execute swapBuffers in this configuration, since it
is a single-buffered FBO.
The beginning of a frame is marked by a vertical green line, and the end of a frame
by a vertical gray line. These lines are also attenuated. The brightness and color
matches the event for Config::startFrame and Config::finishFrame, respectively. The
event for startFrame is typically not visible, since it takes less than one millisecond
to execute. If no idle processing is done by the application, the event for finishFrame
occupies a full frame, since the config is blocked here waiting for the frame current
- latency to complete.
A legend directly below the statistics facilitates understanding. It lists the per-
entity operations with its associated color. Furthermore, some other textual infor-
mation is overlayed with the statistics. The total compression ratio is printed with
each readback and compression statistic. In this case the image has not been compressed during download. For network transfer it has been compressed to 1% of its original size, since it contains many black background pixels. For readback the plugin with the name 0x101, EQ_COMPRESSOR_TRANSFER_RGBA_TO_BGRA, has been used, and for compression the plugin 0x11, EQ_COMPRESSOR_RLE_DIFF_BGRA, was used. The compressor names are listed in the associated plugin header compressorTypes.h.
Please note that the support for CUDA is only enabled if the CUDA libraries are
found in their default locations. Otherwise please adapt the build system accord-
ingly.
CUDA Memory Management Equalizer does not provide any facility to perform
memory transfer from and to the CUDA devices. This is entirely left to the pro-
grammer.
8. The Collage Network Library
8.1. Connections
The co::Connection is the basic primitive used for communication between processes
in Equalizer. It provides a stream-oriented communication between two endpoints.
A connection is either closed, connected or listening. A closed connection cannot
be used for communications. A connected connection can be used to read or write
data to the communication peer. A listening connection can accept connection
requests.
A co::ConnectionSet is used to manage multiple connections. The typical use case
is to have one or more listening connections for the local process, and a number of
connected connections for communicating with other processes.
The connection set is used to select one connection which requires some action.
This can be a connection request on a listening connection, pending data on a
connected connection or the notification of a disconnect.
The connection and connection set can be used by applications to implement
other network-related functionality, e.g., to communicate with a sound server on a
different machine.
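A hedged sketch of such application-level use, with hypothetical host and port values; exact member names may differ slightly between Collage versions:

// Hedged sketch: connect to a peer and wait for data using a connection set.
co::ConnectionDescriptionPtr desc = new co::ConnectionDescription;
desc->type = co::CONNECTIONTYPE_TCPIP;
desc->setHostname( "soundserver.example.org" ); // hypothetical peer
desc->port = 4242;                               // hypothetical port

co::ConnectionPtr connection = co::Connection::create( desc );
if( connection && connection->connect( ))
{
    co::ConnectionSet set;
    set.addConnection( connection );
    if( set.select() == co::ConnectionSet::EVENT_DATA )
    {
        // data is pending on the connected connection; read it here
    }
}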
8.3. Nodes
The co::Node is the abstraction of one process in the cluster. Each node has a
universally unique identifier. This identifier is used to address nodes, e.g., to query
connection information to connect to the node. Nodes use connections to commu-
nicate with each other by sending co::OCommands.
The co::LocalNode is the specialization of the node for the given process. It
encapsulates the communication logic for connecting remote nodes, as well as object
registration and mapping. Local nodes are set up in the listening state during
initialization.
A remote Node can either be connected explicitly by the application or due to
a connection from a remote node. The explicit connection can be done by pro-
grammatically creating a node, adding the necessary ConnectionDescriptions and
connecting it to the local node. It may also be done by connecting the remote node
to the local node by using its NodeID. This will cause Collage to query connection
information for this node from the already-connected nodes, instantiating the node
and connecting it. Both operations may fail.
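A hedged sketch of the explicit variant, with a hypothetical host name; localNode is assumed to be a listening co::LocalNode:

// Hedged sketch: explicitly connect a remote node to a listening local node.
co::NodePtr remoteNode = new co::Node;
co::ConnectionDescriptionPtr desc = new co::ConnectionDescription;
desc->setHostname( "render1.example.org" ); // hypothetical render host
remoteNode->addConnectionDescription( desc );

if( !localNode->connect( remoteNode ))
    LBWARN << "Connection to render1 failed" << std::endl;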
8.4. Objects
Distributed objects provide powerful, object-oriented data distribution for C++
objects. They facilitate the implementation of data distribution in a cluster envi-
ronment. Their functionality and an example use case for parallel rendering has
been described in Section 7.1.3.
Distributed objects subclass from co::Serializable or co::Object. The application
programmer implements serialization and deserialization of the distributed data.
Objects are dynamically attached to a listening local node, which manages the
network communication and command dispatch between different instances of the
same distributed object.
Objects are addressed using a universally unique identifier. The identifier is au-
tomatically created in the object constructor. The master version of a distributed
object is registered with the co::LocalNode. The identifier of the master instance
can be used by other nodes to map their instance of the object, thus synchronizing
the object’s data and identifier with the remotely registered master version.
One instance of an object is registered with its local node, which makes this object
the master instance. Slave instances on the same or other nodes are mapped to this
master. During mapping they are initialized by transmitting a version of the master
instance data. During commit, the change delta is pushed from the master to all
mapped slave objects, using multicast connections when available. Slave objects
can also commit data to their master instance, which in turn may recommit it to
all slaves, as described in Section 8.4.4. Objects can push their instance data to a
set of nodes, as described in Section 8.4.5.
Distributed objects can be static (immutable) or dynamic. Dynamic objects are
versioned. New versions are committed from the master instance, which sends the
delta between the previous and current version to all mapped slave objects. The
slave objects sync the queued deltas when they need a version.
Objects may have a maximum number of unapplied versions, which will cause
the commit on the master instance to block if any slave instance has reached the
maximum number of queued versions. By default, slave instances can queue an
unlimited amount of unapplied versions.
Objects use compression plugins to reduce the amount of data sent over the network. By default, the plugin with the highest compression ratio and lossless compression for EQ_COMPRESSOR_DATATYPE_BYTE tokens is chosen. The application may override Object::chooseCompressor to deactivate compression by returning EQ_COMPRESSOR_NONE or to select object-specific compressors.
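A minimal sketch of the override described above (MyObject is hypothetical):

// Hedged sketch: disable compression for an object whose payload is already
// compressed, e.g., pre-compressed image data.
uint32_t MyObject::chooseCompressor() const
{
    return EQ_COMPRESSOR_NONE;
}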
Proxies For each object to be distributed, a proxy object is created which manages
data distribution for its associated object. This requires the application to
track changes on the object separately from the object itself. The model data distribution of eqPly uses this pattern (Figure 60(b)).
Multiple Inheritance A new class inheriting from the class to be distributed and
from co::Object implements the data distribution. This requires the applica-
tion to instantiate a different type of object instead of the existing object,
and to create wrapper methods in the superclass calling the original method
and setting the appropriate dirty flags. This pattern is not used in eqPly (Figure 60(c)).
[Figure 60: data distribution patterns: a Foo::Proxy co::Object managing the instance data of a Foo::Class (b), and a Foo::Distributed class inheriting from both Foo::Class and co::Object (c), each implementing getInstanceData, applyInstanceData, pack and unpack]
STATIC The object is not versioned nor buffered. The instance data is serial-
ized whenever a new slave instance is mapped. The serialization happens in
the command thread on the master node, i.e., in parallel to the application
threads. Since the object’s distributed data is supposed to be static, this
should not create any race conditions. No additional data is stored.
INSTANCE The object is versioned and buffered. The instance and delta data are
identical, that is, only instance data is serialized. The serialization always
happens before LocalNode::registerObject or Object::commit returns. Previous
instance data is saved to be able to map old versions.
DELTA The object is versioned and buffered. The delta data is typically smaller
than the instance data. Both the delta and instance data are serialized be-
fore Object::commit returns. The delta data is transmitted to slave instances
for synchronization. Previous instance data is saved to be able to map old
versions.
UNBUFFERED The object is versioned and unbuffered. No data is stored, and
no previous versions can be mapped. The instance data is serialized from the
command thread whenever a new slave instance is mapped, i.e., in parallel
to application threads. The application has to ensure that this does not cre-
ate any thread conflicts. The delta data is serialized before Object::commit
returns. The application may choose to use a different, more optimal imple-
mentation to pack deltas by using a different implementation for getInstanceData and pack.
8.4.3. co::Serializable
co::Serializable implements one typical usage pattern for data distribution on top of co::Object.
The co::Serializable data distribution is based on the concept of dirty bits, allowing
inheritance with data distribution. Dirty bits form a 64-bit mask which marks the
parts of the object to be distributed during the next commit. It is easier to use,
but imposes one typical way to implement data distribution.
Inherit from co::Serializable: The base class will provide the dirty bit management
and call serialize and deserialize appropriately. By optionally overriding getChangeType, the default versioning strategy might be changed:
namespace eqPly
{
/**
 * Frame-specific data.
 *
 * The frame-specific data is used as a per-config distributed object and
 * contains mutable, rendering-relevant data. Each rendering thread (pipe)
 * keeps its own instance synchronized with the frame currently being
 * rendered. The data is managed by the Config, which modifies it directly.
 */
class FrameData : public co::Serializable
{
Define new dirty bits: Define dirty bits for each data item by starting at Serializable::DIRTY_CUSTOM, shifting this value consecutively for each new dirty bit:
/** The changed parts of the data since the last pack(). */
enum DirtyBits
{
    DIRTY_CAMERA  = co::Serializable::DIRTY_CUSTOM << 0,
    DIRTY_FLAGS   = co::Serializable::DIRTY_CUSTOM << 1,
    DIRTY_VIEW    = co::Serializable::DIRTY_CUSTOM << 2,
    DIRTY_MESSAGE = co::Serializable::DIRTY_CUSTOM << 3,
};
Implement serialize and deserialize: For each object-specific dirty bit which is set,
stream the corresponding data item to or from the provided stream. Call the
parent method first in both functions. For application-specific objects, write
a (de-)serialization function:
void FrameData::serialize( co::DataOStream& os, const uint64_t dirtyBits )
{
    co::Serializable::serialize( os, dirtyBits );
    if( dirtyBits & DIRTY_CAMERA )
        os << _position << _rotation << _modelRotation;
    if( dirtyBits & DIRTY_FLAGS )
        os << _modelID << _renderMode << _colorMode << _quality << _ortho
           << _statistics << _help << _wireframe << _pilotMode << _idle
           << _compression;
    if( dirtyBits & DIRTY_VIEW )
        os << _currentViewID;
    if( dirtyBits & DIRTY_MESSAGE )
        os << _message;
}
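The matching deserialization mirrors the serialization; a sketch assuming the same member names as in the serialize code above:

void FrameData::deserialize( co::DataIStream& is, const uint64_t dirtyBits )
{
    co::Serializable::deserialize( is, dirtyBits );
    if( dirtyBits & DIRTY_CAMERA )
        is >> _position >> _rotation >> _modelRotation;
    if( dirtyBits & DIRTY_FLAGS )
        is >> _modelID >> _renderMode >> _colorMode >> _quality >> _ortho
           >> _statistics >> _help >> _wireframe >> _pilotMode >> _idle
           >> _compression;
    if( dirtyBits & DIRTY_VIEW )
        is >> _currentViewID;
    if( dirtyBits & DIRTY_MESSAGE )
        is >> _message;
}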
Mark dirty data: In each ’setter’ method, call setDirty with the corresponding dirty
bit:
void FrameData::setCameraPosition( const eq::Vector3f& position )
{
    _position = position;
    setDirty( DIRTY_CAMERA );
}
The registration and mapping of co::Serializables is done in the same way as for
co::Objects, which has been described in Section 8.4.
8.5. Barrier
The co::Barrier provides a networked barrier primitive. It is a co::Object used by Equalizer to implement software swap barriers, but it can be used as a generic barrier in application code.
The barrier uses the data distribution to synchronize its data, as well as custom commands to implement the barrier logic. A barrier is a versioned object. Each version can have a different height, and enter requests are automatically grouped together by version.
8.6. ObjectMap
The co::ObjectMap is a specialized co::Serializable which provides automatic distri-
bution and synchronization of co::Objects. All changed objects are committed when
co::ObjectMap is committed, and sync() on slave instances updates mapped objects
to their respective versions.
The objects on slave ObjectMap instances are mapped explicitly, either by pro-
viding an already constructed instance or using implicit object creation through
the co::ObjectFactory. The factory needs to be supplied upon construction of the
co::ObjectMap and implement creation of the desired object types. Implicitly cre-
ated objects are owned by the object map which limits their lifetime. They are
released upon unmapping caused by deregister(), explicit unmapping or destruc-
tion of the slave instance ObjectMap.
In Sequel, the co::ObjectMap is used to distribute and synchronize all objects to
the rendering frames. The InitData and FrameData objects are implicitly known
and managed by Sequel, and other objects might be used in addition. The com-
mit and synchronization of the object map, and consequently all its objects, is
automatically performed in the correct places, most importantly the commit at the
beginning of the frame, and the sync when accessing an object from the renderer.
A. Command Line Options
--eq-help shows all available library command line options and their usage.
--eq-client is used to start render client processes. Starts a resident render client when used without additional arguments, as described in Section 4.3.1. The server appends an undocumented argument to this option which is used by the eq::Client to connect to the server and identify itself.
--eq-layout <layoutName> activates all layouts of the given name on all respective canvases during initialization. This option may be used multiple times.
--eq-gpufilter applies the given regular expression against nodeName:port.device during autoconfiguration and only uses the matching GPUs.
--eq-modelunit <unitValue> is used for scaling the rendered models in all views. The model unit defines the size of the model with respect to the virtual room unit, which is always in meters. The default unit is 1 (1 meter or EQ_M).
--eq-logfile <filename> redirects all Equalizer output to the given file.
B. File Format
The current Equalizer file format is a one-to-one representation of the server's internal data structures. Its purpose is intermediate, that is, it will gradually be replaced by automatic resource detection and configuration. However, the core scalability engine will always use a structure similar to the one currently exposed by the file format.
The file format represents an ASCII serialization of the server. Streaming an eq::server::Server to a lunchbox::Log ostream produces a valid configuration file. Likewise, loading a configuration file produces an eq::server::Server.
The file format uses the same syntactical structure as VRML. If your text editor
supports syntax highlighting and formatting for VRML, you can use this mode for
editing .eqc files.
The configuration file consists of an optional global section and a server configuration. The global section defines default values for various attributes. The server section represents an eq::server::Server.
Global attribute names are composed of the entity, the datatype and the attribute name (e.g., EQ_WINDOW_IATTR_HINT_STEREO).
The entity is the capitalized name of the corresponding section later in the configuration file: connection, config, pipe, window, channel or compound. The connection is used by the server and nodes.
The datatype is one capital letter for the type of the attribute's value: S for strings, C for a character, I for an integer and F for floating-point values. Enumeration values are handled as integers. Strings always have to be surrounded by double quotes ("). A character has to be surrounded by single quotes (').
The attribute name is the capitalized name of the entity's attribute, as discussed in the following sections.
Global attribute values have useful default parameters, which can be overridden
with an environment variable of the same name. For enumeration values the corre-
sponding integer value has to be used. The global values in the config file override
environment variables, and are in turn overridden by the corresponding attributes
sections of the specific entities.
The globals section starts with the token global and an open curly brace ’{’, and
is terminated with a closing curly brace ’}’. Within the braces, globals are set using
the attribute’s name followed by its value. The following attributes are available:
The hostname is the IP address or resolvable host name. A server or node may
have multiple connection descriptions, for example to use a named pipe for local
communications and TCP/IP for remote nodes.
The interface is the IP address or resolvable host name of the adapter to which
multicast traffic is sent.
A server listens on all provided connection descriptions. If no hostname is speci-
fied for a server connection description, it listens on INADDR_ANY, and is therefore
reachable on all network interfaces. If the server’s hostname is specified, the lis-
tening socket is bound only to this address. If any of the given hostnames is not
resolvable, or any port cannot be used, server initialization will fail.
For a node, all connection descriptions are used while trying to establish a con-
nection to the node. When auto-launched by the server, all connection descriptions
of the node are passed to the launched node process, which will cause it to bind to
all provided descriptions.
server
{
    connection # 0-n times, listening connections of the server
    {
        type      TCPIP | SDP | RDMA | PIPE | RSP
        port      unsigned # TCPIP, SDP
        filename  string   # PIPE
        hostname  string
        interface string   # RSP
    }
Equalizer. The host is used to automatically launch remote render clients. For a
description of node and connection attributes please refer to Section B.2.
( node | appNode ) # 1-n times, a system in the cluster
                   # 0|1 appNode: launches render thread within app process
{
    name string
    host string # used to auto-launch render nodes

    connection # 0-n times, possible connections to this node
    {
        type     TCPIP | SDP | PIPE
        port     unsigned
        hostname string
        filename string
    }
    attributes
    {
        thread_model         ASYNC | DRAW_SYNC | LOCAL_SYNC
        launch_command       string      # render client launch command
        launch_command_quote 'character' # command argument quote char
        launch_timeout       unsigned    # timeout in milliseconds
    }
attributes
{
    hint_stereo        OFF | ON | AUTO
    hint_doublebuffer  OFF | ON | AUTO
    hint_decoration    OFF | ON
    hint_fullscreen    OFF | ON
    hint_swapsync      OFF | ON # AGL, WGL only
    hint_drawable      window | pbuffer | FBO | OFF
    hint_statistics    OFF | FASTEST [ON] | NICEST
    hint_grab_pointer  OFF | [ON]
    planes_color       unsigned | RGBA16F | RGBA32F
    planes_alpha       unsigned
    planes_depth       unsigned
    planes_stencil     unsigned
    planes_accum       unsigned
    planes_accum_alpha unsigned
    planes_samples     unsigned
}
{
    hint_statistics OFF | FASTEST [ON] | NICEST
}
}
specify a frustum, the corresponding destination channels will use the sub-frustum
resulting from the view/segment intersection.
A view has a stereo mode, which defines whether the corresponding destination channels update the cyclop eye or the left and right eyes. The stereo mode can be changed at runtime by the application.
A view is a view on the application’s model, in the sense used by the Model-
View-Controller pattern. It can be a scene, viewing mode, viewing position, or any
other representation of the application’s data.
view # 1...n times
{
    name     string
    observer observer-ref
    viewport [ viewport ]
    mode     MONO | STEREO

    wall # frustum description
    {
        bottom_left  [ float float float ]
        bottom_right [ float float float ]
        top_left     [ float float float ]
        type         fixed | HMD
    }
    projection # alternate frustum description, last one wins
    {
        origin   [ float float float ]
        distance float
        fov      [ float float ]
        hpr      [ float float float ]
    }
}
wall
{
    bottom_left  [ float float float ]
    bottom_right [ float float float ]
    top_left     [ float float float ]
    type         fixed | HMD
}
projection
{
    origin   [ float float float ]
    distance float
    fov      [ float float ]
    hpr      [ float float float ]
}
swapbarrier # default swap barrier for all segments of canvas
{
    name       string
    NV_group   OFF | ON | unsigned
    NV_barrier OFF | ON | unsigned
}

wall # frustum description
{
    bottom_left  [ float float float ]
    bottom_right [ float float float ]
    top_left     [ float float float ]
    type         fixed | HMD
}
projection # alternate frustum description, last one wins
{
    origin   [ float float float ]
    distance float
    fov      [ float float ]
    hpr      [ float float float ]
}
swapbarrier { ... } # set as barrier on all dest compounds
}
view_equalizer {} # assign resources to child load_equalizers
load_equalizer    # adapt 2D tiling or DB range of children
{
    mode       2D | DB | VERTICAL | HORIZONTAL
    damping    float   # 0: no damping, 1: no changes
    boundary   [ x y ] # 2D tile boundary
    boundary   float   # DB range granularity
    resistance [ x y ] # 2D tile pixel delta
    resistance float   # DB range delta
    assemble_only_limit float # limit for using dest as src
}
DFR_equalizer # adapt ZOOM to achieve constant framerate
{
    framerate float # target framerate
    damping   float # 0: no damping, 1: no changes
}
framerate_equalizer {} # smoothen window swapbuffer rate (DPlex)
monitor_equalizer {}   # set frame zoom when monitoring other views
tile_equalizer
{
    name string
    size [ int int ] # tile size
}
attributes
{
    stereo_mode                AUTO | QUAD | ANAGLYPH | PASSIVE # default AUTO
    stereo_anaglyph_left_mask  [ RED GREEN BLUE ] # default red
    stereo_anaglyph_right_mask [ RED GREEN BLUE ] # default green blue
}
child-compounds
outputframe
{
    name   string
    buffer [ COLOR DEPTH ]
    type   texture | memory
}
inputframe
{
    name string # corresponding output frame
}
outputtiles
{
    name string
    size [ int int ] # tile size
}
inputtiles
{
    name string # corresponding output tiles
}
}

channel-ref:         'string' | '(' channel-segment-ref ')'
channel-segment-ref: ( canvas-ref ) segment-ref ( layout-ref ) view-ref