Parallel Algorithms for Regular Architectures offers an extensive collection of optimal and efficient algorithms for solving problems on sets of processors configured as a mesh connected computer or a pyramid computer. In addition, it provides valuable insights into the parallel algorithm design process.
The algorithms can be adapted to multiprocessor machines with varying degrees of granularity and a variety of processor configurations. They can be utilized inside a single VLSI chip, on special-purpose parallel machines, on intermediate-size machines, or on large parallel computers, such as the IBM BlueGene or Cray T series, that are routinely used for solving complex scientific and engineering problems.

Basic algorithms, such as sorting and matrix multiplication, are described, as are algorithms to solve fundamental problems in image processing, computational geometry, and graph theory. A consistent approach, based on algorithmic techniques, shows how to apply paradigms such as divide-and-conquer and how to exploit the use of data movement operations. Since many of the algorithms were originally created by the authors using these techniques, the reader can actually see how efficient parallel algorithms are developed by following the design process.

Parallel Algorithms for Regular Architectures will be useful to researchers as well as practitioners who need to implement efficient parallel programs. The algorithms and operations can be incorporated into a variety of applications, and the design and analysis techniques can be exploited in an even greater range. Researchers and practitioners working in the areas of parallel computing, supercomputing, and high-performance computing will find useful information.
Simple arrays are often modified into more complex dynamic structures in order to increase the efficiency of applications. Two examples of these dynamic array structures are the telescoping wedge and adaptive blocks. Implementing high performance versions of these structures is often difficult because of their complexity. Challenges include efficiently locating data, maintaining cache use, and moving efficiently through the structure. Difficulties in implementation are especially acute when the program is created for a parallel computer. Challenges unique to parallel computers include communication and load balancing. This project addresses these challenges by investigating the implementation of high performance versions of the telescoping wedge and adaptive blocks. The above challenges were met for each data structure. For the telescoping wedge, efficient versions of two types of wedge were implemented. Both types of wedge allowed the solution of problems which were previously considered infeasible by reducing the amount of computational resources required. For example, the size of the problem solvable for the 3-arm bandit was expanded from n = 20 to n = 200. For adaptive blocks, a flexible and efficient general purpose tool was developed. The flexibility of the tool was demonstrated by the ease with which changes were made, such as switching between a large supercomputing platform and a home computer, and adding a new grid shape. The efficiency was demonstrated by the small percentage of time taken by the tool in an example application.
This note is a status report on the fastest known isotonic regression algorithms for various Lp metrics and partial orderings. The metrics considered are unweighted and weighted L0, L1, L2, and L∞. The partial orderings considered are linear, tree, d-dimensional grids, points in d-dimensional space with componentwise ordering, and arbitrary orderings (posets). Throughout, “fastest” means for the worst case in O-notation, not in any measurements of implementations. This note will occasionally be updated as better algorithms are developed. Citations are to the first paper to give a correct algorithm with the given time bound, though in some cases two are cited if they appeared nearly contemporaneously.
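As a concrete instance of the simplest case surveyed here (L2 on a linear order), the classic pool-adjacent-violators algorithm (PAVA) computes the weighted L2 isotonic regression in linear time. The sketch below is illustrative only and is not drawn from any of the surveyed papers; the function name is hypothetical.

```python
def isotonic_l2(y, w=None):
    """Weighted L2 isotonic regression on a linear order via the
    pool-adjacent-violators algorithm (PAVA).  Illustrative sketch."""
    n = len(y)
    w = [1.0] * n if w is None else list(w)
    # Each block stores [weighted mean, total weight, point count].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge adjacent blocks while the isotonic constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand the blocks back into one fitted value per input point.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit
```

Each merged block receives its weighted mean, which is what makes the result the L2-optimal isotonic fit on a linear order.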
This paper gives an approach for determining isotonic regressions for data at points in multidimensional space, with the ordering given by domination. Recent algorithmic advances for 2-dimensional isotonic regressions have made them useful for significantly larger data sets, and here we provide an advance for dimensions 3 and larger. Given a set V of n d-dimensional points, it is shown that an isotonic regression on V can be determined in Θ̃(n²), Θ̃(n³), and Θ̃(n) time for the L1, L2, and L∞ metrics, respectively. This improves upon previous results by a factor of Θ̃(n). The core of the approach is in extending the regression to a set of points V′ ⊃ V where the domination ordering on V′ can be represented with relatively few edges.
Algorithms are given for determining $L_\infty$ isotonic regression of weighted data. For a linear order, grid in multidimensional space, or tree, of $n$ vertices, optimal algorithms are given, taking $\Theta(n)$ time. These improve upon previous algorithms by a factor of $\Omega(\log n)$. For vertices at arbitrary positions in $d$-dimensional space a $\Theta(n \log^{d-1} n)$ algorithm employs iterative sorting to yield the functionality of a multidimensional structure while using only $\Theta(n)$ space. The algorithms utilize a new non-constructive feasibility test on a rendezvous graph, with bounded error envelopes at each vertex.
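For intuition about the linear-order case, the unweighted special case has a well-known closed form: the L∞ isotonic regression is the pointwise midpoint of the running prefix maximum and the running suffix minimum. The sketch below illustrates that unweighted special case only; it is not the paper's weighted algorithm or its rendezvous-graph feasibility test, and the function name is hypothetical.

```python
def isotonic_linf_unweighted(y):
    """L-infinity isotonic regression of unweighted data on a linear
    order: midpoint of prefix max and suffix min, in Theta(n) time.
    Illustrative sketch of the classic unweighted formula."""
    n = len(y)
    prefix_max = [0.0] * n
    suffix_min = [0.0] * n
    running = float('-inf')
    for i, v in enumerate(y):          # left-to-right prefix maximum
        running = max(running, v)
        prefix_max[i] = running
    running = float('inf')
    for i in range(n - 1, -1, -1):     # right-to-left suffix minimum
        running = min(running, y[i])
        suffix_min[i] = running
    return [(a + b) / 2 for a, b in zip(prefix_max, suffix_min)]
```

The midpoint splits the largest order violation evenly, which is what makes the maximum residual as small as possible.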
The author examines the problem of locating and allocating large fault-free subsystems in multiuser massively parallel computer systems. Since the allocation schemes used in such large systems cannot allocate all possible subsystems, a reduction in fault tolerance is experienced. The effects of different allocation methods, including the buddy and Gray-coded buddy schemes, for the allocation of subsystems in the hypercube and in the two-dimensional mesh and torus are analyzed. Both worst-case and expected-case performance are studied. Generalizing the buddy and Gray-coded schemes, a family of allocation schemes which exhibit a significant improvement in fault tolerance over the existing schemes and which use relatively few additional resources is introduced. For purposes of comparison, the behavior of the various schemes on the allocation of subsystems of 2^18 processors in the hypercube, mesh, and torus consisting of 2^20 processors is studied. The methods involve a combination of anal...
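To see why such schemes cannot allocate all possible subsystems, consider a minimal sketch of a classic buddy-style allocator (hypothetical code, not the author's): it only recognizes subcubes whose processor addresses form an aligned power-of-two block, so a fault-free subcube that is not aligned goes unused, reducing fault tolerance.

```python
def buddy_allocate(free, total_dim, k):
    """Buddy-style allocation of a k-dimensional subcube in a
    2**total_dim-processor hypercube.  Only aligned blocks of 2**k
    consecutive addresses are considered, which is why buddy schemes
    miss some fault-free subcubes.  Illustrative sketch."""
    size = 1 << k
    for base in range(0, 1 << total_dim, size):
        # Accept the block only if every processor in it is free/fault-free.
        if all(free[base + i] for i in range(size)):
            for i in range(size):
                free[base + i] = False   # mark the block as allocated
            return base                  # base address of the subcube
    return None                          # no aligned block available
```

For example, with processor 0 faulty in an 8-processor hypercube, a 4-processor request can only be satisfied by the aligned block starting at address 4, even though other fault-free subcubes exist.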
We present a scalable, high-performance solution to multidimensional recurrences that arise in adaptive statistical designs. Adaptive designs are an important class of learning algorithms for a stochastic environment, and we focus on the problem of optimally assigning patients to treatments in clinical trials. While adaptive designs have significant ethical and cost advantages, they are rarely utilized because of the complexity of optimizing and analyzing them. Computational challenges include massive memory requirements, few calculations per memory access, and multiply-nested loops with dynamic indices. We analyze the effects of various parallelization options, and while standard approaches do not work well, with effort an efficient, highly scalable program can be developed. This allows us to solve problems thousands of times more complex than those solved previously, which helps make adaptive designs practical. Further, our work applies to many other problems involving neighbor re...
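The kind of multiply-nested recurrence involved can be illustrated at toy scale by Bayesian backward induction for a two-armed Bernoulli bandit with uniform priors. This serial memoized sketch is hypothetical and far from the scalable parallel program the paper describes; it only shows the state space (success/failure counts per arm) whose size explodes with the trial length.

```python
from functools import lru_cache

def optimal_expected_successes(n):
    """Expected number of successes in n pulls of a two-armed
    Bernoulli bandit under the Bayes-optimal adaptive design, with
    independent uniform (Beta(1,1)) priors on each arm.
    Illustrative serial sketch of the underlying recurrence."""
    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2, left):
        if left == 0:
            return 0.0
        p1 = (s1 + 1) / (s1 + f1 + 2)   # posterior mean, arm 1
        p2 = (s2 + 1) / (s2 + f2 + 2)   # posterior mean, arm 2
        # Expected total successes if the next patient gets arm 1 / arm 2.
        pull1 = (p1 * (1 + value(s1 + 1, f1, s2, f2, left - 1))
                 + (1 - p1) * value(s1, f1 + 1, s2, f2, left - 1))
        pull2 = (p2 * (1 + value(s1, f1, s2 + 1, f2, left - 1))
                 + (1 - p2) * value(s1, f1, s2, f2 + 1, left - 1))
        return max(pull1, pull2)
    return value(0, 0, 0, 0, n)
```

Even this toy version has Θ(n⁴) reachable states, which hints at the massive memory requirements and dynamic loop indices the paper grapples with at scale.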
The Space Weather Modeling Framework (SWMF) provides a high‐performance flexible framework for physics‐based space weather simulations, as well as for various space physics applications. The SWMF integrates numerical models of the Solar Corona, Eruptive Event Generator, Inner Heliosphere, Solar Energetic Particles, Global Magnetosphere, Inner Magnetosphere, Radiation Belt, Ionosphere Electrodynamics, and Upper Atmosphere into a high‐performance coupled model. The components can be represented with alternative physics models, and any physically meaningful subset of the components can be used. The components are coupled to the control module via standardized interfaces, and an efficient parallel coupling toolkit is used for the pairwise coupling of the components. The execution and parallel layout of the components is controlled by the SWMF. Both sequential and concurrent execution models are supported. The SWMF enables simulations that were not possible with the individual physics mo...
... Global models that aim to resolve motions with a vertical scale of order 100 m need to include nonhydrostatic effects (Daley, 1988). ... Daley, R., 1988: The normal modes of the spherical non-hydrostatic equations with applications to the filtering of acoustic modes. ...
This paper gives algorithms for determining isotonic median regressions (i.e., isotonic regression using the L1 metric) satisfying order constraints given by various ordered sets. For a rooted tree the regression can be found in Θ(n log n) time, while for a star it can be found in Θ(n) time, where n is the number of vertices. For bivariate data, when
Optimal designs are presented for experiments in which sampling is carried out in stages. There are two Bernoulli populations and it is assumed that the outcomes of the previous stage are available before the sampling design for the next stage is determined. At each ...

And 281 more

A talk about the evolution of goals/concerns of parallel models and algorithms, including cellular automata, mesh connected computers, reconfigurable meshes, and power-constrained algorithms for mesh connected computers. It mentions zebra networks and rat algorithms, along with more mundane computers such as the BlueGene and Earth Simulator, and abstract models such as PRAMs and quantum computers.