Adaptive Binary Sorting Schemes and Associated Interconnection Networks
Many routing problems in parallel processing, such as concentration and permutation problems, can be cast as sorting problems. In this paper, we consider the problem of sorting on a new model, called an adaptive sorting network. We show that any ...
The Impact of Parallel Loop Scheduling Strategies on Prefetching in a Shared Memory Multiprocessor
Trace-driven simulations of numerical Fortran programs are used to study the impact of the parallel loop scheduling strategy on data prefetching in a shared memory multiprocessor with private data caches. The simulations indicate that to maximize memory ...
Distributed Performance Monitoring: Methods, Tools, and Applications
A method for analyzing the functional behavior and the performance of programs in distributed systems is presented. We use hybrid monitoring, a technique which combines advantages of both software monitoring and hardware monitoring. The paper contains a ...
Scalability of Parallel Algorithm-Machine Combinations
Scalability has become an important consideration in parallel algorithm and machine designs. The word scalable, or scalability, has been widely and often used in the parallel processing community. However, there is no adequate, commonly accepted ...
Performance Analysis of Two Different Algorithms for Ethernet-FDDI Interconnection
Fiber Distributed Data Interface (FDDI) local area networks (LAN's) are used either as high-speed links between computers and peripherals, or as backbones for lower-speed LAN's, such as Ethernet and Token Ring. The availability of such a high-speed ...
On Probabilistic Diagnosis of Multiprocessor Systems Using Multiple Syndromes
This paper addresses the distributed self-diagnosis of a multiprocessor/multicomputer system based on fault syndromes formed by comparison testing. The authors show that by using multiple fault syndromes, it is possible to achieve significantly better ...
Analysis of Asynchronous Polynomial Root Finding Methods on a Distributed Memory Multicomputer
We have studied various implementations of iterative polynomial root finding methods on a distributed memory multicomputer. These methods are based on the construction of a sequence of approximations that converge to the set of zeros. The synchronous ...
Partitioned Encoding Schemes for Algorithm-Based Fault Tolerance in Massively Parallel Systems
Considers the applicability of algorithm-based fault tolerance (ABFT) to massively parallel scientific computation. Existing ABFT schemes can provide effective fault tolerance at a low cost for computation on matrices of moderate size; however, the ...
Computing Network Flow on a Multiple Processor Pipeline
We demonstrate the feasibility of a distributed implementation of the Goldberg-Tarjan algorithm for finding the maximum flow in a network. Unlike other parallel implementations of this algorithm, where the network graph is partitioned among many ...
Pipelining and Bypassing in a VLIW Processor
This short note describes issues involved in the bypassing mechanism for a very long instruction word (VLIW) processor and its relation to the pipeline structure of the processor. The authors first describe the pipeline structure of their processor and ...
Embedding Binary X-Trees and Pyramids in Processor Arrays with Spanning Buses
We study the problem of network embeddings in 2-D array architectures in which each row and column of processors are interconnected by a bus. These architectures are especially attractive if optical buses are used that allow simultaneous access by ...