DOI:
10.1145/341800
SPAA '00: Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
ACM 2000 Proceeding
Chairmen:
  • Gary Miller
  • Shang-Hua Teng
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
SPAA00: ACM Symposium on Parallel Algorithms and Architectures, Bar Harbor, Maine, USA, July 9-13, 2000
ISBN:
978-1-58113-185-7
Published:
09 July 2000
Article
Free
The data locality of work stealing

This paper studies the data locality of the work-stealing scheduling algorithm on hardware-controlled shared-memory machines. We present lower and upper bounds on the number of cache misses using work stealing, and introduce a locality-guided work-...

Article
Free
Scheduling Cilk multithreaded parallel programs on processors of different speeds

We study the problem of executing parallel programs, in particular Cilk programs, on a collection of processors of different speeds. We consider a model in which each processor maintains an estimate of its own speed, where communication between ...

Article
Free
Optimal schedules for data-parallel cycle-stealing in networks of workstations (extended abstract)

We refine the model underlying our prior work on scheduling cycle-stealing opportunities in NOWs [5, 16], obtaining a model wherein the scheduling guidelines of [16] produce optimal schedules for every such opportunity. Although computing optimal ...

Article
Free
Diffusive load balancing schemes on heterogeneous networks

Up to now, diffusive load balancing schemes have only been developed for homogeneous networks. We generalize existing diffusion schemes, in order to deal with heterogeneous networks. In these networks, every processor can have arbitrary computing power, ...

Article
Free
Interprocessor communication with memory constraints

Many parallel applications require periodic redistribution of workloads and associated data. In a distributed memory computer, this redistribution can be difficult if limited memory is available for receiving messages. We propose a model for optimizing ...

Article
Free
Efficient on-line communication in cellular networks

In this paper we consider communication issues arising in mobile networks that utilize Frequency Division Multiplexing (FDM) technology. In such networks, many users within the same geographical region can communicate simultaneously with other users of ...

Article
Free
Connection caching under various models of communication

Motivated by Web applications, we recently introduced the following theoretical model for connection-caching: Each host on a network can maintain (cache) a limited number of connections to other hosts. A message can be transmitted from one host to ...

Article
Free
Fault tolerant networks with small degree

In this paper, we study the design of fault tolerant networks for arrays and meshes by adding redundant nodes and edges. For a target graph G (linear array or mesh in this paper), a graph G′ is called a κ-fault-tolerant graph of G if when we remove any ...

Article
Free
Generalized connection caching

Cohen et al. [5] recently initiated the theoretical study of connection caching in the world-wide web. They extensively studied uniform connection caching, where the establishment cost is uniform for all connections [5, 6]. They showed that ordinary ...

Article
Free
Comparing the effectiveness of fine-grain memory caching against page migration/replication in reducing traffic in DSM clusters

In this paper, we compare and contrast two techniques to improve capacity/conflict miss traffic in CC-NUMA DSM clusters. Page migration/replication optimizes read-write accesses to a page used by a single processor by migrating the page to that ...

Article
Free
Asynchronous scheduling of redundant disk arrays

Random redundant allocation of data to parallel disk arrays can be exploited to achieve low access delays. New algorithms are proposed which improve the previously known shortest queue algorithm by systematically exploiting that scheduling decisions can ...

Article
Free
Infinite parallel job allocation (extended abstract)

In recent years, the task of allocating jobs to servers has been studied with the “balls and bins” abstraction. Results in this area exploit the large decrease in maximum load that can be achieved by allowing each job (ball) a little freedom in choosing ...

Article
Free
Data management in hierarchical bus networks

A hierarchical bus network T = (V, E) uses hierarchically, tree-like connected buses as a communication network. New communication technologies like SCI (Scalable Coherent Interface) (see, e.g., [6, 7]) make such networks very attractive, because they ...

Article
Free
Efficient, distributed data placement strategies for storage area networks (extended abstract)

In the last couple of years a dramatic growth of enterprise data storage capacity can be observed. As a result, new strategies have been sought that allow servers and storage being centralized to better manage the explosion of data and the overall cost ...

Article
Free
Broadcast scheduling optimization for heterogeneous cluster systems

Network of workstation (NOW) is a cost-effective alternative to massively parallel supercomputers. As commercially available off-the-shelf processors become cheaper and faster, it is now possible to build a PC or workstation cluster that provides high ...

Article
Free
DCAS-based concurrent deques

The computer industry is currently examining the use of strong synchronization operations such as double compare-and-swap (DCAS) as a means of supporting non-blocking synchronization on tomorrow's multiprocessor machines. However, before such a strong ...

Article
Free
A no-busy-wait balanced tree parallel algorithmic paradigm

Suppose that a parallel algorithm can include any number of parallel threads. Each thread can proceed without ever having to busy wait to another thread. A thread can proceed till its termination, but no new threads can be formed. What kind of problems ...

Article
Free
Algorithmic foundations for a parallel vector access memory system

This paper presents mathematical foundations for the design of a memory controller subcomponent that helps to bridge the processor/memory performance gap for applications with strided access patterns. The Parallel Vector Access (PVA) unit exploits the ...

Article
Free
An experimental study of a simple, distributed edge coloring algorithm

We conduct an experimental analysis of a distributed, randomized algorithm for edge coloring simple undirected graphs. The algorithm is extremely simple, yet, according to the probabilistic analysis, it computes nearly optimal colorings very quickly [12]...

Article
Free
Multithreaded algorithms for the fast Fourier transform

In this paper we present fine-grained multithreaded algorithms and implementations for the Fast Fourier Transform (FFT) problem. The FFT problem has been formulated using two distinct approaches based on the dataflow concepts. The first approach, ...

Article
Free
A (2.954 + ε)n oblivious routing algorithm on 2D meshes

We present a deterministic, oblivious, permutation-routing algorithm on the n × n mesh of constant queue-size. It runs in (2.954+ε)n steps for any ε > 0. Previously, an O(n)-time algorithm was known but with no nontrivial upper bounds on the constant ...

Article
Free
VLSI layout and packaging of butterfly networks

We present a scheme for optimal VLSI layout and packaging of butterfly networks under the Thompson model, the multilayer grid model, and the hierarchical layout model. We show that when L layers of wires are available, an N-node butterfly network can be ...

Article
Free
Compact, multilayer layout for butterfly fat-tree

Article
Free
An efficient self-simulation algorithm for reconfigurable meshes

A reconfigurable mesh (RM) is the two-dimensional mesh-connected computer enhanced with a reconfigurable bus system. The bus system is used to dynamically obtain various interconnection patterns among the processors during the execution of programs. ...

Contributors:
  • Carnegie Mellon University
  • University of Southern California

Acceptance Rates

SPAA '00 Paper Acceptance Rate: 24 of 45 submissions, 53%
Overall Acceptance Rate: 447 of 1,461 submissions, 31%

Year       Submitted  Accepted  Rate
SPAA '19         109        34   31%
SPAA '18         120        36   30%
SPAA '17         127        31   24%
SPAA '15         131        31   24%
SPAA '14         122        30   25%
SPAA '13         130        31   24%
SPAA '03         106        38   36%
SPAA '01          93        34   37%
SPAA '00          45        24   53%
SPAA '99          90        26   29%
SPAA '98          84        30   36%
SPAA '97          97        32   33%
SPAA '96         106        39   37%
SPAA '95         101        31   31%
Overall        1,461       447   31%