How “hard” is thread partitioning and how “bad” is a list scheduling based partitioning algorithm?

Authors:

Xinan Tang,

Guang R. Gao

SPAA '98: Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures

Pages 130 - 139

https://doi.org/10.1145/277651.277679

Published: 01 June 1998

Index Terms

1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Communications management
        Process management

Published In

SPAA '98: Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures

June 1998

312 pages

ISBN:0897919890

DOI:10.1145/277651

Chairmen: Gary Miller (Carnegie Mellon Univ., Pittsburgh, PA), Phillip B. Gibbons (Bell Labs)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Conference

SPAA/PODC98

SPAA/PODC98: 10th Annual ACM Symposium on Parallel Algorithms and Architectures/Symposium on Principles of Distributed Computing

June 28 - July 2, 1998

Puerto Vallarta, Mexico

Acceptance Rates

SPAA '98 Paper Acceptance Rate 30 of 84 submissions, 36%;

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Bibliometrics

Total Citations: 9

Total Downloads: 758

Downloads (Last 12 months): 23

Downloads (Last 6 weeks): 4

Reflects downloads up to 01 Sep 2024

Cited By

Wang J, Cheng H, Hua B, Tang X, Gschwind M, Nicolau A, Salapura V, Moreira J (2009). Practice of parallelizing network applications on multi-core architectures. Proceedings of the 23rd international conference on Supercomputing, pages 204-213. DOI: 10.1145/1542275.1542307. Online publication date: 8-Jun-2009.
https://dl.acm.org/doi/10.1145/1542275.1542307

Cheng H, Chen Z, Hua B, Tang X, Chatterjee S, Scott M (2008). Scalable packet classification using interpreting. Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, pages 33-42. DOI: 10.1145/1345206.1345214. Online publication date: 20-Feb-2008.
https://dl.acm.org/doi/10.1145/1345206.1345214

Liu D, Chen Z, Hua B, Yu N, Tang X (2008). High-performance packet classification algorithm for multithreaded IXP network processor. ACM Transactions on Embedded Computing Systems, 7:2, pages 1-25. DOI: 10.1145/1331331.1331340. Online publication date: 29-Jan-2008.
https://dl.acm.org/doi/10.1145/1331331.1331340

Liu D, Hua B, Hu X, Tang X, Hong S, Wolf W, Flautner K, Kim T (2006). High-performance packet classification algorithm for many-core and multithreaded network processor. Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, pages 334-344. DOI: 10.1145/1176760.1176801. Online publication date: 22-Oct-2006.
https://dl.acm.org/doi/10.1145/1176760.1176801

Márquez A, Gao G (2003). CARE: Overview of an Adaptive Multithreaded Architecture. High Performance Computing, pages 26-38. DOI: 10.1007/978-3-540-39707-6_3. Online publication date: 2003.
https://doi.org/10.1007/978-3-540-39707-6_3

Shen K, Tang H, Yang T, Pancake C (1999). Adaptive two-level thread management for fast MPI execution on shared memory machines. Proceedings of the 1999 ACM/IEEE conference on Supercomputing, pages 49-es. DOI: 10.1145/331532.331581. Online publication date: 1-Jan-1999.
https://dl.acm.org/doi/10.1145/331532.331581

Najjar W, Lee E, Gao G (1999). Advances in the dataflow computational model. Parallel Computing, 25:13-14, pages 1907-1929. DOI: 10.1016/S0167-8191(99)00070-8. Online publication date: 1-Dec-1999.
https://dl.acm.org/doi/10.1016/S0167-8191%2899%2900070-8

Tang X, Gao G (1999). Automatically Partitioning Threads for Multithreaded Architectures. Journal of Parallel and Distributed Computing, 58:2, pages 159-189. DOI: 10.1006/jpdc.1999.1551. Online publication date: 1-Aug-1999.
https://dl.acm.org/doi/10.1006/jpdc.1999.1551

Cosnard M, Jeannot E, Tao Yang (1998). Symbolic partitioning and scheduling of parameterized task graphs. Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250), pages 428-434. DOI: 10.1109/ICPADS.1998.741109. Online publication date: 1998.
https://doi.org/10.1109/ICPADS.1998.741109
