Improving parallel shear-warp volume rendering on shared address space multiprocessors

Authors:

Dongming Jiang and Jaswinder Pal Singh

PPOPP '97: Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

June 1997

Pages 252 - 263

https://doi.org/10.1145/263764.263798

Published: 21 June 1997

Abstract

This paper presents a new parallel volume rendering algorithm and implementation, based on the shear-warp factorization, for shared address space multiprocessors. Starting from an existing parallel shear-warp renderer, we use increasingly detailed performance measurements on real machines and simulators to understand performance bottlenecks. This leads us to a new parallel implementation that substantially outperforms and out-scales the old one on a range of shared address space platforms, from bus-based centralized memory machines to hardware-coherent distributed memory machines to networks of computers connected by page-based shared virtual memory. The results demonstrate that real-time volume rendering is promising on general-purpose multiprocessors, and illustrate the utility of tool hierarchies in conjunction with algorithmic and application knowledge to understand memory system interactions and improve parallel algorithms.

Recommendations

Analysis of a Parallel Volume Rendering System Based on the Shear-Warp Factorization

This paper presents a parallel volume rendering algorithm that can render a 256 × 256 × 225 voxel medical data set at over 15 Hz and a 512 × 512 × 334 voxel data set at over 7 Hz on a 32-processor Silicon Graphics Challenge. The algorithm achieves these results ...

Parallel Volume Rendering on a Shared-Memory Multiprocessor

Published In

PPOPP '97: Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

June 1997

287 pages

ISBN:0897919068

DOI:10.1145/263764

Chairmen: Rob Schreiber (Hewlett-Packard Labs, Palo Alto, CA), Keshav Pingali (Cornell Univ., Ithaca, NY)
Editor: Michael A. Berman

ACM SIGPLAN Notices Volume 32, Issue 7
July 1997
287 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/263767
Chairmen: Rob Schreiber (Hewlett-Packard Labs, Palo Alto, CA), Keshav Pingali (Cornell Univ., Ithaca, NY)
Editor: A. Michael Berman

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

PPoPP97

Sponsor:

SIGPLAN

PPoPP97: Principles & Practices of Parallel Programming

June 18 - 21, 1997

Las Vegas, Nevada, USA

Acceptance Rates

PPOPP '97 Paper Acceptance Rate 26 of 86 submissions, 30%;

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics

Total Citations: 9
Total Downloads: 350
Downloads (Last 12 months): 51
Downloads (Last 6 weeks): 8

Cited By

Jiang D, Singh J (1999). Scaling application performance on a cache-coherent multiprocessor. ACM SIGARCH Computer Architecture News 27:2 (305-316). DOI: 10.1145/307338.301005. Online publication date: 1-May-1999
https://dl.acm.org/doi/10.1145/307338.301005
Jiang D, Singh J, Gottlieb A, Dally W (1999). Scaling application performance on a cache-coherent multiprocessor. Proceedings of the 26th annual international symposium on Computer architecture (305-316). DOI: 10.1145/300979.301005. Online publication date: 2-May-1999
https://dl.acm.org/doi/10.1145/300979.301005
Dongming Jiang, Singh J (1999). Scaling application performance on a cache-coherent multiprocessors. Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367) (305-316). DOI: 10.1109/ISCA.1999.765960. Online publication date: 1999
https://doi.org/10.1109/ISCA.1999.765960
Chaussumier F, Desprez F, Loi M (1999). Efficient Load-Balancing and Communication Overlap in Parallel Shear-Warp Algorithm on a Cluster of PCs. Euro-Par’99 Parallel Processing (570-577). DOI: 10.1007/3-540-48311-X_81. Online publication date: 6-Aug-1999
https://doi.org/10.1007/3-540-48311-X_81
Jiang D, Singh J (1998). A methodology and an evaluation of the SGI Origin2000. ACM SIGMETRICS Performance Evaluation Review 26:1 (171-181). DOI: 10.1145/277858.277902. Online publication date: 1-Jun-1998
https://dl.acm.org/doi/10.1145/277858.277902
Jiang D, Singh J, Vernon M, Gibson G, Latouche G (1998). A methodology and an evaluation of the SGI Origin2000. Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems (171-181). DOI: 10.1145/277851.277902. Online publication date: 1-Jun-1998
https://dl.acm.org/doi/10.1145/277851.277902
Jiang D, Singh J (1997). Improving parallel shear-warp volume rendering on shared address space multiprocessors. ACM SIGPLAN Notices 32:7 (252-263). DOI: 10.1145/263767.263798. Online publication date: 21-Jun-1997
https://dl.acm.org/doi/10.1145/263767.263798
Jiang D, Shan H, Singh J (1997). Application restructuring and performance portability on shared virtual memory and hardware-coherent multiprocessors. ACM SIGPLAN Notices 32:7 (217-229). DOI: 10.1145/263767.263792. Online publication date: 21-Jun-1997
https://dl.acm.org/doi/10.1145/263767.263792
Jiang D, Shan H, Singh J, Schreiber R, Pingali K (1997). Application restructuring and performance portability on shared virtual memory and hardware-coherent multiprocessors. Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming (217-229). DOI: 10.1145/263764.263792. Online publication date: 21-Jun-1997
https://dl.acm.org/doi/10.1145/263764.263792
