DOI: 10.1109/PACT.2005.11

Characterization of TCC on Chip-Multiprocessors

Published: 17 September 2005

Abstract

Transactional Coherence and Consistency (TCC) is a novel coherence scheme for shared memory multiprocessors that uses programmer-defined transactions as the fundamental unit of parallel work, synchronization, coherence, and consistency. TCC has the potential to simplify parallel program development and optimization by providing a smooth transition from sequential to parallel programs. In this paper, we study the implementation of TCC on chip-multiprocessors (CMPs). We explore design alternatives such as the granularity of state tracking, double buffering, and write-update and write-invalidate protocols. Furthermore, we characterize the performance of TCC in comparison to conventional snoopy cache coherence (SCC) using parallel applications optimized for each scheme. We conclude that the two coherence schemes perform similarly, with each scheme having a slight advantage for some applications. The bandwidth requirements of TCC are slightly higher but well within the capabilities of CMP systems. Also, we find that overflow of speculative state can be effectively handled by a simple victim cache. Our results suggest TCC can provide its programming advantages without compromising the performance expected from well-tuned parallel applications.
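
As a rough illustration of the transaction model the abstract describes, the Python sketch below models transactions that buffer their writes locally and publish them at commit, at which point any other transaction that has already read one of the committed addresses must restart. It is a software toy under stated assumptions, not the hardware design evaluated in the paper: the names `Transaction`, `commit`, `read_set`, and `write_buf` are hypothetical, state is tracked per address, and double buffering, the write-update versus write-invalidate choice, and overflow handling are all left out.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Speculative state one processor keeps for its current transaction."""
    cpu: int
    read_set: set = field(default_factory=set)     # addresses read from shared memory
    write_buf: dict = field(default_factory=dict)  # address -> buffered new value

    def load(self, memory, addr):
        # A transaction sees its own buffered writes first.
        if addr in self.write_buf:
            return self.write_buf[addr]
        self.read_set.add(addr)
        return memory.get(addr, 0)

    def store(self, addr, value):
        # Writes stay local until commit; shared memory is untouched.
        self.write_buf[addr] = value


def commit(memory, committer, others):
    """Publish the committer's writes and return the cpus that must restart."""
    memory.update(committer.write_buf)
    # Assumption: a conflict is any overlap between the committed write
    # addresses and another transaction's read set.
    return [t.cpu for t in others
            if t.read_set.intersection(committer.write_buf)]


memory = {"x": 1, "y": 2}
t0 = Transaction(cpu=0)
t1 = Transaction(cpu=1)
t2 = Transaction(cpu=2)

t0.store("x", t0.load(memory, "x") + 10)  # t0 reads x, buffers x = 11
t1.load(memory, "x")                      # t1 reads the old x
t2.load(memory, "y")                      # t2 touches only y

violated = commit(memory, t0, [t1, t2])
```

After the commit, shared memory holds the new value of `x`, and only the transaction on cpu 1 is reported as violated, because it read `x` before the commit made the new value visible; the transaction on cpu 2 is unaffected.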



Published In

PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
September 2005
350 pages
ISBN: 076952429X

Publisher

IEEE Computer Society

United States


Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%


Cited By

  • (2015) Composable Memory Transactions for Java Using a Monadic Intermediate Language. Proceedings of the 19th Brazilian Symposium on Programming Languages - Volume 9325, pp. 128-142. DOI: 10.1007/978-3-319-24012-1_10. Online publication date: 24-Sep-2015
  • (2011) Transactional conflict decoupling and value prediction. Proceedings of the international conference on Supercomputing, pp. 33-42. DOI: 10.1145/1995896.1995904. Online publication date: 31-May-2011
  • (2010) Transactional memory. Journal of Parallel and Distributed Computing, 70:10, pp. 993-1008. DOI: 10.1016/j.jpdc.2010.06.006. Online publication date: 1-Oct-2010
  • (2010) Lightweight Transactional Memory systems for NoCs based architectures. Journal of Parallel and Distributed Computing, 70:10, pp. 1024-1041. DOI: 10.1016/j.jpdc.2010.02.007. Online publication date: 1-Oct-2010
  • (2009) Using a configurable processor generator for computer architecture prototyping. Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 358-369. DOI: 10.1145/1669112.1669159. Online publication date: 12-Dec-2009
  • (2009) EazyHTM. Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 145-155. DOI: 10.1145/1669112.1669132. Online publication date: 12-Dec-2009
  • (2009) On-chip transactional memory system for FPGAs using TCC model. Proceedings of the 6th FPGAworld Conference, pp. 39-43. DOI: 10.1145/1667520.1667525. Online publication date: 10-Sep-2009
  • (2009) TransMetric. Proceedings of the 23rd international conference on Supercomputing, pp. 491-492. DOI: 10.1145/1542275.1542345. Online publication date: 8-Jun-2009
  • (2009) A Domain Specific Language for Composable Memory Transactions in Java. Proceedings of the IFIP TC 2 Working Conference on Domain-Specific Languages, pp. 170-186. DOI: 10.1007/978-3-642-03034-5_9. Online publication date: 2-Jul-2009
  • (2008) Transactional memory. Communications of the ACM, 51:7, pp. 80-88. DOI: 10.1145/1364782.1364800. Online publication date: 1-Jul-2008
