DOI: 10.1145/291069.291046

UTLB: a mechanism for address translation on network interfaces

Published: 01 October 1998

Abstract

An important aspect of a high-speed network system is the ability to transfer data directly between the network interface and application buffers. Such a direct data path requires the network interface to "know" the virtual-to-physical address translation of a user buffer, i.e., the physical memory location of the buffer. This paper presents an efficient address translation architecture, User-managed TLB (UTLB), which eliminates system calls and device interrupts from the common communication path. UTLB also supports application-specific policies to pin and unpin application memory. We report micro-benchmark results for an implementation on Myrinet PC clusters. A trace-driven analysis is used to compare the UTLB approach with the interrupt-based approach. It is also used to study the effects of UTLB cache size, associativity, and prefetching. Our results show that the UTLB approach delivers robust performance with relatively small translation cache sizes.
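
To make the idea of a translation cache with configurable size and associativity concrete, the following is a minimal sketch, not the paper's implementation. It assumes a 4 KB page size, LRU replacement within each set, and a plain dictionary (`page_table`) standing in for a host-memory table of pinned pages; the class name `TranslationCache` and all of its fields are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size; the abstract does not state one


class TranslationCache:
    """Illustrative set-associative cache of virtual-page to physical-page entries."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One ordered map per set; insertion order doubles as LRU order.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def lookup(self, vaddr, page_table):
        """Translate a virtual address, filling the cache from page_table on a miss."""
        vpn = vaddr // PAGE_SIZE
        entries = self.sets[vpn % self.num_sets]
        if vpn in entries:
            entries.move_to_end(vpn)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(entries) >= self.ways:
                entries.popitem(last=False)  # evict the least recently used entry
            # Hypothetical stand-in for fetching the entry from host memory.
            entries[vpn] = page_table[vpn]
        return entries[vpn] * PAGE_SIZE + vaddr % PAGE_SIZE


# Usage: a tiny direct-mapped cache (2 sets, 1 way) over three pinned pages.
page_table = {0: 10, 1: 11, 2: 12}
cache = TranslationCache(num_sets=2, ways=1)
first = cache.lookup(0x0010, page_table)             # miss, fills set 0
second = cache.lookup(0x0020, page_table)            # hit on the same page
third = cache.lookup(2 * PAGE_SIZE, page_table)      # miss, evicts page 0 from set 0
fourth = cache.lookup(0x0000, page_table)            # miss again (conflict)
```

Replaying an address trace through such a structure while varying `num_sets` and `ways` is one simple way to observe how miss counts change with cache size and associativity, in the spirit of the trace-driven analysis the abstract mentions.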


Published In

ASPLOS VIII: Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
October 1998
326 pages
ISBN:1581131070
DOI:10.1145/291069
Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ASPLOS98

Acceptance Rates

ASPLOS VIII Paper Acceptance Rate 28 of 123 submissions, 23%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Cited By

  • (2017) Page Fault Support for Network Controllers. ACM SIGARCH Computer Architecture News 45:1 (449-466). DOI: 10.1145/3093337.3037710. Online publication date: 4-Apr-2017
  • (2017) Page Fault Support for Network Controllers. ACM SIGPLAN Notices 52:4 (449-466). DOI: 10.1145/3093336.3037710. Online publication date: 4-Apr-2017
  • (2017) Page Fault Support for Network Controllers. ACM SIGOPS Operating Systems Review 51:2 (449-466). DOI: 10.1145/3093315.3037710. Online publication date: 4-Apr-2017
  • (2017) Page Fault Support for Network Controllers. Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (449-466). DOI: 10.1145/3037697.3037710. Online publication date: 4-Apr-2017
  • (2013) Spread Identity. Journal of Computer Security 21:2 (233-281). DOI: 10.5555/2590614.2590616. Online publication date: 1-Mar-2013
  • (2011) Bibliography. Designing Network On-Chip Architectures in the Nanoscale Era (443-475). DOI: 10.1201/b10477-18. Online publication date: 9-Feb-2011
  • (2007) RDMA in the SiCortex cluster systems. Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface (260-271). DOI: 10.5555/2396095.2396142. Online publication date: 30-Sep-2007
  • (2007) RDMA in the SiCortex Cluster Systems. Recent Advances in Parallel Virtual Machine and Message Passing Interface (260-271). DOI: 10.1007/978-3-540-75416-9_37. Online publication date: 2007
  • (2006) Efficient remote block-level I/O over an RDMA-capable NIC. Proceedings of the 20th annual international conference on Supercomputing (97-106). DOI: 10.1145/1183401.1183417. Online publication date: 28-Jun-2006
  • (2006) Design Trade-Offs for User-Level I/O Architectures. IEEE Transactions on Computers 55:8 (962-973). DOI: 10.1109/TC.2006.122. Online publication date: 1-Aug-2006
