research-article
Open access

Full-Stack Architecting to Achieve a Billion-Requests-Per-Second Throughput on a Single Key-Value Store Server Platform

Published: 06 April 2016

Abstract

Distributed in-memory key-value stores (KVSs), such as memcached, have become a critical data serving layer in modern Internet-oriented data center infrastructure. Their performance and efficiency directly affect the QoS of web services and the efficiency of data centers. Traditionally, these systems have had significant overheads from inefficient network processing, OS kernel involvement, and concurrency control. Two recent research thrusts have focused on improving key-value performance. Hardware-centric research has started to explore specialized platforms, including FPGAs, for KVSs; results demonstrated an order-of-magnitude increase in throughput and energy efficiency over stock memcached. Software-centric research revisited the KVS application to address fundamental software bottlenecks and to exploit the full potential of modern commodity hardware; these efforts also showed orders-of-magnitude improvement over stock memcached.
We aim to architect high-performance and efficient KVS platforms, and start with a rigorous architectural characterization across system stacks over a collection of representative KVS implementations. Our detailed full-system characterization not only identifies the critical hardware/software ingredients for high-performance KVS systems but also leads to guided optimizations atop a recent design to achieve a record-setting throughput of 120 million requests per second (MRPS) (167 MRPS with client-side batching) on a single commodity server. Our system delivers the best performance and energy efficiency (RPS/watt) demonstrated to date over existing KVSs, including the best published FPGA-based and GPU-based claims. We craft a set of design principles for future platform architectures, and via detailed simulations demonstrate the capability of achieving a billion RPS with a single server constructed following our principles.

Published In

ACM Transactions on Computer Systems  Volume 34, Issue 2
May 2016
96 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/2912575
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 April 2016
Accepted: 01 January 2016
Received: 01 December 2015
Published in TOCS Volume 34, Issue 2

Author Tags

  1. Key-value stores
  2. cloud and network
  3. energy efficiency
  4. many core
  5. storage performance

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Science Foundation under award
  • Korea government
  • Intel Science and Technology Center for Cloud Computing
  • National Research Foundation of Korea

Cited By

  • (2023) No-Regret Caching with Noisy Request Estimates. 2023 IEEE Virtual Conference on Communications (VCC), pp. 341-346. DOI: 10.1109/VCC60689.2023.10474978. Online publication date: 28-Nov-2023
  • (2022) An Adaptive Scheduling Framework for Distributed Key-Value Stores Using RDMA. 2022 8th Annual International Conference on Network and Information Systems for Computers (ICNISC), pp. 605-611. DOI: 10.1109/ICNISC57059.2022.00124. Online publication date: Sep-2022
  • (2021) Odyssey. Proceedings of the Sixteenth European Conference on Computer Systems, pp. 245-260. DOI: 10.1145/3447786.3456240. Online publication date: 21-Apr-2021
  • (2020) Reexamining direct cache access to optimize I/O intensive applications for multi-hundred-gigabit networks. Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference, pp. 673-689. DOI: 10.5555/3489146.3489192. Online publication date: 15-Jul-2020
  • (2019) Wormhole. Proceedings of the Fourteenth EuroSys Conference 2019, pp. 1-16. DOI: 10.1145/3302424.3303955. Online publication date: 25-Mar-2019
  • (2019) Memory-Side Protection With a Capability Enforcement Co-Processor. ACM Transactions on Architecture and Code Optimization 16:1, pp. 1-26. DOI: 10.1145/3302257. Online publication date: 8-Mar-2019
  • (2018) Scale-out ccNUMA. Proceedings of the Thirteenth EuroSys Conference, pp. 1-15. DOI: 10.1145/3190508.3190550. Online publication date: 23-Apr-2018
  • (2018) LaKe: The Power of In-Network Computing. 2018 International Conference on ReConFigurable Computing and FPGAs (ReConFig), pp. 1-8. DOI: 10.1109/RECONFIG.2018.8641696. Online publication date: Dec-2018
  • (2017) KV-Direct. Proceedings of the 26th Symposium on Operating Systems Principles, pp. 137-152. DOI: 10.1145/3132747.3132756. Online publication date: 14-Oct-2017
  • (2017) UDORN: A design framework of persistent in-memory key-value database for NVM. 2017 IEEE 6th Non-Volatile Memory Systems and Applications Symposium (NVMSA), pp. 1-6. DOI: 10.1109/NVMSA.2017.8064478. Online publication date: Aug-2017