DOI: 10.1145/2774993.2774999
short-paper

NetPaxos: consensus at network speed

Published: 17 June 2015

Abstract

This paper explores the possibility of implementing the widely deployed Paxos consensus protocol in network devices. We present two different approaches: (i) a detailed design description for implementing the full Paxos logic in SDN switches, which identifies a sufficient set of required OpenFlow extensions; and (ii) an alternative, optimistic protocol which can be implemented without changes to the OpenFlow API, but relies on assumptions about how the network orders messages. Although neither of these protocols can be fully implemented without changes to the underlying switch firmware, we argue that such changes are feasible in existing hardware. Moreover, we present an evaluation that suggests that moving Paxos logic into the network would yield significant performance benefits for distributed applications.
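
For orientation, the Paxos logic that the abstract proposes moving into network devices centers on the acceptor role. The block below is a minimal sketch of textbook single-decree acceptor behavior in Python, not the paper's switch design: the class, field, and method names are illustrative assumptions, and nothing here models OpenFlow, switch firmware, or the optimistic variant.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Acceptor:
    """Textbook single-decree Paxos acceptor (hypothetical names, not the paper's design)."""
    promised: int = -1                    # highest round this acceptor has promised
    accepted_round: int = -1              # round of the last accepted value, -1 if none
    accepted_value: Optional[str] = None  # last accepted value, None if none

    def on_prepare(self, rnd: int) -> Optional[Tuple[int, int, Optional[str]]]:
        """Phase 1: promise to ignore lower rounds and report any accepted value."""
        if rnd > self.promised:
            self.promised = rnd
            return (rnd, self.accepted_round, self.accepted_value)
        return None  # stale round: no promise is given

    def on_accept(self, rnd: int, value: str) -> bool:
        """Phase 2: accept unless a higher round has already been promised."""
        if rnd >= self.promised:
            self.promised = rnd
            self.accepted_round = rnd
            self.accepted_value = value
            return True
        return False


acceptor = Acceptor()
first_promise = acceptor.on_prepare(1)      # fresh acceptor promises round 1
accepted = acceptor.on_accept(1, "x")       # value "x" is accepted in round 1
stale_accept = acceptor.on_accept(0, "y")   # round 0 is older than the promise
later_promise = acceptor.on_prepare(2)      # promise for round 2 reports ("x", round 1)
```

The state is only a few integers and one value per instance, and each message triggers a compare-and-update, which gives a rough sense of why this kind of logic is a candidate for switch hardware.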

Information

Published In

SOSR '15: Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research
June 2015
226 pages
ISBN:9781450334518
DOI:10.1145/2774993
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • USENIX Assoc

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NetPaxos
  2. paxos
  3. software-defined networking

Qualifiers

  • Short-paper

Funding Sources

  • European Union

Conference

SOSR 2015: ACM SIGCOMM Symposium on SDN Research
June 17-18, 2015
Santa Clara, California

Acceptance Rates

SOSR '15 Paper Acceptance Rate 7 of 43 submissions, 16%;
Overall Acceptance Rate 7 of 43 submissions, 16%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 83
  • Downloads (Last 6 weeks): 11
Reflects downloads up to 07 Mar 2025

Citations

Cited By

  • (2024) Fast and scalable in-network lock management using lock fission. Proceedings of the 18th USENIX Conference on Operating Systems Design and Implementation. DOI: 10.5555/3691938.3691952. Pages 251-268. Online publication date: 10-Jul-2024
  • (2024) Multitenant in-network acceleration with SwitchVM. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation. DOI: 10.5555/3691825.3691863. Pages 691-708. Online publication date: 16-Apr-2024
  • (2024) DINT. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation. DOI: 10.5555/3691825.3691848. Pages 401-417. Online publication date: 16-Apr-2024
  • (2024) LoLKV. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation. DOI: 10.5555/3691825.3691828. Pages 41-54. Online publication date: 16-Apr-2024
  • (2024) Rethinking the Switch Architecture for Stateful In-network Computing. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks. DOI: 10.1145/3696348.3696897. Pages 273-281. Online publication date: 18-Nov-2024
  • (2024) Achieving High Efficiency for Datacenter Multicast using Skewed Bloom Filter. Proceedings of the 53rd International Conference on Parallel Processing. DOI: 10.1145/3673038.3673126. Pages 1227-1236. Online publication date: 12-Aug-2024
  • (2024) In-Network AllReduce Optimization with Virtual Aggregation Trees. Proceedings of the 2024 SIGCOMM Workshop on Networks for AI Computing. DOI: 10.1145/3672198.3673800. Pages 54-60. Online publication date: 4-Aug-2024
  • (2024) Draconis: Network-Accelerated Scheduling for Microsecond-Scale Workloads. Proceedings of the Nineteenth European Conference on Computer Systems. DOI: 10.1145/3627703.3650060. Pages 333-348. Online publication date: 22-Apr-2024
  • (2024) In-Network Computing Empowered Mobile Edge Offloading Architecture for Internet of Things. IEEE Transactions on Services Computing. DOI: 10.1109/TSC.2024.3463475. Pages 1-13. Online publication date: 2024
  • (2024) P4ce: Consensus over RDMA at Line Speed. 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS). DOI: 10.1109/ICDCS60910.2024.00054. Pages 508-519. Online publication date: 23-Jul-2024