DOI: 10.1145/2592798.2592800

Rex: replication at the speed of multi-core

Published: 14 April 2014

Abstract

Standard state-machine replication involves consensus on a sequence of totally ordered requests through, for example, the Paxos protocol. Such a sequential execution model is becoming outdated on today's prevalent multi-core servers. Highly concurrent executions on multi-core architectures introduce non-determinism related to thread scheduling and lock contention, and fundamentally break the determinism assumption that state-machine replication relies on. This tension between concurrency and consistency is not inherent, because the total ordering of requests is merely a simplifying convenience that is unnecessary for consistency. Concurrent executions of the application can be decoupled from the sequence of consensus decisions by reaching consensus on partial-order traces rather than on totally ordered requests. These traces capture the non-deterministic decisions made in one replica's execution so that the other replicas can replay the execution with the same decisions. The result is a new multi-core friendly replicated state-machine framework that achieves strong consistency while preserving parallelism in multi-threaded applications. On 12-core machines with hyper-threading, evaluations on typical applications show that the framework scales with the number of cores, achieving up to 16 times the throughput of standard replicated state machines.
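
The record-and-replay idea in the abstract can be illustrated with a small sketch. This is not Rex's implementation or API: the class names `RecordingLock` and `ReplayLock`, the helper `run_replica`, and the simplification that lock acquisition order is the only source of non-determinism are all assumptions made for illustration. One replica runs freely and records, per lock, the order in which threads acquired it; a second replica admits threads in that recorded order. Because each lock has its own order and events on different locks are left unordered, the trace is a partial order rather than a total one. The consensus step that would agree on the trace is omitted.

```python
import threading
from collections import defaultdict, deque


class RecordingLock:
    """Hypothetical primary-side lock: logs the order of acquisitions."""

    def __init__(self, name, trace):
        self._lock = threading.Lock()
        self._name = name
        self._trace = trace  # maps lock name -> list of thread names

    def __enter__(self):
        self._lock.acquire()
        # Appended while holding the lock, so the log matches the real order.
        self._trace[self._name].append(threading.current_thread().name)
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


class ReplayLock:
    """Hypothetical secondary-side lock: admits threads in recorded order."""

    def __init__(self, name, trace):
        self._order = deque(trace[name])
        self._cond = threading.Condition()

    def __enter__(self):
        me = threading.current_thread().name
        with self._cond:
            # Block until this thread is the next one in the recorded order.
            self._cond.wait_for(
                lambda: bool(self._order) and self._order[0] == me
            )
        return self

    def __exit__(self, *exc):
        with self._cond:
            self._order.popleft()  # this acquisition has been replayed
            self._cond.notify_all()
        return False


def run_replica(make_lock, worker_names, rounds=3):
    """Run a toy multi-threaded application whose state is order-sensitive."""
    state = []
    lock = make_lock("state")

    def worker():
        for _ in range(rounds):
            with lock:
                state.append(threading.current_thread().name)

    threads = [threading.Thread(target=worker, name=n) for n in worker_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state


names = ["t0", "t1", "t2", "t3"]
trace = defaultdict(list)
primary_state = run_replica(lambda n: RecordingLock(n, trace), names)
secondary_state = run_replica(lambda n: ReplayLock(n, trace), names)
print(primary_state == secondary_state)
```

The interleaving on the first replica differs from run to run, but the second replica always reaches the same final state because it follows the recorded decisions instead of making its own.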


Published In

EuroSys '14: Proceedings of the Ninth European Conference on Computer Systems
April 2014
388 pages
ISBN: 9781450327046
DOI: 10.1145/2592798

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multi-core
  2. replicated state machine
  3. replication

Qualifiers

  • Research-article

Conference

EuroSys 2014: Ninth EuroSys Conference 2014
April 14 - 16, 2014
Amsterdam, The Netherlands

Acceptance Rates

EuroSys '14 Paper Acceptance Rate 27 of 147 submissions, 18%;
Overall Acceptance Rate 241 of 1,308 submissions, 18%
