research-article

Open access

ParDiff: Practical Static Differential Analysis of Network Protocol Parsers

Authors:

Xiangyu Zhang

Proceedings of the ACM on Programming Languages, Volume 8, Issue OOPSLA1

Article No.: 137, Pages 1208 - 1234

https://doi.org/10.1145/3649854

Published: 29 April 2024

Abstract

Countless devices all over the world are connected by networks and communicate via network protocols. Like any other software, protocol implementations suffer from bugs, many of which cause only silent data corruption rather than crashes. Hence, existing automated bug-finding techniques that focus on memory safety, such as fuzzing, can hardly detect them. In this work, we propose a static differential analysis called ParDiff to find protocol implementation bugs, especially silent ones hidden in message parsers. Our key observation is that a network protocol often has multiple implementations, and any semantic discrepancy between them may indicate a bug. However, different implementations are often written in disparate styles, e.g., using different data structures or control structures, which makes it challenging to directly compare two implementations of even the same protocol. To exploit this observation and effectively compare multiple protocol implementations, ParDiff (1) automatically extracts finite state machines from programs to represent protocol format specifications, and (2) then leverages bisimulation and SMT solvers to find fine-grained semantic inconsistencies between them. We have extensively evaluated ParDiff on 14 network protocols. The results show that ParDiff outperforms both differential symbolic execution and differential fuzzing tools. To date, we have detected 41 bugs, 25 of which have been confirmed by developers.
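To make the comparison step concrete, the following is a minimal sketch of checking two parser state machines against each other. It is not ParDiff's algorithm: the dictionary layout, the names `step` and `find_discrepancy`, and the two toy machines `impl_a` and `impl_b` are all assumptions made for illustration, and transition guards are plain sets of byte values standing in for the SMT constraints the abstract mentions. The sketch walks the product of two deterministic machines and returns the shortest input that one accepts and the other rejects.

```python
from collections import deque

def step(fsm, state, byte):
    """Return the successor of `state` on `byte`, or None if the input is rejected."""
    if state is None:
        return None
    for guard, nxt in fsm["edges"].get(state, []):
        if byte in guard:  # guard is a set-like of byte values (stand-in for a constraint)
            return nxt
    return None

def find_discrepancy(a, b, alphabet=range(256), max_len=8):
    """Breadth-first search over the product of two deterministic FSMs.

    Returns the shortest byte string (up to max_len) accepted by exactly one
    machine, or None if no such input is found within the bound.
    """
    start = (a["start"], b["start"])
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (sa, sb), word = queue.popleft()
        if (sa in a["accept"]) != (sb in b["accept"]):
            return bytes(word)
        if len(word) == max_len:
            continue
        for byte in alphabet:
            nxt = (step(a, sa, byte), step(b, sb, byte))
            if nxt == (None, None) or nxt in seen:
                continue  # both reject, or this state pair was already explored
            seen.add(nxt)
            queue.append((nxt, word + [byte]))
    return None

# Hypothetical two-byte message: a type byte (must be 1) followed by a length byte.
# impl_a requires a length in 1..4; impl_b forgets to reject a zero length.
impl_a = {"start": "type", "accept": {"done"},
          "edges": {"type": [({1}, "len")], "len": [(range(1, 5), "done")]}}
impl_b = {"start": "type", "accept": {"done"},
          "edges": {"type": [({1}, "len")], "len": [(range(0, 5), "done")]}}

witness = find_discrepancy(impl_a, impl_b)
print(witness)
```

On these toy machines the search reports the message with type 1 and length 0, which only the second machine accepts. The sketch decides acceptance per concrete byte; a symbolic variant would replace the byte loop with solver queries over the guards.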

Cited By

Zhang X, Zhang C, Li X, Du Z, Mao B, Li Y, Zheng Y, Li Y, Pan L, Liu Y, Deng R. (2024). A Survey of Protocol Fuzzing. ACM Computing Surveys 57:2 (1-36). Online publication date: 10-Oct-2024. https://dl.acm.org/doi/10.1145/3696788

Peng C, Jiang M, Wu L, Zhou Y, Luo B, Liao X, Xu J, Kirda E, Lie D. (2024). Toss a Fault to BpfChecker: Revealing Implementation Flaws for eBPF runtimes with Differential Fuzzing. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (3928-3942). Online publication date: 2-Dec-2024. https://dl.acm.org/doi/10.1145/3658644.3690237

Index Terms

ParDiff: Practical Static Differential Analysis of Network Protocol Parsers
1. Security and privacy
  1. Network security
    1. Web protocol security
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
  2. Software organization and properties
    1. Software functional properties
      1. Formal methods
        Automated static analysis

Published In

Proceedings of the ACM on Programming Languages Volume 8, Issue OOPSLA1

April 2024

1492 pages

EISSN: 2475-1421

DOI: 10.1145/3554316

Editor:
Michael Hicks
Amazon, USA

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States
