Expressing and verifying probabilistic assertions

A Sampson, P Panchekha, T Mytkowicz… - Proceedings of the 35th …, 2014 - dl.acm.org
Proceedings of the 35th ACM SIGPLAN Conference on Programming Language …, 2014dl.acm.org
Traditional assertions express correctness properties that must hold on every program
execution. However, many applications have probabilistic outcomes and consequently their
correctness properties are also probabilistic (eg, they identify faces in images, consume
sensor data, or run on unreliable hardware). Traditional assertions do not capture these
correctness properties. This paper proposes that programmers express probabilistic
correctness properties with probabilistic assertions and describes a new probabilistic …
Traditional assertions express correctness properties that must hold on every program execution. However, many applications have probabilistic outcomes and consequently their correctness properties are also probabilistic (e.g., they identify faces in images, consume sensor data, or run on unreliable hardware). Traditional assertions do not capture these correctness properties. This paper proposes that programmers express probabilistic correctness properties with probabilistic assertions and describes a new probabilistic evaluation approach to efficiently verify these assertions. Probabilistic assertions are Boolean expressions that express the probability that a property will be true in a given execution rather than asserting that the property must always be true. Given either specific inputs or distributions on the input space, probabilistic evaluation verifies probabilistic assertions by first performing distribution extraction to represent the program as a Bayesian network. Probabilistic evaluation then uses statistical properties to simplify this representation to efficiently compute assertion probabilities directly or with sampling. Our approach is a mix of both static and dynamic analysis: distribution extraction statically builds and optimizes the Bayesian network representation and sampling dynamically interprets this representation. We implement our approach in a tool called Mayhap for C and C++ programs. We evaluate expressiveness, correctness, and performance of Mayhap on programs that use sensors, perform approximate computation, and obfuscate data for privacy. Our case studies demonstrate that probabilistic assertions describe useful correctness properties and that Mayhap efficiently verifies them.
ACM Digital Library