The LLUNATIC data-cleaning framework

F Geerts, G Mecca, P Papotti, D Santoro - Proceedings of the VLDB …, 2013 - dl.acm.org
Proceedings of the VLDB Endowment, 2013 - dl.acm.org
Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.