Privacy-preserving data splitting is a technique that aims to protect data privacy by storing different fragments of the data in different locations. In this work we give a new combinatorial formulation of the data splitting problem. We treat it as a purely combinatorial problem, in which the data attributes must be split into fragments in a way that satisfies certain combinatorial properties derived from processing and privacy constraints. Using this formulation, we develop new combinatorial and algebraic techniques to obtain solutions to the data splitting problem. We present an algebraic method that builds an optimal data splitting solution by using Gröbner bases. Since this method is not efficient in general, we also develop a greedy algorithm that finds solutions that are not necessarily of minimum size.
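
To make the combinatorial view concrete, the following is a minimal illustrative sketch of a greedy fragment-building heuristic. It is not the algorithm developed in this work. It rests on two assumptions that the text above does not spell out: a processing constraint is taken to be a set of attributes that must appear together in at least one fragment, and a privacy constraint is taken to be a set of attributes that must never appear together in the same fragment. The function names and the example attributes are hypothetical.

```python
def violates(fragment, privacy):
    """Return True if the fragment fully contains some privacy constraint."""
    return any(p <= fragment for p in privacy)


def greedy_split(attributes, processing, privacy):
    """Greedily build fragments (sets of attributes), under assumed semantics.

    Assumption: each processing constraint must fit inside one fragment,
    and no fragment may contain a whole privacy constraint.
    The result is feasible but not necessarily of minimum size.
    """
    processing = [frozenset(c) for c in processing]
    privacy = [frozenset(c) for c in privacy]

    # Place large processing constraints first, then single attributes,
    # so that every attribute ends up stored in at least one fragment.
    blocks = sorted(processing, key=len, reverse=True)
    blocks += [frozenset([a]) for a in attributes]

    fragments = []
    for block in blocks:
        if violates(block, privacy):
            raise ValueError(f"constraints cannot be satisfied for {set(block)}")
        if any(block <= f for f in fragments):
            continue  # already covered by an existing fragment
        for i, f in enumerate(fragments):
            merged = f | block
            if not violates(merged, privacy):
                fragments[i] = merged  # reuse the first compatible fragment
                break
        else:
            fragments.append(block)  # no compatible fragment: open a new one
    return fragments


# Hypothetical example data, for illustration only.
attributes = ["name", "zip", "diagnosis", "salary"]
processing = [{"zip", "diagnosis"}, {"name", "salary"}]
privacy = [{"name", "diagnosis"}, {"zip", "salary"}]

fragments = greedy_split(attributes, processing, privacy)
for f in fragments:
    print(sorted(f))
```

Because the heuristic reuses the first compatible fragment it finds, the number of fragments it returns depends on the order in which constraints are processed, which is one reason a greedy approach of this kind need not produce a minimum-size solution.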

This article is supported by the Ministry of the Interior of the Czech Republic (grant VJ01030002), by the Government of Catalonia (grant 2017 SGR 705), by the European Commission (project H2020-871042 “SoBigData++”), by the Spanish Government (project RTI2018-095094-B-C21, “CONSENT”), and by the DRAC project, which is co-financed by the European Union Regional Development Fund within the framework of the ERDF Operational Program of Catalonia 2014-2020 with a grant covering 50% of the total eligible cost.
