DOI: 10.1145/2001576.2001802

Scaling up a hybrid genetic linear programming algorithm for statistical disclosure control

Published: 12 July 2011

Abstract

This paper looks at the real-world problem of statistical disclosure control. National Statistics Agencies are required to publish detailed statistics and simultaneously guarantee the confidentiality of the contributors. When published statistical tables contain magnitude data, such as turnover or health statistics, the preferred method is to suppress the values of cells that may reveal confidential information. However, suppressing these 'primary' cells alone will not guarantee protection due to the presence of margin (row/column) totals, and therefore other 'secondary' cells must also be suppressed. A previously developed algorithm that hybridizes linear programming with a genetic algorithm has been shown to protect tables with up to 40,000 cells; however, Statistical Agencies are often required to protect tables with over 100,000 cells.
This algorithm's performance depended heavily on the choice of mutation operator, so this dependency was removed first. Because the time taken to execute its fitness function (a linear program) prevents the algorithm from protecting larger tables, a series of modifications has been applied. These modifications significantly reduce its execution time, which in turn greatly extends the capabilities of the hybrid algorithm, to the point that it can now protect tables with up to one million cells.
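
To illustrate why suppressing only the primary cells is insufficient, the following minimal Python sketch shows how published row and column totals let an attacker recover a lone suppressed cell by subtraction. It is an illustration only and is not the paper's algorithm: the function name `recover_by_subtraction` and the example table are hypothetical, and the check covers exact recovery only. A full audit would also bound each suppressed value with a linear program, which this sketch does not attempt.

```python
def recover_by_subtraction(table, suppressed):
    """Return the suppressed cells an attacker can recover exactly.

    table: list of rows holding the true interior cell values; the row and
           column totals are assumed to be published alongside the table.
    suppressed: set of (row, col) positions withheld from publication.
    """
    n_rows, n_cols = len(table), len(table[0])
    # Every row and every column is one published linear constraint.
    lines = [[(r, c) for c in range(n_cols)] for r in range(n_rows)]
    lines += [[(r, c) for r in range(n_rows)] for c in range(n_cols)]

    hidden = set(suppressed)
    recovered = {}
    progress = True
    while progress:
        progress = False
        for line in lines:
            unknown = [cell for cell in line if cell in hidden]
            if len(unknown) == 1:
                # Exactly one hidden cell: value = total - visible cells.
                total = sum(table[r][c] for r, c in line)
                visible = sum(table[r][c] for r, c in line
                              if (r, c) not in hidden)
                recovered[unknown[0]] = total - visible
                hidden.discard(unknown[0])
                progress = True
    return recovered


# Hypothetical 3x3 magnitude table; cell (1, 0) is the sensitive primary cell.
table = [[20, 50, 10],
         [8, 19, 22],
         [17, 32, 12]]

primary_only = recover_by_subtraction(table, {(1, 0)})
with_secondary = recover_by_subtraction(
    table, {(1, 0), (1, 2), (2, 0), (2, 2)})
print(primary_only)    # the primary cell is recovered from its row total
print(with_secondary)  # no cell can be recovered by subtraction alone
```

With the three secondary suppressions added, every row and column that contains a hidden cell contains at least two, so no single total pins down a value. Choosing such a pattern at minimum information loss is the optimisation problem the abstract refers to.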

Cited By

  • (2013) Initial application of ant colony optimisation to statistical disclosure control. Proceedings of the 15th annual conference on Genetic and evolutionary computation, pages 97-104. DOI: 10.1145/2463372.2463386. Online publication date: 6-Jul-2013

      Published In

      GECCO '11: Proceedings of the 13th annual conference on Genetic and evolutionary computation
      July 2011
      2140 pages
      ISBN:9781450305570
      DOI:10.1145/2001576
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. cell suppression problem
      2. genetic algorithms
      3. mathematical programming
      4. statistical disclosure control

      Qualifiers

      • Research-article

      Conference

      GECCO '11
      Acceptance Rates

      Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

