Research article. DOI: 10.1145/2382196.2382263

Differentially private sequential data publication via variable-length n-grams

Published: 16 October 2012

Abstract

Sequential data is increasingly used in a variety of applications, and publishing it is vital to their advancement. However, as the re-identification attacks on the AOL and Netflix datasets showed, releasing sequential data may pose considerable threats to individual privacy. Recent research has indicated that existing sanitization techniques fail to provide their claimed privacy guarantees, so new schemes with provable privacy guarantees are urgently needed. Differential privacy is one of the few models that can provide such guarantees. Applying it to sequential data is challenging, however, because of the data's inherent sequentiality and high dimensionality. In this paper, we address this challenge by employing a variable-length n-gram model, which extracts the essential information of a sequential database as a set of variable-length n-grams. Our approach uses a carefully designed exploration tree structure and a set of novel techniques based on the Markov assumption to lower the magnitude of the added noise. The published n-grams are useful for many purposes. Furthermore, we develop a solution for generating a synthetic database, which enables a wider spectrum of data analysis tasks. Extensive experiments on real-life datasets demonstrate that our approach substantially outperforms the state-of-the-art techniques.
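
To make the idea of releasing noisy variable-length n-grams concrete, the following is a minimal sketch in Python. It is not the paper's algorithm: the abstract only summarizes the exploration tree and the Markov-based techniques, and none of their details are reproduced here. The sketch assumes a small public alphabet, truncates each sequence to `l_max` items so that one sequence affects at most `l_max` counts per level, splits the budget `epsilon` evenly across `n_max` levels, and extends a gram only when its noisy count reaches `threshold`. The names `laplace`, `noisy_ngrams`, `l_max`, `n_max`, and `threshold` are all hypothetical, chosen for illustration.

```python
import random
from collections import Counter


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_ngrams(sequences, alphabet, n_max, l_max, epsilon, threshold, rng):
    """Release noisy counts of n-grams (n <= n_max), grown level by level.

    Assumptions of this sketch (not taken from the paper):
    - each sequence is truncated to l_max items, so adding or removing one
      sequence changes the level-n count vector by at most l_max in L1 norm;
    - epsilon is split evenly over the n_max levels;
    - a gram is kept, and later extended, only if its noisy count is at
      least `threshold`.
    """
    truncated = [tuple(s[:l_max]) for s in sequences]
    scale = l_max / (epsilon / n_max)  # Laplace scale = sensitivity / budget
    released = {}
    frontier = [()]  # grams of length n-1 that may be extended
    for n in range(1, n_max + 1):
        true_counts = Counter(
            s[i:i + n] for s in truncated for i in range(len(s) - n + 1)
        )
        next_frontier = []
        # Candidates depend only on earlier noisy output and the alphabet,
        # so zero-count candidates receive noise as well.
        for prefix in frontier:
            for item in alphabet:
                gram = prefix + (item,)
                noisy = true_counts[gram] + laplace(scale, rng)
                if noisy >= threshold:
                    released[gram] = noisy
                    next_frontier.append(gram)
        frontier = next_frontier
        if not frontier:
            break
    return released


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "b", "c"], ["c", "a"]]
    result = noisy_ngrams(data, ["a", "b", "c"], n_max=3, l_max=3,
                          epsilon=1.0, threshold=2.0, rng=random.Random(7))
    for gram, count in sorted(result.items()):
        print(gram, round(count, 2))
```

With a realistic `epsilon` and a dataset this small, the noise dominates the counts, so the output is only meaningful on much larger data. Enumerating every extension over the alphabet is also practical only for small alphabets, which is one reason a more careful design is needed in practice.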

Published In

CCS '12: Proceedings of the 2012 ACM conference on Computer and communications security
October 2012
1088 pages
ISBN: 9781450316514
DOI: 10.1145/2382196
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. differential privacy
  2. Markov assumption
  3. n-gram model
  4. sequential data

Qualifiers

  • Research-article

Conference

CCS'12
CCS'12: the ACM Conference on Computer and Communications Security
October 16 - 18, 2012
Raleigh, North Carolina, USA

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
