Distributed generation of privacy preserving data with user customization

Chen, Xiao; Navidi, Thomas; Ermon, Stefano; Rajagopal, Ram

Abstract: Distributed devices such as mobile phones can produce and store large amounts of data that can enhance machine learning models; however, this data may contain private information specific to the data owner that prevents the release of the data. We wish to reduce the correlation between user-specific private information and data while maintaining the useful information. Rather than learning a large model to achieve privatization from end to end, we introduce a decoupling of the creation of a latent representation and the privatization of data that allows user-specific privatization to occur in a distributed setting with limited computation and minimal disturbance to the utility of the data. We leverage a Variational Autoencoder (VAE) to create a compact latent representation of the data; the VAE remains fixed for all devices and all possible private labels. We then train a small generative filter to perturb the latent representation based on individual preferences regarding the private and utility information. The small filter is trained using a GAN-type robust optimization that can take place on a distributed device. We conduct experiments on three popular datasets (MNIST, UCI-Adult, and CelebA) and give a thorough evaluation, including visualizing the geometry of the latent embeddings and estimating the empirical mutual information, to show the effectiveness of our approach.
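
The decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: synthetic Gaussian vectors stand in for the latent codes of the fixed VAE, the "small filter" is a linear additive perturbation, the adversary is a logistic regression, and the perturbation size is limited by a Frobenius-norm budget on the filter weights. All function names are hypothetical. Only the filter is trained, by alternating adversary descent steps with filter ascent steps, which loosely mirrors a GAN-type min-max optimization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def apply_filter(z, W):
    """Hypothetical small filter: additive linear perturbation of the latent code."""
    return z + z @ W

def project(W, budget):
    """Clip the Frobenius norm of W to an assumed distortion budget."""
    norm = np.linalg.norm(W)
    return W if norm <= budget else W * (budget / norm)

def adversary_grad(zt, y, a, c):
    """Gradients of the mean cross-entropy of a logistic adversary (weights a, bias c)."""
    err = (sigmoid(zt @ a + c) - y) / len(y)   # d loss / d logit, per sample
    return err, zt.T @ err, err.sum()

def train_filter(z, y, budget=1.0, steps=200, adv_steps=5, lr_adv=0.5, lr_filter=0.5):
    """Alternate adversary descent and filter ascent on the adversary's loss."""
    d = z.shape[1]
    W, a, c = np.zeros((d, d)), np.zeros(d), 0.0
    for _ in range(steps):
        zt = apply_filter(z, W)
        for _ in range(adv_steps):             # adversary minimizes its loss
            _, ga, gc = adversary_grad(zt, y, a, c)
            a, c = a - lr_adv * ga, c - lr_adv * gc
        err, _, _ = adversary_grad(zt, y, a, c)
        grad_W = z.T @ np.outer(err, a)        # d loss / d W by the chain rule
        W = project(W + lr_filter * grad_W, budget)  # filter maximizes the loss
    return W

def adversary_accuracy(z, y, steps=500, lr=0.5):
    """Train a fresh logistic adversary and report its training accuracy."""
    a, c = np.zeros(z.shape[1]), 0.0
    for _ in range(steps):
        _, ga, gc = adversary_grad(z, y, a, c)
        a, c = a - lr * ga, c - lr * gc
    return float(np.mean((sigmoid(z @ a + c) > 0.5) == (y > 0.5)))

# Synthetic stand-in for latent codes produced by a frozen, pre-trained encoder.
rng = np.random.default_rng(0)
n, d = 512, 8
y = rng.integers(0, 2, size=n).astype(float)   # private label
z = rng.normal(size=(n, d))
z[:, 0] += 2.0 * (2.0 * y - 1.0)               # the private label leaks through dimension 0

W = train_filter(z, y, budget=1.0)
z_private = apply_filter(z, W)
acc_before = adversary_accuracy(z, y)
acc_after = adversary_accuracy(z_private, y)
print(f"adversary accuracy on raw latents:      {acc_before:.3f}")
print(f"adversary accuracy on filtered latents: {acc_after:.3f}")
```

In this sketch the encoder is never updated, so a device would only need to store and train the `d × d` filter matrix. A utility term, which the abstract mentions as part of the user's preferences, is omitted here for brevity.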

Comments:	Accepted at the ICLR 2019 SafeML workshop
Subjects:	Machine Learning (cs.LG); Computers and Society (cs.CY); Machine Learning (stat.ML)
Cite as:	arXiv:1904.09415 [cs.LG]
	(or arXiv:1904.09415v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1904.09415
