DOI: 10.1145/3474369.3486870
research-article

Spying through Virtual Backgrounds of Video Calls

Published: 15 November 2021

Abstract

Video calls have become an essential part of today's business life, especially due to the COVID-19 pandemic. Several industry branches enable their employees to work from home and collaborate via video conferencing services. While remote work offers benefits for health safety and personal mobility, it also poses privacy risks. Visual content is directly transmitted from the private living environment of employees to third parties, potentially exposing sensitive information. To counter this threat, video conferencing services support replacing the visible environment of a video call with a virtual background. This replacement, however, is imperfect, leaking tiny regions of the real background in video frames. In this paper, we explore how these leaks in virtual backgrounds can be exploited to reconstruct regions of the real environment. To this end, we build on recent computer vision techniques and derive an approach capable of extracting and aggregating leaked pixels in a video call. In an empirical study with the services Zoom, Webex, and Google Meet, we demonstrate that the exposed fragments of the reconstructed background are sufficient to spot different objects. Of 114 video calls with virtual backgrounds, 35% allow objects in the environment to be correctly identified. We conclude that virtual backgrounds provide only limited protection, and alternative defenses are needed.
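
The idea of extracting and aggregating leaked pixels can be illustrated with a minimal sketch. It is not the approach from the paper, whose details the abstract does not give. It assumes that the attacker knows the virtual background image and has a per-frame mask of the caller (for example from a segmentation model). The names `aggregate_leaks`, `color_distance`, and the `tolerance` threshold are hypothetical. A pixel counts as leaked if it lies outside the caller mask and differs from the virtual background, and leaked colors are averaged per position over all frames.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]      # RGB values in 0..255
Image = List[List[Pixel]]         # rows of pixels
Mask = List[List[bool]]           # True where the caller is visible


def color_distance(a: Pixel, b: Pixel) -> int:
    """Largest per-channel absolute difference between two RGB pixels."""
    return max(abs(x - y) for x, y in zip(a, b))


def aggregate_leaks(
    frames: List[Image],
    virtual_bg: Image,
    person_masks: List[Mask],
    tolerance: int = 30,          # assumed threshold, not from the paper
) -> List[List[Optional[Pixel]]]:
    """Average the leaked colors per position; None where nothing leaked."""
    height, width = len(virtual_bg), len(virtual_bg[0])
    sums = [[[0, 0, 0] for _ in range(width)] for _ in range(height)]
    counts = [[0] * width for _ in range(height)]

    for frame, mask in zip(frames, person_masks):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    continue  # pixel belongs to the caller
                pixel = frame[y][x]
                if color_distance(pixel, virtual_bg[y][x]) <= tolerance:
                    continue  # pixel shows the virtual background
                # Neither caller nor virtual background: treat as a leak.
                for c in range(3):
                    sums[y][x][c] += pixel[c]
                counts[y][x] += 1

    result: List[List[Optional[Pixel]]] = []
    for y in range(height):
        row: List[Optional[Pixel]] = []
        for x in range(width):
            n = counts[y][x]
            if n == 0:
                row.append(None)
            else:
                r, g, b = (s // n for s in sums[y][x])
                row.append((r, g, b))
        result.append(row)
    return result
```

Averaging is only one way to combine observations. With noisy masks, a per-pixel median or a vote over the most frequent color may be more robust, since a single misclassified frame would then have less influence on the result.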

Supplementary Material

MP4 File (AISec21-34.mp4)
Presentation video for the paper "Spying through Virtual Backgrounds of Video Calls". Virtual backgrounds in video conferences sometimes leak pixels of the real background through movement. In this paper, we use recent techniques of computer vision to automatically reconstruct the real backgrounds of video calls from these leaked pixels.

Published In

AISec '21: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security
November 2021
210 pages
ISBN: 9781450386579
DOI: 10.1145/3474369

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. machine learning
  2. privacy
  3. video conferences

Qualifiers

  • Research-article

Conference

CCS '21

Acceptance Rates

Overall Acceptance Rate 94 of 231 submissions, 41%

Cited By

  • (2024) Zooming Into Video Conferencing Privacy. IEEE Transactions on Computational Social Systems 11:1 (933-944). DOI: 10.1109/TCSS.2022.3231987. Online publication date: Feb-2024
  • (2024) Digital Competences in Cybersecurity of Teachers in Training. Computers in the Schools 41:3 (281-306). DOI: 10.1080/07380569.2024.2361614. Online publication date: 5-Jul-2024
  • (2023) Investigating Cybersecurity Risks and the Responses of Home Workers in Aotearoa New Zealand. Proceedings of the 35th Australian Computer-Human Interaction Conference (99-107). DOI: 10.1145/3638380.3638385. Online publication date: 2-Dec-2023
  • (2023) Private Eye: On the Limits of Textual Screen Peeking via Eyeglass Reflections in Video Conferencing. 2023 IEEE Symposium on Security and Privacy (SP) (3432-3449). DOI: 10.1109/SP46215.2023.10179423. Online publication date: May-2023
