DOI: 10.1145/3660512.3665521
research-article

A Hybrid Approach for Cheapfake Detection Using Reputation Checking and End-To-End Network

Published: 01 July 2024

Abstract

Detecting cheapfakes, especially those involving out-of-context visuals, is essential for preserving the integrity of shared information and trust in multimedia content. This research introduces a hybrid approach to cheapfake detection that combines real-time online verification with a unified end-to-end network trained using generative synthetic data. The combination aims to improve both the accuracy and the efficiency of cheapfake detection, addressing the need to combat misinformation online. We address two problems: Contextual Integrity Prediction and Image-Caption Authenticity Analysis. In our experiments, the network achieves 85.70% accuracy on a public test dataset for Contextual Integrity Prediction when trained solely on synthetic data, and it also proves effective for Image-Caption Authenticity Analysis. These results suggest that the proposed hybrid method can make a meaningful contribution to combating cheapfakes and preserving the integrity of multimedia content. Our source code is publicly available at https://github.com/nbtin/cheapfakes_detection_SCID2024.git.
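
The abstract does not specify how the online verification and the network output are combined. As an illustration only, the following is a minimal sketch of one possible hybrid decision rule, assuming a reputation check that short-circuits the network's prediction. The allowlist `TRUSTED_DOMAINS`, the functions `is_reputable` and `predict_out_of_context`, and the 0.5 threshold are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Iterable
from urllib.parse import urlparse

# Hypothetical allowlist of reputable sources (placeholder domains).
TRUSTED_DOMAINS = {"trusted-news.example", "wire-service.example"}


def is_reputable(url: str) -> bool:
    """Return True if the URL's host is a trusted domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def predict_out_of_context(
    source_urls: Iterable[str],
    model_score: float,
    threshold: float = 0.5,
) -> bool:
    """Flag an image-caption pair as out of context.

    source_urls: pages where the caption was found online (assumed input).
    model_score: out-of-context probability from a trained network (assumed input).
    """
    # Reputation check first: a match on a trusted source overrides the network.
    if any(is_reputable(u) for u in source_urls):
        return False
    # Otherwise fall back to the network's score.
    return model_score >= threshold


if __name__ == "__main__":
    print(predict_out_of_context(["https://www.trusted-news.example/a"], 0.9))
    print(predict_out_of_context(["https://unknown.example/a"], 0.9))
```

In this sketch the reputation check takes priority over the learned score; other combinations, such as using the check as an additional input feature, would be equally plausible readings of "hybrid".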



Published In

SCID '24: Proceedings of the 1st Workshop on Security-Centric Strategies for Combating Information Disorder
July 2024
68 pages
ISBN: 9798400706509
DOI: 10.1145/3660512
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Cheapfake Detection
  2. Miscontextualization
  3. Misinformation
  4. Out-of-context
  5. Visual Entailment

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Vingroup Innovation Foundation

Conference

ASIA CCS '24