DOI: 10.1145/3543507.3583869
research-article

ContrastFaux: Sparse Semi-supervised Fauxtography Detection on the Web using Multi-view Contrastive Learning

Published: 30 April 2023

Abstract

The widespread misinformation on the Web has raised many concerns with serious societal consequences. In this paper, we study a critical type of online misinformation, namely fauxtography, where the image and associated text of a social media post jointly convey a questionable or false sense. In particular, we focus on a sparse semi-supervised fauxtography detection problem, which aims to accurately identify fauxtography by only using the sparsely annotated ground truth labels of social media posts. Our problem is motivated by the key limitation of current fauxtography detection approaches that often require a large amount of expensive and inefficient manual annotations to train an effective fauxtography detection model. We identify two key technical challenges in solving the problem: 1) it is non-trivial to train an accurate detection model given the sparse fauxtography annotations, and 2) it is difficult to extract the heterogeneous and complicated fauxtography features from the multi-modal social media posts for accurate fauxtography detection. To address the above challenges, we propose ContrastFaux, a multi-view contrastive learning framework that jointly explores the sparse fauxtography annotations and the cross-modal fauxtography feature similarity between the image and text in multi-modal posts to accurately detect fauxtography on social media. Evaluation results on two social media datasets demonstrate that ContrastFaux consistently outperforms state-of-the-art deep learning and semi-supervised learning fauxtography detection baselines by achieving the highest fauxtography detection accuracy.
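The abstract does not specify the training objective, so the following is only a generic illustration of how cross-modal similarity between image and text can be turned into a contrastive loss. It is a minimal sketch of a symmetric InfoNCE loss, not the ContrastFaux objective. The function name `cross_modal_info_nce`, the `temperature` value, and the toy embeddings are assumptions made for illustration, and the sketch assumes every embedding has non-zero norm.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _cosine_matrix(a, b):
    """Pairwise cosine similarities between the rows of a and the rows of b."""
    a = [_normalize(v) for v in a]
    b = [_normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def _cross_entropy(logits, target):
    """Numerically stable -log softmax(logits)[target]."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def cross_modal_info_nce(image_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE over a batch of paired image/text embeddings.

    The i-th image and i-th text come from the same post (positive pair);
    every other pairing in the batch is treated as a negative.
    """
    sims = _cosine_matrix(image_emb, text_emb)
    n = len(sims)
    logits = [[s / temperature for s in row] for row in sims]
    # Image-to-text direction: each row should peak on its diagonal entry.
    img_to_txt = sum(_cross_entropy(logits[i], i) for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed logits.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    txt_to_img = sum(_cross_entropy(cols[j], j) for j in range(n)) / n
    return 0.5 * (img_to_txt + txt_to_img)


# Toy 2-D embeddings (hypothetical): two posts, image and text views.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = cross_modal_info_nce(images, aligned_text)
loss_swapped = cross_modal_info_nce(images, swapped_text)
```

In this toy setup the loss is small when each image embedding matches its own text embedding and large when the pairs are swapped. How such a signal would be combined with sparse labels is specific to the paper and is not reproduced here.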

Cited By

  • (2023) On optimizing model generality in AI-based disaster damage assessment. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 6317-6325. DOI: 10.24963/ijcai.2023/701. Online publication date: 19-Aug-2023

    Published In

    WWW '23: Proceedings of the ACM Web Conference 2023
    April 2023
    4293 pages
    ISBN:9781450394161
    DOI:10.1145/3543507
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Contrastive Learning
    2. Fauxtography Detection
    3. Semi-supervised Learning
    4. Web Misinformation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '23: The ACM Web Conference 2023
    April 30 - May 4, 2023
    Austin, TX, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (last 12 months): 123
    • Downloads (last 6 weeks): 8
    Reflects downloads up to 26 Sep 2024
