DOI: 10.1145/3372278.3390670
research-article

Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

Published: 08 June 2020

Abstract

The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.
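The idea of scoring how well a news photo matches the entities named in the text can be illustrated with a minimal sketch. The abstract does not specify the exact measures, so everything below is an assumption made for illustration: feature vectors are taken as precomputed (for example by a face or scene recognition network), similarity is plain cosine similarity, and the per-entity score is the maximum over that entity's reference images. The function names `cosine_similarity`, `entity_consistency`, and `cross_modal_consistency` are hypothetical and do not come from the paper.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def entity_consistency(photo: Vector, references: List[Vector]) -> float:
    """Assumed measure: best match between the news photo and any reference image.

    Returns 0.0 when no reference images are available for the entity.
    """
    if not references:
        return 0.0
    return max(cosine_similarity(photo, ref) for ref in references)


def cross_modal_consistency(
    photo: Vector, entity_refs: Dict[str, List[Vector]]
) -> Dict[str, float]:
    """Score every entity linked in the text against the single news photo."""
    return {
        entity: entity_consistency(photo, refs)
        for entity, refs in entity_refs.items()
    }


if __name__ == "__main__":
    # Toy 2-D features; real embeddings would have hundreds of dimensions.
    news_photo = [1.0, 0.0]
    references = {
        "person_a": [[1.0, 0.0], [0.0, 1.0]],  # one reference matches the photo
        "location_b": [[0.0, 1.0]],            # orthogonal to the photo
    }
    print(cross_modal_consistency(news_photo, references))
```

In this toy setup `person_a` receives a high score and `location_b` a low one, which is the kind of per-entity signal a human assessor could inspect. Taking the maximum is only one possible aggregation; a mean or a quantile over the reference images would be equally plausible under these assumptions.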

Published In

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN:9781450370875
DOI:10.1145/3372278
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cross-modal consistency
  2. cross-modal entity verification
  3. deep learning
  4. image repurposing detection
  5. multimodal retrieval

Qualifiers

  • Research-article

Conference

ICMR '20

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Article Metrics

  • Downloads (last 12 months): 90
  • Downloads (last 6 weeks): 6
Reflects downloads up to 01 Nov 2024

Cited By

  • (2024) Augmenting Multimodal Content Representation with Transformers for Misinformation Detection. Big Data and Cognitive Computing 8:10 (134). DOI: 10.3390/bdcc8100134. Online publication date: 11-Oct-2024
  • (2024) Report on the 1st Workshop on Diffusion of Harmful Content on Online Web (DHOW) at WebSci 2024. Companion Publication of the 16th ACM Web Science Conference (60-64). DOI: 10.1145/3630744.3665312. Online publication date: 21-May-2024
  • (2024) Image-Relevant Entities Knowledge-Aware News Image Captioning. IEEE MultiMedia 31:1 (88-98). DOI: 10.1109/MMUL.2024.3363429. Online publication date: 7-Feb-2024
  • (2024) Sniffer: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (13052-13062). DOI: 10.1109/CVPR52733.2024.01240. Online publication date: 16-Jun-2024
  • (2024) VERITE: a Robust benchmark for multimodal misinformation detection accounting for unimodal bias. International Journal of Multimedia Information Retrieval 13:1. DOI: 10.1007/s13735-023-00312-6. Online publication date: 8-Jan-2024
  • (2024) MMOOC: A Multimodal Misinformation Dataset for Out-of-Context News Analysis. Information Security and Privacy (444-459). DOI: 10.1007/978-981-97-5101-3_24. Online publication date: 15-Jul-2024
  • (2024) Ookpik- A Collection of Out-of-Context Image-Caption Pairs. MultiMedia Modeling (132-144). DOI: 10.1007/978-3-031-53302-0_10. Online publication date: 29-Jan-2024
  • (2023) Synthetic Misinformers: Generating and Combating Multimodal Misinformation. Proceedings of the 2nd ACM International Workshop on Multimedia AI against Disinformation (36-44). DOI: 10.1145/3592572.3592842. Online publication date: 12-Jun-2023
  • (2023) Multi-modal Fake News Detection on Social Media via Multi-grained Information Fusion. Proceedings of the 2023 ACM International Conference on Multimedia Retrieval (343-352). DOI: 10.1145/3591106.3592271. Online publication date: 12-Jun-2023
  • (2023) Self-Supervised Distilled Learning for Multi-modal Misinformation Identification. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2818-2827). DOI: 10.1109/WACV56688.2023.00284. Online publication date: Jan-2023
