short-paper

Reproducibility Companion Paper: Visual Relation of Interest Detection

Authors:

Zhenzhong Kuang

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Pages 3633 - 3637

https://doi.org/10.1145/3474085.3477940

Published: 17 October 2021

Abstract

In this companion paper, we provide the details of the reproducibility artifacts of the paper "Visual Relation of Interest Detection" presented at MM'20. Visual Relation of Interest Detection (VROID) aims to detect visual relations that are important for conveying the main content of an image. In this paper, we explain the file structure of the source code and publish the details of our ViROI dataset, which can be used to retrain the model with custom parameters. We also detail the scripts for component analysis and comparison with other methods and list the parameters that can be modified for custom training and inference.

Index Terms

Computing methodologies → Artificial intelligence → Computer vision

Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

October 2021

5796 pages

ISBN: 9781450386517

DOI: 10.1145/3474085

General Chairs:
Heng Tao Shen, University of Electronic Science and Technology of China, China
Yueting Zhuang, Zhejiang University, China
John R. Smith, IBM, USA

Program Chairs:
Yang Yang, University of Electronic Science and Technology of China, China
Pablo Cesar, CWI & TU Delft, The Netherlands
Florian Metze, FACEBOOK, Inc., USA
Balakrishnan Prabhakaran, University of Texas at Dallas, USA

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Artifacts Available / v1.1

Qualifiers

Short-paper

Funding Sources

Natural Science Foundation of Jiangsu Province
National Science Foundation of China
Science, Technology and Innovation Commission of Shenzhen Municipality
Collaborative Innovation Center of Novel Software Technology and Industrialization

Conference

MM '21

Sponsor:

SIGMM

MM '21: ACM Multimedia Conference

October 20 - 24, 2021

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
