DOI: 10.1145/3394171.3416304
short-paper

A Strong Baseline for Multiple Object Tracking on VidOR Dataset

Published: 12 October 2020

Abstract

This paper explores a simple and efficient baseline for multi-class, multiple-object tracking on the VidOR dataset. The task is to build a robust object tracker that not only localizes objects with bounding boxes in every video frame but also links the bounding boxes that belong to the same object entity into a trajectory. The main challenges are low resolution, data imbalance, and objects that disappear for long periods. To address these challenges, we design a robust detection model, propose a new deep metric learning method, and explore several useful tracking algorithms to complete the video object detection task.
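To make the task concrete, the following is a minimal sketch of the linking step: per-frame, per-class detections are joined into trajectories by greedy IoU matching. It is an illustration only and not the method of the paper. The names `Track`, `link_detections`, and `iou_threshold` are hypothetical, the detections are assumed to come from some upstream detector, and no appearance embedding is used, so this sketch would not recover an object that disappears for a long time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Detection = Tuple[str, Box]              # (class label, bounding box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    """Hypothetical trajectory container: one object entity over time."""
    track_id: int
    label: str
    boxes: Dict[int, Box] = field(default_factory=dict)  # frame index -> box

    def last_box(self) -> Box:
        return self.boxes[max(self.boxes)]


def link_detections(frames: Sequence[Sequence[Detection]],
                    iou_threshold: float = 0.5) -> List[Track]:
    """Greedily link same-class detections across frames by IoU."""
    tracks: List[Track] = []
    for frame_idx, detections in enumerate(frames):
        # Score every (existing track, detection) pair of the same class.
        candidates = []
        for t_idx, track in enumerate(tracks):
            for d_idx, (label, box) in enumerate(detections):
                if label != track.label:
                    continue
                score = iou(track.last_box(), box)
                if score >= iou_threshold:
                    candidates.append((score, t_idx, d_idx))
        # Accept the highest-IoU pairs first; each track and detection is used once.
        candidates.sort(reverse=True)
        used_tracks, used_dets = set(), set()
        for _, t_idx, d_idx in candidates:
            if t_idx in used_tracks or d_idx in used_dets:
                continue
            tracks[t_idx].boxes[frame_idx] = detections[d_idx][1]
            used_tracks.add(t_idx)
            used_dets.add(d_idx)
        # Any unmatched detection starts a new trajectory.
        for d_idx, (label, box) in enumerate(detections):
            if d_idx not in used_dets:
                tracks.append(Track(len(tracks), label, {frame_idx: box}))
    return tracks
```

Called on a list of frames, each a list of `(label, box)` pairs, `link_detections` returns one `Track` per object entity with its boxes keyed by frame index. A tracker for the setting described above would typically add an appearance model and a limit on track age, which this sketch omits.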

Supplementary Material

MP4 File (3394171.3416304.mp4)


Cited By

  • (2023) JDAN: Joint Detection and Association Network for Real-Time Online Multi-Object Tracking. ACM Transactions on Multimedia Computing, Communications, and Applications 19:1s (1-17). DOI: 10.1145/3533253. Online publication date: 3-Feb-2023

      Published In

      MM '20: Proceedings of the 28th ACM International Conference on Multimedia
      October 2020
      4889 pages
      ISBN:9781450379885
      DOI:10.1145/3394171

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. deep metric learning
      2. reid
      3. tracking
      4. video object detection

      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
