DOI: 10.1145/3444685.3446281
research-article

Patch assembly for real-time instance segmentation

Published: 03 May 2021

Abstract

The sliding-window paradigm has proven effective for visual instance segmentation in many popular works, but inference time remains its bottleneck. To accelerate existing instance segmentation approaches based on dense sliding windows, this work introduces patch assembly, a novel approach that can be integrated into bounding box detectors to produce segmentation without extra up-sampling computation. A detector named PAMask is proposed to verify the effectiveness of the approach. Benefiting from its simple structure and a fusion of multiple representations, PAMask runs in real time while achieving competitive performance. In addition, a technique called Center-NMS is designed to reduce the number of boxes that enter the intersection-over-union (IoU) calculation; it can be fully parallelized on device and yields a 0.6% mAP improvement in both detection and segmentation at no extra cost.
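The abstract does not spell out how Center-NMS selects boxes, so the following is only a minimal sketch of the general idea of cutting down IoU computations inside greedy non-maximum suppression. It assumes a hypothetical pre-filter: a lower-scoring box is compared against a kept box only if its center falls inside that kept box. The function names, the center test, and the 0.5 threshold are illustrative assumptions, not the paper's actual procedure, and the sequential loop does not reflect the on-device parallelization the abstract mentions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center_inside(inner: Box, outer: Box) -> bool:
    """True if the center of `inner` lies within `outer` (assumed pre-filter)."""
    cx = (inner[0] + inner[2]) / 2.0
    cy = (inner[1] + inner[3]) / 2.0
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]


def nms_with_center_prefilter(
    boxes: List[Box], scores: List[float], iou_thresh: float = 0.5
) -> Tuple[List[int], int]:
    """Greedy NMS that skips the IoU test when the center check fails.

    Returns the kept indices (highest score first) and the number of
    IoU evaluations actually performed.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    suppressed = [False] * len(boxes)
    keep: List[int] = []
    iou_calls = 0
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        for j in order[pos + 1:]:
            if suppressed[j]:
                continue
            # Hypothetical pre-filter: only nearby boxes reach the IoU test.
            if not center_inside(boxes[j], boxes[i]):
                continue
            iou_calls += 1
            if iou(boxes[i], boxes[j]) > iou_thresh:
                suppressed[j] = True
    return keep, iou_calls


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms_with_center_prefilter(demo_boxes, demo_scores))
```

In this toy input the distant third box never reaches the IoU test, which is the kind of saving such a pre-filter targets. Whether a filter of this form would also account for the reported mAP gain is not something the abstract lets one conclude.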



    Published In

    MMAsia '20: Proceedings of the 2nd ACM International Conference on Multimedia in Asia
    March 2021
    512 pages
    ISBN:9781450383080
    DOI:10.1145/3444685

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. non-maximum suppression
    2. object detection
    3. real-time instance segmentation

    Conference

    MMAsia '20: ACM Multimedia Asia
    March 7, 2021
    Virtual Event, Singapore

    Acceptance Rates

    Overall Acceptance Rate 59 of 204 submissions, 29%
