DOI: 10.1145/3573910.3573916
research-article

DSGA: Distractor-Suppressing Graph Attention for Multi-object Tracking

Published: 20 January 2023
  • Abstract

    Multiple object tracking (MOT) methods built on single object tracking (SOT) are of great interest because they use the localization capability of a single-target tracker to balance efficiency and performance. However, most SOT methods only distinguish foreground from background, so they are susceptible to similar-looking distractors during localization. In MOT scenarios there are more such distractors and their influence is more severe. We therefore propose Distractor-Suppressing Graph Attention (DSGA), which learns more discriminative attention by reducing the influence of distractors on the learning of attention weight features. To verify its effectiveness, we embed DSGA into the basic MOT framework "SiamMOT", forming DSGA-SiamMOT, and apply it to multiple object tracking. In experiments on the MOT Challenge benchmark with "public detection", DSGA-SiamMOT obtains 66.65% MOTA and 62.2% IDF1 on the MOT17 dataset at 14 fps.
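
    For readers unfamiliar with the two reported metrics, the following is a minimal sketch of their standard definitions: MOTA = 1 - (FN + FP + IDSW) / GT and IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN). The function names and the counts in the usage lines are hypothetical illustrations, not values or code from the paper; the benchmark's official evaluation tools should be used for any reported numbers.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy (standard CLEAR MOT definition).

    num_gt is the total number of ground-truth boxes over all frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1 score from identity-level true/false positives and false negatives."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_mota = mota(false_negatives=200, false_positives=100,
                    id_switches=10, num_gt=1000)
example_idf1 = idf1(idtp=600, idfp=300, idfn=400)
print(f"MOTA = {example_mota:.4f}, IDF1 = {example_idf1:.4f}")
```

    Note that MOTA can be negative when the number of errors exceeds the number of ground-truth boxes, whereas IDF1 always lies between 0 and 1.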

    Cited By

    • (2023) Graph Attention Networks and Track Management for Multiple Object Tracking. Electronics 12:19 (4079). DOI: 10.3390/electronics12194079. Online publication date: 28-Sep-2023

        Published In

        ICRAI '22: Proceedings of the 8th International Conference on Robotics and Artificial Intelligence
        November 2022
        89 pages
        ISBN:9781450397544
        DOI:10.1145/3573910
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. distractor suppressing
        2. graph attention
        3. mot
        4. sot

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        ICRAI 2022
