DOI: 10.1007/978-3-030-11018-5_18
Article

The 2nd YouTube-8M Large-Scale Video Understanding Challenge

Published: 23 January 2019

Abstract

We hosted the 2nd YouTube-8M Large-Scale Video Understanding Kaggle Challenge and Workshop at ECCV’18, with the task of classifying videos from frame-level and video-level audio-visual features. In this year’s challenge, we restricted the final model size to 1 GB or less, encouraging participants to explore representation learning or better architectures instead of heavy ensembles of multiple models. In this paper, we briefly introduce the YouTube-8M dataset and challenge task, followed by participant statistics and result analysis. We summarize ideas proposed by participants, including architectures, temporal aggregation methods, ensembling and distillation, data augmentation, and more.



Published In

Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018, Proceedings, Part IV
Sep 2018
729 pages
ISBN: 978-3-030-11017-8
DOI: 10.1007/978-3-030-11018-5
Editors: Laura Leal-Taixé, Stefan Roth

Publisher

Springer-Verlag, Berlin, Heidelberg


Author Tags

1. YouTube
2. Video Classification
3. Video Understanding
